Illumina Connected Multiomics Walkthrough

Illumina Connected Multiomics provides interactive visualizations and powerful statistics. This is a walkthrough of an analysis that could be done in Connected Multiomics with an example proteomic data set, produced by DRAGEN Protein Quantification. It covers the following features:

Creating a default analysis
Creating a custom analysis
Create a feature list to filter by
- e.g., the 9.5k human protein list
Managing sample metadata
Filtering samples
Filtering features
Data Transformation
PCA
Differential expression
Hierarchical clustering and creating heatmaps
Gene set enrichment analysis

For information on the Connected Multiomics Platform, including how to log in, please reference the following documentation: https://help.multiomics.illumina.com/icm

Demo Data

Demo data that can be used to follow along with this walkthrough is found in the Connected Multiomics Demo Data repository. To add this dataset to a study, perform the following steps:

After clicking "+ Add Demo Data", the data used in this walkthrough can be found at /Multiomics-Demo-Data/Proteomics/NovaSeq 6k-S4 Cancer-Normal. For this study, both SampleType (CRC/Control) and TimePoint (T1...T8) are used. This data must be ingested prior to starting the analysis. Add both the ADAT (counts) and TSV (metadata) to the study.

Creating a Default Analysis

Click on '+ New Analysis'.
In the pop-up window, provide a name for the analysis, select ‘Default: Illumina Proteomics’ as the Analysis Type, choose the sample group to be included in the analysis ('All 9k Illumina protein Prep Samples' will be selected by default), and click on the ‘Run Analysis’ button.
Exploring the PCA plot:
- By default, the plot is colored by BatchID. To change this, select Configure > Style > Color by in the left-hand bar.
- You may also want to explore other principal components. To do this, select Configure > Axes in the left-hand bar of the PCA plot and update the data for each axis.

Creating a Custom Analysis

Click on ‘+ New Analysis’.
In the pop-up window, provide a name for the analysis, select ‘Custom: Illumina Proteomics’ as the Analysis Type, choose the sample group to be included in the analysis ('All 9k Illumina protein Prep Samples' will be selected by default), and click on the ‘Run Analysis’ button.

Note: make sure there are no duplicated Sample IDs in the analysis groups.
A pop-up message will show up if the analysis creation is successful.
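
The duplicate Sample ID check from the note above can also be done before creating the analysis. The following is a minimal sketch, assuming the sample IDs have been exported to a plain Python list; the IDs shown are hypothetical and the function is not part of Connected Multiomics.

```python
from collections import Counter

def find_duplicate_sample_ids(sample_ids):
    """Return sample IDs that occur more than once, in first-seen order."""
    counts = Counter(sample_ids)
    return [sid for sid in counts if counts[sid] > 1]

# Hypothetical sample IDs, for illustration only.
sample_ids = ["S001", "S002", "S003", "S002"]
duplicates = find_duplicate_sample_ids(sample_ids)
print(duplicates)
```

An empty result means every Sample ID is unique.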

Refresh the page to get the latest status of the analysis.
When the Status is ‘Complete’, click on the analysis tile to enter the analysis module.

No analysis is initiated by default for custom proteomic data. To review the number of samples and features, hover over the data node.

Throughout the below analyses, rectangles/task nodes will produce circles/data nodes. Double clicking on the task node will describe the task as it occurred, and double clicking on the data node will take you to the results of the analysis. Nodes will be greyed out while the analysis is still in progress.

Create a List of Features for Filtering

The SOMAmer content by default includes all whitelisted SOMAmers counted during secondary analysis, including controls and non-human proteins. You may want to exclude some SOMAmers from tertiary analysis. One way to do this is to create a saved list of proteins.

Click on the settings icon in the top right corner of the analyses dashboard and click on 'Settings' in the menu.

On the settings page, click on 'Lists' in the left-hand navigation bar, and then click on '+ New list' at the top of the right panel to add a new list.

On the 'Local file' tab, click on '+ Choose' to select the local file and enter the name of the list in the 'Name' box; then click on the 'Add list' button to upload the list.

The following analysis uses the attached 9.5k human protein list. It was generated by filtering SOMAmers by Organism = Human (that is, keeping only the SOMAmers associated with human proteins) and isolating the Entrez Gene Names. This list is recommended for all 9.5k product analyses.

Attachment: Entrez_Gene_Names_9k.txt (74KB)

If using an Illumina Protein Prep 6k dataset, it is recommended to use the 6k human protein list. It was generated by filtering SOMAmers by Organism = Human (that is, keeping only the SOMAmers associated with human proteins) and isolating the Entrez Gene Names.

Attachment: Entrez_Gene_Names_6k.txt (41KB)
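
For readers who want to build a comparable list themselves, the filtering described above (Organism = Human, then isolating the gene names) can be sketched as follows. The annotation table, its column names (SeqId, Organism, EntrezGeneName), and the one-name-per-line output layout are all assumptions made for illustration; they are not taken from the attached files.

```python
import csv
import io

# Hypothetical annotation table; column names and rows are illustrative only.
ANNOTATION_TSV = """SeqId\tOrganism\tEntrezGeneName
seq-0001\tHuman\tGENE_A
seq-0002\tMouse\tGene_b
seq-0003\tHuman\tGENE_C
seq-0004\tHuman\tGENE_A
seq-0005\tHuman\t
"""

def human_gene_names(tsv_text):
    """Return sorted unique gene names for rows where Organism is Human."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    names = set()
    for row in reader:
        name = (row["EntrezGeneName"] or "").strip()
        # Skip non-human rows and rows without a gene name.
        if row["Organism"].strip().lower() == "human" and name:
            names.add(name)
    return sorted(names)

gene_list = human_gene_names(ANNOTATION_TSV)
# One name per line (an assumed layout for the uploaded text file).
list_file_text = "\n".join(gene_list) + "\n"
print(list_file_text, end="")
```

Several SOMAmers can map to the same gene, which is why the sketch deduplicates names before writing the list.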

Managing sample metadata

Click on the 'Metadata' tab to view and add sample metadata.

Click on 'Manage' under 'Sample attributes' to reorder the metadata. Drag the 'SampleStatus' and 'TimePoint' boxes to the front, since these are the attributes used for coloring in the downstream analysis. You can also add, remove, or reorder other metadata, or add a new category to the current metadata, on this page.

Filtering samples

Return to the analysis page, click on the 'Quantification' node, choose 'Filtering' > 'Filter samples' from the right hand tool box.

Select the samples with TimePoint T1 and T2.
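
Conceptually, this step keeps only the samples whose TimePoint is T1 or T2. A minimal sketch of the same selection outside the platform, assuming the metadata is held as a list of dictionaries (the sample IDs and rows are hypothetical):

```python
def filter_samples(rows, attribute, allowed_values):
    """Keep metadata rows whose attribute value is in allowed_values."""
    allowed = set(allowed_values)
    return [row for row in rows if row.get(attribute) in allowed]

# Hypothetical sample metadata; IDs and values are illustrative only.
metadata = [
    {"SampleID": "S1", "SampleStatus": "CRC", "TimePoint": "T1"},
    {"SampleID": "S2", "SampleStatus": "Control", "TimePoint": "T2"},
    {"SampleID": "S3", "SampleStatus": "CRC", "TimePoint": "T3"},
    {"SampleID": "S4", "SampleStatus": "Control", "TimePoint": "T1"},
]

kept = filter_samples(metadata, "TimePoint", ["T1", "T2"])
kept_ids = [row["SampleID"] for row in kept]
print(kept_ids)
```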

Filtering features

Click on 'Finish' and return to the analysis page. Click on the 'TimePoint in T1,T2' node, choose 'Filtering' > 'Filter features' from the right hand tool box.

If a list was uploaded earlier, select the 'Saved list' option, choose the previously uploaded Entrez Gene ID list from the dropdown, and make sure the "Feature identifier" is set to Feature ID; then click 'Finish'. Alternatively, upload a list of features manually.
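
Filtering by a saved list amounts to keeping the features whose identifier appears in that list, which is why the identifier type in the list has to match the one selected under "Feature identifier". A minimal sketch, assuming counts are stored as a mapping from feature ID to per-sample values (all names and numbers are hypothetical):

```python
def filter_features(counts, saved_list):
    """Split a feature -> counts mapping by membership in saved_list.

    Returns the kept features and the list entries that matched nothing.
    """
    wanted = set(saved_list)
    kept = {fid: values for fid, values in counts.items() if fid in wanted}
    missing = sorted(wanted - set(counts))
    return kept, missing

# Hypothetical feature IDs and counts, for illustration only.
counts = {
    "GENE_A": [120, 98, 143],
    "GENE_B": [15, 0, 22],
    "SPIKE_IN_1": [5000, 5100, 4950],
}
saved_list = ["GENE_A", "GENE_B", "GENE_C"]

filtered, not_found = filter_features(counts, saved_list)
print(sorted(filtered), not_found)
```

Reporting the unmatched entries is a cheap way to spot an identifier mismatch, for example a list of gene names applied to data keyed by a different ID.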

Data transformation

Click on the 'Filtered counts' node, choose 'Normalization and Scaling' > 'Normalization' from the right hand tool box.

Choose 'Add' and drag it to the right-hand box. Adding a constant removes 0 count values, which could otherwise affect the Limma-trend differential analysis because it assumes continuous data. Then click on 'Finish' to return to the analysis dashboard.
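
The effect of the 'Add' step can be illustrated with a small sketch. The offset of 1.0 is an assumption (the walkthrough does not state the value), and the log2 step is not part of the walkthrough; it is included only as a familiar example of an operation that fails on zero counts.

```python
import math

def add_offset(matrix, offset=1.0):
    """Add a constant to every value so that no entry is zero."""
    return [[value + offset for value in row] for row in matrix]

def log2_matrix(matrix):
    """Apply log2 to every value; raises ValueError on zeros."""
    return [[math.log2(value) for value in row] for row in matrix]

# Hypothetical counts containing zeros.
counts = [[0, 8, 31], [3, 0, 15]]

try:
    log2_matrix(counts)
    zero_is_safe = True
except ValueError:
    # math.log2(0) is undefined, so the raw counts cannot be log-transformed.
    zero_is_safe = False

shifted = add_offset(counts, offset=1.0)  # assumed offset value
logged = log2_matrix(shifted)
print(zero_is_safe, shifted, logged)
```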

PCA

Click on the 'Normalized counts' node, select 'Exploratory analysis' > 'PCA' from the right hand tool box.

Use the default setting and click 'Finish'.

Double click on the 'PCA' node to view the PCA report.
- The scatter plot shows the data distribution (colored by SampleStatus) among the first three PCs.
- The scree plot (top right panel) shows the variance explained by each PC.
- The component loading table (bottom right panel) shows the correlation between every protein/SOMAmer and each PC. The variable in this table represents the SOMAmer's SeqID.
- For additional information on PCA, review the following documentation: https://help.partek.illumina.com/partek-flow/user-manual/task-menu/exploratory-analysis/pca
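
To make the report panels more concrete, the sketch below computes the first principal component of a small samples-by-features matrix using power iteration on the covariance matrix. It is an illustration under stated assumptions, not the platform's implementation: the data are hypothetical, only PC1 is computed, and the loadings returned here are eigenvector weights, which are related to but not the same as the correlations shown in the component loading table.

```python
import math

def center_columns(rows):
    """Subtract each feature's mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Feature-by-feature sample covariance matrix."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]

def first_pc(cov, iterations=500):
    """Power iteration: top eigenvalue and unit eigenvector of cov."""
    size = len(cov)
    v = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(size) for j in range(size))
    return eigenvalue, v

def pca_first_component(rows):
    """Return (explained variance ratio, loadings, per-sample scores) for PC1."""
    centered = center_columns(rows)
    cov = covariance(centered)
    eigenvalue, loadings = first_pc(cov)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    scores = [sum(x * l for x, l in zip(row, loadings)) for row in centered]
    return eigenvalue / total_variance, loadings, scores

# Hypothetical normalized values: 4 samples (rows) x 3 features (columns).
data = [[2.0, 4.1, 1.0], [3.1, 6.0, 0.8], [4.0, 8.2, 1.1], [5.2, 9.9, 0.9]]
ratio, loadings, scores = pca_first_component(data)
print(round(ratio, 3), [round(x, 3) for x in loadings])
```

The explained variance ratio corresponds to one bar of the scree plot, and the scores are the sample coordinates along one axis of the scatter plot.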

Differential expression

Click on the 'Normalized counts' node, select 'Statistics' > 'Differential analysis' from the right hand menu.

Select 'Limma-trend' (default) method and click 'Next'.

NOTE: Limma-trend is a robust model that fits the assumptions for small sample sizes of normalized protein counts. The Limma-trend model is also flexible with categorical and quantitative variables. For other datasets or experimental designs, consider other methods.

Select 'SampleStatus', 'TimePoint', and 'DonorID', then click 'Add factors'. Select 'SampleStatus' and 'TimePoint', then click on 'Add interaction' to add their interaction term. Click on 'Next' to set up comparisons.

Drag 'CRC' to the top right box and 'Control' to the bottom right box. Click on 'Add comparison'. Then select 'SampleStatus*TimePoint' from the Factor dropdown menu. Add the T1 and T2 comparisons between CRC and Control (CRC T1 vs Control T1 and CRC T2 vs Control T2). Keep "Combine" selected for each of these comparisons. Click on the 'Finish' button at the bottom.

Double click on the 'Limma-trend' node to view the report. On the left hand menu,
- select 'FDR', choose 'Per contrast' and specify 0.05 for the CRC vs Control comparison
- select 'Fold change', choose 'Per contrast' and specify -2 to 2 for the CRC vs Control comparison
- click on 'Generate Filtered Node'
- repeat this process on CRC T1 vs Control T1 comparison and CRC T2 vs Control T2 comparison
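
The two filters above can be sketched in a few lines. This is an illustration under assumptions, not the platform's code: it assumes the FDR is a Benjamini-Hochberg step-up adjustment, that the -2 to 2 fold-change range is the band being excluded (so features with a fold change of at most -2 or at least 2 are kept), and it uses hypothetical p-values and fold changes.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to keep values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def filter_results(results, fdr_cutoff=0.05, fold_change_cutoff=2.0):
    """Keep features passing both the FDR and the fold-change filter."""
    fdr = benjamini_hochberg([r["p"] for r in results])
    return [
        r["feature"]
        for r, q in zip(results, fdr)
        if q <= fdr_cutoff and abs(r["fold_change"]) >= fold_change_cutoff
    ]

# Hypothetical results for one contrast (e.g., CRC vs Control).
results = [
    {"feature": "F1", "p": 0.001, "fold_change": 3.1},
    {"feature": "F2", "p": 0.004, "fold_change": -1.4},
    {"feature": "F3", "p": 0.012, "fold_change": -2.5},
    {"feature": "F4", "p": 0.2, "fold_change": 2.8},
    {"feature": "F5", "p": 0.6, "fold_change": 4.0},
]
kept = filter_results(results)
print(kept)
```

Running the same function once per contrast mirrors repeating the filter for each of the three comparisons.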

Return to the analyses dashboard, where 3 filtered feature list nodes have been added to the pipeline. Right click on a 'Filtered feature list' node and click on 'Rename data node' to rename the node to 'T vs N'. Apply the same procedure to the other two filtered feature lists and rename them to 'T vs N Time 1' and 'T vs N Time 2' respectively.

To compare the filtered feature lists, click on 'Venn diagram' in the bottom menu and tick the filtered lists ('T vs N', 'T vs N Time 1' and 'T vs N Time 2'); then click on the 'Display selection' button at the bottom to visualize the Venn diagram.
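
Each region of the Venn diagram is the set of features that appear in exactly one combination of lists. A minimal sketch of that bookkeeping, using hypothetical feature names in place of the real filtered lists:

```python
def venn_regions(named_sets):
    """Map each membership combination to the features found in exactly those lists."""
    regions = {}
    all_features = set().union(*named_sets.values())
    for feature in all_features:
        key = tuple(name for name in named_sets if feature in named_sets[name])
        regions.setdefault(key, set()).add(feature)
    return regions

# Hypothetical filtered feature lists, for illustration only.
lists = {
    "T vs N": {"A", "B", "C", "D"},
    "T vs N Time 1": {"B", "C", "E"},
    "T vs N Time 2": {"C", "D", "F"},
}

regions = venn_regions(lists)
for names, features in sorted(regions.items()):
    print(" & ".join(names), "->", sorted(features))
```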

Hierarchical clustering and creating heatmaps

Click on 'T vs N' data node, select 'Exploratory analysis' > 'Hierarchical clustering / heatmap' from the right hand tool box.

Choose 'Heatmap' and select the feature order and sample order.
- Choose 'Cluster' (default) as feature order
- Choose 'Assign order' and select 'SampleStatus' from the dropdown menu.
- Click on 'Finish' button at the bottom of the page.

Double click on the 'Hierarchical clustering / heatmap' node to view it.

For additional information on hierarchical clustering, view the following documentation: https://help.partek.illumina.com/partek-flow/user-manual/task-menu/exploratory-analysis/hierarchical-clustering

Gene set enrichment

Click on 'T vs N' data node, select 'Biological interpretation' > 'Gene set enrichment' from the right hand tool box.

Select 'KEGG database', and specify the background gene list as the previously uploaded list of gene symbols. Specifying this list ensures that only genes with associated SOMAmers are included in this analysis. Click on 'Finish' button at the bottom of the page.

Double click on the 'Pathway enrichment' node to view the enriched pathways.
- Click on the pathway name to view the pathway network
- To download genes in each pathway, click on the value in 'Genes in set' column in the corresponding pathway entry.
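
The role of the background gene list can be illustrated with a one-sided hypergeometric (Fisher-style) over-representation test, which is one common way to score gene set enrichment. Whether the platform computes exactly this statistic is an assumption; the gene names and sets are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn from the background."""
    upper = min(n_in_set, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def enrichment_p(gene_set, selected, background):
    """Restrict both lists to the background before testing."""
    background = set(background)
    gene_set = set(gene_set) & background
    selected = set(selected) & background
    overlap = gene_set & selected
    return hypergeom_enrichment_p(len(background), len(gene_set), len(selected), len(overlap))

# Hypothetical example: 10 measurable genes, a 3-gene pathway (plus one gene
# that is not measurable), and 3 significant genes.
background = [f"G{i}" for i in range(1, 11)]
pathway = ["G1", "G2", "G3", "G_not_measured"]
significant = ["G1", "G2", "G3"]

p_value = enrichment_p(pathway, significant, background)
print(p_value)
```

Genes outside the background (here `G_not_measured`) are dropped from the pathway before testing, so the test only considers genes that could have been detected.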

GSEA

To detect differential pathways between diseased and control samples, click on the 'Normalized counts' node and select 'Biological interpretation' > 'GSEA' from the right hand tool box.

Select 'KEGG database' (default) and click on 'Next' button at the bottom of the page.

Select 'SampleStatus' and click on 'Next'.

Drag 'CRC' to the top right box and 'Control' to the bottom right box. Keep "Combine" selected for this comparison. Click on 'Add comparison', and then click 'Finish' at the bottom of the page.

Double click on the 'GSEA' node to view the results.

Click on the enrichment plot icon after each row index to visualize the enrichment score of the corresponding pathway.
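
The curve in an enrichment plot is a running sum over the ranked feature list: it steps up when a feature belongs to the pathway and down otherwise, and the enrichment score is the largest deviation from zero. The sketch below follows the commonly described weighted running-sum formulation; the ranking scores, gene names, and the weight of 1.0 are assumptions for illustration and may differ from what the platform computes.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return (enrichment score, running-sum profile) for one gene set.

    ranked_genes must already be sorted by the ranking score, high to low.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    hit_weights = [abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if n_hits == 0 or n_miss == 0 or total_hit_weight == 0:
        return 0.0, []
    running, profile = 0.0, []
    for w, hit in zip(hit_weights, in_set):
        # Step up in proportion to the hit's score, step down uniformly for misses.
        running += w / total_hit_weight if hit else -1.0 / n_miss
        profile.append(running)
    return max(profile, key=abs), profile

# Hypothetical ranked list (e.g., ordered by a CRC vs Control statistic).
ranked_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_top, profile_top = enrichment_score(ranked_genes, scores, {"G1", "G2"})
es_bottom, _ = enrichment_score(ranked_genes, scores, {"G5", "G6"})
print(es_top, es_bottom)
```

A positive score indicates a set concentrated at the top of the ranking and a negative score a set concentrated at the bottom. Assessing significance (for example by permutation) is outside the scope of this sketch.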

NOTE: For clarification on the differences between gene set enrichment and GSEA, please view this documentation: https://help.partek.illumina.com/partek-flow/frequently-asked-questions#what-is-the-difference-between-gsea-and-gene-set-enrichment
