Bulk RNA

Getting started

Data / Task nodes and Performing tasks in Connected Multiomics

  • Within a study, the Analyses tab contains two elements: task nodes (rectangles) and data nodes (circles) connected by lines and arrows. Collectively, they represent a data analysis pipeline.

  • Clicking a data node brings up a context-sensitive menu on the right. This menu changes depending on the type of data node and presents only the tasks that can be performed on that specific data type. Hover over a task to see additional information about it.

  • Select the task you wish to perform from the menu. When configuring task options, additional information regarding each option is available. Click Finish to perform the task.

  • Depending on the task, a new data node may automatically be created and connected to the original data node. This contains the data resulting from the task. Tasks that do not produce new data types will not produce an additional data node.

  • To view the results of a task, click the data node and choose the Task report option on the menu.
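The relationship between data nodes and task nodes described above can be pictured as a small graph. The following is only an illustrative sketch: the class names, the `run_task` helper, and the `describe` function are hypothetical and are not part of Connected Multiomics. The node names are taken from later steps in this walkthrough.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataNode:
    """A circle in the Analyses tab: holds data and the tasks run on it."""
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class TaskNode:
    """A rectangle in the Analyses tab: one task applied to a data node."""
    name: str
    source: DataNode
    result: Optional[DataNode] = None


def run_task(source, task_name, result_name=None):
    """Attach a task to a data node; create a result node only if named."""
    task = TaskNode(task_name, source)
    if result_name is not None:
        task.result = DataNode(result_name)
    source.tasks.append(task)
    return task


def describe(node, depth=0):
    """Return the pipeline as indented lines: (data) and [task]."""
    lines = ["  " * depth + "(" + node.name + ")"]
    for task in node.tasks:
        lines.append("  " * (depth + 1) + "[" + task.name + "]")
        if task.result is not None:
            lines.extend(describe(task.result, depth + 2))
    return lines


quant = DataNode("Quantification")
annotated = run_task(quant, "Annotate features", "Annotated counts").result
normalized = run_task(annotated, "Normalize counts", "Normalized counts").result
pca = run_task(normalized, "PCA", "PCA").result

pipeline_lines = describe(quant)
print("\n".join(pipeline_lines))
```

Each task hangs off the data node it was run on, and a result node appears only when the task produces new data, which mirrors the behavior described in the list above.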

Viewing and saving data

  • All data contained in data nodes can be downloaded to the local machine: select the node, scroll to the bottom of the toolbox, and choose Download data.

  • The Data Viewer can be used to plot, modify, and save data. In this walkthrough, the PCA data node and the Hierarchical clustering / heatmap node can be opened in the Data Viewer by double-clicking the data node or by opening the Task report from the toolbox.

  • To save an individual image from the Data Viewer to your machine, use the plot-specific tools: click Plot, then Export image, select the format, size, and resolution, and click Save.

  • All visualizations within a sheet in the Data Viewer can be exported as one image (e.g. a single image with all plots for a poster). To do this, use the Export drop-down at the top of the Data Viewer and select Export image.

Input: secondary outputs from the DRAGEN analysis

You will notice that there are two .sf file options to choose from in the secondary outputs of the DRAGEN analysis.

  • <outputPrefix>.quant.genes.sf - Contains quantification results at the gene level. The results are produced by summing together all transcripts with the same geneID in the annotation file (GTF).

  • <outputPrefix>.quant.sf - Contains quantification results at the transcript level.
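To make the difference between the two files concrete, here is a minimal sketch of how transcript-level rows could be summed to gene level. It assumes a tab-separated, Salmon-style layout with the columns Name, Length, EffectiveLength, TPM, and NumReads; check the header of your own .sf file before relying on this. The sample content, the transcript and gene IDs, and the `TX_TO_GENE` mapping are hypothetical stand-ins for what DRAGEN derives from the GTF.

```python
import csv
import io
from collections import defaultdict

# Hypothetical transcript-level content (assumed Salmon-style columns).
SAMPLE_SF = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1500\t1300.0\t10.0\t100.0\n"
    "tx2\t900\t700.0\t5.0\t40.0\n"
    "tx3\t2000\t1800.0\t2.5\t60.0\n"
)

# Hypothetical transcript-to-gene mapping, as would come from the GTF.
TX_TO_GENE = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}


def read_sf(handle):
    """Parse a tab-separated .sf table into a list of dictionaries."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "name": row["Name"],
            "tpm": float(row["TPM"]),
            "num_reads": float(row["NumReads"]),
        }
        for row in reader
    ]


def sum_to_gene_level(rows, tx_to_gene):
    """Sum TPM and read counts over transcripts sharing a gene ID."""
    totals = defaultdict(lambda: {"tpm": 0.0, "num_reads": 0.0})
    for row in rows:
        gene = tx_to_gene[row["name"]]
        totals[gene]["tpm"] += row["tpm"]
        totals[gene]["num_reads"] += row["num_reads"]
    return dict(totals)


transcripts = read_sf(io.StringIO(SAMPLE_SF))
gene_totals = sum_to_gene_level(transcripts, TX_TO_GENE)
print(gene_totals)
```

To read a real file, replace `io.StringIO(SAMPLE_SF)` with an opened file handle.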

Import Data

Import data that has been processed through the DRAGEN RNA analysis pipeline in BaseSpace, Illumina Connected Analytics, or the command line.

  • Use the .sf file from the secondary outputs.

Annotation file (GTF) and corresponding genome reference files:

  • GENCODE v19

    • Homo sapiens [UCSC] hg19 v5

    • Homo sapiens [UCSC] hg19 v5 Pangenome

    • Homo sapiens [NCBI] hs37d5 v5

    • Homo sapiens [NCBI] hs37d5 v5 Pangenome

  • GENCODE v44

    • Homo sapiens [1000 Genomes] hg38 v5

    • Homo sapiens [1000 Genomes] hg38 v5 Pangenome

  • GENCODE vM23

    • Mus musculus [UCSC] mm10

  • ENSEMBL 98

    • Rattus norvegicus [UCSC] rn6
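If you track these settings in your own scripts or notes, the annotation and reference pairings listed above can be written as a simple lookup. This is a sketch for record keeping only: the dictionary layout and the `is_matching_pair` function are illustrative and are not part of Connected Multiomics.

```python
# Pairings copied from the list above. The structure and names here are
# illustrative assumptions, not a Connected Multiomics API.
COMPATIBLE_REFERENCES = {
    "GENCODE v19": [
        "Homo sapiens [UCSC] hg19 v5",
        "Homo sapiens [UCSC] hg19 v5 Pangenome",
        "Homo sapiens [NCBI] hs37d5 v5",
        "Homo sapiens [NCBI] hs37d5 v5 Pangenome",
    ],
    "GENCODE v44": [
        "Homo sapiens [1000 Genomes] hg38 v5",
        "Homo sapiens [1000 Genomes] hg38 v5 Pangenome",
    ],
    "GENCODE vM23": ["Mus musculus [UCSC] mm10"],
    "ENSEMBL 98": ["Rattus norvegicus [UCSC] rn6"],
}


def is_matching_pair(annotation, reference):
    """Return True if the reference is listed for the given annotation."""
    return reference in COMPATIBLE_REFERENCES.get(annotation, [])


print(is_matching_pair("GENCODE v44", "Homo sapiens [1000 Genomes] hg38 v5"))
print(is_matching_pair("GENCODE v19", "Mus musculus [UCSC] mm10"))
```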

  • After creating a study and adding data to the study, click + New Analysis.

  • Give the Analysis a name, select the Analysis Type > Custom: RNA, select the sample groups to add to the analysis, and click Run Analysis.

  • The Status shows Complete when the analysis is ready to open.

  • Click the completed Analysis to open and customize the analysis pipeline.

  • The Quantification node is created when the analysis completes.

The Quantification node is the starting node for analysis

Annotate Features

Add gene-level annotations to the quantified data.

  • Single-click the Quantification node.

  • Select Annotate features under the Pre-analysis tools section in the toolbox on the right.

  • Choose the genome and annotation files that match those used in DRAGEN then click Finish.

  • Outcome:

    • Task node: Annotate features

    • Result node: Annotated counts

Annotate feature task node & Annotated counts data node

Normalization and Scaling

Normalize the data to prepare for downstream analysis.

  • Single-click the Annotated counts node, then select the Normalization task from the Normalization and Scaling section.

  • Click the "Use Recommended" button or select an alternative method. We recommend the widely used Median ratio (DESeq2 only) method.

Median ratio (DESeq2 only) is the recommended normalization method for bulk transcriptomic data

  • Outcome:

    • Task node: Normalize counts

    • Result node: Normalized counts

Normalize counts task node & Normalized counts data node
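For readers who want intuition for what a median ratio normalization does, the following is a minimal sketch of the median-of-ratios idea associated with DESeq2. It illustrates the general technique and is not the implementation used by Connected Multiomics. The gene names and counts are hypothetical, and the second sample is deliberately sequenced at twice the depth of the first.

```python
import math
from statistics import median


def median_ratio_size_factors(counts):
    """Estimate one size factor per sample.

    counts maps gene -> list of raw counts (one entry per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean is zero. At least one gene must remain.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for values in counts.values():
        if any(v <= 0 for v in values):
            continue
        # Log of the per-gene geometric mean across samples.
        log_geo_mean = sum(math.log(v) for v in values) / n_samples
        for i, v in enumerate(values):
            log_ratios[i].append(math.log(v) - log_geo_mean)
    # The size factor is the median ratio to the geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, size_factors):
    """Divide each sample's counts by that sample's size factor."""
    return {
        gene: [v / s for v, s in zip(values, size_factors)]
        for gene, values in counts.items()
    }


# Hypothetical raw counts: sample 2 has twice the depth of sample 1.
RAW_COUNTS = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [50, 100],
    "geneD": [0, 5],
}

size_factors = median_ratio_size_factors(RAW_COUNTS)
normalized = normalize(RAW_COUNTS, size_factors)
print(size_factors)
print(normalized["geneA"])
```

After normalization, genes that differ only because of sequencing depth (geneA to geneC here) end up with matching values across the two samples.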

Dimension Reduction (PCA)

Visualize sample clustering and variance.

  • From the Normalized counts node, select PCA under Exploratory Analysis.

  • Outcome:

    • Task node: PCA

    • Result node: PCA

PCA task node & PCA data node
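As a rough illustration of what PCA computes, the sketch below finds the first principal component of a small samples-by-genes table with power iteration on the covariance matrix. Power iteration is a stand-in chosen to keep the example dependency-free; it is not necessarily how the PCA task is implemented. The function name and the data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores, explained_variance_ratio) for PC1.

    rows is a list of samples, each a list of feature values.
    Assumes the data are not constant and that the starting vector of
    ones is not orthogonal to the first component.
    """
    n = len(rows)
    d = len(rows[0])
    # Center each feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (features x features).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the leading eigenvector.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(
        vec[a] * sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)
    )
    total_variance = sum(cov[a][a] for a in range(d))
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores, eigenvalue / total_variance


# Hypothetical data: four samples, two genes, perfectly correlated.
SAMPLES = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores, explained = first_principal_component(SAMPLES)
print(direction, explained)
```

Because the two hypothetical genes vary together perfectly, a single component captures essentially all of the variance. Real expression data spread variance over many components, which is what the PCA plot in the Data Viewer helps you inspect.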

Differential Analysis

Compare gene expression across experimental groups.

  • From the Normalized counts node, select Differential Analysis from the Statistics section.

  • Choose your preferred model and set up the comparison. This walkthrough uses the DESeq2 method, which pairs with the Median ratio (DESeq2 only) normalization applied in the previous step.

  • Outcome:

    • Task node: Differential analysis (labeled as model used)

    • Result node: Differential results (labeled as comparison made)

Differential analysis task node & Differential analysis data node labeled as model & comparison made

Filter Feature List

Refine the list of genes/features based on criteria.

  • Open the Differential results node (double-click it, or single-click it and select Task report from the toolbox).

  • Use the filter menu to apply criteria relevant to your study.

  • Click Generate filtered node once satisfied.

  • Outcome:

    • Task node: Filter list

    • Result node: Filtered feature list

Filter list task node & Filtered feature list data node
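The kind of criteria applied in the filter menu can be sketched in a few lines. The example below assumes a differential results table with a log2 fold change and an FDR value per gene. The field names, thresholds, and values are hypothetical and should be replaced with criteria relevant to your study.

```python
# Hypothetical differential results; field names are assumptions.
DIFFERENTIAL_RESULTS = [
    {"gene": "geneA", "log2_fold_change": 2.4, "fdr": 0.001},
    {"gene": "geneB", "log2_fold_change": -1.8, "fdr": 0.020},
    {"gene": "geneC", "log2_fold_change": 0.3, "fdr": 0.004},
    {"gene": "geneD", "log2_fold_change": 3.1, "fdr": 0.300},
]


def filter_features(rows, max_fdr=0.05, min_abs_log2_fc=1.0):
    """Keep rows that pass both the FDR and the fold-change cutoff."""
    return [
        row
        for row in rows
        if row["fdr"] <= max_fdr
        and abs(row["log2_fold_change"]) >= min_abs_log2_fc
    ]


filtered = filter_features(DIFFERENTIAL_RESULTS)
filtered_genes = [row["gene"] for row in filtered]
print(filtered_genes)
```

Here geneC is dropped for its small fold change and geneD for its high FDR, leaving geneA and geneB.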

Gene Set Enrichment

Identify enriched biological pathways or gene sets.

  • Select Gene Set Enrichment from the Biological Interpretation section.

  • Choose between KEGG Pathway Enrichment or Gene Set Ontology.

  • Outcome:

    • Task node: Gene set enrichment

    • Result node: Pathway enrichment

Gene set enrichment task node & Pathway enrichment data node
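One common way to score enrichment is an over-representation test based on the hypergeometric distribution. The sketch below shows that general idea only. The statistic actually used by the Gene Set Enrichment task is not described in this walkthrough, and the function name and numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw.

    population: number of genes in the background
    successes:  background genes that belong to the gene set
    draws:      genes in the filtered feature list
    observed:   filtered genes that belong to the gene set
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 3 of them in the pathway,
# and all 3 genes in the filtered list fall in that pathway.
p_value = hypergeom_upper_tail(population=10, successes=3, draws=3, observed=3)
print(p_value)
```

A small p-value indicates that the overlap between the filtered list and the gene set is larger than expected by chance. In practice the values are also corrected for testing many gene sets.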

Hierarchical clustering / Heatmap

Visualize features in an informative way.

  • Select Hierarchical clustering / Heatmap from the Exploratory analysis section.

  • This task can be used for either a heatmap or bubble map. Choose the task options that best suit your needs.

  • Double-click on the output node to visualize the results in the Data Viewer.

  • Outcome:

    • Node: Hierarchical clustering / heatmap

Hierarchical clustering / heatmap result
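To show what hierarchical clustering does before the heatmap is drawn, here is a minimal sketch of agglomerative clustering with Euclidean distance and average linkage. The distance metric and linkage are assumptions chosen for the example; the task offers its own options. The gene labels and values are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Repeatedly merge the two closest clusters.

    profiles maps label -> list of values. Returns the merge history
    as (left_labels, right_labels, distance) tuples.
    """
    clusters = {(label,): [vec] for label, vec in profiles.items()}
    merges = []

    def cluster_distance(pair):
        # Average pairwise distance between members of two clusters.
        left, right = pair
        dists = [
            euclidean(a, b) for a in clusters[left] for b in clusters[right]
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=cluster_distance)
        distance = cluster_distance((left, right))
        merged = clusters.pop(left) + clusters.pop(right)
        clusters[left + right] = merged
        merges.append((left, right, distance))
    return merges


# Hypothetical expression profiles across two samples.
PROFILES = {
    "g1": [0.0, 0.0],
    "g2": [0.0, 1.0],
    "g3": [10.0, 10.0],
    "g4": [10.0, 12.0],
}

merges = average_linkage(PROFILES)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

The merge history is what a dendrogram draws: the closest pair joins first, and the heights of later joins reflect how dissimilar the merged groups are.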
