All pages
Powered by GitBook
1 of 18

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Bulk RNA-Seq

This tutorial gives an overview of RNA-Seq analysis with Partek Flow. It will guide you through creating an RNA-Seq analysis pipeline. The goals of the analysis are to create a list of differentially expressed genes, visualize these gene expression signatures by hierarchical clustering, and interpret the gene lists using gene ontology (GO) enrichment.

This tutorial will illustrate:

  • Importing the tutorial data set

  • Adding sample attributes

Description of the Data Set

This tutorial uses a subset of the data set published in Xu et al. 2013 (PMID: 23902433). In the experiment, mRNA was isolated from HT29 colon cancer cells treated with the drug 5-aza-deoxy-cytidine (5-aza) at three different doses: 0μM (control), 5μM, or 10μM. The mRNA was sequenced using Illumina HiSeq (paired end reads). The goal of the experiment was to identify differentially expressed genes between the different treatment groups.

Additional Assistance

If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

Trimming bases and filtering reads

Based on pre-alignment QA/QC, we need to trim low quality bases from the 3' end of reads.

  • Click the Unaligned reads data node

  • Click Pre-alignment tools in the task menu

  • Click Trim bases (Figure 1)

Importing the tutorial data set

The tutorial data set includes 9 samples equally divided into 3 treatment groups. Sequencing was performed by an Illumina HiSeq (paired-end reads), but the workflow can be easily adapted for data generated by other sequencers. Each sample has 2 fastq files for a total of 18 fastq files.

You can obtain the tutorial data set through Partek Flow.

  • Click your avatar

  • Click Settings in the drop-down menu (Figure 1)

At the top of the System information page, there is a section labeled Tutorial data (Figure 2).

Saving and running a pipeline

By following the steps in this tutorial, you have built a pipeline. You can save this pipeline for future use.

  • Select Create new pipeline near the bottom left-hand side of the Analyses tab

  • Select the Pre-alignment QA/QC, Trim bases, STAR, Post-alignment QA/QC, Quantify to annotation model, Filter features, Normalize counts, PCA, and DESeq2 task nodes to include them in the pipeline. Note that skipping tasks within a pipeline is not allowed if the task uses the upstream output

Adding sample attributes

Attributes describe samples. Examples of sample attributes include treatment group, age, sex, and time point. Attributes can be added individually in the Metadata tab or in bulk . In this tutorial, we will add one attribute, 5-AZA Dose, manually.

  • Click the Metadata tab

  • Click Manage under Sample attributes (Figure 1)

Exploring the data set with PCA

The principal components analysis (PCA) scatter plot allows us to visualize similarities and differences between the samples in a data set.

  • Click the Normalized counts data node

  • Click Exploratory analysis in the task menu

  • Click PCA

Viewing a dot plot for a gene

In addition to the volcano plot showing all genes, we can view expression levels of each gene on a dot plot.

  • Double-click the Filtered feature list data node to open the task report

  • Click the FDR step up header in the 5uM vs. 0uM section to sort by ascending FDR step up

In the task report table, there is a column labeled View with three icons in each row.

Quantifying to an annotation model

RNA-Seq uses the number of sequencing reads per gene or transcript to quantify gene expression. Once reads are aligned to a reference genome, we need to assign each read to a known transcript or gene to give a read-count per transcript or gene.

  • Click the Aligned reads data node

  • Click Quantification in the task menu

We will use Partek E/M to quantify reads to an annotation model in this tutorial. For more information about the other quantification options, please see the user guide.

Visualizing gene expression in Chromosome view

From the DESeq2 task report, we can browse to any gene in the Chromosome view.

  • Click in the KLHDC7B row to open Chromosome view (Figure 1)

A new tab will open showing KLHDC7B in the Chromosome view (Figure 2).

Chromosome View shows reference genome, annotation, and data set information together aligned at genomic coordinates.

Each track has Configure track and Move track buttons that can be used to modify each track.

The top track shows average number of total count normalized reads for each of the three treatment groups in a stacked histogram. The second track shows the Ensembl annotation.

Running post-alignment QA/QC

After alignment has completed, we can view the quality of alignment by performing post-alignment QA/QC.

  • Click the Aligned reads data node

  • Click QA/QC inthe task menu

  • Click Post-alignment QA/QC from the QA/QC section of the task menu (Figure 1)

Running pre-alignment QA/QC
Trimming bases and filtering reads
Aligning to a reference genome
Running post-alignment QA/QC
Quantifying to an annotation model
Filtering features
Normalizing counts
Exploring the data set with PCA
Performing differential expression analysis with DESeq2
Viewing DESeq2 results and creating a gene list
Viewing a dot plot for a gene
Visualizing gene expression in Chromosome view
Generating a hierarchical clustering heatmap
Performing biological interpretation
Saving and running a pipeline
our support page
By default, Trim bases removes bases starting at the 3' end and continuing until it finds a base pair call with a Phred score of equal to or greater than 35 (Figure 2).
  • Click Finish to run Trim bases with default settings

Figure 2. Configure the Trim bases task

The Trim bases task will generate a new data node, Trimmed reads (Figure 3). We can view the task report for Trim bases by double-clicking either the Trim bases task node or the Trimmed reads data node or choosing Task report from the task menu.

Figure 3. A task and a data node are created from the Trim bases task
  • Double-click the Trimmed reads data node to open the task report

The report shows the percentage of trimmed reads and reads removed in a table and two graphs (Figure 4).

Figure 4. Results of the Trim bases task

The results are fairly consistent across samples with ~2% of reads untrimmed, ~86% trimmed, and ~12% removed for each. The average quality score for each sample is increased with higher average quality scores at the 3' ends.

  • Click RNA-Seq 5-AZA to return to the Analyses tab

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Figure 1. Invoking the Trim bases task
Figure 2. Click RNA-Seq 5-AZA to download the sample data set
  • Click RNA-Seq 5-AZA to download the bulk RNA-Seq tutorial data set

A new project will be created and you will be directed to the Analyses Tab. The data will be downloaded automatically (Figure 3) and imported into your project. Because this is a tutorial project, there is no need to click on Add data as it will be done automatically.

Figure 3. A new project will be automatically created and populated with the tutorial data set

At first the project is empty, but the file download will start automatically in the background. You can wait a few minutes then refresh your browser or you can monitor the download progress using the Queue.

  • Click Queue

  • Click View Queued Tasks in the drop-down menu

The Queued tasks page will open (Figure 4).

Figure 4. Viewing download progress in the Queued tasks page
  • Click Projects

  • Click RNA-Seq 5-AZA in the drop-down menu

The Analyses tab will open (Figure 5). If the download has completed, you will see a blue circle titled mRNA.

The project name can be changed at any time by clicking the Project settings tab.

Figure 5. Analyses tab showing a mRNA data node

Once the download completes, the sample table will appear in the Metadata tab.

  • Click the Metadata tab. The Metadata tab includes the sample table with the names of each imported sample (Figure 6).

Figure 6. Sample table in the Data tab after downloading the tutorial data set

In the next section of the tutorial, we will add a sample attribute that indicates the treatment group of each sample.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Figure 1. Location of the Settings link on the main page of Partek Flow

Name the pipeline as RNA-Seq basic analysis

  • Give a description for the pipeline; we have noted trimming, alignment, filter genes, normalization, DESeq2

  • Click Create pipeline (Figure 1)

  • Figure 1. Select tasks to include then provide a name and description for the new pipeline

    To access this pipeline in the future, select an unaligned reads data node and choose Pipelines from the task menu. Available saved pipelines will be available to choose from the Pipelines section of the task menu (Figure 2).

    Figure 2. Created pipelines will appear in the Pipelines section of the task menu

    After selecting the pipeline, you will be prompted to choose the reference genome for alignment, the annotation for quantification, and the comparisons for DESeq2. After selecting these options, the pipeline will automatically run. See Pipelines for more information.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Click Add new attribute (Figure 2)
    Figure 2. Adding a new attribute
    • To configure a new attribute, at Name, type in 5-AZA Dose as the name of the attribute

    • Click Add to add 5-AZA Dose as a categorical, project-specific attribute (Figure 3)

    Figure 3. Configuring a new attribute
    • Name the first New category 0uM

    • Click the plus icon to add category (Figure 4)

    • Repeat for two additional categories, 5uM and 10uM (Figure 5)

    Figure 4. Creating attribute category
    Figure 5. All category attributes for 5-AZA Dose
    • Click Back to metadata tab

    The data table now includes an Attribute column for 5-AZA Dose (Figure 6). Next, we need to assign samples attribute categories for 5-AZA Dose.

    Figure 6. Data table updated with column for 5-AZA Dose
    • Select Assign values

    The option to edit the 5-AZA Dose field for each sample will appear as a drop-down menu (Figure 7).

    Figure 7. Dropdown menu to select treatment for each sample
    • Select the 5-AZA Dose text box for a sample to bring up a drop-down menu with the 5-AZA Dose attribute categories (0uM, 5uM, 10uM)

    • Use the drop-down menus to add a treatment group for each sample

    The first three samples (SRR592573-5) should be 0uM, the next three samples (SRR592576-8) should be 5uM, and the final three samples (SRR592579-81) should be 10uM (Figure 8).

    • Click Apply changes

    Figure 8. Add 5-AZA dose to each sample as shown

    The data table will now show each a 5-AZA Dose attribute for each sample.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    using a text file
    Figure 1. Adding sample attributes manually

    Click Finish to run PCA with the default options

    A PCA task node and a PCA data node will be added to the pipeline (Figure 1)

    Figure 1. PCA task node and data node are added to the pipeline
    • Double click the PCA data node to open the PCA scatter plot and corresponding information in the Data Viewer (Figure 2)

    Figure 2. Viewing the PCA plot in Data Viewer

    In the Data Viewer, select the 3D PCA scatter plot then click Style under Configure and set the Color by drop-down to 5-AZA Dose. The scatter plot shows each sample as a sphere, colored by treatment group, in a three dimensional plot. The x, y, and z axes are the first three principal components. The percentage of total variance explained by each is listed next to the axis label. The size of each axis is determined by the variance along that axis. The plot is fully interactive; it can be rotated and points selected.

    Here, we can see that samples separate based on treatment, but there is noticeable separation within treatment groups, particularly the 0μM and 10μM treatment groups.

    For more detailed information about the PCA scatter plot, please see the PCA user guide.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Select to open a dot plot for the gene KLHDC7B

    The dot plot for KLHDC7B (Figure 1) shows each sample as a point with normalized reads count on the y-axis. Samples are separated and colored by treatment group.

    Figure 1. Dot plot for KLHDC7B

    For more information about the dot plot, please see the Dot Plot user guide. To return to the DESeq2 report, switch to the browser table with filtered feature list.

    In the DESeq2 report, you can select to view additional information about the statistical results for a gene or select to view the region in Chromosome View. Chromosome View is discussed in the next section of the tutorial.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Chromosome icon
    Scatter plot icon
    Notebook icon

    Click Quantify to an annotation model (Partek E/M) (Figure 1)

    Figure 1. Invoking Quantify to an annotation model (Partek E/M)

    We will use the default options for quantification. To learn more about the different options, please see the Quantify to annotation model (Partek E/M) user guide.

    • Choose Ensembl Transcripts release 105 annotation from the Annotation model drop-down menu (you may need to download it first, via Library File Management or click Add annotation model from the bottom of the drop-down)

    • Click Finish (Figure 2)

    Figure 2. Configuring Quantify to annotation model (Partek E/M)

    The Quantify to annotation model task node outputs two data nodes, Gene counts and Transcript counts (Figure 3).

    Figure 3. The Quantify to annotation model task produces annotation model dependent outputs (e.g. Gene counts and Transcript counts node)

    To view the results of quantification, double-click an output data node to view the task report.

    The task report details the number of reads within exons, introns, and intergenic regions. For detailed information about the quantification results, see the Quantify to annotation model (Partek E/M) user guide.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Quantification

    We can add tracks from any data node using Select Tracks.

    • Click Select tracks

    A pop-up dialog showing the pipeline allows us to choose which data to display as tracks in Chromosome view (Figure 3).

    Figure 3. Choosing tracks to display in Chromosome view
    • Click Reads pileup under Aligned reads on the left-hand side of the dialog

    • Click Display selection to make the change

    The reads pileup track is now included (Figure 4).

    Figure 4. The reads pileup track shows every read in its aligned position on the reference genome

    To learn more about Chromosome view, please consult the Chromosome View user guide.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Figure 1. Browsing to a location in Chromosome View
    Figure 2. Viewing in Chromosome view
    Chromosome icon
    Configure track icon
    Move track icon

    A Post-alignment QA/QC task node will be generated (Figure 2).

    Figure 2. Post-alignment QA/QC task node
    • Double-click the Post-alignment QA/QC task node to view the task report

    Similar to the Pre-alignment QA/QC task report, general quality information about the whole data set is displayed and sample-level reports can be opened by clicking a sample name in the table.

    The first two graphs in the report (Figure 3) show the alignment breakdown and total reads in each sample.

    Figure 3. Alignment statistics for the data set

    From these graphs, we can see that more than 95% of reads were aligned, but the total number of reads for each sample varies. Normalizing for the variability in total read counts will be addressed in a later section of the tutorial.

    For more information about the graphs and information presented in the Post-alignment QA/QC task report, see the Post-alignment QA/QC user guide.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Figure 1. Invoking Post-alignment QA/QC

    Viewing DESeq2 results and creating a gene list

    Once we have performed DESeq2 to identify differentially expressed genes, we can create a list of significantly differentially expressed genes using cutoff thresholds.

    • Double click the DESeq2 data node to open the task report

    The task report shows genes on rows and the results of the DESeq2 on columns (Figure 1).

    Figure 1. Viewing the DESeq2 task report

    To get a sense of what filtering thresholds to set, we can view a volcano plot for a comparison.

    • Click next to the 5uM vs. 0uM comparison

    A volcano plot will open showing p-value on the y-axis and fold-change on the x-axis (Figure 2). If the gene labels are on (not shown), click on the plot to turn them off.

    Thresholds for the cutoff lines are set using the Statistics card (Left panel > Configure > Statistics). The default thresholds are |2| for the X axis and 0.05 for the Y axis.

    • Switch to the browser tab showing the DESeq2 report

    • Click FDR step up

    • Click the triangle next to FDR step up to open the FDR step up options

    • Leave All contrasts selected

    This will include genes that have a FDR step up value of less than or equal to 0.05 for all three contrasts, 5μM vs. 0μM, 10μM vs. 0μM and 5μM:10μM vs. 0μM. FDR step up is the false discovery rate adjusted p-value used by convention in microarray and next generation sequencing data sets in place of unadjusted p-value.

    • Click Fold-change

    • Click the triangle next to Fold-change to open the Fold-change options

    • Leave All contrasts selected

    • Set to From -2 to 2 with Exclude range selected. Hit Enter.

    Note that the number of genes that pass the filter is listed at the top of the filter menu next to Results: and will update to reflect any changes to the filter. Here, 28 genes pass the filter (Figure 3). Depending on your settings, the number may be slightly different.

    • Click to create a data node with only the genes that pass the filter

    This creates a Filter list task node and a Filtered feature list data node (Figure 4).

    Additional Assistance

    If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

    Normalizing counts

    Because different samples have different total numbers of reads, it would be misleading to calculate differential expression by comparing read count numbers for genes across samples without normalizing for the total number of reads.

    • Click the Filtered counts data node

    • Click Normalization and scaling in the task menu

    • Click Normalization (Figure 1)

    The Normalization task set up page will open (Figure 2).

    Normalization can be performed by samples or by features. By samples is selected by default; this is appropriate for the tutorial data set.

    Available normalization methods are listed in the left-hand panel. For more information about these options, please see the user guide.

    For this tutorial, we will use the recommended default normalization settings.

    • Select

    This adds the Median ratio normalization method, which is suitable for performing differential expression analysis using DESeq2 (Figure 3).

    • Click Finish to perform normalization

    A Normalize counts task node and a Normalized counts data node are added to the pipeline (Figure 4)

    Additional Assistance

    If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

    Filtering features

    Low expression genes may be indistinguishable from noise and will decrease the sensitivity of differential expression analysis.

    • Click the Gene counts node

    • Click Filtering in the task menu

    • Click Filter features (Figure 1)

    Figure 1. Selecting Filter features
    • Click Noise reduction filter

    • Set the filter to maximum <= 1

    • Click Finish (Figure 2)

    A new Filtered counts data node will be created to que additional tasks (Figure 3) like Normalization.

    Additional Assistance

    If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

    Aligning to a reference genome

    With our reads trimmed, we now have high-quality reads for each sample. The next step is to align the reads to a reference genome. Alignment matches each of the short sequencing reads to a location in the reference genome.

    • Click Trimmed reads

    • Click Aligners in the task menu to display available aligners (Figure 1)

    Figure 1. A variety of different aligners are available in Partek Flow. In this tutorial we will use STAR aligner, a popular choice for aligning RNA-Seq data.

    Partek Flow offers a variety of different aligners. Mouse over any option for a short description. For this tutorial, we will use STAR, a fast and accurate aligner commonly used for RNA-Seq data. For more information about STAR and the other aligners, please consult the Aligners user guide.

    • Click STAR

    The STAR aligner options allow us to select the genome build (assembly) and index. For this tutorial, our data set contains only reads that map to chromosome 22 to minimize the time required for resource-intensive tasks, such as alignment.

    • Click Finish to run with hg38 selected for Assembly and Whole genome for the Aligner index (Figure 2)

    Alignment is a resource-intensive task and may take around 20 minutes to complete, even when mapping only reads from a single chromosome. Task and data nodes that have been queued, but not completed, are shown in a lighter color than completed tasks (Figure 3).

    The Align reads task generates an Aligned reads data node once complete. You can wait for the alignment task to finish or you can continue building the pipeline while the results of alignment are pending; additional tasks can be added to the pipeline and queued before the current task has completed.

    Additional Assistance

    If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

    Running pre-alignment QA/QC

    With attributes added, we can begin building our pipeline.

    • Click the Analyses tab

    In the Analysis tab, data are represented as circles, termed data nodes. One data node, mRNA, should be visible in the Analysis tab (Figure 1).

    • Click the mRNA node

    Clicking a data node brings up the context-sensitive task menu on the right with tasks that can be performed on the data node (Figure 2).

    Generating a hierarchical clustering heatmap

    To check how well our list of differentially expressed genes distinguishes one treatment group from another, we can perform hierarchical clustering based on the gene list. Clustering can also be used to discover novel groups within your data set, identify gene expression signatures distinguishing groups of samples, and to identify genes with similar patterns of gene expression across samples.

    • Click the Filtered feature list data node

    • Click Exploratory analysis in the task menu

    • Click

    Performing biological interpretation

    To learn more about the biology underlying gene expression changes, we can use gene ontology (GO) or pathway enrichment analysis. Enrichment analysis identifies over-represented GO terms or pathways in a filtered list of genes.

    Normalize counts
    our support page
    Figure 1. Invoking Normalization task
    Figure 2. Normalization task set up page
    Figure 3. Select recommended normalization method
    Figure 4. Normalize counts task node and Normalized counts data node
    Recommended button
    Scatter plot icon
    Notebook icon
    Chromosome icon

    Set the cutoff value to 0.05. Hit Enter.

    our support page
    Figure 2. Viewing DESeq2 results with a volcano plot
    Figure 3. Applying filters to the DESeq2 results
    Figure 4. Filter list and a new Feature list node are added to the pipeline
    Generate filtered node button
    Volcano icon
    our support page
    Figure 2. Filtering low expressed genes
    Figure 3. Filtered counts node
    our support page
    Figure 2. Configuring the STAR aligner
    Figure 3. Queued tasks and their output data nodes are shown in a lighter color
    Figure 2. Select a data node to open the context-sensitive task menu

    Pre-alignment QA/QC assesses the quality of the unaligned reads and will help us determine whether trimming or filtering is necessary.

    • Click Pre-alignment QA/QC in the QA/QC section of the task menu

    • Click Finish to run the task with default settings

    Running a task creates a task node, e.g. the blue rectangle labeled Pre-alignment QA/QC (Figure 3), which contains details on the task and a report. While tasks have been queued or are in progress they have a lighter color. Any output nodes that the task will generate are also displayed in a lighter color until the task completes. Once the task begins running, a progress bar is displayed on the task node.

    Figure 3. Tasks are represented as rectangles
    • Click the Pre-alignment QA/QC node

    The context-sensitive task menu (Figure 4) shows the option to view the Task report and the Task details. You can also access a task report by double-clicking on a task node.

    Figure 4. Navigating to the task report using the context-sensitive menu
    • Click Task report

    Pre-aligment QA/QC provides information about the sequencing quality of unaligned reads (Figure 5). Both project level summaries and sample-level summaries are provided.

    Figure 5. Viewing the pre-alignment QA/QC report
    • Click sample SSR592573 in the data table of the report to open its sample-level report

    The Average base quality score per position graph (Figure 6) gives the average Phred score for each position in the reads.

    Figure 6. Average base quality score per position for sample SRR592573

    A Phred score is a measure of base call accuracy with a higher score indicating greater accuracy.

    Phred Quality Score
    Probability of incorrect base call
    Base call accuracy

    20

    1 in 100

    99%

    30

    1 in 1000

    99.9%

    40

    1 in 10,000

    99.99%

    50

    1 in 100,000

    By convention, a score above 20 is considered adequate. As you can see, the standard error bars in the graph show that some reads have quality scores below 20 for some of their base pair calls near the 3' end.

    Based on the results of Pre-alignment QA/QC, while most of the reads are high quality, we will need to perform read trimming and filtering. For more information about the information included in the task report, please see the Pre-alignment QA/QC user guide.

    • Click RNA-Seq 5-AZA (the project name) to return to the Analyses tab

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Figure 1. Data are represented as circles
    Hierarchical clustering
    (Figure 1)
    Figure 1. Invoking Hierarchical clustering task

    The Hierarchical clustering menu will open (Figure 2). Hierarchical clustering can be performed with a heatmap or bubble map plot. Cluster must be selected under Ordering for both Feature order and Sample order if both the features (columns) and samples (rows) are to be clustered.

    Figure 2. Configuring Hierarchical clustering
    • Click Finish to run with default settings

    A Hierarchical clustering task node will be added to the pipeline (Figure 3).

    Figure 3. Hierarchical clustering task node
    • Double-click the Hierarchical clustering / heatmap task node to view the heatmap

    The Dendrogram view will open showing a heatmap with the hierarchical clustering results (Figure 4).

    Figure 4. Viewing the hierarchical clustering heatmap

    Samples are shown on rows and genes on columns. Clustering for samples and genes is shown through the dendrogram trees. More similar samples/genes are separated by fewer branch points of the dendrogram tree.

    The heatmap displays standardized expression values with a mean of zero and standard deviation of one.

    The heatmap can be customized to improve data visualization using the menu on the Configuration panel on the left.

    • Click Annotations under Configure section in the left panel

    • Select 5-AZA Dose from Row annotation drop-down

    • Click Axes under Configure

    • Change Column labels Feature to gene_name using the drop-down

    Samples are now labeled with their 5-AZA Dose group and column labels are Gene names (Figure 5).

    Figure 5. Samples labeled with their 5-AZA Dose group

    Samples from the 5μM and 10μM groups are more similar to each other than to the 0μM group.

    We can save the heatmap as a publication-quality image.

    • Click the Export image icon in the top right corner of the plot

    • Choose format, size and resolution in the Export image dialog (Figure 6)

    Figure 6. Choosing format, size and resolution for export
    • Click Save

    The heatmap will be saved as a .png file and downloaded in your web browser.

    For more information about hierarchical clustering and the Dendrogram view, please see the Hierarchical Clustering user guide.

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Enrichment analysis
    • Click the Filtered feature list data node

    • Click Biological interpretation in the task menu

    • Click Gene set enrichment then select Gene set database to perform GO enrichment analysis (Figure 1)

    Figure 1. Invoking enrichment analysis
    • Select the latest gene set from geneontology.org from the Gene set database drop-down menu

    • Click Finish

    A gene set enrichment task node and data node will be added to the pipeline (Figure 2).

    Figure 2. Gene set enrichment task node and data node
    • Double-click the Gene set enrichment task node to open the task report (Figure 3)

    Figure 3. Viewing the GO enrichment task report

    The gene set enrichment task report lists GO terms by ascending p-value with the most significant GO term at the top of the list. Also included are the enrichment score, the number of genes from that GO term in the list, and the number of genes from that GO term that are not in the list.

    To view the genes associated with each GO term, select to open the extra details page. To view additional information about a GO term, click the blue gene set ID in the first column to open the linked geneontology.org entry in a new browser tab.

    For more information about GO enrichment analysis, please see the Gene Set Enrichment user guide.

    KEGG enrichment analysis

    KEGG enrichment analysis identifies pathways that are over-represented in a gene list data node.

    • Click the Filtered feature list data node

    • Click Biological interpretation in the task menu

    • Click Gene set enrichment then select KEGG database

    • Click Finish in the configuration dialog to run KEGG analysis with the Homo sapiens KEGG database

    A Pathway enrichment task node and data node will be added to the pipeline (Figure 4).

    Figure 4. Pathway enrichment task node and data node
    • Double-click the Pathway enrichment task node to open the task report

    The Pathway enrichment task report is similar to the gene set enrichment analysis task report (Figure 5).

    Figure 5. Pathway enrichment task report

    To view an interactive KEGG pathway map, click the pathway ID (Gene set column).

    Additional Assistance

    If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

    Enrichment analysis
    KEGG enrichment analysis

    Performing differential expression analysis with DESeq2

    After normalizing the data, we can perform differential analysis to identify genes that are differentially expressed based on treatment.

    • Click the Normalized counts node

    • Click Statistics in the task menu

    • Click Differential analysis in the task menu (Figure 1)

    Select the appropriate differential analysis method (Figure 2). In this tutorial we are going to use DESeq2, but Partek Flow offers a number of alternatives. Hover the mouse over the symbol for more information on each differential analysis method, or see our user guide for a more in-depth look.

    • Check 5-AZA Dose and click Add factors to add the attribute to the statistical model.

    • Select Next to continue with 5-AZA Dose as the selected attribute

    The Comparisons page will open (Figure 4).

    It is easiest to think about comparisons as the questions we are asking. In this case, we want to know what are the differentially expressed genes between untreated and treated cells. We can ask this for each dose individually and for both collectively.

    The upper box will be the numerator and the lower box will be the denominator in the comparison calculation so we will select the 0μM control in the lower box.

    • Drag 5μM to the upper box

    • Drag 0μM to the lower box

    • Click Add comparison to add 5μM vs. 0μM to the comparison table (Figure 5)

    • Repeat to create comparisons for 10μM vs. 0μM and 10μM,5μM vs. 0μM (Figure 6)

    • Click Finish to perform DESeq2 as configured

    A DESeq2 task node and a DESeq2 data node will be added to the pipeline (Figure 7).

    Additional Assistance

    If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

    99.999%

    Differential Analysis
    our support page
    Figure 1. Navigating to the differential analysis options
    Figure 2. Select the method for differential analysis from the options provided.
    Figure 3. Selecting attributes to include in analysis
    Figure 4. The Comparison selector allows multiple comparisons to be designed and added
    Figure 5. Designing a comparison to add
    Figure 6. Comparisons for 5uM vs. 0uM, 10uM vs. 0uM, and 5uM:10uM vs. 0uM have been added
    Figure 7. DESeq2 task node and data node
    Info tip