# Differential methylation

Differential methylation analysis detects differentially methylated CpG loci (DML) or regions (DMR) between two conditions. The method is based on the Bioconductor package DSS (Dispersion Shrinkage for Sequencing data), which performs a count-based test. Implementation details can be found in the [DSS vignette](https://www.bioconductor.org/packages/devel/bioc/vignettes/DSS/inst/doc/DSS.html#33_DMLDMR_detection_from_two-group_comparison).

{% hint style="warning" %}
The current DSS implementation in Connected Multiomics supports DML/DMR detection from two-group comparisons only. DML/DMR detection from general experimental designs and from experiments without replicates is not supported.
{% endhint %}

This task can be invoked from the imported 5-base Methylation data node, which contains the total read count and the methylated read count for each CpG site.
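
As a rough illustration of the input described above, the sketch below models one CpG site as a pair of counts and derives a naive per-site methylation level (methylated reads divided by total reads). The class name `CpGCount`, its field names, and the example coordinates are hypothetical and do not reflect the actual data node format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CpGCount:
    """Hypothetical record for one CpG site in one sample."""
    chrom: str
    pos: int
    total_reads: int       # all reads covering the site
    methylated_reads: int  # reads that support methylation

    def methylation_level(self) -> Optional[float]:
        # A site with no coverage has no defined methylation level.
        if self.total_reads == 0:
            return None
        return self.methylated_reads / self.total_reads


# Made-up example values, for illustration only.
sites = [
    CpGCount("chr1", 10468, total_reads=20, methylated_reads=15),
    CpGCount("chr1", 10470, total_reads=0, methylated_reads=0),
]
levels = [site.methylation_level() for site in sites]
```

The count-based test works on the two counts directly rather than on this ratio, so the ratio here is only a convenient way to read the data.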

Click the 5-base Methylation data node and choose **Statistics > Differential Methylation**.

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-942f30e7de181a0f891d13a4b33651629f24edd8%2Fimage%20(341).png?alt=media" alt="" width="526"><figcaption></figcaption></figure></div>

Click **Next**. Select a categorical factor that contains the two groups to compare, then click **Next**.

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-ebab6b3ed2686924256042cd316c2e271c5f1475%2Fimage%20(342).png?alt=media" alt="" width="379"><figcaption></figcaption></figure></div>

Set up the comparison(s) based on the selected factor:

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-0050bc2b2d6915009b64383c2955a5e612be4ff5%2Fimage%20(343).png?alt=media" alt="" width="452"><figcaption></figcaption></figure></div>

The subgroups of the factor are displayed in the left panel; click to select one and move it to one of the boxes on the right. The difference for each comparison is calculated as the group in the top box minus the group in the bottom box. The dialog setup is similar to that of ANOVA/LIMMA-trend/LIMMA-voom.

Click **Configure** in *Advanced options* to customize the smooth span; the default value is 500. The p-value settings for DML and DMR are used to filter the results.
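
To give an intuition for what a smoothing span does, here is a minimal sketch of a moving average over CpG sites on one chromosome, assuming the span is a window width in base pairs centered on each site. The function name `moving_average_smooth` and the example values are hypothetical; this is a simplified illustration, not the DSS implementation.

```python
from bisect import bisect_left, bisect_right


def moving_average_smooth(positions, values, span=500):
    """Average each value with its neighbors within span/2 bp on either side.

    `positions` must be sorted in ascending order (one chromosome).
    """
    half = span / 2
    # Prefix sums make each window average an O(1) lookup.
    prefix = [0.0]
    for value in values:
        prefix.append(prefix[-1] + value)

    smoothed = []
    for pos in positions:
        lo = bisect_left(positions, pos - half)   # first site inside the window
        hi = bisect_right(positions, pos + half)  # one past the last site inside
        smoothed.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return smoothed


# Made-up positions and methylation levels, for illustration only.
positions = [100, 200, 300, 1000]
levels = [0.9, 0.6, 0.3, 0.2]
smoothed = moving_average_smooth(positions, levels, span=500)
```

In this sketch, the three nearby sites borrow information from each other, while the isolated site at position 1000 keeps its own value. A larger span pools more neighbors per site.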

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-67a47a16d853918e5702d0834090f40466bfea88%2Fimage%20(344).png?alt=media" alt="" width="357"><figcaption></figcaption></figure></div>

After applying the advanced options, click **Finish** to run the task.

The task generates two data nodes, *DML* and *DMR*, which represent differential methylation at the locus level and the region level, respectively.

Double-click the **DML** node to open the report:

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-9496cee0cb9b3a47639b8a4d57ed9c3c08123069%2Fimage%20(345).png?alt=media" alt=""><figcaption></figcaption></figure></div>

In this report, each row is a locus that passed the p-value cutoff set in the advanced options dialog:

* chr: Chromosome where the CpG site is located
* pos: Genomic base pair location of the CpG site
* pval: Raw p-value from the Wald test for the differential methylation at this site
* fdr: Adjusted p-value based on the Benjamini-Hochberg method
* diff: Difference in methylation level between the groups. Positive values indicate higher methylation in group 1; negative values indicate higher methylation in group 2.
* mu1 and mu2: Average methylation level in group 1 and group 2
* diff.se: Standard error of the estimated methylation difference between the two groups
* stat: Wald test statistic used to assess the significance of the methylation difference
* phi1 and phi2: Dispersion parameters estimated for group 1 and group 2. They represent the biological variability in methylation level within each group. A higher value indicates more variability within the group.
* postprob.overThreshold: Posterior probability that the methylation difference between the two groups exceeds a specified threshold (delta). The default delta is 0.
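
The relationship between several of these columns can be sketched as follows, assuming the Wald statistic is `diff / diff.se`, the raw p-value is two-sided under a standard normal distribution, and the FDR is the standard Benjamini-Hochberg adjustment. The function names and input values are hypothetical, and the numbers in the report may differ in detail.

```python
import math


def wald_test(diff, diff_se):
    """Return (stat, pval) for a two-sided Wald test under a normal reference."""
    stat = diff / diff_se
    # Two-sided tail probability: 2 * (1 - Phi(|stat|)) == erfc(|stat| / sqrt(2)).
    pval = math.erfc(abs(stat) / math.sqrt(2))
    return stat, pval


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up inputs, for illustration only.
stat, pval = wald_test(diff=0.2, diff_se=0.1)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Because the adjustment depends on how many loci are tested together, filtering loci before or after the adjustment can change the resulting FDR values.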

The left filter panel works the same way as in the [GSA report](https://help.multiomics.illumina.com/icm/analyses/analysis-functionality/task-menu/differential-analysis/gsa#gsa-report).

Double-click the **DMR** node to open the DMR report. This result is based on the DML results:

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-85a71d07d9b20ef35eb6cd594167d77b68cfcc50%2Fimage%20(346).png?alt=media" alt=""><figcaption></figcaption></figure></div>

In this report, each row is a region, that is, a cluster of CpG loci that show consistent differential methylation between the two groups.

* chr: Chromosome where the region is located
* start: Start position of the region in base pairs
* end: End position of the region in base pairs
* length: Length of the region in base pairs
* nCG: Number of CpG sites within the region
* abs(areaStat): Absolute value of the areaStat. A large value indicates strong evidence of differential methylation.
* diff.Methy: Difference in average methylation between the two groups
* meanMethy1 and meanMethy2: Average methylation level across the region in group 1 and group 2, respectively
* areaStat: Sum of the test statistics (stat in DML) across all the CpG sites in the region.
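
As a rough sketch of how the region-level columns relate to the locus-level results, the function below summarizes one region from its member CpG sites. It assumes that length is counted inclusively (`end - start + 1`), that the region means are plain averages of the per-site `mu1` and `mu2` values, and that `diff.Methy` is `meanMethy1 - meanMethy2`. The function name `summarize_region` and the input values are hypothetical.

```python
def summarize_region(chrom, sites):
    """Summarize one region from per-site DML rows sorted by position.

    Each site is a dict with the keys "pos", "mu1", "mu2", and "stat".
    """
    start = sites[0]["pos"]
    end = sites[-1]["pos"]
    n_cg = len(sites)
    mean_methy1 = sum(s["mu1"] for s in sites) / n_cg
    mean_methy2 = sum(s["mu2"] for s in sites) / n_cg
    area_stat = sum(s["stat"] for s in sites)  # sum of per-site Wald statistics
    return {
        "chr": chrom,
        "start": start,
        "end": end,
        "length": end - start + 1,  # assumption: inclusive coordinates
        "nCG": n_cg,
        "meanMethy1": mean_methy1,
        "meanMethy2": mean_methy2,
        "diff.Methy": mean_methy1 - mean_methy2,
        "areaStat": area_stat,
        "abs(areaStat)": abs(area_stat),
    }


# Made-up per-site values, for illustration only.
region = summarize_region("chr1", [
    {"pos": 100, "mu1": 0.8, "mu2": 0.3, "stat": 3.0},
    {"pos": 150, "mu1": 0.7, "mu2": 0.4, "stat": 2.5},
    {"pos": 220, "mu1": 0.9, "mu2": 0.2, "stat": 4.0},
])
```

Since areaStat is a sum, regions with many CpG sites can reach a large value even when each site shows only a moderate statistic, which is why it is useful to read it alongside nCG and length.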

### References

Feng, Hao, Karen N Conneely, and Hao Wu. 2014. “A Bayesian Hierarchical Model to Detect Differentially Methylated Loci from Single Nucleotide Resolution Sequencing Data.” *Nucleic Acids Research* 42 (8): e69–e69.

Park, Yongseok, and Hao Wu. 2016. “Differential Methylation Analysis for BS-seq Data Under General Experimental Design.” *Bioinformatics* 32 (10): 1446–53.

Wu, Hao, Chi Wang, and Zhijin Wu. 2012. “A New Shrinkage Estimator for Dispersion Improves Differential Expression Detection in RNA-seq Data.” *Biostatistics* 14 (2): 232–43.

Wu, Hao, Tianlei Xu, Hao Feng, Li Chen, Ben Li, Bing Yao, Zhaohui Qin, Peng Jin, and Karen N Conneely. 2015. “Detection of Differentially Methylated Regions from Whole-Genome Bisulfite Sequencing Data Without Replicates.” *Nucleic Acids Research* 43 (21): e141–e141.
