Annotate Variants (VEP)

An important part of variant analysis is prioritizing variants for downstream analysis. Because variant detection often identifies a large number of variants, it can be difficult to determine which ones may affect phenotypes. As implemented in Connected Multiomics, the Ensembl Variant Effect Predictor (VEP, version 84) [1] adds detailed annotation to the variants in the analysis, including discrete aspects of transcript models and information from variant databases that are not available in the Annotate variants task. For variants identified in human data, predictions from SIFT [2] and PROVEAN [3], two popular tools that predict the impact of variants causing amino acid changes, are included (available for the hg19 genome assembly). VEP databases can be obtained for multiple species, and their content depends on the transcript and variant information available for each organism. The Annotate variants (VEP) task can be invoked from any Variants or Annotated variants data node, and it supplements any existing annotation in the VCF files. The annotation is also visible in the downstream View variants Variant report.

Annotate variants (VEP) dialog

The task dialog for Annotate variants (VEP) contains two sections: Select Variant Effect Predictor database and Advanced options. The Select Variant Effect Predictor database section specifies the reference assembly that was used for variant detection. If variant detection was performed in Connected Multiomics, the Assembly is displayed as text in this section. The first time the task is used, click the Create variant effect predictor database button to import a database. The VEP database for hg19 is available for automated download in Connected Multiomics; information on obtaining databases for other species and genome assemblies can be found in the VEP documentation.

The Advanced options section controls which aspects of annotation the Annotate variants (VEP) task generates. When the task dialog opens, Option set is set to Default. Clicking Configure opens a window for specifying additional annotation components. The VEP Advanced options are grouped under Identifiers, Output options, and Co-located variants. Hovering the mouse cursor over the info button displays details for each parameter.

The report includes variant impact information, a subjective classification of the severity of the variant consequence:

  • Low: a variant that is assumed to be mostly harmless or unlikely to change protein behavior.

  • Moderate: a non-disruptive variant that might change protein effectiveness.

  • Modifier: usually non-coding variants or variants affecting non-coding genes, where predictions are difficult or there is no evidence of impact.

  • High: a variant assumed to have a high (disruptive) impact on the protein, probably causing protein truncation, loss of function, or triggering nonsense-mediated decay.
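The impact categories above lend themselves to a simple prioritization filter. The following Python sketch is illustrative only and is not part of Connected Multiomics. It assumes that the annotated VCF stores VEP output in a CSQ INFO field whose sub-field order is declared in the header (the convention used by standalone VEP) and that an IMPACT sub-field is present; whether files exported from Connected Multiomics follow this layout is an assumption. The function names (csq_fields, worst_impact, filter_by_impact) are hypothetical.

```python
# Illustrative sketch: keep variants whose most severe VEP impact meets a threshold.
# Assumption: annotations live in a CSQ INFO field, with sub-field order given in
# the VCF header as "... Format: Allele|Consequence|IMPACT|...".

# Severity order of the impact categories, from least to most severe.
IMPACT_RANK = {"MODIFIER": 0, "LOW": 1, "MODERATE": 2, "HIGH": 3}


def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the VCF header, or []."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ") and "Format: " in line:
            fmt = line.split("Format: ", 1)[1]
            # Drop the trailing quote and angle bracket that close the header line.
            return fmt.strip().rstrip('">').split("|")
    return []


def worst_impact(info, fields):
    """Return the most severe IMPACT across all CSQ entries of one INFO string."""
    if "IMPACT" not in fields:
        return None
    idx = fields.index("IMPACT")
    worst = None
    for entry in info.split(";"):
        if not entry.startswith("CSQ="):
            continue
        # One comma-separated annotation per affected transcript or feature.
        for annotation in entry[len("CSQ="):].split(","):
            values = annotation.split("|")
            if idx >= len(values):
                continue
            impact = values[idx]
            if impact in IMPACT_RANK and (
                worst is None or IMPACT_RANK[impact] > IMPACT_RANK[worst]
            ):
                worst = impact
    return worst


def filter_by_impact(lines, min_impact="MODERATE"):
    """Return (chrom, pos, ref, alt, impact) for variants at or above min_impact."""
    lines = list(lines)
    fields = csq_fields([line for line in lines if line.startswith("#")])
    threshold = IMPACT_RANK[min_impact]
    kept = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF data line
        impact = worst_impact(cols[7], fields)
        if impact is not None and IMPACT_RANK[impact] >= threshold:
            kept.append((cols[0], cols[1], cols[3], cols[4], impact))
    return kept
```

To use the sketch, read the lines of an annotated VCF file and pass them to filter_by_impact, for example with min_impact="HIGH" to keep only variants with at least one high-impact consequence. Because a variant can overlap several transcripts, the sketch ranks each variant by its most severe annotation; other choices, such as considering only a canonical transcript, may suit some analyses better.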

References

  1. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F. The Ensembl Variant Effect Predictor. Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4.

  2. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003 Jul 1;31(13):3812-4. doi: 10.1093/nar/gkg509.

  3. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7(10):e46688. doi: 10.1371/journal.pone.0046688.
