Library Files
The library files associated with the selected assembly are organized into several sections. Each section is described below.
Reference Files
This section includes two types of library file: reference sequence and cytoband files.
Reference sequences are the chromosome/scaffold/contig DNA sequences for a species. A reference sequence file is typically in FASTA or 2bit format. The reference sequence of a species is used for aligner index creation, variant detection and visualization of the reference sequence in the Chromosome view.
Cytoband files are used for drawing ideograms of chromosomes in the Chromosome view, including positions of cytogenetic bands if known.
Gene sets
Gene set files are required for biological interpretation analyses (e.g. GO enrichment). In a gene set file, genes are grouped together according to their biological function. Gene set files must be in GMT format, a tab-delimited format in which each row represents one gene set. The first column of a GMT file is the GO ID or gene set name. The second column is an optional text description. Subsequent columns are the symbols of the genes that belong to the gene set. Gene ontologies for various model organisms are available for automatic download from the ICM repository (source: geneontology.org). Because gene ontologies are frequently updated, geneontology.org is checked for updates quarterly. You can check the repository for recent updates.
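The GMT layout described above can be illustrated with a short Python sketch. This is only an illustration of the file format, not the software's own parser: the function name parse_gmt and the two example rows are made up for this example.

```python
import io


def parse_gmt(handle):
    """Read GMT text and return {gene_set_name: (description, [gene_symbols])}."""
    gene_sets = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected name, description and at least one gene: {line!r}")
        name, description = fields[0], fields[1]
        genes = [g for g in fields[2:] if g]  # drop empty trailing columns
        gene_sets[name] = (description, genes)
    return gene_sets


# Hypothetical two-row GMT file; the second row leaves the description empty.
example = (
    "GO:0006915\tapoptotic process\tTP53\tBAX\tCASP3\n"
    "GO:0007049\t\tCDK1\tCCNB1\n"
)
gene_sets = parse_gmt(io.StringIO(example))
for name, (description, genes) in gene_sets.items():
    print(name, description or "(no description)", len(genes))
```

Note that the description column is still present (as an empty field) when no description is given, so the gene symbols always start in the third column.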
Variant annotations
Variant annotation databases are collections of known genomic variants (e.g. single nucleotide polymorphisms). If you have performed a variant detection study, detected variants can be searched against variant annotation library files to see if the detected variants are known from previous studies. Furthermore, you can validate detected variants against 'gold-standard' variant annotation library files. Variant annotation files are typically in VCF format.
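As a rough illustration of what searching detected variants against a known-variant file involves, the Python sketch below indexes the first five VCF columns (CHROM, POS, ID, REF, ALT) and looks up each detected variant by chromosome, position, reference allele and alternate allele. It is a simplified sketch under stated assumptions, not the method used by the software: the function name, the variant IDs and all coordinates are hypothetical, and real comparisons usually also need consistent chromosome naming and normalized indel representation.

```python
import io


def load_known_variants(handle):
    """Index VCF records by (chrom, pos, ref, alt) -> variant ID."""
    known = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, header and blank lines
        chrom, pos, variant_id, ref, alts = line.rstrip("\r\n").split("\t")[:5]
        for alt in alts.split(","):  # one entry per alternate allele
            known[(chrom, int(pos), ref, alt)] = variant_id
    return known


# Hypothetical known-variant file (IDs and positions are made up).
known_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\tknown_1\tA\tG\t.\t.\t.\n"
    "chr2\t500\tknown_2\tC\tT,G\t.\t.\t.\n"
)
known = load_known_variants(io.StringIO(known_vcf))

# Hypothetical detected variants: (chrom, pos, ref, alt).
detected = [("chr1", 12345, "A", "G"), ("chr2", 500, "C", "G"), ("chr3", 77, "T", "C")]
annotated = [(variant, known.get(variant)) for variant in detected]
for variant, variant_id in annotated:
    print(variant, variant_id if variant_id else "not previously reported")
```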
Variant annotation databases from commonly used sources (e.g. dbSNP) are available for automatic download from the Connected Multiomics repository. Because variant annotation databases are frequently updated, these sources are checked for updates quarterly. You can check for recent updates to the repository.
SnpEff variant databases
SnpEff (Cingolani et al.; see References) is a variant annotation and effect prediction tool that requires its own variant annotation files, separate from the other variant annotation library files. If you wish to use SnpEff, library files need to be added to this section.
VEP database
The Ensembl Variant Effect Predictor (VEP) is another variant annotation and effect prediction tool that requires its own annotation files, separate from the variant annotation library files. If you wish to use VEP, library files need to be added to this section.
Annotation models
Annotation models describe genomic features (e.g. genes, transcripts, microRNAs) for a specific version of the reference sequence. Annotation models contain labels (e.g. gene ID) and genomic coordinates (e.g. chromosome, start & stop position) for each feature.
Annotation models will appear in separate tables (Figure 1). If you have multiple versions of annotation models from the same source, it is advisable to distinguish them by their date or version number.
Annotation models from commonly used sources (e.g. Ensembl) are available for automatic download from the ICM repository. Because annotation models are frequently updated, these sources are checked for updates quarterly. You can check for recent updates to the ICM repository.
Annotation models are used for quantification in gene expression analyses, annotating detected variants (e.g. to predict amino acid changes), visualizations in Chromosome view, and generating coverage reports. Typical file formats include GTF, GFF, GFF3 and BED.
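To make the idea of labels plus genomic coordinates concrete, the Python sketch below reads a single feature line in GTF format, which has nine tab-separated columns ending in an attributes column of key "value" pairs. It is a minimal sketch for illustration only: the function name parse_gtf_line, the gene ID and the coordinates are hypothetical, and it does not handle comment lines or the differing attribute syntax of GFF3.

```python
import re

# Matches GTF attributes of the form: key "value";
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def parse_gtf_line(line):
    """Return the coordinates and labels of one GTF feature line as a dict."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    return {
        "chrom": fields[0],
        "source": fields[1],
        "feature": fields[2],
        "start": int(fields[3]),  # 1-based, inclusive
        "end": int(fields[4]),    # 1-based, inclusive
        "strand": fields[6],
        "attributes": dict(ATTRIBUTE.findall(fields[8])),
    }


# Hypothetical feature line (IDs and coordinates are made up).
line = 'chr1\texample\tgene\t1000\t2000\t.\t+\t.\tgene_id "GENE0001"; gene_name "DEMO1";'
feature = parse_gtf_line(line)
print(feature["attributes"]["gene_id"], feature["chrom"], feature["start"], feature["end"])
```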

The arrows next to the annotation model name expand/collapse each table. Two of the annotation models displayed in Figure 1 are different versions from the same source (Ensembl), distinguishable by their version number (release 105 vs. 98).
microRNA target maps
A microRNA target map is required for the Get microRNA targets task. Databases from TargetScan for various model organisms are available for automatic download from the ICM repository (source: https://www.targetscan.org/vert_80/).
References
Cingolani P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012; 6(2):80-92. PMID: 22728672
