VAT

Data formats

VCF

The Variant Call Format (VCF) is a tab-delimited text file format for representing many kinds of genetic variants, including single nucleotide polymorphisms (SNPs), small insertions and deletions (Indels), and structural variants (SVs). This format was developed as part of the 1000 Genomes Project. A detailed summary of this file format is available in the VCF specification. VAT stores its annotation in the INFO field under the VA (Variant Annotation) tag. The string with the variant information has the following format:

AlleleNumber:GeneName:GeneId:Strand:Type:FractionOfTranscriptsAffected:{List of transcripts}

All annotated variants use the above format to capture information about the gene. The format describing the list of affected transcripts depends on the variant class (SNP, Indel, or SV) and the variant type as shown in the table below:

The allele number identifies the allele that the annotation applies to. By definition, the reference allele has allele number zero, whereas the alternate alleles are numbered starting at one (some variants have more than one alternate allele). The type refers to the type of variant. For SNPs (generated by snpMapper), the type can take one of the following values: synonymous, nonsynonymous, prematureStop, removedStop, and spliceOverlap. For Indels (generated by indelMapper), the type can take one of the following values: spliceOverlap, startOverlap, endOverlap, insertionFS, insertionNFS, deletionFS, and deletionNFS, where FS denotes 'frameshift' and NFS denotes 'non-frameshift'. The term spliceOverlap (for both SNPs and Indels) refers to a genetic variant that overlaps with a splice site (either the two nucleotides downstream of an exon or the two nucleotides upstream of an exon).
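The field layout described above can be sketched as a small parser. This is only an illustration and not part of VAT: the names VaEntry and parse_va_entry are hypothetical. It assumes that the first six colon-separated fields are always present, and it leaves the transcript list unparsed because its layout depends on the variant class and type. The entry used here is the VA value from Example 1 below.

```python
from typing import List, NamedTuple


class VaEntry(NamedTuple):
    """One VA annotation entry (hypothetical helper type, not part of VAT)."""
    allele_number: int
    gene_name: str
    gene_id: str
    strand: str
    variant_type: str
    transcripts_affected: int
    transcripts_total: int
    transcript_fields: List[str]  # raw, layout depends on variant class/type


def parse_va_entry(entry: str) -> VaEntry:
    """Split a single VA entry into its gene-level fields."""
    parts = entry.split(":")
    if len(parts) < 6:
        raise ValueError("expected at least 6 ':'-separated fields: %r" % entry)
    allele, gene_name, gene_id, strand, variant_type, fraction = parts[:6]
    # FractionOfTranscriptsAffected is written as affected/total, e.g. 1/5.
    affected, total = fraction.split("/")
    return VaEntry(
        allele_number=int(allele),
        gene_name=gene_name,
        gene_id=gene_id,
        strand=strand,
        variant_type=variant_type,
        transcripts_affected=int(affected),
        transcripts_total=int(total),
        transcript_fields=parts[6:],
    )


example_entry = (
    "1:EPHB2:ENSG00000133216:+:prematureStop:1/5:"
    "EPHB2-001:ENST00000400191:3165_3055_1019_K->*"
)
parsed = parse_va_entry(example_entry)
print(parsed.gene_name, parsed.variant_type, parsed.transcript_fields)
```

Keeping the transcript part as a raw list avoids guessing at a layout that differs between SNPs, Indels, and SVs.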

Example 1: A SNP introduces a premature stop codon. This variant affects one out of five transcripts for this gene.

chr1	23112837	.	A	T	.	PASS	AA=A;AC=7;AN=118;DP=168;SF=2;VA=1:EPHB2:ENSG00000133216:+:prematureStop:1/5:EPHB2-001:ENST00000400191:3165_3055_1019_K->*

Example 2: A SNP leads to a non-synonymous substitution. This variant affects two out of four transcripts for this gene.

chr1	1110357	.	G	A	.	PASS	AA=G;AC=3;AN=118;DP=203;SF=2;VA=1:TTLL10:ENSG00000162571:+:nonsynonymous:2/4:TTLL10-001:ENST00000379288:1212_1187_396_R->H:TTLL10-202:ENST00000400931:1212_1187_396_R->H

Example 3: A SNP causes a non-synonymous substitution in one transcript and a splice overlap in another transcript of the same gene.

chr9	35819390	rs2381409	C	T	.	PASS	AA=N;AC=157;AN=240;DP=49;SF=0,1;VA=1:TMEM8B:ENSG00000137103:+:nonsynonymous:1/7:TMEM8B-202:ENST00000360192:2109_166_56_P->S,1:TMEM8B:ENSG00000137103:+:spliceOverlap:1/7:TMEM8B-001:ENST00000450762:2106

Example 4: An Indel with two alternate alleles. Each alternate allele leads to a non-frameshift deletion.

chr7	140118541	.	TACAACAACA	T,TACA	.	PASS	HP=1;VA=1:AC006344.1:ENSG00000236914:+:deletionNFS:1/1:AC006344.1-201:ENST00000434223:66_23_8_LQQQ->L,2:AC006344.1:ENSG00000236914:+:deletionNFS:1/1:AC006344.1-201:ENST00000434223:66_23_8_LQQ->L

Notice that multiple annotation entries are comma-separated. Multiple annotation entries arise when a variant has different types of effects on different transcripts (Example 3) or when there are multiple alternate alleles (Example 4).
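As a rough sketch of how the comma-separated entries could be pulled out of an INFO column and grouped by allele number, consider the following. The function names extract_va_entries and group_by_allele are hypothetical, and the code assumes that commas occur inside the VA value only as entry separators, which holds for the examples above but is not guaranteed here for every variant class.

```python
from collections import defaultdict
from typing import Dict, List


def extract_va_entries(info: str) -> List[str]:
    """Return the comma-separated VA entries of an INFO column (empty if no VA tag)."""
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep and key == "VA":
            return value.split(",")
    return []


def group_by_allele(entries: List[str]) -> Dict[int, List[str]]:
    """Group VA entries by their leading allele number."""
    grouped = defaultdict(list)
    for entry in entries:
        allele_number = int(entry.split(":", 1)[0])
        grouped[allele_number].append(entry)
    return dict(grouped)


# INFO column from Example 4: two alternate alleles, one entry each.
info_example_4 = (
    "HP=1;VA=1:AC006344.1:ENSG00000236914:+:deletionNFS:1/1:"
    "AC006344.1-201:ENST00000434223:66_23_8_LQQQ->L,"
    "2:AC006344.1:ENSG00000236914:+:deletionNFS:1/1:"
    "AC006344.1-201:ENST00000434223:66_23_8_LQQ->L"
)
for allele, entries in sorted(group_by_allele(extract_va_entries(info_example_4)).items()):
    print(allele, len(entries))
```

With the INFO column from Example 3, both entries would end up under allele number 1, because the two effects belong to the same alternate allele.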

VAT also enables the grouping of samples. For example, samples can be assigned to different sub-populations, or they can be designated as cases or controls. This is done by modifying the header line using vcfModifyHeader. Specifically, each sample name is prefixed with a group identifier, using the ':' character as a delimiter.
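To illustrate the group prefix, the sketch below reads a header line that has already been modified and collects the samples per group. It does not call vcfModifyHeader. The function name samples_by_group, the sample names, and the group names are made up for illustration, and the code assumes the standard VCF column layout in which the sample columns follow the eight fixed columns and the FORMAT column.

```python
from collections import defaultdict
from typing import Dict, List


def samples_by_group(header_line: str) -> Dict[str, List[str]]:
    """Map each group identifier to the sample names carrying that prefix."""
    columns = header_line.rstrip("\n").split("\t")
    groups = defaultdict(list)
    # Columns 0-8 are CHROM..INFO plus FORMAT; samples start at index 9.
    for column in columns[9:]:
        group, sep, sample = column.partition(":")
        if sep:
            groups[group].append(sample)
        else:
            # No ':' delimiter: treat the sample as ungrouped (assumption).
            groups[""].append(column)
    return dict(groups)


# Hypothetical header line with made-up group and sample names.
header = "\t".join(
    ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
     "case:S1", "case:S2", "control:S3"]
)
print(samples_by_group(header))
```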
