VAT/download
From GersteinInfo
Revision as of 15:49, 6 June 2011
External Software
Required
- GSL - GNU Scientific Library (version-1.14; required for libBIOS, which is a general C library).
- BlatSuite - BLAT and a collection of utility programs. These tools are utilized by VAT. Note: these executables must be part of the PATH.
- GD library - The GD library is used to create an image for each gene model and its associated variants (version-2.0.35; required by VAT).
- Tabix - Tabix (version-0.2.3) is a generic tool that indexes position-sorted files in tab-delimited formats to facilitate fast retrieval. These tools are utilized by VAT. Note: these executables must be part of the PATH.
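Because the BLAT and Tabix executables must be on the PATH, it can help to verify this before building VAT. The following is a minimal sketch, not part of VAT: the function name vat_check_path is hypothetical, and the executable names blat, tabix, and bgzip in the example call are assumptions about how the tools are installed on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of VAT): report which of the given
# executables can or cannot be found on the PATH.
# Returns 0 if all were found, 1 otherwise.
vat_check_path() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; the tool names are assumptions, adjust them as needed:
# vat_check_path blat tabix bgzip
```

A non-zero return value indicates that at least one tool still needs to be added to the PATH.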
Optional
- VCF tools - VCF tools consists of a suite of useful modules to manipulate VCF files.
VAT Download
Important Note
==============
THIS PACKAGE (VAT) IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
Source code
VAT is based on a general C library called libBIOS. Tarballs of libBIOS and VAT can be downloaded here:
- libBIOS-1.1.0.tar.gz
- VAT-0.5.tar.gz - Initial upload
Executables
Statically built binaries for UNIX can be found here:
- VAT-0.5-UNIX.tar.gz - 64bit version
License information
The software package is released under the Creative Commons license (Attribution-NonCommercial).
For more details please refer to the Permissions Page on the Gerstein Lab webpage.
Installation
Installation of the external GSL and GD libraries
In order to install VAT two external libraries must be installed first. The libBIOS library depends on GSL, whereas VAT makes use of the GD library. Please follow the instructions provided by each package. The GSL library can be installed on most systems using the following commands (for details, please refer to the specific instructions at the GNU Scientific Library website):
$ cd /path/to/gsl-1.14/
$ ./configure --prefix=`pwd`
$ make
$ make install
Similarly, the GD library can be installed on most systems with the following commands:
$ cd /path/to/gd-2.0.35/
$ ./configure --prefix=`pwd` --with-jpeg=/path/to/jpegLib/
$ make
$ make install
After they are installed, the first step to install VAT is the installation and configuration of libBIOS.
Installation and Configuration of libBIOS
Depending on where the three libraries (GSL, libBIOS, and GD) are installed, the following variables need to be set:
export CPPFLAGS="-I/path/to/gsl-1.14/include -I/path/to/libbios/include -I/path/to/gd-2.0.35/include"
export LDFLAGS="-L/path/to/gsl-1.14/lib -L/path/to/libbios/lib -L/path/to/gd-2.0.35/lib"
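The two variables follow a regular pattern: one -I<prefix>/include entry and one -L<prefix>/lib entry per library. As a rough sketch, the values could be assembled from a list of installation prefixes. The helper vat_flags below is hypothetical and not part of VAT or libBIOS; it also assumes that the prefixes contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of VAT): build a CPPFLAGS or LDFLAGS
# value from a list of installation prefixes.
#   vat_flags include PREFIX...  ->  -IPREFIX/include ...
#   vat_flags lib PREFIX...      ->  -LPREFIX/lib ...
vat_flags() {
    kind=$1
    shift
    case $kind in
        include) opt=-I ;;
        lib)     opt=-L ;;
        *) echo "usage: vat_flags include|lib PREFIX..." >&2; return 2 ;;
    esac
    flags=
    for prefix in "$@"; do
        # Separate entries with a single space, without a leading space.
        flags="${flags:+$flags }$opt$prefix/$kind"
    done
    printf '%s\n' "$flags"
}

# Example with the placeholder paths used on this page:
# CPPFLAGS=$(vat_flags include /path/to/gsl-1.14 /path/to/libbios /path/to/gd-2.0.35)
# LDFLAGS=$(vat_flags lib /path/to/gsl-1.14 /path/to/libbios /path/to/gd-2.0.35)
# export CPPFLAGS LDFLAGS
```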
libBIOS can be installed on most systems with the following commands:
$ cd /path/to/libbios-x.x.x/
$ ./configure --prefix=`pwd`
$ make
$ make install
Installation and Configuration of VAT
A few simple steps are required to install VAT:
$ cd /path/to/vat-x.x.x/
$ ./configure --prefix=`pwd`
$ make
$ make install
VAT includes a configuration file (vatConfirgurationTemplate.txt) that defines a set of variables used by a number of different programs. The name/value pairs are space- or tab-delimited. Empty lines and lines starting with '//' are ignored.
// ===============================================================================
// REQUIRED
// ===============================================================================
// Tabix directory (includes both tabix and bgzip)
TABIX_DIR /path/to/tabix-0.2.3
// ===============================================================================
// OPTIONAL (required only for CGIs)
// ===============================================================================
// CGI base URL (where the CGIs are located)
WEB_URL_CGI http://webserver.org/path
// Path to the web data directory where the preprocessed files are stored
WEB_DATA_DIR /path/to/public_html/path/to/VAT
// URL to preprocessed files
WEB_DATA_URL http://webserver.org/path/to/VAT
This file has to be configured properly by filling in the required information. Subsequently, the following environment variable (VAT_CONFIG_FILE) has to be set:
export VAT_CONFIG_FILE=/pathTo/vat/vatConfirgurationTemplate.txt
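To check that the configuration file is found and parsed as expected, the format described above (name/value pairs separated by spaces or tabs; empty lines and '//' lines ignored) can be read with a few lines of shell. This is only a sketch under those stated rules, not the parser VAT itself uses; the function name vat_config_get is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of VAT): print the value of KEY from a
# VAT-style configuration file.
#   vat_config_get KEY [FILE]
# FILE defaults to $VAT_CONFIG_FILE. Returns 1 if the file cannot be
# read or the key is not present.
vat_config_get() {
    key=$1
    file=${2:-${VAT_CONFIG_FILE:-}}
    if [ ! -r "$file" ]; then
        echo "cannot read config file: $file" >&2
        return 1
    fi
    awk -v key="$key" '
        /^[ \t]*$/ { next }    # skip empty lines
        /^\/\//    { next }    # skip lines starting with //
        $1 == key {
            value = $0
            # Drop the name and the delimiter, keep the rest as the value.
            sub(/^[ \t]*[^ \t]+[ \t]*/, "", value)
            print value
            found = 1
            exit
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example: show the Tabix directory from the active configuration.
# vat_config_get TABIX_DIR
```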
Setup of the web server
This step is optional, but useful for visualizing the results of processed data sets. The following steps are required:
- The executable vat_cgi has to be located in the cgi-bin directory on the web server
- The configuration file (vatConfirgurationTemplate.txt) must contain the pertinent information
- The following .htaccess file should be added to the cgi-bin:
SetEnv VAT_CONFIG_FILE /path/to/vatConfirgurationTemplate.txt
- The web data directory (defined by WEB_DATA_DIR in the configuration file) requires the following information:
  - Preprocessed annotation sets (gencode3b.interval, gencode3b.fa, gencode3c.interval, gencode3c.fa)
  - The tabix and bgzip executables
  - Two images provided by the VAT source code: check.png and processing.gif (referred to by vat_cgi)
  - Directory that has the same name as the data set (in this example: SampleData). This directory contains the images for each gene (created by vcf2images) and a VCF file for each gene (created by vcfSubsetByGene)
  - SampleData
  - SampleData.vcf.gz
  - SampleData.vcf.gz.tbi
  - SampleData.sampleSummary.txt (generated by vcfSummary)
  - SampleData.geneSummary.txt (generated by vcfSummary)
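The layout of the web data directory listed above can be verified with a small script before pointing the web server at it. The sketch below is not part of VAT; the function name vat_check_web_data is hypothetical, and the annotation sets are deliberately not checked because the required GENCODE version depends on the data set.

```shell
#!/bin/sh
# Hypothetical helper (not part of VAT): list the entries that are
# expected in the web data directory but are not present.
#   vat_check_web_data WEB_DATA_DIR DATA_SET_NAME
# Returns 0 if nothing is missing, 1 otherwise.
vat_check_web_data() {
    dir=$1        # value of WEB_DATA_DIR
    set_name=$2   # data set name, e.g. SampleData
    status=0
    for entry in tabix bgzip check.png processing.gif \
                 "$set_name" \
                 "$set_name.vcf.gz" "$set_name.vcf.gz.tbi" \
                 "$set_name.sampleSummary.txt" "$set_name.geneSummary.txt"; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $dir/$entry"
            status=1
        fi
    done
    return "$status"
}

# Example with the placeholder path from the configuration file:
# vat_check_web_data /path/to/public_html/path/to/VAT SampleData
```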
For additional information please refer to the example workflow.
Download of preprocessed annotation sets
The following annotation sets are derived from the GENCODE project. Each entry has a set of transcript coordinates (in Interval format) and a set of transcript sequences (introns removed; sequence with respect to the '+' strand; in FASTA format).
- Coding sequence (CDS) elements where both the gene_type and transcript_type are protein_coding:
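  - GENCODE version 3b (hg18): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.fa)
  - GENCODE version 3c (hg19): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.fa)
  - GENCODE version 4 (hg19): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode4.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode4.fa)
  - GENCODE version 5 (hg19): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode5.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode5.fa)
  - GENCODE version 6 (hg19): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode6.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode6.fa)
  - GENCODE version 7 (hg19): Transcript coordinates (http://homes.gersteinlab.org/people/lh372/VAT/gencode7.interval), Transcript sequences (http://homes.gersteinlab.org/people/lh372/VAT/gencode7.fa)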
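For scripted setups, the two files of an annotation set can be fetched from the command line. The sketch below assumes that the files are still served under http://homes.gersteinlab.org/people/lh372/VAT/ and follow the naming pattern gencode<version>.interval and gencode<version>.fa; the function name vat_annotation_urls is hypothetical, and the download example assumes that curl is installed.

```shell
#!/bin/sh
# Base URL of the preprocessed annotation sets (assumed to be still valid).
VAT_ANNOTATION_BASE=http://homes.gersteinlab.org/people/lh372/VAT

# Hypothetical helper (not part of VAT): print the URLs of the Interval
# and FASTA files for one GENCODE version (3b, 3c, 4, 5, 6, or 7).
vat_annotation_urls() {
    version=$1
    printf '%s/gencode%s.interval\n' "$VAT_ANNOTATION_BASE" "$version"
    printf '%s/gencode%s.fa\n' "$VAT_ANNOTATION_BASE" "$version"
}

# Example: download both GENCODE version 7 files into the current
# directory (requires network access):
# vat_annotation_urls 7 | while read -r url; do curl -fLO "$url"; done
```

Remember to pick the version whose genome build (hg18 or hg19) matches the coordinates in your VCF files.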