VAT/download

From GersteinInfo

(Difference between revisions)
(Setup of the web server)
 
(81 intermediate revisions not shown)
Line 3: Line 3:
__TOC__
__TOC__
-
== Code ==
+
== External Software ==
-
=== Required Software - External ===
+
<center>[[#top|Top]]</center>
-
# [http://www.gnu.org/software/gsl/ GSL] - GNU Scientific Library
+
=== Required  ===
-
# [http://hgwdev.cse.ucsc.edu/~kent/exe/linux/blatSuite.34.zip BlatSuite] - BLAT and a collection of utility programs. Note: these executables must be part of the PATH.
+
-
# [http://www.libgd.org/Main_Page GD library] - The GD library is used to create an image for each gene model and its associated variants.
+
-
# [http://samtools.sourceforge.net/tabix.shtml Tabix] - Tabix is generic tool that indexes position-sorted files in tab-delimited formats to facilitate fast retrieval ([http://sourceforge.net/projects/samtools/files/tabix/ download]). Note: these executables must be part of the PATH
+
-
# [http://vcftools.sourceforge.net/index.html VCF tools] - VCF tools consists of a suite of useful modules to manipulate VCF files.
+
 +
* [http://www.gnu.org/software/gsl/ GSL] - GNU Scientific Library (version-1.14; required for libBIOS, which is a general C library).
 +
* [http://hgwdev.cse.ucsc.edu/~kent/exe/linux/blatSuite.34.zip BlatSuite] - BLAT and a collection of utility programs. These tools are utilized by VAT. Note: these executables must be part of the PATH.
 +
* [http://www.libgd.org/Main_Page GD library] - The GD library is used to create an image for each gene model and its associated variants (version-2.0.35; required by VAT).
 +
* [http://samtools.sourceforge.net/tabix.shtml Tabix] - Tabix (version-0.2.3) is a generic tool that indexes position-sorted files in tab-delimited formats to facilitate fast retrieval ([http://sourceforge.net/projects/samtools/files/tabix/ download]). These tools are utilized by VAT. Note: these executables must be part of the PATH.
 +
* [http://rna.urmc.rochester.edu/RNAstructure.html RNAstructure] - RNAstructure is a software package for RNA structure prediction and analysis. This tool is utilized by VAT for prediction of structures for RNA sequences with and without the variants.
 +
* [http://varna.lri.fr/downloads.html VARNA] - VARNA is a Java applet for producing high-quality RNA secondary structure plots. VAT utilizes VARNA for visualization of the RNA secondary structures.
<br>
<br>
-
=== Download ===
+
<center>[[#top|Top]]</center>
 +
 
 +
=== Optional ===
 +
 
 +
* [http://vcftools.sourceforge.net/index.html VCF tools] - VCF tools consists of a suite of useful modules to manipulate VCF files.
 +
 
 +
<br><br>
 +
 
 +
== VAT Download ==
 +
 
 +
<br>
<pre>
<pre>
Line 26: Line 38:
</pre>
</pre>
 +
<br>
-
==== Source code ====
+
<center>[[#top|Top]]</center>
-
A TAR ball containing the source code can be downloaded here:
+
=== Source code ===
-
* [http://archive.gersteinlab.org/proj/VAT/src VAT-0.5.tar.gz] - Initial upload
+
 +
VAT is based on a general C library called libBIOS. Tarballs of libBIOS and VAT can be downloaded here:
 +
* [http://homes.gersteinlab.org/people/lh372/VAT/libbios-1.0.0.tar.gz libbios-1.0.0.tar.gz] - Initial upload (6/14/2011)
 +
* [http://homes.gersteinlab.org/people/lh372/VAT/vat-1.0.0.tar.gz vat-1.0.0.tar.gz] - Initial upload (6/14/2011)
-
==== Executables ====
+
<br>
 +
 
 +
<center>[[#top|Top]]</center>
 +
 
 +
=== Executables ===
Statically built binaries for UNIX can be found here:
Statically built binaries for UNIX can be found here:
-
* [http://archive.gersteinlab.org/proj/VAT/src VAT-0.5-UNIX.tar.gz] - 64bit version
+
* [http://homes.gersteinlab.org/people/lh372/VAT/vat-1.0.0_64bit.zip vat-1.0.0_64bit.zip] - Initial upload (6/14/2011)
 +
<br>
 +
 +
<center>[[#top|Top]]</center>
-
==== License information ====
+
=== License information ===
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br>
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br>
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.
 +
<br><br>
== Installation ==
== Installation ==
-
==Installing GSL and GD libraries==
+
<center>[[#top|Top]]</center>
-
In order to install VAT these external packages need to be installed first . Please, follow the instruction provided by the single packages. After they are installed, the first step for VAT is the installation and configuration of [http://rnaseq.gersteinlab.org/doc/bios/ BIOS]. BIOS is a C library of useful general definitions for manipulating strings, arrays, and parser and more related to bioinformatic analysis. It requires the GSL library, which, in most systems, can be installed with the following commands (for details, please refer to the specific instructions at the [http://www.gnu.org/software/gsl/ GNU Scientific Library] website):
+
 
 +
=== Installation of the external GSL and GD libraries ===
 +
 
 +
To install VAT, two external libraries must be installed first. The libBIOS library depends on GSL, whereas VAT makes use of the GD library. Please follow the instructions provided by each package. The GSL library can be installed on most systems using the following commands (for details, please refer to the specific instructions at the [http://www.gnu.org/software/gsl/ GNU Scientific Library] website):
<pre>
<pre>
-
$ cd /path/to/gslSource/
+
$ cd /path/to/gsl-1.14/
-
$ ./configure --prefix=/path/to/installation/
+
$ ./configure --prefix=`pwd`
$ make
$ make
$ make install
$ make install
</pre>
</pre>
-
Similarly, the [http://www.libgd.org/Main_Page GD library] can be installed in most systems with:
+
Similarly, the [http://www.libgd.org/Main_Page GD library] can be installed on most systems with the following commands:
<pre>
<pre>
-
$ cd /path/to/gdSource/
+
$ cd /path/to/gd-2.0.35/
-
$ ./configure --prefix=/path/to/installation/ --with-jpeg=/path/to/jpegLib/
+
$ ./configure --prefix=`pwd` --with-jpeg=/path/to/jpegLib/
$ make
$ make
$ make install
$ make install
</pre>
</pre>
 +
 +
After they are installed, the first step to install VAT is the installation and configuration of libBIOS.
 +
 +
<br>
<center>[[#top|Top]]</center>
<center>[[#top|Top]]</center>
-
==Installing and configuring BIOS==
+
=== Installation and Configuration of libBIOS ===
-
To install [http://rnaseq.gersteinlab.org/doc/bios/ BIOS] a few variables need to be set before compiling the library. Here is an example of the procedure on a bash shell:
+
 
 +
Depending on where the three libraries (GSL, libBIOS, and GD) are installed, the following variables need to be set:
 +
 
<pre>
<pre>
-
$ export BIOINFOCONFDIR=/pathToBios/conf/
+
export CPPFLAGS="-I/path/to/gsl-1.14/include -I/path/to/libbios/include -I/path/to/gd-2.0.35/include"
-
$ export BIOINFOGSLDIR=/pathToGsl/
+
export LDFLAGS="-L/path/to/gsl-1.14/lib -L/path/to/libbios/lib -L/path/to/gd-2.0.35/lib"
-
$ cd /pathToBios/
+
</pre>
 +
 
 +
libBIOS can be installed on most systems with the following commands:
 +
<pre>
 +
$ cd /path/to/libbios-x.x.x/
 +
$ ./configure --prefix=`pwd`
$ make
$ make
-
$ make prod
+
$ make install
</pre>
</pre>
-
Please refer to [http://rnaseq.gersteinlab.org/doc/bios/ BIOS] documentation for additional information.
+
<br>
<center>[[#top|Top]]</center>
<center>[[#top|Top]]</center>
 +
 +
=== Installation of RNAstructure and VARNA ===
 +
 +
Download RNAstructure and follow its build instructions. Make sure that the build directory is included in the PATH environment variable. In addition, RNAstructure needs an environment variable named DATAPATH to be set to the directory of thermodynamic parameter files that are distributed with the RNAstructure package.
 +
 +
Download the VARNA jar file and add its path to the CLASSPATH environment variable.
 +
 +
<br>
 +
 +
<center>[[#top|Top]]</center>
 +
 +
=== Installation and Configuration of VAT ===
 +
 +
A few simple steps are required to install VAT:
 +
<pre>
 +
$ cd /path/to/vat-x.x.x/
 +
$ ./configure --prefix=`pwd`
 +
$ make
 +
$ make install
 +
</pre>
 +
 +
VAT includes a configuration file ('''vatConfirgurationTemplate.txt''') that defines a set of variables used by a number of different programs.  The name/value pairs are space- or tab-delimited. Empty lines and lines starting with '//' are ignored.
 +
<pre>
 +
 +
// ===============================================================================
 +
// REQUIRED
 +
// ===============================================================================
 +
 +
// Tabix directory (includes both tabix and bgzip)
 +
TABIX_DIR /path/to/tabix-0.2.3
 +
 +
 +
// ===============================================================================
 +
// OPTIONAL (required only for CGIs)
 +
// ===============================================================================
 +
 +
// CGI base URL (where the CGIs are located)
 +
WEB_URL_CGI http://webserver.org/path
 +
 +
// Path to the web data directory where the preprocessed files are stored
 +
WEB_DATA_DIR /path/to/public_html/path/to/VAT
 +
// URL to preprocessed files
 +
WEB_DATA_URL http://webserver.org/path/to/VAT
 +
 +
</pre>
 +
 +
This file has to be '''configured properly''' by filling in the required information. Subsequently, the following environment variable ('''VAT_CONFIG_FILE''') has to be set:
 +
 +
VAT_CONFIG_FILE=/pathTo/vat/vatConfirgurationTemplate.txt
 +
 +
<br><br>
 +
 +
== Setup of the web server ==
 +
 +
<br>
 +
 +
This step is optional, but useful for visualizing the results of processed data sets. The following steps are required:
 +
 +
* The executable '''vat_cgi''' has to be located in the cgi-bin directory on the web server
 +
* The configuration file ('''vatConfirgurationTemplate.txt''') must contain the pertinent information
 +
* The following .htaccess file should be added to the cgi-bin:
 +
SetEnv VAT_CONFIG_FILE /path/to/vatConfirgurationTemplate.txt
 +
* The web data directory (defined by WEB_DATA_DIR in the configuration file) requires the following information:
 +
** '''Pre-processed annotation sets'''
 +
** The '''tabix''' and '''bgzip''' executables
 +
** Two images provided by the VAT source code: '''check.png''' and '''processing.gif''' (referred to by vat_cgi)
 +
** Directory that has the same name as the data set (in this example: SampleData). This directory contains the images for each gene (created by [http://info.gersteinlab.org/VAT#vcf2images vcf2images]) and a VCF file for each gene (created by [http://info.gersteinlab.org/VAT#vcfSubsetByGene vcfSubsetByGene])
 +
*** SampleData
 +
*** SampleData.vcf.gz
 +
*** SampleData.vcf.gz.tbi
 +
*** SampleData.sampleSummary.txt (generated by [http://info.gersteinlab.org/VAT#vcfSummary vcfSummary])
 +
*** SampleData.geneSummary.txt (generated by [http://info.gersteinlab.org/VAT#vcfSummary vcfSummary])
 +
** The construction of the vat_cgi URL requires the following item/value pairs (for example: http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804.chr22&annotationSet=gencode3c&type=coding):
 +
*** mode=process
 +
*** dataSet=SampleData
 +
*** annotationSet=nameOfAnnotationSet
 +
*** type=coding or nonCoding
 +
 +
For additional information please refer to the [http://info.gersteinlab.org/VAT#Setting_up_the_web_server example workflow].
 +
 +
<br><br>
 +
 +
== Download of pre-processed annotation sets ==
 +
 +
<br>
 +
 +
The following annotation sets are derived from the [http://www.gencodegenes.org/ GENCODE] project. Each entry has a set of '''transcript coordinates''' (in [http://info.gersteinlab.org/VAT#Interval Interval] format) and a set of '''transcript sequences''' (introns removed; sequence with respect to the '+' strand; in FASTA format).
 +
 +
* Coding sequence (CDS) elements where both the ''gene_type'' and ''transcript_type'' are ''protein_coding'':
 +
** GENCODE version 3b (hg18): [http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.fa Transcript sequences]
 +
** GENCODE version 3c (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.fa Transcript sequences]
 +
** GENCODE version 4 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode4.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode4.fa Transcript sequences]
 +
** GENCODE version 5 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode5.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode5.fa Transcript sequences]
 +
** GENCODE version 6 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode6.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode6.fa Transcript sequences]
 +
** GENCODE version 7 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode7.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode7.fa Transcript sequences]
 +
 +
<br>
 +
 +
* miRNAs where ''gene_type'' is ''miRNA'':
 +
** GENCODE version 3b (hg18): [http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode3b.miRNA.fa Transcript sequences]
 +
** GENCODE version 3c (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode3c.miRNA.fa Transcript sequences]
 +
** GENCODE version 4 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode4.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode4.miRNA.fa Transcript sequences]
 +
** GENCODE version 5 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode5.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode5.miRNA.fa Transcript sequences]
 +
** GENCODE version 6 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode6.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode6.miRNA.fa Transcript sequences]
 +
** GENCODE version 7 (hg19): [http://homes.gersteinlab.org/people/lh372/VAT/gencode7.miRNA.interval Transcript coordinates], [http://homes.gersteinlab.org/people/lh372/VAT/gencode7.miRNA.fa Transcript sequences]

Latest revision as of 15:33, 15 June 2011

VAT Main Page


External Software

Top

Required

  • GSL - GNU Scientific Library (version-1.14; required for libBIOS, which is a general C library).
  • BlatSuite - BLAT and a collection of utility programs. These tools are utilized by VAT. Note: these executables must be part of the PATH.
  • GD library - The GD library is used to create an image for each gene model and its associated variants (version-2.0.35; required by VAT).
  • Tabix - Tabix (version-0.2.3) is a generic tool that indexes position-sorted files in tab-delimited formats to facilitate fast retrieval (download). These tools are utilized by VAT. Note: these executables must be part of the PATH.
  • RNAstructure - RNAstructure is a software package for RNA structure prediction and analysis. This tool is utilized by VAT for prediction of structures for RNA sequences with and without the variants.
  • VARNA - VARNA is a Java applet for producing high-quality RNA secondary structure plots. VAT utilizes VARNA for visualization of the RNA secondary structures.


Top

Optional

  • VCF tools - VCF tools consists of a suite of useful modules to manipulate VCF files.



VAT Download


Important Note
==============

THIS PACKAGE (VAT) IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.


Top

Source code

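VAT is based on a general C library called libBIOS. Tarballs of libBIOS and VAT can be downloaded here:

  • libbios-1.0.0.tar.gz (http://homes.gersteinlab.org/people/lh372/VAT/libbios-1.0.0.tar.gz) - Initial upload (6/14/2011)
  • vat-1.0.0.tar.gz (http://homes.gersteinlab.org/people/lh372/VAT/vat-1.0.0.tar.gz) - Initial upload (6/14/2011)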


Top

Executables

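Statically built binaries for UNIX can be found here:

  • vat-1.0.0_64bit.zip (http://homes.gersteinlab.org/people/lh372/VAT/vat-1.0.0_64bit.zip) - Initial upload (6/14/2011)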


Top

License information

The software package is released under the Creative Commons license (Attribution-NonCommercial).
For more details please refer to the Permissions Page on the Gerstein Lab webpage.



Installation

Top

Installation of the external GSL and GD libraries

To install VAT, two external libraries must be installed first. The libBIOS library depends on GSL, whereas VAT makes use of the GD library. Please follow the instructions provided by each package. The GSL library can be installed on most systems using the following commands (for details, please refer to the specific instructions at the GNU Scientific Library website):

$ cd /path/to/gsl-1.14/
$ ./configure --prefix=`pwd`
$ make
$ make install

Similarly, the GD library can be installed on most systems with the following commands:

$ cd /path/to/gd-2.0.35/
$ ./configure --prefix=`pwd` --with-jpeg=/path/to/jpegLib/
$ make
$ make install

After they are installed, the first step to install VAT is the installation and configuration of libBIOS.


Top

Installation and Configuration of libBIOS

Depending on where the three libraries (GSL, libBIOS, and GD) are installed, the following variables need to be set:

export CPPFLAGS="-I/path/to/gsl-1.14/include -I/path/to/libbios/include -I/path/to/gd-2.0.35/include"
export LDFLAGS="-L/path/to/gsl-1.14/lib -L/path/to/libbios/lib -L/path/to/gd-2.0.35/lib"

libBIOS can be installed on most systems with the following commands:

$ cd /path/to/libbios-x.x.x/
$ ./configure --prefix=`pwd` 
$ make
$ make install
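
Before configuring VAT, it can help to confirm that every directory named in CPPFLAGS and LDFLAGS actually exists. The following is a minimal sketch; the function name check_flag_dirs is hypothetical and not part of VAT or libBIOS, and it assumes the paths contain no spaces.

```shell
# Hypothetical helper (not part of VAT): report each -I/-L directory in the
# given flag string that does not exist. Returns 1 if any directory is missing.
check_flag_dirs() {
    missing=0
    for flag in $1; do              # relies on word splitting; paths must not contain spaces
        case "$flag" in
            -I*|-L*)
                dir=${flag#-?}      # strip the leading -I or -L
                if [ ! -d "$dir" ]; then
                    echo "missing: $dir"
                    missing=1
                fi
                ;;
        esac
    done
    return $missing
}
```

A typical call would be check_flag_dirs "$CPPFLAGS" followed by check_flag_dirs "$LDFLAGS"; any line of output points at a path that still needs to be corrected or installed.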


Top

Installation of RNAstructure and VARNA

Download RNAstructure and follow its build instructions. Make sure that the build directory is included in the PATH environment variable. In addition, RNAstructure needs an environment variable named DATAPATH to be set to the directory of thermodynamic parameter files that are distributed with the RNAstructure package.

Download the VARNA jar file and add its path to the CLASSPATH environment variable.
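
As a rough illustration, the three environment variables could be set in a bash or sh startup file as follows. All paths are placeholders, and the exe/ and data_tables/ subdirectory names are assumptions about the RNAstructure layout that should be checked against the downloaded package.

```shell
# Placeholder install locations; adjust to your system.
RNASTRUCTURE_HOME=/path/to/RNAstructure
VARNA_JAR=/path/to/VARNA.jar

# Directory holding the built RNAstructure executables (assumed: exe/).
export PATH="$RNASTRUCTURE_HOME/exe:$PATH"

# Directory holding the thermodynamic parameter files (assumed: data_tables/).
export DATAPATH="$RNASTRUCTURE_HOME/data_tables"

# Append the VARNA jar; the ${VAR:+...} form avoids a leading ':' when CLASSPATH is empty.
export CLASSPATH="${CLASSPATH:+$CLASSPATH:}$VARNA_JAR"
```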


Top

Installation and Configuration of VAT

A few simple steps are required to install VAT:

$ cd /path/to/vat-x.x.x/
$ ./configure --prefix=`pwd` 
$ make
$ make install

VAT includes a configuration file (vatConfirgurationTemplate.txt) that defines a set of variables used by a number of different programs. The name/value pairs are space- or tab-delimited. Empty lines and lines starting with '//' are ignored.


// ===============================================================================
// REQUIRED
// ===============================================================================

// Tabix directory (includes both tabix and bgzip)
TABIX_DIR /path/to/tabix-0.2.3


// ===============================================================================
// OPTIONAL (required only for CGIs)
// ===============================================================================

// CGI base URL (where the CGIs are located)
WEB_URL_CGI http://webserver.org/path

// Path to the web data directory where the preprocessed files are stored
WEB_DATA_DIR /path/to/public_html/path/to/VAT
// URL to preprocessed files
WEB_DATA_URL http://webserver.org/path/to/VAT

This file has to be configured properly by filling in the required information. Subsequently, the following environment variable (VAT_CONFIG_FILE) has to be set:

VAT_CONFIG_FILE=/pathTo/vat/vatConfirgurationTemplate.txt
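
In a bash or sh session the variable normally needs to be exported (export VAT_CONFIG_FILE=...) so that the VAT programs inherit it. To check what a configuration file actually defines, the format described above can be read with a few lines of awk. This is only a sketch: vat_config_get is a hypothetical helper, not a VAT program, and it assumes values contain no spaces.

```shell
# Hypothetical helper (not part of VAT): print the value stored under a name
# in a VAT-style configuration file.
# Usage: vat_config_get NAME [FILE]   (FILE defaults to $VAT_CONFIG_FILE)
vat_config_get() {
    awk -v key="$1" '
        /^[ \t]*$/ { next }              # ignore empty lines
        /^\/\//    { next }              # ignore lines starting with //
        $1 == key  { print $2; exit }    # fields are split on spaces or tabs
    ' "${2:-$VAT_CONFIG_FILE}"
}
```

For example, vat_config_get TABIX_DIR prints the Tabix directory from the file named by VAT_CONFIG_FILE, and prints nothing if the name is not defined.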



Setup of the web server


This step is optional, but useful for visualizing the results of processed data sets. The following steps are required:

  • The executable vat_cgi has to be located in the cgi-bin directory on the web server
  • The configuration file (vatConfirgurationTemplate.txt) must contain the pertinent information
  • The following .htaccess file should be added to the cgi-bin:
SetEnv VAT_CONFIG_FILE /path/to/vatConfirgurationTemplate.txt
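  • The web data directory (defined by WEB_DATA_DIR in the configuration file) requires the following information:
      • Pre-processed annotation sets
      • The tabix and bgzip executables
      • Two images provided by the VAT source code: check.png and processing.gif (referred to by vat_cgi)
      • Directory that has the same name as the data set (in this example: SampleData). This directory contains the images for each gene (created by vcf2images) and a VCF file for each gene (created by vcfSubsetByGene)
          • SampleData
          • SampleData.vcf.gz
          • SampleData.vcf.gz.tbi
          • SampleData.sampleSummary.txt (generated by vcfSummary)
          • SampleData.geneSummary.txt (generated by vcfSummary)
      • The construction of the vat_cgi URL requires the following item/value pairs (for example: http://dynamic.gersteinlab.org/people/lh372/vat_cgi?mode=process&dataSet=ALL.2of4intersection.20100804.chr22&annotationSet=gencode3c&type=coding):
          • mode=process
          • dataSet=SampleData
          • annotationSet=nameOfAnnotationSet
          • type=coding or nonCoding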
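
A quick way to verify the web data directory is to test for the entries this page lists for a data set: the tabix and bgzip executables, check.png, processing.gif, the data set directory, the compressed VCF file and its index, and the two summary files. The sketch below does that; check_web_data_dir is a hypothetical helper, and it does not check the annotation sets because their file names depend on which sets are used.

```shell
# Hypothetical helper (not part of VAT): list expected entries that are
# missing from the web data directory. Returns 1 if anything is missing.
# Usage: check_web_data_dir WEB_DATA_DIR DATASET
check_web_data_dir() {
    dir=$1; ds=$2; status=0
    for entry in tabix bgzip check.png processing.gif \
                 "$ds" "$ds.vcf.gz" "$ds.vcf.gz.tbi" \
                 "$ds.sampleSummary.txt" "$ds.geneSummary.txt"; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $entry"
            status=1
        fi
    done
    return $status
}
```

For the example data set, check_web_data_dir /path/to/public_html/path/to/VAT SampleData prints one line per missing entry.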

For additional information please refer to the example workflow.
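
The vat_cgi URL for a data set is built from four item/value pairs: mode, dataSet, annotationSet, and type. A small sketch of assembling it is shown here. The function name vat_cgi_url is hypothetical, the base URL is assumed to be the WEB_URL_CGI value from the configuration file, and no URL encoding is applied, so the values must not contain special characters.

```shell
# Hypothetical helper (not part of VAT): assemble a vat_cgi URL.
# Usage: vat_cgi_url BASE_URL DATASET ANNOTATION_SET TYPE
#   TYPE is either "coding" or "nonCoding".
vat_cgi_url() {
    printf '%s/vat_cgi?mode=process&dataSet=%s&annotationSet=%s&type=%s\n' \
        "$1" "$2" "$3" "$4"
}

vat_cgi_url http://webserver.org/path SampleData gencode3c coding
# prints http://webserver.org/path/vat_cgi?mode=process&dataSet=SampleData&annotationSet=gencode3c&type=coding
```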



Download of pre-processed annotation sets


The following annotation sets are derived from the GENCODE project. Each entry has a set of transcript coordinates (in Interval format) and a set of transcript sequences (introns removed; sequence with respect to the '+' strand; in FASTA format).
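
The download links for the annotation sets on this page follow a regular naming pattern (gencodeN.interval and gencodeN.fa, with a .miRNA infix for the miRNA sets), so the pair of URLs for a given version can be generated rather than typed. The sketch below assumes that pattern holds for every listed version; annotation_urls is a hypothetical helper.

```shell
# Base URL taken from the download links on this page.
VAT_BASE_URL=http://homes.gersteinlab.org/people/lh372/VAT

# Hypothetical helper (not part of VAT): print the coordinate and sequence
# URLs for one annotation set.
# Usage: annotation_urls VERSION [miRNA]   e.g. annotation_urls 3c miRNA
annotation_urls() {
    stem="gencode$1${2:+.$2}"        # adds ".miRNA" only when a second argument is given
    echo "$VAT_BASE_URL/$stem.interval"
    echo "$VAT_BASE_URL/$stem.fa"
}
```

The output can be passed to a download tool of choice, one URL per line.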

