Tools

From GersteinInfo

Revision as of 22:46, 11 April 2014

Below we highlight some of our tools and data sets. For an overview of the associated published literature, please visit our tools publication page. You may also view a list of the papers associated with our core tools. Source code is available on our lab Github page.

Tool portals

MolMovDB

Name | Description
MolMovDB | Servers and a suite of accessory tools for the analysis of conformational changes in protein and nucleic acid structures.

Networks

Name | Description
Networks | The Gerstein lab has been a pioneer in applying network analysis to generate knowledge from large-scale experiments. To this end, we have developed a portal for our network research.

Pseudogene.org

Name | Description
Pseudogene.org | Pseudogene.org is a collection of resources related to our efforts to survey eukaryotic genomes for pseudogene sequences, "pseudo-fold" usage, amino-acid composition, and single-nucleotide polymorphisms (SNPs) to help elucidate the relationships between pseudogene families across several organisms.

Structural Variants (SV)

Name | Description
Structural Variants | Software tools that may be used to investigate Structural Variations (SVs) and Copy Number Variations (CNVs).

Data sets

Name | Release Date | Description
BreakDB | 2009 | This database, which is part of the PEMer package, contains information about structural variants and associated breakpoints.

Evolution

Name | Release Date | Description
Coevolution analysis of protein residues | 2008 | An integrated online system that enables comparative analyses of residue coevolution with a comprehensive set of commonly used scoring functions, including statistical coupling analysis (SCA), explicit likelihood of subset variation (ELSC), mutual information and correlation-based methods.
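
Of the scoring functions listed above, mutual information is the simplest to illustrate. The sketch below is only an illustration under stated assumptions: the function name `mutual_information` and the toy alignment columns are hypothetical, and the code is not taken from the online system, which may apply corrections (for example for gaps or phylogenetic bias) that are not shown here.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Each column is a sequence of residues, one per aligned sequence.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        p_a = freq_a[a] / n
        p_b = freq_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy columns: the first pair varies together, the second does not.
covarying = mutual_information("AAVV", "LLII")
independent = mutual_information("AAVV", "LILI")
```

A high score means that knowing the residue at one position reduces uncertainty about the other, which is the signal coevolution analyses look for.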

Genome Technology Tools

Allele-Specific Effects

Name | Release Date | Description
AlleleSeq | 2011 | AlleleSeq is a computational pipeline that is used to study allele-specific expression (ASE) and allele-specific binding (ASB). The pipeline first constructs a diploid personal genome sequence, then maps RNA-seq and ChIP-seq functional genomic data onto this personal genome. Consequently, locations in which there are differences in the number of mapped reads between maternally- and paternally-derived sequences can be identified, thereby providing evidence for allele-specific events.
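
The last step described above, comparing read counts between the two parental sequences, can be sketched with an exact two-sided binomial test. This is a minimal illustration and not the AlleleSeq code: the function name `binomial_two_sided_p`, the null proportion of 0.5, and the example counts are assumptions, and a real analysis would also need multiple-testing correction.

```python
from math import comb


def binomial_two_sided_p(maternal, paternal, p=0.5):
    """Exact two-sided binomial p-value for an imbalance in allele read counts.

    Under the null hypothesis each read comes from the maternal allele
    with probability p (0.5 is an assumption, i.e. no allelic bias).
    """
    n = maternal + paternal
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[maternal]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at two heterozygous sites.
balanced_p = binomial_two_sided_p(5, 5)
skewed_p = binomial_two_sided_p(10, 0)
```

A small p-value at a heterozygous site is the kind of evidence for an allele-specific event that the description refers to.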

ChIP-Seq

Name | Release Date | Description
PeakSeq | 2009 | A tool for calling peaks corresponding to transcription factor binding sites from ChIP-Seq data scored against a matched control such as input DNA. PeakSeq employs a two-pass strategy in which putative binding sites are first identified in order to compensate for genomic variation in the 'mappability' of sequences, before a second pass filters out sites not significantly enriched compared to the normalized control, computing precise enrichments and significances.
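
The two-pass idea can be sketched on binned read counts. This is a heavily simplified illustration, not PeakSeq itself: the function name `two_pass_enrichment`, the fixed first-pass threshold, the ratio-based scaling factor, and the fold-change cutoff are all assumptions. In particular, the mappability correction of the first pass and the significance calculation of the second pass are replaced here by simple thresholds.

```python
def two_pass_enrichment(sample, control, first_pass_threshold, min_fold):
    """Toy two-pass peak filter on per-bin read counts.

    Pass 1 flags candidate bins in the sample. The control is then scaled
    to the sample using only the remaining (background) bins. Pass 2 keeps
    candidates whose fold enrichment over the scaled control is high enough.
    """
    # Pass 1: candidate bins (fixed threshold is an assumption).
    candidates = [i for i, s in enumerate(sample) if s >= first_pass_threshold]
    candidate_set = set(candidates)

    # Normalization: scale the control using background bins only.
    bg_sample = sum(s for i, s in enumerate(sample) if i not in candidate_set)
    bg_control = sum(c for i, c in enumerate(control) if i not in candidate_set)
    scale = bg_sample / bg_control if bg_control else 1.0

    # Pass 2: keep candidates enriched over the normalized control.
    kept = []
    for i in candidates:
        expected = scale * control[i]
        fold = sample[i] / expected if expected > 0 else float("inf")
        if fold >= min_fold:
            kept.append((i, fold))
    return scale, kept


# Hypothetical counts: bin 2 is a real peak, bin 4 is also high in the control.
sample_counts = [2, 3, 40, 2, 30, 3]
control_counts = [4, 6, 8, 4, 50, 6]
scale, peaks = two_pass_enrichment(sample_counts, control_counts, 20, 2.0)
```

Excluding candidate bins when estimating the scaling factor keeps true binding sites from inflating the background estimate, which is the reason for normalizing between the two passes.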

Functional Annotation

Name | Release Date | Description
FunSeq | 2013 | FunSeq can be used to automatically score and annotate the disease-causing potential of SNVs, particularly those which are non-coding. It can be used on cancer and personal genomes. Additionally, FunSeq can detect recurrent annotation elements in non-coding regions when running with multiple genomes. FunSeq is available for download.
VAT (Github repo) | 2012 | A computational framework to functionally annotate variants in personal genomes using a cloud-computing environment.

Microarrays & Proteomics

Name | Release Date | Description
MOTIPS | 2010 | MOTIPS employs an efficient search algorithm to scan a target proteome for potential domain targets and to increase the accuracy of each hit by integrating a variety of pre-computed features, such as conservation, surface propensity, and disorder.
PARE | 2007 | Protein Abundance and mRNA Expression (PARE) is a tool for comparing protein abundance and mRNA expression data. In addition to globally comparing the quantities of protein and mRNA, PARE allows users to select subsets of proteins for focused study (based on functional categories and complexes). Furthermore, it highlights correlation outliers, which may warrant further investigation.
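
One way to picture a global comparison with outlier highlighting is a Pearson correlation plus a residual check against a fitted line. The sketch below is an assumption about how such a comparison could look, not PARE's method: the function name `correlation_outliers`, the least-squares fit, the cutoff of two residual standard deviations, and the toy values are all hypothetical.

```python
from math import sqrt


def correlation_outliers(mrna, protein, z_cutoff=2.0):
    """Pearson correlation of two measurements plus indices of outliers.

    Outliers are points whose residual from the least-squares line exceeds
    z_cutoff times the root-mean-square residual (an assumed criterion).
    """
    n = len(mrna)
    mean_x = sum(mrna) / n
    mean_y = sum(protein) / n
    sxx = sum((x - mean_x) ** 2 for x in mrna)
    syy = sum((y - mean_y) ** 2 for y in protein)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mrna, protein))
    r = sxy / sqrt(sxx * syy)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(mrna, protein)]
    spread = sqrt(sum(e * e for e in residuals) / n)
    outliers = [
        i for i, e in enumerate(residuals)
        if spread > 0 and abs(e) > z_cutoff * spread
    ]
    return r, outliers


# Hypothetical expression values; the fifth protein is far more abundant
# than its mRNA level would suggest.
mrna_levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
protein_levels = [1, 2, 3, 4, 25, 6, 7, 8, 9, 10]
r_value, flagged = correlation_outliers(mrna_levels, protein_levels)
```

Restricting the input lists to one functional category or complex would correspond to the focused subset analysis mentioned in the description.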

RNA-seq

Name | Release Date | Description
FusionSeq (Github repo) | 2010 | FusionSeq may be used to identify fusion transcripts from paired-end RNA-sequencing. FusionSeq includes filters to remove spurious candidate fusions with artifacts, such as misalignment or random pairing of transcript fragments, and it ranks candidates according to several statistics. It also includes a module to identify exact sequences at breakpoint junctions.
ACT | 2011 | The aggregation and correlation toolbox (ACT) is an efficient, multifaceted toolbox for analyzing continuous signal and discrete region tracks from high-throughput genomic experiments, such as RNA-seq or ChIP-chip signal profiles from the ENCODE and modENCODE projects, or lists of single nucleotide polymorphisms from the 1000 genomes project.
IQseq (Github repo) | 2012 | A tool for isoform quantification with RNA-seq data. Given isoform annotation and alignment of RNA-seq reads, it will use an EM algorithm to infer the most probable expression level for each isoform of a gene.
LESSeq (Github repo) | 2014 | Local Event-based analysis of alternative Splicing using RNA-Seq
RSEQtools | 2011 | A suite of tools that use Mapped Read Format (MRF) for the analysis of RNA-Seq experiments. MRF is a compact data format that enables anonymization of confidential sequence information while maintaining the ability to conduct subsequent functional genomics studies. RSEQtools provides a suite of modules that convert to/from MRF data and perform common tasks such as calculating gene expression values, generating signal tracks of mapped reads, and segmenting that signal into actively transcribed regions.
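
The EM approach mentioned for IQseq can be illustrated with a generic textbook version of the algorithm. This is a sketch under simplifying assumptions and not IQseq's model: the function name `em_isoform_fractions` and the toy reads are hypothetical, and isoform lengths, read positions, and sequencing biases are ignored. Each read is represented only by the set of isoforms it is compatible with.

```python
def em_isoform_fractions(read_compat, n_isoforms, iterations=200):
    """Estimate the fraction of reads originating from each isoform.

    read_compat holds one tuple per read, listing the indices of the
    isoforms that read could have come from.
    """
    theta = [1.0 / n_isoforms] * n_isoforms
    for _ in range(iterations):
        counts = [0.0] * n_isoforms
        # E-step: split each read among its compatible isoforms in
        # proportion to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[i] for i in compat)
            if total == 0:
                continue
            for i in compat:
                counts[i] += theta[i] / total
        # M-step: renormalize the expected counts into fractions.
        assigned = sum(counts)
        if assigned == 0:
            break
        theta = [c / assigned for c in counts]
    return theta


# Hypothetical gene with two isoforms: 6 reads unique to isoform 0,
# 2 reads unique to isoform 1, and 4 reads shared by both.
reads = [(0,)] * 6 + [(1,)] * 2 + [(0, 1)] * 4
fractions = em_isoform_fractions(reads, 2)
```

The shared reads are what make the problem iterative: they cannot be assigned until the abundances are known, and the abundances depend on how the shared reads are assigned.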

Structural Variation

Name | Release Date | Description
CNVnator (citation) | 2011 | CNVnator may be used to discover, genotype, and characterize typical and atypical CNVs from familial and population genome sequencing.
AGE (citation) | 2011 | AGE is used for defining breakpoints of genomic structural variants at single-nucleotide resolution, using optimal alignments with gap excision.

Networks

Name | Release Date | Description
TopNet | 2004 | TopNet is an automated web tool designed to compare the topologies of sub-networks, looking for global differences associated with different types of proteins. It calculates and compares topological characteristics for different sub-networks derived from any given protein network.
tYNA | 2006 | tYNA (TopNet-like Yale Network Analyzer) is a web system for managing, comparing and mining multiple networks, both directed and undirected. tYNA efficiently implements methods that have proven useful in network analysis, including identifying defective cliques, finding small network motifs (such as feed-forward loops), calculating global statistics (such as the clustering coefficient and eccentricity), and identifying hubs and bottlenecks.
PubNet | 2005 | A web-based tool that extracts several types of relationships returned by PubMed queries and maps them onto networks, allowing for graphical visualization, textual navigation, and topological analysis.
DynaSIN | 2011 | The Dynamic Structure Interaction Network (DynaSIN) is a resource for studying protein-protein interaction networks in the context of conformational changes.
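
Two of the quantities named in the tYNA entry, the clustering coefficient and feed-forward loops, are standard graph measures and can be sketched in a few lines. The code below uses their textbook definitions and is not taken from tYNA: the function names `clustering_coefficient` and `feed_forward_loops` and the toy networks are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local clustering coefficient of a node in an undirected network.

    adj maps each node to the set of its neighbors.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count the links that exist between pairs of neighbors.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def feed_forward_loops(edges):
    """List (a, b, c) triples with directed edges a->b, b->c and a->c."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())
    return sorted(
        (a, b, c)
        for a in succ
        for b in succ[a]
        for c in succ[b]
        if c in succ[a] and len({a, b, c}) == 3
    )


# Hypothetical undirected network: A, B and C form a triangle, D hangs off A.
undirected = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
cc_a = clustering_coefficient(undirected, "A")
cc_b = clustering_coefficient(undirected, "B")

# Hypothetical directed network containing one feed-forward loop.
directed = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
loops = feed_forward_loops(directed)
```

Averaging the local coefficient over all nodes gives one common form of the global statistic.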

Structure and Macromolecular Motions

Name | Release Date | Description
Macromolecular Geometry and Packing Tools | 1994-2009 | A number of programs for calculating properties of protein and nucleic acid structures have been collected into a single distribution. Included are a library of functions for analyzing structures, a convenient interactive command-line interpreter, and software for the calculation of geometrical quantities associated with macromolecular structures and their motions.
3V | 2010 | The 3V web server extracts and comprehensively analyzes the internal volumes of input RNA and protein structures. It identifies internal volumes by taking the difference between two rolling-probe solvent-excluded surfaces.
HIT | 2006 | The Helix Interaction Tool (HIT) is a comprehensive package for analyzing helix-helix packing in proteins. It enables the user to obtain quantitative measures of the helix interaction surface area and helix crossing angle, and provides several methods for visualizing the helical interaction.
Morph Server | 2000 | A web server for generating and viewing models of protein conformational change using interpolation with energy minimization. The user may opt to use either single- or multi-chain proteins as input.

more tools
