Tools
From GersteinInfo
Pseudogene Tools
Pseudogene.org is a collection of resources related to our efforts to survey eukaryotic genomes for pseudogene sequences, "pseudo-fold" usage, amino-acid composition, and single-nucleotide polymorphisms (SNPs) to help elucidate the relationships between pseudogene families across several organisms.
Structural Biology Tools
Morph Server generates a plausible pathway between two conformations of a protein or nucleic acid structure. It outputs a large number of statistics and several high-quality movies.
SPINE is our laboratory information management system (LIMS) for the NorthEast Structural Genomics Consortium. The online version is restricted to consortium users, but most of the code is freely available for download.
A number of programs for calculating properties of protein and nucleic acid structures have been collected into a single distribution. It includes a library of utility functions for dealing with structures and a convenient interactive command-line interpreter.
HIT (Helix Interaction Tool) is a comprehensive web-based package of tools for analyzing helix-helix interactions in proteins.
Transcriptome and Proteome Tools
ExpressYourself is an interactive platform for background correction, normalization, scoring, and quality assessment of raw microarray data.
A new algorithm for local clustering of expression data, which finds time-shifted and/or inverted relationships, is available as C source code.
BoCaTFBS is a boosted cascade learner that refines the binding sites suggested by ChIP-chip experiments. It is based on a data mining approach that combines noisy data from ChIP-chip experiments with known binding site patterns. BoCaTFBS uses boosted cascades of classifiers for efficiency, with alternating decision trees as components; it exploits interpositional correlations; and it explicitly integrates the large amount of negative information from ChIP-chip experiments.
ProCAT is a data analysis approach for protein microarrays. ProCAT corrects for background bias and spatial artifacts, identifies significant signals, filters nonspecific spots, and normalizes the resulting signal to protein abundance. It is flexible enough to analyze many types of protein microarrays.
Tilescope is an online analysis pipeline for high-density tiling microarray data. Tilescope normalizes signals between channels and across arrays, combines replicate experiments, scores each array element, and identifies genomic features. The program has a modular, three-tiered architecture that facilitates parallelism, and a user-friendly graphical interface that presents results in an organized web page, downloadable for further analysis.
PARE (Protein Abundance and mRNA Expression) is a tool for comparing protein abundance and mRNA expression data. In addition to globally comparing the quantities of protein and mRNA, PARE allows users to select subsets of proteins for focused study (based on functional categories and complexes). It also highlights correlation outliers, which may be worth further examination.
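To illustrate what "time-shifted and/or inverted relationships" between expression profiles means, the sketch below scans a small range of time shifts and keeps the Pearson correlation with the largest absolute value. It is only an illustration of the idea, not the local clustering algorithm distributed as C source code; the function name `best_shifted_correlation` and the `max_shift` parameter are hypothetical, and the two profiles are assumed to have equal length.

```python
import numpy as np

def best_shifted_correlation(x, y, max_shift=3):
    """Return (r, shift) with the largest |Pearson r| over the tested shifts.

    A positive shift means y lags x by that many time points.
    A negative r indicates an inverted relationship.
    Assumes x and y are equal-length expression time series.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_r, best_shift = 0.0, 0
    for shift in range(-max_shift, max_shift + 1):
        # Pair x[t] with y[t + shift] on the overlapping time points.
        if shift >= 0:
            a, b = x[:len(x) - shift], y[shift:]
        else:
            a, b = x[-shift:], y[:len(y) + shift]
        # Skip overlaps that are too short or constant (r is undefined).
        if len(a) < 3 or a.std() == 0 or b.std() == 0:
            continue
        r = np.corrcoef(a, b)[0, 1]
        if abs(r) > abs(best_r):
            best_r, best_shift = r, shift
    return best_r, best_shift
```

A pair of genes whose profiles are mirror images offset by two time points would be reported with `r` close to -1 and `shift` equal to 2, a relationship that a plain, unshifted correlation would understate.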
RSEQtools is a suite of tools that uses the Mapped Read Format (MRF) for the analysis of RNA-Seq experiments.
FusionSeq is a computational framework for detecting chimeric transcripts from paired-end RNA-Seq experiments. It provides a ranked list of fusion transcript candidates that can be further evaluated via experimental methods.
ACT (Aggregation and Correlation Toolbox) is a toolbox for analyses of genome tracks.
PeakSeq is a tool for calling peaks corresponding to transcription factor binding sites from ChIP-Seq data scored against a matched control such as input DNA.
IQSeq is a tool for isoform quantification with RNA-Seq data. Given an isoform annotation and an alignment of RNA-Seq reads, it uses the expectation-maximization (EM) algorithm to infer the most probable expression level for each isoform of a gene.
Tiling is under construction.
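The EM step mentioned for IQSeq can be illustrated with a textbook mixture model. The sketch below is not IQSeq's actual model or code: it assumes each read is either compatible or incompatible with each isoform, that reads fall uniformly along an isoform, and that every read is compatible with at least one isoform. The names `em_isoform_abundance`, `compat`, and `lengths` are hypothetical.

```python
import numpy as np

def em_isoform_abundance(compat, lengths, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    compat  : (reads x isoforms) 0/1 matrix, 1 if the read fits the isoform.
    lengths : effective length of each isoform.
    """
    compat = np.asarray(compat, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    n_reads, n_iso = compat.shape
    theta = np.full(n_iso, 1.0 / n_iso)  # start from a uniform guess
    for _ in range(n_iter):
        # E-step: P(isoform | read) is proportional to theta / length
        # for compatible isoforms and zero otherwise.
        weights = compat * (theta / lengths)
        resp = weights / weights.sum(axis=1, keepdims=True)
        # M-step: the new read fraction is the mean responsibility.
        theta = resp.sum(axis=0) / n_reads
    return theta
```

Reads that match only one isoform anchor the estimate, and reads shared between isoforms are split in proportion to the current abundances until the values stop changing.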
Network Tools
TopNet is an automated web tool designed to calculate topological parameters and compare different sub-networks for any given network.
Yeasthub is a semantic web-based application that demonstrates how a life sciences data warehouse can be built using a native Resource Description Framework (RDF) data store. The data warehouse integrates different types of yeast genome data provided by different resources in different formats, including tabular and RDF formats.
tYNA (TopNet-like Yale Network Analyzer) is a web system for managing, comparing, and mining multiple networks, both directed and undirected. tYNA efficiently implements methods that have proven useful in network analysis, including identifying defective cliques, finding small network motifs (such as feed-forward loops), calculating global statistics (such as the clustering coefficient and eccentricity), and identifying hubs and bottlenecks.
HUB is a tool that leverages the structure of the semantic web to enhance information retrieval for proteomics. It helps proteomics researchers quickly retrieve relevant information from the web and the biomedical literature.
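Two of the quantities named above, the clustering coefficient and feed-forward loops, can be computed in a few lines. This is a minimal sketch of the standard definitions and does not reflect how TopNet or tYNA implement them; the function names and the dictionary and edge-list input formats are assumptions made for the example.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbours.
    """
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count the neighbour pairs that are themselves connected.
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

def feed_forward_loops(edges):
    """Return feed-forward loops (a->b, b->c, a->c) from a directed edge list."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    loops = []
    for a, targets in succ.items():
        for b in targets:
            for c in succ.get(b, ()):
                # a regulates c both directly and through b.
                if c in targets and len({a, b, c}) == 3:
                    loops.append((a, b, c))
    return loops
```

Averaging `clustering_coefficient` over all nodes gives one common global statistic, and ranking nodes by `len(adj[node])` is a simple way to flag hubs.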
Evolution and Genomics Tools
Coevolution analysis of protein residues: this is an integrated online system that enables comparative analyses of residue coevolution with a comprehensive set of commonly used scoring functions, including Statistical Coupling Analysis (SCA), Explicit Likelihood of Subset Covariation (ELSC), mutual information, and correlation-based methods.
ACT (Aggregation and Correlation Toolbox) is a toolbox for analyses of genome tracks.
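Of the scoring functions listed for residue coevolution, mutual information is the simplest to write down. The sketch below computes it for two columns of a multiple sequence alignment. It shows the standard definition only, without the corrections (for gaps, phylogeny, or small samples) that a production system may apply, and the function name is hypothetical.

```python
import math
from collections import Counter

def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold the residue of each sequence at the two positions,
    in the same sequence order.
    """
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts.
        ratio = count * n / (count_i[a] * count_j[b])
        mi += p_ab * math.log2(ratio)
    return mi
```

Two positions that always change together score high, while positions that vary independently score near zero.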
Other
PubNet is a web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks, allowing for graphical visualization, textual navigation, and topological analysis.
RSEQtools is a suite of tools that uses the Mapped Read Format (MRF) for the analysis of RNA-Seq experiments. MRF is a compact data summary format for both short and long read alignments that enables the anonymization of confidential sequence information while still allowing many functional genomics studies to be carried out. The tools consist of a set of modules that perform common tasks such as calculating gene expression values, generating signal tracks of mapped reads, and segmenting that signal into actively transcribed regions. The tools can also readily be used to build customizable RNA-Seq workflows.
ACT (Aggregation and Correlation Toolbox) is an efficient, multifaceted toolbox for analyzing continuous signal and discrete region tracks from high-throughput genomic experiments, such as RNA-Seq or ChIP-chip signal profiles from the ENCODE and modENCODE projects, or lists of single-nucleotide polymorphisms from the 1000 Genomes Project. It can generate aggregate profiles of a given track around a set of specified anchor points, such as transcription start sites. It can also correlate related tracks and analyze them for saturation.
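The aggregation step can be pictured as averaging fixed-size windows of a signal track centered on each anchor point. The following is a minimal sketch of that idea and not ACT's implementation; it assumes the track is a plain array indexed by genomic position, ignores strand, and skips anchors whose window would run off either end. The name `aggregate_profile` and the `flank` parameter are hypothetical.

```python
import numpy as np

def aggregate_profile(signal, anchors, flank=2):
    """Average the signal in windows of +/- `flank` positions around anchors."""
    signal = np.asarray(signal, dtype=float)
    windows = [
        signal[a - flank : a + flank + 1]
        for a in anchors
        # Keep only anchors with a complete window inside the track.
        if a - flank >= 0 and a + flank < len(signal)
    ]
    if not windows:
        raise ValueError("no anchor has a complete window")
    return np.mean(windows, axis=0)
```

With transcription start sites as anchors, the returned array shows the average signal shape upstream and downstream of the start site.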
