Summary of Bioinformatics Tools made by the lab

From GersteinInfo




The Gerstein lab has made it a priority to develop its cutting-edge algorithms into tools in the form of downloadable programs, web servers, and databases. These tools are at the heart of our work in transforming the big data of genomes into knowledge. Below we highlight some of these tools.

Genomics Tools

To extract knowledge from high-throughput genomic experiments, such as RNA-seq or ChIP-chip, the Gerstein lab has made the following tools. To identify splice sites and gene models from RNA-seq data, we made RSEQtools (ref). RSEQtools also lets researchers replace raw sequence information with read-signal summaries, which protects the identity of the subjects behind the data. To better understand alternative splicing and exon-skipping events, we made IQSeq (ref). To better identify fusion transcripts from paired-end RNA sequencing, we created FusionSeq (ref). To aggregate the distribution of signals in RNA-seq or ChIP-chip signal profiles and to correlate multiple related signal tracks, we made ACT (ref). To distinguish candidate cancer drivers from inherited polymorphisms (passenger cancer mutations), we created FunSeq (ref). Large structural variations, including copy-number variation and unbalanced inversion events, are widespread in human genomes. To detect these, we made BreakSeq (ref). Collectively, these tools provide knowledge that can inform personalized medicine.

Genomics Tools
Tool Name | Description | Citation
RSEQtools | Identify splice sites and gene models from RNA-seq data | Habegger, Lukas, et al. "RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries." Bioinformatics 27.2 (2011): 281-283.
IQSeq | Better understand alternative splicing and exon-skipping events | Du, Jiang, et al. "IQSeq: integrated isoform quantification analysis based on next-generation sequencing." PLoS ONE 7.1 (2012): e29175.

Structure Tools

To extract knowledge about the three-dimensional dynamics of proteins, and ultimately their function, we have built the Database of Macromolecular Movements (MolMovDB). When it was first published in 1998, its main functionality was to interpolate the movements of macromolecules between two known crystal structures (ref). In 2005 a number of additions were made (ref), including a more accurate method for interpolating multi-chain macromolecules and an updated interface. In 2008 a normal-mode hinge prediction model was added so that users could detect hinges in uploaded structures (ref). MolMovDB and its subsequent additions have provided knowledge about how proteins function and about the structures of potential new drugs.
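
To illustrate what interpolating between two known structures means, here is a minimal Python sketch. It uses plain linear interpolation of Cartesian coordinates as a stand-in. This is an assumption made for illustration, not a description of the method MolMovDB uses, and the function name `interpolate_frames` and the toy coordinates are hypothetical.

```python
def interpolate_frames(start, end, n_frames):
    """Linearly interpolate atom coordinates between two conformations.

    start, end: lists of (x, y, z) tuples, one per atom, in the same order.
    n_frames:   number of frames to return, including both end points.
    NOTE: linear interpolation is an illustrative simplification only.
    """
    if len(start) != len(end):
        raise ValueError("both structures must contain the same atoms")
    if n_frames < 2:
        raise ValueError("need at least the two end-point frames")

    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frame = [
            tuple(s + t * (e - s) for s, e in zip(atom_start, atom_end))
            for atom_start, atom_end in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Toy two-atom "structures" (hypothetical coordinates, in angstroms).
open_form = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
closed_form = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
morph = interpolate_frames(open_form, closed_form, 3)
```

A straight-line path like this can pass through physically unrealistic geometries, which is one reason practical morphing methods refine the intermediate frames.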

Network Tools

Finally, the Gerstein lab has been a pioneer in applying network analysis to generate knowledge from large-scale experiments. To this end, we have developed the TopNet-like Yale Network Analyzer (tYNA) for managing, comparing, and mining multiple networks, both directed and undirected (ref). This tool focuses not on individual genes and proteins but on the relationships between them. For example, it supports identifying defective cliques, finding small network motifs (such as feed-forward loops), calculating global statistics (such as the clustering coefficient and eccentricity), and identifying hubs and bottlenecks. To query life sciences data and metadata more efficiently using semantic web technologies such as the Resource Description Framework (RDF), RDF Site Summary (RSS), and relational-database-to-RDF mapping (D2RQ), we built YeastHub (ref). The network tools developed in the Gerstein lab provide new insights into existing data and make information easy to find.
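
Some of the network measures named above can be sketched in a few lines of Python. The sketch below is illustrative only and is not tYNA's code or interface. The function names, the `top_fraction` cutoff for hubs, and the toy networks are all assumptions.

```python
from itertools import permutations


def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count edges that connect two neighbors of `node` (each pair once).
    links = sum(1 for u in neighbors for v in neighbors
                if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def hubs(adj, top_fraction=0.2):
    """Return the highest-degree nodes (the cutoff is an arbitrary assumption)."""
    ranked = sorted(adj, key=lambda n: len(adj[n]), reverse=True)
    count = max(1, int(len(ranked) * top_fraction))
    return ranked[:count]


def feed_forward_loops(edges):
    """Return all (a, b, c) with directed edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {n for edge in edges for n in edge}
    return sorted(
        (a, b, c) for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


# Toy undirected network: A is connected to everything, B-C close a triangle.
undirected = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
# Toy directed network containing one feed-forward loop.
directed = [("X", "Y"), ("Y", "Z"), ("X", "Z")]

cc_a = clustering_coefficient(undirected, "A")
top_nodes = hubs(undirected, top_fraction=0.25)
ffls = feed_forward_loops(directed)
```

The brute-force motif search checks every ordered triple of nodes, so it only suits small examples. Tools built for genome-scale networks need more efficient algorithms.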
