Tools

Gerstein Lab Publications

Much of our work takes the form of freely available programs and/or web servers for analysis and visualization of biological data. Below is a partial listing of the many resources presented on this site, with links to the papers associated with them.

Please review the permissions statement before using any of these programs.

	The Morph Server generates a plausible pathway between two conformations of a protein or nucleic acid structure. It outputs a large number of statistics and several high-quality movies. [ citation 1 | citation 2 | related ]
	ExpressYourself is an interactive platform for background correction, normalization, scoring, and quality assessment of raw microarray data. [ citation ]
	SPINE is our laboratory information management system (LIMS) for the NorthEast Structural Genomics Consortium. The online version is restricted to consortium users, but most of the code is freely available for download. [ citation 1 | citation 2 ]
	Pseudogene.org is a collection of resources related to our efforts to survey eukaryotic genomes for pseudogene sequences, "pseudo-fold" usage, amino-acid composition, and single-nucleotide polymorphisms (SNPs) to help elucidate the relationships between pseudogene families across several organisms.
	Tiling is under construction.
	TopNet is an automated web tool designed to calculate topological parameters and compare different sub-networks for any given network.
	A number of programs for calculating properties of protein and nucleic acid structures have been collected into a single distribution. Included are a library of utility functions for dealing with structures, and a convenient interactive command-line interpreter. [ related papers ]
	A new algorithm for local clustering of expression data, which finds time-shifted and/or inverted relationships, is available as C source code. [ citation ]
	The publication listings on our site are generated automatically from NCBI data and local annotations stored in XML format. The code that generates them is freely available (some modification will be required for other sites).

IQSeq: integrated isoform quantification analysis based on next-generation sequencing.

J Du, J Leng, L Habegger, A Sboner, D McDermott, M Gerstein (2012). PLoS One 7:e29175.

Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.

N Bhardwaj, A Abyzov, D Clarke, C Shou, MB Gerstein (2011). Protein Sci 20:1745-54.

AlleleSeq: analysis of allele-specific expression and binding in a network framework.

J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7:522.

Identification of genomic indels and structural variations using split reads.

ZD Zhang, J Du, H Lam, A Abyzov, AE Urban, M Snyder, M Gerstein (2011). BMC Genomics 12:375.

ACT: aggregation and correlation toolbox for analyses of genome tracks.

J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27:1152-4.

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.

A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21:974-84.

Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data.

ZJ Lu, KY Yip, G Wang, C Shou, LW Hillier, E Khurana, A Agarwal, R Auerbach, J Rozowsky, C Cheng, M Kato, DM Miller, F Slack, M Snyder, RH Waterston, V Reinke, MB Gerstein (2011). Genome Res 21:276-85.

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.

L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27:281-3.

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.

A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2011). Genome Biol 11:R104.

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library.

HY Lam, XJ Mu, AM Stütz, A Tanzer, PD Cayting, M Snyder, PM Kim, JO Korbel, MB Gerstein (2010). Nat Biotechnol 28:47-55.

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

J Du, RD Bjornson, ZD Zhang, Y Kong, M Snyder, MB Gerstein (2009). PLoS Comput Biol 5:e1000432.

PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.

JO Korbel, A Abyzov, XJ Mu, N Carriero, P Cayting, Z Zhang, M Snyder, MB Gerstein (2010). Genome Biol 10:R23.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.

J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27:66-75.

An integrated system for studying residue coevolution in proteins.

KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24:290-2.

Leveraging the structure of the Semantic Web to enhance information retrieval for proteomics.

A Smith, K Cheung, M Krauthammer, M Schultz, M Gerstein (2007). Bioinformatics 23:3073-9.

PARE: a tool for comparing protein abundance and mRNA expression data.

EZ Yu, AE Burba, M Gerstein (2007). BMC Bioinformatics 8:309.

Tilescope: online analysis pipeline for high-density tiling microarray data.

ZD Zhang, J Rozowsky, HY Lam, J Du, M Snyder, M Gerstein (2007). Genome Biol 8:R81.

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.

JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Nucleic Acids Res 35:D55-60.

ProCAT: a data analysis approach for protein microarrays.

X Zhu, M Gerstein, M Snyder (2007). Genome Biol 7:R110.

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.

LY Wang, M Snyder, M Gerstein (2007). Genome Biol 7:R102.

Helix Interaction Tool (HIT): a web-based tool for analysis of helix-helix interactions in proteins.

AE Burba, U Lehnert, EZ Yu, M Gerstein (2006). Bioinformatics 22:2735-8.

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.

KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22:2968-70.

PseudoPipe: an automated pseudogene identification pipeline.

Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22:1437-9.

PubNet: a flexible system for visualizing literature derived networks.

SM Douglas, GT Montelione, M Gerstein (2005). Genome Biol 6:R80.

YeastHub: a semantic web use case for integrating data in the life sciences domain.

KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005). Bioinformatics 21 Suppl 1:i85-96.

TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics.

H Yu, X Zhu, D Greenbaum, J Karro, M Gerstein (2004). Nucleic Acids Res 32:328-37.

ExpressYourself: A modular platform for processing and visualizing microarray data.

NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31:3477-82.

SPINE 2: a system for collaborative structural proteomics within a federated database framework.

CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31:2833-8.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Qian, M Dolled-Filhart, J Lin, H Yu, M Gerstein (2001). J Mol Biol 314:1053-66.

SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.

P Bertone, Y Kluger, N Lan, D Zheng, D Christendat, A Yee, AM Edwards, CH Arrowsmith, GT Montelione, M Gerstein (2001). Nucleic Acids Res 29:2884-98.

The morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework.

WG Krebs, M Gerstein (2000). Nucleic Acids Res 28:1665-75.
