Gerstein Lab Publications


Much of our work takes the form of freely available programs and/or web servers for analysis and visualization of biological data. Below is a partial listing of the many resources presented on this site, with links to the papers associated with them.

Please review the permissions statement before using any of these programs.

Morph Server: The Morph Server generates a plausible pathway between two conformations of a protein or nucleic acid structure. It outputs a large number of statistics and several high-quality movies.
[ citation1 | citation2 | related ]
ExpressYourself: ExpressYourself is an interactive platform for background correction, normalization, scoring, and quality assessment of raw microarray data.
[ citation ]
SPINE: SPINE is our laboratory information management system (LIMS) for the NorthEast Structural Genomics Consortium. The online version is restricted to consortium users, but most of the code is freely available for download.
[ citation1 | citation2 ]
Pseudogenes: Pseudogene.org is a collection of resources related to our efforts to survey eukaryotic genomes for pseudogene sequences, "pseudo-fold" usage, amino-acid composition, and single-nucleotide polymorphisms (SNPs), with the aim of elucidating the relationships between pseudogene families across several organisms.
Tiling: This resource is under construction.
TopNet: TopNet is an automated web tool that calculates topological parameters and compares different sub-networks of any given network.
Protein Geometry: A number of programs for calculating properties of protein and nucleic acid structures have been collected into a single distribution. It includes a library of utility functions for working with structures and an interactive command-line interpreter. [ related papers ]
Local Clustering: A new algorithm for local clustering of expression data, which finds time-shifted and/or inverted relationships, is available as C source code. [ citation ]
Papers: The publication listings on our site are generated automatically from NCBI data and local annotations stored in XML format. The code that does this is freely available, though some modification will be required for use on other sites.
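
To illustrate what "time-shifted and/or inverted relationships" between expression profiles means, here is a minimal Python sketch. It is not the local clustering algorithm or the C source mentioned above; it swaps in a much simpler technique, a scan over lagged Pearson correlations, purely for illustration. The function names and parameters (pearson, best_shifted_match, max_shift, min_overlap) are hypothetical, and both profiles are assumed to have the same length.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_shifted_match(x, y, max_shift=2, min_overlap=3):
    """Scan time shifts and return (shift, r) with the largest |r|.

    shift > 0 means y follows x by `shift` time points.
    r < 0 indicates an inverted relationship.
    Illustrative only: a lag scan, not the published local clustering method.
    """
    best = (0, 0.0)
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            xs, ys = x[:len(x) - shift], y[shift:]
        else:
            xs, ys = x[-shift:], y[:len(y) + shift]
        if len(xs) < min_overlap:
            continue  # too few overlapping time points to correlate
        r = pearson(xs, ys)
        if abs(r) > abs(best[1]):
            best = (shift, r)
    return best


# Hypothetical profiles: y is an inverted copy of x delayed by one time point.
x = [0, 1, 4, 2, 0, 1]
y = [0, 0, -1, -4, -2, 0]
shift, r = best_shifted_match(x, y)
print(shift, round(r, 3))
```

A strongly negative r at a non-zero shift is the kind of relationship that a plain, unshifted correlation would miss.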


IQSeq: integrated isoform quantification analysis based on next-generation sequencing.
J Du, J Leng, L Habegger, A Sboner, D McDermott, M Gerstein (2012). PLoS One 7:e29175.
website
medline

Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.
N Bhardwaj, A Abyzov, D Clarke, C Shou, MB Gerstein (2011). Protein Sci 20:1745-54.
website
medline

AlleleSeq: analysis of allele-specific expression and binding in a network framework.
J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7:522.
website
medline

Identification of genomic indels and structural variations using split reads.
ZD Zhang, J Du, H Lam, A Abyzov, AE Urban, M Snyder, M Gerstein (2011). BMC Genomics 12:375.
medline

ACT: aggregation and correlation toolbox for analyses of genome tracks.
J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27:1152-4.
website
medline

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21:974-84.
website
medline

Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data.
ZJ Lu, KY Yip, G Wang, C Shou, LW Hillier, E Khurana, A Agarwal, R Auerbach, J Rozowsky, C Cheng, M Kato, DM Miller, F Slack, M Snyder, RH Waterston, V Reinke, MB Gerstein (2011). Genome Res 21:276-85.
website
preprint
medline

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27:281-3.
website
medline

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.
A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2011). Genome Biol 11:R104.
website
medline

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library.
HY Lam, XJ Mu, AM Stütz, A Tanzer, PD Cayting, M Snyder, PM Kim, JO Korbel, MB Gerstein (2010). Nat Biotechnol 28:47-55.
website
preprint
medline

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.
J Du, RD Bjornson, ZD Zhang, Y Kong, M Snyder, MB Gerstein (2009). PLoS Comput Biol 5:e1000432.
website
preprint
medline

PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.
JO Korbel, A Abyzov, XJ Mu, N Carriero, P Cayting, Z Zhang, M Snyder, MB Gerstein (2010). Genome Biol 10:R23.
preprint
medline

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27:66-75.
website
preprint
medline

An integrated system for studying residue coevolution in proteins.
KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24:290-2.
website
preprint
medline

Leveraging the structure of the Semantic Web to enhance information retrieval for proteomics.
A Smith, K Cheung, M Krauthammer, M Schultz, M Gerstein (2007). Bioinformatics 23:3073-9.
website
preprint
medline

PARE: a tool for comparing protein abundance and mRNA expression data.
EZ Yu, AE Burba, M Gerstein (2007). BMC Bioinformatics 8:309.
website
preprint
medline

Tilescope: online analysis pipeline for high-density tiling microarray data.
ZD Zhang, J Rozowsky, HY Lam, J Du, M Snyder, M Gerstein (2007). Genome Biol 8:R81.
website
preprint
medline

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.
JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Nucleic Acids Res 35:D55-60.
website
preprint
medline

ProCAT: a data analysis approach for protein microarrays.
X Zhu, M Gerstein, M Snyder (2007). Genome Biol 7:R110.
website
preprint
medline

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.
LY Wang, M Snyder, M Gerstein (2007). Genome Biol 7:R102.
website
preprint
medline

Helix Interaction Tool (HIT): a web-based tool for analysis of helix-helix interactions in proteins.
AE Burba, U Lehnert, EZ Yu, M Gerstein (2006). Bioinformatics 22:2735-8.
website
preprint
medline

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.
KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22:2968-70.
website
preprint
medline

PseudoPipe: an automated pseudogene identification pipeline.
Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22:1437-9.
website
preprint
medline

PubNet: a flexible system for visualizing literature derived networks.
SM Douglas, GT Montelione, M Gerstein (2005). Genome Biol 6:R80.
website
preprint
medline

YeastHub: a semantic web use case for integrating data in the life sciences domain.
KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005). Bioinformatics 21 Suppl 1:i85-96.
website
preprint
medline

TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics.
H Yu, X Zhu, D Greenbaum, J Karro, M Gerstein (2004). Nucleic Acids Res 32:328-37.
website
preprint
medline

ExpressYourself: A modular platform for processing and visualizing microarray data.
NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31:3477-82.
website
preprint
medline

SPINE 2: a system for collaborative structural proteomics within a federated database framework.
CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31:2833-8.
website
preprint
medline

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.
J Qian, M Dolled-Filhart, J Lin, H Yu, M Gerstein (2001). J Mol Biol 314:1053-66.
website
preprint
medline

SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.
P Bertone, Y Kluger, N Lan, D Zheng, D Christendat, A Yee, AM Edwards, CH Arrowsmith, GT Montelione, M Gerstein (2001). Nucleic Acids Res 29:2884-98.
website
preprint
medline

The morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework.
WG Krebs, M Gerstein (2000). Nucleic Acids Res 28:1665-75.
website
preprint
medline

