Gerstein Lab Publications



These papers are a good introduction to the lab and to the general field of bioinformatics. They are aimed at readers with a CS background. Specific fields are highlighted below.
Networks
PM Kim, JO Korbel, MB Gerstein (2007); PM Kim, LJ Lu, Y Xia, MB Gerstein (2006); LJ Lu, Y Xia, A Paccanaro, H Yu, M Gerstein (2005); H Yu, PM Kim, E Sprecher, V Trifonov, M Gerstein (2007)
Social Networks
H Yu, M Gerstein (2006); SM Douglas, GT Montelione, M Gerstein (2005)
Representative Web tools
KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005); KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006); KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008)
Semantic web and knowledge representation
A Smith, K Cheung, M Krauthammer, M Schultz, M Gerstein (2007); H Yu, R Jansen, G Stolovitzky, M Gerstein (2007)
Genome annotation + mining (very computationally oriented)
TE Royce, JS Rozowsky, MB Gerstein (2007); ZD Zhang, A Paccanaro, Y Fu, S Weissman, Z Weng, J Chang, M Snyder, MB Gerstein (2007); Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006); J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006); TE Royce, NJ Carriero, MB Gerstein (2007)
Motions and simulation
S Flores, N Echols, D Milburn, B Hespenheide, K Keating, J Lu, S Wells, EZ Yu, M Thorpe, M Gerstein (2006)
Easy-to-read pieces
A Smith, M Gerstein (2006); M Gerstein, D Zheng (2006); M Gerstein, M Levitt (1998)

Machine learning and genome annotation: a match meant to be?
KY Yip, C Cheng, M Gerstein (2013). Genome Biol 14: 205.
 
preprint
medline

Architecture of the human regulatory network derived from ENCODE data.
MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, J Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.
website
 
medline

Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors.
KY Yip, C Cheng, N Bhardwaj, JB Brown, J Leng, A Kundaje, J Rozowsky, E Birney, P Bickel, M Snyder, M Gerstein (2012). Genome Biol 13: R48.
website
 
medline

VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.
L Habegger, S Balasubramanian, DZ Chen, E Khurana, A Sboner, A Harmanci, J Rozowsky, D Clarke, M Snyder, M Gerstein (2012). Bioinformatics 28: 2267-9.
website
 
medline

TIP: a probabilistic method for identifying transcription factor target genes from ChIP-seq binding profiles.
C Cheng, R Min, M Gerstein (2011). Bioinformatics 27: 3221-7.
 
preprint
medline

The spread of scientific information: insights from the web usage statistics in PLoS article-level metrics.
KK Yan, M Gerstein (2011). PLoS One 6: e19917.
 
 
medline

Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model.
ZD Zhang, MB Gerstein (2010). BMC Bioinformatics 11: 539.
 
 
medline

Using semantic web rules to reason on an ontology of pseudogenes.
ME Holford, E Khurana, KH Cheung, M Gerstein (2010). Bioinformatics 26: i71-8.
website
 
medline

Genome-wide sequence-based prediction of peripheral proteins using a novel semi-supervised learning technique.
N Bhardwaj, M Gerstein, H Lu (2010). BMC Bioinformatics 11 Suppl 1: S6.
 
preprint
medline

Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels.
KY Yip, PM Kim, D McDermott, M Gerstein (2009). BMC Bioinformatics 10: 241.
 
preprint
medline

Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions.
KY Yip, M Gerstein (2009). Bioinformatics 25: 243-50.
website
preprint
medline

Manually structured digital abstracts: a scaffold for automatic text mining.
M Seringhaus, M Gerstein (2008). FEBS Lett 582: 1170.
 
preprint
medline

Uncovering trends in gene naming.
MR Seringhaus, PD Cayting, MB Gerstein (2008). Genome Biol 9: 401.
website
preprint
medline

An integrated system for studying residue coevolution in proteins.
KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24: 290-2.
website
preprint
medline

Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context.
PM Kim, JO Korbel, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 20274-9.
website
preprint
medline

Leveraging the structure of the Semantic Web to enhance information retrieval for proteomics.
A Smith, K Cheung, M Krauthammer, M Schultz, M Gerstein (2007). Bioinformatics 23: 3073-9.
website
preprint
medline

Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification.
TE Royce, JS Rozowsky, MB Gerstein (2007). Nucleic Acids Res 35: e99.
 
preprint
medline

Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.
ZD Zhang, A Paccanaro, Y Fu, S Weissman, Z Weng, J Chang, M Snyder, MB Gerstein (2007). Genome Res 17: 787-97.
website
preprint
medline

An efficient pseudomedian filter for tiling microrrays.
TE Royce, NJ Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 186.
website
preprint
medline

Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications.
H Yu, R Jansen, G Stolovitzky, M Gerstein (2007). Bioinformatics 23: 2163-73.
website
preprint
medline

Structured digital abstract makes text mining easy.
M Gerstein, M Seringhaus, S Fields (2007). Nature 447: 142.
 
preprint
medline

The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics.
H Yu, PM Kim, E Sprecher, V Trifonov, M Gerstein (2007). PLoS Comput Biol 3: e59.
website
preprint
medline

Relating three-dimensional structures to protein networks provides evolutionary insights.
PM Kim, LJ Lu, Y Xia, MB Gerstein (2006). Science 314: 1938-41.
website
preprint
medline

Data mining on the web.
A Smith, M Gerstein (2006). Science 314: 1682; author reply 1682.
 
preprint
medline

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.
LY Wang, M Snyder, M Gerstein (2006). Genome Biol 7: R102.
website
preprint
medline

A supervised hidden markov model framework for efficiently segmenting tiling array data in transcriptional and chIP-chip experiments: systematically incorporating validated biological knowledge.
J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006). Bioinformatics 22: 3016-24.
website
preprint
medline

Integration of curated databases to identify genotype-phenotype associations.
CS Goh, TA Gianoulis, Y Liu, J Li, A Paccanaro, YA Lussier, M Gerstein (2006). BMC Genomics 7: 257.
website
preprint
medline

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.
KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22: 2968-70.
website
preprint
medline

Genomic analysis of the hierarchical structure of regulatory networks.
H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.
website
preprint
medline

The real life of pseudogenes.
M Gerstein, D Zheng (2006). Sci Am 295: 48-55.
website
preprint
medline

PseudoPipe: an automated pseudogene identification pipeline.
Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22: 1437-9.
website
preprint
medline

The Database of Macromolecular Motions: new features added at the decade mark.
S Flores, N Echols, D Milburn, B Hespenheide, K Keating, J Lu, S Wells, EZ Yu, M Thorpe, M Gerstein (2006). Nucleic Acids Res 34: D296-301.
website
preprint
medline

PubNet: a flexible system for visualizing literature derived networks.
SM Douglas, GT Montelione, M Gerstein (2005). Genome Biol 6: R80.
website
preprint
medline

Assessing the limits of genomic data integration for predicting protein networks.
LJ Lu, Y Xia, A Paccanaro, H Yu, M Gerstein (2005). Genome Res 15: 945-53.
website
preprint
medline

YeastHub: a semantic web use case for integrating data in the life sciences domain.
KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005). Bioinformatics 21 Suppl 1: i85-96.
website
preprint
medline

Determining the minimum number of types necessary to represent the sizes of protein atoms.
J Tsai, N Voss, M Gerstein (2001). Bioinformatics 17: 949-56.
website
preprint
medline

An XML application for genomic data interoperation.
KH Cheung, Y Liu, K Kumar, M Snyder, M Gerstein, P Miller (2001). IEEE International Symposium on Bio-Informatics and Biomedical Engineering (BIBE): 97-103.
 
preprint
 

