Gerstein Lab Publications



The papers below are a good selection for introducing the lab and the general field of bioinformatics.

One might start with Luscombe et al. (2001) for an introduction to bioinformatics.

Specific fields are highlighted in other short papers:

Data integration for function prediction
Gerstein et al. (2002); Bertone and Gerstein (2001); Greenbaum et al. (2001)
Simulation of molecular structure
Gerstein and Levitt (1998); Gerstein and Chothia (1999)
Pseudogenes and genome annotation
Snyder and Gerstein (2003); Harrison and Gerstein (2002)
Structural genomics
Gerstein (2000); Teichmann et al. (1999)
E-publishing
Gerstein (1999)
Expression analysis
Gerstein and Jansen (2000)
Within each list, it is probably best to start with the paper listed first.

Unfortunately, none of these papers gives much detail on the mathematical or computational aspects of the work. For that, it is best to look at some samples of recent research, which are listed below, to get a sense of the type of data mining that we are doing, e.g. Yu & Gerstein (2006), Du et al. (2006), Lu et al. (2005), Yip et al. (2007), and Echols et al. (2003).


Architecture of the human regulatory network derived from ENCODE data.
MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, J Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.

A systematic survey of loss-of-function variants in human protein-coding genes.
DG MacArthur, S Balasubramanian, A Frankish, N Huang, J Morris, K Walter, L Jostins, L Habegger, JK Pickrell, SB Montgomery, CA Albers, ZD Zhang, DF Conrad, G Lunter, H Zheng, Q Ayub, MA DePristo, E Banks, M Hu, RE Handsaker, JA Rosenfeld, M Fromer, M Jin, XJ Mu, E Khurana, K Ye, M Kay, GI Saunders, MM Suner, T Hunt, IH Barnes, C Amid, DR Carvalho-Silva, AH Bignell, C Snow, B Yngvadottir, S Bumpstead, DN Cooper, Y Xue, IG Romero, 1000 Genomes Project Consortium, J Wang, Y Li, RA Gibbs, SA McCarroll, ET Dermitzakis, JK Pritchard, JC Barrett, J Harrow, ME Hurles, MB Gerstein, C Tyler-Smith (2012). Science 335:823-8.
 
 

Novel insights through the integration of structural and functional genomics data with protein networks.
D Clarke, N Bhardwaj, MB Gerstein (2012). J Struct Biol 179: 320-6.
 

AlleleSeq: analysis of allele-specific expression and binding in a network framework.
J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7: 522.

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.
XJ Mu, ZJ Lu, Y Kong, HY Lam, MB Gerstein (2011). Nucleic Acids Res 39: 7058-76.

Gene inactivation and its implications for annotation in the era of personal genomics.
S Balasubramanian, L Habegger, A Frankish, DG MacArthur, R Harte, C Tyler-Smith, J Harrow, M Gerstein (2011). Genes Dev 25: 1-10.
 
 

Getting started in gene orthology and functional analysis.
G Fang, N Bhardwaj, R Robilotto, MB Gerstein (2010). PLoS Comput Biol 6: e1000703.

RigidFinder: a fast and sensitive method to detect rigid blocks in large macromolecular complexes.
A Abyzov, R Bjornson, M Felipe, M Gerstein (2010). Proteins 78: 309-24.

Understanding modularity in molecular networks requires dynamics.
RP Alexander, PM Kim, T Emonet, MB Gerstein (2009). Sci Signal 2: pe44.
 

Systematic identification of transcription factors associated with patient survival in cancers.
C Cheng, LM Li, P Alves, M Gerstein (2009). BMC Genomics 10: 225.
 

Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes.
S Balasubramanian, D Zheng, YJ Liu, G Fang, A Frankish, N Carriero, R Robilotto, P Cayting, M Gerstein (2009). Genome Biol 10: R2.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27: 66-75.

Pseudofam: the pseudogene families database.
HY Lam, E Khurana, G Fang, P Cayting, N Carriero, KH Cheung, MB Gerstein (2009). Nucleic Acids Res 37: D738-43.

Genomics: protein fossils live on as RNA.
R Sasidharan, M Gerstein (2008). Nature 453: 729-31.
 

A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.
J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006). Bioinformatics 22: 3016-24.

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.
KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22: 2968-70.

A computational approach for identifying pseudogenes in the ENCODE regions.
D Zheng, MB Gerstein (2006). Genome Biol 7 Suppl 1: S13.1-10.

Assessing the limits of genomic data integration for predicting protein networks.
LJ Lu, Y Xia, A Paccanaro, H Yu, M Gerstein (2005). Genome Res 15: 945-53.

A Bayesian networks approach for predicting protein-protein interactions from genomic data.
R Jansen, H Yu, D Greenbaum, Y Kluger, NJ Krogan, S Chung, A Emili, M Snyder, JF Greenblatt, M Gerstein (2003). Science 302: 449-53.

Genomics. Defining genes in the genomics era.
M Snyder, M Gerstein (2003). Science 300: 258-60.

MolMovDB: analysis and visualization of conformational change and structural flexibility.
N Echols, D Milburn, M Gerstein (2003). Nucleic Acids Res 31: 478-82.

Studying genomes through the aeons: protein families, pseudogenes and proteome evolution.
PM Harrison, M Gerstein (2002). J Mol Biol 318: 1155-74.

Proteomics. Integrating interactomes.
M Gerstein, N Lan, R Jansen (2002). Science 295: 284-7.

What is bioinformatics? A proposed definition and overview of the field.
NM Luscombe, D Greenbaum, M Gerstein (2001). Methods Inf Med 40: 346-58.

Interrelating different types of genomic data, from proteome to secretome: 'oming in on function.
D Greenbaum, NM Luscombe, R Jansen, J Qian, M Gerstein (2001). Genome Res 11: 1463-8.

Integrative data mining: the new direction in bioinformatics.
P Bertone, M Gerstein (2001). IEEE Eng Med Biol Mag 20: 33-40.
 

Integrative database analysis in structural genomics.
M Gerstein (2000). Nat Struct Biol 7 Suppl: 960-3.

The current excitement in bioinformatics-analysis of whole-genome expression data: how does it relate to protein structure and function?
M Gerstein, R Jansen (2000). Curr Opin Struct Biol 10: 574-84.

Perspectives: signal transduction. Proteins in motion.
M Gerstein, C Chothia (1999). Science 285: 1682-3.

E-publishing on the Web: promises, pitfalls, and payoffs for bioinformatics.
M Gerstein (1999). Bioinformatics 15: 429-31.
 

Advances in structural genomics.
SA Teichmann, C Chothia, M Gerstein (1999). Curr Opin Struct Biol 9: 390-9.

