Gerstein Lab Publications



The papers below are a good selection for introducing the lab and the general field of bioinformatics.

One might start with Luscombe et al. (2002) for an introduction to bioinformatics.

Specific fields are highlighted in other short papers:

Data integration for function prediction
Cheng, et al., PLoS Comput Biol. (2011); Cheng, et al., Genome Biol. (2011); Khurana, et al., PLoS Comput Biol. (2013)
Structural Variation
Abyzov, et al., Genome Res. (2011)
Pseudogenes and loss of function
Pei, et al., Genome Biol. (2012); Khurana, et al., Nucleic Acids Res. (2010); Habegger, et al., Bioinfo. (2012)
Structure
Voss and Gerstein, Nucleic Acids Res. (2010); Bhardwaj, et al., Protein Sci. (2011)
Networks
Yan, et al., Proc Natl Acad Sci. (2010); Bhardwaj, et al., Sci Signal. (2010); Yu and Gerstein, Proc Natl Acad Sci. (2006)
Expression Analysis
Cheng, et al., Genome Res. (2012); Sboner, et al., Genome Biol. (2010); Habegger, et al., Bioinfo. (2011)
Data Science Issues
Gerstein, Nature (2012); Greenbaum, et al., PLoS Comput Biol. (2011); Greenbaum and Gerstein, SF Op Ed (2012)
In each list, it is probably best to start with the paper listed first.

Unfortunately, none of these papers gives much detail on the mathematical or computational aspects of the work. For that, it's best to look at intro-cs.


Interpretation of genomic variants using a unified biological network approach.
E Khurana, Y Fu, J Chen, M Gerstein (2013). PLoS Comput Biol 9: e1002886.

Genomics: ENCODE leads the way on big data.
M Gerstein (2012). Nature 489: 208.
website
preprint
medline

Understanding transcriptional regulation by integrative analysis of transcription factor binding data.
C Cheng, R Alexander, R Min, J Leng, KY Yip, J Rozowsky, KK Yan, X Dong, S Djebali, Y Ruan, CA Davis, P Carninci, T Lassman, TR Gingeras, R Guigo, E Birney, Z Weng, M Snyder, M Gerstein (2012). Genome Res 22: 1658-67.

The GENCODE pseudogene resource.
B Pei, C Sisu, A Frankish, C Howald, L Habegger, XJ Mu, R Harte, S Balasubramanian, A Tanzer, M Diekhans, A Reymond, TJ Hubbard, J Harrow, MB Gerstein (2012). Genome Biol 13: R51.

VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.
L Habegger, S Balasubramanian, DZ Chen, E Khurana, A Sboner, A Harmanci, J Rozowsky, D Clarke, M Snyder, M Gerstein (2012). Bioinformatics 28: 2267-9.

Genomics and Privacy: Implications of the New Reality of Closed Data for the Field.
D Greenbaum, A Sboner, XJ Mu, M Gerstein (2011). PLoS Comput Biol 7: e1002278.

Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data.
C Cheng, KK Yan, W Hwang, J Qian, N Bhardwaj, J Rozowsky, ZJ Lu, W Niu, P Alves, M Kato, M Snyder, M Gerstein (2011). PLoS Comput Biol 7: e1002190.

Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.
N Bhardwaj, A Abyzov, D Clarke, C Shou, MB Gerstein (2011). Protein Sci 20: 1745-54.

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21: 974-84.

A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets.
C Cheng, KK Yan, KY Yip, J Rozowsky, R Alexander, C Shou, M Gerstein (2011). Genome Biol 12: R15.

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27: 281-3.

Rewiring of transcriptional regulatory networks: hierarchy, rather than connectivity, better reflects the importance of regulators.
N Bhardwaj, PM Kim, MB Gerstein (2010). Sci Signal 3: ra79.

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.
A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2010). Genome Biol 11: R104.

Segmental duplications in the human genome reveal details of pseudogene formation.
E Khurana, HY Lam, C Cheng, N Carriero, P Cayting, MB Gerstein (2010). Nucleic Acids Res 38: 6997-7007.

3V: cavity, channel and cleft volume calculator and extractor.
NR Voss, M Gerstein (2010). Nucleic Acids Res 38: W555-62.

Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks.
KK Yan, G Fang, N Bhardwaj, RP Alexander, M Gerstein (2010). Proc Natl Acad Sci U S A 107: 9186-91.

Genomic analysis of the hierarchical structure of regulatory networks.
H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.

