Gerstein Lab Publications

Gerstein lab contribution to the Yale Center of Excellence in Genomic Science (CEGS).

P50 HG02357-01 (PI Snyder)
9/1/01 - 8/31/06
NIH/NHGRI
Human Genome Array: Technology for Functional Analysis
Role: co-PI and informatics co-director

The grant funds the Yale CEGS (Center of Excellence in Genomic Science), which focuses on building human genome arrays (covering whole chromosomes, not just genes) and on developing novel technologies that use these arrays for functional analysis of the human genome. The Gerstein lab's contribution is to construct computational tools for designing and building the arrays, analyzing the resulting data, and integrating those data with other genomic information.

Further information about the center can be found at http://bioinfo.mbb.yale.edu/array

Related material:

Year 1 report [ Not available ]
Year 2 report [ html ]
Year 3 report [ html ]
Year 4 report [ html ]
Year 6 report [ html ]
Year 7 report [ html ]
Year 7 Meeting Abstract [ html ]

Recent Yale CEGS tools and datasets available on the web (17-Sep-08) [ html ]


Articles funded by this grant:
AlleleSeq: analysis of allele-specific expression and binding in a network framework.
J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7: 522.
website
 
medline

Identification of genomic indels and structural variations using split reads.
ZD Zhang, J Du, H Lam, A Abyzov, AE Urban, M Snyder, M Gerstein (2011). BMC Genomics 12: 375.
 
 
medline

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.
XJ Mu, ZJ Lu, Y Kong, HY Lam, MB Gerstein (2011). Nucleic Acids Res 39: 7058-76.
website
 
medline

ACT: aggregation and correlation toolbox for analyses of genome tracks.
J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27: 1152-4.
website
 
medline

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21: 974-84.
website
 
medline

AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision.
A Abyzov, M Gerstein (2011). Bioinformatics 27: 595-603.
website
 
medline

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27: 281-3.
website
 
medline

Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels.
N Bhardwaj, KK Yan, MB Gerstein (2010). Proc Natl Acad Sci U S A 107: 6841-6.
website
preprint
medline

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library.
HY Lam, XJ Mu, AM Stutz, A Tanzer, PD Cayting, M Snyder, PM Kim, JO Korbel, MB Gerstein (2010). Nat Biotechnol 28: 47-55.
website
preprint
medline

The relationship between the evolution of microRNA targets and the length of their UTRs.
C Cheng, N Bhardwaj, M Gerstein (2009). BMC Genomics 10: 431.
 
preprint
medline

Integrated assessment of genomic correlates of protein evolutionary rate.
Y Xia, EA Franzosa, MB Gerstein (2009). PLoS Comput Biol 5: e1000413.
 
preprint
medline

Personal phenotypes to go with personal genomes.
M Snyder, S Weissman, M Gerstein (2009). Mol Syst Biol 5: 273.
 
preprint
medline

Quantifying environmental adaptation of metabolic pathways in metagenomics.
TA Gianoulis, J Raes, PV Patel, R Bjornson, JO Korbel, I Letunic, T Yamada, A Paccanaro, LJ Jensen, M Snyder, P Bork, MB Gerstein (2009). Proc Natl Acad Sci U S A 106: 1374-9.
website
preprint
medline

Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing.
P Lefrancois, GM Euskirchen, RK Auerbach, J Rozowsky, T Gibson, CM Yellman, M Gerstein, M Snyder (2009). BMC Genomics 10: 37.
 
preprint
medline

A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster.
X Zhang, Z Lian, C Padden, MB Gerstein, J Rozowsky, M Snyder, TR Gingeras, P Kapranov, SM Weissman, PE Newburger (2009). Blood 113: 2526-34.
 
preprint
medline

MSB: a mean-shift-based approach for the analysis of structural variation in the genome.
LY Wang, A Abyzov, JO Korbel, M Snyder, M Gerstein (2009). Genome Res 19: 106-17.
 
preprint
medline

RNA-Seq: a revolutionary tool for transcriptomics.
Z Wang, M Gerstein, M Snyder (2009). Nat Rev Genet 10: 57-63.
 
preprint
medline

High-resolution copy-number variation map reflects human olfactory receptor diversity and evolution.
Y Hasin, T Olender, M Khen, C Gonzaga-Jauregui, PM Kim, AE Urban, M Snyder, MB Gerstein, D Lancet, JO Korbel (2008). PLoS Genet 4: e1000249.
 
preprint
medline

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history.
PM Kim, HY Lam, AE Urban, JO Korbel, J Affourtit, F Grubert, X Chen, S Weissman, M Snyder, MB Gerstein (2008). Genome Res 18: 1865-74.
website
preprint
medline

The transcriptional landscape of the yeast genome defined by RNA sequencing.
U Nagalakshmi, Z Wang, K Waern, C Shou, D Raha, M Gerstein, M Snyder (2008). Science 320: 1344-9.
 
preprint
medline

Rapid evolution by positive Darwinian selection in T-cell antigen CD4 in primates.
ZD Zhang, G Weinstock, M Gerstein (2008). J Mol Evol 66: 446-56.
 
preprint
medline

An integrated system for studying residue coevolution in proteins.
KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24: 290-2.
website
preprint
medline

Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context.
PM Kim, JO Korbel, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 20274-9.
website
preprint
medline

Paired-end mapping reveals extensive structural variation in the human genome.
JO Korbel, AE Urban, JP Affourtit, B Godwin, F Grubert, JF Simons, PM Kim, D Palejev, NJ Carriero, L Du, BE Taillon, Z Chen, A Tanzer, AC Saunders, J Chi, F Yang, NP Carter, ME Hurles, SM Weissman, TT Harkins, MB Gerstein, M Egholm, M Snyder (2007). Science 318: 420-6.
website
preprint
medline

Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification.
TE Royce, JS Rozowsky, MB Gerstein (2007). Nucleic Acids Res 35: e99.
 
preprint
medline

An efficient pseudomedian filter for tiling microarrays.
TE Royce, NJ Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 186.
website
preprint
medline

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome.
JO Korbel, AE Urban, F Grubert, J Du, TE Royce, P Starr, G Zhong, BS Emanuel, SM Weissman, M Snyder, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 10110-5.
website
preprint
medline

Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications.
H Yu, R Jansen, G Stolovitzky, M Gerstein (2007). Bioinformatics 23: 2163-73.
website
preprint
medline

Assessing the need for sequence-based normalization in tiling microarray experiments.
TE Royce, JS Rozowsky, MB Gerstein (2007). Bioinformatics 23: 988-97.
website
preprint
medline

The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?
D Zheng, MB Gerstein (2007). Trends Genet 23: 219-24.
website
preprint
medline

New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis.
MG Smith, TA Gianoulis, S Pukatzki, JJ Mekalanos, LN Ornston, M Gerstein, M Snyder (2007). Genes Dev 21: 601-14.
 
preprint
medline

Comparative analysis of genome tiling array data reveals many novel primate-specific functional RNAs in human.
Z Zhang, AW Pang, M Gerstein (2007). BMC Evol Biol 7 Suppl 1: S14.
 
preprint
medline

Positional artifacts in microarrays: experimental verification and construction of COP, an automated detection tool.
H Yu, K Nguyen, T Royce, J Qian, K Nelson, M Snyder, M Gerstein (2007). Nucleic Acids Res 35: e8.
website
preprint
medline

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.
JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Nucleic Acids Res 35: D55-60.
website
preprint
medline

Genomic analysis of the hierarchical structure of regulatory networks.
H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.
website
preprint
medline

Extrapolating traditional DNA microarray statistics to tiling and protein microarray technologies.
TE Royce, JS Rozowsky, NM Luscombe, O Emanuelsson, H Yu, X Zhu, M Snyder, MB Gerstein (2006). Methods Enzymol 411: 282-311.
 
preprint
medline

Predicting essential genes in fungal genomes.
M Seringhaus, A Paccanaro, A Borneman, M Snyder, M Gerstein (2006). Genome Res 16: 1126-35.
website
preprint
medline

The real life of pseudogenes.
M Gerstein, D Zheng (2006). Sci Am 295: 48-55.
website
preprint
medline

PseudoPipe: an automated pseudogene identification pipeline.
Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22: 1437-9.
website
preprint
medline

High-resolution mapping of DNA copy alterations in human chromosome 22 using high-density tiling oligonucleotide arrays.
AE Urban, JO Korbel, R Selzer, T Richmond, A Hacker, GV Popescu, JF Cubells, R Green, BS Emanuel, MB Gerstein, SM Weissman, M Snyder (2006). Proc Natl Acad Sci U S A 103: 4534-9.
 
preprint
medline

Design optimization methods for genomic DNA tiling arrays.
P Bertone, V Trifonov, JS Rozowsky, F Schubert, O Emanuelsson, J Karro, MY Kao, M Snyder, M Gerstein (2006). Genome Res 16: 271-81.
website
preprint
medline

Global changes in STAT target selection and transcription regulation upon interferon treatments.
SE Hartman, P Bertone, AK Nath, TE Royce, M Gerstein, S Weissman, M Snyder (2005). Genes Dev 19: 2953-68.
website
preprint
medline

Network security and data integrity in academia: an assessment and a proposal for large-scale archiving.
A Smith, D Greenbaum, SM Douglas, M Long, M Gerstein (2005). Genome Biol 6: 119.
 
preprint
medline

Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping.
TE Royce, JS Rozowsky, P Bertone, M Samanta, V Stolc, S Weissman, M Snyder, M Gerstein (2005). Trends Genet 21: 466-75.
 
preprint
medline

Integrated pseudogene annotation for human chromosome 22: evidence for transcription.
D Zheng, Z Zhang, PM Harrison, J Karro, N Carriero, M Gerstein (2005). J Mol Biol 349: 27-45.
website
preprint
medline

Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles.
Y Gilad, SA Rifkin, P Bertone, M Gerstein, KP White (2005). Genome Res 15: 674-80.
 
preprint
medline

Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability.
PM Harrison, D Zheng, Z Zhang, N Carriero, M Gerstein (2005). Nucleic Acids Res 33: 2374-83.
website
preprint
medline

The temporal patterning microRNA let-7 regulates several transcription factors at the larval to adult transition in C. elegans.
H Grosshans, T Johnson, KL Reinert, M Gerstein, FJ Slack (2005). Dev Cell 8: 321-30.
website
preprint
medline

A high productivity/low maintenance approach to high-performance computation for biomedicine: four case studies.
N Carriero, MV Osier, KH Cheung, PL Miller, M Gerstein, H Zhao, B Wu, S Rifkin, J Chang, H Zhang, K White, K Williams, M Schultz (2005). J Am Med Inform Assoc 12: 90-8.
 
preprint
medline

DNA replication-timing analysis of human chromosome 22 at high resolution and different developmental states.
EJ White, O Emanuelsson, D Scalzo, T Royce, S Kosak, EJ Oakeley, S Weissman, M Gerstein, M Groudine, M Snyder, D Schubeler (2004). Proc Natl Acad Sci U S A 101: 17771-6.
website
preprint
medline

Fast optimal genome tiling with applications to microarray design and homology search.
P Berman, P Bertone, B DasGupta, M Gerstein, MY Kao, M Snyder (2004). J Comput Biol 11: 766-85.
website
preprint
medline

Global identification of human transcribed sequences with genome tiling arrays.
P Bertone, V Stolc, TE Royce, JS Rozowsky, AE Urban, X Zhu, JL Rinn, W Tongprasit, M Samanta, S Weissman, M Gerstein, M Snyder (2004). Science 306: 2242-6.
website
preprint
medline

Large-scale analysis of pseudogenes in the human genome.
Z Zhang, M Gerstein (2004). Curr Opin Genet Dev 14: 328-35.
website
preprint
medline

Major molecular differences between mammalian sexes are involved in drug metabolism and renal function.
JL Rinn, JS Rozowsky, IJ Laurenzi, PH Petersen, K Zou, W Zhong, M Gerstein, M Snyder (2004). Dev Cell 6: 791-800.
 
preprint
medline

CREB binds to multiple loci on human chromosome 22.
G Euskirchen, TE Royce, P Bertone, R Martone, JL Rinn, FK Nelson, F Sayward, NM Luscombe, P Miller, M Gerstein, S Weissman, M Snyder (2004). Mol Cell Biol 24: 3804-14.
website
preprint
medline

Comparative analysis of processed pseudogenes in the mouse and human genomes.
Z Zhang, N Carriero, M Gerstein (2004). Trends Genet 20: 62-7.
website
preprint
medline

Identification of novel functional elements in the human genome.
Z Lian, G Euskirchen, J Rinn, R Martone, P Bertone, S Hartman, T Royce, K Nelson, F Sayward, N Luscombe, J Yang, JL Li, P Miller, AE Urban, M Gerstein, S Weissman, M Snyder (2003). Cold Spring Harb Symp Quant Biol 68: 317-22.
 
preprint
medline

Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome.
Z Zhang, PM Harrison, Y Liu, M Gerstein (2003). Genome Res 13: 2541-58.
website
preprint
medline

A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs.
PM Harrison, N Carriero, Y Liu, M Gerstein (2003). J Mol Biol 333: 885-92.
website
preprint
medline

Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data.
J Qian, J Lin, NM Luscombe, H Yu, M Gerstein (2003). Bioinformatics 19: 1917-26.
website
preprint
medline

Distribution of NF-kappaB-binding sites across human chromosome 22.
R Martone, G Euskirchen, P Bertone, S Hartman, TE Royce, NM Luscombe, JL Rinn, FK Nelson, P Miller, M Gerstein, S Weissman, M Snyder (2003). Proc Natl Acad Sci U S A 100: 12247-52.
website
preprint
medline

Identification and correction of spurious spatial correlations in microarray data.
J Qian, Y Kluger, H Yu, M Gerstein (2003). Biotechniques 35: 42-4, 46, 48.
 
preprint
medline

ExpressYourself: A modular platform for processing and visualizing microarray data.
NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31: 3477-82.
website
preprint
medline

Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements.
Z Zhang, M Gerstein (2003). J Biol 2: 11.
 
preprint
medline

Identification and characterization of over 100 mitochondrial ribosomal protein pseudogenes in the human genome.
Z Zhang, M Gerstein (2003). Genomics 81: 468-80.
website
preprint
medline

Genomics. Defining genes in the genomics era.
M Snyder, M Gerstein (2003). Science 300: 258-60.
website
preprint
medline

Spectral biclustering of microarray data: coclustering genes and conditions.
Y Kluger, R Basri, JT Chang, M Gerstein (2003). Genome Res 13: 703-16.
website
preprint
medline

The transcriptional activity of human Chromosome 22.
JL Rinn, G Euskirchen, P Bertone, R Martone, NM Luscombe, S Hartman, PM Harrison, FK Nelson, P Miller, M Gerstein, S Weissman, M Snyder (2003). Genes Dev 17: 529-40.
website
preprint
medline

Identification of pseudogenes in the Drosophila melanogaster genome.
PM Harrison, D Milburn, Z Zhang, P Bertone, M Gerstein (2003). Nucleic Acids Res 31: 1033-7.
website
preprint
medline

Studying genomes through the aeons: protein families, pseudogenes and proteome evolution.
PM Harrison, M Gerstein (2002). J Mol Biol 318: 1155-74.
website
preprint
medline

A question of size: the eukaryotic proteome and the problems in defining it.
PM Harrison, A Kumar, N Lang, M Snyder, M Gerstein (2002). Nucleic Acids Res 30: 1083-90.
website
preprint
medline

Fast optimal genome tiling with applications to microarray design and homology search.
P Berman, P Bertone, B DasGupta, M Gerstein, MY Kao, M Snyder (2002). Proceedings of the 2nd International Workshop on Algorithms in Bioinformatics. Springer-Verlag LNCS 2452: 419-33.
website
preprint
 

Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae.
CE Horak, NM Luscombe, J Qian, P Bertone, S Piccirillo, M Gerstein, M Snyder (2002). Genes Dev 16: 3017-33.
website
preprint
medline

YMD: a microarray database for large-scale gene expression analysis.
KH Cheung, K White, J Hager, M Gerstein, V Reinke, K Nelson, P Masiar, R Srivastava, Y Li, J Li, H Zhao, J Li, DB Allison, M Snyder, P Miller, K Williams (2002). Proc AMIA Symp: 140-4.
website
preprint
medline

Genomic and proteomic analysis of the myeloid differentiation program: global analysis of gene expression during induced differentiation in the MPRO cell line.
Z Lian, Y Kluger, DS Greenbaum, D Tuck, M Gerstein, N Berliner, SM Weissman, PE Newburger (2002). Blood 100: 3209-20.
website
preprint
medline

Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome.
Z Zhang, P Harrison, M Gerstein (2002). Genome Res 12: 1466-82.
website
preprint
medline

The current excitement in bioinformatics-analysis of whole-genome expression data: how does it relate to protein structure and function?
M Gerstein, R Jansen (2000). Curr Opin Struct Biol 10: 574-84.
website
preprint
medline

