Informatics-research-article

This list is one part of a disjoint subdivision of all the papers into 4 types:
(1) review = all review papers published by the Gerstein Lab & collaborators
(2) informatics-research-article = informatics research articles led by the Gerstein Lab
(3) expt-collab = research articles driven by an experimental or informatics lab in collaboration with the Gerstein Lab
(4) otherpub = not one of the above

A comprehensive catalog of predicted functional upstream open reading frames in humans.

P McGillivray, R Ault, M Pawashe, R Kitchen, S Balasubramanian, M Gerstein (2018). Nucleic Acids Res 46: 3326-3338.

Interpretation of genomic variants using a unified biological network approach.

E Khurana, Y Fu, J Chen, M Gerstein (2013). PLoS Comput Biol 9: e1002886.

Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells.

A Abyzov, J Mariani, D Palejev, Y Zhang, MS Haney, L Tomasini, AF Ferrandino, LA Rosenberg Belmaker, A Szekely, M Wilson, A Kocabas, NE Calixto, EL Grigorenko, A Huttner, K Chawarska, S Weissman, AE Urban, M Gerstein, FM Vaccarino (2012). Nature 492: 438-42.

An integrated map of genetic variation from 1,092 human genomes.

1000 Genomes Project Consortium, GR Abecasis, A Auton, LD Brooks, MA DePristo, RM Durbin, RE Handsaker, HM Kang, GT Marth, GA McVean (2012). Nature 491: 56-65.

An integrated encyclopedia of DNA elements in the human genome.

ENCODE Project Consortium (2012). Nature 489: 57-74.

Architecture of the human regulatory network derived from ENCODE data.

MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, JJ Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.

Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors.

KY Yip, C Cheng, N Bhardwaj, JB Brown, J Leng, A Kundaje, J Rozowsky, E Birney, P Bickel, M Snyder, M Gerstein (2012). Genome Biol 13: R48.

Understanding transcriptional regulation by integrative analysis of transcription factor binding data.

C Cheng, R Alexander, R Min, J Leng, KY Yip, J Rozowsky, KK Yan, X Dong, S Djebali, Y Ruan, CA Davis, P Carninci, T Lassman, TR Gingeras, R Guigo, E Birney, Z Weng, M Snyder, M Gerstein (2012). Genome Res 22: 1658-67.

The GENCODE pseudogene resource.

B Pei, C Sisu, A Frankish, C Howald, L Habegger, XJ Mu, R Harte, S Balasubramanian, A Tanzer, M Diekhans, A Reymond, TJ Hubbard, J Harrow, MB Gerstein (2012). Genome Biol 13: R51.

Detecting and annotating genetic variations using the HugeSeq pipeline.

HY Lam, C Pan, MJ Clark, P Lacroute, R Chen, R Haraksingh, M O'Huallachain, MB Gerstein, JM Kidd, CD Bustamante, M Snyder (2012). Nat Biotechnol 30: 226-9.

VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.

L Habegger, S Balasubramanian, DZ Chen, E Khurana, A Sboner, A Harmanci, J Rozowsky, D Clarke, M Snyder, M Gerstein (2012). Bioinformatics 28: 2267-9.

A systematic survey of loss-of-function variants in human protein-coding genes.

DG MacArthur, S Balasubramanian, A Frankish, N Huang, J Morris, K Walter, L Jostins, L Habegger, JK Pickrell, SB Montgomery, CA Albers, ZD Zhang, DF Conrad, G Lunter, H Zheng, Q Ayub, MA DePristo, E Banks, M Hu, RE Handsaker, JA Rosenfeld, M Fromer, M Jin, XJ Mu, E Khurana, K Ye, M Kay, GI Saunders, MM Suner, T Hunt, IH Barnes, C Amid, DR Carvalho-Silva, AH Bignell, C Snow, B Yngvadottir, S Bumpstead, DN Cooper, Y Xue, IG Romero, 1000 Genomes Project Consortium, J Wang, Y Li, RA Gibbs, SA McCarroll, ET Dermitzakis, JK Pritchard, JC Barrett, J Harrow, ME Hurles, MB Gerstein, C Tyler-Smith (2012). Science 335: 823-8.

Novel insights through the integration of structural and functional genomics data with protein networks.

D Clarke, N Bhardwaj, MB Gerstein (2012). J Struct Biol 179: 320-6.

IQSeq: integrated isoform quantification analysis based on next-generation sequencing.

J Du, J Leng, L Habegger, A Sboner, D McDermott, M Gerstein (2012). PLoS One 7: e29175.

Systematic control of protein interactions for systems biology.

N Bhardwaj, D Clarke, M Gerstein (2011). Proc Natl Acad Sci U S A 108: 20279-80.

Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data.

C Cheng, KK Yan, W Hwang, J Qian, N Bhardwaj, J Rozowsky, ZJ Lu, W Niu, P Alves, M Kato, M Snyder, M Gerstein (2011). PLoS Comput Biol 7: e1002190.

Genome-wide analysis of chromatin features identifies histone modification sensitive and insensitive yeast transcription factors.

C Cheng, C Shou, KY Yip, MB Gerstein (2011). Genome Biol 12: R111.

TIP: a probabilistic method for identifying transcription factor target genes from ChIP-seq binding profiles.

C Cheng, R Min, M Gerstein (2011). Bioinformatics 27: 3221-7.

Predicting protein ligand binding motions with the conformation explorer.

SC Flores, MB Gerstein (2011). BMC Bioinformatics 12: 417.

Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps.

KY Yip, L Utz, S Sitwell, X Hu, SS Sidhu, BE Turk, M Gerstein, PM Kim (2011). BMC Biol 9: 53.

Modeling the relative relationship of transcription factor binding and histone modifications to gene expression levels in mouse embryonic stem cells.

C Cheng, M Gerstein (2011). Nucleic Acids Res 40: 553-68.

Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.

N Bhardwaj, A Abyzov, D Clarke, C Shou, MB Gerstein (2011). Protein Sci 20: 1745-54.

AlleleSeq: analysis of allele-specific expression and binding in a network framework.

J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7: 522.

Identification of genomic indels and structural variations using split reads.

ZD Zhang, J Du, H Lam, A Abyzov, AE Urban, M Snyder, M Gerstein (2011). BMC Genomics 12: 375.

The spread of scientific information: insights from the web usage statistics in PLoS article-level metrics.

KK Yan, M Gerstein (2011). PLoS One 6: e19917.

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.

XJ Mu, ZJ Lu, Y Kong, HY Lam, MB Gerstein (2011). Nucleic Acids Res 39: 7058-76.

The CRIT framework for identifying cross patterns in systems biology and application to chemogenomics.

TA Gianoulis, A Agarwal, M Snyder, MB Gerstein (2011). Genome Biol 12: R32.

ACT: aggregation and correlation toolbox for analyses of genome tracks.

J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27: 1152-4.

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.

A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21: 974-84.

A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets.

C Cheng, KK Yan, KY Yip, J Rozowsky, R Alexander, C Shou, M Gerstein (2011). Genome Biol 12: R15.

Measuring the evolutionary rewiring of biological networks.

C Shou, N Bhardwaj, HY Lam, KK Yan, PM Kim, M Snyder, MB Gerstein (2011). PLoS Comput Biol 7: e1001050.

AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision.

A Abyzov, M Gerstein (2011). Bioinformatics 27: 595-603.

Gene inactivation and its implications for annotation in the era of personal genomics.

S Balasubramanian, L Habegger, A Frankish, DG MacArthur, R Harte, C Tyler-Smith, J Harrow, M Gerstein (2011). Genes Dev 25: 1-10.

Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data.

ZJ Lu, KY Yip, G Wang, C Shou, LW Hillier, E Khurana, A Agarwal, R Auerbach, J Rozowsky, C Cheng, M Kato, DM Miller, F Slack, M Snyder, RH Waterston, V Reinke, MB Gerstein (2011). Genome Res 21: 276-85.

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.

L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27: 281-3.

Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

MB Gerstein, ZJ Lu, EL Van Nostrand, C Cheng, BI Arshinoff, T Liu, KY Yip, R Robilotto, A Rechtsteiner, K Ikegami, P Alves, A Chateigner, M Perry, M Morris, RK Auerbach, X Feng, J Leng, A Vielle, W Niu, K Rhrissorrakrai, A Agarwal, RP Alexander, G Barber, CM Brdlik, J Brennan, JJ Brouillet, A Carr, MS Cheung, H Clawson, S Contrino, LO Dannenberg, AF Dernburg, A Desai, L Dick, AC Dose, J Du, T Egelhofer, S Ercan, G Euskirchen, B Ewing, EA Feingold, R Gassmann, PJ Good, P Green, F Gullier, M Gutwein, MS Guyer, L Habegger, T Han, JG Henikoff, SR Henz, A Hinrichs, H Holster, T Hyman, AL Iniguez, J Janette, M Jensen, M Kato, WJ Kent, E Kephart, V Khivansara, E Khurana, JK Kim, P Kolasinska-Zwierz, EC Lai, I Latorre, A Leahey, S Lewis, P Lloyd, L Lochovsky, RF Lowdon, Y Lubling, R Lyne, M MacCoss, SD Mackowiak, M Mangone, S McKay, D Mecenas, G Merrihew, DM Miller, A Muroyama, JI Murray, SL Ooi, H Pham, T Phippen, EA Preston, N Rajewsky, G Ratsch, H Rosenbaum, J Rozowsky, K Rutherford, P Ruzanov, M Sarov, R Sasidharan, A Sboner, P Scheid, E Segal, H Shin, C Shou, FJ Slack, C Slightam, R Smith, WC Spencer, EO Stinson, S Taing, T Takasaki, D Vafeados, K Voronina, G Wang, NL Washington, CM Whittle, B Wu, KK Yan, G Zeller, Z Zha, M Zhong, X Zhou, modENCODE Consortium, J Ahringer, S Strome, KC Gunsalus, G Micklem, XS Liu, V Reinke, SK Kim, LW Hillier, S Henikoff, F Piano, M Snyder, L Stein, JD Lieb, RH Waterston (2010). Science 330: 1775-87.

Rewiring of transcriptional regulatory networks: hierarchy, rather than connectivity, better reflects the importance of regulators.

N Bhardwaj, PM Kim, MB Gerstein (2010). Sci Signal 3: ra79.

Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model.

ZD Zhang, MB Gerstein (2010). BMC Bioinformatics 11: 539.

A map of human genome variation from population-scale sequencing.

1000 Genomes Project Consortium, GR Abecasis, D Altshuler, A Auton, LD Brooks, RM Durbin, RA Gibbs, ME Hurles, GA McVean (2010). Nature 467: 1061-73.

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.

A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2010). Genome Biol 11: R104.

Structured digital tables on the Semantic Web: toward a structured digital literature.

KH Cheung, M Samwald, RK Auerbach, MB Gerstein (2010). Mol Syst Biol 6: 403.

Segmental duplications in the human genome reveal details of pseudogene formation.

E Khurana, HY Lam, C Cheng, N Carriero, P Cayting, MB Gerstein (2010). Nucleic Acids Res 38: 6997-7007.

Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays.

A Agarwal, D Koppstein, J Rozowsky, A Sboner, L Habegger, LW Hillier, R Sasidharan, V Reinke, RH Waterston, M Gerstein (2010). BMC Genomics 11: 383.

Using semantic web rules to reason on an ontology of pseudogenes.

ME Holford, E Khurana, KH Cheung, M Gerstein (2010). Bioinformatics 26: i71-8.

Analysis of combinatorial regulation: scaling of partnerships between regulators with the number of governed targets.

N Bhardwaj, MB Carson, A Abyzov, KK Yan, H Lu, MB Gerstein (2010). PLoS Comput Biol 6: e1000755.

3V: cavity, channel and cleft volume calculator and extractor.

NR Voss, M Gerstein (2010). Nucleic Acids Res 38: W555-62.

MOTIPS: automated motif analysis for predicting targets of modular protein domains.

HY Lam, PM Kim, J Mok, R Tonikian, SS Sidhu, BE Turk, M Snyder, MB Gerstein (2010). BMC Bioinformatics 11: 243.

Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks.

KK Yan, G Fang, N Bhardwaj, RP Alexander, M Gerstein (2010). Proc Natl Acad Sci U S A 107: 9186-91.

Analysis of membrane proteins in metagenomics: networks of correlated environmental features and protein families.

PV Patel, TA Gianoulis, RD Bjornson, KY Yip, DM Engelman, MB Gerstein (2010). Genome Res 20: 960-71.

Getting started in gene orthology and functional analysis.

G Fang, N Bhardwaj, R Robilotto, MB Gerstein (2010). PLoS Comput Biol 6: e1000703.

Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels.

N Bhardwaj, KK Yan, MB Gerstein (2010). Proc Natl Acad Sci U S A 107: 6841-6.

Molecular sampling of prostate cancer: a dilemma for predicting disease progression.

A Sboner, F Demichelis, S Calza, Y Pawitan, SR Setlur, Y Hoshida, S Perner, HO Adami, K Fall, LA Mucci, PW Kantoff, M Stampfer, SO Andersson, E Varenhorst, JE Johansson, MB Gerstein, TR Golub, MA Rubin, O Andren (2010). BMC Med Genomics 3: 8.

Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates.

ZD Zhang, A Frankish, T Hunt, J Harrow, M Gerstein (2010). Genome Biol 11: R26.

Genome-wide sequence-based prediction of peripheral proteins using a novel semi-supervised learning technique.

N Bhardwaj, M Gerstein, H Lu (2010). BMC Bioinformatics 11 Suppl 1: S6.

Improved reconstruction of in silico gene regulatory networks by integrating knockout and perturbation data.

KY Yip, RP Alexander, KK Yan, M Gerstein (2010). PLoS One 5: e8121.

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library.

HY Lam, XJ Mu, AM Stutz, A Tanzer, PD Cayting, M Snyder, PM Kim, JO Korbel, MB Gerstein (2010). Nat Biotechnol 28: 47-55.

RigidFinder: a fast and sensitive method to detect rigid blocks in large macromolecular complexes.

A Abyzov, R Bjornson, M Felipe, M Gerstein (2010). Proteins 78: 309-24.

Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlights a recent burst of retrotrans-positional activity.

YJ Liu, D Zheng, S Balasubramanian, N Carriero, E Khurana, R Robilotto, MB Gerstein (2009). BMC Genomics 10: 480.

Robust-linear-model normalization to reduce technical variability in functional protein microarrays.

A Sboner, A Karpikov, G Chen, M Smith, D Mattoon, L Freeman-Cook, B Schweitzer, MB Gerstein (2009). J Proteome Res 8: 5451-64.

The relationship between the evolution of microRNA targets and the length of their UTRs.

C Cheng, N Bhardwaj, M Gerstein (2009). BMC Genomics 10: 431.

mRNA expression profiles show differential regulatory effects of microRNAs between estrogen receptor-positive and estrogen receptor-negative breast cancer.

C Cheng, X Fu, P Alves, M Gerstein (2009). Genome Biol 10: R90.

Mapping accessible chromatin regions using Sono-Seq.

RK Auerbach, G Euskirchen, J Rozowsky, N Lamarre-Vincent, Z Moqtaderi, P Lefrancois, K Struhl, M Gerstein, M Snyder (2009). Proc Natl Acad Sci U S A 106: 14926-31.

Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels.

KY Yip, PM Kim, D McDermott, M Gerstein (2009). BMC Bioinformatics 10: 241.

Understanding modularity in molecular networks requires dynamics.

RP Alexander, PM Kim, T Emonet, MB Gerstein (2009). Sci Signal 2: pe44.

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

J Du, RD Bjornson, ZD Zhang, Y Kong, M Snyder, MB Gerstein (2009). PLoS Comput Biol 5: e1000432.

Relating protein conformational changes to packing efficiency and disorder.

N Bhardwaj, M Gerstein (2009). Protein Sci 18: 1230-40.

Systematic identification of transcription factors associated with patient survival in cancers.

C Cheng, LM Li, P Alves, M Gerstein (2009). BMC Genomics 10: 225.

PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.

JO Korbel, A Abyzov, XJ Mu, N Carriero, P Cayting, Z Zhang, M Snyder, MB Gerstein (2009). Genome Biol 10: R23.

StoneHinge: hinge prediction by network analysis of individual protein structures.

KS Keating, SC Flores, MB Gerstein, LA Kuhn (2009). Protein Sci 18: 359-71.

Quantifying environmental adaptation of metabolic pathways in metagenomics.

TA Gianoulis, J Raes, PV Patel, R Bjornson, JO Korbel, I Letunic, T Yamada, A Paccanaro, LJ Jensen, M Snyder, P Bork, MB Gerstein (2009). Proc Natl Acad Sci U S A 106: 1374-9.

Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes.

S Balasubramanian, D Zheng, YJ Liu, G Fang, A Frankish, N Carriero, R Robilotto, P Cayting, M Gerstein (2009). Genome Biol 10: R2.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.

J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27: 66-75.

MSB: a mean-shift-based approach for the analysis of structural variation in the genome.

LY Wang, A Abyzov, JO Korbel, M Snyder, M Gerstein (2009). Genome Res 19: 106-17.

Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions.

KY Yip, M Gerstein (2009). Bioinformatics 25: 243-50.

Pseudofam: the pseudogene families database.

HY Lam, E Khurana, G Fang, P Cayting, N Carriero, KH Cheung, MB Gerstein (2009). Nucleic Acids Res 37: D738-43.

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history.

PM Kim, HY Lam, AE Urban, JO Korbel, J Affourtit, F Grubert, X Chen, S Weissman, M Snyder, MB Gerstein (2008). Genome Res 18: 1865-74.

Modeling ChIP sequencing in silico with applications.

ZD Zhang, J Rozowsky, M Snyder, J Chang, M Gerstein (2008). PLoS Comput Biol 4: e1000158.

HingeMaster: normal mode hinge prediction approach and integration of complementary predictors.

SC Flores, KS Keating, J Painter, F Morcos, K Nguyen, EA Merritt, LA Kuhn, MB Gerstein (2008). Proteins 73: 299-319.

Rapid evolution by positive Darwinian selection in T-cell antigen CD4 in primates.

ZD Zhang, G Weinstock, M Gerstein (2008). J Mol Evol 66: 446-56.

The role of disorder in interaction networks: a structural analysis.

PM Kim, A Sboner, Y Xia, M Gerstein (2008). Mol Syst Biol 4: 179.

Analysis of nuclear receptor pseudogenes in vertebrates: how the silent tell their stories.

ZD Zhang, P Cayting, G Weinstock, M Gerstein (2008). Mol Biol Evol 25: 131-43.

An integrated system for studying residue coevolution in proteins.

KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24: 290-2.

Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context.

PM Kim, JO Korbel, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 20274-9.

PARE: a tool for comparing protein abundance and mRNA expression data.

EZ Yu, AE Burba, M Gerstein (2007). BMC Bioinformatics 8: 309.

Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification.

TE Royce, JS Rozowsky, MB Gerstein (2007). Nucleic Acids Res 35: e99.

FlexOracle: predicting flexible hinges by identification of stable domains.

SC Flores, MB Gerstein (2007). BMC Bioinformatics 8: 215.

Comparing classical pathways and modern networks: towards the development of an edge ontology.

LJ Lu, A Sboner, YJ Huang, HX Lu, TA Gianoulis, KY Yip, PM Kim, GT Montelione, MB Gerstein (2007). Trends Biochem Sci 32: 320-31.

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

ENCODE Project Consortium, E Birney, JA Stamatoyannopoulos, A Dutta, R Guigo, TR Gingeras, EH Margulies, Z Weng, M Snyder, ET Dermitzakis, RE Thurman, MS Kuehn, CM Taylor, S Neph, CM Koch, S Asthana, A Malhotra, I Adzhubei, JA Greenbaum, RM Andrews, P Flicek, PJ Boyle, H Cao, NP Carter, GK Clelland, S Davis, N Day, P Dhami, SC Dillon, MO Dorschner, H Fiegler, PG Giresi, J Goldy, M Hawrylycz, A Haydock, R Humbert, KD James, BE Johnson, EM Johnson, TT Frum, ER Rosenzweig, N Karnani, K Lee, GC Lefebvre, PA Navas, F Neri, SC Parker, PJ Sabo, R Sandstrom, A Shafer, D Vetrie, M Weaver, S Wilcox, M Yu, FS Collins, J Dekker, JD Lieb, TD Tullius, GE Crawford, S Sunyaev, WS Noble, I Dunham, F Denoeud, A Reymond, P Kapranov, J Rozowsky, D Zheng, R Castelo, A Frankish, J Harrow, S Ghosh, A Sandelin, IL Hofacker, R Baertsch, D Keefe, S Dike, J Cheng, HA Hirsch, EA Sekinger, J Lagarde, JF Abril, A Shahab, C Flamm, C Fried, J Hackermuller, J Hertel, M Lindemeyer, K Missal, A Tanzer, S Washietl, J Korbel, O Emanuelsson, JS Pedersen, N Holroyd, R Taylor, D Swarbreck, N Matthews, MC Dickson, DJ Thomas, MT Weirauch, J Gilbert, J Drenkow, I Bell, X Zhao, KG Srinivasan, WK Sung, HS Ooi, KP Chiu, S Foissac, T Alioto, M Brent, L Pachter, ML Tress, A Valencia, SW Choo, CY Choo, C Ucla, C Manzano, C Wyss, E Cheung, TG Clark, JB Brown, M Ganesh, S Patel, H Tammana, J Chrast, CN Henrichsen, C Kai, J Kawai, U Nagalakshmi, J Wu, Z Lian, J Lian, P Newburger, X Zhang, P Bickel, JS Mattick, P Carninci, Y Hayashizaki, S Weissman, T Hubbard, RM Myers, J Rogers, PF Stadler, TM Lowe, CL Wei, Y Ruan, K Struhl, M Gerstein, SE Antonarakis, Y Fu, ED Green, U Karaoz, A Siepel, J Taylor, LA Liefer, KA Wetterstrand, PJ Good, EA Feingold, MS Guyer, GM Cooper, G Asimenos, CN Dewey, M Hou, S Nikolaev, JI Montoya-Burgos, A Loytynoja, S Whelan, F Pardi, T Massingham, H Huang, NR Zhang, I Holmes, JC Mullikin, A Ureta-Vidal, B Paten, M Seringhaus, D Church, K Rosenbloom, WJ Kent, EA Stone, NISC Comparative Sequencing Program, Baylor College of Medicine Human Genome Sequencing Center, Washington University Genome Sequencing Center, Broad Institute, Children's Hospital Oakland Research Institute, S Batzoglou, N Goldman, RC Hardison, D Haussler, W Miller, A Sidow, ND Trinklein, ZD Zhang, L Barrera, R Stuart, DC King, A Ameur, S Enroth, MC Bieda, J Kim, AA Bhinge, N Jiang, J Liu, F Yao, VB Vega, CW Lee, P Ng, A Shahab, A Yang, Z Moqtaderi, Z Zhu, X Xu, S Squazzo, MJ Oberley, D Inman, MA Singer, TA Richmond, KJ Munn, A Rada-Iglesias, O Wallerman, J Komorowski, JC Fowler, P Couttet, AW Bruce, OM Dovey, PD Ellis, CF Langford, DA Nix, G Euskirchen, S Hartman, AE Urban, P Kraus, S Van Calcar, N Heintzman, TH Kim, K Wang, C Qu, G Hon, R Luna, CK Glass, MG Rosenfeld, SF Aldred, SJ Cooper, A Halees, JM Lin, HP Shulha, X Zhang, M Xu, JN Haidar, Y Yu, Y Ruan, VR Iyer, RD Green, C Wadelius, PJ Farnham, B Ren, RA Harte, AS Hinrichs, H Trumbower, H Clawson, J Hillman-Jackson, AS Zweig, K Smith, A Thakkapallayil, G Barber, RM Kuhn, D Karolchik, L Armengol, CP Bird, PI de Bakker, AD Kern, N Lopez-Bigas, JD Martin, BE Stranger, A Woodroffe, E Davydov, A Dimas, E Eyras, IB Hallgrimsdottir, J Huppert, MC Zody, GR Abecasis, X Estivill, GG Bouffard, X Guan, NF Hansen, JR Idol, VV Maduro, B Maskeri, JC McDowell, M Park, PJ Thomas, AC Young, RW Blakesley, DM Muzny, E Sodergren, DA Wheeler, KC Worley, H Jiang, GM Weinstock, RA Gibbs, T Graves, R Fulton, ER Mardis, RK Wilson, M Clamp, J Cuff, S Gnerre, DB Jaffe, JL Chang, K Lindblad-Toh, ES Lander, M Koriabine, M Nefedov, K Osoegawa, Y Yoshinaga, B Zhu, PJ de Jong (2007). Nature 447: 799-816.

Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.

D Zheng, A Frankish, R Baertsch, P Kapranov, A Reymond, SW Choo, Y Lu, F Denoeud, SE Antonarakis, M Snyder, Y Ruan, CL Wei, TR Gingeras, R Guigo, J Harrow, MB Gerstein (2007). Genome Res 17: 839-51.

Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.

ZD Zhang, A Paccanaro, Y Fu, S Weissman, Z Weng, J Chang, M Snyder, MB Gerstein (2007). Genome Res 17: 787-97.

The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci.

JS Rozowsky, D Newburger, F Sayward, J Wu, G Jordan, JO Korbel, U Nagalakshmi, J Yang, D Zheng, R Guigo, TR Gingeras, S Weissman, P Miller, M Snyder, MB Gerstein (2007). Genome Res 17: 732-45.

An efficient pseudomedian filter for tiling microarrays.

TE Royce, NJ Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 186.

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome.

JO Korbel, AE Urban, F Grubert, J Du, TE Royce, P Starr, G Zhong, BS Emanuel, SM Weissman, M Snyder, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 10110-5.

Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications.

H Yu, R Jansen, G Stolovitzky, M Gerstein (2007). Bioinformatics 23: 2163-73.

Hinge Atlas: relating protein sequence to sites of structural flexibility.

SC Flores, LJ Lu, J Yang, N Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 167.

Tilescope: online analysis pipeline for high-density tiling microarray data.

ZD Zhang, J Rozowsky, HY Lam, J Du, M Snyder, M Gerstein (2007). Genome Biol 8: R81.

The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics.

H Yu, PM Kim, E Sprecher, V Trifonov, M Gerstein (2007). PLoS Comput Biol 3: e59.

Assessing the need for sequence-based normalization in tiling microarray experiments.

TE Royce, JS Rozowsky, MB Gerstein (2007). Bioinformatics 23: 988-97.

The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?

D Zheng, MB Gerstein (2007). Trends Genet 23: 219-24.

Positional artifacts in microarrays: experimental verification and construction of COP, an automated detection tool.

H Yu, K Nguyen, T Royce, J Qian, K Nelson, M Snyder, M Gerstein (2007). Nucleic Acids Res 35: e8.

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.

JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Nucleic Acids Res 35: D55-60.

Analytical Evolutionary Model for Protein Fold Occurrence in Genomes, Accounting for the Effects of Gene Duplication, Deletion, Acquisition and Selective Pressure

M Kamal, N Luscombe, J Qian, M Gerstein (2006). in Power Laws, Scale-Free Networks and Genome Biology (edited by EV Koonin, YI Wolf, GP Karev; Springer, New York), pages 165-193

Novel transcribed regions in the human genome.

J Rozowsky, J Wu, Z Lian, U Nagalakshmi, JO Korbel, P Kapranov, D Zheng, S Dyke, P Newburger, P Miller, TR Gingeras, S Weissman, M Gerstein, M Snyder (2006). Cold Spring Harb Symp Quant Biol 71: 111-6.

Relating three-dimensional structures to protein networks provides evolutionary insights.

PM Kim, LJ Lu, Y Xia, MB Gerstein (2006). Science 314: 1938-41.

ProCAT: a data analysis approach for protein microarrays.

X Zhu, M Gerstein, M Snyder (2006). Genome Biol 7: R110.

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.

LY Wang, M Snyder, M Gerstein (2006). Genome Biol 7: R102.

Helix Interaction Tool (HIT): a web-based tool for analysis of helix-helix interactions in proteins.

AE Burba, U Lehnert, EZ Yu, M Gerstein (2006). Bioinformatics 22: 2735-8.

A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.

J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006). Bioinformatics 22: 3016-24.

Integration of curated databases to identify genotype-phenotype associations.

CS Goh, TA Gianoulis, Y Liu, J Li, A Paccanaro, YA Lussier, M Gerstein (2006). BMC Genomics 7: 257.

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.

KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22: 2968-70.

Genomic analysis of the hierarchical structure of regulatory networks.

H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.

Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing.

R Pinard, A de Winter, GJ Sarkis, MB Gerstein, KR Tartaro, RN Plant, M Egholm, JM Rothberg, JH Leamon (2006). BMC Genomics 7: 216.

A computational approach for identifying pseudogenes in the ENCODE regions.

D Zheng, MB Gerstein (2006). Genome Biol 7 Suppl 1: S13.1-10.

Predicting essential genes in fungal genomes.

M Seringhaus, A Paccanaro, A Borneman, M Snyder, M Gerstein (2006). Genome Res 16: 1126-35.

Design principles of molecular networks revealed by global comparisons and composite motifs.

H Yu, Y Xia, V Trifonov, M Gerstein (2006). Genome Biol 7: R55.

The geometry of the ribosomal polypeptide exit tunnel.

NR Voss, M Gerstein, TA Steitz, PB Moore (2006). J Mol Biol 360: 893-906.

Genomic analysis of insertion behavior and target specificity of mini-Tn7 and Tn3 transposons in Saccharomyces cerevisiae.

M Seringhaus, A Kumar, J Hartigan, M Snyder, M Gerstein (2006). Nucleic Acids Res 34: e57.

PseudoPipe: an automated pseudogene identification pipeline.

Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22: 1437-9.

Predicting interactions in protein networks by completing defective cliques.

H Yu, A Paccanaro, V Trifonov, M Gerstein (2006). Bioinformatics 22: 823-9.

Target hub proteins serve as master regulators of development in yeast.

AR Borneman, JA Leigh-Bell, H Yu, P Bertone, M Gerstein, M Snyder (2006). Genes Dev 20: 435-48.

Integrated prediction of the helical membrane protein interactome in yeast.

Y Xia, LJ Lu, M Gerstein (2006). J Mol Biol 357: 339-49.

The Database of Macromolecular Motions: new features added at the decade mark.

S Flores, N Echols, D Milburn, B Hespenheide, K Keating, J Lu, S Wells, EZ Yu, M Thorpe, M Gerstein (2006). Nucleic Acids Res 34: D296-301.

Design optimization methods for genomic DNA tiling arrays.

P Bertone, V Trifonov, JS Rozowsky, F Schubert, O Emanuelsson, J Karro, MY Kao, M Snyder, M Gerstein (2006). Genome Res 16: 271-81.

Inferring Protein-Protein Interactions Using Interaction Network Topologies

A Paccanaro, V Trifonov, H Yu, M Gerstein (2005). International Joint Conference on Neural Networks (IJCNN, Jul. 31-Aug. 4, Montreal, Canada), pages 161 - 166, vol. 1

Assessing the limits of genomic data integration for predicting protein networks.

LJ Lu, Y Xia, A Paccanaro, H Yu, M Gerstein (2005). Genome Res 15: 945-53.

YeastHub: a semantic web use case for integrating data in the life sciences domain.

KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005). Bioinformatics 21 Suppl 1: i85-96.

Integrated pseudogene annotation for human chromosome 22: evidence for transcription.

D Zheng, Z Zhang, PM Harrison, J Karro, N Carriero, M Gerstein (2005). J Mol Biol 349: 27-45.

Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability.

PM Harrison, D Zheng, Z Zhang, N Carriero, M Gerstein (2005). Nucleic Acids Res 33: 2374-83.

Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms.

S Balasubramanian, Y Xia, E Freinkman, M Gerstein (2005). Nucleic Acids Res 33: 1710-21.

Normal modes for predicting protein motions: a comprehensive database assessment and associated Web tool.

V Alexandrov, U Lehnert, N Echols, D Milburn, D Engelman, M Gerstein (2005). Protein Sci 14: 633-43.

Calculation of standard atomic volumes for RNA and comparison with proteins: RNA is packed more tightly.

NR Voss, M Gerstein (2005). J Mol Biol 346: 477-92.

An XML-Based Approach to Integrating Heterogeneous Yeast Genome Data

KH Cheung, D Pan, A Smith, M Seringhaus, SM Douglas, M Gerstein (2004). International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS); pp 236-242

Genomic analysis of regulatory network dynamics reveals large topological changes.

NM Luscombe, MM Babu, H Yu, M Snyder, SA Teichmann, M Gerstein (2004). Nature 431: 308-12.

Comprehensive analysis of pseudogenes in prokaryotes: widespread gene decay and failure of putative horizontally transferred genes.

Y Liu, PM Harrison, V Kunin, M Gerstein (2004). Genome Biol 5: R64.

Computer security in academia - a potential roadblock to distributed annotation of the human genome

D Greenbaum, SM Douglas, A Smith, J Lim, M Fischer, M Schultz, M Gerstein (2004). Nat Biotechnol 22: 771-2.

Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs.

H Yu, NM Luscombe, HX Lu, X Zhu, Y Xia, JD Han, N Bertin, S Chung, M Vidal, M Gerstein (2004). Genome Res 14: 1107-18.

A method using active-site sequence conservation to find functional shifts in protein families: application to the enzymes of central metabolism, leading to the identification of an anomalous isocitrate dehydrogenase in pathogens.

R Das, M Gerstein (2004). Proteins 55: 455-63.

Transmembrane protein domains rarely use covalent domain recombination as an evolutionary mechanism.

Y Liu, M Gerstein, DM Engelman (2004). Proc Natl Acad Sci U S A 101: 3495-7.

Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis.

CS Goh, N Lan, SM Douglas, B Wu, N Echols, A Smith, D Milburn, GT Montelione, H Zhao, M Gerstein (2004). J Mol Biol 336: 115-30.

TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics.

H Yu, X Zhu, D Greenbaum, J Karro, M Gerstein (2004). Nucleic Acids Res 32: 328-37.

Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures.

V Alexandrov, M Gerstein (2004). BMC Bioinformatics 5: 2.

Tools and databases to analyze protein flexibility; approaches to mapping implied features onto sequences.

WG Krebs, J Tsai, V Alexandrov, J Junker, R Jansen, M Gerstein (2003). Methods Enzymol 374: 544-84.

Relationship between gene co-expression and probe localization on microarray slides.

Y Kluger, H Yu, J Qian, M Gerstein (2003). BMC Genomics 4: 49.

Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome.

Z Zhang, PM Harrison, Y Liu, M Gerstein (2003). Genome Res 13: 2541-58.

A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs.

PM Harrison, N Carriero, Y Liu, M Gerstein (2003). J Mol Biol 333: 885-92.

A Bayesian networks approach for predicting protein-protein interactions from genomic data.

R Jansen, H Yu, D Greenbaum, Y Kluger, NJ Krogan, S Chung, A Emili, M Snyder, JF Greenblatt, M Gerstein (2003). Science 302: 449-53.

Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data.

J Qian, J Lin, NM Luscombe, H Yu, M Gerstein (2003). Bioinformatics 19: 1917-26.

Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes.

Z Zhang, M Gerstein (2003). Nucleic Acids Res 31: 5338-48.

Comparing protein abundance and mRNA expression levels on a genomic scale.

D Greenbaum, C Colangelo, K Williams, M Gerstein (2003). Genome Biol 4: 117.

The human genome has 49 cytochrome c pseudogenes, including a relic of a primordial gene that still functions in mouse.

Z Zhang, M Gerstein (2003). Gene 312: 61-72.

Genomic analysis of gene expression relationships in transcriptional regulatory networks.

H Yu, NM Luscombe, J Qian, M Gerstein (2003). Trends Genet 19: 422-7.

Identification and correction of spurious spatial correlations in microarray data.

J Qian, Y Kluger, H Yu, M Gerstein (2003). Biotechniques 35: 42-4, 46, 48.

ExpressYourself: A modular platform for processing and visualizing microarray data.

NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31: 3477-82.

A method to assess compositional bias in biological sequences and its application to prion-like glutamine/asparagine-rich domains in eukaryotic proteomes.

PM Harrison, M Gerstein (2003). Genome Biol 4: R40.

SPINE 2: a system for collaborative structural proteomics within a federated database framework.

CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31: 2833-8.

Identification and characterization of over 100 mitochondrial ribosomal protein pseudogenes in the human genome.

Z Zhang, M Gerstein (2003). Genomics 81: 468-80.

Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models.

R Jansen, HJ Bussemaker, M Gerstein (2003). Nucleic Acids Res 31: 2242-51.

Spectral biclustering of microarray data: coclustering genes and conditions.

Y Kluger, R Basri, JT Chang, M Gerstein (2003). Genome Res 13: 703-16.

Identification of pseudogenes in the Drosophila melanogaster genome.

PM Harrison, D Milburn, Z Zhang, P Bertone, M Gerstein (2003). Nucleic Acids Res 31: 1033-7.

MolMovDB: analysis and visualization of conformational change and structural flexibility.

N Echols, D Milburn, M Gerstein (2003). Nucleic Acids Res 31: 478-82.

Calculations of protein volumes: sensitivity analysis and parameter database.

J Tsai, M Gerstein (2002). Bioinformatics 18: 985-95.

Analysis of mRNA expression and protein abundance data: an approach for the comparison of the enrichment of features in the cellular population of proteins and transcripts.

D Greenbaum, R Jansen, M Gerstein (2002). Bioinformatics 18: 585-96.

Structural genomics analysis: characteristics of atypical, common, and horizontally transferred folds.

H Hegyi, J Lin, D Greenbaum, M Gerstein (2002). Proteins 47: 126-41.

A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution.

P Harrison, A Kumar, N Lan, N Echols, M Snyder, M Gerstein (2002). J Mol Biol 316: 409-19.

A question of size: the eukaryotic proteome and the problems in defining it.

PM Harrison, A Kumar, N Lang, M Snyder, M Gerstein (2002). Nucleic Acids Res 30: 1083-90.

Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22.

PM Harrison, H Hegyi, S Balasubramanian, NM Luscombe, P Bertone, N Echols, T Johnson, M Gerstein (2002). Genome Res 12: 272-80.

Relating whole-genome expression data with protein-protein interactions.

R Jansen, D Greenbaum, M Gerstein (2002). Genome Res 12: 37-46.

Towards a systematic definition of protein function that scales to the genome level: Defining function in terms of interactions.

N Lan, R Jansen, M Gerstein (2002). Proceedings of the IEEE 90: 1848-1858.

Fast optimal genome tiling with applications to microarray design and homology search.

P Berman, P Bertone, B DasGupta, M Gerstein, M-Y Kao, M Snyder (2002). Proceedings of the 2nd International Workshop on Algorithms in Bioinformatics. Springer-Verlag LNCS 2452: 419-433

Integration of genomic datasets to predict protein complexes in yeast.

R Jansen, N Lan, J Qian, M Gerstein (2002). J Struct Funct Genomics 2: 71-81.

Digging deep for ancient relics: a survey of protein motifs in the intergenic sequences of four eukaryotic genomes.

ZL Zhang, PM Harrison, M Gerstein (2002). J Mol Biol 323: 811-22.

GeneCensus: genome comparisons in terms of metabolic pathway activity and protein family sharing.

J Lin, J Qian, D Greenbaum, P Bertone, R Das, N Echols, A Senes, B Stenger, M Gerstein (2002). Nucleic Acids Res 30: 4574-82.

Genomic analysis of membrane protein families: abundance and conserved motifs.

Y Liu, DM Engelman, M Gerstein (2002). Genome Biol 3: research0054.

Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome.

Z Zhang, P Harrison, M Gerstein (2002). Genome Res 12: 1466-82.

Normal mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic.

WG Krebs, V Alexandrov, CA Wilson, N Echols, H Yu, M Gerstein (2002). Proteins 48: 682-95.

The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties.

NM Luscombe, J Qian, Z Zhang, T Johnson, M Gerstein (2002). Genome Biol 3: RESEARCH0040.

SNPs on human chromosomes 21 and 22 -- analysis in terms of protein features and pseudogenes.

S Balasubramanian, P Harrison, H Hegyi, P Bertone, N Luscombe, N Echols, P McGarvey, Z Zhang, M Gerstein (2002). Pharmacogenomics 3: 393-402.

Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes.

N Echols, P Harrison, S Balasubramanian, NM Luscombe, P Bertone, Z Zhang, M Gerstein (2002). Nucleic Acids Res 30: 2515-23.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.

J Qian, M Dolled-Filhart, J Lin, H Yu, M Gerstein (2001). J Mol Biol 314: 1053-66.

Annotation transfer for genomics: measuring functional divergence in multi-domain proteins.

H Hegyi, M Gerstein (2001). Genome Res 11: 1632-40.

Calculating populations of subcellular compartments using density matrix formalism

V Alexandrov, M Gerstein (2001). International Journal of Quantum Chemistry 85: 693-696.

A standard reference frame for the description of nucleic acid base-pair geometry.

WK Olson, M Bansal, SK Burley, RE Dickerson, M Gerstein, SC Harvey, U Heinemann, XJ Lu, S Neidle, Z Shakked, H Sklenar, M Suzuki, CS Tung, E Westhof, C Wolberger, HM Berman (2001). J Mol Biol 313: 229-37.

Determining the minimum number of types necessary to represent the sizes of protein atoms.

J Tsai, N Voss, M Gerstein (2001). Bioinformatics 17: 949-56.

SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.

P Bertone, Y Kluger, N Lan, D Zheng, D Christendat, A Yee, AM Edwards, CH Arrowsmith, GT Montelione, M Gerstein (2001). Nucleic Acids Res 29: 2884-98.

Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome.

PM Harrison, N Echols, MB Gerstein (2001). Nucleic Acids Res 29: 818-30.

PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.

J Qian, B Stenger, CA Wilson, J Lin, R Jansen, SA Teichmann, J Park, WG Krebs, H Yu, V Alexandrov, N Echols, M Gerstein (2001). Nucleic Acids Res 29: 1750-64.

An XML Application for Genomic Data Interoperation

KH Cheung, Y Liu, K Kumar, M Snyder, M Gerstein, P Miller (2001). IEEE International Symposium on Bio-Informatics and Biomedical Engineering (BIBE), pp. 97-103.

Genome analyses of spirochetes: a study of the protein structures, functions and metabolic pathways in Treponema pallidum and Borrelia burgdorferi.

R Das, H Hegyi, M Gerstein (2000). J Mol Microbiol Biotechnol 2: 387-92.

Genome-wide analysis relating expression level with protein subcellular localization.

A Drawid, R Jansen, M Gerstein (2000). Trends Genet 16: 426-30.

The stability of thermophilic proteins: a study based on comprehensive genome comparison.

R Das, M Gerstein (2000). Funct Integr Genomics 1: 76-88.

Measuring shifts in function and evolutionary opportunity using variability profiles: a case study of the globins.

GJ Naylor, M Gerstein (2000). J Mol Evol 51: 223-33.

A Bayesian system integrating expression data with sequence patterns for localizing proteins: comprehensive application to the yeast genome.

A Drawid, M Gerstein (2000). J Mol Biol 301: 1059-75.

Protein folds in the worm genome.

M Gerstein, J Lin, H Hegyi (2000). Pac Symp Biocomput: 30-41.

Whole-genome trees based on the occurrence of folds and orthologs: implications for comparing genomes on different levels.

J Lin, M Gerstein (2000). Genome Res 10: 808-18.

The morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework.

WG Krebs, M Gerstein (2000). Nucleic Acids Res 28: 1665-75.

Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.

CA Wilson, J Kreychman, M Gerstein (2000). J Mol Biol 297: 233-49.

Analysis of the yeast transcriptome with structural and functional categories: characterizing highly expressed proteins.

R Jansen, M Gerstein (2000). Nucleic Acids Res 28: 1481-8.

The packing density in proteins: standard radii and volumes.

J Tsai, R Taylor, C Chothia, M Gerstein (1999). J Mol Biol 290: 253-66.

The relationship between protein structure and function: a comprehensive survey with application to the yeast genome.

H Hegyi, M Gerstein (1999). J Mol Biol 288: 147-64.

How representative are the known structures of the proteins in a complete genome? A comprehensive structural census.

M Gerstein (1998). Fold Des 3: 497-512.

Patterns of protein-fold usage in eight microbial genomes: a comprehensive structural census.

M Gerstein (1998). Proteins 33: 518-34.

Measurement of the effectiveness of transitive sequence comparison, through a third 'intermediate' sequence.

M Gerstein (1998). Bioinformatics 14: 707-14.

A database of macromolecular motions.

M Gerstein, W Krebs (1998). Nucleic Acids Res 26: 4280-90.

A unified statistical framework for sequence comparison and structure comparison.

M Levitt, M Gerstein (1998). Proc Natl Acad Sci U S A 95: 5913-20.

Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins.

M Gerstein, M Levitt (1998). Protein Sci 7: 445-56.

A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure.

M Gerstein (1997). J Mol Biol 274: 562-76.

A structural census of the current population of protein sequences.

M Gerstein, M Levitt (1997). Proc Natl Acad Sci U S A 94: 11911-6.

Keeping the Shape but Changing the Charges: A Simulation Study of Urea and its Isosteric Analogues

J Tsai, M Gerstein, M Levitt (1996). Journal of Chemical Physics 104: 9417-9430

Packing at the protein-water interface.

M Gerstein, C Chothia (1996). Proc Natl Acad Sci U S A 93: 10167-72.

Using iterative dynamic programming to obtain accurate pairwise and multiple alignments of protein structures.

M Gerstein, M Levitt (1996). Proc Int Conf Intell Syst Mol Biol 4: 59-67.

Using a measure of structural variation to define a core for the globins.

M Gerstein, RB Altman (1995). Comput Appl Biosci 11: 633-44.

Binding geometry of alpha-helices that recognize DNA.

M Suzuki, M Gerstein (1995). Proteins 23: 525-35.

Average core structures and variability measures for protein families: application to the immunoglobulins.

M Gerstein, RB Altman (1995). J Mol Biol 251: 161-75.

The volume of atoms on the protein surface: calculated from simulation, using Voronoi polyhedra.

M Gerstein, J Tsai, M Levitt (1995). J Mol Biol 249: 955-66.

Methods for displaying macromolecular structural uncertainty: application to the globins.

RB Altman, C Hughes, MB Gerstein (1995). J Mol Graph 13: 142-52, 109-2.

DNA recognition and superstructure formation by helix-turn-helix proteins.

M Suzuki, N Yagi, M Gerstein (1995). Protein Eng 8: 329-38.

Stereochemical basis of DNA recognition by Zn fingers.

M Suzuki, M Gerstein, N Yagi (1994). Nucleic Acids Res 22: 3397-405.

Volume changes on protein folding.

Y Harpaz, M Gerstein, C Chothia (1994). Structure 2: 641-9.

Solution structure of the DNA binding octapeptide repeat of the K10 gene product.

M Suzuki, D Neuhaus, M Gerstein, S Aimoto (1994). Protein Eng 7: 461-70.

Volume changes in protein evolution.

M Gerstein, EL Sonnhammer, C Chothia (1994). J Mol Biol 236: 1067-78.

Finding an average core structure: application to the globins.

RB Altman, M Gerstein (1994). Proc Int Conf Intell Syst Mol Biol 2: 19-27.

Simulation of Water around a Model Protein Helix. 1. Two-dimensional Projections of Solvent Structure.

M Gerstein, R Lynden-Bell (1993). Journal of Physical Chemistry 97: 2982-2991.

Simulation of Water around a Model Protein Helix. 2. The Relative Contributions of Packing, Hydrophobicity, and Hydrogen Bonding.

M Gerstein, R Lynden-Bell (1993). Journal of Physical Chemistry 97: 2991-2999.

Domain closure in lactoferrin. Two hinges produce a see-saw motion between alternative close-packed interfaces.

M Gerstein, BF Anderson, GE Norris, EN Baker, AM Lesk, C Chothia (1993). J Mol Biol 234: 357-72.

An NMR study on the DNA-binding SPKK motif and a model for its interaction with DNA.

M Suzuki, M Gerstein, T Johnson (1993). Protein Eng 6: 565-74.

What is the natural boundary of a protein in solution?

M Gerstein, RM Lynden-Bell (1993). J Mol Biol 230: 641-50.

Domain closure in adenylate kinase. Joints on either side of two helices close like neighboring fingers.

M Gerstein, G Schulz, C Chothia (1993). J Mol Biol 229: 494-501.

Electron diffraction analysis of structural changes in the photocycle of bacteriorhodopsin.

S Subramaniam, M Gerstein, D Oesterhelt, R Henderson (1993). EMBO J 12: 1-8.

A Resolution-Sensitive Procedure for Comparing Protein Surfaces and its Application to the Comparison of Antigen-Combining Sites.

M Gerstein (1992). Acta Crystallographica A48: 271-276.

Polar zipper sequence in the high-affinity hemoglobin of Ascaris suum: amino acid sequence and structural interpretation.

I De Baere, L Liu, L Moens, J Van Beeumen, C Gielens, J Richelle, C Trotman, J Finch, M Gerstein, M Perutz (1992). Proc Natl Acad Sci U S A 89: 4638-42.

Analysis of protein loop closure. Two types of hinges produce one motion in lactate dehydrogenase.

M Gerstein, C Chothia (1991). J Mol Biol 220: 133-49.

Inverse Problem for Synchrotron Radiation in the Presence of Noise

N Fisch, A Kritz, M Gerstein (1987). Proceedings of the Sixth Joint Workshop on Electron Cyclotron Emission and Electron Cyclotron Resonance Heating. (eds. A Riviere, A Costley), 23-30 (Oxford, 16-17 September).
