Complex genetic variation in nearly complete human genomes.
GA Logsdon, P Ebert, PA Audano, M Loftus, D Porubsky, J Ebler, F Yilmaz, P Hallast, T Prodanov, D Yoo, CA Paisie, WT Harvey, X Zhao, GV Martino, M Henglin, KM Munson, K Rabbani, CS Chin, B Gu, H Ashraf, O Austine-Orimoloye, P Balachandran, MJ Bonder, H Cheng, Z Chong, J Crabtree, M Gerstein, LA Guethlein, P Hasenfeld, G Hickey, K Hoekzema, SE Hunt, M Jensen, Y Jiang, S Koren, Y Kwon, C Li, H Li, J Li, PJ Norman, KK Oshima, B Paten, AM Phillippy, NR Pollock, T Rausch, M Rautiainen, S Scholz, Y Song, A Soylev, A Sulovari, L Surapaneni, V Tsapalou, W Zhou, Y Zhou, Q Zhu, MC Zody, RE Mills, SE Devine, X Shi, ME Talkowski, MJP Chaisson, AT Dilthey, MK Konkel, JO Korbel, C Lee, CR Beck, EE Eichler, T Marschall (Preprint). bioRxiv.

Building a Hybrid Physical-Statistical Classifier for Predicting the Effect of Variants Related to Protein-Drug Interactions.
B Wang, C Yan, S Lou, P Emani, B Li, M Xu, X Kong, W Meyerson, YT Yang, D Lee, M Gerstein (2019). Structure 27: 1469-1481.e3.

Genomic analysis of the hydrocarbon-producing, cellulolytic, endophytic fungus Ascocoryne sarcoides.
TA Gianoulis, MA Griffin, DJ Spakowicz, BF Dunican, CJ Alpha, A Sboner, AM Sismour, C Kodira, M Egholm, GM Church, MB Gerstein, SA Strobel (2012). PLoS Genet 8: e1002558.

Novel insights through the integration of structural and functional genomics data with protein networks.
D Clarke, N Bhardwaj, MB Gerstein (2012). J Struct Biol 179: 320-6.

Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps.
KY Yip, L Utz, S Sitwell, X Hu, SS Sidhu, BE Turk, M Gerstein, PM Kim (2011). BMC Biol 9: 53.

MOTIPS: automated motif analysis for predicting targets of modular protein domains.
HY Lam, PM Kim, J Mok, R Tonikian, SS Sidhu, BE Turk, M Snyder, MB Gerstein (2010). BMC Bioinformatics 11: 243.

Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing.
JQ Wu, L Habegger, P Noisa, A Szekely, C Qiu, S Hutchison, D Raha, M Egholm, H Lin, S Weissman, W Cui, M Gerstein, M Snyder (2010). Proc Natl Acad Sci U S A 107: 5254-9.

Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/FOXA in development and environmental response.
M Zhong, W Niu, ZJ Lu, M Sarov, JI Murray, J Janette, D Raha, KL Sheaffer, HY Lam, E Preston, C Slightham, LW Hillier, T Brock, A Agarwal, R Auerbach, AA Hyman, M Gerstein, SE Mango, SK Kim, RH Waterston, V Reinke, M Snyder (2010). PLoS Genet 6: e1000848.

Deciphering protein kinase specificity through large-scale analysis of yeast phosphorylation site motifs.
J Mok, PM Kim, HY Lam, S Piccirillo, X Zhou, GR Jeschke, DL Sheridan, SA Parker, V Desai, M Jwa, E Cameroni, H Niu, M Good, A Remenyi, JL Ma, YJ Sheu, HE Sassi, R Sopko, CS Chan, C De Virgilio, NM Hollingsworth, WA Lim, DF Stern, B Stillman, BJ Andrews, MB Gerstein, M Snyder, BE Turk (2010). Sci Signal 3: ra12.

LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics.
AK Smith, KH Cheung, KY Yip, M Schultz, MK Gerstein (2007). BMC Bioinformatics 8 Suppl 3: S5.

Differential binding of calmodulin-related proteins to their targets revealed through high-density Arabidopsis protein microarrays.
SC Popescu, GV Popescu, S Bachan, Z Zhang, M Seay, M Gerstein, M Snyder, SP Dinesh-Kumar (2007). Proc Natl Acad Sci U S A 104: 4730-5.

Robotic cloning and Protein Production Platform of the Northeast Structural Genomics Consortium.
TB Acton, KC Gunsalus, R Xiao, LC Ma, J Aramini, MC Baran, YW Chiang, T Climent, B Cooper, NG Denissova, SM Douglas, JK Everett, CK Ho, D Macapagal, PK Rajan, R Shastry, LY Shih, GV Swapna, M Wilson, M Wu, M Gerstein, M Inouye, JF Hunt, GT Montelione (2005). Methods Enzymol 394: 210-43.

Computational analysis of membrane proteins: genomic occurrence, structure prediction and helix interactions.
U Lehnert, Y Xia, TE Royce, CS Goh, Y Liu, A Senes, H Yu, ZL Zhang, DM Engelman, M Gerstein (2004). Q Rev Biophys 37: 121-46.

The protein target list of the Northeast Structural Genomics Consortium.
Z Wunderlich, TB Acton, J Liu, G Kornhaber, J Everett, P Carter, N Lan, N Echols, M Gerstein, B Rost, GT Montelione (2004). Proteins 56: 181-7.

A method using active-site sequence conservation to find functional shifts in protein families: application to the enzymes of central metabolism, leading to the identification of an anomalous isocitrate dehydrogenase in pathogens.
R Das, M Gerstein (2004). Proteins 55: 455-63.

Transmembrane protein domains rarely use covalent domain recombination as an evolutionary mechanism.
Y Liu, M Gerstein, DM Engelman (2004). Proc Natl Acad Sci U S A 101: 3495-7.

Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis.
CS Goh, N Lan, SM Douglas, B Wu, N Echols, A Smith, D Milburn, GT Montelione, H Zhao, M Gerstein (2004). J Mol Biol 336: 115-30.

Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures.
V Alexandrov, M Gerstein (2004). BMC Bioinformatics 5: 2.

Data mining crystallization databases: knowledge-based approaches to optimize protein crystal screens.
MS Kimber, F Vallee, S Houston, A Necakov, T Skarina, E Evdokimova, S Beasley, D Christendat, A Savchenko, CH Arrowsmith, M Vedadi, M Gerstein, AM Edwards (2003). Proteins 51: 562-8.

SPINE 2: a system for collaborative structural proteomics within a federated database framework.
CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31: 2833-8.

Structural genomics: current progress.
M Gerstein, A Edwards, CH Arrowsmith, GT Montelione (2003). Science 299: 1663.

Strategies for structural proteomics of prokaryotes: Quantifying the advantages of studying orthologous proteins and of using both NMR and X-ray crystallography approaches.
A Savchenko, A Yee, A Khachatryan, T Skarina, E Evdokimova, M Pavlova, A Semesi, J Northey, S Beasley, N Lan, R Das, M Gerstein, CH Arrowsmith, AM Edwards (2003). Proteins 50: 392-9.

Structural genomics analysis: characteristics of atypical, common, and horizontally transferred folds.
H Hegyi, J Lin, D Greenbaum, M Gerstein (2002). Proteins 47: 126-41.

Thermostability of membrane protein helix-helix interaction elucidated by statistical analysis.
D Schneider, Y Liu, M Gerstein, DM Engelman (2002). FEBS Lett 532: 231-6.

Genomic analysis of membrane protein families: abundance and conserved motifs.
Y Liu, DM Engelman, M Gerstein (2002). Genome Biol 3: research0054.

The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties.
NM Luscombe, J Qian, Z Zhang, T Johnson, M Gerstein (2002). Genome Biol 3: RESEARCH0040.

Structural genomics: a new era for pharmaceutical research.
Y Liu, NM Luscombe, V Alexandrov, P Bertone, P Harrison, Z Zhang, M Gerstein (2002). Genome Biol 3: REPORTS4004.

Protein family and fold occurrence in genomes: power-law behaviour and evolutionary model.
J Qian, NM Luscombe, M Gerstein (2001). J Mol Biol 313: 673-81.

Global perspectives on proteins: comparing genomes in terms of folds, pathways and beyond.
R Das, J Junker, D Greenbaum, MB Gerstein (2001). Pharmacogenomics J 1: 115-25.

SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.
P Bertone, Y Kluger, N Lan, D Zheng, D Christendat, A Yee, AM Edwards, CH Arrowsmith, GT Montelione, M Gerstein (2001). Nucleic Acids Res 29: 2884-98.

Integrative database analysis in structural genomics.
M Gerstein (2000). Nat Struct Biol 7 Suppl: 960-3.

Genome analyses of spirochetes: a study of the protein structures, functions and metabolic pathways in Treponema pallidum and Borrelia burgdorferi.
R Das, H Hegyi, M Gerstein (2000). J Mol Microbiol Biotechnol 2: 387-92.

Structural proteomics of an archaeon.
D Christendat, A Yee, A Dharamsi, Y Kluger, A Savchenko, JR Cort, V Booth, CD Mackereth, V Saridakis, I Ekiel, G Kozlov, KL Maxwell, N Wu, LP McIntosh, K Gehring, MA Kennedy, AR Davidson, EF Pai, M Gerstein, AM Edwards, CH Arrowsmith (2000). Nat Struct Biol 7: 903-9.

Protein folds in the worm genome.
M Gerstein, J Lin, H Hegyi (2000). Pac Symp Biocomput: 30-41.

Advances in structural genomics.
SA Teichmann, C Chothia, M Gerstein (1999). Curr Opin Struct Biol 9: 390-9.

The relationship between protein structure and function: a comprehensive survey with application to the yeast genome.
H Hegyi, M Gerstein (1999). J Mol Biol 288: 147-64.

Comparing genomes in terms of protein structure: surveys of a finite parts list.
M Gerstein, H Hegyi (1998). FEMS Microbiol Rev 22: 277-304.

How representative are the known structures of the proteins in a complete genome? A comprehensive structural census.
M Gerstein (1998). Fold Des 3: 497-512.

Patterns of protein-fold usage in eight microbial genomes: a comprehensive structural census.
M Gerstein (1998). Proteins 33: 518-34.

A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure.
M Gerstein (1997). J Mol Biol 274: 562-76.
