Main Scientific Publications

Total papers: 717

(Last updated Thu Oct 10 20:32:58 2024)

-- Preprint (12) --

Validation of Enhancer Regions in Primary Human Neural Progenitor Cells using Capture STARR-seq
SC Gaynor-Gillett, L Cheng, M Shi, J Liu, G Wang, M Spector, M Flaherty, M Wall, A Hwang, M Gu, Z Chen, Y Chen, P Consortium, JR Moran, J Zhang, D Lee, M Gerstein, D Geschwind, KP White (Preprint). bioRxiv.

Digital phenotyping from wearables using AI characterizes psychiatric disorders and identifies genetic associations
Jason J. Liu, Beatrice Borsari, Yunyang Li, Susanna Liu, Yuan Gao, Xin Xin, Shaoke Lou, Matthew Jensen, Diego Garrido-Martin, Terril Verplaetse, Garrett Ash, Jing Zhang, Matthew J. Girgenti, Walter Roberts, Mark Gerstein (2024). medRxiv.

Complex genetic variation in nearly complete human genomes
GA Logsdon, P Ebert, PA Audano, M Loftus, D Porubsky, J Ebler, F Yilmaz, P Hallast, T Prodanov, D Yoo, CA Paisie, WT Harvey, X Zhao, GV Martino, M Henglin, KM Munson, K Rabbani, CS Chin, B Gu, H Ashraf, O Austine-Orimoloye, P Balachandran, MJ Bonder, H Cheng, Z Chong, J Crabtree, M Gerstein, LA Guethlein, P Hasenfeld, G Hickey, K Hoekzema, SE Hunt, M Jensen, Y Jiang, S Koren, Y Kwon, C Li, H Li, J Li, PJ Norman, KK Oshima, B Paten, AM Phillippy, NR Pollock, T Rausch, M Rautiainen, S Scholz, Y Song, A Soylev, A Sulovari, L Surapaneni, V Tsapalou, W Zhou, Y Zhou, Q Zhu, MC Zody, RE Mills, SE Devine, X Shi, ME Talkowski, MJP Chaisson, AT Dilthey, MK Konkel, JO Korbel, C Lee, CR Beck, EE Eichler, T Marschall (Preprint). bioRxiv.

Homomorphic Encryption: An Application to Polygenic Risk Scores
Israel Yolou, Jiaqi Li, Can Kockan, Matthew Jensen, Mark Gerstein (2024). bioRxiv.

ζ-QVAE: A Quantum Variational Autoencoder utilizing Regularized Mixed-state Latent Representations
Gaoyuan Wang, Jonathan Warrell, Prashant S. Emani, Mark Gerstein (2024). arXiv.

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks
Yuliang Liu, Xiangru Tang, Zefan Cai, Junjie Lu, Yichi Zhang, Yanjun Shao, Zexuan Deng, Helan Hu, Zengxian Yang, Kaikai An, Ruijun Huang, Shuzheng Si, Sheng Chen, Haozhe Zhao, Zhengliang Li, Liang Chen, Yiming Zong, Yan Wang, Tianyu Liu, Zhiwei Jiang, Baobao Chang, Yujia Qin, Wangchunshu Zhou, Yilun Zhao, Arman Cohan, Mark Gerstein (2023). arXiv.

chronODE: A framework to integrate time-series multi-omics data based on ordinary differential equations combined with machine learning
B Borsari, M Frank, ES Wattenberg, K Xu, S Liu, X Yu, MB Gerstein (2023). bioRxiv.

The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity
F Reese, B Williams, G Balderrama-Gutierrez, D Wyman, MH Celik, E Rebboah, N Rezaie, D Trout, M Razavi-Mohseni, Y Jiang, B Borsari, S Morabito, HY Liang, CJ McGill, S Rahmanian, J Sakr, S Jiang, W Zeng, K Carvalho, AK Weimer, LA Dionne, A McShane, K Bedi, SI Elhajjajy, S Upchurch, J Jou, I Youngworth, I Gabdank, P Sud, O Jolanki, JS Strattan, MS Kagda, MP Snyder, BC Hitz, JE Moore, Z Weng, D Bennett, L Reinholdt, M Ljungman, MA Beer, MB Gerstein, L Pachter, R Guigo, BJ Wold, A Mortazavi (Preprint). bioRxiv.

REPIC — an ensemble learning methodology for cryo-EM particle picking
CJF Cameron, SJH Seager, FJ Sigworth, HD Tagare, MB Gerstein (2023). bioRxiv.

Compression-based Network Interpretability Schemes
J Warrell, H Mohsen, M Gerstein (2020). bioRxiv.

LESSeq: Local event-based analysis of alternative splicing using RNA-Seq data
J Leng, CJF Cameron, S Oh, E Khurana, JP Noonan, MB Gerstein (2019). bioRxiv.

A Variational Graph Partitioning Approach to Modeling Protein Liquid-liquid Phase Separation
Gaoyuan Wang, Jonathan H Warrell, Suchen Zheng, Mark Gerstein (2023). bioRxiv.

-- 2024 (23) --

FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations
TC Li, H Zhou, V Verma, X Tang, Y Shao, E Van Buren, Z Weng, M Gerstein, B Neale, SR Sunyaev, X Lin (2024). Bioinform Adv 4: vbae143.

Spatially Exploring RNA Biology in Archival Formalin-Fixed Paraffin-Embedded Tissues
Z Bai, D Zhang, Y Gao, B Tao, D Zhang, S Bao, A Enninful, Y Wang, H Li, G Su, X Tian, N Zhang, Y Xiao, Y Liu, M Gerstein, M Li, Y Xing, J Lu, ML Xu, R Fan (2024). Cell.

Predicting spatially resolved gene expression via tissue morphology using adaptive spatial GNNs
T Song, E Cosatto, G Wang, R Kuang, M Gerstein, MR Min, J Warrell (2024). Bioinformatics 40: ii111-ii119.

Cross-ancestry atlas of gene, isoform, and splicing regulation in the developing human brain.
C Wen, M Margolis, R Dai, P Zhang, PF Przytycki, DD Vo, A Bhattacharya, N Matoba, M Tang, C Jiao, M Kim, E Tsai, C Hoh, N Aygun, RL Walker, C Chatzinakos, D Clarke, H Pratt, PsychENCODE Consortium, MA Peters, M Gerstein, NP Daskalakis, Z Weng, AE Jaffe, JE Kleinman, TM Hyde, DR Weinberger, NJ Bray, N Sestan, DH Geschwind, K Roeder, A Gusev, B Pasaniuc, JL Stein, MI Love, KS Pollard, C Liu, MJ Gandal, PsychENCODE Consortium (2024). Science 384: eadh0829.

Transcriptional determinism and stochasticity contribute to the complexity of autism-associated SHANK family genes.
X Lu, P Ni, P Suarez-Meade, Y Ma, EN Forrest, G Wang, Y Wang, A Quinones-Hinojosa, M Gerstein, YH Jiang (2024). Cell Rep 43: 114376.

Predicting A/B compartments from histone modifications using deep learning.
S Zheng, N Thakkar, HL Harris, S Liu, M Zhang, M Gerstein, EL Aiden, MJ Rowley, WS Noble, G Gursoy, R Singh (2024). iScience 27: 109570.

Fast, sensitive detection of protein homologs using deep dense retrieval.
L Hong, Z Hu, S Sun, X Tang, J Wang, Q Tan, L Zheng, S Wang, S Xu, I King, M Gerstein, Y Li (2024). Nat Biotechnol.

Representing core gene expression activity relationships using the latent structure implicit in Bayesian networks
J Gao, M Gerstein (2024). Bioinformatics 40.

Using a comprehensive atlas and predictive models to reveal the complexity and evolution of brain-active regulatory elements
HE Pratt, G Andrews, N Shedd, N Phalke, T Li, A Pampari, M Jensen, C Wen, P Consortium, MJ Gandal, DH Geschwind, M Gerstein, J Moore, A Kundaje, A Colubri, Z Weng (2024). Sci Adv 10: eadj4452.

Leveraging a large language model to predict protein phase transition: a physical, multiscale and interpretable approach
M Frank, P Ni, M Jensen, MB Gerstein (2024). Proc Natl Acad Sci U S A 121: e2320510121.

A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation
X Tang, H Dai, E Knight, F Wu, Y Li, T Li, M Gerstein (2024). Brief Bioinform 25.

MolLM: A Unified Language Model to Integrate Biomedical Text with 2D and 3D Molecular Representations
X Tang, A Tran, J Tan, MB Gerstein (2024). Bioinformatics 40: i357-i368.

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning
X Tang, A Zou, Z Zhang, Y Zhao, X Zhang, A Cohan, M Gerstein (2024). In Findings of the Association for Computational Linguistics: ACL

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
X Tang, Y Zong, J Phang, Y Zhao, W Zhou, A Cohan, M Gerstein (2024). In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Short Papers): 12–34

BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
X Tang, B Qian, R Gao, J Chen, X Chen, MB Gerstein (2024). Bioinformatics 40: i266-i276.

The single-cell opioid responses in the context of HIV (SCORCH) consortium
SA Ament, RR Campbell, MK Lobo, JP Receveur, K Agrawal, A Borjabad, SN Byrareddy, L Chang, D Clarke, P Emani, D Gabuzda, KJ Gaulton, M Giglio, FM Giorgi, B Gok, C Guda, E Hadas, BR Herb, W Hu, A Huttner, MR Ishmam, MM Jacobs, J Kelschenbach, DW Kim, C Lee, S Liu, X Liu, BK Madras, AA Mahurkar, DC Mash, EA Mukamel, M Niu, RM O'Connor, CM Pagan, APS Pang, P Pillai, V Repunte-Canonigo, WB Ruzicka, J Stanley, T Tickle, SA Tsai, A Wang, L Wills, AM Wilson, SN Wright, S Xu, J Yang, M Zand, L Zhang, J Zhang, S Akbarian, S Buch, CS Cheng, MJ Corley, HS Fox, M Gerstein, S Gummuluru, M Heiman, YC Ho, M Kellis, PJ Kenny, Y Kluger, TA Milner, DJ Moore, S Morgello, LC Ndhlovu, TM Rana, PP Sanna, JS Satterlee, N Sestan, SA Spector, S Spudich, HU Tilgner, DJ Volsky, OR White, DW Williams, H Zeng (2024). Mol Psychiatry.

Single-cell genomics and regulatory networks for 388 human brains
PS Emani, JJ Liu, D Clarke, M Jensen, J Warrell, C Gupta, R Meng, CY Lee, S Xu, C Dursun, S Lou, Y Chen, Z Chu, T Galeev, A Hwang, Y Li, P Ni, X Zhou, PsychENCODE Consortium, TE Bakken, J Bendl, L Bicks, T Chatterjee, L Cheng, Y Cheng, Y Dai, Z Duan, M Flaherty, JF Fullard, M Gancz, D Garrido-Martin, S Gaynor-Gillett, J Grundman, N Hawken, E Henry, GE Hoffman, A Huang, Y Jiang, T Jin, NL Jorstad, R Kawaguchi, S Khullar, J Liu, J Liu, S Liu, S Ma, M Margolis, S Mazariegos, J Moore, JR Moran, E Nguyen, N Phalke, M Pjanic, H Pratt, D Quintero, AS Rajagopalan, TR Riesenmy, N Shedd, M Shi, M Spector, R Terwilliger, KJ Travaglini, B Wamsley, G Wang, Y Xia, S Xiao, AC Yang, S Zheng, MJ Gandal, D Lee, ES Lein, P Roussos, N Sestan, Z Weng, KP White, H Won, MJ Girgenti, J Zhang, D Wang, D Geschwind, M Gerstein (2024). Science 384: eadi5199.

Spatially informed gene signatures for response to immunotherapy in melanoma
TN Aung, J Warrell, S Martinez-Morilla, N Gavrielatou, I Vathiotis, V Yaghoobi, HM Kluger, M Gerstein, DL Rimm (2024). Clin Cancer Res 30: 3520-3532.

Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution
J Warrell, L Salichos, M Gancz, MB Gerstein (2024). J R Soc Interface 21: 20230647.

scENCORE: leveraging single-cell epigenetic data to predict chromatin conformation using graph embedding
Z Duan, S Xu, S Sai Srinivasan, A Hwang, CY Lee, F Yue, M Gerstein, Y Luan, M Girgenti, J Zhang (2024). Brief Bioinform 25.

Less-is-more: selecting transcription factor binding regions informative for motif inference
J Xu, J Gao, P Ni, M Gerstein (2024). Nucleic Acids Res 52: e20.

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Lauren Hong, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun (2024). The Twelfth International Conference on Learning Representations (ICLR 2024).

Investigating Data Contamination in Modern Benchmarks for Large Language Models
C Deng, Y Zhao, X Tang, M Gerstein, A Cohan (2024). In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Volume 1 (Long Papers): 8706–8719

-- 2023 (20) --

GersteinLab at MEDIQA-Chat 2023: Clinical Note Summarization from Doctor-Patient Conversations through Fine-tuning and In-context Learning
Xiangru Tang, Andrew Tran, Jeffrey Tan, Mark Gerstein (2023). Proceedings of the 5th Clinical Natural Language Processing Workshop.

Aligning factual consistency for clinical studies summarization through reinforcement learning
Xiangru Tang, Arman Cohan, Mark Gerstein (2023). Proceedings of the 5th Clinical Natural Language Processing Workshop.

Assessing and mitigating privacy risks of sparse, noisy genotypes by local alignment to haplotype databases.
PS Emani, MN Geradi, G Gursoy, MR Grasty, A Miranker, MB Gerstein (2023). Genome Res 33: 2156-2173.

Disentangled Wasserstein Autoencoder for T-Cell Receptor Engineering
T Li, H Guo, F Grazioli, MB Gerstein, MR Min (2023). NeurIPS 2023.

Calculating the future
D Greenbaum, M Gerstein (2023). Science 380: 589.

Resisting efficiency’s overreach
D Greenbaum, M Gerstein (2023). Science 381: 1162.

More than bad luck: Cancer and aging are linked to replication-driven changes to the epigenome.
CJ Minteer, K Thrush, J Gonzalez, P Niimi, M Rozenblit, J Rozowsky, J Liu, M Frank, T McCabe, AT Higgins-Chen, E Hofstatter, L Pusztai, K Beckman, M Gerstein, ME Levine (2023). Sci Adv 9: eadf4163.

Minor intron splicing is critical for survival of lethal prostate cancer.
A Augspach, KD Drake, L Roma, E Qian, SR Lee, D Clarke, S Kumar, M Jaquet, J Gallon, M Bolis, J Triscott, JA Galvan, Y Chen, GN Thalmann, M Kruithof-de Julio, JP Theurillat, S Wuchty, M Gerstein, S Piscuoglio, RN Kanadia, MA Rubin (2023). Mol Cell 83: 1983-2002e11.

The association between evening social media use and delayed sleep may be causal: Suggestive evidence from 120 million Reddit timestamps
WU Meyerson, SK Fineberg, FC Andrade, P Corlett, MB Gerstein, RH Hoyle (2023). Sleep Med 107: 212-218.

Genetic determination of regional connectivity in modelling the spread of COVID-19 outbreak for more efficient mitigation strategies
L Salichos, J Warrell, H Cevasco, A Chung, M Gerstein (2023). Sci Rep 13: 8470.

exRNA-eCLIP intersection analysis reveals a map of extracellular RNA binding proteins and associated RNAs across major human biofluids and carriers.
EL LaPlante, A Sturchler, R Fullem, D Chen, AC Starner, E Esquivel, E Alsop, AR Jackson, I Ghiran, G Pereira, J Rozowsky, J Chang, MB Gerstein, RP Alexander, ME Roth, JL Franklin, RJ Coffey, RL Raffai, IM Mansuy, S Stavrakis, AJ deMello, LC Laurent, YT Wang, CF Tsai, T Liu, J Jones, K Van Keuren-Jensen, E Van Nostrand, B Mateescu, A Milosavljevic (2023). Cell Genom 3: 100303.

Constructing a multiple-layer interactome for SARS-CoV-2 in the context of lung disease: Linking the virus with human genes and co-infecting microbes
S Lou, M Yang, T Li, W Zhao, H Cevasco, YT Yang, M Gerstein (2023). PLoS Comput Biol 19: e1011222.

Unified views on variant impact across many diseases
S Kumar, M Gerstein (2023). Trends Genet 39: 442-450.

The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models
J Rozowsky, J Gao, B Borsari, YT Yang, T Galeev, G Gursoy, CB Epstein, K Xiong, J Xu, T Li, J Liu, K Yu, A Berthel, Z Chen, F Navarro, MS Sun, J Wright, J Chang, CJF Cameron, N Shoresh, E Gaskell, J Drenkow, J Adrian, S Aganezov, F Aguet, G Balderrama-Gutierrez, S Banskota, GB Corona, S Chee, SB Chhetri, GC Cortez Martins, C Danyko, CA Davis, D Farid, NP Farrell, I Gabdank, Y Gofin, DU Gorkin, M Gu, V Hecht, BC Hitz, R Issner, Y Jiang, M Kirsche, X Kong, BR Lam, S Li, B Li, X Li, KZ Lin, R Luo, M Mackiewicz, R Meng, JE Moore, J Mudge, N Nelson, C Nusbaum, I Popov, HE Pratt, Y Qiu, S Ramakrishnan, J Raymond, L Salichos, A Scavelli, JM Schreiber, FJ Sedlazeck, LH See, RM Sherman, X Shi, M Shi, CA Sloan, JS Strattan, Z Tan, FY Tanaka, A Vlasova, J Wang, J Werner, B Williams, M Xu, C Yan, L Yu, C Zaleski, J Zhang, K Ardlie, JM Cherry, EM Mendenhall, WS Noble, Z Weng, ME Levine, A Dobin, B Wold, A Mortazavi, B Ren, J Gillis, RM Myers, MP Snyder, J Choudhary, A Milosavljevic, MC Schatz, BE Bernstein, R Guigo, TR Gingeras, M Gerstein (2023). Cell 186: 1493-1511e40.

Estimation of Bedtimes of Reddit Users: Integrated Analysis of Time Stamps and Surveys
WU Meyerson, SK Fineberg, YK Song, A Faber, G Ash, FC Andrade, P Corlett, MB Gerstein, RH Hoyle (2023). JMIR Form Res 7: e38112.

Privacy-preserving cancer type prediction with homomorphic encryption
E Sarkar, E Chielle, G Gursoy, L Chen, M Gerstein, M Maniatakos (2023). Sci Rep 13: 1661.

Binding peptide generation for MHC Class I proteins with deep reinforcement learning
Z Chen, B Zhang, H Guo, P Emani, T Clancy, C Jiang, M Gerstein, X Ning, C Cheng, MR Min (2023). Bioinformatics 39.

Proteome-wide screening for mitogen-activated protein kinase docking motifs and interactors
G Shi, C Song, J Torres Robles, L Salichos, HJ Lou, TT Lam, M Gerstein, BE Turk (2023). Sci Signal 16: eabm5518.

Recurrent repeat expansions in human cancer genomes
GS Erwin, G Gursoy, R Al-Abri, A Suriyaprakash, E Dolzhenko, K Zhu, CR Hoerner, SM White, L Ramirez, A Vadlakonda, A Vadlakonda, K von Kraut, J Park, CM Brannon, DA Sumano, RA Kirtikar, AA Erwin, TJ Metzner, RKC Yuen, AC Fan, JT Leppert, MA Eberle, M Gerstein, MP Snyder (2023). Nature 613: 96-102.

Insights from incorporating quantum computing into drug design workflows
B Lau, PS Emani, J Chapman, L Yao, T Lam, P Merrill, J Warrell, MB Gerstein, HYK Lam (2023). Bioinformatics 39.

-- 2022 (20) --

Baseline gene expression profiling determines long-term benefit to programmed cell death protein 1 axis blockade
IA Vathiotis, L Salichos, S Martinez-Morilla, N Gavrielatou, TN Aung, S Shafi, PF Wong, S Jessel, HM Kluger, KN Syrigos, S Warren, M Gerstein, DL Rimm (2022). NPJ Precis Oncol 6: 92.

Genomic research data and the justice system.
D Greenbaum, M Gerstein (2022). Science 377: 826-827.

Building integrative functional maps of gene regulation.
J Xu, HE Pratt, JE Moore, MB Gerstein, Z Weng (2022). Hum Mol Genet 31: R114-R122.

GATTACA is still pertinent 25 years later
D Greenbaum, M Gerstein (2022). Nat Genet 54: 1758-1760.

Illuminating links between cis-regulators and trans-acting variants in the human prefrontal cortex.
S Liu, H Won, D Clarke, N Matoba, S Khullar, Y Mu, D Wang, M Gerstein (2022). Genome Med 14: 133.

Genomic Privacy: Advocating for the Convergence of Legal and Technical Solutions
C Kockan, D Greenbaum, D Lee, M Gerstein (2022). Georgetown Journal of International Affairs 23(2): 246-253.

DeepVelo: Single-cell transcriptomic deep velocity field learning with neural ordinary differential equations.
Z Chen, WC King, A Hwang, M Gerstein, J Zhang (2022). Sci Adv 8: eabq3745.

Venus: An efficient virus infection detection and fusion site discovery method using single-cell and bulk RNA-seq data.
CY Lee, Y Chen, Z Duan, M Xu, MJ Girgenti, K Xu, M Gerstein, J Zhang (2022). PLoS Comput Biol 18: e1010636.

Broad transcriptomic dysregulation occurs across the cerebral cortex in ASD.
MJ Gandal, JR Haney, B Wamsley, CX Yap, S Parhami, PS Emani, N Chang, GT Chen, GD Hoftman, D de Alba, G Ramaswami, CL Hartl, A Bhattacharya, C Luo, T Jin, D Wang, R Kawaguchi, D Quintero, J Ou, YE Wu, NN Parikshak, V Swarup, TG Belgard, M Gerstein, B Pasaniuc, DH Geschwind (2022). Nature 611: 532-539.

FAVOR: functional annotation of variants online resource and annotator for variation across the human genome.
H Zhou, T Arapoglou, X Li, Z Li, X Zheng, J Moore, A Asok, S Kumar, EE Blue, S Buyske, N Cox, A Felsenfeld, M Gerstein, E Kenny, B Li, T Matise, A Philippakis, HL Rehm, HJ Sofia, G Snyder, NHGRI Genome Sequencing Program Variant Functional Annotation Working Group, Z Weng, B Neale, SR Sunyaev, X Lin (2022). Nucleic Acids Res 51: D1300-D1311.

Privacy-preserving Model Training for Disease Prediction Using Federated Learning with Differential Privacy.
A Khanna, V Schaffer, G Gursoy, M Gerstein (2022). Annu Int Conf IEEE Eng Med Biol Soc 2022: 1358-1361.

Phase 2 of extracellular RNA communication consortium charts next-generation approaches for extracellular RNA research.
B Mateescu, JC Jones, RP Alexander, E Alsop, JY An, M Asghari, A Boomgarden, L Bouchareychas, A Cayota, HC Chang, A Charest, DT Chiu, RJ Coffey, S Das, P De Hoff, A deMello, C D'Souza-Schorey, D Elashoff, KR Eliato, JL Franklin, DJ Galas, MB Gerstein, IH Ghiran, DB Go, S Gould, TR Grogan, JN Higginbotham, F Hladik, TJ Huang, X Huo, E Hutchins, DK Jeppesen, T Jovanovic-Talisman, BYS Kim, S Kim, KM Kim, Y Kim, RR Kitchen, V Knouse, EL LaPlante, CB Lebrilla, LJ Lee, KM Lennon, G Li, F Li, T Li, T Liu, Z Liu, AL Maddox, K McCarthy, B Meechoovet, N Maniya, Y Meng, A Milosavljevic, BH Min, A Morey, M Ng, J Nolan, GP De Oliveira Junior, ME Paulaitis, TA Phu, RL Raffai, E Reategui, ME Roth, DA Routenberg, J Rozowsky, J Rufo, S Senapati, S Shachar, H Sharma, AK Sood, S Stavrakis, A Sturchler, M Tewari, JP Tosar, AK Tucker-Schwartz, A Turchinovich, N Valkov, K Van Keuren-Jensen, KC Vickers, L Vojtech, WN Vreeland, C Wang, K Wang, Z Wang, JA Welsh, KW Witwer, DTW Wong, J Xia, YH Xie, K Yang, MP Zaborowski, C Zhang, Q Zhang, AM Zivkovic, LC Laurent (2022). iScience 25: 104653.

Standardized annotation of translated open reading frames.
JM Mudge, J Ruiz-Orera, JR Prensner, MA Brunet, F Calvet, I Jungreis, JM Gonzalez, M Magrane, TF Martinez, JF Schulz, YT Yang, MM Alba, JL Aspden, PV Baranov, AA Bazzini, E Bruford, MJ Martin, L Calviello, AR Carvunis, J Chen, JP Couso, EW Deutsch, P Flicek, A Frankish, M Gerstein, N Hubner, NT Ingolia, M Kellis, G Menschaert, RL Moritz, U Ohler, X Roucou, A Saghatelian, JS Weissman, S van Heesch (2022). Nat Biotechnol 40: 994-999.

Switching labs during a PhD.
J Park, M Gerstein (2022). Nature.

Storing and analyzing a genome on a blockchain.
G Gursoy, CM Brannon, E Ni, S Wagner, A Khanna, M Gerstein (2022). Genome Biol 23: 134.

Forest Fire Clustering for single-cell sequencing combines iterative label propagation with parallelized Monte Carlo simulations.
Z Chen, J Goldwasser, P Tuckman, J Liu, J Zhang, M Gerstein (2022). Nat Commun 13: 3538.

Cancer Relevance of Human Genes.
T Qing, H Mohsen, VL Cannataro, M Marczyk, M Rozenblit, J Foldi, M Murray, JP Townsend, Y Kluger, M Gerstein, L Pusztai (2022). J Natl Cancer Inst 114: 988-995.

The lasting legacy of John von Neumann
D Greenbaum, M Gerstein (2022). Science 375: 983.

Centers for Mendelian Genomics: A decade of facilitating gene discovery
SM Baxter, JE Posey, NJ Lake, N Sobreira, JX Chong, S Buyske, EE Blue, LH Chadwick, ZH Coban-Akdemir, KF Doheny, CP Davis, M Lek, C Wellington, SN Jhangiani, M Gerstein, RA Gibbs, RP Lifton, DG MacArthur, TC Matise, JR Lupski, D Valle, MJ Bamshad, A Hamosh, S Mane, DA Nickerson, Centers for Mendelian Genomics Consortium, HL Rehm, A O'Donnell-Luria (2022). Genet Med 24: 784-797.

-- 2021 (19) --

Haplotype-resolved diverse human genomes and integrated analysis of structural variation.
P Ebert, PA Audano, Q Zhu, B Rodriguez-Martin, D Porubsky, MJ Bonder, A Sulovari, J Ebler, W Zhou, R Serra Mari, F Yilmaz, X Zhao, P Hsieh, J Lee, S Kumar, J Lin, T Rausch, Y Chen, J Ren, M Santamarina, W Hops, H Ashraf, NT Chuang, X Yang, KM Munson, AP Lewis, S Fairley, LJ Tallon, WE Clarke, AO Basile, M Byrska-Bishop, A Corvelo, US Evani, TY Lu, MJP Chaisson, J Chen, C Li, H Brand, AM Wenger, M Ghareghani, WT Harvey, B Raeder, P Hasenfeld, AA Regier, HJ Abel, IM Hall, P Flicek, O Stegle, MB Gerstein, JMC Tubio, Z Mu, YI Li, X Shi, AR Hastie, K Ye, Z Chong, AD Sanders, MC Zody, ME Talkowski, RE Mills, SE Devine, C Lee, JO Korbel, T Marschall, EE Eichler (2021). Science 372.

Cross-platform transcriptomic profiling of the response to recombinant human erythropoietin.
G Wang, T Kitaoka, A Crawford, Q Mao, A Hesketh, FM Guppy, GI Ash, J Liu, MB Gerstein, YP Pitsiladis (2021). Sci Rep 11: 21705.

Fast and Scalable Private Genotype Imputation Using Machine Learning and Partially Homomorphic Encryption.
E Sarkar, E Chielle, G Gursoy, O Mazonka, M Gerstein, M Maniatakos (2021). IEEE Access 9: 93097-93110.

Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.
X Zhao, RL Collins, WP Lee, AM Weber, Y Jun, Q Zhu, B Weisburd, Y Huang, PA Audano, H Wang, M Walker, C Lowther, J Fu, Human Genome Structural Variation Consortium, MB Gerstein, SE Devine, T Marschall, JO Korbel, EE Eichler, MJP Chaisson, C Lee, RE Mills, H Brand, ME Talkowski (2021). Am J Hum Genet 108: 919-928.

A new tool for technical standardization of the Ki67 immunohistochemical assay.
TN Aung, B Acs, J Warrell, Y Bai, P Gaule, S Martinez-Morilla, I Vathiotis, S Shafi, M Moutafi, M Gerstein, B Freiberg, R Fulton, DL Rimm (2021). Mod Pathol 34: 1261-1270.

Functional genomics data: privacy risk assessment and technological mitigation.
G Gursoy, T Li, S Liu, E Ni, CM Brannon, MB Gerstein (2021). Nat Rev Genet 23: 245-258.

Privacy-preserving genotype imputation with fully homomorphic encryption.
G Gursoy, E Chielle, CM Brannon, M Maniatakos, M Gerstein (2021). Cell Syst 13: 173-182e3.

Nodal modulator (NOMO) is required to sustain endoplasmic reticulum morphology.
C Amaya, CJF Cameron, SC Devarkar, SJH Seager, MB Gerstein, Y Xiong, C Schlieker (2021). J Biol Chem 297: 100937.

Differences in evolutionary accessibility determine which equally effective regulatory motif evolves to generate pulses.
K Xiong, M Gerstein, J Masel (2021). Genetics 219.

DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays.
Z Chen, J Zhang, J Liu, Y Dai, D Lee, MR Min, M Xu, M Gerstein (2021). Bioinformatics 37: i280-i288.

Recovering genotypes and phenotypes using allele-specific genes.
G Gursoy, N Lu, S Wagner, M Gerstein (2021). Genome Biol 22: 263.

Network propagation-based prioritization of long tail genes in 17 cancer types.
H Mohsen, V Gunasekharan, T Qing, M Seay, Y Surovtseva, S Negahban, Z Szallasi, L Pusztai, MB Gerstein (2021). Genome Biol 22: 287.

Establishing a Global Standard for Wearable Devices in Sport and Exercise Medicine: Perspectives from Academic and Industry Stakeholders.
GI Ash, M Stults-Kolehmainen, MA Busa, AE Gaffey, K Angeloudis, B Muniz-Pardos, R Gregory, RA Huggins, NS Redeker, SA Weinzimer, LA Grieco, K Lyden, E Megally, I Vogiatzis, L Scher, X Zhu, JS Baker, C Brandt, MS Businelle, LM Fucito, S Griggs, R Jarrin, BJ Mortazavi, T Prioleau, W Roberts, EK Spanakis, LM Nally, A Debruyne, N Bachl, F Pigozzi, F Halabchi, DA Ramagole, DC Janse van Rensburg, B Wolfarth, C Fossati, S Rozenstoka, K Tanisawa, M Borjesson, JA Casajus, A Gonzalez-Aguero, I Zelenkova, J Swart, G Gursoy, W Meyerson, J Liu, D Greenbaum, YP Pitsiladis, MB Gerstein (2021). Sports Med 51: 2237-2250.

Bayesian structural time series for biomedical sensor data: A flexible modeling framework for evaluating interventions.
J Liu, DJ Spakowicz, GI Ash, R Hoyd, R Ahluwalia, A Zhang, S Lou, D Lee, J Zhang, C Presley, A Greene, M Stults-Kolehmainen, LM Nally, JS Baker, LM Fucito, SA Weinzimer, AV Papachristos, M Gerstein (2021). PLoS Comput Biol 17: e1009303.

Whole-genome sequencing of phenotypically distinct inflammatory breast cancers reveals similar genomic alterations to non-inflammatory breast cancers.
X Li, S Kumar, A Harmanci, S Li, RR Kitchen, Y Zhang, VB Wali, SM Reddy, WA Woodward, JM Reuben, J Rozowsky, C Hatzis, NT Ueno, S Krishnamurthy, L Pusztai, M Gerstein (2021). Genome Med 13: 70.

Gene Tracer: a smart, interactive, voice-controlled Alexa skill For gene information retrieval and browsing, mutation annotation and network visualization.
S Lou, T Li, J Liu, M Gerstein (2021). Bioinformatics 37: 2998-3000.

SCAN-ATAC-Sim: a scalable and efficient method for simulating single-cell ATAC-seq data from bulk-tissue experiments.
Z Chen, J Zhang, J Liu, Z Zhang, J Zhu, D Lee, M Xu, M Gerstein (2021). Bioinformatics 37: 1756-1758.

Molecular medicine tumor board: whole-genome sequencing to inform on personalized medicine for a man with advanced prostate cancer.
AJ Armstrong, X Li, M Tucker, S Li, XJ Mu, KW Eng, A Sboner, M Rubin, M Gerstein (2021). Prostate Cancer Prostatic Dis 24: 786-793.

Quantum computing at the frontiers of biological sciences
PS Emani, J Warrell, A Anticevic, S Bekiranov, M Gandal, MJ McConnell, G Sapiro, A Aspuru-Guzik, JT Baker, M Bastiani, JD Murray, SN Sotiropoulos, J Taylor, G Senthil, T Lehner, MB Gerstein, AW Harrow (2021). Nat Methods 18: 701-709.

-- 2020 (41) --

Germline variant burden in cancer genes correlates with age at diagnosis and somatic mutation burden.
T Qing, H Mohsen, M Marczyk, Y Ye, T O'Meara, H Zhao, JP Townsend, M Gerstein, C Hatzis, Y Kluger, L Pusztai (2020). Nat Commun 11: 2438.

To mock or not: a comprehensive comparison of mock IP and DNA input for ChIP-seq.
J Xu, MM Kudron, A Victorsen, J Gao, HN Ammouri, FCP Navarro, L Gevirtzman, RH Waterston, KP White, V Reinke, M Gerstein (2020). Nucleic Acids Res 49: e17.

Weight-based Neural Network Interpretability using Activation Tuning and Personalized Products
H Mohsen, J Warrell, MR Min, S Negahban, M Gerstein (2020). The 15th Machine Learning in Computational Biology Workshop.

STARRPeaker: uniform processing and accurate identification of STARR-seq active regions.
D Lee, M Shi, J Moran, M Wall, J Zhang, J Liu, D Fitzgerald, Y Kyono, L Ma, KP White, M Gerstein (2020). Genome Biol 21: 298.

GENCODE 2021.
A Frankish, M Diekhans, I Jungreis, J Lagarde, JE Loveland, JM Mudge, C Sisu, JC Wright, J Armstrong, I Barnes, A Berry, A Bignell, C Boix, S Carbonell Sala, F Cunningham, T Di Domenico, S Donaldson, IT Fiddes, C Garcia Giron, JM Gonzalez, T Grego, M Hardy, T Hourlier, KL Howe, T Hunt, OG Izuogu, R Johnson, FJ Martin, L Martinez, S Mohanan, P Muir, FCP Navarro, A Parker, B Pei, F Pozo, FC Riera, M Ruffier, BM Schmitt, E Stapleton, MM Suner, I Sycheva, B Uszczynska-Ratajczak, MY Wolf, J Xu, YT Yang, A Yates, D Zerbino, Y Zhang, JS Choudhary, M Gerstein, R Guigo, TJP Hubbard, M Kellis, B Paten, ML Tress, P Flicek (2020). Nucleic Acids Res 49: D916-D923.

Predicting changes in protein thermodynamic stability upon point mutation with deep 3D convolutional neural networks.
B Li, YT Yang, JA Capra, MB Gerstein (2020). PLoS Comput Biol 16: e1008291.

Data Sanitization to Reduce Private Information Leakage from Functional Genomics.
G Gursoy, P Emani, CM Brannon, OA Jolanki, A Harmanci, JS Strattan, JM Cherry, AD Miranker, M Gerstein (2020). Cell 183: 905-917e16.

SVFX: a machine learning framework to quantify the pathogenicity of structural variants.
S Kumar, A Harmanci, J Vytheeswaran, MB Gerstein (2020). Genome Biol 21: 274.

NIMBus: a negative binomial regression based Integrative Method for mutation Burden Analysis.
J Zhang, J Liu, P McGillivray, C Yi, L Lochovsky, D Lee, M Gerstein (2020). BMC Bioinformatics 21: 474.

Latent-space embedding of expression data identifies gene signatures from sputum samples of asthmatic patients.
S Lou, T Li, D Spakowicz, X Yan, GL Chupp, M Gerstein (2020). BMC Bioinformatics 21: 457.

Moving beyond buzzwords
D Greenbaum, M Gerstein (2020). Science 370: 178.

Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples.
MH Bailey, WU Meyerson, LJ Dursi, LB Wang, G Dong, WW Liang, A Weerasinghe, S Li, Y Li, S Kelso, MC3 Working Group, PCAWG novel somatic mutation calling methods working group, G Saksena, K Ellrott, MC Wendl, DA Wheeler, G Getz, JT Simpson, MB Gerstein, L Ding, PCAWG Consortium (2020). Nat Commun 11: 4748.

Predicting the frequencies of drug side effects.
D Galeano, S Li, M Gerstein, A Paccanaro (2020). Nat Commun 11: 4575.

Sex differences in oncogenic mutational processes.
CH Li, SD Prokopec, RX Sun, F Yousif, N Schmitz, PCAWG Tumour Subtypes and Clinical Translation, PC Boutros, PCAWG Consortium (2020). Nat Commun 11: 4330.

Cyclic and Multilevel Causation in Evolutionary Processes
J Warrell, M Gerstein (2020). Biology & Philosophy 35(5): 1-36.

Expanded encyclopaedias of DNA elements in the human and mouse genomes.
ENCODE Project Consortium, JE Moore, MJ Purcaro, HE Pratt, CB Epstein, N Shoresh, J Adrian, T Kawli, CA Davis, A Dobin, R Kaul, J Halow, EL Van Nostrand, P Freese, DU Gorkin, Y Shen, Y He, M Mackiewicz, F Pauli-Behn, BA Williams, A Mortazavi, CA Keller, XO Zhang, SI Elhajjajy, J Huey, DE Dickel, V Snetkova, X Wei, X Wang, JC Rivera-Mulia, J Rozowsky, J Zhang, SB Chhetri, J Zhang, A Victorsen, KP White, A Visel, GW Yeo, CB Burge, E Lecuyer, DM Gilbert, J Dekker, J Rinn, EM Mendenhall, JR Ecker, M Kellis, RJ Klein, WS Noble, A Kundaje, R Guigo, PJ Farnham, JM Cherry, RM Myers, B Ren, BR Graveley, MB Gerstein, LA Pennacchio, MP Snyder, BE Bernstein, B Wold, RC Hardison, TR Gingeras, JA Stamatoyannopoulos, Z Weng (2020). Nature 583: 699-710.

Perspectives on ENCODE.
ENCODE Project Consortium, MP Snyder, TR Gingeras, JE Moore, Z Weng, MB Gerstein, B Ren, RC Hardison, JA Stamatoyannopoulos, BR Graveley, EA Feingold, MJ Pazin, M Pagan, DA Gilchrist, BC Hitz, JM Cherry, BE Bernstein, EM Mendenhall, DR Zerbino, A Frankish, P Flicek, RM Myers (2020). Nature 583: 693-698.

An integrative ENCODE resource for cancer genomics.
J Zhang, D Lee, V Dhiman, P Jiang, J Xu, P McGillivray, H Yang, J Liu, W Meyerson, D Clarke, M Gu, S Li, S Lou, J Xu, L Lochovsky, M Ung, L Ma, S Yu, Q Cao, A Harmanci, KK Yan, A Sethi, G Gursoy, MR Schoenberg, J Rozowsky, J Warrell, P Emani, YT Yang, T Galeev, X Kong, S Liu, X Li, J Krishnan, Y Feng, JC Rivera-Mulia, J Adrian, JR Broach, M Bolt, J Moran, D Fitzgerald, V Dileep, T Liu, S Mei, T Sasaki, C Trevilla-Garcia, S Wang, Y Wang, C Zang, D Wang, RJ Klein, M Snyder, DM Gilbert, K Yip, C Cheng, F Yue, XS Liu, KP White, M Gerstein (2020). Nat Commun 11: 3696.

Transcriptional activity and strain-specific history of mouse pseudogenes.
C Sisu, P Muir, A Frankish, I Fiddes, M Diekhans, D Thybert, DT Odom, P Flicek, TM Keane, T Hubbard, J Harrow, M Gerstein (2020). Nat Commun 11: 3695.

RADAR: annotation and prioritization of variants in the post-transcriptional regulome of RNA-binding proteins.
J Zhang, J Liu, D Lee, JJ Feng, L Lochovsky, S Lou, M Rutenberg-Schoenberg, M Gerstein (2020). Genome Biol 21: 151.

Supervised enhancer prediction with epigenetic pattern recognition and targeted validation.
A Sethi, M Gu, E Gumusgoz, L Chan, KK Yan, J Rozowsky, I Barozzi, V Afzal, JA Akiyama, I Plajzer-Frick, C Yan, CS Novak, M Kato, TH Garvin, Q Pham, A Harrington, BJ Mannion, EA Lee, Y Fukuda-Yuzawa, A Visel, DE Dickel, KY Yip, R Sutton, LA Pennacchio, M Gerstein (2020). Nat Methods 17: 807-814.

FANCY: fast estimation of privacy risk in functional genomics data.
G Gursoy, CM Brannon, FCP Navarro, M Gerstein (2020). Bioinformatics 36: 5145-5150.

Using blockchain to log genome dataset access: efficient storage and query.
G Gursoy, R Bjornson, ME Green, M Gerstein (2020). BMC Med Genomics 13: 78.

Using sigLASSO to optimize cancer mutation signatures jointly with sampling likelihood.
S Li, FW Crawford, MB Gerstein (2020). Nat Commun 11: 3575.

DiNeR: a Differential graphical model for analysis of co-regulation Network Rewiring.
J Zhang, J Liu, D Lee, S Lou, Z Chen, G Gursoy, M Gerstein (2020). BMC Bioinformatics 21: 281.

Approaches for integrating heterogeneous RNA-seq data reveal cross-talk between microbes and genes in asthmatic patients.
D Spakowicz, S Lou, B Barron, JL Gomez, T Li, Q Liu, N Grant, X Yan, R Hoyd, G Weinstock, GL Chupp, M Gerstein (2020). Genome Biol 21: 150.

TopicNet: a framework for measuring transcriptional regulatory network change.
S Lou, T Li, X Kong, J Zhang, J Liu, D Lee, M Gerstein (2020). Bioinformatics 36: i474-i481.

Epigenome-based splicing prediction using a recurrent neural network.
D Lee, J Zhang, J Liu, M Gerstein (2020). PLoS Comput Biol 16: e1008006.

Origins and characterization of variants shared between databases of somatic and germline human mutations.
W Meyerson, J Leisman, FCP Navarro, M Gerstein (2020). BMC Bioinformatics 21: 227.

Using Ethereum blockchain to store and query pharmacogenomics data via smart contracts.
G Gursoy, CM Brannon, M Gerstein (2020). BMC Med Genomics 13: 74.

The corrected gene proximity map for analyzing the 3D genome organization using Hi-C data.
C Ye, A Paccanaro, M Gerstein, KK Yan (2020). BMC Bioinformatics 21: 222.

Dermal Adipocyte Lipolysis and Myofibroblast Conversion Are Required for Efficient Skin Repair.
BA Shook, RR Wasko, O Mano, M Rutenberg-Schoenberg, MC Rudolph, B Zirak, GC Rivera-Gonzalez, F Lopez-Giraldez, S Zarini, A Rezza, DA Clark, M Rendl, MD Rosenblum, MB Gerstein, V Horsley (2020). Cell Stem Cell 26: 880-895e6.

Estimation of the carrier frequency of fumarate hydratase alterations and implications for kidney cancer risk in hereditary leiomyomatosis and renal cancer.
B Shuch, S Li, H Risch, RS Bindra, PD McGillivray, M Gerstein (2020). Cancer 126: 3657-3666.

Comparing Technological Development and Biological Evolution from a Network Perspective.
KK Yan, D Wang, K Xiong, M Gerstein (2020). Cell Syst 10: 219-222.

Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences.
S Kumar, J Warrell, S Li, PD McGillivray, W Meyerson, L Salichos, A Harmanci, A Martinez-Fundichely, CWY Chan, MM Nielsen, L Lochovsky, Y Zhang, X Li, S Lou, JS Pedersen, C Herrmann, G Getz, E Khurana, MB Gerstein (2020). Cell 180: 915-927e16.

The Development of a Practical Artificial Intelligence Tool for Diagnosing and Evaluating Autism Spectrum Disorder: Multicenter Study.
T Chen, Y Chen, M Yuan, M Gerstein, T Li, H Liang, T Froehlich, L Lu (2020). JMIR Med Inform 8: e15767.

Establishing a Global Standard for Wearable Devices in Sport and Fitness: Perspectives from the New England Chapter of the American College of Sports Medicine Members.
GI Ash, M Stults-Kolehmainen, MA Busa, R Gregory, CE Garber, J Liu, M Gerstein, JA Casajus, A Gonzalez-Aguero, D Constantinou, M Geistlinger, FM Guppy, F Pigozzi, YP Pitsiladis (2020). Curr Sports Med Rep 19: 45-49.

Pan-cancer analysis of whole genomes.
ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (2020). Nature 578: 82-93.

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes.
E Rheinbay, MM Nielsen, F Abascal, JA Wala, O Shapira, G Tiao, H Hornshoj, JM Hess, RI Juul, Z Lin, L Feuerbach, R Sabarinathan, T Madsen, J Kim, L Mularoni, S Shuai, A Lanzos, C Herrmann, YE Maruvka, C Shen, SB Amin, P Bandopadhayay, J Bertl, KA Boroevich, J Busanovich, J Carlevaro-Fita, D Chakravarty, CWY Chan, D Craft, P Dhingra, K Diamanti, NA Fonseca, A Gonzalez-Perez, Q Guo, MP Hamilton, NJ Haradhvala, C Hong, K Isaev, TA Johnson, M Juul, A Kahles, A Kahraman, Y Kim, J Komorowski, K Kumar, S Kumar, D Lee, KV Lehmann, Y Li, EM Liu, L Lochovsky, K Park, O Pich, ND Roberts, G Saksena, SE Schumacher, N Sidiropoulos, L Sieverling, N Sinnott-Armstrong, C Stewart, D Tamborero, JMC Tubio, HM Umer, L Uuskula-Reimand, C Wadelius, L Wadi, X Yao, CZ Zhang, J Zhang, JE Haber, A Hobolth, M Imielinski, M Kellis, MS Lawrence, C von Mering, H Nakagawa, BJ Raphael, MA Rubin, C Sander, LD Stein, JM Stuart, T Tsunoda, DA Wheeler, R Johnson, J Reimand, M Gerstein, E Khurana, PJ Campbell, N Lopez-Bigas, PCAWG Drivers and Functional Interpretation Working Group, PCAWG Structural Variation Working Group, J Weischenfeldt, R Beroukhim, I Martincorena, JS Pedersen, G Getz, PCAWG Consortium (2020). Nature 578: 102-111.

Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition.
B Rodriguez-Martin, EG Alvarez, A Baez-Ortega, J Zamora, F Supek, J Demeulemeester, M Santamarina, YS Ju, J Temes, D Garcia-Souto, H Detering, Y Li, J Rodriguez-Castro, A Dueso-Barroso, AL Bruzos, SC Dentro, MG Blanco, G Contino, D Ardeljan, M Tojo, ND Roberts, S Zumalave, PA Edwards, J Weischenfeldt, M Puiggros, Z Chong, K Chen, EA Lee, JA Wala, KM Raine, A Butler, SM Waszak, FCP Navarro, SE Schumacher, J Monlong, F Maura, N Bolli, G Bourque, M Gerstein, PJ Park, DC Wedge, R Beroukhim, D Torrents, JO Korbel, I Martincorena, RC Fitzgerald, P Van Loo, HH Kazazian, KH Burns, PCAWG Structural Variation Working Group, PJ Campbell, JMC Tubio, PCAWG Consortium (2020). Nat Genet 52: 306-319.

Estimating growth patterns and driver effects in tumor evolution from individual samples.
L Salichos, W Meyerson, J Warrell, M Gerstein (2020). Nat Commun 11: 732.

-- 2019 (19) --

Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis.
JS Johnson, DJ Spakowicz, BY Hong, LM Petersen, P Demkowicz, L Chen, SR Leopold, BM Hanson, HO Agresta, M Gerstein, E Sodergren, GM Weinstock (2019). Nat Commun 10: 5029.

Hierarchical PAC-Bayes Bounds via Deep Probabilistic Programming
J Warrell, MB Gerstein (2019). Bayesian Deep Learning Workshop at NeurIPS.

Pollen-derived RNAs Are Found in the Human Circulation.
M Koupenova, E Mick, HA Corkrey, A Singh, SE Tanriverdi, O Vitseva, D Levy, AM Keeler, M Ezzaty Mirhashemi, MK ElMallah, M Gerstein, J Rozowsky, K Tanriverdi, JE Freedman (2019). iScience 19: 916-926.

GRAM: A GeneRAlized Model to predict the molecular effect of a non-coding variant in a cell-type specific manner.
S Lou, KA Cotter, T Li, J Liang, H Mohsen, J Liu, J Zhang, S Cohen, J Xu, H Yu, MA Rubin, M Gerstein (2019). PLoS Genet 15: e1007860.

Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures.
S Kumar, D Clarke, MB Gerstein (2019). Proc Natl Acad Sci U S A 116: 18962-18970.

Sharing data
D Greenbaum, M Gerstein (2019). Science 365 (6455):764.

TeXP: Deconvolving the effects of pervasive and autonomous transcription of transposable elements.
FC Navarro, J Hoops, L Bellfy, E Cerveira, Q Zhu, C Zhang, C Lee, MB Gerstein (2019). PLoS Comput Biol 15: e1007293.

A Single-Cell Transcriptomic Atlas of Human Neocortical Development during Mid-gestation.
D Polioudakis, L de la Torre-Ubieta, J Langerman, AG Elkins, X Shi, JL Stein, CK Vuong, S Nichterwitz, M Gevorgian, CK Opland, D Lu, W Connell, EK Ruzzo, JK Lowe, T Hadzic, FI Hinz, S Sabri, WE Lowry, MB Gerstein, K Plath, DH Geschwind (2019). Neuron 103: 785-801e8.

Building a Hybrid Physical-Statistical Classifier for Predicting the Effect of Variants Related to Protein-Drug Interactions.
B Wang, C Yan, S Lou, P Emani, B Li, M Xu, X Kong, W Meyerson, YT Yang, D Lee, M Gerstein (2019). Structure 27: 1469-1481e3.

Genomics and data science: an application within an umbrella
FCP Navarro, H Mohsen, C Yan, S Li, M Gu, W Meyerson, M Gerstein (2019). Genome Biol 20: 109.

Multi-platform discovery of haplotype-resolved structural variation in human genomes.
MJP Chaisson, AD Sanders, X Zhao, A Malhotra, D Porubsky, T Rausch, EJ Gardner, OL Rodriguez, L Guo, RL Collins, X Fan, J Wen, RE Handsaker, S Fairley, ZN Kronenberg, X Kong, F Hormozdiari, D Lee, AM Wenger, AR Hastie, D Antaki, T Anantharaman, PA Audano, H Brand, S Cantsilieris, H Cao, E Cerveira, C Chen, X Chen, CS Chin, Z Chong, NT Chuang, CC Lambert, DM Church, L Clarke, A Farrell, J Flores, T Galeev, DU Gorkin, M Gujral, V Guryev, WH Heaton, J Korlach, S Kumar, JY Kwon, ET Lam, JE Lee, J Lee, WP Lee, SP Lee, S Li, P Marks, K Viaud-Martinez, S Meiers, KM Munson, FCP Navarro, BJ Nelson, C Nodzak, A Noor, S Kyriazopoulou-Panagiotopoulou, AWC Pang, Y Qiu, G Rosanio, M Ryan, A Stutz, DCJ Spierings, A Ward, AE Welch, M Xiao, W Xu, C Zhang, Q Zhu, X Zheng-Bradley, E Lowy, S Yakneen, S McCarroll, G Jun, L Ding, CL Koh, B Ren, P Flicek, K Chen, MB Gerstein, PY Kwok, PM Lansdorp, GT Marth, J Sebat, X Shi, A Bashir, K Ye, SE Devine, ME Talkowski, RE Mills, T Marschall, JO Korbel, EE Eichler, C Lee (2019). Nat Commun 10: 1784.

exceRpt: A Comprehensive Analytic Platform for Extracellular RNA Profiling.
J Rozowsky, RR Kitchen, JJ Park, TR Galeev, J Diao, J Warrell, W Thistlethwaite, SL Subramanian, A Milosavljevic, M Gerstein (2019). Cell Syst 8: 352-357e3.

exRNA Atlas Analysis Reveals Distinct Extracellular RNA Cargo Types and Their Carriers Present across Human Biofluids.
OD Murillo, W Thistlethwaite, J Rozowsky, SL Subramanian, R Lucero, N Shah, AR Jackson, S Srinivasan, A Chung, CD Laurent, RR Kitchen, T Galeev, J Warrell, JA Diao, JA Welsh, K Hanspers, A Riutta, S Burgstaller-Muehlbacher, RV Shah, A Yeri, LM Jenkins, ME Ahsen, C Cordon-Cardo, N Dogra, SM Gifford, JT Smith, G Stolovitzky, AK Tewari, BH Wunsch, KK Yadav, KM Danielson, J Filant, C Moeller, P Nejad, A Paul, B Simonson, DK Wong, X Zhang, L Balaj, R Gandhi, AK Sood, RP Alexander, L Wang, C Wu, DTW Wong, DJ Galas, K Van Keuren-Jensen, T Patel, JC Jones, S Das, KH Cheung, AR Pico, AI Su, RL Raffai, LC Laurent, ME Roth, MB Gerstein, A Milosavljevic (2019). Cell 177: 463-477e15.

The Extracellular RNA Communication Consortium: Establishing Foundational Knowledge and Technologies for Extracellular RNA Research.
S Das, Extracellular RNA Communication Consortium, KM Ansel, M Bitzer, XO Breakefield, A Charest, DJ Galas, MB Gerstein, M Gupta, A Milosavljevic, MT McManus, T Patel, RL Raffai, J Rozowsky, ME Roth, JA Saugstad, K Van Keuren-Jensen, AM Weaver, LC Laurent (2019). Cell 177: 231-242.

Shaping the nebulous enhancer in the era of high-throughput assays and genome editing.
EY Ho, Q Cao, M Gu, RW Chan, Q Wu, M Gerstein, KY Yip (2019). Brief Bioinform 21: 836-850.

Measuring the reproducibility and quality of Hi-C data.
GG Yardimci, H Ozadam, MEG Sauria, O Ursu, KK Yan, T Yang, A Chakraborty, A Kaul, BR Lajoie, F Song, Y Zhan, F Ay, M Gerstein, A Kundaje, Q Li, J Taylor, F Yue, J Dekker, WS Noble (2019). Genome Biol 20: 57.

MicroRNA-dependent regulation of biomechanical genes establishes tissue stiffness homeostasis.
A Moro, TP Driscoll, LC Boraas, W Armero, DM Kasper, N Baeyens, C Jouy, V Mallikarjun, J Swift, SJ Ahn, D Lee, J Zhang, M Gu, M Gerstein, M Schwartz, S Nicoli (2019). Nat Cell Biol 21: 348-358.

Insights into genetics, human biology and disease gleaned from family based genomic studies.
JE Posey, AH O'Donnell-Luria, JX Chong, T Harel, SN Jhangiani, ZH Coban Akdemir, S Buyske, D Pehlivan, CMB Carvalho, S Baxter, N Sobreira, P Liu, N Wu, JA Rosenfeld, S Kumar, D Avramopoulos, JJ White, KF Doheny, PD Witmer, C Boehm, VR Sutton, DM Muzny, E Boerwinkle, M Gunel, DA Nickerson, S Mane, DG MacArthur, RA Gibbs, A Hamosh, RP Lifton, TC Matise, HL Rehm, M Gerstein, MJ Bamshad, D Valle, JR Lupski, Centers for Mendelian Genomics (2019). Genet Med 21: 798-812.

Next-Generation Sequencing to Diagnose Suspected Genetic Disorders.
S Li, MB Gerstein (2019). N Engl J Med 380: 200.

-- 2018 (25) --

Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types
J Warrell, MB Gerstein (2018). Uncertainty in Artificial Intelligence Workshop on Uncertainty in Deep Learning.

Rank Projection Trees for Multilevel Neural Network Interpretation
J Warrell, H Mohsen, MB Gerstein (2018). NeurIPS Workshop on Machine Learning for Health.

Revealing the brain's molecular architecture.
PsychENCODE Consortium (2018). Science 362: 1262-1263.

Comprehensive functional genomic resource and integrative model for the human brain.
D Wang, S Liu, J Warrell, H Won, X Shi, FCP Navarro, D Clarke, M Gu, P Emani, YT Yang, M Xu, MJ Gandal, S Lou, J Zhang, JJ Park, C Yan, SK Rhie, K Manakongtreecheep, H Zhou, A Nathan, M Peters, E Mattei, D Fitzgerald, T Brunetti, J Moore, Y Jiang, K Girdhar, GE Hoffman, S Kalayci, ZH Gumus, GE Crawford, PsychENCODE Consortium, P Roussos, S Akbarian, AE Jaffe, KP White, Z Weng, N Sestan, DH Geschwind, JA Knowles, MB Gerstein (2018). Science 362.

Transcriptome-wide isoform-level dysregulation in ASD, schizophrenia, and bipolar disorder.
MJ Gandal, P Zhang, E Hadjimichael, RL Walker, C Chen, S Liu, H Won, H van Bakel, M Varghese, Y Wang, AW Shieh, J Haney, S Parhami, J Belmont, M Kim, P Moran Losada, Z Khan, J Mleczko, Y Xia, R Dai, D Wang, YT Yang, M Xu, K Fish, PR Hof, J Warrell, D Fitzgerald, K White, AE Jaffe, PsychENCODE Consortium, MA Peters, M Gerstein, C Liu, LM Iakoucheva, D Pinto, DH Geschwind (2018). Science 362.

Integrative functional genomic analysis of human brain development and neuropsychiatric risks.
M Li, G Santpere, Y Imamura Kawasawa, OV Evgrafov, FO Gulden, S Pochareddy, SM Sunkin, Z Li, Y Shin, Y Zhu, AMM Sousa, DM Werling, RR Kitchen, HJ Kang, M Pletikos, J Choi, S Muchnik, X Xu, D Wang, B Lorente-Galdos, S Liu, P Giusti-Rodriguez, H Won, CA de Leeuw, AF Pardinas, BrainSpan Consortium, PsychENCODE Consortium, PsychENCODE Developmental Subgroup, M Hu, F Jin, Y Li, MJ Owen, MC O'Donovan, JTR Walters, D Posthuma, MA Reimers, P Levitt, DR Weinberger, TM Hyde, JE Kleinman, DH Geschwind, MJ Hawrylycz, MW State, SJ Sanders, PF Sullivan, MB Gerstein, ES Lein, JA Knowles, N Sestan (2018). Science 362.

Transcriptome and epigenome landscape of human cortical development modeled in organoids.
A Amiri, G Coppola, S Scuderi, F Wu, T Roychowdhury, F Liu, S Pochareddy, Y Shin, A Safi, L Song, Y Zhu, AMM Sousa, PsychENCODE Consortium, M Gerstein, GE Crawford, N Sestan, A Abyzov, FM Vaccarino (2018). Science 362.

Text mining systems biology: Turning the microscope back on the observer
X Kong, M Gerstein (2018). Current Opinion in Systems Biology 11:117-122.

Reliability of Whole-Exome Sequencing for Assessing Intratumor Genetic Heterogeneity.
W Shi, CKY Ng, RS Lim, T Jiang, S Kumar, X Li, VB Wali, S Piscuoglio, MB Gerstein, AB Chagpar, B Weigelt, L Pusztai, JS Reis-Filho, C Hatzis (2018). Cell Rep 25: 1446-1457.

GENCODE reference annotation for the human and mouse genomes.
A Frankish, M Diekhans, AM Ferreira, R Johnson, I Jungreis, J Loveland, JM Mudge, C Sisu, J Wright, J Armstrong, I Barnes, A Berry, A Bignell, S Carbonell Sala, J Chrast, F Cunningham, T Di Domenico, S Donaldson, IT Fiddes, C Garcia Giron, JM Gonzalez, T Grego, M Hardy, T Hourlier, T Hunt, OG Izuogu, J Lagarde, FJ Martin, L Martinez, S Mohanan, P Muir, FCP Navarro, A Parker, B Pei, F Pozo, M Ruffier, BM Schmitt, E Stapleton, MM Suner, I Sycheva, B Uszczynska-Ratajczak, J Xu, A Yates, D Zerbino, Y Zhang, B Aken, JS Choudhary, M Gerstein, R Guigo, TJP Hubbard, M Kellis, B Paten, A Reymond, ML Tress, P Flicek (2018). Nucleic Acids Res 47: D766-D773.

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.
J Lilue, AG Doran, IT Fiddes, M Abrudan, J Armstrong, R Bennett, W Chow, J Collins, S Collins, A Czechanski, P Danecek, M Diekhans, DD Dolle, M Dunn, R Durbin, D Earl, A Ferguson-Smith, P Flicek, J Flint, A Frankish, B Fu, M Gerstein, J Gilbert, L Goodstadt, J Harrow, K Howe, X Ibarra-Soria, M Kolmogorov, CJ Lelliott, DW Logan, J Loveland, CE Mathews, R Mott, P Muir, S Nachtweide, FCP Navarro, DT Odom, N Park, S Pelan, SK Pham, M Quail, L Reinholdt, L Romoth, L Shirley, C Sisu, M Sjoberg-Herrera, M Stanke, C Steward, M Thomas, G Threadgold, D Thybert, J Torrance, K Wong, J Wood, B Yalcin, F Yang, DJ Adams, B Paten, TM Keane (2018). Nat Genet 50: 1574-1583.

What’s next for humanity?
D Greenbaum, M Gerstein (2018). Science 362 (6415):648.

Human History, Human Genomes
D Greenbaum, M Gerstein (2018). Cell 174:1043-1044.

Allele-specific epigenome maps reveal sequence-dependent stochastic switching at regulatory loci.
V Onuchic, E Lurie, I Carrero, P Pawliczek, RY Patel, J Rozowsky, T Galeev, Z Huang, RC Altshuler, Z Zhang, RA Harris, C Coarfa, L Ashmore, JW Bertol, WD Fakhouri, F Yu, M Kellis, M Gerstein, A Milosavljevic (2018). Science 361.

Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq.
BC Carlyle, RR Kitchen, J Zhang, RS Wilson, TT Lam, JS Rozowsky, KR Williams, N Sestan, MB Gerstein, AC Nairn (2018). J Proteome Res 17: 3431-3444.

KBase: The United States Department of Energy Systems Biology Knowledgebase.
AP Arkin, RW Cottingham, CS Henry, NL Harris, RL Stevens, S Maslov, P Dehal, D Ware, F Perez, S Canon, MW Sneddon, ML Henderson, WJ Riehl, D Murphy-Olson, SY Chan, RT Kamimura, S Kumari, MM Drake, TS Brettin, EM Glass, D Chivian, D Gunter, DJ Weston, BH Allen, J Baumohl, AA Best, B Bowen, SE Brenner, CC Bun, JM Chandonia, JM Chia, R Colasanti, N Conrad, JJ Davis, BH Davison, M DeJongh, S Devoid, E Dietrich, I Dubchak, JN Edirisinghe, G Fang, JP Faria, PM Frybarger, W Gerlach, M Gerstein, A Greiner, J Gurtowski, HL Haun, F He, R Jain, MP Joachimiak, KP Keegan, S Kondo, V Kumar, ML Land, F Meyer, M Mills, PS Novichkov, T Oh, GJ Olsen, R Olson, B Parrello, S Pasternak, E Pearson, SS Poon, GA Price, S Ramakrishnan, P Ranjan, PC Ronald, MC Schatz, SMD Seaver, M Shukla, RA Sutormin, MH Syed, J Thomason, NL Tintle, D Wang, F Xia, H Yoo, S Yoo, D Yu (2018). Nat Biotechnol 36: 566-569.

Analysis of sensitive information leakage in functional genomics signal profiles through genomic deletions.
A Harmanci, M Gerstein (2018). Nat Commun 9: 2453.

Encoding human serine phosphopeptides in bacteria for proteome-wide identification of phosphorylation-dependent interactions.
KW Barber, P Muir, RV Szeligowski, S Rogulina, M Gerstein, JR Sampson, FJ Isaacs, J Rinehart (2018). Nat Biotechnol 36: 638-644.

Network Analysis as a Grand Unifier in Biomedical Data Science
P McGillivray, D Clarke, W Meyerson, J Zhang, D Lee, M Gu, S Kumar, H Zhou, MB Gerstein (2018). Annual Review of Biomedical Data Science Vol. 1.

Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli
D Thybert, M Roller, FCP Navarro, I Fiddes, I Streeter, C Feig, D Martin-Galvez, M Kolmogorov, V Janousek, W Akanni, B Aken, S Aldridge, V Chakrapani, W Chow, L Clarke, C Cummins, A Doran, M Dunn, L Goodstadt, K Howe, M Howell, AA Josselin, RC Karn, CM Laukaitis, L Jingtao, F Martin, M Muffato, S Nachtweide, MA Quail, C Sisu, M Stanke, K Stefflova, C Van Oosterhout, F Veyrunes, B Ward, F Yang, G Yazdanifar, A Zadissa, DJ Adams, A Brazma, M Gerstein, B Paten, S Pham, TM Keane, DT Odom, P Flicek (2018). Genome Res 28: 448-459.

A comprehensive catalog of predicted functional upstream open reading frames in humans.
P McGillivray, R Ault, M Pawashe, R Kitchen, S Balasubramanian, M Gerstein (2018). Nucleic Acids Res 46: 3326-3338.

FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods.
T Becker, WP Lee, J Leone, Q Zhu, C Zhang, S Liu, J Sargent, K Shanker, A Mil-Homens, E Cerveira, M Ryan, J Cha, FCP Navarro, T Galeev, M Gerstein, RE Mills, DG Shin, C Lee, A Malhotra (2018). Genome Biol 19: 38.

Shared molecular neuropathology across major psychiatric disorders parallels polygenic overlap.
MJ Gandal, JR Haney, NN Parikshak, V Leppa, G Ramaswami, C Hartl, AJ Schork, V Appadurai, A Buil, TM Werge, C Liu, KP White, CommonMind Consortium, PsychENCODE Consortium, iPSYCH-BROAD Working Group, S Horvath, DH Geschwind (2018). Science 359: 693-697.

Gene names can confound most-searched listings
MB Gerstein, FCP Navarro (2018). Nature 553: 405.

Integrative Personal Omics Profiles during Periods of Weight Gain and Loss.
BD Piening, W Zhou, K Contrepois, H Rost, GJ Gu Urban, T Mishra, BM Hanson, EJ Bautista, S Leopold, CY Yeh, D Spakowicz, I Banerjee, C Chen, K Kukurba, D Perelman, C Craig, E Colbert, D Salins, S Rego, S Lee, C Zhang, J Wheeler, MR Sailani, L Liang, C Abbott, M Gerstein, A Mardinoglu, U Smith, DL Rubin, S Pitteri, E Sodergren, TL McLaughlin, GM Weinstock, MP Snyder (2018). Cell Syst 6: 157-170e8.

-- 2017 (20) --

Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.
KE Kaczor-Urbanowicz, Y Kim, F Li, T Galeev, RR Kitchen, M Gerstein, K Koyano, SH Jeong, X Wang, D Elashoff, SY Kang, SM Kim, K Kim, S Kim, D Chia, X Xiao, J Rozowsky, DTW Wong (2017). Bioinformatics 34: 1-8.

The ModERN Resource: Genome-Wide Binding Profiles for Hundreds of Drosophila
MM Kudron, A Victorsen, L Gevirtzman, LW Hillier, WW Fisher, D Vafeados, M Kirkey, AS Hammonds, J Gersch, H Ammouri, ML Wall, J Moran, D Steffen, M Szynkarek, S Seabrook-Sturgis, N Jameel, M Kadaba, J Patton, R Terrell, M Corson, TJ Durham, S Park, S Samanta, M Han, J Xu, KK Yan, SE Celniker, KP White, L Ma, M Gerstein, V Reinke, RH Waterston (2017). Genetics 208: 937-949.

A multiregional proteomic survey of the postnatal human brain.
BC Carlyle, RR Kitchen, JE Kanyo, EZ Voss, M Pletikos, AMM Sousa, TT Lam, MB Gerstein, N Sestan, AC Nairn (2017). Nat Neurosci 20: 1787-1795.

Molecular and cellular reorganization of neural circuits in the human lineage.
AMM Sousa, Y Zhu, MA Raghanti, RR Kitchen, M Onorati, ATN Tebbenkamp, B Stutz, KA Meyer, M Li, YI Kawasawa, F Liu, RG Perez, M Mele, T Carvalho, M Skarica, FO Gulden, M Pletikos, A Shibata, AR Stephenson, MK Edler, JJ Ely, JD Elsworth, TL Horvath, PR Hof, TM Hyde, JE Kleinman, DR Weinberger, M Reimers, RP Lifton, SM Mane, JP Noonan, MW State, ES Lein, JA Knowles, T Marques-Bonet, CC Sherwood, MB Gerstein, N Sestan (2017). Science 358: 1027-1032.

MOAT: efficient detection of highly mutated regions with the Mutations Overburdening Annotations Tool.
L Lochovsky, J Zhang, M Gerstein (2017). Bioinformatics 34: 1031-1033.

Reconstruction of enhancer-target networks in 935 samples of human primary cells, tissues and cell lines.
Q Cao, C Anyansi, X Hu, L Xu, L Xiong, W Tang, MTS Mok, C Cheng, X Fan, M Gerstein, ASL Cheng, KY Yip (2017). Nat Genet 49: 1428-1436.

Using ALoFT to determine the impact of putative loss-of-function variants in protein-coding genes.
S Balasubramanian, Y Fu, M Pawashe, P McGillivray, M Jin, J Liu, KJ Karczewski, DG MacArthur, M Gerstein (2017). Nat Commun 8: 382.

MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions.
KK Yan, S Lou, M Gerstein (2017). PLoS Comput Biol 13: e1005647.

Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis.
SME Sahraeian, M Mohiyuddin, R Sebra, H Tilgner, PT Afshar, KF Au, N Bani Asadi, MB Gerstein, WH Wong, MP Snyder, E Schadt, HYK Lam (2017). Nat Commun 8: 59.

Landscape and variation of novel retroduplications in 26 human populations.
Y Zhang, S Li, A Abyzov, MB Gerstein (2017). PLoS Comput Biol 13: e1005567.

Cancer genomics: Less is more in the hunt for driver mutations.
S Kumar, M Gerstein (2017). Nature 547: 40-41.

Using FunSeq2 for Coding and Non-Coding Variant Annotation and Prioritization.
P Dhingra, Y Fu, M Gerstein, E Khurana (2017). Curr Protoc Bioinformatics 57: 15.11.1-15.11.17.

Multiple-Swarm Ensembles: Improving the Predictive Power and Robustness of Predictive Models and Its Use in Computational Biology.
P Alves, S Liu, D Wang, M Gerstein (2017). IEEE/ACM Trans Comput Biol Bioinform 15: 926-933.

Dynamic RNA-protein interactions underlie the zebrafish maternal-to-zygotic transition.
V Despic, M Dejung, M Gu, J Krishnan, J Zhang, L Herzel, K Straube, MB Gerstein, F Butter, KM Neugebauer (2017). Genome Res 27: 1184-1194.

Structuring supplemental materials in support of reproducibility
D Greenbaum, J Rozowsky, V Stodden, M Gerstein (2017). Genome Biol 18: 64.

HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps.
KK Yan, GG Yardimci, C Yan, WS Noble, M Gerstein (2017). Bioinformatics 33: 2199-2201.

Whole-genome analysis of papillary kidney cancer finds significant noncoding alterations.
S Li, BM Shuch, MB Gerstein (2017). PLoS Genet 13: e1006685.

MicroRNAs Establish Uniform Traits during the Architecture of Vertebrate Embryos.
DM Kasper, A Moro, E Ristori, A Narayanan, G Hill-Teran, E Fleming, M Moreno-Mateos, CE Vejnar, J Zhang, D Lee, M Gu, M Gerstein, A Giraldez, S Nicoli (2017). Dev Cell 40: 552-565e5.

Stroke and Circulating Extracellular RNAs.
E Mick, R Shah, K Tanriverdi, V Murthy, M Gerstein, J Rozowsky, R Kitchen, MG Larson, D Levy, JE Freedman (2017). Stroke 48: 828-834.

One thousand somatic SNVs per skin fibroblast cell set baseline of mosaic mutational load with patterns that suggest proliferative origin.
A Abyzov, L Tomasini, B Zhou, N Vasmatzis, G Coppola, M Amenduni, R Pattni, M Wilson, M Gerstein, S Weissman, AE Urban, FM Vaccarino (2017). Genome Res 27: 512-523.

-- 2016 (19) --

Intensification: A Resource for Amplifying Population-Genetic Signals with Protein Repeats.
J Chen, B Wang, L Regan, M Gerstein (2016). J Mol Biol 429: 435-445.

Localized structural frustration for evaluating the impact of sequence variants.
S Kumar, D Clarke, M Gerstein (2016). Nucleic Acids Res 44: 10062-10073.

DREISS: Using State-Space Models to Infer the Dynamics of Gene Expression Driven by External and Internal Regulatory Networks.
D Wang, F He, S Maslov, M Gerstein (2016). PLoS Comput Biol 12: e1005146.

Opinion: GMOs Are Not “Frankenfoods”
D Greenbaum, M Gerstein (2016). The Scientist (30 Aug).

iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.
CC Yang, EH Andrews, MH Chen, WY Wang, JJ Chen, M Gerstein, CC Liu, C Cheng (2016). BMC Genomics 17: 632.

Pangolin genomes and the evolution of mammalian scales and immunity.
SW Choo, M Rayko, TK Tan, R Hari, A Komissarov, WY Wee, AA Yurchenko, S Kliver, G Tamazian, A Antunes, RK Wilson, WC Warren, KP Koepfli, P Minx, K Krasheninnikova, A Kotze, DL Dalton, E Vermaak, IC Paterson, P Dobrynin, FT Sitam, JJ Rovie-Ryan, WE Johnson, AM Yusoff, SJ Luo, KV Karuppannan, G Fang, D Zheng, MB Gerstein, L Lipovich, SJ O'Brien, GJ Wong (2016). Genome Res 26: 1312-1322.

Discordant Expression of Circulating microRNA from Cellular and Extracellular Sources.
R Shah, K Tanriverdi, D Levy, M Larson, M Gerstein, E Mick, J Rozowsky, R Kitchen, V Murthy, E Mikalev, JE Freedman (2016). PLoS One 11: e0153691.

Diverse human extracellular RNAs are widely detected in human plasma.
JE Freedman, M Gerstein, E Mick, J Rozowsky, D Levy, R Kitchen, S Das, R Shah, K Danielson, L Beaulieu, FC Navarro, Y Wang, TR Galeev, A Holman, RY Kwong, V Murthy, SE Tanriverdi, M Koupenova-Zamor, E Mikhalev, K Tanriverdi (2016). Nat Commun 7: 11106.

Extending gene ontology in the context of extracellular RNA and vesicle communication.
KH Cheung, S Keerthikumar, P Roncaglia, SL Subramanian, ME Roth, M Samuel, S Anand, L Gangoda, S Gould, R Alexander, D Galas, MB Gerstein, AF Hill, RR Kitchen, J Lotvall, T Patel, DC Procaccini, P Quesenberry, J Rozowsky, RL Raffai, A Shypitsyna, AI Su, C Thery, K Vickers, MH Wauben, S Mathivanan, A Milosavljevic, LC Laurent (2016). J Biomed Semantics 7: 19.

A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.
J Chen, J Rozowsky, TR Galeev, A Harmanci, R Kitchen, J Bedford, A Abyzov, Y Kong, L Regan, M Gerstein (2016). Nat Commun 7: 11101.

Identifying Allosteric Hotspots with Dynamics: Application to Inter- and Intra-species Conservation.
D Clarke, A Sethi, S Li, S Kumar, RWF Chang, J Chen, M Gerstein (2016). Structure 24: 826-837.

Cross-Disciplinary Network Comparison: Matchmaking Between Hairballs
KK Yan, D Wang, A Sethi, P Muir, R Kitchen, C Cheng, M Gerstein (2016). Cell Syst 2: 147-157.

Who Owns Your DNA?
D Greenbaum, M Gerstein (2016). Cell 165:257-258.

Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis.
F He, S Yoo, D Wang, S Kumari, M Gerstein, D Ware, S Maslov (2016). Plant J 86: 472-80.

The real cost of sequencing: scaling computation to keep pace with data generation
P Muir, S Li, S Lou, D Wang, DJ Spakowicz, L Salichos, J Zhang, GM Weinstock, F Isaacs, J Rozowsky, M Gerstein (2016). Genome Biol 17: 53.

Temporal Dynamics of Collaborative Networks in Large Scientific Consortia
D Wang, KK Yan, J Rozowsky, E Pan, M Gerstein (2016). Trends Genet 32: 251-253.

Quantification of private information leakage from phenotype-genotype data: linking attacks
A Harmanci, M Gerstein (2016). Nat Methods 13: 251-6.

Role of non-coding sequence variants in cancer.
E Khurana, Y Fu, D Chakravarty, F Demichelis, MA Rubin, M Gerstein (2016). Nat Rev Genet 17: 93-108.

Understanding genome structural variations.
A Abyzov, S Li, MB Gerstein (2016). Oncotarget 7: 7370-1.

-- 2015 (19) --

Knowledge Based Factorized High Order Sparse Learning Models
S Purushotham, MR Min, CJ Kuo, M Gerstein (2015). NIPS Workshop on Machine Learning in Computational Biology.

The Molecular Taxonomy of Primary Prostate Cancer.
Cancer Genome Atlas Research Network (2015). Cell 163: 1011-25.

Reads meet rotamers: structural biology in the age of deep sequencing.
A Sethi, D Clarke, J Chen, S Kumar, TR Galeev, L Regan, M Gerstein (2015). Curr Opin Struct Biol 35: 125-34.

The PsychENCODE project.
PsychENCODE Consortium, S Akbarian, C Liu, JA Knowles, FM Vaccarino, PJ Farnham, GE Crawford, AE Jaffe, D Pinto, S Dracheva, DH Geschwind, J Mill, AC Nairn, A Abyzov, S Pochareddy, S Prabhakar, S Weissman, PF Sullivan, MW State, Z Weng, MA Peters, KP White, MB Gerstein, A Amiri, C Armoskus, AE Ashley-Koch, T Bae, A Beckel-Mitchener, BP Berman, GA Coetzee, G Coppola, N Francoeur, M Fromer, R Gao, K Grennan, J Herstein, DH Kavanagh, NA Ivanov, Y Jiang, RR Kitchen, A Kozlenkov, M Kundakovic, M Li, Z Li, S Liu, LM Mangravite, E Mattei, E Markenscoff-Papadimitriou, FC Navarro, N North, L Omberg, D Panchision, N Parikshak, J Poschmann, AJ Price, M Purcaro, TE Reddy, P Roussos, S Schreiner, S Scuderi, R Sebra, M Shibata, AW Shieh, M Skarica, W Sun, V Swarup, A Thomas, J Tsuji, H van Bakel, D Wang, Y Wang, K Wang, DM Werling, AJ Willsey, H Witt, H Won, CC Wong, GA Wray, EY Wu, X Xu, L Yao, G Senthil, T Lehner, P Sklar, N Sestan (2015). Nat Neurosci 18: 1707-12.

Illuminating the Genome's Dark Matter
D Greenbaum, M Gerstein (2015). Cell 163: 1047-1048.

Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma.
Cancer Genome Atlas Research Network, WM Linehan, PT Spellman, CJ Ricketts, CJ Creighton, SS Fei, C Davis, DA Wheeler, BA Murray, L Schmidt, CD Vocke, M Peto, AA Al Mamun, E Shinbrot, A Sethi, S Brooks, WK Rathmell, AN Brooks, KA Hoadley, AG Robertson, D Brooks, R Bowlby, S Sadeghi, H Shen, DJ Weisenberger, M Bootwalla, SB Baylin, PW Laird, AD Cherniack, G Saksena, S Haake, J Li, H Liang, Y Lu, GB Mills, R Akbani, MD Leiserson, BJ Raphael, P Anur, D Bottaro, L Albiges, N Barnabas, TK Choueiri, B Czerniak, AK Godwin, AA Hakimi, TH Ho, J Hsieh, M Ittmann, WY Kim, B Krishnan, MJ Merino, KR Mills Shaw, VE Reuter, E Reznik, CS Shelley, B Shuch, S Signoretti, R Srinivasan, P Tamboli, G Thomas, S Tickoo, K Burnett, D Crain, J Gardner, K Lau, D Mallery, S Morris, JD Paulauskis, RJ Penny, C Shelton, WT Shelton, M Sherman, E Thompson, P Yena, MT Avedon, J Bowen, JM Gastier-Foster, M Gerken, KM Leraas, TM Lichtenberg, NC Ramirez, T Santos, L Wise, E Zmuda, JA Demchok, I Felau, CM Hutter, M Sheth, HJ Sofia, R Tarnuzzer, Z Wang, L Yang, JC Zenklusen, J Zhang, B Ayala, J Baboud, S Chudamani, J Liu, L Lolla, R Naresh, T Pihl, Q Sun, Y Wan, Y Wu, A Ally, M Balasundaram, S Balu, R Beroukhim, T Bodenheimer, C Buhay, YS Butterfield, R Carlsen, SL Carter, H Chao, E Chuah, A Clarke, KR Covington, M Dahdouli, N Dewal, N Dhalla, HV Doddapaneni, JA Drummond, SB Gabriel, RA Gibbs, R Guin, W Hale, A Hawes, DN Hayes, RA Holt, AP Hoyle, SR Jefferys, SJ Jones, CD Jones, D Kalra, C Kovar, L Lewis, J Li, Y Ma, MA Marra, M Mayo, S Meng, M Meyerson, PA Mieczkowski, RA Moore, D Morton, LE Mose, AJ Mungall, D Muzny, JS Parker, CM Perou, J Roach, JE Schein, SE Schumacher, Y Shi, JV Simons, P Sipahimalani, T Skelly, MG Soloway, C Sougnez, A Tam, D Tan, N Thiessen, U Veluvolu, M Wang, MD Wilkerson, T Wong, J Wu, L Xi, J Zhou, J Bedford, F Chen, Y Fu, M Gerstein, D Haussler, K Kasaian, P Lai, S Ling, A Radenbaugh, D Van Den Berg, JN Weinstein, J Zhu, M Albert, I Alexopoulou, JJ Andersen, JT Auman, J Bartlett, S Bastacky, J Bergsten, ML Blute, L Boice, RJ Bollag, J Boyd, E Castle, YB Chen, JC Cheville, E Curley, B Davies, A DeVolk, R Dhir, L Dike, J Eckman, J Engel, J Harr, R Hrebinko, M Huang, L Huelsenbeck-Dill, M Iacocca, B Jacobs, M Lobis, JK Maranchie, S McMeekin, J Myers, J Nelson, J Parfitt, A Parwani, N Petrelli, B Rabeno, S Roy, AL Salner, J Slaton, M Stanton, RH Thompson, L Thorne, K Tucker, PM Weinberger, C Winemiller, LA Zach, R Zuna (2015). N Engl J Med 374: 135-45.

A global reference for human genetic variation.
1000 Genomes Project Consortium, A Auton, LD Brooks, RM Durbin, EP Garrison, HM Kang, JO Korbel, JL Marchini, S McCarthy, GA McVean, GR Abecasis (2015). Nature 526: 68-74.

An integrated map of structural variation in 2,504 human genomes.
PH Sudmant, T Rausch, EJ Gardner, RE Handsaker, A Abyzov, J Huddleston, Y Zhang, K Ye, G Jun, MH Fritz, MK Konkel, A Malhotra, AM Stutz, X Shi, FP Casale, J Chen, F Hormozdiari, G Dayama, K Chen, M Malig, MJP Chaisson, K Walter, S Meiers, S Kashin, E Garrison, A Auton, HYK Lam, XJ Mu, C Alkan, D Antaki, T Bae, E Cerveira, P Chines, Z Chong, L Clarke, E Dal, L Ding, S Emery, X Fan, M Gujral, F Kahveci, JM Kidd, Y Kong, EW Lameijer, S McCarthy, P Flicek, RA Gibbs, G Marth, CE Mason, A Menelaou, DM Muzny, BJ Nelson, A Noor, NF Parrish, M Pendleton, A Quitadamo, B Raeder, EE Schadt, M Romanovitch, A Schlattl, R Sebra, AA Shabalin, A Untergasser, JA Walker, M Wang, F Yu, C Zhang, J Zhang, X Zheng-Bradley, W Zhou, T Zichner, J Sebat, MA Batzer, SA McCarroll, 1000 Genomes Project Consortium, RE Mills, MB Gerstein, A Bashir, O Stegle, SE Devine, C Lee, EE Eichler, JO Korbel (2015). Nature 526: 75-81.

Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods.
JC Mu, P Tootoonchi Afshar, M Mohiyuddin, X Chen, J Li, N Bani Asadi, MB Gerstein, WH Wong, HY Lam (2015). Sci Rep 5: 14493.

An ensemble approach to accurately detect somatic mutations using SomaticSeq.
LT Fang, PT Afshar, A Chhibber, M Mohiyuddin, Y Fan, JC Mu, G Gibeling, S Barr, NB Asadi, MB Gerstein, DC Koboldt, W Wang, WH Wong, HY Lam (2015). Genome Biol 16: 197.

Tracking Distinct RNA Populations Using Efficient and Reversible Covalent Chemistry.
EE Duffy, M Rutenberg-Schoenberg, CD Stark, RR Kitchen, MB Gerstein, MD Simon (2015). Mol Cell 59: 858-66.

LARVA: an integrative framework for large-scale analysis of recurrent variants in noncoding annotations.
L Lochovsky, J Zhang, Y Fu, E Khurana, M Gerstein (2015). Nucleic Acids Res 43: 8123-34.

High-order neural networks and kernel methods for peptide-MHC binding prediction.
PP Kuksa, MR Min, R Dugar, M Gerstein (2015). Bioinformatics 31: 3600-7.

FOXG1-Dependent Dysregulation of GABA/Glutamate Neuron Differentiation in Autism Spectrum Disorders.
J Mariani, G Coppola, P Zhang, A Abyzov, L Provini, L Tomasini, M Amenduni, A Szekely, D Palejev, M Wilson, M Gerstein, EL Grigorenko, K Chawarska, KA Pelphrey, JR Howe, FM Vaccarino (2015). Cell 162: 375-390.

Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms.
A Abyzov, S Li, DR Kim, M Mohiyuddin, AM Stutz, NF Parrish, XJ Mu, W Clark, K Chen, M Hurles, JO Korbel, HY Lam, C Lee, MB Gerstein (2015). Nat Commun 6: 7256.

The computer connection
D Greenbaum, M Gerstein (2015). Science 347: 956.

MetaSV: an accurate and integrative structural-variant caller for next generation sequencing.
M Mohiyuddin, JC Mu, J Li, N Bani Asadi, MB Gerstein, A Abyzov, WH Wong, HY Lam (2015). Bioinformatics 31: 2741-4.

Loregic: a method to characterize the cooperative logic of regulatory factors.
D Wang, KK Yan, C Sisu, C Cheng, J Rozowsky, W Meyerson, MB Gerstein (2015). PLoS Comput Biol 11: e1004132.

An approach for determining and measuring network hierarchy applied to comparing the phosphorylome and the regulome.
C Cheng, E Andrews, KK Yan, M Ung, D Wang, M Gerstein (2015). Genome Biol 16: 63.

-- 2014 (15) --

Proposed social and technological solutions to issues of data privacy in personal genomics
D Greenbaum, A Harmanci, M Gerstein (2014). IEEE International Symposium on Ethics in Science, Technology and Engineering.

Ensemble Learning Based Sparse High-Order Boltzmann Machine for Unsupervised Feature Interaction Identification
MR Min, X Ning, Y Qi, C Cheng, A Bonner, M Gerstein (2014). NIPS Workshop on Machine Learning in Computational Biology.

Interpretable Sparse High-Order Boltzmann Machines
MR Min, X Ning, C Cheng, M Gerstein (2014). JMLR W&CP 33: 614-622 (AISTATS 2014).

VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications.
JC Mu, M Mohiyuddin, J Li, N Bani Asadi, MB Gerstein, A Abyzov, WH Wong, HY Lam (2014). Bioinformatics 31: 1469-71.

Decoding neuroproteomics: integrating the genome, translatome and functional anatomy.
RR Kitchen, JS Rozowsky, MB Gerstein, AC Nairn (2014). Nat Neurosci 17: 1491-9.

MUSIC: identification of enriched regions in ChIP-Seq experiments using a mappability-corrected multiscale signal processing framework.
A Harmanci, J Rozowsky, M Gerstein (2014). Genome Biol 15: 474.

FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer.
Y Fu, Z Liu, S Lou, J Bedford, XJ Mu, KY Yip, E Khurana, M Gerstein (2014). Genome Biol 15: 480.

Comparative analysis of regulatory information and circuits across distant species.
AP Boyle, CL Araya, C Brdlik, P Cayting, C Cheng, Y Cheng, K Gardner, LW Hillier, J Janette, L Jiang, D Kasper, T Kawli, P Kheradpour, A Kundaje, JJ Li, L Ma, W Niu, EJ Rehm, J Rozowsky, M Slattery, R Spokony, R Terrell, D Vafeados, D Wang, P Weisdepp, YC Wu, D Xie, KK Yan, EA Feingold, PJ Good, MJ Pazin, H Huang, PJ Bickel, SE Brenner, V Reinke, RH Waterston, M Gerstein, KP White, M Kellis, M Snyder (2014). Nature 512: 453-6.

Comparative analysis of the transcriptome across distant species.
MB Gerstein, J Rozowsky, KK Yan, D Wang, C Cheng, JB Brown, CA Davis, L Hillier, C Sisu, JJ Li, B Pei, AO Harmanci, MO Duff, S Djebali, RP Alexander, BH Alver, R Auerbach, K Bell, PJ Bickel, ME Boeck, NP Boley, BW Booth, L Cherbas, P Cherbas, C Di, A Dobin, J Drenkow, B Ewing, G Fang, M Fastuca, EA Feingold, A Frankish, G Gao, PJ Good, R Guigo, A Hammonds, J Harrow, RA Hoskins, C Howald, L Hu, H Huang, TJ Hubbard, C Huynh, S Jha, D Kasper, M Kato, TC Kaufman, RR Kitchen, E Ladewig, J Lagarde, E Lai, J Leng, Z Lu, M MacCoss, G May, R McWhirter, G Merrihew, DM Miller, A Mortazavi, R Murad, B Oliver, S Olson, PJ Park, MJ Pazin, N Perrimon, D Pervouchine, V Reinke, A Reymond, G Robinson, A Samsonova, GI Saunders, F Schlesinger, A Sethi, FJ Slack, WC Spencer, MH Stoiber, P Strasbourger, A Tanzer, OA Thompson, KH Wan, G Wang, H Wang, KL Watkins, J Wen, K Wen, C Xue, L Yang, K Yip, C Zaleski, Y Zhang, H Zheng, SE Brenner, BR Graveley, SE Celniker, TR Gingeras, R Waterston (2014). Nature 512: 445-8.

Comparative analysis of pseudogenes across three phyla.
C Sisu, B Pei, J Leng, A Frankish, Y Zhang, S Balasubramanian, R Harte, D Wang, M Rutenberg-Schoenberg, W Clark, M Diekhans, J Rozowsky, T Hubbard, J Harrow, MB Gerstein (2014). Proc Natl Acad Sci U S A 111: 13361-6.

OrthoClust: an orthology-based network framework for clustering data across multiple species.
KK Yan, D Wang, J Rozowsky, H Zheng, C Cheng, M Gerstein (2014). Genome Biol 15: R100.

Guidelines for investigating causality of sequence variants in human disease.
DG MacArthur, TA Manolio, DP Dimmock, HL Rehm, J Shendure, GR Abecasis, DR Adams, RB Altman, SE Antonarakis, EA Ashley, JC Barrett, LG Biesecker, DF Conrad, GM Cooper, NJ Cox, MJ Daly, MB Gerstein, DB Goldstein, JN Hirschhorn, SM Leal, LA Pennacchio, JA Stamatoyannopoulos, SR Sunyaev, D Valle, BF Voight, W Winckler, C Gunter (2014). Nature 508: 469-76.

Defining functional DNA elements in the human genome.
M Kellis, B Wold, MP Snyder, BE Bernstein, A Kundaje, GK Marinov, LD Ward, E Birney, GE Crawford, J Dekker, I Dunham, LL Elnitski, PJ Farnham, EA Feingold, M Gerstein, MC Giddings, DM Gilbert, TR Gingeras, ED Green, R Guigo, T Hubbard, J Kent, JD Lieb, RM Myers, MJ Pazin, B Ren, JA Stamatoyannopoulos, Z Weng, KP White, RC Hardison (2014). Proc Natl Acad Sci U S A 111: 6131-8.

Transcriptional landscape of the prenatal human brain.
JA Miller, SL Ding, SM Sunkin, KA Smith, L Ng, A Szafer, A Ebbert, ZL Riley, JJ Royall, K Aiona, JM Arnold, C Bennet, D Bertagnolli, K Brouner, S Butler, S Caldejon, A Carey, C Cuhaciyan, RA Dalley, N Dee, TA Dolbeare, BA Facer, D Feng, TP Fliss, G Gee, J Goldy, L Gourley, BW Gregor, G Gu, RE Howard, JM Jochim, CL Kuan, C Lau, CK Lee, F Lee, TA Lemon, P Lesnar, B McMurray, N Mastan, N Mosqueda, T Naluai-Cecchini, NK Ngo, J Nyhus, A Oldre, E Olson, J Parente, PD Parker, SE Parry, A Stevens, M Pletikos, M Reding, K Roll, D Sandman, M Sarreal, S Shapouri, NV Shapovalova, EH Shen, N Sjoquist, CR Slaughterbeck, M Smith, AJ Sodt, D Williams, L Zollei, B Fischl, MB Gerstein, DH Geschwind, IA Glass, MJ Hawrylycz, RF Hevner, H Huang, AR Jones, JA Knowles, P Levitt, JW Phillips, N Sestan, P Wohnoutka, C Dang, A Bernard, JG Hohmann, ES Lein (2014). Nature 508: 199-206.

Identification of a major determinant for serine-threonine kinase phosphoacceptor specificity.
C Chen, BH Ha, AF Thevenin, HJ Lou, R Zhang, KY Yip, JR Peterson, M Gerstein, PM Kim, P Filippakopoulos, S Knapp, TJ Boggon, BE Turk (2014). Mol Cell 53: 140-7.

-- 2013 (14) --

Interpretable Sparse High-Order Boltzmann Machines for Transcription Factor Interaction Identification
MR Min, X Ning, C Cheng, M Gerstein (2013). NIPS Workshop on Machine Learning in Computational Biology.

Assessment of transcript reconstruction methods for RNA-seq.
T Steijger, JF Abril, PG Engstrom, F Kokocinski, RGASP Consortium, TJ Hubbard, R Guigo, J Harrow, P Bertone (2013). Nat Methods 10: 1177-84.

Deep Inside Champions, Just Genes?
D Greenbaum, J Chen, M Gerstein (2013). Science 342: 560.

Integrative annotation of variants from 1092 humans: application to cancer genomics.
E Khurana, Y Fu, V Colonna, XJ Mu, HM Kang, T Lappalainen, A Sboner, L Lochovsky, J Chen, A Harmanci, J Das, A Abyzov, S Balasubramanian, K Beal, D Chakravarty, D Challis, Y Chen, D Clarke, L Clarke, F Cunningham, US Evani, P Flicek, R Fragoza, E Garrison, R Gibbs, ZH Gumus, J Herrero, N Kitabayashi, Y Kong, K Lage, V Liluashvili, SM Lipkin, DG MacArthur, G Marth, D Muzny, TH Pers, GRS Ritchie, JA Rosenfeld, C Sisu, X Wei, M Wilson, Y Xue, F Yu, 1000 Genomes Project Consortium, ET Dermitzakis, H Yu, MA Rubin, C Tyler-Smith, M Gerstein (2013). Science 342: 1235587.

Analysis of variable retroduplications in human populations suggests coupling of retrotransposition to cell division.
A Abyzov, R Iskow, O Gokcumen, DW Radke, S Balasubramanian, B Pei, L Habegger, 1000 Genomes Project Consortium, C Lee, M Gerstein (2013). Genome Res 23: 2042-52.

Proceed with Caution
D Greenbaum, M Gerstein (2013). The Scientist 27:26 (1 Oct.)

Identification of yeast cell cycle regulated genes based on genomic features.
C Cheng, Y Fu, L Shen, M Gerstein (2013). BMC Syst Biol 7: 70.

Child development and structural variation in the human genome.
Y Zhang, R Haraksingh, F Grubert, A Abyzov, M Gerstein, S Weissman, AE Urban (2013). Child Dev 84: 34-48.

A comprehensive nuclear receptor network for breast cancer cells.
R Kittler, J Zhou, S Hua, L Ma, Y Liu, E Pendleton, C Cheng, M Gerstein, KP White (2013). Cell Rep 3: 538-51.

Accurate identification and analysis of human mRNA isoforms using deep long read sequencing.
H Tilgner, D Raha, L Habegger, M Mohiuddin, M Gerstein, M Snyder (2013). G3 (Bethesda) 3: 387-97.

The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes.
SB Montgomery, DL Goode, E Kvikstad, CA Albers, ZD Zhang, XJ Mu, G Ananda, B Howie, KJ Karczewski, KS Smith, V Anaya, R Richardson, J Davis, 1000 Genomes Project Consortium, DG MacArthur, A Sidow, L Duret, M Gerstein, KD Makova, J Marchini, G McVean, G Lunter (2013). Genome Res 23: 749-61.

Sixty years of genome biology.
WF Doolittle, P Fraser, MB Gerstein, BR Graveley, S Henikoff, C Huttenhower, A Oshlack, CP Ponting, JL Rinn, MC Schatz, J Ule, D Weigel, GM Weinstock (2013). Genome Biol 14: 113.

Machine learning and genome annotation: a match meant to be?
KY Yip, C Cheng, M Gerstein (2013). Genome Biol 14: 205.

Interpretation of genomic variants using a unified biological network approach.
E Khurana, Y Fu, J Chen, M Gerstein (2013). PLoS Comput Biol 9: e1002886.

-- 2012 (26) --

Epigenetic repression of miR-31 disrupts androgen receptor homeostasis and contributes to prostate cancer progression.
PC Lin, YL Chiu, S Banerjee, K Park, JM Mosquera, E Giannopoulou, P Alves, AK Tewari, MB Gerstein, H Beltran, AM Melnick, O Elemento, F Demichelis, MA Rubin (2012). Cancer Res 73: 1232-44.

Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells.
A Abyzov, J Mariani, D Palejev, Y Zhang, MS Haney, L Tomasini, AF Ferrandino, LA Rosenberg Belmaker, A Szekely, M Wilson, A Kocabas, NE Calixto, EL Grigorenko, A Huttner, K Chawarska, S Weissman, AE Urban, M Gerstein, FM Vaccarino (2012). Nature 492: 438-42.

An integrated map of genetic variation from 1,092 human genomes.
1000 Genomes Project Consortium, GR Abecasis, A Auton, LD Brooks, MA DePristo, RM Durbin, RE Handsaker, HM Kang, GT Marth, GA McVean (2012). Nature 491: 56-65.

Genomics: ENCODE leads the way on big data.
M Gerstein (2012). Nature 489: 208.

An integrated encyclopedia of DNA elements in the human genome.
ENCODE Project Consortium (2012). Nature 489: 57-74.

Architecture of the human regulatory network derived from ENCODE data.
MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, JJ Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.

Landscape of transcription in human cells.
S Djebali, CA Davis, A Merkel, A Dobin, T Lassmann, A Mortazavi, A Tanzer, J Lagarde, W Lin, F Schlesinger, C Xue, GK Marinov, J Khatun, BA Williams, C Zaleski, J Rozowsky, M Roder, F Kokocinski, RF Abdelhamid, T Alioto, I Antoshechkin, MT Baer, NS Bar, P Batut, K Bell, I Bell, S Chakrabortty, X Chen, J Chrast, J Curado, T Derrien, J Drenkow, E Dumais, J Dumais, R Duttagupta, E Falconnet, M Fastuca, K Fejes-Toth, P Ferreira, S Foissac, MJ Fullwood, H Gao, D Gonzalez, A Gordon, H Gunawardena, C Howald, S Jha, R Johnson, P Kapranov, B King, C Kingswood, OJ Luo, E Park, K Persaud, JB Preall, P Ribeca, B Risk, D Robyr, M Sammeth, L Schaffer, LH See, A Shahab, J Skancke, AM Suzuki, H Takahashi, H Tilgner, D Trout, N Walters, H Wang, J Wrobel, Y Yu, X Ruan, Y Hayashizaki, J Harrow, M Gerstein, T Hubbard, A Reymond, SE Antonarakis, G Hannon, MC Giddings, Y Ruan, B Wold, P Carninci, R Guigo, TR Gingeras (2012). Nature 489: 101-8.

GENCODE: the reference human genome annotation for The ENCODE Project.
J Harrow, A Frankish, JM Gonzalez, E Tapanari, M Diekhans, F Kokocinski, BL Aken, D Barrell, A Zadissa, S Searle, I Barnes, A Bignell, V Boychenko, T Hunt, M Kay, G Mukherjee, J Rajan, G Despacio-Reyes, G Saunders, C Steward, R Harte, M Lin, C Howald, A Tanzer, T Derrien, J Chrast, N Walters, S Balasubramanian, B Pei, M Tress, JM Rodriguez, I Ezkurdia, J van Baren, M Brent, D Haussler, M Kellis, A Valencia, A Reymond, M Gerstein, R Guigo, TJ Hubbard (2012). Genome Res 22: 1760-74.

Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors.
KY Yip, C Cheng, N Bhardwaj, JB Brown, J Leng, A Kundaje, J Rozowsky, E Birney, P Bickel, M Snyder, M Gerstein (2012). Genome Biol 13: R48.

Understanding transcriptional regulation by integrative analysis of transcription factor binding data.
C Cheng, R Alexander, R Min, J Leng, KY Yip, J Rozowsky, KK Yan, X Dong, S Djebali, Y Ruan, CA Davis, P Carninci, T Lassman, TR Gingeras, R Guigo, E Birney, Z Weng, M Snyder, M Gerstein (2012). Genome Res 22: 1658-67.

ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia.
SG Landt, GK Marinov, A Kundaje, P Kheradpour, F Pauli, S Batzoglou, BE Bernstein, P Bickel, JB Brown, P Cayting, Y Chen, G DeSalvo, C Epstein, KI Fisher-Aylor, G Euskirchen, M Gerstein, J Gertz, AJ Hartemink, MM Hoffman, VR Iyer, YL Jung, S Karmakar, M Kellis, PV Kharchenko, Q Li, T Liu, XS Liu, L Ma, A Milosavljevic, RM Myers, PJ Park, MJ Pazin, MD Perry, D Raha, TE Reddy, J Rozowsky, N Shoresh, A Sidow, M Slattery, JA Stamatoyannopoulos, MY Tolstorukov, KP White, S Xi, PJ Farnham, JD Lieb, BJ Wold, M Snyder (2012). Genome Res 22: 1813-31.

The GENCODE pseudogene resource.
B Pei, C Sisu, A Frankish, C Howald, L Habegger, XJ Mu, R Harte, S Balasubramanian, A Tanzer, M Diekhans, A Reymond, TJ Hubbard, J Harrow, MB Gerstein (2012). Genome Biol 13: R51.

Modeling gene expression using chromatin features in various cellular contexts.
X Dong, MC Greven, A Kundaje, S Djebali, JB Brown, C Cheng, TR Gingeras, M Gerstein, R Guigo, E Birney, Z Weng (2012). Genome Biol 13: R53.

Tcf7 is an important regulator of the switch of self-renewal and differentiation in a multipotential hematopoietic cell line.
JQ Wu, M Seay, VP Schulz, M Hariharan, D Tuck, J Lian, J Du, M Shi, Z Ye, M Gerstein, MP Snyder, S Weissman (2012). PLoS Genet 8: e1002565.

Detecting and annotating genetic variations using the HugeSeq pipeline.
HY Lam, C Pan, MJ Clark, P Lacroute, R Chen, R Haraksingh, M O'Huallachain, MB Gerstein, JM Kidd, CD Bustamante, M Snyder (2012). Nat Biotechnol 30: 226-9.

Chromatin state signatures associated with tissue-specific gene expression and enhancer activity in the embryonic limb.
J Cotney, J Leng, S Oh, LE DeMare, SK Reilly, MB Gerstein, JP Noonan (2012). Genome Res 22: 1069-80.

The human proteome - a scientific opportunity for transforming diagnostics, therapeutics, and healthcare.
M Vidal, DW Chan, M Gerstein, M Mann, GS Omenn, D Tagle, S Sechi, Workshop Participants (2012). Clin Proteomics 9: 6.

The Centers for Mendelian Genomics: a new large-scale initiative to identify the genes underlying rare Mendelian conditions.
MJ Bamshad, JA Shendure, D Valle, A Hamosh, JR Lupski, RA Gibbs, E Boerwinkle, RP Lifton, M Gerstein, M Gunel, S Mane, DA Nickerson, Centers for Mendelian Genomics (2012). Am J Med Genet A 158A: 1523-5.

VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.
L Habegger, S Balasubramanian, DZ Chen, E Khurana, A Sboner, A Harmanci, J Rozowsky, D Clarke, M Snyder, M Gerstein (2012). Bioinformatics 28: 2267-9.

On sports and genes.
G Zilberman-Schapira, J Chen, M Gerstein (2012). Recent Pat DNA Gene Seq 6: 180-8.

Genomic analysis of the hydrocarbon-producing, cellulolytic, endophytic fungus Ascocoryne sarcoides.
TA Gianoulis, MA Griffin, DJ Spakowicz, BF Dunican, CJ Alpha, A Sboner, AM Sismour, C Kodira, M Egholm, GM Church, MB Gerstein, SA Strobel (2012). PLoS Genet 8: e1002558.

Personal omics profiling reveals dynamic molecular and medical phenotypes.
R Chen, GI Mias, J Li-Pook-Than, L Jiang, HY Lam, R Chen, E Miriami, KJ Karczewski, M Hariharan, FE Dewey, Y Cheng, MJ Clark, H Im, L Habegger, S Balasubramanian, M O'Huallachain, JT Dudley, S Hillenmeyer, R Haraksingh, D Sharon, G Euskirchen, P Lacroute, K Bettinger, AP Boyle, M Kasowski, F Grubert, S Seki, M Garcia, M Whirl-Carrillo, M Gallardo, MA Blasco, PL Greenberg, P Snyder, TE Klein, RB Altman, AJ Butte, EA Ashley, M Gerstein, KC Nadeau, H Tang, M Snyder (2012). Cell 148: 1293-307.

A systematic survey of loss-of-function variants in human protein-coding genes.
DG MacArthur, S Balasubramanian, A Frankish, N Huang, J Morris, K Walter, L Jostins, L Habegger, JK Pickrell, SB Montgomery, CA Albers, ZD Zhang, DF Conrad, G Lunter, H Zheng, Q Ayub, MA DePristo, E Banks, M Hu, RE Handsaker, JA Rosenfeld, M Fromer, M Jin, XJ Mu, E Khurana, K Ye, M Kay, GI Saunders, MM Suner, T Hunt, IH Barnes, C Amid, DR Carvalho-Silva, AH Bignell, C Snow, B Yngvadottir, S Bumpstead, DN Cooper, Y Xue, IG Romero, 1000 Genomes Project Consortium, J Wang, Y Li, RA Gibbs, SA McCarroll, ET Dermitzakis, JK Pritchard, JC Barrett, J Harrow, ME Hurles, MB Gerstein, C Tyler-Smith (2012). Science 335: 823-8.

Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation.
G Li, X Ruan, RK Auerbach, KS Sandhu, M Zheng, P Wang, HM Poh, Y Goh, J Lim, J Zhang, HS Sim, SQ Peh, FH Mulawadi, CT Ong, YL Orlov, S Hong, Z Zhang, S Landt, D Raha, G Euskirchen, CL Wei, W Ge, H Wang, C Davis, KI Fisher-Aylor, A Mortazavi, M Gerstein, T Gingeras, B Wold, Y Sun, MJ Fullwood, E Cheung, E Liu, WK Sung, M Snyder, Y Ruan (2012). Cell 148: 84-98.

Novel insights through the integration of structural and functional genomics data with protein networks.
D Clarke, N Bhardwaj, MB Gerstein (2012). J Struct Biol 179: 320-6.

IQSeq: integrated isoform quantification analysis based on next-generation sequencing.
J Du, J Leng, L Habegger, A Sboner, D McDermott, M Gerstein (2012). PLoS One 7: e29175.

-- 2011 (40) --

Performance comparison of whole-genome sequencing platforms.
HY Lam, MJ Clark, R Chen, R Chen, G Natsoulis, M O'Huallachain, FE Dewey, L Habegger, EA Ashley, MB Gerstein, AJ Butte, HP Ji, M Snyder (2011). Nat Biotechnol 30: 78-82.

Systematic control of protein interactions for systems biology.
N Bhardwaj, D Clarke, M Gerstein (2011). Proc Natl Acad Sci U S A 108: 20279-80.

Genomics and Privacy: Implications of the New Reality of Closed Data for the Field
D Greenbaum, A Sboner, XJ Mu, M Gerstein (2011). PLoS Comput Biol 7: e1002278.

Closure of the NCBI SRA and implications for the long-term future of genomics data storage.
D Lipman, P Flicek, S Salzberg, M Gerstein, R Knight (2011). Genome Biol 12: 402.

Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms.
RR Haraksingh, A Abyzov, M Gerstein, AE Urban, M Snyder (2011). PLoS One 6: e27859.

Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data.
C Cheng, KK Yan, W Hwang, J Qian, N Bhardwaj, J Rozowsky, ZJ Lu, W Niu, P Alves, M Kato, M Snyder, M Gerstein (2011). PLoS Comput Biol 7: e1002190.

Genome-wide analysis of chromatin features identifies histone modification sensitive and insensitive yeast transcription factors.
C Cheng, C Shou, KY Yip, MB Gerstein (2011). Genome Biol 12: R111.

The role of cloud computing in managing the deluge of potentially private genetic data
D Greenbaum, M Gerstein (2011). Am J Bioeth 11: 39-41.

TIP: a probabilistic method for identifying transcription factor target genes from ChIP-seq binding profiles.
C Cheng, R Min, M Gerstein (2011). Bioinformatics 27: 3221-7.

Predicting protein ligand binding motions with the conformation explorer.
SC Flores, MB Gerstein (2011). BMC Bioinformatics 12: 417.

Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps.
KY Yip, L Utz, S Sitwell, X Hu, SS Sidhu, BE Turk, M Gerstein, PM Kim (2011). BMC Biol 9: 53.

Modeling the relative relationship of transcription factor binding and histone modifications to gene expression levels in mouse embryonic stem cells.
C Cheng, M Gerstein (2011). Nucleic Acids Res 40: 553-68.

Identification of a disease-defining gene fusion in epithelioid hemangioendothelioma.
MR Tanas, A Sboner, AM Oliveira, MR Erickson-Johnson, J Hespelt, PJ Hanwright, J Flanagan, Y Luo, K Fenwick, R Natrajan, C Mitsopoulos, M Zvelebil, BL Hoch, SW Weiss, M Debiec-Rychter, R Sciot, RB West, AJ Lazar, A Ashworth, JS Reis-Filho, CJ Lord, MB Gerstein, MA Rubin, BP Rubin (2011). Sci Transl Med 3: 98ra82.

Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.
N Bhardwaj, A Abyzov, D Clarke, C Shou, MB Gerstein (2011). Protein Sci 20: 1745-54.

The real cost of sequencing: higher than you think!
A Sboner, XJ Mu, D Greenbaum, RK Auerbach, MB Gerstein (2011). Genome Biol 12: 125.

AlleleSeq: analysis of allele-specific expression and binding in a network framework.
J Rozowsky, A Abyzov, J Wang, P Alves, D Raha, A Harmanci, J Leng, R Bjornson, Y Kong, N Kitabayashi, N Bhardwaj, M Rubin, M Snyder, M Gerstein (2011). Mol Syst Biol 7: 522.

Identification of genomic indels and structural variations using split reads.
ZD Zhang, J Du, H Lam, A Abyzov, AE Urban, M Snyder, M Gerstein (2011). BMC Genomics 12: 375.

The reality of pervasive transcription.
MB Clark, PP Amaral, FJ Schlesinger, ME Dinger, RJ Taft, JL Rinn, CP Ponting, PF Stadler, KV Morris, A Morillon, JS Rozowsky, MB Gerstein, C Wahlestedt, Y Hayashizaki, P Carninci, TR Gingeras, JS Mattick (2011). PLoS Biol 9: e1000625; discussion e1001102.

The spread of scientific information: insights from the web usage statistics in PLoS article-level metrics.
KK Yan, M Gerstein (2011). PLoS One 6: e19917.

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.
XJ Mu, ZJ Lu, Y Kong, HY Lam, MB Gerstein (2011). Nucleic Acids Res 39: 7058-76.

Social considerations in research: consider them but don't use them.
D Greenbaum, M Gerstein (2011). Am J Bioeth 11: 31-2.

A user's guide to the encyclopedia of DNA elements (ENCODE).
ENCODE Project Consortium (2011). PLoS Biol 9: e1001046.

Diverse protein kinase interactions identified by protein microarrays reveal novel connections between cellular processes.
J Fasolo, A Sboner, MG Sun, H Yu, R Chen, D Sharon, PM Kim, M Gerstein, M Snyder (2011). Genes Dev 25: 767-78.

The CRIT framework for identifying cross patterns in systems biology and application to chemogenomics.
TA Gianoulis, A Agarwal, M Snyder, MB Gerstein (2011). Genome Biol 12: R32.

A cis-regulatory map of the Drosophila genome.
N Negre, CD Brown, L Ma, CA Bristow, SW Miller, U Wagner, P Kheradpour, ML Eaton, P Loriaux, R Sealfon, Z Li, H Ishii, RF Spokony, J Chen, L Hwang, C Cheng, RP Auburn, MB Davis, M Domanus, PK Shah, CA Morrison, J Zieba, S Suchy, L Senderowicz, A Victorsen, NA Bild, AJ Grundstad, D Hanley, DM MacAlpine, M Mannervik, K Venken, H Bellen, R White, M Gerstein, S Russell, RL Grossman, B Ren, JW Posakony, M Kellis, KP White (2011). Nature 471: 527-31.

Diverse roles and interactions of the SWI/SNF chromatin remodeling complex revealed using global approaches.
GM Euskirchen, RK Auerbach, E Davidov, TA Gianoulis, G Zhong, J Rozowsky, N Bhardwaj, MB Gerstein, M Snyder (2011). PLoS Genet 7: e1002008.

ACT: aggregation and correlation toolbox for analyses of genome tracks.
J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27: 1152-4.

Tiling array data analysis: a multiscale approach using wavelets.
A Karpikov, J Rozowsky, M Gerstein (2011). BMC Bioinformatics 12: 57.

CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
A Abyzov, AE Urban, M Snyder, M Gerstein (2011). Genome Res 21: 974-84.

A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets.
C Cheng, KK Yan, KY Yip, J Rozowsky, R Alexander, C Shou, M Gerstein (2011). Genome Biol 12: R15.

The genomic complexity of primary human prostate cancer.
MF Berger, MS Lawrence, F Demichelis, Y Drier, K Cibulskis, AY Sivachenko, A Sboner, R Esgueva, D Pflueger, C Sougnez, R Onofrio, SL Carter, K Park, L Habegger, L Ambrogio, T Fennell, M Parkin, G Saksena, D Voet, AH Ramos, TJ Pugh, J Wilkinson, S Fisher, W Winckler, S Mahan, K Ardlie, J Baldwin, JW Simons, N Kitabayashi, TY MacDonald, PW Kantoff, L Chin, SB Gabriel, MB Gerstein, TR Golub, M Meyerson, A Tewari, ES Lander, G Getz, MA Rubin, LA Garraway (2011). Nature 470: 214-20.

Mapping copy number variation by population-scale genome sequencing.
RE Mills, K Walter, C Stewart, RE Handsaker, K Chen, C Alkan, A Abyzov, SC Yoon, K Ye, RK Cheetham, A Chinwalla, DF Conrad, Y Fu, F Grubert, I Hajirasouliha, F Hormozdiari, LM Iakoucheva, Z Iqbal, S Kang, JM Kidd, MK Konkel, J Korn, E Khurana, D Kural, HY Lam, J Leng, R Li, Y Li, CY Lin, R Luo, XJ Mu, J Nemesh, HE Peckham, T Rausch, A Scally, X Shi, MP Stromberg, AM Stutz, AE Urban, JA Walker, J Wu, Y Zhang, ZD Zhang, MA Batzer, L Ding, GT Marth, G McVean, J Sebat, M Snyder, J Wang, K Ye, EE Eichler, MB Gerstein, ME Hurles, C Lee, SA McCarroll, JO Korbel, 1000 Genomes Project (2011). Nature 470: 59-65.

Measuring the evolutionary rewiring of biological networks.
C Shou, N Bhardwaj, HY Lam, KK Yan, PM Kim, M Snyder, MB Gerstein (2011). PLoS Comput Biol 7: e1001050.

AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision.
A Abyzov, M Gerstein (2011). Bioinformatics 27: 595-603.

Gene inactivation and its implications for annotation in the era of personal genomics.
S Balasubramanian, L Habegger, A Frankish, DG MacArthur, R Harte, C Tyler-Smith, J Harrow, M Gerstein (2011). Genes Dev 25: 1-10.

Annual Research Review: The promise of stem cell research for neuropsychiatric disorders.
FM Vaccarino, AE Urban, HE Stevens, A Szekely, A Abyzov, EL Grigorenko, M Gerstein, S Weissman (2011). J Child Psychol Psychiatry 52: 504-16.

Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data.
ZJ Lu, KY Yip, G Wang, C Shou, LW Hillier, E Khurana, A Agarwal, R Auerbach, J Rozowsky, C Cheng, M Kato, DM Miller, F Slack, M Snyder, RH Waterston, V Reinke, MB Gerstein (2011). Genome Res 21: 276-85.

Diverse transcription factor binding features revealed by genome-wide ChIP-seq in C. elegans.
W Niu, ZJ Lu, M Zhong, M Sarov, JI Murray, CM Brdlik, J Janette, C Chen, P Alves, E Preston, C Slightham, L Jiang, AA Hyman, SK Kim, RH Waterston, M Gerstein, M Snyder, V Reinke (2011). Genome Res 21: 245-54.

RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
L Habegger, A Sboner, TA Gianoulis, J Rozowsky, A Agarwal, M Snyder, M Gerstein (2011). Bioinformatics 27: 281-3.

Discovery of non-ETS gene fusions in human prostate cancer using next-generation RNA sequencing.
D Pflueger, S Terry, A Sboner, L Habegger, R Esgueva, PC Lin, MA Svensson, N Kitabayashi, BJ Moss, TY MacDonald, X Cao, T Barrette, AK Tewari, MS Chee, AM Chinnaiyan, DS Rickman, F Demichelis, MB Gerstein, MA Rubin (2011). Genome Res 21: 56-67.

-- 2010 (32) --

Reproducible Research: Addressing the need for data and code sharing in computational science
Yale Law School Roundtable on Data and Code Sharing (2010). Computing in Science & Engineering 12(5): 8-13 (Sept/Oct).

Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.
MB Gerstein, ZJ Lu, EL Van Nostrand, C Cheng, BI Arshinoff, T Liu, KY Yip, R Robilotto, A Rechtsteiner, K Ikegami, P Alves, A Chateigner, M Perry, M Morris, RK Auerbach, X Feng, J Leng, A Vielle, W Niu, K Rhrissorrakrai, A Agarwal, RP Alexander, G Barber, CM Brdlik, J Brennan, JJ Brouillet, A Carr, MS Cheung, H Clawson, S Contrino, LO Dannenberg, AF Dernburg, A Desai, L Dick, AC Dose, J Du, T Egelhofer, S Ercan, G Euskirchen, B Ewing, EA Feingold, R Gassmann, PJ Good, P Green, F Gullier, M Gutwein, MS Guyer, L Habegger, T Han, JG Henikoff, SR Henz, A Hinrichs, H Holster, T Hyman, AL Iniguez, J Janette, M Jensen, M Kato, WJ Kent, E Kephart, V Khivansara, E Khurana, JK Kim, P Kolasinska-Zwierz, EC Lai, I Latorre, A Leahey, S Lewis, P Lloyd, L Lochovsky, RF Lowdon, Y Lubling, R Lyne, M MacCoss, SD Mackowiak, M Mangone, S McKay, D Mecenas, G Merrihew, DM Miller, A Muroyama, JI Murray, SL Ooi, H Pham, T Phippen, EA Preston, N Rajewsky, G Ratsch, H Rosenbaum, J Rozowsky, K Rutherford, P Ruzanov, M Sarov, R Sasidharan, A Sboner, P Scheid, E Segal, H Shin, C Shou, FJ Slack, C Slightam, R Smith, WC Spencer, EO Stinson, S Taing, T Takasaki, D Vafeados, K Voronina, G Wang, NL Washington, CM Whittle, B Wu, KK Yan, G Zeller, Z Zha, M Zhong, X Zhou, modENCODE Consortium, J Ahringer, S Strome, KC Gunsalus, G Micklem, XS Liu, V Reinke, SK Kim, LW Hillier, S Henikoff, F Piano, M Snyder, L Stein, JD Lieb, RH Waterston (2010). Science 330: 1775-87.

Rewiring of transcriptional regulatory networks: hierarchy, rather than connectivity, better reflects the importance of regulators.
N Bhardwaj, PM Kim, MB Gerstein (2010). Sci Signal 3: ra79.

Extensive in vivo metabolite-protein interactions revealed by large-scale systematic analyses.
X Li, TA Gianoulis, KY Yip, M Gerstein, M Snyder (2010). Cell 143: 639-50.

Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model.
ZD Zhang, MB Gerstein (2010). BMC Bioinformatics 11: 539.

A map of human genome variation from population-scale sequencing
1000 Genomes Project Consortium, GR Abecasis, D Altshuler, A Auton, LD Brooks, RM Durbin, RA Gibbs, ME Hurles, GA McVean (2010). Nature 467: 1061-73.

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.
A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2010). Genome Biol 11: R104.

Structured digital tables on the Semantic Web: toward a structured digital literature
KH Cheung, M Samwald, RK Auerbach, MB Gerstein (2010). Mol Syst Biol 6: 403.

Annotating non-coding regions of the genome.
RP Alexander, G Fang, J Rozowsky, M Snyder, MB Gerstein (2010). Nat Rev Genet 11: 559-71.

Segmental duplications in the human genome reveal details of pseudogene formation.
E Khurana, HY Lam, C Cheng, N Carriero, P Cayting, MB Gerstein (2010). Nucleic Acids Res 38: 6997-7007.

Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays.
A Agarwal, D Koppstein, J Rozowsky, A Sboner, L Habegger, LW Hillier, R Sasidharan, V Reinke, RH Waterston, M Gerstein (2010). BMC Genomics 11: 383.

Using semantic web rules to reason on an ontology of pseudogenes.
ME Holford, E Khurana, KH Cheung, M Gerstein (2010). Bioinformatics 26: i71-8.

Analysis of combinatorial regulation: scaling of partnerships between regulators with the number of governed targets.
N Bhardwaj, MB Carson, A Abyzov, KK Yan, H Lu, MB Gerstein (2010). PLoS Comput Biol 6: e1000755.

3V: cavity, channel and cleft volume calculator and extractor.
NR Voss, M Gerstein (2010). Nucleic Acids Res 38: W555-62.

MOTIPS: automated motif analysis for predicting targets of modular protein domains.
HY Lam, PM Kim, J Mok, R Tonikian, SS Sidhu, BE Turk, M Snyder, MB Gerstein (2010). BMC Bioinformatics 11: 243.

Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks
KK Yan, G Fang, N Bhardwaj, RP Alexander, M Gerstein (2010). Proc Natl Acad Sci U S A 107: 9186-91.

Analysis of membrane proteins in metagenomics: networks of correlated environmental features and protein families.
PV Patel, TA Gianoulis, RD Bjornson, KY Yip, DM Engelman, MB Gerstein (2010). Genome Res 20: 960-71.

Network modeling identifies molecular functions targeted by miR-204 to suppress head and neck tumor metastasis.
Y Lee, X Yang, Y Huang, H Fan, Q Zhang, Y Wu, J Li, R Hasina, C Cheng, MW Lingen, MB Gerstein, RR Weichselbaum, HR Xing, YA Lussier (2010). PLoS Comput Biol 6: e1000730.

Getting started in gene orthology and functional analysis.
G Fang, N Bhardwaj, R Robilotto, MB Gerstein (2010). PLoS Comput Biol 6: e1000703.

Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels.
N Bhardwaj, KK Yan, MB Gerstein (2010). Proc Natl Acad Sci U S A 107: 6841-6.

Variation in transcription factor binding among humans.
M Kasowski, F Grubert, C Heffelfinger, M Hariharan, A Asabere, SM Waszak, L Habegger, J Rozowsky, M Shi, AE Urban, MY Hong, KJ Karczewski, W Huber, SM Weissman, MB Gerstein, JO Korbel, M Snyder (2010). Science 328: 232-5.

Molecular sampling of prostate cancer: a dilemma for predicting disease progression.
A Sboner, F Demichelis, S Calza, Y Pawitan, SR Setlur, Y Hoshida, S Perner, HO Adami, K Fall, LA Mucci, PW Kantoff, M Stampfer, SO Andersson, E Varenhorst, JE Johansson, MB Gerstein, TR Golub, MA Rubin, O Andren (2010). BMC Med Genomics 3: 8.

Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates.
ZD Zhang, A Frankish, T Hunt, J Harrow, M Gerstein (2010). Genome Biol 11: R26.

Genome-wide sequence-based prediction of peripheral proteins using a novel semi-supervised learning technique.
N Bhardwaj, M Gerstein, H Lu (2010). BMC Bioinformatics 11 Suppl 1: S6.

Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing.
JQ Wu, L Habegger, P Noisa, A Szekely, C Qiu, S Hutchison, D Raha, M Egholm, H Lin, S Weissman, W Cui, M Gerstein, M Snyder (2010). Proc Natl Acad Sci U S A 107: 5254-9.

Personal genome sequencing: current approaches and challenges.
M Snyder, J Du, M Gerstein (2010). Genes Dev 24: 423-31.

Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/FOXA in development and environmental response.
M Zhong, W Niu, ZJ Lu, M Sarov, JI Murray, J Janette, D Raha, KL Sheaffer, HY Lam, E Preston, C Slightham, LW Hillier, T Brock, A Agarwal, R Auerbach, AA Hyman, M Gerstein, SE Mango, SK Kim, RH Waterston, V Reinke, M Snyder (2010). PLoS Genet 6: e1000848.

Deciphering protein kinase specificity through large-scale analysis of yeast phosphorylation site motifs.
J Mok, PM Kim, HY Lam, S Piccirillo, X Zhou, GR Jeschke, DL Sheridan, SA Parker, V Desai, M Jwa, E Cameroni, H Niu, M Good, A Remenyi, JL Ma, YJ Sheu, HE Sassi, R Sopko, CS Chan, C De Virgilio, NM Hollingsworth, WA Lim, DF Stern, B Stillman, BJ Andrews, MB Gerstein, M Snyder, BE Turk (2010). Sci Signal 3: ra12.

Close association of RNA polymerase II and many transcription factors with Pol III genes.
D Raha, Z Wang, Z Moqtaderi, L Wu, G Zhong, M Gerstein, K Struhl, M Snyder (2010). Proc Natl Acad Sci U S A 107: 3639-44.

Improved reconstruction of in silico gene regulatory networks by integrating knockout and perturbation data.
KY Yip, RP Alexander, KK Yan, M Gerstein (2010). PLoS One 5: e8121.

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library.
HY Lam, XJ Mu, AM Stutz, A Tanzer, PD Cayting, M Snyder, PM Kim, JO Korbel, MB Gerstein (2010). Nat Biotechnol 28: 47-55.

RigidFinder: a fast and sensitive method to detect rigid blocks in large macromolecular complexes.
A Abyzov, R Bjornson, M Felipe, M Gerstein (2010). Proteins 78: 309-24.

-- 2009 (39) --

Science and the Law: Grappling with the Gulf
D Greenbaum, MB Gerstein (2009). Science 323: 210.

EBNA1 regulates cellular gene expression by binding cellular promoters.
A Canaan, I Haviv, AE Urban, VP Schulz, S Hartman, Z Zhang, D Palejev, AB Deisseroth, J Lacy, M Snyder, M Gerstein, SM Weissman (2009). Proc Natl Acad Sci U S A 106: 22421-6.

Social networking and personal genomics: suggestions for optimizing the interaction.
D Greenbaum, M Gerstein (2009). Am J Bioeth 9: 15-9.

Bayesian modeling of the yeast SH3 domain interactome predicts spatiotemporal dynamics of endocytosis proteins.
R Tonikian, X Xin, CP Toret, D Gfeller, C Landgraf, S Panni, S Paoluzi, L Castagnoli, B Currell, S Seshagiri, H Yu, B Winsor, M Vidal, MB Gerstein, GD Bader, R Volkmer, G Cesareni, DG Drubin, PM Kim, SS Sidhu, C Boone (2009). PLoS Biol 7: e1000218.

Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlights a recent burst of retrotranspositional activity.
YJ Liu, D Zheng, S Balasubramanian, N Carriero, E Khurana, R Robilotto, MB Gerstein (2009). BMC Genomics 10: 480.

Robust-linear-model normalization to reduce technical variability in functional protein microarrays.
A Sboner, A Karpikov, G Chen, M Smith, D Mattoon, L Freeman-Cook, B Schweitzer, MB Gerstein (2009). J Proteome Res 8: 5451-64.

The relationship between the evolution of microRNA targets and the length of their UTRs.
C Cheng, N Bhardwaj, M Gerstein (2009). BMC Genomics 10: 431.

Computational analysis of membrane proteins: the largest class of drug targets.
Y Arinaminpathy, E Khurana, DM Engelman, MB Gerstein (2009). Drug Discov Today 14: 1130-5.

mRNA expression profiles show differential regulatory effects of microRNAs between estrogen receptor-positive and estrogen receptor-negative breast cancer.
C Cheng, X Fu, P Alves, M Gerstein (2009). Genome Biol 10: R90.

Mapping accessible chromatin regions using Sono-Seq.
RK Auerbach, G Euskirchen, J Rozowsky, N Lamarre-Vincent, Z Moqtaderi, P Lefrancois, K Struhl, M Gerstein, M Snyder (2009). Proc Natl Acad Sci U S A 106: 14926-31.

Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels.
KY Yip, PM Kim, D McDermott, M Gerstein (2009). BMC Bioinformatics 10: 241.

Getting started in text mining: part two
A Rzhetsky, M Seringhaus, MB Gerstein (2009). PLoS Comput Biol 5: e1000411.

N-myc downstream regulated gene 1 (NDRG1) is fused to ERG in prostate cancer.
D Pflueger, DS Rickman, A Sboner, S Perner, CJ LaFargue, MA Svensson, BJ Moss, N Kitabayashi, Y Pan, A de la Taille, R Kuefer, AK Tewari, F Demichelis, MS Chee, MB Gerstein, MA Rubin (2009). Neoplasia 11: 804-11.

Small RNAs originated from pseudogenes: cis- or trans-acting?
X Guo, Z Zhang, MB Gerstein, D Zheng (2009). PLoS Comput Biol 5: e1000449.

Understanding modularity in molecular networks requires dynamics.
RP Alexander, PM Kim, T Emonet, MB Gerstein (2009). Sci Signal 2: pe44.

An approach to comparing tiling array and high throughput sequencing technologies for genomic transcript mapping.
R Sasidharan, A Agarwal, J Rozowsky, M Gerstein (2009). BMC Res Notes 2: 150.

Artificial transmembrane oncoproteins smaller than the bovine papillomavirus E5 protein redefine sequence requirements for activation of the platelet-derived growth factor beta receptor.
K Talbert-Slagle, S Marlatt, FN Barrera, E Khurana, J Oates, M Gerstein, DM Engelman, AM Dixon, D Dimaio (2009). J Virol 83: 9773-85.

The genetic architecture of Down syndrome phenotypes revealed by high-resolution analysis of human segmental trisomies.
JO Korbel, T Tirosh-Wagner, AE Urban, XN Chen, M Kasowski, L Dai, F Grubert, C Erdman, MC Gao, K Lange, EM Sobel, GM Barlow, AS Aylsworth, NJ Carpenter, RD Clark, MY Cohen, E Doran, T Falik-Zaccai, SO Lewin, IT Lott, BC McGillivray, JB Moeschler, MJ Pettenati, SM Pueschel, KW Rao, LG Shaffer, M Shohat, AJ Van Riper, D Warburton, S Weissman, MB Gerstein, M Snyder, JR Korenberg (2009). Proc Natl Acad Sci U S A 106: 12031-6.

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.
J Du, RD Bjornson, ZD Zhang, Y Kong, M Snyder, MB Gerstein (2009). PLoS Comput Biol 5: e1000432.

Unlocking the secrets of the genome.
SE Celniker, LA Dillon, MB Gerstein, KC Gunsalus, S Henikoff, GH Karpen, M Kellis, EC Lai, JD Lieb, DM MacAlpine, G Micklem, F Piano, M Snyder, L Stein, KP White, RH Waterston, modENCODE Consortium (2009). Nature 459: 927-30.

Integrated assessment of genomic correlates of protein evolutionary rate.
Y Xia, EA Franzosa, MB Gerstein (2009). PLoS Comput Biol 5: e1000413.

Dynamic and complex transcription factor binding during an inducible response in yeast.
L Ni, C Bruce, C Hart, J Leigh-Bell, D Gelperin, L Umansky, MB Gerstein, M Snyder (2009). Genes Dev 23: 1351-63.

Relating protein conformational changes to packing efficiency and disorder.
N Bhardwaj, M Gerstein (2009). Protein Sci 18: 1230-40.

Personal phenotypes to go with personal genomes.
M Snyder, S Weissman, M Gerstein (2009). Mol Syst Biol 5: 273.

Systematic identification of transcription factors associated with patient survival in cancers.
C Cheng, LM Li, P Alves, M Gerstein (2009). BMC Genomics 10: 225.

Zebrafish miR-1 and miR-133 shape muscle gene expression and regulate sarcomeric actin organization.
Y Mishima, C Abreu-Goodger, AA Staton, C Stahlhut, C Shou, C Cheng, M Gerstein, AJ Enright, AJ Giraldez (2009). Genes Dev 23: 619-32.

PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.
JO Korbel, A Abyzov, XJ Mu, N Carriero, P Cayting, Z Zhang, M Snyder, MB Gerstein (2009). Genome Biol 10: R23.

StoneHinge: hinge prediction by network analysis of individual protein structures.
KS Keating, SC Flores, MB Gerstein, LA Kuhn (2009). Protein Sci 18: 359-71.

Quantifying environmental adaptation of metabolic pathways in metagenomics.
TA Gianoulis, J Raes, PV Patel, R Bjornson, JO Korbel, I Letunic, T Yamada, A Paccanaro, LJ Jensen, M Snyder, P Bork, MB Gerstein (2009). Proc Natl Acad Sci U S A 106: 1374-9.

Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing.
P Lefrancois, GM Euskirchen, RK Auerbach, J Rozowsky, T Gibson, CM Yellman, M Gerstein, M Snyder (2009). BMC Genomics 10: 37.

Distinct genomic aberrations associated with ERG rearranged prostate cancer.
F Demichelis, SR Setlur, R Beroukhim, S Perner, JO Korbel, CJ Lafargue, D Pflueger, C Pina, MD Hofer, A Sboner, MA Svensson, DS Rickman, A Urban, M Snyder, M Meyerson, C Lee, MB Gerstein, R Kuefer, MA Rubin (2009). Genes Chromosomes Cancer 48: 366-80.

A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster.
X Zhang, Z Lian, C Padden, MB Gerstein, J Rozowsky, M Snyder, TR Gingeras, P Kapranov, SM Weissman, PE Newburger (2009). Blood 113: 2526-34.

Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes.
S Balasubramanian, D Zheng, YJ Liu, G Fang, A Frankish, N Carriero, R Robilotto, P Cayting, M Gerstein (2009). Genome Biol 10: R2.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27: 66-75.

MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays.
SC Popescu, GV Popescu, S Bachan, Z Zhang, M Gerstein, M Snyder, SP Dinesh-Kumar (2009). Genes Dev 23: 80-92.

MSB: a mean-shift-based approach for the analysis of structural variation in the genome.
LY Wang, A Abyzov, JO Korbel, M Snyder, M Gerstein (2009). Genome Res 19: 106-17.

RNA-Seq: a revolutionary tool for transcriptomics.
Z Wang, M Gerstein, M Snyder (2009). Nat Rev Genet 10: 57-63.

Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions.
KY Yip, M Gerstein (2009). Bioinformatics 25: 243-50.

Pseudofam: the pseudogene families database.
HY Lam, E Khurana, G Fang, P Cayting, N Carriero, KH Cheung, MB Gerstein (2009). Nucleic Acids Res 37: D738-43.

-- 2008 (25) --

Genomics Confounds Gene Classification
M Seringhaus, M Gerstein (2008). American Scientist 96: 466-473 (Nov-Dec).

Mismatch oligonucleotides in human and yeast: guidelines for probe design on tiling microarrays.
M Seringhaus, J Rozowsky, T Royce, U Nagalakshmi, J Jee, M Snyder, M Gerstein (2008). BMC Genomics 9: 635.

Defining the TRiC/CCT interactome links chaperonin function to stabilization of newly made proteins with complex topologies.
AY Yam, Y Xia, HT Lin, A Burlingame, M Gerstein, J Frydman (2008). Nat Struct Mol Biol 15: 1255-62.

Genomic anonymity: have we already lost it?
D Greenbaum, J Du, M Gerstein (2008). Am J Bioeth 8: 71-4.

High-resolution copy-number variation map reflects human olfactory receptor diversity and evolution.
Y Hasin, T Olender, M Khen, C Gonzaga-Jauregui, PM Kim, AE Urban, M Snyder, MB Gerstein, D Lancet, JO Korbel (2008). PLoS Genet 4: e1000249.

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history.
PM Kim, HY Lam, AE Urban, JO Korbel, J Affourtit, F Grubert, X Chen, S Weissman, M Snyder, MB Gerstein (2008). Genome Res 18: 1865-74.

Modeling ChIP sequencing in silico with applications.
ZD Zhang, J Rozowsky, M Snyder, J Chang, M Gerstein (2008). PLoS Comput Biol 4: e1000158.

Transmembrane protein oxygen content and compartmentalization of cells.
R Sasidharan, A Smith, M Gerstein (2008). PLoS One 3: e2726.

Seeking a New Biology through Text Mining
A Rzhetsky, M Seringhaus, M Gerstein (2008). Cell 134: 9-13.

Genomics: protein fossils live on as RNA.
R Sasidharan, M Gerstein (2008). Nature 453: 729-31.

The current excitement about copy-number variation: how it relates to gene duplications and protein families.
JO Korbel, PM Kim, X Chen, AE Urban, S Weissman, M Snyder, MB Gerstein (2008). Curr Opin Struct Biol 18: 366-74.

Targeting the human cancer pathway protein interaction network by structural genomics.
YJ Huang, D Hang, LJ Lu, L Tong, MB Gerstein, GT Montelione (2008). Mol Cell Proteomics 7: 2048-60.

A genomic analysis of RNA polymerase II modification and chromatin architecture related to 3' end RNA polyadenylation.
Z Lian, A Karpikov, J Lian, MC Mahajan, S Hartman, M Gerstein, M Snyder, SM Weissman (2008). Genome Res 18: 1224-37.

Association of cytokeratin 7 and 19 expression with genomic stability and favorable prognosis in clear cell renal cell cancer.
KD Mertz, F Demichelis, A Sboner, MS Hirsch, P Dal Cin, K Struckmann, M Storz, S Scherrer, DM Schmid, RT Strebel, NM Probst-Hensch, M Gerstein, H Moch, MA Rubin (2008). Int J Cancer 123: 569-76.

The transcriptional landscape of the yeast genome defined by RNA sequencing.
U Nagalakshmi, Z Wang, K Waern, C Shou, D Raha, M Gerstein, M Snyder (2008). Science 320: 1344-9.

HingeMaster: normal mode hinge prediction approach and integration of complementary predictors.
SC Flores, KS Keating, J Painter, F Morcos, K Nguyen, EA Merritt, LA Kuhn, MB Gerstein (2008). Proteins 73: 299-319.

Rapid evolution by positive Darwinian selection in T-cell antigen CD4 in primates.
ZD Zhang, G Weinstock, M Gerstein (2008). J Mol Evol 66: 446-56.

Open access: taking full advantage of the content.
PE Bourne, JL Fink, M Gerstein (2008). PLoS Comput Biol 4: e1000037.

The role of disorder in interaction networks: a structural analysis.
PM Kim, A Sboner, Y Xia, M Gerstein (2008). Mol Syst Biol 4: 179.

Manually structured digital abstracts: a scaffold for automatic text mining
M Seringhaus, M Gerstein (2008). FEBS Lett 582: 1170.

Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets.
DS Johnson, W Li, DB Gordon, A Bhattacharjee, B Curry, J Ghosh, L Brizuela, JS Carroll, M Brown, P Flicek, CM Koch, I Dunham, M Bieda, X Xu, PJ Farnham, P Kapranov, DA Nix, TR Gingeras, X Zhang, H Holster, N Jiang, RD Green, JS Song, SA McCuine, E Anton, L Nguyen, ND Trinklein, Z Ye, K Ching, D Hawkins, B Ren, PC Scacheri, J Rozowsky, A Karpikov, G Euskirchen, S Weissman, M Gerstein, M Snyder, A Yang, Z Moqtaderi, H Hirsch, HP Shulha, Y Fu, Z Weng, K Struhl, RM Myers, JD Lieb, XS Liu (2008). Genome Res 18: 393-403.

Uncovering trends in gene naming.
MR Seringhaus, PD Cayting, MB Gerstein (2008). Genome Biol 9: 401.

Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.
JQ Wu, J Du, J Rozowsky, Z Zhang, AE Urban, G Euskirchen, S Weissman, M Gerstein, M Snyder (2008). Genome Biol 9: R3.

Analysis of nuclear receptor pseudogenes in vertebrates: how the silent tell their stories.
ZD Zhang, P Cayting, G Weinstock, M Gerstein (2008). Mol Biol Evol 25: 131-43.

An integrated system for studying residue coevolution in proteins.
KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). Bioinformatics 24: 290-2.

-- 2007 (45) --

Semantic Web Approach to Database Integration in the Life Sciences
KH Cheung, AK Smith, KYL Yip, CJO Baker, MB Gerstein (2007). in Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences (eds. C Baker and K Cheung, Springer, NY), pp. 11-30.

Semantic Web Standards: Legal and Social Issues and Implications
D Greenbaum, M Gerstein (2007). in Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences (eds. C Baker and K Cheung, Springer, NY), pp. 413-433.

Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context.
PM Kim, JO Korbel, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 20274-9.

Integrative microarray analysis of pathways dysregulated in metastatic prostate cancer.
SR Setlur, TE Royce, A Sboner, JM Mosquera, F Demichelis, MD Hofer, KD Mertz, M Gerstein, MA Rubin (2007). Cancer Res 67: 10296-303.

Leveraging the structure of the Semantic Web to enhance information retrieval for proteomics
A Smith, K Cheung, M Krauthammer, M Schultz, M Gerstein (2007). Bioinformatics 23: 3073-9.

Diverse cellular functions of the Hsp90 molecular chaperone uncovered using systems approaches.
AJ McClellan, Y Xia, AM Deutschbauer, RW Davis, M Gerstein, J Frydman (2007). Cell 131: 121-35.

Paired-end mapping reveals extensive structural variation in the human genome.
JO Korbel, AE Urban, JP Affourtit, B Godwin, F Grubert, JF Simons, PM Kim, D Palejev, NJ Carriero, L Du, BE Taillon, Z Chen, A Tanzer, AC Saunders, J Chi, F Yang, NP Carter, ME Hurles, SM Weissman, TT Harkins, MB Gerstein, M Egholm, M Snyder (2007). Science 318: 420-6.

PARE: a tool for comparing protein abundance and mRNA expression data.
EZ Yu, AE Burba, M Gerstein (2007). BMC Bioinformatics 8: 309.

Divergence of transcription factor binding sites across related yeast species.
AR Borneman, TA Gianoulis, ZD Zhang, H Yu, J Rozowsky, MR Seringhaus, LY Wang, M Gerstein, M Snyder (2007). Science 317: 815-9.

The minimum information required for reporting a molecular interaction experiment (MIMIx).
S Orchard, L Salwinski, S Kerrien, L Montecchi-Palazzi, M Oesterheld, V Stumpflen, A Ceol, A Chatr-aryamontri, J Armstrong, P Woollard, JJ Salama, S Moore, J Wojcik, GD Bader, M Vidal, ME Cusick, M Gerstein, AC Gavin, G Superti-Furga, J Greenblatt, J Bader, P Uetz, M Tyers, P Legrain, S Fields, N Mulder, M Gilson, M Niepmann, L Burgoon, J De Las Rivas, C Prieto, VM Perreau, C Hogue, HW Mewes, R Apweiler, I Xenarios, D Eisenberg, G Cesareni, H Hermjakob (2007). Nat Biotechnol 25: 894-8.

Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification.
TE Royce, JS Rozowsky, MB Gerstein (2007). Nucleic Acids Res 35: e99.

Transcription factor binding site identification in yeast: a comparison of high-density oligonucleotide and PCR-based microarray platforms.
AR Borneman, ZD Zhang, J Rozowsky, MR Seringhaus, M Gerstein, M Snyder (2007). Funct Integr Genomics 7: 335-45.

FlexOracle: predicting flexible hinges by identification of stable domains.
SC Flores, MB Gerstein (2007). BMC Bioinformatics 8: 215.

Comparing classical pathways and modern networks: towards the development of an edge ontology
LJ Lu, A Sboner, YJ Huang, HX Lu, TA Gianoulis, KY Yip, PM Kim, GT Montelione, MB Gerstein (2007). Trends Biochem Sci 32: 320-31.

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.
ENCODE Project Consortium, E Birney, JA Stamatoyannopoulos, A Dutta, R Guigo, TR Gingeras, EH Margulies, Z Weng, M Snyder, ET Dermitzakis, RE Thurman, MS Kuehn, CM Taylor, S Neph, CM Koch, S Asthana, A Malhotra, I Adzhubei, JA Greenbaum, RM Andrews, P Flicek, PJ Boyle, H Cao, NP Carter, GK Clelland, S Davis, N Day, P Dhami, SC Dillon, MO Dorschner, H Fiegler, PG Giresi, J Goldy, M Hawrylycz, A Haydock, R Humbert, KD James, BE Johnson, EM Johnson, TT Frum, ER Rosenzweig, N Karnani, K Lee, GC Lefebvre, PA Navas, F Neri, SC Parker, PJ Sabo, R Sandstrom, A Shafer, D Vetrie, M Weaver, S Wilcox, M Yu, FS Collins, J Dekker, JD Lieb, TD Tullius, GE Crawford, S Sunyaev, WS Noble, I Dunham, F Denoeud, A Reymond, P Kapranov, J Rozowsky, D Zheng, R Castelo, A Frankish, J Harrow, S Ghosh, A Sandelin, IL Hofacker, R Baertsch, D Keefe, S Dike, J Cheng, HA Hirsch, EA Sekinger, J Lagarde, JF Abril, A Shahab, C Flamm, C Fried, J Hackermuller, J Hertel, M Lindemeyer, K Missal, A Tanzer, S Washietl, J Korbel, O Emanuelsson, JS Pedersen, N Holroyd, R Taylor, D Swarbreck, N Matthews, MC Dickson, DJ Thomas, MT Weirauch, J Gilbert, J Drenkow, I Bell, X Zhao, KG Srinivasan, WK Sung, HS Ooi, KP Chiu, S Foissac, T Alioto, M Brent, L Pachter, ML Tress, A Valencia, SW Choo, CY Choo, C Ucla, C Manzano, C Wyss, E Cheung, TG Clark, JB Brown, M Ganesh, S Patel, H Tammana, J Chrast, CN Henrichsen, C Kai, J Kawai, U Nagalakshmi, J Wu, Z Lian, J Lian, P Newburger, X Zhang, P Bickel, JS Mattick, P Carninci, Y Hayashizaki, S Weissman, T Hubbard, RM Myers, J Rogers, PF Stadler, TM Lowe, CL Wei, Y Ruan, K Struhl, M Gerstein, SE Antonarakis, Y Fu, ED Green, U Karaoz, A Siepel, J Taylor, LA Liefer, KA Wetterstrand, PJ Good, EA Feingold, MS Guyer, GM Cooper, G Asimenos, CN Dewey, M Hou, S Nikolaev, JI Montoya-Burgos, A Loytynoja, S Whelan, F Pardi, T Massingham, H Huang, NR Zhang, I Holmes, JC Mullikin, A Ureta-Vidal, B Paten, M Seringhaus, D Church, K Rosenbloom, WJ Kent, EA Stone, NISC Comparative Sequencing Program, Baylor College of Medicine Human Genome Sequencing Center, Washington University Genome Sequencing Center, Broad Institute, Children's Hospital Oakland Research Institute, S Batzoglou, N Goldman, RC Hardison, D Haussler, W Miller, A Sidow, ND Trinklein, ZD Zhang, L Barrera, R Stuart, DC King, A Ameur, S Enroth, MC Bieda, J Kim, AA Bhinge, N Jiang, J Liu, F Yao, VB Vega, CW Lee, P Ng, A Shahab, A Yang, Z Moqtaderi, Z Zhu, X Xu, S Squazzo, MJ Oberley, D Inman, MA Singer, TA Richmond, KJ Munn, A Rada-Iglesias, O Wallerman, J Komorowski, JC Fowler, P Couttet, AW Bruce, OM Dovey, PD Ellis, CF Langford, DA Nix, G Euskirchen, S Hartman, AE Urban, P Kraus, S Van Calcar, N Heintzman, TH Kim, K Wang, C Qu, G Hon, R Luna, CK Glass, MG Rosenfeld, SF Aldred, SJ Cooper, A Halees, JM Lin, HP Shulha, X Zhang, M Xu, JN Haidar, Y Yu, Y Ruan, VR Iyer, RD Green, C Wadelius, PJ Farnham, B Ren, RA Harte, AS Hinrichs, H Trumbower, H Clawson, J Hillman-Jackson, AS Zweig, K Smith, A Thakkapallayil, G Barber, RM Kuhn, D Karolchik, L Armengol, CP Bird, PI de Bakker, AD Kern, N Lopez-Bigas, JD Martin, BE Stranger, A Woodroffe, E Davydov, A Dimas, E Eyras, IB Hallgrimsdottir, J Huppert, MC Zody, GR Abecasis, X Estivill, GG Bouffard, X Guan, NF Hansen, JR Idol, VV Maduro, B Maskeri, JC McDowell, M Park, PJ Thomas, AC Young, RW Blakesley, DM Muzny, E Sodergren, DA Wheeler, KC Worley, H Jiang, GM Weinstock, RA Gibbs, T Graves, R Fulton, ER Mardis, RK Wilson, M Clamp, J Cuff, S Gnerre, DB Jaffe, JL Chang, K Lindblad-Toh, ES Lander, M Koriabine, M Nefedov, K Osoegawa, Y Yoshinaga, B Zhu, PJ de Jong (2007). Nature 447: 799-816.

Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies.
GM Euskirchen, JS Rozowsky, CL Wei, WH Lee, ZD Zhang, S Hartman, O Emanuelsson, V Stolc, S Weissman, MB Gerstein, Y Ruan, M Snyder (2007). Genome Res 17: 898-909.

Structured RNAs in the ENCODE selected regions of the human genome.
S Washietl, JS Pedersen, JO Korbel, C Stocsits, AR Gruber, J Hackermuller, J Hertel, M Lindemeyer, K Reiche, A Tanzer, C Ucla, C Wyss, SE Antonarakis, F Denoeud, J Lagarde, J Drenkow, P Kapranov, TR Gingeras, R Guigo, M Snyder, MB Gerstein, A Reymond, IL Hofacker, PF Stadler (2007). Genome Res 17: 852-64.

Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.
D Zheng, A Frankish, R Baertsch, P Kapranov, A Reymond, SW Choo, Y Lu, F Denoeud, SE Antonarakis, M Snyder, Y Ruan, CL Wei, TR Gingeras, R Guigo, J Harrow, MB Gerstein (2007). Genome Res 17: 839-51.

Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.
ZD Zhang, A Paccanaro, Y Fu, S Weissman, Z Weng, J Chang, M Snyder, MB Gerstein (2007). Genome Res 17: 787-97.

The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci.
JS Rozowsky, D Newburger, F Sayward, J Wu, G Jordan, JO Korbel, U Nagalakshmi, J Yang, D Zheng, R Guigo, TR Gingeras, S Weissman, P Miller, M Snyder, MB Gerstein (2007). Genome Res 17: 732-45.

Integrated analysis of experimental data sets reveals many novel promoters in 1% of the human genome.
ND Trinklein, U Karaoz, J Wu, A Halees, S Force Aldred, PJ Collins, D Zheng, ZD Zhang, MB Gerstein, M Snyder, RM Myers, Z Weng (2007). Genome Res 17: 720-31.

What is a gene, post-ENCODE? History and updated definition
MB Gerstein, C Bruce, JS Rozowsky, D Zheng, J Du, JO Korbel, O Emanuelsson, ZD Zhang, S Weissman, M Snyder (2007). Genome Res 17: 669-81.

An efficient pseudomedian filter for tiling microrrays.
TE Royce, NJ Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 186.

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome.
JO Korbel, AE Urban, F Grubert, J Du, TE Royce, P Starr, G Zhong, BS Emanuel, SM Weissman, M Snyder, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 10110-5.

Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications
H Yu, R Jansen, G Stolovitzky, M Gerstein (2007). Bioinformatics 23: 2163-73.

Global survey of human T leukemic cells by integrating proteomics and transcriptomics profiling.
L Wu, SI Hwang, K Rezaul, LJ Lu, V Mayya, M Gerstein, JK Eng, DH Lundgren, DK Han (2007). Mol Cell Proteomics 6: 1343-53.

Hinge Atlas: relating protein sequence to sites of structural flexibility.
SC Flores, LJ Lu, J Yang, N Carriero, MB Gerstein (2007). BMC Bioinformatics 8: 167.

Tilescope: online analysis pipeline for high-density tiling microarray data.
ZD Zhang, J Rozowsky, HY Lam, J Du, M Snyder, M Gerstein (2007). Genome Biol 8: R81.

Structured digital abstract makes text mining easy
M Gerstein, M Seringhaus, S Fields (2007). Nature 447: 142.

LinkHub: a Semantic Web system that facilitates cross-database queries and information retrieval in proteomics.
AK Smith, KH Cheung, KY Yip, M Schultz, MK Gerstein (2007). BMC Bioinformatics 8 Suppl 3: S5.

Getting connected: analysis and principles of biological networks.
X Zhu, M Gerstein, M Snyder (2007). Genes Dev 21: 1010-24.

RNAi development.
M Gerstein, SM Douglas (2007). PLoS Comput Biol 3: e80.

The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics.
H Yu, PM Kim, E Sprecher, V Trifonov, M Gerstein (2007). PLoS Comput Biol 3: e59.

Assessing the need for sequence-based normalization in tiling microarray experiments.
TE Royce, JS Rozowsky, MB Gerstein (2007). Bioinformatics 23: 988-97.

The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?
D Zheng, MB Gerstein (2007). Trends Genet 23: 219-24.

Global identification and characterization of transcriptionally active regions in the rice genome.
L Li, X Wang, R Sasidharan, V Stolc, W Deng, H He, J Korbel, X Chen, W Tongprasit, P Ronald, R Chen, M Gerstein, XW Deng (2007). PLoS One 2: e294.

Differential binding of calmodulin-related proteins to their targets revealed through high-density Arabidopsis protein microarrays.
SC Popescu, GV Popescu, S Bachan, Z Zhang, M Seay, M Gerstein, M Snyder, SP Dinesh-Kumar (2007). Proc Natl Acad Sci U S A 104: 4730-5.

New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis.
MG Smith, TA Gianoulis, S Pukatzki, JJ Mekalanos, LN Ornston, M Gerstein, M Snyder (2007). Genes Dev 21: 601-14.

Comparative analysis of genome tiling array data reveals many novel primate-specific functional RNAs in human.
Z Zhang, AW Pang, M Gerstein (2007). BMC Evol Biol 7 Suppl 1: S14.

Publishing perishing? Towards tomorrow's information architecture
MR Seringhaus, MB Gerstein (2007). BMC Bioinformatics 8: 17.

Chemistry Nobel rich in structure
M Seringhaus, M Gerstein (2007). Science 315: 40-1.

Positional artifacts in microarrays: experimental verification and construction of COP, an automated detection tool.
H Yu, K Nguyen, T Royce, J Qian, K Nelson, M Snyder, M Gerstein (2007). Nucleic Acids Res 35: e8.

Assessing the performance of different high-density tiling microarray strategies for mapping transcribed regions of the human genome.
O Emanuelsson, U Nagalakshmi, D Zheng, JS Rozowsky, AE Urban, J Du, Z Lian, V Stolc, S Weissman, M Snyder, MB Gerstein (2007). Genome Res 17: 886-97.

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation.
JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Nucleic Acids Res 35: D55-60.

An interdepartmental Ph.D. program in computational biology and bioinformatics: the Yale perspective.
M Gerstein, D Greenbaum, K Cheung, PL Miller (2007). J Biomed Inform 40: 73-9.

-- 2006 (31) --

The Death of the Scientific Paper
M Seringhaus, M Gerstein (2006). The Scientist 20(9): 25.

Analytical Evolutionary Model for Protein Fold Occurrence in Genomes, Accounting for the Effects of Gene Duplication, Deletion, Acquisition and Selective Pressure
M Kamal, N Luscombe, J Qian, M Gerstein (2006). in Power Laws, Scale-Free Networks and Genome Biology (edited by EV Koonin, YI Wolf, GP Karev; Springer, New York), pages 165-193

Novel transcribed regions in the human genome.
J Rozowsky, J Wu, Z Lian, U Nagalakshmi, JO Korbel, P Kapranov, D Zheng, S Dyke, P Newburger, P Miller, TR Gingeras, S Weissman, M Gerstein, M Snyder (2006). Cold Spring Harb Symp Quant Biol 71: 111-6.

Relating three-dimensional structures to protein networks provides evolutionary insights.
PM Kim, LJ Lu, Y Xia, MB Gerstein (2006). Science 314: 1938-41.

Data mining on the web
A Smith, M Gerstein (2006). Science 314: 1682; author reply 1682.

An integrative genomic approach to uncover molecular mechanisms of prokaryotic traits.
Y Liu, J Li, L Sam, CS Goh, M Gerstein, YA Lussier (2006). PLoS Comput Biol 2: e159.

ProCAT: a data analysis approach for protein microarrays.
X Zhu, M Gerstein, M Snyder (2006). Genome Biol 7: R110.

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.
LY Wang, M Snyder, M Gerstein (2006). Genome Biol 7: R102.

Helix Interaction Tool (HIT): a web-based tool for analysis of helix-helix interactions in proteins.
AE Burba, U Lehnert, EZ Yu, M Gerstein (2006). Bioinformatics 22: 2735-8.

A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.
J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006). Bioinformatics 22: 3016-24.

Integration of curated databases to identify genotype-phenotype associations.
CS Goh, TA Gianoulis, Y Liu, J Li, A Paccanaro, YA Lussier, M Gerstein (2006). BMC Genomics 7: 257.

The tYNA platform for comparative interactomics: a web tool for managing, comparing and mining multiple networks.
KY Yip, H Yu, PM Kim, M Schultz, M Gerstein (2006). Bioinformatics 22: 2968-70.

Genomic analysis of the hierarchical structure of regulatory networks.
H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.

TOS9 regulates white-opaque switching in Candida albicans.
T Srikantha, AR Borneman, KJ Daniels, C Pujol, W Wu, MR Seringhaus, M Gerstein, S Yi, M Snyder, DR Soll (2006). Eukaryot Cell 5: 1674-87.

Extrapolating traditional DNA microarray statistics to tiling and protein microarray technologies.
TE Royce, JS Rozowsky, NM Luscombe, O Emanuelsson, H Yu, X Zhu, M Snyder, MB Gerstein (2006). Methods Enzymol 411: 282-311.

Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing.
R Pinard, A de Winter, GJ Sarkis, MB Gerstein, KR Tartaro, RN Plant, M Egholm, JM Rothberg, JH Leamon (2006). BMC Genomics 7: 216.

A computational approach for identifying pseudogenes in the ENCODE regions.
D Zheng, MB Gerstein (2006). Genome Biol 7 Suppl 1: S13.1-10.

Predicting essential genes in fungal genomes.
M Seringhaus, A Paccanaro, A Borneman, M Snyder, M Gerstein (2006). Genome Res 16: 1126-35.

The real life of pseudogenes.
M Gerstein, D Zheng (2006). Sci Am 295: 48-55.

Design principles of molecular networks revealed by global comparisons and composite motifs.
H Yu, Y Xia, V Trifonov, M Gerstein (2006). Genome Biol 7: R55.

The geometry of the ribosomal polypeptide exit tunnel.
NR Voss, M Gerstein, TA Steitz, PB Moore (2006). J Mol Biol 360: 893-906.

Genomic analysis of insertion behavior and target specificity of mini-Tn7 and Tn3 transposons in Saccharomyces cerevisiae.
M Seringhaus, A Kumar, J Hartigan, M Snyder, M Gerstein (2006). Nucleic Acids Res 34: e57.

Tools needed to navigate landscape of the genome
M Gerstein (2006). Nature 440: 740.

PseudoPipe: an automated pseudogene identification pipeline.
Z Zhang, N Carriero, D Zheng, J Karro, PM Harrison, M Gerstein (2006). Bioinformatics 22: 1437-9.

Global landscape of protein complexes in the yeast Saccharomyces cerevisiae.
NJ Krogan, G Cagney, H Yu, G Zhong, X Guo, A Ignatchenko, J Li, S Pu, N Datta, AP Tikuisis, T Punna, JM Peregrin-Alvarez, M Shales, X Zhang, M Davey, MD Robinson, A Paccanaro, JE Bray, A Sheung, B Beattie, DP Richards, V Canadien, A Lalev, F Mena, P Wong, A Starostine, MM Canete, J Vlasblom, S Wu, C Orsi, SR Collins, S Chandran, R Haw, JJ Rilstone, K Gandi, NJ Thompson, G Musso, P St Onge, S Ghanny, MH Lam, G Butland, AM Altaf-Ul, S Kanaya, A Shilatifard, E O'Shea, JS Weissman, CJ Ingles, TR Hughes, J Parkinson, M Gerstein, SJ Wodak, A Emili, JF Greenblatt (2006). Nature 440: 637-43.

High-resolution mapping of DNA copy alterations in human chromosome 22 using high-density tiling oligonucleotide arrays.
AE Urban, JO Korbel, R Selzer, T Richmond, A Hacker, GV Popescu, JF Cubells, R Green, BS Emanuel, MB Gerstein, SM Weissman, M Snyder (2006). Proc Natl Acad Sci U S A 103: 4534-9.

Predicting interactions in protein networks by completing defective cliques.
H Yu, A Paccanaro, V Trifonov, M Gerstein (2006). Bioinformatics 22: 823-9.

Target hub proteins serve as master regulators of development in yeast.
AR Borneman, JA Leigh-Bell, H Yu, P Bertone, M Gerstein, M Snyder (2006). Genes Dev 20: 435-48.

Integrated prediction of the helical membrane protein interactome in yeast.
Y Xia, LJ Lu, M Gerstein (2006). J Mol Biol 357: 339-49.

The Database of Macromolecular Motions: new features added at the decade mark.
S Flores, N Echols, D Milburn, B Hespenheide, K Keating, J Lu, S Wells, EZ Yu, M Thorpe, M Gerstein (2006). Nucleic Acids Res 34: D296-301.

Design optimization methods for genomic DNA tiling arrays.
P Bertone, V Trifonov, JS Rozowsky, F Schubert, O Emanuelsson, J Karro, MY Kao, M Snyder, M Gerstein (2006). Genome Res 16: 271-81.

-- 2005 (24) --

Inferring Protein-Protein Interactions Using Interaction Network Topologies
A Paccanaro, V Trifonov, H Yu, M Gerstein (2005). International Joint Conference on Neural Networks (IJCNN, Jul. 31-Aug. 4, Montreal, Canada), pages 161-166, vol. 1

Protein Interaction Prediction by Integrating Genomic Features and Protein Interaction Network Analysis
LJ Lu, Y Xia, H Yu, A Rives, H Lu, F Schubert, M Gerstein (2005). Data Analysis and Visualization in Genomics and Proteomics (Wiley, NY)

Biochemical and genetic analysis of the yeast proteome with a movable ORF collection.
DM Gelperin, MA White, ML Wilkinson, Y Kon, LA Kung, KJ Wise, N Lopez-Hoyo, L Jiang, S Piccirillo, H Yu, M Gerstein, ME Dumont, EM Phizicky, M Snyder, EJ Grayhack (2005). Genes Dev 19: 2816-26.

Global analysis of protein phosphorylation in yeast.
J Ptacek, G Devgan, G Michaud, H Zhu, X Zhu, J Fasolo, H Guo, G Jona, A Breitkreutz, R Sopko, RR McCartney, MC Schmidt, N Rachidi, SJ Lee, AS Mah, L Meng, MJ Stark, DF Stern, C De Virgilio, M Tyers, B Andrews, M Gerstein, B Schweitzer, PF Predki, M Snyder (2005). Nature 438: 679-84.

Global changes in STAT target selection and transcription regulation upon interferon treatments.
SE Hartman, P Bertone, AK Nath, TE Royce, M Gerstein, S Weissman, M Snyder (2005). Genes Dev 19: 2953-68.

A pilot study of transcription unit analysis in rice using oligonucleotide tiling-path microarray.
V Stolc, L Li, X Wang, X Li, N Su, W Tongprasit, B Han, Y Xue, J Li, M Snyder, M Gerstein, J Wang, XW Deng (2005). Plant Mol Biol 59: 137-49.

Network security and data integrity in academia: an assessment and a proposal for large-scale archiving.
A Smith, D Greenbaum, SM Douglas, M Long, M Gerstein (2005). Genome Biol 6: 119.

PubNet: a flexible system for visualizing literature derived networks.
SM Douglas, GT Montelione, M Gerstein (2005). Genome Biol 6: R80.

Proton sensitivity of ASIC1 appeared with the rise of fishes by changes of residues in the region that follows TM1 in the ectodomain of the channel.
T Coric, D Zheng, M Gerstein, CM Canessa (2005). J Physiol 568: 725-35.

Assessing the limits of genomic data integration for predicting protein networks.
LJ Lu, Y Xia, A Paccanaro, H Yu, M Gerstein (2005). Genome Res 15: 945-53.

Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping.
TE Royce, JS Rozowsky, P Bertone, M Samanta, V Stolc, S Weissman, M Snyder, M Gerstein (2005). Trends Genet 21: 466-75.

YeastHub: a semantic web use case for integrating data in the life sciences domain.
KH Cheung, KY Yip, A Smith, R Deknikker, A Masiar, M Gerstein (2005). Bioinformatics 21 Suppl 1: i85-96.

Integrated pseudogene annotation for human chromosome 22: evidence for transcription.
D Zheng, Z Zhang, PM Harrison, J Karro, N Carriero, M Gerstein (2005). J Mol Biol 349: 27-45.

Applications of DNA tiling arrays to experimental genome annotation and regulatory pathway discovery.
P Bertone, M Gerstein, M Snyder (2005). Chromosome Res 13: 259-74.

Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles.
Y Gilad, SA Rifkin, P Bertone, M Gerstein, KP White (2005). Genome Res 15: 674-80.

Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability.
PM Harrison, D Zheng, Z Zhang, N Carriero, M Gerstein (2005). Nucleic Acids Res 33: 2374-83.

Use of thioredoxin as a reporter to identify a subset of Escherichia coli signal sequences that promote signal recognition particle-dependent translocation.
D Huber, D Boyd, Y Xia, MH Olma, M Gerstein, J Beckwith (2005). J Bacteriol 187: 2983-91.

Robotic cloning and Protein Production Platform of the Northeast Structural Genomics Consortium.
TB Acton, KC Gunsalus, R Xiao, LC Ma, J Aramini, MC Baran, YW Chiang, T Climent, B Cooper, NG Denissova, SM Douglas, JK Everett, CK Ho, D Macapagal, PK Rajan, R Shastry, LY Shih, GV Swapna, M Wilson, M Wu, M Gerstein, M Inouye, JF Hunt, GT Montelione (2005). Methods Enzymol 394: 210-43.

Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms.
S Balasubramanian, Y Xia, E Freinkman, M Gerstein (2005). Nucleic Acids Res 33: 1710-21.

The temporal patterning microRNA let-7 regulates several transcription factors at the larval to adult transition in C. elegans.
H Grosshans, T Johnson, KL Reinert, M Gerstein, FJ Slack (2005). Dev Cell 8: 321-30.

Normal modes for predicting protein motions: a comprehensive database assessment and associated Web tool.
V Alexandrov, U Lehnert, N Echols, D Milburn, D Engelman, M Gerstein (2005). Protein Sci 14: 633-43.

Calculation of standard atomic volumes for RNA and comparison with proteins: RNA is packed more tightly.
NR Voss, M Gerstein (2005). J Mol Biol 346: 477-92.

Impediments to database interoperation: legal issues and security concerns
D Greenbaum, A Smith, M Gerstein (2005). Nucleic Acids Res 33: D3-4.

A high productivity/low maintenance approach to high-performance computation for biomedicine: four case studies.
N Carriero, MV Osier, KH Cheung, PL Miller, M Gerstein, H Zhao, B Wu, S Rifkin, J Chang, H Zhang, K White, K Williams, M Schultz (2005). J Am Med Inform Assoc 12: 90-8.

-- 2004 (31) --

An XML-Based Approach to Integrating Heterogeneous Yeast Genome Data
KH Cheung, D Pan, A Smith, M Seringhaus, SM Douglas, M Gerstein (2004). International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS); pp 236-242

Computational analysis of membrane proteins: genomic occurrence, structure prediction and helix interactions.
U Lehnert, Y Xia, TE Royce, CS Goh, Y Liu, A Senes, H Yu, ZL Zhang, DM Engelman, M Gerstein (2004). Q Rev Biophys 37: 121-46.

DNA replication-timing analysis of human chromosome 22 at high resolution and different developmental states.
EJ White, O Emanuelsson, D Scalzo, T Royce, S Kosak, EJ Oakeley, S Weissman, M Gerstein, M Groudine, M Snyder, D Schubeler (2004). Proc Natl Acad Sci U S A 101: 17771-6.

Fast optimal genome tiling with applications to microarray design and homology search.
P Berman, P Bertone, B Dasgupta, M Gerstein, MY Kao, M Snyder (2004). J Comput Biol 11: 766-85.

Global identification of human transcribed sequences with genome tiling arrays.
P Bertone, V Stolc, TE Royce, JS Rozowsky, AE Urban, X Zhu, JL Rinn, W Tongprasit, M Samanta, S Weissman, M Gerstein, M Snyder (2004). Science 306: 2242-6.

The ENCODE (ENCyclopedia Of DNA Elements) Project.
ENCODE Project Consortium (2004). Science 306: 636-40.

Information assessment on predicting protein-protein interactions.
N Lin, B Wu, R Jansen, M Gerstein, H Zhao (2004). BMC Bioinformatics 5: 154.

Regulation of gene expression by a metabolic enzyme.
DA Hall, H Zhu, X Zhu, T Royce, M Gerstein, M Snyder (2004). Science 306: 482-4.

Large-scale mutagenesis of the yeast genome using a Tn7-derived multipurpose transposon.
A Kumar, M Seringhaus, MC Biery, RJ Sarnovsky, L Umansky, S Piccirillo, M Heidtman, KH Cheung, CJ Dobry, MB Gerstein, NL Craig, M Snyder (2004). Genome Res 14: 1975-86.

Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction.
R Jansen, M Gerstein (2004). Curr Opin Microbiol 7: 535-45.

Genomic analysis of regulatory network dynamics reveals large topological changes.
NM Luscombe, MM Babu, H Yu, M Snyder, SA Teichmann, M Gerstein (2004). Nature 431: 308-12.

Comprehensive analysis of pseudogenes in prokaryotes: widespread gene decay and failure of putative horizontally transferred genes.
Y Liu, PM Harrison, V Kunin, M Gerstein (2004). Genome Biol 5: R64.

Large-scale analysis of pseudogenes in the human genome.
Z Zhang, M Gerstein (2004). Curr Opin Genet Dev 14: 328-35.

The protein target list of the Northeast Structural Genomics Consortium.
Z Wunderlich, TB Acton, J Liu, G Kornhaber, J Everett, P Carter, N Lan, N Echols, M Gerstein, B Rost, GT Montelione (2004). Proteins 56: 181-7.

Structure and evolution of transcriptional regulatory networks.
MM Babu, NM Luscombe, L Aravind, M Gerstein, SA Teichmann (2004). Curr Opin Struct Biol 14: 283-91.

Analyzing cellular biochemistry in terms of molecular networks.
Y Xia, H Yu, R Jansen, M Seringhaus, S Baxter, D Greenbaum, H Zhao, M Gerstein (2004). Annu Rev Biochem 73: 1051-87.

Major molecular differences between mammalian sexes are involved in drug metabolism and renal function.
JL Rinn, JS Rozowsky, IJ Laurenzi, PH Petersen, K Zou, W Zhong, M Gerstein, M Snyder (2004). Dev Cell 6: 791-800.

Computer security in academia-a potential roadblock to distributed annotation of the human genome
D Greenbaum, SM Douglas, A Smith, J Lim, M Fischer, M Schultz, M Gerstein (2004). Nat Biotechnol 22: 771-2.

Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs.
H Yu, NM Luscombe, HX Lu, X Zhu, Y Xia, JD Han, N Bertin, S Chung, M Vidal, M Gerstein (2004). Genome Res 14: 1107-18.

Genomic analysis of essentiality within protein networks.
H Yu, D Greenbaum, HX Lu, X Zhu, M Gerstein (2004). Trends Genet 20: 227-31.

Selection and characterization of small random transmembrane proteins that bind and activate the platelet-derived growth factor beta receptor.
LL Freeman-Cook, AM Dixon, JB Frank, Y Xia, L Ely, M Gerstein, DM Engelman, D DiMaio (2004). J Mol Biol 338: 907-20.

Conformational changes associated with protein-protein interactions.
CS Goh, D Milburn, M Gerstein (2004). Curr Opin Struct Biol 14: 104-9.

CREB binds to multiple loci on human chromosome 22.
G Euskirchen, TE Royce, P Bertone, R Martone, JL Rinn, FK Nelson, F Sayward, NM Luscombe, P Miller, M Gerstein, S Weissman, M Snyder (2004). Mol Cell Biol 24: 3804-14.

A method using active-site sequence conservation to find functional shifts in protein families: application to the enzymes of central metabolism, leading to the identification of an anomalous isocitrate dehydrogenase in pathogens.
R Das, M Gerstein (2004). Proteins 55: 455-63.

Exploring the range of protein flexibility, from a structural proteomics perspective.
M Gerstein, N Echols (2004). Curr Opin Chem Biol 8: 14-9.

Transmembrane protein domains rarely use covalent domain recombination as an evolutionary mechanism.
Y Liu, M Gerstein, DM Engelman (2004). Proc Natl Acad Sci U S A 101: 3495-7.

Comparative analysis of processed pseudogenes in the mouse and human genomes.
Z Zhang, N Carriero, M Gerstein (2004). Trends Genet 20: 62-7.

Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis.
CS Goh, N Lan, SM Douglas, B Wu, N Echols, A Smith, D Milburn, GT Montelione, H Zhao, M Gerstein (2004). J Mol Biol 336: 115-30.

TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics.
H Yu, X Zhu, D Greenbaum, J Karro, M Gerstein (2004). Nucleic Acids Res 32: 328-37.

Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures.
V Alexandrov, M Gerstein (2004). BMC Bioinformatics 5: 2.

A map of the interactome network of the metazoan C. elegans.
S Li, CM Armstrong, N Bertin, H Ge, S Milstein, M Boxem, PO Vidalain, JD Han, A Chesneau, T Hao, DS Goldberg, N Li, M Martinez, JF Rual, P Lamesch, L Xu, M Tewari, SL Wong, LV Zhang, GF Berriz, L Jacotot, P Vaglio, J Reboul, T Hirozane-Kishikawa, Q Li, HW Gabel, A Elewa, B Baumgartner, DJ Rose, H Yu, S Bosak, R Sequerra, A Fraser, SE Mango, WM Saxton, S Strome, S Van Den Heuvel, F Piano, J Vandenhaute, C Sardet, M Gerstein, L Doucette-Stamm, KC Gunsalus, JW Harper, ME Cusick, FP Roth, DE Hill, M Vidal (2004). Science 303: 540-3.

-- 2003 (33) --

Tools and databases to analyze protein flexibility; approaches to mapping implied features onto sequences.
WG Krebs, J Tsai, V Alexandrov, J Junker, R Jansen, M Gerstein (2003). Methods Enzymol 374: 544-84.

An analysis of the present system of scientific publishing: what's wrong and where to go from here
D Greenbaum, J Lim, M Gerstein (2003). Interdiscip Sci Rev 28: 293-302.

Identification of novel functional elements in the human genome.
Z Lian, G Euskirchen, J Rinn, R Martone, P Bertone, S Hartman, T Royce, K Nelson, F Sayward, N Luscombe, J Yang, JL Li, P Miller, AE Urban, M Gerstein, S Weissman, M Snyder (2003). Cold Spring Harb Symp Quant Biol 68: 317-22.

Relationship between gene co-expression and probe localization on microarray slides.
Y Kluger, H Yu, J Qian, M Gerstein (2003). BMC Genomics 4: 49.

Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome.
Z Zhang, PM Harrison, Y Liu, M Gerstein (2003). Genome Res 13: 2541-58.

A genome-wide analysis of blue-light regulation of Arabidopsis transcription factor gene expression during seedling development.
Y Jiao, H Yang, L Ma, N Sun, H Yu, T Liu, Y Gao, H Gu, Z Chen, M Wada, M Gerstein, H Zhao, LJ Qu, XW Deng (2003). Plant Physiol 133: 1480-93.

Reconstructing genetic networks in yeast.
Z Zhang, M Gerstein (2003). Nat Biotechnol 21: 1295-7.

A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs.
PM Harrison, N Carriero, Y Liu, M Gerstein (2003). J Mol Biol 333: 885-92.

A Bayesian networks approach for predicting protein-protein interactions from genomic data.
R Jansen, H Yu, D Greenbaum, Y Kluger, NJ Krogan, S Chung, A Emili, M Snyder, JF Greenblatt, M Gerstein (2003). Science 302: 449-53.

Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data.
J Qian, J Lin, NM Luscombe, H Yu, M Gerstein (2003). Bioinformatics 19: 1917-26.

Distribution of NF-kappaB-binding sites across human chromosome 22.
R Martone, G Euskirchen, P Bertone, S Hartman, TE Royce, NM Luscombe, JL Rinn, FK Nelson, P Miller, M Gerstein, S Weissman, M Snyder (2003). Proc Natl Acad Sci U S A 100: 12247-52.

Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes.
Z Zhang, M Gerstein (2003). Nucleic Acids Res 31: 5338-48.

Comparing protein abundance and mRNA expression levels on a genomic scale.
D Greenbaum, C Colangelo, K Williams, M Gerstein (2003). Genome Biol 4: 117.

A universal legal framework as a prerequisite for database interoperability.
D Greenbaum, M Gerstein (2003). Nat Biotechnol 21: 979-82.

The human genome has 49 cytochrome c pseudogenes, including a relic of a primordial gene that still functions in mouse.
Z Zhang, M Gerstein (2003). Gene 312: 61-72.

Genomic analysis of gene expression relationships in transcriptional regulatory networks.
H Yu, NM Luscombe, J Qian, M Gerstein (2003). Trends Genet 19: 422-7.

Identification and correction of spurious spatial correlations in microarray data.
J Qian, Y Kluger, H Yu, M Gerstein (2003). Biotechniques 35: 42-4, 46, 48.

Sequences and topology.
M Gerstein, JM Thornton (2003). Curr Opin Struct Biol 13: 341-3.

ExpressYourself: A modular platform for processing and visualizing microarray data.
NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31: 3477-82.

Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements.
Z Zhang, M Gerstein (2003). J Biol 2: 11.

A method to assess compositional bias in biological sequences and its application to prion-like glutamine/asparagine-rich domains in eukaryotic proteomes.
PM Harrison, M Gerstein (2003). Genome Biol 4: R40.

Data mining crystallization databases: knowledge-based approaches to optimize protein crystal screens.
MS Kimber, F Vallee, S Houston, A Necakov, T Skarina, E Evdokimova, S Beasley, D Christendat, A Savchenko, CH Arrowsmith, M Vedadi, M Gerstein, AM Edwards (2003). Proteins 51: 562-8.

SPINE 2: a system for collaborative structural proteomics within a federated database framework.
CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31: 2833-8.

Identification and characterization of over 100 mitochondrial ribosomal protein pseudogenes in the human genome.
Z Zhang, M Gerstein (2003). Genomics 81: 468-80.

Genomics. Defining genes in the genomics era
M Snyder, M Gerstein (2003). Science 300: 258-60.

Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models.
R Jansen, HJ Bussemaker, M Gerstein (2003). Nucleic Acids Res 31: 2242-51.

Spectral biclustering of microarray data: coclustering genes and conditions.
Y Kluger, R Basri, JT Chang, M Gerstein (2003). Genome Res 13: 703-16.

Structural genomics: current progress.
M Gerstein, A Edwards, CH Arrowsmith, GT Montelione (2003). Science 299: 1663.

The transcriptional activity of human Chromosome 22.
JL Rinn, G Euskirchen, P Bertone, R Martone, NM Luscombe, S Hartman, PM Harrison, FK Nelson, P Miller, M Gerstein, S Weissman, M Snyder (2003). Genes Dev 17: 529-40.

Identification of pseudogenes in the Drosophila melanogaster genome.
PM Harrison, D Milburn, Z Zhang, P Bertone, M Gerstein (2003). Nucleic Acids Res 31: 1033-7.

Strategies for structural proteomics of prokaryotes: Quantifying the advantages of studying orthologous proteins and of using both NMR and X-ray crystallography approaches.
A Savchenko, A Yee, A Khachatryan, T Skarina, E Evdokimova, M Pavlova, A Semesi, J Northey, S Beasley, N Lan, R Das, M Gerstein, CH Arrowsmith, AM Edwards (2003). Proteins 50: 392-9.

Ontologies for proteomics: towards a systematic definition of structure and function that scales to the genome level.
N Lan, GT Montelione, M Gerstein (2003). Curr Opin Chem Biol 7: 44-54.

MolMovDB: analysis and visualization of conformational change and structural flexibility.
N Echols, D Milburn, M Gerstein (2003). Nucleic Acids Res 31: 478-82.

-- 2002 (32) --

Calculations of protein volumes: sensitivity analysis and parameter database.
J Tsai, M Gerstein (2002). Bioinformatics 18: 985-95.

Subcellular localization of the yeast proteome.
A Kumar, S Agarwal, JA Heyman, S Matson, M Heidtman, S Piccirillo, L Umansky, A Drawid, R Jansen, Y Liu, KH Cheung, P Miller, M Gerstein, GS Roeder, M Snyder (2002). Genes Dev 16: 707-19.

Analysis of mRNA expression and protein abundance data: an approach for the comparison of the enrichment of features in the cellular population of proteins and transcripts.
D Greenbaum, R Jansen, M Gerstein (2002). Bioinformatics 18: 585-96.

Structural genomics analysis: characteristics of atypical, common, and horizontally transferred folds.
H Hegyi, J Lin, D Greenbaum, M Gerstein (2002). Proteins 47: 126-41.

A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution.
P Harrison, A Kumar, N Lan, N Echols, M Snyder, M Gerstein (2002). J Mol Biol 316: 409-19.

Studying genomes through the aeons: protein families, pseudogenes and proteome evolution.
PM Harrison, M Gerstein (2002). J Mol Biol 318: 1155-74.

A question of size: the eukaryotic proteome and the problems in defining it.
PM Harrison, A Kumar, N Lang, M Snyder, M Gerstein (2002). Nucleic Acids Res 30: 1083-90.

Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22.
PM Harrison, H Hegyi, S Balasubramanian, NM Luscombe, P Bertone, N Echols, T Johnson, M Gerstein (2002). Genome Res 12: 272-80.

Relating whole-genome expression data with protein-protein interactions.
R Jansen, D Greenbaum, M Gerstein (2002). Genome Res 12: 37-46.

Proteomics. Integrating interactomes.
M Gerstein, N Lan, R Jansen (2002). Science 295: 284-7.

Towards a systematic definition of protein function that scales to the genome level: Defining function in terms of interactions.
N Lan, R Jansen, M Gerstein (2002). Proceedings of the IEEE 90: 1848-1858.

Blurring the boundaries between scientific 'papers' and biological databases
M Gerstein, J Junker (2002). Nature Yearbook of Science and Technology 210-212 (ed. D Butler, Palgrave Macmillan Publishers)

Fast optimal genome tiling with applications to microarray design and homology search.
P Berman, P Bertone, B DasGupta, M Gerstein, M-Y Kao, M Snyder (2002). Proceedings of the 2nd International Workshop on Algorithms in Bioinformatics. Springer-Verlag LNCS 2452: 419-433

Integration of genomic datasets to predict protein complexes in yeast.
R Jansen, N Lan, J Qian, M Gerstein (2002). J Struct Funct Genomics 2: 71-81.

Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae.
CE Horak, NM Luscombe, J Qian, P Bertone, S Piccirillo, M Gerstein, M Snyder (2002). Genes Dev 16: 3017-33.

YMD: a microarray database for large-scale gene expression analysis.
KH Cheung, K White, J Hager, M Gerstein, V Reinke, K Nelson, P Masiar, R Srivastava, Y Li, J Li, H Zhao, J Li, DB Allison, M Snyder, P Miller, K Williams (2002). Proc AMIA Symp: 140-4.

Thermostability of membrane protein helix-helix interaction elucidated by statistical analysis.
D Schneider, Y Liu, M Gerstein, DM Engelman (2002). FEBS Lett 532: 231-6.

Systematic learning of gene functional classes from DNA array expression data by using multilayer perceptrons.
A Mateos, J Dopazo, R Jansen, Y Tu, M Gerstein, G Stolovitzky (2002). Genome Res 12: 1703-15.

Digging deep for ancient relics: a survey of protein motifs in the intergenic sequences of four eukaryotic genomes.
ZL Zhang, PM Harrison, M Gerstein (2002). J Mol Biol 323: 811-22.

GeneCensus: genome comparisons in terms of metabolic pathway activity and protein family sharing.
J Lin, J Qian, D Greenbaum, P Bertone, R Das, N Echols, A Senes, B Stenger, M Gerstein (2002). Nucleic Acids Res 30: 4574-82.

Genomic and proteomic analysis of the myeloid differentiation program: global analysis of gene expression during induced differentiation in the MPRO cell line.
Z Lian, Y Kluger, DS Greenbaum, D Tuck, M Gerstein, N Berliner, SM Weissman, PE Newburger (2002). Blood 100: 3209-20.

Genomic analysis of membrane protein families: abundance and conserved motifs.
Y Liu, DM Engelman, M Gerstein (2002). Genome Biol 3: RESEARCH0054.

Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome.
Z Zhang, P Harrison, M Gerstein (2002). Genome Res 12: 1466-82.

Bridging structural biology and genomics: assessing protein interaction data with known complexes.
AM Edwards, B Kus, R Jansen, D Greenbaum, J Greenblatt, M Gerstein (2002). Trends Genet 18: 529-36.

Normal mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic.
WG Krebs, V Alexandrov, CA Wilson, N Echols, H Yu, M Gerstein (2002). Proteins 48: 682-95.

The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties.
NM Luscombe, J Qian, Z Zhang, T Johnson, M Gerstein (2002). Genome Biol 3: RESEARCH0040.

Functional profiling of the Saccharomyces cerevisiae genome.
G Giaever, AM Chu, L Ni, C Connelly, L Riles, S Veronneau, S Dow, A Lucau-Danila, K Anderson, B Andre, AP Arkin, A Astromoff, M El-Bakkoury, R Bangham, R Benito, S Brachat, S Campanaro, M Curtiss, K Davis, A Deutschbauer, KD Entian, P Flaherty, F Foury, DJ Garfinkel, M Gerstein, D Gotte, U Guldener, JH Hegemann, S Hempel, Z Herman, DF Jaramillo, DE Kelly, SL Kelly, P Kotter, D LaBonte, DC Lamb, N Lan, H Liang, H Liao, L Liu, C Luo, M Lussier, R Mao, P Menard, SL Ooi, JL Revuelta, CJ Roberts, M Rose, P Ross-Macdonald, B Scherens, G Schimmack, B Shafer, DD Shoemaker, S Sookhai-Mahadeo, RK Storms, JN Strathern, G Valle, M Voet, G Volckaert, CY Wang, TR Ward, J Wilhelmy, EA Winzeler, Y Yang, G Yen, E Youngman, K Yu, H Bussey, JD Boeke, M Snyder, P Philippsen, RW Davis, M Johnston (2002). Nature 418: 387-91.

SNPs on human chromosomes 21 and 22 -- analysis in terms of protein features and pseudogenes.
S Balasubramanian, P Harrison, H Hegyi, P Bertone, N Luscombe, N Echols, P McGarvey, Z Zhang, M Gerstein (2002). Pharmacogenomics 3: 393-402.

Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes.
N Echols, P Harrison, S Balasubramanian, NM Luscombe, P Bertone, Z Zhang, M Gerstein (2002). Nucleic Acids Res 30: 2515-23.

GATA-1 binding sites mapped in the beta-globin locus by using mammalian chIp-chip analysis.
CE Horak, MC Mahajan, NM Luscombe, M Gerstein, SM Weissman, M Snyder (2002). Proc Natl Acad Sci U S A 99: 2924-9.

Structural genomics: a new era for pharmaceutical research
Y Liu, NM Luscombe, V Alexandrov, P Bertone, P Harrison, Z Zhang, M Gerstein (2002). Genome Biol 3: REPORTS4004.

An integrated approach for finding overlooked genes in yeast.
A Kumar, PM Harrison, KH Cheung, N Lan, N Echols, P Bertone, P Miller, MB Gerstein, M Snyder (2002). Nat Biotechnol 20: 58-63.

-- 2001 (21) --

Protein family and fold occurrence in genomes: power-law behaviour and evolutionary model.
J Qian, NM Luscombe, M Gerstein (2001). J Mol Biol 313: 673-81.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.
J Qian, M Dolled-Filhart, J Lin, H Yu, M Gerstein (2001). J Mol Biol 314: 1053-66.

Annotation transfer for genomics: measuring functional divergence in multi-domain proteins.
H Hegyi, M Gerstein (2001). Genome Res 11: 1632-40.

What is bioinformatics? A proposed definition and overview of the field.
NM Luscombe, D Greenbaum, M Gerstein (2001). Methods Inf Med 40: 346-58.

Global perspectives on proteins: comparing genomes in terms of folds, pathways and beyond.
R Das, J Junker, D Greenbaum, MB Gerstein (2001). Pharmacogenomics J 1: 115-25.

Global analysis of protein activities using proteome chips.
H Zhu, M Bilgin, R Bangham, D Hall, A Casamayor, P Bertone, N Lan, R Jansen, S Bidlingmaier, T Houfek, T Mitchell, P Miller, RA Dean, M Gerstein, M Snyder (2001). Science 293: 2101-5.

Calculating populations of subcellular compartments using density matrix formalism
V Alexandrov, M Gerstein (2001). International Journal of Quantum Chemistry 85: 693-696

A standard reference frame for the description of nucleic acid base-pair geometry.
WK Olson, M Bansal, SK Burley, RE Dickerson, M Gerstein, SC Harvey, U Heinemann, XJ Lu, S Neidle, Z Shakked, H Sklenar, M Suzuki, CS Tung, E Westhof, C Wolberger, HM Berman (2001). J Mol Biol 313: 229-37.

Determining the minimum number of types necessary to represent the sizes of protein atoms.
J Tsai, N Voss, M Gerstein (2001). Bioinformatics 17: 949-56.

Interrelating different types of genomic data, from proteome to secretome: 'oming in on function.
D Greenbaum, NM Luscombe, R Jansen, J Qian, M Gerstein (2001). Genome Res 11: 1463-8.

Protein Geometry: Distances, Areas, and Volumes
M Gerstein, FM Richards (2001). International Tables for Crystallography (Volume F, Chapter 22.1.1, pages 531-539; M Rossmann & E Arnold, editors; Dordrecht: Kluwer)

A Bauhaus for Biologists: An Introduction to Protein Architecture by A. M. Lesk
P Harrison, M Gerstein (2001). Trends Biochem Sci 26: 204-205

Integrative data mining: the new direction in bioinformatics.
P Bertone, M Gerstein (2001). IEEE Eng Med Biol Mag 20: 33-40.

Genomic and proteomic analysis of the myeloid differentiation program.
Z Lian, L Wang, S Yamaga, W Bonds, Y Beazer-Barclay, Y Kluger, M Gerstein, PE Newburger, N Berliner, SM Weissman (2001). Blood 98: 513-24.

RNA expression patterns change dramatically in human neutrophils exposed to bacteria.
YV Subrahmanyam, S Yamaga, Y Prashar, HH Lee, NP Hoe, Y Kluger, M Gerstein, JD Goguen, PE Newburger, SM Weissman (2001). Blood 97: 2457-68.

SPINE: an integrated tracking database and data mining approach for identifying feasible targets in high-throughput structural proteomics.
P Bertone, Y Kluger, N Lan, D Zheng, D Christendat, A Yee, AM Edwards, CH Arrowsmith, GT Montelione, M Gerstein (2001). Nucleic Acids Res 29: 2884-98.

Sequences and Topology
M Gerstein, B Honig (2001). Current Opinion in Structural Biology 11: 327-329.

What is Bioinformatics? A Proposed Definition and Overview of the Field
N Luscombe, D Greenbaum, M Gerstein (2001). Intl. Medical Informatics Association (Yearbook, Pages 83-99)

Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome.
PM Harrison, N Echols, MB Gerstein (2001). Nucleic Acids Res 29: 818-30.

PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.
J Qian, B Stenger, CA Wilson, J Lin, R Jansen, SA Teichmann, J Park, WG Krebs, H Yu, V Alexandrov, N Echols, M Gerstein (2001). Nucleic Acids Res 29: 1750-64.

An XML Application for Genomic Data Interoperation
KH Cheung, Y Liu, K Kumar, M Snyder, M Gerstein, P Miller (2001). IEEE International Symposium on Bio-Informatics and Biomedical Engineering (BIBE) pp. 97-103

-- 2000 (18) --

Integrative database analysis in structural genomics.
M Gerstein (2000). Nat Struct Biol 7 Suppl: 960-3.

Genome analyses of spirochetes: a study of the protein structures, functions and metabolic pathways in Treponema pallidum and Borrelia burgdorferi.
R Das, H Hegyi, M Gerstein (2000). J Mol Microbiol Biotechnol 2: 387-92.

Structural proteomics: prospects for high throughput sample preparation.
D Christendat, A Yee, A Dharamsi, Y Kluger, M Gerstein, CH Arrowsmith, AM Edwards (2000). Prog Biophys Mol Biol 73: 339-45.

Analysis of yeast protein kinases using protein chips.
H Zhu, JF Klemic, S Chang, P Bertone, A Casamayor, KG Klemic, D Smith, M Gerstein, MA Reed, M Snyder (2000). Nat Genet 26: 283-9.

Genome-wide analysis relating expression level with protein subcellular localization.
A Drawid, R Jansen, M Gerstein (2000). Trends Genet 16: 426-30.

The current excitement in bioinformatics -- analysis of whole-genome expression data: how does it relate to protein structure and function?
M Gerstein, R Jansen (2000). Curr Opin Struct Biol 10: 574-84.

The stability of thermophilic proteins: a study based on comprehensive genome comparison.
R Das, M Gerstein (2000). Funct Integr Genomics 1: 76-88.

Measuring shifts in function and evolutionary opportunity using variability profiles: a case study of the globins.
GJ Naylor, M Gerstein (2000). J Mol Evol 51: 223-33.

Structural proteomics of an archaeon.
D Christendat, A Yee, A Dharamsi, Y Kluger, A Savchenko, JR Cort, V Booth, CD Mackereth, V Saridakis, I Ekiel, G Kozlov, KL Maxwell, N Wu, LP McIntosh, K Gehring, MA Kennedy, AR Davidson, EF Pai, M Gerstein, AM Edwards, CH Arrowsmith (2000). Nat Struct Biol 7: 903-9.

A Bayesian system integrating expression data with sequence patterns for localizing proteins: comprehensive application to the yeast genome.
A Drawid, M Gerstein (2000). J Mol Biol 301: 1059-75.

Proteomics of Mycoplasma genitalium: identification and characterization of unannotated and atypical proteins in a small model genome.
S Balasubramanian, T Schneider, M Gerstein, L Regan (2000). Nucleic Acids Res 28: 3075-82.

Protein folds in the worm genome.
M Gerstein, J Lin, H Hegyi (2000). Pac Symp Biocomput: 30-41.

Annotation of the human genome.
M Gerstein (2000). Science 288: 1590.

Whole-genome trees based on the occurrence of folds and orthologs: implications for comparing genomes on different levels.
J Lin, M Gerstein (2000). Genome Res 10: 808-18.

The morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework.
WG Krebs, M Gerstein (2000). Nucleic Acids Res 28: 1665-75.

Statistical analysis of amino acid patterns in transmembrane helices: the GxxxG motif occurs frequently and in association with beta-branched residues at neighboring positions.
A Senes, M Gerstein, DM Engelman (2000). J Mol Biol 296: 921-36.

Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.
CA Wilson, J Kreychman, M Gerstein (2000). J Mol Biol 297: 233-49.

Analysis of the yeast transcriptome with structural and functional categories: characterizing highly expressed proteins.
R Jansen, M Gerstein (2000). Nucleic Acids Res 28: 1481-8.

-- 1999 (10) --

Large-scale analysis of the yeast genome by transposon tagging and gene disruption.
P Ross-Macdonald, PS Coelho, T Roemer, S Agarwal, A Kumar, R Jansen, KH Cheung, A Sheehan, D Symoniatis, L Umansky, M Heidtman, FK Nelson, H Iwasaki, K Hager, M Gerstein, P Miller, GS Roeder, M Snyder (1999). Nature 402: 413-8.

Perspectives: signal transduction. Proteins in motion.
M Gerstein, C Chothia (1999). Science 285: 1682-3.

Motions in a Database Framework: from Structure to Sequence
M Gerstein, R Jansen, T Johnson, J Tsai, W Krebs (1999). Rigidity Theory and Applications 401-442 (ed. MF Thorpe and PM Duxbury, Kluwer Academic/Plenum Publishers).

E-publishing on the Web: promises, pitfalls, and payoffs for bioinformatics.
M Gerstein (1999). Bioinformatics 15: 429-31.

E-biomed and clinical research
ES Brodkin, M Gerstein (1999). N Engl J Med 341: 1080; author reply 1081.

The packing density in proteins: standard radii and volumes.
J Tsai, R Taylor, C Chothia, M Gerstein (1999). J Mol Biol 290: 253-66.

Advances in structural genomics.
SA Teichmann, C Chothia, M Gerstein (1999). Curr Opin Struct Biol 9: 390-9.

Building the future of biocomputing.
M Gerstein (1999). Nature 399: 101.

The relationship between protein structure and function: a comprehensive survey with application to the yeast genome.
H Hegyi, M Gerstein (1999). J Mol Biol 288: 147-64.

Forging links in an electronic paper chain
M Gerstein (1999). Nature 398: 20.

-- 1998 (9) --

Comparing genomes in terms of protein structure: surveys of a finite parts list.
M Gerstein, H Hegyi (1998). FEMS Microbiol Rev 22: 277-304.

How representative are the known structures of the proteins in a complete genome? A comprehensive structural census.
M Gerstein (1998). Fold Des 3: 497-512.

Patterns of protein-fold usage in eight microbial genomes: a comprehensive structural census.
M Gerstein (1998). Proteins 33: 518-34.

Simulating water and the molecules of life.
M Gerstein, M Levitt (1998). Sci Am 279: 100-5.

Measurement of the effectiveness of transitive sequence comparison, through a third 'intermediate' sequence.
M Gerstein (1998). Bioinformatics 14: 707-14.

A database of macromolecular motions.
M Gerstein, W Krebs (1998). Nucleic Acids Res 26: 4280-90.

Repeated tertiary fold of RNA polymerase II and implications for DNA binding.
J Fu, M Gerstein, PR David, AL Gnatt, DA Bushnell, AM Edwards, RD Kornberg (1998). J Mol Biol 280: 317-22.

A unified statistical framework for sequence comparison and structure comparison.
M Levitt, M Gerstein (1998). Proc Natl Acad Sci U S A 95: 5913-20.

Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins.
M Gerstein, M Levitt (1998). Protein Sci 7: 445-56.

-- 1997 (6) --

A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure.
M Gerstein (1997). J Mol Biol 274: 562-76.

Simulating the minimum core for hydrophobic collapse in globular proteins.
J Tsai, M Gerstein, M Levitt (1997). Protein Sci 6: 2606-16.

A structural census of the current population of protein sequences.
M Gerstein, M Levitt (1997). Proc Natl Acad Sci U S A 94: 11911-6.

Protein evolution. How far can sequences diverge?
C Chothia, M Gerstein (1997). Nature 385: 579, 581.

Protein folding: the endgame.
M Levitt, M Gerstein, E Huang, S Subbiah, J Tsai (1997). Annu Rev Biochem 66: 549-79.

LPFC: an Internet library of protein family core structures.
R Schmidt, M Gerstein, RB Altman (1997). Protein Sci 6: 246-8.

-- 1996 (4) --

Practicing Cyberlaw in the Year 2000.
R Becker, M Gerstein (1996). New Jersey Lawyer Magazine 179: 12-27 (September).

Keeping the Shape but Changing the Charges: A Simulation Study of Urea and its Isosteric Analogues
J Tsai, M Gerstein, M Levitt (1996). Journal of Chemical Physics 104: 9417-9430

Packing at the protein-water interface.
M Gerstein, C Chothia (1996). Proc Natl Acad Sci U S A 93: 10167-72.

Using iterative dynamic programming to obtain accurate pairwise and multiple alignments of protein structures.
M Gerstein, M Levitt (1996). Proc Int Conf Intell Syst Mol Biol 4: 59-67.

-- 1995 (7) --

Using a measure of structural variation to define a core for the globins.
M Gerstein, RB Altman (1995). Comput Appl Biosci 11: 633-44.

Binding geometry of alpha-helices that recognize DNA.
M Suzuki, M Gerstein (1995). Proteins 23: 525-35.

Average core structures and variability measures for protein families: application to the immunoglobulins.
M Gerstein, RB Altman (1995). J Mol Biol 251: 161-75.

The volume of atoms on the protein surface: calculated from simulation, using Voronoi polyhedra.
M Gerstein, J Tsai, M Levitt (1995). J Mol Biol 249: 955-66.

Methods for displaying macromolecular structural uncertainty: application to the globins.
RB Altman, C Hughes, MB Gerstein (1995). J Mol Graph 13: 142-52, 109-2.

DNA recognition and superstructure formation by helix-turn-helix proteins.
M Suzuki, N Yagi, M Gerstein (1995). Protein Eng 8: 329-38.

DNA recognition code of transcription factors.
M Suzuki, SE Brenner, M Gerstein, N Yagi (1995). Protein Eng 8: 319-28.

-- 1994 (7) --

Purcell's early work on NMR: Contingency versus Inevitability
M Gerstein (1994). American Journal of Physics 62: 596-601

Stereochemical basis of DNA recognition by Zn fingers.
M Suzuki, M Gerstein, N Yagi (1994). Nucleic Acids Res 22: 3397-405.

Volume changes on protein folding.
Y Harpaz, M Gerstein, C Chothia (1994). Structure 2: 641-9.

Structural mechanisms for domain movements in proteins.
M Gerstein, AM Lesk, C Chothia (1994). Biochemistry 33: 6739-49.

Solution structure of the DNA binding octapeptide repeat of the K10 gene product.
M Suzuki, D Neuhaus, M Gerstein, S Aimoto (1994). Protein Eng 7: 461-70.

Volume changes in protein evolution.
M Gerstein, EL Sonnhammer, C Chothia (1994). J Mol Biol 236: 1067-78.

Finding an average core structure: application to the globins.
RB Altman, M Gerstein (1994). Proc Int Conf Intell Syst Mol Biol 2: 19-27.

-- 1993 (7) --

Simulation of Water around a Model Protein Helix. 1. Two-dimensional Projections of Solvent Structure.
M Gerstein, R Lynden-Bell (1993). Journal of Physical Chemistry 97: 2982-2991.

Simulation of Water around a Model Protein Helix. 2. The Relative Contributions of Packing, Hydrophobicity, and Hydrogen Bonding.
M Gerstein, R Lynden-Bell (1993). Journal of Physical Chemistry 97: 2991-2999.

Domain closure in lactoferrin. Two hinges produce a see-saw motion between alternative close-packed interfaces.
M Gerstein, BF Anderson, GE Norris, EN Baker, AM Lesk, C Chothia (1993). J Mol Biol 234: 357-72.

An NMR study on the DNA-binding SPKK motif and a model for its interaction with DNA.
M Suzuki, M Gerstein, T Johnson (1993). Protein Eng 6: 565-74.

What is the natural boundary of a protein in solution?
M Gerstein, RM Lynden-Bell (1993). J Mol Biol 230: 641-50.

Domain closure in adenylate kinase. Joints on either side of two helices close like neighboring fingers.
M Gerstein, G Schulz, C Chothia (1993). J Mol Biol 229: 494-501.

Electron diffraction analysis of structural changes in the photocycle of bacteriorhodopsin.
S Subramaniam, M Gerstein, D Oesterhelt, R Henderson (1993). EMBO J 12: 1-8.

-- 1992 (2) --

A Resolution-Sensitive Procedure for Comparing Protein Surfaces and its Application to the Comparison of Antigen-Combining Sites.
M Gerstein (1992). Acta Crystallographica A48: 271-276.

Polar zipper sequence in the high-affinity hemoglobin of Ascaris suum: amino acid sequence and structural interpretation.
I De Baere, L Liu, L Moens, J Van Beeumen, C Gielens, J Richelle, C Trotman, J Finch, M Gerstein, M Perutz (1992). Proc Natl Acad Sci U S A 89: 4638-42.

-- 1991 (1) --

Analysis of protein loop closure. Two types of hinges produce one motion in lactate dehydrogenase.
M Gerstein, C Chothia (1991). J Mol Biol 220: 133-49.

-- 1987 (1) --

Inverse Problem for Synchrotron Radiation in the Presence of Noise
N Fisch, A Kritz, M Gerstein (1987). Proceedings of the Sixth Joint Workshop on Electron Cyclotron Emission and Electron Cyclotron Resonance Heating. (eds. A Riviere, A Costley), 23-30 (Oxford, 16-17 September).