A selection of our best scientific papers. See also the descriptions on our key contributions page and the complete list of papers.


The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models.
J Rozowsky, J Gao, B Borsari, YT Yang, T Galeev, G Gursoy, CB Epstein, K Xiong, J Xu, T Li, J Liu, K Yu, A Berthel, Z Chen, F Navarro, MS Sun, J Wright, J Chang, CJF Cameron, N Shoresh, E Gaskell, J Drenkow, J Adrian, S Aganezov, F Aguet, G Balderrama-Gutierrez, S Banskota, GB Corona, S Chee, SB Chhetri, GC Cortez Martins, C Danyko, CA Davis, D Farid, NP Farrell, I Gabdank, Y Gofin, DU Gorkin, M Gu, V Hecht, BC Hitz, R Issner, Y Jiang, M Kirsche, X Kong, BR Lam, S Li, B Li, X Li, KZ Lin, R Luo, M Mackiewicz, R Meng, JE Moore, J Mudge, N Nelson, C Nusbaum, I Popov, HE Pratt, Y Qiu, S Ramakrishnan, J Raymond, L Salichos, A Scavelli, JM Schreiber, FJ Sedlazeck, LH See, RM Sherman, X Shi, M Shi, CA Sloan, JS Strattan, Z Tan, FY Tanaka, A Vlasova, J Wang, J Werner, B Williams, M Xu, C Yan, L Yu, C Zaleski, J Zhang, K Ardlie, JM Cherry, EM Mendenhall, WS Noble, Z Weng, ME Levine, A Dobin, B Wold, A Mortazavi, B Ren, J Gillis, RM Myers, MP Snyder, J Choudhary, A Milosavljevic, MC Schatz, BE Bernstein, R Guigo, TR Gingeras, M Gerstein (2023). Cell 186: 1493-1511e40.

Data Sanitization to Reduce Private Information Leakage from Functional Genomics.
G Gursoy, P Emani, CM Brannon, OA Jolanki, A Harmanci, JS Strattan, JM Cherry, AD Miranker, M Gerstein (2020). Cell 183: 905-917e16.

Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences.
S Kumar, J Warrell, S Li, PD McGillivray, W Meyerson, L Salichos, A Harmanci, A Martinez-Fundichely, CWY Chan, MM Nielsen, L Lochovsky, Y Zhang, X Li, S Lou, JS Pedersen, C Herrmann, G Getz, E Khurana, MB Gerstein (2020). Cell 180: 915-927e16.

Comprehensive functional genomic resource and integrative model for the human brain.
D Wang, S Liu, J Warrell, H Won, X Shi, FCP Navarro, D Clarke, M Gu, P Emani, YT Yang, M Xu, MJ Gandal, S Lou, J Zhang, JJ Park, C Yan, SK Rhie, K Manakongtreecheep, H Zhou, A Nathan, M Peters, E Mattei, D Fitzgerald, T Brunetti, J Moore, Y Jiang, K Girdhar, GE Hoffman, S Kalayci, ZH Gumus, GE Crawford, PsychENCODE Consortium, P Roussos, S Akbarian, AE Jaffe, KP White, Z Weng, N Sestan, DH Geschwind, JA Knowles, MB Gerstein (2018). Science 362.

The real cost of sequencing: scaling computation to keep pace with data generation.
P Muir, S Li, S Lou, D Wang, DJ Spakowicz, L Salichos, J Zhang, GM Weinstock, F Isaacs, J Rozowsky, M Gerstein (2016). Genome Biol 17: 53.

Temporal Dynamics of Collaborative Networks in Large Scientific Consortia.
D Wang, KK Yan, J Rozowsky, E Pan, M Gerstein (2016). Trends Genet 32: 251-253.

Quantification of private information leakage from phenotype-genotype data: linking attacks.
A Harmanci, M Gerstein (2016). Nat Methods 13: 251-6.

Comparative analysis of the transcriptome across distant species.
MB Gerstein, J Rozowsky, KK Yan, D Wang, C Cheng, JB Brown, CA Davis, L Hillier, C Sisu, JJ Li, B Pei, AO Harmanci, MO Duff, S Djebali, RP Alexander, BH Alver, R Auerbach, K Bell, PJ Bickel, ME Boeck, NP Boley, BW Booth, L Cherbas, P Cherbas, C Di, A Dobin, J Drenkow, B Ewing, G Fang, M Fastuca, EA Feingold, A Frankish, G Gao, PJ Good, R Guigo, A Hammonds, J Harrow, RA Hoskins, C Howald, L Hu, H Huang, TJ Hubbard, C Huynh, S Jha, D Kasper, M Kato, TC Kaufman, RR Kitchen, E Ladewig, J Lagarde, E Lai, J Leng, Z Lu, M MacCoss, G May, R McWhirter, G Merrihew, DM Miller, A Mortazavi, R Murad, B Oliver, S Olson, PJ Park, MJ Pazin, N Perrimon, D Pervouchine, V Reinke, A Reymond, G Robinson, A Samsonova, GI Saunders, F Schlesinger, A Sethi, FJ Slack, WC Spencer, MH Stoiber, P Strasbourger, A Tanzer, OA Thompson, KH Wan, G Wang, H Wang, KL Watkins, J Wen, K Wen, C Xue, L Yang, K Yip, C Zaleski, Y Zhang, H Zheng, SE Brenner, BR Graveley, SE Celniker, TR Gingeras, R Waterston (2014). Nature 512: 445-8.

Comparative analysis of pseudogenes across three phyla.
C Sisu, B Pei, J Leng, A Frankish, Y Zhang, S Balasubramanian, R Harte, D Wang, M Rutenberg-Schoenberg, W Clark, M Diekhans, J Rozowsky, T Hubbard, J Harrow, MB Gerstein (2014). Proc Natl Acad Sci U S A 111: 13361-6.

Integrative annotation of variants from 1092 humans: application to cancer genomics.
E Khurana, Y Fu, V Colonna, XJ Mu, HM Kang, T Lappalainen, A Sboner, L Lochovsky, J Chen, A Harmanci, J Das, A Abyzov, S Balasubramanian, K Beal, D Chakravarty, D Challis, Y Chen, D Clarke, L Clarke, F Cunningham, US Evani, P Flicek, R Fragoza, E Garrison, R Gibbs, ZH Gumus, J Herrero, N Kitabayashi, Y Kong, K Lage, V Liluashvili, SM Lipkin, DG MacArthur, G Marth, D Muzny, TH Pers, GRS Ritchie, JA Rosenfeld, C Sisu, X Wei, M Wilson, Y Xue, F Yu, 1000 Genomes Project Consortium, ET Dermitzakis, H Yu, MA Rubin, C Tyler-Smith, M Gerstein (2013). Science 342: 1235587.

Architecture of the human regulatory network derived from ENCODE data.
MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, JJ Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.

Genomics and Privacy: Implications of the New Reality of Closed Data for the Field.
D Greenbaum, A Sboner, XJ Mu, M Gerstein (2011). PLoS Comput Biol 7: e1002278.

Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.
MB Gerstein, ZJ Lu, EL Van Nostrand, C Cheng, BI Arshinoff, T Liu, KY Yip, R Robilotto, A Rechtsteiner, K Ikegami, P Alves, A Chateigner, M Perry, M Morris, RK Auerbach, X Feng, J Leng, A Vielle, W Niu, K Rhrissorrakrai, A Agarwal, RP Alexander, G Barber, CM Brdlik, J Brennan, JJ Brouillet, A Carr, MS Cheung, H Clawson, S Contrino, LO Dannenberg, AF Dernburg, A Desai, L Dick, AC Dose, J Du, T Egelhofer, S Ercan, G Euskirchen, B Ewing, EA Feingold, R Gassmann, PJ Good, P Green, F Gullier, M Gutwein, MS Guyer, L Habegger, T Han, JG Henikoff, SR Henz, A Hinrichs, H Holster, T Hyman, AL Iniguez, J Janette, M Jensen, M Kato, WJ Kent, E Kephart, V Khivansara, E Khurana, JK Kim, P Kolasinska-Zwierz, EC Lai, I Latorre, A Leahey, S Lewis, P Lloyd, L Lochovsky, RF Lowdon, Y Lubling, R Lyne, M MacCoss, SD Mackowiak, M Mangone, S McKay, D Mecenas, G Merrihew, DM Miller, A Muroyama, JI Murray, SL Ooi, H Pham, T Phippen, EA Preston, N Rajewsky, G Ratsch, H Rosenbaum, J Rozowsky, K Rutherford, P Ruzanov, M Sarov, R Sasidharan, A Sboner, P Scheid, E Segal, H Shin, C Shou, FJ Slack, C Slightam, R Smith, WC Spencer, EO Stinson, S Taing, T Takasaki, D Vafeados, K Voronina, G Wang, NL Washington, CM Whittle, B Wu, KK Yan, G Zeller, Z Zha, M Zhong, X Zhou, modENCODE Consortium, J Ahringer, S Strome, KC Gunsalus, G Micklem, XS Liu, V Reinke, SK Kim, LW Hillier, S Henikoff, F Piano, M Snyder, L Stein, JD Lieb, RH Waterston (2010). Science 330: 1775-87.

Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks.
KK Yan, G Fang, N Bhardwaj, RP Alexander, M Gerstein (2010). Proc Natl Acad Sci U S A 107: 9186-91.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27: 66-75.

Structured digital abstract makes text mining easy.
M Gerstein, M Seringhaus, S Fields (2007). Nature 447: 142.

Relating three-dimensional structures to protein networks provides evolutionary insights.
PM Kim, LJ Lu, Y Xia, MB Gerstein (2006). Science 314: 1938-41.

Genomic analysis of the hierarchical structure of regulatory networks.
H Yu, M Gerstein (2006). Proc Natl Acad Sci U S A 103: 14724-31.

The real life of pseudogenes.
M Gerstein, D Zheng (2006). Sci Am 295: 48-55.

Calculation of standard atomic volumes for RNA and comparison with proteins: RNA is packed more tightly.
NR Voss, M Gerstein (2005). J Mol Biol 346: 477-92.

Genomic analysis of regulatory network dynamics reveals large topological changes.
NM Luscombe, MM Babu, H Yu, M Snyder, SA Teichmann, M Gerstein (2004). Nature 431: 308-12.

A Bayesian networks approach for predicting protein-protein interactions from genomic data.
R Jansen, H Yu, D Greenbaum, Y Kluger, NJ Krogan, S Chung, A Emili, M Snyder, JF Greenblatt, M Gerstein (2003). Science 302: 449-53.

MolMovDB: analysis and visualization of conformational change and structural flexibility.
N Echols, D Milburn, M Gerstein (2003). Nucleic Acids Res 31: 478-82.

Simulating water and the molecules of life.
M Gerstein, M Levitt (1998). Sci Am 279: 100-5.

A unified statistical framework for sequence comparison and structure comparison.
M Levitt, M Gerstein (1998). Proc Natl Acad Sci U S A 95: 5913-20.

