Current undergraduates are listed at http://www.gersteinlab.org/people/
Past undergraduates are listed at http://www.gersteinlab.org/people/alumni.htm
 

Privacy-preserving Model Training for Disease Prediction Using Federated Learning with Differential Privacy.
A Khanna, V Schaffer, G Gursoy, M Gerstein (2022). Annu Int Conf IEEE Eng Med Biol Soc 2022: 1358-1361.

Privacy-preserving genotype imputation with fully homomorphic encryption.
G Gursoy, E Chielle, CM Brannon, M Maniatakos, M Gerstein (2021). Cell Syst 13: 173-182e3.

Recovering genotypes and phenotypes using allele-specific genes.
G Gursoy, N Lu, S Wagner, M Gerstein (2021). Genome Biol 22: 263.

Data Sanitization to Reduce Private Information Leakage from Functional Genomics.
G Gursoy, P Emani, CM Brannon, OA Jolanki, A Harmanci, JS Strattan, JM Cherry, AD Miranker, M Gerstein (2020). Cell 183: 905-917e16.

Using blockchain to log genome dataset access: efficient storage and query.
G Gursoy, R Bjornson, ME Green, M Gerstein (2020). BMC Med Genomics 13: 78.

Using Ethereum blockchain to store and query pharmacogenomics data via smart contracts.
G Gursoy, CM Brannon, M Gerstein (2020). BMC Med Genomics 13: 74.

Comprehensive functional genomic resource and integrative model for the human brain.
D Wang, S Liu, J Warrell, H Won, X Shi, FCP Navarro, D Clarke, M Gu, P Emani, YT Yang, M Xu, MJ Gandal, S Lou, J Zhang, JJ Park, C Yan, SK Rhie, K Manakongtreecheep, H Zhou, A Nathan, M Peters, E Mattei, D Fitzgerald, T Brunetti, J Moore, Y Jiang, K Girdhar, GE Hoffman, S Kalayci, ZH Gumus, GE Crawford, PsychENCODE Consortium, P Roussos, S Akbarian, AE Jaffe, KP White, Z Weng, N Sestan, DH Geschwind, JA Knowles, MB Gerstein (2018). Science 362.

Network Analysis as a Grand Unifier in Biomedical Data Science.
P McGillivray, D Clarke, W Meyerson, J Zhang, D Lee, M Gu, S Kumar, H Zhou, MB Gerstein (2018). Annual Review of Biomedical Data Science Vol. 1.

A comprehensive catalog of predicted functional upstream open reading frames in humans.
P McGillivray, R Ault, M Pawashe, R Kitchen, S Balasubramanian, M Gerstein (2018). Nucleic Acids Res 46: 3326-3338.

Identifying Allosteric Hotspots with Dynamics: Application to Inter- and Intra-species Conservation.
D Clarke, A Sethi, S Li, S Kumar, RWF Chang, J Chen, M Gerstein (2016). Structure 24: 826-837.

Temporal Dynamics of Collaborative Networks in Large Scientific Consortia.
D Wang, KK Yan, J Rozowsky, E Pan, M Gerstein (2016). Trends Genet 32: 251-253.

Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms.
A Abyzov, S Li, DR Kim, M Mohiyuddin, AM Stutz, NF Parrish, XJ Mu, W Clark, K Chen, M Hurles, JO Korbel, HY Lam, C Lee, MB Gerstein (2015). Nat Commun 6: 7256.

FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer.
Y Fu, Z Liu, S Lou, J Bedford, XJ Mu, KY Yip, E Khurana, M Gerstein (2014). Genome Biol 15: 480.

Integrative annotation of variants from 1092 humans: application to cancer genomics.
E Khurana, Y Fu, V Colonna, XJ Mu, HM Kang, T Lappalainen, A Sboner, L Lochovsky, J Chen, A Harmanci, J Das, A Abyzov, S Balasubramanian, K Beal, D Chakravarty, D Challis, Y Chen, D Clarke, L Clarke, F Cunningham, US Evani, P Flicek, R Fragoza, E Garrison, R Gibbs, ZH Gumus, J Herrero, N Kitabayashi, Y Kong, K Lage, V Liluashvili, SM Lipkin, DG MacArthur, G Marth, D Muzny, TH Pers, GRS Ritchie, JA Rosenfeld, C Sisu, X Wei, M Wilson, Y Xue, F Yu, 1000 Genomes Project Consortium, ET Dermitzakis, H Yu, MA Rubin, C Tyler-Smith, M Gerstein (2013). Science 342: 1235587.

Architecture of the human regulatory network derived from ENCODE data.
MB Gerstein, A Kundaje, M Hariharan, SG Landt, KK Yan, C Cheng, XJ Mu, E Khurana, J Rozowsky, R Alexander, R Min, P Alves, A Abyzov, N Addleman, N Bhardwaj, AP Boyle, P Cayting, A Charos, DZ Chen, Y Cheng, D Clarke, C Eastman, G Euskirchen, S Frietze, Y Fu, J Gertz, F Grubert, A Harmanci, P Jain, M Kasowski, P Lacroute, JJ Leng, J Lian, H Monahan, H O'Geen, Z Ouyang, EC Partridge, D Patacsil, F Pauli, D Raha, L Ramirez, TE Reddy, B Reed, M Shi, T Slifer, J Wang, L Wu, X Yang, KY Yip, G Zilberman-Schapira, S Batzoglou, A Sidow, PJ Farnham, RM Myers, SM Weissman, M Snyder (2012). Nature 489: 91-100.

VAT: a computational framework to functionally annotate variants in personal genomes within a cloud-computing environment.
L Habegger, S Balasubramanian, DZ Chen, E Khurana, A Sboner, A Harmanci, J Rozowsky, D Clarke, M Snyder, M Gerstein (2012). Bioinformatics 28: 2267-9.

A systematic survey of loss-of-function variants in human protein-coding genes.
DG MacArthur, S Balasubramanian, A Frankish, N Huang, J Morris, K Walter, L Jostins, L Habegger, JK Pickrell, SB Montgomery, CA Albers, ZD Zhang, DF Conrad, G Lunter, H Zheng, Q Ayub, MA DePristo, E Banks, M Hu, RE Handsaker, JA Rosenfeld, M Fromer, M Jin, XJ Mu, E Khurana, K Ye, M Kay, GI Saunders, MM Suner, T Hunt, IH Barnes, C Amid, DR Carvalho-Silva, AH Bignell, C Snow, B Yngvadottir, S Bumpstead, DN Cooper, Y Xue, IG Romero, 1000 Genomes Project Consortium, J Wang, Y Li, RA Gibbs, SA McCarroll, ET Dermitzakis, JK Pritchard, JC Barrett, J Harrow, ME Hurles, MB Gerstein, C Tyler-Smith (2012). Science 335: 823-8.

ACT: aggregation and correlation toolbox for analyses of genome tracks.
J Jee, J Rozowsky, KY Yip, L Lochovsky, R Bjornson, G Zhong, Z Zhang, Y Fu, J Wang, Z Weng, M Gerstein (2011). Bioinformatics 27: 1152-4.

Mapping copy number variation by population-scale genome sequencing.
RE Mills, K Walter, C Stewart, RE Handsaker, K Chen, C Alkan, A Abyzov, SC Yoon, K Ye, RK Cheetham, A Chinwalla, DF Conrad, Y Fu, F Grubert, I Hajirasouliha, F Hormozdiari, LM Iakoucheva, Z Iqbal, S Kang, JM Kidd, MK Konkel, J Korn, E Khurana, D Kural, HY Lam, J Leng, R Li, Y Li, CY Lin, R Luo, XJ Mu, J Nemesh, HE Peckham, T Rausch, A Scally, X Shi, MP Stromberg, AM Stutz, AE Urban, JA Walker, J Wu, Y Zhang, ZD Zhang, MA Batzer, L Ding, GT Marth, G McVean, J Sebat, M Snyder, J Wang, K Ye, EE Eichler, MB Gerstein, ME Hurles, C Lee, SA McCarroll, JO Korbel, 1000 Genomes Project (2011). Nature 470: 59-65.

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data.
A Sboner, L Habegger, D Pflueger, S Terry, DZ Chen, JS Rozowsky, AK Tewari, N Kitabayashi, BJ Moss, MS Chee, F Demichelis, MA Rubin, MB Gerstein (2010). Genome Biol 11: R104.

Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing.
P Lefrancois, GM Euskirchen, RK Auerbach, J Rozowsky, T Gibson, CM Yellman, M Gerstein, M Snyder (2009). BMC Genomics 10: 37.

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
J Rozowsky, G Euskirchen, RK Auerbach, ZD Zhang, T Gibson, R Bjornson, N Carriero, M Snyder, MB Gerstein (2009). Nat Biotechnol 27: 66-75.

Mismatch oligonucleotides in human and yeast: guidelines for probe design on tiling microarrays.
M Seringhaus, J Rozowsky, T Royce, U Nagalakshmi, J Jee, M Snyder, M Gerstein (2008). BMC Genomics 9: 635.

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history.
PM Kim, HY Lam, AE Urban, JO Korbel, J Affourtit, F Grubert, X Chen, S Weissman, M Snyder, MB Gerstein (2008). Genome Res 18: 1865-74.

The current excitement about copy-number variation: how it relates to gene duplications and protein families.
JO Korbel, PM Kim, X Chen, AE Urban, S Weissman, M Snyder, MB Gerstein (2008). Curr Opin Struct Biol 18: 366-74.

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome.
JO Korbel, AE Urban, F Grubert, J Du, TE Royce, P Starr, G Zhong, BS Emanuel, SM Weissman, M Snyder, MB Gerstein (2007). Proc Natl Acad Sci U S A 104: 10110-5.

The Database of Macromolecular Motions: new features added at the decade mark.
S Flores, N Echols, D Milburn, B Hespenheide, K Keating, J Lu, S Wells, EZ Yu, M Thorpe, M Gerstein (2006). Nucleic Acids Res 34: D296-301.

Design optimization methods for genomic DNA tiling arrays.
P Bertone, V Trifonov, JS Rozowsky, F Schubert, O Emanuelsson, J Karro, MY Kao, M Snyder, M Gerstein (2006). Genome Res 16: 271-81.

Protein Interaction Prediction by Integrating Genomic Features and Protein Interaction Network Analysis.
LJ Lu, Y Xia, H Yu, A Rives, H Lu, F Schubert, M Gerstein (2005). Data Analysis and Visualization in Genomics and Proteomics (Wiley, NY).

Network security and data integrity in academia: an assessment and a proposal for large-scale archiving.
A Smith, D Greenbaum, SM Douglas, M Long, M Gerstein (2005). Genome Biol 6: 119.

PubNet: a flexible system for visualizing literature derived networks.
SM Douglas, GT Montelione, M Gerstein (2005). Genome Biol 6: R80.

Robotic cloning and Protein Production Platform of the Northeast Structural Genomics Consortium.
TB Acton, KC Gunsalus, R Xiao, LC Ma, J Aramini, MC Baran, YW Chiang, T Climent, B Cooper, NG Denissova, SM Douglas, JK Everett, CK Ho, D Macapagal, PK Rajan, R Shastry, LY Shih, GV Swapna, M Wilson, M Wu, M Gerstein, M Inouye, JF Hunt, GT Montelione (2005). Methods Enzymol 394: 210-43.

Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms.
S Balasubramanian, Y Xia, E Freinkman, M Gerstein (2005). Nucleic Acids Res 33: 1710-21.

Normal modes for predicting protein motions: a comprehensive database assessment and associated Web tool.
V Alexandrov, U Lehnert, N Echols, D Milburn, D Engelman, M Gerstein (2005). Protein Sci 14: 633-43.

An XML-Based Approach to Integrating Heterogeneous Yeast Genome Data.
KH Cheung, D Pan, A Smith, M Seringhaus, SM Douglas, M Gerstein (2004). International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS): 236-242.

The protein target list of the Northeast Structural Genomics Consortium.
Z Wunderlich, TB Acton, J Liu, G Kornhaber, J Everett, P Carter, N Lan, N Echols, M Gerstein, B Rost, GT Montelione (2004). Proteins 56: 181-7.

Computer security in academia - a potential roadblock to distributed annotation of the human genome.
D Greenbaum, SM Douglas, A Smith, J Lim, M Fischer, M Schultz, M Gerstein (2004). Nat Biotechnol 22: 771-2.

Exploring the range of protein flexibility, from a structural proteomics perspective.
M Gerstein, N Echols (2004). Curr Opin Chem Biol 8: 14-9.

Mining the structural genomics pipeline: identification of protein properties that affect high-throughput experimental analysis.
CS Goh, N Lan, SM Douglas, B Wu, N Echols, A Smith, D Milburn, GT Montelione, H Zhao, M Gerstein (2004). J Mol Biol 336: 115-30.

Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data.
J Qian, J Lin, NM Luscombe, H Yu, M Gerstein (2003). Bioinformatics 19: 1917-26.

ExpressYourself: A modular platform for processing and visualizing microarray data.
NM Luscombe, TE Royce, P Bertone, N Echols, CE Horak, JT Chang, M Snyder, M Gerstein (2003). Nucleic Acids Res 31: 3477-82.

SPINE 2: a system for collaborative structural proteomics within a federated database framework.
CS Goh, N Lan, N Echols, SM Douglas, D Milburn, P Bertone, R Xiao, LC Ma, D Zheng, Z Wunderlich, T Acton, GT Montelione, M Gerstein (2003). Nucleic Acids Res 31: 2833-8.

MolMovDB: analysis and visualization of conformational change and structural flexibility.
N Echols, D Milburn, M Gerstein (2003). Nucleic Acids Res 31: 478-82.

Subcellular localization of the yeast proteome.
A Kumar, S Agarwal, JA Heyman, S Matson, M Heidtman, S Piccirillo, L Umansky, A Drawid, R Jansen, Y Liu, KH Cheung, P Miller, M Gerstein, GS Roeder, M Snyder (2002). Genes Dev 16: 707-19.

Structural genomics analysis: characteristics of atypical, common, and horizontally transferred folds.
H Hegyi, J Lin, D Greenbaum, M Gerstein (2002). Proteins 47: 126-41.

A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution.
P Harrison, A Kumar, N Lan, N Echols, M Snyder, M Gerstein (2002). J Mol Biol 316: 409-19.

Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22.
PM Harrison, H Hegyi, S Balasubramanian, NM Luscombe, P Bertone, N Echols, T Johnson, M Gerstein (2002). Genome Res 12: 272-80.

GeneCensus: genome comparisons in terms of metabolic pathway activity and protein family sharing.
J Lin, J Qian, D Greenbaum, P Bertone, R Das, N Echols, A Senes, B Stenger, M Gerstein (2002). Nucleic Acids Res 30: 4574-82.

Normal mode analysis of macromolecular motions in a database framework: developing mode concentration as a useful classifying statistic.
WG Krebs, V Alexandrov, CA Wilson, N Echols, H Yu, M Gerstein (2002). Proteins 48: 682-95.

SNPs on human chromosomes 21 and 22 -- analysis in terms of protein features and pseudogenes.
S Balasubramanian, P Harrison, H Hegyi, P Bertone, N Luscombe, N Echols, P McGarvey, Z Zhang, M Gerstein (2002). Pharmacogenomics 3: 393-402.

Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes.
N Echols, P Harrison, S Balasubramanian, NM Luscombe, P Bertone, Z Zhang, M Gerstein (2002). Nucleic Acids Res 30: 2515-23.

An integrated approach for finding overlooked genes in yeast.
A Kumar, PM Harrison, KH Cheung, N Lan, N Echols, P Bertone, P Miller, MB Gerstein, M Snyder (2002). Nat Biotechnol 20: 58-63.

Beyond synexpression relationships: local clustering of time-shifted and inverted gene expression profiles identifies new, biologically relevant interactions.
J Qian, M Dolled-Filhart, J Lin, H Yu, M Gerstein (2001). J Mol Biol 314: 1053-66.

Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome.
PM Harrison, N Echols, MB Gerstein (2001). Nucleic Acids Res 29: 818-30.

PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.
J Qian, B Stenger, CA Wilson, J Lin, R Jansen, SA Teichmann, J Park, WG Krebs, H Yu, V Alexandrov, N Echols, M Gerstein (2001). Nucleic Acids Res 29: 1750-64.

Genome-wide analysis relating expression level with protein subcellular localization.
A Drawid, R Jansen, M Gerstein (2000). Trends Genet 16: 426-30.

A Bayesian system integrating expression data with sequence patterns for localizing proteins: comprehensive application to the yeast genome.
A Drawid, M Gerstein (2000). J Mol Biol 301: 1059-75.

Protein folds in the worm genome.
M Gerstein, J Lin, H Hegyi (2000). Pac Symp Biocomput: 30-41.

Whole-genome trees based on the occurrence of folds and orthologs: implications for comparing genomes on different levels.
J Lin, M Gerstein (2000). Genome Res 10: 808-18.

Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.
CA Wilson, J Kreychman, M Gerstein (2000). J Mol Biol 297: 233-49.

