Summary of Gerstein Lab Research in 2016

During 2016 the lab had a number of research highlights. We published three interlinked tools, Stress, Frustration, and Intensification, for assessing the impact of rare genomic variants using knowledge of molecular structure. The tools are of particular interest to the medical genetics community because they can help explain various cancer mutations as well as variants associated with genetic diseases. Another highlight is our publication of a framework for quantifying the privacy risks that result from linking clinical and phenotype variables. This paper is timely given the ongoing debate on data sharing. Apart from these works, we published a few research papers on topics in genomics, such as allele-specific binding and gene expression analysis, and several review articles on the role of non-coding variants, network comparison, and the cost of sequencing.

Regarding service, I worked on further developing the computational biology program at Yale. In particular, I co-chaired a committee on moving toward a Center for Biomedical Data Science at the Medical School. My lab served the research community by participating in many consortia, such as PCAWG (the Pan-Cancer Analysis of Whole Genomes project), the ENCODE consortium, PsychENCODE, 1000 Genomes' structural variation group (and its follow-ons), and the Extracellular RNA Communication Consortium. In 2016, I gave talks and participated in many meetings, including an important data-science education forum at the Cold Spring Harbor Laboratory.

Regarding teaching, I further developed my course in Bioinformatics by including more practical, hands-on materials. For example, we introduced a collaborative programming assignment using GitHub.


Intensification: A Resource for Amplifying Population-Genetic Signals with Protein Repeats.
J Chen, B Wang, L Regan, M Gerstein (2016). J Mol Biol 429: 435-445.

Localized structural frustration for evaluating the impact of sequence variants.
S Kumar, D Clarke, M Gerstein (2016). Nucleic Acids Res 44: 10062-10073.

DREISS: Using State-Space Models to Infer the Dynamics of Gene Expression Driven by External and Internal Regulatory Networks.
D Wang, F He, S Maslov, M Gerstein (2016). PLoS Comput Biol 12: e1005146.

Opinion: GMOs Are Not “Frankenfoods”
D Greenbaum, M Gerstein (2016). The Scientist (30 Aug).

iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.
CC Yang, EH Andrews, MH Chen, WY Wang, JJ Chen, M Gerstein, CC Liu, C Cheng (2016). BMC Genomics 17: 632.

Pangolin genomes and the evolution of mammalian scales and immunity.
SW Choo, M Rayko, TK Tan, R Hari, A Komissarov, WY Wee, AA Yurchenko, S Kliver, G Tamazian, A Antunes, RK Wilson, WC Warren, KP Koepfli, P Minx, K Krasheninnikova, A Kotze, DL Dalton, E Vermaak, IC Paterson, P Dobrynin, FT Sitam, JJ Rovie-Ryan, WE Johnson, AM Yusoff, SJ Luo, KV Karuppannan, G Fang, D Zheng, MB Gerstein, L Lipovich, SJ O'Brien, GJ Wong (2016). Genome Res 26: 1312-1322.

Discordant Expression of Circulating microRNA from Cellular and Extracellular Sources.
R Shah, K Tanriverdi, D Levy, M Larson, M Gerstein, E Mick, J Rozowsky, R Kitchen, V Murthy, E Mikalev, JE Freedman (2016). PLoS One 11: e0153691.

Diverse human extracellular RNAs are widely detected in human plasma.
JE Freedman, M Gerstein, E Mick, J Rozowsky, D Levy, R Kitchen, S Das, R Shah, K Danielson, L Beaulieu, FC Navarro, Y Wang, TR Galeev, A Holman, RY Kwong, V Murthy, SE Tanriverdi, M Koupenova-Zamor, E Mikhalev, K Tanriverdi (2016). Nat Commun 7: 11106.

Extending gene ontology in the context of extracellular RNA and vesicle communication.
KH Cheung, S Keerthikumar, P Roncaglia, SL Subramanian, ME Roth, M Samuel, S Anand, L Gangoda, S Gould, R Alexander, D Galas, MB Gerstein, AF Hill, RR Kitchen, J Lotvall, T Patel, DC Procaccini, P Quesenberry, J Rozowsky, RL Raffai, A Shypitsyna, AI Su, C Thery, K Vickers, MH Wauben, S Mathivanan, A Milosavljevic, LC Laurent (2016). J Biomed Semantics 7: 19.

A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.
J Chen, J Rozowsky, TR Galeev, A Harmanci, R Kitchen, J Bedford, A Abyzov, Y Kong, L Regan, M Gerstein (2016). Nat Commun 7: 11101.

Identifying Allosteric Hotspots with Dynamics: Application to Inter- and Intra-species Conservation.
D Clarke, A Sethi, S Li, S Kumar, RWF Chang, J Chen, M Gerstein (2016). Structure 24: 826-837.

Cross-Disciplinary Network Comparison: Matchmaking Between Hairballs
KK Yan, D Wang, A Sethi, P Muir, R Kitchen, C Cheng, M Gerstein (2016). Cell Syst 2: 147-157.

Who Owns Your DNA?
D Greenbaum, M Gerstein (2016). Cell 165: 257-258.

Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis.
F He, S Yoo, D Wang, S Kumari, M Gerstein, D Ware, S Maslov (2016). Plant J 86: 472-80.

The real cost of sequencing: scaling computation to keep pace with data generation
P Muir, S Li, S Lou, D Wang, DJ Spakowicz, L Salichos, J Zhang, GM Weinstock, F Isaacs, J Rozowsky, M Gerstein (2016). Genome Biol 17: 53.

Temporal Dynamics of Collaborative Networks in Large Scientific Consortia
D Wang, KK Yan, J Rozowsky, E Pan, M Gerstein (2016). Trends Genet 32: 251-253.

Quantification of private information leakage from phenotype-genotype data: linking attacks
A Harmanci, M Gerstein (2016). Nat Methods 13: 251-6.

Role of non-coding sequence variants in cancer.
E Khurana, Y Fu, D Chakravarty, F Demichelis, MA Rubin, M Gerstein (2016). Nat Rev Genet 17: 93-108.

Understanding genome structural variations.
A Abyzov, S Li, MB Gerstein (2016). Oncotarget 7: 7370-1.

