NSF DBI1660648

Grant title: A Graph Based Approach for the Genome Wide Prediction of Conditionally Essential Genes

How does one identify, and characterize at the genome scale, the set of genes that is essential for an organism to grow and thrive under particular conditions? Predicting such gene sets is a fundamental goal in bioinformatics, and this project aims to create methods and tools for producing accurate lists of such genes. The approach combines phenotype prediction with knowledge of the functional biological networks in cells to infer new knowledge. The network analysis methods developed here can be readily transferred to a wide variety of datasets and questions, from inferring gene-phenotype associations to detecting communities in social networks, extensions that are highly relevant to the network science community. Moreover, the project's analysis of temporal gene expression data, which uses state-space models and dimensionality reduction techniques, is applicable to any group of genes, e.g., tissue-specific versus universally expressed genes. In addition to advancing functional genomics knowledge in the study organism, yeast, the tools will have an impact on fields such as personal genomics by providing large-scale, system-level identification and molecular characterization of phenotypes. Finally, this project provides new tools for education in bioinformatics.

In more technical terms, the project's major goal is to develop new mathematical models and methods that, given a set of genes or an entire genome, can infer the genes' phenotypes and suggest whether or not these genes are necessary for the organism's survival. Specifically, information will be integrated on two levels: phenotypic and molecular. At the phenotypic level, the structure of biological networks will be used to assign phenotypic attributes to genes and to identify sets of genes that share similar essential phenotypes. At the molecular level, the resulting phenotype predictions will be refined by identifying groups of essential genes governed by similar activity patterns. Integrating the information on these two levels will yield a comprehensive gene-phenotype characterization and a refined group of conditionally essential genes. The resulting predictions will be validated experimentally in two yeast systems. All tools and datasets associated with this project will be made freely available through genopheno.gersteinlab.org.
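
As a rough illustration of the phenotypic-level idea, using network structure to assign phenotype scores to genes, the sketch below runs a generic random walk with restart on a toy gene network. It is not the project's own model, which this summary does not specify; the function name propagate_scores, the four-gene path network, the seed gene, and the restart value of 0.5 are all hypothetical choices made for illustration.

```python
def propagate_scores(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network (illustrative only).

    adjacency   -- square list of lists with 0/1 (or weighted) entries
    seed_scores -- prior phenotype evidence per gene (must sum to > 0)
    restart     -- probability of jumping back to the seed distribution
    """
    n = len(adjacency)
    # Column sums act as node degrees for column-normalized transitions.
    degree = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    total = sum(seed_scores)
    p0 = [s / total for s in seed_scores]
    p = list(p0)
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            # Score flowing into gene i from its neighbours.
            inflow = sum(
                adjacency[i][j] * p[j] / degree[j]
                for j in range(n)
                if degree[j] > 0
            )
            p_next.append((1 - restart) * inflow + restart * p0[i])
        # Stop once the scores no longer change appreciably.
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network: a path geneA - geneB - geneC - geneD.
genes = ["geneA", "geneB", "geneC", "geneD"]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Assume only geneA has prior evidence for the phenotype of interest.
scores = propagate_scores(adjacency, [1, 0, 0, 0])
ranking = sorted(zip(genes, scores), key=lambda pair: pair[1], reverse=True)
```

In this toy case the propagated score decays with network distance from the seed gene, so genes close to known phenotype-associated genes rank highest. A real analysis would use a curated functional network and experimentally supported seed genes.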

Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution.
J Warrell, L Salichos, M Gerstein (2020). bioRxiv.

Network propagation-based prioritization of long tail genes in 17 cancer types.
H Mohsen, V Gunasekharan, T Qing, M Seay, Y Surovtseva, S Negahban, Z Szallasi, L Pusztai, MB Gerstein (2021). Genome Biol 22: 287.

Predicting changes in protein thermodynamic stability upon point mutation with deep 3D convolutional neural networks.
B Li, YT Yang, JA Capra, MB Gerstein (2020). PLoS Comput Biol 16: e1008291.

Predicting the frequencies of drug side effects.
D Galeano, S Li, M Gerstein, A Paccanaro (2020). Nat Commun 11: 4575.

Cyclic and Multilevel Causation in Evolutionary Processes.
J Warrell, M Gerstein (2020). Biology & Philosophy 35(5): 1-36.

Approaches for integrating heterogeneous RNA-seq data reveal cross-talk between microbes and genes in asthmatic patients.
D Spakowicz, S Lou, B Barron, JL Gomez, T Li, Q Liu, N Grant, X Yan, R Hoyd, G Weinstock, GL Chupp, M Gerstein (2020). Genome Biol 21: 150.

Comparing Technological Development and Biological Evolution from a Network Perspective.
KK Yan, D Wang, K Xiong, M Gerstein (2020). Cell Syst 10: 219-222.
