R01HG008126
Prioritizing rare variants associated with cancer using non-coding annotation
Recent progress made by the ENCODE consortium and the Epigenome Roadmap Project has provided detailed annotation of noncoding regions of the human genome. Whole-genome sequencing has identified large numbers of rare variants in these regions, making this an opportune time to study them. Despite these opportunities, little effort has been invested in leveraging these resources for cancer-risk variant prioritization. Here, we are developing strategies that use patterns of natural polymorphism to prioritize the most impactful noncoding variants. To refine our approach, we are developing a tunable parameterization scheme in conjunction with iterative experimentation. With the support of this grant, we have made progress in developing pipelines, computational tools, and experimental technologies.
The Gerstein lab has focused on pipeline and computational tool development. Specific outcomes include:
- Extending the FunSeq/FunSeq2 pipeline with new features that consider gene expression levels, and building GRAM, which prioritizes germline variants that can significantly affect enhancer regulatory activity;
- Building a simple Bayesian classifier using 89 attributes to develop a comprehensive catalog of active upstream open reading frames, which, in turn, can be used to help generate an impactful noncoding variant set;
- Leveraging the comprehensive variant datasets from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) project. In particular, we found that molecular impact correlates with subclonal architecture (i.e., early versus late mutations) and that the aggregated effect of putative passengers, including undetected weak drivers, provides significant additional power (~12% additive variance) for predicting cancerous phenotypes. We also developed the Evotum approach to identify driver genes and quantify tumor growth based on variant-allele frequencies for an individual sample.
- Conducting a case study to identify and prioritize impactful noncoding variants (both germline and somatic) in renal cell carcinoma, which highlighted two impactful noncoding variants that are associated with prognosis and can aid in clinical decision-making.
- Developing the Intensification approach to aggregate variants to identify important positions in repeat domains that show strong conservation signals. Intensification allows us to compare conservation over different evolutionary timescales and visualize population-genetic measures on molecular structures.
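The summary describes the upstream open reading frame classifier only as "a simple Bayesian classifier using 89 attributes". As a rough illustration of that general idea, the sketch below implements a Gaussian naive Bayes classifier. The choice of Gaussian naive Bayes, the two toy attributes, the labels, and the function names `fit_gaussian_nb` and `predict` are all assumptions made for this example, not details of the actual classifier.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-attribute mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score

    return max(model, key=log_posterior)


# Toy data with two hypothetical attributes per candidate (the real
# classifier is described as using 89 attributes).
X = [[0.9, 5.0], [0.8, 6.0], [0.1, 1.0], [0.2, 2.0]]
y = ["active", "active", "inactive", "inactive"]

model = fit_gaussian_nb(X, y)
label = predict(model, [0.85, 5.5])
print(label)
```

Naive Bayes treats attributes as conditionally independent given the class, which keeps the model simple even with many attributes. Whether the actual classifier makes that assumption is not stated in this summary.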
The Yu Lab has developed advanced biotechnology tools to conduct mutagenesis and sequencing of specific regions. Specific outcomes include:
- Developing the element-clone-compatible STARR-seq protocol to incorporate a unique molecular barcode into the complementary DNA of each mRNA produced at the reverse transcription step, allowing direct quantification of enhancer activity for each enhancer by counting only reads with unique molecular barcodes.
- Developing a massively multiplexed cloning technique, using long-adaptor single-strand oligonucleotide probes, that offers higher cloning capacity and lower length bias and can capture wild-type guide DNA fragments of interest before variants are introduced into the fragments.
- Developing a new en masse directed mutagenesis pipeline, MassClone-seq, which is compatible with the massively multiplexed format of cloning and which will allow us to introduce hundreds to thousands of pre-determined mutations in a one-pot fashion.
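The barcode-based quantification described for the STARR-seq protocol above can be sketched as follows: reads that share both an element and a molecular barcode are collapsed, and the number of distinct barcodes per element is used as the activity count. The input format (pairs of element ID and barcode), the function name `count_unique_barcodes`, and the example reads are hypothetical; a real pipeline would also need alignment and handling of sequencing errors in the barcodes, which this sketch omits.

```python
from collections import defaultdict


def count_unique_barcodes(reads):
    """Count distinct molecular barcodes observed for each element.

    `reads` is an iterable of (element_id, barcode) pairs, one per read.
    Reads with the same element and barcode are counted once.
    """
    barcodes_by_element = defaultdict(set)
    for element_id, barcode in reads:
        barcodes_by_element[element_id].add(barcode)
    return {
        element_id: len(barcodes)
        for element_id, barcodes in barcodes_by_element.items()
    }


# Hypothetical reads: the second "enh1" read repeats a barcode and is
# therefore not counted again.
reads = [
    ("enh1", "ACGT"),
    ("enh1", "ACGT"),
    ("enh1", "TTAG"),
    ("enh2", "GGCA"),
]

counts = count_unique_barcodes(reads)
print(counts)
```

Counting distinct barcodes rather than raw reads avoids inflating the count when the same cDNA molecule is read more than once.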
The Rubin Lab has made progress on experimental validation. Specific outcomes include:
- Scaling up the cloning pipeline to amplify 100 individual elements and then selecting and pooling four colonies for each element
- Successfully sequencing 86 elements, of which 31 contained a common SNV in all four of the clones sequenced
- Comparing the enhancer activity of the SNV-containing elements and their corresponding wild-type elements by a dual-luciferase assay
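A comparison of this kind is commonly summarized as a fold change of normalized reporter activity. The sketch below assumes firefly luciferase is the enhancer-driven reporter and Renilla luciferase is the co-transfected control, which this summary does not specify. The readings and the function names `normalized_activity` and `fold_change` are made up for illustration.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Divide each firefly reading by the matching Renilla control reading."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla must have one reading per replicate")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(variant, wild_type):
    """Ratio of mean normalized activity: SNV-containing over wild-type."""
    return mean(variant) / mean(wild_type)


# Made-up luminescence readings for two replicates of each construct.
wild_type = normalized_activity([100.0, 120.0], [50.0, 60.0])
snv = normalized_activity([300.0, 330.0], [75.0, 82.5])

fc = fold_change(snv, wild_type)
print(fc)
```

A fold change above 1 would indicate higher activity for the SNV-containing element than for its wild-type counterpart. Deciding whether a difference is meaningful would still require a statistical test across replicates, which is not shown here.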