1U01HG003156-01 (PI Snyder)
9/30/03 - 7/31/06
Transcriptional and Regulatory Elements in the ENCODE Region
Role: co-PI

The goal is to obtain a coordinated and comprehensive picture of the transcribed regions and factors that regulate or affect transcriptional activity over the genomic regions selected as part of the ENCODE project.

[1] Construct and compare tiling arrays (NASA, Affymetrix, PCR) for the ENCODE regions.
[2] Identify and analyze regions with transcriptional activity (polyadenylated mRNA).
[3] Identify and analyze the binding sites of transcription factors and other chromosomal proteins (using chromatin immunoprecipitation techniques). Examine chromatin modifications.
[4] Analyze DNA methylation patterns.
[5] Publish the results in a publicly accessible database.

Gerstein lab contributions

The Gerstein lab provides and continuously refines the framework, as well as many of the specific tools, for handling and analyzing the data. The group will be involved in the processing (e.g., ExpressYourself), storage, and analysis of the microarray data. This includes, but is not limited to, the analysis of transcription factor motifs, the examination of transcribed regions and their correlation with gene annotations, and the integration of the ENCODE results with other data repositories (e.g., pseudogene annotations).


Articles funded by this grant:
MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions.
KK Yan, S Lou, M Gerstein (2017). PLoS Comput Biol 13: e1005647.

HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps.
KK Yan, GG Yardimci, C Yan, WS Noble, M Gerstein (2017). Bioinformatics 33: 2199-2201.

Loregic: a method to characterize the cooperative logic of regulatory factors.
D Wang, KK Yan, C Sisu, C Cheng, J Rozowsky, W Meyerson, MB Gerstein (2015). PLoS Comput Biol 11: e1004132.

An approach for determining and measuring network hierarchy applied to comparing the phosphorylome and the regulome.
C Cheng, E Andrews, KK Yan, M Ung, D Wang, M Gerstein (2015). Genome Biol 16: 63.

MUSIC: identification of enriched regions in ChIP-Seq experiments using a mappability-corrected multiscale signal processing framework.
A Harmanci, J Rozowsky, M Gerstein (2014). Genome Biol 15: 474.

Comparative analysis of regulatory information and circuits across distant species.
AP Boyle, CL Araya, C Brdlik, P Cayting, C Cheng, Y Cheng, K Gardner, LW Hillier, J Janette, L Jiang, D Kasper, T Kawli, P Kheradpour, A Kundaje, JJ Li, L Ma, W Niu, EJ Rehm, J Rozowsky, M Slattery, R Spokony, R Terrell, D Vafeados, D Wang, P Weisdepp, YC Wu, D Xie, KK Yan, EA Feingold, PJ Good, MJ Pazin, H Huang, PJ Bickel, SE Brenner, V Reinke, RH Waterston, M Gerstein, KP White, M Kellis, M Snyder (2014). Nature 512: 453-6.

Comparative analysis of the transcriptome across distant species.
MB Gerstein, J Rozowsky, KK Yan, D Wang, C Cheng, JB Brown, CA Davis, L Hillier, C Sisu, JJ Li, B Pei, AO Harmanci, MO Duff, S Djebali, RP Alexander, BH Alver, R Auerbach, K Bell, PJ Bickel, ME Boeck, NP Boley, BW Booth, L Cherbas, P Cherbas, C Di, A Dobin, J Drenkow, B Ewing, G Fang, M Fastuca, EA Feingold, A Frankish, G Gao, PJ Good, R Guigo, A Hammonds, J Harrow, RA Hoskins, C Howald, L Hu, H Huang, TJ Hubbard, C Huynh, S Jha, D Kasper, M Kato, TC Kaufman, RR Kitchen, E Ladewig, J Lagarde, E Lai, J Leng, Z Lu, M MacCoss, G May, R McWhirter, G Merrihew, DM Miller, A Mortazavi, R Murad, B Oliver, S Olson, PJ Park, MJ Pazin, N Perrimon, D Pervouchine, V Reinke, A Reymond, G Robinson, A Samsonova, GI Saunders, F Schlesinger, A Sethi, FJ Slack, WC Spencer, MH Stoiber, P Strasbourger, A Tanzer, OA Thompson, KH Wan, G Wang, H Wang, KL Watkins, J Wen, K Wen, C Xue, L Yang, K Yip, C Zaleski, Y Zhang, H Zheng, SE Brenner, BR Graveley, SE Celniker, TR Gingeras, R Waterston (2014). Nature 512: 445-8.

Comparative analysis of pseudogenes across three phyla.
C Sisu, B Pei, J Leng, A Frankish, Y Zhang, S Balasubramanian, R Harte, D Wang, M Rutenberg-Schoenberg, W Clark, M Diekhans, J Rozowsky, T Hubbard, J Harrow, MB Gerstein (2014). Proc Natl Acad Sci U S A 111: 13361-6.

Defining functional DNA elements in the human genome.
M Kellis, B Wold, MP Snyder, BE Bernstein, A Kundaje, GK Marinov, LD Ward, E Birney, GE Crawford, J Dekker, I Dunham, LL Elnitski, PJ Farnham, EA Feingold, M Gerstein, MC Giddings, DM Gilbert, TR Gingeras, ED Green, R Guigo, T Hubbard, J Kent, JD Lieb, RM Myers, MJ Pazin, B Ren, JA Stamatoyannopoulos, Z Weng, KP White, RC Hardison (2014). Proc Natl Acad Sci U S A 111: 6131-8.

Genome-wide analysis of chromatin features identifies histone modification sensitive and insensitive yeast transcription factors.
C Cheng, C Shou, KY Yip, MB Gerstein (2011). Genome Biol 12: R111.

Modeling the relative relationship of transcription factor binding and histone modifications to gene expression levels in mouse embryonic stem cells.
C Cheng, M Gerstein (2011). Nucleic Acids Res 40: 553-68.

Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets.
DS Johnson, W Li, DB Gordon, A Bhattacharjee, B Curry, J Ghosh, L Brizuela, JS Carroll, M Brown, P Flicek, CM Koch, I Dunham, M Bieda, X Xu, PJ Farnham, P Kapranov, DA Nix, TR Gingeras, X Zhang, H Holster, N Jiang, RD Green, JS Song, SA McCuine, E Anton, L Nguyen, ND Trinklein, Z Ye, K Ching, D Hawkins, B Ren, PC Scacheri, J Rozowsky, A Karpikov, G Euskirchen, S Weissman, M Gerstein, M Snyder, A Yang, Z Moqtaderi, H Hirsch, HP Shulha, Y Fu, Z Weng, K Struhl, RM Myers, JD Lieb, XS Liu (2008). Genome Res 18: 393-403.

Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.
JQ Wu, J Du, J Rozowsky, Z Zhang, AE Urban, G Euskirchen, S Weissman, M Gerstein, M Snyder (2008). Genome Biol 9: R3.

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.
The ENCODE Project Consortium (2007). Nature 447: 799-816.

Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies.
GM Euskirchen, JS Rozowsky, CL Wei, WH Lee, ZD Zhang, S Hartman, O Emanuelsson, V Stolc, S Weissman, MB Gerstein, Y Ruan, M Snyder (2007). Genome Res 17: 898-909.

Structured RNAs in the ENCODE selected regions of the human genome.
S Washietl, JS Pedersen, JO Korbel, C Stocsits, AR Gruber, J Hackermuller, J Hertel, M Lindemeyer, K Reiche, A Tanzer, C Ucla, C Wyss, SE Antonarakis, F Denoeud, J Lagarde, J Drenkow, P Kapranov, TR Gingeras, R Guigo, M Snyder, MB Gerstein, A Reymond, IL Hofacker, PF Stadler (2007). Genome Res 17: 852-64.

Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.
D Zheng, A Frankish, R Baertsch, P Kapranov, A Reymond, SW Choo, Y Lu, F Denoeud, SE Antonarakis, M Snyder, Y Ruan, CL Wei, TR Gingeras, R Guigo, J Harrow, MB Gerstein (2007). Genome Res 17: 839-51.

Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions.
ZD Zhang, A Paccanaro, Y Fu, S Weissman, Z Weng, J Chang, M Snyder, MB Gerstein (2007). Genome Res 17: 787-97.

The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci.
JS Rozowsky, D Newburger, F Sayward, J Wu, G Jordan, JO Korbel, U Nagalakshmi, J Yang, D Zheng, R Guigo, TR Gingeras, S Weissman, P Miller, M Snyder, MB Gerstein (2007). Genome Res 17: 732-45.

Integrated analysis of experimental data sets reveals many novel promoters in 1% of the human genome.
ND Trinklein, U Karaoz, J Wu, A Halees, S Force Aldred, PJ Collins, D Zheng, ZD Zhang, MB Gerstein, M Snyder, RM Myers, Z Weng (2007). Genome Res 17: 720-31.

What is a gene, post-ENCODE? History and updated definition.
MB Gerstein, C Bruce, JS Rozowsky, D Zheng, J Du, JO Korbel, O Emanuelsson, ZD Zhang, S Weissman, M Snyder (2007). Genome Res 17: 669-81.

Tilescope: online analysis pipeline for high-density tiling microarray data.
ZD Zhang, J Rozowsky, HY Lam, J Du, M Snyder, M Gerstein (2007). Genome Biol 8: R81.

The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they?
D Zheng, MB Gerstein (2007). Trends Genet 23: 219-24.

Assessing the performance of different high-density tiling microarray strategies for mapping transcribed regions of the human genome.
O Emanuelsson, U Nagalakshmi, D Zheng, JS Rozowsky, AE Urban, J Du, Z Lian, V Stolc, S Weissman, M Snyder, MB Gerstein (2007). Genome Res 17: 886-97.

BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments.
LY Wang, M Snyder, M Gerstein (2006). Genome Biol 7: R102.

A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating validated biological knowledge.
J Du, JS Rozowsky, JO Korbel, ZD Zhang, TE Royce, MH Schultz, M Snyder, M Gerstein (2006). Bioinformatics 22: 3016-24.

A computational approach for identifying pseudogenes in the ENCODE regions.
D Zheng, MB Gerstein (2006). Genome Biol 7 Suppl 1: S13.1-10.

The ENCODE (ENCyclopedia Of DNA Elements) Project.
The ENCODE Project Consortium (2004). Science 306: 636-40.

