Accurate Promoter and Enhancer Identification in 127 ENCODE and Roadmap Epigenomics Cell Types and Tissues by GenoSTAN

PLoS One. 2017 Jan 5;12(1):e0169249. doi: 10.1371/journal.pone.0169249. eCollection 2017.

Abstract

Accurate maps of promoters and enhancers are required for understanding transcriptional regulation. Promoters and enhancers are usually mapped by integration of chromatin assays charting histone modifications, DNA accessibility, and transcription factor binding. However, current algorithms are limited by unrealistic data distribution assumptions. Here we propose GenoSTAN (Genomic STate ANnotation), a hidden Markov model overcoming these limitations. We map promoters and enhancers for 127 cell types and tissues from the ENCODE and Roadmap Epigenomics projects, today's largest compendium of chromatin assays. Extensive benchmarks demonstrate that GenoSTAN generally identifies promoters and enhancers with significantly higher accuracy than previous methods. Moreover, GenoSTAN-derived promoters and enhancers showed significantly higher enrichment of complex trait-associated genetic variants than current annotations. Altogether, GenoSTAN provides an easy-to-use tool to define promoters and enhancers in any system, and our annotation of human transcriptional cis-regulatory elements constitutes a rich resource for future research in biology and medicine.

MeSH terms

  • Algorithms
  • Chromatin / metabolism
  • Computational Biology / methods
  • Enhancer Elements, Genetic / genetics*
  • Epigenomics / methods*
  • Histones / metabolism
  • Humans
  • Promoter Regions, Genetic / genetics*
  • Regulatory Elements, Transcriptional / genetics

Substances

  • Chromatin
  • Histones

Grants and funding

BZ was supported by the German Academic Exchange Service (DAAD short term research grant). AT was supported by the German Federal Ministry of Education and Research (BMBF e:Bio grant) and by the Deutsche Forschungsgemeinschaft (DFG SFB680 grant). JG was supported by the Bavarian Research Center for Molecular Biosystems and the Bundesministerium für Bildung und Forschung, Juniorverbund in der Systemmedizin “mitOmics” grant FKZ 01ZX1405A. PC was funded by Advanced Grant TRANSREGULON of the European Research Council, the Deutsche Forschungsgemeinschaft, the Volkswagen Foundation, CIMED, and SciLifeLab. This work was supported by the German Research Foundation (DFG) and the Technische Universität München within the Open Access Publishing Funding Programme.