Dissecting the regulatory activity and sequence content of loci with exceptional numbers of transcription factor associations

Genome Res. 2020 Jul;30(7):939-950. doi: 10.1101/gr.260463.119. Epub 2020 Jul 2.

Abstract

DNA-associated proteins (DAPs) classically regulate gene expression by binding to regulatory loci such as enhancers or promoters. Expanding catalogs of genome-wide DAP binding maps reveal thousands of loci that, unlike the majority of conventional enhancers and promoters, associate with dozens of different DAPs with apparently little regard for motif preference. Understanding DAP association and coordination at such regulatory loci is therefore essential to deciphering how these regions contribute to normal development and disease. In this study, we aggregated publicly available ChIP-seq data from 469 human DAPs assayed in three cell lines and integrated these data with an orthogonal data set of 352 nonredundant, in vitro-derived motifs mapped to the genome within DNase I hypersensitivity footprints to characterize regions with high numbers of DAP associations. We establish a generalizable definition for high occupancy target (HOT) loci and identify putative driver DAP motifs in HepG2 cells, including HNF4A, SP1, SP5, and ETV4, that are highly prevalent and show sequence conservation at HOT loci. The number of different DAPs associated with an element is positively associated with evidence of regulatory activity. By systematically mutating 245 HOT loci with a massively parallel mutagenesis assay, we localized regulatory activity to a central core region that depends on the motif sequences of our previously nominated driver DAPs. In sum, this work leverages the growing volume of publicly available DAP motif and ChIP-seq data to explore how DAP associations contribute to genome-wide transcriptional regulation.
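
The idea of characterizing loci by the number of distinct DAPs associated with them can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy coordinates, the half-open interval convention, and the `min_daps` cutoff are all assumptions made for illustration, and the abstract does not state the threshold used in the paper's HOT definition.

```python
from collections import defaultdict


def count_daps_per_locus(loci, peaks):
    """Count distinct DAPs whose peaks overlap each locus.

    loci:  iterable of (chrom, start, end), half-open intervals (assumed).
    peaks: iterable of (dap_name, chrom, start, end), same convention.
    """
    # Group peaks by chromosome so each locus is only compared
    # against peaks on its own chromosome.
    peaks_by_chrom = defaultdict(list)
    for dap, chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end, dap))

    counts = {}
    for chrom, l_start, l_end in loci:
        # A set ensures that several peaks from the same DAP count once.
        daps = {
            dap
            for p_start, p_end, dap in peaks_by_chrom[chrom]
            if p_start < l_end and l_start < p_end
        }
        counts[(chrom, l_start, l_end)] = len(daps)
    return counts


def hot_loci(counts, min_daps):
    """Return loci with at least min_daps distinct DAPs (hypothetical cutoff)."""
    return sorted(locus for locus, n in counts.items() if n >= min_daps)


# Toy data; DAP names are taken from the abstract, coordinates are invented.
peaks = [
    ("HNF4A", "chr1", 100, 300),
    ("SP1", "chr1", 150, 350),
    ("ETV4", "chr1", 250, 400),
    ("SP1", "chr1", 260, 280),   # second SP1 peak, not counted twice
    ("HNF4A", "chr2", 500, 700),
]
loci = [("chr1", 200, 300), ("chr2", 600, 650), ("chr1", 1000, 1100)]

counts = count_daps_per_locus(loci, peaks)
hot = hot_loci(counts, min_daps=3)
print(counts)
print(hot)
```

The nested scan is quadratic per chromosome and is only meant to make the counting logic explicit; with hundreds of genome-wide ChIP-seq experiments, an interval tree or a sorted sweep would be a more practical choice.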

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Composition
  • Cell Line
  • Chromatin / chemistry
  • Chromatin Immunoprecipitation Sequencing
  • DNA / chemistry
  • Enhancer Elements, Genetic*
  • Gene Expression Regulation*
  • Genetic Loci
  • Genome
  • Hep G2 Cells
  • Humans
  • Mutagenesis
  • Mutation
  • Nucleotide Motifs
  • Promoter Regions, Genetic*
  • Transcription Factors / metabolism*

Substances

  • Chromatin
  • Transcription Factors
  • DNA