Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types

Cell Rep. 2020 Aug 18;32(7):108029. doi: 10.1016/j.celrep.2020.108029.

Abstract

Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential to reconstruct gene regulatory networks and predict functions for non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits.
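The ChIP-seq validation step mentioned above comes down to an interval-overlap question: what fraction of predicted footprints falls within a ChIP-seq peak for the same TF? The abstract does not say how this overlap was computed, so the following is only a minimal sketch under stated assumptions. It assumes BED-like, half-open coordinates; the function names (`build_index`, `overlaps`, `fraction_overlapping`) are hypothetical, and the toy intervals are invented for illustration and are not data from the paper.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(peaks):
    """Group peaks by chromosome, sort them, and merge overlapping ones.

    `peaks` is an iterable of (chrom, start, end) tuples using half-open
    [start, end) coordinates (an assumption, mirroring the BED convention).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, start, end):
    """Return True if [start, end) overlaps any merged peak on `chrom`."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged peak that begins before the footprint ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def fraction_overlapping(footprints, peaks):
    """Fraction of footprints that overlap at least one peak."""
    if not footprints:
        return 0.0
    index = build_index(peaks)
    hits = sum(overlaps(index, c, s, e) for c, s, e in footprints)
    return hits / len(footprints)


# Toy intervals, invented purely for illustration.
toy_peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
toy_footprints = [
    ("chr1", 190, 210),  # inside the merged chr1 peak
    ("chr1", 300, 320),  # starts exactly where the peak ends: no overlap
    ("chr2", 10, 50),    # ends exactly where the peak starts: no overlap
    ("chr2", 79, 90),    # shares one base with the chr2 peak
]
print(fraction_overlapping(toy_footprints, toy_peaks))  # 0.5
```

Merging the peaks first keeps each lookup to a single binary search per footprint. In practice an established interval tool would likely be used for this step; the sketch only makes the underlying comparison explicit.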

Keywords: DNase-seq; ENCODE; footprinting; gene regulation; motifs; psychiatric genetics; transcription factors.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Binding Sites / genetics*
  • Deoxyribonucleases / metabolism*
  • Genomics / methods*
  • Humans
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors
  • Deoxyribonucleases