Accurate and sensitive quantification of protein-DNA binding affinity

Proc Natl Acad Sci U S A. 2018 Apr 17;115(16):E3692-E3701. doi: 10.1073/pnas.1714376115. Epub 2018 Apr 2.

Abstract

Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.
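As a rough illustration of what a biophysical model of protein-DNA recognition can look like, the sketch below scores a DNA sequence with an additive free-energy matrix and sums the Boltzmann weights over every binding window on both strands. It is a minimal sketch under stated assumptions, not the NRLB implementation: the function names, the `TOY_DDG` matrix, and the 2.0 RT mismatch penalty are all hypothetical, and no parameters are fitted to data here. The consensus CACGTG is used only as a convenient palindromic example site.

```python
import math

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def toy_matrix(consensus, penalty):
    """Build a hypothetical energy matrix: 0 for the consensus base,
    `penalty` (in units of RT) for any other base at that position."""
    return [{b: (0.0 if b == c else penalty) for b in BASES} for c in consensus]


def window_affinity(window, ddg):
    """Relative affinity of one binding window under an additive model:
    exp(-sum of per-position free-energy penalties, in units of RT)."""
    total = sum(ddg[i][base] for i, base in enumerate(window))
    return math.exp(-total)


def sequence_affinity(seq, ddg):
    """Sum window affinities over all offsets and both strands, so that
    every possible binding frame in the sequence contributes."""
    k = len(ddg)
    total = 0.0
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - k + 1):
            total += window_affinity(strand[start:start + k], ddg)
    return total


# Assumed toy parameters, chosen for illustration only.
TOY_DDG = toy_matrix("CACGTG", 2.0)

if __name__ == "__main__":
    for probe in ("AACACGTGAA", "AACACCTGAA"):
        print(probe, sequence_affinity(probe, TOY_DDG))
```

In this toy setting a probe carrying the intact consensus scores higher than one with a single mismatch, while weaker windows still contribute a nonzero amount to the total, which is the sense in which low-affinity sites remain visible to such a model.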

Keywords: SELEX; computational modeling; enhancer assays; low-affinity binding sites; transcription factors.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Binding Sites
  • DNA / metabolism*
  • DNA Footprinting / methods*
  • DNA-Binding Proteins / metabolism*
  • Datasets as Topic
  • Drosophila Proteins / metabolism
  • Electrophoretic Mobility Shift Assay
  • Enhancer Elements, Genetic
  • Gene Library
  • Homeodomain Proteins / metabolism
  • Humans
  • Models, Molecular
  • Protein Binding
  • Protein Conformation
  • Recombinant Proteins / metabolism
  • Transcription Factors / metabolism
  • Tumor Suppressor Protein p53 / metabolism

Substances

  • DNA-Binding Proteins
  • Drosophila Proteins
  • Homeodomain Proteins
  • Recombinant Proteins
  • TP53 protein, human
  • Transcription Factors
  • Tumor Suppressor Protein p53
  • exd protein, Drosophila
  • DNA