Exhaustive identification of genome-wide binding events of transcriptional regulators

Nucleic Acids Res. 2024 Apr 24;52(7):e40. doi: 10.1093/nar/gkae180.

Abstract

Genome-wide binding assays aspire to map the complete binding pattern of gene regulators. Common practice relies on replication (duplicates or triplicates) and high-stringency statistics to favor false negatives over false positives. Here we show that duplicates and triplicates of CUT&RUN are not sufficient to discover the entire activity of transcriptional regulators. We introduce ICEBERG (Increased Capture of Enrichment By Exhaustive Replicate aGgregation), a pipeline that harnesses large numbers of CUT&RUN replicates to discover the full set of binding events and chart the line between false positives and false negatives. We employed ICEBERG to map the full set of H3K4me3-marked regions, the targets of the co-factor β-catenin, and those of the transcription factor TBX3, in human colorectal cancer cells. The ICEBERG datasets allow benchmarking of individual replicates and comparison of peak-calling and replication approaches, and they expose the arbitrary nature of strategies to identify reproducible peaks. Instead of a static view of genomic targets, ICEBERG establishes a spectrum of detection probabilities across the genome for a given factor, underscoring the intrinsic dynamicity of its mechanism of action and making it possible to distinguish frequent from rare regulation events. Finally, ICEBERG discovered instances, undetectable with other approaches, that underlie novel mechanisms of colorectal cancer progression.
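
The idea of a "spectrum of detection probabilities" can be illustrated with a minimal sketch. This is not the ICEBERG implementation, which the abstract does not describe in detail. It assumes each replicate has already been reduced to a list of peak intervals, and it estimates, for fixed-size genomic bins, the fraction of replicates in which a bin overlaps a peak. The function name `detection_frequency`, the `bin_size` parameter, and the toy coordinates are all hypothetical.

```python
from collections import Counter


def detection_frequency(replicate_peaks, bin_size=1000):
    """Return {(chrom, bin_index): fraction of replicates with a peak in that bin}.

    replicate_peaks: list of replicates, each a list of (chrom, start, end)
    tuples with half-open coordinates [start, end). Hypothetical input format.
    """
    counts = Counter()
    for peaks in replicate_peaks:
        # Collect bins per replicate in a set so each replicate counts at most
        # once per bin, however many peaks fall inside it.
        bins = set()
        for chrom, start, end in peaks:
            first = start // bin_size
            last = (end - 1) // bin_size
            for b in range(first, last + 1):
                bins.add((chrom, b))
        counts.update(bins)
    n = len(replicate_peaks)
    return {key: c / n for key, c in counts.items()}


# Toy data (invented for illustration): four replicates of one factor.
replicates = [
    [("chr1", 100, 600), ("chr2", 5000, 5400)],
    [("chr1", 200, 700)],
    [("chr1", 150, 650), ("chr2", 5100, 5300)],
    [("chr1", 300, 500)],
]

freq = detection_frequency(replicates)
for region, p in sorted(freq.items()):
    print(region, p)
```

In this toy input the chr1 region is called in every replicate while the chr2 region is called in only half of them, which is the kind of frequent-versus-rare distinction the abstract refers to. A duplicate or triplicate design could easily miss the chr2 region altogether.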

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Cell Line, Tumor
  • Chromatin Immunoprecipitation Sequencing
  • Colorectal Neoplasms / genetics
  • Colorectal Neoplasms / metabolism
  • Genome, Human
  • Histones / genetics
  • Histones / metabolism
  • Humans
  • Protein Binding
  • Software*
  • T-Box Domain Proteins / genetics
  • T-Box Domain Proteins / metabolism
  • Transcription Factors / genetics
  • Transcription Factors / metabolism
  • Transcription, Genetic*
  • beta Catenin / genetics
  • beta Catenin / metabolism

Substances

  • beta Catenin
  • Histones
  • T-Box Domain Proteins
  • Transcription Factors