Background: Transcription factors (TFs) and histone modifications (HMs) play critical roles in gene expression by regulating mRNA transcription. Modelling frameworks have been developed to integrate high-throughput omics data, with the aim of elucidating the regulatory logic that results from the interactions of DNA, TFs and HMs. These models have yielded an unexpected and poorly understood result: that TFs and HMs are statistically redundant in explaining mRNA transcript abundance at a genome-wide level.
Results: We constructed predictive models of gene expression by integrating RNA sequencing (RNA-seq), TF and HM chromatin immunoprecipitation sequencing (ChIP-seq), and DNase I hypersensitivity data for two mammalian cell types. All models identified genome-wide statistical redundancy both within and between TFs and HMs, as previously reported. To investigate potential explanations, groups of genes were constructed for ontology-classified biological processes. Predictive models were constructed for each process to explore the distribution of statistical redundancy. We found significant variation in the predictive capacity of TFs and HMs across these processes and demonstrated the predictive power of HMs to be inversely proportional to process enrichment for housekeeping genes.
Conclusions: It is well established that the roles played by TFs and HMs are not functionally redundant. Instead, we attribute the statistical redundancy reported in this and previous genome-wide modelling studies to the heterogeneous distribution of HMs across chromatin domains. Furthermore, we conclude that statistical redundancy between individual TFs can be readily explained by nucleosome-mediated cooperative binding. This may help confer regulatory robustness on the cell by rejecting signalling noise and allowing control via multiple pathways.
Keywords: Gene expression; Histone modifications; Predictive modelling; Transcription factors; Transcriptional regulation.
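
Illustrative sketch: the notion of statistical redundancy used above can be shown with a toy regression on synthetic data. Everything in the example is an assumption made for illustration, not the modelling framework or data of this study: the choice of ordinary least squares, the simulated "TF" and "HM" feature matrices, the shared latent signal, and the helper name `r_squared` are all hypothetical. The point is only that when two feature sets carry largely the same information, each alone explains nearly as much variance in expression as both together.

```python
import numpy as np


def r_squared(features, target):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    return 1.0 - residuals.var() / target.var()


# Synthetic data (assumption): one latent regulatory signal per gene drives
# both feature sets and the expression level.
rng = np.random.default_rng(0)
n_genes = 2000
latent = rng.normal(size=n_genes)

# Three noisy "TF" features and three noisy "HM" features, all reflecting
# the same latent signal, so the two sets are correlated by construction.
tf_features = latent[:, None] + 0.3 * rng.normal(size=(n_genes, 3))
hm_features = latent[:, None] + 0.3 * rng.normal(size=(n_genes, 3))
expression = latent + 0.5 * rng.normal(size=n_genes)

r2_tf = r_squared(tf_features, expression)
r2_hm = r_squared(hm_features, expression)
r2_both = r_squared(np.hstack([tf_features, hm_features]), expression)

# Redundancy shows up as a small gain from combining the two feature sets.
print(f"TF only:  R^2 = {r2_tf:.3f}")
print(f"HM only:  R^2 = {r2_hm:.3f}")
print(f"TF + HM:  R^2 = {r2_both:.3f}")
print(f"Gain from combining: {r2_both - max(r2_tf, r2_hm):.3f}")
```

In this toy setting the redundancy is built in by design. In real data, a small gain from combining feature sets indicates only that they are statistically interchangeable as predictors, which, as the Conclusions note, does not imply functional redundancy.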