Background: Proximity-ligation-based techniques, such as Hi-C, involve restriction digestion followed by ligation of formaldehyde cross-linked chromatin. Distinct chromatin states can affect the restriction digestion of the engaged loci, and hence their visibility in the contact maps. Yet the extent and potential impact of this digestion bias remain obscure and under-appreciated in the literature.
Results: Through analysis of 45 Hi-C datasets, lamina-associated domains (LADs), the inactive X chromosome in mammals, and polytene bands in fly, we first established that DNA in condensed chromatin was less accessible to the restriction endonucleases used in Hi-C than DNA in decondensed chromatin. The observed bias was independent of known systematic biases, was not appropriately corrected by existing computational methods, and needed an additional optimization step. We then repurposed this bias to identify novel condensed domains outside LADs, which were bordered by insulators and were dynamically associated with Polycomb-mediated epigenetic and transcriptional states during development.
Conclusions: Our observations suggest that the corrected one-dimensional read counts of existing Hi-C datasets can be reliably repurposed to study the gene-regulatory dynamics associated with chromatin condensation and decondensation, and that existing Hi-C datasets should be interpreted with caution.
Keywords: 3D genome; CTCF; Chromatin condensation; Hi-C; Lamina-associated domains.