Background: Various chromatin modifications, identified in large-scale epigenomic analyses, are associated with distinct phenotypes of different cells and disease phases. To improve our understanding of these variations, many computational methods have been developed to discover novel sites and cell-specific chromatin modifications. However, existing methods leave room for improvement when they are applied to resolve the histone code hypothesis. We therefore set out to develop a computational method for de novo discovery of combinatorial chromatin modification patterns that characterize epigenetic variations in distinct phenotypes of different cells.
Results: We report a new computational approach, ChARM (Combinatorial Chromatin Modification Patterns using Association Rule Mining), for the de novo discovery of combinatorial patterns of differential chromatin modifications. We used ChARM to analyse chromatin modification data from the livers of normal (non-cancerous) mice and hepatitis B virus X (HBx)-transgenic mice with hepatocellular carcinoma, and discovered 2,409 association rules representing combinatorial chromatin modification patterns. Among these, the combination of three histone modifications, a loss of H3K4Me3 and gains of H3K27Me3 and H3K36Me3, was the most striking pattern associated with the cancer. This pattern was enriched in functional elements of the mouse genome such as promoters, coding exons and 5' UTRs with high CpG content, and CpG islands. It also showed strong correlations with polymerase activity at promoters and DNA methylation levels at gene bodies. We found that 30% of the genes associated with the pattern were differentially expressed in HBx-transgenic mice compared with normal mice, and 78.9% of these genes were down-regulated. The significant canonical pathways enriched in the pattern (Wnt/β-catenin, cAMP, Ras, and Notch signalling) could account for HBx-associated pathogenesis.
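To illustrate the general idea of association rule mining over chromatin modification changes, the following is a minimal sketch and not the ChARM implementation. It assumes each genomic region is represented as a set of differential modification calls, and it scores candidate rules by support and confidence with a brute-force search. The function name `mine_rules`, the thresholds, the item labels, and the toy `regions` data are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.3, min_confidence=0.8, max_size=3):
    """Brute-force association rule mining (illustrative only).

    transactions: list of sets, one set of modification calls per region.
    Returns a list of (antecedent, consequent, support, confidence) tuples,
    where the consequent is always a single item.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Fraction of regions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset)
            if s == 0 or s < min_support:
                continue
            for consequent in combo:
                antecedent = itemset - {consequent}
                # Confidence = support(whole itemset) / support(antecedent).
                conf = s / support(antecedent)
                if conf >= min_confidence:
                    rules.append((tuple(sorted(antecedent)), consequent, s, conf))
    return rules


# Hypothetical toy data: five regions with differential modification calls.
regions = [
    {"H3K4Me3_loss", "H3K27Me3_gain", "H3K36Me3_gain"},
    {"H3K4Me3_loss", "H3K27Me3_gain", "H3K36Me3_gain"},
    {"H3K4Me3_loss", "H3K27Me3_gain"},
    {"H3K27Me3_gain"},
    {"H3K36Me3_gain"},
]

rules = mine_rules(regions)
for antecedent, consequent, s, conf in rules:
    print(f"{' + '.join(antecedent)} -> {consequent} "
          f"(support={s:.2f}, confidence={conf:.2f})")
```

Exhaustive enumeration is only practical for a small number of modification types; for larger item sets, a pruning strategy such as Apriori would typically replace the inner loops.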
Conclusions: ChARM, an unsupervised method for discovering combinatorial chromatin modification patterns, can identify histone modifications that occur globally. ChARM provides a scalable framework that can easily be applied to find combination patterns at various levels, which should reflect chromatin modifications ranging from globally common to locally rare.
Keywords: Association rule mining; Chromatin signature; Combinatorial histone modifications; Differential modifications; Hepatitis B virus X (HBx)-transgenic mice; Hepatocellular carcinoma.