Background: Improvements in high-throughput technologies have produced large numbers of multi-omics experimental datasets. Initial analyses of these data have revealed many concurrent gene alterations within a single dataset and/or across multiple omics datasets. Although powerful bioinformatics pipelines have been developed to store, manipulate, and analyze these data, few explicitly identify and assess recurrent co-occurring aberrations across multiple regulation levels.
Results: Here, we introduce a novel R package, OmicsARules, that identifies concerted changes among genes within an association rule mining framework. OmicsARules embeds a new rule-interestingness measure, Lamda3, to evaluate associated patterns and prioritize the most biologically meaningful gene associations. As demonstrated with DNA methylation and RNA-seq datasets from breast invasive carcinoma (BRCA), esophageal carcinoma (ESCA), and lung adenocarcinoma (LUAD), Lamda3 achieved greater biological significance than other rule-ranking measures. Furthermore, OmicsARules can illustrate mechanistic connections between methylation and transcription based on a combined omics dataset. OmicsARules is available as a free and open-source R package.
Conclusions: OmicsARules searches for concurrent patterns among frequently altered genes and thus provides a new dimension for exploring single or multiple omics data across sequencing platforms.
Keywords: Association rules; Data integration; Multi-omics experiments; OmicsARules; R package.