OmicsARules: a R package for integration of multi-omics datasets via association rules mining

BMC Bioinformatics. 2019 Nov 8;20(1):554. doi: 10.1186/s12859-019-3171-0.

Abstract

Background: Improvements in high-throughput technologies have produced large amounts of multi-omics experimental data. Initial analysis of these data has revealed many concurrent gene alterations within a single dataset and/or among multiple omics datasets. Although powerful bioinformatics pipelines have been developed to store, manipulate and analyze these data, few explicitly find and assess the recurrent co-occurring aberrations across multiple regulation levels.

Results: Here, we introduce a novel R package, OmicsARules, to identify concerted changes among genes within an association rules mining framework. OmicsARules embeds a new rule-interestingness measure, Lamda3, to evaluate associated patterns and prioritize the most biologically meaningful gene associations. As demonstrated with DNA methylation and RNA-seq datasets from breast invasive carcinoma (BRCA), esophageal carcinoma (ESCA) and lung adenocarcinoma (LUAD), Lamda3 achieved better biological significance than other rule-ranking measures. Furthermore, OmicsARules can illustrate the mechanistic connections between methylation and transcription, based on a combined omics dataset. OmicsARules is available as a free and open-source R package.
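
For readers unfamiliar with association rules mining, the following is a minimal sketch of the general idea, not the OmicsARules implementation. It treats each sample as a set of altered genes and scores rules of the form "gene X altered implies gene Y altered" with the standard support, confidence and lift measures. The gene names, sample data, function names and the 0.4 support threshold are hypothetical, chosen only for illustration. The abstract does not define Lamda3, so it is not reproduced here; lift stands in as a conventional ranking measure.

```python
from itertools import combinations

# Hypothetical toy data: each sample maps to the set of genes called "altered".
samples = {
    "S1": {"GENE_A", "GENE_B", "GENE_C"},
    "S2": {"GENE_A", "GENE_B"},
    "S3": {"GENE_A", "GENE_C"},
    "S4": {"GENE_B"},
    "S5": {"GENE_A", "GENE_B", "GENE_C"},
}


def support(itemset, transactions):
    """Fraction of samples in which every gene of `itemset` is altered."""
    itemset = set(itemset)
    hits = sum(1 for genes in transactions.values() if itemset <= genes)
    return hits / len(transactions)


def rule_metrics(lhs, rhs, transactions):
    """Standard support, confidence and lift for the rule lhs -> rhs."""
    s_lhs = support(lhs, transactions)
    s_rhs = support(rhs, transactions)
    s_both = support(set(lhs) | set(rhs), transactions)
    confidence = s_both / s_lhs if s_lhs else 0.0
    lift = confidence / s_rhs if s_rhs else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


def pairwise_rules(transactions, min_support=0.4):
    """Enumerate single-gene rules above a support threshold, ranked by lift."""
    genes = sorted(set().union(*transactions.values()))
    rules = []
    for a, b in combinations(genes, 2):
        for lhs, rhs in ((a, b), (b, a)):
            metrics = rule_metrics([lhs], [rhs], transactions)
            if metrics["support"] >= min_support:
                rules.append((lhs, rhs, metrics))
    return sorted(rules, key=lambda r: r[2]["lift"], reverse=True)


if __name__ == "__main__":
    for lhs, rhs, m in pairwise_rules(samples):
        print(f"{lhs} -> {rhs}: support={m['support']:.2f}, "
              f"confidence={m['confidence']:.2f}, lift={m['lift']:.2f}")
```

A lift above 1 indicates that the two alterations co-occur more often than expected if they were independent. A rule-interestingness measure such as Lamda3 plays the same role as lift in this sketch: it orders the candidate rules so that the most meaningful associations appear first.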

Conclusions: OmicsARules searches for concurrent patterns among frequently altered genes, thus providing a new dimension for exploring single or multiple omics data across sequencing platforms.

Keywords: Association rules; Data integration; Multi-omics experiments; OmicsARules; R package.

MeSH terms

  • Computational Biology / methods*
  • Data Mining*
  • Databases, Genetic*
  • Genomics*
  • Humans
  • Neoplasms / genetics
  • Software*