dCCA: detecting differential covariation patterns between two types of high-throughput omics data

Brief Bioinform. 2024 May 23;25(4):bbae288. doi: 10.1093/bib/bbae288.

Abstract

Motivation: The advent of multimodal omics data has provided an unprecedented opportunity to systematically investigate underlying biological mechanisms from distinct yet complementary angles. However, the joint analysis of multi-omics data remains challenging because it requires modeling interactions between multiple sets of high-throughput variables. Furthermore, these interaction patterns may vary across different clinical groups, reflecting disease-related biological processes.

Results: We propose a novel approach called Differential Canonical Correlation Analysis (dCCA) to capture differential covariation patterns between two multivariate vectors across clinical groups. Unlike classical Canonical Correlation Analysis, which maximizes the correlation between two multivariate vectors, dCCA aims to maximally recover multivariate-to-multivariate covariation patterns that differ between groups. We have developed computational algorithms and a toolkit that sparsely select paired subsets of variables from the two sets of variables while maximizing the differential covariation. Extensive simulation analyses demonstrate the superior performance of dCCA in selecting variables of interest and recovering differential correlations. We applied dCCA to the Pan-Kidney cohort from The Cancer Genome Atlas (TCGA) program and identified differential covariation between noncoding RNAs and gene expression.
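To make the idea of a differential covariation pattern concrete, the sketch below takes the difference between the cross-covariance matrices of two groups and extracts its leading pair of singular vectors by power iteration. This is a simplified, non-sparse illustration and not the dCCA algorithm or the API of the R package. The function names (`cross_cov`, `leading_differential_pair`, `simulate`), the simulated data, and the choice of a plain covariance difference as the target are all assumptions made for illustration.

```python
import math
import random


def cross_cov(X, Y):
    """Sample cross-covariance (p x q) of paired samples X (n x p) and Y (n x q)."""
    n, p, q = len(X), len(X[0]), len(Y[0])
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = [sum(row[k] for row in Y) / n for k in range(q)]
    return [[sum((X[i][j] - mx[j]) * (Y[i][k] - my[k]) for i in range(n)) / (n - 1)
             for k in range(q)] for j in range(p)]


def normalize(vec):
    """Return the unit-length version of vec together with its Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec], norm


def leading_differential_pair(X0, Y0, X1, Y1, iters=200):
    """Leading singular pair (u, v, sigma) of D = Cov1(X, Y) - Cov0(X, Y).

    Illustrative assumption: the group difference is summarized by the plain
    difference of cross-covariance matrices, with no sparsity penalty.
    """
    C0, C1 = cross_cov(X0, Y0), cross_cov(X1, Y1)
    p, q = len(C0), len(C0[0])
    D = [[C1[j][k] - C0[j][k] for k in range(q)] for j in range(p)]
    v, _ = normalize([1.0] * q)
    for _ in range(iters):
        # Alternate u <- D v and v <- D^T u, renormalizing each time.
        u, _ = normalize([sum(D[j][k] * v[k] for k in range(q)) for j in range(p)])
        v, sigma = normalize([sum(D[j][k] * u[j] for j in range(p)) for k in range(q)])
    return u, v, sigma


def simulate(n, p, q, shared, rng):
    """Hypothetical data: independent noise, plus a latent factor linking
    X[:, 0] and Y[:, 1] only when shared is True."""
    X, Y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = [rng.gauss(0, 1) for _ in range(p)]
        y = [rng.gauss(0, 1) for _ in range(q)]
        if shared:
            x[0] += z
            y[1] += z
        X.append(x)
        Y.append(y)
    return X, Y


rng = random.Random(0)
X0, Y0 = simulate(500, 4, 3, shared=False, rng=rng)  # reference group
X1, Y1 = simulate(500, 4, 3, shared=True, rng=rng)   # group with extra covariation
u, v, sigma = leading_differential_pair(X0, Y0, X1, Y1)
print("X loadings:", [round(a, 2) for a in u])
print("Y loadings:", [round(a, 2) for a in v])
print("differential strength:", round(sigma, 2))
```

In this toy setting the largest loadings should fall on the first X variable and the second Y variable, the only pair whose covariation differs between the groups. A sparse method such as the one described above would additionally shrink the remaining loadings to exactly zero, which this sketch does not attempt.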

Availability and implementation: The R package that implements dCCA is available at https://github.com/hwiyoungstat/dCCA.

Keywords: RNA gene regulation; bipartite graph; canonical correlation analysis; differential correlation; multiomics; multivariate-to-multivariate.

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Gene Expression Profiling / methods
  • Genomics / methods
  • Humans
  • Multivariate Analysis