Bayesian generalized biclustering analysis via adaptive structured shrinkage

Biostatistics. 2020 Jul 1;21(3):610-624. doi: 10.1093/biostatistics/kxy081.

Abstract

Biclustering techniques can identify local patterns of a data matrix by clustering feature space and sample space at the same time. Various biclustering methods have been proposed and successfully applied to analysis of gene expression data. While existing biclustering methods have many desirable features, most of them are developed for continuous data and few of them can efficiently handle -omics data of various types, for example, binomial data as in single nucleotide polymorphism data or negative binomial data as in RNA-seq data. In addition, none of existing methods can utilize biological information such as those from functional genomics or proteomics. Recent work has shown that incorporating biological information can improve variable selection and prediction performance in analyses such as linear regression and multivariate analysis. In this article, we propose a novel Bayesian biclustering method that can handle multiple data types including Gaussian, Binomial, and Negative Binomial. In addition, our method uses a Bayesian adaptive structured shrinkage prior that enables feature selection guided by existing biological information. Our simulation studies and application to multi-omics datasets demonstrate robust and superior performance of the proposed method, compared to other existing biclustering methods.

Keywords: -omics data; Adaptive shrinkage prior; Bayesian; Biclustering; Biological information; Integrative analysis.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Biostatistics / methods*
  • Cluster Analysis
  • Computational Biology / methods*
  • Computer Simulation
  • Datasets as Topic
  • Genomics
  • Humans
  • Models, Biological*
  • Models, Statistical*