Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures

Electron J Stat. 2016 Nov 16;10(2):3338-3354. doi: 10.1214/16-ejs1171.

Abstract

In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard, we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this, we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.

Keywords: Bayes nonparametrics; contingency table; dependence measure; hypothesis testing; mixture model; mutual information.