Multivariate Time Series Change-Point Detection with a Novel Pearson-like Scaled Bregman Divergence

Tong Si; Yunge Wang; Lingling Zhang; Evan Richmond; Tae-Hyuk Ahn; Haijun Gong

doi:10.3390/stats7020028

Stats (Basel). 2024 Jun;7(2):462-480. doi: 10.3390/stats7020028. Epub 2024 May 13.

Authors

Tong Si¹, Yunge Wang¹, Lingling Zhang², Evan Richmond¹, Tae-Hyuk Ahn³, Haijun Gong¹

Affiliations

¹ Department of Mathematics and Statistics, Saint Louis University, St. Louis, MO 63103, USA.
² Department of Mathematics and Statistics, University at Albany SUNY, Albany, NY 12222, USA.
³ Department of Computer Science, Saint Louis University, St. Louis, MO 63103, USA.

Abstract

Change-point detection (CPD) is a challenging problem with applications across many real-world domains. The primary objective of CPD is to identify the time points at which the underlying system transitions between states, each characterized by a distinct data distribution. Precise identification of change points in time-series omics data can provide insights into the dynamic and temporal characteristics of complex biological systems. Many change-point detection methods have traditionally relied on direct estimation of the data distributions; however, such approaches become impractical for high-dimensional data. Density ratio methods have emerged as promising alternatives, since estimating a density ratio is easier than estimating the individual densities. Nevertheless, the divergence measures used in these methods may suffer from numerical instability during computation. Additionally, the most popular $\alpha$-relative Pearson divergence does not measure the dissimilarity between the two data distributions themselves, but rather the dissimilarity to a mixture of distributions. To overcome the limitations of existing density ratio-based methods, we propose a novel approach called the Pearson-like scaled Bregman divergence-based (PLsBD) density ratio estimation method for change-point detection. Our theoretical analysis derives an analytical expression for the Pearson-like scaled Bregman divergence using a mixture measure. We integrate the PLsBD with a kernel regression model and apply a random sampling strategy to identify change points in both synthetic data and real-world high-dimensional genomics data of Drosophila. Our PLsBD method demonstrates superior performance compared to many other change-point detection methods.

Keywords: change-point detection; density ratio estimation; random sampling; scaled Bregman divergence; time-series data analysis.
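
The abstract's remark about the $\alpha$-relative Pearson divergence can be made concrete with a small sketch. The code below is not the PLsBD method proposed in the paper. It shows only the standard $\alpha$-relative Pearson divergence that the abstract names as the existing baseline, written for discrete distributions purely for illustration. The function names `relative_ratio` and `alpha_relative_pearson` are hypothetical. Under these assumptions, the divergence compares p with the mixture $\alpha p + (1-\alpha) q$ rather than with q itself, and the relative ratio stays bounded by $1/\alpha$ when $\alpha > 0$.

```python
import math


def relative_ratio(p, q, alpha):
    """Relative density ratio p / (alpha*p + (1-alpha)*q) for discrete p, q.

    Illustrative helper (hypothetical name). Cells where the mixture is zero
    and p is zero are reported as 0.0; a zero mixture with p > 0 (only
    possible when alpha == 0) is reported as infinity.
    """
    ratios = []
    for pi, qi in zip(p, q):
        mix = alpha * pi + (1.0 - alpha) * qi
        if mix == 0.0:
            ratios.append(math.inf if pi > 0.0 else 0.0)
        else:
            ratios.append(pi / mix)
    return ratios


def alpha_relative_pearson(p, q, alpha=0.1):
    """Standard alpha-relative Pearson divergence for discrete distributions.

    Computes 0.5 * sum_x mix(x) * (p(x)/mix(x) - 1)^2 with
    mix = alpha*p + (1-alpha)*q. With alpha = 0 this is the ordinary
    Pearson divergence of p relative to q.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    total = 0.0
    for pi, qi in zip(p, q):
        mix = alpha * pi + (1.0 - alpha) * qi
        if mix == 0.0:
            if pi > 0.0:
                return math.inf  # unbounded ratio: divergence is infinite
            continue  # empty cell contributes nothing
        total += mix * (pi / mix - 1.0) ** 2
    return 0.5 * total


# Two toy distributions over three cells (made-up numbers for illustration).
p = [0.5, 0.5, 0.0]
q = [0.1, 0.1, 0.8]

plain = alpha_relative_pearson(p, q, alpha=0.0)     # ordinary Pearson divergence
relative = alpha_relative_pearson(p, q, alpha=0.5)  # compares p with the mixture
ratios = relative_ratio(p, q, alpha=0.5)            # each entry is at most 1/alpha
```

In this toy case the plain ratio p/q reaches 5, while the relative ratio with $\alpha = 0.5$ cannot exceed 2. That boundedness is why the relative form is numerically convenient, at the cost of comparing against a mixture instead of q.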

Grants and funding