Progressive Bounded Error Piecewise Linear Approximation with Resolution Reduction for Time Series Data Compression

Jeng-Wei Lin; Shih-Wei Liao; Yu-Hung Tsai; Ching-Che Huang

doi:10.3390/s25010145

Sensors (Basel). 2024 Dec 29;25(1):145. doi: 10.3390/s25010145.

Authors

Jeng-Wei Lin¹, Shih-Wei Liao², Yu-Hung Tsai¹, Ching-Che Huang¹

Affiliations

¹ Department of Information Management, Tunghai University, Taichung 407224, Taiwan.
² Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan.

PMID: 39796936

Abstract

Today, huge amounts of time series data are sensed continuously by AIoT devices and transmitted to edge nodes and then to data centers. Transmitting, storing, and processing these data consumes a great deal of energy. Data compression technologies are commonly used to reduce the data size and thus save energy. When a certain level of data accuracy is sacrificed, lossy compression technologies can achieve better compression ratios. However, different applications may have different requirements for data accuracy. Instead of keeping multiple compressed versions of a time series w.r.t. different error bounds, HIRE (hierarchical residual encoding) maintains a tree in which the root records a constant function that approximates the whole time series, and every other node records a constant function that approximates a part of the residual function of its parent over a particular time period. To retrieve data w.r.t. a specific error bound, HIRE traverses the tree from the root down to certain levels according to the requested error bound and aggregates the constant functions on the visited nodes to dynamically generate a new bounded-error compressed version. However, the number of nodes to be visited, and thus the data size of the new version, is unknown until the tree traversal completes. In this paper, a time series is progressively decomposed into multiple piecewise linear functions. The first function is an approximation of the original time series w.r.t. the largest error bound. The second function is an approximation of the residual function between the original time series and the first function w.r.t. the second largest error bound, and so forth. The sum of the first, second, …, and m-th functions is an approximation of the original time series w.r.t. the m-th error bound. In each iteration, Swing-RR is used to generate a Bounded Error Piecewise Linear Approximation (BEPLA), in which Resolution Reduction (RR) plays an important role. Eight real-world datasets are used to evaluate the proposed method.
For each dataset, approximations w.r.t. three typical error bounds, 5%, 1%, and 0.5%, are requested. Three BEPLAs are generated accordingly, and they can be summed cumulatively to form three approximations w.r.t. the three error bounds. For all datasets, the total data size of the three BEPLAs is almost the same as the size needed to store just one version w.r.t. the smallest error bound, and significantly smaller than the size needed to keep three independent versions. The experimental results show that the proposed method, referred to as PBEPLA-RR, can achieve very good compression ratios and provide multiple approximations w.r.t. different error bounds.
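
The progressive decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' Swing-RR: the per-layer approximator here is a simple greedy slope-cone filter used as a stand-in, Resolution Reduction is omitted entirely, and the error bounds are treated as absolute values rather than percentages. The function names `greedy_pla`, `evaluate`, `progressive_decompose`, and `reconstruct` are hypothetical and chosen only for this example.

```python
import math


def greedy_pla(values, eps):
    """Stand-in bounded-error PLA (NOT Swing-RR).

    Returns segments (start, length, intercept, slope). Each segment is
    anchored at its first sample and extended while some slope keeps every
    covered sample within eps of the line.
    """
    segments = []
    n = len(values)
    start = 0
    while start < n:
        y0 = values[start]
        lo, hi = -math.inf, math.inf  # feasible slope interval
        end = start + 1
        while end < n:
            dt = end - start
            new_lo = max(lo, (values[end] - eps - y0) / dt)
            new_hi = min(hi, (values[end] + eps - y0) / dt)
            if new_lo > new_hi:  # no single slope fits all samples any more
                break
            lo, hi = new_lo, new_hi
            end += 1
        slope = 0.0 if end == start + 1 else (lo + hi) / 2.0
        segments.append((start, end - start, y0, slope))
        start = end
    return segments


def evaluate(segments):
    """Expand segments back into one value per sample."""
    out = []
    for _start, length, y0, slope in segments:
        out.extend(y0 + slope * k for k in range(length))
    return out


def progressive_decompose(values, error_bounds):
    """One PLA layer per error bound, largest bound first.

    Layer 1 approximates the series; each later layer approximates the
    residual left by the sum of the earlier layers.
    """
    residual = list(values)
    layers = []
    for eps in sorted(error_bounds, reverse=True):
        segs = greedy_pla(residual, eps)
        layers.append(segs)
        fitted = evaluate(segs)
        residual = [r - f for r, f in zip(residual, fitted)]
    return layers


def reconstruct(layers, m):
    """Sum the first m layers; the result meets the m-th (sorted) bound."""
    total = None
    for segs in layers[:m]:
        fitted = evaluate(segs)
        total = fitted if total is None else [a + b for a, b in zip(total, fitted)]
    return total


if __name__ == "__main__":
    # Synthetic signal and absolute bounds, both assumptions for illustration.
    signal = [10.0 * math.sin(i / 15.0) + 0.3 * math.sin(1.7 * i) for i in range(300)]
    bounds = [0.5, 0.1, 0.05]
    layers = progressive_decompose(signal, bounds)
    for m, eps in enumerate(sorted(bounds, reverse=True), start=1):
        approx = reconstruct(layers, m)
        worst = max(abs(a - b) for a, b in zip(signal, approx))
        print(f"layers={m} bound={eps} max_error={worst:.4f} segments={len(layers[m - 1])}")
```

Because the m-th layer approximates the residual of the first m − 1 layers within the m-th bound, the sum of the first m layers stays within that bound of the original series. A reader who needs only a coarse approximation can therefore stop after the first layer, and the size of each layer is known in advance from its segment count.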

Keywords: PBEPLA-RR; Swing-RR; bounded error piecewise linear approximation; hierarchical residual encoding; progressive data compression; sensor data; time series.