VAE-IF: Deep feature extraction with averaging for fully unsupervised artifact detection in routinely acquired ICU time-series

Hollan Haule; Ian Piper; Patricia Jones; Chen Qin; Tsz-Yan Milly Lo; Javier Escudero

doi:10.1016/j.compbiomed.2024.109610

Comput Biol Med. 2024 Dec 31:186:109610. doi: 10.1016/j.compbiomed.2024.109610. Online ahead of print.

Authors

Hollan Haule¹, Ian Piper², Patricia Jones³, Chen Qin⁴, Tsz-Yan Milly Lo², Javier Escudero⁵

Affiliations

¹ Institute for Imaging, Data and Communications (IDCOM), School of Engineering, University of Edinburgh, Edinburgh, EH9 3FB, UK.
² Centre of Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, UK.
³ Department of Child Life and Health, University of Edinburgh, Edinburgh, UK.
⁴ Department of Electrical and Electronics Engineering & I-X, Imperial College London, London, UK.
⁵ Institute for Imaging, Data and Communications (IDCOM), School of Engineering, University of Edinburgh, Edinburgh, EH9 3FB, UK.

PMID: 39742825

Abstract

Artifacts are a common problem in physiological time series collected in intensive care units (ICUs) and other settings, and they degrade the quality and reliability of clinical research and patient care. Manual annotation of artifacts is costly and time-consuming, which makes it impractical, so automated methods are desired. Here, we propose a novel, fully unsupervised approach to detect artifacts in clinical-standard, minute-by-minute resolution ICU data without any prior labeling or signal-specific knowledge. Our approach combines a variational autoencoder (VAE) and an isolation forest (IF) into a hybrid model that learns features and identifies anomalies in different types of vital signs, such as blood pressure, heart rate, and intracranial pressure. We evaluate our approach on a real-world ICU dataset and compare it with supervised benchmark models based on long short-term memory (LSTM) and XGBoost, as well as with statistical methods such as ARIMA. We show that our unsupervised approach achieves sensitivity comparable to that of fully supervised methods and generalizes well to an external dataset. We also visualize the latent space learned by the VAE and demonstrate its ability to disentangle clean and noisy samples. Our approach offers a promising solution for cleaning ICU data in clinical research and practice without the need for any labels.

Keywords: Artifact detection; Autoencoders; Intensive Care Unit data; Isolation Forest; Time series; Unsupervised learning.
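
The two-stage idea in the abstract (learn features from the signal, then flag anomalies with an isolation forest) can be illustrated with a minimal sketch. This is not the authors' implementation: PCA stands in for the VAE encoder purely to keep the example short, the signal is synthetic, and the window width, the number of features, and the per-point averaging of overlapping window scores are all assumptions made for illustration. The helper names `sliding_windows` and `artifact_scores` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest


def sliding_windows(signal, width):
    """Return all overlapping windows of `width` consecutive samples."""
    return np.lib.stride_tricks.sliding_window_view(signal, width)


def artifact_scores(signal, width=10, n_features=3, seed=0):
    """Score each time point; a higher score means more anomalous."""
    windows = sliding_windows(signal, width)

    # Stand-in feature extractor (assumption): PCA replaces the VAE encoder.
    encoder = PCA(n_components=n_features, random_state=seed)
    features = encoder.fit_transform(windows)

    # Unsupervised anomaly detector fitted on the features; no labels are used.
    forest = IsolationForest(n_estimators=100, random_state=seed)
    forest.fit(features)

    # score_samples is higher for inliers, so negate it.
    window_scores = -forest.score_samples(features)

    # Assumed aggregation: average the scores of every window covering a point.
    total = np.zeros(len(signal))
    count = np.zeros(len(signal))
    for start, score in enumerate(window_scores):
        total[start:start + width] += score
        count[start:start + width] += 1
    return total / count


# Synthetic minute-by-minute signal with a short dropout as the artifact.
rng = np.random.default_rng(0)
minutes = np.arange(600)
signal = 80 + 5 * np.sin(2 * np.pi * minutes / 120) + rng.normal(0, 1, minutes.size)
signal[300:305] = 0.0  # injected artifact (hypothetical sensor dropout)

scores = artifact_scores(signal)
```

In this sketch the time points inside the injected dropout should receive higher scores than typical clean points. Turning scores into artifact flags requires a threshold, which the abstract does not specify and which is left open here.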