Patients with life-threatening arrhythmias are often treated with cardiac implantable electronic devices (CIEDs), such as pacemakers and implantable cardioverter defibrillators (ICDs). Recent advances in CIEDs have enabled sophisticated functionality and connectivity that make such devices (particularly ICDs) vulnerable to cyber-attacks. One of the most dangerous attacks on CIED ecosystems is a data manipulation attack in which a compromised programmer device sends malicious clinical programmings to the CIED. Such attacks can affect CIED functioning and impact patients' survival and quality of life. In this paper, we propose Cardio-ML, an automated system for the detection of malicious clinical programmings that is based on machine learning algorithms and a novel missing values resemblance framework. Our system is designed to detect new variants of existing attacks and, more importantly, previously unknown (zero-day) attacks aimed at ICDs. We collected 1651 legitimate clinical programmings from 514 patients, over a four-year period, from programmer devices at two medical centers. Our collection also includes 28 core malicious functionalities created by cardiac electrophysiology experts, which were later used to create different variants of malicious programmings. Cardio-ML was evaluated extensively in three comprehensive experiments and showed high detection capabilities in most attack scenarios. We achieved perfect classification results for detecting newly created variants of existing core malicious functionalities, with an AUC of 100%; for completely new unknown (zero-day) malicious clinical programmings, an AUC of 80% was obtained, which is 14% better than the state-of-the-art method. We further improved our detection results by identifying the best combination of legitimate and zero-day malicious programmings in the dataset, achieving an AUC of 87%.
In CIED clinical programmings, many parameters have no value for a large number of samples (programmings). To cope with the extreme number of missing values in our dataset, we developed a novel missing values-based resemblance framework and evaluated it using three dataset-creation approaches: a standard expert-driven approach, our novel data-driven approach, and a combined approach that incorporates both. The results showed that our novel framework handles missing values in the data better than the expert-driven approach, which yields an empty dataset. In particular, the combined approach showed a 40% improvement in data utilization compared to the data-driven approach.
Keywords: CIED; Cyber-attack; Detection; ICD; Machine learning; Malware; Missing values.
Copyright © 2021 Elsevier B.V. All rights reserved.