Objective: Recent advances in mHealth-enabled ECG recorders have boosted the demand for algorithms that can automatically detect cardiac anomalies with high accuracy.
Approach: We present a method combining classical signal analysis and machine learning, which was developed during the Computing in Cardiology Challenge (CinC) 2017. Almost 400 hand-crafted features were developed to reflect the complex physiology of cardiac arrhythmias and their appearance in single-channel ECG recordings. For this article, we performed several experiments on the publicly available challenge dataset to improve the classification accuracy. We compared the performance of two tree-based algorithms, gradient boosted trees and random forests, using different learning parameters. We assessed the influence of five different sets of training annotations on the classifiers' performance. Further, we present a new web-based ECG viewer for reviewing and correcting the training labels of a signal data set. Moreover, we analysed the feature importance and evaluated the model performance when using only a subset of the features. The primary data source used in the analysis was the CinC 2017 dataset, consisting of 8528 signals from four classes. Our best results were achieved using a gradient boosted tree model, which performed significantly better than random forests.
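The comparison of the two tree-based algorithms can be illustrated with a minimal sketch. It assumes scikit-learn and uses a small synthetic four-class feature matrix as a stand-in; it uses neither the challenge data nor the hand-crafted features, and the hyperparameters and the macro-averaged F1 scoring are illustrative choices, not the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a feature matrix with four classes
# (assumption: NOT the CinC 2017 data or the paper's features).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8,
    n_classes=4, random_state=0,
)

# Illustrative hyperparameters, not those tuned in the paper.
models = {
    "gradient_boosted_trees": GradientBoostingClassifier(
        n_estimators=50, random_state=0
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=100, random_state=0
    ),
}

# Mean macro-averaged F1 over 3-fold cross-validation for each model.
scores = {
    name: cross_val_score(model, X, y, cv=3, scoring="f1_macro").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean macro F1 = {score:.3f}")
```

On synthetic data the ranking of the two models carries no meaning; the sketch only shows how such a comparison can be set up under identical folds and scoring.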
Main results: Official results of the challenge follow-up phase, provided by the Challenge organizers on the full hidden test set, are 90.8% (Normal), 84.1% (AF) and 74.5% (Other), resulting in a mean F1-score of 83.2%, which was only 1.6% behind the challenge winner and 0.2% ahead of the next-best algorithm. Official results were rounded to two decimal places, which led to an equal second-best F1-score of 83%, shared with five others.
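As a minimal sketch of how such a mean F1-score can be computed, the following assumes, as the three reported per-class values suggest, that the mean is taken over the Normal, AF and Other classes only. The label codes ("N", "A", "O", "~") and the toy predictions are illustrative and do not reproduce the official scoring code or any challenge data.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class: 2 * TP / (number of true + number of predicted)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    n_true = sum(t == label for t in y_true)
    n_pred = sum(p == label for p in y_pred)
    denom = n_true + n_pred
    return 2 * tp / denom if denom else 0.0


def mean_f1(y_true, y_pred, labels=("N", "A", "O")):
    """Unweighted mean of the per-class F1 values over the given labels."""
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)


# Toy reference labels and predictions (hypothetical, for illustration only).
y_true = ["N", "N", "N", "A", "A", "O", "O", "~"]
y_pred = ["N", "N", "O", "A", "O", "O", "O", "~"]

score = mean_f1(y_true, y_pred)
print(f"mean F1 = {score:.3f}")
```

Because the mean is unweighted, a class with few recordings influences the final score as much as the largest class.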
Significance: The algorithm achieved the second-best score among 80 algorithms in the Challenge follow-up phase, tied with five others.