Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data

Sensors (Basel). 2021 Apr 13;21(8):2726. doi: 10.3390/s21082726.

Abstract

Accelerometers are increasingly being used in biomedical research, but the analysis of accelerometry data is often complicated by both the massive size of the datasets and the collection of unwanted data from the process of delivery to study participants. Current methods for removing delivery data involve arduous manual review of dense datasets. We aimed to develop models that classify each day of accelerometry data as activity from either human wear or the delivery process. These models can be used to automate the cleaning of accelerometry datasets that are adulterated with activity from delivery. We developed statistical and machine learning models for the classification of accelerometry data in a supervised learning context, using a large accelerometry dataset labeled for human activity and delivery. Model performances were assessed and compared using Monte Carlo cross-validation. We found that a hybrid convolutional recurrent neural network performed best in the classification task, with an F1 score of 0.960, but simpler models such as logistic regression and random forest also had excellent performance, with F1 scores of 0.951 and 0.957, respectively. The best performing models and related data processing techniques are made publicly available in the R package PhysicalActivity.
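The evaluation scheme named in the abstract, Monte Carlo cross-validation scored by F1, can be illustrated with a minimal Python sketch. This is not the authors' pipeline or the PhysicalActivity package API: the single per-day feature, the synthetic data, the midpoint-threshold classifier, and all function names are hypothetical stand-ins chosen only to show repeated random train/test splitting and F1 scoring.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(xs, ys):
    """Toy classifier (assumption): midpoint between the two class means."""
    wear = [x for x, y in zip(xs, ys) if y == 1]
    delivery = [x for x, y in zip(xs, ys) if y == 0]
    if not wear or not delivery:
        # Degenerate split with one class missing: fall back to overall mean.
        return sum(xs) / len(xs)
    return (sum(wear) / len(wear) + sum(delivery) / len(delivery)) / 2


def monte_carlo_cv(xs, ys, n_splits=100, test_fraction=0.2, seed=0):
    """Average test F1 over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(1, int(len(xs) * test_fraction))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)  # a fresh random partition on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        threshold = fit_threshold([xs[i] for i in train_idx],
                                  [ys[i] for i in train_idx])
        preds = [1 if xs[i] > threshold else 0 for i in test_idx]
        scores.append(f1_score([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


def make_toy_days(n_per_class=100, seed=1):
    """Synthetic days (assumption): wear days (1) score higher than delivery days (0)."""
    rng = random.Random(seed)
    xs = [rng.gauss(5.0, 1.0) for _ in range(n_per_class)]
    xs += [rng.gauss(1.0, 1.0) for _ in range(n_per_class)]
    ys = [1] * n_per_class + [0] * n_per_class
    return xs, ys


xs, ys = make_toy_days()
mean_f1 = monte_carlo_cv(xs, ys)
print(f"Mean F1 over random splits: {mean_f1:.3f}")
```

Unlike k-fold cross-validation, each repetition draws a new random split, so a given day may appear in the test set several times or not at all; averaging over many repetitions is what stabilizes the performance estimate.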

Keywords: accelerometry; machine learning; neural networks; physical activity; predictive modeling; statistical learning.

MeSH terms

  • Accelerometry*
  • Exercise
  • Humans
  • Logistic Models
  • Machine Learning*
  • Neural Networks, Computer