Objective: To identify lifting actions in videos and count the number of lifts performed, using robust class prediction and a streamlined process suitable for reliable real-time monitoring of lifting tasks.
Background: Traditional methods for recognizing lifting actions often rely on deep learning classifiers applied to human motion data collected from wearable sensors. Despite their high performance, these methods can be difficult to implement on systems with limited hardware resources.
Method: The proposed method follows a five-stage process: (1) BlazePose, a real-time pose estimation model, detects key joints of the human body. (2) The detected joints are preprocessed with smoothing, centering, and scaling. (3) Kinematic features are extracted from the preprocessed joints. (4) Video frames are classified as lifting or nonlifting using rank-altered kinematic feature pairs. (5) A lift-counting algorithm counts the number of lifts from the sequence of class predictions.
Results: Nine rank-altered kinematic feature pairs are identified as key pairs. These pairs were used to construct an ensemble classifier, which achieved 0.89 or above across classification metrics (accuracy, precision, recall, and F1 score). The classifier also achieved a lifting-counting accuracy of 0.90 with a latency of 0.06 ms, at least 12.5 times faster than baseline classifiers.
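The ensemble of rank-altered feature pairs and the downstream lift counting can be illustrated with a short sketch. This follows the general top-scoring-pair idea (each pair votes based on which of its two features ranks higher, and the ensemble takes a majority vote); the specific features, pair indices, and voting rule below are assumptions, not the nine pairs identified in the study.

```python
# Hedged sketch: classification by rank-altered feature pairs
# (top-scoring-pair style) plus rising-edge lift counting.
# Feature values and pair indices are hypothetical.

def pair_votes_lifting(features, pair):
    """A pair (i, j) votes 'lifting' when feature i outranks feature j."""
    i, j = pair
    return features[i] > features[j]

def classify_frame(features, pairs):
    """Majority vote over all rank-altered pairs; True means 'lifting'."""
    votes = sum(pair_votes_lifting(features, p) for p in pairs)
    return votes * 2 > len(pairs)

def count_lifts(frame_labels):
    """Count one lift per rising edge in the per-frame lifting labels."""
    lifts, prev = 0, False
    for label in frame_labels:
        if label and not prev:
            lifts += 1
        prev = label
    return lifts

feats = [0.9, 0.1, 0.7, 0.3]      # hypothetical kinematic features
pairs = [(0, 1), (2, 3), (1, 0)]  # hypothetical rank-altered pairs
print(classify_frame(feats, pairs))                   # majority says lifting
print(count_lifts([False, True, True, False, True]))  # two rising edges
```

Because each pair only compares the relative order of two features, the per-frame decision reduces to a handful of comparisons, which is consistent with the low latency reported for resource-constrained deployment.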
Conclusion: This study demonstrates that computer vision-based kinematic features could be adopted to effectively and efficiently recognize lifting actions.
Application: The proposed method could be deployed on various platforms, including mobile devices and embedded systems, to monitor lifting tasks in real-time for the proactive prevention of work-related low-back injuries.
Keywords: computer vision; lifting counting; musculoskeletal disorder; top scoring pair; workplace safety.