Exercise quantification from single camera view markerless 3D pose estimation

Heliyon. 2024 Mar 12;10(6):e27596. doi: 10.1016/j.heliyon.2024.e27596. eCollection 2024 Mar 30.

Abstract

Sports physiotherapists and coaches are tasked with evaluating the movement quality of athletes across the spectrum of ability and experience. However, the accuracy of visual observation is low, and existing technology outside of expensive lab-based solutions has seen limited adoption. This leaves an unmet need for an efficient and accurate means of measuring static and dynamic joint angles during movement and converting them to movement metrics usable by practitioners. This paper proposes a set of pose landmarks for computing frequently used joint angles as metrics of interest to sports physiotherapists and coaches in assessing common strength-building exercise movements. It then proposes a set of rules for computing these metrics for a range of common exercises (single and double drop jumps and counter-movement jumps, deadlifts and various squats) from anatomical key-points detected in video, and evaluates their accuracy using a published 3D human pose model trained with ground truth data derived from VICON motion capture of common rehabilitation exercises. Results show that the mathematically defined metrics can be derived from the chosen pose landmarks, and that these landmarks are sufficient to compute the metrics for each of the exercises under consideration. Comparison to ground truth data showed that root mean square angle errors were within 10° for all exercises for the following metrics: shin angle, knee varus/valgus and left/right flexion, hip flexion and pelvic tilt, trunk angle, lower/mid/upper spinal flexion and rib flare. Larger errors (though still all within 15°) were observed for shoulder flexion and ASIS asymmetry in some exercises, notably front squats and drop jumps. In conclusion, the contribution of this paper is a uniquely defined set of sufficient key-points and associated metrics for exercise assessment from 3D human pose. Further, we found generally very good accuracy of the Strided Transformer 3D pose model in predicting these metrics for the chosen set of exercises from a single mobile device camera, when trained on a suitable set of functional exercises recorded using a VICON motion capture system. Further assessment of generalization is needed.
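
As an illustration only, the following minimal Python sketch shows the general kind of computation the abstract refers to: an angle at a joint derived from three 3D key-points, and a root mean square error between predicted and reference angles. It is not the paper's metric definitions. The function names, the choice of key-points and the use of the included angle between two segments are assumptions made for this example.

```python
import numpy as np


def joint_angle_deg(a, b, c):
    """Included angle at key-point b between segments b->a and b->c, in degrees.

    Assumption for illustration: a, b, c are 3D key-points such as
    hip, knee and ankle. The paper's own rules may differ.
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    u, v = a - b, c - b
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


def rmse_deg(predicted, reference):
    """Root mean square error between two equal-length angle series, in degrees."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


# Hypothetical key-points: thigh pointing up, shank pointing forward.
hip, knee, ankle = [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
included = joint_angle_deg(hip, knee, ankle)
# One common convention (assumed here): flexion = 180 degrees minus the included angle.
flexion = 180.0 - included
```

In a comparison of the kind described, `rmse_deg` would be applied per metric and per exercise to the angle series from the pose model and from motion capture.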

Keywords: Computer vision; Injury biomechanics; Markerless; Motion capture; Pose estimation; Sports biomechanics.