Background: Compensatory movements frequently emerge during motor recovery after a stroke. Given their potential for unfavorable long-term effects, it is crucial to assess and document compensatory movements throughout rehabilitation. However, clinically applicable assessment tools are currently limited. Deep learning methods have shown promise for assessing movement quality and addressing this gap. A crucial prerequisite for developing an accurate measurement tool is ensuring reliability in assessing compensatory movements, which is essential for establishing a valid ground truth.
Objective: The study aimed to assess inter- and intra-rater reliability of occupational and physical therapists' visual assessment of compensatory movements based on video analysis.
Methods: Experienced therapists evaluated video-recorded performances of a standardized drinking task through an online labeling system. The task was performed by seven individuals with mild to moderate upper limb motor impairments after a stroke. The therapists rated compensatory movements in predetermined body segments and movement phases using a slider with a continuous scale ranging from 0 (no compensation) to 100 (maximum compensation). The collected data were analyzed using a generalized linear mixed-effects model with zero-inflated beta regression to estimate variance components. Intraclass correlation coefficients (ICCs) were calculated from these variance components to assess inter- and intra-rater reliability.
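The step from variance components to ICCs can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the three-component decomposition (patient, rater, residual), the class and function names, and the example values are assumptions made for illustration, and the variances are taken to be on the model's latent scale.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Hypothetical variance components from a mixed-effects model."""
    patient: float   # variance between rated patients (signal of interest)
    rater: float     # variance between raters (systematic rater offsets)
    residual: float  # remaining variance (interaction plus error)


def inter_rater_icc(vc: VarianceComponents) -> float:
    """Share of total variance attributable to patients.

    Assumed definition: agreement between different raters
    scoring the same patient.
    """
    total = vc.patient + vc.rater + vc.residual
    return vc.patient / total


def intra_rater_icc(vc: VarianceComponents) -> float:
    """Share of total variance that is stable within one rater.

    Assumed definition: repeated ratings by the same rater share
    both the patient effect and that rater's own offset.
    """
    total = vc.patient + vc.rater + vc.residual
    return (vc.patient + vc.rater) / total


# Illustrative values only, not results from the study.
example = VarianceComponents(patient=3.0, rater=0.5, residual=0.5)
print(inter_rater_icc(example), intra_rater_icc(example))
```

In a Bayesian fit, the same functions could be applied to each posterior draw of the variance components, which yields a posterior distribution for each ICC and hence a credible interval.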
Results: Twenty-two therapists participated in this study. Inter-rater reliability was good for the phases of reaching, drinking, and returning (ICC ≥ 0.75), and moderate for both phases of transporting. Intra-rater reliability was excellent for the drinking phase (ICC > 0.9) and moderate to good for the phases of reaching, transporting, and returning in our cohort. ICCs for smoothness and interjoint coordination were poor for both inter- and intra-rater reliability. The credible intervals for the ICCs were wide across all domains examined in this study.
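The verbal labels used for the ICCs can be sketched as a simple lookup. The cutoffs for "good" (ICC ≥ 0.75) and "excellent" (ICC > 0.9) follow the values reported above; the 0.5 boundary between "poor" and "moderate" is an assumption based on a commonly used convention, and the function names are hypothetical.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC in [0, 1] to a verbal reliability label."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:  # assumed cutoff, not stated in the abstract
        return "moderate"
    return "poor"


def interval_labels(lower: float, upper: float) -> tuple[str, str]:
    """Labels at both ends of a credible interval.

    Differing labels indicate that the interval spans more than
    one reliability category.
    """
    return interpret_icc(lower), interpret_icc(upper)
```

Applying the lookup to both ends of a credible interval shows why wide intervals matter: an interval running from 0.4 to 0.92, for example, would be compatible with anything from poor to excellent reliability.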
Conclusions: While this study shows promising inter- and intra-rater reliability for the drinking phases within our sample, the wide credible intervals raise the possibility that these results may have occurred by chance. Consequently, we cannot recommend the establishment of a ground truth for the automatic assessment of compensatory movements during a drinking task based on therapists' ratings alone.
Keywords: Compensatory movements; Ground truth; Inter-rater reliability; Intra-rater reliability; Machine learning; Stroke; Upper extremity.
© 2024. The Author(s).