Diagnostic performance of deep learning for leg length measurements on radiographs in leg length discrepancy: A systematic review

J Exp Orthop. 2024 Nov 10;11(4):e70080. doi: 10.1002/jeo2.70080. eCollection 2024 Oct.

Abstract

Purpose: To systematically review the literature on machine learning in leg length discrepancy (LLD) and to summarize the most relevant studies, highlighting the importance and future clinical implications of machine learning in the diagnosis and treatment of LLD.

Methods: A systematic electronic search of PubMed, Ovid/MEDLINE and the Cochrane Library was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Two observers independently screened the titles and abstracts of potential articles.

Results: A total of six studies were identified in the search. All measurements were calculated using standardized anteroposterior long-leg radiographs. Five (83.3%) of the studies used measurements of the femoral length, tibial length and leg length to assess LLD, whereas one (16.7%) study used the iliac crest height difference to quantify LLD. The deep learning models showed excellent reliability in predicting all length measurements, with intraclass correlation coefficients ranging from 0.98 to 1.0 and mean absolute error (MAE) values ranging from 0.11 to 0.45 cm. Three studies reported measurements of LLD, and the convolutional neural network model showed the lowest MAE of 0.13 cm in predicting LLD.
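
For readers less familiar with the two agreement metrics reported above, the following is a minimal sketch of how MAE and an intraclass correlation coefficient could be computed when model-derived lengths are compared with reference measurements. The function names and the sample values are hypothetical and do not come from any included study. The abstract does not state which ICC form the studies used, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is assumed here purely for illustration.

```python
from statistics import mean


def mean_absolute_error(predicted, reference):
    """Mean of |predicted - reference| over paired measurements (same unit, e.g. cm)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - r) for p, r in zip(predicted, reference))


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` holds one row per subject and one column per rater
    (for example, column 0 = model, column 1 = radiologist).
    Assumed form; the reviewed studies may have used a different ICC variant.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares for subjects, raters and residual error
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical leg lengths in cm (illustrative values only, not study data)
model_cm = [80.1, 75.4, 90.2, 85.0]
reference_cm = [80.0, 75.6, 90.0, 85.3]

mae = mean_absolute_error(model_cm, reference_cm)
icc = icc_2_1([list(pair) for pair in zip(model_cm, reference_cm)])
print(f"MAE = {mae:.2f} cm, ICC(2,1) = {icc:.4f}")
```

In this sketch, MAE expresses the typical size of the disagreement in centimetres, whereas the ICC expresses how much of the total variance is attributable to true differences between patients rather than to differences between the measurement methods. The two metrics therefore answer different questions and are usually reported together.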

Conclusions: Machine learning models are effective and efficient in determining LLD. Implementation of these models may reduce cost, improve efficiency and lead to better overall patient outcomes.

Clinical relevance: This review highlights the potential of deep learning (DL) algorithms for accurate and reliable measurement of lower limb length and leg length discrepancy (LLD) on long-leg radiographs. The reported mean absolute error and intraclass correlation coefficient values indicate that the performance of the DL models was comparable to that of radiologists, suggesting that DL-based assessments could potentially be used to automate the measurement of lower limb length and LLD in clinical practice.

Level of evidence: Level IV.

Keywords: artificial intelligence; deep learning; diagnostic imaging; machine learning; orthopaedics.