Background: Discriminating active tuberculosis (ATB) from latent tuberculosis infection (LTBI) remains challenging. The present study aimed to evaluate the performance of machine learning diagnostic models based on routine laboratory indicators in differentiating ATB from LTBI.
Methods: Participants were enrolled at Tongji Hospital (discovery cohort) and Sino-French New City Hospital (validation cohort). Diagnostic models were established from routine laboratory indicators using machine learning.
Results: A total of 2619 participants (1025 ATB and 1594 LTBI) were enrolled in the discovery cohort, and another 942 participants (388 ATB and 554 LTBI) were recruited in the validation cohort. ATB patients had a significantly higher tuberculosis-specific antigen/phytohemagglutinin ratio and coefficient of variation of red blood cell volume distribution width, and lower albumin levels and lymphocyte counts, than LTBI individuals. Six models were built, and the gradient boosting machine (GBM) model performed best. The GBM model derived from the training set (n = 1965) differentiated ATB from LTBI in the test set (n = 654) with a sensitivity of 84.38% (95% CI, 79.42%-88.31%) and a specificity of 92.71% (95% CI, 89.73%-94.88%). Validation in the independent cohort confirmed this performance, with a sensitivity of 87.63% (95% CI, 83.98%-90.54%) and a specificity of 91.34% (95% CI, 88.70%-93.40%).
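As an illustration of how sensitivity, specificity, and their 95% confidence intervals can be derived from cohort counts, the following is a minimal sketch for the validation cohort. The denominators (388 ATB, 554 LTBI) come from the abstract. The numerators (340 and 506) are back-calculated from the reported percentages and are an assumption, not figures stated in the source. The abstract does not name the interval method either. The Wilson score interval is assumed here, and the helper `wilson_interval` is a hypothetical name introduced for this example.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


# Validation cohort. Denominators are from the abstract; the numerators are
# back-calculated from the reported percentages (assumption, not source data).
true_positives, n_atb = 340, 388    # ATB patients classified as ATB
true_negatives, n_ltbi = 506, 554   # LTBI individuals classified as LTBI

sensitivity = true_positives / n_atb
specificity = true_negatives / n_ltbi
sens_ci = wilson_interval(true_positives, n_atb)
spec_ci = wilson_interval(true_negatives, n_ltbi)

print(f"Sensitivity: {sensitivity:.2%} (95% CI, {sens_ci[0]:.2%}-{sens_ci[1]:.2%})")
print(f"Specificity: {specificity:.2%} (95% CI, {spec_ci[0]:.2%}-{spec_ci[1]:.2%})")
```

Under these assumptions the computed values should land close to the figures reported above. An exact (Clopper-Pearson) interval would be somewhat wider, so the choice of method matters when comparing against published bounds.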
Conclusions: We successfully developed, for the first time, a machine learning-based model with promising diagnostic value. Our study suggests that the GBM model may be of great benefit as a tool for the accurate identification of ATB.
Keywords: Active tuberculosis; Diagnostic models; Latent tuberculosis infection; Machine learning; Routine laboratory indicators.
Copyright © 2022. Published by Elsevier Ltd.