Prediction of persistent chronic cough in patients with chronic cough using machine learning

ERJ Open Res. 2023 Mar 27;9(2):00471-2022. doi: 10.1183/23120541.00471-2022. eCollection 2023 Mar.

Abstract

Introduction: The aim of this retrospective cohort study was to develop and validate prediction models for the risk of persistent chronic cough (PCC) in patients with chronic cough (CC).

Methods: Two retrospective cohorts of patients aged 18-85 years were identified for the years 2011-2016: a specialist cohort, which included CC patients diagnosed by specialists, and an event cohort, which comprised CC patients identified by at least three cough events. A cough event could be a cough diagnosis, a dispensing of cough medication or any indication of cough in clinical notes. PCC was defined as a CC diagnosis or any two (specialist cohort) or three (event cohort) cough events in year 2 and again in year 3 after the index date. Model training and validation were conducted using two machine-learning approaches and more than 400 features. Sensitivity analyses were also conducted.
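The PCC outcome definition above can be read as a labelling rule applied per patient. The following is a minimal sketch of that rule, not the authors' implementation. The function names, the input layout (lists of dates) and the window boundaries (year 2 taken as days 365 to 729 after the index date, year 3 as days 730 to 1094) are assumptions made for illustration; the abstract does not specify them.

```python
from datetime import date, timedelta

# Cough-event thresholds per cohort, as stated in the PCC definition.
EVENT_THRESHOLD = {"specialist": 2, "event": 3}


def meets_criterion(cough_events, cc_diagnoses, start, end, threshold):
    """True if the window [start, end) contains a CC diagnosis
    or at least `threshold` cough events."""
    has_diagnosis = any(start <= d < end for d in cc_diagnoses)
    n_events = sum(start <= d < end for d in cough_events)
    return has_diagnosis or n_events >= threshold


def is_pcc(index_date, cough_events, cc_diagnoses, cohort):
    """Label a patient as PCC if the criterion holds in year 2 AND in year 3."""
    threshold = EVENT_THRESHOLD[cohort]
    # Assumption: each follow-up year is a 365-day window after the index date.
    year2_start = index_date + timedelta(days=365)
    year3_start = index_date + timedelta(days=730)
    year3_end = index_date + timedelta(days=1095)
    in_year2 = meets_criterion(cough_events, cc_diagnoses,
                               year2_start, year3_start, threshold)
    in_year3 = meets_criterion(cough_events, cc_diagnoses,
                               year3_start, year3_end, threshold)
    return in_year2 and in_year3


# Hypothetical patient: two cough events in each of years 2 and 3.
index = date(2012, 1, 1)
events = [date(2013, 2, 1), date(2013, 6, 1), date(2014, 2, 1), date(2014, 6, 1)]
print(is_pcc(index, events, [], "specialist"))
print(is_pcc(index, events, [], "event"))
```

Under these assumptions the same event history qualifies as PCC in the specialist cohort (threshold of two events) but not in the event cohort (threshold of three), which illustrates why the two cohorts are labelled separately.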

Results: In total, 8581 and 52 010 patients met the eligibility criteria for the specialist and event cohorts, respectively (mean age 60.0 and 55.5 years). PCC developed in 38.2% of patients in the specialist cohort and 12.4% of patients in the event cohort. The utilisation-based models relied mainly on baseline healthcare utilisation associated with CC or respiratory diseases, while the diagnosis-based models incorporated traditional parameters including age, asthma, pulmonary fibrosis, obstructive pulmonary disease, gastro-oesophageal reflux, hypertension and bronchiectasis. All final models were parsimonious (five to seven predictors) and moderately accurate (area under the curve: 0.74-0.76 for utilisation-based models and 0.71 for diagnosis-based models).
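For readers less familiar with the accuracy metric reported above, the area under the curve can be interpreted as the probability that a randomly chosen patient who developed PCC received a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it from that pairwise definition. It is illustrative only: the function name and the toy scores are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison:
    the fraction of (positive, negative) pairs in which the positive
    case has the higher score, counting ties as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical predicted risks, 1 = developed PCC).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported range of 0.71-0.76 indicates moderate discrimination.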

Conclusions: Our risk prediction models may be used to identify patients at high risk of PCC at any stage of clinical testing/evaluation to facilitate decision making.