Objective: Identify clinical factors that modulate the risk of progression to COPD among asthma patients using data extracted from electronic medical records.
Design: Demographic information and comorbidities were extracted from electronic medical records of the Partners Healthcare System for adult asthma patients who were observed for at least 5 years and had initial observation dates between 1988 and 1998. Extraction used tools of the National Center for Biomedical Computing "Informatics for Integrating Biology to the Bedside" (i2b2).
Measurements: A predictive model of COPD was constructed from a set of 9,349 patients (843 cases, 8,506 controls) using Bayesian networks. The model's predictive accuracy was tested by using it to predict COPD in an independent, later set of asthma patients (992 patients; 46 cases, 946 controls) who had initial observation dates between 1999 and 2002.
Results: A Bayesian network model composed of age, sex, race, smoking history, and 8 comorbidity variables predicted COPD in the independent set of patients with an accuracy of 83.3%, measured as the area under the receiver operating characteristic curve (AUROC).
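For readers unfamiliar with the metric, the AUROC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are hypothetical and serve only as illustration.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (case, control) pairs in which the
    case has the higher score; tied pairs count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (not from the study): 1 = COPD case, 0 = control.
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
toy_auc = auroc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of patients, which is fine for a test set of about a thousand patients; a rank-based formulation gives the same value more efficiently on larger sets.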
Conclusions: Our results demonstrate that data extracted from electronic medical records can be used to create predictive models. With improvements in data extraction and inclusion of more variables, such models may prove to be clinically useful.