Introduction: The incidence of incidentally detected lung nodules is rapidly rising, but little is known about their management or associated patient outcomes. One barrier to studying lung nodule care is the inability to efficiently and reliably identify the cohort of interest (i.e. cases). Investigators at Kaiser Permanente Southern California (KPSC) recently developed an automated method to identify individuals with an incidentally discovered lung nodule, but the feasibility of implementing this method across other health systems is unknown.
Methods: Charts were reviewed for a random sample of Group Health (GH) members who had a computed tomography (CT) scan in 2012 to determine whether a lung nodule was documented in the radiology report. A previously developed natural language processing (NLP) algorithm was implemented at our site using only knowledge of the key words, qualifiers, exclusion terms, and the logic linking these parameters.
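To illustrate how key words, qualifiers, and excluding terms might be linked by simple logic, the following is a minimal Python sketch. It is not the KPSC algorithm: the term lists, the sentence-level rule, and the function names are all hypothetical, chosen only to show the general shape of a rule-based approach to radiology report text.

```python
import re

# Hypothetical term lists for illustration only; the actual algorithm's
# key words, qualifiers, and excluding terms are not reproduced here.
KEY_WORDS = ["nodule", "nodules"]
QUALIFIERS = ["lung", "pulmonary", "lobe"]
EXCLUDING_TERMS = ["no nodule", "no nodules", "no pulmonary nodule", "without nodule"]


def sentence_flags_nodule(sentence: str) -> bool:
    """Return True if one sentence suggests a lung nodule (assumed rule)."""
    text = sentence.lower()
    # An excluding term anywhere in the sentence overrides everything else.
    if any(term in text for term in EXCLUDING_TERMS):
        return False
    has_key = any(re.search(rf"\b{re.escape(k)}\b", text) for k in KEY_WORDS)
    has_qualifier = any(re.search(rf"\b{re.escape(q)}\b", text) for q in QUALIFIERS)
    # Assumed linking logic: a key word must co-occur with a qualifier.
    return has_key and has_qualifier


def report_flags_nodule(report: str) -> bool:
    """Flag a report if any of its sentences satisfies the rule."""
    sentences = re.split(r"(?<=[.!?])\s+", report)
    return any(sentence_flags_nodule(s) for s in sentences)
```

Under these assumed rules, a report containing "There is a 6 mm nodule in the right upper lobe." would be flagged, whereas "No pulmonary nodules are seen." or "Thyroid nodule noted." would not.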
Results: Among 499 subjects, 156 (31%, 95% confidence interval [CI] 27-36%) had an incidentally detected lung nodule. NLP identified 189 (38%, 95% CI 33-42%) individuals with a nodule. The accuracy of NLP at GH was similar to its accuracy at KPSC: at GH, sensitivity was 90% (95% CI 85-95%) and specificity was 86% (95% CI 82-89%); at KPSC, sensitivity was 96% (95% CI 88-100%) and specificity was 86% (95% CI 75-94%).
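As a worked illustration of how sensitivity, specificity, and a 95% CI can be computed from a 2x2 table, consider the Python sketch below. The abstract reports only the totals (499 subjects, 156 with a nodule on chart review, 189 flagged by NLP); the cell counts used here are back-calculated assumptions that are consistent with those totals and the rounded percentages, not reported data. The Wilson score interval is also an assumption, since the abstract does not state which interval method was used.

```python
import math


def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Assumed cell counts (not reported in the abstract), chosen so that
# tp + fn = 156 chart-review positives, tp + fp = 189 NLP positives,
# and all four cells sum to 499.
tp, fn, fp, tn = 140, 16, 49, 294

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_ci(tp, tp + fn)
spec_ci = wilson_ci(tn, tn + fp)
print(f"sensitivity {sens:.0%}, CI {sens_ci[0]:.0%}-{sens_ci[1]:.0%}")
print(f"specificity {spec:.0%}, CI {spec_ci[0]:.0%}-{spec_ci[1]:.0%}")
```

Because the interval method and exact cell counts are assumptions, the printed interval bounds may differ slightly from those reported above.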
Conclusion: Automated methods designed to identify individuals with an incidentally detected lung nodule can feasibly and independently be implemented across health systems. Use of these methods will likely facilitate the efficient conduct of multi-site studies evaluating practice patterns and associated outcomes.
Keywords: data collection; lung neoplasms; natural language processing; solitary pulmonary nodule.