Characterization of patients with advanced chronic pancreatitis using natural language processing of radiology reports

PLoS One. 2020 Aug 19;15(8):e0236817. doi: 10.1371/journal.pone.0236817. eCollection 2020.

Abstract

Study aim: To develop and apply a natural language processing algorithm for characterization of patients diagnosed with chronic pancreatitis in a diverse integrated U.S. healthcare system.

Methods: Retrospective cohort study including patients initially diagnosed with chronic pancreatitis (CP) within a regional integrated healthcare system between January 1, 2006 and December 31, 2015. Imaging reports from these patients were extracted from the electronic medical record system and split into training, validation and implementation datasets. A natural language processing (NLP) algorithm was first developed on the training dataset to identify specific features (atrophy, calcification, pseudocyst, cyst and main duct dilatation) from free-text radiology reports. Its performance was then assessed on the validation dataset by comparison against manual chart review. The developed algorithm was then applied to the implementation dataset. We classified patients with calcification(s) or ≥2 radiographic features as advanced CP. We compared etiology, comorbid conditions, treatment parameters and survival between patients with advanced CP and other patients diagnosed during the study period.
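The classification rule stated in the Methods can be illustrated with a minimal sketch. It assumes the NLP step has already produced a set of feature labels for each patient. The function name `is_advanced_cp` and the label strings are hypothetical and are not taken from the study's code; treating cyst and pseudocyst as separately counted features is likewise an assumption.

```python
# Hypothetical labels mirroring the five features named in the Methods.
FEATURES = {"atrophy", "calcification", "pseudocyst", "cyst", "main_duct_dilatation"}


def is_advanced_cp(found):
    """Return True if calcification is present or at least two features are found."""
    present = set(found) & FEATURES  # ignore labels outside the five features
    return "calcification" in present or len(present) >= 2
```

Under this sketch, a patient with calcification alone, or with any two of the five features, would be labeled advanced CP; a patient with a single non-calcification feature would not.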

Results: A total of 6,346 patients were diagnosed with CP during the study period, with 58,085 radiology studies performed. For individual features, NLP yielded sensitivity from 88.7% to 95.3% and specificity from 98.2% to 100.0%. A total of 3,672 patients met cohort inclusion criteria: 1,330 (36.2%) had evidence of advanced CP. Patients with advanced CP had increased frequency of smoking (57.8% vs. 43.0%), diabetes (47.6% vs. 35.9%) and underweight body mass index (BMI) (6.6% vs. 3.6%), all p<0.001. Mortality from pancreatic cancer was higher in advanced CP (15.3/1,000 person-years vs. 2.8/1,000 person-years, p<0.001). Underweight BMI (HR 1.6, 95% CL 1.2, 2.1), smoking (HR 1.4, 95% CL 1.1, 1.7) and diabetes (HR 1.4, 95% CL 1.2, 1.6) were independent risk factors for mortality.
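The per-feature sensitivity and specificity reported above follow the standard definitions, with manual chart review as the reference. The sketch below shows the arithmetic only; the function name and the counts are hypothetical and are not data from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of chart-review positives the NLP found
    specificity = tn / (tn + fp)  # share of chart-review negatives the NLP rejected
    return sensitivity, specificity


# Hypothetical validation counts for a single feature (illustrative only).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=196, fp=4)
```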

Conclusion: Patients with advanced CP experienced increased disease-related complications and pancreatic cancer-related mortality. Excess all-cause mortality was driven primarily by potentially modifiable risk factors including malnutrition, smoking and diabetes.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Aged
  • Body Mass Index
  • Diabetes Complications / pathology
  • Female
  • Humans
  • Image Processing, Computer-Assisted
  • Kaplan-Meier Estimate
  • Male
  • Middle Aged
  • Natural Language Processing*
  • Pancreas / diagnostic imaging
  • Pancreatic Neoplasms / mortality
  • Pancreatitis, Chronic / diagnosis*
  • Pancreatitis, Chronic / diagnostic imaging
  • Pancreatitis, Chronic / mortality
  • Proportional Hazards Models
  • Retrospective Studies
  • Risk Factors
  • Sensitivity and Specificity
  • Smoking
  • Tomography, X-Ray Computed