Automated discovery of drug treatment patterns for endocrine therapy of breast cancer within an electronic medical record

J Am Med Inform Assoc. 2012 Jun;19(e1):e83-9. doi: 10.1136/amiajnl-2011-000295. Epub 2011 Dec 1.

Abstract

Objective: To develop an algorithm for the discovery of drug treatment patterns for endocrine breast cancer therapy within an electronic medical record and to test the hypothesis that the information it extracts is comparable to the information found by traditional methods (manual chart abstraction).

Materials: The electronic medical charts of 1507 patients diagnosed with histologically confirmed primary invasive breast cancer.

Methods: The automatic drug treatment classification tool consisted of components for: (1) extraction of drug treatment-relevant information from clinical narratives using natural language processing (clinical Text Analysis and Knowledge Extraction System); (2) extraction of drug treatment data from an electronic prescribing system; (3) merging information to create a patient treatment timeline; and (4) final classification logic.
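
Components (3) and (4) can be illustrated with a minimal sketch. The abstract does not give the actual merging or classification rules, so everything below is an assumption made for illustration: the `DrugEvent` record, the `merge_timeline` and `classify` functions, the source labels, and in particular the rule that maps undated mixed-drug histories to category (6). The category numbers follow the list given under Results.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

TAMOXIFEN = "tamoxifen"
AI = "aromatase_inhibitor"


@dataclass
class DrugEvent:
    """Hypothetical record for one drug mention or prescription."""
    drug_class: str        # TAMOXIFEN or AI
    when: Optional[date]   # None when no date could be resolved
    source: str            # "nlp" or "eprescribing" (illustrative labels)


def merge_timeline(nlp_events: List[DrugEvent],
                   rx_events: List[DrugEvent]) -> List[DrugEvent]:
    """Combine both sources; dated events in date order, undated events last."""
    events = list(nlp_events) + list(rx_events)
    return sorted(events, key=lambda e: (e.when is None, e.when or date.min))


def classify(timeline: List[DrugEvent]) -> int:
    """Assign a category 0-6 using assumed rules, not the paper's logic."""
    classes = {e.drug_class for e in timeline}
    if not classes:
        return 0                      # no tamoxifen or AI treatment
    if classes == {TAMOXIFEN}:
        return 1                      # tamoxifen only
    if classes == {AI}:
        return 2                      # AI only
    # Both drug classes are present, so ordering requires dates.
    if any(e.when is None for e in timeline):
        return 6                      # assumed meaning of "no specific dates"
    ordered = [e.drug_class for e in sorted(timeline, key=lambda e: e.when)]
    # Collapse consecutive repeats, e.g. T, T, AI, AI becomes T, AI.
    runs = [c for i, c in enumerate(ordered) if i == 0 or c != ordered[i - 1]]
    if runs == [TAMOXIFEN, AI]:
        return 3                      # tamoxifen before AI
    if runs == [AI, TAMOXIFEN]:
        return 4                      # AI before tamoxifen
    return 5                          # multiple cycles, no specific order
```

For example, a tamoxifen mention found in a clinical note dated 2005 merged with an AI prescription dated 2007 would be assigned category (3) under these assumed rules.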

Results: Agreement between results from the algorithm and from a nurse abstractor was measured for seven categories: (0) no tamoxifen or aromatase inhibitor (AI) treatment; (1) tamoxifen only; (2) AI only; (3) tamoxifen before AI; (4) AI before tamoxifen; (5) multiple AIs and tamoxifen cycles in no specific order; and (6) no specific treatment dates. Specificity (all categories): 96.14%-100%; sensitivity (categories (0)-(4)): 90.27%-99.83%; sensitivity (categories (5)-(6)): 0%-23.53%; positive predictive values: 80%-97.38%; negative predictive values: 96.91%-99.93%.
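
Per-category figures of this kind are typically obtained by treating each category in turn as the positive class and all other categories as negative. The sketch below shows that one-vs-rest calculation under that assumption; the abstract does not state how the study computed its values. The function name `one_vs_rest_metrics` is hypothetical, and the labels in the usage example are made-up toy data, not study results.

```python
from typing import Dict, List, Optional


def one_vs_rest_metrics(reference: List[int],
                        predicted: List[int],
                        category: int) -> Dict[str, Optional[float]]:
    """Compare algorithm labels with abstractor labels for one category."""
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, predicted):
        if pred == category and ref == category:
            tp += 1
        elif pred == category:
            fp += 1
        elif ref == category:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> Optional[float]:
        # None marks an undefined ratio (empty denominator).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy labels for five patients (illustrative only).
abstractor = [0, 1, 1, 2, 3]
algorithm = [0, 1, 2, 2, 3]
print(one_vs_rest_metrics(abstractor, algorithm, category=1))
# {'sensitivity': 0.5, 'specificity': 1.0, 'ppv': 1.0, 'npv': 0.75}
```

With rare categories such as (5) and (6), the true-positive and false-negative counts are small, so sensitivity can swing widely while specificity and negative predictive value stay high.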

Discussion: Our approach illustrates a secondary use of the electronic medical record. The main challenge is event temporality.

Conclusion: We present an algorithm for automated treatment classification within an electronic medical record; it combines information extracted through natural language processing with that extracted from structured databases. The algorithm has high specificity for all categories, high sensitivity for five categories, and low sensitivity for two categories.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Antineoplastic Agents, Hormonal / therapeutic use*
  • Antineoplastic Combined Chemotherapy Protocols / therapeutic use
  • Aromatase Inhibitors / therapeutic use
  • Breast Neoplasms / drug therapy*
  • Electronic Health Records*
  • Female
  • Humans
  • Information Storage and Retrieval / methods*
  • Natural Language Processing*
  • Sensitivity and Specificity
  • Tamoxifen / therapeutic use

Substances

  • Antineoplastic Agents, Hormonal
  • Aromatase Inhibitors
  • Tamoxifen