Enhancing ICD-Code-Based Case Definition for Heart Failure Using Electronic Medical Record Data

Yuan Xu; Seungwon Lee; Elliot Martin; Adam G D'souza; Chelsea T A Doktorchik; Jason Jiang; Sangmin Lee; Cathy A Eastwood; Nowell Fine; Brenda Hemmelgarn; Kathryn Todd; Hude Quan

doi:10.1016/j.cardfail.2020.04.003

J Card Fail. 2020 Jul;26(7):610-617. doi: 10.1016/j.cardfail.2020.04.003. Epub 2020 Apr 15.

Affiliations

¹ Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Department of Surgery, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada.
² Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Alberta Health Services, Calgary, Alberta, Canada.
³ Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Alberta Health Services, Calgary, Alberta, Canada.
⁴ Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada; Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada.
⁵ Division of Cardiology, Department of Cardiac Sciences, Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada.
⁶ Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada.
⁷ Alberta Health Services, Calgary, Alberta, Canada; Neurochemical Research Unit, Department of Psychiatry, University of Alberta, Edmonton, Alberta, Canada.

PMID: 32304875

Abstract

Background: Surveillance and outcome studies for heart failure (HF) require accurate identification of patients with HF. Algorithms based on International Classification of Diseases (ICD) codes to identify HF from administrative data are inadequate owing to their relatively low sensitivity. Detailed clinical information from electronic medical records (EMRs) is potentially useful for improving ICD algorithms. This study aimed to enhance the ICD algorithm for HF definition by incorporating comprehensive information from EMRs.

Methods: The study included 2106 inpatients in Calgary, Alberta, Canada. Medical chart review was the gold standard reference for evaluating the developed algorithms. The commonly used ICD codes for defining HF served as the baseline (the ICD algorithm). The performance of several algorithms that drew on a population-based EMR, including its free-text discharge summaries, was compared with that of the ICD algorithm. These algorithms included a keyword search algorithm looking for HF-specific terms, a machine learning-based HF concept (HFC) algorithm, an algorithm based on structured EMR data, and combined algorithms (such as the ICD and HFC combined algorithm).
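As a rough illustration of the simpler approaches named above, the following Python sketch flags a discharge summary by keyword search, flags a record by ICD code, and combines two flags. It is not the study's implementation: the term list, the I50 code prefix, the OR rule for combining algorithms, and all function names are assumptions, since the abstract does not specify any of them.

```python
import re
from typing import Iterable

# Hypothetical term list; the abstract does not enumerate the HF-specific terms used.
HF_TERMS = ["heart failure", "cardiac failure", "chf"]
_HF_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in HF_TERMS) + r")\b",
    re.IGNORECASE,
)


def keyword_flag(summary: str) -> bool:
    """Return True if the free-text summary mentions any HF term.

    Naive matching only: negated mentions such as "no heart failure"
    are still flagged.
    """
    return _HF_PATTERN.search(summary) is not None


def icd_flag(codes: Iterable[str]) -> bool:
    """Return True if any code falls under ICD-10 I50 (heart failure).

    Assumption: the study's actual code list may be broader or different.
    """
    return any(code.strip().upper().startswith("I50") for code in codes)


def combined_flag(first: bool, second: bool) -> bool:
    """Assumed combination rule: a case is positive if either algorithm flags it."""
    return first or second


if __name__ == "__main__":
    summary = "Admitted with decompensated heart failure; diuresed with furosemide."
    codes = ["E11.9", "I10"]
    print(keyword_flag(summary), icd_flag(codes))
    print(combined_flag(keyword_flag(summary), icd_flag(codes)))
```

An OR rule can only add positives, so it raises sensitivity and can lower PPV relative to either input algorithm alone.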

Results: Of 2106 patients, 296 (14.1%) were patients with HF as determined by chart review. The ICD algorithm had 92.4% positive predictive value (PPV) but low sensitivity (57.4%). The EMR keyword search algorithm achieved a higher sensitivity (65.5%) than the ICD algorithm, but with a lower PPV (77.6%). The HFC algorithm achieved a better sensitivity (80.0%) and maintained a reasonable PPV (88.9%) compared with the ICD algorithm and the keyword algorithm. An even higher sensitivity (83.3%) was reached by combining the HFC and ICD algorithms, with a lower PPV (83.3%). The structured EMR data algorithm reached a sensitivity of 78% and a PPV of 54.2%. The combined EMR structured data and ICD algorithm had a higher sensitivity (82.4%), but the PPV remained low at 54.8%. All algorithms had a specificity ranging from 87.5% to 99.2%.
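For readers less familiar with these measures, the Python sketch below computes sensitivity, PPV, and specificity from a 2x2 confusion matrix. The cell counts are not reported in the abstract; they are illustrative values back-calculated from the reported totals (2106 patients, 296 with HF) and the ICD algorithm's reported sensitivity and PPV, so treat them as an assumption.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy measures against a reference standard (here, chart review)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true HF cases the algorithm finds
        "ppv": tp / (tp + fp),          # share of flagged cases that are true HF
        "specificity": tn / (tn + fp),  # share of non-HF patients correctly not flagged
    }


# Assumed counts (not from the paper): 296 HF cases = 170 found + 126 missed,
# and 1810 non-HF patients = 14 falsely flagged + 1796 correctly not flagged.
icd_metrics = diagnostic_metrics(tp=170, fp=14, fn=126, tn=1796)

for name, value in icd_metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 57.4%
# ppv: 92.4%
# specificity: 99.2%
```

With only 14.1% of patients having HF, specificity stays high even when an algorithm misses many cases, which is why sensitivity and PPV are the more informative measures here.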

Conclusions: Applying natural language processing and machine learning to the discharge summaries in inpatient EMR data can improve the capture of HF cases compared with the widely used ICD algorithm. The HFC algorithm is straightforward to use, making it easy to apply for HF case identification.

Keywords: Electronic medical record; case definition; heart failure; machine learning; natural language processing.

MeSH terms

Algorithms
Electronic Health Records
Heart Failure* / diagnosis
Heart Failure* / epidemiology
Heart Failure* / therapy
Humans
International Classification of Diseases*
Natural Language Processing