Derivation and validation of a machine learning record linkage algorithm between emergency medical services and the emergency department

J Am Med Inform Assoc. 2020 Jan 1;27(1):147-153. doi: 10.1093/jamia/ocz176.

Abstract

Objective: Linking emergency medical services (EMS) electronic patient care reports (ePCRs) to emergency department (ED) records can provide clinicians access to vital information that can alter management. It can also create rich databases for research and quality improvement. Unfortunately, previous attempts at ePCR and ED record linkage have had limited success. In this study, we use supervised machine learning to derive and validate an automated record linkage algorithm between EMS ePCRs and ED records.

Materials and methods: All consecutive ePCRs from a single EMS provider between June 2013 and June 2015 were included. A primary reviewer matched ePCRs to a list of ED patients to create a gold standard. Age, gender, last name, first name, social security number, and date of birth were extracted. Data were randomly split into 80% training and 20% test datasets. From the extracted fields we derived missing indicators, identical indicators, edit distances, and percent differences. A multivariate logistic regression model was trained with 5-fold cross-validation (label k-fold), L2 regularization, and class reweighting.
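As an illustration of the four derived feature types named above, the following minimal Python sketch computes them for one candidate ePCR/ED record pair. It is not the authors' implementation. The field names (`last_name`, `first_name`, `ssn`, `dob`, `gender`, `age`), the lowercasing step, the choice of Levenshtein distance as the edit distance, the symmetric percent-difference formula, and the use of 0 as a placeholder when a field is missing are all assumptions made for this example.

```python
from typing import Optional


def _clean(value: Optional[str]) -> Optional[str]:
    """Normalize a text field; treat None or blank strings as missing (assumption)."""
    if value is None or value.strip() == "":
        return None
    return value.strip().lower()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete from a
                            curr[j - 1] + 1,       # insert into a
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def percent_difference(x: float, y: float) -> float:
    """Absolute difference relative to the mean of the two values (one common definition)."""
    mean = (x + y) / 2
    return 0.0 if mean == 0 else abs(x - y) / mean * 100


def pair_features(ems: dict, ed: dict) -> dict:
    """Derive features for one candidate (ePCR, ED record) pair."""
    feats = {}
    for field in ("last_name", "first_name", "ssn", "dob", "gender"):
        a, b = _clean(ems.get(field)), _clean(ed.get(field))
        missing = a is None or b is None
        feats[f"{field}_missing"] = int(missing)
        feats[f"{field}_identical"] = int(not missing and a == b)
        if field != "gender":
            # Placeholder 0 when missing; the missing indicator carries that signal.
            feats[f"{field}_edit_distance"] = 0 if missing else edit_distance(a, b)
    age_a, age_b = ems.get("age"), ed.get("age")
    age_missing = age_a is None or age_b is None
    feats["age_missing"] = int(age_missing)
    feats["age_pct_diff"] = 0.0 if age_missing else percent_difference(age_a, age_b)
    return feats


# Hypothetical records, invented for illustration only.
ems_record = {"last_name": "Smith", "first_name": "Jon", "ssn": None,
              "dob": "1980-01-02", "gender": "M", "age": 40}
ed_record = {"last_name": "Smyth", "first_name": "John", "ssn": "000000000",
             "dob": "1980-01-02", "gender": "M", "age": 40}
features = pair_features(ems_record, ed_record)
```

Each candidate pair becomes one row of numeric features, which could then serve as input to a logistic regression model of the kind the abstract describes.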

Results: A total of 14 032 ePCRs were included in the study. Interrater reliability between the primary and secondary reviewers yielded a kappa of 0.9. The algorithm had a sensitivity of 99.4%, a positive predictive value of 99.9%, and an area under the receiver-operating characteristic curve of 0.99 in both the training and test datasets. Date-of-birth match had the highest odds ratio (16.9), followed by last name match (10.6). Social security number match had an odds ratio of 3.8.
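For readers less familiar with these metrics, the short Python sketch below shows how sensitivity and positive predictive value follow from confusion-matrix counts, and how an odds ratio relates to a logistic regression coefficient (odds ratio = exp(coefficient)). The counts are hypothetical and are not the study's data. The three odds ratios are the ones reported above. How the authors scaled their features is not stated, so the back-calculated coefficients are only indicative.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true matches that the algorithm linked: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of links made by the algorithm that are true matches: TP / (TP + FP)."""
    return tp / (tp + fp)


def odds_ratio(coefficient: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


def coefficient(odds_ratio_value: float) -> float:
    """Logistic regression coefficient implied by an odds ratio (natural log)."""
    return math.log(odds_ratio_value)


# Hypothetical counts, chosen only to illustrate the arithmetic.
tp, fp, fn = 90, 2, 10
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)

# Odds ratios as reported in the abstract, converted to log-odds coefficients.
reported = {"dob_match": 16.9, "last_name_match": 10.6, "ssn_match": 3.8}
implied_coefficients = {name: coefficient(value) for name, value in reported.items()}
```

On the log-odds scale the ordering of the three predictors is unchanged, since the natural logarithm is monotonic.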

Conclusions: We successfully derived and validated a record linkage algorithm from a single EMS ePCR provider to our hospital electronic medical record (EMR).

Keywords: clinical informatics; electronic patient care records; emergency medical services; machine learning; patient matching; prehospital care; record linkage.

Publication types

  • Validation Study

MeSH terms

  • Algorithms
  • Emergency Medical Services*
  • Emergency Service, Hospital*
  • Female
  • Humans
  • Logistic Models
  • Male
  • Medical Record Linkage / methods*
  • Retrospective Studies
  • Supervised Machine Learning*