CriteriaMapper: establishing the automatic identification of clinical trial cohorts from electronic health records by matching normalized eligibility criteria and patient clinical characteristics

Sci Rep. 2024 Oct 25;14(1):25387. doi: 10.1038/s41598-024-77447-x.

Abstract

The use of electronic health records (EHRs) holds the potential to enhance clinical trial activities. However, the identification of eligible patients within EHRs presents considerable challenges. We aimed to develop CriteriaMapper, a system for phenotyping eligibility criteria that enables the identification of patients from EHRs whose clinical characteristics match those criteria. We utilized clinical trial eligibility criteria and patient EHRs from the Mount Sinai Database. The CriteriaMapper system was developed to normalize the criteria using national standard terminologies and in-house databases, making them computable and queryable so as to bridge clinical trial criteria and EHRs. The system employed rule-based pattern recognition and manual annotation. Our system normalized 367 out of 640 unique eligibility criteria attributes, covering various medical conditions, including non-small cell lung cancer, small cell lung cancer, prostate cancer, breast cancer, multiple myeloma, ulcerative colitis, Crohn's disease, non-alcoholic steatohepatitis, and sickle cell anemia. Of these, 174 criteria were encoded with standard terminologies and 193 were normalized using the in-house reference tables. The agreement between automated and manual normalization was high (Cohen's Kappa = 0.82), and patient matching achieved an F1 score of 0.94. Our system has proven effective on EHRs from multiple institutions, showing broad applicability and promising improved clinical trial processes, better patient selection, and enhanced clinical research outcomes.
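The general idea of normalizing free-text criteria with rule-based patterns and then matching them against patient records can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the regular expressions, the attribute names (`age`, `ecog`), the `Criterion` structure, and the dictionary-shaped patient records are hypothetical and are not taken from the CriteriaMapper system, whose actual rules and reference tables the abstract does not describe. The `f1_score` helper shows how an F1 score for patient matching could be computed from predicted and reference cohorts.

```python
import re
from dataclasses import dataclass

# Hypothetical rule table: (pattern, normalized attribute name, operator).
# These rules are illustrative only and are not the paper's rules.
RULES = [
    (re.compile(r"age\s*>=\s*(\d+)", re.IGNORECASE), "age", ">="),
    (re.compile(r"ecog\D*<=\s*(\d+)", re.IGNORECASE), "ecog", "<="),
]


@dataclass(frozen=True)
class Criterion:
    attribute: str  # normalized attribute name (assumed vocabulary)
    op: str         # ">=" or "<="
    value: float


def normalize(text):
    """Return a Criterion if any rule matches the free text, else None."""
    for pattern, attribute, op in RULES:
        match = pattern.search(text)
        if match:
            return Criterion(attribute, op, float(match.group(1)))
    return None


def matches(patient, criterion):
    """Check one normalized criterion against a patient record (a dict)."""
    value = patient.get(criterion.attribute)
    if value is None:
        return False  # missing data is treated as "not matched" here
    if criterion.op == ">=":
        return value >= criterion.value
    return value <= criterion.value


def eligible(patient, criteria):
    """A patient is eligible only if every criterion matches."""
    return all(matches(patient, c) for c in criteria)


def f1_score(predicted, reference):
    """F1 of a predicted patient set against a reference (manual) set."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


criteria = [
    normalize("Age >= 18 years"),
    normalize("ECOG performance status <= 1"),
]
patients = {
    "p1": {"age": 64, "ecog": 1},
    "p2": {"age": 17, "ecog": 0},
    "p3": {"age": 52, "ecog": 2},
}
cohort = {pid for pid, rec in patients.items() if eligible(rec, criteria)}
```

In this toy setup, treating missing values as non-matching is a deliberate simplification; a real system would need an explicit policy for unknown or unrecorded attributes.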

Keywords: Clinical trials; Cohort identification; Electronic healthcare records; Eligibility criteria attribute normalization; Eligibility criteria phenotyping.

MeSH terms

  • Clinical Trials as Topic*
  • Databases, Factual
  • Electronic Health Records*
  • Eligibility Determination / methods
  • Female
  • Humans
  • Male
  • Patient Selection