Not all phenotypes are created equal: covariates of success in e-phenotype specification

J Am Med Inform Assoc. 2023 Jan 18;30(2):213-221. doi: 10.1093/jamia/ocac157.

Abstract

Background: Electronic (e)-phenotype specification by noninformaticist investigators remains a challenge. Although validating each patient returned by an e-phenotype could ensure that the cohort is accurately represented, this approach is not practical. Understanding the factors that lead to successful e-phenotype specification may reveal generalizable strategies for obtaining better results.

Materials and methods: Noninformaticist experts (n = 21) were recruited to produce expert-mediated e-phenotypes using i2b2, assisted by an honest data broker and a project coordinator. Patient and visit sets were reidentified, and a random sample of 20 charts matching each e-phenotype was returned to the experts for chart validation. Attributes of the queries and expert characteristics were captured and related to chart-validation rates using generalized linear regression models.

Results: E-phenotype validation rates varied according to experts' domains and query characteristics (mean = 61%, range 20-100%). Clinical domains that performed better included infectious, rheumatic, and neonatal conditions and cancers, whereas other domains performed worse (psychiatric, GI, skin, and pulmonary). Match rate was negatively affected when temporal constraints had to be specified. In general, greater e-phenotype specificity contributed positively to match rate.

Discussions and conclusions: Clinical experts and informaticists experience a variety of challenges when building e-phenotypes, including the inability to differentiate clinical events from patient characteristics or to configure temporal constraints appropriately; a lack of access to available, high-quality data; and difficulty in specifying routes of medication administration. The importance of biomedical query mediation by informaticists and honest data brokers in designing e-phenotypes cannot be overstated. Although tools such as i2b2 may be widely available to noninformaticists, successful utilization depends not on users' confidence but on creating highly specific e-phenotypes.

Keywords: electronic health record; electronic phenotyping; phenotyped data; translational research services; validation.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Electronic Health Records
  • Mental Processes*
  • Phenotype
  • Research Design*