Single-reviewer electronic phenotyping validation in operational settings: Comparison of strategies and recommendations

J Biomed Inform. 2017 Feb:66:1-10. doi: 10.1016/j.jbi.2016.12.004. Epub 2016 Dec 9.

Abstract

Objective: Develop evidence-based recommendations for single-reviewer validation of electronic phenotyping results in operational settings.

Material and methods: We conducted a randomized controlled study to evaluate whether electronic phenotyping results should be used to support manual chart review during single-reviewer electronic phenotyping validation (N=3104). We evaluated the accuracy, duration and cost of manual chart review with and without the availability of electronic phenotyping results, including relevant patient-specific details. The cost of identification of an erroneous electronic phenotyping result was calculated based on the personnel time required for the initial chart review and subsequent adjudication of discrepancies between manual chart review results and electronic phenotype determinations.

Results: Providing electronic phenotyping results (vs not providing those results) was associated with improved overall accuracy of manual chart review (98.90% vs 92.46%, p<0.001), shorter review duration per test case (62.43 s vs 76.78 s, p<0.001), and a statistically non-significant reduction in the estimated marginal cost of identifying an erroneous electronic phenotyping result ($48.54 vs $63.56, p=0.16). The agreement between chart review and electronic phenotyping results was higher when the phenotyping results were provided (Cohen's kappa 0.98 vs 0.88, p<0.001). As a result, while accuracy improved when initial electronic phenotyping results were correct (99.74% vs 92.67%, N=3049, p<0.001), there was a trend towards decreased accuracy when initial electronic phenotyping results were erroneous (56.67% vs 80.00%, N=55, p=0.07). Electronic phenotyping results provided the greatest benefit for the accurate identification of rare exclusion criteria.
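For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed for two binary raters (here, chart review and the electronic phenotype). The function name, argument names, and the counts in the usage example are hypothetical illustrations; they are not the study's data or the authors' code.

```python
def cohens_kappa(both_pos, chart_pos_only, pheno_pos_only, both_neg):
    """Cohen's kappa for a 2x2 agreement table between two binary raters.

    Arguments are cell counts (hypothetical naming):
      both_pos        - chart review and phenotype both positive
      chart_pos_only  - chart review positive, phenotype negative
      pheno_pos_only  - phenotype positive, chart review negative
      both_neg        - both negative
    """
    n = both_pos + chart_pos_only + pheno_pos_only + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")

    # Observed agreement: share of cases where the two raters match.
    p_observed = (both_pos + both_neg) / n

    # Agreement expected by chance, from each rater's marginal positive rate.
    chart_pos_rate = (both_pos + chart_pos_only) / n
    pheno_pos_rate = (both_pos + pheno_pos_only) / n
    p_expected = (chart_pos_rate * pheno_pos_rate
                  + (1 - chart_pos_rate) * (1 - pheno_pos_rate))

    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Made-up counts for illustration only (not from the study).
    print(round(cohens_kappa(45, 5, 5, 45), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can differ noticeably from simple percent agreement when one category is rare.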

Discussion: Single-reviewer chart review of electronic phenotyping can be conducted more accurately, quickly, and at lower cost when supported by electronic phenotyping results. However, human reviewers tend to agree with electronic phenotyping results even when those results are wrong. Thus, the value of providing electronic phenotyping results depends on the accuracy of the underlying electronic phenotyping algorithm.
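The dependence on algorithm accuracy can be illustrated with a small back-of-the-envelope sketch. It treats the subgroup accuracies reported in the Results as fixed point estimates and asks at what phenotyping error rate the two review strategies would yield equal expected reviewer accuracy. This is an assumption-laden illustration, not an analysis from the study: the erroneous-result subgroup is small (N=55, p=0.07), so the resulting threshold is highly uncertain, and all function and variable names are hypothetical.

```python
# Reviewer accuracy as (when phenotype is correct, when phenotype is wrong),
# taken from the point estimates reported in the Results.
WITH_RESULTS = (0.9974, 0.5667)
WITHOUT_RESULTS = (0.9267, 0.8000)


def expected_accuracy(error_rate, acc_when_correct, acc_when_wrong):
    """Expected reviewer accuracy for a given phenotyping error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return (1 - error_rate) * acc_when_correct + error_rate * acc_when_wrong


def break_even_error_rate(with_results, without_results):
    """Error rate at which both strategies have equal expected accuracy.

    Solves (1 - e) * a_c + e * a_w = (1 - e) * b_c + e * b_w for e.
    """
    gain = with_results[0] - without_results[0]  # benefit when phenotype is correct
    loss = without_results[1] - with_results[1]  # penalty when phenotype is wrong
    if gain + loss == 0:
        raise ValueError("strategies never cross: accuracies differ by a constant")
    return gain / (gain + loss)


if __name__ == "__main__":
    threshold = break_even_error_rate(WITH_RESULTS, WITHOUT_RESULTS)
    print(f"break-even phenotyping error rate: {threshold:.1%}")
    # Error rate observed in the study sample: 55 erroneous results out of 3104.
    observed = 55 / 3104
    print(f"with results:    {expected_accuracy(observed, *WITH_RESULTS):.4f}")
    print(f"without results: {expected_accuracy(observed, *WITHOUT_RESULTS):.4f}")
```

Under these assumptions the break-even point is roughly a 23% phenotyping error rate; below it, showing results to the reviewer yields higher expected accuracy, and above it the anchoring effect dominates. The sketch considers accuracy only and ignores review time, cost, and the tolerance for missed errors.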

Conclusion: We recommend using a mix of phenotyping validation strategies, with the balance of strategies based on the anticipated electronic phenotyping error rate, the tolerance for missed electronic phenotyping errors, as well as the expertise, cost, and availability of personnel involved in chart review and discrepancy adjudication.

Keywords: Computable phenotype; Electronic clinical quality measurement; Electronic phenotyping; Human chart review; Manual chart review; Quality measure; Validation.

MeSH terms

  • Algorithms*
  • Electronic Health Records*
  • Humans
  • Phenotype*