The Impact of "Possible Patients" on Phenotyping Algorithms: Electronic Phenotype Algorithms Can Only Be Reproduced by Sharing Detailed Annotation Criteria

Rina Kagawa; Yoshimasa Kawazoe; Emiko Shinohara; Takeshi Imai; Kazuhiko Ohe

Stud Health Technol Inform. 2017:245:432-436.

Authors

Rina Kagawa¹, Yoshimasa Kawazoe², Emiko Shinohara², Takeshi Imai³, Kazuhiko Ohe¹

Affiliations

¹ Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Japan.
² Department of Healthcare Information Management, The University of Tokyo Hospital, Japan.
³ Center for Disease Biology and Integrative Medicine, The University of Tokyo, Japan.

PMID: 29295131

Abstract

Phenotyping is an automated technique for identifying patients diagnosed with a particular disease based on electronic health records (EHRs). Evaluating phenotyping algorithms, which should be reproducible, depends critically on annotating EHRs to serve as a gold standard. However, we have found that several types of EHRs cannot be definitively annotated as CASEs or CONTROLs. The influence of such "possible patients" on phenotyping algorithms is unknown. To assess these issues, for four chronic diseases, we annotated EHRs by using information not directly referring to the diseases and developed two types of phenotyping algorithms for each disease. We confirmed that each disease included different types of possible patients. The performance of phenotyping algorithms differed depending on whether possible patients were considered as CASEs, and this was independent of the type of algorithm. Our results indicate that researchers must share annotation criteria for classifying the possible patients to reproduce phenotyping algorithms.
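
The dependence of measured performance on how possible patients are labeled can be illustrated with a minimal sketch. It is not the authors' method: the `evaluate` function, the `"POSSIBLE"` label, and the six toy records are all hypothetical and contain no data from the study. The sketch scores one fixed set of algorithm predictions twice, once counting possible patients as CONTROLs and once as CASEs.

```python
from typing import Dict, List


def evaluate(predicted: List[bool], gold: List[str],
             possible_as_case: bool) -> Dict[str, float]:
    """Precision and recall of predictions against gold annotations.

    Gold labels are "CASE", "CONTROL", or "POSSIBLE" (hypothetical label
    for records that cannot be definitively annotated). The flag decides
    whether "POSSIBLE" records count as CASEs in the gold standard.
    """
    truth = [g == "CASE" or (g == "POSSIBLE" and possible_as_case)
             for g in gold]
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


# Toy, made-up annotations and predictions (illustration only).
gold = ["CASE", "CASE", "POSSIBLE", "POSSIBLE", "CONTROL", "CONTROL"]
predicted = [True, True, True, False, False, False]

strict = evaluate(predicted, gold, possible_as_case=False)
lenient = evaluate(predicted, gold, possible_as_case=True)

print("possible patients as CONTROLs:", strict)
print("possible patients as CASEs:   ", lenient)
```

With these toy records the same predictions score a precision of 2/3 and a recall of 1.0 under the first criterion, and a precision of 1.0 and a recall of 0.75 under the second. Two groups evaluating the same algorithm could therefore report different figures unless the annotation criterion is shared.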

Keywords: Clinical Phenotyping; Data Annotation; Electronic Health Records.

MeSH terms

Algorithms*
Electronic Health Records*
Humans
Phenotype*