Biases in Electronic Health Records Data for Generating Real-World Evidence: An Overview

J Healthc Inform Res. 2023 Nov 14;8(1):121-139. doi: 10.1007/s41666-023-00153-2. eCollection 2024 Mar.

Abstract

Electronic health records (EHRs) are increasingly perceived as a unique source of data for clinical research because they provide unprecedentedly large volumes of real-time data from real-world settings. In this review of the secondary uses of EHRs, we identify the anticipated breadth of opportunities and point out the data deficiencies and potential biases that are likely to limit the search for true causal relationships. This paper provides a comprehensive overview of the types of biases that arise along the pathways that generate real-world evidence, and of the sources of these biases. We distinguish between two levels in the production of EHR data where biases are likely to arise: (i) the healthcare system level, where the principal sources of bias reside in access to, and provision of, medical care, and in the acquisition and documentation of medical and administrative data; and (ii) the research level, where biases arise from the processes of extracting, analyzing, and interpreting these data. Because of the plethora of biases, mainly in the form of selection and information bias, we conclude by advising extreme caution when making causal inferences based on secondary uses of EHRs.

Keywords: Bias; Electronic Health Records; Real World Data; Real World Evidence; Study Validity.