Deep learning approaches are gradually being applied to electronic health record (EHR) data, but they fail to incorporate medical diagnosis codes and real-valued laboratory tests into a single input sequence for temporal modeling. As a result, the modeling misses the medical interrelations between codes and lab test results that should be exploited to promote early disease detection. To find connections between past diagnoses, represented by medical codes, and real-valued laboratory tests, and thereby exploit the full potential of the EHR in medical diagnosis, we present a novel method to embed the two sources of data into a recurrent neural network. Experimenting with a database of patients with Crohn's disease (CD), a type of inflammatory bowel disease, and their controls (ratio ~1:2.2), we show that introducing lab test results improves the network's predictive performance more than introducing past diagnoses and, surprisingly, more than combining both. In addition, using bootstrapping, we generalize the analysis of the imbalanced database to a setting that simulates the real-life prevalence of CD in a high-risk group of first-degree relatives, with results that make our embedding method ready to screen this group in the population.
Keywords: Crohn's disease; Electronic health record (EHR); Embedding; Gated recurrent unit (GRU); Lab test result; Medical concept.
Copyright © 2023 Elsevier B.V. All rights reserved.