Objective: Predicting physiological mechanics is important in medical practice because interventions are chosen according to their predicted effects. Prediction is difficult in medicine, however, because physiological processes are complex and hard to understand from data alone, and the data are sparse relative to the complexity of the generating processes. Computational methods can increase prediction accuracy, but they must still contend with clinical data that are sparse, noisy, and nonstationary. This paper focuses on predicting physiological processes from sparse, nonstationary electronic health record data in the intensive care unit using data assimilation (DA), a broad collection of methods that pair mechanistic models with inference methods.
Methods: A methodological pipeline was developed that embeds a glucose-insulin model into a new DA framework, the constrained ensemble Kalman filter (CEnKF), to forecast blood glucose. The data come from tube-fed patients whose nutrition, blood glucose, administered insulins, and medications were extracted by hand because of their complexity and to ensure accuracy. The model was estimated using each individual's data as if they arrived in real time, and the estimated model was then run forward to produce a forecast. Both constrained and unconstrained ensemble Kalman filters were applied to compare the impact of constraints. Constraint boundaries, the sets of model parameters estimated, and the data used to estimate the models were varied to investigate their influence on forecasting accuracy. Forecasting accuracy was evaluated by the mean squared error between the model-forecasted glucose and the measurements and by comparing the distributions of measured glucose and forecast ensemble means.
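To make the filtering and evaluation steps concrete, the following is a minimal sketch of one stochastic ensemble Kalman filter analysis step with optional box constraints, plus the mean squared error used for evaluation. It is an illustration under stated assumptions and not the authors' CEnKF: the constraint here is enforced by simple clipping of ensemble members to the bounds, which is a stand-in for whatever constrained inference the CEnKF actually performs. The function names, the two-variable state (glucose and one unmeasured parameter), the bounds, and all numeric values are hypothetical.

```python
import numpy as np


def enkf_update(ensemble, y_obs, H, obs_var, rng, lower=None, upper=None):
    """One stochastic EnKF analysis step for a scalar observation.

    ensemble : (n_members, n_state) array of forecast states
    H        : (n_state,) linear observation vector (assumed, for illustration)
    lower, upper : optional per-state bounds; members are clipped to them.
                   Clipping is a simplification, not the paper's CEnKF.
    """
    n = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    predicted = ensemble @ H                      # predicted observations, (n,)
    pred_anom = predicted - predicted.mean()
    cov_xy = anomalies.T @ pred_anom / (n - 1)    # state/observation covariance
    var_yy = pred_anom @ pred_anom / (n - 1) + obs_var
    gain = cov_xy / var_yy                        # Kalman gain, (n_state,)
    # Perturb the observation independently for each member.
    perturbed = y_obs + rng.normal(0.0, np.sqrt(obs_var), size=n)
    updated = ensemble + np.outer(perturbed - predicted, gain)
    if lower is not None or upper is not None:
        updated = np.clip(updated, lower, upper)
    return updated


def mean_squared_error(forecast, measured):
    """Mean squared error between forecast and measured glucose."""
    forecast = np.asarray(forecast, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean((forecast - measured) ** 2))


# Hypothetical example: state = [glucose (mg/dL), unmeasured parameter].
rng = np.random.default_rng(0)
prior = np.column_stack([
    rng.normal(150.0, 20.0, size=50),   # glucose forecast ensemble
    rng.normal(0.5, 0.3, size=50),      # unmeasured parameter ensemble
])
H = np.array([1.0, 0.0])                # only glucose is observed
posterior = enkf_update(
    prior, y_obs=120.0, H=H, obs_var=25.0, rng=rng,
    lower=np.array([40.0, 0.0]), upper=np.array([400.0, 1.0]),
)
print(posterior.mean(axis=0))
```

In this sketch the bounds act on both the measured state and the unmeasured parameter, which mirrors the idea that constraints on unmeasured quantities keep the ensemble physiologically plausible when data are sparse.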
Results: The novel CEnKF produced substantial gains in robustness and accuracy while minimizing data requirements compared with the unconstrained ensemble Kalman filter. Administered insulin and tube nutrition were important for accurate forecasting, but including glucose in IV medication delivery did not increase forecast accuracy. Model flexibility, controlled by constraint boundaries and the set of estimated parameters, did influence forecasting accuracy.
Conclusion: Accurate and robust physiological forecasting with sparse clinical data is possible with DA. Introducing constrained inference, particularly on unmeasured states and parameters, reduced forecast error and data requirements. The results are not particularly sensitive to model flexibility choices such as constraint boundaries, but over- or under-constraining increased forecasting errors.
Keywords: Constrained ensemble Kalman filter; Data assimilation; Electronic health record data; Glucose–insulin; Mathematical physiological model.
Copyright © 2023 The Authors. Published by Elsevier Inc. All rights reserved.