Background: In recent years, various tools have been developed to facilitate the analysis of social determinants of health (SDH) and to apply the findings to health policy. Predictive models of health outcomes that combine a wide range of socioeconomic indicators with health problems are receiving increasing attention. Our objectives are twofold: (1) to predict population health outcomes, measured as hospital morbidity, using primary care (PC) morbidity adjusted for SDH as predictors; and (2) to analyze the geographic variability of the impact of SDH-adjusted PC morbidity on hospital morbidity, by combining data sourced from electronic health records and selected operations of the National Statistics Institute (Instituto Nacional de Estadística/INE).
Methods: Two studies will be conducted. First, a qualitative study will select socio-health indicators using RAND methodology in accordance with SDH frameworks, based on indicators published by the INE in selected operations. Second, a quantitative study will combine two large databases drawn from different Spanish Autonomous Regions (ARs), namely PC electronic health records and the minimum basic data set (MBDS) for hospital discharges, to enable hospital morbidity to be ascertained. These will be linked by geographic unit to the previously selected socioeconomic indicators. The outcome variable will be hospital morbidity, and the independent variables will be age, sex, PC morbidity, geographic unit, and socioeconomic indicators.
Analysis: To achieve the first objective, predictive models will be built by fitting multiple logistic regression models with a training-and-test approach. For the analysis of geographic variability, penalized mixed models will be used, with geographic units treated as random effects and the independent variables as fixed effects.
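The training-and-test step for the first objective can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated rather than drawn from the PC records or the MBDS, the names (`simulate`, `fit_logistic`, `auc_score`, `deprivation`) are hypothetical, the 70/30 split and the AUC metric are arbitrary choices, and the plain gradient-descent fit stands in for whatever statistical software the study will actually use. The penalized mixed models for the second objective are not covered.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, rng):
    # Simulated individuals (assumption, not study data). Features:
    # intercept, age scaled to [0, 0.9], sex, PC morbidity flag, and a
    # hypothetical area-level deprivation score.
    rows = []
    for _ in range(n):
        age = rng.uniform(0.0, 0.9)
        sex = rng.randint(0, 1)
        pc_morbidity = rng.randint(0, 1)
        deprivation = rng.gauss(0.0, 1.0)
        logit = -2.5 + 2.0 * age + 0.2 * sex + 1.0 * pc_morbidity + 0.6 * deprivation
        admitted = 1 if rng.random() < sigmoid(logit) else 0
        rows.append(([1.0, age, sex, pc_morbidity, deprivation], admitted))
    return rows


def fit_logistic(data, lr=0.5, epochs=400):
    # Batch gradient descent on the mean log-loss of a logistic model.
    k = len(data[0][0])
    w = [0.0] * k
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * k
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(k):
                grad[j] += (p - y) * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def auc_score(scores, labels):
    # Probability that a random positive case outranks a random negative one.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
data = simulate(1000, rng)
rng.shuffle(data)

# Hold out 30% of individuals; the model never sees them during fitting.
split = int(0.7 * len(data))
train, test = data[:split], data[split:]

weights = fit_logistic(train)
scores = [sigmoid(sum(w * xi for w, xi in zip(weights, x))) for x, _ in test]
auc = auc_score(scores, [y for _, y in test])
print("held-out AUC:", round(auc, 3))
```

Scoring only the held-out individuals is what separates predictive performance from goodness of fit on the data used to estimate the coefficients.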
Discussion: This study seeks to show the relationship between SDH and population health, and the geographic differences attributable to these determinants. The main limitations are that the data are collected for healthcare rather than research purposes, the time lag between collection and publication of data, and sampling errors and missing data in registries and surveys. The main strength lies in the project's multidisciplinary nature (family medicine, pediatrics, public health, nursing, psychology, engineering, geography).
Keywords: big data; electronic health records (EHR); morbidity; social determinants of health (MeSH); socioeconomic factors (MeSH).
Copyright © 2022 Couso-Viana, Bentué-Martínez, Delgado-Martín, Cabeza-Irigoyen, León-Latre, Concheiro-Guisán, Rodríguez-Álvarez, Román-Rodríguez, Roca-Pardiñas, Zúñiga-Antón, García-Flaquer, Pericàs-Pulido, Sánchez-Recio, González-Álvarez, Rodríguez-Pastoriza, Gómez-Gómez, Motrico, Jiménez-Murillo, Rabanaque and Clavería.