Introduction: Obesity, defined as a body mass index ≥30 kg/m², is a major public health concern in the United States. Preventative approaches are essential, but they are limited by an inability to accurately identify individuals at highest risk of weight gain. Our objective was to develop accurate weight gain prediction models using the National Institutes of Health All of Us dataset. We hypothesized that machine learning models using both electronic health record and behavioral survey data would outperform models using electronic health record data alone.
Methods: The All of Us dataset was used to identify adults between 18 and 70 y old with weight measurements 2 y apart between 2008 and 2022. Patients with a history of cancer, bariatric surgery, or pregnancy were excluded. Demographics, vital signs, laboratory results, comorbidities, and survey data (Alcohol Use Disorders Identification Test, Patient-Reported Outcomes Measurement Information System physical and mental health scores) were included as model predictors. Elastic net and XGBoost machine learning models were developed with and without survey data to predict ≥10% total body weight gain within 2 y. The data were split into a training sample (60%) and a testing sample (40%), and hyperparameters were tuned using 10-fold cross-validation. Performance was compared using areas under the receiver operating characteristic curve (AUCs).
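As an illustration of the outcome definition, data split, and performance metric described in the Methods, the following is a minimal sketch in plain Python. It is not the study's code: the function names (`gained_10_percent`, `split_60_40`, `auc`), the random seed, and the toy values are assumptions made for this example, and model fitting (elastic net, XGBoost) is omitted.

```python
import random


def gained_10_percent(baseline_kg, followup_kg):
    """Binary outcome: follow-up weight is at least 10% above baseline weight."""
    return (followup_kg - baseline_kg) / baseline_kg >= 0.10


def split_60_40(records, seed=0):
    """Shuffle a copy of the records and split it 60% training / 40% testing.

    The seed is an assumption of this sketch; the abstract does not report one.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.6 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy (baseline, follow-up) weights in kg; illustrative values only.
    weights = [(70.0, 80.0), (70.0, 75.0), (90.0, 88.0), (60.0, 70.0)]
    labels = [gained_10_percent(b, f) for b, f in weights]
    # Hypothetical predicted risks from some fitted model.
    scores = [0.9, 0.5, 0.1, 0.4]
    print(labels, auc(labels, scores))
```

The pairwise form of the AUC is quadratic in sample size; for a cohort of the size reported here, a rank-based implementation from a statistics library would be the practical choice.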
Results: Our cohort consisted of 34,715 patients (mean [SD] age 50.9 [13.4] y; 45.7% White; 55.3% female). Over a 2-y span, 10.4% of the cohort gained ≥10% total body weight. Without survey data, AUCs were 0.677 [95% DeLong confidence interval 0.665-0.688] for elastic net and 0.706 [0.695-0.717] for XGBoost. Incorporation of survey data did not improve discrimination, with AUCs of 0.681 [0.669-0.692] for elastic net and 0.705 [0.694-0.716] for XGBoost.
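The confidence intervals above are reported as DeLong intervals. The sketch below shows one way a DeLong-type interval for a single AUC can be computed from its structural components. It is an illustration under stated assumptions, not the authors' implementation: the function name `delong_ci`, the normal-approximation multiplier of 1.96, and the clipping of the bounds to [0, 1] are choices made for this example.

```python
import math


def delong_ci(labels, scores, z=1.96):
    """Return (auc, lower, upper) using the DeLong variance estimate.

    Requires at least two positive and two negative cases so that both
    sample variances are defined.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    m, n = len(pos), len(neg)
    if m < 2 or n < 2:
        raise ValueError("need at least two cases in each class")

    def psi(p, q):
        # Kernel: 1 if the positive outscores the negative, 0.5 on ties.
        return 1.0 if p > q else 0.5 if p == q else 0.0

    # Structural components: one per positive case and one per negative case.
    v10 = [sum(psi(p, q) for q in neg) / n for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / m for q in neg]
    auc = sum(v10) / m

    def sample_var(values, mean):
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    variance = sample_var(v10, auc) / m + sample_var(v01, auc) / n
    half_width = z * math.sqrt(variance)
    # Clipping to [0, 1] is a convention assumed in this sketch.
    return auc, max(0.0, auc - half_width), min(1.0, auc + half_width)


if __name__ == "__main__":
    # Toy labels and hypothetical predicted risks; illustrative values only.
    labels = [1, 1, 1, 0, 0, 0, 0]
    scores = [0.8, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1]
    print(delong_ci(labels, scores))
```

With only a handful of cases the interval is very wide; the narrow intervals in the Results reflect the size of the testing sample.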
Conclusions: Our machine learning weight gain prediction models had modest performance that was not improved by survey data. The addition of other All of Us variables, including genomic data, may be informative in future studies.
Keywords: Adult; All of Us; Machine learning; Models; Obesity; Prediction; Weight gain.
Published by Elsevier Inc.