Background: A community's obesity outcomes can reflect its local food environment. Recent studies have suggested that the local food environment can be measured by the degree of food accessibility, which is normally calculated from survey data. However, compared with survey data, social media data are organic, continuously updated, and cheaper to collect.
Objective: The objective of our study was to use publicly available social media data to learn the relationship between food environment and obesity rates at the state level.
Methods: To characterize the caloric information of the local food environment, we used food categories from Yelp and collected caloric information from MyFitnessPal for each category based on its popular dishes. We then calculated the average calories for each category and created a weighted score for each state. To build obesity prediction models, we also calculated 2 other dimensions drawn from the concept of access: acceptability and affordability.
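As an illustration only, the caloric scoring step might be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the function names are hypothetical, the per-state weights are assumed to be the number of businesses listed in each food category (the abstract does not specify the weighting scheme), and all numbers are made up for the example.

```python
from statistics import mean


def category_mean_calories(dish_calories):
    """Average the calories of the popular dishes in each food category.

    dish_calories maps a category name to a list of calorie values,
    one per popular dish (hypothetical input layout).
    """
    return {cat: mean(vals) for cat, vals in dish_calories.items() if vals}


def state_caloric_score(category_counts, category_means):
    """Weighted average of category mean calories for one state.

    ASSUMPTION: the weight of a category is the number of businesses
    in that category in the state; the source does not state this.
    Categories without caloric information are skipped.
    """
    total = 0.0
    weight = 0
    for cat, count in category_counts.items():
        if cat in category_means:
            total += count * category_means[cat]
            weight += count
    if weight == 0:
        raise ValueError("no category with known calories in this state")
    return total / weight


# Illustrative values only, not data from the study.
means = category_mean_calories({"pizza": [250, 350], "salad": [100, 200]})
score = state_caloric_score({"pizza": 3, "salad": 1}, means)
print(means, score)
```

Under this assumed weighting, a state with many businesses in high-calorie categories receives a higher score than one dominated by low-calorie categories.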
Results: The local food environment characterized using only publicly available social media data had a statistically significant correlation with the state obesity rate. We achieved a Pearson correlation of 0.796 between the predicted obesity rate and the reported obesity rate from the Behavioral Risk Factor Surveillance System across US states and the District of Columbia. The model with 3 generated feature sets achieved the best performance.
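For readers unfamiliar with the evaluation metric, the Pearson correlation between predicted and reported rates can be computed as in the sketch below. The function name and the sample values are hypothetical and serve only to show the calculation; they are not the study's data or code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up rates for illustration; not values from the study.
predicted = [28.0, 31.5, 35.0, 30.2]
reported = [27.1, 32.0, 36.4, 29.5]
print(round(pearson_r(predicted, reported), 3))
```

A value near 1 indicates that states predicted to have higher obesity rates also tend to report higher rates.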
Conclusions: Our study proposed a method for characterizing state-level food environments using only continuously updated social media data. State-level food environments were accurately described using social media data, and the model also showed a disparity in the available food between states with different obesity rates. The proposed method should extend flexibly to local food environments of different sizes and predict obesity rates effectively.
Keywords: category; correlation; dishes; environment; food; lifestyle; machine learning; mobile phone; modeling; obesity; outcome; popular; predict; rates; social media.
©Chuqin Li, Alexis Jordan, Jun Song, Yaorong Ge, Albert Park. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 13.12.2022.