Describing Intersectional Health Outcomes: An Evaluation of Data Analysis Methods

Mayuri Mahendran; Daniel Lizotte; Greta R Bauer

doi:10.1097/EDE.0000000000001466

Epidemiology. 2022 May 1;33(3):395-405. doi: 10.1097/EDE.0000000000001466.

Authors

Mayuri Mahendran¹, Daniel Lizotte¹,², Greta R Bauer¹

Affiliations

¹ Department of Epidemiology and Biostatistics, Schulich School of Medicine & Dentistry, Western University, London, ON, Canada.
² Department of Computer Science, Faculty of Science, Western University, London, ON, Canada.

Abstract

Background: Intersectionality theoretical frameworks have been increasingly incorporated into quantitative research. A range of methods has been applied to describe outcomes and disparities across large numbers of intersections of social identities or positions, but these methods have undergone limited evaluation.

Methods: Using data simulated to reflect plausible epidemiologic data scenarios, we evaluated methods for intercategorical intersectional analysis of continuous outcomes, including cross-classification, regression with interactions, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and decision-tree methods (classification and regression trees [CART], conditional inference trees [CTree], random forest). The primary outcome was estimation accuracy of intersection-specific means. We applied each method to an illustrative example using National Health and Nutrition Examination Survey (NHANES) systolic blood pressure data.
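To make the primary outcome concrete, the following is a minimal sketch of the simplest method listed above, cross-classification, scored by how closely its intersection-specific means match known true means in simulated data. It is not the authors' simulation design: the three binary social positions, the effect sizes, the outcome standard deviation, the sample size, and the choice of mean absolute error as the accuracy measure are all assumptions made for illustration.

```python
import itertools
import random
import statistics


def simulate(n, seed=0):
    """Simulate n people with three binary positions and a continuous outcome.

    All numeric values here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Assumed true intersection means: additive effects plus one interaction.
    true_means = {}
    for a, b, c in itertools.product((0, 1), repeat=3):
        true_means[(a, b, c)] = 120.0 + 4.0 * a + 3.0 * b + 2.0 * c + 5.0 * a * b
    rows = []
    for _ in range(n):
        key = (rng.randint(0, 1), rng.randint(0, 1), rng.randint(0, 1))
        # Outcome is drawn around the true mean of the person's intersection.
        rows.append((key, rng.gauss(true_means[key], 15.0)))
    return rows, true_means


def cross_classified_means(rows):
    """Estimate each intersection's mean as the raw mean of its observed members."""
    groups = {}
    for key, y in rows:
        groups.setdefault(key, []).append(y)
    return {key: statistics.fmean(ys) for key, ys in groups.items()}


def mean_absolute_error(estimates, true_means):
    """Average absolute gap between estimated and true intersection means."""
    return statistics.fmean(abs(estimates[k] - true_means[k]) for k in estimates)


rows, true_means = simulate(n=2000, seed=1)
estimates = cross_classified_means(rows)
mae = mean_absolute_error(estimates, true_means)
print(f"intersections estimated: {len(estimates)}, mean absolute error: {mae:.2f}")
```

Under these assumptions, rerunning the sketch with a smaller n or with more positions per person leaves fewer observations in each intersection, which is the setting in which the methods are compared.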

Results: When studying high-dimensional intersections at smaller sample sizes, MAIHDA, CTree, and random forest produced more accurate estimates than the other methods. In large samples, all methods performed similarly except CART, which produced less accurate estimates. For variable selection, CART performed poorly across sample sizes, whereas random forest performed best. The NHANES example demonstrated that different methods resulted in meaningful differences in systolic blood pressure estimates, highlighting the importance of selecting appropriate methods.

Conclusions: This study evaluates several methods from a growing toolbox for describing intersectional health outcomes and disparities. We identified more accurate methods for estimating outcomes for high-dimensional intersections across different sample sizes. As estimation is rarely the only objective for epidemiologists, we highlight the different outputs each method produces and suggest the sequential pairing of methods as a strategy for overcoming certain technical challenges.

MeSH terms

Data Analysis*
Humans
Multilevel Analysis
Nutrition Surveys
Research Design*