Background: Intersectionality theoretical frameworks have been increasingly incorporated into quantitative research. A range of methods have been applied to describe outcomes and disparities across large numbers of intersections of social identities or positions, but these methods have undergone limited evaluation.
Methods: Using data simulated to reflect plausible epidemiologic data scenarios, we evaluated methods for intercategorical intersectional analysis of continuous outcomes, including cross-classification, regression with interactions, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and decision-tree methods (classification and regression trees [CART], conditional inference trees [CTree], random forest). The primary outcome was estimation accuracy of intersection-specific means. We applied each method to an illustrative example using National Health and Nutrition Examination Survey (NHANES) systolic blood pressure data.
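As a rough illustration of what estimating intersection-specific means involves, the sketch below contrasts plain cross-classification (one raw mean per intersection) with a simple partially pooled estimate that shrinks each raw mean toward the grand mean. The shrinkage step is only a simplified stand-in for the random-intercept idea behind multilevel approaches; it is not MAIHDA itself and not the simulation design used in this study. All function names, the number of categories, and the simulation parameters (baseline of 120, `tau`, `sigma`) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def simulate(n, rng, levels=(2, 3, 3), tau=2.0, sigma=15.0):
    """Simulate a continuous outcome over all intersections (hypothetical setup)."""
    strata = list(itertools.product(*[range(k) for k in levels]))
    # Each intersection gets its own true mean around an arbitrary baseline.
    true_means = 120.0 + rng.normal(0.0, tau, size=len(strata))
    idx = rng.integers(0, len(strata), size=n)  # intersection of each person
    y = true_means[idx] + rng.normal(0.0, sigma, size=n)
    return idx, y, true_means


def cross_classified_means(idx, y, n_strata):
    """Raw mean per intersection; NaN where an intersection has no observations."""
    counts = np.bincount(idx, minlength=n_strata)
    sums = np.bincount(idx, weights=y, minlength=n_strata)
    means = np.full(n_strata, np.nan)
    nonempty = counts > 0
    means[nonempty] = sums[nonempty] / counts[nonempty]
    return means, counts


def partially_pooled_means(idx, y, n_strata):
    """Shrink raw means toward the grand mean (simplified empirical-Bayes sketch)."""
    means, counts = cross_classified_means(idx, y, n_strata)
    nonempty = counts > 0
    grand = y.mean()
    # Pooled within-intersection variance.
    resid = y - means[idx]
    df = len(y) - nonempty.sum()
    sigma2 = (resid ** 2).sum() / max(df, 1)
    # Method-of-moments estimate of between-intersection variance, floored at 0.
    raw_var = np.var(means[nonempty], ddof=1)
    tau2 = max(raw_var - np.mean(sigma2 / counts[nonempty]), 0.0)
    # Small intersections get small weights, so they are shrunk the most.
    weight = tau2 / (tau2 + sigma2 / counts[nonempty])
    pooled = np.full(n_strata, grand)  # empty intersections fall back to grand mean
    pooled[nonempty] = grand + weight * (means[nonempty] - grand)
    return pooled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    idx, y, truth = simulate(n=200, rng=rng)
    raw, counts = cross_classified_means(idx, y, len(truth))
    pooled = partially_pooled_means(idx, y, len(truth))
    seen = counts > 0
    # Root mean squared error of each estimator over observed intersections.
    print("cross-classification RMSE:", np.sqrt(np.mean((raw[seen] - truth[seen]) ** 2)))
    print("partial pooling RMSE:     ", np.sqrt(np.mean((pooled[seen] - truth[seen]) ** 2)))
```

Running the script at different values of `n` gives a feel for how the two estimators diverge as intersections become sparsely populated; any pattern seen in this toy setup should not be read as a reproduction of the study's findings.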
Results: When studying high-dimensional intersections at smaller sample sizes, MAIHDA, CTree, and random forest produced more accurate estimates. In large samples, all methods performed similarly except CART, which produced less accurate estimates. For variable selection, CART performed poorly across sample sizes, whereas random forest performed best. The NHANES example demonstrated that different methods resulted in meaningful differences in systolic blood pressure estimates, highlighting the importance of selecting appropriate methods.
Conclusions: This study evaluates several methods from a growing toolbox for describing intersectional health outcomes and disparities. We identified more accurate methods for estimating outcomes for high-dimensional intersections across different sample sizes. As estimation is rarely the only objective for epidemiologists, we highlight the different outputs each method produces and suggest the sequential pairing of methods as a strategy for overcoming certain technical challenges.
Copyright © 2022 The Author(s). Published by Wolters Kluwer Health, Inc.