Objectives: The Agency for Healthcare Research and Quality (AHRQ) funded the RTI International—University of North Carolina at Chapel Hill Evidence-based Practice Center to determine best practices for addressing clinical heterogeneity in systematic reviews (SRs) and comparative effectiveness reviews (CERs). These best practices address critiques from patients, clinicians, policymakers, and others who assert that SRs typically focus on broad populations and, as a result, often lack information relevant to individual patients or patient subgroups.
Data sources and methods: We drew on five types of data sources. We abstracted information from guidance documents prepared by U.S. and international organizations engaged in preparing reviews. We searched MEDLINE® to identify studies on how to handle clinical heterogeneity and subgroup analyses. We reviewed more than 120 SRs conducted by AHRQ’s Evidence-based Practice Centers (EPCs), the Cochrane Collaboration, the Drug Effectiveness Review Project, the United Kingdom’s National Institute for Health and Clinical Excellence, and others that we identified from the Centre for Reviews and Dissemination Database of Abstracts of Reviews of Effects and Health Technology Assessment. We reviewed peer and public review comments from AHRQ’s Scientific Review Center for three CERs, and we conducted key informant interviews with authors of six SRs prepared by AHRQ’s EPCs or international organizations.
Results: Clinical heterogeneity has been defined as the variation in study population characteristics, coexisting conditions, cointerventions, and outcomes evaluated across studies included in an SR or CER that may influence or modify the magnitude of the intervention measure of effect (e.g., odds ratio, risk ratio, risk difference). Statistical heterogeneity is defined as variability in the observed treatment effects beyond what would be expected by random error. The review organizations we studied varied in how they incorporated factors that may modify the treatment-outcome association into their key questions and analyses. They tended to give more consideration to demographic factors than to disease factors (e.g., disease severity, risk factors, coexisting disease, or cointerventions). The systematic reviewers whom we interviewed preferred a priori identification of effect modifiers to post hoc determination because of the latter’s data-dredging nature and the possibility of type I error when many subgroups are evaluated. Many publications identified through our literature searches indicated that analyzing individual patient-level data in meta-analyses allows better assessment of clinical heterogeneity, but the time, cost, and difficulty of obtaining these data are often prohibitive.
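As an illustration of two ideas in the Results, the sketch below computes Cochran's Q and the I² statistic, which are common (though not the only) ways to quantify statistical heterogeneity, and the familywise type I error rate when several subgroups are tested. It is a minimal sketch under stated assumptions, not a method drawn from the reviews we studied: the function names are hypothetical, effects are assumed to be on an additive scale (e.g., log odds ratios) with known within-study variances, pooling is fixed-effect inverse-variance, and the subgroup tests are assumed to be independent.

```python
def cochran_q_and_i2(effects, variances):
    """Return (pooled_effect, Q, I2_percent) for a set of study effects.

    effects   -- per-study effect estimates on an additive scale
                 (e.g., log odds ratios); illustrative input only
    variances -- within-study variances of those estimates
    """
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    # I2: share of total variability beyond what chance alone would
    # produce, truncated at zero.
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances, for illustration only.
    pooled, q, i2 = cochran_q_and_i2([0.10, 0.35, -0.05, 0.60],
                                     [0.04, 0.02, 0.05, 0.03])
    print(f"pooled={pooled:.3f}  Q={q:.2f}  I2={i2:.1f}%")
    print(f"FWER for 10 subgroups at alpha=0.05: "
          f"{familywise_error_rate(0.05, 10):.3f}")
```

Under the independence assumption, 10 subgroup tests at a significance level of 0.05 carry roughly a 40 percent chance of at least one false-positive finding, which is the concern behind the reviewers' preference for a priori specification of effect modifiers.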
Conclusions: Identifying factors that may influence the treatment-outcome association is important to clinicians and patients because it helps them understand which patients will benefit most, which are least likely to benefit, and which are at greatest risk of experiencing adverse outcomes. Clear evidence-based guidance on addressing clinical heterogeneity in SRs and CERs is not currently available but would be valuable to AHRQ’s EPCs and to others conducting SRs internationally.