Biases in ecological studies: utility of including within-area distribution of confounders

Stat Med. 2000 Jan 15;19(1):45-59. doi: 10.1002/(sici)1097-0258(20000115)19:1<45::aid-sim276>3.0.co;2-5.

Abstract

This paper is centred on studies commonly known as ecological studies, where one seeks to estimate risks from data aggregated on a geographical basis. The aggregated nature of the data creates difficulties for interpreting, at an individual level, any ecological association found, difficulties which are generically referred to as 'the ecological bias or fallacy'. Here, we address an important component of this bias related to the problem of misspecification in ecological studies. We consider how aggregate-level dose-response relationships are derived by integrating individual-level ones over the group, and how their correct specification requires, in general, knowledge of the within-group joint distribution of the relevant risk factors, which is rarely available. We discuss in detail the common situation where data on the proportion of persons exposed in each area to several dichotomous risk factors are available. We show that ecological regression estimates of the relative risk for each factor will be improved by including in the regression, besides the linear terms, cross-product terms of the marginal prevalences. Results from a simulation study are discussed and an example concerning the geographical analysis of lung cancer mortality in France is presented.
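
To illustrate why cross-product terms of the marginal prevalences can matter, the following is a minimal sketch and not the authors' model. It assumes two dichotomous risk factors, a multiplicative individual-level risk r0 * RR1^x1 * RR2^x2, and independence of the two exposures within each area. Under those assumptions, averaging the individual risk over the area gives r0 * (1 + (RR1 - 1) p1 + (RR2 - 1) p2 + (RR1 - 1)(RR2 - 1) p1 p2), so a regression on p1 and p2 alone omits the last term. All function names and numerical values are hypothetical.

```python
from itertools import product


def area_rate_by_enumeration(p1, p2, r0, rr1, rr2):
    """Average the individual-level risk over the four exposure cells.

    Assumes (for illustration only) that the two exposures are
    independent within the area, so the cell weights are products
    of the marginal prevalences p1 and p2.
    """
    rate = 0.0
    for x1, x2 in product((0, 1), repeat=2):
        weight = (p1 if x1 else 1 - p1) * (p2 if x2 else 1 - p2)
        rate += weight * r0 * rr1 ** x1 * rr2 ** x2
    return rate


def area_rate_closed_form(p1, p2, r0, rr1, rr2, include_cross_product=True):
    """Closed-form area-level rate, with or without the p1 * p2 term."""
    a, b = rr1 - 1, rr2 - 1
    linear_part = 1 + a * p1 + b * p2
    cross_part = a * b * p1 * p2 if include_cross_product else 0.0
    return r0 * (linear_part + cross_part)


if __name__ == "__main__":
    # Hypothetical area: 40% exposed to factor 1, 30% to factor 2.
    args = dict(p1=0.4, p2=0.3, r0=1e-3, rr1=10.0, rr2=2.0)
    exact = area_rate_by_enumeration(**args)
    full = area_rate_closed_form(**args)
    linear_only = area_rate_closed_form(**args, include_cross_product=False)
    print(f"enumerated rate:       {exact:.6f}")
    print(f"with cross-product:    {full:.6f}")
    print(f"linear terms only:     {linear_only:.6f}")
```

When the exposures are not independent within areas, the joint prevalence differs from p1 * p2, which is why the correct specification in general requires the within-area joint distribution.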

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Bias*
  • France / epidemiology
  • Humans
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / mortality
  • Male
  • Metallurgy
  • Middle Aged
  • Normal Distribution
  • Poisson Distribution
  • Regression Analysis*
  • Risk
  • Risk Factors
  • Urban Population