Addressing bias in national population density models: Focusing on rural Senegal

PLoS One. 2024 Nov 12;19(11):e0310809. doi: 10.1371/journal.pone.0310809. eCollection 2024.

Abstract

Knowing where people are is crucial for policymakers, particularly for the efficient allocation of resources in their country and the development of effective, people-centred policies. However, rural population distribution maps suffer from biases related to the type of dataset used to predict population density, such as the use of nighttime lights datasets in areas without electricity. This renders widely used datasets irrelevant in rural areas and biases nationwide models towards urban areas. To compensate for such biases, we aim to understand the importance of water-related covariates, and their relationship with population densities, in a random forest model across the urban-rural gradient. By extending a recursive feature elimination framework, we show that commonly used covariates are selected only when modelling the whole country. Once the highest-density areas are removed, water-related characteristics (especially distance to boreholes) become important covariates of population density outside densely populated areas. This has important implications for modelling population in rural areas, including better estimation of the size of remote communities. When seeking to produce country-level population maps, we encourage further studies to account explicitly for rural areas by considering the urban-rural gradient, and we encourage the use of water-related datasets.
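
The general idea of repeating covariate selection along the urban-rural gradient can be illustrated with a minimal sketch. It is not the authors' implementation: it uses scikit-learn's standard recursive feature elimination rather than the extended framework the abstract mentions, and the covariate names, the synthetic data, the quantile cut-offs, and both helper functions are assumptions made only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE


def select_covariates(X, y, names, n_keep=2, seed=0):
    """Keep n_keep covariates using random-forest-based recursive feature elimination."""
    forest = RandomForestRegressor(n_estimators=50, random_state=seed)
    rfe = RFE(estimator=forest, n_features_to_select=n_keep, step=1)
    rfe.fit(X, y)
    return [name for name, keep in zip(names, rfe.support_) if keep]


def selection_along_gradient(X, y, names, drop_quantiles=(0.0, 0.1, 0.25), n_keep=2):
    """Repeat the selection after removing the densest share of spatial units."""
    results = {}
    for q in drop_quantiles:
        cutoff = np.quantile(y, 1.0 - q)  # q = 0.0 keeps every unit
        mask = y <= cutoff
        results[q] = select_covariates(X[mask], y[mask], names, n_keep=n_keep)
    return results


# Synthetic stand-in data (hypothetical covariates, not the study's datasets).
rng = np.random.default_rng(0)
n_units = 300
names = ["night_lights", "dist_to_road", "dist_to_borehole", "elevation"]
X = rng.normal(size=(n_units, len(names)))
# Hypothetical log-linear relation, used only so the example has some signal.
y = np.exp(1.0 * X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=n_units))

results = selection_along_gradient(X, y, names)
for q, kept in results.items():
    print(f"densest {q:.0%} removed -> selected covariates: {kept}")
```

Comparing the selected covariates across cut-offs shows whether the selection changes once the densest units are excluded; with real data, the choice of cut-offs and of the number of covariates to keep would need to be justified.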

MeSH terms

  • Bias
  • Humans
  • Models, Theoretical
  • Population Density*
  • Rural Population*
  • Senegal
  • Urban Population

Grants and funding

The authors received no specific funding for this work.