Taking a 'Big Data' approach to data quality in a citizen science project

Ambio. 2015 Nov;44 Suppl 4(Suppl 4):601-11. doi: 10.1007/s13280-015-0710-4.

Abstract

Data from well-designed experiments provide the strongest evidence of causation in biodiversity studies. However, for many species the collection of these data is not scalable to the spatial and temporal extents required to understand patterns at the population level. Only citizen science projects can gather data in sufficient quantities, but data collected by volunteers are inherently noisy and heterogeneous. Here we describe a 'Big Data' approach to improving data quality in eBird, a global citizen science project that gathers bird observations. First, eBird's data submission design ensures that all data meet high standards of completeness and accuracy. Second, we take a 'sensor calibration' approach to measure individual variation in eBird participants' ability to detect and identify birds. Third, we use species distribution models to fill in data gaps. Finally, we provide examples of novel analyses exploring population-level patterns in bird distributions.

Keywords: Biodiversity monitoring; Citizen science; Data quality; Species distribution models; eBird.
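
The 'sensor calibration' step in the abstract can be illustrated with a deliberately crude sketch. The abstract does not describe how the authors actually measure observer variation, so everything here is an assumption made for illustration: the function name `observer_calibration`, the input shape (pairs of observer ID and number of species reported on one checklist), and the choice of index (an observer's mean species count per checklist divided by the mean across all checklists). A realistic calibration would also need to account for effort, location, and season, which this sketch ignores.

```python
from collections import defaultdict
from statistics import mean


def observer_calibration(checklists):
    """Return a relative detection index per observer.

    `checklists` is an iterable of (observer_id, species_count) pairs,
    one pair per submitted checklist. This input shape is hypothetical
    and is not the eBird data format.

    The index is the observer's mean species count per checklist divided
    by the mean over all checklists, so 1.0 means "typical". This is an
    illustrative proxy, not the method used in the paper.
    """
    per_observer = defaultdict(list)
    for observer_id, species_count in checklists:
        per_observer[observer_id].append(species_count)

    if not per_observer:
        raise ValueError("no checklists supplied")

    # Mean over every checklist, regardless of who submitted it.
    overall = mean(c for counts in per_observer.values() for c in counts)
    if overall == 0:
        raise ValueError("all checklists report zero species")

    return {obs: mean(counts) / overall for obs, counts in per_observer.items()}


# Made-up example data: observer "a" submits two checklists, "b" one.
scores = observer_calibration([("a", 10), ("a", 20), ("b", 30)])
```

With the made-up input above, the mean over all checklists is 20 species, so observer "a" (mean 15) gets an index of 0.75 and observer "b" (mean 30) gets 1.5. An index like this could be used as a covariate in a downstream model, although whether that matches the authors' approach cannot be determined from the abstract alone.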

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Biodiversity*
  • Birds*
  • Conservation of Natural Resources / methods*
  • Data Accuracy*
  • Internet
  • Models, Biological