Evaluation of imputation methods for microbial surface water quality studies

Environ Sci Process Impacts. 2014 May;16(5):1145-53. doi: 10.1039/c3em00721a.

Abstract

Longitudinal studies of microbial water quality are subject to missing observations. This study evaluates multiple imputation (MI) against data deletion and mean or median imputation for replacing missing microbial water quality data. The specific context is data collected in the Chicago Area Waterway System (2007-2009), where 45% of Escherichia coli and 53% of enterococci densities were missing owing to sample analysis deficiencies. The imputation methods were compared in a simulation study that introduced missing values into the complete observations, and were subsequently compared using the original data with missing observations. Coefficients for E. coli densities in linear regression models predicting somatic coliphage densities show that, of the methods compared, MI introduces the least bias while controlling Type I error. Further exploration of different MI implementations is recommended to address the influence of the percentage of missing data on MI performance and to explore sensitivity to the degree of violation of the missing completely at random (MCAR) assumption.
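
The simulation design described above (complete data, introduced missing values, then a comparison of regression coefficients across methods) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, sample size, true slope, noise levels, and the bootstrap-based stochastic regression imputation used as a stand-in for MI are all assumptions made for the example. Only the 45% missing fraction is taken from the abstract.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the simple linear regression y = a + b * x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def stochastic_fill(x, y, rng):
    """One imputed copy of x: regress x on y in a bootstrap sample of the
    complete cases, then fill each missing x with prediction plus noise."""
    obs = ~np.isnan(x)
    idx = rng.choice(np.flatnonzero(obs), size=int(obs.sum()), replace=True)
    slope = ols_slope(y[idx], x[idx])
    intercept = x[idx].mean() - slope * y[idx].mean()
    resid = x[idx] - (intercept + slope * y[idx])
    sigma = resid.std(ddof=2)
    filled = x.copy()
    n_miss = int((~obs).sum())
    filled[~obs] = intercept + slope * y[~obs] + rng.normal(0.0, sigma, n_miss)
    return filled


def compare_methods(n=200, miss_frac=0.45, true_slope=0.8, m=20,
                    n_sim=200, seed=0):
    """Average bias of the estimated slope for each missing-data method."""
    rng = np.random.default_rng(seed)
    estimates = {"deletion": [], "mean": [], "median": [], "mi": []}
    for _ in range(n_sim):
        # Synthetic predictor and outcome (hypothetical log10 densities).
        x = rng.normal(2.0, 1.0, n)
        y = 1.0 + true_slope * x + rng.normal(0.0, 0.5, n)

        # Introduce missing values completely at random (MCAR).
        x_miss = x.copy()
        x_miss[rng.random(n) < miss_frac] = np.nan
        obs = ~np.isnan(x_miss)

        # Data deletion: keep complete cases only.
        estimates["deletion"].append(ols_slope(x_miss[obs], y[obs]))
        # Single imputation with the observed mean or median.
        x_mean = np.where(obs, x_miss, x_miss[obs].mean())
        x_median = np.where(obs, x_miss, np.median(x_miss[obs]))
        estimates["mean"].append(ols_slope(x_mean, y))
        estimates["median"].append(ols_slope(x_median, y))
        # MI stand-in: m imputed data sets, point estimates pooled by averaging.
        slopes = [ols_slope(stochastic_fill(x_miss, y, rng), y) for _ in range(m)]
        estimates["mi"].append(float(np.mean(slopes)))
    return {k: float(np.mean(v) - true_slope) for k, v in estimates.items()}


if __name__ == "__main__":
    for method, bias in compare_methods().items():
        print(f"{method:>8}: bias = {bias:+.4f}")
```

In this single-predictor setup, mean imputation reproduces the complete-case slope exactly, because the imputed points sit at the predictor mean and contribute nothing to the slope. The sketch therefore illustrates the workflow rather than the study's findings, which concern models fitted to the actual monitoring data and also consider Type I error.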

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Environmental Monitoring / methods*
  • Escherichia coli / growth & development
  • Fresh Water / microbiology*
  • Statistics as Topic
  • Water Microbiology*
  • Water Quality
  • Water Supply / statistics & numerical data