Classification and regression tree (CART) analysis for indicator bacterial concentration prediction for a Californian coastal area

Water Sci Technol. 2010;61(2):545-53. doi: 10.2166/wst.2010.842.

Abstract

The study used existing indicator bacterial data and a number of physicochemical parameters that can be measured instantaneously to determine whether a decision tree approach, specifically classification and regression tree (CART) analysis, could predict bacterial concentrations in a timely manner for beach closure management. Each indicator bacterium produced a different tree structure with its own significant variables: dissolved oxygen played an important role for both total coliform and fecal coliform, and turbidity was the most important factor in predicting enterococci concentrations. The root mean squared error (RMSE) stayed between 5 and 6.5% of the average observed values; the RMSEs from the simulations were 0.25 for total coliform, 0.31 for fecal coliform, and 0.29 for enterococci. Estimates from the tree structures can therefore be regarded as a good representation of the actual data. Beyond the RMSE, which served as the objective function, 77.5% of actual values fell within the 95% confidence interval of the estimates for total coliform concentrations, 60% for fecal coliform concentrations, and 62.5% for enterococci concentrations. The approach gave reliable estimates for the majority of the data processed, although it represented low bacterial concentrations less well.
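The two evaluation measures reported in the abstract, RMSE relative to the mean observation and the share of observations falling inside an interval around the estimates, can be illustrated with a minimal sketch. The function names and the sample numbers are hypothetical and are not taken from the study. The interval bounds are supplied by hand here, because the abstract does not say how the 95% confidence intervals were computed.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(observed, predicted):
    """RMSE expressed as a fraction of the mean observed value."""
    mean_observed = sum(observed) / len(observed)
    return rmse(observed, predicted) / mean_observed


def interval_coverage(observed, lower, upper):
    """Fraction of observations lying inside their [lower, upper] bounds."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical values for illustration only (not data from the study).
observed = [4.0, 5.0, 6.0, 5.0]
predicted = [4.5, 5.0, 5.5, 5.0]

# Assumed interval half-width; a real analysis would derive the bounds
# from the fitted model rather than use a fixed offset.
lower = [p - 0.4 for p in predicted]
upper = [p + 0.4 for p in predicted]

print(f"RMSE: {rmse(observed, predicted):.3f}")
print(f"RMSE as share of mean: {relative_rmse(observed, predicted):.1%}")
print(f"Interval coverage: {interval_coverage(observed, lower, upper):.1%}")
```

Expressing RMSE as a share of the mean observation makes the error comparable across indicators with different typical magnitudes, which is how the abstract frames its 5 to 6.5% figure.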

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / isolation & purification*
  • Bathing Beaches / standards
  • California
  • Decision Trees*
  • Oceans and Seas
  • Seawater / microbiology*
  • Water Microbiology*