Polychlorinated biphenyls (PCBs) are highly toxic environmental pollutants that can accumulate in soils. We consider the problem of explaining and mapping the spatial distribution of PCBs using a spatial data set of 105 PCB-187 measurements from a region in the north of France. A large proportion of our data (35%) fell below a quantification limit (QL), meaning that their concentrations could not be determined with sufficient precision. Where a measurement fell below the QL, the only information available to us was the inequality itself (i.e., that the concentration was less than the QL). In this work, we demonstrate a full geostatistical analysis (bringing together the various components, including model selection, cross-validation, and mapping) that uses censored data to represent the uncertainty that results from below-QL observations. We implement a Monte Carlo maximum likelihood approach to estimate the geostatistical model parameters. To select the best set of explanatory variables for explaining and mapping the spatial distribution of PCB-187 concentrations, we apply the Akaike Information Criterion (AIC). The AIC provides a trade-off between the goodness-of-fit of a model and its complexity (i.e., the number of covariates). We then use the best set of explanatory variables to help interpolate the measurements via a Bayesian approach, and produce maps of the predictions. We also calculate the probability of exceeding a concentration threshold, above which the land could be considered contaminated. The work demonstrates some differences between approaches based on censored data and on imputed data (in which the below-QL data are replaced by a value of half the QL). Cross-validation results show better predictions from the censored data approach, which gives us greater confidence in the information provided by predictions from this method.
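As a rough illustration of the ideas above, the following minimal sketch contrasts a censored-data log-likelihood with the half-QL imputation, and shows how the AIC and an exceedance probability could be computed. It is not the method used in this work: it assumes independent, normally distributed log-concentrations and ignores spatial correlation, which is why no Monte Carlo maximum likelihood step appears. All function names, data values, and parameter values are hypothetical.

```python
import math

def norm_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and sd sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def censored_loglik(log_values, censored, log_ql, mu, sigma):
    """Log-likelihood for independent log-scale data, left-censored at log_ql.

    Quantified observations contribute a density term; below-QL observations
    contribute only the probability of lying below the QL.
    (Assumes the CDF at log_ql is not numerically zero.)
    """
    total = 0.0
    for y, is_censored in zip(log_values, censored):
        if is_censored:
            total += math.log(norm_cdf(log_ql, mu, sigma))
        else:
            total += norm_logpdf(y, mu, sigma)
    return total

def aic(loglik, n_params):
    """Akaike Information Criterion: penalises the number of parameters."""
    return 2.0 * n_params - 2.0 * loglik

def exceedance_probability(log_threshold, mu, sigma):
    """Probability that a log-concentration exceeds log_threshold."""
    return 1.0 - norm_cdf(log_threshold, mu, sigma)

# Hypothetical data: QL = 0.1; below-QL values are stored as QL / 2.
ql = 0.1
concentrations = [0.42, ql / 2, 0.18, ql / 2, 0.27]
censored = [False, True, False, True, False]
log_values = [math.log(c) for c in concentrations]

mu, sigma = -2.0, 1.0  # hypothetical parameter values, not estimates

# Censored approach: below-QL data enter only as "less than QL".
ll_censored = censored_loglik(log_values, censored, math.log(ql), mu, sigma)
# Imputed approach: the QL / 2 values are treated as if they were measured.
ll_imputed = censored_loglik(log_values, [False] * 5, math.log(ql), mu, sigma)

aic_censored = aic(ll_censored, n_params=2)
```

The two log-likelihoods are built from different data models, so their values are not directly comparable; the AIC is meaningful for comparing covariate sets within one approach, while cross-validation is the natural way to compare the censored and imputed approaches.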
Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.