In external quality assessment (EQA) programs, Z-scores are used to evaluate laboratory performance. They should identify poorly performing laboratories regardless of whether outliers are present. Two types of approaches exist for this purpose. The first comprises "outlier-based" approaches (e.g., Grubbs and Dixon), which first exclude outlying values, then calculate the mean and standard deviation of the remaining data, and finally obtain Z-scores for all values. The second comprises "robust" approaches (e.g., Tukey and Qn, or the algorithm recommended by ISO). The approaches were assessed using samples randomly generated from the Normal and Student t distributions, part of which were contaminated with outliers. The numbers of false and true outliers were recorded, and Positive and Negative Predictive Values were derived from them. The sampling mean and variability of the location and scale estimators were also calculated. The approaches performed similarly for sample sizes above 10 and when outliers lay far from the centre. For smaller sample sizes and outliers closer to the centre, however, they performed quite differently. Tukey's method was characterised by a high true and a high false outlier rate, while the ISO and Qn approaches demonstrated weak performance. Overall, Grubbs' test yielded the best results.
Copyright © 2011 Elsevier B.V. All rights reserved.
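The difference between the two types of approach can be illustrated with a minimal Python sketch. It is not the study's implementation, and it rests on several assumptions: the seven results are made up, the outlier test itself is not implemented (the last value is simply assumed to have been flagged by a test such as Grubbs'), and the robust variant uses the median and the scaled median absolute deviation (MAD) as stand-ins rather than the Tukey, Qn or ISO estimators named in the abstract. The cutoff |Z| >= 3 is a common convention and is likewise an assumption here.

```python
import statistics


def z_scores(values, reference=None):
    """Z-scores for all values, using the mean and SD of `reference`.

    With reference=None the estimates come from all values (no exclusion).
    Passing the values that survived an outlier test gives the
    "outlier-based" variant: estimate on the rest, score everything.
    """
    ref = values if reference is None else reference
    mean = statistics.mean(ref)
    sd = statistics.stdev(ref)
    return [(v - mean) / sd for v in values]


def robust_z_scores(values):
    """Z-scores from the median and the scaled MAD.

    Assumption: median/MAD is only a simple stand-in for a robust
    location/scale pair; it is not the Qn or ISO algorithm.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; a robust scale cannot be computed")
    scale = 1.4826 * mad  # makes the MAD consistent with the SD under normality
    return [(v - med) / scale for v in values]


# Hypothetical results from seven laboratories; the last one is aberrant.
results = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 13.0]

# No exclusion: the aberrant value inflates the SD and masks itself.
naive = z_scores(results)

# Outlier-based: assume a test (e.g. Grubbs') flagged the last value.
kept = results[:-1]
outlier_based = z_scores(results, reference=kept)

# Robust: no exclusion step, but estimators that resist the aberrant value.
robust = robust_z_scores(results)

for label, z in (("naive", naive), ("outlier-based", outlier_based), ("robust", robust)):
    print(f"{label:14s} Z of last lab = {z[-1]:.2f}")
```

With these made-up data the aberrant laboratory stays below |Z| = 3 when no value is excluded, while both the outlier-based and the robust variant score it far above that cutoff. With only seven values a single extreme result dominates the standard deviation, which is one reason small samples are the difficult case.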