Consensus analysis has been proposed as a statistical method by which the efficacy of clinical and laboratory tests of inflammatory activity can be assessed. This technique is claimed to overcome the need for an external "gold standard" as a reference method; instead, the consensus of all tests is used as the gold standard. We have evaluated the reliability of consensus analysis using data collected from patients with Crohn's disease. Our results demonstrate that the outcome of the technique depends strongly on the correlation structure underlying the set of disease measures used for analysis. This observation was supported by a series of conventional cluster analyses of the same set of variables. Furthermore, slight modifications of the algorithm had profound effects on the final result. We conclude that for the evaluation of tests of inflammatory activity, an external reference method, albeit an imperfect one, remains indispensable.