Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale

Sarah A White; Nynke R van den Broek

doi:10.1002/sim.1804

Stat Med. 2004 May 30;23(10):1603-19. doi: 10.1002/sim.1804.

Authors

Sarah A White¹, Nynke R van den Broek

Affiliation

¹ Malawi-Liverpool-Wellcome Trust Clinical Research Programme, College of Medicine, University of Malawi, P.O. Box 30096, Blantyre 3, Malawi.

PMID: 15122740
DOI: 10.1002/sim.1804

Abstract

Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) are then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that although the definitions and properties of each of these methods are well established, inappropriate methods continue to be used in the medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale.
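
As a rough illustration of three of the methods named in the abstract (limits of agreement, the unweighted kappa statistic, and sensitivity and specificity), the following minimal Python sketch may help. It is not the authors' analysis. All readings are invented for illustration, the function names are hypothetical, and the 11 g/dl anaemia cut-off is an assumption chosen only to make the example concrete.

```python
from statistics import mean, stdev

def limits_of_agreement(method_a, method_b):
    """Mean difference (bias) and approximate 95% limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters' categorical readings."""
    n = len(rater1)
    categories = set(rater1) | set(rater2)
    p_obs = sum(x == y for x, y in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    p_exp = sum((rater1.count(c) / n) * (rater2.count(c) / n)
                for c in categories)
    return (p_obs - p_exp) / (1 - p_exp)

def sensitivity_specificity(test_positive, truly_positive):
    """Sensitivity and specificity of a binary test against a reference."""
    pairs = list(zip(test_positive, truly_positive))
    tp = sum(t and g for t, g in pairs)
    fn = sum((not t) and g for t, g in pairs)
    tn = sum((not t) and (not g) for t, g in pairs)
    fp = sum(t and (not g) for t, g in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Invented data: laboratory haemoglobin (g/dl) and colour-scale readings.
lab = [5.2, 7.9, 9.4, 10.6, 11.8, 12.5, 13.1, 8.3]
scale_rater1 = [6, 8, 10, 12, 12, 12, 14, 8]
scale_rater2 = [6, 8, 10, 10, 12, 12, 14, 10]

CUTOFF = 11.0  # assumed anaemia threshold in g/dl, for illustration only

bias, lower, upper = limits_of_agreement(scale_rater1, lab)
kappa = cohen_kappa(scale_rater1, scale_rater2)
sens, spec = sensitivity_specificity(
    [s < CUTOFF for s in scale_rater1],
    [h < CUTOFF for h in lab],
)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
print(f"kappa={kappa:.3f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The limits of agreement describe how far an individual colour-scale reading may plausibly lie from the laboratory value, which a correlation coefficient does not. Kappa corrects the raw agreement between two observers for agreement expected by chance. A weighted kappa, which gives partial credit for near misses on an ordered scale, is not shown here.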

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Anemia / diagnosis
Color
Community Health Workers / education
Evaluation Studies as Topic
Hemoglobins / analysis*
Hemoglobins / chemistry
Humans
Reproducibility of Results*
Sensitivity and Specificity
Statistics as Topic / methods*
World Health Organization

Substances

Hemoglobins