Guidelines for measurement validation in clinical trial design

J Biopharm Stat. 1999 Aug;9(3):417-38. doi: 10.1081/BIP-100101185.

Abstract

In designing a clinical trial, the accuracy and precision of an endpoint are critical to obtaining valid results. When an endpoint is created and its validity subsequently tested, it is desirable to show that the endpoint can be measured precisely on repeated measurements, and that it is reproducible not only with itself but also with any "gold standard" that can assess accuracy. Short of having this gold standard, we rely on showing that the endpoint is reliable. Agreement is the extent to which the measurement of the variable of interest yields the same results (consistency) on repeated trials. Thus, the more consistent the results given by repeated measurements, the more reliable the measuring procedure. In this article, we examine the sensitivity to location and scale shift of different methods for assessing agreement. These methods include Pearson's correlation coefficient, the intraclass correlation coefficient, the concordance correlation coefficient, the Bradley-Blackwood procedure, and the within-patient coefficient of variation. Our simulation studies showed that there are situations in which the various methods for assessing agreement give conflicting results. On the basis of these results, there are three important components to consider when deciding whether two sets of measurements are in agreement. The first is the degree of linear relationship between the two sets; the second is the amount of bias, as represented by the difference in the means; and the third is the difference between the two variances. The purpose of this article is to help interpret numerical results from the measures of agreement and to establish a criterion or range of values for each agreement measure that we consider to indicate agreement. We clarify what is meant by agreement by placing the measures in context based on scale and location shifts. Furthermore, we present guidelines for assessing agreement or endpoint validation when there are plans to design a clinical trial, and we give an actual example from a recent study.
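
As a rough illustration of the three components the abstract names (linear relationship, difference in means, difference in variances), the following Python sketch compares Pearson's correlation coefficient with the concordance correlation coefficient on made-up paired data. It is not the authors' simulation code: the function name agreement_components, the data values, and the use of 1/n moments (following Lin's usual definition of the concordance correlation coefficient) are assumptions made for this example.

```python
from math import sqrt


def agreement_components(x, y):
    """Summarise agreement between two paired sets of measurements.

    Returns Pearson's r, the concordance correlation coefficient (CCC),
    the difference in means (location shift), and the ratio of standard
    deviations (scale shift). Hypothetical helper for illustration only.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired, with at least two values")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Population (1/n) moments; this divisor is an assumption of the sketch.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

    if var_x == 0 or var_y == 0:
        raise ValueError("each set of measurements must vary")

    # Pearson's r reflects only the strength of the linear relationship.
    pearson_r = cov_xy / sqrt(var_x * var_y)

    # The CCC also penalises a difference in means and in variances.
    ccc = 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

    return {
        "pearson_r": pearson_r,
        "ccc": ccc,
        "mean_difference": mean_y - mean_x,
        "sd_ratio": sqrt(var_y / var_x),
    }


# Made-up measurements: a reference set and two distorted copies of it.
reference = [10, 12, 14, 16, 18]
location_shifted = [v + 5 for v in reference]    # same spread, higher mean
scale_shifted = [2 * v - 14 for v in reference]  # same mean, doubled spread

for label, other in [
    ("identical", reference),
    ("location shift", location_shifted),
    ("scale shift", scale_shifted),
]:
    result = agreement_components(reference, other)
    print(label, {k: round(v, 3) for k, v in result.items()})
```

On these data Pearson's r equals 1 in all three comparisons, whereas the concordance correlation coefficient falls to about 0.39 under the location shift and to 0.80 under the scale shift. This is one simple case in which two measures of agreement give conflicting impressions of the same pair of measurement sets.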

MeSH terms

  • Biometry / methods
  • Clinical Trials as Topic / methods*
  • Clinical Trials as Topic / standards
  • Computer Simulation
  • Humans
  • Mathematical Computing
  • Models, Statistical*
  • Practice Guidelines as Topic
  • Reproducibility of Results
  • Research Design