Aim: To address a problem in estimating disease frequency with a diagnostic test: the estimate may represent not the actual number of persons with disease in a community, but the number of persons who tested positive. These two values may be very different, and their relationship depends on the properties of the diagnostic test applied and the true prevalence of the disease in the population.
Methods: We defined a new test parameter, the ratio of Test to Actual Positives (TAP), which summarizes the properties of the diagnostic test applied and the true prevalence of the disease in a population, and we propose that it is the most useful summary measure of the potential for bias in disease frequency estimates.
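As an illustration, the TAP ratio can be written out under the usual reading of "test positives" as true positives plus false positives. The symbol P for true prevalence is an assumption introduced here, and the expression is a sketch derived from the definition above rather than a formula quoted from the paper.

```latex
\[
\mathrm{TAP}
  = \frac{\text{test positives}}{\text{actual positives}}
  = \frac{Se \cdot P + (1 - Sp)(1 - P)}{P}
  = Se + (1 - Sp)\,\frac{1 - P}{P}
\]
```

In this form the false-positive term is scaled by (1 - P)/P, which grows rapidly as P approaches zero; this is why Sp dominates the ratio at low prevalence.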
Results: A consideration of the relationship between the sensitivity (Se) and specificity (Sp) of the diagnostic test and the true prevalence of disease in a population can inform study design by highlighting the potential for disease misclassification bias. The effects of a decrease in Sp on the TAP ratio at very low disease prevalence are dramatic: at 80% Sp (and any Se value, including 100%), the measured disease frequency will represent a 25-fold overestimate. At a disease prevalence of 0.10, the Sp needs to be 90% or greater to achieve a TAP ratio of 1.0. At this prevalence, however, unlike at lower levels of disease prevalence, the test Se is also an important determinant of the TAP ratio. A TAP ratio of 1.0 can be achieved by an Sp of 95% and intermediate Se (40%-60%), or by an Sp of 99% and very high Se (over 90%). This illustrates how a test with poor performance characteristics in a clinical setting can perform well in a disease burden study in a population. In circumstances in which the TAP ratio suggests a potential for a large bias, we suggest correction procedures, often counter-intuitive, that limit disease misclassification bias. We also illustrate how these methods can improve the power of intervention studies that define outcomes by use of a diagnostic test.
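The figures quoted for a prevalence of 0.10 can be checked with a short calculation. The sketch below assumes that test positives are the sum of true positives (Se × prevalence) and false positives ((1 − Sp) × (1 − prevalence)); the function names `tap_ratio` and `se_for_unit_tap` are hypothetical and are not taken from the paper.

```python
def tap_ratio(se, sp, prevalence):
    """Ratio of test positives to actual positives (assumed definition)."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    # Test positives = true positives + false positives, as population fractions.
    test_positives = se * prevalence + (1 - sp) * (1 - prevalence)
    return test_positives / prevalence


def se_for_unit_tap(sp, prevalence):
    """Sensitivity at which the TAP ratio equals 1.0 for a given Sp and prevalence."""
    # Solve Se + (1 - Sp) * (1 - P) / P = 1 for Se.
    return 1 - (1 - sp) * (1 - prevalence) / prevalence


if __name__ == "__main__":
    prevalence = 0.10
    for sp in (0.90, 0.95, 0.99):
        se = se_for_unit_tap(sp, prevalence)
        print(f"Sp={sp:.2f}: Se needed for TAP=1 is {se:.2f}")
```

Under these assumptions, at a prevalence of 0.10 the sensitivity needed for a TAP ratio of 1.0 is about 0.55 at 95% Sp and about 0.91 at 99% Sp, in line with the ranges stated above.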
Conclusions: Optimal screening test characteristics for use in a population-based survey are likely to differ from those required when the test is used in a clinical setting. Calibrating the test a priori to bring the TAP ratio closer to unity addresses the potentially large bias in disease burden estimates based on the application of a diagnostic (screening) test.