Purpose: We evaluated the extent to which using a hypothesized imperfect gold standard, the Composite International Diagnostic Interview (CIDI), biases estimates of the diagnostic accuracy of the Patient Health Questionnaire-9 (PHQ-9). We also evaluated how statistical correction can be used to address this bias.
Methods: The study was conducted among 926 adults, with structured interviews used to collect information about participants' current major depressive disorder using the PHQ-9 and CIDI instruments. First, we evaluated the relative psychometric properties of the PHQ-9 using the CIDI as a gold standard. Next, we used a Bayesian latent class model to correct for the bias.
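The direction of the bias that the latent class model targets can be illustrated with a short calculation. The sketch below is not the Bayesian model used in the study; it only shows, under an assumed conditional independence of the two instruments given true disease status, how the accuracy of an index test measured against an imperfect reference differs from its true accuracy. The function name and all numeric inputs are hypothetical and are not taken from the study data.

```python
def apparent_accuracy(prev, se_index, sp_index, se_ref, sp_ref):
    """Sensitivity and specificity of an index test as measured against
    an imperfect reference test.

    Assumption (ours, for illustration): the two tests are conditionally
    independent given the true disease status.
    """
    # Joint probability that both tests are positive
    both_pos = prev * se_index * se_ref + (1 - prev) * (1 - sp_index) * (1 - sp_ref)
    # Marginal probability that the reference is positive
    ref_pos = prev * se_ref + (1 - prev) * (1 - sp_ref)

    # Joint probability that both tests are negative
    both_neg = prev * (1 - se_index) * (1 - se_ref) + (1 - prev) * sp_index * sp_ref
    # Marginal probability that the reference is negative
    ref_neg = prev * (1 - se_ref) + (1 - prev) * sp_ref

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical inputs, chosen only to show the direction of the bias
perfect = apparent_accuracy(prev=0.2, se_index=0.8, sp_index=0.8, se_ref=1.0, sp_ref=1.0)
imperfect = apparent_accuracy(prev=0.2, se_index=0.8, sp_index=0.8, se_ref=0.7, sp_ref=0.9)
print("perfect reference:   se=%.3f sp=%.3f" % perfect)
print("imperfect reference: se=%.3f sp=%.3f" % imperfect)
```

With a perfect reference the apparent values equal the true ones. With the hypothetical imperfect reference, both apparent values fall below the true values, which is the kind of distortion a latent class model attempts to correct by treating true disease status as unobserved.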
Results: In comparison with the CIDI, the relative sensitivity and specificity of the PHQ-9 for detecting major depressive disorder at a cut point of 10 or more were 53.1% (95% confidence interval, 45.4%-60.8%) and 77.5% (95% confidence interval, 74.5%-80.5%), respectively. Using a Bayesian latent class model to correct for the bias arising from the use of an imperfect gold standard increased the estimated sensitivity and specificity of the PHQ-9 to 79.8% (95% Bayesian credible interval, 64.9%-90.8%) and 79.1% (95% Bayesian credible interval, 74.7%-83.7%), respectively.
Conclusions: Our results provide evidence that the diagnostic validity of mental health screening instruments can be assessed with appropriate statistical methods in settings where a gold standard may not be available.
Keywords: Bayesian analysis; Depression; Imperfect gold standard; Screening.
Copyright © 2014 Elsevier Inc. All rights reserved.