Internal structure of the Patient Health Questionnaire-9: A systematic review and meta-analysis

Asian Nurs Res (Korean Soc Nurs Sci). 2024 Dec 24:S1976-1317(24)00161-0. doi: 10.1016/j.anr.2024.12.005. Online ahead of print.

Abstract

Purpose: This review aimed to evaluate the internal structure (structural validity, internal consistency, and measurement invariance) of the Patient Health Questionnaire-9 (PHQ-9), which is one of the most widely used self-administered instruments for assessing and screening depression.

Methods: The updated COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) methodology for a systematic review of self-reported instruments was used. PubMed, Embase, CINAHL, PsycINFO, and Cochrane Library databases were searched from their inception up to February 28, 2023.

Results: This review included 98 psychometric studies, described in 90 reports and conducted in 40 countries. Various versions of the PHQ-9 were identified: one-factor structures (8 types), two-factor structures (10 types), bifactor structures (4 types), a three-factor structure (1 type), and a second-order three-factor structure (1 type). There was sufficient high-quality evidence for structural validity of the one-factor structure with nine items scored using a four-point Likert scale based on confirmatory factor analysis, for internal consistency with a quantitatively pooled Cronbach's alpha of .85, and for measurement invariance across sex, age, education level, marital status, and income groups. There was sufficient high-quality evidence for structural validity, internal consistency (Cronbach's alpha = .76 - .92, omega = .83 - .92), and measurement invariance across sex for the PHQ-8 (which excluded item 9: "suicidality or self-harm thoughts").
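
For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of how Cronbach's alpha is computed for a single sample using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the review's pooling procedure: the function name `cronbach_alpha` and the toy responses are illustrative assumptions, and the data are invented solely to show the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one sample (hypothetical helper, not from the review).

    responses: list of rows, one row per respondent, each row a list of
    item scores with the same number of items.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("all rows must have the same number of items")

    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(col) for col in zip(*responses))
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Invented toy data: 4 respondents x 3 items, each item coded 0-3.
toy_responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 2, 3],
]
print(round(cronbach_alpha(toy_responses), 3))
```

A pooled estimate such as the .85 reported here combines alpha values across many studies with meta-analytic weighting, which this single-sample sketch does not attempt.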

Conclusions: The one-factor PHQ-9 and PHQ-8 (excluding item 9) scored using a four-point Likert scale have the best internal structure based on the current evidence. The one-factor PHQ-9 and PHQ-8 justify the use of aggregated total scores in both practice and research. The total score of the PHQ-9 using a four-point Likert scale can be used to compare depression levels across sex, age, education level, marital status, and income groups due to the availability of sufficient evidence for measurement invariance across these demographic groups.
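
The aggregated total score referred to above can be sketched as a simple sum of item responses. This example assumes the conventional 0 to 3 coding for the four-point response scale, which the abstract does not state explicitly; the function name `phq_total` and its `drop_item9` parameter are hypothetical and serve only to illustrate the PHQ-9 and PHQ-8 variants.

```python
def phq_total(item_scores, drop_item9=False):
    """Sum PHQ item responses into a total score (illustrative sketch).

    item_scores: nine integers in item order, each assumed to be coded 0-3.
    drop_item9: if True, exclude item 9 to give a PHQ-8 total.
    """
    if len(item_scores) != 9:
        raise ValueError("expected responses to all nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be coded 0, 1, 2, or 3")

    # The PHQ-8 uses the first eight items only.
    scores = item_scores[:8] if drop_item9 else item_scores
    return sum(scores)


example = [1, 2, 0, 3, 1, 0, 2, 1, 0]  # invented responses
phq9_score = phq_total(example)
phq8_score = phq_total(example, drop_item9=True)
```

Under this coding the PHQ-9 total ranges from 0 to 27 and the PHQ-8 total from 0 to 24.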

Keywords: depression; questionnaire; reproducibility of results; systematic review.

Publication types

  • Review