Himanshu Agrawal, MBBS; Timothy McAuliffe, PhD; Alexandra Johnson, MD; Quamaine Bond, MD; Leah Flanagan, MD; Rita Sieracki, MLS
WMJ. 2026;125(2):268-272. Published June 2, 2026.
ABSTRACT
Introduction: Developed by Spitzer et al in 1999, the Patient Health Questionnaire (PHQ-9) is a 9-item, self-administered instrument used to screen for depression. This screening tool has been validated by more than 25 studies. However, although the PHQ-9 asks respondents to rate the frequency of symptoms, most validation studies focus on symptom severity. The objective of this study was to assess the construct validity of the PHQ-9 for detecting the severity of depression, as defined by the Diagnostic and Statistical Manual of Mental Disorders.
Methods: In part 1 of this study, 1408 outpatients across 14 family practice clinics were asked whether they scored the PHQ-9 based on symptom frequency, symptom severity, or a combination of both. In part 2, 87 mental health clinicians were asked how they typically interpret PHQ-9 scores.
Results: Of the 87 clinician responses, 79.3% reported interpreting PHQ-9 scores based on severity of depressive symptoms, 4.6% based on frequency, and 16.1% based on a combination of severity and frequency. In striking contrast, among the 1408 patient responses, only 10.7% reported completing the PHQ-9 based solely on severity, and 28.6% reported completing it based on frequency alone. An additional 60.7% reported completing it based on a combination of frequency and severity.
Conclusions: The findings of this study indicate that the language used in the PHQ-9 may be interpreted differently by patients and clinicians. These differences may lead to conflation of symptom frequency and severity, resulting in misinterpretation of PHQ-9 scores. The authors recommend revising the language of this screening tool and evaluating whether such changes improve alignment between patient responses and clinician interpretation.