Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis

Yin Wu; Brooke Levis; Kira E. Riehm; Nazanin Saadat; Alexander W. Levis; Marleine Azar; Danielle B. Rice; Jill Boruff; Pim Cuijpers; Simon Gilbody; John P. A. Ioannidis; Lorie A. Kloda; Dean Mcmillan; Scott B. Patten; Ian Shrier; Roy C. Ziegelstein; Dickens H. Akena; Bruce Arroll; Liat Ayalon; Hamid R. Baradaran; Murray Baron; Charles H. Bombardier; Peter Butterworth; Gregory Carter; Marcos H. Chagas; Juliana C. N. Chan; Rushina Cholera; Yeates Conwell; Janneke M. de Man-van Ginkel; Jesse R. Fann; Felix H. Fischer; Daniel Fung; Bizu Gelaye; Felicity Goodyear-Smith; Catherine G. Greeno; Brian J. Hall; Patricia A. Harrison; Martin Härter; Ulrich Hegerl; Leanne Hides; Stevan E. Hobfoll; Marie Hudson; Thomas Hyphantis; M. D. Inagaki; Nathalie Jetté; Mohammad E. Khamseh; Kim M. Kiely; Yunxin Kwan; Femke Lamers; Shen-Ing Liu; Manote Lotrakul; Sonia R. Loureiro; Bernd Löwe; Anthony Mcguire; Sherina Mohd-Sidik; Tiago N. Munhoz; Kumiko Muramatsu; Flávia L. Osório; Vikram Patel; Brian W. Pence; Philippe Persoons; Angelo Picardi; Katrin Reuter; Alasdair G. Rooney; Iná S. Santos; Juwita Shaaban; Abbey Sidebottom; Adam Simning; M. D. Stafford; Sharon Sung; Pei Lin Lynnette Tan; Alyna Turner; Henk C. van Weert; Jennifer White; Mary A. Whooley; Kirsty Winkley; Mitsuhiko Yamada; Andrea Benedetti; Brett D. Thombs

doi:https://doi.org/10.1017/S0033291719001314

Research output: Contribution to journal › Review article › Academic › peer-review

156 Citations (Scopus)

Abstract

Background: Item 9 of the Patient Health Questionnaire-9 (PHQ-9) queries about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed equivalency of total score correlations and the diagnostic accuracy to detect major depression of the PHQ-8 and PHQ-9.

Methods: We conducted an individual patient data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy.

Results: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01).

Conclusions: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.
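To make the comparison in the abstract concrete, the following minimal Python sketch shows how PHQ-8 and PHQ-9 totals relate and how sensitivity and specificity at a cutoff could be computed for each. It is an illustration only, not the authors' bivariate random-effects model: the function names (`phq_scores`, `sens_spec`) are hypothetical, the six participants are invented toy data rather than study data, and it assumes items are scored 0 to 3 with a score at or above the cutoff counted as a positive screen.

```python
from typing import Sequence, Tuple


def phq_scores(items: Sequence[int]) -> Tuple[int, int]:
    """Return (PHQ-8 total, PHQ-9 total) from nine item responses scored 0-3."""
    if len(items) != 9 or any(not 0 <= x <= 3 for x in items):
        raise ValueError("expected nine item scores in the range 0-3")
    phq9 = sum(items)
    phq8 = phq9 - items[8]  # the PHQ-8 omits Item 9
    return phq8, phq9


def sens_spec(scores: Sequence[int], depressed: Sequence[bool],
              cutoff: int = 10) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is a positive screen.

    Assumes at least one case and one non-case (otherwise division by zero).
    """
    pairs = list(zip(scores, depressed))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: (nine item responses, reference-standard diagnosis).
toy = [
    ([2, 2, 1, 1, 1, 1, 1, 0, 1], True),
    ([3, 3, 2, 2, 2, 1, 1, 1, 0], True),
    ([1, 1, 0, 0, 1, 0, 0, 0, 0], False),
    ([2, 2, 2, 1, 1, 1, 0, 0, 1], False),
    ([0, 0, 0, 0, 0, 0, 0, 0, 0], False),
    ([1, 2, 1, 1, 0, 0, 0, 0, 0], False),
]

totals = [phq_scores(items) for items, _ in toy]
truth = [d for _, d in toy]
phq8_result = sens_spec([t[0] for t in totals], truth)
phq9_result = sens_spec([t[1] for t in totals], truth)
print("PHQ-8 (sensitivity, specificity):", phq8_result)
print("PHQ-9 (sensitivity, specificity):", phq9_result)
```

Because the PHQ-8 total can never exceed the PHQ-9 total, at any fixed cutoff the PHQ-8 can only be equally or less sensitive and equally or more specific, which is the direction of the differences reported above.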

Original language	English
Pages (from-to)	1368-1380
Number of pages	13
Journal	Psychological Medicine
Volume	50
Issue number	8
Early online date	12 Jul 2019
DOIs	https://doi.org/10.1017/S0033291719001314
Publication status	Published - 1 Jun 2020

Keywords

Depression
PHQ-8
PHQ-9
diagnostic accuracy
individual participant data meta-analysis
meta-analysis
screening
systematic review

Access to Document

https://doi.org/10.1017/S0033291719001314

Cite this

Wu, Y., Levis, B., Riehm, K. E., Saadat, N., Levis, A. W., Azar, M., Rice, D. B., Boruff, J., Cuijpers, P., Gilbody, S., Ioannidis, J. P. A., Kloda, L. A., Mcmillan, D., Patten, S. B., Shrier, I., Ziegelstein, R. C., Akena, D. H., Arroll, B., Ayalon, L., ... Thombs, B. D. (2020). Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis. Psychological Medicine, 50(8), 1368-1380. https://doi.org/10.1017/S0033291719001314

@article{deb496f3a37d42fcae129d8e8b7479f9,

title = "Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis",

abstract = "Background: Item 9 of the Patient Health Questionnaire-9 (PHQ-9) queries about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed equivalency of total score correlations and the diagnostic accuracy to detect major depression of the PHQ-8 and PHQ-9. Methods: We conducted an individual patient data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy. Results: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01). Conclusions: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.",

keywords = "Depression, PHQ-8, PHQ-9, diagnostic accuracy, individual participant data meta-analysis, meta-analysis, screening, systematic review",

author = "Yin Wu and Brooke Levis and Riehm, {Kira E.} and Nazanin Saadat and Levis, {Alexander W.} and Marleine Azar and Rice, {Danielle B.} and Jill Boruff and Pim Cuijpers and Simon Gilbody and Ioannidis, {John P. A.} and Kloda, {Lorie A.} and Dean Mcmillan and Patten, {Scott B.} and Ian Shrier and Ziegelstein, {Roy C.} and Akena, {Dickens H.} and Bruce Arroll and Liat Ayalon and Baradaran, {Hamid R.} and Murray Baron and Bombardier, {Charles H.} and Peter Butterworth and Gregory Carter and Chagas, {Marcos H.} and Chan, {Juliana C. N.} and Rushina Cholera and Yeates Conwell and {de Man-van Ginkel}, {Janneke M.} and Fann, {Jesse R.} and Fischer, {Felix H.} and Daniel Fung and Bizu Gelaye and Felicity Goodyear-Smith and Greeno, {Catherine G.} and Hall, {Brian J.} and Harrison, {Patricia A.} and Martin H{\"a}rter and Ulrich Hegerl and Leanne Hides and Hobfoll, {Stevan E.} and Marie Hudson and Thomas Hyphantis and Inagaki, {M. D.} and Nathalie Jett{\'e} and Khamseh, {Mohammad E.} and Kiely, {Kim M.} and Yunxin Kwan and Femke Lamers and Shen-Ing Liu and Manote Lotrakul and Loureiro, {Sonia R.} and Bernd L{\"o}we and Anthony Mcguire and Sherina Mohd-Sidik and Munhoz, {Tiago N.} and Kumiko Muramatsu and Os{\'o}rio, {Fl{\'a}via L.} and Vikram Patel and Pence, {Brian W.} and Philippe Persoons and Angelo Picardi and Katrin Reuter and Rooney, {Alasdair G.} and Santos, {In{\'a} S.} and Juwita Shaaban and Abbey Sidebottom and Adam Simning and Stafford, {M. D.} and Sharon Sung and Tan, {Pei Lin Lynnette} and Alyna Turner and {van Weert}, {Henk C.} and Jennifer White and Whooley, {Mary A.} and Kirsty Winkley and Mitsuhiko Yamada and Andrea Benedetti and Thombs, {Brett D.}",

year = "2020",

month = jun,

day = "1",

doi = "https://doi.org/10.1017/S0033291719001314",

language = "English",

volume = "50",

pages = "1368--1380",

journal = "Psychological Medicine",

issn = "0033-2917",

publisher = "Cambridge University Press",

number = "8",

}

Wu, Y, Levis, B, Riehm, KE, Saadat, N, Levis, AW, Azar, M, Rice, DB, Boruff, J, Cuijpers, P, Gilbody, S, Ioannidis, JPA, Kloda, LA, Mcmillan, D, Patten, SB, Shrier, I, Ziegelstein, RC, Akena, DH, Arroll, B, Ayalon, L, Baradaran, HR, Baron, M, Bombardier, CH, Butterworth, P, Carter, G, Chagas, MH, Chan, JCN, Cholera, R, Conwell, Y, de Man-van Ginkel, JM, Fann, JR, Fischer, FH, Fung, D, Gelaye, B, Goodyear-Smith, F, Greeno, CG, Hall, BJ, Harrison, PA, Härter, M, Hegerl, U, Hides, L, Hobfoll, SE, Hudson, M, Hyphantis, T, Inagaki, MD, Jetté, N, Khamseh, ME, Kiely, KM, Kwan, Y, Lamers, F, Liu, S-I, Lotrakul, M, Loureiro, SR, Löwe, B, Mcguire, A, Mohd-Sidik, S, Munhoz, TN, Muramatsu, K, Osório, FL, Patel, V, Pence, BW, Persoons, P, Picardi, A, Reuter, K, Rooney, AG, Santos, IS, Shaaban, J, Sidebottom, A, Simning, A, Stafford, MD, Sung, S, Tan, PLL, Turner, A, van Weert, HC, White, J, Whooley, MA, Winkley, K, Yamada, M, Benedetti, A & Thombs, BD 2020, 'Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis', Psychological Medicine, vol. 50, no. 8, pp. 1368-1380. https://doi.org/10.1017/S0033291719001314

TY - JOUR

T1 - Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis

AU - Wu, Yin

AU - Levis, Brooke

AU - Riehm, Kira E.

AU - Saadat, Nazanin

AU - Levis, Alexander W.

AU - Azar, Marleine

AU - Rice, Danielle B.

AU - Boruff, Jill

AU - Cuijpers, Pim

AU - Gilbody, Simon

AU - Ioannidis, John P. A.

AU - Kloda, Lorie A.

AU - Mcmillan, Dean

AU - Patten, Scott B.

AU - Shrier, Ian

AU - Ziegelstein, Roy C.

AU - Akena, Dickens H.

AU - Arroll, Bruce

AU - Ayalon, Liat

AU - Baradaran, Hamid R.

AU - Baron, Murray

AU - Bombardier, Charles H.

AU - Butterworth, Peter

AU - Carter, Gregory

AU - Chagas, Marcos H.

AU - Chan, Juliana C. N.

AU - Cholera, Rushina

AU - Conwell, Yeates

AU - de Man-van Ginkel, Janneke M.

AU - Fann, Jesse R.

AU - Fischer, Felix H.

AU - Fung, Daniel

AU - Gelaye, Bizu

AU - Goodyear-Smith, Felicity

AU - Greeno, Catherine G.

AU - Hall, Brian J.

AU - Harrison, Patricia A.

AU - Härter, Martin

AU - Hegerl, Ulrich

AU - Hides, Leanne

AU - Hobfoll, Stevan E.

AU - Hudson, Marie

AU - Hyphantis, Thomas

AU - Inagaki, M. D.

AU - Jetté, Nathalie

AU - Khamseh, Mohammad E.

AU - Kiely, Kim M.

AU - Kwan, Yunxin

AU - Lamers, Femke

AU - Liu, Shen-Ing

AU - Lotrakul, Manote

AU - Loureiro, Sonia R.

AU - Löwe, Bernd

AU - Mcguire, Anthony

AU - Mohd-Sidik, Sherina

AU - Munhoz, Tiago N.

AU - Muramatsu, Kumiko

AU - Osório, Flávia L.

AU - Patel, Vikram

AU - Pence, Brian W.

AU - Persoons, Philippe

AU - Picardi, Angelo

AU - Reuter, Katrin

AU - Rooney, Alasdair G.

AU - Santos, Iná S.

AU - Shaaban, Juwita

AU - Sidebottom, Abbey

AU - Simning, Adam

AU - Stafford, M. D.

AU - Sung, Sharon

AU - Tan, Pei Lin Lynnette

AU - Turner, Alyna

AU - van Weert, Henk C.

AU - White, Jennifer

AU - Whooley, Mary A.

AU - Winkley, Kirsty

AU - Yamada, Mitsuhiko

AU - Benedetti, Andrea

AU - Thombs, Brett D.

PY - 2020/6/1

Y1 - 2020/6/1

AB - Background: Item 9 of the Patient Health Questionnaire-9 (PHQ-9) queries about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed equivalency of total score correlations and the diagnostic accuracy to detect major depression of the PHQ-8 and PHQ-9. Methods: We conducted an individual patient data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy. Results: 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard cutoff score of 10 for the PHQ-9 maximized sensitivity + specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (-0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies (N = 27), with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01). Conclusions: PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.

KW - Depression

KW - PHQ-8

KW - PHQ-9

KW - diagnostic accuracy

KW - individual participant data meta-analysis

KW - meta-analysis

KW - screening

KW - systematic review

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85069052660&origin=inward

UR - https://www.ncbi.nlm.nih.gov/pubmed/31298180

UR - http://www.scopus.com/inward/record.url?scp=85069052660&partnerID=8YFLogxK

U2 - https://doi.org/10.1017/S0033291719001314

DO - https://doi.org/10.1017/S0033291719001314

M3 - Review article

C2 - 31298180

SN - 0033-2917

VL - 50

SP - 1368

EP - 1380

JO - Psychological Medicine

JF - Psychological Medicine

IS - 8

ER -
