PHQ-9

The nine-item Patient Health Questionnaire (PHQ-9) is a depressive symptom scale and diagnostic tool introduced in 2001 to screen adult patients in primary care settings. The instrument assesses the presence and severity of depressive symptoms and flags a possible depressive disorder. The PHQ-9 is a component of the larger self-administered Patient Health Questionnaire (PHQ), but can be used as a stand-alone instrument. The PHQ is part of Pfizer's larger suite of trademarked products, called the Primary Care Evaluation of Mental Disorders (PRIME-MD). The PHQ-9 takes less than three minutes to complete and is scored by summing the scores of the individual items. Each of the nine items reflects a DSM-5 symptom of depression.

History
The PHQ-9 is the nine-item depression scale found in the 59-item PHQ. The PHQ is a self-administered version of the PRIME-MD, a screening tool that assesses 12 mental and emotional health disorders. It has modules on mood (PHQ-9), anxiety, alcohol, eating, and somatoform disorders. Robert L. Spitzer, Janet B.W. Williams, and Kurt Kroenke developed the PHQ in the mid-1990s and the PHQ-9 in 1999 with a grant from Pfizer.

Survey items
A patient may take the PHQ-9 in written form or be presented the survey items in interview form. The PHQ-9 questions reflect the diagnostic criteria for major depressive disorder (MDD) found in the DSM-5. The items ask about the patient's experience in the last two weeks. Questions are about the level of interest/pleasure in doing things (anhedonia), feeling down or depressed, sleep-related problems (sleeping too much/difficulty falling or staying asleep), low energy or fatigue, eating problems (poor appetite or eating too much), self-worth (feeling like a failure), ability to concentrate, psychomotor problems (speaking/moving slowly or fidgety/restless), and thoughts of suicide. Responses range from “0” (not at all) to “3” (nearly every day). A tenth question asks about the extent to which the previously mentioned symptoms make functioning in daily life difficult. The response to the tenth question is not factored into the final score; however, clinicians may use the response to help gauge the patient's level of impairment.

A large study of almost 60,000 participants (29 samples from seven countries, covering five languages) that employed exploratory structural equation modeling bifactor analysis showed that the PHQ-9 is essentially unidimensional; the cognitive-affective and somatic specific factors were relatively weak.
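The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring tool: the function name `phq9_total`, the example responses, and the wording of the tenth-question answer are all hypothetical.

```python
from typing import Sequence


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the nine symptom items; each must be an integer from 0 to 3."""
    if len(item_scores) != 9:
        raise ValueError("expected exactly nine item scores")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"item score out of range: {score!r}")
    return sum(item_scores)


# Hypothetical patient: items 1-9 in questionnaire order.
responses = [2, 3, 1, 2, 0, 1, 1, 0, 0]

# The tenth question is recorded for the clinician but never added to the total.
functional_difficulty = "somewhat difficult"  # illustrative wording only

print(phq9_total(responses))  # 10
```

Keeping the tenth answer in a separate variable mirrors the text: it informs the clinician's view of impairment without affecting the 0 to 27 total.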

Interpretation of results
The total sum of the responses roughly indexes the level of depression. Scores range from 0 to 27. Totals of 5, 10, 15, and 20 are the conventional cutpoints for mild, moderate, moderately severe, and severe depression, and in general a total of 10 or above is suggestive of the presence of depression.

A provisional diagnosis of MDD can be made by using the pattern of responses to PHQ-9 items. According to the DSM-5, MDD is likely if five or more of the nine criterion symptoms are present for “most of the day, nearly every day” over the past two weeks; however, one of the symptoms must be either depressed mood or anhedonia (questions 2 and 1 on the PHQ-9, respectively). Any degree of suicidal thoughts counts toward a provisional diagnosis. The symptoms must also cause significant distress and loss of function.

The PHQ-9 can support only a provisional diagnosis; an actual diagnosis requires a trained clinician, who can determine, for example, whether the symptoms are better explained by substance use or another medical or psychiatric condition. Clinicians may, however, use the PHQ-9 to evaluate the efficacy of treatments for depression. A change of PHQ-9 score to less than 10 is considered a “partial response” to treatment, and a change to less than 5 is considered “remission.”
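The interpretation rules in this section can be expressed as a short Python sketch. All function names are hypothetical. One assumption deserves emphasis: the default `symptom_threshold=2` treats a symptom as present when it is rated "more than half the days" or higher, which is how the commonly published PHQ-9 scoring instructions operationalize the criterion; passing `symptom_threshold=3` gives the stricter "nearly every day" reading. The function checks only the response pattern, so distress, loss of function, and alternative explanations still require clinical judgment.

```python
from typing import Sequence


def screens_positive(total: int, cutoff: int = 10) -> bool:
    """A total at or above the cutoff is suggestive of depression."""
    return total >= cutoff


def provisional_mdd(items: Sequence[int], symptom_threshold: int = 2) -> bool:
    """Pattern-based provisional MDD check (assumed operationalization)."""
    if len(items) != 9:
        raise ValueError("expected exactly nine item scores")
    # Items 1-8 count as present at or above the chosen threshold.
    present = [score >= symptom_threshold for score in items[:8]]
    # Item 9 (suicidal thoughts) counts if endorsed to any degree.
    present.append(items[8] >= 1)
    # At least one core symptom: anhedonia (item 1) or depressed mood (item 2).
    has_core_symptom = present[0] or present[1]
    return has_core_symptom and sum(present) >= 5


def treatment_response(followup_total: int) -> str:
    """Classify a follow-up total using the thresholds given in the text."""
    if followup_total < 5:
        return "remission"
    if followup_total < 10:
        return "partial response"
    return "neither threshold met"  # hypothetical label, not from the source


items = [2, 3, 2, 2, 0, 1, 2, 0, 1]    # hypothetical responses, items 1-9
print(screens_positive(sum(items)))    # True (total is 13)
print(provisional_mdd(items))          # True (six symptoms count, core present)
print(treatment_response(7))           # partial response
```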

Validity and reliability
Kroenke, Spitzer, and Williams conducted validity and reliability research on the PHQ-9 in 2001. With regard to reliability, they found that Cronbach's alpha for the PHQ-9 was 0.89 in a sample comprising 3,000 primary care patients and 0.86 among 3,000 OB-GYN patients. With regard to dimensionality, some research suggests that the scale is not purely unidimensional and instead reflects two latent factors, one somatic and one cognitive/affective. By contrast, the results of the large study by Bianchi et al. (2022) indicate that the PHQ-9's total score is essentially unidimensional.
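For readers unfamiliar with Cronbach's alpha, the following Python sketch shows how the coefficient is computed from item-level responses using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the tiny data set are hypothetical and bear no relation to the samples reported above.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """rows[r][i] is respondent r's score on item i."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents, two items (not real PHQ-9 responses).
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 3))  # 0.667
```

With real PHQ-9 data each row would hold nine item scores; higher alpha indicates that the items vary together and so measure a common construct more consistently.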

The test-retest reliability was found to be excellent. The correlation between PHQ-9 scores obtained from in-person and phone interviews with the same patients was 0.84. The PHQ-9 showed acceptable psychometric properties in a rural Indian population. In general, psychometric research supports the use of total scores, i.e., summing the item scores, in research and practice.

In an assessment of construct validity, Kroenke et al. found that the correlation between the PHQ-9 and the SF-20 mental health scale was 0.73. To assess criterion validity, PHQ-9 scores from 580 participants were compared with depression diagnoses made by a mental health professional, yielding 88% sensitivity and 88% specificity.
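Sensitivity and specificity are computed by comparing screening results against a reference diagnosis. The sketch below illustrates the arithmetic in Python; the function name, the toy scores, and the toy diagnoses are hypothetical, and the default cutoff of 10 is assumed from the interpretation guidance earlier in the article rather than taken from the validation study itself.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[int],
    has_depression: Sequence[bool],
    cutoff: int = 10,
) -> Tuple[float, float]:
    """Compare screen results (score >= cutoff) with reference diagnoses."""
    tp = fn = tn = fp = 0
    for score, depressed in zip(scores, has_depression):
        positive = score >= cutoff
        if depressed and positive:
            tp += 1   # depressed and flagged
        elif depressed:
            fn += 1   # depressed but missed
        elif positive:
            fp += 1   # not depressed but flagged
        else:
            tn += 1   # not depressed and not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only: eight participants, three with a reference diagnosis.
scores = [14, 8, 11, 3, 12, 6, 9, 2]
diagnosed = [True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(scores, diagnosed)
print(round(sens, 2), spec)  # 0.67 0.8
```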

Readability
Preliminary work using gold standard readability measures suggests that a significant minority of patients might find interpretation of the PHQ-9 difficult without support.

Applications
The National Institute for Health and Clinical Excellence endorsed the PHQ-9 for measuring depression severity and responsiveness to treatment in adults in a primary care setting. The Behavioral Risk Factor Surveillance System (BRFSS), the National Health and Nutrition Examination Survey, the Medical Expenditure Panel Survey, the National Epidemiologic Survey on Alcohol and Related Conditions, the Medicare Health Support program, and the Millennium Cohort Study use the full PHQ-9 or a shortened form of it. The Veterans Administration, Department of Defense, and Kaiser Permanente adopted the PHQ-9 as a standard measure for depression screening. The PHQ-9 is also the most commonly used depression measure in the United Kingdom's National Health Service, which requires providers to use a depression screening instrument when treating depression.

Studies have found that the PHQ-9 is also useful for screening for depression in psychiatric clinics. Researchers have used the PHQ-9 to study the mental health of patients with diabetes, HIV/AIDS, chronic pain, arthritis, fibromyalgia, epilepsy, and substance abuse. It is also used in studies involving patients with physical disabilities as well as older adults, students, and adolescents. The PHQ-9 has been extensively used in research investigating the relationship between burnout and depression. The instrument is available in over 30 languages and may be valid for use in different ethnic groups. Pfizer owns the copyright of the PHQ-9 and allows it to be accessed for free.

Related instruments
The PHQ-2 is a shortened version of the PHQ-9. It contains the first two questions of the PHQ-9 and takes less than a minute to administer. A score of 3 or greater on the PHQ-2 will generally lead to the subsequent administration of the PHQ-9. The Veterans Administration uses this method to screen for depression in patients.
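The two-step screening flow described above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that PHQ-2 items use the same 0 to 3 response scale as the PHQ-9 items they are drawn from.

```python
def phq2_score(item1: int, item2: int) -> int:
    """Sum the first two PHQ-9 items, each assumed to be scored 0-3."""
    for score in (item1, item2):
        if score not in (0, 1, 2, 3):
            raise ValueError(f"item score out of range: {score!r}")
    return item1 + item2


def needs_full_phq9(phq2_total: int, cutoff: int = 3) -> bool:
    """A PHQ-2 total at or above the cutoff prompts the full PHQ-9."""
    return phq2_total >= cutoff


total = phq2_score(2, 1)        # hypothetical responses
print(total)                    # 3
print(needs_full_phq9(total))   # True
```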

The PHQ-8 consists of all of the PHQ-9 items except for the last question (suicidal thoughts). The eight-item version of the instrument is commonly used in research on general population samples, which mostly comprise individuals who are not depressed. Researchers generally use the PHQ-8 because timing and resource constraints may leave them unable to intervene with study participants who indicate that they have experienced suicidal thoughts. The absence of the ninth question has little effect on scoring: a study found that scores on the two instruments are highly correlated (r = 0.998).

The PHQ-15 is a 15-item scale derived from the larger PHQ. The PHQ-15 asks about 15 symptoms relating to somatoform disorders. The symptoms covered by the PHQ-15 account for 90% of all symptoms that providers observe in primary care settings. Patients rate the extent to which each symptom bothered them over the last month. Responses range from "not at all" (a score of 0) to "bothered a lot" (a score of 2). Higher scores on the PHQ-15 are strongly associated with functional impairment, disability, and healthcare utilization.

The GAD-7 is a seven-item anxiety screening instrument developed in 2006 with a similar format to that of the PHQ-9. Total scores range from 0 to 21, with scores of 5, 10, and 15 indicating mild, moderate, and severe anxiety. Unlike the PHQ-9, the GAD-7 is used to assess the severity of anxiety only and does not generate provisional diagnoses; a clinical interview must be given to arrive at a clinical diagnosis. The GAD-2 is a two-question shortened version of the GAD-7; it uses the first two items on the GAD-7. A total score of 3 or greater indicates that a clinician should administer the full GAD-7 and conduct a clinical interview to assess the presence and type of anxiety disorder.
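The GAD-7 severity cutpoints given above can be illustrated with a short Python sketch. The function name is hypothetical, the items are assumed to be scored 0 to 3 (which matches the stated 0 to 21 range for seven items), and the label for totals under 5 is an illustrative placeholder because the text does not name that band.

```python
from typing import Sequence, Tuple


def gad7_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Return the GAD-7 total and its severity band."""
    if len(item_scores) != 7:
        raise ValueError("expected exactly seven item scores")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"item score out of range: {score!r}")
    total = sum(item_scores)
    if total >= 15:
        band = "severe"
    elif total >= 10:
        band = "moderate"
    elif total >= 5:
        band = "mild"
    else:
        band = "below mild threshold"  # hypothetical label, not from the source
    return total, band


print(gad7_severity([1, 2, 1, 0, 3, 2, 1]))  # (10, 'moderate')
```

The band is a severity indicator only; as the text notes, it does not amount to a provisional diagnosis.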