Screening accuracy for late-life depression in primary care: a systematic review.


Objective To determine the accuracy of depression screening instruments for older adults in primary care.

Study Design Systematic review

Data Sources MEDLINE, PsycINFO (search dates 1966 to January 2002), and the Cochrane database on depression, anxiety and neurosis. We also searched the second Guide to Clinical Preventive Services, the 1993 Agency for Health Care Policy and Research Clinical Practice Guideline on Depression, and recent systematic reviews. Hand-checking of bibliographies and extensive peer review were also used to identify potential articles.

Outcomes Measured A predefined search strategy targeted only studies of adults aged 65 years or older in primary care or community settings, including long-term care. Articles were included in this review if they reported original data and tested depression screening instruments against a criterion standard, yielding sensitivity and specificity.

Results Eighteen articles met criteria and are included in this review, representing 9 different screening instruments. The most commonly evaluated were the Geriatric Depression Scale (30- and 15-item versions), the Center for Epidemiologic Studies Depression Scale, and the SelfCARE(D). Differences in the performance of these 3 instruments were minimal; sensitivities ranged from 74% to 100% and specificities ranged from 53% to 98%.

Conclusions Accurate and feasible screening instruments are available for detecting late-life depression in primary care. More research is needed to determine the accuracy of depression screening instruments for demented individuals, and for those with subthreshold depressive disorders.


When depression is detected and treated in older patients, not only do symptoms subside, but behavior, cognitive functioning, and overall quality of life improve. (1) We conducted a systematic review to determine the accuracy of instruments for detecting unrecognized late-life depression in the primary care setting. Several instruments are comparable in sensitivity and specificity, though the 15-item Geriatric Depression Scale is particularly useful in the primary care setting.


As part of a broader review for the US Preventive Services Task Force and the Research Triangle Institute-University of North Carolina at Chapel Hill Evidence-Based Practice Center, we prepared a strategy to identify articles relevant to the accuracy of depression screening instruments for older adults in the primary care setting. We searched for articles in MEDLINE, PsycINFO (search dates 1966 to January 2002), and the Cochrane database on depression, anxiety, and neurosis. We also searched the second Guide to Clinical Preventive Services, (2) the 1993 Agency for Health Care Policy and Research (AHCPR) Clinical Practice Guideline on Depression, and recent systematic reviews. (3) We also hand-checked bibliographies and used extensive peer review to identify potential articles.

We used the search terms depression, depressive disorder, mass screening, sensitivity and specificity, reproducibility of results, primary health care, ambulatory care, family practice, and the names of common screening and diagnostic instruments used to detect depression. Our search was limited to English-language texts and to ages greater than 65 years.

Inclusion and exclusion criteria

For inclusion, articles must have reported on depression screening in a primary care population of adults aged greater than 65 years. They must have used a criterion standard as comparison and provided information on diagnostic accuracy (usually sensitivity and specificity). Studies performed in the community and in long-term care settings, but not in psychiatric facilities or clinics, were included.

We excluded studies that extracted briefer instruments from the parent version retrospectively; for example, if an investigator evaluated a 5-item version of the Geriatric Depression Scale (GDS), he or she must have defined the specific questions prior to administering the instrument, rather than extracting the 5 items based on post-hoc analyses.

The criterion standards must have been commonly accepted, structured or semistructured diagnostic interviews or independent evaluations performed by psychiatrists based on the Diagnostic and Statistical Manual of Mental Disorders, revised 3rd or 4th editions (DSM-IIIR, DSM-IV), International Classification of Diseases, 10th ed (ICD-10), or Research Diagnostic Criteria. Our selection criteria are consistent with recognized standards for reviewing diagnostic tests, specifically in eliminating spectrum bias and requiring a criterion standard. (4)
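To make the accuracy measures concrete, the sketch below shows how sensitivity and specificity follow from comparing screening results with a criterion standard. It is a minimal illustration only: the function name and the 8 participants are hypothetical and are not data from any included study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    screen_positive: Sequence[bool],
    criterion_positive: Sequence[bool],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a screen against a criterion standard."""
    if len(screen_positive) != len(criterion_positive):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(screen_positive, criterion_positive))
    tp = sum(1 for s, c in pairs if s and c)          # depressed, screen positive
    fn = sum(1 for s, c in pairs if not s and c)      # depressed, screen negative
    tn = sum(1 for s, c in pairs if not s and not c)  # not depressed, screen negative
    fp = sum(1 for s, c in pairs if s and not c)      # not depressed, screen positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one depressed and one non-depressed participant")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 3 participants depressed by the criterion standard, 5 not.
screen = [True, True, False, True, False, False, True, False]
criterion = [True, True, True, False, False, False, False, False]
sn, sp = sensitivity_specificity(screen, criterion)
print(f"sensitivity={sn:.2f}, specificity={sp:.2f}")
```

In this made-up sample the screen detects 2 of 3 depressed participants and correctly clears 3 of 5 non-depressed participants.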

Review standards

Both authors independently reviewed the abstracts and full articles generated from the searches. Discrepancies about eligibility were resolved by consensus after review of the entire article. For each included study, we extracted information about the screening instrument, the criterion standard, sensitivity and specificity, average age of participants, their dementia status, and the study setting. To further estimate accuracy, we calculated 95% confidence intervals around each measure of sensitivity and specificity. Multiple screening instruments precluded a meaningful meta-analysis of these results.
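The text does not state which method was used for the 95% confidence intervals. As a hedged sketch, the code below computes two common intervals for a proportion such as sensitivity: the normal-approximation (Wald) interval and the Wilson score interval. The function names and the example counts (45 of 50 depressed participants detected) are assumptions for illustration.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wald_ci(successes: int, n: int, z: float = Z_95) -> Tuple[float, float]:
    """Normal-approximation interval; can fall outside [0, 1] near the extremes."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_ci(successes: int, n: int, z: float = Z_95) -> Tuple[float, float]:
    """Wilson score interval; stays within [0, 1]."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: the screen detects 45 of 50 depressed participants.
sn_wald = wald_ci(45, 50)
sn_wilson = wilson_ci(45, 50)
print("Wald:  ", tuple(round(100 * b, 1) for b in sn_wald))
print("Wilson:", tuple(round(100 * b, 1) for b in sn_wilson))
```

The same functions apply to specificity by passing the number of true negatives and the number of non-depressed participants.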


Our initial search strategy yielded 1325 potential articles, 1269 of which could be eliminated by title review. Of the 56 articles remaining, 38 were eliminated after identifying exclusion criteria in the abstract or the manuscript: 17 because there was no criterion standard, 7 because the setting was not appropriate, 8 because the population was not geriatric, and 6 with varying methodologic exclusions. Eighteen articles met our inclusion criteria and specifically examined the performance of depression screening instruments for older adults in primary care (Table 1).
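The selection counts can be checked with simple arithmetic. The sketch below only restates the numbers reported in the paragraph above; the variable names are ours.

```python
# Counts as reported in the text.
identified = 1325
excluded_by_title = 1269
excluded_after_review = {
    "no criterion standard": 17,
    "setting not appropriate": 7,
    "population not geriatric": 8,
    "other methodologic exclusions": 6,
}

remaining_after_title = identified - excluded_by_title
total_excluded_after_review = sum(excluded_after_review.values())
included = remaining_after_title - total_excluded_after_review

print(remaining_after_title, total_excluded_after_review, included)
```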

The included studies were carried out among a wide spectrum of patients, mostly in general practice settings; the exceptions were 1 study in a nursing home and 1 among patients receiving home care. Two studies specifically included patients with dementia. Nine different instruments were used; most had 20 or fewer questions and were relatively easy to administer.

Overall test performance in detecting major depression was similarly favorable among the instruments, with sensitivities ranging from 67% to 100% and specificities ranging from 53% to 98%. All but 2 studies (5,6) reported sensitivity and specificity based on optimal cutpoints determined by post-hoc receiver operating characteristic (ROC) curve analyses, possibly exaggerating test performance in comparison with the studies testing predetermined cutpoints.
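The following sketch illustrates why a post-hoc cutpoint can flatter an instrument: the cutpoint is chosen on the same sample in which accuracy is then reported. We assume here that "optimal" means maximizing Youden's index (sensitivity + specificity - 1); the included studies may have used other criteria. The scores and diagnoses are hypothetical.

```python
from typing import Sequence, Tuple


def accuracy_at_cutpoint(
    scores: Sequence[int], depressed: Sequence[bool], cutpoint: int
) -> Tuple[float, float]:
    """Sensitivity and specificity when the screen is positive for score >= cutpoint."""
    pairs = list(zip(scores, depressed))
    tp = sum(1 for s, d in pairs if d and s >= cutpoint)
    fn = sum(1 for s, d in pairs if d and s < cutpoint)
    tn = sum(1 for s, d in pairs if not d and s < cutpoint)
    fp = sum(1 for s, d in pairs if not d and s >= cutpoint)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutpoint_by_youden(
    scores: Sequence[int], depressed: Sequence[bool]
) -> Tuple[int, float, float]:
    """Scan every observed score and keep the cutpoint with the largest Youden index."""
    best = None
    for cutpoint in sorted(set(scores)):
        sn, sp = accuracy_at_cutpoint(scores, depressed, cutpoint)
        youden = sn + sp - 1
        if best is None or youden > best[3]:
            best = (cutpoint, sn, sp, youden)
    return best[0], best[1], best[2]


# Hypothetical screening scores and criterion-standard diagnoses.
scores = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
depressed = [False, False, False, False, False, True,
             False, True, True, False, True, True]

cutpoint, sn, sp = best_cutpoint_by_youden(scores, depressed)
print(cutpoint, round(sn, 2), round(sp, 2))
```

Because the cutpoint is tuned to this sample, the same cutpoint applied to a new sample would usually perform somewhat worse, which is the concern raised above.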

Five studies (6,11,19,22,23) explicitly stated that interviewers performing the criterion standard exam were blinded to the results of the screening test; the remainder did not report on blinding, although in most cases blinding was implied by the use of a second "independent" rater.

Geriatric Depression Scale. The GDS, the Center for Epidemiologic Studies Depression scale (CES-D), and the SelfCARE(D) were the most-evaluated screening instruments. The GDS has both a 30- and 15-item version and was designed in a yes/no format for self- or caregiver administration, making it easy to use. It minimizes questions about somatic and vegetative symptoms, which can overlap with symptoms of concurrent medical illness.

The GDS has been validated repeatedly in psychiatric settings. (23-27) Nine studies (5-10, 12) evaluated its use in elderly primary care patients, most using the 15-item version and a cutpoint of 3 to 5. Sensitivity ranged from 79% to 100% and specificity from 67% to 80%.

Center for Epidemiologic Studies Depression Scale. The CES-D can be self-administered. It lists 20 statements addressing depressive symptoms over the last week, asking the participant to rank the frequency of these feelings from "rarely" to "most of the time." Its psychometric properties have been consistently strong in younger adults in the community.

In the 5 studies (13-16) that evaluated this instrument, cutpoints varied from 9 to 21. The resultant sensitivities were 75% to 93%, with specificities of 73% to 87%. One study (16) also specifically evaluated the performance of the CES-D in mildly demented subjects with an average Mini-Mental State Examination (MMSE) score of 19, and showed test characteristics similar to those in patients without dementia. This instrument was perceived as generally easy to administer, except in a nursing-home population where the questions had to be repeated multiple times.

Papassotiropoulos et al (17) used the CES-D and the General Health Questionnaire (GHQ) to identify subthreshold depression in a community sample in Greece. They defined subthreshold depression as fewer than 5 depressive symptoms in a 2-week period; brief, monthly depressive symptoms not occurring for a 2-week duration; and any significant single depressive symptom not specified by duration or frequency. Accuracy was poor for delineating these syndromes, with sensitivities below 50% and specificities of 75% and 72%, respectively.

Lyness and colleagues (15) used the CES-D, as well as the GDS-15, to identify minor depression in their cohort. They defined minor depression as having sad mood or loss of interest and at least 2, but fewer than 5, additional depressive symptoms within a 2-week period. The CES-D revealed a sensitivity of 40% and specificity of 82% for detecting minor depression, while the GDS-15 had a sensitivity and specificity of 70% and 80%, respectively.

SelfCARE(D). The SelfCARE(D) is a self-administered instrument that requests responses to 12 items on a Likert scale, reflecting depressive symptoms over the last month. It was derived from a larger, previously validated instrument used in England. (18)

In 1 of 3 included studies, Bird and colleagues (18) reported the original results in a 1987 outpatient sample, showing a sensitivity of 77% and specificity of 98%, with a cutpoint of 5. Since then it has been validated again in general practice and in home care. (19,20) Both studies revealed sensitivities in the 90% range, but the specificity in home care was 53% vs 86% in general practice.

Caribbean Culture-Specific Screen. In an effort to address the potential cultural limitations of common instruments, Rait and colleagues (11) tested the Caribbean Culture-Specific Screen (CCSS) in the growing contingent of Caribbeans of African descent in the United Kingdom. They found that it performed well, but not better than the Brief Assessment Schedule Depression Cards or the GDS-15. Each had a sensitivity of 92%, with specificities ranging from 71% to 84%.

Similarly, Abas et al (12) tested the CCSS and the GDS-15 in an African-Caribbean population, reporting sensitivities of 82% for both instruments, and specificities of 68% for the CCSS and 82% for the GDS-15.

Cornell Scale for Depression in Dementia. Dementia poses barriers to effective screening for depression because cognitive impairment limits self-report. The Cornell Scale for Depression in Dementia (CSDD) was specifically designed for this population and calls for the clinician to use both patient and caregiver information to complete the screen.

The CSDD is categorized by questions on mood, behavior, physical signs, diurnal patterns, and ideational disturbances. Each item is on a 3-point scale for a possible total score of 38, with higher scores indicating more depression. Most data generated about the CSDD have come from hospitalized patients, in whom it has demonstrated acceptable validity and reliability in demented and nondemented patients. (19-31)
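As a rough sketch of how such a total is tallied, the code below sums item scores and applies a cutpoint. Several details are assumptions rather than statements from the source: that the 3-point scale is coded 0, 1, 2; that there are 19 items (implied by the stated maximum of 38); and that a score at or above the cutpoint of 7 listed in Table 1 counts as a positive screen.

```python
from typing import Sequence

N_ITEMS = 19              # assumption: implied by a maximum total of 38 on a 0-2 scale
VALID_SCORES = (0, 1, 2)  # assumption: 3-point scale coded 0, 1, 2


def csdd_total(item_scores: Sequence[int]) -> int:
    """Sum the item scores after checking their number and range."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    for score in item_scores:
        if score not in VALID_SCORES:
            raise ValueError(f"invalid item score: {score}")
    return sum(item_scores)


def csdd_screen_positive(item_scores: Sequence[int], cutpoint: int = 7) -> bool:
    """Positive when the total is at or above the cutpoint (direction is assumed)."""
    return csdd_total(item_scores) >= cutpoint


# Hypothetical ratings: 7 items rated 1, the remaining 12 rated 0.
ratings = [1] * 7 + [0] * 12
print(csdd_total(ratings), csdd_screen_positive(ratings))
```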

We identified 1 study evaluating the CSDD that met our criteria. Vida et al (22) screened outpatients from a family medicine clinic and found a sensitivity of 90% and specificity of 75% for detecting major depression.

Other instruments. Several very brief instruments have been validated in psychiatric or hospital settings where the prevalence of depressive symptoms is often high, (32,33) but few have been tested in older primary care patients. Howe et al (34) attempted to validate a 1-question screen (MHI-1) derived from the mental health component of the SF-36, asking elderly participants, "in the past month, how much of the time have you felt downhearted or sad?" (1=none, 6=all the time). They showed that as a "stand alone" screen, the MHI-1 did not perform well in the primary care setting, with a sensitivity of 67% and a specificity of 60%.


Our systematic review shows that several instruments demonstrate good accuracy for detecting late-life major depression in primary care. The GDS, CES-D, and SelfCARE(D) have comparable sensitivities and specificities. The CES-D and CSDD have similarly favorable accuracy in demented patients with an average MMSE score of 19.

A 1-question screen shows poor results, as do studies using the GHQ, CES-D, and GDS-15 to detect nonmajor depression. Finally, 2 studies demonstrate that a culturally specific screen in African-Caribbeans performs well, but no better than the GDS.

The GDS has longstanding success in identifying major depression in psychiatric and hospital settings and now demonstrates accuracy in primary care, where the 15-item version in its yes/no self-administered format represents a realistic tool for use in the community or the clinic.

With a record of successful use in general adult research, the CES-D also has the benefit of a known track record and relative ease of administration. Evidence from this review suggests that it can be extended to the older primary care population. The SelfCARE(D) is comparably accurate in general practice, but has lower specificity in home care.

Our review highlights the need to further investigate the accuracy of screening tools for depression in patients with dementia, specifically where cognitive impairment may be severe. Using the CSDD, an instrument specifically designed for patients with dementia, Vida et al (22) found good accuracy for detecting depression; however, they studied patients with relatively mild dementia. The prevalence of depression in dementia is 15% to 40%. (35) Given the increasing incidence of dementia in our aging population, the availability of accurate screening tools that specifically account for the coexistence of these 2 common disorders is important.

This review also reveals a lack of screening accuracy for nonmajor depressive disorders using 3 common instruments. Lyness and colleagues (36) showed that there is considerable functional disability in subsyndromal depression, which is more prevalent than major depression. Others show similar findings, supporting the significant morbidity caused by depressive symptoms not severe enough to cross threshold for a major disorder. (37,38) As the characterization of nonmajor depressive disorders evolves, screening instruments should be developed and validated specifically for these syndromes. (39)

Late-life depressive disorders have a convincing burden of suffering, often go undetected, and have known effective treatments. (40) Our systematic review reveals that accurate screening instruments are available to detect major depression in older primary care patients. Based on format and length (Table 2), several could easily be self-administered or administered by nonclinicians in the waiting room. We recommend the 15-item GDS (Figure) because of its yes/no format and ease of scoring. Future work should include tests of depression screening accuracy for demented populations and for nonmajor depressive disorders. Investigators should also evaluate the accuracy of very short instruments, such as the 5-item version of the GDS, (10) in the primary care setting. Acceptable administration times and ease of use are likely to determine the realistic application of proven instruments.

Table 1

Articles relevant to late-life depression screening

      Author                 Test/              Criterion       Avg.
                            cutpoint            standard         age

D'Ath et al (5)             GDS-15/5           GMS/AGECAT         74
Gerety et al (6)             GDS/11              SCID *           79
Neal and
Baldwin (7)                  GDS/11            GMS/AGECAT         77
Van Marjwick
et al (8)                    GDS/7                 DIS            74
Arthur et al (9)            GDS-15/3             ICD-10           80
Hoyl et al (10)             GDS-15/5              SCID            75
Rait et al (11)             GDS-15/4          GMS/AGECAT *       >60
Abas et al (12)             GDS-15/5           GMS/AGECAT        >60
Beekman et al (13)          CES-D/20               DIS          55-82
Lewisohn et al (14)         CES-D/12          RDC, DSM-IIIR       64
Lyness et al (15)           CES-D/21              SCID            71
                       --Major depression
                       --Minor depression
                       --Major depression
                       --Minor depression
Papassotiro-                CES-D/8               CIDI           >60
poulos et al (16)          (demented
                            CES-D/9               CIDI           >60
Papassotiro-                GHQ-12/0         CIDI, DSM-IIIR;     >60
poulos et al (17)         Subthreshold        not reported
Bird et al (18)          SelfCARE(D)/5         Independent        73
                                              assessment *
Upadhyaya                SelfCARE(D)/5         GMS/AGECAT*        71
and Stanley (19)
Banerjee et al (20)      SelfCARE(D)/8         GMS/AGECAT        >65
Howe et al (21)             MHI-1/2            GMS/AGECAT         81
Vida et al (22)             Cornell               RDC *           72

      Author                 Test/
                            cutpoint           Dementia

D'Ath et al (5)             GDS-15/5          Not tested
Gerety et al (6)             GDS/11            Avg MMSE
                                             23 (SD 4.7)
Neal and
Baldwin (7)                  GDS/11           Not tested
Van Marjwick
et al (8)                    GDS/7            Mild/none
Arthur et al (9)            GDS-15/3             None
Hoyl et al (10)             GDS-15/5           Avg MMSE
                                             27 (SD 2.6)
Rait et al (11)             GDS-15/4          Not tested
Abas et al (12)             GDS-15/5          Avg. MMSE
                                             24 (SD 4.6)
Beekman et al (13)          CES-D/20             None
Lewisohn et al (14)         CES-D/12         Not reported
Lyness et al (15)           CES-D/21          Not tested
                       --Major depression
                       --Minor depression
                       --Major depression
                       --Minor depression
Papassotiro-                CES-D/8            Avg MMSE
poulos et al (16)          (demented         27 (SD 6.0)
                            CES-D/9          Avg MMSE 19
                           (demented         (SD 5.5) in
                           excluded)           demented
Papassotiro-                GHQ-12/0          Avg. MMSE
poulos et al (17)         Subthreshold       28 (SD 2.0)
Bird et al (18)          SelfCARE(D)/5        Not tested
Upadhyaya                SelfCARE(D)/5        Not tested
and Stanley (19)
Banerjee et al (20)      SelfCARE(D)/8        Not tested
Howe et al (21)             MHI-1/2            Excluded
Vida et al (22)             Cornell            Avg MMSE
                            Screen/7         19 (SD 7.8)

      Author                 Test/              Sn (%)
                            cutpoint           (95% CI)

D'Ath et al (5)             GDS-15/5          91 (86-96)
Gerety et al (6)             GDS/11           89 (72-96)
                            CES-D/16          74 (55-86)
Neal and
Baldwin (7)                  GDS/11           83 (72-94)
Van Marjwick
et al (8)                    GDS/7            79 (76-82)
Arthur et al (9)            GDS-15/3         100 (98-102)
Hoyl et al (10)             GDS-15/5          94 (89-99)
Rait et al (11)             GDS-15/4          92 (64-100)
                           BASEDEC/6          92 (64-100)
                             CCSS/6           92 (64-100)
Abas et al (12)             GDS-15/5          82 (62-92)
                             CCSS/5           82 (62-92)
Beekman et al (13)          CES-D/20          93 (91-95)
Lewisohn et al (14)         CES-D/12          76 (73-79)
Lyness et al (15)           CES-D/21              --
                       --Major depression     92 (87-97)
                       --Minor depression     40 (32-48)
                       --Major depression    100 (98-102)
                       --Minor depression     70 (62-78)
Papassotiro-                CES-D/8           75 (70-80)
poulos et al (16)          (demented
                            CES-D/9           75 (70-80)
                           (demented          75 (70-80)
Papassotiro-                GHQ-12/0          46 (40-52)
poulos et al (17)         Subthreshold
                            CES-D/9           39 (33-45)
Bird et al (18)          SelfCARE(D)/5        77 (67-87)
Upadhyaya                SelfCARE(D)/5        95 (90-100)
and Stanley (19)                              74 (55-86)
Banerjee et al (20)      SelfCARE(D)/8        90 (86-94)
Howe et al (21)             MHI-1/2           67 (58-76)
Vida et al (22)             Cornell           90 (80-100)

      Author                 Test/              Sp (%)
                            cutpoint           (95% CI)

D'Ath et al (5)             GDS-15/5         72 (66-78)
Gerety et al (6)             GDS/11          68 (58-77)
                            CES-D/16         70 (60-79)
Neal and
Baldwin (7)                  GDS/11          80 (68-92)
Van Marjwick
et al (8)                    GDS/7           67 (63-71)
Arthur et al (9)            GDS-15/3         72 (67-77)
Hoyl et al (10)             GDS-15/5         82 (73-91)
Rait et al (11)             GDS-15/4         71 (63-79)
                           BASEDEC/6         84 (78-91)
                             CCSS/6          79 (71-86)
Abas et al (12)             GDS-15/5         82 (62-92)
                             CCSS/5          68 (54-79)
Beekman et al (13)          CES-D/20         73 (69-77)
Lewisohn et al (14)         CES-D/12         77 (74-80)
Lyness et al (15)           CES-D/21             --
                       --Major depression    87 (81-93)
                       --Minor depression    82 (75-89)
                       --Major depression    84 (78-90)
                       --Minor depression    80 (73-87)
Papassotiro-                CES-D/8          74 (67-81)
poulos et al (16)          (demented
                            CES-D/9          72 (67-77)
                           (demented         72 (67-77)
Papassotiro-                GHQ-12/0         72 (67-77)
poulos et al (17)         Subthreshold
                            CES-D/9          75 (70-80)
Bird et al (18)          SelfCARE(D)/5       98 (95-101)
Upadhyaya                SelfCARE(D)/5       86 (78-94)
and Stanley (19)                             70 (60-79)
Banerjee et al (20)      SelfCARE(D)/8       53 (46-60)
Howe et al (21)             MHI-1/2          60 (50-70)
Vida et al (22)             Cornell          75 (60-90)

* Criterion standard interviewers in these studies were blinded to the screening results; the other studies did not report on blinding.

Sn, sensitivity; Sp, specificity; CI, confidence interval; GDS,
Geriatric Depression Scale, 30-item; GDS-15, Geriatric Depression
Scale, 15-item; GHQ, General Health Questionnaire; DIS, Diagnostic
Interview Schedule; BASEDEC, Brief Assessment Schedule Depression
Cards; CES-D, Center for Epidemiologic Studies Depression Scale; MHI-1,
single question from the Mental Health Inventory ["in the past month,
how much of the time have you felt downhearted or sad?" (1: none, 6:
all the time)]; GMS/AGECAT, Geriatric Mental State with AGECAT computer
program; CIDI, Composite International Diagnostic Interview; SCID,
Structured Clinical Interview for DSM-IIIR; CCSS, Caribbean
Culture-Specific Screen; RDC, Research Diagnostic Criteria; DSM-IIIR,
Diagnostic and Statistical Manual of Mental Disorders, 3rd ed rev;
MMSE, Mini-Mental State Examination; ICD-10, International
Classification of Diseases, 10th ed.

Table 2

Selected screening instruments and their characteristics

                                                     Time to
Instrument    Format                       Items   administer    Sn (%)    Sp (%)

GDS-15        Yes/no questions about        15     2-3 minutes   82-100    72-82
              current symptoms
CES-D         Rates frequency of            20     2-3 minutes   74-93     70-87
              selected symptoms
              over last week
SelfCARE(D)   Multiple choice responses     12     2-3 minutes   77-95     53-98
              regarding symptoms
              over last month

GDS-15, Geriatric Depression Scale, 15-item; CES-D, Center for
Epidemiologic Studies Depression Scale; Sn, sensitivity; Sp, specificity.
Sensitivity and specificity values represent the range reported from
the eligible studies in our review.


Geriatric Depression Scale, 15-item

Choose the best answer for how you have felt over the past week:

 1. Are you basically satisfied with your life?            Yes     No#
 2. Have you dropped many of your activities and
    interests?                                             Yes#    No
 3. Do you feel that your life is empty?                   Yes#    No
 4. Do you often get bored?                                Yes#    No
 5. Are you in good spirits most of the time?              Yes     No#
 6. Are you afraid that something bad is going to
    happen to you?                                         Yes#    No
 7. Do you feel happy most of the time?                    Yes     No#
 8. Do you often feel helpless?                            Yes#    No
 9. Do you prefer to stay at home, rather than going
    out and doing new things?                              Yes#    No
10. Do you feel you have more problems with memory than
    most?                                                  Yes#    No
11. Do you think it is wonderful to be alive now?          Yes     No#
12. Do you feel pretty worthless the way you are now?      Yes#    No
13. Do you feel full of energy?                            Yes     No#
14. Do you feel that your situation is hopeless?           Yes#    No
15. Do you think that most people are better off than
    you are?                                               Yes#    No

Answers in bold indicate depression. Although sensitivities and
specificities have differed across studies, for clinical purposes a
score of more than 5 bold answers suggests depression and warrants a
follow-up interview.

This instrument, and other versions of the GDS in multiple
translations, are in the public domain and can be found at:

Note: Depression indicated with #.
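Scoring the 15-item GDS can be sketched in a few lines. The answer key below follows the standard GDS-15 scoring, in which "yes" is the depressive answer for items 2, 3, 4, 6, 8, 9, 10, 12, 14, and 15 and "no" is the depressive answer for the rest. The function names are ours, and the default threshold (more than 5 depressive answers) is taken from the note above; the studies in Table 1 used cutpoints of 3 to 5.

```python
from typing import Sequence

# Items for which "yes" is the depressive answer; "no" is depressive for the others.
YES_IS_DEPRESSIVE = {2, 3, 4, 6, 8, 9, 10, 12, 14, 15}


def gds15_score(answers: Sequence[bool]) -> int:
    """Count depressive answers; answers[i] is True for "yes" to item i + 1."""
    if len(answers) != 15:
        raise ValueError("the GDS-15 has exactly 15 items")
    score = 0
    for item, said_yes in enumerate(answers, start=1):
        depressive = said_yes if item in YES_IS_DEPRESSIVE else not said_yes
        score += int(depressive)
    return score


def gds15_suggests_depression(answers: Sequence[bool], threshold: int = 5) -> bool:
    """True when the score exceeds the threshold (default: more than 5)."""
    return gds15_score(answers) > threshold


# Hypothetical respondent who answers "yes" to every item.
all_yes = [True] * 15
print(gds15_score(all_yes), gds15_suggests_depression(all_yes))
```

A positive result from this tally is only a prompt for a follow-up interview, as the note above states.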


Lea C. Watson, MD, MPH, Program of Geriatric Psychiatry, Department of Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC

Michael R Pignone, MD, MPH, Division of General Medicine, Department of Medicine, University of North Carolina at Chapel Hill; Research Triangle Institute-University of North Carolina Evidence-based Practice Center

Corresponding author: Lea C. Watson, MD, MPH, Geriatric Psychiatry, Box 3903, Duke University Medical Center, Durham, NC 27710.
