ODSIS (English)

Psychological Assessment, 2014, Vol. 26, No. 3, 815-830
© 2014 American Psychological Association. 1040-3590/14/$12.00. DOI: 10.1037/a0036216
Development and Validation of the Overall Depression Severity and Impairment Scale
Kate H. Bentley
Center for Anxiety and Related Disorders at Boston University

Matthew W. Gallagher
National Center for PTSD, VA Boston Healthcare System, Boston, Massachusetts, and Boston University School of Medicine

Jenna R. Carl and David H. Barlow
Center for Anxiety and Related Disorders at Boston University

The need to capture severity and impairment of depressive symptomatology is widespread. Existing
depression scales are lengthy and largely focus on individual symptoms rather than resulting impairment.
The Overall Depression Severity and Impairment Scale (ODSIS) is a 5-item, continuous measure
designed for use across heterogeneous mood disorders and with subthreshold depressive symptoms. This
study examined the psychometric properties of the ODSIS in outpatients in a clinic for emotional
disorders (N = 100), undergraduate students (N = 566), and community-based adults (N = 189). Internal
consistency, latent structure, item response theory, classification accuracy, convergent and discriminant
validity, and differential item functioning analyses were conducted. ODSIS scores exhibited excellent
internal consistency, and confirmatory factor analyses supported a unidimensional structure. Item
response theory results demonstrated that the ODSIS provides more information about individuals with
high levels of depression than those with low levels of depression. Responses on the ODSIS discriminated
well between individuals with and without a mood disorder and depression-related severity across
clinical and subclinical levels. A cut score of 8 correctly classified 82% of outpatients as with or without
a mood disorder; it evidenced a favorable balance of sensitivity and specificity and of positive and
negative predictive values. The ODSIS demonstrated good convergent and discriminant validity, and
results indicate that items function similarly across clinical and nonclinical samples. Overall, findings
suggest that the ODSIS is a valid tool for measuring depression-related severity and impairment. The
brevity and ease of use of the ODSIS support its utility for screening and monitoring treatment response
across a variety of settings.
Keywords: depression, screening, psychometrics, item response theory

This article was published Online First April 7, 2014.

Kate H. Bentley, Center for Anxiety and Related Disorders at Boston University; Matthew W. Gallagher, National Center for PTSD, VA Boston Healthcare System, Boston, Massachusetts, and Boston University School of Medicine; Jenna R. Carl and David H. Barlow, Center for Anxiety and Related Disorders at Boston University.

None of the authors report any financial interests or potential conflicts of interest.

Correspondence concerning this article should be addressed to Kate H. Bentley, Center for Anxiety and Related Disorders at Boston University, 648 Beacon Street, 6th Floor, Boston, MA 02215. E-mail: khb@bu.edu

Mood disorders are common, with lifetime prevalence rates estimated at 21% of the United States population (Kessler, Berglund, Demler, Jin, & Walters, 2005). Among the mood and anxiety disorders, the most prevalent lifetime syndrome is a major depressive episode (Kessler, Petukhova, Sampson, Zaslavsky, & Wittchen, 2012). Mood disorders and subsyndromal depressive symptoms are associated with significant functional impairment (e.g., Kessler et al., 2003; Zimmerman et al., 2011). Depression is also associated with high comorbidity rates (Kessler et al., 2005). In fact, one recent study indicated that nearly 80% of a large, psychiatric outpatient sample presented with a clinically significant depressive disorder (Zimmerman, Young, Chelminski, Dalrymple, & Galione, 2012). In light of evidence that depression is pervasive, associated with significant distress and impairment, and frequently co-occurs with other disorders, there is a need to capture the severity and impairment of depressive symptomatology that characterizes single, co-occurring, and subsyndromal mood disorders.

The case was recently made for the routine assessment of depression in mental health settings to facilitate measurement-based care. According to Zimmerman et al. (2012), levels of depression and anxiety should be considered “psychiatric vital signs” and be determined with regularity in all clinical practice. A variety of well-validated tools exist to assess depression during the clinical encounter, including the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996), Patient Health Questionnaire-9 (PHQ-9; Kroenke, Spitzer, & Williams, 2001), and Center for Epidemiological Studies Depression Scale (CES-D; Radloff, 1977). However, these scales focus on assessing individual depressive symptoms, rather than resulting functional and behavioral impairment, and thus may be less useful for measuring the overall impact of treatment. For example, there is research to suggest that the BDI (Beck & Steer, 1987) and
BDI-II, perhaps the most commonly used self-report measures in depression research, are not sensitive to change (e.g., Brouwer, Meijer, & Zevalkink, 2013; Edwards et al., 1984; Sayer et al., 1993; Schniebel et al., 2012) and therefore may be unable to capture subtle differences in depression-related distress and impairment over time. In a similar vein, one recent study (Zimmerman et al., 2011) showed that the Remission from Depression Questionnaire (RDQ; Zimmerman et al., 2013), a 42-item measure designed to capture a broad array of domains relevant to remission from depression (e.g., life satisfaction, functioning), was rated by patients as a more accurate, preferable indicator of treatment outcome than an exclusively symptom-based measure, the Quick Inventory of Depressive Symptoms (QIDS; Rush et al., 2003). These findings suggest that purely symptom measures often used to define “remission” or “responder” status may not adequately reflect areas important to determining the effects of treatment; namely, the impact of remaining (subsyndromal) symptoms on functioning.

The recently developed Clinically Useful Depression Outcome Scale (CUDOS; Zimmerman, Chelminski, McGlinchey, & Posternak, 2008) is promising in that it evaluates both individual depressive symptoms and associated impairment. However, the 18-item CUDOS may be burdensome for some providers to administer, particularly when depression is not readily apparent or is not the main reason for patients seeking treatment. The CUDOS also measures impairment with a single item asking respondents how much depressive symptoms have interfered “in your life,” which may limit the degree to which interference across several life domains (e.g., work, school, social life, relationships) is considered. As is true for items on several other widely used depression scales, the symptom-based items of the CUDOS are also directly tied to the diagnostic criteria for major depressive disorder (MDD) and thus may not adequately assess key criteria of other mood disorders in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013; e.g., persistent depressive disorder, premenstrual dysphoric disorder, depressive disorder not elsewhere classified) and subsyndromal symptoms. In light of evidence that subsyndromal depressive symptoms are associated with increased risk of relapse, higher morbidity, and greater functional impairment (e.g., Judd, Akiskal, & Paulus, 1997; Paykel et al., 1995; Wells et al., 1989), the need to assess subthreshold symptomatology is readily apparent. Thus, shorter instruments that specifically target both severity and impairment of heterogeneous depressive symptomatology may prove particularly useful for purposes of screening and ongoing assessment in many clinical settings.

An efficient, yet still psychometrically sound, measure to capture depression-related severity and impairment may hold significant benefits across research, screening, and other treatment settings. With regard to anxiety (the other “psychiatric vital sign”), such a tool already exists and may inform the development of a similar instrument for depression. The Overall Anxiety Severity and Impairment Scale (OASIS; Norman, Cissell, Means-Christensen, & Stein, 2006) is a five-item, continuous measure that captures frequency and intensity of anxiety symptoms, behavioral avoidance, and functional impairment associated with anxiety symptoms. Strengths of the OASIS include its brevity and applicability to any anxiety disorder and subsyndromal anxiety symptoms. Scores on the OASIS have also demonstrated strong psychometric properties in cross-sectional evaluations across clinical and nonclinical samples (Campbell-Sills et al., 2009; Norman et al., 2006, 2011). Further, a recent study using a female sample who completed PTSD treatment showed that the OASIS also possesses strong sensitivity to change (Norman et al., 2013). This measure is applicable in a variety of clinical settings (e.g., primary care) and research studies (e.g., treatment outcome, population-based surveys).

In order to assess depression in a similar fashion, we directly modified the OASIS to capture the severity and functional impairment due to depressive symptoms. We maintained the same five-item structure as in the OASIS, with one item regarding frequency of depression, one assessing intensity of depression, one measuring interference with work or school, and one capturing interference with social life and relationships. The most notable deviation is that we replaced one item of the OASIS assessing impairment due to avoidance with an item assessing impairment due to depression-related loss of interest and difficulty engaging in activities (American Psychiatric Association, 2013). Consistent with the OASIS, instructions include an “over the past week” time frame, and respondents are asked to endorse one of five different response options (coded from 0 to 4) for each item; higher scores are indicative of greater depression-related severity and impairment. The resulting Overall Depression Severity and Impairment Scale (ODSIS) is included in the Appendix.

Our goal in the current study was to evaluate the psychometric properties of the ODSIS in three samples: outpatients, undergraduate students, and community-based adults. Given the goal of developing a brief measure applicable across diverse clinical and research settings, use of distinct samples is essential to demonstrate how the scale performs in different populations. First, the internal consistency and latent structure of the ODSIS were evaluated in each sample; along similar lines, item response theory (IRT) methods of analysis were used to investigate whether the ODSIS is differentially informative across ranges of latent levels of depression in the three samples. Second, the degree to which the ODSIS differentiates both between outpatients with and without a current mood disorder diagnosis and among the range of clinical and subclinical disorder severity levels was examined; further, the degree to which the ODSIS discriminates between those who report and do not report a current (or past) mood disorder diagnosis and those who report receiving and not receiving treatment for depression in the two nonclinical samples was investigated. Third, using the outpatient sample, we determined cut scores for identifying individuals with probable depressive disorders, and receiver operating characteristics (ROC) curves were used to examine sensitivity and specificity for each cut score; this ROC analysis in the context of a treatment-seeking sample in which all patients have an emotional disorder (i.e., mood or anxiety disorder) constitutes a rigid, conservative test of the extent to which high scores on the ODSIS are associated solely with increased levels of depression and not with other disorders (e.g., anxiety) characterized by high levels of anxious arousal and avoidance. Estimates of positive predictive power (PPP) and negative predictive power (NPP) were generated for each potential cut score, and agreement between ODSIS and ADIS-IV-L classifications was examined to enhance interpretation of the diagnostic utility of each potential ODSIS cut score. Fourth, convergent and discriminant validity of the ODSIS were examined, and finally, IRT methods were employed again in order to examine whether differential item functioning (DIF) in ODSIS responses exists between outpatient, undergraduate student, and community-based samples.
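
The classification indices named in this plan (sensitivity, specificity, PPP, and NPP at a candidate cut score) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function and class names are hypothetical, the data are invented, the total score is assumed to be the sum of the five items (by analogy with OASIS scoring), and totals at or above the cut score are assumed to be flagged as depressed.

```python
from dataclasses import dataclass
from typing import Sequence


def odsis_total(item_responses: Sequence[int]) -> int:
    """Total score for one respondent: five items, each coded 0-4.

    Summing the items is an assumption carried over from OASIS scoring.
    """
    if len(item_responses) != 5:
        raise ValueError("expected exactly five item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in item_responses):
        raise ValueError("each item must be coded 0, 1, 2, 3, or 4")
    return sum(item_responses)


@dataclass(frozen=True)
class CutScoreMetrics:
    sensitivity: float           # diagnosed cases flagged by the scale
    specificity: float           # non-cases not flagged by the scale
    ppp: float                   # flagged respondents who are diagnosed cases
    npp: float                   # unflagged respondents who are non-cases
    correctly_classified: float  # overall proportion classified correctly


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def cut_score_metrics(totals: Sequence[int],
                      has_mood_disorder: Sequence[bool],
                      cut: int) -> CutScoreMetrics:
    """Evaluate one candidate cut score against interview-based diagnoses.

    Totals at or above `cut` are flagged as depressed; the ">=" rule is an
    assumption of this sketch.
    """
    if len(totals) != len(has_mood_disorder):
        raise ValueError("totals and diagnoses must have the same length")
    tp = fp = tn = fn = 0
    for total, diagnosed in zip(totals, has_mood_disorder):
        flagged = total >= cut
        if flagged and diagnosed:
            tp += 1
        elif flagged:
            fp += 1
        elif diagnosed:
            fn += 1
        else:
            tn += 1
    return CutScoreMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppp=_ratio(tp, tp + fp),
        npp=_ratio(tn, tn + fn),
        correctly_classified=_ratio(tp + tn, len(totals)),
    )


if __name__ == "__main__":
    # Invented totals and diagnoses for seven hypothetical outpatients.
    totals = [2, 9, 12, 5, 8, 3, 15]
    diagnosed = [False, True, True, False, False, True, True]
    for cut in (6, 8, 10):
        print(cut, cut_score_metrics(totals, diagnosed, cut))
```

Evaluating several candidate cut scores in a loop, as in the usage lines, mirrors the idea of comparing the trade-off between sensitivity and specificity across cut points; PPP and NPP additionally depend on how common the diagnosis is in the sample being scored.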

Method

Participants

Outpatients. The clinical sample (N = 100) consisted of patients presenting for assessment and treatment at the Center for Anxiety and Related Disorders (CARD), a large outpatient clinic. All individuals met criteria for a principal anxiety or depressive disorder based on the administration of the Anxiety Disorders Interview Schedule for DSM-IV: Lifetime Version (ADIS-IV-L; Di Nardo, Brown, & Barlow, 1994; for a detailed description, refer to Measures). Participants were recruited following their initial intake evaluation with the ADIS-IV-L at CARD to participate in a questionnaire study. Interested patients were mailed packets that could be completed at home and sent in; those who mailed back their questionnaires were provided monetary compensation for their time. The mean age was 31.59 years (SD = 11.72, range = 18 to 67), and women constituted the larger portion of the sample (68.0%). Most patients were non-Hispanic or Latino (97.0%) and Caucasian (83.0%), with smaller numbers identifying as Asian (9.0%) and African American (8.0%). Collapsing across principal and additional diagnoses, the diagnostic breakdown in the sample was as follows: major depressive disorder (n = 24), dysthymic disorder (n = 3), social phobia (n = 54), generalized anxiety disorder (n = 39), panic disorder with or without agoraphobia (n = 27), specific phobia (n = 23), obsessive-compulsive disorder (n = 17), posttraumatic stress disorder (n = 5), and anxiety disorder not otherwise specified (n = 7).

Student. Participants in the student sample were undergraduates (N = 566) at a large private university in Boston who were enrolled in an introductory psychology course. They were recruited to complete an online questionnaire battery regarding emotional experiences via a voluntary Internet-based sign-up system, and they received research credit for their participation. Among the 559 students who provided their age, the mean age was 18.93 years (SD = 1.07), with ages ranging from 18 to 24. Student participants predominantly identified as female (70.0%), non-Hispanic or Latino (79.9%), and Caucasian (58.1%). They also identified as Asian (30.0%), Black or African American (3.2%), and other (8.3%).

Community-based. Participants in the community sample were community-dwelling adults (N = 189) in Boston. They were recruited to partake in a web-based questionnaire study regarding emotional experiences through an online advertisement, completed the questionnaire battery online, and were paid for their participation. Of the 188 participants who provided their age, the mean age was 30.85 years (SD = 10.77), with ages ranging from 18 to 70. Community participants predominantly identified as female (58.2%), non-Hispanic or Latino (89.4%), and Caucasian (75.7%). Participants also identified as Asian (13.2%), Black or African American (6.9%), and other (3.7%).

Measures

ADIS-IV-L (Di Nardo et al., 1994). The ADIS-IV-L is a semistructured interview designed to assess the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) anxiety, mood, somatoform, and substance use disorders. For each diagnosis, interviewers assign a clinical severity rating (CSR) ranging from 0 to 8 that indicates the degree of distress and interference associated with the disorder (0 = none to 8 = very severely disturbing/disabling). For patients who receive more than one current diagnosis, the principal diagnosis refers to the one with the highest CSR. For disorders that meet or exceed the threshold for a formal DSM-IV diagnosis, CSRs of 4 (definitely disturbing/disabling) or higher are referred to as clinical diagnoses. Current clinical diagnoses not determined to be the principal diagnosis are assigned as additional diagnoses. In a prior reliability study of a subset of patients also seen at CARD (n = 362) who underwent two independent administrations of this ADIS-IV-L, acceptable to excellent interrater agreement in diagnosing current mood and anxiety disorders (range of κs = .56 to .86) was observed, with the exception of dysthymia (κ = .22 for principal diagnosis, κ = .31 for principal or additional diagnosis; Brown, Di Nardo, Lehman, & Campbell, 2001). With regard to the most prevalent mood disorder diagnosed in the present study (MDD; n = 24), interrater reliability was fair to good (κ = .67 for principal, κ = .59 for principal or additional; Brown, Di Nardo, et al., 2001).

Beck Anxiety Inventory (BAI; Beck & Steer, 1993). The BAI is a 21-item, self-report inventory of anxiety symptoms experienced during the past week. Questions are rated on a 4-point Likert scale ranging from 0 to 3, and higher scores indicate greater symptom severity. Scores on the BAI have demonstrated high internal consistency, test-retest reliability, and construct validity (Beck & Steer, 1993; Fydrich, Dowdall, & Chambless, 1992). In the present study, internal consistency coefficients for BAI scores were high, ranging from .94 (outpatient and student samples) to .96 (community sample).

BDI-II (Beck et al., 1996). The BDI-II is a 21-item inventory of depressive symptoms experienced during the past week. Questions are rated on a 4-point Likert scale ranging from 0 to 3, and higher scores indicate greater symptom severity. Reliability and validity data for the BDI-II have been reported in nonclinical and clinical samples (Beck et al., 1996; Storch, Roberti, & Roth, 2004). In the present study, internal consistency coefficients for the BDI-II were high in all three samples, ranging from .91 (student) to .96 (community).

Behavioral Inhibition and Behavioral Activation Scales (BIS/BAS; Carver & White, 1994). The BIS/BAS is a 20-item instrument that measures levels of behavioral inhibition and behavioral activation. The BIS subscale items tap into individuals’ emotional responses to impending negative events. The BAS subscale items assess individuals’ behavioral and emotional responses to potentially positive events. Items are rated on a 4-point Likert scale ranging from 1 to 4. BIS/BAS scores have evidenced good reliability, predictive validity, and convergent and discriminant validity (Campbell-Sills, Liverant, & Brown, 2004; Carver & White, 1994). In the present study, for the outpatient sample, internal consistency coefficients for the BIS and BAS were acceptable (.74 and .81, respectively). For the student and community samples, internal consistency was questionable for the BIS, ranging from .62 to .67, and acceptable for the BAS, ranging from .72 to .80.

CES-D (Radloff, 1977). The CES-D is a 20-item measure that assesses the frequency and severity of depressive symptoms. Items are rated on a 0 to 3 scale, and four items that assess the absence of positive affect are reverse scored. Responses to the CES-D have shown adequate internal consistency and test-retest
reliability, and correlate with clinical judgments and self-report measures of depression (Iwata & Buka, 2002; Radloff, 1977; Roberts, 1980). In the present study, internal consistency coefficients for the CES-D were acceptable, ranging from .73 (student) to .90 (community).

Depression Anxiety Stress Scales (DASS; Lovibond & Lovibond, 1995). The DASS is a 21-item instrument designed to assess anxiety, depression, and stress. Items are scored on a 4-point scale ranging from 0 to 3. The Depression scale (DASS-D) assesses feelings of hopelessness, pessimism and lack of interest, involvement or enjoyment. The Anxiety scale (DASS-A) captures autonomic arousal, skeletal muscle effects, situational anxiety and worry. The Stress scale (DASS-S) assesses nervous arousal, tension, ability to relax, and irritability. Scores on the DASS have shown high internal consistency and reliability (Antony, Bieling, Cox, Enns, & Swinson, 1998; Henry & Crawford, 2005). In the present study, internal consistency coefficients were acceptable to good for the DASS-A (ranging from .77 to .90), and good to excellent for both the DASS-S (ranging from .84 to .91) and the DASS-D (ranging from .87 to .92).

NEO Five Factor Inventory (NFFI; Costa & McCrae, 1992). The NFFI is a 60-item measure of the five-factor model of personality. Items comprise self-descriptive statements (e.g., “I really enjoy talking to people”) and are rated on a 5-point Likert scale ranging from 1 to 5 (1 = strongly agree, 5 = strongly disagree). Five domain scores (openness, conscientiousness, agreeableness, extraversion, neuroticism) are calculated by summing their respective 12 item responses. In the present study, only the neuroticism (NFFI-N), extraversion (NFFI-E), and openness (NFFI-O) subscales were used. Responses to NFFI items have been found to possess excellent internal consistency and reliability (Costa & McCrae, 1992) and also temporal stability (Robins, Fraley, Roberts, & Trzesniewski, 2001). In the present study, internal consistency for the NFFI-N in all three samples was good, ranging from .86 to .87. For the NFFI-E, internal consistency was also good for the outpatient sample (.78), and questionable to acceptable for the student and community samples (ranging from .69 to .72). For the NFFI-O, internal consistency was questionable in the nonclinical samples, ranging from .67 to .68.

OASIS (Norman et al., 2006). The OASIS is a five-item continuous measure of anxiety-related severity and impairment. Items are coded from 0 to 4 and are summed to obtain one total score. Results from three previous psychometric evaluations of the OASIS have indicated high internal consistency, excellent test-retest reliability, and convergent and discriminant validity in clinical and nonclinical samples (Campbell-Sills et al., 2009; Norman et al., 2006, 2011). In the present study, internal consistency of the OASIS was good to excellent, ranging from .87 (student) to .91 (community).

ODSIS. As previously noted, the ODSIS (see the Appendix) is a five-item instrument designed to measure severity and impairment of depressive symptoms.

PHQ-9 (Kroenke et al., 2001). The PHQ-9 is a nine-item instrument used to screen for depression in primary care settings. Items are rated on a 0 to 3 Likert scale in accordance with increased frequency of difficulty experienced in each covered area. PHQ-9 scores have demonstrated adequate validity as a diagnostic tool for depression (Kroenke et al., 2001; Wittkampf, Naeiji, Schene, Huyser, & van Weert, 2009) and reliability (Cameron, Crawford, Lawton, & Reid, 2008). In the present study, internal consistency coefficients for the PHQ-9 were high, ranging from .85 (student) to .95 (community).

Short-Form Health Survey, 12 items (SF-12; Ware, Kosinski, & Keller, 1996). The SF-12 is a widely used generic measure of health status and is derived from the Short Form 36-Item Health Survey (SF-36). Using scoring algorithms, two summary scores (ranging from 0 to 100) are derived, mental health (SF-12-M) and physical health (SF-12-P; Ware et al., 1996). Responses are coded such that higher scores indicate higher levels of mental or physical health. Responses to the SF-12 have been shown to possess good reliability and validity in the general population (Gandek et al., 1998; Ware et al., 1996). In the present study, internal consistency was acceptable to good, ranging from .78 (student) to .80 (community).

Procedures

After undergoing an intake evaluation with the ADIS-IV-L, outpatients completed the ODSIS, DASS, BDI-II, BAI, NFFI, and BIS/BAS; this sample was not administered the CES-D, OASIS, SF-12, or PHQ-9. Student and community-based participants completed all the self-report measures described above (see Measures). A subset of the student and community samples (n_student = 430; n_community = 156) were also asked the following three questions: “Have you ever received a diagnosis of a mood disorder (such as major depressive disorder, dysthymia, bipolar, cyclothymia)?”; “Do you currently have a mood disorder diagnosis?”; and “Have you ever received psychotherapy or medication for depression?”

Data Analysis

First, internal consistency of the ODSIS was calculated in each sample. Second, each sample was used for factor analysis. Due to the small number of items that compose the ODSIS, we began by using confirmatory factor analysis (CFA) to evaluate the appropriateness of a one-factor model in each sample. The latent construct of depression was identified by fixing the variance to 1.0. The sample variance-covariance matrices were analyzed with a latent variable software program (Mplus 5.1, Muthén & Muthén, 2007). Robust maximum likelihood estimation was used. Model fit was evaluated using commonly accepted indices of fit: chi-square test (χ²), root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), non-normed fit index (NNFI), and comparative fit index (CFI). Acceptance or rejection of models was based on conventional criteria for good model fit (RMSEA < .06, SRMR < .08, NNFI > .95, CFI > .95; Hu & Bentler, 1999), strength of parameter estimates (i.e., primary factor loadings > .35), and conceptual interpretability of the solution.

Also with regard to the latent structure of the ODSIS, IRT modeling was used in order to examine how well the ODSIS measures the latent construct of depression in each of the three samples. IRT modeling allows for more realistic assumptions about measurement error (Embretson & Reise, 2000), thereby permitting researchers to examine whether different scores contain more measurement error. IRT analyses were conducted with the Mplus program (Muthén & Muthén, 2007). A two-parameter graded-response model (Samejima, 1969, 1977) and robust
maximum-likelihood estimation procedures were used. First, item discrimination (a parameter) values were examined in order to determine the slope by which responses to individual ODSIS items changed as a function of differences in the latent construct of depression. These values typically range from 0 to 3, and items with discrimination parameters above 1.0 are considered indicative of high discrimination between levels of the latent construct. Second, item difficulty thresholds (b parameters) were examined in order to determine how challenging each ODSIS item was. Given that the ODSIS has five response options, there are four response thresholds for each item; these thresholds indicate the level of latent depression at which an individual has a 50% chance of scoring at or above a particular response option. Finally, total information curves, which represent the amount of information a scale provides across various levels of the latent construct, were analyzed. In the present study, total information curves indicated the levels of the latent construct of depression that the ODSIS measured most reliably, and those levels measured with the most measurement error.

Next, effect size estimates (Hedges’ g) were generated to ascertain the degree to which the ODSIS discriminates between individuals with and without a current clinical mood disorder diagnosis in the outpatient sample. Correlations between the ODSIS and ADIS-IV-L CSRs of all clinical and subclinical mood disorder diagnoses were also computed to determine how well this measure distinguishes across the range of clinician-rated depression severity levels. Among the subsets of nonclinical participants who were asked the three additional questions about mood disorder diagnoses and treatment, Hedges’ g effect size estimates were calculated to examine the degree to which the ODSIS distinguishes between individuals who reported and did not report a current (or past) mood disorder diagnosis and those who reported receiving and not receiving treatment for depression.

In order to assess classification accuracy, we used a receiver operating characteristics (ROC) curve to determine sensitivity and specificity for cut scores on the ODSIS in the outpatient sample. Given that all patients in the sample presented with a clinical mood or anxiety disorder, this analysis served as a rigorous evaluation of the ability of the ODSIS to correctly distinguish adult outpatients with and without clinical depression. Presence versus absence of any (principal or additional) unipolar mood disorder diagnosis on the ADIS-IV-L was the categorical outcome. A ROC curve is a plot of the true positive rate (sensitivity) against the false positive rate (1 - specificity) for different possible cut points. Therefore, ROC curves provide visual representations of the trade-off between sensitivity and specificity of cut scores for diagnostic measures. Chi-square tests, sensitivity values (i.e., percentage of adults meeting diagnostic criteria for a mood disorder correctly identified by the ODSIS as depressed), and specificity values (i.e., percentage of adults not meeting diagnostic criteria for a mood disorder correctly identified by the ODSIS as nondepressed) were used to determine the percentage of patients correctly classified with the ODSIS. Estimates of PPP (i.e., percentage of adults classified by the ODSIS as depressed who actually met criteria for a unipolar mood disorder diagnosis) and NPP (i.e., percentage of adults classified by the ODSIS as nondepressed who did not meet criteria for a mood disorder) at each potential cut score were calculated. Given that estimates of PPP and NPP are affected by the prevalence of individuals classified as depressed by the ODSIS in a given sample, the base rates of outpatients classified by the ODSIS as depressed at each cut score were also determined to enhance interpretation of these values. Finally, kappa coefficients indicating agreement between ODSIS and ADIS-IV-L classifications after correcting for chance (Cohen, 1960) were evaluated in order to provide further indication of the diagnostic utility of the ODSIS at each cut score after correcting for base rates. These various indices were used to select an appropriate cut score on the ODSIS for identifying patients with probable mood disorders.

Next, each sample was used for analyses of convergent and discriminant validity. The magnitude of correlations was interpreted in accordance with commonly used benchmarks suggested by Cohen (1988); as such, effect sizes between .10 and .30 were considered small; those between .30 and .50 were considered medium, and those .50 or above were considered large. Analyses included correlating the ODSIS with well-established depression measures (e.g., BDI-II, CES-D, DASS-D, PHQ-9). Large positive correlations of the ODSIS with these depression scales were interpreted as evidence for convergent validity. In light of previous findings that the higher-order dimension of neuroticism/behavioral inhibition (N/BI) contributes substantially to depression (e.g., Brown, 2007; Brown & Barlow, 2009), we also expected that the ODSIS would evidence positive correlations of small to medium magnitudes with measures of N/BI (NFFI-N, BIS); it was anticipated that correlations between the ODSIS and indicators of depression would be stronger than those with indicators of N/BI. Similarly, we also anticipated that the ODSIS would evidence positive, small to medium correlations with measures of anxiety (e.g., BAI, DASS-A, OASIS) and stress (DASS-S). Negative correlations between the ODSIS and indicators of behavioral activation (BAS), mental health (SF-12-M), physical health (SF-12-P), and extraversion (NFFI-E) were also anticipated. Moreover, it was expected that the magnitude of these associations would be weaker than those with measures of depression. As such, discriminant validity analyses entailed comparing the magnitude of associations between the ODSIS and indicators of depression with those correlations between the ODSIS and closely related, yet distinct constructs (e.g., anxiety, neuroticism), as well as clearly distinct constructs (NFFI-O). Correlations between the ODSIS and measures of anxiety, stress, neuroticism, extraversion, physical health, and mental health of weaker magnitudes than those with depression measures were considered supportive of discriminant validity.

Finally, in order to determine whether the ODSIS measures depression equivalently in distinct populations, DIF between the three samples (i.e., student vs. outpatient, community vs. outpatient, student vs. community) was investigated. IRTLRDIF 2.0 (Thissen, 2001), which provides omnibus tests of DIF, in addition to tests of group invariance for the discrimination (a) parameters and difficulty (b) parameters, was utilized. First, the omnibus test statistic was examined, and if it was statistically significant, the discrimination and difficulty test statistics were examined. DIF analyses indicate whether the items on a particular scale are easier for a certain group (i.e., vary on the b parameter), or whether the items function differently with regard to the latent trait (i.e., vary on the a parameter). Uniform DIF refers to when items are easier for a particular group, and nonuniform DIF refers to when items vary in difficulty and discrimination. Relative to uniform DIF,
nonuniform DIF is considered more problematic, as it makes scores more difficult to compare across groups (Smith, 2002).

Results

Preliminary Analyses

The mean ODSIS score in the outpatient sample (N = 100) was 5.50 (SD = 5.04, range = 0 to 19). In the student sample (N = 566), the mean ODSIS score was 2.57 (SD = 3.36, range = 0 to 17), and in the community sample (N = 189), the mean ODSIS score was 5.16 (SD = 4.81, range = 0 to 18). These results indicate that both clinic- and community-based participants were on average more depressed than student participants. The effect sizes for mean differences in ODSIS scores between outpatient and student samples, and community and student samples were of medium to large magnitudes (Hedge’s g = .80 and .69, respectively), whereas the effect size for the mean difference in ODSIS scores between outpatient and community participants was negligible (Hedge’s g = .07). These findings indicate that the ODSIS captured similar levels of depression severity and impairment in the outpatient and community samples. Descriptive analyses for responses to individual ODSIS items in the three samples are presented in Table 1. Response frequencies indicated that for outpatient participants, the most common responses to the ODSIS items were the two lowest response options (none/no depression and infrequent/mild depression). For student and community-based participants, the most common response to the ODSIS items was the lowest response option (none/no depression). Considered together, between 20% and 69% of all participants used the bottom two response options when completing the ODSIS.

In the outpatient sample, there were no significant differences in ODSIS score by gender, age, race, or ethnicity. In the student sample, there were no significant differences in ODSIS score by gender, race, or ethnicity. Students’ ODSIS scores differed as a function of age, in that older age was correlated with higher scores on the ODSIS (r = .09, p < .05); however, as previously noted, the age range (6 years) was relatively small. In the community sample, there were no significant effects of age, race, or ethnicity on ODSIS score. However, community participants’ ODSIS scores differed significantly as a function of gender, t(186) = 3.197, p < .01, with male participants (M = 6.41, SD = 4.67) endorsing higher scores than female participants (M = 4.20, SD = 4.68). Cronbach’s alpha for the five ODSIS items was .94 in the outpatient sample, .91 in the student sample, and .92 in the community sample, all indicative of excellent internal consistency.

Latent Structure

In the outpatient sample, a single-factor model specifying paths from a single factor of depression to each of the five ODSIS items resulted in good model fit: χ²(5) = 7.956, p = .159; NNFI = .973; CFI = .987; RMSEA = .077, 90% CI [.000, .172]; SRMR = .028. In addition, each item displayed salient loadings on the latent factor (.81 to .96). In the student sample, the same single-factor model resulted in adequate model fit: χ²(5) = 27.77, p < .001; NNFI = .943; CFI = .971; RMSEA = .090, 90% CI [.059, .124]; SRMR = .023. Each item displayed salient loadings on the latent factor (.76 to .88). In the community sample, the single-factor model resulted in good model fit: χ²(5) = 7.668, p = .176; NNFI = .985; CFI = .992; RMSEA = .053, 90% CI [.000, .123]; SRMR = .019. In addition, each item displayed salient loadings on the latent factor (.82 to .89). These results demonstrate that a single latent factor model provided good fit for the data from clinic- and community-based samples, and adequate fit in the student sample.

With regard to the IRT analyses employed to further examine the latent structure of the ODSIS, item discrimination values and item difficulty values for the three samples are presented in Table 2. The item discrimination (a) parameters for the five ODSIS items were all higher than 3.0, which indicates that the ODSIS items provide a high level of discrimination, although each ODSIS item may only offer information about a narrow range of the latent level of depression. The a parameter for the fifth ODSIS item in the outpatient sample was notably large (11.21). Examination of difficulty (b) parameters indicates that in the clinic- and community-based samples, only the lowest threshold (b1) was associated with
Table 1
Descriptive Statistics and Response Percentages of the ODSIS

In the past week . . .
  Sample        M     SD    0/None (%)   1/Mild (%)   2/Moderate (%)   3/Severe (%)   4/Extreme (%)
1. How often have you felt depressed?
  Outpatient    1.23  1.13  30.0         38.0         14.0             15.0           3.0
  Student       0.64  0.82  52.5         36.7         5.8              4.4            0.5
  Community     1.06  1.10  41.3         25.4         20.1             12.2           1.1
2. When you have felt depressed, how intense or severe was your depression?
  Outpatient    1.15  1.10  34.0         32.0         24.0             5.0            5.0
  Student       0.54  0.80  61.3         25.6         11.1             1.2            0.7
  Community     0.99  1.06  44.4         23.3         22.2             9.0            1.1
3. How often did you have difficulty engaging in or being interested in activities you normally enjoy because of depression?
  Outpatient    1.14  1.21  42.0         23.0         18.0             13.0           4.0
  Student       0.44  0.74  68.9         20.5         8.3              2.3            0.0
  Community     1.12  1.17  42.3         22.2         18.0             15.9           1.6
4. How much did your depression interfere with your ability to do the things you needed to do at work, at school, or at home?
  Outpatient    0.98  1.15  44.0         32.0         10.0             10.0           4.0
  Student       0.51  0.80  64.8         24.2         6.5              4.4            0.0
  Community     0.92  1.04  45.0         30.7         13.8             9.0            1.6
5. How much has depression interfered with your social life and relationships?
  Outpatient    1.07  1.08  39.0         28.0         19.0             12.0           1.0
  Student       0.44  0.73  68.7         20.3         9.4              1.6            0.0
  Community     1.07  1.11  41.8         23.8         22.2             10.1           2.1
Note. ODSIS = Overall Depression Severity and Impairment Scale.
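
As a quick consistency check on Table 1, the item means and standard deviations can be recovered from the response percentages. The sketch below does this for Item 1 in the outpatient sample. The function name is hypothetical, and the use of the sample (n - 1) standard deviation with n = 100 is an assumption, since the text does not state which estimator was used.

```python
import math

def item_mean_and_sd(percentages, n):
    """Recover the mean and sample SD of a 0..4 item from response percentages.

    `percentages` lists the percentage endorsing scores 0, 1, 2, 3, 4.
    The (n - 1) denominator is an assumption, not stated in the source.
    """
    total = sum(percentages)
    probs = [p / total for p in percentages]
    mean = sum(score * p for score, p in enumerate(probs))
    pop_var = sum(p * (score - mean) ** 2 for score, p in enumerate(probs))
    sample_sd = math.sqrt(pop_var * n / (n - 1))
    return mean, sample_sd

# ODSIS Item 1, outpatient sample (Table 1), N = 100.
item_mean, item_sd = item_mean_and_sd([30.0, 38.0, 14.0, 15.0, 3.0], n=100)
print(f"M = {item_mean:.2f}, SD = {item_sd:.2f}")
```

Under these assumptions the values agree with the tabled M = 1.23 and SD = 1.13.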

Table 2
IRT Discrimination Values (a) and Difficulty Thresholds (b)

  Sample        a      b1     b2     b3     b4
1. How often have you felt depressed?
  Outpatient    3.40   -0.57  0.64   1.08   1.97
  Student       5.84   0.09   1.21   1.68   2.49
  Community     5.27   -0.16  0.64   1.17   1.97
2. When you have felt depressed, how intense or severe was your depression?
  Outpatient    3.74   -0.39  0.58   1.37   1.79
  Student       4.50   0.34   1.24   2.12   2.45
  Community     4.32   -0.05  0.66   1.31   2.07
3. How often did you have difficulty engaging in or being interested in activities you normally enjoy because of depression?
  Outpatient    4.46   -0.09  0.50   1.00   1.74
  Student       3.24   0.59   1.41   2.21   -
  Community     3.56   -0.11  0.60   1.09   2.07
4. How much did your depression interfere with your ability to do the things you needed to do at work, at school, or at home?
  Outpatient    5.34   -0.03  0.78   1.12   1.68
  Student       4.02   0.45   1.36   1.82   -
  Community     3.81   -0.04  0.88   1.33   2.03
5. How much has depression interfered with your social life and relationships?
  Outpatient    11.21  -0.14  0.57   1.09   2.05
  Student       3.04   0.59   1.42   2.41   -
  Community     3.65   -0.15  0.62   1.28   1.94
Note. Difficulty parameters are not presented for the fifth response option of ODSIS items 3, 4, and 5 in the student sample because no students endorsed these responses. IRT = item response theory; ODSIS = Overall Depression Severity and Impairment Scale.
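
To illustrate how the parameters in Table 2 can be read, the following is a minimal sketch that assumes a graded-response-style logistic model, in which the probability of responding at or above category k is 1 / (1 + exp(-a(θ - b_k))). The model form and the function names are assumptions made for illustration; the section itself reports only the a and b estimates.

```python
import math

def prob_at_or_above(theta, a, b):
    """P(response >= category k) for latent level theta (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def category_probabilities(theta, a, thresholds):
    """Probabilities of each response option 0..len(thresholds)."""
    cumulative = [1.0] + [prob_at_or_above(theta, a, b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# ODSIS Item 4, outpatient sample (Table 2).
a_item4 = 5.34
b_item4 = [-0.03, 0.78, 1.12, 1.68]

# At theta equal to the first threshold, the lowest option has probability .50.
probs_at_b1 = category_probabilities(-0.03, a_item4, b_item4)
print([round(p, 3) for p in probs_at_b1])
```

Evaluating the function at θ = b1 shows why a threshold near zero implies that a respondent close to the sample average still has about an even chance of choosing the lowest response option.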
below-average levels of depression. For example, in the outpatient sample, the value of b1 for the fourth ODSIS item was -0.03, indicating that an individual right at the average level of depression for the sample would still have a 50% chance to respond with the lowest response option to this item. In the student sample, the difficulty (b) parameters indicate that no responses were associated with a below-average latent level of depression. For example, the value of b1 for the third ODSIS item in the student sample was 0.59, which indicates that an individual who is 0.59 standard deviations above the mean latent level of depression would still have a 50% chance to endorse “no depression in the past week” to this item. Considered together, difficulty parameters in all three samples suggest that the ODSIS items are difficult, and may provide more information about individuals with above-average than below-average latent levels of depression. Overall, analysis of item discrimination values and difficulty parameters suggests that items comprising the ODSIS may be more informative about high levels of depression than low levels of depression in both outpatient and nonclinical populations.

The total information curves were also examined in the three samples (see Figure 1). In these figures, the y-axis represents the amount of information the ODSIS provides, and the x-axis represents the latent level of depression as standard deviations from the mean. For the outpatient sample, the curve was multimodal, which suggests that the individual ODSIS items provide good information about narrow ranges of the latent level of depression. The curve was also centered between 0 and 1 standard deviation above the mean, and generally had more coverage of above-average levels of depression. Thus, the total information curve provided further indication that the ODSIS offers more information about patients with above-average than below-average levels of depression. With regard to the student and community samples, the overall shape of these figures was similar. The curves were both multimodal, which suggests that the individual ODSIS items provided good information about narrow ranges of the latent level of depression in both samples. In addition, both curves were centered between 0 and 1 standard deviation above the mean, and appeared to have more coverage of above-average levels of depression. Considered together, total information curves confirm that regardless of the sample, the ODSIS provided more information about individuals with above-average levels of depression than individuals with below-average levels of depression.

Contribution of Mood Disorder Diagnoses to Predicting ODSIS Scores

To investigate the effects of unipolar mood disorder diagnoses (i.e., major depressive disorder and dysthymic disorder in the outpatient sample), we computed Hedge’s g effect sizes in order to compare the magnitudes of the differences in total ODSIS score between individuals with and without a principal mood disorder diagnosis, and individuals with and without any mood disorder diagnosis (principal or additional; see Table 3). Patients with a principal mood disorder diagnosis had higher overall ODSIS scores (M = 11.00, SD = 3.16) than those without a principal mood disorder diagnosis (M = 5.09, SD = 4.92; g = 1.21). Further, patients with any clinical mood disorder diagnosis also had higher overall ODSIS scores (M = 10.70, SD = 4.30) than those without (M = 3.58, SD = 3.79; g = 1.80). The magnitudes of these effect sizes indicate that the presence of a mood disorder diagnosis, whether principal or additional, was strongly associated with higher ODSIS scores in the outpatient sample.

Further, the degree to which the ODSIS discriminates between severity levels of clinician-rated mood disorder diagnoses was determined by examining correlations between total ODSIS score and CSRs for any clinical or subclinical mood disorder diagnosis in the outpatient sample. As previously noted, disorders detected by the ADIS-IV-L are assigned a CSR reflecting clinical severity, distress and impairment associated with the disorder; a CSR ≤ 3 indicates subclinical status, whereas a CSR ≥ 4 indicates clinical status. The ODSIS demonstrated a strong capacity to distinguish between CSRs across the range of clinical and subclinical mood disorders (r = .61, p < .001). In order to compare the degree to which the ODSIS and other measures of depression differentiate between depression severity levels, we also computed associations between both the BDI-II and DASS-D and mood disorder CSRs.
Figure 1. Total information curves for (a) outpatient, (b) student, and (c) community-based samples. The y-axis
represents the amount of information the ODSIS provides, and the x-axis represents the latent level of depression
as standard deviations from the mean. ODSIS = Overall Depression Severity and Impairment Scale.
The correlation between the BDI-II and mood disorder CSR was .57 (p < .001), and the correlation between DASS-D and depressive disorder CSR was .43 (p < .001).

Among the two nonclinical samples, to examine the effects of mood disorder and treatment status, Hedge’s g effect sizes were also computed in order to compare the magnitudes of difference in total ODSIS score between individuals who reported ever receiving a mood disorder diagnosis (i.e., major depressive disorder, dysthymia, bipolar disorder, cyclothymia), a current mood disorder diagnosis, and current treatment for depression (see Table 3). This information was available for a subset of each sample (n = 430 students and n = 156 community participants). Student and community-based participants who reported a current or any (i.e., current or past) mood disorder diagnosis had higher overall ODSIS scores than those who did not. Participants who reported currently receiving treatment for depression evidenced higher ODSIS scores than those who did not. Corresponding effect sizes, ranging from .88 to 1.76 (see Table 3), indicate that reported mood disorder and treatment status had strong effects on ODSIS scores among student and community-based samples.

Sensitivity, Specificity, PPP, NPP, Agreement, and Correct Classification

In the outpatient sample, cut scores on the ODSIS were examined in order to determine whether a score was well able to discriminate those who met criteria for any (principal or additional) depressive disorder from those who did not. First, a ROC curve with ODSIS total score as the continuous variable and diagnostic status as the categorical outcome variable, and corresponding sensitivity and specificity values, were generated. The ROC curve and corresponding sensitivity and specificity values suggested that a cut score of 7 or 8 would maximize sensitivity relative to specificity (see Figure 2). Second, the percentage of patients correctly classified using viable cut scores was computed by summing the true positives and true negatives, and dividing by the total sample size (N = 100); third, estimates of PPP and NPP were calculated. In Table 4, the chi-square, sensitivity and specificity values, PPP and NPP estimates, base rates, kappa coefficients (indicating agreement between ODSIS and ADIS-IV-L classifications after correcting for chance), and the percentages of
Table 3
ODSIS Scores as a Function of Mood Disorder Status

                                          Yes                     No
Sample                                    %      M      SD       %      M      SD       Hedge’s g
Outpatient sample
  Current principal mood disorder         7.0    11.00  3.16     93.0   5.09   4.92     1.21
  Any current mood disorder               27.0   10.70  4.30     73.0   3.58   3.79     1.80
Student sample
  Ever received mood disorder diagnosis   10.0   5.79   4.51     90.0   1.99   2.86     1.23
  Current mood disorder diagnosis         4.9    7.52   4.37     95.1   2.11   2.99     1.76
  Received treatment for mood disorder    8.4    4.94   4.80     91.6   2.14   3.00     0.88
Community sample
  Ever received mood disorder diagnosis   30.8   8.38   3.51     69.2   4.18   4.54     0.98
  Current mood disorder diagnosis         21.8   9.06   2.86     78.2   4.47   4.58     1.07
  Received treatment for mood disorder    28.2   8.50   3.27     71.8   4.28   4.59     0.98
Note. In the clinical sample, mood disorder status was determined by data gleaned from the clinician-rated ADIS-IV-L. In the two nonclinical samples, mood disorder status was determined by single-item self-report questions. These data refer to the subset of students (n = 430) and community participants (n = 156) who were administered the three additional self-report questions. ODSIS = Overall Depression Severity and Impairment Scale; ADIS-IV-L = Anxiety Disorders Interview Schedule for DSM-IV: Lifetime Version.
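
The outpatient effect sizes in Table 3 can be approximately reproduced from the reported means and standard deviations. The sketch below assumes the common small-sample correction g = d × (1 - 3 / (4df - 1)) with a pooled standard deviation, and it infers the group sizes from the reported percentages of N = 100 (7 vs. 93 and 27 vs. 73). Both the formula and the group sizes are assumptions, as the text does not state them explicitly.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with a small-sample bias correction."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    cohens_d = (m1 - m2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # assumed form of the correction
    return cohens_d * correction

# Outpatient sample, Table 3 (group sizes inferred from percentages of N = 100).
g_principal = hedges_g(11.00, 3.16, 7, 5.09, 4.92, 93)
g_any = hedges_g(10.70, 4.30, 27, 3.58, 3.79, 73)
print(f"principal: g = {g_principal:.2f}; any mood disorder: g = {g_any:.2f}")
```

Under these assumptions the results round to the tabled values of 1.21 and 1.80.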
patients correctly classified with cut scores of 0 through 20 are presented. A cut score of ≥ 8 was determined to be optimal as it successfully classified 82% of the sample, evidenced favorable balances of both sensitivity (74%) and specificity (85%) and PPP (65%) and NPP (90%), and showed the highest agreement between ADIS-IV-L and ODSIS classification (κ = .56).

For purposes of comparison, cut scores for the other two depression scales employed in the outpatient sample were examined. The BDI-II cut score evidencing the most favorable balance of sensitivity (69%) and specificity (89%) was 22; at this score, 83% of the sample was correctly classified, PPP was 69%, NPP was 89%, and agreement with the ADIS-IV-L after correcting for chance was moderate (κ = .55). At the optimal cut score for the DASS-D (≥ 11), sensitivity was 77%, specificity was 69%, PPP was 48%, NPP was 89%, and agreement with the ADIS-IV-L after correcting for chance was fair (κ = .37).
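
The classification statistics reported for an ODSIS cut score of 8 follow from a single 2 × 2 table of ODSIS classification against ADIS-IV-L diagnosis. The sketch below uses cell counts inferred from the reported values (27 patients with a mood disorder, 73 without, sensitivity of 74%, specificity of 85%); the counts themselves and the function name are assumptions, since the source reports only the derived percentages.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Sensitivity, specificity, PPP, NPP, accuracy, and Cohen's kappa from a 2 x 2 table."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppp": tp / (tp + fp),
        "npp": tn / (tn + fn),
        "base_rate": (tp + fp) / n,
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
    }

# Inferred counts at a cut score of 8 (assumption): 20 true positives,
# 7 false negatives, 11 false positives, 62 true negatives.
summary = diagnostic_summary(tp=20, fn=7, fp=11, tn=62)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inferred counts, the output matches the reported sensitivity (.74), specificity (.85), PPP (.65), NPP (.90), base rate (.31), agreement (κ = .56), and 82% correct classification.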
Figure 2. Receiver operating characteristic (ROC) curve for ODSIS scores to predict presence of a mood disorder. ROC curve refers to any unipolar mood disorder (principal or additional). A cut score of 8 correctly identified the mood disorder status of 82% of the clinical sample (i.e., an ODSIS score of 8 or above indicates probable mood disorder). ODSIS = Overall Depression Severity and Impairment Scale.

Convergent and Discriminant Validity

With regard to convergent and discriminant validity, associations between the ODSIS and other measures are displayed in Table 5. Among outpatients, total scores on the ODSIS demonstrated convergent validity with existing measures of depression (large correlations of .71 and .74, respectively). Additionally, the ODSIS evidenced positive correlations of medium to large magnitudes with indicators of N/BI, ranging from .36 to .55. The ODSIS also correlated positively with measures of anxiety and stress; these small to medium correlations (ranging from .25 to .35) were weaker than correlations between the ODSIS and indicators of N/BI. Of note, the magnitudes of all associations between the ODSIS and measures of related yet distinct constructs (e.g., anxiety, neuroticism) in the outpatient sample were consistently smaller than the magnitudes of correlations between the ODSIS and indicators of depression, thus meeting our criteria for discriminant validity. The ODSIS also demonstrated small to medium negative correlations with measures of extraversion and behavioral activation (-.43 and -.29, respectively); again, these associations were weaker than those with indicators of depression, thereby suggestive of discriminant validity.

In the two nonclinical samples, correlations between the ODSIS and well-validated depression scales were of large magnitudes, ranging from .71 (CES-D) to .77 (DASS-D) in the student sample, and .79 (CES-D) to .88 (BDI-II) in the community sample. The ODSIS also showed positive correlations of large magnitudes with the measure of neuroticism (NFFI-N), ranging from .62 to .71, and small to medium correlations with the measure of behavioral inhibition (BIS), which ranged from .22 to .36. Correlations be-
Table 4
Diagnostic Utility of the ODSIS

ODSIS score   χ²(1, N = 100)   Sensitivity   Specificity   PPP    NPP    Base rate   κ     % correctly classified
0             -                1.00          .00           .27    -      1.00        .00   27
1             12.54**          1.00          .39           .38    1.00   .72         .25   55
2             15.50**          1.00          .44           .40    1.00   .68         .30   59
3             17.86**          1.00          .48           .42    1.00   .65         .33   62
4             18.28**          .96           .53           .43    .98    .60         .36   65
5             20.10**          .93           .60           .46    .96    .54         .40   69
6             28.41**          .93           .70           .53    .96    .47         .51   76
7             28.94**          .85           .77           .58    .93    .40         .54   79
8             29.38**          .74           .85           .65    .90    .31         .56   82
9             22.63**          .59           .89           .67    .86    .24         .50   81
10            23.10**          .52           .93           .74    .84    .19         .50   82
11            20.06**          .48           .93           .72    .83    .18         .46   81
12            19.46**          .44           .95           .75    .82    .16         .45   81
13            22.10**          .37           .99           .90    .80    .11         .44   82
14            25.32**          .30           .99           1.00   .79    .09         .36   80
15            16.56**          .26           1.00          1.00   .79    .07         .34   80
16            7.74**           .15           1.00          1.00   .76    .04         .20   79
17            4.98*            .11           1.00          1.00   .75    .03         .15   76
18            0.27             .04           1.00          1.00   .74    .01         .05   74
19            0.27             .04           1.00          1.00   .74    .01         .05   74
20            -                .00           1.00          -      .73    .00         .05   73
Note. A dash means not applicable. PPP = positive predictive power; NPP = negative predictive power; Base rate = percentage scoring at or above cut score; κ = agreement between ODSIS and ADIS-IV-L after correcting for chance; ODSIS = Overall Depression Severity and Impairment Scale; ADIS-IV-L = Anxiety Disorders Interview Schedule for DSM-IV: Lifetime Version.
*p < .05. **p < .001.
tween the ODSIS and indicators of anxiety (BAI, DASS-A, OASIS) and stress (DASS-S) were of medium to large magnitudes in both samples. Specifically in the student sample, these correlations were consistently weaker than those with depression measures, ranging from .48 (BAI) to .59 (OASIS), thereby meeting our criteria for discriminant validity. In the community sample, these correlations were largely of similar magnitudes to those with depression measures, ranging from .75 (DASS-A) to .81 (OASIS); however, the association between the ODSIS and the BAI (.58) was notably smaller. The ODSIS also evidenced small to medium negative correlations with the NFFI-E in both nonclinical samples. Associations between the ODSIS and both the NFFI-O and BAS were negligible to small in the student and community samples, indicative of discriminant validity. Finally, in both samples, the ODSIS generally evidenced negative correlations of large magnitudes with measures of mental and physical health (SF-12-M and SF-12-P), with the exception of a small correlation between the ODSIS and the SF-12-P in the student sample (-.11).

Differential Item Functioning

Results of the DIF analysis examining equivalence of measurement between the three samples (outpatient vs. student, outpatient vs. community, student vs. community) can be viewed in Table 6. For all DIF analyses, the Bonferroni procedure was employed to adjust the significance level as a function of the number of tests
Table 5
Correlations of ODSIS With Convergent and Discriminant Validity Measures

Measure     Outpatient   Student   Community
BDI-II      .74**        .76**     .88**
CES-D       -            .71**     .79**
DASS-D      .71**        .77**     .84**
PHQ-9       -            .76**     .87**
OASIS       -            .59**     .81**
BAI         .33**        .48**     .58**
DASS-A      .25*         .52**     .75**
DASS-S      .35**        .58**     .76**
NFFI-N      .55**        .62**     .71**
NFFI-E      -.43**       -.44**    -.28**
NFFI-O      -            -.03      -.18*
BIS         .36**        .22**     .36**
BAS         -.29**       -.12**    .10
SF-12-M     -            -.73**    -.63**
SF-12-P     -            -.11**    -.50**
Note. Values are correlations of each measure with the ODSIS; a dash indicates that the measure was not administered. Patients in the clinical sample were not administered the CES-D, PHQ-9, OASIS, NFFI-O, or SF-12 (refer to Procedures). In the student and community samples, the BAI was added partway through data collection; thus, ODSIS correlations with the BAI are based on subsets of the samples (student n = 21A, community n = 119). Tables depicting correlations between all measures can be provided by the author at the reader’s request. ODSIS = Overall Depression Severity and Impairment Scale; BDI-II = Beck Depression Inventory-Second Edition; CES-D = Center for Epidemiological Studies Depression Scale; DASS-D = Depression Anxiety Stress Scales-Depression; PHQ-9 = Patient Health Questionnaire-9; OASIS = Overall Anxiety Severity and Impairment Scale; BAI = Beck Anxiety Inventory; DASS-A = Depression Anxiety Stress Scales-Anxiety; DASS-S = Depression Anxiety Stress Scales-Stress; NFFI-N = NEO Five Factor Inventory-Neuroticism; NFFI-E = NEO Five Factor Inventory-Extraversion; BIS = Behavioral Inhibition Scale; BAS = Behavioral Activation Scale; SF-12-M = Short-Form Health Survey-Mental Health Subscale; SF-12-P = Short-Form Health Survey-Physical Health Subscale.
*p < .05. **p < .001.
Table 6
Differential Item Functioning Tests

                 All equal         Equal discrimination (a)   Equal thresholds (b)
Student and outpatient samples (n = 666)
                 χ²(5)    p        χ²(1)    p                 χ²(4)    p
  ODSIS Item 1   0.7      .983     -        -                 -        -
  ODSIS Item 2   7.5      .186     -        -                 -        -
  ODSIS Item 3   0.7      .983     -        -                 -        -
  ODSIS Item 4   6.0      .306     -        -                 -        -
  ODSIS Item 5   0.7      .983     -        -                 -        -
Community and outpatient samples (n = 289)
                 χ²(5)    p        χ²(1)    p                 χ²(4)    p
  ODSIS Item 1   0.0      1.00     -        -                 -        -
  ODSIS Item 2   4.0      .549     -        -                 -        -
  ODSIS Item 3   0.0      1.00     -        -                 -        -
  ODSIS Item 4   6.2      .287     -        -                 -        -
  ODSIS Item 5   0.0      1.00     -        -                 -        -
Student and community samples (n = 755)
                 χ²(5)    p        χ²(1)    p                 χ²(4)    p
  ODSIS Item 1   1.9      .862     -        -                 -        -
  ODSIS Item 2   27.0     .000     9.2      .002              17.8     .001
  ODSIS Item 3   1.9      .862     -        -                 -        -
  ODSIS Item 4   18.8     .002     -        -                 -        -
  ODSIS Item 5   1.9      .862     -        -                 -        -
Note. A Bonferroni-adjusted significance level of 0.00111 was used for differential item functioning (DIF) analyses to account for the increased possibility of Type I error. ODSIS = Overall Depression Severity and Impairment Scale. A dash means that a and b parameters were not calculated because the omnibus test for DIF did not exceed the critical value for the adjusted significance level.
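
The adjusted significance level in the note to Table 6 is consistent with dividing a family-wise alpha of .05 by the 45 tests. The sketch below assumes that .05 starting level, which the text does not state, and applies the resulting threshold to the omnibus p values reported for the student versus community comparison. Variable names are illustrative only.

```python
# Family-wise alpha of .05 is an assumption; the source reports only the
# number of tests (45) and the adjusted level (0.00111).
FAMILYWISE_ALPHA = 0.05
N_TESTS = 45
adjusted_alpha = FAMILYWISE_ALPHA / N_TESTS

# Omnibus ("all equal") p values, student vs. community samples (Table 6).
# Item 2 is reported as .000, that is, rounded to three decimals.
omnibus_p = {
    "Item 1": 0.862,
    "Item 2": 0.000,
    "Item 3": 0.862,
    "Item 4": 0.002,
    "Item 5": 0.862,
}
flagged_items = [item for item, p in omnibus_p.items() if p < adjusted_alpha]

print(f"adjusted alpha = {adjusted_alpha:.5f}")
print("omnibus DIF below adjusted alpha:", flagged_items)
```

Only Item 2 falls below the adjusted threshold; Item 4 (p = .002) does not, which is why follow-up a and b tests appear in Table 6 for Item 2 alone.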
performed simultaneously (n = 45) to account for the increased possibility of Type I error. As such, an adjusted significance level of 0.0011 was used. In the comparison between the outpatient and student samples, and the outpatient and community samples, there were no statistically significant omnibus tests for DIF. These findings suggest that the ODSIS functions equivalently in outpatient compared to nonclinical samples.

In the comparison between the two nonclinical samples, the omnibus test for DIF exceeded the critical value for the adjusted significance level in the second item of the ODSIS, which suggests this item may function differently in student and community-based samples. However, an examination of the secondary tests of equal discrimination and equal thresholds suggests that it is unlikely that uniform or nonuniform DIF was present in this item. First, the test of equal discrimination (a) did not exceed the critical value for the adjusted significance level, which indicates that nonuniform DIF was not present. Second, the test of equal thresholds (b) for the second ODSIS item also did not reach the critical value for the adjusted significance level, which suggests that uniform DIF was not present. These findings support the notion that differences between the student and community-based samples in responses to the second ODSIS item do not reflect different relationships to the latent depression construct. Considered together, the results from DIF analyses indicate that the ODSIS functions similarly across outpatient, undergraduate student, and community-based populations.

Discussion

The present study aimed to evaluate the psychometric properties of the ODSIS, a brief self-report measure of depression severity and impairment. The ODSIS was adapted directly from the OASIS (Norman et al., 2006), an assessment tool that captures anxiety-related severity and impairment across the anxiety disorders and with subthreshold anxiety symptoms. This was the first study to examine the internal consistency and latent structure of the ODSIS, as well as whether the measure distinguishes among the full range of depression severity levels and/or is differentially informative across the ranges of latent levels of depression. Additionally, appropriate cut scores were determined for using the ODSIS as a screening tool for identifying individuals with mood disorder diagnoses. We also examined convergent and discriminant validity of the ODSIS, and whether the measure functions differently in outpatient and nonclinical samples. Collectively, findings suggest that the ODSIS is a valid instrument for measurement of depression severity and impairment and is well suited for use with both treatment-seeking and nonclinical populations.

With regard to the internal consistency and dimensionality of the ODSIS, the five ODSIS items showed excellent internal consistency, and the measure was found to possess a unidimensional factor structure. Specifically, a single-factor model provided good fit to the ODSIS data in the clinic- and community-based samples, and acceptable fit to the data in the student sample. Considered together, fit statistics across the three samples suggested that a single-factor solution fit the data well, which supported the unidimensional factor structure of the ODSIS. As with the OASIS (e.g., Campbell-Sills et al., 2009), summing the five ODSIS items to generate a total score is recommended for both outpatient and nonclinical samples.

Along similar lines, we employed IRT methods of analysis to determine how well the ODSIS captured the full range of severity and impairment due to depressive symptoms. Valuable informa-
tion regarding the coverage of latent levels of depression was obtained for individual ODSIS items from item difficulty parameters and for the overall ODSIS from total information curves. Results from all three samples demonstrated that the ODSIS items are difficult, in that individuals with average levels of depression-related severity and impairment are likely to endorse low responses (e.g., “Infrequent depression,” “Depression was absent or barely noticeable”). Thus, the ODSIS primarily provides information about individuals with high depression-related severity and impairment, and is less able to differentiate between individuals with below-average levels of depression. This means that the most measurement error is likely to exist when clinicians or researchers use the ODSIS to distinguish among individuals with very low depressive symptomatology. These findings indicate that the ODSIS may be particularly well-suited for situations when the aim is to detect imminent clinical depression (e.g., screenings, treatment outcome research, monitoring progress in clinical settings). Of note, the discrimination value for the fifth ODSIS item in the outpatient sample was relatively large, and although a variety of potential explanations (e.g., missing data, skewness, kurtosis) were explored, we were unable to determine a likely cause. Although unlikely, the possibility that this item functions differently than other ODSIS items in outpatient samples warrants further study in future investigations.

Furthermore, ODSIS scores largely discriminated well between individuals with and without mood disorder diagnoses. For both student and community participants, results showed strong effects of both mood disorder and treatment for depression status on ODSIS score. The largest effects were observed for a current mood disorder diagnosis, which suggests that the ODSIS effectively captured current severity and impairment of depressive symptoms, in line with the “past week” time frame of the measure. In the

mood disorder after controlling for chance was observed. These findings support the use of a cut score of 8 when employing the ODSIS as a screening instrument for clinical depression in outpatient settings. It is noteworthy that an optimal score of ≥ 8 on the OASIS was determined in two previous investigations to be the most efficient score for successful classification of probable anxiety disorders (Campbell-Sills et al., 2009; Norman et al., 2011). These equivalent cut scores suggest that the ODSIS and OASIS may be user-friendly tools aptly suited for concurrent screening of depression and anxiety in outpatient settings. Moreover, given that the present study only employed the clinician-rated ADIS-IV-L in the outpatient sample, future research should use a reliable diagnostic indicator among student and community samples in order to conduct similar sensitivity and specificity analyses among nonclinical populations in which screening for depression is critical. In light of high comorbidity rates among anxiety and depressive disorders (e.g., Kessler et al., 2012), the potential ability to simultaneously detect both types of emotional disorders with two analogous scales that can be completed together in approximately five minutes is valuable. Moving forward, treatment outcome studies for depression that utilize the ODSIS should also explore appropriate cut scores for identifying patients who meet criteria for each of the “Five Rs” (i.e., Response, Remission, Relapse, Recovery, Recurrence; Thase, 1992).

Regarding the diagnostic utility of the ODSIS relative to other well-established depression scales, it should be noted that a previous study of the BDI in a large sample of anxiety patients determined that when employing the measure’s optimal cut score, nearly 30% of the sample was misclassified (Sloan et al., 2002). As previously discussed, the optimal cut score for the five-item ODSIS incorrectly classified 18% of the present outpatient sample, whereas the optimal cut score for the 21-item BDI-II misclassified
outpatient sample, there was also a strong effect of mood disorder 17% and the optimal cut score for the seven-item DASS-D mis
status on ODSIS score, in that individuals with a current principal classified 29% of the sample. Unlike those for the ODSIS, no
or co-occurring mood disorder at a clinical level reported signifi BDI-II or DASS-D cut score in the present sample met the
cantly higher ODSIS scores than those without. Overall, these criterion for favorable sensitivity and specificity put forth by
findings support the utility of the ODSIS as a screening measure Matthey and Petrovski (2002). Further, agreement between the
for probable depression in clinical and nonclinical settings. ODSIS and ADIS-IV-L classifications after correcting for chance
Additionally, the ODSIS showed a strong ability to distinguish at the optimal cut score was higher than that of the BDI-II or
among the range of subclinical and clinical mood disorder severity DASS-D. Taken together, these findings suggest that despite its
levels. Of note, the five-item ODSIS was better able to discrimi brevity, the ODSIS is a similarly (if not more) accurate screening
nate between mood disorder CSRs than other longer, widely used instrument for depressive disorders in outpatient psychiatric set
indicators of depression. As previously described, the ODSIS does tings as compared to longer, and therefore more burdensome,
not focus on individual symptoms as do most other depression measures. Future research should compare the diagnostic utility of
scales but instead assesses depression-related severity and impair the ODSIS to that of other common self-report measures of de
ment (in a variety of domains) of heterogeneous depressive disor pression across a variety of populations.
ders and subsyndromal symptoms. Thus, findings that the ODSIS The ODSIS also evidenced large positive correlations with a
better captured the full range of depression severity than other variety of well-validated indicators of depression in all three
measures suggest that it may fulfill the need for a brief, yet samples, indicative of convergent validity. By including measures
accurate self-report scale assessing severity and interference across of N/BI, it was also possible to examine the degree to which this
clinical mood disorders and subsyndromal depression. higher order dimension that has been shown to contribute substan
It was found that a score of 8 or higher on the ODSIS correctly tially to depression (e.g., Brown & Barlow, 2009) was associated
classified 74% of patients with a mood disorder, 85% of patients with ODSIS scores. Findings from the outpatient sample showed
without a mood disorder, and, thus, 82% of the entire outpatient that the ODSIS had medium to large associations with indicators
sample. Of note, this ODSIS score achieved the sensitivity/speci- of N/BI, and that these associations were stronger than those
ficity criterion needed for a worthwhile cut score suggested by between the ODSIS and measures of anxiety but weaker than those
Matthey and Petrovski (2002; i.e., sensitivity of .70 and specificity with indicators of depression. These results are consistent with
of .80). Also at a cut score of 8, the highest correspondence prior research demonstrating that although N/BI is relevant to the
between ODSIS classification and actual diagnosis of unipolar range of emotional disorders, this higher order dimension evi-
dences the strongest relationships with depression and generalized participants, our outpatient sample was not exclusively depressed;
anxiety disorder (e.g., Brown & Barlow, 2009; Brown, Chorpita, in fact, only 27% met criteria for a current mood disorder of
& Barlow, 1998). These findings are also in line with our expec clinical severity. Furthermore, outpatient and community-based
tation that the ODSIS would show the strongest associations with participants also scored similarly on the well-validated BDI-II
depression but would also be related to indicators of N/BI. Also in (M = 14.69, SD = 11.51 and M = 15.86, SD = 13.47, respec
the outpatient sample, the ODSIS evidenced small to medium tively; g = -.0 9 ) and DASS-D (M = 12.37, SD = 11.28 and M =
associations in the expected directions with indicators of anxiety, 11.55, SD = 10.82, respectively; g = .07). Thus, it is likely that
stress, extraversion, and behavioral activation; of note, these cor these findings are due to the nature of the present outpatient and
relations were weaker than those between the ODSIS and mea community-based samples rather than the ODSIS not differentiat
sures of depression, thereby indicative of discriminant validity. ing well between clinical and community individuals.
Collectively, these results suggest that responses to the ODSIS Although results from the present research are promising, sev
were well able to discriminate between depression and closely eral limitations should be considered. First, the outpatient sample
related, yet distinct constructs in an outpatient sample. This serves was limited in the range of presenting symptoms, as all patients
as a notable strength of the ODSIS, and supports its potential had a principal mood or anxiety disorder diagnosis. Additionally,
utility as a screening and treatment outcome instrument for de prior to the intake evaluation, individuals reporting clear psychotic
pression specifically, rather other types of psychological distress. symptoms, several recent hospitalizations due to suicidality, or
Among the nonclinical samples, the ODSIS largely evidenced substance dependence were referred elsewhere and thus not in
small to medium correlations with indicators of N/BI, anxiety, and cluded in the present sample. These factors limit the ability to
stress. In the student sample, associations between the ODSIS and generalize our findings to patient populations with more diverse
measures of N/BI were ranged from small to large magnitudes, yet mental health conditions. Future studies that utilize clinical sam
weaker than correlations with indicators of depression; these find ples in which individuals do not necessarily meet criteria for a
ings are also theoretically consistent with prior work suggesting principal anxiety or mood disorder diagnosis are therefore war
strong relationships between N/BI and depression (e.g., Brown & ranted. As previously noted, the outpatient sample also evidenced
Barlow, 2009). Correlations between the ODSIS and indicators of levels of depressive symptom severity similar to those of the
anxiety and stress were of medium to large magnitudes, and community-based sample, which presents the need for evaluations
weaker than those with measures of depression, with one notewor of the ODSIS in patients presenting with more severe depression
thy exception of the association between the ODSIS and OASIS in symptomatology. Further, as with the OASIS (Norman et al.,
the community-based sample. Although this particular correlation 2011), research is needed with samples that include a larger
may suggest problems with the discriminant validity of the ODSIS proportion of depressed nonanxious participants, as approximately
among community-based individuals, it is also important to note 75% of patients in the current study met criteria for a principal
that the association between the DASS-D and DASS-A was sim anxiety disorder without co-occurring depression. Findings from
ilar to that between the ODSIS and OASIS (.83 and .81, respec this line of research could potentially lend incremental support to
tively) in the community sample. These findings suggest that high the ability of the ODSIS to precisely discriminate depressive from
symptom overlap existed among these individuals (e.g., Brown & anxiety symptoms across the full continuum of depression.
Barlow, 2009; Brown, Campbell, Lehman, Grisham, & Mancill, With regard to other limitations, the ODSIS was not adminis
2001), rather than an issue with the discriminant validityof the tered repeatedly over time in the present study; as a result, eval
ODSIS. However, the possibility remains that the ODSIS may not uations of test-retest reliability or sensitivity to change were not
discriminate well between depression and anxiety in community possible. This serves as the next logical step in the psychometric
compared to student or outpatient samples; thus, when the ODSIS evaluation of the ODSIS. As previously noted, existing symptom-
is used with community-based individuals, it may be important to based depression measures evidence questionable sensitivity to
employ clinician-rated diagnostic interviews as a supplementary change, and scales that assess a broad variety of depression-related
form of assessment in order to facilitate differential diagnosis of domains, rather than an exclusive focus on symptoms, may be
mood and other disorders with overlapping features. preferable for determining treatment outcome. Recent findings
Last, differential item functioning analyses were used to exam show that the OASIS possesses strong sensitivity to change during
ine whether responses to ODSIS items varied between samples anxiety treatment (Norman et al., 2013); given the highly similar
with regard to mean levels of depression-related severity and structure, item content, and psychometric performance of these
impairment, or relations to the underlying latent trait. This set of two measures, we anticipate that the ODSIS will also be well able
analyses demonstrated equivalence of measurement between the to capture change in depression severity and impairment during
samples, thereby indicating that the ODSIS functions similarly in treatment. In addition, convergent and divergent validity analyses
outpatient, student, and community samples. These findings sug were largely conducted with data gleaned from self-report mea
gest that ODSIS responses obtained from outpatients, undergrad sures, which poses the possibility that observed relationships with
uates, and community-based adults are likely to translate well the ODSIS were a function of method effects. In the two nonclini
across populations. In sum, results support the simultaneous use of cal samples, data indicating mood disorder and treatment status
the ODSIS in research and treatment efforts conducted with a were generated entirely from self-report questions administered to
variety of clinical and nonclinical individuals. It is important to a subset of each sample; thus, analyses of convergent and discrim
note that, in the present study, similar total ODSIS scores were inant validity, as well as the degree to which the ODSIS distin
observed among outpatient and community-based individuals. Al guishes between nonclinical participants with and without depres
though one might expect outpatients to score higher on a measure sion, should be replicated in studies using multiple methods of
of depression-related severity and impairment than community assessment. Finally, these findings indicated that, in the commu-
nity sample, males had higher ODSIS scores than did females. In and mood disorders in a large clinical sample. Journal o f Abnormal
light of evidence that the prevalence of depression is generally Psychology, 110, 585-599. doi: 10.1037/0021-843X.110.4.585
higher in females among community samples (e.g., Blazer, Kes Brown, T. A., Chorpita, B. F., & Barlow, D. H. (1998). Structural relation
sler, McGonagle, & Swartz, 1994; Seedat et al., 2009), these ships among dimensions of the DSM-IV anxiety and mood disorders and
findings are noteworthy. Future research might examine whether dimensions of negative affect, positive affect, and autonomic arousal. Jour
nal o f Abnormal Psychology, 107, 179-192. doi:10.1037/0021-843X
community-based adult men consistently endorse higher responses
.107.2.179
on the ODSIS than their female counterparts.
Brown, T. A., Di Nardo, P. A., Lehman, C. L., & Campbell, L. A. (2001).
Reliability of DSM -IV anxiety and mood disorders: Implications for the
Conclusions classification of emotional disorders. Journal o f Abnormal Psychology,
This investigation demonstrates that the ODSIS is a reliable and 110, 49-58. doi: 10.1037/0021-843X. 110.1.49
valid method for assessing depression-related severity and impair Cameron, I. M., Crawford, J. R., Lawton, K., & Reid, I. C. (2008).
Psychometric comparison of PHQ-9 and HADS for measuring depres
ment in outpatient and nonclinical samples. Strengths of the
sion severity in primary care. British Journal o f General Practice, 58,
ODSIS include its brevity, broad applicability across the mood
32-36. doi: 10.3399/bjgp08X263794
disorders and with subsyndromal depression, accuracy in detecting
Campbell-Sills, L., Liverant, G. I., & Brown, T. A. (2004). Psychometric
clinical mood disorders in outpatient settings, and targeted focus evaluation of the Behavioral Inhibition/Behavioral Activation Scales in
on severity and functional impairment due to all levels of depres a large sample of outpatients with anxiety and mood disorders. Psycho
sion severity, rather than individual depressive symptoms. The logical Assessment, 16, 244-254. doi: 10.1037/1040-3590.16.3.244
ODSIS may prove an efficient, user-friendly screening tool to Campbell-Sills, L., Norman, S. B., Craske, M. G., Sullivan, G., Lang,
identify depressed individuals in research or clinical settings. The A. L., Chavira, D. A., . . . Stein, M. B. (2009). Validation of a brief
ODSIS is also a promising instrument for evaluating the effects of measure of anxiety-related severity and impairment: The Overall Anx
interventions aimed at depression in the context of related disor iety Severity and Impairment Scale (OASIS). Journal o f Affective Dis
ders (e.g., anxiety) that often co-occur with depression. The utility orders, 112, 92-101. doi:10.1016/j.jad.2008.03.014
of the ODSIS would benefit from additional validation in more Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral
diverse settings in which there is a need for brief, accurate assess activation, and affective responses to impending reward and punishment:
ment of heterogeneous depressive symptomatology. The BIS/BAS scales. Journal o f Personality and Social Psychology, 67,
319 -333. doi: 10.1037/0022-3514.67.2.319
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educa
References
tional and Psychological Measurement, 20, 37—46. doi: 10.1177/
American Psychiatric Association. (1994). Diagnostic and statistical man 001316446002000104
ual o f mental disorders (4th ed.). Washington, DC: Author. Cohen, J. (1988). Statistical power analysis fo r the behavioral sciences.
American Psychiatric Association. (2013). Diagnostic and statistical man New York, NY: Routledge Academic.
ual o f mental disorders (5th ed.). Arlington, VA: American Psychiatric Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality
Publishing. Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) pro
Antony, M. M., Bieling, P. J., Cox, B. J., Enns, M. W., & Swinson, R. P. fessional manual. Odessa, FL: Psychological Assessment Resources.
(1998). Psychometric properties of the 42-item and 21-item versions of Di Nardo, P. A., Brown, T. A., & Barlow, D. H. (1994). Anxiety disorders
the Depression Anxiety Stress Scales in clinical groups and a community interview schedule fo r DSM-IV: Lifetime version (ADIS-1V-L). New
sample. Psychological Assessment, 10, 176-181. doi: 10.1037/1040- York, NY: Oxford University Press.
3590.10.2.176 Edwards, B. C., Lambert, M. J., Moran, P. W., McCully, T., Smith, K. C.,
Beck, A. T., & Steer, R. A. (1987). Manual fo r the revised Beck Depres & Ellingson, A. G. (1984). A meta-analytic comparison of the Beck
sion Inventory. San Antonio, TX: Psychological Corporation. Depression Inventory and the Hamilton Rating Scale for Depression as
Beck, A. T., & Steer, R. A. (1993). Beck Anxiety Inventory manual. San measures of treatment outcome. British Journal o f Clinical Psychology,
Antonio, TX: Psychological Corporation. 23, 93-99. doi: 10.111 l/j.2044-8260.1984.tb00632.x
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual fo r the Beck
Embretson, S. E., & Reise, S. (2000). Item response theory fo r psycholo
Depression Inventory-ll. San Antonio, TX: Psychological Corporation.
gists. Mahwah, NJ: Erlbaum.
Blazer, D. G., Kessler, R. C., McGonagle, K. A., & Swartz, M. S. (1994).
Fydrich, T., Dowdall, D., & Chambless, D. L. (1992). Reliability and
The prevalence and distribution of major depression in a national com
validity of the Beck Anxiety Inventory. Journal o f Anxiety Disorders, 6,
munity sample: The National Comorbidity Survey. American Journal o f
55-61. doi: 10.1016/0887-6185(92)90026-4
Psychiatry, 151, 979-986.
Gandek, B., Ware, J. E., Aaronson, N. K., Apolone, G., Bjomer, J. B.,
Brouwer, D., Meijer, R. R., & Zevalkink, J. (2013). Measuring individual
Brazier, J. E .,. . . Sullivan, M. (1998). Cross-validation of item selection
significant change on the Beck Depression Inventory-II through IRT-
and scoring for the SF-12 health survey in nine countries: Results from
based statistics. Psychotherapy Research, 23, 489-501. doi: 10.1080/
the IQOLA project. Journal o f Clinical Epidemiology, 51, 1171-1178.
10503307.2013.794400
Brown, T. A. (2007). Temporal course and structural relationships among doi: 10.1016/S0895-4356(98)00109-7
dimensions of temperament and DSM -IV anxiety and mood disorder Henry, J. D., & Crawford, J. R. (2005). The short-form version of the
constructs. Journal o f Abnormal Psychology, 116, 313-328. doi: Depression Anxiety Stress Scales (DASS-21): Construct validity and
10.1037/0021-843X.116.2.313 normative data in a large non-clinical sample. British Journal o f Clinical
Brown, T. A., & Barlow, D. H. (2009). A proposal for a dimensional Psychology, 44, 227-239. doi:10.1348/014466505X29657
classification system based on the shared features of the DSM -IV anx Hu, L., & Bender, P. M. (1999). Cutoff criteria for fit indexes in covariance
iety and mood disorders: Implications for assessment and treatment. structure analysis: Conventional criteria versus new alternatives. Struc
Psychological Assessment, 21, 256-271. doi:10.1037/a0016608 tural Equation Modeling, 6, 1-55. doi: 10.1080/10705519909540118
Brown, T. A., Campbell, L. A., Lehman, C. L., Grisham, J. R., & Mancill, Iwata, N., & Buka, S. (2002). Race/ethnicity and depressive symptoms: A
R. B. (2001). Current and lifetime comorbidity of the DSM -IV anxiety cross-cultural/ethnic comparison among university students in East Asia,
North and South America. Social Science & Medicine, 55, 2243-2252. doi:10.1016/S0277-9536(02)00003-5
Judd, L. L., Akiskal, H. S., & Paulus, M. P. (1997). The role and clinical significance of subsyndromal depressive symptoms (SSD) in unipolar major depressive disorder. Journal of Affective Disorders, 45, 5-17. doi:10.1016/S0165-0327(97)00055-4
Kessler, R. C., Berglund, P., Demler, O., Jin, R., Koretz, D., Merikangas, K. R., . . . Wang, P. S. (2003). The epidemiology of major depressive disorder: Results from the National Comorbidity Survey Replication (NCS-R). Journal of the American Medical Association, 289, 3095-3105. doi:10.1001/jama.289.23.3095
Kessler, R. C., Berglund, P., Demler, O., Jin, R., & Walters, E. E. (2005). Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication. Archives of General Psychiatry, 62, 593-602. doi:10.1001/archpsyc.62.6.593
Kessler, R. C., Petukhova, M., Sampson, N. A., Zaslavsky, A. M., & Wittchen, H. U. (2012). Twelve-month and lifetime prevalence and lifetime morbid risk of anxiety and mood disorders in the United States. International Journal of Methods in Psychiatric Research, 21, 169-184. doi:10.1002/mpr.1359
Kroenke, K., Spitzer, R. L., & Williams, J. B. W. (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606-613. doi:10.1046/j.1525-1497.2001.016009606.x
Lovibond, S. H., & Lovibond, P. F. (1995). Manual for the Depression Anxiety Stress Scales (2nd ed.). Sydney: Psychology Foundation of Australia.
Matthey, S., & Petrovski, P. (2002). The Children’s Depression Inventory: Error in cutoff scores for screening purposes. Psychological Assessment, 14, 146-149. doi:10.1037/1040-3590.14.2.146
Muthén, L. K., & Muthén, B. O. (2007). Mplus user’s guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
Norman, S. B., Allard, C. B., Trim, R. S., Thorp, S. R., Behrooznia, M., Masino, T. T., & Stein, M. B. (2013). Psychometrics of the overall anxiety severity and impairment scale (OASIS) in a sample of women with and without trauma histories. Archives of Women’s Mental Health, 16, 123-129. doi:10.1007/s00737-012-0325-8
Norman, S. B., Campbell-Sills, L., Hitchcock, C. A., Sullivan, S., Rochlin, A., Wilkins, K. C., & Stein, M. B. (2011). Psychometrics of a brief measure of anxiety to detect severity and impairment: The Overall Anxiety Severity and Impairment Scale (OASIS). Journal of Psychiatric Research, 45, 262-268. doi:10.1016/j.jpsychires.2010.06.011
Norman, S. B., Cissell, S. H., Means-Christensen, A. J., & Stein, M. B. (2006). Development and validation of an overall anxiety severity and impairment scale (OASIS). Depression and Anxiety, 23, 245-249. doi:10.1002/da.20182
Paykel, E. S., Ramana, R., Cooper, Z., Hayhurst, H., Kerr, J., & Barocka, A. (1995). Residual symptoms after partial remission: An important outcome in depression. Psychological Medicine, 25, 1171-1180. doi:10.1017/S0033291700033146
Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401. doi:10.1177/014662167700100306
Roberts, R. E. (1980). Reliability of the CES-D scale in different ethnic contexts. Psychiatry Research, 2, 125-134. doi:10.1016/0165-1781(80)90069-4
Robins, R. W., Fraley, R. C., Roberts, B. W., & Trzesniewski, K. H. (2001). A longitudinal study of personality change in young adulthood. Journal of Personality, 69, 617-640. doi:10.1111/1467-6494.694157
Rush, A. J., Trivedi, M. H., Ibrahim, H. M., Carmody, T. J., Arnow, B., Klein, D. N., . . . Keller, M. B. (2003). The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): A psychometric evaluation in patients with chronic major depression. Biological Psychiatry, 54, 573-583. doi:10.1016/S0006-3223(02)01866-8
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores (Psychometric Monograph No. 17). Richmond, VA: Psychometric Society.
Samejima, F. (1997). Graded response model. In W. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 85-100). New York, NY: Springer.
Sayer, N. A., Sackeim, H. A., Moeller, J. R., Prudic, J., Devanand, D. P., Coleman, E. A., & Kiersky, J. E. (1993). The relations between observer-rating and self-report of depressive symptomatology. Psychological Assessment, 5, 350-360. doi:10.1037/1040-3590.5.3.350
Schneibel, R., Brakemeier, E., Wilbertz, G., Dykierek, P., Zobel, I., & Schramm, E. (2012). Sensitivity to detect change and the correlation of clinical factors with the Hamilton Depression Rating Scale and the Beck Depression Inventory in depressed inpatients. Psychiatry Research, 198, 62-67. doi:10.1016/j.psychres.2011.11.014
Seedat, S., Scott, K. M., Angermeyer, M. C., Berglund, P., Bromet, E. J., Brugha, T. S., . . . Kessler, R. C. (2009). Cross-national associations between gender and mental disorders in the World Health Organization World Mental Health Surveys. Archives of General Psychiatry, 66, 785-795. doi:10.1001/archgenpsychiatry.2009.36
Sloan, D. M., Marx, B. P., Bradley, M. M., Strauss, C. C., Lang, P. J., & Cuthbert, B. C. (2002). Examining the high-end specificity of the Beck Depression Inventory using an anxiety sample. Cognitive Therapy and Research, 26, 719-722. doi:10.1023/A:1021233215457
Smith, L. L. (2002). On the usefulness of item bias analysis to personality psychology. Personality and Social Psychology Bulletin, 28, 754-763. doi:10.1177/0146167202289005
Storch, E. A., Roberti, J. W., & Roth, D. A. (2004). Factor structure, concurrent validity, and internal consistency of the Beck Depression Inventory-Second Edition in a sample of college students. Depression and Anxiety, 19, 187-189. doi:10.1002/da.20002
Thase, M. E. (1992). Long-term treatments of recurrent depressive disorders. Journal of Clinical Psychiatry, 53, 32-44.
Thissen, D. (2001). IRTLRDIF v. 2.0b: Software for the computation of the statistics involved in item response theory likelihood-ratio tests for differential item functioning. [Computer software]. Chapel Hill, NC: Thurstone Psychometric Library.
Ware, J., Jr., Kosinski, M., & Keller, S. D. (1996). A 12-item short-form health survey: Construction of scales and preliminary tests of reliability and validity. Medical Care, 34, 220-233. doi:10.1097/00005650-199603000-00003
Wells, K. B., Stewart, A., Hayes, R. D., Burnam, M. A., Rogers, W., Daniels, M., . . . Ware, J. (1989). The functioning and well-being of depressed patients: Results from the Medical Outcomes Study. Journal of the American Medical Association, 262, 914-919. doi:10.1001/jama.1989.03430070062031
Wittkampf, K. A., Naeiji, L., Schene, A. H., Huyser, J., & van Weert, H. C. (2007). Diagnostic accuracy of the mood module of the Patient Health Questionnaire: A systematic review. General Hospital Psychiatry, 29, 388-395. doi:10.1016/j.genhosppsych.2007.06.004
Zimmerman, M., Chelminski, I., McGlinchey, J. B., & Posternak, M. A. (2008). A clinically useful depression outcome scale. Comprehensive Psychiatry, 49, 131-140. doi:10.1016/j.comppsych.2007.10.006
Zimmerman, M., Galione, J. N., Attiullah, N., Friedman, M., Toba, C., Boerescu, D. A., & Ragheb, M. (2011). Depressed patients’ perspective of two measures of outcome: The Quick Inventory of Depressive Symptomatology (QIDS) and the Remission from Depression Questionnaire (RDQ). Annals of Clinical Psychiatry, 23, 208-212.
Zimmerman, M., Martinez, J. H., Attiullah, N., Friedman, M., Toba, C., Boerescu, D. A., & Ragheb, M. (2013). A new type of scale for determining remission from depression: The Remission from Depression Questionnaire. Journal of Psychiatric Research, 47, 78-82. doi:10.1016/j.jpsychires.2012.09.006
Zimmerman, M., Young, D., Chelminski, I., Dalrymple, K., & Galione, J. N. (2012). Overcoming the problem of diagnostic heterogeneity in applying measurement-based care in clinical practice: The concept of psychiatric vital signs. Comprehensive Psychiatry, 53, 117-124. doi:10.1016/j.comppsych.2011.03.004

Appendix

Overall Depression Severity and Impairment Scale

The following items ask about depression. For each item, circle the number for the answer that best describes your experience over the past week.

1. In the past week, how often have you felt depressed?
0 = No depression in the past week.
1 = Infrequent depression. Felt depressed a few times.
2 = Occasional depression. Felt depressed as much of the time as not.
3 = Frequent depression. Felt depressed most of the time.
4 = Constant depression. Felt depressed all of the time.

2. In the past week, when you have felt depressed, how intense or severe was your depression?
0 = Little or None: Depression was absent or barely noticeable.
1 = Mild: Depression was at a low level.
2 = Moderate: Depression was intense at times.
3 = Severe: Depression was intense much of the time.
4 = Extreme: Depression was overwhelming.

3. In the past week, how often did you have difficulty engaging in or being interested in activities you normally enjoy because of depression?
0 = None: I had no difficulty engaging in or being interested in activities that I normally enjoy because of depression.
1 = Infrequent: A few times I had difficulty engaging in or being interested in activities that I normally enjoy because of depression. My lifestyle was not affected.
2 = Occasional: I had some difficulty engaging in or being interested in activities that I normally enjoy because of depression. My lifestyle has only changed in minor ways.
3 = Frequent: I have considerable difficulty engaging in or being interested in activities that I normally enjoy because of depression. I have made significant changes in my lifestyle because of being unable to become interested in activities I used to enjoy.
4 = All the Time: I have been unable to participate in or be interested in activities that I normally enjoy because of depression. My lifestyle has been extensively affected and I no longer do things that I used to enjoy.

4. In the past week, how much did your depression interfere with your ability to do the things you needed to do at work, at school, or at home?
0 = None: No interference at work/home/school from depression.
1 = Mild: My depression has caused some interference at work/home/school. Things are more difficult, but everything that needs to be done is still getting done.
2 = Moderate: My depression definitely interferes with tasks. Most things are still getting done, but few things are being done as well as in the past.
3 = Severe: My depression has really changed my ability to get things done. Some tasks are still being done, but many things are not. My performance has definitely suffered.
4 = Extreme: My depression has become incapacitating. I am unable to complete tasks and have had to leave school, have quit or been fired from my job, or have been unable to complete tasks at home and have faced consequences like bill collectors, eviction, etc.

5. In the past week, how much has depression interfered with your social life and relationships?
0 = None: My depression doesn’t affect my relationships.
1 = Mild: My depression slightly interferes with my relationships. Some of my friendships and other relationships have suffered, but, overall, my social life is still fulfilling.
2 = Moderate: I have experienced some interference with my social life, but I still have a few close relationships. I don’t spend as much time with others as in the past, but I still socialize sometimes.
3 = Severe: My friendships and other relationships have suffered a lot because of depression. I do not enjoy social activities. I socialize very little.
4 = Extreme: My depression has completely disrupted my social activities. All of my relationships have suffered or ended. My family life is extremely strained.

Received June 24, 2013
Revision received January 8, 2014
Accepted January 22, 2014
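
For readers who wish to apply the scale programmatically, the following is a minimal illustrative sketch rather than an official scoring routine. It assumes that the ODSIS total is the simple sum of the five item responses (each scored 0 to 4, giving a range of 0 to 20) and uses the outpatient cut score of 8 or higher reported in the Discussion. The function names are hypothetical. The final helper shows how the reported overall classification rate follows from sensitivity, specificity, and the proportion of outpatients with a mood disorder (.74, .85, and .27, as rounded in the text).

```python
from typing import Sequence

N_ITEMS = 5      # the ODSIS has five items
MAX_ITEM = 4     # each item is answered on a 0-4 scale
CUT_SCORE = 8    # outpatient cut score reported in the Discussion


def odsis_total(responses: Sequence[int]) -> int:
    """Return the total score, assumed here to be the sum of the five items."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for r in responses:
        if isinstance(r, bool) or not isinstance(r, int) or not 0 <= r <= MAX_ITEM:
            raise ValueError(f"each response must be an integer from 0 to {MAX_ITEM}")
    return sum(responses)


def screens_positive(responses: Sequence[int], cut: int = CUT_SCORE) -> bool:
    """Flag a probable mood disorder when the total is at or above the cut score."""
    return odsis_total(responses) >= cut


def overall_accuracy(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Proportion correctly classified: true positives plus true negatives."""
    return base_rate * sensitivity + (1.0 - base_rate) * specificity


if __name__ == "__main__":
    example = [2, 2, 1, 2, 1]  # hypothetical respondent
    print(odsis_total(example), screens_positive(example))
    print(round(overall_accuracy(0.74, 0.85, 0.27), 2))
```

With the rounded values from the text, the weighted combination is .27 × .74 + .73 × .85 ≈ .82, in line with the 82% overall classification rate reported for the outpatient sample. The cut score was derived in outpatients only, so applying it to student or community respondents is an untested assumption.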
Copyright of Psychological Assessment is the property of American Psychological
Association and its content may not be copied or emailed to multiple sites or posted to a
listserv without the copyright holder's express written permission. However, users may print,
download, or email articles for individual use.

