European Psychiatry
www.cambridge.org/epa

Review/Meta-analyses

Outcome measures in forensic mental health services: A systematic review of instruments and qualitative evidence synthesis

Howard Ryland1*, Jonathan Cook2, Denis Yukhnenko1, Raymond Fitzpatrick3 and Seena Fazel1

1Department of Psychiatry, University of Oxford, Oxford, United Kingdom; 2Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom and 3Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom

Cite this article: Ryland H, Cook J, Yukhnenko D, Fitzpatrick R, Fazel S (2021). Outcome measures in forensic mental health services: A systematic review of instruments and qualitative evidence synthesis. European Psychiatry, 64(1), e37, 1–11
https://doi.org/10.1192/j.eurpsy.2021.32

Received: 01 March 2021
Revised: 11 May 2021
Accepted: 12 May 2021

Keywords: Forensic mental health services; outcome measurement; psychometrics; quality of life; risk assessment

Author for correspondence: *Howard Ryland, E-mail: howard.ryland@psych.ox.ac.uk

Abstract
Background. Outcome measurement in forensic mental health services can support service improvement, research, and patient progress evaluation. This systematic review aims to identify instruments available for use as outcome measures in this field and assess the evidence for the most common instruments, specific to the forensic context, which cover multiple outcome domains.
Methods. Studies were identified by searching seven online databases. Additional searches were then performed for 10 selected instruments to identify additional information on their psychometric properties. Instrument manuals and gray literature were reviewed for information about instrument development and content validity. The quality of evidence for psychometric properties was summarized for each instrument based on the COnsensus-based Standards for health Measurement INstruments (COSMIN) approach.
Results. A total of 435 different instruments or variants were identified. Psychometric information on the 10 selected instruments was extracted from 103 studies. All 10 instruments had a clinician reported component with only two having patient reported scales. Half of the instruments were primarily focused on risk. No instrument demonstrated adequate psychometric properties in all eight COSMIN categories assessed. Only one instrument, the Camberwell Assessment of Need: Forensic Version, had adequate evidence for its development and content validity. The most evidence was for construct validity, while none was identified for construct stability between groups.
Conclusions. Despite the large number of instruments potentially available, evidence for their use as outcome measures in forensic mental health services is limited. Future research and instrument development should involve patients and carers to ensure adequate content validity.

© The Author(s), 2021. Published by Cambridge University Press on behalf of the European Psychiatric Association. This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction
Forensic mental health services provide care for people with mental illness who pose a risk to others and have typically perpetrated acts of violence or other antisocial behaviors [1]. The structure and legal framework governing such services vary considerably between and even within countries [2,3]. Demand for such services is rising in many high-income countries, with increasing inpatient capacity [4]. Long length of stay, high staffing ratios, and the need for complex security arrangements mean that such services are expensive [5,6]. Forensic mental health services can consume a disproportionate portion of overall health budgets given the small numbers of patients [7]. Patients frequently spend many years in secure settings and continue to be subject to restrictions on discharge [8]. The consequences of recidivism are often severe for victims and their families [9]. Despite the financial and human costs, outcomes of care remain poorly understood and measurement of progress often relies on the individual approach of clinicians [10].
Measuring the outcomes of forensic mental health services is complicated. Unlike most other healthcare services, which focus exclusively on improving outcomes for patients, forensic mental health services also have the dual purpose of public protection. In many jurisdictions this is considered their primary, if not sole, purpose. In other forensic mental health systems, however, there is an increasing recognition that patient-centered outcomes must also be prioritized [11,12]. Previous research has frequently focused on objective outcomes, such as rehospitalization, reoffending and death, usually obtained from administrative datasets [13]. While such outcomes are clearly important, they are relatively uncommon and may only occur after considerable time has elapsed, limiting their usefulness for regularly monitoring progress. Over the past three decades, there has been increasing interest in standardized questionnaires to quantify progress in a more nuanced way [14–16]. These questionnaires have predominately

https://doi.org/10.1192/j.eurpsy.2021.32 Published online by Cambridge University Press



sought to reflect the assessment of the treating clinical teams, although more recent developments have also considered the views of patients themselves [17,18]. In practice, what constitutes progress varies considerably between services. Progress may be formally defined and based on objective criteria, for example as a move to a lower level of security or discharge to the community [19]. Alternatively, progress may be shown by more internal, less externally measured changes, such as a psychological shift toward responsibility for previous injurious actions [20]. Progress should therefore address therapeutic as well as risk reduction interventions. The questionnaires used to assess progress in clinical practice have not always been explicitly developed for this purpose [21]. Thus, dynamic risk assessment, needs assessments, and decision aids for determining the level of security have all been used as measures of outcome in forensic mental health services.
Policy programs are increasingly concerned with measuring outcomes across health services [22,23]. Driving principles highlight the need for measures to reflect the concerns of stakeholders, with adequate psychometric properties for their use as outcome measures [24]. The COnsensus-based Standards for health Measurement INstruments (COSMIN) group has developed a taxonomy to define the various qualities of an instrument that can make it a good outcome measure [25]. This includes aspects of validity, reliability, and responsiveness. Validity concerns whether an instrument actually measures the concept of interest, reliability whether it does so consistently, and responsiveness whether it is able to detect change over time. Instruments need to demonstrate good psychometric properties relevant to how they are used in practice. Measurement can be used at an individual level in determining a patient’s pathway. This can support patients to understand their own progress and to evaluate aspects of their treatment. It can also be used at a systemic level for quality assurance, allocation of resources, service evaluation, and research [26]. International initiatives have agreed common sets of outcome measures for similar clinical services and to be used in clinical trials to facilitate synthesis of individual study findings [27]. Understanding of psychometrics has evolved, placing greater emphasis on good content validity. Content validity asks the question of whether an instrument measures the concept that it is intended to measure. This concept should reflect those outcomes that are most important for stakeholders, including patients [28].
Previous reviews of outcome measures in forensic settings have identified a large number of questionnaire-based instruments in clinical practice and research settings [16,29]. These previous reviews noted a focus on risk and clinical symptoms, neglecting quality of life and functional outcomes. They also highlighted the lack of patient involvement in the development and rating of these instruments.
The present study seeks to update the evidence base, as previous reviews were completed almost a decade ago or only consider a small subset of measures [30]. It aims to identify existing instruments from published literature which have been, or could be, used as outcome measures. To ensure that the full range of instruments used in practice is included, we used a wide definition of what constitutes an outcome measure. This includes all instruments with a dynamic component that could be used to measure change over time, regardless of whether these were originally designed to be, or are termed as, an “outcome measure.” In this context, dynamic components measure indicators that vary with time, where this variation may have a significant effect on the measurement result, in contrast to static items that measure historical factors, such as previous behaviors, which will not change on repeated measurement. Observed changes could be the result of a number of factors, including response to treatment and variations in symptoms. We decided to focus our quality assessment on instruments that are multidimensional, as these are more likely to be relevant to routine clinical practice in forensic services, where multiple outcomes are assessed for each patient. Although it is possible to combine many different instruments that are narrowly focused on measuring single domains, this can be cumbersome and time-consuming in clinical practice, and multidimensional instruments can reduce clinician burden. We gave equal weight to patient-centered and service outcomes. The four outcome dimensions we consider are risk, clinical symptoms, recovery (including functioning), and quality of life. We also prioritize instruments that are specific to the forensic context over more generic instruments. We identify the 10 instruments most frequently occurring within the literature that are also multidimensional and forensic specific. We then assess their quality, including development and content validity, drawing on the latest consensus-based approaches for evaluating instruments for the purpose of measuring outcomes from COSMIN [31]. To the best of our knowledge, this is the first review of this type to apply the COSMIN criteria to outcome measures in forensic mental health services. The purpose of the quality assessment was to determine how well the selected instruments function as outcome measures, using the COSMIN criteria as a benchmark, and not to determine the appropriateness of other potential uses for the included instruments, such as risk prediction or needs assessment.

Methods
We report this review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), adapting where appropriate for this type of study [32]. We followed an adapted version of the COSMIN protocol for systematic reviews, including their risk of bias tool for assessing study quality [31]. The COSMIN approach is an internationally agreed standard for evaluating outcome measures. It can be used to assess all types of outcome measures, including both clinician and patient reported instruments [33]. The study protocol was registered on PROSPERO, an international prospective register of systematic reviews.

Step 1: Database search
We searched seven databases (MEDLINE, PsycINFO, CINAHL [Cumulative Index to Nursing and Allied Health Literature], EMBASE, National Criminal Justice Reference Service [NCJRS], the Cochrane Database, and Web of Science) from database inception until spring 2018 using a combination of terms including “tool”; “instrument”; “scale”; “outcome”; “recovery”; “risk”; “rehabilitation”; “quality of life”; “symptom”; “forensic”; “secure”; “unit”; “ward”; and “hospital”. See Supplementary Material 1 for an example of the full search strategy.

Step 2: Screening and eligibility criteria
We reviewed the titles and abstracts of identified records. Included papers needed to describe the use of relevant instruments in a forensic mental health setting. The full text had to be available in English. All types of empirical or review paper were included. Papers describing use in prison or general psychiatric services only were excluded. Papers describing assessments of personality, which are generally not dynamic, and competency to stand trial and


malingering, which are outcomes related to the legal process, rather than treatment response, were also excluded.

Step 3: Full text review and identification of instruments
Papers meeting the screening criteria were reviewed in full text to identify relevant instruments described within. The format, type of study and geographical location were recorded. The frequency with which each instrument or subvariant appeared was noted. To determine the 10 most frequently appearing instruments, counts for all subvariants of each instrument were summed. We then considered each instrument, starting with the most frequently identified, to determine which met the criteria of being both multidimensional and designed for use in a forensic mental health context, until we had identified the 10 most frequently occurring within the literature. Multidimensional instruments included items on more than one of the four domains identified in previous reviews in this field (clinical symptoms, risk, recovery, and quality of life) [15,16]. Forensic specific instruments were those concerned with mental health outcomes for offenders or outcomes for individuals assessed or treated in forensic mental health services. We then considered each of the 10 selected instruments to determine the most relevant version or variant to undergo quality assessment in the next stage of the review. This was either the most recent version or, for instruments that combined multiple components, those components designed to measure patient progress over time.

Step 4: Further searching for literature on selected instruments
We conducted additional searches for each of the 10 selected instruments. We searched the PubMed database using common variants of each instrument’s name combined with the COSMIN filter of psychometric terms [34]. We reviewed the manuals for each instrument and other gray literature for further information on instrument development. We reviewed the reference lists of all included papers and contacted experts in the field as necessary. All sources of information were included until the end of 2019.

Step 5: Data extraction
We developed a data extraction tool, based on the COSMIN systematic review protocol and risk of bias tool. A number of adaptations to the standard approach were necessary, as the identified instruments were predominantly clinician reported. Content validity focused on the qualitative comprehensiveness and relevance of items in relation to the concept of interest, while all other psychometric properties were assessed using quantitative studies of numerical scores generated by the instruments. Quantitative data were extracted on seven psychometric properties (structural validity, internal consistency, measurement invariance, reliability, measurement error, hypothesis testing, and responsiveness). According to COSMIN, the dimensionality of a scale should be determined by factor analysis before internal consistency is considered [35]. In this context, dimensionality considers whether there is statistical evidence that respondents answer an instrument’s items in a similar way, indicating that they relate to the same underlying construct. Measurement error refers to the systematic or random error of a patient’s score that is not attributable to true changes that have occurred. It requires a qualitative estimation of the minimal important change, which is the smallest change in a score that would be clinically meaningful. We assigned a quality rating to the evidence for each property for each instrument in each study in one of three categories (no concerns/quality of evidence unclear/quality of evidence inadequate).

Step 6: Overall strength of evidence
We assigned an overall rating to the strength of evidence available for each of the seven psychometric properties for each instrument based on all included studies in one of three categories. For properties with adequate evidence of good performance we assigned the highest category. Properties with either inadequate evidence of good measurement properties or evidence of inadequate measurement properties were assigned to the middle category. Those properties with no evidence were assigned to the lowest category.
We used the same categorization system for content validity, including the instrument development process. However, due to the lack of published studies, we used a qualitative synthesis of information available from a range of sources, including instrument manuals and other gray literature, based on the COSMIN methodology for assessing content validity, which focuses on establishing relevance and comprehensiveness in the target population [36].

Results
Description of full text articles retrieved
The initial screening process identified 4,494 unique references, of which 502 met the inclusion criteria for full text review. Four hundred and fifty-six (91%) were articles in scientific journals. Almost half (49%; n = 247) were studies of the psychometric properties of instruments, while only 3% (n = 17) concerned interventional trials. Almost half (45%; n = 227) originated in the UK and Ireland (see Supplementary for a full description of the studies reviewed in full text).

Description of the instruments identified
Four hundred and thirty-five different instruments or their variants were identified. It was necessary to review 14 instruments until we identified the 10th instrument most frequently occurring within the literature that also met the multidimensional and forensic-specific criteria (see Supplementary Material 3). The most frequently occurring instrument within the literature was the Historical, Clinical, Risk 20 (HCR-20) [37], which appeared 196 times, followed by the Short-Term Assessment of Risk and Treatability (START) [38] with 53 mentions.
There was considerable variation in the format and stated purpose of the selected instruments. This included assessments of progress, risk factors, protective factors, patient need, and clinical decision aids.

Overview of the 10 instruments selected for the quality assessment
Half of the 10 instruments assessed were developed primarily as risk assessments (HCR-20, START, Sexual Violence Risk 20 [SVR-20], Violence Risk Scale [VRS], Level of Service: Case Management Inventory [LS/CMI]) [37–42]. Two instruments explicitly included items on patients’ strengths or protective factors (START and SAPROF) [38,43]. Only one instrument, the Health of the Nation Outcome Scale Secure (HoNOS Secure), was explicitly developed as a progress measure [44]. All instruments included a clinician reported scale. Only one, the Camberwell Assessment of Need


Forensic Version (CANFOR), was originally developed to include a patient reported scale [18]. A patient reported scale has subsequently been developed for the Dangerousness, Understanding, Recovery, and Urgency Manual (DUNDRUM) [17]. The number of items ranged from 12 (DUNDRUM 3 and 4) to 150 (Behavioral Status Index [BEST]) [45,46]. See Table 1 for a full description of each of the instruments.

Quality of evidence for the selected instruments
Eighty-six (17%) of the references identified by the review strategy contained relevant data on the psychometric properties of the 10 selected instruments. An extra 29 references were identified through the additional search techniques described in Step 4 of the methods (see Figure 1). See Supplementary Material 4 for details of the identified studies containing psychometric information about the selected instruments.
All 10 selected instruments had some evidence of empirical processes to support their development; however, this often emphasized quantitative reviews of the literature on risk factors for violence, rather than considering the views of relevant stakeholders [36]. When there was evidence of consultation with stakeholders, this was usually unstructured, with limited details on the methods used or individuals involved. Only one instrument, CANFOR, demonstrated adequate evidence of stakeholder involvement, including patients, in its development [18]. CANFOR also had evidence to support its relevance and comprehensiveness for the target population.
The degree of evidence for the remaining psychometric properties was mixed, with evidence on testing hypotheses for construct validity identified for every instrument, but none for measurement invariance [35]. Evidence for structural validity was available for three of the instruments (BEST, VRS, and DUNDRUM), none of which demonstrated adequate performance [41,45,47]. This was either due to insufficient numbers, the use of exploratory, rather than confirmatory factor analysis, or results that were not supportive of the hypothesized structure of the instrument [31]. There was evidence for internal consistency identified for 8 instruments out of 10. Despite the lack of evidence for structural validity, four instruments were deemed to have evidence of adequate internal consistency.
Nine instruments had some evidence for their reliability, which focused primarily on interrater, rather than test–retest reliability. Measurement error had limited evidence with studies identified for three instruments [48–50]. The quality of evidence for measurement error in the review was consistently low, relying on quantitative methods alone, with no attempt to relate the statistical error to the minimal important clinical change [31]. Testing hypotheses for construct validity was the category with the greatest quantity of evidence. Three primary types of hypotheses were identified: prediction of future events, such as violence, self-harm, and victimization; difference between subgroups, based on characteristics such as sex, ward type or behavior; and correlation with other measures. Evidence for responsiveness was identified for seven instruments, with only two demonstrating adequate properties in this respect [48,51]. See Table 2 for an overview of the evidence for the selected instruments and Supplementary Material 5 for a detailed summary.

Discussion
This systematic review aimed to provide an overview of instruments currently available for use as outcome measures in forensic mental health services. A broad definition of what constitutes an outcome measure ensured a wide range of instruments were considered. The review focused on instruments which are clinically relevant, to increase applicability of findings to real world settings. It assesses the quality of evidence for the 10 most frequently occurring instruments within the literature, which are also multidimensional and forensic specific. This review was based on a recognized quality assessment process and, to our knowledge, this is the first time that such a systematic approach has been applied in this field [35]. This quality assessment specifically considered the use of these instruments as broad outcome measures, covering a wide range of clinically relevant domains. It made no evaluation about the use of instruments for other purposes, such as risk prediction or needs assessment.

Key findings
Overall, the evidence for the appropriateness of the selected instruments as broad outcome measures is limited (see Table 2). At least half focused primarily on risk assessment and management, which is in line with previous similar reviews and unsurprising given the nature of forensic mental health services [15–18]. The Overt Aggression Scale, developed to measure aggressive behavior in inpatients with intellectual disabilities, appeared frequently but was excluded from more detailed assessment due to not being multidimensional [52]. Although clinical symptoms of mental illness featured in many of the selected instruments, this was not the primary focus of any. The Positive and Negative Symptoms Scale [53] and Brief Psychiatric Ratings Scale [54] both appeared frequently, but were excluded from more detailed assessment as they only focused on symptoms and were not designed for use in a forensic context (see Supplementary Material 3). Recovery and quality of life were less prominent in the selected instruments, although there were both some generic and forensic specific measures of these domains in the other instruments identified (such as the Global Assessment of Functioning [55], which appeared frequently but was excluded as not forensic specific or multidimensional, or the forensic specific Lancashire Quality of Life Profile [56], which did not appear frequently enough to warrant more detailed assessment). In accordance with previous reviews, few instruments were reported by patients, with only 2 of the 10 selected instruments having a patient reported scale [15,16]. The systematic gathering of the views of a wider group of stakeholders, especially patients, was rarely performed to inform content validity.
The differing attention to various aspects of validity and reliability in the quality assessment reflects the original purposes of the instruments. For example, as an assessment of patient need, the CANFOR has a much greater focus on content validity, while the HCR-20, as a risk assessment, focuses more on prediction of negative outcomes [18,57]. Studies of the DUNDRUM quartet often focus on differences between levels of security, as an aid to support decisions on pathway placement rather than outcome measurement, while the HoNOS-Secure has the highest number of studies of responsiveness, commensurate with its role as a progress measure [45,58].

Implications for research
The COSMIN guidelines emphasize the need for outcome measures to demonstrate adequate stakeholder involvement in their development [28]. Even for clinician reported scales, this should include input from patients and carers. This holds for forensic


Table 1. An overview of the 10 outcome measurement instruments included in the quality assessment

| Measurement instrument (key reference) | Construct | Target population | Mode of administration | Recall period | Subscale and number of items | Response options | Ranges of scores for individual items | Original language |
|---|---|---|---|---|---|---|---|---|
| Historical Clinical Risk 20 (HCR-20) Version 3 [37] | Static and dynamic risk factors for violence | Correctional, civil psychiatric and forensic psychiatric settings | Clinician reported | Lifetime for historical scale; timeframe for clinical and risk scales determined for each patient by raters | 20 items in 3 subscales: Historical (10 items), Clinical (5 items), Risk (5 items) | Presence - yes, partially or possibly present, no, omit; Relevance – high, moderate, low, omit; Structured professional judgement for future violence, serious physical harm and imminent violence can be high, moderate or low | 0-2; High/mod/low | English |
| Short-Term Assessment of Risk and Treatability (START) [38] | Strengths and vulnerabilities | Forensic mental health patients | Clinician reported | 2-3 months (or since the last START assessment) | 40 items in 2 parallel subscales, plus 2 case specific items: Strengths and vulnerabilities (20 items each, plus 2 case specific items); Specific risk estimates (7 SREs) | None, low, high; Strengths can be marked as ‘key items’ and vulnerabilities as ‘critical items’; SREs can be high, moderate or low | 0-2; Yes/no; High/mod/low | English |
| Camberwell Assessment of Need – Forensic Version (CANFOR) [18] | Assessment of needs | Forensic mental health patients | Clinician and patient reported scales | 1 month | 25 items in 1 scale | No problem/moderate problem/serious problem/not known OR None/low help/moderate help/high help/not known | Variable: 0-2 or 0-3; Aggregate scores of met needs, unmet needs and total needs | English |
| Dangerousness, Understanding, Recovery and Urgency Manual (DUNDRUM) [45] | Readiness to move to a lower level of security | Forensic mental health patients | Clinician and patient reported scales | Variable – 5 years for score 0, unclear for the other scores | 12 items in 2 subscales: DUNDRUM3 - Programme Completion (7 items); DUNDRUM4 - Recovery (5 items) | Ordinal: A statement corresponds to each of five possible scores | 0-4 | English |
| Health of the Nation Outcome Scales – Secure Version (HoNOS Secure) [44] | Repeatable progress measure for forensic services | Forensic mental health patients | Clinician reported (any mental health professional) | The HoNOS-Secure Clinical/social functioning scale – previous 2 weeks; Security scale – the ‘near future’ | 19 items in 2 subscales: Clinical/social functioning (12 items); Security (7 items) | Ordinal: Examples of each rating point provided in the glossary | 0-4 | English |
| Level of Service: Case Management Inventory (LS/CMI) [42] | Risk factors for recidivism, intervention needs and case management | Offenders in a variety of settings, including prison, psychiatric hospitals and probation | Professional reported | Variable, depending on the item – where specified, usually the last year | 43 items in Section 1 - General risk/need in 8 subscales – Criminal History (8), Education/Employment (9), Family/Marital (4), Leisure/Recreation (2), Companions (4), Alcohol/Drug Problem (9), Procriminal Attitude/Orientation (4), Antisocial Pattern (4). 4 additional scales that do not add to the score, but are considered in administrative override and/or case management: Specific risk/need (21), prison experience/institutional factors (11), other client issues (21), special responsivity considerations (11) | Ordinal or binary; Final risk/need assessment | 3-0 or Yes/No; Very high, high, medium, low, very low | English |
| Violence Risk Scale (VRS) [41] | Risk factors for violence, readiness for change, targets for intervention, effect of treatment | Forensic inpatients and prisoners | Clinician reported – file review and semi-structured interview | Lifetime functioning, with emphasis on recent functioning | 26 items in 2 subscales: Static (6), Dynamic (20) | Ordinal: Responses depend on the item | 0-3 | English |
| Structured Assessment of Protective Factors for risk of violence (SAPROF) [43] | Protective factors for violence | Forensic psychiatric inpatient and outpatients; prisoners and probation | Clinician reported | Information used from the last 6 months; predictions apply to subsequent 6 months | 17 items in 3 subscales: Internal (5), Motivation (7), External (5) | Each item is rated on a 3-point scale; Final protection judgement: 1) Protection 2) Risk | 0, 1 or 2; High/mod/low | Dutch |
| Sexual Violence Risk 20 (SVR-20) [40] | Risk factors for sexual violence | Sex offenders (including those who are forensic psychiatric patients) | Clinician reported | Recent changes within the last year (can be adjusted to each case) | 20 items in 3 subscales: Psychosocial adjustment (11); sexual offences (7); future plans (2) | Presence - yes, partially or possibly present, no, omit; Recent change +, 0, -; Summary risk rating | 0-2; High/mod/low | English |
| Behavioural Status Index (BEST) [46] | Assessment of behaviours | Forensic and general psychiatric inpatients | Nurse reported | Last 3 months | 150 items in 6 subscales: Social Risk (20); Insight Subscale (20); Communication and Social Skills (30); Work and Recreational Activities (20); Self-Care and Family Care (30); Empathy (30) | Ordinal: Responses depend on the item | 1-5 | English |

Identification: Records identified through database searching (n = 7,731); duplicates removed (n = 3,327).
Screening: Records screened after duplicates removed (n = 4,494); records excluded that did not contain information on eligible instruments in a forensic context (n = 3,992).
Eligibility: Full-text articles assessed to identify eligible instruments (n = 502); full-text articles excluded that did not contain information on the psychometric properties of selected instruments (n = 416).
Included: Articles containing information on the psychometric properties of the selected instruments (n = 86); articles identified from other sources containing information on psychometric properties of the selected instruments (n = 29); articles included in the qualitative synthesis (n = 115).

Figure 1. PRISMA flow diagram showing the flow of studies through the review.
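
The stage counts in a PRISMA flow diagram should reconcile arithmetically from one box to the next. The short Python sketch below checks this for the counts shown in Figure 1 as they appear in this copy; the dictionary keys and the `check_flow` helper are hypothetical names introduced only for illustration.

```python
# Counts transcribed from Figure 1 (PRISMA flow diagram).
# Key names are illustrative labels, not terms used by the authors.
flow = {
    "identified": 7731,
    "duplicates_removed": 3327,
    "screened": 4494,
    "excluded_at_screening": 3992,
    "full_text_assessed": 502,
    "full_text_excluded": 416,
    "included_from_databases": 86,
    "included_from_other_sources": 29,
    "included_in_synthesis": 115,
}


def check_flow(f):
    """Return {step: (computed, reported, matches)} for each stage of the flow."""
    steps = {
        "screened": (f["identified"] - f["duplicates_removed"], f["screened"]),
        "full_text_assessed": (
            f["screened"] - f["excluded_at_screening"],
            f["full_text_assessed"],
        ),
        "included_from_databases": (
            f["full_text_assessed"] - f["full_text_excluded"],
            f["included_from_databases"],
        ),
        "included_in_synthesis": (
            f["included_from_databases"] + f["included_from_other_sources"],
            f["included_in_synthesis"],
        ),
    }
    return {name: (exp, rep, exp == rep) for name, (exp, rep) in steps.items()}


results = check_flow(flow)
for name, (computed, reported, ok) in results.items():
    print(f"{name}: computed {computed}, reported {reported}, match={ok}")
```

With these counts, the last three stages reconcile. The first subtraction gives 4,404 rather than the 4,494 shown, so one of those three figures may be worth checking against the published diagram.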

services, which must balance the needs of patients with those of public protection. Evidence for instrument development was only adequate for the CANFOR [18]. Although other selected instruments had some stakeholder involvement, this was limited, unstructured and the reporting was often inadequate. Subsequent empirical validation of instrument content was similarly lacking, again except for CANFOR. Testing of comprehensiveness and relevance should be completed in the population for which instruments are intended [28]. This can take place after the instrument is available in its final form and does not have to occur contemporaneously with development [31]. Further research is therefore necessary to establish the content validity of these instruments as outcome measures in a forensic psychiatric population.

Overall, the evidence for the other psychometric properties of the instruments as outcome measures is limited, with numerous gaps in the published research. This lack of a comprehensive evidence base is perhaps surprising given the age and popularity of many of these instruments, but may reflect the diversity of their intended uses. Adequate evidence for other uses, such as risk prediction, may be well established, but does not necessarily support their use as outcome measures. Further research should seek to ensure that the identified gaps in the evidence base are addressed, if these instruments are used as outcome measures. Certain properties, such as measurement error and measurement invariance, are almost entirely overlooked, so should be considered in future studies. Evidence for other fundamental characteristics, such as structural validity, is also often absent or inadequate.

The ability to detect change over time was explicitly considered in the review under the category of responsiveness. While seven instruments had some evidence for responsiveness, this property was only deemed adequate for the VRS and SAPROF. Demonstrating reliable change in this population can be challenging, due to the long timescales involved [59]. Admissions to inpatient forensic psychiatric care often last years. The timeframe of most psychometric studies, however, including many in this review, is limited to a few months [8]. Despite these difficulties, it is essential for


Table 2. Summary synthesis of evidence for the 10 outcome measurement instruments included in the quality assessment.

Instrument  Content   Structural  Internal     Measurement  Reliability  Measurement  Hypothesis testing      Responsiveness
            validity  validity    consistency  invariance                error        for construct validity
HCR-20      -         0           4            0            10           0            17                      3
START       -         0           5            0            10           0            28                      2
CANFOR      -         0           0            0            4            0            10                      0
DUNDRUM     -         1           4            0            1            1            7                       1
HONOS-S     -         0           2            0            1            1            14                      11
LS/CMI      -         0           1            0            0            0            3                       0
VRS         -         1           1            0            7            4            11                      8
SAPROF      -         0           2            0            8            0            12                      2
SVR-20      -         0           0            0            2            0            5                       0
BEST        -         2           3            0            3            0            4                       2
Note: This table provides an overall summary of the evidence for the psychometric properties of each of the included measurement instruments. The eight psychometric properties assessed are
listed at the top of the table and the 10 instruments on the left hand side. The numbers in the cells signify the number of studies identified which contain information about the relevant
psychometric property for each instrument. Numbers are not included for content validity, as this was not possible to accurately quantify, due to the diverse range of sources of information for
this property. The shading categorizes the level of evidence within each cell according to the schedule outlined below:

• Adequate evidence of good measurement properties


• Inadequate evidence of good measurement properties or evidence of inadequate measurement properties
• No evidence
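
The gaps summarized in Table 2 can be tallied directly from the study counts. The Python sketch below is a minimal illustration, assuming the seven numbers in each row map, in order, to the seven properties that follow content validity (for which no counts are given). The names `PROPERTIES`, `TABLE_2` and `instruments_without_studies` are hypothetical and introduced only for this example.

```python
# Study counts per instrument from Table 2, in the order the seven numbers
# appear in each row (content validity has no count and is omitted).
PROPERTIES = [
    "structural validity",
    "internal consistency",
    "measurement invariance",
    "reliability",
    "measurement error",
    "hypothesis testing",
    "responsiveness",
]

TABLE_2 = {
    "HCR-20": [0, 4, 0, 10, 0, 17, 3],
    "START": [0, 5, 0, 10, 0, 28, 2],
    "CANFOR": [0, 0, 0, 4, 0, 10, 0],
    "DUNDRUM": [1, 4, 0, 1, 1, 7, 1],
    "HONOS-S": [0, 2, 0, 1, 1, 14, 11],
    "LS/CMI": [0, 1, 0, 0, 0, 3, 0],
    "VRS": [1, 1, 0, 7, 4, 11, 8],
    "SAPROF": [0, 2, 0, 8, 0, 12, 2],
    "SVR-20": [0, 0, 0, 2, 0, 5, 0],
    "BEST": [2, 3, 0, 3, 0, 4, 2],
}


def instruments_without_studies(table, properties):
    """Map each property to the instruments that have zero studies for it."""
    gaps = {}
    for index, prop in enumerate(properties):
        gaps[prop] = [name for name, counts in table.items() if counts[index] == 0]
    return gaps


gaps = instruments_without_studies(TABLE_2, PROPERTIES)
for prop, names in gaps.items():
    print(f"{prop}: {len(names)} of {len(TABLE_2)} instruments with no studies")
```

On these counts, measurement invariance has no studies for any of the 10 instruments, measurement error has none for seven, and responsiveness has none for three. A count of studies says nothing about their quality, which the shading in the published table conveys separately.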

Definition of terms used in Table 2.

Content validity: The degree to which the content of an outcome measure is an adequate reflection of the construct to be measured
Structural validity: The degree to which the scores of an outcome measure are an adequate reflection of the dimensionality of the construct to be measured
Internal consistency: The degree of the interrelatedness among the items
Measurement invariance: The degree to which respondents from different groups with the same latent trait level respond similarly to a particular item
Reliability: The extent to which scores for patients who have not changed are the same for repeated measurement under several conditions: for example, over time (test–retest) or by different persons on the same occasion (inter-rater)
Measurement error: The systematic and random error of a patient’s score that is not attributed to true changes in the construct to be measured
Hypothesis testing for construct validity: The degree to which the scores of an outcome measure are consistent with hypotheses based on the assumption that the outcome measure validly measures the construct to be measured
Responsiveness: The ability of an outcome measure to detect change over time in the construct to be measured
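
Internal consistency and measurement error, as defined above, are commonly quantified with Cronbach's alpha and the standard error of measurement (SEM). The smallest detectable change (SDC) then indicates how large an individual change must be to exceed measurement error. The Python sketch below illustrates these standard formulas on invented ratings. The data, the function names and the use of alpha as the reliability estimate are assumptions made for illustration, not the procedure followed in this review.

```python
from math import sqrt
from statistics import stdev, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of per-patient lists of item scores."""
    k = len(ratings[0])  # number of items
    item_vars = [variance(column) for column in zip(*ratings)]
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(total_scores, reliability):
    """SEM = SD of the total scores * sqrt(1 - reliability)."""
    return stdev(total_scores) * sqrt(1 - reliability)


def smallest_detectable_change(sem):
    """SDC for an individual at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * sqrt(2) * sem


# Invented data: five patients rated on four items, each scored 0-2.
ratings = [
    [0, 1, 0, 1],
    [1, 1, 1, 2],
    [2, 2, 1, 2],
    [1, 0, 1, 1],
    [2, 2, 2, 2],
]

alpha = cronbach_alpha(ratings)
totals = [sum(row) for row in ratings]
sem = standard_error_of_measurement(totals, alpha)
sdc = smallest_detectable_change(sem)
print(f"alpha={alpha:.3f}, SEM={sem:.3f}, SDC={sdc:.3f}")
```

A change in a patient's total score smaller than the SDC cannot be distinguished from measurement error, which is one reason why evidence on measurement error matters when an instrument is used to track progress.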

outcome measures to demonstrate responsiveness to change over a time period that is relevant for the population of interest [60].

Authorship bias has been identified as a potential problem in the literature on risk assessments in the forensic context [61]. While authorship bias was not formally assessed in this review, much of the evidence identified was produced by the teams that originally developed the instruments. Sufficient validation studies should therefore be conducted independently of the original authors.

New instruments are needed for forensic mental health services to enable clinicians and patients to report and measure individual and service outcomes. These should be developed according to the latest best practice guidelines, including the participation of relevant stakeholders, such as clinicians and patients [62,63]. Developing new instruments will require working with these stakeholders to identify and prioritize the most important outcomes. This should be followed by further work to develop an instrument that fits the needs of individuals and services. Finally, empirical studies should confirm adequate psychometric properties for the new instrument, such as content validity and responsiveness.

Implications for policy and practice

This review identified many instruments that have been, or could be, used as outcome measures in forensic mental health services. These vary considerably in format, content, length, stated purpose, and evidence base. Of the 10 instruments reviewed in detail, only HoNOS-Secure is designed with the sole primary purpose of measuring progress, although other instruments such as the VRS and LS/CMI are also intended to assess change over time [44]. The ways that clinicians and researchers use instruments can differ considerably. Risk assessments, such as the HCR-20, can be used by clinicians to develop risk formulations, while researchers may use them to predict negative outcomes. Instruments can be used in practice or in research in several different ways, for example using the same instruments to predict the risk of future events and to establish if an


intervention has already reduced that risk [64]. This type of repurposing may be possible, but is limited by how to interpret scores. It will also need considerable additional work to establish relevant psychometric properties, in particular adequate content validity and responsiveness [28,60]. While some commonly used instruments, such as HCR-20 and START, have been used as outcome measures, the underlying evidence for their use in this way is weak. Use of such risk assessments as outcome measures in isolation may lead to an unbalanced view of progress, as they do not include important outcomes such as quality of life and social functioning. Services should therefore start by deciding which outcomes are important, before selecting high quality outcome measures that cover all such outcomes in a way that is practical to implement.

Most instruments identified in this review are reported by clinicians only. For instruments that do include a patient reported scale, these scales may have been designed after the development of the clinician reported ones, with limited patient input [17,65]. This risks inadequate attention to the patient perspective in the overall design and implementation of such measures [66]. In instruments selected in this review that include a patient reported scale, the patient reported scales mirror their clinician reported components. They contain identical items, reframed from the patient’s perspective, to allow direct comparison between the two scales. A disadvantage of this approach is that certain outcome areas, such as those related to subjective quality of life, may only meaningfully be rated by patients [67]. A patient reported scale that exactly mirrors the clinician reported scale therefore risks neglecting such areas. Services wishing to implement patient reported measures should consult their own users and other key stakeholders, such as family members, when selecting scales, to ensure that they are fit for the purpose of measuring those outcomes deemed of greatest relevance [11].

Comprehensiveness is an essential quality for outcome measures [28]. While risk and clinical symptoms are the dominant domains within the most frequently occurring instruments within the literature, quality of life and functional outcomes are either absent or remain of secondary importance. By relying on existing instruments, services may overlook outcomes of importance, such as quality of life, and over-emphasize the importance of other domains, such as risk to others [68].

Limitations

Given the very large number of instruments identified, it was only possible to assess the quality of evidence for a small proportion of them. Some of the included instruments were not intended to be used as outcome measures, and their utility is not limited to this.

A frequency based approach was chosen to select instruments for the quality assessment. This was deemed the most systematic method of identifying instruments that were likely to have a sufficient evidence base to judge their qualities against the COSMIN criteria. There may be instruments that did not meet our selection criteria that have the potential to perform well against the COSMIN criteria, when sufficient evidence is available. The use of frequency of appearance in the literature to select instruments for quality assessment has a number of drawbacks. Firstly, older tools are likely to appear in more published studies, simply by virtue of being in existence for longer. Secondly, some studies were published as multiple papers, meaning that a limited evidence base generates a disproportionate number of references. Thirdly, although we grouped variants of instruments together, there may be important differences between variants. Finally, all types of paper were included in the count and the proportion of studies that contained psychometric information on a particular instrument varied, so the overall count does not necessarily reflect the quantity of psychometric evidence available.

Language was a limitation in two ways. Firstly, the search was limited to those references where the full text was available in English. Secondly, studies involving translations of instruments were included, although evidence from a translated version may not always apply directly to the English version, due to subtle cultural and linguistic differences [69].

Assessing the quality of instrument development and content validity studies according to the full COSMIN criteria proved challenging [28,35]. The review team simplified the COSMIN approach, to make it more pragmatic and streamlined. This included reducing the quality assessment to three levels, rather than four. The summary assessment of the quality of evidence for instrument development and content validity was also simplified, as the limited evidence in this area rendered the full process recommended by COSMIN unworkable. Despite these limitations, we think that the COSMIN framework is the most robust and relevant mechanism currently available for assessing instruments for use as outcome measures.

Conclusions

Although there are a large number of instruments available that can be used as outcome measures in forensic mental health services, the evidence base for their use in this way is limited. Despite recommendations from previous reviews, instruments that appear most frequently in the literature remain focused on risk and fail to adequately involve all stakeholders, especially patients [15,16]. Repurposing instruments developed for other uses as outcome measures should be avoided where possible. This is particularly the case for risk assessment tools, which cannot currently be recommended as outcome measures based on the standard guidelines we have outlined. When this is unavoidable, additional research is necessary to ensure that they demonstrate adequate psychometric properties to be used as outcome measures [35]. New outcome measures should be designed with input from all relevant stakeholder groups, especially patients and carers, who have hitherto been largely ignored [67]. This should follow current best practice guidelines for outcome measure development, with a focus on ensuring adequate content validity [28].

Acknowledgments. We would like to thank Nia Roberts for her help in developing the search strategy and retrieving full text papers.

Financial Support. Howard Ryland, Doctoral Research Fellow, DRF-2017-10-019, is funded by the National Institute for Health Research (NIHR) for this research project. The views expressed in this publication are those of the authors and not necessarily those of the NIHR, NHS, or the UK Department of Health and Social Care.

Conflicts of Interest. The authors report no conflict of interest.

Data Availability Statement. The data that support the findings of this study are available from the authors on reasonable request.

Supplementary Materials. To view supplementary material for this article, please visit http://dx.doi.org/10.1192/j.eurpsy.2021.32.


References

[1] Crocker A, Livingston J, Leclair M. Forensic mental health systems internationally. Handbook of forensic mental health services. International perspectives on forensic mental health. New York, NY: Taylor & Francis; 2017, p. 3–76.
[2] Sampson S, Edworthy R, Völlm B, Bulten E. Long-term forensic mental health services: An exploratory comparison of 18 European countries. Int J Forens Ment Health. 2016;15:333–51. doi:10.1080/14999013.2016.1221484.
[3] Tomlin J, Lega I, Braun P, Kennedy H, Herrando V, Barroso R, et al. Forensic mental health in Europe: some key figures. Social Psychiatry Psychiatr Epidemiol. 2021;56:109–17. doi:10.1007/s00127-020-01909-6.
[4] Jansman-Hart EM, Seto MC, Crocker AG, Nicholls TL, Côté G. International trends in demand for forensic mental health Services. Int J Forens Ment Health. 2011;10:326–36. doi:10.1080/14999013.2011.625591.
[5] Rutherford M, Duggan S. Forensic mental health services: facts and figures on current provision. Br J Forens Pract. 2008;10:4–10. doi:10.1108/14636646200800020.
[6] Pinals DA. Forensic services, public mental health policy, and financing: charting the course ahead. J Am Acad Psychiatry Law Online. 2014;42:7–19.
[7] Wilson S, James D, Forrester A. The medium-secure project and criminal justice mental health. Lancet. 2011;378:110–1. doi:10.1016/s0140-6736(10)62268-4.
[8] Völlm B. How long is (too) long? BJPsych Bull. 2019;43:151–3. doi:10.1192/bjb.2019.24.
[9] Lund C, Hofvander B, Forsman A, Anckarsäter H, Nilsson T. Violent criminal recidivism in mentally disordered offenders: a follow-up study of 13–20 years through different sanctions. Int J Law Psychiatry. 2013;36:250–7. doi:10.1016/j.ijlp.2013.04.015.
[10] Allnutt S, Ogloff J, Adams J, O’Driscoll C, Daffern M, Carroll A, et al. Managing aggression and violence: the clinician’s role in contemporary mental health care. Austr NZ J Psychiatry. 2013;47:728–36. doi:10.1177/0004867413484368.
[11] Wallang P, Kamath S, Parshall A, Saridar T, Shah M. Implementation of outcomes-driven and value-based mental health care in the UK. Br J Hospital Med. 2018;79:322–7.
[12] Livingston J. What does success look like in the forensic mental health system? Perspectives of service users and service providers. Int J Offender Ther Comp Criminol. 2016;62:208–28.
[13] Fazel S, Fimińska Z, Cocks C, Coid J. Patient outcomes following discharge from secure psychiatric hospitals: systematic review and meta-analysis. Br J Psychiatry. 2016;208:17–25. doi:10.1192/bjp.bp.114.149997.
[14] Cohen A, Eastman N. Needs assessment for mentally disordered offenders: measurement of ‘ability to benefit’ and outcome. Br J Psychiatry. 2000;177:493–8. doi:10.1192/bjp.177.6.493.
[15] Shinkfield G, Ogloff J. A review and analysis of routine outcome measures for forensic mental health services. Int J Forens Ment Health. 2014;13:252–71. doi:10.1080/14999013.2014.939788.
[16] Fitzpatrick R, Chambers J, Burns T, Doll H, Fazel S, Jenkinson C, et al. A systematic review of outcome measures used in forensic mental health research with consensus panel opinion. Health Technol Assess. 2010;14:1–94.
[17] Davoren M, Hennessy S, Conway C, Marrinan S, Gill P, Kennedy HG. Recovery and concordance in a secure forensic psychiatry hospital - the self rated DUNDRUM-3 programme completion and DUNDRUM-4 recovery scales. BMC Psychiatry. 2015;15:61. doi:10.1186/s12888-015-0433-x.
[18] Thomas SD, Slade M, McCrone P, Harty MA, Parrott J, Thornicroft G, et al. The reliability and validity of the forensic Camberwell Assessment of Need (CANFOR): a needs assessment for forensic mental health service users. Int J Methods Psychiatr Res. 2008;17:111–20.
[19] Kennedy H, O’Neill C, Flynn G, Gill P, Davoren M. Dangerousness Understanding, Recovery and Urgency Manual (The DUNDRUM quartet): four structured professional judgement instruments for admission triage, urgency, treatment completion and recovery assessments Version 1.0.26. Dublin: Trinity College Dublin; 2013.
[20] Kennedy H, O’Reilly K, Davoren M, O’Flynn P, O’Sullivan O. How to measure progress in forensic care. In: Völlm B, Braun P, editors. Long-term forensic psychiatric care. Cham: Springer; 2019, p. 103–21. doi:10.1007/978-3-030-12594-3_8.
[21] Ryland H, Carlile J, Kingdon D. A guide to outcome measurement in psychiatry. BJPsych Adv. 2020;1–9. doi:10.1192/bja.2020.58.
[22] Dawson J, Doll H, Fitzpatrick R, Jenkinson C, Carr A. The routine use of patient reported outcome measures in healthcare settings. BMJ. 2010;340:c186. doi:10.1136/bmj.c186.
[23] Calvert M, Kyte D, Price G, Valderas J, Hjollund N. Maximising the impact of patient reported outcome assessment for patients and society. BMJ. 2019;364:k5267. doi:10.1136/bmj.k5267.
[24] NHS England and NHS Improvement. Delivering the five year forward view for mental health: developing quality and outcomes measure. London, UK: NHS England and Improvement; 2016.
[25] Mokkink L, Terwee C, Patrick D, Alonso J, Stratford P, Knol D, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63:737–45.
[26] Black N, Burke L, Forrest C, Ravens Sieberer U, Ahmed S, Valderas J, et al. Patient-reported outcomes: pathways to better health, better services, and better societies. Quality Life Res. 2016;25:1103–12. doi:10.1007/s11136-015-1168-3.
[27] Gargon E, Gorst SL, Williamson PR. Choosing important health outcomes for comparative effectiveness research: 5th annual update to a systematic review of core outcome sets for research. PLOS ONE. 2019;14:e0225980. doi:10.1371/journal.pone.0225980.
[28] Terwee CB, Prinsen CAC, Chiarotto A, Westerman MJ, Patrick DL, Alonso J, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Quality Life Res. 2018;27:1159–70. doi:10.1007/s11136-018-1829-0.
[29] Shinkfield G, Ogloff J. Use and interpretation of routine outcome measures in forensic mental health. Int J Ment Health Nurs. 2015;24:11–8.
[30] Keulen-de Vos M, Schepers K. Needs assessment in forensic patients: a review of instrument suites. Int J Forens Ment Health. 2016;15(3):283–300. doi:10.1080/14999013.2016.1152614.
[31] Prinsen CAC, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HCW, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Quality Life Res. 2018;27(5):1147–57. doi:10.1007/s11136-018-1798-3.
[32] Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLOS Med. 2009;6(7):e1000097. doi:10.1371/journal.pmed.1000097.
[33] Consensus-based standards for the selection of health measurement instruments. Guideline for systematic reviews of outcome measurement instruments. Amsterdam, The Netherlands: VU University Medical Centre; 2021.
[34] COSMIN. Search filters. Amsterdam, The Netherlands: VU University Medical Centre; 2019.
[35] Mokkink LB, Prinsen CAC, Patrick DL, Alonso J, Bouter LM, de Vet HCW, et al. COSMIN methodology for systematic reviews of patient-reported outcome measures. Amsterdam, The Netherlands: VU University Medical Centre; 2018.
[36] Terwee CB, Prinsen CAC, Chiarotto A, Westerman MJ, Patrick DL, Alonso J, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Quality Life Res. 2018;27:1159–70. doi:10.1007/s11136-018-1829-0.
[37] Douglas KS, Hart SD, Webster CD, Belfrage H, Guy LS, Wilson CM. Historical-clinical-risk management-20, Version 3 (HCR-20 V3): development and overview. Int J Forens Ment Health. 2014;13:93–108. doi:10.1080/14999013.2014.906519.
[38] Webster C, Nicholls T, Martin M, Desmarais S, Brink J. Short-Term Assessment of Risk and Treatability (START): the case for a new structured professional judgment scheme. Behav Sci Law. 2006;24:747–66. doi:10.1002/bsl.737.
[39] Nicholls T, Brink J, Desmarais S, Webster C, Martin M. The Short-Term Assessment of Risk and Treatability (START): a prospective validation study in a forensic psychiatric sample. Assessment. 2006;13:313–27.


[40] Boer D. Manual for the sexual violence risk-20: professional guidelines for assessing risk of sexual violence. British Columbia, Canada: British Columbia Institute Against Family Violence; 1997.
[41] Wong S, Gordon A. The validity and reliability of the Violence Risk Scale: a treatment-friendly violence risk assessment tool. Psychol Public Policy Law. 2006;12:279–309.
[42] Andrews D, Bonta J, Wormith S. The Level of Service/Case Management Inventory (LS/CMI) technical brochure. Toronto, Canada: Multi-Health Systems; 2004.
[43] de Vogel V, de Ruiter C, Bouman Y, de Vries Robbé M. SAPROF: guidelines for the assessment of protective factors for violence risk [English version of the Dutch original]. Utrecht, The Netherlands: Forum Educatief; 2009.
[44] Dickens G, Sugarman P, Walker L. HoNOS-secure: a reliable outcome measure for users of secure and forensic mental health services. J Forens Psychiatry Psychol. 2007;18:507–14.
[45] O’Dwyer S, Davoren M, Abidin Z, Doyle E, McDonnell K, Kennedy HG. The DUNDRUM Quartet: validation of structured professional judgement instruments DUNDRUM-3 assessment of programme completion and DUNDRUM-4 assessment of recovery in forensic mental health services. BMC Res Notes. 2011;4:229.
[46] Woods P, Reed V, Robinson D. The Behavioural Status Index: therapeutic assessment of risk, insight, communication and social skills. J Psychiat Ment Health Nurs. 1999;6:79–90.
[47] Woods P, Reed V, Collins M. Relationships among risk, and communication and social skills in a high security forensic setting. Issues Ment Health Nurs. 2004;25:769–82.
[48] Horgan H, Charteris C, Ambrose D. The violence reduction programme: an exploration of posttreatment risk reduction in a specialist medium-secure unit. Crim Behav Ment Health. 2019;29:286–95. doi:10.1002/cbm.2123.
[49] Richter MS, O’Reilly K, O’Sullivan D, O’Flynn P, Corvin A, Donohoe G, et al. Prospective observational cohort study of ‘treatment as usual’ over four years for patients with schizophrenia in a national forensic hospital. BMC Psychiatry. 2018;18:289. doi:10.1186/s12888-018-1862-0.
[50] Longdon L, Edworthy R, Resnick J, Byrne A, Clarke M, Cheung N, et al. Patient characteristics and outcome measurement in a low secure forensic hospital. Crim Behav Ment Health. 2018;28:255–69. doi:10.1002/cbm.2062.
[51] de Vries Robbe M, de Vogel V, Douglas K, Nijman H. Changes in dynamic risk and protective factors for violence during inpatient forensic psychiatric treatment: predicting reductions in postdischarge community recidivism. Law Hum Behav. 2015;39:53–61. doi:10.1037/lhb0000089.
[52] Yudofsky SC, Silver JM, Jackson W, Endicott J, Williams D. The Overt Aggression Scale for the objective rating of verbal and physical aggression. Am J Psychiatry. 1986;143:35–9.
[53] Kay SR, Opler LA, Lindenmayer J-P. The Positive and Negative Syndrome Scale (PANSS): rationale and standardisation. Br J Psychiatry. 1989;155:59–65.
[54] Faustman WO, Overall JE. Brief Psychiatric Rating Scale. The use of psychological testing for treatment planning and outcomes assessment. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates Publishers; 1999, p. 791–830.
[55] Aas IHM. Guidelines for rating Global Assessment of Functioning (GAF). Ann Gen Psychiatry. 2011;10:2. doi:10.1186/1744-859X-10-2.
[56] Eklund M. Lancashire quality of life profile. In: Michalos AC, editor. Encyclopedia of quality of life and well-being research. Dordrecht, The Netherlands: Springer; 2014, p. 3493–5.
[57] Douglas KS. Version 3 of the historical-clinical-risk management-20 (HCR-20 V3): relevance to violence risk assessment and management in forensic conditional release contexts. Behav Sci Law. 2014;32:557–76.
[58] Dickens G, Sugarman P, Picchioni M, Long C. HoNOS-Secure: tracking risk and recovery for men in secure care. Br J Forens Practice. 2010;12:36–46.
[59] Tomlin J, Lega I, Braun P, Kennedy HG, Herrando VT, Barroso R, et al. Forensic mental health in Europe: some key figures. Social Psychiatry Psychiatr Epidemiol. 2021;56:109–17. doi:10.1007/s00127-020-01909-6.
[60] Terwee CB, Dekker FW, Wiersinga WM, Prummel MF, Bossuyt PMM. On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation. Quality Life Res. 2003;12:349–62. doi:10.1023/A:1023499322593.
[61] Singh JP, Grann M, Fazel S. Authorship bias in violence risk assessment? A systematic review and meta-analysis. PLOS ONE. 2013;8:e72484. doi:10.1371/journal.pone.0072484.
[62] U.S. Department of Health and Human Services Food and Drug Administration. Guidance for industry: patient-reported outcome measures: use in medical product development to support labeling claims. Maryland: Food and Drug Administration; 2009.
[63] De Vet HC, Terwee CB, Mokkink LB, Knol DL. Measurement in medicine: a practical guide. Cambridge: Cambridge University Press; 2011.
[64] Hogan NR, Olver ME. Assessing risk for aggression in forensic psychiatric inpatients: an examination of five measures. Law Hum Behav. 2016;40:233–43.
[65] van den Brink RH, Troquete NA, Beintema H, Mulder T, van Os TW, Schoevers RA, et al. Risk assessment by client and case manager for shared decision making in outpatient forensic psychiatry. BMC Psychiatry. 2015;15:120.
[66] Rothrock NE, Kaiser KA, Cella D. Developing a valid patient-reported outcome measure. Clin Pharmacol Therapeut. 2011;90:737–42.
[67] Boardman J. Routine outcome measurement: recovery, quality of life and co-production. Br J Psychiatry. 2018;212:4–5.
[68] Connell J, O’Cathain A, Brazier J. Measuring quality of life in mental health: are we asking the right questions? Social Sci Med. 2014;120:12–20.
[69] Sartorius N, Kuyken W. Translation of health status instruments. Berlin, Germany: Springer; 1994.
