JOURNAL OF APPLIED BEHAVIOR ANALYSIS 2007, 40, 659–672 NUMBER 4 (WINTER 2007)

TREATMENT INTEGRITY OF SCHOOL-BASED INTERVENTIONS
WITH CHILDREN IN THE JOURNAL OF APPLIED
BEHAVIOR ANALYSIS 1991–2005

LAURA LEE MCINTYRE
SYRACUSE UNIVERSITY

FRANK M. GRESHAM
LOUISIANA STATE UNIVERSITY

AND

FLORENCE D. DIGENNARO AND DEREK D. REED
SYRACUSE UNIVERSITY

We reviewed all school-based experimental studies with individuals 0 to 18 years published in the Journal of Applied Behavior Analysis (JABA) between 1991 and 2005. A total of 142 articles (152 studies) that met review criteria were included. Nearly all (95%) of these experiments provided an operational definition of the independent variable, but only 30% of the studies provided treatment integrity data. Nearly half of studies (45%) were judged to be at high risk for treatment inaccuracies. Treatment integrity data were more likely to be included in studies that used teachers, multiple treatment agents, or both. Although there was a substantial increase in reporting operational definitions of independent variables, results suggest that there was only a modest improvement in reported integrity over the past 30 years of JABA studies. Recommendations for research and practice are discussed.

DESCRIPTORS: treatment integrity, child studies, school interventions, applied behavior analysis
________________________________________

We thank Heidi Olson-Tinker, Lisa Dolstra, Veronica McLaughlin, and Mai Van for assistance with the initial preparation of this article. We also are grateful to Michael J. Vance for assistance with data collection.

Address correspondence to Laura Lee McIntyre, Department of Psychology, 430 Huntington Hall, Syracuse University, Syracuse, New York 13244 (e-mail: llmcinty@syr.edu).

doi: 10.1901/jaba.2007.659–672

The field of applied behavior analysis has always rested on the fundamental principle that the empirical demonstration of measurable changes in behavior must be related to systematic and controlled manipulations in the environment. That is, the observed changes in the dependent variable (behavior) must be attributed to changes in the independent variable (some environmental event). Without this empirical demonstration, a true science of human behavior is an impossibility (Skinner, 1953). Without objective and documented specification of an independent variable as well as accurate independent variable application, definitive conclusions regarding the relation between an independent variable and a dependent variable are compromised. The best way to ensure accurate application of the independent variable is to measure the extent to which treatment is implemented as intended.

Documentation of independent variable implementation has been discussed in the literature under the rubric of treatment fidelity (Moncher & Prinz, 1991) or treatment integrity (Gresham, Gansle, & Noell, 1993; Gresham, Gansle, Noell, & Cohen, 1993; Peterson, Homer, & Wonderlich, 1982; Yeaton & Sechrest, 1981). Treatment integrity refers to the degree to which treatments are implemented as planned, designed, or intended and is concerned with the accuracy and consistency with which interventions are implemented
(Peterson et al.). Therefore, treatment integrity is necessary but insufficient for demonstrating a functional relation between intervention procedures and behavior change (Gresham, 1989).

A number of studies have been published in recent years that have examined variables associated with adequate treatment integrity (DiGennaro, Martens, & Kleinmann, 2007; DiGennaro, Martens, & McIntyre, 2005; Mortenson & Witt, 1998; Noell, Witt, Gilbertson, Ranier, & Freeland, 1997; Noell et al., 2000; Sterling-Turner, Watson, Wildmon, Watkins, & Little, 2001; Witt, Noell, LaFleur, & Mortenson, 1997). Most of these studies have focused on schools as the primary setting for investigation. Investigating the degree to which interventions are carried out with integrity in schools is valuable for several reasons. First, research suggests that teachers fail to implement interventions with accuracy despite receiving high levels of initial training (e.g., DiGennaro et al., 2005; Noell et al., 2000). This is a waste of time and resources for both teachers and consultants if, after training, the interventions are not implemented as intended. Second, findings also suggest that student problem behaviors are negatively correlated with treatment accuracy, such that low levels of problem behavior are associated with high levels of treatment integrity (DiGennaro et al., 2005, 2007; Wilder, Atwell, & Wine, 2006). Thus, a teacher's failure to implement recommended interventions may result in poor outcomes for students, in that behaviors will not improve in the desired direction. Third, the extent to which teachers implement plans with accuracy influences a behavior analyst's ability to effectively conduct formative evaluations. Specifically, a behavior analyst will be unable to determine if a student's resistance to treatment is a result of an ineffective intervention or a lack of intervention implementation (Moncher & Prinz, 1991) because the treatment's effect size is positively correlated with internal validity (Smith, Glass, & Miller, 1980). Having this knowledge would focus a behavior analyst's efforts on problem solving with teachers and students (i.e., change the intervention or directly work to improve teachers' implementation of the current plan). Finally, recent legislation, such as the No Child Left Behind Act (U.S. Department of Education, 2002) and Individuals with Disabilities Education Improvement Act (2004), necessitates that school-based practitioners and teachers be accountable for their practices. As a result, there has been a recent push for evidence-based practices in academic settings as well as demonstrations of accurate plan implementation over time.

How common is the measurement of treatment integrity in the behavior analysis literature? Several reviews of the literature suggest that the measurement of treatment integrity is uncommon (Gresham, Gansle, & Noell, 1993; Peterson et al., 1982; Wheeler, Baggett, Fox, & Blevins, 2006). Peterson et al. reviewed 539 studies published in the Journal of Applied Behavior Analysis (JABA) between 1968 and 1980; they found that only 20% of the 539 studies reported data on treatment integrity, and over 16% of these studies did not provide an operational definition of the independent variable. There were no trends suggesting an improvement in treatment integrity over time. Gresham, Gansle, and Noell provided an update of Peterson et al.'s review by examining 158 studies published in JABA between 1980 and 1990 that were child studies (<19 years of age). Of these 158 studies, only 32% provided an operational definition of the independent variable and only 16% (25 studies) systematically measured and reported levels of treatment integrity.

Wheeler et al. (2006) focused on intervention studies of children with autism published between 1993 and 2003. Of the 60 studies included in the review, more than half (60%) were published in JABA, with the remaining studies (n = 26) drawn from eight other
journals (e.g., Research in Developmental Disabilities, Journal of Autism and Developmental Disorders). The results of Wheeler et al.'s review were consistent with previous studies. Of these 60 studies, only 18% (n = 11) reported data on treatment integrity. On the other hand, nearly all (92%) included operational definitions of independent variables. Closer analysis of the results of Wheeler et al.'s review provides some insight into the treatment-integrity reporting trends for child-based autism treatment studies. Although most of the included studies were published in JABA, only 14% (n = 5) included treatment integrity data. This figure is lower than what others have reported (e.g., Gresham, Gansle, & Noell, 1993) for JABA studies. In contrast, studies published in Research in Developmental Disabilities, Focus on Autism and Other Developmental Disabilities, Journal of Autism and Developmental Disorders, and the Journal of Positive Behavioral Interventions included treatment integrity data in 25% to 33% of studies. Studies that met inclusionary criteria published in Education and Treatment of Children and the Journal of Early Intervention reported treatment integrity 50% and 100% of the time, respectively. In contrast, the three studies published in Education and Training in Mental Retardation and Developmental Disabilities and the Journal of Developmental and Physical Disabilities did not report treatment integrity data. Although these findings are limited due to the scope of Wheeler et al.'s review criteria, they are helpful in placing treatment integrity reporting in JABA in context.

Based on the foregoing reviews, it is clear that the majority of treatment outcome studies published in JABA and other behavioral journals either did not measure or did not report levels of treatment integrity. As can be derived from the above discussion on the importance of treatment integrity, the failure to gather data on the integrity of independent variables may compromise the precision and rigor of our experimental procedures (Baer, Wolf, & Risley, 1968; Johnston & Pennypacker, 1993; Kazdin, 1973). The basic concern is that when data are not collected regarding the status of the independent variable, researchers and practitioners alike cannot objectively conclude that the independent variable was implemented as planned or intended (Kennedy, 2005; Moncher & Prinz, 1991). This problem may be especially pronounced in practice settings (Wilder et al., 2006), such as interventions that are implemented in schools.

The current article updates and extends the findings of the Peterson et al. (1982) and Gresham, Gansle, and Noell (1993) reviews by another 15 years. All school-based interventions with children (<19 years old) published in JABA between 1991 and 2005 were reviewed for possible inclusion. The clinical relevance of investigating treatment integrity combined with the importance of demonstrating that the independent variable was accurately applied in school-based intervention research serves as the basis of this study.

METHOD

Criteria for Review

A total of 995 articles (excluding book reviews and remembrances) were reviewed to determine possible inclusion. Five features of each study were considered. First, the study had to be experimental, in that the effects of intervention on behavior were examined (i.e., the study had to manipulate some aspect of the environment to create changes in a dependent variable). Because we were evaluating school-based intervention studies, articles that were assessment only (e.g., functional analysis, preference assessment) were excluded. If a study contained an initial functional analysis followed by an intervention, the intervention experiment was included. Second, participants had to be younger than 19 years old, an inclusionary criterion previously employed by Gresham, Gansle, and Noell (1993). Third, studies without a clear baseline
or control condition were excluded from further review. Studies that were not true experimental designs (e.g., AB designs) were excluded. Fourth, all studies had to be conducted in school settings; however, school was liberally defined to include a continuum of school placements, including residential programs. Inpatient hospital units (e.g., Neurobehavioral Unit at the Kennedy Krieger Institute) and outpatient clinics were excluded. Fifth, brief reports of three or fewer pages in length were excluded, as outlined by Peterson et al. (1982). Because articles of three or fewer pages typically do not provide sufficient methodological detail (e.g., lengthy descriptions of independent variables or integrity monitoring), we chose to exclude these studies so we would not artificially underestimate independent variable operational definition and integrity reporting. Thus, a total of 142 articles met these inclusionary criteria over the 15-year period. Because some of the articles contained multiple experiments, a total of 152 studies met inclusionary criteria for this review. (A full list of articles meeting inclusionary criteria is available from the first author.)

Coding

This review focused on the operational definition of the independent variables and the extent to which these variables were described, monitored, and measured. Following the procedural guidelines set forth by Peterson et al. (1982), the risk for treatment inaccuracies was also investigated. In addition, we were interested in assessing whether treatment integrity reporting trends varied by publication year and by whom the intervention was implemented (treatment agent; e.g., teacher, researcher). Coding schemes for each of these variables are described below.

Operational definition of the independent variable. Each study was coded "yes," "no," or "footnote" in answer to the question: Is the independent variable (treatment) operationally defined? To answer this question, each rater was given the following criterion: "If you could replicate this treatment with the information provided, the intervention is considered operationally defined." This criterion was proposed by Baer et al. (1968) and later used by Gresham, Gansle, and Noell (1993) in their review. Those studies that referred to more extensive sources (e.g., book chapters, manuals, or technical reports) were coded as "footnote" (i.e., contained directions to contact the author or see published details elsewhere).

Monitoring treatment integrity. Studies were coded according to their inclusion of treatment integrity data. Studies that systematically monitored and reported treatment integrity on at least one independent variable were coded "yes." Specifically, this included studies that (a) specified a method of measurement (observer present, videotaping of sessions, component checklist) and (b) reported data as percentage of implementation (i.e., percentage of implemented steps in the intervention). Studies that monitored treatment integrity but failed to report data were coded as "monitored." For example, "treatment integrity was assessed to ensure the fidelity of this intervention" was coded as "monitored." Likewise, studies that mentioned statements such as "deviations from intervention protocol were not observed" were also coded as "monitored" (no method of measurement was described). The key difference between "yes" and "monitored" categories was the provision of percentage data regarding implementation and a specified data-collection method. Studies that made no mention of treatment integrity were coded "no." We chose to replicate Gresham, Gansle, and Noell's (1993) treatment integrity coding because, unlike Peterson et al. (1982), this method allowed differentiating the categories of "yes" and "monitored."

Risk for treatment inaccuracies. Treatments were coded as either no, low, or high risk for treatment inaccuracies based on the guidelines set forth by Peterson et al. (1982). Treatments
were coded as ‘‘no risk’’ if the implementation tors. Researchers and research assistants were
of the treatment was reported as monitored or individuals who collected data for the purpose
measured (i.e., monitoring of treatment in- of the published study and were not involved in
tegrity was coded as either ‘‘yes’’ or ‘‘moni- other service delivery roles (e.g., classroom
tored’’). Treatments were coded as ‘‘low risk’’ if teacher). Peer tutors were other children,
the treatment was not reported to be monitored typically in the target child’s classroom, who
or measured but was judged to be at low risk for were not the focus of the intervention. ‘‘Self’’
inaccuracies. Low-risk treatments included was recorded if the intervention was self-
treatments that were (a) mechanically defined administered or self-mediated (e.g., self-moni-
(e.g., computer mediated), (b) permanent toring interventions). ‘‘Multiple’’ was coded if
products (e.g., posting of classroom rules), (c) more than one category of treatment agent was
continuously applied (e.g., noncontingent ac- used. If the treatment agent described in the
cess to preferred items or activities), or (d) study did not fit in any of the aforementioned
single components (e.g., escape contingent on categories, ‘‘other’’ was coded. There were
work completion). Treatments were coded as a small handful of studies that did not specify
‘‘high risk’’ if the treatment was not reported to the treatment agent. In these cases ‘‘not
be monitored or measured but was necessary. specified’’ was coded.
According to Peterson et al. (1982) treatments
in the high-risk category were those in which Rater Training and Interobserver Agreement
‘‘the administration of the independent variable A PhD-level behavior analyst (faculty mem-
was not exempted by any of the cases cited in ber) and four doctoral students with advanced
category B [low risk], and the potential for error training in behavior analysis served as raters,
was judged to be high’’ (p. 485). Operationally with each rater coding 20% of the studies. Prior
defined, these included person-implemented to coding, all raters received four 2-hr training
interventions that included multiple behavioral sessions to discuss assigned practice articles (i.e.,
components (e.g., contingent reinforcement JABA articles published prior to 1991) and to
with response cost). revise ambiguous codes. During these training
Publication year. The publication year of the sessions, all raters reached 100% agreement (via
article was recorded (i.e., 1991 to 2005). consensus) on whether an assigned article met
Treatment agent. The individuals who im- inclusionary criteria. Five articles were assigned
plemented the intervention were classified into per training session, yielding a total of 20
one of the following mutually exclusive cate- training articles used prior to conducting
gories: (a) teacher, (b) professional (nontea- independent coding. In addition, a random
cher), (c) paraprofessional, (d) parent or sibling, sample of 20% of studies meeting inclusionary
(e) researcher or research assistant, (f ) peer criteria was selected for interobserver agreement
tutors, (g) self, (h) multiple, (i) other, or (j) not coding. Studies were coded on five categories:
specified. Examples of the teacher category (a) operational definition of the independent
included early childhood educators, general variable (three categories), (b) integrity assess-
education classroom teachers, or discrete-trial ment (three categories), (c) risk for treatment
instructors. The professional category included inaccuracies (three categories), (d) publication
other nonteacher professionals (e.g., school year (15 categories), and (e) treatment agent (10
psychologists, speech–language pathologists). categories). Percentage agreement was calculat-
Paraprofessionals included support staff such ed by dividing the number of agreements by the
as classroom aides, teaching assistants (non- number of agreements plus disagreements and
teachers), or playground or lunchroom moni- multiplying by 100%. Percentage agreement
664 LAURA LEE MCINTYRE et al.

averaged 93% across the five codes (98% on operational definition of the independent variable; 87% integrity assessment; 88% risk for treatment inaccuracies; 100% publication year; 92% treatment agent).

RESULTS

The majority of studies (n = 144; 95%) provided operational definitions of treatments, with an additional five studies (3%) reporting references or contact information to allow readers to gather more information about the interventions (e.g., treatment manuals, previously published studies). The remaining three studies (2%) did not provide operational definitions adequate for replication purposes or cite other sources for more information. Approximately one third (n = 46; 30%) of the studies provided treatment integrity data in the form of percentage of implementation. Studies that reported these data showed a high percentage of integrity (M = 93%; SD = 9.93). The majority of studies that reported integrity data (n = 36; 78%) reported procedural fidelity of 90% or greater. Thirteen studies (8%) mentioned that treatment integrity was monitored but did not provide data for percentage of steps accurately implemented. Over 60% of the studies (n = 93) did not report treatment integrity data, nor did they report monitoring the implementation of their interventions.

Approximately 39% of studies (n = 59) were considered to be at no risk for treatment inaccuracies, in that the authors reported treatment integrity data or that treatment integrity was monitored. Just under half of the included studies (n = 69; 45%) were considered to be at high risk for treatment inaccuracies in that information on the implementation of treatments or the assessment of independent variables was not included but should have been (Peterson et al., 1982). The remaining 16% of studies (n = 24) did not include information on treatment integrity but were judged to be at low risk for treatment inaccuracies.

Reporting treatment integrity data did not appear to differ consistently by publication year; however, there was ample variability across the 15-year period. Figure 1 depicts the percentage of studies that included treatment integrity data by publication year. On average, treatment integrity data were included in one third of the included studies (M = 34%; SD = 19.23). The publication years 1996, 1998, 1999, and 2005 included relatively more studies that reported treatment integrity data (range, 50% to 67%) than the remaining years. Figure 2 shows treatment integrity data from 1968 to 2005 based on Peterson et al.'s (1982) review; Gresham, Gansle, and Noell's (1993) review; and the present review. These data are based on 834 studies published in JABA from 1968 to 2005. Of these 834 studies, 179 (21%) reported treatment integrity data (range, 0% to 67%).

We were interested in exploring whether studies that used particular treatment agents (e.g., teachers, researchers) reported treatment integrity data more frequently. As shown in Table 1, there were a variety of reported treatment agents in the included studies. The most common were researchers (n = 52), teachers (n = 38), multiple (n = 19), and professionals (n = 15). Although only seven studies used peer tutors as treatment agents, 57% (n = 4) reported treatment integrity data. Of the 19 studies that used multiple treatment agents, nearly a third (n = 6; 32%) included treatment integrity data. Likewise, for the 38 studies that used teachers as treatment agents, 37% (n = 14) reported treatment integrity data. Studies that used professionals, parents or siblings, researchers, or self-administered treatments had lower reporting of treatment integrity data (range, 0% to 25%).

DISCUSSION

The present review of school-based interventions with children published in JABA demon-
Figure 1. Percentage of JABA school-based studies reporting treatment integrity data by year (1991 to 2005).

Figure 2. Percentage of JABA studies reviewed by Peterson et al. (1982); Gresham, Gansle, and Noell (1993); and the
current review reporting treatment integrity data by year (1968 to 2005).
Table 1
Treatment Integrity Monitoring by Treatment Agent

Treatment agent Yes + data n (%) Monitored n (%) No n (%) Total


Teacher 14 (37) 4 (10) 20 (53) 38
Professional (nonteacher) 3 (20) 1 (7) 11 (73) 15
Paraprofessional 2 (33) 0 (0) 4 (67) 6
Parent or sibling 0 (0) 1 (50) 1 (50) 2
Researcher 13 (25) 4 (8) 35 (67) 52
Peer tutors 4 (57) 1 (14) 2 (29) 7
Multiple 6 (32) 2 (10) 11 (58) 19
Does not specify 3 (33) 0 (0) 6 (67) 9
Self 0 (0) 0 (0) 2 (100) 2
Other 1 (50) 0 (0) 1 (50) 2
Total 46 (30) 13 (8) 93 (61) 152
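Each percentage in Table 1 is simply the cell count divided by the row total, rounded to a whole number. As a sanity check, the "Yes + data" column can be recomputed from the counts; the dictionary below transcribes a few rows from the table (variable names are ours).

```python
# "Yes + data" counts and row totals transcribed from Table 1.
table1 = {
    "Teacher": (14, 38),
    "Professional (nonteacher)": (3, 15),
    "Researcher": (13, 52),
    "Peer tutors": (4, 7),
    "Multiple": (6, 19),
}

def pct(count: int, total: int) -> int:
    # Percentage rounded to the nearest whole number, as reported in Table 1.
    return round(100 * count / total)

for agent, (yes, total) in table1.items():
    print(f"{agent}: {pct(yes, total)}%")

# Overall: 46 of 152 studies reported treatment integrity data.
print(pct(46, 152))  # 30
```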

strates that reporting rates of treatment integrity data have been remarkably stable (and low) over the past 15 years. Approximately one third (30%) of studies that met our inclusionary criteria reported treatment integrity data. This figure is slightly higher than the Peterson et al. (1982) and Gresham, Gansle, and Noell (1993) reviews of this literature that showed 20% and 16% integrity, respectively. Although somewhat different inclusionary criteria were used in the two earlier reviews, treatment integrity reporting has been remarkably stable over the past 37 years (1968 to 2005) (Figure 2). Of interest is the large increase in treatment integrity reporting that was seen from 1993 to 1994. Although attributions about the cause of this increase cannot be made, this spike occurred the year following Gresham, Gansle, and Noell's review. Gresham, Gansle, and Noell reported a similar increase from 1982 to 1983 (the year following Peterson et al.'s review). It is plausible that papers of this nature may increase JABA authors' and editors' awareness of the need to include treatment integrity data. Alternatively, there may be other variables that contributed to the spike in treatment integrity reporting, such as the sharp increase seen from 1997 to 1998. To the best of our knowledge, however, editorial guidelines for preparing manuscripts to be submitted to JABA did not change during this time.

Reporting of treatment integrity data has been relatively stable and low over the years. Reasons for low rates of treatment integrity reporting are not entirely clear; however, low reporting may be a function of the editorial process (i.e., space limitations in journals warrant cutting out treatment integrity data) or may be due to logistics (e.g., lack of skills in treatment integrity assessment, lack of resources). There may also be a publication bias favoring the reporting of treatment integrity data when integrity is high. In addition, it is plausible that researchers do not view treatment integrity data collection as important, especially if interventions produce the desired effects. We argue that without collecting integrity data, it becomes difficult to make conclusions regarding intervention results.

Having access to treatment integrity data can help behavior analysts to make decisions about treatments in school-based settings. If, for example, an intervention is being implemented accurately yet does not produce the desired effects, the behavior analyst will likely modify the treatment. If the intervention is being implemented inaccurately and does not produce the desired effects, the behavior analyst will likely institute additional training or programmed consequences to increase implementation accuracy. On the other hand, if the intervention is not being implemented with integrity yet still produces the desired effects, the behavior analyst will likely change the treatment protocol to reflect the modified intervention. Finally, if the intervention is being
implemented with integrity and the desired treatment outcomes are produced, a causal relation between independent variable manipulations and changes in the dependent variable can be inferred. Thus, we argue that including regular treatment-integrity assessments is necessary but insufficient for making treatment-related decisions (Gresham, 1989).

In contrast to the rates of treatment integrity reporting, reporting of operationally defined independent variables has dramatically increased, with nearly all (95%) studies including detailed descriptions of the interventions. This figure is consistent with a recent review of interventions for children with autism (Wheeler et al., 2006) but is a much improved rate over the 34% reported by Gresham, Gansle, and Noell (1993). Including operational definitions of independent variables contributes to the replicability of our science of behavioral interventions (Bellg et al., 2004).

Although treatment integrity measures are important for virtually all experimental studies, including assessment studies and interventions conducted in other settings, we chose to sample interventions with children in school settings. This population and setting were selected because it is the focus of our own research; however, this may be of interest to other researchers in its own right. Furthermore, interventions carried out in school settings, in which treatment agents are less likely to be researchers with significant training in experimental methods, may be at greatest risk for inaccurate implementation of interventions. When treatment integrity is not systematically assessed and reported, there is little basis for judgments about how closely an implemented intervention approximates an intended intervention. Because the current review focused on school settings, the extent to which these findings generalize to published studies conducted with other populations is unknown.

Our findings suggest that when school-based interventions are carried out by teachers, paraprofessionals, peers, or multiple treatment agents, authors are more likely to report treatment integrity data. It may be the case that these treatment agents were judged to be at high risk for procedural inaccuracies and the authors therefore went to great lengths to ensure that these agents implemented the treatments as planned. Although definitive conclusions cannot be made based on these descriptive data, it appears that the treatment agent used in school-based studies may influence the likelihood of reporting treatment integrity data. What is unknown, however, is how many other authors collected treatment integrity data but failed to report it in their published articles. Failure to include a brief statement on the extent to which treatments were implemented as planned may be especially problematic for interventions judged to be at high risk for treatment inaccuracies (Peterson et al., 1982). If treatment integrity data are not regularly included, inferences based on the study results may be significantly limited (Kennedy, 2005). Thus, we recommend that if treatment integrity data are collected or if intervention implementation is monitored, this information should be included in published studies.

Although we have seen marked improvement in descriptions of independent variables, publications in JABA continue to focus on clear specifications of the dependent variables and do not include measurements of the independent variables. Indeed, a "curious double standard" so aptly recognized by Peterson et al. (1982) still remains. This observation continues to be recognized by various task forces and organizations within the fields of education, psychology, and mental health. For example, the Task Force on Evidence-Based Practice in Special Education of the Council for Exceptional Children stated that the integrity of intervention implementation is critical in single-case designs because the independent variable is implemented continuously over time (Horner et al., 2005).
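The integrity-by-outcome decision logic described earlier in the Discussion amounts to a 2 × 2 lookup. A minimal sketch follows; the function name and action strings are our paraphrase of the text, not terminology from the article.

```python
def next_step(implemented_with_integrity: bool, desired_effects: bool) -> str:
    """Paraphrase of the Discussion's decision logic for behavior analysts."""
    if implemented_with_integrity and desired_effects:
        return "infer a causal relation between IV and DV"
    if implemented_with_integrity and not desired_effects:
        return "modify the treatment"
    if not implemented_with_integrity and not desired_effects:
        return "add training or programmed consequences for the agent"
    # Low integrity but desired effects anyway.
    return "revise the protocol to reflect the modified intervention"

print(next_step(True, False))  # modify the treatment
```

The point of the sketch is that without integrity data the first argument is unknown, so none of the four branches can be selected with confidence.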

Similarly, other task forces within the American Psychological Association on evidence-based treatments such as Divisions 16 (school psychology), 53 (clinical child/adolescent), and 54 (pediatric) have called for the assessment and monitoring of treatment integrity. Furthermore, researchers who submit single-case experimental design grant applications to the U.S. Department of Education's Institute of Education Sciences (IES) now must describe "how treatment fidelity will be measured, frequency of assessments, and what degree of variation in treatment fidelity will be accepted over the course of the study" (IES, 2006, p. 50). These recommendations have also been made by the National Institutes of Health (NIH). Specifically, the NIH Behavior Change Consortium recommends that treatments be monitored and reported and that treatment agents be trained and supervised in the delivery of treatments (Bellg et al., 2004). Monitoring and reporting treatment fidelity are especially important in clinical treatments that are considered to be at high risk for treatment inaccuracies or complex in other ways (e.g., multisite). Furthermore, the special NIH report on treatment fidelity in research specifies that "it is particularly important that funding agencies, reviewers, and journal editors who publish behavioral change research consider treatment fidelity issues" (Bellg et al., p. 451).

With the increased attention paid to issues of accurate treatment implementation and reporting of treatment integrity, both within the field of behavior analysis and in other fields (e.g., psychology, behavioral medicine), it may be particularly important for JABA authors and readers to consider some additional ways to strengthen the influence of behavior analysis in the larger scientific community. Outlined below are several recommendations for treatment integrity research and practice.

Recommendations for Research and Practice

Although accurate implementation of the independent variable is assumed to be functionally related to desired changes in the dependent variable, there has been relatively little research that demonstrates this relation (Wilder et al., 2006). Furthermore, it may be the case that high levels of treatment integrity are necessary for some interventions but not for others. Only a handful of behavior-analytic studies have addressed this issue, unfortunately coming to somewhat different conclusions. For example, Wilder et al. systematically manipulated the level of treatment integrity of a three-step prompting procedure and concluded that the level of treatment accuracy had a large impact on children's compliance. Northup, Fisher, Kahng, Harrel, and Kurtz (1997), on the other hand, found very little difference between time-out treatments implemented at 100% accuracy and those implemented at 50% accuracy. Vollmer, Roane, Ringdahl, and Marcus (1999) evaluated the effects of differential reinforcement of alternative behavior and found that degree of treatment accuracy did affect treatment outcomes. Because of the small number of studies that have addressed the varying effects of treatment integrity on behavior change, we recommend that additional studies include treatment integrity variation as an independent variable and consider that various treatments may require different levels of treatment integrity to produce desired changes in the dependent variable. Regular documentation of treatment integrity may help to improve our knowledge base in this regard.

An additional area of research for behavior-analytic studies may be to separate the components of treatment packages to identify the variables that are functionally responsible for producing behavior change. It is plausible that some components of a treatment package may be excluded, whereas others may be necessary to produce treatment effects. Thus, a treatment that is implemented with 80% accuracy but is missing a key ingredient may produce poorer

outcomes than a treatment that is implemented at 70% accuracy but includes the components that are functionally responsible for changes in the dependent variable.

Behavioral interventions, especially those implemented in applied settings (e.g., schools), may be at high risk for treatment inaccuracies due to the setting, the treatment agent, the complexity of the protocol, and the demands placed on teachers' time and resources. Interventions that include programmed consequences for teachers (or other treatment agents) contingent on accuracy of treatment implementation may produce higher levels of treatment integrity. For example, Noell et al. (1997) found that a performance feedback package increased teachers' accurate implementation of treatments. Furthermore, DiGennaro et al. (2007) found that programmed consequences including performance feedback and negative reinforcement (escape from a meeting with the behavior analyst) produced higher levels of treatment integrity than a single programmed consequence or no programmed consequence. Additional research using programmed consequences for treatment agents may help to elucidate the conditions in which treatments are more or less likely to be implemented with accuracy in applied settings.

Data to support Peterson et al.'s (1982) no risk, low risk, and high risk for treatment inaccuracies may help the field to flesh out the construct of risk for treatment inaccuracies. Although it is assumed that some treatments may be at higher risk for inaccuracies, treatment integrity data have not been reported for studies with more or less complex interventions. It is recommended that treatment integrity be collected on a number of interventions to determine whether complexity of treatments or other features of the treatment (e.g., acceptability; Sterling-Turner & Watson, 2002) are related to treatment integrity. Furthermore, although Peterson et al.'s criteria have served as an important heuristic for the field of behavior analysis, it may be appropriate to update our thinking with respect to what constitutes risk for treatment inaccuracies. Peterson et al.'s criteria were based on Kelly's (1977) definition of risk, which he developed from reviewing reliability reporting trends in JABA. This conceptualization of risk for independent variable inaccuracies does not include treatment agent (e.g., certified behavior analyst vs. novice therapist), years of experience, setting, or other variables that may be germane to our consideration of risk. In addition, Peterson et al. considered monitoring integrity and reporting treatment integrity data to be equivalent with respect to risk for treatment inaccuracies. We posit that monitoring interventions may be less informative for both research and practice than the provision of integrity data.

In terms of practical recommendations, we suggest that treatment integrity plans be specified at the outset of studies (Bellg et al., 2004). That is, researchers should specify when treatment integrity will be assessed and how the assessment will occur. Clearly specifying intervention steps in a treatment protocol may help the implementation and assessment of the intervention. Given that a number of school-based intervention studies published in JABA are considered to be at high risk for treatment inaccuracies, it is probable that treatments implemented in practice (and not published) may be at even greater risk for treatment inaccuracies.

Other practical recommendations include providing initial training for treatment agents at the study onset and training to a criterion rather than for a prespecified period of time (Bellg et al., 2004). Training should be viewed as an ongoing activity due to factors such as therapist drift or failure to implement the treatment as outlined (e.g., DiGennaro et al., 2005; Noell et al., 2000). Spot checks of treatment integrity could be performed with the assistance of well-developed procedural checklists and protocols. We have found that providing intervention protocols (see the example in Appendix A) and using simple procedural checklists (see the example in Appendix B) can be a helpful way to train teachers to implement interventions and to collect integrity data that reflect the percentage of treatment steps implemented accurately. Depending on the intervention, protocols could provide a step-by-step guide to treatment implementation or a list of components that must occur (or may not occur) during treatment. For example, it may be important to specify when reinforcement should occur (e.g., contingent on task completion) as well as when reinforcement should not occur (e.g., in the presence of target problem behavior).

Lastly, we recommend that a small sample of treatment integrity assessments be collected on all interventions considered to be at high risk for treatment inaccuracies. Although the demands placed on the time of behavior analysts, teachers, and support staff are great, we have never skimped on conducting assessments of the reliability of dependent variables (e.g., interobserver agreement checks). If, for example, interobserver agreement data are collected on 35% of all observations, researchers and practitioners alike could decide that the number of agreement data-collection observations could be reduced (e.g., to 20%) and 15% of observations could be used for treatment integrity assessments. Because research conducted in applied settings may be at particularly high risk for treatment inaccuracies, including treatment integrity spot checks may be especially important. We believe that it is important to have some methods in place to ensure that treatments are implemented as planned. Furthermore, regularly including such data in studies published in JABA may help the field of applied behavior analysis to have a better understanding of the concepts and strategies applied researchers can use to strengthen our science.

REFERENCES

Baer, D., Wolf, M., & Risley, T. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1, 91–97.
Bellg, A. J., Borrelli, B., Resnick, B., Hecht, J., Minicucci, D. S., Ory, M., et al. (2004). Enhancing treatment fidelity in health behavior change studies: Best practices and recommendations from the NIH Behavior Change Consortium. Health Psychology, 23, 443–451.
DiGennaro, F. D., Martens, B. K., & Kleinmann, A. E. (2007). A comparison of performance feedback procedures on teachers' treatment implementation integrity and students' inappropriate behavior in special education classrooms. Journal of Applied Behavior Analysis, 40, 447–461.
DiGennaro, F. D., Martens, B. K., & McIntyre, L. L. (2005). Increasing treatment integrity through negative reinforcement: Effects on teacher and student behavior. School Psychology Review, 34, 220–231.
Gresham, F. M. (1989). Assessment of treatment integrity in school consultation and prereferral intervention. School Psychology Review, 18, 37–50.
Gresham, F. M., Gansle, K., & Noell, G. H. (1993). Treatment integrity in applied behavior analysis with children. Journal of Applied Behavior Analysis, 26, 257–263.
Gresham, F. M., Gansle, K. A., Noell, G. H., & Cohen, S. (1993). Treatment integrity of school-based behavioral intervention studies: 1980–1990. School Psychology Review, 22, 254–272.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179.
Individuals with Disabilities Education Improvement Act. (2004). Public Law 108–446. Retrieved December 30, 2006, from http://www.ed.gov/policy/speced/guid/idea/idea2004.html
Institute for Education Sciences. (2006). Special education research grants 2007 request for applications. Retrieved August 29, 2006, from http://ies.ed.gov/ncser/pdf/2007324.pdf
Johnston, J., & Pennypacker, H. (1993). Strategies and tactics of behavioral research (2nd ed.). Hillsdale, NJ: Erlbaum.
Kazdin, A. E. (1973). Methodological and assessment considerations in evaluating reinforcement programs in applied settings. Journal of Applied Behavior Analysis, 6, 517–531.
Kelly, M. B. (1977). A review of the observational data-collection and reliability procedures reported in the Journal of Applied Behavior Analysis. Journal of Applied Behavior Analysis, 10, 97–101.
Kennedy, C. H. (2005). Single-case designs for educational research. Boston: Allyn & Bacon.
Moncher, F. J., & Prinz, R. J. (1991). Treatment fidelity in outcome studies. Clinical Psychology Review, 11, 247–266.
Mortenson, B. P., & Witt, J. C. (1998). The use of weekly performance feedback to increase teacher implementation of a prereferral academic intervention. School Psychology Review, 27, 613–627.

Noell, G. H., Witt, J. C., Gilbertson, D. N., Ranier, D. D., & Freeland, J. T. (1997). Increasing teacher intervention implementation in general education settings through consultation and performance feedback. School Psychology Quarterly, 12, 77–88.
Noell, G. H., Witt, J. C., LaFleur, L. H., Mortenson, B. P., Ranier, D. D., & LeVelle, J. (2000). Increasing intervention implementation in general education following consultation: A comparison of two follow-up strategies. Journal of Applied Behavior Analysis, 33, 271–284.
Northup, J., Fisher, W., Kahng, S., Harrel, B., & Kurtz, P. (1997). An assessment of the necessary strength of behavioral treatments for severe behavior problems. Journal of Developmental and Physical Disabilities, 9, 1–16.
Peterson, L., Homer, A., & Wonderlich, S. (1982). The integrity of independent variables in behavior analysis. Journal of Applied Behavior Analysis, 15, 477–492.
Skinner, B. F. (1953). Science and human behavior. New York: The Free Press.
Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore: Johns Hopkins University Press.
Sterling-Turner, H. E., & Watson, T. S. (2002). An analog investigation of the relationship between treatment acceptability and treatment integrity. Journal of Behavioral Education, 11, 39–50.
Sterling-Turner, H. E., Watson, T. S., Wildmon, M., Watkins, C., & Little, E. (2001). Investigating the relationship between training type and treatment integrity. School Psychology Quarterly, 16, 56–67.
U.S. Department of Education. (2002). No Child Left Behind Act of 2001. Public Law 107–110. Retrieved December 30, 2006, from http://www.ed.gov/legislation/ESEA02/
Vollmer, T. R., Roane, H. S., Ringdahl, J. E., & Marcus, B. A. (1999). Evaluating treatment challenges with differential reinforcement of alternative behavior. Journal of Applied Behavior Analysis, 32, 9–23.
Wheeler, J. J., Baggett, B. A., Fox, J., & Blevins, L. (2006). Treatment integrity: A review of intervention studies conducted with children with autism. Focus on Autism and Other Developmental Disabilities, 21, 45–54.
Wilder, D. A., Atwell, J., & Wine, B. (2006). The effects of varying levels of treatment integrity on child compliance during treatment with a three-step prompting procedure. Journal of Applied Behavior Analysis, 39, 369–373.
Witt, J. C., Noell, G. H., LaFleur, L. H., & Mortenson, B. P. (1997). Teacher use of interventions in general education: Measurement and analysis of the independent variable. Journal of Applied Behavior Analysis, 30, 693–696.
Yeaton, W. H., & Sechrest, L. (1981). Critical dimensions in the choice and maintenance of successful treatments: Strength, integrity, and effectiveness. Journal of Consulting and Clinical Psychology, 49, 156–167.

Received September 6, 2006
Final acceptance April 10, 2007
Action Editor, Louis Hagopian

APPENDIX A

School-Based Intervention Protocol for Student Jamie


1. Jamie will use the reinforcement system at all times throughout the school day.
2. Jamie's behavior plan is specific and targets the following:
   a. Follows directions: complies with teacher's instructions within 10 s without redirection.
   b. Completes work: eyes and head oriented to academic task.
   c. Body still: appropriate motor movement in the context of classroom instruction.
3. Jamie will select a reinforcer from a prepared list of items or activities. The teacher will write Jamie's selection on the bottom of the reward slip.
4. Jamie will receive three checks contingent on successfully following directions, completing work, and keeping his body still (one check for each behavior) within a 20-min period.
5. Immediately after receiving the final check, Jamie is allowed to earn the selected reinforcer.
6. The teacher should then cycle back through the previous steps repeatedly through the day.
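
The check-and-reinforcer cycle in steps 1 through 6 can be sketched in code. The sketch below is purely illustrative and is not part of the published protocol; the function name, the behavior labels, and the reinforcer names are our own stand-ins for the teacher's in-class judgments.

```python
# Illustrative model of one 20-min interval of Jamie's check system:
# one check per target behavior, reinforcer delivered only when all
# three checks are earned.

TARGET_BEHAVIORS = ("follows directions", "completes work", "body still")
CHECKS_REQUIRED = 3  # one check per target behavior per 20-min period


def run_interval(behaviors_met, selected_reinforcer):
    """Award one check per target behavior observed in `behaviors_met`
    (a set of behavior labels) and deliver the pre-selected reinforcer
    only if all required checks are earned this interval."""
    checks = sum(1 for b in TARGET_BEHAVIORS if b in behaviors_met)
    earned = checks >= CHECKS_REQUIRED
    return {
        "checks": checks,
        "reinforcer_delivered": selected_reinforcer if earned else None,
    }


if __name__ == "__main__":
    # All three targets met: the reinforcer is delivered (step 5).
    print(run_interval(
        {"follows directions", "completes work", "body still"},
        "extra computer time"))
    # Only two checks earned: no reinforcer; a new interval begins (step 6).
    print(run_interval({"follows directions", "completes work"}, "sticker"))
```

Modeling the contingency this way also makes the protocol's key property explicit: the reinforcer is contingent on all three behaviors within the interval, not on any single one.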

APPENDIX B

Treatment Integrity Protocol Checklist for Student Jamie


Date of observation: ___/___/___ Time of observation: _______ to _______
Teachers present: __________________________ Observer: __________________________
Directions: Please indicate that a treatment step was completed by marking a ✓ in the corresponding box.
☐ Reward slip present targeting the following behaviors:
   • Following directions
   • Completing work
   • Keeping body still
☐ The selected reward is written at the bottom of the slip.
☐ Teacher (or aide) provides a ✓ contingent on appropriate target behavior.
☐ Jamie earns a reward of his choosing approximately every 20 min.
☐ Verbal praise is paired with receipt of reward.
☐ Jamie is asked to select another reward at the start of the next 20-min interval.

# of steps completed: ____________ % steps completed: ____________
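
The summary line at the bottom of the checklist is the percentage of treatment steps implemented accurately, as described in the text. As an illustrative sketch only, with our own abbreviated step labels and hypothetical observed marks, that calculation could look like this:

```python
# Illustrative computation of the integrity summary at the bottom of a
# procedural checklist: # of steps completed and % of steps completed.

def integrity_percentage(step_marks):
    """Return (# steps completed, % steps completed) from a dict mapping
    each checklist step to True (check observed) or False (not observed)."""
    completed = sum(step_marks.values())
    return completed, round(100 * completed / len(step_marks), 1)


# Hypothetical marks from one observation of the checklist above.
observed = {
    "reward slip present": True,
    "selected reward written on slip": True,
    "check given contingent on target behavior": True,
    "reward earned about every 20 min": False,
    "praise paired with reward": True,
    "new reward selected at next interval": False,
}

steps_done, pct = integrity_percentage(observed)
print(f"# of steps completed: {steps_done}   % steps completed: {pct}%")
# 4 of 6 steps observed -> 66.7% of steps completed
```

A session-by-session record of this percentage is exactly the kind of integrity data the text recommends reporting alongside interobserver agreement.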
