Predict Student Failure
To cite this article: Avril Dewar, David Hope, Alan Jaap & Helen Cameron (2021): Predicting
failure before it happens: A 5-year, 1042 participant prospective study, Medical Teacher, DOI:
10.1080/0142159X.2021.1908526
ABSTRACT

Purpose of the article: Students who fail assessments are at risk of negative consequences, including emotional distress and cessation of studies. Identifying students at risk of failure before they experience difficulties may considerably improve their outcomes.

Methods: Using a prospective design, we collected simple measures of engagement (formative assessment scores, compliance with routine administrative tasks, and attendance) over the first 6 weeks of Year 1. These measures were combined to form an engagement score which was used to predict a summative examination sat 14 weeks after the start of medical school. The project was repeated for five cohorts, giving a total sample size of 1042.

Results: Simple linear regression showed engagement predicted performance (R²adj = 0.03, F(1,1040) = 90.09, p < 0.001) with a small effect size. More than half of failing students had an engagement score in the lowest two deciles.

Conclusions: At-risk medical students can be identified with some accuracy immediately after starting medical school using routinely collected, easily analysed data, allowing for tailored interventions to support students. The toolkit provided here can reproduce the predictive model in any equivalent educational context. Medical educationalists must evaluate how the advantages of early detection are balanced against the potential invasiveness of using student data.

KEYWORDS: Assessment; psychometrics; student support
Practice points

Candidates at risk of failing their first summative exam can be identified after only 6 weeks on programme.

Routinely collected data (formative examination scores, administrative tasks, attendance) provide sufficient information for a predictive model.

The toolkit allows for straightforward reproductions of the study in a range of medical education contexts.

CONTACT Avril Dewar avril.dewar@ed.ac.uk Edinburgh Medical School, The Chancellor's Building, Edinburgh BioQuarter, 49 Little France Crescent, Edinburgh EH16 4SB, Scotland, UK
© 2021 Informa UK Limited, trading as Taylor & Francis Group

Introduction

Students who fail summative assessment are at risk of a variety of adverse outcomes. These include emotional distress, cessation of study, and negative financial consequences (Chen 2012). Since student performance can, at least to some extent, be predicted in advance (Abrams and Jernigan 1984; Pell et al. 2009), educators must decide how and when they should intervene to support student learning.

'Remediation' is a general term for support targeted at individuals after they have failed assessment (Bahr 2010). Remediation can involve developing general study skills or content knowledge, with the ultimate goal of bringing the student up to a passing standard. Support for the utility of remediation can be found in many fields. Requiring medical students resitting an academic year to undertake a mandatory cognitive skills programme increased the subsequent pass rate from 58% to 91% (Winston et al. 2010). A comparison of medical students of equivalent ability showed that those who undertook remediation outperformed those who did not on subsequent assessment (Cleland et al. 2010). Research in higher education using large (n = 85,894) multi-site interventions has shown remediation is effective and can greatly reduce attainment gaps (Bahr 2008). It is important to find methods to identify, support, and provide remediation for at-risk students across higher education (Foster and Siddle 2020).

Despite such work, concern over the validity of remediation research is ongoing. Studies in the field are often criticised for a reliance on small sample sizes, an absence of effective controls, and a risk of publication bias (Bahr 2008; Bettinger and Long 2009). Some researchers have argued that remediation support tends to produce short-term gains which vanish over time (Pell et al. 2009). Many interventions are poorly described and cannot identify the mechanisms causing performance change, though more recent studies appear to be of higher quality (Cleland et al. 2013).

The evidence therefore suggests remediation interventions can be effective, if properly designed and with appropriate follow-up monitoring. In particular, developing an awareness of the contextual factors around learning, constructing institutional strategies to avoid the need for remediation, and exploring the specifics of the learning challenges faced by students have been consistently highlighted as features of successful interventions (Kebaetse et al. 2018; Chou et al. 2019; Lacasse et al. 2019).

Despite these positives, remediation efforts have an inherent problem: a student must have failed or underperformed on an aspect of their course to experience a remediation intervention. Failure can delay (or prevent) their progression, potentially for long periods of time (Pell et al. 2009; Winston et al. 2010). There is a risk the student will never undertake remediation: a significant proportion of medical students discontinue in the first year, potentially before effective remediation can be delivered (Arulampalam et al. 2004).

One relatively recent innovation has been to estimate performance in advance in order to pre-deliver some form of remediation content. Such early interventions for 'at-risk' students have focused firstly on identifying qualifying students. Students are sometimes categorised as at risk based on their background characteristics (including age, sex, or ethnicity) and given additional support during their studies. This has shown success in improving outcomes for some groups (Burch et al. 2007, 2013; Curtis et al. 2017) but not others (DeVoe et al. 2007). Within medical schools, issues involving attendance, attitude, completion of routine academic tasks, self-reported career choice and self-motivation have predicted later achievement on assessment (Wright and Tanner 2002; O'Neill et al. 2016; Urrutia-Aguilar et al. 2016; Li et al. 2019). Recently, attempts to predict at-risk students via learning analytics have suggested it is possible to identify over half the students qualifying as at risk (Saqr et al. 2017).

Collectively, such work represents an obvious opportunity. If well-designed remediation strategies improve student outcomes, applying such interventions before failure may allow for even better outcomes, including a reduction in lost student time, increased staff resources for other tasks, and a reduction in the distress caused by experiencing failure. However, challenges remain.

Firstly, as the mechanisms behind post-remediation performance improvement remain unclear and the context of every programme is different (Cleland et al. 2013), it is difficult to be sure any specific approach will be effective in identifying or remediating problems. As a result, individual educators may struggle to identify the most appropriate mechanisms for their context, and be unable to engage with the often time-consuming processes required to make best use of the available data; the more complex the tool, the more challenging this becomes (Wilson et al. 2017; Zhang et al. 2018). Secondly, effect sizes in the field remain highly variable, which means it is difficult to know how effective early interventions can be at detecting students in need (Bahr 2010; Saqr et al. 2017). Finally, students express concerns over the invasiveness of some of the data gathered (Roberts et al. 2016), which may rule out the use of some data sources.

In this study, we describe a prospective multi-year analysis of 1042 undergraduate medical students, identifying those in need of an early intervention within 6 weeks of starting medical school. We report on how the findings can be applied to create a framework for identifying factors relevant to predicting at-risk students, and estimate the effect size of the model and its effectiveness at identifying those who go on to be placed in the bottom two deciles on summative assessment. Finally, we present a simple and accessible tool to reproduce this work in other contexts.

Methods

Participants

All participants were from the first year of an MBChB programme at the same UK medical school. All medical students studying in Year 1 from 2014–2015 to 2019–2020 were included in the analysis, except for 2018–2019, during which the project was suspended due to staffing issues. In total, 1042 students participated. See Table 1 for a breakdown of participant information per year.

Procedure

We identified a series of measures which reflected student engagement. These were drawn from examples discussed in the literature (Wright and Tanner 2002; DeVoe et al. 2007) or informed by local experience. In the pilot year, we consulted with administrative and virtual learning environment (VLE) staff to obtain a list of potential indicators which were routinely collected and easy to access. We collected historical data for the previous three years and correlated all the available data with the students' first summative exam score. We chose predictors with a correlation above 0.3 (moderate). These were formative exam result, attendance at problem-based learning (PBL), and promptness of returning essential paperwork. In subsequent years the list did not change, and so represented a prospective analysis of at-risk status.

An engagement score was created by weighting the three predictors linearly based on their correlation with summative exam performance, rounded for ease of use. Formative exam result, converted to a percentage, had the strongest correlation and was weighted at 50%. Attendance was expressed as a percentage and then weighted at 25%. We calculated how far in advance of the deadline the essential paperwork was completed, meaning those who submitted on the first possible day had the highest score; this was also weighted at 25%. The weighted scores were then added together to create the engagement score, where the lower the score, the lower the engagement.

In all years, the outcome measure was student scores on a multiple-choice question (MCQ) written assessment, standard set to have a pass score of 60. This summative assessment acted as a progression barrier and failing it required a resit.

Ethical approval for the study was granted by the MVM Education Research Ethics Board. The study was considered routine evaluation of teaching, and students consented to the use of their data for these purposes.
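As a concrete illustration, the engagement score weighting described in the Procedure (formative exam result 50%, PBL attendance 25%, paperwork promptness 25%) can be expressed as a short function. This is a sketch, not the authors' published toolkit: the function names are illustrative, and the exact transformation of paperwork promptness to a 0–100 scale is an assumption, since the paper states the principle but not the formula.

```python
def engagement_score(formative_pct, attendance_pct, paperwork_pct):
    """Engagement score from the three predictors, each on a 0-100 scale.

    Weights follow the Procedure: formative exam result 50%,
    PBL attendance 25%, promptness of paperwork return 25%.
    Lower scores indicate lower engagement.
    """
    return 0.5 * formative_pct + 0.25 * attendance_pct + 0.25 * paperwork_pct


def paperwork_promptness_pct(days_before_deadline, max_days_before_deadline):
    """Scale 'how far ahead of the deadline paperwork was returned' to 0-100,
    so a first-possible-day submission scores 100.

    Illustrative assumption: the paper does not give this exact transformation.
    """
    return 100.0 * days_before_deadline / max_days_before_deadline
```

A candidate who scored 60% on the formative exam, attended 80% of PBL sessions, and returned paperwork at 40% promptness would receive an engagement score of 60 under this weighting.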
Statistical analyses

A Shapiro–Wilk test was used to test the regression residuals for normality. The data were non-normal (W = 0.87, p < 0.001), so a bootstrap regression was also performed. As the results were similar and both significant (p < 0.001), simple linear regression was accepted as the inferential test for simplicity and reproducibility.
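The analysis sequence (fit the regression, check residual normality, fall back to a bootstrap and compare) can be sketched in plain Python. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are invented here, and in practice the normality check itself would come from a statistics package (e.g. scipy.stats.shapiro for the Shapiro–Wilk test).

```python
import random
import statistics


def fit_ols(x, y):
    """Ordinary least-squares fit for a single predictor.

    Returns (slope, intercept). With standardised inputs the slope
    corresponds to the standardised beta reported in the paper.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_slope_ci(x, y, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% confidence interval for the slope.

    Resamples candidates with replacement and refits the regression:
    the fallback used when residuals fail the normality check.
    """
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(fit_ols([x[i] for i in idx], [y[i] for i in idx])[0])
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot)]
```

If the bootstrap interval and the ordinary fit agree, the simpler model is reported, mirroring the choice made above for reproducibility.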
Simple linear regression examines the relationship between the predictor and outcome variables (Lewis-Beck 2004). Values of the outcome variable can be calculated from the predictor variable. This is expressed as a standardised beta (β): as the predictor value changes by one standard deviation, the outcome variable changes by β standard deviations.

Results

Descriptive statistics

A breakdown of the number of participants per session, along with the means and SDs of the engagement and summative scores, can be seen in Table 1. The mean engagement score varies between years because the return of essential paperwork score was derived using the deadline given to the students, which changed from year to year.

Table 2 shows the numbers of candidates who failed the summative exam over all sessions. Engagement score was split into equal deciles by session, where decile 1 is the lowest scoring and 10 the highest. Deciles were chosen to overcome variation in year-average engagement scores. The percentage of all candidates shows that the actual number of fails for this exam is low. The percentage of failing candidates shows that more than 50% of all failing candidates score in deciles 1 and 2 of the engagement score.

Inferential statistics

Results of the simple linear regression indicated that a small amount of variance was explained by the model (R²adj = 0.03, F(1,1040) = 90.09, p < 0.001, f² = 0.09). Engagement score significantly predicted candidate performance on their first summative examination (β = 0.28, p < 0.001). This suggests that as the engagement score increases by 14.13, the performance score will increase by 3.77 on average. This relationship can be viewed in Figure 1.

A post hoc statistical power analysis was performed for a simple linear regression with 1042 participants. The effect size in this study was f² = 0.09, which is considered small using Cohen's (1992) criteria. With alpha = 0.05, the power = 1, which is sufficient to detect a small effect size.
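The decile analysis described above can be reproduced with a simple ranking procedure. The sketch below is an illustrative reconstruction, not the authors' toolkit: it ranks candidates by engagement score, assigns deciles (1 = lowest), and reports the fraction of failing candidates in the bottom two deciles; in practice, as in the study, deciles would be computed separately per session.

```python
def decile(rank, n):
    """Map a 0-based ascending rank among n candidates to decile 1-10,
    where decile 1 holds the lowest engagement scores."""
    return min(rank * 10 // n + 1, 10)


def failing_in_bottom_deciles(engagement, failed, cutoff=2):
    """Fraction of failing candidates whose engagement score sits in the
    bottom `cutoff` deciles. `engagement` and `failed` are parallel
    per-candidate lists (score, and True if the candidate failed)."""
    order = sorted(range(len(engagement)), key=lambda i: engagement[i])
    dec = {i: decile(rank, len(engagement)) for rank, i in enumerate(order)}
    fails = [i for i, f in enumerate(failed) if f]
    if not fails:
        return 0.0
    return sum(dec[i] <= cutoff for i in fails) / len(fails)
```

A result above 0.5 from this procedure corresponds to the paper's finding that more than half of failing candidates fall in deciles 1 and 2.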
Discussion

Performance on assessment can be predicted from information generated within the first few weeks of starting medical school (R²adj = 0.03, F(1,1040) = 90.09, p < 0.001). In the model, over half of all students who failed summative assessment scored in the bottom two deciles of the engagement score. Collectively, these results support the use of very early interventions to help students before they encounter difficulty. The effect size was small (f² = 0.09), which is important. Such exercises can complement teacher evaluation.

The findings largely agree with past research. Research using admissions data has been able to identify around 40% of at-risk students (Wright and Tanner 2002). The use of tutor ratings has had some success in identifying at-risk students, correctly classifying 24% of those who went on to fail (O'Neill et al. 2016). Using full-course data, around half of at-risk students were identified in a more recent study (Saqr et al. 2017). Notably, the small effect size found in this study (f² = 0.09) adds to a body of work where effect sizes have been highly variable, with some research suggesting very high levels of accuracy (Saqr et al. 2017) compared to others. This may partly be explained by the fact that the present study focused on very early measures of engagement, but the range of expected effect sizes remains unclear. Our findings support the need to investigate remediation, a priority throughout higher education (Foster and Siddle 2020).

This study is particularly significant as it measures risk at a very early point and does so using simple, routinely collected measures which require no special expertise or access to potentially sensitive data such as admissions profiles. The simplicity and accessibility of the measures allows for easy replication in other contexts using the tools provided, and we have provided a full set of guidance notes on replicating the work. This study also further demonstrates that we need to consider how to use predictive measures in education. Concerns over the use of data in higher education are growing (Roberts et al. 2016), but as we have the capacity to make meaningful predictions of future performance, it is necessary for educators and students to reach a consensus on what should be done with such capabilities. The present study used prospective data, which reduces the likelihood of overfitting and increases confidence that the detected effect is genuine and reproducible.

Inevitably, there were some limitations to the design. This was a single-site study which used a relatively short-term outcome measure. It does not consider the long-term predictive validity of the engagement score, such as likelihood of graduation. Importantly, it cannot explain the mechanisms behind the observed associations, and the simplicity of the measures may have created trade-offs with predictive validity. Additionally, the number of failing students is small, and therefore replication of the study, particularly in additional institutions, is necessary to confirm the conclusion.

There are several topics for future research that would be especially useful. Given the variation in effect sizes, it would be beneficial to see replications in other contexts to better estimate the range of possible effects (and the causes behind the variability). Longitudinal work looking at performance later in the programme and likelihood of graduation would be especially valuable to see whether early predictors add explanatory value over simply using assessment scores. Importantly, researchers should explore the mechanisms behind early risk factors to better understand what aspects of learning can be improved to help students identified as at risk.

This study significantly adds to our understanding of the utility of early identification of at-risk students. It also provides a useful tool for educators to apply in their own context. This tool may be particularly useful when school-level education has been disrupted and entry grades are less certain, as in the COVID-19 pandemic (Fuller et al. 2020). As our understanding of this topic evolves, we will be better placed to support students as soon as they enter their programme.

Disclosure statement

The authors have no declarations of interest to report.

Glossary

Learning analytics: the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs.
Siemens G, Gasevic D. 2012. Guest editorial: learning and knowledge analytics. Journal of Educational Technology & Society. 15:1–2.

Funding

This work was supported by the Principal's Teaching Award Scheme at the University of Edinburgh.

Notes on contributors

Avril Dewar, MSc, is a fellow in medical education at Edinburgh Medical School.

David Hope, PhD, is a senior lecturer at Edinburgh Medical School.

Alan Jaap, MD, is Deputy Director of Teaching at Edinburgh Medical School.

Helen Cameron, MBChB, is Acting Head of School and Dean of Medical Education at Aston Medical School.

ORCID

Avril Dewar http://orcid.org/0000-0003-1992-6148
David Hope http://orcid.org/0000-0001-6623-2857
Alan Jaap http://orcid.org/0000-0001-8289-704X
Helen Cameron http://orcid.org/0000-0002-2798-2177

References

Abrams HG, Jernigan LP. 1984. Academic support services and the success of high-risk college students. Am Educ Res J. 21(2):261–274.
Arulampalam W, Naylor RA, Smith JP. 2004. A hazard model of the probability of medical school drop-out in the UK. J Royal Statistical Soc A. 167(1):157–178.
Bahr PR. 2008. Does mathematics remediation work?: a comparative analysis of academic attainment among community college students. Res High Educ. 49(5):420–450.
Bahr PR. 2010. Revisiting the efficacy of postsecondary remediation: the moderating effects of depth/breadth of deficiency. Rev Higher Educ. 33(2):177–205.
Bettinger EP, Long BT. 2009. Addressing the needs of underprepared students in higher education: does college remediation work? J Human Resources. 44(3):736–771.
Burch VC, Sikakana CN, Gunston GD, Shamley DR, Murdoch-Eaton D. 2013. Generic learning skills in academically-at-risk medical students: a development programme bridges the gap. Med Teach. 35(8):671–677.
Burch VC, Sikakana CN, Yeld N, Seggie JL, Schmidt HG. 2007. Performance of academically at-risk medical students in a problem-based learning programme: a preliminary report. Adv Health Sci Educ Theory Pract. 12(3):345–358.
Chen R. 2012. Institutional characteristics and college student dropout risks: a multilevel event history analysis. Res High Educ. 53(5):487–505.
Chou CL, Kalet A, Costa MJ, Cleland J, Winston K. 2019. Guidelines: the dos, don'ts and don't knows of remediation in medical education. Perspect Med Educ. 8(6):322–338.
Cleland J, Leggett H, Sandars J, Costa MJ, Patel R, Moffat M. 2013. The remediation challenge: theoretical and methodological insights from a systematic review. Med Educ. 47(3):242–251.
Cleland J, Mackenzie RK, Ross S, Sinclair HK, Lee AJ. 2010. A remedial intervention linked to a formative assessment is effective in terms of improving student performance in subsequent degree examinations. Med Teach. 32(4):e185–e190.
Cohen J. 1992. A power primer. Psychol Bull. 112(1):155–159.
Curtis E, Wikaire E, Jiang Y, McMillan L, Loto R, Poole P, Barrow M, Bagg W, Reid P. 2017. Examining the predictors of academic outcomes for indigenous Māori, Pacific and rural students admitted into medicine via two equity pathways: a retrospective observational study at the University of Auckland, Aotearoa New Zealand. BMJ Open. 7(8):e017276.
DeVoe P, Niles C, Andrews N, Benjamin A, Blacklock L, Brainard A, Colombo E, Dudley B, Koinis C, Osgood M. 2007. Lessons learned from a study-group pilot program for medical students perceived to be 'at risk'. Med Teach. 29(2–3):e37–e40.
Foster E, Siddle R. 2020. The effectiveness of learning analytics for identifying at-risk students in higher education. Assess Eval Higher Educ. 45(6):842–854.
Fuller R, Joynes V, Cooper J, Boursicot K, Roberts T. 2020. Could COVID-19 be our 'There is no alternative' (TINA) opportunity to enhance assessment? Med Teach. 42(7):781–786.
Kebaetse MB, Kebaetse M, Mokone GG, Nkomazana O, Mogodi M, Wright J, Falama R, Park E. 2018. Learning support interventions for Year 1 medical students: a review of the literature. Med Educ. 52(3):263–273.
Lacasse M, Audétat MC, Boileau É, Caire Fon N, Dufour MH, Laferrière MC, Lafleur A, La Rue È, Lee S, Nendaz M, et al. 2019. Interventions for undergraduate and postgraduate medical learners with academic difficulties: a BEME systematic review: BEME Guide No. 56. Med Teach. 41(9):981–1001.
Lewis-Beck M. 2004. Regression. In: Lewis-Beck MS, Bryman A, Liao TF, editors. The SAGE encyclopedia of social science research methods. Thousand Oaks (CA): SAGE Publications, Inc.; p. 936–938.
Li J, Thompson R, Shulruf B. 2019. Struggling with strugglers: using data from selection tools for early identification of medical students at risk of failure. BMC Med Educ. 19(1):415.
O'Neill LD, Morcke AM, Eika B. 2016. The validity of student tutors' judgments in early detection of struggling in medical school. A prospective cohort study. Adv Health Sci Educ Theory Pract. 21(5):1061–1079.
Pell G, Boursicot K, Roberts T. 2009. The trouble with resits…. Assess Eval High Educ. 34(2):243–251.
Roberts LD, Howell JA, Seaman K, Gibson DC. 2016. Student attitudes toward learning analytics in higher education: "The Fitbit Version of the Learning World". Front Psychol. 7:1959.
Saqr M, Fors U, Tedre M. 2017. How learning analytics can early predict under-achieving students in a blended medical education course. Med Teach. 39(7):757–767.
Urrutia-Aguilar ME, Fuentes-García R, Martínez V, Beck E, León S, Guevara-Guzmán R. 2016. Logistic regression model for the academic performance of first-year medical students in the biomedical area. Creative Education. 7(15):2202–2211.
Wilson A, Watson C, Thompson TL, Drew V, Doyle S. 2017. Learning analytics: challenges and limitations. Teach High Educ. 22(8):991–1007.
Winston KA, Van der Vleuten CP, Scherpbier AJ. 2010. An investigation into the design and effectiveness of a mandatory cognitive skills programme for at-risk medical students. Med Teach. 32(3):236–243.
Wright N, Tanner MS. 2002. Medical students' compliance with simple administrative tasks and success in final examinations: retrospective cohort study. BMJ. 324(7353):1554–1555.
Zhang J, Zhang X, Jiang S, Ordóñez de Pablos P, Sun Y. 2018. Mapping the study of learning analytics in higher education. Behav Inf Technol. 37(10–11):1142–1155.