Abstract
This paper focuses on improving the accuracy of student performance prediction using LSTM and Random Forest (RF) algorithms. The models are trained on datasets from Jordan, and the study also aims to shed light on their internal workings. The LIME and SHAP explainability methods were employed to gain insight into the inner mechanisms of these prediction models. The study found that different explanation techniques identified different factors as crucial to student success, even when applied to the same group of students and the same machine learning models. According to both LIME and SHAP, the LSTM model was predominantly influenced by the Parent Answering Survey and behavior features. Conversely, SHAP and LIME showed that the Student Absence Days and behavior features had a more pronounced influence on the outcomes of the RF model.
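As a rough illustration of the idea behind SHAP-style attributions, the sketch below computes exact Shapley values for a tiny hand-written scoring function. Everything in it is an assumption made for illustration: the function `toy_model`, its coefficients, the baseline values, and the lowercase feature names (which only echo the features named in the abstract). It is neither the paper's LSTM or RF model nor the `shap` library's API.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names echoing the abstract; all values are invented.
FEATURES = ["absence_days", "behavior", "parent_survey"]


def toy_model(x):
    """Stand-in predictor (NOT the paper's LSTM or RF): a hand-written score."""
    score = 0.5
    score -= 0.2 * x["absence_days"]
    score += 0.2 * x["behavior"]
    score += 0.1 * x["parent_survey"] * x["behavior"]  # interaction term
    return score


def value(subset, instance, baseline):
    """Model output when features in `subset` use the instance's values
    and all other features are replaced by baseline values."""
    x = {f: (instance[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_model(x)


def shapley_values(instance, baseline):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over all subsets of the remaining features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (
                    value(s | {f}, instance, baseline)
                    - value(s, instance, baseline)
                )
        phi[f] = total
    return phi


# One hypothetical student compared against an all-zero baseline.
instance = {"absence_days": 1.0, "behavior": 1.0, "parent_survey": 1.0}
baseline = {"absence_days": 0.0, "behavior": 0.0, "parent_survey": 0.0}
phi = shapley_values(instance, baseline)
```

In this toy setting the attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Exact enumeration is only feasible for a handful of features, so practical tools rely on approximations.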
Data Availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Funding
No funding was received for this research.
Ethics declarations
Conflict of Interest
No conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection “Advances in Computational Approaches for Image Processing, Wireless Networks, Cloud Applications and Network Security” guest edited by P. Raviraj, Maode Ma and Roopashree H R.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kartik, N., Mahalakshmi, R. & Venkatesh, K.A. XAI-Based Student Performance Prediction: Peeling Back the Layers of LSTM and Random Forest’s Black Boxes. SN COMPUT. SCI. 4, 699 (2023). https://doi.org/10.1007/s42979-023-02070-y
DOI: https://doi.org/10.1007/s42979-023-02070-y