Abstract
Predicting student performance in educational organizations such as universities, community colleges, schools, and training centers can enhance the overall results of these organizations. Big data can be extracted from their internal systems, such as exam records, statistics about virtual courses, and e-learning systems, but finding meaningful knowledge in the extracted data is a challenging task. In this paper, we propose a modified version of the Harris Hawks Optimization (HHO) algorithm that controls population diversity to overcome the early convergence problem and prevent trapping in a local optimum. The proposed approach is employed as a feature selection algorithm to discover the most valuable features for the student performance prediction problem. A dynamic controller monitors the population diversity of HHO using the k-nearest neighbors (kNN) algorithm as a clustering approach: once all solutions belong to one cluster, an injection process redistributes the solutions over the search space. A set of machine learning classifiers, namely kNN, a layered recurrent neural network (LRNN), Naïve Bayes, and an artificial neural network, is used to evaluate the overall prediction system. A real dataset obtained from the UCI machine learning repository is adopted in this paper. The obtained results show the importance of predicting students’ performance at an early stage to avoid student failure and improve the overall performance of the educational organization. Moreover, the reported results show that combining the enhanced HHO with the LRNN outperforms the other classifiers with an accuracy of \(92\%\), since the LRNN is a deep learning algorithm that is able to learn from previous and current input values.
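To make the diversity-control idea concrete, the following is a minimal Python sketch of how such a monitor-and-inject step could look inside an HHO loop. It is not the authors' implementation: the clustering uses a simple k-means-style assignment as a stand-in for the paper's kNN-based grouping, and all function names, parameter values, and the choice to keep only the best solution are illustrative assumptions.

```python
import numpy as np

def cluster_count(population, n_clusters=3, seed=0):
    """Rough diversity check: how many of the requested clusters end up non-empty?"""
    rng = np.random.default_rng(seed)
    # Pick random solutions as centres and assign every hawk to its nearest centre
    # (a k-means-style stand-in for the paper's kNN-based grouping).
    centers = population[rng.choice(len(population), size=n_clusters, replace=False)]
    dists = np.linalg.norm(population[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return len(np.unique(labels))

def inject_diversity(population, fitness, lower, upper, keep_best=1, rng=None):
    """Reseed all but the best solutions uniformly over the search space."""
    rng = rng or np.random.default_rng()
    order = np.argsort(fitness)              # assumes a minimization problem
    new_pop = population.copy()
    for idx in order[keep_best:]:            # keep the elite hawk(s), reseed the rest
        new_pop[idx] = rng.uniform(lower, upper, size=population.shape[1])
    return new_pop

# Hypothetical use inside an HHO iteration:
# if cluster_count(pop) == 1:                # all hawks collapsed into one cluster
#     pop = inject_diversity(pop, fit, lb, ub)
```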
Acknowledgements
The authors would like to acknowledge Taif University Researchers Supporting Project Number (TURSP-2020/125), Taif University, Taif, Saudi Arabia.
About this article
Cite this article
Turabieh, H., Azwari, S.A., Rokaya, M. et al. Enhanced Harris Hawks optimization as a feature selection for the prediction of student performance. Computing 103, 1417–1438 (2021). https://doi.org/10.1007/s00607-020-00894-7