Using Educational Data Mining Techniques to Identify Profiles in Self-Regulated Learning: An Empirical Evaluation

Authors

Araka, E., Oboko, R., Maina, E., & Gitonga, R.

DOI:

https://doi.org/10.19173/irrodl.v22i4.5401

Keywords:

educational data mining, EDM, self-regulated learning, SRL profile, algorithm, agglomerative hierarchical clustering, clustering algorithm

Abstract

With the increased emphasis on the benefits of self-regulated learning (SRL), it is important to make use of the huge amounts of educational data generated from online learning environments to identify the appropriate educational data mining (EDM) techniques that can help explore and understand online learners’ behavioral patterns. Understanding learner behaviors helps us gain more insights into the right types of interventions that can be offered to online learners who currently receive limited support from instructors as compared to their counterparts in traditional face-to-face classrooms. In view of this, our study first identified an optimal EDM algorithm by empirically evaluating the potential of three clustering algorithms (expectation-maximization, agglomerative hierarchical, and k-means) to identify SRL profiles using trace data collected from the Open University of the UK. Results revealed that agglomerative hierarchical was the optimal algorithm, with four clusters. From the four clusters, four SRL profiles were identified: poor self-regulators, intermediate self-regulators, good self-regulators, and exemplary self-regulators. Second, through correlation analysis, our study established that there is a significant relationship between the SRL profiles and students’ final results. Based on our findings, we recommend agglomerative hierarchical as the optimal algorithm to identify SRL profiles in online learning environments. Furthermore, these profiles could provide insights on how to design a learning management system which could promote SRL, based on learner behaviors.
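To make the clustering step in the abstract concrete, the following is a minimal sketch of agglomerative hierarchical clustering cut at four clusters. It is not the authors' pipeline: the two-feature data points are invented for illustration, average linkage with Euclidean distance is an assumed choice, and ranking clusters by mean activity to attach the four profile names is a simplification made here, not the study's labelling procedure.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Hypothetical trace features per learner (made-up values, not study data),
# for example (weekly clicks, days active).
points = [
    (1, 1), (2, 1), (1, 2),
    (10, 10), (11, 10), (10, 11),
    (20, 20), (21, 20), (20, 21),
    (30, 30), (31, 30), (30, 31),
]

clusters = agglomerative(points, n_clusters=4)

# Assumption for illustration: order clusters by mean activity and attach
# the four profile names reported in the abstract.
labels = ["poor", "intermediate", "good", "exemplary"]
ranked = sorted(clusters, key=lambda c: sum(sum(points[i]) for i in c) / len(c))
profiles = {label: sorted(c) for label, c in zip(labels, ranked)}

for label, members in profiles.items():
    print(f"{label} self-regulators: learners {members}")
```

On real trace data the features would need scaling, and the number of clusters would be chosen with validity measures rather than fixed in advance.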

 

Published

2022-02-01

How to Cite

Araka, E., Oboko, R., Maina, E., & Gitonga, R. (2022). Using Educational Data Mining Techniques to Identify Profiles in Self-Regulated Learning: An Empirical Evaluation. The International Review of Research in Open and Distributed Learning, 23(1), 131–162. https://doi.org/10.19173/irrodl.v22i4.5401

Publication Facts

This article

Peer reviewers: 2 (other articles: 2.4)
Reviewer profiles: N/A

Author statements

Data availability: N/A (other articles: 16%)
External funding: No (other articles: 32%)
Competing interests: N/A (other articles: 11%)

This journal

Articles accepted: 86% (other journals: 33%)
Days to publication: 420 (other journals: 145)
Academic society: N/A
Publisher: Athabasca University Press