Using Educational Data Mining Techniques to Identify Profiles in Self-Regulated Learning: An Empirical Evaluation
DOI:
https://doi.org/10.19173/irrodl.v22i4.5401
Keywords:
educational data mining, EDM, self-regulated learning, SRL profile, algorithm, agglomerative hierarchical clustering, clustering algorithm
Abstract
With the increased emphasis on the benefits of self-regulated learning (SRL), it is important to use the large volumes of educational data generated by online learning environments to identify educational data mining (EDM) techniques that can help explore and understand online learners’ behavioral patterns. Understanding learner behaviors offers insight into the types of interventions that can be offered to online learners, who currently receive limited support from instructors compared to their counterparts in traditional face-to-face classrooms. In view of this, our study first identified an optimal EDM algorithm by empirically evaluating the potential of three clustering algorithms (expectation-maximization, agglomerative hierarchical, and k-means) to identify SRL profiles using trace data collected from the Open University of the UK. Results revealed that agglomerative hierarchical clustering was the optimal algorithm, with four clusters. From the four clusters, four SRL profiles were identified: poor self-regulators, intermediate self-regulators, good self-regulators, and exemplary self-regulators. Second, through correlation analysis, our study established that there is a significant relationship between the SRL profiles and students’ final results. Based on our findings, we recommend agglomerative hierarchical clustering as the optimal algorithm for identifying SRL profiles in online learning environments. Furthermore, these profiles could provide insights into how to design a learning management system that promotes SRL based on learner behaviors.
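To make the clustering step concrete, the following is a minimal pure-Python sketch of agglomerative hierarchical clustering, with the number of clusters chosen by mean silhouette width. It is an illustration under stated assumptions, not the study's pipeline: the feature values are invented, and the choice of average linkage, Euclidean distance, and the silhouette measure is an assumption, since the abstract does not say which linkage or validity measures were used. The function names `agglomerative` and `silhouette` are hypothetical helpers, and the sketch covers only one of the three algorithms compared.

```python
from itertools import combinations
from math import dist  # Python 3.8+

# Hypothetical learner features, e.g. (logins per week, tasks submitted on time).
# These values are invented for illustration; they are not the study's data.
points = [
    (1, 1), (1, 2), (2, 1),
    (10, 1), (10, 2), (11, 1),
    (1, 10), (2, 10), (1, 11),
    (10, 10), (11, 10), (10, 11),
]


def agglomerative(points, k):
    """Average-linkage agglomerative clustering; returns k lists of point indices."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters (x < y) and merge y into x.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def silhouette(points, clusters):
    """Mean silhouette width (needs at least two clusters; higher is better)."""
    scores = []
    for c in clusters:
        for i in c:
            if len(c) == 1:
                scores.append(0.0)  # convention: singleton clusters score 0
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(dist(points[i], points[j]) for j in c if j != i) / (len(c) - 1)
            # b: mean distance to the members of the nearest other cluster.
            b = min(
                sum(dist(points[i], points[j]) for j in o) / len(o)
                for o in clusters
                if o is not c
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Score each candidate number of clusters and keep the best one.
scores = {k: silhouette(points, agglomerative(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 2))
```

On this toy data the four tight groups are recovered and four clusters give the highest silhouette width. Real trace data would need feature engineering and scaling first, and the naive merge loop here is only practical for small samples.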
License
This work is licensed under a Creative Commons Attribution 4.0 International Licence. The copyright of all content published in IRRODL is retained by the authors.
This copyright agreement and use license ensures, among other things, that an article will be as widely distributed as possible and that the article can be included in any scientific and/or scholarly archive.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms below:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.