Abstract
While Massive Open Online Course (MOOC) platforms provide knowledge in a new and unique way, their very high number of dropouts is a significant drawback. Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout. The jury is still out on which factors are the most appropriate predictors. However, the literature agrees that early prediction is vital to allow for timely intervention. Whilst feature-rich predictors may have the best chance of high accuracy, they may be unwieldy. This study aims to predict learner dropout early on, from the first week, by comparing several machine-learning approaches, including Random Forest, Adaptive Boost, XGBoost and GradientBoost classifiers. The results show promising accuracies (82%–94%) using as few as two features. We show that the accuracies obtained outperform state-of-the-art approaches, even when the latter deploy several features.
Acknowledgment
We would like to thank FAPEAM (Foundation for the State of Amazonas Research), through Edital 009/2017, for partially funding this research.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Alamri, A. et al. (2019). Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week’s Activities. In: Coy, A., Hayashi, Y., Chang, M. (eds) Intelligent Tutoring Systems. ITS 2019. Lecture Notes in Computer Science, vol 11528. Springer, Cham. https://doi.org/10.1007/978-3-030-22244-4_20
DOI: https://doi.org/10.1007/978-3-030-22244-4_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22243-7
Online ISBN: 978-3-030-22244-4
eBook Packages: Computer Science, Computer Science (R0)