DOI: 10.1145/3430895.3460137
Research article

Classification of Discussions in MOOC Forums: An Incremental Modeling Approach

Published: 08 June 2021

Abstract

Supervised classification models are commonly used to classify discussions in MOOC forums. In most cases, these models require a tedious process of manually labeling forum messages as training data, so new methods are needed to reduce the human effort involved in preparing such training datasets. In this study we follow an incremental approach to examine how soon after the beginning of a new course enough data have been collected to train a supervised classification model. We show that by employing features derived from a seeded topic modeling method, we obtain classifiers with reliable performance early in the course's life, thus significantly reducing the human effort. The content of the MOOC platform is used to bias the topic extraction towards discussions related to (a) course content, (b) logistics, or (c) social interactions. We then develop a supervised model at the start of each week, based on the topic features of all previous weeks, and evaluate its performance in classifying the discussions for the rest of the course. Our approach was applied to three MOOCs of different subjects and sizes. The findings reveal that supervised models are able to perform reliably quite early in a MOOC's life and retain a steady overall accuracy across the remaining weeks, without needing to be trained on the entire forum dataset.
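
The weekly train-then-evaluate protocol described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the seed words, the toy posts, and the function names are hypothetical; seed-word counts stand in for the seeded topic model features; and a nearest-centroid rule stands in for the supervised classifier, which the abstract does not name.

```python
from collections import defaultdict

# Hypothetical seed words per category. The paper derives its seeds from the
# MOOC platform content; these are made up for illustration.
SEEDS = {
    "content": {"variable", "loop", "function", "error"},
    "logistics": {"deadline", "certificate", "grade", "quiz"},
    "social": {"hello", "thanks", "everyone", "introduce"},
}
CATEGORIES = sorted(SEEDS)  # fixed feature order: content, logistics, social


def topic_features(text):
    """Count seed-word hits per category (crude stand-in for topic features)."""
    tokens = text.lower().split()
    return [sum(tok in SEEDS[c] for tok in tokens) for c in CATEGORIES]


def train_centroids(posts):
    """Average the feature vectors of each label (nearest-centroid training)."""
    sums = defaultdict(lambda: [0.0] * len(CATEGORIES))
    counts = defaultdict(int)
    for _, text, label in posts:
        for i, value in enumerate(topic_features(text)):
            sums[label][i] += value
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids, text):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    feats = topic_features(text)

    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[label]))

    return min(sorted(centroids), key=dist)


def incremental_accuracy(posts, last_week):
    """At the start of each week w, train on weeks < w and test on weeks >= w."""
    results = {}
    for w in range(2, last_week + 1):
        train = [p for p in posts if p[0] < w]
        test = [p for p in posts if p[0] >= w]
        if not train or not test:
            continue
        model = train_centroids(train)
        correct = sum(predict(model, text) == label for _, text, label in test)
        results[w] = correct / len(test)
    return results


# Toy forum data: (week, post text, manually assigned label).
POSTS = [
    (1, "hello everyone", "social"),
    (1, "my loop raises an error", "content"),
    (1, "when is the quiz deadline", "logistics"),
    (2, "thanks everyone", "social"),
    (2, "how do I call a function", "content"),
    (3, "where is my certificate", "logistics"),
    (3, "this variable is undefined", "content"),
]

if __name__ == "__main__":
    for week, acc in incremental_accuracy(POSTS, last_week=3).items():
        print(f"model built at start of week {week}: accuracy {acc:.2f}")
```

Plotting the per-week accuracy returned by `incremental_accuracy` shows how early the curve flattens, which is the question the study asks: the week at which additional labeled data stops improving the classifier is the point after which manual labeling could stop.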

Supplementary Material

MP4 File (L-at-S21-lsfp034.mp4)
In this video we present the study "Classification of Discussions in MOOC Forums: An Incremental Modeling Approach". The study addresses the need for new methods that reduce the human effort required to prepare training datasets for supervised classification of MOOC forum discussions. We follow an incremental approach to examine how soon after the beginning of a course enough data have been collected to train a supervised classifier. We show that by employing features derived from a seeded topic modeling method, biased by the content of the MOOC platform, we achieve reliable performance early in the course's life, thus significantly reducing the human effort. Our approach was applied to three MOOCs of different subjects. The findings reveal that supervised models are able to perform reliably quite early in a MOOC's life and retain a steady overall accuracy across the remaining weeks, without needing to be trained on the entire forum dataset.


Cited By

  • (2023) Designing effective discussion forum in MOOCs: insights from learner perspectives. Frontiers in Education 8. DOI: 10.3389/feduc.2023.1223409. Online publication date: 1-Dec-2023
  • (2023) Lessons from debiasing data for fair and accurate predictive modeling in education. Expert Systems with Applications: An International Journal 228:C. DOI: 10.1016/j.eswa.2023.120323. Online publication date: 15-Oct-2023
  • (2022) Automatic content analysis of asynchronous discussion forum transcripts: A systematic literature review. Education and Information Technologies 27:8 (11355-11410). DOI: 10.1007/s10639-022-11065-w. Online publication date: 1-Sep-2022
  • (2022) MOOC-LSTM: The LSTM Architecture for Sentiment Analysis on MOOCs Forum Posts. Computational Intelligence and Data Analytics (283-293). DOI: 10.1007/978-981-19-3391-2_21. Online publication date: 2-Sep-2022


        Published In

        ACM Other conferences
        L@S '21: Proceedings of the Eighth ACM Conference on Learning @ Scale
        June 2021
        380 pages
        ISBN:9781450382151
        DOI:10.1145/3430895

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. MOOC
        2. corex
        3. discussion forum
        4. supervised classification
        5. topic modeling

        Qualifiers

        • Research-article

        Funding Sources

        • State Scholarships Foundation (IKY)

        Conference

        L@S '21
        L@S '21: Eighth (2021) ACM Conference on Learning @ Scale
        June 22 - 25, 2021
        Virtual Event, Germany

        Acceptance Rates

        Overall Acceptance Rate 117 of 440 submissions, 27%
