1 Introduction

The main objective of e-learning platforms is to move beyond the traditional framework of education and to improve teaching methods for better learning. However, e-learning remains a complex learning environment in which the learner is autonomous, often isolated, and responsible for his/her own educational experience. The learner must therefore be engaged enough to counterbalance the factors that hinder his/her learning. There are three types of engagement: emotional, behavioral, and cognitive. In our work, we tackle the cognitive dimension of engagement since it reveals learners’ reflection and critical thinking. However, the latent nature of engagement and the lack of direct interaction between students and tutors make predicting the engagement level challenging (Aleven 2010). Therefore, we focus on the online discussion forum as an asynchronous communication tool that fosters social interaction.

An online discussion forum is a tool that allows free communication between participants at any time while keeping track of the various exchanges. Given the degree of learner autonomy in online learning, Larkin-Hein (2001) finds that discussion forums represent a promising way for learners both to achieve emotional attachment and to acquire an effective role in the program. Althaus (1997) adds that learners learn better through their participation in online discussions because they are placed in a socio-intellectual environment that encourages active participation, reflection, and equality among learners.

In our work, we explore learners’ transcripts in order to extract features that reveal their cognitive behavior and, more specifically, their level of cognitive engagement. To do so, we propose to automatically classify learners according to cognitive engagement levels based on their social interaction by combining Text Mining and Machine Learning techniques. We distinguish four levels of cognitive engagement: passive, active, constructive, and interactive.

2 Detecting Learners’ Level of Cognitive Engagement

Effective and efficient learning cannot be discussed without addressing learners’ cognitive behavior, particularly their cognitive engagement. Cognitive engagement reflects the quality and degree of mental effort that a learner spends during the learning process. Our main objective is therefore to determine the level of learners’ cognitive engagement from their social interactions within discussion forums.

According to the ICAP Framework (Chi and Wylie 2014), we can distinguish four levels of cognitive engagement:

  • Passive: the learner simply receives the information without analyzing it, interpreting it, or even reacting to it.

  • Active: the learner can understand the text, summarize it, and focus on what he/she is learning.

  • Constructive: the learner becomes productive and can generate new ideas and construct knowledge.

  • Interactive: the learner can debate with peers and defend his/her ideas.

To automatically classify learners into the four levels above, we proceed in two essential phases:

2.1 Learners’ Vector Construction

This step relies on feature extraction to model learners and construct their vectors. In our system, a learner is described by his/her messages, categorized according to the cognitive presence phases (Hayati et al. 2019), as well as by traces of his/her social interaction within the platform, specifically in the discussion forum. Therefore, we have two types of attributes:

The Cognitive Presence categorized messages:

For each learner, we calculate:

  • TE: number of messages belonging to the Triggering Event phase.

  • Ex: number of messages belonging to the Exploration phase.

  • Int: number of messages belonging to the Integration phase.

  • Res: number of messages belonging to the Resolution phase.

The social interaction features:

  • Add_post: number of added posts.

  • Discussion_view: number of times the learner consulted the discussions.

  • Thread_count: number of discussions initiated by the learner.

  • Nbr_peer_interaction: number of peers interacting with the learner.

  • Nbr_vote: number of votes collected by the learner.

  • Time_spent: time spent by the learner in the online forum.

Thus, for a learner i we have the vector

$$ \overrightarrow{A_{i}} \left( \text{TE}, \text{Ex}, \text{Int}, \text{Res}, \text{Add\_post}, \text{Discussion\_view}, \text{Thread\_count}, \text{Nbr\_peer\_interaction}, \text{Nbr\_vote}, \text{Time\_spent} \right) $$
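As an illustration, the construction of a learner vector can be sketched as follows. This is a minimal sketch, not the implementation used in our system: the function name `build_learner_vector`, the input layout (a list of phase labels and a dictionary of social features), and the sample values are all assumptions made for the example.

```python
from collections import Counter

# Fixed feature order, matching the vector A_i defined above.
PHASES = ("TE", "Ex", "Int", "Res")
SOCIAL = ("Add_post", "Discussion_view", "Thread_count",
          "Nbr_peer_interaction", "Nbr_vote", "Time_spent")


def build_learner_vector(message_phases, social_features):
    """Build the 10-dimensional vector of one learner (hypothetical helper).

    message_phases: one cognitive presence label per message of the learner.
    social_features: dict holding the six social interaction features.
    """
    phase_counts = Counter(message_phases)
    unknown = set(phase_counts) - set(PHASES)
    if unknown:
        raise ValueError(f"unknown cognitive presence phase(s): {unknown}")
    # Counter returns 0 for phases in which the learner posted no message.
    cognitive_part = [phase_counts[phase] for phase in PHASES]
    social_part = [social_features[name] for name in SOCIAL]
    return cognitive_part + social_part


# Made-up learner: four categorized messages plus forum traces.
example_vector = build_learner_vector(
    ["TE", "Ex", "Ex", "Int"],
    {"Add_post": 4, "Discussion_view": 12, "Thread_count": 1,
     "Nbr_peer_interaction": 3, "Nbr_vote": 2, "Time_spent": 95.0},
)
print(example_vector)
```

Keeping the feature order in two constants ensures that every learner vector has the same layout before it is passed to the classifier.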

2.2 SVM-Based Classifier

This phase relies on the use of SVM as a Machine Learning algorithm for classifying learners as per the four levels of cognitive engagement.

SVM was originally designed for binary classification. Several studies have nevertheless addressed multi-class classification, either by combining binary classifiers or by considering all classes at once (Mayoraz and Alpaydin 1999) (Hsu and Lin, n.d.). There are two essential approaches, namely “one-vs-one” (OVO) and “one-vs-all” (OVA). OVO defines a specific classifier for each pair of classes, so if we have k classes, the OVO method constructs k(k − 1)/2 classifiers. OVA constructs, for each class, a classifier that separates its points from all the others, so if we have k classes, the OVA approach constructs k classifiers.
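The two decompositions can be sketched with scikit-learn, assuming that library is available. The data below are synthetic and the linear kernel is an assumption chosen only for illustration; neither reflects our corpus or our actual configuration. With k = 4 classes, OVO should fit 4 · 3 / 2 = 6 binary SVMs and OVA should fit 4.

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

K = 4  # passive, active, constructive, interactive

# Synthetic stand-in for learner vectors (10 features, 4 classes).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=6, n_redundant=0,
    n_classes=K, random_state=0,
)

# OVO: one binary SVM per pair of classes.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
# OVA (called one-vs-rest in scikit-learn): one binary SVM per class.
ova = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

n_ovo = len(ovo.estimators_)
n_ova = len(ova.estimators_)
print(n_ovo, n_ova)  # 6 4
```

The explicit wrappers are used here because they expose the individual binary classifiers through `estimators_`, which makes the classifier counts directly observable.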

3 Test and Results

3.1 Data Set Description

To test our system, we used data from discussion forum samples of different courses in software engineering. To classify learners, we constructed a database with all the calculated features, which two experts then coded according to the four levels of cognitive engagement. The inter-rater agreement was good (percent agreement = 87%). Our data set is balanced.
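Percent agreement is the share of learners to whom both experts assigned the same level. A minimal sketch follows; the function name and the two label lists are made up for the example and are not our coding data.

```python
def percent_agreement(rater_a, rater_b):
    """Fraction of items on which two raters assigned the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item list")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical codes for eight learners (1: passive ... 4: interactive).
expert_1 = [1, 1, 2, 2, 3, 3, 4, 4]
expert_2 = [1, 1, 2, 3, 3, 3, 4, 4]
agreement = percent_agreement(expert_1, expert_2)
print(f"{agreement:.1%}")  # 87.5%
```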

3.2 Training and Testing Phases

After constructing the learner vectors and coding them according to the four levels, we train our SVM classifier. We then test it and compare in Table 1 the results (classification accuracy, Cohen’s kappa, recall, precision, and F1 score) for the two approaches, OVO and OVA.
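For reference, the reported measures can be computed from the true and predicted levels as sketched below. The helper names and the sample labels are assumptions for the example, and macro averaging over the four classes is also an assumption, since the averaging scheme is not specified here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of learners whose predicted level equals the coded level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined when p_e equals 1 (all labels in a single class).
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Expected chance agreement from the two marginal distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def macro_precision_recall_f1(y_true, y_pred):
    """Unweighted mean of per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    k = len(labels)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


# Made-up labels (1: passive, 2: active, 3: constructive, 4: interactive).
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 3, 3, 3, 4, 4]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
precision, recall, f1 = macro_precision_recall_f1(y_true, y_pred)
print(acc, round(kappa, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

On this toy example, one misclassified learner out of eight gives an accuracy of 0.875 and a kappa of about 0.833, which shows how kappa discounts the agreement expected by chance.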

Table 1. Accuracy results for OVA and OVO approaches

From the obtained results we can see that the best choice in our context is the OVA approach. To better observe the obtained results, we have detailed the normalized confusion matrix (Fig. 1).

Fig. 1. Normalized confusion matrix; 1: passive, 2: active, 3: constructive, 4: interactive
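A normalized confusion matrix of this kind can be obtained as sketched below. We assume row normalization here, so that each row sums to 1 and the diagonal holds the per-class recall; the function name and the labels are illustrative and do not reproduce the values of Fig. 1.

```python
def normalized_confusion_matrix(y_true, y_pred, labels):
    """Row-normalized confusion matrix: rows are true levels, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        # A class with no true samples keeps an all-zero row.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Made-up labels (1: passive, 2: active, 3: constructive, 4: interactive).
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 3, 3, 3, 4, 4]
matrix = normalized_confusion_matrix(y_true, y_pred, labels=[1, 2, 3, 4])
for row in matrix:
    print(row)
```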

4 General Conclusion

This work sought to explore learners’ cognitive engagement within online discussion forums. Such forums represent a socio-constructivist environment that encourages higher-order thinking behaviors. Indeed, this type of asynchronous communication fosters behaviors such as interacting socially and adding new knowledge constructively. According to the literature, there are four levels of cognitive engagement: passive, active, constructive, and interactive.

Our research presents a new automated system that predicts learners’ cognitive engagement by examining their social interactions in online discussion forums using Text Mining and Machine Learning techniques. These two approaches offer useful methods for preprocessing text, analyzing data, and discovering knowledge.

Based on a corpus of posts extracted from learners’ participation in software engineering courses offered through an online learning platform, we determine whether the learner is a passive, active, constructive, or interactive participant. The results are promising, with a classification accuracy of 0.9 and a Cohen’s kappa of 0.89, which shows that the proposed system is effective, with almost perfect agreement.

Nevertheless, like any other research, this work has limitations. Our approach focuses only on learners’ posts in online discussion forums to predict their level of cognitive engagement. Yet some learners may be highly engaged with the course materials even if they never display a good level of cognitive engagement in the discussion forum.

As a perspective, our system can serve as an input for recommender systems: the reported results can be used to recommend new resources to learners according to their level of engagement.