Abstract
Question Difficulty Estimation from Text (QDET) has received increased research interest in recent years, but most previous work focused on single silos, without performing quantitative comparisons between different models or across datasets from different educational domains. To fill this gap, we quantitatively analyze several approaches proposed in previous research and compare their performance on two publicly available datasets. Specifically, we consider reading comprehension Multiple Choice Questions (MCQs) and maths questions. We find that Transformer-based models are the best performing in both educational domains; models based on linguistic features perform well on reading comprehension questions, while frequency-based features and word embeddings perform better in domain knowledge assessment.
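The sketch below illustrates, under stated assumptions, the kind of Transformer-based approach referred to in the abstract: fine-tuning a pre-trained encoder with a single regression head that maps question text to a scalar difficulty value. It is not the authors' exact pipeline; the checkpoint name, hyperparameters, and data format are assumptions chosen for demonstration.

```python
# Minimal sketch: Transformer fine-tuning for difficulty regression.
# Assumptions: a DistilBERT checkpoint, (text, difficulty) pairs, MSE loss.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "distilbert-base-uncased"  # assumed checkpoint, not from the paper


class QuestionDataset(Dataset):
    """Pairs of (question text, scalar difficulty), e.g. an IRT-style value."""

    def __init__(self, texts, difficulties, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(difficulties, dtype=torch.float)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.enc.items()}
        item["labels"] = self.labels[idx]
        return item


def train(texts, difficulties, epochs=3, lr=2e-5, batch_size=16):
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    # num_labels=1 with float labels gives a regression head trained with MSE.
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=1, problem_type="regression")
    loader = DataLoader(QuestionDataset(texts, difficulties, tokenizer),
                        batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = model(**batch).loss  # MSE between predicted and true difficulty
            loss.backward()
            optimizer.step()
    return tokenizer, model
```

A feature-based baseline of the kind also compared in the paper (linguistic or frequency-based features fed to a classical regressor) would replace the encoder with hand-crafted features, trading accuracy for interpretability and lower compute.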
This paper reports on research supported by Cambridge University Press & Assessment. We thank Dr Andrew Caines for his feedback on the manuscript.