
Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights

Short Paper · Published: 28 June 2024 · DOI: 10.1145/3631700.3665233

Abstract

Integrating Artificial Intelligence (AI) into educational settings has brought new learning approaches, transforming the practices of both students and educators. Among the various technologies driving this transformation, Large Language Models (LLMs) have emerged as powerful tools for creating educational materials and answering questions, but there is still room for new applications. Educators commonly use Multiple-Choice Questions (MCQs) to assess student knowledge, but manually generating these questions is resource-intensive, requiring significant time and cognitive effort. In our opinion, LLMs offer a promising solution to these challenges. This paper presents a novel comparative analysis of three widely known LLMs - Llama 2, Mistral, and GPT-3.5 - to explore their potential for creating informative and challenging MCQs. In our approach, we do not rely on the internal knowledge of the LLM; instead, we inject the knowledge into the prompt to counter hallucinations, which also gives educators control over the source text of the test. Our experiment involving 21 educators shows that GPT-3.5 generates the most effective MCQs across several known metrics. It also shows that there is still some reluctance to adopt AI in the educational field. This study sheds light on the potential of LLMs to generate MCQs and improve the educational experience, providing valuable insights for the future.
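
To make the knowledge-injection step concrete, the sketch below embeds the source passage directly in the prompt, so the model writes questions grounded in that text rather than in its parametric knowledge. This is a minimal illustration in Python against the OpenAI chat API, not the authors' actual pipeline: the prompt wording, the temperature value, and the output format are assumptions.

# Minimal sketch of prompt-based knowledge injection for MCQ generation.
# Illustrative only: prompt wording, temperature, and output format are
# assumptions, not the authors' published configuration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_mcqs(source_text: str, n_questions: int = 3) -> str:
    # The source text is injected into the prompt, and the model is told
    # to use ONLY that text -- the anti-hallucination idea described above.
    prompt = (
        "You are an educator writing assessment items.\n"
        f"Using ONLY the source text below, write {n_questions} multiple-choice "
        "questions, each with four options (A-D) and exactly one correct answer. "
        "Mark the correct option and do not use any outside knowledge.\n\n"
        f"Source text:\n{source_text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # one of the three models compared in the paper
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,  # assumed value; the paper's setting is not given here
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    passage = "Photosynthesis converts light energy into chemical energy..."
    print(generate_mcqs(passage))

Because the injected passage travels inside the prompt, the same template can be sent unchanged to Llama 2 or Mistral through any chat-completion endpoint, which is what makes a like-for-like comparison of the three models possible.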



Index Terms

  1. Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

Published In
UMAP Adjunct '24: Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization
June 2024 · 662 pages · ISBN: 9798400704666 · DOI: 10.1145/3631700

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. Generative AI
      2. LLMs
      3. Multiple Choice Question

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Conference

      UMAP '24

      Acceptance Rates

      Overall Acceptance Rate 162 of 633 submissions, 26%


Article Metrics

• 0 Total Citations
• 295 Total Downloads
• Downloads (last 12 months): 295
• Downloads (last 6 weeks): 68

Reflects downloads up to 18 Nov 2024
