Abstract
Automatic Short Answer Grading (ASAG) is a thriving domain of natural language understanding within learning analytics research. ASAG solutions are designed to alleviate the workload of teachers and instructors. While research in ASAG continues to advance through the application of deep learning, it faces limitations such as the need for extensive datasets and high computational costs. Our focus is on creating a machine-learning solution for ASAG that optimizes performance with small datasets and minimal computational demands. In this study, an ASAG framework named the Intelligent Descriptive answer E-Assessment System (IDEAS) is proposed. It adopts a model-answer-based approach that uses eight similarity metrics to compare the model answer with student answers. These similarities are derived by combining statistical and deep learning approaches. The framework differs significantly from prior work in two respects: (i) the ASAG problem is formulated as multiclass classification rather than regression or binary classification, eliminating the need for extra discriminators; and (ii) it aids evaluators in identifying inconsistencies in evaluation and provides comprehensive feedback. IDEAS is validated question-wise on several ASAG benchmark datasets, namely ASAP-SAS, SciEntsBank, STITA, and Texas (Mohler). These datasets are constrained in ways such as lacking grading criteria for mark allocation. To address this limitation, a novel dataset, IDEAS_ASAG_DATA, is collected and used to validate the framework. Results demonstrate an accuracy of 94% on a specific dataset question. The results show that IDEAS attains comparable, and in certain instances superior, performance compared to human evaluators. We argue that the proposed framework establishes a robust baseline for future advancements in the ASAG field.
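To make the abstract's pipeline concrete, the following is a minimal sketch of the general idea: similarity features between a model answer and each student answer feed a multiclass classifier that predicts a grade class. The abstract does not enumerate the eight metrics, so the two shown here (TF-IDF cosine similarity and token-level Jaccard overlap) are illustrative stand-ins, and the toy answers, labels, helper name `similarity_features`, and choice of `RandomForestClassifier` are hypothetical, not the authors' implementation.

```python
# Sketch: model-answer-based ASAG as multiclass classification.
# Assumptions: the two similarity metrics below stand in for the paper's
# eight; deep-learning similarities (e.g., embedding cosine) are omitted
# to keep the example self-contained.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

model_answer = "Photosynthesis converts light energy into chemical energy."
student_answers = [
    "Photosynthesis turns light into chemical energy.",  # good answer
    "Plants use light somehow.",                         # partial answer
    "It is about animals eating food.",                  # wrong answer
]
grades = np.array([2, 1, 0])  # multiclass grade labels, e.g., 0/1/2 marks

def similarity_features(model_ans: str, answer: str) -> list:
    """Compute a small vector of similarity metrics (illustrative subset)."""
    vec = TfidfVectorizer().fit([model_ans, answer])
    tfidf = vec.transform([model_ans, answer])
    cos = cosine_similarity(tfidf[0], tfidf[1])[0, 0]    # statistical metric
    a = set(model_ans.lower().split())
    b = set(answer.lower().split())
    jaccard = len(a & b) / len(a | b)                    # token overlap
    return [cos, jaccard]

X = np.array([similarity_features(model_answer, s) for s in student_answers])

# Grading as multiclass classification, so no extra discriminators are
# needed to separate grade levels:
clf = RandomForestClassifier(random_state=0).fit(X, grades)
print(clf.predict(X))  # predicted grade class per student answer
```

In such a design, adding further metrics only widens the feature vector; the classifier itself stays unchanged, which is what allows the approach to work with small datasets and modest compute.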
Data Availability
Data are provided only upon request.
Acknowledgements
The authors acknowledge the support from REVA University for the facilities provided to carry out the research.
Funding
No funding was received for this study.
Author information
Contributions
P. Sree Lakshmi: Conceptualization, Methodology, Visualization, Writing-original draft. Simha J. B.: Supervision, Validation. Rajeev Ranjan: Writing-review and editing.
Ethics declarations
Conflict of interest
All authors declare that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sree Lakshmi, P., Simha, J.B. & Ranjan, R. Empowering Educators: Automated Short Answer Grading with Inconsistency Check and Feedback Integration using Machine Learning. SN COMPUT. SCI. 5, 653 (2024). https://doi.org/10.1007/s42979-024-02954-7