Abstract
This study proposes a method for knowledge distillation (KD) of fine-tuned Large Language Models (LLMs) into smaller, more efficient, and accurate neural networks. We specifically target the challenge of deploying these models on resource-constrained devices. Our methodology trains the smaller student model (a neural network) on the prediction probabilities (soft labels) of the LLM, which serves as the teacher model. This is achieved through a specialized loss function tailored to learn from the LLM’s output probabilities, ensuring that the student model closely mimics the teacher’s performance. To validate the KD approach, we used a large dataset, 7T, containing 6,684 student-written responses to science questions, along with three mathematical reasoning datasets of student-written responses graded by human experts. We compared accuracy with a state-of-the-art (SOTA) distilled model, TinyBERT, and with artificial neural network (ANN) models. Results show that the KD approach achieves 3% and 2% higher scoring accuracy than the ANN and TinyBERT models, respectively, and accuracy comparable to the teacher model. Furthermore, the student model has only 0.03M parameters, 4,000 times fewer than the teacher model, and runs inference 10 times faster than TinyBERT. The significance of this research lies in its potential to make advanced AI technologies accessible in typical educational settings, particularly for automatic scoring.
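For readers who want a concrete picture of the training objective, the sketch below shows a minimal Hinton-style distillation loss in PyTorch. The function name, the temperature and alpha hyperparameters, and the weighted combination of soft-label KL divergence with hard-label cross-entropy are illustrative assumptions, not the paper’s exact loss formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, labels,
                      temperature=2.0, alpha=0.5):
    """Hypothetical KD objective: blend KL divergence against the teacher's
    soft labels with cross-entropy against human-assigned scores."""
    # Temperature-soften both distributions; the teacher's probabilities are
    # converted back to log space before rescaling.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(torch.log(teacher_probs + 1e-8) / temperature, dim=-1)

    # KL term, scaled by T^2 as in Hinton et al. (2015) to keep gradient
    # magnitudes comparable across temperatures.
    kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2

    # Hard-label term against the expert-assigned score categories.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1.0 - alpha) * ce
```

In such a setup, the teacher LLM’s prediction probabilities for each response would typically be precomputed once and passed as `teacher_probs`, so the teacher is not needed at student training time.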
References
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
ETS-MTS: Learning progression-based and NGSS-aligned formative assessment for using mathematical thinking in science, November 2023. http://ets-cls.org/mts/index.php/assessment/
Ghiassi, M., Olschimke, M., Moon, B., Arnaudo, P.: Automated text classification using a dynamic artificial neural network model. Expert Syst. Appl. 39(12), 10967–10976 (2012)
González-Calatayud, V., Prendes-Espinosa, P., Roig-Vila, R.: Artificial intelligence for student assessment: a systematic review. Appl. Sci. 11(12), 5467 (2021)
Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Holmes, W., Tuomi, I.: State of the art and practice in AI in education. Eur. J. Educ. 57(4), 542–570 (2022)
Jiao, X., et al.: TinyBERT: distilling BERT for natural language understanding. arXiv preprint arXiv:1909.10351 (2019)
Latif, E., Zhai, X.: Fine-tuning ChatGPT for automatic scoring. Comput. Educ. Artif. Intell. 100210 (2024)
Liu, Z., He, X., Liu, L., Liu, T., Zhai, X.: Context matters: a strategy to pre-train language model for science education. arXiv preprint arXiv:2301.12031 (2023)
Selwyn, N.: Should Robots Replace Teachers?: AI and The Future of Education. John Wiley & Sons, Hoboken (2019)
Zhai, X.: ChatGPT user experience: implications for education (2022). SSRN 4312418
Zhai, X., He, P., Krajcik, J.: Applying machine learning to automatically assess scientific models. J. Res. Sci. Teach. 59(10), 1765–1794 (2022)
Zhai, X., Yin, Y., Pellegrino, J.W., Haudek, K.C., Shi, L.: Applying machine learning in science assessment: a systematic review. Stud. Sci. Educ. 56(1), 111–151 (2020)
Guo, S., Zheng, Y., Zhai, X.: Artificial intelligence in education research during 2013–2023: a review based on bibliometric analysis. Educ. Inf. Technol. 1–23 (2024)
Acknowledgment
This work was funded by the National Science Foundation (NSF) (Award Nos. 2101104, 2138854) and partially supported by the NSF under grants DMS-1903226, DMS-1925066, DMS-2124493, DMS-2311297, DMS-2319279, DMS-2318809, and by the U.S. National Institutes of Health under grant R01GM152814.