Abstract
This study proposes a method for knowledge distillation (KD) of fine-tuned Large Language Models (LLMs) into smaller, more efficient, and accurate neural networks. We specifically target the challenge of deploying these models on resource-constrained devices. Our methodology trains the smaller student model (a neural network) on the prediction probabilities (soft labels) of the LLM, which serves as the teacher model. This is achieved through a specialized loss function tailored to learn from the LLM’s output probabilities, ensuring that the student model closely mimics the teacher’s performance. To validate the KD approach, we used a large dataset, 7T, containing 6,684 student-written responses to science questions, along with three mathematical reasoning datasets of student-written responses graded by human experts. We compared accuracy with a state-of-the-art (SOTA) distilled model, TinyBERT, and with artificial neural network (ANN) models. Results show that the KD approach achieves 3% and 2% higher scoring accuracy than the ANN and TinyBERT models, respectively, and accuracy comparable to the teacher model. Furthermore, the student model has only 0.03M parameters, 4,000 times fewer than the teacher model, and runs inference 10 times faster than TinyBERT. The significance of this research lies in its potential to make advanced AI technologies accessible in typical educational settings, particularly for automatic scoring.
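For readers who want a concrete picture of the training objective, the sketch below shows a minimal Hinton-style distillation loss in PyTorch. The function name, the temperature and alpha hyperparameters, and the weighted combination of soft-label KL divergence with hard-label cross-entropy are illustrative assumptions, not the paper’s exact loss formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, labels,
                      temperature=2.0, alpha=0.5):
    """Hypothetical KD objective: blend KL divergence against the teacher's
    soft labels with cross-entropy against human-assigned scores."""
    # Temperature-soften both distributions; the teacher's probabilities are
    # converted back to log space before rescaling.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(torch.log(teacher_probs + 1e-8) / temperature, dim=-1)

    # KL term, scaled by T^2 as in Hinton et al. (2015) to keep gradient
    # magnitudes comparable across temperatures.
    kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2

    # Hard-label term against the expert-assigned score categories.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1.0 - alpha) * ce
```

In such a setup, the teacher LLM’s prediction probabilities for each response would typically be precomputed once and passed as `teacher_probs`, so the teacher is not needed at student training time.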
References
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
ETS-MTS: Learning progression-based and NGSS-aligned formative assessment for using mathematical thinking in science, November 2023. http://ets-cls.org/mts/index.php/assessment/
Ghiassi, M., Olschimke, M., Moon, B., Arnaudo, P.: Automated text classification using a dynamic artificial neural network model. Expert Syst. Appl. 39(12), 10967–10976 (2012)
González-Calatayud, V., Prendes-Espinosa, P., Roig-Vila, R.: Artificial intelligence for student assessment: a systematic review. Appl. Sci. 11(12), 5467 (2021)
Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
Holmes, W., Tuomi, I.: State of the art and practice in AI in education. Eur. J. Educ. 57(4), 542–570 (2022)
Jiao, X., et al.: TinyBERT: distilling BERT for natural language understanding. arXiv preprint arXiv:1909.10351 (2019)
Latif, E., Zhai, X.: Fine-tuning ChatGPT for automatic scoring. Comput. Educ. Artif. Intell. 100210 (2024)
Liu, Z., He, X., Liu, L., Liu, T., Zhai, X.: Context matters: a strategy to pre-train language model for science education. arXiv preprint arXiv:2301.12031 (2023)
Selwyn, N.: Should Robots Replace Teachers?: AI and The Future of Education. John Wiley & Sons, Hoboken (2019)
Zhai, X.: ChatGPT user experience: implications for education (2022). SSRN 4312418
Zhai, X., He, P., Krajcik, J.: Applying machine learning to automatically assess scientific models. J. Res. Sci. Teach. 59(10), 1765–1794 (2022)
Zhai, X., Yin, Y., Pellegrino, J.W., Haudek, K.C., Shi, L.: Applying machine learning in science assessment: a systematic review. Stud. Sci. Educ. 56(1), 111–151 (2020)
Guo, S., Zheng, Y., Zhai, X.: Artificial intelligence in education research during 2013–2023: a review based on bibliometric analysis. Educ. Inf. Technol. 1–23 (2024)
Acknowledgment
This work was funded by the National Science Foundation (NSF) (Award Nos. 2101104, 2138854) and partially supported by the NSF under grants DMS-1903226, DMS-1925066, DMS-2124493, DMS-2311297, DMS-2319279, DMS-2318809, and by the U.S. National Institutes of Health under grant R01GM152814.