Knowledge Distillation of LLMs for Automatic Scoring of Science Assessments

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2151)

Included in the following conference series: International Conference on Artificial Intelligence in Education (AIED)

Abstract

This study proposes a method for knowledge distillation (KD) of fine-tuned Large Language Models (LLMs) into smaller, more efficient, and accurate neural networks. We specifically target the challenge of deploying these models on resource-constrained devices. Our methodology involves training a smaller student model (a neural network) on the prediction probabilities (soft labels) of the LLM, which serves as the teacher model. This is achieved through a specialized loss function tailored to learn from the LLM's output probabilities, ensuring that the student model closely mimics the teacher's performance. To validate the KD approach, we used a large dataset, 7T, containing 6,684 student-written responses to science questions, as well as three mathematical reasoning datasets with student-written responses graded by human experts. We compared accuracy against state-of-the-art (SOTA) distilled models, TinyBERT, and artificial neural network (ANN) models. Results show that the KD approach achieves 3% and 2% higher scoring accuracy than ANN and TinyBERT, respectively, and accuracy comparable to the teacher model. Furthermore, the student model has only 0.03M parameters, about 4,000 times fewer than the teacher model, and runs inference roughly 10 times faster than TinyBERT. The significance of this research lies in its potential to make advanced AI technologies accessible in typical educational settings, particularly for automatic scoring.
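As context for the specialized loss function described in the abstract, the sketch below shows one standard way a soft-label distillation objective is often implemented in PyTorch. It is a minimal illustration under stated assumptions, not the authors' released code: the function name, the temperature, and the weighting factor alpha are hypothetical, and the teacher LLM is assumed to expose per-class probabilities rather than logits.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_probs, hard_labels,
                          temperature=2.0, alpha=0.5):
        # Re-temper the teacher's probabilities (only probabilities, not
        # logits, are assumed available) and soften the student's logits
        # with the same temperature.
        soft_teacher = F.softmax(
            torch.log(teacher_probs.clamp_min(1e-12)) / temperature, dim=-1)
        soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        # KL term is scaled by T^2 so its gradients stay comparable to the
        # cross-entropy term (Hinton et al., 2015).
        kd_term = F.kl_div(soft_student, soft_teacher,
                           reduction="batchmean") * temperature ** 2
        # Ordinary cross-entropy against the human-assigned score labels.
        ce_term = F.cross_entropy(student_logits, hard_labels)
        return alpha * kd_term + (1.0 - alpha) * ce_term

In a setup like this, alpha and temperature would be tuned on held-out responses: alpha = 1 reduces the objective to pure soft-label matching, while alpha = 0 recovers ordinary supervised training on the human scores.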


References

  1. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  2. ETS-MTS, T.: Learning progression-based and NGSS-aligned formative assessment for using mathematical thinking in science, November 2023. http://ets-cls.org/mts/index.php/assessment/

  3. Ghiassi, M., Olschimke, M., Moon, B., Arnaudo, P.: Automated text classification using a dynamic artificial neural network model. Expert Syst. Appl. 39(12), 10967–10976 (2012)


  4. González-Calatayud, V., Prendes-Espinosa, P., Roig-Vila, R.: Artificial intelligence for student assessment: a systematic review. Appl. Sci. 11(12), 5467 (2021)


  5. Hinton, G., Vinyals, O., Dean, J., et al.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)

  6. Holmes, W., Tuomi, I.: State of the art and practice in AI in education. Eur. J. Educ. 57(4), 542–570 (2022)


  7. Jiao, X., et al.: TinyBERT: distilling BERT for natural language understanding. arXiv preprint arXiv:1909.10351 (2019)

  8. Latif, E., Zhai, X.: Fine-tuning ChatGPT for automatic scoring. Comput. Educ. Artif. Intell. 100210 (2024)


  9. Liu, Z., He, X., Liu, L., Liu, T., Zhai, X.: Context matters: a strategy to pre-train language model for science education. arXiv preprint arXiv:2301.12031 (2023)

  10. Selwyn, N.: Should Robots Replace Teachers?: AI and the Future of Education. John Wiley & Sons, Hoboken (2019)


  11. Zhai, X.: ChatGPT user experience: implications for education (2022). SSRN 4312418


  12. Zhai, X., He, P., Krajcik, J.: Applying machine learning to automatically assess scientific models. J. Res. Sci. Teach. 59(10), 1765–1794 (2022)


  13. Zhai, X., Yin, Y., Pellegrino, J.W., Haudek, K.C., Shi, L.: Applying machine learning in science assessment: a systematic review. Stud. Sci. Educ. 56(1), 111–151 (2020)


  14. Guo, S., Zheng, Y., Zhai, X.: Artificial intelligence in education research during 2013–2023: a review based on bibliometric analysis. Educ. Inf. Technol. 1–23 (2024)



Acknowledgment

This work was funded by the National Science Foundation (NSF) (Award Nos. 2101104, 2138854) and partially supported by the NSF under grants DMS-1903226, DMS-1925066, DMS-2124493, DMS-2311297, DMS-2319279, DMS-2318809, and by the U.S. National Institutes of Health under grant R01GM152814.

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Ping Ma or Xiaoming Zhai.

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Latif, E., Fang, L., Ma, P., Zhai, X. (2024). Knowledge Distillation of LLMs for Automatic Scoring of Science Assessments. In: Olney, A.M., Chounta, IA., Liu, Z., Santos, O.C., Bittencourt, I.I. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2024. Communications in Computer and Information Science, vol 2151. Springer, Cham. https://doi.org/10.1007/978-3-031-64312-5_20


  • DOI: https://doi.org/10.1007/978-3-031-64312-5_20

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-64311-8

  • Online ISBN: 978-3-031-64312-5

  • eBook Packages: Computer Science, Computer Science (R0)
