Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison

  • Conference paper
  • First Online:
Artificial Intelligence Applications and Innovations (AIAI 2024)

Abstract

This paper addresses the application of neural networks on resource-constrained edge devices. The goal is to achieve a speedup in both inference and training time with minimal accuracy loss. More specifically, it highlights the need to compress current models, which are mostly developed with access to more resources than the devices on which they will eventually run. With the recent advances in the Internet of Things (IoT), the number of such devices has risen and is expected to keep rising. Not only are these devices computationally limited, but their capabilities are neither homogeneous nor predictable at the time a model is developed, as new devices can be added at any time. This creates the need to quickly and efficiently produce models that fit each device's specifications. Transfer learning is very efficient in terms of training time, but it confines the user to the dimensionality of the pretrained model. Pruning is used as a way to overcome this obstacle and carry knowledge over to a variety of models that differ in size. The aim of this paper is to serve as an introduction to pruning as a concept and as a template for further research, to quantify the efficiency of a variety of methods, and to expose some of their limitations. Pruning was performed on a telecommunications anomaly dataset, and the results were compared to a baseline in terms of speed and accuracy.
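
Although only the abstract is available in this preview, the pruning technique it describes can be sketched in code. Below is a minimal, hedged example of global magnitude (L1) pruning using PyTorch's torch.nn.utils.prune API; the toy architecture, the 50% sparsity target, and the restriction to linear layers are illustrative assumptions and do not correspond to the models, dataset, or pruning schedule actually evaluated in the paper.

```python
# Minimal sketch of global magnitude (L1) pruning with PyTorch.
# The network below is a hypothetical stand-in for the paper's
# anomaly-detection model; architecture and sparsity are assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),
)

# Collect (module, parameter_name) pairs for every linear layer.
parameters_to_prune = [
    (m, "weight") for m in model.modules() if isinstance(m, nn.Linear)
]

# Zero out the 50% of weights with the smallest absolute value,
# ranked globally across all selected layers.
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent by removing the masks and
# re-parametrization, leaving plain (sparse) weight tensors.
for module, name in parameters_to_prune:
    prune.remove(module, name)

# Report the achieved global sparsity.
total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
print(f"Global sparsity: {zeros / total:.1%}")
```

A common workflow, of the kind the abstract alludes to, is to start from a pretrained network, prune it down to the target device's budget, and then fine-tune briefly to recover accuracy. Note that unstructured pruning as shown here mainly reduces parameter count; structured (filter or neuron) pruning is typically needed to realise wall-clock inference speedups on dense hardware.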

Acknowledgments

This work was partially supported by the "Trustworthy And Resilient Decentralised Intelligence For Edge Systems (TaRDIS)" project, funded by the EU HORIZON EUROPE programme under grant agreement No. 101093006.

Author information

Corresponding author

Correspondence to Ilias Paralikas.

Copyright information

© 2024 IFIP International Federation for Information Processing

About this paper

Cite this paper

Paralikas, I., Spantideas, S., Giannopoulos, A., Trakadas, P. (2024). Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 713. Springer, Cham. https://doi.org/10.1007/978-3-031-63219-8_19

  • DOI: https://doi.org/10.1007/978-3-031-63219-8_19

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-63218-1

  • Online ISBN: 978-3-031-63219-8

  • eBook Packages: Computer Science, Computer Science (R0)
