Abstract
This paper addresses the application of neural networks on resource-constrained edge devices. The goal is to achieve a speedup in both inference and training time with minimal accuracy loss. More specifically, it highlights the need to compress current models, which are mostly developed with access to more resources than the device on which the model will eventually run. With the recent advances of the Internet of Things (IoT), the number of devices has risen and is expected to keep rising. Not only are these devices computationally limited, but their capabilities are neither homogeneous nor predictable at the time a model is developed, since new devices can be added at any time. This creates the need to quickly and efficiently produce models that fit each device's specifications. Transfer learning is very efficient in terms of training time, but it confines the user to the dimensionality of the pretrained model. Pruning is used as a way to overcome this obstacle and carry knowledge over to a variety of models that differ in size. This paper aims to serve as an introduction to pruning as a concept and as a template for further research, to quantify the efficiency of a variety of methods, and to expose some of their limitations. Pruning was performed on a telecommunications anomaly dataset, and the results were compared to a baseline with regard to speed and accuracy.
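The paper's own pipeline and pruning methods are not reproduced on this page. As a minimal sketch of the general idea, the snippet below applies global magnitude (L1) pruning with PyTorch's torch.nn.utils.prune; the layer sizes and the 50% sparsity level are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of global magnitude pruning (illustrative, not the paper's setup).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small placeholder MLP; dimensions are hypothetical.
model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 32), nn.ReLU(),
    nn.Linear(32, 2),
)

# Zero out the 50% of weights with the smallest L1 magnitude,
# ranked jointly across all Linear layers.
parameters_to_prune = [
    (m, "weight") for m in model.modules() if isinstance(m, nn.Linear)
]
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent by folding the mask into the weight tensors.
for module, name in parameters_to_prune:
    prune.remove(module, name)

# Report the resulting global sparsity.
zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
print(f"Global sparsity: {100.0 * zeros / total:.1f}%")
```

Unstructured pruning like this only zeroes individual weights; structured variants that remove whole neurons or filters change the model's actual dimensions, which is what allows a pruned model to be sized to a target device.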
Acknowledgments
This work was partially supported by the "Trustworthy And Resilient Decentralised Intelligence For Edge Systems (TaRDIS)" project, funded by the EU Horizon Europe programme under grant agreement No. 101093006.
Copyright information
© 2024 IFIP International Federation for Information Processing
Cite this paper
Paralikas, I., Spantideas, S., Giannopoulos, A., Trakadas, P. (2024). Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 713. Springer, Cham. https://doi.org/10.1007/978-3-031-63219-8_19