Abstract
This paper addresses the application of neural networks on resource-constrained edge devices. The goal is to achieve a speedup in both inference and training time with minimal accuracy loss. More specifically, it highlights the need to compress current models, which are mostly developed with access to more resources than the device on which the model will eventually run. With the recent advances of the Internet of Things (IoT), the number of devices has risen and is expected to keep rising. Not only are these devices computationally limited, but their capabilities are neither homogeneous nor predictable at the time a model is developed, since new devices can be added at any time. This creates the need to quickly and efficiently produce models that fit each device's specifications. Transfer learning is very efficient in terms of training time, but it confines the user to the dimensionality of the pretrained model. Pruning is used as a way to overcome this obstacle and carry knowledge over to a variety of models that differ in size. This paper aims to serve as an introduction to pruning as a concept and as a template for further research, to quantify the efficiency of a variety of methods, and to expose some of their limitations. Pruning was performed on a telecommunications anomaly dataset, and the results were compared to a baseline with regard to speed and accuracy.
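The paper's own pipeline and pruning methods are not reproduced on this page. As a minimal sketch of the general idea, the snippet below applies global magnitude (L1) pruning with PyTorch's torch.nn.utils.prune; the layer sizes and the 50% sparsity level are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of global magnitude pruning (illustrative, not the paper's setup).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small placeholder MLP; dimensions are hypothetical.
model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 32), nn.ReLU(),
    nn.Linear(32, 2),
)

# Zero out the 50% of weights with the smallest L1 magnitude,
# ranked jointly across all Linear layers.
parameters_to_prune = [
    (m, "weight") for m in model.modules() if isinstance(m, nn.Linear)
]
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Make the pruning permanent by folding the mask into the weight tensors.
for module, name in parameters_to_prune:
    prune.remove(module, name)

# Report the resulting global sparsity.
zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
total = sum(m.weight.nelement() for m, _ in parameters_to_prune)
print(f"Global sparsity: {100.0 * zeros / total:.1f}%")
```

Unstructured pruning like this only zeroes individual weights; structured variants that remove whole neurons or filters change the model's actual dimensions, which is what allows a pruned model to be sized to a target device.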
Acknowledgments
This work was partially supported by the "Trustworthy And Resilient Decentralised Intelligence For Edge Systems (TaRDIS)" project, funded by the EU Horizon Europe programme under grant agreement No. 101093006.
Copyright information
© 2024 IFIP International Federation for Information Processing
Cite this paper
Paralikas, I., Spantideas, S., Giannopoulos, A., Trakadas, P. (2024). Lightweight Inference by Neural Network Pruning: Accuracy, Time and Comparison. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 713. Springer, Cham. https://doi.org/10.1007/978-3-031-63219-8_19