Abstract
Convolutional Neural Networks (CNNs) have become a de facto standard for image and video recognition. However, current software and hardware implementations of convolutional operations still struggle to meet tight energy budgets, given the data-intensive processing behavior of CNNs. This paper proposes a software-based memoization technique that skips entire convolution calculations. We demonstrate that, by grouping output values into proximity-based clusters, the memory required to store the memoization tables can be reduced by hundreds of times. We also present a table-mapping scheme that indexes the input set of each convolutional layer to its output value. Our experimental results show that, for the YOLOv3-tiny CNN, our technique achieves a speedup of up to 3.5× while reducing energy consumption to 22% of the baseline, at the cost of a 7.4% accuracy loss.
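To make the idea concrete, the sketch below shows one possible way to memoize a single convolution output in C, the language of the Darknet framework on which YOLOv3-tiny runs. Input patches are quantized so that nearby inputs map to the same table entry; on a hit, the stored output is returned and all multiply-accumulate operations are skipped. This is a minimal illustration under stated assumptions, not the paper's actual implementation: all names (conv_pixel_memo, TABLE_SIZE, the quantization step) are hypothetical, and the cached value is simply the first output computed for a bucket, standing in for the paper's proximity-based cluster centroid.

#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 4096   /* number of memoization buckets (illustrative) */
#define PATCH_LEN  9      /* 3x3 receptive field, one channel (toy size)  */

/* One memoization entry: a quantized input patch and the output
 * value previously computed for it. */
typedef struct {
    int8_t key[PATCH_LEN];
    float  centroid;
    int    valid;
} memo_entry_t;

static memo_entry_t table[TABLE_SIZE];

/* Quantize an activation to 8 bits so that nearby inputs share a key;
 * the step size (here 1/16) sets the hit-rate/accuracy trade-off.
 * Inputs are assumed to lie roughly in [-8, 8). */
static int8_t quantize(float x) { return (int8_t)(x * 16.0f); }

/* FNV-1a hash over the quantized patch, used as the table index. */
static uint32_t hash_patch(const int8_t *p)
{
    uint32_t h = 2166136261u;
    for (int i = 0; i < PATCH_LEN; i++) {
        h ^= (uint8_t)p[i];
        h *= 16777619u;
    }
    return h % TABLE_SIZE;
}

/* Memoized computation of one output pixel: on a table hit, the
 * whole dot product is skipped and the stored value is returned. */
float conv_pixel_memo(const float *in, const float *w)
{
    int8_t q[PATCH_LEN];
    for (int i = 0; i < PATCH_LEN; i++)
        q[i] = quantize(in[i]);

    memo_entry_t *e = &table[hash_patch(q)];
    if (e->valid && memcmp(e->key, q, PATCH_LEN) == 0)
        return e->centroid;              /* hit: no multiply-accumulates */

    float acc = 0.0f;                    /* miss: compute, then cache    */
    for (int i = 0; i < PATCH_LEN; i++)
        acc += in[i] * w[i];

    memcpy(e->key, q, PATCH_LEN);
    e->centroid = acc;
    e->valid = 1;
    return acc;
}

In a sketch like this, TABLE_SIZE and the quantization step jointly control how often the table hits (and therefore how many convolutions are skipped) versus how much output precision is lost, which is the same speedup/energy/accuracy trade-off the abstract quantifies.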
Acknowledgements
This work was supported by CAPES, CNPq, and FAPERGS.