
DOI: 10.1007/978-3-031-71162-6_18
Article

Certified Quantization Strategy Synthesis for Neural Networks

Published: 11 September 2024

Abstract

Quantization plays an important role in deploying neural networks on embedded, real-time systems with limited computing and storage resources (e.g., edge devices). It significantly reduces model storage cost and improves inference efficiency by using fewer bits to represent the parameters. However, it was recently shown that quantization can break critical properties such as robustness and backdoor-freeness. In this work, we introduce the first method for synthesizing quantization strategies that verifiably preserve desired properties after quantization, leveraging the key insight that quantization induces a data distribution shift in each layer. We propose to compute a preimage for each layer and to quantize the preceding layer so that its quantized reachable region remains within that preimage. Because computing the exact preimage is intractable in general, we propose an MILP-based method to compute an under-approximation of it. We implement our method in a tool, Quadapter, and demonstrate its effectiveness and efficiency by providing certified quantizations that preserve model robustness and backdoor-freeness.
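To make the layer-wise containment idea concrete, here is a minimal, self-contained sketch (not the authors' Quadapter implementation): it applies standard symmetric uniform quantization to a layer's weights, propagates an input box through the quantized affine layer with interval arithmetic, and checks whether the resulting reachable box stays inside a hypothetical preimage box for the next layer. The quantization scheme, the interval propagation, and the fixed preimage bounds are all illustrative assumptions; the paper computes the preimage under-approximation with an MILP, which this toy check does not reproduce.

```python
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    # Symmetric uniform quantization to `bits` bits, then dequantization,
    # yielding the real-valued weights the quantized layer effectively uses.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale) * scale

def reachable_box(w, b, lo, hi):
    # Interval propagation through the affine map y = w @ x + b
    # for inputs x in the elementwise box [lo, hi].
    w_pos, w_neg = np.maximum(w, 0.0), np.minimum(w, 0.0)
    return w_pos @ lo + w_neg @ hi + b, w_pos @ hi + w_neg @ lo + b

rng = np.random.default_rng(0)
w, b = rng.normal(size=(4, 3)), np.zeros(4)
lo, hi = -np.ones(3), np.ones(3)  # input region of interest

y_lo, y_hi = reachable_box(quantize_uniform(w, bits=8), b, lo, hi)

# Hypothetical preimage box for the next layer (a placeholder; the paper
# under-approximates it via MILP). The quantization of this layer is
# acceptable only if its quantized reachable box is contained in it.
p_lo, p_hi = -10.0 * np.ones(4), 10.0 * np.ones(4)
print(bool(np.all(p_lo <= y_lo) and np.all(y_hi <= p_hi)))
```

In this sketch, containment failing would signal that the chosen bit-width shifts the layer's output distribution too far, so a finer quantization (more bits) would be tried for that layer.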



Published In

Formal Methods: 26th International Symposium, FM 2024, Milan, Italy, September 9–13, 2024, Proceedings, Part I
Sep 2024
691 pages
ISBN: 978-3-031-71161-9
DOI: 10.1007/978-3-031-71162-6
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Publisher

Springer-Verlag

Berlin, Heidelberg

