
DOI: 10.1007/978-3-031-65112-0_8
Article

Provable Repair of Vision Transformers

Published: 22 July 2024

Abstract

Vision Transformers have emerged as state-of-the-art image recognition tools, but may still exhibit incorrect behavior. Incorrect image recognition can have disastrous consequences in safety-critical real-world applications such as self-driving automobiles. In this paper, we present Provable Repair of Vision Transformers (PRoViT), a provable repair approach that guarantees the correct classification of images in a repair set for a given Vision Transformer without modifying its architecture. PRoViT avoids negatively affecting correctly classified images (drawdown) by minimizing the changes made to the Vision Transformer’s parameters and original output. We observe that for Vision Transformers, unlike for other architectures such as ResNet or VGG, editing just the parameters in the last layer achieves correctness guarantees and very low drawdown. We introduce a novel method for editing these last-layer parameters that enables PRoViT to efficiently repair state-of-the-art Vision Transformers for thousands of images, far exceeding the capabilities of prior provable repair approaches.
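The abstract's key technical observation is that editing only the final (linear) classification layer is enough to guarantee correct labels on the repair set. Because that head is affine in the penultimate-layer features, "image i must receive label y_i" becomes a linear inequality over the weight changes, so the repair can be solved exactly rather than heuristically. The following is a minimal, hypothetical sketch of that idea as a linear program: the toy dimensions, synthetic "features", margin, and L1 objective are illustrative assumptions and do not reproduce PRoViT's actual formulation, which per the abstract also constrains the original outputs to limit drawdown.

# Hypothetical sketch: last-layer repair posed as a linear program (not PRoViT's exact method).
# The synthetic matrix F stands in for penultimate-layer features of the repair images.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
d, k, n = 8, 4, 5                  # feature dim, #classes, #repair images (toy sizes)
W = rng.normal(size=(k, d))        # original last-layer weights (bias folded in for brevity)
F = rng.normal(size=(n, d))        # penultimate-layer features of the repair images
y = rng.integers(0, k, size=n)     # correct labels for the repair set
margin = 0.1

# Variables: delta (k*d weight changes) and t (k*d slacks with |delta_j| <= t_j).
# Objective: minimize sum(t), i.e. the L1 norm of the weight change.
num = k * d
c = np.concatenate([np.zeros(num), np.ones(num)])

A_ub, b_ub = [], []
# Encode |delta_j| <= t_j as  delta_j - t_j <= 0  and  -delta_j - t_j <= 0.
for j in range(num):
    row = np.zeros(2 * num); row[j] = 1.0; row[num + j] = -1.0
    A_ub.append(row); b_ub.append(0.0)
    row = np.zeros(2 * num); row[j] = -1.0; row[num + j] = -1.0
    A_ub.append(row); b_ub.append(0.0)

# Correctness constraints: for each repair image i and each wrong class cw,
# (W + Delta)[y_i] . f_i >= (W + Delta)[cw] . f_i + margin.
for i in range(n):
    for cw in range(k):
        if cw == y[i]:
            continue
        row = np.zeros(2 * num)
        row[cw * d:(cw + 1) * d] = F[i]           # + Delta[cw] . f_i
        row[y[i] * d:(y[i] + 1) * d] = -F[i]      # - Delta[y_i] . f_i
        A_ub.append(row)
        b_ub.append((W[y[i]] - W[cw]) @ F[i] - margin)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * num + [(0, None)] * num)
assert res.success
W_repaired = W + res.x[:num].reshape(k, d)
print("all repair images now correct:",
      bool(np.all((F @ W_repaired.T).argmax(axis=1) == y)))

Because every constraint is linear in the weight deltas, a feasible solution comes with a proof that the whole repair set is classified correctly, which is the sense in which such a repair is "provable"; a production-scale version would use an industrial LP solver and the real network's features.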




Published In

AI Verification: First International Symposium, SAIV 2024, Montreal, QC, Canada, July 22–23, 2024, Proceedings
Jul 2024
196 pages
ISBN: 978-3-031-65111-3
DOI: 10.1007/978-3-031-65112-0
Editors: Guy Avni, Mirco Giacobbe, Taylor T. Johnson, Guy Katz, Anna Lukina, Nina Narodytska, Christian Schilling

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 22 July 2024

Qualifiers

  • Article
