
Exploiting Pre-Trained Models and Low-Frequency Preference for Cost-Effective Transfer-based Attack

Online AM: 25 July 2024

Abstract

The transferability of adversarial examples enables practical transfer-based attacks. However, existing theoretical analyses cannot effectively reveal which factors drive cross-model transferability. Moreover, the assumption that the target model's training dataset is available, together with the high cost of training proxy models, limits practicality. We first propose a novel frequency perspective for studying transferability and identify two factors that impair it: an unchangeable intrinsic difference term and a controllable perturbation-related term. To enhance transferability, we formulate an optimization task with a constraint that reduces the impact of the perturbation-related term, and we design an approximate solution that sidesteps the intractability of the Fourier expansion. To address the second issue, we suggest employing pre-trained models, which are freely available, as proxy models. Building on these ideas, we introduce the cost-effective transfer-based attack (CTA), which solves the optimization task over pre-trained models. CTA can be launched against a broad range of applications, at any time, with minimal effort and nearly zero cost to attackers. This makes CTA an effective, versatile, and fundamental tool for attacking and understanding a wide range of target models, regardless of their architecture or training dataset. Extensive experiments demonstrate the strong attack performance of CTA against various models trained on seven black-box domains, highlighting its broad applicability and effectiveness.
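
To make the recipe in the abstract concrete, the sketch below crafts adversarial examples on a freely available ImageNet-pre-trained proxy (here, torchvision's ResNet-50) while restricting the perturbation to low spatial frequencies with an FFT mask, reflecting the low-frequency preference discussed above. This is only an illustration under stated assumptions, not the authors' CTA implementation: the proxy choice, low-pass radius, step size, and perturbation budget are placeholders, and input preprocessing (e.g., ImageNet normalization) is omitted for brevity.

```python
# Illustrative sketch only: a PGD-style attack on a pre-trained proxy with a
# low-frequency restriction on the perturbation. Not the authors' CTA code;
# hyperparameters and the proxy model are assumptions for demonstration.
import torch
import torch.nn.functional as F
import torchvision


def low_pass(delta, radius=16):
    """Keep only low-frequency components of the perturbation via an FFT mask."""
    _, _, h, w = delta.shape
    fy = torch.fft.fftfreq(h, d=1.0 / h, device=delta.device)  # integer frequencies
    fx = torch.fft.fftfreq(w, d=1.0 / w, device=delta.device)
    yy, xx = torch.meshgrid(fy, fx, indexing="ij")
    mask = ((yy ** 2 + xx ** 2).sqrt() <= radius).to(delta.dtype)  # (h, w), broadcast over b, c
    return torch.fft.ifft2(torch.fft.fft2(delta) * mask).real


def craft(proxy, x, y, eps=8 / 255, alpha=2 / 255, steps=10, radius=16):
    """Untargeted attack: maximize the proxy's loss with a low-frequency perturbation."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(proxy(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()
            delta.copy_(low_pass(delta, radius))        # approximate low-frequency projection
            delta.clamp_(-eps, eps)                     # L_inf budget
            delta.copy_((x + delta).clamp(0, 1) - x)    # keep pixels in [0, 1]
    return (x + delta).detach()


# Any publicly available pre-trained model can serve as the proxy (torchvision >= 0.13 API).
proxy = torchvision.models.resnet50(weights="IMAGENET1K_V1").eval()
x = torch.rand(1, 3, 224, 224)   # placeholder image batch in [0, 1]
y = torch.tensor([207])          # placeholder label
x_adv = craft(proxy, x, y)       # transfer x_adv to the black-box target model
```

Because the L_inf clamp follows the frequency mask, the low-frequency projection is only approximate; the resulting x_adv is then submitted unchanged to the black-box target model.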


Index Terms

  1. Exploiting Pre-Trained Models and Low-Frequency Preference for Cost-Effective Transfer-based Attack


Information

        Published In

ACM Transactions on Knowledge Discovery from Data (Just Accepted)
EISSN: 1556-472X
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Online AM: 25 July 2024
        Accepted: 14 July 2024
        Revised: 09 April 2024
        Received: 15 September 2023


        Author Tags

        1. Deep Neural Networks
        2. Adversarial Examples
        3. Black-box Adversarial Attacks
        4. Transferability

        Qualifiers

        • Research-article


        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

• Total citations: 0
• Total downloads: 128
• Downloads (last 12 months): 128
• Downloads (last 6 weeks): 29
        Reflects downloads up to 16 Nov 2024

