Abstract
Training from pre-trained models (PTMs) is a popular approach for fast machine learning (ML) service deployment. Recent studies on hardware security have revealed that ML systems can be compromised by flipping bits in model parameters (e.g., weights) through memory faults. In this paper, we introduce WBP (weight bit poisoning), a novel task-agnostic backdoor attack that manifests during the victim's training time (i.e., fine-tuning from a public and clean PTM) by inducing hardware-based weight bit flips. WBP employs a novel distance-aware algorithm that identifies bit flips maximizing the distance between the distribution of poisoned output representations (ORs) and that of clean ORs based on the public PTM. This single set of bit flips can be applied to backdoor any victim model during fine-tuning of the same public PTM, regardless of the downstream task. We evaluate WBP on state-of-the-art CNNs and Vision Transformer models with representative downstream tasks. The results show that WBP can compromise a wide range of PTMs and downstream tasks with an average 99.3% attack success rate by flipping as few as 11 model weight bits. WBP remains effective across various training configurations with respect to learning rate, optimizer, and fine-tuning duration. We investigate the limitations of existing backdoor protection techniques against WBP and discuss potential future mitigations. (Our code is available at: https://github.com/casrl/WBP.)
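To make the distance-aware bit-flip search described in the abstract concrete, the sketch below shows one plausible realization under stated assumptions: it greedily evaluates candidate (layer, weight, bit) triples on the public PTM and keeps only those flips that increase a kernel-MMD distance between the output representations (ORs) of triggered inputs and clean inputs. The choice of MMD as the distance, the float32 bit-flip model, the helper names (mmd_distance, flip_bit, select_bit_flips), and the externally supplied candidate list are all illustrative assumptions, not the paper's exact algorithm, which may, for example, operate on quantized weights and use a different distance or search order. The sketch assumes `ptm` is the public pre-trained encoder whose forward pass returns OR vectors of shape (batch, dim).

```python
# Minimal sketch of a distance-aware bit-flip search in the spirit of WBP.
# Hypothetical helper names and design choices; not the authors' implementation.
import copy
import struct
import torch


def mmd_distance(x, y, sigma=1.0):
    # Biased RBF-kernel MMD^2 between two batches of output representations.
    def k(a, b):
        d = torch.cdist(a, b) ** 2
        return torch.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()


def flip_bit(weight, flat_index, bit):
    # Flip one bit of a float32 weight in place (bit 0 = LSB of the IEEE-754 encoding).
    w = weight.view(-1)
    as_int = struct.unpack("<I", struct.pack("<f", float(w[flat_index])))[0]
    w[flat_index] = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))[0]


@torch.no_grad()
def select_bit_flips(ptm, clean_x, triggered_x, candidates, budget=11):
    # Greedily keep bit flips that push the distribution of triggered ORs
    # away from the distribution of clean ORs on the public PTM.
    model = copy.deepcopy(ptm).eval()
    params = dict(model.named_parameters())
    chosen = []
    best = mmd_distance(model(triggered_x), model(clean_x))
    for layer_name, flat_index, bit in candidates:   # candidate (layer, weight, bit) triples
        w = params[layer_name].data
        flip_bit(w, flat_index, bit)                 # tentatively apply the flip
        score = mmd_distance(model(triggered_x), model(clean_x))
        if score > best:                             # flip increases the OR distance: keep it
            best = score
            chosen.append((layer_name, flat_index, bit))
        else:                                        # flip does not help: revert it
            flip_bit(w, flat_index, bit)
        if len(chosen) >= budget:
            break
    return chosen
```

The budget of 11 flips mirrors the number reported in the abstract; how the candidate set is generated and how the chosen flips are later induced in memory are outside this sketch.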
Notes
- 1. Note that this is a theoretical setup to study the applicability of prior methods; in practice, typically only one bit can be flipped at a time.
Acknowledgements
This work is supported in part by U.S. National Science Foundation under SaTC-2019536 and CNS-2147217.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Cai, K., Zhang, Z., Lou, Q., Yao, F. (2025). WBP: Training-Time Backdoor Attacks Through Hardware-Based Weight Bit Poisoning. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15123. Springer, Cham. https://doi.org/10.1007/978-3-031-73650-6_11
DOI: https://doi.org/10.1007/978-3-031-73650-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73649-0
Online ISBN: 978-3-031-73650-6
eBook Packages: Computer Science, Computer Science (R0)