
WBP: Training-Time Backdoor Attacks Through Hardware-Based Weight Bit Poisoning

  • Conference paper
  • In: Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Training from pre-trained models (PTMs) is a popular approach for fast machine learning (ML) service deployment. Recent studies on hardware security have revealed that ML systems can be compromised by flipping bits in model parameters (e.g., weights) through memory faults. In this paper, we introduce WBP (weight bit poisoning), a novel task-agnostic backdoor attack that manifests during the victim's training time (i.e., while fine-tuning from a public and clean PTM) by inducing hardware-based weight bit flips. WBP utilizes a novel distance-aware algorithm that identifies bit flips to maximize the distance between the distributions of poisoned output representations (ORs) and clean ORs based on the public PTM. This single set of bit flips can be applied to backdoor any victim model during fine-tuning of the same public PTM, regardless of the downstream task. We evaluate WBP on state-of-the-art CNN and Vision Transformer models with representative downstream tasks. The results show that WBP can compromise a wide range of PTMs and downstream tasks with an average 99.3% attack success rate by flipping as few as 11 model weight bits. WBP remains effective under various training configurations with respect to learning rate, optimizer, and fine-tuning duration. We investigate the limitations of existing backdoor protection techniques against WBP and discuss potential future mitigations. Our code is available at https://github.com/casrl/WBP.
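
The distance-aware bit search described above can be pictured as a greedy loop over candidate weight bits: tentatively flip a bit in the public PTM, measure how far the output representations of trigger-stamped inputs move away from those of clean inputs, and keep the flip with the largest gain. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation; the function names (flip_bit_, rep_distance, greedy_bit_search), the mean-embedding distance, the candidate list, and the default budget of 11 flips are all assumptions made for illustration only.

    # Hypothetical sketch (assumed names and distance measure; not the authors'
    # implementation): greedily choose weight-bit flips in a public pre-trained
    # encoder that maximize the gap between clean and trigger-stamped output
    # representations (ORs).
    import torch
    import torch.nn as nn


    def flip_bit_(weight: torch.Tensor, index: int, bit: int) -> None:
        """Flip one bit of a float32 weight in place via an int32 view.

        `bit` is assumed to be in [0, 30]; the sign bit is skipped for simplicity.
        """
        flat_int = weight.detach().view(-1).view(torch.int32)  # same storage as weight
        flat_int[index] ^= (1 << bit)


    def rep_distance(clean_or: torch.Tensor, poisoned_or: torch.Tensor) -> torch.Tensor:
        """Distance between the two OR distributions (here: a simple mean-embedding gap)."""
        return (clean_or.mean(dim=0) - poisoned_or.mean(dim=0)).norm()


    @torch.no_grad()
    def greedy_bit_search(encoder: nn.Module, x_clean, x_trigger, candidates, budget=11):
        """candidates: iterable of (param_name, flat_index, bit_position) tuples to try."""
        params = dict(encoder.named_parameters())
        chosen = []
        for _ in range(budget):
            base = rep_distance(encoder(x_clean), encoder(x_trigger))
            best = None
            for name, idx, bit in candidates:
                flip_bit_(params[name].data, idx, bit)            # try the flip
                gain = rep_distance(encoder(x_clean), encoder(x_trigger)) - base
                flip_bit_(params[name].data, idx, bit)            # undo it
                if best is None or gain > best[0]:
                    best = (gain, name, idx, bit)
            if best is None or best[0] <= 0:
                break                                             # no improving flip left
            _, name, idx, bit = best
            flip_bit_(params[name].data, idx, bit)                # commit the best flip
            chosen.append((name, idx, bit))
        return chosen

In a real attack the candidate bits and the trigger pattern would be chosen far more carefully (e.g., via a gradient-based ranking over the PTM's weights), but the key property of the search carries over: it only needs the public PTM and unlabeled inputs, so the resulting bit flips are independent of the victim's downstream task.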

Notes

  1. Note that this is a theoretical setup for studying the applicability of prior methods. In practice, typically only one bit can be flipped within a given time window.

Acknowledgements

This work is supported in part by the U.S. National Science Foundation under SaTC-2019536 and CNS-2147217.

Author information

Corresponding author

Correspondence to Kunbei Cai.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cai, K., Zhang, Z., Lou, Q., Yao, F. (2025). WBP: Training-Time Backdoor Attacks Through Hardware-Based Weight Bit Poisoning. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15123. Springer, Cham. https://doi.org/10.1007/978-3-031-73650-6_11

  • DOI: https://doi.org/10.1007/978-3-031-73650-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73649-0

  • Online ISBN: 978-3-031-73650-6

  • eBook Packages: Computer Science, Computer Science (R0)
