Abstract
Deep neural networks are known to be vulnerable to malicious perturbations. Current methods for improving adversarial robustness make use of either implicit or explicit regularization, with the latter usually based on adversarial training. Randomized smoothing, averaging a classifier's outputs over a random distribution centered at the sample, has been shown to guarantee a classifier's performance under bounded perturbations of the input. In this work, we study the application of randomized smoothing to improve performance on unperturbed data and to increase robustness to adversarial attacks. We propose to combine smoothing with adversarial training and randomization approaches, and find that doing so significantly improves resilience compared to the baseline. We examine our method's performance under common white-box (FGSM, PGD) and black-box (transfer and NAttack) attacks on CIFAR-10 and CIFAR-100, and find that for a low number of attack iterations, smoothing provides a significant performance boost that persists even for perturbations with a high attack norm \(\epsilon\). For example, under a PGD-10 attack on CIFAR-10 using Wide-ResNet28-4, we achieve 60.3% accuracy for infinity norm \(\epsilon_\infty = 8/255\) and 13.1% accuracy for \(\epsilon_\infty = 35/255\), outperforming previous art by 3% and 6%, respectively. The latter corresponds to nearly twice the accuracy of prior methods at \(\epsilon_\infty = 35/255\), with an even larger gain for perturbations of higher infinity norm. A reference implementation of the proposed method is available at https://github.com/yanemcovsky/SIAM.
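To make the smoothing operation described above concrete, the following is a minimal PyTorch sketch of inference with a smoothed classifier: the base model's softmax outputs are averaged over Gaussian noise centered at the input. The noise scale `sigma`, the number of Monte Carlo samples `n_samples`, and the function name are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def smoothed_predict(model: torch.nn.Module,
                     x: torch.Tensor,
                     sigma: float = 0.25,
                     n_samples: int = 32) -> torch.Tensor:
    """Monte Carlo estimate of the smoothed classifier: average the base
    model's class probabilities over Gaussian noise centered at the input x."""
    model.eval()
    probs = None
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)   # draw a noisy sample around x
        p = F.softmax(model(noisy), dim=1)        # base classifier's output
        probs = p if probs is None else probs + p
    return probs / n_samples                      # averaged class probabilities

# The smoothed prediction is the class with the highest averaged probability:
# y_hat = smoothed_predict(model, images).argmax(dim=1)
```

In practice, more Monte Carlo samples tighten the estimate of the smoothed output at the cost of additional forward passes; the paper's combination of smoothing with adversarial training and randomization is built on top of this basic averaging step.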
Acknowledgements
The research was funded by the Hyundai Motor Company through the HYUNDAI-TECHNION-KAIST Consortium, National Cyber Security Authority, and the Hiroshi Fujiwara Technion Cyber Security Research Center.