DOI: 10.1145/3309182.3309190

Research Article | Public Access

On Improving the Effectiveness of Adversarial Training

Published: 13 March 2019

Abstract

Machine learning models, including neural networks, are vulnerable to adversarial examples: inputs generated from legitimate examples by applying small perturbations that fool the models into misclassifying them. The algorithms used to generate adversarial examples are called adversarial example generation methods. Adversarial training, the state-of-the-art defense approach, improves the robustness of machine learning models by augmenting the training data with adversarial examples. However, adversarial training is far from perfect, and a deeper understanding of it is needed to further improve its effectiveness. In this paper, we investigate two research questions. The first is whether Method-Based Ensemble Adversarial Training (MBEAT) could be beneficial, i.e., whether leveraging adversarial examples generated by multiple methods could increase the effectiveness of adversarial training. The second is whether a Round Gap Of Adversarial Training (RGOAT) could exist, i.e., whether a neural network model adversarially trained in one round remains vulnerable to adversarial examples generated from the trained model itself. We design an adversarial training experimental framework to answer these two questions. We find that MBEAT is indeed beneficial, indicating that it has important practical value. We also find that RGOAT indeed exists, indicating that adversarial training should be an iterative process.
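To make the two ideas in the abstract concrete, the following is a minimal sketch of method-based ensemble adversarial training with an iterative (multi-round) schedule, written in PyTorch. It is not the authors' implementation: the choice of FGSM and PGD as the two generation methods, the hyperparameters (epsilon, step size, number of rounds), and the helper names are illustrative assumptions. Mixing adversarial examples produced by several methods into each batch corresponds to MBEAT, while regenerating adversarial examples from the freshly trained model in every round reflects the round gap (RGOAT) that motivates iterative adversarial training.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: ensemble adversarial training with an iterative
# (multi-round) schedule. FGSM and PGD stand in for "multiple generation
# methods"; epsilon, step sizes, and round counts are illustrative only.

def fgsm(model, x, y, eps=0.3):
    """One-step attack: perturb x along the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps=0.3, alpha=0.01, steps=40):
    """Multi-step attack: iterated gradient-sign steps projected into the eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project back
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def ensemble_adversarial_training(model, loader, rounds=3, epochs_per_round=5):
    """Each round regenerates adversarial examples from the *current* model,
    so robustness gained in round k is re-tested and re-trained in round k+1."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    attacks = [fgsm, pgd]  # MBEAT: more than one generation method
    for _ in range(rounds):  # RGOAT: adversarial training as an iterative process
        for _ in range(epochs_per_round):
            for x, y in loader:
                # Clean batch plus one adversarial batch per generation method.
                batches = [x] + [atk(model, x, y) for atk in attacks]
                opt.zero_grad()
                loss = sum(F.cross_entropy(model(b), y) for b in batches) / len(batches)
                loss.backward()
                opt.step()
    return model
```

As a usage example, one could pass a small convolutional network and an MNIST DataLoader to ensemble_adversarial_training and then evaluate the result against adversarial examples generated from the trained model itself, roughly in the spirit of the experimental framework described in the abstract.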



Information & Contributors


Published In

IWSPA '19: Proceedings of the ACM International Workshop on Security and Privacy Analytics
March 2019
67 pages
ISBN: 9781450361781
DOI: 10.1145/3309182
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 March 2019


Author Tags

  1. adversarial examples
  2. adversarial machine learning
  3. adversarial training
  4. deep learning
  5. neural networks

Qualifiers

  • Research-article


Conference

CODASPY '19

Acceptance Rates

Overall Acceptance Rate: 18 of 58 submissions, 31%




Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 111
  • Downloads (Last 6 weeks): 23
Reflects downloads up to 22 Nov 2024


Citations

Cited By

  • (2023) Adversarial Training Method for Machine Learning Model in a Resource-Constrained Environment. Proceedings of the 19th ACM International Symposium on QoS and Security for Wireless and Mobile Networks, 87-95. DOI: 10.1145/3616391.3622768. Online publication date: 30-Oct-2023.
  • (2022) Adversarial Training Methods for Deep Learning: A Systematic Review. Algorithms, 15:8 (283). DOI: 10.3390/a15080283. Online publication date: 12-Aug-2022.
  • (2022) On the Resiliency of an Analog Memristive Architecture against Adversarial Attacks. 2022 23rd International Symposium on Quality Electronic Design (ISQED), 1-7. DOI: 10.1109/ISQED54688.2022.9806277. Online publication date: 6-Apr-2022.
  • (2022) Compressed Learning in MCA Architectures to Tolerate Malicious Noise. 2022 IEEE 28th International Symposium on On-Line Testing and Robust System Design (IOLTS), 1-8. DOI: 10.1109/IOLTS56730.2022.9897622. Online publication date: 12-Sep-2022.
  • (2021) Key-Based Input Transformation Defense Against Adversarial Examples. 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC), 1-10. DOI: 10.1109/IPCCC51483.2021.9679424. Online publication date: 29-Oct-2021.
  • (2020) Distinguishability of adversarial examples. Proceedings of the 15th International Conference on Availability, Reliability and Security, 1-10. DOI: 10.1145/3407023.3407040. Online publication date: 25-Aug-2020.
