
DOI: 10.1145/3474370.3485655

Using Honeypots to Catch Adversarial Attacks on Neural Networks

Published: 15 November 2021

Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. Numerous efforts either try to patch weaknesses in trained models, or try to make it difficult or costly to compute adversarial examples that exploit them. In our work, we explore a new "honeypot" approach to protect DNN models. We intentionally inject trapdoors, honeypot weaknesses in the classification manifold that attract attackers searching for adversarial examples. Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space. Our defense then identifies attacks by comparing neuron activation signatures of inputs to those of trapdoors.
In this paper, we introduce trapdoors and describe an implementation of a trapdoor-enabled defense. First, we analytically prove that trapdoors shape the computation of adversarial attacks so that attack inputs will have feature representations very similar to those of trapdoors. Second, we experimentally show that trapdoor-protected models can detect, with high accuracy, adversarial examples generated by state-of-the-art attacks (PGD, optimization-based CW, Elastic Net, BPDA), with negligible impact on normal classification. These results generalize across classification domains, including image, facial, and traffic-sign recognition. We also present significant results measuring trapdoors' robustness against customized adaptive attacks (countermeasures).
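
The detection idea above can be summarized concretely: record the trapdoor's neuron activation signature (the average feature-space representation of trigger-stamped inputs), then flag any incoming input whose activations are highly similar to that signature. The sketch below illustrates this in PyTorch under stated assumptions; the feature extractor, trigger and mask shapes, threshold value, and helper names (stamp_trigger, trapdoor_signature, flag_adversarial) are illustrative, not the authors' implementation.

    # Minimal sketch of signature-based detection (assumed names and shapes).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def stamp_trigger(x, trigger, mask):
        # Blend a small trigger pattern into a batch of images.
        return x * (1 - mask) + trigger * mask

    def trapdoor_signature(feature_extractor, clean_x, trigger, mask):
        # Trapdoor "signature": mean feature representation of trigger-stamped
        # inputs, computed once after the trapdoored model is trained.
        with torch.no_grad():
            feats = feature_extractor(stamp_trigger(clean_x, trigger, mask))
        return feats.mean(dim=0)

    def flag_adversarial(feature_extractor, x, signature, threshold=0.9):
        # Inputs whose activations closely match the trapdoor signature are
        # flagged as likely adversarial examples.
        with torch.no_grad():
            feats = feature_extractor(x)
        sims = F.cosine_similarity(feats, signature.unsqueeze(0), dim=1)
        return sims > threshold

    # Toy usage with a stand-in feature extractor and random data.
    feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
    trigger = torch.rand(3, 32, 32)
    mask = torch.zeros(3, 32, 32)
    mask[:, -6:, -6:] = 0.1           # small, low-opacity corner patch
    clean_x = torch.rand(64, 3, 32, 32)
    signature = trapdoor_signature(feature_extractor, clean_x, trigger, mask)
    flags = flag_adversarial(feature_extractor, clean_x, signature)

In practice the threshold would be calibrated (for example, to a fixed false-positive rate on clean inputs) rather than hard-coded; the value here is only a placeholder.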

Supplementary Material

MP4 File (MTD21-fp12345.mp4)
In our work, we explore a new honeypot approach to protect DNN models. We intentionally inject honeypot weaknesses in the classification manifold that attract attackers searching for adversarial examples. Attackers' optimization algorithms gravitate towards trapdoors, leading them to produce attacks similar to trapdoors in the feature space. We introduce trapdoors and describe an implementation of a trapdoor-enabled defense. We analytically prove that trapdoors shape the computation of adversarial attacks so that attack inputs will have feature representations very similar to those of trapdoors. We experimentally show that trapdoor-protected models can detect, with high accuracy, adversarial examples generated by state-of-the-art attacks. These results generalize across classification domains, including image, facial, and traffic-sign recognition. We also present significant results measuring trapdoors' robustness against customized adaptive attacks (countermeasures).
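
As a complement to the detection sketch after the abstract, the following sketch shows one way the trapdoor itself could be embedded during training: a fraction of each batch is stamped with the trigger and relabeled to the protected target class, so the model learns the honeypot shortcut alongside its normal task. The names and hyperparameters (model, loader, target_class, inject_ratio) are assumptions for illustration, not the paper's implementation.

    # Hedged sketch: embedding a trapdoor for one protected class during training.
    import torch
    import torch.nn as nn

    def train_with_trapdoor(model, loader, trigger, mask, target_class=0,
                            inject_ratio=0.1, epochs=1, lr=1e-3, device="cpu"):
        trigger, mask = trigger.to(device), mask.to(device)
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        model.to(device).train()
        for _ in range(epochs):
            for x, y in loader:
                x, y = x.clone().to(device), y.clone().to(device)
                n = max(1, int(inject_ratio * x.size(0)))
                # Stamp the first n samples with the trigger and relabel them,
                # so the model associates the trigger with target_class.
                x[:n] = x[:n] * (1 - mask) + trigger * mask
                y[:n] = target_class
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()
        return model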

Cited By

  • (2023) Beating Backdoor Attack at Its Own Game. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 4597-4606. DOI: 10.1109/ICCV51070.2023.00426. Online publication date: 1-Oct-2023.
  • (2022) Application of Artificial Intelligence Technology in Honeypot Technology. 2021 International Conference on Advanced Computing and Endogenous Security, pp. 01-09. DOI: 10.1109/IEEECONF52377.2022.10013349. Online publication date: 21-Apr-2022.


Information & Contributors

Information

Published In

MTD '21: Proceedings of the 8th ACM Workshop on Moving Target Defense
November 2021
48 pages
ISBN:9781450386586
DOI:10.1145/3474370
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 November 2021


Author Tags

  1. adversarial examples
  2. honeypots
  3. neural networks

Qualifiers

  • Invited-talk

Conference

CCS '21

Acceptance Rates

Overall Acceptance Rate 40 of 92 submissions, 43%




Bibliometrics & Citations


Article Metrics

  • Downloads (Last 12 months)33
  • Downloads (Last 6 weeks)1
Reflects downloads up to 03 Oct 2024


