Computer Science > Computer Vision and Pattern Recognition

arXiv:2311.14772 (cs)

[Submitted on 24 Nov 2023 (v1), last revised 11 Jun 2024 (this version, v2)]

Title: Trainwreck: A damaging adversarial attack on image classifiers

Abstract: Adversarial attacks are an important security concern for computer vision (CV). As CV models become increasingly valuable assets in applied practice, disrupting them is emerging as a form of economic sabotage. This paper opens up the exploration of damaging adversarial attacks (DAAs) that seek to damage target CV models. DAAs are formalized by defining the threat model and the cost function DAAs maximize, and by setting three requirements for success: potency, stealth, and customizability. As a pioneering DAA, this paper proposes Trainwreck, a train-time attack that conflates the data of similar classes in the training data using stealthy ($\epsilon \leq 8/255$) class-pair universal perturbations obtained from a surrogate model. Trainwreck is a black-box, transferable attack: it requires no knowledge of the target architecture, and a single poisoned dataset degrades the performance of any model trained on it. The experimental evaluation on CIFAR-10 and CIFAR-100 and various model architectures (EfficientNetV2, ResNeXt-101, and a finetuned ViT-L-16) demonstrates Trainwreck's efficiency. Trainwreck achieves similar or better potency compared to the data poisoning state of the art and is fully customizable by the poison rate parameter. Finally, data redundancy with hashing is identified as a reliable defense against Trainwreck or similar DAAs. The code is available at this https URL.
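The stealth bound and the hashing defense named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not say how the defense is built, so the use of SHA-256, the helper names (`perturb`, `digest`, `find_modified`), and the random stand-in images are all assumptions made for illustration. The perturbation is random noise, not a class-pair universal perturbation from a surrogate model. The budget is expressed in 8-bit pixel units, so 8 corresponds to 8/255 on a [0, 1] scale.

```python
import hashlib
import random

EPSILON = 8  # assumed L-infinity budget in 8-bit pixel units (8/255 on a [0, 1] scale)


def perturb(pixels: bytes, delta: list[int], epsilon: int = EPSILON) -> bytes:
    """Add a perturbation to 8-bit pixels, clipping each step to the epsilon budget."""
    out = bytearray(len(pixels))
    for i, (p, d) in enumerate(zip(pixels, delta)):
        d = max(-epsilon, min(epsilon, d))   # enforce |d| <= epsilon
        out[i] = max(0, min(255, p + d))     # stay inside the valid pixel range
    return bytes(out)


def linf_distance(a: bytes, b: bytes) -> int:
    """Largest absolute per-pixel difference between two images."""
    return max(abs(x - y) for x, y in zip(a, b))


def digest(pixels: bytes) -> str:
    """Hash of the raw pixel bytes (SHA-256 is an assumption, not the paper's choice)."""
    return hashlib.sha256(pixels).hexdigest()


def find_modified(dataset: dict[str, bytes], trusted: dict[str, str]) -> list[str]:
    """Names of images whose hash no longer matches the trusted record."""
    return sorted(n for n, px in dataset.items() if digest(px) != trusted.get(n))


# Stand-in "dataset": four random 32x32 RGB images, the size used by CIFAR.
rng = random.Random(0)
size = 32 * 32 * 3
clean = {f"img_{i}": bytes(rng.randrange(256) for _ in range(size)) for i in range(4)}

# Hashes recorded while the data is known to be clean and stored separately.
trusted_hashes = {name: digest(px) for name, px in clean.items()}

# Simulated tampering: some requested steps exceed the budget and get clipped.
delta = [rng.choice([-20, -8, 8, 20]) for _ in range(size)]
poisoned = dict(clean)
poisoned["img_2"] = perturb(clean["img_2"], delta)

max_change = linf_distance(clean["img_2"], poisoned["img_2"])
flagged = find_modified(poisoned, trusted_hashes)
print(max_change, flagged)
```

The point of the sketch is that a change small enough to satisfy the stealth bound still alters the byte content of the image, so comparing against hashes kept with a redundant, trusted copy flags the tampered file regardless of how subtle the perturbation looks.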

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2311.14772 [cs.CV]
	(or arXiv:2311.14772v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.14772

Submission history

From: Jan Zahálka
[v1] Fri, 24 Nov 2023 13:37:19 UTC (185 KB)
[v2] Tue, 11 Jun 2024 09:53:51 UTC (402 KB)
