Robustness to Adversarial Examples through an Ensemble of Specialists

Mahdieh Abbasi, Christian Gagne

29 Nov 2024 (modified: 09 Mar 2017), ICLR 2017, Readers: Everyone

Abstract: We propose to use an ensemble of diverse specialists, where each specialty is defined according to the confusion matrix. Indeed, we observed that adversarial instances originating from a given class tend to be mislabeled into a small subset of (incorrect) classes. Therefore, we argue that an ensemble of specialists should be better able to identify and reject fooling instances, since its decisions show high entropy (i.e., disagreement) in the presence of adversaries. Experimental results confirm this interpretation, opening a way to make the system more robust to adversarial examples through a rejection mechanism, rather than trying to classify them properly at any cost.
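
The rejection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the members' decisions are combined or how the rejection cutoff is chosen, so the averaging of probability vectors, the function names (`mean_prediction`, `entropy`, `classify_or_reject`), and the `threshold` value are all assumptions made for illustration.

```python
import math


def mean_prediction(member_probs):
    """Average the class-probability vectors of all ensemble members.

    Assumption: each member outputs a probability vector over the same classes.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def classify_or_reject(member_probs, threshold):
    """Return the predicted class index, or None to reject the input.

    Assumption: an input is rejected when the entropy of the averaged
    prediction exceeds a hypothetical, user-chosen threshold.
    """
    avg = mean_prediction(member_probs)
    if entropy(avg) > threshold:
        return None  # members disagree: treat the input as suspicious
    return max(range(len(avg)), key=avg.__getitem__)


# Members that agree yield low entropy and a confident prediction.
agreeing = [[0.90, 0.05, 0.05], [0.85, 0.10, 0.05], [0.95, 0.03, 0.02]]
# Members that each favor a different class yield high entropy.
disagreeing = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]

print(classify_or_reject(agreeing, threshold=0.7))
print(classify_or_reject(disagreeing, threshold=0.7))
```

In this sketch, the agreeing ensemble returns a class index while the disagreeing one is rejected. In practice the threshold would have to be tuned on held-out data, which the abstract does not describe.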

Conflicts: ulaval.ca
