Facial Soft-biometrics Obfuscation through Adversarial Attacks

Published: 12 September 2024

Abstract

Sharing facial pictures through online services, especially on social networks, has become a common habit for thousands of users. This practice hides a possible threat to privacy: the owners of such services, as well as malicious users, could automatically extract information from faces using modern and effective neural networks. In this article, we propose the harmless use of adversarial attacks, i.e., variations of images that are almost imperceptible to the human eye and that are typically generated with the malicious purpose of misleading Convolutional Neural Networks (CNNs). Here, such attacks are instead adopted to (1) obfuscate soft biometrics (gender, age, ethnicity) while (2) preserving the quality of the face images posted online. We achieve these two conflicting goals by modifying the implementations of four of the most popular adversarial attacks, namely FGSM, PGD, DeepFool, and C&W, in order to constrain the average amount of noise they generate on the image and the maximum perturbation they add to each single pixel. We demonstrate, in an experimental framework including three popular CNNs, namely VGG16, SENet, and MobileNetV3, that the considered obfuscation method, which requires at most 4 seconds for each image, is effective not only when we have complete knowledge of the neural network that extracts the soft biometrics (white box attacks) but also when the adversarial attacks are generated in a more realistic black box scenario. Finally, we prove that an opponent can implement defense techniques to partially reduce the effect of the obfuscation, but at a substantial cost in terms of accuracy on clean images; this result, confirmed by the experiments carried out with three popular defense methods, namely adversarial training, denoising autoencoder, and Kullback-Leibler autoencoder, shows that defending is not convenient for the opponent and that the proposed approach is robust to defenses.

1 Introduction

Every day, billions of photos are captured by our smartphones, containing various subjects such as locations, family members, and other individuals who are inadvertently included. A significant portion of these pictures are promptly shared on social media platforms [6, 42]. In fact, in 2022 there were about 4.62 billion active social media users [36], and approximately 95 million photos and videos are estimated to be shared on Instagram each day. This habit has become so widespread that all smartphones come pre-installed with various social media applications and users spend \(43\%\) of their overall phone usage time on such applications [36].
Online services behind these applications, as explicitly written in their privacy policy statements, are allowed by the users to process the uploaded contents and automatically extract metadata with the purpose of improving the service itself. In particular, soft biometrics like gender, age, and ethnicity, together with people identities and places, are among the most common information extracted from pictures [4, 11, 33, 48, 52, 56]. Although most of the social media platform users accept privacy statements and may deliberately share sensitive information with the platform [48], for instance, while filling out registration forms, about one-third of the users are concerned about protecting their privacy and want to avoid potential misuse of personal information [36]. As discussed in [5, 16, 22, 23, 41], user concerns are justified by the fact that the platforms themselves may pose potential threats by extracting unauthorized metadata from individuals, other than the user, who are in the picture or by engaging in targeted advertising, and trying to manipulate the user’s behavior. Additionally, malicious users may also be interested in inferring sensitive information, such as soft biometrics, from shared contents, for personalized advertising or social engineering attacks [21], as an example.
Therefore, it is important for a user to have a convenient and effective way of concealing soft biometrics information in pictures while maintaining a reasonable level of quality. In this way, other humans can still perceive the picture as authentic and be able to clearly identify faces, but automatic systems can be fooled. Hereinafter, we will use the term opponent to encompass both automated processes and people who, deceptively and without the user’s knowledge, attempt to extract soft-biometrics data from the content shared by the latter. Furthermore, we will define obfuscation as the deliberate process of altering elements within a picture to hide sensitive information from opponents, and we will refer to the modified images as obfuscated images.
Of course, since faces are the most relevant and richly detailed elements of a picture [4], and they are used by both humans and machines [10, 17, 18, 52, 53, 60] to extract information about people, it is reasonable to assume that it would be acceptable for a user to focus only on them instead of altering the picture as a whole. This is also confirmed by the interest of researchers, who in recent years have mostly focused on methods working on faces. Among these, common approaches are based on face de-identification [57], which aims to hide identification information by modifying or replacing the face of a person in a picture [27]. Scrambled-image techniques have been proposed in [35]; similarly to de-identification, they hide people’s identity, but by covering their faces with patches. Very recently, thanks to the success of deep learning methods in computer vision, new approaches for face de-identification are based on Generative Adversarial Networks (GANs) [43, 47, 66, 71]. The latter are very effective and, while the person in the resulting picture is not identifiable, the facial features are still recognizable, to the point that it is possible to extract coherent biometric features and that a human does not perceive the face as a fake. Nevertheless, the drawback of all the de-identification approaches, which makes them unfit for our purpose, is that the face is considerably different from the original one, so that the users themselves would not be able to recognize their faces while looking at the obfuscated image.
In [40, 44, 45, 63, 68] a new trend has emerged: the idea is to avoid altering the faces in a perceptible way and to allow managing the tradeoff between the effectiveness of the obfuscation and the quality of the output image; the approach is based on adversarial machine learning methods, also known as adversarial attacks. The basic idea is to exploit the intrinsic weaknesses of convolutional neural networks (CNNs), which are generally used to realize modern image processing systems, by generating properly corrupted images, called adversarial examples, that induce the CNN to output wrong predictions. Indeed, as discussed in [8, 24, 67], the impressive accuracy achieved by CNNs on several computer vision tasks is not accompanied by an equally remarkable robustness w.r.t. a family of image corruptions, named adversarial patterns, that are generated to purposefully mislead neural networks. These methods fit the purpose at hand because the adversarial examples are generated so as to bound the noise to a level that is not perceived by a human but that can still induce an error in the neural network [7, 24]. It is worth noting that, after the appearance of adversarial attacks, some defenses have also been proposed to make CNNs more robust against adversarial examples [12, 38, 46, 58, 62]; thus, we can assume that such defenses may be adopted by the opponent to prevent the effects of the obfuscation.
This article presents a comprehensive analysis aimed at assessing the effectiveness of well-known state-of-the-art adversarial machine learning techniques as privacy tools for users. The purpose is to investigate whether these techniques can be exploited by the users to obfuscate faces in pictures shared on social media platforms, with the intention of hiding the gender, ethnicity, and age of individuals in the images.
The proposed analysis is based on the following assumptions:
(1)
The users want to obfuscate soft biometrics extracted from faces by an opponent through CNNs, while keeping their faces clearly recognizable by humans.
(2)
The amount of noise added to the image must not affect the quality of the image perceived by a human.
(3)
The users are not aware of the specific CNN model used by an opponent to extract biometric features, but they can use well-known pre-trained neural networks commonly used for face analysis.
(4)
The opponent may use defense strategies based on adversarial training or denoising stages for preprocessing.
(5)
The generation of obfuscated images must be performed in a time that is reasonable for the user (e.g., less than 5 seconds).
Although recent works have faced similar problems, focusing the analysis on text contents [1], face recognition [15, 53, 63], or social graphs [41], to the best of our knowledge this is the most extensive analysis on the application of adversarial machine learning methods to prevent the extraction of soft biometrics from people’s faces in shared contents. We conducted the analysis by comparing four state-of-the-art adversarial attacks on large standard face datasets. According to the purpose of the proposed analysis, standard defenses have also been considered, such as adversarial training and denoising autoencoders. These approaches require having samples of obfuscated images, possibly generated using the same adversarial attacks and CNNs, but it is reasonable to expect that, as the networks and the attacks are available to a user, they can also be exploited by an opponent to implement defense strategies. Consequently, it becomes crucial to assess the effectiveness of obfuscation techniques despite the presence of such defenses.
The remainder of the article is structured as follows: in Section 2 we describe the methodology adopted to conduct the analysis by detailing CNNs, datasets, attacks, and defenses; in Section 3 we give details about our experimental framework, explaining the assumptions and the design choices; in Section 4 we describe the experiments and discuss the results; and finally, in Section 5 we provide the final outcomes and conclusions.

2 Methodology

In this section, we present a comprehensive overview of the tasks, the methods, and the datasets considered in the proposed analysis. The section is organized as follows: in Section 2.1 we provide a formal definition of each task, along with details of the reference datasets and CNNs selected from the state of the art to extract the soft-biometric features from facial images; in Section 2.2 we formulate the problem of generating adversarial samples and elaborate on the specific attacks employed in our experiments; and finally, in Section 2.3 we delve into the defense approaches.

2.1 Extracting Soft Biometrics from Faces

Over the years, several formulations and methods have been proposed to extract soft biometrics from facial images [10, 17, 25, 26]. Specifically, with respect to the facial features considered in this article, i.e., gender, age, and ethnicity, the problem of extracting each of them is commonly formalized as follows: (1) the recognition of the gender is a binary classification between male and female [3, 20]; (2) the estimation of the age [10] is a classification among age groups, typically 0 to 2, 4 to 6, 8 to 13, 15 to 20, 25 to 32, 38 to 43, 48 to 53, and over 60; (3) the recognition of the ethnicity [25] is a classification among the Caucasian Latin, African American, East Asian, and Asian Indian categories. Hereinafter, we will refer to these definitions while talking about the tasks at hand.
In the recent literature it is possible to identify several state-of-the-art datasets for face analysis, such as Adience [39], FERET [49], Gender-FERET [2, 3], and VGGFace2 [9]. In the case of VGGFace2, there are also extensions of the original labels for ethnicity recognition and age estimation, i.e., VGGFace2 MIVIA Ethnicity Recognition (VMER) [25] and VGGFace2 MIVIA Age (VMAGE) [26]. Among the publicly available datasets, we have selected those most representative for the purposes of our analysis:
(1)
VGGFace2 [9], a large-scale dataset released and maintained by the Visual Geometry Group of the University of Oxford. It consists of over 3 million images from more than 9,000 individuals collected in the wild, making it one of the largest and most representative face recognition datasets publicly available. VGGFace2 has been designed to include a broad spectrum of variability within the population. It includes individuals from diverse ethnicities, age groups, and genders, as well as people captured in different poses. On average, the dataset contains 362 samples for each person, providing a comprehensive representation of individuals and their facial characteristics.
(2)
VMER [25] and VMAGE [26], extensions of the original VGGFace2 dataset released by the MIVIA Lab of the University of Salerno. They preserve the same characteristics and distributions of individuals as VGGFace2. For the sake of clarity, VMER assigns each individual one of four ethnicity categories in addition to the labels in VGGFace2 (African American, East Asian, Caucasian Latin, and Asian Indian), while VMAGE adds the estimated age value for each image in the dataset.
(3)
Adience [39], a dataset specifically designed for age and gender classification collected in real-world conditions, which encompasses a wide range of variations in appearance, pose, lighting, and image quality. The dataset comprises a total of 26,580 face images captured from different yaw angles, with 13,649 images captured from almost frontal angles. The faces in the dataset are categorized into eight non-balanced age groups: 0 to 2, 4 to 6, 8 to 13, 15 to 20, 25 to 32, 38 to 43, 48 to 53, and 60 years or older.
The three state-of-the-art CNNs that obtained, in recent years, remarkable performance in face analysis tasks [10, 25, 26] are VGG-16 [64], SENet [31], and MobileNetV3 [29]; we have chosen them for their effectiveness and because all of them are publicly available pre-trained on VGGFace2 or ImageNet [19, 73], so it is not necessary to train them from scratch.
For the sake of clarity, we provide a few details about the CNNs. VGG-16 [64], proposed by the Visual Geometry Group of the University of Oxford, is widely used for face analysis tasks, even if it has a very simple architecture: a fixed input of \(224\times 224\) pixels is passed through a sequence of convolutional layers followed by three fully connected layers and a softmax. SENet [31] is a variant of the well-known ResNet [28], with the addition of squeeze-and-excitation blocks; this is an attention mechanism in which, for each convolution, the feature maps are passed through a squeeze operation that aggregates the features across their spatial dimension, followed by an excitation operation that takes the embedding and provides a set of weights to be applied to the initial feature maps to generate the output of the block. MobileNet was first proposed by Google in [30] as an efficient CNN for mobile devices, thanks to the introduction of the concept of depth-wise separable convolution, which has been demonstrated to be very effective in reducing the model size and complexity. In MobileNetV2 [61], the architecture has been extended with the addition of inverted residual blocks, a more efficient variant of the residual blocks used in ResNet [28]. MobileNetV3 is an evolution of MobileNetV2 where some inefficient blocks have been improved through the use of squeeze-and-excitation layers and the swish non-linearity [55].

2.2 Generating Adversarial Examples

Generating an adversarial example is basically an optimization problem, where the objective function aims at minimizing the distance between the image and its altered version, such that the two are perceived as identical, but the label assigned by the classifier to the altered image is different from that of the original one. In the case of untargeted attacks, it is sufficient that the classifier assigns any incorrect label to the altered image, while in targeted attacks, a specific incorrect label is desired. In our analysis, we consider only untargeted attacks, since we only need the neural network used by the opponent to provide wrong labels.
More formally, given an image x and its version \(x^{\prime }=x+\mu\) obfuscated through the addition of the noise pattern image \(\mu\) , the aim is to find the value of \(\mu\) that minimizes \(D(x,x^{\prime })\) (where D is a given distance function) with the constraints that \(F(x^{\prime }) \ne F(x)\) , where F is the classifier prediction function, and \(x^{\prime }\) is in \([0,1]^n\) to guarantee that it is still a valid image (see Equation (1)):
\begin{equation} \begin{aligned}\min \quad D(x,x^{\prime }) \\ \text {s.t.} \quad F(x^{\prime }) \ne F(x) \\ x^{\prime } \in [0,1]^n. \end{aligned} \end{equation}
(1)
Unfortunately, since the first constraint involves the classification function F, the problem is hard to solve because of the non-linearities in neural networks. The differences among the attacks in the state of the art lie mainly in how they define the distance between the images and how they deal with the non-linearity.
In [24] the authors proposed the Fast Gradient Sign Method (FGSM), a method designed to be fast at the expense of optimality. An adversarial example is computed as shown in the following equation:
\begin{equation} x^{\prime } = x + \epsilon \cdot sign(\nabla _{x} Loss_{F,t}(x)), \end{equation}
(2)
where t is the label of x and \(\epsilon\) regulates the intensity of the additive noise. The aim of FGSM is to use the gradient of the loss function \(\nabla Loss_{F,t}\) to get the direction in which the pixel intensities should change so as to maximize the loss. In this case, the distance function is the \(L_{ \infty }\) norm since the method aims at constraining the maximum value of the noise for each single pixel.
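As an illustration of Equation (2), the following minimal sketch computes an FGSM-obfuscated image in TensorFlow 2 eager style; note that our experiments use TensorFlow 1.15 and Keras, so the API style, the [0, 1] pixel range, and the use of a generic Keras softmax classifier model with one-hot labels y_true are assumptions made only for illustration.

```python
import tensorflow as tf

def fgsm_obfuscate(model, x, y_true, epsilon=0.01):
    # One-step FGSM as in Equation (2): move every pixel by epsilon in the
    # direction that increases the classification loss for the true label.
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.categorical_crossentropy(y_true, model(x))
    grad = tape.gradient(loss, x)
    x_adv = x + epsilon * tf.sign(grad)
    # Keep the result a valid image (pixels assumed in [0, 1]).
    return tf.clip_by_value(x_adv, 0.0, 1.0)
```

The single gradient computation is what makes FGSM the fastest of the attacks considered in this analysis (see Table 7).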
Subsequently, Kurakin et al. [37] proposed an iterative version of FGSM, namely IFGSM, where the difference from the original version is that the optimization step is performed multiple times with a small step size, recomputing the gradient at each iterate, as follows:
\begin{equation} \begin{aligned}x^{\prime }_0 = x \\ x^{\prime }_{N+1} = Clip_{x,\epsilon }\lbrace x^{\prime }_N + \alpha \cdot sign(\nabla _{x} Loss_{F,t}(x^{\prime }_N))\rbrace . \end{aligned} \end{equation}
(3)
The function \(Clip_{x,\epsilon }\lbrace x^{\prime }\rbrace\) , defined in Equation (4), performs a per-pixel clipping, limiting the value of each pixel of the adversarial example \(x^{\prime }\) so that the latter is in the \(\epsilon\) neighborhood of the original image x:
\begin{equation} \begin{aligned}Clip_{x,\epsilon }\lbrace x^{\prime }\rbrace (k,m,i) = \min \lbrace 255,\ x(k,m,i) + \epsilon , \\ \max \lbrace 0,\ x(k,m,i) - \epsilon ,\ x^{\prime }(k,m,i) \rbrace \rbrace . \end{aligned} \end{equation}
(4)
The PGD method used in our experiments is similar to IFGSM, with the difference that \(x^{\prime }\) is initialized not to x but to a random point within the \(\epsilon\) -ball defined by the \(L_{\infty }\) norm.
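A minimal sketch of this iterative scheme is reported below, again in TensorFlow 2 eager style; the random start implements the PGD initialization, while the per-step clipping plays the role of Equation (4). The values of epsilon, alpha, and steps are illustrative defaults, not the exact parameters listed in Section 3.2.

```python
import tensorflow as tf

def pgd_obfuscate(model, x, y_true, epsilon=0.03, alpha=0.005, steps=40):
    # Iterative FGSM with a random start (PGD): small signed-gradient steps,
    # each followed by a projection back into the L_inf epsilon-ball.
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    x_adv = x + tf.random.uniform(tf.shape(x), -epsilon, epsilon)
    x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = tf.keras.losses.categorical_crossentropy(y_true, model(x_adv))
        grad = tape.gradient(loss, x_adv)
        x_adv = x_adv + alpha * tf.sign(grad)
        # Per-pixel clipping as in Equation (4), plus the valid pixel range.
        x_adv = tf.clip_by_value(x_adv, x - epsilon, x + epsilon)
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    return x_adv
```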
DeepFool [51] is a method designed for untargeted attacks using the \(L_2\) norm. Differently from the other methods, it assumes that the neural network is completely linear, so that the decision regions of each class are separated by hyperplanes. Under this assumption, the method solves a simplified optimization problem; however, since the network is not actually linear, the resulting point is generally not yet adversarial, as it only moves closer to the real solution. The method therefore repeats the process iteratively and stops the search when a real adversarial example is found. We suggest that interested readers refer to the original paper for the mathematical formulation.
More recently, Carlini and Wagner [14] have proposed C&W, an \(L_2\) norm-based approach with a slightly different formulation to deal with the non-linearity of the F function in Equation (1). The authors define an objective function f such that \(F(x^{\prime }) \ne F(x)\) if and only if \(f(x^{\prime }) \lt 0\) ; they also propose a list of possible functions to be used for this purpose. The alternative formulation is the following:
\begin{equation} \begin{aligned}\min \quad ||x - x^{\prime }||^2 + c \cdot f(x^{\prime })\\ \text {s.t.} \quad x^{\prime } \in [0,1]^n, \end{aligned} \end{equation}
(5)
where \(c \gt 0\) is a suitably chosen constant. For the sake of clarity, in Figure 1 we show the effect of the aforementioned adversarial attacks.
Fig. 1.
Fig. 1. Examples of obfuscation patterns obtained exploiting the adversarial attacks considered in the analysis; the noise patterns have been generated using the parameters and the constraints discussed in Section 3.2. The image (a) is the original face, then from left to right, the noise added by (b) FGSM, (c) PGD, (d) DeepFool, and (e) C&W. The noise intensity has been increased by a factor of 80 to make it clearly visible.
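To make the reformulation in Equation (5) concrete, the sketch below evaluates the untargeted C&W objective using one of the f functions proposed in [14], based on the margin between the logit of the original class and the best competing logit; logits_model is assumed to expose the pre-softmax outputs of the classifier, and the constant c and the margin kappa are illustrative.

```python
import tensorflow as tf

def cw_untargeted_objective(logits_model, x, x_adv, y_onehot, c=1.0, kappa=0.0):
    # Objective of Equation (5) with f(x') = max(Z(x')_t - max_{i!=t} Z(x')_i, -kappa),
    # where Z are the pre-softmax logits and t is the original label.
    l2 = tf.reduce_sum(tf.square(x_adv - x), axis=[1, 2, 3])
    z = logits_model(x_adv)
    true_logit = tf.reduce_sum(y_onehot * z, axis=1)
    # Best competing logit: mask the true class with a large negative value.
    other_logit = tf.reduce_max(z - y_onehot * 1e9, axis=1)
    f = tf.maximum(true_logit - other_logit, -kappa)
    return l2 + c * f
```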

2.3 Defenses against Adversarial Attacks

The defense of a neural network against adversarial attacks is an open and challenging problem and is a main focus of the research on adversarial machine learning methods. As for many problems in cybersecurity, there does not exist a definitive defense capable of preventing every possible threat; therefore, several approaches have been proposed in recent years to counter the most common adversarial attacks [12]. In our analysis we have not considered all the possible defense methods, since that would have been out of the scope of this article; rather, we have selected the most general and widely used approaches to make neural networks more robust against adversarial noise patterns.
In particular, adversarial training [24, 32, 54] aims at making a neural network more robust by adding adversarial examples, generated by the attacks whose effects have to be mitigated, to the training set. The procedure is formulated as a min-max game [24] (see Equation (6)):
\begin{equation} \min _{\theta } \max _{D(x,x^{\prime })\lt \eta }{J(\theta ,x^{\prime },y).} \end{equation}
(6)
On one hand, the inner maximization has the purpose of finding the most effective adversarial examples according to the attack loss J, the network weights \(\theta\) , and the distance metric D; on the other hand, the minimization represents the standard training procedure that fine-tunes the model to lower the value of the neural network loss J. Of course, this training process does not ensure the immunity of the neural network against unseen adversarial examples; the process can be repeated multiple times, considering either the same or different attacks. The process may significantly affect the accuracy on the original dataset; therefore, while using adversarial training, it is important to monitor the tradeoff between robustness and accuracy [12, 13].
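As a minimal sketch of Equation (6), the following training step approximates the inner maximization with a single FGSM step and performs the outer minimization on a batch that mixes clean and adversarial samples; the single-step approximation, the 50/50 mixing, and the [0, 1] pixel range are assumptions made for illustration, since the actual fine-tuning procedure is detailed in Section 3.3.

```python
import tensorflow as tf

def adversarial_training_step(model, optimizer, x, y, epsilon=0.01):
    # Inner maximization: craft adversarial versions of the batch with one
    # FGSM step (a cheap approximation of the max over D(x, x') < eta).
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.categorical_crossentropy(y, model(x))
    x_adv = tf.clip_by_value(x + epsilon * tf.sign(tape.gradient(loss, x)), 0.0, 1.0)

    # Outer minimization: update the weights on clean + adversarial samples.
    x_mix = tf.concat([x, x_adv], axis=0)
    y_mix = tf.concat([y, y], axis=0)
    with tf.GradientTape() as tape:
        batch_loss = tf.reduce_mean(
            tf.keras.losses.categorical_crossentropy(y_mix, model(x_mix)))
    grads = tape.gradient(batch_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return batch_loss
```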
Adversarial examples are noisy inputs, and it is natural to deal with them as such; to this purpose, autoencoders [69] have been demonstrated to be particularly useful as denoising methods against adversarial noise in different recent papers [50, 59, 65, 70, 72]. They are self-supervised neural networks designed to learn an effective representation of the input data over an embedding space named the latent space. The representation is obtained by coupling two neural networks: the encoder, which projects the input data into the latent space, and the decoder, which maps the latent space back to the original input space. This network is trained by minimizing a reconstruction error that measures the distance between input and reconstructed samples. If the latent space has at least the same dimension as the input space, then the autoencoder would learn an approximation of the identity function; if the latent space is smaller, then the autoencoder learns a compressed representation of the input samples. Therefore, it is possible to train autoencoders using both the original samples and their noisy versions to learn a latent representation that reduces the reconstruction error only with respect to the clean samples, thus making the network able to reconstruct the obfuscated samples as close as possible to the corresponding clean ones. To this aim, the mean squared error (MSE) between the reference clean sample y and the reconstructed one \(\hat{y}\) is commonly used as the loss function.
\begin{equation} D_{KL}(P||Q) = \sum _{x \in X}P(x) \log {\frac{P(x)}{Q(x)}} \end{equation}
(7)
In [70] the authors have proposed a different approach to realize a denoising autoencoder; instead of minimizing the reconstruction error, the autoencoder tries to make the probability distribution of the output of the CNN as similar as possible when the CNN is fed with the original images and with the reconstructed ones. To this aim, the approach uses as loss function the Kullback–Leibler (KL) divergence \(D_{KL}\) (see Equation (7)), a measure of dissimilarity between two probability distributions \(P(x)\) and \(Q(x)\) . The big advantage of this method is that it does not need adversarial samples during the training, so it is independent of the adversarial method used, but it requires the target model in the training loop, as shown in Figures 2 and 3. In Figure 4, we show the difference between a face reconstructed by a traditional denoising autoencoder and a KL autoencoder.
Fig. 2.
Fig. 2. Basic architecture of a system using a denoising autoencoder. In (a) we show the classification pipeline: the image is processed by the autoencoder, which aims at reconstructing the clean version of the input on the basis of its latent representation. In (b) we show the training pipeline, where the autoencoder is trained to learn a representation able to reconstruct the clean image from a noisy one.
Fig. 3.
Fig. 3. Architecture of a KL autoencoder used for denoising. Differently from the classic autoencoder, the training process takes into account the output probabilities of the softmax function and tries to minimize the distance between the two output distributions.
Fig. 4.
Fig. 4. Examples of output images produced by a traditional denoising autoencoder (a) and a KL autoencoder (b). It is possible to note that, even in case (a), the reconstructed image could be significantly different from the original.
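A minimal sketch of the KL training objective described above is reported below: the target CNN (kept frozen) is run on both the input image and its reconstruction, and Equation (7) is computed between the two softmax outputs. The small epsilon added for numerical stability and the use of clean images as input are assumptions made for illustration.

```python
import tensorflow as tf

def kl_autoencoder_loss(target_cnn, autoencoder, x, eps=1e-7):
    # The target CNN is kept frozen; only the autoencoder is trained so that
    # the CNN's softmax output on the reconstruction matches the one on the
    # input image, i.e., Equation (7) is minimized.
    x_rec = autoencoder(x)
    p = target_cnn(x)        # probabilities on the original image
    q = target_cnn(x_rec)    # probabilities on the reconstruction
    kl = tf.reduce_sum(p * tf.math.log((p + eps) / (q + eps)), axis=-1)
    return tf.reduce_mean(kl)
```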

3 Experimental Framework

In this section we discuss in more detail the experimental setup and how we have prepared the networks, the obfuscation methods, and the defenses to conduct the analysis. All the experiments reported in the following sections have been performed on a workstation equipped with an Intel i7-3770S, 32 GB of RAM, and an NVIDIA TITAN Xp with 12 GB of memory; the software platform is Ubuntu 18.04.6 LTS with Tensorflow 1.15.2, Keras 2.3.1, and CUDA 10.1.

3.1 CNNs for Facial Soft Biometrics Recognition

All the CNNs for the recognition of facial soft biometrics have been trained on VGGFace2 using the original labels and those provided by the extensions VMER and VMAGE for ethnicity and age, respectively. The overall training set is thus composed of 8,631 identities and more than 3.1 million images. The base accuracy values have been computed on the test set of VGGFace2, containing 500 identities and around 170,000 images, for the gender and ethnicity recognition tasks, and on the whole Adience dataset, including 26,580 face images, for the age group classification task.
Some of the CNNs used in the analysis were already available pre-trained on the mentioned training set. In particular, the authors of [26] have publicly shared the pre-trained weights of the three CNN models for age group classification. Likewise, in [10, 25], the authors have published pre-trained weights for VGG-16 and SENet, respectively, specifically for gender recognition tasks. These pre-trained networks are very suitable for the purpose of our analysis; indeed, they have demonstrated state-of-the-art accuracy and have been validated to exhibit robustness against common corruptions typically encountered in real-world scenarios. Therefore, to conduct our experiments, we had to train the neural networks for ethnicity recognition and MobileNetV3 for gender recognition by following procedures similar to those adopted for the other pre-trained models. We exploited a Single Shot Detector (SSD) based on the CNN ResNet-10 to obtain the crop of the single face that is present in each of the VGGFace2 images. As the crop can have a rectangular shape but the CNNs expect a \(224\times 224\) pixels input, we applied padding to ensure that the face is consistently centered within the box. Additionally, we aimed for the face to occupy an average of \(80\%\) of the input image, as suggested in [34]. It is worth pointing out that considering images with a single person represents the worst case for the user, since it eliminates the chance that the opponent misses the face, and it also allows us to neglect the error of the detector in our analysis.
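For illustration, the sketch below shows one possible way to obtain the padded, centered \(224\times 224\) crop from a detector bounding box; the box format, the zero-padding policy, and the use of OpenCV are assumptions, as the text above only specifies that the face is centered and covers about \(80\%\) of the input image on average.

```python
import cv2
import numpy as np

def crop_face(image, box, out_size=224, face_ratio=0.80):
    # box = (x, y, w, h) returned by the SSD face detector; enlarge it to a
    # square so that the face covers roughly face_ratio of the final crop.
    x, y, w, h = box
    side = int(max(w, h) / face_ratio)
    x0 = x + w // 2 - side // 2
    y0 = y + h // 2 - side // 2
    # Zero-pad when the enlarged square exceeds the image borders, so that
    # the face stays centered within the box.
    canvas = np.zeros((side, side, 3), dtype=image.dtype)
    sx0, sy0 = max(0, x0), max(0, y0)
    sx1 = min(image.shape[1], x0 + side)
    sy1 = min(image.shape[0], y0 + side)
    canvas[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = image[sy0:sy1, sx0:sx1]
    return cv2.resize(canvas, (out_size, out_size))
```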
Following the training procedure adopted for the other CNNs, data augmentation techniques have been used to enhance the robustness of the CNNs against common corruptions. The variations applied have been the following:
(1)
Random rotation: The angle has been sampled from the range \([-10^{\circ },10^{\circ }]\) .
(2)
Random change of bounding box or shift: This variation aims at simulating errors related to the face detector; the effect is similar to a random crop, causing the face to not be perfectly centered in the box.
(3)
Random brightness: To simulate overexposure and underexposure, we have randomly changed the brightness of the original image in the range \([-30\%, 30\%]\) of the pixel intensity.
(4)
Random horizontal flip.
For both the random variations (1) and (2) we have considered a zero-mean normal distribution with standard deviations equal to \(10^{\circ }\) and \(2.5\%\) of the bounding box width, respectively. Furthermore, during the augmentation, we have used a pseudo-random procedure to apply two or more variations together to make the process more effective.
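The following sketch illustrates one possible implementation of the augmentation pipeline described above; combining the clipped range of variation (1) with the normal distributions mentioned for variations (1) and (2) is our interpretation, and the OpenCV-based implementation details are assumptions.

```python
import cv2
import numpy as np

def augment(face, rng=np.random):
    h, w = face.shape[:2]
    # (1) Random rotation: zero-mean normal (std 10 degrees), clipped to [-10, 10].
    angle = float(np.clip(rng.normal(0.0, 10.0), -10.0, 10.0))
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    face = cv2.warpAffine(face, M, (w, h))
    # (2) Random shift: zero-mean normal with std equal to 2.5% of the box width.
    dx, dy = rng.normal(0.0, 0.025 * w, size=2)
    face = cv2.warpAffine(face, np.float32([[1, 0, dx], [0, 1, dy]]), (w, h))
    # (3) Random brightness in [-30%, +30%] of the pixel intensity range.
    face = np.clip(face.astype(np.float32) + rng.uniform(-0.3, 0.3) * 255.0, 0, 255)
    # (4) Random horizontal flip.
    if rng.rand() < 0.5:
        face = face[:, ::-1]
    return face.astype(np.uint8)
```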
After the data augmentation, the resulting images have been normalized by subtracting the average value of each color channel computed over all the images in the dataset. This normalization has the effect of zero-centering every channel and has been demonstrated to improve the convergence of the loss function [64].
Finally, the networks have been trained through Stochastic Gradient Descent (SGD) with a batch size equal to 128 for MobileNetV3 and 32 for VGG-16 and SENet. The training process started with a learning rate set to 0.005 with a decay factor of 0.2 every 20 epochs, to gradually adjust the learning rate. A weight decay of 0.05 has been used to prevent overfitting. Since all the tasks of interest are formulated as multi-class classification problems, we have used a categorical cross-entropy loss function.
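For reference, the sketch below shows how the described optimizer and learning rate schedule could be set up with Keras; the momentum value and the way the weight decay of 0.05 is applied (e.g., as L2 kernel regularization) are assumptions not specified above.

```python
from tensorflow import keras

def training_setup(model):
    # SGD with initial lr 0.005, decayed by a factor of 0.2 every 20 epochs,
    # and categorical cross-entropy loss; momentum is an assumption.
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.005, momentum=0.9),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    lr_schedule = keras.callbacks.LearningRateScheduler(
        lambda epoch: 0.005 * (0.2 ** (epoch // 20)))
    return model, lr_schedule

# Illustrative usage (batch size 128 for MobileNetV3, 32 for VGG-16 and SENet):
# model, lr_schedule = training_setup(model)
# model.fit(x_train, y_train, epochs=..., batch_size=32, callbacks=[lr_schedule])
```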

3.2 Setup of the Attacks to Obfuscate the Soft Biometrics

Assumption 2 in Section 1 forces a tradeoff between the effectiveness of the attacks and the amount of noise added to the image. Therefore, we do not expect to achieve the best result in terms of attack success rate. For the same reason, to avoid a degradation of the quality of the obfuscated image, we have empirically estimated the maximum amount of noise that an adversarial attack can add during the obfuscation and limited the effect of the latter by constraining the \(L_{\infty }\) and \(L_2\) norms to the values of 15 and 900, respectively. It is worth noting that the two constraints affect the noise on the output image in different ways: the \(L_{\infty }\) norm limits the maximum perturbation on each single pixel of the altered image, while the \(L_2\) norm limits the maximum noise over the whole obfuscated image. These norms are used by the attacks, during the optimization process, to measure the distance between the original and the obfuscated samples. Therefore, on the one hand, they affect the effectiveness of the attack, and on the other hand, they impact the quantity of noise added to the image.
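The attacks enforce these budgets inside their optimization loops; purely as an illustration of how the two norms act on the noise pattern, the sketch below applies them as a post-hoc projection, assuming pixel values in [0, 255].

```python
import numpy as np

def constrain_perturbation(x, x_adv, linf_max=15.0, l2_max=900.0):
    # Each pixel may change by at most linf_max, and the L2 norm of the whole
    # noise pattern may not exceed l2_max (pixel values assumed in [0, 255]).
    noise = np.clip(x_adv.astype(np.float64) - x, -linf_max, linf_max)
    l2 = np.linalg.norm(noise)
    if l2 > l2_max:
        noise *= l2_max / l2
    return np.clip(x + noise, 0, 255)
```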
To estimate the limits, samples have been prepared with different intensities of noise for each type of adversarial attack. These samples have been then evaluated by five people to determine the maximum level of noise that could be added before it became noticeable to most of them.
For each attack, the adversarial examples have been generated by obfuscating, respecting the constraints, the same faces used to assess the accuracy of the CNNs, using the following parameters:
FGSM with an \(\epsilon\) value equal to 0.01.
PGD has been iterated for a maximum of 40 steps with an \(\alpha\) and step size equal to 0.01 and 0.005, respectively, in the case of ethnicity and gender recognition, and to 0.007 and 0.007 for the age estimation task.
DeepFool with an overshoot of the boundary of 0.02 and 50 as iteration limit.
\(C\&W\) with learning rate of 0.02 and maximum 50 iterations.
In addition, we have taken into account the effect of the Additive White Gaussian Noise (AWGN) with unitary variance, applied to each color channel. Although it is not an adversarial attack, the accuracy of the CNN in the presence of random perturbations can be considered as a reference about its initial robustness; indeed, we expect that the CNNs used by an opponent are quite insensitive to slight AWGN, which is ascribable to low-quality sensors or other external factors, because it is usually applied to the training data as a data augmentation technique.
For the sake of clarity, in our experiments we have analyzed two distinct scenarios:
(1)
White box scenario: The user has generated the obfuscated images using the same network adopted by the opponent. While this scenario may not be realistic in practice, it serves as a baseline to evaluate the effectiveness of the obfuscation.
(2)
Black box scenario: The user has created the obfuscated images using a different CNN than the one of the opponent. This scenario is particularly significant as it represents the most realistic situation that a user can encounter in the real world. In this scenario, when obfuscating the images, the user is unaware of the specific network used to extract the soft biometrics. To address this scenario, we conducted a transferability analysis to determine whether an adversarial example crafted to fool a particular CNN could also be successful in deceiving a different network.

3.3 Setup of the Defenses

According to Assumption 4 in Section 1, an opponent can use countermeasures to reduce the effect of the obfuscation. In our experimental setup we have considered the defenses described in Section 2.3, i.e., the adversarial training and two different denoising networks. Similarly to the setup of the attacks, we have analyzed two different scenarios: white box and black box.
To evaluate the effectiveness of obfuscation against these defense methods, we have prepared the worst-case defense scenario that a user may face, under the hypotheses that (1) the opponent is using the CNN that achieves the best average accuracy over all three tasks, namely MobileNetV3, as clarified in Section 4.1, and (2) the opponent can select and use the most transferable adversarial attacks to generate the examples, i.e., PGD and FGSM, according to the results discussed in Section 4.2.
In the case of adversarial training, we did not need to train MobileNetV3 from scratch. Instead, we employed a fine-tuning process with the objective of enhancing its robustness in the presence of obfuscated images. For this purpose, we created three training sets by randomly extracting 750,000 samples from the original training set and generating adversarial examples using FGSM and PGD attacks against MobileNetV3. This process resulted in a set of 2.25 million images for each task, including a balance of clean, FGSM, and PGD samples, amounting to a total of 6.75 million images.
The training process was conducted following a similar procedure as described in Section 3.1. SGD was used as the optimizer, with an initial learning rate set to 0.001. A learning rate decay factor of 0.5 was applied every five epochs to adjust the learning rate during the training process.
We realized the denoising autoencoder from scratch. The architecture of the autoencoder includes three convolutional layers each for the encoding and decoding stages, with a fully connected layer of 100 neurons generating the latent vector. In total, the denoising autoencoder comprises 41.95 million parameters. The input size for these autoencoders matches that of the CNNs, namely \(224\times 224\) pixels. As for the KL autoencoder, it shares the same architecture as the denoising autoencoder; the difference lies in the loss function, which is based on the KL divergence.
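A sketch of such an autoencoder in Keras is reported below; the filter counts, kernel sizes, strides, and output activation are assumptions (only the number of convolutional layers, the 100-neuron latent vector, and the input size are stated above), so the resulting parameter count does not match the 41.95 million of the actual model.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_autoencoder(input_shape=(224, 224, 3), latent_dim=100):
    # Encoder: three strided convolutions followed by the 100-neuron latent layer.
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)
    shape_before = tuple(x.shape[1:])                          # (28, 28, 128)
    latent = layers.Dense(latent_dim, activation="relu")(layers.Flatten()(x))
    # Decoder: mirror of the encoder with transposed convolutions.
    x = layers.Dense(int(np.prod(shape_before)), activation="relu")(latent)
    x = layers.Reshape(shape_before)(x)
    x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="sigmoid")(x)
    return keras.Model(inputs, outputs)

# The denoising autoencoder is trained with an MSE loss between reconstructed and
# clean images; the KL autoencoder shares this architecture and uses Equation (7).
```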
Using a process similar to the adversarial training, we have prepared a set of 1.5 million images composed of both clean and obfuscated samples to train the autoencoders.
For the denoising autoencoder, the training procedure employed the Adam optimizer with an initial learning rate of 0.001. If the validation loss did not improve for three consecutive epochs, the learning rate was reduced by a factor of 0.2. The batch size used during training was set to 64 samples. The loss function was the mean squared error, calculated between the reconstructed images and the target images. The KL autoencoder also used the Adam optimizer with an initial learning rate of 0.001; however, its learning rate was decreased by a decay factor of 0.1 every 20 epochs.

4 Results

In this section we present the results of the experiments, organized to discuss the following aspects: (1) effectiveness of white box obfuscation, (2) effectiveness of black box obfuscation through the transferability analysis, (3) effectiveness in case of countermeasures, and (4) quality and time required to generate the obfuscated images.
For the sake of clarity, in all the tables, the base accuracy of the CNNs over each task is reported in terms of classification accuracy (CA), as defined in Equation (8). In the case of ethnicity recognition and age group classification, where the network provides probabilities for multiple classes, we have considered the class with the highest probability value as the predicted one.
\begin{equation} CA = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}} \end{equation}
(8)
On the other hand, the effectiveness of the obfuscation techniques has been assessed in terms of the drop of accuracy, which is calculated as the difference between the CA when predicting adversarial examples and the CA obtained on the original (clean) images.
Finally, we have computed the average time to generate an obfuscated image and the average noise in terms of \(L_{\infty }\) and \(L_2\) norms, in order to evaluate the suitability of the attacks according to Assumptions 2 and 5 in Section 1.

4.1 Effectiveness of White Box Obfuscation

In Table 2 we report the classification accuracy of each CNN on a specific task and the drop caused by the random noise and the adversarial attacks. First, it is important to note that in all the tables the adversarial attacks have been arranged, from top to bottom, in increasing order of expected effectiveness with respect to the quantity of noise added to the obfuscated image. Hence, FGSM and PGD typically need to generate images with a higher noise intensity than DeepFool and C&W to achieve effective samples.
Table 1.
CNN           Weights number   Input size
VGG16         138M             224x224
SENet         25.5M            224x224
MobileNetV3   5.4M             224x224
Table 1. Overview of the Convolutional Neural Networks Considered in the Analysis
Table 2.
Task                                   VGG16      SENet      MobileNetV3   Average
Gender Recognition        Clean CA     97.50%     97.30%     97.54%        97.45%
                          AWGN         -1.29%     -1.49%     -7.52%        -3.50%
                          FGSM         -30.53%    -49.45%    -60.40%       -46.80%
                          PGD          -61.04%    -62.11%    -44.73%       -55.96%
                          DeepFool     -24.84%    -36.23%    -9.76%        -23.61%
                          C&W          -59.86%    -79.03%    -45.46%       -61.45%
Ethnicity Recognition     Clean CA     92.68%     93.65%     94.12%        93.48%
                          AWGN         -2.17%     -1.96%     -8.09%        -4.07%
                          FGSM         -60.20%    -48.28%    -68.19%       -58.89%
                          PGD          -38.03%    -91.17%    -68.19%       -65.79%
                          DeepFool     -34.09%    -39.78%    -22.91%       -33.26%
                          C&W          -81.92%    -86.79%    -67.14%       -78.61%
Age Group Classification  Clean CA     58.80%     65.59%     55.83%        60.07%
                          AWGN         -6.59%     -7.01%     -10.15%       -7.91%
                          FGSM         -53.82%    -49.31%    -42.74%       -48.79%
                          PGD          -53.15%    -59.22%    -51.29%       -54.55%
                          DeepFool     -47.55%    -22.63%    -15.11%       -28.43%
                          C&W          -30.69%    -28.94%    -10.70%       -23.47%
Table 2. Classification Accuracy Achieved by VGG16, SENet, and MobileNetV3 in Gender Recognition, Ethnicity Recognition, and Age Group Classification
For each task, the first row reports the classification accuracy on the clean images (Clean CA), while the successive rows are the corresponding drops of accuracy caused by each obfuscation approach (AWGN, FGSM, PGD, DeepFool, C&W). In the last column we report the average of the values in each row.
Regarding the tasks, the age group classification is the most challenging one, not only due to the larger number of classes but also because of the inherent complexity of the task itself [10], which can be difficult even for humans.
However, this also implies that obfuscating the age is relatively simpler compared to gender recognition, which involves a binary classification with considerably less variability, as shown in Figure 5. This is evident from Table 2, in which the classification accuracy of all the CNNs on age group classification is about \(30\%\) lower than on the other tasks (on average, \(60.07\%\) for age group classification vs. \(97.45\%\) and \(93.48\%\) for gender and ethnicity recognition, respectively), and in most of the cases, the drop in accuracy makes the output of the network completely unreliable. Consider, just as an example, the PGD obfuscation ( \(-54.55\%\) ), which leaves an average accuracy for age group classification of only \(5.52\%\) . Similar results can be achieved on the other tasks using specific attacks: for instance, C&W and PGD make the gender and ethnicity provided by all of the CNNs untrustworthy (see Figure 5(a)).
Fig. 5.
Fig. 5. Effectiveness of the adversarial attacks in terms of classification accuracy in white box and black box scenarios on the three tasks. The obfuscation is effective in both the scenarios, but the knowledge of the target CNN allows the white box attacks to have a higher impact.
Focusing on the impact of the AWGN, the results reveal that all the CNNs demonstrate sufficient robustness against such a perturbation, despite not being explicitly trained to handle it. Among the considered CNNs, MobileNetV3 shows the highest sensitivity to AWGN, resulting in an average drop of \(8.5\%\) in accuracy across all tasks. Although this drop in accuracy for MobileNetV3 may seem notable, it is important to consider that the values reported in Table 2 have been obtained by introducing a substantial amount of noise. The average \(L_2\) norm of the noise used was 14,517, which is two orders of magnitude higher than what has been required for the adversarial attacks. This suggests that the CNNs maintain a good level of accuracy even in the presence of such a high level of random noise.
Contrary to what we pointed out for AWGN, the results in Table 2 demonstrate that all the considered attacks are effective in fooling the prediction of neural networks that exhibit state-of-the-art performance. Indeed, even the least effective approach, DeepFool, managed to achieve an average drop in accuracy of at least \(23\%\) across all tasks. Remarkably, despite the results obtained by advanced methods such as C&W for gender and ethnicity recognition, we can note the effectiveness of simpler approaches such as FGSM and PGD. In particular, PGD caused an average drop in accuracy of \(58.77\%\) over all the tasks.

4.2 Effectiveness of Black Box Obfuscation: Transferability Analysis

Although the results achieved in the white box scenario are remarkable, they can be considered as the best case, since the user cannot be fully aware of the CNN employed by an opponent. The transferability analysis provides a measure of the generalization capability of an attack with respect to the target network. To perform this analysis, we generated obfuscated images targeted to fool a specific CNN and evaluated the effect of the same attack on the other CNNs. In Table 3 we show the results of the transferability experiments.
Table 3.
Task                                   VGG16                        SENet                        MobileNetV3                  Average
                                       SENet        MobileNetV3     VGG16        MobileNetV3     SENet        VGG16
Gender Recognition        Clean CA     97.30%       97.54%          97.50%       97.54%          97.30%       97.50%          97.45%
                          FGSM         -6.11%       -7.17%          -6.81%       -10.24%         -3.14%       -2.59%          -6.01%
                          PGD          -5.44%       -6.72%          -5.80%       -8.84%          -1.93%       -1.68%          -5.07%
                          DeepFool     -1.23%       -0.54%          -1.25%       -2.18%          -0.44%       -0.38%          -1.00%
                          C&W          -2.99%       -3.50%          -3.53%       -5.84%          -0.85%       -0.79%          -2.92%
Ethnicity Recognition     Clean CA     93.65%       94.12%          92.68%       94.12%          93.65%       92.68%          93.48%
                          FGSM         -12.20%      -10.65%         -6.45%       -20.05%         -10.07%      -2.49%          -10.31%
                          PGD          -15.30%      -13.98%         -7.34%       -24.22%         -10.07%      -2.49%          -12.23%
                          DeepFool     -3.34%       -3.06%          -0.96%       -3.71%          -1.24%       -0.45%          -2.13%
                          C&W          -16.33%      -13.54%         -2.32%       -9.39%          -3.87%       -0.99%          -7.74%
Age Group Classification  Clean CA     65.59%       55.83%          58.80%       55.83%          65.59%       58.80%          60.07%
                          FGSM         -33.13%      -30.89%         -20.36%      -19.52%         -15.96%      -16.16%         -22.67%
                          PGD          -47.38%      -45.07%         -20.52%      -21.19%         -23.53%      -27.66%         -30.89%
                          DeepFool     -47.55%      -11.67%         -4.80%       -4.54%          -1.99%       -1.39%          -11.99%
                          C&W          -19.55%      -19.13%         -9.32%       -8.80%          -3.80%       -4.87%          -10.91%
Table 3. Results of the Transferability Analysis
For each attack, the top header reports the CNN used to craft the obfuscated images (the original target network), while the secondary header contains the CNNs against which the transferability has been evaluated. For each task, the first row reports the classification accuracy of the evaluated CNN on the clean images (Clean CA), while the successive rows are the corresponding drops of accuracy caused by each obfuscation approach (FGSM, PGD, DeepFool, C&W). In the last column we report the average of the values in each row.
The more independent the attack is with respect to the target network, the more general it is expected to be. Indeed, as gradient-based approaches, such as FGSM and PGD, do not strongly depend on the attacked CNNs, they are able to achieve a higher transferability if compared to DeepFool and C&W, which are designed to generate noise patterns more effectively and less perceivably, but more specialized on the target network [12, 14]. Our analysis confirms this observation, as PGD attained the best performance across the three tasks, resulting in an average drop of approximately \(16.09\%\) , followed by FGSM with an average drop of \(13.03\%\) .
A notable outcome is the fact that the attacks have proved to be transferable over all the CNNs, despite their distinct architectures, as shown in Figure 5(b). This means that the effectiveness of the obfuscated images is not limited to specific network architectures and instead demonstrated a general capability to mislead multiple CNNs.
Finally, it is worth noting that even without knowing the network used by an opponent to extract the age, using a gradient-based method, it is possible to make the output unreliable; in fact, by using PGD, a user can cause an average loss of accuracy of \(30.89\%\) . On the other hand, as for the white box scenario, gender is the hardest soft-biometric feature to obfuscate.

4.3 Effectiveness of Adversarial Defenses

As introduced in Section 3.3, we have taken into account two different scenarios: white box and black box. The results of these experiments are reported in Table 4 and Table 5, respectively.
Table 4.
Task                                   No Defense    Adv Training    Denoising AE    KL AE
Gender Recognition        Clean CA     97.54%        -1.54%          -3.82%          -2.71%
                          FGSM         -60.40%       +0.04%          -4.00%          -3.84%
                          PGD          -44.73%       +0.21%          -3.97%          -3.48%
                          DeepFool     -9.76%        -0.95%          -3.88%          -2.96%
                          C&W          -45.46%       -0.56%          -3.95%          -3.19%
Ethnicity Recognition     Clean CA     94.12%        -9.34%          -7.63%          -7.09%
                          FGSM         -68.19%       +0.78%          -8.43%          -10.04%
                          PGD          -68.19%       +0.86%          -8.30%          -9.95%
                          DeepFool     -22.91%       -6.08%          -8.14%          -7.91%
                          C&W          -67.14%       -4.58%          -8.46%          -8.32%
Age Group Classification  Clean CA     55.83%        -6.14%          -12.20%         -10.92%
                          FGSM         -42.74%       -11.10%         -12.28%         -12.43%
                          PGD          -51.29%       -15.28%         -12.34%         -14.71%
                          DeepFool     -15.11%       -6.24%          -12.25%         -11.21%
                          C&W          -10.70%       -6.50%          -11.89%         -11.55%
Table 4. Classification Accuracy Achieved by the Original MobileNetV3 on Obfuscated Images before (No Defense) and after Denoising (Denoising AE and KL AE) and by the Adversarially Trained (Adv Training) MobileNetV3 on Obfuscated Images in the White Box Scenario (by Using MobileNetV3 as a Target)
With all the defenses, there is a performance drop on both clean and obfuscated images; the drop on gender recognition CA is more limited than on the other tasks.
Table 5.
Task                                   No Defense    Adv Training    Denoising AE    KL AE
Gender Recognition        Clean CA     97.54%        -1.54%          -3.82%          -2.71%
                          FGSM         -8.71%        -4.30%          -4.13%          -5.50%
                          PGD          -7.78%        -3.98%          -4.12%          -4.82%
Ethnicity Recognition     Clean CA     94.12%        -9.34%          -7.63%          -7.09%
                          FGSM         -15.35%       -25.17%         -8.79%          -12.83%
                          PGD          -20.10%       -24.29%         -8.81%          -12.06%
Age Group Classification  Clean CA     55.83%        -6.14%          -12.20%         -10.92%
                          FGSM         -25.21%       -12.61%         -12.42%         -13.94%
                          PGD          -32.13%       -11.03%         -12.40%         -14.35%
Table 5. Classification Accuracy Achieved by the Original MobileNetV3 on Obfuscated Images before (No Defense) and after Denoising (Denoising AE and KL AE) and by the Adversarially Trained (Adv Training) MobileNetV3 on Obfuscated Images in the Black Box Scenario (by Using VGG16 and SENet as a Target)
The results are comparable with the ones obtained for the white box scenario, except for adversarial training, which suffers more from not knowing the exact attack configuration adopted by the user.
The first relevant result is that the enhancement of robustness against obfuscation often comes at the cost of a loss in accuracy on the clean samples. This drawback is observed across all the considered defenses, as shown in Figures 6 and 7. By observing the results of the adversarial training in Table 4, a drop in accuracy ranging from \(1.54\%\) to \(9.34\%\) compared to the original network is evident. Similarly, when considering the system with a denoising stage, both the autoencoders lead to a decrease in accuracy, with the denoising autoencoder causing a drop ranging from \(3.82\%\) to \(12.20\%\) , and the KL autoencoder resulting in a decrease of \(2.71\%\) to \(10.92\%\) . Furthermore, the higher the complexity of the task, the lower the effectiveness of such countermeasures; this is evident by comparing the drop in gender recognition against the one in age group classification.
Fig. 6.
Fig. 6. Effectiveness of the defenses in terms of classification accuracy in the white box scenario on the three tasks. The figure shows the accuracy of MobileNetV3 in the following cases: without defense (blue), after the adversarial training (orange), and using the denoising autoencoder (gray) and the KL autoencoder (yellow).
Fig. 7.
Fig. 7. Effectiveness of the defenses in terms of classification accuracy in the black box scenario on the three tasks. The figure shows the accuracy of MobileNetV3 in the following cases: without defense (blue), after the adversarial training (orange), and using the denoising autoencoder (gray) and the KL autoencoder (yellow).
In the white box scenario, all the approaches are able to partially prevent the loss of accuracy caused by the obfuscation (see Figure 6). Among them, adversarial training is the most effective countermeasure, particularly for the gender and ethnicity recognition tasks; in the cases of FGSM and PGD, it even leads to a slight improvement in the classification accuracy. This is an expected outcome, since the network has been retrained to properly recognize obfuscated images, albeit at the cost of some loss of accuracy on clean images. On the other hand, for the same reason, this defense is less effective in the black box scenario, where the obfuscated images have been generated using CNNs different from MobileNetV3, namely SENet and VGG16. Differently from the adversarial training, in the black box scenario (see Figure 7), the denoising autoencoder and the KL autoencoder maintain their performance, demonstrating the capability to generalize with respect to the specific neural network used to generate the adversarial samples.
Finally, when comparing the results of the black box scenario in Table 3 and Table 5, with and without the defenses, respectively, it becomes undeniable that the benefits provided by the defenses do not entirely compensate for the loss of accuracy on clean images. As a result, the user can definitely exploit the adversarial attacks even when the opponent employs countermeasures.

4.4 Obfuscated Image Quality and Obfuscation Time

Finally, it is worth discussing some additional results regarding the average perturbation and the time required to generate an obfuscated image, which demonstrate the effectiveness and the suitability of the proposed solution for the problem at hand.
In more detail, Table 6 shows the average perturbation required by each attack to achieve the results discussed in the previous sections. It is important to note that all the attacks remain quite distant from the noise constraints introduced in Section 3.2; in the worst case, the \(L_2\) and \(L_{\infty }\) norms are still \(10.33\%\) and \(22.73\%\) below the respective thresholds. Furthermore, the quantity of noise added to the image by FGSM and PGD is comparable with that added by DeepFool and C&W. The results demonstrate that the proposed solution meets the maximum perturbation requirements and achieves the desired obfuscation results while maintaining higher-than-expected image quality.
Table 6.
Attack      Gender              Ethnicity           Age                 Average
            L2       L_inf      L2       L_inf      L2       L_inf      L2       L_inf
FGSM        795.93   10.95      864.00   11.00      762.34   10.84      807.42   10.93
PGD         692.74   10.50      852.00   11.00      523.67   11.04      689.47   10.85
DeepFool    644.87   12.92      600.67   12.00      606.34   9.86       617.30   11.59
C&W         641.34   11.72      728.34   11.67      785.00   10.40      718.23   11.26
Table 6. Average Level of Noise Added by Each Type of Adversarial Attack
The perturbation is significantly lower than the thresholds imposed as constraints, namely \(L_2=900\) and \(L_{\infty }=15\) ; this means that the image quality has been preserved more than expected.
Regarding the average time to generate an obfuscated image, reported in Table 7, the worst case is C&W, which requires 3.93 seconds. Even if this time is acceptable for a user, a noteworthy result is that FGSM and PGD, the most general attacks, require less than a second to generate very effective obfuscated examples; also, DeepFool requires only 0.21 seconds for the generation of obfuscated images. We can conclude that the proposed approach allows a user to effectively obfuscate the soft biometrics in a very short time.
Table 7.
Attack      VGG16   SENet   MobileNetV3   Average
FGSM        0.10    0.11    0.07          0.09
PGD         0.21    0.26    0.08          0.18
DeepFool    0.20    0.20    0.22          0.21
C&W         4.63    4.08    3.09          3.93
Table 7. Average Time Required to Generate an Obfuscated Image in Seconds
The time required is acceptable for the user, being less than 1 second in most of the cases and around 4 seconds in the worst case.

5 Conclusions

Can users protect their privacy while sharing images on social media or similar services? This question is not easy to answer, because in most real-world cases users are aware neither of the threats to their privacy nor of the possible countermeasures. In this article we have analyzed the possibility of using adversarial methodologies, originally designed to evade neural networks by forcing them to produce wrong predictions, to let users obfuscate facial soft-biometric features such as age, gender, and ethnicity in their pictures. The analysis has considered the different challenges that users have to deal with, among them the fact that they do not know how the service is implemented or whether it includes defenses against adversarial attacks. From the results of the proposed analysis, we can conclude that (1) a user can easily find pre-trained, ready-to-use CNNs that achieve state-of-the-art accuracy and employ them as targets to obfuscate soft biometrics; (2) adversarial machine learning methods, properly modified and configured to limit the amount of noise added to the image, are highly effective even in black box scenarios, as shown by the transferability analysis, while preserving a satisfying image quality; (3) the time required to generate obfuscated images is negligible for most attacks and acceptable for a user even with the most time-demanding approaches; and (4) the obfuscation techniques are robust to the most common countermeasures exploitable by an opponent, who cannot fully cancel the effect of the obfuscation and pays a non-negligible performance drop on clean data. Therefore, we conclude that the proposed solution allows users to effectively obfuscate their facial soft biometrics in images shared on social media.
