Oriole: Thwarting Privacy against Trustworthy Deep Learning Models
Liuqiao Chen, Hu Wang, Benjamin Zi Hao Zhao, Minhui Xue, Haifeng Qian
East China Normal University, Shanghai, China; The University of Adelaide, Adelaide, Australia; The University of New South Wales and Data61-CSIRO, Australia
[email protected]
Abstract.
Deep Neural Networks have achieved unprecedented success in the field of face recognition, such that any individual can crawl the data of others from the Internet without their explicit permission for the purpose of training high-precision face recognition models, creating a serious violation of privacy. Recently, a well-known system named Fawkes [34] (published in USENIX Security 2020) claimed this privacy threat can be neutralized by uploading cloaked user images instead of their original images. In this paper, we present Oriole, a system that combines the advantages of data poisoning attacks and evasion attacks, to thwart the protection offered by Fawkes, by training the attacker's face recognition model with multi-cloaked images generated by Oriole. Consequently, the face recognition accuracy of the attack model is maintained and the weaknesses of Fawkes are revealed. Experimental results show that our proposed Oriole system is able to effectively interfere with the performance of the Fawkes system to achieve promising attacking results. Our ablation study highlights multiple principal factors that affect the performance of the Oriole system, including the DSSIM perturbation budget, the ratio of leaked clean user images, and the number of multi-cloaks for each uncloaked image. We also identify and discuss at length the vulnerabilities of Fawkes. We hope that the new methodology presented in this paper will inform the security community of a need to design more robust privacy-preserving deep learning models.
Keywords:
Data poisoning · Deep learning privacy · Facial Recognition · Multi-cloaks
1 Introduction

Facial recognition is one of the most important biometrics of mankind and is frequently used in daily human communication [1]. Facial recognition, as an emerging technology composed of detection, capture, and matching, has been successfully adapted to various fields: photography [30], video surveillance [3], and mobile payments [38]. With the tremendous success gained by deep learning techniques, current deep neural facial recognition models map an individual's biometric information into a feature space and store it as faceprints. Consequently, features of a live captured image are extracted for comparison with the stored faceprints. Currently, many prominent vendors offer high-quality facial recognition tools or services, including NEC [28], Aware [2], Google [15], and Face++ [11] (a Chinese tech giant, Megvii). According to the industry research report "Market Analysis Report" [31], the global facial recognition market was valued in the billions of dollars.

In this paper, we present Oriole, a system designed to render the Fawkes system ineffective. In Fawkes, the target class is selected from the public dataset. In contrast, Oriole implements a white-box attack to artificially choose multiple targets and acquire the corresponding multiple cloaked images of leaked user photos. With the help of the proposed multi-cloaks, the protection of Fawkes becomes fragile. To do so, the attacker utilizes the multi-cloaks to train the face recognition model. During the test phase, after the original user images are collected, the attacker inputs the Fawkes-cloaked image into the model for face recognition. As a result, in the feature space, the features of cloaked photos will inevitably fall into the range of marked multi-cloaks. Therefore, the user images can still be recognized even if they are cloaked by Fawkes. We also highlight the intrinsic weakness of Fawkes: the imperceptibility of images before and after cloaking is limited when encountering high-resolution images, as cloaked images may include spots, acne, and even disfigurement. This will result in the reluctance of users to upload their disfigured photos.

In summary, our main contributions in this paper are as follows:

– The Proposal of Oriole.
We design, implement, and evaluate
Oriole, a neural-based system that makes attack models indifferent to the protection of Fawkes. Specifically, in the training phase, we produce the most relevant multi-cloaks according to the leaked user photos and mix them into the training data to obtain a face recognition model. During the testing phase, when encountering uncloaked images, we first cloak them with Fawkes and then feed them into the attack model. By doing so, the user images can still be recognized even if they are protected by Fawkes.

Fig. 1. The differences between data poisoning attacks and decision-time attacks. Data poisoning attacks modify the training data before the model training process. In contrast, decision-time attacks are performed after model training to induce the model to make erroneous predictions.

– Empirical Results.
We provide experimental results to show the effectiveness of Oriole in interfering with Fawkes. We also identify multiple principal factors that affect the performance of the Oriole system, including the DSSIM perturbation budget, the ratio of leaked clean user images, and the number of multi-cloaks for each uncloaked image. Furthermore, we identify and discuss at length the intrinsic vulnerability of Fawkes when dealing with high-resolution images.
2 Background

In this section, we briefly introduce defense strategies against data poisoning attacks and decision-time attacks. Figure 1 highlights the differences between data poisoning attacks and decision-time attacks. We then introduce white-box attacks. The Fawkes system is detailed at the end of this section.
2.1 Data Poisoning Attacks

In the scenario of data poisoning attacks, the model's decision boundary is shifted by the injection of adversarial data points into the training set: the adversary deliberately manipulates the training data so that the added poisoned data has a vastly different distribution from the original training data. Prior research primarily involves two common defense strategies.
First, anomaly detection models [40] function efficiently if the injected data has obvious differences compared to the original training data. Unfortunately, anomaly detection models become ineffective if the adversarial examples are inconspicuous. Similar ideas have been utilized in digital watermarking and data hiding [45]. Second, it is common to analyze the impact of newly added training samples on the accuracy of the model. For example, Reject On Negative Impact (RONI) was proposed against spam filter poisoning attacks, while target-aware RONI (tRONI) builds on the observation that RONI fails to mitigate targeted attacks [35]. Other notable methods include TRIM [22], STRIP [13], and, more simply, human analysis of training data likely to be attacked [26].
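To make the RONI defense concrete, the following is a minimal sketch under our own assumptions (the `train_fn`/`eval_fn` callables and the `tolerance` threshold are placeholders, not the implementation from [35]): a candidate sample is kept only if adding it does not reduce validation accuracy.

```python
from typing import Callable, List, Sequence

def roni_filter(candidates: Sequence, base_train: List, val_set,
                train_fn: Callable, eval_fn: Callable,
                tolerance: float = 0.0) -> List:
    """Reject On Negative Impact (RONI), sketched: retrain with each
    candidate and reject it if validation accuracy drops."""
    baseline_acc = eval_fn(train_fn(base_train), val_set)
    accepted = []
    for sample in candidates:
        acc = eval_fn(train_fn(base_train + [sample]), val_set)
        if baseline_acc - acc <= tolerance:  # no negative impact observed
            accepted.append(sample)
    return accepted
```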
2.2 Decision-Time Attacks

In decision-time attacks, assuming that the model has already been learned, the attacker leads the model to produce erroneous predictions by making reactive changes to the input. Decision-time attacks can be divided into several categories, the most common of which is the evasion attack. The conventional evasion attack can be further broken down into five categories: gradient-based attacks [6, 8, 25], confidence score attacks [21, 9], hard label attacks [4], surrogate model attacks [47], and brute-force attacks [10, 17, 12]. Adversarial training is presently one of the most effective defenses: adversarial samples, correctly labeled, are added to the training set to enhance model robustness. Input modification [24], extra classes [19], and detection [27, 16] are also common defense techniques against evasion attacks. Alternative defenses against decision-time attacks involve iterative retraining [23, 37] and decision randomization [33].
2.3 White-Box Attacks

In white-box attacks, the adversary has full access to the target DNN model's parameters and architecture. For any specified input, the attacker can calculate the intermediate computations of each step as well as the corresponding output. Therefore, the attacker can leverage the outputs and the intermediate results of the hidden layers of the target model to implement a successful attack. Goodfellow et al. [14] introduce the fast gradient sign method (FGSM), which attacks neural network models with adversarial examples perturbed according to the gradient of the loss with respect to the input image. The adversarial attack proposed by Carlini and Wagner is by far one of the most efficient white-box attacks [7].
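As an illustration, a minimal PyTorch sketch of FGSM follows; it assumes inputs normalized to [0, 1] and is a simplification of the method in [14] rather than the authors' code.

```python
import torch

def fgsm_attack(model: torch.nn.Module, loss_fn, x: torch.Tensor,
                y: torch.Tensor, eps: float) -> torch.Tensor:
    """One-step FGSM: perturb x by eps along the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range
```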
2.4 Fawkes

Fawkes [34] provides privacy protection against the unauthorized training of models on user images collected without consent by the attacker. Fawkes achieves this by providing a simple means for users to add imperceptible perturbations to their original photos before uploading them to social media or the public web. When processed by Fawkes, the cloaked and uncloaked images are hugely different in the feature space but perceptually similar. The Fawkes system cloaks images by choosing (in advance) a specific target class that differs greatly from the original image; it then cloaks the clean images to obtain cloaked images with greatly altered feature representations that remain indistinguishable to the naked eye. When trained on these cloaked images, the attacker's model produces incorrect outputs when encountering clean images. However, Fawkes may be at risk of white-box attacks: if the adversary can obtain full knowledge of the target model's parameters and architecture, then for any specified input the attacker can calculate any intermediate computation and the corresponding output, and can leverage the results of each step to implement a successful attack.
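Conceptually, the cloaking step can be written as a small feature-space optimization. The sketch below is our own simplification: it penalizes the perturbation with an L2 term (weight `lam`), whereas Fawkes itself bounds the perturbation with a DSSIM budget; `feature_extractor` stands for the feature projector.

```python
import torch

def cloak(x, x_target, feature_extractor, steps=100, lr=0.01, lam=10.0):
    """Push Phi(x + delta) toward Phi(x_target) with a small perturbation."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    target_feat = feature_extractor(x_target).detach()
    for _ in range(steps):
        opt.zero_grad()
        feat = feature_extractor(x + delta)
        # feature-space distance to the target, plus a perturbation penalty
        loss = ((feat - target_feat) ** 2).sum() + lam * (delta ** 2).sum()
        loss.backward()
        opt.step()
    return (x + delta).detach()
```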
3 Overview of Oriole

For a clean image x of a user Alice, Oriole produces multi-cloaks by adding pixel-level perturbations to x while choosing multiple targets dissimilar to Alice in the feature space. That is, we first need to determine the target classes and their number for each user; then, we generate multi-cloaks with these selected classes. The process is detailed in Section 4.1.

Figure 2 illustrates the overview of the proposed Oriole system, together with both its connections to and its differences from Fawkes. In the proposed
Oriole, the implementation is divided into two stages: training and testing. In the training phase, the attacker inserts the multi-cloaks generated by the Oriole system into their training set. After model training, upon encountering clean user images, we use Fawkes to generate cloaked images; the cloaked images are then fed into the trained face recognition model to complete the recognition process.

Oriole has significant differences from Fawkes. On one hand, we adopt a data poisoning attack scheme against the face recognition model by modifying images with generated multi-cloaks. On the other hand, an evasion attack (to evade the protection) is applied during testing by converting clean images to their cloaked version before feeding them into the unauthorized face recognition model. Although the trained face recognition model cannot identify users in clean images, it can correctly recognize the cloaked images generated by Fawkes and then map them back to their "true" labels.
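At a high level, the two stages can be summarized as below; the callables are placeholders for the components detailed in Section 4, so this is a sketch of the control flow rather than a reference implementation.

```python
from typing import Callable, List, Sequence

def oriole_train(attacker_images: List, leaked_user_images: Sequence,
                 generate_multicloaks: Callable, train_model: Callable,
                 m: int, rho: float):
    """Training phase: poison the attacker's training set with multi-cloaks
    generated from the user's leaked clean images."""
    multicloaks = generate_multicloaks(leaked_user_images, m, rho)
    return train_model(attacker_images + list(multicloaks))

def oriole_identify(model, clean_test_image, fawkes_cloak: Callable):
    """Testing phase: cloak the clean probe first, then classify; the cloak
    falls into a feature region the poisoned model already ties to the user."""
    return model.predict(fawkes_cloak(clean_test_image))
```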
4 Design of Oriole

We now elaborate on the design details of Oriole, referring to the illustration of the Oriole process in Figure 3. Recall that the application of Oriole is divided into a training phase and a testing phase. The training phase can be further broken down into two steps. In the first step, the attacker A launches a data poisoning attack to mix the multi-cloaks into the training data (recall that the training data is collected without consent and has been protected by Fawkes).
Fig. 2.
The proposed Oriole system is able to successfully recognize faces even with the protection of Fawkes. Oriole achieves this by combining the concepts of data poisoning attacks and evasion attacks.
Then, in the second step, the unauthorized facial recognition model M is trained on the mixed training data. At test time, as an evasion attack, the attacker A first converts the clean testing images to their cloaked version by applying Fawkes, and the cloaked version is presented to the trained model M for identification.

In Figure 3, the images making up the attacker database D_A can be downloaded from the Internet as training data, while the user database D_U provides the user U's leaked and testing data. After obtaining the input images from the databases, we adopt MTCNN [46] for accurate face detection and localization as the preprocessing module [46, 42]. It outputs standardized images of a fixed size that contain only human faces. In the training phase, the attacker A mixes the processed images A′ and the multi-cloaks S_O of the user U into the training set to train the face recognition model M. In the testing phase, the attacker A first converts the preprocessed clean images U′_B into the cloaked images S_F, following the same procedure as described in Fawkes; then, the attacker A pipes S_F into the trained model M to fetch the results.

We assume that a user U has converted his/her clean images U_B into their cloaked form for privacy protection. However, the attacker A has collected some leaked clean images of the user U in advance, denoted as U_A. As shown in Figure 3, the user dataset U consists of U_A and U_B. In the proposed Oriole system, U_A is utilized for obtaining the multi-cloaks S_O, which involves a target set T_M with m categories out of the N categories of the public dataset (available at http://mirror.cs.uchicago.edu/fawkes/files/target_data/). Here, we denote G(X, m) as the set composed of the target classes corresponding to the m largest element values in the set X, where X contains the minimum distance between the feature vectors of the user and the centroids of the N categories (see Eq. 2). The L2 distances are measured between the image features in the projected space Φ(·) and the centroids of the N categories, and then the top m targets are selected.
Fig. 3.
The overall process of the proposed Oriole system, including both the training and testing stages. Images U taken from the leaked user database D_U are divided into two parts (U′_A and U′_B) after preprocessing. In the training phase, the attacker A mixes the generated multi-cloaks S_O into the training data. After training, the face recognition model M is obtained. During the testing phase, the attacker A first converts the clean images U′_B into cloaked images S_F and then pipes them into the trained model M to obtain a correct prediction.

$$X = \bigcup_{k=1}^{N} \Big\{ d \;\Big|\; d = \min_{x \in U_B} \mathrm{Dist}\big(\Phi(x), C_k\big) \Big\}, \qquad (1)$$

$$T_M = G(X, m) = \{T_1, T_2, \cdots, T_m\} = \bigcup_{i=1}^{m} T_i, \qquad (2)$$

where C_k represents the centroid of a certain target and Φ is the feature projector [34]. The distance function Dist(·) adopts the L2 distance. Next, the calculation of a cloak δ(x, x_{T_i}) is defined as:

$$\delta(x, x_{T_i}) = \min_{\delta} \mathrm{Dist}\big(\Phi(x_{T_i}), \Phi(x \oplus \delta(x, x_{T_i}))\big), \qquad (3)$$

where δ is subject to |δ(x, x_{T_i})| < ρ; here |δ(x, x_{T_i})| is calculated by DSSIM (Structural Dis-Similarity Index) [39, 41] and ρ is the perturbation budget. Then we can obtain the multi-cloaks S_O as follows:

$$S_O = \bigcup_{i=1}^{m} \big\{ s \;\big|\; s = x \oplus \delta(x, x_{T_i}) \big\}, \qquad (4)$$

where the multi-value m is a tunable hyper-parameter; m decides the number of multi-cloaks produced for each clean image.

Instead of training the model M with clean data, the attacker A mixes the multi-cloaks S_O calculated from Equation 4 with the preprocessed images U′_A to form the training set, and the deep convolutional face recognition model M is trained [32].
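The target-selection step of Eqs. (1)-(2) reduces to a nearest-centroid computation; a NumPy sketch follows (the array names are ours). The cloak of Eq. (3) is then computed once per selected target with the Fawkes optimizer under the DSSIM budget ρ, yielding the union in Eq. (4).

```python
import numpy as np

def select_targets(user_feats: np.ndarray, centroids: np.ndarray, m: int):
    """Eqs. (1)-(2): for every class centroid C_k, take the minimum L2
    distance to any user feature vector, then keep the indices of the m
    classes with the largest such distance (the most dissimilar targets).

    user_feats: (n_user, d) projected features Phi(x) of the user's images.
    centroids:  (N, d) centroids C_k of the public dataset's classes.
    """
    pairwise = np.linalg.norm(
        user_feats[:, None, :] - centroids[None, :, :], axis=-1)  # (n_user, N)
    X = pairwise.min(axis=0)        # the set X of Eq. (1), one value per class
    return np.argsort(X)[-m:]       # T_1, ..., T_m of Eq. (2)
```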
The last stage of Oriole is model testing. Unlike Fawkes, we do not directly apply clean images to the attack model. Instead, Oriole first makes subtle changes to the clean images before face identification inference. Specifically, we implement the subtle changes by cloaking the processed user images U′_B. Conceptually, the feature vectors of the cloaked images S_F will fall into the marked feature space of the multi-cloaks S_O. Then, the trained model M is able to correctly identify users through the cloaked images S_F.

Figure 4 illustrates the intuition behind the Oriole system. For the purposes of demonstration, we assume the multi-value m equals four. Put differently, we assume that Fawkes will select one of four targets for cloaking, from which the proposed Oriole system will attempt to obtain multi-cloaks associated with all four targets using a small number of the user U's leaked photos. In this scenario, we successfully link the feature spaces of our four target classes (T_1, T_2, T_3, and T_4) with the user U. Thus, when it comes to a new, clean image of U, we first cloak it with Fawkes. The cloaked version of the user image will inevitably fall into one of the marked feature spaces of the multi-cloaks (one target has been chosen for illustration in Figure 4(b); see the hollow green and red triangles for the clean and cloaked image features, respectively). As the cloaked image features lie in that target's region, and the multi-cloak-trained model now associates all four targets with U, the attacker can correctly identify the user's identity even with the protection of Fawkes.

We finally discuss the performance of Oriole when target classes are and are not included in the training data. We observe that, whether or not the m target classes are included in the training set, the Oriole system still functions effectively to thwart the protections offered by Fawkes. In Figure 4, assume that the feature vectors of the cloaked testing image are located in the high-dimensional feature space of some target class T_i. We first consider the case where users of T_i are not included in the attack model training process: we are able to map the user U to the feature space of T_i through the leaked images of the user U that were used to generate the multi-cloaks. Furthermore, Oriole still works when images of the target class T_i are included in the training set: even if the cloaked images of U are detected as T_i, the setting of Fawkes ensures that the cloaks of T_i occupy another area within the feature space that does not overlap with T_i itself. Thus, this special case does not interfere with the effectiveness of Oriole.

5 Experimental Evaluation

We implemented our
Oriole system on three popular image datasets against the Fawkes system.
Fig. 4.
The intuition behind why Oriole can help the attacker A successfully identify the user U even with the protection of Fawkes. We depict the process in a simplified 2D feature space with user classes B, C, D, T_1, T_2, T_3, T_4, and U. Figures (a) and (b) represent the decision boundaries of the model trained on U's clean photos and on multi-cloaks, respectively (with four targets). The white triangles represent the multi-cloaked images of U and the red triangles are the cloaked images of U. Oriole works as long as cloaked testing images fall into the same feature space as the multi-cloaked leaked images of U.

In our implementation, considering the size of the three datasets, we took the smallest, PubFig83 [29], as the user dataset, while the larger VGGFace2 [5] and CASIA-WebFace [44] were prepared for the attacker to train two face recognition models. In addition, we artificially created a high-definition face dataset to benchmark the data constraints surrounding the imperceptibility of the Fawkes system.

PubFig83 [29].
PubFig83 is a well-known dataset for face recognition research. It contains 13,838 cropped facial images belonging to 83 celebrities, each of whom has at least 100 pictures. In our experiment, we treat PubFig83 as a database for user sample selection, due to its relatively small number of tags and consistent picture resolution.
CASIA-WebFace [44].
The CASIA-WebFace dataset is the largest known public dataset for face recognition, consisting of a total of 903,304 images in 38,423 categories.
VGGFace2 [5].
VGGFace2 is a large-scale dataset containing 3.31 million images from 9,131 subjects, with an average of 362.6 images per subject. All images in VGGFace2 were collected from Google Image Search and are distributed as evenly as possible across gender, occupation, race, etc.
Models: M_V and M_CW. We chose VGGFace2 and CASIA-WebFace to train face recognition models separately for a real-world attacker simulation. In the preprocessing stage, MTCNN [46] is adopted for face alignment and Inception-ResNet-V1 [36] is selected as our model architecture; we then completed the model training process on a Tesla P100 GPU with TensorFlow r1.7. An Adam optimizer with a learning rate of $10^{-1}$ is used to train the models over 500 epochs. Here, we denote the models trained on the VGGFace2 and CASIA-WebFace datasets as M_V and M_CW; the LFW accuracies of these models reached 99.05% and above 99%, respectively.
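For reference, the detection-and-embedding pipeline can be reproduced with the facenet-pytorch package; this is our own PyTorch approximation (the paper's setup uses TensorFlow r1.7), and the file name is hypothetical.

```python
from facenet_pytorch import MTCNN, InceptionResnetV1
from PIL import Image

mtcnn = MTCNN(image_size=160)      # MTCNN [46]: detect, align, crop faces
embedder = InceptionResnetV1(pretrained='vggface2').eval()  # backbone as in [36]

img = Image.open('photo.jpg')      # hypothetical input image
face = mtcnn(img)                  # fixed-size face tensor, or None
if face is not None:
    faceprint = embedder(face.unsqueeze(0))  # (1, 512) feature vector
```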
Similar to the Fawkes system, the proposed Oriole system is designed for a user-attacker scenario, whereby the attacker trains a powerful model on a huge number of images collected from the Internet. The key difference is that Oriole assumes the attacker A is able to obtain a small percentage of leaked clean images of the user U. Through the evaluation of the Oriole system, we discover the relevant variables affecting its attack capability. In this case, we define a formula for facial recognition accuracy evaluation in Equation 5, where R represents the ratio of the user's multi-cloaks in the training data. The ranges of R and ρ are both set to [0, 1], and m (the number of multi-cloaks) is subject to the inequality 0 < m ≪ N, where N = 18,947 is the total number of target classes in the public dataset.
$$\mathrm{Accuracy} = k(R, m, \rho). \qquad (5)$$

Throughout our experimental evaluation, the ratio between the training data and the testing data is fixed at 1:1 (see Section 5.2 for the motivation behind this ratio).
Comparison between Fawkes and Oriole. We start by reproducing the Fawkes system against unauthorized face recognition models. Next, we employ the proposed Oriole scheme to invalidate the Fawkes system. We shall emphasize that the leaked data associated with the user is not directly used for training the attack model. Instead, we insert multi-cloaks actively produced by Oriole into the training process, which presents a significant difference in the way adversary training schemes deal with leaked data.

In particular, we randomly select a user U with 100 images from PubFig83 and divide their images equally into two non-intersecting parts, U_A and U_B, each of which contains 50 images. We evaluate both Fawkes and Oriole in two settings for comparison. In the first setting, we mix the multi-cloaks of the processed U′_A into the training data to train the face recognition model M and test the accuracy of this model M with the processed U′_B in the testing phase (see Figure 3). In the second setting, we replace the clean images of U_A with the corresponding cloaked images (by applying Fawkes) to obtain a secondary measure of accuracy.

Fig. 5.
Evaluation of the impact of Oriole against Fawkes through two models, M_V and M_CW. The two figures depict the performance of the face recognition model M with Fawkes alone and equipped with Oriole. There are clear observations from the two figures: the larger the DSSIM perturbation budget ρ, the higher the resulting face recognition accuracy obtained from model M. Additionally, it demonstrates that our proposed Oriole system can successfully bypass protections offered by Fawkes.

Figure 5 shows the variation in facial recognition accuracy with the DSSIM perturbation budget, and displays the performance of Oriole against Fawkes protection. We implement this process on two different models, M_V and M_CW. The former's training data consists of the leaked images U_A and all images in VGGFace2, while the latter's contains the leaked images U_A and all images in CASIA-WebFace. All experiments were repeated three times and the results presented are averages.

It can be seen from Figure 5 that there is a clear trend: the facial recognition ratio of the two models rises significantly as the DSSIM perturbation budget ρ increases from 0.1 to 1. Specifically, Oriole improves the accuracy of the face recognition model M_V from 12.0% to 87.5%, while the accuracy of the model M_CW increases from 0.111 to 0.763 when the parameter ρ is set to 0.008. We notice that the accuracy of the two models M_V and M_CW improves nearly seven-fold compared to the scenario where Fawkes is used to protect privacy. From these results, we empirically find that Oriole can neutralize the protections offered by Fawkes, invalidating its protection of images in unauthorized deep learning models. Figure 6 shows an uncloaked image and its related multi-cloaks (ρ = 0.008, m = 20). The feature representation of the clean image framed by a red outline is dissimilar from that of the remaining 20 images. Figure 7 shows the two-dimensional Principal Component Analysis (PCA) of the face recognition system, validating our theoretical analysis (for ρ = 0.008, m = 4). The feature representations of the clean images are mapped to the feature space of the four target classes through multi-cloaks. We then mark the corresponding feature spaces as part of identity U and identify the test images of U by cloaking them.

Fig. 6.
An example of a clean image of the user U and 20 multi-cloaks produced by Oriole . The uncloaked image has been framed by a red outline.
Table 1.
The four models used in our verification and their classification accuracy on PubFig83. The "Basic" column represents conventional face recognition. The "Fawkes" column represents the case where only Fawkes is used to fool the face recognition model for privacy protection. The "Oriole" column represents the performance of Oriole.

Dataset         Model Architecture    Basic  Fawkes  Oriole
CASIA-WebFace   Inception-ResNet-V1   0.973  0.111   0.763
CASIA-WebFace   DenseNet-121          0.982  0.214   0.753
VGGFace2        Inception-ResNet-V1   0.976  0.120   0.875
VGGFace2        DenseNet-121          0.964  0.117   0.714
We show the general effectiveness of the proposed Oriole system in Table 1. We build four models with two different architectures, Inception-ResNet-V1 [36] and DenseNet-121 [20], on the two aforementioned datasets. The model equipped with Oriole significantly outperforms the model without it across different setups. The experimental results demonstrate that the Oriole system retains test accuracy above 70% across all listed settings, even with the protection of Fawkes. For instance, on the VGGFace2 dataset with Inception-ResNet-V1 as the backbone architecture, Oriole increases the attack success rate from 12.0% to 87.5%, significantly boosting the attack effectiveness.
Main factors contributing to the performance of
Oriole. There are three main factors influencing the performance of
Oriole: 1) the DSSIM perturbation budget ρ, 2) the ratio of leaked clean images R, and 3) the number of multi-cloaks for each uncloaked image m. Different DSSIM perturbation budgets ρ have already been discussed in the previous paragraph. We now explore the impact of the R and m values on the model's performance. Up until this point we have performed experiments with the default values of R, m, and ρ as 1, 20, and 0.008, respectively, to enable a fair comparison.

Fig. 7. Two-dimensional PCA of the Oriole system's feature space. Triangles are the user's leaked images (solid) and testing data (hollow); dots represent multi-cloaks of the leaked images (magenta) and images from the target classes (black); red crosses are cloaked images of the testing data; blue squares are images from another class.
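A projection like Figure 7 can be produced by fitting one shared PCA over all feature groups and scattering each group; a short sketch, with our own variable names, follows.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_feature_space(groups: dict) -> None:
    """groups maps a label (e.g. 'multi-cloaks of U') to an (n, d) array of
    feature vectors; every group is projected with the same fitted PCA."""
    pca = PCA(n_components=2).fit(np.vstack(list(groups.values())))
    for label, feats in groups.items():
        xy = pca.transform(feats)
        plt.scatter(xy[:, 0], xy[:, 1], label=label, s=12)
    plt.xlabel('Dimension 1')
    plt.ylabel('Dimension 2')
    plt.legend()
    plt.show()
```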
Oriole system’s performance. We observe that the facialrecognition success ratio increases monotonically as the number of multi-cloaks m increases, and this rise occurs until m reaches 20, whereby the success ratioplateaus. We can conclude that the facial recognition success ratio grows withthe ratio of leaked clean images R . The ratio increases at least three times when R increases from 0.1 to 1. Model validation.
In order to ensure the validity of
Oriole, as a comparative experiment, we evaluate the models M_V and M_CW on PubFig83. We divide PubFig83 into 10 training-testing set pairs with different proportions and build classifiers with the help of the two pre-trained models. We obtained 20 experimental results depending on which model, M_V or M_CW, was used, with ratios selected between 0.1 and 1, as shown in Table 2. The experimental results show that the accuracy of the FaceNet-based models M_V and M_CW increases monotonically as the ratio of the training set to the testing set increases. Both models exceed 96% recognition accuracy on PubFig83 when the selected ratio between training and testing sets is 0.5. Consequently, models M_V and M_CW are capable of verifying the performance of Oriole.
Fig. 8.
The facial recognition accuracy changes with different ratios of leaked clean images R and numbers of multi-cloaks for each uncloaked image m.

Table 2.
The test accuracy of models M_V (trained on VGGFace2) and M_CW (trained on CASIA-WebFace) across different rates of PubFig83. The rate in the first column represents the ratio of the sizes of the training and test sets. The test accuracy is the overall correct classification score for clean images.

Rate | Test Accuracy of M_V | Test Accuracy of M_CW

6 Discussion

Shan et al. [34] claim that cloaked images with small added perturbations are indistinguishable to the naked human eye. However, we show that the imperceptibility of Fawkes is limited due to an inherent imperfection, and that it is vulnerable to white-box attacks. For practical applications, users tend to upload clear, high-resolution pictures to better share their life experiences. Through our empirical study, we find that Fawkes is able to make imperceptible changes to low-resolution images, such as those in the PubFig83 dataset. However, when it comes to high-resolution images, the perturbation between cloaked photos and their originals is plainly apparent.
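The perceptual distortion discussed here is measured by DSSIM; a common formulation is (1 - SSIM)/2 [39, 41]. The sketch below uses scikit-image's single-scale SSIM as an approximation of the multiscale variant.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def dssim(original: np.ndarray, cloaked: np.ndarray) -> float:
    """Structural dis-similarity between an image and its cloaked version;
    the Fawkes constraint keeps this value below the budget rho."""
    s = ssim(original, cloaked, channel_axis=-1, data_range=255)
    return (1.0 - s) / 2.0
```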
To demonstrate the limitations of Fawkes for high-resolution images, we manually collect 54 high-quality pictures covering different genders, ages, and regions, whose resolution is more than 300 times that of PubFig83 images (width × height is at least 3,000,000 pixels). We conduct an experiment setting the perturbation budget ρ to 0.007 and running the optimization process for 1,000 iterations with a learning rate of 0.5, in the same experimental setting as described in Fawkes [34]. A sample of the resulting images from this experiment is displayed in Figure 9; these figures show images of the same users before (a) and after (b) being cloaked by Fawkes. From these figures, we can easily observe significant differences with and without cloaking. Notably, there are many wrinkles, shadows, and irregular purple spots on the boy's face in the cloaked image. This protection may result in the reluctance of users to post the cloaked images online.

Sybil accounts are fake or bogus identities created by a malicious user to inflate their resources and influence in a target community [43]. A Sybil account, existing in the same online community, is a separate account from the user U's original one, but the account can be crafted to bolster cloaking effectiveness and boost privacy protection in Fawkes when clean, uncloaked images are leaked for training [34]. Fawkes modifies the Sybil images to protect the user's original images from being recognized. These Sybil images induce the model to misclassify because they occupy the same area of the feature space as U's uncloaked images. Against Oriole, however, Sybil accounts are ineffective, since the clean images are first cloaked before testing: the feature space of cloaked images is vastly different from that of the originals, and these cloaked photos occupy a different area of the feature space from both the Sybil images and the clean images. To put it differently, no defense is offered irrespective of how many Sybil accounts the user owns, as cloaked images and uncloaked images occupy different feature spaces. We are also able to increase the number of multi-cloaks m in step with Fawkes to ensure the robustness of Oriole, owing to the white-box nature of the attack.
7 Conclusion

In this work, we present Oriole, a novel system that combines the advantages of data poisoning attacks and evasion attacks to invalidate the privacy protection of Fawkes. To achieve our goals, we first train the face recognition model with multi-cloaked images and then test the trained model with cloaked images. Our empirical results demonstrate the effectiveness of the proposed Oriole system. We have also identified multiple principal factors affecting the performance of the Oriole system. Moreover, we lay out the limitations of Fawkes and discuss them at length. We hope that the attack methodology developed in this paper will inform the security and privacy community of a pressing need to design better privacy-preserving deep neural models.
Fig. 9.
Comparison between the cloaked and the uncloaked versions of high-resolution images. Note that there are wrinkles, shadows, and irregular purple spots on the faces in the cloaked images.

References

[1] Akbari, R., Mozaffari, S.: Performance enhancement of PCA-based face recognition system via gender classification method. In: 2010 6th Iranian Conference on Machine Vision and Image Processing, pp. 1–6. IEEE (2010)
[2] Aware Nexa Face™, https://aware.com/biometrics/nexa-facial-recognition/
[3] Bashbaghi, S., Granger, E., Sabourin, R., Parchami, M.: Deep learning architectures for face recognition in video surveillance. In: Deep Learning in Object Detection and Recognition, pp. 133–154. Springer (2019)
[4] Brendel, W., Rauber, J., Bethge, M.: Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. arXiv preprint arXiv:1712.04248 (2017)
[5] Cao, Q., Shen, L., Xie, W., Parkhi, O.M., Zisserman, A.: VGGFace2: A dataset for recognising faces across pose and age. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 67–74. IEEE (2018)
[6] Carlini, N., Wagner, D.: Adversarial examples are not easily detected: Bypassing ten detection methods. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 3–14 (2017)
[7] Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. IEEE (2017)
[8] Chen, P., Sharma, Y., Zhang, H., Yi, J., Hsieh, C.: EAD: Elastic-net attacks to deep neural networks via adversarial examples. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), pp. 10–17. AAAI Press (2018)
[9] Chen, P.Y., Zhang, H., Sharma, Y., Yi, J., Hsieh, C.J.: ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 15–26 (2017)
[10] Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., Madry, A.: Exploring the landscape of spatial robustness. In: International Conference on Machine Learning, pp. 1802–1811. PMLR (2019)
[11] Face++ Face Searching API, https://faceplusplus.com/face-searching/
[12] Ford, N., Gilmer, J., Carlini, N., Cubuk, E.D.: Adversarial examples are a natural consequence of test error in noise. CoRR abs/1901.10513 (2019), http://arxiv.org/abs/1901.10513
[13] Gao, Y., Xu, C., Wang, D., Chen, S., Ranasinghe, D.C., Nepal, S.: STRIP: A defence against trojan attacks on deep neural networks. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp. 113–125 (2019)
[14] Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
[15] Google Cloud Vision AI, https://cloud.google.com/vision/
[24] Liao, F., Liang, M., Dong, Y., Pang, T., Hu, X., Zhu, J.: Defense against adversarial attacks using high-level representation guided denoiser. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1778–1787 (2018)
[25] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017)
[26] Mei, S., Zhu, X.: Using machine teaching to identify optimal training-set attacks on machine learners. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2871–2877. AAAI Press (2015)
[27] Meng, D., Chen, H.: MagNet: A two-pronged defense against adversarial examples. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 135–147 (2017)
[28] NEC Face Recognition API, https://nec.com/en/global/solutions/biometrics/face/
[29] Pinto, N., Stone, Z., Zickler, T., Cox, D.: Scaling up biologically-inspired computer vision: A case study in unconstrained face recognition on Facebook. In: CVPR 2011 Workshops, pp. 35–42. IEEE (2011)
[30] Rasti, P., Uiboupin, T., Escalera, S., Anbarjafari, G.: Convolutional neural network super resolution for face recognition in surveillance monitoring. In: International Conference on Articulated Motion and Deformable Objects, pp. 175–184. Springer (2016)
[31] Research, G.V.: Facial Recognition Market Size, Share & Trends Analysis Report By Technology (2D, 3D), By Application (Emotion Recognition, Attendance Tracking & Monitoring), By End-use, And Segment Forecasts, 2020–2027
[32] Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
[33] Shah, R., Gaston, J., Harvey, M., McNamara, M., Ramos, O., You, Y., Alhajjar, E.: Evaluating evasion attack methods on binary network traffic classifiers. In: Proceedings of the Conference on Information Systems Applied Research, vol. 2167, p. 1508 (2019)
[34] Shan, S., Wenger, E., Zhang, J., Li, H., Zheng, H., Zhao, B.Y.: Fawkes: Protecting privacy against unauthorized deep learning models. In: 29th USENIX Security Symposium (USENIX Security 20), pp. 1589–1604 (2020)
[35] Suciu, O., Marginean, R., Kaya, Y., Daume III, H., Dumitras, T.: When does machine learning FAIL? Generalized transferability for evasion and poisoning attacks. In: 27th USENIX Security Symposium (USENIX Security 18), pp. 1299–1316 (2018)
[36] Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31 (2017)
[37] Tong, L., Li, B., Hajaj, C., Xiao, C., Zhang, N., Vorobeychik, Y.: Improving robustness of ML classifiers against realizable evasion attacks using conserved features. In: 28th USENIX Security Symposium (USENIX Security 19), pp. 285–302 (2019)
[38] Vazquez-Fernandez, E., Gonzalez-Jimenez, D.: Face recognition for authentication on mobile devices. Image and Vision Computing, 31–33 (2016)
[39] Wang, B., Yao, Y., Viswanath, B., Zheng, H., Zhao, B.Y.: With great training comes great vulnerability: Practical attacks against transfer learning. In: 27th USENIX Security Symposium (USENIX Security 18), pp. 1281–1297 (2018)
[40] Wang, H., Pang, G., Shen, C., Ma, C.: Unsupervised representation learning by predicting random distances. arXiv preprint arXiv:1912.12186 (2019)
[41] Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2, pp. 1398–1402 (2003). https://doi.org/10.1109/ACSSC.2003.1292216
[42] Xiang, J., Zhu, G.: Joint face detection and facial expression recognition with MTCNN. In: 2017 4th International Conference on Information Science and Control Engineering (ICISCE), pp. 424–427. IEEE (2017)
[43] Yang, Z., Wilson, C., Wang, X., Gao, T., Zhao, B.Y., Dai, Y.: Uncovering social network sybils in the wild. ACM Transactions on Knowledge Discovery from Data (TKDD) (1), 1–29 (2014)
[44] Yi, D., Lei, Z., Liao, S., Li, S.Z.: Learning face representation from scratch. arXiv preprint arXiv:1411.7923 (2014)
[45] Zhang, H., Wang, H., Li, Y., Cao, Y., Shen, C.: Robust watermarking using inverse gradient attention. arXiv preprint arXiv:2011.10850 (2020)
[46] Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters 23