DeepTag: Robust Image Tagging for DeepFake Provenance
Run Wang*, Felix Juefei-Xu, Qing Guo, Yihao Huang, Lei Ma, Yang Liu, Lina Wang
Nanyang Technological University, Singapore; Alibaba Group, USA; East China Normal University, China; Kyushu University, Japan; Wuhan University, China
Abstract
In recent years, DeepFake has become a common threat to our society, due to the remarkable progress of generative adversarial networks (GANs) in image synthesis. Unfortunately, existing studies that propose various approaches for fighting against DeepFake, determining whether a facial image is real or fake, are still at an early stage. The current DeepFake detection methods struggle to catch up with the rapid progress of GANs, especially in adversarial scenarios where attackers can intentionally evade detection, for example by adding perturbations to fool DNN-based detectors. While passive detection simply tells whether an image is fake or real, DeepFake provenance, on the other hand, provides clues for tracking the sources in DeepFake forensics, so that tracked fake images can be blocked immediately by administrators before spreading further in social networks.

In this paper, we investigate the potential of image tagging for serving DeepFake provenance. Specifically, we devise a deep learning-based approach, named DeepTag, with a simple yet effective encoder and decoder design to embed a message into a facial image and recover the embedded message after various drastic GAN-based DeepFake transformations with high confidence. The embedded message can be used to represent the identity of the facial image, which further contributes to DeepFake detection and provenance. Experimental results demonstrate that our proposed approach recovers the embedded message with an average accuracy of nearly 90%. Our research findings confirm that effective privacy-preserving techniques can protect personal photos from being DeepFaked. Thus, effective proactive defense mechanisms should be developed for fighting against DeepFakes, instead of simply devising DeepFake detection methods that can be mostly ineffective in practice.
Introduction

Capturing exciting moments with a camera and sharing them with friends over social networks (e.g., Facebook, Twitter, Instagram) has become a common activity in our daily life. However, with the recent development of GANs and their variants, our shared photos may be manipulated by various GANs to create DeepFakes (Mirsky and Lee 2020). Abusing DeepFakes can bring potential threats and concerns to everyone, for example, releasing a realistic fake statement, creating fake pornography, etc. Additionally, many freely available tools (e.g., FaceApp, ZAO) allow users to easily create DeepFakes on their own without any additional expertise. Thus, effective measures should be developed to fight against DeepFakes to protect our personal security and privacy.

*Corresponding author. E-mail: [email protected]

Figure 1: Comparison between a vulnerable social media platform (top panel) and a DeepTag-protected social media platform (bottom panel) in handling malicious bad actors spreading misinformation using DeepFake technology.

In fighting against DeepFakes, researchers are actively proposing various techniques to passively determine whether a suspicious still image or video is real or fake. These studies mostly focus on the artifacts introduced when synthesizing images with GANs. Identifying synthesized images with observable artifacts (Li, Chang, and Lyu 2018; Yang, Li, and Lyu 2019) and detecting synthesized images using deep neural networks (DNNs) to spot invisible artifacts (Dang et al. 2020; Wang et al. 2020d) are the two mainstream approaches to detecting DeepFakes. Unfortunately, our investigation into the artifact-based methods has revealed that they still suffer from the following two issues.

• Generalization. Almost all existing studies evaluate the effectiveness of their method on a limited number of known GANs. Since advanced GANs are developed at an enormous speed, the artifacts that could be employed in previous GANs for distinguishing real and fake will likely be removed (Karras et al. 2020; Choi et al. 2020).

• Robustness. Simple image transformations (e.g., resizing, compression, Gaussian noise) and adversarial attacks with carefully crafted perturbations are two obstacles in developing robust detectors (Qian et al. 2020; Carlini and Farid 2020). In particular, adversarial attacks that add imperceptible noise can fool DNN-based detectors with high confidence in many cases (Carlini and Farid 2020; Huang et al. 2020b).

Undoubtedly, advanced GANs will be developed to produce high-quality synthesized images with fewer artifacts. These advanced GANs will be applied for creating DeepFakes maliciously and will pose real challenges for detection, since existing detectors do not generalize to unknown GANs. Furthermore, recent studies have demonstrated that DNN-based detectors are susceptible to adversarial noise attacks that add imperceptible perturbations to the facial images. To address these two issues in passively defending against DeepFakes, we propose a novel approach, named DeepTag, which protects the safety and privacy of faces with image tagging: messages are embedded into the victim images and later recovered to proactively determine whether the images have been DeepFaked and manipulated by GANs. Specifically, our proposed approach can be employed in DeepFake forensics for both detection and provenance purposes.
Threat Model:
In this paper, our threat model is described in Fig. 1. A user uploads his/her personal photos to social networks like Facebook and shares them with friends or anyone. Unfortunately, attackers can easily pick a victim's photos and manipulate them with various GANs to create the DeepFakes they want, such as a video releasing a fake statement. The created DeepFakes will cause panic and raise security and privacy concerns for victims when they spread on social networks. Our proposed DeepTag embeds a message into the images before they are uploaded to the social networks; afterwards, it tries to recover the embedded message from a suspicious photo on a social network for DeepFake detection, and for DeepFake provenance by determining the source based on the recovered message. The key idea is that our image tagging method should be robust enough to survive the drastic image transformation and reconstruction of the DeepFake process. Finally, the confirmed DeepFakes can be blocked to prevent further spreading.

Here are more details regarding Fig. 1. In the top panel, after a user (Fig. 1-a1) uploads his/her personal photos to the public-domain social media platform, the personal picture can be picked up by a malicious actor (Fig. 1-b1). The bad actor can apply off-the-shelf DeepFake technology to produce a DeepFaked version of the user's face image (Fig. 1-c1). In this case, the male face is transformed to exhibit female attributes, which is one example of how DeepFake can alter any face image without noticeable artifacts. Then, the bad actor can upload the DeepFaked face image to the same social media platform again (Fig. 1-d1), impersonating the user or aiming at other malicious activities such as spreading misinformation.
As can be seen, the unprotected social media platform is quite vulnerable in this scenario in terms of identifying DeepFake images and preventing the spread of misinformation, since no mechanism is established to distinguish between a legitimate face image and a DeepFaked one.

On the contrary, in the bottom panel, where the social media platform is protected by the proposed DeepTag mechanism, the spread of misinformation can be effectively prevented. When a user uploads his/her personal photo (Fig. 1-a2) to the social media platform, DeepTag is invoked to check whether this picture has been tagged with a DeepTag message before (usually a UID that matches the user's identity). If the face image is new, DeepTag embeds a message in the image that is sufficiently robust to survive drastic image transformations such as DeepFake reconstruction. When a malicious bad actor (Fig. 1-b2) picks out the victim's photo and applies the DeepFake technique (Fig. 1-c2), the DeepTag message survives. Then, when the bad actor tries to upload the DeepFaked face image to the social media platform again (Fig. 1-d2), the embedded DeepTag message triggers an alarm, since the UID of the original picture does not match that of the bad actor, indicating that a perpetrating event has happened. In this way, proper measures can be taken to stop the spread of misinformation, such as blocking the upload of the DeepFaked face image and/or raising a red flag for the bad actor. In the bottom panel, the DeepTag-protected images are represented by a green tag as well as a blue picture frame. In both panels, the pink arrows depict the route a bad actor can take from picking a victim to spreading misinformation.
The blue arrow route indicates where the DeepTag message remains active during the whole process.

Our DeepTag is motivated by existing studies on privacy preservation for multimedia, for example, digital watermarking for multimedia copyright protection (Katzenbeisser and Petitcolas 2000). Digital watermarking allows users to embed visible or invisible watermarks into target multimedia (e.g., text, image, audio). Our proposed image tagging is similar to digital watermarking. The difference, however, is that image tagging must survive various drastic image transformations with GANs, while digital watermarking only needs to be robust against common image transformations. In tackling GAN-based manipulation, the following challenges need to be addressed in our image tagging. 1) Diverse GANs. Existing GANs for face synthesis can be classified into entire synthesis and partial synthesis, whose manipulation intensities are obviously different. DeepTag needs to tackle diverse GANs with various manipulation intensities. 2) Unclear manipulated region. In creating DeepFakes with GANs, the manipulated regions are unknown, so the embedded message should avoid being affected by the position where the manipulation is performed.

To address the aforementioned challenges in embedding a message into images, our proposed DeepTag is based on a simple yet effective encoder and decoder architecture that can recover messages effectively even after drastic GAN-based transformations. The encoder and decoder are both DNNs and are jointly trained. In DeepTag, a DeepFake simulator connects the encoder and decoder to simulate various GAN manipulations of the encoded images, enforcing that the decoder can recover the embedded messages effectively after GAN-based transformations. To comprehensively evaluate the effectiveness of DeepTag, our experiments are conducted on three state-of-the-art (SOTA) GANs: STGAN (Liu et al. 2019), StarGAN (Choi et al. 2018), and StyleGAN (Karras, Laine, and Aila 2019). These three GANs cover the two typical GAN-based transformations, entire synthesis and partial synthesis. Experimental results demonstrate that DeepTag achieves an average accuracy of nearly 90% in recovering the embedded messages.

Our main contributions are summarized as follows:

• New idea in defending against DeepFake with image tagging.
To the best of our knowledge, this is the first work proposing image tagging to achieve DeepFake forensics for both DeepFake detection and DeepFake provenance. Our proactive defense technique can overcome the generalization and robustness issues of traditional artifact-based DeepFake detection.

• Performing a comprehensive evaluation of the effectiveness on typical GANs.
Experiments are conducted on three SOTA GANs spanning entire synthesis and partial synthesis. Experimental results demonstrate the effectiveness of embedding messages and recovering them after drastic GAN-based transformations.

• New insight for defending against DeepFakes.
Passively detecting synthesized images based on artifacts is not enough for defending against DeepFakes, since such methods neither generalize to unknown GANs nor resist adversarial attacks. Our approach presents a new insight by employing image tagging to protect the safety of images proactively.
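The platform-side policy from the threat model of Fig. 1 (embed a tag on first upload, verify it on later uploads) can be sketched as follows. This is our illustrative stand-in, not part of DeepTag itself: `photo_tag` is the UID recovered by a DeepTag-style decoder, or `None` when the image carries no tag.

```python
# Hypothetical upload-time check for a DeepTag-protected platform.
# photo_tag: UID decoded from the uploaded photo (None if untagged).
# uploader_uid: UID of the account performing the upload.

def check_upload(photo_tag, uploader_uid):
    """Return the action the platform should take for this upload."""
    if photo_tag is None:
        return "embed"    # new image: tag it with the uploader's UID
    if photo_tag == uploader_uid:
        return "accept"   # the owner re-uploads their own photo
    return "block"        # tag/UID mismatch: likely a DeepFaked copy
```

A mismatch between the recovered UID and the uploader's UID is exactly the alarm condition described for Fig. 1-d2.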
Related Work

GANs (Goodfellow et al. 2014) have achieved remarkable progress in image synthesis (Zhu et al. 2017) and voice synthesis (Oord et al. 2016), and are widely employed in creating realistic DeepFakes. In this paper, we mainly focus on image synthesis, which plays a key role in creating modern DeepFakes. Entire synthesis and partial synthesis are the two typical manipulations in facial image synthesis with GANs (Tolosana et al. 2020). In entire synthesis, the synthesized image is generated entirely by the GAN, which can be used to synthesize a new face that does not exist in the world. PGGAN (Karras et al. 2017) and StyleGAN (Karras, Laine, and Aila 2019) can generate high-resolution facial images to improve the quality of a given face. Specifically, StyleGAN has the capability to synthesize a non-existent face by utilizing the idea of style transfer. In partial synthesis, face attributes like hair and expression are manipulated by GANs automatically. StarGAN (Choi et al. 2018), STGAN (Liu et al. 2019), and AttGAN (He et al. 2019) can edit attributes in a fine-grained manner, for example, changing the hair color, adding eyeglasses, turning a smiling expression into a scared one, etc. Thus, determining whether a facial image has been manipulated by GANs provides a straightforward idea for detecting DeepFakes.

Due to the imperfect design of existing GANs, images manipulated with GANs inevitably contain various artifacts. Existing studies on identifying DeepFakes mostly leverage these artifacts as clues. The artifacts can be classified into observable artifacts noticed by human eyes and invisible artifacts learned by DNN-based classifiers (Wang et al. 2020c; Zhang, Karaman, and Chang 2019).

Lyu et al. proposed to spot DeepFake videos by observing the lack of eye blinking in the synthesized face (Li, Chang, and Lyu 2018). Inconsistent head poses in the synthesized face are another observable artifact in DeepFake videos (Yang, Li, and Lyu 2019).
Some researchers have also investigated invisible artifacts that can be used for spotting DeepFakes. Wang et al. observed that CNN-generated images contain common artifacts that can be identified with careful pre- and post-processing and data augmentation (Wang et al. 2020d). Frank et al. addressed GAN-generated image identification with the basic observation that the artifacts are revealed in the frequency domain (Frank et al. 2020). AutoGAN (Zhang, Karaman, and Chang 2019) observed that the upsampling design in GANs introduces artifacts into synthesized images; its authors thus developed a GAN simulator to produce fake images and train a classifier to detect GAN-generated images. These proposed methods all claim effectiveness on seen GANs, but their capabilities on unknown GANs remain unclear.
In past decades, digital watermarking has played a key role in multimedia copyright protection. A digital watermark may be visible or invisible to human eyes, and the watermark embedded in the carrier should be recoverable even after various image transformations. Thus, robustness is the main concern in designing an effective embedding algorithm (Katzenbeisser and Petitcolas 2000; Podilchuk and Delp 2001; Siddaraju, Jayadevappa, and Ezhilarasan 2015).

The spatial and frequency domains are the two main avenues for embedding a watermark into the carrier. The spatial domain is easier to implement than the frequency domain, but it can be easily corrupted or attacked with pixel perturbations (Singh et al. 2012). Spatial-domain techniques embed the watermark by modifying pixel values, such as the least significant bit (LSB) (Bamatraf, Ibrahim, and Salleh 2010). When embedding in the frequency domain, the carrier is first converted by a specific transformation, and then the watermark is embedded in the transformation coefficients. Common frequency-domain transforms adopted for embedding watermarks include the discrete cosine transform (DCT), discrete wavelet transform (DWT), discrete Fourier transform (DFT), and singular value decomposition (SVD) (Jiansheng, Sukang, and Xiaomei 2009; Khan et al. 2013; Yavuz and Telatar 2007).

With the rapid development of deep learning, end-to-end watermark embedding techniques have been proposed in recent years. HiDDeN (Zhu et al. 2018) proposed the first end-to-end framework that jointly trains encoder and decoder networks to be robust to noise such as Gaussian blurring, pixel-wise dropout, etc. StegaStamp (Tancik, Mildenhall, and Ng 2020) presented a steganographic algorithm for embedding arbitrary hyperlinks into photos, comprising deep neural networks for encoding and decoding.
To address unknown image distortions, Luo et al. proposed a framework for distortion-agnostic watermarking (Luo et al. 2020) which is generic to unseen distortions.
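To make the fragility of spatial-domain embedding concrete, the LSB scheme cited above can be sketched in a few lines; the function names are ours, not from the cited work.

```python
# Minimal sketch of spatial-domain LSB watermarking on a flat list
# of 8-bit pixel values (illustrative, not an implementation of any
# cited system).

def lsb_embed(pixels, bits):
    """Embed one message bit into the LSB of each leading pixel."""
    assert len(bits) <= len(pixels)
    out = list(pixels)
    for k, b in enumerate(bits):
        out[k] = (out[k] & ~1) | b  # clear the LSB, then set the message bit
    return out

def lsb_extract(pixels, n_bits):
    """Read the message back from the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]
```

The embedding changes each pixel by at most 1, so it is invisible; but any re-quantization or pixel perturbation of the carrier destroys the message, which is why such schemes cannot survive GAN-based reconstruction.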
Proposed Approach

We present the very first facial image tagging approach for DeepFake provenance. We give our motivation and summarize the qualifications of a desirable image tagging solution against DeepFake in Section 3.1. Then, we establish the image tagging pipeline in Section 3.2.
Motivation

Existing techniques against DeepFake aim at observing artifacts in synthesized images with various methods. However, these studies suffer from two issues: 1) they do not generalize to unknown GANs (Karras et al. 2020); 2) they are easily susceptible to adversarial attacks with intentionally added perturbations, or to simple image transformations (e.g., compression, Gaussian noise) (Qian et al. 2020; Carlini and Farid 2020). Thus, the existing artifact-based techniques are not prepared to tackle future emerging DeepFake threats.

Another straightforward idea for protecting facial images against DeepFake is to borrow ideas from privacy preservation to defend against DeepFakes proactively. Thus, we explore whether robust image tagging can serve as a safeguard for protecting the safety of facial images in social networks against DeepFake. Image tagging allows us to easily conduct DeepFake detection and provenance with the embedded message. Our image tagging is similar to digital watermarking, which is widely applied for protecting the copyright of digital multimedia (Ambadekar, Jain, and Khanapuri 2019), but ours faces the following challenges, which are vastly different from digital watermarking:

• Image tagging for DeepFake should be robust against GAN-based transformations, rather than only the simple image transformations considered in digital watermarking.

• The manipulated region in DeepFake is always unknown; in contrast, the corrupted region in copyright protection can sometimes be identified.

Inspired by the advances of deep learning in achieving end-to-end watermarking, we employ a DNN-based encoder and decoder, jointly trained to enforce that the embedded message survives various drastic GAN-based transformations. In the following subsections, we introduce the pipeline of our proposed image tagging for DeepFake.
Overview
Fig. 2 gives an overview of the overall architecture of our proposed DeepTag. Our method includes three key components: a DNN-based encoder F_enc, a GAN simulator G_sim, and a DNN-based decoder F_dec. The encoder and decoder are inspired by the previous work StegaStamp (Tancik, Mildenhall, and Ng 2020). The functionalities of each component are as follows.

Figure 2: Training pipeline for DeepTag.

• The encoder F_enc embeds a message (usually a UID) into a facial image and ensures that the embedded message is invisible to human eyes. In other words, the encoded image needs to be perceptually similar to the input image.

• The GAN simulator G_sim performs various GAN-based transformations, including entirely synthesizing the encoded facial images and editing the attributes of encoded facial images.

• The decoder F_dec recovers the embedded message from the encoded facial images after drastic GAN-based transformations. The recovered UID is further used for identity verification.

Image tagging encoder-decoder training
The DNN-based encoder and decoder are jointly trained to embed messages into the given input facial images. The encoder allows an arbitrary message to be imperceptibly embedded into an arbitrary facial image. The decoder is trained to retrieve the embedded message even after drastic GAN-based manipulation. Here, the embedded message is an n-bit UID, but it can easily be extended to arbitrary binary strings.

Specifically, the encoder F_enc receives a facial image i and a message w as input, and outputs a tagged facial image ĩ via the mapping F_enc(i, w) → ĩ. The input facial image i needs to be perceptually similar to the encoded facial image ĩ, i.e., i ≈ ĩ. The encoded facial image may be manipulated by a GAN, G_sim(ĩ) → î. The decoder then tries to recover the embedded message, F_dec(î) → w̃ or F_dec(ĩ) → w̃, where w̃ ≈ w.

To improve the capability of our decoder to recover the embedded message from the encoded images, we need to explore where and when to embed the message. In this paper, we embed the message in less-manipulated regions and at a late embedding level in the encoder. In manipulating faces with GANs, the faces undergo entire synthesis or partial synthesis, so we employ masks to enforce that more of the message is embedded in the regions manipulated less by GANs. The embedding level indicates at which layer of the encoder the message is injected. A message embedded late is less corrupted than one embedded early, since an early-embedded message is processed by more layers.

Type | Manipulation | GAN
Entire synthesis | full | StyleGAN
Entire synthesis | identity swap | ZAO
Partial synthesis | facial attributes (e.g., eyeglasses, gender) | StarGAN
Partial synthesis | facial expression (e.g., smile, scared) | STGAN

Table 1: GANs adopted in creating DeepFakes. The column Type indicates the fake type, either entire synthesis or partial synthesis. The column Manipulation indicates the facial region that will be manipulated. The column GAN lists typical GANs. ZAO is an app for face manipulation, but its technical details are unknown to us.
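The encode → simulate → decode composition F_dec(G_sim(F_enc(i, w))) can be sketched with toy stand-ins. These three callables are our illustrative substitutes, not the actual DNNs: the toy encoder hides each bit in a fractional pixel offset, the toy simulator applies a transformation that preserves those fractions, and the toy decoder reads them back.

```python
# Toy stand-ins for the three DeepTag components (illustrative only).

def f_enc(image, message):
    """Tag each pixel with one message bit as a small fractional offset."""
    return [px + 0.25 * bit for px, bit in zip(image, message)]

def g_sim(tagged):
    """Toy 'GAN' manipulation: shift brightness, keeping fractions intact."""
    return [px + 10 for px in tagged]

def f_dec(image):
    """Recover each bit from the fractional part of the pixel value."""
    return [1 if (px % 1) >= 0.125 else 0 for px in image]

image = [100, 101, 102, 103]
message = [1, 0, 1, 1]
recovered = f_dec(g_sim(f_enc(image, message)))  # w̃ ≈ w
```

In the real system, training the encoder and decoder jointly through the simulator is what forces the tag to survive manipulations far more destructive than this toy brightness shift.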
GAN-based manipulation
DeepFake involves facial image manipulation with various GANs. Existing GANs can be classified into entire synthesis and partial synthesis; see Tab. 1 for details. Our encoded facial images will be corrupted by these GAN-based manipulations. Thus, a GAN simulator performing the two typical manipulations connects our encoder and decoder, enforcing that the decoder learns to recover the message after drastic GAN-based manipulations.
Losses
To train the encoder and decoder jointly, we use a series of losses. In particular, we adopt the losses defined in StegaStamp (Tancik, Mildenhall, and Ng 2020): an L2 residual regularization L_R, the LPIPS perceptual loss (Zhang et al. 2018) L_P, and a critic loss L_C, all calculated between the encoded image and the input image. We use a cross-entropy loss L_M for the message. The training loss is calculated as follows:

L = λ_R L_R + λ_P L_P + λ_C L_C + λ_M L_M    (1)

where λ_R, λ_P, and λ_C are initially set to zero while the decoder trains to high accuracy, after which they are increased linearly.
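The weighting schedule of Eq. (1) can be sketched as follows. The ramp step counts and target weights here are illustrative values of ours, not the paper's settings; only the structure (message loss active from the start, image-similarity weights ramped linearly from zero) follows the text.

```python
# Sketch of the Eq. (1) loss schedule: lambda_R, lambda_P, lambda_C start
# at zero and ramp up linearly once the decoder is accurate, while the
# message weight lambda_M stays fixed. All numeric values are assumptions.

def loss_weights(step, ramp_start, ramp_steps, targets):
    """Return (lambda_R, lambda_P, lambda_C) linearly ramped from zero."""
    if step < ramp_start:
        frac = 0.0
    else:
        frac = min(1.0, (step - ramp_start) / ramp_steps)
    return tuple(frac * t for t in targets)

def total_loss(l_r, l_p, l_c, l_m, step, lambda_m=1.0,
               ramp_start=1000, ramp_steps=5000, targets=(1.5, 1.0, 0.5)):
    """Weighted sum L = lr*L_R + lp*L_P + lc*L_C + lm*L_M at a given step."""
    lr_w, lp_w, lc_w = loss_weights(step, ramp_start, ramp_steps, targets)
    return lr_w * l_r + lp_w * l_p + lc_w * l_c + lambda_m * l_m
```

Before the ramp starts, the total loss reduces to the message term alone, letting the decoder reach high accuracy before image quality is enforced.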
Experiments

In this section, we conduct experiments to evaluate the effectiveness of our proposed DeepTag in recovering the embedded message after drastic GAN-based manipulation. Specifically, we evaluate the effectiveness of DeepTag against three typical GANs covering entire and partial synthesis, and its robustness against image perturbations. Furthermore, we explore its performance with different lengths of embedded messages and the impact of the embedding level.
GANs
In our experiments, we employ three GANs, i.e., StarGAN (Choi et al. 2018), STGAN (Liu et al. 2019), and StyleGAN (Karras, Laine, and Aila 2019), since they achieve state-of-the-art performance in face manipulation. StyleGAN can reconstruct a given face and generate a new face. Both StarGAN and STGAN perform partial synthesis, including facial attribute editing (e.g., wearing eyeglasses, changing hair color) and expression manipulation (e.g., smile, scared).
Dataset
We employ CelebA-HQ (Karras et al. 2017), a public face dataset consisting of 30,000 facial images, available at several different resolutions. In our experiments, we explore the effectiveness of DeepTag on facial images of different input sizes.

Metrics
To evaluate the performance of DeepTag quantitatively, we use accuracy to measure message recovery after GAN-based manipulations. Here, accuracy denotes the full message retrieval rate (FMRR). Furthermore, PSNR and SSIM are adopted for calculating the similarity between the input facial images and those encoded with DeepTag.
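The FMRR metric counts a message as recovered only when every bit matches, and reports the fraction of messages recovered in full; a minimal sketch:

```python
# Full message retrieval rate (FMRR): the fraction of messages whose
# every bit was recovered exactly.

def fmrr(embedded, recovered):
    """embedded, recovered: lists of bit lists, one per tagged image."""
    assert len(embedded) == len(recovered)
    exact = sum(1 for e, r in zip(embedded, recovered) if e == r)
    return exact / len(embedded)
```

Note that FMRR is stricter than per-bit accuracy: a single flipped bit voids the whole message, which matters because the UID must match exactly for identity verification.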
Encoder
Our encoder is trained to embed messages into carrier images while preserving perceptual similarity to the input carrier. We use a U-Net (Ronneberger, Fischer, and Brox 2015) style architecture that receives the input carrier image and outputs an encoded three-channel image. In our experiments, we explore carrier images of different sizes and embedded messages of different lengths. Furthermore, the message can be embedded at different levels of the encoder to achieve better message recovery in the decoder.

Decoder
Our decoder is trained to retrieve the embedded message from the encoded images output by our encoder. The decoder consists of seven convolutional layers followed by one dense layer, and finally outputs the decoded message through a sigmoid activation function. The decoded message has the same size as the embedded message.

GAN simulator
We employ three GANs that have achieved state-of-the-art performance in their fields, i.e., StarGAN, STGAN, and StyleGAN. Together they cover the two typical GAN manipulations in creating DeepFakes: entire synthesis and partial synthesis.
Encoder and decoder training
The encoder and decoder are jointly trained with randomly generated messages. The input images are collected from the public dataset CelebA-HQ. In training, we use input facial images of different sizes to explore the performance of DeepTag on faces of different sizes.

In this section, we mainly explore the effectiveness of our proposed DeepTag in recovering the embedded messages under manipulation by different GANs. Three GANs are adopted for evaluation, namely StyleGAN for entire synthesis, and STGAN and StarGAN for partial synthesis. The message length is fixed, and late embedding is used.

Table 2: Performance (FMRR) of DeepTag on STGAN for several image sizes. The facial attributes (bald, mustache, eyeglasses, pale skin) indicate which attribute of the encoded images is manipulated. Manipulating the skin color is the most drastic.
Table 3: Performance (FMRR) of DeepTag on StarGAN for several image sizes. The facial attributes (blond hair, gender, angry, happy) indicate which attribute of the encoded images is manipulated. Changing the hair color to blond is more moderate than the three other attributes.
Tab. 2 summarizes the performance of DeepTag under attribute manipulation with STGAN. In the experiments, the manipulated attributes include removing hair (bald), adding a mustache, wearing eyeglasses, and changing to pale skin. Experimental results show that DeepTag performs well on the first three attribute manipulations but is susceptible to skin color changes. The main reason is that the manipulated region is larger and the intensity more drastic than for the others. We also observe that the input image size has a positive impact on performance: larger images provide more space for embedding the message, which can then survive GAN-based manipulation more easily.

Tab. 3 presents the performance of DeepTag against StarGAN. The manipulated attributes include turning the hair color blond, changing the gender, and altering facial expressions (angry & happy). Among these four manipulations, the hair color change is the most moderate, involving the smallest manipulated region. Experimental results show that DeepTag achieves an average accuracy of 90.8% on the hair color manipulation. However, DeepTag gives an accuracy of 78.1% on the gender manipulation, which involves modifying the whole skin. For the facial expression manipulations, DeepTag reaches an accuracy of nearly 85% on the two common expressions. Comparing the performance on STGAN and StarGAN, we notice that DeepTag performs better on STGAN, due to the fewer artifacts introduced by STGAN.

StyleGAN represents entire synthesis, receiving an input image and reconstructing it with fewer observable artifacts. In our experiments, we evaluate the performance on large-resolution images with a pretrained model provided by StyleGAN. Experimental results show that DeepTag achieves an accuracy of more than 95.1% on large-resolution facial images.
It would be interesting to explore the performance of StyleGAN at low resolutions, but training StyleGAN for entire synthesis is extremely time-consuming and computing resource-intensive. (Pretrained model: https://github.com/Puzer/stylegan-encoder)

According to the experimental results of DeepTag on the three GANs, we find that the input image size has a positive impact on message recovery, while the size of the manipulated region has a negative impact on message retrieval. Furthermore, advanced GANs that leave fewer artifacts in the synthesized images also reduce the negative impact on message retrieval. By the same token, such advanced GANs could be employed for creating realistic DeepFakes in the future.
Capacity is an important factor for measuring the capability of DeepTag in embedding messages. A large capacity indicates that the carrier can contain more information, which in our work corresponds to a larger number of representable UIDs. Thus, we explore the impact of message length on the performance of DeepTag in recovering messages.

Fig. 3 shows the relation between the message-recovery accuracy of DeepTag and the message length on the three GANs. For STGAN and StarGAN, the input image size is 256 × 256, the most common size for sharing images on social networks. We select the bald attribute for STGAN and the blond hair attribute for StarGAN; these two attributes involve smaller manipulated regions, which better illustrates the effect of message length.

Experimental results show that the message length has a negative impact on the performance of DeepTag in recovering the embedded message. DeepTag achieves an accuracy of more than 95% on the three GANs for the shortest messages tested, but the accuracy falls below 70% for the longest. Since an n-bit message can represent 2^n distinct UIDs, even a moderate message length yields billions of UIDs, which is enough for a social media platform to assign each user a unique UID.
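The capacity arithmetic behind the UID claim is simply 2^n identifiers for an n-bit message; for instance, 32 bits already cover more than four billion distinct UIDs.

```python
# An n-bit message addresses 2**n distinct UIDs.

def uid_capacity(n_bits):
    """Number of distinct UIDs representable by an n-bit message."""
    return 2 ** n_bits
```

This is why short messages suffice in practice: recovery accuracy degrades with message length, but even modest lengths exceed the user count of any social media platform.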
Figure 3: Performance of DeepTag for different message lengths.
In DeepTag, the message can be embedded at different levels of the encoder; we therefore explore raw embedding, early embedding, and late embedding. Raw embedding feeds the message together with the carrier as input to the encoder, while early and late embedding inject the message into a front layer or a later layer of the encoder, respectively (see Fig. 2 for details).

Tab. 4 presents the performance of DeepTag for the three embedding levels. The input image size is 256 × 256, and STGAN is adopted for its performance in partial synthesis. Experimental results show that late embedding outperforms both raw and early embedding. Furthermore, early embedding beats raw embedding in most cases, the exception being the eyeglasses attribute manipulated by STGAN. The results in Tab. 4 indicate that the embedded message becomes more easily corrupted the more encoder layers it passes through.

Embedding | bald | mustache | eyeglasses | pale skin
Raw | 0.880 | 0.898 | 0.903 | 0.861
Early | 0.891 | 0.911 | 0.901 | 0.869
Late | 0.920 | 0.931 | 0.913 | 0.893

Table 4: Performance (FMRR) of DeepTag with three different embedding levels. STGAN is adopted for evaluation and the input image size is 256 × 256.

In DeepTag, the encoder outputs an encoded image carrying the embedded message. Ideally, the encoded image should be perceptually similar to the input image. We use two metrics, PSNR and SSIM, to measure the distance between the encoded image and the input. The results in Tab. 5 show that our encoded images maintain high visual quality. Furthermore, StyleGAN achieves the best quality among the three GANs, since entire synthesis exhibits fewer artifacts.

Table 5: Image quality of the encoded images versus the inputs, measured by PSNR and SSIM on the three GANs (STGAN, StarGAN, StyleGAN); higher is better for both metrics.
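For reference, PSNR between an input image and its encoded version is 10 log10(MAX² / MSE). A pure-Python sketch of the computation (SSIM, which additionally models luminance, contrast, and structure, is usually taken from a library such as scikit-image):

```python
import math

def psnr(original, encoded, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-size images,
    given as flat lists of pixel intensities in [0, max_val]."""
    assert len(original) == len(encoded)
    mse = sum((a - b) ** 2 for a, b in zip(original, encoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# An encoding that deviates by only 1 intensity level per pixel scores
# above 48 dB, i.e., the embedding is visually indistinguishable:
img = [100, 120, 140, 160]
enc = [101, 121, 141, 161]
print(round(psnr(img, enc), 2))  # 48.13
```

Typical well-hidden watermarks land in the 35–50 dB range, which is consistent with the "high visual quality" claim above.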
In creating DeepFake videos, the manipulated images are further processed by numerous image perturbations such as compression, resizing, etc. In this section, we evaluate the robustness of DeepTag against these perturbations, which commonly appear in video production.

Fig. 4 presents the robustness evaluation of DeepTag on the three GANs. In the experiments, we employ two perturbations that widely appear in creating DeepFake videos, namely compression and resizing. For STGAN, we select the bald attribute for manipulation; for StarGAN, the blond hair attribute. StyleGAN involves entire synthesis. The input image size for STGAN and StarGAN is 256 × 256, while that for StyleGAN is × .

Figure 4: Robustness evaluation with compression and resizing degradations. (a) Compression: accuracy vs. compression quality; (b) Resizing: accuracy vs. scale factor; curves: STGAN, StarGAN, StyleGAN.

Experimental results show that the accuracy of message recovery decreases as the compression quality increases, yet DeepTag still achieves more than 80% accuracy even at a compression quality of 80%. A higher compression quality implies that more of the message is discarded. Against the resizing perturbation, DeepTag behaves similarly: the scale factor has a positive impact on accuracy, since a small scale factor corresponds to heavy resizing and thus degrades recovery. In realistic scenarios, the compression quality and the scale factor rarely reach such extreme boundary values, so DeepTag is, to a good extent, robust enough against perturbations for real applications.

Our work pioneers the use of image tagging for defending against DeepFakes proactively. In the performance evaluation, we consider the strictest case, in which all bits must be fully recovered. DeepTag would enjoy even broader applicability, robustness, and resilience if partial errors could be tolerated during retrieval, or if redundancy-coding techniques were applied when embedding the message.
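The redundancy idea mentioned above can be illustrated with the simplest error-correcting scheme, a repetition code with majority-vote decoding. This is our own illustrative sketch, not DeepTag's actual encoder; the random bit-flip channel is a crude stand-in for GAN transformation plus compression/resizing:

```python
import random

def embed_with_repetition(bits, r=5):
    """Repeat each message bit r times before embedding (r should be odd)."""
    return [b for b in bits for _ in range(r)]

def recover_with_majority(coded, r=5):
    """Recover each bit by majority vote over its r noisy copies."""
    return [int(sum(coded[i:i + r]) > r // 2) for i in range(0, len(coded), r)]

def corrupt(bits, flip_prob, rng):
    """Model GAN transformation + perturbations as random bit flips."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]

rng = random.Random(0)
msg = [rng.randint(0, 1) for _ in range(64)]

# Without redundancy a 10% flip rate hits the message directly; with
# 5x repetition, majority voting recovers it far more reliably.
plain = corrupt(msg, 0.10, rng)
decoded = recover_with_majority(corrupt(embed_with_repetition(msg, 5), 0.10, rng), 5)

acc_plain = sum(a == b for a, b in zip(msg, plain)) / len(msg)
acc_coded = sum(a == b for a, b in zip(msg, decoded)) / len(msg)
print(acc_plain, acc_coded)
```

With a per-bit flip probability of 0.1, a 5x repetition code drives the residual per-bit error below 1% (the probability of 3 or more flips among 5 copies), at the cost of a 5x larger embedding payload.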
In this paper, we proposed DeepTag, which embeds messages into images for DeepFake provenance. To the best of our knowledge, this is the first work that offers a new insight into fighting DeepFakes from the perspective of privacy preservation, aiming to defend against DeepFakes proactively. Experiments on three typical GANs, covering both entire synthesis and partial synthesis, demonstrate the effectiveness of our method in embedding watermarks into facial images and recovering them after drastic GAN-based transformations.

With the rapid development of AI techniques, nobody can foresee future advances in producing DeepFakes. We can be confident that DeepFakes will become more and more realistic and that anyone could fall victim; in this AI era, we live in a world where we can no longer simply believe our eyes. Detecting DeepFakes by observing artifacts in synthesized images is clearly insufficient to protect us against this AI risk. Our work offers a new insight into fighting DeepFakes proactively, instead of relying on artifact observation and domain knowledge about synthesized images, which can easily become invalid on unseen GANs. In future work, the community needs to develop more powerful defense strategies that consider how to protect images from DeepFake threats in the first place.

Another orthogonal research direction is to investigate the interplay between the DeepTag-based provenance technique and state-of-the-art DeepFake detection methods (Qi et al. 2020; Wang et al. 2020b; Huang et al. 2020d), as well as methods that make DeepFakes more detection-evasive (Huang et al. 2020c,a). The effectiveness of DeepTag under various adversarial perturbations, especially those that are not purely based on additive noise (Gao et al. 2020; Cheng et al. 2020b; Tian et al. 2020; Zhai et al. 2020; Guo et al. 2020b; Wang et al. 2020a; Guo et al. 2020a; Cheng et al. 2020a), is also worth careful study.
References
Ambadekar, S. P.; Jain, J.; and Khanapuri, J. 2019. Digital image watermarking through encryption and DWT for copyright protection. In Recent Trends in Signal and Image Processing, 187–195. Springer.

Bamatraf, A.; Ibrahim, R.; and Salleh, M. N. B. M. 2010. Digital watermarking algorithm using LSB. In , 155–159. IEEE.

Carlini, N.; and Farid, H. 2020. Evading Deepfake-Image Detectors with White- and Black-Box Attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 658–659.

Cheng, Y.; Guo, Q.; Juefei-Xu, F.; Xie, X.; Lin, S.-W.; Lin, W.; Feng, W.; and Liu, Y. 2020a. Pasadena: Perceptually Aware and Stealthy Adversarial Denoise Attack. arXiv preprint arXiv:2007.07097.

Cheng, Y.; Juefei-Xu, F.; Guo, Q.; Fu, H.; Xie, X.; Lin, S.-W.; Lin, W.; and Liu, Y. 2020b. Adversarial Exposure Attack on Diabetic Retinopathy Imagery. arXiv preprint.

Choi, Y.; Choi, M.; Kim, M.; Ha, J.-W.; Kim, S.; and Choo, J. 2018. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 8789–8797.

Choi, Y.; Uh, Y.; Yoo, J.; and Ha, J.-W. 2020. StarGAN v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8188–8197.

Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; and Jain, A. K. 2020. On the detection of digital face manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5781–5790.

Frank, J.; Eisenhofer, T.; Schönherr, L.; Fischer, A.; Kolossa, D.; and Holz, T. 2020. Leveraging Frequency Analysis for Deep Fake Image Recognition. arXiv preprint arXiv:2003.08685.

Gao, R.; Guo, Q.; Juefei-Xu, F.; Yu, H.; Ren, X.; Feng, W.; and Wang, S. 2020. Making Images Undiscoverable from Co-Saliency Detection. arXiv preprint.

Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, 2672–2680.

Guo, Q.; Juefei-Xu, F.; Xie, X.; Ma, L.; Wang, J.; Feng, W.; and Liu, Y. 2020a. ABBA: Saliency-Regularized Motion-Based Adversarial Blur Attack. arXiv preprint arXiv:2002.03500.

Guo, Q.; Xie, X.; Juefei-Xu, F.; Ma, L.; Li, Z.; Xue, W.; Feng, W.; and Liu, Y. 2020b. SPARK: Spatial-aware online incremental attack against visual tracking. In Proceedings of the European Conference on Computer Vision (ECCV).

He, Z.; Zuo, W.; Kan, M.; Shan, S.; and Chen, X. 2019. AttGAN: Facial attribute editing by only changing what you want. IEEE Transactions on Image Processing.

Huang, Y.; Juefei-Xu, F.; Wang, R.; Guo, Q.; Ma, L.; Xie, X.; Li, J.; Miao, W.; Liu, Y.; and Pu, G. 2020b. FakePolisher: Making DeepFakes More Detection-Evasive by Shallow Reconstruction. arXiv preprint arXiv:2006.07533.

Huang, Y.; Juefei-Xu, F.; Wang, R.; Guo, Q.; Ma, L.; Xie, X.; Li, J.; Miao, W.; Liu, Y.; and Pu, G. 2020c. FakePolisher: Making DeepFakes More Detection-Evasive by Shallow Reconstruction. In Proceedings of the ACM International Conference on Multimedia (ACM MM).

Huang, Y.; Juefei-Xu, F.; Wang, R.; Guo, Q.; Xie, X.; Ma, L.; Li, J.; Miao, W.; Liu, Y.; and Pu, G. 2020d. FakeLocator: Robust Localization of GAN-Based Face Manipulations. arXiv preprint arXiv:2001.09598.

Jiansheng, M.; Sukang, L.; and Xiaomei, T. 2009. A digital watermarking algorithm based on DCT and DWT. In Proceedings of the 2009 International Symposium on Web Information Systems and Applications (WISA 2009), 104. Citeseer.

Karras, T.; Aila, T.; Laine, S.; and Lehtinen, J. 2017. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196.

Karras, T.; Laine, S.; and Aila, T. 2019. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4401–4410.

Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; and Aila, T. 2020. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8110–8119.

Katzenbeisser, S.; and Petitcolas, F. 2000. Digital watermarking. Artech House, London.

Li, Y.; Chang, M.-C.; and Lyu, S. 2018. In ictu oculi: Exposing AI created fake videos by detecting eye blinking. In , 1–7. IEEE.

Liu, M.; Ding, Y.; Xia, M.; Liu, X.; Ding, E.; Zuo, W.; and Wen, S. 2019. STGAN: A unified selective transfer network for arbitrary image attribute editing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3673–3682.

Luo, X.; Zhan, R.; Chang, H.; Yang, F.; and Milanfar, P. 2020. Distortion Agnostic Deep Watermarking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13548–13557.

Mirsky, Y.; and Lee, W. 2020. The Creation and Detection of Deepfakes: A Survey. arXiv preprint arXiv:2004.11138.

Oord, A. v. d.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.; and Kavukcuoglu, K. 2016. WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

Podilchuk, C. I.; and Delp, E. J. 2001. Digital watermarking: algorithms and applications. IEEE Signal Processing Magazine.

Qi, H.; Guo, Q.; Juefei-Xu, F.; Xie, X.; Ma, L.; Feng, W.; Liu, Y.; and Zhao, J. 2020. DeepRhythm: Exposing DeepFakes with Attentional Visual Heartbeat Rhythms. In Proceedings of the ACM International Conference on Multimedia (ACM MM).

Qian, Y.; Yin, G.; Sheng, L.; Chen, Z.; and Shao, J. 2020. Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues. arXiv preprint arXiv:2007.09355.

Ronneberger, O.; Fischer, P.; and Brox, T. 2015. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 234–241. Springer.

Siddaraju, P. M.; Jayadevappa, D.; and Ezhilarasan, K. 2015. Digital image watermarking techniques: a review. Int. J. Comput. Sci. Secur., 497–501.

Tancik, M.; Mildenhall, B.; and Ng, R. 2020. StegaStamp: Invisible hyperlinks in physical photographs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2117–2126.

Tian, B.; Guo, Q.; Juefei-Xu, F.; Chan, W.; Cheng, Y.; Li, X.; Xie, X.; and Qin, S. 2020. Bias Field Poses a Threat to DNN-based X-Ray Recognition. arXiv preprint.

Tolosana, R.; Vera-Rodriguez, R.; Fierrez, J.; Morales, A.; and Ortega-Garcia, J. 2020. Deepfakes and beyond: A survey of face manipulation and fake detection. arXiv preprint arXiv:2001.00179.

Wang, R.; Juefei-Xu, F.; Guo, Q.; Huang, Y.; Xie, X.; Ma, L.; and Liu, Y. 2020a. Amora: Black-box Adversarial Morphing Attack. In Proceedings of the ACM International Conference on Multimedia (ACM MM).

Wang, R.; Juefei-Xu, F.; Huang, Y.; Guo, Q.; Xie, X.; Ma, L.; and Liu, Y. 2020b. DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices. In Proceedings of the ACM International Conference on Multimedia (ACM MM).

Wang, R.; Juefei-Xu, F.; Ma, L.; Xie, X.; Huang, Y.; Wang, J.; and Liu, Y. 2020c. FakeSpotter: A Simple yet Robust Baseline for Spotting AI-Synthesized Fake Faces. In International Joint Conference on Artificial Intelligence (IJCAI).

Wang, S.-Y.; Wang, O.; Zhang, R.; Owens, A.; and Efros, A. A. 2020d. CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 7.

Yang, X.; Li, Y.; and Lyu, S. 2019. Exposing deep fakes using inconsistent head poses. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8261–8265. IEEE.

Yavuz, E.; and Telatar, Z. 2007. Improved SVD-DWT based digital image watermarking against watermark ambiguity. In Proceedings of the 2007 ACM Symposium on Applied Computing, 1051–1055.

Zhai, L.; Juefei-Xu, F.; Guo, Q.; Xie, X.; Ma, L.; Feng, W.; Qin, S.; and Liu, Y. 2020. It's Raining Cats or Dogs? Adversarial Rain Attack on DNN Perception. arXiv preprint.

Zhang, R.; Isola, P.; Efros, A. A.; Shechtman, E.; and Wang, O. 2018. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 586–595.

Zhang, X.; Karaman, S.; and Chang, S.-F. 2019. Detecting and simulating artifacts in GAN fake images. In , 1–6. IEEE.

Zhu, J.; Kaplan, R.; Johnson, J.; and Fei-Fei, L. 2018. HiDDeN: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV), 657–672.

Zhu, J.-Y.; Park, T.; Isola, P.; and Efros, A. A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, 2223–2232.