Fake-image detection with Robust Hashing

Miki Tanaka
Tokyo Metropolitan University
Tokyo, [email protected]

Hitoshi Kiya
Tokyo Metropolitan University
Tokyo, [email protected]
Abstract—In this paper, we investigate, for the first time, whether robust hashing can robustly detect fake images even when multiple manipulation techniques, such as JPEG compression, are applied to the images. In an experiment, the proposed fake-image detection with robust hashing is demonstrated to outperform a state-of-the-art method on various datasets, including fake images generated with GANs.
Index Terms—fake images, GAN
I. INTRODUCTION
Recent rapid advances in image manipulation tools and deep image synthesis techniques, such as generative adversarial networks (GANs), have made it easy to generate fake images. In addition, with the spread of social networking services (SNS), fake images have become a major threat to the credibility of the international community. Accordingly, detecting manipulated images has become an urgent issue [1]. Most forgery detection methods assume that images are generated by using a specific manipulation technique, and they aim to detect unique features caused by that technique, such as checkerboard artifacts [2]–[5]. Tampered images are usually uploaded to SNS and image sharing services, and SNS providers are known to process uploaded images by resizing them or compressing them into JPEG format [6]–[9]. Such processing may damage or destroy the unique features of fake images. However, the influence of manipulations on images has not been discussed sufficiently for the case in which a number of manipulation techniques, such as JPEG compression, are applied at the same time. In this paper, we investigate the applicability of robust hashing, which was originally proposed for image retrieval, to fake-image detection, and the proposed method with robust hashing is demonstrated to have a high fake-detection accuracy even when multiple manipulation techniques are applied.

II. RELATED WORK
A. Fake-image generation
Fake images are manually generated by using image editing tools such as Photoshop. Splicing, copy-move, and deletion are carried out with such tools; similarly, resizing, rotating, blurring, and changing the color of an image can be done manually. In addition, recent rapid advances in deep image synthesis techniques such as GANs have enabled fake images to be generated automatically. CycleGAN [10] and StarGAN [11] are typical image synthesis techniques with GANs. CycleGAN is a GAN that performs one-to-one transformations, e.g., changing apples to oranges, while StarGAN is a GAN that performs many-to-many transformations, such as changing a person's facial expression or hair color (see Figs. 1 and 3). Furthermore, fake videos created by using deep learning are called Deepfakes, and various tampering methods have emerged, such as those using autoencoders, Face2Face [12], and FaceSwap [13].
Fig. 1. Example fake images generated with CycleGAN
Real-world fake images may include the influence of a number of manipulation techniques, such as image compression, resizing, and copy-move, at the same time, even if the fake images were generated by using GANs. Therefore, such conditions have to be considered when detecting real-world fake images.
B. Fake detection methods
Image tampering has a longer history than deep learning. Fragile watermarking [14], detection of double JPEG compression with statistical methods [15], [16], and the use of PRNU (photo-response non-uniformity) patterns of individual cameras [17], [18] have been proposed to detect such tampering. However, most of these methods are not designed to detect fake images generated with GANs. Moreover, they cannot distinguish fake images from merely manipulated ones, such as resized images, which are not fake images in general. With the development of deep learning, fake detection methods with deep learning have been studied. These methods do not employ a reference image, or the features of a reference image, to detect tampered ones. They also assume that images are generated by using a specific manipulation technique, in order to detect the unique features caused by that technique. Several detection methods with deep learning have been proposed for detecting fake images generated with an image editing tool such as Photoshop. Some of them focus on detecting the boundary between tampered regions and the original image [19]–[21]. Besides, a detection method [22] enables us to train a model without tampered images. Most detection methods with deep learning have been proposed to detect fake images generated by using GANs. An image classifier trained only with ProGAN was shown to be effective in detecting images generated by other GAN models [23]. Various studies have focused on detecting checkerboard artifacts caused in two processes: the forward propagation of upsampling layers and the backpropagation of convolutional layers [24]. In that work, the spectrum of an image is used as an input in order to capture the checkerboard artifacts. To detect fake videos, called Deepfakes, a number of detection methods have also been investigated.
Some methods attempt to detect failures in the generation of fake videos, in terms of poorly generated eyes and teeth [25], the frequency of blinking [26], and the correctness of facial landmarks [27] or head posture [28]. However, all of these methods have been pointed out to have problems with robustness against differences between training datasets and test data [1]. In addition, the conventional methods have not considered robustness against combinations of manipulations, such as the combination of resizing and Deepfake generation.

III. PROPOSED METHOD WITH ROBUST HASHING
A. Overview
Figure 2 shows an overview of the proposed method. In the framework, a robust hash value is computed from each reference image by using a robust hash method and stored in a database. As with the reference images, a robust hash value is computed from a query image by using the same hash method. The hash value of the query is then compared with those stored in the database. Finally, the query image is judged to be real or fake in accordance with the distance between the two hash values.
Fig. 2. Overview of proposed method
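The flow in Fig. 2 can be sketched as follows. This is a minimal illustration only: the actual hash function of Li et al. [29] is not reproduced here, so the 120-bit hashes below are supplied by hand, and the threshold value d = 3 is the one used later in the experiments.

```python
def hamming(u, q):
    # Hamming distance between two equal-length bit strings:
    # the number of positions where the bits differ.
    assert len(u) == len(q)
    return sum(ui != qi for ui, qi in zip(u, q))

def judge_real(query_hash, database, d=3):
    # A query is judged real (i.e., a possibly manipulated version of an
    # enrolled reference image) iff its minimum Hamming distance to any
    # stored reference hash is below the threshold d; otherwise fake.
    return min(hamming(u, query_hash) for u in database) < d

# Enrollment: 120-bit robust hashes of reference (real) images.
database = [[0] * 120, [1] * 120]

# Query whose hash differs from the first reference in only 2 bits,
# e.g., a JPEG-compressed copy of that reference image.
query = [0] * 120
query[5] = query[17] = 1
print(judge_real(query, database))   # -> True (distance 2 < 3)

# Query far from every reference, e.g., a GAN-generated fake.
fake = [0, 1] * 60
print(judge_real(fake, database))    # -> False (distance 60 >= 3)
```

In practice, the database holds one 120-bit hash per reference image, so a linear scan with bitwise comparison suffices even for thousands of references.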
B. Fake detection with Robust Hashing
Various robust hashing methods have been proposed for retrieving images similar to a query one [29], [30]. In this paper, we apply the robust hashing method proposed by Li et al. [29] to fake-image detection. This robust hashing method enables images to be retrieved robustly and has the following properties:
• Resizing images to 128 × 128 pixels prior to feature extraction.
• Performing 5 ×
• Using rich features extracted from spatial and chromatic characteristics.
• Outputting a bit string with a length of 120 bits as a hash value.
In the method, similarity is evaluated in accordance with the Hamming distance between the hash string of a query image and that of each image in a database. Let vectors $\mathbf{u} = \{u_1, u_2, \ldots, u_n\}$ and $\mathbf{q} = \{q_1, q_2, \ldots, q_n\}$, $u_i, q_i \in \{0, 1\}$, be the hash strings of a reference image $U$ and a query image $Q$, respectively. The Hamming distance $d_H(\mathbf{u}, \mathbf{q})$ between $U$ and $Q$ is given by

$$d_H(\mathbf{u}, \mathbf{q}) \triangleq \sum_{i=1}^{n} \delta(u_i, q_i) \quad (1)$$

where

$$\delta(u_i, q_i) = \begin{cases} 0, & u_i = q_i \\ 1, & u_i \neq q_i. \end{cases} \quad (2)$$

To apply this similarity to fake-image detection, we introduce a threshold $d$ as follows:

$$\begin{cases} Q \in U', & \min_{\mathbf{u} \neq \mathbf{q},\, \mathbf{u} \in U} d_H(\mathbf{u}, \mathbf{q}) < d \\ Q \notin U', & \min_{\mathbf{u} \neq \mathbf{q},\, \mathbf{u} \in U} d_H(\mathbf{u}, \mathbf{q}) \geq d \end{cases} \quad (3)$$

where $U$ is a set of reference images and $U'$ is the set of images generated from $U$ with image manipulations, which does not include fake images. According to Eq. (3), $Q$ is judged to be a fake image or not.

IV. EXPERIMENT RESULTS
The proposed fake-image detection with robust hashing was experimentally evaluated in terms of accuracy and robustness against image manipulations.
A. Experiment setup
In the experiment, four fake-image datasets were used: Image Manipulation Dataset [31], UADFV [26], CycleGAN [10], and StarGAN [11]. The details of the datasets are shown in Table I (see Figs. 1 and 3). The datasets consist of pairs of a fake image and the original one. JPEG compression with a quantization parameter of Q_J = 80 was applied to all query images. d = 3 was selected as the threshold d in accordance with the EER (equal error rate) performance.
As one of the state-of-the-art fake detection methods, Wang's method [23] was compared with the proposed one. Wang's method was proposed for detecting images generated by using CNNs, including various GAN models, where a classifier is trained by using ProGAN.

TABLE I
DATASETS

dataset                          | fake-image generation | No. of images (real / fake)
Image Manipulation Dataset [31]  | copy-move             | 48 / 48
UADFV [26]                       | face swap             | 49 / 49
CycleGAN [10]                    | GAN                   | 1320 / 1320
StarGAN [11]                     | GAN                   | 1999 / 1999

Fig. 3. Example of datasets
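The threshold d = 3 above is selected at the EER operating point, i.e., where the false accept and false reject rates coincide. The paper does not give the underlying distance distributions, so the samples below are hypothetical; the sketch only illustrates how such a threshold can be picked from genuine (real-query) and impostor (fake-query) minimum Hamming distances.

```python
def eer_threshold(genuine, impostor):
    """Pick the integer threshold d minimising |FAR - FRR|.
    genuine:  minimum Hamming distances of real queries (should fall below d)
    impostor: minimum Hamming distances of fake queries (should be >= d)
    FRR: fraction of genuine distances >= d (real judged fake)
    FAR: fraction of impostor distances <  d (fake judged real)"""
    best_d, best_gap = None, float("inf")
    for d in range(1, 121):  # hash length is 120 bits
        frr = sum(g >= d for g in genuine) / len(genuine)
        far = sum(i < d for i in impostor) / len(impostor)
        if abs(far - frr) < best_gap:
            best_d, best_gap = d, abs(far - frr)
    return best_d

# Hypothetical distance samples for illustration only.
genuine = [0, 1, 1, 2, 2, 4]        # real queries: small distances
impostor = [2, 30, 45, 50, 60, 80]  # fake queries: large distances
print(eer_threshold(genuine, impostor))  # -> 3
```

With well-separated distributions, as in the experiments reported below, the EER point is close to zero and the choice of d is not very sensitive.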
The performance of fake-image detection was evaluated by using AP (average precision) and Accuracy (fake), given by

$$\mathrm{Accuracy(fake)} = \frac{N_{tn}}{N_{Qf}} \quad (4)$$

where $N_{Qf}$ is the number of fake query images, and $N_{tn}$ is the number of fake query images that are correctly judged as fake images.

B. Results without additional manipulation
Table II shows experimental results for the two detection methods. From the table, the proposed method had a higher performance than Wang's method in terms of both AP and Acc (fake). In addition, the performance of Wang's method heavily decreased on the Image Manipulation Dataset and UADFV. The reason is that Wang's method focuses on detecting fake images generated by using CNNs. The Image Manipulation Dataset does not consist of images generated with GANs. In addition, although UADFV consists of images generated by using Deepfake, they include the influence of video compression.
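The Acc (fake) values reported in the tables follow Eq. (4). As a trivial sketch, with a hypothetical set of per-query judgments:

```python
def accuracy_fake(judged_fake):
    # Eq. (4): Accuracy(fake) = N_tn / N_Qf, where N_Qf is the number of
    # fake query images and N_tn is the number of them correctly judged
    # fake. `judged_fake` holds one boolean per fake query image.
    return sum(judged_fake) / len(judged_fake)

# Hypothetical outcome: 8 fake queries, 7 of them correctly judged fake.
print(accuracy_fake([True] * 7 + [False]))  # -> 0.875
```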
TABLE II
COMPARISON WITH WANG'S METHOD

                               Wang's method [23]     proposed
Dataset                        AP      Acc (fake)     AP      Acc (fake)
Image Manipulation Dataset     0.5185  0.0000         0.9760  0.8750
UADFV                          0.5707  0.0000         0.8801  0.7083
CycleGAN                       0.9768  0.5939         1.0000  1.0000
StarGAN                        0.9594  0.5918         1.0000  1.0000
C. Results with additional manipulation
JPEG compression with Q_J = 70, resizing with a scale factor of 0.5, copy-move, or splicing was applied to query images. Therefore, when query images were fake, they included the effects of two manipulations at the same time.
Table III shows experimental results under the additional manipulation, where 50 fake images generated by using CycleGAN, in which horses were converted to zebras, were used (see Fig. 1). The proposed method was confirmed to maintain a high accuracy even under the additional manipulation. In contrast, Wang's method suffered from the influence of the additional manipulation; in particular, it was strongly affected by splicing and resizing. This is because the method assumes that fake images are generated by using CNNs, so as to detect the unique features caused by CNNs. However, splicing and resizing do not depend on CNNs, although CycleGAN includes CNNs.

TABLE III
COMPARISON WITH WANG'S METHOD UNDER ADDITIONAL MANIPULATION (DATASET: CYCLEGAN)

                          Wang's method [23]     proposed
additional manipulation   AP      Acc (fake)     AP      Acc (fake)
None                      0.9833  0.6200         0.9941  1.0000
JPEG (Q_J = 70)           0.9670  0.6000         0.9922  0.9800
resize (0.5)              0.8264  0.2400         0.9793  1.0000
copy-move                 0.9781  0.6000         1.0000  1.0000
splicing                  0.9666  0.4800         0.9992  1.0000

V. CONCLUSION
In this paper, we proposed, for the first time, a fake-image detection method with robust hashing. Although various robust hashing methods have been proposed for retrieving images similar to a query one, here the robust hashing method proposed by Li et al. [29] was applied to various datasets including fake images generated with GANs. In the experiment, the proposed method was demonstrated not only to outperform a state-of-the-art method but also to be robust against combinations of image manipulations.
REFERENCES
[1] L. Verdoliva, "Media forensics and deepfakes: An overview," IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 5, pp. 910–932, 2020.
[2] Y. Sugawara, S. Shiota, and H. Kiya, "Super-resolution using convolutional neural networks without any checkerboard artifacts," in Proc. of IEEE International Conference on Image Processing, 2018, pp. 66–70.
[3] Y. Sugawara, S. Shiota, and H. Kiya, "Checkerboard artifacts free convolutional neural networks," APSIPA Transactions on Signal and Information Processing, vol. 8, p. e9, 2019.
[4] Y. Kinoshita and H. Kiya, "Fixed smooth convolutional layer for avoiding checkerboard artifacts in CNNs," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, 2020, pp. 3712–3716.
[5] T. Osakabe, M. Tanaka, Y. Kinoshita, and H. Kiya, "CycleGAN without checkerboard artifacts for counter-forensics of fake-image detection," arXiv preprint arXiv:2012.00287, 2020. [Online]. Available: https://arxiv.org/abs/2012.00287
[6] T. Chuman, K. Iida, W. Sirichotedumrong, and H. Kiya, "Image manipulation specifications on social networking services for encryption-then-compression systems," IEICE Transactions on Information and Systems, vol. E102.D, no. 1, pp. 11–18, 2019.
[7] T. Chuman, K. Kurihara, and H. Kiya, "Security evaluation for block scrambling-based ETC systems against extended jigsaw puzzle solver attacks," in Proc. of IEEE International Conference on Multimedia and Expo (ICME), 2017, pp. 229–234.
[8] W. Sirichotedumrong and H. Kiya, "Grayscale-based block scrambling image encryption using YCbCr color space for encryption-then-compression systems," APSIPA Transactions on Signal and Information Processing, vol. 8, p. e7, 2019.
[9] T. Chuman, W. Sirichotedumrong, and H. Kiya, "Encryption-then-compression systems using grayscale-based image encryption for JPEG images," IEEE Transactions on Information Forensics and Security, vol. 14, no. 6, pp. 1515–1525, 2019.
[10] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proc. of IEEE International Conference on Computer Vision, Oct 2017.
[11] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, "StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2018.
[12] J. Thies, M. Zollhofer, M. Stamminger, C. Theobalt, and M. Niessner, "Face2Face: Real-time face capture and reenactment of RGB videos," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2016.
[13] Y. Nirkin, I. Masi, A. Tran Tuan, T. Hassner, and G. Medioni, "On face segmentation, face swapping, and face perception," in Proc. of IEEE International Conference on Automatic Face Gesture Recognition, 2018, pp. 98–105.
[14] A. T. S. Ho, X. Zhu, J. Shen, and P. Marziliano, "Fragile watermarking based on encoding of the zeroes of the z-transform," IEEE Transactions on Information Forensics and Security, vol. 3, no. 3, pp. 567–569, 2008.
[15] G. Zhenzhen, N. Shaozhang, and H. Hongli, "Tamper detection method for clipped double JPEG compression image," in Proc. of International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015, pp. 185–188.
[16] T. Bianchi and A. Piva, "Detection of nonaligned double JPEG compression based on integer periodicity maps," IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 842–848, 2012.
[17] M. Chen, J. Fridrich, M. Goljan, and J. Lukas, "Determining image origin and integrity using sensor noise," IEEE Transactions on Information Forensics and Security, vol. 3, no. 1, pp. 74–90, 2008.
[18] G. Chierchia, G. Poggi, C. Sansone, and L. Verdoliva, "A Bayesian-MRF approach for PRNU-based image forgery detection," IEEE Transactions on Information Forensics and Security, vol. 9, no. 4, pp. 554–567, 2014.
[19] Y. Rao and J. Ni, "A deep learning approach to detection of splicing and copy-move forgeries in images," in Proc. of IEEE International Workshop on Information Forensics and Security, 2016, pp. 1–6.
[20] J. H. Bappy, A. K. Roy-Chowdhury, J. Bunk, L. Nataraj, and B. S. Manjunath, "Exploiting spatial structure for localizing manipulated image regions," in Proc. of IEEE International Conference on Computer Vision, Oct 2017.
[21] P. Zhou, X. Han, V. I. Morariu, and L. S. Davis, "Learning rich features for image manipulation detection," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2018.
[22] M. Huh, A. Liu, A. Owens, and A. A. Efros, "Fighting fake news: Image splice detection via learned self-consistency," in Proc. of European Conference on Computer Vision, September 2018.
[23] S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. A. Efros, "CNN-generated images are surprisingly easy to spot... for now," in Proc. of IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020.
[24] X. Zhang, S. Karaman, and S. Chang, "Detecting and simulating artifacts in GAN fake images," in Proc. of IEEE International Workshop on Information Forensics and Security, 2019, pp. 1–6.
[25] F. Matern, C. Riess, and M. Stamminger, "Exploiting visual artifacts to expose deepfakes and face manipulations," in Proc. of IEEE Winter Applications of Computer Vision Workshops, 2019, pp. 83–92.
[26] Y. Li, M. Chang, and S. Lyu, "In ictu oculi: Exposing AI created fake videos by detecting eye blinking," in Proc. of IEEE International Workshop on Information Forensics and Security, 2018, pp. 1–7.
[27] X. Yang, Y. Li, H. Qi, and S. Lyu, "Exposing GAN-synthesized faces using landmark locations," in Proc. of ACM Workshop on Information Hiding and Multimedia Security, 2019, pp. 113–118.
[28] X. Yang, Y. Li, and S. Lyu, "Exposing deep fakes using inconsistent head poses," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, 2019, pp. 8261–8265.
[29] Y. N. Li, P. Wang, and Y. T. Su, "Robust image hashing based on selective quaternion invariance," IEEE Signal Processing Letters, vol. 22, no. 12, pp. 2396–2400, 2015.
[30] K. Iida and H. Kiya, "Robust image identification with DC coefficients for double-compressed JPEG images,"