Adversarial Examples Detection beyond Image Space
Kejiang Chen*, Yuefeng Chen†, Hang Zhou*, Chuan Qin*, Xiaofeng Mao†, Weiming Zhang*, Nenghai Yu*
* University of Science and Technology of China    † Alibaba Group
ABSTRACT
Deep neural networks have been proved to be vulnerable to adversarial examples, which are generated by adding human-imperceptible perturbations to images. To defend against these adversarial examples, various detection-based methods have been proposed. However, most of them perform poorly on detecting adversarial examples with extremely slight perturbations. By exploring these adversarial examples, we find that there exists compliance between perturbations and prediction confidence, which guides us to detect few-perturbation attacks from the aspect of prediction confidence. To detect both few-perturbation attacks and large-perturbation attacks, we propose a method beyond image space built on a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts. The experimental results show that the proposed method outperforms existing methods under oblivious attacks and is also verified to be effective against omniscient attacks.
1. INTRODUCTION
Deep neural networks have been very successful in recognizing visual objects, and state-of-the-art neural networks even perform better than humans on large-scale image classification tasks [1]. However, their robustness has raised concerns, and recent research shows that they are fragile to adversarial perturbations [2, 3]. These adversarial examples are threatening if neural networks are deployed in crucial real applications, such as autonomous driving and identity recognition.

To address this, plenty of works have been proposed to defend against adversarial examples in DNNs, and they can be roughly categorized into 1) defense, which focuses on making the underlying model robust to adversarial examples, and 2) detection, which attempts to distinguish adversarial examples from innocent inputs [4]. Most defense methods [5, 6, 7] modify the target models, and the expensive retraining process makes them impractical for massive data classification. Generally, the accuracy will decrease, which is not acceptable for large tasks such as malicious image detection. Detection, in contrast, can be deployed as a bypass without affecting the original task, and it can also be used in conjunction with robust defenses.

Many detection methods have been proposed recently from different aspects, including prediction logits [8, 9], pixel artifacts [10, 11] and layer consistency [12, 13]. SRM [14] achieves state-of-the-art performance by detecting the artifacts from the aspect of steganalysis. However, SRM depends heavily on these artifacts. Recently, novel attacks such as Decoupled Direction and Norm (DDN) [15] and Elastic-net Attacks to DNNs (EAD) [16] deceive the classification model with few or exiguous perturbations, and the experiments show that these adversarial examples cannot be detected by SRM effectively.
This work was supported in part by the Natural Science Foundation of China under Grant U1636201, and by the Anhui Initiative in Quantum Information Technologies under Grant AHY150400. Contact mail: [email protected]
[Figure 1 panels: Clean / PGD / DDN examples, showing the adversarial perturbation, prediction logits, and confidence gradient.]
Fig. 1. The comparison between the clean image and adversarial images in terms of perturbation (zoomed 30 times), prediction logits and confidence gradient. The confidence of a slight-perturbation adversarial example (DDN) is low, and that of an adversarial example with many and large perturbations (PGD) is high, indicating that there exists compliance between perturbation and prediction confidence, which guides us to detect adversarial examples from pixel artifacts and gradient artifacts.

In this paper, we first analyze the prediction logits of clean examples and adversarial examples, and find that there exists compliance between perturbation and prediction confidence. Generally, the prediction confidence can be defined as the advantage of the rank-1 predicted logit over the rank-2 predicted logit. As shown in Figure 1, for a few-perturbation attack (DDN), the prediction confidence is low; for a large-perturbation attack (PGD), the prediction confidence is high. In short, the stronger the perturbation, the higher the prediction confidence. This phenomenon indicates that the prediction confidence can be used for detecting few-perturbation attacks.

Inspired by LID and MAD, we further propose the confidence gradient to gather more discriminative information from the classification model. The confidence loss is defined as the cross-entropy between the predicted logits and their one-hot version, representing prediction confidence. Afterwards, the confidence gradient is computed by back-propagation, which includes the information of both the prediction confidence and the classification model.

To detect few-perturbation attacks as well as large-perturbation attacks, we propose a novel adversarial example detection method that exploits both pixel artifacts and confidence artifacts (abbreviated as PACA). The method uses a two-stream framework, where the image stream is used to capture pixel artifacts and the gradient stream is used to catch gradient artifacts.

Fig. 2. The distribution of the prediction confidence of clean images and adversarial images.

We apply our method to detect various attacking methods, including ℓ1-, ℓ2- and ℓ∞-constrained attacks, on the widely used ImageNet and Caltech-256 datasets under different threat models, including oblivious adversaries and omniscient adversaries, where the former only deceive the classification model and the latter know both the classification model and the detection model and try to deceive both.

The results demonstrate that, compared to the baselines, the proposed method improves the detection accuracy against adversarial attacks under oblivious adversaries in most cases. Besides, we demonstrate that omniscient adversaries have to craft adversarial examples with larger noise to successfully mislead the classification model equipped with our detection.
2. RELATED WORK
Many detection methods have been proposed recently from different aspects. The methods in [10, 11] detect adversarial examples by exploiting image artifacts. Feature Squeezing (FS) [8] processes the input image and discriminates according to the change of the prediction. Local Intrinsic Dimensionality (LID) [12] and Mahalanobis Adversarial Detection (MAD) [13] detect adversarial examples based on the consistency within the model. It has been pointed out [11] that the Spatial Rich Model (SRM) [14] achieves state-of-the-art performance by detecting the artifacts from the aspect of steganalysis, and SRM performs better than FS, LID, and MAD when detecting well-known attacks. However, when we apply it to the newly proposed attacks with few perturbations, such as DDN and EAD, we find that they cannot be detected by SRM effectively, motivating us to design a novel method to detect these imperceptible adversarial examples.
3. METHODOLOGY
In this section, we first analyze the properties of adversarial examples, which guide us to the new method.
Figure 2 shows the distribution of the prediction confidence of 500 clean images and different types of adversarial images. It is obvious that the prediction confidence is discriminative between clean images and adversarial images crafted by few-perturbation attacks (DDN, EAD). This phenomenon suggests that we can detect few-perturbation attacks (DDN, EAD) from the perspective of confidence artifacts. For large-perturbation attacks, we can detect them from pixel artifacts.
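For illustration, the prediction confidence studied here (the margin between the rank-1 and rank-2 logits) can be computed as in the following minimal sketch; the classifier handle `model` is a placeholder and this is not the authors' released code.

```python
import torch

@torch.no_grad()
def prediction_confidence(model, images):
    """Margin between the rank-1 and rank-2 logits for each image in a batch."""
    logits = model(images)                        # (batch, num_classes)
    top2 = torch.topk(logits, k=2, dim=1).values
    return top2[:, 0] - top2[:, 1]                # low for DDN/EAD-like, high for PGD-like examples
```

Collecting this statistic over batches of clean and adversarial images reproduces the kind of distribution comparison shown in Figure 2.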
[Figure 3: Image Stream ConvNet and Gradient Stream ConvNet (Layer 1 to Layer 11, followed by GCP and FC), with a Gradient Generator feeding the gradient stream and a Score Fuser combining the two outputs; the three layer types (Type 1 to Type 3), built from Conv, BN, ReLU, average pooling and 1×1 convolution blocks, are defined at the bottom of the figure.]
Fig. 3. The two-stream convolutional neural network for adversarial example detection. There are three different types of layers, shown in different shapes, and their architectures are defined at the bottom. The convolution in Layer 1 uses a larger kernel to obtain a larger receptive field, and the other layers use smaller kernels unless otherwise specified. The number in parentheses under the text "Layer" denotes the number of kernels. BN, GCP and FC represent batch normalization, global covariance pooling and fully connected layer, respectively.

To detect both large-perturbation attacks and few-perturbation attacks, we propose a novel method that exploits both pixel artifacts and confidence artifacts with a two-stream architecture, named PACA. PACA consists of a gradient generator, two identical sub-networks with different inputs, and a score fuser. Given an image, the gradient generator produces its gradient, which reflects the information of both the prediction confidence and the classification model. The image and the gradient are then fed to the two sub-networks to obtain intermediate scores. Finally, the score fuser mixes the intermediate scores and outputs the final result. The details of every part are explained in the following subsections.
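As a concrete illustration of the GCP block referenced in the caption above, the sketch below computes a simplified second-order (covariance) pooling; the actual GCP layer of [19] additionally applies matrix normalization, which is omitted here, so this should be read as an assumption-laden simplification rather than the exact layer used in PACA.

```python
import torch
import torch.nn as nn

class GlobalCovariancePooling(nn.Module):
    """Simplified second-order pooling: per-image channel covariance,
    flattened into a feature vector (matrix normalization of [19] omitted)."""
    def forward(self, x):
        b, c, h, w = x.shape
        feat = x.reshape(b, c, h * w)
        feat = feat - feat.mean(dim=2, keepdim=True)               # center each channel
        cov = torch.bmm(feat, feat.transpose(1, 2)) / (h * w - 1)  # (b, c, c) covariance matrices
        return cov.reshape(b, c * c)                               # fed to the FC layer
```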
Drawing lessons from LID and MAD, which show that information from the model does help detection, we gather more information from the classification model through the prediction confidence. We first design a loss function, named the confidence loss:

L = -\sum_{i}^{n} t_i \log(y_i),    (1)

where y_i = e^{z_i} / \sum_{j}^{n} e^{z_j}, n is the number of classes, and t is the one-hot vector of the predicted logits z. It should be noticed that the one-hot vector t encodes the predicted label of the input image, not the true label. A small confidence loss corresponds to a high confidence of the classification of the input image. To fully employ the information of the classification model, we compute the gradient of the image by back-propagating the confidence loss to the image. In the implementation, the absolute value of the gradient is fed into the sub-network.

The image x and the gradient |g| are then fed to the sub-networks. As for the image stream, the task of the backbone is to classify an input image as a clean or an adversarial image:

y = \begin{cases} x, & \text{clean} \\ x + \delta, & \text{adversarial} \end{cases}    (2)

where x is the clean image and δ denotes the adversarial perturbation. It should be noticed that the perturbation δ is quite small compared to x. Thus, common neural architectures may not perform well on this discrimination, for they can diminish the perturbation signal; e.g., average pooling suppresses noise-like perturbation signals by averaging adjacent pixels. In fact, this task is similar to steganalysis, which aims to distinguish the cover image from the stego image (a cover image with slight perturbations added to hide a secret message). Moreover, the traditional high-dimensional hand-crafted features (SRM) have been verified to be effective for detecting adversarial examples, and neural network based steganalysis methods [17, 18] have recently outperformed SRM. Drawing insights from these neural network based steganalysis methods, we design the backbone network shown in Figure 3, which has the following characteristics:

• The average pooling layer is abandoned in the front layers. Because the average pooling layer is a low-pass filter, it reinforces content and suppresses noise-like perturbation signals by averaging adjacent pixels.

• Without shortcut connections, the perturbation signal decays as the number of layers increases, resulting in unsatisfying detection performance. Therefore, shortcut connections are adopted to preserve the weak perturbation signal.

• Global covariance pooling (GCP) [19] is introduced for gathering more information. Compared to first-order statistics (i.e., global average pooling), more useful information can be obtained from higher-order statistics.

As for the gradient stream, we adopt the same backbone neural network as that of the image stream, owing to its strong discrimination ability. The image and gradient sub-networks are denoted by F_I and F_G, respectively. With the input image and the gradient, we obtain intermediate scores z_I = F_I(x) and z_G = F_G(|g|). The intermediate scores are then mixed to obtain the final output:

z' = z_I + z_G.    (3)
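A minimal sketch of the gradient generator (Eq. (1)) and the score fusion (Eq. (3)) is given below; `classifier`, `image_stream` and `gradient_stream` are assumed handles to the target model and the two backbone sub-networks, and the code is an illustrative reconstruction rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def confidence_gradient(classifier, images):
    """Back-propagate the confidence loss (cross-entropy between the logits and
    the one-hot *predicted* label, Eq. (1)) to the input and return |g|."""
    x = images.clone().detach().requires_grad_(True)
    logits = classifier(x)
    pred = logits.argmax(dim=1)             # predicted label, not the true label
    loss = F.cross_entropy(logits, pred)
    grad, = torch.autograd.grad(loss, x)
    return grad.abs()

def paca_forward(classifier, image_stream, gradient_stream, images):
    """Two-stream detection score z' = z_I + z_G (Eq. (3))."""
    g = confidence_gradient(classifier, images)
    z_i = image_stream(images)              # pixel-artifact score
    z_g = gradient_stream(g)                # confidence-artifact score
    return z_i + z_g
```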
4. EXPERIMENTS
We now present experimental results to demonstrate the effectiveness of our method in improving detection performance.
We use two widely studied datasets, ImageNet [20] and Caltech-256 [21]. The ImageNet dataset contains 1.2 million training images and another 50,000 images for testing. Caltech-256 is composed of 256 object categories containing a total of 30,607 images, and we divide it into a training set and a testing set with a ratio of 8:2. All images are resized to color images of the input size expected by the classification model.

Different target models are adopted to show the generality of PACA. For Caltech-256, VGG16 [22] is adopted as the classification model. We train the model on the training set using the Adam optimizer with a learning rate of 0.001. For ImageNet, the pretrained ResNet34 [23] provided in torchvision is adopted directly. The classification accuracy on the testing set is 78% for ImageNet and 79% for Caltech-256.

[Figure 4 panels: (a) PACA-ImageNet, (b) SRM-ImageNet, (c) PACA-Caltech-256, (d) SRM-Caltech-256.]
Fig. 4. The generalization detection accuracy (%) of PACA and SRM on the two datasets. The detection accuracy of PACA is higher than that of SRM in most cases, meaning that the transferability of PACA is better than that of SRM.
For each target model, we generate adversarial examples from the testing set and, in all of our experiments, use only those that attack the target model successfully before any countermeasure is deployed. We conduct untargeted attacks on each target model with five representative attack algorithms, SF, EAD, C&W, DDN, and PGD, as introduced in Section 2. These attacks cover ℓ1-, ℓ2- and ℓ∞-constrained attacks. We use the default settings for SF, EAD, and DDN. C&W under the ℓ2 constraint is adopted; we use confidence κ = 1, and the numbers of iterations are 1 and 500, respectively. For PGD, ε = 0. , α = 0. , and the number of iterations is 10. Our implementations are based on Foolbox and Advertorch.

• Oblivious adversaries have full access to and knowledge of the classifier F but are not aware of the detector D in place.

• Omniscient adversaries know the model details of both the classifier F and the detector D.

The Adamax optimizer is used with mini-batches of 32 shuffled clean and adversarial images. The batch normalization parameters are learned via an exponential moving average with a decay rate of 0.05. For the fully-connected classifier layer, we initialize the weights with the Xavier uniform distribution and no bias. The learning rate is 0.001 and is dropped by a factor of 0.1 at 30, 70 and 150 epochs, with a total budget of 200 epochs. All the experiments are implemented in PyTorch, and the code will be released later. For quick convergence, we suggest initializing the model with the parameters of a pretrained model that detects large-perturbation adversarial examples.
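For reference, the ℓ∞ PGD attack described above can be written as the following minimal sketch; the ε and α values shown are illustrative placeholders (the experiments rely on the Foolbox/Advertorch implementations with the settings listed above).

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, eps=0.03, alpha=0.01, iters=10):
    """Untargeted l-infinity PGD; eps and alpha here are illustrative values only."""
    x_adv = images.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), labels)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()         # gradient ascent step
        x_adv = images + (x_adv - images).clamp(-eps, eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                        # keep a valid image
    return x_adv.detach()
```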
Table 1 and Table 2 give the detection performance under the oblivious attack. The detection methods FS, LID, MAD and SRM are adopted for comparison.

Table 1. Detection accuracy (%) under the Oblivious Adversary on ImageNet.

        FS      LID     MAD     SRM     PACA
SF      61.30   59.67   66.38   53.39
EAD     54.75   60.88   68.70   74.82
C&W     55.47   64.25   68.87   87.24
DDN     69.63   61.41   68.52   65.62
PGD     95.55   99.21   99.55
Table 2. Detection accuracy (%) under the Oblivious Adversary on Caltech-256.

        FS      LID     MAD     SRM     PACA
SF      59.97   73.94   80.61   82.82
EAD     52.07   71.42   74.08   83.48
C&W     58.25   70.29   73.24   97.59
DDN     61.17   69.49   74.19   78.10
PGD     100     97.13   99.97

PGD is the easiest attack to detect, and the detection accuracy of all detection methods is nearly 100%. EAD is one of the most difficult attacks to detect, for its perturbation is slight and its confidence is not very low. For all attacks except PGD, the proposed method PACA outperforms the other methods on both datasets by a clear margin. The advantage in detection accuracy even approaches 20% over all other methods when detecting DDN. These results verify the effectiveness of PACA under the oblivious adversary.

We also test the generalization detection performance, since in most cases the detector has no knowledge of which algorithm the adversary adopted. The generalization experiments show the generalizability of the detection methods among different attacks. Figure 4 shows the generalizability heatmaps of PACA compared with SRM on the two datasets. The detectors are trained with one of the attacks listed in the columns and tested against the attacks listed in the rows. For SRM, the transfer performance is unsatisfying, since many results are around 50% on both datasets. Compared with SRM, PACA has far better generalizability. The PACA detector trained on SF or EAD can detect the other attacks effectively on both datasets. Analyzing these two attacks, we find that they are both under the ℓ1 constraint. Similarly, the detector trained on DDN or C&W has the ability to detect the other methods, for they are both under the ℓ2 constraint.

When the adversaries have full knowledge of the classifier as well as the detector, they can generate adversarial examples deceiving both. This attack is also called a second-round attack, which has been used to evaluate the performance of detectors in [24, 25]. Following the setting of [25], we evaluate the performance of the proposed scheme here. We adopt C&W as the original attack and modify it by introducing into its adversarial objective an additional loss term that penalizes being detected:

\min_{\delta} \; \{\, \|\delta\| + J_F(x + \delta) + J_D(x + \delta) \,\},    (4)

where J_F and J_D are the losses of the classifier F and the proposed detector D, respectively. The modified version is named C&W-PACA. The attack success rate and the average ℓ2 distance between adversarial and clean images are used to measure the defense ability. A larger ℓ2 distance means that it is harder to generate adversarial examples. Table 3 shows that the success rate of C&W-PACA is far smaller than that of C&W, and the ℓ2 distance of the modified attack is higher on both datasets, indicating that PACA does enhance the defense ability.

Table 3. The attack success rate and the average ℓ2 distance of C&W and C&W-PACA on ImageNet and Caltech-256.

Dataset        Attack       Success rate   ℓ2 distance
ImageNet       C&W          77.80%         0.1430
               C&W-PACA     7.00%          0.1594
Caltech-256    C&W          77.50%         0.1463
               C&W-PACA     8.90%          0.1600
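A hedged sketch of the second-round objective in Eq. (4) is shown below; the choice of surrogate losses, the assumption that the detector's class index 0 means "clean", and the optimizer settings are all illustrative assumptions rather than the paper's exact configuration, and `detector` is assumed to be differentiable end-to-end.

```python
import torch
import torch.nn.functional as F

def cw_paca_attack(classifier, detector, x, y_true, steps=500, lr=0.01):
    """Jointly fool the classifier F and the detector D, as in Eq. (4). Illustrative only."""
    # Small random init avoids the non-differentiable point of the norm at zero.
    delta = (1e-3 * torch.randn_like(x)).requires_grad_(True)
    clean_label = torch.zeros(x.size(0), dtype=torch.long, device=x.device)  # assumed "clean" class
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = (x + delta).clamp(0.0, 1.0)
        j_f = -F.cross_entropy(classifier(x_adv), y_true)     # push the prediction away from the true class
        j_d = F.cross_entropy(detector(x_adv), clean_label)   # make the input look clean to the detector
        loss = delta.flatten(1).norm(dim=1).mean() + j_f + j_d
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).detach().clamp(0.0, 1.0)
```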
Table 4. Detection accuracy (%) of variant detectors of PACA.

Operations                      DDN     C&W
PACA
Remove image stream             91.07   68.47
Remove gradient stream          76.19   92.16
GCP → GAP                       89.36   94.23
Remove shortcut connection      89.04   66.35
Single logits + FC              75.75   60.61
To inspect the effect of each component of PACA, we conduct control experiments on ImageNet by removing or replacing each component. DDN and C&W are chosen as the attack methods, which represent attacks with different perturbation strengths. The results are shown in Table 4. PACA performs best among the different settings. The gradient stream performs well on detecting DDN, while the image stream does better on detecting C&W. That is to say, the two streams are complementary, for they cope with different attacks. Besides, replacing GCP or removing the shortcut connections degrades the detection performance, meaning that these architectural choices play positive roles in PACA. Moreover, we also investigate the performance of directly using the prediction logits for classification rather than using the gradient. Three fully-connected layers with ReLU activations are adopted for classification, and the numbers of neurons are 512 and 32, respectively. The results in the last row of Table 4 show that using only the logits fed to fully-connected layers is undesirable: the detection accuracy is lower than that of using only the gradient stream (Remove image stream), indicating that the gradient stream, which exploits the information of the classification model, does benefit detection.
5. CONCLUSION
The newly proposed attacks such as DDN and EAD, which require only slight perturbations, are hard to detect with existing detection methods. Through exploring the perturbations and the predicted logits of these adversarial examples, we find that there exists compliance between perturbation and prediction confidence: slight perturbations lead to low prediction confidence. To fully exploit the confidence information as well as the classification model, the gradient is introduced by back-propagating the confidence loss to the images, where the confidence loss is defined as the cross-entropy between the prediction logits and their one-hot vector. To also make use of the information of image artifacts, we propose a two-stream convolutional neural network for detecting different types of attacks, consisting of an image stream and a gradient stream. Extensive experiments have been performed to evaluate the proposed PACA, and the results show that PACA has stronger detection ability as well as better generalizability in most cases.

6. REFERENCES

[1] Mingxing Tan and Quoc V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," in Proceedings of the 36th International Conference on Machine Learning (ICML), 2019, vol. 97 of Proceedings of Machine Learning Research, pp. 6105–6114, PMLR.
[2] Anand Bhattad, Min Jin Chong, Kaizhao Liang, Bo Li, and David A. Forsyth, "Unrestricted adversarial examples via semantic manipulation," 2020, OpenReview.net.
[3] Kenneth T. Co, Luis Muñoz-González, Sixte de Maupeou, and Emil C. Lupu, "Procedural noise adversarial examples for black-box attacks on deep convolutional networks," in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security (CCS), 2019, pp. 275–289, ACM.
[4] Bo Huang, Yi Wang, and Wei Wang, "Model-agnostic adversarial detection by random perturbations," in Proceedings of the 28th International Joint Conference on Artificial Intelligence, 2019, pp. 4689–4696, AAAI Press.
[5] Nicolas Papernot, Patrick D. McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami, "Distillation as a defense to adversarial perturbations against deep neural networks," in IEEE Symposium on Security and Privacy (SP), 2016, pp. 582–597, IEEE Computer Society.
[6] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu, "Towards deep learning models resistant to adversarial attacks," 2018, OpenReview.net.
[7] Chaithanya Kumar Mummadi, Thomas Brox, and Jan Hendrik Metzen, "Defending against universal perturbations with shared adversarial training," 2019, pp. 4927–4936, IEEE.
[8] Weilin Xu, David Evans, and Yanjun Qi, "Feature squeezing: Detecting adversarial examples in deep neural networks," arXiv preprint arXiv:1704.01155, 2017.
[9] Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong Li, Wenchang Shi, and XiaoFeng Wang, "Detecting adversarial image examples in deep neural networks with adaptive noise reduction," IEEE Transactions on Dependable and Secure Computing, 2018.
[10] Reuben Feinman, Ryan R. Curtin, Saurabh Shintre, and Andrew B. Gardner, "Detecting adversarial samples from artifacts," CoRR, vol. abs/1703.00410, 2017.
[11] Jiayang Liu, Weiming Zhang, Yiwei Zhang, Dongdong Hou, Yujia Liu, Hongyue Zha, and Nenghai Yu, "Detection based defense against adversarial examples from the steganalysis point of view," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4825–4834.
[12] Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi N. R. Wijewickrema, Michael E. Houle, Grant Schoenebeck, Dawn Song, and James Bailey, "Characterizing adversarial subspaces using local intrinsic dimensionality," arXiv preprint arXiv:1801.02613, 2018.
[13] Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin, "A simple unified framework for detecting out-of-distribution samples and adversarial attacks," in NeurIPS, 2018.
[14] Jessica Fridrich and Jan Kodovsky, "Rich models for steganalysis of digital images," IEEE Transactions on Information Forensics and Security, vol. 7, no. 3, pp. 868–882, 2012.
[15] Jérôme Rony, Luiz G. Hafemann, Luiz S. Oliveira, Ismail Ben Ayed, Robert Sabourin, and Eric Granger, "Decoupling direction and norm for efficient gradient-based L2 adversarial attacks and defenses," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4322–4330.
[16] Pin-Yu Chen, Yash Sharma, Huan Zhang, Jinfeng Yi, and Cho-Jui Hsieh, "EAD: Elastic-net attacks to deep neural networks via adversarial examples," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[17] Songtao Wu, Sheng-hua Zhong, and Yan Liu, "Residual convolution network based steganalysis with adaptive content suppression," IEEE, 2017, pp. 241–246.
[18] Xiaoqing Deng, Bolin Chen, Weiqi Luo, and Da Luo, "Fast and effective global covariance pooling network for image steganalysis," in Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, 2019, pp. 230–234.
[19] Peihua Li, Jiangtao Xie, Qilong Wang, and Wangmeng Zuo, "Is second-order information helpful for large-scale visual recognition?," in IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2089–2097, IEEE Computer Society.
[20] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in CVPR, 2009.
[21] G. Griffin, A. Holub, and P. Perona, "Caltech-256 object category dataset," Tech. Rep. 7694, California Institute of Technology, 2007.
[22] Karen Simonyan and Andrew Zisserman, "Very deep convolutional networks for large-scale image recognition," in International Conference on Learning Representations (ICLR), 2015.
[23] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[24] Nicholas Carlini and David A. Wagner, "Adversarial examples are not easily detected: Bypassing ten detection methods," in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec@CCS), 2017, pp. 3–14, ACM.
[25] Tianyu Pang, Chao Du, Yinpeng Dong, and Jun Zhu, "Towards robust detection of adversarial examples," in