Adversarial Vulnerability of Active Transfer Learning
Nicolas M. Müller
Cognitive Security Technologies, Fraunhofer AISEC
Garching near Munich
[email protected]
Konstantin Böttinger
Cognitive Security Technologies, Fraunhofer AISEC
Garching near Munich
[email protected]
Abstract—Two widely used techniques for training supervised machine learning models on small datasets are Active Learning and Transfer Learning. The former helps to optimally use a limited budget to label new data. The latter uses large pre-trained models as feature extractors and enables the design of complex, non-linear models even on tiny datasets. Combining these two approaches is an effective, state-of-the-art method when dealing with small datasets.
In this paper, we share an intriguing observation: namely, that the combination of these techniques is particularly susceptible to a new kind of data poisoning attack. By adding small adversarial noise to the input, it is possible to create a collision in the output space of the transfer learner. As a result, Active Learning algorithms no longer select the optimal instances, but almost exclusively the ones injected by the attacker. This allows an attacker to manipulate the active learner to select and include arbitrary images into the data set, even against an overwhelming majority of unpoisoned samples. We show that a model trained on such a poisoned dataset has a significantly deteriorated performance, dropping from 86% to 34% test accuracy. We evaluate this attack on both audio and image datasets and support our findings empirically. To the best of our knowledge, this weakness has not been described before in the literature.
I. INTRODUCTION
Training supervised machine learning algorithms such as neural networks requires large amounts of labeled training data. In order to solve problems for which there is little or no training data, previous work has developed techniques such as Transfer Learning and Active Learning.
Transfer Learning (TL) applies knowledge gained from one problem to a second problem. For example, consider the problem of training a neural network on a very small image dataset (say N = 500). Training directly on the dataset will yield poor generalization due to the limited number of training examples. A better approach is to use a second, already-existing network trained on a different task to extract high-level semantic features from the training data, and to train one's network on these features instead of the raw images themselves. For images, this is common practice and easily accessible via the tensorflow library. For audio data, similar feature extractors are also easily accessible [13].
Active Learning (AL), on the other hand, is a process where a learning algorithm can query a user. AL is helpful when creating or expanding labeled datasets. Instead of randomly selecting instances to present to a human annotator, the model can query for specific instances: it ranks the unlabeled instances by certainty of prediction, and the instances for which it is most uncertain are the ones for which a label is queried from the human annotator. Model uncertainty can be understood as the distance to the decision surface (SVM) or the entropy of the class predictions (uncertainty sampling for neural networks). In summary, Active Learning can help in finding the optimal set of instances to label in order to optimally use a given budget [15].
Since these approaches are complementary, they can be combined straightforwardly: one designs a transfer active learning system by combining an untrained network with a pre-trained feature extractor and then allows this combined model to query a human expert. Previous work has examined this in detail and finds that it can accelerate the learning process significantly [8, 19, 18].
In this work, we are the first to observe that this combination of AL with TL is highly susceptible to data poisoning. Our contribution is to present this novel weakness, which
• allows an attacker to reliably control the samples queried by the active learner (94.8% success rate, even when the poisoned samples are outnumbered 1:50 by clean training data),
• considerably deteriorates the test performance of the learner (by more than 50 percent in absolute test accuracy), and
• is hard to detect for a human annotator.
We evaluate the attack on both audio and image datasets and report the results in Section IV. To the best of our knowledge, this attack has not been described in the literature before.

II. RELATED WORK
In this section, we discuss related work on transfer active learning. We first present work on combining transfer and active learning and then discuss related data poisoning attacks for each approach. We observe that there is no prior work on data poisoning for the combined method of transfer active learning.
1) Active Learning with Transfer Learning: Kale et al. [8] present a transfer active learning framework which "leverages pre-existing labeled data from related tasks to improve the performance of an active learner". They take large, well-known datasets such as Twenty Newsgroups and evaluate the number of queries required in order to reach a given error rate or less. They can reduce the number of queries by as much as 50% by combining transfer learning with active learning. The authors of [19] perform similar experiments and find that "the number of labeled examples required for learning with transfer is often significantly smaller than that required for learning each target independently". They also evaluate combining active learning and transfer learning, and find that the "combined active transfer learning algorithm [...] achieve[s] better prediction performance than alternative methods".
Chan et al. [5] examine the problem of training a word sense disambiguation (WSD) system on one domain and then applying it to another domain, thus 'transferring' knowledge from one domain to another. They show that active learning approaches can be used to help in this transfer learning task. The authors of [17] examine how to borrow information from one domain in order to label data in another domain. Their goal is to label samples in the current domain (in-domain) using a model trained on samples from the other domain (out-of-domain). The model then predicts the labels of the in-domain data. Where the prediction confidence of the model is low, a human annotator is asked to provide the label; otherwise, the model's prediction is used to label the data. The authors report that their approach significantly improves test accuracy.
2) Related Data Poisoning Attacks: Data poisoning is an attack on machine learning systems where malicious data is introduced into the training set. When the model trains on the resulting poisoned training dataset, this induces undesirable behavior at test time. Biggio et al. [4] were among the first to examine the effects of data poisoning on support vector classifiers. The branch of data poisoning most related to our work is clean poison attacks. These introduce minimally perturbed instances with the 'correct' label into the data set. For example, the authors of [16] present an attack on transfer learners which uses clean poison samples to introduce a backdoor into the model. The resulting model misclassifies the targeted instance, while model accuracy for other samples remains unchanged. To give an example, these clean poison samples may be manipulated pictures of dogs that, when trained on by a transfer learner, will at test time cause a specific instance of a cat to be classified as a dog. Samples other than the targeted 'cat' instance will not be affected. Such clean-label attacks have also been explored in a black-box scenario [20].
Data poisoning not only affects classification models but also regression learners. Jagielski et al. [7] present attacks and defenses on regression models that cause a denial of service: by injecting a small number of malicious data points into the training set, the authors induce a significant change in the prediction error.
3) Poisoning Active Learning: Poisoning active learners requires the attacker to craft samples which, in addition to adversely affecting the model, have to be selected by the active learner for labeling and insertion into the dataset. This attack aims both at increasing the overall classification error at test time and at increasing the cost of labeling. In this sense, poisoning active learning is harder than poisoning conventional learners, since two objectives have to be satisfied simultaneously. Miller et al. [12] present such an attack: they poison linear classifiers and manage to satisfy both previously mentioned objectives (albeit with some constraints): poisoned instances are selected by the active learner with high probability, and the model trained on the poisoned instances suffers a significant deterioration in prediction accuracy.
4) Adversarial Collisions:
There exists some related work on adversarial collisions, albeit with a different focus from ours: Li et al. [11] observe that neural networks can be insensitive to adversarial noise of large magnitude. This results in 'two very different examples sharing the same feature activation', i.e. a feature collision. However, this constitutes an attack at test time (evasion attack), whereas we present an attack at training time (poisoning attack).

III. ATTACKING ACTIVE TRANSFER LEARNING
In this section, we introduce the proposed attack. We first present the threat model and then detail the attack itself. In Section IV, we evaluate the effectiveness of our attack empirically.
A. Threat Model
In this work, we assume that an attacker has the following capabilities:
• The attacker may introduce a small number of adversarially perturbed instances into the active learning pool. These instances are unlabeled. They are screened by the active learning algorithm and may, along with the benign instances, be presented to the human annotator for labeling.
• The attacker cannot compromise the output of the human annotator, i.e. they cannot falsify the labels assigned to either the benign or the poison instances.
• The attacker knows the feature extractor used. This could, for example, be a popular open-source model such as resnet50 [6] or YAMNet [14]. These models are readily available and widely used [1].
• The attacker has no knowledge about, or access to, the model trained on top of the feature extractor.
• The attacker does not know the active learning algorithm.
B. Feature Collision Attack
Since transfer active learning is designed to use found data, i.e. data from untrusted sources, it is highly susceptible to data poisoning attacks. In this section, we present such an attack and show how it completely breaks the learned model.

Let X, Y be a set of unpoisoned data, where the data X and targets Y consist of N instances. An instance pertains to one of M different classes (e.g. 'dog' or 'cat'). Let f be the pretrained feature extractor, which maps a sample x_i ∈ X to a d_ζ-dimensional feature vector, i.e. f(x_i) = ζ_i. Let g be the dense model head, which maps a feature vector ζ_i to some prediction y_pred ∈ [0, 1]^M, where y_pred is a one-hot vector, i.e. Σ_{m=0}^{M−1} y_pred^m = 1. Thus, an image is classified by the subsequent application of the feature extractor f and the dense model head g:

    y_i = argmax g(f(x_i))    (1)

For a set of instances {x_i}, a set of adversarial examples {x_i + δ_i} can be found by minimizing, for each x_i separately:

    δ_i = argmin_δ ‖f(x_i + δ) − µ‖ + β‖δ‖    (2)

where β ∈ ℝ⁺ and µ is a fixed vector of size d_ζ. Solving Equation 2 will find adversarial examples that 1) are selected by the active learner, 2) break the model, and 3) are imperceptible to the human annotator:

1) Examples are queried. A set of thusly found adversarial examples {x_i + δ_i} will all be mapped to the same output µ, i.e. f(x_i + δ_i) = µ for all i. This will 'confuse' the active learner, since all adversarial examples share the same feature vector, but have different class labels. Thus, the active learner will incur high uncertainty for instances mapped to this collision vector µ, and will query almost exclusively these (we verify this experimentally in Section IV-B).

2) Examples are harmful. These examples will break any model head trained on the extracted features ζ_i, since all adversarial examples share identical features, but different labels.

3) Examples are undetected by a human annotator. Once queried, the adversarial examples x_i + δ_i will be reviewed by a human annotator and labeled accordingly. The second part of Equation 2 ensures that the adversarial noise is small enough in magnitude to remain undetected by human experts. The adversarial example will thus be assigned the label of x_i, but will not raise the suspicion of the human annotator. The scalar β is a hyperparameter controlling the strength of this regularisation.
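To make Equation 2 concrete, the following is a minimal sketch of how the collision noise δ could be found with gradient descent in TensorFlow. The function name, optimizer and hyperparameter values are illustrative assumptions rather than the authors' exact implementation, and squared norms are used for smoother gradients.

```python
import tensorflow as tf

def craft_poison(extractor, x, mu, beta=1e-3, steps=1000, lr=0.01):
    """Sketch of Equation 2: find noise delta so that f(x + delta) collides
    with the collision vector mu while ||delta|| stays small.
    Squared L2 norms are used here for smoother gradients; beta, steps and
    lr are illustrative values, not the paper's exact settings."""
    delta = tf.Variable(tf.zeros_like(x))
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            collision_loss = tf.reduce_sum(tf.square(extractor(x + delta) - mu))
            noise_penalty = beta * tf.reduce_sum(tf.square(delta))
            loss = collision_loss + noise_penalty
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))
    return x + delta  # the poison sample, visually close to x
```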
1) Choice of Collision Vector: The collision vector µ is chosen as the zero vector µ = 0_{d_ζ} for two reasons. First, we find that it helps the numerical convergence of Equation 2 when training with gradient descent. Second, a zero vector of features is highly uncommon for unpoisoned data. Thus, it induces high uncertainty in the active learner, which in turn helps to promote the adversarial poison samples for labeling and inclusion in the training dataset from the beginning. It is possible to choose a different µ, for example the one-vector µ = 1_{d_ζ} or the mean of the feature values ζ_i. However, we find that the zero vector works best, most likely for the reasons detailed above.
2) Improving Attack Efficiency: We propose two improvements over the baseline attack. First, when choosing the base instances from the test set to poison and to include in the train set, it is advisable to select those where

    ‖f(x) − µ‖    (3)

is smallest. Intuitively, this pre-screens the samples for those where the optimization step (Equation 2) requires the least work. Second, maintaining class balance within the poison samples improves effectiveness. This helps in maximally confusing the active learner, since a greater diversity of labels for the same feature vector µ increases the learner's uncertainty with respect to future samples that map to µ. In all of the following analysis, we evaluate the attack with these improvements in place.
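A minimal sketch of this pre-screening with class balance, assuming features and labels have already been extracted into NumPy arrays; the function and parameter names are illustrative:

```python
import numpy as np

def select_base_instances(features, labels, mu, per_class=50):
    """Per class, pick the instances whose features are already closest to the
    collision vector mu (Equation 3), so that Equation 2 needs the least work,
    while keeping the poison set class-balanced."""
    distances = np.linalg.norm(features - mu, axis=1)
    selected = []
    for c in np.unique(labels):
        class_idx = np.where(labels == c)[0]
        closest = class_idx[np.argsort(distances[class_idx])[:per_class]]
        selected.extend(closest)
    return np.asarray(selected)
```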
IV. IMPLEMENTATION AND RESULTS

In this section, we first describe our transfer active learning setup and the data sets used for evaluation. We then implement our attack and demonstrate its effectiveness (cf. Section IV-B).
A. Active Transfer Learner Setup
This section describes the data we use and our choice of transfer learner.
For our experiments, we use image and audio data. This is because Active Learning requires a human oracle to annotate instances, and humans are very good at annotating both image and audio data, but rather inefficient at processing purely numerical data. This also motivates the choice of the active learner, namely a neural network, which has been shown to provide state-of-the-art performance on image and audio data.
Thus, we create the transfer learner as follows: we use a large, pre-trained model to perform feature extraction on the audio and image data. For image data, we use a pre-trained resnet50 model [6], which comes with the current release of the Python tensorflow library. For audio data, we build a feature extractor from the YAMNet model, a deep convolutional audio classification network [14]. Both of these feature extractors map the raw input data to a vector of features. For example, resnet50 maps images with 224 × 224 × 3 input dimensions to a 2048-dimensional vector of higher-level features. A dense neural network (dense head) is then used to classify these feature vectors. Our active learner uses Entropy Sampling [15] to compute the uncertainty

    uncertainty(x_i) = H(g(f(x_i)))    (4)
                     = − Σ_{m ∈ M} g(f(x_i))_m log g(f(x_i))_m    (5)

for all unlabeled x_i. The scalar value g(f(x_i))_m denotes the softmax probability of the m-th class in the network's output. The active learner computes the uncertainty for all unlabeled instances x_i and selects the one with the highest uncertainty to be labeled.
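As an illustration of Equations 4 and 5, a minimal entropy-sampling query step could look as follows; this is a sketch, and the variable names are not taken from the paper:

```python
import numpy as np

def entropy_query(probs, k=1, eps=1e-12):
    """Rank unlabeled instances by the entropy of their softmax predictions
    (Equations 4 and 5) and return the indices of the k most uncertain ones."""
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    return np.argsort(-entropy)[:k]

# Example: probs holds g(f(x_i)) for three unlabeled instances.
probs = np.array([[0.98, 0.01, 0.01],   # confident -> low entropy
                  [0.40, 0.35, 0.25],   # uncertain -> high entropy
                  [0.70, 0.20, 0.10]])
print(entropy_query(probs, k=1))  # -> [1]
```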
1) Prevention of Overfitting:
As detailed in Section I, we use active transfer learning in order to learn from very small datasets. Accordingly, we use at most N = 500 instances per dataset in our experiments. In this scenario, overfitting can easily occur. Thus, we take the following countermeasures: First, we keep the number of trainable parameters low and use a dense head with at most two layers and a small number of hidden neurons. Second, we use high dropout and employ early stopping during training. Third, we refrain from training the weights of the transfer learner (commonly referred to as fine-tuning). This is motivated by the observation that the resnet50 architecture has more than 25 million trainable weights, which makes it prone to overfitting, especially on very small datasets.
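The following sketch shows what such a setup could look like in TensorFlow/Keras: a frozen resnet50 feature extractor and a small dense head with dropout and early stopping. The layer sizes, dropout rate and optimizer are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf

# Frozen pre-trained feature extractor f: global average pooling yields one
# 2048-dimensional feature vector per 224x224x3 image.
extractor = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg")
extractor.trainable = False  # no fine-tuning of the transfer learner

# Small dense head g with dropout to limit overfitting on tiny datasets.
head = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2048,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])
head.compile(optimizer="adam",
             loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)
# head.fit(train_features, train_labels, validation_split=0.1,
#          epochs=200, callbacks=[early_stop])
```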
2) Datasets:
We use three datasets to demonstrate our attack.
• Google AudioSet [3], a large classification dataset comprising several hundred different sounds. We use only a subset of ten basic commands (right, yes, left, etc.).
• Cifar10 [9], a popular image recognition dataset with 32x32 pixel images of cars, ships, airplanes, etc.
• STL10 [2], a semi-labeled dataset comprising several thousand labeled 96x96 pixel images in addition to tens of thousands of unlabeled images, divided into ten classes. In this supervised scenario, we use only the labeled subset.
Each dataset is split into a train set and two test sets, test1 and test2, using an 80, 10 and 10 percent split. Following previous work [16], we train on the train set, use the test1 set to craft the adversarial examples, and evaluate the test accuracy on the test2 set. Thus, the adversarial examples' base instances originate from the same distribution as the train images, but the two sets remain separate.
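A minimal sketch of such an 80/10/10 split is given below; it is a plain random split, and the function name and seed are illustrative:

```python
import numpy as np

def split_dataset(x, y, seed=0):
    """Split data into train (80%), test1 (10%, poison base instances) and
    test2 (10%, held-out evaluation)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train, n_test1 = int(0.8 * len(x)), int(0.1 * len(x))
    train = idx[:n_train]
    test1 = idx[n_train:n_train + n_test1]   # used to craft poison samples
    test2 = idx[n_train + n_test1:]          # used to report test accuracy
    return (x[train], y[train]), (x[test1], y[test1]), (x[test2], y[test2])
```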
B. Feature Collision Results
We now verify empirically that the created instances look inconspicuous and are hard to distinguish from real, benign instances. Consider Figure 1: it shows four randomly selected poison instances of the STL10 dataset. Observe that the poisoned images are hardly distinguishable from the originals. We also provide the complete set of audio samples for the Google AudioSet, both poisoned and original .wav files (https://drive.google.com/file/d/1JtXUu6degxnQ84Kggav8rgm9ktMhyAq0/view).

Fig. 1: Images with index 155 (bird), 308 (horse), 614 (dog) and 3964 (deer) from the STL dataset. The left image of each pair shows the base instance, i.e. the unpoisoned image. The right image shows the poisoned image which causes the collision in the transfer learner's output space.

We now proceed to visually illustrate the results of the feature collision. Consider Figure 2: for the poisoned and benign training data in the AudioSet10 dataset, it shows the corresponding feature vectors after PCA projection into two dimensions. It thus shows what the dense model head 'sees' when looking at the poisoned data set. Note that while the unpoisoned data has a large variety along the first two principal components (as is expected, since the individual instances pertain to different classes), the poison data's features collide in a single point (red dot) - even though they also pertain to different classes. Thus, we observe a 'feature collision' of the poison samples to a single point in feature space, which, in combination with the different labels, will cause maximum 'confusion' in the active learner.

Fig. 2: Visualisation of the transfer learner's feature space. The top image shows the first two principal components of the Google AudioSet10 dataset's features when using the YAMNet feature extractor. The bottom image shows the same visualization of the Cifar10 dataset, using the resnet50 feature extractor. The blue dots represent the unpoisoned training data. The red dots represent poison data found via Equation 2, chosen equally per class (50 instances for each of the ten classes). Observe the large diversity of the unpoisoned training data compared to the adversarial poison data, which is projected onto a single point, thus creating a 'feature collision' which massively deteriorates the active learner's performance.

C. Impact on the Model
In this section, we evaluate the impact on the classification accuracy of transfer active learners when exposed to the poison samples. For each of the three datasets, we create 500 poison samples and include them in the training set. We then create a transfer active learner (a neural network with one or two layers), train it on 20 unpoisoned samples, and simulate human annotation by letting the active learner query for 500 new instances from the training set (which contains a majority of benign data plus the injected poison samples). We find that
• the active learner almost exclusively selects poisoned samples, and
• test performance is degraded severely (by up to 50% absolute).
Table I details our results. For example, consider the first row, which evaluates a one-layer neural network (NN1) on the STL dataset. In an unpoisoned scenario, after having queried 500 images, the active learner has a test accuracy of 86 percent. When introducing 500 poison instances, we observe the following: First, even though the poison instances are outnumbered 1:8, the active learner chooses 500 out of 500 possible poison instances for manual annotation - a success rate of 100 percent. In comparison, if the adversarial instances were queried with the same probability as unpoisoned samples, only 12.6 percent of them would be chosen (random success rate). Second, observe that test accuracy is degraded significantly from 86 percent to 34 percent. This is because the model cannot learn from data that, due to the feature collision attack, looks identical to the dense head.
D. Hyperparameters and Runtime
Table I details the runtime needed to find a single adversarial example on an Intel Core i7-6600 CPU (no GPU), which ranges from one to two minutes. To craft the adversarial samples for the audio data, we trained with early stopping and an adaptive learning rate; for the image data, we used a fixed number of iterations. We chose different values of β for image and audio data, owing to the difference in input feature scale between images and raw audio samples (signed integers, with one bit reserved for the sign).

E. Adversarial Retraining Defense
We find that there exists a trivial, yet highly effective defense against our proposed attack: unfreezing, i.e. training, the feature extractor f. When f is trained in conjunction with g, training on the poison samples actually serves as 'adversarial retraining', which boosts model robustness [10]. In our experiments, we find that retraining completely negates the effects of the poisoning attack. However, it is not a satisfactory defense due to the following concerns.
• Lack of labeled samples. In order to unfreeze the weights of the feature extractor f, a large number of labeled samples is required. These are not available at the start of the active learning cycle. Thus, retraining can only occur during later stages. Until then, however, the adversary has free rein to introduce their poison samples into the dataset.
• High computational overhead. Retraining the feature extractor f incurs high computational overhead in comparison to training only the dense head g.
• Overfitting. Training f, especially on a small dataset, may result in overfitting, as described in Section IV-A1.
In summary, while unfreezing and adversarial retraining do mitigate the attack, these strategies may be hard to apply due to several practical concerns. Thus, a better defense strategy is required, which we leave for future work.
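As an illustration, such unfreezing could look as follows in Keras; this is a sketch under the assumption of the resnet50 setup from Section IV-A, and the learning rate and head size are illustrative:

```python
import tensorflow as tf

# Defense sketch: unfreeze the pre-trained extractor f and train it jointly
# with the dense head g, so that training on the crafted poison samples acts
# as adversarial retraining.
extractor = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg")
extractor.trainable = True  # unfreeze f (in contrast to the frozen setup above)

model = tf.keras.Sequential([
    extractor,
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, ...)  # much costlier than training g alone
```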
V. CONCLUSION AND FUTURE WORK

In this work, we point out an intriguing weakness of the combination of active and transfer learning: by crafting feature collisions, we manage to introduce attacker-chosen adversarial samples into the dataset with a success rate of up to 100 percent. Additionally, we decrease the model's test accuracy by a significant margin (from 86 to 34 percent). This attack can effectively break a model, wastes resources of the human oracle, and is very hard to detect for a human reviewing the poisoned instances. To the best of our knowledge, this particular weakness of transfer active learning has not been observed before.
TABLE I: Results of the feature collision poisoning on three data sets: Cifar10, STL and Google Audio Set. The head of the active learner has one (NN1) or two (NN2) dense layers. The success rate gives the ratio with which poison samples are selected by the active learner.

Dataset      | Model | Accuracy (clean) | Accuracy (poisoned) | Loss (adv.) | Loss (initial) | N   | Success rate (poison) | Success rate (random) | Time (s)
STL          | NN1   | 0.862            | 0.347               | 107.729     | 66450.624      | 500 | 1.0                   | 0.126                 | 113.08
Cifar10      | NN1   | 0.432            | 0.267               | 39.746      | 5016.588       | 500 | 1.0                   | 0.013                 | 93.425
Audio Set 10 | NN1   | 0.252            | 0.143               | 0.488       | 24.59          | 500 | 1.0                   | 0.016                 | 73.083
STL          | NN2   | 0.844            | 0.7                 | 107.729     | 66450.624      | 500 | 0.86                  | 0.126                 | 113.08
Cifar10      | NN2   | 0.428            | 0.341               | 39.746      | 5016.588       | 500 | 0.832                 | 0.013                 | 93.425
Audio Set 10 | NN2   | 0.263            | 0.141               | 0.488       | 24.59          | 500 | 0.998                 | 0.016                 | 73.083

REFERENCES
[1] Resnet and ResnetV2. https://keras.io/api/applications/resnet/. (Accessed on 11/26/2020).
[2] STL-10 dataset. http://ai.stanford.edu/~acoates/stl10/, 2011. (Accessed on 11/26/2020).
[3] Google Audio Set. https://research.google.com/audioset/, 2017. (Accessed on 11/26/2020).
[4] Biggio, B., Nelson, B., and Laskov, P. Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389 (2012).
[5] Chan, Y. S., and Ng, H. T. Domain adaptation with active learning for word sense disambiguation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (Prague, Czech Republic, June 2007), Association for Computational Linguistics, pp. 49–56.
[6] He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.
[7] Jagielski, M., Oprea, A., Biggio, B., Liu, C., Nita-Rotaru, C., and Li, B. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In 2018 IEEE Symposium on Security and Privacy (SP) (2018), IEEE, pp. 19–35.
[8] Kale, D., and Liu, Y. Accelerating active learning with transfer learning. In 2013 IEEE 13th International Conference on Data Mining (Dec 2013), pp. 1085–1090.
[9] Krizhevsky, A., Hinton, G., et al. Learning multiple layers of features from tiny images. Citeseer (2009).
[10] Li, B., Vorobeychik, Y., and Chen, X. A general retraining framework for scalable adversarial classification. arXiv preprint arXiv:1604.02606 (2016).
[11] Li, K., Zhang, T., and Malik, J. Approximate feature collisions in neural nets. In Advances in Neural Information Processing Systems (2019), pp. 15842–15850.
[12] Miller, B., Kantchelian, A., Afroz, S., Bachwani, R., Dauber, E., Huang, L., Tschantz, M. C., Joseph, A. D., and Tygar, J. D. Adversarial active learning. In Proceedings of the 2014 Workshop on Artificial Intelligent and Security Workshop (2014), pp. 3–14.
[13] Pan, S. J., and Yang, Q. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 10 (2009), 1345–1359.
[14] Plakal, M., and Ellis, D. YAMNet. github.com/tensorflow/models/tree/master/research/audioset/yamnet.
[15] Settles, B. Active learning literature survey. Tech. rep., University of Wisconsin-Madison Department of Computer Sciences, 2009.
[16] Shafahi, A., Huang, W. R., Najibi, M., Suciu, O., Studer, C., Dumitras, T., and Goldstein, T. Poison frogs! Targeted clean-label poisoning attacks on neural networks. In Advances in Neural Information Processing Systems (2018), pp. 6103–6113.
[17] Shi, X., Fan, W., and Ren, J. Actively transfer domain knowledge. In Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II (Berlin, Heidelberg, 2008), ECML PKDD'08, Springer-Verlag, pp. 342–357.
[18] Wang, X., Huang, T.-K., and Schneider, J. Active transfer learning under model shift. In Proceedings of the 31st International Conference on Machine Learning (Beijing, China, 22–24 Jun 2014), E. P. Xing and T. Jebara, Eds., vol. 32 of Proceedings of Machine Learning Research, PMLR, pp. 1305–1313.
[19] Yang, L., Hanneke, S., and Carbonell, J. A theory of transfer learning with applications to active learning. Machine Learning 90 (Feb. 2013).
[20] Zhu, C., Huang, W. R., Shafahi, A., Li, H., Taylor, G., Studer, C., and Goldstein, T. Transferable clean-label poisoning attacks on deep neural nets. arXiv preprint arXiv:1905.05897 (2019).