Rethinking Non-idealities in Memristive Crossbars for Adversarial Robustness in Neural Networks
Abhiroop Bhattacharjee and Priyadarshini Panda, Department of Electrical Engineering, Yale University, USA
Abstract — Deep Neural Networks (DNNs) have been shown to be prone to adversarial attacks. With a growing need to enable intelligence in embedded devices in this Internet of Things (IoT) era, secure hardware implementation of DNNs has become imperative. Memristive crossbars, being able to perform Matrix-Vector-Multiplications (MVMs) efficiently, are used to realize DNNs on hardware. However, crossbar non-idealities have always been devalued since they cause errors in performing MVMs, leading to degradation in the accuracy of the DNNs. Several software-based adversarial defenses have been proposed in the past to make DNNs adversarially robust. However, no previous work has demonstrated the advantage conferred by the non-idealities present in analog crossbars in terms of adversarial robustness. In this work, we show that the intrinsic hardware variations manifested through crossbar non-idealities yield adversarial robustness to the mapped DNNs without any additional optimization. We evaluate the resilience of state-of-the-art DNNs (VGG8 and VGG16 networks) using benchmark datasets (CIFAR-10 and CIFAR-100) across various crossbar sizes towards both hardware and software adversarial attacks. We find that crossbar non-idealities unleash greater adversarial robustness in DNNs than baseline software DNNs. We further assess the performance of our approach against other state-of-the-art efficiency-driven adversarial defenses and find that our approach performs significantly well in terms of reducing adversarial losses.

Index Terms — Deep Neural Networks, Memristive crossbars, Non-idealities, Adversarial robustness
1 INTRODUCTION
In recent years, resistive crossbar systems have received significant focus for their ability to realize Deep Neural Networks (DNNs) by efficiently computing analog dot-products [1], [2], [3]. These systems have been realized using a wide range of emerging technologies such as Resistive RAM (ReRAM), Phase Change Memory (PCM) and Spintronic devices [4], [5], [6]. These devices exhibit high on-chip storage density, non-volatility, low leakage and low-voltage operation, and thus enable compact and energy-efficient implementation of DNNs [7], [8].

Despite these advantages, the analog nature of dot-product computation in crossbars poses certain challenges owing to device-level and circuit-level non-idealities such as interconnect parasitics, process variations in the synaptic devices, driver and sensing resistances, etc. [8], [9]. Such non-idealities lead to errors in the analog dot-product computations in the crossbars, thereby adversely affecting DNN implementation in the form of accuracy degradation [10]. Numerous frameworks have been developed in the past to model the impact of non-idealities present in crossbar systems and accordingly retrain the weights (stored in synaptic devices) of the DNNs to mitigate accuracy degradation [9], [10], [11], [12].

Crossbar-based non-idealities have thus far been devalued because they lead to accuracy degradation in DNNs. However, an interesting aspect of these non-idealities, namely providing resilience to DNNs against adversarial attacks, has been unexplored. DNNs have been shown to be adversarially vulnerable [13]. A DNN can easily be fooled by applying structured, yet small, perturbations on the input, leading to high-confidence misclassification of the input. This vulnerability severely limits the deployment and potential safe use of DNNs for real-world applications such as self-driving cars, malware detection, healthcare monitoring systems, etc. [14], [15]. Thus, it is imperative to ensure that the DNN models used for such applications are robust against adversarial attacks. Recent works such as [16], [17] show that quantization methods, which primarily reduce the compute resource requirements of DNNs, act as a straightforward way of improving the robustness of DNNs against adversarial attacks. A recent work has led to the development of a framework called QUANOS that provides a structured method for hybrid quantization of different layers of a DNN to produce energy-efficient, accurate and adversarially robust models [15]. In [15], [16], the authors show that efficiency-driven hardware optimization techniques can be leveraged to mitigate software vulnerabilities, such as adversarial attacks, while yielding energy-efficiency. In this work, we present a comprehensive analysis of how device-level and circuit-level non-idealities intrinsic to analog crossbars can be leveraged for adversarial robustness in neural networks. To the best of our knowledge, we are the first to show that the intrinsic hardware variations manifested through non-idealities in crossbars intrinsically improve adversarial security without any additional optimization. Our main finding is that a DNN model mapped on hardware, while suffering accuracy degradation, is also more adversarially resilient than the baseline software DNN.

Contributions: In summary, the key contributions of this work are as follows:
• We employ a systematic framework in PyTorch [18] to map DNNs onto resistive crossbar arrays and investigate the cumulative impact of various circuit-level and device-level non-idealities in conferring adversarial robustness.
• We analyse the robustness of state-of-the-art DNNs, viz. VGG8 and VGG16 [19], using benchmark datasets CIFAR-10 and CIFAR-100 [20], respectively, across various crossbar dimensions.
• We show that crossbar-based non-idealities impart robustness to neural networks against both hardware- and software-based adversarial attacks.
• We find that non-idealities lead to higher adversarial robustness (for both FGSM- and PGD-based adversarial attacks on hardware) in DNNs mapped onto resistive crossbars than in DNNs evaluated on software.
• We investigate the role of various crossbar parameters (such as R_MIN) in unleashing adversarial robustness in DNNs mapped onto crossbars. We also study the impact of the input Pixel Discretization proposed in [16] together with crossbar non-idealities on adversarial robustness.
• A comparison of our proposed method with other state-of-the-art quantization techniques is also presented to emphasise the importance of hardware non-idealities in imparting resilience to DNNs against adversarial inputs.
2 RELATED WORKS
Prior research works have focused on modeling crossbar non-idealities to mitigate the accuracy degradation incurred when DNNs are mapped onto them. Several frameworks have been proposed, such as CxDNN [10], which employs matrix-inversion techniques combined with Kirchhoff's circuit laws to model the effect of interconnect parasitics and peripheral non-idealities in resistive crossbar arrays. The authors in [21] have presented an approximation technique based on sample input/output behavior. However, these analytical models take into account only linear, data-dependent non-idealities while modeling the crossbar instances. Recent frameworks such as GENIEx [9] use a neural network-based approach to accurately encapsulate the effects of both data-dependent and non-data-dependent non-idealities and assess their impact on accuracy degradation. PUMA is the first Instruction Set Architecture (ISA)-programmable inference accelerator based on hybrid CMOS-memristor crossbar technology, designed to maintain crossbar area and energy efficiency as well as storage density. PUMA has been shown to outperform other state-of-the-art CPUs, GPUs and ASICs for ML acceleration [7], [22], [23], [24]. Nevertheless, none of the aforementioned techniques or architectures have helped understand the advantages that the intrinsic non-idealities of crossbar structures may confer in terms of adversarial robustness to DNNs.

In recent years, several heuristic adversarial defense strategies have been developed, including adversarial training [13], [25], [26], [27], [28], randomization-based techniques [29], [30], [31] and denoising methods [32], [33], [34], [35]. However, these defenses might be broken by a new attack in the future, since they lack a theoretical error-rate guarantee [36]. Hence, researchers have strived to develop certified defense methods [37], [38], [39], [40], which always maintain a certain accuracy under a well-defined class of attacks [36]. Even though certified defense methods indicate a way to reach theoretically guaranteed security, their accuracy and efficiency are far from meeting practical requirements [36]. Apart from these, several quantization-based methods on software have been proposed of late, including works like [15], [16], [17], to improve the resilience of neural networks against adversarial perturbations. The work in [16] deals with discretization of the input space (reducing the allowed pixel levels from 256 values, or 8 bits, to 4 bits or 2 bits). It shows that input discretization improves the adversarial robustness of DNNs for a substantial range of perturbations, besides improving computational efficiency with minimal loss in test accuracy. Likewise, QUANOS [15] is a framework that performs layer-specific hybrid quantization of DNNs based on a metric termed Adversarial Noise Sensitivity (ANS) to make DNNs robust against adversarial perturbations. In contrast to prior works, we present a first-of-its-kind work that comprehensively studies the inherent advantage of hardware non-idealities in imparting adversarial robustness to DNNs, without relying upon other software-based optimization methodologies. Note that we also show that combining previously proposed optimization strategies, such as pixel discretization, with analog crossbars further improves robustness.
3 BACKGROUND

3.1 Adversarial Attacks
DNNs are vulnerable to adversarial attacks in which the model gets fooled by applying precisely calculated small perturbations on the input, leading to high-confidence misclassification [15]. The authors in [25] have proposed a method called the Fast Gradient Sign Method (FGSM) to generate the adversarial input by linearization of the loss function (L) of the trained models with respect to the input (X), as shown in equation (1):

$X_{adv} = X + \epsilon \times \mathrm{sign}(\nabla_X \mathcal{L}(\theta, X, y_{true}))$   (1)

Here, y_true is the true class label for the input X; θ denotes the model parameters (weights, biases, etc.) and ε quantifies the degree of distortion. The quantity Δ = ε × sign(∇_X L(θ, X, y_true)) is the net perturbation added to the input X, which is controlled by ε. It is noteworthy that gradient propagation is, thus, a crucial step in unleashing an adversarial attack. Furthermore, the contribution of the gradient to Δ varies across the layers of the network depending upon the activations [15]. In addition to FGSM-based attacks, multi-step variants of FGSM, such as Projected Gradient Descent (PGD) [13], have also been proposed, which cast stronger attacks.
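For concreteness, a minimal PyTorch sketch of FGSM generation per equation (1) follows; the model, loss and tensor names are generic placeholders, not the exact code of our framework:

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y_true, epsilon):
    """Generate an FGSM adversarial example per equation (1)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_true)   # L(theta, X, y_true)
    loss.backward()
    delta = epsilon * x_adv.grad.sign()            # eps * sign(grad_x L)
    return (x_adv + delta).detach()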
To build resilience against small adversarial perturbations, defense mechanisms such as gradient masking or obfuscation [41] have been proposed. Such methods construct a model devoid of useful gradients, thereby making it difficult to craft an adversarial attack.
Types of Attacks: Broadly, attacks to evaluate adversarial robustness are classified as Black-Box (BB) and White-Box (WB). WB attacks are launched when the attacker has complete knowledge of the target model parameters and training information. BB attacks, on the other hand, are launched when the attacker has no knowledge of the target model parameters. Resilience against WB adversaries also guarantees resilience against BB ones for a similar perturbation (ε) range [15]. Thus, all our subsequent experiments are based on WB adversaries for the assessment of adversarial robustness.

In this work, Clean Accuracy (CA) refers to the accuracy of a DNN when presented with the test dataset in the absence of an adversarial attack. We define Adversarial Accuracy (AA) as the accuracy of a DNN on the adversarial dataset created from the test data for a given task. Adversarial Loss (AL) is defined as the difference between CA and AA, i.e., AL = CA − AA. The higher the value of AA, the smaller the value of AL, which implies increased robustness against adversarial attacks.

3.2 Memristive Crossbars and Non-idealities

Resistive crossbar arrays can be harnessed to implement Matrix-Vector-Multiplications (MVMs) in an analog manner. Crossbars (Fig. 1(a)) consist of 2D arrays of synaptic devices (programmable resistors realized using emerging nanotechnologies), Digital-to-Analog (DAC) and Analog-to-Digital (ADC) converters, and a write circuit. The synaptic device at the intersection of each row and column is configured to a particular value of conductance (ranging from G_MIN to G_MAX) by enabling the corresponding write circuits along the Write Wordline (WWL) and the Bitline (BL). Thereafter, the MVMs are performed by converting the digital inputs to analog voltages on the Read Wordlines (RWLs) using the DACs, and sensing the output current flowing through the bitlines (BLs) using the ADCs [8].

Equation (2) shows the ideal MVM operation for an MxN crossbar, in which V_in is a 1xM vector comprising the input analog voltages, G_ideal is the MxN conductance matrix (formed by mapping the weights of a DNN onto the crossbar instances), and Iout_ideal is a 1xN vector comprising the output currents:
$Iout_{ideal} = V_{in} * G_{ideal}$   (2)
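As a toy illustration of equation (2) (the conductance bounds are derived from the R_MIN/R_MAX values of TABLE 2; this is a sketch, not the framework's actual mapping code):

import torch

# Ideal MVM on an MxN crossbar (equation (2)): DNN weights are encoded
# as conductances G_ideal in [G_MIN, G_MAX], inputs as analog voltages.
M, N = 32, 32
g_min, g_max = 1 / 200e3, 1 / 20e3           # siemens, R in [20k, 200k] ohms
G_ideal = g_min + (g_max - g_min) * torch.rand(M, N)
V_in = torch.rand(1, M)                       # 1xM input voltage vector
I_out_ideal = V_in @ G_ideal                  # 1xN output currents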
Non-idealities: The analog nature of the computation leads to various non-idealities that result in errors in the MVMs. These include device-level and circuit-level non-idealities in the resistive crossbars. Fig. 1(b) shows the equivalent circuit for the crossbar array and its peripherals, accounting for the non-idealities listed in TABLE 1. The circuit-level non-idealities are modelled as parasitic resistances. The cumulative effect of all the non-idealities is a deviation of the output current from its ideal value, yielding an Iout_non-ideal vector.

TABLE 1: Various circuit-level and device-level non-idealities in a resistive crossbar array

Type of non-idealities    Parameters
Circuit non-idealities    Rdriver, Rwire_row, Rwire_col, Rsense
Device non-idealities     Gaussian variation profile

The relative deviation of Iout_non-ideal from its ideal value is denoted by the non-ideality factor (NF) [9], such that:

$NF = (Iout_{ideal} - Iout_{non-ideal}) / Iout_{ideal}$   (3)

Thus, increased non-idealities in crossbars induce a greater value of NF. This can have a significant impact on the computational accuracy of crossbars and, therefore, cause degradation in the accuracy of the DNNs implemented on hardware [8], [9], [10].
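Continuing the sketch above, the device-level non-ideality of TABLE 1 can be emulated as a Gaussian spread on the conductances and NF computed per equation (3); circuit-level parasitics such as Rdriver and Rsense require the full circuit solve of [8] and are omitted here:

# Reuses G_ideal, V_in and I_out_ideal from the previous sketch.
# Gaussian device variation with sigma/mu = 10%, the value used in the
# experiments of Section 5.
G_nonideal = G_ideal * (1 + 0.10 * torch.randn_like(G_ideal))
I_out_nonideal = V_in @ G_nonideal
NF = (I_out_ideal - I_out_nonideal) / I_out_ideal   # equation (3)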
Crossbar Mapping: In this work, we use a procedure similar to that of [8], [10] for mapping DNNs onto crossbars of various dimensions, as shown in Fig. 3(b). First, the weights of each layer of the DNN are partitioned based on the size of the crossbar array used and mapped onto the crossbar instances. Thereafter, the corresponding conductance for each DNN weight in a crossbar instance is computed by taking into account the synaptic device parameters, viz. G_MIN, G_MAX and bit-precision. This gives us the ideal conductance matrix (G_ideal). Finally, we consider the circuit-level and device-level non-idealities present in a crossbar instance, specified in TABLE 1, and convert G_ideal into G_non-ideal using circuit laws (Kirchhoff's laws and Ohm's law) and linear algebraic operations [8]. This completes the mapping of the weights of the DNN onto the crossbar instances.
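A simplified sketch of the partitioning and weight-to-conductance steps (the helper names and the linear mapping rule are illustrative assumptions; sign handling via differential conductance pairs and bit-precision quantization are omitted):

import torch

def weights_to_conductances(W, g_min=1/200e3, g_max=1/20e3):
    """Map a weight tile linearly into the device conductance range
    [G_MIN, G_MAX] (illustrative mapping, not the exact framework code)."""
    w_min, w_max = W.min(), W.max()
    return g_min + (W - w_min) * (g_max - g_min) / (w_max - w_min + 1e-12)

def partition_into_crossbars(W, xbar_size=32):
    """Partition a layer's weight matrix into xbar_size x xbar_size tiles
    (zero-padding the borders) and map each tile to a G_ideal instance."""
    rows = -(-W.shape[0] // xbar_size) * xbar_size   # ceil to tile multiple
    cols = -(-W.shape[1] // xbar_size) * xbar_size
    W_pad = torch.zeros(rows, cols)
    W_pad[:W.shape[0], :W.shape[1]] = W
    return [weights_to_conductances(W_pad[i:i+xbar_size, j:j+xbar_size])
            for i in range(0, rows, xbar_size)
            for j in range(0, cols, xbar_size)]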
Non-idealities inherent in crossbars have so far been projected in a negative light, since they lead to degradation in clean accuracy when DNNs are mapped onto them. However, in this work, we show how the non-idealities (or an increased value of NF for a crossbar) lead to robustness of DNNs against adversarial attacks. Thus, we observe lower adversarial loss (AL) with respect to the corresponding software implementation of the DNNs. We argue that non-idealities intrinsically lead to defense via gradient obfuscation against adversarial perturbations, since gradient propagation, as discussed in Section 3.1, is crucial to initiate an adversarial attack.

Fig. 2 pictorially demonstrates the intuition behind the creation of an adversary in DNNs and how hardware non-idealities can cause gradient obfuscation. DNNs, being discriminative models, partition a very high-dimensional input space into different classes by learning appropriate decision boundaries. The class-specific decision boundaries simply divide the space into hypervolumes. These hypervolumes consist of the training data examples as well as large areas of unpopulated space that are arbitrary and untrained. The decision boundary learned during model training extrapolates to vast regions of unpopulated high-dimensional subspace because of linearity/generalization in the model behavior.
Fig. 1. (a) An ideal crossbar array; (b) a typical non-ideal crossbar array structure with resistive circuit-level non-idealities (Rdriver, Rwire_row, Rwire_col, Rsense), in which the output current is governed by I_j = f(V_i, G_ij(V_i), R_driver, R_sense, R_wire_row, R_wire_col). WWL: Write Wordline; RWL: Read Wordline; BL: Bitline.
Fig. 2. Pictorial depiction of the creation of adversaries for software- and hardware-based DNNs. (a) The data points (shown as 'dots') encompass the data manifold in the high-dimensional subspace. The classifier is trained to separate the data into different classes or hypervolumes, based on which the decision boundary is formed. Adversaries are created by perturbing the data points into empty regions or hypervolumes and are thus misclassified. (b) The decision boundaries get shifted owing to the crossbar-based non-idealities in hardware, resulting in the placement of certain data points into a different hypervolume, leading to accuracy degradation. However, due to gradient obfuscation owing to crossbar non-idealities, many data points remain restricted to their original hypervolumes upon perturbation. This results in better adversarial robustness in hardware-based DNNs.
But this exposes the model to adversarial attacks [16]. Adversarial perturbations, essentially, can shift a data point from its typical hypervolume region to another, leading to high-confidence misclassification. This has been shown in Fig. 2(a) with black arrows for a DNN evaluated on software. However, when a DNN is mapped onto crossbar arrays, the decision boundaries are shifted owing to the crossbar-based non-idealities, resulting in the placement of certain data points into a different hypervolume (Fig. 2(b)). This leads to misclassifications and hence degradation in the clean accuracy of the DNN. Also, on unleashing adversarial attacks on a DNN mapped on crossbars, the displacement of a data point in the high-dimensional subspace is altered in a different direction. This has been marked in Fig. 2(b) using violet arrows, which demarcate a different direction w.r.t. the one demarcated using black arrows (for DNNs evaluated on software). Thus, instead of moving into a different hypervolume, many of the perturbed data points remain restricted to their original hypervolumes, thereby resulting in lower adversarial losses (ALs) for the DNN and greater adversarial robustness.
Quantifying the intuition in Fig. 2: To support our gradient obfuscation argument, let us consider a DNN mapped onto crossbars as f. The net perturbation added to the input (X) in case of an adversarial attack is given by Δ = ε × sign(∇_X L(θ, X, y_true)) (refer to Section 3.1). Without loss of generality, we assume the loss function (L) of the hardware-mapped DNN to be a function of the output current emerging from a crossbar array (I_out), i.e., L = f(I_out). Since DNNs are sufficiently linear owing to the ReLU activation functions being used, we can assume that L ≈ I_out. In the ideal scenario of crossbars with no non-idealities, L ≈ Iout_ideal, which implies:

$\Delta_{ideal} = \epsilon \times \mathrm{sign}(\nabla_X(Iout_{ideal}))$   (4)

However, in the case of non-idealities being present in crossbar structures, Iout_non-ideal = Iout_ideal − γ, where γ denotes the deviation of the output current of the crossbar from its ideal value due to the inherent non-idealities. Hence, in the non-ideal scenario, we have:

$\Delta_{non-ideal} = \epsilon \times \mathrm{sign}(\nabla_X(Iout_{ideal} - \gamma))$   (5)

From equation (5), we find that there is a deviation in the adversarial perturbation from its ideal value owing to crossbar non-idealities. This explains the altered displacement of data points in the high-dimensional subspace w.r.t. the direction of displacement in case of a DNN evaluated on software (Fig. 2(b)).
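A toy numerical sketch of this effect: with a linear loss surrogate, the FGSM direction is the sign of the row sums of the effective weights, and a γ-like deviation flips that sign in some input dimensions (the noise model and its 10% scale are illustrative assumptions):

import torch

torch.manual_seed(0)
d_in, d_out = 64, 10
W = torch.randn(d_in, d_out)          # signed effective weights of a layer

# gamma emulated as a 10% random perturbation of the effective weights,
# an illustrative stand-in for the crossbar deviation of equation (5).
W_nonideal = W * (1 + 0.10 * torch.randn_like(W))

# With L ~ sum of output currents, grad_x L is just the row sums of W.
grad_ideal = W.sum(dim=1)
grad_nonideal = W_nonideal.sum(dim=1)

flip = (grad_ideal.sign() != grad_nonideal.sign()).float().mean().item()
print(f"FGSM sign disagreement across input dimensions: {flip:.1%}")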
In this work, we employ a framework in PyTorch, similar to RxNN [8], to map DNNs onto resistive crossbar arrays and investigate the cumulative impact of the circuit-level and device-level non-idealities (listed in TABLE 1) on the robustness of neural networks against adversarial inputs.
4 METHODOLOGY
The methodology described in Fig. 3(a) is adopted to assess the robustness of DNNs against adversarial inputs when implemented on hardware. The entire process is divided into two parts.

Part 1: We employ the benchmark datasets CIFAR-10 and CIFAR-100 to evaluate VGG8 and VGG16 networks, respectively. These networks are first trained in PyTorch with the appropriate training datasets. Subsequently, we obtain two kinds of trained models:

1) Model-1: A standard model trained without adding any random noise to its activations.

2) Model-2: A model trained with random noise added to all neuronal activation values. Such noise-enabled training has been used in past works [42] to mitigate the accuracy degradation observed when mapping DNNs onto crossbars. Essentially, adding random noise to the neuronal activations is a crude and approximate way of modeling non-idealities during the training process; a minimal sketch of such noise injection follows.
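The noise placement and scale below are illustrative assumptions; the paper does not specify them:

import torch
import torch.nn as nn

class NoisyReLU(nn.Module):
    """ReLU followed by additive Gaussian noise on the activations,
    applied only during training (illustrative sigma)."""
    def __init__(self, sigma=0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        x = torch.relu(x)
        if self.training:
            x = x + self.sigma * torch.randn_like(x)
        return x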
Attack-SW: We launch FGSM and PGD attacks on the software models by adding adversarial perturbations to the clean test inputs. We record the adversarial accuracies (AAs) and adversarial losses (ALs) for each attack.
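PGD can be sketched as iterated FGSM with projection onto the ε-ball around the clean input (the step size and iteration count are illustrative; fgsm_attack is the helper sketched in Section 3.1, and image-range clamping is omitted for brevity):

def pgd_attack(model, x, y_true, epsilon, alpha=None, steps=7):
    """Multi-step FGSM (PGD): take small signed-gradient steps and
    project back into the epsilon-ball around the clean input."""
    alpha = alpha or epsilon / 4           # illustrative step size
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv = fgsm_attack(model, x_adv, y_true, alpha)
        x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)   # projection
    return x_adv.detach()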
Part 2: Using a PyTorch-based framework, we layer-wise map the software DNN weights, separately for both Model-1 and Model-2, onto resistive crossbar instances of sizes 16x16, 32x32 and 64x64, following the mapping procedure described in Section 3.2. The crossbar parameters used are listed in TABLE 2.

TABLE 2: Parameters and their values associated with a resistive crossbar array

Parameter    Value
Rdriver      1 kΩ
Rwire_row    5 Ω
Rwire_col    10 Ω
Rsense       1 kΩ
R_MIN        20 kΩ
R_MAX        200 kΩ

We calculate the CAs of the crossbar-mapped DNNs for both Model-1 and Model-2, which are expected to be lower than the values obtained for the software DNNs. Thereafter, we launch FGSM and PGD attacks on the mapped crossbar-based models in two modes:

1) Attack-1: The adversarial perturbations for each attack, FGSM and PGD, are created using the software-based DNN model's loss function and then added to the clean input to yield the adversarial input. The generated adversaries are then fed to the crossbar-mapped DNN to monitor AL.

2) Attack-2: The adversarial inputs are generated for each attack, FGSM and PGD, using the loss from the crossbar-based hardware models. As a result, we can expect the adversaries in this case not to be as strong as the Attack-1 adversaries, owing to the presence of non-idealities that can interfere with the attack generation process.

We finally record the adversarial accuracies (AAs) and adversarial losses (ALs) for all modes: Attack-SW, Attack-1 and Attack-2; a condensed sketch of this evaluation loop follows.
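In the sketch below, model_sw and model_xbar denote the software and crossbar-mapped models, and the attack helpers are the ones sketched earlier; the function is a simplified stand-in for the framework's evaluation code:

def adversarial_loss(model_eval, model_attack, attack_fn, loader, eps):
    """AL = CA - AA: accuracy drop of model_eval on adversaries
    generated from model_attack's gradients."""
    correct_clean, correct_adv, total = 0, 0, 0
    for x, y in loader:
        x_adv = attack_fn(model_attack, x, y, eps)
        correct_clean += (model_eval(x).argmax(1) == y).sum().item()
        correct_adv += (model_eval(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return 100.0 * (correct_clean - correct_adv) / total

# Attack-SW: adversarial_loss(model_sw,   model_sw,   fgsm_attack, loader, eps)
# Attack-1:  adversarial_loss(model_xbar, model_sw,   fgsm_attack, loader, eps)
# Attack-2:  adversarial_loss(model_xbar, model_xbar, fgsm_attack, loader, eps)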
5 RESULTS AND DISCUSSION
The parameters pertaining to the non-ideal resistive crossbars used for mapping the DNNs are listed in TABLE 2 and are employed for all the simulations unless stated otherwise. Device-level process variation in the experiments below has been modelled as a Gaussian variation in the resistances of the synaptic devices with σ/µ = 10%.

Fig. 4 presents a comparison of the clean accuracies of the trained VGG8 and VGG16 networks (of the Model-1 type) when evaluated on software and after mapping onto non-ideal crossbars of various dimensions (excluding the device-level variations). It can be observed that the clean accuracies drop post-mapping on the crossbars, which is a direct implication of the inherent non-idealities in a crossbar causing errors in MVMs, as discussed in Section 3.2. We also see that accuracies drop more for larger crossbars. In the subsequent subsections, we discuss the implications on adversarial robustness of inducing Attack-1 and Attack-2 on the crossbar-mapped models.
In Fig. 5, it can be observed that the ALs in case of an FGSM attack on the DNNs mapped onto crossbars of various dimensions (16x16, 32x32, 64x64) are lower than that of a DNN evaluated on software. For different attack strengths quantified by ε values, the value of AL in case of Attack-SW is significantly greater than for Attack-1 or Attack-2. In other words, the hardware-based non-idealities that come into play when DNNs are mapped onto crossbars provide robustness against adversarial inputs. The PGD attack, being a multi-step variant of the FGSM attack, is much stronger and yields significantly higher adversarial losses in DNNs than FGSM attacks. Similar to the case of FGSM attacks, non-idealities in crossbars provide robustness to the mapped DNNs against adversarial inputs, as shown in Fig. 6.
Fig. 3. (a) Flow diagram of the methodology: a fixed-precision VGG8/VGG16 network is trained on software (CIFAR-10 with VGG8; CIFAR-100 with VGG16) to generate Model-1 and Model-2 and determine CA; FGSM and PGD attacks are launched on the software models (Attack-SW) to determine AA and AL for different values of ε; the weights of the trained DNNs are mapped onto 16x16, 32x32 and 64x64 crossbars using a PyTorch-based framework with circuit-level non-idealities and a device-level variation profile; CA is evaluated after crossbar mapping; FGSM/PGD attacks are launched on the mapped DNN models (Attack-1 and Attack-2) for different values of ε, and AA and AL are determined in each case. (b) Pictorial depiction of the steps in mapping an 8x8 weight matrix onto nine 3x3 crossbar instances: the weight matrix is partitioned into G_ideal tiles, circuit-level and device-level non-idealities are added to obtain G_non-ideal, and all non-ideal conductance matrices are merged.

TABLE 3: AL (%) for different values of ε in case of Attack-2 (PGD) on crossbar sizes of 16x16, 32x32 and 64x64, on Model-1 and Model-2 of the VGG8 network with the CIFAR-10 dataset.
Fig. 4. Bar diagram showing the CA of the VGG8 and VGG16 networks on software and on crossbars of sizes 16x16, 32x32 and 64x64.
For both FGSM and PGD attacks, we find that the ALs in case of Attack-2 are lower than those in case of Attack-1, indicating that the mapped DNNs are more resilient to adversarial perturbations created using the crossbar-based hardware models than to the software-based perturbations. Interestingly, we also find that larger crossbar sizes provide greater robustness against adversarial attacks (characterized by lower values of AL for the same value of ε) than smaller ones. This is because larger crossbars involve a greater number of parasitic components (non-idealities), thereby imparting more robustness. This is shown in TABLE 3, where the 64x64 crossbar provides the best robustness among the crossbar sizes considered.

Fig. 7 shows the variation in the CAs of Model-1 and Model-2 for both the baseline software DNN (VGG8 network) and when mapped onto crossbars. We find that on training a DNN with random noise added to its activations (Model-2), the CAs are significantly lower than those for a normal DNN (Model-1). As already discussed in the case of Model-1, we observe similar results for both FGSM and PGD attacks on Model-2, shown in TABLE 4 and TABLE 5, all of which affirm that hardware-based non-idealities lead to a reduction in adversarial losses and an improvement in adversarial robustness. Furthermore, in case of Model-2, we also find that larger crossbar sizes provide greater robustness against adversarial attacks than smaller ones, as indicated by TABLE 3, where the AL for a particular value of ε is highest for a 16x16 crossbar, followed by a 32x32 crossbar, and lowest for a 64x64 crossbar. From the values of AL presented in TABLE 3, the reader might be misled into thinking that Model-2 yields greater adversarial robustness than Model-1 when mapped on crossbars.

Fig. 5. Plots of AL vs. ε for Attack-SW, Attack-1 and Attack-2 (FGSM) on Model-1 (VGG8 network with the CIFAR-10 dataset) for crossbar sizes (a) 16x16; (b) 32x32; (c) 64x64.

Fig. 6. Plots of AL vs. ε for Attack-SW, Attack-1 and Attack-2 (PGD) on Model-1 (VGG8 network with the CIFAR-10 dataset) for crossbar sizes (a) 16x16; (b) 32x32; (c) 64x64.

TABLE 4: AL (%) for different values of ε in case of Attack-SW, Attack-1 and Attack-2 (FGSM) on Model-2 (VGG8 network with the CIFAR-10 dataset) for crossbar sizes of 16x16 and 32x32.

TABLE 5: AL (%) for different values of ε in case of Attack-SW, Attack-1 and Attack-2 (PGD) on Model-2 (VGG8 network with the CIFAR-10 dataset) for crossbar sizes of 16x16 and 32x32.
However, the general trend of lower values of AL for Model-2 relative to Model-1 for a given ε is due to the significantly smaller CAs of Model-2 compared with Model-1 (Fig. 7), and not due to higher values of AA.
Effect of R_MIN on adversarial robustness: The effective resistance of a crossbar structure is the parallel combination of the resistances along its rows and columns. A smaller value of R_MIN reduces the effective resistance of the crossbar and increases the value of NF for the crossbar [9]. Since we have already argued that an increased value of NF improves the adversarial robustness of crossbars, on decreasing R_MIN to 10 kΩ (maintaining a constant R_MAX/R_MIN ratio of 10) we find that the ALs (for a PGD attack) with the smaller R_MIN are lower than the corresponding ALs for the larger R_MIN, as shown in Fig. 8.

However, we also observe that with the smaller R_MIN, the DNN achieves greater robustness against Attack-1 than against Attack-2, contrary to what has been observed in Fig. 5 and Fig. 6. This is because of the larger adversarial perturbations created on hardware during Attack-2 with the lower R_MIN: lowering R_MIN causes larger output currents in the crossbar arrays (due to the smaller effective resistance of the crossbars). To verify this, we employ a metric called the Distortion Coefficient (d), which quantifies the degree of distortion of the test images over a batch during an adversarial attack. Mathematically, it is given as:

$d = \frac{\sum_i |N_C - N_A|}{N}$   (6)

where N_C is the normalized pixel value of the clean image, N_A is the normalized pixel value of the adversarially perturbed image, i denotes the index of a pixel in an image, and N is the total number of pixels in an image.
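A short sketch of equation (6) over a batch of image tensors (assuming pixel values normalized to [0, 1]):

def distortion_coefficient(x_clean, x_adv):
    """Equation (6): per-image mean absolute pixel distortion,
    averaged over the batch."""
    per_image = (x_clean - x_adv).abs().flatten(1).mean(dim=1)
    return per_image.mean().item()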
Fig. 7. Comparison of the CAs of the VGG8 network (software-based as well as mapped onto 16x16, 32x32 and 64x64 crossbars) for Model-1 and Model-2 using the CIFAR-10 dataset.

Fig. 8. Bar diagram showing the ALs in case of Attack-1 and Attack-2 (PGD, ε = 2/255, 8/255 and 32/255) for a VGG8 network mapped on 32x32 crossbars using the CIFAR-10 dataset, for two values of R_MIN (10 kΩ and 20 kΩ) at a constant R_MAX/R_MIN ratio of 10.

From TABLE 6, we observe that the distortion coefficient d over a batch of images is greater for Attack-2 than for Attack-1. This verifies that Attack-2 is stronger than Attack-1 and hence greater adversarial robustness is observed for Attack-1 w.r.t. Attack-2 with the lower R_MIN.

Effect of R_MAX at constant R_MIN on adversarial robustness:
Fig. 9 shows results for a PGD attack on a VGG8 network mapped onto crossbars with R_MIN held constant and the R_MAX/R_MIN ratio increased by raising the value of R_MAX. We find that even increasing the R_MAX/R_MIN ratio to 200 results in no added advantage in terms of adversarial robustness for Attack-1 or Attack-2. Hence, R_MIN has a greater impact on adversarial robustness than R_MAX.

TABLE 6: Distortion coefficient over a batch (calculated using equation (6)) and AL for a PGD attack (ε = 8/255) on the VGG8 network mapped onto a 32x32 crossbar with the CIFAR-10 dataset; R_MIN = 10 kΩ and R_MAX/R_MIN = 10 for the crossbar.

Effect of process variation on adversarial robustness: Fig. 10 shows results for a PGD attack on a VGG8 network mapped onto crossbars while varying the σ/µ ratio pertaining to synaptic device variation. Similar to the case of increasing the value of R_MAX, we find no added advantage in terms of adversarial robustness for Attack-1 or Attack-2 on increasing the Gaussian variation in the devices of the crossbars.

Fig. 9. Bar diagram showing CAs and ALs (for PGD-based Attack-1 and Attack-2) for a VGG8 network mapped on 32x32 crossbars using the CIFAR-10 dataset, for different values of R_MAX (R_MAX/R_MIN = 10, 20 and 200), with (a) ε = 2/255; (b) ε = 8/255; (c) ε = 32/255.

Fig. 10. Bar diagram showing CAs and ALs (for PGD-based Attack-1 and Attack-2) for a VGG8 network mapped on 32x32 crossbars using the CIFAR-10 dataset, for different values of σ/µ (synaptic device variation), with (a) ε = 2/255; (b) ε = 8/255; (c) ε = 32/255.
Studying the combined effect of input pixel discretization and crossbar non-idealities: In [16], the authors show that discretizing the input pixels from the 256-level (8-bit) space to 4-bit or 2-bit levels improves the adversarial resilience of software DNNs. Here, we unleash an FGSM attack on the VGG8 network mapped onto 32x32 crossbars, with the input image pixels of the CIFAR-10 test dataset discretized to 4 bits (16 levels) and 2 bits (4 levels); a sketch of this discretization is shown below. The results are shown in Fig. 11. Interestingly, we find that with pixel discretization, the ALs on the crossbar-mapped DNN for both Attack-1 and Attack-2 attain a fixed value and do not vary on increasing ε from 0.1 to 0.3. This implies that input pixel discretization does not necessarily help resiliency when attacking hardware-mapped DNNs. For lower values of ε, greater adversarial robustness is observed without pixel discretization. At the higher value of ε (ε = 0.3), the combined effect of 4-bit pixel discretization and crossbar non-idealities outperforms the rest in terms of adversarial robustness. Furthermore, 2-bit pixel discretization not only reduces the clean accuracy but also imparts marginally less adversarial robustness than 4-bit pixel discretization for both Attack-1 and Attack-2.

Fig. 11. Bar diagram showing CAs and ALs (for FGSM-based Attack-1 and Attack-2) for a VGG8 network mapped on 32x32 crossbars using the CIFAR-10 dataset, for different bit-discretizations of the input pixels (4-bit and 2-bit), with (a) ε = 0.1; (b) ε = 0.2; (c) ε = 0.3.
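A minimal sketch of the input pixel discretization used here (assuming inputs normalized to [0, 1]):

import torch

def discretize_pixels(x, bits=4):
    """Quantize pixels in [0, 1] to 2**bits levels (16 levels for 4-bit,
    4 levels for 2-bit), as in the input discretization defense of [16]."""
    levels = 2 ** bits - 1
    return torch.round(x * levels) / levels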
The results shown in Fig. 12 are similar to those for the VGG8 network evaluated with the CIFAR-10 dataset. Crossbar-based non-idealities impart adversarial robustness to the mapped VGG16 network against both FGSM- and PGD-based attacks. However, with the CIFAR-100 dataset, we clearly observe that the DNN shows greater adversarial robustness against the PGD attack in case of Attack-2 w.r.t. Attack-1 than what is observed with the CIFAR-10 dataset. Quantitatively, the robustness gain of Attack-2 w.r.t. Attack-1 is larger with the CIFAR-100 dataset than with the CIFAR-10 dataset, albeit the drop in clean accuracy for the DNN mapped onto crossbars is higher in case of the CIFAR-100 dataset (Fig. 4).

Fig. 12. (a)-(b) Plots of AL vs. ε for Attack-SW, Attack-1 and Attack-2 (FGSM) on Model-1 (VGG16 with the CIFAR-100 dataset) for crossbar sizes 16x16 and 32x32, respectively; (c)-(d) plots of AL vs. ε for Attack-SW, Attack-1 and Attack-2 (PGD) on Model-1 (VGG16 with the CIFAR-100 dataset) for crossbar sizes 16x16 and 32x32, respectively.
Comparison with Related Works: We compare the performance of non-ideality-driven adversarial robustness in crossbars against the state-of-the-art software-based adversarial techniques described in [15], [16]. Note that [15], [16] use efficiency-driven transformations (that implicitly translate to hardware benefits), such as quantization, to improve resilience. In contrast, our work utilizes explicit hardware variations to improve robustness. We aim to compare the robustness obtained from implicit and explicit hardware techniques. We observe that for the single-step FGSM attack on a VGG16 network mapped on 32x32 crossbars (a Model-1 type DNN), the adversarial robustness due to crossbar non-idealities (the Attack-1 results) outperforms all other techniques (Fig. 13(a)). For the multi-step PGD attack, Attack-1 ranks second (Fig. 13(b)). With respect to the 4-bit (4b) pixel discretization of the input data [16], non-idealities in crossbars impart greater adversarial robustness in case of both the FGSM attack and the PGD attack. On the other hand, in case of the FGSM attack, crossbar-based non-idealities impart greater adversarial robustness than QUANOS [15], while for the PGD attack, QUANOS outperforms our approach.

Fig. 13. (a) Comparison of our proposed method with other state-of-the-art adversarial defenses during an FGSM attack using the VGG16 network and the CIFAR-100 dataset; (b) comparison of our proposed method with other state-of-the-art adversarial defenses during a PGD attack using the VGG16 network and the CIFAR-100 dataset.

6 CONCLUSION
In this work, we perform a comprehensive analysis to show how crossbar-based non-idealities can be harnessed for adversarial robustness. This work brings in a new standpoint that does not devalue the importance of non-idealities or parasitics present in crossbar systems. We develop a framework based on PyTorch that maps state-of-the-art DNNs (VGG8 and VGG16 networks) onto resistive crossbar arrays and evaluates them with benchmark datasets (CIFAR-10 and CIFAR-100). We show that circuit-level non-idealities (e.g., interconnect parasitics) and synaptic device-level non-idealities intrinsically provide robustness to the mapped DNNs against adversarial attacks, such as FGSM and PGD attacks. This is reflected by lower accuracy degradations during adversarial attacks for DNNs mapped on crossbars than for software-based DNNs. We also find that larger crossbar sizes extend greater resilience to the DNNs, even against stronger PGD attacks.

We investigate the influence of various crossbar parameters on the adversarial robustness of the mapped DNNs. While large values of R_MAX do not produce any appreciable effect on adversarial robustness, a smaller value of R_MIN makes the network more adversarially robust. Furthermore, increasing the σ/µ ratio of the synaptic devices pertaining to process variation does not yield any significant benefit in terms of adversarial robustness. We further compare the performance of our non-ideality-driven approach to adversarial robustness in a 32x32 crossbar with other state-of-the-art software-based adversarial defense techniques on the CIFAR-100 dataset. We find that our approach performs significantly well in terms of reducing adversarial losses during FGSM and PGD attacks.

In our present work, in order to substantiate our claim, we have considered a crossbar system that does not include selector devices (such as MOSFETs) connected in series with the resistive synaptic devices. In other words, we have not considered the impact of the non-idealities pertaining to a 1T-1R crossbar system, which are non-linear in nature. Thus, in our future work we shall extend our analysis to 1T-1R memristive crossbar arrays by employing an architecture similar to GENIEx [9], which accounts for both data-dependent and data-independent non-idealities while modeling the crossbar instances. Finally, our comprehensive analysis and encouraging results establish the idea of rethinking analog crossbar computing for adversarial security in addition to energy efficiency.

ACKNOWLEDGEMENT
This work was supported in part by the National Science Foundation.

REFERENCES

[1] Catherine D. Schuman et al. A Survey of Neuromorphic Computing and Neural Networks in Hardware. 2017. arXiv:1705.06963 [cs.NE].
[2] H.-S. P. Wong et al. Metal-Oxide RRAM. In: Proceedings of the IEEE (2012).
[4] Wei-Hao Chen et al. Circuit design for beyond von Neumann applications using emerging memory: From nonvolatile logics to neuromorphic computing. 2017, pp. 23-28.
[5] A. Sengupta, Y. Shim, and K. Roy. Proposal for an All-Spin Artificial Neural Network: Emulating Neural and Synaptic Functionalities Through Domain Wall Motion in Ferromagnets. In: IEEE Transactions on Biomedical Circuits and Systems.
[6] In: IEEE Transactions on Nanotechnology.
[7] Aayush Ankit et al. PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference. 2019. arXiv:1901.10351 [cs.ET].
[8] Shubham Jain et al. RxNN: A Framework for Evaluating Deep Neural Networks on Resistive Crossbars. 2018. arXiv:1809.00072 [cs.ET].
[9] Indranil Chakraborty et al. GENIEx: A Generalized Approach to Emulating Non-Ideality in Memristive Xbars using Neural Networks. 2020. arXiv:2003.06902 [cs.ET].
[10] Shubham Jain and Anand Raghunathan. CxDNN: Hardware-Software Compensation Methods for Deep Neural Networks on Resistive Crossbar Systems. In: ACM Trans. Embed. Comput. Syst.
[11] X-CHANGR: Changing Memristive Crossbar Mapping for Mitigating Line-Resistance Induced Accuracy Degradation in Deep Neural Networks. 2019. arXiv:1907.00285 [cs.ET].
[12] I. Chakraborty, D. Roy, and K. Roy. Technology Aware Training in Memristive Neuromorphic Systems for Nonideal Synaptic Crossbars. In: IEEE Transactions on Emerging Topics in Computational Intelligence.
[13] Aleksander Madry et al. Towards Deep Learning Models Resistant to Adversarial Attacks. 2017. arXiv:1706.06083 [stat.ML].
[14] Nicholas Carlini et al. On Evaluating Adversarial Robustness. 2019. arXiv:1902.06705 [cs.LG].
[15] Priyadarshini Panda. QUANOS: Adversarial Noise Sensitivity Driven Hybrid Quantization of Neural Networks. 2020. arXiv:2004.11233 [cs.LG].
[16] Priyadarshini Panda, Indranil Chakraborty, and Kaushik Roy. Discretization Based Solutions for Secure Machine Learning Against Adversarial Attacks. In: IEEE Access.
[17] Ji Lin, Chuang Gan, and Song Han. Defensive Quantization: When Efficiency Meets Robustness. 2019. arXiv:1904.08444 [cs.LG].
[18] Adam Paszke et al. Automatic differentiation in PyTorch. In: NIPS-W. 2017.
[19] Karen Simonyan and Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. 2014. arXiv:1409.1556 [cs.CV].
[20] Alex Krizhevsky. Learning multiple layers of features from tiny images. Tech. rep. 2009.
[21] Beiye Liu et al. Reduction and IR-drop compensations techniques for reliable neuromorphic computing systems. In: IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (2014), pp. 63-70.
[22] Aayush Ankit et al. RESPARC: A reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks. In: Proceedings of the 54th Annual Design Automation Conference. 2017, pp. 1-6.
[23] Ali Shafiee et al. ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars. In: ACM SIGARCH Computer Architecture News.
[24] In: ACM SIGARCH Computer Architecture News.
[25] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Harnessing Adversarial Examples. 2014. arXiv:1412.6572 [stat.ML].
[26] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial Machine Learning at Scale. 2016. arXiv:1611.01236 [cs.CV].
[27] Harini Kannan, Alexey Kurakin, and Ian Goodfellow. Adversarial Logit Pairing. 2018. arXiv:1803.06373 [cs.LG].
[28] Hyeungill Lee, Sungyeob Han, and Jungwoo Lee. Generative Adversarial Trainer: Defense to Adversarial Perturbations with GAN. 2017. arXiv:1705.03387 [cs.LG].
[29] Cihang Xie et al. Mitigating Adversarial Effects Through Randomization. 2017. arXiv:1711.01991 [cs.CV].
[30] Xuanqing Liu et al. Towards Robust Neural Networks via Random Self-ensemble. 2017. arXiv:1712.00673 [cs.LG].
[31] Guneet S. Dhillon et al. Stochastic Activation Pruning for Robust Adversarial Defense. 2018. arXiv:1803.01442 [cs.LG].
[32] Weilin Xu, David Evans, and Yanjun Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. In: Proceedings 2018 Network and Distributed System Security Symposium (2018).
[33] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models. 2018. arXiv:1805.06605 [cs.CV].
[34] Dongyu Meng and Hao Chen. MagNet: A Two-Pronged Defense against Adversarial Examples. 2017. arXiv:1705.09064 [cs.CR].
[35] Fangzhou Liao et al. Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser. 2017. arXiv:1712.02976 [cs.CV].
[36] Kui Ren et al. Adversarial Attacks and Defenses in Deep Learning. In: Engineering.
[37] Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Certified Defenses against Adversarial Examples. 2018. arXiv:1801.09344 [cs.LG].
[38] Aman Sinha et al. Certifying Some Distributional Robustness with Principled Adversarial Training. 2017. arXiv:1710.10571 [stat.ML].
[39] Yiwen Guo et al. Sparse DNNs with Improved Adversarial Robustness. 2018. arXiv:1810.09619 [cs.LG].
[40] Xuanqing Liu et al. Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network. 2018. arXiv:1810.01279 [cs.LG].
[41] Nicolas Papernot et al. Practical Black-Box Attacks against Machine Learning. 2016. arXiv:1602.02697 [cs.CR].