Methodology for Realizing VMM with Binary RRAM Arrays: Experimental Demonstration of Binarized-ADALINE Using OxRAM Crossbar
Sandeep Kaur Kingra, Vivek Parmar, Shubham Negi, Sufyan Khan, Boris Hudec, Tuo-Hung Hou, Manan Suri
Accepted as a conference paper at ISCAS 2020
Indian Institute of Technology Delhi, Hauz Khas, New Delhi - 110016, India, Email ID: [email protected]
National Chiao Tung University, Hsinchu 300, Taiwan
Abstract — In this paper, we present an efficient hardware mapping methodology for realizing vector matrix multiplication (VMM) on resistive memory (RRAM) arrays. Using the proposed VMM computation technique, we experimentally demonstrate a binarized-ADALINE (Adaptive Linear) classifier on an OxRAM crossbar. An 8 × 8 crossbar with a Ni/3 nm HfO2/7 nm Al-doped-TiO2/TiN device stack is used. Weight training for the binarized-ADALINE classifier is performed ex-situ on the UCI cancer dataset. Post weight generation, the OxRAM array is carefully programmed to binary weight-states using the proposed weight mapping technique on a custom-built testbench. Our VMM-powered binarized-ADALINE network achieves a classification accuracy of 78% in simulation and 67% in experiments. Experimental accuracy was found to drop mainly due to crossbar-inherent sneak-path issues and RRAM device programming variability.

I. INTRODUCTION
In-Memory Computing (IMC) and analog hardware Vector Matrix Multiplication (VMM) approaches offer efficient alternatives to conventional computing for resource-hungry modern AI workloads [1]. Advanced neural networks such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) involve extensive use of VMM operations. In particular, emerging resistive memory (RRAM) nanodevice based arrays offer an efficient and compact option for performing VMM operations in hardware. Several research groups [2]–[6] have demonstrated this analog computing method for a variety of applications such as linear equation solvers [6], image processing [7], data compression [8], feature extraction [9], neural network inference [10], in-situ training [10], [11], etc. Multiple emerging memory nanodevice technologies have been utilized for this application: OxRAM [1], MRAM [11], PCM [12], Ferroelectric FET [13], ECRAM [14], Flash [15], etc. Owing to their higher device density, crossbar structures are preferred for VMM applications [16]. Typical implementations of crossbar-based VMM operations utilize analog conductance states of RRAM nanodevices [1], [2], [4], [17]. However, the difficulty of obtaining multiple reliably programmable resistance states, the non-linearity of analog RRAM device conductance, and variability-related issues pose a major challenge to this approach. Furthermore, complex program-and-verify schemes may be required in certain cases [18]. In the case of selector-free crossbars, these issues are further enhanced due to the presence of sneak-paths [19].

Fig. 1. (a) Basic VMM operation implemented using a two-terminal resistive nanodevice crossbar. (b) Structure of a generic ADALINE network.

Binary neural networks (BNNs) have been shown to have better performance (i.e. lower memory, time and energy requirements) [20], [21], compared to their full-precision (analog) counterparts, at the cost of a marginal accuracy trade-off. Hence, using crossbar-based VMM for hardware realization of BNNs would face fewer challenges compared to fully analog alternatives. Furthermore, the use of binary memory states leads to simplification of programming, with relatively less impact on device endurance. Recently, BNNs have been demonstrated on an RRAM matrix using 2T-2R synaptic cells [22]. However, the true density benefit of emerging RRAM technology can be exploited only by using selector-free RRAM crossbars. In the current study, we propose a hardware mapping methodology for realizing VMM using binary RRAM crossbars. Further, we experimentally demonstrate the proposed technique with a case-study of binarized-ADALINE using an OxRAM crossbar. To the best of our knowledge, this is the first experimental validation of BNNs using selector-free RRAM crossbars. Compared to literature, we present the following novel concepts in this paper:
1) Weight mapping methodology, algorithm and operation scheduling for implementing BNNs on any RRAM array (e.g. OxRAM, CBRAM, PCM, etc.).
2) Experimental demonstration of a binarized-ADALINE network on an 8 × 8 OxRAM crossbar.

This document is the paper as accepted for presentation at the IEEE International Symposium on Circuits and Systems (ISCAS) 2020. ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

The rest of the paper is organized as follows. Section II covers the basic concepts relevant for this work. Section III describes the proposed methodology for mapping BNNs and training a binarized-ADALINE network. In Section IV, we describe the custom test platform in detail, along with experimental results on the fabricated OxRAM crossbar.

II. BASICS AND BACKGROUND
A. Vector Matrix Multiplication (VMM) in Hardware
Fig. 1(a) describes the standard implementation of the VMM operation in a generic two-terminal resistive nanodevice crossbar-based architecture. Input is applied in the form of voltages (V_in,i) across all rows in parallel. Based on the conductance state (G_i,j) of the devices, current integrates on each column. The resulting integrated current is converted to voltage using a Trans-Impedance Amplifier (TIA) with feedback resistor (R_f). The output voltage (V_out,j), i.e. the output of the VMM operation across a given column j of the matrix, is defined by Eq. 1:

V_out,j = R_f × Σ_i (V_in,i × G_i,j)    (1)

B. ADALINE

'Adaptive Linear Element', also known as ADALINE, was one of the first networks proposed, based on "memistors" (not memristors) [24]. It is used for a variety of classification applications that require fast computation, such as power-quality event detection [25], stereo-vision matching [26], etc. It is an adaptive threshold logic element representative of a neuron used in ANNs (Artificial Neural Networks). As shown in Fig. 1(b), it consists of an adaptive linear combiner cascaded with a hard-limiting quantizer, which is used to produce a binary output, Y_k = sign(s_k). The bias weight w_k, which is connected to a constant input x = +1, effectively controls the threshold level of the quantizer. In single-element neural networks, an adaptive algorithm such as Least Mean Square (LMS) or the Perceptron rule is used for training the weights of ADALINE.

III. PROPOSED METHODOLOGY FOR VMM-BASED BNN
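The methodology below builds on the ideal crossbar readout of Eq. (1). As a reference point, a minimal software model of that readout (ideal TIA, no sneak paths; the conductance matrix, read voltages and feedback resistor below are illustrative values, not measurements from our setup):

```python
import numpy as np

# Illustrative 4x4 conductance matrix G (siemens): rows = inputs, columns = outputs.
G = np.array([[1e-4, 1e-6, 1e-4, 1e-6],
              [1e-6, 1e-4, 1e-6, 1e-4],
              [1e-4, 1e-4, 1e-6, 1e-6],
              [1e-6, 1e-6, 1e-4, 1e-4]])

V_in = np.array([0.8, 0.0, 0.8, 0.8])  # read voltages applied on the rows (V)
R_f = 1e4                              # TIA feedback resistor (ohm)

# Eq. (1): each column current I_j = sum_i V_in,i * G_i,j, converted
# to a voltage by the TIA: V_out,j = R_f * I_j.
I_col = V_in @ G
V_out = R_f * I_col
```

In an ideal array the operation reduces to a single matrix-vector product; the non-idealities discussed later (sneak paths, programming variability) perturb `G` and the integrated currents.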
A. Training for Binarized-ADALINE
We use a modified version of the training algorithm proposed in [27]. The specific modifications include: (i) Training with only positive-valued inputs. This helps to simplify the input encoding scheme compared to inputs with signed magnitudes, as normally used for training BNNs [20]. (ii) Adding support for training a binarized-ADALINE network by using binary 'tanh' as the hard-limiting quantizer and binary weights. (iii) Support for training non-vision datasets by use of min-max normalization on input features. The training algorithm is summarized in Algorithm 1. Post-training, binary weights (+1, -1) are generated.
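A minimal NumPy sketch of this style of binarized training loop follows. Note the substitutions: plain gradient descent on a squared-hinge-style loss stands in for the ADAM optimizer of Algorithm 1, and the toy data, layer sizes, learning rate and epoch count are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data: 8 positive-valued features, 2 classes.
X = rng.random((64, 8))
Y = np.eye(2)[rng.integers(0, 2, size=64)]   # one-hot targets

# Min-max normalize features to [0, 1] (modification (iii)).
Xn = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)

W_real = rng.normal(0, 0.1, size=(8, 2))     # latent full-precision weights
lr, epochs = 0.05, 50

for _ in range(epochs):
    Wb = np.where(W_real >= 0, 1.0, -1.0)    # binarize weights to {+1, -1}
    s = Xn @ Wb                              # linear combiner output
    T = 2 * Y - 1                            # +/-1 targets (binary 'tanh' coding)
    err = np.maximum(0.0, 1.0 - T * s)       # hinge-style error term
    grad = Xn.T @ (-T * err) / len(Xn)
    W_real -= lr * grad                      # straight-through update of latent weights

W_binary = np.where(W_real >= 0, 1, -1)      # weights deployed on the crossbar
```

The key design choice, shared with [20], [27], is that gradients update a latent full-precision copy of the weights, while both the forward pass and the deployed network use only the binarized values.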
B. Weight Mapping Strategy
The proposed weight mapping strategy for a two-terminal RRAM device crossbar is shown in Fig. 2. In this discussion, we primarily focus on the binarized-ADALINE as an example network, since it represents a basic unit for the realization of a multi-layer BNN.

Fig. 2. Proposed scheme for computation and weight mapping using a two-terminal resistive crossbar. Every logical weight value (i.e. +1 or -1) is mapped on the crossbar using 2 paired devices from consecutive rows (i.e. rows W+ and W−) of the same column. The paired devices are always programmed to complementary states (LRS-HRS or vice-versa). In particular, to realize logical weight +1, the device in the first row is programmed to LRS and the paired device in the consecutive row is programmed to HRS. For realizing logical weight -1, the programming is inverted (i.e. the first-row device is in HRS while the consecutive-row device is in LRS). Eight logical weight values (-1,1,1,-1,1,1,-1,-1) are programmed using 16 (4x4) devices. The first four weights, corresponding to class k=0 (-1,1,1,-1), are partitioned in rows 1 and 2, while the next four weights, corresponding to class k=1 (1,1,-1,-1), are partitioned in rows 3 and 4. Note, input voltages are applied on columns and current integration occurs across rows. This is due to the fact that the DAC units used in our experimental setup generate only positive voltages. Negative voltages are effectively realized by grounding the device top-electrode and applying a +ve DAC signal at the device bottom electrode. Programming and read paths are isolated using CMOS switches for each channel.

Every weight vector is mapped using two separate components, i.e. W+ and W−, in consecutive rows of the crossbar. If the length of the input vector is greater than the number of columns, we partition the weights and allocate them on a separate set of rows. Since two devices are utilized for representing each logical weight (shown in Fig. 2), the utilization of the crossbar is reduced by 50%. The summation performed at each row of the crossbar can be represented by Eq. 2 (where k denotes the output class, i denotes the polarity, i.e. +ve or -ve, and p denotes the partition of the feature vector [0, n]). Post the summation stage (i.e. the TIA output), the final sum voltages from rows W+ and W− are subtracted at the 2nd stage (Eq. 4), leading to a class-wise score V_o,k, which is used for generating a class decision at the final stage. In the case of partitions, all output voltages of a respective polarity (i) for class k (V_o,p^{k,i}) are first stored and summed (Eq. 3) before proceeding to the subtraction and class-decision stages. In the proposed binarized-ADALINE, we use analog inputs of 8-bit precision.
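The mapping and the two-stage readout described above (Eqs. (2)-(4)) can be sketched in software as follows; the LRS/HRS conductance values, feedback resistor and read voltages are illustrative, and sneak-path leakage is ignored:

```python
import numpy as np

G_LRS, G_HRS = 1e-4, 1e-6   # illustrative binary conductance states (S)
R_f = 1e4                   # TIA feedback resistor (ohm)

def map_weights(w):
    """Map a logical weight vector in {+1, -1} to complementary device pairs:
    +1 -> (LRS, HRS) on rows (W+, W-); -1 -> (HRS, LRS)."""
    g_plus = np.where(w == 1, G_LRS, G_HRS)
    g_minus = np.where(w == 1, G_HRS, G_LRS)
    return g_plus, g_minus

def class_score(w, x_volt):
    """Eqs. (2)-(3): per-row summation and TIA conversion for each polarity,
    then Eq. (4): subtraction of the W- sum from the W+ sum."""
    g_plus, g_minus = map_weights(w)
    v_plus = R_f * np.dot(x_volt, g_plus)
    v_minus = R_f * np.dot(x_volt, g_minus)
    return v_plus - v_minus

# Example weight vectors from Fig. 2, for classes k=0 and k=1.
W = {0: np.array([-1, 1, 1, -1]), 1: np.array([1, 1, -1, -1])}
x = np.array([0.8, 0.8, 0.0, 0.8])   # read voltages on the columns (V)

scores = {k: class_score(w, x) for k, w in W.items()}
decision = max(scores, key=scores.get)   # class with the highest score
```

Because each logical weight is stored as a complementary pair, the HRS device of every pair contributes only a small parasitic current, and the differential readout of Eq. (4) cancels most of that common-mode term.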
Algorithm 1: Algorithm for training binarized-ADALINE
Input: Training data X, expected target Y, epochs N, batch size B
Output: Target Y_t
Initialisation:
  X_n = (X - min(X)) / (max(X) - min(X))
  X_i = round(X_n × 255)
  Y_o = One-Hot-Encode(Y)
  Initialise weight vector W
  Binarize W
for i = 0 to N do
  Y_t[B] = W · X_i
  error = Y_t[B] - Y_o[B]
  Calculate δW using squared-hinge loss and update W with the ADAM optimizer
  Binarize W
end for

Fig. 3. Flowchart summarizing the sequence of operations to perform the binarized-ADALINE computation.

The 8-bit inputs are realized by utilizing pulse-width modulation (PWM) based encoding, which requires integration across 255 cycles to generate the final class score. Fig. 3 summarizes the sequence of operations.

V_{o,p}^{k,i} = W_p^{i,k} · X    (2)

V_o^{k,i} = Σ_{p=1}^{n} V_{o,p}^{k,i}    (3)

V_{o,k} = V_{o,k}^{+} − V_{o,k}^{−}    (4)

IV. EXPERIMENTAL RESULTS
A. Testbench & Dataset
Fig. 4 shows the custom testbench built for this application. The main components are:
1) Micro-controller: Drives the overall VMM operation scheduling. It is also used for communication with the host PC, managing control signals to all other ICs, OxRAM crossbar programming/sensing/weight-mapping partitioning, computation of outputs post-TIA stage, final class score computation and the inference decision.
2) Sensing circuitry: Consists of a TIA for converting the current to a voltage signal, and an ADC unit (from the micro-controller).
3) DAC units: For generating the required OxRAM programming signals. We used V_set = 3.3 V, V_reset = -5.5 V and V_read = -0.8 V for programming the OxRAM devices in our crossbar. (High values of V_set, V_reset are used as our prototype devices have a large active area (100 × 100 µm²).) Negative V_reset, V_read were realized by applying +ve voltage signals at the bottom electrode of the OxRAM device and grounding the top electrode, thereby generating an effective -ve V_TB (where V_TB = V_Top - V_Bottom).
4) CMOS switches: For path selection (during SET, RESET and READ operations) and compliance current control.

Fig. 4. Custom testbench of the proposed OxRAM-based VMM. Since our DAC units can only generate +ve voltage signals, a -ve voltage across the RRAM device is realized by applying a +ve voltage to the bottom terminal while grounding the top terminal of the device.

For validation of the proposed methodology, we trained the binarized-ADALINE network using a binary classification dataset (Breast Cancer dataset) from the UCI machine learning repository [23]. The dataset consists of 357 benign and 212 malignant cells.
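For reference, the dataset preparation used here can be reproduced in software; the sketch below uses scikit-learn's copy of the UCI Breast Cancer (Diagnostic) dataset, with the 80/20 split and 8-bit input quantization of Algorithm 1 (the random seed is illustrative):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# UCI Breast Cancer Wisconsin (Diagnostic): 569 samples, 30 features,
# 357 benign and 212 malignant.
data = load_breast_cancer()
X, y = data.data, data.target

# Min-max normalize features to [0, 1], then quantize to 8-bit values
# (0-255) for PWM encoding, as in Algorithm 1.
Xn = (X - X.min(0)) / (X.max(0) - X.min(0))
Xq = np.round(Xn * 255).astype(np.uint8)

# 80/20 train/test split with random shuffling.
X_tr, X_te, y_tr, y_te = train_test_split(Xq, y, test_size=0.2, random_state=0)
```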
B. Fabricated OxRAM Crossbar Chip

An 8 × 8 crossbar with a Ni/3 nm HfO2/7 nm Al-doped-TiO2 (ATO)/TiN stack (shown in Fig. 5(a)) was fabricated for this study. First, a 500 nm thick TiN film was deposited on a thermal-SiO2 (500 nm)/Si wafer by reactive DC sputtering. The wordlines were then patterned by optical photolithography (first mask) and dry etching using inductively-coupled plasma (ICP). The bottom, 7 nm thick ATO dielectric was then deposited by interchanging varying amounts of TiO2 and Al2O3 PE-ALD cycles. The upper, 3 nm thick HfO2 dielectric film was deposited using TDMAHf (Tetrakis(dimethylamido)hafnium) and O2 plasma. All depositions were carried out at 250 °C using a remote-plasma hot-wall reactor PE-ALD system. The TE pattern (similar to the BE pattern but rotated 90°) was defined using the second mask, and a 100 nm thick Ni top electrode film was deposited by DC sputtering and patterned using a lift-off technique. This way, an 8 × 8 crossbar was formed, with 100 µm wide perpendicular TiN and Ni wordlines and bitlines sandwiching the dielectric bilayer, forming 64 OxRAM devices with a 100 × 100 µm² active area at each crosspoint. A third mask and ICP dry-etching step was performed to open the contact windows (etch the dielectrics) to the wordline contact pads. Wire bonding and packaging were the final steps of the OxRAM crossbar encapsulation. DC IV characteristics of the 64 OxRAM devices in the 8 × 8 crossbar are shown in Fig. 5(b).

C. BNN Results on OxRAM Crossbar
To perform BNN computation with the OxRAM crossbar, row-level READ operations were used. Since we used V_read = -0.8 V, due to the negative read voltage, current integration happened over the row elements. For the binarized-ADALINE network (see Fig. 6(a)), we partitioned and mapped the weight vector over multiple rows (shown in Fig. 6(c)) and performed VMM operations. The output class was decided by Eq. 4, as described in Section III-B. The 8-bit wide positive input values were encoded as PWM duty cycles applied to the crossbar columns (Fig. 6(b)), leading to current integration over time. In PWM encoding, the input value (ranging from 0 to 255) was translated to pulses with a fixed voltage (0.8 V). Specific pulse-widths were n × 17 ms (where n = number of clock cycles [0-255]). Prior to executing VMM operations, the entire crossbar was initialized to HRS (Fig. 6(d,e)) for reliable weight programming. The crossbar resistance distribution post-mapping of the trained weights is shown in Fig. 6(f,g). Training was performed on 80% of the UCI cancer dataset with random shuffling. Classification accuracy results are shown in Table I.

Fig. 5. (a) Packaged 8 × 8 crossbar chip with the Ni/HfO2/ATO/TiN OxRAM devices fabricated for this study. (b) Overlaid DC IV curves of the 64 OxRAM devices in the 8 × 8 crossbar (V_set = 3.3 V, V_reset = -5.5 V). (c) Resistance distribution of the LRS and HRS states used for the study (V_read = -0.8 V).

Fig. 6. (a) Implemented 30 × 2 binarized-ADALINE network. (b) PWM-encoded inputs applied to the crossbar columns. (c) Partitioning and mapping of the weight vector over multiple crossbar rows. (d,e) Crossbar initialization to HRS. (f,g) Crossbar resistance distribution post-mapping of the trained weights.

We can observe a reduction in classification accuracy between the simulated and experimental network. The drop in accuracy can be attributed to the sneak-path issue inherent to crossbars. Due to sneak-path leakage, the effective integrated current value across a specific row is diminished. Another factor contributing to the accuracy loss is the variability in OxRAM device programming (i.e. LRS and HRS distributions), as evident from Fig. 5(c), 6(g). An effective strategy to mitigate these effects would be to implement two separate ADALINE networks in place of a single neuron. Deep binarized multi-layer networks can also be explored, using larger RRAM arrays, to further improve the learning performance, as shown in simulation studies [28]–[30].

TABLE I
CLASSIFICATION ACCURACY FOR VMM-BASED BINARIZED-ADALINE

Sr. No. | Platform | Training Accuracy % | Testing Accuracy %
1 | Simulation (Software) | 74.06 | 78.07
2 | Experimental (8 × 8 OxRAM crossbar) | — | 67

It should be noted that by using binary OxRAM states (HRS/LRS), we have limited the effect of variability that would otherwise reflect in an analog-resistance VMM implementation. Furthermore, using the OxRAM crossbar solely for inference relaxes the endurance requirements for the devices.

V. CONCLUSION
In this paper, we presented a novel methodology for mapping BNN operations on two-terminal binary NVM device based crossbars of arbitrary sizes. We experimentally demonstrated the realization of a binarized-ADALINE classifier, for the UCI Cancer dataset, on a fabricated 8 × 8 OxRAM crossbar.

ACKNOWLEDGEMENT
This work was supported in part by SERB-CRG/2018/001901 and CYRAN AI Solutions.
REFERENCES
[1] A. Shafiee, A. Nag, N. Muralimanohar, R. Balasubramonian, J. P. Strachan, M. Hu, R. S. Williams, and V. Srikumar, "ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars," in Proc. ISCA. IEEE, Jun. 2016. [Online]. Available: https://doi.org/10.1109/isca.2016.12
[2] M. Hu, R. S. Williams, J. P. Strachan, Z. Li, E. M. Grafals, N. Davila, C. Graves, S. Lam, N. Ge, and J. J. Yang, "Dot-product engine for neuromorphic computing," in Proceedings of the 53rd Annual Design Automation Conference (DAC '16). ACM Press, 2016. [Online]. Available: https://doi.org/10.1145/2897937.2898010
[3] S. N. Truong, K. V. Pham, W. Yang, and K.-S. Min, "Sequential memristor crossbar for neuromorphic pattern recognition," IEEE Transactions on Nanotechnology, vol. 15, no. 6, pp. 922–930, Nov. 2016. [Online]. Available: https://doi.org/10.1109/tnano.2016.2611008
[4] M. A. Lastras-Montano, B. Chakrabarti, D. B. Strukov, and K.-T. Cheng, "3d-DPE: A 3d high-bandwidth dot-product engine for high-performance neuromorphic computing," in Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, Mar. 2017. [Online]. Available: https://doi.org/10.23919/date.2017.7927183
[5] Y. Jeong and W. Lu, "Neuromorphic computing using memristor crossbar networks: A focus on bio-inspired approaches," IEEE Nanotechnology Magazine, vol. 12, no. 3, pp. 6–18, Sep. 2018. [Online]. Available: https://doi.org/10.1109/mnano.2018.2844901
[6] S.-Y. Sun, H. Xu, J. Li, Q. Li, and H. Liu, "Cascaded architecture for memristor crossbar array based larger-scale neuromorphic computing," IEEE Access, vol. 7, pp. 61679–61688, 2019. [Online]. Available: https://doi.org/10.1109/access.2019.2915787
[7] X. Hu, S. Duan, L. Wang, and X. Liao, "Memristive crossbar array with applications in image processing," Science China Information Sciences, vol. 55, no. 2, pp. 461–472, Dec. 2011. [Online]. Available: https://doi.org/10.1007/s11432-011-4410-9
[8] Y. Wang, X. Li, H. Yu, L. Ni, W. Yang, C. Weng, and J. Zhao, "Optimizing boolean embedding matrix for compressive sensing in RRAM crossbar," in Proc. ISLPED. IEEE, Jul. 2015. [Online]. Available: https://doi.org/10.1109/islped.2015.7273483
[9] S. Choi, J. H. Shin, J. Lee, P. Sheridan, and W. D. Lu, "Experimental demonstration of feature extraction and dimensionality reduction using memristor networks," Nano Letters, vol. 17, no. 5, pp. 3113–3118, May 2017. [Online]. Available: https://doi.org/10.1021/acs.nanolett.7b00552
[10] F. Alibart, E. Zamanidoost, and D. B. Strukov, "Pattern classification by memristive crossbar circuits using ex situ and in situ training," Nature Communications, vol. 4, no. 1, Jun. 2013. [Online]. Available: https://doi.org/10.1038/ncomms3072
[11] A. Mondal and A. Srivastava, "In-situ stochastic training of MTJ crossbar based neural networks," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '18). ACM Press, 2018. [Online]. Available: https://doi.org/10.1145/3218603.3218616
[12] L. Wang, W. Gao, L. Yu, J.-Z. Wu, and B.-S. Xiong, "Multiple-matrix vector multiplication with crossbar phase-change memory," Applied Physics Express, vol. 12, no. 10, p. 105002, Sep. 2019. [Online]. Available: https://doi.org/10.7567/1882-0786/ab4002
[13] M. Jerry, P.-Y. Chen, J. Zhang, P. Sharma, K. Ni, S. Yu, and S. Datta, "Ferroelectric FET analog synapse for acceleration of deep neural network training," in Proc. IEDM. IEEE, Dec. 2017. [Online]. Available: https://doi.org/10.1109/iedm.2017.8268338
[14] J. Tang, D. Bishop, S. Kim, M. Copel, T. Gokmen, T. Todorov, S. Shin, K.-T. Lee, P. Solomon, K. Chan, W. Haensch, and J. Rozen, "ECRAM as scalable synaptic cell for high-speed, low-power neuromorphic computing," in Proc. IEDM. IEEE, Dec. 2018. [Online]. Available: https://doi.org/10.1109/iedm.2018.8614551
[15] M. R. Mahmoodi and D. Strukov, "An ultra-low energy internally analog, externally digital vector-matrix multiplier based on NOR flash memory technology," in Proc. DAC. IEEE, Jun. 2018. [Online]. Available: https://doi.org/10.1109/dac.2018.8465804
[16] S. H. Tan, P. Lin, H. Yeon, S. Choi, Y. Park, and J. Kim, "Perspective: Uniform switching of artificial synapses for large-scale neuromorphic arrays," APL Materials, vol. 6, no. 12, p. 120901, Dec. 2018. [Online]. Available: https://doi.org/10.1063/1.5049137
[17] H. Tsai, S. Ambrogio, P. Narayanan, R. M. Shelby, and G. W. Burr, "Recent progress in analog memory-based accelerators for deep learning," Journal of Physics D: Applied Physics, vol. 51, no. 28, p. 283001, Jun. 2018. [Online]. Available: https://doi.org/10.1088/1361-6463/aac8a5
[18] K. Moon, S. Lim, J. Park, C. Sung, S. Oh, J. Woo, J. Lee, and H. Hwang, "RRAM-based synapse devices for neuromorphic systems," Faraday Discussions, vol. 213, pp. 421–451, 2019. [Online]. Available: https://doi.org/10.1039/c8fd00127h
[19] Y. Cassuto, S. Kvatinsky, and E. Yaakobi, "Information-theoretic sneak-path mitigation in memristor crossbar arrays," IEEE Transactions on Information Theory, vol. 62, no. 9, pp. 4801–4813, Sep. 2016. [Online]. Available: https://doi.org/10.1109/tit.2016.2594798
[20] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio, "Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1," arXiv preprint arXiv:1602.02830, 2016.
[21] S. Liang, S. Yin, L. Liu, W. Luk, and S. Wei, "FP-BNN: Binarized neural network on FPGA," Neurocomputing, vol. 275, pp. 1072–1086, Jan. 2018. [Online]. Available: https://doi.org/10.1016/j.neucom.2017.09.046
[22] P. Huang, Z. Zhou, Y. Zhang, Y. Xiang, R. Han, L. Liu, X. Liu, and J. Kang, "Hardware implementation of RRAM based binarized neural networks," APL Materials, vol. 7, no. 8, p. 081105, Aug. 2019. [Online]. Available: https://doi.org/10.1063/1.5116863
[23] D. Dua and C. Graff, "UCI machine learning repository," 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[24] B. Widrow and M. Lehr, "30 years of adaptive neural networks: perceptron, madaline, and backpropagation," Proceedings of the IEEE, vol. 78, no. 9, pp. 1415–1442, 1990.
[25] T. Abdel-Galil, E. El-Saadany, and M. Salama, "Power quality event detection using adaline," Electric Power Systems Research, vol. 64, no. 2, pp. 137–144, 2003.
[26] G. Pajares and J. M. de la Cruz, "Local stereovision matching through the ADALINE neural network," Pattern Recognition Letters, vol. 22, no. 14, pp. 1457–1473, Dec. 2001. [Online]. Available: https://doi.org/10.1016/s0167-8655(01)00097-6
[27] B. Moons, K. Goetschalckx, N. V. Berckelaer, and M. Verhelst, "Minimum energy quantized neural networks," in Proc. Asilomar Conference on Signals, Systems, and Computers. IEEE, Oct. 2017. [Online]. Available: https://doi.org/10.1109/acssc.2017.8335699
[28] T. Hirtzlin, B. Penkovsky, J.-O. Klein, N. Locatelli, A. F. Vincent, M. Bocquet, J.-M. Portal, and D. Querlioz, "Implementing binarized neural networks with magnetoresistive RAM without error correction," arXiv preprint arXiv:1908.04085, 2019.
[29] M. E. Fouda, S. Lee, J. Lee, A. Eltawil, and F. Kurdahi, "Mask technique for fast and efficient training of binary resistive crossbar arrays," IEEE Transactions on Nanotechnology, vol. 18, pp. 704–716, 2019. [Online]. Available: https://doi.org/10.1109/tnano.2019.2927493
[30] Y. Kim, W. H. Jeong, S. B. Tran, H. C. Woo, J. Kim, C. S. Hwang, K.-S. Min, and B. J. Choi, "Memristor crossbar array for binarized neural networks,"