SOT-MRAM based Sigmoidal Neuron for Neuromorphic Architectures
Brendan Reidy and Ramtin Zand
Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208. ([email protected])
Abstract—In this paper, the intrinsic physical characteristics of spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) devices are leveraged to realize sigmoidal neurons in neuromorphic architectures. Performance comparisons with previous power- and area-efficient sigmoidal neuron circuits exhibit × and × reductions in power-area-product values for the proposed SOT-MRAM based neuron. To verify the functionality of the proposed neuron within larger-scale designs, we have implemented a circuit realization of a × × SOT-MRAM based multilayer perceptron (MLP) for the MNIST pattern recognition application using the SPICE circuit simulation tool. The results obtained exhibit that the proposed SOT-MRAM based MLP can achieve accuracies comparable to an ideal binarized MLP architecture implemented on a GPU, while realizing orders of magnitude increase in processing speed.
I. INTRODUCTION
Neuromorphic computing is the concept of embodying the physical processes that underlie the computations of biological neural networks (NNs) within the physics of very large-scale integration (VLSI) circuits, as opposed to the conventional power-hungry approaches that emulate the mathematical behavior of NNs on conventional computing systems such as GPUs. Recently, various beyond-CMOS technologies have been investigated for use within neuromorphic circuits and architectures, among which memristive devices are one of the most promising solutions [1]. Memristors have been widely used within both synapse and neuron circuits, and provide significant advantages such as small on-chip area, non-volatility, and low power dissipation [1]. However, they suffer from severe reliability issues such as high device-to-device (D2D) and cycle-to-cycle (C2C) variations [2] and low endurance [3]. On the other hand, spintronic devices have shown some reliability advantages over other memristive devices. For instance, spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) [4] has exhibited infinite write endurance, which is a desirable feature for in-circuit training that can be used to alleviate variation challenges [2] in neuromemristive architectures [1]. While SOT-MRAM devices have been previously used within in-memory computing platforms as hardware accelerators for artificial neural networks [5], herein we go beyond the previous work and utilize the intrinsic characteristics of SOT-MRAM cells within the neuromorphic architecture as a natural building block for both synapses and neurons.
Fig. 1. (a) SOT-MRAM cell. A positive current along +x induces a spin injection current in the +z direction. The injected spin current produces the spin torque required to align the magnetization direction of the free layer in the +y direction, and vice versa. (b) Top view of the SOT-MRAM.

II. SOT-MRAM BASED NEURONS AND SYNAPSES
Fig. 1 shows a simplified structure of a SOT-MRAM cell, which includes a magnetic tunnel junction (MTJ) with two ferromagnetic (FM) layers separated by a thin oxide layer. The MTJ has two different resistance levels, which are determined according to the angle (θ) between the magnetization orientations of the FM layers. The resistance of the MTJ in parallel (P) and antiparallel (AP) magnetization configurations can be obtained using the following equations [6]:

R(\theta) = \frac{2 R_{MTJ} (1 + TMR)}{2 + TMR (1 + \cos\theta)} = \begin{cases} R_P = R_{MTJ}, & \theta = 0 \\ R_{AP} = R_{MTJ}(1 + TMR), & \theta = \pi \end{cases} \quad (1)

TMR(T, V_b) = \frac{TMR_0}{1 + (V_b / V_h)^2} \quad (2)

where R_MTJ = RA/Area, in which the resistance-area product (RA) value of the MTJ depends on the material composition of its layers. TMR is the tunneling magnetoresistance, which depends on temperature (T) and bias voltage (V_b). V_h is a fitting parameter, and TMR_0 is a material-dependent constant.

In the MTJ structure, the magnetization direction of electrons in one of the FM layers is fixed (pinned layer), while the electrons' directions in the other FM layer (free layer) can be switched. In [4], Liu et al. have shown that passing a charge current (I_c) through a heavy metal (HM) generates a spin-polarized current (I_s) via the spin Hall effect (SHE), which can switch the magnetization direction of the free layer, as shown in Fig. 1. The ratio of the generated spin current to the applied charge current is normally greater than one, leading to an energy-efficient switching operation [7]. Herein, we use SOT-MRAM devices as a building block for both synapse and neuron circuits.
TABLE I
PARAMETERS OF THE SHE-MRAM DEVICE [6].

Parameter   Description                    Value
MTJ Area    l_MTJ × w_MTJ × π              nm × nm × π
HM Volume   l_HM × w_HM × t_HM             nm × nm × nm
RA          resistance-area product        10 Ω·µm²
V_h         fitting parameter              0.65
TMR_0       tunneling magnetoresistance    100%

Fig. 2. (a) The SOT-MRAM based neuron, (b) the VTC curves showing various operating regions of the PMOS (MP) and NMOS (MN) transistors.

A. SOT-MRAM Based Neuron
Fig. 2(a) shows the structure of the proposed neuron, which includes two SOT-MRAM devices and a CMOS-based inverter (2T-2R). The magnetization configuration of SOT-MRAM1 is required to be in the P state, while SOT-MRAM2 is in the AP state. The SOT-MRAMs in the neuron's circuit operate as a voltage divider, which reduces the slope of the linear operating region in the CMOS inverter's voltage transfer characteristic (VTC) curve. The reduction in the slope of the linear region creates a smooth high-to-low output voltage transition, which enables the realization of the activation function behavior desirable for sigmoidal neurons.

In order to verify the functionality of the proposed neuron, we first created a Verilog-A model of the SOT-MRAM device using equations (1) and (2) and the parameters listed in Table I [6]. Next, we utilized the developed model along with the 14nm HP-FinFET PTM library to realize the circuit implementation of the neuron. Fig. 2(b) shows the SPICE circuit simulation results of the proposed SOT-MRAM based neuron using VDD = 0. V and VSS = 0 V, which validates the desired sigmoidal behavior for neurons.
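At the network level, the VTC of this circuit can be approximated by a scaled sigmoid. The sketch below is a minimal behavioral model under that assumption; the supply voltage v_dd, gain k, and midpoint v_mid are assumed fitting parameters extracted from a simulated VTC, not values reported in the paper.

```python
import numpy as np

def neuron_vtc(v_in, v_dd=0.8, k=20.0, v_mid=0.4):
    """Behavioral model of the inverter-based neuron VTC.

    The smooth high-to-low transition of the voltage-divider-loaded
    inverter is approximated as a falling sigmoid. v_dd, k, and v_mid
    are assumed fitting parameters, not extracted circuit values.
    """
    return v_dd / (1.0 + np.exp(k * (v_in - v_mid)))  # sigmoid(-x) shape

v_in = np.linspace(0.0, 0.8, 9)
print(neuron_vtc(v_in))  # smooth transition from ~v_dd down to ~0
```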
SOT-MRAM cells are capable of realizing two resistive levels, i.e., R_P and R_AP. The combination of two SOT-MRAM cells and a differential amplifier can produce the positive and negative weights required for the implementation of a binary synapse. Fig. 3 shows a neuron with Y_i = X_i × W_i as its input, where X_i is the input signal and W_i is a binarized weight. The corresponding circuit implementation is also shown in the figure, which includes two SOT-MRAM cells and a differential amplifier as the synapse. The output of the differential amplifier (Y_i) is proportional to (I+ − I−), where I+ = X_i G+_i and I− = X_i G−_i. Thus, Y_i ∝ X_i (G+_i − G−_i), in which G+_i and G−_i are the conductances of SOT-MRAM1 and SOT-MRAM2, respectively, which can be tuned as shown in Fig. 3 to realize negative and positive weights in a binary synapse. For instance, for W_i = −1, SOT-MRAM1 and SOT-MRAM2 should be in the AP and P states, respectively. According to Eq. (1), R_AP > R_P, which means G_AP < G_P since G = 1/R; therefore G+_i < G−_i and Y_i < 0. A short numeric check of this behavior is given below.

Fig. 3. The SOT-MRAM based binary synapse.
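The following sketch evaluates Y_i ∝ X_i (G+_i − G−_i) for both weight polarities. The conductances follow the R_P and R_AP expressions above; the absolute resistance and the unit amplifier gain are assumptions for illustration only.

```python
# Minimal sketch of the binary synapse: Y_i ~ X_i * (G_plus - G_minus).
# R_P and R_AP follow Eq. (1); the absolute resistance is an assumed
# placeholder, and the amplifier gain of 1 is arbitrary.
R_MTJ = 7.0e3                 # assumed R_P in ohms
TMR = 1.0                     # TMR of 100% (Table I)
R_P, R_AP = R_MTJ, R_MTJ * (1 + TMR)

def synapse_output(x_i, w_i, gain=1.0):
    """Return Y_i for input x_i and binary weight w_i in {-1, +1}."""
    if w_i == +1:             # SOT-MRAM1 in P, SOT-MRAM2 in AP
        g_plus, g_minus = 1 / R_P, 1 / R_AP
    else:                     # SOT-MRAM1 in AP, SOT-MRAM2 in P
        g_plus, g_minus = 1 / R_AP, 1 / R_P
    return gain * x_i * (g_plus - g_minus)

print(synapse_output(0.5, +1) > 0)   # True: positive weight
print(synapse_output(0.5, -1) < 0)   # True: negative weight
```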
III. PROPOSED SOT-MRAM BASED MLP ARCHITECTURE
Figures 4 and 5 exhibit the training and inference paths of a × SOT-MRAM based single-layer perceptron proposed here, which are shown separately for simplicity. The synaptic connections are designed in the form of a crossbar architecture, in which the numbers of columns and rows are defined by the number of nodes in the input and output layers, respectively. During the training phase, the resistances of the SOT-MRAM based synapses are tuned using the bit-line (BL) and source-line (SL) interconnections, which are shared between different rows, as shown in Fig. 4. The write word line (WWL) control signals activate only one row in each clock cycle; thus the entire array can be updated in j clock cycles, where j is the number of neurons in the output layer. Moreover, to tune the states of the SOT-MRAMs in the neurons according to the requirements mentioned in Section II.A, the BL and SL control signals for the neuron are set to VDD and VSS, respectively, as shown in Fig. 4.

In the inference phase, the BL and SL control signals are in the high-impedance (Hi-Z) state, and the read word line (RWL) and WWL control signals are connected to VDD and GND, respectively. This stops the write operation in the synapses and generates the I+ and I− currents shown in Fig. 5, the amplitudes of which depend on the input (IN) signals and the resistances of the SOT-MRAM synapses. Each row includes a shared differential amplifier, which generates an output voltage proportional to Σ_i (I+_{i,n} − I−_{i,n}) for the n-th row, where i runs over the nodes of the input layer. Finally, the outputs of the differential amplifiers are connected to the SOT-MRAM based sigmoidal neurons. The entire inference operation occurs in parallel and in a single clock cycle. The required signaling to control the training and inference operations is listed in Table II, and a small numerical sketch of the inference computation follows the table.

Fig. 4. The training path for a × SOT-MRAM based perceptron.

Fig. 5. The inference path for the × SOT-MRAM based perceptron.
TABLE II
THE REQUIRED SIGNALING TO CONTROL THE PROPOSED SOT-MRAM BASED PERCEPTRON ARRAY.

Operation               WWL    RWL    BL     SL     IN
Training (W_i = +1)     VDD    GND    VDD    GND    Hi-Z
Training (W_i = −1)     VDD    GND    GND    VDD    Hi-Z
Inference               GND    VDD    Hi-Z   Hi-Z   VIN
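As referenced above, the sketch below shows the math the crossbar evaluates in a single cycle: each row accumulates Σ_i (I+_{i,n} − I−_{i,n}) and drives the sigmoidal neuron. The conductance values and amplifier gain are illustrative assumptions, not extracted circuit parameters.

```python
import numpy as np

# Assumed illustrative values: P/AP conductances (siemens) and the
# differential amplifier's transimpedance gain (ohms).
G_P, G_AP = 1 / 7.0e3, 1 / 14.0e3
GAIN = 1.0e3

def infer(x, w):
    """Crossbar inference: x is the input voltage vector, w is a
    {-1, +1} weight matrix of shape (outputs, inputs). In hardware all
    rows evaluate in parallel; here we compute the equivalent math."""
    g_plus = np.where(w == +1, G_P, G_AP)    # SOT-MRAM1 conductances
    g_minus = np.where(w == +1, G_AP, G_P)   # SOT-MRAM2 conductances
    i_diff = (g_plus - g_minus) @ x          # per-row sum of (I+ - I-)
    # Neuron applies sigmoid(-y) with y = GAIN * i_diff (Table III).
    return 1 / (1 + np.exp(GAIN * i_diff))

x = np.array([0.8, 0.0, 0.8])                # example input voltages
w = np.array([[+1, -1, +1], [-1, -1, +1]])   # 2x3 binary weight matrix
print(infer(x, w))
```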
One of the main advantages of the proposed architecture is that it can be readily concatenated to form a multilayer perceptron (MLP) or a deep neural network (DNN), which can still operate in a single clock cycle, as shown in the Simulation Results section.

IV. HARDWARE-AWARE LEARNING MECHANISM FOR THE PROPOSED SOT-MRAM BASED MLP ARCHITECTURE
To train the proposed SOT-MRAM based neuromorphic MLP architecture, a hardware-aware learning mechanism should be developed that captures the characteristics and limitations of our SOT-MRAM based neurons and synapses. Herein, we utilize a two-stage teacher-student approach, in which the teacher and student networks have identical topologies. Table III provides the notations and descriptions for both networks, in which x is the input and o_i is the output of the i-th neuron.

TABLE III
THE NOTATIONS AND DESCRIPTIONS OF THE PROPOSED LEARNING MECHANISM FOR THE SOT-MRAM BASED MLP.

                      Teacher Network         Student Network
Weights               W_i ∈ R                 W_i ∈ {−1, +1}
Biases                B_i ∈ R                 B_i ∈ {−1, +1}
Transfer Function     y_i = w_i x + b_i       y_i = w_i x + b_i
Activation Function   o_i = sigmoid(−y_i)     o_i = sigmoid(−y_i)

To incorporate the features of the SOT-MRAM based synapses and neurons within our training mechanism, we have made two modifications to the approaches previously used for training binarized neural networks [8], [9]. First, we use binarized biases in the student network instead of real-valued biases. Second, since our SOT-MRAM neuron realizes a real-valued sigmoidal activation function (sigmoid(−x)) without any computation overhead, we avoid binarizing the activation functions and reduce the possible information loss in the teacher and student networks [8]. Herein, after each weight update in the teacher network we clip the real-valued weights to the [−1, 1] interval, and then use the following deterministic binarization approach, sketched in code at the end of this section, to binarize the weights:

W_{ij} = \begin{cases} +1, & \bar{w}_{ij} \ge \Delta_B \\ -1, & \bar{w}_{ij} < \Delta_B \end{cases} \quad (3)

where Δ_B = 0 is the threshold parameter for the binarized weights. Finally, once all the binarized weights are trained, we use a mapping mechanism to convert them to resistive states in the SOT-MRAM based synapses, as explained in Section II.B. A stochastic binarization scheme [9] can also be used to quantize the weights and biases. However, the stochastic approach exhibits its advantages in very large-scale convolutional neural networks (CNNs), which are not the focus of this paper. In fact, we initially leveraged stochastic mechanisms in our simulations; while the training times were approximately 10-fold longer, the obtained accuracy values were comparable to those realized by deterministic approaches.
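The sketch below implements the clip-and-binarize step of Eq. (3) applied after each teacher-network weight update; the surrounding training loop is omitted, and the helper name is ours rather than the paper's.

```python
import numpy as np

DELTA_B = 0.0  # binarization threshold from Eq. (3)

def binarize_weights(w_real):
    """Clip real-valued teacher weights to [-1, 1], then binarize per
    Eq. (3): +1 where the clipped weight >= DELTA_B, else -1."""
    w_clipped = np.clip(w_real, -1.0, 1.0)
    return np.where(w_clipped >= DELTA_B, 1.0, -1.0)

w_teacher = np.array([[0.37, -0.02], [-0.80, 0.55]])
print(binarize_weights(w_teacher))  # [[ 1. -1.] [-1.  1.]]
```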
V. SIMULATION RESULTS

To evaluate the performance of our proposed SOT-MRAM based neuromorphic MLP architecture, we have utilized a hierarchical simulation approach including circuit-level and application-level simulations, as described in the following.
A. Circuit-Level Simulation of SOT-MRAM based Neuron
Herein, we have used a SPICE circuit simulator with the 14nm HP-FinFET PTM transistor library, the Verilog-A model of the SOT-MRAM, and VDD = 0. V as the nominal voltage to obtain the power consumption of our proposed SOT-MRAM based sigmoid neuron. The obtained simulation results show an average power consumption of µW for the proposed sigmoid neuron. Moreover, the area of our neuron, obtained from the layout design, is approximately equal to λ × λ, in which λ is a technology-dependent parameter. Herein, we have used the 14nm FinFET technology, which leads to an approximate area consumption of . µm² per neuron. SOT-MRAM devices can be fabricated on top of the transistors, thus incurring no area overhead.
TABLE IV
PERFORMANCE COMPARISON FOR VARIOUS NEURON IMPLEMENTATIONS.

                      [11]    [12]    Proposed Herein
Power Consumption     7.4×    ×       ×
Area Consumption      10×     ×       ×
Power-Area Product    74×     ×       ×

Table IV provides a comparison between our SOT-MRAM based sigmoidal neuron and some of the most power- and area-efficient mixed-signal sigmoid neuron designs. To provide a fair comparison in terms of area and power dissipation, we have utilized the general scaling method [10] to normalize the power dissipation and area of the designs listed in Table IV: voltage and feature size scale at rates of U and S, respectively, so power dissipation scales as 1/U² and area per device scales as 1/S² [10]. The results obtained exhibit that the proposed SOT-MRAM based neuron achieves a significant area reduction while realizing power consumption comparable to the existing power- and area-efficient neuron implementations. This results in × and × reductions in power-area product compared to the designs introduced in [11] and [12], respectively.
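For reference, a minimal sketch of this general-scaling normalization is given below; the function name and the example voltage and feature-size ratios are our assumptions for illustration, not values taken from [10] or from the compared designs.

```python
def scale_power_area(power, area, u, s):
    """General scaling per [10]: with voltage scaled by 1/U and
    dimensions by 1/S, power scales as 1/U**2 and area as 1/S**2."""
    return power / u**2, area / s**2

# Example: normalizing an older design to 14nm (assumed ratios:
# u = 1.8 V / 0.8 V, s = 180 nm / 14 nm; both are illustrative).
p, a = scale_power_area(power=100e-6, area=1000e-12, u=1.8/0.8, s=180/14)
print(f"{p*1e6:.1f} uW, {a*1e12:.1f} um^2")
```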
B. Application-Level Simulation

To verify the functionality of our SOT-MRAM based neuron and synapse for larger-scale applications, we have developed a Python-based simulation framework based on [13]. The developed simulator realizes the SPICE circuit implementation of our SOT-MRAM based MLP and measures its corresponding accuracy and power consumption for a specific pattern recognition application. Fig. 6 depicts the accuracy of a × × SOT-MRAM based neuromorphic MLP simulated in SPICE, compared to floating-point and binarized MLP architectures implemented on a GPU for the MNIST handwritten digit recognition application. The results obtained show that within 10 training epochs, comparable test accuracies of 86.54% and 85.56% can be achieved for the binarized MLP and SOT-MRAM based MLP architectures, respectively. However, the SOT-MRAM based MLP completes the recognition task in a single clock cycle, while a highly parallel implementation of the binarized MLP on a GPU requires ∼ clock cycles at a similar frequency to complete the same task.
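A minimal software model of the network the hardware implements, a binarized-weight, binarized-bias MLP with sigmoid(−y) activations per Table III, might look like the sketch below; the layer sizes are placeholders, since the exact topology digits are not legible in this copy, and the random binary parameters stand in for trained ones.

```python
import numpy as np

def sigmoid_neg(y):
    """The neuron's activation, sigmoid(-y), per Table III."""
    return 1.0 / (1.0 + np.exp(y))

def mlp_forward(x, weights, biases):
    """Forward pass of the binarized MLP: every W and b is in {-1, +1}."""
    for w, b in zip(weights, biases):
        x = sigmoid_neg(w @ x + b)
    return x

rng = np.random.default_rng(0)
sizes = [784, 16, 10]  # placeholder topology; actual sizes are elided
# Random {-1, +1} parameters as stand-ins for trained binary values
# (the small offset keeps np.sign away from zero).
weights = [np.sign(rng.standard_normal((m, n)) + 1e-9)
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.sign(rng.standard_normal(m) + 1e-9) for m in sizes[1:]]
x = rng.random(784)    # stand-in for a flattened MNIST image
print(mlp_forward(x, weights, biases))
```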
VI. CONCLUSION

Herein, we proposed a power- and area-efficient SOT-MRAM based sigmoidal neuron, which has been leveraged along with SOT-MRAM based synapses to construct a neuromorphic MLP architecture. The developed neuron played an enabling role in the single-cycle operation of the SOT-MRAM based MLP. We implemented the SPICE circuit realization of a × × SOT-MRAM based MLP and compared its performance with a binarized MLP implemented on a GPU for the MNIST pattern recognition application. The results obtained exhibited approximately five orders of magnitude increase in the processing speed of our SOT-MRAM based MLP, while realizing accuracy comparable to that of the GPU-implemented binarized MLP. Herein, we have used a small network as a proof of concept; the achieved improvements are expected to be even more significant for larger-scale circuits, which will be studied in our future work.

Fig. 6. Accuracy for the MNIST application using a × × MLP.
REFERENCES

[1] O. Krestinskaya, A. P. James, and L. O. Chua, "Neuromemristive circuits for edge computing: A review," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 1, pp. 4-23, 2020.
[2] M. Prezioso, F. Merrikh-Bayat, B. Hoskins, G. C. Adam, K. K. Likharev, and D. B. Strukov, "Training and operation of an integrated neuromorphic network based on metal-oxide memristors," Nature, vol. 521, no. 7550, p. 61, 2015.
[3] D. B. Strukov, "Endurance-write-speed tradeoffs in nonvolatile memories," Applied Physics A, vol. 122, no. 4, p. 302, Mar. 2016.
[4] L. Liu, C. Pai, Y. Li, H. W. Tseng, D. C. Ralph, and R. A. Buhrman, "Spin-torque switching with the giant spin Hall effect of tantalum," Science, vol. 336, no. 6081, pp. 555-558, 2012.
[5] S. Angizi, Z. He, and D. Fan, "ParaPIM: A parallel processing-in-memory accelerator for binary-weight deep neural networks," in Proceedings of the 24th Asia and South Pacific Design Automation Conference, ser. ASPDAC '19, 2019, pp. 127-132.
[6] Y. Zhang, W. Zhao, Y. Lakys, J. O. Klein, J. V. Kim, D. Ravelosona, and C. Chappert, "Compact modeling of perpendicular-anisotropy CoFeB/MgO magnetic tunnel junctions," IEEE Transactions on Electron Devices, vol. 59, no. 3, pp. 819-826, Mar. 2012.
[7] R. Zand, A. Roohi, and R. F. DeMara, "Energy-efficient and process-variation-resilient write circuit schemes for spin Hall effect MRAM device," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 9, pp. 2394-2401, Sep. 2017.
[8] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, "XNOR-Net: ImageNet classification using binary convolutional neural networks," in Computer Vision - ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds., 2016, pp. 525-542.
[9] M. Courbariaux, Y. Bengio, and J.-P. David, "BinaryConnect: Training deep neural networks with binary weights during propagations," in Advances in Neural Information Processing Systems 28, 2015, pp. 3123-3131.
[10] A. Stillmaker and B. Baas, "Scaling equations for the accurate prediction of CMOS device performance from 180nm to 7nm," Integration, vol. 58, pp. 74-81, Jun. 2017.
[11] G. Khodabandehloo, M. Mirhassani, and M. Ahmadi, "Analog implementation of a novel resistive-type sigmoidal neuron," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 4, pp. 750-754, 2012.
[12] J. Shamsi, A. Amirsoleimani, S. Mirzakuchaki, A. Ahmade, S. Alirezaee, and M. Ahmadi, "Hyperbolic tangent passive resistive-type neuron," in , 2015, pp. 581-584.
[13] R. Zand, K. Y. Camsari, S. Datta, and R. F. DeMara, "Composable probabilistic inference networks using MRAM-based stochastic neurons,"