AQuRate: MRAM-based Stochastic Oscillator for Adaptive Quantization Rate Sampling of Sparse Signals

Abstract

Recently, the promising aspects of compressive sensing have inspired new circuit-level approaches for its efficient realization. However, most of these recent advances involving novel sampling techniques have been proposed without considering hardware and signal constraints. Additionally, traditional hardware designs for generating a non-uniform sampling clock incur large area overhead and power dissipation. Herein, we propose a novel non-uniform clock generator called the Adaptive Quantization Rate (AQR) generator using Magnetic Random Access Memory (MRAM)-based stochastic oscillator devices. Our proposed AQR generator provides ~25-fold reduction in area, on average, while offering ~6-fold reduced power dissipation, on average, compared to the state-of-the-art non-uniform clock generators.


Soheil Salehi, Ramtin Zand, Alireza Zaeemzadeh, Nazanin Rahnavard, Ronald F. DeMara

Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, 32816 USA

CCS CONCEPTS

• Hardware → Data conversion; Spintronics and magnetic technologies; Emerging architectures;

KEYWORDS

Analog to Digital Converter, Adaptive Sampling Rate, Non-uniform Clock Generator, MRAM-based Stochastic Oscillator, Compressive Sensing.

Recently, non-uniform sampling approaches such as Compressive Sensing (CS) have been proposed to reduce the energy consumption of the sampling operation by reducing the number of samples in each frame, to reduce the storage required to save the sampled data, and to reduce data transmission owing to the lower number of samples taken [15, 19, 20]. Additionally, event-driven sampling, such as level-crossing sampling, has been widely adopted as a promising CS technique to maximize the performance of the sampling operation while reducing energy consumption [18]. Furthermore, CS techniques are utilized to sample spectrally sparse wide-band signals close to their information rate rather than their Nyquist rate, which can be a challenge using conventional uniform sampling techniques due to the high cost of hardware capable of sampling at a high Nyquist rate.

Despite all the benefits that CS techniques offer, they are typically realized oblivious to hardware limitations such as energy, bandwidth, and battery capacity. Additionally, signal-dependent constraints such as sparsity and noise level are ignored while studying the quantization rate and resolution trade-off. The aforementioned hardware-dependent and signal-dependent constraints change during the sampling operation. Thus, adaptive quantization rate and resolution optimization circuitry is required to maximize sampling performance while minimizing the number of samples to reduce energy consumption, data transmission, and storage. Adaptive quantization rate and resolution sampling might be readily achieved from the algorithm perspective; however, it requires a hardware platform that is capable of real-time adaptation according to signal behavior such as sparsity rate. Recently, an adaptive optimization of the quantization rate and resolution during signal acquisition has been investigated in [14].

Previous works on adaptive quantization rate and resolution ADCs have been implemented using Complementary Metal Oxide Semiconductor (CMOS) technology and considering a low-pass signal model [3, 18]. Herein, we propose a spin-based Adaptive Quantization Rate (AQR) generator circuit that considers the signal-dependent constraints as well as hardware limitations. The proposed AQR generator circuit utilizes Magnetic Random Access Memory (MRAM)-based stochastic oscillator devices, which offer miniaturization and significant energy savings [6].

Recently, researchers have achieved significant performance improvements using sparse signal recovery techniques. Spectrally sparse signals are utilized in many applications such as frequency hopping communications, musical audio signals, cognitive radio networks, and radar/sonar imaging systems [14]. Additionally, a major challenge in spectrum sensing is that in most cases, the sparse components of the signal are spread over a wide-band spectrum and need to be acquired without prior knowledge of their frequencies. Moreover, spectrum-aware communication networks require Radio Frequency and mixed-signal hardware architectures that can achieve very wide-band but energy-efficient spectrum sensing [14].

Figure 1: The building block of the proposed spin-based AQR generator [6].

The cornerstone of CS approaches and non-uniform sampling techniques is an asynchronous pseudo-random clock generator, usually referred to as a non-uniform clock generator, which consists of a Linear Feedback Shift Register (LFSR) that selects a clock signal at random from a series of ring oscillators with different frequencies and phases [3, 4, 10, 11, 13]. In most cases, these circuits require a large number of CMOS transistors and incur significant area overhead and power dissipation. Recently, a novel approach for generating the non-uniform clock using VCMA-MTJ devices was proposed in [11], and the authors showed that their design can achieve significant area and power dissipation reduction compared to previous CMOS-based pseudo-random clock generators. However, the authors in [11] considered the frequency of the signal in order to generate the sampling clock, which limits the bandwidth; in the case of spectrally sparse signals, where no prior knowledge of frequency is available, their approach will face challenges. Herein, we consider the sparsity rate of the signal to generate the sampling clock. This minimizes the number of samples and results in more energy savings. Furthermore, our proposed design has reduced complexity compared to other designs proposed in the literature due to a significant reduction in the CMOS circuit elements.
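As a rough illustration of the conventional LFSR-based scheme described above, the following Python sketch uses a 16-bit Fibonacci LFSR to pick, at each step, one of several clock sources. The tap positions, the seed, the number of sources, and the function names are assumptions chosen for illustration; they are not taken from the designs in [3, 4, 10, 11, 13].

```python
from typing import Iterator, List, Tuple


def lfsr_bits(seed: int,
              taps: Tuple[int, ...] = (16, 14, 13, 11),
              nbits: int = 16) -> Iterator[int]:
    """Yield pseudo-random bits from a Fibonacci LFSR.

    The default taps (x^16 + x^14 + x^13 + x^11 + 1) are a commonly used
    maximal-length choice; they are an assumption, not a value from the paper.
    """
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    while True:
        feedback = 0
        for tap in taps:
            feedback ^= (state >> (nbits - tap)) & 1
        out = state & 1                      # bit shifted out this step
        state = (state >> 1) | (feedback << (nbits - 1))
        yield out


def select_clock_indices(n_select: int,
                         n_sources: int = 4,
                         seed: int = 0xACE1) -> List[int]:
    """Pick one of n_sources (hypothetical) ring oscillators per step."""
    bits_per_choice = (n_sources - 1).bit_length()
    bits = lfsr_bits(seed)
    choices = []
    for _ in range(n_select):
        index = 0
        for _ in range(bits_per_choice):
            index = (index << 1) | next(bits)
        choices.append(index % n_sources)
    return choices


if __name__ == "__main__":
    print(select_clock_indices(8))
```

In hardware, the shift register, the oscillators, and the selection multiplexer all have to be built from CMOS gates, which is where the transistor count discussed in the text comes from.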

In this section, we show that a recently proposed building block with embedded MRAM technology can enable the hardware realization of an AQR generator. The structure of the MRAM-based stochastic device is shown in Fig. 1, which includes a magnetic tunnel junction (MTJ), a 2-terminal device with two different resistive levels depending on the orientation of its ferromagnetic (FM) layers, called the fixed layer and the free layer. The fixed layer is designed to have a fixed magnetic orientation, while the magnetization orientation of the free layer can be switched. In MRAM-based memory devices, a thermally stable nanomagnet with a large energy barrier with respect to the thermal energy ($kT$) is utilized for the free layer so that the MTJ can function as a non-volatile memory. In recent years, the use of superparamagnetic MTJs that are not thermally stable has been experimentally and theoretically investigated in search of functional spintronic devices [5, 7–9, 12, 17, 21, 22].

In this paper, we use an MRAM device with a low energy-barrier nanomagnet ($E_B \ll kT$), which is thermally unstable [6]. The resistance of an MTJ with such a low energy barrier nanomagnet randomly fluctuates between high ($R_{AP}$) and low ($R_P$) resistance states. This creates a fluctuating output voltage at the drain of the NMOS transistor, which can be amplified by an inverter circuit to produce a stochastic output that can be modulated by the input voltage. In particular, the output voltage at the drain of the NMOS transistor can be shorted to ground by reducing its drain-source resistance ($r_{ds}$) through increasing the input voltage ($V_{IN}$), or it can be near $V_{DD}$ by increasing $r_{ds}$ through decreasing $V_{IN}$. The device operation can be understood by considering the MTJ conductance [6]:

\[ G_{MTJ} = G_0 \left[ 1 + m_z \frac{TMR}{2 + TMR} \right] \tag{1} \]

where $m_z$ is the free layer magnetization, which fluctuates stochastically due to thermal noise, $G_0$ is the average MTJ conductance, $(G_P + G_{AP})/2$, and $TMR$ is the tunneling magnetoresistance ratio. The drain voltage can be expressed as:

\[ \frac{V_{DRAIN}}{V_{DD}} = \frac{(2 + TMR) + TMR\, m_z}{(2 + TMR)(1 + \alpha) + TMR\, m_z} \tag{2} \]

where $\alpha$ is the ratio of the transistor conductance ($G_T$) to the average MTJ conductance ($G_0$). The maximum fluctuation at the drain occurs when $\alpha \approx 1$, thus the MTJ resistance is approximately equal to the NMOS resistance when $V_{IN} = \ldots\,V_{DD}$. Since the drain voltage fluctuations are on the order of hundreds of mV for typical TMR values, an additional inverter is used to amplify the noise to produce output voltages ranging from 0 to $V_{DD}$.

To realize an effective hybrid emerging-device and CMOS circuit, one useful approach is to consider stochastic and deterministic attributes separately. For instance, Fig. 2 depicts the proposed AQR generator circuit, wherein a 2-terminal MTJ realizes stochastic behavior to provide the non-uniform clock generation capability. The quantized Sparsity Rate Estimator (SRE) module shown in Fig. 2 estimates the sparsity rate of the digital output bit-stream by estimating the sparse spectral components of the digital output using an iterative algorithm. Recently, a rapid and optimized sparse component estimation method was proposed in [14]. In that approach, to minimize the computational complexity of the sparse component estimation, a sliding window is utilized and the algorithm performs only one iteration on each frame of the input, using the previous estimate as the initial value. This results in gradual convergence of the sparse components to the actual values across iterations. These algorithms can be employed to find the sparsity rate of the signal. The sparsity rate of analog signals, which can be described as the number of non-zero elements divided by the total number of elements in the sparse representation of the signal, is between 5% and 15% in many applications, including those targeted herein.

Figure 2: Integration of AQR generator circuit within the Compressive Sensing ADC (CS-ADC) system design.

Once the SRE module has estimated the sparsity rate of the signal based on the digital output of the previous frame, it generates a voltage level according to that sparsity rate of the input analog signal. This voltage, referred to as $V_{SR}$, is applied to the gate of the NMOS transistor shown in Fig. 2 and results in a stochastic bit-stream generated by the MRAM-based stochastic oscillator device. The stochastic bit-stream is forwarded to the D-Flip-Flop (D-FF) as shown in Fig. 2, and the 2-input NAND of the D-FF output and the actual clock of the circuit generates the required quantization rate for the following frame of signal acquisition, referred to as Asynchronous Clock (A-Clk) in Fig. 2. Additionally, the SRE module can also be used by the recovery algorithms to efficiently recover the sampled signal [14]. The A-Clk is also forwarded to the sparse recovery algorithm to provide necessary information about the samples taken from the signal to assist with signal reconstruction.

Figure 3: The sampled output of the stochastic MRAM-based building block for AQR generator for various input voltages.

To obtain the relation between the output probability of the stochastic MRAM-based AQR generator and its input voltage, we have applied an input pulse whose amplitude starts at GND and is increased by 200mV every 100ns until it reaches $V_{DD}$. The output of the building block is sampled with a 1GHz clock using a D-FF circuit, as shown in Fig. 3.

In order to evaluate and validate the behavior and functionality of the proposed AQR generator circuit, SPICE and MATLAB simulations were performed. We have utilized the 14nm High Performance FinFET Predictive Technology Model (PTM) [2] as well as the MRAM-based stochastic oscillator device model and parameters presented in [6] to implement and evaluate the proposed AQR generator circuit.
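The drain-voltage expression in (2) and the D-FF/NAND clock gating described above can be explored numerically. The Python sketch below is a behavioral illustration only, not the SPICE or MATLAB setup used in this work: it assumes the normalized magnetization takes the extreme values $m_z = \pm 1$, uses an illustrative TMR of 1.0, and takes the probability of a '1' in the stochastic bit-stream as a direct input (`p_one`) rather than modeling its dependence on $V_{SR}$. All function names are hypothetical.

```python
import random
from typing import List


def drain_voltage_ratio(m_z: float, tmr: float, alpha: float) -> float:
    """V_DRAIN / V_DD following Eq. (2)."""
    numerator = (2 + tmr) + tmr * m_z
    denominator = (2 + tmr) * (1 + alpha) + tmr * m_z
    return numerator / denominator


def drain_swing(tmr: float, alpha: float) -> float:
    """Drain-voltage difference between m_z = +1 and m_z = -1 (assumed extremes)."""
    return (drain_voltage_ratio(+1.0, tmr, alpha)
            - drain_voltage_ratio(-1.0, tmr, alpha))


def gated_clock(p_one: float, n_cycles: int, rng: random.Random) -> List[int]:
    """Behavioral A-Clk: NAND of the D-FF output and the system clock.

    Each cycle the D-FF latches one stochastic bit (1 with probability p_one).
    The clock is modeled as two half-cycles (high, then low).
    """
    a_clk = []
    for _ in range(n_cycles):
        q = 1 if rng.random() < p_one else 0
        for clk in (1, 0):
            a_clk.append(0 if (q and clk) else 1)  # 2-input NAND
    return a_clk


def count_sampling_pulses(a_clk: List[int]) -> int:
    """Count low pulses of A-Clk; each corresponds to one passed clock cycle."""
    return sum(1 for level in a_clk if level == 0)


if __name__ == "__main__":
    for alpha in (0.1, 1.0, 10.0):
        print(f"alpha={alpha}: swing={drain_swing(1.0, alpha):.3f} of V_DD")
    rng = random.Random(0)
    pulses = count_sampling_pulses(gated_clock(0.10, 1000, rng))
    print(f"{pulses} of 1000 clock cycles passed to the ADC")
```

With these assumptions the swing is largest near $\alpha \approx 1$, in line with the statement following (2), and the fraction of clock cycles that reach the ADC tracks `p_one`, which is how a sparsity-dependent voltage can set the average sampling rate.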

Table 1: Comparison with recently proposed non-uniform clock generator designs

| Design | Technology ($V_{nominal}$) | $Power_{norm}$ | $Area_{norm}$ |
|---|---|---|---|
| [11] | 65nm (1.…) | ∼…× | ∼…× |
| [13] | 65nm (1.…) | ∼…× | ∼…× |
| [4] | 90nm (1.…) | ∼…× | ∼…× |
| [3] | 28nm (1.…) | ∼…× | N/A |
| This Work | 14nm (0.…) | …× | …× |

According to our results, AQR provides significant power dissipation and area reductions compared to the state-of-the-art non-uniform clock generators listed in Table 1 [3, 4, 11, 13]. According to our simulation results, the power dissipation of the proposed AQR generator circuit is 22.… µW on average. With respect to area utilization, our proposed AQR design requires only 23 FinFET transistors, a significant reduction in transistor count and complexity relative to the non-uniform clock generator circuits in state-of-the-art designs [3, 4, 11, 13]. Thus, AQR avoids high transistor counts and makes it unnecessary to use large LFSR circuits that contain numerous D-FFs as well as several logic gates and multiplexers. For a more equitable comparison in terms of area and power dissipation, we have derived (3) and (4) using the General Scaling method [16] to normalize the power dissipation and area of the designs listed in Table 1. Based on the General Scaling method, voltage and area scale at different rates of $U$ and $S$, respectively. Thus, the power dissipation is scaled with respect to $1/U$ and area per device is scaled according to $1/S$ [16].

\[ Power_{norm} = \frac{Power_x}{Power_{AQR}} \times \left( \frac{1}{U} \right) = \frac{Power_x}{Power_{AQR}} \times \left( \frac{\ldots\,\mathrm{V}}{V_{nominal}} \right) \tag{3} \]

\[ Area_{norm} = \frac{Area_x}{Area_{AQR}} \times \left( \frac{1}{S} \right) = \frac{Area_x}{Area_{AQR}} \times \left( \frac{\ldots\,\mathrm{nm}}{Technology} \right) \tag{4} \]

where $V_{nominal}$ is the nominal voltage of the technology model, $Technology$ refers to the technology node in nanometers, and subscript $x$ refers to the design whose power dissipation and area are scaled according to the technology models. According to (3) and (4), AQR provides a power dissipation reduction of up to one order of magnitude compared to the state-of-the-art non-uniform clock generators listed in Table 1. Additionally, AQR offers up to one order of magnitude area reduction compared to the designs in Table 1, using the scaling comparison trends accepted in the literature.

As described in Section 3.2, the sparsity rate of analog signals is usually within the range of 5% − … generator. According to the results, the mean normalized errors of the reconstruction of the signals with 5%, 10%, and 15% sparsity rates using OMP are 0.…

Figure 4: Sampling an analog signal with sparsity rate of … using AQR generator. Blue represents the signal and red represents the samples taken using the AQR generator.

We have devised a novel non-uniform clock generator called the Adaptive Quantization Rate (AQR) generator using MRAM-based stochastic oscillator devices. Our proposed AQR generator considers signal constraints, such as sparsity rate, as well as hardware constraints, such as area and power dissipation, in order to generate the non-uniform clock for the asynchronous CS-ADC. Compared to similar non-uniform clock generators presented in the literature, the AQR generator provides a significant area reduction of ∼25-fold and a power dissipation reduction of ∼6-fold, on average.

ACKNOWLEDGEMENT

This work was supported in part by the Center for Probabilistic Spin Logic for Low-Energy Boolean and Non-Boolean Computing (CAPSL), one of the Nanoelectronic Computing Research (nCORE) Centers as task 2759.006, a Semiconductor Research Corporation (SRC) program sponsored by the NSF through CCF 1739635, and by NSF through ECCS 1810256.

REFERENCES

[1] 2009. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis 26, 3 (2009), 301–321.
[2] Arizona State University (ASU). [n. d.]. 14nm HP-FinFET Predictive Technology Model (PTM), accessed on 26 November 2018. http://ptm.asu.edu/
[3] David Bellasi, Luca Bettini, Thomas Burger, Qiuting Huang, Christian Benkeser, and Christoph Studer. 2014. A 1.9 GS/s 4-bit sub-Nyquist flash ADC for 3.8 GHz compressive spectrum sensing in 28 nm CMOS. In …. IEEE, 101–104. https://doi.org/10.1109/MWSCAS.2014.6908362
[4] Rashed Zafar Bhatti, Keith M. Chugg, and Jeff Draper. 2007. Standard cell based pseudo-random clock generator for statistical random sampling of digital signals. In …. IEEE, 1110–1113. https://doi.org/10.1109/MWSCAS.2007.4488752
[5] Kerem Yunus Camsari, Rafatul Faria, Brian M Sutton, and Supriyo Datta. 2017. Stochastic p-bits for invertible logic. Physical Review X 7, 3 (2017), 031014.
[6] Kerem Yunus Camsari, Sayeef Salahuddin, and Supriyo Datta. 2017. Implementing p-bits with embedded MTJ. IEEE Electron Device Letters 38, 12 (2017), 1767–1770.
[7] Won Ho Choi, Yang Lv, Jongyeon Kim, Abhishek Deshpande, Gyuseong Kang, Jian-Ping Wang, and Chris H Kim. 2014. A magnetic tunnel junction based true random number generator with conditional perturb and real-time output probability tracking. In Electron Devices Meeting (IEDM), 2014 IEEE International. IEEE, 12–5.
[8] Punyashloka Debashis, Rafatul Faria, Kerem Y Camsari, Joerg Appenzeller, Supriyo Datta, and Zhihong Chen. 2016. Experimental demonstration of nanomagnet networks as hardware for ising computing. In Electron Devices Meeting (IEDM), 2016 IEEE International. IEEE, 34–3.
[9] Akio Fukushima, Takayuki Seki, Kay Yakushiji, Hitoshi Kubota, Hiroshi Imamura, Shinji Yuasa, and Koji Ando. 2014. Spin dice: A scalable truly random number generator based on spintronics. Applied Physics Express 7, 8 (2014), 083001.
[10] Selcuk Kose, Emre Salman, Zeljko Ignjatovic, and Eby G. Friedman. 2008. Pseudo-random clocking to enhance signal integrity. In …. IEEE, 47–50. https://doi.org/10.1109/SOCC.2008.4641477
[11] H Lee, C Grezes, A Lee, F Ebrahimi, P Khalili Amiri, and K L Wang. 2017. A Spintronic Voltage-Controlled Stochastic Oscillator for Event-Driven Random Sampling. IEEE Electron Device Letters 38, 2 (2017), 281–284. https://doi.org/10.1109/LED.2016.2642818
[12] Nicolas Locatelli, Alice Mizrahi, A Accioly, Rie Matsumoto, Akio Fukushima, Hitoshi Kubota, Shinji Yuasa, Vincent Cros, Luis Gustavo Pereira, Damien Querlioz, et al. 2014. Noise-enhanced synchronization of stochastic magnetic oscillators. Physical Review Applied 2, 3 (2014), 034009.
[13] Muhammad Osama, Lamya Gaber, and Aziza Hussein. 2016. Design of high performance Pseudorandom Clock Generator for compressive sampling applications. In …. IEEE, 257–265. https://doi.org/10.1109/NRSC.2016.7450836
[14] Soheil Salehi, Mahdi Boloursaz Mashhadi, Alireza Zaeemzadeh, Nazanin Rahnavard, and Ronald F. DeMara. 2018. Energy-Aware Adaptive Rate and Resolution Sampling of Spectrally Sparse Signals Leveraging VCMA-MTJ Devices. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (2018), 1–1. https://doi.org/10.1109/JETCAS.2018.2857998
[15] Shriram Sarvotham, Dror Baron, and Richard G. Baraniuk. 2006. Measurements vs. Bits: Compressed Sensing meets Information Theory. Allerton Conference on Communication, Control and Computing (9 2006). https://scholarship.rice.edu/handle/1911/20323
[16] Aaron Stillmaker and Bevan Baas. 2017. Scaling equations for the accurate prediction of CMOS device performance from 180 nm to 7 nm. Integration 58 (6 2017), 74–81. https://doi.org/10.1016/J.VLSI.2017.02.002
[17] Brian Sutton, Kerem Yunus Camsari, Behtash Behin-Aein, and Supriyo Datta. 2017. Intrinsic optimization using stochastic nanomagnets. Scientific Reports …
[18] … IEEE Journal of Solid-State Circuits 52, 9 (2017), 2335–2349. https://doi.org/10.1109/JSSC.2017.2718671
[19] Alireza Zaeemzadeh, Jamie Haddock, Nazanin Rahnavard, and Deanna Needell. 2018. A Bayesian Approach for Asynchronous Parallel Sparse Recovery. In …. IEEE, Pacific Grove, CA, 1980–1984. https://doi.org/10.1109/ACSSC.2018.8645176
[20] Alireza Zaeemzadeh, Mohsen Joneidi, and Nazanin Rahnavard. 2017. Adaptive non-uniform compressive sampling for time-varying signals. IEEE, 1–6. https://doi.org/10.1109/CISS.2017.7926148
[21] Ramtin Zand, Kerem Y Camsari, Supriyo Datta, and Ronald F DeMara. 2018. Composable Probabilistic Inference Networks Using MRAM-based Stochastic Neurons. arXiv preprint arXiv:1811.11390 (2018).
[22] Ramtin Zand, Kerem Yunus Camsari, Steven D. Pyle, Ibrahim Ahmed, Chris H. Kim, and Ronald F. DeMara. 2018. Low-Energy Deep Belief Networks Using Intrinsic Sigmoidal Spintronic-based Probabilistic Neurons. In …