Adaptive Multi-bit SRAM Topology Based Analog PUF

Abstract

Physically Unclonable Functions (PUFs) are lightweight cryptographic primitives that generate unique signatures from minuscule manufacturing variations. In this work, we present a lightweight, area-efficient and low-power adaptive multi-bit SRAM topology based Current Mirror Array (CMA) analog PUF design for securing sensor nodes, authentication and key generation. The proposed Strong PUF increases the complexity of machine learning attacks, making them more difficult for an adversary. The design is based on the scl180 library.



Sudarshan Sharma*, Dhruv Thapar†, Nikhil Bhelave† and Mrigank Sharad*
*Department of Electronics and Electrical Communication Engineering, †Department of Electrical Engineering
Indian Institute of Technology Kharagpur, India


Keywords: Strong analog PUF, multi-bit response, low power.

I. INTRODUCTION

With the rapid increase in the use of IoT devices, security has become a prime concern. Physically Unclonable Functions (PUFs) can be used as a low-power alternative to secure sensor nodes. PUFs are mathematical models or physical structures that map challenge words to corresponding responses, which are governed by uncertainties in device-level variation. Silicon PUFs have emerged as potential hardware cryptographic tools due to their ability to generate hardware-unique responses to a given digital test word or challenge by exploiting manufacturing process variations in the circuit components of the IC. These random intrinsic variations are nearly impossible to replicate, and therefore PUFs provide extremely reliable hardware authentication and key generation. Popular applications of PUFs include hardware security and authentication, such as secure RFIDs, IP protection in FPGAs and cryptographic key generation.

A. Related Work

The silicon realization of PUFs (SPUF) is based on the random variations in dies across a wafer, and from wafer to wafer, due to process, temperature and pressure variations during the various manufacturing steps. The first implementation of PUFs on silicon was introduced in [1], where the delay variations of CMOS logic components are used to produce unique responses. In a delay-based PUF, the analog delay difference between two structurally identical parallel paths, which arises from manufacturing variations, is compared. Ring Oscillator (RO) PUFs are based on digital loops; they are easy to implement and possess higher reliability. However, RO PUFs are slower and consume more power than arbiter PUFs (delay-based PUFs), and they depend heavily on the number of ROs present.

The low-power current-based PUF structure [2] uses an automatic cut-off mechanism to stop the current flow after response evaluation to minimize power consumption. A sub-threshold design based PUF [3] exploits the sensitivity to process variation in deep sub-micron technologies; the design consists of n-stage CMOS multiplexers and delay circuits followed by an arbiter. To mitigate power supply noise, switching noise and environmental variation, a differential amplifier topology based PUF was introduced in [4].

Integrated Circuit Identification (ICID) [5] is based on an addressable MOSFET array which drives a load to generate random repeatable voltages based on threshold voltage mismatches. An SRAM cell and memory block based PUF exploiting the intrinsic process variations in the read/write reliability of cells in static memories is implemented in [6]. A PUF cell based on a 2-T amplifier working in the sub-threshold region is shown in [7]. In [8], a PUF design based on power grid resistance variation is investigated; the response depends on the voltage drops at distinct locations of the IC that occur when a variety of stimuli are introduced. The concept of amplifying random transistor mismatch through two complementary current mirrors, and a modified design with the addition of a sense amplifier, is discussed in [9]. T. Saha et al. [10] proposed an aging-resistant, lightweight and low-power analog PUF which exploits the susceptibility of the threshold voltage (Vth) of MOSFETs to process variations.

Almost every PUF discussed above yields a single-bit response for a unique challenge word, which makes it easier for machine learning algorithms to model the PUF with fewer challenge-response pairs. We present an adaptive multi-bit SRAM topology based low-power and highly robust analog PUF. The rest of the paper is organized as follows. Section II presents a discussion on machine learning attacks, while Section III describes the architecture of the proposed PUF. Section IV discusses the reliability, and finally, Section V concludes the paper.

II. DISCUSSION ON MACHINE LEARNING BASED ATTACKS ON PUF

Most existing PUF circuits produce a single-bit response to a challenge, which can be modeled easily using machine learning algorithms. The proposed PUF adheres to the features of a Strong PUF, namely Many Challenges, Unpredictability and an Unprotected Challenge-Response Interface [11]. Rührmair et al. proposed various modelling attacks on PUFs in [12]. The primary attack model consists of two steps. The first requires finding a function with parameters that correctly describes the PUF's challenge-response behavior (or input-output behavior). The second selects a machine learning (ML) algorithm to train the parameters of the chosen function, improving its prediction quality using a large set of Challenge Response Pairs (CRPs) as the training set. The best existing ML algorithms for attacks on PUFs are Logistic Regression and Evolution Strategies; others include Support Vector Machines (SVM) and Neural Networks. The adaptive multi-bit response of the proposed PUF defeats SVM and logistic regression and makes the attack extremely difficult for heuristic-based methods like Evolution Strategies, thus increasing the complexity of the attack.

Fig. 1: Proposed PUF Architecture

III. PUF ARCHITECTURE

Fig. 1 depicts the architecture of the proposed PUF. As illustrated in the figure, two 4x16 decoders convert the challenge word into a bit-line and a word-line, which together select a particular bit cell from the entire array. Once the bit cell is selected, the power gating block routes the current mirror biasing to that specific bit cell. The corresponding output, passed through the MUX, is processed by the configurable ADC block, which produces the final variable multi-bit response to the challenge word.
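As an illustration of the addressing step described above, the following Python sketch models the two 4-to-16 decoders behaviourally. It assumes an 8-bit challenge word (inferred from the two 4-bit decoder inputs and the 16 x 16 array). The assignment of the upper nibble to the word-line and the lower nibble to the bit-line is a hypothetical choice, and the names decode_4to16 and select_bit_cell are illustrative only.

```python
from __future__ import annotations

ROWS = 16  # word-lines of the 16 x 16 bit-cell array
COLS = 16  # bit-lines of the 16 x 16 bit-cell array


def decode_4to16(nibble: int) -> list[int]:
    """Behavioural 4-to-16 decoder: return a one-hot list of 16 select lines."""
    if not 0 <= nibble < 16:
        raise ValueError("decoder input must be a 4-bit value")
    return [1 if i == nibble else 0 for i in range(16)]


def select_bit_cell(challenge: int) -> tuple[int, int]:
    """Map an 8-bit challenge word to a (word_line, bit_line) index pair.

    Assumption: the upper nibble drives the word-line decoder and the
    lower nibble drives the bit-line decoder.
    """
    if not 0 <= challenge < ROWS * COLS:
        raise ValueError("challenge must be an 8-bit value")
    word_lines = decode_4to16(challenge >> 4)
    bit_lines = decode_4to16(challenge & 0xF)
    # Exactly one line is active in each decoder output.
    return word_lines.index(1), bit_lines.index(1)
```

Under these assumptions every one of the 256 challenge words addresses a distinct bit cell, so the array is fully covered by the challenge space.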

A. PUF Bit Cell

The basic unit of the proposed SRAM topology based analog PUF, called the bit cell, comprises two PMOS transistors in cascode with two NMOS transistors. All MOSFETs in the bit cell are minimum sized, which increases the probability of mismatch during manufacturing and also reduces the area required for the complete 16 x 16 bit-cell array. The output of the bit cell is taken at the drains of PM2 and NM1, owing to the symmetry. The cascode scheme provides maximum amplification, and hence a mismatch at the extremities of even tenths of a millivolt can be amplified.

PM2 and NM1 are biased through two transmission gates, which are controlled via the bit-line and word-line using the power gating blocks, as seen in Fig. 2. The power gating block consists of two transmission gates (TGs) controlled by the word line and the bit line; the TGs are tied to the current mirror biasing and the bit cell, as seen in Fig. 1. One power gating block is required per 16 bit cells (one column). This helps in switching off all unselected bit cells, resulting in zero idle-state power consumption. Different switching schemes were tested and studied for various process corners. Channel length modulation results in a significant error while copying currents, especially for minimum-size transistors. The choice of current mirroring scheme therefore becomes very important. The proposed configuration is shown in Fig. 4.

Fig. 2: Bit Cell

Due to variations in semiconductor manufacturing processes, corners are introduced in CMOS circuits. The five process corners dealing with process variations are typical-typical (TT), slow-slow (SS), fast-fast (FF), slow-fast (SF) and fast-slow (FS). TT, FF and SS are not of much concern, as both MOSFET types are affected equally in one direction, but the skewed corners SF and FS may affect the response. Considering this, various current mirror configurations were tested, and the wide-swing cascode current mirror worked well with the bit cell for all corners, with a perfectly symmetrical output. The biasing current into the bit cell using this particular current scheme varies around the center value (at no mismatch) of 4.3 µA.

Fig. 3 shows the response of the various test beds used as current mirrors, where the X axis represents the referred variation, modeled as an additional change in the threshold voltage of PM1. We observe that the ideal response is obtained with the proposed current mirror configuration. Moreover, it has a higher gain and is perfectly symmetric compared to the reduced-headroom cascode current mirror and the simple cascode current mirror configurations.

Fig. 3: Output Voltage Response for different Current Mirror Configurations

B. Switching Configurations

Zero static power in the idle state is achieved by turning a bit cell ON only when it is selected and cutting OFF the supply rail for the rest of the bit cells in the array. Different switching configurations were tested, and their effects on performance were studied in detail. The most viable option was the one in Fig. 5. This configuration does not require any additional circuitry for switching. However, the response was hampered across process corners, as a substantial shift was observed in the output due to small voltage drops across the switching transistors. To mitigate these effects, the control switches were removed from the bit cell and external control circuitry (the power gating blocks) was employed, which also drastically reduced the number of transistors per bit cell and works well across process corners. Switching PM1 and NM2 of the bit cell would slow down the charging/discharging process from the pre-charged value to the desired value due to the increased parasitic capacitance per bit line. Therefore, the two central MOSFETs, PM2 and NM1, are tied to VDD and GND respectively through the power gating block, which turns OFF the bit cell. Fig. 6 compares the output voltage across process corners for the naive switching configuration and the switching configuration used.

Fig. 4: Wide Swing Cascode Current Mirror with PTAT Current Reference

Fig. 5: Naive Switching Configuration

C. ADC scheme

Fig. 6: Output Voltage Variation for different Process Corners in (a) Naive Switching Configuration (b) Used Switching Configuration

Due to the cascode configuration, the gain from the MOSFETs at the extremities to the output is very high; it is therefore highly probable that the response will hit either of the power rails, i.e., VDD or GND. The impact of random process variations on circuit behavior can be studied through Monte Carlo (MC) simulation. The MC simulation presented in Fig. 7 shows the output voltage distribution under both statistical variations, i.e., process and mismatch. As expected, the probability distribution is skewed towards the power rails, so a varying multi-bit ADC scheme can exploit the most variation from the circuit. The final response is obtained through the adaptive multi-bit ADC scheme shown in Fig. 8. First, we define windows or regions, as in Fig. 8, using the Lloyd-Max algorithm based on the probability distribution from the MC simulation. The regions obtained are as follows:

Region 1: [0, 0.1451) 8 bit ADC
Region 2: [0.1451, 0.6596) 7 bit ADC
Region 3: [0.6596, 1.3308) 6 bit ADC
Region 4: [1.3308, 1.6978) 7 bit ADC
Region 5: [1.6978, 1.8) 8 bit ADC

The ADC selection logic unit governs the bit precision of the conversion based on the regions above, and the response is then generated using the configurable single-slope ADC scheme. V1, V2, V3 and V4 in Fig. 8 are the voltage values defining the windows.
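The region-based selection of ADC precision can be sketched in Python as follows. The boundaries and bit widths are those listed above. The value VDD = 1.8 (taken from the upper edge of Region 5), the idealised single-slope conversion (a linear ramp spanning 0 to VDD, counted in 2^n steps) and all function names are assumptions made for illustration; they are not a description of the actual circuit.

```python
from bisect import bisect_right

VDD = 1.8  # assumed supply, taken from the upper edge of Region 5
# V1..V4: region boundaries quoted in the text
BOUNDARIES = [0.1451, 0.6596, 1.3308, 1.6978]
# ADC resolution in bits for Regions 1..5, as listed in the text
REGION_BITS = [8, 7, 6, 7, 8]


def select_region(v_out: float) -> int:
    """Return the 1-based region index for a bit-cell output voltage."""
    if not 0.0 <= v_out <= VDD:
        raise ValueError("output voltage must lie between 0 and VDD")
    # bisect_right counts the boundaries <= v_out, which matches the
    # half-open intervals [lower, upper) used in the text.
    return bisect_right(BOUNDARIES, v_out) + 1


def single_slope_code(v_out: float, bits: int) -> int:
    """Idealised single-slope conversion: count ramp steps until the ramp
    would exceed v_out. A full-scale ramp of 0..VDD is an assumption."""
    steps = 1 << bits
    lsb = VDD / steps
    count = 0
    while count < steps - 1 and (count + 1) * lsb <= v_out:
        count += 1
    return count


def adaptive_response(v_out: float) -> str:
    """Return the variable-length response as a bit string."""
    bits = REGION_BITS[select_region(v_out) - 1]
    code = single_slope_code(v_out, bits)
    return format(code, f"0{bits}b")
```

In this sketch an output near either rail yields an 8-bit word while an output near mid-supply yields a 6-bit word, so the response length itself depends on the selected cell.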

1) Configurable Single Slope ADC Scheme:

The single-slope ADC technique has several advantages over Flash and SAR ADCs, the most important being lower area and lower power consumption. The single-slope ADC designed here uses an external ramp circuit, a constant current source, three separate voltage comparators, a multiplexer, a free-running timer and a latching mechanism. The adaptive bit precision is achieved using a variable-frequency clock derived from the original clock, along with variable counter values that are finally latched upon comparison during ADC operation. Fig. 9 describes the topology used to eliminate the Input Common Mode Range limitation. The scheme can correctly perform ADC operation for an unknown analog voltage varying from 0 to VDD. Two comparators, Comparator A with an NMOS input pair and Comparator B with a PMOS input pair, include an offset cancellation scheme to prevent the results from being skewed in one direction. The decision of which comparator's output to use is made by a third comparator, Comparator C, based on the following rule: if the unknown voltage is higher than VDD/2, the output of Comparator A is selected; otherwise the output of Comparator B is used.

Fig. 7: Monte Carlo Simulation considering Process Variation and Mismatch for Bitcell Output Voltage

Fig. 8: Configurable ADC scheme

IV. RESULTS

A. Reliability with Temperature Variations

The reliability of the bit-cell output voltage is tested under temperature variations. Fig. 10 depicts the effect on the bit-cell output voltage as the temperature varies between ◦ C and ◦ C. For simulation purposes, the referred variation is modeled as the change in the threshold voltage of PM1 in the bit cell.

Fig. 9: Single Slope ADC

Fig. 10: Bit-Cell output voltage variation across different temperatures at different variations in threshold voltages

B. Comparison with other existing PUF schemes

PUFs operating in the sub-threshold region [3] are shown to consume less power than those in the super-threshold region, but they exhibit higher delays, reducing the speed of operation. The power gating block in the proposed PUF is the primary contributor to its low power consumption, without any penalty in operating speed. Table I compares the proposed PUF with existing PUFs based on speed and power for 1-bit generation.

TABLE I: Power and Speed Comparison with PUFs for 1 bit generation

PUF model        Power @ Speed of Operation   Energy / cycle
Super-threshold  136.4 µW @ 1 GHz             0.136 pJ
Sub-threshold    0.047 µW @ 1 MHz             0.047 pJ
ICID             250 µW @ 0.5 MHz             500 pJ
TV-PUF           0.181 µW @ 1 GHz             0.0018 pJ
Proposed PUF     306.54 µW @ 6.4 GHz          0.0478 pJ

V. CONCLUSION AND FUTURE SCOPE

This paper discusses the architecture of an adaptive multi-bit Strong Analog PUF. The proposed analog PUF is better than delay-based PUFs with respect to unbiased responses. Because the output is independent of the actual layout design, the PUF is easy to fabricate owing to its low circuit complexity. Furthermore, the PUF is shown to consume low power at a faster speed of operation.

Future work on the proposed PUF may include an evaluation of uniqueness, uniformity and reliability after chip manufacturing. NIST tests can also be performed on the chip to compare it with more state-of-the-art PUF configurations.

REFERENCES

[1] B. Gassend, D. Clarke, M. van Dijk, and S. Devadas, "Silicon physical random functions," in Proceedings of the 9th ACM Conference on Computer and Communications Security, CCS '02, (New York, NY, USA), pp. 148–160, ACM, 2002.
[2] M. Majzoobi, G. Ghiaasi, F. Koushanfar, and S. R. Nassif, "Ultra-low power current-based PUF," in , pp. 2071–2074, May 2011.
[3] L. Lin, D. Holcomb, D. K. Krishnappa, P. Shabadi, and W. Burleson, "Low-power sub-threshold design of secure physical unclonable functions," in , pp. 43–48, Aug 2010.
[4] B.-D. Choi, T.-W. Kim, M.-K. Lee, K.-S. Chung, and D. K. Kim, "Integrated circuit design for physical unclonable function using differential amplifiers," Analog Integr. Circuits Signal Process., vol. 66, pp. 467–474, Mar. 2011.
[5] K. Lofstrom, W. R. Daasch, and D. Taylor, "IC identification circuit using device mismatch," in , pp. 372–373, Feb 2000.
[6] A. R. Krishna, S. Narasimhan, X. Wang, and S. Bhunia, "MECCA: A robust low-overhead PUF using embedded memory array," in Cryptographic Hardware and Embedded Systems – CHES 2011 (B. Preneel and T. Takagi, eds.), (Berlin, Heidelberg), pp. 407–420, Springer Berlin Heidelberg, 2011.
[7] K. Yang, Q. Dong, D. Blaauw, and D. Sylvester, "8.3 A 553F² 2-transistor amplifier-based physically unclonable function (PUF) with 1.67% native instability," in , pp. 146–147, Feb 2017.
[8] R. Helinski, D. Acharyya, and J. Plusquellic, "A physical unclonable function defined using power distribution system equivalent resistance variations," in , pp. 676–681, July 2009.
[9] A. B. Alvarez, W. Zhao, and M. Alioto, "Static physically unclonable functions for secure chip identification with 1.9–5.8% bit instability at 0.6–1 V and 15 fJ/bit in 65 nm," IEEE Journal of Solid-State Circuits, vol. 51, pp. 763–775, March 2016.
[10] V. Sehwag and T. Saha, "TV-PUF: A fast lightweight analog physical unclonable function," in , pp. 182–186, Dec 2016.
[11] U. Rührmair and D. E. Holcomb, "PUFs at a glance," in , pp. 1–6, March 2014.
[12] U. Rührmair and J. Sölter, "PUF modeling attacks: An introduction and overview," in 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE)