Janus: An Uncertain Cache Architecture to Cope with Side Channel Attacks
Hossein Hosseinzadeh, Mihailo Isakov, Mostafa Darabi, Ahmad Patooghy, Michel A. Kinsy
Adaptive and Secure Computing Systems (ASCS) Laboratory, Department of Electrical and Computer Engineering, Boston University
Abstract: Side channel attacks are a major class of attacks on crypto-systems. Attackers collect and analyze timing behavior, I/O data, or power consumption in these systems to undermine their effectiveness in protecting sensitive information. In this work, we propose a new cache architecture, called Janus, to enable crypto-systems to introduce randomization and uncertainty in their runtime timing behavior and power utilization profile. In the proposed cache architecture, each data block is equipped with an on-off flag to enable/disable the data block. The Janus architecture has two special instructions in its instruction set to support the on-off flag. Besides the analytical evaluation of the proposed cache architecture, we deploy it in an ARM-7 processor core to study its feasibility and practicality. Results show a significant variation in the timing behavior across all the benchmarks. The new secure processor architecture has minimal hardware overhead and significant improvement in protecting against power analysis and timing behavior attacks.
I. INTRODUCTION
Computing and embedded systems have penetrated almost every aspect of our daily lives, from mobile phones and artificial pacemakers to thermostats and self-driving vehicles. In fact, nowadays, most of the integrated circuits (ICs) in use are found in embedded systems and process sensitive information. The need to improve the security of these systems has never been greater because of the ongoing push to connect them to the Internet. To meet some of the security challenges, different crypto-systems have been proposed. However, one of the common classes of attacks on crypto-systems, and computing systems in general, is side channel attacks (SCAs), in which external indicators such as power consumption and electromagnetic emissions are used to derive secret and sensitive information. Power analysis attacks, fault injection attacks, and timing attacks are among the most successful side channel attacks.

With power analysis attacks, the power expenditure of a crypto-system is investigated by attackers in order to reveal sensitive information such as cryptographic keys. The most popular power analysis attacks are known as simple power analysis (SPA) and differential power analysis (DPA) attacks [1]. In SPA attacks, the power consumption graphs related to the electrical activities of the IC modules are interpreted visually. With DPA techniques, attackers collect and analyze data from various cryptographic functions, and use them to calculate the intermediate values of cryptographic computations. Since power consumption monitoring is not invasive, the crypto-system may not detect power analysis attacks. To cope with power analysis attacks, the system's power consumption can be obfuscated. Randomization of the IC runtime power variations is one such technique. When the power consumption of a crypto-system is randomized, attackers find it more difficult to extract secret information. Memory operations and the memory hierarchy can be utilized to randomize the power expenditure.
Fault injection attacks are another widely used class of side channel attacks [2]. Fault injection attacks have two main phases. In the first phase, the attacker maliciously injects faults in order to affect the input parameters, processing unit [3], storage unit [3], or instructions [4] of the crypto-system. In the second phase, the gathered information (e.g., I/O data and timing behavior) is analyzed to reveal secret keys inside the crypto-system. Fault injection attacks are often based on well-defined analysis vectors [5] applied to the information gathered during the attack. Randomizing the timing and I/O data of a crypto-system significantly improves the security of the system, especially against fault injection attacks [5].

In this work, we propose and evaluate a new cache design to cope with power and fault injection attacks. In the proposed Janus cache architecture, each cacheline has an additional "on-off flag" (OOF) bit to enable and disable access to the data block. By introducing instructions to turn cachelines on and off, the runtime power utilization and the timing behavior of the cache structure are efficiently obfuscated.

II. RELATED WORK
In [6], a hardware-software randomized instruction injection scheme (RIJID) was proposed. In RIJID, the power utilization is scrambled so that the segments of the encryption code cannot be identified. The scheme has shown some efficacy against both SPA and DPA attacks that use the system power profile to extract the encryption code. Ambrose et al. [7] proposed the use of the parallel capability of multi-modulo residue number system (RNS) architectures to scramble sensitive data. By using RNS architectures, the operations can be divided into parallel sections, and thus the power consumption and complexity are reduced. Yang et al. [8] introduced a scheme known as random dynamic voltage and frequency scaling (RDVFS) to decrease the correlation between the system power consumption and input data by changing the frequency and voltage randomly. However, the RDVFS method cannot defeat SPA/DPA power attacks [8]. In [9], the authors developed a policy using dynamic voltage and frequency scaling (DVFS) to overcome the limitations of RDVFS by breaking correlations between voltage and frequency of (V, f) pairs. In [10], the advanced encryption standard (AES) algorithm is implemented using techniques resistant to first-order differential electromagnetic and power analyses. With this approach, the Galois Field of the AES is randomized and no additional operation is added to the algorithm. Consequently, the working frequency remains the same and the algorithm used is compatible with the published standard.

Fault attack tolerant methods generally fall under one of two categories: fault avoidance and fault protection. In both cases, extra hardware is often required to (a) check and prevent fault injection or (b) roll back the crypto-system to recover from the fault. Most of the proposed approaches deal with power attacks while ignoring fault attacks, or vice versa. In this work, we try to jointly address both fault injection and power attacks in crypto-systems.

III. JANUS CACHE ARCHITECTURE
In both general-purpose and embedded computing, the overall system's performance and power usage are highly dependent on the cache's performance. When the processor needs some data, it goes to the cache. If the data is in the cache, there is a hit. Otherwise, the processor has to wait for main memory to supply the data. Since the access time for the main memory is orders of magnitude greater than the cache access time, cache hits and cache misses have very different access times and power profiles. From the power consumption view, a cache hit consumes very little energy since no external lines of data are moved through the memory subsystem hierarchy and no main bus address or data activities are involved. Therefore, the hit rate of the cache system plays a pivotal role in the power consumption and timing behavior of a crypto-system. The key insight is that by changing the miss and hit rates, one can alter the power consumption and timing behavior of the system, leading to a more robust crypto-system.

The proposed cache architecture operates under a fully associative policy for placing data words. More specifically, new data words can be stored in any free location of the cache, and if the cache is full, data eviction and new data word placement use the Least Recently Used (LRU) policy. In the Janus cache design, for each block of data there is one flag bit called the "on-off flag" (OOF). The OOF is used to enable or disable access to a particular cacheline even when the valid bit of the line is one and there is a match on the tags. By introducing a small set of instructions for turning the OOF bits on and off, we are effectively able to (a) obfuscate the power utilization of the system in a controlled manner and (b) minimize the hardware modifications needed to support the new security feature. All the fields in the conventional cache structure and their functionalities remain the same. For simplicity, we did not show the cache coherence bits field. The OOF bit check happens after the valid bit check; therefore, the Janus architecture adds a single gate delay to the cache structure.

To control the state of each OOF of the cache structure, we introduce two ON-OFF instructions: "cache-block-on-i" and "cache-block-off-i", for controlling the i-th cacheline. By exploiting these two instructions, the effective hit and miss rates of the cache are controlled beyond the normal miss and hit rates of the executing program. This approach gives users program-level access for controlling the desired amount of obfuscation. The random injection of this instruction pair into the base code creates a runtime power profile and timing behavior for the crypto-system that are more resilient to power analysis and fault injection attacks.

A. Runtime ON-OFF Algorithm
Each memory request now has three possible outcomes: (1) program miss, (2) program hit and OOF off, and (3) program hit and OOF on. When the addressed block has not been previously brought into the cache or has been evicted, a cache miss occurs. However, if the block is found in the cache, there is a cache hit. Since cachelines can be disabled through the security policy, certain cache misses are intentional misses (IMs); this is outcome (2). In the case of an IM, the crypto-system follows the same data fetching process (either from lower caches or main memory) as in the case of a genuine cache miss. To make sure that an IM and an actual miss have the same power and latency profiles, the fetched block is placed on top of the old one.

Let us assume that the cache has $n$ data blocks, from $0$ to $n-1$, and that the code to be run consists of $m$ time slots, from $0$ to $m-1$. Turning off a data block increases the power utilization of the crypto-system. The corresponding execution time overhead depends on the number of IMs encountered during program execution. This increase in power can be modeled as a random variable, more specifically as a Poisson random variable, since it depends on the number of ON-OFF instruction pairs executed at runtime. Let $P_i$ denote the increase in power in the crypto-system when the $i$-th cache data block is turned off. Thus, for $P_i$, we have:

$$P_i = A_i \times C, \tag{1}$$

where $A_i$ is the number of active requests on the $i$-th data block during execution and $C$ is a constant value (the power consumed by the crypto-system to bring in data from the Random Access Memory (RAM) instead of the cache). With the Janus caching scheme, the execution of ON-OFF instruction pairs and their effects on the cache miss rate add uncertainty to the power consumption of the crypto-system, and obfuscate the actual program execution power usage profile. As a result, the crypto-system is protected against power analysis attacks. The runtime power utilization uncertainty, or added noise, is a random process.
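The three request outcomes and the power model of Equation 1 can be illustrated with a small behavioral model. The following Python sketch is not the authors' hardware design: the names `JanusCache`, `set_oof`, `access`, and `power_increase` are hypothetical, lines are addressed by tag rather than by index i for brevity, and the OOF bit is assumed to stay off after an intentional miss until it is explicitly turned on again.

```python
from collections import OrderedDict


class JanusCache:
    """Toy fully associative LRU cache with a per-line on-off flag (OOF)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # tag -> OOF state (True = on); insertion order runs from LRU to MRU.
        self.lines = OrderedDict()
        self.intentional_misses = 0

    def set_oof(self, tag, on):
        # Stand-in for cache-block-on-i / cache-block-off-i.
        # Assumption: the line is named by its tag instead of its index i.
        if tag in self.lines:
            self.lines[tag] = on

    def access(self, tag):
        if tag not in self.lines:
            # Outcome (1): genuine miss; evict the LRU line if the cache is full.
            if len(self.lines) >= self.num_blocks:
                self.lines.popitem(last=False)
            self.lines[tag] = True
            return "miss"
        if not self.lines[tag]:
            # Outcome (2): tag match but OOF off, i.e. an intentional miss (IM).
            # The refetched block overwrites the old copy in place.
            self.intentional_misses += 1
            self.lines.move_to_end(tag)
            return "intentional_miss"
        # Outcome (3): tag match and OOF on, i.e. a regular hit.
        self.lines.move_to_end(tag)
        return "hit"


def power_increase(active_requests, fetch_cost):
    """Equation 1: P_i = A_i * C for one turned-off block."""
    return active_requests * fetch_cost
```

In this model, counting the `"intentional_miss"` results for a given block while it is off gives its $A_i$, and `power_increase` then yields the corresponding $P_i$.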
In practical systems, the power consumption is capped (i.e., the second moment of the noise is limited); therefore, the highest uncertainty (i.e., entropy) in the power consumption can be realized with a Gaussian noise model [11]. For this reason, the Janus caching scheme creates Gaussian noise in the power consumption through the random variable $P_i$ and uses it to insert the appropriate number of ON-OFF instruction pairs in the code. Let $n(t)$ denote the amount of Gaussian noise at time slot $t$, which can be modeled by a Gaussian random variable. Because of practical limitations, a pure Gaussian random variable cannot be generated; thus, a pseudo-Gaussian random variable at time slot $t$ is used. To produce $n(t)$, two numbers $U_1$ and $U_2$ are first drawn at random from the range $[0, 1]$. Then $n(t)$ can be produced using the following equations [11]:

$$V_1 = 2U_1 - 1, \quad V_2 = 2U_2 - 1, \quad S = V_1^2 + V_2^2 \ (\text{such that } S \le 1), \quad n(t) = V_1 \sqrt{\frac{-2 \ln S}{S}}. \tag{2}$$

If $S > 1$, we select another $U_1$ and $U_2$ until $S \le 1$ holds. This is the amount of Gaussian noise at time slot $t$ that should be added to the power consumption of the system. Algorithm 1 presents the procedure for approximating the value of $n(t)$ in Equation 2 with the random variables $P_i$ of Equation 1. The algorithm approximates $n(t)$ by exploring the ON and OFF states of the cache data blocks in time slot $t$.

Algorithm 1: Janus ON-OFF Policy in the Time Slot t
1: Compute all the $P_i$ values via Equation 1.
2: Compute the power increase or decrease for all the states of the data blocks.
3: Compute the $n(t)$ value via Equation 2.
4: Among the different ON/OFF states of the data blocks, choose the state whose result is closest to the $n(t)$ computed in the previous step.

It is worth noting that, although we illustrate the Janus architecture with a single-level cache for presentation simplicity, it works in multi-level cache systems as well.

IV. EVALUATIONS
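Before the evaluation, the noise generation of Equation 2 and the state selection in step 4 of Algorithm 1 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `gaussian_noise` and `select_state` are hypothetical, Equation 2 is read as the standard polar method, and the state search is a brute-force enumeration of all $2^n$ ON/OFF combinations, which is only practical for small caches.

```python
import math
import random
from itertools import product


def gaussian_noise(rng):
    """Draw one pseudo-Gaussian sample n(t) with the polar method (Equation 2)."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        # Reject S > 1 as in the text; also reject S == 0, where ln(S) is undefined.
        if 0.0 < s <= 1.0:
            return v1 * math.sqrt(-2.0 * math.log(s) / s)


def select_state(power_increases, target):
    """Pick the ON/OFF state whose added power is closest to the target n(t).

    power_increases[i] is P_i from Equation 1. The returned state is a tuple of
    booleans where True means "block i is turned off". Exhaustive search.
    """
    best_state, best_power = None, None
    for state in product((False, True), repeat=len(power_increases)):
        power = sum(p for p, off in zip(power_increases, state) if off)
        if best_power is None or abs(power - target) < abs(best_power - target):
            best_state, best_power = state, power
    return best_state, best_power
```

A typical use would draw `target = gaussian_noise(random.Random())` once per time slot and pass it to `select_state` together with the current $P_i$ values. Because this sketch only models power added by turning blocks off, a negative target simply selects the all-on state.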
A. Analytical Evaluation
The analytical assessment of the Janus cache architecture focuses on (a) the number of available data blocks to turn ON and OFF in the cache at any given time and (b) the error probability of guessing the consumed power. Let us assume that there are $N$ data blocks available to turn ON and OFF, and that turning off each of them results in a power increase of $P$. During each time slot, there are $N+1$ possible data block states with a power difference of $\frac{NP}{N+1}$ between them. Based on the noise quantization results of [12], the variance of the distance between the pseudo-Gaussian noise produced by turning the data blocks ON and OFF and the Gaussian noise, i.e., $\sigma^2_{distance}$, can be calculated as

$$\sigma^2_{distance} = \frac{N^2 P^2}{12 (N+1)^2}. \tag{3}$$

By increasing the number of available data blocks in the cache structure to be turned ON and OFF, one can decrease the distance between the two noises. Another important metric for evaluating the Janus architecture's performance is the error probability of estimating the crypto-system's power usage under the ON and OFF decisions. Since a Gaussian noise model is used, one can model the error probability of the power estimate as the error rate in an additive white Gaussian noise (AWGN) channel using binary phase-shift keying (BPSK) modulation. Using the same approach as in [12] for the AWGN channel, the error probability of the estimated power, i.e., $Pr_{error}$, can be written as

$$Pr_{error} = \frac{1}{2} \mathrm{erfc}\left(\sqrt{P}\right), \tag{4}$$

where $\mathrm{erfc}$ is the complementary error function, $\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2} \, dt$.

B. Simulation Results
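For reference, the analytical expressions of Equations 3 and 4 can be evaluated numerically with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and Equation 3 is read here as the usual uniform-quantization variance (step squared over 12) with the step $NP/(N+1)$ given in the text. Equation 4 uses the complementary error function from the standard library.

```python
import math


def quantization_variance(n_blocks, p):
    """Equation 3, read as step^2 / 12 with step = N * P / (N + 1) (assumed form)."""
    step = n_blocks * p / (n_blocks + 1)
    return step ** 2 / 12.0


def error_probability(p):
    """Equation 4: Pr_error = 0.5 * erfc(sqrt(P)), the BPSK-over-AWGN error rate."""
    return 0.5 * math.erfc(math.sqrt(p))
```

Sweeping `p` over a range of average power changes with `error_probability` gives a curve of the kind plotted against the average power change in Figure 1b.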
Data blocks are turned on and off in the Janus cache architecture using Algorithm 1. We compare the measured consumed power of the system to the theoretical Gaussian noise model to show that the proposed architecture effectively randomizes the power consumption of the crypto-system. Figure 1a shows the mean distance between the Gaussian noise and the noise produced under the Janus ON-OFF data block policy. By increasing the number of data blocks in the cache structure, the average distance between the produced noise and the Gaussian noise is reduced. Figure 1b shows the error probability of the estimated power usage as a function of the average power change introduced by data blocks being turned ON or OFF. By increasing the number of data blocks and the average power change of each data block, the error probability of the power consumption estimate decreases.

To fully evaluate the concrete and practical implications of the Janus cache architecture on the timing behavior of a crypto-system, we deploy the Janus architecture in a gate-level synthesized version of the ARM-7 processor that is simulated using the XILINX ISIM simulator. The timing behavior of the synthesized ARM-7 on three benchmarks, (1) Fibonacci sequence generator, (2) quick sort, and (3) bubble sort, is extracted and analyzed. The timing behavior results are reported in Figure 2a. Depending on the temporal locality of the program, turning off a single data block in the cache can have a significant effect on the runtime behavior of the system. A powerful insight from this analysis is that even if an attacker identifies the data block with the most effect on the system runtime behavior under one program, this information may not be useful or effective in attacking the same crypto-system running another program. Therefore, moving target security features are also present in the Janus design as a byproduct.

TABLE I: The mean and variance of the runtimes presented in Figure 2a.
Benchmark     Mean execution time (ns)   Execution time variance
Bubble sort   325.8                      547.951
Quick sort    237.26                     313.49
Fibonacci     58.26                      55.67

TABLE II: The mean and variance of the runtimes presented in Figure 2b.
Benchmark     Mean execution time (ns)   Execution time variance   Normal execution time (ns)
Bubble sort   311.86                     122.11                    290.220
Quick sort    238.12                     88.65                     223.660
Fibonacci     58.26                      55.67                     51.740
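As a simple arithmetic aid for reading Table II, the relative slowdown of the mean execution time under the ON-OFF patterns with respect to the normal execution time can be computed directly from the reported values. The Python sketch below only restates the table data; the names `table_ii` and `relative_overhead` are hypothetical, and the overhead metric itself is an illustrative choice rather than one reported in the paper.

```python
# (mean execution time in ns, normal execution time in ns), copied from Table II.
table_ii = {
    "Bubble sort": (311.86, 290.220),
    "Quick sort": (238.12, 223.660),
    "Fibonacci": (58.26, 51.740),
}


def relative_overhead(mean_ns, normal_ns):
    """Fractional increase of the mean runtime over the normal runtime."""
    return (mean_ns - normal_ns) / normal_ns


overheads = {name: relative_overhead(*row) for name, row in table_ii.items()}
```

Printing `overheads` gives one fraction per benchmark, which can be multiplied by 100 to express the slowdown as a percentage.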
To investigate the effect of turning the data blocks on and off on the program execution profile, we create 5 different patterns. Each pattern has 5 time intervals (the first four are 60 nanoseconds long and the last interval runs to the end of the program). For each pattern, at each interval, different cache block sets are turned on and off. Figure 2b shows the results of these experiments. The mean and variance of the runtimes for the benchmarks in Figure 2a are summarized in Table I. Table II presents the mean and variance of the runtimes for the benchmarks under the different ON-OFF patterns. The results show that even under this simple time slicing approach, the Janus architecture scrambles the mean and variance of the program runtime enough to provide strong protection against fault injection attacks.

Fig. 1: Results of the mean distance and error probability. (a) Mean distance between the Gaussian noise and the generated noise (dB) versus the number of data blocks in the cache structure. (b) Error probability of the power estimate versus the average power change due to data blocks being turned ON or OFF, for 8, 10, and 12 data blocks.

Fig. 2: Runtimes (ns) for the Fibonacci, Quick Sort, and Bubble Sort benchmarks under different ON-OFF schemes. (a) Under the random ON-OFF scheme, versus the turned-off block. (b) Under the five predetermined ON-OFF patterns, versus the pattern number.

V. CONCLUSION
In this work, we propose a new caching architecture, called Janus, to enable the randomization of the power consumption in crypto-systems. By obfuscating the runtime power profile, the Janus architecture is able to effectively protect these systems against power analysis and timing behavior attacks. The Janus cache architecture is deployed in a synthesized ARM-7 processor core running three different benchmarks to evaluate (a) the feasibility of the architecture, and (b) its efficacy against the mentioned attacks.

REFERENCES
[1] P. C. Kocher, J. Jaffe, and B. Jun, "Differential power analysis," in Proceedings of the 19th Annual International Cryptology Conference on Advances in Cryptology, ser. CRYPTO '99. London, UK: Springer-Verlag, 1999, pp. 388–397.
[2] P. Kocher, R. Lee, G. McGraw, and A. Raghunathan, "Security as a new dimension in embedded system design," in Proceedings of the 41st Annual Design Automation Conference, ser. DAC '04. New York, NY, USA: ACM, 2004, pp. 753–760.
[3] D. Page and F. Vercauteren, "A fault attack on pairing-based cryptography," IEEE Transactions on Computers, vol. 55, pp. 1075–1080, 2006.
[4] S.-M. Yen, S. Kim, S. Lim, and S. Moon, "A countermeasure against one physical cryptanalysis may benefit another attack," in Proceedings of the 4th International Conference Seoul on Information Security and Cryptology, ser. ICISC '01. London, UK: Springer-Verlag, 2002, pp. 414–427.
[5] M. Tunstall, D. Mukhopadhyay, and S. Ali, "Differential fault analysis of the advanced encryption standard using a single fault," in IFIP International Workshop on Information Security Theory and Practices. Springer, 2011, pp. 224–233.
[6] J. A. Ambrose, R. G. Ragel et al., "RIJID: Random code injection to mask power analysis based side channel attacks," in Design Automation Conference, 2007. DAC'07. 44th ACM/IEEE. IEEE, 2007, pp. 489–492.
[7] J. A. Ambrose, H. Pettenghi, D. Jayasinghe, and L. Sousa, "Randomised multi-modulo residue number system architecture for double-and-add to prevent power analysis side channel attacks," IET Circuits, Devices & Systems, vol. 7, no. 5, pp. 283–293, 2013.
[8] S. Yang, W. Wolf, N. Vijaykrishnan, D. N. Serpanos, and Y. Xie, "Power attack resistant cryptosystem design: A dynamic voltage and frequency switching approach," in Proceedings of the Conference on Design, Automation and Test in Europe - Volume 3, ser. DATE '05. Washington, DC, USA: IEEE Computer Society, 2005, pp. 64–69.
[9] N. D. P. Avirneni and A. K. Somani, "Countering power analysis attacks using reliable and aggressive designs," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1408–1420, 2014.
[10] M. Masoumi and M. H. Rezayati, "Novel approach to protect advanced encryption standard algorithm implementation against differential electromagnetic and power analysis," IEEE Transactions on Information Forensics and Security, vol. 10, no. 2, pp. 256–265, 2015.
[11] J. Proakis and M. Salehi, Digital Communications. New York: McGraw-Hill, 2008.
[12] H. C. v. Tilborg,