Extract Secrets from Wireless Channel: A New Shape-based Approach

Abstract

Existing secret key extraction techniques use quantization to map wireless channel amplitudes to secret bits. This paper shows that such techniques are highly prone to environmental and local noise effects: they have very high mismatch rates between the two nodes that measure the channel between them. This paper advocates using the shape of the channel instead of the size (or amplitude) of the channel. It shows that this paradigm shift is significantly more robust against environmental and local noise. We refer to this shape-based technique as Puzzle. Implementation on a software-defined radio (SDR) platform demonstrates that Puzzle achieves a 63% reduction in bit mismatch rate relative to the state-of-the-art frequency domain approach (CSI-2bit). Experiments also show that, unlike state-of-the-art received signal strength (RSS)-based methods such as ASBG, Puzzle is robust against an attack in which an eavesdropper can predict the secret bits using planned movements.

arXiv [cs.CR], Jul


1. INTRODUCTION

For wireless communications, there has been great interest in generating shared secrets from the physical layer as a complementary approach to traditional cryptographic methods. The interest stems from the open nature of the wireless medium and the infrastructure constraints associated with key management in mobile scenarios. There are two main approaches to secret sharing in wireless networks. One is based on the information-theoretic principle of exploiting the secrecy capacity between Alice and Bob compared to Alice and Eve [1]. The main drawback of this approach is that secrecy depends on rather strong assumptions about eavesdropper capability. Equally importantly, even a modest increase in the spatial density of eavesdroppers dramatically harms the secrecy rate of the approach [10].

The other approach is based on channel reciprocity. Channel reciprocity refers to the physical principle whereby near-simultaneous observations of the channel by two communicating parties are identical because the channel paths between them are symmetrical. Figure 1 in Section 3 shows this reciprocity in our testbed. The time for which the wireless channel remains correlated is called the coherence time. By extracting channel state information from the observed signals, Alice and Bob can share bits by transmitting signals to each other within the coherence time. Furthermore, extensive theoretical analysis and experimentation have shown that observations of the wireless channel at locations separated by more than half the wavelength of the carrier are uncorrelated [11]. In the 2.4 GHz ISM band, for instance, at any location farther than 6 cm away from Bob, Eve will observe Alice's signal through an uncorrelated channel. Channel reciprocity and spatial decorrelation together make the wireless channel an excellent random source for generating shared secret keys.

There is significant prior work that exploits channel reciprocity for secret extraction. One set of techniques uses the received signal strength (RSS) as the secret source [2, 4, 8, 9, 15]. These techniques measure the received signal strength over different coherence times to generate a sequence of received signal strengths. They choose a threshold and transform the signal strength sequence into 1s (if above the threshold) and 0s (if below the threshold). The largest drawback of RSS-based techniques is that an attacker can easily introduce large variations by blocking the transmission every now and then. This makes the secret predictable, since the attacker knows the exact moments at which the signal-to-noise ratio (SNR) will drop or increase. Section 4 presents this attack and demonstrates the vulnerability. Even without malicious attackers, some unintentional regular activities can make the variation public. For example, the SNR in a corridor of a classroom building would be much lower after class than during class.

Another set of techniques uses the fine-grained temporal [13, 7, 6, 14] or frequency [5] components contained in received signals as the secret source. The temporal techniques use ultra-wideband transmissions (≈ GHz bandwidth) to capture this fine-grained temporal information. Therefore, these techniques are not applicable to narrowband systems such as Wi-Fi (with only 20 MHz bandwidth). A further challenge for temporal techniques is that temporal information is sensitive to sampling offset, which leads to a high rate of secret disagreement. In contrast, the frequency technique of Liu et al. [5] is applicable to narrowband systems and is not sensitive to sampling offset. The authors quantize the frequency response in each OFDM subcarrier and map the quantized values to secret bits. In Section 4, we dispute the authors' claim of a high secrecy rate and show that the secrets generated by their method are very limited.

Overall, this paper takes the stand that the amplitude (the size) of a signal, in time or frequency, is prone to perturbations from the environment as well as hardware imperfections. This leads to quantization errors at nodes and a high mismatch in the secret bits generated by wireless nodes. Instead, this paper proposes to use the shape of a signal to deduce secret bits. Specifically, we make the following contributions.

• We propose and implement a shape-based secret extraction algorithm called Puzzle that we show to be robust to noise and device imperfections. In particular, no online or offline device calibration is required to use our algorithm.
• We prove that the power spectral density (PSD) of random data can be used to extract the channel state information. This implies that no modification is needed at the higher layers of the wireless communication stack, such as transmitting special training data. Two communicating parties can successfully extract secrets from the received packets as long as they exchange data packets within their coherence time.
• Our experiments show that Puzzle produces a 5-bit secret per packet and achieves a 63% reduction in bit mismatch rate relative to the frequency domain approach mentioned above.

2. SYSTEM MODEL

Consider two wireless nodes, Alice and Bob, that wish to create a shared secret $S$ within a coherence time, during which the channel is stable. An adversary, Eve, eavesdrops on the communication between Alice and Bob. Our goal is to develop a secret extraction algorithm that introduces as little communication and computation overhead as possible and ensures that Eve obtains little information about $S$.

Assume Alice and Bob operate in a Time-Division Duplexing (TDD) system. If they talk to each other within the coherence time, the observed signals of Alice and Bob are represented by

$$y_A(t) = (h * x_A)(t) + n_A(t) \tag{1}$$

$$y_B(t) = (h * x_B)(t) + n_B(t) \tag{2}$$

where $h(t)$ is the channel impulse response, which is identical in both directions by virtue of channel reciprocity, $x_A$ and $x_B$ are the signals transmitted by Alice and Bob respectively, $n_A(t)$ and $n_B(t)$ are additive white Gaussian noise with the same variance $N$, and "$*$" indicates convolution. In the frequency domain, the equations above are rewritten as

$$Y_A(f) = H(f) \cdot X_A(f) + N_A(f), \quad -\tfrac{W}{2} + f_c < f < \tfrac{W}{2} + f_c \tag{3}$$

$$Y_B(f) = H(f) \cdot X_B(f) + N_B(f), \quad -\tfrac{W}{2} + f_c < f < \tfrac{W}{2} + f_c \tag{4}$$

where $W$ is the transmission bandwidth, $f_c$ is the center frequency, and $H(f)$ is the channel frequency response.

In this section, we propose two ways to extract the channel frequency response $H(f)$.

• Direct calculation: By using pre-defined training signals or decoding the received signals, Alice and Bob know the frequency components $X_A(f)$ and $X_B(f)$ of the transmitted signals. Therefore, they can calculate $H(f)$ easily, assuming that noise can be ignored.
• PSD based method: Let $\{x_0, x_1, \ldots, x_{N-1}\}$ be a complex sample sequence. Since the sequence is stationary and random, the auto-correlation of the sequence is

$$R(t_1, t_2) = \frac{P}{N} \times \delta(t_1 - t_2) \tag{5}$$

where $P$ is the power contained in the signal sequence. Then, the PSD of the sequence is

$$F[R(\tau)] = \int_{-\infty}^{+\infty} \frac{P}{N} \times \delta(\tau)\, e^{-j\omega\tau}\, d\tau = \frac{P}{N} \tag{6}$$

From Equation 6, we know that

$$X_A(f) = \frac{P_A}{W}, \quad X_B(f) = \frac{P_B}{W} \tag{7}$$

Combining Equations 3 through 7 we get

$$Y_A(f) \approx H(f) \cdot \frac{P_A}{W} + N, \quad Y_B(f) \approx H(f) \cdot \frac{P_B}{W} + N \tag{8}$$

According to the above equations, we conclude that the PSD of $y_A(t)$ is the same as that of $y_B(t)$ as long as $P_A = P_B$. It is worth noting that even if $P_A \neq P_B$, the shapes of Alice's and Bob's PSDs are still similar. This property is remarkable because it can be extended to the case in which Alice and Bob experience different levels of transmission power, noise, or cross-band interference. Even in such cases, the shapes still do not change significantly.

Eve is motivated to derive the shared secret generated by Alice and Bob. There are two main ways of achieving this.

• Eve can attempt to derive $Ch_{AB}$ from $Ch_{AE}$ or $Ch_{BE}$, where $Ch_{AB}$, $Ch_{AE}$, and $Ch_{BE}$ denote the channel from Alice to Bob, Alice to Eve, and Bob to Eve, respectively. This may be possible if Eve has full knowledge of the environment. In general, however, full knowledge of the environment is a rather unrealistic assumption, so we do not regard it as the main threat to our system. Instead, we focus on the threat of spatial correlation of the secrets produced by our algorithm. We assume that Eve cannot stalk Alice or Bob closely enough to come within half a wavelength of either of them. This assumption is reasonable since close eavesdroppers run a high risk of exposure. Recall that theory [11] supports that channels decorrelate beyond half a wavelength.
• Eve can move in between Alice and Bob to block and unblock their transmissions. Planned movements can thus introduce predictable increases or decreases of RSS at Alice and Bob. Note that while this attack is harmful to RSS-based methods, Eve cannot predict the impact of a planned move on the frequency response of the channel without full knowledge of the environment.
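The PSD-based extraction described earlier in this section can be illustrated with a small simulation. The following Python sketch is not the authors' implementation: the channel taps, the noise levels, the FFT size of 256, and the helper names `qpsk`, `observe`, and `psd_db` are all assumptions chosen for illustration. It passes random QPSK symbols with different transmit powers through the same channel $h$, as in Equations 1 and 2, and estimates each PSD by averaging periodograms.

```python
import numpy as np

def qpsk(n, power, rng):
    """Random QPSK symbols with the given average power."""
    k = rng.integers(0, 4, size=n)
    return np.sqrt(power) * np.exp(1j * (np.pi / 4 + np.pi / 2 * k))

def observe(h, power, noise_var, n, rng):
    """y = (h * x) + noise for random data x, following Equations 1 and 2."""
    x = qpsk(n, power, rng)
    m = n + len(h) - 1
    noise = np.sqrt(noise_var / 2) * (
        rng.standard_normal(m) + 1j * rng.standard_normal(m)
    )
    return np.convolve(h, x) + noise

def psd_db(y, nfft=256):
    """PSD estimate in dB: average of periodograms over non-overlapping blocks."""
    blocks = y[: len(y) // nfft * nfft].reshape(-1, nfft)
    spec = np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / nfft
    return 10 * np.log10(np.fft.fftshift(spec))

rng = np.random.default_rng(0)
h = np.array([1.0, 0.5 - 0.3j, 0.0, 0.25j])  # hypothetical channel taps

# Same channel, different transmit power and noise level at each end.
psd_a = psd_db(observe(h, power=1.0, noise_var=0.01, n=10240, rng=rng))
psd_b = psd_db(observe(h, power=4.0, noise_var=0.02, n=10240, rng=rng))

offset_db = np.mean(psd_b - psd_a)            # level difference between the ends
shape_corr = np.corrcoef(psd_a, psd_b)[0, 1]  # similarity of the two shapes
print(f"offset = {offset_db:.2f} dB, shape correlation = {shape_corr:.3f}")
```

Under these assumptions the two PSD estimates differ mainly by a roughly constant level offset set by the power ratio, while their shapes stay strongly correlated, which is the property that Equation 8 describes.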

3. SECRET GENERATION

After getting the frequency response curve of the received samples, we smooth the curve, encode the smoothed curve by segmenting it into several pieces, and then map each of the pieces to one of the patterns from a predetermined set. Note from Fig. 1 that although channel reciprocity is clearly apparent to the naked eye, the frequency response curves are more or less shifted or zoomed versions of each other at corresponding frequencies. Moreover, distinct local fluctuations exist. These discrepancies are unavoidable because they result spontaneously from hardware imperfections and environmental interference. This shows that direct quantization and mapping of the frequency response can lead to high mismatch rates. We therefore develop a shape-based approach to solve the encoding problem.

Figure 1: Lowess curves derived by (a) Alice and (b) Bob, plotted as amplitude (dBm) versus frequency (Hz) together with the raw frequency spectrum. Lowess curves are much more similar to each other than the original PSD curves as local variations are removed.

Algorithm 1: CurveCoding

Input: complex samples a[0, ..., n]; number of segments m
Output: code [C_1, C_2, ..., C_m]
Initialization:
    divide a[0, ..., n] into m segments b_1, b_2, ..., b_m
    peak = Max1(a[0, ..., n]) − Min1(a[0, ..., n])
    PatternGeneration(⌊n/m⌋, m, peak): generate 3 patterns of size ⌊n/m⌋: p_1, p_2, p_3
for i ← 1 to m do
    temp = ∞
    for j = 1 → 3 do
        dis = Fréchet(b_i, p_j)
        if temp > dis then
            temp = dis
            C_i = j
        end
    end
end

As mentioned above, even though the local details of a pair of power spectral densities differ significantly, channel reciprocity manifests itself in the similarity of the overall shapes of the pair. By plotting smoothed points, conformal information about the overall shape is extracted despite the local variations. In our algorithm, we adopt locally weighted scatterplot smoothing (Lowess) [3], a curve fitting method that calculates each smoothed value by applying locally weighted regression over a span. Fig. 1 depicts two PSD curves obtained by two communicating wireless nodes and their corresponding curves after applying Lowess smoothing with a span of 0.4. From Fig. 1, we can see that the Lowess curves coincide with each other almost exactly and the overall shapes are preserved, even though the original curves differ from each other at most locations.
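The smoothing step can be sketched in a few lines of Python. This is a simplified Lowess, assuming tricube weights and a local linear fit without the robustness iterations of the full method in [3]; the function name `lowess` and the synthetic test curves are hypothetical. Only the span of 0.4 comes from the text.

```python
import numpy as np

def lowess(y, span=0.4):
    """Simplified Lowess: local linear fits with tricube weights.

    The robustness iterations of the full method are omitted (assumption).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    x = np.arange(n, dtype=float)
    k = max(2, int(np.ceil(span * n)))       # neighbours used in each local fit
    out = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argpartition(d, k - 1)[:k]  # indices of the k nearest points
        w = (1 - (d[idx] / d[idx].max()) ** 3) ** 3  # tricube weights
        sw = np.sqrt(w)
        # Weighted least squares for the line beta[0] + beta[1] * x.
        A = np.stack([np.ones(k), x[idx]], axis=1) * sw[:, None]
        beta, *_ = np.linalg.lstsq(A, y[idx] * sw, rcond=None)
        out[i] = beta[0] + beta[1] * x[i]
    return out

# Synthetic example: one underlying shape seen by two nodes with
# independent local fluctuations and a 6 dB level difference.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
base = -60 + 15 * np.sin(2 * np.pi * t)
curve_a = base + rng.normal(0, 5, t.size)
curve_b = base - 6 + rng.normal(0, 5, t.size)

raw_corr = np.corrcoef(curve_a, curve_b)[0, 1]
smooth_corr = np.corrcoef(lowess(curve_a), lowess(curve_b))[0, 1]
print(f"raw correlation = {raw_corr:.3f}, smoothed = {smooth_corr:.3f}")
```

In this synthetic setting the smoothed curves should agree more closely than the raw ones, mirroring the observation about Fig. 1. A library implementation with robustness iterations would be the natural choice outside a sketch.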

By using curve smoothing, we obtain two highly similar curves. To solve the encoding problem, let us first briefly consider several alternative methods: 1) encode in accordance with an approximation function that describes the curve; 2) encode in accordance with the statistical properties of the curve; 3) encode by describing the shape of the response. We adopt the third one for the following reason. As mentioned in Section 3.1, channel reciprocity is readily seen in the similarity of the overall shapes of the curves. Hence, encoding by describing the shape should preserve most of the information shared by the two ends. By contrast, extracting secrets from the statistical properties loses much of the mutual information. The approximation function does not tolerate even small deviations, yet measurement error and interference make such deviations quite common.

Figure 2: An example of curve encoding. The plot shows the frequency spectrum, the Lowess curve, and the block segments as amplitude (dBm) versus frequency (Hz).

Fig. 2 gives an example of curve coding. The curve obtained in a certain band is treated as a block, which can be divided into a varying number of segments of equal length. Each segment is then mapped to one of three curve patterns of the same length, as shown in Fig. 2. These three patterns are indexed as 0, 1, and 2. The three "predetermined" patterns describe the ascending, descending, and steady trends of the curves, respectively. By "predetermined", we mean that the indices and the shapes of the patterns are well known to all wireless nodes. The gradient of the ascending and descending lines, however, is decided by each node according to the maximum and minimum values of the smoothed curve and the length of the segment. We designed pattern generation this way to tolerate measurement errors and different device settings. For example, two communicating nodes may wish to use different tx/rx gains that would amplify the signals differently. Since each pattern is related to the locally received signals, it describes the shape correctly without the need to negotiate with the other node.

We set the gradient of the ascending pattern relative to $\max - \min$, and likewise the gradient of the descending pattern relative to $-(\max - \min)$. The segment is then mapped to the most similar of the three patterns by measuring the discrete Fréchet distance [12] $\delta_{dF}$ between the segment and the patterns, which measures the similarity of two polygonal curves while taking the location and ordering of the points along the curves into consideration. The smaller the distance, the more similar the two curves are. The complete algorithm is presented in Algorithm 1 and Algorithm 2.

Algorithm 2: PatternGeneration

Input: k, m, peak
Output: 3 patterns p_1[1, ..., k], p_2[1, ..., k], p_3[1, ..., k]
for i ← 1 to k do
    p_1[i] = peak × i/k
    p_2[i] = −peak × i/k
    p_3[i] = peak m/ …
end
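The encoding step of Algorithms 1 and 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the steady pattern is taken to be a flat line at zero, each segment is shifted so that it starts at zero before comparison (offset handling is not specified in the text), curves are treated as points (sample index, value), and the function names are hypothetical. The ascending and descending ramps follow Algorithm 2, and the codes 0, 1, and 2 follow the pattern indices given in the text.

```python
import numpy as np

def discrete_frechet(p, q):
    """Discrete Fréchet distance between polygonal curves p and q, shape (n, 2)."""
    n, m = len(p), len(q)
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(best_prev, d[i, j])
    return ca[-1, -1]

def pattern_generation(k, peak):
    """Ascending, descending, and steady patterns of length k."""
    i = np.arange(1, k + 1)
    steady = np.zeros(k)  # assumption: flat line at zero
    return [peak * i / k, -peak * i / k, steady]

def curve_coding(curve, m):
    """Map each of m equal-length segments to pattern index 0, 1, or 2."""
    curve = np.asarray(curve, dtype=float)
    k = len(curve) // m
    peak = curve.max() - curve.min()
    patterns = pattern_generation(k, peak)
    x = np.arange(k, dtype=float)
    codes = []
    for s in range(m):
        seg = curve[s * k:(s + 1) * k]
        seg = seg - seg[0]  # assumption: anchor each segment at zero
        seg_pts = np.column_stack([x, seg])
        dists = [discrete_frechet(seg_pts, np.column_stack([x, p]))
                 for p in patterns]
        codes.append(int(np.argmin(dists)))
    return codes

# Synthetic smoothed curve: rise, plateau, fall, plateau (values in dBm).
curve = np.concatenate([
    np.linspace(-80, -50, 50), np.full(50, -50.0),
    np.linspace(-50, -80, 50), np.full(50, -80.0),
])
codes = curve_coding(curve, m=4)
print(codes)
```

Because the ramps are scaled by each node's own peak-to-peak range, a copy of the curve with a different gain and offset yields the same code in this sketch, which is the tolerance to device settings that the text describes.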

4. EXPERIMENTAL VALIDATION

In this section we study four important metrics to measure the performance of Puzzle.

• Entropy: Entropy measures the unpredictability of a random variable $X$. It is defined as

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$$

where $x_1, \cdots, x_n$ are the possible values of $X$.
• Bit Mismatch Rate: Bit mismatch rate is defined as the ratio of the number of bits that do not match between Alice and Bob to the number of bits extracted from the shape of the spectrum.
• Correlation: Correlation $\rho_{x,y}$ is defined as

$$\rho_{x,y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

We use correlation to measure the dependence of the codes generated by Puzzle on the distance between Bob and Eve.
• Leakage: Letting $p_{mis}$ be the mismatch rate between Alice and Eve, we define the leakage between them as

$$\text{leakage} = \begin{cases} 1 - \dfrac{p_{mis}}{0.5} & \text{if } p_{mis} < 0.5 \\ 0 & \text{otherwise} \end{cases}$$
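The four metrics above are straightforward to compute from extracted code sequences. The following Python sketch uses hypothetical function names and toy sequences. The entropy is taken in bits (base-2 logarithm), and the leakage is read as $1 - p_{mis}/0.5$ for $p_{mis} < 0.5$ and 0 otherwise, which is an interpretation of the definition above.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical entropy of a symbol sequence, in bits."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())

def mismatch_rate(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mx) ** 2 for xi in x)
                    * sum((yi - my) ** 2 for yi in y))
    return num / den

def leakage(p_mis):
    """Leakage as read from the definition above (interpretation)."""
    return 1 - p_mis / 0.5 if p_mis < 0.5 else 0.0

# Toy codes over the three pattern indices 0, 1, 2.
alice = [0, 1, 2, 2, 0, 1, 2, 0]
bob = [0, 1, 2, 1, 0, 1, 2, 0]

print(entropy_bits(alice))        # unpredictability of Alice's code
print(mismatch_rate(alice, bob))  # disagreement between Alice and Bob
print(correlation(alice, bob))    # dependence between the two codes
print(leakage(0.25))              # leakage at a 25% mismatch rate
```

Under this reading, a mismatch rate of 0.5, which corresponds to random guessing on binary values, gives zero leakage, and a mismatch rate of 0 gives a leakage of 1.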

The measurement environment is a lab containing 6 cubicles. Data were collected during daytime (from 7:00 am to 6:00 pm). Human activities introduced a certain level of interference in the channel, but generally speaking, the environment is quite stable. We conducted the experiment in such a stable environment because we wanted to see the performance comparisons clearly without risking mismatches caused by changes of the channel itself. In theory, further implementation in a mobile environment would give both a higher mismatch rate and a higher secret bit extraction rate.

The communication system consists of three software-defined transceivers. Each of their RF chains contains an XCVR2450 (RF front end), an NI-5781 (data converter module), and an NI PXIe-7965R (a Xilinx Virtex-5 FPGA). Two of the three transceivers transmit at 2.45 GHz with 20 MHz bandwidth. We call these two transceivers Alice and Bob. The third transceiver, Eve, overhears the communication. During reception, each transceiver records the I and Q samples at a sampling rate of 100 MHz and downconverts them to baseband. The received samples are then sent to the NI PXIe-8133, an RTOS-based controller, through two direct-memory-access (DMA) channels, which have a data streaming rate as high as 800 MB/s. Except for the experiment in Section 4.2.1, all the results of Puzzle are obtained from the PSD of 10240 received samples with QPSK modulation.

We first compare Puzzle with the frequency domain secret key generation method with 2-bit quantization [5], which in the rest of this paper we refer to as CSI-2bit. We choose CSI-2bit as the basis for bit mismatch rate and entropy comparison because, to the best of our knowledge, it achieves the highest bit generation rate along with a low mismatch rate. Coarse-grained methods such as RSS-based ones achieve only 1 ∼ …

Figure 3: Bit Mismatch Rate and Entropy of Puzzle and CSI-2bit versus extracted bits per packet. (a) Mismatch rate (%) at different bit generation rates. (b) Puzzle produces a comparable amount of entropy (bits) to CSI-2bit.

Figure 4: Deployment and result of correlation experiment. (a) Deployment of the correlation experiment (D = 2 m, r = 5 cm~45 cm). (b) The correlation of two codes generated by Puzzle relative to distance (cm).

To evaluate the resistance to an eavesdropping attacker, we measure the correlation of bits generated by two receivers at different distances. We performed an experiment in which we fixed the distance between one transmitter and one receiver, and then placed another receiver at a certain distance away from the first receiver along 6 orientations, as shown in Fig. 4a. Each frequency response curve is segmented into 4 pieces. We measured the correlation between the codes produced by the two receivers at distances ranging from 5 cm to 45 cm. To be more specific, we measured the correlation of 6 pairs of locations by fixing the first receiver and moving the second one in steps of 60° at each distance. Figure 4b shows the result. We see that the correlation decreases rapidly as the distance between the two receivers increases. In practice, it is reasonable to assume that eavesdroppers are more than one meter away; otherwise they run a high risk of exposure. Therefore Puzzle is robust against eavesdropping.

Figure 5: Performance: Leakage. (a) An object moves along a track between Alice and Bob every 30 seconds, following a certain temporal pattern, while Eve overhears the transmission from Alice to Bob. (b) Leakage of Puzzle and ASBG relative to the distance between Bob and Eve. Puzzle has a stable low leakage rate irrespective of the distance. (c) Non-leaked secret bits produced by Puzzle and ASBG per packet, relative to the distance between Bob and Eve.

To validate the resistance to the planned-movement attacker (cf. Section 2), we compared the leakage performance of the state-of-the-art RSS-based method ASBG with that of Puzzle by moving an object across the transmission path between Alice and Bob while placing an eavesdropper near Bob, as shown in Figure 5a. Since ASBG, like many other RSS-based methods, asks the two communicating ends to drop some RSS values based on certain thresholds and to exchange the indices of those values, Eve knows exactly which RSS probes are used by Bob but dropped by herself. In this case, we assume that Eve makes a random guess as to the quantization result, with a success rate of 50%. We calculate the mismatch rate between Eve's and Bob's bits as the combination of the actual mismatch rate between them and the failure rate of the random guess. Again, we segment the frequency response curves into four pieces.

Fig. 5b shows the leakage of our algorithm against that of ASBG over distances from 10 cm to 50 cm. It is clear that Puzzle is much less sensitive to the threat of planned movement. Furthermore, due to the fact that Puzzle has a much higher secret generation rate (4 ∗ log(3) ≈ …

References

[1] Argyraki, K., et al. Creating secrets out of erasures. In MobiCom ’13.
[2] Azimi-Sadjadi, et al. Robust key generation from signal envelopes in wireless networks. In CCS ’07.
[3] Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74 (1979), 829–836.
[4] Jana, S., et al. On the effectiveness of secret key extraction from wireless signal strength in real environments. In MobiCom ’09.
[5] Liu, H., et al. Fast and practical secret key extraction by exploiting channel response. In INFOCOM ’13.
[6] Madiseh, M. G., et al. Verification of secret key generation from UWB channel observations. In ICC ’09.
[7] Madiseh, M. G., et al. Secret key generation and agreement in UWB communication channels. pp. 1842–1846.
[8] Mathur, S., et al. Radio-telepathy: extracting a secret key from an unauthenticated wireless channel. In MobiCom ’08.
[9] Patwari, N., et al. High-rate uncorrelated bit extraction for shared secret key generation from channel measurements. IEEE Transactions on Mobile Computing 9, 1 (Jan. 2010), 17–30.
[10] Pinto, P. C., et al. Wireless physical-layer security: The case of colluding eavesdroppers. In ISIT ’09.
[11] Rappaport, T. Wireless Communications: Principles and Practice, 2nd ed. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2001.
[12] Wien, T. U., et al. Computing discrete Fréchet distance. Tech. rep., 1994.
[13] Wilson, R., et al. Channel identification: Secret sharing using reciprocity in UWB channels. IEEE Transactions on Information Forensics and Security (2007), 364–375.
[14] Ye, C., et al. On the secrecy capabilities of ITU channels. In VTC Fall (2007), IEEE, pp. 2030–2034.
[15] Zeng, K., et al. Exploiting multiple-antenna diversity for shared secret key generation in wireless networks. In