Nested Tailbiting Convolutional Codes for Secrecy, Privacy, and Storage
Thomas Jerkovits, Onur Günlü, Vladimir Sidorenko, Gerhard Kramer
Thomas Jerkovits [email protected] Aerospace CenterWeçling, Germany
Onur Günlü [email protected] BerlinBerlin, Germany
Vladimir SidorenkoGerhard Kramer [email protected]@tum.deTU MunichMunich, Germany
ABSTRACT
A key agreement problem is considered that has a biometric or physical identifier, a terminal for key enrollment, and a terminal for reconstruction. A nested convolutional code design is proposed that performs vector quantization during enrollment and error control during reconstruction. Physical identifiers with small bit error probability illustrate the gains of the design. One variant of the nested convolutional codes improves on the best known key vs. storage rate ratio but has high complexity. A second variant with lower complexity performs similarly to nested polar codes. The results suggest that the choice of code for key agreement with identifiers depends primarily on the complexity constraint.
CCS CONCEPTS
• Security and privacy → Information-theoretic techniques.

KEYWORDS
nested codes, information privacy, tailbiting, convolutional codes, physical unclonable functions
ACM Reference Format:
Thomas Jerkovits, Onur Günlü, Vladimir Sidorenko, and Gerhard Kramer. 2020. Nested Tailbiting Convolutional Codes for Secrecy, Privacy, and Storage. In Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec '20), June 22–24, 2020, Denver, CO, USA. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3369412.3395063
Irises and fingerprints are biometric identifiers used to authenticate and identify individuals, and to generate secret keys [4]. In a digital device, there are digital circuits that have outputs unique to the device. One can generate secret keys from such physical unclonable functions (PUFs) by using their outputs as a source of randomness. Fine variations of ring oscillator (RO) outputs, the start-up behavior of static random access memories (SRAM), and quantum-physical readouts through coherent scattering [37] can serve as PUFs that have reliable outputs and high entropy [11, 18]. One can consider
IH&MMSec '20, June 22–24, 2020, Denver, CO, USA. © 2020 Association for Computing Machinery. ACM ISBN 978-1-4503-7050-9/20/06. https://doi.org/10.1145/3369412.3395063

them as physical "one-way functions" that are easy to compute and difficult to invert [33].

There are several security, privacy, storage, and complexity constraints that a PUF-based key agreement method should fulfill. First, the method should not leak information about the secret key (negligible secrecy leakage). Second, the method should leak as little information about the identifier as possible (minimum privacy leakage). The privacy leakage constraint can be considered as an upper bound on the secrecy leakage, via the public information of the first enrollment of a PUF, about the secret key generated by the second enrollment of the same PUF [12]. Third, one should limit the storage rate because storage can be expensive and limited, e.g., for internet-of-things (IoT) device applications. Similarly, the hardware cost, e.g., hardware area, of the encoder and decoder used for key agreement with PUFs should be small for such applications.

There are two common models for key agreement: the generated-secret (GS) and the chosen-secret (CS) models. An encoder extracts a secret key from an identifier measurement for the GS model, while for the CS model a secret key that is independent of the identifier measurements is given to the encoder by a trusted entity. In the classic key-agreement model introduced in [1] and [31], two terminals observe correlated random variables and have access to a public, authenticated, and one-way communication link; an eavesdropper observes only the public messages called helper data. The regions of achievable secret-key vs. privacy-leakage (key-leakage) rates for the GS and CS models are given in [19, 26].
The storage rates for general (non-negligible) secrecy-leakage levels are analyzed in [23], while the rate regions with multiple encoder and decoder measurements of a hidden source are treated in [16]. There are other key-agreement models with an eavesdropper that has access to a sequence correlated with the identifier outputs, e.g., in [6, 8, 12, 22]. This model is not realistic for PUFs, unlike physical-layer security primitives and some biometric identifiers that are continuously available for physical attacks. PUFs are used for on-demand key reconstruction, i.e., the attack should be performed during execution, and an invasive attack applied to obtain a correlated sequence permanently changes the identifier output [11, 13]. Therefore, we assume that the eavesdropper cannot obtain a sequence correlated with the PUF outputs.

Two classic code constructions for key agreement are code-offset fuzzy extractors (COFE) [10] and the fuzzy commitment scheme (FCS) [21], which are based on a one-time padding step in combination with an error correcting code. Both constructions require a storage rate of 1 bit/symbol due to the one-time padding step. A Slepian-Wolf (SW) [38] coding method, which corresponds to syndrome coding for binary sequences, is proposed in [5] to reduce the storage rate so that it is equal to the privacy-leakage rate. It is shown in [14] that these methods do not achieve the key-leakage-storage boundaries of the GS and CS models.

Wyner-Ziv (WZ) [42] coding constructions that bin the observed sequences are shown in [14] to be optimal deterministic code constructions for key agreement with PUFs. Nested random linear codes are shown to asymptotically achieve boundary points of the key-leakage-storage region.
A second WZ-coding construction uses a nested version of polar codes (PCs) [3], which are designed in [14] for practical SRAM PUF parameters to illustrate that rate tuples that cannot be achieved by using previous code constructions can be achieved by nested PCs.

A problem closely related to the key agreement problem is Wyner's wiretap channel (WTC) [41]. The main aim in the WTC problem is to hide a transmitted message from an eavesdropper that observes a channel output correlated with the observation of a legitimate receiver. There are various code constructions for the WTC that achieve the secrecy capacity, e.g., in [2, 25, 28, 30], and some of these constructions use nested PCs, e.g., [2, 28]. Similarly, nested PCs are shown in [7] to achieve the strong coordination capacity boundaries, defined and characterized in [9].

We design codes for key agreement with PUFs by constructing nested convolutional codes. Due to the broad use of nested codes in, e.g., WTC and strong coordination problems, the proposed nested convolutional code constructions can also be useful for these problems. A summary of the main contributions is as follows.
• We propose a method to obtain nested tailbiting convolutional codes (TBCCs) that are used as a WZ-coding construction, which is a binning method used in various achievability schemes and can be useful for various practical code constructions.
• We develop a design procedure for the proposed nested convolutional code construction adapted to the problem of key agreement with biometric or physical identifiers. This is an extension of the asymptotically optimal nested code constructions with random linear codes and PCs proposed in [14]. We consider binary symmetric sources and binary symmetric channels (BSCs). Physical identifiers such as RO PUFs with transform coding [15] and SRAM PUFs [29] are modeled by these sources and channels.
• We design and simulate nested TBCCs for practical source and channel parameters obtained from the best PUF design in the literature. The design targets a small block-error probability P_B and a secret-key size of 128 bits. We illustrate that one variant of the nested codes achieves the largest key vs. storage rate ratio but has high decoding complexity. Another variant of the nested codes with lower decoding complexity achieves a rate ratio that is slightly greater than the rate ratio achieved by a nested PC. We also illustrate the gaps to the finite-length bounds.

This paper is organized as follows. In Section 3, we describe the GS and CS models and give their rate regions, which are also evaluated for binary symmetric sequences. We summarize in Section 4 our new nested code construction that uses convolutional codes. In Section 5, we propose a design procedure for the new nested TBCCs adapted to the key agreement with PUFs problem. Section 6 compares the estimated decoding complexity of TBCCs and PCs. Section 7 illustrates the significant gains from nested convolutional codes designed for practical PUF parameters, as compared to previously-proposed nested PCs and other channel codes, in terms of the key vs. storage rate ratio.

Let F denote the finite field of order 2 and let F^(a×b) denote the set of all a × b matrices over F. Rows and columns of a × b matrices are indexed by 1, ..., a and 1, ..., b, and h_{i,j} is the element in the i-th row and j-th column of a matrix H. F^a denotes the set of all row vectors of length a over F. With 0_{a×b} we denote the all-zero matrix of size a × b. A linear block code over F of length N and dimension K is a K-dimensional subspace of F^N, denoted by (N, K). A variable with superscript denotes a string of variables, e.g., X^n = X_1 ... X_i ... X_n, and a subscript denotes the position of a variable in a string. A random variable X has probability distribution P_X.
Calligraphic letters such as X denote sets, and set sizes are written as |X|. Enc(·) is an encoder mapping and Dec(·) is a decoder mapping. H_b(x) = −x log x − (1−x) log(1−x) is the binary entropy function, where we take logarithms to the base 2. The ∗-operator is defined as p ∗ x = p(1−x) + (1−p)x. A BSC with crossover probability p is denoted by BSC(p). X^n ∼ Bern^n(α) is an independent and identically distributed (i.i.d.) binary sequence of random variables with Pr[X_i = 1] = α for i = 1, 2, ..., n. H^T represents the transpose of the matrix H. Drawing an element e from a set E uniformly at random is denoted by e ←$ E.

Denote the parameters of a block code generated by a binary convolutional encoder as (N, K), where N is the blocklength and K is the code dimension (in bits). At each time step, the convolutional encoder receives k input bits and generates n output bits. The number of clock cycles needed to encode K bits is

ℓ = K/k.    (1)

We consider convolutional encoders with a single shift register only. The shift register consists of m delay cells, where m is also called the memory of the encoder. The bit value stored in the i-th delay cell at time step t is denoted by s^(i)_t ∈ F for i = 1, ..., m. For a given binary input vector u_t = (u^(1)_t, u^(2)_t, ..., u^(k)_t) of length k at time step t, the encoder outputs a binary vector c_t = (c^(1)_t, c^(2)_t, ..., c^(n)_t) of length n. The encoder can be described by the state-space representation of the encoder circuit such that the output c_t is

c_t = s_t · C^T + u_t · D^T    (2)

where s_t = (s^(1)_t, s^(2)_t, ..., s^(m)_t) is the vector describing the content of the shift register, C ∈ F^(n×m) is the observation matrix, and D ∈ F^(n×k) is the transition matrix. The content of the shift register for the next clock cycle at time step t + 1 is

s_{t+1} = s_t · A^T + u_t · B^T    (3)

where A ∈ F^(m×m) is the system matrix and B ∈ F^(m×k) is the control matrix. For the case of a single shift register, the system matrix is given by

A = [ 0_{1×(m−1)}  0 ; I_{(m−1)×(m−1)}  0_{(m−1)×1} ]    (4)

where I_{(m−1)×(m−1)} ∈ F^((m−1)×(m−1)) is the identity matrix. For simplicity, the first entry of the input tuple, u^(1)_t, is always an input to the shift register, and thus we can write B = (e^T | B̃) and D = (0_{n×1} | D̃), where e is the unit row vector having a 1 in the first position and 0 everywhere else, B̃ ∈ F^(m×(k−1)), and D̃ ∈ F^(n×(k−1)). The corresponding encoder circuit is shown in Figure 1. Elements of a vector entering a square box, which represents one of the aforementioned matrices, depict a vector-matrix multiplication, and a box with the addition symbol depicts an elementwise vector-vector addition. Therefore, the encoder of the convolutional code can be described by the three matrices B̃, C, and D̃. We denote such an encoder by [B̃, C, D̃].

Figure 1: Encoder circuit of convolutional codes described in Section 2.2.

Using the tailbiting method from [20, Chapter 4.8], we avoid a rate loss, unlike the zero-tail termination method. We have N = ℓn and the resulting code rate is R = k/n. A tailbiting convolutional code (TBCC) can be represented by a tailbiting trellis using ℓ sections and 2^m states per section. The codewords correspond to all paths in the trellis whose starting and ending states coincide. TBCCs can be decoded by using the wrap-around Viterbi algorithm (WAVA) [36]. This decoder is suboptimal but performs close to the maximum likelihood decoder.

Let A_d be the number of codewords of Hamming weight d for d = 0, 1, ..., N, which characterizes the distance spectrum of a TBCC. The weight enumerator polynomial A(X) is then defined as

A(X) := Σ_{d=0}^{N} A_d X^d.    (5)
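As a concrete illustration of the state-space description in (2) and (3), the following sketch encodes a few bits with a toy memory-2, rate-1/2 feedforward encoder (the well-known (7,5) encoder in octal). This encoder is an assumption chosen purely for illustration, not one of the paper's designed codes, and it uses the general form of (2)-(3) with a nonzero direct feedthrough matrix D:

```python
import numpy as np

def conv_encode_step(s, u, A, B, C, D):
    """One clock cycle over F2: output c_t = s_t C^T + u_t D^T,
    next state s_{t+1} = s_t A^T + u_t B^T, as in (2) and (3)."""
    c = (s @ C.T + u @ D.T) % 2
    s_next = (s @ A.T + u @ B.T) % 2
    return c, s_next

# Toy encoder: k = 1, n = 2, memory m = 2, generators (7,5) in octal.
m, k, n = 2, 1, 2
A = np.array([[0, 0],
              [1, 0]])          # shift matrix, cf. (4)
B = np.array([[1], [0]])        # the input feeds the shift register
C = np.array([[1, 1],           # state taps for output 1: D + D^2
              [0, 1]])          # state taps for output 2: D^2
D = np.array([[1], [1]])        # direct input-to-output connections

s = np.zeros(m, dtype=int)
out = []
for bit in [1, 0, 1, 1]:
    c, s = conv_encode_step(s, np.array([bit]), A, B, C, D)
    out.extend(int(b) for b in c)
print(out)   # → [1, 1, 1, 0, 0, 0, 0, 1]
```

The same step function works for any [B̃, C, D̃] encoder by assembling B and D from their column decompositions.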
To compute the weight enumerator and to determine the distance spectrum, we use the approach described in [40]. Consider the state transition matrix T(X) of size 2^m × 2^m, where every entry t_{i,j}(X) is either X^d, where d is the Hamming weight of the output produced by the encoder when going from the state labeled i to the state labeled j, or 0 if there is no transition between these states. Therefore, we have

A(X) = Tr(T^ℓ(X))    (6)

where T^ℓ(X) denotes the ℓ-fold multiplication of the matrix T(X) with itself and Tr(·) denotes the trace.

Consider the GS model in Figure 2(a), where a biometric or physical source output is used to generate a secret key. The source X, noisy measurement Y, secret key S, and storage W alphabets are finite sets. During enrollment, the encoder observes the i.i.d. identifier output X^N, generated according to some P_X, and computes a secret key S ∈ S and public helper data W ∈ W as (S, W) = Enc(X^N). During reconstruction, the decoder observes a noisy measurement Y^N of the source output X^N through a memoryless measurement channel P_{Y|X}, in addition to the helper data W. The decoder estimates the secret key as Ŝ = Dec(Y^N, W). Furthermore, Figure 2(b) shows the CS model, where a secret key S′ ∈ S is embedded into the helper data as W′ = Enc(X^N, S′). The decoder for the CS model estimates the secret key as Ŝ′ = Dec(Y^N, W′).

Figure 2: The (a) GS and (b) CS models.
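The trace computation in (6) can be sketched as follows for a small tailbiting code. The example again assumes the toy (7,5) memory-2 encoder (not one of the paper's codes) tailbitten over ℓ = 4 sections, i.e., an (8, 4) TBCC; polynomial entries of T(X) are stored as coefficient vectors:

```python
import numpy as np

def polymat_mul(P, Q, maxw):
    """Multiply two matrices whose entries are polynomials in X,
    stored as integer coefficient vectors of length maxw+1."""
    ns = P.shape[0]
    R = np.zeros_like(P)
    for i in range(ns):
        for j in range(ns):
            acc = np.zeros(maxw + 1, dtype=np.int64)
            for t in range(ns):
                acc += np.convolve(P[i, t], Q[t, j])[:maxw + 1]
            R[i, j] = acc
    return R

m, ell = 2, 4
num_states, maxw = 2 ** m, 2 * ell    # max weight = n bits per section
T = np.zeros((num_states, num_states, maxw + 1), dtype=np.int64)
for s in range(num_states):
    s1, s2 = (s >> 1) & 1, s & 1
    for u in (0, 1):
        c1, c2 = u ^ s1 ^ s2, u ^ s2   # outputs of the (7,5) encoder
        s_next = (u << 1) | s1         # shift-register update
        T[s, s_next, c1 + c2] += 1     # entry X^(output weight)

# A(X) = Tr(T^ell(X)) as in (6): sum of diagonal entries of T^ell
P = T.copy()
for _ in range(ell - 1):
    P = polymat_mul(P, T, maxw)
A = sum(P[s, s] for s in range(num_states))
print(A)   # A[d] = number of tailbiting codewords of weight d
```

The trace counts exactly the closed trellis paths, so the coefficients sum to the 2^K = 16 codewords, with A[0] = 1 for the all-zero word.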
Definition 3.1.
A key-leakage-storage tuple (R_s, R_ℓ, R_w) is achievable for the GS and CS models if, given any ϵ > 0, there is some N ≥ 1, an encoder, and a decoder such that R_s = log|S|/N and

P_B := Pr[Ŝ ≠ S] ≤ ϵ  (reliability)  (7)
(1/N) I(S; W) ≤ ϵ  (secrecy)  (8)
(1/N) H(S) ≥ R_s − ϵ  (key uniformity)  (9)
(1/N) log|W| ≤ R_w + ϵ  (storage)  (10)
(1/N) I(X^N; W) ≤ R_ℓ + ϵ  (privacy)  (11)

where for the CS model, S and W in the constraints should be replaced by S′ and W′, respectively. The key-leakage-storage regions R_gs and R_cs for the GS and CS models, respectively, are the closures of the sets of achievable tuples for the corresponding models. ♢

Theorem 3.2 ([19]). The key-leakage-storage region R_gs for the GS model is the union of the bounds

0 ≤ R_s ≤ I(U; Y)    (12)
R_ℓ ≥ I(U; X) − I(U; Y)    (13)
R_w ≥ I(U; X) − I(U; Y)    (14)

over all P_{U|X} such that U − X − Y form a Markov chain. Similarly, the key-leakage-storage region R_cs for the CS model is the union of the bounds in (12), (13), and

R_w ≥ I(U; X).    (15)

These regions are convex sets. The alphabet U of the auxiliary random variable U can be limited to have size |U| ≤ |X| + 1. Deterministic encoders and decoders suffice to achieve these regions.

Suppose the transform-coding algorithms proposed in [15] are applied to RO PUFs, or to any PUF circuits with continuous-valued outputs, to obtain X^N that is almost i.i.d. according to a uniform Bernoulli random variable, i.e., X^N ∼ Bern^N(1/2), and the channel P_{Y|X} is a BSC(p_A) for p_A ∈ [0, 0.5]. The key-leakage-storage region R_gs,bin of the GS model for this case is the union of the bounds

0 ≤ R_s ≤ 1 − H_b(q ∗ p_A)
R_ℓ ≥ H_b(q ∗ p_A) − H_b(q)
R_w ≥ H_b(q ∗ p_A) − H_b(q)    (16)

over all q ∈ [0, 0.5] [19], which follows by using an auxiliary random variable U such that P_{X|U} ∼ BSC(q) due to Mrs. Gerber's lemma [42]. The rate tuples on the boundary of the region R_gs,bin are uniquely defined by the ratio R_s/R_w. We therefore use this ratio as the metric to compare our nested TBCCs with previously-proposed nested PCs and channel codes. A larger key vs. storage rate ratio suggests that the code construction is closer to an achievable point on the boundary of the region R_gs,bin, which is an optimal tuple. We next focus on the GS model for code constructions. All results can be extended to the CS model by using an additional one-time padding step [12].

In this section, we sketch the main steps to obtain a nested construction for convolutional codes. Furthermore, we give two explicit algorithms to find good code constructions.
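A boundary point of R_gs,bin in (16) is easy to evaluate numerically for a given quantization parameter q. The following sketch does this for an assumed measurement-channel crossover probability p_A = 0.05 (an illustrative value, not a parameter from the paper):

```python
from math import log2

def hb(x):
    """Binary entropy function H_b(x) in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def star(p, x):
    """The *-operator: p * x = p(1-x) + (1-p)x."""
    return p * (1 - x) + (1 - p) * x

def gs_boundary(q, pa):
    """Boundary tuple (R_s, R_l, R_w) of R_gs,bin in (16)."""
    rs = 1 - hb(star(q, pa))
    rl = rw = hb(star(q, pa)) - hb(q)
    return rs, rl, rw

pa = 0.05   # assumed crossover probability, for illustration only
for q in (0.0, 0.05, 0.1):
    rs, rl, rw = gs_boundary(q, pa)
    ratio = rs / rw if rw else float("inf")
    print(f"q={q:.2f}: R_s={rs:.3f}, R_l=R_w={rw:.3f}, R_s/R_w={ratio:.3f}")
```

Scanning q trades a larger key rate against larger leakage and storage, which is exactly the trade-off the ratio R_s/R_w summarizes.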
The first algorithm addresses the search for a good error correcting code (N, K_s), denoted by C_s, and the second algorithm finds an (N, K_q) code C_q used as a vector quantizer such that C_s is a subcode of C_q, i.e., C_s ⊆ C_q.

Using the encoder circuit depicted in Figure 1, we construct two codes C_q and C_s such that C_s ⊆ C_q. Let C_q be the (N, K_q) TBCC with memory m and K_q = ℓ k_q generated by using the encoder defined by the matrices [B̃, C, D̃]. Recall that B = (e^T | B̃) with B̃ ∈ F^(m×(k_q−1)) and D = (0_{n×1} | D̃) with D̃ ∈ F^(n×(k_q−1)). By removing the i-th column of B̃ and D̃ simultaneously, one obtains a new encoder that generates a code of rate (k_q − 1)/n, which is a subcode of the original code. This is true since the new code corresponds to encoding with the original encoder restricted to all inputs where u^(i)_t = 0. By "freezing" further input bits, we can therefore obtain subcodes of rates

R_s = 1/n, 2/n, ..., (k_q − 1)/n.    (17)

To obtain codes with better rate granularity between 1/n and (k_q − 1)/n, i.e., rates not in (17), we can freeze input bits in a time-variant manner. That is, by using the encoder ℓ times, we can freeze a different number of input bits in different clock cycles. This allows us to obtain codes of rates

R_s = ℓ/N, (ℓ + 1)/N, ..., K_q/N.    (18)

Denote the parameters of the subcode obtained by freezing input bits accordingly as (N, K_s). Note that by freezing input bits in a time-variant manner, K_s is not necessarily a multiple of ℓ. Furthermore, the procedure can also be applied to add columns to B̃ and D̃ to generate a supercode.

Algorithm 1: Search for (N, K_s) TBCC C_s, R_s = 1/n
  Input: n, m, K_s, P_B, W_max (maximum number of iterations)
  Output: C ∈ F^(n×m)
  Initialize: p_c ← 0, C ← 0
  for w ← 1 to W_max do
    C′ ←$ F^(n×m)
    Compute A_d for the (N, K_s) TBCC generated by [0, C′, 0] for d = 0, ..., N using (5) and (6)
    Find p′_c such that P^UB_B(A_d, p′_c) = P_B
    if p′_c ≥ p_c then
      p_c ← p′_c; C ← C′
  return C

The design procedure of the nested convolutional code construction is split into two steps:
(1) Search for a good error correcting code C_s of rate R_s = 1/n = K_s/N at a given target block error probability P_B by finding an appropriate matrix C.
(2) Expand the low rate code by finding appropriate matrices B̃ and D̃ to obtain a good code of rate R_q = k_q/n = K_q/N that achieves a low average distortion q.

Note that for the first step we restrict attention to codes of rate R_s = 1/n and hence the matrices B̃ and D̃ vanish. The first step can also be performed for codes of any rate R_s > 1/n, but then the appropriate matrices B̃ and D̃ have to be found accordingly.
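The subcode property behind (17) can be checked exhaustively on a toy example. The sketch below assumes a small k_q = 2, n = 2, m = 2 tailbiting encoder with illustrative taps (not one of the paper's codes), where only u^(1) feeds the shift register as in B = (e^T | B̃), D = (0 | D̃); freezing u^(2) = 0 yields a rate-1/2 subcode of the rate-1 code:

```python
from itertools import product
import numpy as np

m, n, ell = 2, 2, 3
C = np.array([[1, 1],
              [0, 1]])     # observation matrix (assumed taps)
Dt = np.array([1, 0])      # Dtilde: direct taps of the second input u^(2)

def encode(u1, u2):
    """Tailbiting encoding: for a feedforward register fed only by u^(1),
    the valid start state equals the last m bits of u^(1)."""
    s = [u1[-1], u1[-2]]
    word = []
    for t in range(ell):
        c = (np.array(s) @ C.T + u2[t] * Dt) % 2
        word.extend(int(b) for b in c)
        s = [u1[t], s[0]]
    return tuple(word)

full = {encode(u1, u2) for u1 in product((0, 1), repeat=ell)
                       for u2 in product((0, 1), repeat=ell)}
sub = {encode(u1, (0, 0, 0)) for u1 in product((0, 1), repeat=ell)}
print(len(full), len(sub), sub <= full)   # → 64 8 True
```

Freezing an input restricts the encoder map rather than changing it, which is why the nested structure C_s ⊆ C_q comes for free.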
For fixed parameters n, m, and K_s, we try to find a matrix C such that the resulting (N, K_s) TBCC C_s, at a given target block error probability P_B, can be operated on a noisy BSC with large crossover probability p_c. To evaluate P_B we use the union bound, see, e.g., [34], and the distance spectrum of the code. This gives an upper bound on P_B under maximum likelihood decoding. The bound is given by

P_B ≤ P^UB_B(A_d, p_c) := Σ_{d=d_min}^{N} A_d Σ_{i=⌈d/2⌉}^{d} (d choose i) p_c^i (1 − p_c)^(d−i)    (19)

where d_min is the minimum distance of the code.

The design of the code C_s is performed by a purely random search over the matrix C, as described in Algorithm 1. This algorithm searches for the best TBCC of rate R_s = 1/n by randomly generating different matrices C. The matrix C of the code that yields the largest p_c at the given target block error probability P_B is returned as the output of Algorithm 1.

In this section, an algorithm to obtain a high rate code from an existing low rate convolutional encoder is explained. The algorithm is presented in Algorithm 2. The inputs are the observation matrix as well as the control and transition matrices of the low rate code with rate

R_s = k_s/n.    (20)

By randomly adding k_q − k_s columns to both the control matrix B̃ and the transition matrix D̃, a code of high rate

R_q = k_q/n    (21)

is constructed. The algorithm performs a random search and returns the best configuration. As selection metrics, the free distance and its multiplicity are chosen. The free distance d_free of a convolutional code is defined as the minimum Hamming weight between any two differing paths in the state transition diagram [20, Chapter 3]. Due to the linearity of convolutional codes, d_free is also the minimum Hamming weight over the nonzero paths. We denote by A_free the multiplicity of paths that have Hamming weight d_free. To find a good high rate code, we use d_free and A_free to select the best encoder. The BEAST algorithm described in [20, Chapter 10] is a fast method to compute d_free and A_free.

Algorithm 2: Search for (N, K_q) TBCC C_q, R_q = k_q/n
  Input: m, k_q, k_s, W_max, C, B̃_s ∈ F^(m×(k_s−1)), D̃_s ∈ F^(n×(k_s−1))
  Output: B̃_q ∈ F^(m×(k_q−1)), D̃_q ∈ F^(n×(k_q−1))
  Initialize: B̃_q ← (B̃_s | 0), D̃_q ← (D̃_s | 0), d ← 0, A ← ∞
  for w ← 1 to W_max do
    B′ ←$ F^(m×(k_q−k_s))
    D′ ←$ F^(n×(k_q−k_s))
    B̃′_q ← (B̃_s | B′); D̃′_q ← (D̃_s | D′)
    Compute d_free and A_free for [B̃′_q, C, D̃′_q]
    if d_free > d or (d_free = d and A_free < A) then
      d ← d_free; A ← A_free
      B̃_q ← B̃′_q; D̃_q ← D̃′_q
  return B̃_q, D̃_q
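The inner step of Algorithm 1, evaluating the union bound (19) and inverting it to find the largest operable p_c for a target P_B, can be sketched as follows. The distance spectrum used here is an assumed toy spectrum, not one of the paper's codes; the inversion uses bisection, which works because the bound is increasing in p_c on [0, 0.5]:

```python
from math import ceil, comb

def union_bound(Ad, pc):
    """Union bound (19) on the ML block-error probability for a code
    with distance spectrum Ad (Ad[d] = number of weight-d codewords)."""
    total = 0.0
    for d, A in enumerate(Ad):
        if d == 0 or A == 0:
            continue
        pairwise = sum(comb(d, i) * pc**i * (1 - pc) ** (d - i)
                       for i in range(ceil(d / 2), d + 1))
        total += A * pairwise
    return total

def pc_at_target(Ad, PB, lo=0.0, hi=0.5, iters=50):
    """Find p'_c with P^UB_B(Ad, p'_c) = P_B by bisection,
    as needed inside the loop of Algorithm 1."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if union_bound(Ad, mid) < PB else (lo, mid)
    return lo

# Assumed toy spectrum of a code with d_min = 5 (illustration only):
Ad = [1, 0, 0, 0, 0, 4, 0, 6, 5]
p = pc_at_target(Ad, 1e-3)
print(f"p'_c ≈ {p:.4f} gives P_B^UB ≈ {union_bound(Ad, p):.2e}")
```

Algorithm 1 simply repeats this evaluation for randomly drawn matrices C and keeps the candidate with the largest p'_c.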
The selection criterion is as follows: keep the code with the largest d_free and, in case of a tie, choose the code with the smaller A_free.

Algorithms 1 and 2 are combined to find good nested code constructions for the coding problem described in Section 3. Two TBCCs C_s and C_q of the same length N are needed such that C_s ⊆ C_q. Let K_q and K_s denote the dimensions of C_q and C_s, respectively, and let R_q = K_q/N and R_s = K_s/N denote their code rates. The objective is to maximize the key vs. storage rate ratio. Since R_s = K_s/N and R_w = (K_q − K_s)/N, we have

R_s/R_w = K_s/(K_q − K_s) = (R_q/R_s − 1)^(−1).    (22)

Therefore, we maximize R_s and minimize R_q simultaneously. To reconstruct the key S of size K_s (in bits), the code C_s has to correct errors on the artificial BSC with crossover probability p_c = q ∗ p_A at a given target P_B. The code C_q serves as a vector quantizer with average distortion q such that [14]

q ≤ (p_c − p_A)/(1 − 2 p_A).    (23)

The design procedure is then as follows:
(1) Choose m and n to design a TBCC of rate R_s = 1/n by using Algorithm 1.
(2) Obtain the corresponding value of p_c at which the code achieves the target block error probability P_B by Monte Carlo simulations.
(3) Construct the code C_q from C_s by using Algorithm 2 such that (23) is satisfied.

The last step in this procedure is executed by applying Algorithm 2 incrementally as follows:
(1) Initialization: Start by constructing a code C^(1)_q of rate R^(1)_q = 2/n from the code C_s (Algorithm 1).
(2) Set i ← 2.
(3) Construct a code C^(i)_q of rate R^(i)_q = (i + 1)/n from the code C^(i−1)_q (Algorithm 2).
(4) If the average distortion achieved by the code C^(i)_q satisfies the constraint given in (23), stop; else increment i ← i + 1 and go to step (3).

C_q is the code obtained in the last iteration. To obtain code rates in between those steps, we randomly freeze inputs of the encoder in a time-variant manner as described in Section 4.
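The two design targets, the ratio in (22) and the distortion constraint in (23), are simple closed-form expressions. The sketch below evaluates them for hypothetical numbers (the dimensions and probabilities are illustrative assumptions, not values from the paper's Table 2):

```python
def key_storage_ratio(Ks, Kq):
    """R_s / R_w = K_s / (K_q - K_s), as in (22)."""
    return Ks / (Kq - Ks)

def target_distortion(pc, pa):
    """Largest quantizer distortion q allowed by (23): since
    q * pa = pa + q(1 - 2 pa), requiring q * pa <= pc gives
    q <= (pc - pa) / (1 - 2 pa)."""
    return (pc - pa) / (1 - 2 * pa)

# Hypothetical illustration values:
Ks, Kq = 128, 288
pc, pa = 0.11, 0.05
print(key_storage_ratio(Ks, Kq))        # → 0.8
print(f"q <= {target_distortion(pc, pa):.4f}")
```

A larger K_q (higher quantizer rate) lowers the achievable distortion but shrinks the ratio in (22), which is why the incremental procedure stops at the smallest R_q satisfying (23).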
Since in each iteration the code is optimized for its minimum distance, we can only freeze inputs on the last added input. This way we guarantee to preserve the minimum distance of the code for the next iteration due to C^(i−1)_q ⊆ C^(i)_q.

We compare the decoding complexities of TBCCs and PCs. Since the real complexity of decoding depends on the hardware implementation, we only estimate the complexity for both code classes by using standard decoding algorithms.

The WAVA algorithm performs standard Viterbi decoding on the tailbiting trellis of the TBCC in a circular fashion. That means the decoder runs over the trellis several times, and at each iteration the probabilities of the starting states of the trellis are updated according to the probabilities of the ending states of the previous iteration. Therefore, the WAVA algorithm scales with the complexity of a standard Viterbi decoder times the number of iterations. For simplicity, we consider the worst case complexity and hence let V denote the maximum number of iterations of the WAVA decoder. According to [27], let κ be the complexity of a standard Viterbi decoder with indices
• F for Forney trellis,
• P for precomputation,
• M for merged or minimal trellis.
For the total number N/n of trellis sections, we have

κ_F ∝ N · 2^(k+m)    (24)
κ_P ∝ (N/n) (2^(k+m) + n)    (25)
κ_M ∝ N · 2^(min{k, n−k} + m).    (26)

By scaling these complexities with the maximum number of WAVA iterations V, we obtain the desired complexities of decoding a TBCC. For decoding on the Forney trellis, we can reuse the branch metrics computed in the first WAVA iteration in the following V − 1 iterations:

κ^WAVA_F ∝ ((n + V − 1)/n) N · 2^(k+m)    (27)
κ^WAVA_P ∝ (N/n) (V · 2^(k+m) + n)    (28)
κ^WAVA_M ∝ V N · 2^(min{k, n−k} + m).    (29)

Overall, the complexity κ^WAVA of decoding a TBCC is

κ^WAVA ∝ min{κ^WAVA_F, κ^WAVA_P, κ^WAVA_M}.    (30)
For error correction and vector quantization, we obtain different complexities since we have different values for k. For the error correcting code we have k = k_s = 1, and for the vector quantizer we have k = k_q, where k_q is the largest value needed to achieve a rate of R_q such that k_q = ⌈n R_q⌉. The complexity of the vector quantizer can be reduced by considering decoding over the trellis with the time-variant frozen input bit values, since all branches that do not correspond to the frozen input bit values can be removed. For simplicity, we only consider the complexity over the time-invariant trellis.

For the PCs under successive cancellation list (SCL) decoding [39] with a list size L, we have a complexity proportional to L N log N. This complexity is independent of the code rate and thus applies to C_s and C_q. All decoding complexities are summarized in Table 1.

Table 1: Complexities of the error correcting code C_s and vector quantizer code C_q for PCs and TBCCs.
Code class | Complexity of C_s | Complexity of C_q
TBCC (κ^WAVA_F) | ∝ (n+V−1)(N/n)·2^(1+m) | ∝ (n+V−1)(N/n)·2^(k_q+m)
TBCC (κ^WAVA_P) | ∝ (N/n)(V·2^(1+m) + n) | ∝ (N/n)(V·2^(k_q+m) + n)
TBCC (κ^WAVA_M) | ∝ V·N·2^(1+m) | ∝ V·N·2^(min{k_q, n−k_q}+m)
PC | ∝ L·N·log N | ∝ L·N·log N

Note that for the Viterbi decoder, parallelization up to a factor of 2^m can easily be achieved since all state nodes in a trellis section can be processed independently. For the SCL decoding of PCs, such parallelization cannot be achieved without changing the decoder's error correction performance, since each decoded bit sequentially depends on the previously decoded ones.

In this section, the performance of TBCCs designed by the proposed procedure for the PUF setting is presented. We consider PUF devices with measurement-channel crossover probability p_A, a target block-error probability P_B, and a key size of K_s = 128 bits. These values correspond to the best RO PUF designs in the literature [17]. We construct TBCCs with rates R_s = 1/4 and R_s = 1/3, and with memories m = 8 and m = 11. There is no PC with R_s = 1/3, since for a key size of K_s = 128 bits and R_s = 1/3 we would have N = 384, which is not a power of two. All simulations for the PCs are performed by using SCL decoding with a list size of L = 8. We also compute the results for the PC presented in [14], but now for the considered p_A, and compare the ratios R_s/R_w. All simulations for the TBCCs are performed by using the WAVA algorithm with a maximum number V of iterations.

The construction of the nested code design starts with the error correcting code C_s. We design two TBCCs, with R_s = 1/3 and R_s = 1/4.
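The complexity estimates of Table 1 can be compared numerically. The sketch below uses the proportionality expressions from (29) and for SCL decoding; the parameter values are assumptions for illustration, and the comparison is only between proportionality constants, not absolute operation counts:

```python
from math import log2

def kappa_wava_M(N, n, k, m, V):
    """Merged-trellis WAVA complexity, proportional to
    V * N * 2^(min(k, n-k) + m), cf. (29)."""
    return V * N * 2 ** (min(k, n - k) + m)

def kappa_scl(N, L):
    """SCL decoding of a polar code, proportional to L * N * log2(N)."""
    return L * N * log2(N)

# Assumed parameters for a rough comparison (illustration only):
N, n, m, V, L = 512, 4, 8, 4, 8
print(kappa_wava_M(N, n, 1, m, V))   # error correcting code C_s, k = 1
print(kappa_scl(N, L))               # polar code, same for C_s and C_q
```

Because the trellis size grows as 2^m, the TBCC decoding cost dominates the polar-code cost for the larger memories, matching the paper's observation that the high-ratio TBCC variant pays in complexity.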
Using W_max random search iterations, the results of the Monte Carlo simulations as well as the union bound (19) are shown in Figures 3 and 4 for the two TBCCs.

To bound the code performance on a BSC for a given blocklength and code rate, we use two finite length bounds, namely the metaconverse (MC) and the random coding union (RCU) bound from [35]. The MC gives a lower bound and the RCU an upper bound on the block error probability. For R_s = 1/4, we observe that the TBCC with m = 11 outperforms the PC.

Figure 3: Error correcting performance of different codes with K_s = 128 bits and R_s = 1/3 over a BSC with crossover probability p_c (curves: MC, RCU, and the TBCCs with m = 11 and m = 8, union bound and simulation). The MC and RCU bounds for the same code parameters are given as references.
04 0 .
06 0 .
08 0 . .
12 0 .
14 0 .
16 0 .
18 0 . − − − − Crossover Probability p c B l o c k E rr o r P r o b a b i l i t y P B MCRCUPC, L = m =
11 (UB)TBCC, m =
11 (simul.)TBCC, m = m = Figure 4: Error correcting performance of different codeswith K s = bits and R s = over a BSC with crossoverprobability p c . The MC and RCU bounds for the same codeparameters are given as references.able 2: Parameters of the designed codes for K s = bits, p A = . and P B ≤ − and complexities for C s and C q ,respectively. For the TBCCs also the type of complexity ( κ WAVA F , κ WAVA M or κ WAVA P ) which is minimal is given. (cid:6) log |W| (cid:7) is theamount of helper data in bits. Code m R s p c ¯ q R q R w (cid:6) log |W| (cid:7) R s R w Complexity C s Complexity C q TBCC 11 . . . . . κ WAVA P ∝ . κ WAVA M ∝ . TBCC 8 . . . . . κ WAVA P ∝ . κ WAVA M ∝ . TBCC 11 . . . . . κ WAVA P ∝ . κ WAVA M ∝ . TBCC 8 . . . . . κ WAVA P ∝ . κ WAVA M ∝ . PC - . . . . . ∝ . ∝ . PC - . . . . . ∝ . ∝ . . .
[Figure 5: Code rate of the vector quantizer code C_q vs. average distortion q̄ for the shorter blocklength N and target block error probability P_B, comparing the simulated TBCCs with m = 11 and m = 8 against the approximation 1 − H_b(q) + log(N)/N.]

Using the approach described in Section 4 and setting the maximum number of WAVA iterations W_max, we construct high rate codes to be used as vector quantizers. Using Monte Carlo simulations, we plot the rate R_q of these codes vs. the measured average distortion q̄ in Figures 5 and 6 for the two considered pairs of blocklength N and rate R_s, respectively.
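In this construction the vector quantizer is itself a channel code: a measurement is quantized to the nearest codeword of the high-rate code C_q, and the resulting average distortion q̄ determines the helper data. A minimal sketch of nearest-codeword quantization, using the small (7,4) Hamming code as an illustrative stand-in for the designed TBCC quantizers:

```python
from itertools import product

# The (7,4) Hamming code as an illustrative stand-in for the high-rate
# quantizer code C_q (systematic generator matrix over GF(2)).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
N = 7

# Enumerate all 2^4 codewords.
codebook = [
    tuple(sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(N))
    for u in product((0, 1), repeat=4)
]

def quantize(word):
    """Vector quantization: map a length-7 word to the nearest codeword
    in Hamming distance."""
    return min(codebook, key=lambda c: sum(a ^ b for a, b in zip(word, c)))

# Exact average per-symbol distortion q_bar over all 2^7 equiprobable inputs.
total = sum(
    sum(a ^ b for a, b in zip(w, quantize(w))) for w in product((0, 1), repeat=N)
)
q_bar = total / (2 ** N * N)
print(f"R_q = {4 / N:.3f} bits/symbol, q_bar = {q_bar}")   # q_bar = 0.125
```

Because the (7,4) Hamming code is perfect with covering radius 1, the exact average distortion is 112/(128·7) = 0.125; for the long TBCC quantizers of the paper, q̄ is instead estimated by Monte Carlo simulation.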
[Figure 6: Code rate of the vector quantizer code C_q vs. average distortion q̄ for the longer blocklength N and target block error probability P_B, comparing the simulated TBCCs with m = 11 and m = 8 and the PC with list size L against the approximation 1 − H_b(q) + log(N)/N.]

We plot the approximate bound from [24] on the rate achievable for a given distortion. The approximated rate for block length N is

    R_q^(approx) := 1 − H_b(q) + log(N)/N + O(1/N)    (31)

where O(·) denotes the big O notation. This approximation does not consider the effect of the constraint that the error correcting code designed in the previous subsection has to be a subcode of the vector quantizer of rate R_q. Therefore, this bound only gives an approximate achievable bound on the rate of the high-rate code that is used as a vector quantizer without any constraint. The bound is plotted by neglecting the O(1/N) term. Using (23), we obtain the target distortion for the code to be designed, which allows us to find a lower bound on the required rate R_q of the vector quantizer. The results are shown in Table 2.

[Figure 7: Storage-key rates for the GS model with crossover probability p_A. The best possible point achieved by SW-coding (SWC) constructions such as polar codes (PCs) in [5] lies on the dashed line representing R_w + R_s = H(X). The PCs are designed by applying the design procedure proposed in [14] for WZ-coding with the SCL decoder with list size L. The block error probability satisfies the target P_B and the key size K_s is fixed for all codes. The finite length non-achievability bound and its approximation for the same K_s are depicted as well.]
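The approximation (31) and the covering converse from [24] on the required quantizer rate can be evaluated directly. In the sketch below, log denotes the base-2 logarithm and the parameter values are illustrative, not the paper's design parameters:

```python
from math import comb, log2

def hb(q):
    """Binary entropy function in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def r_q_approx(q, N):
    """Approximate achievable quantizer rate, eq. (31), O(1/N) term neglected."""
    return 1.0 - hb(q) + log2(N) / N

def r_q_converse(q, N):
    """Covering converse from [24]: a blocklength-N binary quantizer with
    average Hamming distortion q must satisfy
        sum_{j<=floor(Nq)} C(N,j) >= 2^(N(1-R_q)),
    i.e. R_q >= 1 - log2(sum_{j<=floor(Nq)} C(N,j)) / N."""
    s = sum(comb(N, j) for j in range(int(N * q) + 1))
    return 1.0 - log2(s) / N

# Illustrative values:
N, q = 1024, 0.1
print(f"converse: {r_q_converse(q, N):.4f}  approximation: {r_q_approx(q, N):.4f}")
```

The converse lies slightly below the approximation, consistent with the gap between the non-achievability bound and the achievable rates discussed below.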
Observe that the vector quantization performance of all codes is similar. Therefore, the code that has the best error correction performance yields the smallest rate for vector quantization, which corresponds to the smallest amount of helper data.

Combining the results of the error correction and the vector quantizer performance, we can evaluate the key vs. storage rate ratio by using (22). The intermediate and final results are listed in Table 2, and the achieved (R_w, R_s) tuples for all mentioned codes are depicted in Figure 7.

For the nested WZ-coding construction, where we have a vector quantizer and an error correcting code, we plot a finite length non-achievability (converse) bound. For a fixed key size K_s, crossover probability p_A, and distortion q, any vector quantizer code of blocklength N must satisfy [24, (2.186)]

    Σ_{j=0}^{⌊Nq⌋} C(N, j) ≥ 2^{N(1 − R_q)}.    (32)

Similar to the achievability bound discussed in Section 7.2, (31) is used to approximate also the non-achievability bound in (32). The combination of the MC bound and the converse bound for the vector quantizer performance establishes a non-achievability bound on the best rate tuples that can be achieved for given parameters by our WZ-coding construction. In Figure 7, we plot this non-achievability bound using (32) and its approximation using (31). Note that the zigzag behaviour of the bound in (32) is due to the floor function. We observe a gap between these bounds and the rate tuples achieved by the designed codes.

The FCS and COFE have the key vs. storage rate ratio of

    R_s / R_w = R_s    (33)

as the storage rate is 1 bit/symbol for these constructions. The SW-coding constructions such as the syndrome coding method proposed in [5] achieve the ratio

    R_s / R_w = R_s / (1 − R_s)    (34)

which improves on the FCS and COFE. The WZ-coding constructions with nested PCs and nested TBCCs that we constructed achieve even larger ratios, and increasing the list size of the PCs and the memory size of the TBCCs allows a larger key vs. storage rate ratio.

We proposed a nested convolutional code construction, which might be useful for various achievability schemes. For the key agreement problem with PUFs, we proposed a design procedure for the nested code construction using TBCCs to obtain good reliability, secrecy, privacy, storage, and cost performance jointly. We implemented nested convolutional codes for practical source and channel parameters to illustrate the gains in terms of the key vs. storage rate ratio as compared to previous code designs. We observe that one variant of nested convolutional codes achieves a higher rate ratio than all other code designs in the literature but it may have a high hardware cost. Another variant of nested convolutional codes with low complexity is illustrated to perform similarly to the best previous codes in the literature. We also computed known finite-length bounds for our code construction to show the gaps between the performance of the designed codes and these bounds.
ACKNOWLEDGMENTS
This work was performed while O. Günlü was with the Chair of Communications Engineering, Technical University of Munich. O. Günlü was supported by the German Federal Ministry of Education and Research (BMBF) within the national initiative for "Post Shannon Communication (NewCom)" under the Grant 16KIS1004, and by the German Research Foundation (DFG) under grant KR 3517/9-1. V. Sidorenko is on leave from the Institute for Information Transmission Problems, Russian Academy of Sciences. His work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 801434) and by the Chair of Communications Engineering at the Technical University of Munich. The work of G. Kramer was supported by an Alexander von Humboldt Professorship endowed by the BMBF.
REFERENCES
[1] Rudolf Ahlswede and Imre Csiszár. 1993. Common randomness in information theory and cryptography - Part I: Secret sharing. IEEE Trans. Inf. Theory 39, 4 (July 1993), 1121–1132. https://doi.org/10.1109/18.243431
[2] Mattias Andersson, Vishwambhar Rathi, Ragnar Thobaben, Jörg Kliewer, and Mikael Skoglund. 2010. Nested polar codes for wiretap and relay channels. IEEE Commun. Lett. 14, 8 (Aug. 2010), 752–754. https://doi.org/10.1109/LCOMM.2010.08.100875
[3] Erdal Arikan. 2009. Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55, 7 (July 2009), 3051–3073. https://doi.org/10.1109/TIT.2009.2021379
[4] Patrizio Campisi. 2013. Security and Privacy in Biometrics. Springer-Verlag, London, U.K.
[5] Bin Chen, Tanya Ignatenko, Frans M. J. Willems, Roel Maes, Erik van der Sluis, and Georgios Selimis. 2017. A robust SRAM-PUF key generation scheme based on polar codes. In IEEE Global Commun. Conf. Singapore, 1–6. https://doi.org/10.1109/GLOCOM.2017.8254007
[6] Remi A. Chou and Matthieu R. Bloch. 2014. Separation of reliability and secrecy in rate-limited secret-key generation. IEEE Trans. Inf. Theory 60, 8 (Aug. 2014), 4941–4957. https://doi.org/10.1109/TIT.2014.2323246
[7] Remi A. Chou, Matthieu R. Bloch, and Jörg Kliewer. 2015. Polar coding for empirical and strong coordination via distribution approximation. In IEEE Int. Symp. Inf. Theory. Hong Kong, China, 1512–1516. https://doi.org/10.1109/ISIT.2015.7282708
[8] Imre Csiszár and Prakash Narayan. 2000. Common randomness and secret key generation with a helper. IEEE Trans. Inf. Theory 46, 2 (Mar. 2000), 344–366. https://doi.org/10.1109/18.825796
[9] Paul W. Cuff, Haim H. Permuter, and Thomas M. Cover. 2010. Coordination capacity. IEEE Trans. Inf. Theory 56, 9 (Sep. 2010), 4181–4206. https://doi.org/10.1109/TIT.2010.2054651
[10] Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin, and Adam Smith. 2008. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38, 1 (Jan. 2008), 97–139. https://doi.org/10.1007/978-3-540-24676-3_31
[11] Blaise Gassend. 2003. Physical Random Functions. Master's thesis. M.I.T., Cambridge, MA.
[12] Onur Günlü. 2018. Key Agreement with Physical Unclonable Functions and Biometric Identifiers. Ph.D. Dissertation. TU Munich, Germany. Published by Dr. Hut Verlag.
[13] Onur Günlü and Onurcan İşcan. 2014. DCT based ring oscillator physical unclonable functions. In IEEE Int. Conf. Acoustics Speech Sign. Process. Florence, Italy, 8198–8201. https://doi.org/10.1109/ICASSP.2014.6855199
[14] Onur Günlü, Onurcan İşcan, Vladimir Sidorenko, and Gerhard Kramer. 2019. Code constructions for physical unclonable functions and biometric secrecy systems. IEEE Trans. Inf. Forensics Security 14, 11 (Nov. 2019), 2848–2858. https://doi.org/10.1109/TIFS.2019.2911155
[15] Onur Günlü, Tasnad Kernetzky, Onurcan İşcan, Vladimir Sidorenko, Gerhard Kramer, and Rafael F. Schaefer. 2018. Secure and reliable key agreement with physical unclonable functions. Entropy 20, 5 (May 2018). https://doi.org/10.3390/e20050340
[16] Onur Günlü and Gerhard Kramer. 2018. Privacy, secrecy, and storage with multiple noisy measurements of identifiers. IEEE Trans. Inf. Forensics Security.
[17] In IEEE Int. Conf. Acoustics, Speech, Signal Process. Barcelona, Spain. To appear.
[18] Tanya Ignatenko, Geert-Jan Schrijen, Boris Skoric, Pim Tuyls, and Frans Willems. 2006. Estimating the secrecy-rate of physical unclonable functions with the context-tree weighting method. In IEEE Int. Symp. Inf. Theory. Seattle, WA, 499–503. https://doi.org/10.1109/ISIT.2006.261765
[19] Tanya Ignatenko and Frans M. J. Willems. 2009. Biometric systems: Privacy and secrecy aspects. IEEE Trans. Inf. Forensics Security 4, 4 (Dec. 2009), 956–973. https://doi.org/10.1109/TIFS.2009.2033228
[20] Rolf Johannesson and Kamil Zigangirov. 2015. Fundamentals of Convolutional Coding (2nd ed.). 1–667 pages. https://doi.org/10.1002/9781119098799
[21] Ari Juels and Martin Wattenberg. 1999. A fuzzy commitment scheme. In ACM Conf. Comp. Commun. Security. New York, NY, 28–36. https://doi.org/10.1145/319709.319714
[22] Ashish Khisti, Suhas N. Diggavi, and Gregory W. Wornell. 2012. Secret-key generation using correlated sources and channels. IEEE Trans. Inf. Theory 58, 2 (Feb. 2012), 652–670. https://doi.org/10.1109/TIT.2011.2173629
[23] Manabu Koide and Hirosuke Yamamoto. 2010. Coding theorems for biometric systems. In IEEE Int. Symp. Inf. Theory. Austin, TX, 2647–2651. https://doi.org/10.1109/ISIT.2010.5513689
[24] Victoria Kostina. 2013. Lossy Data Compression: Nonasymptotic Fundamental Limits. Ph.D. Dissertation. Princeton University, NJ, USA.
[25] Onur Ozan Koyluoglu and Hesham El Gamal. 2012. Polar coding for secure transmission and key agreement. IEEE Trans. Inf. Forensics Security 7, 5 (Oct. 2012), 1472–1483. https://doi.org/10.1109/TIFS.2012.2207382
[26] Lifeng Lai, Siu-Wai Ho, and H. Vincent Poor. 2011. Privacy-security trade-offs in biometric security systems - Part I: Single use case. IEEE Trans. Inf. Forensics Security 6, 1 (Mar. 2011), 122–139. https://doi.org/10.1109/TIFS.2010.2098872
[27] Wenhui Li, Vladimir Sidorenko, Thomas Jerkovits, and Gerhard Kramer. 2019. On maximum-likelihood decoding of time-varying trellis codes. In International Symposium Problems of Redundancy in Information and Control Systems. Moscow, Russia, 104–109.
[28] Ruoheng Liu, Yingbin Liang, H. Vincent Poor, and Predrag Spasojevic. 2007. Secure nested codes for type II wiretap channels. In IEEE Inf. Theory Workshop. Tahoe City, CA, 337–342. https://doi.org/10.1109/ITW.2007.4313097
[29] Roel Maes, Pim Tuyls, and Ingrid Verbauwhede. 2009. A soft decision helper data algorithm for SRAM PUFs. In IEEE Int. Symp. Inf. Theory. Seoul, Korea, 2101–2105. https://doi.org/10.1109/ISIT.2009.5205263
[30] Hessam Mahdavifar and Alexander Vardy. 2011. Achieving the secrecy capacity of wiretap channels using polar codes. IEEE Trans. Inf. Theory 57, 10 (Oct. 2011), 6428–6443. https://doi.org/10.1109/TIT.2011.2162275
[31] Ueli Maurer. 1993. Secret key agreement by public discussion from common information. IEEE Trans. Inf. Theory 39, 3 (May 1993), 733–742. https://doi.org/10.1109/18.256484
[32] Lars Palzer and Roy Timo. 2016. A converse for lossy source coding in the finite blocklength regime. In Int. Zurich Seminar Commun. Zurich, Switzerland, 15–19. https://doi.org/10.3929/ethz-a-010645199
[33] Ravikanth Pappu. 2001. Physical One-Way Functions. Ph.D. Dissertation. M.I.T., Cambridge, MA.
[34] Gregory Poltyrev. 1994. Bounds on the decoding error probability of binary linear codes via their spectra. IEEE Trans. Inf. Theory 40, 4 (July 1994), 1284–1292. https://doi.org/10.1109/18.335935
[35] Yury Polyanskiy, H. Vincent Poor, and Sergio Verdú. 2010. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 56, 5 (May 2010), 2307–2359. https://doi.org/10.1109/TIT.2010.2043769
[36] Rose Y. Shao, Shu Lin, and Marc P. C. Fossorier. 2003. Two decoding algorithms for tailbiting codes. IEEE Trans. Commun. 51, 10 (Oct. 2003), 1658–1665. https://doi.org/10.1109/TCOMM.2003.818084
[37] Boris Škorić. 2012. Quantum readout of physical unclonable functions. Int. J. Quantum Inf. 10, 1 (Feb. 2012), 1250001. https://doi.org/10.1142/S0219749912500013
[38] David Slepian and Jack Wolf. 1973. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory 19, 4 (July 1973), 471–480. https://doi.org/10.1109/TIT.1973.1055037
[39] Ido Tal and Alexander Vardy. 2015. List decoding of polar codes. IEEE Trans. Inf. Theory 61, 5 (May 2015), 2213–2226. https://doi.org/10.1109/TIT.2015.2410251
[40] Jack K. Wolf and Andrew J. Viterbi. 1996. On the weight distribution of linear block codes formed from convolutional codes. IEEE Trans. Commun. 44, 9 (Sep. 1996), 1049–1051. https://doi.org/10.1109/26.536907
[41] Aaron D. Wyner. 1975. The wire-tap channel. Bell Labs Tech. J. 54, 8 (Oct. 1975), 1355–1387. https://doi.org/10.1002/j.1538-7305.1975.tb02040.x
[42] Aaron D. Wyner and Jacob Ziv. 1973. A theorem on the entropy of certain binary sequences and applications: Part I.