Distributed Storage for Data Security
Annina Bracher
ETH Zurich
Eran Hof
Samsung Israel Research and Development Center
Amos Lapidoth
ETH Zurich
Abstract—We study the secrecy of a distributed storage system for passwords. The encoder, Alice, observes a length-$n$ password and describes it using two hints, which she then stores in different locations. The legitimate receiver, Bob, observes both hints. The eavesdropper, Eve, sees only one of the hints; Alice cannot control which. We characterize the largest normalized (by $n$) exponent that we can guarantee for the number of guesses it takes Eve to guess the password subject to the constraint that either the number of guesses it takes Bob to guess the password or the size of the list that Bob must form to guarantee that it contain the password approach one as $n$ tends to infinity.
I. INTRODUCTION
Suppose that some sensitive information $X$ (e.g., a password) is drawn from a finite set $\mathcal{X}$ according to some PMF $P_X$. A (stochastic) encoder, Alice, maps (possibly using randomization) $X$ to two hints $M_1$ and $M_2$, which she then stores in different locations. The hints are intended for a legitimate receiver, Bob, who knows where they are stored and has access to both. An eavesdropper, Eve, sees one of the hints but not both; we do not know which. Given some notion of ambiguity, we would ideally like Bob's ambiguity about $X$ to be small and Eve's large.

Which hint is revealed to Eve is a subtle question. We adopt a conservative approach and assume that, after observing $X$, an adversarial "genie" reveals to Eve the hint that minimizes her ambiguity. Not allowing the genie to observe $X$ would lead to a weaker form of secrecy (an example is given in [1]).

There are several ways to define ambiguity. For example, we could require that Bob be able to reconstruct $X$ whenever $X$ is "typical" and that the conditional entropy of $X$ given Eve's observation be large. For some scenarios, such an approach might be inadequate. Firstly, it may not properly address Bob's needs when $X$ is not typical. For example, if Bob must guess $X$, this approach does not guarantee that the expected number of guesses be small: it only guarantees that the probability of success after one guess be large, and it does not indicate the number of guesses that Bob might need when $X$ is atypical. Secondly, conditional entropy need not be an adequate measure of Eve's ambiguity: if $X$ is some password that Eve wishes to uncover, then we may care more about the number of guesses that Eve needs than about the conditional entropy [2].

In this paper, we assume that Eve wants to guess $X$ with the least number of guesses of the form "Is $X = x$?". We quantify Eve's ambiguity about $X$ by the expected number of guesses that she needs to uncover $X$. In this sense, Eve faces an instance of Arikan's guessing problem [3]. For each possible observation $z$ in some finite set $\mathcal{Z}$, Eve chooses a guessing function $G(\cdot|z)$ from $\mathcal{X}$ onto the set $\{1, \ldots, |\mathcal{X}|\}$, which determines the guessing order: if Eve observes $z$, then the question "Is $X = x$?" will be her $G(x|z)$-th question. Eve's expected number of guesses is $\mathsf{E}[G(X|Z)]$. This expectation is minimized if for each $z \in \mathcal{Z}$ the guessing function $G(\cdot|z)$ orders the possible realizations of $X$ in decreasing order of their posterior probabilities given $Z = z$.
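To make the optimal guessing order concrete, the following minimal Python sketch (an illustration, not from the paper; the alphabet and joint PMF are made up) ranks the candidates by decreasing posterior probability and evaluates the $\rho$-th moment of the number of guesses:

```python
import numpy as np

def optimal_guessing_moment(p_xz, rho=1.0):
    """p_xz[x, z] = P[X = x, Z = z].  For each observation z, the optimal
    guessing function asks about the candidates x in decreasing order of
    the posterior P[X = x | Z = z]; return E[G(X|Z)^rho] for that order."""
    total = 0.0
    for z in range(p_xz.shape[1]):
        # Sorting by the joint column is equivalent to sorting by the
        # posterior, since P[Z = z] is a common factor.
        order = np.argsort(-p_xz[:, z])            # most likely first
        for rank, x in enumerate(order, start=1):  # rank = G(x|z)
            total += p_xz[x, z] * rank ** rho
    return total

# Toy example: 4 passwords, 2 possible observations.
p_xz = np.array([[0.30, 0.05],
                 [0.20, 0.10],
                 [0.10, 0.10],
                 [0.05, 0.10]])
print(optimal_guessing_moment(p_xz, rho=1.0))  # expected number of guesses
```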
As to Bob, we will consider two different criteria: in the "guessing version" of the problem the criterion is the expected number of guesses it takes Bob to guess $X$, and in the "list version" the criterion is the first moment of the size of the list that Bob must form to guarantee that it contain $X$. We shall see that the two criteria lead to similar results.

The former criterion is natural when Bob can check whether a guess is correct: if $X$ is some password, then Bob can stop guessing as soon as he has gained access to the account that is secured by $X$.

The latter criterion is appropriate if Bob does not know whether a guess is correct. For example, if $X$ is a task that Bob must perform, then the only way for Bob to make sure that he performs $X$ is to perform all the tasks in a list comprising the tasks that have positive posterior probabilities given his observation. In this scenario, a good measure of Bob's ambiguity about $X$ is the expected number of tasks that he must perform, and this will be small whenever Alice is a good task-encoder for Bob [4].

To describe the list of tasks that Bob must perform more explicitly, let us denote by $\Pr[M_1 = m_1, M_2 = m_2 \,|\, X = x]$, for $m_1 \in \mathcal{M}_1$ and $m_2 \in \mathcal{M}_2$, the probability that Alice produces the pair of hints $(M_1, M_2) = (m_1, m_2)$ upon observing that $X = x$. It is $\{0,1\}$-valued if Alice does not use randomization. Upon observing that $(M_1, M_2) = (m_1, m_2)$, Bob produces the list $\mathcal{L}_{m_1,m_2}$ of all the tasks $x \in \mathcal{X}$ whose posterior probability $\Pr[X = x \,|\, M_1 = m_1, M_2 = m_2]$ is positive. Our notion of Bob's ambiguity about $X$ is then $\mathsf{E}\big[|\mathcal{L}_{M_1,M_2}|\big]$.

The guessing and the list-size criteria for Bob lead to similar results in the following sense: clearly, every guessing function $G(\cdot|M_1, M_2)$ for $X$ that maps the elements of $\mathcal{X}$ that have zero posterior probability to larger values than those that have a positive posterior probability satisfies $\mathsf{E}\big[G(X|M_1, M_2)\big] \le \mathsf{E}\big[|\mathcal{L}_{M_1,M_2}|\big]$. Conversely, one can prove that every pair of ambiguities for Bob and Eve that is achievable in the "guessing version" is, up to polylogarithmic factors of $|\mathcal{X}|$, also achievable in the "list version" provided that we increase $\mathcal{M}_1$ or $\mathcal{M}_2$ by a logarithmic factor of $|\mathcal{X}|$ (see Section VI ahead). These polylogarithmic factors wash out in the asymptotic regime where the sensitive information is an $n$-tuple and $n$ tends to infinity.

With no extra effort we can generalize the model and replace expectations with $\rho$-th moments. This we do to better bring out the role of Rényi entropy. For an arbitrary $\rho > 0$, we thus study the $\rho$-th (instead of the first) moment of the list size and of the number of guesses. Moreover, we shall allow some side-information $Y$ that is available to all parties. We shall thus assume that $(X, Y)$ take value in the finite set $\mathcal{X} \times \mathcal{Y}$ according to $P_{X,Y}$.
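As a companion to the guessing sketch above, the following Python sketch (again illustrative; the stochastic encoder array is made up) computes Bob's list and the moment of its size for a given encoder:

```python
import numpy as np

def list_ambiguity(p_x, enc, rho=1.0):
    """p_x[x] = P[X = x]; enc[x, m1, m2] = P[M1 = m1, M2 = m2 | X = x].
    Bob's list L_{m1,m2} holds every x with positive posterior; return
    E[|L_{M1,M2}|^rho]."""
    n_x, n_m1, n_m2 = enc.shape
    total = 0.0
    for m1 in range(n_m1):
        for m2 in range(n_m2):
            joint = p_x * enc[:, m1, m2]       # P[X = x, M1 = m1, M2 = m2]
            list_size = np.count_nonzero(joint)
            total += joint.sum() * list_size ** rho
    return total

# Toy example: 4 values of X, one binary hint each.  Alice stores the two
# bits of the index of X separately, so Bob's list is always a singleton.
p_x = np.array([0.4, 0.3, 0.2, 0.1])
enc = np.zeros((4, 2, 2))
for x in range(4):
    enc[x, x // 2, x % 2] = 1.0                # deterministic encoder
print(list_ambiguity(p_x, enc, rho=1.0))       # -> 1.0
```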
II. PROBLEM STATEMENT
We consider two problems, which we call the "guessing version" and the "list version". They differ in the definition of Bob's ambiguity. In both versions a pair $(X, Y)$ is drawn from the finite set $\mathcal{X} \times \mathcal{Y}$ according to the PMF $P_{X,Y}$, and $\rho > 0$ is fixed. Upon observing $(X, Y) = (x, y)$, Alice draws the hints $M_1$ and $M_2$ from the finite set $\mathcal{M}_1 \times \mathcal{M}_2$ according to some PMF
$$\Pr[M_1 = m_1, M_2 = m_2 \,|\, X = x, Y = y]. \tag{1}$$
In the "guessing version" Bob's ambiguity about $X$ is
$$A_B^{(g)}(P_{X,Y}) = \min_G \mathsf{E}\big[G(X \,|\, Y, M_1, M_2)^\rho\big]. \tag{2}$$
In the "list version" Bob's ambiguity about $X$ is
$$A_B^{(l)}(P_{X,Y}) = \mathsf{E}\big[|\mathcal{L}^Y_{M_1,M_2}|^\rho\big], \tag{3}$$
where for all $y \in \mathcal{Y}$ and $(m_1, m_2) \in \mathcal{M}_1 \times \mathcal{M}_2$
$$\mathcal{L}^y_{m_1,m_2} = \{x : \Pr[X = x \,|\, Y = y, M_1 = m_1, M_2 = m_2] > 0\}$$
is the list of all the tasks whose posterior probability
$$\Pr[X = x \,|\, Y = y, M_1 = m_1, M_2 = m_2] = \frac{P_{X,Y}(x, y)\,\Pr[M_1 = m_1, M_2 = m_2 \,|\, X = x, Y = y]}{\sum_{\tilde{x}} P_{X,Y}(\tilde{x}, y)\,\Pr[M_1 = m_1, M_2 = m_2 \,|\, X = \tilde{x}, Y = y]} \tag{4}$$
is positive. In both versions Eve's ambiguity about $X$ is
$$A_E(P_{X,Y}) = \min_{G_1, G_2} \mathsf{E}\big[G_1(X \,|\, Y, M_1)^\rho \wedge G_2(X \,|\, Y, M_2)^\rho\big], \tag{5}$$
where $\alpha \wedge \beta$ denotes the minimum of $\alpha$ and $\beta$.

Optimizing over Alice's mapping, i.e., the choice of the conditional PMF in (1), we wish to characterize the largest ambiguity that we can guarantee that Eve will have subject to a given upper bound on the ambiguity that Bob may have.

Of special interest to us is the asymptotic regime where $(X, Y)$ is an $n$-tuple (not necessarily drawn IID), and where
$$\mathcal{M}_1 = \big\{1, \ldots, 2^{nR_1}\big\}, \quad \mathcal{M}_2 = \big\{1, \ldots, 2^{nR_2}\big\},$$
where $(R_1, R_2)$ is a nonnegative pair corresponding to the rates. For both versions of the problem, we shall characterize the largest exponential growth that we can guarantee for Eve's ambiguity subject to the constraint that Bob's ambiguity tend to one. This asymptote turns out not to depend on the version of the problem, and in the asymptotic analysis $A_B$ can stand for either $A_B^{(g)}$ or $A_B^{(l)}$.

To phrase this mathematically, let us introduce the stochastic process $\{(X_i, Y_i)\}_{i \in \mathbb{N}}$ with finite alphabet $\mathcal{X} \times \mathcal{Y}$. We denote by $P_{X^n,Y^n}$ the PMF of $(X^n, Y^n)$. For a nonnegative rate-pair $(R_1, R_2)$, we call $E_E$ an achievable ambiguity-exponent if there is a sequence of stochastic encoders such that Bob's ambiguity (which is always at least one) satisfies
$$\lim_{n \to \infty} A_B(P_{X^n,Y^n}) = 1, \tag{6}$$
and such that Eve's ambiguity satisfies
$$\liminf_{n \to \infty} \frac{\log\big(A_E(P_{X^n,Y^n})\big)}{n} \ge E_E. \tag{7}$$
We shall characterize the supremum of all achievable ambiguity-exponents, which we call the privacy-exponent. If (6) cannot be satisfied, then the set of achievable ambiguity-exponents is empty, and we say that the privacy-exponent is negative infinity.
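For intuition, the following Python sketch (illustrative only; the tiny encoder is made up, and side information is omitted) evaluates the objective in (5) for the pair of guessing functions that are individually optimal given each hint. Since (5) minimizes over the pair jointly, the value computed here is an upper bound on $A_E$:

```python
import numpy as np

def eve_ambiguity_ub(p_x, enc, rho=1.0):
    """p_x[x] = P[X = x]; enc[x, m1, m2] = P[M1, M2 | X = x].  Rank x by
    decreasing P[X = x | M_k = m_k] for each hint separately, then return
    E[min(G1(X|M1), G2(X|M2))^rho] -- an upper bound on A_E in (5)."""
    n_x, n_m1, n_m2 = enc.shape
    p_xm1 = p_x[:, None] * enc.sum(axis=2)   # joint PMF of (X, M1)
    p_xm2 = p_x[:, None] * enc.sum(axis=1)   # joint PMF of (X, M2)
    # rank(col)[x] = position of x when sorted by decreasing posterior.
    rank = lambda col: np.argsort(np.argsort(-col)) + 1
    g1 = np.stack([rank(p_xm1[:, m]) for m in range(n_m1)], axis=1)
    g2 = np.stack([rank(p_xm2[:, m]) for m in range(n_m2)], axis=1)
    total = 0.0
    for x in range(n_x):
        for m1 in range(n_m1):
            for m2 in range(n_m2):
                pr = p_x[x] * enc[x, m1, m2]
                total += pr * min(g1[x, m1], g2[x, m2]) ** rho
    return total

# Toy encoder: 4 equiprobable passwords; hint 1 is the high bit XOR a
# fresh uniform bit U (a one-time pad), hint 2 is the pair (U, low bit).
p_x = np.full(4, 0.25)
enc = np.zeros((4, 2, 4))              # m2 in {0..3} encodes (U, low bit)
for x in range(4):
    for u in range(2):
        enc[x, (x // 2) ^ u, 2 * u + (x % 2)] += 0.5
print(eve_ambiguity_ub(p_x, enc, rho=1.0))
```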
III. MAIN RESULTS

To describe our results, we shall need a conditional version of Rényi entropy (originally proposed by Arimoto [5] and also studied in [4])
$$H_\alpha(X|Y) = \frac{\alpha}{1 - \alpha} \log \sum_{y \in \mathcal{Y}} \Big( \sum_{x \in \mathcal{X}} P_{X,Y}(x, y)^\alpha \Big)^{1/\alpha}, \tag{8}$$
where $\alpha \in [0, \infty]$ is the order and where the cases where $\alpha$ is $0$, $1$, or $\infty$ are treated by a limiting argument. In addition, we shall need the notion of conditional Rényi entropy-rate: let $\{(X_i, Y_i)\}_{i \in \mathbb{N}}$ be a discrete-time stochastic process with finite alphabet $\mathcal{X} \times \mathcal{Y}$. Whenever the limit as $n$ tends to infinity of $H_\alpha(X^n|Y^n)/n$ exists, we denote it by $H_\alpha(X|Y)$ and call it the conditional Rényi entropy-rate. In this paper, $\alpha = 1/(1 + \rho)$ takes value in the set $(0, 1)$. To simplify notation, we henceforth write $\tilde{\rho}$ for $1/(1 + \rho)$ and $\alpha \vee \beta$ for the maximum of $\alpha$ and $\beta$.
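A direct transcription of (8) into Python may help; this sketch (illustrative, with a made-up joint PMF) computes $H_\alpha(X|Y)$ for $\alpha = \tilde{\rho} = 1/(1+\rho)$:

```python
import numpy as np

def cond_renyi_entropy(p_xy, alpha):
    """Arimoto's conditional Renyi entropy (8), in bits:
    alpha/(1-alpha) * log2( sum_y ( sum_x P(x,y)^alpha )^(1/alpha) ).
    Assumes 0 < alpha < 1, the range relevant here (alpha = 1/(1+rho))."""
    inner = (p_xy ** alpha).sum(axis=0) ** (1.0 / alpha)  # one term per y
    return alpha / (1.0 - alpha) * np.log2(inner.sum())

# Toy joint PMF: rows are x, columns are y.
p_xy = np.array([[0.25, 0.10],
                 [0.15, 0.10],
                 [0.10, 0.30]])
rho = 1.0
print(cond_renyi_entropy(p_xy, alpha=1.0 / (1.0 + rho)))
```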
A. Finite Blocklength Results

In the next two theorems $c_s$ is related to how much can be gleaned about $X$ from $(M_1, M_2)$ but not from one hint alone; $c_1$ is related to how much can be gleaned from $M_1$; and $c_2$ is related to how much can be gleaned from $M_2$. More precisely, we shall see in Section V ahead that Alice first maps $(X, Y)$ to the triple $(V_s, V_1, V_2)$, which takes value in a set $\mathcal{V}_s \times \mathcal{V}_1 \times \mathcal{V}_2$ of size $c_s c_1 c_2$. Independently of $(X, Y)$ she then draws a (one-time-pad-like) random variable $U$ uniformly over $\mathcal{V}_s$ and maps $(U, V_s)$ to a variable $\tilde{V}_s$, choosing the (XOR-like) mapping so that $V_s$ can be recovered from $(\tilde{V}_s, U)$ while $\tilde{V}_s$ alone is independent of $(X, Y)$. The hints are $M_1 = (\tilde{V}_s, V_1)$ and $M_2 = (U, V_2)$. Alice does not use randomization if $c_s = 1$.
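A minimal sketch of this hint construction, assuming modulo-$c_s$ addition as the XOR-like mapping (the choice made in Section V); the function names and the toy values are illustrative:

```python
import random

def make_hints(v_s, v1, v2, c_s):
    """Blind v_s with a fresh uniform pad U over {0, ..., c_s - 1}:
    v_s_tilde = v_s + U (mod c_s) is uniform on its own, yet v_s is
    recoverable from (v_s_tilde, U).  Return the two hints."""
    u = random.randrange(c_s)
    v_s_tilde = (v_s + u) % c_s
    m1 = (v_s_tilde, v1)   # reveals v1, but nothing about v_s
    m2 = (u, v2)           # reveals v2, but nothing about v_s
    return m1, m2

def recover_v_s(m1, m2, c_s):
    """Bob, holding both hints, undoes the pad."""
    (v_s_tilde, _), (u, _) = m1, m2
    return (v_s_tilde - u) % c_s

# Toy check with c_s = 5.
c_s = 5
m1, m2 = make_hints(v_s=3, v1=0, v2=1, c_s=c_s)
assert recover_v_s(m1, m2, c_s) == 3
```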
Theorem 1 (Finite Blocklength Guessing Version): For every triple $(c_s, c_1, c_2) \in \mathbb{N}^3$ satisfying
$$c_s \le |\mathcal{M}_1| \wedge |\mathcal{M}_2|, \quad c_1 \le \lfloor|\mathcal{M}_1|/c_s\rfloor, \quad c_2 \le \lfloor|\mathcal{M}_2|/c_s\rfloor, \tag{9}$$
there is a choice of the conditional PMF in (1) for which Bob's ambiguity about $X$ is upper-bounded by
$$A_B^{(g)}(P_{X,Y}) < 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_s c_1 c_2) + 1)}, \tag{10}$$
and Eve's ambiguity about $X$ is lower-bounded by
$$A_E(P_{X,Y}) \ge (1 + \ln|\mathcal{X}|)^{-\rho}\, 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_1 + c_2))}. \tag{11}$$
Conversely, for every conditional PMF, Bob's ambiguity is lower-bounded by
$$A_B^{(g)}(P_{X,Y}) \ge (1 + \ln|\mathcal{X}|)^{-\rho}\, 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{M}_1||\mathcal{M}_2|)} \vee 1, \tag{12}$$
and Eve's ambiguity is upper-bounded by
$$A_E(P_{X,Y}) \le \big(|\mathcal{M}_1|^\rho \wedge |\mathcal{M}_2|^\rho\big)\, A_B^{(g)}(P_{X,Y}) \wedge 2^{\rho H_{\tilde{\rho}}(X|Y)}. \tag{13}$$

Theorem 2 (Finite Blocklength List Version): If $|\mathcal{M}_1||\mathcal{M}_2| > \log|\mathcal{X}| + 2$, then for every triple $(c_s, c_1, c_2) \in \mathbb{N}^3$ satisfying
$$c_s \le |\mathcal{M}_1| \wedge |\mathcal{M}_2|, \quad c_1 \le \lfloor|\mathcal{M}_1|/c_s\rfloor, \quad c_2 \le \lfloor|\mathcal{M}_2|/c_s\rfloor, \tag{14a}$$
$$c_s c_1 c_2 > \log|\mathcal{X}| + 2, \tag{14b}$$
there is a choice of the conditional PMF in (1) for which Bob's ambiguity about $X$ is upper-bounded by
$$A_B^{(l)}(P_{X,Y}) < 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_s c_1 c_2 - \log|\mathcal{X}| - 2))}, \tag{15}$$
and Eve's ambiguity about $X$ is lower-bounded by
$$A_E(P_{X,Y}) \ge (1 + \ln|\mathcal{X}|)^{-\rho}\, 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_1 + c_2))}. \tag{16}$$
Conversely, for every conditional PMF, Bob's ambiguity is lower-bounded by
$$A_B^{(l)}(P_{X,Y}) \ge 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{M}_1||\mathcal{M}_2|)} \vee 1, \tag{17}$$
and Eve's ambiguity is upper-bounded by
$$A_E(P_{X,Y}) \le \big(|\mathcal{M}_1|^\rho \wedge |\mathcal{M}_2|^\rho\big)\, A_B^{(l)}(P_{X,Y}) \wedge 2^{\rho H_{\tilde{\rho}}(X|Y)}. \tag{18}$$
We sketch a proof of Theorems 1 and 2 in Section V ahead.
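Plugging numbers into (10)–(13) is mechanical; the following sketch (illustrative parameter values, not from the paper) does so for the guessing version:

```python
import math

def theorem1_bounds(H, rho, c_s, c1, c2, m1, m2, x_size):
    """Evaluate the bounds (10)-(13) of Theorem 1.  H is the conditional
    Renyi entropy H_{rho~}(X|Y) in bits; m1, m2 are |M1|, |M2|; x_size
    is |X|.  Returns a dict with the four bound values."""
    lg = math.log2
    kappa = (1 + math.log(x_size)) ** (-rho)          # (1 + ln|X|)^(-rho)
    bob_ub = 2 ** (rho * (H - lg(c_s * c1 * c2) + 1))             # (10)
    eve_lb = kappa * 2 ** (rho * (H - lg(c1 + c2)))               # (11)
    bob_lb = max(kappa * 2 ** (rho * (H - lg(m1 * m2))), 1.0)     # (12)
    # (13) bounds A_E via Bob's actual ambiguity A_B^{(g)}; here we use
    # the achievable bound (10) as a stand-in for it.
    eve_ub = min(min(m1, m2) ** rho * bob_ub, 2 ** (rho * H))     # (13)
    return dict(bob_ub=bob_ub, eve_lb=eve_lb, bob_lb=bob_lb, eve_ub=eve_ub)

# Example: H = 10 bits, rho = 1, |M1| = |M2| = 64, (c_s, c1, c2) = (32, 2, 2),
# which satisfies (9).
print(theorem1_bounds(H=10.0, rho=1.0, c_s=32, c1=2, c2=2,
                      m1=64, m2=64, x_size=2 ** 16))
```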
Here, we discuss an important implication of the theorems:

Note 3: We can choose the conditional PMF in (1) so that, neglecting polylogarithmic factors of $|\mathcal{X}|$, Bob's ambiguity satisfies any prespecified upper bound $U_B$ no smaller than $2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{M}_1||\mathcal{M}_2|)} \vee 1$, while Eve's ambiguity is guaranteed to be $\big(|\mathcal{M}_1|^\rho \wedge |\mathcal{M}_2|^\rho\big) U_B \wedge 2^{\rho H_{\tilde{\rho}}(X|Y)}$.

To show that Note 3 holds, we next argue that the bounds in Theorems 1 and 2 are tight in the sense that with a judicious choice of $(c_s, c_1, c_2)$ the achievability results (namely (10)–(11) in the "guessing version" and (15)–(16) in the "list version") match the corresponding converse results (namely (12)–(13) in the "guessing version" and (17)–(18) in the "list version") up to polylogarithmic factors of $|\mathcal{X}|$. By possibly relabeling the hints, we can assume w.l.g. that $|\mathcal{M}_1| \le |\mathcal{M}_2|$. If $|\mathcal{M}_1|$ exceeds $2^{H_{\tilde{\rho}}(X|Y)}$, we can choose $(c_s, c_1, c_2) = (|\mathcal{M}_1|, 1, 1)$. Neglecting polylogarithmic factors of $|\mathcal{X}|$, this choice guarantees that Bob's ambiguity be close to one, while Eve's ambiguity is $2^{\rho H_{\tilde{\rho}}(X|Y)}$. Suppose now that $|\mathcal{M}_1|$ does not exceed $2^{H_{\tilde{\rho}}(X|Y)}$. In this case we can choose $(c_s, c_1, c_2)$ so that $c_2 \ge c_1$ and, neglecting logarithmic factors of $|\mathcal{X}|$, so that $c_s c_1 = |\mathcal{M}_1|$ while $c_s c_1 c_2$ assumes any given integer value between $|\mathcal{M}_1|$ and $(|\mathcal{M}_1||\mathcal{M}_2|) \wedge 2^{H_{\tilde{\rho}}(X|Y)}$. This indeed proves the claim: neglecting polylogarithmic factors of $|\mathcal{X}|$, we can guarantee that Bob's ambiguity satisfy any given upper bound no smaller than the RHS of (12) or (17), while Eve's ambiguity satisfies (13) or (18) with equality.
B. Asymptotic Results

Consider now the asymptotic regime where $(X, Y)$ is an $n$-tuple. In this case the results are the same for both versions of the problem, and we thus refer to both $A_B^{(g)}$ and $A_B^{(l)}$ by $A_B$. With a judicious choice of $(c_s, c_1, c_2)$ one can show that Theorems 1 and 2 imply the following asymptotic result:
Corollary 4: Let $\{(X_i, Y_i)\}_{i \in \mathbb{N}}$ be a discrete-time stochastic process with finite alphabet $\mathcal{X} \times \mathcal{Y}$, and suppose its conditional Rényi entropy-rate $H_{\tilde{\rho}}(X|Y)$ is well-defined. Given any positive rate-pair $(R_1, R_2)$, the privacy-exponent is
$$E_E = \begin{cases} \rho\big(R_1 \wedge R_2 \wedge H_{\tilde{\rho}}(X|Y)\big), & R_1 + R_2 > H_{\tilde{\rho}}(X|Y), \\ -\infty, & R_1 + R_2 < H_{\tilde{\rho}}(X|Y). \end{cases} \tag{19}$$
In the full version of this paper [1], we generalize Corollary 4 to a scenario where Bob's ambiguity may grow exponentially with a given normalized (by $n$) exponent $E_B$.
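For example (an illustration, not from the paper), if the password symbols are drawn IID and uniformly over an alphabet of size $2^b$ and there is no side information, then $H_{\tilde{\rho}}(X|Y) = b$ bits per symbol, and (19) reduces to a simple formula:

```python
def privacy_exponent(r1, r2, h, rho):
    """Privacy-exponent (19): rho * min(R1, R2, H) if R1 + R2 > H, else
    -infinity (the boundary case R1 + R2 == H is not covered by (19))."""
    if r1 + r2 > h:
        return rho * min(r1, r2, h)
    return float("-inf")

# IID uniform password symbols over {0,1}^8: H = 8 bits.
print(privacy_exponent(r1=5.0, r2=6.0, h=8.0, rho=1.0))   # -> 5.0
print(privacy_exponent(r1=3.0, r2=4.0, h=8.0, rho=1.0))   # -> -inf
```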
IV. LISTS AND GUESSES

The results for the "guessing version" and the "list version" are remarkably similar. To understand why, we relate task-encoders to guessing functions. We show that a good guessing function induces a good task-encoder and vice versa:
Theorem 5:
Let $(X, Y)$ be drawn from the finite set $\mathcal{X} \times \mathcal{Y}$ according to the PMF $P_{X,Y}$. Using the side-information $Y$, a stochastic task-encoder describes the task $X$ by a chance variable $M$, which it draws from a finite set $\mathcal{M}$ according to some conditional PMF
$$\Pr[M = m \,|\, X = x, Y = y], \quad m \in \mathcal{M},\ x \in \mathcal{X},\ y \in \mathcal{Y}. \tag{20}$$
For any PMF (20) define for all $m \in \mathcal{M}$ and $y \in \mathcal{Y}$ the lists
$$\mathcal{L}^y_m = \{x \in \mathcal{X} : \Pr[X = x \,|\, Y = y, M = m] > 0\}. \tag{21}$$
1) For every conditional PMF (20) the lists $\{\mathcal{L}^y_m\}$ induce a guessing function $G(\cdot|Y)$ for $X$ such that
$$\mathsf{E}[G(X|Y)^\rho] \le |\mathcal{M}|^\rho\, \mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big]. \tag{22}$$
2) Every guessing function $G(\cdot|Y)$ for $X$ and every positive integer $v \le |\mathcal{X}|$ satisfying
$$|\mathcal{M}| \ge v\,\big(\lfloor\log\lceil|\mathcal{X}|/v\rceil\rfloor + 1\big) \tag{23}$$
induce a $\{0,1\}$-valued conditional PMF (20), i.e., a deterministic task-encoder, whose lists $\{\mathcal{L}^y_m\}$ satisfy
$$\mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big] \le \mathsf{E}\big[\lceil G(X|Y)/v\rceil^\rho\big]. \tag{24}$$
To prove Theorem 5, we need the following fact:
Fact 1: Fix a positive integer $u$, and let $h(\cdot)$ map every $k \in \{1, \ldots, u\}$ to $\lfloor\log k\rfloor$. Then,
$$\big|\big\{\tilde{k} \in \{1, \ldots, u\} : h(\tilde{k}) = h(k)\big\}\big| \le k, \quad k \in \{1, \ldots, u\}. \tag{25}$$
Proof: If $k, \tilde{k} \in \{1, \ldots, u\}$ are such that $h(\tilde{k}) = h(k)$, then $2^{\lfloor\log k\rfloor} \le \tilde{k} < 2^{\lfloor\log k\rfloor + 1}$. There are at most $2^{\lfloor\log k\rfloor} \le k$ integers $\tilde{k}$ in this range, and hence (25) holds.
Proof of Theorem 5: As to the first part, suppose we are given a conditional PMF (20) with corresponding lists $\{\mathcal{L}^y_m\}$ as in (21). For each $y \in \mathcal{Y}$, order the lists $\{\mathcal{L}^y_m\}_{m \in \mathcal{M}}$ in increasing order of their cardinalities, and order the elements in each list in some arbitrary way. Now consider the guessing order where we first guess the elements of the first (and smallest) list in their respective order, followed by those elements in the second list that have not yet been guessed (i.e., that are not contained in the first list), and we continue until concluding by guessing those elements of the last (and longest) list that have not been previously guessed. Let $G(\cdot|Y)$ be the corresponding guessing function, and observe that
$$\mathsf{E}[G(X|Y)^\rho] = \sum_{x,y} P_{X,Y}(x, y)\, \big|\{\tilde{x} : G(\tilde{x}|y) \le G(x|y)\}\big|^\rho \overset{(a)}{\le} \sum_{x,y} P_{X,Y}(x, y)\, |\mathcal{M}|^\rho \min_{m : x \in \mathcal{L}^y_m} |\mathcal{L}^y_m|^\rho \le |\mathcal{M}|^\rho\, \mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big],$$
where $(a)$ holds because for all $x, \tilde{x} \in \mathcal{X}$ and $y \in \mathcal{Y}$ a necessary condition for $G(\tilde{x}|y) \le G(x|y)$ is that $\tilde{x} \in \mathcal{L}^y_{\tilde{m}}$ for some $\tilde{m} \in \mathcal{M}$ satisfying $|\mathcal{L}^y_{\tilde{m}}| \le \min_{m : x \in \mathcal{L}^y_m} |\mathcal{L}^y_m|$, and the number of lists whose size does not exceed $\min_{m : x \in \mathcal{L}^y_m} |\mathcal{L}^y_m|$ is at most $|\mathcal{M}|$.

As to the second part, suppose we are given a guessing function $G(\cdot|Y)$ for $X$ and a positive integer $v \le |\mathcal{X}|$ that satisfies (23). Let $\mathcal{Z} = \{0, \ldots, v - 1\}$ and $\mathcal{S} = \{0, \ldots, \lfloor\log\lceil|\mathcal{X}|/v\rceil\rfloor\}$. From (23) it follows that $|\mathcal{M}| \ge |\mathcal{Z}||\mathcal{S}|$. It thus suffices to prove the existence of a task-encoder that uses only $|\mathcal{Z}||\mathcal{S}|$ possible descriptions, and we thus assume w.l.g. that $\mathcal{M} = \mathcal{Z} \times \mathcal{S}$. That is, using the side-information $y$ the task-encoder (deterministically) describes $x$ by $m = (z, s)$. The encoding involves two steps:
Step 1: In Step 1 the encoder first computes $Z \in \mathcal{Z}$ as the remainder of the Euclidean division of $G(X|Y) - 1$ by $|\mathcal{Z}|$. It then constructs from $G(\cdot|Y)$ a guessing function $G(\cdot|Y, Z)$ for $X$ as follows. Given $Y = y$ and $Z = z$, the task $X$ must be in the set $\mathcal{X}^{y,z} \triangleq \{x \in \mathcal{X} : G(x|y) - 1 \equiv z \bmod |\mathcal{Z}|\}$. The encoder constructs the guessing function $G(\cdot|y, z)$ so that, in the corresponding guessing order, we first guess the elements of $\mathcal{X}^{y,z}$ in increasing order of $G(x|y)$. For $l \in \{1, \ldots, |\mathcal{X}^{y,z}|\}$ our $l$-th guess $x_l$ is thus the element of $\mathcal{X}^{y,z}$ for which $G(x_l|y) = z + 1 + (l - 1)|\mathcal{Z}|$. Once we have guessed all the elements of $\mathcal{X}^{y,z}$, we guess the remaining elements of $\mathcal{X}$ in some arbitrary order. This order is immaterial because $X$ is guaranteed to be in the set $\mathcal{X}^{y,z}$. Since $z + 1 \in \{1, \ldots, |\mathcal{Z}|\}$, we find that $G(x|y, z) = \lceil G(x|y)/|\mathcal{Z}|\rceil$ whenever $x \in \mathcal{X}^{y,z}$. But $X$ is guaranteed to be in the set $\mathcal{X}^{y,z}$. Hence, the guessing function $G(\cdot|Y, Z)$ for $X$ satisfies
$$G(X|Y, Z) = \lceil G(X|Y)/|\mathcal{Z}|\rceil. \tag{26}$$
Step 2: In Step 2 the encoder first computes $S = \lfloor\log G(X|Y, Z)\rfloor \in \mathcal{S}$, and then describes the task $X$ by $M \triangleq (Z, S)$. Given $Y = y$, $Z = z$, and $S = s$, the task $X$ must be in the set $\mathcal{X}^{y,z}_s \triangleq \{x \in \mathcal{X} : \lfloor\log G(x|y, z)\rfloor = s\}$. Fact 1 and the fact that the guessing function $G(\cdot|y, z)$ is a bijection imply that $|\mathcal{X}^{y,z}_s| \le G(x|y, z)$ for all $x \in \mathcal{X}^{y,z}_s$. Since $X$ is guaranteed to be in the set $\mathcal{X}^{Y,Z}_S$, we have $|\mathcal{X}^{Y,Z}_S| \le G(X|Y, Z)$. From $M = (Z, S)$ and (21) we obtain that the list $\mathcal{L}^Y_M$ is contained in the set $\mathcal{X}^{Y,Z}_S$ and thus satisfies
$$|\mathcal{L}^Y_M| \le G(X|Y, Z). \tag{27}$$
Recalling that $|\mathcal{Z}| = v$, we conclude from (26) and (27) that
$$\mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big] \le \mathsf{E}\big[G(X|Y, Z)^\rho\big] = \mathsf{E}\big[\lceil G(X|Y)/v\rceil^\rho\big]. \tag{28}$$
Since $Z$ and $S$ are deterministic given $(X, Y)$, the conditional PMF (20) associated with $M = (Z, S)$ is $\{0,1\}$-valued.
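The two steps above define a concrete deterministic encoder. The following Python sketch (an illustration of the proof's construction under the stated assumptions, with ranks standing in for the tasks and no side information) computes the description $m = (z, s)$ and checks (27):

```python
import math

def encode_task(g, v):
    """Describe a task by m = (z, s) given its rank g = G(x|y) >= 1 in a
    fixed guessing order.  Step 1: z = (g - 1) mod v, which leaves rank
    ceil(g / v) among the candidates consistent with z.  Step 2: s is
    the integer part of log2 of that reduced rank."""
    z = (g - 1) % v
    g_reduced = -(-g // v)          # ceil(g / v), cf. (26)
    s = int(math.log2(g_reduced))   # cf. Step 2
    return z, s

def decode_list(m, v, n_x):
    """Bob's list (21): all ranks consistent with m = (z, s)."""
    z, s = m
    return [g for g in range(1, n_x + 1)
            if (g - 1) % v == z and int(math.log2(-(-g // v))) == s]

# Toy check of (27): the list size never exceeds ceil(g / v).
n_x, v = 100, 4
for g in range(1, n_x + 1):
    lst = decode_list(encode_task(g, v), v, n_x)
    assert g in lst and len(lst) <= -(-g // v)
```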
The choice of $v$ as $\lfloor|\mathcal{M}|/(\lfloor\log|\mathcal{X}|\rfloor + 1)\rfloor$ and [4, Equation (26)], i.e., $\lceil\xi\rceil^\rho < 2^\rho \xi^\rho$ for $\xi \ge 1$, imply our next result:

Corollary 6: Every guessing function $G(\cdot|Y)$ for $X$ induces a deterministic task-encoder corresponding to a $\{0,1\}$-valued conditional PMF (20) that satisfies
$$\mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big] \le 2^\rho\, \mathsf{E}[G(X|Y)^\rho] \left(\frac{|\mathcal{M}|}{\lfloor\log|\mathcal{X}|\rfloor + 1} - 1\right)^{-\rho}. \tag{29}$$
Combined with Arikan's bounds [3, Theorem 1 and Proposition 4] on $\mathsf{E}[G(X|Y)^\rho]$, Equations (22) and (29) provide an upper and a lower bound on the smallest $\mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big]$ that is achievable for a given $|\mathcal{M}|$. These bounds are weaker than [4, Theorem 1.1 and Theorem 6.1] in the finite blocklength regime but tight enough to prove the asymptotic results [4, Theorem 1.2 and Theorem 6.2].

Another interesting corollary to Theorem 5 results from the choice of $v$ as $1$ (see Section VI for an implication):
Corollary 7: For $|\mathcal{M}| = \lfloor\log|\mathcal{X}|\rfloor + 1$ every guessing function $G(\cdot|Y)$ induces a deterministic task-encoder for which
$$\mathsf{E}\big[|\mathcal{L}^Y_M|^\rho\big] \le \mathsf{E}\big[G(X|Y)^\rho\big]. \tag{30}$$

V. ON THE PROOF OF THEOREMS 1 AND 2

Key to the proofs is quantifying by how much side-information $Z$ helps guessing. We show that if $Z$ takes value in a finite set $\mathcal{Z}$, then it can reduce the $\rho$-th moment of the number of guesses by at most a factor of $|\mathcal{Z}|^\rho$.
Lemma 1: Let $(X, Y, Z)$ be drawn from the finite set $\mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ according to the PMF $P_{X,Y,Z}$. Then,
$$\mathsf{E}[G^*(X|Y, Z)^\rho] \ge \mathsf{E}\big[\lceil G^*(X|Y)/|\mathcal{Z}|\rceil^\rho\big], \tag{31}$$
where $G^*(\cdot|Y, Z)$ minimizes $\mathsf{E}[G(X|Y, Z)^\rho]$ and $G^*(\cdot|Y)$ minimizes $\mathsf{E}[G(X|Y)^\rho]$. Equality holds whenever $Z = f(X, Y)$ for some $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ for which $f(x, y) = f(\tilde{x}, y)$ implies either $\lceil G^*(x|y)/|\mathcal{Z}|\rceil \ne \lceil G^*(\tilde{x}|y)/|\mathcal{Z}|\rceil$ or $x = \tilde{x}$. Such a function always exists because for all $l \in \mathbb{N}$ at most $|\mathcal{Z}|$ different $x \in \mathcal{X}$ satisfy $\lceil G^*(x|y)/|\mathcal{Z}|\rceil = l$.

Proof: If $g(x, y) \in \arg\min_{z \in \mathcal{Z}} G^*(x|y, z)$ for $(x, y) \in \mathcal{X} \times \mathcal{Y}$, then $\mathsf{E}[G^*(X|Y, Z)^\rho] \ge \min_G \mathsf{E}[G(X|Y, g(X, Y))^\rho]$. It thus suffices to prove (31) for the case where $Z$ is deterministic given $(X, Y)$, and we thus assume w.l.g. that $Z = g(X, Y)$ for some function $g : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$. Consider
$$\mathsf{E}[G(X|Y, Z)^\rho] = \sum_{x,y} P_{X,Y}(x, y)\, G(x|y, g(x, y))^\rho, \tag{32}$$
where $G(\cdot|Y, g(X, Y))$ is a guessing function. Note that $G(x|y, g(x, y)) = G(\tilde{x}|y, g(\tilde{x}, y))$ implies $g(x, y) \ne g(\tilde{x}, y)$ for all $y \in \mathcal{Y}$ and distinct $x, \tilde{x} \in \mathcal{X}$. For every $l \in \mathbb{N}$ there are thus at most $|\mathcal{Z}|$ different $x \in \mathcal{X}$ for which $G(x|y, g(x, y)) = l$. For each $y \in \mathcal{Y}$ order the possible realizations of $X$ in decreasing order of $P_{X,Y}(x, y)$, i.e., in decreasing order of their posterior probabilities given $Y = y$, and let $x^y_j$ denote the $j$-th element. Clearly, (32) is minimized over $g(\cdot, \cdot)$ and $G(\cdot|Y, g(X, Y))$ if for $l \in \mathbb{N}$ and $y \in \mathcal{Y}$ we have $G(x|y, g(x, y)) = l$ whenever $x = x^y_j$ for some $j$ satisfying $(l - 1)|\mathcal{Z}| + 1 \le j \le l|\mathcal{Z}|$, i.e., $\lceil j/|\mathcal{Z}|\rceil = l$. Since $G^*(\cdot|Y)$ minimizes $\mathsf{E}[G(X|Y)^\rho]$, it orders the elements of $\mathcal{X}$ in decreasing order of their posterior probabilities given $Y$. We can thus choose $x^y_j$ to be the unique $x \in \mathcal{X}$ for which $G^*(x|y) = j$. Hence, (32) is minimized if $f(\cdot, \cdot)$ satisfies the specifications in the lemma, $g(\cdot, \cdot) = f(\cdot, \cdot)$, and $G(x|y, f(x, y)) = \lceil G^*(x|y)/|\mathcal{Z}|\rceil$. The minimum equals the RHS of (31).
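The equality condition of Lemma 1 is constructive: labeling the candidates along the optimal guessing order and taking the label modulo $|\mathcal{Z}|$ yields such an $f$. A small Python sketch (illustrative; made-up PMF, no $Y$ for brevity) checks that this choice makes the optimal guessing moment equal $\mathsf{E}[\lceil G^*(X)/|\mathcal{Z}|\rceil^\rho]$:

```python
import numpy as np

def optimal_moment_given_z(p_x, z_of_x, rho):
    """E[G*(X|Z)^rho]: for each value z, rank its candidates by
    decreasing probability and accumulate the rho-th moment."""
    total = 0.0
    for z in np.unique(z_of_x):
        idx = np.flatnonzero(z_of_x == z)
        ranked = idx[np.argsort(-p_x[idx])]
        for rank, x in enumerate(ranked, start=1):
            total += p_x[x] * rank ** rho
    return total

def lemma1_equality_check(p_x, z_size, rho=1.0):
    """Build Z = f(X) with z = (G*(x) - 1) mod z_size along the optimal
    order G* (decreasing probability); Lemma 1 says this choice meets
    (31) with equality."""
    order = np.argsort(-p_x)
    g_star = np.empty(len(p_x), dtype=int)
    g_star[order] = np.arange(1, len(p_x) + 1)
    z_of_x = (g_star - 1) % z_size
    lhs = optimal_moment_given_z(p_x, z_of_x, rho)
    rhs = (p_x * np.ceil(g_star / z_size) ** rho).sum()
    assert np.isclose(lhs, rhs)
    return lhs

p_x = np.array([0.4, 0.25, 0.15, 0.1, 0.06, 0.04])
print(lemma1_equality_check(p_x, z_size=2, rho=1.0))
```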
Lemma 1 and [4, Equation (26)] imply the following result:

Corollary 8: Draw $(X, Y)$ from the finite set $\mathcal{X} \times \mathcal{Y}$ according to the PMF $P_{X,Y}$, and let $\mathcal{Z}$ be a finite set. There is a function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ such that $Z = f(X, Y)$ satisfies
$$\min_G \mathsf{E}[G(X|Y, Z)^\rho] < 2^\rho\, |\mathcal{Z}|^{-\rho} \min_G \mathsf{E}[G(X|Y)^\rho]. \tag{33}$$
Conversely, every chance variable $Z$ with alphabet $\mathcal{Z}$ satisfies
$$\min_G \mathsf{E}[G(X|Y, Z)^\rho] \ge |\mathcal{Z}|^{-\rho} \min_G \mathsf{E}\big[G(X|Y)^\rho\big] \vee 1. \tag{34}$$

We now sketch the proofs of Theorems 1 and 2, starting with the direct part. Fix $(c_s, c_1, c_2) \in \mathbb{N}^3$ satisfying (9) in the "guessing version" and (14) in the "list version". For each $\nu \in \{s, 1, 2\}$ let $V_\nu$ be a chance variable taking value in the set $\mathcal{V}_\nu = \{0, \ldots, c_\nu - 1\}$. Corollary 8 and [3, Proposition 4] imply that there is a $\{0,1\}$-valued conditional PMF $\Pr[(V_s, V_1, V_2) = (v_s, v_1, v_2) \,|\, X = x, Y = y]$ for which
$$\min_G \mathsf{E}\big[G(X|Y, V_s, V_1, V_2)^\rho\big] < 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_s c_1 c_2) + 1)}, \tag{35}$$
and likewise [4, Theorem 6.1] implies that there is a $\{0,1\}$-valued conditional PMF for which
$$\mathsf{E}\big[|\mathcal{L}^Y_{V_s,V_1,V_2}|^\rho\big] < 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log(c_s c_1 c_2 - \log|\mathcal{X}| - 2))}. \tag{36}$$
Both (9) and (14) imply $|\mathcal{M}_1| \ge c_s c_1$ and $|\mathcal{M}_2| \ge c_s c_2$. It thus suffices to prove (10)–(11) and (15)–(16) for a conditional PMF (1) that assigns positive mass only to $c_s c_1$ elements of $\mathcal{M}_1$ and $c_s c_2$ elements of $\mathcal{M}_2$, and we thus assume w.l.g. that $\mathcal{M}_1 = \mathcal{V}_s \times \mathcal{V}_1$ and $\mathcal{M}_2 = \mathcal{V}_s \times \mathcal{V}_2$. Hence, we can choose $M_1 = (V_s \oplus_{c_s} U, V_1)$ and $M_2 = (U, V_2)$, where $U$ is independent of $(X, Y)$ and uniformly distributed over $\mathcal{V}_s$, and where $\oplus_{c_s}$ denotes modulo-$c_s$ addition. For this choice (10) follows from (35) and (15) from (36). The proof of (11) and (16) is more involved. It builds on the following two ideas: 1) since $U$ is computable from both $(X, M_1)$ and $(X, M_2)$, we can w.l.g. assume that Eve must guess $(X, U)$ instead of $X$; 2) given two guessing functions $G_1(\cdot, \cdot|Y, M_1)$ and $G_2(\cdot, \cdot|Y, M_2)$ for $(X, U)$, one can show that $G_1(\cdot, \cdot|Y, M_1) \wedge G_2(\cdot, \cdot|Y, M_2)$ behaves like a guessing function $G(\cdot, \cdot|Y, Z)$ for $(X, U)$, where the additional side-information $Z$ assumes at most $c_s(c_1 + c_2)$ different values. Once 1) and 2) have been established, the proof is concluded by Corollary 8, [3, Theorem 1], and $H_{\tilde{\rho}}(X, U|Y) = H_{\tilde{\rho}}(X|Y) + \log c_s$.

The converse is straightforward: in the "guessing version" the bound (12) on Bob's ambiguity follows from Corollary 8 and [3, Theorem 1]. In the "list version" (17) follows from [4, Theorem 6.1] and the observation that $A_B^{(l)}$ is minimized if the PMF in (1) is $\{0,1\}$-valued. Clearly, Eve's ambiguity satisfies $A_E(P_{X,Y}) \le \min_{k \in \{1,2\}} \big(\min_{G_k} \mathsf{E}[G_k(X|Y, M_k)^\rho]\big)$, and Corollary 8 implies for each $k \in \{1, 2\}$ and $l \in \{1, 2\} \setminus \{k\}$
$$\min_G \mathsf{E}[G(X|Y, M_1, M_2)^\rho] \ge |\mathcal{M}_l|^{-\rho} \min_{G_k} \mathsf{E}[G_k(X|Y, M_k)^\rho].$$
Since $\min_G \mathsf{E}[G(X|Y, M_1, M_2)^\rho] \le \mathsf{E}\big[|\mathcal{L}^Y_{M_1,M_2}|^\rho\big]$, we thus find that in both versions Eve's ambiguity exceeds Bob's by at most a factor of $|\mathcal{M}_1|^\rho \wedge |\mathcal{M}_2|^\rho$. Due to [3, Proposition 4] and because Eve can guess $X$ using only $Y$, we have $\min_G \mathsf{E}[G(X|Y)^\rho] \le 2^{\rho H_{\tilde{\rho}}(X|Y)}$. Hence, (13) and (18) hold.
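The identity $H_{\tilde{\rho}}(X, U|Y) = H_{\tilde{\rho}}(X|Y) + \log c_s$ used above follows directly from (8); the short derivation below (our addition, using only that $U$ is uniform over $c_s$ values and independent of $(X, Y)$) makes the step explicit:

```latex
\begin{align*}
H_{\alpha}(X,U \mid Y)
  &= \frac{\alpha}{1-\alpha} \log \sum_{y} \Big( \sum_{x,u}
     \big( P_{X,Y}(x,y)\, \tfrac{1}{c_s} \big)^{\alpha} \Big)^{1/\alpha}
   % summing over the c_s values of u pulls out c_s * c_s^{-alpha}
   = \frac{\alpha}{1-\alpha} \log \sum_{y} \Big( c_s^{\,1-\alpha} \sum_{x}
     P_{X,Y}(x,y)^{\alpha} \Big)^{1/\alpha} \\
  &= \frac{\alpha}{1-\alpha} \cdot \frac{1-\alpha}{\alpha} \log c_s
     + H_{\alpha}(X \mid Y)
   = H_{\alpha}(X \mid Y) + \log c_s .
\end{align*}
```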
VI. DISCUSSION

We next explain why the results for the "guessing version" and the "list version" differ only by polylogarithmic factors of $|\mathcal{X}|$. In the "guessing version" Bob uses a guessing function $G(\cdot|Y, M_1, M_2)$ to guess $X$ based on the side-information $Y$ and the hints $M_1$ and $M_2$, and his ambiguity is $\mathsf{E}[G(X|Y, M_1, M_2)^\rho]$. On account of Corollary 7 we can construct an additional hint $M_3$, which assumes $\lfloor\log|\mathcal{X}|\rfloor + 1$ different values and satisfies $\mathsf{E}\big[|\mathcal{L}^Y_{M_1,M_2,M_3}|^\rho\big] \le \mathsf{E}\big[G^*(X|Y, M_1, M_2)^\rho\big]$, where $\mathcal{L}^Y_{M_1,M_2,M_3}$ is the smallest list that is guaranteed to contain $X$ given $(Y, M_1, M_2, M_3)$. Suppose Alice maps $X$ to the hints $M'_1 = (M_1, M_3)$ and $M'_2 = M_2$. Then, Bob's ambiguity in the "list version" is $\mathsf{E}\big[|\mathcal{L}^Y_{M'_1,M'_2}|^\rho\big] = \mathsf{E}\big[|\mathcal{L}^Y_{M_1,M_2,M_3}|^\rho\big]$ and hence no larger than $\mathsf{E}[G^*(X|Y, M_1, M_2)^\rho]$. Since $M_3$ assumes only $\lfloor\log|\mathcal{X}|\rfloor + 1$ different values, one can use Lemma 1 to show that Eve's ambiguity decreases by at most a polylogarithmic factor of $|\mathcal{X}|$ (compared to the case where the hints are $M_1$ and $M_2$).
VII. EXTENSIONS

In the full version of this paper [1], we discuss several modifications of the model: we extend the analysis to scenarios where we know in advance which hint Eve will find (e.g., because the other hint is stored in a safe), or where secrecy is achieved by means of a secret key, which is available to Alice and Bob but not to Eve. Moreover, we generalize the asymptotic results to the case where Bob and Eve must reconstruct $X^n$ within a given distortion $D$ (cf. [4], [6]).

REFERENCES

[1] A. Bracher, E. Hof, and A. Lapidoth, "Secrecy constrained encoding for guessing decoders and list-decoders," in preparation.
[2] N. Merhav and E. Arikan, "The Shannon cipher system with a guessing wiretapper," IEEE Trans. Inf. Theory, vol. IT-45, no. 6, pp. 1860–1866, Sep. 1999.
[3] E. Arikan, "An inequality on guessing and its applications to sequential decoding," IEEE Trans. Inf. Theory, vol. IT-42, no. 1, pp. 99–105, Jan. 1996.
[4] C. Bunte and A. Lapidoth, "Encoding tasks and Rényi entropy," arXiv:1401.6338 [cs.IT], 2014.
[5] S. Arimoto, "Information measures and capacity of order α for discrete memoryless channels," Topics in Inf. Theory, vol. 17, no. 6, pp. 41–52, 1977.
[6] E. Arikan and N. Merhav, "Guessing subject to distortion," IEEE Trans. Inf. Theory, vol. IT-44, no. 3, pp. 1041–1056, May 1998.