Multi-User Privacy: The Gray-Wyner System and Generalized Common Information
Ravi Tandon, Lalitha Sankar, H. Vincent Poor
Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544. Email: {rtandon, lalitha, poor}@princeton.edu

(The research was supported by the Air Force Office of Scientific Research MURI Grant FA-…, by the National Science Foundation Grants CNS-… and CCF-…, and by a Fellowship from the Council on Science and Technology at Princeton University.)

Abstract—The problem of preserving privacy when a multivariate source is required to be revealed partially to multiple users is modeled as a Gray-Wyner source coding problem with $K$ correlated sources at the encoder and $K$ decoders, in which the $k$th decoder, $k = 1, 2, \ldots, K$, losslessly reconstructs the $k$th source via a common link of rate $R_0$ and a private link of rate $R_k$. The privacy requirement of keeping each decoder oblivious of all sources other than the one intended for it is introduced via an equivocation constraint $E_k$ at decoder $k$ such that the total equivocation summed over all decoders satisfies $E \ge \Delta$. The set of achievable rate-equivocation $(K+2)$-tuples $(\{R_k\}_{k=1}^{K}, R_0, \Delta)$ is completely characterized. Using this characterization, two different definitions of common information are presented and are shown to be equivalent.
I. INTRODUCTION
Information sources often need to be made accessible to multiple legitimate users simultaneously. However, not all data from the source should be accessible to all users. For example, a computer retailer may need to share the annual revenue of all computers sold with all the vendors but share vendor-specific sale information only with a particular vendor. Similarly, a business consulting firm may share general data about a specific market with all clients associated with that market but share client-specific strategies with only that client. In both cases, one can view sharing the public (shared by all) information via a common link and the private information via a dedicated link. Maximizing the rate over the common link allows the information source (retailer/consulting firm) to share the most allowed publicly with all clients; however, the privacy guarantee requires that no client has access to the private data of the other clients. This paper develops an abstract model and a methodology to study this problem.

We model the problem of revealing partial source information to multiple users while keeping the data specific to each user private from the other users as a Gray-Wyner source coding problem with $K$ correlated sources at the encoder and $K$ decoders, in which the $k$th decoder, $k = 1, 2, \ldots, K$, losslessly reconstructs the $k$th source via a common link of rate $R_0$ and a private link of rate $R_k$. We model the privacy requirement of keeping each decoder oblivious of all sources other than the one intended for it via an equivocation constraint $E_k$ at decoder $k$ such that the total equivocation summed over all decoders satisfies $E \ge \Delta$.

Since privacy is an important aspect of this problem, it is natural to understand the maximal total equivocation that is achievable if the rate on the common link is set to the maximum achievable. On the other hand, imposing the constraint of maximal total equivocation may lead to a different limit on the maximal rate on the common link. In this paper, we show that both requirements, which are formally different definitions, yield the same formulation for the maximal rate on the common link. In keeping with the literature, this common rate is defined as the common information.

The common information of two correlated random variables has been defined independently by Wyner [1] and Gács-Körner [2]. Wyner's definition of common information, as applied to the two-user Gray-Wyner system (without privacy constraints), is the minimum rate on the common link such that the total information shared across all three links (one common and two private) does not exceed the source entropy. On the other hand, the Gács-Körner common information is the maximal entropy of a random variable that two non-interacting terminals can agree upon when one terminal has access to $X^n$ and the other to $Y^n$, where $X$ and $Y$ are correlated random variables. For two correlated variables $X$ and $Y$, the Wyner common information $C_W$, the Gács-Körner common information $C_{GK}$, and the mutual information of the two variables are related as $C_{GK} \le I(X;Y) \le C_W$. Recently, the authors in [3] have generalized Wyner's definition of common information to $K$ variables, henceforth referred to as $B(X_1, X_2, \ldots, X_K)$ for $K$ correlated variables.
While the definition naturally generalizes the two-variable common information, the resulting common information does not satisfy the expected non-increasing property in $K$. In this paper, we present two different definitions of common information: the first is the maximal rate on the common link for which the total equivocation is maximized, and the second is the maximal rate on the common link such that each user losslessly reconstructs its intended source at its entropy.

Fig. 1. The generalized Gray-Wyner source network.
We show that both definitions lead to the same formulation for the common information $C(X_1, X_2, \ldots, X_K)$. We present many properties of $C(X_1, X_2, \ldots, X_K)$ and specifically show that $C(X_1, X_2, \ldots, X_K) \le B(X_1, X_2, \ldots, X_K)$. To the best of our knowledge, this is the first generalization of common information that preserves the non-increasing property and one whose form can be viewed as a natural generalization of the Gács-Körner common information to $K$ variables.

The paper is organized as follows. In Section II, we present the system model. In Section III, we present the rate-equivocation region, develop a formulation for common information in two different ways, and present key properties. In Section IV, we compare our formulation with the $K$-variable generalization of Wyner's common information in [3] and illustrate with examples. We conclude in Section V.

II. SYSTEM MODEL
We consider the following source network. A centralized encoder observes $K$ discrete, memoryless, correlated sources $\{X_k^n\}_{k=1}^{K}$ and is interested in communicating source $X_k$ to decoder $k$ in a lossless manner. The resources available at the encoder comprise two types of noiseless rate-limited links: there are $K$ links of finite rate from the encoder, one to each of the $K$ decoders, and there is a common link of finite rate to all decoders. Figure 1 shows the source broadcasting network under consideration.

An $(n, \{M_k\}_{k=1}^{K}, M_0)$ code for this model is defined by $(K+1)$ encoding functions

$f_0: \mathcal{X}_1^n \times \cdots \times \mathcal{X}_K^n \to \{1, \ldots, M_0\}$, (1)
$f_k: \mathcal{X}_1^n \times \cdots \times \mathcal{X}_K^n \to \{1, \ldots, M_k\}$, $k = 1, \ldots, K$, (2)

and $K$ decoding functions

$g_k: \{1, \ldots, M_0\} \times \{1, \ldots, M_k\} \to \mathcal{X}_k^n$, $k = 1, \ldots, K$.

We define the probability of error at decoder $k$ as $P_{e,k} = \Pr\big(X_k^n \ne g_k(f_0(\bar{X}^n), f_k(\bar{X}^n))\big)$, where $\bar{X}^n \triangleq \{X_k^n\}_{k=1}^{K}$. We define the equivocation at decoder $k$ as

$E_k = \frac{1}{n} H\big(\bar{X}^n \setminus X_k^n \,\big|\, f_0(\bar{X}^n), f_k(\bar{X}^n)\big)$,

and the total equivocation as $E = \sum_{k=1}^{K} E_k$.
Remark 1: Informally, $E_k$ captures the average uncertainty, and hence the privacy achievable, about the remaining $(K-1)$ unintended sources at decoder $k$.

An $(\{R_k\}_{k=1}^{K}, R_0, \Delta)$ rate-equivocation $(K+2)$-tuple is achievable for the source network if there exists an $(n, \{M_k\}_{k=1}^{K}, M_0)$ code such that

$M_0 \le 2^{nR_0}$, (3)
$M_k \le 2^{nR_k}$, $k = 1, \ldots, K$, (4)
$P_{e,k} \le \epsilon_k$, $k = 1, \ldots, K$, (5)
$E \ge \Delta - \epsilon$. (6)

We denote by $\mathcal{R}$ the region of all achievable $(\{R_k\}_{k=1}^{K}, R_0, \Delta)$ rate-equivocation $(K+2)$-tuples.
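To make these definitions concrete, the following sketch (illustrative only, not from the paper) evaluates the equivocations $E_k$ exactly for a toy single-letter ($n = 1$) code with $K = 3$ binary sources; the joint pmf and the maps `f0` and `f_priv` are hypothetical choices made only for illustration.

```python
import itertools
import numpy as np

# Toy single-letter (n = 1) code with K = 3 binary sources.
# p_joint, f0 and f_priv are hypothetical choices, not quantities from the paper.
K, sizes = 3, (2, 2, 2)
rng = np.random.default_rng(0)
p_joint = rng.random(sizes)
p_joint /= p_joint.sum()                 # p(x1, x2, x3)

f0 = lambda x: x[0] ^ x[1] ^ x[2]        # common message J0 = f0(X1, X2, X3)
f_priv = [lambda x, k=k: x[k] for k in range(K)]   # Jk = Xk, so P_{e,k} = 0 trivially

def cond_entropy(joint):
    """H(A | B) in bits for a dict mapping (a, b) -> probability."""
    pb = {}
    for (a, b), prob in joint.items():
        pb[b] = pb.get(b, 0.0) + prob
    return -sum(prob * np.log2(prob / pb[b])
                for (a, b), prob in joint.items() if prob > 0)

for k in range(K):
    joint = {}
    for x in itertools.product(*(range(s) for s in sizes)):
        others = tuple(x[j] for j in range(K) if j != k)   # realization of Xbar \ Xk
        msgs = (f0(x), f_priv[k](x))                       # (J0, Jk) available at decoder k
        joint[(others, msgs)] = joint.get((others, msgs), 0.0) + p_joint[x]
    print(f"E_{k+1} = H(Xbar\\X{k+1} | J0, J{k+1}) = {cond_entropy(joint):.3f} bits")
```

Because $J_k = X_k$ here, each decoder reconstructs its own source perfectly; the printed values are the exact per-decoder equivocations offered by this particular (clearly suboptimal) choice of code.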
III. MAIN CONTRIBUTIONS

A. Rate-Equivocation Region
We state our first result in the following theorem. The proof is presented in the appendix.
Theorem 1:
The region $\mathcal{R}$ of achievable rate-equivocation $(K+2)$-tuples for the source network shown in Figure 1 is the union of all $(K+2)$-tuples $(\{R_k\}_{k=1}^{K}, R_0, \Delta)$ that satisfy

$R_0 \ge I(X_1, X_2, \ldots, X_K; W)$, (7)
$R_k \ge H(X_k \mid W)$, $k = 1, 2, \ldots, K$, (8)
$\Delta \le \sum_{k=1}^{K} H(\bar{X} \mid W, X_k)$, (9)

where the union is over all auxiliary random variables $W$ arbitrarily correlated with $(X_1, X_2, \ldots, X_K)$, and where $\bar{X} \equiv (X_1, X_2, \ldots, X_K)$.
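For a fixed auxiliary variable $W$, the corner point of the region in Theorem 1 can be evaluated directly from the joint pmf $p(w, x_1, \ldots, x_K)$. The sketch below (an illustration under an arbitrarily chosen pmf, not code from the paper) computes the three bounds (7)-(9).

```python
import numpy as np

def entropy(pmf):
    pmf = np.asarray(pmf, dtype=float).ravel()
    pmf = pmf[pmf > 0]
    return float(-(pmf * np.log2(pmf)).sum())

# Hypothetical joint pmf p[w, x1, ..., xK] with |W| = 3 and K = 3 binary sources.
rng = np.random.default_rng(1)
p = rng.random((3, 2, 2, 2))
p /= p.sum()

K = p.ndim - 1
pW = p.sum(axis=tuple(range(1, K + 1)))          # p(w)
pX = p.sum(axis=0)                               # p(x1, ..., xK)

R0 = entropy(pW) + entropy(pX) - entropy(p)      # (7): R0 >= I(X1,...,XK; W)
Rk, Delta = [], 0.0
for k in range(1, K + 1):
    pWXk = p.sum(axis=tuple(j for j in range(1, K + 1) if j != k))   # p(w, xk)
    Rk.append(entropy(pWXk) - entropy(pW))       # (8): Rk >= H(Xk | W)
    Delta += entropy(p) - entropy(pWXk)          # (9): each term is H(Xbar | W, Xk)

print("R0 >=", round(R0, 3), " Rk >=", [round(r, 3) for r in Rk], " Delta <=", round(Delta, 3))
```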
Remark 2: The rate region $\mathcal{R}_{G-W}$ of the Gray-Wyner network without additional equivocation constraints is the region of $(K+1)$ rate tuples that satisfy (7) and (8).

B. Common Information of K Correlated Variables
We now present two definitions for the common information of $K$ correlated random variables.
Definition 1: The common information of $K$ correlated random variables, $C_1$, is the maximal value of $R_0$ such that $(\{R_k\}_{k=1}^{K}, R_0, \Delta_{\max}) \in \mathcal{R}$, where

$\Delta_{\max} \triangleq \sum_{k=1}^{K} H(\bar{X} \mid X_k)$.
Definition 2: The common information of $K$ correlated random variables, $C_2$, is the maximal value of $R_0$ such that $(\{H(X_k) - R_0\}_{k=1}^{K}, R_0) \in \mathcal{R}_{G-W}$.

We next state our second result.

Theorem 2: $C_1$ and $C_2$ are related as follows:

$C_1 = C_2 = \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, 2, \ldots, K} I(X_1 X_2 \cdots X_K; W)$. (10)
Proof: From Definition 1, the achievable equivocation $E$ must satisfy

$E \ge \Delta_{\max} = \sum_{k=1}^{K} H(\bar{X} \mid X_k)$.

On the other hand, any achievable $(\{R_k\}_{k=1}^{K}, R_0, E) \in \mathcal{R}$ also satisfies

$E \le \sum_{k=1}^{K} H(\bar{X} \mid W, X_k)$.

We therefore have the constraint

$\sum_{k=1}^{K} H(\bar{X} \mid W, X_k) \ge \sum_{k=1}^{K} H(\bar{X} \mid X_k)$,

which, since conditioning cannot increase entropy, is equivalent to the following $K$ constraints:

$I(\bar{X} \setminus X_k; W \mid X_k) = 0$, $k = 1, \ldots, K$. (11)

Therefore, from Definition 1, $C_1$ is equal to the maximal $R_0$ subject to (11), which implies that

$C_1 = \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, \ldots, K} I(X_1, \ldots, X_K; W)$.

From Definition 2, $C_2$ is defined as the maximal $R_0$ such that $R_k + R_0 = H(X_k)$ for $k = 1, \ldots, K$ and $(\{R_k\}_{k=1}^{K}, R_0) \in \mathcal{R}_{G-W}$. We therefore have the following constraints for $k = 1, \ldots, K$:

$H(X_k) = R_k + R_0$ (12)
$\ge H(X_k \mid W) + I(X_1, \ldots, X_K; W)$. (13)

These constraints are equivalent to

$I(\bar{X} \setminus X_k; W \mid X_k) = 0$, $k = 1, \ldots, K$.

Therefore, $C_2$ can be written as

$C_2 = \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, \ldots, K} I(X_1, \ldots, X_K; W)$.
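The maximization in (10) is over auxiliary variables $W$ that satisfy the $K$ constraints in (11). The sketch below (the helper name and test pmf are assumptions, not from the paper) computes $I(\bar{X}\setminus X_k; W \mid X_k)$ for each $k$, so a candidate $W$ can be screened for feasibility before evaluating $I(X_1, \ldots, X_K; W)$.

```python
import numpy as np

def entropy(pmf):
    pmf = np.asarray(pmf, dtype=float).ravel()
    pmf = pmf[pmf > 0]
    return float(-(pmf * np.log2(pmf)).sum())

def markov_gaps(p):
    """Return [I(Xbar\\Xk; W | Xk) for k = 1..K] for a joint pmf p[w, x1, ..., xK].

    All gaps are (numerically) zero exactly when W satisfies the K Markov chains
    W - Xk - Xbar\\Xk in (11), i.e. when W is feasible in the maximization (10)."""
    K = p.ndim - 1
    pX = p.sum(axis=0)                                   # p(x1, ..., xK)
    gaps = []
    for k in range(1, K + 1):
        rest = tuple(j for j in range(1, K + 1) if j != k)
        pWXk = p.sum(axis=rest)                          # p(w, xk)
        pXk = pWXk.sum(axis=0)                           # p(xk)
        # I(Xbar\Xk; W | Xk) = H(Xbar | Xk) - H(Xbar | W, Xk)
        gaps.append((entropy(pX) - entropy(pXk)) - (entropy(p) - entropy(pWXk)))
    return gaps

# A constant W is always feasible (all gaps zero) but yields I(Xbar; W) = 0.
p_const = np.ones((1, 2, 2, 2)) / 8.0
print(markov_gaps(p_const))
# By contrast, W = X1 is generally infeasible: it violates W - X2 - Xbar\X2
# unless X1 is conditionally independent of the remaining sources given X2.
```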
C. Common Information: Properties

We will now develop some properties of the common information of $K$ correlated random variables defined in Theorem 2.
Proposition 1: The common information of $K$ random variables, $C(X_1, X_2, \ldots, X_K)$, is monotonically decreasing in $K$.
Proof: Consider an arbitrary $W$ satisfying the Markov chain relationships

$W - X_k - \bar{X} \setminus X_k$, $k = 1, \ldots, K$. (14)

First consider the following sequence of inequalities:

$I(X_1, \ldots, X_{K-1}, X_K; W) = I(X_1, \ldots, X_{K-1}; W) + I(X_K; W \mid X_1, \ldots, X_{K-1})$ (15)
$\le I(X_1, \ldots, X_{K-1}; W) + I(X_2, \ldots, X_K; W \mid X_1)$ (16)
$= I(X_1, \ldots, X_{K-1}; W)$, (17)

where (17) follows from the Markov chain relationship $W - X_1 - (X_2, \ldots, X_K)$. Now consider the following sequence of inequalities:

$C(X_1, \ldots, X_K) = \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, \ldots, K} I(X_1, \ldots, X_K; W)$ (18)
$\le \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, \ldots, K} I(X_1, \ldots, X_{K-1}; W)$ (19)
$\le \max_{W - X_k - \bar{X}\setminus(X_k, X_K),\; k = 1, \ldots, (K-1)} I(X_1, \ldots, X_{K-1}; W)$ (20)
$= C(X_1, \ldots, X_{K-1})$, (21)

where (19) follows from (17) and (20) follows from the fact that the Markov chain relationship $W - X_k - \bar{X}\setminus X_k$ implies the Markov chain relationship $W - X_k - \bar{X}\setminus(X_k, X_K)$. Since the random variable $X_K$ could be chosen arbitrarily from the set $(X_1, \ldots, X_K)$, (21) shows that the common information is monotonically decreasing in $K$.

Proposition 2: $C(X_1, X_2, \ldots, X_K)$ is upper bounded as

$C(X_1, X_2, \ldots, X_K) \le \min_{i \ne j,\; i,j = 1, 2, \ldots, K} I(X_i; X_j)$. (22)
Proof: We consider an arbitrary $W$ satisfying (14) and upper bound the following mutual information:

$I(X_1, \ldots, X_K; W) = I(X_i; W) + I(\bar{X}\setminus X_i; W \mid X_i)$ (23)
$= I(X_i; W)$ (24)
$\le I(X_i; X_j, W)$ (25)
$= I(X_i; X_j) + I(X_i; W \mid X_j)$ (26)
$= I(X_i; X_j)$, (27)

where (24) follows from the Markov chain condition $W - X_i - \bar{X}\setminus X_i$, and (27) follows from the Markov chain condition $W - X_j - X_i$. The choice of $(i, j)$ was arbitrary, and therefore the common information is upper bounded by the minimum of the pairwise mutual informations among all pairs, i.e.,

$C(X_1, \ldots, X_K) \le \min_{i \ne j} I(X_i; X_j)$.
IV. COMPARISON AND EXAMPLES
In [1], Wyner defines the common information of two correlated random variables $(X_1, X_2)$ as

$B(X_1, X_2) = \inf_{X_1 - W - X_2} I(X_1, X_2; W)$.

One interpretation of this common information can be obtained from the Gray-Wyner source network: the common information $B(X_1, X_2)$ of two random variables is the smallest value of $R_0$ such that $(R_1, R_2, R_0) \in \mathcal{R}_{G-W}$ and $R_0 + R_1 + R_2 \le H(X_1, X_2)$. Recently, this notion of common information was generalized to $K$ correlated random variables in [3]. The common information $B(X_1, \ldots, X_K)$ of $K$ correlated random variables, as defined in [3], is the smallest value of $R_0$ such that $(\{R_k\}_{k=1}^{K}, R_0) \in \mathcal{R}_{G-W}$ and $R_0 + \sum_{k=1}^{K} R_k \le H(X_1, \ldots, X_K)$. The common information $B(X_1, \ldots, X_K)$ is given as

$B(X_1, \ldots, X_K) = \inf I(X_1, \ldots, X_K; W)$,

where the infimum is over all distributions $p(w, x_1, \ldots, x_K)$ that satisfy

$\sum_{w \in \mathcal{W}} p(w, x_1, \ldots, x_K) = p(x_1, \ldots, x_K)$, (28)
$p(x_1, \ldots, x_K \mid w) = \prod_{k=1}^{K} p(x_k \mid w)$. (29)

It was shown in [3] that $B(X_1, \ldots, X_K)$ is monotonically increasing in $K$. We believe that any intuitively satisfactory measure of common information should satisfy the property that the common information decreases as the number of random variables increases. In Proposition 1, we showed that our measure of common information indeed satisfies this property.
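Admissibility in the infimum defining $B(X_1, \ldots, X_K)$ is the conditional-independence condition (29). The sketch below (an illustrative helper under assumed names, not code from [3] or from this paper) measures the worst-case deviation from (29) for a candidate $p(w, x_1, \ldots, x_K)$.

```python
import numpy as np

def cond_indep_violation(p, tol=1e-12):
    """Max deviation from (29), i.e. from p(x1,...,xK | w) = prod_k p(xk | w),
    for a joint pmf p[w, x1, ..., xK].  Zero (up to tol) means W is admissible
    in the infimum defining B(X1, ..., XK)."""
    K = p.ndim - 1
    pW = p.sum(axis=tuple(range(1, K + 1)))               # p(w)
    worst = 0.0
    for w in range(p.shape[0]):
        if pW[w] <= tol:
            continue
        cond = p[w] / pW[w]                               # p(x1,...,xK | w)
        prod = np.ones_like(cond)
        for k in range(K):
            marg = cond.sum(axis=tuple(j for j in range(K) if j != k))  # p(x_{k+1} | w)
            shape = [1] * K
            shape[k] = -1
            prod = prod * marg.reshape(shape)
        worst = max(worst, float(np.abs(cond - prod).max()))
    return worst

# Example: W = X1 for two independent fair bits X1, X2 satisfies (29) exactly.
p = np.zeros((2, 2, 2))
for w in range(2):
    for x2 in range(2):
        p[w, w, x2] = 0.25
print(cond_indep_violation(p))   # ~0.0
```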
We next prove a property of $B(X_1, \ldots, X_K)$ that helps us in comparing it with our common information $C(X_1, \ldots, X_K)$.

Proposition 3: $B(X_1, X_2, \ldots, X_K)$ is lower bounded as follows:

$\max_{i \ne j} I(X_i; X_j) \le B(X_1, X_2, \ldots, X_K)$. (30)

Proof: To prove Proposition 3, consider an arbitrary $W$ satisfying the constraints (28)-(29) and the following sequence of inequalities:

$I(X_1, \ldots, X_K; W) \ge I(X_i; W)$ (31)
$\ge I(X_i; X_j)$, (32)

where (32) follows from the Markov chain relationship $X_i - W - X_j$ and the data processing inequality. In arriving at (32), the choice of $(i, j)$ was arbitrary; therefore we can maximize over all pairs $(i, j)$ with $i \ne j$ to get the best possible lower bound in this manner.

Using Propositions 2 and 3, we have

$C(X_1, \ldots, X_K) \le \min_{i \ne j} I(X_i; X_j) \le \max_{i \ne j} I(X_i; X_j) \le B(X_1, \ldots, X_K)$. (33)

We will now give two examples to illustrate the usefulness of our definition $C(X_1, \ldots, X_K)$ over $B(X_1, \ldots, X_K)$.
Example 1: Consider $K = 3$ random variables $(X_1, X_2, X_3)$ such that $X_1 \sim \mathrm{Ber}(1/2)$, $X_2 = X_1 \oplus N$, where $N \sim \mathrm{Ber}(\delta)$, and $X_3$ is independent of $(X_1, X_2)$. Since $X_3$ is independent of $(X_1, X_2)$, these sources have nothing in common and we should expect the 'common information' to be zero. Note that for these sources, $\min_{i \ne j} I(X_i; X_j) = 0$, whereas $\max_{i \ne j} I(X_i; X_j) = 1 - h(\delta)$, where $h(\cdot)$ denotes the binary entropy function. Therefore, from (33), we have

$0 \le C(X_1, X_2, X_3) \le 0 \le 1 - h(\delta) \le B(X_1, X_2, X_3)$,

which implies that $C(X_1, X_2, X_3) = 0$, whereas $B(X_1, X_2, X_3) > 0$ for any $\delta \in (0, 1/2)$.
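A quick numerical check of Example 1 (illustrative code, not from the paper; $X_3$ is taken to be a uniform bit, which the example leaves unspecified):

```python
import numpy as np

def mutual_info(pab):
    """I(A; B) in bits for a 2-D joint pmf."""
    pa, pb = pab.sum(axis=1), pab.sum(axis=0)
    return sum(pab[i, j] * np.log2(pab[i, j] / (pa[i] * pb[j]))
               for i in range(pab.shape[0]) for j in range(pab.shape[1]) if pab[i, j] > 0)

delta = 0.1
h = lambda q: -q * np.log2(q) - (1 - q) * np.log2(1 - q)   # binary entropy

# Joint pmf of (X1, X2, X3): X1 ~ Ber(1/2), X2 = X1 xor N with N ~ Ber(delta),
# and X3 a uniform bit independent of (X1, X2).
p = np.zeros((2, 2, 2))
for x1 in range(2):
    for n in range(2):
        for x3 in range(2):
            p[x1, x1 ^ n, x3] += 0.5 * (delta if n else 1 - delta) * 0.5

pair_mi = {}
for i in range(3):
    for j in range(i + 1, 3):
        rest = ({0, 1, 2} - {i, j}).pop()
        pair_mi[(i + 1, j + 1)] = mutual_info(p.sum(axis=rest))

print(pair_mi)
print("min =", round(min(pair_mi.values()), 4),
      " max =", round(max(pair_mi.values()), 4),
      " 1 - h(delta) =", round(1 - h(delta), 4))
```

The minimum pairwise mutual information is 0, so $C(X_1, X_2, X_3) = 0$ by Proposition 2, while the maximum equals $1 - h(\delta)$, which lower-bounds $B(X_1, X_2, X_3)$ by Proposition 3.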
Example 2: Consider $K = 3$ random variables $(X_1, X_2, X_3)$ such that $X_1 = (X_0, X_{1p})$, $X_2 = (X_0, X_{2p})$, and $X_3 = (X_0, X_{3p})$, where $(X_0, X_{1p}, X_{2p}, X_{3p})$ are all mutually independent. Since $X_0$ appears to be the only common part of all three sources, we should expect the 'common information' to be equal to the entropy of $X_0$. Note that for these sources, $\min_{i \ne j} I(X_i; X_j) = \max_{i \ne j} I(X_i; X_j) = H(X_0)$. Therefore, from (33), we have

$C(X_1, X_2, X_3) \le H(X_0) \le B(X_1, X_2, X_3)$.

It is straightforward to show that for these sources, $C(X_1, X_2, X_3) = B(X_1, X_2, X_3) = H(X_0)$.
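The claim in Example 2 can be checked numerically: with $W = X_0$, all three Markov constraints in (10) hold and $I(X_1, X_2, X_3; W) = H(X_0)$, so $C(X_1, X_2, X_3) \ge H(X_0)$, matching the upper bound from (33). In the sketch below (not from the paper), the encoding of $X_k = (X_0, X_{kp})$ as an integer and the uniform-bit assumption are illustrative choices.

```python
import numpy as np

def entropy(pmf):
    pmf = np.asarray(pmf, dtype=float).ravel()
    pmf = pmf[pmf > 0]
    return float(-(pmf * np.log2(pmf)).sum())

# Encode Xk = (X0, Xkp) as the integer 2*X0 + Xkp; all four bits are independent
# and (as an assumption) uniform.  Axis 0 of p is the candidate W = X0.
p = np.zeros((2, 4, 4, 4))
for x0 in range(2):
    for a in range(2):
        for b in range(2):
            for c in range(2):
                p[x0, 2*x0 + a, 2*x0 + b, 2*x0 + c] = 1.0 / 16.0

K = p.ndim - 1
pW = p.sum(axis=(1, 2, 3))
pX = p.sum(axis=0)
print("I(X1,X2,X3; W) =", entropy(pW) + entropy(pX) - entropy(p), "bits  (= H(X0) = 1)")

for k in range(1, K + 1):
    rest = tuple(j for j in range(1, K + 1) if j != k)
    pWXk = p.sum(axis=rest)
    pXk = pWXk.sum(axis=0)
    gap = (entropy(pX) - entropy(pXk)) - (entropy(p) - entropy(pWXk))
    print(f"I(Xbar\\X{k}; W | X{k}) =", round(gap, 12))   # ~0: W = X0 is feasible in (10)
```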
Inspired by the above example, we show the following interesting property that in some sense relates $C(X_1, \ldots, X_K)$ to $B(X_1, \ldots, X_K)$.

Proposition 4: For a set of sources $X_1, X_2, \ldots, X_K$ that satisfy

$\min_{i \ne j} I(X_i; X_j) = \max_{i \ne j} I(X_i; X_j)$, (34)

we have

$C(X_1, X_2, \ldots, X_K) = \min_{i \ne j} I(X_i; X_j)$ (35)

if

$B(X_1, X_2, \ldots, X_K) = \max_{i \ne j} I(X_i; X_j)$. (36)
Proof: The constraint (34) implies that the mutual information $I(X_i; X_j)$ is the same for all $i, j \in \{1, \ldots, K\}$, $i \ne j$. Let us start with a $W^*$ that satisfies the infimization constraints for $B(X_1, \ldots, X_K)$ and yields

$B(X_1, \ldots, X_K) = \max_{i \ne j} I(X_i; X_j)$ (37)
$= I(X_i; X_j)$, (38)

for some $i \ne j$. For this $W^*$, we have

$I(X_i; X_j) = \max_{i \ne j} I(X_i; X_j)$ (39)
$= I(X_1, \ldots, X_K; W^*)$ (40)
$= I(X_i; W^*) + I(\bar{X}\setminus X_i; W^* \mid X_i)$ (41)
$\ge I(X_i; X_j) + I(\bar{X}\setminus X_i; W^* \mid X_i)$, (42)

where (42) follows from the fact that $W^*$ satisfies the Markov relationship $X_i - W^* - X_j$ for all $i \ne j$, together with the data processing inequality. In the derivation of (42), $i$ can be chosen arbitrarily due to (34). Therefore, (42) implies that this $W^*$ also satisfies $I(\bar{X}\setminus X_i; W^* \mid X_i) = 0$ for all $i = 1, \ldots, K$. This in turn implies that $W^*$ is a valid choice in the maximization used to evaluate $C(X_1, \ldots, X_K)$. Therefore, we obtain the following lower bound on $C(X_1, \ldots, X_K)$:

$C(X_1, \ldots, X_K) = \max_{W - X_k - \bar{X}\setminus X_k,\; k = 1, \ldots, K} I(X_1, \ldots, X_K; W)$ (43)
$\ge I(X_1, \ldots, X_K; W^*)$ (44)
$= \max_{i \ne j} I(X_i; X_j)$ (45)
$= \min_{i \ne j} I(X_i; X_j)$. (46)

Hence, combining (46) with the upper bound of Proposition 2, it now follows that if $B(X_1, \ldots, X_K) = \max_{i \ne j} I(X_i; X_j)$, then $C(X_1, \ldots, X_K) = \min_{i \ne j} I(X_i; X_j)$. We remark here that a similar property has been shown for $K = 2$ by Ahlswede and Körner in [4].

V. CONCLUDING REMARKS
We have abstracted the problem of privacy in a setting where a source interacts with multiple users via the Gray-Wyner source coding problem with additional equivocation constraints at each user and a total equivocation constraint. In addition to developing the rate-equivocation region, we have introduced two definitions of common information of $K$ correlated variables and shown them both to have a form that can be viewed as a $K$-user generalization of the Gács-Körner common information (see also [4]).
VI. APPENDIX: PROOF OF THEOREM 1

Codebook generation: Fix $p(w \mid x_1, \ldots, x_K)$. Generate $2^{nI(X_1, \ldots, X_K; W)}$ sequences according to the distribution $\prod_{t=1}^{n} p(w_t)$, and index these sequences as $w^n(i)$, for $i = 1, \ldots, 2^{nI(X_1, \ldots, X_K; W)}$. Independently and uniformly bin the $X_k^n$-sequences into $2^{nH(X_k \mid W)}$ bins, and index these bins as $b_{k,1}, \ldots, b_{k,2^{nH(X_k \mid W)}}$, for $k = 1, \ldots, K$.

Encoding scheme: Upon observing the $(x_1^n, \ldots, x_K^n)$ sequences, the encoder searches for a $w^n$ sequence that is jointly typical with these sequences. Using standard arguments (as in [6]), it can be shown that the encoder can succeed in finding one such $w^n$ sequence. The encoder sends the index of the $w^n$ sequence on the public link, for which we require $R_0 \ge I(X_1, \ldots, X_K; W)$. It sends the bin index of the source sequence $x_k^n$ on the private link to decoder $k$, for which we require $R_k \ge H(X_k \mid W)$.

Decoding: Decoder $k$ looks for a unique $x_k^n$ in bin $b_k$ (received from the private link) that is jointly typical with the $w^n$ sequence received from the public link. It can be shown that decoder $k$ can reconstruct $X_k^n$ with a vanishingly small probability of error. We omit the probability of error calculation as it follows from the same arguments as in [5].

Equivocation: We show that this coding scheme yields the total equivocation stated in Theorem 1. Let $J_0$ denote the encoder output for the public link and let $J_k$ denote the encoder output for the private link to decoder $k$, for $k = 1, \ldots, K$. For $E_k$, we have the following sequence of inequalities:

$E_k = \frac{1}{n} H(X_1^n, \ldots, X_{k-1}^n, X_{k+1}^n, \ldots, X_K^n \mid J_0, J_k)$ (47)
$= \frac{1}{n} H(\bar{X}^n \setminus X_k^n \mid J_0, J_k)$ (48)
$\ge \frac{1}{n} H(\bar{X}^n \mid J_0, J_k) - \frac{1}{n} H(X_k^n \mid J_0, J_k)$ (49)
$\ge \frac{1}{n} H(\bar{X}^n \mid J_0, J_k) - \epsilon_{k,n}$ (50)
$= \frac{1}{n} H(\bar{X}^n, J_0, J_k) - \frac{1}{n} H(J_0, J_k) - \epsilon_{k,n}$ (51)
$\ge \frac{1}{n} H(\bar{X}^n) - \frac{1}{n} H(J_0, J_k) - \epsilon_{k,n}$ (52)
$\ge \frac{1}{n} H(\bar{X}^n) - \frac{1}{n} H(J_0) - \frac{1}{n} H(J_k) - \epsilon_{k,n}$ (53)
$\ge H(X_1, \ldots, X_K) - I(X_1, \ldots, X_K; W) - H(X_k \mid W) - \epsilon_{k,n}$ (54)
$= H(X_1, \ldots, X_K \mid W, X_k) - \epsilon_{k,n}$ (55)
$= H(\bar{X} \mid W, X_k) - \epsilon_{k,n}$, (56)

where (50) follows from Fano's inequality, and (54) follows from the facts that $H(J_0) \le \log(|\mathcal{J}_0|) = nI(X_1, \ldots, X_K; W)$ and $H(J_k) \le \log(|\mathcal{J}_k|) = nH(X_k \mid W)$, for $k = 1, \ldots, K$. Therefore, we have

$E = \sum_{k=1}^{K} E_k \ge \sum_{k=1}^{K} H(\bar{X} \mid W, X_k) - \epsilon$.

Hence, this coding scheme yields an equivocation of $\Delta = \sum_{k=1}^{K} H(\bar{X} \mid W, X_k)$.
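The binning step of the achievability scheme can be illustrated by a simplified single-decoder analogue (an illustration under assumed parameters, not the construction used in the proof): the decoder is handed $w^n$ directly, $X$ is related to $W$ through a binary symmetric channel with crossover $\delta$, and the encoder sends only a uniformly random bin index of $x^n$ at a rate slightly above $H(X \mid W) = h(\delta)$. Decoding picks the sequence in the bin closest to $w^n$, a maximum-likelihood stand-in for the joint-typicality decoder.

```python
import numpy as np

rng = np.random.default_rng(7)
n, delta, rate = 16, 0.1, 0.8          # rate > H(X|W) = h(0.1) ~ 0.469 bits/symbol
num_bins = 2 ** int(np.ceil(n * rate))

# Random binning: every length-n binary sequence (as an integer) gets a uniform bin index.
bin_of = rng.integers(0, num_bins, size=2 ** n)

def bits(v):
    return np.array([(v >> i) & 1 for i in range(n)])

errors, trials = 0, 200
for _ in range(trials):
    w = rng.integers(0, 2, size=n)                     # side information w^n (known at decoder)
    x = w ^ (rng.random(n) < delta).astype(int)        # x^n = w^n xor noise, noise ~ Ber(delta)
    x_int = int(np.dot(x, 1 << np.arange(n)))
    b = bin_of[x_int]                                  # encoder sends only the bin index

    # Decoder: among all sequences in bin b, pick the one closest to w^n in Hamming
    # distance (ML decoding for a BSC with delta < 1/2).
    candidates = np.flatnonzero(bin_of == b)
    best = min(candidates, key=lambda v: int(np.abs(bits(v) - w).sum()))
    errors += (best != x_int)

print(f"rate = {rate} bits/symbol, empirical error rate = {errors / trials:.3f}")
```

With the bin rate held above $H(X \mid W)$, the empirical error rate is small and decays as $n$ grows, mirroring the probability-of-error behavior that the proof defers to the arguments in [5].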
REFERENCES

[1] A. D. Wyner, "The common information of two dependent random variables," IEEE Trans. Inform. Theory, vol. 21, no. 2, pp. 163–179, March 1975.
[2] P. Gács and J. Körner, "Common information is far less than mutual information," Problems of Control and Information Theory, vol. 2, pp. 149–162, 1973.
[3] W. Liu, G. Xu, and B. Chen, "The common information of N dependent random variables," in Proc. 48th Annual Allerton Conference on Communications, Control and Computing, Monticello, IL, September 2010.
[4] R. Ahlswede and J. Körner, "On common information and related characteristics of correlated information sources," in Lecture Notes in Computer Science, vol. 4123. Berlin, Germany: Springer-Verlag, 2006, pp. 664–677.
[5] R. M. Gray and A. D. Wyner, "Source coding for a simple network," Bell System Technical Journal, vol. 53, no. 9, pp. 1681–1721, November 1974.
[6] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley.