Private Computation with Individual and Joint Privacy
Anoosheh Heidarzadeh and Alex Sprintson
Abstract
This paper considers the problem of single-server Private Computation (PC) in the presence of Side Information (SI). In this problem, there is a server that stores K i.i.d. messages, and a user who has a subset of M uncoded messages or a coded linear combination of them as side information, where the identities of these messages are unknown to the server. The user wants to privately compute (via downloading information from the server) a linear combination of a subset of D other messages, where the identities of these messages must be kept private individually or jointly. For each setting, we define the capacity as the supremum of all achievable download rates. We characterize the capacity of PC with both coded and uncoded SI when individual privacy is required, for all K, M, D. Our results indicate that both settings have the same capacity. In addition, we establish a non-trivial lower bound on the capacity of PC with coded SI when joint privacy is required, for a range of parameters K, M, D. This lower bound is the same as the lower bound we previously established on the capacity of PC with uncoded SI when joint privacy is required.

I. INTRODUCTION
In this work, we consider the problem of Private Computation (PC) in the presence of side information. In this problem, there is a single (or multiple) remote server(s) storing (identical copies of) a database of i.i.d. messages, and there is a user who initially has side information about some subset of messages in the database, where the identities of the messages in the support of the side information are initially unknown to the server. The user is interested in privately computing (via downloading information from the server(s)) a linear combination of a different subset of database messages, while minimizing the total amount of information downloaded from the server(s). We consider two different types of side information: (i) uncoded side information, where the user knows a subset of database messages, and (ii) coded side information, where the user holds a linear combination of a subset of database messages. These settings are referred to as
PC with Side Information (PC-SI) and
PC with Coded Side Information (PC-CSI), respectively. We also consider two different privacy conditions: (i) individual privacy, where the identity of each message in the support set of the demanded linear combination needs to be kept private individually, and (ii) joint privacy, where the identities of all messages in the support set of the demanded linear combination must be kept private jointly. When condition (i) or (ii) is required, we refer to the PC problem as
Individually-Private Computation (IPC) or Jointly-Private Computation (JPC), respectively. The goal is to design a protocol for generating the query of the user and the corresponding answer of the server(s) such that the entropy of the answer is minimized, while the query satisfies the privacy condition.
The authors are with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843 USA (E-mail: {anoosheh, spalex}@tamu.edu).

Both the IPC and JPC settings are related to the problem of Private Computation, introduced in [1], where the goal is to compute a linear combination of the messages in the database, while hiding both the identities and the coefficients of these messages. Several variants of this problem were also studied in [2]–[5]. These works consider neither individual nor joint privacy, nor any type of side information. The JPC-SI setting, initially introduced in [6], is closely related to the problem of Private Information Retrieval with Side Information (PIR-SI), which was initially introduced in [7], [8] and later extended in several works, e.g., [9]–[12]. In the PIR-SI problem, a user wishes to retrieve a subset of database messages with the help of uncoded side information, while achieving joint privacy. Several variants of PIR with different types of side information or different privacy conditions were also studied in [13]–[20]. The IPC-SI setting is an extension of the PIR-SI problem when individual privacy is required. This problem, known as IPIR-SI, was introduced in [21]. The JPC-CSI and IPC-CSI settings are two generalizations of PIR with Coded Side Information (PIR-CSI), previously studied in [22] and [23].

A. Main Contributions
In this work, we focus on the single-server case. For each type of side information (coded or uncoded) and each privacy condition (individual or joint), the capacity of the underlying setting is defined as the supremum of all achievable download rates, where the download rate is the ratio of the entropy of a message to the entropy of the server's answer. We characterize the capacity of both the IPC-SI and IPC-CSI settings, for all parameters. These results subsume several existing results in the PIR literature. The converse proof is information-theoretic, and the achievability scheme is a generalization of our recently proposed scheme in [24] for the PIR-CSI setting. Our results show that the capacities of both settings are the same. This implies that, when individual privacy is required, having only one linear combination of a subset of messages as side information is as efficient as having them all separately. In addition, we establish a non-trivial lower bound on the capacity of the JPC-CSI setting for a range of parameters. Interestingly, this lower bound is the same as the lower bound we previously established in [6] on the capacity of the JPC-SI setting. The proof of achievability is based on a modification of the scheme we proposed in [6] for the JPC-SI setting.

Our results for both the IPC and JPC settings, when compared to the existing results in the PIR literature, indicate that one can privately compute a linear combination of multiple messages much more efficiently than privately retrieving the messages and linearly combining them locally. In addition, comparing our results with those in [1], one can see that hiding only the identities of the messages (either individually or jointly) and not their coefficients, which may still provide a satisfactory level of privacy in many applications, can be done at a much lower cost, even when there is only one server and/or the user has no side information.

II. PROBLEM FORMULATION
Throughout, random variables and their realizations are denoted by bold-face and regular letters, respectively. Let F_q be a finite field for a prime q, and let F_{q^ℓ} be an extension field of F_q for a positive integer ℓ. Let K, M, and D be non-negative integers such that K ≥ M + D. Let 𝒦 ≜ {1, . . . , K}, and let 𝒦_M (or 𝒦_D) be the set of all M-subsets (or D-subsets) of 𝒦. Let C be the set of all nonzero elements of F_q, and let C_M (or C_D) be the set of all ordered multisets of C of size M (or D).

Consider a single server that stores a dataset of K messages X_𝒦 ≜ {X_1, . . . , X_K}, where each message X_i is independently and uniformly distributed over F_{q^ℓ}. That is, H(X_i) = L for i ∈ 𝒦, and H(X_𝒦) = KL, where L ≜ ℓ log q. Consider a user who knows a linear combination Y^[S,U] ≜ Σ_{i∈S} u_i X_i of M messages X_S ≜ {X_i}_{i∈S}, for some S ∈ 𝒦_M and some U ≜ {u_i}_{i∈S} ∈ C_M, and wishes to retrieve a linear combination Z^[W,V] ≜ Σ_{i∈W} v_i X_i from the server, for some W ∈ 𝒦_D with W ∩ S = ∅ and some V ≜ {v_i}_{i∈W} ∈ C_D. We refer to Y^[S,U] as the side information, X_S as the side information support set, S as the side information support index set, M as the side information support size, Z^[W,V] as the demand, X_W as the demand support set, W as the demand support index set, and D as the demand support size.

We assume that S, U, and V are uniformly distributed over 𝒦_M, C_M, and C_D, respectively, and that W, given S = S, is uniformly distributed over all W ∈ 𝒦_D with W ∩ S = ∅. Also, we assume that the server initially knows M, D, and the joint distribution of S, U, W, and V, whereas the realizations S, U, W, and V are not initially known to the server.

For any given S, U, W, V, the user sends the server a query Q^[W,V,S,U], which is a (potentially stochastic) function of W, V, S, and U, in order to retrieve Z^[W,V].
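The setup above can be instantiated concretely. The following toy instance is our own illustration (the field size q = 11 and the particular index sets and coefficients are our choices, and indices are 0-based in the code):

```python
import random

# A toy instance of the problem setup: K messages over GF(q), a coded side
# information Y^[S,U], and a demanded combination Z^[W,V] with W disjoint from S.
q, K, M, D = 11, 8, 2, 2
rng = random.Random(0)
X = [rng.randrange(q) for _ in range(K)]        # messages X_1..X_K (0-indexed here)
S, U = [2, 3], [5, 1]                           # SI support index set and coefficients
W, V = [0, 1], [1, 3]                           # demand support index set and coefficients
assert set(W).isdisjoint(S) and K >= M + D
Y = sum(u * X[i] for u, i in zip(U, S)) % q     # Y^[S,U], known to the user
Z = sum(v * X[i] for v, i in zip(V, W)) % q     # Z^[W,V], what the user wants to compute
```

The server knows only M, D, and the joint distribution of (S, U, W, V), not the realizations used here.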
To simplify the notation, we denote Q^[W,V,S,U] by Q. The query must satisfy one of the following two privacy conditions:

(i) Individual Privacy: every message in X_𝒦 must be equally likely to be in the user's demand support set, i.e., for all i ∈ 𝒦, it must hold that Pr(i ∈ W | Q = Q^[W,V,S,U]) = Pr(i ∈ W).

(ii) Joint Privacy: every D-subset of messages in X_𝒦 must be equally likely to be the user's demand support set, i.e., for all W* ∈ 𝒦_D, it must hold that Pr(W = W* | Q = Q^[W,V,S,U]) = Pr(W = W*).

Joint privacy, which is the stronger notion, implies individual privacy, but not vice versa. The main difference between the two conditions is that for joint privacy the query must protect the correlation between the indices in the demand support index set, whereas for individual privacy some information about this correlation may be leaked; hence it is a weaker notion of privacy.

Neither individual nor joint privacy requires the privacy of the coefficients in the demand to be protected. This is in contrast to the privacy condition considered in [1], and as a result of this relaxation one can expect more efficient private computation schemes in our settings. In particular, for single-server private computation without any side information, the user must download the entire dataset in order to protect the privacy of both the identities of the messages in the demand support set and their coefficients in the demand [1]. However, for neither of the two privacy conditions considered here does the entire dataset need to be downloaded, even when the user has no side information.

Upon receiving Q^[W,V,S,U], the server sends the user an answer A^[W,V,S,U], which is a (deterministic) function of the query Q^[W,V,S,U] and the messages in X_𝒦. We denote A^[W,V,S,U] by A for ease of notation.
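The gap between the two conditions can be made concrete by checking them against the exact joint distribution of (W, Q) induced by a protocol. The sketch below is our own illustration (all names are ours): it builds a toy query distribution that leaks which "matched pair" partition W belongs to, and verifies that it is individually but not jointly private.

```python
from itertools import combinations
from math import comb

# `pmf` maps (frozenset W, query label) -> probability under the protocol.

def cond_prob(pmf, q, event):
    pq = sum(p for (w, qq), p in pmf.items() if qq == q)
    return sum(p for (w, qq), p in pmf.items() if qq == q and event(w)) / pq

def individually_private(pmf, K, D, tol=1e-12):
    # Condition (i): Pr(i in W | Q) = Pr(i in W) = D/K for all i and all queries.
    queries = {qq for (_, qq) in pmf}
    return all(abs(cond_prob(pmf, q, lambda w: i in w) - D / K) <= tol
               for q in queries for i in range(1, K + 1))

def jointly_private(pmf, K, D, tol=1e-12):
    # Condition (ii): Pr(W = W* | Q) = Pr(W = W*) = 1/C(K,D) for all W*.
    queries = {qq for (_, qq) in pmf}
    return all(abs(cond_prob(pmf, q, lambda w: w == frozenset(ws)) - 1 / comb(K, D)) <= tol
               for q in queries for ws in combinations(range(1, K + 1), D))

# Toy protocol with K = 4, D = 2: the query reveals only whether W is one of
# the pairs {1,2}, {3,4}. Each index is still in W with probability 1/2 given
# either query, but Pr(W = {1,2} | Q = 'a') = 1/2 != 1/6.
pmf = {}
for ws in combinations(range(1, 5), 2):
    label = 'a' if set(ws) in ({1, 2}, {3, 4}) else 'b'
    pmf[(frozenset(ws), label)] = 1 / 6
assert individually_private(pmf, 4, 2)
assert not jointly_private(pmf, 4, 2)
```

A constant (uninformative) query trivially satisfies both conditions; the example shows that leaking the pairing of indices violates only the joint condition.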
Note that H(A | Q, X_𝒦) = 0, since (W, V, S, U) and A are conditionally independent given (Q, X_𝒦). The collection of A^[W,V,S,U], Q^[W,V,S,U], Y^[S,U], W, V, S, and U must enable the user to retrieve the demand Z^[W,V]. That is, it must hold that H(Z^[W,V] | A, Q, Y^[S,U], W, V, S, U) = 0. We refer to this condition as the recoverability condition.

For each type of privacy, the problem is to design a protocol for generating a query Q^[W,V,S,U] (and the corresponding answer A^[W,V,S,U], given Q^[W,V,S,U] and X_𝒦) for any given W, V, S, U, such that both the privacy and recoverability conditions are satisfied. We refer to this problem as single-server Individually-Private Computation with Coded Side Information (IPC-CSI) or Jointly-Private Computation with Coded Side Information (JPC-CSI), when individual or joint privacy is required, respectively. We similarly define the
IPC-SI and
JPC-SI problems for the settings in which the user's side information is the support set X_S itself, instead of a linear combination of the messages in X_S. We refer to a protocol that generates the query/answer for the IPC-CSI or JPC-CSI setting as an IPC-CSI or a
JPC-CSI protocol, respectively. The rate of an IPC-CSI or a JPC-CSI protocol is defined as the ratio of the entropy of a message, i.e., L, to the entropy of the answer A. The capacity of the IPC-CSI or JPC-CSI setting is defined as the supremum of rates over all IPC-CSI or JPC-CSI protocols, respectively. An IPC-SI or a JPC-SI protocol, its rate, and the capacity of the IPC-SI or JPC-SI setting are defined similarly. Our goal in this work is to establish lower and/or upper bounds on the capacity of the IPC-CSI, JPC-CSI, IPC-SI, and JPC-SI settings, in terms of the parameters K, M, and D.

III. MAIN RESULTS
Our main results for the IPC and JPC settings with both coded and uncoded side information are summarized in Sections III-A and III-B, respectively. The following two lemmas provide a necessary condition for individual and joint privacy, respectively, for both types of side information. The proofs are straightforward by way of contradiction, and hence omitted for brevity.
Lemma 1 (A Necessary Condition for Individual Privacy). For any i ∈ 𝒦, there exist W* ∈ 𝒦_D, V* ∈ C_D, and S* ∈ 𝒦_M with i ∈ W* and S* ∩ W* = ∅, such that H(Z^[W*,V*] | A, Q, X_{S*}) = 0.

Lemma 2 (A Necessary Condition for Joint Privacy). For any W* ∈ 𝒦_D, there exist V* ∈ C_D and S* ∈ 𝒦_M with S* ∩ W* = ∅, such that H(Z^[W*,V*] | A, Q, X_{S*}) = 0.

Thinking of scalar-linear IPC or JPC protocols, where the answer consists only of scalar-linear combinations of the messages in X_𝒦, the necessary conditions in Lemmas 1 and 2 imply the need for linear codes that satisfy certain combinatorial requirements. (Recently, in [12], we made a similar connection between single-server PIR with side information and locally recoverable codes.) Consider a (linear) code of length K that satisfies the following requirement: for any i ∈ {1, . . . , K}, there is a codeword of (Hamming) weight D or M + D (or at least D and at most M + D) whose support includes the index i. The parity-check equations of the dual of any such code can be used for constructing a scalar-linear IPC-CSI (or IPC-SI) protocol. Minimizing the entropy of the answer in order to maximize the rate of the protocol translates into minimizing the dimension of the code. In this work, we design optimal codes with minimum dimension for all K, M, D for the IPC-CSI setting. These codes naturally serve also as optimal codes for the IPC-SI setting.

The problem of designing a scalar-linear JPC-CSI (or JPC-SI) protocol reduces to the problem of designing a code of length K with minimum dimension satisfying the following requirement: for any D-subset W ⊆ {1, . . . , K}, there is a codeword of weight D or M + D (or at least D and at most M + D) whose support includes the D-subset W. The design of optimal codes satisfying this requirement remains an open problem.
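For tiny parameters, the two combinatorial requirements can be checked by brute force over all codewords. The sketch below is our own illustration (function names are ours), feasible only for very small codes:

```python
from itertools import product, combinations

# Brute-force check of the codeword-support requirements, for a code given by a
# generator matrix G (list of rows) over the prime field GF(q).

def weight_range_supports(G, q, lo, hi):
    k, n = len(G), len(G[0])
    supports = []
    for msg in product(range(q), repeat=k):
        c = [sum(msg[i] * G[i][j] for i in range(k)) % q for j in range(n)]
        supp = {j for j, x in enumerate(c) if x != 0}
        if lo <= len(supp) <= hi:
            supports.append(supp)
    return supports

def meets_ipc_requirement(G, q, D, M):
    # Every index i must lie in the support of some codeword of weight in [D, M+D].
    supps = weight_range_supports(G, q, D, M + D)
    return all(any(i in s for s in supps) for i in range(len(G[0])))

def meets_jpc_requirement(G, q, D, M):
    # Every D-subset must lie inside the support of such a codeword.
    supps = weight_range_supports(G, q, D, M + D)
    return all(any(set(ws) <= s for s in supps)
               for ws in combinations(range(len(G[0])), D))

# K = 6, M = D = 2: a dimension-2 binary code with overlapping weight-4 blocks.
G = [[1, 1, 1, 1, 0, 0],
     [0, 0, 1, 1, 1, 1]]
assert meets_ipc_requirement(G, 2, 2, 2)
# A single length-6 block of weight 4 leaves indices 4 and 5 uncovered.
assert not meets_ipc_requirement([[1, 1, 1, 1, 0, 0]], 2, 2, 2)
```

For this particular G, the sum of the two rows also has weight 4, so every 2-subset of indices happens to be covered and the stronger JPC requirement holds as well.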
In [6], we initiated the study of the JPC-SI setting, and established a non-trivial upper bound on the dimension of optimal codes for this setting. In this work, we make a first attempt towards characterizing the dimension of optimal codes for the JPC-CSI setting, and provide a non-trivial upper bound for a range of parameters K, M, D.

A. IPC-SI and IPC-CSI
The capacities of IPC-SI and IPC-CSI for arbitrary K, M, D are characterized in Theorems 1 and 2, respectively.
Theorem 1.
For the IPC-SI setting with K messages, side information of size M, and demand support size D, the capacity is given by ⌈K/(M+D)⌉⁻¹.

Theorem 2.
For the IPC-CSI setting with K messages, side information support size M, and demand support size D, the capacity is given by ⌈K/(M+D)⌉⁻¹.

For the converse proof, we use information-theoretic arguments relying primarily on the result of Lemma 1 to upper bound the rate of any IPC-SI protocol (see Section IV-A). This upper bound obviously holds for any IPC-CSI protocol. For the proof of achievability, we construct a new scalar-linear IPC-CSI protocol, termed the
Generalized Modified Partition-and-Code (GMPC) protocol, which achieves the rate upper bound (see Section IV-B). This protocol naturally serves also as an IPC-SI protocol. The GMPC protocol is based on the idea of non-uniform randomized partitioning, and generalizes our recently proposed protocol in [24] for the PIR-CSI setting. Examples of the GMPC protocol are given in Section V.
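To make the construction concrete, the following is a minimal, self-contained sketch (our own code; `gmpc` and all other names are ours) of the GMPC steps described in Section IV-B, run over an assumed prime field GF(13) with 0-based indices. It checks that the user always downloads n = ⌈K/(M+D)⌉ linear combinations and always recovers the demand:

```python
import random
from fractions import Fraction
from math import ceil

q = 13  # an assumed prime field size for the demonstration

def gmpc(K, W, V, S, U, X, rng):
    """One run of the (sketched) GMPC protocol; returns (#answers, recovered Z)."""
    M, D = len(S), len(W)
    n = ceil(K / (M + D)); m = n * (M + D) - K; r = M + D - m
    # Positions 0..K-1 split into I_1..I_n; I_1 and I_n share positions 0..m-1.
    I = [list(range(l * (M + D), (l + 1) * (M + D))) for l in range(n - 1)]
    I.append(list(range(m)) + list(range((n - 1) * (M + D), K)))
    alpha = Fraction(m + 2 * r, K)
    if D <= m and D <= r:   beta = Fraction(m, m + 2 * r)
    elif D > m and D <= r:  beta = Fraction(D, m + 2 * r)
    elif D <= m and D > r:  beta = 1 - Fraction(2 * D, m + 2 * r)
    else:                   beta = Fraction(r, M) * (1 - Fraction(2 * D, m + 2 * r))
    ls = rng.choice([0, n - 1]) if rng.random() < alpha else rng.randrange(1, n - 1)
    pi = [None] * K                       # pi[j] = message index placed at position j
    if ls in (0, n - 1) and m > 0:
        k = min(D, m) if rng.random() < beta else D - min(D, r)
        head = rng.sample(W, k) + rng.sample(S, m - k)   # fills shared positions 0..m-1
        rng.shuffle(head)
        rest = [i for i in W + S if i not in head]
        rng.shuffle(rest)
        for j, x in zip(range(m), head): pi[j] = x
        for j, x in zip([j for j in I[ls] if j >= m], rest): pi[j] = x
    else:
        block = W + S
        rng.shuffle(block)
        for j, x in zip(I[ls], block): pi[j] = x
    others = [i for i in range(K) if i not in W + S]
    rng.shuffle(others)
    for j, x in zip([j for j in range(K) if pi[j] is None], others): pi[j] = x
    # Q'': coefficients read off positions in I_{l*} (demand or SI coefficient).
    coeff = [V[W.index(pi[j])] if pi[j] in W else U[S.index(pi[j])] for j in I[ls]]
    # Server's answers A_1..A_n, all using the same coefficient list Q''.
    A = [sum(c * X[pi[j]] for c, j in zip(coeff, I[l])) % q for l in range(n)]
    Y = sum(u * X[i] for u, i in zip(U, S)) % q          # side information value
    return len(A), (A[ls] - Y) % q                       # Step 3: Z = A_{l*} - Y

rng = random.Random(0)
W, V, S, U = [0, 1], [1, 3], [2, 3], [5, 1]              # mirrors Example 1, 0-indexed
for K in (12, 11):                                       # m = 0 and m = 1 cases
    X = [rng.randrange(q) for _ in range(K)]
    for _ in range(200):
        n_ans, Z = gmpc(K, W, V, S, U, X, rng)
        assert n_ans == 3 and Z == (X[0] + 3 * X[1]) % q
```

The choice of β follows the four-case specification of Step 1 in Section IV-B; the loop exercises both the m = 0 (K = 12) and m > 0 (K = 11) branches of the position assignment.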
Remark 1.
The matching capacity of the IPC-SI and IPC-CSI settings shows that achieving individual privacy comes at no loss in capacity if the user has only one random linear combination of M random messages, instead of the M random messages separately, as their side information.

Remark 2. As shown in [21], for the IPIR-SI setting, the normalized download cost of K − M⌊K/(M+D)⌋ or D⌈K/(M+D)⌉ (depending on K, M, D) is achievable, where the normalized download cost is defined as the download cost normalized by the entropy of a message. Comparing this with the result of Theorem 1, one can see that, when individual privacy is required, one can privately compute a linear combination of multiple messages much more efficiently than retrieving them privately and linearly combining them locally.
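A quick numerical comparison illustrates Remark 2. This is our own arithmetic, taking the better of the two IPIR-SI costs quoted above from [21]:

```python
from math import ceil, floor

def ipc_cost(K, M, D):
    # Normalized download cost of IPC-SI / IPC-CSI (inverse of the Theorem 1 capacity).
    return ceil(K / (M + D))

def ipir_cost(K, M, D):
    # The achievable IPIR-SI costs quoted from [21]; we take the smaller of the two.
    return min(K - M * floor(K / (M + D)), D * ceil(K / (M + D)))

K, M, D = 12, 2, 2
assert ipc_cost(K, M, D) == 3     # compute the combination directly: 3 downloads
assert ipir_cost(K, M, D) == 6    # retrieve-then-combine costs twice as much here
```

For D ≥ 2 the retrieve-then-combine cost D⌈K/(M+D)⌉ already exceeds the direct-computation cost by a factor of D.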
Remark 3.
For the case of M = 0, the capacity of both the IPC-SI and IPC-CSI settings is equal to ⌈K/D⌉⁻¹. Depending on the value of D, the capacity can be substantially larger than 1/K, which was shown to be the capacity of single-server private computation where the privacy of both the demand support index set and the coefficients in the demand must be preserved [1]. For the case of D = 1, the IPC-SI and IPC-CSI problems reduce to the problems of PIR-SI [7] and PIR-CSI where the demanded message does not lie in the support of the side information [22], respectively. The capacities of these settings were shown to be equal to ⌈K/(M+1)⌉⁻¹, matching the results of Theorems 1 and 2.

B. JPC-SI and JPC-CSI
Theorem 3 lower bounds the capacity of JPC-SI for all K, M, D, and Theorem 4 establishes a lower bound on the capacity of JPC-CSI for some values of K, M, D.

Theorem 3 ([6]). For the JPC-SI setting with K messages, side information of size M, and demand support size D, the capacity is lower bounded by (⌈(K−M−D)/(⌊M/D⌋+1)⌉ + 1)⁻¹.

Theorem 4. For the JPC-CSI setting with K messages, side information support size M, and demand support size D, the capacity is lower bounded by ((K−M−D)/(⌊M/D⌋+1) + 1)⁻¹ if ⌊M/D⌋ + 1 divides K − M − D.

The capacity lower bound in Theorem 3 is achievable by a scalar-linear JPC-SI protocol, called
Partition-and-Code with Interference Alignment (PC-IA), which we recently proposed in [6]. The PC-IA protocol is applicable for all K, M, D, and relies on the idea of a probabilistic partitioning that allows the parts to overlap and to have multiple blocks of interference that are aligned (for details, see [6]).

Theorem 4, which appears without proof, follows directly from the observation that the PC-IA protocol (with a slight modification in the choice of coefficients in the linear combinations that constitute the server's answer to the user's query) serves also as a scalar-linear JPC-CSI protocol for some values of K, M, D, particularly when the divisibility condition in the theorem's statement holds; however, the PC-IA protocol is not a JPC-CSI protocol in general. Examples of the PC-IA protocol for both cases are given in Section V. We have been able to design different scalar-linear JPC-CSI protocols for some other values of K, M, D; but these constructions are not universal and are limited to specific values of K, M, D, and hence are not presented in this work. The extension of these constructions to arbitrary K, M, D is a challenging open problem, and the focus of ongoing work.
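The two bounds, and the divisibility condition under which Theorem 4 applies, can be evaluated numerically; a small sketch (our own helper names):

```python
from math import ceil, floor

def jpc_si_cost(K, M, D):
    # Download cost matching the Theorem 3 capacity lower bound (its inverse).
    return ceil((K - M - D) / (floor(M / D) + 1)) + 1

def jpc_csi_applicable(K, M, D):
    # Divisibility condition of Theorem 4: floor(M/D) + 1 divides K - M - D.
    return (K - M - D) % (floor(M / D) + 1) == 0

K, M, D = 12, 2, 2
assert jpc_csi_applicable(K, M, D)      # floor(2/2)+1 = 2 divides 12-2-2 = 8
assert jpc_si_cost(K, M, D) == 5        # i.e., a capacity lower bound of 1/5
```

When the divisibility condition holds, the ceiling in `jpc_si_cost` is exact, so the JPC-CSI bound of Theorem 4 coincides with the JPC-SI bound of Theorem 3.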
Remark 4.
As shown in [6], when joint privacy is required, with the help of uncoded side information the download cost for the private computation of one linear combination of multiple messages can be much lower than that of privately retrieving multiple messages and computing the linear combination locally. For instance, for K even, when the user has M = 2 messages as side information, privately computing a linear combination of D = 2 messages has a normalized download cost of K/2 − 1 (see Theorem 3); whereas privately retrieving D = 2 messages incurs a strictly higher normalized download cost (see [9, Theorem 2]). Surprisingly, the result of Theorem 4 shows that for some values of K, M, D (e.g., K even and M = D = 2), only one linear combination of M messages suffices to achieve the same normalized download cost (i.e., K/2 − 1). This is interesting because, regardless of the values of M and D, when joint privacy is required, with the help of only one linear combination of M messages the normalized download cost for retrieving D messages is close to K, which is much higher than, for instance, K/2 − 1.

Remark 5.
The capacity lower bounds in Theorems 3 and 4 are tight for the cases of D = 1 and M = 0 (see [7], [22]). We have also been able to prove the tightness of these bounds for small values of K, M, D, particularly for M = D = 2 and several values of K. Nevertheless, it remains open whether these lower bounds are tight for all K, M, D in general.
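The D = 1 tightness claim can be cross-checked arithmetically: for D = 1 the Theorem 3 cost expression collapses to the PIR-SI cost ⌈K/(M+1)⌉ of [7]. A sketch of that check (our own arithmetic):

```python
from math import ceil

# For D = 1: ceil((K - M - 1)/(floor(M/1) + 1)) + 1
#          = ceil((K - M - 1)/(M + 1)) + 1 = ceil(K/(M + 1)),
# since adding 1 outside the ceiling equals adding (M + 1) inside the numerator.
ok = all(ceil((K - M - 1) / (M + 1)) + 1 == ceil(K / (M + 1))
         for M in range(0, 6) for K in range(M + 1, 40))
assert ok
```

So for D = 1 the Theorem 3 lower bound matches the known PIR-SI capacity ⌈K/(M+1)⌉⁻¹ exactly, for every K and M tested.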
Remark 6.
The matching capacity lower bounds in Theorems 3 and 4 raise an intriguing question: whether, similar to the IPC-SI and IPC-CSI settings, the capacities of the JPC-SI and JPC-CSI settings are the same. We conjecture that the answer is affirmative for both linear and non-linear protocols.

IV. PROOFS OF THEOREMS 1 AND 2
A. Converse
Lemma 3.
The rate of any IPC-SI protocol for K messages, side information of size M, and demand support size D, is upper bounded by ⌈K/(M+D)⌉⁻¹.

Proof: To prove the lemma, we need to show that H(A) ≥ ⌈K/(M+D)⌉L. Take arbitrary W ∈ 𝒦_D, V ∈ C_D, and S ∈ 𝒦_M such that S ∩ W = ∅. By a simple application of the chain rule of entropy, one can show that

H(A) ≥ H(Z) + H(A | Q, X_S, Z),   (1)

where Z ≜ Z^[W,V]. Note that H(Z) = L. We consider two cases: (i) W ∪ S = 𝒦, and (ii) W ∪ S ≠ 𝒦. In case (i), we have M = K − D, and ⌈K/(M+D)⌉L = L; hence, (1) implies that H(A) ≥ H(Z) = L, as was to be shown.

In case (ii), we further lower bound H(A | Q, X_S, Z) as follows. Choose an arbitrary message, say X_{i_1}, for some i_1 ∉ W ∪ S. By the result of Lemma 1, there exist W_1 ∈ 𝒦_D with i_1 ∈ W_1, V_1 ∈ C_D, and S_1 ∈ 𝒦_M with S_1 ∩ W_1 = ∅, such that H(Z_1 | A, Q, X_{S_1}) = 0, or in turn, H(Z_1 | A, Q, X_S, Z, X_{S_1}) = 0, where Z_1 ≜ Z^[W_1,V_1]. Thus,

H(A | Q, X_S, Z) ≥ H(A | Q, X_S, Z, X_{S_1}) + H(Z_1 | A, Q, X_S, Z, X_{S_1})
= H(Z_1 | Q, X_S, Z, X_{S_1}) + H(A | Q, X_S, Z, X_{S_1}, Z_1)
= H(Z_1) + H(A | Q, X_S, Z, X_{S_1}, Z_1),   (2)

where Z_1 and (Q, X_S, Z, X_{S_1}) are independent because i_1 ∈ W_1 and i_1 ∉ W ∪ S ∪ S_1. Let n ≜ ⌈K/(M+D)⌉. Using Lemma 1 recursively, it can be shown that for all 1 ≤ k < n there exist i_1, . . . , i_k ∈ 𝒦, W_1, . . . , W_k ∈ 𝒦_D, V_1, . . . , V_k ∈ C_D, and S_1, . . . , S_k ∈ 𝒦_M satisfying i_l ∈ W_l, S_l ∩ W_l = ∅, and i_l ∉ ∪_{j=1}^{l−1}(W_j ∪ S_j) ∪ (W ∪ S) for all 1 ≤ l ≤ k, such that H(Z_k | A, Q, X_S, Z, X_{S_1}, Z_1, . . . , X_{S_{k−1}}, Z_{k−1}, X_{S_k}) = 0, where Z_l ≜ Z^[W_l,V_l] for all 1 ≤ l ≤ k. Obviously, |∪_{j=1}^{k−1}(W_j ∪ S_j) ∪ (W ∪ S)| ≤ (M+D)k for all 1 ≤ k < n. Applying the same technique as in (2), it can then be shown that for all 1 ≤ k < n, we have

H(A | Q, X_S, Z, X_{S_1}, Z_1, . . . , X_{S_{k−1}}, Z_{k−1}) ≥ H(Z_k) + H(A | Q, X_S, Z, X_{S_1}, Z_1, . . . , X_{S_k}, Z_k).

Putting together these lower bounds for all k, we have

H(A | Q, X_S, Z) ≥ Σ_{k=1}^{n−1} H(Z_k) = (n − 1)L,   (3)

since Z_1, . . . , Z_{n−1} are independent by the choice of i_1, . . . , i_{n−1} in the construction. Combining (1) and (3), we have H(A) ≥ nL = ⌈K/(M+D)⌉L, as was to be shown.

B. Achievability
For ease of notation, we define n ≜ ⌈K/(M+D)⌉, m ≜ n(M+D) − K, and r ≜ M + D − m.

Generalized Modified Partition-and-Code (GMPC) Protocol: This protocol consists of three steps, as follows:

Step 1: Let I_l ≜ {(l−1)(M+D) + 1, . . . , l(M+D)} for 1 ≤ l < n, and let I_n ≜ {1, . . . , m, (n−1)(M+D) + 1, . . . , K}. Note that I_1 ∩ I_n = {1, . . . , m}.

First, the user constructs a random permutation π on 𝒦 as follows. With probability α ≜ (m+2r)/K, the user chooses l* ∈ {1, n} uniformly at random; otherwise, with probability 1 − α, the user randomly chooses l* ∈ {2, . . . , n−1}. If l* ∈ {1, n}, with probability β (or 1 − β), where the choice of β will be specified shortly, the user assigns μ ≜ min{D, m} (or D − ρ, where ρ ≜ min{D, r}) randomly chosen indices from W and m − μ (or m − D + ρ) randomly chosen indices from S to {π(j) : 1 ≤ j ≤ m} at random, and randomly assigns the rest of the indices in W ∪ S to {π(j) : j ∈ I_{l*} \ {1, . . . , m}}. Otherwise, if l* ∈ {2, . . . , n−1}, the user randomly assigns the M + D indices in W ∪ S to {π(j) : j ∈ I_{l*}}. Then, the user assigns the (not-yet-assigned) indices in 𝒦 \ (W ∪ S) to {π(j) : j ∉ I_{l*}}.

The value of β, which is carefully chosen in order to satisfy the individual privacy condition, depends on the values of D, m, r:

β ≜ m/(m+2r),                  if D ≤ m, D ≤ r,
β ≜ D/(m+2r),                  if D > m, D ≤ r,
β ≜ 1 − 2D/(m+2r),             if D ≤ m, D > r,
β ≜ (r/M)(1 − 2D/(m+2r)),      if D > m, D > r.

Next, the user constructs n ordered sets Q'_1, . . . , Q'_n, each of size M + D, defined as Q'_l ≜ {π(j) : j ∈ I_l}; and constructs an ordered multiset Q'' of size M + D, defined as Q'' ≜ {c_j : j ∈ I_{l*}}, where c_j = v_{π(j)} or c_j = u_{π(j)} when π(j) ∈ W or π(j) ∈ S, respectively. Recall that v_{π(j)} or u_{π(j)} is the coefficient of the message X_{π(j)} in the user's demand or side information, respectively. The user then constructs Q_l = (Q'_l, Q'') for 1 ≤ l ≤ n, and sends the query Q^[W,V,S,U] = {Q_1, . . . , Q_n} to the server.

Step 2:
Using the Q_l = (Q'_l, Q'')'s, the server computes the A_l's, defined as A_l ≜ Σ_{j=1}^{M+D} c_{i_j} X_{i_j}, where Q'_l = {i_1, . . . , i_{M+D}} and Q'' = {c_{i_1}, . . . , c_{i_{M+D}}}, and sends the answer A^[W,V,S,U] = {A_1, . . . , A_n} back to the user.

Step 3:
Upon receiving the server's answer, the user retrieves the demand Z^[W,V] by subtracting off the contribution of the side information Y^[S,U] from A_{l*} = Z^[W,V] + Y^[S,U].

Lemma 4.
The GMPC protocol is a scalar-linear IPC-CSI protocol, and achieves the rate ⌈K/(M+D)⌉⁻¹.

Proof: The rate and the scalar-linearity of the GMPC protocol are obvious from the construction. Clearly, the recoverability condition is also satisfied. To prove that the GMPC protocol satisfies the individual privacy condition, we need to show that for any given query Q generated by the protocol and for all i ∈ 𝒦, it holds that Pr(i ∈ W | Q = Q) = Pr(i ∈ W) = D/K, noting that W is distributed uniformly over 𝒦_D.

Fix an arbitrary i ∈ 𝒦. We consider the following three cases separately: (i) π⁻¹(i) ∈ {1, . . . , m}; (ii) π⁻¹(i) ∈ I_l \ {1, . . . , m} for some l ∈ {1, n}; and (iii) π⁻¹(i) ∈ I_l for some l ∉ {1, n}, where π⁻¹(i) = j if and only if π(j) = i.

First, consider case (i). In this case, we have

Pr(i ∈ W | Q = Q)
= Σ_{l∈{1,n}} Pr(i ∈ W, l* = l | Q = Q)
= Σ_{l∈{1,n}} Pr(l* = l | Q = Q) × Pr(i ∈ W | Q = Q, l* = l)
= 2 × (α/2) × (β × C(m−1, μ−1)/C(m, μ) + (1 − β) × C(m−1, D−ρ−1)/C(m, D−ρ))
= αβ(D/m),                          if D ≤ m, D ≤ r,
  αβ,                               if D > m, D ≤ r,
  α(β(D/m) + (1 − β)((D−r)/m)),     if D ≤ m, D > r,
  α(β + (1 − β)((D−r)/m)),          if D > m, D > r,
= D/K,

for our choice of β for each range of values of D, m, r.

Next, consider case (ii). In this case, we have

Pr(i ∈ W | Q = Q)
= Pr(i ∈ W, l* = l | Q = Q)
= Pr(l* = l | Q = Q) × Pr(i ∈ W | Q = Q, l* = l)
= (α/2) × (β × C(r−1, D−μ−1)/C(r, D−μ) + (1 − β) × C(r−1, ρ−1)/C(r, ρ))
= (α/2)(1 − β)(D/r),                        if D ≤ m, D ≤ r,
  (α/2)(β((D−m)/r) + (1 − β)(D/r)),         if D > m, D ≤ r,
  (α/2)(1 − β),                             if D ≤ m, D > r,
  (α/2)(β((D−m)/r) + (1 − β)),              if D > m, D > r,
= D/K,

for the choices of β specified earlier.

Lastly, consider case (iii). In this case, we have

Pr(i ∈ W | Q = Q)
= Pr(i ∈ W, l* = l | Q = Q)
= Pr(l* = l | Q = Q) × Pr(i ∈ W | Q = Q, l* = l)
= (1/(n−2)) × (1 − α) × (D/(M+D))
= ((M+D)/(K−m−2r)) × ((K−m−2r)/K) × (D/(M+D))
= D/K.
This completes the proof.

V. EXAMPLES
A. GMPC Protocol
This section illustrates two examples of the GMPC protocol, for M = D = 2 and K ∈ {11, 12}.

Example 1.
Consider a scenario where the server has K = 12 messages X_1, . . . , X_12 ∈ F_{q^ℓ}, and the user demands the linear combination Z = X_1 + 3X_2 with support size D = 2 and has the coded side information Y = 5X_3 + X_4 with support size M = 2. For this example, W = {1, 2}, V = {v_1, v_2} = {1, 3}, S = {3, 4}, and U = {u_3, u_4} = {5, 1}. The protocol's parameters are as follows: n = 3, m = 0, r = 4, α = 2/3, β = 1/4, μ = 0, and ρ = 2. Let I_1 = {1, 2, 3, 4}, I_2 = {5, 6, 7, 8}, and I_3 = {9, 10, 11, 12}.

First, the user constructs a permutation π of {1, . . . , 12} as follows. With probability α = 2/3, the user randomly chooses l* ∈ {1, 3}, or with probability 1 − α = 1/3, the user chooses l* = 2. Note that for this example, l* is equally likely to be any of the indices in {1, 2, 3}. Suppose that the user chooses l* = 1. Since, for this example, μ = 0, D − ρ = 0, m − μ = 0, and m − D + ρ = 0, the user randomly assigns all indices in W ∪ S = {1, 2, 3, 4} to {π(j) : j ∈ I_1}; say, π(1) = 2, π(2) = 4, π(3) = 1, and π(4) = 3. Then, the user randomly assigns the (not-yet-assigned) indices in {5, . . . , 12} to {π(j) : j ∉ I_1}; say, π(5) = 10, π(6) = 8, π(7) = 6, π(8) = 5, π(9) = 11, π(10) = 9, π(11) = 12, and π(12) = 7. Thus, the permutation π maps {1, . . . , 12} to {2, 4, 1, 3, 10, 8, 6, 5, 11, 9, 12, 7}.

Next, the user constructs the ordered sets Q'_1 = {π(j) : j ∈ I_1} = {2, 4, 1, 3}, Q'_2 = {π(j) : j ∈ I_2} = {10, 8, 6, 5}, and Q'_3 = {π(j) : j ∈ I_3} = {11, 9, 12, 7}; and constructs the ordered multiset Q'' = {c_j : j ∈ I_1} = {c_1, c_2, c_3, c_4} = {v_2, u_4, v_1, u_3} = {3, 1, 1, 5}. The user then constructs Q_1 = (Q'_1, Q'') = ({2, 4, 1, 3}, {3, 1, 1, 5}), Q_2 = (Q'_2, Q'') = ({10, 8, 6, 5}, {3, 1, 1, 5}), and Q_3 = (Q'_3, Q'') = ({11, 9, 12, 7}, {3, 1, 1, 5}); and sends the query Q = {Q_1, Q_2, Q_3} to the server.

The server computes A_1 = 3X_2 + X_4 + X_1 + 5X_3, A_2 = 3X_10 + X_8 + X_6 + 5X_5, and A_3 = 3X_11 + X_9 + X_12 + 5X_7; and sends the answer A = {A_1, A_2, A_3} back to the user. The user then subtracts Y = 5X_3 + X_4 from A_{l*} = A_1 = 3X_2 + X_4 + X_1 + 5X_3, and recovers Z = X_1 + 3X_2.

To prove that the individual privacy condition is satisfied in this example, we need to show that the probability of every message X_i being one of the two messages in X_W is 2/12 = 1/6. From the perspective of the server, l* is 1, 2, or 3, each with probability 1/3. Given l* = l, each of the 6 pairs of messages in the support of A_l is equally likely to be the two messages in X_W, i.e., with probability 1/6. Since every message in the support of A_l belongs to 3 such pairs, the probability of any message in the support of A_l being one of the two messages in X_W is 3 × 1/6 = 1/2. Thus, the probability of any message X_i belonging to X_W is 1/3 × 1/2 = 1/6.

Example 2.
Consider the scenario in Example 1 (i.e., W = { , } , V = { , } , S = { , } , and U = { , } ),except when the server has K = 11 messages X , . . . , X ∈ F .The protocol’s parameters for this example are as follows: n = 3 , m = 1 , r = 3 , α = , β = , µ = 1 and ρ = 2 .Let I = { , , , } , I = { , , , } , and I = { , , , } . The user first constructs a permutation π of { , . . . , } as follows. With probability α = , the user randomly chooses l ∗ ∈ { , } , or with probability − α = , the user chooses l ∗ = 2 . For this example, l ∗ is equal to , , or , with probability , , or , respectively. Suppose that the user chooses l ∗ = 1 . With probability β = (or − β = ), the user assigns µ = 1 randomly chosen index from W = { , } (or m − D + ρ = 1 randomly chosen index from S = { , } ),say the index , to π (1) , i.e., π (1) = 2 ; and randomly assigns the rest of the indices in W ∪ S = { , , , } ,i.e., { , , } , to { π ( j ) : j ∈ I \ { }} ; say π (2) = 4 , π (3) = 1 , and π (4) = 3 . Then, the user randomly assignsthe (not-yet-assigned) indices in { , . . . , } to { π ( j ) : j I } ; say, π (5) = 10 , π (6) = 8 , π (7) = 6 , π (8) = 5 , π (9) = 11 , π (10) = 9 , and π (11) = 7 . Thus, the permutation π maps { , . . . , } to { , , , , , , , , , , } .Next, the user constructs the ordered sets Q ′ = { π ( j ) : j ∈ I } = { , , , } , Q ′ = { π ( j ) : j ∈ I } = { , , , } , and Q ′ = { π ( j ) : j ∈ I } = { , , , } ; and constructs the ordered multiset Q ′′ = { c j : j ∈ I } = { c , c , c , c } = { v , u , v , u } = { , , , } . The user then constructs Q = ( Q ′ , Q ′′ ) = ( { , , , } , { , , , } ) , Q = ( Q ′ , Q ′′ ) = ( { , , , } , { , , , } ) , and Q = ( Q ′ , Q ′′ ) = ( { , , , } , { , , , } ) ; and sends the query Q = { Q , Q , Q } to the server.The server computes A = 3 X + X + X +5 X , A = 3 X + X + X +5 X , and A = 3 X + X + X +5 X ;and sends the answer A = { A , A , A } back to the user. 
The user then subtracts Y = 5X_3 + X_4 from A_{l∗} = A_1 = 3X_2 + X_4 + X_1 + 5X_3, and recovers Z = X_1 + 3X_2. Now, we show that the individual privacy condition is satisfied for this example. We need to verify that every message X_i belongs to X_W with probability 2/11. From the server's perspective, l∗ is 1, 2, or 3 with probability 7/22, 4/11, or 7/22, respectively. First, consider the message X_2. Given l∗ = 1 (or l∗ = 3), the message X_2, which belongs to the support of both A_1 and A_3, is one of the two messages in X_W with probability β = 2/7; whereas for l∗ = 2, the message X_2 cannot belong to X_W. Thus, the probability of the message X_2 to belong to X_W is 2 × 7/22 × 2/7 = 2/11. Now, consider the message X_4. Given l∗ = 1, the message X_4 belongs to X_W with probability 2/7 × 1/3 + 5/7 × 2/3 = 4/7. This is because for X_4 being one of the two messages in X_W given l∗ = 1, either (i) X_2 belongs to X_W, which has probability 2/7, and X_4 is the other message in X_W, which has probability 1/3 (given X_2 belonging to X_W), or (ii) X_2 does not belong to X_W, which has probability 5/7, and one of the pairs X_4, X_1 or X_4, X_3 are the two messages in X_W, which has probability 2/3 (given X_2 not belonging to X_W). Given l∗ = 2 or l∗ = 3, the message X_4 cannot be one of the two messages in X_W. Thus, the probability of the message X_4 to belong to X_W is 7/22 × 4/7 = 2/11. Similarly, one can show that any message X_i belongs to X_W with probability 2/11.

B. PC-IA Protocol
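The three probability computations above (one per class of support slot) can be verified exactly with rational arithmetic. This check is our own; it assumes the example's parameters α = 7/11 and β = 2/7, so that l∗ = 1, 2, 3 occur with probability 7/22, 4/11, 7/22.

```python
from fractions import Fraction as F

# Assumed parameters from the example.
alpha, beta = F(7, 11), F(2, 7)
p1 = p3 = alpha / 2          # P(l* = 1) = P(l* = 3) = 7/22
p2 = 1 - alpha               # P(l* = 2) = 4/11

# X_2 sits in the slot shared by A_1 and A_3; given l* = 1 or l* = 3,
# it joins X_W with probability beta.
p_shared = (p1 + p3) * beta

# A non-shared slot of A_1 (e.g., X_4): condition on whether the shared
# slot received a demand index (prob. beta) or a side-information index.
p_given_l1 = beta * F(1, 3) + (1 - beta) * F(2, 3)   # = 4/7
p_nonshared = p1 * p_given_l1

# A slot of A_2: the two demand indices fall uniformly among its 4 slots.
p_a2 = p2 * F(2, 4)

assert p_shared == p_nonshared == p_a2 == F(2, 11)
```

All three classes of messages end up in X_W with probability exactly 2/11 = D/K, as the individual privacy condition requires.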
In this section, we give two examples for the PC-IA protocol for M = D = 2 and K ∈ {11, 12}. Example 3 shows an instance where the PC-IA protocol is a JPC-CSI protocol, whereas Example 4 shows an instance for which the PC-IA protocol fails as a JPC-CSI protocol.

Example 3.
Consider the scenario in Example 1, except when joint privacy is required, instead of individual privacy. The protocol's parameters for this example are as follows (for details, see [6]): s = 2, n = 5, m = 6, r = 0, and t = 1, and {x_1, x_2, x_3, x_4, x_5, y_1, y_2} = {0, 1, . . . , 6}.

First, the user creates m = 6 ordered sets B_1, . . . , B_6, where B_j = {−, −} for all j, i.e., B_j has two slots to be filled with elements from {1, . . . , 12}. The user then randomly places the D = 2 indices in W = {1, 2} into two slots; say, B_2 = {1, −}, B_4 = {−, 2}, and B_1, B_3, B_5, B_6 remain empty. Since B_2 and B_4 contain some indices from W, the user fills B_2 and B_4, each with a randomly chosen index from S = {3, 4}; say, B_2 = {1, 3}, and B_4 = {4, 2}. Next, the user randomly places the remaining indices 5, . . . , 12 into the remaining slots, and fills B_1, B_3, B_5, B_6.

The user then constructs n = 5 ordered sets Q_1, . . . , Q_5, where Q_i = {B_1, B_{i+1}}, i.e., Q_i consists of the two indices in B_1 followed by the two indices in B_{i+1}. Next, the user creates n = 5 ordered multisets Q′_1, . . . , Q′_5, defined as Q′_i = {C_{i,1}, C_{i,i+1}}, where C_{i,1} = {α_{1,1} ω_{i,1}, α_{1,2} ω_{i,1}} and C_{i,i+1} = {α_{i+1,1}, α_{i+1,2}}, where ω_{i,j} = 1/(x_i − y_j); and the values of the α_{j,k}'s are specified shortly.

The procedure for choosing the α_{j,k}'s is described below. First, the user finds: (i) the set J of indices j such that B_j contains some indices from W; (ii) the minimal set I (with highest lexicographical order) of indices of the Q_i's such that ∪_{i∈I} Q_i contains all indices in W; and (iii) the set H of the |I| − 1 largest indices in {1, . . . , t} \ J; for this example, J = {2, 4}, I = {1, 3}, and H = {1}.
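Steps (i)-(iii) above can be sketched as follows. The fills of B_1, B_3, B_5, B_6 below are hypothetical (the example leaves them random); the resulting J, I, and H do not depend on them.

```python
from itertools import combinations

# Demand support and the B-sets of the example; B_1, B_3, B_5, B_6 are
# hypothetical fills from the remaining indices {5, ..., 12}.
W = {1, 2}
B = {1: [5, 9], 2: [1, 3], 3: [6, 10], 4: [4, 2], 5: [7, 11], 6: [8, 12]}
Q = {i: set(B[1]) | set(B[i + 1]) for i in range(1, 6)}   # Q_i = B_1 with B_{i+1}

# (i) J: indices j such that B_j contains an index from W.
J = {j for j in B if W & set(B[j])}

# (ii) I: a smallest index set whose Q_i's jointly cover W (unique here,
# so the paper's lexicographic tie-break is not needed).
I = next(set(comb)
         for size in range(1, 6)
         for comb in combinations(sorted(Q), size)
         if W <= set().union(*(Q[i] for i in comb)))

# (iii) H: the |I| - 1 largest indices in {1, ..., t} \ J, with t = 1.
t = 1
H = sorted(set(range(1, t + 1)) - J)[-(len(I) - 1):]

assert (J, I, H) == ({2, 4}, {1, 3}, [1])
```

Because index 1 appears only in B_2 (hence only in Q_1) and index 2 only in B_4 (hence only in Q_3), the cover I = {1, 3} is forced regardless of the random fills.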
Then, the user forms the matrix T = (ω_{i,j})_{i∈I, j∈H} = [ω_{1,1}, ω_{3,1}]^T, and chooses c_1 = 1 and c_3 = −ω_{1,1}/ω_{3,1} = 5 such that [c_1, c_3] · T = 0. The user then selects the α_{j,k}'s as follows: for j ∈ J = {2, 4} and k ∈ {1, . . . , s} = {1, 2} such that the kth element of B_j, say, l, belongs to W (or S), the user selects α_{j,k} = v_l / Σ_{i∈I} c_i ω_{i,j} (or α_{j,k} = u_l / Σ_{i∈I} c_i ω_{i,j}) if 1 ≤ j ≤ t = 1, and selects α_{j,k} = v_l / c_{j−1} (or α_{j,k} = u_l / c_{j−1}) otherwise, where v_l (or u_l) is the coefficient of the message X_l in the user's demand Z (or side information Y). For this example, α_{2,1} = v_1 / c_1 = 1 and α_{4,2} = v_2 / c_3 = 2 (the first element of B_2 and the second element of B_4 are the demand support indices 1 and 2, respectively); and α_{2,2} = u_1 / c_1 = 5 and α_{4,1} = u_2 / c_3 = 3 (the second element of B_2 and the first element of B_4 are the side information support indices 3 and 4, respectively). The user selects the rest of the α_{j,k}'s from F_q \ {0} = {1, . . . , 6} at random; say, α_{1,1} = 2, α_{1,2} = 5, α_{3,1} = 3, α_{3,2} = 3, α_{5,1} = 4, α_{5,2} = 1, α_{6,1} = 3, and α_{6,2} = 2. These choices of the α_{j,k}'s yield the ordered multisets Q′_1, . . . , Q′_5 defined earlier.

The user then sends the pairs (Q_1, Q′_1), . . . , (Q_5, Q′_5) to the server, and the server sends back the answers A_1, . . . , A_5, where each A_i is the linear combination of the four messages indexed by Q_i with the four coefficients listed in Q′_i. Then, the user computes c_1 A_1 + c_3 A_3 = A_1 + 5 A_3 = X_1 + 3X_2 + 5X_3 + X_4; and subtracting off the contribution of Y = 5X_3 + X_4, recovers Z = X_1 + 3X_2.

To show that the joint privacy condition is satisfied in this example, we need to prove that any pair of messages is equally likely to be in X_W. As an example, consider a pair of messages X_{i_1} and X_{i_2} such that X_{i_1} belongs only to the support of A_1 and X_{i_2} belongs only to the support of A_3. According to the supports of A_1 and A_3, X_{i_1} and X_{i_2} belong to X_W if and only if the messages paired with them (one in the support of A_1, and one in the support of A_3) are the two messages in X_S.
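The choice of [c_1, c_3] in the left null space of T is what cancels the B_1 block in c_1 A_1 + c_3 A_3. The following sketch (ours) carries this out in F_7, under the assumed evaluation points x_i = i − 1, y_1 = 5, y_2 = 6.

```python
# Sketch of the coefficient choice and B_1-block cancellation in F_7.
# The evaluation points x_i = i - 1, y_1 = 5, y_2 = 6 are our assumed
# assignment of {x_1, ..., x_5, y_1, y_2} = {0, 1, ..., 6}.
q = 7
inv = lambda a: pow(a % q, -1, q)        # multiplicative inverse in F_7
x = {i: (i - 1) % q for i in range(1, 6)}
y = {1: 5, 2: 6}

def w(i, j):
    """omega_{i,j} = 1/(x_i - y_j) in F_7."""
    return inv(x[i] - y[j])

c1 = 1
c3 = (-w(1, 1) * inv(w(3, 1))) % q       # c_3 = -omega_{1,1}/omega_{3,1}
assert c3 == 5                           # matches the value in the example
assert (c1 * w(1, 1) + c3 * w(3, 1)) % q == 0    # [c_1, c_3] . T = 0

# The B_1 entries carry coefficients alpha_{1,k} * omega_{i,1} in A_i, so
# they cancel in c1*A1 + c3*A3 for ANY alpha_{1,k} (e.g., the assumed 2, 5):
for a in (2, 5):
    assert (c1 * a * w(1, 1) + c3 * a * w(3, 1)) % q == 0
```

Since the cancellation holds for arbitrary α_{1,k}, the B_1 block never constrains the remaining coefficient choices.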
This is because X_{i_1} belongs only to the support of A_1, and X_{i_2} belongs only to the support of A_3; and by the protocol, one of the messages in X_S must be paired with X_{i_1}, and the other message in X_S must be paired with X_{i_2}. Note that the two messages indexed by B_1 are aligned in A_1 and A_3, and they can be canceled by linearly combining A_1 and A_3. Moreover, there exists a unique such linear combination of A_1 and A_3, i.e., c_1 A_1 + c_3 A_3, where the coefficient of A_1, i.e., c_1, is equal to 1. Note that by the protocol, the coefficient of the least-indexed A_i in the linear combination of the A_i's being used by the user in the recovery process (in this case, the coefficient c_1 of A_1 in the linear combination c_1 A_1 + c_3 A_3) is always chosen to be equal to 1. Next, consider a different pair of messages, say, X_{j_1} and X_{j_2}, that both belong to the support of a single answer A_i. According to the support of A_i, the messages X_{j_1} and X_{j_2} belong to X_W if and only if the other two messages in the support of A_i belong to X_S. In this case, A_i is the unique linear combination of the A_i's (with the least-indexed A_i having coefficient 1) whose support contains these four messages. If a pair of messages appears together in the supports of several A_i's, the least-indexed such A_i plays this role. Similarly as above, one can verify that, from the perspective of the server, there is a unique way to recover a linear combination of any two messages. This observation, together with the fact that by the protocol the two messages in X_W were placed randomly in the support of the A_i's, shows that any pair of messages is equally likely to be in X_W.

Example 4.
Consider the scenario in Example 2, except when, instead of individual privacy, joint privacy is required. Following the procedure in the PC-IA protocol, without specifying the choice of nonzero coefficients (denoted by '∗'), the server's answer to the user's query will have one of three structures, referred to as cases (i)-(iii) (up to a permutation of 1 and 2; a permutation of 3 and 4; and a permutation of 5, . . . , 11). In each case, every answer A_1, . . . , A_5 is a linear combination, with nonzero coefficients '∗', of three of the messages X_1, . . . , X_11; the cases differ only in which message indices appear in the support of each A_i. In each of these cases, one can easily verify that for any D = 2 messages, say, X_{i_1} and X_{i_2}, there exists a linear combination of the A_i's whose support includes X_{i_1} and X_{i_2}, and has size at least D = 2 and at most M + D = 4. (Recall that this property is required for any scalar-linear JPC-SI protocol.) For instance, consider two messages X_{i_1} and X_{i_2} whose indices do not appear together in the support of any single answer in case (i): by linearly combining two of the answers in such a way that a common message is canceled, one can recover a linear combination of four messages that includes X_{i_1} and X_{i_2}. In both cases (ii) and (iii), the support of one of the answers contains X_{i_1} and X_{i_2}, and has size 3 (< 4). On the other hand, for a scalar-linear JPC-CSI protocol, a stronger requirement needs to be satisfied: for any two messages X_{i_1} and X_{i_2}, there must exist a linear combination of the A_i's whose support includes X_{i_1} and X_{i_2}, and has size exactly equal to M + D = 4. For such a pair of messages, however, neither case (ii) nor case (iii) satisfies the underlying requirement.

REFERENCES

[1] H. Sun and S. A. Jafar, “The capacity of private computation,” in 2018 IEEE International Conference on Communications (ICC), May 2018, pp. 1–6.
[2] S. A. Obead and J. Kliewer, “Achievable rate of private function retrieval from MDS coded databases,” in 2018 IEEE International Symposium on Information Theory (ISIT), 2018, pp. 2117–2121.
[3] S. A. Obead, H.-Y. Lin, E. Rosnes, and J.
Kliewer, “Capacity of private linear computation for coded databases,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2018, pp. 813–820.
[4] M. Mirmohseni and M. A. Maddah-Ali, “Private function retrieval,” in 2018 Iran Workshop on Communication and Information Theory (IWCIT), April 2018, pp. 1–6.
[5] Z. Chen, Z. Wang, and S. Jafar, “The asymptotic capacity of private search,” Jan 2018. [Online]. Available: arXiv:1801.05768
[6] A. Heidarzadeh and A. Sprintson, “Private computation with side information: The single-server case,” in 2019 IEEE International Symposium on Information Theory (ISIT), July 2019, pp. 1657–1661.
[7] S. Kadhe, B. Garcia, A. Heidarzadeh, S. E. Rouayheb, and A. Sprintson, “Private information retrieval with side information: The single server case,” in 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct 2017, pp. 1099–1106.
[8] ——, “Private information retrieval with side information,” CoRR, vol. abs/1709.00112, 2017. [Online]. Available: http://arxiv.org/abs/1709.00112
[9] A. Heidarzadeh, S. Kadhe, B. Garcia, S. E. Rouayheb, and A. Sprintson, “On the capacity of single-server multi-message private information retrieval with side information,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct 2018.
[10] S. Li and M. Gastpar, “Single-server multi-message private information retrieval with side information,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct 2018.
[11] F. Kazemi, E. Karimi, A. Heidarzadeh, and A. Sprintson, “Single-server single-message online private information retrieval with side information,” in 2019 IEEE International Symposium on Information Theory (ISIT), July 2019, pp. 350–354.
[12] S. Kadhe, A. Heidarzadeh, A. Sprintson, and O. O. Koyluoglu, “On an equivalence between single-server PIR with side information and locally recoverable codes,” July 2019. [Online]. Available: arXiv:1907.00598
[13] Z. Chen, Z. Wang, and S. Jafar, “The capacity of private information retrieval with private side information,” CoRR, vol. abs/1709.03022, 2017. [Online]. Available: http://arxiv.org/abs/1709.03022
[14] S. P. Shariatpanahi, M. J. Siavoshani, and M. A. Maddah-Ali, “Multi-message private information retrieval with private side information,” May 2018. [Online]. Available: arXiv:1805.11892
[15] R. Tandon, “The capacity of cache aided private information retrieval,” in 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Oct 2017, pp. 1078–1082.
[16] Y. Wei, K. Banawan, and S. Ulukus, “Cache-aided private information retrieval with partially known uncoded prefetching: Fundamental limits,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 6, pp. 1126–1139, June 2018.
[17] ——, “Fundamental limits of cache-aided private information retrieval with unknown and uncoded prefetching,” IEEE Transactions on Information Theory, pp. 1–1, 2018.
[18] Y.-P. Wei and S. Ulukus, “The capacity of private information retrieval with private side information under storage constraints,” June 2018. [Online]. Available: arXiv:1806.01253
[19] A. Heidarzadeh, F. Kazemi, and A. Sprintson, “Capacity of single-server single-message private information retrieval with private coded side information,” in 2019 IEEE International Symposium on Information Theory (ISIT), July 2019, pp. 1662–1666.
[20] F. Kazemi, E. Karimi, A. Heidarzadeh, and A. Sprintson, “Private information retrieval with private coded side information: The multi-server case,” June 2019. [Online]. Available: arXiv:1906.11278
[21] A. Heidarzadeh, S. Kadhe, S. E. Rouayheb, and A. Sprintson, “Single-server multi-message individually-private information retrieval with side information,” in 2019 IEEE International Symposium on Information Theory (ISIT), July 2019, pp. 1042–1046.
[22] A. Heidarzadeh, F. Kazemi, and A. Sprintson, “Capacity of single-server single-message private information retrieval with coded side information,” in 2018 IEEE Information Theory Workshop (ITW)