arXiv [cs.IT]

Noiseless Privacy
Farhad Farokhi
CSIRO's Data61 and The University of Melbourne ([email protected])
Abstract—In this paper, we define noiseless privacy as a non-stochastic rival to differential privacy, requiring that the outputs of a mechanism (i.e., the composition of a privacy-preserving mapping and a query) can attain only a few values while varying the data of an individual: the logarithm of the number of distinct values is bounded by the privacy budget. Therefore, the output of the mechanism is not fully informative of the data of the individuals in the dataset. We prove several guarantees for noiselessly-private mechanisms. The information content of the output about the data of an individual, even if an adversary knows all the other entries of the private dataset, is bounded by the privacy budget. The zero-error capacity of memory-less channels using noiselessly-private mechanisms for transmission is upper bounded by the privacy budget. The performance of a non-stochastic hypothesis-testing adversary is again bounded by the privacy budget. Finally, assuming that an adversary has access to a stochastic prior on the dataset, we prove that the adversary's estimation error for individual entries of the dataset is lower bounded by a decreasing function of the privacy budget. In this case, we also show that the maximal information leakage is bounded by the privacy budget. In addition to privacy guarantees, we prove that noiselessly-private mechanisms admit a composition theorem and that post-processing does not weaken their privacy guarantees. We prove that quantization operators can ensure noiseless privacy if the number of quantization levels is appropriately selected based on the sensitivity of the query and the privacy budget. Finally, we illustrate the privacy merits of noiseless privacy using multiple datasets in energy and transport.
Index Terms—data privacy; noiseless privacy; non-stochastic information theory; hypothesis testing.
I. INTRODUCTION
The big data revolution, equipped with novel tools for data collection, analysis, and reporting, holds significant promise for answering societal challenges. These promises, however, come at the cost of erosion of privacy. Therefore, there is a need for rigorous protection of the privacy of individuals. Natural candidates for privacy protection, such as differential privacy [1], [2] and information-theoretic privacy [3], [4], require randomized policies for privacy protection. The definition of differential privacy assumes the use of randomized functions as well as probabilities over outputs, and conventional information-theoretic tools, such as mutual information and entropy, rely on random variables. Heuristic-based privacy-preserving methods, such as k-anonymity [5], [6] and ℓ-diversity [7], are however deterministic in nature. They employ deterministic mechanisms, such as suppression and generalization, and do not assume stochastic properties of the datasets. The popularity of these methods is evident from the availability of toolboxes for their implementation. Although providing powerful guarantees, randomized or stochastic privacy-preserving policies sometimes cause problems, such as un-truthfulness [8], that are undesirable in practice [9]. This is touted as a reason behind the slow adoption of differential privacy within the financial and health sectors [8]. For instance, randomized policies, stemming from differential privacy in financial auditing, complicate fraud detection [10], [11]. Randomized policies can also generate unreasonable and unrealistic outputs that might mislead investors or market operators, e.g., by reporting noisy outputs that point to a lack of liquidity in a financial sector when that is not the case. For instance, the slow-decaying nature of the Laplace noise means that impossible reports (e.g., negative median income) can occur with positive probability [12].
Randomized privacy-preserving policies have also encountered difficulties in medical, health, and social sciences [13], [14]. Furthermore, the Laplace mechanism, a common approach to ensuring differential privacy, is shown to cause undesirable properties, e.g., optimal estimation in the presence of Laplace noise is computationally expensive [15]. These motivate the development of non-stochastic privacy metrics and privacy-preserving policies in a rigorous manner. Although it has been proved that noiseless policies cannot provide the strong guarantees of randomized policies, e.g., that differential privacy cannot be delivered without noise [2], the popularity of noiseless privacy-preserving policies justifies investigating metrics for their analysis and comparison. This must be done irrespective of their inherent philosophical weaknesses in comparison to stochastic policies because they belong to a different category. In this paper, we define the new notion of noiseless privacy. Noiseless privacy implies that the outputs of a mechanism can only attain a few distinct values while varying the data of an individual. Therefore, the output of the mechanism is not very informative about the data of the individuals in a dataset. We prove the following guarantees for noiselessly-private mechanisms:
• The information content of the output about the data of each individual, even if an adversary knows all the other entries of the private dataset, is bounded from above by the privacy budget (a constant, similar to the privacy budget in differential privacy, capturing the amount of the leaked information). As non-stochastic notions of information, we use the non-stochastic information leakage of [16] and the maximin information [17]. These are established measures of information in the non-stochastic information theory literature [16]–[19].
[Footnote, on the availability of toolboxes: https://arx.deidentifier.org/overview/related-software/]
• The zero-error capacity of memory-less channels using noiselessly-private mechanisms for data transmission is upper bounded by the privacy budget. Zero-error capacity is the non-stochastic equivalent of the usual capacity, also coined by Shannon while investigating non-stochastic communication channels and worst-case behaviours [20].
• The performance of an adversary performing non-stochastic hypothesis tests [21] on the data of an individual, while knowing all the other entries of the private dataset, is again bounded by the privacy budget.
• Assuming that an adversary has access to a stochastic prior on the dataset, we prove that the adversary's error in estimating the data of each individual is lower bounded by a decreasing function of the privacy budget. Therefore, by reducing the privacy budget, the estimation error of the adversary worsens. In this case, we also show that the maximal information leakage (in the sense of [65]) is upper bounded by the privacy budget. Hence, by reducing the privacy budget, we can also reduce the maximal information leakage.
In addition to these privacy guarantees, we prove the following important properties:
• Noiselessly-private mechanisms admit a composition theorem, i.e., the privacy budgets of the mechanisms add up when reporting multiple queries on the same private dataset.
• Post-processing of noiselessly-private mechanisms does not weaken their privacy guarantees, i.e., the privacy budget cannot increase under post-processing.
We also prove that quantization operators can ensure noiseless privacy. We provide a recipe for determining the number of quantization levels based on the sensitivity of the query and the privacy budget. Finally, we illustrate the privacy merits of noiseless privacy using multiple datasets in energy and transport.
A. Related Studies

Anonymization: Anonymization is widely used within the public and private sectors for releasing sensitive datasets for public competitions and analysis. Although popularly adopted, anonymization is often insufficient for privacy preservation [22]–[24] and hence systematic methods with provable guarantees are required.

Multi-Party Computation and Encryption:
We may use secure multi-party computation, for instance based on homomorphic encryption, to compute aggregate statistics or machine learning models [25]–[31]. Secure multi-party computation and homomorphic encryption introduce massive computation and communication overheads. They also do not fully eliminate the risk of privacy breaches; e.g., risks associated with dis-aggregation attacks remain if these algorithms are not paired with other privacy-preserving techniques.

Differential Privacy:
Differential privacy offers provable privacy guarantees [2], [32]–[37]. This method uses randomization to provide plausible deniability for the data of an individual by ensuring that the statistics of privacy-preserving outputs do not change significantly when varying the data of an individual. Additive Laplace and Gaussian noise, with scales proportional to the sensitivity of the submitted query with respect to the individual entries of the dataset, are proved to guarantee differential privacy [2]. By definition, differential privacy requires randomization.
Information-Theoretic Privacy:
Information-theoretic privacy, a rival to differential privacy, dates back to studies of secrecy [38] and its generalizations [3], [4], [39], [40]. Information-theoretic guarantees have also been used to measure the quantity of leaked private information when using differential privacy [41], [42]. In information-theoretic privacy, entropy, mutual information, Kullback–Leibler divergence, and Fisher information have been repeatedly used as measures of privacy [43]–[49]. Information theory, starting with Shannon [50], assumes that data sources and communication channels are random, and is powerful in modelling and analysing communication systems. However, traditional notions in information theory, such as mutual information, are not useful for analysing non-stochastic/noiseless settings and deterministic privacy-preserving policies.
Deterministic Privacy-Preserving Policies: Noiseless privacy-preserving policies are often heuristic-based, making them vulnerable to attacks; e.g., k-anonymity is vulnerable to the homogeneity attack [7]. This is because we do not possess sensible measures/definitions of privacy that extend to noiseless privacy-preserving policies on deterministic datasets. Therefore, we cannot prove, in any sense, privacy guarantees of noiseless privacy-preserving policies (even if weak or limited in scope or practice). The popularity of noiseless privacy-preserving policies justifies investigating metrics for analysis and comparison. In this paper, we propose a rival to differential privacy that is noiseless. We use non-stochastic information theory, non-stochastic hypothesis testing, and stochastic estimation theory to investigate the merits of this definition. Non-stochastic information theory dates back to early studies of information transmission [17], [51]–[54]. It has recently been used in engineering [55]–[57]. Most recently, non-stochastic information theory was used in [16] for investigating deterministic privacy-preserving policies. Interestingly, in [16], it was easily proved that k-anonymity is not privacy-preserving using non-stochastic information theory, a fact that had only been observed through adversarial attacks in [7].

B. Paper Organization
The rest of the paper is organized as follows. Background material on non-stochastic information theory and hypothesis testing is presented in Section II. Noiseless privacy is defined in Section III; guarantees and properties of noiseless privacy are also presented in that section. A method for ensuring noiseless privacy is presented in Section IV. Experimental results are presented in Section V. Finally, the paper is concluded in Section VI.

II. UNCERTAIN VARIABLES, HYPOTHESIS TESTING, AND NON-STOCHASTIC INFORMATION THEORY
We start by reviewing necessary concepts from non-stochastic information theory, particularly uncertain variables, non-stochastic information leakage, and hypothesis testing.
A. Uncertain Variables
Let Ω be an uncertainty set/space whose elements ω ∈ Ω model/capture the source of uncertainty. An uncertain variable X is defined as a mapping on Ω. For any uncertain variable X : Ω → 𝕏, X(ω) is the realization of the uncertain variable X (corresponding to the realization of uncertainty ω ∈ Ω). The marginal range of uncertain variable X is ⟦X⟧ := {X(ω) : ω ∈ Ω} ⊆ 𝕏. The joint range of uncertain variables X : Ω → 𝕏 and Y : Ω → 𝕐 is defined as ⟦X, Y⟧ := {(X(ω), Y(ω)) : ω ∈ Ω} ⊆ 𝕏 × 𝕐. The conditional range of uncertain variable X, conditioned on the realizations of uncertain variable Y belonging to the set 𝒴, i.e., Y(ω) ∈ 𝒴 ⊆ ⟦Y⟧, is given by ⟦X | 𝒴⟧ := {X(ω) : ω ∈ Ω such that Y(ω) ∈ 𝒴} ⊆ ⟦X⟧. If 𝒴 is a singleton, i.e., 𝒴 = {y}, we use ⟦X | y⟧ instead of ⟦X | {y}⟧ = ⟦X | 𝒴⟧. The definition of uncertain variables and their properties are similar to those of random variables, with the exception of not requiring a measure on Ω. Finally, if the marginal range ⟦X⟧ is uncountably infinite for an uncertain variable X, we refer to X as a continuous uncertain variable, similar to a continuous random variable. If the marginal range ⟦X⟧ is countable for an uncertain variable X, we call X a discrete uncertain variable.

B. Non-Stochastic Information Theory
The non-stochastic entropy of a discrete uncertain variable X is

H(X) := log(|⟦X⟧|) ∈ ℝ ∪ {±∞}. (1)

This is commonly referred to as the Hartley entropy [17], [51]. For a continuous uncertain variable X, the non-stochastic (differential) entropy is given by

h(X) := log_e(μ(⟦X⟧)) ∈ ℝ ∪ {±∞}, (2)

where μ(·) is the Lebesgue measure. This is sometimes referred to as the Rényi differential 0-entropy [17]. The authors of [17], [58] define the non-stochastic conditional entropy of uncertain variable X, conditioned on uncertain variable Y, as

H(X | Y) := max_{y ∈ ⟦Y⟧} log(|⟦X | y⟧|), (3)

for discrete uncertain variables X and Y. Similarly, for continuous uncertain variables X and Y, we get

h(X | Y) := ess sup_{y ∈ ⟦Y⟧} log_e(μ(⟦X | y⟧)). (4)

Now, we can define the non-stochastic information between uncertain variables X and Y as the difference of the entropy of X with and without access to realizations of Y. Hence, for discrete uncertain variables, the non-stochastic information can be defined as

I(X; Y) := H(X) − H(X | Y) = min_{y ∈ ⟦Y⟧} log(|⟦X⟧| / |⟦X | y⟧|). (5)

For continuous uncertain variables, the non-stochastic information can be similarly defined as I(X; Y) := h(X) − h(X | Y). Note that the non-stochastic information is in fact not symmetric, i.e., I(X; Y) ≠ I(Y; X) in general. With slight adaptation, Kolmogorov had previously defined a 'combinatorial' conditional entropy using log(|⟦X | y⟧|) and the information gain as |⟦X⟧| / |⟦X | y⟧| in [52]. The combinatorial conditional entropy and the information gain are only defined for a fixed realization Y(ω) = y, while (5) is based on the worst-case scenario.

In [16], it was observed that, in the context of information-theoretic privacy, the non-stochastic information (5) is not a good measure of information leakage; therefore, the non-stochastic information leakage was proposed as

L(X; Y) := max_{y ∈ ⟦Y⟧} log(|⟦X⟧| / |⟦X | y⟧|), (6)

for discrete uncertain variables. Similarly, for continuous uncertain variables, the non-stochastic information leakage was defined as

L(X; Y) := ess sup_{y ∈ ⟦Y⟧} log_e(μ(⟦X⟧) / μ(⟦X | y⟧)). (7)

In general, the non-stochastic information I and the non-stochastic information leakage L are not equal, i.e., I(X; Y) ≠ L(X; Y). In fact, from the definitions, it is easy to see that I(X; Y) ≤ L(X; Y). Further, L(X; Y) is not symmetric. We propose the symmetrized non-stochastic information leakage as

L_s0(X; Y) := min(L(X; Y), L(Y; X)). (8)

Note that, by construction, L_s0(X; Y) = L_s0(Y; X).

a) Maximin Information: In [17], the maximin information was introduced as a symmetric measure of information, and its relationship with the zero-error capacity was explored. To present the definition of the maximin information, we need to introduce the notion of overlap partitions:
• x, x′ ∈ ⟦X⟧ are ⟦X | Y⟧-overlap connected, or in short x ↔ x′, if there exists a finite sequence of conditional ranges {⟦X | y_i⟧}ⁿ_{i=1} such that x ∈ ⟦X | y_1⟧, x′ ∈ ⟦X | y_n⟧, and ⟦X | y_i⟧ ∩ ⟦X | y_{i+1}⟧ ≠ ∅ for all i = 1, …, n − 1;
• A ⊆ ⟦X⟧ is ⟦X | Y⟧-overlap connected if all x, x′ ∈ A are ⟦X | Y⟧-overlap connected;
• A, B ⊆ ⟦X⟧ are ⟦X | Y⟧-overlap isolated if there do not exist x ∈ A and x′ ∈ B such that x and x′ are ⟦X | Y⟧-overlap connected.
A ⟦X | Y⟧-overlap partition is a partition of ⟦X⟧ such that each member set is ⟦X | Y⟧-overlap connected and every two member sets are ⟦X | Y⟧-overlap isolated. Symmetry (x ↔ x′ implies x′ ↔ x) and transitivity (x ↔ x′ and x′ ↔ x″ imply x ↔ x″) guarantee that a unique ⟦X | Y⟧-overlap partition always exists [17]. The unique ⟦X | Y⟧-overlap partition is denoted by ⟦X | Y⟧⋆ in what follows. The maximin information is

I⋆(X; Y) := log(|⟦X | Y⟧⋆|). (9)

In [17], it was proved that |⟦X | Y⟧⋆| = |⟦Y | X⟧⋆| and thus I⋆(X; Y) = I⋆(Y; X). We now prove an important result regarding the relationship between the non-stochastic information leakage and the maximin information.

Proposition 1.
For a discrete uncertain variable Y, I⋆(X; Y) ≤ L_s0(X; Y).

Proof. See Appendix A.

An uncertain time series X is a sequence of uncertain variables X[k] : Ω → 𝕏 for all k ∈ ℕ. Alternatively, we can think of an uncertain time series X as a mapping from the sample space Ω to the set of discrete-time functions 𝕏^∞ := {x : ℕ → 𝕏}. Now, we can define a memory-less uncertain communication channel. A memory-less uncertain channel maps any uncertain time series X to an uncertain time series Y such that

⟦Y[k], …, Y[1] | X[k](ω) = x[k], …, X[1](ω) = x[1]⟧ = ⟦Y[k] | X[k](ω) = x[k]⟧ × ⋯ × ⟦Y[1] | X[1](ω) = x[1]⟧,

for all (x[k], …, x[1]) ∈ ⟦X[k], …, X[1]⟧ and all k ∈ ℕ. A code of length k is a finite set F ⊆ 𝕏^k, with each codeword f ∈ F denoting a distinct message. Define 𝒳(y[k], …, y[1]) := ⟦X[k], …, X[1] | Y[k](ω) = y[k], …, Y[1](ω) = y[1]⟧. The zero-error capacity is

C := lim_{k→∞} sup_{F ⊆ 𝕏^k : |F ∩ 𝒳(y[k], …, y[1])| ≤ 1, ∀(y[k], …, y[1]) ∈ 𝕐^k} log(|F|) / k. (10)

In what follows, we only consider sequences of discrete uncertain variables Y[k]. Now, we are ready to relate the symmetrized non-stochastic information leakage to the zero-error capacity of memory-less uncertain channels.

Proposition 2.
Any memory-less uncertain channel satisfies

C ≤ sup_{⟦X[k]⟧ ⊆ 𝕏} L_s0(X[k]; Y[k]).

Proof.
See Appendix B.
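The quantities above are straightforward to compute for finite ranges. The following sketch (our own toy example; all names are hypothetical) represents a pair (X, Y) by its joint range and checks Proposition 1 numerically:

```python
import math

def cond_ranges(joint):
    """Map each y to the conditional range (x-values consistent with y)."""
    out = {}
    for x, y in joint:
        out.setdefault(y, set()).add(x)
    return out

def leakage(joint):
    """Non-stochastic information leakage L(X;Y) of eq. (6), in bits."""
    X = {x for x, _ in joint}
    return max(math.log2(len(X) / len(c)) for c in cond_ranges(joint).values())

def maximin(joint):
    """I*(X;Y) of eq. (9): log2 of the number of overlap-partition cells."""
    cells = [set(c) for c in cond_ranges(joint).values()]
    merged = True
    while merged:  # merge overlapping conditional ranges until a fixed point
        merged = False
        for i in range(len(cells)):
            for j in range(i + 1, len(cells)):
                if cells[i] & cells[j]:
                    cells[i] |= cells.pop(j)
                    merged = True
                    break
            if merged:
                break
    return math.log2(len(cells))

# Toy joint range: Y reveals only the parity of X in {0, 1, 2, 3}.
joint = {(x, x % 2) for x in range(4)}
swapped = {(y, x) for x, y in joint}
L_s0 = min(leakage(joint), leakage(swapped))  # symmetrized leakage, eq. (8)
I_star = maximin(joint)
assert I_star <= L_s0 + 1e-12                 # Proposition 1
```

Here the two conditional ranges {0, 2} and {1, 3} are overlap isolated, so the overlap partition has two cells and I⋆ = L_s0 = 1 bit.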
C. Non-Stochastic Hypothesis Testing
Consider an uncertain variable X denoting the original uncertain variable. An adversary is interested in testing the validity of a hypothesis about the realizations of X. The adversary does not have access to realizations of this uncertain variable, as otherwise hypothesis testing is trivial. Instead, it has access to an uncertain measurement of this variable, denoted by Y. This is captured by Y(ω) = g_Y(X(ω)) for a mapping g_Y : ⟦X⟧ → ⟦Y⟧. Recalling that uncertain variables are mappings from the uncertainty set, it must be that Y = g_Y ∘ X, where ∘ denotes composition of mappings. Similarly, we may define the hypothesis as an uncertain variable H with binary range ⟦H⟧ = {p₀, p₁}, where p₀ denotes the null hypothesis and p₁ denotes the alternative hypothesis. We assume that there exists a mapping g_H : ⟦X⟧ → ⟦H⟧ such that H = g_H ∘ X; the hypothesis is constructed based on the uncertain variable X as H(ω) = g_H(X(ω)). This setup and the relationships between all the uncertain variables are summarized in Figure 1.

Fig. 1: Relationship between uncertain variables in non-stochastic hypothesis testing based on uncertain measurements. If the realization of the uncertain measurement Y belongs to ⟦Y | p₀⟧ ∩ ⟦Y | p₁⟧, there is not enough evidence to accept or reject the null hypothesis p₀ or the alternative hypothesis p₁. However, if the realization of the uncertain measurement Y belongs to ⟦Y | p₀⟧ \ ⟦Y | p₁⟧ (respectively, ⟦Y | p₁⟧ \ ⟦Y | p₀⟧), we can confidently accept (reject) the null hypothesis p₀ and reject (accept) the alternative hypothesis p₁.

A test is a function T : ⟦Y⟧ → ⟦H⟧ = {p₀, p₁}. If T(Y) = p₁, the test rejects the null hypothesis in favour of the alternative hypothesis; however, if T(Y) = p₀, the test accepts the null hypothesis (and rejects the alternative hypothesis).
The set of all tests is given by C(⟦H⟧, ⟦Y⟧), which captures the set of all functions from ⟦Y⟧ to ⟦H⟧. Following [21], we say that a test T ∈ C(⟦H⟧, ⟦Y⟧) is correct at a particular realization of the uncertain variable Y, Y(ω) = y ∈ ⟦Y⟧, if ⟦H | ⟦X | y⟧⟧ = {T(y)}. The set of all outputs at which test T is correct is equal to ℵ(T) := {y ∈ ⟦Y⟧ : ⟦H | ⟦X | y⟧⟧ = {T(y)}}. Based on this definition of correctness, we can define a performance measure for tests [21]. If Y is a continuous uncertain variable, the performance is

P(T) := log_e(μ(ℵ(T))). (11)

Similarly, if Y is a discrete uncertain variable, the performance is equal to

P(T) := log(|ℵ(T)|). (12)

In the following result, Δ denotes the symmetric difference operator on sets, i.e., A Δ B = (A \ B) ∪ (B \ A).

Proposition 3 ([21]). The performance of any test T ∈ C(⟦H⟧, ⟦Y⟧) is bounded by P(T) ≤ log_e(μ(⟦Y | p₀⟧ Δ ⟦Y | p₁⟧)) if Y is a continuous uncertain variable, and by P(T) ≤ log(|⟦Y | p₀⟧ Δ ⟦Y | p₁⟧|) if Y is a discrete uncertain variable.

Note that, for any realization of the uncertain variable Y in the set ⟦Y | p₀⟧ ∩ ⟦Y | p₁⟧, there is not enough evidence to accept or reject either the null hypothesis or the alternative hypothesis. This is because these realizations can be caused by realizations of X that are consistent with the null hypothesis p₀ or by realizations of X that are consistent with the alternative hypothesis p₁. On the other hand, if the realization of the measurement Y is in the set (⟦Y | p₀⟧ \ ⟦Y | p₁⟧) ∪ (⟦Y | p₁⟧ \ ⟦Y | p₀⟧) = ⟦Y | p₀⟧ Δ ⟦Y | p₁⟧, we can confidently reject or accept the null hypothesis or the alternative hypothesis. Proposition 3 can be thought of as a non-stochastic equivalent of the Chernoff–Stein lemma; see, e.g., [59, Ch. 11] for randomized hypothesis testing.
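Proposition 3 can be verified numerically on a small finite example. In the sketch below (our own hypothetical setup), the measurement map g merges two realizations with different hypotheses, so the output y = 1 is ambiguous and no test can be correct there:

```python
import math

# Toy setup: X in {0,1,2,3}; null hypothesis p0 holds iff X < 2.
# Measurement Y = g(X); note g merges x = 1 and x = 2 into y = 1.
g = {0: 0, 1: 1, 2: 1, 3: 2}
h = lambda x: 'p0' if x < 2 else 'p1'

X_range = set(g)
Y_range = set(g.values())
inv = {y: {x for x in X_range if g[x] == y} for y in Y_range}  # cond. ranges

def correct_set(T):
    """ℵ(T): outputs where the hypothesis is determined and T matches it."""
    return {y for y in Y_range if {h(x) for x in inv[y]} == {T(y)}}

T = lambda y: 'p0' if y == 0 else 'p1'
P = math.log2(len(correct_set(T)))              # performance, eq. (12)

Y_p0 = {g[x] for x in X_range if h(x) == 'p0'}  # range of Y under p0
Y_p1 = {g[x] for x in X_range if h(x) == 'p1'}  # range of Y under p1
bound = math.log2(len(Y_p0 ^ Y_p1))             # Proposition 3 bound
assert P <= bound + 1e-12
```

Here ℵ(T) = {0, 2} and the symmetric difference is {0, 2} as well, so the bound P(T) ≤ 1 bit is met with equality.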
The size of the set ⟦Y | p₀⟧ Δ ⟦Y | p₁⟧ essentially captures the difference between the ranges ⟦Y | p₀⟧ and ⟦Y | p₁⟧, resembling the Kullback–Leibler divergence in a non-stochastic framework.

III. NOISELESS PRIVACY: DEFINITION, GUARANTEES, AND PROPERTIES
We model a private dataset by a realization of a vector-valued uncertain variable X : Ω → ℝⁿ, with n denoting the number of individuals whose data is in the dataset. The dataset is therefore of the form X(ω) = (X₁(ω), X₂(ω), …, X_n(ω)), where X_i(ω) ∈ ℝ is the data of the i-th individual. Evidently, each X_i : Ω → ℝ, 1 ≤ i ≤ n, is itself an uncertain variable. A data curator, in possession of the realization of the uncertain variable X, i.e., the private dataset X(ω), must return a response to a query f : ⟦X⟧ → ℝ^q for some q ∈ ℕ. The curator employs a mechanism M : ℝ^q → ℝ^q to generate a privacy-preserving response. Throughout this paper, M ∘ f is referred to as the mechanism of the curator. This is the same language used in the privacy literature, albeit without the presence of randomness [1], [60]. Therefore, the curator provides the response Y(ω) = M ∘ f(X(ω)). By definition, Y = M ∘ f ∘ X is an uncertain variable. In the remainder of this paper, we use the notation x₋ᵢ to denote (x₁, …, x_{i−1}, x_{i+1}, …, x_n) for vectors and X₋ᵢ to denote (X₁, …, X_{i−1}, X_{i+1}, …, X_n) for uncertain variables alike. In this notation, −i essentially refers to the set of all individuals except the i-th one.

Definition 1 (Noiseless Privacy). A mechanism M ∘ f is ε-noiselessly private, for ε > 0, if

|⟦Y | X₋ᵢ(ω) = x₋ᵢ⟧| ≤ 2^ε, ∀x₋ᵢ ∈ ⟦X₋ᵢ⟧, ∀i. (13)

Note that this definition is akin to a noiseless differential privacy. This is because, instead of bounding the information leakage as in information-theoretic privacy [16], the output realizations are restricted if one individual entry of the dataset changes. Note that, when the data of the i-th individual changes, the output can take all the values within the set ⟦Y | X₋ᵢ(ω) = x₋ᵢ⟧. If this set is not informative, i.e., it does not contain many elements, reverse engineering the data of the i-th individual with knowledge of X₋ᵢ(ω), even in the presence of side-channel information, is a difficult task. In what follows, we use non-stochastic information theory to establish the extent of the privacy guarantees of noiseless privacy. Similar to differential privacy, we can also define a local version of noiseless privacy.

Definition 2 (Local Noiseless Privacy). Assume that f_i : (x_i)ⁿ_{i=1} ↦ x_i for each i. A mechanism M is ε-locally noiselessly private if M ∘ f_i is ε-noiselessly private for all i.

A. Guarantee: Non-Stochastic Information Leakage

We define a function ψ_{i,v₋ᵢ} : ⟦X⟧ → ⟦X_i⟧ × {v₋ᵢ} replacing the realization of X_j, for all 1 ≤ j ≤ n except j = i, with given constants v_j, i.e., ψ_{i,v₋ᵢ}(X(ω)) = (v₁, …, v_{i−1}, X_i(ω), v_{i+1}, …, v_n). Let Ψ_i = {ψ_{i,v₋ᵢ} | v₋ᵢ ∈ ⟦X₋ᵢ⟧} be the set of all such functions for i ∈ {1, …, n}. The uncertain variable ψ_{i,v₋ᵢ} ∘ X becomes unrelated (in the sense of [17]) to X₋ᵢ for all ψ_{i,v₋ᵢ} ∈ Ψ_i. This definition allows us to measure the amount of information that the curator's mechanism leaks about the data of the i-th individual, X_i(ω). For a given ψ_{i,v₋ᵢ} ∈ Ψ_i, let us define Y = M ∘ f ∘ ψ_{i,v₋ᵢ} ∘ X. Now, the information between Y and X_i captures how much more information an adversary can extract from Y knowing the data of all the individuals except the i-th individual. This is because, here, we let the adversary select any possible ψ_{i,v₋ᵢ}.

Theorem 1 (Non-Stochastic Information vs Noiseless Privacy). Assume X is a discrete uncertain variable, Y = M ∘ f ∘ ψ_{i,v₋ᵢ} ∘ X, and M ∘ f is ε-noiselessly private. For any ψ_{i,v₋ᵢ} ∈ Ψ_i,

0 ≤ I⋆(X_i; Y) ≤ L_s0(X_i; Y) ≤ L(Y; X_i) ≤ ε. (14)

Proof.
See Appendix C.

Theorem 1 shows that, by reducing ε, we can reduce the amount of the leaked information about each individual. This makes sense: consider the case where ε = +∞. In this case, the curator can report the output of the query f(ψ_{i,v₋ᵢ} ∘ X(ω)) completely (i.e., M can be chosen to be the identity) and the adversary, knowing v₋ᵢ, can compute the data of the i-th individual X_i(ω) (at least if the adversary selects the query to be linear with a non-zero weight for the i-th individual). On the other hand, if ε = 0, the output becomes a constant that is independent of X_i(ω), and thus the adversary learns nothing new about the data of the i-th individual X_i(ω).

B. Guarantee: Zero-Error Capacity
Let us consider a memory-less noiselessly-private communication channel. This can be seen as a non-stochastic equivalent of the differentially-private communication channels in [61]. Let M ∘ f be ε-noiselessly private for some ε > 0. For any given sequence of mappings {ψᵗ_{i,v₋ᵢ}}_{t∈ℕ} with ψᵗ_{i,v₋ᵢ} ∈ Ψ_i, a memory-less ε-noiselessly-private channel maps any uncertain time series X = (X[k], …, X[1]) to an uncertain time series Y = (Y[k], …, Y[1]) such that Y[ℓ](ω) = M ∘ f ∘ ψ^ℓ_{i,v₋ᵢ}(X[ℓ](ω)) for all 1 ≤ ℓ ≤ k and k ∈ ℕ. This setup can be seen as a case in which the curator is reporting on a stream of data from the individuals. We can assume that an extremely strong adversary can set the realizations of the data of all individuals except the i-th individual. The capacity of the channel captures the amount of information that passes through an ε-noiselessly-private mechanism over time.

Theorem 2 (Zero-Error Capacity vs Noiseless Privacy). Assume that, for all k, X[k] is a discrete uncertain variable, Y[k] = M ∘ f ∘ ψᵏ_{i,v₋ᵢ} ∘ X[k], and M ∘ f is ε-noiselessly private. For any ψ_{i,v₋ᵢ} ∈ Ψ_i, the zero-error capacity of the memory-less ε-noiselessly-private channel is bounded by

C ≤ ε. (15)

Proof.
The rest of the proof follows from Proposition 2 and Theorem 1.
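For intuition about the zero-error capacity in (10), its single-use version can be computed by brute force from the conditional ranges. The sketch below (our own toy channel; all values hypothetical) finds the largest set of pairwise non-confusable inputs, whose log-size lower-bounds C, and checks it against the bound from the output-range count (measuring budgets in bits):

```python
import math
from itertools import combinations

# Toy channel, given by conditional output ranges (the set of outputs
# consistent with each input x).
cond = {0: {0}, 1: {0, 1}, 2: {1, 2}, 3: {2}}

def confusable(a, b):
    """Two inputs are confusable iff some output is consistent with both."""
    return bool(cond[a] & cond[b])

inputs = list(cond)
# Largest set of pairwise non-confusable inputs (brute force): its log2 size
# is the single-use zero-error rate, a lower bound on the capacity C in (10).
best = max((s for r in range(1, len(inputs) + 1)
            for s in combinations(inputs, r)
            if all(not confusable(a, b) for a, b in combinations(s, 2))),
           key=len)
C1 = math.log2(len(best))

# If the channel arises from an ε-noiselessly-private mechanism, the output
# range has at most 2^ε elements, so C ≤ ε (Theorem 2) must hold.
eps = math.log2(len(set().union(*cond.values())))
assert C1 <= eps + 1e-12
```

Here only {0, 2}, {0, 3}, and {1, 3} are non-confusable pairs, so the single-use rate is 1 bit, below the budget log₂ 3 ≈ 1.58 bits.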
C. Guarantee: Non-Stochastic Hypothesis Testing
In this part, our analysis is motivated by the definition of semantic security or indistinguishability under chosen-plaintext attack [62]. Assume that an adversary selects i ∈ {1, …, n} and x_i, x′_i ∈ ⟦X_i⟧, and provides this information to the curator. The curator uses an uncertain variable X̄_i : Ω → ⟦X̄_i⟧ := {x_i, x′_i} to construct the uncertain variable X̄ = (X₋ᵢ, X̄_i). Fix ψ_{i,v₋ᵢ} ∈ Ψ_i. The curator then generates a realization X̄(ω), computes Y(ω) = M ∘ f ∘ ψ_{i,v₋ᵢ}(X̄(ω)), and provides Y(ω) to the adversary. The adversary tests whether the realization of the data of individual i is equal to x_i or x′_i, knowing that it is bound to be one of those values and knowing that the value of the data of all the other individuals is fixed to v₋ᵢ. We define the hypothesis uncertain variable H using g_H : X̄(ω) ↦ H(ω) as

H(ω) = g_H(X̄(ω)) = { p₀, if X̄_i(ω) = x_i; p₁, if X̄_i(ω) = x′_i.

The following theorem bounds the performance of the adversary in performing its hypothesis test.
Theorem 3 (Hypothesis Testing vs Noiseless Privacy). Assume Y = M ∘ f ∘ ψ_{i,v₋ᵢ} ∘ X̄ and M ∘ f is ε-noiselessly private. Then, for any test T and any ψ_{i,v₋ᵢ} ∈ Ψ_i, the performance of the adversary is bounded by

P(T) ≤ ε. (16)

Proof.
See Appendix D.

Bounding the performance of a hypothesis-testing adversary is in essence close to identifiability [63], [64], for which privacy preservation relates to the potential of an adversary identifying the private data of individuals based on the received outputs.
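The distinguishing game behind this guarantee can be simulated directly. The sketch below (our own, with hypothetical mechanisms and candidate values) shows that against a zero-budget mechanism the adversary does no better than guessing, while against the identity it always wins:

```python
import random

def play(mech, trials=10000, seed=0):
    """Adversary names two candidates for x_i; curator releases mech(secret);
    adversary guesses which candidate was used. Returns the win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        xi, xip = 3, 7                    # adversary's chosen pair
        secret = rng.choice([xi, xip])
        y = mech(secret)
        if mech(xi) != mech(xip):         # outputs distinguish the pair
            guess = xi if y == mech(xi) else xip
        else:                             # outputs identical: coin flip
            guess = rng.choice([xi, xip])
        wins += (guess == secret)
    return wins / trials

identity = lambda x: x   # two distinct outputs on {3, 7}: budget 1 bit
constant = lambda x: 0   # one output: budget 0 bits
assert play(identity) == 1.0
assert 0.4 < play(constant) < 0.6         # near chance level
```

This matches Theorem 3: with budget 0, the symmetric difference of the output ranges is empty and no test is ever correct.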
D. Guarantee: Performance of Adversaries with Stochastic Priors
In this subsection, we briefly assume that the dataset is randomly distributed according to a probability density function ξ, i.e., for any Lebesgue-measurable set 𝒳 ⊆ ⟦X⟧, P{X ∈ 𝒳} = ∫_{x∈𝒳} ξ(x) dμ(x). We also consider an adversary that knows the realizations of all the entries of the dataset except the entry of the i-th individual. It constructs an estimate of the missing entry X_i using an estimator X̂_i(X₋ᵢ, M ∘ f(X)) based on its prior information X₋ᵢ and the response M ∘ f(X).

Theorem 4 (Stochastic Prior vs Noiseless Privacy). Assume that ρ = inf_{x∈⟦X⟧} ξ(x) > 0, M ∘ f is ε-noiselessly private, and f⁻¹ ∘ M⁻¹(y) is a connected set for any y ∈ ⟦Y⟧. For any p ∈ ℕ,

E{(X_i − X̂_i(X₋ᵢ, M ∘ f(X)))^p | X₋ᵢ} ≥ (ρ μ(⟦X_i⟧)^{p+1} / (p+2)) 2^{−ε(p+1)}. (17)

Proof.
See Appendix E.

The lower bound on the adversary's estimation performance in Theorem 4 is a decreasing function of ε. Therefore, as expected and in line with the earlier results, by decreasing ε, we can reduce the adversary's ability to infringe on the privacy of any individual in the dataset, even if the adversary knows the data of all the other individuals.

E. Guarantee: Stochastic Maximal Leakage
In this section, we recreate the stochastic framework for information leakage in [65] by again endowing all the uncertain variables with a measure. This way, we can define the maximal stochastic leakage from X to Y as

L_c(X → Y) = sup_{U − X − Y − Û} log( P{U = Û} / max_{u∈⟦U⟧} P_U(u) ),

where the supremum is taken over all random variables U and Û taking values in the same finite arbitrary alphabet. Here, U − X − Y − Û states that these variables form a Markov chain in the introduced order. It was shown in [65] that

L_c(X → Y) = log( Σ_{y∈⟦Y⟧} max_{x∈⟦X⟧ : P_X(x)>0} P_{Y|X}(y|x) ) = I_∞(X; Y).

Theorem 5 (Maximal Leakage vs Noiseless Privacy). Assume Y = M ∘ f ∘ ψ_{i,v₋ᵢ} ∘ X and M ∘ f is ε-noiselessly private. Then, L_c(X_i → Y) ≤ ε.

Proof:
Note that $\mathcal{L}_c(X\to Y)\le\log|[\![Y]\!]|$ because of [65, Lemma 1 & Example 6]. Furthermore, $|[\![Y]\!]|=|[\![\mathcal{M}\circ f\circ\psi_{i,v_{-i}}\circ X]\!]|=|[\![Y\,|\,X_{-i}(\omega)=v_{-i}]\!]|\le 2^{\epsilon}$, and thus $\mathcal{L}_c(X_i\to Y)\le\epsilon$.

Evidently, the amount of the leaked information is upper bounded by the privacy budget. Hence, by reducing the privacy budget, we can minimize the amount of the leaked information.
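As a concrete check of the closed form above, the maximal leakage of a channel can be computed directly from its transition matrix. The sketch below is illustrative (the channel matrices are arbitrary, and `maximal_leakage_bits` is a name introduced here, not from [65]):

```python
import numpy as np

def maximal_leakage_bits(P):
    """Maximal leakage I_inf(X;Y) = log2( sum_y max_{x: P_X(x)>0} P(y|x) )
    for a channel matrix P whose rows are P(.|x), assuming every input x
    has positive prior probability."""
    return float(np.log2(P.max(axis=0).sum()))

# A deterministic mechanism mapping 4 inputs onto 2 outputs: the leakage
# equals the log of the number of distinct outputs, here 1 bit.
P = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
print(maximal_leakage_bits(P))          # 1.0
print(maximal_leakage_bits(np.eye(4)))  # noiseless 4-ary channel: 2.0 bits
```

For any deterministic $\epsilon$-noiselessly-private mechanism, the number of columns (distinct outputs) is at most $2^{\epsilon}$, so the computed leakage is at most $\epsilon$ bits, matching Theorem 5.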
F. Property: Composition of Noiselessly-Private Mechanisms
Composition of differentially-private mechanisms [2], [66], [67] is an important result showing that the privacy budgets add up when reporting on multiple queries on the same private dataset. In what follows, we show that the same also applies to noiseless privacy.
Theorem 6 (Composition of Noiselessly-Private Mechanisms). Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be such that $\mathcal{M}_1\circ f_1$ is $\epsilon_1$-noiselessly private and $\mathcal{M}_2\circ f_2$ is $\epsilon_2$-noiselessly private. Then, the mechanism releasing $(\mathcal{M}_1\circ f_1(X),\mathcal{M}_2\circ f_2(X))$ is $(\epsilon_1+\epsilon_2)$-noiselessly private.

Proof. See Appendix F.
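The counting argument behind Theorem 6 can be illustrated with two toy quantized queries on the same scalar entry; the mechanisms below are hypothetical examples introduced here, not ones from the paper:

```python
import math

# M1∘f1 has 2 distinct outputs (budget eps1 = 1 bit);
# M2∘f2 has 4 distinct outputs (budget eps2 = 2 bits).
M1 = lambda x: min(int(x * 2), 1)
M2 = lambda x: min(int(x * 4), 3)

inputs = [i / 1000 for i in range(1001)]   # x_i sweeping [0, 1]
out1 = {M1(x) for x in inputs}
out2 = {M2(x) for x in inputs}
joint = {(M1(x), M2(x)) for x in inputs}   # the composed release

# The joint range sits inside the product of the individual ranges,
# so the log-cardinality budgets add up.
assert len(joint) <= len(out1) * len(out2)
print(math.log2(len(out1)), math.log2(len(out2)), math.log2(len(joint)))
```

Here the joint release takes 4 values, i.e., 2 bits, within the composed budget $\epsilon_1+\epsilon_2=3$ bits.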
G. Property: Post-Processing of Noiselessly-Private Mechanisms
Finally, an important property of differentially-private mechanisms and information-theoretic privacy is that the privacy guarantees are not weakened by post-processing privacy-preserving outputs [2]. In what follows, we show that this also holds for noiselessly-private mechanisms.
Theorem 7 (Post-Processing of Noiselessly-Private Mechanisms). Let $\mathcal{M}$ be such that $\mathcal{M}\circ f$ is $\epsilon$-noiselessly private. Then, $g\circ\mathcal{M}\circ f$ is also $\epsilon$-noiselessly private for any mapping $g$.

Proof. See Appendix G.

IV. Noiseless Privacy: Satisfaction
We can ensure noiseless privacy using non-stochastic approaches, such as binning or quantization. To do so, we first define linear quantizers.
Fig. 2: The timing of a game used for evaluating the ability of an adversary in guessing if the data of a particular individual belongs to a publicly-released noiselessly-private aggregate statistic.
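The game in Fig. 2 can be simulated end to end. The sketch below uses synthetic gamma-distributed consumption profiles and only the correlation-based adversary; the data model, parameters, and function names are illustrative stand-ins, not the Ausgrid data or the exact setup of [68]:

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(y, y_min, y_max, q):
    """Midpoint-reporting q-level linear quantizer (the Theorem 8 mechanism)."""
    w = (y_max - y_min) / q
    cell = np.minimum((y - y_min) // w, q - 1)
    return y_min + (cell + 0.5) * w

def advantage(n, q, trials=500, T=48):
    """One estimate of Adv = 2|P{j = j_hat} - 1/2| for a correlation-based
    adversary playing the Fig. 2 game on synthetic profiles."""
    wins = 0
    for _ in range(trials):
        profiles = rng.gamma(2.0, 1.0, size=(n + 1, T))  # i1, i2, and n-1 others
        j = rng.integers(2)                              # curator's secret pick
        y = quantize(np.vstack([profiles[j], profiles[2:]]).mean(axis=0),
                     0.0, 10.0, q)
        if np.all(y == y[0]):
            guess = rng.integers(2)   # constant output: adversary must guess
        else:
            guess = int(np.argmax([np.corrcoef(profiles[k], y)[0, 1]
                                   for k in (0, 1)]))
        wins += int(guess == j)
    return abs(2 * (wins / trials - 0.5))

for q in (2, 8, 64):  # coarser quantization corresponds to a smaller budget
    print(f"q={q:3d}: Adv ~ {advantage(n=4, q=q):.2f}")
```

Coarser quantization (smaller $q$, hence a smaller privacy budget) pushes the printed advantage toward zero, which is the qualitative behavior reported in Figure 3.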
Definition 3 (Linear Quantizer). A $q$-level quantizer $\mathcal{Q}:[x_{\min},x_{\max}]\to\{b_1,\dots,b_q\}$ is a piecewise constant function defined as
$$\mathcal{Q}(x)=\begin{cases} b_1, & x\in[x_1,x_2),\\ b_2, & x\in[x_2,x_3),\\ \;\vdots & \\ b_{q-1}, & x\in[x_{q-1},x_q),\\ b_q, & x\in[x_q,x_{q+1}],\end{cases}$$
where $(b_i)_{i=1}^q$ are distinct symbols and $x_1\le x_2\le\cdots\le x_{q+1}$ are real numbers such that $x_1=x_{\min}$, $x_{q+1}=x_{\max}$, and $x_{i+1}-x_i=(x_{\max}-x_{\min})/q$ for all $1\le i\le q$.

We can show that linear quantizers can achieve noiseless privacy for any query on private datasets. This is proved in the next theorem.
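A minimal implementation of Definition 3, and one way of selecting $q$ from a privacy budget for a mean query, can be sketched as follows. The query, all names, and the slightly conservative choice $q=\lfloor(2^{\epsilon}-1)(y_{\max}-y_{\min})/S(f)\rfloor$ (taken a bit below the limit in Theorem 8 to guard against cell-boundary alignment) are assumptions of this sketch:

```python
import math

def linear_quantizer(x, x_min, x_max, q):
    """A q-level linear quantizer per Definition 3: q equal-width cells on
    [x_min, x_max]; the symbol b_i is represented by the cell index i."""
    if not (x_min <= x <= x_max):
        raise ValueError("input outside quantizer range")
    i = int((x - x_min) * q / (x_max - x_min))
    return min(i, q - 1)  # x_max belongs to the last cell [x_q, x_{q+1}]

assert linear_quantizer(0.49, 0.0, 1.0, 4) == 1
assert linear_quantizer(1.0, 0.0, 1.0, 4) == 3

# Mean query over n entries in [0, 1]: sensitivity S(f) = 1/n, output
# range [0, 1].  Conservative level count from the privacy budget eps.
eps, n = 1, 10
q = math.floor((2 ** eps - 1) * 1.0 * n)
others = [0.5] * (n - 1)                 # the adversary's known entries
outputs = {linear_quantizer((sum(others) + xi / 100) / n, 0.0, 1.0, q)
           for xi in range(101)}         # sweep the unknown entry x_i
assert len(outputs) <= 2 ** eps          # at most 2**eps distinct outputs
print(q, sorted(outputs))
```

Sweeping the unknown entry over its whole range changes the released value among at most $2^{\epsilon}$ quantization cells, which is exactly the noiseless-privacy requirement.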
Theorem 8. Let $[\![f(X)\,|\,X]\!]\subseteq[y_{\min},y_{\max}]$. Define the sensitivity of query $f$ as
$$S(f):=\sup_{x_{-i}\in[\![X_{-i}]\!]}\mu(f([\![X_i]\!]\times\{x_{-i}\}))=\sup_{x_{-i}\in[\![X_{-i}]\!]}\;\sup_{x_i,x_i'\in[\![X_i]\!]}|f(x_i',x_{-i})-f(x_i,x_{-i})|.$$
The mechanism $\mathcal{M}\circ f$ is $\epsilon$-noiselessly private if $\mathcal{M}$ is a $q$-level quantizer with
$$q\le \frac{2^{\epsilon}(y_{\max}-y_{\min})}{S(f)}.$$

Proof.
See Appendix H.

V. Experiments

A. Energy Data: Reporting Aggregate
For this part, we use a publicly available dataset from Ausgrid containing half-hourly smart meter measurements for 300 randomly-selected homes with rooftop solar systems over the period from 1 July 2010 to 30 June 2013. In this paper, we use the data over July 2012 to June 2013.

Let $x_{i,t}$ denote the consumption of house $i$ at day $t$. We consider reporting the aggregate statistic
$$y_t=f((x_{i,t})_{i=1}^n)=\frac{1}{n}(x_{1,t}+\cdots+x_{n,t}),\quad\forall t.$$
Here, $f$ denotes the query. We particularly use the mechanism in Theorem 8 to report noiselessly-private outputs. In this experiment, we test the ability of an adversary to infer whether a particular household has contributed to the aggregate, as in [68]. We use a game, as in [68], [69], to evaluate the ability of the adversary. The setup of the game is summarized in Figure 2. At first, the adversary selects two households $i_1,i_2$. The curator selects one of those households $i_j$, $j\in\{1,2\}$, uniformly at random. It also selects $n-1$ additional households. Then it reports the privacy-preserving aggregate outputs. Based on the reported output, the adversary guesses the participating household $i_{\hat{j}}$. The adversary's success or advantage is then defined as $\mathrm{Adv}:=2|\mathbb{P}\{j=\hat{j}\}-1/2|$. A small $\mathrm{Adv}$ means that the adversary is only as successful as random guessing, and a large $\mathrm{Adv}$ implies that the adversary can reliably recognize the household participating in the aggregate.

Similar to [68], we use three adversary policies. The first one is based on correlation: the adversary selects $j\in\{1,2\}$ based on the correlation between $(x_{i_j,t})_t$ and $(y_t)_t$. The second policy is based on mean square error: the adversary selects $j\in\{1,2\}$ by minimizing the square error $\|(x_{i_j,t})_t-(y_t)_t\|^2$. Finally, the last policy uses the relative peaks of each load profile $(x_{i_1,t})_t$, $(x_{i_2,t})_t$, and $(y_t)_t$: the adversary selects $j\in\{1,2\}$ based on the most common peaks between $(x_{i_1,t})_t$ and $(y_t)_t$, or $(x_{i_2,t})_t$ and $(y_t)_t$.

Figure 3 illustrates the advantage of the adversary $\mathrm{Adv}$ when using the correlation-based policy (left), the mean square error policy (center), and the peak-based policy (right). As $\epsilon$ grows larger, the adversary's advantage tends toward the non-private case in [68]. Clearly, even for moderate $\epsilon$ when considering small groups, the adversary's advantage is very small. This is not the case for non-private outputs, as observed in [68]. For instance, for small groups and moderate privacy budgets, such as $n=4$ and $\epsilon=2$ or $n=8$ and $\epsilon=3$, the adversary's advantage is negligible (almost zero). This shows that combining noiseless privacy with aggregation is an excellent tool for providing privacy to individuals.

B. Energy Data: Reporting Single Consumption
Non-intrusive load monitoring provides tools for extracting appliance-specific energy consumption statistics from the smart meter readings of a household and is one of the privacy concerns behind releasing energy data [70], [71]. In this section, we use Theorem 8 to report the high-frequency energy consumption of a household in a privacy-preserving manner using local noiseless privacy. We then proceed to examine the effect of the privacy budget on an adversary performing non-intrusive load monitoring.

We use the low-frequency data from the first house in the REDD database [72], which contains the consumption of various appliances in the house every 3-4 seconds. This data, in conjunction with the consumption of the entire house, is used for training and verification of a non-intrusive load monitoring algorithm. The consumption of the entire house is measured every second. The data is for the period of April 23-May 21, 2011. The part of the data prior to April 30th is used for training and the rest for validation purposes. We select the top 5 appliances in energy consumption for disaggregation purposes, namely, fridge, microwave, socket (in the kitchen), light, and dish washer. For non-intrusive load monitoring, we have used the NILMTK toolbox in Python [73] with a frequently-utilized combinatorial optimization method. We report the success of the non-intrusive load monitoring using the f-score.

Figure 4 shows the f-score of the non-intrusive load monitoring algorithm based on combinatorial optimization versus the privacy budget. As we can see, the f-score degrades rapidly as $\epsilon$ decreases. This means an adversary would not be able to reliably identify the appliances that are used within the household. This illustrates the power of local noiseless privacy in reporting the energy consumption of households for analysis while protecting the privacy of the households.

C. Transport Data: Reporting Individual Source-Destinations
Finally, we use New York City taxi cab trips for 2014. We use the first million trips and focus on trips that begin and end within the New York City area shown in Figure 5. Here, we consider reporting the start-end points of the taxi rides in a locally noiselessly-private manner. In this subsection, we again use Theorem 8 to report the start-end points of taxi rides in New York City in a privacy-preserving manner using local noiseless privacy. In this case, for each $\epsilon$, we split the latitude and the longitude into $2^{\epsilon/2}$ bins each. Therefore, following Theorem 6, the total privacy budget for the reported outputs is $\epsilon$.

Figure 6 illustrates the portion of unique start-end points of taxi rides versus the privacy budget. As we can see, the portion of unique start-end points is negligible for small $\epsilon$. This means an adversary would not be able to attribute a specific taxi ride to an individual, thus protecting the privacy of contributing individuals.

VI. Discussions and Future Work
In this paper, we defined noiseless privacy as a non-stochastic rival to differential privacy, requiring that the outputs of the mechanism can attain only a few values while the data of an individual varies. We proved that noiselessly-private mechanisms admit a composition theorem and that post-processing does not weaken their privacy guarantees. We proved that quantization operators can ensure noiseless privacy. We finally illustrated the privacy merits of noiseless privacy and local noiseless privacy using multiple datasets in energy and transport.

http://redd.csail.mit.edu/
https://github.com/nilmtk/nilmtk

Fig. 3: The advantage of the adversary Adv when using the correlation-based policy (left), the mean square error policy (center), and the peak-based policy (right).

Fig. 4: f-score of the non-intrusive load monitoring algorithm based on combinatorial optimization versus the privacy budget.

Fig. 5: Map of New York City: latitude in [40.49°, 40.92°] and longitude in [-74.27°, -73.68°].

Fig. 6: Portion of taxi rides with unique start-end points versus the privacy budget.
References

[1] C. Dwork, F. McSherry, K. Nissim, and A. Smith, "Calibrating noise to sensitivity in private data analysis," in Theory of Cryptography Conference, pp. 265-284, Springer, 2006.
[2] C. Dwork and A. Roth, "The algorithmic foundations of differential privacy," Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3-4, pp. 211-407, 2014.
[3] L. Sankar, S. R. Rajagopalan, and H. V. Poor, "Utility-privacy tradeoffs in databases: An information-theoretic approach," IEEE Transactions on Information Forensics and Security, vol. 8, no. 6, pp. 838-852, 2013.
[4] H. Yamamoto, "A source coding problem for sources with additional outputs to keep secret from the receiver or wiretappers," IEEE Transactions on Information Theory, vol. 29, no. 6, pp. 918-923, 1983.
[5] P. Samarati, "Protecting respondents' identities in microdata release," IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 6, pp. 1010-1027, 2001.
[6] L. Sweeney, "k-anonymity: A model for protecting privacy," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 5, pp. 557-570, 2002.
[7] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, "ℓ-diversity: privacy beyond k-anonymity," in 22nd International Conference on Data Engineering (ICDE'06), pp. 24-24, 2006.
[8] R. Bild, K. A. Kuhn, and F. Prasser, "SafePub: A truthful data anonymization algorithm with strong privacy guarantees," Proceedings on Privacy Enhancing Technologies, vol. 2018, no. 1, pp. 67-87, 2018.
[9] G. Poulis, A. Gkoulalas-Divanis, G. Loukides, S. Skiadopoulos, and C. Tryfonopoulos, SECRETA: A Tool for Anonymizing Relational, Transaction and RT-Datasets, pp. 83-109. Springer International Publishing, 2015.
[10] R. Bhaskar, A. Bhowmick, V. Goyal, S. Laxman, and A. Thakurta, "Noiseless database privacy," in International Conference on the Theory and Application of Cryptology and Information Security, pp. 215-232, Springer, 2011.
[11] S. U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani, "Towards robustness in query auditing," in Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 151-162, VLDB Endowment, 2006.
[12] J. Bambauer, K. Muralidhar, and R. Sarathy, "Fool's gold: an illustrated critique of differential privacy," Vanderbilt Journal of Entertainment & Technology Law, vol. 16, p. 701, 2013.
[13] F. K. Dankar and K. El Emam, "Practicing differential privacy in health care: A review," Transactions on Data Privacy, vol. 6, no. 1, pp. 35-67, 2013.
[14] J. Mervis, "Researchers object to census privacy measure," Science, vol. 363, no. 6423, pp. 114-114, 2019.
[15] F. Farokhi, J. Milosevic, and H. Sandberg, "Optimal state estimation with measurements corrupted by Laplace noise," in Decision and Control (CDC), 2016 IEEE 55th Conference on, pp. 302-307, IEEE, 2016.
[16] F. Farokhi, "Development and analysis of deterministic privacy-preserving policies using non-stochastic information theory," IEEE Transactions on Information Forensics and Security, vol. 14, pp. 2567-2576, Oct 2019.
[17] G. N. Nair, "A nonstochastic information theory for communication and state estimation," IEEE Transactions on Automatic Control, vol. 58, no. 6, pp. 1497-1510, 2013.
[18] P. Duan, F. Yang, S. L. Shah, and T. Chen, "Transfer zero-entropy and its application for capturing cause and effect relationship between variables," IEEE Transactions on Control Systems Technology, vol. 23, no. 3, pp. 855-867, 2014.
[19] T. J. Lim and M. Franceschetti, "A deterministic view on the capacity of bandlimited functions," pp. 691-698, IEEE, 2014.
[20] C. E. Shannon, "The zero error capacity of a noisy channel," IRE Transactions on Information Theory, vol. 2, no. 3, pp. 8-19, 1956.
[21] F. Farokhi, "Non-stochastic hypothesis testing with application to privacy against hypothesis-testing adversary," in Decision and Control (CDC), 2019 IEEE 58th Conference on, 2019. https://arxiv.org/abs/1904.07377.
[22] A. Narayanan and V. Shmatikov, "Robust de-anonymization of large sparse datasets," in Security and Privacy, 2008. SP 2008. IEEE Symposium on, pp. 111-125, IEEE, 2008.
[23] J. Su, A. Shukla, S. Goel, and A. Narayanan, "De-anonymizing web browsing data with social networks," in Proceedings of the 26th International Conference on World Wide Web, pp. 1261-1269, 2017.
[24] Y.-A. De Montjoye, C. A. Hidalgo, M. Verleysen, and V. D. Blondel, "Unique in the crowd: The privacy bounds of human mobility," Scientific Reports, vol. 3, p. 1376, 2013.
[25] R. A. Popa, A. J. Blumberg, H. Balakrishnan, and F. H. Li, "Privacy and accountability for location-based aggregate statistics," in Proceedings of the 18th ACM Conference on Computer and Communications Security, pp. 653-666, ACM, 2011.
[26] F. Li, B. Luo, and P. Liu, "Secure information aggregation for smart grids using homomorphic encryption," pp. 327-332, IEEE, 2010.
[27] Y. Lindell and B. Pinkas, "Privacy preserving data mining," in Advances in Cryptology (CRYPTO 2000) (M. Bellare, ed.), pp. 36-54, Springer Berlin Heidelberg, 2000.
[28] W. Du, Y. S. Han, and S. Chen, "Privacy-preserving multivariate statistical analysis: Linear regression and classification," in Proceedings of the 2004 SIAM International Conference on Data Mining, pp. 222-233, SIAM, 2004.
[29] J. Vaidya and C. Clifton, "Privacy preserving association rule mining in vertically partitioned data," in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 639-644, ACM, 2002.
[30] J. Vaidya, M. Kantarcıoğlu, and C. Clifton, "Privacy-preserving naive Bayes classification," The VLDB Journal, vol. 17, no. 4, pp. 879-898, 2008.
[31] G. Jagannathan and R. N. Wright, "Privacy-preserving distributed k-means clustering over arbitrarily partitioned data," in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 593-599, ACM, 2005.
[32] C. Dwork, "Differential privacy: A survey of results," in Theory and Applications of Models of Computation: 5th International Conference, TAMC 2008, Xi'an, China, April 25-29, 2008. Proceedings (M. Agrawal, D. Du, Z. Duan, and A. Li, eds.), pp. 1-19, Berlin, Heidelberg: Springer Berlin Heidelberg, 2008.
[33] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, "Local privacy and statistical minimax rates," in Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on, pp. 429-438, IEEE, 2013.
[34] P. Kairouz, S. Oh, and P. Viswanath, "Extremal mechanisms for local differential privacy," in Advances in Neural Information Processing Systems, pp. 2879-2887, 2014.
[35] A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber, "Privacy: Theory meets practice on the map," in Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pp. 277-286, IEEE Computer Society, 2008.
[36] R. Hall, A. Rinaldo, and L. Wasserman, "Random differential privacy," Journal of Privacy and Confidentiality, vol. 4, no. 2, pp. 43-59, 2012.
[37] A. Padakandla, P. Kumar, and W. Szpankowski, "Preserving privacy and fidelity via Ehrhart theory," pp. 696-700, IEEE, 2018.
[38] A. D. Wyner, "The wire-tap channel," Bell System Technical Journal, vol. 54, no. 8, pp. 1355-1387, 1975.
[39] T. Courtade, "Information masking and amplification: The source coding setting," in Proceedings of the IEEE International Symposium on Information Theory (ISIT), pp. 189-193, 2012.
[40] H. Yamamoto, "A rate-distortion problem for a communication system with a secondary decoder to be hindered," IEEE Transactions on Information Theory, vol. 34, no. 4, pp. 835-842, 1988.
[41] M. S. Alvim, M. E. Andrés, K. Chatzikokolakis, and C. Palamidessi, "On the relation between differential privacy and quantitative information flow," in Automata, Languages and Programming (L. Aceto, M. Henzinger, and J. Sgall, eds.), vol. 6756 of Lecture Notes in Computer Science, pp. 60-76, Springer Berlin Heidelberg, 2011.
[42] F. du Pin Calmon and N. Fawaz, "Privacy against statistical inference," in Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1401-1408, 2012.
[43] F. Farokhi, H. Sandberg, I. Shames, and M. Cantoni, "Quadratic Gaussian privacy games," in Proceedings of the 54th IEEE Conference on Decision and Control, pp. 4505-4510, 2015.
[44] M. J. Wainwright, M. I. Jordan, and J. C. Duchi, "Privacy aware learning," in Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 1430-1438, 2012.
[45] Y. Liang, H. V. Poor, and S. Shamai, "Information theoretic security," Foundations and Trends in Communications and Information Theory, vol. 5, no. 4-5, pp. 355-580, 2009.
[46] L. Lai, S.-W. Ho, and H. V. Poor, "Privacy-security trade-offs in biometric security systems, Part I: Single use case," IEEE Transactions on Information Forensics and Security, vol. 6, no. 1, pp. 122-139, 2011.
[47] Z. Li and T. Oechtering, "Privacy on hypothesis testing in smart grids," in IEEE Information Theory Workshop (ITW) 2015, Jeju, Korea, Oct. 11-15, 2015, pp. 337-341, IEEE, 2015.
[48] G. Bassi, M. Skoglund, and P. Piantanida, "Lossy communication subject to statistical parameter privacy," pp. 1031-1035, IEEE, 2018.
[49] F. Farokhi and H. Sandberg, "Fisher information as a measure of privacy: Preserving privacy of households with smart meters using batteries," IEEE Transactions on Smart Grid, vol. 9, no. 5, pp. 4726-4734, 2018.
[50] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, no. 3, pp. 379-423, 1948.
[51] R. V. L. Hartley, "Transmission of information," Bell System Technical Journal, vol. 7, no. 3, pp. 535-563, 1928.
[52] A. N. Kolmogorov and V. M. Tikhomirov, "ε-entropy and ε-capacity of sets in function spaces," Uspekhi Matematicheskikh Nauk, vol. 14, no. 2, pp. 3-86, 1959. English translation: American Mathematical Society Translations, series 2, vol. 17, pp. 277-364.
[53] A. Rényi, "On measures of entropy and information," in Proc. of the Fourth Berkeley Symp. on Math. Statist. and Prob., vol. 1, pp. 547-561, 1961.
[54] D. Jagerman, "ε-entropy and approximation of bandlimited functions," SIAM Journal on Applied Mathematics, vol. 17, no. 2, pp. 362-377, 1969.
[55] G. N. Nair, "A nonstochastic information theory for feedback," in Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, pp. 1343-1348, IEEE, 2012.
[56] P. Duan, F. Yang, S. L. Shah, and T. Chen, "Transfer zero-entropy and its application for capturing cause and effect relationship between variables," IEEE Transactions on Control Systems Technology, vol. 23, no. 3, pp. 855-867, 2015.
[57] M. Wiese, K. H. Johansson, T. J. Oechtering, P. Papadimitratos, H. Sandberg, and M. Skoglund, "Uncertain wiretap channels and secure estimation," in Information Theory (ISIT), 2016 IEEE International Symposium on, pp. 2004-2008, IEEE, 2016.
[58] H. Shingin and Y. Ohta, "Disturbance rejection with information constraints: Performance limitations of a scalar system for bounded and Gaussian disturbances," Automatica, vol. 48, no. 6, pp. 1111-1116, 2012.
[59] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley, 2012.
[60] K. Chatzikokolakis, M. E. Andrés, N. E. Bordenabe, and C. Palamidessi, "Broadening the scope of differential privacy using metrics," in International Symposium on Privacy Enhancing Technologies Symposium, pp. 82-102, Springer, 2013.
[61] G. Barthe and B. Köpf, "Information-theoretic bounds for differentially private mechanisms," pp. 191-204, IEEE, 2011.
[62] J. Katz and Y. Lindell, Introduction to Modern Cryptography, Second Edition. Chapman & Hall/CRC Cryptography and Network Security Series, Taylor & Francis, 2 ed., 2014.
[63] W. Wang, L. Ying, and J. Zhang, "On the relation between identifiability, differential privacy, and mutual-information privacy," IEEE Transactions on Information Theory, vol. 62, no. 9, pp. 5018-5029, 2016.
[64] A. Bkakria, N. Cuppens-Boulahia, and F. Cuppens, "Linking differential identifiability with differential privacy," in International Conference on Information and Communications Security, pp. 232-247, Springer, 2018.
[65] I. Issa, A. B. Wagner, and S. Kamath, "An operational approach to information leakage," arXiv preprint arXiv:1807.07878, 2018.
[66] C. Dwork, G. N. Rothblum, and S. Vadhan, "Boosting and differential privacy," in 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pp. 51-60, IEEE, 2010.
[67] P. Kairouz, S. Oh, and P. Viswanath, "The composition theorem for differential privacy," IEEE Transactions on Information Theory, vol. 63, no. 6, pp. 4037-4049, 2017.
[68] N. Buescher, S. Boukoros, S. Bauregger, and S. Katzenbeisser, "Two is not enough: Privacy assessment of aggregation schemes in smart metering," Proceedings on Privacy Enhancing Technologies, vol. 2017, no. 4, pp. 198-214, 2017.
[69] J.-M. Bohli, C. Sorge, and O. Ugus, "A privacy model for smart metering," pp. 1-5, IEEE, 2010.
[70] A. Zoha, A. Gluhak, M. A. Imran, and S. Rajasegarar, "Non-intrusive load monitoring approaches for disaggregated energy sensing: A survey," Sensors, vol. 12, no. 12, pp. 16838-16866, 2012.
[71] O. Parson, S. Ghosh, M. Weal, and A. Rogers, "Non-intrusive load monitoring using prior models of general appliance types," in Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.
[72] J. Z. Kolter and M. J. Johnson, "REDD: A public data set for energy disaggregation research," in Workshop on Data Mining Applications in Sustainability (SIGKDD), vol. 25, pp. 59-62, 2011.
[73] N. Batra, J. Kelly, O. Parson, H. Dutta, W. Knottenbelt, A. Rogers, A. Singh, and M. Srivastava, "NILMTK: An open source toolkit for non-intrusive load monitoring," in Fifth International Conference on Future Energy Systems (ACM e-Energy), 2014.

Appendix A: Proof of Proposition 1

Let $m$ be the cardinality of the overlap partition, i.e., $I_\star(X;Y)=\log m$ and $[\![X|Y]\!]_\star=\{P_1,\dots,P_m\}$. Each $P_i$ is non-empty; therefore, there exists at least one $x$ such that $x\in P_i$. Note that $x$ must also belong to $[\![X|y]\!]$ for any $y\in[\![Y|x]\!]$. We prove that $[\![X|y]\!]\subseteq P_i$. Assume that this is not the case. Then there exists an element $x'\in[\![X|y]\!]$, distinct from $x$, that belongs to another $P_j$, $j\ne i$, because $\{P_1,\dots,P_m\}$ covers $[\![X]\!]$. We know that $P_i$ and $P_j$ are $[\![X|Y]\!]$-overlap isolated by the definition of the partition $[\![X|Y]\!]_\star$. On the other hand, $x$ and $x'$ are evidently $[\![X|Y]\!]$-overlap connected (by the definition of $[\![X|Y]\!]$-overlap connectedness, since both belong to $[\![X|y]\!]$). This is a contradiction, and thus $[\![X|y]\!]$ must be a subset of $P_i$. This results in $|[\![X|y]\!]|\le|P_i|$ and hence
$$\min_{y\in[\![Y]\!]}|[\![X|y]\!]|\le|P_i|,\quad\forall i\in\{1,\dots,m\}. \quad (18)$$
On the other hand, $\bigcup_{i=1}^m P_i=[\![X]\!]$ because $\{P_1,\dots,P_m\}$ is a partition of $[\![X]\!]$. Because of the non-overlapping nature of the sets $\{P_1,\dots,P_m\}$, we get
$$\sum_{i=1}^m|P_i|=|[\![X]\!]|. \quad (19)$$
Combining (18) and (19) results in $m\min_{y\in[\![Y]\!]}|[\![X|y]\!]|\le|[\![X]\!]|$, which implies that $I_\star(X;Y)\le L(X;Y)$. Similarly, we can show that $I_\star(Y;X)\le L(Y;X)$. By the symmetry of the maximin information [17], i.e., $I_\star(X;Y)=I_\star(Y;X)$, we get that $I_\star(X;Y)\le L(X;Y)$ and $I_\star(X;Y)\le L(Y;X)$. This concludes the proof.

Appendix B: Proof of Proposition

Let $\aleph(y[k],\dots,y[1])$ denote the statement $Y[k](\omega)=y[k],\dots,Y[1](\omega)=y[1]$. Note that
$$\begin{aligned} L(X[k],\dots,X[1];&\,Y[k],\dots,Y[1])\\ &= \log\max_{(y[i])_{i=1}^k\in[\![Y]\!]^k}\frac{|[\![X[k],\dots,X[1]]\!]|}{|[\![X[k],\dots,X[1]\,|\,\aleph(y[k],\dots,y[1])]\!]|}\\ &= \log\frac{|[\![X[k],\dots,X[1]]\!]|}{\min_{(y[i])_{i=1}^k\in[\![Y]\!]^k}|[\![X[k],\dots,X[1]\,|\,\aleph(y[k],\dots,y[1])]\!]|}\\ &= \log\frac{\prod_{\ell=1}^k|[\![X[\ell]]\!]|}{\prod_{\ell=1}^k\min_{y[\ell]\in[\![Y]\!]}|[\![X[\ell]\,|\,Y[\ell](\omega)=y[\ell]]\!]|}\\ &= \sum_{\ell=1}^k\log\frac{|[\![X[\ell]]\!]|}{\min_{y[\ell]\in[\![Y]\!]}|[\![X[\ell]\,|\,Y[\ell](\omega)=y[\ell]]\!]|}\\ &= \sum_{\ell=1}^k L(X[\ell];Y[\ell]) = k\,L(X[1];Y[1]). \end{aligned}$$
Therefore, $L(X[k],\dots,X[1];Y[k],\dots,Y[1])/k=L(X[1];Y[1])$. Similarly, we can show that $L(Y[k],\dots,Y[1];X[k],\dots,X[1])/k=L(Y[1];X[1])$. Combining these relations with Proposition 1 in this paper and Theorem 4.1 in [17] proves the result.
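The block-wise additivity just proved can be checked on a toy deterministic map; the `leakage` function below is an illustrative reconstruction of the quantity $L$ (with log base 2), not code from the paper:

```python
import math
from itertools import product

def leakage(domain, g):
    """L(X;Y) = log2( |[[X]]| / min_y |[[X|y]]| ) for a deterministic
    map y = g(x) over a finite domain."""
    preimages = {}
    for x in domain:
        preimages.setdefault(g(x), set()).add(x)
    return math.log2(len(domain) / min(len(s) for s in preimages.values()))

g = lambda x: x // 2          # 4 inputs onto 2 outputs, preimages of size 2
dom = [0, 1, 2, 3]
L1 = leakage(dom, g)          # log2(4/2) = 1 bit per block

# k independent blocks observed through k independent uses of g:
# the leakage is exactly k times the single-block leakage.
k = 3
Lk = leakage(list(product(dom, repeat=k)),
             lambda xs: tuple(g(x) for x in xs))
assert Lk == k * L1
print(L1, Lk)  # 1.0 3.0
```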
Appendix C: Proof of Proposition

Note that
$$L_{s0}(X_i;Y)\le L(Y;X_i)=\log\frac{|[\![Y]\!]|}{\min_{x_i\in[\![X_i]\!]}|[\![Y\,|\,X_i(\omega)=x_i]\!]|}=\log|[\![Y]\!]|,$$
where the last equality follows from the fact that the realization of $Y$ can be uniquely determined from the realization of $X_i$, i.e., $[\![Y\,|\,X_i(\omega)=x_i]\!]$ is a singleton. Furthermore,
$$|[\![Y]\!]|=\Bigg|\bigcup_{x_i'\in[\![X_i]\!]}[\![Y\,|\,X_i(\omega)=x_i',X_{-i}(\omega)=v_{-i}]\!]\Bigg|=|[\![Y\,|\,X_{-i}(\omega)=v_{-i}]\!]|\le 2^{\epsilon}.$$
Therefore, $L_{s0}(X_i;Y)\le L(Y;X_i)\le\epsilon$. The rest follows from Proposition 1.

Appendix D: Proof of Theorem

Since $Y$ is a discrete uncertain variable, for any test $T$, Theorem 3 states that the performance of the adversary is bounded from above by
$$P(T)\le\log\big(|[\![Y\,|\,X_i(\omega)=x_i]\!]\,\Delta\,[\![Y\,|\,X_i(\omega)=x_i']\!]|\big).$$
Now, note that $[\![Y\,|\,X_i(\omega)=x_i]\!]\,\Delta\,[\![Y\,|\,X_i(\omega)=x_i']\!]\subseteq[\![Y\,|\,X_{-i}(\omega)=v_{-i}]\!]$. Therefore,
$$|[\![Y\,|\,X_i(\omega)=x_i]\!]\,\Delta\,[\![Y\,|\,X_i(\omega)=x_i']\!]|\le|[\![Y\,|\,X_{-i}(\omega)=v_{-i}]\!]|\le 2^{\epsilon},$$
and thus $P(T)\le\epsilon$. This concludes the proof.

Appendix E: Proof of Theorem 4

Define $g(\cdot)$ such that $g(X_i)=\mathcal{M}\circ f(X)$. There must exist $y\in[\![Y]\!]$ such that $\mu(g^{-1}(y))\ge\mu([\![X_i]\!])2^{-\epsilon}$. Otherwise, $\mu(g^{-1}(y))<\mu([\![X_i]\!])2^{-\epsilon}$ for all $y\in[\![Y]\!]$, and thus, since $|[\![Y]\!]|\le 2^{\epsilon}$ by $\epsilon$-noiseless privacy,
$$\mu([\![X_i]\!])=\mu\Bigg(\bigcup_{y\in[\![Y]\!]}g^{-1}(y)\Bigg)=\sum_{y\in[\![Y]\!]}\mu\big(g^{-1}(y)\big)<\sum_{y\in[\![Y]\!]}\mu([\![X_i]\!])2^{-\epsilon}\le\mu([\![X_i]\!]),$$
which is a contradiction. Hence, we get
$$\mathbb{E}\{(X_i-\hat{X}_i(X_{-i},\mathcal{M}\circ f(X)))^p\,|\,X_{-i}\}\ge\rho\int_{g^{-1}(y)}(x_i-\hat{X}_i(X_{-i},\mathcal{M}\circ f(X)))^p\,\mathrm{d}\mu(x_i),$$
where $\rho=\inf_{x\in[\![X]\!]}\xi(x)$. Since $g^{-1}(y)$ is a connected set, there must exist $\underline{x}_i,\overline{x}_i$ such that the closure of $g^{-1}(y)$ is equal to $[\underline{x}_i,\overline{x}_i]$.
Hence, noting that the estimate is constant over $g^{-1}(y)$ and that this constant is penalized least at the midpoint $(\underline{x}_i+\overline{x}_i)/2$, we get
$$\begin{aligned}\int_{g^{-1}(y)}(x_i-\hat{X}_i(X_{-i},\mathcal{M}\circ f(X)))^p\,\mathrm{d}\mu(x_i)&=\int_{\underline{x}_i}^{\overline{x}_i}(x_i-\hat{X}_i(X_{-i},\mathcal{M}\circ f(X)))^p\,\mathrm{d}\mu(x_i)\\ &\ge\int_{\underline{x}_i}^{\overline{x}_i}\big(z-(\underline{x}_i+\overline{x}_i)/2\big)^p\,\mathrm{d}z\\ &=\frac{(\overline{x}_i-\underline{x}_i)^{p+1}}{2^{p}(p+1)}\ge\frac{\mu(g^{-1}(y))^{p+1}}{2^{p}(p+1)}\ge\frac{\mu([\![X_i]\!])^{p+1}\,2^{-\epsilon(p+1)}}{2^{p}(p+1)}.\end{aligned}$$
This concludes the proof.
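The $p=2$ case of the resulting bound (mean square error at least $2^{-3\epsilon}/12$ for a uniform $X_i$ on $[0,1]$, so $\rho=\mu([\![X_i]\!])=1$) can be sanity-checked empirically. The midpoint-reporting quantizer below is an illustrative mechanism chosen for this sketch, not one prescribed by the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200_000)  # X_i uniform on [0, 1]

def mechanism_mse(eps):
    """eps-noiselessly-private mechanism: a 2**eps-level quantizer on [0, 1]
    reporting cell midpoints; the report itself is the adversary's best
    estimate of X_i, so this is the adversary's mean square error."""
    q = 2 ** eps
    y = (np.minimum(np.floor(x * q), q - 1) + 0.5) / q
    return float(np.mean((x - y) ** 2))

p = 2  # Theorem 4 with p = 2 (mean square error)
for eps in (1, 2, 3):
    bound = 2.0 ** (-eps * (p + 1)) / (2 ** p * (p + 1))
    mse = mechanism_mse(eps)
    print(f"eps={eps}: MSE={mse:.5f} >= bound={bound:.5f}")
    assert mse >= bound
```

The empirical error is $2^{-2\epsilon}/12$ here, comfortably above the theorem's $2^{-3\epsilon}/12$ floor for every budget.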
Appendix F: Proof of Theorem 6

Note that
$$[\![(\mathcal{M}_1\circ f_1(X),\mathcal{M}_2\circ f_2(X))\,|\,X_{-i}(\omega)=x_{-i}]\!]\subseteq[\![\mathcal{M}_1\circ f_1(X)\,|\,X_{-i}(\omega)=x_{-i}]\!]\times[\![\mathcal{M}_2\circ f_2(X)\,|\,X_{-i}(\omega)=x_{-i}]\!],$$
and as a result
$$\mu([\![(\mathcal{M}_1\circ f_1(X),\mathcal{M}_2\circ f_2(X))\,|\,X_{-i}(\omega)=x_{-i}]\!])\le\mu([\![\mathcal{M}_1\circ f_1(X)\,|\,X_{-i}(\omega)=x_{-i}]\!])\,\mu([\![\mathcal{M}_2\circ f_2(X)\,|\,X_{-i}(\omega)=x_{-i}]\!]).$$
Hence,
$$\log\big(\mu([\![(\mathcal{M}_1\circ f_1(X),\mathcal{M}_2\circ f_2(X))\,|\,X_{-i}(\omega)=x_{-i}]\!])\big)\le\log\big(\mu([\![\mathcal{M}_1\circ f_1(X)\,|\,X_{-i}(\omega)=x_{-i}]\!])\big)+\log\big(\mu([\![\mathcal{M}_2\circ f_2(X)\,|\,X_{-i}(\omega)=x_{-i}]\!])\big)\le\epsilon_1+\epsilon_2.$$
The proof for discrete uncertain variables follows the same approach.
Appendix G: Proof of Theorem 7

Note that $|[\![g\circ\mathcal{M}\circ f(X)\,|\,X_{-i}(\omega)=x_{-i}]\!]|\le|[\![\mathcal{M}\circ f(X)\,|\,X_{-i}(\omega)=x_{-i}]\!]|\le 2^{\epsilon}$ for any $x_{-i}$, because applying a mapping $g$ cannot increase the number of distinct values.

Appendix H: Proof of Theorem 8

For any $x_{-i}\in[\![X_{-i}]\!]$, due to the continuity of $f$, we know that $f([\![X_i]\!]\times\{x_{-i}\})=f\circ\psi_{i,x_{-i}}([\![X_i]\!])\subseteq[y_{\min},y_{\max}]$ is a connected set (because $[\![X_i]\!]$ is connected). Therefore, if $\mathcal{M}$ is a $q$-level quantizer, $[\![Y\,|\,X_{-i}(\omega)=x_{-i}]\!]=\mathcal{M}\circ f([\![X_i]\!]\times\{x_{-i}\})$ can contain at most $q\,\mu(f([\![X_i]\!]\times\{x_{-i}\}))/(y_{\max}-y_{\min})$ points. Therefore, $|[\![Y\,|\,X_{-i}(\omega)=x_{-i}]\!]|\le qS(f)/(y_{\max}-y_{\min})\le 2^{\epsilon}$, which concludes the proof.