Generalized Gaussian Mechanism for Differential Privacy

Abstract

Assessment of disclosure risk is of paramount importance in the research and applications of data privacy techniques. The concept of differential privacy (DP) formalizes privacy in probabilistic terms and provides a robust concept for privacy protection without making assumptions about the background knowledge of adversaries. Practical applications of DP involve development of DP mechanisms to release results at a pre-specified privacy budget. In this paper, we generalize the widely used Laplace mechanism to the family of generalized Gaussian (GG) mechanisms based on the $l_p$ global sensitivity of statistical queries. We explore the theoretical requirement for the GG mechanism to reach DP at prespecified privacy parameters, and investigate the connections and differences between the GG mechanism and the Exponential mechanism based on the GG distribution. We also present a lower bound on the scale parameter of the Gaussian mechanism of $(\epsilon, \delta)$-probabilistic DP as a special case of the GG mechanism, and compare the statistical utility of the sanitized results in terms of the tail probability and dispersion in the Gaussian and Laplace mechanisms. Lastly, we apply the GG mechanism in 3 experiments (the mildew, Czech, and adult data), compare the accuracy of sanitized results via the $l_1$ distance and Kullback-Leibler divergence, and examine how sanitization affects the prediction power of a classifier constructed with the sanitized data in the adult experiment.


Fang Liu ∗

Keywords: (probabilistic) differential privacy, $l_p$ global sensitivity, privacy budget, Laplace mechanism, Gaussian mechanism

When releasing information publicly from a database or sharing data with collaborators, data collectors are always concerned about exposing sensitive personal information of individuals who contribute to the data. Even with key identifiers removed, data users may still identify a participant in a data set, for example via linkage with public information.
Differential privacy (DP) provides a strong privacy guarantee for data release without making assumptions about the background knowledge or behavior of data users [1, 2, 3]. For a given privacy budget, information released via a differentially private mechanism guarantees that no additional personal information about an individual in the data can be inferred, regardless of how much background information data users already possess about the individual. DP has spurred a great amount of work on the development of differentially private mechanisms to release results and data, including the Laplace mechanism [1], the Exponential mechanism [4, 5], the median mechanism [6], the multiplicative weights mechanism [7], the geometric mechanism [8], the staircase mechanism [9], the Gaussian mechanism [10], and applications of DP for private and secure inference in a Bayesian setting [11], among others.

∗ Fang Liu is Associate Professor in the Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556 (E-mail: [email protected]). The work is supported by the NSF Grant 1546373 and the University of Notre Dame Faculty Research Support Program Initiation Grant.

In this paper, we unify the Laplace mechanism and the Gaussian mechanism in the framework of a general family, referred to as the generalized Gaussian (GG) mechanism. The GG mechanism is based on the $l_p$ global sensitivity (GS) of queries, a generalization of the $l_1$ GS. We demonstrate the nonexistence of a scale parameter that would lead to a GG mechanism of pure $\epsilon$-DP in the case of $p \neq 1$ if the results to be released are unbounded, but suggest the GG mechanism of $(\epsilon, \delta)$-probabilistic DP (pDP) as an alternative in such cases. For bounded data we introduce the truncated GG mechanism and the boundary inflated truncated GG mechanism, both of which satisfy pure $\epsilon$-DP. We investigate the connections between the GG mechanism and the Exponential mechanism when the utility function in the latter is based on the Minkowski distance, and establish the relationship between the sensitivity of the utility function in the Exponential mechanism and the $l_p$ GS of queries. We then take a closer look at the Gaussian mechanism (the GG mechanism of order 2), and derive a lower bound on the scale parameter that delivers $(\epsilon, \delta)$-pDP. The bound is tighter than the bound to satisfy $(\epsilon, \delta)$-approximate DP (aDP) in the Gaussian mechanism [10], implying that less noise is injected into the sanitized results. We compare the utility of sanitized results, in terms of the tail probability and dispersion or mean squared errors (MSE), from independent applications of the Gaussian mechanism and the Laplace mechanism. Finally, we run 3 experiments on the mildew, Czech, and adult data, respectively, and sanitize the count data via the Laplace mechanism and the Gaussian mechanisms of $(\epsilon, \delta)$-pDP and $(\epsilon, \delta)$-aDP. We compare the accuracy of sanitized results in terms of the $l_1$ distance and Kullback-Leibler divergence from the original results, and examine how sanitization affects the prediction accuracy of support vector machines constructed with the sanitized data in the adult experiment.

The rest of the paper is organized as follows. Section 2 defines the $l_p$ GS and presents the GG mechanism of $(\epsilon, \delta)$-pDP, the truncated GG mechanism, and the boundary inflated truncated GG mechanism that satisfy pure $\epsilon$-DP. It also connects and differentiates between the GG mechanisms and the Exponential mechanism when the utility function in the latter is based on the Minkowski distance. Section 3 takes a close look at the Gaussian mechanism of $(\epsilon, \delta)$-pDP, and compares it with the Gaussian mechanism of $(\epsilon, \delta)$-aDP. It also compares the tail probability and the dispersion of the noises injected via the Gaussian mechanism of $(\epsilon, \delta)$-pDP and the Laplace mechanism. Section 4 presents the findings from the 3 experiments. Concluding remarks are given in Section 5.

DP was proposed and formulated in Dwork [12] and Dwork et al. [1]. A perturbation algorithm $\mathcal{R}$ gives $\epsilon$-differential privacy if for all data sets $(\mathbf{x}, \mathbf{x}')$ that differ by only one individual ($d(\mathbf{x}, \mathbf{x}') = 1$), and all possible query results $Q \subseteq \mathcal{T}$ to query $\mathbf{s}$ ($\mathcal{T}$ denotes the output range of $\mathcal{R}$),

$$\left| \log\left( \frac{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x})) \in Q)}{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}')) \in Q)} \right) \right| \le \epsilon, \tag{1}$$

where $\epsilon > 0$ is the privacy budget. While $\mathbf{s}$ refers to queries about data $\mathbf{x}$ and $\mathbf{x}'$, we also use it to denote the query results (unless stated otherwise, the domain of the query results is the set of all real numbers). $d(\mathbf{x}, \mathbf{x}') = 1$ is often defined in one of two ways in the DP community: $\mathbf{x}$ and $\mathbf{x}'$ are of the same size and differ in exactly one record (row) in at least one attribute (column); or $\mathbf{x}$ is exactly the same as $\mathbf{x}'$ except that it has one less (more) record. Mathematically, Eqn (1) states that the probabilities of obtaining the same query result perturbed via $\mathcal{R}$ are roughly the same regardless of whether the query is sent to $\mathbf{x}$ or $\mathbf{x}'$. In layman's terms, DP implies that the chance an individual will be identified based on the perturbed query result is very low, since the query result would be about the same with or without the individual in the data. The degree of "roughly the same" is determined by the privacy budget $\epsilon$. The lower $\epsilon$ is, the more similar the probabilities of obtaining the same query results from $\mathbf{x}$ and $\mathbf{x}'$ are. DP provides a strong and robust privacy guarantee in the sense that it does not assume anything regarding the background knowledge or the behavior of data users.

In addition to the "pure" $\epsilon$-DP in Eqn (1), there are softer versions of DP, including the $(\epsilon, \delta)$-approximate DP (aDP) [13], the $(\epsilon, \delta)$-probabilistic DP (pDP) [14], the $(\epsilon, \delta)$-random DP (rDP) [15], and the $(\epsilon, \tau)$-concentrated DP (cDP) [16]. In all the relaxed versions of DP, one additional parameter is employed to characterize the amount of relaxation on top of the privacy budget $\epsilon$. Both the $(\epsilon, \delta)$-aDP and the $(\epsilon, \delta)$-pDP reduce to $\epsilon$-DP when $\delta = 0$, but they differ in the interpretation of $\delta$. In $(\epsilon, \delta)$-aDP,

$$\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x})) \in Q) \le e^{\epsilon} \Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}')) \in Q) + \delta; \tag{2}$$

while a perturbation algorithm $\mathcal{R}$ satisfies $(\epsilon, \delta)$-pDP if

$$\Pr\left( \left| \log\left( \frac{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x})) \in Q)}{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}')) \in Q)} \right) \right| > \epsilon \right) \le \delta; \tag{3}$$

that is, the probability of $\mathcal{R}$ generating an output belonging to the disclosure set is bounded by $\delta$, where the disclosure set contains all the possible outputs that leak information for a given privacy budget $\epsilon$. The fact that probabilities are within $[0, 1]$ puts constraints on the values of $\epsilon$, $\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}')) \in Q)$, and $\delta$ in the framework of $(\epsilon, \delta)$-aDP. By contrast, $(\epsilon, \delta)$-pDP seems to be less constrained and more intuitive with its probabilistic flavor. When $\delta$ is small, $(\epsilon, \delta)$-aDP and $(\epsilon, \delta)$-pDP are roughly the same. The $(\epsilon, \delta)$-rDP is also a probabilistic relaxation of DP, but it differs from $(\epsilon, \delta)$-pDP in that the probabilistic relaxation is with respect to data generation. In $(\epsilon, \tau)$-cDP, the privacy cost is treated as a random variable with an expectation of $\epsilon$, and the probability that the actual cost exceeds $\epsilon$ by more than $a$ is bounded by $e^{-(a/\tau)^2/2}$. The $(\epsilon, \tau)$-cDP, similar to the $(\epsilon, \delta)$-pDP, relaxes the satisfaction of DP with respect to $\mathcal{R}$ and is broader in scope.

$l_p$ global sensitivity

Definition 1.

For all $(\mathbf{x}, \mathbf{x}')$ with $d(\mathbf{x}, \mathbf{x}') = 1$, the $l_p$-global sensitivity (GS) of query $\mathbf{s}$ is

$$\Delta_p = \max_{\mathbf{x}, \mathbf{x}':\, d(\mathbf{x}, \mathbf{x}') = 1} \| \mathbf{s}(\mathbf{x}) - \mathbf{s}(\mathbf{x}') \|_p = \max_{\mathbf{x}, \mathbf{x}':\, d(\mathbf{x}, \mathbf{x}') = 1} \left( \sum_{k=1}^{r} | s_k(\mathbf{x}) - s_k(\mathbf{x}') |^p \right)^{1/p} \text{ for integer } p > 0. \tag{4}$$

In layman's terms, $\Delta_p$ is the maximum difference, measured by the Minkowski distance, in query results $\mathbf{s}$ between two neighboring data sets $\mathbf{x}, \mathbf{x}'$ with $d(\mathbf{x}, \mathbf{x}') = 1$. The sensitivity is "global" since it is defined over all possible data sets and all possible ways that $\mathbf{x}$ and $\mathbf{x}'$ differ by one. The higher $\Delta_p$ is, the more disclosure risk there is for the individuals from releasing the original query results $\mathbf{s}$. The $l_p$ GS is a key concept in the construction of the generalized Gaussian mechanism in Section 2.

The $l_p$ GS is a generalization of the $l_1$ GS [1, 12] and the $l_2$ GS [10]. The "difference" between $\mathbf{s}(\mathbf{x})$ and $\mathbf{s}(\mathbf{x}')$ measured by $\Delta_1$ is the largest among all $\Delta_p$ for $p \ge 1$, since $\| \mathbf{s} \|_{p+a} \le \| \mathbf{s} \|_p$ for any real-valued vector $\mathbf{s}$ and $a \ge 0$. In addition, $\Delta_1$ is also the most "sensitive" measure, given that its rate of change with respect to any $s_k$ is the largest among all $p \ge 1$. When $\mathbf{s}$ is a scalar, $\Delta_p = \Delta_1$ for all $p > 0$. When $\mathbf{s}$ is multi-dimensional, an easy upper bound for the $l_1$ GS $\Delta_1$ is $\sum_{k=1}^{r} \Delta_{1,k}$, the sum of the $l_1$ GS of each element $k$ in $\mathbf{s}$, by the triangle inequality. Lemma 2 gives an upper bound on $\Delta_p$ for a general $p$ that includes $p = 1$ as a special case (the proof is provided in Appendix A).

Lemma 2. $\left( \sum_{k=1}^{r} \Delta_{1,k}^p \right)^{1/p}$ is an upper bound for $\Delta_p$, where $\Delta_{1,k}$ is the $l_1$ GS of $s_k$.

The upper bound given in Lemma 2 can be conservative in cases where the change from $\mathbf{x}$ to $\mathbf{x}'$ does not necessarily alter every entry in the multidimensional $\mathbf{s}$. For example, the $l_p$ GS of releasing a histogram with $r$ bins is 1 (if $d(\mathbf{x}, \mathbf{x}') = 1$ is defined as $\mathbf{x}'$ having one record less or more than $\mathbf{x}$). In other words, the GS is not $r^{1/p}$ even though there are $r$ counts in the released histogram, but is the same as in releasing a single cell, because removing one record only alters the count in a single bin.

It is obvious that each element $s_k$ in $\mathbf{s}$ for $k = 1, \ldots, r$ needs to be bounded to obtain a finite $\Delta_p$. The most extreme case is that the change from $\mathbf{x}$ to $\mathbf{x}'$ makes $s_k$ jump from one extreme to the other, implying the range of $s_k$ can be used as an upper bound for $\Delta_{1,k}$, which, combined with Lemma 2, leads to the following claim.

Claim 3.

Denote the bounds of statistic $s_k$ by $[c_{k0}, c_{k1}]$, both of which are finite. The GS $\Delta_{1,k} \le c_{k1} - c_{k0}$, and the GS for $\mathbf{s} = \{ s_k \}_{k=1,\ldots,r}$ is $\Delta_p \le \left( \sum_{k=1}^{r} (c_{k1} - c_{k0})^p \right)^{1/p}$.

The GG mechanism is defined based on the GG distribution $\mathrm{GG}(\mu, b, p)$ with location parameter $\mu$, scale parameter $b > 0$, and shape parameter $p > 0$. The probability density function (pdf) is

$$f(x \mid \mu, b, p) = \frac{p}{2 b \, \Gamma(p^{-1})} \exp\left\{ -\left( \frac{|x - \mu|}{b} \right)^p \right\}.$$

The mean and variance of $x$ are $\mu$ and $b^2 \Gamma(3/p) / \Gamma(1/p)$, respectively, where $\Gamma(t) = \int_0^\infty x^{t-1} e^{-x} dx$ is the Gamma function. When $p = 1$, the GG distribution is the Laplace distribution with mean $\mu$ and variance $2b^2$; when $p = 2$, the GG distribution becomes the Gaussian distribution with mean $\mu$ and variance $b^2/2$. Figure 1 presents some examples of the GG distributions at different $p$.

Figure 1: Density of GG distributions

All the distributions in the left plot have the same scale $b$; the shape changes as $p$ increases, and the Laplace distribution ($p = 1$) looks very different from the rest. When the variance is the same (the right plot), the Laplace distribution is the most likely to generate values that are close to the mean, followed by the Gaussian distribution ($p = 2$).

$\epsilon$-DP

We first examine the GG mechanism of $\epsilon$-DP with the domain for $s^*_k$ defined on $(-\infty, \infty)$ for $k = 1, \ldots, r$. $\mathbf{s}$ needs to be bounded to calculate the $l_p$ GS, but the bounding requirement does not necessarily go into formulating the GG distribution for the GG mechanism in the first place. If bounding of $\mathbf{s}^*$ is necessary, it can be incorporated in a post-hoc manner after $\mathbf{s}^*$ is generated from the GG mechanism. A well-known example is the Laplace mechanism. It employs a Laplace distribution defined on $(-\infty, \infty)$, though its scale parameter $b = \Delta_1 / \epsilon$ requires $\mathbf{s}$ to be bounded for $\Delta_1$ to be calculated.

Eqn (5) presents the GG distribution from which sanitized $\mathbf{s}^*$ would be generated to satisfy $\epsilon$-DP, assuming $b$ exists:

$$f(\mathbf{s}^*) \propto e^{-(\| \mathbf{s}^* - \mathbf{s} \|_p / b)^p} = \prod_{k=1}^{r} \exp\{ -(|s^*_k - s_k| / b)^p \} \propto \prod_{k=1}^{r} \frac{p}{2 b \, \Gamma(p^{-1})} \exp\{ -(|s^*_k - s_k| / b)^p \} = \prod_{k=1}^{r} \mathrm{GG}(s_k, b, p). \tag{5}$$

Claim 4.

There does not exist a lower bound on $b$ for the GG distribution in Eqn (5) when $p \neq 1$ that generates $\mathbf{s}^*$ with $\epsilon$-DP. When $p = 1$, the lower bound on $b$ that leads to $\epsilon$-DP is $\epsilon^{-1} \Delta_1$.

Appendix B lists the detailed steps that lead to Claim 4. In brief, to achieve $\epsilon$-DP, we need

$$b^{-p} \left( \sum_{k=1}^{r} \sum_{j=1}^{p-1} \binom{p}{j} |s^*_k - s_k|^{p-j} \Delta_{1,k}^{j} + \Delta_p^p \right) \le \epsilon \quad \text{(Eqn B.4)}.$$

However, this inequality depends on the random GG noise $e_k = s^*_k - s_k$ for $k = 1, \ldots, r$, the support of which is $(-\infty, \infty)^r$. In other words, there does not exist a solution for $b$ that is free of the random noise, unless $p = 1$, in which case the inequality no longer involves the error terms and the GG mechanism reduces to the familiar Laplace mechanism of $\epsilon$-DP. We propose two approaches to fix the problem and achieve DP through the GG mechanism. The first approach leverages the bounding requirement for $\mathbf{s}$ and builds the requirement into the GG distribution in the first place to generate $\mathbf{s}^*$ with $\epsilon$-DP, assuming that $\mathbf{s}^*$ and $\mathbf{s}$ share the same bounded domain (Section 2.5). The second approach still uses the GG distribution in Eqn (5) to sanitize $\mathbf{s}$, but satisfies only $(\epsilon, \delta)$-pDP instead of the pure $\epsilon$-DP (Section 2.6). The sanitized $\mathbf{s}^*$ can be bounded in a post-hoc manner, as needed.

$\epsilon$-DP

Definition 5.

Denote the bounds on query result $\mathbf{s}$ by $[c_{k0}, c_{k1}]_{k=1,\ldots,r}$. For integer $p \ge 1$, the truncated GG mechanism of order $p$ generates $\mathbf{s}^* \in [c_{k0}, c_{k1}]_{k=1,\ldots,r}$ with $\epsilon$-DP by drawing from the truncated GG distribution

$$f(\mathbf{s}^* \mid c_{k0} \le s^*_k \le c_{k1}, \forall k = 1, \ldots, r) = \prod_{k=1}^{r} \frac{p \exp\{ -(|s^*_k - s_k| / b)^p \}}{2 b \, \Gamma(p^{-1}) A(s_k, b, p)} \tag{6}$$

with scale parameter

$$b \ge \left( \epsilon^{-1} \left( \sum_{k=1}^{r} \sum_{j=1}^{p-1} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^{j} + \Delta_p^p \right) \right)^{1/p}, \tag{7}$$

where $A(s_k, b, p) = \Pr(c_{k0} \le s^*_k \le c_{k1}; s_k, b, p) = (2\Gamma(p^{-1}))^{-1} \left( \gamma[p^{-1}, ((c_{k1} - s_k)/b)^p] + \gamma[p^{-1}, ((s_k - c_{k0})/b)^p] \right)$ ($\gamma$ is the lower incomplete gamma function), $\Delta_{1,k}$ is the $l_1$ GS of $s_k$, and $\Delta_p$ is the $l_p$ GS of $\mathbf{s}$.

The proof of $\epsilon$-DP of the truncated GG mechanism is given in Appendix C. The truncated GG mechanism perturbs each element in $\mathbf{s}$ independently; thus Eqn (6) involves the product of $r$ independent density functions. Though the closed interval $[c_{k0}, c_{k1}]$ is used to denote the bounds on $s_k$, Definition 5 remains the same regardless of whether the interval is closed, open, or half-closed, since the GG distribution is defined on a continuous domain. If $s_k$ is discrete in nature, such as counts, post-hoc rounding of the perturbed $s^*_k$ can be applied. The lower bound on $b$ in Eqn (7) depends on $\Delta_p$. We may apply Lemma 2 and set $\Delta_p^p$ at its upper bound $\sum_{k=1}^{r} \Delta_{1,k}^p$ to obtain a less tight bound on $b$:

$$b \ge \left( \epsilon^{-1} \left( \sum_{k=1}^{r} \sum_{j=1}^{p} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^{j} \right) \right)^{1/p}. \tag{8}$$

Definition 6.

Denote the bounds on query result $s_k$ by $[c_{k0}, c_{k1}]$ for $k = 1, \ldots, r$. For integer $p \ge 1$, the $p$th order boundary inflated truncated (BIT) GG mechanism sanitizes $\mathbf{s}$ with $\epsilon$-DP by drawing perturbed $\mathbf{s}^*$ from the following piecewise distribution

$$f(\mathbf{s}^* \mid c_{k0} \le s^*_k \le c_{k1}, \forall k = 1, \ldots, r) = \prod_{k=1}^{r} \left\{ p_k^{I(s^*_k = c_{k0})} \, q_k^{I(s^*_k = c_{k1})} \left( \frac{p \exp\{ -(|s^*_k - s_k| / b)^p \}}{2 b \, \Gamma(p^{-1})} \right)^{I(c_{k0} < s^*_k < c_{k1})} \right\}, \tag{9}$$

where $p_k = \Pr(s^*_k < c_{k0}; s_k, p, b) = \frac{1}{2} - \gamma(p^{-1}, ((s_k - c_{k0})/b)^p)(2\Gamma(p^{-1}))^{-1}$, $q_k = \Pr(s^*_k > c_{k1}; s_k, p, b) = \frac{1}{2} - \gamma(p^{-1}, ((c_{k1} - s_k)/b)^p)(2\Gamma(p^{-1}))^{-1}$, $\gamma$ is the lower incomplete gamma function, and $\Gamma$ is the gamma function; and $I()$ is the indicator function that equals 1 if the argument in the parentheses is true, and 0 otherwise.

In brief, the BIT GG distribution replaces out-of-bound values with the boundary values and keeps the within-bound values as is, leading to a piecewise distribution. This is in contrast to the truncated GG distribution, which throws away out-of-bound values. The challenge with perturbing $\mathbf{s}$ directly via Eqn (9) lies in solving for a lower bound on $b$ that satisfies $\epsilon$-DP from

$$\left| \log \frac{f(\mathbf{s}^* \mid c_{k0} \le s^*_k \le c_{k1}, \forall k = 1, \ldots, r)}{f(\mathbf{s}'^* \mid c_{k0} \le s^*_k \le c_{k1}, \forall k = 1, \ldots, r)} \right| \le \epsilon, \tag{10}$$

where $\mathbf{s}^* = \{ s^*_k \}$ and $\mathbf{s}'^* = \{ s'^*_k \}$ are the sanitized results from data $\mathbf{x}$ and $\mathbf{x}'$ with $d(\mathbf{x}, \mathbf{x}') = 1$, respectively. The lower bounds given in Eqns (7) and (8) can be used when the output subset $Q$ is a subset of $(c_{10}, c_{11}) \times \cdots \times (c_{r0}, c_{r1})$ (open intervals). However, when $Q$ is $\{ s^*_k = c_{k0} \ \forall k = 1, \ldots, r \}$ or $\{ s^*_k = c_{k1} \ \forall k = 1, \ldots, r \}$, there are no analytical solutions for $b$ in Eqn (11) or Eqn (12), respectively:

$$\left| \log \prod_{k=1}^{r} \frac{1/2 - \gamma(p^{-1}, ((s_k - c_{k0})/b)^p)(2\Gamma(p^{-1}))^{-1}}{1/2 - \gamma(p^{-1}, ((s'_k - c_{k0})/b)^p)(2\Gamma(p^{-1}))^{-1}} \right| \le \epsilon \tag{11}$$

$$\left| \log \prod_{k=1}^{r} \frac{1/2 - \gamma(p^{-1}, ((c_{k1} - s_k)/b)^p)(2\Gamma(p^{-1}))^{-1}}{1/2 - \gamma(p^{-1}, ((c_{k1} - s'_k)/b)^p)(2\Gamma(p^{-1}))^{-1}} \right| \le \epsilon. \tag{12}$$

The most challenging situation is when $Q$ is a mixture set of $(c_{k0}, c_{k1})$, $c_{k0}$, and $c_{k1}$ for different $k = 1, \ldots, r$. In summary, the BIT GG mechanism is not very appealing from a practical standpoint.

2.6 GG mechanism of $(\epsilon, \delta)$-pDP

The second approach to obtain a lower bound on the scale parameter $b$ for the GG distribution in Eqn (5) when $p \ge 2$ is to find a $b$ that satisfies $(\epsilon, \delta)$-pDP.

Corollary 7.

If the scale parameter $b$ in the GG distribution in Eqn (5) satisfies

$$\Pr\left( \sum_{k=1}^{r} \sum_{j=1}^{p-1} \binom{p}{j} |s^*_k - s_k|^{p-j} \Delta_{1,k}^{j} > b^p \epsilon - \Delta_p^p \right) < \delta, \tag{13}$$

then the GG mechanism satisfies $(\epsilon, \delta)$-pDP when $p \ge 2$.

Instead of requiring the left-hand side of Eqn (B.4) to be $\le \epsilon$ with certainty (i.e., with 100% probability), we attach a probability to achieving the inequality, that is, $\Pr(\text{Eqn (B.4)} < \epsilon) > 1 - \delta$, leading to Eqn (13). The $(\epsilon, \delta)$-pDP does not apply to the Laplace mechanism ($p = 1$), at least in the framework laid out in Corollary 7. When $p = 1$, Eqn (B.1) becomes $b^{-1} \sum_{k=1}^{r} \big| |e_k| - |e_k + d_k| \big| \le b^{-1} \sum_{k=1}^{r} |d_k| \le b^{-1} \Delta_1$, which does not involve the random variable $\mathbf{s}^*$; in other words, as long as $b^{-1} \Delta_1 \le \epsilon$, the pure $\epsilon$-DP is guaranteed.

Corollary 7 does not list a closed-form solution for $b$, as it is likely that only numerical solutions exist in most cases. Given that $s^*_k$ is independent across $k = 1, \ldots, r$, $a_k = \sum_{j=1}^{p-1} \binom{p}{j} |s^*_k - s_k|^{p-j} \Delta_{1,k}^{j}$, a function of $s^*_k$, is also independent across $k$. Therefore, the problem becomes searching for a lower bound on $b$ such that the probability of a sum of $r$ independent variables $(a_1, \ldots, a_r)$ exceeding $b^p \epsilon - \Delta_p^p$ is smaller than $\delta$. If there exists a closed-form distribution function for $\sum_{k=1}^{r} a_k$, an exact solution for $b$ can be obtained. When $p = 2$, an analytical lower bound on $b$ can be obtained (see Section 3); when $p > 2$, a distribution function can be obtained for each individual term $\binom{p}{j} |s^*_k - s_k|^{p-j} \Delta_{1,k}^{j}$, but not for $a_k$ or $\sum_{k=1}^{r} a_k$ at the current stage. A relatively simple case is when the elements of statistics $\mathbf{s}$ are calculated on disjoint subsets of the original data, so that removing one individual from the data only affects one element out of $r$ and $\Delta_1 = \Delta_p = \Delta_{1,k'}$, leading to Corollary 8.

Corollary 8.

When all $r$ elements in $\mathbf{s}$ are based on disjoint subsets of the data, the lower bound on $b$ satisfies $\Pr\left( \sum_{j=1}^{p} \binom{p}{j} |s^*_{k'} - s_{k'}|^{p-j} \Delta_{1,k'}^{j} > b^p \epsilon \right) < \delta$, where $k' = \mathrm{argmax}_k \Delta_{1,k}$.

When the query is a histogram, $\Delta_1 = \Delta_p = \Delta_{1,k'} = 1$, and the lower bound on $b$ for $(\epsilon, \delta)$-pDP can be derived from $\Pr\left( \sum_{j=1}^{p} \binom{p}{j} |e_{k'}|^{p-j} > b^p \epsilon \right) < \delta$. The proof of Corollary 8 is trivial. With disjoint queries, only one element in $\mathbf{s}$ is affected by changing from $\mathbf{x}$ to $\mathbf{x}'$ while the other $r - 1$ elements satisfy $s_k(\mathbf{x}) = s_k(\mathbf{x}')$, and Eqn (B.2) $= b^{-p} \sum_{j=1}^{p} \binom{p}{j} |e_{k'}|^{p-j} |d_{k'}|^{j} \le b^{-p} \sum_{j=1}^{p} \binom{p}{j} |e_{k'}|^{p-j} \Delta_{1,k'}^{j}$.

Numerical approaches can be applied to obtain a lower bound on $b$ when closed-form solutions are difficult to attain. Figure 2 depicts the lower bounds on $b$ at different $p$ and $(\epsilon, \delta)$ obtained via the Monte Carlo approach. We set $\Delta_{1,k}$ at 1, [value illegible], and 0.05 for $k = 1, 2, 3$, respectively, and applied Lemma 2 to obtain an upper bound on $\Delta_p$ for a given $p$ value. As expected, the lower bound on $b$ increases with decreased $\epsilon$ (lower privacy budget) and decreased $\delta$ (reduced chance of failing the pure $\epsilon$-DP). The results also suggest that $b$ increases with $p$ to maintain $(\epsilon, \delta)$-pDP in the examined scenarios.

Figure 2: Numerical lower bound on $b$ from Corollary 7 (x-axis: $p$; y-axis: lower bound for $b$; curves for $(\epsilon, \delta)$ = (0.1, 0.01), (0.5, 0.01), (0.1, 0.05), and (0.5, 0.05))

$\mathbf{s}^*$ sampled from the GG mechanism of $(\epsilon, \delta)$-pDP in Eqn (5), once $b$ is determined analytically or numerically, ranges over $(-\infty, \infty)$. To bound $\mathbf{s}^*$, it is straightforward to apply a post-processing procedure such as truncation or the boundary inflated truncation (BIT) procedure [17]. The truncation procedure throws away the out-of-bounds values and only keeps those within bounds, while the BIT procedure sets the out-of-bounds values at the bounds. If the bounds are noninformative, in the sense that the bounds are global and do not contain any data-specific information, then neither of the two post-hoc bounding procedures will leak the original information or compromise the established $(\epsilon, \delta)$-pDP.

The exponential mechanism was introduced by McSherry and Talwar [4]. We paraphrase the original definition as follows, covering both discrete and continuous outcomes. Let $\mathcal{S}$ denote the set containing all possible outputs $\mathbf{s}^*$. The exponential mechanism releases $\mathbf{s}^*$ with probability

$$f(\mathbf{s}^*) = \exp\left( u(\mathbf{s}^* \mid \mathbf{x}) \frac{\epsilon}{2 \Delta_u} \right) (A(\mathbf{x}))^{-1} \tag{14}$$

to ensure $\epsilon$-DP. $A(\mathbf{x})$ is a normalizing constant so that $f(\mathbf{s}^*)$ sums or integrates to 1, and equals $\sum_{\mathbf{s}^* \in \mathcal{S}} \exp\left( u(\mathbf{s}^* \mid \mathbf{x}) \frac{\epsilon}{2 \Delta_u} \right)$ or $\int_{\mathbf{s}^* \in \mathcal{S}} \exp\left( u(\mathbf{s}^* \mid \mathbf{x}) \frac{\epsilon}{2 \Delta_u} \right) d\mathbf{s}^*$, depending on whether $\mathcal{S}$ is a countable/discrete sample space or a continuous set, respectively. $u$ is the utility function and assigns a "utility" score to each possible outcome $\mathbf{s}^*$ conditional on the original data $\mathbf{x}$, and $\Delta_u = \max_{\mathbf{x}, \mathbf{x}', d(\mathbf{x}, \mathbf{x}') = 1, \mathbf{s}^* \in \mathcal{S}} | u(\mathbf{s}^* \mid \mathbf{x}) - u(\mathbf{s}^* \mid \mathbf{x}') |$ is the maximum change in the utility score across all possible outputs $\mathbf{s}^*$ and all possible data sets $\mathbf{x}$ and $\mathbf{x}'$ with $d(\mathbf{x}, \mathbf{x}') = 1$. From a practical perspective, the scores should properly reflect the "usefulness" of $\mathbf{s}^*$. For example, "usefulness" can be measured by the similarity between perturbed $\mathbf{s}^*$ and original $\mathbf{s}$ if $\mathbf{s}$ is numerical. The closer $\mathbf{s}^*$ is to the original $\mathbf{s}$, the larger $u(\mathbf{s}^* \mid \mathbf{x})$ is, and the higher the probability that $\mathbf{s}^*$ will be released. The Exponential mechanism can be conservative (see Appendix D), in the sense that the actual privacy cost is lower than the nominal privacy budget $\epsilon$, or that more perturbation than necessary is injected to preserve $\epsilon$-DP.
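As a minimal sketch of Eqn (14) for a finite output set, the following Python snippet computes the release probabilities and draws one output. The function names, the toy count query, the candidate range 0 to 20, and the choice $u(s^* \mid \mathbf{x}) = -|s^* - s|$ with $\Delta_u = 1$ are illustrative assumptions and do not come from the paper.

```python
import math
import random

def exponential_mechanism_probs(candidates, utility, delta_u, epsilon):
    """Probabilities proportional to exp(epsilon * u / (2 * delta_u)), as in Eqn (14)."""
    scores = [epsilon * utility(c) / (2.0 * delta_u) for c in candidates]
    top = max(scores)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(v - top) for v in scores]
    total = sum(weights)  # plays the role of the normalizing constant A(x)
    return [w / total for w in weights]

def exponential_mechanism_draw(candidates, probs, rng):
    """Draw a single sanitized output according to the release probabilities."""
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical toy setting (an assumption for illustration): a single count
# with true value 7, candidate outputs 0..20, and utility -|s* - s|.
# For a count query, |u(s*|x) - u(s*|x')| <= |s(x) - s(x')| <= 1, so delta_u = 1.
true_count = 7
candidates = list(range(21))
probs = exponential_mechanism_probs(
    candidates, lambda c: -abs(c - true_count), delta_u=1.0, epsilon=1.0
)
released = exponential_mechanism_draw(candidates, probs, random.Random(0))
```

In this sketch the most likely output is the true count, and the probability drops by a factor of $e^{-\epsilon/2}$ for each unit step away from it, which reflects the factor of 2 in the denominator of Eqn (14).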
Despite the conservativeness, the Exponential mechanism is widely used in DP because of its generality and flexibility, as long as the utility function $u$ is properly designed.

When $u$ is defined as the negative $p$th power of the $p$th-order Minkowski distance between $\mathbf{s}^*$ and $\mathbf{s}$, that is, $u(\mathbf{s}^* \mid \mathbf{s}) = -\| \mathbf{s}^* - \mathbf{s} \|_p^p$, the Exponential mechanism generates perturbed $\mathbf{s}^*$ from the GG distribution

$$f(\mathbf{s}^* \mid \mathbf{s}) = (A(\mathbf{s}))^{-1} \exp\left( -\| \mathbf{s}^* - \mathbf{s} \|_p^p \frac{\epsilon}{2 \Delta_u} \right) = (A(\mathbf{s}))^{-1} \prod_{k=1}^{r} \exp\left( -\frac{|s^*_k - s_k|^p}{2 \Delta_u \epsilon^{-1}} \right) = \prod_{k=1}^{r} \mathrm{GG}(s_k, b, p) \tag{15}$$

with $A(\mathbf{s}) = (2 p^{-1} b \, \Gamma(p^{-1}))^r$ and $b^p = 2 \Delta_u \epsilon^{-1}$. The scale parameter $b$ in Eqn (15) is a function of the GS of the utility function $\Delta_u$ and the privacy budget $\epsilon$. For bounded data $s^*_k \in [c_{k0}, c_{k1}]$ for $k = 1, \ldots, r$, the Exponential mechanism based on the GG distribution is

$$f(\mathbf{s}^* \mid \mathbf{s}^* \in [\mathbf{c}_0, \mathbf{c}_1]) = (A(\mathbf{s}))^{-1} \prod_{k=1}^{r} (B(s_k))^{-1} \exp\left( -\frac{|s^*_k - s_k|^p}{2 \Delta_u \epsilon^{-1}} \right), \tag{16}$$

where $B(s_k) = \Pr(s^*_k \in [c_{k0}, c_{k1}])$ is calculated from the pdf $\mathrm{GG}(s_k, b, p)$. Compared to the truncated GG mechanism in Definition 5, the only difference in the Exponential mechanism in Eqn (16) is how the scale parameter $b$ is defined. In Definition 5, $b$ depends on the GS of $\mathbf{s}$ ($\Delta_p$), while it is a function of the GS of the utility function $u$ ($\Delta_u$) in the Exponential mechanism. Specifically, $b^p \ge 2 \epsilon^{-1} \Delta_u$ in the Exponential mechanism, and the lower bound on $b$ is given in Eqn (7) in the GG mechanism. While both mechanisms satisfy $\epsilon$-DP, the one with a smaller $b$ is preferable at the same $\epsilon$. The magnitude of $b$ in each case depends on the bounds of $\mathbf{s}$ and the order $p$, in addition to $\Delta_u$ or $\Delta_p$. Though not a direct comparison of $b$, Lemma 9 explores the relationship between $\Delta_u$ and $\Delta_p$, with the hope of shedding light on the comparison of $b$ (the proof is in Appendix E).

Lemma 9.

Let $[c_{k0}, c_{k1}]$ denote the bounds on $s_k$ for $k = 1, \ldots, r$.
a) When $u = -\| \mathbf{s}^* - \mathbf{s} \|_1$, $\Delta_u \le \Delta_1$. Both the GG mechanism and the GG-distribution based Exponential mechanism reduce to the truncated Laplace mechanism with the same $b$.
b) When $u = -\| \mathbf{s}^* - \mathbf{s} \|_2^2$, $\Delta_u \le 2 \sum_{k=1}^{r} \Delta_{1,k} |c_{k1} - c_{k0}|$.
c) When $u = -\| \mathbf{s}^* - \mathbf{s} \|_p^p$ for $p \ge 3$, $\Delta_u \le \sum_{k=1}^{r} \sum_{j=1}^{p} \binom{p}{j} (\max\{ |c_{k0}|, |c_{k1}| \})^{p-j} \Delta^{(j)}_{1,k}$, where $\Delta^{(j)}_{1,k} = \max_{\mathbf{x}, \mathbf{x}', d(\mathbf{x}, \mathbf{x}') = 1} | (s_k(\mathbf{x}))^j - (s_k(\mathbf{x}'))^j |$ is the $l_1$ GS of $(s_k)^j$.

As a final note on the GG-distribution based Exponential mechanism, we did not use the negative Minkowski distance directly as the utility function due to a couple of potential practical difficulties with this approach. First, $\Delta_u$ can be difficult to obtain. Second, $f(\mathbf{s}^*) \propto \exp\{ -(\sum_{k=1}^{r} |s^*_k - s_k|^p)^{1/p} \epsilon (2 \Delta_u)^{-1} \}$ does not appear to be associated with any known distribution (except when $p = 1$), and additional effort is required to study the properties of $f(\mathbf{s}^*)$ and to develop an efficient algorithm to draw samples from it.

A special case of the GG mechanism is the Gaussian mechanism when $p = 2$, which draws $s^*_k$ independently from a Gaussian distribution with mean $s_k$ and variance $\sigma^2 = b^2/2$ for $k = 1, \ldots, r$. Applying Eqn (6) with $b$ defined in Eqns (7) and (8), we can obtain the truncated Gaussian mechanism of $\epsilon$-DP for bounded $\mathbf{s} \in [c_{10}, c_{11}] \times \cdots \times [c_{r0}, c_{r1}]$:

$$f(\mathbf{s}^* \mid \mathbf{s}) = \prod_{k=1}^{r} \left\{ (\Phi(c_{k1}; \mu, \sigma^2) - \Phi(c_{k0}; \mu, \sigma^2))^{-1} \phi(s^*_k; \mu = s_k, \sigma^2 = b^2/2) \right\}, \tag{17}$$

where $b^2 \ge \epsilon^{-1} \left( 2 \sum_{k=1}^{r} |c_{k1} - c_{k0}| \Delta_{1,k} + \Delta_2^2 \right)$ from Eqn (7), or the less tight $b^2 \ge \epsilon^{-1} \sum_{k=1}^{r} \left( 2 |c_{k1} - c_{k0}| \Delta_{1,k} + \Delta_{1,k}^2 \right)$ from Eqn (8), and $\phi$ and $\Phi$ are the pdf and the CDF of the Gaussian distribution, respectively.

An analytical solution for the lower bound on $b$ for the Gaussian mechanism of $(\epsilon, \delta)$-pDP is provided in Lemma 10 (the proof is provided in Appendix F).

Lemma 10.

The lower bound on the scale parameter $b$ from the Gaussian mechanism of $(\epsilon, \delta)$-pDP is

$$b \ge 2^{-1/2} \epsilon^{-1} \Delta_2 \left(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\right).$$

Given the relationship between $b$ and the standard deviation of the Gaussian distribution, $\sigma = b/\sqrt{2}$, the lower bound can also be expressed in $\sigma$:

$$\sigma \ge (2\epsilon)^{-1} \Delta_2 \left(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\right). \qquad (18)$$

The pDP lower bound given in Eqn (18) is different from the lower bound

$$\sigma > \epsilon^{-1} \Delta_2 c, \text{ with } \epsilon \in (0, 1) \text{ and } c^2 > 2\ln(1.25/\delta), \qquad (19)$$

in Dwork and Roth [10] for $(\epsilon, \delta)$-aDP (Eqn (2)). The pDP bound in Eqn (18) is tighter than the aDP bound in Eqn (19) for the same set of $(\epsilon, \delta)$ (note that the interpretation of $\delta$ in pDP and aDP is different, but the DP guarantee is roughly the same when $\delta$ is small). In addition, the pDP bound does not constrain $\epsilon$ to be $< 1$. Figure 3 compares the two bounds over $\epsilon \in (0, 1)$ and a range of $\delta$; the ratio of the pDP bound to the aDP bound is below 1 for the examined $(\epsilon, \delta)$. The smaller $\epsilon$ is, or the larger $\delta$ is, the smaller the ratio is and the larger the difference is between the two bounds.

Figure 3: Comparison of pDP lower bound (Eqn 18) vs. aDP bound (Eqn 19) on $\sigma$ in the Gaussian mechanism for $\epsilon < 1$ (the aDP bound requires $\epsilon < 1$)

Dwork and Roth [10] list several advantages of Gaussian noise: it is a "familiar" type of noise, as many noise sources in real life can be well approximated by Gaussian distributions; the sum of Gaussian variables is still Gaussian; and finally, in the case of multiple queries or when $\delta$ is small, the pure-DP guarantee in the Laplace mechanism and the pDP guarantee in the Gaussian mechanism see minimal difference. A theoretical disadvantage of Gaussian noise is that it does not guarantee DP in some cases (e.g., Report Noisy Max) [10].

We investigate the accuracy of $s^*$ by examining the tail probability and the dispersion of the noises injected via the $\epsilon$-DP Laplace mechanism and the $(\epsilon, \delta)$-pDP Gaussian mechanism. Denote the noise drawn from the Laplace distribution by $e_1$ and that from the Gaussian distribution by $e_2$. The location parameters of both are $\mu = 0$; the tail probability is $p_1 = \Pr(|e_1| > |t|) = \exp(-|t|\epsilon/\Delta_1)$ in the Laplace distribution and $p_2 = \Pr(|e_2| > |t|) = 2\Phi(-|t|/\sigma)$ in the Gaussian distribution, where $\sigma$ is given in Eqn (18). Since the CDF $\Phi()$ does not have a closed-form expression, we examine several numerical examples to compare $p_1$ and $p_2$ (Figure 4). We set $\epsilon$ to be the same (0.1, 1, 2, respectively) between the two mechanisms and examine $\delta = (1\%, 5\%, 10\%, 20\%)$ for the $(\epsilon, \delta)$-pDP Gaussian mechanism. If the ratio $p_1 : p_2$ is $< 1$, it implies that the Laplace mechanism is less likely to generate more extreme $s^*$ compared to the Gaussian mechanism at the same privacy specification of $\epsilon$. We should focus on the meaningful cases where noise $|t|$ has at least a non-ignorable chance to occur in either mechanism. We applied a small cutoff on the tail probability, so that a case counts as meaningful when either $p_1$ or $p_2$ exceeds the cutoff (other cutoffs can be used, depending on how "unlikely" is defined). It is interesting to observe that after the initial take-off at 1 when $|t| = 0$, the ratio decreases until it hits the bottom and then bounces back, with some cases eventually exceeding 1 at some value of $|t|$, depending on the privacy parameter specification. The smaller $\epsilon$ or $\delta$ is, the longer it takes for the bounce-back to occur. The observation suggests that the Laplace mechanism is in some cases more likely to generate sanitized results $s^*$ that are far away from $s$.

Figure 4: Ratio of the tail probabilities $p_1 : p_2$, plotted against $|t|$ for $(\epsilon, \delta)$ with $\epsilon \in \{0.1, 1, 2\}$ and $\delta \in \{0.01, 0.05, 0.1, 0.2\}$ (the gray curves represent the unlikely cases where both $p_1$ and $p_2$ are below the cutoff)

We also compare the privacy parameter $\epsilon$ between the two mechanisms when both have the same tail probability. Figure 5 shows the calculated $\epsilon_2$ value associated with the Gaussian mechanism of $(\epsilon_2, \delta)$-pDP for a given $\delta$ that yields $\Pr(|e_1| < |t|) = \Pr(|e_2| < |t|)$ with the Laplace mechanism of $\epsilon_1$-DP. If the ratio $\epsilon_2 : \epsilon_1$ is $< 1$ at a given $|t|$ and a small and somewhat ignorable $\delta$, it implies that the same tail probability can be achieved with less privacy cost with the Gaussian mechanism compared to the Laplace mechanism.
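The comparisons above can be explored numerically. The following is a minimal Python sketch, not the code used to produce the figures: it evaluates the bounds in Eqns (18) and (19), the two tail probabilities, and the $\epsilon_2$ that matches the Laplace tail at a given $|t|$. The function names, the default sensitivity of 1, the use of a single sensitivity value for both mechanisms (as for a scalar statistic), and the bisection search are all assumptions made for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; provides cdf() and inv_cdf()


def sigma_pdp(eps, delta, sens=1.0):
    """Lower bound on sigma for the (eps, delta)-pDP Gaussian mechanism, Eqn (18)."""
    z = STD_NORMAL.inv_cdf(delta / 2.0)  # negative whenever delta < 1
    return sens * (sqrt(z * z + 2.0 * eps) - z) / (2.0 * eps)


def sigma_adp(eps, delta, sens=1.0):
    """aDP bound of Eqn (19), evaluated at the boundary c^2 = 2 ln(1.25/delta)."""
    return sens * sqrt(2.0 * log(1.25 / delta)) / eps


def laplace_tail(t, eps, sens=1.0):
    """p1 = Pr(|e1| > |t|) for Laplace noise with scale sens / eps."""
    return exp(-abs(t) * eps / sens)


def gaussian_tail(t, sigma):
    """p2 = Pr(|e2| > |t|) for zero-mean Gaussian noise with std. dev. sigma."""
    return 2.0 * STD_NORMAL.cdf(-abs(t) / sigma)


def matching_epsilon(t, eps1, delta, sens=1.0, iters=200):
    """Bisection for eps2 such that the pDP Gaussian tail equals the Laplace tail."""
    target = laplace_tail(t, eps1, sens)
    lo, hi = 1e-9, 1.0
    # sigma_pdp decreases in eps, so the Gaussian tail shrinks as eps grows.
    while gaussian_tail(t, sigma_pdp(hi, delta, sens)) > target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_tail(t, sigma_pdp(mid, delta, sens)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `sigma_pdp(0.5, 0.05) / sigma_adp(0.5, 0.05)` gives the ratio of the two bounds at one $(\epsilon, \delta)$ pair, and `matching_epsilon(5.0, 1.0, 0.05) / 1.0` gives a relative privacy cost $\epsilon_2 : \epsilon_1$ at $|t| = 5$.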
Figure 5 suggests that at the same $|t|$, the more relaxation of the pure $\epsilon$-DP is allowed (i.e., the larger $\delta$ is), the smaller $\epsilon_2$ is (relative to the baseline $\epsilon_1$), which is expected as $\epsilon$ and $\delta$ together determine the noise released in the Gaussian mechanism.

Lemma 11 presents the precision comparison of $s^*$ between the Laplace mechanism of $\epsilon$-DP and the Gaussian mechanism of $(\epsilon, \delta)$-pDP. With the same location parameter in the Laplace and Gaussian distributions, a larger precision is equivalent to a smaller mean squared error (MSE).

Lemma 11.

Between the Gaussian mechanism of $(\epsilon, \delta)$-pDP and the Laplace mechanism of $\epsilon$-DP for sanitizing a statistic $s$, when $\delta < 2\Phi(-\sqrt{2}) \approx 0.157$, the variance of $s^*$ released by the Gaussian mechanism of $(\epsilon, \delta < 0.157)$-pDP is always larger than the variance of $s^*$ released by the Laplace mechanism of $\epsilon$-DP.

In other words, if there are multiple sets of $s^*$ released via the Gaussian and the Laplace mechanisms respectively, then the former sets would have a wider spread than the latter. Since $(\epsilon, \delta)$-pDP provides less privacy protection than $\epsilon$-DP, together with the larger MSE, it can be argued that the Laplace mechanism is superior to the Gaussian mechanism (which is also reflected in the 3 experiments in Section 4). It should be noted that $\delta < 0.157$ in Lemma 11 is a sufficient but not necessary condition. In other words, the Gaussian mechanism may not be less dispersed than the Laplace mechanism when $\delta \ge 0.157$. Since $\delta$ needs to be small to provide sufficient privacy protection in the setting of $(\epsilon, \delta)$-pDP, it is very unlikely to have $\delta > 0.157$ in practical applications. Note also that the setting explored in Lemma 11, where the focus is on examining the precision (dispersion) of a single perturbed statistic given the specified privacy parameters and the original statistics when the sample size of a data set is public, is different from the recent work on the bounds of sample complexity (required sample size) to reach a certain level of statistical accuracy in perturbed results with $\epsilon$-DP or $(\epsilon, \delta)$-aDP [18] (more discussion is provided in Section 5 on this point).

Figure 5: Relative privacy cost $\epsilon_2 : \epsilon_1$, plotted against $|t|$ (the gray curves represent the unlikely cases where both $p_1$ and $p_2$ are below the cutoff)
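Lemma 11 can be checked numerically for a scalar statistic. The sketch below is an illustration under stated assumptions rather than the paper's proof: it takes the Gaussian standard deviation at its lower bound in Eqn (18), takes the Laplace scale as $\Delta_1/\epsilon$, and uses one sensitivity value for both mechanisms (for a scalar, $\Delta_1 = \Delta_2$). The function names are hypothetical. Under these assumptions the variance ratio is $(\sqrt{z^2 + 2\epsilon} - z)^2/8$ with $z = \Phi^{-1}(\delta/2)$, which exceeds $z^2/2$ and hence 1 whenever $z \le -\sqrt{2}$; this is one way to see where the threshold $2\Phi(-\sqrt{2})$ comes from.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Threshold on delta in Lemma 11: 2 * Phi(-sqrt(2)), roughly 0.157.
DELTA_THRESHOLD = 2.0 * STD_NORMAL.cdf(-sqrt(2.0))


def gaussian_variance_pdp(eps, delta, sens=1.0):
    """Variance of Gaussian noise with sigma at the pDP lower bound of Eqn (18)."""
    z = STD_NORMAL.inv_cdf(delta / 2.0)
    sigma = sens * (sqrt(z * z + 2.0 * eps) - z) / (2.0 * eps)
    return sigma * sigma


def laplace_variance(eps, sens=1.0):
    """Variance of Laplace noise with scale sens / eps (variance = 2 * scale^2)."""
    scale = sens / eps
    return 2.0 * scale * scale


def variance_ratio(eps, delta, sens=1.0):
    """Gaussian-to-Laplace variance ratio; a value above 1 means wider Gaussian spread."""
    return gaussian_variance_pdp(eps, delta, sens) / laplace_variance(eps, sens)
```

Evaluating `variance_ratio` over a grid of $\epsilon$ and $\delta$ below `DELTA_THRESHOLD` gives a quick sanity check of the lemma; values of $\delta$ above the threshold may still give a ratio above 1, consistent with the condition being sufficient but not necessary.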

We run three experiments on the mildew data set, the Czech data set, and the Census Income data set, a.k.a. the adult data. The mildew data contains information on parental alleles at 6 loci on the chromosome for 70 strands of barley powder mildew [19]. Each locus has two levels, yielding a very sparse 6-way cross-tabulation (22 of the 64 cells are non-empty, and many of those have low frequencies). The Czech data contains data collected on 6 potential risk factors for coronary thrombosis for 1841 workers in a Czechoslovakian car factory [19]. Each risk factor has 2 levels (Y or N). The cross-tabulation is also 6-way with 64 cells, the same as the mildew data, but the table is not as sparse given the large $n$ (only one empty cell). The adult data was extracted from the 1994 US Census database to yield a set of reasonably clean records that satisfy a set of conditions [20]. The data set is often used to test classifiers by predicting whether a person makes over 50K a year. We used only the completers in the adult data (with no missing values on the attributes) and then split them into 2/3 training (20009 subjects) and 1/3 testing (10005 subjects).

In each experiment, we run the Laplace mechanism of $\epsilon$-DP, the Gaussian mechanism of $(\epsilon, \delta)$-pDP presented in Section 3, and the Gaussian mechanism of $(\epsilon, \delta)$-aDP [10] to sanitize count data. We examined several values of $\epsilon$ (0.5, 1, 2 in Figure 6 and 0.1, 0.5, 1 in Figure 8) and four values of $\delta$, the largest being 0.25. To examine the variation of noises, we run 500 repeats and computed the means and standard deviations of the $l_1$ distances between the sanitized and the original counts and of the Kullback-Leibler (KL) divergence between the empirical distributions of the synthetic data and the original data over the 500 repeats. In addition, we tested the GG mechanism of order 3 ($p = 3$) in the mildew data, and compared the classification accuracy of the income outcome in the testing data set in the adult experiment based on the support vector machines (SVMs) trained with the original training data and the sanitized training data, respectively. The KL divergence was calculated using the KL.Dirichlet command in R package entropy, which computes a Bayesian estimate of the KL divergence. The SVMs were trained using the svm command in R package e1071.

In all experiments, $\Delta_p = 1$ for all $p$ since the released query is a histogram and the bin counts are based on disjoint subsets of data. The scale parameters of the Laplace mechanism and the Gaussian mechanisms were obtained analytically ($\Delta_1 \epsilon^{-1}$, Eqns (18) and (19), respectively); the grid search and the MC approach were applied to obtain the lower bound $b$ for GGM-3 via Corollary 8. In the mildew and Czech experiments, we sanitized all bins in the histograms, including the empty bins, assuming all combinations of the 6 attributes in each case are practically meaningful (in other words, the empty cells are sample zeros rather than population zeros). In the adult data, there are 14 attributes and a very large number of bins in the 14-attribute histogram, a non-ignorable portion of which do not make any practical sense (e.g., a 90-year-old who works $> 80$ hours per week). For simplicity, we only sanitized the 17,985 nonempty cells in the training data. After the sanitization, we set the out-of-bounds synthetic counts $< 0$ at 0 and $> n$ at $n$, respectively, and normalized the sanitized counts to sum to the original sample size $n$ in all 3 experiments, assuming $n$ itself is public or does not carry privacy information.

Figure 6: sanitized vs. original cell counts in the mildew data (panels by $\epsilon$ = 0.5, 1, 2 and by $\delta$; mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP, GGM3-pDP)

Figure 7: $l_1$ distance and KL divergence between sanitized and original counts in the mildew data (mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP, GGM3-pDP)

Figure 8: sanitized vs. original cell counts in the Czech data (panels by $\epsilon$ = 0.1, 0.5, 1 and by $\delta$; mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP)

Figure 9: $l_1$ distance and KL divergence between sanitized and original counts in the Czech data (mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP)

Figure 10: sanitized vs. original cell counts in the adult data
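The sanitization and post-processing steps described above (add noise to every bin, clip to $[0, n]$, rescale to sum to $n$) can be sketched as follows. This is a minimal illustration in Python, not the authors' R code; the function names and the `scale` argument are assumptions. For the Laplace mechanism `scale` stands for $\Delta_1/\epsilon$, and for a Gaussian mechanism it stands for the standard deviation $\sigma$ from Eqn (18) or Eqn (19).

```python
import random


def laplace_noise(rng, scale):
    """Draw Laplace(0, scale) noise as the difference of two iid exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sanitize_counts(counts, scale, mechanism="laplace", seed=None):
    """Perturb histogram counts, clip to [0, n], and rescale to sum to n."""
    rng = random.Random(seed)
    n = sum(counts)  # original sample size, assumed to be public
    clipped = []
    for c in counts:
        if mechanism == "laplace":
            noise = laplace_noise(rng, scale)
        elif mechanism == "gaussian":
            noise = rng.gauss(0.0, scale)
        else:
            raise ValueError("mechanism must be 'laplace' or 'gaussian'")
        # Out-of-bounds values are set to the nearest bound, 0 or n.
        clipped.append(min(max(c + noise, 0.0), float(n)))
    total = sum(clipped)
    if total == 0.0:
        return clipped  # degenerate case: every bin was clipped to zero
    return [v * n / total for v in clipped]
```

A call such as `sanitize_counts([5, 0, 12, 3], scale=1.0, seed=0)` returns non-negative, generally non-integer counts that sum to 20; repeating the call with different seeds mimics the repeated runs used to assess variation.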

The results are given in Figures 6 to 12. In Figures 6, 8 and 10, the closer the points are to the identity line, the more similar the original and sanitized counts are. The Laplace sanitizer is the obvious winner in all 3 cases, producing the sanitized counts closest to the original with the smallest $l_1$ error and KL divergence, followed by the Gaussian mechanism of $(\epsilon, \delta)$-pDP, and GGM3 of $(\epsilon, \delta)$-pDP in the mildew data; the Gaussian mechanism of $(\epsilon, \delta)$-aDP is the worst. In the mildew experiment, the performance of the Gaussian mechanism of $(\epsilon, \delta)$-pDP is similar when $\epsilon = 2$ or $\delta \ge 0.1$. The $l_1$ error and the KL divergence seem to decrease more or less linearly as $\epsilon$ increases from 0.5 to 1 to 2, while $\delta$ seems to have a less profound impact on the $l_1$ error and the KL divergence. In the Czech experiment, the sanitized counts approach the original counts more quickly than in the mildew case with increased $\epsilon$ and $\delta$, but there is significantly more variability for small $\epsilon$ (0.1); and the $l_1$ error and the KL divergence no longer decrease in a linear fashion, but drop drastically over the lower part of the range of $\epsilon$, in contrast to the change from $\epsilon = 1$ to 2. The differences in the results between the mildew and the Czech experiments can be explained by the larger $n$ in the latter. In the adult experiment, Figure 12 suggests that the prediction accuracy via the SVMs built on sanitized data is barely affected compared to the original accuracy regardless of the mechanism. There are some decreases in the accuracy rates from the original, but they are largely ignorable (on the scale of 0.25% to 1%), even with the variation taken into account. In addition, the Gaussian mechanism of $(\epsilon, \delta)$-aDP, though the worst in preserving the original counts as measured by the $l_1$ distance and KL divergence, is no worse than the other two mechanisms in prediction.

Figure 11: $l_1$ distance and KL divergence between sanitized and original counts in the adult data (mechanisms: Laplace, Gaussian-pDP, Gaussian-aDP)

Figure 12: Prediction accuracy in testing data via SVMs trained on sanitized and original data in the adult data (panels: accuracy rate, Pr(predicted > 50K | > 50K), Pr(predicted <= 50K | <= 50K), each plotted against $\delta$ with the original rates for reference; mechanisms: Laplace, Gaussian-pDP, Gaussian-aDP)
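The two accuracy measures used in these comparisons can be sketched in a few lines of Python. This is an illustration only and not a reimplementation of the KL.Dirichlet command in R package entropy: the additive pseudocount of 0.5 per bin, the direction of the divergence (original relative to sanitized), and the function names are assumptions.

```python
from math import log


def l1_distance(original, sanitized):
    """Sum of absolute differences between original and sanitized bin counts."""
    return sum(abs(a - b) for a, b in zip(original, sanitized))


def smoothed_freqs(counts, pseudocount=0.5):
    """Relative frequencies after adding a positive pseudocount to each bin."""
    total = sum(counts) + pseudocount * len(counts)
    return [(c + pseudocount) / total for c in counts]


def kl_divergence(original, sanitized, pseudocount=0.5):
    """KL divergence between the smoothed empirical distributions of two histograms."""
    p = smoothed_freqs(original, pseudocount)
    q = smoothed_freqs(sanitized, pseudocount)
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))
```

The pseudocount keeps the divergence finite when a bin is empty in one histogram but not the other, which matters for sparse tables such as the mildew data. Averaging `l1_distance` and `kl_divergence` over repeated sanitizations gives summaries of the kind reported in Figures 7, 9 and 11.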
We introduced a new concept of the $l_p$ GS, and unified the Laplace mechanism and the Gaussian mechanism in the family of the GG mechanism. For bounded data, we discussed the truncated and the BIT GG mechanisms to achieve $\epsilon$-DP. We also proposed $(\epsilon, \delta)$-pDP as an alternative paradigm to the pure $\epsilon$-DP for the GG mechanism for order $p \ge 2$. We showed the connections and distinctions between the GG mechanism and the Exponential mechanism when the utility function is defined as the negative $p$th power of the Minkowski distance between the original and sanitized results. We also presented the Gaussian mechanism as an example of the GG mechanism and derived a lower bound for the scale parameter of the associated Gaussian distribution to achieve $(\epsilon, \delta)$-pDP. The bound is tighter than the lower bound for the Gaussian mechanism of $(\epsilon, \delta)$-aDP. We compared the tail probability and the dispersion of the noise generated via the Gaussian mechanism of $(\epsilon, \delta)$-pDP and the Laplace mechanism. We finally applied the Gaussian mechanisms of $(\epsilon, \delta)$-pDP and $(\epsilon, \delta)$-aDP and the Laplace mechanism of $\epsilon$-DP in three real-life data sets.

The GG mechanism is based on the $l_p$ "global" sensitivity of query results in the sense that the sensitivity is independent of any specific data. Though the employment of the GS is robust in terms of privacy protection, it could result in a large amount of noise being injected into query results. There is work that allows the sensitivity of a query to vary with data ("local" sensitivity) [21, 22] with the purpose of increasing the accuracy of sanitized results. How to develop the GG mechanism in the context of local sensitivity is a topic for future investigation.

The setting for the examination of the tail probability and dispersion in Section 3 is different from, though related to, the work on upper and lower bounds on sample complexity: the required sample size $n$ to reach a certain level of accuracy $\alpha$ and privacy guarantee $(\epsilon, \delta)$ for count queries [18, 23, 24]. $\alpha$ often refers to the accuracy of perturbed results in the DP literature, such as the worst case accuracy $L_\infty$ or average accuracy $L_1$, and might also refer to the tail probability and the MSE of released data, among others.
A differential privacy mechanism is characterized by $\epsilon$ (and $\delta$) for privacy guarantee, $\alpha$ to measure information preservation and utility of sanitized results, and the sample size $n$ of original data. The existing work on sample complexity focuses on bounding $n$ given $\epsilon$ (and $\delta$) and $\alpha$, while the results in Section 3 focus on the accuracy and precision of sanitized results given $\epsilon$ (and $\delta$) and $n$. If the bias of perturbed results (relative to the original results) is the same between the two mechanisms, a larger precision is equivalent to a smaller MSE.

Appendix

A Proof of Lemma 2

$$\Delta_p = \max_{x, x', d(x,x')=1} \left(\sum_{k=1}^r |s_k(x) - s_k(x')|^p\right)^{1/p} = \left(\max_{x, x', d(x,x')=1} \sum_{k=1}^r |s_k(x) - s_k(x')|^p\right)^{1/p}.$$

Since

$$\max_{x, x', d(x,x')=1} \sum_{k=1}^r |s_k(x) - s_k(x')|^p \le \sum_{k=1}^r \max_{x, x', d(x,x')=1} |s_k(x) - s_k(x')|^p = \sum_{k=1}^r \left(\max_{x, x', d(x,x')=1} |s_k(x) - s_k(x')|\right)^p = \sum_{k=1}^r \Delta_{1,k}^p,$$

$\left(\sum_{k=1}^r \Delta_{1,k}^p\right)^{1/p}$ is an upper bound for $\Delta_p$. ∎

B Proof of Claim 4

$$\begin{aligned}
\left|\log\left(\frac{\Pr(s^* \in Q | x)}{\Pr(s^* \in Q | x')}\right)\right|
&= \left|\log\left(\frac{\exp(-b^{-p}\|s^* - s(x)\|_p^p)}{\exp(-b^{-p}\|s^* - s(x')\|_p^p)}\right)\right| \\
&= b^{-p}\left|\|s^* - s(x)\|_p^p - \|s^* - s(x')\|_p^p\right| \\
&= b^{-p}\left|\sum_{k=1}^r \left(|s^*_k - s_k(x)|^p - |s^*_k - s_k(x')|^p\right)\right| \\
&\le b^{-p}\sum_{k=1}^r \left||s^*_k - s_k(x)|^p - |s^*_k - s_k(x')|^p\right| \\
&= b^{-p}\sum_{k=1}^r \left||e_k|^p - |e_k + d_k|^p\right|, \qquad (B.1)
\end{aligned}$$

where $e_k = s^*_k - s_k(x)$ and $d_k = s_k(x) - s_k(x')$. For integers $p \ge 1$,

$$\begin{aligned}
(B.1) &= b^{-p}\sum_{k=1}^r \left||e_k^p| - |(e_k + d_k)^p|\right| \\
&\le b^{-p}\sum_{k=1}^r \left|e_k^p - (e_k + d_k)^p\right| \quad \text{(by the reverse triangle inequality)} \\
&= b^{-p}\sum_{k=1}^r \left|\sum_{j=1}^p \binom{p}{j} e_k^{p-j} d_k^j\right| \\
&\le b^{-p}\sum_{k=1}^r \sum_{j=1}^p \binom{p}{j} |e_k|^{p-j} |d_k|^j \qquad (B.2) \\
&= b^{-p}\left(p\sum_{k=1}^r |e_k|^{p-1}|d_k| + \frac{(p-1)p}{2}\sum_{k=1}^r |e_k|^{p-2}|d_k|^2 + \cdots + \frac{(p-1)p}{2}\sum_{k=1}^r |e_k|^2|d_k|^{p-2} + p\sum_{k=1}^r |e_k| \cdot |d_k|^{p-1} + \sum_{k=1}^r |d_k|^p\right) \\
&\le b^{-p}\left(p\sum_{k=1}^r |e_k|^{p-1}\Delta_{1,k} + \frac{(p-1)p}{2}\sum_{k=1}^r |e_k|^{p-2}\Delta_{1,k}^2 + \cdots + \frac{(p-1)p}{2}\sum_{k=1}^r |e_k|^2\Delta_{1,k}^{p-2} + p\sum_{k=1}^r |e_k|\Delta_{1,k}^{p-1} + \Delta_p^p\right), \qquad (B.3)
\end{aligned}$$

where $\Delta_{1,k}$ is the $l_1$ GS of $s_k$ and $\Delta_p$ is the $l_p$ GS of $s$. To achieve $\epsilon$-DP, Eqn (B.3) needs to be $\le \epsilon$; that is,

$$\Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |e_k|^{p-j} \Delta_{1,k}^j \le b^p \epsilon. \qquad (B.4)$$

A less tight bound can be obtained by applying Lemma 2 ($\Delta_p^p \le \sum_{k=1}^r \Delta_{1,k}^p$), thus

$$\sum_{k=1}^r \sum_{j=1}^{p} \binom{p}{j} |e_k|^{p-j} \Delta_{1,k}^j \le b^p \epsilon. \qquad (B.5)$$

The inequalities in Eqns (B.4) and (B.5) suggest that the lower bound on $b$ depends on the random GG noise $e_k = s^*_k - s_k$ for $k = 1, \ldots, r$, the support of which is $(-\infty, \infty)^r$. In other words, there does not exist a random noise-free solution for $b$, unless $p = 1$, in which case the inequality no longer involves the error terms and the GG mechanism reduces to the familiar Laplace mechanism of $\epsilon$-DP, leading to Claim 4. When $p = 1$, Eqn (B.1) $\le b^{-1}\sum_{k=1}^r |d_k| \le b^{-1}\sum_{k=1}^r |\Delta_{1,k}| = b^{-1}\Delta_1 < \epsilon$, and thus $b > \Delta_1 \epsilon^{-1}$. ∎

C Proof of $\epsilon$-DP of the truncated GG mechanism in Definition 5

To satisfy $\epsilon$-DP, we need

$$\begin{aligned}
&\left|\log\left(\frac{\Pr(s^* \in Q | x, s^* \in [c_{10}, c_{11}] \times \cdots \times [c_{r0}, c_{r1}])}{\Pr(s^* \in Q | x', s^* \in [c_{10}, c_{11}] \times \cdots \times [c_{r0}, c_{r1}])}\right)\right| \\
&= \left|\log\left(\frac{\exp(-b^{-p}\|s^* - s(x)\|_p^p)}{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s_k, b, p)} \times \frac{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s'_k, b, p)}{\exp(-b^{-p}\|s^* - s(x')\|_p^p)}\right)\right| \\
&= \left|\log\left(\frac{\exp(-b^{-p}\|s^* - s(x)\|_p^p)}{\exp(-b^{-p}\|s^* - s(x')\|_p^p)}\right) + \log\left(\frac{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s'_k, b, p)}{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s_k, b, p)}\right)\right| \\
&\le \left|\log\left(\frac{\exp(-b^{-p}\|s^* - s(x)\|_p^p)}{\exp(-b^{-p}\|s^* - s(x')\|_p^p)}\right)\right| \qquad (C.6) \\
&\quad + \left|\log\left(\frac{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s'_k, b, p)}{\prod_{k=1}^r \Pr(c_{k0} \le s^*_k \le c_{k1}; s_k, b, p)}\right)\right| \le \epsilon. \qquad (C.7)
\end{aligned}$$

Suppose each of the two terms, in Eqns (C.6) and (C.7), is bounded by $\epsilon/2$. By Eqn (B.4) with $\epsilon$ replaced by $\epsilon/2$, the bound on the term in Eqn (C.6) requires

$$b^p(\epsilon/2) \ge \Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |s^*_k - s_k|^{p-j} \Delta_{1,k}^j.$$

Since $s^*_k$ is bounded within $[c_{k0}, c_{k1}]$ for $k = 1, \ldots, r$, $|s^*_k - s_k| \le |c_{k1} - c_{k0}|$. Setting

$$b^p(\epsilon/2) \ge \Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^j,$$

or equivalently,

$$b^p \ge 2\epsilon^{-1}\left(\sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^j + \Delta_p^p\right),$$

ensures that the truncated GG mechanism is of $\epsilon$-DP. ∎

D Conservativeness of Exponential mechanism

Corollary 12.

The actual privacy cost of the Exponential mechanism of $\epsilon$-DP is always less than the nominal budget $\epsilon$. When the normalization factor $A(x)$ in Eqn (14) is independent of $x$, the actual privacy cost is $\epsilon/2$. $A(x)$ being independent of $x$ implies that increases and decreases in the utility scores upon the change from $x$ to $x'$ "cancel out" when integrated or summed over all possible $s^*$ in the form of $\exp\left(u(s^*|x)\frac{\epsilon}{2\Delta_u}\right)$.

Proof.

Since $u(s^*|x) - u(s^*|x') \le \Delta_u$,

$$\left|\log\left(\frac{\Pr(s^*(x) \in Q)}{\Pr(s^*(x') \in Q)}\right)\right| = \left|\log\left(\frac{\exp\left(u(s^*|x)\frac{\epsilon}{2\Delta_u}\right)}{\exp\left(u(s^*|x')\frac{\epsilon}{2\Delta_u}\right)} \times \frac{A(x')}{A(x)}\right)\right| \le \left|\log\left(e^{\epsilon/2}\frac{A(x')}{A(x)}\right)\right| \qquad (D.8)$$

$$= \left|\frac{\epsilon}{2} + \log\left(\frac{A(x')}{A(x)}\right)\right| \le \frac{\epsilon}{2} + \left|\log\left(\frac{A(x')}{A(x)}\right)\right| \qquad (D.9)$$

by the triangle inequality, and

$$A(x') = \int_{s^* \in S} \exp\left(u(s^*|x')\frac{\epsilon}{2\Delta_u}\right) ds^* \le \int_{s^* \in S} \exp\left((u(s^*|x) + \Delta_u)\frac{\epsilon}{2\Delta_u}\right) ds^* \qquad (D.10)$$

$$= \exp\left(\frac{\epsilon}{2}\right)\int_{s^* \in S} \exp\left(u(s^*|x)\frac{\epsilon}{2\Delta_u}\right) ds^* = \exp\left(\frac{\epsilon}{2}\right) A(x).$$

Therefore, $\log\left(\frac{A(x')}{A(x)}\right) \le \epsilon/2$, and Eqn (D.9) becomes

$$\left|\log\left(\frac{\Pr(s^*(x) \in Q)}{\Pr(s^*(x') \in Q)}\right)\right| \le \frac{\epsilon}{2} + \left|\log\left(\frac{A(x')}{A(x)}\right)\right| \le \epsilon. \qquad (D.11)$$

The same result can be obtained by replacing the integral with summation when $S$ is a discrete set in the equation set (D.11). The above results seem to suggest that $\epsilon$ can be achieved exactly since "equality" appears in all the inequalities above (Eqns (D.8) to (D.11)); however, equality cannot occur simultaneously in Eqns (D.8) and (D.10) unless $\Delta_u$ were 0, which is meaningless in DP. In addition, $\Delta_u$ is defined as the maximum change in $u$ for all $d(x, x') = 1$. While it is likely that the maximum change occurs at more than a single value of $s^*$, it is not possible that the utility scores at all values of $s^*$ increase or decrease by the same amount $\Delta_u$. In other words, the "equality" in Eqn (D.10) itself is unlikely to hold. All taken together, the actual privacy cost in the Exponential mechanism is always less than $\epsilon$ and never attains the exact upper bound $\epsilon$. In the extreme, the actual privacy cost can be down to $\epsilon/2$ when $A(x) \equiv A(x')$ $\forall\, x, x'$ with $d(x, x') = 1$, as suggested by Eqn (D.9). ∎

E Proof of Lemma 9

Proof.

Part a). Denote $s(x)$ by $s$ and $s(x')$ by $s'$. When $p = 1$, $u(s^*|x) = -\|s^* - s\|_1$, and

$$|u(s^*|x) - u(s^*|x')| = \left|\sum_{k=1}^r (|s^*_k - s_k| - |s^*_k - s'_k|)\right| \le \sum_{k=1}^r \left||s^*_k - s_k| - |s^*_k - s'_k|\right| \le \sum_{k=1}^r \left|s^*_k - s_k - (s^*_k - s'_k)\right| = \sum_{k=1}^r |s_k - s'_k| = \|s - s'\|_1.$$

Therefore, $\Delta_u = \max_{x, x', d(x,x')=1, s^* \in S} |u(s^*|x) - u(s^*|x')| \le \max_{x, x', d(x,x')=1} \|s - s'\|_1 = \Delta_1$, the $l_1$ GS of $s$. ∎

Proof.

Part b). When $p = 2$, $u(s^*|x) = -\|s^* - s\|_2^2$, and

$$|u(s^*|x) - u(s^*|x')| = \left|\sum_{k=1}^r (s_k - s^*_k)^2 - (s'_k - s^*_k)^2\right| \le \sum_{k=1}^r \left|(s_k - s^*_k)^2 - (s'_k - s^*_k)^2\right| = \sum_{k=1}^r |s_k - s'_k| \cdot |s_k - s^*_k + s'_k - s^*_k| \le \sum_{k=1}^r \Delta_{1,k}(|s_k - s^*_k| + |s'_k - s^*_k|).$$

Suppose $s_k$ is bounded within $[c_{k0}, c_{k1}]$, and so is $s^*_k$; then

$$\Delta_u = \max_{x, x', s^* \in S, d(x,x')=1} \left|\sum_{k=1}^r (s_k(x) - s^*_k)^2 - \sum_{k=1}^r (s_k(x') - s^*_k)^2\right| \le 2\sum_{k=1}^r \Delta_{1,k}(c_{k1} - c_{k0}). \qquad (E.12)$$

When $c_{k1} - c_{k0} \equiv b - a$ $\forall\, k$, $\Delta_u \le 2(b - a)\sum_{k=1}^r \Delta_{1,k} = 2(b - a)\Delta_1$. ∎

Proof.

Part c) . When u ( s ∗ | x ) = −(cid:107) s ∗ − s (cid:107) pp for integer p ≥ | u ( s ∗ | x ) − u ( s ∗ | x (cid:48) ) | = (cid:12)(cid:12) (cid:107) s ∗ − s (cid:107) pp −(cid:107) s ∗ − s (cid:48) (cid:107) pp (cid:12)(cid:12) = (cid:12)(cid:12) (cid:80) rk =1 | s k − s ∗ k | p − (cid:80) rk =1 | s (cid:48) k − s ∗ k | p (cid:12)(cid:12) ≤ (cid:80) rk =1 (cid:13)(cid:13) ( s k − s ∗ k ) p | − | ( s (cid:48) k − s ∗ k ) p | (cid:12)(cid:12) ≤ (cid:80) rk =1 (cid:12)(cid:12) ( s k − s ∗ k ) p − ( s (cid:48) k − s ∗ k ) p (cid:12)(cid:12) = (cid:80) rk =1 (cid:12)(cid:12) (cid:80) pi =1 ( pi )( − s ∗ k ) p − i [ s ik − ( s (cid:48) k ) i ] (cid:12)(cid:12) ≤ (cid:80) rk =1 (cid:80) pi =1 ( pi ) (cid:12)(cid:12) ( s ∗ k ) p − i [ s ik − ( s (cid:48) k ) i ] (cid:12)(cid:12) .Suppose s k is bounded within ( c k , c k ), so is s ∗ k .Deﬁne ∆ ( i )1 ,k = max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s ik − ( s (cid:48) k ) i | , then∆ u = max x , x (cid:48) ,d ( x , x (cid:48) )=1 , s ∗ ∈S (cid:12)(cid:12)(cid:80) rk =1 | s k − s ∗ k | p −| s (cid:48) k − s ∗ k | p (cid:12)(cid:12) ≤ (cid:80) rk =1 (cid:80) pi =1 ( pi )∆ ( i )1 ,k (max {| c k | , | c k |} ) p − i (E.13)When p = 1, Eqn (E.13) reduces to ∆ u ≤ (cid:80) rk =1 ∆ ,k in Part a). When p = 2, Eqn (E.13)becomes (cid:80) rk =1 (cid:16) ∆ (2)1 ,k + 2∆ ,k max {| c k | , | c k |} (cid:17) , not as tight an upper bound as Eqn (E.12). Tosee this, we can show 2∆ ,k ( c k − c k ) ≤ ∆ (2)1 ,k + 2∆ ,k max {| c k | , | c k |} or 2∆ ,k max {| c k | , | c k |} − ,k ( c k − c k ) + ∆ (2)1 ,k ≥ k . When c k c k ≥ c k − c k < max {| c k | , | c k |} ,2∆ ,k ( c k − c k ) ≤ ,k max {| c k | , | c k |} < ,k max {| c k | , | c k |} + ∆ (2)1 ,k . 
When c k c k ≤ {| c k | , c k |} = c k , 2∆ ,k max {| c k | , | c k |} − ,k ( c k − c k ) + ∆ (2)1 ,k = 2∆ ,k c k − ,k ( c k − c k ) + ∆ (2)1 ,k = 2∆ ,k c k + ∆ (2)1 ,k .Since ∆ (2)1 ,k =max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s k − ( s (cid:48) k ) | = max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s k − s (cid:48) k | · | s k + s (cid:48) k | ≥ max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s k − s (cid:48) k | · | c k | = 2∆ ,k | c k | , ∆ (2)1 ,k − ,k | c k | = ∆ (2)1 ,k + 2∆ ,k c k ≥

0. When c k c k ≤ {| c k | , c k |} = | c k | , 2∆ ,k max {| c k | , | c k |} + ∆ (2)1 ,k − ∆ ,k ( c k − c k ) = 2∆ ,k | c k | − ,k ( c k − c k ) + ∆ (2)1 ,k = ∆ (2)1 ,k − ,k c k . Since ∆ (2)1 ,k = max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s k − ( s (cid:48) k ) | ≥ max x , x (cid:48) ,d ( x , x (cid:48) )=1 | s k − s (cid:48) k | · | c k | = 2∆ ,k c k , ∆ (2)1 ,k − ,k c k ≥

0. All taken together, 2 (cid:80) rk =1 ∆ ,k ( c k − c k ) ≤ (cid:80) rk =1 (cid:16) ∆ (2)1 ,k + 2∆ ,k max {| c k | , | c k |} (cid:17) . (cid:4) F Proof of Lemma 10
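As a numerical companion to the proof that follows, here is a minimal Python sketch that evaluates the lower bound on the scale parameter $b$ derived for the Gaussian mechanism and checks that the tail probability in Eqn (F.14) equals $\delta$ at that bound. The function names and the example values ($\epsilon = 1$, $\delta = 0.05$, sensitivity 1) are illustrative assumptions, not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def gaussian_b_lower_bound(eps, delta, sens):
    """Lower bound on b: sens * (sqrt(z^2 + 2*eps) - z) / (sqrt(2) * eps),
    where z = Phi^{-1}(delta / 2)."""
    z = NormalDist().inv_cdf(delta / 2)  # negative whenever delta < 1
    return sens * (sqrt(z * z + 2 * eps) - z) / (sqrt(2) * eps)


def tail_prob(b, eps, sens):
    """Pr(|s* - s| > eps*b^2/(2*sens) - sens/2) when s* - s ~ N(0, b^2/2)."""
    threshold = eps * b * b / (2 * sens) - sens / 2
    noise = NormalDist(0.0, b / sqrt(2))
    return 2 * noise.cdf(-threshold)


# Illustrative (assumed) privacy parameters and sensitivity.
eps, delta, sens = 1.0, 0.05, 1.0
b_min = gaussian_b_lower_bound(eps, delta, sens)

print("lower bound on b:", b_min)
print("tail probability at the bound:", tail_prob(b_min, eps, sens))
print("tail probability at 1.1 * bound:", tail_prob(1.1 * b_min, eps, sens))
```

At the bound the tail probability matches $\delta$ up to floating-point error, and any larger $b$ gives a smaller tail probability, which is why the bound is a lower bound.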

Proof.

When $r = 1$ ($s$ is a scalar), $\Delta_p \equiv \Delta_1$ for all $p \ge 1$. To satisfy $(\epsilon, \delta)$-pDP, we set

$$\Pr\left(|s^* - s| > \epsilon b^2 (2\Delta_1)^{-1} - \frac{\Delta_1}{2}\right) = 2\Phi\left(\frac{\Delta_1/2 - \epsilon b^2 (2\Delta_1)^{-1}}{b/\sqrt{2}}\right) \le \delta \tag{F.14}$$

$$\Rightarrow \Delta_1 b^{-1} - \epsilon b \Delta_1^{-1} \le \sqrt{2}\,\Phi^{-1}(\delta/2) \Rightarrow b \ge 2^{-1/2}\epsilon^{-1}\Delta_1\left(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\right).$$

Together with the requirement $b^2 - \epsilon^{-1}\Delta_1^2 > 0$,

$$b \ge \max\left\{\epsilon^{-1/2}\Delta_1,\ \left(\epsilon^{-1/2}\Delta_1\right)\frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}\right\}.$$

Since $\delta < 1$, $\Phi^{-1}(\delta/2) < 0$, so $\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2) \ge \sqrt{2\epsilon}$, and thus

$$b \ge \left(\epsilon^{-1/2}\Delta_1\right)\frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}.$$

When $r > 1$, we leverage the proof in Appendix A (page 265) in [10] and obtain

$$\left|\log\left(\frac{\Pr(\mathbf{s}^* \in Q \,|\, \mathbf{x})}{\Pr(\mathbf{s}^* \in Q \,|\, \mathbf{x}')}\right)\right| = \left|\log\left(\frac{\exp(-\|\mathbf{e}\|_2^2/b^2)}{\exp(-\|\mathbf{e} + \mathbf{d}\|_2^2/b^2)}\right)\right| = \left|b^{-2}\left(\|\mathbf{e}\|_2^2 - \|\mathbf{e} + \mathbf{d}\|_2^2\right)\right| \le \left|b^{-2}\left(2\lambda\Delta_2 + \Delta_2^2\right)\right| \le b^{-2}\left(2|\lambda|\Delta_2 + \Delta_2^2\right),$$

where $\mathbf{e} = \mathbf{s}^* - \mathbf{s}(\mathbf{x})$ and $\mathbf{d} = \mathbf{s}(\mathbf{x}) - \mathbf{s}(\mathbf{x}')$ are defined in Eqn (B.2), and $\lambda \sim N(0, b^2/2)$. To achieve $(\epsilon, \delta)$-pDP, we set

$$\Pr\left(b^{-2}\left(2|\lambda|\Delta_2 + \Delta_2^2\right) < \epsilon\right) = \Pr\left(|\lambda| < (b^2\epsilon\Delta_2^{-1} - \Delta_2)/2\right) > 1 - \delta$$

$$\Rightarrow \Pr\left(|\lambda| > (b^2\epsilon\Delta_2^{-1} - \Delta_2)/2\right) = 2\Phi\left(\frac{\Delta_2 - \epsilon b^2\Delta_2^{-1}}{\sqrt{2}\,b}\right) < \delta,$$

which has the same form as Eqn (F.14) for $r = 1$, with $\Delta_2$ in place of $\Delta_1$. Similar to the case of $r = 1$, we need $b^2\epsilon\Delta_2^{-1} - \Delta_2 > 0$, so for $r > 1$,

$$b \ge \max\left\{\epsilon^{-1/2}\Delta_2,\ \left(\epsilon^{-1/2}\Delta_2\right)\frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}\right\}.$$

Since $\delta < 1$, $\Phi^{-1}(\delta/2) < 0$, and thus

$$b \ge \left(\epsilon^{-1/2}\Delta_2\right)\frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}.$$ ∎

G Proof of Lemma 11
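Before the proof, a short Python sketch may help make the comparison concrete. It evaluates the variance ratio between the Gaussian mechanism (with $\sigma$ at its lower bound) and the Laplace mechanism, in which the sensitivity cancels, and computes the threshold $2\Phi(-\sqrt{2})$ on $\delta$ that the proof arrives at. The function name and the grid of $\epsilon$ and $\delta$ values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def variance_ratio(eps, delta):
    """Gaussian-to-Laplace variance ratio with sigma at its lower bound:
    (sqrt(z^2 + 2*eps) - z)^2 / 8, where z = Phi^{-1}(delta / 2)."""
    z = PHI.inv_cdf(delta / 2)
    a = sqrt(z * z + 2 * eps) - z
    return a * a / 8


# Threshold on delta below which the ratio exceeds 1 for every eps > 0.
delta_threshold = 2 * PHI.cdf(-sqrt(2))
print("2 * Phi(-sqrt(2)) =", delta_threshold)

# Assumed grid of privacy parameters, all with delta below the threshold.
for eps in (0.01, 0.1, 1.0, 10.0):
    for delta in (0.001, 0.01, 0.1, 0.15):
        print(eps, delta, variance_ratio(eps, delta))
```

For $\delta$ above the threshold the ratio can drop below 1 when $\epsilon$ is small; for example, `variance_ratio(0.01, 0.5)` is less than 1.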

Proof. If $\sigma$ is set at the lower bound in Eqn (18), the ratio of the variance of the Gaussian distribution in the Gaussian mechanism of $(\epsilon, \delta)$-pDP to that of the Laplace distribution in the Laplace mechanism of $\epsilon$-DP is

$$\left(\frac{(2\epsilon)^{-1}\Delta_s\left(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\right)}{\sqrt{2}\,\epsilon^{-1}\Delta_s}\right)^2 = \frac{\left(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\right)^2}{8} = \frac{(\Phi^{-1}(\delta/2))^2 + \epsilon - \Phi^{-1}(\delta/2)\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon}}{4}. \tag{G.15}$$

Since $\delta \in [0, 1]$, $\delta/2 \in [0, 0.5]$ and $\Phi^{-1}(\delta/2) \in (-\infty, 0]$. Since $\epsilon > 0$, Eqn (G.15) $> (\Phi^{-1}(\delta/2))^2/2$. Let $(\Phi^{-1}(\delta/2))^2/2 > 1$; then $\delta/2 < \Phi(-\sqrt{2})$, that is, $\delta < 2\Phi(-\sqrt{2}) \approx 0.157$. ∎

References

[1] C. Dwork, F. McSherry, K. Nissim, and A. Smith, "Calibrating noise to sensitivity in private data analysis," in Theory of Cryptography. Springer, 2006, pp. 265–284.
[2] C. Dwork, "Differential privacy: A survey of results," Theory and Applications of Models of Computation, vol. 4978, pp. 1–19, 2008.
[3] C. Dwork, "Differential privacy," in Encyclopedia of Cryptography and Security. Springer, 2011, pp. 338–340.
[4] F. McSherry and K. Talwar, "Mechanism design via differential privacy," in , 2007, pp. 94–103.
[5] F. McSherry, "Privacy integrated queries: an extensible platform for privacy-preserving data analysis," in Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. ACM, 2009, pp. 19–30.
[6] A. Roth and T. Roughgarden, "Interactive privacy via the median mechanism," in Proceedings of the 42nd ACM Symposium on Theory of Computing, June 5-8, 2010.
[7] M. Hardt, K. Ligett, and F. McSherry, "A simple and practical algorithm for differentially private data release," arXiv:1012.4763v2, 2012.
[8] A. Ghosh, T. Roughgarden, and M. Sundararajan, "Universally utility-maximizing privacy mechanisms," SIAM Journal on Computing, vol. 41, no. 6, pp. 1673–1693, 2012.
[9] Q. Geng and P. Viswanath, "The optimal noise-adding mechanism in differential privacy," IEEE Transactions on Information Theory, vol. 62, no. 2, pp. 925–951, 2016.
[10] C. Dwork and A. Roth, The Algorithmic Foundations of Differential Privacy. Now Publishers, Inc., 2014.
[11] C. Dimitrakakis, B. Nelson, A. Mitrokotsa, and B. Rubinstein, "Robust and private Bayesian inference," in Algorithmic Learning Theory ALT 2014, P. Auer, A. Clark, T. Zeugmann, and S. Zilles, Eds. Springer, Cham, 2014.
[12] C. Dwork, "Differential privacy," in Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP). Springer-Verlag ARCoSS, 2006, pp. 1–12.
[13] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor, "Our data, ourselves: privacy via distributed noise generation," in Advances in Cryptology: Proceedings of EUROCRYPT. Springer Berlin Heidelberg, 2006, pp. 485–503.
[14] A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber, "Privacy: Theory meets practice on the map," IEEE ICDE 24th International Conference, pp. 277–286, 2008.
[15] R. Hall, A. Rinaldo, and L. Wasserman, "Random differential privacy," Journal of Privacy and Confidentiality, vol. 4, no. 2, pp. 43–59, 2012.
[16] C. Dwork and G. N. Rothblum, "Concentrated differential privacy," arXiv:1603.01887v2, 2016.
[17] F. Liu, "Noninformative bounding in differential privacy and its impact on statistical properties of sanitized results in truncated and boundary-inflated-truncated Laplace mechanisms," arXiv:1607.08554, 2016.
[18] T. Steinke and J. Ullman, "Between pure and approximate differential privacy," arXiv:1501.06095v1, 2015.
[19] A.-S. Charest, "Empirical evaluation of statistical inference from differentially-private contingency tables," in Proceedings of International Conference on Privacy in Statistical Databases, 2012, pp. 257–272.
[20] M. Lichman, "UCI machine learning repository," 2013. [Online]. Available: http://archive.ics.uci.edu/ml
[21] K. Nissim, S. Raskhodnikova, and A. Smith, "Smooth sensitivity and sampling in private data analysis," Proceedings of the 39th ACM Symposium on Theory of Computing, pp. 75–84, 2007.
[22] C. Dwork and J. Lei, "Differential privacy and robust statistics," Proceedings of the 41st ACM Symposium on Theory of Computing, pp. 371–380, 2009.
[23] M. Hardt and K. Talwar, "On the geometry of differential privacy," Proceedings of the Forty-second ACM Symposium on Theory of Computing, STOC '10, pp. 705–714, 2010.
[24] M. Bun, J. Ullman, and S. Vadhan, "Fingerprinting codes and the price of approximate differential privacy," arXiv:1311.3158v2


Submitted on 19 Feb 2016 (v1), last revised 23 Dec 2017 (this version, v6)
