Generalized Gaussian Mechanism for Differential Privacy
Fang Liu ∗

Abstract
Assessment of disclosure risk is of paramount importance in the research and applications of data privacy techniques. The concept of differential privacy (DP) formalizes privacy in probabilistic terms and provides a robust concept for privacy protection without making assumptions about the background knowledge of adversaries. Practical applications of DP involve development of DP mechanisms to release results at a pre-specified privacy budget. In this paper, we generalize the widely used Laplace mechanism to the family of generalized Gaussian (GG) mechanisms based on the $l_p$ global sensitivity of statistical queries. We explore the theoretical requirement for the GG mechanism to reach DP at prespecified privacy parameters, and investigate the connections and differences between the GG mechanism and the Exponential mechanism based on the GG distribution. We also present a lower bound on the scale parameter of the Gaussian mechanism of $(\epsilon,\delta)$-probabilistic DP as a special case of the GG mechanism, and compare the statistical utility of the sanitized results in terms of the tail probability and dispersion in the Gaussian and Laplace mechanisms. Lastly, we apply the GG mechanism in 3 experiments (the mildew, Czech, and adult data), compare the accuracy of sanitized results via the $l_1$ distance and Kullback-Leibler divergence, and examine how sanitization affects the prediction power of a classifier constructed with the sanitized data in the adult experiment.

Keywords: (probabilistic) differential privacy, $l_p$ global sensitivity, privacy budget, Laplace mechanism, Gaussian mechanism

When releasing information publicly from a database or sharing data with collaborators, data collectors are always concerned about exposing sensitive personal information of individuals who contribute to the data. Even with key identifiers removed, data users may still identify a participant in a data set, for example via linkage with public information.
Differential privacy (DP) provides a strong privacy guarantee to data release without making assumptions about the background knowledge or behavior of data users [1, 2, 3]. For a given privacy budget, information released via a differentially private mechanism guarantees that no additional personal information of an individual in the data can be inferred, regardless of how much background information data users already possess about the individual. DP has spurred a great amount of work in the development of differentially private mechanisms to release results and data, including the Laplace mechanism [1], the Exponential mechanism [4, 5], the median mechanism [6], the multiplicative weights mechanism [7], the geometric mechanism [8], the staircase mechanism [9], the Gaussian mechanism [10], and applications of DP for private and secure inference in a Bayesian setting [11], among others.

∗ Fang Liu is Associate Professor in the Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556 (E-mail: [email protected]). The work is supported by the NSF Grant 1546373 and the University of Notre Dame Faculty Research Support Program Initiation Grant.

In this paper, we unify the Laplace mechanism and the Gaussian mechanism in the framework of a general family, referred to as the generalized Gaussian (GG) mechanism. The GG mechanism is based on the $l_p$ global sensitivity (GS) of queries, a generalization of the $l_1$ GS. We demonstrate the nonexistence of a scale parameter that would lead to a GG mechanism of pure $\epsilon$-DP in the case of $p \neq 1$ if the results to be released are unbounded, but suggest the GG mechanism of $(\epsilon,\delta)$-probabilistic DP (pDP) as an alternative in such cases. For bounded data we introduce the truncated GG mechanism and the boundary inflated truncated GG mechanism that satisfy pure $\epsilon$-DP.
We investigate the connections between the GG mechanism and the Exponential mechanism when the utility function in the latter is based on the Minkowski distance, and establish the relationship between the sensitivity of the utility function in the Exponential mechanism and the $l_p$ GS of queries. We then take a closer look at the Gaussian mechanism (the GG mechanism of order 2), and derive a lower bound on the scale parameter that delivers $(\epsilon,\delta)$-pDP. The bound is tighter than the bound to satisfy $(\epsilon,\delta)$-approximate DP (aDP) in the Gaussian mechanism [10], implying that less noise is injected in the sanitized results. We compare the utility of sanitized results, in terms of the tail probability and dispersion or mean squared errors (MSE), from independent applications of the Gaussian mechanism and the Laplace mechanism. Finally, we run 3 experiments on the mildew, Czech, and adult data, respectively, and sanitize the count data via the Laplace mechanism and the Gaussian mechanisms of $(\epsilon,\delta)$-pDP and $(\epsilon,\delta)$-aDP. We compare the accuracy of sanitized results in terms of the $l_1$ distance and Kullback-Leibler divergence from the original results, and examine how sanitization affects the prediction accuracy of support vector machines constructed with the sanitized data in the adult experiment.

The rest of the paper is organized as follows. Section 2 defines the $l_p$ GS and presents the GG mechanism of $(\epsilon,\delta)$-pDP, the truncated GG mechanism, and the boundary inflated truncated GG mechanism that satisfy pure $\epsilon$-DP. It also connects and differentiates between the GG mechanisms and the Exponential mechanism when the utility function in the latter is based on the Minkowski distance. Section 3 takes a close look at the Gaussian mechanism of $(\epsilon,\delta)$-pDP, and compares it with the Gaussian mechanism of $(\epsilon,\delta)$-aDP. It also compares the tail probability and the dispersion of the noises injected via the Gaussian mechanism of $(\epsilon,\delta)$-pDP and the Laplace mechanism. Section 4 presents the findings from the 3 experiments. Concluding remarks are given in Section 5.
DP was proposed and formulated in Dwork [12] and Dwork et al. [1]. A perturbation algorithm $\mathcal{R}$ gives $\epsilon$-differential privacy if for all data sets $(\mathbf{x},\mathbf{x}')$ that differ by only one individual ($d(\mathbf{x},\mathbf{x}')=1$), and all possible query results $Q \subseteq \mathcal{T}$ to query $\mathbf{s}$ ($\mathcal{T}$ denotes the output range of $\mathcal{R}$),
$$\left|\log\left(\frac{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}))\in Q)}{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}'))\in Q)}\right)\right| \le \epsilon, \tag{1}$$
where $\epsilon>0$ is the privacy budget. While $\mathbf{s}$ refers to queries about data $\mathbf{x}$ and $\mathbf{x}'$, we also use it to denote the query results (unless stated otherwise, the domain of the query results is the set of all real numbers). $d(\mathbf{x},\mathbf{x}')=1$ is often defined in two ways in the DP community: $\mathbf{x}$ and $\mathbf{x}'$ are of the same size and differ in exactly one record (row) in at least one attribute (column); or $\mathbf{x}$ is exactly the same as $\mathbf{x}'$ except that it has one less (more) record. Mathematically, Eqn (1) states that the probabilities of obtaining the same query result perturbed via $\mathcal{R}$ are roughly the same regardless of whether the query is sent to $\mathbf{x}$ or $\mathbf{x}'$. In layman's terms, DP implies that the chance an individual will be identified based on the perturbed query result is very low, since the query result would be about the same with or without the individual in the data. The degree of “roughly the same” is determined by the privacy budget $\epsilon$. The lower $\epsilon$ is, the more similar the probabilities of obtaining the same query results from $\mathbf{x}$ and $\mathbf{x}'$ are. DP provides a strong and robust privacy guarantee in the sense that it does not assume anything regarding the background knowledge or the behavior of data users.

In addition to the “pure” $\epsilon$-DP in Eqn (1), there are softer versions of DP, including the $(\epsilon,\delta)$-approximate DP (aDP) [13], the $(\epsilon,\delta)$-probabilistic DP (pDP) [14], the $(\epsilon,\delta)$-random DP (rDP) [15], and the $(\epsilon,\tau)$-concentrated DP (cDP) [16].
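To make Eqn (1) concrete, the following minimal sketch evaluates the log density ratio for a scalar query perturbed with Laplace noise of scale $b = \Delta_1/\epsilon$ (the Laplace mechanism [1], revisited later in the paper). The query values 10 and 11, the sensitivity of 1, and the evaluation grid are illustrative assumptions, not values from the paper.

```python
import math

def laplace_logpdf(y, mu, b):
    """Log density of a Laplace distribution with location mu and scale b."""
    return -math.log(2.0 * b) - abs(y - mu) / b

def max_abs_log_ratio(s, s_prime, b, grid):
    """Largest |log f(y | s) - log f(y | s')| over a grid of outputs y."""
    return max(
        abs(laplace_logpdf(y, s, b) - laplace_logpdf(y, s_prime, b))
        for y in grid
    )

epsilon = 0.5
delta_1 = 1.0            # assumed l1 sensitivity of a hypothetical count query
b = delta_1 / epsilon    # Laplace scale
grid = [i / 10.0 for i in range(-500, 501)]

# Query results on two hypothetical neighboring data sets: 10 and 11.
worst = max_abs_log_ratio(10.0, 11.0, b, grid)
print(worst <= epsilon + 1e-9)
```

In this toy setting the worst-case log ratio equals $|s - s'|/b$, so it reaches the budget $\epsilon$ exactly when the two query results differ by the full sensitivity.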
In all the relaxed versions of DP, one additional parameter is employed to characterize the amount of relaxation on top of the privacy budget $\epsilon$. Both the $(\epsilon,\delta)$-aDP and the $(\epsilon,\delta)$-pDP reduce to $\epsilon$-DP when $\delta=0$, but differ with respect to the interpretation of $\delta$. In $(\epsilon,\delta)$-aDP,
$$\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}))\in Q) \le e^{\epsilon}\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}'))\in Q)+\delta; \tag{2}$$
while a perturbation algorithm $\mathcal{R}$ satisfies $(\epsilon,\delta)$-pDP if
$$\Pr\left(\left|\log\left(\frac{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}))\in Q)}{\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}'))\in Q)}\right)\right| > \epsilon\right) \le \delta; \tag{3}$$
that is, the probability of $\mathcal{R}$ generating an output belonging to the disclosure set is bounded by $\delta$, where the disclosure set contains all the possible outputs that leak information for a given privacy budget $\epsilon$. The fact that probabilities are within $[0,1]$ puts constraints on the values of $\epsilon$, $\Pr(\mathcal{R}(\mathbf{s}(\mathbf{x}'))\in Q)$, and $\delta$ in the framework of $(\epsilon,\delta)$-aDP. By contrast, $(\epsilon,\delta)$-pDP seems to be less constrained and more intuitive with its probabilistic flavor. When $\delta$ is small, $(\epsilon,\delta)$-aDP and $(\epsilon,\delta)$-pDP are roughly the same. The $(\epsilon,\delta)$-rDP is also a probabilistic relaxation of DP, but it differs from $(\epsilon,\delta)$-pDP in that the probabilistic relaxation is with respect to data generation. In $(\epsilon,\tau)$-cDP, the privacy cost is treated as a random variable with an expectation of $\epsilon$, and the probability that the actual cost exceeds $\epsilon$ by more than $a$ is bounded by $e^{-(a/\tau)^2/2}$. The $(\epsilon,\tau)$-cDP, similar to the $(\epsilon,\delta)$-pDP, relaxes the satisfaction of DP with respect to $\mathcal{R}$ and is broader in scope.

$l_p$ global sensitivity

Definition 1.
For all $(\mathbf{x},\mathbf{x}')$ with $d(\mathbf{x},\mathbf{x}')=1$, the $l_p$-global sensitivity (GS) of query $\mathbf{s}$ is
$$\Delta_p = \max_{\mathbf{x},\mathbf{x}':\,d(\mathbf{x},\mathbf{x}')=1}\|\mathbf{s}(\mathbf{x})-\mathbf{s}(\mathbf{x}')\|_p = \max_{\mathbf{x},\mathbf{x}':\,d(\mathbf{x},\mathbf{x}')=1}\left(\sum_{k=1}^{r}|s_k(\mathbf{x})-s_k(\mathbf{x}')|^p\right)^{1/p} \text{ for integer } p>0. \tag{4}$$
In layman's terms, $\Delta_p$ is the maximum difference, measured by the Minkowski distance, in query results $\mathbf{s}$ between two neighboring data sets $\mathbf{x},\mathbf{x}'$ with $d(\mathbf{x},\mathbf{x}')=1$. The sensitivity is “global” since it is defined for all possible data sets and all possible ways that $\mathbf{x}$ and $\mathbf{x}'$ differ by one. The higher $\Delta_p$ is, the more disclosure risk there is on the individuals from releasing the original query results $\mathbf{s}$. The $l_p$ GS is a key concept in the construction of the generalized Gaussian mechanism in Section 2.

The $l_p$ GS is a generalization of the $l_1$ GS [1, 12] and the $l_2$ GS [10]. The “difference” between $\mathbf{s}(\mathbf{x})$ and $\mathbf{s}(\mathbf{x}')$ measured by $\Delta_1$ is the largest among all $\Delta_p$ for $p\ge 1$, since $\|\mathbf{s}\|_{p+a}\le\|\mathbf{s}\|_p$ for any real-valued vector $\mathbf{s}$ and $a\ge 0$. In addition, $\Delta_1$ is also the most “sensitive” measure given that its rate of change with respect to any $s_k$ is the largest among all $p\ge 1$. When $\mathbf{s}$ is a scalar, $\Delta_p=\Delta_1$ for all $p>0$. When $\mathbf{s}$ is multi-dimensional, an easy upper bound for the $l_1$ GS $\Delta_1$ is $\sum_{k=1}^{r}\Delta_{1,k}$, the sum of the $l_1$ GS of each element $k$ in $\mathbf{s}$, by the triangle inequality. Lemma 2 gives an upper bound on $\Delta_p$ for a general $p$ that includes $p=1$ as a special case (the proof is provided in Appendix A).

Lemma 2. $\left(\sum_{k=1}^{r}\Delta_{1,k}^{p}\right)^{1/p}$ is an upper bound for $\Delta_p$, where $\Delta_{1,k}$ is the $l_1$ GS of $s_k$.

The upper bound given in Lemma 2 can be conservative in cases where the change from $\mathbf{x}$ to $\mathbf{x}'$ does not necessarily alter every entry in the multidimensional $\mathbf{s}$. For example, the $l_p$ GS of releasing a histogram with $r$ bins is 1 (if $d(\mathbf{x},\mathbf{x}')=1$ is defined as $\mathbf{x}'$ having one record less/more than $\mathbf{x}$). In other words, the GS is not $r^{1/p}$ even though there are $r$ counts in the released histogram, but is the same as in releasing a single cell, because removing one record only alters the count in a single bin.

It is obvious that each element $s_k$ in $\mathbf{s}$ for $k=1,\ldots,r$ needs to be bounded to obtain a finite $\Delta_p$. The most extreme case is when the change from $\mathbf{x}$ to $\mathbf{x}'$ makes $s_k$ jump from one extreme to the other, implying that the range of $s_k$ can be used as an upper bound for $\Delta_{1,k}$, which, combined with Lemma 2, leads to the following claim.

Claim 3.
Denote the bounds of statistic $s_k$ by $[c_{k0},c_{k1}]$, both of which are finite. The GS $\Delta_{1,k}\le c_{k1}-c_{k0}$, and the GS for $\mathbf{s}=\{s_k\}_{k=1,\ldots,r}$ is $\Delta_p\le\left(\sum_{k=1}^{r}(c_{k1}-c_{k0})^p\right)^{1/p}$.

The GG mechanism is defined based on the GG distribution $\mathrm{GG}(\mu,b,p)$ with location parameter $\mu$, scale parameter $b>0$, and shape parameter $p>0$. The probability density function (pdf) is
$$f(x\mid\mu,b,p)=\frac{p}{2b\,\Gamma(p^{-1})}\exp\left\{-\left(\frac{|x-\mu|}{b}\right)^{p}\right\}.$$
The mean and variance of $x$ are $\mu$ and $b^2\,\Gamma(3/p)/\Gamma(1/p)$, respectively ($\Gamma(t)=\int_0^\infty x^{t-1}e^{-x}\,dx$ is the Gamma function). When $p=1$, the GG distribution is the Laplace distribution with mean $\mu$ and variance $2b^2$; when $p=2$, the GG distribution becomes the Gaussian distribution with mean $\mu$ and variance $b^2/2$. Figure 1 presents some examples of the GG distributions at different $p$.

Figure 1: Density of GG distributions.

All the distributions in the left plot have the same scale $b$ [...] as $p$ increases, and the Laplace distribution ($p=1$) looks very different from the rest. When the variance is the same (the right plot), the Laplace distribution is the most likely to generate values that are close to the mean, followed by the Gaussian distribution ($p=2$).

[...] $\epsilon$-DP

We first examine the GG mechanism of $\epsilon$-DP with the domain for $s^*_k$ defined on $(-\infty,\infty)$ for $k=1,\ldots,r$. $\mathbf{s}$ needs to be bounded to calculate the $l_p$ GS, but the bounding requirement does not necessarily go into formulating the GG distribution for the GG mechanism in the first place. If bounding for $\mathbf{s}^*$ is necessary, it can be incorporated in a post-hoc manner after $\mathbf{s}^*$ is generated from the GG mechanism. A well-known example is the Laplace mechanism. It employs a Laplace distribution defined on $(-\infty,\infty)$, though its scale parameter $b=\Delta_1/\epsilon$ requires $\mathbf{s}$ to be bounded for $\Delta_1$ to be calculated.

Eqn (5) presents the GG distribution from which sanitized $\mathbf{s}^*$ would be generated to satisfy $\epsilon$-DP, assuming $b$ exists.
$$f(\mathbf{s}^*)\propto\exp\left\{-\left(\|\mathbf{s}^*-\mathbf{s}\|_p/b\right)^p\right\}=\prod_{k=1}^{r}\exp\left\{-\left(|s^*_k-s_k|/b\right)^p\right\}\propto\prod_{k=1}^{r}\frac{p}{2b\,\Gamma(p^{-1})}\exp\left\{-\left(|s^*_k-s_k|/b\right)^p\right\}=\prod_{k=1}^{r}\mathrm{GG}(s_k,b,p) \tag{5}$$

Claim 4.
There does not exist a lower bound on $b$ for the GG distribution in Eqn (5) when $p\neq 1$ that generates $\mathbf{s}^*$ with $\epsilon$-DP. When $p=1$, the lower bound on $b$ that leads to $\epsilon$-DP is $\epsilon^{-1}\Delta_1$.

Appendix B lists the detailed steps that lead to Claim 4. In brief, to achieve $\epsilon$-DP, we need $b^{-p}\left(\sum_{k=1}^{r}\sum_{j=1}^{p-1}\binom{p}{j}|s^*_k-s_k|^{p-j}\Delta_{1,k}^{j}+\Delta_p^p\right)\le\epsilon$ (Eqn B.4). However, this inequality depends on the random GG noise $e_k=s^*_k-s_k$ for $k=1,\ldots,r$, the support of which is $(-\infty,\infty)^r$. In other words, there does not exist a random noise-free solution on $b$, unless $p=1$, in which case the inequality no longer involves the error terms and the GG mechanism reduces to the familiar Laplace mechanism of $\epsilon$-DP. We propose two approaches to fix the problem and achieve DP through the GG mechanism. The first approach leverages the bounding requirement for $\mathbf{s}$ and builds the requirement into the GG distribution in the first place to generate $\mathbf{s}^*$ with $\epsilon$-DP, assuming that $\mathbf{s}^*$ and $\mathbf{s}$ share the same bounded domain (Section 2.5). The second approach still uses the GG distribution in Eqn (5) to sanitize $\mathbf{s}$, only satisfying $(\epsilon,\delta)$-pDP instead of the pure $\epsilon$-DP (Section 2.6). The sanitized $\mathbf{s}^*$ can be bounded in a post-hoc manner, as needed.

2.5 [...] $\epsilon$-DP

Definition 5.
Denote the bounds on query result $\mathbf{s}$ by $[c_{k0},c_{k1}]_{k=1,\ldots,r}$. For integer $p\ge 1$, the truncated GG mechanism of order $p$ generates $\mathbf{s}^*\in[c_{k0},c_{k1}]_{k=1,\ldots,r}$ with $\epsilon$-DP by drawing from the truncated GG distribution
$$f(\mathbf{s}^*\mid c_{k0}\le s^*_k\le c_{k1},\ \forall\,k=1,\ldots,r)=\prod_{k=1}^{r}\frac{p\,\exp\left\{-\left(|s^*_k-s_k|/b\right)^p\right\}}{2b\,\Gamma(p^{-1})\,A(s_k,b,p)} \tag{6}$$
with scale parameter
$$b\ge\left(\epsilon^{-1}\left(\sum_{k=1}^{r}\sum_{j=1}^{p-1}\binom{p}{j}|c_{k1}-c_{k0}|^{p-j}\Delta_{1,k}^{j}+\Delta_p^p\right)\right)^{1/p}, \tag{7}$$
where $A(s_k,b,p)=\Pr(c_{k0}\le s^*_k\le c_{k1};\,s_k,b,p)=\left(2\Gamma(p^{-1})\right)^{-1}\left(\gamma\left[p^{-1},((c_{k1}-s_k)/b)^p\right]+\gamma\left[p^{-1},((s_k-c_{k0})/b)^p\right]\right)$ ($\gamma$ is the lower incomplete gamma function), $\Delta_{1,k}$ is the $l_1$ GS of $s_k$, and $\Delta_p$ is the $l_p$ GS of $\mathbf{s}$.

The proof of $\epsilon$-DP of the truncated GG mechanism is given in Appendix C. The truncated GG mechanism perturbs each element in $\mathbf{s}$ independently; thus Eqn (6) involves the product of $r$ independent density functions. Though the closed interval $[c_{k0},c_{k1}]$ is used to denote the bounds on $s_k$, Definition 5 remains the same regardless of whether the interval is closed, open, or half-closed, since the GG distribution is defined on a continuous domain. If $s_k$ is discrete in nature, such as counts, post-hoc rounding on perturbed $s^*_k$ can be applied. The lower bound on $b$ in Eqn (7) depends on $\Delta_p$. We may apply Lemma 2 and set $\Delta_p^p$ at its upper bound $\sum_{k=1}^{r}\Delta_{1,k}^{p}$ to obtain a less tight bound on $b$:
$$b\ge\left(\epsilon^{-1}\left(\sum_{k=1}^{r}\sum_{j=1}^{p}\binom{p}{j}|c_{k1}-c_{k0}|^{p-j}\Delta_{1,k}^{j}\right)\right)^{1/p}. \tag{8}$$

Definition 6.
Denote the bounds on query result $s_k$ by $[c_{k0},c_{k1}]$ for $k=1,\ldots,r$. For integer $p\ge 1$, the $p$th order boundary inflated truncated (BIT) GG mechanism sanitizes $\mathbf{s}$ with $\epsilon$-DP by drawing perturbed $\mathbf{s}^*$ from the following piecewise distribution
$$f(\mathbf{s}^*\mid c_{k0}\le s^*_k\le c_{k1},\ \forall\,k=1,\ldots,r)=\prod_{k=1}^{r}\left\{p_k^{\,I(s^*_k=c_{k0})}\,q_k^{\,I(s^*_k=c_{k1})}\left(\frac{p\,\exp\left\{-\left(|s^*_k-s_k|/b\right)^p\right\}}{2b\,\Gamma(p^{-1})}\right)^{I(c_{k0}<s^*_k<c_{k1})}\right\}, \tag{9}$$
where $p_k=\Pr(s^*_k<c_{k0};\,s_k,p,b)=\frac12-\gamma\left(p^{-1},((s_k-c_{k0})/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}$, $q_k=\Pr(s^*_k>c_{k1};\,s_k,p,b)=\frac12-\gamma\left(p^{-1},((c_{k1}-s_k)/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}$, $\gamma$ is the lower incomplete gamma function, and $\Gamma$ is the gamma function; $I(\cdot)$ is the indicator function that equals 1 if the argument in the parentheses is true, and 0 otherwise.

In brief, the BIT GG distribution replaces out-of-bound values with the boundary values and keeps the within-bound values as is, leading to a piecewise distribution. This is in contrast to the truncated GG distribution, which throws away out-of-bound values. The challenge with perturbing $\mathbf{s}$ directly via Eqn (9) lies in solving for a lower bound $b$ that satisfies $\epsilon$-DP from
$$\left|\log\frac{f(\mathbf{s}^*\mid c_{k0}\le s^*_k\le c_{k1},\ \forall\,k=1,\ldots,r)}{f(\mathbf{s}'^*\mid c_{k0}\le s^*_k\le c_{k1},\ \forall\,k=1,\ldots,r)}\right|\le\epsilon, \tag{10}$$
where $\mathbf{s}^*=\{s^*_k\}$ and $\mathbf{s}'^*=\{s'^*_k\}$ are the sanitized results from data $\mathbf{x}$ and $\mathbf{x}'$ with $d(\mathbf{x},\mathbf{x}')=1$, respectively. The lower bounds given in Eqns (7) and (8) can be used when the output subset $Q$ is a subset of $(c_{10},c_{11})\times\cdots\times(c_{r0},c_{r1})$ (open intervals). However, when $Q$ is $\{s^*_k=c_{k0}\ \forall\,k=1,\ldots,r\}$ and $\{s^*_k=c_{k1}\ \forall\,k=1,\ldots,r\}$, respectively, there are no analytical solutions on $b$ in either Eqn (11) or Eqn (12):
$$\left|\log\prod_{k=1}^{r}\frac{1/2-\gamma\left(p^{-1},((s_k-c_{k0})/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}}{1/2-\gamma\left(p^{-1},((s'_k-c_{k0})/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}}\right|\le\epsilon \tag{11}$$
$$\left|\log\prod_{k=1}^{r}\frac{1/2-\gamma\left(p^{-1},((c_{k1}-s_k)/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}}{1/2-\gamma\left(p^{-1},((c_{k1}-s'_k)/b)^p\right)\left(2\Gamma(p^{-1})\right)^{-1}}\right|\le\epsilon. \tag{12}$$
The most challenging situation is when $Q$ is a mixture set of $(c_{k0},c_{k1})$, $c_{k0}$, and $c_{k1}$ for different $k=1,\ldots,r$. In summary, the BIT GG mechanism is not very appealing from a practical standpoint.

2.6 GG mechanism of $(\epsilon,\delta)$-pDP

The second approach to obtain a lower bound on the scale parameter $b$ for the GG distribution in Eqn (5) when $p\ge 2$ is to find a $b$ that satisfies $(\epsilon,\delta)$-pDP.

Corollary 7.
If the scale parameter $b$ in the GG distribution in Eqn (5) satisfies
$$\Pr\left(\sum_{k=1}^{r}\sum_{j=1}^{p-1}\binom{p}{j}|s^*_k-s_k|^{p-j}\Delta_{1,k}^{j}>b^p\epsilon-\Delta_p^p\right)<\delta, \tag{13}$$
then the GG mechanism satisfies $(\epsilon,\delta)$-pDP when $p\ge 2$.

Instead of requiring the left-hand side of Eqn (B.4) to be $\le\epsilon$ with certainty (i.e., with 100% probability), we attach a probability to achieving the inequality, that is, $\Pr(\text{Eqn (B.4)}<\epsilon)>1-\delta$, leading to Eqn (13). The $(\epsilon,\delta)$-pDP does not apply to the Laplace mechanism ($p=1$), at least in the framework laid out in Corollary 7. When $p=1$, Eqn (B.1) becomes $b^{-1}\sum_{k=1}^{r}\big||e_k|-|e_k+d_k|\big|\le b^{-1}\sum_{k=1}^{r}|d_k|\le b^{-1}\Delta_1$, which does not involve the random variable $\mathbf{s}^*$; in other words, as long as $b^{-1}\Delta_1\le\epsilon$, the pure $\epsilon$-DP is guaranteed.

Corollary 7 does not list a closed-form solution on $b$, as it is likely that only numerical solutions exist in most cases. Given that $s^*_k$ is independent across $k=1,\ldots,r$, $a_k=\sum_{j=1}^{p-1}\binom{p}{j}|s^*_k-s_k|^{p-j}\Delta_{1,k}^{j}$, a function of $s^*_k$, is also independent across $k$. Therefore, the problem becomes searching for a lower bound on $b$ where the probability of a sum of $r$ independent variables $(a_1,\ldots,a_r)$ exceeding $b^p\epsilon-\Delta_p^p$ is smaller than $\delta$. If there exists a closed-form distribution function for $\sum_{k=1}^{r}a_k$, an exact solution on $b$ can be obtained. When $p=2$, an analytical lower bound $b$ can be obtained (see Section 3); when $p>2$, [...] $\binom{p}{j}|s^*_k-s_k|^{p-j}\Delta_{1,k}^{j}$, but not for $a_k$ or $\sum_{k=1}^{r}a_k$ at the current stage. A relatively simple case is when the elements of statistics $\mathbf{s}$ are calculated on disjoint subsets of the original data; removing one individual from the data then only affects one element out of $r$, so $\Delta_1=\Delta_p=\Delta_{1,k'}$, leading to Corollary 8.

Corollary 8.
When all $r$ elements in $\mathbf{s}$ are based on disjoint subsets of the data, the lower bound on $b$ satisfies $\Pr\left(\sum_{j=1}^{p}\binom{p}{j}|s^*_{k'}-s_{k'}|^{p-j}\Delta_{1,k'}^{j}>b^p\epsilon\right)<\delta$, where $k'=\arg\max_k\Delta_{1,k}$.

When the query is a histogram, $\Delta_1=\Delta_p=\Delta_{1,k'}=1$, and the lower bound $b$ for $(\epsilon,\delta)$-pDP can be derived from $\Pr\left(\sum_{j=1}^{p}\binom{p}{j}|e_{k'}|^{p-j}>b^p\epsilon\right)<\delta$. The proof of Corollary 8 is trivial. With disjoint queries, only one element in $\mathbf{s}$ is affected by changing from $\mathbf{x}$ to $\mathbf{x}'$ while the other $r-1$ elements are unchanged, $s_k(\mathbf{x})=s_k(\mathbf{x}')$, and Eqn (B.2) $=b^{-p}\sum_{j=1}^{p}\binom{p}{j}|e_{k'}|^{p-j}|d_{k'}|^{j}\le b^{-p}\sum_{j=1}^{p}\binom{p}{j}|e_{k'}|^{p-j}\Delta_{1,k'}^{j}$.

Numerical approaches can be applied to obtain a lower bound on $b$ when closed-form solutions are difficult to attain. Figure 2 depicts the lower bounds on $b$ at different $p$ and $(\epsilon,\delta)$ obtained via the Monte Carlo approach. We set $\Delta_{1,k}$ at $1$, [...], $0.05$ for $k=1,2,3$, respectively, and applied Lemma 2 to obtain an upper bound on $\Delta_p$ for a given $p$ value. As expected, the lower bound on $b$ increases with decreased $\epsilon$ (lower privacy budget) and decreased $\delta$ (reduced chance of failing the pure $\epsilon$-DP). The results also suggest $b$ increases with $p$ to maintain $(\epsilon,\delta)$-pDP in the examined scenarios.

Figure 2: Numerical lower bound on $b$ from Corollary 7 (x-axis: $p$; y-axis: lower bound for $b$; curves: $(\epsilon,\delta)=(0.1,0.01)$, $(0.5,0.01)$, $(0.1,0.05)$, $(0.5,0.05)$).

$\mathbf{s}^*$ sampled from the GG mechanism of $(\epsilon,\delta)$-pDP in Eqn (5), once $b$ is determined analytically or numerically, ranges over $(-\infty,\infty)$. To bound $\mathbf{s}^*$, it is straightforward to apply a post-processing procedure such as the truncation or the boundary inflated truncation (BIT) procedure [17]. The truncation procedure throws away the out-of-bounds values and only keeps those within bounds, while the BIT procedure sets the out-of-bounds values at the bounds. If the bounds are noninformative in the sense that they are global and do not contain any data-specific information, then neither of the two post-hoc bounding procedures will leak the original information or compromise the established $(\epsilon,\delta)$-pDP.

The exponential mechanism was introduced by McSherry and Talwar [4]. We paraphrase the original definition as follows, covering both discrete and continuous outcomes. Let $\mathcal{S}$ denote the set containing all possible outputs $\mathbf{s}^*$. The exponential mechanism releases $\mathbf{s}^*$ with probability
$$f(\mathbf{s}^*)=\exp\left(\frac{u(\mathbf{s}^*\mid\mathbf{x})\,\epsilon}{2\Delta_u}\right)\left(A(\mathbf{x})\right)^{-1} \tag{14}$$
to ensure $\epsilon$-DP. $A(\mathbf{x})$ is a normalizing constant so that $f(\mathbf{s}^*)$ sums or integrates to 1, and equals $\sum_{\mathbf{s}^*\in\mathcal{S}}\exp\left(\frac{u(\mathbf{s}^*\mid\mathbf{x})\,\epsilon}{2\Delta_u}\right)$ or $\int_{\mathbf{s}^*\in\mathcal{S}}\exp\left(\frac{u(\mathbf{s}^*\mid\mathbf{x})\,\epsilon}{2\Delta_u}\right)d\mathbf{s}^*$, depending on whether $\mathcal{S}$ is a countable/discrete sample space or a continuous set, respectively. $u$ is the utility function and assigns a “utility” score to each possible outcome $\mathbf{s}^*$ conditional on the original data $\mathbf{x}$, and $\Delta_u=\max_{\mathbf{x},\mathbf{x}':\,d(\mathbf{x},\mathbf{x}')=1,\ \mathbf{s}^*\in\mathcal{S}}|u(\mathbf{s}^*\mid\mathbf{x})-u(\mathbf{s}^*\mid\mathbf{x}')|$ is the maximum change in the utility score across all possible outputs $\mathbf{s}^*$ and all possible data sets $\mathbf{x}$ and $\mathbf{x}'$ with $d(\mathbf{x},\mathbf{x}')=1$. From a practical perspective, the scores should properly reflect the “usefulness” of $\mathbf{s}^*$. For example, “usefulness” can be measured by the similarity between the perturbed $\mathbf{s}^*$ and the original $\mathbf{s}$ if $\mathbf{s}$ is numerical. The closer $\mathbf{s}^*$ is to the original $\mathbf{s}$, the larger $u(\mathbf{s}^*\mid\mathbf{x})$ is, and the higher the probability that $\mathbf{s}^*$ will be released. The Exponential mechanism can be conservative (see Appendix D), in the sense that the actual privacy cost is lower than the nominal privacy budget $\epsilon$, or more perturbation than necessary is injected to preserve $\epsilon$-DP.
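As an illustration of Eqn (14) for a discrete output space, the sketch below releases a count from a finite candidate set with utility $u(s^*\mid\mathbf{x})=-|s^*-s|$. The candidate range, the true counts 7 and 8, and $\Delta_u=1$ are hypothetical choices made only for this example; the final check computes the realized worst-case log probability ratio between two neighboring data sets.

```python
import math
import random

def exponential_mechanism_probs(candidates, utility, epsilon, delta_u):
    """Release probabilities proportional to exp(epsilon * u / (2 * delta_u))."""
    weights = [math.exp(epsilon * utility(c) / (2.0 * delta_u)) for c in candidates]
    total = sum(weights)  # normalizing constant A(x) for a discrete output set
    return [w / total for w in weights]

def neg_abs_utility(s):
    """Utility -|s* - s|: outputs closer to the true value score higher."""
    return lambda c: -abs(c - s)

epsilon = 1.0
delta_u = 1.0                      # assumed: one record changes the count by at most 1
candidates = list(range(0, 21))    # hypothetical bounded output space

probs_x = exponential_mechanism_probs(candidates, neg_abs_utility(7), epsilon, delta_u)
probs_x_prime = exponential_mechanism_probs(candidates, neg_abs_utility(8), epsilon, delta_u)

# Draw one sanitized count for the data set whose true count is 7.
rng = random.Random(0)
s_star = rng.choices(candidates, weights=probs_x, k=1)[0]

# Realized privacy cost: largest |log ratio| over all candidate outputs.
worst_cost = max(abs(math.log(p / q)) for p, q in zip(probs_x, probs_x_prime))
print(worst_cost <= epsilon)
```

In this toy setup the realized cost stays below the nominal budget, which is one way to see the conservativeness mentioned above: the factor 2 in the exponent reserves half of the budget for the change in the normalizing constant.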
Despite the conservativeness, the Exponential mechanism is a widely used mechanism in DP owing to its generality and flexibility, as long as the utility function $u$ is properly designed.

When $u$ is defined as the negative $p$th power of the $p$th-order Minkowski distance between $\mathbf{s}^*$ and $\mathbf{s}$, that is, $u(\mathbf{s}^*\mid\mathbf{s})=-\|\mathbf{s}^*-\mathbf{s}\|_p^p$, the Exponential mechanism generates perturbed $\mathbf{s}^*$ from the GG distribution
$$f(\mathbf{s}^*\mid\mathbf{s})=\left(A(\mathbf{s})\right)^{-1}\exp\left(-\frac{\|\mathbf{s}^*-\mathbf{s}\|_p^p\,\epsilon}{2\Delta_u}\right)=\left(A(\mathbf{s})\right)^{-1}\prod_{k=1}^{r}\exp\left(-\frac{|s^*_k-s_k|^p}{2\Delta_u\epsilon^{-1}}\right)=\prod_{k=1}^{r}\mathrm{GG}(s_k,b,p) \tag{15}$$
with $A(\mathbf{s})=\left(2p^{-1}b\,\Gamma(p^{-1})\right)^r$ and $b^p=2\Delta_u\epsilon^{-1}$. The scale parameter $b$ in Eqn (15) is a function of the GS of the utility function $\Delta_u$ and the privacy budget $\epsilon$. For bounded data $s^*_k\in[c_{k0},c_{k1}]$ for $k=1,\ldots,r$, the Exponential mechanism based on the GG distribution is
$$f(\mathbf{s}^*\mid\mathbf{s}^*\in[\mathbf{c}_0,\mathbf{c}_1])=\left(A(\mathbf{s})\right)^{-1}\prod_{k=1}^{r}\left(B(s_k)\right)^{-1}\exp\left(-\frac{|s^*_k-s_k|^p}{2\Delta_u\epsilon^{-1}}\right), \tag{16}$$
where $B(s_k)=\Pr(s^*_k\in[c_{k0},c_{k1}])$ is calculated from the pdf $\mathrm{GG}(s_k,b,p)$. Compared to the truncated GG mechanism in Definition 5, the only difference in the Exponential mechanism in Eqn (16) is how the scale parameter $b$ is defined. In Definition 5, $b$ depends on the GS of $\mathbf{s}$ ($\Delta_p$), while it is a function of the GS of the utility function $u$ ($\Delta_u$) in the Exponential mechanism. Specifically, $b^p\ge 2\epsilon^{-1}\Delta_u$ in the Exponential mechanism, and the lower bound on $b$ is given in Eqn (7) in the GG mechanism. While both mechanisms lead to the satisfaction of $\epsilon$-DP, the one with a smaller $b$ is preferable at the same $\epsilon$. The magnitude of $b$ in each case depends on the bounds of $\mathbf{s}$ and the order $p$, in addition to $\Delta_u$ or $\Delta_p$. Though not a direct comparison on $b$, Lemma 9 explores the relationship between $\Delta_u$ and $\Delta_p$, with the hope of shedding light on the comparison of $b$ (the proof is in Appendix E).
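The unbounded form in Eqn (15) can be sampled with standard tools. The sketch below is one possible approach rather than the paper's implementation: it uses the fact that if $G\sim\mathrm{Gamma}(1/p,1)$ then $\mu\pm b\,G^{1/p}$ (with a random sign) follows $\mathrm{GG}(\mu,b,p)$, and it sets $b$ from $b^p=2\Delta_u\epsilon^{-1}$. The function names and the values $\Delta_u=1$, $\epsilon=1$ are hypothetical.

```python
import math
import random

def gg_scale_from_utility(delta_u, epsilon, p):
    """Scale b solving b^p = 2 * delta_u / epsilon, as in Eqn (15)."""
    return (2.0 * delta_u / epsilon) ** (1.0 / p)

def gg_variance(b, p):
    """Variance of GG(mu, b, p): b^2 * Gamma(3/p) / Gamma(1/p)."""
    return b ** 2 * math.gamma(3.0 / p) / math.gamma(1.0 / p)

def sample_gg(mu, b, p, rng):
    """One draw from GG(mu, b, p) via a Gamma(1/p, 1) variate and a random sign."""
    g = rng.gammavariate(1.0 / p, 1.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return mu + sign * b * g ** (1.0 / p)

def sanitize(s, b, p, rng):
    """Perturb each element of the query result s independently."""
    return [sample_gg(s_k, b, p, rng) for s_k in s]

rng = random.Random(1)
b = gg_scale_from_utility(delta_u=1.0, epsilon=1.0, p=2)  # hypothetical inputs
draws = [sample_gg(0.0, b, 2, rng) for _ in range(20000)]
sample_var = sum(x * x for x in draws) / len(draws)

# The empirical variance should be close to the theoretical b^2 / 2 when p = 2.
print(round(gg_variance(b, 2), 6), round(sample_var, 2))
s_star = sanitize([12.0, 30.0, 7.0], b, 2, rng)
```

For bounded data as in Eqn (16), the same sampler could be wrapped in a rejection loop that redraws until the value falls inside the bounds, which corresponds to truncation.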
Lemma 9.
Let $[c_{k0},c_{k1}]$ denote the bounds on $s_k$ for $k=1,\ldots,r$.
a) When $u=-\|\mathbf{s}^*-\mathbf{s}\|_1$, $\Delta_u\le\Delta_1$. Both the GG mechanism and the GG-distribution based Exponential mechanism reduce to the truncated Laplace mechanism with the same $b$.
b) When $u=-\|\mathbf{s}^*-\mathbf{s}\|_2^2$, $\Delta_u\le 2\sum_{k=1}^{r}\Delta_{1,k}|c_{k1}-c_{k0}|$.
c) When $u=-\|\mathbf{s}^*-\mathbf{s}\|_p^p$ for $p\ge 3$, $\Delta_u\le\sum_{k=1}^{r}\sum_{j=1}^{p}\binom{p}{j}\left(\max\{|c_{k0}|,|c_{k1}|\}\right)^{p-j}\Delta_{1,k}^{(j)}$, where $\Delta_{1,k}^{(j)}=\max_{\mathbf{x},\mathbf{x}':\,d(\mathbf{x},\mathbf{x}')=1}|(s_k(\mathbf{x}))^j-(s_k(\mathbf{x}'))^j|$ is the $l_1$ GS of $(s_k)^j$.

As a final note on the GG-distribution based Exponential mechanism, we did not use the negative Minkowski distance directly as the utility function due to a couple of potential practical difficulties with this approach. First, $\Delta_u$ can be difficult to obtain. Second, $f(\mathbf{s}^*)\propto\exp\left\{-\left(\sum_{k=1}^{r}|s^*_k-s_k|^p\right)^{1/p}\epsilon\,(2\Delta_u)^{-1}\right\}$ does not appear to be associated with any known distributions (except when $p=1$), and additional efforts are required to study the properties of $f(\mathbf{s}^*)$ and to develop an efficient algorithm to draw samples from it.

A special case of the GG mechanism is the Gaussian mechanism when $p=2$, which draws $s^*_k$ independently from a Gaussian distribution with mean $s_k$ and variance $\sigma^2=b^2/2$ for $k=1,\ldots,r$. Applying Eqn (6) with $b$ defined in Eqns (7) and (8), we can obtain the truncated Gaussian mechanism of $\epsilon$-DP for bounded $\mathbf{s}\in[c_{10},c_{11}]\times\cdots\times[c_{r0},c_{r1}]$,
$$f(\mathbf{s}^*\mid\mathbf{s})=\prod_{k=1}^{r}\left\{\left(\Phi(c_{k1};\mu,\sigma^2)-\Phi(c_{k0};\mu,\sigma^2)\right)^{-1}\phi(s^*_k;\,\mu=s_k,\sigma^2=b^2/2)\right\}, \tag{17}$$
where $b^2\ge\epsilon^{-1}\left(2\sum_{k=1}^{r}|c_{k1}-c_{k0}|\Delta_{1,k}+\Delta_2^2\right)$ from Eqn (7), or the less tight $b^2\ge\epsilon^{-1}\sum_{k=1}^{r}\left(2|c_{k1}-c_{k0}|\Delta_{1,k}+\Delta_{1,k}^2\right)$ from Eqn (8), and $\phi$ and $\Phi$ are the pdf and the CDF of the Gaussian distribution, respectively.

An analytical solution on the lower bound of $b$ for the Gaussian mechanism of $(\epsilon,\delta)$-pDP is provided in Lemma 10 (the proof is provided in Appendix F).

Lemma 10.
The lower bound on the scale parameter b from the Gaussian mechanism of ( (cid:15), δ )-pDP is b ≥ − / (cid:15) − ∆ (cid:16)(cid:112) (Φ − ( δ/ + 2 (cid:15) − Φ − ( δ/ (cid:17) .9iven the relationship between b and the standard deviation of the Gaussian distribution σ = b/ √
2, the lower bound can also be expressed in σ , σ ≥ (2 (cid:15) ) − ∆ (cid:16)(cid:112) (Φ − ( δ/ + 2 (cid:15) − Φ − ( δ/ (cid:17) . (18)The pDP lower bound given in Eqn (18) is different from the lower bound σ > (cid:15) − ∆ c, with (cid:15) ∈ (0 ,
1) and c > . /δ ) . (19)in Dwork and Roth [10] for ( (cid:15), δ )-aDP (Eqn (2)). The pDP bound in Eqn (18) is tighter thanthe aDP bound in Eqn (19) for the same set of ( (cid:15), δ ) (note the interpretation of δ in pDP andaDP is different, but the DP guarantee is roughly the same when δ is small). In addition, thepDP bound does not constrain (cid:15) to be < (cid:15) ∈ (0 ,
1) and δ ∈ (0 , . < (cid:15), δ ). The smaller (cid:15) is, or the larger δ is, the smaller the ratio is and the larger the difference is between the two bounds. R a t i o . . . . . d R a t i o on l o w e r bound ( p D P vs a D P ) e Figure 3: Comparison of pDP lower bound (Eqn 18) vs. aDP bound (Eqn 19) on σ in the Gaussianmechanism for (cid:15) < (the aDP bound requires (cid:15) < ) Dwork and Roth [10] list several advantages of the Gaussian noises, such as the Gaussiannoise is a “familiar” type of noise as many noise sources in real life can be well approximated byGaussian distributions; the sum of Gaussian variable is still a Gaussian; and finally, in the caseof multiple queries or when δ is small, the pure-DP guarantee in the Laplace mechanism and thepDP guarantee in the Gaussian mechanism see minimal difference. A theoretical disadvantageto Gaussian noise is that it does not guarantee DP in some cases (e.g., Report Noisy Max)[10].We investigate the accuracy of s ∗ by examining the tail probability and the dispersion of thenoises injected via the (cid:15) -DP Laplace mechanism and the ( (cid:15), δ )-pDP Gaussian mechanism. Denotethe noise drawn from the Laplace distribution by e and that from the Gaussian distribution by e .The location parameters of both are µ = 0; the tail probability p = Pr( e > | t | ) = exp( −| t | (cid:15)/ ∆ )in the Laplace distribution and p = Pr( e > | t | ) = 2Φ( −| t | /σ ) in the Gaussian distribution,where σ is given in Eqn (18). Since the CDF Φ() does not have a close-formed expression,we examine several numerical examples to compare p and p (Figure 4). We set (cid:15) to be thesame (0.1, 1, 2, respectively) between the two mechanisms and examine δ = (1% , , , (cid:15), δ )-pDP Gaussian mechanism. If the ratio p : p is <
1, it implies that the Laplace mechanism is less likely to generate extreme $s^*$ than the Gaussian mechanism at the same privacy specification of $\epsilon$. We should focus on the meaningful cases where a noise of magnitude $|t|$ has at least a non-ignorable chance to occur in either mechanism. We used a small fixed cutoff on the tail probabilities; that is, a case is regarded as meaningful if either $p_1$ or $p_2$ exceeds the cutoff (other cutoffs can be used, depending on how "unlikely" is defined). It is interesting to observe that, after starting at 1 when $|t| = 0$, the ratio decreases until it hits a minimum and then bounces back, with some cases eventually exceeding 1 at some value of $|t|$, depending on the privacy parameter specification. The smaller $\epsilon$ or $\delta$ is, the longer it takes for the bounce-back to occur. The observation suggests that the Laplace mechanism is in some cases more likely to generate sanitized results $s^*$ that are far away from $s$.

Figure 4: Ratio of the tail probabilities $p_1 : p_2$ against $|t|$ for $(\epsilon, \delta)$ with $\epsilon \in \{0.1, 1, 2\}$ and $\delta \in \{0.01, 0.05, 0.1, 0.2\}$ (the gray curves represent the unlikely cases where both $p_1$ and $p_2$ fall below the cutoff).

We also compare the privacy parameter $\epsilon$ between the two mechanisms when both have the same tail probability. Figure 5 shows the calculated $\epsilon_2$ value associated with the Gaussian mechanism of $(\epsilon_2, \delta)$-pDP for a given $\delta$ that yields $\Pr(|e_2| < |t|) = \Pr(|e_1| < |t|)$ with the Laplace mechanism of $\epsilon_1$-DP. If the ratio $\epsilon_2 : \epsilon_1$ is $< 1$ for a given $|t|$ and a small and somewhat ignorable $\delta$, it implies that the same tail probability can be achieved at a lower privacy cost with the Gaussian mechanism than with the Laplace mechanism.
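As a rough numerical companion to the two comparisons above, the following Python sketch computes the tail probabilities $p_1$ and $p_2$, their ratio, and the matched-tail $\epsilon_2$. It assumes the pDP scale takes the form $\sigma = (2\epsilon)^{-1}\Delta\big(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\big)$ obtained in the proof of Lemma 10 (Appendix F); the function names, the bisection bracket, and the default sensitivity of 1 are illustrative choices rather than part of the paper.

```python
import math
from statistics import NormalDist


def sigma_pdp(eps, delta, sens=1.0):
    """Gaussian sd at the (eps, delta)-pDP lower bound (form assumed from Appendix F)."""
    z = NormalDist().inv_cdf(delta / 2.0)  # negative whenever delta < 1
    return sens / (2.0 * eps) * (math.sqrt(z * z + 2.0 * eps) - z)


def laplace_tail(t, eps, sens=1.0):
    """p1 = Pr(|e1| > |t|) for Laplace noise with scale sens / eps."""
    return math.exp(-abs(t) * eps / sens)


def gaussian_tail(t, eps, delta, sens=1.0):
    """p2 = Pr(|e2| > |t|) = 2 * Phi(-|t| / sigma); erfc keeps far tails accurate."""
    return math.erfc(abs(t) / (sigma_pdp(eps, delta, sens) * math.sqrt(2.0)))


def tail_ratio(t, eps, delta, sens=1.0):
    """Ratio p1 : p2 at the same eps."""
    return laplace_tail(t, eps, sens) / gaussian_tail(t, eps, delta, sens)


def matched_epsilon(t, eps1, delta, sens=1.0, lo=1e-9, hi=1e6, iters=200):
    """Find eps2 whose Gaussian tail at |t| equals the Laplace tail under eps1.

    The Gaussian tail decreases in eps2, so bisection on the log scale suffices.
    The bracket [lo, hi] is an arbitrary illustrative choice.
    """
    target = laplace_tail(t, eps1, sens)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if gaussian_tail(t, mid, delta, sens) > target:
            lo = mid  # tail still too heavy, so a larger eps2 is needed
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    for t in (1.0, 3.0, 10.0, 20.0):
        print(t, tail_ratio(t, 1.0, 0.05), matched_epsilon(t, 1.0, 0.05))
```

Under these assumptions the ratio starts at 1 for $|t| = 0$, dips below 1 for moderate $|t|$, and exceeds 1 again far in the tail, which is the qualitative pattern described for Figure 4.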
Figure 5 suggests that, at the same $|t|$, the more relaxation of the pure $\epsilon$-DP is allowed (i.e., the larger $\delta$ is), the smaller $\epsilon_2$ is (relative to the baseline $\epsilon_1$), which is expected as $\epsilon$ and $\delta$ together determine the noise released in the Gaussian mechanism.

Figure 5: Relative privacy cost $\epsilon_2 : \epsilon_1$ against $|t|$ (the gray curves represent the unlikely cases where both $p_1$ and $p_2$ fall below the cutoff).

Lemma 11 presents the precision comparison of $s^*$ between the Laplace mechanism of $\epsilon$-DP and the Gaussian mechanism of $(\epsilon, \delta)$-pDP. With the same location parameter in the Laplace and Gaussian distributions, a larger precision is equivalent to a smaller mean squared error (MSE).

Lemma 11. Between the Gaussian mechanism of $(\epsilon, \delta)$-pDP and the Laplace mechanism of $\epsilon$-DP for sanitizing a statistic $s$, when $\delta < 2\Phi(-\sqrt{2}) \approx 0.157$, the variance of $s^*$ released by the Gaussian mechanism of $(\epsilon, \delta < 0.157)$-pDP is always greater than the variance of $s^*$ released by the Laplace mechanism of $\epsilon$-DP.

In other words, if multiple sets of $s^*$ are released via the Gaussian and the Laplace mechanisms respectively, the former sets would have a wider spread than the latter. Since $(\epsilon, \delta)$-pDP provides less privacy protection than pure $\epsilon$-DP, together with the larger MSE, it can be argued that the Laplace mechanism is superior to the Gaussian mechanism (which is also reflected in the 3 experiments in Section 4). It should be noted that $\delta < 0.157$ in Lemma 11 is a sufficient but not necessary condition. In other words, the Gaussian mechanism may not be less dispersed than the Laplace mechanism when $\delta \ge 0.157$. Since $\delta$ needs to be small to provide sufficient privacy protection in the setting of $(\epsilon, \delta)$-pDP, it is very unlikely to have $\delta > 0.157$ in practical applications. Also note that the setting explored in Lemma 11, where the focus is on the precision (dispersion) of a single perturbed statistic given the specified privacy parameters and the original statistic when the sample size of a data set is public, is different from the recent work on the bounds of sample complexity (the required sample size) to reach a certain level of statistical accuracy in perturbed results with $\epsilon$-DP or $(\epsilon, \delta)$-aDP [18] (more discussion is provided in Section 5 on this point).
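The threshold in Lemma 11 can be checked numerically. The sketch below assumes the variance ratio derived in Appendix G, $\big(\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)\big)^2 / 8$, in which the sensitivity cancels; the function and constant names are hypothetical.

```python
import math
from statistics import NormalDist

# 2 * Phi(-sqrt(2)) equals erfc(1), since 2 * Phi(-a) = erfc(a / sqrt(2)).
DELTA_THRESHOLD = math.erfc(1.0)


def variance_ratio(eps, delta):
    """Var(Gaussian pDP noise at its lower bound) / Var(Laplace noise at scale sens / eps)."""
    z = NormalDist().inv_cdf(delta / 2.0)
    return (math.sqrt(z * z + 2.0 * eps) - z) ** 2 / 8.0


if __name__ == "__main__":
    print("threshold:", DELTA_THRESHOLD)
    for delta in (0.01, 0.05, 0.15, 0.3):
        print(delta, [variance_ratio(eps, delta) for eps in (1e-6, 0.1, 1.0, 5.0)])
```

For $\delta$ below the threshold the ratio stays above 1 however small $\epsilon$ is; for a larger $\delta$ such as 0.3 it falls below 1 at small $\epsilon$ but can still exceed 1 at larger $\epsilon$, which illustrates why the condition is sufficient but not necessary.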
We run three experiments on the mildew data set, the Czech data set, and the Census Income data set (a.k.a. the adult data). The mildew data contains information on parental alleles at 6 loci on the chromosome for 70 strands of barley powdery mildew [19]. Each locus has two levels, yielding a very sparse 6-way cross-tabulation (22 cells out of the 64 are non-empty, with low frequencies in many other cells). The Czech data contains data collected on 6 potential risk factors for coronary thrombosis for 1841 workers in a Czechoslovakian car factory [19]. Each risk factor has 2 levels (Y or N). The cross-tabulation is also 6-way with 64 cells, the same as the mildew data, but the table is not as sparse given the large $n$ (only one empty cell). The adult data was extracted from the 1994 US Census database to yield a set of reasonably clean records that satisfy a set of conditions [20]. The data set is often used to test classifiers by predicting whether a person makes over 50K a year. We used only the completers in the adult data (with no missing values on the attributes) and then split them into 2/3 training (20009 subjects) and 1/3 testing (10005 subjects).

In each experiment, we run the Laplace mechanism of $\epsilon$-DP, the Gaussian mechanism of $(\epsilon, \delta)$-pDP presented in Section 3, and the Gaussian mechanism of $(\epsilon, \delta)$-aDP [10] to sanitize count data. We examined several values of $\epsilon$ (0.5, 1, and 2 in the mildew panels; 0.1, 0.5, and 1 in the Czech panels), each crossed with four values of $\delta$. To examine the variation of the noises, we run 500 repeats and computed the means and standard deviations of the $l_1$ distances between the sanitized and the original counts and of the Kullback-Leibler (KL) divergence between the empirical distributions of the synthetic data and the original data over the 500 repeats. In addition, we tested the GG mechanism of order 3 ($p = 3$) in the mildew data, and compared the classification accuracy of the income outcome in the testing data set in the adult experiment based on the support vector machines (SVMs) trained with the original training data and the sanitized training data, respectively. The KL divergence was calculated using the KL.Dirichlet command in R package entropy, which computes a Bayesian estimate of the KL divergence. The SVMs were trained using the svm command in R package e1071. In all experiments, $\Delta_p = 1$ for all $p$ since the released query is a histogram and the bin counts are based on disjoint subsets of data. The scale parameters of the Laplace mechanism and the Gaussian mechanisms were obtained analytically ($\Delta_1\epsilon^{-1}$, Eqns (18) and (19), respectively); the grid search and the MC approach were applied to obtain the lower bound $b$ for GGM-3 via Corollary 8. In the mildew and Czech experiments, we sanitized all bins in the histograms, including the empty bins, assuming all combinations of the 6 attributes in each case are practically meaningful (in other words, the empty cells are sample zeros rather than population zeros). In the adult data, there are 14 attributes and a very large number of bins in the 14-attribute histogram, a non-ignorable portion of which do not make any practical sense (e.g., a 90-year-old works > 80 hours per week). For simplicity, we only sanitized the 17,985 nonempty cells in the training data. After the sanitization, we set the out-of-bounds synthetic counts $< 0$ at 0 and those $> n$ at $n$, respectively, and normalized the sanitized counts to sum up to the original sample size $n$ in all 3 experiments, assuming $n$ itself is public or does not carry privacy information.

Figure 6: sanitized vs. original cell counts in the mildew data (panels by $\epsilon$ = 0.5, 1, 2 and by $\delta$; mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP, GGM3-pDP).

Figure 7: $l_1$ distance and KL divergence between sanitized and original counts in the mildew data (mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP, GGM3-pDP).

Figure 8: sanitized vs. original cell counts in the Czech data (panels by $\epsilon$ = 0.1, 0.5, 1 and by $\delta$; mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP).

Figure 9: $l_1$ distance and KL divergence between sanitized and original counts in the Czech data (mechanisms: Laplace-DP, Gaussian-pDP, Gaussian-aDP).

Figure 10: sanitized vs. original cell counts in the adult data
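The sanitization steps described above (add noise to each bin, truncate to $[0, n]$, renormalize to $n$) can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper's experiments were run in R, the function names are hypothetical, the pDP scale uses the form derived in Appendix F, and the aDP scale uses the Dwork and Roth constant $\sqrt{2\ln(1.25/\delta)}$, which requires $\epsilon < 1$.

```python
import math
import random
from statistics import NormalDist


def laplace_noise(scale, rng):
    """The difference of two iid Exp(1) draws is a standard Laplace draw."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def sanitize_counts(counts, eps, delta=None, mechanism="laplace", sens=1.0, seed=None):
    """Perturb histogram counts, truncate to [0, n], and rescale to sum to n."""
    rng = random.Random(seed)
    n = sum(counts)
    if mechanism == "laplace":
        noisy = [c + laplace_noise(sens / eps, rng) for c in counts]
    elif mechanism == "gaussian_pdp":
        z = NormalDist().inv_cdf(delta / 2.0)
        sigma = sens / (2.0 * eps) * (math.sqrt(z * z + 2.0 * eps) - z)
        noisy = [c + rng.gauss(0.0, sigma) for c in counts]
    elif mechanism == "gaussian_adp":
        # Dwork-Roth scale; their result is stated for eps in (0, 1).
        sigma = sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
        noisy = [c + rng.gauss(0.0, sigma) for c in counts]
    else:
        raise ValueError("unknown mechanism: " + mechanism)
    clipped = [min(max(x, 0.0), float(n)) for x in noisy]
    total = sum(clipped)
    if total == 0.0:
        # Degenerate case (all bins clipped to zero): spread n evenly.
        return [n / len(counts)] * len(counts)
    return [x * n / total for x in clipped]


if __name__ == "__main__":
    original = [10, 0, 5, 85]
    print(sanitize_counts(original, eps=1.0, seed=1))
    print(sanitize_counts(original, eps=0.5, delta=0.05, mechanism="gaussian_pdp", seed=1))
```

The truncation and rescaling are post-processing of the released counts, consistent with the assumption stated in the text that $n$ itself is public.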
The results are given in Figures 6 to 12. In Figures 6, 8 and 10, the closer the points are to the identity line, the more similar the original and sanitized counts are. The Laplace sanitizer is the obvious winner in all 3 cases, producing the sanitized counts closest to the original with the smallest $l_1$ error and KL divergence, followed by the Gaussian mechanism of $(\epsilon, \delta)$-pDP, and GGM3 of $(\epsilon, \delta)$-pDP in the mildew data; the Gaussian mechanism of $(\epsilon, \delta)$-aDP is the worst. In the mildew experiment, the performance of the Gaussian mechanism of $(\epsilon, \delta)$-pDP is similar when $\epsilon = 2$ or $\delta \ge 0.1$. The $l_1$ error and the KL divergence seem to decrease more or less linearly as $\epsilon$ increases from 0.5 to 1 to 2, while $\delta$ seems to have a less profound impact on the $l_1$ error and the KL divergence. In the Czech experiment, the sanitized counts approach the original counts more quickly than in the mildew case with increased $\epsilon$ and $\delta$, but there is significantly more variability for small $\epsilon$ (0.1); and the $l_1$ error and the KL divergence no longer decrease in a linear fashion: the drop is drastic at the small values of $\epsilon$, in contrast to the change from $\epsilon = 1$ to 2. The differences in the results between the mildew and the Czech experiments can be explained by the larger $n$ in the latter. In the adult experiment, Figure 12 suggests that the prediction accuracy via the SVMs built on sanitized data is barely affected compared to the original accuracy, regardless of the mechanism. There are some decreases in the accuracy rates from the original, but they are largely ignorable (on the scale of 0.25% to 1%), even with the variation taken into account. In addition, the Gaussian mechanism of $(\epsilon, \delta)$-aDP, though the worst in preserving the original counts as measured by the $l_1$ distance and KL divergence, is no worse than the other two mechanisms in prediction.

Figure 11: $l_1$ distance and KL divergence between sanitized and original counts in the adult data (mechanisms: Laplace, Gaussian-pDP, Gaussian-aDP).

Figure 12: Prediction accuracy in testing data via SVMs trained on sanitized and original data in the adult data (panels against $\delta$: accuracy rate, Pr(predicted > 50K | > 50K), and Pr(predicted <= 50K | <= 50K); mechanisms: Laplace, Gaussian-pDP, Gaussian-aDP; the original rates are shown for reference).
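The two accuracy measures reported in the experiments can be sketched as below. The $l_1$ distance is computed directly on the counts. The KL divergence here is a plug-in estimate on pseudo-count-smoothed frequencies, written in the spirit of a Dirichlet-prior Bayesian estimate; the pseudo-count of 0.5 and the direction (original relative to sanitized) are assumptions of this sketch, not settings reported in the paper.

```python
import math


def l1_distance(original, sanitized):
    """Sum of absolute differences between original and sanitized counts."""
    return sum(abs(o - s) for o, s in zip(original, sanitized))


def kl_divergence(original, sanitized, pseudo=0.5):
    """KL(original || sanitized) on frequencies smoothed by an additive pseudo-count.

    The smoothing keeps empty bins from producing log(0) or division by zero.
    """
    k = len(original)
    n_o = sum(original) + pseudo * k
    n_s = sum(sanitized) + pseudo * k
    kl = 0.0
    for o, s in zip(original, sanitized):
        p = (o + pseudo) / n_o
        q = (s + pseudo) / n_s
        kl += p * math.log(p / q)
    return kl


if __name__ == "__main__":
    original = [10, 0, 5, 85]
    sanitized = [12.3, 0.0, 3.1, 84.6]
    print(l1_distance(original, sanitized), kl_divergence(original, sanitized))
```

Averaging these two quantities over repeated sanitizations, as in the 500 repeats described in Section 4, would give the means and standard deviations summarized in the figures.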
We introduced a new concept of the $l_p$ GS, and unified the Laplace mechanism and the Gaussian mechanism in the family of the GG mechanism. For bounded data, we discussed the truncated and the BIT GG mechanisms to achieve $\epsilon$-DP. We also proposed $(\epsilon, \delta)$-pDP as an alternative paradigm to the pure $\epsilon$-DP for the GG mechanism of order $p \ge 2$. We showed the connections and distinctions between the GG mechanism and the Exponential mechanism when the utility function is defined as the negative $p$-th power of the Minkowski distance between the original and sanitized results. We also presented the Gaussian mechanism as an example of the GG mechanism and derived a lower bound for the scale parameter of the associated Gaussian distribution to achieve $(\epsilon, \delta)$-pDP. The bound is tighter than the lower bound for the Gaussian mechanism of $(\epsilon, \delta)$-aDP. We compared the tail probability and the dispersion of the noise generated via the Gaussian mechanism of $(\epsilon, \delta)$-pDP and the Laplace mechanism. We finally applied the Gaussian mechanisms of $(\epsilon, \delta)$-pDP and $(\epsilon, \delta)$-aDP and the Laplace mechanism of $\epsilon$-DP to three real-life data sets.

The GG mechanism is based on the $l_p$ "global" sensitivity of query results in the sense that the sensitivity is independent of any specific data. Though the employment of the GS is robust in terms of privacy protection, it could result in a large amount of noise being injected into query results. There is work that allows the sensitivity of a query to vary with data ("local" sensitivity) [21, 22] with the purpose of increasing the accuracy of sanitized results. How to develop the GG mechanism in the context of local sensitivity is a topic for future investigation.

The setting for the examination of the tail probability and dispersion in Section 3 is different from, though related to, the work on upper and lower bounds on sample complexity: the required sample size $n$ to reach a certain level of accuracy $\alpha$ and privacy guarantee $(\epsilon, \delta)$ for count queries [18, 23, 24]. $\alpha$ often refers to the accuracy of perturbed results in the DP literature, such as the worst-case accuracy $L_\infty$ or average accuracy $L_1$, and might also refer to the tail probability and the MSE of released data, among others.
A differential privacy mechanism is characterized by $\epsilon$ (and $\delta$) for the privacy guarantee, $\alpha$ to measure information preservation and utility of sanitized results, and the sample size $n$ of the original data. The existing work on sample complexity focuses on bounding $n$ given $\epsilon$ (and $\delta$) and $\alpha$, while the results in Section 3 focus on the accuracy and precision of sanitized results given $\epsilon$ (and $\delta$) and $n$. If the bias of perturbed results (relative to the original results) is the same between the two mechanisms, a larger precision is equivalent to a smaller MSE.

Appendix

A Proof of Lemma 2

\[ \Delta_p = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} \Big(\sum_{k=1}^r |s_k(\mathbf{x}) - s_k(\mathbf{x}')|^p\Big)^{1/p} = \Big(\max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} \sum_{k=1}^r |s_k(\mathbf{x}) - s_k(\mathbf{x}')|^p\Big)^{1/p}. \]
Since
\[ \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} \sum_{k=1}^r |s_k(\mathbf{x}) - s_k(\mathbf{x}')|^p \le \sum_{k=1}^r \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k(\mathbf{x}) - s_k(\mathbf{x}')|^p = \sum_{k=1}^r \Big(\max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k(\mathbf{x}) - s_k(\mathbf{x}')|\Big)^p = \sum_{k=1}^r \Delta_{1,k}^p, \]
$\big(\sum_{k=1}^r \Delta_{1,k}^p\big)^{1/p}$ is an upper bound for $\Delta_p$. $\blacksquare$

B Proof of Claim 4

\begin{align*}
\left|\log\frac{\Pr(\mathbf{s}^* \in Q \mid \mathbf{x})}{\Pr(\mathbf{s}^* \in Q \mid \mathbf{x}')}\right|
&= \left|\log\frac{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x})\|_p^p\big)}{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x}')\|_p^p\big)}\right|
= b^{-p}\,\Big|\,\|\mathbf{s}^* - \mathbf{s}(\mathbf{x})\|_p^p - \|\mathbf{s}^* - \mathbf{s}(\mathbf{x}')\|_p^p\,\Big| \\
&= b^{-p}\,\Big|\sum_{k=1}^r \big(|s_k^* - s_k(\mathbf{x})|^p - |s_k^* - s_k(\mathbf{x}')|^p\big)\Big|
\le b^{-p}\sum_{k=1}^r \Big|\,|s_k^* - s_k(\mathbf{x})|^p - |s_k^* - s_k(\mathbf{x}')|^p\,\Big| \\
&= b^{-p}\sum_{k=1}^r \Big|\,|e_k|^p - |e_k + d_k|^p\,\Big|, \tag{B.1}
\end{align*}
where $e_k = s_k^* - s_k(\mathbf{x})$ and $d_k = s_k(\mathbf{x}) - s_k(\mathbf{x}')$. For integers $p \ge 1$,
\begin{align*}
\text{(B.1)} &= b^{-p}\sum_{k=1}^r \Big|\,|e_k^p| - |(e_k + d_k)^p|\,\Big|
\le b^{-p}\sum_{k=1}^r \big|e_k^p - (e_k + d_k)^p\big| \quad \text{(by the reverse triangle inequality)} \\
&= b^{-p}\sum_{k=1}^r \Big|\sum_{j=1}^p \binom{p}{j} e_k^{p-j} d_k^j\Big|
\le b^{-p}\sum_{k=1}^r \sum_{j=1}^p \binom{p}{j} |e_k|^{p-j} |d_k|^j \tag{B.2} \\
&= b^{-p}\Big(p\sum_{k=1}^r |e_k|^{p-1}|d_k| + \tfrac{(p-1)p}{2}\sum_{k=1}^r |e_k|^{p-2}|d_k|^2 + \cdots + \tfrac{(p-1)p}{2}\sum_{k=1}^r |e_k|^2|d_k|^{p-2} + p\sum_{k=1}^r |e_k|\,|d_k|^{p-1} + \sum_{k=1}^r |d_k|^p\Big) \\
&\le b^{-p}\Big(p\sum_{k=1}^r |e_k|^{p-1}\Delta_{1,k} + \tfrac{(p-1)p}{2}\sum_{k=1}^r |e_k|^{p-2}\Delta_{1,k}^2 + \cdots + \tfrac{(p-1)p}{2}\sum_{k=1}^r |e_k|^2\Delta_{1,k}^{p-2} + p\sum_{k=1}^r |e_k|\Delta_{1,k}^{p-1} + \Delta_p^p\Big), \tag{B.3}
\end{align*}
where $\Delta_{1,k}$ is the $l_1$ GS of $s_k$ and $\Delta_p$ is the $l_p$ GS of $\mathbf{s}$. To achieve $\epsilon$-DP, Eqn (B.3) needs to be $\le \epsilon$; that is,
\[ \Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |e_k|^{p-j} \Delta_{1,k}^j \le b^p \epsilon. \tag{B.4} \]
A less tight bound can be obtained by applying Lemma 2 ($\Delta_p^p \le \sum_{k=1}^r \Delta_{1,k}^p$); thus
\[ \sum_{k=1}^r \sum_{j=1}^{p} \binom{p}{j} |e_k|^{p-j} \Delta_{1,k}^j \le b^p \epsilon. \tag{B.5} \]
The inequalities in Eqns (B.4) and (B.5) suggest that the lower bound on $b$ depends on the random GG noise $e_k = s_k^* - s_k$ for $k = 1, \ldots, r$, the support of which is $(-\infty, \infty)^r$. In other words, there does not exist a random noise-free solution on $b$, unless $p = 1$, in which case the inequality no longer involves the error terms and the GG mechanism reduces to the familiar Laplace mechanism of $\epsilon$-DP, leading to Claim 4. When $p = 1$, Eqn (B.1) $\le b^{-1}\sum_{k=1}^r |d_k| \le b^{-1}\Delta_1 \le \epsilon$, and thus $b \ge \Delta_1\epsilon^{-1}$. $\blacksquare$

C Proof of $\epsilon$-DP of the truncated GG mechanism in Definition 5
To satisfy $\epsilon$-DP, we need
\begin{align*}
&\left|\log\frac{\Pr(\mathbf{s}^* \in Q \mid \mathbf{x}, \mathbf{s}^* \in [c_{10}, c_{11}] \times \cdots \times [c_{r0}, c_{r1}])}{\Pr(\mathbf{s}^* \in Q \mid \mathbf{x}', \mathbf{s}^* \in [c_{10}, c_{11}] \times \cdots \times [c_{r0}, c_{r1}])}\right| \\
&= \left|\log\left(\frac{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x})\|_p^p\big)}{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k, b, p)} \times \frac{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k', b, p)}{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x}')\|_p^p\big)}\right)\right| \\
&= \left|\log\frac{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x})\|_p^p\big)}{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x}')\|_p^p\big)} + \log\frac{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k', b, p)}{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k, b, p)}\right| \\
&\le \left|\log\frac{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x})\|_p^p\big)}{\exp\big(-b^{-p}\|\mathbf{s}^* - \mathbf{s}(\mathbf{x}')\|_p^p\big)}\right| \tag{C.6} \\
&\quad + \left|\log\frac{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k', b, p)}{\prod_{k=1}^r \Pr(c_{k0} \le s_k^* \le c_{k1}; s_k, b, p)}\right| \le \epsilon. \tag{C.7}
\end{align*}
If each of the two terms in Eqns (C.6) and (C.7) is at most $\epsilon/2$, their sum is at most $\epsilon$. By Eqn (B.4) with $\epsilon$ replaced by $\epsilon/2$, the term in Eqn (C.6) is at most $\epsilon/2$ when
\[ b^p(\epsilon/2) \ge \Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |s_k^* - s_k|^{p-j} \Delta_{1,k}^j. \]
Since $s_k^*$ is bounded within $[c_{k0}, c_{k1}]$ for $k = 1, \ldots, r$, $|s_k^* - s_k| \le |c_{k1} - c_{k0}|$. Setting
\[ b^p(\epsilon/2) \ge \Delta_p^p + \sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^j \]
therefore bounds the term in Eqn (C.6) by $\epsilon/2$ at every $\mathbf{s}^*$ in the truncation region. The term in Eqn (C.7) is bounded by $\epsilon/2$ under the same condition, because each truncation probability is the integral of the corresponding untruncated density over that same region. Hence the condition ensures the truncated GG mechanism is of $\epsilon$-DP; or equivalently,
\[ b^p \ge 2\epsilon^{-1}\Big(\sum_{k=1}^r \sum_{j=1}^{p-1} \binom{p}{j} |c_{k1} - c_{k0}|^{p-j} \Delta_{1,k}^j + \Delta_p^p\Big) \]
ensures that the truncated GG mechanism is of $\epsilon$-DP. $\blacksquare$

D Conservativeness of Exponential mechanism

Corollary 12.
The actual privacy cost of the Exponential mechanism of $\epsilon$-DP is always less than the nominal budget $\epsilon$. When the normalization factor $A(\mathbf{x})$ in Eqn (14) is independent of $\mathbf{x}$, the actual privacy cost is $\epsilon/2$. $A(\mathbf{x})$ being independent of $\mathbf{x}$ implies that increases and decreases in the utility scores upon the change from $\mathbf{x}$ to $\mathbf{x}'$ "cancel out" when integrated or summed over all possible $\mathbf{s}^*$ in the form of $\exp\big(u(\mathbf{s}^*|\mathbf{x})\frac{\epsilon}{2\Delta_u}\big)$.

Proof. Since $u(\mathbf{s}^*|\mathbf{x}) - u(\mathbf{s}^*|\mathbf{x}') \le \Delta_u$,
\begin{align*}
\left|\log\frac{\Pr(\mathbf{s}^*(\mathbf{x}) \in Q)}{\Pr(\mathbf{s}^*(\mathbf{x}') \in Q)}\right|
&= \left|\log\left(\frac{\exp\big(u(\mathbf{s}^*|\mathbf{x})\frac{\epsilon}{2\Delta_u}\big)}{\exp\big(u(\mathbf{s}^*|\mathbf{x}')\frac{\epsilon}{2\Delta_u}\big)} \times \frac{A(\mathbf{x}')}{A(\mathbf{x})}\right)\right|
\le \left|\log\left(e^{\epsilon/2}\,\frac{A(\mathbf{x}')}{A(\mathbf{x})}\right)\right| \tag{D.8} \\
&= \left|\frac{\epsilon}{2} + \log\frac{A(\mathbf{x}')}{A(\mathbf{x})}\right|
\le \frac{\epsilon}{2} + \left|\log\frac{A(\mathbf{x}')}{A(\mathbf{x})}\right| \tag{D.9}
\end{align*}
by the triangle inequality, and
\begin{align*}
A(\mathbf{x}') = \int_{\mathbf{s}^* \in \mathcal{S}} \exp\Big(u(\mathbf{s}^*|\mathbf{x}')\frac{\epsilon}{2\Delta_u}\Big)\,d\mathbf{s}^*
&\le \int_{\mathbf{s}^* \in \mathcal{S}} \exp\Big(\big(u(\mathbf{s}^*|\mathbf{x}) + \Delta_u\big)\frac{\epsilon}{2\Delta_u}\Big)\,d\mathbf{s}^* \tag{D.10} \\
&= \exp\Big(\frac{\epsilon}{2}\Big)\int_{\mathbf{s}^* \in \mathcal{S}} \exp\Big(u(\mathbf{s}^*|\mathbf{x})\frac{\epsilon}{2\Delta_u}\Big)\,d\mathbf{s}^* = \exp\Big(\frac{\epsilon}{2}\Big) A(\mathbf{x}).
\end{align*}
Therefore, $\log\big(A(\mathbf{x}')/A(\mathbf{x})\big) \le \epsilon/2$ and, by the symmetry between $\mathbf{x}$ and $\mathbf{x}'$, $\big|\log\big(A(\mathbf{x}')/A(\mathbf{x})\big)\big| \le \epsilon/2$. Eqn (D.9) then becomes
\[ \left|\log\frac{\Pr(\mathbf{s}^*(\mathbf{x}) \in Q)}{\Pr(\mathbf{s}^*(\mathbf{x}') \in Q)}\right| \le \frac{\epsilon}{2} + \left|\log\frac{A(\mathbf{x}')}{A(\mathbf{x})}\right| \le \epsilon. \tag{D.11} \]
The same result can be obtained by replacing the integral with a summation in Eqn (D.10) when $\mathcal{S}$ is a discrete set. The above results seem to suggest that $\epsilon$ can be achieved exactly since "equality" appears in all the inequalities above (Eqns (D.8) to (D.11)); however, equality cannot occur simultaneously in Eqns (D.8) and (D.10) unless $\Delta_u$ were 0, which is meaningless in DP. In addition, $\Delta_u$ is defined as the maximum change in $u$ over all $d(\mathbf{x}, \mathbf{x}') = 1$. While it is likely that the maximum change occurs at more than a single value of $\mathbf{s}^*$, it is not possible that the utility scores at all values of $\mathbf{s}^*$ increase or decrease by the same amount $\Delta_u$. In other words, the "equality" in Eqn (D.10) itself is unlikely to hold. All taken together, the actual privacy cost in the Exponential mechanism is always less than $\epsilon$ and never attains the exact upper bound $\epsilon$. In the extreme, the actual privacy cost can be as low as $\epsilon/2$ when $A(\mathbf{x}) \equiv A(\mathbf{x}')$ for all $\mathbf{x}, \mathbf{x}'$ with $d(\mathbf{x}, \mathbf{x}') = 1$, as suggested by Eqn (D.9). $\blacksquare$

E Proof of Lemma 9
Proof. Part a). Denote $\mathbf{s}(\mathbf{x})$ by $\mathbf{s}$ and $\mathbf{s}(\mathbf{x}')$ by $\mathbf{s}'$. When $p = 1$, $u(\mathbf{s}^*|\mathbf{x}) = -\|\mathbf{s}^* - \mathbf{s}\|_1$, and
\[ |u(\mathbf{s}^*|\mathbf{x}) - u(\mathbf{s}^*|\mathbf{x}')| = \Big|\sum_{k=1}^r \big(|s_k^* - s_k| - |s_k^* - s_k'|\big)\Big| \le \sum_{k=1}^r \Big|\,|s_k^* - s_k| - |s_k^* - s_k'|\,\Big| \le \sum_{k=1}^r \big|s_k^* - s_k - (s_k^* - s_k')\big| = \sum_{k=1}^r |s_k - s_k'| = \|\mathbf{s} - \mathbf{s}'\|_1. \]
Therefore, $\Delta_u = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1,\, \mathbf{s}^* \in \mathcal{S}} |u(\mathbf{s}^*|\mathbf{x}) - u(\mathbf{s}^*|\mathbf{x}')| \le \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} \|\mathbf{s} - \mathbf{s}'\|_1 = \Delta_1$. $\blacksquare$

Proof. Part b). When $p = 2$, $u(\mathbf{s}^*|\mathbf{x}) = -\|\mathbf{s}^* - \mathbf{s}\|_2^2$, and
\[ |u(\mathbf{s}^*|\mathbf{x}) - u(\mathbf{s}^*|\mathbf{x}')| = \Big|\sum_{k=1}^r \big((s_k - s_k^*)^2 - (s_k' - s_k^*)^2\big)\Big| \le \sum_{k=1}^r \big|(s_k - s_k^*)^2 - (s_k' - s_k^*)^2\big| = \sum_{k=1}^r |s_k - s_k'| \cdot |s_k - s_k^* + s_k' - s_k^*| \le \sum_{k=1}^r \Delta_{1,k}\big(|s_k - s_k^*| + |s_k' - s_k^*|\big). \]
Suppose $s_k$ is bounded within $[c_{k0}, c_{k1}]$, and so is $s_k^*$; then
\[ \Delta_u = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1,\, \mathbf{s}^* \in \mathcal{S}} \Big|\sum_{k=1}^r (s_k(\mathbf{x}) - s_k^*)^2 - \sum_{k=1}^r (s_k(\mathbf{x}') - s_k^*)^2\Big| \le 2\sum_{k=1}^r \Delta_{1,k}(c_{k1} - c_{k0}). \tag{E.12} \]
When $c_{k1} - c_{k0} \equiv b - a$ for all $k$, $\Delta_u \le 2(b - a)\max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} \sum_{k=1}^r |s_k - s_k'| = 2(b - a)\Delta_1$. $\blacksquare$

Proof. Part c). When $u(\mathbf{s}^*|\mathbf{x}) = -\|\mathbf{s}^* - \mathbf{s}\|_p^p$ for integer $p \ge 1$,
\begin{align*}
|u(\mathbf{s}^*|\mathbf{x}) - u(\mathbf{s}^*|\mathbf{x}')| &= \Big|\,\|\mathbf{s}^* - \mathbf{s}\|_p^p - \|\mathbf{s}^* - \mathbf{s}'\|_p^p\,\Big| = \Big|\sum_{k=1}^r |s_k - s_k^*|^p - \sum_{k=1}^r |s_k' - s_k^*|^p\Big| \\
&\le \sum_{k=1}^r \Big|\,|(s_k - s_k^*)^p| - |(s_k' - s_k^*)^p|\,\Big| \le \sum_{k=1}^r \big|(s_k - s_k^*)^p - (s_k' - s_k^*)^p\big| \\
&= \sum_{k=1}^r \Big|\sum_{i=1}^p \binom{p}{i} (-s_k^*)^{p-i}\big[s_k^i - (s_k')^i\big]\Big| \le \sum_{k=1}^r \sum_{i=1}^p \binom{p}{i} \Big|(s_k^*)^{p-i}\big[s_k^i - (s_k')^i\big]\Big|.
\end{align*}
Suppose $s_k$ is bounded within $(c_{k0}, c_{k1})$, and so is $s_k^*$. Define $\Delta_{1,k}^{(i)} = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k^i - (s_k')^i|$; then
\[ \Delta_u = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1,\, \mathbf{s}^* \in \mathcal{S}} \Big|\sum_{k=1}^r \big(|s_k - s_k^*|^p - |s_k' - s_k^*|^p\big)\Big| \le \sum_{k=1}^r \sum_{i=1}^p \binom{p}{i} \Delta_{1,k}^{(i)} \big(\max\{|c_{k0}|, |c_{k1}|\}\big)^{p-i}. \tag{E.13} \]
When $p = 1$, Eqn (E.13) reduces to $\Delta_u \le \sum_{k=1}^r \Delta_{1,k}$ in Part a). When $p = 2$, Eqn (E.13) becomes $\sum_{k=1}^r \big(\Delta_{1,k}^{(2)} + 2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\}\big)$, which is not as tight an upper bound as Eqn (E.12). To see this, we can show $2\Delta_{1,k}(c_{k1} - c_{k0}) \le \Delta_{1,k}^{(2)} + 2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\}$, or $2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\} - 2\Delta_{1,k}(c_{k1} - c_{k0}) + \Delta_{1,k}^{(2)} \ge 0$, for all $k$.

When $c_{k0}c_{k1} \ge 0$, $c_{k1} - c_{k0} < \max\{|c_{k0}|, |c_{k1}|\}$, so $2\Delta_{1,k}(c_{k1} - c_{k0}) \le 2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\} < 2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\} + \Delta_{1,k}^{(2)}$.

When $c_{k0}c_{k1} \le 0$ and $\max\{|c_{k0}|, |c_{k1}|\} = c_{k1}$, $2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\} - 2\Delta_{1,k}(c_{k1} - c_{k0}) + \Delta_{1,k}^{(2)} = 2\Delta_{1,k}c_{k1} - 2\Delta_{1,k}(c_{k1} - c_{k0}) + \Delta_{1,k}^{(2)} = 2\Delta_{1,k}c_{k0} + \Delta_{1,k}^{(2)}$. Since $\Delta_{1,k}^{(2)} = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k^2 - (s_k')^2| = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k - s_k'| \cdot |s_k + s_k'| \ge \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k - s_k'| \cdot 2|c_{k0}| = 2\Delta_{1,k}|c_{k0}|$, we have $\Delta_{1,k}^{(2)} - 2\Delta_{1,k}|c_{k0}| = \Delta_{1,k}^{(2)} + 2\Delta_{1,k}c_{k0} \ge 0$.

When $c_{k0}c_{k1} \le 0$ and $\max\{|c_{k0}|, |c_{k1}|\} = |c_{k0}|$, $2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\} + \Delta_{1,k}^{(2)} - 2\Delta_{1,k}(c_{k1} - c_{k0}) = 2\Delta_{1,k}|c_{k0}| - 2\Delta_{1,k}(c_{k1} - c_{k0}) + \Delta_{1,k}^{(2)} = \Delta_{1,k}^{(2)} - 2\Delta_{1,k}c_{k1}$. Since $\Delta_{1,k}^{(2)} = \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k^2 - (s_k')^2| \ge \max_{\mathbf{x},\mathbf{x}': d(\mathbf{x},\mathbf{x}')=1} |s_k - s_k'| \cdot 2|c_{k1}| = 2\Delta_{1,k}c_{k1}$, we have $\Delta_{1,k}^{(2)} - 2\Delta_{1,k}c_{k1} \ge 0$.

All taken together, $2\sum_{k=1}^r \Delta_{1,k}(c_{k1} - c_{k0}) \le \sum_{k=1}^r \big(\Delta_{1,k}^{(2)} + 2\Delta_{1,k}\max\{|c_{k0}|, |c_{k1}|\}\big)$. $\blacksquare$

F Proof of Lemma 10
Proof. When $r = 1$ ($s$ is a scalar), $\Delta_p \equiv \Delta_1$ for all $p \ge 1$. To satisfy $(\epsilon, \delta)$-pDP, we set

$$
\Pr\left( |s^* - s| > \frac{\epsilon b^2 \Delta_1^{-1}}{2} - \frac{\Delta_1}{2} \right) = 2\Phi\left( \frac{\Delta_1/2 - \epsilon b^2 (2\Delta_1)^{-1}}{b/\sqrt{2}} \right) \le \delta \tag{F.14}
$$

$$
\Rightarrow\ \Delta_1 b^{-1} - \epsilon b \Delta_1^{-1} \le \sqrt{2}\,\Phi^{-1}(\delta/2)
\ \Rightarrow\ b \ge 2^{-1/2}\epsilon^{-1}\Delta_1 \left( \sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2) \right).
$$

Together with the requirement $b^2 - \epsilon^{-1}\Delta_1^2 > 0$, we have

$$
b \ge \max\left\{ \epsilon^{-1/2}\Delta_1,\ \left(\epsilon^{-1/2}\Delta_1\right) \frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}} \right\}.
$$

Since $\delta < 1$, $\Phi^{-1}(\delta/2) < 0$, so $\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2) \ge \sqrt{2\epsilon}$, and thus

$$
b \ge \left(\epsilon^{-1/2}\Delta_1\right) \frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}.
$$

When $r > 1$, we leverage the proof in Appendix A (page 265) in [10] and obtain

$$
\left| \log\left( \frac{\Pr(\mathbf{s}^* \in Q \,|\, \mathbf{x})}{\Pr(\mathbf{s}^* \in Q \,|\, \mathbf{x}')} \right) \right|
= \left| \log\left( \frac{\exp(-\|\mathbf{e}\|_2^2/b^2)}{\exp(-\|\mathbf{e} + \mathbf{d}\|_2^2/b^2)} \right) \right|
= \left| b^{-2}\left( \|\mathbf{e}\|_2^2 - \|\mathbf{e} + \mathbf{d}\|_2^2 \right) \right|
\le \left| b^{-2}\left( 2\lambda\Delta_2 + \Delta_2^2 \right) \right|
\le b^{-2}\left( 2|\lambda|\Delta_2 + \Delta_2^2 \right),
$$

where $\mathbf{e} = \mathbf{s}^* - \mathbf{s}(\mathbf{x})$ and $\mathbf{d} = \mathbf{s}(\mathbf{x}) - \mathbf{s}(\mathbf{x}')$ are defined in Eqn (B.2), and $\lambda \sim N(0, b^2/2)$. To satisfy $(\epsilon, \delta)$-pDP, we set

$$
\Pr\left( b^{-2}\left( 2|\lambda|\Delta_2 + \Delta_2^2 \right) < \epsilon \right) = \Pr\left( |\lambda| < \frac{b^2\epsilon\Delta_2^{-1} - \Delta_2}{2} \right) > 1 - \delta
\ \Rightarrow\ \Pr\left( |\lambda| > \frac{b^2\epsilon\Delta_2^{-1} - \Delta_2}{2} \right) = 2\Phi\left( \frac{\Delta_2 - \epsilon b^2\Delta_2^{-1}}{\sqrt{2}\,b} \right) < \delta,
$$

which is the same condition as Eqn (F.14) for $r = 1$, with $\Delta_2$ in place of $\Delta_1$. Similar to the case of $r = 1$, we need $b^2\epsilon\Delta_2^{-1} - \Delta_2 > 0$, so for $r > 1$

$$
b \ge \max\left\{ \epsilon^{-1/2}\Delta_2,\ \left(\epsilon^{-1/2}\Delta_2\right) \frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}} \right\}.
$$

Since $\delta < 1$, $\Phi^{-1}(\delta/2) < 0$, and thus

$$
b \ge \left(\epsilon^{-1/2}\Delta_2\right) \frac{\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2)}{\sqrt{2\epsilon}}. \qquad \blacksquare
$$

G Proof of Lemma 11
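The claim of Lemma 11 can also be checked numerically before reading the proof. The sketch below computes the ratio of the Gaussian variance (with the scale set at the lower bound from Lemma 10) to the Laplace variance, and the threshold $2\Phi(-\sqrt{2})$ on $\delta$. It assumes the bound is evaluated at $\Phi^{-1}(\delta/2)$, as in Lemma 10; the function names and the grid of $\epsilon$ values are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides Phi (cdf) and Phi^{-1} (inv_cdf)


def variance_ratio(epsilon, delta):
    """Gaussian-to-Laplace variance ratio: (z^2 + eps - z*sqrt(z^2 + 2 eps)) / 4, z = Phi^{-1}(delta/2)."""
    z = STD_NORMAL.inv_cdf(delta / 2.0)
    return (z * z + epsilon - z * sqrt(z * z + 2.0 * epsilon)) / 4.0


def delta_threshold():
    """2 * Phi(-sqrt(2)): for delta below this value the ratio exceeds 1 for every eps > 0."""
    return 2.0 * STD_NORMAL.cdf(-sqrt(2.0))


if __name__ == "__main__":
    print(round(delta_threshold(), 3))  # prints 0.157
    for eps in (0.01, 0.1, 1.0, 10.0):  # illustrative privacy budgets
        # With delta = 0.15, just under the threshold, every ratio is above 1.
        print(eps, variance_ratio(eps, 0.15))
```

The sensitivity cancels in the ratio, so it does not appear as an argument.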
Proof. If $\sigma$ is set at the lower bound in Eqn (18), the ratio of the variance of the Gaussian distribution in the Gaussian mechanism of $(\epsilon, \delta)$-pDP to the variance of the Laplace distribution in the Laplace mechanism of $\epsilon$-DP is

$$
\left( \frac{(2\epsilon)^{-1}\Delta_s \left( \sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2) \right)}{\sqrt{2}\,\epsilon^{-1}\Delta_s} \right)^2
= \frac{\left( \sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon} - \Phi^{-1}(\delta/2) \right)^2}{8}
= \frac{(\Phi^{-1}(\delta/2))^2 + \epsilon - \Phi^{-1}(\delta/2)\sqrt{(\Phi^{-1}(\delta/2))^2 + 2\epsilon}}{4}. \tag{G.15}
$$

Since $\delta \in [0, 1]$, $\delta/2 \in [0, 0.5]$ and $\Phi^{-1}(\delta/2) \in (-\infty, 0]$. Since $\epsilon > 0$, Eqn (G.15) $> (\Phi^{-1}(\delta/2))^2/2$. Let $(\Phi^{-1}(\delta/2))^2/2 > 1$; then $\delta/2 < \Phi(-\sqrt{2})$, that is, $\delta < 2\Phi(-\sqrt{2}) \approx 0.157$. $\blacksquare$

References

[1] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in
Theory of Cryptography. Springer, 2006, pp. 265–284.
[2] C. Dwork, “Differential privacy: A survey of results,” Theory and Applications of Models of Computation, vol. 4978, pp. 1–19, 2008.
[3] C. Dwork, “Differential privacy,” in Encyclopedia of Cryptography and Security. Springer, 2011, pp. 338–340.
[4] F. McSherry and K. Talwar, “Mechanism design via differential privacy,” 2007, pp. 94–103.
[5] F. McSherry, “Privacy integrated queries: an extensible platform for privacy-preserving data analysis,” in Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. ACM, 2009, pp. 19–30.
[6] A. Roth and T. Roughgarden, “Interactive privacy via the median mechanism,” in Proceedings of the 42nd ACM Symposium on Theory of Computing, June 5-8, 2010.
[7] M. Hardt, K. Ligett, and F. McSherry, “A simple and practical algorithm for differentially private data release,” arXiv:1012.4763v2, 2012.
[8] A. Ghosh, T. Roughgarden, and M. Sundararajan, “Universally utility-maximizing privacy mechanisms,” SIAM Journal on Computing, vol. 41, no. 6, pp. 1673–1693, 2012.
[9] Q. Geng and P. Viswanath, “The optimal noise-adding mechanism in differential privacy,” IEEE Transactions on Information Theory, vol. 62, no. 2, pp. 925–951, 2016.
[10] C. Dwork and A. Roth, The Algorithmic Foundations of Differential Privacy. Now Publishers, Inc., 2014.
[11] C. Dimitrakakis, B. Nelson, A. Mitrokotsa, and B. Rubinstein, “Robust and private Bayesian inference,” in Algorithmic Learning Theory ALT 2014, P. Auer, A. Clark, T. Zeugmann, and S. Zilles, Eds. Springer, Cham, 2014.
[12] C. Dwork, “Differential privacy,” in Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP). Springer-Verlag ARCoSS, 2006, pp. 1–12.
[13] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor, “Our data, ourselves: privacy via distributed noise generation,” in Advances in Cryptology: Proceedings of EUROCRYPT. Springer Berlin Heidelberg, 2006, pp. 485–503.
[14] A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber, “Privacy: Theory meets practice on the map,” IEEE ICDE 24th International Conference, pp. 277–286, 2008.
[15] R. Hall, A. Rinaldo, and L. Wasserman, “Random differential privacy,” Journal of Privacy and Confidentiality, vol. 4, no. 2, pp. 43–59, 2012.
[16] C. Dwork and G. N. Rothblum, “Concentrated differential privacy,” arXiv:1603.01887v2, 2016.
[17] F. Liu, “Noninformative bounding in differential privacy and its impact on statistical properties of sanitized results in truncated and boundary-inflated-truncated Laplace mechanisms,” arXiv:1607.08554, 2016.
[18] T. Steinke and J. Ullman, “Between pure and approximate differential privacy,” arXiv:1501.06095v1, 2015.
[19] A.-S. Charest, “Empirical evaluation of statistical inference from differentially-private contingency tables,” in Proceedings of the International Conference on Privacy in Statistical Databases, 2012, pp. 257–272.
[20] M. Lichman, “UCI machine learning repository,” 2013. [Online]. Available: http://archive.ics.uci.edu/ml
[21] K. Nissim, S. Raskhodnikova, and A. Smith, “Smooth sensitivity and sampling in private data analysis,” Proceedings of the 39th ACM Symposium on Theory of Computing, pp. 75–84, 2007.
[22] C. Dwork and J. Lei, “Differential privacy and robust statistics,” Proceedings of the 41st ACM Symposium on Theory of Computing, pp. 371–380, 2009.
[23] M. Hardt and K. Talwar, “On the geometry of differential privacy,” Proceedings of the Forty-second ACM Symposium on Theory of Computing, STOC ’10, pp. 705–714, 2010.
[24] M. Bun, J. Ullman, and S. Vadhan, “Fingerprinting codes and the price of approximate differential privacy,” arXiv:1311.3158v2