Pain-Free Random Differential Privacy with Sensitivity Sampling

Abstract

Popular approaches to differential privacy, such as the Laplace and exponential mechanisms, calibrate randomised smoothing through global sensitivity of the target non-private function. Bounding such sensitivity is often a prohibitively complex analytic calculation. As an alternative, we propose a straightforward sampler for estimating sensitivity of non-private mechanisms. Since our sensitivity estimates hold with high probability, any mechanism that would be (ϵ,δ) -differentially private under bounded global sensitivity automatically achieves (ϵ,δ,γ) -random differential privacy (Hall et al., 2012), without any target-specific calculations required. We demonstrate on worked example learners how our usable approach adopts a naturally-relaxed privacy guarantee, while achieving more accurate releases even for non-private functions that are black-box computer programs.


Benjamin I. P. Rubinstein, Francesco Aldà

1. Introduction

Differential privacy (Dwork et al., 2006) has emerged as the dominant framework for protected privacy of sensitive training data when releasing learned models to untrusted third parties. This paradigm owes its popularity in part to the strong privacy model provided, and in part to the availability of general building block mechanisms such as the Laplace (Dwork et al., 2006) and exponential (McSherry & Talwar, 2007) mechanisms, and to composition lemmas for building up more complex mechanisms. These generic mechanisms come endowed with privacy and utility bounds that hold for any appropriate application. Such tools almost alleviate the burden of performing theoretical analysis in developing privacy-preserving learners. However a persistent requirement is the need to bound global sensitivity, a Lipschitz constant of the target, non-private function. For simple scalar statistics of the private database, sensitivity can be easily bounded (Dwork et al., 2006). However in many applications, from collaborative filtering (McSherry & Mironov, 2009) to Bayesian inference (Dimitrakakis et al., 2014; 2017; Wang et al., 2015), the principal challenge in privatisation is completing this calculation.

In this work we develop a simple approach to approximating global sensitivity with high probability, assuming only oracle access to target function evaluations. Combined with generic mechanisms like Laplace, exponential, Gaussian or Bernstein, our sampler enables systematising of privatisation: arbitrary computer programs can be made differentially private with no additional mathematical analysis nor dynamic/static analysis whatsoever. Our approach does not make any assumptions about the function under evaluation or underlying sampling distribution.

School of Computing and Information Systems, University of Melbourne, Australia; Horst Görtz Institute for IT Security and Faculty of Mathematics, Ruhr-Universität Bochum, Germany. Correspondence to: BR, FA. Proceedings of the International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017. Copyright 2017 by the author(s).

Contributions.

This paper contributes: i) SensitivitySampler for easily-implemented empirical estimation of global sensitivity of (potentially black-box) non-private mechanisms; ii) Empirical process theory for guaranteeing random differential privacy for any mechanism that preserves (stronger) differential privacy under bounded global sensitivity; iii) Experiments demonstrating our sampler on learners for which analytical sensitivity bounds are highly involved; and iv) Examples where sensitivity estimates beat (pessimistic) bounds, delivering pain-free random differential privacy at higher levels of accuracy, when used in concert with generic privacy-preserving mechanisms.

Related Work.

This paper builds on the large body of work in differential privacy (Dwork et al., 2006; Dwork & Roth, 2014), which has gained broad interest in part due to the framework's strong guarantees of data privacy when releasing aggregate statistics or models, and due to the availability of many generic privatising mechanisms, e.g., Laplace (Dwork et al., 2006), exponential (McSherry & Talwar, 2007), Gaussian (Dwork & Roth, 2014), Bernstein (Aldà & Rubinstein, 2017) and many more. While these mechanisms present a path to privatisation without need for reproving differential privacy or utility, they do have in common a need to analytically bound sensitivity, a Lipschitz-type condition on the target non-private function. Often derivations are intricate, e.g., for collaborative filtering (McSherry & Mironov, 2009), SVMs (Rubinstein et al., 2012; Chaudhuri et al., 2011), model selection (Thakurta & Smith, 2013), feature selection (Kifer et al., 2012), Bayesian inference (Dimitrakakis et al., 2014; 2017; Wang et al., 2015), SGD in deep learning (Abadi et al., 2016), etc. Undoubtedly the non-trivial nature of bounding sensitivity prohibits adoption by some domain experts. We address this challenge through the SensitivitySampler that estimates sensitivity empirically, even for privatising black-box computer programs, providing high probability privacy guarantees generically.

Several systems have been developed to ease deployment of differential privacy, with Barthe et al. (2016) overviewing contributions from Programming Languages. Dynamic approaches track privacy budget expended at runtime, typically through basic operations on data with known privacy loss, e.g., the PINQ (McSherry, 2009; McSherry & Mahajan, 2010) and Airavat (Roy et al., 2010) systems. These create a C […] SensitivitySampler mechanism complements such systems, e.g., within broader frameworks for protecting against side-channel attacks (Haeberlen et al., 2011; Mohan et al., 2012).

Minami et al. (2016) show that a special-case Gibbs sampler is $(\epsilon, \delta)$-DP without bounded sensitivity. Nissim et al. (2007) ask: Why calibrate for worst-case global sensitivity when the actual database does not witness worst-case neighbours? Their smoothed sensitivity approach privatises local sensitivity, which itself is sensitive to perturbation. While this can lead to better sensitivity estimates, our sampled sensitivity still does not require analytical bounds. A related approach is the sample-and-aggregate mechanism (Nissim et al., 2007) which avoids computation of sensitivity of the underlying target function and instead requires sensitivity of an aggregator combining the outputs of the non-private target run repeatedly on subsamples of the data. By contrast, our approach provides direct sensitivity estimates, permitting direct privatisation.

Our application of empirical process theory to estimate hard-to-compute quantities resembles the work of Riondato & Upfal (2015). They use VC-theory and sampling to approximate mining frequent itemsets. Here we approximate analytical computations, and to our knowledge provide a first generic mechanism that preserves random differential privacy (Hall et al., 2012), a natural weakening of the strong guarantee of differential privacy. Hall et al. (2012) leverage empirical process theory for a specific worked example, while our setting is general sensitivity estimation.

2. Background

We are interested in a non-private mechanism $f : \mathcal{D}^n \to \mathcal{B}$ that maps databases in product space over domain $\mathcal{D}$ to responses in a normed space $\mathcal{B}$. The terminology of "database" (DB) comes from statistical databases, and should be understood as a dataset.

Example 1.

For instance, in supervised learning of linear classifiers, the domain could be Euclidean vectors comprising features and labels, and responses might be parameterisations of learned classifiers such as a normal vector.

We aim to estimate sensitivity, which is commonly used to calibrate noise in differentially-private mechanisms.

Definition 2.

The global sensitivity of non-private $f : \mathcal{D}^n \to \mathcal{B}$ is given by $\Delta = \sup_{D, D'} \| f(D) - f(D') \|_{\mathcal{B}}$, where the supremum is taken over all pairs of neighbouring databases $D, D'$ in $\mathcal{D}^n$ that differ in one point.

Definition 3.

Randomized mechanism $M : \mathcal{D}^n \to \mathcal{R}$ responding with values in arbitrary response set $\mathcal{R}$ preserves $\epsilon$-differential privacy for $\epsilon > 0$ if for all neighbouring $D, D' \in \mathcal{D}^n$ and measurable $R \subset \mathcal{R}$ it holds that $\Pr(M(D) \in R) \le \exp(\epsilon) \Pr(M(D') \in R)$. If instead for $0 < \delta < 1$ it holds that $\Pr(M(D) \in R) \le \exp(\epsilon) \Pr(M(D') \in R) + \delta$ then the mechanism preserves the weaker notion of $(\epsilon, \delta)$-differential privacy.

In Section 4, we recall a number of key mechanisms that preserve these notions of privacy by virtue of target non-private function sensitivity.

The following definition due to Hall et al. (2012) relaxes the requirement that uniform smoothness of response distribution holds on all pairs of databases, to the requirement that uniform smoothness holds for likely database pairs.

Definition 4.

Randomized mechanism $M : \mathcal{D}^n \to \mathcal{R}$ responding with values in an arbitrary response set $\mathcal{R}$ preserves $(\epsilon, \gamma)$-random differential privacy, at privacy level $\epsilon > 0$ and confidence $\gamma \in (0, 1)$, if

$$\Pr\left( \forall R \subset \mathcal{R},\ \Pr(M(D) \in R) \le e^{\epsilon} \Pr(M(D') \in R) \right) \ge 1 - \gamma,$$

with the inner probabilities over the mechanism's randomization, and the outer probability over neighbouring $D, D' \in \mathcal{D}^n$ drawn from some $P^{n+1}$. The weaker $(\epsilon, \delta)$-DP has analogous definition as $(\epsilon, \delta, \gamma)$-RDP.

Remark 5.

While strong $\epsilon$-DP is ideal, utility may demand compromise. Precedent exists for weaker privacy, with the definition of $(\epsilon, \delta)$-DP wherein on any databases (including likely ones) a private mechanism may leak sensitive information on low probability responses, forgiven by the additive $\delta$ relaxation. $(\epsilon, \gamma)$-RDP offers an alternate relaxation, where on all but a small $\gamma$-proportion of unlikely database pairs, strong $\epsilon$-DP holds; RDP thus plays a useful role.

Example 6.

Consider a database on unbounded positive reals $D \in \mathbb{R}_+^n$ representing loan default times of a bank's customers, and target release statistic $f(D) = n^{-1} \sum_{i=1}^{n} D_i$, the sample mean. To $\epsilon$-DP privatise scalar-valued $f(D)$ it is natural to look to the Laplace mechanism. However the mechanism requires a bound on the statistic's global sensitivity, impossible under unbounded $D$. Note for $\Delta > 0$, when neighbouring $D, D'$ satisfy $\{ |f(D) - f(D')| \le \Delta \}$ then Laplace mechanism $M_{\Delta, \epsilon}(f(D))$ enjoys $\epsilon$-DP on that DB pair. Therefore the probability of the latter event is bounded below by the probability of the former. Modelling the default times by iid exponential variables of rate $\lambda > 0$, then $|f(D) - f(D')| = |D_n - D'_n| / n$ is distributed as $\mathrm{Exp}(n\lambda)$, and so

$$\Pr\left( \forall t \in \mathbb{R},\ \Pr(M_{\Delta, \epsilon}(D) = t) \le e^{\epsilon} \Pr(M_{\Delta, \epsilon}(D') = t) \right) \ge \Pr\left( |f(D) - f(D')| \le \Delta \right) = 1 - e^{-\lambda n \Delta} \ge 1 - \gamma,$$

provided that $\Delta \ge \log(1/\gamma) / (\lambda n)$. While $\epsilon$-DP fails due to unboundedness, the data is likely bounded and so the mechanism is likely strongly private: $M_{\Delta, \epsilon}$ is $(\epsilon, \gamma)$-RDP.
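The threshold in Example 6 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names, the choice $n = 100$, $\lambda = 1$, $\gamma = 0.05$, and the number of Monte Carlo trials are all illustrative assumptions. It computes $\Delta = \log(1/\gamma)/(\lambda n)$ and estimates how often a random neighbouring pair has sample means within $\Delta$ of each other.

```python
import math
import random


def rdp_sensitivity_for_mean(n, rate, gamma):
    """Smallest Delta with Pr(|f(D) - f(D')| <= Delta) >= 1 - gamma in Example 6."""
    return math.log(1.0 / gamma) / (rate * n)


def empirical_coverage(n, rate, delta, trials=20000, seed=0):
    """Monte Carlo estimate of Pr(|D_n - D'_n| / n <= delta) for iid Exp(rate) records."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Neighbouring databases differ only in their last record, so only that
        # record contributes to the difference of the two sample means.
        last = rng.expovariate(rate)
        last_prime = rng.expovariate(rate)
        if abs(last - last_prime) / n <= delta:
            hits += 1
    return hits / trials


# Illustrative (assumed) settings, not taken from the paper.
n, rate, gamma = 100, 1.0, 0.05
delta = rdp_sensitivity_for_mean(n, rate, gamma)
coverage = empirical_coverage(n, rate, delta)
print(f"Delta = {delta:.5f}, empirical coverage = {coverage:.4f}, target = {1 - gamma}")
```

Under the exponential model the coverage equals $1 - \gamma$ exactly at this $\Delta$, so the estimate should sit close to the target up to Monte Carlo error.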

3. Problem Statement

We consider a statistician looking to apply a differentially-private mechanism to an $f : \mathcal{D}^n \to \mathcal{B}$ whose sensitivity cannot easily be bounded analytically (cf. Example 6 or the case of a computer program).

Instead we assume that the statistician has the ability to sample from some arbitrary product space $P^{n+1}$ on $\mathcal{D}^{n+1}$, can evaluate $f$ arbitrarily (and in particular on the result of this sampling), and is interested in applying a privatising mechanism with the guarantee of random differential privacy (Definition 4).

Remark 7.

Natural choices for $P$ present themselves for sampling or defining random differential privacy. $P$ could be taken as the underlying distribution from which a sensitive DB was drawn, in the case of sensitive training data but insensitive data source; an alternate test distribution of interest in the case of domain adaptation; or $P$ could be uniform or an otherwise non-informative likelihood (cf. Example 6). Proved in Appendix A, the following relates RDP of similar distributions.

Proposition 8.

Let $P, Q$ be distributions on $\mathcal{D}$ with bounded KL divergence $\mathrm{KL}(P \| Q) \le \tau$. If mechanism $M$ on databases in $\mathcal{D}^n$ is RDP with confidence $\gamma > 0$ wrt $P$ then it is also RDP with confidence $\gamma + \sqrt{(n+1)\tau/2}$ wrt $Q$, with the same privacy parameters $\epsilon$ (or $\epsilon, \delta$).
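To get a feel for how quickly the confidence in Proposition 8 degrades, the sketch below evaluates $\gamma + \sqrt{(n+1)\tau/2}$ for two exponential record distributions. The exponential family and the helper names are assumptions made for illustration; the closed form $\mathrm{KL}(\mathrm{Exp}(\lambda_P) \| \mathrm{Exp}(\lambda_Q)) = \log(\lambda_P/\lambda_Q) + \lambda_Q/\lambda_P - 1$ is the standard one.

```python
import math


def kl_exponential(rate_p, rate_q):
    """KL(Exp(rate_p) || Exp(rate_q)) for a single record."""
    return math.log(rate_p / rate_q) + rate_q / rate_p - 1.0


def transferred_confidence(gamma, n, tau):
    """Confidence wrt Q given confidence gamma wrt P and KL(P || Q) <= tau."""
    # The result can exceed 1, in which case the guarantee is vacuous.
    return gamma + math.sqrt((n + 1) * tau / 2.0)


# Illustrative (assumed) settings: rates differing by 1%, databases of 100 records.
tau = kl_exponential(1.0, 1.01)
print(transferred_confidence(gamma=0.05, n=100, tau=tau))
```

Because the penalty grows with $\sqrt{n+1}$, even a small per-record divergence can noticeably weaken the confidence for large databases.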

4. Sensitivity-Induced Differential Privacy

When a privatising mechanism $M$ is known to achieve differential privacy for some mapping $f : \mathcal{D}^n \to \mathcal{B}$ under bounded global sensitivity, then our approach's high-probability estimates of sensitivity will imply high-probability preservation of differential privacy. In order to reason about such arguments, we introduce the concept of sensitivity-induced differential privacy.

Definition 9.

For arbitrary mapping $f : \mathcal{D}^n \to \mathcal{B}$ and randomised mechanism $M_\Delta : \mathcal{B} \to \mathcal{R}$, we say that $M_\Delta$ is sensitivity-induced $\epsilon$-differentially private if for a neighbouring pair of databases $D, D' \in \mathcal{D}^n$, and $\Delta \ge 0$,

$$\| f(D) - f(D') \|_{\mathcal{B}} \le \Delta \implies \forall R \subset \mathcal{R},\ \Pr(M_\Delta(f(D)) \in R) \le \exp(\epsilon) \cdot \Pr(M_\Delta(f(D')) \in R),$$

with the qualification on $R$ being all measurable subsets of the response set $\mathcal{R}$. In the same vein, the analogous definition for $(\epsilon, \delta)$-differential privacy can also be made.

Many generic mechanisms in use today preserve differential privacy by virtue of satisfying this condition. The following are immediate consequences of existing proofs of differential privacy. First, consider when a non-private target function $f$ aims to release Euclidean vector responses.

Corollary 10 (Laplace mechanism).

Consider database $D \in \mathcal{D}^n$, normed space $\mathcal{B} = (\mathbb{R}^d, \| \cdot \|_1)$ for $d \in \mathbb{N}$, non-private function $f : \mathcal{D}^n \to \mathcal{B}$. The Laplace mechanism (Dwork et al., 2006) $M_\Delta(f(D)) \sim \mathrm{Lap}(f(D), \Delta/\epsilon)$ is sensitivity-induced $\epsilon$-differentially private. Here $\mathrm{Lap}(a, b)$ has unnormalised PDF $\exp(-\| x - a \|_1 / b)$.

Example 11.

Example 6 used Corollary 10 for RDP of the Laplace mechanism on unbounded bank loan defaults.

Corollary 12 (Gaussian mechanism).

Consider database $D \in \mathcal{D}^n$, normed space $\mathcal{B} = (\mathbb{R}^d, \| \cdot \|_2)$ for some $d \in \mathbb{N}$, and non-private function $f : \mathcal{D}^n \to \mathcal{B}$. The Gaussian mechanism (Dwork & Roth, 2014) $M_\Delta(f(D)) \sim \mathcal{N}(f(D), \mathrm{diag}(\sigma^2))$ with $\sigma^2 > 2\Delta^2 \log(1.25/\delta)/\epsilon^2$ is sensitivity-induced $(\epsilon, \delta)$-differentially private.

Second, $f$ may aim to release elements of an arbitrary set $\mathcal{R}$, where a score function $s(D, \cdot)$ benchmarks quality of potential releases (placing a partial ordering on $\mathcal{R}$).

Corollary 13 (Exponential mechanism).

Consider database $D \in \mathcal{D}^n$, response space $\mathcal{R}$, normed space $\mathcal{B} = (\mathbb{R}^{\mathcal{R}}, \| \cdot \|_\infty)$, non-private score function $s : \mathcal{D}^n \times \mathcal{R} \to \mathbb{R}$, and restriction $f : \mathcal{D}^n \to \mathcal{B}$ given by $f(D) = s(D, \cdot)$. The exponential mechanism (McSherry & Talwar, 2007) $M_\Delta(f(D)) \sim \exp\left(\epsilon\,(f(D))(r) / (2\Delta)\right)$, which when normalised specifies a PDF over responses $r \in \mathcal{R}$, is sensitivity-induced $\epsilon$-differentially private.

Third, $f$ could be function-valued as for learning settings, where given a training set we wish to release a model (e.g., classifier or predictive posterior) that can be subsequently evaluated on (non-sensitive) test points.

Corollary 14 (Bernstein mechanism).

Consider database $D \in \mathcal{D}^n$, query space $\mathcal{Y} = [0, 1]^\ell$ with constant dimension $\ell \in \mathbb{N}$, lattice cover of $\mathcal{Y}$ of size $k \in \mathbb{N}$ given by $\mathcal{L} = (\{0, 1/k, \ldots, 1\})^\ell$, normed space $\mathcal{B} = (\mathbb{R}^{\mathcal{Y}}, \| \cdot \|_\infty)$, non-private function $F : \mathcal{D}^n \times \mathcal{Y} \to \mathbb{R}$, and restriction $f : \mathcal{D}^n \to \mathcal{B}$ given by $f(D) = F(D, \cdot)$. The Bernstein mechanism (Aldà & Rubinstein, 2017) $M_\Delta(f(D)) \sim \left\{ \mathrm{Lap}\left( (f(D))(p),\ \Delta (k+1)^\ell / \epsilon \right) \mid p \in \mathcal{L} \right\}$ is sensitivity-induced $\epsilon$-differentially private.

Our framework does not apply directly to the objective perturbation mechanism of Chaudhuri et al. (2011), as that mechanism does not rely directly on a notion of sensitivity of objective function, classifier, or otherwise. However it can apply to the posterior sampler used for differentially-private Bayesian inference (Mir, 2012; Dimitrakakis et al., 2014; 2017; Zhang et al., 2016): there the target function $f : \mathcal{D}^n \to \mathcal{B}$ returns the likelihood function $p(D | \cdot)$, itself mapping parameters $\Theta$ to $\mathbb{R}$; using the result of $f(D)$ and public prior $\xi(\theta)$, the mechanism samples from the posterior $\xi(B | D) = \int_B p(D | \theta)\, d\xi(\theta) / \int_\Theta p(D | \theta)\, d\xi(\theta)$; differential privacy follows from a Lipschitz condition on $f$ that would require our sensitivity sampler to sample from all database pairs, a minor modification left for future work.

Algorithm 1 SensitivitySampler

Input: database size $n$, target mapping $f : \mathcal{D}^n \to \mathcal{B}$, sample size $m$, order statistic index $k$, distribution $P$
for $i = 1$ to $m$ do
    Sample $D \sim P^{n+1}$
    Set $G_i = \| f(D_{1 \ldots n}) - f(D_{1 \ldots n-1, n+1}) \|_{\mathcal{B}}$
end for
Sort $G_1, \ldots, G_m$ as $G_{(1)} \le \ldots \le G_{(m)}$
return $\hat\Delta = G_{(k)}$
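Algorithm 1 translates almost line for line into code. The following is a minimal sketch under stated assumptions rather than the authors' implementation: `sample_record` (a callable drawing one record from $P$), `distance` (standing in for the norm on $\mathcal{B}$) and the helper `mean` are hypothetical names introduced here, and the target is treated as a black-box Python callable.

```python
import random


def sensitivity_sampler(n, f, m, k, sample_record, distance=None, rng=None):
    """Return the k-th smallest of m sampled neighbour gaps, as in Algorithm 1.

    sample_record(rng) draws one record from P; distance(a, b) plays the role
    of the norm of f(D) - f(D') and defaults to |a - b| for scalar outputs.
    """
    if not 1 <= k <= m:
        raise ValueError("need 1 <= k <= m")
    if distance is None:
        distance = lambda a, b: abs(a - b)
    rng = rng or random.Random()
    gaps = []
    for _ in range(m):
        # One draw from P^{n+1}: n + 1 independent records.
        records = [sample_record(rng) for _ in range(n + 1)]
        database = records[:n]                       # D_{1..n}
        neighbour = records[:n - 1] + [records[n]]   # D_{1..n-1, n+1}
        gaps.append(distance(f(database), f(neighbour)))
    gaps.sort()
    return gaps[k - 1]  # order statistic G_(k), using 1-based indexing


def mean(db):
    """Toy target function: the sample mean of a list of numbers."""
    return sum(db) / len(db)


# Illustrative (assumed) usage: records uniform on [0, 1], scalar target.
estimate = sensitivity_sampler(
    n=10, f=mean, m=200, k=190,
    sample_record=lambda r: r.random(), rng=random.Random(1),
)
print(estimate)
```

For vector-valued or function-valued targets, only `distance` needs to change, for instance to an $L_1$ or sup-norm difference, which keeps the sampler agnostic to how `f` is computed.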

5. The Sensitivity Sampler

Algorithm 1 presents the SensitivitySampler in detail. Consider privacy-insensitive independent sample $D_1, \ldots, D_m \sim P^{n+1}$ of databases on $n + 1$ records, where $P$ is chosen to match the desired distribution in the definition of random differential privacy. A number of natural choices are available for $P$ (cf. Remark 7). The main idea of SensitivitySampler is that for each extended-database observation of $D \sim P^{n+1}$, we induce i.i.d. observations $G_1, \ldots, G_m \in \mathbb{R}$ of the random variable

$$G = \| f(D_{1 \ldots n}) - f(D_{1 \ldots n-1, n+1}) \|_{\mathcal{B}}.$$

From these observations of the sensitivity of target mapping $f : \mathcal{D}^n \to \mathcal{B}$, we estimate w.h.p. sensitivity that can achieve random differential privacy, for the full suite of sensitivity-induced private mechanisms discussed above.

If we knew the full CDF of $G$, we would simply invert this CDF to determine the level of sensitivity for achieving any desired $\gamma$ level of random differential privacy: higher confidence would invoke higher sensitivity and therefore lower utility. However as we cannot in general possess the true CDF, we resort to uniformly approximating it w.h.p. using the empirical CDF induced by the sample $G_1, \ldots, G_m$. The guarantee of uniform approximation derives from empirical process theory. Figure 1 provides further intuition behind SensitivitySampler. Algorithm 2 presents Sample-Then-Respond which composes SensitivitySampler with any sensitivity-induced differentially-private mechanism.

Algorithm 2 Sample-Then-Respond

Input: database $D$; randomised mechanism $M_\Delta : \mathcal{B} \to \mathcal{R}$; target mapping $f : \mathcal{D}^n \to \mathcal{B}$, sample size $m$, order statistic index $k$, distribution $P$
Set $\hat\Delta$ to SensitivitySampler($|D|, f, m, k, P$)
respond $M_{\hat\Delta}(D)$

Figure 1. Inside SensitivitySampler: the true sensitivity CDF (blue); empirical sensitivity CDF (piecewise constant red); inversion of the empirical CDF (black dotted); where $\rho, \rho'$ are the DKW confidence and errors defined in Theorem 15.

Our main result Theorem 15 presents explicit expressions for parameters $m, k$ that are sufficient to guarantee that Sample-Then-Respond achieves $(\epsilon, \delta, \gamma)$-random differential privacy. Under that result the parameter $\rho$, which controls the uniform approximation of the empirical CDF from the $G_1, \ldots, G_m$ sample to the true CDF, is introduced as a free parameter. We demonstrate through a series of optimisations in Corollaries 20–21 how $\rho$ can be tuned to optimise either sampling effort $m$, utility via order statistic index $k$, or privacy confidence $\gamma$. These alternative explicit choices for $\rho$ serve as optimal operating points for the mechanism.

SensitivitySampler simplifies the application of differential privacy by obviating the challenge of bounding sensitivity. As such, it is important to explore any practical issues arising in its implementation. The algorithm itself involves a few main stages: sampling databases, measuring sensitivity, sorting, order statistic lookup (inversion), followed by the sensitivity-induced private mechanism.
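The stages just listed can be strung together as in Algorithm 2. The sketch below is an illustration under assumptions, not the paper's code: it fixes the mechanism to a scalar Laplace mechanism (Corollary 10), draws Laplace noise as a difference of two exponentials, and uses hypothetical helper names (`sample_gaps`, `sample_record`, `laplace_mechanism`).

```python
import random


def sample_gaps(n, f, m, sample_record, rng):
    """Sample m neighbour gaps |f(D) - f(D')| for a scalar-valued target f."""
    gaps = []
    for _ in range(m):
        records = [sample_record(rng) for _ in range(n + 1)]
        gaps.append(abs(f(records[:n]) - f(records[:n - 1] + [records[n]])))
    return gaps


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value + Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    if scale == 0.0:
        return value
    # The difference of two iid exponentials with mean `scale` is Laplace(0, scale).
    return value + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sample_then_respond(db, f, m, k, sample_record, epsilon, rng):
    """Estimate sensitivity by sampling, then respond with a Laplace release."""
    gaps = sorted(sample_gaps(len(db), f, m, sample_record, rng))
    delta_hat = gaps[k - 1]  # k-th order statistic, 1-based
    return laplace_mechanism(f(db), delta_hat, epsilon, rng), delta_hat


def mean(db):
    return sum(db) / len(db)


# Illustrative (assumed) usage on synthetic data uniform on [0, 1].
rng = random.Random(0)
database = [rng.random() for _ in range(50)]
release, delta_hat = sample_then_respond(
    database, mean, m=300, k=290,
    sample_record=lambda r: r.random(), epsilon=1.0, rng=rng,
)
print(release, delta_hat)
```

Swapping in another sensitivity-induced mechanism only requires replacing `laplace_mechanism`; the sampling stage is unchanged.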

Sampling.

As discussed in Remark 7, a number of natural choices for sampling distribution $P$ could be made. Where a simulation process exists, capable of generating synthetic data approximating $D$, then this could be run. For example in the Bayesian setting (Dimitrakakis et al., 2014), one could use a public conditional likelihood $p(\cdot | \theta)$, parametric family $\Theta$, prior $\xi(\theta)$ and sample from the marginal $\int_\Theta p(x | \theta)\, d\xi(\theta)$. Alternatively, it may suffice to sample from the uniform distribution on $\mathcal{D}$, or a Gaussian restricted to Euclidean $\mathcal{D}$. In any of these cases, sampling is relatively straightforward and the choice should consider meaningful random differential privacy guarantees relative to $P$.

Sensitivity Measurement.

A trivial stage, given neighbouring databases, measurement could involve expanding a mathematical expression representing a target function, or a computer program such as running a deep learning or computer vision open-source package. For some targets, it may be that running first on one database covers much of the computation required for the neighbouring database, in which case amortisation may improve runtime. The cost of sensitivity measurement will be primarily determined by sample size $m$. Note that sampling and measurement can be trivially parallelised over map-reduce-like platforms.

Sorting, Inversion.

Strictly speaking the entire sensitivity sample need not be sorted, as only one order statistic is required. That said, sorting even millions of scalar measurements can be accomplished in under a second on a stock machine. An alternative strategy to inversion as presented is to take the maximum sensitivity measured so as to maximise privacy without consideration to utility.
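As a small illustration of the point about avoiding a full sort, the order statistic can be pulled out with a bounded heap. This is a sketch using Python's standard `heapq` module; the function name `kth_smallest` is an assumption for illustration, and for typical sample sizes a plain sort is just as practical.

```python
import heapq
import random


def kth_smallest(gaps, k):
    """Return the k-th smallest gap (1-based) without sorting the whole sample."""
    m = len(gaps)
    if not 1 <= k <= m:
        raise ValueError("need 1 <= k <= m")
    if k > m // 2:
        # High quantiles are typical here, so keep only the m - k + 1 largest;
        # the smallest of those is the k-th smallest overall.
        return heapq.nlargest(m - k + 1, gaps)[-1]
    return heapq.nsmallest(k, gaps)[-1]


# Illustrative (assumed) sample of sensitivity measurements.
rng = random.Random(3)
gaps = [rng.random() for _ in range(1000)]
print(kth_smallest(gaps, 950))
print(kth_smallest(gaps, len(gaps)) == max(gaps))  # k = m recovers the maximum
```

Taking `k` equal to the sample size corresponds to the alternative strategy of using the maximum measured sensitivity.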

Mechanism.

It is noteworthy that in settings where mechanism $M_\Delta$ is to be run multiple times, the estimation of $\hat\Delta$ need not be redone. As such SensitivitySampler could be performed entirely in an offline amortisation stage.

6. Analysis

For the i.i.d. sample of sensitivities $G_1, \ldots, G_m$ drawn within Algorithm 1, denote the corresponding fixed unknown CDF, and corresponding random empirical CDF, by

$$\Phi(g) = \Pr(G \le g), \qquad \Phi_m(g) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}[G_i \le g].$$

In this section we use $\Phi_m(\Delta)$ to bound the likelihood of a (non-private, possibly deterministic) mapping $f : \mathcal{D}^n \to \mathcal{B}$ achieving sensitivity $\Delta$. This permits bounding RDP.

Theorem 15.

Consider any non-private mapping $f : \mathcal{D}^n \to \mathcal{B}$, any sensitivity-induced $(\epsilon, \delta)$-differentially private mechanism $M_\Delta$ mapping $\mathcal{B}$ to (randomised) responses in $\mathcal{R}$, any database $D$ of $n$ records, privacy parameters $\epsilon > 0$, $\delta \in [0, 1]$, $\gamma \in (0, 1)$, and sampling parameters size $m \in \mathbb{N}$, order statistic index $m \ge k \in \mathbb{N}$, approximation confidence $0 < \rho < \min\{\gamma, 1/2\}$, distribution $P$ on $\mathcal{D}$. If

$$m \ge \frac{1}{2(\gamma - \rho)^2} \log\left(\frac{1}{\rho}\right), \qquad (1)$$

$$k \ge m \left( 1 - \gamma + \rho + \sqrt{\log(1/\rho)/(2m)} \right), \qquad (2)$$

then Algorithm 2 run with $D, M_\Delta, f, m, k, P$ preserves $(\epsilon, \delta, \gamma)$-random differential privacy.

Proof. Consider any $\rho' \in (0, 1)$ to be determined later, and consider sampling $G_1, \ldots, G_m$ and sorting to $G_{(1)} \le \ldots \le G_{(m)}$. Provided that

$$1 - \gamma + \rho + \rho' \le 1 \iff \rho' \le \gamma - \rho, \qquad (3)$$

then the random sensitivity $\hat\Delta = G_{(k)}$, where $k = \lceil m(1 - \gamma + \rho + \rho') \rceil$, is the smallest $\Delta \ge 0$ such that $\Phi_m(\Delta) \ge 1 - \gamma + \rho + \rho'$. That is,

$$\Phi_m(\hat\Delta) \ge 1 - \gamma + \rho + \rho'. \qquad (4)$$

Note that if $1 - \gamma + \rho + \rho' < 0$ then $\hat\Delta$ can be taken as any $\Delta$, namely zero. Define the events

$$A_\Delta = \left\{ \forall R \subset \mathcal{R},\ \Pr(M_\Delta(f(D)) \in R) \le \exp(\epsilon) \cdot \Pr(M_\Delta(f(D')) \in R) + \delta \right\},$$

$$B_{\rho'} = \left\{ \sup_\Delta \left( \Phi_m(\Delta) - \Phi(\Delta) \right) \le \rho' \right\}.$$

The first is the event that DP holds for a specific DB pair, when the mechanism is run with (possibly random) sensitivity parameter $\Delta$; the second records the empirical CDF uniformly one-sided approximating the CDF to level $\rho'$. By the sensitivity-induced $(\epsilon, \delta)$-differential privacy of $M_\Delta$,

$$\forall \Delta > 0, \quad \Pr_{D, D' \sim P^{n+1}}(A_\Delta) \ge \Phi(\Delta). \qquad (5)$$

The random $D, D'$ on the left-hand side induce the distribution on $G$ on the right-hand side under which $\Phi(\Delta) = \Pr_G(G \le \Delta)$. The probability on the left is the level of random differential privacy of $M_\Delta$ when run on fixed $\Delta$. By the Dvoretzky-Kiefer-Wolfowitz inequality (Massart, 1990) we have that for all $\rho' \ge \sqrt{(\log 2)/(2m)}$,

$$\Pr_{G_1, \ldots, G_m}(B_{\rho'}) \ge 1 - e^{-2m\rho'^2}. \qquad (6)$$

Putting inequalities (4), (5), and (6) together, provided that $\rho' \ge \sqrt{(\log 2)/(2m)}$, yields that

$$\Pr_{D, D', G_1, \ldots, G_m}\left(A_{\hat\Delta}\right) = \mathbb{E}\left[ \mathbf{1}[A_{\hat\Delta}] \mid B_{\rho'} \right] \Pr(B_{\rho'}) + \mathbb{E}\left[ \mathbf{1}[A_{\hat\Delta}] \mid \overline{B_{\rho'}} \right] \Pr\left(\overline{B_{\rho'}}\right)$$
$$\ge \mathbb{E}\left[ \Phi(\hat\Delta) \mid B_{\rho'} \right] \Pr(B_{\rho'})$$
$$\ge \mathbb{E}\left[ \Phi_m(\hat\Delta) - \rho' \mid B_{\rho'} \right] \left( 1 - \exp(-2m\rho'^2) \right)$$
$$\ge (1 - \gamma + \rho + \rho' - \rho') \left( 1 - \exp(-2m\rho'^2) \right)$$
$$\ge (1 - \gamma + \rho)(1 - \rho)$$
$$\ge 1 - \gamma + \rho - \rho = 1 - \gamma.$$

The last inequality follows from $\rho < \gamma$; the penultimate inequality follows from setting

$$\rho' \ge \sqrt{\frac{1}{2m} \log\left(\frac{1}{\rho}\right)}, \qquad (7)$$

and so the DKW condition (Massart, 1990), that $\rho' \ge \sqrt{(\log 2)/(2m)}$, is met provided that $\rho \le 1/2$. Now (1) follows from substituting (7) into (3).

Note that for sensitivity-induced $\epsilon$-differentially private mechanisms, the theorem applies with $\delta = 0$.

Table 1. Optimal $\rho$ operating points for budgeted resources ($\gamma$ or $m$) minimising $m$, $\gamma$ or $k$; proved in Appendix B. A bullet marks the quantity being budgeted.

| Budgeted | Optimise | $\rho$ | $\gamma$ | $m$ | $k$ |
|---|---|---|---|---|---|
| $\gamma \in (0, 1)$ | $m$ | $\exp\left( W_{-1}\left( -\frac{\gamma}{2\sqrt{e}} \right) + \frac{1}{2} \right)$ | • | $\left\lceil \frac{\log(1/\rho)}{2(\gamma - \rho)^2} \right\rceil$ | $\left\lceil m \left( 1 - \gamma + \rho + \sqrt{\frac{\log(1/\rho)}{2m}} \right) \right\rceil$ |
| $m \in \mathbb{N}$, $\gamma$ | $k$ | $\exp\left( \frac{1}{2} W_{-1}\left( -\frac{1}{4m} \right) \right)$ | $\ge \rho + \sqrt{\frac{\log(1/\rho)}{2m}}$ | • | $\left\lceil m \left( 1 - \gamma + \rho + \sqrt{\frac{\log(1/\rho)}{2m}} \right) \right\rceil$ |
| $m \in \mathbb{N}$ | $\gamma$ | $\exp\left( \frac{1}{2} W_{-1}\left( -\frac{1}{4m} \right) \right)$ | $\rho + \sqrt{\frac{\log(1/\rho)}{2m}}$ | • | $m$ |

Figure 2. The minimum sample size $m$ (sampler effort) required to achieve various target RDP confidence levels $\gamma$. [Axes: privacy confidence $\gamma$ (log scale) vs. sampling effort $m$ (log scale).]

Figure 3. For sample sizes $m \in \{100, 1000, 10000\}$, trade-offs between privacy confidence level $\gamma$ and order-statistic index $k$ (relative to $m$) which controls sensitivity estimates and so utility. [Axes: privacy confidence $\gamma$ vs. sensitivity quantile level $k/m$.]

Optimising Free Parameter $\rho$. Table 1 recommends alternative choices of free parameter $\rho$, derived by optimising the sampler's performance along one axis (privacy confidence $\gamma$, sampler effort $m$, or order statistic index $k$) given a fixed budget of another. The table summarises results with proofs found in Appendix B. The specific expressions derived involve branches of the Lambert-$W$ function, which is the inverse relation of the function $f(z) = z \exp(z)$, and is implemented as a special function in scientific libraries as standard. While Lambert-$W$ is in general a multi-valued relation on the analytic complex domain, all instances in our results are single-real-valued functions on the reals. The next result presents the first operating point's corresponding rate on effort in terms of privacy, and follows from recent bounds on the secondary branch $W_{-1}$ due to Chatzigeorgiou (2013), namely that for all $u > 0$, $-1 - \sqrt{2u} - u < W_{-1}(-e^{-u-1}) < -1 - \sqrt{2u} - \frac{2}{3}u$.

Corollary 16.

Minimising $m$ for given $\gamma$ (cf. Table 1, row 1; Corollary 20, Appendix B), yields rate for $m$ as $o\left( \frac{1}{\gamma^2} \log \frac{1}{\gamma} \right)$ with increasing privacy confidence $\frac{1}{\gamma} \to \infty$.

Figure 4. Analytical vs estimated sensitivity for Example 6. [Axes: privacy confidence $\gamma$ vs. computed sensitivity $\hat\Delta$.]

Remark 17.

Theorem 15 and Table 1 elucidate that effort, privacy and utility are in tension. Effort is naturally decreased by reducing the confidence level of RDP ($\rho$ chosen to minimise $m$, or $\gamma$). By minimising order statistic index $k$, we select smaller $G_{(k)}$ and therefore sensitivity estimate $\hat\Delta$. This in turn leads to lower generic mechanism noise and higher utility. All this is achieved by sacrificing effort or privacy confidence. As usual, sacrificing $\epsilon$ or $\delta$ privacy levels also leads to utility improvement. Figures 2 and 3 visualise these operating points.

Less conservative estimates on sensitivity can lead to superior utility while also enjoying easier implementation. This hypothesis is borne out in experiments in Section 7.
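The tension described in Remark 17 can be explored numerically from conditions (1) and (2) of Theorem 15 and the first row of Table 1. The sketch below is illustrative only: it evaluates the secondary Lambert-$W$ branch by bisection to stay within the standard library (a scientific library's Lambert-$W$ routine would normally be used instead), and all function names are assumptions.

```python
import math


def lambert_w_minus1(x):
    """Real secondary branch W_{-1}(x) for -1/e < x < 0, found by bisection."""
    if not -1.0 / math.e < x < 0.0:
        raise ValueError("W_{-1} is real only on (-1/e, 0)")
    # On (-inf, -1] the map w -> w * exp(w) decreases from 0 to -1/e,
    # so the solution of w * exp(w) = x can be bracketed and bisected.
    hi = -1.0
    lo = -2.0
    while lo * math.exp(lo) < x:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < x:
            hi = mid  # mid lies to the right of the solution
        else:
            lo = mid
    return 0.5 * (lo + hi)


def rho_minimising_effort(gamma):
    """Table 1, row 1: rho = exp(W_{-1}(-gamma / (2 sqrt(e))) + 1/2)."""
    return math.exp(lambert_w_minus1(-gamma / (2.0 * math.sqrt(math.e))) + 0.5)


def effort_bound(gamma, rho):
    """Right-hand side of condition (1): log(1/rho) / (2 (gamma - rho)^2)."""
    return math.log(1.0 / rho) / (2.0 * (gamma - rho) ** 2)


def sampler_parameters(gamma, rho):
    """Smallest integers m, k satisfying conditions (1) and (2)."""
    if not 0.0 < rho < min(gamma, 0.5):
        raise ValueError("need 0 < rho < min(gamma, 1/2)")
    m = math.ceil(effort_bound(gamma, rho))
    fraction = 1.0 - gamma + rho + math.sqrt(math.log(1.0 / rho) / (2.0 * m))
    k = min(m, math.ceil(m * fraction))  # guard against floating-point overshoot
    return m, k


# Illustrative (assumed) confidence level.
gamma = 0.05
rho = rho_minimising_effort(gamma)
m, k = sampler_parameters(gamma, rho)
print(f"rho = {rho:.5f}, m = {m}, k = {k}")
```

The expression for $\rho$ in row 1 is the stationary point of the bound in (1): setting the derivative in $\rho$ to zero gives $\gamma = \rho(1 - 2\log\rho)$, which the substitution $\rho = \exp(w + 1/2)$ turns into $w e^{w} = -\gamma/(2\sqrt{e})$.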

Proposition 18.

For any $f : \mathcal{D}^n \to \mathcal{B}$ with global sensitivity $\Delta = \sup_{D \sim D'} \| f(D) - f(D') \|_{\mathcal{B}}$, SensitivitySampler's random sensitivity $\hat\Delta \le \Delta$. As a result, Algorithm 2 run with any of the sensitivity-induced private mechanisms of Corollaries 10–14 achieves utility dominating that of the respective mechanisms run with $\Delta$.

7. Experiments

We now demonstrate the practical value of S

ENSITIVI - TY S AMPLER . First in Section 7.1 we illustrate how S EN - SITIVITY S AMPLER sensitivity quickly approaches analyt-ical high-probability sensitivity, and how it can be sig-niﬁcantly lower than worst-case global sensitivity in Sec-tion 7.2. Running privatising mechanisms with lower sen-sitivity parameters can mitigate utility loss, while maintain-ing (a weaker form of) differential privacy. We present ex-perimental evidence of this utility savings in Section 7.3.While application domains may ﬁnd the alternate balancetowards utility appealing by itself, it should be stressedthat a signiﬁcant advantage of S

ENSITIVITY S AMPLER isits ease of implementation. d S en s i t i v i t y D ( l og sc a l e ) - - l l l l ll g = g = g = Figure 5.

Global vs sampled sensitivity for linear SVM.

Consider running Example 6: private release of samplemean f ( D ) = n − (cid:80) ni =1 D i of a database D drawn i.i.d.from Exp(1) . Figure 4 presents, for varying probability γ : the analytical bound on sensitivity versus S ENSITIVI - TY S AMPLER estimates for different sampling budgets av-eraged over 50 repeats. For ﬁxed sampling budget, ˆ∆ isestimated at lower limits on γ , quickly converging to exact. Consider now the challenging goal of privately releasing anSVM classiﬁer ﬁt to sensitive training data. In applying theLaplace mechanism to releasing the primal normal vector,Rubinstein et al. (2012) bound the vector’s sensitivity usingalgorithmic stability of the SVM. In particular, a lengthyderivation establishes that (cid:107) w D − w D (cid:48) (cid:107) ≤ LCκ √ d/n for a statistically consistent formulation of the SVM withconvex L -Lipschitz loss, d -dimensional feature mappingwith sup x k ( x , x ) ≤ κ , and regularisation parameter C .While the original work (and others since) did not considerthe practical problem of releasing unregularised bias term b , we can effectively bound this sensitivity via a short argu-ment in Appendix D. Proposition 19.

For the SVM run with hinge loss, linear kernel, $\mathcal{D} = [0, 1]^d$, the release $(w, b)$ has $L_1$ global sensitivity bounded by $C\sqrt{d} + 4Cd/n$.

We train private SVM using the Laplace mechanism (Rubinstein et al., 2012), with either the global sensitivity bound of Proposition 19 or SENSITIVITYSAMPLER. We synthesise a dataset of n = 1000 points, selected with equal probability of being drawn from the positive class N(0.…, diag(0.…)) or negative class N(0.…, diag(0.…)). The feature space's dimension varies from d = 8 through d = 64. The SVMs are run with C = 3, and SENSITIVITYSAMPLER with m = 1500 and varying γ. Figure 5 shows the very different sensitivities obtained. While estimated $\hat\Delta$ hovers around 0.01 largely independent of γ, global sensitivity Δ exceeds 20, more than two orders of magnitude greater. These patterns are repeated as dimension increases; increasing sensitivity is to be expected since, as dimensions are added, the few points in the training set become more likely to be support vectors and thus to affect sensitivity. Such conservative estimates could clearly lead to inferior utility.

Figure 6. Linear SVM predictive error under sensitivity estimates vs with global sensitivity bound. [Plot: misclassification rate (log scale) against privacy budget ε (log scale).]

Figure 7. KDE error (relative to non-private) under sensitivity estimates vs global sensitivity bound. [Plot: total variation distance against privacy budget ε (log scale).]

We return to the same SVM setup as in the previous section, with d = 2, now plotting utility as misclassification error (averaged over 500 repeats) vs. privacy budget ε. Here we set γ = 0.… and include also the non-private SVM's performance as a bound on the utility possible. See Figure 6. At very high privacy levels both private SVMs suffer the same poor error. But as privacy is relaxed, the misclassification error of SENSITIVITYSAMPLER quickly drops until it reaches the non-private rate. Meanwhile the global sensitivity approach has a significantly higher error and suffers a much slower decline. These results suggest that SENSITIVITYSAMPLER can achieve much better utility in addition to sensitivity.
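To see why a smaller sensitivity parameter translates into better utility, consider a minimal sketch of the Laplace mechanism, which adds i.i.d. Laplace noise of scale Δ/ε to each released coordinate. The helper names `laplace_noise`, `laplace_mechanism` and `mean_abs_error` are hypothetical, and the sensitivities 0.01 and 20 merely echo the magnitudes reported for Figure 5; this is an illustration of the noise scaling, not a reproduction of Figure 6.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release a vector with i.i.d. Laplace noise of scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in value]


def mean_abs_error(sensitivity, epsilon, trials=2000, seed=0):
    """Average absolute perturbation of a single released coordinate."""
    rng = random.Random(seed)
    truth = [0.0]
    total = 0.0
    for _ in range(trials):
        released = laplace_mechanism(truth, sensitivity, epsilon, rng)
        total += abs(released[0] - truth[0])
    return total / trials


# Same privacy budget, two sensitivity parameters (assumed magnitudes).
error_sampled = mean_abs_error(sensitivity=0.01, epsilon=1.0)
error_global = mean_abs_error(sensitivity=20.0, epsilon=1.0)
```

Because the expected absolute value of Laplace noise equals its scale, the perturbation grows linearly in the sensitivity parameter at any fixed ε, which is the effect the utility comparison above reflects.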

Kernel Density Estimation.

We finally consider a one-dimensional (d = 1) KDE setting. In Figure 7 we show the error (averaged over 1000 repeats) of the Bernstein mechanism (with lattice size k = 10 and Bernstein order h = 3) on 5000 points drawn from a mixture of two normal distributions N(0.…, …) and N(0.…, …) with weights …, …, respectively. For this experimental result, we set m = 50000 and two different values for γ, as displayed in Figure 7. Once again we observe that for high privacy levels the global sensitivity approach incurs a higher error relative to non-private, while SENSITIVITYSAMPLER provides stronger utility. At lower privacy, both approaches converge to the approximation error of the Bernstein polynomial used.

8. Conclusion

In this paper we propose SENSITIVITYSAMPLER, an algorithm for empirical estimation of sensitivity for privatisation of black-box functions. Our work addresses an important usability gap in differential privacy, whereby several generic privatisation mechanisms exist complete with privacy and utility guarantees, but require analytical bounds on global sensitivity (a Lipschitz condition) on the non-private target. While this sensitivity is trivially derived for simple statistics, for state-of-the-art learners sensitivity derivations are arduous, e.g., in collaborative filtering (McSherry & Mironov, 2009), SVMs (Rubinstein et al., 2012; Chaudhuri et al., 2011), model selection (Thakurta & Smith, 2013), feature selection (Kifer et al., 2012), Bayesian inference (Dimitrakakis et al., 2014; Wang et al., 2015), and deep learning (Abadi et al., 2016). While derivations may prevent domain experts from leveraging differential privacy, our SENSITIVITYSAMPLER promises to make privatisation simple when using existing mechanisms including Laplace (Dwork et al., 2006), Gaussian (Dwork & Roth, 2014), exponential (McSherry & Talwar, 2007) and Bernstein (Aldà & Rubinstein, 2017). All such mechanisms guarantee differential privacy on pairs of databases for which a level Δ of non-private function sensitivity holds, when the mechanism is run with that Δ parameter. For all such mechanisms we leverage results from empirical process theory to establish guarantees of random differential privacy (Hall et al., 2012) when using sampled sensitivities only.

Experiments demonstrate that real-world learners can easily be run privately without any new derivation whatsoever. And by using a naturally-weaker form of privacy, while replacing worst-case global sensitivity bounds with estimated (actual) sensitivities, we can achieve far superior utility than existing approaches.

Acknowledgements

F. Aldà and B. Rubinstein acknowledge the support of the DFG Research Training Group GRK 1817/1 and the Australian Research Council (DE160100584) respectively.

References

Abadi, Martín, Chu, Andy, Goodfellow, Ian, McMahan, H Brendan, Mironov, Ilya, Talwar, Kunal, and Zhang, Li. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318. ACM, 2016.

Aldà, Francesco and Rubinstein, Benjamin I. P. The Bernstein mechanism: Function release under differential privacy. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI'2017), pp. 1705–1711, 2017.

Barthe, Gilles, Gaboardi, Marco, Hsu, Justin, and Pierce, Benjamin. Programming language techniques for differential privacy. ACM SIGLOG News, 3(1):34–53, 2016.

Chatzigeorgiou, Ioannis. Bounds on the Lambert function and their application to the outage analysis of user cooperation. IEEE Communications Letters, 17(8), 2013.

Chaudhuri, Kamalika, Monteleoni, Claire, and Sarwate, Anand D. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12(Mar):1069–1109, 2011.

Dimitrakakis, Christos, Nelson, Blaine, Mitrokotsa, Aikaterini, and Rubinstein, Benjamin I. P. Robust and private Bayesian inference. In International Conference on Algorithmic Learning Theory, pp. 291–305. Springer, 2014.

Dimitrakakis, Christos, Nelson, Blaine, Zhang, Zuhe, Mitrokotsa, Aikaterini, and Rubinstein, Benjamin I. P. Differential privacy for Bayesian inference through posterior sampling. Journal of Machine Learning Research, 18(11):1–39, 2017.

Dwork, Cynthia and Roth, Aaron. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.

Dwork, Cynthia, McSherry, Frank, Nissim, Kobbi, and Smith, Adam. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265–284. Springer, 2006.

Gaboardi, Marco, Haeberlen, Andreas, Hsu, Justin, Narayan, Arjun, and Pierce, Benjamin C. Linear dependent types for differential privacy. ACM SIGPLAN Notices, 48(1):357–370, 2013.

Haeberlen, Andreas, Pierce, Benjamin C, and Narayan, Arjun. Differential privacy under fire. In USENIX Security Symposium, 2011.

Hall, Rob, Rinaldo, Alessandro, and Wasserman, Larry. Random differential privacy. Journal of Privacy and Confidentiality, 4(2):43–59, 2012.

Kifer, Daniel, Smith, Adam, and Thakurta, Abhradeep. Private convex empirical risk minimization and high-dimensional regression. Journal of Machine Learning Research, 1(41):3–1, 2012.

Massart, Pascal. The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. The Annals of Probability, 18(3):1269–1283, 1990.

McSherry, Frank and Mahajan, Ratul. Differentially-private network trace analysis. ACM SIGCOMM Computer Communication Review, 40(4):123–134, 2010.

McSherry, Frank and Mironov, Ilya. Differentially private recommender systems: building privacy into the net. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 627–636. ACM, 2009.

McSherry, Frank and Talwar, Kunal. Mechanism design via differential privacy. In , pp. 94–103. IEEE, 2007.

McSherry, Frank D. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp. 19–30. ACM, 2009.

Minami, Kentaro, Arai, Hitomi, Sato, Issei, and Nakagawa, Hiroshi. Differential privacy without sensitivity. In Advances in Neural Information Processing Systems 29, pp. 956–964, 2016.

Mir, Darakhshan. Differentially-private learning and information theory. In Proceedings of the 2012 Joint EDBT/ICDT Workshops, pp. 206–210. ACM, 2012.

Mohan, Prashanth, Thakurta, Abhradeep, Shi, Elaine, Song, Dawn, and Culler, David. GUPT: privacy preserving data analysis made easy. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp. 349–360. ACM, 2012.

Nissim, Kobbi, Raskhodnikova, Sofya, and Smith, Adam. Smooth sensitivity and sampling in private data analysis. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pp. 75–84. ACM, 2007.

Palamidessi, Catuscia and Stronati, Marco. Differential privacy for relational algebra: improving the sensitivity bounds via constraint systems. In Wiklicky, Herbert and Massink, Mieke (eds.), QAPL - Tenth Workshop on Quantitative Aspects of Programming Languages, volume 85, pp. 92–105, 2012.

Reed, Jason and Pierce, Benjamin C. Distance makes the types grow stronger: a calculus for differential privacy. ACM Sigplan Notices, 45(9):157–168, 2010.

Riondato, Matteo and Upfal, Eli. Mining frequent itemsets through progressive sampling with Rademacher averages. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1005–1014. ACM, 2015.

Roy, Indrajit, Setty, Srinath TV, Kilzer, Ann, Shmatikov, Vitaly, and Witchel, Emmett. Airavat: Security and privacy for MapReduce. In NSDI, volume 10, pp. 297–312, 2010.

Rubinstein, Benjamin I. P., Bartlett, Peter L., Huang, Ling, and Taft, Nina. Learning in a large function space: Privacy-preserving mechanisms for SVM learning. Journal of Privacy and Confidentiality, 4(1):65–100, 2012.

Thakurta, Abhradeep Guha and Smith, Adam. Differentially private feature selection via stability arguments, and the robustness of the Lasso. In Conference on Learning Theory, pp. 819–850, 2013.

Wang, Yu-Xiang, Fienberg, Stephen E, and Smola, Alexander J. Privacy for free: Posterior sampling and stochastic gradient Monte Carlo. In ICML, pp. 2493–2502, 2015.

Zhang, Zuhe, Rubinstein, Benjamin I. P., and Dimitrakakis, Christos. On the differential privacy of Bayesian inference. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pp. 2365–2371. AAAI Press, 2016.

A. Proof of Proposition 8

By Pinsker's inequality the product measures have bounded total variation distance

$$\left\| P^{n+1} - Q^{n+1} \right\| \le \sqrt{\tfrac{1}{2}\,\mathrm{KL}\left(P^{n+1} \,\|\, Q^{n+1}\right)} \le \sqrt{\frac{n+1}{2}\,\tau}.$$

Denote by $A$ the event that $\epsilon$-DP holds (similarly for $(\epsilon, \delta)$-DP) on neighbouring databases on $n$ records:

$$A = \left\{ \forall R \subset \mathcal{R},\ \Pr(M(D) \in R) \le e^{\epsilon} \Pr(M(D') \in R) \right\}.$$

Then RDP with respect to $Q$ follows as

$$Q^{n+1}(A) \ge P^{n+1}(A) - \sqrt{(n+1)\tau/2} \ge 1 - \gamma - \sqrt{(n+1)\tau/2}.$$

B. Optimising Sampler Performance with ρ

This section presents precise statements and proofs for the expressions found in Table 1.

B.1. Fixed γ Minimum m

Corollary 20. For fixed given privacy confidence budget $\gamma \in (0, 1)$, taking

$$\rho = \exp\left(W_{-1}\left(-\frac{\gamma}{2\sqrt{e}}\right) + \frac{1}{2}\right),$$

$$m = \left\lceil \frac{1}{2(\gamma - \rho)^2} \log\frac{1}{\rho} \right\rceil,$$

$$k = \left\lceil m\left(1 - \gamma + \rho + \sqrt{\frac{\log(1/\rho)}{2m}}\right) \right\rceil,$$

minimises sampling effort $m$, when running Algorithm 2 to achieve $(\epsilon, \delta, \gamma)$-RDP.

Proof. For any fixed $\gamma \in (0, 1)$, our task is to minimise the bound

$$m(\rho) = \frac{1}{2(\gamma - \rho)^2} \log\frac{1}{\rho},$$

on $\rho \in (0, \min\{\gamma, 0.5\})$. The first- and second-order derivatives of this function are

$$\frac{\partial m}{\partial \rho} = -\frac{\log\rho}{(\gamma - \rho)^3} - \frac{1}{2\rho(\gamma - \rho)^2},$$

$$\frac{\partial^2 m}{\partial \rho^2} = -\frac{3\log\rho}{(\gamma - \rho)^4} - \frac{2}{\rho(\gamma - \rho)^3} + \frac{1}{2\rho^2(\gamma - \rho)^2} = \frac{1}{2\rho^2(\gamma - \rho)^4}\left[(\gamma - 3\rho)^2 + 2\rho^2\left(3\log\frac{1}{\rho} - 2\right)\right].$$

For the second derivative to be positive, it is sufficient that $3\log(1/\rho) - 2 > 0$, which in turn is guaranteed when $\rho < \exp(-2/3) \approx 0.51$. Therefore $m(\rho)$ is strictly convex on the feasible region, and the first-order necessary condition for optimality is also sufficient. We seek the critical point $\rho^\star$:

$$-\frac{1}{(\gamma - \rho^\star)^2}\left[\frac{\log\rho^\star}{\gamma - \rho^\star} + \frac{1}{2\rho^\star}\right] = 0 \;\Leftrightarrow\; -\frac{\log\rho^\star}{\gamma - \rho^\star} = \frac{1}{2\rho^\star} \;\Leftrightarrow\; -\gamma = 2\rho^\star\log\rho^\star - \rho^\star$$

$$\Leftrightarrow\; -\frac{\gamma}{2\sqrt{e}} = \left(\log\rho^\star - \frac{1}{2}\right)\frac{\rho^\star}{\sqrt{e}} = \left(\log\rho^\star - \frac{1}{2}\right)\exp\left(\log\rho^\star - \frac{1}{2}\right).$$

Applying the Lambert-$W$ function to each side yields

$$\log\rho^\star - \frac{1}{2} \in W\left(-\frac{\gamma}{2\sqrt{e}}\right) \;\Leftrightarrow\; \rho^\star \in \exp\left(W\left(-\frac{\gamma}{2\sqrt{e}}\right) + \frac{1}{2}\right).$$

Figure 8. The branches of the Lambert-W function: primary $W_0$ (blue) and secondary $W_{-1}$ (red). [Plot: W(z) against z.]

The Lambert-$W$ function is real-valued on $[-\exp(-1), \infty)$, within which it is two-valued on $(-\exp(-1), 0)$ and univalued otherwise. As depicted in Figure 8, it consists of a primary branch $W_0$ which maps $[-\exp(-1), \infty)$ to $[-1, \infty)$, and a secondary branch $W_{-1}$ which maps $[-\exp(-1), 0)$ to $[-1, -\infty)$. Returning to our condition on $\rho^\star$, consider that for $\gamma \in (0, 1)$ we have $-\gamma/(2\sqrt{e}) \in (-\exp(-1), 0)$ since $2 > \sqrt{e}$. On this domain the primary branch satisfies $W_0 \in (-1, 0)$ while the secondary satisfies $W_{-1} \in (-\infty, -1)$, and so the primary branch would yield $\rho^\star \in (\exp(-0.5), \exp(0.5))$, which is disjoint from the feasible region $(0, 0.5)$. The secondary branch, however, has image in $(0, 0.6\ldots)$, which is feasible. Therefore, we arrive at the $\rho^\star$ as claimed, completing the main part of the proof.

B.2. Fixed m and γ Minimum k

Corollary 21.

For given fixed sampling resource budget $m \in \mathbb{N}$ and privacy confidence $\gamma \in (0, 1)$, taking

$$\rho = \exp\left(\frac{1}{2} W_{-1}\left(-\frac{1}{4m}\right)\right),$$

$$k = \left\lceil m\left(1 - \gamma + \rho + \sqrt{\frac{\log(1/\rho)}{2m}}\right) \right\rceil,$$

provided that

$$\gamma \ge \rho + \sqrt{\frac{\log(1/\rho)}{2m}},$$

minimises order-statistic index $k$, when running Algorithm 2 to achieve $(\epsilon, \delta, \gamma)$-RDP.

Proof. For fixed $m, \gamma$, our task is to minimise the bound

$$k(\rho) = m\left(1 - \gamma + \rho + \sqrt{\frac{\log(1/\rho)}{2m}}\right),$$

or equivalently

$$\tilde{k}(\rho) = \rho + \sqrt{\frac{\log(1/\rho)}{2m}}, \qquad (8)$$

on $\rho \in (0, \min\{\gamma, 0.5\})$. The first- and second-order derivatives of this function are

$$\frac{\partial \tilde{k}}{\partial \rho} = 1 - \frac{1}{2\sqrt{2m}\,\rho\sqrt{\log(1/\rho)}},$$

$$\frac{\partial^2 \tilde{k}}{\partial \rho^2} = \frac{1}{2\sqrt{2m}\,\rho^2\sqrt{\log(1/\rho)}}\left[1 - \frac{1}{2\log(1/\rho)}\right].$$

Since its leading term is positive on feasible $\rho$, it follows that the second derivative is strictly positive iff $\rho < \exp(-1/2) \approx 0.61$, which is guaranteed on the feasible region. Therefore $k(\rho)$ is strictly convex, and the first-order necessary condition for optimality is also sufficient. Next we seek the critical point $\rho^\star$:

$$1 = \frac{1}{2\sqrt{2m}\,\rho^\star\sqrt{\log(1/\rho^\star)}} \;\Leftrightarrow\; \rho^{\star 2}\log\rho^\star = -\frac{1}{8m} \;\Leftrightarrow\; \rho^{\star 2}\log\rho^{\star 2} = -\frac{1}{4m}$$

$$\Leftrightarrow\; \log\rho^{\star 2} = W\left(-\frac{1}{4m}\right) \;\Leftrightarrow\; \rho^\star \in \exp\left(\frac{1}{2} W\left(-\frac{1}{4m}\right)\right),$$

where the introduction of the Lambert-$W$ function leverages the identity $W(z\log z) = \log z$. Since $-\exp(-1) < -(4m)^{-1} < 0$ it follows that $W$ is real and strictly negative in value. Further, since $\rho \le 0.5 < \exp(-1/2) \approx 0.61$, it follows that our solution lies again in the lower branch as claimed.

To guarantee that the relation (1) between $m, \gamma, \rho$ is still satisfied, we can solve the bound on $m$ in terms of $\gamma$:

$$m \ge \frac{1}{2(\gamma - \rho)^2}\log\left(\frac{1}{\rho}\right) \;\Leftrightarrow\; \gamma \ge \rho + \sqrt{\frac{1}{2m}\log\left(\frac{1}{\rho}\right)}. \qquad (9)$$

Operating with this $\gamma$ establishes all the conditions of Theorem 15.

Figure 9. Global vs estimated sensitivity for the sample mean on bounded data. [Plot: sensitivity Δ (log scale) against dimension d, with sampled estimates at three values of γ.]
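The closed forms of this appendix (Corollaries 20 and 21, and the minimum-γ choice of Section B.3) can be evaluated numerically. The sketch below is one way to do so using only the standard library: it assumes a simple bisection for the secondary branch $W_{-1}$ rather than any particular library routine, and all function names (`lambert_w_minus1`, `rate`, `order_index`, `min_m`, `min_k`, `min_gamma`) are hypothetical.

```python
import math


def lambert_w_minus1(z, iters=200):
    """Secondary real branch W_{-1}(z) for -1/e < z < 0, found by bisection."""
    if not (-math.exp(-1.0) < z < 0.0):
        raise ValueError("this sketch handles only -1/e < z < 0")

    def g(w):
        return w * math.exp(w)  # strictly decreasing on (-inf, -1]

    hi = -1.0              # g(hi) = -1/e < z
    lo = -2.0
    while g(lo) < z:       # widen the bracket until g(lo) >= z
        lo *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < z:
            hi = mid       # mid lies to the right of the root
        else:
            lo = mid
    return 0.5 * (lo + hi)


def rate(rho, m):
    """The term sqrt(log(1/rho) / (2m)) shared by the three corollaries."""
    return math.sqrt(math.log(1.0 / rho) / (2.0 * m))


def order_index(m, gamma, rho):
    """Order-statistic index k for given m, gamma, rho."""
    return math.ceil(m * (1.0 - gamma + rho + rate(rho, m)))


def min_m(gamma):
    """Fixed gamma: (rho, m, k) minimising the sample size m."""
    rho = math.exp(lambert_w_minus1(-gamma / (2.0 * math.sqrt(math.e))) + 0.5)
    m = math.ceil(math.log(1.0 / rho) / (2.0 * (gamma - rho) ** 2))
    return rho, m, order_index(m, gamma, rho)


def min_k(m, gamma):
    """Fixed m and gamma: (rho, k) minimising the index k."""
    rho = math.exp(0.5 * lambert_w_minus1(-1.0 / (4.0 * m)))
    if gamma < rho + rate(rho, m):
        raise ValueError("gamma is too small for this sampling budget m")
    return rho, order_index(m, gamma, rho)


def min_gamma(m):
    """Fixed m: (rho, gamma, k) with the smallest achievable gamma and k = m."""
    rho = math.exp(0.5 * lambert_w_minus1(-1.0 / (4.0 * m)))
    return rho, rho + rate(rho, m), m
```

Bisection is chosen here for robustness near the branch point $z = -1/e$, where Newton-type iterations for $W$ can stall; a library implementation of the Lambert-$W$ function could be substituted where available.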

B.3. Fixed m Minimum γ

Corollary 22. For given fixed sampling resource budget $m \in \mathbb{N}$, taking

$$\rho = \exp\left(\frac{1}{2} W_{-1}\left(-\frac{1}{4m}\right)\right),$$

$$\gamma = \rho + \sqrt{\frac{\log(1/\rho)}{2m}},$$

$$k = m,$$

minimises privacy confidence parameter $\gamma$, when running Algorithm 2 to achieve $(\epsilon, \delta, \gamma)$-RDP.

Proof. Consider now choosing $\rho$ to minimise $\gamma$, for given fixed sample size budget $m$, while then taking order-statistic index $k$ according to the selected $m, \rho, \gamma$. This corresponds to optimising the expression (9) with respect to $\rho$. Noting that this expression is identical to the objective (8), again the global optimiser must be $\rho^\star = \exp(W_{-1}(-1/(4m))/2)$. With this choice of $\gamma$, the necessary $k$ equates to $m$.

C. Global vs. Sampled Sensitivity: Sample Mean of Bounded Data

Consider the goal of releasing the sample mean $f(D) = n^{-1}\sum_{i=1}^{n} D_i$ of a database $D$ as in Example 6, but over domain $\mathcal{D} = [0, 1]^d$. Figure 9 presents the (sharp) bound on global sensitivity for this target, for use in, e.g., the Laplace mechanism, and the sensitivity $\hat\Delta$ estimated by SENSITIVITYSAMPLER. Here $D$ comprises n = 500 points sampled from the uniform distribution over $\mathcal{D}$, with SENSITIVITYSAMPLER run with optimised $m$ under varying $\gamma$ as displayed. The reduction in sensitivity due to sampling is striking (note the log scale). This experiment compares sensitivity under different privacy guarantees (DP vs. RDP). By contrast, for the same level of privacy (RDP) in Section 7.1, SENSITIVITYSAMPLER quickly approaches the analytical approach.

D. Proof of Proposition 19

It follows immediately that $L = 1$ and $\kappa = \sqrt{d}$. From the solution $b = y_i - \sum_{j=1}^{n} \alpha_j y_j k(D_i, D_j)$ for some $i \in [n]$, combined with the box constraints $0 \le \alpha_j \le C/n$, the sensitivity of the bias can be bounded as C √ dd