Differentially Private Link Prediction With Protected Connections
Abir De, Max Planck Institute for Software Systems, [email protected]
Soumen Chakrabarti, Indian Institute of Technology Bombay, [email protected]
Abstract
Link prediction is an important task in social network analysis, with a wide variety of applications ranging from graph search to recommendation. The usual paradigm is to propose to each node a ranked list of nodes that are currently non-neighbors, as the most likely candidates for future linkage. Owing to increasing concerns about privacy, users (nodes) may prefer to keep some or all of their connections private. Most link prediction heuristics, such as common neighbors, the Jaccard coefficient, and Adamic-Adar, can leak private link information in making predictions. We present DPLP, a generic framework to guarantee differential privacy for these popular heuristics under the ranking objective. Under a recently-introduced latent node embedding model, we also analyze the trade-off between privacy and link prediction utility. Extensive experiments with eight diverse real-life graphs and several link prediction heuristics show that DPLP can trade off between privacy and predictive performance more effectively than several alternatives.

Introduction

Link prediction (LP) [20, 22, 25] is the task of predicting future relations (edges) that can emerge between nodes in a social network, given historical views of it. Predicting new collaborators for researchers, new Twitter followers, new Facebook friends and new LinkedIn connections are examples of LP tasks. Link predictions are usually presented to each node u as a ranking of the most promising non-neighbors v of u at the time of prediction.

With increasing concerns about privacy [35, 37, 42], social media platforms are enabling users to mark some of their data (usually demographic attributes) as private to other users and the Internet at large. However, homophily and other network effects may leak attributes. Someone having skydiver or smoker friends may end up paying large insurance premiums, even if they do not subject themselves to those risks. A potential remedy is for users to mark some part of their ego network as private as well.
However, the social media hosting platform itself continues to enjoy full access to the network and all attributes, and exploits them for LP. As we shall see, many popular LP algorithms leak neighbor information, because they take advantage of rampant triad completion: when node u links to v, very often there exists a node w such that edges (u, w) and (w, v) already exist. Therefore, recommending v to u may breach the privacy of v and w.

In this paper, we develop DPLP, a simple-to-implement wrapper around most popular LP algorithms, and show that it can protect privacy in the more general sense motivated above. For our analysis we use differential privacy (DP) [13], the standard yardstick for protecting private information. We perturb the scores from an LP module using a (non-Laplacian) distribution and sample top predictions from a distribution parameterized by the perturbed scores. Similar paradigms have been used [26], but with utility functions that would choose a single node (rather than a ranking), had privacy concerns been absent.

There is usually a trade-off between prediction quality and privacy. If the utility without privacy concerns is a deterministic function of the current graph, the relative loss of utility for a given level of privacy can be analyzed [26] for per-node definitions of utility. We analyze prediction quality under two additional challenges. First, we consider top-K ranking quality. Second, we model utility that is not directly observable, but latent in the generative process that creates the graph.

Sarkar et al. [33] showed that popular (non-DP) LP heuristics wield predictive power, when latent node embeddings in a geometric space [17] are modeled as causative forces driving edge creation. More specifically, nodes u, v are represented as points in a high-dimensional space, and the edge (u, v) is likely to emerge according to a suitable distribution that depends on the distance between the node embeddings. The latent node embeddings are not known to LP algorithms, so even the best LP algorithm may have a positive risk to (ranking) utility. DP requirements will generally result in an increase in that risk, which we analyze. As a result, we get a complete characterization of the trade-off between privacy requirement and risk to utility, with reference to a latent generative process.

Related work:
In the absence of a graph generative model as a reference, absolute prediction quality cannot be quantified. Machanavajjhala et al. [26] define utility as the best single-node score s_{v*} attained by a non-DP LP heuristic, such as the best common-neighbor score. To protect privacy, the deterministic maximum-utility choice is replaced by a sample drawn from all nodes V according to a distribution {p_v} over non-neighbors. The expected sample utility is (Σ_v s_v p_v) / s_{v*}. Thus, they do not consider the top-K ranking scenario. They also assume a constant fraction of the total utility is concentrated in a small number of candidate nodes, which naturally yields steep trade-offs between privacy and utility. When the popular Laplace perturbation is used in this relative utility setting [14], |V| random numbers are needed to draw a single recommendation. If the level of privacy is held constant, then as |V| → ∞, Laplacian perturbation will continue to lose (relative) utility. In contrast, we introduce K random numbers to draw K samples, and our (absolute) loss of ranking accuracy goes to zero as |V| → ∞.

Other complementary aspects of privacy in social networks have also been addressed [10, 11, 32]. When the social media platform itself cannot be trusted, Samanthula et al. [32] provide an encryption and security architecture, together with anonymous message-passing protocols for friend recommendation. Chen et al. [10] characterize node attribute and link information in terms of rareness. Disclosure of attributes has benefits and risks, which are balanced using a knapsack formulation.

Summary of contributions:
We initiate a study of differential privacy in LP algorithms that present a ranked list of top-K recommended nodes to a query node u. We present and establish the privacy guarantee of DPLP, a generic template to convert a non-DP LP algorithm into a DP version. In contrast with Laplacian or Gaussian mechanisms that use O(|V|) random numbers, DPLP uses O(K) random numbers. Consequently, it incurs a smaller loss of ranking accuracy while maintaining the same level of privacy. In a marked departure from earlier work [26], we also analyze the absolute loss of ranking quality attributable to privacy requirements, in a graph generative framework where nodes have latent embeddings and links depend on the distance between these embeddings. Experiments over eight data sets and several popular LP algorithms show the efficacy of DPLP, compared to DP obtained through Laplacian, Gaussian and Exponential perturbations. Our code and data have been uploaded at [1].

We model a snapshot of the social network as an undirected graph G = (V, E) with vertices V and edges E. The neighbors of node u are denoted as the set N(u) and the non-neighbors as N̄(u). After observing this snapshot, an LP algorithm will present to each node u a ranking of nodes in N̄(u), perhaps truncated to the first/top-K positions. The expectation is that user u has strong reason to link to the top nodes in the list. Throughout, we denote {1, . . . , N} as [N] and the 0/1 indicator for event B as ⟦B⟧.

Popular link prediction (LP) algorithms:
Triad (triangle) completion has provided a mainstay for many effective LP heuristics. Among the earliest are common neighbors (CN) [20, 23], Adamic-Adar (AA) [4, 20] and the Jaccard coefficient (JC) [20, 23]. A generic LP algorithm will be denoted A, but omitted when not needed or clear from context. Each LP algorithm A implements a scoring function s_A(u, ·): N̄(u) → R+, with score s_A(u, v) for candidate non-neighbor v. Thus we have s_CN(u, v) = |N(u) ∩ N(v)| and s_JC(u, v) = |N(u) ∩ N(v)| / |N(u) ∪ N(v)|. AA refines CN to a weighted count: common neighbors who have many other neighbors are dialed down in importance, by the reciprocal of the logarithm of their degree, i.e., s_AA(u, v) = Σ_{w ∈ N(u) ∩ N(v)} 1/log|N(w)|. CN, JC and AA are still widely used [32].

Latent space models for analyzing LP:
Sarkar et al. [33] proposed a latent space model to explain why some popular LP methods succeed. In their fixed-radius-r deterministic model, every node u is associated with a (latent) point x_u sampled uniformly at random from within a D-dimensional hypersphere having unit volume. Nodes u and v get connected if the distance d_uv = ||x_u − x_v|| between them is less than r [9]. Therefore,

    Pr((u, v) ∈ E) = Pr(d_uv < r) = Ω(r) = Ω(1) r^D.    (1)

Here, Ω(r) is the volume of the r-radius hypersphere. The usual evaluation protocol of an LP algorithm is that a graph, thus created, gets some fraction of its edges and non-edges 'hidden' (see Section 5 and Appendix E) from an LP algorithm A, which has to predict their existence or non-existence. A has no access to the latent node embeddings or r. In our work we are interested in the quality of ranking, for each node u, of the nodes v that are not currently neighbors of u but are most likely to be(come) neighbors.

Ranking loss:
Given u, an LP method M ranks the top-K nodes v that u is most likely to link to. If the latent node embeddings were known to M, it would pick non-neighbors v ∈ N̄(u) in order of increasing d_uv. However, M cannot observe distances in the latent space. Popular LP methods assign scores s_M(u, v) by observing G. These scores provide M's ranking of the current non-neighbors v of node u. This ranking will generally differ from the 'perfect' ranking induced in the latent space, leading to a ranking loss.

Differential privacy:
We design and analyze privacy protection in LP algorithms in the framework of differential privacy (DP) [13]. In our context, a randomized computation M runs on a graph G, returning an ordered list of K recommended nodes for a query node u. Let us call this (random) output M(u; G). If the graph is changed to G′, the random output becomes M(u; G′). M provides ε-differential privacy if, for any possible output list L,

    | log [ Pr(M(u; G) = L) / Pr(M(u; G′) = L) ] | ≤ ε |G ⊕ G′|,

where the multiplier |G ⊕ G′| of ε measures the extent of perturbation, usually a single edge. For a more formal background on differential privacy, see Dwork and Roth [13], applications to link prediction [26], or Appendix A.

In this section, we present a generic differentially private (DP) framework for link prediction. Such a framework implements a DP algorithm Ā on a given LP protocol A with a scoring function s_A(u, ·). To that aim, we first define a sensitivity function of a score s_A(u, v) implemented by an LP protocol A. Then we leverage this sensitivity function to develop a generalized paradigm for DP LP algorithms, called DPLP. Finally, we instantiate DPLP into specific privacy-preserving methods for the popular LP methods: common neighbors (CN), Jaccard coefficient (JC) and Adamic-Adar (AA).

Graph sensitivity:
Given a query node u, a DPLP algorithm Ā, which operates over an original LP algorithm A, should not be likely to produce a substantially different set of nodes for a graph G as compared to G′, where G and G′ differ by only one edge. To establish the DP property, we need to show that the likelihood of predicting a candidate is not very sensitive to (small) perturbations of G. To that end, we first define the sensitivity of a positive scoring function s_A(u, ·): N̄(u) → R+.

Definition 1 (Sensitivity of s_A). Let G = (V, E) and G′ = (V, E′) be two graphs with the same node set V, where E′ differs from E by at most one edge, i.e., max(|E \ E′|, |E′ \ E|) = 1. Then the sensitivity Δ_A of an LP algorithm A is defined as the maximum absolute difference between the scores s_A(u, v) in G and G′, across all pairs of nodes (u, v) and all such pairs of graphs G and G′, i.e.,

    Δ_A = max_{G, G′} max_{u,v ∈ V} Δ^A_uv(G, G′), where Δ^A_uv(G, G′) = |s_A(u, v | G) − s_A(u, v | G′)|.    (2)

Note that G and G′ differ only by an edge, but not a node. This is because adding or removing an isolated node does not affect reasonable LP algorithms.

Examples:
For common neighbors, Δ_CN = 1, because an additional edge (v, y) increases s_CN(u, v) = |N(u) ∩ N(v)| by one when y ∈ N(u) and v ∈ N̄(u). For the Jaccard coefficient, too, it is easy to check that Δ_JC ≤ 1, since a single-edge perturbation can change the number of common neighbors by at most one. For Adamic-Adar, an additional edge can affect s_AA(u, v) either in the number of terms over N(u) ∩ N(v), by adding a term 1/log|N(w)|, or in the degree of some common neighbor w, in which case the change is 1/log|N(w)| − 1/log(|N(w)| + 1). In all cases, Δ_AA ≤ 1/log 2.

DPLP, a privacy-protecting LP framework: Since the traditional deterministic scoring protocols are sensitive to perturbations of G, they can lead to privacy leakage. In response, we present a generic DP framework for LP. Instead of selecting nodes deterministically, we sample them from a simple categorical distribution which depends on the corresponding scores (s_A) and a suitable privacy parameter σ that controls the noise injected into the sampler, and thus the leakage. Algorithm 1 summarizes the generic template.

Algorithm 1: R_K = DPLP(u, G, A, K)  ▷ Recommends the top K nodes that u may link to.
  Input: node u, non-neighbors v ∈ N̄(u), LP algorithm A, estimate of Δ_A, allowed privacy leakage level ε_p
  Output: K predicted nodes to which u might link.
  σ ← ε_p / (2K log(Δ_A + 1))
  R_0 ← ∅;  I_u ← N̄(u)
  for k ∈ 1 . . . K do  ▷ Main loop for predicting the 1st to Kth neighbor
      α ← [ (s_A(u, v) + Δ_A + 1)^σ / Σ_{w ∈ I_u} (s_A(u, w) + Δ_A + 1)^σ ]_{v ∈ I_u}  ▷ Probability of sampling the kth neighbor
      w ∼ Multinomial(α)  ▷ Sample the neighbor
      R_k ← R_{k−1} ∪ {w};  I_u ← I_u \ {w}  ▷ Update variables
  return R_K

Elaborating, to sample the kth node recommended to a query node u, we draw a potential neighbor v with probability proportional to (s_A(u, v) + Δ_A + 1)^σ, among all non-neighbors not selected in steps 1 to k − 1. The parameter ε_p controls the privacy leakage. If ε_p → ∞, then σ → ∞, the candidate node with maximum score gets selected, and the algorithm reduces to the original deterministic protocol A, which may violate the privacy of the nodes. On the other hand, if ε_p, σ → 0, then every node has an equal chance of being selected, which preserves privacy but has low predictive utility.

Applied in the utility setting of Machanavajjhala et al. [26], the popular Laplacian or Gaussian perturbation protocols [14] would generate |V| random numbers, perturb the raw scores, and choose one node with maximum perturbed score. In contrast, in each of the K iterations, DPLP first generates α without using any randomness, and then samples Multinomial(α) once per recommendation. Thus, DPLP uses O(K) random numbers and provides privacy at a smaller loss of ranking accuracy. Note that, since any monotone function of a score is also a suitable score, the exponential protocol [13] can also be used in Algorithm 1 to guarantee privacy. However, depending on the form of the monotone function, ranking utility may vary.
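To make the template concrete, the following is a minimal Python sketch of Algorithm 1 together with the three heuristic scores discussed earlier. All function names are ours; σ is set from the target leakage ε_p as σ = ε_p / (2K log(Δ_A + 1)), following Theorem 2 below, and the multinomial draw is written as a plain cumulative-weight sample.

```python
import math
import random

def s_cn(G, u, v):
    # Common neighbors: |N(u) ∩ N(v)|.  G maps node -> set of neighbors.
    return len(G[u] & G[v])

def s_jc(G, u, v):
    # Jaccard coefficient: |N(u) ∩ N(v)| / |N(u) ∪ N(v)|.
    union = len(G[u] | G[v])
    return len(G[u] & G[v]) / union if union else 0.0

def s_aa(G, u, v):
    # Adamic-Adar: sum of 1/log(degree) over common neighbors.
    # Degree-1 common neighbors are skipped to avoid log(1) = 0.
    return sum(1.0 / math.log(len(G[w])) for w in G[u] & G[v] if len(G[w]) > 1)

def dplp(G, u, K, eps_p, score=s_cn, delta_a=1.0, seed=None):
    """Sketch of Algorithm 1: sequentially draw K distinct non-neighbors
    of u, each with probability proportional to (score + Δ_A + 1)^σ."""
    rng = random.Random(seed)
    sigma = eps_p / (2.0 * K * math.log(delta_a + 1.0))
    candidates = [v for v in G if v != u and v not in G[u]]  # non-neighbors of u
    picked = []
    for _ in range(min(K, len(candidates))):
        weights = [(score(G, u, v) + delta_a + 1.0) ** sigma for v in candidates]
        x = rng.random() * sum(weights)   # one random number per pick: O(K) total
        acc = 0.0
        for v, w in zip(candidates, weights):
            acc += w
            if x <= acc:
                picked.append(v)
                candidates.remove(v)
                break
        else:  # floating-point edge case: fall back to the last candidate
            picked.append(candidates.pop())
    return picked
```

With ε_p large, the draw concentrates on the highest-scoring candidate, recovering the deterministic heuristic; with ε_p → 0 it degenerates to uniform sampling, matching the discussion above.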
In Section 5, we show that exponential perturbation achieves worse ranking quality in practice.

Analysis of the privacy of DPLP: Among our key results is that the template proposed in Algorithm 1 offers a DP guarantee for any positive scoring function s_A, provided Δ_A is bounded. The proof of the following formal claim is given in the supplementary material (Section B.1).

Theorem 2.
Given any LP algorithm A with bounded sensitivity of the scoring function s_A, the corresponding DP algorithm Ā given by DPLP (Algorithm 1) is ε_p = 2Kσ log(1 + Δ_A) differentially private.

As the sensitivity Δ_A increases, ε_p increases as well, thereby permitting more violation of privacy. This observation intuitively supports the basic notion of privacy leakage: for a highly sensitive scoring protocol, a small perturbation of the graph produces a substantially different set of recommendations.

While the above result holds for any generic LP algorithm, DPLP may give stronger guarantees for specific algorithms. In particular, the presence of such stronger bounds depends on the number of nodes affected by a single addition or deletion of an edge in the graph. For example, given a node u, adding or deleting one edge can change the common-neighbor score s_CN(u, v) for at most one node v, namely when it connects v to a neighbor of u. On the other hand, in the case of Adamic-Adar, the addition of one edge can change the scores of many nodes. In the following, we formally state the privacy guarantee for common neighbors and the Jaccard coefficient. The proof is given in the supplementary material (Section B.2).

Lemma 3.
Given the conditions of Theorem 2, if s_A(u, v) is computed using either common neighbors or the Jaccard coefficient, then Algorithm 1 is ε_p/2 differentially private.

Quality of privacy-protecting link prediction under the latent space model
Privacy can always be trivially protected by ignoring private data, but that leads to poor prediction. E.g., Algorithm 1 can provide extreme privacy by driving ε_p → 0, but such a protocol will select each node uniformly, which has no predictive value. Most proposals for privacy protection provide an empirical analysis of the variation of prediction quality against some DP guarantee parameter [2, 21, 34]. However, a formal guarantee would reveal more insight and help draw the possible boundaries of the underlying proposal. In this context, the relative loss of prediction quality (loss suffered in terms of an observable quantity, e.g., a scoring function) may be easier to analyze [26]. However, preservation of absolute prediction quality (utility in terms of the true latent generative process) is harder to prove, because, even in the non-DP case, absolute quality is rarely analyzed. A notable exception is the latent space model of Sarkar et al. [33], reviewed in Section 2. We leverage their paradigm in two ways: we first work with ranking losses over K > 1 neighbors, and then establish ranking quality in the face of privacy protection.

Relating ranking loss to absolute prediction error:
In general, any LP algorithm M provides two maps: π^M_u : N̄(u) → [|N̄(u)|] and u^M_i = (π^M_u)^{−1}(i). Here π^M_u gives the rank of v ∈ N̄(u), and u^M_i is the node at rank i recommended to node u by algorithm M. Any score-based LP algorithm M provides the ranking π^M_u by sorting scores s_M(u, v) in decreasing order. On the other hand, a sampling-based LP protocol (e.g., the proposed DP algorithm Ā) sequentially samples u^M_i. Furthermore, we denote by π*_u the perfect ranking over v ∈ N̄(u) induced by (increasing) latent distances d_uv, so that u*_i = (π*_u)^{−1}(i). Any ranking π^M_u may suffer some deviation from the 'ideal' permutation π*_u of the non-neighbors N̄(u). This may happen because M can build only imperfect estimates of the latent distances after observing G, and because it has to protect privacy. Therefore, we define a loss function, incurred by the ranking protocol M, which measures such a deviation from the hidden perfect order π*_u of non-neighbors up to rank K. In the context of the latent model [33], we define d^M_u = [d_{u u^M_1}, d_{u u^M_2}, . . .] as the sequence of latent distances from u to other nodes, as ordered by algorithm M. Because M is generally imperfect, d^M_u need not be monotone increasing. Following Negahban et al. [29], we define

    RankingLoss(d_u; d^M_u) = (1/(2K)) Σ_i …
Define ε = √(log(1/δ)/|V|) + log(1/δ)/(3(|V| − 1)), and recall that Ω(r) is the volume of the radius-r, D-dimensional hypersphere in the latent space random graph model. (We denote M as A if it is a non-DP algorithm, and as Ā if it is a DP algorithm induced by A.) With probability 1 − Kδ, we have

    E_CN[RankingLoss(d_u; d^CN_u)] ≤ (K/r) [ (Kε + γ_u(CN, ε_p)) / Ω(r) ]^{1/KD}
    E_AA[RankingLoss(d_u; d^AA_u)] ≤ (K/r) [ log(|V| Ω(r)) (2Kε + γ_u(AA, ε_p)/|V|) / Ω(r) ]^{1/KD}
    E_JC[RankingLoss(d_u; d^JC_u)] ≤ (K/r) [ 4Kε + 2γ_u(JC, ε_p)/|V| ]^{1/KD}

Note that the expectation is taken only over the differentially private algorithm, not over the randomness of the data; it is the randomness of the data that induces the high-probability statement.
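The latent-space setting behind these bounds is easy to simulate. The sketch below (all names ours) draws latent points uniformly from a D-dimensional ball of unit volume, builds the radius-r graph of Eq. (1), and counts out-of-order pairs in a top-K list as a simplified proxy for the ranking loss; the paper's RankingLoss differs in its weighting and normalization.

```python
import math
import random

def sample_unit_volume_ball(n, D, seed=0):
    # n latent points, uniform in a D-dimensional ball whose volume is 1.
    # The radius R solves vol(R) = pi^(D/2) R^D / Gamma(D/2 + 1) = 1.
    R = (math.gamma(D / 2 + 1) / math.pi ** (D / 2)) ** (1.0 / D)
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(D)]   # uniform direction
        norm = math.sqrt(sum(x * x for x in g))
        radius = R * rng.random() ** (1.0 / D)        # radius^D uniform => uniform in ball
        points.append([radius * x / norm for x in g])
    return points

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def geometric_graph(points, r):
    # Connect u and v iff d_uv < r, as in Eq. (1).
    G = {u: set() for u in range(len(points))}
    for u in range(len(points)):
        for v in range(u + 1, len(points)):
            if dist(points[u], points[v]) < r:
                G[u].add(v)
                G[v].add(u)
    return G

def top_k_inversions(order, d_u, K):
    # Pairs (i, j) with i < j within the top K whose latent distances are
    # out of order: a crude stand-in for RankingLoss(d_u; d_u^M).
    top = order[:K]
    return sum(1 for i in range(len(top)) for j in range(i + 1, len(top))
               if d_u[top[i]] > d_u[top[j]])
```

Plugging a DP ranking into `top_k_inversions` and averaging over runs gives an empirical analogue of the expectations bounded above.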
Proof idea:
To prove these inequalities, for each LP heuristic, we first bound the deviation of the common volumes shared by u and the recommended nodes u^Ā_i from those shared by u and the optimal nodes u*_i. Then we use this deviation to bound RankingLoss(d_u; d^Ā_u). The proof is given formally in the supplementary material (Appendix C.2). To investigate the behavior of the ranking loss as ε_p and |V| vary, we first estimate γ_u(A, ε_p) (for the proof, see Appendix C.3).

Lemma 6.
We have

    γ_u(A, ε_p) ≤ Σ_{i ∈ [K]}  [ s_A(u, u^A_i) (|V| − i + 1) (s_A(u, u^A_{i+1}) + Δ_A + 1)^σ ]
                               / [ (s_A(u, u^A_i) + Δ_A + 1)^σ + (|V| − i)(Δ_A + 1)^σ ],

where σ = ε_p / (2K log(1 + Δ_A)) and s_A(u, u^A_{i+1}) := max_i …

Note that as ε_p increases, the amount of privacy given by DPLP decreases; we can therefore treat 1/ε_p as a measure of privacy. Then, from Lemma 6, we obtain the following lemma.

Lemma 7. If s_max = max_{v ∈ N̄(u)} s_A(u, v) and PRIV := 1/ε_p, then:

    PRIV × log( γ_u(A, ε_p) / (2K s_max) ) ≤ K ( log(s_max + Δ_A + 1) / log(Δ_A + 1) − 1 )    (5)

The proof mostly leverages Lemma 6; it is given in the supplementary material (Appendix C.4). The above relation reveals that if the sensitivity increases, the maximum attainable privacy for maintaining a given utility decreases. Moreover, if the privacy is kept constant, then high sensitivity helps increase the utility (γ_u(A, ε_p) goes down). Since high sensitivity allows the underlying algorithm A to exploit rich signals, it provides better prediction.

In this section, we use eight real-world datasets to show that DPLP can trade off privacy and predictive accuracy more effectively than three standard baselines [13, 26, 28].

Datasets: We use eight datasets (USAir [7], C. Elegans [39], Yeast [38], Facebook [19], NS [30], PB [3], Power [39] and Ecoli [41]). These datasets are graphs with diverse sizes and structural properties. Appendix E.1 contains further details and statistics about them. Owing to space constraints, in this section we present results only for the first five datasets; Appendix E contains the results on the others.

Evaluation protocol and metrics: For each of these datasets, we only consider predicting neighbors for the set of query nodes Q that belong to at least one triangle, and we leave the other nodes out.
Such a pre-selection protocol is standard in the LP literature [5, 12], and allows a fair evaluation, specifically for triad-based LP heuristics. Then, following [12], for each query node q, in the fully-disclosed graph, the set V \ {q} is partitioned into neighbors N(q) and non-neighbors N̄(q). We sample 85% of the neighbors N(q) and 85% of the non-neighbors N̄(q) and present the resulting graph G_sampled to an LP protocol M (M = A for a usual non-private LP and M = Ā for a differentially private LP). Then, for each query node q, we ask M to provide a top-K (K = 10) list of potential neighbors (good items in the context of information retrieval [27]) from the held-out graph, consisting of the 15% secret neighbors and non-neighbors. We compute the average precision value AP(q) and finally report the mean average precision (MAP), i.e., (1/|Q|) Σ_{q ∈ Q} AP(q), to measure the predictive power of M. (See https://github.com/muhanzhang/SEAL, used in [40].)

Table 1: Comparison of performance in terms of expected Mean Average Precision (MAP) between various differentially private algorithms, e.g., DPLP, Laplace, Gaussian and Exponential, for a 15% held-out set with ε_p = 0.1 and K = 10. The expectation is computed using an MC approximation with n = 10 runs of randomization. Error analysis is given in the Appendix. The first (last) five rows indicate the performance of triad-based LP heuristics (graph embedding techniques). DPLP outperforms the Laplacian, Gaussian and Exponential mechanisms across almost all the datasets.
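The MAP metric used above can be computed as follows; this is one common AP@K convention (normalizing by min(|relevant|, K)), with the held-out neighbors playing the role of the relevant items, and all names are ours.

```python
def average_precision_at_k(ranked, relevant, K):
    # AP(q) for one query: ranked is the predicted top-K list,
    # relevant is the set of held-out true neighbors of q.
    hits, total = 0, 0.0
    for i, v in enumerate(ranked[:K], start=1):
        if v in relevant:
            hits += 1
            total += hits / i          # precision at each hit position
    denom = min(len(relevant), K)
    return total / denom if denom else 0.0

def mean_average_precision(predictions, truth, K):
    # MAP = (1/|Q|) * sum over query nodes q of AP(q).
    return sum(average_precision_at_k(predictions[q], truth[q], K)
               for q in predictions) / len(predictions)
```

Here `predictions` maps each query node to its ranked list and `truth` maps it to the set of held-out neighbors.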
In contrast to some previous works [5, 40], we avoid AUC as an accuracy metric due to two pitfalls: (i) some differentially private mechanisms (like DPLP and Exponential) randomly sample nodes, so the sequence of sampled nodes need not comply with the relative order of scores; and (ii) AUC strongly relies on the number of neighbor vs. non-neighbor pairs in the top-K-ranked list, and is somewhat immune to class imbalance. As a result, AUC is usually large and undiscerning for any reasonable LP protocol over a sparse graph [12, 27].

Candidates for traditional LP protocols A: We consider two classes of base LP algorithms A: (i) algorithms based on the triad-completion principle, i.e., the ones we analyzed (CN, JC, AA), and (ii) algorithms based on fitting node embeddings, i.e., Node2Vec [16], Struct2Vec [31] and PRUNE [18], which are beyond the scope of the current theoretical analysis. In addition, we report results on two other LP protocols, Cumulative Random Walk (CRW) [24] and LINE [36], in Appendix E.3.

DPLP and baselines (Ā): We compare DPLP with three other perturbation mechanisms that maintain differential privacy: (i) Laplacian, (ii) Gaussian, and (iii) Exponential perturbation. To evaluate a differentially private algorithm Ā on the above LP protocols, we first fix the level of privacy leakage ε_p and apply Ā to the various base LP algorithms; we then run n = 10 trials and report E_Ā[MAP]. The exact choice of ε_p differs across experiments.

Results: Table 1 compares the expected MAP estimates at a given privacy leakage ε_p = 0.1, for various data sets and base LP algorithms, for top-K (K = 10) predictions. At the very outset, DPLP almost always outperforms the Laplacian, Gaussian and Exponential mechanisms.
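For reference, here is a sketch of the Laplace baseline described in Section 3: one noise draw per candidate (O(|V|) random numbers), with the single-query scale Δ_A/ε. All names are ours, and calibrating a full top-K release would additionally require composition across the K reported positions, which this sketch leaves out.

```python
import math
import random

def laplace_top_k(scores, K, eps, delta_a=1.0, seed=None):
    """Baseline: add Laplace(Δ_A / eps) noise to every candidate's raw LP
    score and return the K candidates with the highest noisy scores.
    `scores` maps candidate node -> raw score."""
    rng = random.Random(seed)
    b = delta_a / eps                      # single-query Laplace scale

    def laplace():
        # Inverse-CDF sampling of Laplace(0, b).
        u = rng.uniform(-0.5 + 1e-12, 0.5 - 1e-12)
        return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    noisy = {v: s + laplace() for v, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:K]
```

Note the contrast with DPLP: here the number of random draws grows with |V|, which is the source of the extra utility loss discussed above.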
This is because the Laplacian and Gaussian mechanisms use O(|V|) random numbers, thereby injecting more randomness than the DPLP and Exponential protocols, which generate O(K) random numbers. While the substantially better performance of DPLP over the baselines on CN, AA and JC corroborates our theoretical analysis, we note that the improvement is not as significant for the other three base LP algorithms, which involve deep graph embedding techniques. A formal theoretical analysis of this observation is beyond the scope of the paper. We believe that these embedding methods already use many sources of randomness, from initialization to negative samples, which makes them immune to an additional random perturbation. In fact, it is surprising that the overall MAP values of the node embedding methods were generally worse than those of the simple LP protocols; however, this is also noted by others [40].

Figure 1: Variation of E_Ā(MAP) as privacy decreases, for DPLP over A = CN, AA and JC, across five datasets (USAir, C. Elegans, Facebook, NS, Yeast) with K = 10. In almost all cases, we observe that as we allow more privacy leakage, MAP increases.

Figure 2: Variation of E_Ā(MAP) as privacy decreases, for the DPLP, Laplacian, Gaussian and Exponential mechanisms, on the Yeast dataset across different LP algorithms (CN, AA, JC, Node2Vec, Struct2Vec, PRUNE) with K = 10. For the triad-based LP protocols, i.e., CN, AA and JC, we observe that as we allow more privacy leakage, MAP increases for all methods. For LP protocols based on node embeddings, the performance remains stable.
The overall MAP values of the node embedding methods are worse than those of the simple triad-based protocols, which is also noted in [40].

Figure 1 fixes Ā = DPLP and K = 10, varies the privacy level ε_p, and compares expected MAP using the base methods CN, JC and AA on various data sets. AA is generally the best. There is an increasing trend of MAP with increasing ε_p (reducing privacy), which is expected.

Figure 2 shows a different slice of the data. We fix the dataset to Yeast and K = 10. Each chart is for a base method, and shows MAP vs. privacy for the various perturbation protocols Ā. For the traditional deterministic methods CN, AA and JC, the upward trend continues. Curiously, for the node embedding approaches, such trends, if any, are very weak. As mentioned before, LP methods that depend on node embeddings invest large amounts of randomness even without privacy requirements. Another important issue for node embedding methods is that the node embeddings include private information; decisions made by comparing these embeddings may leak information.

In this paper, we have presented DPLP, a perturbation protocol that turns non-DP LP heuristics into DP versions, in a top-K node ranking setting. After establishing DP guarantees when DPLP is applied to three popular LP heuristics, we analyzed the loss of predictive quality of DPLP in a latent distance graph generative framework. We also characterized the trade-off between privacy and absolute predictive quality in this framework. In contrast to the popular Laplace or Gaussian mechanisms, which use O(|V|) random numbers, we inject O(K) random numbers, which allows for more accurate link prediction at the same amount of privacy leakage. Extensive experiments showed that DPLP is superior to the popular Laplacian, Gaussian and Exponential perturbation protocols. Our work opens up several interesting directions for future work.
In particular, we can investigate DPLP, or similarly effective protocols, applied to recent LP algorithms such as supervised random walks [5] and graph-based deep networks [40]. Collusion between querying nodes, graph steganography [6], and repeated data exposure over time entail privacy breach risks that are not modeled in our framework. Bhagat et al. [8] use LP on a current graph snapshot to guide how future exposure should be curtailed. Further analysis along these lines may be of interest.

References

[1] DPLP. URL https://github.com/bingola/DPLP.
[2] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang. Deep learning with differential privacy. In ACM SIGSAC CCS, pages 308–318, 2016. URL https://arxiv.org/abs/1607.00133.
[3] R. Ackland et al. Mapping the US political blogosphere: Are conservative bloggers more prominent? In BlogTalk Downunder 2005 Conference, Sydney, 2005.
[4] L. A. Adamic and E. Adar. Friends and neighbors on the Web. Social Networks, 25(3):211–230, 2003.
[5] L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM Conference, pages 635–644, 2011. doi: 10.1145/1935826.1935914.
[6] L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore art thou R3579X?: anonymized social networks, hidden patterns, and structural steganography. In WWW Conference, pages 181–190, 2007.
[7] V. Batagelj and A. Mrvar. USAir dataset, 2006. URL http://vlado.fmf.uni-lj.si/pub/networks/data/.
[8] S. Bhagat, G. Cormode, D. Srivastava, and B. Krishnamurthy. Prediction promotes privacy in dynamic social networks. WOSN, 2010.
[9] S. Bubeck, J. Ding, R. Eldan, and M. Z. Rácz. Testing for high-dimensional geometry in random graphs. Random Structures and Algorithms, 49(3):503–532, 2016.
URL https://arxiv.org/abs/1411.5713.
[10] J. Chen, J. He, L. Cai, and J. Pan. Disclose more and risk less: Privacy preserving online social network data sharing. IEEE Transactions on Dependable and Secure Computing, 2018. URL https://ieeexplore.ieee.org/iel7/8858/4358699/08423199.pdf.
[11] F. Chierichetti, A. Epasto, R. Kumar, S. Lattanzi, and V. Mirrokni. Efficient algorithms for public-private social networks. In SIGKDD Conference, 2015.
[12] A. De, N. Ganguly, and S. Chakrabarti. Discriminative link prediction using local links, node features and community structure. In ICDM, pages 1009-1018. IEEE, 2013. URL https://ieeexplore.ieee.org/iel7/6724379/6729471/06729590.pdf.
[13] C. Dwork and A. Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014.
[14] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265-284, 2006. URL https://link.springer.com/content/pdf/10.1007/11681878_14.pdf.
[15] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9(Aug):1871-1874, 2008.
[16] A. Grover and J. Leskovec. node2vec: Scalable feature learning for networks. In SIGKDD Conference, pages 855-864, 2016. URL https://arxiv.org/abs/1607.00653.
[17] P. D. Hoff, A. E. Raftery, and M. S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090-1098, 2002.
[18] Y.-A. Lai, C.-C. Hsu, W. H. Chen, M.-Y. Yeh, and S.-D. Lin. PRUNE: Preserving proximity and global ranking for network embedding. In Advances in Neural Information Processing Systems, pages 5257-5266, 2017. URL http://papers.nips.cc/paper/7110-prune-preserving-proximity-and-global-ranking-for-network-embedding.
[19] J. Leskovec and J. J.
McAuley. Learning to discover social circles in ego networks. In Advances in Neural Information Processing Systems, pages 539-547, 2012.
[20] D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. JASIST, 58(7):1019-1031, 2007. ISSN 1532-2890. doi: 10.1002/asi.20591. URL http://macroconnections.media.mit.edu/wp-content/uploads/2011/03/10.1.1.163.6528-1.pdf.
[21] Z. Liu, Y.-X. Wang, and A. Smola. Fast differentially private matrix factorization. In Proceedings of the 9th ACM Conference on Recommender Systems, pages 171-178. ACM, 2015.
[22] L. Lü and T. Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications, 390(6):1150-1170, 2011.
[23] L. Lü and T. Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications, 390(6):1150-1170, 2011.
[24] L. Lü, C.-H. Jin, and T. Zhou. Similarity index based on local paths for link prediction of complex networks. Physical Review E, 80(4):046122, 2009.
[25] L. Lü, L. Pan, T. Zhou, Y.-C. Zhang, and H. E. Stanley. Toward link predictability of complex networks. PNAS, 112(8):2325-2330, 2015.
[26] A. Machanavajjhala, A. Korolova, and A. D. Sarma. Personalized social recommendations - accurate or private? CoRR, abs/1105.4254, 2011. URL http://arxiv.org/abs/1105.4254. VLDB 2011.
[27] C. Manning, P. Raghavan, and H. Schütze. Introduction to information retrieval. Natural Language Engineering, 16(1):100-103, 2010.
[28] F. McSherry and K. Talwar. Mechanism design via differential privacy. In FOCS, pages 94-103, 2007. URL http://kunaltalwar.org/papers/expmech.pdf.
[29] S. Negahban, S. Oh, and D. Shah. Rank centrality: Ranking from pairwise comparisons. Operations Research, 65(1):266-287, 2016. URL https://pubsonline.informs.org/doi/pdf/10.1287/opre.2016.1534.
[30] M. E. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3):036104, 2006.
[31] L. F.
Ribeiro, P. H. Saverese, and D. R. Figueiredo. struc2vec: Learning node representations from structural identity. In SIGKDD Conference, pages 385-394, 2017.
[32] B. K. Samanthula, L. Cen, W. Jiang, and L. Si. Privacy-preserving and efficient friend recommendation in online social networks. Transactions on Data Privacy, 8(2):141-171, 2015.
[33] P. Sarkar, D. Chakrabarti, and A. W. Moore. Theoretical justification of popular link prediction heuristics. In IJCAI, 2011.
[34] E. Shen and T. Yu. Mining frequent graph patterns with differential privacy. In SIGKDD Conference, pages 545-553, 2013.
[35] M. Taddicken. The 'Privacy Paradox' in the social web: The impact of privacy concerns, individual characteristics, and the perceived social relevance on different forms of self-disclosure. Journal of Computer-Mediated Communication, 19(2):248-273, 2014.
[36] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. LINE: Large-scale information network embedding. In WWW Conference, pages 1067-1077, 2015.
[37] C. E. Tucker. Social networks, personalized advertising, and privacy controls. Journal of Marketing Research, 51(5):546-562, 2014. doi: 10.1509/jmr.10.0355. URL https://doi.org/10.1509/jmr.10.0355.
[38] C. Von Mering, R. Krause, B. Snel, M. Cornell, S. G. Oliver, S. Fields, and P. Bork. Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417(6887):399, 2002.
[39] D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440, 1998.
[40] M. Zhang and Y. Chen. Link prediction based on graph neural networks. In NeurIPS, pages 5165-5175, 2018. URL https://arxiv.org/pdf/1802.09691.pdf.
[41] M. Zhang, Z. Cui, S. Jiang, and Y. Chen. Beyond link prediction: Predicting hyperlinks in adjacency space. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[42] T. Zhou and H. Li.
Understanding mobile SNS continuance usage in China from the perspectives of social influence and privacy concern. Computers in Human Behavior, 37:283-289, 2014.

Supplementary Material

Appendix A  Formal notion of differential privacy

Here, we formally state the definition of differential privacy [13, 14].

Definition 8. An algorithm $M$ is $(\epsilon_p, \delta_p)$-differentially private if, for any two datasets $D, D'$ differing in exactly one data point, and for all measurable sets $O \subseteq \mathrm{Range}(M)$, the following holds:
$$\Pr(M(D) \in O) \le e^{\epsilon_p} \Pr(M(D') \in O) + \delta_p \qquad (6)$$
Equivalently, a randomized algorithm $M$ is said to be $\epsilon_p$-differentially private ($\delta_p = 0$) if
$$\left| \log \frac{\Pr(M(D) \in O)}{\Pr(M(D') \in O)} \right| \le \epsilon_p \qquad (7)$$
In the case of a graph $G(\mathcal{V}, \mathcal{E})$, we say that a randomized algorithm $M$ that takes a graph $G$ as input is $\epsilon_p$-differentially private if Eq. (7) holds whenever $G$ and $G'$ differ in one edge.

Given any revealed information $q(\cdot)$, three standard mechanisms work by adding noise to $q$ in different ways: the (i) Laplace [13, 14], (ii) Gaussian [13, 14] and (iii) exponential [28] mechanisms.

Appendix B  Proofs of technical results in Section 3

Here, we first restate and prove the main result for the privacy guarantee.

B.1  Privacy guarantee of Algorithm 1

Theorem 2. Given any positive scoring function $s_A(u, \cdot)$ with bounded sensitivity $\Delta_A(G)$, DPLP in Algorithm 1 is $\epsilon_p$-differentially private.

Proof. Here $u$ and $A$ are fixed throughout and will be omitted when convenient. We need to prove that
$$\left| \log \frac{\Pr(R_K \mid G)}{\Pr(R_K \mid G')} \right| < \epsilon_p, \qquad (8)$$
where $G_u$ and $G'_u$ differ in one edge but $N(u)$ is the same in both. The reason is that $N(u)$ also specifies the output space (the support of the recommendation).
By definition, the addition or deletion of one edge changes the score $s_A(u,v \mid G)$ by at most $\Delta_A(G)$, i.e.,
$$s_A(u,v \mid G) - \Delta_A(G) \le s_A(u,v \mid G') \le s_A(u,v \mid G) + \Delta_A(G).$$
Over the $K$ iterations of Algorithm 1, let $R_0 = \emptyset, R_1, \ldots, R_K$ be the recommended node sets, where $R_k = R_{k-1} \cup \{u^A_k\}$. Then
$$\Pr(R_K \mid G) = \prod_{k=1}^{K} \Pr(u^A_k \mid R_{k-1}; G) = \prod_{k=1}^{K} \frac{(s_A(u, u^A_k \mid G) + \Delta_A + 1)^{\sigma}}{\sum_{w \in N(u) \setminus R_{k-1}} (s_A(u, w \mid G) + \Delta_A + 1)^{\sigma}} \qquad (9)$$
As $A$, $G$ and $u$ are fixed, we shorthand $s_A(u, v \mid G)$ as $\psi_v$ for the rest of the proof. Recall that $\Delta_A = \max_{G'} \max_{u,v \in \mathcal{V}} \Delta^A_{uv}$, where $\Delta^A_{uv} = |s_A(u,v \mid G) - s_A(u,v \mid G')|$. Consider the term
$$\left| \log \frac{\Pr(u^A_k \mid R_{k-1}; G)}{\Pr(u^A_k \mid R_{k-1}; G')} \right| = \left| \log \left[ \frac{(\psi_{u_k} + \Delta_A + 1)^{\sigma}}{(\psi_{u_k} + \Delta_{u u^A_k} + \Delta_A + 1)^{\sigma}} \cdot \frac{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_{uw} + \Delta_A + 1)^{\sigma}}{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_A + 1)^{\sigma}} \right] \right|.$$
We bound the two ratios separately. First,
$$\frac{(\psi_{u_k} + \Delta_A + 1)^{\sigma}}{(\psi_{u_k} + \Delta_{u u^A_k} + \Delta_A + 1)^{\sigma}} \le \frac{(\psi_{u_k} + \Delta_A + 1)^{\sigma}}{(\psi_{u_k} + 1)^{\sigma}} = \left[ 1 + \frac{\Delta_A}{\psi_{u_k} + 1} \right]^{\sigma} \le (1 + \Delta_A)^{\sigma} \qquad (10)$$
and likewise the first ratio is at least $(1 + \Delta_A)^{-\sigma}$. For an upper bound on the second ratio,
$$\frac{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_{uw} + \Delta_A + 1)^{\sigma}}{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_A + 1)^{\sigma}} \stackrel{(1)}{\le} \frac{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + 2\Delta_A + 1)^{\sigma}}{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_A + 1)^{\sigma}} \qquad (11)$$
$$\stackrel{(2)}{\le} \frac{1}{|N(u) \setminus R_{k-1}|} \sum_{w \in N(u) \setminus R_{k-1}} \frac{(\psi_w + 2\Delta_A + 1)^{\sigma}}{(\psi_w + \Delta_A + 1)^{\sigma}} \stackrel{(3)}{\le} (1 + \Delta_A)^{\sigma}. \qquad (12)$$
Inequality (1) is because $\Delta_{uw} \le \Delta_A$.
Inequality (2) is due to Chebyshev's sum inequality (Fact 10), with $x_w = (\psi_w + 2\Delta_A + 1)^{\sigma} / (\psi_w + \Delta_A + 1)^{\sigma}$ and $y_w = (\psi_w + \Delta_A + 1)^{\sigma}$. Inequality (3) is because $(\psi_w + 2\Delta_A + 1)^{\sigma} / (\psi_w + \Delta_A + 1)^{\sigma}$ is a decreasing function of $\psi_w$, hence at most $(1 + \Delta_A)^{\sigma}$. Similarly, we obtain a lower bound on the second ratio:
$$\frac{\sum_{w \in N(u)\setminus R_{k-1}} (\psi_w + \Delta_{uw} + \Delta_A + 1)^{\sigma}}{\sum_{w \in N(u)\setminus R_{k-1}} (\psi_w + \Delta_A + 1)^{\sigma}} \ge (1 + \Delta_A)^{-\sigma}. \qquad (13)$$
From the above steps,
$$\left| \log \frac{\Pr(R_K \mid G)}{\Pr(R_K \mid G')} \right| \le \sum_{k \in [K]} \left| \log \frac{\Pr(u^A_k \mid R_{k-1}; G)}{\Pr(u^A_k \mid R_{k-1}; G')} \right| \le 2K\sigma \log(1 + \Delta_A) = \epsilon_p. \qquad (14)$$

B.2  Privacy guarantees of Algorithm 1 for common neighbors and Jaccard coefficient

Lemma 3. Given the conditions of Theorem 2, Algorithm 1 is $\epsilon_p/2$-differentially private for $s_{CN}$ and $s_{JC}$.

Proof. We only argue the common-neighbors case; the argument for the Jaccard coefficient is similar. Assume that an edge $(v, y) \notin E$ is present in $E'$, with $v \notin N(u)$. Such an edge affects $|N(u) \cap N(v)|$ only if $y \in N(u)$; even then, it can change $s_{CN}(u, v)$ by at most one. Hence $\Delta_A = 1$. To lighten notation, we denote $\psi_v = s_{CN}(u, v)$. As before, we seek to bound $r_k = \left| \log \frac{\Pr(u^A_k \mid R_{k-1}; G)}{\Pr(u^A_k \mid R_{k-1}; G')} \right|$, where $\Pr(u^A_k \mid R_{k-1}; G_u) = \frac{(\psi_{u_k} + \Delta_A + 1)^{\sigma}}{\sum_{w \in N(u) \setminus R_{k-1}} (\psi_w + \Delta_A + 1)^{\sigma}}$. Note that the value of $\psi_v$ changes only for the pair $(u, v)$; for all other pairs it remains the same.
Thus,
$$r_k = \begin{cases} \left|\log \dfrac{\sum_{w \notin R_{k-1} \cup \{v\}} (\psi_w+\Delta_A+1)^{\sigma} + (\psi_v+2\Delta_A+1)^{\sigma}}{\sum_{w \notin R_{k-1} \cup \{v\}} (\psi_w+\Delta_A+1)^{\sigma} + (\psi_v+\Delta_A+1)^{\sigma}}\right|, & \text{if } v \ne u^A_k, \\[2ex] \left|\log\left[\dfrac{(\psi_v+\Delta_A+1)^{\sigma}}{(\psi_v+2\Delta_A+1)^{\sigma}} \cdot \dfrac{\sum_{w \notin R_{k-1} \cup \{v\}} (\psi_w+\Delta_A+1)^{\sigma} + (\psi_v+2\Delta_A+1)^{\sigma}}{\sum_{w \notin R_{k-1} \cup \{v\}} (\psi_w+\Delta_A+1)^{\sigma} + (\psi_v+\Delta_A+1)^{\sigma}}\right]\right|, & \text{if } v = u^A_k. \end{cases} \qquad (15)$$
From item (1) in Fact 9, we have $r_k \le \sigma \log(1+\Delta_A)$ when $v \ne u^A_k$. From item (2) in Fact 9, we have $r_k \le \log\frac{(\psi_v+2\Delta_A+1)^{\sigma}}{(\psi_v+\Delta_A+1)^{\sigma}} \le \sigma\log(1+\Delta_A)$ when $v = u^A_k$. Collecting all cases, $\sum_{k\in[K]} r_k \le K\sigma\log(1+\Delta_A) = \epsilon_p/2$.

Fact 9. (1) $\dfrac{(n+2\Delta_A+1)^{\sigma}+k}{(n+\Delta_A+1)^{\sigma}+k} \le (1+\Delta_A)^{\sigma}$ for $n \ge 0$, $k > 0$. (2) $\dfrac{(n+\Delta_A+1)^{\sigma}}{(n+2\Delta_A+1)^{\sigma}} \cdot \dfrac{k+(n+2\Delta_A+1)^{\sigma}}{k+(n+\Delta_A+1)^{\sigma}} < 1$.

Fact 10 (Chebyshev sum inequality). If $(x_w)$ and $(y_w)$ are decreasing and increasing sequences respectively, then $|R| \sum_{w\in R} x_w y_w < \sum_{w\in R} x_w \sum_{w\in R} y_w$.

Appendix C  Proofs of technical results in Section 4

C.1  Proof of Proposition 4

Proposition 4. $\mathrm{RankingLoss}(d^*_u, d^M_u) \le \frac{1}{K}\sum_{i\in[K]} \big(d_{uu^M_i} - d_{uu^*_i}\big)$.

Proof. Let the nodes, ordered by decreasing $s^M_u$, be $u^M_1, \ldots, u^M_K, \ldots$; we are not concerned with positions after $K$. Let the latent distances from $u$ to these nodes be $d_{uu^M_1}, d_{uu^M_2}, \ldots, d_{uu^M_K}$. If $i < j$ then we want $d_{uu^M_i} < d_{uu^M_j}$, but this may not happen, in which case we assess a loss of $(d_{uu^M_i} - d_{uu^M_j})$, following Negahban et al. [29]. This can be summed up as $\frac{1}{K}\sum_i \big(d_{uu^M_i} - d_{uu^*_i}\big)$, which yields the claim.

C.2  Proof of Theorem 5

We apply Lemma 11 to prove these theorems.
Recall that $r$ is the threshold distance for an edge between two nodes in the latent space model, and that $\gamma_u(A, \epsilon_p)$ is given as follows:
$$\gamma_u(A,\epsilon_p) = \mathbb{E}_A\Big[\sum_{i\in[K]} \big(s_A(u,u^*_i \mid G) - s_A(u,u^A_i \mid G)\big)\Big] \qquad (18)$$

Theorem 5. Define $\epsilon = \sqrt{\frac{2\log(2/\delta)}{|\mathcal{V}|}} + \frac{2\log(2/\delta)}{3(|\mathcal{V}|-1)}$. With probability $1 - 2K^2\delta$, we have:
$$(i)\ A = \mathrm{CN}: \quad \mathbb{E}_A\big(\mathrm{RankingLoss}(d^*_u; d^A_u)\big) \le 2Kr\Big(\frac{2K\epsilon + \gamma_u(\mathrm{CN},\epsilon_p)/|\mathcal{V}|}{\Omega(r)}\Big)^{1/KD}$$
$$(ii)\ A = \mathrm{AA}: \quad \mathbb{E}_A\big(\mathrm{RankingLoss}(d^*_u; d^A_u)\big) \le 2Kr\Big(\frac{\log(|\mathcal{V}|\Omega(r))\,(2K\epsilon + \gamma_u(\mathrm{AA},\epsilon_p)/|\mathcal{V}|)}{\Omega(r)}\Big)^{1/KD}$$
$$(iii)\ A = \mathrm{JC}: \quad \mathbb{E}_A\big(\mathrm{RankingLoss}(d^*_u; d^A_u)\big) \le 2Kr\big(4K\epsilon + 2\gamma_u(\mathrm{JC},\epsilon_p)/|\mathcal{V}|\big)^{1/KD}$$

Proof. Here we show only the case of common neighbors; the others follow by the same method. We first define
$$\tilde\gamma_u(A,\epsilon_p) = \sum_{i\in[K]} \big(s_A(u,u^*_i) - s_A(u,u^A_i)\big), \qquad (19)$$
and note that $\mathbb{E}_A(\tilde\gamma_u(A,\epsilon_p)) = \gamma_u(A,\epsilon_p)$.

We first fix one realization of $A$ for $A = \mathrm{CN}$. Denote $\epsilon_A = \epsilon + \tilde\gamma_u(\mathrm{CN},\epsilon_p)/(2K|\mathcal{V}|)$. Further, denote $S_t = \sum_{i=1}^{t} (d_{uu^*_i} - d_{uu^A_i})$ and $\epsilon_{A,t} = 2rt\big(\frac{2t\epsilon_A}{\Omega(r)}\big)^{1/tD}$; then $S_t \le \epsilon_{A,t}\ \forall t \implies -\epsilon_{A,t-1} < S_t - S_{t-1} < \epsilon_{A,t}$. Therefore $\Pr(|S_t - S_{t-1}| < \epsilon_{A,t}) \ge 1 - \Pr(S_t > \epsilon_{A,t}) - \Pr(S_{t-1} > \epsilon_{A,t-1}) \ge 1 - 2t\delta$ (the last inequality is due to Lemma 14). Now, given $K \ll N$, $2rt\big(\frac{2t\epsilon_A}{\Omega(r)}\big)^{1/tD} < 2Kr\big(\frac{2K\epsilon_A}{\Omega(r)}\big)^{1/tD} < 2Kr\big(\frac{2K\epsilon_A}{\Omega(r)}\big)^{1/KD}$. Next, we have $\Pr\big(\sum_{t\in[K]}|S_t-S_{t-1}| < K\epsilon_{A,K}\big) \ge \Pr\big(\sum_{t\in[K]}|S_t-S_{t-1}| < \sum_{t\in[K]}\epsilon_{A,t}\big) \ge 1 - \sum_{t\in[K]}\Pr\big(|S_t-S_{t-1}| \ge \epsilon_{A,t}\big) \ge 1 - K(K+1)\delta \ge 1 - 2K^2\delta$.
Now, we note that
$$\sum_{t\in[K]} |S_t - S_{t-1}| < K\epsilon_{A,K} \implies \mathbb{E}_A\Big[\sum_{t\in[K]} |S_t - S_{t-1}|\Big] < K\,\mathbb{E}_A(\epsilon_{A,K}) \qquad (20)$$
Since $f(x) = x^{1/KD}$ is concave in $x$ for $D \ge 1$, we apply Jensen's inequality to get
$$\mathbb{E}_A(\epsilon_{A,K}) \le 2Kr\Big(\frac{2K\epsilon + \mathbb{E}_A(\tilde\gamma_u(\mathrm{CN},\epsilon_p))/|\mathcal{V}|}{\Omega(r)}\Big)^{1/KD} \qquad (21)$$
which immediately leads to the desired result.

C.3  Expected error in score due to Algorithm 1

Lemma 6. We have
$$\gamma_u(A,\epsilon_p) \le \sum_{i\in[K]} \frac{s_A(u,u^A_i)\,(|\mathcal{V}|-i+1)\,(s_A(u,u^A_{i+})+\Delta_A+1)^{\sigma}}{(s_A(u,u^A_i)+\Delta_A+1)^{\sigma} + (|\mathcal{V}|-i)(\Delta_A+1)^{\sigma}},$$
where $\sigma = \frac{\epsilon_p}{2K\log(1+\Delta_A)}$ and $s_A(u,u^A_{i+}) := \max_{j > i} s_A(u,u^A_j)$.

Lemma 7. If $s_{\max} = \max_{v\in N(u)} s_A(u,v)$ and $\textsc{Priv} := 1/\epsilon_p$, then we have:
$$\textsc{Priv} \times \log\Big(\frac{\gamma_u(A,\epsilon_p)}{2Ks_{\max}}\Big) \le 2K\Big(\frac{\log(s_{\max}+\Delta_A+1)}{\log(\Delta_A+1)} - 1\Big) \qquad (28)$$

Proof. With $s_{\max} = \max_{v\in N(u)} s_A(u,v)$,
$$\gamma_u(A,\epsilon_p) \le s_{\max} \sum_{i\in[K]} \frac{|\mathcal{V}|-i+1}{|\mathcal{V}|-i}\Big(1+\frac{s_{\max}}{1+\Delta_A}\Big)^{-\sigma} \le 2Ks_{\max}\Big(1+\frac{s_{\max}}{1+\Delta_A}\Big)^{-\sigma} \qquad (29)$$
Therefore,
$$\sigma \ge \frac{\log\big(\gamma_u(A,\epsilon_p)/(2Ks_{\max})\big)}{\log\big(1+s_{\max}/(1+\Delta_A)\big)} \qquad (30)$$
Putting $\sigma = \epsilon_p/(2K\log(\Delta_A+1))$, we have the required bound.

Appendix D  Auxiliary Lemmas

In this section, we first provide a set of key auxiliary lemmas that will be used to derive several results in the paper. We first denote
$$\tilde\gamma_u(A,\epsilon_p) = \sum_{i\in[K]} \big(s_A(u,u^*_i) - s_A(u,u^A_i)\big) \qquad (31)$$
Further, let $\omega_r(u,v) = \omega_r(v,u)$ be the common volume of the $D$-dimensional hyperspheres of radius $r$ centered at $u$ and $v$.

Lemma 11. Define $\epsilon = \sqrt{\frac{2\log(2/\delta)}{|\mathcal{V}|}} + \frac{2\log(2/\delta)}{3(|\mathcal{V}|-1)}$.
Then, with probability at least $1 - 2K\delta$, we have the following bounds:
$$(i)\ A = \mathrm{CN}: \quad 0 \le \sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{CN}_i) \le 2K\epsilon + \tilde\gamma_u(\mathrm{CN},\epsilon_p)/|\mathcal{V}|$$
$$(ii)\ A = \mathrm{AA}: \quad 0 \le \sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{AA}_i) \le \log(|\mathcal{V}|\Omega(r))\,\big(2K\epsilon + \tilde\gamma_u(\mathrm{AA},\epsilon_p)/|\mathcal{V}|\big)$$
$$(iii)\ A = \mathrm{JC}: \quad 0 \le \sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{JC}_i) \le 2\Omega(r)\,\big(2K\epsilon + \tilde\gamma_u(\mathrm{JC},\epsilon_p)/|\mathcal{V}|\big)$$

Proof of (i): We observe that the event
$$\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{CN}_i) \ge \frac{\sum_{i\in[K]} s_{CN}(u,u^*_i) - \sum_{i\in[K]} s_{CN}(u,u^{CN}_i)}{|\mathcal{V}|} + 2K\epsilon$$
implies
$$\sum_{i\in[K]} \frac{s_{CN}(u,u^{CN}_i)}{|\mathcal{V}|} - \sum_{i\in[K]} \frac{s_{CN}(u,u^*_i)}{|\mathcal{V}|} + \sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{CN}_i) \ge 2K\epsilon,$$
which in turn implies
$$\bigvee_{i\in[K]} \Big(\frac{s_{CN}(u,u^{CN}_i)}{|\mathcal{V}|} - \omega_r(u,u^{CN}_i) \ge \epsilon\Big) \ \vee\ \bigvee_{i\in[K]} \Big({-\frac{s_{CN}(u,u^*_i)}{|\mathcal{V}|}} + \omega_r(u,u^*_i) \ge \epsilon\Big).$$
Hence
$$\Pr\Big(\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{CN}_i) \ge \frac{\sum_{i\in[K]} s_{CN}(u,u^*_i) - \sum_{i\in[K]} s_{CN}(u,u^{CN}_i)}{|\mathcal{V}|} + 2K\epsilon\Big)$$
$$\le \sum_{i\in[K]} \Pr\Big(\frac{s_{CN}(u,u^{CN}_i)}{|\mathcal{V}|} - \omega_r(u,u^{CN}_i) \ge \epsilon\Big) + \sum_{i\in[K]} \Pr\Big({-\frac{s_{CN}(u,u^*_i)}{|\mathcal{V}|}} + \omega_r(u,u^*_i) \ge \epsilon\Big) \le 2K\delta \qquad (32)$$
Here the $u^{CN}_i$ are the nodes output by the algorithm, so the deficit $\sum_i s_{CN}(u,u^*_i) - \sum_i s_{CN}(u,u^{CN}_i)$ is $\tilde\gamma_u(\mathrm{CN},\epsilon_p)$. Statement (2) is because $X + Y \ge 2a \implies X \ge a$ or $Y \ge a$; statement (3) is due to (1) and (2); inequality (4) is due to the empirical Bernstein inequality.

Proof of (ii): Note that $\mathbb{E}[s_{AA}(u,v)] = \frac{|\mathcal{V}|\,\omega_r(u,v)}{\log(|\mathcal{V}|\Omega(r))}$.
We observe that the event
$$\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{AA}_i) \ge \log(|\mathcal{V}|\Omega(r))\Big[\frac{\sum_{i\in[K]} s_{AA}(u,u^*_i) - \sum_{i\in[K]} s_{AA}(u,u^{AA}_i)}{|\mathcal{V}|} + 2K\epsilon\Big]$$
implies
$$\bigvee_{i\in[K]} \Big(\frac{s_{AA}(u,u^{AA}_i)}{|\mathcal{V}|} - \frac{\omega_r(u,u^{AA}_i)}{\log(|\mathcal{V}|\Omega(r))} \ge \epsilon\Big) \ \vee\ \bigvee_{i\in[K]} \Big({-\frac{s_{AA}(u,u^*_i)}{|\mathcal{V}|}} + \frac{\omega_r(u,u^*_i)}{\log(|\mathcal{V}|\Omega(r))} \ge \epsilon\Big),$$
whence
$$\Pr\Big(\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{AA}_i) \ge \log(|\mathcal{V}|\Omega(r))\Big[\frac{\sum_{i\in[K]} s_{AA}(u,u^*_i) - \sum_{i\in[K]} s_{AA}(u,u^{AA}_i)}{|\mathcal{V}|} + 2K\epsilon\Big]\Big)$$
$$\le \sum_{i\in[K]} \Pr\Big(\frac{s_{AA}(u,u^{AA}_i)}{|\mathcal{V}|} - \frac{\omega_r(u,u^{AA}_i)}{\log(|\mathcal{V}|\Omega(r))} \ge \epsilon\Big) + \sum_{i\in[K]} \Pr\Big({-\frac{s_{AA}(u,u^*_i)}{|\mathcal{V}|}} + \frac{\omega_r(u,u^*_i)}{\log(|\mathcal{V}|\Omega(r))} \ge \epsilon\Big) \le 2K\delta \qquad (33)$$
Statements (1)-(3) follow by the same argument as in the proof of (i).

Proof of (iii): Note that $\mathbb{E}[s_{JC}(u,v)] = \frac{|\mathcal{V}|\,\omega_r(u,v)}{2\Omega(r) - \omega_r(u,v)}$.
We observe that the event
$$\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{JC}_i) \ge 2\Omega(r)\Big[\frac{\sum_{i\in[K]} s_{JC}(u,u^*_i) - \sum_{i\in[K]} s_{JC}(u,u^{JC}_i)}{|\mathcal{V}|} + 2K\epsilon\Big]$$
$$\ge \Big[\frac{\sum_{i\in[K]} s_{JC}(u,u^*_i) - \sum_{i\in[K]} s_{JC}(u,u^{JC}_i)}{|\mathcal{V}|} + 2K\epsilon\Big] \frac{\big(2\Omega(r)-\omega_r(u,u^*_i)\big)\big(2\Omega(r)-\omega_r(u,u^{JC}_i)\big)}{2\Omega(r)}$$
implies
$$\bigvee_{i\in[K]} \Big(\frac{s_{JC}(u,u^{JC}_i)}{|\mathcal{V}|} - \frac{\omega_r(u,u^{JC}_i)}{2\Omega(r)-\omega_r(u,u^{JC}_i)} \ge \epsilon\Big) \ \vee\ \bigvee_{i\in[K]} \Big({-\frac{s_{JC}(u,u^*_i)}{|\mathcal{V}|}} + \frac{\omega_r(u,u^*_i)}{2\Omega(r)-\omega_r(u,u^*_i)} \ge \epsilon\Big),$$
whence
$$\Pr\Big(\sum_{i\in[K]} \omega_r(u,u^*_i) - \sum_{i\in[K]} \omega_r(u,u^{JC}_i) \ge 2\Omega(r)\Big[\frac{\sum_{i\in[K]} s_{JC}(u,u^*_i) - \sum_{i\in[K]} s_{JC}(u,u^{JC}_i)}{|\mathcal{V}|} + 2K\epsilon\Big]\Big) \qquad (34)$$
$$\le \sum_{i\in[K]} \Pr\Big(\frac{s_{JC}(u,u^{JC}_i)}{|\mathcal{V}|} - \frac{\omega_r(u,u^{JC}_i)}{2\Omega(r)-\omega_r(u,u^{JC}_i)} \ge \epsilon\Big) + \sum_{i\in[K]} \Pr\Big({-\frac{s_{JC}(u,u^*_i)}{|\mathcal{V}|}} + \frac{\omega_r(u,u^*_i)}{2\Omega(r)-\omega_r(u,u^*_i)} \ge \epsilon\Big) \le 2K\delta \qquad (35)$$
Statements (1)-(3) follow by the same argument as in the proof of (i).

Lemma 12. For any LP algorithm $A$, we have
$$\sum_{i=1}^{K} (d_{uu^A_i} - d_{uu^*_i}) \le 2rK - 2\omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i) - \sum_{i=1}^{K} \omega_r(u,u^A_i)\Big) \qquad (36)$$

Proof.
We note that
$$\omega_r^{-1}\big(\omega_r(u,u^A_1)\big) + \omega_r^{-1}\big(\omega_r(u,u^A_2)\big) \le \omega_r^{-1}(0) + \omega_r^{-1}\big(\omega_r(u,u^A_1) + \omega_r(u,u^A_2)\big)$$
$$\omega_r^{-1}\big(\omega_r(u,u^A_1) + \omega_r(u,u^A_2)\big) + \omega_r^{-1}\big(\omega_r(u,u^A_3)\big) \le \omega_r^{-1}(0) + \omega_r^{-1}\big(\omega_r(u,u^A_1) + \omega_r(u,u^A_2) + \omega_r(u,u^A_3)\big)$$
$$\cdots$$
$$\omega_r^{-1}\Big(\sum_{i=1}^{K-1} \omega_r(u,u^A_i)\Big) + \omega_r^{-1}\big(\omega_r(u,u^A_K)\big) \le \omega_r^{-1}(0) + \omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^A_i)\Big)$$
$$\omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^A_i)\Big) + \omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i) - \sum_{i=1}^{K} \omega_r(u,u^A_i)\Big) \le \omega_r^{-1}(0) + \omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i)\Big)$$
Each inequality (1) is due to Proposition 13 (ii). Taking the telescoping sum, we have
$$\sum_{i=1}^{K} \omega_r^{-1}\big(\omega_r(u,u^A_i)\big) + \omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i) - \sum_{i=1}^{K} \omega_r(u,u^A_i)\Big) \le Kr + \omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i)\Big)$$
$$\stackrel{(2a)}{\le} Kr + \omega_r^{-1}\big(\omega_r(u,u^*_1)\big) \stackrel{(2b)}{\le} Kr + \sum_{i=1}^{K} \omega_r^{-1}\big(\omega_r(u,u^*_i)\big)$$
Inequality (2a) is due to the decreasing nature of $\omega_r^{-1}(\cdot)$ (Proposition 13 (i)), and inequality (2b) is obtained by adding additional positive terms. Hence,
$$\sum_{i=1}^{K} (d_{uu^A_i} - d_{uu^*_i}) \le 2rK - 2\omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i) - \sum_{i=1}^{K} \omega_r(u,u^A_i)\Big) \qquad (37)$$

Proposition 13. (i) $\omega_r^{-1}(y)$ is decreasing and convex. (ii) $\omega_r^{-1}(x) + \omega_r^{-1}(a-x) \le \omega_r^{-1}(0) + \omega_r^{-1}(a)$.

Proof. (i) From [33] we have
$$\frac{d\,\omega_r^{-1}(y)}{dy} = -\Bigg[C\Big(1 - \frac{(\omega_r^{-1}(y))^2}{4r^2}\Big)^{\frac{D-1}{2}}\Bigg]^{-1} < 0 \qquad (38)$$
Differentiating again,
$$\frac{d^2\,\omega_r^{-1}(y)}{dy^2} = -\Bigg[C'\Big(1 - \frac{(\omega_r^{-1}(y))^2}{4r^2}\Big)^{\frac{D+1}{2}}\Bigg]^{-1} \omega_r^{-1}(y)\,\frac{d\,\omega_r^{-1}(y)}{dy} > 0 \qquad (39)$$
(ii) Let $f(x) = \omega_r^{-1}(x) + \omega_r^{-1}(a-x)$. Then $d^2 f(x)/dx^2 = \frac{d^2\omega_r^{-1}(x)}{dx^2} + \frac{d^2\omega_r^{-1}(a-x)}{dx^2} > 0$, and $f'(x) = 0$ at $x = a/2$. Hence $f$ is a U-shaped convex function, and therefore $f(x) \le f(0) = f(a)$.

Lemma 12 can be used to prove the corresponding bounds for the different LP heuristics.
If we define $\epsilon = \sqrt{\frac{2\log(2/\delta)}{|\mathcal{V}|}} + \frac{2\log(2/\delta)}{3(|\mathcal{V}|-1)}$, then with probability $1 - 2K\delta$,
$$(i)\ \mathrm{CN}: \quad \sum_{i=1}^{K} (d_{uu^{CN}_i} - d_{uu^*_i}) \le 2Kr\Big(\frac{2K\epsilon + \tilde\gamma_u(\mathrm{CN},\epsilon_p)/|\mathcal{V}|}{\Omega(r)}\Big)^{1/KD} \qquad (40)$$
$$(ii)\ \mathrm{AA}: \quad \sum_{i=1}^{K} (d_{uu^{AA}_i} - d_{uu^*_i}) \le 2Kr\Big(\frac{\log(|\mathcal{V}|\Omega(r))\,(2K\epsilon + \tilde\gamma_u(\mathrm{AA},\epsilon_p)/|\mathcal{V}|)}{\Omega(r)}\Big)^{1/KD} \qquad (41)$$
$$(iii)\ \mathrm{JC}: \quad \sum_{i=1}^{K} (d_{uu^{JC}_i} - d_{uu^*_i}) \le 2Kr\big(4K\epsilon + 2\tilde\gamma_u(\mathrm{JC},\epsilon_p)/|\mathcal{V}|\big)^{1/KD} \qquad (42)$$

Proof. We only prove case (i); the rest follow by the same method. Define $\epsilon_1 = \epsilon + \tilde\gamma_u(\mathrm{CN},\epsilon_p)/(2K|\mathcal{V}|)$. From Lemma 12, we have
$$\sum_{i=1}^{K} (d_{uu^{CN}_i} - d_{uu^*_i}) \le 2rK - 2\omega_r^{-1}\Big(\sum_{i=1}^{K} \omega_r(u,u^*_i) - \sum_{i=1}^{K} \omega_r(u,u^{CN}_i)\Big) \qquad (44)$$
which is at most $2rK - 2\omega_r^{-1}(2K\epsilon_1)$ with probability $1 - 2K\delta$, by Lemma 11 (i). Now,
$$\omega_r^{-1}(2K\epsilon_1) \ge r\big(1 - (2K\epsilon_1/V)^{1/D}\big). \qquad (45)$$
We aim to bound the above quantity as
$$r\big(1 - (2K\epsilon_1/V)^{1/D}\big) \ge rK\big(1 - (2K\epsilon_1/V)^{1/A}\big) \qquad (46)$$
for some suitable dimension $A$, which requires
$$A \ge \frac{\log(2K\epsilon_1/V)}{\log\Big(1 - \frac{1}{K} + \frac{1}{K}\big(\frac{2K\epsilon_1}{V}\big)^{1/D}\Big)} \ge \frac{\log(2K\epsilon_1/V)}{1 - \frac{1}{K} + \frac{1}{KD}\log\big(\frac{2K\epsilon_1}{V}\big)} \qquad (47)$$
where the last inequality is due to the concavity of the logarithm. The right-hand side achieves its maximum as $\epsilon_1 \to 0$ when $A \ge KD$. Then $\sum_{i=1}^{K} (d_{uu^{CN}_i} - d_{uu^*_i}) \le 2Kr\,(2K\epsilon_1/V)^{1/KD}$.

Appendix E  Additional details about experiments

E.1  Dataset details

We use eight diverse datasets for our experiments.
• USAir [7] is a network of US airlines.
• C.Elegans [39] is the neural network of C. elegans.
• Yeast [38] is a protein-protein interaction network in yeast.
• Facebook [19] is a snapshot of a part of the online social network Facebook.
• NS [30] is a collaboration network of researchers in network science.
• PB [3] is a network of US political blogs.
• Power [39] is the electrical grid of the western US.
• Ecoli [41] is a pairwise reaction network of metabolites in E. coli.

Dataset    |V|   |E|    d_avg  Clust. Coeff.  Diameter
USAir      332   2126   12.81  0.396          6.00
C.Elegans  297   2148   14.46  0.181          5.00
Yeast      2375  11693  9.85   0.469          15.00
Facebook   4039  88234  43.69  0.519          8.00
NS         1589  2742   3.45   0.693          ∞
PB         1222  16714  27.36  0.226          8.00
Power      4941  6594   2.67   0.103          46.00
Ecoli      1805  14660  16.24  0.289          7.00

Table 2: Dataset statistics.

E.2  Implementation details of LP protocols

For the triad-based LP heuristics, the implementations are trivial and need no hyperparameter tuning. For the embedding-based methods, the node representations are generated from the training graph. Then, following earlier work [16, 40], we use the Hadamard product of the node embeddings as features, train a logistic classifier (in Liblinear [15]), and use the trained model to predict links. In each of these methods, we set the dimension of the embedding to z = 5, in contrast to the default value of z = 128; this value is set using cross-validation. The remaining hyperparameters, also set using cross-validation, are the following:
• Node2Vec: the number of walks and the walk length.
• Struct2Vec: the number of walks, the walk length and the number of layers.
• PRUNE: the learning rate, the epoch size and the batch size.
• LINE: the learning rate and the number of negative samples.
All the other hyperparameters are set to the defaults of the corresponding software. The experiments are carried out on a Debian 9 OS with a 4-core Intel(R) Core(TM) i3-3225 CPU @ 3.30GHz machine having 8GB RAM.
Except for the embedding computations, which are done using the code available in the respective GitHub repositories, we wrote the rest of the code, from sampling training graphs and score calculation to the implementation of the differentially private algorithms, in MATLAB R2017b.

E.3  Additional results

Here we present results for all differentially private algorithms on several LP protocols across all the datasets.

[Figure 3: bar charts of E[MAP] for the protocols DPLP, Laplace, Gaussian and Exponential; top row over the base methods CN, AA, JC, CRW; bottom row over N2V, S2V, PRUNE, LINE; panels (a) USAir, (b) C.Elegans, (c) Yeast, (d) Facebook.]

Figure 3: Comparison of performance across the USAir, C.Elegans, Yeast and Facebook datasets in terms of expected Mean Average Precision (MAP) between various differentially private algorithms, e.g. DPLP, Laplace, Gaussian and Exponential, for a 15% held-out set with $\epsilon_p = 0.$ and $K = 10$ for all LP protocols (N2V: Node2Vec, S2V: Struct2Vec, PR.: PRUNE and LN.: LINE). The expectation is computed using an MC approximation with $n = 10$ runs of randomization. The first (second) row shows the performance of triad-based LP heuristics (graph embedding techniques). DPLP outperforms the Laplace, Gaussian and Exponential protocols across almost all the datasets.

[Figure 4: same layout as Figure 3; panels (a) PB, (b) NS, (c) Power, (d) Ecoli.]

Figure 4: Analogue of Figure 3 across the PB, NS, Power and Ecoli datasets.
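The expected MAP values above are Monte Carlo averages of per-node average precision. For reference, here is a minimal Python sketch of MAP following the standard IR definition [27]; the function and variable names are ours, not from the paper's MATLAB code.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked recommendation list against the
    set of held-out true links: precision is accumulated at each rank
    where a true link appears, then normalized by the number of true links."""
    hits, score = 0, 0.0
    for i, v in enumerate(ranked, start=1):
        if v in relevant:
            hits += 1
            score += hits / i          # precision at this relevant position
    return score / max(len(relevant), 1)

def mean_average_precision(rankings, relevants):
    """MAP: the mean of per-node average precisions."""
    aps = [average_precision(r, s) for r, s in zip(rankings, relevants)]
    return sum(aps) / len(aps)

# Tiny illustrative example: two query nodes with toy rankings.
map_val = mean_average_precision(
    rankings=[["b", "c", "a"], ["x", "y"]],
    relevants=[{"a", "b"}, {"y"}],
)  # -> 2/3
```

In the paper's setting, each ranking would be one node's top-$K$ list from a (randomized) DP protocol, and the expectation over the protocol's randomness is estimated by averaging MAP over repeated runs.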