Secure Friend Discovery via Privacy-Preserving and Decentralized Community Detection

Abstract

The problem of secure friend discovery on a social network has long been proposed and studied. The requirement is that a pair of nodes can make befriending decisions with minimum information exposed to the other party. In this paper, we propose to use community detection to tackle the problem of secure friend discovery. We formulate the first privacy-preserving and decentralized community detection problem as a multi-objective optimization. We design the first protocol to solve this problem, which transforms community detection to a series of Private Set Intersection (PSI) instances using Truncated Random Walk (TRW). Preliminary theoretical results show that our protocol can uncover communities with overwhelming probability and preserve privacy. We also discuss future works, potential extensions and variations.


arXiv [cs.CR], May

Pili Hu (HUPILI@IE.CUHK.EDU.HK)
Sherman S.M. Chow (SMCHOW@IE.CUHK.EDU.HK)
Wing Cheong Lau (WCLAU@IE.CUHK.EDU.HK)
Department of Information Engineering, The Chinese University of Hong Kong


ICML 2014 Workshop on Learning, Security and Privacy

One important function provided by social networks is friend discovery. The problem of finding people of the same attribute/interest/community has long been studied in the context of social networks. For example, profile-based friend discovery can recommend people who have similar attributes/interests; topology-based friend discovery can recommend people from the same community.

One special requirement of algorithms operating on social networks is that they must be privacy-preserving. For example, social network nodes may be willing to share their attributes/interests with people having a similar profile, or they may be willing to share their raw connections with people in the same community. However, it is undesirable to leak those private data to arbitrary strangers. Towards this end, the friend discovery routine should expose only the minimal necessary information to the involved parties.

In the current model of large-scale Online Social Networks (OSNs), service providers like Facebook play the role of a Trusted Third Party (TTP). Friend discovery is accomplished as follows: 1) every node (user) gives his/her profile and friend list to the TTP; 2) the TTP runs any sophisticated social network mining algorithm (e.g. link prediction, community detection) and returns the friend recommendations only to the related users. The mining algorithm can be a complex one involving node-level attributes, network topology, or both. Since the TTP has all the data, the result can be very accurate. This model is commercially viable and has been successfully deployed at large scale. However, the recent rise of privacy concerns motivates both researchers and developers to pursue other solutions.

Decentralized Social Networks (DSNs) like Diaspora (https://joindiaspora.com) have recently been proposed and implemented. Since it is very difficult to design, implement and deploy a DSN (Datta et al., 2010), much research attention has focused on system issues. We envision that the DSN movement will gradually grow with users' increasing awareness of privacy. In fact, Diaspora, the largest DSN to date, has already accumulated 1 million users. With the decentralized infrastructure established, the next question is: can we support accurate friend discovery under the constraint that each node only observes partial information about the whole social network? Note that the whole motivation of DSNs is that a single service provider cannot be fully trusted, so the TTP approach cannot be reused. Towards this end, the computation procedure must be decentralized.

One common approach in the literature to achieve decentralized and privacy-preserving friend discovery is to transform it into a set matching problem. For the first type (profile-based), it is natural to represent one's attributes/interests/social activities in the form of a set (Zhang et al., 2012). For the second type (topology-based), one straightforward way is to represent one's friend (neighbour) list in the form of a set (Nagy et al., 2013). In this way, both profile matching and common friend detection become a set intersection problem. There exists a useful crypto primitive called Private Set Intersection (PSI). Briefly and roughly speaking, given two sets $\mathcal{W}_1$ and $\mathcal{W}_2$ held by two nodes $v_1$ and $v_2$, a PSI protocol can compute $|\mathcal{W}_1 \cap \mathcal{W}_2|$ without letting either $v_1$ or $v_2$ know the other party's raw input. Researchers have proposed PSI schemes based on commutative encryption (Agrawal et al., 2003), oblivious polynomial evaluation (Freedman et al., 2004), oblivious pseudorandom functions (Freedman et al., 2005), index-hiding message encoding (Manulis et al., 2010), hardware (Hazay & Lindell, 2008), or a generic construction (Huang et al., 2012) using garbled circuits (Yao, 1982). The aforementioned privacy-preserving profile matching/common friend detection protocols are variants of PSI protocols in terms of output, adversary model, security requirement and efficiency.

One major drawback of all the above works is that they cannot fully utilize the topology of a social network. Firstly, a profile is just node-level information and is not always available on every social network. On the contrary, topology (connections/friendship relations) is the fundamental data available on social networks. Secondly, common friend is just one topology-based approach and it only works for nodes within 2 hops. In fact, our previous investigation showed that the common friend heuristic has moderate precision and low recall for discovering community-based friendship (Hu & Lau, 2013). This result is unsurprising because a community can easily span multiple hops.
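To make the PSI primitive mentioned above more concrete, the following is a minimal sketch of a PSI-Cardinality exchange in the spirit of the commutative-encryption approach (hash each element into a group, then let both parties exponentiate with their own secret key). It is a toy illustration only, not the construction of any cited paper: the modulus `P`, the helper names `hash_to_group`, `keygen`, `blind` and `psi_cardinality`, and the choice to run both parties inside one function are all assumptions made for readability, and the parameters are not suitable for real use.

```python
import hashlib
import math
import random

# Assumption: a toy prime modulus (the Mersenne prime 2^127 - 1), for illustration only.
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an item to a nonzero residue modulo P (illustrative hash-to-group step)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def keygen(rng: random.Random) -> int:
    """Pick a secret exponent coprime to P - 1, so x -> x^k mod P is a bijection."""
    while True:
        k = rng.randrange(2, P - 1)
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, key):
    """Exponentiate every value with the given secret key."""
    return [pow(v, key, P) for v in values]


def psi_cardinality(set_1, set_2, rng=None):
    """Simulate both parties locally and return |set_1 ∩ set_2|."""
    rng = rng or random.Random()
    k1, k2 = keygen(rng), keygen(rng)
    # Party 1 -> Party 2: H(x)^k1 for every x in set_1.
    msg_1 = blind([hash_to_group(x) for x in set_1], k1)
    # Party 2 -> Party 1: shuffled (H(x)^k1)^k2, plus H(y)^k2 for every y in set_2.
    double_1 = blind(msg_1, k2)
    rng.shuffle(double_1)
    msg_2 = blind([hash_to_group(y) for y in set_2], k2)
    # Party 1 finishes: (H(y)^k2)^k1, then counts doubly-blinded matches.
    double_2 = blind(msg_2, k1)
    return len(set(double_1) & set(double_2))
```

Because exponentiation commutes, an element held by both parties ends up with the same doubly-blinded value, so only the number of matches is learned in this sketch. For instance, `psi_cardinality({"alice", "bob", "carol"}, {"bob", "carol", "dave"})` counts the two common friends.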
Towards this end, we focus on extending traditional secure friend discovery beyond 2 hops via community detection. Note that topology-only community detection (Clauset et al., 2004) (Blondel et al., 2008) (Raghavan et al., 2007) (Leung et al., 2009) (Agarwal & Kempe, 2008) (Coscia et al., 2012) (Soundarajan & Hopcroft, 2013) is a classical problem under the centralized and non-privacy-preserving setting, i.e. a single party possesses the complete social graph and does arbitrary computation. Although one can translate those algorithms into a privacy-preserving and decentralized protocol using the generic garbled circuit construction (Yao, 1982), the computation and communication cost renders it impractical in the real world. To design an efficient scheme, we need to consider community detection accuracy and privacy preservation as a whole. A tradeoff among accuracy, privacy and efficiency can also be made when necessary.

To summarize, this paper makes the following contributions:
• We propose and formulate the first privacy-preserving and decentralized community detection problem, which largely improves the recall of topology-based friend discovery on Decentralized Social Networks.
• We design the first protocol to solve this problem. The protocol transforms the community detection problem to a series of Private Set Intersection (PSI) instances via Truncated Random Walk (TRW). Preliminary results show that the protocol can uncover communities with overwhelming probability and preserve privacy.
• We propose open problems and discuss future works, extensions and variations in the end.

The first type of related work is Private Set Intersection (PSI), as PSI schemes are already widely used for secure friend discovery. The second type of related work is topology-based graph mining. Although our problem is termed “community detection”, the most closely related works are actually topology-based Sybil defenses. This is because previous community detection problems are mainly considered under the centralized scenario. On the contrary, Sybil defense schemes see wide application in P2P systems, so one of the root concerns is decentralized execution. Note that there exist some distributed community detection works, but they cannot be directly used because nodes exchange too much information. For example, (Hui et al., 2007) allow nodes to exchange adjacency lists and intermediate community detection results, which directly breaks the privacy constraint that we will formulate in the following sections. Due to the space limit, a detailed survey of related work is omitted. Interested readers can see community detection surveys (Fortunato, 2010) (Xie et al., 2013) and Sybil detection surveys (Yu, 2011) (Alvisi et al., 2013).

The notion of community is that intra-community linkage is dense and inter-community linkage is sparse. In this section, we first review classical community detection formulations under the centralized scenario and our previous formulation under the decentralized scenario. Then we formulate the privacy-preserving version. To make the problem amenable to theoretical analysis, we consider a Community-Based Random Graph (CBRG) model in the last part.

Classical community detection is formulated as a clustering problem. That is, given the full graph $G = (V, E)$, partition the vertex set into $K$ subsets $S_1, S_2, \ldots, S_K$ (a partitioning), such that $\cap_{i=1}^{K} S_i = \emptyset$ and $\cup_{i=1}^{K} S_i = V$. A quality metric $Q(\{S_1, \ldots, S_K\})$ is defined over the partitions, and a community detection algorithm will try to find a partitioning that maximizes or minimizes $Q$ depending on its nature. This is for non-overlapping community detection, and one can simply remove the constraint $\cap_{i=1}^{K} S_i = \emptyset$ to get the overlapping version. Note that $Q$ is only an artificial surrogate for the axiomatic notion of community. The maximum $Q$ does not necessarily correspond to the best community. However, the community detection problem becomes tractable via well-studied optimization frameworks by assuming a form of $Q$, e.g. Modularity or Conductance. Most classical works are along this line, mainly due to the lack of ground-truth data in the early years.

Now consider the decentralized scenario. One node (observer) is limited to its local view of the whole graph. It is unreasonable to ask for a global partitioning in terms of sets of nodes. The tractable question to ask is: whether one node is in the same community as the observer or not? This gives a binary classification formulation of community detection (Hu & Lau, 2013). The result of community detection with respect to a single observer can be represented as a length-$|V|$ vector. Stacking all those vectors together, we get a community encoding matrix (Zhong et al., 2014):

$$M_{i,j} = \begin{cases} 1 & \exists S_k \text{ s.t. } v_i \in S_k,\ v_j \in S_k \\ 0 & \text{otherwise} \end{cases}$$

This matrix representation is subsumed by the partitioning representation in the general case. If restricted to the non-overlapping case, the two representations are equivalent. Since $M$ encodes all pair-wise outcomes, it is immediately useful for friend discovery applications.
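As a small illustration of the community encoding matrix just defined, the sketch below builds $M$ from a list of communities. The function name `community_encoding_matrix` and the representation of nodes as integers `0 .. num_nodes - 1` are assumptions made for this example; the same code covers both the non-overlapping and the overlapping case.

```python
def community_encoding_matrix(num_nodes, communities):
    """Return M with M[i][j] = 1 iff some community contains both node i and node j."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for members in communities:
        for i in members:
            for j in members:
                matrix[i][j] = 1
    return matrix


# Two non-overlapping communities of two nodes each (K = 2, c = 2).
M = community_encoding_matrix(4, [{0, 1}, {2, 3}])
for row in M:
    print(row)
```

Row $i$ of the result is exactly the length-$|V|$ answer vector that observer $v_i$ would like to learn.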
In what follows, we will define accuracy and privacy in terms of how well $M$ can be learned by the nodes or the adversary.

In this initial study, we focus on a non-collusive passive adversary. That is, DSN nodes all execute our protocol faithfully but they are curious to infer further information from the observed protocol sequence. We use a single non-collusive sniff-only adversary to capture this notion. The system components are as follows:
• Graph: $G = (V, E)$. The connection matrix is denoted as $C$, where $C_{i,j} = 1$ if $(v_i, v_j) \in E$; otherwise, $C_{i,j} = 0$. The ground-truth community encoding matrix is denoted as $M^g$, which is unknown to all parties at the beginning. For simplicity of discussion, we assume the node identifiers, i.e. $V$, are public information.
• Nodes: $v_1, \ldots, v_{|V|} \in V$. A node's initial knowledge is its own direct connections, i.e. $N(v_i) = \{v_j \mid (v_j, v_i) \in E\}$. Nodes are fully honest. Their objective is to maximize the accuracy of detecting $M$. Eventually, a node $v_i$ can get the full row (column) in $M$, denoted by $M_{i,:}$ ($M_{:,i}$). Depending on the protocol choice, relevant cells in $M$ can be made available immediately or on demand.
• Adversary: $A$. It can passively sniff on one node $v_a \in V$. $A$ will observe all protocol sequences related to $a$, including the initial knowledge $N(v_a)$ and the community detection result $M_{a,:}$. $A$'s objective is to maximize the success rate in guessing $M^g$ and $C$, using any Probabilistic Polynomial Algorithm (PPA). Note that the full separation of Nodes and Adversary is for ease of discussion. In a real DSN, this passive attacker can be a curious user who wants to infer more information about the network.

As protocol designers, our objectives are:
• Accurately detect communities after execution of the protocol, i.e. make $M$ and $M^g$ as close as possible.
• Limit the success rate of the adversary's guessing of $M^g$ and $C$, under the condition that $A$ gets the protocol sequence on node $v_a$ and makes the best guess via PPA.

One can see that our problem is multi-objective in nature. The accuracy part is a maximization problem and the privacy part is a min-max problem. The formal definition is given in Eq. 1.

In this formulation, “Protocol” is an abstract notation of the protocol specification, not the protocol execution sequence. $I_a$ is the information observed by the adversary, which is dependent on Protocol. $\mathrm{Succ}(B_1, B_2, R)$ is the measure of success rate, with symbols defined as follows:
• $B_1, B_2 \in \{0, 1\}^{|V| \times |V|}$ are two $\{0, 1\}$ matrices of the same size as $M$ and $C$.
• $R \subseteq V \times V$ is the challenge relation.
• To measure how close the two matrices are over the challenge set, we use the success rate:

$$\mathrm{Succ}(B_1, B_2, R) = \Pr\left\{ (B_1)_{i,j} = (B_2)_{i,j} \;\middle|\; (v_i, v_j) \xleftarrow{\$} R \right\}$$

That is, how likely a randomly selected pair of nodes from $R$ will have the same value in $B_1$ and $B_2$.

For the accuracy part, we define the challenge relation as $V \times V$ because we want the result to be accurate for all nodes. For the privacy part, we define the challenge relation as $R^C_a = R^M_a = (V - U(a)) \times (V - U(a))$, where $U(a)$ denotes the set of nodes in the same community as $a$. The reason to exclude nodes from the same community is as follows. Since the adversary will get $M_{a,:}$ after protocol execution, it already knows the community membership of $U(a)$. Given the knowledge of community, one can make more intelligent guesses of the connections. This is made clear in later discussions.

$$
\max_{\substack{\text{Find Protocol},\\ M = \mathrm{Protocol}(G)}}
\begin{bmatrix}
\mathrm{Succ}(M, M^g, V \times V), \\[4pt]
-\max\limits_{\substack{\mathrm{Algo} \in \mathrm{PPA},\ a \xleftarrow{\$} V,\\ C_A, M_A \leftarrow \mathrm{Algo}(\mathrm{Protocol}, I_a)}}
\begin{pmatrix} \mathrm{Succ}(C_A, C, R^C_a), \\ \mathrm{Succ}(M_A, M^g, R^M_a) \end{pmatrix}
\end{bmatrix}
\tag{1}
$$

Before proceeding, we remark that the problem defined in Eq. 1 is hard even without the privacy-preserving objective. In other words, the community detection problem (accuracy) has not been fully solved even under the TTP scenario. To improve the accuracy, researchers have already used heavy mathematical programming tools, tried to incorporate more side information, developed problem-specific heuristics, or performed heavy-duty parameter tuning.

To make our problem amenable to theoretical analysis, we consider a Community-Based Random Graph (CBRG) model in this paper. Let $M^g$ be the ground-truth community encoding matrix. We generate the random connection matrix as follows: $\Pr\{C_{i,j} = 1\} = p$ if $M^g_{i,j} = 1$ ($v_i$ and $v_j$ are in the same community); $\Pr\{C_{i,j} = 1\} = q$ otherwise. There are $K$ communities, each of size $c$, so the total number of vertices is $|V| = Kc$. We denote such a random graph as $\mathrm{CBRG}(K, c, p, q)$. One example ground-truth community encoding matrix and the expected connection matrix are illustrated in Fig. 1.

$$
M^g = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix},
\qquad
E[C] = \begin{pmatrix} p & p & q & q \\ p & p & q & q \\ q & q & p & p \\ q & q & p & p \end{pmatrix}
$$

Figure 1. Illustration of community-based random graph generation. $K = 2$, $c = 2$.

In this section, we present our protocol and main results.
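Before turning to the protocol, the CBRG model and the success-rate measure defined above can be sketched in a few lines. This is only an illustrative sketch: the function names `sample_cbrg` and `succ` are hypothetical, nodes are numbered so that community $k$ holds nodes $kc, \ldots, kc + c - 1$, and the sampled graph is assumed to be undirected without self-loops (the text does not say how diagonal entries are treated).

```python
import random


def sample_cbrg(K, c, p, q, rng):
    """Sample a connection matrix C from CBRG(K, c, p, q) and return (C, M_g)."""
    n = K * c
    community = [v // c for v in range(n)]  # assumption: contiguous community blocks
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # assumption: undirected graph, no self-loops
            prob = p if community[i] == community[j] else q
            if rng.random() < prob:
                C[i][j] = C[j][i] = 1
    M_g = [[1 if community[i] == community[j] else 0 for j in range(n)]
           for i in range(n)]
    return C, M_g


def succ(B1, B2, R):
    """Fraction of pairs (i, j) in the challenge relation R on which B1 and B2 agree."""
    pairs = list(R)
    agree = sum(1 for (i, j) in pairs if B1[i][j] == B2[i][j])
    return agree / len(pairs)


rng = random.Random(0)
C, M_g = sample_cbrg(K=4, c=5, p=0.8, q=0.05, rng=rng)
all_pairs = [(i, j) for i in range(20) for j in range(20)]
# How well the raw connection matrix alone agrees with the ground-truth communities.
print(succ(C, M_g, all_pairs))
```

Since `succ` draws the pair uniformly from `R`, it computes the probability in the definition of $\mathrm{Succ}$ exactly rather than by sampling.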

Our protocol involves two stages:
• Pre-processing is done via Truncated Random Walk. Every node sends out $W$ random walkers, $w^{v_i}_1, \ldots, w^{v_i}_W$, with time-to-live (TTL) values $l^{v_i}_1, \ldots, l^{v_i}_W$ initially set to $L$. Upon receiving a Random Walker (RW) $w$, the node records the ID of $w$, decrements its TTL $l$, and sends it to a random neighbour if $l > 0$. At the end of this stage, each node $v_i$ has accumulated a set of random walker IDs $\mathcal{W}_i$. With proper parameters $W$ and $L$, the truncated random walkers issued by $v_i$ will more likely reach other nodes in the same community as $v_i$. So by inspecting the intersection size of $\mathcal{W}_i$ and $\mathcal{W}_j$, we can answer whether $v_i$ and $v_j$ are in the same community. This essentially transforms the community detection problem into a set intersection problem.
• To uncover the relevant cells in the pairwise community encoding matrix $M$, we only need to perform Private Set Intersection (PSI) on two sets. PSI schemes differ in their flavours: 1) reveal the intersection set (PSI-Set); 2) reveal the intersection size (PSI-Cardinality); 3) reveal whether the intersection size is greater than a threshold (PSI-Threshold). We use the third type of PSI in our construction, which can be implemented by adapting (Zhang et al., 2012). In what follows, we just assume the existence of such a crypto primitive: it computes $\mathbb{I}[|\mathcal{W}_i \cap \mathcal{W}_j| > T]$ without leaking extra information.

One can see that the scheme is decentralized by design. We only need to argue its community detection accuracy and its privacy-preserving property. The intuition of our proof is as follows:
• Truncated Random Walk will be mostly limited to one community, if the axiomatic notion of “community” holds. More precisely, as long as $p$ is sufficiently larger than $\beta = (K - 1)q$, there will be enough difference in intersection size between nodes coming from the same community and nodes coming from different communities. In this case, we can set a proper threshold to ensure a low error rate.
• Observe two facts about the privacy objective: 1) most of the protocol sequence the adversary observes comes from its own community; 2) we exclude $A$'s community from the challenge relations. In order to make better-than-prior guesses, $A$ at least needs to observe some other nodes from the protocol sequence. The number of nodes from $V - U(a)$ that can be observed is limited. Even if we assume the adversary can make good use of the information (captured by non-negative coefficients $\gamma_M$, $\gamma_C$), this small advantage is averaged out over a large challenge relation set.

The detailed proof is omitted and the main results are summarized in the following theorem.

Theorem 1

Our protocol guarantees:
• False Positive Rate: $\Pr\{|\mathcal{W}_i \cap \mathcal{W}_j| > T \mid M^g_{i,j} = 0\} \le \dfrac{\phi W L (L + 1)}{(K - 1)\, T}$
• False Negative Rate ($\mu = cWP$): $\Pr\{|\mathcal{W}_i \cap \mathcal{W}_j| \le T \mid M^g_{i,j} = 1\} \le e^{-\mu (1 - T/\mu)^2 / 2}$
• Adversary's advantage:

$$\mathrm{Adv}(M_A, M^g, R^M_a) \le \gamma_M \frac{W (L + 1)}{(K - 1)\, c}$$

$$\mathrm{Adv}(C_A, C, R^C_a) \le \gamma_C \frac{W (L + 1)}{(K - 1)\, c}$$

In the theorem, $\mathrm{Adv}(B_1, B_2, R) = \mathrm{Succ}(B_1, B_2, R) - \mathrm{Prior}(B_1, B_2, R)$. $\mathrm{Prior}(B_1, B_2, R)$ denotes the probability of making a successful guess based on mere prior information about $B_2$. For example, suppose $B_2$ contains 1 as the majority, i.e. $\Pr\{(B_2)_{i,j} = 1 \mid (i, j) \xleftarrow{\$} R\} = P > 0.5$. The best guess is to let $(B_1)_{i,j} = 1, \forall (i, j) \in R$. One can show that the success probability is $P$ and that this strategy is optimal if no other information is available. Due to the specifics of our problem, the adversary can make more intelligent guesses than a random $\{0, 1\}$ bit. Towards this end, the advantage is defined with respect to the success rate of this prior-based strategy.

Due to the specifics of the problem, both accuracy and privacy guarantees are parameterized. To give an intuitive view of what can be achieved, consider one instantiation of CBRG: K = 100 (number of communities), c = 500 (size of each community), p = 0. (intra-community edge generation probability), β = q(K − 1) = 0., q = 0. (inter-community edge generation probability). We can set protocol parameters as follows: W = 100 (number of RWs issued by each node), L = 3 (length of RW) and T = 61 (threshold of intersection size). This gives us the following accuracy and privacy guarantees:
• False Negative Rate: . × −
• False Positive Rate: .
• Advantage for guessing M: . × γ_M
• Advantage for guessing C: . × γ_C

One can see that our proposed protocol can accurately detect communities and preserve privacy given proper parameters. Note first that the above $W$ and $L$ are casually selected by heuristics and have not been jointly optimized. Note second that the FPR and FNR can be exponentially reduced by repeated experiments, which only maps to a linear increase in $W$. The example in this section is only to demonstrate the effectiveness of our protocol, and a full exploration of the design space is left for future work.

We formulated the privacy-preserving community detection problem in this paper as a multi-objective optimization. We proposed a protocol based on Truncated Random Walk (TRW) and Private Set Intersection (PSI). We have proven that our protocol detects communities with overwhelming probability and preserves privacy. Exploration of the design space and thorough experimentation on synthesized/real graphs are left for future work. In the following parts of this early report, we discuss several simpler candidate protocols and how they fail to meet our objective. This helps to demonstrate the rationale of our formulation and protocol design.
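As a companion to the protocol summarized above, the sketch below simulates its two stages in the clear: a Truncated Random Walk pre-processing stage followed by a thresholded intersection test. It is a non-private simulation for intuition only. The function names are hypothetical, walker IDs are modelled as `(origin, index)` pairs, the issuing node is assumed to record its own walkers, and the plain set intersection in `same_community` stands in for the PSI-Threshold primitive that the real protocol would use.

```python
import random


def truncated_random_walks(neighbours, W, L, rng):
    """Run W walkers of TTL L from every node; return node -> set of walker IDs seen.

    `neighbours` maps each node to a list of its adjacent nodes.
    """
    seen = {v: set() for v in neighbours}
    for origin in neighbours:
        for k in range(W):
            walker_id = (origin, k)
            seen[origin].add(walker_id)  # assumption: the issuer records its own walker
            node, ttl = origin, L
            while ttl > 0 and neighbours[node]:
                node = rng.choice(neighbours[node])  # forward to a random neighbour
                seen[node].add(walker_id)            # receiver records the walker ID
                ttl -= 1                             # and decrements the TTL
    return seen


def same_community(seen, i, j, T):
    """Plaintext stand-in for PSI-Threshold: is |W_i ∩ W_j| greater than T?"""
    return len(seen[i] & seen[j]) > T


# Toy graph: two 4-cliques joined by a single bridge edge (3, 4).
neighbours = {v: [u for u in range(4) if u != v] for v in range(4)}
neighbours.update({v: [u for u in range(4, 8) if u != v] for v in range(4, 8)})
neighbours[3].append(4)
neighbours[4].append(3)

seen = truncated_random_walks(neighbours, W=50, L=3, rng=random.Random(0))
print(len(seen[0] & seen[1]), len(seen[0] & seen[7]))
```

On graphs with dense communities and few bridges, the first printed intersection size (same clique) tends to be much larger than the second (different cliques), which is the gap the threshold $T$ is meant to exploit.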

Suppose we change the protocol such that $v_i$ and $v_j$ first exchange $\mathcal{W}_i$ and $\mathcal{W}_j$ and then run any intersection algorithm separately. After uncovering all related cells in $M$, the adversary knows $\mathcal{W}_i, \forall i = 1, \ldots, |V|$. $A$ can directly calculate $|\mathcal{W}_i \cap \mathcal{W}_j|, \forall i, j$. This allows the adversary to guess $M$ perfectly. From the community membership, $A$ can further infer links because the intra-community and inter-community edge generation probabilities are different. This already allows a better guess than using the global prior of $C$. Furthermore, inferring links from measurements is a classical well-studied topic called Network Tomography. $A$ can actually re-organize the $\mathcal{W}_i$'s into a list of size-$L$ sets, each representing the nodes traversed by a RW. Researchers have shown that links can be inferred from this co-occurrence data with good accuracy, e.g. NICO (Rabbat et al., 2008).

Another natural thought to protect non-common set elements is via hashing. Suppose there exists a cryptographic hash $h(\cdot)$. We define $H_i = \{h(w) \mid w \in \mathcal{W}_i\}$. Now, two nodes just compare $H_i$ and $H_j$ in the community uncovering stage. This can protect the true identities of the RWs if their ID space is large enough. However, it does not prevent the adversary from making intelligent guesses of $M$ and $C$. The methods noted in the previous paragraph can also be used in this case.

In our protocol, we used the PSI-Threshold version. That is, given $\mathcal{W}_i$ and $\mathcal{W}_j$, the two parties know nothing except for the indicator $\mathbb{I}[|\mathcal{W}_i \cap \mathcal{W}_j| > T]$. Two weaker and widely studied variations are PSI-Cardinality and PSI-Set. Consider PSI-Set. The adversary now only knows elements in the intersection. Based on his own $\mathcal{W}_a$ and the PSI-Set protocol sequence, he can get $\mathcal{W}_i \cap \mathcal{W}_j \cap \mathcal{W}_a, \forall i, j$. $A$ can calculate the probability that a RW $w$ traverses both $v_i$ and $v_j$ conditioned on $w$ traversing $v_a$. Based on this information, $A$ can adjust its thresholds to accurately detect communities.
The derivation is similar to that of our protocol in this paper but more technically involved, and is also left as future work. The bottom line is that PSI-Set leaks enough information for more intelligent guesses. As for PSI-Cardinality, we are not sure at present what an adversary can do with $|\mathcal{W}_i \cap \mathcal{W}_a|, \forall i$. Since the two variants leak more information and might potentially be exploited, we use PSI-Threshold in our protocol.

The following are some open problems of privacy-preserving community detection:
• If we allow a small fraction of nodes to collude, how do we define a reasonable security game? What privacy-preserving result can we achieve?
• The current scheme requires all nodes to re-run the protocol if there is any change in the topology, e.g. a new node joins or a new friendship (connection) is formed. Is it possible to find a privacy-preserving community detection scheme that can be incrementally updated?
• The privacy preservation of our proposed protocol is dependent on graph size. One root cause is that we only leveraged crypto primitives in the Private Set Intersection (PSI) part. The simulation of Truncated Random Walk (TRW) is done in a normal way. Since random walk is a basic construct in many graph algorithms, it is of interest to know how (whether or not) nodes can simulate Random Walk in a decentralized and privacy-preserving fashion.

REFERENCES

Agarwal, G. and Kempe, D. Modularity-maximizing graph communities via mathematical programming. The European Physical Journal B-Condensed Matter and Complex Systems, 66(3):409–418, 2008.

Agrawal, R., Evfimievski, A., and Srikant, R. Information sharing across private databases. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp. 86–97. ACM, 2003.

Alvisi, Lorenzo, Clement, Allen, Epasto, Alessandro, Lattanzi, Silvio, and Panconesi, Alessandro. SoK: The evolution of sybil defense via social networks. In Security and Privacy (SP), 2013 IEEE Symposium on, pp. 382–396. IEEE, 2013.

Blondel, V.D., Guillaume, J.L., Lambiotte, R., and Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.

Clauset, A., Newman, M.E.J., and Moore, C. Finding community structure in very large networks. Physical Review E, 70(6):066111, 2004.

Coscia, M., Rossetti, G., Giannotti, F., and Pedreschi, D. DEMON: a local-first discovery method for overlapping communities. In ACM SIGKDD, 2012.

Datta, A., Buchegger, S., Vu, L.H., Strufe, T., and Rzadca, K. Decentralized online social networks. Handbook of Social Network Technologies and Applications, pp. 349–378, 2010.

Fortunato, S. Community detection in graphs. Physics Reports, 486(3-5):75–174, 2010.

Freedman, M., Nissim, K., and Pinkas, B. Efficient private matching and set intersection. In Advances in Cryptology-EUROCRYPT 2004, pp. 1–19. Springer, 2004.

Freedman, Michael J, Ishai, Yuval, Pinkas, Benny, and Reingold, Omer. Keyword search and oblivious pseudorandom functions. In Theory of Cryptography, pp. 303–324. Springer, 2005.

Hazay, Carmit and Lindell, Yehuda. Constructions of truly practical secure protocols using standard smartcards. In Proceedings of the 15th ACM conference on Computer and communications security, pp. 491–500. ACM, 2008.

Hu, Pili and Lau, Wing Cheong. Community classification on decentralized social networks based on 2-hop neighbourhood information. In IEEE ICNP, 2013.

Huang, Yan, Evans, David, and Katz, Jonathan. Private set intersection: Are garbled circuits better than custom protocols. In Network and Distributed System Security Symposium (NDSS). The Internet Society, 2012.

Hui, P., Yoneki, E., Chan, S.Y., and Crowcroft, J. Distributed community detection in delay tolerant networks. In Proceedings of 2nd ACM/IEEE international workshop on Mobility in the evolving internet architecture, pp. 7. ACM, 2007.

Leung, I.X.Y., Hui, P., Lio, P., and Crowcroft, J. Towards real-time community detection in large networks. Physical Review E, 79(6):066107, 2009.

Manulis, Mark, Pinkas, Benny, and Poettering, Bertram. Privacy-preserving group discovery with linear complexity. In Applied Cryptography and Network Security, pp. 420–437. Springer, 2010.

Nagy, Marcin, De Cristofaro, Emiliano, Dmitrienko, Alexandra, Asokan, N, and Sadeghi, Ahmad-Reza. Do i know you?: efficient and privacy-preserving common friend-finder protocols and applications. In Proceedings of the 29th Annual Computer Security Applications Conference, pp. 159–168. ACM, 2013.

Rabbat, M.G., Figueiredo, M.A.T., and Nowak, R.D. Network inference from co-occurrences. IEEE Transactions on Information Theory, 54(9):4053–4068, 2008.

Raghavan, U.N., Albert, R., and Kumara, S. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3):036106, 2007.

Soundarajan, Sucheta and Hopcroft, John E. Use of local group information to identify communities in networks. TKDD, 2013.

Xie, Jierui, Kelley, Stephen, and Szymanski, Boleslaw K. Overlapping community detection in networks: the state of the art and comparative study. ACM Computing Surveys, 45(4):1–37, 2013.

Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science, pp. 160–164, 1982.

Yu, H. Sybil defenses via social networks: a tutorial and survey. ACM SIGACT News, 42(3):80–101, 2011.

Zhang, R., Zhang, Y., Sun, J.S., and Yan, G. Fine-grained private matching for proximity-based mobile social networking. In Infocom, 2012.

Zhong, Xiang, Hu, Pili, and Lau, Wing Cheong. Scalable and robust community detection via proximity-based cut and merge. In