Asynchronous Gossip in Smartphone Peer-to-Peer Networks
Calvin Newport
Georgetown University
Washington, DC, [email protected]
Alex Weaver
Georgetown University
Washington, DC, [email protected]
Chaodong Zheng
Nanjing University
Nanjing, [email protected]
Abstract—In this paper, we study gossip algorithms in communication models that describe the peer-to-peer networking functionality included in most standard smartphone operating systems. We begin by describing and analyzing a new synchronous gossip algorithm in this setting that features both a faster round complexity and simpler operation than the best-known existing solutions. We also prove a new lower bound on the rounds required to solve gossip that resolves a minor open question by establishing that existing synchronous solutions are within logarithmic factors of optimal. We then adapt our synchronous algorithm to produce a novel gossip strategy for an asynchronous model that directly captures the interface of a standard smartphone peer-to-peer networking library (enabling algorithms described in this model to be easily implemented on real phones). Using new analysis techniques, we prove that this asynchronous strategy efficiently solves gossip. This is the first known efficient asynchronous information dissemination result for the smartphone peer-to-peer setting. We argue that our new strategy can be used to implement effective information spreading subroutines in real-world smartphone peer-to-peer network applications, and that the analytical tools we developed to analyze it can be leveraged to produce other broadly useful algorithmic strategies for this increasingly important setting.

Index Terms—gossip, distributed algorithms, peer-to-peer networks
I. Introduction
In this paper, we study gossip in smartphone peer-to-peer networks, an interesting emerging networking platform that makes use of the peer-to-peer libraries included in standard smartphone operating systems (for examples of these networks in practice, see: [23], [1], [22], [18], [17], [11], [12]). We begin by improving the best-known synchronous gossip algorithms in this setting, and then build on these results to describe and analyze the first efficient asynchronous solution. The model in which we study this latter algorithm captures the interfaces and guarantees of an actual peer-to-peer networking library used in iOS, meaning that our gossip solution can be directly implemented on commodity iPhones. To emphasize this practicality, in Appendix D we provide the Swift code that implements this algorithm in iOS—a rare instance in the study of distributed algorithms for wireless networks in which the gap between theory and practice is minimal.

Below we briefly summarize the models we study and the relevant existing bounds in these models, before describing the new results proved in this paper.

The Mobile Telephone Model (MTM).
The mobile telephone model (MTM) [13] extends the well-studied telephone model of wired peer-to-peer networks (e.g., [10], [14], [16], [5], [9], [15]) to better capture the dynamics of the peer-to-peer network libraries implemented in existing smartphone operating systems. In recent years, multiple distributed algorithm problems have been studied in the MTM setting, including: rumor spreading [13], load balancing [7], leader election [20], network capacity [8], and gossip [19], [21].

As we elaborate in Section III, in the MTM, time proceeds in synchronous rounds. At the beginning of each round, each wireless device (which we will call a node) can advertise a small amount of information to its neighbors in the peer-to-peer network topology (defined by an undirected graph). After receiving advertisements, nodes can attempt local connections. In more detail, in each round, each node can send and accept at most one connection proposal. If a node u's proposal is accepted by neighboring node v, then u and v can perform a bounded amount of reliable communication using this connection before the round ends.

This scan-and-connect network architecture—in which nodes can broadcast small advertisements to all of their neighbors, but form pairwise connections with only a limited number at a time—is a defining feature of existing smartphone peer-to-peer libraries. In the peer-to-peer libraries that depend on Bluetooth, for example, the advertisements are implemented as low energy beacons that contain at most tens of bytes, whereas the pairwise connections are implemented as reliable, high throughput links that can achieve up to 2 Mbits/sec [3]. These libraries, therefore, allow devices to broadcast advertisements to all neighbors, but severely restrict the number of concurrent pairwise connections allowed. In iOS, for example, this limit is 7 (the MTM typically reduces this bound to 1 to simplify the model description and analysis).

Mobile Telephone Model vs. Classical Telephone Model.
The MTM can be understood as a modification of the classical telephone model of peer-to-peer networks [10], [14], [16], [5], [9], [15]. The MTM differs from its predecessor in two ways: (1) it allows nodes to broadcast small advertisements to all neighbors; and (2) it bounds the number of concurrent connections allowed at each node. As elaborated in [6], [13], this second difference prevents existing telephone model results from applying to the mobile telephone setting, as the best-known telephone model analyses specifically depend on the ability of nodes to service an unbounded number of incoming concurrent connections (the standard analysis of PUSH-PULL rumor spreading, for example, depends on the ability of many nodes to simultaneously pull the rumor from a common informed neighbor). On the other hand, the addition of advertisements to the MTM means that results in this new model do not apply to the classical telephone setting, which does not include this behavior. Fundamentally new techniques are needed to study the MTM.

The Asynchronous Mobile Telephone Model (aMTM).
The mobile telephone model includes synchronized rounds. This assumption simplifies analyses that probe the fundamental capabilities of scan-and-connect style peer-to-peer networks. It also introduces, however, a gap between theory and practice, as real smartphone peer-to-peer networks are not synchronized. To help close this gap, in [21], the authors introduced the asynchronous mobile telephone model (aMTM), which, as we elaborate in Section IV, eliminates the synchronous round assumption from the MTM, and allows communication events to unfold with unpredictable delays, controlled by an adversary. To increase the practicality of the aMTM, the authors of [21] also provide a software wrapper around the network libraries offered in iOS that matches the interface from the formal specification of the aMTM—simplifying the task of directly implementing algorithms analyzed in the aMTM on iPhones.

Existing Results.
Work on the MTM began with [13], which studied rumor spreading, and described a strategy that uses a 1-bit advertisement to compensate for connection bounds to spread a rumor in at most O((1/α) log n log Δ) rounds, with high probability, in a network with n nodes, maximum degree Δ, and vertex expansion α (see Section II). The paper also proved that there exist graphs with good graph conductance, φ, for which efficient rumor spreading is impossible. This creates a separation from the classical telephone model, where both vertex expansion and conductance are known to be good measures of the ability to spread a rumor efficiently in a graph. In the classical model, for example, the canonical PUSH-PULL rumor spreading strategy requires Θ((1/α) log n) rounds for graphs with vertex expansion α [15], and Θ((1/φ) log n) rounds for graphs with conductance φ [14].

The more general problem of gossiping k rumors in the mobile telephone model was first studied in [19], which described an algorithm that spreads the rumors in O((k/α) log n) rounds, with high probability. (In [19], the algorithm is listed as requiring O((k/α) log n) rounds, but that result assumes a single bit of advertisement in each round—requiring devices to spell out control information over many rounds of advertising. To normalize with this paper, in which tags can contain log n bits, this existing strategy's time complexity improves by a log factor.) This algorithm was one-shot, in the sense that it cannot accommodate on-going rumor arrivals, or detect when it has terminated. In recent work [21], a simpler gossip algorithm was described and analyzed that improves this bound to O((k/α) log n log Δ) rounds, and can handle on-going rumor arrivals.

By comparison, the best-known gossip solution in the classical telephone model requires O(D + polylog(n)) rounds [4]. This result was considered a breakthrough as it removed the dependence on graph properties such as expansion or conductance.
The solution in [4], however, requires unbounded concurrent connections and unbounded message size (allowing all rumors in the set difference between two nodes to be delivered during a given one-round connection).

The aMTM was introduced in [21], which analyzes a basic asynchronous rumor spreading algorithm, and proves it requires O(√(n/α) · log n · δmax) time, with high probability, where δmax is a sum of the maximum delays on the relevant communication events (as is standard in asynchronous models, δmax is unknown to the algorithm and can change from execution to execution). For gossip, however, the paper establishes only a crude deterministic bound of O(n · k · δmax) time to gossip k rumors. Finding an efficient gossip algorithm in the aMTM was left as the core open question of [21], as such an algorithm could be directly deployed as an information spreading routine in real smartphone peer-to-peer networks.

New Result.
Our ultimate goal in this paper is to design and analyze an efficient and simple gossip strategy for the aMTM. The first step toward this goal is to identify an efficient synchronous strategy that can be adapted to asynchrony. The existing synchronous gossip algorithm from [21] is not a good candidate for this purpose because it requires nodes to advertise whether or not they were involved in a connection at any point during the previous log n rounds. This behavior cannot be easily adapted to an environment with no rounds.

In Section III, we overcome this issue by describing a simpler strategy we call random diffusion gossip that does not depend on round history. This algorithm has each node continually advertise two pieces of information about its current rumor set: a hash of the set and its size. When faced with multiple neighbors with different rumor set hash values, a node will randomly select a recipient of a connection proposal from among those with the smallest rumor set sizes. This strategy is easily adapted to asynchrony as it does not explicitly use rounds.

As we show, in addition to being both round-independent and pleasingly straightforward in its operation, random diffusion gossip is more efficient than the solution from [21], requiring only O((k/α) log n log Δ) rounds to spread k rumors. The source of this speed-up is a new and improved version of the core technical lemma from [13], which bounds the performance of a random matching strategy in bipartite graphs. Notice that this gossip result also improves the best known result for rumor spreading (i.e., for k = 1; this explains why the rumor count, k, is not needed in that time complexity). The existing gossip strategies for this model require Õ(k/α) rounds (where Õ suppresses polylogarithmic factors in n and Δ). As argued in the previous work on gossip, it might be possible to leverage pipelining to achieve results in Õ(k + (1/α)), which would make the existing gossip strategies for this model far from optimal in certain cases.
In Section III, we resolve this open question by proving that Ω(k/α) is indeed a lower bound for spreading k rumors in the mobile telephone model.

New Result.
Our synchronous random diffusion gossip algorithm's operation is easily adapted to our asynchronous model. Adapting its analysis, however, is more complicated. Like most algorithms studied in the MTM, our synchronous analysis of random diffusion gossip relies on the synchronized behavior of the devices in the network: fixing for each round a set of potentially productive connections, and then arguing that a reasonable fraction of these connections will succeed in parallel during the round.

Our first step toward enabling an asynchronous analysis is to divide time into intervals of a length proportional to δmax. These phases are not used by the algorithm (as δmax is a priori unknown), but instead meant only to facilitate our analysis. As in the synchronous setting, we fix a set of potential connections at the beginning of each interval. We show that amidst all the chaotic, asynchronous behavior that occurs during the interval, for each such connection from some node u to some node v in this set, one of two things will happen: there will be a point at which u selects a connection from a set that includes v and that is not too large (keeping the probability of v's selection reasonable), or some other node will end up connecting with v before u even gets a chance to learn about v.

To make use of this probabilistic analysis, we leverage a rebuilt version of the core randomized matching lemma from [13] (discussed above), that we make not only more powerful but also significantly more friendly to asynchrony. In more detail, this new version includes two crucial changes. First, the original lemma follows the behavior of a randomized matching strategy over multiple rounds to achieve the needed result. Our new version, by contrast, requires only a single round, which is necessary to apply to our interval structure, as in the asynchronous setting too much can change in the network between intervals to enable a coherent multi-interval graph analysis.
Second, the original version relied on the precise probabilities of particular connections occurring, using both upper and lower bounds on these values to prove its claim. Our new version only requires the loose lower bounds on connection probabilities established by our asynchronous analysis.

Combining these techniques, we are able to translate the synchronous complexity bound directly to the asynchronous setting, proving that k rumors spread in at most O((k/α) log n log Δ · δmax) time.

II. Preliminaries

Here we define useful notation and results that we use throughout the analysis that follows.
Range Notation.
We use the notation [m], for 1 ≤ m, to signify the range of integers 1, ..., ⌈m⌉. In contrast, we use the notation [a, b], for a ≤ b, to denote the real numbers from a to b.

Graphs and Vertex Expansion.
Fix an undirected graph G = (V, E). For node u ∈ V, we use the notation N(u) to denote u's neighbors in G and deg(u) = |N(u)| to denote u's degree in G. Let Δ = max_{u∈V} deg(u) be the maximum degree of any node in G. For a given subset of nodes S ⊆ V, let ∂S = { v | v ∈ V \ S, N(v) ∩ S ≠ ∅ } denote the boundary of S. We then let α(S) = |∂S|/|S| and define the vertex expansion of a graph G as α = min_{S⊂V, |S|∈[n/2]} α(S).

Let B(S) represent a bipartite graph with bipartitions (S, V \ S) and let ν(B(S)) represent the size of the maximum matching over B(S). We leverage the following lemma from [13].

Lemma II.1. (Lemma 5.4 of [13]). Let γ = min_{S⊂V, |S|∈[n/2]} { ν(B(S))/|S| }. It follows that γ ≥ α/4.

Useful Probability Results.
Many of our results are described as holding with high probability (or, w.h.p.), which we define to mean with a failure probability polynomially small in the network size n. To help achieve these results, we sometimes apply concentration bounds, often using the following presentation of the Chernoff bound.

Theorem II.2.
Let X_1, ..., X_n be a series of independent random variables such that X_i ∈ [0, 1], where X = Σ_{i=1}^{n} X_i has expectation E[X] = µ. For ε ∈ (0, 1), Pr[X ≤ (1 − ε)·µ] ≤ exp(−(1/2)·ε²·µ).

In several places in our analysis, we tame correlated random variables by applying the following stochastic dominance result. This general idea is common, but we prove the result from scratch here in the exact form we need for the sake of completeness. The full proof resides in Appendix A1.
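The dominance idea can be illustrated numerically: a sum of indicator variables whose success probabilities may depend on history, but each stay at least p, stochastically dominates a Binomial(T, p) sum. The following is a minimal simulation sketch (our own illustrative code; the function names and the particular history-dependent schedule for q_i are hypothetical, chosen only to satisfy the assumption q_i ≥ p):

```python
import random

def correlated_successes(T: int, p: float, rng: random.Random) -> int:
    """Count successes of T indicator variables whose success
    probabilities depend on history but never drop below p."""
    successes = 0
    for _ in range(T):
        # History-dependent schedule: q_i varies with the parity of the
        # success count so far, but is always at least p.
        q_i = min(1.0, p + 0.3 * (successes % 2))
        successes += 1 if rng.random() < q_i else 0
    return successes

def binomial_successes(T: int, p: float, rng: random.Random) -> int:
    """Baseline: T independent Bernoulli(p) trials."""
    return sum(1 for _ in range(T) if rng.random() < p)

rng = random.Random(42)
T, p, trials = 200, 0.25, 2000
corr = sorted(correlated_successes(T, p, rng) for _ in range(trials))
base = sorted(binomial_successes(T, p, rng) for _ in range(trials))
# Empirical dominance check: compare the two samples quantile by
# quantile; the correlated process should have at least as many
# successes as the Binomial(T, p) baseline.
dominated = sum(1 for c, b in zip(corr, base) if c >= b)
print(dominated / trials)
```

The quantile-by-quantile comparison of the sorted samples is an empirical proxy for stochastic dominance of the cumulative distribution functions.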
Lemma II.3.
Let X_1, ..., X_T be a sequence of T random indicator variables where X_i = 1 with some unknown probability q_i. Assume that for all i ∈ [T], it always holds that q_i ≥ p, for some constant probability p. Next, define the total number of successes as Y = Σ_{i∈[T]} X_i. It follows that Y = Ω(pT) with probability at least Ω(1 − exp(−pT)).

III. Synchronous Gossip

In this section, we analyze new upper and lower bounds for gossip in the synchronous MTM.
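The quantities from Section II that drive the bounds in this section, the vertex expansion α and the matching ratio γ of Lemma II.1, can be computed by brute force on small graphs. The following is a minimal sketch (our own helper code, exponential in n and intended only for illustration; the function names are hypothetical):

```python
from itertools import combinations

def neighbors(V, E, u):
    """Neighbors of u in the undirected graph (V, E)."""
    return {v for v in V if frozenset((u, v)) in E}

def boundary(V, E, S):
    """The boundary ∂S: nodes outside S with at least one neighbor in S."""
    return {v for v in V - S if neighbors(V, E, v) & S}

def max_matching(left, right, E):
    """Maximum bipartite matching size via augmenting paths."""
    match = {}  # right node -> matched left node
    def try_augment(u, seen):
        for v in neighbors(left | right, E, u) & right:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False
    for u in left:
        try_augment(u, set())
    return len(match)

def expansion_and_gamma(V, E):
    """Brute-force α = min α(S) and γ = min ν(B(S))/|S| over |S| ∈ [n/2]."""
    n = len(V)
    alpha = gamma = float("inf")
    for size in range(1, n // 2 + n % 2 + 1):
        for S in map(set, combinations(V, size)):
            alpha = min(alpha, len(boundary(V, E, S)) / len(S))
            gamma = min(gamma, max_matching(S, V - S, E) / len(S))
    return alpha, gamma

# Example: a 4-cycle 0-1-2-3-0.
V = {0, 1, 2, 3}
E = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}
alpha, gamma = expansion_and_gamma(V, E)
print(alpha, gamma)  # → 1.0 1.0
```

On the 4-cycle, every subset of at most half the nodes has a crossing matching saturating it, so here γ matches α exactly.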
A. The Mobile Telephone Model
The mobile telephone model (MTM) (introduced in [13]) describes a peer-to-peer network of wireless devices. The network is modeled as an undirected graph G = (V, E), where each device u is represented by a vertex in the graph. We will use the term node to refer to both the device and the corresponding vertex in the graph. If two devices u and v are within communication range in the network, we connect the corresponding nodes with an undirected edge {u, v} ∈ E. We denote the number of nodes in the graph as n = |V|.

Time in the MTM proceeds in synchronous rounds with all nodes beginning at round 1. In each round, each node begins by broadcasting an advertisement containing O(log n) bits to its neighbors in G. After receiving advertisements, each node can decide to send a connection proposal to at most one neighbor. Any node that receives one or more proposals must accept exactly one. We allow the model to arbitrarily select which proposal is accepted in this case. (That is, we do not necessarily assume that each node successfully receives all incoming proposals and is therefore able to make a careful decision on which to accept.)

Finally, if some node v accepts a connection proposal from neighboring node u, then u and v are considered connected. They can then perform a bounded amount of interactive and reliable communication before the round concludes. Notice, this model definition limits each node to participating in at most 2 connections per round (one outgoing and one incoming).

B. The Gossip Problem
The gossip problem we study assumes that k ≥ 1 rumors (called tokens in the following) are distributed arbitrarily to nodes at the beginning of the execution (that is, some nodes can start with many tokens, some can start with none). The problem is solved once all nodes know all k rumors. Nodes do not know k in advance. We treat the gossip tokens as comparable black boxes. The only way for a node u to communicate a token to node v is if u and v are connected. In the synchronous setting, we limit nodes to communicating at most a constant number of tokens over a given connection in a single round. (Later, when we study this problem in the asynchronous setting, we instead bound the maximum time required to transmit a single token over a connection.)

C. The Random Diffusion Gossip Algorithm

Here we present the random diffusion gossip algorithm, which we formalize as pseudocode in Algorithm 1. The core strategy of this algorithm is for nodes to attempt to send tokens to the neighbors with the smallest token sets. This contrasts with the strategy of [21], in which nodes bias connection attempts toward neighbors that have not participated in connections in recent rounds.

Algorithm 1:
Random diffusion gossip (for process u)

  T ← initial token set of u
  H ← shared hash function
  while true do
    Advertise(⟨H(T), |T|, u⟩)
    A ← ReceiveAdvertisements()
    s ← min({ s_v | ⟨h, s_v, ∗⟩ ∈ A, h ≠ H(T) })
    N ← { v | ⟨h, s, v⟩ ∈ A, h ≠ H(T) }
    v ← node chosen randomly from N
    (attempt to connect to v; if successful, send/receive a token from the set difference)

In more detail, in each round, each node u advertises a hash of its token set, the size of its token set, and its unique identifier. Node u then considers advertisements from neighbors that advertised different token set hashes, identifying the smallest token set size from this set. It randomly selects one of these nodes to send a connection proposal. If the proposal is accepted, a token from the set difference is transferred, increasing at least one of the two nodes' token sets.

D. Analysis
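Before the analysis, the algorithm just described can be simulated directly. The following is a toy synchronous sketch (our own illustrative code, not the Swift implementation from Appendix D): exact set comparison stands in for hashing, the model's arbitrary acceptance among competing proposals is resolved by a random pick, and one token from the set difference moves per connection.

```python
import random

def random_diffusion_round(G, tokens, rng):
    """One synchronous MTM round of random diffusion gossip.
    G: adjacency dict (node -> neighbor list); tokens: node -> set
    of tokens, mutated in place."""
    proposals = {}  # target node -> list of proposing nodes
    for u in G:
        # Candidates: neighbors whose advertised set differs from u's
        # (set equality stands in for comparing the advertised hashes).
        candidates = [v for v in G[u] if tokens[v] != tokens[u]]
        if not candidates:
            continue
        s = min(len(tokens[v]) for v in candidates)
        smallest = [v for v in candidates if len(tokens[v]) == s]
        proposals.setdefault(rng.choice(smallest), []).append(u)
    for v, proposers in proposals.items():
        u = rng.choice(proposers)  # model arbitrarily accepts one proposal
        # Transfer one token from the set difference, in either direction.
        if tokens[u] - tokens[v]:
            tokens[v].add(next(iter(tokens[u] - tokens[v])))
        elif tokens[v] - tokens[u]:
            tokens[u].add(next(iter(tokens[v] - tokens[u])))

# Example: spread k = 3 tokens on a path of 6 nodes.
rng = random.Random(7)
G = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
tokens = {i: set() for i in range(6)}
tokens[0], tokens[3] = {"a"}, {"b", "c"}
rounds = 0
while any(t != {"a", "b", "c"} for t in tokens.values()):
    random_diffusion_round(G, tokens, rng)
    rounds += 1
print(rounds)
```

Since every round with a differing adjacent pair produces at least one productive connection, the loop is guaranteed to terminate after at most 15 transfers on this instance.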
Our goal is to prove the following bound on the time complexity of this algorithm.
Theorem III.1.
With high probability in n, the random diffusion gossip algorithm solves gossip in O((k/α) log n log Δ) rounds, where k is the number of initial tokens, α is the vertex expansion of the graph, n is the size of the graph, and Δ is the maximum degree of the graph.

We begin by defining some useful notation. At the beginning of round r, let T_u(r) be the token set of node u and let s_u(r) be the minimum token set size among u's neighbors. Furthermore, for a fixed topology graph G = (V, E), let N(u) be the neighbors of u in G and let N_u(r) be the productive neighbors for u at the beginning of round r, where we define N_u(r) = { v | v ∈ N(u), |T_v(r)| = s_u(r), H(T_u(r)) ≠ H(T_v(r)) }.

For integer sizes i ∈ {0, ..., k}, let S_i(r) = { v | v ∈ V, i = |T_v(r)| } be the set of nodes that know exactly i tokens at the beginning of round r. Next, let n_i(r) = |S_i(r)| and n*_i(r) = min(n_i(r), n − n_i(r)). We also define i_min(r) = min({ i | i ∈ {0, ..., k}, n_i(r) > 0 }) as the minimum token set size for which there is at least one node with exactly that many tokens. For convenience, let S_min(r) = S_j(r) and n*_min(r) = n*_j(r) for j = i_min(r). Finally, we define C(r) = |{ i | i ∈ {0, ..., k}, n_i(r) > 0 }| as the number of token set sizes held by nodes.

The approach we will take when proving our theorem statement is to bound how long any minimum token set size i_min(r) can remain the minimum token set size. Since the minimum token set size can never decrease, this will then allow us to prove the total time complexity for our algorithm. For most of our analysis, we will focus on the connections between nodes in S_min(r) and V \ S_min(r). In order for this cut to exist, though, it clearly must be the case that C(r) > 1. The following lemma handles the case where C(r) = 1, the proof of which can be found in Appendix B1.
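The bookkeeping quantities defined above can be computed mechanically from a snapshot of the token sets. A small helper sketch (our own illustrative code; the function and variable names are hypothetical):

```python
def gossip_stats(tokens):
    """Compute n_i, i_min, n*_min, and C from a snapshot of token sets.
    tokens: node -> set of tokens held by that node."""
    n = len(tokens)
    sizes = [len(t) for t in tokens.values()]
    n_i = {i: sizes.count(i) for i in set(sizes)}  # only occupied sizes
    i_min = min(n_i)                               # smallest token set size
    n_star_min = min(n_i[i_min], n - n_i[i_min])   # n*_min(r)
    C = len(n_i)                                   # number of distinct sizes
    return n_i, i_min, n_star_min, C

# Snapshot: five nodes holding subsets of k = 3 tokens.
tokens = {1: {"a"}, 2: {"a"}, 3: {"a", "b"}, 4: {"a", "b", "c"}, 5: {"b"}}
n_i, i_min, n_star_min, C = gossip_stats(tokens)
print(n_i, i_min, n_star_min, C)
```

In this snapshot, three nodes sit at the minimum size 1, so i_min = 1, n*_min = min(3, 5 − 3) = 2, and C = 3 distinct sizes are present.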
Lemma III.2.
Fix a round r > 0 such that C(r) = 1. Either C(r + 1) > 1 or i_min(r + 1) > i_min(r).

The purpose of Lemma III.2 is to simply establish that regardless of the minimum token set size, there are some nodes which quickly achieve a token set larger than the minimum number of tokens held by any node. This allows us to analyze the cut between these nodes in V \ S_min(r) and the nodes that still possess exactly i_min(r) tokens, S_min(r). (As in [21], a couple of simplifying assumptions are made here. The first is that we avoid hash collisions in the executions we consider, allowing us to make the reasonable assumption that different token set hashes always indicate different token sets. We also make the pragmatic assumption that these hash values, as well as token set size counts, fit within the O(log n) bound on advertisements.) Productive connections made over this cut will provide nodes of S_min(r) with new tokens, increasing their token set size, and shrinking S_min(r). When no nodes remain, the minimum token set size must be larger than i_min(r).

Furthermore, note that for some rounds r and r′ such that C(r) > 1, C(r′) = 1, and r < r′, it must be the case that every node in S_min(r) has participated in a productive connection. Therefore, we will continue our analysis with the assumption that C(r) > 1 for the round r we fix, and revisit Lemma III.2 in the proof of Theorem III.1.

We continue by defining the productive subgraph G(r) of G (defined with respect to a fixed round r), which contains all the connections which nodes might attempt to form in the given round r.

Definition III.3.
At the beginning of round r > 0, define the productive subgraph G(r) of the graph topology G = (V, E) as the undirected graph G(r) = (V, E(r)) such that E(r) = {{u, v} | v ∈ N_u(r)}.

For the purposes of our analysis, it will be sufficient to focus on a subgraph of the productive subgraph which only considers nodes in S_min(r) and their neighbors.

Definition III.4.
At the beginning of round r > 0, define the minimum productive subgraph G_min(r) as the undirected bipartite subgraph G_min(r) = (L_min(r), R_min(r), E_min(r)) such that
• L_min(r) = { u | u ∈ V \ S_min(r), N(u) ∩ S_min(r) ≠ ∅ },
• R_min(r) = { u | u ∈ S_min(r), N(u) ∩ (V \ S_min(r)) ≠ ∅ }, and
• E_min(r) = {{u, v} | u ∈ L_min(r), v ∈ R_min(r), {u, v} ∈ E(r)}.

In other words, the minimum productive subgraph G_min(r) only contains edges representing the potential connections which would result from connection proposals sent to nodes with the fewest number of tokens in the entire network at the beginning of round r (from nodes with more than this number of tokens). The significance of G_min(r) is that every productive connection in this graph causes a node with the fewest number of tokens to no longer have the fewest number of tokens. For this reason, we next lower bound the number of potential productive connections in G_min(r). The full proof for this lemma can be found in Appendix B2.

Lemma III.5.
For a fixed round r > 0, there is a matching over G_min(r) with size m ≥ (α/4) · n*_min(r).

We now have a lower bound for the number of potential connections that nodes in S_min(r) could participate in for a given round. To show that our algorithm is able to exploit these possible connections, we now prove and apply a significantly reworked version of a core lemma from [13] which bounds the behavior of randomized connection attempts in bipartite graphs satisfying certain properties. In the immediate context of our synchronous analysis, this new version of the lemma provides a log-factor time complexity improvement as compared to the original version. As detailed in the introduction, however, most of the updates captured below (which represent some of the core technical contributions of this paper) are introduced to make this lemma applicable to the asynchronous analysis that follows in the next section. We also note that this improved version of the lemma can be plugged into the analysis of [13] to provide a log-factor improvement to the complexity of its rumor spreading algorithm.

Lemma III.6. (Replaces Theorem 7.4 in [13]). Let G(L, R) be the subgraph of G_min(r) induced by node subsets L and R, let N_{L,R}(u) be the neighbors of node u in G(L, R), and let deg_{L,R}(u) = |N_{L,R}(u)|. Fix any i ∈ [32 · log Δ].
For a fixed round r > 0, let L ⊆ L_min(r) and R ⊆ R_min(r) be subsets such that:
1) there is a matching of size |L| over G(L, R),
2) |R| ≥ |L| ≥ c · m for some 0 < c ≤ 1, where m is the size of the maximum matching over G(L, R),
3) Σ_{u∈L} deg_{L,R}(u) ≤ m · Δ^(1 − (i−1)/(32 log Δ)), and
4) for every u ∈ L, every neighbor of u in R_min(r) is in R.
With at least constant probability, within one round of the random diffusion gossip algorithm:
1) at least Ω(m / log Δ) nodes of R participate in a productive connection, or
2) we can identify L′′ ⊆ L ∩ L_min(r′) and R′′ ⊆ R ∩ R_min(r′) for some r′ ∈ {r, r + 1} such that:
   a) there is a matching of size |L′′| over G(L′′, R′′),
   b) |R′′| ≥ |L′′| ≥ (1 − 2/log Δ) · |L|,
   c) Σ_{u∈L′′} deg_{L′′,R′′}(u) ≤ m · Δ^(1 − i/(32 log Δ)), and
   d) for every u ∈ L′′, every neighbor of u in R_min(r′) is in R′′.

Proof. Our proof, like the proof of Theorem 7.4 in [13], is broken up into several steps. For the matching M of size at least m · c over our original graph G(L, R), we denote a node v ∈ R as the original match of a node u ∈ L if {u, v} ∈ M. This terminology is also taken from the original proof.

Remove High Degree Nodes from L. Let δ_i = (1/c) · log Δ · Δ^(1 − (i−1)/(32 log Δ)) and consider all nodes in L with degree at most δ_i. As in [13], this choice of δ_i is based on our assumptions that |L| ≥ c · m and Σ_{u∈L} deg_{L,R}(u) ≤ m · Δ^(1 − (i−1)/(32 log Δ)), which together imply that at most a 1/log Δ fraction of the nodes u ∈ L can have deg_{L,R}(u) > δ_i. Let L′ ⊆ L be the subset of nodes once we remove all such high degree nodes from L, and again note that |L′| ≥ (1 − 1/log Δ) · |L|. We then remove all nodes from R that are not connected to L′ and denote the remaining set R′. Note that for every node u ∈ L′, every neighbor set satisfies N_{L,R}(u) = N_{L′,R′}(u).
The authors of [13] note that this implies G(L′, R′) has a matching of size |L′|, since for every node u ∈ L′, u's original match is in R′. These observations alone fulfill conditions (a), (b), and (d) of the second objective of the lemma. Therefore, if condition (c) holds, such that Σ_{u∈L′} deg_{L′,R′}(u) ≤ m · Δ^(1 − i/(32 log Δ)), the second objective of the lemma is already satisfied by setting L′′ = L′, R′′ = R′, and r′ = r. We therefore assume for the remainder of the proof that Σ_{u∈L′} deg_{L′,R′}(u) ≥ m · Δ^(1 − i/(32 log Δ)).

At this point we diverge significantly from the strategy of the original proof and introduce a new technique for leveraging this assumption regarding the degree sum in G(L′, R′). We start by leveraging a definition which was first used in [2] in the context of the maximal independent set problem. Namely, call a node u good with respect to a graph G if |{ v | v ∈ N(u), deg(v) ≤ deg(u) }| ≥ deg(u)/3, where N(u) and deg(u) are u's neighbor set and degree in G. Otherwise, call u bad with respect to G. In other words, a node is good with respect to a graph G if at least one third of its neighbors in G have at most its degree in G.

In G(L′, R′), let R′_b ⊆ R′ be the bad nodes in R′ and let R′_g ⊆ R′ be the good nodes, where good and bad are defined with respect to G(L′, R′). Since every edge of G(L′, R′) has an endpoint in R′, clearly either Σ_{u∈R′_b} deg_{L′,R′}(u) ≥ (1/2) · m · Δ^(1 − i/(32 log Δ)) or Σ_{u∈R′_g} deg_{L′,R′}(u) ≥ (1/2) · m · Δ^(1 − i/(32 log Δ)). Simply speaking, since R′_b ∪ R′_g = R′, at least half of the edges in G(L′, R′) are incident on R′_b or at least half are incident on R′_g. We first assume the former case.

Case 1: At Least Half the Edges in G(L′, R′) are Incident on R′_b.
Let G(L′_b, R′_b) be the graph induced by the edges incident on R′_b and note that for every v ∈ R′_b, N_{L′,R′}(v) = N_{L′_b,R′_b}(v). Next, recognize that if for any v ∈ R′_b, deg_{L′_b,R′_b}(v) > δ_i, then v would have higher degree in G(L′, R′) than any node in L′ (since every node in L′ has degree at most δ_i), making v trivially good with respect to G(L′, R′). This contradicts v ∈ R′_b, therefore deg_{L′_b,R′_b}(v) ≤ δ_i.

Divide the nodes of R′_b into ⌈log δ_i⌉ classes based on their degree in G(L′_b, R′_b) such that nodes of degree [2^(j−1), 2^j] are in the j-th class, denoted R′_b(j). Let E_j be the edges incident on nodes of this class. Note that by our case assumption, Σ_{j∈[log δ_i]} |E_j| ≥ (1/2) · m · Δ^(1 − i/(32 log Δ)). Since for every node u ∈ L′_b, deg_{L′_b,R′_b}(u) ≤ deg_{L′,R′}(u) ≤ δ_i, the probability that v ∈ R′_b(j) participates in a productive connection in round r is at least

  1 − Π_{u ∈ N_{L′_b,R′_b}(v)} (1 − 1/deg_{L′_b,R′_b}(u))   (1)
  ≥ 1 − Π_{u ∈ N_{L′_b,R′_b}(v)} (1 − 1/deg_{L′,R′}(u))   (2)
  ≥ 1 − Π_{u ∈ N_{L′_b,R′_b}(v)} (1 − 1/δ_i) = 1 − (1 − 1/δ_i)^(deg_{L′_b,R′_b}(v))   (3)
  ≥ 1 − (1 − 1/δ_i)^(2^(j−1)) ≥ 1 − 1/(1 + 2^(j−1)/δ_i)   (4)
  = (2^(j−1)/δ_i)/(1 + 2^(j−1)/δ_i) ≥ 2^(j−2)/δ_i.   (5)

Note that for Line 4 we use the inequality (1 − x)^n ≤ 1/(1 + xn) for x ∈ [0, 1], n ∈ ℕ (which can be shown via Bernoulli's inequality), and for Line 5 we use our observation that 2^(j−1) ≤ δ_i and therefore 2^(j−1)/δ_i ≤ 1. Now, since there are |E_j| edges incident on nodes in R′_b(j) and each node in R′_b(j) has degree at most 2^j, there are at least |E_j|/2^j nodes in this class. Therefore, the expected number of nodes from R′_b(j) which participate in a productive connection in round r is at least (|E_j|/2^j) · (2^(j−2)/δ_i) = |E_j|/(4δ_i). Therefore, the expected number of nodes selected across all log δ_i classes is

  Σ_{j∈[log δ_i]} |E_j|/(4δ_i) = (1/(4δ_i)) · Σ_{j∈[log δ_i]} |E_j|
  ≥ (1/(4δ_i)) · (1/2) · m · Δ^(1 − i/(32 log Δ))
  = m · Δ^(1 − i/(32 log Δ)) / (8 · (1/c) · log Δ · Δ^(1 − (i−1)/(32 log Δ)))
  = (m · c · Δ^(−1/(32 log Δ))) / (8 log Δ)
  = Ω(m / log Δ),

where the second step leverages our case assumption. Since this expectation is equal to the sum of negatively-correlated random variables, as in [13] we can then apply the Chernoff bound from Theorem II.2 to achieve a concentration around this bound, showing that with at least constant probability the actual number of productive connections is Ω(m/log Δ) (unless m/log Δ falls below a constant, in which case the bound holds trivially). Therefore, with at least constant probability the first objective of the lemma is satisfied.

Case 2: At Least Half the Edges in G(L′, R′) are Incident on R′_g.

Now assume Σ_{u∈R′_g} deg_{L′,R′}(u) ≥ (1/2) · m · Δ^(1 − i/(32 log Δ)). As before, let G(L′_g, R′_g) denote the graph induced by the edges incident on R′_g and note that for all u ∈ L′_g, deg_{L′_g,R′_g}(u) ≤ deg_{L′,R′}(u), and for all v ∈ R′_g, N_{L′_g,R′_g}(v) = N_{L′,R′}(v). For v ∈ R′_g, let the notation N^ℓ_{L′_g,R′_g}(v) = { u | u ∈ N_{L′_g,R′_g}(v), deg_{L′_g,R′_g}(u) ≤ deg_{L′_g,R′_g}(v) } denote v's lower degree neighbors in G(L′_g, R′_g). Define N^ℓ_{L′,R′}(v) for v ∈ R′ similarly.
Therefore, since for all u ∈ L′_g, deg_{L′_g,R′_g}(u) ≤ deg_{L′,R′}(u) and for all v ∈ R′_g, N_{L′_g,R′_g}(v) = N_{L′,R′}(v), we have that for all v ∈ R′_g, N^ℓ_{L′,R′}(v) ⊆ N^ℓ_{L′_g,R′_g}(v). Therefore, the probability a node v ∈ R′_g is selected is at least

1 − ∏_{u ∈ N_{L′_g,R′_g}(v)} (1 − 1/deg_{L′,R′}(u))   (1)
≥ 1 − ∏_{u ∈ N^ℓ_{L′_g,R′_g}(v)} (1 − 1/deg_{L′,R′}(u))   (2)
≥ 1 − ∏_{u ∈ N^ℓ_{L′,R′}(v)} (1 − 1/deg_{L′,R′}(u))   (3)
≥ 1 − ∏_{u ∈ N^ℓ_{L′,R′}(v)} (1 − 1/deg_{L′,R′}(v))   (4)
= 1 − (1 − 1/deg_{L′,R′}(v))^{|N^ℓ_{L′,R′}(v)|}   (5)
≥ 1 − (1 − 1/deg_{L′,R′}(v))^{deg_{L′,R′}(v)/2} ≥ 1 − e^{−1/2} > 1/4   (6)

Here Line 4 uses the definition of N^ℓ_{L′,R′}(v) to replace the degree of u with that of v, and Line 6 is where we leverage the assumption that v is good. Now remove every node from R′ that is selected in round r and denote the remaining set R′′, and remove from L′ every node u for which u's original match was removed from R′. Denote the remaining nodes L′′. Since we know from the above that each node v ∈ R′_g is removed from R′ with probability at least 1/4, and since the probability that an edge {u,v} is removed from G(L′,R′) is at least the probability that v is removed from R′, the probability that an edge {u,v} incident on R′_g is removed is at least 1/4. Since by our case assumption there are at least (1/2) · (m 2^{−i} ∆)/log ∆ edges incident on nodes in R′_g, in expectation, at least (1/8) · (m 2^{−i} ∆)/log ∆ edges are removed from G(L′,R′). Therefore, since by our initial assumption ∑_{u∈L} deg_{L,R}(u) ≤ (m 2^{−i} ∆)/log ∆ and the fact that ∑_{u∈L′} deg_{L′,R′}(u) ≤ ∑_{u∈L} deg_{L,R}(u), the expected number of edges X remaining in G(L′′,R′′) is at most E[X] ≤ (m 2^{−i} ∆)/log ∆ − (1/8) · (m 2^{−i} ∆)/log ∆ = (7/8) · (m 2^{−i} ∆)/log ∆. Therefore, by applying Markov's inequality we can upper bound the probability that X ≥ (15/16) · (m 2^{−i} ∆)/log ∆:

Pr[X ≥ (15/16) · (m 2^{−i} ∆)/log ∆] ≤ ((7/8) · (m 2^{−i} ∆)/log ∆) / ((15/16) · (m 2^{−i} ∆)/log ∆) = 14/15 < 1

That is, ∑_{u∈L′′} deg_{L′′,R′′}(u) ≤ (15/16) · (m 2^{−i} ∆)/log ∆ with at least constant probability. Since this satisfies condition (c) of the second objective of the lemma, we conclude by showing that either the remaining conditions of this objective are satisfied or the first objective has been achieved.

If |L′′| < (1 − 2/log ∆) · |L| then note that this means |L′′| < (1 − 1/log ∆) · |L′|, since |L′| ≥ (1 − 1/log ∆) · |L|. As is noted in [13], this implies that at least a 1/log ∆ fraction of nodes in L′ had their original match removed in round r, which means that at least |L′|/log ∆ nodes of R′ were selected and therefore participated in a productive connection. Since |L′| ≥ (1 − 1/log ∆) · |L| = Ω(m), this would indicate that at least Ω(m/log ∆) nodes of R′ participated in a productive connection, which would satisfy the first objective of the lemma.

Therefore, assume |L′′| ≥ (1 − 2/log ∆) · |L|. Note that once again by our construction, for every node u ∈ L′′, every neighbor of u in R_min(r+1) is in R′′. This includes u's original match in R′, such that there is a matching over G(L′′,R′′) of size |L′′|. Furthermore, by our construction of G(L′′,R′′), L′′ ⊆ L ∩ L_min(r+1) and R′′ ⊆ R ∩ R_min(r+1). The second objective of the lemma is therefore satisfied by L′′, R′′, and r′ = r+1. □

We now apply Lemma III.6 inductively over O(log ∆) rounds and leverage our result from Lemma III.5 to bound the number of connections over this period. This proof can be found in Appendix B3.

Lemma III.7.
Fix a round r > 0. With at least constant probability, within at most O(log ∆) rounds at least Ω((α/log ∆) · n*_min(r)) nodes of S_min(r) participate in a productive connection.

Now that we have bounded the expected number of successful connections over a phase of O(log ∆) rounds, our goal will be to bound the number of rounds required to increase the minimum token set size in the entire network. To establish this lemma (for which the proof can be found in Appendix B5) we first leverage Lemma A.1, which only bounds the time required for at least half the nodes to have more than the minimum token set size.

Lemma III.8.
Fix a round r > 0. There exists a round r_t, where r_t = r + O((1/α) log n log ∆), such that w.h.p. in n, all nodes in S_min(r) participate in a productive connection by round r_t.

Proof (of Theorem III.1). We now have everything we need to prove our main theorem. Consider any round r with minimum token set size i_min(r). From Lemma III.2 we have that if C(r) = 1, then by round r + 1 either C(r+1) > 1 or i_min(r+1) > i_min(r). As we show in Lemmas III.5 through III.8, if the former is true for each round r′ we consider where r′ > r and C(r′) > 1, at most O((1/α) log n log ∆) total rounds are needed for every node in S_min(r) to participate in a productive connection. Furthermore, if instead C(r′) = 1 for some such round, then either the minimum token set size has already increased or every node in S_min(r) has participated in a productive connection in this many rounds. Therefore this necessitates that the minimum token set size after O((1/α) log n log ∆) rounds is at least i_min(r) + 1. Since clearly the minimum token set size can increase at most k times, the total round complexity of the algorithm to spread all k tokens is at most O((k/α) log n log ∆) total rounds. □

E. Lower Bound for Gossip in the Mobile Telephone Model
We now show that our algorithm is optimal to within polylogarithmic factors by proving a lower bound for gossip in our model. The proof can be found in Appendix B6.
Theorem III.9.
For k initial tokens and any value 1/n ≤ α ≤ 1/2, there is a graph with n nodes and vertex expansion at least α where Ω(k/α) rounds are required to solve the gossip problem.

IV. Asynchronous Gossip

Here we analyze an asynchronous version of our random diffusion gossip strategy in the aMTM.

A. The Asynchronous Mobile Telephone Model
The asynchronous mobile telephone model (aMTM), first introduced in [21], removes the assumption of synchronous rounds from the MTM. Core communication properties, such as the time required for a neighbor to receive an advertisement or connection proposal, or the speed at which information is transmitted over a connection, can now vary arbitrarily during an execution.

Similar to the MTM, the topology of the underlying network is defined by an undirected graph. Furthermore, the behavior of the nodes in the aMTM is similarly constrained by a fixed scan-and-connect behavioral loop in which nodes: update their own advertisement, wait to hear new advertisements from at least some neighbors, decide whether to act on these advertisements by attempting to form a connection with a neighbor, then repeat. Unlike the MTM, however, nodes do not progress through this loop in a synchronized manner, with delays decided by an adversarial scheduler. This loop is formalized in Algorithm 2. The model implements the methods update, receiveAds, and blockForConn, which abstract the details of the underlying asynchronous communication.

Algorithm 2:
The aMTM interface (for device u)

state ← idle
Initialize()
while true do
  tag ← GetTag()
  update(tag)
  receiver ← null
  A ← receiveAds()
  if A ≠ ∅ then
    receiver ← Select(A)
  if receiver ≠ null then
    state ← blockForConn(receiver)
    if state = connected then
      Communicate(receiver)
    state ← idle

Model Guarantees and Parameters.
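As a rough illustration of this scan-and-connect loop, the following Python sketch simulates one pass of Algorithm 2 against a toy in-memory stand-in for the model methods. The class and helper names are ours and purely illustrative; a real deployment delegates update, receiveAds, and blockForConn to the OS peer-to-peer library:

```python
class ToyModel:
    """In-memory stand-in for the aMTM model methods. Advertisements are
    delivered instantly here; a real scheduler delays them arbitrarily."""
    def __init__(self):
        self.ads = {}      # node id -> latest advertised tag
        self.busy = set()  # nodes currently servicing a connection
    def update(self, u, tag):
        self.ads[u] = tag
    def receiveAds(self, u):
        # every other node's latest advertisement (toy: u hears everyone)
        return {v: t for v, t in self.ads.items() if v != u}
    def blockForConn(self, u, v):
        # a node already engaged in a connection rejects new proposals
        if v in self.busy:
            return "idle"
        self.busy.add(v)
        return "connected"

def amtm_loop_iteration(model, u, get_tag, select, communicate):
    """One pass through the Algorithm 2 loop for device u."""
    model.update(u, get_tag(u))
    ads = model.receiveAds(u)
    receiver = select(u, ads) if ads else None
    if receiver is not None:
        state = model.blockForConn(u, receiver)
        if state == "connected":
            communicate(u, receiver)
            model.busy.discard(receiver)  # Communicate returned: close
    return receiver

# tiny demo: u hears v's advertisement and connects to it
model = ToyModel()
model.update("v", ("h-v", 0, "v"))
log = []
chosen = amtm_loop_iteration(model, "u",
                             lambda u: ("h-" + u, 0, u),
                             lambda u, ads: sorted(ads)[0],
                             lambda u, v: log.append((u, v)))
```

The blockForConn stub captures the model guarantee that no node services more than one incoming connection at a time.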
As is standard with asynchronous network models, we constrain the model's behavior with respect to a set of maximum delays and bit rates specified for its key communication activities. We define these delays for a given execution with the parameters δ_update, δ_conn, and R_b. The values of these parameters are not known to the algorithm and can change from execution to execution. We detail the guarantees they help specify below:

Advertisement Guarantees:
If a node u calls update at some time t, then the model guarantees that every neighbor of u must receive an advertisement from u in the interval t to t + δ_update, and that only advertisements u passed to update during this interval are received in this interval. Notice, there is no guarantee that u's neighbors receive all of its advertisements. It is possible, for example, that u advertises a at a given time, then loops back around in less than δ_update time and replaces this with a new advertisement a′ before any neighbor had a chance to receive a. On the other hand, once a node begins advertising, its neighbors will hear from it at least once every δ_update time.

Connection Attempt Guarantees:
The parameter δ_conn bounds the maximum time required for the blockForConn model method to resolve a connection attempt and return whether or not the attempt succeeded. In more detail, when u calls blockForConn(v), for some neighbor v, the model guarantees to deliver a connection proposal to v. If v is already engaged in a connection (i.e., it has previously accepted a proposal and the resulting connection is still open), it will reject u's proposal. Otherwise, it will accept the proposal. The model must deliver the proposal and the response within this interval of length δ_conn. The loop blocks until this underlying communication completes and blockForConn can return the status of the connection. Notice that as in the synchronous model, these guarantees prevent any node from servicing more than one incoming connection at a time.

Communication Guarantees:
Assume v accepts u's connection proposal. At this point, they are connected and can communicate as specified by their respective Communicate methods. For many algorithms, such as the gossip strategy studied in this paper, we simply specify what occurs during this connection in the sender's Communicate routine. When implementing algorithms, however, this behavior must be explicitly specified for both the sender and receiver roles. The amount of time required by these interactions depends on both the amount of information transmitted by Communicate and the transmission rates determined by the model. We use the parameter R_b to bound the minimum bit rate at which the model can transmit information between a connected pair of neighbors. We assume that when a call to Communicate returns, the connection is closed. It follows that each node can participate in at most one outgoing connection at a time.

Implementation.
The authors in [21] provide an implementation of the aMTM interface in iOS. This implementation works with the peer-to-peer networking libraries included in iOS to execute the main aMTM loop. The algorithm designer working with this interface need only implement the Initialize, Update, Select, and Communicate functions. This close connection between the abstract aMTM model and real world implementation simplifies the task of deploying any peer-to-peer algorithm described in the aMTM on iPhones. To underscore the directness of this connection, we provide in Appendix D the straightforward SWIFT code that implements our gossip strategy in iOS.
B. The Asynchronous Random Diffusion Algorithm

We now introduce our asynchronous random diffusion gossip algorithm, which is formalized in Algorithm 3. This algorithm adapts the strategy of synchronous random diffusion gossip to the asynchronous setting. The major difference is that in each loop iteration, a node selects a neighbor for connection from the set of advertisements it has received since the last iteration. In the synchronous setting, by contrast, a node is always considering the latest advertisements from all of its neighbors.

C. Analysis
We prove the following bound on the time complexity of asynchronous random diffusion gossip.

Theorem IV.1.
With high probability in n, the asynchronous random diffusion gossip algorithm solves the gossip problem in time O((k/α) log n log ∆ · δ_max), where k is the number of tokens, n is the network size, α is the vertex expansion of the network, ∆ is the maximum degree, and δ_max upper bounds the time required for one iteration of the aMTM loop for this algorithm.

Algorithm 3: Asynchronous random diffusion gossip (for node u)

function Initialize()
  T ← initial tokens (if any) known by u
  H ← some hash function
function GetTag()
  return ⟨H(T), |T|, u⟩
function Select(A)
  ŝ ← min({s | ⟨h, s, ∗⟩ ∈ A, h ≠ H(T)})
  N̂ ← {v | ⟨h, ŝ, v⟩ ∈ A, h ≠ H(T)}
  return node chosen randomly from N̂
function Communicate(v)
  (send/receive a token in the set difference with v)

The δ_max parameter included in the above theorem was introduced to simplify notation by eliminating the need to cite multiple timing parameters in our complexity bound. Formally, we define: δ_max = δ_conn + δ_update + b_max/R_b, where b_max describes the maximum size (in bits) of a gossip token.

Notice, because our algorithm only transfers a constant number of tokens in each call to Communicate, each such call requires at most O(b_max/R_b) time. The δ_update and δ_conn parameters upper bound the time required to get through the update and blockForConn methods, respectively. It follows that each iteration of our gossip algorithm's main aMTM loop requires at most O(δ_max) time, making δ_max a useful aggregate parameter for bounding asynchronous time complexity.

For the analysis that follows, we re-purpose much of our notation and several of our definitions from the synchronous setting. We will accomplish this through a slight abuse of notation in which we take any element parameterized with an integer round r in the previous section and redefine it with respect to a real time t.
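To make the tag-and-select logic of Algorithm 3 concrete, here is a small Python sketch written under simplifying assumptions of our own: SHA-256 stands in for the unspecified hash H, tokens are strings, and the function names are ours rather than the paper's:

```python
import hashlib
import random

def tag(tokens, u):
    """GetTag: hash of the token set, the set's size, and the node id."""
    h = hashlib.sha256(",".join(sorted(tokens)).encode()).hexdigest()
    return (h, len(tokens), u)

def select(my_tokens, ads):
    """Select: among advertisers whose hash differs from ours (i.e., whose
    token set differs, so a connection would be productive), pick uniformly
    among those advertising the smallest token-set size."""
    my_h = tag(my_tokens, None)[0]
    productive = [(s, v) for (h, s, v) in ads if h != my_h]
    if not productive:
        return None
    s_min = min(s for s, _ in productive)
    return random.choice([v for s, v in productive if s == s_min])

# demo: v2 advertises an identical token set (same hash) and is skipped;
# v1 advertises a smaller, different set and is the unique candidate
my_tokens = {"a", "b"}
ads = [tag({"a"}, "v1"), tag({"a", "b"}, "v2")]
chosen = select(my_tokens, ads)
```

Note how comparing hashes rather than full token sets keeps advertisements small, which is the point of the ⟨H(T), |T|, u⟩ tag.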
For example, let T_u(t) be the token set of node u at time t, in the same way T_u(r) was u's token set at the beginning of round r. Similarly, we can adapt our notions of the productive subgraph G(t), and minimum productive subgraph G_min(t), for a time t using the values of s_u(t) and N_u(t).

(We omit for now the time required for two connected nodes to determine which token to transfer. Our algorithm simply specifies that they transfer some token in the set difference of their token sets. For the sake of completeness, one could add an additional parameter to capture the maximum bits needed to also decide on this set difference. We omitted this extra parameter for now as in the application scenarios we envision, the token sizes are often large enough that their transfer swamps the overhead required to identify which token to transfer. In the event that the token set sizes are allowed to become massive, however, we can leverage the token transfer subroutine from [19] to decide this set difference using only O(polylog(k)) additional bits.)

That being said, some additional care is required in dealing with these graphs in the asynchronous model. In a round-based setting, you can fix the productive subgraph at the beginning of the round and know that all nodes will make connection decisions based on that exact graph during the round. In the asynchronous model no such guarantees hold. You might fix a productive subgraph at some time t, for example, but that graph can change before all the nodes get a chance to learn it and make a connection decision.

To handle this nuance, we introduce our first pieces of notation unique to our asynchronous analysis. Fix some time t at which some node u calls Select. Let N̂_u(t) and ŝ_u(t) be the values calculated on Lines 7 and 8 of Algorithm 3, respectively, during this call to Select. These values are calculated from the advertisement set A_u(t), which is passed to node u's call to the Select function at time t.
Note that for a particular time t and node u, N_u(t) and N̂_u(t), and s_u(t) and ŝ_u(t), can differ, as N_u(t) and s_u(t) are based on the status of the network at exactly time t, whereas N̂_u(t) and ŝ_u(t) are based on the advertisement set passed to Select at time t (which may by that point already be out of date). Also note that N̂_u(t) and ŝ_u(t) are undefined for times that do not correspond to a Select call.

To help tame this reality that a given node's snapshot of the network can become out of date before it has a chance to act on it, we introduce the following definition concerning snapshots of the changing minimum productive subgraph:

Definition IV.2.
Fix a time interval [t₁, t₂] and two nodes u, v ∈ V such that {u, v} ∈ E_min(t₁). We say that u properly considers v with respect to t₁ during this interval if there exists a time t_consider, t₁ ≤ t_consider ≤ t₂, such that u calls Select at t_consider, v ∈ N̂_u(t_consider), and |N̂_u(t_consider)| ≤ deg_min(u), where deg_min(u) is the degree of u in G_min(t₁).

Put another way, if u properly considers v with respect to [t₁, t₂], then u attempts to connect with v in this interval with at least the same probability as it would in a round of the synchronous algorithm corresponding to minimum productive subgraph G_min(t₁).

It would simplify our analysis if for any time t₁ we could identify an interval [t₁, t₂] such that u properly considers v for every edge {u, v} ∈ E_min(t₁), as we could then directly apply our analysis from the synchronous case. We cannot, however, guarantee that such intervals always exist in our asynchronous setting. Consider an edge {u, v} ∈ E_min(t₁), for some t₁. It might be the case that before u can receive an advertisement from v, some other node connects to v and transmits a token that removes v from the minimum productive subgraph. By the time u subsequently hears from v, it might no longer include it in its set of productive neighbors.

In some sense, however, this is a good case, as it only increases the probability that v receives a connection attempt. The following lemma (for which the proof can be found in Appendix C1) formalizes this intuition by proving that for any endpoint v in a snapshot of the minimum productive subgraph, v will be selected with at least the probability that it would if we had run a round of the synchronous algorithm on that snapshot. This will allow us to subsequently apply Lemma III.6, which we carefully reworked from its original version in [13] so that it now only requires that this lower bound on selection probabilities holds. (The original version made use of the exact selection probabilities from the synchronous algorithm.)

Lemma IV.3.
Fix any time t₁ and node v ∈ R_min(t₁), and let N_min and deg_min be the neighbor set and degree functions defined for G_min(t₁). There exists a time t₂, where t₂ = t₁ + O(δ_max), such that v connects productively in the interval [t₁, t₂] with probability at least 1 − ∏_{u∈N_min(v)} (1 − 1/deg_min(u)).

With the above lemma, for any given time t₁, we have shown there is a time interval [t₁, t₂] that is not too long such that each node in L_min(t₁) behaves similarly to the nodes in the synchronous setting with respect to G_min(t₁). We now conclude by showing that this similarity is sufficient to apply the same analysis we used to prove Theorem III.1.

Proof (of Theorem IV.1).
The proof of our main theorem follows the same style of argument made by Lemmas III.5 through III.8 in our synchronous analysis. Instead of assuming synchronized rounds, however, we now characterize our algorithm's behavior over contiguous intervals of length ℓ = O(δ_max), where ℓ is selected to be long enough to allow Lemma IV.3 to apply to the intervals.

Let t_i be the time at which interval i begins. We treat each interval i like a round defined with respect to the minimum productive subgraph G_min(t_i). The main difference in this new setting versus the synchronous is that Lemma IV.3 provides only a lower bound on a node in R_min(t_i) being selected in interval i, whereas in the synchronous setting we know the exact probability of this event. Fortunately, our reworked version of Lemma III.6 requires only this lower bound. Indeed, much of the technical difficulty in reworking this lemma from its original form was to allow it to require only this lower bound instead of precise probabilities.

In more detail, the only property assumed of the algorithm by Lemma III.6 is that a node v in R_min(r) be selected with probability at least 1 − ∏_{u∈N_min(v)} (1 − 1/deg_min(u)). Since this is exactly what we showed in Lemma IV.3 for our asynchronous algorithm, Lemma III.6 applies to the graphs corresponding to our intervals. The remainder of the relevant lemmas in our synchronous analysis require only that Lemma III.6 holds. We can therefore apply these lemmas to our intervals to obtain a similar complexity bound for gossiping k tokens, with the only difference being that instead of bounding the number of rounds, we bound the number of intervals of length O(δ_max) that are required. □

References

[1] Gianluca Aloi, Marco Di Felice, Valeria Loscrì, Pasquale Pace, and Giuseppe Ruggeri. Spontaneous smartphone networks as a user-centric solution for the future internet. IEEE Communications Magazine, 52:26–33, 2014.
[2] Noga Alon, László Babai, and Alon Itai.
A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7:567–583, 1986.
[3] Peter Barry and Patrick Crowley. Modern Embedded Computing: Designing Connected, Pervasive, Media-Rich Systems. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1st edition, 2012.
[4] Keren Censor-Hillel, Bernhard Haeupler, Jonathan A. Kelner, and Petar Maymounkov. Rumor spreading with no dependence on conductance. SIAM J. Comput., 46:58–79, 2017.
[5] Flavio Chierichetti, Silvio Lattanzi, and Alessandro Panconesi. Rumour spreading and graph conductance. In Proceedings of the Symposium on Discrete Algorithms (SODA), pages 1657–1663. SIAM, 2010.
[6] Sebastian Daum, Fabian Kuhn, and Yannic Maus. Rumor spreading with bounded in-degree. In International Colloquium on Structural Information and Communication Complexity (SIROCCO), pages 323–339. Springer International Publishing, 2016.
[7] Michael Dinitz, Jeremy Fineman, Seth Gilbert, and Calvin Newport. Load balancing with bounded convergence in dynamic networks. In Proceedings of the International Conference on Computer Communications (INFOCOM), pages 1–9. IEEE, 2017.
[8] Michael Dinitz, Magnús M. Halldórsson, Calvin Newport, and Alex Weaver. The capacity of smartphone peer-to-peer networks. In Proceedings of the International Symposium on Distributed Computing (DISC), pages 14:1–14:17. Schloss Dagstuhl, 2019.
[9] Nikolaos Fountoulakis and Konstantinos Panagiotou. Rumor spreading on random regular graphs and expanders. In Proceedings of the International Conference on Approximation, and the International Conference on Randomization, and Combinatorial Optimization: Algorithms and Techniques (APPROX/RANDOM), pages 560–573. Springer-Verlag, 2010.
[10] Alan M. Frieze and Geoffrey R. Grimmett. The shortest-path problem for graphs with random arc-lengths. Discrete Applied Mathematics, 10:57–77, 1985.
[11] Open Garden. Firechat, 2018.
[13] Mohsen Ghaffari and Calvin Newport. How to discreetly spread a rumor in a crowd. In Proceedings of the International Symposium on Distributed Computing (DISC), pages 357–370, 2016.
[14] George Giakkoupis. Tight bounds for rumor spreading in graphs of a given conductance. In Proceedings of the Symposium on Theoretical Aspects of Computer Science (STACS), pages 57–68, 2011.
[15] George Giakkoupis. Tight bounds for rumor spreading with vertex expansion. In Proceedings of the Symposium on Discrete Algorithms (SODA), pages 801–815. SIAM, 2014.
[16] George Giakkoupis and Thomas Sauerwald. Rumor spreading and vertex expansion. In Proceedings of the Symposium on Discrete Algorithms (SODA), pages 1623–1641. SIAM, 2012.
[17] Adrian Holzer, Sven Reber, Jonny Quarta, Jorge Mazuze, and Denis Gillet. Padoc: Enabling social networking in proximity. Computer Networks, 111:82–92, 2016.
[18] Zongqing Lu, Guohong Cao, and Thomas La Porta. Networking smartphones for disaster recovery. In Proceedings of the International Conference on Pervasive Computing and Communications (PerCom), pages 1–9. IEEE, 2016.
[19] Calvin Newport. Gossip in a smartphone peer-to-peer network. In Proceedings of the Symposium on Principles of Distributed Computing (PODC), pages 43–52. ACM, 2017.
[20] Calvin Newport. Leader election in a smartphone peer-to-peer network. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), pages 172–181. IEEE, 2017.
[21] Calvin Newport and Alex Weaver. Random gossip processes in smartphone peer-to-peer networks. In Proceedings of the International Conference on Distributed Computing in Sensor Systems (DCOSS), pages 139–146. IEEE, 2019.
[22] D.G. Reina, Mohamed Askalani, S.L. Toral, Federico Barrero, Eleana Asimakopoulou, and Nik Bessis. A survey on multihop ad hoc networks for disaster response scenarios. International Journal of Distributed Sensor Networks, 11:647037, 2015.
[23] Noriyuki Suzuki, Jane Louie Fresco Zamora, Shigeru Kashihara, and Suguru Yamaguchi. Soscast: Location estimation of immobilized persons through sos message propagation. In Proceedings of the International Conference on Intelligent Networking and Collaborative Systems (INCoS), pages 428–435. IEEE, 2012.

Appendix

A. Omitted Proofs from Section II

1) Proof of Lemma II.3:
For each i ∈ [T], let X̂_i be an indicator variable that is 1 with probability p and 0 otherwise. We now define a process for generating a coupled distribution where the values sampled are pairs of bits. Namely, we sample T pairs (Y_i, Ŷ_i) where Ŷ_i is 1 with probability p and 0 otherwise. If Ŷ_i is 1, we set Y_i to 1 as well. Otherwise, we set Y_i to 1 with probability (q_i − p)/(1 − p). In this way, the marginal probability that Y_i = 1 is q_i, the same success probability as our original indicator variable X_i. Clearly X̂_i and Ŷ_i are also 1 with the same probability p. Therefore, since for any T-sequence execution where Ŷ = ∑_{i=1}^T Ŷ_i and Y = ∑_{i=1}^T Y_i it's true that Y ≥ Ŷ, it follows that as long as Ŷ is at least some value, then so is Y. In other words, it's sufficient to lower bound the value of Ŷ to derive a lower bound for the number of successes in the series X_1, ..., X_T.

For the expectation E[Ŷ] = pT, we can apply the Chernoff bound from Theorem II.2 with ε = 1/2 to bound the probability that Ŷ is less than pT/2:

Pr[Ŷ ≤ (1/2)·pT] = Pr[Ŷ ≤ Θ(pT)] ≤ exp(−Θ(pT)/8) = O(exp(−pT))

Therefore, with very high probability in pT, Y ≥ Ŷ = Ω(pT).

B. Omitted Proofs from Section III

1) Proof of Lemma III.2:

Proof.
When C(r) = 1, all nodes u ∈ V have the same number of tokens, therefore s_u(r) = |T_v(r)| for all edges {u,v} ∈ E. Clearly, if not all nodes have k tokens, then there is some node u with some neighbor v such that H(T_u) ≠ H(T_v). Furthermore, since s_u(r) = |T_v(r)| for all edges {u,v} ∈ E, v ∈ N_u(r). Therefore, since |N_u(r)| > 0, u will send a connection proposal to some neighbor w ∈ N_u(r).

If w receives a connection proposal from u, w is guaranteed to accept at least one connection proposal this round and participate in at least one connection this round (initiated by a neighbor's proposal). Therefore, after this round w will possess a new token such that |T_w(r+1)| > i_min(r). There are now two possibilities: ∀u ∈ V, |T_u(r+1)| ≥ |T_w(r+1)|, or ∃u ∈ V, |T_u(r+1)| < |T_w(r+1)|. In the first case, i_min(r+1) > i_min(r) since all nodes possess more than i_min(r) tokens. In the second case, C(r+1) > 1 since u has fewer tokens than w. □
2) Proof of Lemma III.5:
Fix the cut (S_min(r), V \ S_min(r)) and recall that n*_min(r) is the size of the smaller of the two bipartitions. From Lemma II.1 we know that there is a matching M of size at least (α/4) · n*_min(r) across this cut. Consider an arbitrary edge {u,v} ∈ M and without loss of generality assume u ∈ V \ S_min(r) and v ∈ S_min(r).

By the definition of S_min(r), no node has fewer tokens than v and therefore v must have the smallest token set size out of all u's neighbors. Since v ∈ N(u), this means that s_u(r) = |T_v(r)|, which implies that H(T_u(r)) ≠ H(T_v(r)). This is sufficient to show that {u,v} is in the productive subgraph.

Furthermore, clearly N(u) ∩ S_min(r) ≠ ∅ and N(v) ∩ (V \ S_min(r)) ≠ ∅, and therefore u ∈ L_min(r) and v ∈ R_min(r). Therefore, it must be the case that {u,v} is in the minimum productive subgraph as well. Since we can show this for any arbitrary edge {u,v} ∈ M, it is true for every such edge in the matching. Therefore M is also a matching of size at least (α/4) · n*_min(r) over G_min(r).
3) Proof of Lemma III.7:
We can now apply the same reasoning as the proof of Theorem 7.2 in [13] to show that applying Lemma III.6 inductively over O(log ∆) rounds achieves our desired number of connections. We summarize this argument here.

Fix a round r and apply the first iteration of Lemma III.6 at the beginning of this round. For the i-th application of Lemma III.6, let m_i be the size of the maximum matching over G(L,R) in this iteration. By the lemma statement, with at least constant probability, either Ω(m_i/log ∆) nodes of R_min(r) participate in a productive connection or we can identify some subgraph G(L′′,R′′) defined according to the second lemma objective. If the latter holds, this graph G(L′′,R′′) becomes the new G(L,R) for the (i+1)-st application of the lemma. By our construction, G(L,R) in the (i+1)-st application satisfies conditions (a), (b), and (d) of objective 2 of Lemma III.6. The only condition that is not trivially satisfied is therefore (c). To ensure that m_i is not too small compared to the size m of our original matching over G(L,R), we note that for all i, m_i ≥ (1 − 1/log ∆)^i · m ≥ c · m. Therefore, since i ≤ 32 · log ∆, this expression is made valid by setting c = exp(−64) (please note that while we make no effort to do so here in lieu of a clearer probabilistic analysis, this constant can certainly be optimized). Therefore, m_i = Ω(m) for any inductive step i, and so if the first objective of the lemma is satisfied on any iteration, at least Ω(m/log ∆) nodes in R_min(r) participate in a productive connection.

Furthermore, notice that after 32 · log ∆ steps where the second objective of the lemma is satisfied, the degree sum of the final graph is at most m_i. Therefore each node in L only has one neighbor to choose from, such that the number of productive connections with nodes in R_min(r) is trivially m_i = Ω(m_i/log ∆). Therefore, it holds that after 32 · log ∆ steps in which at least one of the lemma objectives is satisfied, at least Ω(m/log ∆) nodes in R_min(r) participate in a productive connection (again where m is the maximum size of the matching over G_min(r)).

Call any round where at least one of the objectives of Lemma III.6 is satisfied a success. Since we know each round is successful with at least constant probability, we can apply the stochastic dominance argument from Lemma II.3 to demonstrate (with high probability in ∆) that at most O(log ∆) steps are required to achieve 32 · log ∆ successes. Therefore at most O(log ∆) rounds are required before Ω(m/log ∆) nodes in R_min(r) participate in a productive connection.

Finally, from Lemma III.5 we know that for any fixed round r there is a matching over the minimum productive subgraph of size (α/4) · n*_min(r). Therefore, m ≥ (α/4) · n*_min(r), and so at least Ω((α/log ∆) · n*_min(r)) nodes of R_min(r) ⊆ S_min(r) participate in a productive connection after at most O(log ∆) rounds.
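The stochastic dominance argument from Lemma II.3 invoked above can be illustrated with a short simulation. This is a sketch under our own naming and not the paper's notation: each trial succeeds with some probability q_i ≥ p, and coupling each trial to a Bernoulli(p) bit forces the true success count Y to dominate the coupled count Ŷ in every sample:

```python
import random

def coupled_trials(qs, p, rng):
    """One coupled sample: yhat ~ Bernoulli(p); y is forced to 1 whenever
    yhat = 1, and otherwise is 1 with probability (q - p)/(1 - p), so that
    y ~ Bernoulli(q) marginally while y >= yhat pointwise."""
    Y = Yhat = 0
    for q in qs:
        yhat = rng.random() < p
        y = yhat or (rng.random() < (q - p) / (1 - p))
        Yhat += yhat
        Y += y
    return Y, Yhat

rng = random.Random(0)
qs = [0.5, 0.7, 0.9] * 50           # per-round success probabilities >= p
samples = [coupled_trials(qs, 0.4, rng) for _ in range(100)]
```

Because Y ≥ Ŷ holds sample by sample (not merely in expectation), any Chernoff lower-tail bound on Ŷ transfers directly to Y, which is exactly how the success count over rounds is bounded above.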
4) Helper Lemma to Support Lemma III.8:
Lemma A.1.
Fix a round r > such that n − n min ( r ) ≤ n / .With high probability in n, after at most O ((1 /α ) log n log ∆ ) rounds, more than half of the nodes posses more than i min ( r ) tokens. Call a phase p i of O (log ∆ ) rounds successful if at least O (( α/ log ∆ ) · n ∗ min ( r i )) nodes of S min ( r i ) participate in a pro-ductive connection where r i is the first round of p i . We observe t successful phases p , . . . , p t while at most half the nodes havemore than the minimum token set size. Our goal will be toshow that there can only be so many of these phases before atleast half the nodes in the network possess more than i min ( r )tokens.Notice that for any integers i and j such that i < j , n − n min ( r i ) ≤ n − n min ( r j ), since we can only grow the numberof nodes with more than the minimum number of tokens.Furthermore, recall that for any i such that n min ( r i ) ≤ n / n min ( r i ) = n ∗ min ( r i ). Therefore, if m i is the size ofthe maximum matching over G min ( r i ), for any i and j where i < j and n ∗ min ( r i ) ≤ n ∗ min ( r j ) ≤ n / m i ≤ m j .Therefore (by Lemma III.5) we have that the i th successfulphase achieves Ω (( α/ log ∆ ) · n ∗ min ( r i )) ≥ Ω (( α/ log ∆ ) · n ∗ min ( r ))productive connections.Therefore at most O (log ∆ /α ) successful phases are requireduntil the number nodes with more than the minimum numberof tokens grows by a constant fraction. We can now grouptogether T sequences of O (log ∆ /α ) phases s , . . . , s T andsolve for T such that n ∗ min ( r ) · (1 + Ω (1)) T ≥ n / T ≤ O (log n ) sequences for a total of O ((1 /α ) log n log ∆ ) total phases. Finally, we bound how manyof these phases must pass until until we achieve a su ffi cientnumber of successful phases. 
To this end, we apply the stochastic dominance argument from Lemma II.3, using the constant probability lower bound introduced by Lemma III.6. This gives us, with high probability in n, a final phase complexity of O((1/α) log n log ∆) phases and a final round complexity of O((1/α) log n log ∆) total rounds.
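As a quick numerical illustration of the counting step above, the following sketch solves the growth recurrence n*_min(r) · (1 + Ω(1))^T ≥ n/2 for T. The concrete growth constant c = 1/2 is an illustrative assumption standing in for the hidden 1 + Ω(1) factor.

```python
import math

def sequences_needed(n: int, n0: int, c: float = 0.5) -> int:
    """Smallest T with n0 * (1 + c)^T >= n / 2.
    Since the set grows by a constant factor each sequence, T = O(log n)."""
    T = 0
    size = float(n0)
    while size < n / 2:
        size *= (1 + c)
        T += 1
    return T
```

For example, starting from a single node with c = 1/2, T matches ceil(log base 1.5 of n/2), and doubling n increases T by only an additive constant, consistent with T ≤ O(log n).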
5) Proof of Lemma III.8:
In Lemma A.1 we showed that when n − n_min(r) ≤ n/2 for some round r > 0, by periodically growing the set of nodes with more than the fewest number of tokens by a constant fraction, at most O((1/α) log n log ∆) rounds are required until n_min(r) ≤ n/2. A symmetric argument can be made in the case that n_min(r) ≤ n/2: we shrink the set of nodes with the fewest number of tokens by a constant fraction until at most a constant number remain.

Let s_1, . . . , s_T be several sequences, where each s_i = p_1, . . . , p_t is made up of O(log ∆/α) phases, and each phase p_i is made up of O(log ∆) rounds. While in Lemma A.1 we lower bounded the size of the matching in each phase of each sequence by the first round of p_1 of s_1, we now lower bound the size of these matchings by the last round of p_t of s_T. The maximum matching over the minimum productive subgraph with respect to this round gives us a lower bound on the number of connections achieved in each successful phase up to this round.

Therefore, in order to solve for T in this case we solve the expression (n − n_min(r)) · (1 − Ω(1))^T ≤ O(1). Again, this expression indicates that in the i-th sequence we reduce the number of nodes in S_min(r) by a constant fraction of n*_min(r′), where r′ is the round at the beginning of the (i + 1)-st sequence. Solving yields T ≤ O(log n). The rest of the proof is the same as that of Lemma A.1, yielding a final round complexity of O((1/α) log n log ∆) total rounds to inform all but one node, which is then trivially connected to in at most one more round.
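The shrinking recurrence above can be illustrated numerically in the same spirit as the growth case. The decay constant c = 1/2 below is an illustrative assumption standing in for the hidden 1 − Ω(1) factor.

```python
def sequences_until_constant(nodes_with_min: int, c: float = 0.5) -> int:
    """Smallest T with nodes_with_min * (1 - c)^T <= 1.
    Shrinking by a constant factor per sequence gives T = O(log n)."""
    T = 0
    size = float(nodes_with_min)
    while size > 1:
        size *= (1 - c)
        T += 1
    return T
```

With c = 1/2, a set of 2^20 nodes shrinks to a single node in exactly 20 sequences, so T grows logarithmically in the starting set size.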
6) Proof of Lemma III.9:
Let q = nα. Construct our graph G with vertex expansion at least α by creating a q-clique of nodes and connecting each of the remaining n − q nodes to every node in the q-clique. This graph is equivalent to a star graph when q = 1 (α = 1/n) and to a clique when q = n (α = 1). Recall that to find the vertex expansion of a graph, the goal is to minimize the quantity |∂(S)|/|S| over all cuts S of size at most n/2.

When considering all possible cuts of our graph, we can choose to either include nodes from the q-clique or nodes not in the clique (or both). If we select any node from the q-clique, by our construction, every remaining node is now in ∂(S). Therefore, the only freedom we have to minimize |∂(S)|/|S| is to increase the size of S so as to maximize the denominator. However, the minimum value we can derive is still at least 1 ≥ α, since we can include at most n/2 nodes in S.

Our only remaining option is then to try to minimize |∂(S)|/|S| by not including any nodes from the q-clique in our set S. As soon as we include a single node outside the q-clique, |∂(S)| = q. Furthermore, we are compelled to include up to n/2 nodes from outside the q-clique. This minimizes the target quantity, since it has no effect on the numerator and it maximizes the denominator. However, even when the quantity is minimized in this way, it is always the case that |∂(S)|/|S| ≥ q/(n/2). Since q = nα, this quantity is at least 2α, satisfying the condition of the lemma statement.

We will now show that it takes Ω(k/α) rounds to spread k tokens to all n nodes in G. We begin by providing all k tokens to every node in the q-clique. To solve the gossip problem, all k tokens must be delivered to the n − q ≥ n/2 nodes outside the clique, requiring at least kn/2 token deliveries in total. Since at most q connections can occur per round (nodes outside the clique are not connected to each other, and the nodes in the clique are limited to at most one connection each per round), a total of at least kn/(2q) rounds are required. Substituting q = nα then gives the needed lower bound on the total number of rounds: Ω(kn/(2q)) = Ω(kn/(2nα)) = Ω(k/(2α)) = Ω(k/α).

C. Omitted Proofs from Section IV

1) Proof of Lemma IV.3:
Fix any time t and node v as specified in the lemma statement. Fix t′ to be the minimum time after t that is sufficiently large to guarantee that for every pair of neighbors {x, y} in the underlying network topology, y receives an advertisement from x that was passed to x's update method at some time greater than or equal to t, and y has time to call Select after receiving at least one such advertisement. Clearly, t′ = t + O(δ_max).

We will consider all possible executions of our algorithm over the interval [t, t′]. We partition these executions into two disjoint event spaces with respect to v. The first space, which we will denote A, will contain all executions in which every node u ∈ N_min(v) properly considers v during the interval [t, t′]. (Recall that in the lemma statement we define N_min(v) to be the neighbor set of v in G_min(t), and deg_min(v) = |N_min(v)|.) The second space, Ā, will then simply be the complement of A, containing all other executions.

Begin with some execution a ∈ A. Recall that by the definition of A, every node in N_min(v) properly considers v during the interval [t, t′] in execution a. Fix one such neighbor u ∈ N_min(v). There is some time t_u in our interval such that at this time, u makes a call to Select, during which v ∈ N̂_u(t_u) and |N̂_u(t_u)| ≤ deg_min(u). During this call, u will select v for a connection attempt with probability 1/|N̂_u(t_u)| ≥ 1/deg_min(u). It follows that the probability that u does not send v a proposal is at most 1 − 1/deg_min(u).
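As an aside, the per-neighbor selection probability just derived is easy to check with a small Monte Carlo sketch. The function names below are illustrative stand-ins, and we assume, purely for illustration, that every neighbor u of v has the same candidate-list size deg(u) = deg, with v among the candidates.

```python
import random

def select(candidates, rng):
    """Stand-in for Select: choose a proposal target uniformly at random."""
    return rng.choice(candidates)

def estimate_proposal_prob(num_neighbors: int, deg: int,
                           trials: int = 50_000) -> float:
    """Estimate Pr[v receives >= 1 proposal] when each of v's
    num_neighbors neighbors picks uniformly among deg candidates
    that include v (index 0 represents v)."""
    rng = random.Random(7)
    candidates = list(range(deg))
    hits = 0
    for _ in range(trials):
        if any(select(candidates, rng) == 0 for _ in range(num_neighbors)):
            hits += 1
    return hits / trials
```

With num_neighbors = deg = 8, the estimate lands near 1 − (1 − 1/8)^8 ≈ 0.656, matching the product bound derived in the proof.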
Therefore, we can bound the probability of event X, where X denotes that v receives at least one connection proposal, as follows:

Pr[¬X] ≤ ∏_{u ∈ N_min(v)} (1 − 1/deg_min(u))
Pr[X] ≥ 1 − ∏_{u ∈ N_min(v)} (1 − 1/deg_min(u))

Furthermore, by the guarantees of the aMTM, having received at least one connection proposal, v is guaranteed to accept at least one. Therefore, v participates in a productive connection with at least the above probability.

We now consider some execution a ∈ Ā. By the definition of Ā, there must be in a some node u ∈ N_min(v) such that u does not properly consider v in the interval [t, t′]. Fix t_select to be the first time that u calls Select after receiving an advertisement from v that was passed to update at a time greater than or equal to t. By our definition of t′, we can always identify a time t_select that satisfies these properties in [t, t′].

By assumption, we know that u does not properly consider v during the call to Select at t_select. We consider the two possible reasons for this behavior, and show that in both cases v must have already participated in a productive connection between t and t_select.

The first possible reason is that v ∉ N̂_u(t_select). By the definition of our algorithm, the minimum token set size in the network can never decrease. It follows that if v ∉ N̂_u(t_select), then v must have learned at least one token since t. It follows that v participated in a productive connection since t.

The second reason that u might not properly consider v would be if v ∈ N̂_u(t_select), but N̂_u(t_select) is too large, such that |N̂_u(t_select)| > deg_min(u). However, at time t, exactly deg_min(u) neighbors of u had a token set size of i_min(t) (by the definition of G_min(t)), with all other neighbors of u in G having strictly more tokens.
Since nodes cannot lose tokens, the number of u's neighbors with at most i_min(t) tokens can never increase. If |N̂_u(t_select)| > deg_min(u), then ŝ_u(t_select) > i_min(t), from which it follows that v, along with all of u's neighbors with token set size i_min(t) at time t, must have received at least one token since t, meaning it participated in a productive connection.

We have shown, therefore, that for any a ∈ Ā, v participates in a productive connection in [t, t′] in a with probability 1. Pulling together these pieces, we have partitioned the possible executions in the interval [t, t′] into two sets. In both sets, the probability of v participating in a productive connection is at least 1 − ∏_{u ∈ N_min(v)} (1 − 1/deg_min(u)), as required by the lemma statement.

D. Swift Implementation of the Asynchronous Random Diffusion Gossip Algorithm