Coordination Complexity: Small Information Coordinating Large Populations
Rachel Cummings, Katrina Ligett, Jaikumar Radhakrishnan, Aaron Roth, Zhiwei Steven Wu
August 6, 2018
Abstract
We study a quantity that we call coordination complexity. In a distributed optimization problem, the information defining a problem instance is distributed among n parties, who need to each choose an action, which jointly will form a solution to the optimization problem. The coordination complexity represents the minimal amount of information that a centralized coordinator, who has full knowledge of the problem instance, needs to broadcast in order to coordinate the n parties to play a nearly optimal solution.

We show that upper bounds on the coordination complexity of a problem imply the existence of good jointly differentially private algorithms for solving that problem, which in turn are known to upper bound the price of anarchy in certain games with dynamically changing populations.

We show several results. We fully characterize the coordination complexity for the problem of computing a many-to-one matching in a bipartite graph by giving almost matching lower and upper bounds. Our upper bound in fact extends much more generally, to the problem of solving a linearly separable convex program. We also give a different upper bound technique, which we use to bound the coordination complexity of coordinating a Nash equilibrium in a routing game, and of computing a stable matching.

∗ Computing and Mathematical Sciences, California Institute of Technology. Email: {rachelc,katrina}@caltech.edu. Supported in part by NSF grant CNS-1254169, US-Israel Binational Science Foundation grant 2012348, the Charles Lee Powell Foundation, a Google Faculty Research Award, an Okawa Foundation Research Grant, a Microsoft Faculty Fellowship, and a Simons Award for Graduate Students in Theoretical Computer Science.

† School of Technology and Computer Science, Tata Institute of Fundamental Research. Email: [email protected]. A portion of this work was done while the author was visiting the Simons Institute for Theory of Computing in Berkeley, CA.

‡ Computer and Information Science, University of Pennsylvania. Email: {aaroth,wuzhiwei}@cis.upenn.edu. Supported in part by NSF Grant CCF-1101389, an NSF CAREER award, and an Alfred P. Sloan Foundation Fellowship.

1 Introduction
In this paper, we study a quantity which we call coordination complexity. This quantity measures the amount of information that a centralized coordinator needs to broadcast in order to coordinate n parties, each with only local information about a problem instance, to jointly implement a globally optimal solution. Unlike in communication complexity, there is no need for the communication protocols in our setting to derive the optimal solution starting with nothing but local information, nor even to verify that a proposed solution is optimal (as is the goal in non-deterministic communication complexity). Instead, in our setting, there is a central coordinator who already has complete knowledge of the problem instance — and hence also of the optimal solution. His goal is simply to publish a concise message to guide the n parties making up the problem instance to coordinate on the desired solution — ideally using fewer bits than would be (trivially) needed to simply publish the optimal solution itself.

Aside from its intrinsic interest, our motivation for studying this quantity is two-fold. First, as we show, problems with low coordination complexity also have good protocols for implementing nearly optimal solutions under the constraint of joint differential privacy [DMNS06, KPRU14] — i.e., protocols that allow the joint implementation of a nearly optimal solution in a manner such that no coalition of parties can learn much about the portion of the instance known by any party not in the coalition. The existence of jointly differentially private protocols has in turn recently been shown to imply a low "price of anarchy" for no-regret players in the strategic variant of the optimization problem when the game in question is smooth — even when the population is dynamically changing [LST15].
Hence, as a result of the connection we develop in this paper, in order to show dynamic price of anarchy bounds of the sort given in [LST15], it is sufficient to show that the game in question has low coordination complexity, without needing to directly develop and analyze differentially private algorithms. Using this connection we also derive new results for what can be implemented under the constraint of pure joint differential privacy — results that were previously only known subject to approximate joint differential privacy.

Second, coordination complexity is a stylized measure of the power of concise broadcasts (e.g. prices in the setting of allocation problems, or congestion information in the setting of routing problems) to coordinate populations in the absence of any interaction. Here we note that prices seem to coordinate markets, despite the fact that individuals do not actually participate in any kind of interactive "Walrasian mechanism" of the sort that would be needed to compute the allocation itself, in addition to the prices (see e.g. [KC82, DNO14]). Indeed, prices alone are generally not sufficient to coordinate high welfare allocations, because prices on their own can induce a large number of indifferences that might need to be resolved in a coordinated way — and hence Walrasian equilibria are defined not just as vectors of equilibrium prices, but as vectors of prices paired with optimal allocations. Publishing a Walrasian equilibrium would be a trivial solution in our setting, because it involves communicating the entire solution that we wish to coordinate — the optimal allocation. Nevertheless, we show that the coordination complexity of the allocation problem is — up to log factors — equal to the number of types of goods in a commodity market.
This is the same as what would be needed to communicate prices (indeed, our solution can be viewed as communicating prices in a slightly different, "regularized" market), and can be substantially smaller than what would be needed to communicate the optimal allocation itself.

(Within our framework of coordination complexity, we assume that the players are not strategic: they will faithfully follow the coordination protocol upon observing the message broadcast by the coordinator. We do study the interface between coordination complexity and the strategic variants of some problems in Section 5. Of course, the connection to markets is in a stylized model — in a market, there is not in fact any party with complete information of the problem instance — but the market is nevertheless encoding good "distributional information" about the population of buyers likely to arrive.)

1.1 Our Results and Techniques

In our model (which we formally define in Section 2), a problem instance D is defined by an n-tuple from some abstract domain X: D ∈ X^n. We write D = (D^(1), . . . , D^(n)) to denote the fact that the information defining the problem instance is partitioned among n agents, and each agent i knows only his own part D^(i). The solution space is also a product space, A^n, and each agent i can choose a single action a_i ∈ A – the choices of all of the agents jointly form a solution a = (a_1, . . . , a_n). The coordinator knows the entire problem instance D, and publishes a signal σ(D) ∈ {0, 1}^ℓ. Each agent then chooses an action a_i := π(D^(i), σ(D)) based only on the coordinator's signal and her own part of the problem instance. The jointly induced solution a = (a_1, . . . , a_n) is the output of the interaction.
The pair of functions (σ, π) jointly form a protocol, and ℓ, the length of the coordinator's signal, is the coordination complexity of the protocol. The coordination complexity of a problem is the minimal coordination complexity of any protocol solving the problem.

A canonical example to keep in mind is many-to-one matchings: here, a problem instance is defined by a bipartite graph between n agents and k types of goods. Each good j has a supply s_j, and the goal is to find a maximum cardinality matching such that no agent is matched to more than one good, and no good j is matched to more than s_j agents. Here, the portion of the instance known to agent i is the set of goods adjacent to agent i – but nothing about the goods adjacent to other agents. Note that describing a matching requires Ω(n log k) bits, which is the trivial upper bound on the coordination complexity for this problem. For this problem, we show nearly matching upper and lower bounds: no protocol with coordination complexity o(k) can guarantee a constant approximation to the optimal solution, whereas there is a protocol with coordination complexity O(k log n) that can obtain a (1 + o(1))-approximation to the optimal solution. Our upper bound in fact extends much more generally, to any problem that can be written down as a convex program whose objective and constraints are linearly separable between agents' data.

The idea of the upper bound is to broadcast a portion of the optimal dual solution to the convex program – one dual variable for every constraint that is defined by the data of multiple agents (there is no need to publish the dual variables corresponding to constraints that depend only on the data of a single agent). For the many-to-one matching problem, these dual variables correspond to "prices" – one for each of the k types of goods. This idea on its own does not work, however, because a dual optimal solution to a convex program is not generally sufficient to specify the primal optimal solution.
When specialized to the case of matchings, this is because optimal "market clearing prices" can induce a large number of indifferences among goods for each of the n agents, and these indifferences might need to be broken in a coordinated way to induce an optimal matching. To solve this problem, we instead release the dual variables corresponding to a slightly different convex program, in which a strongly convex regularizer has been added to the objective. The effect of the strongly convex regularizer is that the optimal dual solution now uniquely specifies the optimal primal solution – although now the optimal primal solution to a modified problem. The rest of our approach deals with trading off the weight of the regularizer with the number of bits needed to approximately specify each of the dual variables, and the error of the regularized optimal solution relative to the optimal solution to the original problem.

We also give several other positive results, based on a different technique: broadcasting the truncated transcript of a process known to converge to a solution of interest. Using this technique, we give low coordination complexity protocols for the problem of coordinating on an equilibrium in a routing game, and for the problem of coordinating on a stable many-to-one matching.

Finally, we show that problems that have both low sensitivity objectives (as all of the problems we study do) and low coordination complexity also have good jointly differentially private protocols. Using the results of [LST15], this also shows a bound on the price of anarchy of the strategic variant of these problems, whenever they are smooth games, which holds even under a dynamically changing population.

1.2 Related Work

Our model of coordination complexity is related to, but distinct from, the well-studied notion of communication complexity — see [KN97] for a textbook introduction.
While both complexity notions measure the number of bits that must be transmitted among decentralized parties to reach a particular outcome, they differ in the initial endowment of information, as well as in the requirements of each player to know the final outcome. In communication complexity, the information describing the problem instance is fully distributed, and communication is necessary for all parties to know the outcome. Coordination complexity, in contrast, assumes the existence of a coordinator who knows the entire problem instance, and must broadcast information to the players which will allow them to each compute their part of the output – there is no need for any of the parties to know the entire output. More similar to our setting is non-deterministic communication complexity, in which we may imagine that there is an oracle who knows the inputs of all players and broadcasts a message (perhaps partially) describing a solution together with a certificate that allows the parties to verify the optimality of the solution. In contrast, in our model of coordination complexity, the coordinator does not need to provide any certificate allowing parties to verify that the coordinated solution is optimal (indeed, each party need not have any information about the portion of the solution proposed to other parties).

The informational requirements of coordinating matchings have a long history of study in economics, and have recently gained attention in theoretical computer science. Hayek's classic paper [Hay45] conjectured that Walrasian price mechanisms, which coordinate matchings via a tâtonnement process that updates market-clearing prices based on demand, are "informationally efficient," in that they verify optimal allocations with the least amount of information. This was later formalized by [Hur60] and [MR74] in specific settings of interest, using an informational metric that measured smooth real-valued communication.
Nisan and Segal study the communication complexity of matchings using the tools of communication complexity as developed in computer science, and show that any communication protocol that determines an optimal allocation must also determine supporting prices [NS06]. Recently, [DNO14] and [ANRW15] studied the problem of computing an optimal matching through the lens of interactive communication complexity, showing that interactive protocols can have significantly lower communication complexity than non-interactive ones. Note that the communication complexity bounds given in these papers are always larger than the description length of the matching itself – in contrast, here when we study coordination complexity, nontrivial bounds must not just be smaller than the input, but must also be smaller than the size of the optimal matching.

Finally, there are two papers that study a very similar setting to ours, although they obtain rather different results. Calsamiglia [Cal84] studies a real-valued communication model in which a central coordinator with full knowledge of the instance needs to broadcast a concise signal to coordinate an allocation in an exchange market — see [Seg06] for context on how this result fits into the economic literature on communication complexity. Deng, Papadimitriou, and Safra also study a similar model in Section 4 of [DPS02], which they call "Market Communication". Despite the similarity in models, the results of both [Cal84] and [DPS02] stand in sharp contrast to ours — they both give lower bounds, showing that the amount of communication necessary needs to grow linearly with the number of buyers n, while we give upper bounds showing that it is necessary to grow only with the number of different types of goods k. Calsamiglia does not allow approximation in his model, which is necessary for our results. Deng, Papadimitriou, and Safra allow for approximation.

(We thank Ilya Segal for pointing out [Cal84] to us, and thank Sepehr Assadi for pointing out Section 4 of [DPS02].)
A line of work [KPRU14, HHRW14, RRUW15] has studied protocols for implementing outcomes in various settings under the constraint of joint differential privacy [DMNS06, KPRU14], which allows n parties to jointly implement some solution while ensuring that no coalition of parties can learn much about the input of any party outside the coalition. Most (but not all) of these algorithms are actually private coordination protocols of the sort we study here, in which the algorithm can be viewed as a coordinator who is constrained to broadcast a private signal. These jointly private algorithms are not constrained to transmit a short signal – and indeed, the private signals can sometimes be verbose. But as we show, problems with low coordination complexity also have good jointly differentially private algorithms, which was one of our original motivations for studying this quantity.

Lykouris, Syrgkanis, and Tardos [LST15] show that the existence of a jointly differentially private algorithm for solving an optimization problem implies that the strategic variant of the problem has a low "price of anarchy" for learning agents, even in dynamic settings in which player types change over time, as long as the game is smooth. Because we show in Section 5 that any problem with a low sensitivity objective and low coordination complexity has a good jointly differentially private algorithm, using the results of [LST15], to prove a bound on the price of anarchy in a smooth dynamic game it suffices to bound the coordination complexity of the game.

2 Model

A coordination problem is defined by a set of n agents, a data domain X, an action range A, and a social objective function S : X^n × A^n → R. An instance of a coordination problem consists of a set of n elements from the data domain: D = (D^(1), . . . , D^(n)) ∈ X^n.
Each agent i has knowledge only of D^(i), his own portion of the problem instance, and the goal is for a centralized coordinator to broadcast a concise message to the agents that allows them to arrive at a solution a = (a_1, . . . , a_n) ∈ A^n that approximately maximizes the objective function S(D, a).

A coordination protocol consists of two functions, an encoding function σ : X^n → {0, 1}^* and a decoding function π : X × {0, 1}^* → A. A coordination protocol (σ, π) proceeds in two stages:

• First, the coordinator broadcasts the message σ(D) to all agents using the encoding function.

• Then each agent selects an action a_i on the basis of her own portion of the problem instance and the broadcast message, using the decoding function: a_i := π(D^(i), σ(D)).

Both functions σ and π may be randomized. The approximation ratio of a protocol is the ratio of the optimal objective value to the expected objective value of the solution induced by the protocol, in the worst case over problem instances.

Definition 1 (Approximation Ratio). A coordination protocol (σ, π) obtains a ρ approximation to a problem if

    max_{D ∈ X^n}  OPT(D) / E_{a_1,...,a_n}[S(D, a)]  ≤  ρ,

where each a_i = π(D^(i), σ(D)), and the expectation is taken over the randomness of σ and π.

The coordination complexity of a protocol is the maximum number of bits the encoding function broadcasts, in the worst case over problem instances.

Definition 2 (Coordination Complexity). A coordination protocol (σ, π) has coordination complexity ℓ if

    max_{D ∈ X^n} |σ(D)| = ℓ.

The coordination complexity of obtaining a ρ approximation to a problem is the minimum value of the coordination complexity over all protocols (σ, π) that obtain a ρ approximation to the problem.

We conclude by making several observations about coordination protocols. First, as we have defined them, they are non-interactive – the coordinator first broadcasts a signal, and then the agents respond.
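As a concrete (and deliberately trivial) illustration of Definitions 1 and 2, the following sketch is our own invention, not a protocol from the paper: each agent privately holds a bit, and the social objective counts agents who play the instance's majority bit. Broadcasting the single majority bit coordinates an optimal solution, so this protocol has coordination complexity 1 and approximation ratio 1.

```python
from collections import Counter

# Toy coordination problem (illustrative only): agent i privately
# holds a bit D[i]; S(D, a) counts agents whose action equals the
# instance's most common bit.

def sigma(D):
    """Encoding function: the coordinator sees the whole instance D
    and broadcasts a message (here, the majority bit as a string)."""
    majority = Counter(D).most_common(1)[0][0]
    return str(majority)

def pi(D_i, message):
    """Decoding function: agent i chooses an action from her own
    data D_i and the broadcast message alone."""
    return int(message)  # every agent plays the broadcast bit

def social_objective(D, a):
    majority = Counter(D).most_common(1)[0][0]
    return sum(1 for a_i in a if a_i == majority)

D = [1, 0, 1, 1, 0, 1]
msg = sigma(D)
a = [pi(D_i, msg) for D_i in D]

ell = len(msg)                         # coordination complexity of this protocol
opt = social_objective(D, [1] * len(D))  # all agents playing the majority is optimal
ratio = opt / social_objective(D, a)   # approximation ratio (Definition 1)
```

Here the one-bit broadcast is exponentially shorter than publishing the n-bit solution itself, which is the kind of saving the paper asks for in richer problems.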
This is without loss of generality, since the coordinator has full knowledge of the problem instance. Any interactive protocol could be reduced at no additional communication cost to a non-interactive protocol, simply by having the coordinator publish the transcript that would have arisen from the interactive protocol. This is in contrast to the setting of communication complexity, in which interactive protocols can be more powerful than non-interactive protocols (and makes it easier to prove lower bounds for coordination complexity). Second, the coordination complexity of a problem is trivially upper bounded both by the description length of the problem instance (as is communication complexity), and by the description length of the problem's optimal solution (unlike in non-deterministic communication complexity, there is no need to pair the optimal solution with a certificate allowing individual agents to verify it). Hence, non-trivial bounds will be asymptotically smaller than both of these quantities.

Bipartite Matching
The primary coordination problem we study in this paper is the bipartite matching problem. In this problem, there is a bipartite graph G = (V, W, E), in which every node in V is associated with a player and every node in W represents a good. Each player i's private data is the set of edges incident to her node – i.e. D^(i) = {j : (i, j) ∈ E}. We study two variants of this problem. In the one-to-one matching problem, W represents a set of distinct goods, and the goal is to coordinate a maximum cardinality matching E′ ⊆ E such that for every i ∈ V, |{j ∈ W : (i, j) ∈ E′}| ≤ 1, and for every j ∈ W, |{i ∈ V : (i, j) ∈ E′}| ≤ 1. In the many-to-one matching problem, W represents a set of k commodities j, each with a supply b_j. The goal is to coordinate a maximum cardinality many-to-one matching E′ ⊆ E such that for every i ∈ V, |{j ∈ W : (i, j) ∈ E′}| ≤ 1, and for every j ∈ W, |{i ∈ V : (i, j) ∈ E′}| ≤ b_j. The social objective in this setting is the welfare, or the cardinality of the matching, and we will use OPT(G) to denote the optimal welfare objective.

Note that the resulting solution might not be feasible, since the players' demands are not always satisfied. We need to make sure that we are not over-counting when measuring the welfare. In one-to-one matchings, if more than one player selects a good, only the first player is matched to it. In many-to-one matchings, if more than b_j players select a good of type j, only the first b_j players are matched to good j.

Notation
We use ‖·‖ to denote the ℓ_2 norm, and more generally use ‖·‖_p to denote the ℓ_p norm.

3 Lower Bounds for Bipartite Matching Problems

In this section, we present lower bounds on the coordination complexity of bipartite matching problems. As a building block, we prove a lower bound for the one-to-one matching problem on a bipartite graph with n vertices on each side, showing an Ω(n) lower bound – i.e. that no substantial improvement on the trivial solution is possible. We then extend this lower bound to the problem of many-to-one matchings, in which there are n agents who must be matched to k goods (each good can be matched to many agents, up to its supply). Here, we show an Ω(k) lower bound. In the next section, we give a nearly matching upper bound, which substantially improves over the trivial solution.

3.1 A Variant of the Index Function Problem

Before we present our lower bound, we introduce a variant of the random index function problem [KNR99], which will be useful for our proof.
MULTIPLE-INDEX
There are two players, Alice and Bob. Alice receives as input a sequence of t pairs, I = ⟨(S_i, u_i) : i = 1, 2, . . . , t⟩, where the S_i are disjoint sets each with k elements, and u_i is uniformly distributed in S_i. Based on her input, Alice sends Bob a message M(I). Bob then receives (S_j, j), where j is chosen uniformly at random from [t]. Bob must determine u_j; let his output be B(S_j, j, M(I)) ∈ S_j. We say that the protocol succeeds if B(S_j, j, M(I)) = u_j. Let ℓ(t, k, p) be the minimum number of bits (for the worst input) that Alice must send in order for Bob to succeed with probability at least p.

Note that if Bob guesses randomly, then the protocol already succeeds with probability p = 1/k. The following result shows that any significant improvement over this trivial probability of success will require Alice to send Bob a long message. See the appendix for a full proof.

Lemma 1.
For p ≥ 1/k, we have ℓ(t, k, p) ≥ t (p − 1/k)² / (8 log e).

3.2 A Lower Bound for One-to-One Matchings

We will first focus on the lower bound for one-to-one matching, and show the following.
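To build intuition for the baseline in Lemma 1, the following simulation (our own illustration, not part of the paper) checks that with no message at all, Bob's success probability in MULTIPLE-INDEX concentrates around the trivial 1/k.

```python
import random

random.seed(0)

def multiple_index_trial(t, k):
    """One round of MULTIPLE-INDEX with an empty message: Alice holds
    t disjoint k-element sets S_i, each with a uniformly random special
    element u_i; Bob sees only (S_j, j) and must guess u_j."""
    S = [list(range(i * k, (i + 1) * k)) for i in range(t)]  # disjoint sets
    u = [random.choice(S_i) for S_i in S]                    # special elements
    j = random.randrange(t)
    guess = random.choice(S[j])        # Bob guesses with zero communication
    return guess == u[j]

k = 8
trials = 20000
p_hat = sum(multiple_index_trial(t=5, k=k) for _ in range(trials)) / trials
# p_hat concentrates near the trivial success probability 1/k = 0.125
```

Lemma 1 says that beating this baseline by any fixed margin on most of the t instances forces Alice's message length to grow linearly in t.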
Theorem 1.
Suppose the coordination protocol Π for one-to-one matching guarantees an approximation ratio of ρ in expectation. Then the coordinator of Π must broadcast Ω(n/ρ) bits on problem instances of size n (in the worst case).

Fix the protocol Π. We will extract a two-party communication protocol for the MULTIPLE-INDEX problem from Π, and use the above lemma. As a first step for our lower bound proof, we will consider the following random graph construction process RanG.

Random Graph Construction
RanG(ρ, n): Let κ = n/2 and A = n/(2ρ). Consider the following random bipartite graph G with vertex set (V, W) such that |V| = |W| = n.

• Randomly generate an ordering w_1, w_2, . . . , w_n of W (all n! orderings being equally likely), and partition W as W_1 ∪ W_2 such that W_1 = {w_1, w_2, . . . , w_κ} and W_2 = {w_{κ+1}, w_{κ+2}, . . . , w_n}.

• Similarly, randomly generate an ordering v_1, v_2, . . . , v_n of V, and partition V into n/A blocks B_1, B_2, . . . , B_{n/A} (each with A vertices), where B_j := {v_i : (j − 1)A + 1 ≤ i ≤ jA}.

• Connect each block B_j to W_1 as follows. The neighbourhoods in W_1 of the vertices in each B_j will be disjoint: we partition W_1 into equal-sized disjoint sets (T_v : v ∈ B_j), and let the neighbours of v ∈ B_j in W_1 be exactly the ρ vertices in T_v.

• In addition, assign each vertex in V one neighbor in W_2, by connecting V with W_2 in round-robin fashion — connect vertex v_i to vertex w_j, where j = κ + (i mod (n − κ)).

Before we prove Theorem 1, let us first observe that a graph generated by RanG always has a matching with high welfare.

Lemma 2.
Each graph G generated by the above process RanG(·, n) has optimal welfare OPT(G) ≥ n/2.

Proof. Given the fixed ordering over the vertices in V, match each of the first n − κ = n/2 vertices to its unique neighbor in W_2. By the round-robin construction these neighbors are distinct, so this gives a matching with welfare at least n/2.

Proof of Theorem 1.
Let Π be a coordination protocol with coordination complexity ℓ and approximation ratio ρ. This means that on a graph instance generated by RanG, the parties coordinate on a matching of expected size at least OPT(G)/ρ ≥ n/(2ρ). Let α_v be the probability that in Π, agent v picks her unique neighbor in W_2. Then, by linearity of expectation, Σ_{v∈V} α_v ≥ n/(2ρ); that is, E_{v∈V}[α_v] ≥ 1/(2ρ). Hence, there must be some block B_j such that E_{v∈B_j}[α_v] ≥ 1/(2ρ). We will now restrict attention to the block B_j and consider the following instance of MULTIPLE-INDEX: for each v ∈ B_j, set S_v = N(v), the neighborhood of vertex v, and let u_v be the unique vertex in S_v ∩ W_2. Since the message broadcast by the coordination protocol allows the players in B_j to identify their special elements with average success probability bounded away from the trivial guessing probability, Lemma 1 yields ℓ ≥ Ω(n/ρ), which completes the proof.

3.3 A Lower Bound for Many-to-One Matchings

Finally, we give the following lower bound on the coordination complexity of many-to-one matchings. The lower bound relies on the result from Section 3.2 — we show that any coordination protocol for many-to-one matchings can also be reduced to a protocol for one-to-one matchings, and so the lower bound in Section 3.2 can be extended to give a lower bound for the many-to-one setting.

For our lower bound instance, we consider bipartite graphs G = (V, W, E) such that the vertices in V represent n different players and W represents a set of k goods j, each with a supply b.

Theorem 2.
Suppose that there exists a coordination protocol for many-to-one matchings that guarantees an approximation ratio of ρ in expectation. Then such a protocol has coordination complexity Ω(k/ρ).

We will start by considering a one-to-one matching instance generated by
RanG with k vertices on each side of the graph G′ = (V′, W′, E′). By Lemma 2, the optimal matching of G′ has size OPT′ ≥ k/2. Now we will turn this into an instance of a many-to-one matching problem: make b copies of each vertex in V′ to obtain the vertex set V, and set W := W′ such that the supply of each good j is b; then for an edge (v′, w′) in the original graph, connect all copies of v′ to w′ in the new graph. This gives a bipartite graph G = (V, W, E). The following claim is straightforward.

Claim 1.
The new graph G has a matching of size at least b · OPT′.

Now suppose that we could coordinate the players in V to obtain a matching M* of size at least b · OPT′/ρ in G. Then with a simple sampling procedure, we can extract a high cardinality matching for the original graph: for each vertex v′ ∈ V′, sample one of the b copies of v′ in G uniformly at random, along with its incident matched edge. If two vertices in V′ are connected to the same type of good in W′, break ties arbitrarily and keep only one of the edges.

Lemma 3. The sampled matching in G′ has expected size at least Ω(OPT′/ρ).

We will defer the proof to the appendix. We now have all the pieces to prove Theorem 2.
Proof of Theorem 2.
Suppose that there exists a coordination protocol (σ, π) for many-to-one matchings with a guaranteed approximation ratio of ρ. By the result of Lemma 3, the induced coordination protocol for one-to-one matchings has an approximation ratio of at most O(ρ). By the lower bound in Theorem 1, we know that the length of σ(G) is at least Ω(k/ρ).

4 Coordinating Linearly Separable Convex Programs

In this section, we give a coordination protocol for problems which can be expressed as linearly separable convex programs, with coordination complexity scaling only with the number of constraints that bind between agents (so-called coupling constraints, defined below). In the next section, we show how to specialize this protocol to the special case of many-to-one matchings, which gives coordination complexity nearly matching our lower bound.
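As a warm-up, here is a toy member of this class (our own illustration, not an instance from the paper): each player i has a linear utility w_i · x^(i) over x^(i) ∈ [0, 1], with a single coupling constraint Σ_i x^(i) ≤ b. For linear utilities the optimum is the fractional-knapsack solution, which the sketch below computes greedily for reference.

```python
def separable_lp_opt(w, b):
    """Maximize sum_i w[i] * x[i]  subject to  0 <= x[i] <= 1 (personal
    constraints) and sum_i x[i] <= b (one coupling constraint).
    With linear utilities this is fractional knapsack: fill the
    highest-weight coordinates first."""
    x = [0.0] * len(w)
    remaining = b
    for i in sorted(range(len(w)), key=lambda i: -w[i]):
        x[i] = min(1.0, remaining)
        remaining -= x[i]
        if remaining <= 0:
            break
    return x

w = [3.0, 1.0, 2.0]
x = separable_lp_opt(w, b=1.5)
value = sum(wi * xi for wi, xi in zip(w, x))
# x = [1.0, 0.0, 0.5]: player 0 (weight 3) is fully served, then
# player 2 (weight 2) receives the remaining 0.5 units of supply.
```

Note that the objective and the coupling constraint each decompose as a sum of per-player terms, which is exactly the separability the protocol of this section exploits.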
Definition 3.
A linearly separable convex optimization problem consists of n players, and for each player i,

• a compact and bounded convex feasible set F^(i) ⊆ {x^(i) ∈ R^l : ‖x^(i)‖ ≤ 1},

• a concave and 1-Lipschitz objective function v^(i) : F^(i) → R such that v^(i)(0) = 0,

• and k convex constraint functions c^(i)_j : F^(i) → [0, 1] (indexed by j = 1, . . . , k).

The convex optimization problem is:

    max_x  Σ_{i=1}^n v^(i)(x^(i))
    subject to  Σ_{i=1}^n c^(i)_j(x^(i)) ≤ b_j   for j = 1, . . . , k   (Coupling constraints)
                x^(i) ∈ F^(i)                    for i = 1, . . . , n   (Personal constraints)

where each player i controls the block of decision variables x^(i).

Viewed as a coordination problem, the data held by each agent i is D^(i) = {F^(i), v^(i), c^(i)_1, . . . , c^(i)_k}, his action range is A_i = F^(i), and the social objective function S is the objective of the convex program. We will denote the product of the personal constraints by F = F^(1) × · · · × F^(n), the objective function by v(x), and the optimal value by OPT. In this notation we can write the problem as

    max_{x ∈ F : Σ_{i=1}^n c^(i)_j(x^(i)) ≤ b_j for all j}  v(x).

Note that here the problem is constrained both by the personal constraints F and by the coupling constraints. We will assume the problem above is feasible, and our goal is to coordinate the players to play an aggregate solution x = (x^(i))_{i∈[n]} that is approximately feasible and optimal. Our solution consists of two steps:

1. We will first introduce a regularization term η‖x‖² to our objective function, and coordinate the players to maximize the regularized objective. The purpose of adding this regularization term is to make the objective function strongly concave, which will cause it to have the property that an optimal dual solution will uniquely specify the optimal primal solution.

2.
Then we will show that the resulting optimal solution to the regularized problem is close to being optimal for the original (unregularized) problem. The weight of the regularization has to be traded off against the bit precision to which we need to communicate the optimal dual variables.

In the first step, we add a small regularization term to our original objective function. Consider the following convex optimization problem:

    max_{x ∈ F : Σ_{i=1}^n c^(i)_j(x^(i)) ≤ b_j for all j}  v′(x) = Σ_{i=1}^n v^(i)(x^(i)) − η‖x‖²

Claim 2.
The objective function $v'$ is $\eta$-strongly concave.

To solve the convex program, we will work with the partial Lagrangian $L(x, \lambda)$, which results from bringing only the coupling constraints into the objective via Lagrangian dual variables, while leaving the personal constraints to continue to constrain the primal feasible region:
\[
L(x, \lambda) = \sum_{i=1}^n \left( v^{(i)}(x^{(i)}) - \eta\|x^{(i)}\|^2 \right) - \sum_{j=1}^k \lambda_j \left( \sum_{i=1}^n c^{(i)}_j(x^{(i)}) - b_j \right)
\]
Let
$\mathrm{OPT}'$ denote the optimum of the regularized convex program. By strong duality we have
\[
\max_{x \in \mathcal{F}} \min_{\lambda \in \mathbb{R}^k_+} L(x, \lambda) = \min_{\lambda \in \mathbb{R}^k_+} \max_{x \in \mathcal{F}} L(x, \lambda) = \mathrm{OPT}'.
\]
Fixing the optimal dual variables $\lambda$, the optimal primal solution $y$ satisfies $y = \operatorname{argmax}_{x \in \mathcal{F}} L(x, \lambda)$. Note that the result of moving the coupling constraints into the Lagrangian is that we can now write the primal optimization problem over a feasible region defined only by the personal constraints. Because of this fact, and because the Lagrangian objective is linearly separable across players, given $\lambda$, each player's portion of the solution $y^{(i)}$ is
\[
\operatorname{argmax}_{x^{(i)} \in \mathcal{F}^{(i)}} \; v^{(i)}(x^{(i)}) - \eta\|x^{(i)}\|^2 - \sum_{j=1}^k \lambda_j c^{(i)}_j(x^{(i)}). \tag{1}
\]
Thus, if the argmax were unique, the optimal dual variables $\lambda$ would be sufficient to coordinate each of the parties to find their portion of the optimal solution, without the need for further communication (the problem, in general, is that the argmax need not be unique, and ties may need to be broken in a coordinated fashion). However, because we have added a strongly concave regularizer, the argmax is unique in our setting:

Claim 3.
The solution to
\[
\operatorname{argmax}_{x^{(i)} \in \mathcal{F}^{(i)}} \left( v^{(i)}(x^{(i)}) - \eta\|x^{(i)}\|^2 - \sum_{j=1}^k \lambda_j c^{(i)}_j(x^{(i)}) \right)
\]
is unique.

Proof. This follows from the fact that the function $v^{(i)}(x^{(i)}) - \eta\|x^{(i)}\|^2$ is strongly concave.

This gives rise to our simple coordination mechanism ReC. The mechanism first computes the optimal dual variables of our regularized partial Lagrangian problem, rounds them to finite precision, and then publishes these variables. Then each individual player finds her part of the near-optimal solution by performing the optimization in Equation (1). The details are in Algorithm 1.
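To make the decoding step concrete, here is a minimal sketch (illustrative only, not from the paper) of Equation (1) for a hypothetical one-dimensional instance: player $i$ has linear objective $v^{(i)}(x) = a_i x$ on $\mathcal{F}^{(i)} = [-1, 1]$ and a single coupling constraint with $c^{(i)}_1(x) = x$. With the regularizer, the per-player problem is strictly concave, so its maximizer is unique and has a closed form.

```python
def decode(a_i, lam, eta):
    """Player i's unique best response to the broadcast dual variable lam:
    argmax over x in [-1, 1] of a_i*x - eta*x**2 - lam*x."""
    unconstrained = (a_i - lam) / (2 * eta)      # stationary point of the concave objective
    return max(-1.0, min(1.0, unconstrained))    # project onto the personal constraint [-1, 1]

# Every player applies the same rule to the same broadcast lam, so no
# further coordination is needed: the argmax is unique.
x_i = decode(a_i=1.0, lam=0.4, eta=0.5)
```

Because the unconstrained maximizer of a strictly concave quadratic on an interval either lies inside the interval or is clipped to an endpoint, ties never arise, which is exactly the role the regularizer plays in Claim 3.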
Algorithm 1
Coordination Protocol for Linearly Separable Convex Programs
ReC($\eta, \varepsilon$)
Input: a linearly separable convex program instance $I$, regularization parameter $\eta$, and target accuracy $\varepsilon$
Initialize: $\alpha = \eta\varepsilon^2 / \sqrt{nk}$
Modify the objective of $I$ into $\max_{x \in \mathcal{F}} v(x) - \eta\|x\|^2$
Compute the optimal dual solution $\lambda^\bullet$ for the modified convex program
Round each coordinate of $\lambda^\bullet$ to a multiple of $\alpha/\sqrt{k}$ to obtain $\hat{\lambda}$
Broadcast the rounded dual solution $\hat{\lambda}$. To decode, each player $i$ computes:
\[
\hat{x}^{(i)}(\hat{\lambda}) = \operatorname{argmax}_{x^{(i)} \in \mathcal{F}^{(i)}} \; v^{(i)}(x^{(i)}) - \eta\|x^{(i)}\|^2 - \sum_{j=1}^k \hat{\lambda}_j c^{(i)}_j(x^{(i)})
\]

Next we show that the resulting solution is close to the optimal solution of the regularized convex program (i.e., that we do not lose much by truncating the dual variables to finite bit precision). Let $(x^\bullet, \lambda^\bullet)$ be an optimal primal-dual pair for the regularized convex program. Note that since the objective of the program is strongly concave, $x^\bullet$ is unique. First, we will show that if the broadcast dual vector $\hat{\lambda}$ is close to an optimal dual solution $\lambda^\bullet$, the resulting solution $\hat{x}$ will also be close to the optimal primal solution $x^\bullet$.

Lemma 4.
Suppose we have a dual vector $\hat{\lambda}$ such that $\|\lambda^\bullet - \hat{\lambda}\| \le \alpha$. Let $\hat{x} = \operatorname{argmax}_{x \in \mathcal{F}} L(x, \hat{\lambda})$. Then
\[
\|\hat{x} - x^\bullet\| \le \sqrt{\alpha}\,(nk)^{1/4} / \sqrt{\eta}.
\]
Lemma 5.
The coordination mechanism
ReC instantiated with regularization parameter $\eta$ and target accuracy parameter $\varepsilon$ will coordinate the players to play a solution $\hat{x}$ that satisfies $\|\hat{x} - x^\bullet\| \le \varepsilon$, and has coordination complexity $O(k \log(nk/(\eta\varepsilon)))$.

Proof. Note that $\alpha = \eta\varepsilon^2/\sqrt{nk}$, and the mechanism rounds each coordinate of the optimal dual solution $\lambda^\bullet$ to a multiple of $\alpha/\sqrt{k}$, so the approximate dual vector $\hat{\lambda}$ can be specified with $O(k \log(\sqrt{k}/\alpha))$ bits. Since for each coordinate $j$, $|\lambda^\bullet_j - \hat{\lambda}_j| \le \alpha/\sqrt{k}$, we also have that $\|\lambda^\bullet - \hat{\lambda}\| \le \alpha$. By Lemma 4, we then know that $\|\hat{x} - x^\bullet\| \le \varepsilon$.

Now we carry out the second step, showing that if we choose the regularization parameter $\eta$ carefully, the solution resulting from the coordination mechanism above is both approximately feasible and approximately optimal. Let $x^*$ denote the optimal solution of the original convex program, $x^\bullet$ the optimal solution of the regularized convex program, and $\hat{x}(\eta)$ the solution resulting from the coordination mechanism when we use parameter $\eta$. As an intermediate step, we will first bound the objective difference between $x^\bullet$ and $x^*$.

Lemma 6.
For any choice of $\eta$, $v(x^*) - v(x^\bullet) \le \eta n$.

Proof. Since both $x^*$ and $x^\bullet$ are in the feasible region of the regularized convex program, we know that
\[
v(x^\bullet) - \eta\|x^\bullet\|^2 \ge v(x^*) - \eta\|x^*\|^2.
\]
Since for each $i$, $(x^\bullet)^{(i)}$ and $(x^*)^{(i)}$ have $\ell_2$ norm bounded by 1, we have $\|x^*\|^2 - \|x^\bullet\|^2 \le n$. Therefore, we must have $v(x^\bullet) \ge v(x^*) - \eta n$.

Next we bound the objective difference between $\hat{x}$ and $x^\bullet$ using Lipschitzness.

Lemma 7.
Suppose that $\|\hat{x} - x^\bullet\| \le \varepsilon$. Then $v(x^\bullet) - v(\hat{x}) \le n\varepsilon$.

Proof.
The proof follows from the fact that each $v^{(i)}$ is $1$-Lipschitz, and hence the function $v$ is $n$-Lipschitz in the aggregate vector $x$.

Theorem 3.
The coordination mechanism
ReC($\eta, \varepsilon$) coordinates the players to play a joint solution $\hat{x}$ that satisfies
\[
v(\hat{x}) \ge \mathrm{OPT} - n(\varepsilon + \eta) \quad \text{and} \quad \min_{x \in \mathcal{F}} \|x - \hat{x}\| \le \varepsilon,
\]
and has coordination complexity $O(k \log(nk/(\eta\varepsilon)))$.

Proof. Follows from the previous lemmas.

Application to Many-to-One Matchings
Next we show a simple instantiation of our coordination mechanism for linearly separable convex programs that gives a coordination complexity upper bound for many-to-one matchings. First, consider the following linear program formulation of the matching problem:
\[
\max_x \sum_{i=1}^n \sum_{j=1}^k v_{i,j} x_{i,j} \tag{2}
\]
subject to
\[
\sum_{i=1}^n x_{i,j} \le b_j \quad \text{for } j = 1, \ldots, k \tag{3}
\]
\[
\sum_{j=1}^k x_{i,j} \le 1 \quad \text{for } i = 1, \ldots, n \tag{4}
\]
\[
x_{i,j} \ge 0 \quad \text{for } i = 1, \ldots, n \text{ and } j = 1, \ldots, k \tag{5}
\]
Observe that the matching linear program is an example of a linearly separable convex program as defined in Definition 3. Each player $i$ has valuation $v_{i,j} \in \{0, 1\}$ for each type of good $j$ and controls the decision variables $\{x_{i,j}\}_{j=1}^k$. Each supply constraint in Equation (3) corresponds to a coupling constraint, and the constraints in Equations (4) and (5) are personal constraints.

A nice property of the matching linear program is that any extreme point is integral. However, this structure no longer holds once we add a regularization term to the welfare objective, so the solution $\hat{x}$ resulting from the coordination mechanism will be fractional. To obtain an integral solution, we can simply use independent rounding, which does not require any further coordination: each player $i$ takes her portion of the fractional solution $(\hat{x}_{i,j})_{j=1}^k$ and independently samples a good, selecting each good $j$ with probability $\hat{x}_{i,j}$. We will continue to use similar notation: let $v(\cdot)$ denote the welfare objective in the linear program, $x^*$ the optimal solution for the matching linear program with welfare OPT, $\hat{x}$ the optimal solution for the regularized program with welfare $\hat{V}$, $x'$ the rounded solution obtained from $\hat{x}$, and $\mathcal{F}$ the feasible region defined by the constraints of Equation (3) in the linear program. The following lemma bounds the loss of welfare due to rounding. For details of the proof, see the full version.

Lemma 8.
Let $\beta \in (0, 1)$. Then with probability at least $1 - \beta$, the rounded solution $x'$ satisfies
\[
v(x') \ge \left( 1 - \sqrt{\frac{2\log(2/\beta)}{\hat{V}}} \right) \hat{V}.
\]
Proof.
Since $\mathbb{E}\left[\sum_{i=1}^n \sum_{j=1}^k x'_{i,j}\right] = \hat{V}$, by the Chernoff-Hoeffding bound we know that for any $\delta \in (0, 1)$,
\[
\Pr\left[ \sum_{i=1}^n \sum_{j=1}^k x'_{i,j} < (1 - \delta)\hat{V} \right] < \exp\left( -\delta^2 \hat{V} / 2 \right).
\]
If we set $\beta = \exp(-\delta^2 \hat{V}/2)$, then $\delta = \sqrt{2\log(1/\beta)/\hat{V}} \le \sqrt{2\log(2/\beta)/\hat{V}}$, which recovers the stated bound.

Now we look at the approximate feasibility of $x'$.

Lemma 9. Suppose that $\min_{x \in \mathcal{F}} \|x - \hat{x}\| \le \varepsilon$. Then with probability $1 - \beta$, $x'$ satisfies
\[
\sum_{j=1}^k \left( \sum_{i=1}^n x'_{i,j} - b_j \right)_+ \le \sqrt{k \log(k/\beta)\, \hat{V}} + \sqrt{nk}\,\varepsilon.
\]
Observe that since this is a packing linear program, it is easy to obtain exact feasibility if desired, by simply scaling down the supply constraints: this transfers the approximation factor in the feasibility bound to become an approximation factor in the objective.

Lastly, we are ready to establish the welfare guarantee for the rounded solution. Since the solution we obtain might slightly violate the feasibility constraints, we want to make sure we are not over-counting. If more than $b_j$ parties select a particular good of type $j$, we only count the first $b_j$ parties to select it when measuring our welfare guarantee.

Theorem 4.
There exists a coordination protocol with coordination complexity $O(k \log(nk))$ such that the parties coordinate on a matching $x'$ with total weight
\[
\sum_{j=1}^k \min\left\{ \sum_{i=1}^n v_{i,j} x'_{i,j},\; b_j \right\} \ge \left( 1 - O\left( \frac{\sqrt{k \log(k/\beta)}}{\sqrt{\mathrm{OPT}}} \right) \right) \mathrm{OPT}
\]
as long as
$\mathrm{OPT} \ge 1$.

Observe that in the setting of many-to-one matchings, when the supply of each good is $b_j \gg 1$, we expect that $\mathrm{OPT} \gg k$, and hence in this setting the above theorem guarantees a solution with weight $(1 - o(1))\mathrm{OPT}$.

Remark 1.
We remark that many other combinatorial optimization tasks have fractional relaxations that can be written as linearly separable convex programs, and the same rounding technique can be applied to obtain low-coordination-complexity protocols for them. This class includes, among others, multi-commodity flow (where the coordination complexity scales with the number of edges in the underlying graph, but not with the number of parties who wish to route flow) and multi-dimensional knapsack problems (where the coordination complexity scales with the number of different types of knapsack constraints, but not with the number of parties who need to decide on their inclusion in or out of the knapsack).
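The independent rounding step used above can be sketched in a few lines (an illustrative fragment, not the paper's code; the fractional solution `x_hat` below is made up). Each player samples using only her own fractional row, so the rounding needs no communication at all:

```python
import random

def independent_round(x_row, rng):
    """Sample at most one good for a single player: good j is chosen with
    probability x_row[j]; with leftover probability 1 - sum(x_row),
    the player takes no good (returns None)."""
    u, acc = rng.random(), 0.0
    for j, p in enumerate(x_row):
        acc += p
        if u < acc:
            return j
    return None

rng = random.Random(0)
x_hat = [[0.7, 0.3], [0.2, 0.5], [1.0, 0.0]]  # hypothetical fractional matching, k = 2 goods
assignment = [independent_round(row, rng) for row in x_hat]
# assignment[i] is player i's good (or None), chosen with no further coordination
```

Over many draws, each player receives good $j$ with frequency close to $\hat{x}_{i,j}$, which is what the Chernoff-Hoeffding argument of Lemma 8 exploits.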
In this section, we explain a simple implication of our results: problems that have low-sensitivity objectives (i.e., problems such that one party's data and action do not substantially affect the objective value) and low coordination complexity also have good algorithms for solving them subject to joint differential privacy. When the strategic variant of the optimization problem is a smooth game, they also have good welfare properties for no-regret players, even when agent types are dynamically changing.

A database $D \in \mathcal{X}^n$ is an $n$-tuple of private records, each from one of $n$ agents. Two databases $D, D'$ are $i$-neighbors if they differ only in their $i$-th index: that is, if $D_j = D'_j$ for all $j \ne i$. If two databases $D$ and $D'$ are $i$-neighbors for some $i$, we say that they are neighboring databases, and we write $D \sim D'$. We will be interested in randomized algorithms that take a database as input and output an element from some abstract range $\mathcal{R}$.

Definition 4 ([DMNS06]). A mechanism $M : \mathcal{X}^n \to \mathcal{R}$ is $(\varepsilon, \delta)$-differentially private if for every pair of neighboring databases $D, D' \in \mathcal{X}^n$ and for every subset of outputs $S \subseteq \mathcal{R}$,
\[
\Pr[M(D) \in S] \le \exp(\varepsilon) \Pr[M(D') \in S] + \delta.
\]
For the class of problems we consider, elements in both the domain and the range of the mechanism are partitioned into $n$ components, one for each player. In this setting, joint differential privacy [KPRU14] is a more natural constraint: for all $i$, the joint distribution on outputs given to players $j \ne i$ must be differentially private in the input of player $i$. Given a vector $x = (x_1, \ldots, x_n)$, we write $x_{-i} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$ to denote the vector of length $n - 1$ which contains all coordinates of $x$ except the $i$-th coordinate.

Definition 5 ([KPRU14]).
A mechanism $M : \mathcal{X}^n \to \mathcal{R}^n$ is $(\varepsilon, \delta)$-jointly differentially private if for every $i$, for every pair of $i$-neighbors $D, D' \in \mathcal{X}^n$, and for every subset of outputs $S \subseteq \mathcal{R}^{n-1}$,
\[
\Pr[M(D)_{-i} \in S] \le \exp(\varepsilon) \Pr[M(D')_{-i} \in S] + \delta.
\]
If $\delta = 0$, we say that $M$ is $\varepsilon$-differentially private. The case of $\delta > 0$ is sometimes referred to as approximate differential privacy.

Note that this is still a very strong privacy guarantee: the mechanism preserves the privacy of any player $i$ against arbitrary coalitions of other players. It only weakens the constraint of differential privacy by allowing player $i$'s own output to depend arbitrarily on her own input.

An important class of jointly differentially private algorithms, particularly amenable to our purposes, are those that work in the so-called billboard model. Algorithms in the billboard model compute a differentially private signal as a function of the input database; then each player $i$'s portion of the output is computed as a function only of this private signal and the private data of player $i$. The following lemma shows that algorithms operating in the billboard model satisfy joint differential privacy.

Lemma 10 ([HHR+14]). Suppose $M : \mathcal{X}^n \to \mathcal{R}$ is $(\varepsilon, \delta)$-differentially private. Consider any set of functions $f_i : \mathcal{X} \times \mathcal{R} \to \mathcal{R}'$. Then the mechanism $M'$ that outputs to each player $i$: $f_i(D_i, M(D))$ is $(\varepsilon, \delta)$-jointly differentially private.

Note the similarity between algorithms operating in the billboard model and coordination complexity protocols: a signal is computed by a central party, and then the action of each agent is a function only of this signal and of their own portion of the problem instance. Thus, the following lemma is immediate:
Lemma 11.
A coordination protocol $(\sigma, \pi)$ satisfies $(\varepsilon, \delta)$-joint differential privacy if the coordinator's encoding function $\sigma$ satisfies $(\varepsilon, \delta)$-differential privacy.

Proof. Follows from Lemma 10.
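The billboard structure of Lemma 10 can be sketched as follows, assuming (for illustration only) a Laplace-noised count as the central differentially private signal; `private_signal`, `local_decode`, and the threshold rule are hypothetical names, not from the paper:

```python
import math
import random

def private_signal(data, eps, rng):
    """Coordinator's broadcast: the count sum(data) (sensitivity 1) released
    with Laplace(1/eps) noise, i.e., an (eps, 0)-differentially private signal."""
    u = rng.random() - 0.5  # inverse-CDF sampling of Laplace(0, 1/eps) noise
    noise = -math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u)) / eps
    return sum(data) + noise

def local_decode(own_bit, signal, threshold):
    """Player i's portion of the output depends only on the public signal and
    her own data, so by Lemma 10 the whole mechanism is jointly private."""
    return own_bit if signal >= threshold else 0

rng = random.Random(1)
data = [1, 0, 1, 1]  # hypothetical private bits, one per player
signal = private_signal(data, eps=1.0, rng=rng)
outputs = [local_decode(b, signal, threshold=2.0) for b in data]
```

The point of the sketch is only the dataflow: one DP message goes out, and each player's output is a deterministic function of that message plus her own record.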
Next, we give a general way to convert any coordination protocol into a jointly differentially private algorithm; the lower the coordination complexity of the protocol, the better the utility guarantee of the private algorithm. The tool we use is the exponential mechanism of [MT07], one of the most basic tools in differential privacy. To formally define this mechanism, we consider some arbitrary range $\mathcal{R}$ and some quality score function $q : \mathcal{X}^n \times \mathcal{R} \to \mathbb{R}$, which maps database-output pairs to quality scores.

Definition 6 (The Exponential Mechanism [MT07]). The exponential mechanism $M_E(D, q, \mathcal{R}, \varepsilon)$ selects and outputs an element $r \in \mathcal{R}$ with probability proportional to
\[
\exp\left( \frac{\varepsilon\, q(D, r)}{2\Delta(q)} \right),
\quad \text{where} \quad
\Delta(q) \equiv \max_{r \in \mathcal{R}} \; \max_{D, D' \in \mathcal{X}^n,\, D \sim D'} |q(D, r) - q(D', r)|.
\]
McSherry and Talwar showed that the exponential mechanism is private and with high probability selects an outcome with high quality.
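For a finite range, Definition 6 amounts to one weighted draw. A minimal sketch (the quality scores below are made up, and the sensitivity is taken to be 1):

```python
import math
import random

def exponential_mechanism(quality, eps, sensitivity, rng):
    """Sample index r with probability proportional to
    exp(eps * quality[r] / (2 * sensitivity)), as in Definition 6."""
    weights = [math.exp(eps * q / (2 * sensitivity)) for q in quality]
    total = sum(weights)
    u, acc = rng.random() * total, 0.0
    for r, w in enumerate(weights):
        acc += w
        if u < acc:
            return r
    return len(weights) - 1  # guard against floating-point round-off

# Hypothetical quality scores for four candidate messages.
rng = random.Random(0)
samples = [exponential_mechanism([0.0, 1.0, 5.0, 2.0], eps=2.0,
                                 sensitivity=1.0, rng=rng)
           for _ in range(1000)]
# The highest-quality outcome (index 2) is selected the vast majority of the time.
```

Note the exponential weighting: the highest-quality message is overwhelmingly likely, yet every message retains positive probability, which is what yields pure $\varepsilon$-differential privacy.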
Theorem 5 ([MT07]). The exponential mechanism $M_E(\cdot, q, \mathcal{R}, \varepsilon)$ satisfies $(\varepsilon, 0)$-differential privacy, and for any $D \in \mathcal{X}^n$ it outputs an outcome $r \in \mathcal{R}$ that satisfies
\[
q(D, r) \ge \max_{r'} q(D, r') - \frac{2\Delta(q)\log(|\mathcal{R}|/\beta)}{\varepsilon}
\]
with probability at least $1 - \beta$.

Using the exponential mechanism, we can take any coordination protocol $(\sigma, \pi)$ and construct a jointly differentially private coordination protocol $(\sigma', \pi)$ with the same coordination complexity and almost the same approximation factor. The idea is to construct a differentially private encoding function $\sigma'$ that selects from the message space of $\sigma$ using the exponential mechanism. Without loss of generality, we assume that the social objective function $S$ has low sensitivity:
\[
\max_{i \in [n]} \; \max_{a \in \mathcal{A}^n,\, D \in \mathcal{X}^n} \; \max_{a'_i \in \mathcal{A}_i,\, D'_i \in \mathcal{X}} \; |S(D, a) - S((D'_i, D_{-i}), (a'_i, a_{-i}))| \le 1.
\]
Now we present the private protocol as follows:
Algorithm 2
Jointly private algorithm
PriCoor$((\sigma, \pi), S, \varepsilon, D)$ Input:
A coordination protocol $(\sigma, \pi)$, objective function $S$, privacy parameter $\varepsilon$, and input instance $D$
Let $\mathcal{R} = \{\sigma(D') \mid D' \in \mathcal{X}^n\}$ be the space of all possible messages in the range of $\sigma$
Let the quality function $q$ be defined as $q(D, r) = \mathbb{E}_\pi[S(D, (\pi(r, D^{(i)}))_{i \in [n]})]$ for all $D \in \mathcal{X}^n$, $r \in \mathcal{R}$
Let $\sigma'(D) = M_E(D, q, \mathcal{R}, \varepsilon)$ be the message selected by the exponential mechanism
Output $a = (\pi(\sigma'(D), D^{(i)}))_{i=1}^n$

Lemma 12.
Suppose that $(\sigma, \pi)$ has coordination complexity $\ell$ and approximation ratio $\rho$ for the objective $S$. Then the algorithm PriCoor$((\sigma, \pi), S, \varepsilon, D)$ satisfies $(\varepsilon, 0)$-joint differential privacy, and with probability at least $1 - \beta$, the resulting action profile $a$ satisfies
\[
\mathbb{E}[S(D, a)] \ge \frac{\mathrm{OPT}(D)}{\rho} - \frac{2(\ell + \log(1/\beta))}{\varepsilon},
\]
where the expectation is taken over the internal randomness of the encoding function $\sigma'$ and the decoding function $\pi$. (We can always obtain the low-sensitivity condition by scaling; it is already satisfied in the matching problem.)

Proof. Since the encoding function is an instantiation of the exponential mechanism, we know from Lemma 11 that the instantiation
PriCoor$((\sigma, \pi), S, \varepsilon, D)$ satisfies $(\varepsilon, 0)$-joint differential privacy.

Since the coordination protocol guarantees an approximation ratio of $\rho$, there exists some message $r^\bullet$ in the set $\mathcal{R}$ such that
\[
\mathbb{E}_\pi\left[ S\big(D, (\pi(r^\bullet, D^{(i)}))_{i \in [n]}\big) \right] = q(D, r^\bullet) \ge \frac{\mathrm{OPT}(D)}{\rho}.
\]
Note that $\Delta(q) \le 1$ by our assumption on $S$. Then the utility guarantee of the exponential mechanism gives
\[
\mathbb{E}_\pi\left[ S\big(D, (\pi(\sigma'(D), D^{(i)}))_{i \in [n]}\big) \right] = q(D, \sigma'(D)) \ge \max_{r \in \mathcal{R}} q(D, r) - \frac{2(\log|\mathcal{R}| + \log(1/\beta))}{\varepsilon}.
\]
Also, observe that $\max_{r \in \mathcal{R}} q(D, r) \ge \mathrm{OPT}(D)/\rho$ and $\log(|\mathcal{R}|) \le \ell$, so we also have
\[
\mathbb{E}_{\sigma'}\left[ S\big(D, (\pi(\sigma'(D), D^{(i)}))_{i \in [n]}\big) \right] \ge \frac{\mathrm{OPT}(D)}{\rho} - \frac{2(\ell + \log(1/\beta))}{\varepsilon},
\]
which recovers our stated bound.

Now we briefly discuss a connection between coordination complexity and efficiency in games with dynamic populations, which leverages the connection to joint differential privacy discovered by [LST15]. We briefly introduce the model of [LST15], but the discussion will necessarily be lacking in detail; see [LST15] for a formal treatment.

Let $G$ be an $n$-player normal form stage game. We consider this game played repeatedly with a changing population of players over $T$ rounds. Each player $i$ has an action set $\mathcal{A}_i$, a type $D^{(i)}$, and a utility function $u(D^{(i)}, a) = u_i(a)$. For concreteness, we can think about allocation games defined by auction rules $M$, which take as input an action profile and output an allocation $X_i(a)$ and a payment $P_i(a)$ for each player. Players have quasi-linear utility $u_i(a) = v(D^{(i)}, X_i(a)) - P_i(a) = v_i(X_i(a)) - P_i(a)$, where $v_i : \mathcal{A}^n \to [0, 1]$ denotes the valuation of player $i$ over the allocation. In these games, a natural objective function is social welfare: $S(D, a) = \sum_{i=1}^n v_i(X_i(a))$.
We write $\mathrm{OPT}(D) = \max_{a \in \mathcal{A}^n} S(D, a)$ to denote the optimal welfare with respect to an instance $D$.

In the model of [LST15], after each round, every player independently exits with some probability $p$. Whenever a player leaves the game, she is replaced by a new player, whose type is chosen adversarially. We will write $D_t$ to denote the game instance, and $a_t$ to denote the action profile played, at round $t$. Lastly, we also assume that each player in the game is a no-regret learner and plays some adaptive learning algorithm.

The main result of [LST15] is that the existence of jointly differentially private algorithms that find action profiles approximately optimizing the welfare in a game implies that when the dynamically changing game is played by no-regret players, their average welfare is high.
Theorem 6 (Corollary 5.2 of [LST15]). Consider a mechanism with dynamic population $(M, T, p)$, such that the stage mechanism $M$ is allocation-based $(\lambda, \mu)$-smooth and $T \ge 1/p$. Assume that there exists an $(\varepsilon, \delta)$-jointly differentially private allocation algorithm $X^\bullet : \mathcal{X}^n \to \mathcal{A}^n$ such that for any input instance $D \in \mathcal{X}^n$ it computes a feasible outcome that is $\rho$-approximately optimal:
\[
\mathbb{E}[S(D, X^\bullet(D))] \ge \mathrm{OPT}(D)/\rho.
\]
If all players use adaptive learning in the repeated mechanism, then the overall welfare satisfies
\[
\sum_t \mathbb{E}[S(D_t, a_t)] \ge \frac{\lambda}{\rho \max\{1, \mu\}} \sum_t \mathbb{E}[\mathrm{OPT}(D_t)] - \frac{nT}{\max\{1, \mu\}} \sqrt{p\,(1 + n(\varepsilon + \delta))\ln(NT)},
\]
where $N = \max_i |\mathcal{A}_i|$. (For more details on adaptive learning algorithms and adaptive regret, see [HS07].)

Note that if the problem of coordinating a high-welfare allocation has small coordination complexity, this implies the existence of a jointly differentially private allocation algorithm with a high welfare guarantee. By combining Theorem 6 and Lemma 12, we obtain the following result.
Lemma 13.
Consider a mechanism with dynamic population $(M, T, p)$ such that the stage mechanism $M$ is allocation-based $(\lambda, \mu)$-smooth and $T \ge 1/p$. Assume there is a coordination protocol $(\sigma, \pi)$ with coordination complexity $\ell$ and approximation ratio $\rho$ for the corresponding welfare maximization problem. Then if all players use adaptive learning in the repeated mechanism, the average welfare satisfies
\[
\sum_t \mathbb{E}[S(D_t, a_t)] \ge \frac{\lambda}{\rho \max\{1, \mu\}} \sum_t \mathbb{E}[\mathrm{OPT}(D_t)] - \inf_{\varepsilon > 0} \left\{ \frac{nT}{\max\{1, \mu\}} \sqrt{p\, n\varepsilon \ln(NT)} + \frac{\lambda T}{\rho \max\{1, \mu\}} \cdot \frac{2(\ell + \log(n))}{\varepsilon} \right\}.
\]
In this section, we give another general technique for designing coordination protocols. The key idea is to broadcast a message that is sufficient for players to derive the sequence of actions that they would have executed in some joint dynamic that is known to converge to the solution of the coordination problem. Similar techniques have been used in the privacy literature, for example [KMRW15], [HHR+
14] and [RR14]. We will focus on the application of coordinating an equilibrium flow in atomic routing games, which follows the general outline of [RR14].

A basic primitive that turns out to be broadly useful when writing down the transcript of some dynamic is keeping a running count of a stream of numbers. For many applications, in fact, it is sufficient to maintain an approximate count. Before we start, we introduce two subroutines for keeping track of the approximate count of a stream using low communication. The first one compresses a numeric stream $\tau : [T] \to \{-1, 0, 1\}$ into a short message, and the second one decompresses it. See Algorithm 3 for the simple compression protocol.

Algorithm 3
ApproxCount($\tau, r, T$)
Input: a stream of numbers $\tau : [T] \to \{-1, 0, 1\}$ and refinement parameter $r$
Initialize: a counter $C : [T] \to \mathbb{Z}$, a list of update steps $U = \emptyset$
for $t = 1, \ldots, T$:
  if $t = 1$: let $C(t) = 0$; else: let $C(t) = C(t - 1)$
  if $|C(t) - \sum_{i=1}^t \tau(i)| \ge r$:
    if $C(t) < \sum_{i=1}^t \tau(i)$ then $C(t) = C(t - 1) + r$ and $U \leftarrow U \cup \{(t, +)\}$
    if $C(t) > \sum_{i=1}^t \tau(i)$ then $C(t) = C(t - 1) - r$ and $U \leftarrow U \cup \{(t, -)\}$
Output: the list of update steps $U$

This algorithm releases a concise description $U$ that suffices to reconstruct an approximate running count $C(t)$ of the stream.

Claim 4. For all $t \in [T]$, the approximate count $C(t)$ satisfies $|C(t) - \sum_{i=1}^t \tau(i)| \le r$. The summary statistic $U$ can be written with $O\!\left( \frac{\|\tau\|_1 \log T}{r} \right)$ bits.

The second subroutine takes the compressed message from
ApproxCount as input, and extracts the approximatecounts. See Algorithm 4.
Algorithm 4
ExtractCount($U, r, T$)
Input: a list of update steps $U$, refinement parameter $r$, and time horizon $T$
Initialize: a counter $C : [T] \to \mathbb{Z}$ such that $C(t) = 0$ for all $t \in [T]$
for each $(t, \bullet) \in U$:
  Let $c' = C(t - 1)$
  if $\bullet = +$: $c = c' + r$; else: $c = c' - r$
  for each $t' \in \{t, \ldots, T\}$:
    Let $C(t') = c$
Output: the approximate counts $C$

Looking ahead, we will use
ApproxCount in the coordinator's encoding function to compress the information in the simulated dynamics, and use
ExtractCount in each player's decoding function to extract the information about the dynamics.

An atomic routing game instance is defined by a directed graph $G = (V, E)$, $n$ players with their source-sink pairs $(s_1, d_1), \ldots, (s_n, d_n)$, and a continuous, nondecreasing, $\lambda$-Lipschitz cost function $c_e : [0, n] \to [0, 1]$ for each edge $e \in E$. Each player $i$ needs to route 1 unit of flow from $s_i$ to $d_i$, so her strategy set $\mathcal{A}_i$ is the set of $s_i$-$d_i$ paths. We think of the flow of a single player alternately as a vector indexed by paths $P$, and as a vector indexed by edges $e$. The aggregate flow is the sum over all player flows. A flow $f$ (viewed as a vector indexed by paths) is feasible if for each player $i$, $f^{(i)}_P$ equals 1 for exactly one $s_i$-$d_i$ path and equals 0 for all other paths. We can translate such a flow into a flow indexed by edges by defining $f^{(i)}_e = \sum_{P \ni e} f^{(i)}_P$. The cost $c_P(f)$ of path $P$ in a flow $f$ is
\[
c_P(f) = \sum_{e \in P} c_e(f_e), \quad \text{where } f_e = \sum_{i=1}^n f^{(i)}_e.
\]
We denote the number of edges by $m$ and the set of feasible flows by $\mathcal{F}$, and we sometimes abuse notation by using $f^{(i)}$ to denote the path of player $i$. Now we formalize the notion of an approximate equilibrium flow in a routing game.

Definition 7 (Approximate Equilibrium Flow). Let $f$ be a feasible flow for the atomic instance $(G, c)$. The flow $f$ is an $\varepsilon$-equilibrium flow if each player $i \in [n]$ is playing an $\varepsilon$-best-response, that is, for every pair of $s_i$-$d_i$ paths $P, P'$ with $f^{(i)}_P > 0$,
\[
c_P(f) \le c_{P'}(f') + \varepsilon,
\]
where $f'$ is the flow identical to $f$ except for its $i$-th component: $f'^{(i)}_P = 0$ and $f'^{(i)}_{P'} = 1$. When $f$ is a $0$-equilibrium flow, we simply say that $f$ is an equilibrium flow.

We will make use of the potential function $\Psi$ of the routing game, which is defined as
\[
\Psi(f) \equiv \sum_{e \in E} \sum_{i=1}^{f_e} c_e(i).
\]
Note that in atomic routing games, equilibrium flows are not unique, and so equilibrium selection is important.
This motivates the coordination problem we study. We will rely on the following fact to show that a flow is an approximate equilibrium.
Fact 1.
Consider a flow $f \in \mathcal{F}$. Suppose a player $i$ decreases her cost by deviating from path $P$ to path $\tilde{P}$, giving rise to a new flow $\tilde{f}$. Then
\[
c_P(f) - c_{\tilde{P}}(\tilde{f}) = \Psi(f) - \Psi(\tilde{f}).
\]
Our goal is to give a coordination protocol that coordinates the players to play an approximate equilibrium flow and has low coordination complexity (scaling with the number of edges $|E|$ instead of the number of players $n$). A very straightforward procedure for computing an equilibrium flow is best-response dynamics: while the flow $f$ is not an $\eta$-equilibrium flow, pick a player $i$ and an arbitrary path deviation that decreases her cost. In our coordination protocol, the coordinator will first simulate an approximate version of the best-response dynamics and compress the dynamics into a concise string using the subroutine ApproxCount; then the coordinator will broadcast the string to the players so that each player can simulate the sequence of actions she would have played in the dynamics, and thus determine the action she plays at the end of the dynamics using
ExtractCount.

In our approximate best-response dynamics, we let the players best-respond to the approximate count of players on the edges. We first need to define a player's best response with respect to a count vector in $\mathbb{R}^m$.

Definition 8 (Best-Response with respect to counts). Given a count vector $f^\bullet \in \mathbb{R}^{|E|}$, a path $P^{(i)}$ for player $i$ is an $\varepsilon$-best-response with respect to the vector $f^\bullet$ if
\[
c_{P^{(i)}}(f^\bullet) - \varepsilon \le \min_{\tilde{P}^{(i)} \in \mathcal{A}_i} c_{\tilde{P}^{(i)}}(f^\bullet).
\]
Keep in mind that any feasible flow $f$ is also a count vector. We give the formal description of the coordinator's encoding function BR-Sim in Algorithm 5. We will now focus on analyzing the best-response dynamics within BR-Sim. Note that in the analysis we might say a "player plays a best-response" or a "player deviates"; while these sound natural, all of these procedures are simulated by the coordinator, and the protocol is non-interactive.
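The counters maintained inside BR-Sim are instantiations of the ApproxCount/ExtractCount pair (Algorithms 3 and 4). Here is a minimal sketch of that pair (0-indexed for convenience; the stream below is made up), checking the round-trip accuracy guarantee of Claim 4:

```python
def approx_count(stream, r):
    """Compress a stream tau: [T] -> {-1, 0, 1} into a list of update steps U
    (Algorithm 3): record a +/- r jump whenever the maintained count
    drifts r away from the true running sum."""
    U, c, true = [], 0, 0
    for t, tau in enumerate(stream):
        true += tau
        if abs(c - true) >= r:
            if c < true:
                c += r
                U.append((t, +1))
            else:
                c -= r
                U.append((t, -1))
    return U

def extract_count(U, r, T):
    """Decompress U back into the approximate running counts C (Algorithm 4)."""
    C, c, k = [], 0, 0
    for t in range(T):
        while k < len(U) and U[k][0] == t:
            c += U[k][1] * r
            k += 1
        C.append(c)
    return C

stream = [1, 1, 0, 1, -1, 1, 1, 1, 0, 1]  # hypothetical +/-1 stream
r = 2
U = approx_count(stream, r)
C = extract_count(U, r, len(stream))
# Claim 4: every reconstructed count is within r of the exact running count.
prefix = 0
for t, tau in enumerate(stream):
    prefix += tau
    assert abs(C[t] - prefix) <= r
```

Only the short list `U` needs to be broadcast; each player reruns `extract_count` locally, which is exactly how the players will replay the simulated dynamics.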
Algorithm 5

BR-Sim($(G, \{(s_i, d_i)\}_{i \in [n]}, c), \alpha, r$)
Input: a routing game instance $(G, \{(s_i, d_i)\}_{i \in [n]}, c)$, best-response parameter $\alpha$, and refinement parameter $r$ such that $\alpha > \lambda m(2r + 1)$
Initialize: $l = \alpha - 2\lambda m r - \lambda m$, $T = mn/l$
for each edge $e \in E$:
  Let Counter($e$) be an instantiation of ApproxCount($\cdot, r, (T + 1)n$) (waiting for an incoming stream)
Form the initial flow:
for each player $i$:
  Let $f^{(i)}$ be the $s_i$-$d_i$ path $P^{(i)}$ with the fewest number of edges, breaking ties lexicographically
  for each edge $e$: if $e \in P^{(i)}$ send "1" to Counter($e$), else send "0" to Counter($e$)
Best-response dynamics:
for $t = 1, \ldots, T$:
  if each player is playing an $\alpha$-best-response w.r.t. the counts $\{$Counter($e$)$\}_{e \in E}$: Halt
  for each player $i$:
    if $i$ is not playing an $\alpha$-best-response w.r.t. the counts $\{$Counter($e$)$\}_{e \in E}$:
      Let $\hat{f}^{(i)}$ be the best-response of $i$ w.r.t. (Counter($e$))$_{e \in E}$ (breaking ties lexicographically)
      for each $e$: if $e \in f^{(i)} \setminus \hat{f}^{(i)}$: send "-1" to Counter($e$); if $e \in \hat{f}^{(i)} \setminus f^{(i)}$: send "1" to Counter($e$); else: send "0" to Counter($e$)
      Let $f^{(i)} = \hat{f}^{(i)}$
    else: for each $e$: send "0" to Counter($e$)
for each $e$: let $U_e$ be the output of Counter($e$)
Output: $\{U_e\}_{e \in E}$

Lemma 14. At any moment of the dynamics, let $f \in \mathbb{R}^{|E|}$ be the flow given by all players' paths and let $g \in \mathbb{R}^{|E|}$ be the count vector given by the counters $\{$Counter($e$)$\}_{e \in E}$. Suppose that player $i$'s path $f^{(i)}$ is an $\eta$-best-response with respect to $g$. Then the path $f^{(i)}$ is an $(\eta + 2\lambda m r + \lambda m)$-best-response.

Proof. By Claim 4, we are guaranteed that throughout the dynamics, for each $e \in E$,
\[
|\text{Counter}(e) - f_e| \le r.
\]
This allows us to bound the cost difference of the same path with respect to the two flows $f$ and $g$: for any path $P \subseteq E$,
\[
|c_P(f) - c_P(g)| \le \lambda m r.
\]
This implies
\[
\left| \min_{P' \in \mathcal{A}_i} c_{P'}(f) - \min_{P' \in \mathcal{A}_i} c_{P'}(g) \right| \le \lambda m r.
\]
Note that $c_{f^{(i)}}(g) - \min_{P' \in \mathcal{A}_i} c_{P'}(g) \le \eta$ by our assumption; it then follows from the last two inequalities that
\[
c_{f^{(i)}}(f) \le \min_{P' \in \mathcal{A}_i} c_{P'}(f) + \eta + 2\lambda m r,
\]
so $f^{(i)}$ is an $(\eta + 2\lambda m r)$-best-response w.r.t. the flow $f$. Let $f'^{(i)}$ be a deviation of player $i$ and let $f' = (f'^{(i)}, f^{(-i)})$ be the resulting flow. We know that for each edge $e$, $|f'_e - f_e| \le 1$. It follows that
\[
\min_{P' \in \mathcal{A}_i} c_{P'}(f) \le \min_{P' \in \mathcal{A}_i} c_{P'}(f') + \lambda m.
\]
Therefore
\[
c_{f^{(i)}}(f) \le \min_{P' \in \mathcal{A}_i} c_{P'}(f') + \eta + 2\lambda m r + \lambda m,
\]
which guarantees that $f^{(i)}$ is an $(\eta + 2\lambda m r + \lambda m)$-best-response.

Lemma 15.
Every time a player makes a deviation in the dynamics, the potential function $\Psi$ decreases by at least $\alpha - 2\lambda m r - \lambda m$.

Proof. Let $f$ denote the true flow among the $n$ players. Since the amount the potential function decreases equals the amount by which the deviating player decreases her cost, it suffices to bound $c_{f^{(i)}}(f) - c_{\hat{f}^{(i)}}(f')$, where $f' = (\hat{f}^{(i)}, f^{(-i)})$. Let $g$ denote the count vector given by the counters (Counter($e$))$_{e \in E}$. Suppose a player $i$ has her path switched from $f^{(i)}$ to $\hat{f}^{(i)}$ during the dynamics. Then this means
\[
\min_{P \in \mathcal{A}_i} c_P(g) = c_{\hat{f}^{(i)}}(g) \le c_{f^{(i)}}(g) - \alpha.
\]
By the accuracy guarantee of Claim 4, for any path $P$ we have $|c_P(f) - c_P(g)| \le \lambda m r$, so in particular
\[
|c_{f^{(i)}}(g) - c_{f^{(i)}}(f)| \le \lambda m r \quad \text{and} \quad |c_{\hat{f}^{(i)}}(g) - c_{\hat{f}^{(i)}}(f)| \le \lambda m r.
\]
Furthermore, note that $|f_e - f'_e| \le 1$ for each edge $e$, since $f$ and $f'$ differ only in player $i$'s path, so
\[
|c_{\hat{f}^{(i)}}(f') - c_{\hat{f}^{(i)}}(f)| \le \lambda m.
\]
Combining all the inequalities above, we get
\[
c_{f^{(i)}}(f) - c_{\hat{f}^{(i)}}(f') = \big(c_{f^{(i)}}(f) - c_{f^{(i)}}(g)\big) + \big(c_{f^{(i)}}(g) - c_{\hat{f}^{(i)}}(g)\big) + \big(c_{\hat{f}^{(i)}}(g) - c_{\hat{f}^{(i)}}(f)\big) + \big(c_{\hat{f}^{(i)}}(f) - c_{\hat{f}^{(i)}}(f')\big) \ge -\lambda m r + \alpha - \lambda m r - \lambda m = \alpha - 2\lambda m r - \lambda m,
\]
which also lower bounds the amount that the potential function decreases.

Lemma 16.
At the end of the best-response dynamics, the players are playing an $(\alpha + 2\lambda m r + \lambda m)$-approximate equilibrium flow in the routing game instance.

Proof. In each iteration before halting, there is at least one player performing a deviation, which decreases the potential function by at least $\alpha - 2\lambda m r - \lambda m = l$. Note that the initial flow has potential
\[
\Psi(f) \equiv \sum_{e \in E} \sum_{i=1}^{f_e} c_e(i) \le nm.
\]
This means that after at most $T = mn/l$ iterations of the best-response dynamics, every player is playing an $\alpha$-best-response with respect to the flow given by (Counter($e$))$_{e \in E}$. By Lemma 14, each player is then playing an $(\alpha + 2\lambda m r + \lambda m)$-best-response w.r.t. the final flow $f$. Hence, the final flow is an $(\alpha + 2\lambda m r + \lambda m)$-equilibrium flow.

Given that we have shown that at the end of the best-response dynamics in BR-Sim the players are playing an approximate equilibrium, it only remains to construct a decoding function for the players to recover their own sequences of actions in the dynamics. Observe that BR-Sim outputs the lists of update steps across all counters, which allows each player to simulate the history of the approximate counts. The decoding function is therefore straightforward: first call
ExtractCount to extract the history of counts in the best-response dynamics; then each player i forms her initial flow by picking the shortest path in her action set, and at every time step t such that (t mod n) ≡ i, she decides whether to switch to a best-response with respect to the counts. Since her best response, after breaking ties lexicographically, is uniquely determined, she can in this way determine which path she is playing at the end of the dynamics, which is her part of the approximate equilibrium. The full description of the algorithm ExtractPath is presented in Algorithm 6.
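To make the decoding loop concrete, here is a simplified Python sketch of a single player replaying her moves against the broadcast count history. It is an illustration only: the data layout (paths as edge lists, per-edge latency functions of the load, a `counts_at` accessor for the reconstructed counter history) is a hypothetical stand-in for the interfaces of BR-Sim and ExtractCount, and the time-indexing of the real dynamics is collapsed to one lookup per round.

```python
def path_cost(path, counts, cost_fns):
    # Cost of a path under the (approximate) edge counts.
    return sum(cost_fns[e](counts[e]) for e in path)

def decode_path(paths, counts_at, cost_fns, alpha, T):
    """Replay one player's best-response moves from the broadcast counts.

    paths     -- the player's action set: candidate s_i-d_i paths (edge lists)
    counts_at -- counts_at(t) returns the approximate edge counts at round t
    cost_fns  -- per-edge latency as a function of the edge load
    alpha     -- best-response slack
    T         -- number of rounds of the simulated dynamics
    """
    # Initial flow: fewest edges, ties broken lexicographically.
    current = min(paths, key=lambda p: (len(p), p))
    for t in range(1, T + 1):
        g = counts_at(t)
        # Deterministic best response (lexicographic tie-breaking).
        best = min(paths, key=lambda p: (path_cost(p, g, cost_fns), p))
        # Switch only if the current path is not an alpha-best response.
        if path_cost(current, g, cost_fns) > path_cost(best, g, cost_fns) + alpha:
            current = best
    return current
```

Because the tie-breaking is deterministic, every player who runs this replay on the same broadcast history recovers exactly the path she would have played in the coordinator's simulation, which is the property Claim 5 below formalizes.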
Algorithm 6
ExtractPath((s_i, d_i), {U_e}_{e∈E}, α, r)
Input: player i's source-destination pair (s_i, d_i), a message containing the update steps {U_e}_{e∈E} for the counters {Counter(e)}_{e∈E}, best-response parameter α, and refinement parameter r
Initialize: l = α − 2λmr − λm and T = mn/l
for each edge e ∈ E:
    let C_e = ExtractCount(U_e, r, T) be the history of approximate counts
Form the initial flow:
Let f(i) be the s_i-d_i path with the fewest edges, breaking ties lexicographically
for t = 1, ..., mn/l:
    let flow g = (C_e(tn + i))_{e∈E}
    if f(i) is not an α-best-response w.r.t. g:
        switch f(i) to a best-response w.r.t. g
Output: the final s_i-d_i path f(i)

Claim 5.
Suppose that BR-Sim((G, {(s_i, d_i)}_i, c), α, r) outputs a set of update step lists {U_e}_{e∈E}. Then for each player i, the output flow from the instantiation ExtractPath((s_i, d_i), {U_e}_{e∈E}, α, r) is the same as her final flow in the best-response dynamics in BR-Sim.

Theorem 7.
Fix any ε > 4λm. Let r = (ε − 4λm)/(8λm) and α = ε − 2λmr − λm. Then given any routing game instance Γ = (G, {(s_i, d_i)}_i, c), the coordination protocol (BR-Sim(Γ, α, r), ExtractPath((s_i, d_i), {U_e}_{e∈E}, α, r)) coordinates the players to play an ε-approximate equilibrium flow, and has coordination complexity

    Õ( λm³n / (ε − 4λm)² ).

Proof.
Given the parameters we choose, the players end up playing an ε-approximate equilibrium by Lemma 16, since α + 2λmr + λm = ε.

Next, we bound the encoding length of the output of BR-Sim. Recall that l = α − 2λmr − λm, and each time a player makes a deviation in the best-response dynamics, the potential function decreases by at least l. By the same analysis as in Lemma 16, we know that the total number of deviations that occur in the dynamics (across all edges) is bounded by T = mn/l. Since each counter updates its reported count only when the count has changed by r, the total number of updates across all counters is bounded by mT/r. Also, the total length of each counter history is bounded by (T + 1)n. By Claim 4, the output list of update steps among all counters has encoding length at most

    O( (m²n / (l r)) · log(m²n²/l) ) = Õ( λm³n / (ε − 4λm)² ),

which recovers our stated bound.

To interpret the bound in Theorem 7, consider the case in which the routing game is a large game: n is substantially larger than the number of edges m, and no player has a large influence on the latency of any single edge. In our context, this means the Lipschitz parameter λ of the cost functions c_e is small. Since the range of each cost is normalized to lie between 0 and 1, it is reasonable to assume that λ = O(1/n). Given such a largeness assumption, Theorem 7 yields a coordination protocol that coordinates an ε-approximate equilibrium flow, for any ε = Ω(m/n), with coordination complexity Õ(m³/ε²), which is Õ(m³) for any constant ε.

Remark 2.
Another application of this coordination technique is the general allocation problem when the players have gross substitutes preferences. In this problem, there are n players and m types of goods, and each type of good has a supply of s. A coordinator who knows the preferences of the players can first simulate m simultaneous ascending-price auctions, one for each type of good. The analysis of [KC82] shows that the prices and the allocation in the auctions converge to a Walrasian equilibrium: each buyer is simultaneously able to buy his most preferred bundle of goods under the posted prices. We can then use a similar coordination protocol: compress an approximate version of the ascending-auction dynamics using
ApproxCount, and broadcast a message that allows each player to reconstruct the price trajectory in the auctions, which can then coordinate a high-welfare allocation.

References

[ANRW15] Noga Alon, Noam Nisan, Ran Raz, and Omri Weinstein. Welfare maximization with limited interaction. CoRR, abs/1504.01780, 2015.

[Cal84] Xavier Calsamiglia. Informational requirements of parametric resource allocation processes. Asociación Sudeuropea de Economía Teórica, 1984.

[CKRW14] Rachel Cummings, Michael Kearns, Aaron Roth, and Zhiwei Steven Wu. Privacy and truthful equilibrium selection for aggregative games. CoRR, abs/1407.7740, 2014.

[DMNS06] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference, TCC '06, pages 265–284, 2006.

[DNO14] Shahar Dobzinski, Noam Nisan, and Sigal Oren. Economic efficiency requires interaction. In Proceedings of the 46th ACM Symposium on Theory of Computing, STOC '14, pages 233–242, 2014.

[DPS02] Xiaotie Deng, Christos Papadimitriou, and Shmuel Safra. On the complexity of equilibria. In Proceedings of the 34th ACM Symposium on Theory of Computing, STOC '02, pages 67–71, 2002.

[Hay45] Friedrich A. Hayek. The use of knowledge in society. American Economic Review, 35(4):519–530, 1945.

[HHR+14] Justin Hsu, Zhiyi Huang, Aaron Roth, Tim Roughgarden, and Zhiwei Steven Wu. Private matchings and allocations. In Proceedings of the 46th ACM Symposium on Theory of Computing, STOC '14, pages 21–30, 2014.

[HHRW14] Justin Hsu, Zhiyi Huang, Aaron Roth, and Zhiwei Steven Wu. Jointly private convex programming. CoRR, abs/1411.0998, 2014.

[HS07] Elad Hazan and C. Seshadhri. Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity (ECCC), 14(088), 2007.

[Hur60] Leonid Hurwicz. Optimality and Informational Efficiency in Resource Allocation Processes. Mathematical Methods in the Social Sciences. Stanford University Press, 1960.

[KC82] Alexander S. Kelso and Vincent P. Crawford. Job matching, coalition formation, and gross substitutes. Econometrica, pages 1483–1504, 1982.

[KMRW15] Sampath Kannan, Jamie Morgenstern, Aaron Roth, and Zhiwei Steven Wu. Approximately stable, school optimal, and student-truthful many-to-one matchings (via differential privacy). In Proceedings of the 26th ACM-SIAM Symposium on Discrete Algorithms, SODA '15, pages 1890–1903, 2015.

[KN97] Eyal Kushilevitz and Noam Nisan. Communication complexity. Advances in Computers, 44:331–360, 1997.

[KNR99] Ilan Kremer, Noam Nisan, and Dana Ron. On randomized one-round communication complexity. Computational Complexity, 8(1):21–49, 1999.

[KPRU14] Michael Kearns, Mallesh M. Pai, Aaron Roth, and Jonathan Ullman. Mechanism design in large games: incentives and privacy. In Proceedings of the 5th Innovations in Theoretical Computer Science, ITCS '14, pages 403–410, 2014.

[LST15] Thodoris Lykouris, Vasilis Syrgkanis, and Éva Tardos. Learning and efficiency in games with dynamic population. CoRR, abs/1505.00391, 2015.

[MR74] Kenneth Mount and Stanley Reiter. The informational size of message spaces. Journal of Economic Theory, 8(2):161–192, June 1974.

[MS96] Dov Monderer and Lloyd S. Shapley. Potential games. Games and Economic Behavior, 14(1):124–143, 1996.

[MT07] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, FOCS '07, pages 94–103, 2007.

[NS06] Noam Nisan and Ilya Segal. The communication requirements of efficient allocations and supporting prices. Journal of Economic Theory, 129(1):192–224, 2006.

[RR14] Ryan M. Rogers and Aaron Roth. Asymptotically truthful equilibrium selection in large congestion games. In Proceedings of the 15th ACM Conference on Economics and Computation, EC '14, pages 771–782, 2014.

[RRUW15] Ryan M. Rogers, Aaron Roth, Jonathan Ullman, and Zhiwei Steven Wu. Inducing approximately optimal flow using truthful mediators. In Proceedings of the 16th ACM Conference on Economics and Computation, EC '15, pages 471–488, 2015.

[Seg06] Ilya Segal. Communication in economic mechanisms. Econometric Society Monographs, 41:222, 2006.
A A Coordination Protocol For Many-to-One Stable Matchings
Finally, we present a coordination protocol for the many-to-one stable matching problem, which is relatively straightforward. A many-to-one stable matching problem consists of k schools U = {u_1, ..., u_k}, where school u_j has capacity C_j, and n students A = {a_1, ..., a_n}. Every student a_i has a strict preference ordering ≻_{a_i} over all the schools, and each school u_j has a strict preference ordering ≻_{u_j} over the students. It will be useful for us to think of a school u's ordering over the students as assigning a unique score score_u(a) ∈ {1, ..., n} to every student a, in descending order of preference (for example, these could be student scores on an entrance exam). Therefore, each student's private data is D_i = (≻_{a_i}, {score_u(a_i)}_{u∈U}). We recall the standard notion of stability in a many-to-one matching.

Definition 9.
A matching µ : A → U ∪ {∅} is feasible and stable if:

1. (Feasibility) For each u_j ∈ U, |{i : µ(a_i) = u_j}| ≤ C_j.
2. (No Blocking Pairs with Filled Seats) For each a_i ∈ A and each u_j ∈ U such that µ(a_i) ≠ u_j, either µ(a_i) ≻_{a_i} u_j, or a′_i ≻_{u_j} a_i for every student a′_i ∈ µ^{−1}(u_j).

3. (No Blocking Pairs with Empty Seats) For every u_j ∈ U such that |µ^{−1}(u_j)| < C_j, and for every student a_i ∈ A such that a_i ≻_{u_j} ∅, we have µ(a_i) ≻_{a_i} u_j.

A simple way to specify a matching is to specify an admissions threshold for every school, i.e. a score admit(j) that represents the minimum score of a student the school is willing to accept. A set of admissions thresholds admit defines a matching µ_admit in the natural way, in which every student enrolls in her most preferred school among those that have admitted her:

    µ_admit(i) := argmax_{≻_{a_i}} { u_j | score_{u_j}(a_i) ≥ admit(j) }.

A set of admission scores is stable and feasible if the matching it induces is stable and feasible. Now we give a coordination protocol that coordinates the students to select schools that form a stable matching. The protocol crucially relies on the Gale-Shapley deferred acceptance algorithm: the coordinator first simulates the dynamics of the deferred acceptance algorithm on the student profiles, and obtains the stable matching along with the associated admission scores. It then suffices for the planner to publish the list of admission scores, so that all the students can coordinate on the stable matching. We present a score-based deferred acceptance algorithm in Algorithm 7.
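Before turning to the deferred acceptance simulation, note that the threshold-to-matching map µ_admit is easy to compute locally from a student's own data. The following Python sketch illustrates it; the data layout (preference lists and score dictionaries) is a hypothetical encoding chosen for illustration, not the paper's formal model.

```python
def matching_from_thresholds(prefs, scores, admit):
    """Each student enrolls in her most preferred school that admits her.

    prefs  -- prefs[i] lists school ids in student i's descending preference order
    scores -- scores[j][i] is student i's score at school j
    admit  -- admit[j] is school j's admission threshold
    Returns a dict mapping each student to a school id, or None if unmatched.
    """
    mu = {}
    for i, ranking in prefs.items():
        mu[i] = next(
            (j for j in ranking if scores[j][i] >= admit[j]),
            None,  # admitted nowhere
        )
    return mu
```

Since each student only needs her own preferences and scores plus the k broadcast thresholds, this map is exactly what makes the O(k log n) message of Theorem 8 sufficient.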
Algorithm 7
Deferred Acceptance (with Admission Scores)
Stab(D)
Input: the n students' data D, including their preferences over the schools and their score profiles at each school
Initialize: for each school u_j ∈ U: admission score admit(j) = n and temporarily enrolled students temp(j) = ∅; for each student a_i: µ(a_i) = ⊥
while there is some under-enrolled school u_{j′} such that |temp(j′)| < C_{j′} and admit(j′) > 0:
    admit(j′) ← admit(j′) − 1
    for each student a_i:
        if µ(a_i) ≠ argmax_{≻_{a_i}} { u_j | score_{u_j}(a_i) ≥ admit(j) } then
            temp(µ(a_i)) ← temp(µ(a_i)) \ {a_i}
            µ(a_i) ← argmax_{≻_{a_i}} { u_j | score_{u_j}(a_i) ≥ admit(j) }
            temp(µ(a_i)) ← temp(µ(a_i)) ∪ {a_i}
Output: the final admission scores {admit(j)}_{j∈[k]}

The following is a standard fact about the deferred acceptance algorithm.
Claim 6.
The final matching µ computed by Stab(D) is stable.

Consider the following coordination protocol: the coordinator will first run
Stab and broadcast the set of admission scores; then, based on these scores, each student simply enrolls in her most preferred school at which she qualifies, that is, argmax_{≻_{a_i}} { u_j | score_{u_j}(a_i) ≥ admit(j) }. Note that the set of admission scores can be encoded with at most O(k log n) bits.

Theorem 8.
There exists a coordination protocol with coordination complexity O(k log n) that coordinates the students on a stable matching.

B Missing Proofs from Section 3
First, let us fix some notation. Given any two random variables X and Y, we write I(X : Y) for the mutual information, D_KL(X : Y) for the Kullback-Leibler divergence, and δ(X, Y) for the total variation distance between them. We denote the Shannon entropy of a random variable Z by H(Z).

Lemma 1.
For p ≥ 1/k, we have ℓ(t, k, p) ≥ (2 log e) t (p − 1/k)².

Proof. Fix a protocol in which the coordinator broadcasts ℓ bits in the worst case. We will show that ℓ must be large. Let M(I) be the (random) message that Alice sends to Bob based on her input I. Let S(I) = ⟨S_1, S_2, ..., S_t⟩ and u(I) = ⟨u_1, u_2, ..., u_t⟩. Then S(I), u(I), and M(I) are random variables. Fix a value S for S(I) such that, conditioned on "S(I) = S", the probability of success is at least p (such an S must exist). Then,

    I[u(I) : M(I)] = H(M(I)) − H(M(I) | u(I)) ≤ H(M(I)) ≤ E[|M(I)|] ≤ ℓ.

On the other hand, since u_1, u_2, ..., u_t are independent (conditioned on "S(I) = S"), we have

    ℓ ≥ I[u(I) : M(I)] ≥ I[u_1 : M(I)] + ··· + I[u_t : M(I)],

implying that E_i[ I[u_i : M(I)] ] ≤ ℓ/t. Let u_{iz} denote the random variable whose distribution is the same as that of u_i when conditioned on the event M(I) = z, and let the distance between the distributions of u_{iz} and u_i be

    δ(u_{iz}, u_i) := (1/2) Σ_{w∈S_i} | Pr[u_{iz} = w] − Pr[u_i = w] |.

Then, we have

    I[u_i : M(I)] = E_z[ D_KL(u_{iz} : u_i) ]                      (by definition)
                 ≥ (2 log e) E_z[ δ(u_{iz}, u_i)² ]               (by Pinsker's inequality)
                 ≥ (2 log e) ( E_z[ δ(u_{iz}, u_i) ] )².           (by Jensen's inequality)

With one more application of Jensen's inequality, we obtain

    E_i[ I[u_i : M(I)] ] ≥ (2 log e) E_i[ ( E_z[ δ(u_{iz}, u_i) ] )² ] ≥ (2 log e) ( E_i[ E_z[ δ(u_{iz}, u_i) ] ] )².

Suppose Bob succeeds in guessing the special element of S_i with probability p_i; then E_i[p_i] = p. Furthermore, we have E_z[ δ(u_{iz}, u_i) ] ≥ |p_i − 1/k|. Thus, by Jensen's inequality,

    ( E_i[ E_z[ δ(u_{iz}, u_i) ] ] )² ≥ (p − 1/k)².

It follows that ℓ/t ≥ (2 log e)(p − 1/k)², that is, ℓ ≥ (2 log e) t (p − 1/k)².

Lemma 3.
The sampled matching in G′ has expected size at least OPT′/(3ρ).

Proof. For w ∈ W′, let d_w be the number of copies of good w that have been assigned in the matching M*. Then Σ_w d_w ≥ b · OPT′/ρ. If d_w ≥ 1, then we have the following estimate for the probability that w is chosen:

    Pr[w is chosen] ≥ (d_w/b)(1 − 1/b)^{d_w − 1}                      (6)
                   = (d_w/b)(b/(b − 1))^{−(d_w − 1)}                  (7)
                   ≥ (d_w/b) exp(−(d_w − 1)/(b − 1))    (because 1 + x ≤ e^x)   (8)
                   ≥ d_w/(eb).                          (because d_w ≤ b)        (9)

Let M′ be the sampled matching in G′. Then by linearity of expectation,

    E[|M′|] ≥ Σ_{w∈W′} d_w/(eb) ≥ (b · OPT′)/(ebρ) = OPT′/(eρ) ≥ OPT′/(3ρ).

This completes our proof.
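The chain of inequalities (6)-(9) can be sanity-checked numerically: for every 1 ≤ d_w ≤ b, the exact choice probability (d_w/b)(1 − 1/b)^{d_w − 1} is indeed at least d_w/(eb). A small Python check of this purely arithmetic fact (illustrative only, not part of the proof):

```python
import math

def choice_prob_lower_bound_holds(b):
    """Verify (d/b) * (1 - 1/b)**(d-1) >= d/(e*b) for all 1 <= d <= b."""
    for d in range(1, b + 1):
        exact = (d / b) * (1 - 1 / b) ** (d - 1)
        if exact < d / (math.e * b):
            return False
    return True
```

The check succeeds because (1 − 1/b)^{d−1} ≥ (1 − 1/b)^{b−1} ≥ 1/e for every d ≤ b, which is exactly the content of steps (8) and (9).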
C Missing Proofs from Section 4
Lemma 4.
Suppose we have a dual vector λ̂ such that ‖λ• − λ̂‖ ≤ α. Let x̂ = argmax_{x∈F} L(x, λ̂). Then ‖x̂ − x•‖ ≤ 2√α (nk)^{1/4}/√η.

Proof.
Fixing any x, the function L(x, λ) is √(nk)-Lipschitz in λ with respect to the ℓ₂ norm, because for any λ, λ′,

    |L(x, λ) − L(x, λ′)| = | Σ_{j=1}^k (λ_j − λ′_j) Σ_{i=1}^n c_j^{(i)}(x^{(i)}) |
                         ≤ ‖λ − λ′‖ · ‖( Σ_{i=1}^n c_j^{(i)}(x^{(i)}) )_{j∈[k]}‖
                         ≤ ‖λ − λ′‖ √(nk),

by Cauchy-Schwarz. Since the maximum of a family of √(nk)-Lipschitz functions is again √(nk)-Lipschitz, g(λ) = max_{x∈F} L(x, λ) is also √(nk)-Lipschitz. Since ‖λ• − λ̂‖ ≤ α, we can bound

    |L(x•, λ̂) − L(x•, λ•)| ≤ α√(nk),

and also

    |L(x•, λ•) − L(x̂, λ̂)| = |g(λ•) − g(λ̂)| ≤ α√(nk).

It follows that |L(x•, λ̂) − L(x̂, λ̂)| ≤ 2α√(nk). Note that for fixed λ̂, the function L(x, λ̂) is η-strongly concave in x. Since x̂ is a maximizer of L(·, λ̂), we have

    ‖x̂ − x•‖² ≤ (2/η) ( L(x̂, λ̂) − L(x•, λ̂) ) ≤ 4α√(nk)/η.

Therefore, we must have ‖x̂ − x•‖ ≤ 2√α (nk)^{1/4}/√η.

Lemma 9. Suppose that min_{x∈F} ‖x − x̂‖ ≤ ε. Then with probability 1 − β, x′ satisfies

    Σ_{j=1}^k ( Σ_{i=1}^n x′_{i,j} − b_j )_+ ≤ √(3k log(k/β) V̂) + √(nk) ε.

Proof.
Based on the relation between the ℓ₁ and ℓ₂ norms, we have

    min_{x∈F} ‖x − x̂‖₁ ≤ √(nk) · min_{x∈F} ‖x − x̂‖₂ ≤ √(nk) ε.

Since every x ∈ F satisfies the supply constraints, this means

    Σ_{j=1}^k ( Σ_{i=1}^n x̂_{i,j} − b_j )_+ ≤ √(nk) ε.

Let X_j = Σ_{i=1}^n x̂_{i,j}. Note that for each good j and any δ ∈ (0, 1], we have from the Chernoff-Hoeffding bound that

    Pr[ Σ_{i=1}^n x′_{i,j} > (1 + δ) X_j ] < exp(−δ² X_j / 3).

Choosing δ = √(3 log(k/β) / X_j) for each j and taking a union bound, we have, except with probability β, for all j ∈ [k],

    Σ_{i=1}^n x′_{i,j} ≤ X_j + √(3 log(k/β) X_j).

Also, V̂ = v(x̂) = ‖x̂‖₁, because for any (i, j) such that v_{i,j} = 0 we must have x̂_{i,j} = 0; otherwise player i could increase the regularized objective value by setting x̂_{i,j} = 0 in Equation (1). It follows that

    Σ_{j=1}^k ( Σ_{i=1}^n x′_{i,j} − X_j )_+ ≤ Σ_{j=1}^k √(3 log(k/β) X_j) ≤ √(3k log(k/β)) · √(Σ_j X_j) = √(3k log(k/β) V̂),

where the second inequality is Cauchy-Schwarz. Therefore,

    Σ_{j=1}^k ( Σ_{i=1}^n x′_{i,j} − b_j )_+ ≤ √(nk) ε + √(3k log(k/β) V̂),

which recovers the stated bound.
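The rounding step analyzed in Lemma 9 can be sketched directly: round each fractional entry independently and measure the total overflow above the supplies. The following Python sketch is a Monte Carlo illustration with made-up data, not the paper's actual mechanism.

```python
import random

def round_and_overflow(x_hat, b, rng):
    """Independently round a fractional assignment x_hat[i][j] in [0, 1]
    to 0/1 values and return the total overflow sum_j (sum_i x'_ij - b_j)_+."""
    loads = [0] * len(b)
    for row in x_hat:
        for j, frac in enumerate(row):
            if rng.random() < frac:  # keep entry (i, j) with probability x_hat[i][j]
                loads[j] += 1
    return sum(max(load - cap, 0) for load, cap in zip(loads, b))
```

Running this many times on a fractional solution whose column sums X_j are close to the supplies b_j gives overflows concentrated around the √(k log(k/β) V̂)-type deviation the lemma bounds.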
Theorem 4.
There exists a coordination protocol with coordination complexity O(k log(nk)) such that the parties coordinate on a matching x′ with total weight

    Σ_{j=1}^k min{ Σ_{i=1}^n v_{i,j} x′_{i,j}, b_j } ≥ ( 1 − O( √k log(k/β) / √OPT ) ) OPT,

as long as
OPT ≥ 1.

Proof. We instantiate our coordination mechanism for linearly separable convex programs with η = ε = 1/(100nk), and round the solution to x′. Applying our previous lemmas, we get that with probability at least 1 − β,

    Σ_{j=1}^k min{ Σ_{i=1}^n v_{i,j} x′_{i,j}, b_j } ≥ V̂ − log(4/β)√V̂ − √(nk) ε − √(3k log(2k/β) V̂)
        ≥ OPT − n(ε + η) − log(4/β)√V̂ − √(nk) ε − √(3k log(2k/β) V̂)
        ≥ OPT − n(ε + η) − ( log(4/β) + √(3k log(2k/β)) ) √(OPT + n(ε + η)) − √(nk) ε
        ≥ OPT − O( √k log(2k/β) √OPT ),

which recovers the stated bound; here we use OPT ≥ 1 to absorb the lower-order terms n(ε + η) and √(nk) ε, both of which are O(1/k) and O(1/√(nk)) respectively for our choice of ε and η. Note that the coordination complexity of this mechanism is O(k log(nk)).
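For intuition on where the O(k log(nk)) message length comes from: the coordinator's broadcast can be realized by quantizing each of the k coordinates (e.g., dual prices or thresholds) to one of at most nk levels, at ⌈log₂(nk)⌉ bits per coordinate. The quantization grid below is an assumption made for illustration, not the paper's exact encoding.

```python
import math

def encode_thresholds(values, n, k):
    """Encode k integers, each in [0, n*k), as one bit string of length
    k * ceil(log2(n*k)) -- i.e., O(k log(nk)) bits in total."""
    width = math.ceil(math.log2(n * k))
    return "".join(format(v, "0{}b".format(width)) for v in values)

def decode_thresholds(bits, n, k):
    """Recover the k integers from the fixed-width bit string."""
    width = math.ceil(math.log2(n * k))
    return [int(bits[i * width:(i + 1) * width], 2) for i in range(k)]
```

Each party can decode the full vector locally, so a single broadcast of k⌈log₂(nk)⌉ bits suffices, matching the coordination complexity claimed in Theorem 4.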