Constructing Large Matchings via Query Access to a Maximal Matching Oracle
arXiv [cs.DS]
Lidiya Khalidah binti Khalil and Christian Konrad
Department of Computer Science, University of Bristol, Bristol, UK
{lb17727,christian.konrad}@bristol.ac.uk

Abstract. Multi-pass streaming algorithms for Maximum Matching have been studied for more than 15 years, and various algorithmic results are known today, including 2-pass streaming algorithms that break the $1/2$-approximation barrier and $(1-\epsilon)$-approximation streaming algorithms that run in $O(\mathrm{poly}\,\frac{1}{\epsilon})$ passes in bipartite graphs and in $O((\frac{1}{\epsilon})^{\frac{1}{\epsilon}})$ or $O(\mathrm{poly}(\frac{1}{\epsilon}) \cdot \log n)$ passes in general graphs, where $n$ is the number of vertices of the input graph. However, proving impossibility results for such algorithms has so far been elusive, and, for example, even the existence of 2-pass small-space streaming algorithms with approximation factor $0.999$ has not yet been ruled out.

The key building block of all multi-pass streaming algorithms for Maximum Matching is the Greedy matching algorithm. Our aim is to understand the limitations of this approach: How many passes are required if the algorithm solely relies on the invocation of the Greedy algorithm?

In this paper, we initiate the study of lower bounds for restricted families of multi-pass streaming algorithms for Maximum Matching. We focus on the simple yet powerful class of algorithms that in each pass run Greedy on a vertex-induced subgraph of the input graph. In bipartite graphs, we show that 3 passes are necessary and sufficient to improve on the trivial approximation factor of $1/2$: We give a lower bound of $0.6$ on the approximation ratio of 3-pass algorithms of this class. We also show that $\Omega(\frac{1}{\epsilon})$ passes are required for computing a $(1-\epsilon)$-approximation, even in bipartite graphs. Last, the considered class of algorithms is not well-suited to general graphs: We show that $\Omega(n)$ passes are required in order to improve on the trivial approximation factor of $1/2$.

The
Greedy matching algorithm is the key building block of most published streaming algorithms for approximate Maximum Matching [14,23,12,22,2,13,18,21]. Given a graph $G = (V, E)$, Greedy scans the set of edges $E$ in arbitrary order and inserts the current edge $e \in E$ into an initially empty matching $M$ if possible, i.e., if neither endpoint of $e$ is yet matched by an edge in $M$. Greedy produces a maximal matching, which is known to be at least half as large as a matching of largest size.

The Greedy matching algorithm is well-suited for implementation in the streaming model of computation. A streaming algorithm processing a graph $G = (V, E)$ with $|V| = n$ receives a potentially adversarially ordered sequence of the edges of the input graph, and the objective is to solve a graph problem using as little space as possible. Many graph problems require space $\Omega(n \log n)$ to be solved in the streaming model [25], and streaming algorithms that use space $O(n\,\mathrm{poly}\log n)$ are referred to as semi-streaming algorithms. Multi-pass streaming algorithms process the input stream multiple times. Observe that Greedy constitutes a one-pass semi-streaming algorithm for Maximum Matching with approximation factor $1/2$.

The Maximum Matching problem is the most studied graph problem in the streaming model, and despite intense research efforts, the
Greedy algorithm is the best one-pass streaming algorithm known today, even if space $O(n^{2-\delta})$ is allowed, for any $\delta > 0$. Performing multiple passes over the input allows improving the approximation factor. The main questions addressed in the literature are: (1) What can be achieved in $p$ passes, for small $p$ (e.g., $p \in \{2, 3\}$), and (2) How many passes are required in order to obtain a $(1-\epsilon)$-approximation, for any $\epsilon > 0$? See Table 1 for an overview of the currently best results.

passes | Approximation | det/rand | Reference | See also
Bipartite Graphs
1 | 1/2 | deterministic | Greedy, folklore |
2 | 2 − √ ≈ . . | | |
ǫ | − ǫ | deterministic | Kale and Tirodkar [18] | [14]
O( ǫ log log ǫ ) | 1 − ǫ | deterministic | Ahn and Guha [2] | [12]
General Graphs
1 | 1/2 | deterministic | Greedy, folklore |
2 | 0 . | | |
ǫ O( ǫ ) | − ǫ | deterministic | Tirodkar [26] | [23]
O( ǫ log n ) | 1 − ǫ | deterministic | Ahn and Guha [2] |

Table 1: State-of-the-art semi-streaming algorithms for Maximum Matching.

Only a few lower bounds are known: We know that one-pass semi-streaming algorithms cannot have an approximation factor larger than $1 - \frac{1}{e}$ [19] (see also [15]). The only multi-pass lower bound known addresses the exact version of Maximum Matching, showing that computing a maximum matching in $p$ passes requires space $n^{1+\Omega(1/p)}/p^{O(1)}$ [17]. No lower bound is known for multiple passes and approximations, and, for example, the existence of a 2-pass $0.999$-approximation streaming algorithm has not yet been ruled out.

The Greedy algorithm is the key building block of all algorithms referenced in Table 1 (including those mentioned in the “See also” column). In many cases, the presented algorithms collect edges by solely executing
Greedy on specific subgraphs in each pass and output a large matching computed from the edges produced by Greedy. In this paper, we are interested in the limitations of this approach: How large a matching can be computed if Greedy is executed at most $p$ times?

Known streaming algorithms apply Greedy in different ways. For example, the 2-pass and 3-pass algorithms by Konrad [21] run Greedy on randomly sampled subgraphs that depend on a previously computed maximal matching. The multi-pass algorithms by Ahn and Guha [2] maintain vertex weights in $[0, 1]$ over the course of the algorithm and run Greedy on a threshold subgraph, i.e., on the set of edges $uv$ so that the sum of the current weights associated with $u$ and $v$ is at most $1$. The algorithm by Eggert et al. [12] runs Greedy on an edge-induced subgraph in order to find augmenting paths.

In this paper, we initiate the study of lower bounds for restricted families of multi-pass streaming algorithms for Maximum Matching that are based on Greedy. We start this line of research by addressing the probably simplest and most natural approach, which is nevertheless surprisingly powerful: the class of deterministic algorithms that run Greedy on a vertex-induced subgraph in each pass. Two known streaming algorithms fit our model:

1. A 3-pass $0.6$-approximation streaming algorithm for bipartite graphs by Kale and Tirodkar [18]. Given a bipartite graph $G = (A, B, E)$, the algorithm first computes a maximal matching in $G$, i.e., $M \leftarrow \mathrm{Greedy}(G)$. Then, the algorithm attempts to find length-3 augmenting paths by invoking Greedy twice more: $M_L \leftarrow \mathrm{Greedy}(G[A(M) \cup \overline{B(M)}])$, where $A(M)$ are the matched $A$-vertices and $\overline{B(M)}$ are the unmatched $B$-vertices. Last, $M_R \leftarrow \mathrm{Greedy}(G[\overline{A(M)} \cup B'])$, where $\overline{A(M)}$ are the unmatched $A$-vertices and $B' \subseteq B(M)$ are those matched $B$-vertices that are endpoints of length-2 paths in $M_L \cup M$. Kale and Tirodkar showed that $M \cup M_L \cup M_R$ contains a $0.6$-approximation.
2. The $(1-\epsilon)$-approximation, $O(\mathrm{poly}\,\frac{1}{\epsilon})$-pass streaming algorithm for bipartite graphs by Eggert et al. [12] can be adapted to fit our model using $O(\mathrm{poly}\,\frac{1}{\epsilon})$ invocations of Greedy.

We abstract this approach as a game between a player and an oracle: Let $G$ be a graph with vertex set $V$. The player initially knows $V$. In each round $i$, the player sends a query $\mathrm{query}(V_i)$ to the oracle, where $V_i \subseteq V$. The oracle returns a maximal matching in the vertex-induced subgraph $G[V_i]$. For this model to yield lower bounds for the streaming model, we impose that the oracle is streaming-consistent, i.e., there exists a stream of edges $\pi$ so that the oracle’s answers to the queries $(\mathrm{query}(V_i))_i$ equal runs of Greedy on the respective substreams of $\pi$ consisting of the edges of $G[V_i]$ (see the preliminaries for a more detailed definition). We denote this model as the vertex-query model (as opposed to an edge-query model, where the player may ask for maximal matchings in a subgraph spanned by a subset of edges).

Fig. 1: Illustration of the game between the player and oracle in the vertex-query model. (The player sends $\mathrm{query}(V_r)$; the oracle responds with a maximal matching in $G[V_r]$.)
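The vertex-query model can be made concrete with a short sketch. The Python below is illustrative only: the function names `greedy` and `query`, the integer vertex labels, and the representation of the stream `pi` as a list of vertex pairs are assumptions of this sketch and not notation from the paper. `greedy` is the one-pass Greedy matching algorithm, and `query` answers a vertex query by running Greedy on the substream of edges with both endpoints in the queried set, which is what streaming-consistency asks for.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def greedy(stream: Iterable[Edge]) -> List[Edge]:
    """One pass of Greedy: keep an edge iff both endpoints are still unmatched."""
    matched: Set[int] = set()
    matching: List[Edge] = []
    for u, v in stream:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def query(pi: List[Edge], queried: Set[int]) -> List[Edge]:
    """Answer query(V_i): Greedy on the substream of pi induced by the vertex set."""
    return greedy((u, v) for u, v in pi if u in queried and v in queried)


# Hypothetical example: the path 0 - 1 - 2 - 3, with the middle edge first in the stream.
pi = [(1, 2), (0, 1), (2, 3)]
round_1 = query(pi, {0, 1, 2, 3})  # Greedy takes (1, 2), which blocks both other edges
round_2 = query(pi, {0, 1, 3})     # vertex 2 is left out, so only edge (0, 1) is visible
```

In this example the first query returns a maximal matching of size 1 although the path has a perfect matching of size 2, which is the factor-$1/2$ worst case of Greedy; a second query on a smaller vertex set reveals an edge that the first answer hid.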
Our Results.
In bipartite graphs, we show that at least 3 rounds are required to improve on the approximation factor of $1/2$, and we give a lower bound of $0.6$ on the approximation factor achievable in three rounds. We also show that $\Omega(\frac{1}{\epsilon})$ rounds are required for computing a $(1-\epsilon)$-approximation. This polynomial lower bound is in line with the $\mathrm{poly}\,\frac{1}{\epsilon}$ rounds upper bound by Eggert et al. [12]. Last, we demonstrate that our query model is not well-suited to general graphs: We show that improving on a factor of $1/2$ requires $\Omega(n)$ rounds.

Further Related Work.
Besides the adversarial one-pass and multi-pass streaming models,
Maximum Matching has also been studied in the random order [22, ?, ?, ?, ?, ?] and the insertion-deletion settings [20,8,5,11]. In the random order model, where edges arrive in uniform random order, Konrad et al. [22] were the first to give a semi-streaming algorithm with approximation ratio above $1/2$. Very recently, Bernstein showed that an approximation ratio of $2/3$ can be achieved. In the insertion-deletion setting, space $n^{2-3\epsilon}$ is necessary and sufficient for computing an $n^{\epsilon}$-approximation (see [11] for a slightly improved lower bound).

Many works allow only query access to the input graph. For example, cross-additive queries, bipartite independent set queries, additive queries, cut-queries, and edge-detection queries have been considered [16,3,10,9,6,24,1], however, mainly for graph reconstruction problems. Very recently, linear queries and or-queries have been considered for graph connectivity [4].

Outline.
In Section 2, we give notation and definitions. We also define the vertex-query model and provide a construction mechanism that ensures that our oracles are streaming-consistent. Then, in Section 3, we prove that 3 rounds are required to improve on $1/2$ and that three rounds do not allow for an approximation ratio better than $0.6$. In Section 4, we show that $\Omega(\frac{1}{\epsilon})$ rounds are needed for computing a $(1-\epsilon)$-approximation, and in Section 5 we show that improving on $1/2$ in general graphs requires $\Omega(n)$ rounds. Finally, we conclude in Section 6 and give open questions.

Matchings.
Let $G = (V, E)$ be a graph with $|V| = n$. A matching $M \subseteq E$ is a subset of vertex-disjoint edges. Matching $M$ is maximal if, for every $e \in E \setminus M$, $M \cup \{e\}$ is not a matching. A maximum matching is one of largest cardinality. If the size of a matching $M$ is $n/2$, i.e., it matches all vertices of the graph, then $M$ is a perfect matching.

Notation.
We write $V(M)$ to denote the set of vertices incident to the edges of a matching $M$. For a subset of vertices $V' \subseteq V$, we denote by $G[V']$ the subgraph of $G$ induced by the vertices $V'$, i.e., $G[V'] = (V', (V' \times V') \cap E)$. For a set of edges $E' \subseteq E$, we denote by $OPT(E')$ the size of a maximum matching in the subgraph of $G$ spanned by the edges $E'$. For an integer $n$, we define $[n] := \{1, 2, \dots, n\}$.

The Vertex-query Model.
In the vertex-query model, a player and an oracle play a round-based matching game on a vertex set $V$ of size $n$ that is initially known to both parties. Over the course of the game, the oracle makes up a graph $G = (V, E)$. The objective of the player is to learn a large matching in $G$. The way the player learns edges is as follows:

In each round $1 \le i \le r$, where $r$ is the total number of rounds played, the player submits a query $\mathrm{query}(V_i)$ to the oracle, for some $V_i \subseteq V$. The oracle then determines a set of edges $M_i$, which is guaranteed to be a maximal matching in the vertex-induced subgraph $G[V_i]$. Observe that in doing so, the oracle not only commits to the fact that $M_i \subseteq E$, but also that the vertices $V_i \setminus V(M_i)$ form an independent set (which follows from the fact that $M_i$ is maximal). Furthermore, we impose that the answers to all queries are consistent with graph $G$ and that $G$ has a perfect matching.

After the $r$ query rounds, the player reports a largest matching $M_P$ that can be formed using the edges $\bigcup_{i \le r} M_i$. The approximation ratio of the solution obtained is $|M_P| / (n/2)$.

We are interested in oracles that are consistent with the streaming model. We say that an oracle is streaming-consistent if there exists an ordering $\pi$ of the edges $E$ so that, for every round $i$, $M_i$ is produced by running Greedy on the substream of $\pi$ consisting of the edges of $G[V_i]$. We will ensure that all our oracles are streaming-consistent.

Construction of Streaming-consistent Oracles.
We will construct streaming-consistent oracles as follows. Upon query $V_1$, the oracle answers with $M_1$ and places $M_1$ in the beginning of the stream $\pi$. Next, given query $V_i$, for some $i \ge 2$, the oracle first runs Greedy on the substream of $\pi$ consisting of the edges of $G[V_i]$, which produces an intermediate matching $M'$, thereby attempting to match $V_i$ using edges of previous matchings $\bigcup_{j<i} M_j$.
Observation 1. Suppose that the oracle is constructed as above. Then, given the sequence of queries $V_1, \dots, V_r$ and matchings $M_1, \dots, M_r$, there exists a sequence of queries $\tilde{V}_1, \dots, \tilde{V}_r$ that produces matchings $\tilde{M}_1, \dots, \tilde{M}_r$ such that:
– The player learns the same set of edges, i.e., for every $i \le r$: $\bigcup_{j \le i} M_j = \bigcup_{j \le i} \tilde{M}_j$, and
– No query $\tilde{V}_i$ contains a pair of vertices $u, v$ such that $uv \in \bigcup_{j<i} \tilde{M}_j$.
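One way to read the oracle construction is sketched below. It is a minimal sketch under assumptions that go beyond the text: the oracle here holds a fixed hidden edge list `hidden` (in the paper the oracle makes up the graph as the game proceeds), it extends the intermediate matching $M'$ to a maximal matching by scanning `hidden` in list order, and it appends the newly returned edges to the end of the stream `pi`. The class name, method names and the `replay` helper are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]


class StreamOracle:
    """Answers vertex queries while building a stream pi that explains all answers."""

    def __init__(self, hidden: List[Edge]) -> None:
        self.hidden = hidden        # assumption of this sketch: the graph is fixed up front
        self.pi: List[Edge] = []    # stream prefix the oracle has committed to so far

    def query(self, queried: Set[int]) -> List[Edge]:
        matched: Set[int] = set()

        def scan(edges: List[Edge]) -> List[Edge]:
            """Greedy step over `edges`, restricted to the queried vertices."""
            added = []
            for u, v in edges:
                if u in queried and v in queried and u not in matched and v not in matched:
                    matched.update((u, v))
                    added.append((u, v))
            return added

        reused = scan(self.pi)          # M': edges of earlier answers, in stream order
        new_edges = scan(self.hidden)   # extend M' to a maximal matching in G[queried]
        self.pi += new_edges            # new edges are placed at the end of the stream
        return reused + new_edges


def replay(pi: List[Edge], queried: Set[int]) -> List[Edge]:
    """Greedy on the substream of pi induced by `queried` (consistency check)."""
    matched: Set[int] = set()
    out: List[Edge] = []
    for u, v in pi:
        if u in queried and v in queried and u not in matched and v not in matched:
            matched.update((u, v))
            out.append((u, v))
    return out


# Hypothetical example on the path 0 - 1 - 2 - 3.
oracle = StreamOracle([(0, 1), (1, 2), (2, 3)])
queries = [{1, 2}, {0, 1, 2, 3}, {0, 1, 3}]
answers = [oracle.query(q) for q in queries]
```

Because each answer is maximal in its induced subgraph, edges appended later can never change an earlier answer when Greedy is replayed on the final stream, which is the property the `replay` helper checks.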
In this section, we show that the player cannot produce an approximation ratio better than $1/2$ in two rounds, even on bipartite graphs. We also show that three rounds do not allow for an approximation ratio better than $0.6$, which is achieved by the algorithm of Kale and Tirodkar [18].

In order to keep track of the information learned by the player, we will make use of structure graphs, which we discuss first.
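The approximation ratios discussed in this section compare the largest matching the player can assemble from the learned edges with a perfect matching of size $n/2$. As a hedged illustration, the following Python sketch computes that ratio for a bipartite set of learned edges using a standard augmenting-path search (Kuhn's algorithm, which is not part of the paper); the function names, the dictionary representation and the example edges are assumptions of this sketch.

```python
from typing import Dict, List, Set


def max_bipartite_matching(adj: Dict[str, List[str]]) -> int:
    """Size of a maximum matching; adj maps each A-vertex to its B-neighbours."""
    partner: Dict[str, str] = {}  # B-vertex -> A-vertex currently matched to it

    def augment(a: str, seen: Set[str]) -> bool:
        """Try to match `a`, re-routing earlier matches along an augmenting path."""
        for b in adj.get(a, []):
            if b in seen:
                continue
            seen.add(b)
            if b not in partner or augment(partner[b], seen):
                partner[b] = a
                return True
        return False

    return sum(1 for a in adj if augment(a, set()))


def approximation_ratio(adj: Dict[str, List[str]], n: int) -> float:
    """Ratio |M_P| / (n/2), assuming the hidden graph on n vertices has a perfect matching."""
    return max_bipartite_matching(adj) / (n / 2)


# Hypothetical learned edges on a 10-vertex bipartite graph (5 vertices per side).
learned = {"a1": ["b1", "b4"], "a2": ["b2"], "a3": ["b3", "b4"]}
ratio = approximation_ratio(learned, 10)
```

In this made-up instance only three of the five $A$-vertices are incident to learned edges, so the player can report at most three edges, a ratio of $3/5$.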
Observe that when the oracle answers the query $\mathrm{query}(V_i)$ and returns a maximal matching $M_i$, the player not only learns that the edges $M_i$ are contained in the input graph $G$, but also learns that the vertices $V_i \setminus V(M_i)$ form an independent set in $G$ (due to the maximality of $M_i$). We maintain the structure learned by the player and the structure committed to by the oracle (which do not have to be identical) using structure graphs:

Definition 1 (Structure graph).
A 4-tuple $(A, B, E, F)$ is a bipartite structure graph if:
– $A, B$ are disjoint sets of vertices,
– $E, F$ are disjoint sets of edges such that $(A, B, E)$ and $(A, B, F)$ are bipartite graphs,
– The structure graph admits a perfect matching, i.e., there exists a set of edges $M^*$ such that $M^* \cap F = \emptyset$ and $M^*$ is a perfect matching in the bipartite graph $(A, B, E \cup M^*)$.

From the perspective of the player, the set $E$ corresponds to the edges returned by the oracle so far, i.e., $E = \bigcup_{j \le i} M_j$, and the set $F$ corresponds to guaranteed non-edges, i.e., $F = \bigcup_{j \le i} C(V_j \setminus V(M_j))$, where $C(V')$ denotes a biclique (respecting the bipartition $A, B$) among the vertices $V'$.

In the following, we denote the structure graph learned by the player after round $i$ by $\tilde{H}_i = (A, B, \tilde{E}_i, \tilde{F}_i)$, i.e., $\tilde{E}_i = \bigcup_{j \le i} M_j$ and $\tilde{F}_i = \bigcup_{j \le i} C(V_j \setminus V(M_j))$. The oracle will also maintain a sequence of structure graphs $(H_i)_i$ with $H_i = (A, B, E_i, F_i)$ such that $H_i$ dominates $\tilde{H}_i$, for every $1 \le i \le r$. We say that a structure graph $H = (A, B, E, F)$ dominates a structure graph $\tilde{H} = (A, B, \tilde{E}, \tilde{F})$ if $\tilde{E} \subseteq E$ and $\tilde{F} \subseteq F$. This notion allows the oracle to commit to edges and non-edges that the player has not yet learned, which simplifies our arguments.

In our lower bound arguments, we make use of the following two assumptions:

Assumption 1
After round $i$, the player knows the structure graph $H_i$.

This is a valid assumption since $H_i$ dominates $\tilde{H}_i$ and thus contains at least as much information as $\tilde{H}_i$. This assumption therefore only strengthens the player. Furthermore, we will also assume a slightly strengthened version of the property discussed in Observation 1:

Assumption 2
For every $2 \le i \le r$, we assume that query $V_i$ does not contain a pair of vertices $u, v \in V_i$ such that $uv \in E_{i-1}$.

This is a valid assumption, since if such a pair $u, v$ of vertices existed in $V_i$, the oracle could simply match $u$ to $v$ in $M_i$ and the algorithm would not learn any new information.

Last, observe that the approximation ratio of the player’s strategy is completely determined by $H_r$, the oracle’s structure graph after the last round. Since $H_r$ dominates $\tilde{H}_r$, the player’s largest matching is of size at most $OPT(E_r)$. Since, by definition of a structure graph, $H_r$ admits a perfect matching, the approximation ratio achieved is $2 \cdot OPT(E_r)/n$.

Fig. 2: Illustration of the structure graph $H_1$ on a graph on 16 vertices. The matching $M$ is half the size of the matching $M^* = M^*_L \cup M^*_R$.

Fig. 3: Matching $M_2$ (in red) returned by the oracle. The red vertices constitute $A_2 \cup B_2$, i.e., the vertices of the second query. The case $|A_2^{in}| \ge |B_2^{in}|$ is illustrated here. We see that no edges from $B_{in} \times A_{out}$ are returned, and that $M_2$ does not allow us to increase the size of $M$.

Assume that $n$ is a multiple of 4. The player and the oracle play the matching game on a bipartite vertex set $V = A \,\dot\cup\, B$ with $|A| = |B| = n/2$. Consider the structure graph:

$H_1 = (A_{in} \cup A_{out}, B_{in} \cup B_{out}, M, A_{out} \times B_{out})$,

where $|A_{in}| = |A_{out}| = |B_{in}| = |B_{out}| = n/4$, and $M$ is a perfect matching between $A_{in}$ and $B_{in}$. Observe that there exists an $M^*$ outside $A_{out} \times B_{out}$ such that $M^*$ is a perfect matching in $(A, B, M \cup M^*)$, namely, $M^*$ consists of the two perfect matchings $M^*_L$ connecting $B_{out}$ to $A_{in}$ and $M^*_R$ connecting $B_{in}$ to $A_{out}$. See Figure 2 for an illustration.

We have:

Lemma 1.
There is a structure graph isomorphic to $H_1$ that dominates $\tilde{H}_1$.

Proof. Denote the first query by $A_1, B_1$ ($A_1 \subseteq A$ and $B_1 \subseteq B$). We will argue that we can relabel the sets $A_{in}, A_{out}, B_{in}, B_{out}$ so that $H_1$ dominates $\tilde{H}_1$: If $|A_1| \le n/4$, then let $A_{in}$ be an arbitrary subset of the $A$-vertices of size $n/4$ that contains $A_1$, and let $A_{out}$ be the remaining $A$-vertices. If $|A_1| > n/4$, then let $A_{out}$ be an arbitrary subset of the $A$-vertices of size $n/4$ that contains $A \setminus A_1$, and let $A_{in}$ be the remaining $A$-vertices. Proceed similarly for $B$. The oracle returns the subset $M_1 \subseteq M$ consisting of the edges that have one endpoint in $A_1$ and one endpoint in $B_1$, which is clearly maximal given that edges in $A_{out} \times B_{out}$ are forbidden.

Since $OPT(M) = |M| = n/4$, Lemma 1 implies the unsurprising fact that no one-round algorithm has an approximation ratio better than $2 \cdot \frac{n/4}{n} = \frac{1}{2}$. We argue now that an additional round does not help with increasing the approximation factor.

Theorem 2.
The best approximation ratio achievable in two rounds is $1/2$.

Proof. Let $A_2, B_2$ be the vertices of the second query. By Lemma 1, $H_1$ dominates $\tilde{H}_1$, and by Assumption 1 we can assume that the player already knows $H_1$. Let $A_2^{in} = A_2 \cap A_{in}$, $A_2^{out} = A_2 \cap A_{out}$, and define $B_2^{in}$ and $B_2^{out}$ similarly.

Suppose first that $|A_2^{in}| \ge |B_2^{in}|$. Then the oracle returns a matching $M_2$ that matches an arbitrary subset of $A_2^{in}$ of size $|B_2^{in}|$ to $B_2^{in}$, and matches $\min\{|B_2^{out}|, |A_2^{in}| - |B_2^{in}|\}$ of the remaining $A_2^{in}$-vertices arbitrarily to vertices in $B_2^{out}$. In doing so, either all $A_2^{in}$-vertices or all $B_2$-vertices are matched. Since $H_1$ indicates that there are no edges connecting the “out”-vertices, $M_2$ is therefore maximal.

Observe further that $M \cup M_2$ does not match any vertex in $A_{out}$, and, hence, only half of the $A$-vertices are matched in $M \cup M_2$. The player thus cannot report any matching of size larger than $|M|$, which constitutes a $1/2$-approximation. The case $|A_2^{in}| < |B_2^{in}|$ is identical with the roles of $A$- and $B$-vertices reversed.

Fig. 4: Illustrations of structure graphs $H_1$ and $H_2$. (a) $H_1$: The blue edges constitute a perfect matching that does not use any edges connecting $A_{out}$ to $B_{out}$. (b) $H_2$: black edges are in $M$, red edges in $E_2$, gray edges in $F_2$. (c) $H_2$: The blue dotted edges and the edge a b constitute a maximum matching.

In this section, we work with a vertex set $V = A \,\dot\cup\, B$ with $|A| = |B| = 5$ (and thus $|V| = n = 10$). By choosing disjoint copies of this vertex set, our result can be extended to graphs with an arbitrarily large number of vertices.

First Query.
Similar to the two-round case, we define the structure graph $H_1 = (A_{in} \cup A_{out}, B_{in} \cup B_{out}, M, A_{out} \times B_{out})$; however, this time $|A_{in}| = |B_{in}| = 3$ and $|A_{out}| = |B_{out}| = 2$. The matching $M$ matches $A_{in}$ to $B_{in}$, see Figure 4a.

It shall be convenient to assign labels to the vertices in our structure graph. In our arguments below, in order to avoid symmetric cases, we relabel the vertices of our structure graph as we see fit; however, we always ensure that the structure graph after relabeling is isomorphic to the structure graph before the relabeling.

First, similar to Lemma 1, it is not hard to see that a structure graph isomorphic to $H_1$ dominates $\tilde{H}_1$ (proof omitted).

Lemma 2.
There is a structure graph isomorphic to $H_1$ that dominates $\tilde{H}_1$.

Second Query.
We assume that the player knows $H_1$ after the first query (Assumption 1). Next, we define structure graph $H_2 = (A_{in} \cup A_{out}, B_{in} \cup B_{out}, M \cup E_2, A_{out} \times B_{out} \cup F_2)$, where $E_2$ = { a b , a b }, and $F_2$ = { a b , a b }. It is easy to see that $H_2$ is indeed a structure graph (see Figures 4b and 4c).

We shall prove that there is a structure graph isomorphic to $H_2$ that dominates $\tilde{H}_2$. Lemma 3 considers the case when the second query $V_2$ contains exactly three “in”-vertices, i.e., vertices from $A_{in} \cup B_{in}$, and Lemma 4 considers the case when there are fewer “in”-vertices. By Assumption 2, we do not need to consider the cases when more than three “in”-vertices are contained in $V_2$, since then $V_2$ necessarily contains a pair of vertices $u, v$ such that $uv \in M$.

Lemma 3. If the player queries exactly 3 “in”-vertices (i.e., vertices from $A_{in} \cup B_{in}$) in their second query, then there exists a structure graph isomorphic to $H_2$ that dominates $\tilde{H}_2$.

Proof. The player can either query more vertices in $A_{in}$ or in $B_{in}$, and these cases are symmetrical. Hence we only consider the case when the player queries more vertices in $A_{in}$. Due to Assumption 2, for queries that contain vertices in both $A_{in}$ and $B_{in}$, we assume these vertices do not form any edges seen in $M$.

Since we will not match any vertices in $A_{out}$, we do not need to distinguish between cases where the player queries different numbers of vertices in $A_{out}$. We distinguish between the following cases:
1. Player queries all vertices in $A_{in}$ and the query includes b : the oracle returns $M_2$ = { a b }.
2. Player queries all vertices in $A_{in}$ and only b in $B_{out}$: relabel b as b and proceed as in case (1).
3. Player queries all vertices in $A_{in}$ and no vertices in $B_{out}$: the oracle returns $M_2 = \emptyset$.
4. Player queries two vertices in $A_{in}$, one vertex in $B_{in}$ and the query includes b : relabel the “in”-vertices so that after relabeling the vertices a , a and b are included in the query. The oracle returns $M_2 = E_2$.
5. Player queries two vertices in $A_{in}$, one vertex in $B_{in}$ and only b in $B_{out}$: relabel b as b and proceed as in case (4).
6. Player queries two vertices in $A_{in}$, one vertex in $B_{in}$ and no vertices in $B_{out}$: relabel the “in”-vertices so that after relabeling the vertices a and b are included in the query. The oracle returns $M_2$ = { a b }.

In all cases considered, observe that $M_2 \subseteq E_2$. Further, edges $F_2$ ensure that $M_2$ is maximal.

We argue now that querying three “in”-vertices in the second round is best possible in the sense that querying fewer (or more) “in”-vertices does not yield more information.

Lemma 4.
If the player queries fewer than 3 “in”-vertices (i.e., vertices from $A_{in} \cup B_{in}$), then there exists a structure graph isomorphic to $H_2$ that dominates $\tilde{H}_2$.

Proof. Clearly, if the player does not query any “in”-vertices, no matching will be found, i.e., $M_2 = \emptyset$. If the player queries exactly one vertex in $A_{in}$, we can relabel this vertex as a and, if the query contains a vertex in $B_{out}$, relabel this one to be b . Then the matching found will be a subset of $E_2$. If the player queries exactly two “in”-vertices, there are two cases to consider. If they are both in $A_{in}$, we ensure one of these vertices is a by relabeling, and, if at least one vertex in $B_{out}$ is queried, potentially relabel this vertex to be b and return the edge a b . If the player queried one vertex in $A_{in}$ and one in $B_{in}$, we relabel these vertices as a , b and return the edge between them, a b . Hence the edges learned by the player are always a subset of $E_2$. In all cases considered, edges $F_2$ ensure that matching $M_2$ is maximal.

Third Query.
We assume that the player knows structure graph $H_2$. Similar to the second query, we distinguish between the cases where the player queries exactly three “in”-vertices and fewer “in”-vertices. Again, by Assumption 2, we do not need to consider the case where the player queries more than three “in”-vertices. In the following proofs, we will define different structure graphs $H_3$ that depend on the individual query.

Lemma 5.
If the player queries exactly 3 “in”-vertices in the third round, then the player cannot output a matching of size larger than $3$.

Proof. We provide the oracle’s answers when the player queries exactly three “in”-vertices. Among those cases, there are three cases to consider where the player queries more vertices in $B_{in}$ than in $A_{in}$:

1. Case 1: Player queries b , b , b . The oracle defines $H_3 = (A, B, E_3, F_3)$ such that $E_3 = M \cup E_2 \cup$ { a b , a b } and $F_3 = A_{out} \times B_{out} \cup F_2$. If the player queried both vertices in $A_{out}$, the oracle returns $M_3$ = { a b , a b }. Otherwise, $M_3$ consists of one or zero edges depending on the player’s query. In particular, we have $M_3 \subset E_3$.

In cases 2 and 3, we do not define any edges involving vertices from $A_{out}$ or $B_{out}$, so the oracle proceeds regardless of which vertices in $A_{out}, B_{out}$ the player queried.

2. Case 2: Player queries a , b , b . The oracle defines $H_3 = (A, B, E_3, F_3)$ such that $E_3 = M \cup E_2 \cup$ { a b } and $F_3 = A_{out} \times B_{out} \cup F_2 \cup$ { a b , a b }. The oracle returns $M_3$ = { a b }.
3. Case 3: Player queries b , b , a . The oracle defines $H_3 = (A, B, E_3, F_3)$ such that $E_3 = M \cup E_2 \cup$ { a b } and $F_3 = A_{out} \times B_{out} \cup F_2 \cup$ { a b , a b }. The oracle returns $M_3$ = { a b }.

Observe that the case b , a , b ∈ $V_3$ is not relevant, since a b ∈ $M$ and by Assumption 2. Figure 5 shows that in these three cases, $H_3$ is a structure graph and the largest matching that the player is thus able to return is of size 3.

If the player queries more vertices in $A_{in}$ than in $B_{in}$, we will argue that the player will not learn any edges connecting to vertices in $A_{out}$, and since the player then only holds edges incident to 3 of the 5 $A$-vertices, the player cannot report a matching of size larger than 3. If the player queries all three vertices in $A_{in}$, then they clearly cannot learn any edges connecting to $A_{out}$. If the player queries a vertex in $B_{in}$, note that we can match it with a vertex queried in $A_{in}$, and there will be no vertices left to match with vertices in $A_{out}$ (see Figure 6). Since no more non-edges are defined, it is easy to see that edges can be added to create a perfect matching.

Lemma 6.
If the player queries fewer than three “in”-vertices in the third round, then the player cannot output a matching of size larger than $3$.

Proof. We distinguish the following cases:
1. If the player queries no “in”-vertices, this is obvious, and we would have $M_3 = \emptyset$.
2. If the player queries exactly one “in”-vertex, the only possible way to obtain a matching larger than one of size 3 is to find an edge incident to b , i.e., by querying b , but we can define $F_3 = A_{out} \times B_{out} \cup F_2 \cup$ { a b , a b } and then $M_3 = \emptyset$.
3. If the player queries one vertex in $A_{in}$ and one in $B_{in}$, we can connect them by an edge, say $e$, and then $M_3 = \{e\}$ does not help in increasing the size of a matching.
4. If the player queries two vertices in $A_{in}$, the player will not be able to learn any edges to vertices in $A_{out}$, and so $A_{out}$ remains unmatched, which implies that the player cannot return a matching of size larger than 3.
5. If the player queries two vertices in $B_{in}$, the oracle defines $H_3$ as in Case 1 of Lemma 5, and the matching returned is a subset of $E_3$.

Hence we have shown that no matter what queries are made in the second and third rounds, the player cannot increase the size of the matching learned within the 10-vertex subgraph. This then holds for a graph with $|A| = |B| = n$ where $5 \mid n$, and the theorem follows.

Theorem 3.
The best approximation factor achievable in three rounds is $3/5$.

Fig. 5: Round 3 cases. (a) Case 1: Query $V_3$ includes { b , b , b }. (b) Case 1: blue dashed edges together with a b , a b constitute a perfect matching. (c) Case 2: Query $V_3$ includes { a , b , b }. (d) Case 2: blue dashed edges together with a b constitute a perfect matching. (e) Case 3: Query $V_3$ includes { b , b , a }. (f) Case 3: blue dashed edges form a perfect matching. Green vertices are queried by the player in round 3. Red edges are in E \ E , orange is E \ E , grey is F . The blue dashed edges can be added to the graph to create a perfect matching.

$(1-\epsilon)$-approximation in Bipartite Graphs Requires $\Omega(\frac{1}{\epsilon})$ Rounds

Let $G_c = (A, B, E)$ with $A = B = [c]$ be the semi-complete graph on $2c$ vertices, i.e., vertices $a \in A$ and $b \in B$ are connected if and only if $b \ge a$. Observe that $G_c$ has a unique perfect matching $M^* = \{(i, i) \in E \mid i \in [c]\}$.

Let $G$ be the disjoint union of $n/(2c)$ copies of $G_c$ (assuming for simplicity that $n$ is a multiple of $2c$). We will refer to a copy of $G_c$ in $G$ as a gadget. We now show that computing a $(1-\epsilon)$-approximation requires $\Omega(\frac{1}{\epsilon})$ queries on $G$.

Theorem 4.
Any query algorithm with approximation factor $1-\epsilon$ requires at least $\frac{1}{\epsilon} - 1$ queries, even in bipartite graphs.

Proof. Let $c = \frac{1}{\epsilon} - 1$. We consider the graph $G$. First, suppose that the algorithm does not compute a perfect matching in any of the $n/(2c)$ gadgets. Then, the computed matching is of size at most $\frac{c-1}{c} \cdot \frac{n}{2}$ and thus constitutes at best a $\frac{c-1}{c} = 1 - \frac{\epsilon}{1-\epsilon} < 1 - \epsilon$ approximation. The algorithm therefore needs to compute a perfect matching in at least one gadget. Since all gadgets are disjoint, we now argue that it requires at least $c$ queries in order to compute a perfect matching in one gadget.

Fig. 6: An example of how the oracle behaves when the player queries more vertices in $A_{in}$ than in $B_{in}$ during the third round. Green vertices are queried by the player. Red edges are in E \ E , orange is E \ E , gray is F . The player learns no edges incident to $A_{out}$ and can therefore only report a matching of size 3.

Consider thus the gadget $G_c$ and denote by $M^*$ the perfect matching in $G_c$. We claim that each query may produce at most one edge of the perfect matching $M^*$ in $G_c$: Indeed, let $A' = \{a_1, a_2, \dots, a_k\} \subseteq A$ and $B' = \{b_1, b_2, \dots, b_\ell\} \subseteq B$ be so that $A' \cup B'$ is any query submitted to the oracle. Further, suppose that $a_1 < a_2 < \dots < a_k$ and $b_1 < b_2 < \dots < b_\ell$. The oracle will return the following matching $M$:

$M = \{a_i b_{\ell+1-i} \mid i \in [\min\{k, \ell\}]\} \cap E$.

We will now argue that $M$ is maximal and $|M \cap M^*| \le 1$. To this end, let $j$ be the largest index such that $a_j b_{\ell+1-j} \in E$, which is equivalent to $j$ being the largest index so that $a_j \le b_{\ell+1-j}$. Observe that since the $(a_i)_i$ and $(b_i)_i$ are increasing, we have $a_{j'} b_{\ell+1-j'} \in E \Leftrightarrow j' \le j$, which also implies that vertices $a_{j'}$ are matched, for every $j' \le j$. Consider now a vertex $a_q$, for some $q > j$. Since $a_{j+1} > b_{\ell-j}$ and $a_q \ge a_{j+1}$, it follows that there is no edge between $a_q$ and any of the unmatched $B'$-vertices $\{b_1, b_2, \dots, b_{\ell-j}\}$. This implies that the matching $M$ is maximal.

Next, suppose that $M$ contains at least one edge from $M^*$ and let $q$ be the smallest index such that $a_q = b_{\ell+1-q}$, i.e., $(a_q, b_{\ell+1-q}) \in M^*$. Then, for any $q' > q$, we have $a_{q'} > a_q = b_{\ell+1-q} > b_{\ell+1-q'}$, which implies that $a_{q'} \ne b_{\ell+1-q'}$. Hence, at most one edge from $M^*$ is returned per query.

Last, we argue that the oracle can be made streaming-consistent: Consider any ordering of the edges so that edge $ij$ arrives before edge $ik$, for every $k < j$.

Using the oracle described in the previous proof on a single gadget $G_{n/2}$, we obtain the following corollary:

Corollary 1.
Any query algorithm that produces a maximum matching requires at least $n/2$ queries (on a graph on $n$ vertices), even on bipartite graphs.

$\Omega(n)$ Queries

Let $G$ be a bomb graph on $n$ ($n$ even) vertices $U \cup V$ with $|U| = |V| = n/2$, where both $U$ and $V$ are indexed by $[n/2]$, such that $G[V]$ is a clique, $G[U]$ is an independent set, and $u \in U$ and $v \in V$ are connected if and only if $u = v$ ($U$ and $V$ are connected via a perfect matching). Denote by $M^*$ the perfect matching between $U$ and $V$ and by $C$ the edges of the clique $G[V]$.

In the next lemma, we show that any large matching in $G$ must contain a large number of edges from $M^*$.

Lemma 7. Let $M$ be a matching in $G$. Then: $|M| \le \frac{n}{4} + \frac{1}{2} |M \cap M^*|$.

Proof.
Observe that $|M| = |M \cap M^*| + |M \cap C|$, and since there are $n/2 - |M \cap M^*|$ vertices in $V$ that are not matched to a vertex in $U$, we have $|M \cap C| \le (n/2 - |M \cap M^*|)/2$. Hence:

$$|M| = |M \cap M^*| + |M \cap C| \le |M \cap M^*| + \frac{n/2 - |M \cap M^*|}{2} = \frac{n}{4} + \frac{1}{2} |M \cap M^*|.$$

Theorem 5.
Any $r$-round query algorithm on general graphs has approximation ratio at most $\frac{1}{2} + \frac{r}{n}$ (on an $n$-vertex input graph).

Proof. Consider an arbitrary query $U' \cup V'$ so that $U' \subseteq U$ and $V' \subseteq V$. The oracle returns the following matching: First, the oracle arbitrarily pairs up all vertices of $V'$ except possibly one in case $|V'|$ is odd. Let $M$ denote this matching. If $|V'|$ is even then $M$ is returned. Suppose now that $|V'|$ is odd and let $v \in V'$ be the vertex that is not matched in $M$. Then, if $v$'s partner $u \in U$ in $M^*$ is contained in $U'$, the oracle returns $M \cup \{uv\}$; otherwise it returns $M$.

By construction, the returned matching is maximal and contains at most one edge from $M^*$: every vertex of $U'$ is adjacent only to its partner in $M^*$, and all vertices of $V'$ other than possibly $v$ are already matched in $M$. Hence, in $r$ rounds the algorithm can learn at most $r$ edges from $M^*$. By Lemma 7, the returned matching is therefore of size at most $\frac{n}{4} + \frac{r}{2}$, which constitutes a $(\frac{1}{2} + \frac{r}{n})$-approximation.

The oracle can be made streaming-consistent: Consider any edge order where we first have the edges $C$ in arbitrary order followed by $M^*$ in arbitrary order.

In this paper, we introduced a new query model that allows us to prove lower bounds for streaming algorithms for Maximum Matching that repeatedly run the Greedy matching algorithm on a vertex-induced subgraph of the input graph. We showed that three rounds are necessary and sufficient to improve on the trivial approximation factor of $1/2$ in bipartite graphs, that computing a $(1-\epsilon)$-approximation in bipartite graphs requires $\Omega(1/\epsilon)$ rounds, and that computing an approximation strictly better than $1/2$ in general graphs requires $\Omega(n)$ rounds. We conclude with open questions:

– Can we prove that computing a maximum matching in the vertex-query model in bipartite graphs requires $\Omega(n)$ rounds, or is there an algorithm that requires only $o(n)$ rounds?
– Can we prove a lower bound stronger than $\Omega(1/\epsilon)$ for computing a $(1-\epsilon)$-approximation in bipartite graphs?

References
1. Abasi, H., Bshouty, N.H.: On learning graphs with edge-detecting queries. In: Garivier, A., Kale, S. (eds.) Algorithmic Learning Theory, ALT 2019, 22-24 March 2019, Chicago, Illinois, USA. Proceedings of Machine Learning Research, vol. 98, pp. 3–30. PMLR (2019), http://proceedings.mlr.press/v98/abasi19a.html
2. Ahn, K.J., Guha, S.: Linear programming in the semi-streaming model with application to the maximum matching problem. Information and Computation 222, 59–79 (2013), 38th International Colloquium on Automata, Languages and Programming (ICALP 2011)
3. Alon, N., Beigel, R., Kasif, S., Rudich, S., Sudakov, B.: Learning a hidden matching. SIAM J. Comput. 33(2), 487–501 (2004), https://doi.org/10.1137/S0097539702420139
4. Assadi, S., Chakrabarty, D., Khanna, S.: Graph connectivity and single element recovery via linear and or queries (2020)
5. Assadi, S., Khanna, S., Li, Y., Yaroslavtsev, G.: Maximum matchings in dynamic graph streams and the simultaneous communication model. In: Krauthgamer, R. (ed.) Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016. pp. 1345–1364. SIAM (2016), https://doi.org/10.1137/1.9781611974331.ch93
6. Beame, P., Har-Peled, S., Ramamoorthy, S.N., Rashtchian, C., Sinha, M.: Edge Estimation with Independent Set Oracles. In: Karlin, A.R. (ed.) 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 94, pp. 38:1–38:21. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2018), http://drops.dagstuhl.de/opus/volltexte/2018/8355
7. Bernstein, A.: Improved Bounds for Matching in Random-Order Streams. In: Czumaj, A., Dawar, A., Merelli, E. (eds.) 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), vol. 168, pp. 12:1–12:13. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2020), https://drops.dagstuhl.de/opus/volltexte/2020/12419
8. Chitnis, R., Cormode, G., Esfandiari, H., Hajiaghayi, M., McGregor, A., Monemizadeh, M., Vorotnikova, S.: Kernelization via sampling with applications to finding matchings and related problems in dynamic graph streams. In: Krauthgamer, R. (ed.) Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016. pp. 1326–1344. SIAM (2016), https://doi.org/10.1137/1.9781611974331.ch92
9. Choi, S.: Polynomial time optimal query algorithms for finding graphs with arbitrary real weights. In: Shalev-Shwartz, S., Steinwart, I. (eds.) COLT 2013 - The 26th Annual Conference on Learning Theory, June 12-14, 2013, Princeton University, NJ, USA. JMLR Workshop and Conference Proceedings, vol. 30, pp. 797–818. JMLR.org (2013), http://proceedings.mlr.press/v30/Choi13.html
10. Choi, S.S., Kim, J.H.: Optimal query complexity bounds for finding graphs. In: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing. pp. 749–758. STOC ’08, Association for Computing Machinery, New York, NY, USA (2008), https://doi.org/10.1145/1374376.1374484
11. Dark, J., Konrad, C.: Optimal lower bounds for matching and vertex cover in dynamic graph streams. In: 35th Computational Complexity Conference, CCC 2020. LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)
12. Eggert, S., Kliemann, L., Munstermann, P., Srivastav, A.: Bipartite matching in the semi-streaming model. Algorithmica 63(1–2), 490–508 (Jun 2012)
13. Esfandiari, H., Hajiaghayi, M., Monemizadeh, M.: Finding large matchings in semi-streaming. In: Domeniconi, C., Gullo, F., Bonchi, F., Domingo-Ferrer, J., Baeza-Yates, R., Zhou, Z., Wu, X. (eds.) IEEE International Conference on Data Mining Workshops, ICDM Workshops 2016, December 12-15, 2016, Barcelona, Spain. pp. 608–614. IEEE Computer Society (2016), https://doi.org/10.1109/ICDMW.2016.0092
14. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On graph problems in a semi-streaming model. Theor. Comput. Sci. 348(2), 207–216 (Dec 2005), https://doi.org/10.1016/j.tcs.2005.09.013
15. Goel, A., Kapralov, M., Khanna, S.: On the communication and streaming complexity of maximum bipartite matching. In: Rabani, Y. (ed.) Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012. pp. 468–485. SIAM (2012), https://doi.org/10.1137/1.9781611973099.41
16. Grebinski, V., Kucherov, G.: Optimal reconstruction of graphs under the additive model. Algorithmica 28(1), 104–124 (2000), https://doi.org/10.1007/s004530010033
17. Guruswami, V., Onak, K.: Superlinear lower bounds for multipass graph processing. Algorithmica 76(3), 654–683 (2016), https://doi.org/10.1007/s00453-016-0138-7
18. Kale, S., Tirodkar, S.: Maximum matching in two, three, and a few more passes over graph streams. In: Jansen, K., Rolim, J.D.P., Williamson, D., Vempala, S.S. (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA. LIPIcs, vol. 81, pp. 15:1–15:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017), https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2017.15
19. Kapralov, M.: Better bounds for matchings in the streaming model. In: Khanna, S. (ed.) Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2013, New Orleans, Louisiana, USA, January 6-8, 2013. pp. 1679–1697. SIAM (2013), https://doi.org/10.1137/1.9781611973105.121
20. Konrad, C.: Maximum matching in turnstile streams. In: Bansal, N., Finocchi, I. (eds.) Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings. Lecture Notes in Computer Science, vol. 9294, pp. 840–852. Springer (2015), https://doi.org/10.1007/978-3-662-48350-3_70
21. Konrad, C.: A Simple Augmentation Method for Matchings with Applications to Streaming Algorithms. In: Potapov, I., Spirakis, P., Worrell, J. (eds.) 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 117, pp. 74:1–74:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2018), http://drops.dagstuhl.de/opus/volltexte/2018/9656
22. Konrad, C., Magniez, F., Mathieu, C.: Maximum matching in semi-streaming with few passes. In: Gupta, A., Jansen, K., Rolim, J.D.P., Servedio, R.A. (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 15th International Workshop, APPROX 2012, and 16th International Workshop, RANDOM 2012, Cambridge, MA, USA, August 15-17, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7408, pp. 231–242. Springer (2012), https://doi.org/10.1007/978-3-642-32512-0_20
23. McGregor, A.: Finding graph matchings in data streams. In: Proceedings of the 8th International Workshop on Approximation, Randomization and Combinatorial Optimization Problems, and Proceedings of the 9th International Conference on Randomization and Computation: Algorithms and Techniques. pp. 170–181. APPROX’05/RANDOM’05, Springer-Verlag, Berlin, Heidelberg (2005), https://doi.org/10.1007/11538462_15
24. Rubinstein, A., Schramm, T., Weinberg, S.M.: Computing Exact Minimum Cuts Without Knowing the Graph. In: Karlin, A.R. (ed.) 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 94, pp. 39:1–39:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2018), http://drops.dagstuhl.de/opus/volltexte/2018/8316
25. Sun, X., Woodruff, D.P.: Tight Bounds for Graph Problems in Insertion Streams. In: Garg, N., Jansen, K., Rao, A., Rolim, J.D.P. (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), vol. 40, pp. 435–448. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2015), http://drops.dagstuhl.de/opus/volltexte/2015/5316
26. Tirodkar, S.: Deterministic Algorithms for Maximum Matching on General Graphs in the Semi-Streaming Model. In: Ganguly, S., Pandya, P. (eds.) 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 122, pp. 39:1–39:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2018), http://drops.dagstuhl.de/opus/volltexte/2018/9938