Breaking the Quadratic Barrier for Matroid Intersection
Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai
KTH Royal Institute of Technology, Sweden
{blikstad,janvdb,sagnik,danupon}@kth.se
Abstract
The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_1 = (V, I_1) and M_2 = (V, I_2) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_1 ∩ I_2 by making independence oracle queries of the form "Is S ∈ I_1?" or "Is S ∈ I_2?" for S ⊆ V. The goal is to minimize the number of queries.

Beating the existing Õ(n^2) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first one is the classic Cunningham's algorithm [SICOMP 1986], whose Õ(n^2)-query implementations were shown by CLS+ [FOCS 2019] and Nguyễn [2019]. The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^2) independence queries was known.

In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. More generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n.

1 Introduction
Matroid intersection.
The matroid intersection problem is a fundamental combinatorial optimization problem that has been studied for over half a century. A wide variety of prominent optimization problems, such as bipartite matching, finding an arborescence, finding a rainbow spanning tree, and spanning tree packing, can be modeled as matroid intersection problems [Sch03, Chapter 41]. Hence the matroid intersection problem is a natural avenue to study all of these problems simultaneously.

Formally, a matroid is defined by the tuple M = (V, I) where V is a finite set of size n, called the ground set, and I ⊆ 2^V is a family of subsets of V, known as the independent sets, that satisfy two properties: (i) I is downward closed, i.e., all subsets of any set in I are also in I, and (ii) for any two sets A, B ∈ I with |A| < |B|, there is an element v ∈ B \ A such that A ∪ {v} ∈ I, i.e., A can be extended by an element in B. Given two such matroids M_1 = (V, I_1) and M_2 = (V, I_2) defined over the same ground set V, the matroid intersection problem asks to output the largest common independent set S ∈ I_1 ∩ I_2. The size of such a set is called the rank and is denoted by r.

The classic version of this problem that has been studied since the 1960s assumes independence query access to the matroids: Given a matroid M, an independence oracle takes a set S ⊆ V as input and outputs a single boolean bit depending on whether S ∈ I or not, i.e., it outputs 1 iff S ∈ I. The matroid intersection problem assumes the existence of two such independence oracles, one for each matroid. The goal is to design an efficient algorithm that minimizes the number of such oracle accesses, i.e., the independence query complexity of the matroid intersection problem. This is the version of the problem that we study in this work. Note that a more powerful query model called rank query has been recently studied in [LSW15, CLS+19], but we do not consider such a model in this work.

Previous work.
Starting with the work of Edmonds in the 1960s, algorithms with polynomial query complexity for matroid intersection have been studied [EDVJ68, Edm70, AD71, Law75, Edm79, Cun86, LSW15, Ngu19, CLS+19]. Among these, Cunningham's algorithm [Cun86] takes O(n r^{1.5}) queries, based on the "blocking flow" ideas similar to Hopcroft-Karp's bipartite-matching algorithm or Dinic's maximum flow algorithm. This was the best query algorithm for the matroid intersection problem for close to three decades until the recent works of Nguyễn [Ngu19] and Chakrabarty-Lee-Sidford-Singla-Wong [CLS+19], who independently showed that Cunningham's algorithm can be implemented using only Õ(nr) independence queries. In a separate line of work, Lee-Sidford-Wong [LSW15] proposed a cutting plane algorithm using Õ(n^2) independence queries. When r is sublinear in n, the result of [CLS+19, Ngu19] provides a faster (subquadratic) algorithm than that of [LSW15], but for linear r (i.e., r ≈ n), all of these results are stuck at a query complexity of Õ(n^2). This is known as the quadratic barrier [CLS+19]. The only progress towards breaking this barrier [CLS+19] falls under the following two categories. Either we need to assume the more powerful rank oracle model, where [CLS+19] provides an Õ(n^{1.5})-query algorithm. Or, we solve an approximate version of the matroid intersection problem, where [CLS+19] provides an algorithm with Õ(n^{1.5}/ε^{1.5}) query complexity for (1 − ε)-approximately solving the matroid intersection problem in the independence oracle model. Breaking the quadratic barrier with an exact algorithm in the independence query model remains open.

Our results.
We break the quadratic barrier with both deterministic and randomized algorithms:
Theorem 1.1 (Details in Theorems 4.8 and 4.9). Matroid intersection can be solved by
• a deterministic algorithm taking Õ(n^{11/6}) independence queries, and
• a randomized (Las Vegas) algorithm taking Õ(n^{9/5}) independence queries with high probability.¹
¹By high probability, we mean probability at least 1 − 1/n^c for an arbitrarily large constant c.

While we only focus on the query complexity in this paper, we note that the time complexities of our algorithms are dominated by the independence oracle queries. That is, our deterministic and randomized algorithms have time complexity Õ(n^{11/6} · T_ind) and Õ(n^{9/5} · T_ind) respectively, where T_ind denotes the maximum time taken by an oracle to answer an independence query.

Technical overview.
Below we explain the key insights of our algorithms: fast algorithms to solve a graph problem called reachability, and a simple way to combine these with the existing exact and approximation algorithms to break the quadratic barrier.
Reachability problem:
In this problem, there is a directed bipartite graph G on n vertices with bipartition (S ∪ {s, t}, S̄). We want to determine whether a directed (s, t)-path exists in G. We know the vertices of G, but not the edge set E of G. We are allowed to ask the following two types of neighborhood queries:
1. Out-neighbor query: Given v ∈ S̄ and X ⊆ S ∪ {s, t}, does there exist an edge from v to some vertex in X?
2. In-neighbor query: Given v ∈ S̄ and X ⊆ S ∪ {s, t}, does there exist an edge from some vertex in X to v?
In other words, we can ask an oracle if a "right vertex" v ∈ S̄ has an edge to or from a set X of "left vertices". This problem arose as a subroutine of previous matroid intersection algorithms that are based on finding augmenting paths [AD71, Law75, Cun86, CLS+19, Ngu19]. Naively, we can solve this problem with a quadratic (O(n^2)) number of queries: find all edges of G by making a query for all possible O(n^2) pairs of vertices. Cunningham [Cun86] used this algorithm in his framework to solve the matroid intersection problem with O(n r^{1.5}) queries. Recent results by [CLS+19, Ngu19] solved the reachability problem with Õ(nd) queries, where d is the distance between s and t in G, essentially by simulating the breadth-first search process. Plugging these algorithms into Cunningham's framework leads to algorithms for matroid intersection with Õ(nr) queries. When d is large, the algorithms of [CLS+19, Ngu19] still need Θ̃(n^2) queries to solve the reachability problem. It is not clear how to solve this problem with a subquadratic number of queries. The key component of our algorithms is subquadratic-query algorithms for the reachability problem:

Theorem 1.2 (Details in Theorems 3.1 and 3.2). The reachability problem can be solved by
• a deterministic algorithm that takes Õ(n^{5/3}) queries, and
• a randomized (Las Vegas) algorithm that takes Õ(n√n) queries with high probability.

Plugging Theorem 1.2 into standard frameworks such as Cunningham's does not directly lead us to a subquadratic-query algorithm for matroid intersection. Our second insight is a simple way to combine algorithms for the reachability problem with the exact and approximation algorithms of [CLS+19] to achieve the following theorem.
Theorem 1.3 (Details in Lemma 4.7). If there is an algorithm A that solves the reachability problem with T queries, then there is an algorithm B that solves the matroid intersection problem with Õ(n^{9/5} + n√T) independence queries. If A is deterministic, then B is also deterministic.

Theorems 1.2 and 1.3 immediately lead to Theorem 1.1. We provide proof ideas of Theorems 1.2 and 1.3 in the subsections below.

1.1 Proof idea for Theorem 1.2: Algorithm for the reachability problem
Before giving an overview of the algorithm for solving the reachability problem, we briefly mention what makes this problem hard. Note that if we discover that some v ∈ S̄ is reachable from s, we can find all out-neighbors of v in (S ∪ {s, t}) in O(log n) queries per such neighbor. We do this by using a binary search with out-neighbor queries, halving the size of the set of potential out-neighbors of v in each step. However, when we discover that some v ∈ S is reachable from s, we cannot use the same binary-search trick to efficiently find the out-neighbors of v, due to the asymmetry of the allowed queries, where we can make queries only for vertices v ∈ S̄. Such asymmetry makes it hard to efficiently apply a standard (s, t)-reachability algorithm (such as breadth-first search) on the graph.

Both our randomized and deterministic algorithms for the reachability problem follow the same framework below, where we partition the vertices in S̄ into heavy and light vertices and find vertices that can reach some heavy vertex. Our randomized and deterministic algorithms differ in how they determine whether a vertex is heavy or light.

Heavy/Light vertices.
Our reachability algorithms run in phases and keep track of a set of vertices that are reachable from the source vertex s, denoted by F (for "found"). We can assume that F contains all out-neighbors of vertices in F ∩ S̄, because we can find these out-neighbors very efficiently by doing a binary search that makes Õ(1) queries per out-neighbor. In each phase, the algorithm either
(a) increases the size of F by an additive factor of at least h for some parameter h (we use either h = √n or h = n^{1/3}), or
(b) returns whether there is an (s, t)-path.
Hence, in total, there are at most n/h many phases. To this end, for every v ∈ S̄, we say that v is F-heavy if either
(h1) v has at least h out-neighbors in S \ F, or
(h2) there is an edge from v to t.
If v ∈ S̄ is not F-heavy, we say that it is F-light. We omit F when it is clear from the context. We emphasize that the notion of heavy and light applies only to vertices in S̄. Two tasks that remain are how to determine if a vertex is heavy or light, and how to use this to achieve (a) or (b).

Heavy vertex reachability.
First, we show how to achieve (a) or (b). We assume for now that we know which vertices in S̄ are heavy or light. We can also assume that we know all out-going edges of all light vertices (e.g., the black edges in Figure 1); this requires Õ(nh) queries over all phases. Our main component is to determine a set of vertices that can reach some heavy vertex. (Heavy vertices are always in such a set.) We can do this with Õ(n) queries, essentially by simulating a breadth-first search process in reverse from the heavy vertices. This process leaves us with subtrees rooted at the heavy vertices with edges pointed towards the roots; see Figure 1 for an example. The actual algorithm is quite simple and can be found in Section 3.

Figure 1: Example of the reverse BFS process to compute heavy vertex reachability. The vertices from S and S̄ occur at alternate layers. The black edges (out-edges of light nodes) are known a priori. The green edges are traversed in the reverse BFS procedure whereas the red edges are not. The green vertices are discovered in the reverse BFS; they form a tree rooted at v_1. Vertex v_1 is heavy because of its large out-degree. Vertex v_2 is heavy because t is its out-neighbor. The path from F to v_1 and the out-neighbors of v_1 are highlighted in light-blue; these are added to F after the reverse BFS. Note that, even though v_2 is reachable from F, the path from F to v_2 is not discovered, and hence the algorithm moves to the next iteration.

Once all vertices that can reach some heavy vertex are found, we end up in one of the following situations:
• Some vertex in F can reach a heavy vertex v satisfying (h2). In this case, we know immediately that s can reach t via v.
• Some vertex in F can reach a heavy vertex v satisfying (h1). In this case, we query and add all out-neighbors of v in S \ F (taking Õ(n) queries). This adds at least h vertices to F, as desired.
• No vertex in F can reach any heavy vertex. In this case, we conclude that s does not reach t: to be able to reach t, s must be able to reach some vertex that points to t (and thus is heavy).

In contrast, in the symmetric case where in- and out-neighbor queries can be made for every vertex (and not just v ∈ S̄), we can solve the reachability problem with Õ(n) queries: this requires a simple breadth-first search starting from s where we discover neighbors using binary search.

Heavy/light categorization.
Again, this is where our randomized and deterministic algorithms differ. With randomness, we can use random sampling to approximate the out-degree of every vertex in S̄ and find all out-going edges of vertices that are potentially light. This takes Õ(nh + n^2/h) queries over all phases. For the deterministic algorithm, a naive idea is to maintain, for every v ∈ S̄, up to h out-going neighbors of v in S \ F. The challenge is that when these neighbors are included in F, we have to find new neighbors. By carefully picking these neighbors, we can argue that in total only Õ(n√(nh)) queries are needed over all phases.

Summary.
In total, in addition to the categorization of heavy/light vertices, we use Õ(n) queries to solve the heavy vertex reachability problem in each of the O(n/h) phases. We also need Õ(nh) queries over all phases to find at most h out-neighbors in S \ F of the light vertices. So, in total, our algorithm uses Õ(n^2/h + nh) queries plus the number of queries needed for the categorization, which is Õ(n^2/h + nh) for the randomized and Õ(n√(nh)) for the deterministic algorithm.

1.2 Proof idea for Theorem 1.3: Combining with existing algorithms

The standard connection between the matroid intersection problem and the reachability problem that is exploited by most combinatorial algorithms [AD71, Law75, Cun86, CLS+19, Ngu19] is based on finding augmenting paths in what is called the exchange graph. Given a common independent set S of the two matroids over the common ground set V, the exchange graph G(S) is a directed bipartite graph over the vertex set V ∪ {s, t} as in the reachability problem above, with S̄ = V \ S. The edges of the exchange graph are defined to ensure the following property: Finding an (s, t)-path in the exchange graph amounts to augmenting S, i.e., finding a new common independent set of bigger size. Conversely, if no (s, t)-path exists in the exchange graph, it is known that S is of maximum cardinality and, hence, S can be output as the answer to the matroid intersection problem. Thus the problem of augmentation in the exchange graph can be reduced to the reachability problem, where the neighborhood queries in the reachability problem correspond to queries to the matroid oracles.²

Let us suppose that we can solve the reachability problem using T queries. An immediate and straightforward way of using this subroutine to solve matroid intersection is the following: Call this subroutine iteratively to find augmenting paths to augment along in the exchange graph, thereby increasing the size of the common independent set by one in each iteration. As the size of the largest common independent set is r, we need to perform r augmentations in total. This leads to an algorithm solving matroid intersection using O(rT) independence queries.

To improve upon this, we avoid doing the majority of the augmentations by starting with a good approximation of the largest common independent set. We use the recent subquadratic (1 − ε)-approximation algorithm of [CLS+19, Section 6] that uses Õ(n^{1.5}/ε^{1.5}) independence queries to obtain a common independent set of size at least r − εr. Once we obtain a common independent set with such an approximation guarantee, we only need to perform an additional εr augmentations. This is still not good enough to obtain a subquadratic matroid intersection algorithm when combined with our efficient algorithms for the reachability problem from Theorem 1.2.

The final observation we make is that for small ε = o(n^{-1/5}), we can combine the Õ(n^{1.5}/ε^{1.5}) approximation algorithm of [CLS+19] with an efficient implementation of Cunningham's algorithm (as in [CLS+19, Ngu19]) to obtain a (1 − ε)-approximation algorithm for matroid intersection using Õ(n^{9/5} + n/ε) queries. This has a slightly better complexity than just running the approximation algorithm of [CLS+19]. The idea is to first run the Õ(n^{1.5}/ε^{1.5}) approximation algorithm with ε ≈ n^{-1/5}, and then run the Cunningham-style algorithm until the distance between s and t in the exchange graph becomes at least Θ(1/ε).

Our final algorithm is then:
1. Run the Õ(n^{1.5}/ε^{1.5})-query (1 − ε)-approximation algorithm from [CLS+19, Section 6] with ε = n^{-1/5} to obtain a common independent set S of size at least r − n^{4/5}. This step takes Õ(n^{9/5}) queries.
2. Starting with S, run the Cunningham-style algorithm as implemented by [CLS+19, Section 5] until the (s, t)-distance is at least √T, to obtain a common independent set of size at least r − O(n/√T). This step takes Õ(n(r − |S|) + n√T) = Õ(n^{9/5} + n√T) queries.
3. For the remaining O(n/√T) augmentations, find augmenting paths one by one by solving the reachability problem. This step takes Õ(n√T) queries.
Hence we obtain a matroid intersection algorithm which uses Õ(n^{9/5} + n√T) independence queries, as in Theorem 1.3.

Organization. We start with the necessary preliminaries in Section 2. In Section 3, we provide the subquadratic deterministic and randomized algorithms for augmentation. Finally, in Section 4, we combine these algorithms for augmentation with existing algorithms to obtain subquadratic deterministic and randomized algorithms for matroid intersection.

²The independence queries are more powerful than the neighborhood queries, but we are only interested in the neighborhood queries in our algorithm for the reachability problem.

2 Preliminaries
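For intuition, the choice ε = n^{-1/5} in step 1 is exactly the balance point of the first two costs above; this back-of-the-envelope derivation is ours, not quoted from [CLS+19]:

```latex
% Step 1 costs \tilde{O}(n^{1.5}/\varepsilon^{1.5}) queries and leaves at most
% \varepsilon r \le \varepsilon n augmentations, which step 2 pays for at
% \tilde{O}(n) amortized queries each:
\tilde{O}\!\left(\frac{n^{1.5}}{\varepsilon^{1.5}}\right)
\;+\;
\tilde{O}\!\left(n \cdot \varepsilon n\right)
\;=\;
\tilde{O}\!\left(\frac{n^{1.5}}{\varepsilon^{1.5}} + \varepsilon n^{2}\right).
% Equating the two terms: n^{1.5}/\varepsilon^{1.5} = \varepsilon n^{2}
% \iff \varepsilon^{5/2} = n^{-1/2} \iff \varepsilon = n^{-1/5},
% at which point both terms are \tilde{O}(n^{9/5}).
```

This is why the final bound of Theorem 1.3 carries the additive Õ(n^{9/5}) term regardless of T.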
Matroid. A matroid is a combinatorial object defined by the tuple M = (V, I), where the ground set V is a finite set of elements and I ⊆ 2^V is a non-empty family of subsets of V (denoted as the independent sets), such that the following properties hold:
1. Downward closure: If S ∈ I, then any subset S′ ⊆ S (including the empty set) is also in I.
2. Exchange property: For any two sets S_1, S_2 ∈ I with |S_1| < |S_2|, there is an element v ∈ S_2 \ S_1 such that S_1 ∪ {v} ∈ I.

Matroid Intersection.
Given two matroids M_1 = (V, I_1) and M_2 = (V, I_2) defined on the same ground set V, the matroid intersection problem is to find a maximum cardinality common independent set S ∈ I_1 ∩ I_2. When discussing matroid intersection, we will denote by r the size of such a maximum cardinality common independent set and by n the size of the ground set V.

Exchange graph.
Consider two matroids M_1 = (V, I_1) and M_2 = (V, I_2) defined on the same ground set V. Let S ∈ I_1 ∩ I_2 be a common independent set. The exchange graph G(S), w.r.t. the common independent set S, is defined to be a directed bipartite graph where the two sides of the bipartition are S and S̄ = V \ S. Moreover, there are two additional special vertices s and t (that are not included in either S or S̄) which have directed edges only to and from S̄. The directed edges (or arcs) are interpreted as follows:
1. Any edge of the form (s, v) for v ∈ S̄ implies that S ∪ {v} is an independent set in M_1.
2. Similarly, any edge of the form (v, t) for v ∈ S̄ implies that S ∪ {v} is an independent set in M_2.
3. Any edge of the form (u, v) ∈ S × S̄ implies that (S \ {u}) ∪ {v} is an independent set in M_1.
4. Similarly, any edge of the form (v, u) ∈ S̄ × S implies that (S \ {u}) ∪ {v} is an independent set in M_2.
We are interested in the notion of chordless (s, t)-paths in G(S) [Cun86, Section 2], which are defined next. For this definition, we consider a path as the sequence of vertices that take part in the path. A subsequence of a path is an ordered subset of the vertices (not necessarily contiguous) of the path where the ordering respects the path ordering.

Definition 2.1. An (s, t)-path p is chordless if there is no proper subsequence of p which is also an (s, t)-path. A chordless path in the exchange graph G(S) is sometimes called an augmenting path.

Claim 2.2 (Augmenting path). Consider a chordless path p from s to t in G(S) (if it exists), and let V(p) be the elements of the ground set (or, equivalently, vertices in the exchange graph excluding s and t) that take part in the path p. Then S ⊕ V(p), where ⊕ denotes symmetric difference, is a common independent set of M_1 and M_2.

If we examine the set S ⊕ V(p) obtained from Claim 2.2, it is clear that the number of elements added to the set S is one more than the number of elements removed from S. This observation immediately gives the following corollary, and shows the importance of the notion of exchange graphs.

Corollary 2.3. The size of the largest common independent set of M_1 and M_2 is at least |S| + 1 if and only if t is reachable from s in G(S).

It is useful to note that the shortest (s, t)-path in G(S) is always chordless. Many combinatorial matroid intersection algorithms thus focus on finding shortest (s, t)-paths. The following claim, relating the distance from s to t in G(S) and the size of S, is useful for approximation algorithms for matroid intersection.

Claim 2.4 ([Cun86]). If the length of the shortest (s, t)-path in G(S) is at least d, then |S| ≥ (1 − O(1/d)) r, where r is the size of the largest common independent set.

Matroid query oracles.
There are two primary models of query oracles associated with matroid theory: (i) the independence query oracle, and (ii) the rank query oracle. The independence query oracle, given a set S ⊆ V of a matroid M, outputs 1 iff S is an independent set of M (i.e., iff S ∈ I). The rank query oracle, given a set S ⊆ V, outputs the rank of S, rk_M(S) := max_{T ⊆ S : T ∈ I} |T|, i.e., the size of the largest independent set contained in S. Clearly, if S itself is an independent set, then rk_M(S) = |S|. Hence, a rank query oracle is at least as powerful as the independence query oracle. In this work, we are however interested primarily in the independence query oracle model. Next, we state two claims regarding the independence query oracle that we use in the paper.

Claim 2.5 (Edge discovery). By issuing one independence query each, we can find out
(i) given a vertex v ∈ S̄, whether v is an out-neighbor of s; or
(ii) given a vertex v ∈ S̄, whether v is an in-neighbor of t; or
(iii) given a vertex v ∈ S̄ and a subset X ⊆ S, whether there exists an edge from some vertex in X to v; or
(iv) given a vertex v ∈ S̄ and a subset X ⊆ S, whether there exists an edge from v to some vertex in X.

Claim 2.5 follows from observing that we can make the following kinds of independence queries: (i-ii) whether S ∪ {v} is an independent set in M_1 respectively M_2, and (iii-iv) whether S ∪ {v} \ X is an independent set in M_1 respectively M_2. Note that these edge-discovery queries can simulate the neighborhood queries in the reachability problem.

With these kinds of queries, we can perform a binary search to find an in-/out-neighbor of v ∈ S̄. The following lemma is proven in [CLS+19, Lemma 11] and also mentioned in [Ngu19]. We skip the proof in this paper.
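Before stating it formally, here is a small, self-contained sketch of both the edge-discovery queries of Claim 2.5 and the binary search built on top of them. The function names and the toy partition matroid below are ours, not from the paper or from [CLS+19, Ngu19]:

```python
def make_partition_indep(parts):
    """Toy independence oracle for a partition matroid: `parts` is a list of
    (part, capacity) pairs; a set is independent iff it respects every cap."""
    def indep(T):
        T = set(T)
        return all(len(T & part) <= cap for part, cap in parts)
    return indep

def in_edge_query(S, indep1, v, X):
    """Claim 2.5(iii): one M_1-independence query answers "is there an
    exchange-graph edge from some u in X (a subset of S) to v?"."""
    return indep1((set(S) | {v}) - set(X))

def find_in_neighbor(S, indep1, v, X):
    """Claim 2.6 flavor: binary search over X, O(log |X|) independence
    queries. Returns one in-neighbor u in X of v, or None if none exists."""
    X = list(X)
    if not in_edge_query(S, indep1, v, X):
        return None
    while len(X) > 1:
        half = X[:len(X) // 2]
        # Keep a half of X that still contains an in-neighbor of v.
        X = half if in_edge_query(S, indep1, v, half) else X[len(X) // 2:]
    return X[0]

# Toy example: M_1 is a partition matroid over {a,b,c,d} with parts {a,b}
# (capacity 1) and {c,d} (capacity 2); S = {a,c} is independent.
indep1 = make_partition_indep([(frozenset("ab"), 1), (frozenset("cd"), 2)])
S = {"a", "c"}
# S + b is dependent (a and b share a capacity-1 part) but S - a + b is
# independent, so a is the unique in-neighbor of b among S.
assert find_in_neighbor(S, indep1, "b", ["a", "c"]) == "a"
```

Out-neighbor search is symmetric, using independence queries to M_2; since each halving step costs one query, the search uses O(log |X|) queries, in line with the bound claimed below.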
Claim 2.6 (Binary search with independence/neighborhood queries, [Ngu19, CLS+19]). Consider a vertex v ∈ S̄ and a subset X ⊆ S ∪ {s, t}. By issuing O(log r) independence queries to M_1, we can find a vertex u ∈ X such that there is an edge (u, v) (i.e., u is an in-neighbor of v), or otherwise determine that no such edge exists. Similarly, by issuing O(log r) independence queries to M_2, we can find a vertex u ∈ X such that there is an edge (v, u) (i.e., u is an out-neighbor of v).

We will assume InEdge(v, X) respectively OutEdge(v, X) are procedures which implement Claim 2.6.

3 Algorithms for augmentation
From Claim 2.2, we know the following: Given a common independent set S, either S is of maximum cardinality or there exists a (directed) (s, t)-path in the exchange graph G(S). In this section, we consider the (s, t)-reachability problem in G(S) using independence oracles. Our main results in this section are the following two theorems. We denote the size of S as |S| = r in both of these theorems.

Theorem 3.1 (Randomized augmentation). There is a randomized algorithm which with high probability uses O(n√r log n) independence queries and either determines that S is of maximum cardinality or finds an augmenting path in G(S).

Theorem 3.2 (Deterministic augmentation). There is a deterministic algorithm which uses O(n r^{2/3} log r) independence queries and either determines that S is of maximum cardinality or finds an augmenting path in G(S).

Section 1.1 already gives an informal overview of the augmentation algorithm. In this section, we provide more details so that the reader can be convinced of the correctness of the algorithm. The algorithm for augmentation, denoted as the Augmentation algorithm for easy reference, runs in phases and keeps track of a set F of vertices that are reachable from the vertex s. Let F_S and F_S̄ denote the bipartition of F inside S and S̄, i.e., F_S = F ∩ S and F_S̄ = F ∩ S̄. In each phase, the algorithm will increase the size of F_S by an additive factor of at least h until the algorithm discovers an (s, t)-path (or, otherwise, discovers there is no such path). Hence, in total, there are at most |S|/h many phases. We now give an overview of how to implement each phase.

Note that, without loss of generality, we can assume that the set F_S contains all vertices that are out-neighbors of vertices in F_S̄. This is because whenever a vertex v ∈ S̄ is added to F_S̄, we can quickly add all of v's out-neighbors in S \ F_S into the set F_S by using Claim 2.6. This requires O(log r) independence queries for each such out-neighbor. Hence, in total, this procedure uses at most O(n log r) independence queries, since each u ∈ S \ F_S is added to F_S at most once.

Heavy and light vertices.
Before explaining what the algorithm does in each phase, we introduce the notion of heavy and light vertices: We divide the vertices in S̄ \ F_S̄ into two categories. We call a vertex v ∈ S̄ \ F_S̄ heavy if it either has an edge to t or has at least h out-neighbors in S \ F_S. The vertices in S̄ \ F_S̄ that are not heavy are denoted as light (see Figure 1 for reference; the heavy nodes are highlighted in light-yellow). Note that both these notions are defined in terms of out-degrees, i.e., a heavy vertex can have arbitrary in-degree and so can a light vertex. Also, note that the notions of heavy and light vertices are defined w.r.t. the set F_S. Because the set F_S changes from one phase to the next, so does the set of heavy and light vertices.

Description of phase i. Let us assume, for the time being, that there is an efficient procedure to categorize the vertices in S̄ \ F_S̄ into the sets of heavy and light vertices. We first apply this procedure at the beginning of phase i.

Now, for simplicity, consider an easy case: In phase i, there is a heavy vertex that has an in-neighbor in F_S. In this case, we can go over all vertices in S̄ \ F_S̄ to find such a heavy vertex; this can be done with n many independence queries. Once we find such a heavy vertex, we include it in F_S̄ and all of its out-neighbors in F_S. Note that, in this case, either of the following two things can happen: either we have increased the size of F_S by at least h, as the heavy vertex has at least h out-neighbors in S \ F_S; or the heavy vertex we found has t as its out-neighbor, in which case we have found an (s, t)-path.

(Note that r usually denotes the size of the maximum common independent set, which is an upper bound on the size of the vertex set S. We abuse the notation and use r here to denote |S|.)

Unfortunately, this easy case may not apply in phase i. In that case, we perform an additional procedure called the reverse breadth-first search or, in short, reverse BFS.
The goal of the reverse BFS is to find a heavy vertex reachable from F. Before describing this procedure, note the following two properties of the light vertices:
1. A light vertex will remain a light vertex even if we increase the size of F_S.
2. We can assume that we know all out-neighbors of any light vertex.
Property 2 needs some explanation. This property holds because of two observations: (i) all out-neighbors of a light vertex can be found with O(h log n) independence queries using Claim 2.6, and (ii) because of Property 1, across all phases, we need to find the out-neighbors of a light vertex only once. So, even though we need to make O(nh log n) queries in total, this cost amortizes across all phases.

The idea is, as before, to discover a heavy vertex which is reachable from F so that we can include all of its out-neighbors in F_S (for example, consider the heavy vertex v_1 in Figure 1). So our goal is to find some path from F to a heavy vertex (consider the path highlighted in light-blue in Figure 1). This naturally implies the need for doing a reverse BFS from the heavy vertices. We also note that any path from F to t must pass through a heavy vertex (the vertex just preceding t must by definition be heavy). Hence, if our reverse BFS fails to find a path from F to some heavy vertex, the algorithm has determined that no (s, t)-path exists.

What remains is to see how to implement the reverse BFS procedure efficiently. To this end, we exploit Property 2 of the light vertices and assume that we know all edges directed from S̄ to S that the reverse BFS procedure needs to visit. This follows from the following crucial observation: No internal node of the reverse BFS forest is a heavy node; in other words, the heavy vertices occur only as root nodes of the reverse BFS trees. This is because if, along the traversal of a reverse BFS starting from a heavy node v_1, we reach another heavy node v_2, we can ignore v_2, as the reverse BFS starting from node v_2 has already taken care of processing v_2. This means that any edge in S̄ × S that takes part in the reverse BFS procedure must originate from a light vertex and, hence, is known a priori due to Property 2. All that remains for the reverse BFS procedure is to discover in-neighbors of vertices in S̄ using edges from S × S̄. By Claim 2.6, each such in-neighbor can be found by making O(log r) independence queries. In total, the reverse BFS procedure uses Õ(n) independence queries.

Post-processing.
Note that, in order to use Claim 2.2, the (s, t)-path needs to be chordless. However, the (s, t)-path p that the algorithm outputs has no such guarantee. So, as a post-processing step, the algorithm uses an additional Õ(r) independence queries to convert this path into a chordless path: Consider any vertex v ∈ V(p) ∩ S̄, and let u be the parent of v and w the child of v on the path p. The vertex v needs to check whether it has an in-neighbor other than u among its ancestors in V(p), or an out-neighbor other than w among its descendants in V(p). Since the length of the path obtained from the previous step is O(r) (because |S| = r and the path does not contain any cycle), this requires O(log r) independence queries per check. If no vertex in V(p) ∩ S̄ has such an in- or out-neighbor, then it is easy to see that p is indeed a chordless path. If there is such an in-neighbor u′ of v, then we remove all vertices of V(p) strictly between u′ and v, and the resulting subsequence is still an (s, t)-path. A similar procedure is used when such an out-neighbor is discovered. In total, this takes O(r log r) independence queries, since each vertex can be removed from the path at most once.

Cost analysis.
The total number of queries needed to implement phase i is a sum of two terms: (i) the number of queries needed to partition the vertices into heavy and light categories, and (ii) the number of queries needed to run the reverse BFS procedure. We have seen that (ii) can be implemented with Õ(n) independence queries. For (i), we present two algorithms: a randomized sampling algorithm, and a deterministic algorithm which is slightly less efficient than the randomized one. This is the main technical difference between the algorithm of Theorem 3.1 and that of Theorem 3.2. The cost analysis for (i) is also amortized, and the total number of queries needed across all phases is Õ(max{nh, nr/h}) for the randomized and Õ(n√(rh)) for the deterministic implementation. Setting h = √r for the randomized and h = r^{1/3} for the deterministic version, we see that the total randomized query complexity of augmentation is Õ(n√r) and the deterministic query complexity is Õ(n r^{2/3}).

Heavy and light vertices

We start by reminding the reader of the definition of the heavy and light vertices in S̄ \ F_S̄.

Definition 3.3.
We call a vertex v ∈ S̄ \ F_S̄ heavy if either (v, t) is an edge of G(S) or v has at least h out-neighbors in S \ F_S. Otherwise we call v light.

To check whether v has an edge to t is easy and requires only a single independence query: "Is S ∪ {v} independent in M₂?" The difficulty lies in the case when this is not so and we need to determine whether v has out-degree at least h into S \ F_S. We present two algorithms to solve this categorization problem: a randomized sampling algorithm, and a less efficient deterministic algorithm. More concretely, we show the following two lemmas.

Lemma 3.4.
There is a randomized categorization procedure which, with high probability, categorizes heavy and light vertices in the set S̄ \ F_S̄ correctly by issuing O(n log n) independence queries per phase and an additional O(nh log n) independence queries over the whole run of the Augmentation algorithm.
Lemma 3.5.
There exists a deterministic categorization procedure which uses O(n√(rh) log r) queries over the whole run of the Augmentation algorithm.
The proofs of these two lemmas are deferred to Section 5.
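Both categorization procedures, as well as the InEdge/OutEdge primitives used by the reverse BFS below, boil down to the binary-search idea of Claim 2.6: a single independence query can test whether v has an out-neighbor inside an entire candidate set, so repeatedly halving the set locates one neighbor with a logarithmic number of queries. The sketch below is illustrative only (a plain membership oracle stands in for the independence query; the names are not the paper's API). It also shows the refinement used by the deterministic procedure of Section 5.2: if the candidates are sorted by weight and the search always prefers the first half, it returns a minimum-weight out-neighbor.

```python
def find_out_neighbor(candidates, has_edge_into):
    """Return one out-neighbor of v among `candidates`, or None.

    `has_edge_into(T)` stands in for a single independence query of the
    form "does v have an out-neighbor inside T?" (Claim 2.6 style).
    Preferring the first half at every step returns the leftmost match,
    so if `candidates` is sorted by weight w(u), the result is a
    minimum-weight out-neighbor (the Section 5.2 strategy).
    """
    if not has_edge_into(candidates):
        return None  # v has no out-neighbor among the candidates
    lo, hi = 0, len(candidates)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_edge_into(candidates[lo:mid]):  # neighbor in the first half?
            hi = mid
        else:                                  # otherwise it is in the second
            lo = mid
    return candidates[lo]

# Toy oracle: v's true out-neighbors are {3, 7}; candidates are listed in
# increasing order of (hypothetical) weight, so 3 is the cheapest match.
print(find_out_neighbor([2, 9, 3, 5, 7], lambda T: any(u in {3, 7} for u in T)))  # → 3
```

Each call issues 1 + ⌈log₂ |candidates|⌉ oracle queries, matching the O(log r) bound of Claim 2.6.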
3.3 Heavy vertex reachability

In this section, we present the reverse BFS in Algorithm 3.7 and analyze some of its properties. Recall that the reverse BFS is run once in each phase of the algorithm to find some vertex in F which can reach some heavy vertex. We also remind the reader of the example in Figure 1. In this section, we prove the following.

Lemma 3.6 (Heavy vertex reachability). There is an algorithm (Algorithm 3.7:
ReverseBFS) which, given F such that there are no edges from F_S̄ to S \ F_S, a categorization of S̄ \ F_S̄ into heavy and light, and all out-edges of the light vertices to S \ F_S, uses O(n log r) queries and either finds a path from some vertex in F to a heavy node, or otherwise determines that no such path exists.

We next provide the pseudo-code (Algorithm 3.7).

Algorithm 3.7
ReverseBFS
Input:
Categorization of S̄ \ F_S̄ into heavy and light; and a set LightEdges containing all out-edges of the light vertices.
Output:
A path from F to some heavy vertex, if one exists.

1: Q ← {v ∈ S̄ \ F_S̄ which are heavy}
2: NotVisited ← (S ∪ S̄ ∪ {s, t}) \ Q
3: while Q ≠ ∅ do
4:   Pop a vertex v from Q.
5:   if v ∈ F then
6:     return the path from v to a heavy vertex in the BFS-forest.
7:   else if v ∈ S̄ \ F_S̄ then
8:     while u = InEdge(v, NotVisited) is not ∅ do
9:       Push u to Q and remove it from NotVisited.
10:  else if v ∈ S \ F_S then
11:    for u ∈ NotVisited such that (u, v) ∈ LightEdges do
12:      Push u to Q and remove it from NotVisited.
13: return "NO PATH EXISTS"

Correctness.
We first argue that the algorithm is correct. When a vertex v ∈ S̄ \ F_S̄ is processed by the algorithm, each unvisited in-neighbor will be added to the queue Q in the while loop in line 8. When a vertex v ∈ S \ F_S is processed by the algorithm, any edge from NotVisited to v must originate from a light vertex, since NotVisited contains no heavy vertices and we are guaranteed that no edge from F_S̄ to S \ F_S exists. Hence Algorithm 3.7 will eventually process every vertex reachable, by traversing edges in reverse, from the heavy vertices.

Cost analysis.
The only place Algorithm 3.7 uses independence queries is in line 8. Each vertex will be discovered at most once by the binary search in InEdge. This means that we make at most n calls to InEdge, each using O(log r) queries by Claim 2.6. Hence the reverse BFS uses O(n log r) queries per phase.

3.4 Augmenting path algorithm

We now present the main augmenting path algorithm, as explained in the overview in Section 3.1.

Algorithm 3.8
Augmentation
Input:
Two matroids M₁ = (V, I₁) and M₂ = (V, I₂), and a common independent set S ∈ I₁ ∩ I₂.
An augmenting (s, t)-path in G(S), if one exists.

1: F ← {s}
2: LightEdges ← ∅
3: while t ∉ F do   ▷ Description of a phase.
4:   Categorize v ∈ S̄ \ F_S̄ into heavy and light.   ▷ See Sections 5.1 and 5.2.
5:   for each new light vertex v do
6:     Use OutEdge to find all out-neighbors of v in S \ F_S.
7:     Add edges (v, u) to LightEdges for each such out-neighbor u.
8:   p ← ReverseBFS(S, F, LightEdges)   ▷ See Section 3.3.
9:   if p = "NO PATH EXISTS" then
10:    return "NO PATH EXISTS"
11:  else   ▷ p is a path from F to a heavy vertex
12:    Denote by V(p) the vertices on the path p.
13:    Add all v ∈ V(p) to F.
14:    for v ∈ V(p) ∩ S̄ do
15:      while u = OutEdge(v, (S \ F) ∪ {t}) is not ∅ do
16:        Add u to F.
17: Post-process the (s, t)-path found to make it chordless.
18: return the augmenting path.

Note that we have not specified whether we are using the randomized or the deterministic categorization of heavy and light vertices from Sections 5.1 and 5.2. We will for now treat this categorization procedure as a black box which is always correct.

We start by stating some invariants of Algorithm 3.8:

1. F contains only vertices reachable from s. In fact, for each vertex in F we have found a path from s to this vertex.
2. LightEdges contains all out-edges from light vertices to S \ F.
3. At the beginning of each phase, there exist no v ∈ F and u ∈ (S ∪ {s, t}) \ F such that (v, u) is an edge in G(S). This is because whenever v ∈ S̄ is added to F, all of v's neighbors are also added; see line 15.

Correctness.
When the algorithm outputs an (s, t)-path, the path clearly exists, by Invariant 1. So it suffices to argue that the algorithm does not return "NO PATH EXISTS" incorrectly. Note that the algorithm only returns "NO PATH EXISTS" when ReverseBFS does so, that is, when there is no path from F to a heavy vertex (by Lemma 3.6). So suppose that this is the case, and suppose also, for the sake of a contradiction, that an (s, t)-path p exists in G(S). Denote by v the vertex preceding t on the path p. By Invariant 3 we know that v is not in F. But then v is heavy, since (v, t) is an edge of G(S). Hence a subpath of p is a path from F to the heavy vertex v, which is the desired contradiction.

Number of phases.
We argue that there are at most r/h + 1 phases of the algorithm. After a phase, either the algorithm returns "NO PATH EXISTS" (in which case this was the last phase), or some path p was found by the reverse BFS. Then V(p) must include some heavy vertex v, so all neighbors of v will be added to F in line 15. Thus we know that either t was added to F (in which case this was the last phase), or at least h vertices from S were added to F. Since |S| ≤ r in the beginning, this can happen at most r/h times.

Number of queries.
We analyse the number of independence queries used by the different parts of the algorithm:

• ReverseBFS (Algorithm 3.7) is run once each phase, and uses O(n log r) queries per call by Lemma 3.6. This contributes a total of O((nr/h) log r) independence queries over all phases.

• Each u ∈ S is discovered at most once by the OutEdge call on line 15. So this line contributes a total of O(n log r) independence queries.

• Each vertex becomes light at most once over the run of the algorithm. When this happens, the algorithm finds all of its (up to h) out-neighbors on line 6, using OutEdge calls. This contributes a total of O(nh log r) independence queries.

• The post-processing can be performed using O(r log r) independence queries, as explained in Section 3.1.

• The heavy/light categorization uses O(nh log n + (nr/h) log n) independence queries when the randomized procedure is used, by Lemma 3.4. When the deterministic categorization procedure is used, we use O(n√(rh) log r) independence queries instead, by Lemma 3.5.

We see that, in total, the algorithm uses:

• O(n√r log n) independence queries with the randomized categorization, setting h = √r.
• O(n r^{2/3} log r) independence queries with the deterministic categorization, setting h = r^{1/3}.

The above analysis proves Theorems 3.1 and 3.2.

Remark. When the randomized categorization procedure fails, Algorithm 3.8 will still always return the correct answer, but it might use more independence queries. So Algorithm 3.8 is in fact a Las Vegas algorithm with expected query complexity O(n√r log n).

Remark. We note that our algorithm cannot be used to find which vertices are reachable from s using a subquadratic number of queries.

4 Subquadratic Matroid Intersection

There are two hurdles to getting a subquadratic algorithm for matroid intersection. Firstly, standard augmenting path algorithms need to find the augmenting paths one at a time, since after augmenting along a path the edges in the exchange graph change (some edges are added, some removed).
This is unlike bipartite matching, where a set of vertex-disjoint augmenting paths can be augmented along in parallel. It is not clear how to find the augmenting paths faster than Θ(n) queries each, so these standard augmenting path algorithms are stuck at Ω(nr) independence queries.

To overcome this, Chakrabarty, Lee, Sidford, Singla, and Wong [CLS+19, Section 6] introduce the notion of augmenting sets, which allows multiple parallel augmentations. Using augmenting sets, they present a subquadratic (1 − ε)-approximation algorithm using Õ(n^{1.5}/ε^{1.5}) independence queries:

Lemma 4.1 (Approximation algorithm [CLS+19]). There exists a (1 − ε)-approximation algorithm for matroid intersection using O(n√n √(log r)/(ε√ε)) independence queries.

The second hurdle is that when the distance d between s and t is large, the breadth-first algorithms of [CLS+
19, Ngu19] use Θ̃(dn) independence queries to compute the distance layers, which is Ω(nr) when d ≈ r. Here our algorithm from Section 3 helps, since it can find a single augmenting path using a subquadratic number of independence queries even when the distance d is large.

So our idea is as follows:

• Start by using the subquadratic approximation algorithm. This avoids having to perform the majority of the augmentations one by one.
• Continue with the fast implementation [CLS+19, Section 5] of the Cunningham-style blocking-flow algorithm.
• When the (s, t)-distance becomes too large, fall back to using the augmenting-path algorithm from Section 3 to find the (few) remaining augmenting paths.
Algorithm 4.2 Subquadratic Matroid Intersection

1: Run the approximation algorithm (Lemma 4.1) with ε = n^{1/5} r^{−2/5} log^{−1/5} r to obtain a common independent set S of size at least (1 − ε)r = r − n^{1/5} r^{3/5} log^{−1/5} r.
2: Starting with S, run Cunningham's algorithm (as implemented by [CLS+19]) until the distance between s and t becomes larger than d.
3: Keep running
Augmentation (Algorithm 3.8) from Section 3 and augmenting the current common independent set along the obtained (s, t)-path (as in Claim 2.2), until no (s, t)-path can be found in the exchange graph.

The choice of d will be different depending on whether we use the randomized or the deterministic version of Algorithm 3.8. In order to run Algorithm 4.2, we need to know r so that we may choose ε (and d) appropriately. However, the size r of the largest common independent set is unknown. We note that it suffices, for the purpose of the asymptotic analysis, to use a 2-approximation r̄ of r (that is, r̄ ≤ r ≤ 2r̄). It is well known that such an r̄ can be found with O(n) independence queries by greedily finding a maximal common independent set of the two matroids. Now we can bound the query complexity of Algorithm 4.2.

Lemma 4.3.
Line 1 of Algorithm 4.2 uses O(n^{6/5} r^{3/5} log^{4/5} r) independence queries.

Note that, unlike in Section 3, we now use the normal definition of r as the size of the maximum-cardinality common independent set of the two matroids.

Proof. The approximation algorithm uses O(n^{1.5} √(log r)/ε^{1.5}) = O(n^{6/5} r^{3/5} log^{4/5} r) independence queries when ε = n^{1/5} r^{−2/5} log^{−1/5} r.

Lemma 4.4.
Line 2 of Algorithm 4.2 uses O(n^{6/5} r^{3/5} log^{4/5} r + nd log r) independence queries.

Proof. There are two main parts of Cunningham's blocking-flow algorithm.

• Computing the distances. The algorithm runs several BFSs to compute the distances. The total number of independence queries for all of these BFSs can be bounded by O(dn log r), since the distances are monotone, so each vertex is tried at a specific distance at most once. For more details, see [CLS+
19, Section 5.1].

• Finding the augmenting paths. Given the distance layers, a single augmenting path can be found with O(n log r) independence queries by a simple depth-first search. Again, we refer to [CLS+
19, Section 5.2] for more details. Since we start with a common independent set S of size (1 − ε)r = r − n^{1/5} r^{3/5} log^{−1/5} r, we know that S can be augmented at most n^{1/5} r^{3/5} log^{−1/5} r additional times. Hence a total of O(n^{6/5} r^{3/5} log^{4/5} r) independence queries suffices to find all of these augmenting paths.

Remark. We note that if we skip line 3 in Algorithm 4.2, we thus get a (1 − 1/d)-approximation algorithm (by Claim 2.4) using Õ(n^{6/5} r^{3/5} + nd) independence queries, which beats the Õ(n^{1.5}/ε^{1.5})-query approximation algorithm when ε = o(n^{1/5} r^{−2/5}).
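Lemma 4.6 below contributes the term (r/d)·T, which is then balanced against the nd log r term of Lemma 4.4 by the choice of d. A quick numeric sanity check of that balancing (the concrete values of n, r, T here are arbitrary and purely for illustration):

```python
import math

n, r = 10**6, 10**5
T = n * math.isqrt(r)   # e.g. T = Θ(n·√r), as for the randomized Augmentation
L = math.log(r)

def cost(d):
    # the two d-dependent terms of the query bound
    return n * d * L + (r / d) * T

d_star = math.sqrt(r * T / (n * L))
# at d_star the two terms coincide, so by AM-GM the sum is minimized there
assert abs(n * d_star * L - (r / d_star) * T) < 1e-6 * cost(d_star)
assert cost(d_star) <= min(cost(2 * d_star), cost(d_star / 2))
```

Since both terms equal √(nrT log r) at d = √(rT/(n log r)), this is exactly the second term of Lemma 4.7.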
Line 3 of Algorithm 4.2 uses O((r/d)·T) independence queries, where T is the number of independence queries used by one invocation of Augmentation (Algorithm 3.8).

Proof.
After line 2, the algorithm has found a common independent set of size at least (1 − O(1/d))r = r − O(r/d), by Claim 2.4. This means that only O(r/d) additional augmentations need to be performed.

By Lemmas 4.3, 4.4 and 4.6, we see that Algorithm 4.2 uses a total of O(n^{6/5} r^{3/5} log^{4/5} r + nd log r + (r/d)·T) independence queries. If we pick d = √(rT/(n log r)), we get the following lemma.

Lemma 4.7.
If the query complexity of
Augmentation is T, then matroid intersection can be solved using O(n^{6/5} r^{3/5} log^{4/5} r + √(nrT log r)) independence queries.

Combining this with Theorems 3.1 and 3.2, we get our subquadratic results.
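The exponent bookkeeping behind the two theorems that follow can be checked mechanically. Writing each bound as n^a · r^b with exact rational exponents and ignoring logarithmic factors (an illustration of the arithmetic, not part of the proof):

```python
from fractions import Fraction as F

approx = (F(6, 5), F(3, 5))    # n^(6/5) r^(3/5): cost of lines 1 and 2
rand_T = (F(1), F(1, 2))       # T = n·r^(1/2)  (randomized Augmentation)
det_T  = (F(1), F(2, 3))       # T = n·r^(2/3)  (deterministic Augmentation)

def aug_term(T):
    # sqrt(n · r · T): halve the exponents of n^(1+a) r^(1+b)
    return ((1 + T[0]) / 2, (1 + T[1]) / 2)

assert aug_term(rand_T) == (F(1), F(3, 4))   # n·r^(3/4)
assert aug_term(det_T) == (F(1), F(5, 6))    # n·r^(5/6)

# With r = Θ(n), a bound n^a r^b becomes n^(a+b):
assert sum(approx) == F(9, 5)                # randomized total: Õ(n^(9/5))
assert sum(aug_term(det_T)) == F(11, 6)      # deterministic total: Õ(n^(11/6))
# For the randomized algorithm the approximation term dominates:
assert sum(aug_term(rand_T)) == F(7, 4) <= F(9, 5)
```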
Theorem 4.8 (Randomized Matroid Intersection). There is a randomized algorithm which with high probability uses O(n^{6/5} r^{3/5} log^{4/5} r) independence queries and solves the matroid intersection problem. When r = Θ(n), this is Õ(n^{9/5}).

Theorem 4.9 (Deterministic Matroid Intersection). There is a deterministic algorithm which uses O(n r^{5/6} log r + n^{6/5} r^{3/5} log^{4/5} r) independence queries and solves the matroid intersection problem. When r = Θ(n), this is Õ(n^{11/6}).

Remark. The limiting terms for the randomized algorithm come from lines 1 and 2. If a faster approximation algorithm is found, the same strategy as above might give an Õ(n r^{3/4})-query algorithm.

5 Algorithm for heavy/light categorization
In this section, we finally provide the algorithms for the categorization of the vertices in S̄ \ F_S̄ into heavy and light as defined in Definition 3.3.

5.1 Randomized categorization

In this subsection, we prove the following lemma (restated from Section 3.2).
Lemma 3.4.
There is a randomized categorization procedure which, with high probability, categorizes heavy and light vertices in the set S̄ \ F_S̄ correctly by issuing O(n log n) independence queries per phase and an additional O(nh log n) independence queries over the whole run of the Augmentation algorithm.
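Before the formal description, here is a toy illustration of the sampling idea behind this lemma: each trial of Experiment 5.1 below costs one independence query, and repeating it with a majority vote separates small neighborhoods from large ones. The oracle, sample size k, and repetition count s in this sketch are illustrative stand-ins, not the constants used in the analysis.

```python
import random

def classify(ngh, X, k, s, rng):
    """Toy version of the randomized categorization.

    Repeats Experiment 5.1 s times: each trial samples k elements of X
    uniformly with replacement (the set R) and "succeeds" if R misses
    Ngh_X(v) entirely.  A majority of failures is taken as evidence that
    the neighborhood is large, i.e. that v is heavy.
    """
    successes = sum(
        1 for _ in range(s)
        if not any(rng.choice(X) in ngh for _ in range(k))  # R ∩ Ngh_X(v) = ∅?
    )
    return "light" if 2 * successes > s else "heavy"

rng = random.Random(0)
X = list(range(100))
print(classify(set(), X, k=10, s=30, rng=rng))           # no out-neighbors → "light"
print(classify(set(range(50)), X, k=10, s=30, rng=rng))  # half of X: almost surely "heavy"
```

In the real algorithm, each trial is answered by one independence query rather than by inspecting Ngh_X(v) directly.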
We will use X to denote S \ F_S. Let the out-neighborhood of a vertex v ∈ S̄ \ F_S̄ inside X be denoted by Ngh_X(v). Consider the family of sets {Ngh_X(v)}_{v ∈ S̄ \ F_S̄} residing inside the ambient universe X. We want to find out which of these sets are of size at least h (i.e., correspond to the heavy vertices) and which of them are not (i.e., correspond to the light vertices). To this end, we devise the following random experiment.

Experiment 5.1.
Sample a set R of k elements drawn uniformly and independently from X (with replacement) and check whether R ∩ Ngh_X(v) = ∅.

It is easy to check the following: for any v ∈ S̄ \ F_S̄, Experiment 5.1 is successful with probability

Pr_R[R ∩ Ngh_X(v) = ∅] = (1 − |Ngh_X(v)|/|X|)^k.

Note that, to perform this experiment for a vertex v, we need to make only a single independence query, of the form whether (S \ R) ∪ {v} ∈ I₂. Next, we make the following claim.

Claim 5.2.
There is a non-negative integer k such that the following holds:

1. If |Ngh_X(v)| < h, then Experiment 5.1 succeeds with probability at least 3/4, and
2. If |Ngh_X(v)| > 10h, then Experiment 5.1 succeeds with probability at most 1/4.

Before proving Claim 5.2, we show the rest of the steps of this procedure. For every vertex, we repeat Experiment 5.1 s = O(log n) many times independently. By standard concentration bounds, we make the following observations:

1. If |Ngh_X(v)| < h, strictly more than s/2 of the experiments succeed with very high probability.
2. If |Ngh_X(v)| > 10h, strictly fewer than s/2 of the experiments succeed with very high probability.

We declare a vertex for which at most s/2 of the experiments succeed heavy. The probability that a light vertex is classified as heavy by this procedure is very small due to Observation 1. On the other hand, a vertex with |Ngh_X(v)| > 10h will be correctly classified as heavy with very high probability. However, a heavy vertex with |Ngh_X(v)| ≤ 10h may not be correctly classified. So, for such vertices, we want to check in a brute-force manner. To this end, we discover the set Ngh_X(v) for every vertex v which is not declared heavy and make decisions accordingly. Recall that by very high probability we mean with probability at least 1 − n^{−c} for some arbitrarily large constant c.

Bounding the error probability.

We argue that we can bound the error probabilities of these observations over the whole run of the
Augmentation algorithm by a union bound. Say that the error probabilities of Observations 1 and 2 are bounded by n^{−c} for some large constant c. There are at most n vertices, and there are at most r/h < n phases. Hence, the probability that, over the whole run of Augmentation (Algorithm 3.8), any vertex is misclassified as heavy, or that the procedure decides to discover a set Ngh_X(v) with |Ngh_X(v)| > 10h, is at most n^{−c+2}. Similarly, we note that in the algorithm for matroid intersection (Algorithm 4.2) we run Augmentation at most r times, so the error probability is at most n^{−c+3}.

Cost analysis.
As mentioned before, each instance of Experiment 5.1 can be performed with a single query. As there are O(n log n) experiments in total in each phase of the algorithm, the number of queries needed to perform all experiments over the whole run of the Augmentation algorithm is O((nr/h) log n) (recall that the number of phases is r/h). Now consider the part of the algorithm where we need to discover the set Ngh_X(v) for a vertex v which is not declared heavy after the completion of all experiments in a phase. For each such vertex, this takes at most O(|Ngh_X(v)| log n) = O(h log n) queries (by Claim 2.6). Note that we only need to make these kinds of queries for each vertex once over the whole run of the algorithm (as in future queries we already know all of v's neighbors and can answer directly). Hence, the total number of such queries is at most O(nh log n) across all phases of the algorithm.

Proof of Claim 5.2.
First we note that if |X| ≤ 10h, case 2 is vacuously true, so we may pick k = 0, for which Experiment 5.1 always succeeds. So now assume that |X| > 10h and let x = 10h/|X| ∈ (0, 1). It suffices to show that there exists some positive integer k satisfying (1 − x/10)^k ≥ 3/4 and (1 − x)^k ≤ 1/4. Pick k = ⌈log(1/4)/log(1 − x)⌉. Then k ≥ log(1/4)/log(1 − x) > 0, which means that (1 − x)^k ≤ 1/4. If x ≥ 3/4, then k = 1 and (1 − x/10)^k ≥ 1 − 1/10 ≥ 3/4. Otherwise x ∈ (0, 3/4), and then k ≤ log(1/4)/log(1 − x) + 1 < 2 log(1/4)/log(1 − x), which means that (1 − x/10)^k ≥ (1 − x)^{k/10} > (1/4)^{1/5} ≥ 3/4.

5.2 Deterministic categorization

In this section we prove the following lemma (restated from Section 3.2).
Lemma 3.5.
There exists a deterministic categorization procedure which uses O(n√(rh) log r) queries over the whole run of the Augmentation algorithm.
The main idea of the deterministic categorization is the following: for each v ∈ S̄ \ F_S̄, our deterministic categorization keeps track of a set N_v ⊆ Ngh_X(v) of h out-neighbors of v (if that many out-neighbors exist). Then we can either use N_v as a proof that v is heavy, or, when we failed to find such an N_v, we know that v is light.

In each phase, some of the vertices in N_v may be added to F (and thus removed from X). This may decrease the size of N_v. In this case we would like to find additional out-neighbors to add to N_v, until |N_v| = h, or determine that |Ngh_X(v)| < h. One possible and immediate strategy would be to use Claim 2.6 to find a new out-neighbor of v with O(log n) independence queries. However, adding arbitrary neighbors from Ngh_X(v) \ N_v would be expensive: over the whole run of the algorithm, potentially every vertex in S will be added to N_v at some point, which would require Õ(nr) many independence queries in total for all the N_v's; this is far more expensive than what we can allow. Instead, we want to devise a better strategy to pick u ∈ Ngh_X(v) \ N_v.

Deterministic strategy.
For u ∈ X we denote by w(u), the weight of u, the number of sets N_v which contain u. Note that these weights change over the run of the algorithm. Also note that the values w(u) can be inferred from the sets N_v, which are known to the querier. Hence, we can assume that the querier knows the weights of the elements of X. When u is moved from X to F, w(u) new out-neighbors must be found, one for each v ∈ S̄ \ F_S̄ for which the set N_v contained u.

This motivates the following strategy: whenever we need to find a new out-neighbor of v, we find a u ∈ Ngh_X(v) \ N_v that minimizes w(u). To perform this strategy, we note that the binary-search idea from Claim 2.6 can be implemented to find a u which minimizes w(u). Indeed, if {u₁, u₂, ..., u_{|X|}} ⊆ X with w(u₁) ≤ w(u₂) ≤ ... ≤ w(u_{|X|}), the binary search can first ask, with a single query, whether there is an edge to {u₁, ..., u_{⌊|X|/2⌋}}. If this is the case, we recurse on {u₁, ..., u_{⌊|X|/2⌋}}; otherwise we recurse on {u_{⌊|X|/2⌋+1}, ..., u_{|X|}}. This guarantees that the u_i which minimizes w(u_i) will be found.

Cost analysis.
For each v ∈ S̄ we will at most once determine that N_v cannot be extended, i.e., that |Ngh_X(v)| < h. This requires O(n) independence queries in total. The remaining cost we amortize over the vertices in V = S ∪ S̄. Suppose that we find some out-neighbor u ∈ X of some vertex v ∈ S̄ using the above strategy; this uses O(log r) independence queries. We charge this cost to u if w(u) ≤ n√h/√r, and otherwise we charge the cost to v. We make the following observations:

1. For u ∈ S, the total cost charged to it is at most O((n√h/√r) log r).
2. For v ∈ S̄, the total cost charged to it is at most O(√(rh) log r).

Observation 1 is easy to see, since we charge the cost O(log r) to u at most O(n√h/√r) times. To argue that Observation 2 holds, let u₀ ∈ S be the first vertex added to N_v whose weight w(u₀) was strictly more than n√h/√r (at the moment it was added to N_v). At this point in time, we know that all remaining u ∈ Ngh_X(v) \ N_v must have w(u) ≥ w(u₀) > n√h/√r. Note that we can bound the total weight Σ_{u∈X} w(u) = Σ_{v∈S̄\F_S̄} |N_v| ≤ nh at any point in time. Because of this upper bound, there can be at most nh/(n√h/√r) = √(rh) such u. Hence we can charge vertex v at most √(rh) more times. (We actually use the same strategy to initialize the sets N_v: we discover out-neighbors u in increasing order of w(u).)

Since there are at most r vertices u ∈ S and n vertices v ∈ S̄, we conclude that the total cost (over all phases) of the deterministic categorization is O(n√(rh) log r). This proves Lemma 3.5.

A major open problem is to close the big gap between the upper and lower bounds for the matroid intersection problem with independence and rank queries. A major step towards this goal would be to prove an n^{1+Ω(1)} lower bound. It will already be extremely interesting to prove such a bound for deterministic algorithms. It is also interesting to prove a cn lower bound for randomized algorithms for some constant c > 1, as well as to obtain an o(n√r)-query exact algorithm and an Õ(n/poly(ε))-query (1 − ε)-approximation algorithm under independence queries (such bounds have already been achieved under rank queries [CLS+19]), or to go below O(n√n) under both types of queries when r = Ω(n).
We believe that fully understanding the complexity of the reachability problem will be another major step towards understanding the matroid intersection problem. We conjecture that our Õ(n√n) bound is tight for r = Ω(n).

It is also very interesting to break the quadratic barrier for the weighted case. This barrier can be broken by a (1 − ε)-approximation algorithm obtained by combining techniques from [CLS+19, CQ16] (via private communication with Kent Quanrud), but not by an exact algorithm.

Related problems are those of minimizing submodular functions, e.g., proving an n^{1+Ω(1)} lower bound or a subquadratic upper bound for finding the minimizer of a submodular function, or the non-trivial minimizer of a symmetric submodular function. Many recent studies (e.g. [RSW18, GPRW20, LLSZ20, MN19]) have led to some non-trivial bounds. However, it is still open whether an n^{1+Ω(1)} lower bound or an n^{2−Ω(1)} upper bound exists even in the special cases of computing minimum st-cut and hypergraph min-cut in the cut-query model.

Acknowledgment
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 715672. Jan van den Brand is partially supported by the Google PhD Fellowship Program. Danupon Nanongkai and Sagnik Mukhopadhyay are also partially supported by the Swedish Research Council (Reg. No. 2019-05622).
References

[AD71] Martin Aigner and Thomas A. Dowling. Matching theory for combinatorial geometries.
Transactions of the American Mathematical Society , 158(1):231–245, 1971.[CLS +
19] Deeparnab Chakrabarty, Yin Tat Lee, Aaron Sidford, Sahil Singla, and Sam Chiu-waiWong. Faster matroid intersection. In
FOCS , pages 1146–1168. IEEE Computer Society,2019.[CQ16] Chandra Chekuri and Kent Quanrud. A fast approximation for maximum weightmatroid intersection. In
SODA , pages 445–457. SIAM, 2016.[Cun86] William H. Cunningham. Improved bounds for matroid partition and intersectionalgorithms.
SIAM J. Comput. , 15(4):948–957, 1986.[Edm70] Jack Edmonds. Submodular functions, matroids, and certain polyhedra. In
Combinato-rial structures and their applications , pages 69–87. 1970.[Edm79] Jack Edmonds. Matroid intersection. In
Annals of discrete Mathematics , volume 4,pages 39–49. Elsevier, 1979.[EDVJ68] Jack Edmonds, GB Dantzig, AF Veinott, and M Jünger. Matroid partition.
50 Yearsof Integer Programming 1958–2008 , page 199, 1968.[GPRW20] Andrei Graur, Tristan Pollner, Vidhya Ramaswamy, and S. Matthew Weinberg. Newquery lower bounds for submodular function minimization. In
ITCS , volume 151 of
LIPIcs, pages 64:1–64:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.

[Har08] Nicholas J. A. Harvey. Matroid intersection, pointer chasing, and Young's seminormal representation of S_n. In SODA, pages 542–549. SIAM, 2008.

[Law75] Eugene L. Lawler. Matroid intersection algorithms.
Math. Program. , 9(1):31–56, 1975.[LLSZ20] Troy Lee, Tongyang Li, Miklos Santha, and Shengyu Zhang. On the cut dimension of agraph.
CoRR , abs/2011.05085, 2020.[LSW15] Yin Tat Lee, Aaron Sidford, and Sam Chiu-wai Wong. A faster cutting plane method andits implications for combinatorial and convex optimization. In
FOCS , pages 1049–1065.IEEE Computer Society, 2015.[MN19] Sagnik Mukhopadhyay and Danupon Nanongkai. Weighted min-cut: Sequential, cut-query and streaming algorithms.
CoRR , abs/1911.01651, 2019.[Ngu19] Huy L. Nguyen. A note on cunningham’s algorithm for matroid intersection.
CoRR ,abs/1904.04129, 2019.[RSW18] Aviad Rubinstein, Tselil Schramm, and S. Matthew Weinberg. Computing exactminimum cuts without knowing the graph. In
Proceedings of the 9th ITCS , pages39:1–39:16, 2018.[Sch03] Alexander Schrijver.