Semi-Streaming Algorithms for Submodular Matroid Intersection
Paritosh Garg (EPFL), [email protected]
Linus Jordan (EPFL), [email protected]
Ola Svensson (EPFL), [email protected]
Abstract
While the basic greedy algorithm gives a semi-streaming algorithm with an approximation guarantee of 2 for the unweighted matching problem, it was only recently that Paz and Schwartzman obtained an analogous result for weighted instances. Their approach is based on the versatile local ratio technique and also applies to generalizations such as weighted hypergraph matchings. However, the framework for the analysis fails for the related problem of weighted matroid intersection, and as a result the approximation guarantee for weighted instances did not match the factor 2 achieved by the greedy algorithm for unweighted instances. Our main result closes this gap by developing a semi-streaming algorithm with an approximation guarantee of 2 + ε for weighted matroid intersection, improving upon the previous best guarantee of 4 + ε. Our techniques also allow us to generalize recent results by Levin and Wajc on submodular maximization subject to matching constraints to that of matroid-intersection constraints. While our algorithm is an adaptation of the local ratio technique used in previous works, the analysis deviates significantly and relies on structural properties of matroid intersection, called kernels. Finally, we also conjecture that our algorithm gives a (k + ε)-approximation for the intersection of k matroids, but prove that new tools are needed in the analysis, as the used structural properties fail for k ≥ 3.

1 Introduction

For large problems, it is often not realistic that the entire input can be stored in random access memory, so more memory-efficient algorithms are preferable. A popular model for such algorithms is the (semi-)streaming model (see e.g. [Mut05]): the elements of the input are fed to the algorithm in a stream and the algorithm is required to have a small memory footprint.

Consider the classic maximum matching problem in an undirected graph G = (V, E). An algorithm in the semi-streaming model is fed the edges one-by-one in a stream e_1, e_2, ..., e_{|E|}, and at any point of time the algorithm is only allowed O(|V| polylog(|V|)) bits of storage. The goal is to output a large matching M ⊆ E at the end of the stream. Note that the allowed memory usage is sufficient for the algorithm to store a solution M, but in general it is much smaller than the size of the input, since the number of edges may be as many as |V|²/2. Indeed, the intuitive difficulty in designing a semi-streaming algorithm is that the algorithm needs to discard many of the seen edges (due to the memory restriction) without knowing the future edges and still return a good solution at the end of the stream.

∗ This research was supported by the Swiss National Science Foundation project 200021-184656 "Randomness in Problem Instances and Randomized Algorithms."

This model can also be considered in the multi-pass setting when the algorithm is allowed to take several passes over the stream. However, in this work we focus on the most basic and widely studied setting in which the algorithm takes a single pass over the stream.

A natural starting point is the greedy algorithm: initialize M = ∅. Then, for each edge e in the stream, add it to M if M ∪ {e} is a feasible solution, i.e., a matching; otherwise the edge e is discarded. The algorithm uses space O(|V| log |V|), and a simple proof shows that it returns a 2-approximate solution in the unweighted case, i.e., a matching of size at least half the size of a maximum matching. However, this basic approach fails to achieve any approximation guarantee for weighted graphs.

Indeed, for weighted matchings, it is non-trivial to even get a small constant-factor approximation. One way to do so is to replace edges if we have a much heavier edge. This is formalized in [FKM+04], who get a 6-approximation. Later, [McG05] improved this algorithm to find a (5.828 + ε)-approximation. It was only in recent breakthrough work [PS17] that the gap in the approximation guarantee between unweighted and weighted matchings was closed. Specifically, [PS17] gave a semi-streaming algorithm for weighted matchings with an approximation guarantee of 2 + ε for every ε > 0; follow-up work [GW18] further reduced the memory usage from O_ε(|V| log² |V|) to O_ε(|V| log |V|). These results for weighted matchings are tight (up to the ε) in the sense that any improvement would also improve the state-of-the-art in the unweighted case, which is a long-standing open problem.

The algorithm of [PS17] is an elegant use of the local ratio technique [BYBFR04, BYE85] in the semi-streaming setting. While this technique is very versatile and readily generalizes to weighted hypergraph matchings, it is much harder to use for the related problem of weighted matroid intersection. This is perhaps surprising, as many of the prior results for the matching problem also apply to the matroid intersection problem in the semi-streaming model (see Section 2 for definitions). Indeed, the greedy algorithm still returns a 2-approximate solution in the unweighted case, and the algorithm in [CS14] returns a (4 + ε)-approximate solution for weighted instances. So, prior to our work, the status of the matroid intersection problem was that of the matching problem before [PS17].

We now describe on a high level the reason that the techniques from [PS17] are not easily applicable to matroid intersection, and our approach for dealing with this difficulty. The approach in [PS17] works in two parts: first, certain elements of the stream are selected and added to a set S, and then, at the end of the stream, a matching M is computed by the greedy algorithm that inspects the edges of S in the reverse order in which they were added.
This way of constructing the solution M greedily by going backwards in time is a standard framework for analyzing algorithms based on the local ratio technique. Now, in order to adapt their algorithm to matroid intersection, recall that the bipartite matching problem can be formulated as the intersection of two partition matroids. We can thus reinterpret their algorithm and analysis in this setting. Furthermore, after this reinterpretation, it is not too hard to define an algorithm that works for the intersection of any two matroids. However, bipartite matching is a special case of matroid intersection, which captures a rich set of seemingly more complex problems. This added expressiveness causes the analysis and the standard framework for analyzing local ratio algorithms to fail. Specifically, we prove that a solution formed by running the greedy algorithm on S in the reverse order (as done for the matching problem) fails to give any constant-factor approximation guarantee for the matroid intersection problem. To overcome this, and to obtain our main result, we make a connection to a concept called matroid kernels (see [Fle01] for more details about kernels), which allows us to, in a more complex way, identify a subset of S with an approximation guarantee of 2 + ε.

Finally, for the intersection of more than two matroids, the same approach in the analysis does not work, because the notion of matroid kernel does not generalize to more than two matroids. However, we conjecture that the subset S generated for the intersection of k matroids still contains a (k + ε)-approximation. Currently, the best approximation results are a (k² + ε)-approximation from [CS14] and a (2(k + √(k(k − 1))) − 1)-approximation. For k = 3, the former is better, giving a (9 + ε)-approximation. For k > 3, the latter is better, giving an O(k)-approximation.

Generalization to submodular functions.
Very recently, Levin and Wajc [LW20] obtained improved approximation ratios for matching and b-matching problems in the semi-streaming model with respect to submodular functions. Specifically, they get a (3 + 2√2)-approximation for monotone submodular matching, a (4 + 3√2)-approximation for non-monotone submodular matching, and a (2 + ε)-approximation for maximum weight (linear) b-matching. In our paper, we are able to extend our algorithm for weighted matroid intersection to work with submodular functions by combining our and their ideas. In fact, we are able to generalize all their results to the case of matroid intersection with better or equal approximation ratios: we get a (3 + 2√2 + δ)-approximation for monotone submodular matroid intersection, a (4 + 3√2 + δ)-approximation for non-monotone submodular matroid intersection, and a (2 + ε)-approximation for maximum weight (linear) matroid intersection.

Outline.
In Section 2 we introduce basic matroid concepts and formally define the weighted matroid intersection problem in the semi-streaming model. Section 3 and Section 4 are devoted to our main result, i.e., the semi-streaming algorithm for weighted matroid intersection with an approximation guarantee of (2 + ε). Specifically, in Section 3 we adapt the algorithm of [PS17] without worrying about the memory requirements, show why the standard analysis fails, and then give our new analysis. We then make the obtained algorithm memory efficient in Section 4. Further, in Section 5, we adapt our algorithm to work with submodular functions by using ideas from [LW20]. Finally, in Section 6, we discuss the case of more than two matroids.

2 Preliminaries

Matroids.
We define and give a brief overview of the basic concepts related to matroids that we use in this paper. For a more comprehensive treatment, we refer the reader to [Sch03]. A matroid is a tuple M = (E, I) consisting of a finite ground set E and a family I ⊆ 2^E of subsets of E satisfying:

• if X ⊆ Y and Y ∈ I, then X ∈ I; and
• if X ∈ I, Y ∈ I and |Y| > |X|, then ∃ e ∈ Y \ X such that X ∪ {e} ∈ I.

The elements in I (that are subsets of E) are referred to as the independent sets of the matroid, and the set E is referred to as the ground set. With a matroid M = (E, I), we associate the rank function rank_M : 2^E → N and the span function span_M : 2^E → 2^E defined as follows for every E′ ⊆ E:

rank_M(E′) = max{ |X| : X ⊆ E′ and X ∈ I },
span_M(E′) = { e ∈ E : rank_M(E′ ∪ {e}) = rank_M(E′) }.

One can get rid of the δ factor if we assume that the function value is polynomially bounded by |E|, an assumption made by [LW20].
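Returning to the definitions above: they translate directly into code. The following minimal Python sketch (our own helper, not part of the paper) represents a matroid by an independence oracle and derives rank and span exactly as defined; computing the rank greedily is valid by the matroid exchange property.

```python
class Matroid:
    """A matroid given by its ground set and an independence oracle.

    rank(E') is the size of a largest independent subset of E', and
    span(E') collects the elements whose addition does not increase rank(E').
    """

    def __init__(self, ground, is_independent):
        self.ground = list(ground)
        self.is_independent = is_independent

    def rank(self, subset):
        # Greedy is optimal for matroids; dict.fromkeys removes duplicates
        # while preserving order.
        base = []
        for e in dict.fromkeys(subset):
            if self.is_independent(base + [e]):
                base.append(e)
        return len(base)

    def span(self, subset):
        r = self.rank(list(subset))
        return {e for e in self.ground
                if self.rank(list(subset) + [e]) == r}


# Example: a partition matroid on {a, b, c, d} whose independent sets
# are exactly the subsets not containing both a and b.
M1 = Matroid("abcd", lambda X: not {"a", "b"} <= set(X))
```

For instance, `M1.span(["a"])` is `{"a", "b"}`: adding b to a maximum independent subset of {a} does not increase the rank.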
We simply write rank(·) and span(·) when the matroid M is clear from the context. In words, the rank function equals the size of the largest independent set when restricted to E′, and the span function equals the elements in E′ together with all elements that cannot be added to a maximum cardinality independent set of E′ while maintaining independence. The rank of the matroid equals rank(E), i.e., the size of the largest independent set.

The weighted matroid intersection problem in the semi-streaming model.
In the weighted matroid intersection problem, we are given two matroids M_1 = (E, I_1), M_2 = (E, I_2) on a common ground set E and a non-negative weight function w : E → R≥0 on the elements of the ground set. The goal is to find a subset X ⊆ E that is independent in both matroids, i.e., X ∈ I_1 and X ∈ I_2, and whose weight w(X) = Σ_{e∈X} w(e) is maximized.

In seminal work [Edm79], Edmonds gave a polynomial-time algorithm for solving the weighted matroid intersection problem to optimality in the classic model of computation, when the whole input is available to the algorithm throughout the computation. In contrast, the problem becomes significantly harder, and tight results are still eluding us, in the semi-streaming model, where the memory footprint of the algorithm and its access pattern to the input are restricted. Specifically, in the semi-streaming model the ground set E is revealed in a stream e_1, e_2, ..., e_{|E|}, and at time i the algorithm gets access to e_i and can perform computation based on e_i and its current memory, but without knowledge of the future elements e_{i+1}, ..., e_{|E|}. The algorithm has independence-oracle access to the matroids M_1 and M_2 restricted to the elements stored in the memory, i.e., for a set of such elements, the algorithm can query whether the set is independent in each matroid. The goal is to design an algorithm such that (i) the memory usage is near-linear O((r_1 + r_2) polylog(r_1 + r_2)) at any time, where r_1 and r_2 denote the ranks of the input matroids M_1 and M_2, respectively, and (ii) at the end of the stream the algorithm should output a feasible solution X ⊆ E, i.e., a subset X that satisfies X ∈ I_1 and X ∈ I_2, of large weight w(X).
We remark that the memory requirement O((r_1 + r_2) polylog(r_1 + r_2)) is natural, as r_1 + r_2 = |V| when formulating a bipartite matching problem as the intersection of two matroids. The difficulty in designing a good semi-streaming algorithm is that the memory requirement is much smaller than the size of the ground set E, and thus the algorithm must intuitively discard many of the elements without knowledge of the future and without significantly deteriorating the weight of the final solution X. The quality of the algorithm is measured in terms of its approximation guarantee: an algorithm is said to have an approximation guarantee of α if it is guaranteed to output a solution X, no matter the input and the order of the stream, such that w(X) ≥ OPT/α, where OPT denotes the weight of an optimal solution to the instance. As aforementioned, our main result in this paper is a semi-streaming algorithm with an approximation guarantee of 2 + ε, for every ε > 0, improving upon the previous best guarantee of 4 + ε [CS14].

3 The Local Ratio Algorithm for Matroid Intersection

In this section, we first present the local ratio algorithm for the weighted matching problem that forms the basis of the semi-streaming algorithm in [PS17]. We then adapt it to the weighted matroid intersection problem. While the algorithm is fairly natural to adapt to this setting, the analysis is not.

The considered problem can also be formulated as the problem of finding an independent set in one of the matroids, say M_1, and maximizing a submodular function, which would be the (weighted) rank function of M_2. For that problem, [HKMY20] recently gave a streaming algorithm with an approximation guarantee of (2 + ε). However, the space requirement of their algorithm is exponential in the rank of M_2 (which would correspond to being exponential in |V| in the matching case), and thus it does not provide a meaningful algorithm for our setting.
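The matching algorithm discussed next (Algorithm 1) admits a very short implementation. The sketch below runs its selection rule on an instance in the spirit of Figure 1, a 4-cycle with weights 1, 2, 2, 2 arriving in cycle order; the vertex labels and the brute-force post-processing are our own assumptions, not taken from the paper.

```python
from itertools import combinations

def select(stream):
    """Selection phase: keep an edge iff its weight strictly exceeds the
    sum of its endpoint potentials, then raise both potentials by the gain."""
    potential, S = {}, []
    for u, v, w in stream:
        gain = w - potential.get(u, 0) - potential.get(v, 0)
        if gain > 0:
            potential[u] = potential.get(u, 0) + gain
            potential[v] = potential.get(v, 0) + gain
            S.append((u, v, w))
    return S

def max_weight_matching(edges):
    # Brute force over subsets; fine for this tiny example.
    best = 0
    for r in range(len(edges) + 1):
        for sub in combinations(edges, r):
            ends = [x for u, v, _ in sub for x in (u, v)]
            if len(ends) == len(set(ends)):
                best = max(best, sum(w for _, _, w in sub))
    return best

# A 4-cycle a-b-c-d-a with weights 1, 2, 2, 2 arriving in cycle order.
stream = [("a", "b", 1), ("b", "c", 2), ("c", "d", 2), ("d", "a", 2)]
S = select(stream)
```

On this stream the last edge arrives with gain 0 and is discarded, so S holds three edges; the best matching inside S has weight 3, while the optimum over all four edges has weight 4.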
Figure 1: The top part shows an example execution of the local ratio technique for weighted matchings (Algorithm 1). The bottom part shows how to adapt this (bipartite) example to the language of weighted matroid intersection (Algorithm 2).

We give an example in Section 3.2.1 that shows that the same techniques as used for analyzing the algorithm for matchings do not work for matroid intersection. Instead, our analysis, which is presented in Section 3.3, deviates from the standard framework for analyzing local ratio algorithms, and it heavily relies on a structural property of matroid intersection known as kernels. We remark that the algorithms considered in this section do not have a small memory footprint. We deal with this in Section 4 to obtain our semi-streaming algorithm.

3.1 The Local Ratio Algorithm for Weighted Matching
The local ratio algorithm for the weighted matching problem is given in Algorithm 1. The algorithm maintains vertex potentials w(u) for every vertex u, a set S of selected edges, and an auxiliary weight function g : S → R≥0 on the selected edges. Initially, the vertex potentials are set to 0 and the set S is empty. When an edge e = {u, v} arrives, the algorithm computes how much it gains compared to the previous edges, by taking its weight minus the weight/potential of its endpoints (g(e) = w(e) − w(u) − w(v)). If the gain is positive, then we add the edge to S, and add the gain to the weight of the endpoints; that is, we set w(u) = w(u) + g(e) and w(v) = w(v) + g(e).

Algorithm 1: Local ratio algorithm for weighted matching
Input: A stream of the edges of a graph G = (V, E) with a weight function w : E → R≥0.
Output: A matching M.

S ← ∅
∀ u ∈ V : w(u) ← 0
for each edge e = (u, v) in the stream do
    if w(u) + w(v) < w(e) then
        g(e) ← w(e) − w(u) − w(v)
        w(u) ← w(u) + g(e)
        w(v) ← w(v) + g(e)
        S ← S ∪ {e}
    end if
end for
return a maximum weight matching M among the edges stored on the stack S

For a better intuition of the algorithm, consider the example depicted on the top of Figure 1. The stream consists of four edges e_1, e_2, e_3, e_4 with weights w(e_1) = 1 and w(e_2) = w(e_3) = w(e_4) = 2. At each time step i, we depict the arriving edge e_i in thick along with its weight; the vertex potentials before the algorithm considers this edge are written on the vertices, and the updated vertex potentials (if any) after considering e_i are depicted next to the incident vertices. The edges that are added to S are solid, and those that are not added to S are dashed.

At the arrival of the first edge of weight w(e_1) = 1, both incident vertices have potential 0, and so the algorithm adds this edge to S and increases the incident vertex potentials by the gain g(e_1) = 1. For the second edge of weight w(e_2) = 2, the sum of the incident vertex potentials is 1, and so the gain of e_2 is g(e_2) = 2 − 1 = 1, which in turn causes the algorithm to add this edge to S and to increase the incident vertex potentials by 1. The third time step is similar to the second. At the last time step, edge e_4 arrives, of weight w(e_4) = 2. As the incident vertex potentials sum up to 2, the gain of e_4 is not strictly positive, and so this edge is not added to S and no vertex potentials are updated. Finally, the algorithm returns the maximum weight matching in S, which in this case consists of edges {e_1, e_3} and has weight 3. Note that the optimal matching of this instance had weight 4, and we thus found a 4/3-approximate solution. The final matching M is constructed by inspecting the edges in S in reverse order, i.e., we first consider the edges that were added last. An easy proof (see e.g. [GW18]) then shows that the matching M constructed in this way has weight at least half the optimum weight.

In the next section, we adapt the above described algorithm to the context of matroid intersection. We also give an example showing that the above framework for the analysis fails to give any constant-factor approximation guarantee. Our alternative (tight) analysis of this algorithm is then given in Section 3.3.

3.2 Adapting the Algorithm to Matroid Intersection

When adapting Algorithm 1 to matroid intersection to obtain Algorithm 2, the first problem we encounter is the fact that matroids do not have a notion of vertices, so we cannot keep a weight/potential for each vertex. To describe how we overcome this issue, it is helpful to consider the case of bipartite matching, and in particular the example depicted in Figure 1.

Algorithm 2: Local ratio for matroid intersection
Input: A stream of the elements of the common ground set of matroids M_1 = (E, I_1), M_2 = (E, I_2).
Output: A set X ⊆ E that is independent in both matroids.

S ← ∅
for each element e in the stream do
    calculate w*_i(e) = max({0} ∪ {θ : e ∈ span_{M_i}({f ∈ S | w_i(f) ≥ θ})}) for i ∈ {1, 2}
    if w(e) > w*_1(e) + w*_2(e) then
        g(e) ← w(e) − w*_1(e) − w*_2(e)
        w_1(e) ← w*_1(e) + g(e)
        w_2(e) ← w*_2(e) + g(e)
        S ← S ∪ {e}
    end if
end for
return a maximum weight set T ⊆ S that is independent in both M_1 and M_2

It is well known that the weighted matching problem on a bipartite graph with edge set E and bipartition V_1, V_2 can be modelled as a weighted matroid intersection problem on matroids M_1 = (E, I_1) and M_2 = (E, I_2), where for i ∈ {1, 2},

I_i = { E′ ⊆ E | each vertex v ∈ V_i is incident to at most one edge in E′ }.

Instead of keeping a weight for each vertex, we will maintain two weight functions w_1 and w_2, one for each matroid. These weight functions will be set so that the following holds in the special case of bipartite matching: on the arrival of a new element e, let T_i ⊆ S be an independent set in I_i of selected elements that maximizes the weight function w_i. Then the quantity

min_{f ∈ T_i : T_i \ {f} ∪ {e} ∈ I_i} w_i(f)  if T_i ∪ {e} ∉ I_i, and 0 otherwise,  (1)

equals the potential of the incident vertex in V_i when running Algorithm 1. It is well known (e.g. by the optimality of the greedy algorithm for matroids) that the cheapest element f to remove from T_i to make T_i \ {f} ∪ {e} an independent set equals the largest weight θ such that the elements of weight at least θ span e. We thus have that (1) equals

max({0} ∪ {θ : e ∈ span_{M_i}({f ∈ S | w_i(f) ≥ θ})}),

and it follows that the quantities w*_1(e) and w*_2(e) in Algorithm 2 equal the incident vertex potentials in V_1 and V_2 of Algorithm 1 in the special case of bipartite matching. To see this, let us return to our example in Figure 1, and let V_1 be the two vertices on the left and V_2 be the two vertices on the right.
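This correspondence can be checked in code. The sketch below is our own minimal implementation of Algorithm 2's selection phase, with each matroid given by an independence oracle; the bipartite example is encoded as two partition matroids, and the edge and vertex labels are our reconstruction of the Figure 1 instance.

```python
def greedy_rank(indep, subset):
    # Greedy computes the rank of any subset, given an independence oracle.
    base = []
    for e in dict.fromkeys(subset):
        if indep(base + [e]):
            base.append(e)
    return len(base)

def in_span(indep, subset, e):
    return greedy_rank(indep, list(subset) + [e]) == greedy_rank(indep, list(subset))

def w_star(indep, S, wi, e):
    # max({0} U {theta : e in span({f in S | w_i(f) >= theta})});
    # only the stored weights are candidate thresholds.
    spanned = [t for t in {wi[f] for f in S}
               if in_span(indep, [f for f in S if wi[f] >= t], e)]
    return max(spanned, default=0)

def select_intersection(stream, w, indep1, indep2):
    S, g, w1, w2 = [], {}, {}, {}
    for e in stream:
        s1, s2 = w_star(indep1, S, w1, e), w_star(indep2, S, w2, e)
        if w[e] > s1 + s2:
            g[e] = w[e] - s1 - s2
            w1[e], w2[e] = s1 + g[e], s2 + g[e]
            S.append(e)
    return S, g, w1, w2

# Bipartite 4-cycle as two partition matroids: each edge is written as
# (left vertex, right vertex) and matroid M_i constrains the side-i endpoints.
edges = {"e1": ("a", "x"), "e2": ("c", "x"), "e3": ("c", "y"), "e4": ("a", "y")}
w = {"e1": 1, "e2": 2, "e3": 2, "e4": 2}

def partition_oracle(side):
    # Independent iff the chosen-side endpoints of the edges are distinct.
    return lambda X: len({edges[e][side] for e in X}) == len(set(X))

S, g, w1, w2 = select_intersection(["e1", "e2", "e3", "e4"], w,
                                   partition_oracle(0), partition_oracle(1))
```

The run keeps the first three edges with unit gains and rejects the last one, matching the walkthrough of the example.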
In the bottom part of the figure, the weight functions w_1 and w_2 are depicted (at the corresponding side of the edge) after the arrival of each edge. At time step 1, e_1 does not need to replace any elements in any of the matroids, and so w*_1(e_1) = w*_2(e_1) = 0. We therefore have that its gain is g(e_1) = 1, and the algorithm sets w_1(e_1) = w_2(e_1) = 1. At time 2, edge e_2 of weight 2 arrives. It is not spanned in the first matroid, whereas it is spanned by edge e_1 of weight 1 in the second matroid. It follows that w*_1(e_2) = 0 and w*_2(e_2) = w_2(e_1) = 1, and so e_2 has positive gain g(e_2) = 1, and it sets w_1(e_2) = 1 and w_2(e_2) = w_2(e_1) + 1 = 2. The third time step is similar to the second. At the last time step, e_4 of weight 2 arrives. However, since it is spanned by an edge of w_1-weight 1 in the first matroid and by an edge of w_2-weight 1 in the second matroid, its gain is 0, and it is thus not added to the set S. Note that throughout this example, and in general for bipartite graphs, Algorithm 2 is identical to Algorithm 1. One may therefore expect that the analysis of Algorithm 1 also generalizes to Algorithm 2. We explain next that this is not the case for general matroids.

3.2.1 Reverse-Order Greedy Fails for Matroid Intersection

We give a simple example showing that the greedy selection (as done in the analysis of Algorithm 1 for weighted matching) does not work for matroid intersection. Still, it turns out that the set S generated by Algorithm 2 always contains a 2-approximation, but the selection process is more involved.

Lemma 1.
There exist two matroids M_1 = (E, I_1) and M_2 = (E, I_2) on a common ground set E and a weight function w : E → R≥0 such that a greedy algorithm that considers the elements in the set S in the reverse order of when they were added by Algorithm 2 does not provide any constant-factor approximation.

Proof. The example consists of the ground set E = {a, b, c, d} with weights w(a) = 1, w(b) = 1 + ε, w(c) = 2ε, w(d) = 3ε for a small ε > 0 (the greedy solution will be a factor Ω(1/ε) worse than the optimum). The matroids M_1 = (E, I_1) and M_2 = (E, I_2) are defined by

• a subset of E is in I_1 if and only if it does not contain {a, b}; and
• a subset of E is in I_2 if and only if it contains at most two elements.

To see that M_1 and M_2 are matroids, note that M_1 is a partition matroid with partitions {a, b}, {c}, {d}, and M_2 is the 2-uniform matroid (alternatively, one can easily check that M_1 and M_2 satisfy the definition of a matroid).

Now consider the execution of Algorithm 2 when given the elements of E in the order a, b, c, d:

• Element a has weight 1, and {a} is independent both in M_1 and M_2, so we set w_1(a) = w_2(a) = g(a) = 1, and a is added to S.
• Element b is spanned by a in M_1 and not spanned by any element in M_2. So we get g(b) = w(b) − w*_1(b) − w*_2(b) = 1 + ε − 1 − 0 = ε. As ε > 0, we add b to S, and set w_1(b) = w_1(a) + ε = 1 + ε and w_2(b) = ε.
• Element c is not spanned by any element in M_1 but is spanned by {a, b} in M_2. As b has the smallest w_2-weight, w*_2(c) = w_2(b) = ε. So we have g(c) = 2ε − w*_1(c) − w*_2(c) = 2ε − 0 − ε = ε > 0, and we set w_1(c) = ε and w_2(c) = 2ε and add c to S.
• Element d is similar to c. We have g(d) = 3ε − 0 − 2ε = ε > 0, so we set w_1(d) = ε and w_2(d) = 3ε and add d to S.

As the algorithm selected all the elements, we have S = E. It follows that the greedy algorithm on S (in the reverse order of when elements were added) will select d and c, after which the set is a maximal independent set in M_2. This gives a weight of 5ε, even though a and b both have weight at least 1, which shows that this algorithm does not guarantee any constant-factor approximation.

3.3 Analysis of Algorithm 2

We prove that Algorithm 2 has an approximation guarantee of 2.
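Before proceeding, the counterexample of Lemma 1 is small enough to replay mechanically. The following sketch (our own code, with ε = 0.01) runs the reverse-order greedy on S = E, where the two independence conditions are merged into one feasibility test.

```python
EPS = 0.01
w = {"a": 1.0, "b": 1.0 + EPS, "c": 2 * EPS, "d": 3 * EPS}

def feasible(X):
    # Independent in M_1 (never both a and b) and in M_2 (at most 2 elements).
    return not {"a", "b"} <= set(X) and len(X) <= 2

# Lemma 1 shows Algorithm 2 keeps every element, so S = [a, b, c, d] in
# arrival order; the reverse-order greedy therefore considers d, c, b, a.
picked = []
for e in ["d", "c", "b", "a"]:
    if feasible(picked + [e]):
        picked.append(e)

greedy_weight = sum(w[e] for e in picked)   # 5 * EPS
optimum = w["b"] + w["d"]                   # {b, d} is feasible and heavy
```

The reverse greedy fills up with the two cheap elements d and c and then can accept nothing else, so its weight is 5ε while the optimum exceeds 1.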
Theorem 2.
Let S be the subset generated by Algorithm 2 on a stream E of elements, matroids M_1 = (E, I_1), M_2 = (E, I_2), and weight function w : E → R≥0. Then there exists a subset T ⊆ S, independent in M_1 and in M_2, whose weight w(T) is at least w(S*)/2, where S* denotes an optimal solution to the weighted matroid intersection problem.

Throughout the analysis we fix the input matroids M_1 = (E, I_1), M_2 = (E, I_2), the weight function w : E → R≥0, and the order of the elements in the stream. While Algorithm 2 only defines the weight functions w_1 and w_2 for the elements added to the set S, we extend them in the analysis by, for i ∈ {1, 2}, letting w_i(e) = w*_i(e) for the elements e not added to S.

We now prove Theorem 2 by showing that g(S) ≥ w(S*)/2 (Lemma 4) and that there exists a subset T ⊆ S such that w(T) ≥ g(S) (Lemma 5). In the proof of both these lemmas, we use the following properties of the computed set S.

Lemma 3.
Let S be the set generated by Algorithm 2 and S′ ⊆ S any subset. Consider one of the matroids M_i with i ∈ {1, 2}. There exists a subset T′ ⊆ S′ that is independent in M_i, i.e., T′ ∈ I_i, and w_i(T′) ≥ g(S′). Furthermore, the maximum weight independent set T_i in M_i over the whole ground set E can be selected to be a subset of S, i.e., T_i ⊆ S, and it satisfies w_i(T_i) = g(S).

Proof. Consider the matroid M_1 (the proof is identical for M_2) and fix S′ ⊆ S. The set T′ ⊆ S′ that is independent in M_1 and that maximizes w_1(T′) satisfies

w_1(T′) = ∫_0^∞ rank({e ∈ T′ | w_1(e) ≥ θ}) dθ = ∫_0^∞ rank({e ∈ S′ | w_1(e) ≥ θ}) dθ.

The second equality follows from the fact that the greedy algorithm that considers the elements in decreasing order of weight is optimal for matroids, and thus we have rank({e ∈ T′ | w_1(e) ≥ θ}) = rank({e ∈ S′ | w_1(e) ≥ θ}) for any θ ∈ R.

Now index the elements of S′ = {e_1, e_2, ..., e_ℓ} in the order they were added to S by Algorithm 2, and let S′_j = {e_1, ..., e_j} for j = 0, 1, ..., ℓ (where S′_0 = ∅). By the above equalities and by telescoping,

w_1(T′) = Σ_{i=1}^{ℓ} ∫_0^∞ ( rank({e ∈ S′_i | w_1(e) ≥ θ}) − rank({e ∈ S′_{i−1} | w_1(e) ≥ θ}) ) dθ.

We have that rank({e ∈ S′_i | w_1(e) ≥ θ}) − rank({e ∈ S′_{i−1} | w_1(e) ≥ θ}) equals 1 if w_1(e_i) ≥ θ and e_i ∉ span({e ∈ S′_{i−1} | w_1(e) ≥ θ}), and it equals 0 otherwise. Therefore, by the definition of w*_1(·), the gain g(·), and w_1(e_i) = w*_1(e_i) + g(e_i) in Algorithm 2, we have

w_1(T′) = Σ_{i=1}^{ℓ} [ w_1(e_i) − max({0} ∪ {θ : e_i ∈ span({f ∈ S′_{i−1} | w_1(f) ≥ θ})}) ] ≥ Σ_{i=1}^{ℓ} g(e_i) = g(S′).

The inequality holds because S′_{i−1} is a subset of the set S at the time when Algorithm 2 considers element e_i. Moreover, if S′ = S, then S′_{i−1} equals the set S at that point, and so we then have w*_1(e_i) = max({0} ∪ {θ : e_i ∈ span({f ∈ S′_{i−1} | w_1(f) ≥ θ})}), which implies that the above inequality holds with equality in that case. We can thus also conclude that a maximum weight independent set T_1 ⊆ S satisfies w_1(T_1) = g(S). Finally, we can observe that T_1 is also a maximum weight independent set over the whole ground set, since we have rank({e ∈ S | w_1(e) ≥ θ}) = rank({e ∈ E | w_1(e) ≥ θ}) for every θ > 0, which holds because, by the extension of w_1, an element e ∉ S satisfies e ∈ span({f ∈ S : w_1(f) ≥ w_1(e)}).

We can now relate the gain of the elements in S with the weight of an optimal solution.

Lemma 4.
Let S be the subset generated by Algorithm 2. Then g(S) ≥ w(S*)/2.

Proof. We first observe that w_1(e) + w_2(e) ≥ w(e) for every element e ∈ E. Indeed, for an element e ∈ S, we have by definition w(e) = g(e) + w*_1(e) + w*_2(e), and w_i(e) = g(e) + w*_i(e), so w_1(e) + w_2(e) = 2g(e) + w*_1(e) + w*_2(e) = w(e) + g(e) > w(e). In the other case, when e ∉ S, then w*_1(e) + w*_2(e) ≥ w(e), and w_i(e) = w*_i(e), so automatically w_1(e) + w_2(e) ≥ w(e).

The above implies that w_1(S*) + w_2(S*) ≥ w(S*). On the other hand, by Lemma 3, we have w_i(T_i) ≥ w_i(S*) (since T_i is a maximum weight independent set in M_i with respect to w_i) and w_i(T_i) = g(S); thus g(S) ≥ w_i(S*) for i = 1, 2. In summary,

w(S*) ≤ w_1(S*) + w_2(S*) ≤ w_1(T_1) + w_2(T_2) = 2g(S).

It remains to find a subset T ⊆ S, independent in both M_1 and M_2, such that w(T) ≥ g(S). As described in Section 3.2.1, we cannot select T using the greedy method. Instead, we select T using the concept of kernels studied in [Fle01].

Lemma 5.
Let S be the subset generated by Algorithm 2. Then there exists a subset T ⊆ S, independent in M_1 and in M_2, such that w(T) ≥ g(S).

Proof. Consider one of the matroids M_i with i ∈ {1, 2} and define a total order <_i on E such that e <_i f if w_i(e) > w_i(f), or if w_i(e) = w_i(f) and e appeared later in the stream than f. The pair (M_i, <_i) is known as an ordered matroid. We further say that a subset E′ of E dominates an element e of E if e ∈ E′ or there is a subset C_e ⊆ E′ such that e ∈ span(C_e) and c <_i e for all elements c of C_e. The set of elements dominated by E′ is denoted by D_{M_i}(E′). Note that if E′ is an independent set, then the greedy algorithm that considers the elements of D_{M_i}(E′) in the order <_i selects exactly the elements E′.

Theorem 2 in [Fle01] says that for two ordered matroids (M_1, <_1), (M_2, <_2) there always is a set K ⊆ E, which is referred to as an M_1M_2-kernel, such that

• K is independent both in M_1 and in M_2; and
• D_{M_1}(K) ∪ D_{M_2}(K) = E.

We use the above result on M_1 and M_2 restricted to the elements in S. Specifically, we select T ⊆ S to be the kernel such that D_{M_1}(T) ∪ D_{M_2}(T) = S. Let S_1 = D_{M_1}(T) and S_2 = D_{M_2}(T). By Lemma 3, there exists a set T′ ⊆ S_1, independent in M_1, such that w_1(T′) ≥ g(S_1). As noted above, the greedy algorithm that considers the elements of S_1 in the order <_1 (decreasing weights) selects exactly the elements in T. It follows, by the optimality of the greedy algorithm for matroids, that T is optimal for S_1 in M_1 with weight function w_1, which in turn implies w_1(T) ≥ g(S_1). In the same way, we also have w_2(T) ≥ g(S_2). By definition, for any e ∈ S, we have w(e) = w_1(e) + w_2(e) − g(e). Together, we have w(T) = w_1(T) + w_2(T) − g(T) ≥ g(S_1) + g(S_2) − g(T). As elements from T are in both S_1 and S_2, and all other elements are in at least one of the two sets, we have g(S_1) + g(S_2) ≥ g(S) + g(T), and thus w(T) ≥ g(S).
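For intuition, kernels of small instances can be found by brute force. The sketch below (our own code) hard-codes the w_1, w_2 values that Algorithm 2 produced on the instance of Lemma 1 and checks every candidate set against the two kernel conditions; the unique kernel {a, d} recovers the heavy element a that the reverse-order greedy missed.

```python
from itertools import combinations

EPS = 0.01
E  = ["a", "b", "c", "d"]                      # arrival order
w  = {"a": 1, "b": 1 + EPS, "c": 2 * EPS, "d": 3 * EPS}
w1 = {"a": 1, "b": 1 + EPS, "c": EPS, "d": EPS}
w2 = {"a": 1, "b": EPS, "c": 2 * EPS, "d": 3 * EPS}

def indep1(X): return not {"a", "b"} <= set(X)   # partition matroid
def indep2(X): return len(X) <= 2                # 2-uniform matroid

def rank(indep, X):
    base = []
    for e in X:
        if e not in base and indep(base + [e]):
            base.append(e)
    return len(base)

def spans(indep, X, e):
    return rank(indep, list(X) + [e]) == rank(indep, list(X))

def dominated(indep, wi, K, e):
    # e is dominated if it is in K or spanned by the members of K that
    # precede it in the order <_i (heavier first, later arrival first).
    key = lambda f: (-wi[f], -E.index(f))
    return e in K or spans(indep, [k for k in K if key(k) < key(e)], e)

kernels = []
for r in range(len(E) + 1):
    for K in map(list, combinations(E, r)):
        if indep1(K) and indep2(K) and all(
                dominated(indep1, w1, K, e) or dominated(indep2, w2, K, e)
                for e in E):
            kernels.append(set(K))
```

Here {a, d} is the only kernel, and its weight 1 + 3ε equals g(S), consistent with Lemma 5.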
4 Making the Algorithm Memory Efficient

We now modify Algorithm 2 to only select elements with a significant gain, parametrized by α > 1, and to discard elements whose gain has become insignificant, parametrized by y. If α is close enough to 1 and y is large enough, then Algorithm 3 is very close to Algorithm 2 and allows for a similar analysis. This method is very similar to the one used in [PS17] and [GW18], but our analysis is quite different.

More precisely, we take an element e only if w(e) > α(w*_1(e) + w*_2(e)) instead of w(e) > w*_1(e) + w*_2(e), and we delete elements if the ratio between two g-weights becomes larger than y (i.e., if g(e)/g(e′) > y). For technical purposes, we also need to keep independent sets T_1 and T_2 which maximize the weight functions w_1 and w_2, respectively. If an element with small g-weight is in T_1 or T_2, we do not delete it, as this would modify the w_i-weights and the selection of coming elements. We show that this algorithm is a semi-streaming algorithm with an approximation guarantee of (2 + ε) for an appropriate selection of the parameters (see Lemma 7 for the space requirement and Theorem 8 for the approximation guarantee).

Lemma 6.
Let S be the subset generated by Algorithm 3 with α ≥ 1 and y = ∞. Then w(S*) ≤ 2αg(S).

Proof. We define w_α : E → R by w_α(e) = w(e) if e ∈ S and w_α(e) = w(e)/α otherwise. By construction, Algorithm 3 and Algorithm 2 give the same set S, and the same weight function g, for this modified weight function. By Lemma 4, w_α(S*) ≤ 2g(S). On the other hand, w(S*) ≤ α·w_α(S*).

Lemma 7.
Let S be the subset generated by Algorithm 3 with α = 1 + ε and y = min(r_1, r_2)/ε², and let S* be a maximum weight independent set, where r_1 and r_2 are the ranks of M_1 and M_2 respectively. Then w(S*) ≤ 2(1 + 2ε + o(ε)) g(S). Furthermore, at any point of time, the size of S is at most r_1 + r_2 + min(r_1, r_2) log_α(y/ε).

Proof. We first prove that the generated set S satisfies w(S*) ≤ 2(1 + 2ε + o(ε)) g(S), and we then verify the space requirement of the algorithm, i.e., that it is a semi-streaming algorithm. Let us call S′ the set of elements selected by Algorithm 3, including the elements deleted later. By Lemma 6, we have 2αg(S′) ≥ w(S*), so all we have to prove is that g(S′) − g(S) ≤ (ε + o(ε)) g(S). We set i ∈ {1, 2} to be the index of the matroid with the smaller rank.

In our analysis, it will be convenient to think that the algorithm maintains the maximum weight independent set T_i of M_i throughout the stream. At the arrival of an element e that is added to S, the set T_i is updated as follows. If T_i ∪ {e} ∈ I_i, then e is simply added to T_i. Otherwise, before updating T_i, there is an element e* ∈ T_i such that w_i(e*) = w*_i(e) and T_i \ {e*} ∪ {e} is a maximum weight independent set in M_i with respect to w_i. Thus we can speak of elements which are replaced by another element in T_i. By construction, if e replaces f in T_i, then w_i(e) > α·w_i(f).

We can now divide the elements of S′ into stacks in the following way: if e replaces an element f in T_i, then we add e on top of the stack containing f; otherwise we create a new stack containing only e. At the end of the stream, each element e ∈ T_i is in a different stack, and each stack contains exactly one element of T_i, so let us call S′_e the stack containing e whenever e ∈ T_i. We define S_e to be the restriction of S′_e to S. In particular, each element from S′ is in exactly one stack S′_e, and each element from S is in exactly one stack S_e.
For each stack S′_e, we set e_del(S′_e) to be the highest element of S′_e which was removed from S. By construction, g(S′_e) − g(S_e) ≤ w_i(e_del(S′_e)). On the other hand, w_i(f) ≤ (1 + 1/ε) g(f) for any element f ∈ S′ (otherwise we would not have selected it), so g(S′_e) − g(S_e) ≤ (1 + 1/ε) g(e_del(S′_e)). As e_del(S′_e) was removed from S, we have g(e_del(S′_e)) < g_max/y. As there are exactly r_i stacks, we get g(S′) − g(S) < r_i (1 + 1/ε) g_max/y = (ε + ε²) g_max ≤ (ε + o(ε)) g(S).

We now have to prove that the algorithm fits the semi-streaming criteria. In fact, the size of S never exceeds r_1 + r_2 + r_i log_α(y/ε). By the pigeonhole principle, if S contains at least r_i log_α(y/ε) elements besides those of T_1 ∪ T_2, then there is at least one stack S_e which contains at least log_α(y/ε) of them. By construction, the w_i-weight increases by a factor of at least α each time we add an element on the same stack, so the w_i-weight ratio between the lowest and the highest element of this stack would be at least y/ε. As w_i(f) ≤ (1 + 1/ε) g(f), the ratio between the corresponding g-weights would be at least y, and we would remove the lowest element, as it is not in T_1 or T_2.

Theorem 8.
Let S be the subset generated by running Algorithm 3 with α = 1 + ε and y = min(r_1, r_2)/ε². Then there exists a subset T ⊆ S independent in M_1 and in M_2 such that w(T) ≥ g(S). Furthermore, T is a 2(1 + 2ε + o(ε))-approximation for the intersection of two matroids.

Proof. Let S* be a maximum weight independent set. By Lemma 7, we have 2(1 + 2ε + o(ε)) g(S) ≥ w(S*). By Lemma 5 we can select an independent set T with w(T) ≥ g(S) if the algorithm does not delete elements. Let S′ be the set of elements selected by Algorithm 3, including the elements deleted later. As long as we do not delete elements from T_1 or T_2, Algorithm 2 restricted to S′ will select the same elements, with the same weights, so we can consider S′ to be generated by Algorithm 2. We now observe that all the arguments used in Lemma 5 also work for a subset of S′; in particular, it is also true for S that we can find an independent set T ⊆ S such that w(T) ≥ g(S).

Remark 9.
Algorithm 3 is not the most memory-efficient algorithm possible; it aims for simplicity instead. Using the notion of stacks introduced in the proof of Lemma 7, it is possible to modify the algorithm and reduce the memory requirement by a factor log(min(rank(M_1), rank(M_2))).

Remark 10.
The techniques of this section can also be used in the case when the ranks of the matroids are unknown. Specifically, the algorithm can maintain the stacks created in the proof of Lemma 7 and allow for an error of ε in the first two stacks created, and an error of ε/2^i in the next 2^i stacks.

Remark 11.
It is easy to construct examples where the set S only contains a 2α-approximation (for example with bipartite graphs), so up to a factor of (1 + ε) our analysis is tight.

Algorithm 3
Semi-streaming adaptation of Algorithm 2
Input:
A stream of the elements and 2 matroids (which we call M_1, M_2) on the same ground set E, a real number α ≥ 1, and a real number y > 1.

Output:
A set X ⊆ E that is independent in both matroids.
Whenever we write an assignment of a variable with subscript i, it means we do it for i = 1, 2.

S ← ∅
for element e in the stream do
    calculate w*_i(e) = max({0} ∪ {θ : e ∈ span_{M_i}({f ∈ S | w_i(f) ≥ θ})}).
    if w(e) > α(w*_1(e) + w*_2(e)) then
        g(e) ← w(e) − w*_1(e) − w*_2(e)
        S ← S ∪ {e}
        w_i(e) ← g(e) + w*_i(e)
        Let T_i be a maximum weight independent set of M_i with respect to w_i.
        Let g_max = max_{e ∈ S} g(e).
        Remove from S all elements e′ ∈ S such that y · g(e′) < g_max and e′ ∉ T_1 ∪ T_2.
    end if
end for
return a maximum weight set T ⊆ S that is independent in M_1 and M_2

Extension to submodular functions
In this section, we consider the problem of submodular matroid intersection in the semi-streaming model. We first give the definition of a submodular function and then formally define our problem.
Definition 12 (Submodular function). A set function f : 2^E → R is submodular if it satisfies that for any two sets A, B ⊆ E, f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B). For any two sets A, B ⊆ E, let f(A | B) := f(A ∪ B) − f(B). For any element e and set A ⊆ E, let f(e | A) := f(A ∪ {e}) − f(A). Now, an equivalent and more intuitive definition for f to be submodular is that for any two sets A ⊆ B ⊆ E and e ∈ E \ B, it holds that f(e | A) ≥ f(e | B). The function f is called monotone if for any element e ∈ E and set A ⊆ E, it holds that f(e | A) ≥ 0.

In the problem of submodular matroid intersection, we are given two matroids M_1 = (E, I_1), M_2 = (E, I_2) on a common ground set E and oracle access to a non-negative submodular function f : 2^E → R≥0 on the powerset of the elements of the ground set. The goal is to find a subset X ⊆ E that is independent in both matroids, i.e., X ∈ I_1 and X ∈ I_2, and whose value f(X) is maximized.

Our Algorithm 4 is a straightforward generalization of Algorithm 2 and Algorithm 1 of [LW20]. Since the weight of an element e now depends on the underlying set it would be added to, we (naturally) define the weight of e to be the additional value e provides after adding it to the set S, i.e., w(e) = f(e | S). If e provides S a good enough value, i.e., f(e | S) ≥ α(w*_1(e) + w*_2(e)), we add it to the set S, but now only with a probability q. This probability q is the most important difference between Algorithm 3 and Algorithm 4. It is a trick that we borrow from Algorithm 1 of [LW20], which is useful when f is non-monotone because of the following Lemma 2.2 of [BFNS14].

Lemma 13 (Lemma 2.2 in [BFNS14]). Let h : 2^E → R≥0 be a non-negative submodular function, and let S be a random subset of E containing every element of E with probability at most q (not necessarily independently). Then E[h(S)] ≥ (1 − q) h(∅).
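The marginal-value notation f(e | A) and the diminishing-returns characterization can be illustrated with a small coverage function (a toy example of ours, not taken from the paper):

```python
# A toy coverage function: f(A) = |union of the sets indexed by A|.
# Coverage functions are non-negative, monotone and submodular.
# (Illustrative example; the sets below are made up.)
SETS = {"u": {1, 2}, "v": {2, 3}, "w": {3, 4}}

def f(A):
    covered = set()
    for name in A:
        covered |= SETS[name]
    return len(covered)

def marginal(e, A):
    # f(e | A) = f(A ∪ {e}) − f(A): the extra value e adds on top of A.
    return f(set(A) | {e}) - f(A)

# Diminishing returns: for A ⊆ B, f(e | A) ≥ f(e | B).
print(marginal("v", set()))   # 2: both items of v are new
print(marginal("v", {"u"}))   # 1: item 2 is already covered by u
```

Adding "v" to the larger set {"u"} is worth less than adding it to the empty set, which is exactly the inequality f(e | A) ≥ f(e | B) for A ⊆ B.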
In our proof, we can relate the weight of the set that we pick to the value f(S* ∪ S_f), where S_f denotes the elements in the stack when the algorithm stops and S* denotes the optimum set of elements. If the function f is monotone, this is sufficient, as f(S* ∪ S_f) ≥ f(S*). This, however, is not true if the function f is non-monotone. Here, one can use Lemma 13 with the function h(T) = f(T ∪ S*). This enables us to conclude that E[f(S* ∪ S_f)] ≥ (1 − q) f(S*).

We extend the analysis of Section 4 by using ideas from [LW20] to analyze our algorithm. Before going into the technical details, we give a brief overview of our analysis. For the sake of intuition, we assume that Algorithm 4 neither deletes elements nor skips elements with probability 1 − q. Then, due to the fact that the weight of an element e is the additional value it provides to the current set S, one can relate the weight of the independent set picked with the weight of the optimal solution given the set S_f, i.e., f(S* | S_f), by basically using the analysis of the previous section. However, this is not enough, as the weight of the optimal solution is f(S*). But we can still relate the gain of S_f to f(S_f), similar to [LW20], which helps us relate f(S* ∪ S_f) and the weight of our solution. In order to extend this to the case when elements are skipped with probability 1 − q, we show the above to hold in expectation, similar to [LW20], which is helpful for dealing with non-monotone functions because of Lemma 13. Finally, we remark that one can use an analysis similar to Section 4 to show that deleting elements does not affect the weight of the solution by much.

Algorithm 4 Extension of Algorithm 3 to submodular functions
Input:
A stream of the elements and 2 matroids (which we call M_1, M_2) on the same ground set E, a submodular function f : 2^E → R≥0, a real number α ≥ 1, a real number q such that 0 ≤ q ≤ 1, and a real number y.

Output:
A set X ⊆ E that is independent in both matroids.
Whenever we write an assignment of a variable with subscript i, it means we do it for i = 1, 2.

S ← ∅
for element e in the stream do
    calculate w*_i(e) = max({0} ∪ {θ : e ∈ span_{M_i}({f ∈ S | w_i(f) ≥ θ})}).
    if f(e | S) > α(w*_1(e) + w*_2(e)) then
        with probability 1 − q, continue; {// skip e with probability 1 − q}
        g(e) ← f(e | S) − w*_1(e) − w*_2(e)
        S ← S ∪ {e}
        w_i(e) ← g(e) + w*_i(e)
        Let T_i be a maximum weight independent set of M_i with respect to w_i.
        Let g_max = max_{e ∈ S} g(e).
        Remove from S all elements e′ ∈ S such that y · g(e′) < g_max and e′ ∉ T_1 ∪ T_2.
    end if
end for
return a maximum weight set T ⊆ S that is independent in M_1 and M_2

Let S_f denote the set S generated when the algorithm stops and S′_f denote the union of S_f and the elements that were deleted by the algorithm. For the sake of analysis, we define the weight w(e) of an element e to be the additional value it provided to the set S when it appeared in the stream, i.e., w(e) = f(e | S). Like before, we extend the definition of the weight functions w_1 and w_2 for an element e that is not added to S as w_i(e) = w*_i(e) for i ∈ {1, 2}. We note here that all the functions defined above are random variables which depend on the internal randomness of the algorithm. Unless we explicitly mention it, we generally make statements with respect to any fixed realization of the internal random choices of the algorithm.

In our analysis, we will prove properties about our algorithm that are already proven for Algorithms 2 and 3 in the previous sections. Our proof strategy will be simply running Algorithm 2 or 3 with the appropriate weight function, which will mimic running our original algorithm. Hence, we will prove these statements in a black-box fashion.
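The per-element decision of Algorithm 4 can be sketched as follows. This is a simplified illustration of ours: the exchange thresholds w*_1(e), w*_2(e) are assumed to be supplied by a span oracle as in the pseudocode, and the deletion step is omitted.

```python
import random

def consider(e, S, f, w1_star, w2_star, alpha, q, rng):
    # One arrival step in the spirit of Algorithm 4 (simplified sketch).
    gain = f(S | {e}) - f(S)               # w(e) = f(e | S)
    if gain <= alpha * (w1_star + w2_star):
        return False                        # marginal value too small
    if rng.random() > q:
        return False                        # skip e with probability 1 - q
    return True                             # e would be added to S

# With q = 1 nothing is skipped, so the rule is purely the threshold test.
f = lambda A: len(A)                        # modular toy function
rng = random.Random(0)
print(consider("e1", set(), f, 0.0, 0.0, 1.5, 1.0, rng))   # True: gain 1 > 0
print(consider("e2", {"x"}, f, 0.4, 0.4, 1.5, 1.0, rng))   # False: 1 <= 1.2
```

Setting q = 1/(2α + 1) instead of q = 1 is what makes the analysis go through for non-monotone f, via Lemma 13.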
A weight function that we will use repeatedly in our proofs is w′ : E → R≥0, where w′(e) = w(e) if e ∈ S′_f and w′(e) = 0 otherwise. This has the effect of discarding elements not in S′_f, i.e., elements that were never picked by the algorithm, either because they did not provide a good enough value or because they did but were still skipped.

Lemma 14.
Consider the set S′_f which is the union of S_f generated by Algorithm 4 and the elements it deletes. Then a maximum weight independent set T_i in M_i for i ∈ {1, 2} over the whole ground set E can be selected to be a subset of S′_f, i.e., T_i ⊆ S′_f, and it satisfies w_i(T_i) = g(S′_f).

Proof. Consider running Algorithm 2 with the weight function w′. Notice that doing this generates a stack containing exactly the elements in the set S′_f and exactly the same functions w_1, w_2 and g. Now by applying Lemma 3, we get our result.

We prove the following lemma, similar to [LW20], which relates the gain of elements in S′_f to the weight of the optimal solution given the set S′_f, i.e., f(S* | S′_f). Notice that the below lemma holds only in expectation for q ≠ 1.

Lemma 15. Denote by S′_f the union of S_f generated by Algorithm 4 with q ∈ {1/(2α + 1), 1} and the elements it deletes. Then, E[f(S* | S′_f)] ≤ 2α E[g(S′_f)].

Proof. We first prove the lemma for q = 1, as the proof is easier than that for q = 1/(2α + 1). Consider running Algorithm 2 with the weight function w′′ : E → R≥0 defined as follows. If e ∈ S′_f, then w′′(e) = w(e); else w′′(e) = w(e)/α. Notice that doing this generates a stack containing exactly the elements in the set S′_f and exactly the same functions w_1, w_2 and g. Now by applying Lemma 4, we get that w(S*) ≤ 2αg(S′_f). By submodularity, we get f(S* | S′_f) ≤ 2αg(S′_f).

Now we prove the lemma for q = 1/(2α + 1). We first define λ : E → R for an element e ∈ E as λ(e) = f(e | S′_f). Notice that, by submodularity of f and the definition of λ, we have f(S* | S′_f) ≤ λ(S*). Hence, it suffices to prove E[λ(S*)] ≤ 2α E[g(S′_f)]. We prove this below.

Let R_e be the event that the element e ∈ E does not give us a good enough value, i.e., that it satisfies α(w*_1(e) + w*_2(e)) ≥ w(e). We have two cases to consider now.

1. The first is when R_e is true.
Then, for any fixed choice of the randomness of the algorithm for which R_e is true, we argue as follows. By definition, w_i(e) = w*_i(e). Hence, α(w_1(e) + w_2(e)) ≥ w(e). Also, w(e) = f(e | S), where S is the stack when e appeared in the stream. As S ⊆ S′_f, by submodularity and the definition of λ, we get that w(e) ≥ λ(e). Hence, we also get that α E[w_1(e) + w_2(e) | R_e] ≥ E[λ(e) | R_e].

2. The second is when R_e is false. Then, for any fixed choice of the randomness of the algorithm for which R_e is false, we argue as follows. Here, e is picked with probability q given the set S at the time e appeared in the stream. If we pick e, then w_1(e) + w_2(e) = g(e) + w*_1(e) + g(e) + w*_2(e) = 2w(e) − w*_1(e) − w*_2(e). Otherwise, if we do not pick e, then w_1(e) + w_2(e) = w*_1(e) + w*_2(e). Hence, the expected value of w_1(e) + w_2(e) satisfies

E[w_1(e) + w_2(e) | ¬R_e, S] = 2qw(e) + (1 − 2q)(w*_1(e) + w*_2(e)) ≥ 2qw(e).

The last inequality follows as we have q = 1/(2α + 1) ≤ 1/2. By the choice of q, we get that α E[w_1(e) + w_2(e) | ¬R_e, S] ≥ 2qαw(e) = (1 − q)w(e). Moreover, λ(e) = 0 if e is picked (as then e ∈ S′_f), and λ(e) ≤ w(e) by submodularity if e is not picked, so E[λ(e) | ¬R_e, S] ≤ (1 − q)w(e). By the law of total expectation, conditioning on R_e not taking place, we get α E[w_1(e) + w_2(e) | ¬R_e] ≥ E[λ(e) | ¬R_e].

Finally, by the law of total expectation and points 1 and 2, we obtain that α E[w_1(e) + w_2(e)] ≥ E[λ(e)] holds for any element e ∈ E. Applying this to the elements of S*, we get that α E[w_1(S*) + w_2(S*)] ≥ E[λ(S*)]. On the other hand, by Lemma 14, we have w_i(T_i) ≥ w_i(S*) (since T_i is a maximum weight independent set in M_i with respect to w_i) and w_i(T_i) = g(S′_f), thus g(S′_f) ≥ w_i(S*) for i = 1, 2. Hence, we get that E[λ(S*)] ≤ 2α E[g(S′_f)].

Since we would like to relate the gain of elements in S′_f to the optimal solution, we bound the value of f(S′_f) in terms of the gain below, similar to [LW20].

Lemma 16.
Consider the set S′_f which is the union of S_f generated by Algorithm 4 and the elements it deletes. Then, g(S′_f) ≥ (1 − 1/α) f(S′_f).

Proof. By definition, any element e ∈ S′_f satisfied w(e) ≥ α(w*_1(e) + w*_2(e)). Hence, g(e) ≥ w(e) − w(e)/α. Summing over all elements in S′_f, we get g(S′_f) ≥ (1 − 1/α) w(S′_f) ≥ (1 − 1/α) f(S′_f), where the last inequality (not an equality, as S′_f also contains deleted elements) follows by the definition of w and the submodularity of f.

Our algorithm only has the set S_f and not S′_f, which also includes the deleted elements. Hence, in our next lemma, we prove that the gain of elements in these two sets is roughly the same.

Lemma 17. Consider the set S′_f which is the union of S_f generated by running Algorithm 4 with α > 1 and y = min(r_1, r_2)/δ² for any δ such that 0 < δ ≤ α − 1, and the elements it deletes. Here, r_i is the rank of M_i for i ∈ {1, 2}. Then, g(S′_f) − g(S_f) ≤ δg(S_f). Moreover, at any point during the execution, S contains at most r_1 + r_2 + min(r_1, r_2) log_α(y/(α − 1)) elements.

Proof. Consider running Algorithm 3 with the weight function w′. Notice that doing this generates a stack containing exactly the elements as in the set S_f, exactly the same set of deleted elements, and exactly the same functions w_1, w_2 and g. Moreover, this generates the exact same stacks as Algorithm 4 at every point of the execution. Now by the proof of Lemma 7, we get our result.

Lastly, we prove that there exists a set T that is independent in both matroids and has weight at least the gain of the elements in S_f.

Lemma 18.
Let S_f be the subset generated by Algorithm 4. Then there exists a subset T ⊆ S_f independent in M_1 and in M_2 such that w(T) ≥ g(S_f).

Proof. Consider running Algorithm 3 with the weight function w′. Recall that for any element e ∈ S′_f, w′(e) = w(e), and otherwise w′(e) = 0. Notice that doing this generates a stack containing exactly the elements as in the set S_f and exactly the same functions w_1, w_2 and g. The result follows by Theorem 8.

Now we have all the lemmas to prove our main theorem, which we state below.

Theorem 19.
The subset S_f generated by Algorithm 4 with α > 1, q ∈ {1/(2α + 1), 1} and y = min(r_1, r_2)/δ² for any δ such that 0 < δ ≤ α − 1 contains a ((4α² − 1)/(2α − 2) + O(δ))-approximation in expectation for the intersection of two matroids with respect to a non-monotone submodular function f. This is optimized by taking α = 1 + √3/2, resulting in an approximation ratio of 4 + 2√3 + O(δ) ≈ 7.46. Moreover, the same algorithm run with q = 1 and y = min(r_1, r_2)/δ² is (2α + α/(α − 1) + O(δ))-approximate if f is monotone. This is optimized by taking α = 1 + 1/√2, which yields a (3 + 2√2 + O(δ)) ≈ 5.83 approximation.

Proof. By Lemmas 15 and 16, we have that 2α E[g(S′_f)] ≥ E[f(S* | S′_f)] and g(S′_f) · α/(α − 1) ≥ f(S′_f). Combining them, we get

(2α + α/(α − 1)) E[g(S′_f)] ≥ E[f(S′_f) + f(S* | S′_f)] = E[f(S* ∪ S′_f)].

By Lemma 17, we also get that g(S′_f) − g(S_f) ≤ δg(S_f). This gives us that

(2α + α/(α − 1))(1 + δ) E[g(S_f)] ≥ E[f(S* ∪ S′_f)].

Now, by Lemma 18, there exists a subset T ⊆ S_f independent in M_1 and M_2 such that w(T) ≥ g(S_f). By the definition of w and the submodularity of f, we get that f(T) ≥ w(T). This in turn implies f(T) ≥ g(S_f). This gives us that

(2α + α/(α − 1))(1 + δ) E[f(T)] ≥ E[f(S* ∪ S′_f)].

Notice that the above inequality also holds if q = 1, as all the above arguments also work if q = 1. Hence, if f is monotone, we get f(S* ∪ S′_f) ≥ f(S*), which gives us our desired inequality by rearranging terms. However, if f is non-monotone, one has to work a little more, as we show below.

To deal with the case when f is non-monotone, we use Lemma 13 and take h(T) = f(S* ∪ T) for any T ⊆ E within the lemma statement, to get that E[f(S* ∪ S′_f)] ≥ (1 − q) f(S*), as every element of E appears in S′_f with probability at most q. Putting everything together, we get that

(2α + α/(α − 1))(1 + δ) E[f(T)] ≥ (1 − q) f(S*).

Now, substituting the value of q = 1/(2α + 1) and rearranging terms, we get the desired inequality.

Remark 20.
We can exactly match the approximation ratios in [LW20], i.e., without the extra additive factor of O(δ), by not deleting elements. Moreover, S then stores at most O(min(r_1, r_2) log_α |E|) elements at any point if we assume that the values of f are polynomially bounded in |E|, an assumption that the authors in [LW20] make.

We can easily extend Algorithm 3 to the intersection of k matroids (see Algorithm 5 for details). Most results remain true; in particular, we can have w(S*) ≤ (1 + ε)k·g(S) by carefully selecting α and y. The only part which does not work is the selection of the independent set from S. Indeed, matroid kernels are very specific to two matroids. We now prove that a similar approach fails, by showing that the natural generalization of kernels to 3 matroids is false, and that a counterexample can arise from Algorithm 5. Thus, any attempt to find a (k + ε)-approximation using our techniques must bring some fundamentally new idea. Still, we conjecture that the generated set S contains such an approximation.

Proposition 21.
There exist a set S and 3 matroids (S, I_1), (S, I_2), (S, I_3) such that there does not exist a set T ⊆ S with S = D_{M_1}(T) ∪ D_{M_2}(T) ∪ D_{M_3}(T) (see Lemma 5 for a definition of D_{M_i}(T)) and T independent in M_1, M_2 and M_3, where <_i is given by the weights w_i generated by Algorithm 5 (for α sufficiently small).

Proof. We set S = {a, x, y, z, b}, which are given in this order to Algorithm 5. We now define I_1, I_2, I_3 in the following way. A set of 2 elements is in I_i if and only if it is:
- in I_1, if it is not {a, x};
- in I_2, if it is not {a, y};
- in I_3, if it is not {a, z}.
A set of 3 elements is in I_i if and only if each of its subsets of 2 elements is in I_i and it is:
- in I_1, if it contains z;
- in I_2, if it contains x;
- in I_3, if it contains y.
A set of 4 or more elements is not in I_i.

Let us verify that these constraints correspond to matroids. As the problem is symmetrical, it is sufficient to verify that M_1 is a matroid. The 3-element independent sets in M_1 are exactly {y, z, b}, {x, z, b}, {x, y, z}, {a, z, b} and {a, y, z}. Now we consider X, Y ∈ I_1 with |X| < |Y|. We should find e ∈ Y \ X such that X ∪ {e} ∈ I_1. If X = ∅, take any element from Y. If X is a singleton, then there are two cases: either X ⊆ {a, x}, or not. In any case, Y contains at most one element from {a, x}. As it contains at least two elements, Y has to contain an element from {y, z, b}. In the first case, we can add any of these to X to get an independent set. In the second case, X ⊆ {y, z, b}, so we can add any element to X and it will remain independent, so just pick any element from Y \ X. If X contains two elements, then Y is one of the sets from the list above. In particular, it contains z. If z ∉ X, then we can add z to X. Otherwise, either X ⊆ {y, z, b}, in which case we can add any element, or X is {a, z} or {x, z}. In either case, Y must contain an element from {y, b}, which we can add to X.

We now set the weights w(a) = 1, w(x) = w(y) = w(z) = 3 and w(b) = 8, and run Algorithm 5.
• Element a has weight 1, and {a} is independent in M_1, M_2 and M_3, so we set w_1(a) = w_2(a) = w_3(a) = g(a) = 1 and a is added to S.
• Element x is spanned by a in M_1, and not spanned by any element in M_2 and M_3, so we get g(x) = w(x) − w*_1(x) − w*_2(x) − w*_3(x) = 3 − 1 − 0 − 0 = 2 > 0, and we add x to S. We also set w_1(x) = 3 and w_2(x) = w_3(x) = 2.
• Elements y and z are treated very similarly to x.
• Element b is spanned in all three matroids by the elements of w_i-weight at least 2. On the other hand, b is not spanned in any matroid by the elements of w_i-weight strictly bigger than 2, so w*_i(b) = 2 for i = 1, 2, 3, thus g(b) = 8 − 2 − 2 − 2 = 2 and w_i(b) = 2 + 2 = 4 for every i.

To recapitulate, we have w_1(a) = 1, w_1(x) = 3, w_1(y) = w_1(z) = 2, w_1(b) = 4, and the w_2 and w_3 weights are similar, with y respectively z being the heavier element.

Let us assume for a contradiction that T is a solution to the problem. T must contain b, as it is the heaviest element in every matroid. If T contains a, then it cannot contain any of x, y, z, as otherwise it would not be independent in one of the matroids, so we would have T ⊆ {a, b}. But x has to be in at least one D_{M_i}(T), and the set {x, b} is independent in every matroid and has a bigger weight than {a, b}, so x would not be in D_{M_i}(T). Thus T cannot contain a. As the problem is symmetrical for {x, y, z}, it is sufficient to test T = {z, b}, T = {y, z, b} and T = {x, y, z, b}. The last two are not in I_1, so the only remaining possibility is T = {z, b}. But then y is not in D_{M_1}(T) or D_{M_3}(T), because {y, z, b} is independent in M_1 and M_3, and it is not in D_{M_2}(T) because w_2(y) > w_2(z) (that is, y <_2 z) and {y, b} is independent in M_2. As y is not in any D_{M_i}(T), this concludes the proof.

Remark 22.
In the example of Proposition 21, we have g(S) = w(a) + w(b), and {a, b} is independent in all 3 matroids, so this does not contradict Conjecture 23.

Conjecture 23.
The stack S generated by Algorithm 5 contains a k-approximation for the intersection of k matroids, for any k.

In the case k = 2, this corresponds to Theorem 2. For any k, one can easily find examples where S does not contain more than a k-approximation, but we were unable to find an example where it does not contain a k-approximation.

Algorithm 5 Extension of Algorithm 3 to k matroids

Input:
A stream of the elements and k matroids (which we call M_1, . . . , M_k) on the same ground set E, a real number α ≥ 1, and a real number y.

Output:
A set S ⊆ E of “saved” elements.
When we write an assignment of a variable with subscript i, it means we do it for i = 1, . . . , k.

S ← ∅
for element e in the stream do
    calculate w*_i(e) = max({0} ∪ {θ : e ∈ span_{M_i}({f ∈ S | w_i(f) ≥ θ})}).
    if w(e) > α · Σ_{i=1}^{k} w*_i(e) then
        g(e) ← w(e) − Σ_{i=1}^{k} w*_i(e)
        S ← S ∪ {e}
        w_i(e) ← g(e) + w*_i(e)
        Let T_i be a maximum weight independent set of M_i with respect to w_i.
        Let g_max = max_{e ∈ S} g(e).
        Remove from S all elements e′ ∈ S such that y · g(e′) < g_max and e′ ∉ ∪_{i=1}^{k} T_i.
    end if
end for

The authors thank Moran Feldman for pointing us to the recent paper [LW20].
References

[BFNS14] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz,
Submodular maximization with cardinality constraints, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2014, pp. 1433–1452.

[BYBFR04] Reuven Bar-Yehuda, Keren Bendel, Ari Freund, and Dror Rawitz,
Local ratio: A unified framework for approximation algorithms. In memoriam: Shimon Even 1935–2004, ACM Computing Surveys (CSUR) (2004), 422–463.

[BYE85] R. Bar-Yehuda and S. Even, A local-ratio theorem for approximating the weighted vertex cover problem, Analysis and Design of Algorithms for Combinatorial Problems (G. Ausiello and M. Lucertini, eds.), North-Holland Mathematics Studies, vol. 109, North-Holland, 1985, pp. 27–45.

[CK13] Amit Chakrabarti and Sagar Kale,
Submodular maximization meets streaming: Matchings, matroids, and more, CoRR abs/1309.2038 (2013).

[CS14] M. Crouch and D.M. Stubbs,
Improved streaming algorithms for weighted matching, via unweighted matching, Leibniz International Proceedings in Informatics (LIPIcs) (2014), 96–104.

[Edm79] Jack Edmonds, Matroid intersection, Discrete Optimization I, Annals of Discrete Mathematics, vol. 4, Elsevier, 1979, pp. 39–49.

[FKM+04] Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang,
On graph problems in a semi-streaming model, July 2004, pp. 531–543.

[Fle01] Tamás Fleiner,
A matroid generalization of the stable matching polytope, vol. 2081, June 2001, pp. 105–114.

[GW18] Mohsen Ghaffari and David Wajc,
Simplified and Space-Optimal Semi-Streaming (2+epsilon)-Approximate Matching, 2nd Symposium on Simplicity in Algorithms (SOSA 2019) (Dagstuhl, Germany) (Jeremy T. Fineman and Michael Mitzenmacher, eds.), OpenAccess Series in Informatics (OASIcs), vol. 69, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018, pp. 13:1–13:8.

[HKMY20] Chien-Chung Huang, Naonori Kakimura, Simon Mauras, and Yuichi Yoshida,
Approximability of monotone submodular function maximization under cardinality and matroid constraints in the streaming model, CoRR abs/2002.05477 (2020).

[LW20] Roie Levin and David Wajc,
Streaming submodular matching meets the primal-dual method, arXiv preprint arXiv:2008.10062 (2020).

[McG05] Andrew McGregor,
Finding graph matchings in data streams, Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques (Berlin, Heidelberg) (Chandra Chekuri, Klaus Jansen, José D. P. Rolim, and Luca Trevisan, eds.), Springer Berlin Heidelberg, 2005, pp. 170–181.

[Mut05] S. Muthukrishnan,