Improved Multi-Pass Streaming Algorithms for Submodular Maximization with Matroid Constraints
Chien-Chung Huang†    Theophile Thiery‡    Justin Ward‡

Abstract
We give improved multi-pass streaming algorithms for the problem of maximizing a monotone or arbitrary non-negative submodular function subject to a general p-matchoid constraint in the model in which elements of the ground set arrive one at a time in a stream. The family of constraints we consider generalizes both the intersection of p arbitrary matroid constraints and p-uniform hypergraph matching. For monotone submodular functions, our algorithm attains a guarantee of p + 1 + ε using O(p/ε) passes and requires storing only O(k) elements, where k is the maximum size of a feasible solution. This immediately gives an O(1/ε)-pass (2 + ε)-approximation algorithm for monotone submodular maximization in a matroid and a (3 + ε)-approximation for monotone submodular matching. Our algorithm is oblivious to the choice of ε and can be stopped after any number of passes, delivering the appropriate guarantee. We extend our techniques to obtain the first multi-pass streaming algorithm for general, non-negative submodular functions subject to a p-matchoid constraint with a number of passes independent of the size of the ground set and k. We show that a randomized O(p/ε)-pass algorithm storing O(pk log(k)/ε) elements gives a (p + 1 + γ̄_off + O(ε))-approximation, where γ̄_off is the guarantee of the best-known offline algorithm for the same problem.

1 Introduction

Many discrete optimization problems in theoretical computer science, operations research, and machine learning can be cast as special cases of maximizing a submodular function f subject to some constraint. Formally, a function f : 2^X → R≥0 is submodular if and only if f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) for all A, B ⊆ X. One reason for the ubiquity of submodularity in optimization is that it captures a natural notion of diminishing returns.

∗ A preliminary version of this work was presented at the International Conference on Approximation Algorithms for Combinatorial Optimization Problems (APPROX 2020).
This work was funded by the grants ANR-19-CE48-0016 and ANR-18-CE40-0025-01 from the French National Research Agency (ANR). This work was supported by EPSRC New Investigator Award EP/T006781/1.
† CNRS, DI ENS, Université PSL, Paris, France, [email protected]
‡ School of Mathematical Sciences, Queen Mary University of London, United Kingdom, Emails: {t.f.thiery, justin.ward}@qmul.ac.uk

Let f(e | A) := f(A + e) − f(A) be the marginal increase obtained in f when adding an element e to a set A (where here and throughout we use the shorthands A + e and A − e for A ∪ {e} and A \ {e}, respectively). It is well-known that f is submodular if and only if f(e | B) ≤ f(e | A) for any A ⊆ B and any e ∉ B. If additionally we have f(e | A) ≥ 0 for all A and all e ∉ A, we say that f is monotone.

Here, we consider the problem of maximizing both monotone and arbitrary submodular functions subject to an arbitrary p-matchoid constraint on the set of elements that can be selected. Formally, a p-matchoid M_p = (I_p, X) on X is given by a collection of matroids {M_i = (X_i, I_i)}, each defined on some subset of X, where each e ∈ X is present in at most p of these subsets. A set S ⊆ X is then independent if and only if S ∩ X_i ∈ I_i for each matroid M_i. One can intuitively think of a p-matchoid as a collection of matroids in which each element "participates" in at most p of the matroid constraints. The resulting family of constraints is quite general and captures both intersections of p matroid constraints (by letting X_i = X for all M_i) and matchings in p-uniform hypergraphs (by considering X as a collection of hyperedges and defining a uniform matroid constraint for each vertex, ensuring that at most one hyperedge containing this vertex is selected).

In many applications of submodular optimization, such as summarization [1, 20, 22, 24], we must process datasets so large that they cannot be stored in memory.
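These definitions can be verified by brute force on a small instance. The sketch below is ours, not the paper's: it uses a toy coverage function (a standard example of a monotone submodular function) and checks the submodularity inequality, the diminishing-returns characterization, and monotonicity on every pair of subsets.

```python
# Brute-force check of the definitions above on a toy coverage function
# f(S) = |union of the sets covered by S|.  The instance and all names
# here are illustrative assumptions, not from the paper.
from itertools import combinations

COVER = {'a': {1, 2}, 'b': {2, 3}, 'c': {3, 4}}
GROUND = set(COVER)

def f(S):
    return len(set().union(set(), *(COVER[e] for e in S)))

def marginal(e, A):
    """The marginal increase f(e | A) = f(A + e) - f(A)."""
    return f(A | {e}) - f(A)

subsets = [set(c) for r in range(len(GROUND) + 1)
           for c in combinations(sorted(GROUND), r)]

for A in subsets:
    for B in subsets:
        # Submodularity: f(A) + f(B) >= f(A | B) + f(A & B).
        assert f(A) + f(B) >= f(A | B) + f(A & B)
        if A <= B:
            # Diminishing returns: f(e | B) <= f(e | A) for A subset of B, e not in B.
            assert all(marginal(e, B) <= marginal(e, A) for e in GROUND - B)
    # Monotonicity: every marginal value is non-negative.
    assert all(marginal(e, A) >= 0 for e in GROUND - A)
```

On this instance, for example, f('c' | {'a'}) = 2 but f('c' | {'a', 'b'}) = 1, illustrating diminishing returns.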
Thus, there has been recent interest in streaming algorithms for submodular optimization problems. In this context, we suppose the ground set X is initially unknown and elements arrive one-by-one in a stream. We suppose that the algorithm has an efficient oracle for evaluating the submodular function f on any given subset of X, but has only enough memory to store a small number of elements from the stream. Variants of standard greedy and local search algorithms have been developed that obtain a constant-factor approximation in this setting, but their approximation guarantees are considerably worse than those of their simple, offline counterparts.

Here, we consider the multi-pass setting in which the algorithm is allowed to perform several passes over a stream: in each pass all of X arrives in some order, and the algorithm is still only allowed to store a small number of elements. In the offline setting, simple variants of greedy [15] or local search [13, 18] algorithms in fact give the best-known approximation guarantees for maximizing submodular functions subject to p-matroid constraints or a general p-matchoid constraint. However, these algorithms potentially require considering all elements in X each time a choice is made. It is natural to ask whether this is truly necessary, or whether we could instead recover an approximation ratio nearly equal to these offline algorithms by using only a constant number of passes through the data stream.

1.1 Our Results

Here we show that for monotone submodular functions, O(1/ε) passes suffice to obtain guarantees only (1 + ε) times worse than those guaranteed by the offline local search algorithm. We give an O(p/ε)-pass streaming algorithm that gives a p + 1 + ε approximation for maximizing a monotone submodular function subject to an arbitrary p-matchoid constraint.
This immediately gives us an O(1/ε)-pass streaming algorithm attaining a 2 + ε approximation for matroid constraints and a 3 + ε approximation for matching constraints in graphs. Each pass of our algorithm is equivalent to a single pass of the streaming local search algorithm described by Chakrabarti and Kale [6] and Chekuri, Gupta, and Quanrud [7]. However, obtaining a rapid convergence to a p + 1 + ε approximation requires some new insights. We show that if a pass makes either large or small progress in the value of f, then the guarantee obtained at the end of this pass can be improved. Balancing these two effects then leads to a carefully chosen sequence of parameters for each pass. Our general approach is similar to that of Chakrabarti and Kale [6], but our algorithm is oblivious to the choice of ε. This allows us to give a uniform bound on the convergence of the approximation factor obtained after some number d of passes. This bound is actually available to the algorithm, and so we can certify the quality of the current solution after each pass. In practice, this allows for terminating the algorithm early if a sufficient guarantee has already been obtained. Even in the worst case, however, we improve on the number of passes required by similar previous results by a factor of O(ε⁻¹). Our algorithm only requires storing O(k) elements, where k is the rank of the given p-matchoid, defined as the size of the largest independent set of elements.

Building on these ideas, we also give a randomized, multi-pass algorithm that uses O(p/ε) passes and attains a p + 1 + γ̄_off + O(ε) approximation for maximizing an arbitrary submodular function subject to a p-matchoid constraint, where γ̄_off is the approximation ratio attained by the best-known offline algorithm for the same problem. To the best of our knowledge, ours is the first multipass algorithm when the function is non-monotone with a number of passes independent of n and k, where n is the size of the ground set.
In this case, our algorithm requires storing O(pk log(k)/ε) elements. We remark that to facilitate comparison with existing work, we have stated all approximation guarantees as factors γ ≥ 1. However, we note that if one states ratios of the form 1/γ less than 1, then our results lead to 1/γ − ε approximations in which all dependence on p can be eliminated (by simply selecting some ε′ = pε).

1.2 Related Work

Current State of the Art
Constraint   | Offline (M)   | Offline (NN)                 | Streaming (M) | Streaming (NN)
matroid      | e/(e−1) [5]   | 2.598 [3]                    | 4 [6, 7, 11]  | 5.828 [11]
(p,b)-hyp.m  | p + ε [13]    | (p² + ε)/(p − 1) [13]        | 4p [7, 11]    | 4p + 2 − o(1) [11]
p-mat.int    | p + ε [18]    | (p² + (p − 1)ε)/(p − 1) [18] | 4p [6, 7, 11] | 4p + 2 − o(1) [11]
p-matchoid   | p + 1 [2, 15] | see [8, 12]                  | 4p [7, 11]    | 4p + 2 − o(1) [11]

Table 1: Approximation ratios in the offline and streaming settings.
Constraint   | Multipass [6]                      | Our results (M)           | Our results (NN)
matroid      | 2 + ε, O(1/ε) passes               | 2 + ε, O(1/ε) passes      | 4.598 + ε, O(1/ε) passes
(p,b)-hyp.m  | p + 1 + ε, O(p log(p)/ε²) passes ∗ | p + 1 + ε, O(p/ε) passes  | p + 1 + γ̄_off + O(ε), O(p/ε) passes
p-mat.int    | p + 1 + ε, O(p log(p)/ε²) passes ∗ | p + 1 + ε, O(p/ε) passes  | p + 1 + γ̄_off + O(ε), O(p/ε) passes
p-matchoid   | ∗                                  | p + 1 + ε, O(p/ε) passes  | p + 1 + γ̄_off + O(ε), O(p/ε) passes

Table 2: Summary of results for maximizing a submodular function in the multipass streaming setting. Here γ̄_off denotes the guarantee of the best-known offline algorithm for the corresponding problem. We use the following abbreviations: M means monotone and NN means that f is non-negative; p-mat.int means p-matroid intersection and (p,b)-hyp.m denotes rank-p hypergraph b-matching. ∗: If we restrict ourselves to algorithms performing O(poly(ε, p)) passes, then only the 1-pass setting is understood.

There is a vast literature on submodular maximization with various constraints and different models of computation. In the offline model, the work on maximizing a monotone submodular function goes back to Nemhauser, Wolsey and Fisher [25]. Monotone submodular functions are well studied and many new and powerful results have been obtained since then. The best approximation algorithm under a matroid constraint is due to Calinescu et al. [5], which is the best that can be done using a polynomial number of queries [25] (if f is given as a value oracle) or assuming P ≠ NP [9] (if f is given explicitly). For more general constraints, Lee, Sviridenko and Vondrák obtained a p + ε approximation algorithm under a p-matroid intersection constraint [18]. Feldman et al. [13] obtained the same approximation ratio for the general class of p-exchange systems. For general p-matchoid constraints, the best approximation ratio is p + 1, which is attained by the standard greedy algorithm [15].

Non-monotone objectives are less understood even under the simplest assumptions. The current best-known result for maximizing a submodular function under a matroid constraint is 2.598 [3], which is far from the 2.093 hardness result [16]. Table 1 gives the best known bounds for the constraints that we consider in the paper.

Due to the large volume of data in modern applications, there has also been a line of research focused on developing fast algorithms for submodular maximization [2, 23]. However, all results we have discussed so far assume that the entire instance is available at any time, which may not be feasible for massive datasets. This has motivated the study of streaming submodular maximization algorithms with low memory requirements. Badanidiyuru et al. [1] achieved a 2 + ε approximation algorithm for maximizing a monotone submodular function under a cardinality constraint in the streaming setting. This was recently shown to be the best possible bound attainable in one pass with memory sublinear in the size of the instance [14]. Chakrabarti and Kale [6] gave a 4p approximation for p-matroid intersection constraints or p-uniform hypergraph matching. Later, Chekuri et al. [7] generalized their argument to arbitrary p-matchoid constraints, and also gave a modified algorithm for handling non-monotone submodular objectives. A fast, randomized variant of the algorithm of [6] was studied by Feldman, Karbasi and Kazemi [11], who showed that it has the same approximation guarantee when f is monotone and achieves a 2p + 2√(p(p+1)) + 1 = 4p + 2 − o(1) approximation for general submodular functions. Related to our work, there is an active research direction focusing on streaming (sub)modular maximization subject to matching constraints. For submodular maximization, the best single-pass approximation is 3 + 2√2, and improved guarantees are possible with multipass streaming algorithms using O(1/ε) passes (see [2, 17, 21, 22, 26]). Huang et al. [17] achieved a 2 + ε approximation under a knapsack constraint in O(1/ε) passes. For the intersection of p partition matroids or rank-p hypergraph matching, the number of passes becomes dependent on p.
Chakrabarti and Kale [6] showed that if one allows O(p log(p)/ε²) passes, a p + 1 + ε approximation is possible. Here we show how to obtain the same guarantee for an arbitrary p-matchoid constraint, while reducing the number of passes to O(p/ε). (In [6] a bound of O(log p/ε) passes is stated. We note that there appears to be a small oversight in their analysis, arising from the fact that their convergence parameter κ in this case is O(ε²/p). In any case, it seems reasonable to assume that p is a small constant in most cases.)

For monotone functions, our main multi-pass algorithm is given by the procedure MultipassLocalSearch in Algorithm 1.

Algorithm 1: The multi-pass streaming local search algorithm

    procedure MultipassLocalSearch(α, β_1, ..., β_d):
        S_0 ← ∅
        for i = 1 to d:
            let S̃ be the output of StreamingLocalSearch(α, β_i, S_{i−1})
            S_i ← S̃
        return S_d

    procedure StreamingLocalSearch(α, β, S_init):
        S ← S_init
        for each x in the stream:
            if x ∈ S_init: discard x and continue
            C_x ← Exchange(x, S)
            if f(x | S) ≥ α + (1 + β) Σ_{c ∈ C_x} ν(c, S):
                S ← S \ C_x + x
        return S

We suppose that we are given a submodular function f : 2^X → R≥0 and a p-matchoid constraint M_p = (I_p, X) on X, given as a collection of matroids {M_i = (X_i, I_i)}. Our procedure runs for d passes, each of which uses a modification of the algorithm of Chekuri, Gupta, and Quanrud [7], given as the procedure StreamingLocalSearch. In each pass, the procedure StreamingLocalSearch maintains a current solution S, which is initially set to some S_init. Whenever an element x ∈ S_init arrives again in the subsequent stream, the procedure simply discards x. For all other elements x, the procedure invokes a helper procedure Exchange, given formally in Algorithm 2, to find an appropriate set C_x ⊆ S of up to p elements so that S \ C_x + x ∈ I_p. It then exchanges x with C_x if this gives a significantly improved solution. The improvement is measured with respect to a set of auxiliary weights ν(x, S) maintained by the algorithm. For u, v ∈ X, let u ≺ v denote that element u arrives before v in the stream. Then, we define the incremental value of an element e with respect to a set T as

    ν(e, T) = f(e | {t′ ∈ T : t′ ≺ e}).

There is a slight difficulty here in that we must also define incremental values for the elements of S_init. To handle this difficulty, we in fact define ≺ with respect to a pretend stream ordering. Note that in all invocations of the procedure StreamingLocalSearch made by MultipassLocalSearch, the set S_init is either ∅ or the result of a previous application of StreamingLocalSearch. In our pretend ordering (≺), all of S_init first arrives in the same relative pretend ordering as in the previous pass, followed by all of X \ S_init in the same order given by the stream X. We then define our incremental values with respect to this pretend stream ordering.

Algorithm 2: The procedure Exchange(x, S)

    procedure Exchange(x, S):
        C_x ← ∅
        for each M_ℓ = (X_ℓ, I_ℓ) with x ∈ X_ℓ:
            S_ℓ ← S ∩ X_ℓ
            if S_ℓ + x ∉ I_ℓ:
                T_ℓ ← {y ∈ S_ℓ : S_ℓ − y + x ∈ I_ℓ}
                C_x ← C_x + arg min_{t ∈ T_ℓ} ν(t, S)
        return C_x

Using these incremental values, StreamingLocalSearch proceeds as follows. When an element x ∉ S_init arrives, StreamingLocalSearch computes a set of elements C_x ⊆ S that can be exchanged for x. StreamingLocalSearch replaces C_x with x if and only if the marginal value f(x | S) with respect to S is at least (1 + β) times larger than the sum of the current incremental values ν(c, S) of all elements c ∈ C_x, plus some threshold α, where α, β > 0. In this case, we say that x is accepted. Otherwise, we say that x is rejected. An element x ∈ S that has been accepted may later be removed from S if x ∈ C_y for some later element y that arrives in the stream. In this case we say that x is evicted.

The approximation ratio obtained by one pass of StreamingLocalSearch depends on the parameter β in two ways, which can be intuitively understood in terms of the standard analysis of the offline local search algorithm for the problem. Intuitively, if β is chosen to be too large, more valuable elements will be rejected upon arrival and so, in the offline setting, our solution would be only approximately locally optimal, leading to a deterioration of the guarantee by a factor of (1 + β). However, in the streaming setting, the algorithm only attempts to exchange an element upon its arrival, and so the final solution will not necessarily be even (1 + β)-approximately locally optimal: an element x may be rejected because f(x | S) is small when it arrives, but the processing of later elements in the stream can evict some elements of S. After these evictions, we could have f(x | S) larger.
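To make the procedures concrete, here is a compact, self-contained sketch in Python. It is our own illustration, not the authors' code: it specializes the p-matchoid to a partition matroid (a 1-matchoid, where each group g may contribute at most cap[g] elements), uses a toy coverage objective, and fixes α = 0 with β_i = 1/i, the parameter choice used for a single matroid constraint in Theorem 2 below.

```python
# Executable sketch (ours, not the authors' code) of Algorithms 1-2 for the
# special case of a partition matroid, i.e. a 1-matchoid: elements are
# partitioned into groups, and a set is independent if it has at most cap[g]
# elements of each group g.  We take alpha = 0 and beta_i = 1/i.

def coverage_fn(cover):
    """Monotone submodular objective f(S) = |union of the sets covered by S|."""
    return lambda S: len(set().union(set(), *(cover[e] for e in S)))

def nu(f, e, S, pos):
    """Incremental value nu(e, S) = f(e | {t in S : t arrives before e})."""
    prefix = [t for t in S if pos[t] < pos[e]]
    return f(prefix + [e]) - f(prefix)

def exchange(x, S, part, cap, f, pos):
    """Return C_x, elements to evict so that S \\ C_x + x stays independent."""
    group = [y for y in S if part[y] == part[x]]
    if len(group) < cap[part[x]]:
        return []                                  # room left: evict nothing
    return [min(group, key=lambda t: nu(f, t, S, pos))]

def streaming_local_search(f, stream, S_init, alpha, beta, part, cap):
    # Pretend ordering: S_init first (previous relative order), then the rest.
    order = list(S_init) + [x for x in stream if x not in S_init]
    pos = {e: i for i, e in enumerate(order)}
    S = list(S_init)
    for x in stream:
        if x in S_init:                            # re-arrivals are discarded
            continue
        C = exchange(x, S, part, cap, f, pos)
        gain = f(S + [x]) - f(S)                   # f(x | S)
        if gain >= alpha + (1 + beta) * sum(nu(f, c, S, pos) for c in C):
            S = [y for y in S if y not in C] + [x]
    return sorted(S, key=pos.get)                  # keep pretend order for next pass

def multipass_local_search(f, stream, d, part, cap):
    S = []
    for i in range(1, d + 1):                      # pass i uses beta_i = 1/i
        S = streaming_local_search(f, stream, S, 0.0, 1.0 / i, part, cap)
    return S
```

For example, with cover = {'a': {1, 2}, 'b': {2, 3}, 'c': {4}, 'd': {1, 2, 3, 4, 5}}, groups {a, b} and {c, d} each of capacity 1, the first pass accepts a, rejects b, accepts c, and then evicts c in favor of d; subsequent passes make no further exchanges, so the algorithm settles on the feasible solution {a, d} of value 5, which is optimal on this toy instance.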
The key observation in the analyses of [6, 7] is that the total value of these evicted elements (and so also the total increase in the marginal value of all rejected elements) can be bounded by O(β) times the final value of f(S) at the end of the algorithm. Intuitively, if β is chosen to be too small, the algorithm will make more exchanges, evicting more elements, which may result in rejected elements being much more valuable with respect to the final solution. Selecting the optimal value of β thus requires balancing these two effects.

Here, we observe that this second effect depends only on the total value of those elements that were accepted after an element arrives. To use this observation, we measure the ratio δ = f(S_init)/f(S̃) between the value of the initial solution S_init of some pass of StreamingLocalSearch and the final solution S̃ produced by this pass. If δ is relatively small, and so one pass makes a lot of progress, then this pass gives us an improvement of δ⁻¹ over the ratio already guaranteed by the previous pass, since f(S̃) = δ⁻¹ f(S_init). On the other hand, if δ is relatively large, and so one pass does not make much progress, then the total increase in the value of our rejected elements can be bounded by ((1 − δ)/β) f(S̃), and so the potential loss due to only testing these elements at arrival is relatively small. Balancing these two effects allows us to set β smaller in each subsequent pass and obtain an improved guarantee.

We now turn to the analysis of our algorithm. Here we focus on a single pass of StreamingLocalSearch. For T, U ⊆ X we let f(T | U) := f(T ∪ U) − f(U). Throughout, we use S to denote the current solution maintained by this pass (initially, S = S_init). The following key properties of incremental values will be useful in our analysis. We defer the proof to the Appendix.

Lemma 2.1.
For any T ⊆ U ⊆ X:

1. Σ_{e ∈ T} ν(e, T) = f(T) − f(∅).
2. ν(e, U) ≤ ν(e, T) for all e ∈ T.
3. f(T | U \ T) ≤ Σ_{t ∈ T} ν(t, U).
4. At all times during the execution of StreamingLocalSearch, ν(e, S) ≥ α for all e ∈ S.

Let A denote the set of elements accepted during the present pass. These are the elements which were present in the solution S at some previous time during the execution of this pass. Initially we have A = S = S_init, and whenever an element is added to S during this pass we also add this element to A. Let Ã and S̃ denote the sets of elements A and S at the end of this pass. Note that we regard all elements of S_init as having been accepted at the start of the pass. The following lemma follows from the analysis of Chekuri, Gupta, and Quanrud [7] in the single-pass setting. We give a complete, self-contained proof in Appendix A. Each element e ∈ Ã \ S̃ was accepted but later evicted by the algorithm. For any such evicted element, we let χ(e) denote the value of ν(e, S) at the moment that e was removed from S.

Lemma 2.2. Let f : 2^X → R≥0 be a submodular function. Suppose S̃ is the solution produced at the end of one pass of StreamingLocalSearch and Ã is the set of all elements accepted during this pass. Then,

    f(OPT ∪ Ã) ≤ (p + βp − β) Σ_{e ∈ Ã\S̃} χ(e) + (p + βp + 1) f(S̃) + kα.

We now derive a bound for the summation Σ_{e ∈ Ã\S̃} χ(e) (representing the value of evicted elements) in terms of the total gain f(S̃) − f(S_init) made by the pass, and also bound the total number of accepted elements in terms of f(OPT).

Lemma 2.3.
Let f : 2^X → R≥0 be a submodular function. Suppose that S̃ is the solution produced at the end of one pass of StreamingLocalSearch and Ã is the set of all elements accepted during this pass. Then, |Ã| ≤ f(OPT)/α and

    Σ_{e ∈ Ã\S̃} χ(e) ≤ (1/β) (f(S̃) − f(S_init)).

Proof.
We consider the quantity Φ(A) := Σ_{e ∈ A\S} χ(e). Suppose some element a with C_a ≠ ∅ is added to S by the algorithm, evicting the elements of C_a. Then (as each element can be evicted only once) Φ(A) increases by precisely Δ := Σ_{e ∈ C_a} χ(e). Let S_a⁻, S_a⁺ and A_a⁻, A_a⁺ be the sets S and A, respectively, immediately before and after a is accepted. Let δ_a := f(S_a⁺) − f(S_a⁻) be the change in the objective function after the exchange between a and C_a. Since a is accepted, we must have f(a | S_a⁻) ≥ α + (1 + β) Σ_{e ∈ C_a} ν(e, S_a⁻). Then,

    δ_a = f(S_a⁻ \ C_a + a) − f(S_a⁻)
        = f(a | S_a⁻ \ C_a) − f(C_a | S_a⁻ \ C_a)
        ≥ f(a | S_a⁻) − f(C_a | S_a⁻ \ C_a)                            (by submodularity)
        ≥ f(a | S_a⁻) − Σ_{e ∈ C_a} ν(e, S_a⁻)                          (by Lemma 2.1 (3))
        ≥ α + (1 + β) Σ_{e ∈ C_a} ν(e, S_a⁻) − Σ_{e ∈ C_a} ν(e, S_a⁻)   (since a is accepted)
        = α + β Σ_{e ∈ C_a} χ(e)                                        (by definition of χ(e))
        = α + βΔ.

It follows that whenever Φ(A) increases by Δ, f(S) must increase by at least βΔ. Initially, Φ(A) = 0 and f(S) = f(S_init), and at the end of the algorithm, Φ(A) = Σ_{e ∈ Ã\S̃} χ(e) and f(S) = f(S̃). Thus, β Σ_{e ∈ Ã\S̃} χ(e) ≤ f(S̃) − f(S_init).

It remains to show that |Ã| ≤ f(OPT)/α. For this, we note that the above chain of inequalities also implies that every time an element is accepted (and so |A| increases by one), f(S) also increases by at least α. Thus, we have f(OPT) ≥ f(S̃) ≥ α|Ã|.

Using Lemma 2.3 to bound the sum of exit values in Lemma 2.2 then immediately gives us the following guarantee for each pass performed in MultipassLocalSearch. In the i-th such pass, we will have S_init = S_{i−1}, S̃ = S_i, and β = β_i. We let A_i denote the set Ã of all elements accepted during this particular pass.

Lemma 2.4.
Let f : 2^X → R≥0 be a submodular function. Consider the i-th pass of StreamingLocalSearch performed by MultipassLocalSearch, and let A_i be the set of all elements accepted during this pass. Then, |A_i| ≤ f(OPT)/α and

    f(OPT ∪ A_i) ≤ (p/β_i + p − 1) [f(S_i) − f(S_{i−1})] + (p + pβ_i + 1) f(S_i) + kα.

We now show how to use Lemma 2.4 together with a careful selection of parameters α and β_1, ..., β_d to derive guarantees for the solution f(S_i) produced after the i-th pass made in MultipassLocalSearch. Here, we consider the case that f is a monotone function. In this case, we have f(OPT) ≤ f(OPT ∪ A_i) for all i. We set α = 0 in each pass. In the first pass, we will set β_1 = 1. Then, since S_0 = ∅, Lemma 2.4 immediately gives:

    f(OPT) ≤ f(OPT ∪ A_1) ≤ (2p − 1) [f(S_1) − f(∅)] + (2p + 1) f(S_1) ≤ 4p f(S_1).   (1)

For passes i > 1, we use the following, which relates the approximation guarantee obtained in this pass to that from the previous pass.
Theorem 1.
For i > 1, suppose that f(OPT) ≤ γ_{i−1} · f(S_{i−1}), and define δ_i = f(S_{i−1})/f(S_i) as the ratio between the values produced by the two previous passes. Then,

    f(OPT) ≤ min{ γ_{i−1} δ_i , (p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1 } · f(S_i) + kα.

Proof.
From the definitions of γ_{i−1} and δ_i, we have:

    f(OPT) ≤ γ_{i−1} f(S_{i−1}) = γ_{i−1} δ_i f(S_i).

On the other hand, f(S_i) − f(S_{i−1}) = (1 − δ_i) f(S_i). Thus, Lemma 2.4 gives:

    f(OPT) ≤ [(p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1] f(S_i) + kα.

Given the guarantee γ_{i−1} from the previous pass, γ_{i−1} δ_i is an increasing function of δ_i and (p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1 is a decreasing function of δ_i. Thus, the guarantee we obtain in Theorem 1 is always at least as good as that obtained when these two values are equal. Setting:

    γ_{i−1} δ_i = (p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1,

and solving for δ_i gives us:

    δ_i = p(1 + β_i)² / (p + β_i(γ_{i−1} + p − 1)).   (2)

In the following analysis, we consider this value of δ_i, since the guarantee given by Theorem 1 will always be no worse than that given by this value. The analysis for a single matroid constraint follows from our results for p-matchoids, but the analysis and parameter values obtained are much simpler, so we present it separately, first.

Theorem 2.
Suppose we run Algorithm 1 for an arbitrary matroid constraint and monotone submodular function f, with β_i = 1/i. Then (2(i + 1)/i) f(S_i) ≥ f(OPT) for all i ≥ 1. In particular, after i = 2/ε passes, (2 + ε) f(S_i) ≥ f(OPT).

Proof. Let γ_i be the guarantee for our algorithm after i passes. We show, by induction on i, that γ_i ≤ 2(i + 1)/i. For i = 1, we have β_1 = 1 and so from (1) we have γ_1 = 4, as required. For i > 1, suppose that γ_{i−1} ≤ 2i/(i − 1). Since p = 1 and β_i = 1/i, identity (2) gives:

    δ_i ≤ (1 + 1/i)² / (1 + (1/i)(2i/(i − 1))) = ((i + 1)²/i²) · ((i − 1)/(i + 1)) = (i − 1)(i + 1)/i².

Thus, by Theorem 1, the i-th pass of our algorithm has guarantee γ_i satisfying:

    γ_i ≤ γ_{i−1} δ_i ≤ (2i/(i − 1)) · ((i − 1)(i + 1)/i²) = 2(i + 1)/i,

as required.

Theorem 3.
Suppose we run Algorithm 1 for an arbitrary p-matchoid constraint and monotone submodular function f, with β_1 = 1 and

    β_i = (γ_{i−1} − p − 1)/(γ_{i−1} + p − 1), for i > 1,

where γ_i is given by the recurrence γ_1 = 4p and

    γ_i = 4p γ_{i−1}(γ_{i−1} − 1)/(γ_{i−1} + p − 1)², for i > 1.

Then (p + 1 + 4p/i) f(S_i) ≥ f(OPT) for all i ≥ 1. In particular, after i = 4p/ε passes, (p + 1 + ε) f(S_i) ≥ f(OPT).

Proof. We first show that the approximation guarantee of our algorithm after i passes is given by γ_i. Setting β_1 = 1, we obtain γ_1 = 4p from (1), agreeing with our definition. For passes i > 1, let β_i = (γ_{i−1} − p − 1)/(γ_{i−1} + p − 1). As in the case of a matroid constraint, Theorem 1 implies that the guarantee for pass i will be at most δ_i γ_{i−1}, where δ_i is chosen to satisfy (2). Specifically, if we set

    δ_i = p(1 + β_i)² / (p + β_i(γ_{i−1} + p − 1)) = 4p(γ_{i−1} − 1)/(γ_{i−1} + p − 1)²,

then we have δ_i γ_{i−1} = γ_i.

We now show by induction on i that γ_i ≤ p + 1 + 4p/i. In the case i = 1, we have γ_1 = 4p and the claim follows immediately from p ≥ 1. In the general case i > 1, we may assume without loss of generality that γ_{i−1} ≥ p + 1; otherwise the theorem holds immediately, as each subsequent pass can only increase the value of the solution. Then, we note (as shown in Appendix B) that for p ≥ 1 and γ_{i−1} ≥ p + 1, γ_i is an increasing function of γ_{i−1}. By the induction hypothesis, γ_{i−1} ≤ p + 1 + 4p/(i − 1). Therefore:

    γ_i ≤ 4p (p + 1 + 4p/(i − 1)) (p + 4p/(i − 1)) / (2p + 4p/(i − 1))² ≤ p + 1 + 4p/i,

as required. The last inequality above follows from straightforward but tedious algebraic manipulations, which can be found in Appendix B.

In this section, we show that the guarantees for monotone submodular maximization can be extended to non-monotone submodular maximization even when dealing with multiple passes. Our main algorithm is given by the procedure
MultipassRandomizedLocalSearch in Algorithm 3. In each pass, it calls a procedure RandomizedLocalSearch, which is an adaptation of StreamingLocalSearch, to process the stream. Note that each such pass produces a pair of feasible solutions S and S′, which we now maintain throughout MultipassRandomizedLocalSearch. The set S is maintained similarly as before and gradually improves by exchanging "good" elements into a solution throughout the pass. The set S′ will be maintained by considering the best output of an offline algorithm that we run after each pass, as described in more detail below.

To deal with non-monotone submodular functions, we will limit the probability of elements being added to S. Instead of exchanging good elements on arrival, we store them in a buffer B of size m. When the buffer becomes full, an element is chosen uniformly at random and added to S. Adding a new element to the current solution may affect the quality of the remaining elements in the buffer, and thus we need to re-evaluate them and remove the elements that are no longer good.

Algorithm 3: The randomized multi-pass streaming algorithm

    procedure MultipassRandomizedLocalSearch(α, β_1, ..., β_d, m):
        S_0 ← ∅; S′_0 ← ∅
        for i = 1 to d:
            let (S̃, S′) be the output of RandomizedLocalSearch(S_{i−1}, α, β_i, m)
            S_i ← S̃; S′_i ← arg max{f(S′_{i−1}), f(S′)}
        return S̄ = arg max{f(S_d), f(S′_d)}

    procedure RandomizedLocalSearch(S_init, α, β, m):
        S ← S_init; B ← ∅
        for each x in the stream:
            C_x ← Exchange(x, S)
            if f(x | S) ≥ α + (1 + β) Σ_{e ∈ C_x} ν(e, S):
                B ← B + x
            if |B| = m:
                x ← uniformly random element from B
                C_x ← Exchange(x, S)
                B ← B − x; S ← S + x − C_x
                for each x′ in B:
                    C_{x′} ← Exchange(x′, S)
                    if f(x′ | S) < α + (1 + β) Σ_{e ∈ C_{x′}} ν(e, S):
                        B ← B − x′
        S′ ← Offline(B)
        return (S, S′)

As before, we let A denote the set of elements that were previously added to S during the current pass of the algorithm. Note that we do not consider an element to be accepted until it has actually been added to S from the buffer. For any fixed set of random choices, the execution of RandomizedLocalSearch can be considered as the execution of StreamingLocalSearch on the following stream: we suppose that an element x arrives whenever it is selected from the buffer and accepted into S. All elements that are discarded from the buffer after accepting x then arrive, and will also be rejected by StreamingLocalSearch. Any elements remaining in the buffer after the execution of the algorithm do not arrive in the stream. Applying Lemma 2.4 with respect to this pretend stream ordering allows us to bound f(S̃) with respect to f(OPT \ B) (that is, the value of the part of OPT that does not remain in the buffer B) after a single pass of RandomizedLocalSearch. Formally, let B̃_i be the contents of the buffer after the i-th pass of our algorithm. Then, applying Lemma 2.4 to the set OPT \ B̃_i, and taking expectation, gives:
1) ( E [ f ( S i )] − E [ f ( S i − )])+ ( p + βp + 1) E [ f ( S i )] + αk . (3)In order to bound the value of the elements in ˜ B i , we apply any offline ¯ γ off -approximation algorithm Offline to the buffer at the end of the pass to obtaina solution S ′ . In MultipassRandomizedLocalSearch , we then rememberthe best such offline solution S ′ i computed across the first i passes. Then, in the i th pass, we have E [ f ( OPT ∩ ˜ B i )] ≤ ¯ γ off E [ f ( S ′ )] ≤ ¯ γ off E [ f ( S ′ i )] . (4)From submodularity of f and A i ∩ ˜ B i = ∅ we have f ( A i ∪ OPT ) ≤ f ( A i ∪ ( OPT \ ˜ B i )) + f ( OPT ∩ ˜ B i ). Thus, combining (3) and (4) we have: E [ f ( A i ∪ OPT )] ≤ ( p/β + p −
1) ( E [ f ( S i )] − E [ f ( S i − )])+ ( p + βp + 1) E [ f ( S i )] + ¯ γ off E [ f ( S ′ i )] + αk . (5)To relate the right-hand side to f ( OPT ) we use the following result from Buch-binder et al. [4]:
Lemma 4.1 (Lemma 2.2 in [4]) . Let f : 2 X → R ≥ be a non-negative submodularfunction. Suppose that A is a random set where no element e ∈ X appears in A with probability more than p . Then, E [ f ( A )] ≥ (1 − p ) f ( ∅ ) . Moreover, for anyset Y ⊆ X , it follows that E [ f ( Y ∪ A ) ] ≥ (1 − p ) f ( Y ) . We remark that a similar theorem also appeared earlier in Feige, Mirrokni,and Vondr´ak [10] for a random set that contains each element independently with probability exactly p . Here, the probability that an element occurs in A i isdelicate to handle because such an element may either originate from the start-ing solution S i − or be added during the pass. Thus, we use a rougher estimate.By definition A i ⊆ A i ∪ A i − ∪ . . . ∪ A . Thus, Pr[ e ∈ A i ] ≤ Pr[ e ∈ A i ∪ . . . ∪ A ].The number of selections during the j th pass is at most | A j | and by Lemma 2.4(applied to the set OPT \ ˜ B j due to our pretend stream ordering in each pass j ), | A j | ≤ f ( OPT \ ˜ B j ) /α ≤ f ( OPT ) /α in any pass. Here, the second inequalityfollows from the optimality of OPT , and the fact that any subset of the feasiblesolution
OPT is also feasible for our p -matchoid constraint. Thus, the totalnumber of selections in the first i passes at most P ij =1 | A j | ≤ i · f ( OPT ) /α .We select an element only when the buffer is full, and each selection is made14ndependently and uniformly at random from the buffer. Thus, the probabil-ity that any given element is selected when the algorithm makes a selection isat most 1 /m and by a union bound, Pr[ e ∈ A i ∪ . . . ∪ A ] ≤ i · f ( OPT ) / ( mα ).Let d be the number of passes that the algorithm makes and suppose we set α = εf ( OPT ) / k (in Appendix C we show that this can be accomplished ap-proximately by guessing f ( OPT ), which can be done at the expense of an extrafactor O (log k ) space). Finally, let m = 4 dk/ε . Then, applying Lemma 4.1,after i ≤ d passes we have: E [ f ( A i ∪ OPT )] ≥ (1 − d · f ( OPT ) / ( mα )) f ( OPT ) ≥ (1 − ε/ f ( OPT ) . (6)Our definition of α also implies that αk ≤ ε/ f ( OPT ). Using this and equation(6) in (5), we obtain:(1 − ε ) f ( OPT ) ≤ ( p/β + p − E [ f ( S i )] − E [ f ( S i − )]) + ( p + βp + 1) E [ f ( S i )] + ¯ γ off E [ f ( S ′ i )] . (7)As we show in Appendix C, the rest of the analysis then follows similarly to thatin Section 3, using the fact that f ( ¯ S ) = max { f ( S d ) , f ( S ′ d ) } . Theorem 4.
Let M_p = (X, I) be a p-matchoid of rank k and let f : 2^X → R_{≥0} be a non-negative submodular function. Suppose there exists an algorithm for the offline instance of the problem with approximation factor γ̄_off. For any ε > 0, the randomized streaming local-search algorithm returns a solution S̄ ∈ I such that f(OPT) ≤ (p + 1 + γ̄_off + O(ε)) E[f(S̄)], using a total space of O(p³k log(k)/ε³) and O(p/ε) passes.

References

[1] A. Badanidiyuru, B. Mirzasoleiman, A. Karbasi, and A. Krause. Streaming submodular maximization: massive data summarization on the fly. In Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 671–680, 2014.
[2] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1497–1514, 2013.
[3] N. Buchbinder and M. Feldman. Constrained submodular maximization via a nonsymmetric technique. Mathematics of Operations Research, 44(3):988–1005, 2019.
[4] N. Buchbinder, M. Feldman, J. Naor, and R. Schwartz. Submodular maximization with cardinality constraints. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1433–1452, 2014.
[5] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.
[6] A. Chakrabarti and S. Kale. Submodular maximization meets streaming: matchings, matroids, and more. Mathematical Programming, 154(1-2):225–247, 2015.
[7] C. Chekuri, S. Gupta, and K. Quanrud. Streaming algorithms for submodular function maximization. In Proc. International Colloquium on Automata, Languages, and Programming (ICALP), pages 318–330, 2015.
[8] C. Chekuri, J. Vondrák, and R. Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. SIAM Journal on Computing, 43(6):1831–1879, 2014.
[9] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, 1998.
[10] U. Feige, V. S. Mirrokni, and J. Vondrák. Maximizing non-monotone submodular functions. SIAM Journal on Computing, 40(4):1133–1153, 2011.
[11] M. Feldman, A. Karbasi, and E. Kazemi. Do less, get more: streaming submodular maximization with subsampling. In Advances in Neural Information Processing Systems (NeurIPS), pages 732–742, 2018.
[12] M. Feldman, J. Naor, and R. Schwartz. A unified continuous greedy algorithm for submodular maximization. In Proc. IEEE Symposium on Foundations of Computer Science (FOCS), pages 570–579, 2011.
[13] M. Feldman, J. S. Naor, R. Schwartz, and J. Ward. Improved approximations for k-exchange systems. In Proc. European Symposium on Algorithms (ESA), pages 784–798, 2011.
[14] M. Feldman, A. Norouzi-Fard, O. Svensson, and R. Zenklusen. The one-way communication complexity of submodular maximization with applications to streaming and robustness. In Proc. ACM Symposium on Theory of Computing (STOC), pages 1363–1374, 2020.
[15] M. L. Fisher, G. L. Nemhauser, and L. A. Wolsey. An analysis of approximations for maximizing submodular set functions II. Mathematical Programming Study, 8:73–87, 1978.
[16] S. O. Gharan and J. Vondrák. Submodular maximization by simulated annealing. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1098–1116, 2011.
[17] C.-C. Huang and N. Kakimura. Multi-pass streaming algorithms for monotone submodular function maximization. CoRR, abs/1802.06212, 2018.
[18] J. Lee, M. Sviridenko, and J. Vondrák. Submodular maximization over multiple matroids via generalized exchange properties. Mathematics of Operations Research, 35(4):795–806, 2010.
[19] R. Levin and D. Wajc. Streaming submodular matching meets the primal-dual method. arXiv preprint arXiv:2008.10062, 2020.
[20] H. Lin and J. A. Bilmes. Multi-document summarization via budgeted maximization of submodular functions. In Proc. Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), pages 912–920, 2010.
[21] A. McGregor and H. T. Vu. Better streaming algorithms for the maximum coverage problem. Theory of Computing Systems, 63(7):1595–1619, 2019.
[22] B. Mirzasoleiman, A. Badanidiyuru, and A. Karbasi. Fast constrained submodular maximization: personalized data summarization. In Proc. International Conference on Machine Learning (ICML), pages 1358–1367, 2016.
[23] B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi, J. Vondrák, and A. Krause. Lazier than lazy greedy. In Proc. AAAI Conference on Artificial Intelligence (AAAI), pages 1812–1818, 2015.
[24] B. Mirzasoleiman, S. Jegelka, and A. Krause. Streaming non-monotone submodular maximization: personalized video summarization on the fly. arXiv preprint arXiv:1706.03583, 2017.
[25] G. L. Nemhauser and L. A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.
[26] A. Norouzi-Fard, J. Tarnawski, S. Mitrović, A. Zandieh, A. Mousavifar, and O. Svensson. Beyond 1/2-approximation for submodular maximization on massive data streams. In Proc. International Conference on Machine Learning (ICML), pages 3826–3835, 2018.
A Proof of Lemma 2.2
Here, we give a self-contained analysis of the single-pass algorithm of Chekuri, Gupta, and Quanrud [7], corresponding to Algorithm 1 initialized with S_init = ∅. First, we prove Lemma 2.1, which concerns properties of the incremental values maintained by Algorithm 1.

Lemma 2.1. For any T ⊆ U ⊆ X:
1. ∑_{e∈T} ν(e, T) = f(T) − f(∅).
2. ν(e, U) ≤ ν(e, T) for all e ∈ T.
3. f(T | U \ T) ≤ ∑_{t∈T} ν(t, U).
4. At all times during the execution of StreamingLocalSearch, ν(e, S) ≥ α for all e ∈ S.

Proof. Property (1) follows directly from the telescoping summation

∑_{e∈T} ν(e, T) = ∑_{e∈T} [ f({e} ∪ {t′ ∈ T : t′ ≺ e}) − f({t′ ∈ T : t′ ≺ e}) ] = f(T) − f(∅).

Property (2) follows from submodularity, since T ⊆ U implies that {t′ ∈ T : t′ ≺ e} ⊆ {t′ ∈ U : t′ ≺ e}.

For property (3), we note that:

f(T | U \ T) = ∑_{t∈T} f(t | U \ T ∪ {t′ ∈ T : t′ ≺ t}) ≤ ∑_{t∈T} f(t | {u′ ∈ U : u′ ≺ t}) = ∑_{t∈T} ν(t, U),

where the first equation follows from a telescoping summation, and the inequality follows from submodularity, since {u′ ∈ U : u′ ≺ t} ⊆ U \ T ∪ {t′ ∈ T : t′ ≺ t}.

We prove property (4) by induction on the stream of arriving elements. Initially S = ∅. Thus, the first time that any element x is accepted, we must have C_x = ∅ and so f(x | S) ≥ α ≥ 0. After this element is accepted, we have ν(x, S) = ν(x, {x}) = f(x | ∅) ≥ α. Proceeding inductively, let S_x^− and S_x^+ be the set of elements in S before and after some new element x arrives and is processed by Algorithm 1, and suppose that ν(s, S_x^−) ≥ α for all s ∈ S_x^−. If x is rejected, we have S_x^+ = S_x^− and so ν(s, S_x^+) = ν(s, S_x^−) ≥ α for all s ∈ S_x^+. If x is accepted, then S_x^+ = (S_x^− \ C_x) + x and f(x | S_x^−) ≥ α + (1 + β) ∑_{e∈C_x} ν(e, S_x^−). Thus,

ν(x, S_x^+) = f(x | S_x^+ − x) ≥ f(x | S_x^−) ≥ α + (1 + β)|C_x| α ≥ α,

where the first inequality follows from submodularity and the second from the fact that x was accepted, together with the induction hypothesis. For any other s ∈ S_x^+, we have {t′ ∈ S_x^+ : t′ ≺ s} ⊆ {t′ ∈ S_x^− : t′ ≺ s}, and so, by submodularity, ν(s, S_x^+) ≥ ν(s, S_x^−) ≥ α, as required.

In our analysis we will use the following structural lemma from Chekuri et al. [7] (here restated in our notation). This lemma applies to the execution of our algorithm StreamingLocalSearch when S_init = ∅, and so no element is discarded upon arrival due to x ∈ S_init. However, we note that the execution of our algorithm is in fact exactly the same as this algorithm executed on the pretend stream ordering introduced in Section 2 to define the incremental values ν. Specifically, in each pass of our algorithm, the set S_init is a feasible solution produced by the preceding pass, and in the pretend stream ordering, all elements of S_init arrive in the same relative (pretend) order as in this preceding pass. It follows that whenever x ∈ S_init arrives in our pretend ordering for the present pass, we have C_x = ∅ and ν(x, S) = ν(x, S_init) ≥ α by Lemma 2.1 (4), since x was present in the feasible solution S = S_init at the end of the preceding pass.
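For intuition, the incremental values ν(e, T) and properties (1)–(3) of Lemma 2.1 can be checked numerically on a small coverage function (a standard example of a submodular function; the sets and the arrival order used below are purely illustrative and not taken from the paper):

```python
from itertools import chain

# A small coverage function: f(T) = |union of the sets indexed by T|.
# Coverage functions are submodular, so Lemma 2.1 (1)-(3) must hold.
SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"a", "e"}}

def f(T):
    return len(set(chain.from_iterable(SETS[e] for e in T)))

def nu(e, T):
    """Incremental value of e with respect to the (pretend) arrival
    order on T, here taken to be the natural order on element ids."""
    prefix = [t for t in sorted(T) if t < e]
    return f(prefix + [e]) - f(prefix)

T = [1, 3]
U = [1, 2, 3, 4]

# Property (1): the incremental values telescope to f(T) - f(empty).
assert sum(nu(e, T) for e in T) == f(T) - f([])

# Property (2): incremental values only shrink as the set grows.
assert all(nu(e, U) <= nu(e, T) for e in T)

# Property (3): f(T | U \ T) <= sum over t in T of nu(t, U).
marginal = f(U) - f([u for u in U if u not in T])
assert marginal <= sum(nu(t, U) for t in T)
print("Lemma 2.1 properties (1)-(3) hold on this example")
```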
Thus, each x ∈ S_init will first be accepted in our pretend stream ordering, and then the rest of X \ S_init is processed, exactly as in StreamingLocalSearch.

Recall that we let Ã be the set of all elements that were accepted by this pass of StreamingLocalSearch (and so at some point appeared in S). For each element x ∈ X, we let S_x^− be the current set S at the moment that x arrives and S_x^+ the set after x is processed. For an element e that is accepted but later evicted from S, let χ(e) be the incremental value ν(e, S) of e at the moment that e was evicted.

Lemma A.1 (Lemma 9 of [7]). Let T ∈ I be a feasible solution disjoint from Ã, and let S̃ be the output of the streaming algorithm. There exists a mapping ϕ : T → 2^Ã such that:
1. Every s ∈ S̃ appears in the set ϕ(t) for at most p choices of t ∈ T.
2. Every e ∈ Ã \ S̃ appears in the set ϕ(t) for at most p − 1 choices of t ∈ T.
3. For each t ∈ T: ∑_{c∈C_t} ν(c, S_t^−) ≤ ∑_{e∈ϕ(t)\S̃} χ(e) + ∑_{s∈ϕ(t)∩S̃} ν(s, S̃).

Using this charging argument, we can now prove Lemma 2.2 directly.

Lemma 2.2.
Let f : 2^X → R_{≥0} be a submodular function. Suppose S̃ is the solution produced at the end of one pass of StreamingLocalSearch and Ã is the set of all elements accepted during this pass. Then,

f(OPT ∪ Ã) ≤ (p + βp − β) ∑_{e∈Ã\S̃} χ(e) + (p + βp + 1) f(S̃) + kα.

Proof.
Let R = OPT \ Ã. Since S_r^− ⊆ Ã for all r, the submodularity of f implies that

∑_{r∈R} f(r | S_r^−) ≥ ∑_{r∈R} f(r | Ã) ≥ f(R ∪ Ã) − f(Ã) = f(OPT ∪ Ã) − f(Ã). (8)

For any r ∈ R, since r was rejected upon arrival,

f(r | S_r^−) ≤ (1 + β) ∑_{c∈C_r} ν(c, S_r^−) + α. (9)

Thus, applying Lemma A.1 we obtain:

∑_{r∈R} f(r | S_r^−) ≤ (1 + β) ∑_{r∈R} ∑_{c∈C_r} ν(c, S_r^−) + kα, ((9) and |R| ≤ k)
≤ (1 + β) ∑_{r∈R} [ ∑_{e∈ϕ(r)\S̃} χ(e) + ∑_{s∈ϕ(r)∩S̃} ν(s, S̃) ] + kα, (Lemma A.1 (3))
≤ (1 + β) [ (p − 1) ∑_{e∈Ã\S̃} χ(e) + p ∑_{s∈S̃} ν(s, S̃) ] + kα, (Lemma A.1 (1, 2))

where in the last inequality we have also used Lemma 2.1 (4), which implies that each χ(e) and ν(s, S̃) is non-negative. Combining the above inequality with (8), we obtain

f(OPT ∪ Ã) ≤ (1 + β)(p − 1) ∑_{e∈Ã\S̃} χ(e) + (1 + β) p ∑_{s∈S̃} ν(s, S̃) + f(Ã) + kα. (10)

We now bound f(Ã) in terms of the values ν(s, S̃) and χ(e). Since S ⊆ Ã at all times during the algorithm, and χ(e) = ν(e, S) at the moment e was evicted, we have χ(e) ≥ ν(e, Ã) by Lemma 2.1 (2). Thus,

f(Ã) − f(∅) = ∑_{a∈Ã} ν(a, Ã) = ∑_{s∈S̃} ν(s, Ã) + ∑_{e∈Ã\S̃} ν(e, Ã) ≤ ∑_{s∈S̃} ν(s, S̃) + ∑_{e∈Ã\S̃} χ(e), (11)

where the first equation follows from Lemma 2.1 (1), and the last inequality follows from Lemma 2.1 (2). Combining (10) and (11) we have:

f(OPT ∪ Ã) ≤ ((1 + β)(p − 1) + 1) ∑_{e∈Ã\S̃} χ(e) + ((1 + β)p + 1) ∑_{s∈S̃} ν(s, S̃) + f(∅) + kα
= (p + βp − β) ∑_{e∈Ã\S̃} χ(e) + (p + βp + 1) ∑_{s∈S̃} ν(s, S̃) + f(∅) + kα. (12)

By Lemma 2.1 (1), we have the following bound for the second summation in (12):

(p + βp + 1) ∑_{s∈S̃} ν(s, S̃) + f(∅) = (p + βp + 1)[ f(S̃) − f(∅) ] + f(∅) ≤ (p + βp + 1) f(S̃).

Combining this and (12) we obtain:

f(OPT ∪ Ã) ≤ (p + βp − β) ∑_{e∈Ã\S̃} χ(e) + (p + βp + 1) f(S̃) + kα.

B Calculations for the proof of Theorem 3
We recall that

γ_i = γ_{i−1} δ_i = 4p γ_{i−1}(γ_{i−1} − 1) / (γ_{i−1} − 1 + p)².

First, to see that γ_i is an increasing function of γ_{i−1} for p ≥ 1 and γ_{i−1} ≥ 1, we note that:

d γ_i / d γ_{i−1} = [ 4p(2γ_{i−1} − 1)(γ_{i−1} − 1 + p)² − 8p γ_{i−1}(γ_{i−1} − 1)(γ_{i−1} − 1 + p) ] / (γ_{i−1} − 1 + p)⁴
= 4p [ (2γ_{i−1} − 1)(γ_{i−1} − 1 + p) − 2γ_{i−1}(γ_{i−1} − 1) ] / (γ_{i−1} − 1 + p)³
= 4p [ (2p − 1)γ_{i−1} − (p − 1) ] / (γ_{i−1} − 1 + p)³ ≥ 0.

The third line follows from p ≥ 1 and γ_{i−1} ≥ 1. Since γ_i is increasing in γ_{i−1}, to establish by induction that γ_i ≤ p + 1 + 4p/i it thus suffices to show that

4p (p + 1 + 4p/(i−1)) (p + 4p/(i−1)) / (2p + 4p/(i−1))² ≤ p + 1 + 4p/i.

To this end, we compute:

4p (p + 1 + 4p/(i−1)) (p + 4p/(i−1)) / (2p + 4p/(i−1))²
= 4p ((p + 1)(i − 1) + 4p) (p(i − 1) + 4p) / (2p(i − 1) + 4p)²
= 4p ((p + 1)(i − 1) + 4p) · p(i + 3) / (2p(i + 1))²
= ((i − 1)(p + 1) + 4p)(i + 3) / (i + 1)²
= ((i² + 2i − 3)(p + 1) + (i + 3) · 4p) / (i + 1)²
= ((i² + 2i − 3) · i(p + 1) + (i² + 3i) · 4p) / (i(i + 1)²),

and

p + 1 + 4p/i = ((p + 1)i + 4p) / i = (i(i + 1)²(p + 1) + (i + 1)² · 4p) / (i(i + 1)²) = ((i² + 2i + 1) · i(p + 1) + (i² + 2i + 1) · 4p) / (i(i + 1)²).

Then, since p ≥ 1 and i ≥ 1,

(p + 1 + 4p/i) − 4p (p + 1 + 4p/(i−1)) (p + 4p/(i−1)) / (2p + 4p/(i−1))² = (4i(p + 1) − 4(i − 1)p) / (i(i + 1)²) ≥ 0.

C Additional Details for the Non-Monotone Case
C.1 Guessing the value of f(OPT)

Guessing the value of f(OPT) is a common technique in streaming submodular function maximization. Badanidiyuru et al. [1] showed how to approximate f(OPT) within a constant factor using O(log(k)) space in a single pass. To avoid extra complications, we show how to guess f(OPT) in two passes, and refer the reader to [1] for an approximation of f(OPT) on the fly. Let τ = max_{e∈X} f(e). Using submodularity, it is easy to see that τ ≤ f(OPT) ≤ kτ. Consider the set

Λ = { 2^i : i ∈ Z, τ/2 ≤ 2^i ≤ k · τ }.

Then there exists a value λ ∈ Λ such that f(OPT)/2 ≤ λ ≤ f(OPT). Setting the parameter α = ελ/(2k), we get that α ∈ [εf(OPT)/(4k), εf(OPT)/(2k)]. This refined range of α is sufficient for the analysis. Unfortunately, it is not possible to know which λ ∈ Λ satisfies the property. However, it suffices to run the randomized local-search algorithm for every λ ∈ Λ in parallel and output the best solution among all the copies. Since |Λ| = O(log k), this operation increases the space complexity by a multiplicative O(log k) factor, and adds one additional pass to find τ.

C.2 Proof of Theorem 4
Here we give a full proof of the following theorem from Section 4:
Theorem 4.
Let M_p = (X, I) be a p-matchoid of rank k and let f : 2^X → R_{≥0} be a non-negative submodular function. Suppose there exists an algorithm for the offline instance of the problem with approximation factor γ̄_off. For any ε > 0, the randomized streaming local-search algorithm returns a solution S̄ ∈ I such that f(OPT) ≤ (p + 1 + γ̄_off + O(ε)) E[f(S̄)], using a total space of O(p³k log(k)/ε³) and O(p/ε) passes.

In the same spirit as in Section 3, we show that we can derive a guarantee on the expected value E[f(S_i)] of the solution produced after the ith pass even when the function is non-monotone. In fact, we show that the analysis of the non-monotone case reduces to the monotone case, as shown in the following theorem.

Theorem 5.
Let f be a non-negative submodular function, let the additive threshold be α = εf(OPT)/(2k), and let d ≥ i > 0. Suppose that at the start of the ith iteration of the randomized local-search algorithm with a buffer of size m = 4dk/ε² we have

(1 − ε) f(OPT) ≤ γ_{i−1} E[f(S_{i−1})] + γ̄_off E[f(S′_{i−1})].

Then,

(1 − ε) f(OPT) ≤ min{ γ_{i−1} δ_i , (p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1 } · E[f(S_i)] + γ̄_off E[f(S′_i)],

where δ_i = E[f(S_{i−1})] / E[f(S_i)].

(We need the exact value of α only in equation (6) and in the bound αk ≤ (ε/2)f(OPT); using instead the upper and lower bounds on α from Section C.1 yields the same result up to the constant hidden in the O(ε) term.)

Proof. From the definition of γ_{i−1} and δ_i, it follows that

(1 − ε) f(OPT) ≤ γ_{i−1} E[f(S_{i−1})] + γ̄_off E[f(S′_{i−1})] ≤ γ_{i−1} δ_i E[f(S_i)] + γ̄_off E[f(S′_i)], (13)

where in the last inequality we have used the definition of δ_i and the fact that f(S′_i) ≥ f(S′_{i−1}), which follows from the way S′_i is defined in Algorithm 3.
On the other hand, E[f(S_i)] − E[f(S_{i−1})] = (1 − δ_i) E[f(S_i)]. Thus, by (7) we also have:

(1 − ε) f(OPT) ≤ (p/β_i + p − 1)(E[f(S_i)] − E[f(S_{i−1})]) + (p + β_i p + 1) E[f(S_i)] + γ̄_off E[f(S′_i)]
= ((p/β_i + p − 1)(1 − δ_i) + p + β_i p + 1) E[f(S_i)] + γ̄_off E[f(S′_i)]. (14)

Since the right-hand side of equation (13) is an increasing function of δ_i and the right-hand side of equation (14) is a decreasing function of δ_i, the guarantee we obtain is always at least as good as that obtained when these two values are equal.

As in the monotone case, the theorem enables us to derive values of β_i so as to minimize the approximation ratio. The following follows directly from the same calculations as in Section 3 and Appendix B.

Theorem 6.
Suppose we run Algorithm 3 with a buffer of size m = 4dk/ε² on an arbitrary p-matchoid constraint and a non-negative submodular function, with α = εf(OPT)/(2k), β_1 = 1, and

β_i = (γ_{i−1} − 1 − p) / (γ_{i−1} − 1 + p),

where γ_i is given by the recurrence γ_1 = 4p and

γ_i = 4p γ_{i−1}(γ_{i−1} − 1) / (γ_{i−1} − 1 + p)².

Then,

(1 − ε) f(OPT) ≤ (p + 1 + 4p/i) E[f(S_i)] + γ̄_off E[f(S′_i)].

In particular, after d = 4p/ε passes,

(1 − ε) f(OPT) ≤ (p + 1 + γ̄_off + ε) E[f(S̄_d)].

Under a matroid constraint, Algorithm 3 with α = εf(OPT)/(2k), β_i = 1/i, and d = 2ε^{−1} passes outputs a solution S̄ such that

(1 − ε) f(OPT) ≤ (2 + γ̄_off + ε) E[f(S̄)],

where γ̄_off is the approximation ratio of the best offline algorithm for maximizing f under a matroid constraint.

Proof of Theorem 4. We assume that we know the value of f(OPT) beforehand, which can be accomplished approximately as in Section C.1. Let ε′ = ε/p with 1/2 ≥ ε′ > 0, and set α = ε′ f(OPT)/(2k). We want to obtain an additive error term instead of the multiplicative error term stated in Theorem 6. By Theorem 6,

(1 − ε′) f(OPT) ≤ (p + 1 + γ̄_off + 4p/d) E[f(S̄_d)] = (p + 1 + γ̄_off)(1 + O(d^{−1})) E[f(S̄_d)].

Since (1 − ε′)^{−1} ≤ 1 + 2ε′ for ε′ ∈ (0, 1/2],

f(OPT) ≤ (p + 1 + γ̄_off)(1 + O(d^{−1}))(1 + 2ε′) E[f(S̄_d)]. (15)

Since ε′ = ε/p, setting d = O(p/ε) we finally obtain the desired result:

f(OPT) ≤ (p + 1 + γ̄_off)(1 + O(ε/p))(1 + 2ε/p) E[f(S̄_d)] ≤ (p + 1 + γ̄_off + O(ε)) E[f(S̄_d)].

For the space complexity, we note that the randomized local-search algorithm stores the buffer B and maintains two past solutions S_i, S′_i ∈ I, together with the current solution S ∈ I. Hence, the total space needed is O(|B| + |S′_i| + |S_i| + |S|) = O(m + 3k) = O(p³ k ε^{−3}), times an additional factor of O(log k) for guessing f(OPT).
The number of passes is d = O(p/ε).
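The closed-form bounds used above are easy to sanity-check numerically. The short script below (purely illustrative, not part of the paper's algorithms) verifies that the recurrence of Theorem 6 satisfies γ_i ≤ p + 1 + 4p/i, that for a matroid (p = 1) it solves to γ_i = 2 + 2/i, and that the guess set Λ of Section C.1 always contains a power of two within a factor 2 of f(OPT):

```python
import math

def gamma_sequence(p, n):
    """First n terms of the recurrence from Theorem 6:
    gamma_1 = 4p, gamma_i = 4p*g*(g - 1)/(g - 1 + p)**2 for g = gamma_{i-1}."""
    g = 4.0 * p
    seq = [g]
    for _ in range(n - 1):
        g = 4 * p * g * (g - 1) / (g - 1 + p) ** 2
        seq.append(g)
    return seq

# gamma_i <= p + 1 + 4p/i, as shown in Appendix B.
for p in range(1, 6):
    for i, g in enumerate(gamma_sequence(p, 50), start=1):
        assert g <= p + 1 + 4 * p / i + 1e-9

# For a matroid constraint (p = 1) the recurrence gives gamma_i = 2 + 2/i.
for i, g in enumerate(gamma_sequence(1, 25), start=1):
    assert abs(g - (2 + 2 / i)) < 1e-9

# Guessing f(OPT) (Section C.1): with tau = max_e f(e), some power of two
# lambda with tau/2 <= lambda <= k*tau satisfies f(OPT)/2 <= lambda <= f(OPT),
# whatever the value of f(OPT) in [tau, k*tau] turns out to be.
tau, k = 3.7, 1000
lo = math.ceil(math.log2(tau / 2))   # smallest j with 2**j >= tau/2
hi = math.floor(math.log2(k * tau))  # largest j with 2**j <= k*tau
guesses = [2.0 ** j for j in range(lo, hi + 1)]
for opt in (tau, 10.0, 123.4, k * tau):
    assert any(opt / 2 <= lam <= opt for lam in guesses)
print("all checks passed")
```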