A Refined Analysis of Submodular Greedy
A Faster Tight Approximation for Submodular Maximization Subject to a Knapsack Constraint
Ariel Kulik∗, Roy Schwartz†, and Hadas Shachnai‡
Computer Science Department, Technion, Haifa 3200003, Israel
February 26, 2021
Abstract
The problem of maximizing a monotone submodular function subject to a knapsack constraint admits a tight (1 − e^{-1})-approximation: exhaustively enumerate over all subsets of size at most three and extend each using the greedy heuristic [Sviridenko, 2004]. We prove it suffices to enumerate only over all subsets of size at most two and still retain a tight (1 − e^{-1})-approximation. This improves the running time from O(n^5) to O(n^4) queries. The result is achieved via a refined analysis of the greedy heuristic.

Keywords:
Submodular functions, Knapsack constraint, Approximation Algorithms
1 Introduction

Submodularity is a fundamental mathematical notion that captures the concept of economy of scale and is prevalent in many areas of science and technology. Given a ground set E, a set function f : 2^E → R over E is called submodular if it has the diminishing returns property: f(A ∪ {e}) − f(A) ≥ f(B ∪ {e}) − f(B) for every A ⊆ B ⊆ E and e ∈ E \ B. (An equivalent definition is: f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) for any A, B ⊆ E.) Submodular functions naturally arise in different areas such as combinatorics, graph theory, probability, game theory, and economics. Some well known examples include coverage functions, cuts in graphs and hypergraphs, matroid rank functions, entropy, and budget additive functions.

A submodular function f is monotone if f(S) ≤ f(T) for every S ⊆ T ⊆ E. In this note we consider the problem of maximizing a monotone submodular function subject to a knapsack constraint (MSK). An instance of the problem is a tuple (E, f, w, W), where E is a set of n elements, f : 2^E → R_{≥0} is a non-negative, monotone and submodular set function given by a value oracle, w : E → N_+ is a weight function over the elements, and W ∈ N is the knapsack capacity. We use N to denote the set of non-negative integers, and N_+ = N \ {0}. A subset S ⊆ E is feasible if ∑_{e ∈ S} w(e) ≤ W, i.e., the total weight of the elements in S does not exceed the capacity W; the value of S ⊆ E is f(S). The objective is to find a feasible subset S ⊆ E of maximal value.

MSK arises in many applications. Some examples include sensor placement [7], document summarization [8], and network optimization [13]. The problem is a generalization of monotone submodular maximization with a cardinality constraint (i.e., w(e) = 1 for all e ∈ E), for which a simple greedy algorithm yields a (1 − e^{-1})-approximation [11]. This is the best ratio which can be obtained in polynomial time in the oracle model [10]. The approximation ratio of (1 − e^{-1}) is also optimal in the special case of coverage functions under P ≠ NP [5].

The first (1 − e^{-1})-approximation for MSK was given by Sviridenko [14] as an adaptation of an algorithm of Khuller, Moss and Naor [6] proposed for the special case of coverage functions. The algorithm of Sviridenko exhaustively enumerates (iterates) over all subsets G ⊆ E of at most 3 elements, and extends each such subset G using a greedy approach. Within the greedy phase the algorithm maintains a feasible subset A ⊆ E, and in each step an element e ∈ {e′ ∈ E | w(A ∪ {e′}) ≤ W} which maximizes (f(A ∪ {e}) − f(A)) / w(e) is added to A. Overall, the algorithm uses O(n^5) oracle calls and arithmetic operations. In [3], Ene and Nguyen presented a (1 − e^{-1} − ε)-approximation for MSK in time O(n log^2 n) for any fixed ε > 0, improving upon an earlier O(n^2 · polylog(n)) algorithm with the same approximation ratio due to Badanidiyuru and Vondrák [1]. We note, however, that the dependence of the running times of these algorithms on ε renders them purely theoretical. To date, the algorithm of [14] has the best running time for a (1 − e^{-1})-approximation for MSK.

Our main result is the following theorem.
Theorem 1.1. There is a (1 − e^{-1})-approximation for MSK using O(n^4) value oracle calls and arithmetic operations.

To prove the result we use a simple variant of the algorithm of [14] which only enumerates over sets G of up to 2 elements, essentially showing that one can save in the enumeration step of [14]. Intuitively, the analysis in [14] bounds the value of the solution generated by the greedy phase assuming a worst-case submodular function f, and then bounds the value loss due to a discarded element (the element is discarded by the analysis, not by the algorithm) assuming a worst-case submodular function g. The main insight for our improved result is that g = f; that is, there is no function which attains simultaneously the worst cases assumed in [14] for the outcome of greedy and for the value loss due to the discarded element. Based on this insight, we give in Section 2 a refined analysis of the greedy phase. The refined analysis is the key for the proof of Theorem 1.1, given in Section 3. We believe our refined analysis may find additional applications.

2 A Refined Analysis of Greedy

We start with some definitions and notation. Given a monotone submodular function f : 2^E → R_{≥0} and A ⊆ E, we define the function f_A : 2^E → R_{≥0} by f_A(S) = f(A ∪ S) − f(A) for any S ⊆ E. It is well known that f_A is also monotone, submodular and non-negative (see, e.g., Claim 13 in [4]). We also use f(e) = f({e}) for e ∈ E.
Algorithm 1: Greedy(E, f, w, W)

Input: An MSK instance (E, f, w, W).
1. Set E′ ← E and A ← ∅.
2. While E′ ≠ ∅ do:
3.     Find e ∈ E′ such that f_A(e) / w(e) is maximal.
4.     Set E′ ← E′ \ {e}.
5.     If w(A ∪ {e}) ≤ W, set A ← A ∪ {e}.
6. Return A.

The greedy procedure is given in Algorithm 1. While the procedure is useful for deriving efficient approximations, as a stand-alone algorithm it does not guarantee any constant approximation ratio. We say that the element e ∈ E found in Step 3 is considered in the specific iteration of the loop in Step 2. Furthermore, if the element was also added to A in Step 5 we say it was selected in this iteration.
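For concreteness, the following minimal Python sketch mirrors Algorithm 1. The representation is our own choice for illustration: the value oracle f is a callable on sets and w is a dictionary of positive weights.

def greedy(E, f, w, W):
    """Greedy (Algorithm 1): repeatedly consider the element of maximal marginal
    density f_A(e) / w(e), and add it to A if it still fits within capacity W."""
    remaining, A = set(E), set()          # E' and the current solution A
    while remaining:
        e = max(remaining, key=lambda x: (f(A | {x}) - f(A)) / w[x])   # Step 3
        remaining.remove(e)                                            # Step 4: e is "considered"
        if sum(w[a] for a in A) + w[e] <= W:                           # Step 5: e is "selected"
            A = A | {e}
    return A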
Lemma 2.1. For any MSK instance (E, f, w, W), Algorithm 1 returns a feasible solution for (E, f, w, W).
For any MSK instance (E, f, w, W), we define a value function V. Let {a_1, . . . , a_ℓ} be the output of Greedy(E, f, w, W), listed in the order by which the elements are added to A in Step 5 of Algorithm 1. Furthermore, define A_i = {a_1, . . . , a_i} for i ∈ [ℓ], and A_0 = ∅. For a set A ⊆ E we use w(A) = ∑_{e ∈ A} w(e), and for any k ∈ N_+ we use [k] to denote the set {i ∈ N | 1 ≤ i ≤ k} = {1, 2, . . . , k}. We define V : [0, w(A_ℓ)] → R_{≥0} by

    ∀ i ∈ [ℓ], w(A_{i−1}) ≤ u ≤ w(A_i):   V(u) = f(A_{i−1}) + (u − w(A_{i−1})) · f_{A_{i−1}}({a_i}) / w(a_i).
We note that the value of V(w(A_i)) is well defined for i ∈ [ℓ − 1] since

    f(A_{i−1}) + (w(A_i) − w(A_{i−1})) · f_{A_{i−1}}({a_i}) / w(a_i) = f(A_i) = f(A_i) + (w(A_i) − w(A_i)) · f_{A_i}({a_{i+1}}) / w(a_{i+1}).

That is, the value function V is piecewise linear and continuous. By definition we have that V(0) = f(∅) and V(w(A_ℓ)) = f(A_ℓ). Intuitively, V(u) can be viewed as the value attained by Algorithm 1 while using a capacity of u. We use V′ to denote the first derivative of V. We note that V′(u) is defined for almost all u ∈ [0, w(A_ℓ)]. Similar to [14], our analysis is based on lower bounds on V′ (in [14] the analysis used a discretization of V, thus omitting the differentiation). For every i ∈ [ℓ] and u ∈ (w(A_{i−1}), w(A_i)) we have V′(u) = f_{A_{i−1}}(a_i) / w(a_i).
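Since V is simply the piecewise linear interpolation of the greedy trace, it is easy to evaluate; the sketch below (an illustration under the same representation assumptions as the previous snippet) builds V from the sequence of selected elements.

def value_function(selected, f, w):
    """Value function V of a Greedy run: `selected` = (a_1, ..., a_l) in order of
    selection, f is the value oracle, w maps elements to weights. V interpolates
    linearly between the points (w(A_{i-1}), f(A_{i-1})) and (w(A_i), f(A_i))."""
    breakpoints, values, A = [0.0], [f(set())], set()
    for a in selected:
        A = A | {a}
        breakpoints.append(breakpoints[-1] + w[a])    # w(A_i)
        values.append(f(A))                           # f(A_i)

    def V(u):
        assert 0 <= u <= breakpoints[-1]
        for i in range(1, len(breakpoints)):
            if u <= breakpoints[i]:
                slope = (values[i] - values[i - 1]) / w[selected[i - 1]]   # f_{A_{i-1}}(a_i) / w(a_i)
                return values[i - 1] + (u - breakpoints[i - 1]) * slope
        return values[-1]
    return V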
The next lemma gives a lower bound on V′.

Lemma 2.2. Let E′ be the set from Algorithm 1 at the beginning of the iteration in which a_i is selected, and let ∅ ≠ Y ⊆ A_{i−1} ∪ E′. Then,

    f_{A_{i−1}}(a_i) / w(a_i) ≥ (f(Y) − f(A_{i−1})) / w(Y).

Proof. Let m = |Y| and Y = {y_1, . . . , y_m}. For any 1 ≤ j ≤ m, if y_j ∈ A_{i−1} then f_{A_{i−1}}({y_j}) / w(y_j) = 0 ≤ f_{A_{i−1}}({a_i}) / w(a_i). Otherwise, by the assumption of the lemma, y_j ∈ E′ and therefore f_{A_{i−1}}({y_j}) / w(y_j) ≤ f_{A_{i−1}}({a_i}) / w(a_i), since a_i was selected in Step 5 of Algorithm 1 when the value of the variable A was A_{i−1}. Thus, f_{A_{i−1}}({y_j}) / w(y_j) ≤ f_{A_{i−1}}({a_i}) / w(a_i) for every j ∈ [m]. By the last inequality, and since f is monotone and submodular, we have the following.

    f(Y) ≤ f(A_{i−1} ∪ Y)
        = f(A_{i−1}) + ∑_{j=1}^{m} f_{A_{i−1} ∪ {y_1,...,y_{j−1}}}({y_j})
        ≤ f(A_{i−1}) + ∑_{j=1}^{m} f_{A_{i−1}}({y_j})
        = f(A_{i−1}) + ∑_{j=1}^{m} w(y_j) · f_{A_{i−1}}({y_j}) / w(y_j)
        ≤ f(A_{i−1}) + ∑_{j=1}^{m} w(y_j) · f_{A_{i−1}}({a_i}) / w(a_i)
        = f(A_{i−1}) + w(Y) · f_{A_{i−1}}({a_i}) / w(a_i).

By rearranging the terms, we have f_{A_{i−1}}(a_i) / w(a_i) ≥ (f(Y) − f(A_{i−1})) / w(Y), as desired.

To lower bound V, we use Lemma 2.2 with several different sets as Y. Let X be a solution for an MSK instance I = (E, f, w, W), and let X_1, . . . , X_k be a partition of X such that

    f_{X_1 ∪ ... ∪ X_{i−1}}(X_i) / w(X_i) ≥ f_{X_1 ∪ ... ∪ X_i}(X_{i+1}) / w(X_{i+1})

for every i ∈ [k − 1]. The analysis first uses Lemma 2.2 with Y = X_1 and utilizes the resulting differential inequality to lower bound V on an interval [0, D_1]. The point D_1 is set such that, on the interval [D_1, D_2], using Lemma 2.2 with Y = X_1 ∪ X_2 yields a better lower bound on V′ in comparison to Y = X_1. Subsequently, the resulting differential inequality is used to bound V on [D_1, D_2]. When repeated k times, the process results in the bounding function h, formally given in Definition 2.3. Lemma 2.4 shows that indeed h lower bounds V.
Definition 2.3. Let I = (E, f, w, W) be an MSK instance, and consider X ⊆ E such that w(X) ≤ W, and a partition X_1, . . . , X_k of X (X_i ≠ ∅ for all i ∈ [k]). Denote S_j = ⋃_{i=1}^{j} X_i (S_0 = ∅) and r_j = f_{S_{j−1}}(X_j) / w(X_j) for j ∈ [k]. Also, assume that r_1 ≥ r_2 ≥ . . . ≥ r_k, and define D_0 = 0,

    D_j = ∑_{i=1}^{j} w(X_i) · ln(r_i / r_{j+1})   for 1 ≤ j ≤ k − 1,

and D_k = ∞ (we use the convention ln(a/0) = ∞). The bounding function of X_1, . . . , X_k and I is h : R_{≥0} → R_{≥0} defined by

    ∀ j ∈ [k], D_{j−1} ≤ u < D_j :   h(u) = f(S_j) − r_j · w(S_j) · exp( −(u − D_{j−1}) / w(S_j) ).        (1)

It can be easily verified that D_0 ≤ D_1 ≤ . . . ≤ D_k, and that D_{j′} = D_j for j′ < j if and only if r_{j′+1} = r_{j+1}. By definition, the bounding function is differentiable almost everywhere. Furthermore, for every j ∈ [k − 1] such that D_j < D_{j+1} and D_j ≠ 0, let j′ be the minimal index in [j] such that D_{j′} = D_j. Then,

    lim_{u ↗ D_j} h(u) = f(S_{j′}) − r_{j′} · w(S_{j′}) · exp( −(D_{j′} − D_{j′−1}) / w(S_{j′}) )
        = f(S_{j′}) − r_{j′} · w(S_{j′}) · exp( −( ∑_{i=1}^{j′} w(X_i) · ln(r_i / r_{j′+1}) − ∑_{i=1}^{j′−1} w(X_i) · ln(r_i / r_{j′}) ) / w(S_{j′}) )
        = f(S_{j′}) − r_{j′} · w(S_{j′}) · exp( −( ∑_{i=1}^{j′} w(X_i) / w(S_{j′}) ) · ln(r_{j′} / r_{j′+1}) )
        = f(S_{j′}) − r_{j′+1} · w(S_{j′})
        = f(S_{j′}) + ∑_{i=j′+1}^{j+1} f_{S_{i−1}}(X_i) − ∑_{i=j′+1}^{j+1} r_i · w(X_i) − r_{j+1} · w(S_{j′})
        = f(S_{j+1}) − r_{j+1} · w(S_{j+1}) = h(D_j).

The first equality follows from (1), the second and fourth equalities follow from the definitions of D_{j′} and S_{j′}, and the fifth and sixth equalities hold since for every j′ < i ≤ j + 1 we have f_{S_{i−1}}(X_i) / w(X_i) = r_i = r_{j+1}. Thus, the bounding function h is continuous.

Let S∗ be an optimal solution for an MSK instance I = (E, f, w, W). Then, the bounding function of I and S∗ (i.e., k = 1) is h(u) = f(S∗) · (1 − exp(−u / w(S∗))). It follows from [14] that V(u) ≥ (1 − exp(−u / W)) · f(S∗) for u ∈ [0, W − max_{e ∈ S∗} w(e)] ∩ N, where the restriction to integer values can be easily relaxed. Since w(S∗) ≤ W, this bound is implied by V(u) ≥ h(u). Thus, the following lemma can be viewed as a generalization of the analysis of [14].
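For intuition, the bounding function of Definition 2.3 can be evaluated directly from its data. The sketch below is our own illustration; it assumes the parts X_1, . . . , X_k are supplied in an order with non-increasing, strictly positive densities r_j, and that f and w are given as in the earlier snippets.

import math

def bounding_function(parts, f, w):
    """Bounding function h of Definition 2.3 for the partition `parts` = [X_1, ..., X_k]
    (sets, ordered so that the densities r_j are non-increasing and positive)."""
    weight = lambda S: sum(w[e] for e in S)
    S, prefix_val, prefix_w, r = set(), [f(set())], [0.0], []
    for X in parts:
        r.append((f(S | X) - f(S)) / weight(X))      # r_j = f_{S_{j-1}}(X_j) / w(X_j)
        S = S | X
        prefix_val.append(f(S))                      # f(S_j)
        prefix_w.append(weight(S))                   # w(S_j)
    k = len(parts)
    D = [0.0]                                        # D_0 = 0
    for j in range(1, k):                            # D_j for 1 <= j <= k - 1
        D.append(sum(weight(parts[i]) * math.log(r[i] / r[j]) for i in range(j)))
    D.append(math.inf)                               # D_k = infinity

    def h(u):
        j = next(t for t in range(1, k + 1) if D[t - 1] <= u < D[t])
        return prefix_val[j] - r[j - 1] * prefix_w[j] * math.exp(-(u - D[j - 1]) / prefix_w[j])
    return h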
Lemma 2.4. Let I = (E, f, w, W) be an MSK instance, V its value function, and A the output of Algorithm 1 for the instance I. Consider a subset of elements X ⊆ E where w(X) ≤ W, and a partition X_1, . . . , X_k of X such that f_{X_1 ∪ ... ∪ X_{i−1}}(X_i) / w(X_i) ≥ f_{X_1 ∪ ... ∪ X_i}(X_{i+1}) / w(X_{i+1}) for any i ∈ [k − 1]. Let h be the bounding function of I and X_1, . . . , X_k, and W_max = min{W − max_{e ∈ X} w(e), w(A)}. Then, for any u ∈ [0, W_max], it holds that V(u) ≥ h(u).

The proof of Lemma 2.4 uses a differential comparison argument. We say a function ϕ : Z → R, Z ⊆ R², is positively linear in the second dimension if there is K > 0 such that for every u, t_1, t_2 where (u, t_1), (u, t_2) ∈ Z it holds that ϕ(u, t_1) − ϕ(u, t_2) = K · (t_1 − t_2). The following is a simple variant of standard differential comparison theorems (see, e.g., [9]).
Lemma 2.5. Let [a, b] = ⋃_{r=1}^{s} [c_r, c_{r+1}] be an interval such that c_1 ≤ c_2 ≤ . . . ≤ c_{s+1}, and let ϑ_1, ϑ_2 : [a, b] → R be two continuous functions such that ϑ_1(a) ≥ ϑ_2(a) and the derivatives ϑ′_1, ϑ′_2 are defined and continuous on (c_r, c_{r+1}) for every r ∈ [s]. Also, for any r ∈ [s] let ϕ_r : (c_r, c_{r+1}) × R → R be positively linear in the second dimension. If ϑ′_1(u) ≥ ϕ_r(u, ϑ_1(u)) and ϑ′_2(u) ≤ ϕ_r(u, ϑ_2(u)) for every r ∈ [s] and u ∈ (c_r, c_{r+1}), then ϑ_1(u) ≥ ϑ_2(u) for every u ∈ [a, b].

The lemma follows from standard arguments in the theory of differential equations. A formal proof is given in Appendix A.

Proof of Lemma 2.4. Let (D_j)_{j=0}^{k} and (S_j)_{j=0}^{k} be as in Definition 2.3. Define

    ϕ(u, v) = (f(S_j) − v) / w(S_j)        (2)

for j ∈ [k], D_{j−1} ≤ u < D_j and v ∈ R. Let A = {a_1, . . . , a_ℓ}, where a_1, . . . , a_ℓ is the order by which the elements were added to A in Step 5 of Algorithm 1. As before, we use A_i = {a_1, . . . , a_i} for i ∈ [ℓ] and A_0 = ∅. Let C = (0, W_max) \ {D_1, . . . , D_{k−1}} \ {w(A_1), . . . , w(A_ℓ)}. Then, for any u ∈ C, there is j ∈ [k] such that D_{j−1} < u < D_j. Hence,

    ϕ(u, h(u)) = (f(S_j) − h(u)) / w(S_j) = r_j · exp( −(u − D_{j−1}) / w(S_j) ) = h′(u),        (3)

where h′ is the first derivative of h. As u ∈ C, there is also i ∈ [ℓ] such that w(A_{i−1}) < u < w(A_i). Hence,

    V′(u) = f_{A_{i−1}}(a_i) / w(a_i) ≥ (f(S_j) − f(A_{i−1})) / w(S_j) = (f(S_j) − V(w(A_{i−1}))) / w(S_j) ≥ (f(S_j) − V(u)) / w(S_j) = ϕ(u, V(u)).        (4)

For the first inequality, we note that X ⊆ A_{i−1} ∪ E′, where E′ is the set at the beginning of the iteration in which a_i was selected. Indeed, otherwise X contains an element e ∈ E that was considered by the algorithm at an earlier iteration, in which the value of the variable A was A_{i′} for some 0 ≤ i′ < i, but was not selected since w(A_{i′} ∪ {e}) > W. This would imply that u > w(A_{i−1}) ≥ w(A_{i′}) > W − w(e) ≥ W_max, contradicting u < W_max. Thus, as S_j ⊆ X, we have the conditions of Lemma 2.2. The second inequality holds since V is increasing. By (3) and (4) we have

    ∀ u ∈ C :   h′(u) = ϕ(u, h(u))   and   V′(u) ≥ ϕ(u, V(u)).        (5)

We can write C = ⋃_{r=1}^{s} (c_r, c_{r+1}), where 0 = c_1 ≤ c_2 ≤ . . . ≤ c_{s+1} = W_max. For any r ∈ [s] let ϕ_r : (c_r, c_{r+1}) × R → R be the restriction of ϕ to (c_r, c_{r+1}) × R (ϕ_r(u, v) = ϕ(u, v) for any u ∈ (c_r, c_{r+1}) and v ∈ R). It can be easily verified that ϕ_r is continuous and positively linear in the second dimension. Furthermore, it holds that V′ and h′ are continuous on (c_r, c_{r+1}) for any r ∈ [s]. Thus, by (5) and Lemma 2.5 it holds that V(u) ≥ h(u) for any u ∈ [0, W_max].
3 Proof of Theorem 1.1

Algorithm 2: EnumGreedy_κ(E, f, w, W)

Input: An MSK instance (E, f, w, W), and an enumeration size κ ∈ N.
1. Set S∗ ← ∅.
2. For every G ⊆ E with |G| ≤ κ do:
3.     A ← Greedy(E, f_G, w, W − w(G)).
4.     If f(A ∪ G) ≥ f(S∗) then S∗ ← A ∪ G.
5. Return S∗.
To prove Theorem 1.1 we use EnumGreedy_2, i.e., we take Algorithm 2 with κ = 2. We note that EnumGreedy_3 is the (1 − e^{-1})-approximation algorithm of [14].
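The following minimal Python sketch of Algorithm 2 is again our own illustration; it reuses the greedy routine sketched after Algorithm 1, and infeasible seed sets are simply skipped.

from itertools import combinations

def enum_greedy(E, f, w, W, kappa=2):
    """EnumGreedy_kappa (Algorithm 2): extend every seed set G with |G| <= kappa
    by running Greedy with the contracted oracle f_G and capacity W - w(G).
    `greedy` refers to the Algorithm 1 sketch given earlier."""
    best = set()
    for size in range(kappa + 1):
        for G in map(set, combinations(list(E), size)):
            wG = sum(w[e] for e in G)
            if wG > W:
                continue                                   # skip infeasible seeds
            f_G = lambda S, G=G: f(G | S) - f(G)           # f_G(S) = f(G ∪ S) − f(G)
            A = greedy(E, f_G, w, W - wG)                  # Step 3 of Algorithm 2
            if f(A | G) >= f(best):
                best = A | G
    return best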
Lemma 3.1. EnumGreedy_2 is a (1 − e^{-1})-approximation for MSK.

Proof. It can be easily verified that the algorithm always returns a feasible solution for the input instance. Let (E, f, w, W) be an MSK instance and Y ⊆ E an optimal solution for the instance. Let Y = {y_1, . . . , y_{|Y|}}, and assume the elements are ordered by their marginal values. That is, f_{{y_1,...,y_{i−1}}}({y_i}) = max_{i ≤ j ≤ |Y|} f_{{y_1,...,y_{i−1}}}({y_j}) for every 1 ≤ i ≤ |Y|.

If |Y| ≤ 2, consider the iteration of Algorithm 2 in which G = {y_i | i ∈ {1, 2}, i ≤ |Y|}. In this iteration it holds that f(G ∪ A) ≥ f(G) ≥ f(Y) (since f is monotone); thus, following this iteration we have f(S∗) ≥ f(Y) ≥ (1 − e^{-1}) · f(Y). Therefore, in this special case the algorithm returns an approximate solution as required. Hence, we may assume that |Y| > 2, and consider the iteration in which G = {y_1, y_2}. Let A be the output of Greedy(E, f_G, w, W − w(G)) in Step 3 in this iteration. If Y \ G ⊆ A then f(A ∪ G) ≥ f(Y); thus, following this iteration it holds that f(S∗) ≥ f(Y), and the algorithm returns an optimal solution. Therefore, we may assume that Y \ G ⊄ A.

Let e∗ ∈ Y \ G such that w(e∗) = max_{e ∈ Y \ G} w(e), and denote R = Y \ G \ {e∗}. Define two sets X_1, X_2 such that {X_1, X_2} = {{e∗}, R} and f_G(X_1) / w(X_1) ≥ f_G(X_2) / w(X_2). As f is submodular it follows that f_{G ∪ X_1}(X_2) / w(X_2) ≤ f_G(X_2) / w(X_2) ≤ f_G(X_1) / w(X_1). Let h be the bounding function of (E, f_G, w, W − w(G)) and X_1, X_2. Also, let r_1, r_2, and D_1 be the values from Definition 2.3.

By Step 5 of Algorithm 1, as Y \ G ⊄ A, it follows that w(A) ≥ W − w(G) − w(e∗). Thus, by Lemma 2.4, it holds that f_G(A) ≥ V(W − w(G) − w(e∗)) ≥ h(W − w(G) − w(e∗)). We consider the following cases.

Case 1: W − w(G) − w(e∗) ≥ D_1. In this case it holds that

    f(G) + f_G(A) ≥ f(G) + h(W − w(G) − w(e∗))
        = f(G) + f_G(X_1 ∪ X_2) − w(X_1 ∪ X_2) · r_2 · exp( −(W − w(G) − w(e∗) − D_1) / w(X_1 ∪ X_2) )
        = f(Y) − w(X_1 ∪ X_2) · exp( −(W − w(G) − w(e∗) − w(X_1) · ln(r_1/r_2)) / w(X_1 ∪ X_2) + ln r_2 )
        ≥ f(Y) − w(X_1 ∪ X_2) · exp( −1 + (w(e∗) + w(X_1) · ln r_1 + w(X_2) · ln r_2) / w(X_1 ∪ X_2) ).        (6)

The first and second equalities follow from the definitions of h and D_1 (Definition 2.3). The last inequality follows from w(X_1 ∪ X_2) + w(G) ≤ W.

Define two sets H_{e∗}, H_R as follows. If X_1 = {e∗} then H_{e∗} = ∅ and H_R = {e∗}. If X_1 = R then H_{e∗} = R and H_R = ∅. It follows that

    w(X_1) · ln r_1 + w(X_2) · ln r_2 = w(X_1) · ln( f_G(X_1) / w(X_1) ) + w(X_2) · ln( f_{G ∪ X_1}(X_2) / w(X_2) )
        = w(e∗) · ln( f_{G ∪ H_{e∗}}(e∗) / w(e∗) ) + w(R) · ln( f_{G ∪ H_R}(R) / w(R) ).        (7)

As the elements y_1, . . . , y_{|Y|} are ordered according to their marginal values, we have that f(y_1) ≥ f_{y_1}(y_2) ≥ f_G(e∗) ≥ f_{G ∪ H_{e∗}}(e∗). Therefore, f(G) ≥ 2 · f_{G ∪ H_{e∗}}(e∗), and we have that

    f(Y) − f_{G ∪ H_R}(R) = f(G) + f_{G ∪ H_{e∗}}(e∗) ≥ 3 · f_{G ∪ H_{e∗}}(e∗).        (8)

By combining (7) and (8) we obtain the following.

    w(X_1) · ln r_1 + w(X_2) · ln r_2 ≤ w(e∗) · ln( (f(Y) − f_{G ∪ H_R}(R)) / (3 · w(e∗)) ) + w(R) · ln( f_{G ∪ H_R}(R) / w(R) )
        = −w(e∗) · ln 3 + w(e∗) · ln( (f(Y) − f_{G ∪ H_R}(R)) / w(e∗) ) + w(R) · ln( f_{G ∪ H_R}(R) / w(R) )
        ≤ −w(e∗) + w(R ∪ {e∗}) · ln( f(Y) / w(R ∪ {e∗}) )
        = −w(e∗) + w(X_1 ∪ X_2) · ln( f(Y) / w(X_1 ∪ X_2) ).        (9)

The first inequality follows from (7) and (8). The second inequality follows from the log-sum inequality (see, e.g., Theorem 2.7.1 in [2]) and ln 3 > 1. By combining (6) and (9), we have

    f(G) + f_G(A) ≥ f(Y) − w(X_1 ∪ X_2) · exp( −1 + ln( f(Y) / w(X_1 ∪ X_2) ) ) = f(Y) · (1 − e^{-1}).

Case 2: W − w(G) − w(e∗) < D_1 and X_1 = {e∗}.
We can use the assumption in this case to lower bound f_G(X_1) / f_{G ∪ X_1}(X_2) as follows.

    W − w(G) − w(X_1) < D_1 = w(X_1) · ln(r_1/r_2) = w(X_1) · ( ln( f_G(X_1) / f_{G ∪ X_1}(X_2) ) + ln( w(X_2) / w(X_1) ) ).

By rearranging the terms we have

    ln( f_G(X_1) / f_{G ∪ X_1}(X_2) ) > (W − w(G) − w(X_1)) / w(X_1) − ln( w(X_2) / w(X_1) ).

Thus,

    f_G(X_1) > f_{G ∪ X_1}(X_2) · (w(X_1) / w(X_2)) · exp( (W − w(G) − w(X_1)) / w(X_1) ) ≥ f_{G ∪ X_1}(X_2) · δ^{−1} · exp(δ),        (10)

where δ = (W − w(G) − w(X_1)) / w(X_1), and the last inequality follows from w(X_1) + w(X_2) + w(G) ≤ W. We use (10) to lower bound f(G ∪ A).

    f(G) + f_G(A) ≥ f(G) + h(W − w(G) − w(X_1))
        = f(G) + f_G(X_1) − f_G(X_1) · exp(−δ)
        = (2/3) · (f(G) + f_G(X_1)) + (1/3) · (f(G) + f_G(X_1)) − f_G(X_1) · exp(−δ)
        ≥ (2/3) · (f(G) + f_G(X_1)) + f_G(X_1) − f_G(X_1) · exp(−δ)
        ≥ (2/3) · (f(G) + f_G(X_1)) + f_{G ∪ X_1}(X_2) · (1 − exp(−δ)) · δ^{−1} · exp(δ)
        ≥ (2/3) · (f(G) + f_G(X_1) + f_{G ∪ X_1}(X_2)) ≥ (1 − e^{-1}) · f(Y).

The second inequality follows from f(G) = f(y_1) + f_{y_1}(y_2) ≥ 2 · f_G(e∗) = 2 · f_G(X_1), due to the ordering of the elements in Y. The third inequality follows from (10). The fourth inequality follows from (1 − exp(−δ)) · δ^{−1} · exp(δ) = (exp(δ) − 1) · δ^{−1} ≥ 1 ≥ 2/3, as exp(δ) ≥ 1 + δ.

Case 3: W − w(G) − w(e∗) < D_1 and X_1 = R. In this case, we have

    f(G) + f_G(A) ≥ f(G) + h(W − w(G) − w(e∗))
        = f(G) + f_G(R) − f_G(R) · exp( −(W − w(G) − w(e∗)) / w(R) )
        ≥ (2/3) · (f(Y) − f_G(R)) + f_G(R) − f_G(R) · exp(−1)
        ≥ (1 − e^{-1}) · f(Y).

The second inequality follows from w(X_1) + w(X_2) + w(G) ≤ W, and from f(G) ≥ (2/3) · (f(G) + f_{G ∪ R}({e∗})) = (2/3) · (f(Y) − f_G(R)), which holds since f_{G ∪ R}(e∗) ≤ (1/2) · f(G) due to the ordering of the elements in Y, as G = {y_1, y_2}.

Thus, in all cases f(A ∪ G) = f(G) + f_G(A) ≥ (1 − e^{-1}) · f(Y). Hence, in the iteration where G = {y_1, y_2} we have that f(S∗) ≥ (1 − e^{-1}) · f(Y).

Theorem 1.1 follows from Lemma 3.1 and the observation that EnumGreedy_2 uses O(n^4) oracle calls and arithmetic operations.

It is natural to ask whether EnumGreedy_1 also yields a (1 − e^{-1})-approximation. Here, the answer is clearly negative.
For any N > 0, consider the MSK instance I = (E, f, w, W) with E = {1, 2, 3}, w(1) = w(2) = N, w(3) = 1, W = 2N, and f(S) = |S ∩ {1, 2}| · N + 2 · |S ∩ {3}|. While the optimal solution for the instance is {1, 2} with f({1, 2}) = 2N, EnumGreedy_1(E, f, w, W) returns either {1, 3} or {2, 3}, where f({1, 3}) = f({2, 3}) = N + 2. Already for N = 8 the solution returned is not a (1 − e^{-1})-approximation. We note that the function f in this example is modular (linear).
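A quick numeric check of this instance (a brute-force computation under the stated f and w, with N = 8; the threshold 1 − e^{-1} is compared directly):

import math
from itertools import combinations

N = 8
E, W = [1, 2, 3], 2 * N
w = {1: N, 2: N, 3: 1}
f = lambda S: len(S & {1, 2}) * N + 2 * len(S & {3})     # the modular objective

# Optimal feasible value, by brute force over all subsets of E.
opt = max(f(set(S)) for r in range(len(E) + 1) for S in combinations(E, r)
          if sum(w[e] for e in S) <= W)
returned = f({1, 3})                                     # value of the solution EnumGreedy_1 ends up with
print(opt, returned, returned / opt < 1 - math.exp(-1))  # 16 10 True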
References

[1] Ashwinkumar Badanidiyuru and Jan Vondrák. Fast algorithms for maximizing submodular functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1497–1514. SIAM, 2014.

[2] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, New York, NY, USA, second edition, 2006.

[3] Alina Ene and Huy L. Nguyen. A nearly-linear time algorithm for submodular maximization with a knapsack constraint. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP), pages 53:1–53:12, 2019.

[4] Yaron Fairstein, Ariel Kulik, Joseph (Seffi) Naor, Danny Raz, and Hadas Shachnai. A (1 − e^{-1} − ε)-approximation for the monotone submodular multiple knapsack problem. In Proceedings of the 28th Annual European Symposium on Algorithms (ESA), volume 173 of LIPIcs, pages 44:1–44:19, 2020.

[5] Uriel Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, July 1998.

[6] Samir Khuller, Anna Moss, and Joseph Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.

[7] Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 420–429, 2007.

[8] Hui Lin and Jeff Bilmes. Multi-document summarization via budgeted maximization of submodular functions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 912–920, 2010.

[9] Alex McNabb. Comparison theorems for differential equations. Journal of Mathematical Analysis and Applications, 119(1-2):417–428, 1986.

[10] G. L. Nemhauser and L. A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.

[11] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions—I. Mathematical Programming, 14(1):265–294, 1978.

[12] Thomas C. Sideris. Ordinary Differential Equations and Dynamical Systems. Springer, 2013.

[13] K. Son, H. Kim, Y. Yi, and B. Krishnamachari. Base station operation and user association mechanisms for energy-delay tradeoffs in green cellular networks. IEEE Journal on Selected Areas in Communications, 29(8):1525–1536, 2011.

[14] Maxim Sviridenko. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32(1):41–43, 2004.
A Proof of Lemma 2.5
To prove Lemma 2.5 we first show a simpler claim.
Lemma A.1.
Within the settings of Lemma 2.5, if ϑ_1(c_r) ≥ ϑ_2(c_r) for some r ∈ [s], then ϑ_1(u) ≥ ϑ_2(u) for any u ∈ [c_r, c_{r+1}].

Proof. If c_r = c_{r+1} the claim trivially holds. Therefore we may assume that c_r < c_{r+1}. Define δ : [c_r, c_{r+1}] → R by δ(u) = ϑ_2(u) − ϑ_1(u), and let δ′ = ϑ′_2 − ϑ′_1 be its derivative. We note that ϑ′_1 and ϑ′_2 are continuous on (c_r, c_{r+1}), and hence δ′ is continuous and integrable on (c_r, c_{r+1}). Thus, for any u ∈ (c_r, c_{r+1}) it holds that

    δ(u) = lim_{t ↘ c_r} ( δ(u) − δ(t) ) + δ(c_r) = lim_{t ↘ c_r} ∫_t^u δ′(z) dz + δ(c_r) = δ(c_r) + ∫_{c_r}^{u} δ′(z) dz,        (11)

where the first equality holds since δ is continuous. Furthermore, as ϕ_r is positively linear in the second dimension, there is K_r > 0 such that ϕ_r(u, t_1) − ϕ_r(u, t_2) = K_r · (t_1 − t_2) for any u ∈ (c_r, c_{r+1}) and t_1, t_2 ∈ R. Hence, for any u ∈ (c_r, c_{r+1}), it holds that

    δ′(u) = ϑ′_2(u) − ϑ′_1(u) ≤ ϕ_r(u, ϑ_2(u)) − ϕ_r(u, ϑ_1(u)) = K_r · (ϑ_2(u) − ϑ_1(u)) = K_r · ( δ(c_r) + ∫_{c_r}^{u} δ′(z) dz ) ≤ K_r · | ∫_{c_r}^{u} δ′(z) dz |.

The last equality follows from (11), and the last inequality holds since δ(c_r) = ϑ_2(c_r) − ϑ_1(c_r) ≤ 0. By Gronwall's inequality (see, e.g., Lemma 3.3 in [12]), it follows that δ(u) ≤ 0 for u ∈ (c_r, c_{r+1}), and therefore ϑ_1(u) ≥ ϑ_2(u) for any u ∈ (c_r, c_{r+1}). As δ is continuous, δ(c_{r+1}) = lim_{u ↗ c_{r+1}} δ(u) ≤ 0; thus, ϑ_1(c_{r+1}) ≥ ϑ_2(c_{r+1}) as well.
Proof of Lemma 2.5. The lemma essentially follows immediately from Lemma A.1 using an inductive claim. We prove by induction on r ∈ [s + 1] that ϑ_1(u) ≥ ϑ_2(u) for any u ∈ [a, c_r]. For r = 1 the claim holds since c_1 = a and ϑ_1(a) ≥ ϑ_2(a). Let r ∈ [s] and assume that ϑ_1(u) ≥ ϑ_2(u) for any u ∈ [a, c_r]. Then, by Lemma A.1, ϑ_1(u) ≥ ϑ_2(u) for any u ∈ [c_r, c_{r+1}] as well. That is, the claim holds for r + 1. Taking r = s + 1, we have that ϑ_1(u) ≥ ϑ_2(u) for any u ∈ [a, c_{s+1}] = [a, b].