A fast algorithm for quadratic resource allocation problems with nested constraints
Martijn H. H. Schoot Uiterkamp, Johann L. Hurink, Marco E. T. Gerards
University of Twente, Enschede, the Netherlands
September 9, 2020
Abstract
We study the quadratic resource allocation problem and its variant with lower and upper constraints on nested sums of variables. This problem occurs in many applications, in particular battery scheduling within decentralized energy management (DEM) for smart grids. We present an algorithm for this problem that runs in O(n log n) time and, in contrast to existing algorithms for this problem, achieves this time complexity using relatively simple and easy-to-implement subroutines and data structures. This makes our algorithm very attractive for real-life adaptation and implementation. Numerical comparisons of our algorithm with a subroutine for battery scheduling within an existing tool for DEM research indicate that our algorithm significantly reduces the overall execution time of the DEM system, especially when the battery is expected to be completely full or empty multiple times in the optimal schedule. Moreover, computational experiments with synthetic data show that our algorithm outperforms the currently most efficient algorithm by more than one order of magnitude. In particular, our algorithm solves all considered instances with up to one million variables in less than 17 seconds on a personal computer.

1 Introduction

The resource allocation problem is a classical and well-researched problem in the optimization and operations research literature. The objective of the resource allocation problem is to divide a fixed amount of resource (e.g., time, money, energy) over a set of activities while minimizing a given cost function (or maximizing a given utility function). In the most studied version of this problem, the cost functions are quadratic, which leads to the following formulation of the so-called quadratic resource allocation problem (QRAP):

QRAP:  min_{x ∈ R^n}  Σ_{i∈N} x_i^2 / (2 a_i)
       s.t.  Σ_{i∈N} x_i = R,                    (1)
             l_i ≤ x_i ≤ u_i,   i ∈ N,

where a ∈ R^n_{>0}, R ∈ R, l, u ∈ R^n, and N := {1, . . . , n}.
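As a quick sanity check, consider the special case in which no bound l_i ≤ x_i ≤ u_i is active at the optimum: stationarity for the multiplier of constraint (1) then gives x_i proportional to a_i, i.e., the resource is split proportionally to the weights. A minimal sketch of this special case (our illustration; function name is ours):

```python
def qrap_interior(a, R):
    """QRAP optimum when no bound l_i <= x_i <= u_i is active:
    minimizing sum x_i^2/(2 a_i) subject to sum x_i = R gives
    x_i = a_i * R / sum(a), i.e., the resource is divided
    proportionally to the weights a_i."""
    s = sum(a)
    return [ai * R / s for ai in a]

x = qrap_interior([1.0, 2.0, 3.0], 6.0)  # -> [1.0, 2.0, 3.0]
```

Note that at this solution all scaled marginal costs x_i / a_i coincide, which is exactly the Lagrangian optimality condition used throughout the article.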
The problem QRAP has been studied extensively over the last decades due to its wide applicability in, among others, engineering, finance, and machine learning (see also the surveys in [32, 33]). As a consequence, many efficient algorithms have been developed for this problem and its generalizations.

In this article, we study an extension of QRAP, namely the QRAP with lower and upper constraints on nested sums of variables (QRAP-NC). This problem can be formulated as follows:

QRAP-NC:  min_{x ∈ R^n}  Σ_{i∈N} x_i^2 / (2 a_i)
          s.t.  L_j ≤ Σ_{i∈N_j} x_i ≤ U_j,   j ∈ N_{n−1},        (2)
                Σ_{i∈N} x_i = R,
                l_i ≤ x_i ≤ u_i,   i ∈ N,

where N_j := {1, . . . , j} for j ∈ N \ {n}, L, U ∈ R^{n−1}, and we define L_n = U_n = R for convenience. Note that if L_j = U_j for some j ∈ N_{n−1}, we may split up the problem QRAP-NC into two smaller instances of QRAP-NC that involve the variables x_1, . . . , x_j and x_{j+1}, . . . , x_n respectively. Thus, we assume without loss of generality that L_j < U_j for all j ∈ N_{n−1}. Moreover, we may assume that L_1 ≥ l_1, U_1 ≤ u_1, and L_j ≥ L_{j−1} + l_j and U_j ≤ U_{j−1} + u_j for j ∈ N_{n−1} \ {1}.

The problem QRAP-NC has numerous applications in, among others, machine learning, telecommunications, and speed optimization problems (see also the overviews in [1, 42]). Our particular motivation for studying QRAP-NC is its application in decentralized energy management (DEM) for smart distribution grids. In DEM, the goal is to optimize the joint energy consumption of multiple devices within, e.g., a neighborhood. In a DEM system, devices optimize their own consumption locally, but this local optimization is coordinated to obtain certain global objectives (hence the term "decentralized"). In the context of DEM, we are interested in the optimization of storage devices such as electrical batteries and heat buffers.
Energy storage devices play an important role in DEM systems since they are quite flexible in their energy usage and are thus suitable to compensate for peak consumption or production of energy in the distribution grid (see, e.g., [36, 29, 46]).

One important example of a device-level optimization problem within DEM is the scheduling of a battery within a neighborhood. We consider the situation where the charging and discharging of the battery has to be scheduled over a set N of equidistant time intervals, each of length Δt. Given the power profile p := (p_i)_{i∈N} of the neighborhood, the goal is to determine for each time interval i ∈ N the charging power x_i of the battery during this interval so that the combined battery and neighborhood profile is flattened as much as possible. Aiming for this goal reduces the stress put on the grid and the risk of blackouts. The (physical) restrictions of the battery are given by a minimum and maximum charging rate X^min and X^max and a capacity C. Given the amount of energy present in the battery (the state-of-charge (SoC)) at the start and end of the scheduling horizon, denoted by S^start and S^end, we can formulate the resulting device-level optimization problem as follows (see also [41]):

BATTERY:  min_{x ∈ R^n}  Σ_{i∈N} (p_i + x_i)^2
          s.t.  0 ≤ S^start + Δt Σ_{i∈N_j} x_i ≤ C,   j ∈ N_{n−1},
                S^start + Δt Σ_{i∈N} x_i = S^end,
                X^min ≤ x_i ≤ X^max,   i ∈ N.

Note that this is an instance of QRAP-NC by applying the variable transform y := p + x.

An important feature within the DEM paradigm is that device-level problems have to be solved locally. This means that the corresponding device-level optimization algorithms are executed on embedded systems with limited computational power (see, e.g., [5]) that are located within, e.g., households. Since these algorithms are called multiple times within the DEM system as a subroutine, it is important that these algorithms are very efficient.
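The transform y := p + x can be made explicit: substituting Σ_{i∈N_j} x_i = Σ_{i∈N_j} y_i − Σ_{i∈N_j} p_i into BATTERY yields QRAP-NC data in the variables y_i. The sketch below constructs this data; the helper name and the choice a_i = 1/2 (which makes the QRAP-NC objective Σ y_i²/(2a_i) equal to Σ y_i²) are ours:

```python
def battery_to_qrap_nc(p, dt, S_start, S_end, C, X_min, X_max):
    """Map a BATTERY instance to QRAP-NC data (a, l, u, L, U, R)
    in the transformed variables y = p + x.

    Uses sum_{i<=j} x_i = sum_{i<=j} y_i - sum_{i<=j} p_i, so the SoC
    constraint 0 <= S_start + dt * sum_{i<=j} x_i <= C becomes nested
    lower/upper bounds on the prefix sums of y."""
    n = len(p)
    a = [0.5] * n                 # sum y_i^2/(2*a_i) == sum y_i^2 for a_i = 1/2
    l = [pi + X_min for pi in p]  # single-variable bounds on y_i
    u = [pi + X_max for pi in p]
    P = 0.0                       # running prefix sum of p
    L, U = [], []
    for j in range(n - 1):        # nested constraints for j = 1, ..., n-1
        P += p[j]
        L.append(P - S_start / dt)
        U.append(P + (C - S_start) / dt)
    R = P + p[-1] + (S_end - S_start) / dt  # resource constraint on sum y_i
    return a, l, u, L, U, R
```

For example, a two-interval instance with p = (1, 2), Δt = 1, S^start = 0, S^end = 1, C = 2, X^min = −1, X^max = 1 yields l = (0, 1), u = (2, 3), L = (1), U = (3), and R = 4.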
Therefore, efficient and tailored device-level optimization algorithms are crucial ingredients for the real-life implementation of DEM systems. In particular, for the optimization of storage devices, this means that fast and tailored algorithms to solve QRAP-NC are crucial. For more background on DEM, we refer to [39, 12].

There is a rich literature on solution approaches for QRAP-NC with only upper nested constraints on sums of variables, i.e., where only nested constraints of the form Σ_{i∈N_j} x_i ≤ U_j, j ∈ N_{n−1}, are given. This case has been studied mainly in the context of convex optimization over submodular constraints (see, e.g., [18, 19, 43]). However, the literature on the general case of QRAP-NC is limited. The authors in [41] propose an infeasibility-guided divide-and-conquer algorithm, to which we shall refer in this article as ALG_inf. This algorithm solves a relaxation of the problem where the nested constraints are ignored and, subsequently, splits up the problem into two smaller instances of QRAP-NC at the variable for which the lower or upper nested constraint is violated most in the solution to the relaxation. The worst-case time complexity of this algorithm is O(n^2). Furthermore, [42] proposes a decomposition-based algorithm, hereafter referred to as ALG_dec, that solves QRAP-NC in O(n log n) time. This algorithm decomposes QRAP-NC into a hierarchy of QRAP subproblems whose single-variable bounds are optimal solutions to QRAP subproblems further down in the hierarchy. Currently, this is the most efficient algorithm for QRAP-NC.

As mentioned before, we are interested in algorithms for QRAP-NC that are fast in practice. Although the decomposition-based algorithm ALG_dec has a good worst-case time complexity, we observe several disadvantages of this approach that may make it less favorable in practice than its worst-case time complexity suggests:

1. Each level of recursion within ALG_dec solves a series of instances of QRAP whose parameters are determined by optimal solutions to multiple instances of QRAP on earlier levels. Since each instance is solved from scratch, much time is spent on initializing the subproblems.

2. ALG_dec achieves for each level of recursion an O(n) time complexity by solving the QRAP subproblems using an O(n) time algorithm such as the ones in [27]. These O(n) time algorithms repeatedly call linear-time algorithms such as [6] to find the median of a set. However, these median-find algorithms are relatively slow in practice due to a big constant factor in their complexity [6]. Moreover, they are significantly more difficult to implement than simple sorting or sampling-based strategies [26, 2].

To alleviate these issues, we propose in this article a new algorithm for QRAP-NC, called ALG_seq, which has the same time complexity as ALG_dec, namely O(n log n), but in contrast requires only relatively simple and fast subroutines to attain this complexity. As a consequence, this algorithm is both faster in practice and easier to implement than ALG_dec. These are generally more important criteria for the actual adaptation of a given algorithm than the polynomial worst-case time complexity [30]. Our algorithm builds upon the monotonicity results for QRAP-NC derived in [42] and solves a sequence of QRAP subproblems that have a sequential nested structure rather than the divide-and-conquer structure of both ALG_dec and ALG_inf. More precisely, for each j ∈ N, the j-th subproblem involves only the first j variables x_1, . . . , x_j. As a consequence, our approach can solve its first j subproblems without any knowledge of the parameters involving indices higher than j, whereas both ALG_inf and ALG_dec require all problem parameters to be known a priori. This makes our algorithm particularly useful in situations where problem parameters arrive over time.
This is, e.g., the case when each variable denotes a decision for a specific time slot and all parameters related to this time slot become available only during or at the start of this time slot. Moreover, due to the nested structure, each input and bookkeeping parameter is accessed within a relatively small time period instead of frequently throughout the entire course of the algorithm. This is beneficial for caching, since it increases the number of times a value can be accessed quickly from a cache instead of relatively slowly from the main memory.

We attain the O(n log n) complexity using an efficient implementation of double-ended priority queues [28, 8] for several bookkeeping parameters. This data type supports insertion of arbitrary elements and finding and deletion of minimum and maximum elements in at most O(log n) time. Our approach requires O(n) such operations, which leads to an overall time complexity of O(n log n). Double-ended priority queues can be implemented using specialized data structures such as min-max heaps [4] or by a simple coupling of a standard min-heap and max-heap (see also [8]). The latter heaps are among the most basic data structures and many efficient implementations exist for different programming languages [9]. Thus, we can achieve the time complexity of O(n log n) using relatively simple data structures, as opposed to ALG_dec, where a more involved implementation of a linear-time median algorithm is required.

Our algorithm for QRAP-NC also leads to efficient and fast algorithms for instances of QRAP-NC where we replace each term x_i^2 / (2 a_i) by a_i f(x_i / a_i) for each i ∈ N with a given convex function f. Such a structure is present in many applications considered in the literature, in particular in most of the applications surveyed or evaluated in [1, 42].
We obtain such efficient algorithms by a reduction result in [37], which states that any optimal solution to an instance of QRAP-NC is also optimal for this instance when we take as objective function Σ_{i∈N} a_i f(x_i / a_i). As a consequence, our algorithm also solves such problems in O(n log n) time. This leads to faster algorithms for a wide range of practical problems, including the vessel speed optimization problem [31, 24] and processor scheduling with agreeable deadlines [23, 16].

We evaluate the performance of our algorithm ALG_seq and compare it to the state-of-the-art algorithms ALG_inf and ALG_dec. For this evaluation, we use both synthetic instances and realistic instances of the battery scheduling problem BATTERY using real power consumption data as input. With regard to the realistic instances, we compare our approach to a tailored implementation of ALG_inf using DEMKit, an existing simulation tool for DEM research [21]. Within DEMKit, the battery scheduling problem is used as a subroutine within a distributed optimization framework that coordinates the energy consumption of multiple devices [17]. Our results indicate that the number of tight nested constraints in an optimal solution greatly influences which algorithm is faster for a given problem instance. In particular, ALG_seq is on average faster than ALG_inf, except when the percentage of tight nested constraints is relatively low (less than 2%). Moreover, the execution time of ALG_seq is more stable than that of ALG_inf, which makes our algorithm more suitable for use in DEM systems that employ a high level of parallelism (see, e.g., [20]). With regard to the synthetic instances, we study the scalability of ALG_seq, ALG_inf, and ALG_dec.
Our results indicate that both our algorithm ALG_seq and ALG_inf are at least one order of magnitude faster than ALG_dec and that ALG_seq is on average almost twice as fast as ALG_inf. In particular, ALG_seq solves instances with up to one million variables in less than 17 seconds.

The outline of the remainder of this article is as follows. In Section 2, we present a simple procedure to solve QRAP, which forms an important ingredient for our eventual approach for solving QRAP-NC. In Section 3, we present an initial sequential algorithm for solving QRAP-NC with an O(n^2) worst-case time complexity. Based on this algorithm, we derive in Section 4 an O(n log n) time algorithm for this problem. In Section 5, we evaluate the performance of this algorithm and compare it to the state-of-the-art. Finally, we provide our conclusions in Section 6.

2 A breakpoint search algorithm for QRAP

In this section, we discuss a simple approach to solve QRAP that belongs to the class of so-called breakpoint search methods [27, 33], which structurally search for the optimal Lagrange multiplier corresponding to the resource constraint (1). This approach forms an important ingredient of our O(n log n) time algorithm for QRAP-NC in Section 4.

We start by considering the Lagrangian relaxation of QRAP:

QRAP[δ]:  min_{x ∈ R^n}  Σ_{i∈N} ( x_i^2 / (2 a_i) − δ x_i )
          s.t.  l_i ≤ x_i ≤ u_i,   i ∈ N,

where δ ∈ R is the Lagrange multiplier corresponding to the resource constraint (1). We denote the optimal solution to this problem by x[δ] := (x_i[δ])_{i∈N}. Since the objective function of this problem is separable, the optimal solution to QRAP[δ] is given by

x_i[δ] = l_i     if δ < l_i / a_i,
         a_i δ   if l_i / a_i ≤ δ < u_i / a_i,        (3)
         u_i     if u_i / a_i ≤ δ.

Observe that x_i[δ] is a continuous, piecewise linear, and non-decreasing function of δ.
More precisely, x_i[δ] is constant for δ ≤ l_i / a_i, linear with slope a_i for δ ∈ [l_i / a_i, u_i / a_i], and again constant for δ ≥ u_i / a_i (see also Figure 1). For each i ∈ N, we call the points where x_i[δ] has "kinks", i.e., where x_i[δ] is non-differentiable, the breakpoints of x_i[δ]. We denote these breakpoints for i ∈ N by α_i and β_i respectively, i.e., α_i := l_i / a_i and β_i := u_i / a_i, where we refer to α_i as the lower breakpoint of x_i[δ] and to β_i as the upper breakpoint of x_i[δ]. We denote the multiset of lower breakpoints by A := {α_i | i ∈ N} and the multiset of upper breakpoints by B := {β_i | i ∈ N}. The reason for defining A and B as multisets is so that we can readily associate each breakpoint value in the set with one index in N.

Note that also the sum z[δ] := Σ_{i∈N} x_i[δ] is continuous, piecewise linear, and non-decreasing. Moreover, it has 2n breakpoints, namely those of all terms x_i[δ]. Thus, the multiset of breakpoints of z[δ] is given by A ∪ B. Feasibility of the original problem QRAP implies that there exists a value δ̄ for the Lagrange multiplier δ such that z[δ̄] = R, meaning that x[δ̄] is optimal not only for QRAP[δ̄] but also for the original problem QRAP.

Figure 1: The function x_i[δ] for a given i ∈ N. The slope of the line segment for δ ∈ [l_i / a_i, u_i / a_i] is a_i.

Note that this multiplier is not necessarily unique: in general, there may exist an interval I ⊂ R such that δ ∈ I implies z[δ] = R.

Our approach to find the value δ̄ consists of two steps. First, we aim to find two consecutive breakpoints δ_1 and δ_2 such that δ_1 ≤ δ̄ ≤ δ_2. Since z is non-decreasing, this is equivalent to finding two consecutive breakpoints δ_1 and δ_2 such that z[δ_1] ≤ R ≤ z[δ_2]. For this, we may consider all breakpoints in A ∪ B in non-decreasing order until we have found the first, i.e., smallest, breakpoint δ such that δ̄ < δ.
In detail, for each candidate breakpoint δ, we compute z[δ] and, if z[δ] > R, we set δ_2 := δ and take δ_1 as the previously considered breakpoint. To compute z[δ] efficiently, we keep track of the sums

P(δ) := Σ_{i: δ < l_i/a_i} l_i + Σ_{i: δ ≥ u_i/a_i} u_i,     Q(δ) := Σ_{i: l_i/a_i ≤ δ < u_i/a_i} a_i,

so that z[δ] = P(δ) + Q(δ)δ, and update these values each time a new breakpoint has been considered (see Table 1).

Type of δ | Update of P(δ)       | Update of Q(δ)
δ ≡ α_i   | P(δ) := P(δ) − l_i   | Q(δ) := Q(δ) + a_i
δ ≡ β_i   | P(δ) := P(δ) + u_i   | Q(δ) := Q(δ) − a_i

Table 1: Updating the bookkeeping sums P(δ) and Q(δ) when searching the breakpoints in non-decreasing order.

In a second step, given the consecutive breakpoints δ_1 and δ_2 with δ̄ ∈ [δ_1, δ_2], we determine δ̄ and x[δ̄]. Note that, since x[δ] is non-decreasing, we have for each i ∈ N:

• x_i[δ̄] = l_i if and only if x_i[δ_2] = l_i, and
• x_i[δ̄] = u_i if and only if x_i[δ_1] = u_i.

Thus, given δ_1 and δ_2, we know whether a given variable x_i[δ̄] equals its lower bound l_i, its upper bound u_i, or is strictly in between these bounds.
To find x_i[δ̄] for those variables that are strictly in between their bounds, note that, by definition of x[δ],

R = z[δ̄] = Σ_{i: x_i[δ̄]=l_i} l_i + Σ_{i: l_i < x_i[δ̄] < u_i} a_i δ̄ + Σ_{i: x_i[δ̄]=u_i} u_i = P(δ_1) + Q(δ_1) δ̄,

so that δ̄ = (R − P(δ_1)) / Q(δ_1) and x_i[δ̄] = a_i δ̄ for these variables. The resulting procedure is summarized in Algorithm 1.

Algorithm 1 Breakpoint search procedure for QRAP.
Input: Parameters a ∈ R^n_{>0}, R ∈ R, and l, u ∈ R^n
Output: Optimal solution x̄ to QRAP
  Compute the breakpoint multisets A and B
  Initialize P := Σ_{i=1}^n l_i; Q := 0
  repeat
    Determine smallest breakpoint δ_i := min(A ∪ B)
    if P + Q δ_i = R then
      δ̄ := δ_i; compute x[δ̄] using Equation (3); return
    else if P + Q δ_i > R then  (δ̄ < δ_i)
      δ̄ := (R − P)/Q; compute x[δ̄] using Equation (3); return
    else
      if δ_i is a lower breakpoint (δ_i = α_i) then
        P := P − l_i; Q := Q + a_i; A := A \ {α_i}
      else
        P := P + u_i; Q := Q − a_i; B := B \ {β_i}
      end if
    end if
  until multiplier δ̄ has been found
  return Optimal solution x̄ := x[δ̄]

The time complexity of Algorithm 1 is dominated by sorting the breakpoints. If a sorted list of breakpoints is already given (for example if the breakpoints have already been sorted in a previous run of the algorithm), the time complexity of the algorithm reduces to O(n).

We conclude this subsection with two observations that are crucial for the efficiency of our algorithm for QRAP-NC presented in the following section:

1. Instead of searching the breakpoints in non-decreasing order, we may also search them in non-increasing order and continue the search until we find the first, i.e., largest, breakpoint δ such that δ < δ̄.

2. Solving two instances of QRAP that differ only in the value of R in the resource constraint (1) can be done simultaneously in one run of Algorithm 1. This is because the multisets of breakpoints for these two instances of QRAP are the same. Thus, we can modify Algorithm 1 such that it continues the breakpoint search after the optimal multiplier for the smallest of the given values of R has been found. Note that, essentially, the optimal multiplier for a given value R serves as the starting candidate for the optimal multiplier for instances with a higher value of R.
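The breakpoint search of Algorithm 1 can be sketched compactly as follows (a sketch under the model above: the breakpoints are sorted up front rather than kept in multisets, and variable names are ours):

```python
def solve_qrap(a, l, u, R):
    """Breakpoint search sketch for QRAP:
    min sum x_i^2/(2 a_i)  s.t.  sum x_i = R,  l_i <= x_i <= u_i,  a_i > 0.
    Scans the breakpoints l_i/a_i and u_i/a_i in non-decreasing order while
    maintaining the bookkeeping sums P and Q, so that z[d] = P + Q*d."""
    n = len(a)
    events = [(l[i] / a[i], i, 'lower') for i in range(n)]
    events += [(u[i] / a[i], i, 'upper') for i in range(n)]
    events.sort()
    P = sum(l)   # contribution of variables clamped at a bound
    Q = 0.0      # sum of a_i over variables on their linear piece
    for d, i, kind in events:
        if P + Q * d >= R:   # optimal multiplier lies at or before d
            break
        if kind == 'lower':  # x_i leaves its lower bound (cf. Table 1)
            P -= l[i]; Q += a[i]
        else:                # x_i reaches its upper bound
            P += u[i]; Q -= a[i]
    delta = (R - P) / Q if Q > 0 else d
    return [min(max(a[i] * delta, l[i]), u[i]) for i in range(n)]
```

For instance, for a = (1, 1), l = (0, 0), u = (2, 2), and R = 2, the search stops with P = 0 and Q = 2, giving δ̄ = 1 and x = (1, 1).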
This is in fact one of the two crucial observations for our approach for solving the QRAP subproblems, which we discuss further in Section 4.2.

3 An initial sequential algorithm for QRAP-NC

In this section, we present our initial sequential algorithm for the problem QRAP-NC. This algorithm solves the problem as a sequence of 2n − 1 subproblems. We define for j ∈ N and C ∈ R the following subproblem:

QRAP-NC_j(C):  min_{x ∈ R^j}  Σ_{i∈N_j} x_i^2 / (2 a_i)
               s.t.  Σ_{i∈N_j} x_i = C,                          (4)
                     L_k ≤ Σ_{i∈N_k} x_i ≤ U_k,   k ∈ N_{j−1},   (5)
                     l_i ≤ x_i ≤ u_i,   i ∈ N_j.

Throughout this article, we denote the optimal solution to this subproblem by x^j(C) := (x_i^j(C))_{i∈N_j}, where we use the brackets (·) instead of [·] to emphasize the distinction of this solution from an optimal solution x[δ] of the Lagrangian relaxation QRAP[δ] of QRAP. Note that this optimal solution is unique since the objective function of the corresponding problem is strictly convex and all constraints are linear. Moreover, observe that the n-th subproblem QRAP-NC_n(R) is equal to the original problem QRAP-NC.

The key ingredient of our algorithm is that we can replace the nested constraints (5) by specific single-variable constraints without changing the optimal solution. By doing this, we transform an instance of QRAP-NC into an equivalent instance of QRAP. More precisely, we show that each subproblem QRAP-NC_j(C) yields the same optimal solution as the following instance of QRAP:

QRAP_j(C):  min_{x ∈ R^j}  Σ_{i∈N_j} x_i^2 / (2 a_i)
            s.t.  Σ_{i∈N_j} x_i = C,                                            (6)
                  x_i^{j−1}(L_{j−1}) ≤ x_i ≤ x_i^{j−1}(U_{j−1}),   i ∈ N_{j−1},  (7)
                  l_j ≤ x_j ≤ u_j,                                              (8)

where the bounds x^{j−1}(L_{j−1}) and x^{j−1}(U_{j−1}) in (7) are the optimal solutions of the problems QRAP-NC_{j−1}(L_{j−1}) and QRAP-NC_{j−1}(U_{j−1}) respectively. Note that the single-variable bounds for x_j in (8) are the same as those of the original subproblem QRAP-NC_j(C).

The validity of this transformation is proven by Lemmas 1-3.
First, Lemma 1 shows that the optimal solution x^j(C) to the subproblem QRAP-NC_j(C) is non-decreasing in C. Subsequently, Lemma 2 uses this property to show that when adding the alternative single-variable bounds (7) to the problem formulation of QRAP-NC_j(C), the optimal solution x^j(C) to QRAP-NC_j(C) is not cut off. Finally, Lemma 3 shows that the alternative single-variable bounds (7) are stronger than the nested constraints (5).

Lemma 1. If L_j ≤ A ≤ B ≤ U_j, we have x^j(A) ≤ x^j(B) for a given j ∈ N.

Proof. This proof is based on the proof of Theorem 2 in [42] and is given in Appendix A.1.

Lemma 2. For a given j ∈ N_{n−1} and C ∈ [L_{j+1}, U_{j+1}], we have that x_i^j(L_j) ≤ x_i^{j+1}(C) ≤ x_i^j(U_j).

Proof. Let x′ := (x_1^{j+1}(C), . . . , x_j^{j+1}(C)) be the vector of the first j components of the optimal solution to the problem QRAP-NC_{j+1}(C). Since x′ is feasible for all nested constraints (5) for k ∈ N_j, this vector is also the optimal solution to QRAP-NC_j(A), where A := Σ_{i∈N_j} x′_i, i.e., we have x′ = x^j(A). Since A ∈ [L_j, U_j], Lemma 1 implies that x_i^j(L_j) ≤ x_i^j(A) ≤ x_i^j(U_j) for all i ∈ N_j. It follows that x_i^j(L_j) ≤ x_i^{j+1}(C) ≤ x_i^j(U_j) for all i ∈ N_j.

Lemma 3. If for a given j ∈ N_{n−1} and vector y ∈ R^j we have x^j(L_j) ≤ y ≤ x^j(U_j), then L_k ≤ Σ_{i∈N_k} y_i ≤ U_k for all k ∈ N_j.

Proof. The sum of the inequalities x_i^j(L_j) ≤ y_i ≤ x_i^j(U_j) over all i ∈ N_k yields

Σ_{i∈N_k} x_i^j(L_j) ≤ Σ_{i∈N_k} y_i ≤ Σ_{i∈N_k} x_i^j(U_j).
Since x^j(L_j) and x^j(U_j) are feasible for QRAP-NC_j(L_j) and QRAP-NC_j(U_j) respectively and k ≤ j, we have L_k ≤ Σ_{i∈N_k} x_i^j(L_j) and Σ_{i∈N_k} x_i^j(U_j) ≤ U_k, and the result of the lemma follows.

Lemma 2 implies that, given optimal solutions x^{j−1}(L_{j−1}) and x^{j−1}(U_{j−1}), we can replace the nested constraints (5) in QRAP-NC_j(C) by the single-variable bounds (7) without cutting off the optimal solution to QRAP-NC_j(C). Moreover, since these single-variable bounds are stronger than the nested constraints by Lemma 3, adding these constraints does not change the optimal objective value. It follows directly that any optimal solution to QRAP_j(C) is also optimal for QRAP-NC_j(C).

Based on Lemmas 1-3, the following approach can be used to solve QRAP-NC. We successively solve the subproblems QRAP_j(L_j) and QRAP_j(U_j) from j = 1 to n − 1 and finally solve QRAP_n(R), whereby in each step we use the optimal solutions to the preceding subproblems QRAP_{j−1}(L_{j−1}) and QRAP_{j−1}(U_{j−1}) as input. Note that each of the subproblems is an instance of QRAP. This approach is summarized in Algorithm 2.

Algorithm 2 An initial sequential algorithm for QRAP-NC.
Input: Parameters a ∈ R^n_{>0}, L, U ∈ R^{n−1}, R ∈ R, and l, u ∈ R^n
Output: Optimal solution x̄ to QRAP-NC
  Initialize x^1(L_1) := L_1; x^1(U_1) := U_1
  for j = 2, . . . , n − 1 do
    Compute optimal solutions x^j(L_j) and x^j(U_j) to QRAP_j(L_j) and QRAP_j(U_j) respectively
  end for
  Compute optimal solution x^n(R) to QRAP_n(R)
  return Optimal solution x̄ := x^n(R)

Since each subproblem QRAP_j(·) can be solved in O(n) time [10], the worst-case time complexity of Algorithm 2 is O(n^2). However, linear-time algorithms for QRAP such as [10] attain their linear time complexity by employing linear-time algorithms for median finding, which are, as already mentioned, in general slower than simple sorting- or sampling-based approaches [26, 2].
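The sequential scheme of Algorithm 2 can be sketched as follows. For compactness, this sketch solves each QRAP subproblem by bisection on the multiplier δ (using that z(δ) = Σ_i clamp(a_i δ, l_i, u_i) is non-decreasing in δ) instead of a linear-time or breakpoint-based subroutine; the function names and the bisection shortcut are ours:

```python
def qrap_bisect(a, l, u, R, iters=200):
    """Solve min sum x_i^2/(2 a_i) s.t. sum x_i = R, l <= x <= u
    by bisection on the multiplier: x_i(d) = clamp(a_i * d, l_i, u_i)."""
    lo = min(li / ai for li, ai in zip(l, a)) - 1.0
    hi = max(ui / ai for ui, ai in zip(u, a)) + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        z = sum(min(max(ai * mid, li), ui) for ai, li, ui in zip(a, l, u))
        if z < R:
            lo = mid
        else:
            hi = mid
    d = 0.5 * (lo + hi)
    return [min(max(ai * d, li), ui) for ai, li, ui in zip(a, l, u)]

def solve_qrap_nc(a, l, u, L, U, R):
    """Algorithm 2 sketch: solve QRAP_j(L_j) and QRAP_j(U_j) for
    j = 1, ..., n-1, using the previous optimal solutions as the
    single-variable bounds (7), then solve QRAP_n(R)."""
    n = len(a)
    if n == 1:
        return [R]
    xL, xU = [L[0]], [U[0]]                # x^1(L_1) and x^1(U_1)
    for j in range(1, n):                  # 0-based: subproblem on x_1..x_{j+1}
        lb, ub = xL + [l[j]], xU + [u[j]]  # bounds (7) and (8)
        if j < n - 1:
            xL = qrap_bisect(a[:j + 1], lb, ub, L[j])
            xU = qrap_bisect(a[:j + 1], lb, ub, U[j])
        else:
            return qrap_bisect(a, lb, ub, R)
```

As a check, for a = (1, 1), l = (−2, −2), u = (2, 2), L_1 = −2, U_1 = 0.5, and R = 2, the nested upper constraint is tight and the sketch returns x ≈ (0.5, 1.5).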
Note that also the O(n log n) time algorithm in [42] attains its worst-case time complexity by using such slow linear-time algorithms as a subroutine.

In the next section, we propose an algorithm to solve QRAP-NC in O(n log n) time that, as opposed to the algorithm in [42], does not require linear-time median-finding algorithms. Instead, it only requires a simple data structure for double-ended priority queues to store several bookkeeping parameters.

We conclude this section with two remarks that may be of independent interest:

1. It can be shown that Lemmas 1-3 also hold for the case where the variables are integer-valued, i.e., x ∈ Z^n (see also Theorem 5 in [42]), given that all parameters a, L, U, l, and u are also integer-valued and nonnegative. As a consequence, when solving each subproblem QRAP_j(·) with integer variables, Algorithm 2 computes an optimal solution to QRAP-NC with integer variables. The worst-case time complexity of this algorithm is O(n^2), since each QRAP_j(·) subproblem with integer variables can be solved in O(j) time [25].

2. Lemmas 1-3 can be generalized to the case where the objective function is the sum of separable convex cost functions f_i, i.e., where we replace each term x_i^2 / (2 a_i) by a convex function f_i(x_i). For this more general problem, this leads to a sequential algorithm that is very similar to Algorithm 2. However, initial computational tests indicated that both this algorithm and Algorithm 2 are in practice much slower than both ALG_inf and ALG_dec.

4 An O(n log n) time algorithm for QRAP-NC

The sequential algorithm derived in the previous section does not match the best known time complexity of the algorithm in [42].
However, we show in this section that we can implement Algorithm 2 such that its time complexity reduces to O(n log n) without requiring a linear-time median-finding algorithm. Instead, we only require a data type that supports insertion of elements and finding and removing of minimum and maximum elements in O(log n) time, such as a double-ended priority queue.

The key to efficiency in our approach is that we do not explicitly compute the solution to each QRAP subproblem. Instead, we only compute an optimal Lagrange multiplier corresponding to the resource constraint (6) that characterizes the entire optimal solution to this subproblem. Subsequently, we use these multipliers to reconstruct the optimal solution to the original problem QRAP-NC using two sets of simple recursive relations that can be evaluated in O(n) time. In order to compute the Lagrange multipliers without explicitly storing intermediate solutions, we exploit the special structure of these multipliers and of a specific algorithm for solving QRAP.

First, in Section 4.1, we introduce some of the used notation. Second, in Section 4.2, we derive an efficient approach for computing the optimal Lagrange multipliers of the subproblems QRAP_j(L_j) and QRAP_j(U_j). Based on these optimal Lagrange multipliers, we derive in Section 4.3 two simple recursions to compute the optimal solution x̄ to QRAP-NC. Finally, in Section 4.4, we present an O(n log n) algorithm for QRAP-NC and discuss an implementation that attains this worst-case time complexity.

4.1 Notation

We introduce the following notation concerning the subproblems QRAP_j(L_j) and QRAP_j(U_j) that we use throughout the remainder of this article. We denote for j ∈ N the lower and upper single-variable bounds (7) and (8) of QRAP_j(C) with C ∈ [L_j, U_j] by ¯l^j := (¯l_i^j)_{i∈N_j} and ¯u^j := (¯u_i^j)_{i∈N_j}, where ¯l_i^j := x_i^{j−1}(L_{j−1}) and ¯u_i^j := x_i^{j−1}(U_{j−1}) for i < j, and ¯l_j^j := l_j and ¯u_j^j := u_j.
Furthermore, we denote by α^j := (α_i^j)_{i∈N_j} and β^j := (β_i^j)_{i∈N_j} the lower and upper breakpoints for the QRAP_j(C) subproblem. We call the breakpoints corresponding to i = j, i.e., α_j^j and β_j^j, initial breakpoints, since QRAP_j(C) is the first subproblem, i.e., the one with lowest index j, in which we have to compute breakpoint values for the variable x_j. Note that we can compute these breakpoints directly as α_j^j := l_j / a_j and β_j^j := u_j / a_j by definition of the subproblem QRAP_j(C).

Furthermore, let κ_j and λ_j denote the optimal Lagrange multipliers for the subproblems QRAP_j(L_j) and QRAP_j(U_j) respectively and define κ := (κ_j)_{j∈N} and λ := (λ_j)_{j∈N}, where we set κ_1 := α_1^1 and λ_1 := β_1^1. If the optimal Lagrange multiplier for a given subproblem QRAP_j(L_j) is not unique, we define without loss of generality κ_j as the maximum optimal Lagrange multiplier. Analogously, we define λ_j as the minimum optimal Lagrange multiplier of subproblem QRAP_j(U_j). Note that x^j(L_j) = x^j[κ_j] and x^j(U_j) = x^j[λ_j] by definition of the subproblems QRAP_j(L_j) and QRAP_j(U_j) and of κ_j and λ_j. Finally, for a given subproblem QRAP_j(C), we define the set of its lower breakpoints as A_j := {α_i^j | i ∈ N_j} and the set of its upper breakpoints as B_j := {β_i^j | i ∈ N_j}. Recall that in Section 2 we defined breakpoint sets as multisets for convenience when solving QRAP. However, for our approach for a fast algorithm for QRAP-NC, it is crucial that the breakpoint sets do not contain duplicate elements. Therefore, in this section and the remainder of this article, we regard A_j and B_j as ordinary sets.

4.2 Computing the optimal Lagrange multipliers

The goal of this subsection is to derive an efficient approach for computing the optimal Lagrange multiplier of each QRAP subproblem in Algorithm 2 without explicitly calculating any of the intermediate optimal solutions x^j(L_j) and x^j(U_j) for j ∈ N.
If we were to follow the latter strategy, i.e., solve each pair of subproblems QRAP_j(L_j) and QRAP_j(U_j) from scratch, e.g., using Algorithm 1, we would have to explicitly compute the breakpoint sets for each pair of subproblems. This leads to O(n^2) computations and thus forms an efficiency bottleneck within this algorithm.

We show that we can apply the breakpoint search procedure in Algorithm 1 for solving the subproblems such that each breakpoint set A_{j+1} can be obtained from the previous set A_j in O(1) amortized steps, i.e., the total number of steps required to carry out this construction for all j ∈ N_{n−1} is O(n). This can be done because of two intermediate results that we show in this subsection. First, the number of distinct values that the breakpoints can take is not O(n^2) but O(n). We obtain this result by unveiling a useful relation between breakpoints of consecutive subproblems, i.e., between α^j, β^j and α^{j+1}, β^{j+1}. Second, when constructing the breakpoint sets, each distinct breakpoint value is included in or removed from a breakpoint set at most twice during the entire procedure. For this, it is important that we solve each lower subproblem QRAP_j(L_j) by considering the breakpoints in non-decreasing order and each upper subproblem QRAP_j(U_j) by considering the breakpoints in non-increasing order. Together, these two results imply that the construction of the breakpoint sets requires in total O(n) additions and removals of breakpoint values. By using an appropriate data structure such as double-ended priority queues for maintaining the breakpoint sets, each of these steps can be executed in O(log n) time, which leads to an overall O(n log n) complexity for computing the optimal Lagrange multipliers κ and λ.

The outline of the remainder of this subsection is as follows.
First, in Section 4.2.1, we analyze the relation between breakpoints of consecutive subproblems and show that the number of distinct breakpoint values is $O(n)$. Subsequently, in Section 4.2.2, we use this information and the structure of Algorithm 1 to construct the breakpoint sets for each subproblem from those of the preceding subproblems. Finally, in Section 4.2.3, we discuss how the updating of the bookkeeping parameters within the breakpoint search procedure must be adjusted when applying this procedure to the subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$.

We first show how we can efficiently obtain the breakpoint set of a given subproblem QRAP$_{j+1}(C)$ based on the breakpoint set and optimal Lagrange multipliers of the preceding subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$. We establish for a given $j \in \mathcal{N}_{n-1}$ and $i < j$ the following relation between the subsequent lower breakpoints $\alpha_i^j$ and $\alpha_i^{j+1}$:

• If $\kappa_j < \alpha_i^j$, it follows from Equation (3) that $x_i^j[\kappa_j] = \bar{l}_i^j$ since $\alpha_i^j = \bar{l}_i^j / a_i$. This implies that $\bar{l}_i^{j+1} = x_i^j(L_j) = x_i^j[\kappa_j] = \bar{l}_i^j$ and thus $\alpha_i^{j+1} = \alpha_i^j$.

• If $\alpha_i^j \le \kappa_j < \beta_i^j$, it follows from Equation (3) that $x_i^j[\kappa_j] = a_i \kappa_j$. Thus, $\alpha_i^{j+1} = \bar{l}_i^{j+1} / a_i = x_i^j(L_j) / a_i = x_i^j[\kappa_j] / a_i = \kappa_j$.

• If $\beta_i^j \le \kappa_j$, then it follows from Equation (3) that $x_i^j[\kappa_j] = \bar{u}_i^j$. This implies that $\bar{l}_i^{j+1} = x_i^j(L_j) = x_i^j[\kappa_j] = \bar{u}_i^j$ and thus $\alpha_i^{j+1} = \beta_i^j$.

Summarizing, we can determine $\alpha_i^{j+1}$ from the previous breakpoints $\alpha_i^j$ and $\beta_i^j$ and the optimal Lagrange multiplier $\kappa_j$ as follows:
$$\alpha_i^{j+1} = \begin{cases} \alpha_i^j & \text{if } \kappa_j < \alpha_i^j, \\ \kappa_j & \text{if } \alpha_i^j \le \kappa_j < \beta_i^j, \\ \beta_i^j & \text{if } \beta_i^j \le \kappa_j. \end{cases} \qquad (9)$$
Analogously, we obtain the following expression for the upper breakpoint $\beta_i^{j+1}$ in terms of the previous breakpoints $\alpha_i^j$ and $\beta_i^j$ and the optimal Lagrange multiplier $\lambda_j$:
$$\beta_i^{j+1} = \begin{cases} \beta_i^j & \text{if } \lambda_j > \beta_i^j, \\ \lambda_j & \text{if } \beta_i^j \ge \lambda_j > \alpha_i^j, \\ \alpha_i^j & \text{if } \alpha_i^j \ge \lambda_j. \end{cases} \qquad (10)$$
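Both update rules can be read as clamping the multiplier to the interval between the old breakpoints. The following sketch (helper names are ours) makes this explicit:

```python
def next_lower_bp(alpha, beta, kappa):
    """Equation (9): the new lower breakpoint alpha_i^{j+1} from the old
    breakpoints (alpha_i^j, beta_i^j) and the multiplier kappa_j."""
    if kappa < alpha:
        return alpha
    if kappa < beta:
        return kappa
    return beta

def next_upper_bp(alpha, beta, lam):
    """Equation (10): the mirrored rule for the new upper breakpoint beta_i^{j+1}."""
    if lam > beta:
        return beta
    if lam > alpha:
        return lam
    return alpha
```

Note that `next_lower_bp(alpha, beta, kappa)` equals `min(max(kappa, alpha), beta)` and `next_upper_bp(alpha, beta, lam)` equals `max(min(lam, beta), alpha)`, which makes the monotonicity of the breakpoints ($\alpha_i^j \le \alpha_i^{j+1}$ and $\beta_i^j \ge \beta_i^{j+1}$) immediate.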
Note that it follows from these relations that $\alpha_i^j \le \alpha_i^{j+1}$ and $\beta_i^j \ge \beta_i^{j+1}$ for each $j \in \mathcal{N}_{n-1}$. Moreover, note that the only values that the breakpoints can take are those of the initial breakpoints $\alpha_j^j$ and $\beta_j^j$ or of the optimal Lagrange multipliers in $\kappa$ and $\lambda$. Thus, the number of distinct values among all breakpoints is limited by $4n$.

As observed at the end of Section 2, we can solve a given QRAP subproblem by searching its breakpoints either in non-decreasing or non-increasing order. In particular, we can solve all lower subproblems QRAP$_j(L_j)$ by searching the breakpoints in non-decreasing order and all upper subproblems QRAP$_j(U_j)$ by searching the breakpoints in non-increasing order. When doing this, note that for solving the upper subproblem QRAP$_j(U_j)$ we can use as breakpoint sets the sets that "remain" from the breakpoint search for the lower subproblem. More precisely, instead of the sets $A_j$ and $B_j$ that we also use as breakpoint sets for the lower subproblem QRAP$_j(L_j)$, we can use the sets $\{\alpha_i^j \in A_j \mid \alpha_i^j \ge \kappa_j\}$ and $\{\beta_i^j \in B_j \mid \beta_i^j \ge \kappa_j\}$ respectively. This is because $\kappa_j \le \lambda_j$ and thus in the breakpoint search for the upper subproblem QRAP$_j(U_j)$ no breakpoints smaller than $\kappa_j$ need to be considered.

We define the sets $\tilde{A}_j$ and $\tilde{B}_j$ as the sets of lower and upper breakpoints that remain to be considered after solving the subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$ in the way described in the previous paragraph, i.e., we have
$$\tilde{A}_j := \{\alpha_i^j \in A_j \mid \kappa_j \le \alpha_i^j \le \lambda_j\}, \qquad \tilde{B}_j := \{\beta_i^j \in B_j \mid \kappa_j \le \beta_i^j \le \lambda_j\}.$$
We call these sets the remaining breakpoint sets of the subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$. In the following, we relate these two remaining breakpoint sets to the breakpoint sets of the next two subproblems, i.e., to the sets $A_{j+1}$ and $B_{j+1}$.
For this, we focus on the relation between the remaining lower breakpoint set $\tilde{A}_j$ and the lower breakpoint set $A_{j+1}$; the relation between the remaining upper breakpoint set $\tilde{B}_j$ and the upper breakpoint set $B_{j+1}$ is analogous. For each $i \in \mathcal{N}_j$, we consider four cases for the value of $\alpha_i^{j+1}$:

1. If $\kappa_j \le \alpha_i^j \le \lambda_j$, it follows from Equation (9) that $\alpha_i^{j+1} = \alpha_i^j$. Thus, all values in $\tilde{A}_j$ act as breakpoint values for the next subproblems, i.e., $\tilde{A}_j \subseteq A_{j+1}$.

2. If $\alpha_i^j < \kappa_j$ and $\kappa_j < \beta_i^j$, it follows from Equation (9) that $\alpha_i^{j+1} = \kappa_j$.

3. If $\alpha_i^j < \kappa_j$ and $\beta_i^j \le \kappa_j$, it follows from Equations (9) and (10) that $x_i^j(L_j) = \bar{u}_i^j$ and $\beta_i^{j+1} = \beta_i^j = \alpha_i^{j+1}$ respectively. Thus, $\beta_i^{j+1} = \alpha_i^{j+1} \le \cdots \le \alpha_i^n \le \beta_i^n \le \cdots \le \beta_i^{j+1}$, which forces all these values to be equal. This means that $\alpha_i^{j'} = \beta_i^{j'} = \beta_i^j$ and $\bar{l}_i^{j'} = \bar{u}_i^{j'} = \bar{u}_i^j$ for all $j' > j$. Thus, in all remaining subproblems, the lower and upper breakpoints of $i$ coincide and $x_i^{j'}(C) = \bar{u}_i^j$ for any $j' > j$ and $L_{j'} \le C \le U_{j'}$, regardless of the values of the future optimal Lagrange multipliers $\kappa_{j'}$ and $\lambda_{j'}$. This means that we can remove this index (variable) from the breakpoint search.

4. Finally, if $\alpha_i^j > \lambda_j$, it follows from Equations (9) and (10) that $\alpha_i^{j+1} = \alpha_i^j$ and $\beta_i^{j+1} = \alpha_i^j$ respectively. Thus, $\alpha_i^{j+1} = \beta_i^{j+1} = \alpha_i^j$. Analogously to the case $\alpha_i^j \le \beta_i^j < \kappa_j$, it follows that $\bar{l}_i^{j'} = \bar{u}_i^{j'} = \bar{l}_i^j$ and $x_i^{j'}(C) = \bar{l}_i^j$ for all $j' > j$ and $L_{j'} \le C \le U_{j'}$.
Thus, also in this case we can remove the index $i$ from the breakpoint search.

These four cases imply that we can construct $A_{j+1}$ from $\tilde{A}_j$ as follows:
$$A_{j+1} = \tilde{A}_j \cup \{\alpha_{j+1}^{j+1}\} \cup \begin{cases} \{\kappa_j\} & \text{if there exists } i \text{ such that } \alpha_i^j < \kappa_j < \beta_i^j, \\ \emptyset & \text{otherwise.} \end{cases}$$
Analogously, we can construct $B_{j+1}$ from $\tilde{B}_j$ as follows:
$$B_{j+1} = \tilde{B}_j \cup \{\beta_{j+1}^{j+1}\} \cup \begin{cases} \{\lambda_j\} & \text{if there exists } i \text{ such that } \alpha_i^j < \lambda_j < \beta_i^j, \\ \emptyset & \text{otherwise.} \end{cases}$$
The above constructions show how the breakpoint sets evolve over the course of the algorithm. First, in this construction, at most $4n$ additions of breakpoint values to a breakpoint set occur. Second, during the breakpoint search procedure of Algorithm 1, breakpoints are only removed and not added. This means that updating the breakpoint sets can be done in $O(n)$ steps, i.e., by $O(n)$ additions and removals of breakpoint values.

In order to efficiently compute the sums $z^j[\delta] := \sum_{i \in \mathcal{N}_j} x_i^j[\delta]$ for a given breakpoint $\delta$, we define the following bookkeeping parameters analogously to those in the breakpoint search procedure for QRAP in Algorithm 1:
$$P_j(\delta) := \sum_{i \le j \,:\, \delta < \bar{l}_i^j / a_i} \bar{l}_i^j \; + \sum_{i \le j \,:\, \delta \ge \bar{u}_i^j / a_i} \bar{u}_i^j; \qquad Q_j(\delta) := \sum_{i \le j \,:\, \bar{l}_i^j / a_i \le \delta < \bar{u}_i^j / a_i} a_i.$$
Each breakpoint value $\kappa_{j'}$ and $\lambda_{j'}$ in a given breakpoint set acts as a collective breakpoint for one or multiple activities. As a consequence, within the breakpoint search procedure, they have the same function as the "regular" initial lower and upper breakpoint values $\alpha_i^i$ and $\beta_i^i$. Thus, when a breakpoint value of the form $\kappa_{j'}$ or $\lambda_{j'}$ has been considered, we require an efficient update of the bookkeeping sums $P_j(\kappa_{j'}), Q_j(\kappa_{j'})$ or $P_j(\lambda_{j'}), Q_j(\lambda_{j'})$ respectively.
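The defining property of these bookkeeping parameters is the invariant $z^j[\delta] = P_j(\delta) + Q_j(\delta)\,\delta$, which follows from $x_i^j[\delta] = \min(\max(a_i\delta, \bar{l}_i^j), \bar{u}_i^j)$. A direct, non-incremental reference implementation of the definitions (useful for testing an incremental update scheme; function names are ours):

```python
def bookkeeping_sums(a, lbar, ubar, delta):
    """Compute P_j(delta) and Q_j(delta) directly from their definitions."""
    P = sum(lb for lb, ai in zip(lbar, a) if delta < lb / ai) + \
        sum(ub for ub, ai in zip(ubar, a) if delta >= ub / ai)
    Q = sum(ai for lb, ub, ai in zip(lbar, ubar, a) if lb / ai <= delta < ub / ai)
    return P, Q

def z_direct(a, lbar, ubar, delta):
    """z^j[delta] = sum_i x_i^j[delta], computed variable by variable."""
    return sum(min(max(ai * delta, lb), ub) for ai, lb, ub in zip(a, lbar, ubar))
```

For any $\delta$, `z_direct(...)` equals `P + Q * delta`, so a breakpoint search only needs to maintain the two scalars $P$ and $Q$ instead of recomputing the full sum.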
In the case of $\kappa_{j'}$, we update $P_j(\kappa_{j'})$ by subtracting from this value the sum of the lower bounds $\bar{l}_i^j$ of those activities $i$ whose lower breakpoint equals $\kappa_{j'}$, i.e., for which $\alpha_i^j = \kappa_{j'}$. Since each such activity has $\bar{l}_i^j = a_i \kappa_{j'}$, the sum of these values is $\bar{Q}_L^{j'} \kappa_{j'}$, where $\bar{Q}_L^{j'}$ denotes the value of the bookkeeping parameter $Q$ at the moment that $\kappa_{j'}$ was determined; the update for $\lambda_{j'}$ is analogous.

Lemma 4. We have $\chi_n = \kappa_n = \lambda_n$. Moreover, for each $j \in \mathcal{N}_{n-1}$, we have:

1. $\chi_{j+1} \le \kappa_j$ implies $\sum_{i \in \mathcal{N}_j} x_i^n(R) = L_j$ and $\chi_j = \kappa_j$;
2. $\lambda_j \le \chi_{j+1}$ implies $\sum_{i \in \mathcal{N}_j} x_i^n(R) = U_j$ and $\chi_j = \lambda_j$;
3. $\kappa_j < \chi_{j+1} < \lambda_j$ implies $L_j < \sum_{i \in \mathcal{N}_j} x_i^n(R) < U_j$ and $\chi_j = \chi_{j+1}$.

Proof. See Appendix A.2.

Lemma 5. For each $i \in \mathcal{N}$, we have
$$x_i^n(R) = \begin{cases} l_i & \text{if } \chi_i < \alpha_i^i, \\ a_i \chi_i & \text{if } \alpha_i^i \le \chi_i < \beta_i^i, \\ u_i & \text{if } \beta_i^i \le \chi_i. \end{cases} \qquad (11)$$

Proof. See Appendix A.3.

Note that, starting from $\chi_n = \kappa_n$ and using Lemma 4, we can compute the values $\chi_j$ recursively as
$$\chi_j = \begin{cases} \kappa_j & \text{if } \chi_{j+1} \le \kappa_j, \\ \lambda_j & \text{if } \chi_{j+1} \ge \lambda_j, \\ \chi_{j+1} & \text{otherwise.} \end{cases} \qquad (12)$$
Thus, given the optimal Lagrange multipliers $\kappa$ and $\lambda$, we can compute the optimal solution in $O(n)$ time as $x^n(R)$ using the two relatively simple recursions in Equations (11) and (12).

4.4 An $O(n \log n)$ time algorithm for QRAP-NC

In the previous two subsections, we derived an efficient approach to compute the optimal Lagrange multipliers $\kappa$ and $\lambda$ for the QRAP$_j(L_j)$ and QRAP$_j(U_j)$ subproblems and to compute from these multipliers the optimal solution. In this subsection, we combine these two ingredients to formulate a fast and efficient algorithm for QRAP-NC (Algorithm 3). More precisely, in the first part of this subsection, Section 4.4.1, we present our algorithm and discuss several of its details regarding the subroutines for computing the optimal Lagrange multipliers of the QRAP$_j(L_j)$ and QRAP$_j(U_j)$ subproblems. This includes several procedures that deal with corner cases and with the updating of the breakpoint sets and the bookkeeping parameters.
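The backward recursion (12) and the read-off rule (11) can be sketched as follows (a simplified, 0-indexed stand-in for the final phase of the algorithm; the multiplier vectors `kappa` and `lam` are assumed to come from the subproblem solves described above):

```python
def recover_solution(a, l, u, kappa, lam):
    """Compute chi via Equation (12) and read off x^n(R) via Equation (11),
    using the initial breakpoints alpha_i^i = l_i/a_i and beta_i^i = u_i/a_i."""
    n = len(a)
    chi = [0.0] * n
    chi[n - 1] = kappa[n - 1]                 # chi_n = kappa_n = lambda_n
    for j in range(n - 2, -1, -1):            # Equation (12), backwards over j
        if chi[j + 1] <= kappa[j]:
            chi[j] = kappa[j]
        elif chi[j + 1] >= lam[j]:
            chi[j] = lam[j]
        else:
            chi[j] = chi[j + 1]
    x = []
    for i in range(n):                        # Equation (11)
        alpha, beta = l[i] / a[i], u[i] / a[i]
        if chi[i] < alpha:
            x.append(l[i])
        elif chi[i] < beta:
            x.append(a[i] * chi[i])
        else:
            x.append(u[i])
    return x
```

Both loops are a single linear pass, which is why this final phase costs only $O(n)$ once $\kappa$ and $\lambda$ are known.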
In the second part, Section 4.4.2, we focus on the efficiency of the algorithm. In particular, we prove in Lemma 6 that the algorithm has an $O(n \log n)$ worst-case time complexity when using an appropriate data structure.

Algorithm 3 captures our approach for solving QRAP-NC. First, in Lines 3-13, the algorithm initializes all problem parameters, the initial breakpoint values and breakpoint sets, and the initial bookkeeping parameters. Throughout the entire algorithm, it maintains four separate sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{K}$, and $\mathcal{L}$ of breakpoint values corresponding to the "source" of the values, i.e., this specifies whether they are one of the initial breakpoint values $\alpha_i^i$ or $\beta_i^i$ or one of the optimal Lagrange multipliers $\kappa_j$ or $\lambda_j$ respectively. Second, in Lines 14-16, the algorithm applies for each $j \in \mathcal{N} \setminus \{1\}$ the procedure SolveSubproblems($j$) (see Algorithm 4), which computes the optimal Lagrange multipliers $\kappa_j$ and $\lambda_j$ for the two subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$. Finally, using the obtained vectors of optimal Lagrange multipliers $\kappa$ and $\lambda$, the algorithm computes in Lines 17-22 the (alternative) multiplier values $\chi$ using the recursion in Equation (12) and from these values the solution $x^n(R)$ using Equation (11).

The procedure SolveSubproblems($j$) carries out the breakpoint search procedure for the subproblems QRAP$_j(L_j)$ and QRAP$_j(U_j)$ as described in Section 2 (Lines 38-59). This is done by first initializing the bookkeeping parameters for these breakpoint search procedures in Lines 39-46 and Lines 49-56 and subsequently applying the procedures LowerSubproblem($j$) (Line 47, Algorithm 5) and UpperSubproblem($j$) (Line 57, Algorithm 6), which are identical in nature to Lines 5-22 of Algorithm 1. Before carrying out the breakpoint search procedure, two possible corner cases are considered in Lines 1-38 with regard to the relation between the to-be-computed multipliers $\kappa_j$ and $\lambda_j$ and their predecessors $\kappa_{j-1}$ and $\lambda_{j-1}$.
We briefly discuss these corner cases for $\kappa_j$; the corner cases for $\lambda_j$ are analogous. The first corner case occurs when $\kappa_j = \kappa_{j-1}$ (Lines 1-9 in SolveSubproblems($j$)). This case corresponds to Lines 5-7 in Algorithm 1, where the currently considered candidate multiplier $\delta_i$ leads

Algorithm 3: An $O(n \log n)$ time algorithm for QRAP-NC.
Input: parameters $a \in \mathbb{R}^n_{>0}$, $L, U \in \mathbb{R}^{n-1}$, $R \in \mathbb{R}$, and $l, u \in \mathbb{R}^n$
Output: optimal solution $\bar{x}$ to QRAP-NC
  $L_1 = \max(L_1, l_1)$; $U_1 = \min(U_1, u_1)$
  for $j = 2$ to $n$ do
    $L_j = \max(L_j, L_{j-1} + l_j)$
    $U_j = \min(U_j, U_{j-1} + u_j)$
  end for
  for $i = 1$ to $n$ do
    $\alpha_i^i = l_i / a_i$; $\beta_i^i = u_i / a_i$
  end for
  $\kappa_1 = \alpha_1^1$; $\lambda_1 = \beta_1^1$; $\kappa_j = \infty$, $\lambda_j = -\infty$ for $j > 1$
  Initialize breakpoint sets: $\mathcal{A} := \{\alpha_1^1\}$; $\mathcal{B} := \{\beta_1^1\}$; $\mathcal{K} := \emptyset$; $\mathcal{L} := \emptyset$
  Initialize bookkeeping sums: $\bar{P}_L^1 = \bar{P}_U^1 = 0$; $\bar{Q}_L^1 = \bar{Q}_U^1 = a_1$
  for $j = 2$ to $n$ do
    Apply procedure SolveSubproblems($j$)
  end for
  $\chi_n := \kappa_n$
  for $i = n-1$ down to $1$ do
    Compute $\chi_i$ using Equation (12)
  end for
  Compute $x^n(R)$ using Equation (11)
  return optimal solution $\bar{x} := x^n(R)$

Algorithm 4: Procedure SolveSubproblems($j$).
if $L_{j-1} + \max(l_j, \min(a_j \kappa_{j-1}, u_j)) = L_j$ then
  $\kappa_j := \kappa_{j-1}$; replace $\kappa_{j-1}$ in $\mathcal{K}$ by $\kappa_j$
  if $\kappa_j < \alpha_j^j$ then $\bar{P}_L^j := \bar{P}_L^{j-1} + l_j$; $\bar{Q}_L^j := \bar{Q}_L^{j-1}$
  else if $\beta_j^j < \kappa_j$ then $\bar{P}_L^j := \bar{P}_L^{j-1} + u_j$; $\bar{Q}_L^j := \bar{Q}_L^{j-1}$
  else $\bar{P}_L^j := \bar{P}_L^{j-1}$; $\bar{Q}_L^j := \bar{Q}_L^{j-1} + a_j$
  end if
else if $L_{j-1} + \max(l_j, \min(a_j \kappa_{j-1}, u_j)) > L_j$ then ($\kappa_j < \kappa_{j-1}$:)
  $\kappa_j = (L_j - L_{j-1}) / a_j$; $\bar{P}_L^j := L_{j-1}$; $\bar{Q}_L^j := a_j$; add $\kappa_j$ to $\mathcal{K}$
else ($\kappa_j > \kappa_{j-1}$:) remove $\kappa_{j-1}$ from $\mathcal{K}$
end if
if $U_{j-1} + \max(l_j, \min(a_j \lambda_{j-1}, u_j)) = U_j$ then
  $\lambda_j := \lambda_{j-1}$; replace $\lambda_{j-1}$ in $\mathcal{L}$ by $\lambda_j$
  if $\lambda_j < \alpha_j^j$ then $\bar{P}_U^j := \bar{P}_U^{j-1} + l_j$; $\bar{Q}_U^j := \bar{Q}_U^{j-1}$
  else if $\beta_j^j < \lambda_j$ then $\bar{P}_U^j := \bar{P}_U^{j-1} + u_j$; $\bar{Q}_U^j := \bar{Q}_U^{j-1}$
  else $\bar{P}_U^j := \bar{P}_U^{j-1}$; $\bar{Q}_U^j := \bar{Q}_U^{j-1} + a_j$
  end if
else if $U_{j-1} + \max(l_j, \min(a_j \lambda_{j-1}, u_j)) < U_j$ then ($\lambda_j > \lambda_{j-1}$:)
  $\lambda_j = (U_j - U_{j-1}) / a_j$; $\bar{P}_U^j := U_{j-1}$; $\bar{Q}_U^j := a_j$; add $\lambda_j$ to $\mathcal{L}$
else ($\lambda_j < \lambda_{j-1}$:) remove $\lambda_{j-1}$ from $\mathcal{L}$
end if
if $\min(\kappa_{j-1}, \kappa_j) < \alpha_j^j \le \max(\lambda_{j-1}, \lambda_j)$ then add $\alpha_j^j$ to $\mathcal{A}$ end if
if $\min(\kappa_{j-1}, \kappa_j) \le \beta_j^j < \max(\lambda_{j-1}, \lambda_j)$ then add $\beta_j^j$ to $\mathcal{B}$ end if
if $\kappa_j > \kappa_{j-1}$ then
  if $\kappa_{j-1} < \alpha_j^j$ then $P := \bar{P}_L^{j-1} + l_j$; $Q := \bar{Q}_L^{j-1}$
  else if $\alpha_j^j \le \kappa_{j-1} < \beta_j^j$ then $P := \bar{P}_L^{j-1}$; $Q := \bar{Q}_L^{j-1} + a_j$
  else $P := \bar{P}_L^{j-1} + u_j$; $Q := \bar{Q}_L^{j-1}$
  end if
  Apply procedure LowerSubproblem($j$)
end if
if $\lambda_j < \lambda_{j-1}$ then
  if $\beta_j^j < \lambda_{j-1}$ then $P := \bar{P}_U^{j-1} + u_j$; $Q := \bar{Q}_U^{j-1}$
  else if $\alpha_j^j < \lambda_{j-1} \le \beta_j^j$ then $P := \bar{P}_U^{j-1}$; $Q := \bar{Q}_U^{j-1} + a_j$
  else $P := \bar{P}_U^{j-1} + l_j$; $Q := \bar{Q}_U^{j-1}$
  end if
  Apply procedure UpperSubproblem($j$)
end if

to a solution $x[\delta_i]$ that sums to $C$, i.e., $z[\delta_i] = C$. For QRAP-NC$_j(L_j)$, this case thus occurs if and only if $L_{j-1} + x_j^j[\kappa_j] = L_j$, i.e., if and only if $x_i^j(L_j) = x_i^{j-1}(L_{j-1})$ for all $i \in \mathcal{N}_{j-1}$ and $L_{j-1} + \max(l_j, \min(a_j \chi_j, u_j)) = L_j$.
The second case (Lines 9-13) occurs when $\kappa_j < \kappa_{j-1}$ and corresponds to Lines 8-10 of Algorithm 1, where the candidate multiplier $\delta_i$ leads to a solution $x[\delta_i]$ whose sum is larger

Algorithm 5: Procedure LowerSubproblem($j$).
repeat
  Choose the minimum to-be-considered breakpoint $\delta := \min(\mathcal{A}, \mathcal{B}, \mathcal{K}, \mathcal{L})$ and the corresponding member set $\mathcal{D} \in \{\mathcal{A}, \mathcal{B}, \mathcal{K}, \mathcal{L}\}$
  if $P + Q\delta = L_j$ then
    $\kappa_j := \delta$; add $\kappa_j$ to $\mathcal{K}$; $\bar{P}_L^j := P$; $\bar{Q}_L^j := Q$; return
  else if $P + Q\delta > L_j$ then ($\kappa_j < \delta$:)
    $\kappa_j := (L_j - P)/Q$; add $\kappa_j$ to $\mathcal{K}$; $\bar{P}_L^j := P$; $\bar{Q}_L^j := Q$; return
  else ($\kappa_j > \delta$:) breakpoint $\delta$ will be considered
    if $\mathcal{D} \equiv \mathcal{A}$ then let the breakpoint be $\delta \equiv \alpha_k^k$; $P := P - l_k$; $Q := Q + a_k$; remove $\alpha_k^k$ from $\mathcal{A}$
    else if $\mathcal{D} \equiv \mathcal{B}$ then let the breakpoint be $\delta \equiv \beta_k^k$; $P := P + u_k$; $Q := Q - a_k$; remove $\beta_k^k$ from $\mathcal{B}$
    else if $\mathcal{D} \equiv \mathcal{K}$ then let the breakpoint be $\delta \equiv \kappa_k$; $P := P - \bar{Q}_L^k \kappa_k$; $Q := Q + \bar{Q}_L^k$; remove $\kappa_k$ from $\mathcal{K}$
    else let the breakpoint be $\delta \equiv \lambda_k$; $P := P + \bar{Q}_U^k \lambda_k$; $Q := Q - \bar{Q}_U^k$; remove $\lambda_k$ from $\mathcal{L}$
    end if
  end if
until $\kappa_j$ has been determined

Algorithm 6: Procedure UpperSubproblem($j$).
repeat
  Choose the maximum to-be-considered breakpoint $\delta := \max(\mathcal{A}, \mathcal{B}, \mathcal{K}, \mathcal{L})$ and the corresponding member set $\mathcal{D} \in \{\mathcal{A}, \mathcal{B}, \mathcal{K}, \mathcal{L}\}$
  if $P + Q\delta = U_j$ then
    $\lambda_j := \delta$; add $\lambda_j$ to $\mathcal{L}$; $\bar{P}_U^j := P$; $\bar{Q}_U^j := Q$; return
  else if $P + Q\delta < U_j$ then ($\lambda_j > \delta$:)
    $\lambda_j := (U_j - P)/Q$; add $\lambda_j$ to $\mathcal{L}$; $\bar{P}_U^j := P$; $\bar{Q}_U^j := Q$; return
  else ($\lambda_j < \delta$:) breakpoint $\delta$ will be considered
    if $\mathcal{D} \equiv \mathcal{A}$ then let the breakpoint be $\delta \equiv \alpha_k^k$; $P := P + l_k$; $Q := Q - a_k$; remove $\alpha_k^k$ from $\mathcal{A}$
    else if $\mathcal{D} \equiv \mathcal{B}$ then let the breakpoint be $\delta \equiv \beta_k^k$; $P := P - u_k$; $Q := Q + a_k$; remove $\beta_k^k$ from $\mathcal{B}$
    else if $\mathcal{D} \equiv \mathcal{K}$ then let the breakpoint be $\delta \equiv \kappa_k$; $P := P + \bar{Q}_L^k \kappa_k$; $Q := Q - \bar{Q}_L^k$; remove $\kappa_k$ from $\mathcal{K}$
    else let the breakpoint be $\delta \equiv \lambda_k$; $P := P - \bar{Q}_U^k \lambda_k$; $Q := Q + \bar{Q}_U^k$; remove $\lambda_k$ from $\mathcal{L}$
    end if
  end if
until $\lambda_j$ has been determined

than $C$, i.e., $z[\delta_i] > C$.
In QRAP-NC$_j(L_j)$, this case occurs if and only if $L_{j-1} + x_j^j[\kappa_j] > L_j$, i.e., if and only if $x_i^j(L_j) = x_i^{j-1}(L_{j-1})$ for all $i \in \mathcal{N}_{j-1}$ and $L_{j-1} + \max(l_j, \min(a_j \chi_j, u_j)) > L_j$. In both cases, it is not necessary to carry out the actual breakpoint search to find $\kappa_j$ since either $\kappa_j = \kappa_{j-1}$ (the first case) or $\kappa_j = (L_j - L_{j-1})/a_j$ (the second case).

Whether or not one of the above mentioned corner cases occurs partly determines whether or not we have to include the new initial breakpoint values $\alpha_j^j$ and $\beta_j^j$ in the breakpoint search procedure. The algorithm makes this decision in Lines 33-38: $\alpha_j^j$ and $\beta_j^j$ are included only if they lie between the lowest and highest breakpoint values that can be considered in the breakpoint search. This lowest value is $\kappa_j$ if $\kappa_j \le \kappa_{j-1}$ (when one of the two corner cases for $\kappa_j$ occurs and thus this value has already been determined) and $\kappa_{j-1}$ otherwise (when breakpoint search is required to find $\kappa_j$). Analogously, the highest value is $\lambda_j$ if $\lambda_j \ge \lambda_{j-1}$ and $\lambda_{j-1}$ otherwise.

4.4.2 Time complexity

We now establish the worst-case time complexity of Algorithm 3 by means of the following lemma:

Lemma 6. Algorithm 3 can be implemented such that its worst-case time complexity is $O(n \log n)$.

Proof. Observe that, throughout the algorithm and all its procedures, all operations have a total time complexity of $O(n)$ except for four operations on the sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{K}$, and $\mathcal{L}$ of to-be-considered breakpoints. For each of these breakpoint sets, say $\mathcal{D}$, these are finding the minimum and maximum breakpoint in $\mathcal{D}$ (Lines 2 and 18 in Algorithm 4 and Line 2 in Algorithms 5 and 6), inserting a breakpoint value in $\mathcal{D}$ (Lines 13, 29, 34, and 37 in Algorithm 4), and removing the minimum or maximum breakpoint from $\mathcal{D}$ (Lines 15 and 31 in Algorithm 4 and Lines 16, 20, 24, and 28 in Algorithms 5 and 6). As we showed in Section 4.2, each breakpoint value is inserted and removed at most once during the course of the algorithm.
Moreover, in the worst case, we have to find the minimum and maximum breakpoint value in $\mathcal{D}$ $O(n)$ times. Thus, the total number of breakpoint set operations is $O(n)$. If we maintain the breakpoint sets as min-max heaps [4], each of these operations can be executed in $O(1)$ (finding the minimum and maximum) and $O(\log n)$ (inserting and removing a breakpoint) time. This means that the total time complexity of all four breakpoint set operations is $O(n \log n)$ if we use min-max heaps to store the breakpoint sets. It follows that Algorithm 3 can be implemented such that its worst-case time complexity is $O(n \log n)$.

In practice, carrying out the breakpoint set operations might be faster if we use a data structure other than min-max heaps to maintain the breakpoint sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{K}$, and $\mathcal{L}$. For instance, when $n$ is small, simple arrays might be sufficient for fast insertion and removal of breakpoints, even though this increases the worst-case time complexity. On the other hand, [19] suggests keeping the breakpoint sets by means of a so-called disjoint set data structure (see, e.g., [11]). Using such a structure, a sequence of $O(n)$ breakpoint insertions and deletions in sets of size at most $n$ can be done in $O(n)$ time using the algorithm in [14]. However, it is unclear whether the algorithm in [14] is fast in practice, for two reasons. First, it is complicated and cumbersome to implement compared to other algorithms for insertion and removal operations on disjoint set data structures [15]. Second, although the authors mention in the preliminary study [13] that their algorithm outperforms the then state-of-the-art, the literature contains hardly any studies on its practical performance. Alternatively, one could use other algorithms (e.g., those evaluated in [34]) that have a worse worst-case time complexity but have been shown to be fast in practice.
In this section, we evaluate the performance of our Algorithm 3 as presented in Section 4.4, to which we shall refer as ALG_seq for clarity, and compare it with the state-of-the-art algorithms ALG_inf from [41] and ALG_dec from [42]. We carry out two types of experiments. First, we evaluate the performance of our algorithm on realistic instances of the battery scheduling problem BATTERY. For this, we tailor ALG_seq to this problem and compare this implementation to a tailored implementation of ALG_inf within the simulation tool DEMKit [21]. Second, we compare the execution time and scalability of our algorithm and of ALG_inf and ALG_dec on synthetic instances with sizes ranging from 10 to one million variables. We have implemented all three algorithms in Python (version 3.5) to be able to compare them to the implementation in DEMKit, which is also written in Python. All simulations and computations have been executed on a 2.60 GHz Dell Inspiron 15 with an Intel Core i7-6700HQ CPU and 16 GB of RAM.

In Section 5.1, we describe in more detail the problem instances that we use in the evaluation. Subsequently, in Section 5.2, we discuss several implementation choices, and in Section 5.3 we present and discuss the results of our evaluation.

For the comparison of the tailored implementation of our algorithm ALG_seq with the tailored implementation of ALG_inf within DEMKit, we generate realistic instances of the problem BATTERY. For this, we consider the setting where a battery charging schedule for two consecutive days needs to be computed. This scheduling horizon is divided into 15-minute time intervals, resulting in $n = 192$. To study the influence of the battery size on the solving time, we consider three scenarios that correspond to three different battery sizes and denote them by Small, Medium, and Large. In these scenarios, the battery capacity is 20 kWh, 100 kWh, or 180 kWh and the (dis)charging rate is 4 kW, 20 kW, or 36 kW respectively.
This leads to $\Delta t = 1/4$ and to the values for $X^{\min}$, $X^{\max}$, and $D$ as given in Table 3. Note that this is equivalent to the situation where either 10, 50, or 90 percent of the households have installed a smaller "home" battery with a capacity of 5 kWh and a (dis)charging rate of 1 kW, which corresponds to real-life field tests such as described in [35]. We set both the initial and target SoC to a given fraction of the capacity, i.e., $S^{\text{start}} = S^{\text{end}} = sD$, where $s \in \{0, 0.1, 0.2, \ldots, 1\}$. For each scenario, we simulate 50 battery schedules of two days. As input for the base load $p$, we use measurement data of the actual power consumption of 40 households for 100 consecutive days that were obtained in the field test described in [22].

[Table 3: Parameter choices ($X^{\min}$, $X^{\max}$, $D$) for the battery scheduling problem for each of the scenarios Small, Medium, and Large.]

For the scalability analysis, we generate synthetic instances in the same way as in [42]. For this, we consider instance sizes $n$ in the set $\{10, 20, 50, 100, 200, 500, \ldots, 10^6\}$ and for each of these sizes, we generate 10 instances. In each instance, we sample the parameters $a$, $l$, and $u$ from uniform distributions. To generate the nested bounds $L$ and $U$, we first draw for each $i \in \mathcal{N}$ two values $X_i$ and $Y_i$ from the uniform distribution $U(l_i, u_i)$. Subsequently, we define for each $j \in \mathcal{N}$ the values $v_j := \sum_{i \in \mathcal{N}_j} X_i$ and $w_j := \sum_{i \in \mathcal{N}_j} Y_i$, and we set $L_j := \min(v_j, w_j)$ and $U_j := \max(v_j, w_j)$ for $j < n$ and $L_n = U_n = (v_n + w_n)/2$.

In both the divide-and-conquer algorithm ALG_dec and the infeasibility-guided algorithm ALG_inf, we use Algorithm 1 to solve the QRAP subproblems. Note that using this algorithm instead of linear-time algorithms such as in [27] increases the worst-case time complexity of ALG_dec and ALG_inf by a factor $O(\log n)$.
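The synthetic instance construction can be sketched as follows. The uniform ranges for `a`, `l`, and `u` below are placeholders (the exact ranges used in the experiments are assumptions here), but the construction of the nested bounds from the prefix sums $v_j$ and $w_j$ follows the text:

```python
import random

def generate_instance(n, seed=0):
    """Generate a random QRAP-NC instance in the style described above."""
    rng = random.Random(seed)
    a = [rng.uniform(0.1, 1.0) for _ in range(n)]      # assumed range
    l = [rng.uniform(0.0, 0.4) for _ in range(n)]      # assumed range
    u = [li + rng.uniform(0.1, 0.5) for li in l]       # guarantees l_i < u_i
    # Draw X_i, Y_i ~ U(l_i, u_i) and build nested bounds from their prefix sums.
    X = [rng.uniform(l[i], u[i]) for i in range(n)]
    Y = [rng.uniform(l[i], u[i]) for i in range(n)]
    v = w = 0.0
    L, U = [], []
    for j in range(n):
        v += X[j]
        w += Y[j]
        L.append(min(v, w))
        U.append(max(v, w))
    L[-1] = U[-1] = (v + w) / 2                        # L_n = U_n fixes the resource R
    return a, l, u, L, U
```

Because $X_i, Y_i \in [l_i, u_i]$, the resulting nested bounds are automatically consistent with the single-variable bounds, so every generated instance is feasible.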
However, for practically relevant problem sizes, this procedure is generally faster in practice and easier to implement than the linear-time algorithms in, e.g., [27].

For the double-ended queues needed in ALG_seq for the optimal Lagrange multipliers $\kappa$ and $\lambda$, we use the Python container datatype deque. Moreover, we initially implemented the double-ended priority queues for the lower and upper initial breakpoint values $(\alpha_i^i)_{i \in \mathcal{N}}$ and $(\beta_i^i)_{i \in \mathcal{N}}$ as symmetric min-max heaps [3]. However, initial tests indicated that using instead a coupled min-heap and max-heap implementation with total correspondence leads to similar or even lower execution times of the overall algorithm. Moreover, the latter data structure is much simpler to implement using the standard Python library heapq. Therefore, we use this method instead of min-max heaps. In this alternative method, we insert new breakpoints in both the min-heap and the max-heap and use the min-heap to find and delete a minimum breakpoint (in the lower subproblems) and the max-heap to find and delete a maximum breakpoint (in the upper subproblems). Moreover, we assign to each breakpoint a flag that is 1 if the breakpoint has been removed from either of the heaps and 0 otherwise. This prevents us from finding a minimum (maximum) breakpoint in the min-heap (max-heap) that was already considered in the other heap and thus has been removed from the breakpoint search.

In this section, we present and discuss the results of our evaluation. First, we discuss the results of the comparison of the tailored implementation of ALG_seq with the tailored implementation of ALG_inf within DEMKit. Figure 2 shows the ratios between the execution times of the tailored implementation of ALG_inf and that of ALG_seq. Moreover, Tables 4-6 contain for each scenario and each initial and target SoC value the mean, maximum, and coefficient of variation (CoV) of the execution times.
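The coupled-heap construction with deletion flags can be sketched as follows. This is our illustrative rendering (the class and method names are ours, not from the DEMKit code), and using a set for the flags assumes pairwise-distinct breakpoint values, which Section 4.2 guarantees:

```python
import heapq

class DoubleEndedBreakpointSet:
    """Double-ended priority queue built from a coupled min-heap and max-heap
    with lazy deletion, as an alternative to a min-max heap."""

    def __init__(self):
        self._min = []               # min-heap of breakpoint values
        self._max = []               # max-heap, stored as negated values
        self._removed = set()        # flags for lazily deleted breakpoints

    def insert(self, bp):
        heapq.heappush(self._min, bp)
        heapq.heappush(self._max, -bp)

    def pop_min(self):
        """Remove and return the smallest not-yet-removed breakpoint."""
        while True:
            bp = heapq.heappop(self._min)
            if bp not in self._removed:
                self._removed.add(bp)    # mark it so the max-heap skips it later
                return bp

    def pop_max(self):
        """Remove and return the largest not-yet-removed breakpoint."""
        while True:
            bp = -heapq.heappop(self._max)
            if bp not in self._removed:
                self._removed.add(bp)
                return bp
```

Each breakpoint is pushed onto both heaps once, and stale copies are discarded on the fly when they surface, so all operations remain $O(\log n)$ amortized.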
The CoV is the sample standard deviation divided by the sample mean and is a suitable measure of the variation between samples when comparing different collections of samples with significantly different sample means.

Tables 4-6 show that the mean execution time of ALG_seq is similar in each scenario, whereas that of ALG_inf appears to decrease as the battery size increases. This implies that also the ratios between the execution times decrease as the battery size increases, which is confirmed by the boxplots in Figure 2.

[Figure 2: Boxplots of the execution time of the tailored implementation of ALG_inf within DEMKit divided by that of the tailored implementation of ALG_seq for the scenarios (a) Small, (b) Medium, and (c) Large. Ratios larger than 1 imply that ALG_seq was faster than ALG_inf.]

[Table 4: The mean, maximum, and coefficient of variation of the execution times of the tailored implementation of ALG_seq and the tailored implementation of ALG_inf within DEMKit for the scenario Small.]
[Table 5: The mean, maximum, and coefficient of variation of the execution times of the tailored implementation of ALG_seq and the tailored implementation of ALG_inf within DEMKit for the scenario Medium.]

In particular, a smaller battery size seems to imply that ALG_seq is likely to be faster than ALG_inf, whereas ALG_inf is likely to be faster for larger battery sizes. The reason for this is that the execution time of ALG_inf heavily depends on the number of tight nested constraints in an optimal solution. To support this fact, we plot in Figure 3 boxplots of these numbers. Note that when the initial SoC is 20% or 30% of the battery capacity in the scenario Large, in only 4 of the 50 instances the number of tight constraints was more than 1, meaning that in the remaining 46 instances the optimal solution to the relaxation of the problem did not violate any of the nested constraints.

[Table 6: The mean, maximum, and coefficient of variation of the execution times of the tailored implementation of ALG_seq and the tailored implementation of ALG_inf within DEMKit for the scenario Large.]

The relation between the number of tight nested constraints and the ratios is also strongly visible when comparing Figures 2 and 3: the ratios increase as the number of tight constraints increases.
[Figure 3: Boxplots of the number of tight constraints in the optimal solutions for the scenarios (a) Small, (b) Medium, and (c) Large.]

From these results, we can derive a "rule of thumb" for the choice of a proper algorithm given the expected number of tight nested constraints. To this end, we compute for each number of tight constraints the percentage of instances where the tailored implementation of ALG_seq runs faster than the tailored implementation of ALG_inf within DEMKit, given that the optimal solution has this particular number of tight nested constraints (see Table 7). These values suggest that once the number of tight constraints exceeds a small threshold, ALG_seq is always faster.

[Table 7: Percentage of instances where the tailored implementation of ALG_seq is faster than the tailored implementation of ALG_inf within DEMKit, given the number of tight nested constraints (1, 2, 3, 4, 5, 6, or more) in their optimal solutions.]

Note that this rule of thumb is in line with the physical interpretation of tight nested constraints in BATTERY. For this, note that a battery being completely empty or full is equivalent to a nested constraint of BATTERY being tight. When the charging rates of the battery are large, the battery is better able to, at a given moment, flatten large peaks or drops in power consumption. However, the latter is also dependent on whether there is enough space (energy) left in the battery to store (dispatch) this energy, which is more likely when the battery capacity is large. Thus, when adopting a large battery for load profile flattening, it is less likely that it will be completely empty or full.

Although the ratio between the execution times of ALG_seq and ALG_inf appears to depend significantly on the battery size, the maximum and CoV of the execution times of ALG_seq are on average around 1.9 and 3.0 times smaller than those of ALG_inf respectively. This means that the execution times of ALG_seq are on average more stable than those of ALG_inf, regardless of the battery size.
For DEM in general and DEMKit in particular, this is beneficial since the coordination and optimization of schedules for different devices is often done in parallel due to the decentralized nature of the coordination (see, e.g., [20]). As a consequence, the execution time of the entire coordination and optimization framework is constrained by the maximum execution time required for solving one (subset of) device-level optimization problem(s). Thus, using ALG_seq instead of ALG_inf within such a framework may significantly reduce the overall execution time of the framework.

In the following, we present and discuss the results of the scalability evaluation. Figure 4 shows the execution times of the three algorithms ALG_seq, ALG_inf, and ALG_dec, and Table 8 shows for each studied instance size $n$ the mean and CoV of the execution times of the corresponding instances. The added regression lines in Figure 4 are the fitted power laws of the execution times, i.e., for each algorithm we fit the function $\varphi(n) = c_1 \cdot n^{c_2}$ to the execution times. These lines indicate that the practical execution time of both ALG_seq and ALG_dec is close to $O(n)$ and that of ALG_inf is actually slightly less than $O(n)$. Thus, it can be expected that for very large values of $n$, ALG_inf is on average faster than ALG_seq. However, the CoVs for ALG_inf are around one order of magnitude larger than those of both ALG_seq and ALG_dec. This suggests that the execution time of the latter two algorithms is significantly less affected by the choice of problem parameters than that of ALG_inf. This is in line with the results of the comparison of the tailored implementation of ALG_seq for BATTERY with that of ALG_inf within DEMKit.
Figure 4: Execution times of ALG_seq (circles, black), ALG_inf (triangles, gray), and ALG_dec (squares, open).

The results in Figure 4 and Table 8 indicate that on average ALG_seq is 27.2 times faster than ALG_dec and 1.95 times faster than ALG_inf. With regard to the performance of ALG_dec, we acknowledge that ALG_dec, and in particular the updating scheme for the single-variable bounds, can probably be implemented more efficiently than in the current implementation. To reduce the influence of the overall implementation on the results of this study, we measured the total time that ALG_dec spends on solving QRAP subproblems and compared this to the execution times of ALG_seq and ALG_inf. This alternative time represents the time that is minimally required to solve all QRAP subproblems, regardless of the implementation of the scheme used to update the single-variable bounds. These measurements indicate that on average around 59% of the total execution time of ALG_dec is spent on solving QRAP subproblems. However, this time is still on average 15.9 times more than the execution time of ALG_seq and 9.5 times more than that of ALG_inf.

Table 8: Mean and coefficient of variation of the execution times of ALG_seq, ALG_inf, and ALG_dec for each instance size $n$.
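The coefficient of variation (CoV) reported in Table 8 is the sample standard deviation divided by the mean. A minimal sketch, with illustrative timings rather than the measured data:

```python
import statistics

def coefficient_of_variation(samples):
    """CoV = sample standard deviation / mean; a scale-free measure of
    how spread out the execution times are."""
    return statistics.stdev(samples) / statistics.fmean(samples)

# Illustrative execution times in seconds (not the data of Table 8).
times = [0.010, 0.012, 0.011, 0.013]
cov = coefficient_of_variation(times)
```

Being scale-free, the CoV allows comparing the stability of execution times across algorithms whose absolute running times differ by orders of magnitude.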
We proposed an $O(n \log n)$ time algorithm for quadratic resource allocation problems with lower and upper bound constraints on nested sums of variables. As opposed to existing algorithms with the same time complexity, our algorithm achieves the $O(n \log n)$ time complexity using only basic data structures and is therefore easier to implement. In computational experiments, we demonstrate the good practical performance of our approach, both on synthetic data and on realistic instances from the application area of decentralized energy management (DEM) for smart grids.

Our approach builds upon monotonicity arguments that find their origin in the validity of greedy algorithms for convex optimization problems over polymatroids [18, 19]. Such monotonicity arguments have been studied primarily for resource allocation problems where the objective function is separable, i.e., can be written as the sum of single-variable functions. However, in previous work [38] we prove the validity of similar monotonicity arguments to solve a nonseparable resource allocation problem with so-called generalized bound constraints. Moreover, recent results on the use of interior-point methods for nested resource allocation problems [40, 45] suggest that incorporating specific nonseparable terms in the objective function does not increase the complexity of the used solution method. Thus, one interesting direction for future research is to investigate whether one can use monotonicity arguments to derive efficient algorithms for resource allocation problems over nested constraints with nonseparable objective functions.

With regard to the application within DEM systems, we compared our algorithm with an existing implementation of the state-of-the-art algorithm of [41] within a simulation tool for DEM research. One of our objectives was to decide which of these two algorithms is more suitable for a given (type of) problem instance.
It would be worthwhile to conduct a more thorough comparison and to develop an automated procedure to decide which algorithm is most likely to be faster. Moreover, the nonseparable version of the studied problem mentioned in the previous paragraph is related to energy management of batteries in three-phase distribution networks, where load profile flattening on all three phases together is required to avoid blackouts in these networks [44, 22, 38]. Thus, research in this direction is also relevant in the context of DEM.

A Proofs of Lemmas 1, 4, and 5

A.1 Proof of Lemma 1

Lemma 1. If $L_j \le A \le B \le U_j$, we have $x^j(A) \le x^j(B)$ for a given $j \in \mathcal{N}$.

Proof. For convenience, we include the equality constraint (4) into the nested constraints (5) by replacing these nested constraints by
$$\tilde{L}_k \le \sum_{i \in \mathcal{N}_k} x_i \le \tilde{U}_k, \quad k \in \mathcal{N}_j,$$
where $\tilde{L}_k = L_k$ and $\tilde{U}_k = U_k$ for $k < j$, and $\tilde{L}_j = \tilde{U}_j = C$. The Karush-Kuhn-Tucker (KKT) optimality conditions (see, e.g., [7]) for the subproblem QRAP-NC$^j(C)$ are as follows:
$$\frac{x_i}{a_i} + \sum_{k=i}^{j} (\eta_k^j - \zeta_k^j) + \mu_i^j - \nu_i^j = 0, \quad i \in \mathcal{N}_j, \quad (13a)$$
$$\tilde{L}_k \le \sum_{i \in \mathcal{N}_k} x_i \le \tilde{U}_k, \quad k \in \mathcal{N}_j, \quad (13b)$$
$$\eta_i^j \Big( \tilde{U}_i - \sum_{k \in \mathcal{N}_i} x_k \Big) = 0, \quad i \in \mathcal{N}_j, \quad (13c)$$
$$\zeta_i^j \Big( \sum_{k \in \mathcal{N}_i} x_k - \tilde{L}_i \Big) = 0, \quad i \in \mathcal{N}_j, \quad (13d)$$
$$\mu_i^j (u_i - x_i) = 0, \quad i \in \mathcal{N}_j, \quad (13e)$$
$$\nu_i^j (x_i - l_i) = 0, \quad i \in \mathcal{N}_j, \quad (13f)$$
$$\eta_i^j, \zeta_i^j, \mu_i^j, \nu_i^j \ge 0, \quad i \in \mathcal{N}_j. \quad (13g)$$
Let $(\zeta^j(C), \eta^j(C), \mu^j(C), \nu^j(C))$ denote the Lagrange multipliers corresponding to the optimal solution $x^j(C)$. Thus, $(x^j(C), \zeta^j(C), \eta^j(C), \mu^j(C), \nu^j(C))$ satisfies the KKT conditions (13).

Suppose that there exists an index $s \in \mathcal{N}$ such that $x_s^j(A) > x_s^j(B)$. Let $r$ be the largest index with $r \le s$ such that $\sum_{k \in \mathcal{N}_{r-1}} x_k^j(A) \ge \sum_{k \in \mathcal{N}_{r-1}} x_k^j(B)$, and let $t$ be the smallest index with $t \ge s$ such that $\sum_{k \in \mathcal{N}_t} x_k^j(A) \le \sum_{k \in \mathcal{N}_t} x_k^j(B)$.
By definition of $r$, $s$, and $t$, we have that
$$\sum_{i=r}^{t} x_i^j(B) = \sum_{i \in \mathcal{N}_t} x_i^j(B) - \sum_{i \in \mathcal{N}_{r-1}} x_i^j(B) \ge \sum_{i \in \mathcal{N}_t} x_i^j(A) - \sum_{i \in \mathcal{N}_{r-1}} x_i^j(A) = \sum_{i=r}^{t} x_i^j(A).$$
Moreover, observe that we cannot have $r = s = t$ simultaneously. Indeed, if $r = s = t$, then we have by definition of $r$, $s$, and $t$ that
$$\sum_{k \in \mathcal{N}_s} x_k^j(A) \le \sum_{k \in \mathcal{N}_s} x_k^j(B) \le \sum_{k \in \mathcal{N}_{s-1}} x_k^j(A) + x_s^j(B).$$
This implies $x_s^j(A) \le x_s^j(B)$, which is a contradiction. Thus, either $r < s$ or $s < t$ or both.

We show that we obtain a contradiction if $r < s$. The proof for the case where $s < t$ is symmetrical. If $r < s$, the following holds:

• By definition of $r$ and $s$, we have
$$\sum_{k \in \mathcal{N}_r} x_k^j(A) < \sum_{k \in \mathcal{N}_r} x_k^j(B) = \sum_{k \in \mathcal{N}_{r-1}} x_k^j(B) + x_r^j(B) \le \sum_{k \in \mathcal{N}_{r-1}} x_k^j(A) + x_r^j(B).$$
Thus, $x_r^j(A) < x_r^j(B)$.

• For each $k$ such that $r \le k \le s-1$, we have by definition of $r$ and $s$ and KKT condition (13b) that
$$\tilde{L}_k \le \sum_{i \in \mathcal{N}_k} x_i^j(A) < \sum_{i \in \mathcal{N}_k} x_i^j(B) \le \tilde{U}_k.$$
It follows from KKT conditions (13c), (13d), and (13g) that $\zeta_k^j(B) = \eta_k^j(A) = 0$. Thus, for each $r \le k \le s-1$, we have
$$\sum_{i=k}^{j} (\eta_i^j(A) - \zeta_i^j(A)) - \sum_{i=k+1}^{j} (\eta_i^j(A) - \zeta_i^j(A)) = \eta_k^j(A) - \zeta_k^j(A) \le 0$$
and
$$\sum_{i=k}^{j} (\eta_i^j(B) - \zeta_i^j(B)) - \sum_{i=k+1}^{j} (\eta_i^j(B) - \zeta_i^j(B)) = \eta_k^j(B) - \zeta_k^j(B) \ge 0.$$
In particular, this implies that
$$\sum_{i=r}^{j} (\eta_i^j(A) - \zeta_i^j(A)) \le \sum_{i=s}^{j} (\eta_i^j(A) - \zeta_i^j(A)) \quad (14)$$
and
$$\sum_{i=r}^{j} (\eta_i^j(B) - \zeta_i^j(B)) \ge \sum_{i=s}^{j} (\eta_i^j(B) - \zeta_i^j(B)). \quad (15)$$

• We have $l_r \le x_r^j(A) < x_r^j(B) \le u_r$. It follows from KKT conditions (13e)-(13g) that
$$\nu_r^j(B) = \mu_r^j(A) = 0. \quad (16)$$
Similarly, since $l_s \le x_s^j(B) < x_s^j(A) \le u_s$, we have by KKT conditions (13e)-(13g) that
$$\nu_s^j(A) = \mu_s^j(B) = 0. \quad (17)$$
We can now derive a contradiction as follows:
$$\sum_{i=s}^{j} (\eta_i^j(A) - \zeta_i^j(A)) = -\frac{x_s^j(A)}{a_s} - \mu_s^j(A) + \nu_s^j(A) \quad (18a)$$
$$< -\frac{x_s^j(B)}{a_s} - \mu_s^j(B) + \nu_s^j(B) \quad (18b)$$
$$= \sum_{i=s}^{j} (\eta_i^j(B) - \zeta_i^j(B)) \quad (18c)$$
$$\le \sum_{i=r}^{j} (\eta_i^j(B) - \zeta_i^j(B)) \quad (18d)$$
$$= -\frac{x_r^j(B)}{a_r} - \mu_r^j(B) + \nu_r^j(B) \quad (18e)$$
$$< -\frac{x_r^j(A)}{a_r} - \mu_r^j(A) + \nu_r^j(A) \quad (18f)$$
$$= \sum_{i=r}^{j} (\eta_i^j(A) - \zeta_i^j(A)) \quad (18g)$$
$$\le \sum_{i=s}^{j} (\eta_i^j(A) - \zeta_i^j(A)). \quad (18h)$$
Here,

• (18a), (18c), (18e), and (18g) follow from KKT condition (13a);

• (18b) follows from Equation (17) and the facts that $x_s^j(A) > x_s^j(B)$ and $a_s > 0$;

• (18d) follows from Equation (15);

• (18f) follows from Equation (16) and the facts that $x_r^j(A) < x_r^j(B)$ and $a_r > 0$;

• (18h) follows from Equation (14).

This chain is a strict inequality of a quantity with itself, which is a contradiction. It follows that $x_s^j(A) \le x_s^j(B)$.

A.2 Proof of Lemma 4

Lemma 4. We have $\chi^n = \kappa^n = \lambda^n$. Moreover, for each $j \in \mathcal{N}_{n-1}$, we have:

1. $\chi^{j+1} \le \kappa^j$ implies $\sum_{i \in \mathcal{N}_j} x_i^n(R) = L_j$ and $\chi^j = \kappa^j$;

2. $\lambda^j \le \chi^{j+1}$ implies $\sum_{i \in \mathcal{N}_j} x_i^n(R) = U_j$ and $\chi^j = \lambda^j$;

3. $\kappa^j < \chi^{j+1} < \lambda^j$ implies $L_j < \sum_{i \in \mathcal{N}_j} x_i^n(R) < U_j$ and $\chi^j = \chi^{j+1}$.

Proof. We have $\chi^n = \kappa^n = \lambda^n$ since we defined $L_n = U_n = R$ and, by definition of the solution $x^n(R)$, the nested constraints $L_n \le \sum_{i \in \mathcal{N}} x_i^n(L_n)$ and $\sum_{i \in \mathcal{N}} x_i^n(U_n) \le U_n$ are tight. We prove the lemma by considering each of its three cases separately for each $j < n$:

1. We prove this part of the lemma for the case that $j$ is the largest index smaller than $\ell^{j+1}$ such that $\chi^{j+1} \le \kappa^j$, i.e., $\chi^{k+1} > \kappa^k$ for all $k \in \{j+1, \ldots, \ell^{j+1}-1\}$. Using this result, we show as follows that the other case, i.e., both the situation where $j = \ell^{j+1}$ and the situation where there exists an index $k > j$ that is the largest index in the set $\{j+1, \ldots, \ell^{j+1}-1\}$ such that $\chi^{k+1} \le \kappa^k$, leads to a contradiction.
In the former situation, it follows that $j + 1 > j = \ell^{j+1} \ge j+1$, which is a contradiction. In the latter situation, the lemma applies for $k$, meaning that $\sum_{i \in \mathcal{N}_k} x_i^n(R) = L_k$ and thus $\ell^k = k$. However, we also have by definition of $\ell^{j+1}$ that $\ell^k = \ell^{j+1}$ since $j+1 \le k < \ell^{j+1}$. This implies $k = \ell^{j+1}$, which is a contradiction.

If $\chi^{j+1} \le \kappa^j$, it follows from the lower breakpoint relations in Equation (9) that we have either $\alpha_i^{j+1} \ge \kappa^j \ge \chi^{j+1}$ (if $\kappa^j < \beta_i^j$) or $\alpha_i^{j+1} = \beta_i^j \le \kappa^j \le \lambda^j$ (if $\beta_i^j \le \kappa^j$) for all $i \le j+1$. We show that in both cases it holds that $x_i^{\ell^{j+1}}(V_{\ell^{j+1}}) = x_i^j(L_j)$:

• In the former case, note that $\alpha_i^k \le \alpha_i^{k+1}$ for all $k < n$ by Equation (9) and that $\chi^{j+1} = \chi^k$ for all $k \in \{j+1, \ldots, \ell^{j+1}\}$ by definition of $\chi^{j+1}$. Since $\chi^{k+1} > \kappa^k$ for all $k \in \{j+1, \ldots, \ell^{j+1}-1\}$, we have that $\alpha_i^k \ge \alpha_i^{j+1} \ge \chi^{j+1} = \chi^{k+1} > \kappa^k$ for all $k \in \{j+1, \ldots, \ell^{j+1}-1\}$. Thus, $x_i^k(L_k) = \bar{l}_i^k = x_i^{k-1}(L_{k-1})$ for all $k \in \{j+1, \ldots, \ell^{j+1}-1\}$, which implies that $x_i^j(L_j) = x_i^{\ell^{j+1}-1}(L_{\ell^{j+1}-1})$. Moreover, note that since $\alpha_i^{\ell^{j+1}} \ge \alpha_i^{j+1} \ge \chi^{j+1} = \chi^{\ell^{j+1}}$, we have that $x_i^{\ell^{j+1}}(V_{\ell^{j+1}}) = x_i^{\ell^{j+1}-1}(L_{\ell^{j+1}-1})$. It follows that $x_i^{\ell^{j+1}}(V_{\ell^{j+1}}) = x_i^j(L_j)$.

• The latter case implies that $x_i^j(L_j) = x_i^j(U_j) = \bar{u}_i^j$.
It follows by Lemmas 1 and 2 that $x_i^{\ell^{j+1}}(V_{\ell^{j+1}}) \le x_i^{\ell^{j+1}}(U_{\ell^{j+1}}) \le x_i^j(U_j) = x_i^j(L_j) \le x_i^{\ell^{j+1}}(L_{\ell^{j+1}}) \le x_i^{\ell^{j+1}}(V_{\ell^{j+1}})$.

On the one hand, if $V_{\ell^{j+1}} = L_{\ell^{j+1}}$, we have
$$L_j = \sum_{i \in \mathcal{N}_j} x_i^j(L_j) = \sum_{i \in \mathcal{N}_j} x_i^{\ell^{j+1}}(L_{\ell^{j+1}}) = L_{\ell^{j+1}} - \sum_{i=j+1}^{\ell^{j+1}} x_i^{\ell^{j+1}}(L_{\ell^{j+1}}) \ge \sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^n(R) - \sum_{i=j+1}^{\ell^{j+1}} x_i^n(R) = \sum_{i \in \mathcal{N}_j} x_i^n(R) \ge L_j,$$
where the inequality follows since $\sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^n(R) = L_{\ell^{j+1}}$ and by Lemma 2. On the other hand, if $V_{\ell^{j+1}} = U_{\ell^{j+1}}$, we have by Lemma 2 that
$$L_j = \sum_{i \in \mathcal{N}_j} x_i^j(L_j) = \sum_{i \in \mathcal{N}_j} x_i^{\ell^{j+1}}(U_{\ell^{j+1}}) \ge \sum_{i \in \mathcal{N}_j} x_i^n(U_n) \ge L_j.$$
In both cases, it follows that $\sum_{i \in \mathcal{N}_j} x_i^n(R) = L_j$, from which it follows directly that $\chi^j = \kappa^j$.

2. The proof for the case $\lambda^j \le \chi^{j+1}$ is analogous to the proof for the case $\chi^{j+1} \le \kappa^j$.

3. Suppose that $x_i^j(L_j) = x_i^n(L_n)$ holds for all $i < j+1$. By Lemma 2, this implies that $x_i^k(L_k) = x_i^j(L_j) = x_i^n(L_n)$ for all $k \in \{j, \ldots, n\}$ and $i < j+1$. In particular, we have that $x_i^k(L_k) = \bar{l}_i^k$ for all $k \in \{j+1, \ldots, n\}$, which implies that $\kappa^k \le \alpha_i^k$. Furthermore, note that for any $k' \in \mathcal{N}$ there is at least one index $i_{k'} \le k'$ such that $\alpha_{i_{k'}}^{k'} \le \kappa^{k'} < \beta_{i_{k'}}^{k'}$. Otherwise, there exists $\epsilon > 0$ such that $\kappa^{k'} + \epsilon$ is also an optimal Lagrange multiplier. It follows from the relation between $\alpha_{i_{k'}}^{k'}$ and $\alpha_{i_{k'}}^{k'+1}$ in Equation (9) that $\alpha_{i_{k'}}^{k'+1} = \kappa^{k'}$ for any $k' < n$. This implies in particular that $\alpha_{i_k}^{k+1} = \kappa^k \le \alpha_{i_{k-1}}^{k}$ for all $k \in \{j+1, \ldots, n\}$.
It follows that $\kappa^{\ell^{j+1}} \le \alpha_{i_j}^{j+1} = \kappa^j$ and thus that $\kappa^{\ell^{j+1}} < \chi^{j+1} = \chi^{\ell^{j+1}}$. Since $\chi^{\ell^{j+1}} \in \{\kappa^{\ell^{j+1}}, \lambda^{\ell^{j+1}}\}$, we have $\chi^{\ell^{j+1}} = \lambda^{\ell^{j+1}}$, from which it follows that $\sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^n(R) = U_{\ell^{j+1}}$. However, this implies that
$$\sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(L_{\ell^{j+1}}) = \sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(R) = U_{\ell^{j+1}} \ge \sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(U_{\ell^{j+1}}) \ge \sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(L_{\ell^{j+1}}).$$
This implies that $\sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(L_{\ell^{j+1}}) = \sum_{i \in \mathcal{N}_{\ell^{j+1}}} x_i^{\ell^{j+1}}(U_{\ell^{j+1}})$, from which it follows that $L_{\ell^{j+1}} = U_{\ell^{j+1}}$ by the monotonicity of $x^{\ell^{j+1}}(\cdot)$ as proven in Lemma 1. However, this is a contradiction with the assumption that $L_k < U_k$ for all $k < n$. Hence, there must be at least one index $i'$ such that $x_{i'}^j(L_j) < x_{i'}^n(R)$. It follows that $L_j = \sum_{i \in \mathcal{N}_j} x_i^j(L_j) < \sum_{i \in \mathcal{N}_j} x_i^n(R)$.

To prove that $\sum_{i \in \mathcal{N}_j} x_i^n(R) < U_j$, we can use a similar argument wherein we show that the proposition $x_i^j(U_j) = x_i^n(R)$ cannot be true for all $i < n$. Together, this implies that $L_j < \sum_{i \in \mathcal{N}_j} x_i^n(R) < U_j$, from which it follows directly that $\chi^j = \chi^{\ell^{j+1}} = \chi^{j+1}$.

A.3 Proof of Lemma 5

Lemma 5. For each $i \in \mathcal{N}$, we have
$$x_i^n(R) = \begin{cases} l_i & \text{if } \chi^i < \alpha_i^i, \\ a_i \chi^i & \text{if } \alpha_i^i \le \chi^i < \beta_i^i, \\ u_i & \text{if } \beta_i^i \le \chi^i. \end{cases}$$

Proof. Let $\mathcal{J}$ denote the set of indices whose corresponding nested lower or upper constraint is tight in $x^n(R)$. More precisely, $\mathcal{J} := \{k_j \mid j \in \mathcal{N}\} \equiv \{j_1, \ldots, j_q\}$, where $q := |\mathcal{J}|$ and $j_1 < \cdots < j_q$. For a given $p \in \{1, \ldots, q\}$,
note that since either the lower or the upper nested constraint corresponding to $j_p$ is tight in the solution $x^n(R)$, we have that $\sum_{i \in \mathcal{N}_{j_p}} x_i^n(R) = V_{j_p}$. This implies that the vector $(x_i^n(R))_{1 \le i \le j_p}$ is the optimal solution to the subproblem QRAP-NC$^{j_p}(V_{j_p})$, i.e., to the problem
$$\text{QRAP-NC}^{j_p}(V_{j_p}): \min_{x \in \mathbb{R}^{j_p}} \sum_{i \in \mathcal{N}_{j_p}} \frac{x_i^2}{2a_i} \quad \text{s.t.} \quad \sum_{i \in \mathcal{N}_{j_p}} x_i = V_{j_p},$$
$$L_k \le \sum_{i \in \mathcal{N}_k} x_i \le U_k, \quad k \in \{1, \ldots, j_p - 1\}, \quad (19)$$
$$l_i \le x_i \le u_i, \quad i \in \{1, \ldots, j_p\}.$$
Note that in the optimal solution $(x_i^n(R))_{i \in \mathcal{N}_{j_p}}$ to this problem, none of the nested constraints (19) for $k$ with $j_{p-1} < k < j_p$ are tight. As a consequence, when deriving the reformulated equivalent problem QRAP$^{j_p}(V_{j_p})$, it follows from Lemmas 2 and 3 that we may replace the single-variable bounds (7) for $i$ with $j_{p-1} < i < j_p$ by the original variable bounds $l_i \le x_i \le u_i$. Thus, we can reformulate QRAP-NC$^{j_p}(V_{j_p})$ to
$$\text{QRAP}^{j_p}(V_{j_p}): \min_{x \in \mathbb{R}^{j_p}} \sum_{i \in \mathcal{N}_{j_p}} \frac{x_i^2}{2a_i} \quad \text{s.t.} \quad \sum_{i \in \mathcal{N}_{j_p}} x_i = V_{j_p},$$
$$x_i^{j_{p-1}}(L_{j_{p-1}}) \le x_i \le x_i^{j_{p-1}}(U_{j_{p-1}}), \quad i \in \{1, \ldots, j_{p-1}\},$$
$$l_i \le x_i \le u_i, \quad i \in \{j_{p-1}+1, \ldots, j_p\}.$$
Note that $\chi^{j_p}$ is the optimal Lagrange multiplier for this problem. As a consequence, we can directly compute $x_i^{j_p}(V_{j_p})$ for $i \in \{j_{p-1}+1, \ldots, j_p\}$ using Equation (3):
$$x_i^n(R) = x_i^{j_p}(V_{j_p}) = \begin{cases} l_i & \text{if } \chi^{j_p} < \alpha_i^i, \\ a_i \chi^{j_p} & \text{if } \alpha_i^i \le \chi^{j_p} < \beta_i^i, \\ u_i & \text{if } \beta_i^i \le \chi^{j_p}. \end{cases}$$
The result of the lemma follows since we have $\chi^{j_p} = \chi^i$ for each $i \in \{j_{p-1}+1, \ldots, j_p\}$ by definition of $j_p$ and $\chi^i$.

Acknowledgment

This research has been conducted within the SIMPS project (647.002.003) supported by NWO and Eneco.

References

[1] P. T. Akhil and R. Sundaresan. Algorithms for separable convex optimization with linear ascending constraints. Sadhana, 43(9):146, 2018.

[2] A. Alexandrescu. Fast deterministic selection. In C. S. Iliopoulos, S. P. Pissis, S. J.
Puglisi, and R. Raman, editors, Leibniz International Proceedings in Informatics (LIPIcs), volume 75, pages 24:1-24:9. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017.

[3] A. Arvind and C. Rangan. Symmetric min-max heap: A simpler data structure for double-ended priority queue. Inf. Process. Lett., 69(4):197-199, 1999.

[4] M. D. Atkinson, J.-R. Sack, N. Santoro, and T. Strothotte. Min-max heaps and generalized priority queues. Commun. ACM, 29(10):996-1000, 1986.

[5] M. Beaudin and H. Zareipour. Home energy management systems: A review of modelling and complexity. Renew. Sustain. Energy Rev., 45:318-335, 2015.

[6] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time bounds for selection. J. Comput. Syst. Sci., 7(4):448-461, 1973.

[7] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, Cambridge, MA, 7 edition, 2004.

[8] P. Brass. Advanced data structures. Cambridge University Press, Cambridge, MA, 1 edition, 2008.

[9] G. S. Brodal. A survey on priority queues. In A. Brodnik, A. Lopez-Ortiz, V. Raman, and A. Viola, editors, Space-Efficient Data Structures, Streams, and Algorithms, pages 150-163. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.

[10] P. Brucker. An O(n) algorithm for quadratic knapsack problems. Oper. Res. Lett., 3(3):163-166, 1984.

[11] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. The MIT Press, Cambridge, MA, 3 edition, 2009.

[12] B. P. Esther and K. S. Kumar. A survey on residential demand side management architecture, approaches, optimization models and methods. Renew. Sustain. Energy Rev., 59:342-351, 2016.

[13] H. N. Gabow and R. E. Tarjan. A linear-time algorithm for a special case of disjoint set union. In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing (STOC '83), pages 246-251, Boston, MA, 1983. ACM.

[14] H. N. Gabow and R. E. Tarjan.
A linear-time algorithm for a special case of disjoint set union. J. Comput. Syst. Sci., 30(2):209-221, 1985.

[15] Z. Galil and G. F. Italiano. Data structures and algorithms for disjoint set union problems. ACM Comput. Surv., 23(3):319-344, 1991.

[16] M. E. T. Gerards. Algorithmic power management: Energy minimisation under real-time constraints. PhD thesis, University of Twente, 2014.

[17] M. E. T. Gerards, H. A. Toersche, G. Hoogsteen, T. van der Klauw, J. L. Hurink, and G. J. M. Smit. Demand side management using profile steering. In , Eindhoven, 2015. IEEE.

[18] D. S. Hochbaum. Lower and upper bounds for the allocation problem and other nonlinear optimization problems. Math. Oper. Res., 19(2):390-409, 1994.

[19] D. S. Hochbaum and S.-P. Hong. About strongly polynomial time algorithms for quadratic optimization over submodular constraints. Math. Program., 69:269-309, 1995.

[20] G. Hoogsteen, M. E. T. Gerards, and J. L. Hurink. On the scalability of decentralized energy management using profile steering. In , Sarajevo, 2018.

[21] G. Hoogsteen, J. L. Hurink, and G. J. M. Smit. DEMKit: A decentralized energy management simulation and demonstration toolkit. In , Bucharest, 2019.

[22] G. Hoogsteen, A. Molderink, J. L. Hurink, G. J. Smit, B. Kootstra, and F. Schuring. Charging electric vehicles, baking pizzas, and melting a fuse in Lochem. CIRED - Open Access Proc. J., 2017(1):1629-1633, 2017.

[23] W. Huang and Y. Wang. An optimal speed control scheme supported by media servers for low-power multimedia applications. Multimed. Syst., 15(2):113-124, 2009.

[24] L. M. Hvattum, I. Norstad, K. Fagerholt, and G. Laporte. Analysis of an exact algorithm for the vessel speed optimization problem. Netw., 62(2):132-135, 2013.

[25] T. Ibaraki and N. Katoh. Resource allocation problems: Algorithmic approaches. The MIT Press, Cambridge, MA, 1 edition, 1988.

[26] K. C. Kiwiel. On Floyd and Rivest's SELECT algorithm. Theor. Comput. Sci., 347(1):214-238, 2005.

[27] K. C. Kiwiel.
Breakpoint searching algorithms for the continuous quadratic knapsack problem. Math. Program., 112(2):473-491, 2007.

[28] D. E. Knuth. The art of computer programming - Volume 3: Sorting and searching. Addison-Wesley, Reading, MA, 2 edition, 1998.

[29] H. Lund, P. A. Ostergaard, D. Connolly, I. Ridjan, B. V. Mathiesen, F. Hvelplund, J. Z. Thellufsen, and P. Sorknaes. Energy storage and smart energy systems. Int. J. Sustain. Energy Plan. Manag., 11:3-14, 2016.

[30] M. Muller-Hannemann and S. Schirra, editors. Algorithm engineering: Bridging the gap between algorithm theory and practice. Springer Berlin Heidelberg, Berlin, Heidelberg, 1 edition, 2010.

[31] I. Norstad, K. Fagerholt, and G. Laporte. Tramp ship routing and scheduling with speed optimization. Transp. Res. Part C Emerg. Technol., 19(5):853-865, 2011.

[32] M. Patriksson. A survey on the continuous nonlinear resource allocation problem. Eur. J. Oper. Res., 185(1):1-46, 2008.

[33] M. Patriksson and C. Stromberg. Algorithms for the continuous nonlinear resource allocation problem - new implementations and numerical studies. Eur. J. Oper. Res., 243(3):703-722, 2015.

[34] M. M. A. Patwary, J. Blair, and F. Manne. Experiments on union-find algorithms for the disjoint-set data structure. In P. Festa, editor, International Symposium on Experimental Algorithms (SEA), pages 411-423, Naples, 2010. Springer Berlin Heidelberg.

[35] V. M. J. J. Reijnders, M. E. T. Gerards, J. L. Hurink, and G. J. M. Smit. Testing grid-based electricity prices and batteries in a field test. In CIRED 2018 Workshop Proceedings, Ljubljana, 2018.

[36] B. P. Roberts and C. Sandberg. The role of energy storage in development of smart grids. Proc. IEEE, 99(6):1139-1144, 2011.

[37] M. H. H. Schoot Uiterkamp, M. E. T. Gerards, and J. L. Hurink. On a reduction for a class of resource allocation problems, 2020. arXiv: https://arxiv.org/abs/2008.11829.

[38] M. H. H. Schoot Uiterkamp, M. E. T. Gerards, and J. L. Hurink.
Quadratic nonseparable resource allocation problems with generalized bound constraints, 2020. arXiv: https://arxiv.org/abs/2007.06280.

[39] P. Siano. Demand response and smart grids - a survey. Renew. Sustain. Energy Rev., 30:461-478, 2014.

[40] J. Slager. Nonlinear convex optimisation problems in the smart grid. B.Sc. thesis, University of Twente, 2019.

[41] T. van der Klauw, M. E. T. Gerards, and J. L. Hurink. Resource allocation problems in decentralized energy management. OR Spectr., 39(3):749-773, 2017.

[42] T. Vidal, D. Gribel, and P. Jaillet. Separable convex optimization with nested lower and upper constraints. INFORMS J. Optim., 1(1):71-90, 2019.

[43] T. Vidal, P. Jaillet, and N. Maculan. A decomposition algorithm for nested resource allocation problems. SIAM J. Optim., 26(2):1322-1340, 2016.

[44] S. Weckx and J. Driesen. Load balancing with EV chargers and PV inverters in unbalanced distribution grids. IEEE Trans. Sustain. Energy, 6(2):635-643, 2015.

[45] S. E. Wright and S. Lim. Solving nested-constraint resource allocation problems with an interior point method. Oper. Res. Lett., 48(3):297-303, 2020.

[46] K. K. Zame, C. A. Brehm, A. T. Nitica, C. L. Richard, and G. D. Schweitzer III. Smart grid and energy storage: Policy recommendations.