Low-complexity method for hybrid MPC with local guarantees
Damian Frick, Angelos Georghiou, Juan L. Jerez, Alexander Domahidi, Manfred Morari
LLOW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCALGUARANTEES ∗ DAMIAN FRICK † , ANGELOS GEORGHIOU ‡ , JUAN L. JEREZ § , ALEXANDERDOMAHIDI § , AND
MANFRED MORARI † Abstract.
Model predictive control problems for constrained hybrid systems are usually cast asmixed-integer optimization problems (MIP). However, commercial MIP solvers are designed to runon desktop computing platforms and are not suited for embedded applications which are typicallyrestricted by limited computational power and memory. To alleviate these restrictions, we develop anovel low-complexity, iterative method for a class of non-convex, non-smooth optimization problems.This class of problems encompasses hybrid model predictive control problems where the dynamicsare piece-wise affine (PWA). We give conditions such that the proposed algorithm has fixed pointsand show that, under practical assumptions, our method is guaranteed to converge locally to localminima. This is in contrast to other low-complexity methods in the literature, such as the non-convexalternating directions method of multipliers (ADMM), for which no such guarantees are known forthis class of problems. By interpreting the PWA dynamics as a union of polyhedra we can exploit theproblem structure and develop an algorithm based on operator splitting procedures. Our algorithmdeparts from the traditional MIP formulation, and leads to a simple, embeddable method that onlyrequires matrix-vector multiplications and small-scale projections onto polyhedra. We illustrate theefficacy of the method on two numerical examples, achieving good closed-loop performance withcomputational times several orders of magnitude smaller compared to state-of-the-art MIP solvers.Moreover, it is competitive with ADMM in terms of suboptimality and computation time, butadditionally provides local optimality and local convergence guarantees.
1. Introduction.
Many practical applications for control fall in the domain ofhybrid systems, including applications such as active suspension control or energymanagement in automotives [2, 9] and applications in power electronics [13]. Theseapplications require fast sampling times in the sub-second range, and the controlalgorithms need to be implemented on industrial, resource-constrained platforms, suchas microcontrollers or re-configurable hardware. Hybrid systems are characterizedby complex interactions between discrete and continuous behaviors that make themextremely challenging to control. In the last decades, model predictive control (MPC)[28] has received widespread attention both in research and industry. It providesa systematic approach for controlling constrained hybrid systems, promising highcontrol performance with minimal tuning effort.For hybrid systems, mature modeling tools such as the mixed logical dynamicalframework, see [5], are available. It defines rules for posing hybrid MPC as mixed-integer optimization problems (MIP). These MIPs can then be tackled with powerfulcommercial solvers such as CPLEX [19]. These solvers have focused on the solution ofgeneric, large-scale, mixed-integer linear/quadratic optimization problems on powerfulcomputing platforms, and thus have several drawbacks for embedded applications:(i) their code size is in the order of tens of megabytes; (ii) the algorithms requiresubstantial working memory; (iii) they depend on numerical libraries that cannot beported to most embedded platforms. This has restricted the applicability of hybridMPC to systems that can run on desktop computing platforms, with sampling times ∗ This manuscript is the preprint of a paper submitted to the SIAM Journal on Control and Op-timization. If accepted, the copy of record will be available at http://epubs.siam.org/journal/sjcodc † Automatic Control Laboratory, ETH Zurich, Physikstrasse 3, 8092 Z¨urich, Switzerland([email protected], [email protected]). ‡ Desautels Faculty of Management, McGill University, Montreal, Quebec, Canada ([email protected]). § embotech AG, Technoparkstrasse 1, 8005 Z¨urich, Switzerland ([email protected], [email protected]). 1 a r X i v : . [ m a t h . O C ] A ug D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI in the order of minutes or even hours.Recently there have been efforts to enable hybrid MPC applications with faster dy-namics to run on platforms with more limited computational power. Explicit MPC [1]is the method of choice for very small problems but quickly becomes intractable forincreasing problem dimension. Therefore, methods that further exploit (i) the spar-sity-inducing multistage structure of MPC problems, and (ii) the fact that MPCproblems are parametric and solved in a receding horizon fashion, have been devel-oped. Methods based on reformulation and relaxation of MIPs arising from optimalcontrol problems have been proposed in [3, 12, 20, 21], and have led to reduced com-putation times compared to general-purpose solvers. However, these methods are notfast enough for many practical applications.Methods based on operator splitting allow to trade off computational complexitywith optimality, i.e., they typically produce “good” solutions in a fraction of the timerequired even by tailored MIP solvers to find global optima. Furthermore, they allowthe problem to be split into several small subproblems, making these methods espe-cially suited for embedded applications and implementations on computing systemswith limited resources. The alternating directions method of multipliers (ADMM) [7]is an example of such operator splitting methods. In [33], ADMM was used on vari-ous hybrid MPC examples achieving substantial gains in computational performancecompared to general-purpose MIP solvers. Convergence guarantees are available forsome special cases of non-convex ADMM [24, 27], however, neither optimality or con-vergence guarantees are available for the class of problems considered in this work.Other low-complexity methods based on operator splitting suited for non-convex MPCare [17,18]. They are attractive because they offer computational performance similarto ADMM, but unlike ADMM they still retain some guarantees on optimality andconvergence. Nonetheless, they do not address the class of problems considered here.
In this paper, we develop an algorithm based on operatorsplitting to address hybrid MPC problems, that provides both the desired theoreticalguarantees and computational speed. Our method exploits the multistage structureof the optimization problem, allowing to decompose the complex piece-wise affine(PWA) equality constraints into decoupled non-convex polyhedral constraints. Byleveraging their polyhedral nature, we are able to derive guarantees on optimalityand convergence. The main contributions of this paper are the following: • We develop an iterative, numerical method to find local minima of a class of non-convex, non-smooth optimization problems that arise in MPC problems for systemswith PWA dynamics. Our method is of low complexity and only requires matrix-vector multiplications and small-scale Euclidean projections onto convex polyhedra. • We provide (i) conditions for the existence of fixed points of the proposed algorithm;(ii) show that the method converges locally to such fixed points; and (iii) prove thatfixed points correspond to local minima. To the author’s knowledge, for the class ofproblems considered, this is the first method of such simplicity that allows to givelocal optimality and convergence guarantees. • The proposed algorithm is implemented using automatic code generation, directlytargeting the embedded applications. This code generation tool is made publicly avail-able via github [11]. We examine the method’s performance on two numerical exampleand demonstrate computational speed-ups of several orders of magnitude, comparedto conventional MIP solvers, with only a minor loss of optimality. We further show,that the method is competitive with other local methods such as ADMM.
OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES In Section 2, we state the problem. In Section 3, we construct anoperator whose fixed points correspond to local minima. In Section 4, we propose analgorithm based on this operator and discuss local optimality and local convergenceof the method. In Section 5, we present numerical results. Proofs of auxiliary resultsare given in the appendix.
For vectors x, y ∈ R n , (cid:107) x (cid:107) := √ x (cid:62) x is the 2-norm and (cid:104) x, y (cid:105) := x (cid:62) y the scalar product. For a closed set C , P C ( x ) :=arg min z ∈C (cid:107) x − z (cid:107) is the Euclidean projection of x onto C . The closed (cid:15) -ball around x is B (cid:15) ( x ) := { z | (cid:107) x − z (cid:107) ≤ (cid:15) } , we use the shorthand B (cid:15) := B (cid:15) (0). The relative interior is relint C . By C + { x } we denote the Minkowski sum of C and the singleton x . The characteristic function of set C is χ C ( x ) := { x ∈ C ; + ∞ otherwise } . An operator T : X ⇒ Y maps every point x ∈ X to a set T ( x ) ⊆ Y . T is called single-valued iffor all x ∈ X : T ( x ) is a singleton. A zero of T is a point x ∈ X such that 0 ∈ T ( x ),denoted by x ∈ zero T , where zero T := { x | ∈ T ( x ) } . A fixed point of T is a point x ∈ X such that x ∈ T ( x ), denoted by x ∈ fix T , where fix T := { x | x ∈ T ( x ) } .Finally, when we say that a set has “measure zero” we more precisely mean that ithas Lebesgue measure zero.
2. Problem statement.
In this work, we consider optimization problems thatare parametric in θ ∈ R p and given as follows:(1a) p (cid:63) ( θ ) := min z ∈E∩Z ( θ ) 12 z (cid:62) Hz + h (cid:62) z , with H ∈ R n × n positive definite, and z := ( z , . . . , z N ) ∈ R n , with z k ∈ R n k . Theparametric nature implies that only parts of the problem data will change every timethe problem needs to be solved. The non-convex set Z ( θ ) ⊆ R n has the form:(1b) Z ( θ ) := N × k =1 (cid:16) m k (cid:91) i =1 Z k,i ( θ ) (cid:17) , where the sets Z k,i ( θ ) ⊆ R n k are closed convex polyhedra that may depend linearlyon θ , in the following way: Z k,i ( θ ) := { z k ∈ R n k | G k,i z k = g k,i ( θ ) , F k,i z k ≤ f k,i ( θ ) } ,with matrices and vectors of appropriate dimensions, and g k,i ( θ ), f k,i ( θ ) affine func-tions of the parameter θ . For a given parameter θ , the Euclidean projection onto Z ( θ ) can be evaluated very efficiently by decoupling it into simple convex projections,see Algorithm 2 in Section 4. This important feature of Z will be a key ingredientof the algorithm presented in Section 4. For simplicity of notation we will omit theparameter dependence of Z ( θ ) in the remainder of this paper. Finally, the set E is anaffine equality constrained set(1c) E := { z ∈ R n | Az = b } = { V v + ¯ v | v ∈ R n − m } , with A ∈ R m × n full row rank, and an appropriate matrix V ∈ R n × n − m and vector¯ v ∈ R n [29, p. 430, (15.14)], where V has full column rank by construction. Wefurthermore assume that E ∩ Z is non-empty, hence the problem is feasible.
Remark
Mathematical Programs with Comple-mentarity Constraints (MPCCs) can be brought into the form of Problem (1), aslong as their constraint functions are linear and their objective function is strictlyconvex quadratic. This can be done by constructing a union of sets that enumeratesthe different possibilities of the complementarity constraints. However, in general
D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI it is not possible to represent an MPCC in the form of Problem (1) with
N > Z k,i . This representation is therefore often, and in general,exponential in the number of complementarity constraints. In contrast, with the rep-resentation Problem (1) we explicitly consider this Cartesian structure and in thiswork develop a method that exploits it. Problem (1) can be re-written in the so-called consensusform by introducing copies y of z as follows(2) p (cid:63) := min z,y : z = y z (cid:62) Hz + h (cid:62) z + χ E ( z ) + χ Z ( y ) , which is a non-convex, non-smooth optimization problem. The constraint sets E and Z have been moved into the cost function using their characteristic functions χ E ( z )and χ Z ( y ). This problem formulation ensures that the two sets E and Z can bedecoupled. The consensus constraint z = y encodes the coupling between z and y . We are particularly interested inMPC problems where the discrete-time dynamics are given by a PWA function of thestates and inputs. Such parametric hybrid MPC problems can be written as:min x k ,u k (cid:88) Nk =1 q k +1 ( x k +1 ) + r k ( u k )(3a) s . t . x k +1 = A k,i x k + B k,i u k + c k,i if ( x k , u k ) ∈ C k,i , k = 1 , . . . , N , (3b) x = θ , (3c)for i = 1 , . . . , m k with m k representing the number of regions of the PWA dynamics.Vector x k ∈ R n x denotes the state of the system at time k , u k ∈ R n u denotes the inputsapplied to the system between times k and k + 1, and θ ∈ R n x is the (parametric)initial state of the system. The objective terms q k +1 and r k are strictly convex,quadratic functions. The sets C k,i are non-empty closed convex polyhedra that definethe partition of the PWA dynamics, and include state and input constraints.Problems of the form (3) can be solved by introducing additional continuous andbinary variables, and formulating an MIP using, e.g., the big-M reformulation [5] ormore advanced formulations [34]. The resulting problems can then be solved usingoff-the-shelf commercial MIP solvers. Instead of formulating a MIP, in this work wetransform Problem (3) into the form (1). In this way only N n x auxiliary continuousvariables and no binary variables need to be introduced. For each time instant k weintroduce a copy w k of x k +1 , as follows: w k := A k,i x k + B k,i u k + c k,i , if ( x k , u k ) ∈ C k,i .This allows us to define the closed convex polyhedral sets Z , ˆ ı ( θ ) := (cid:8) ( u , w ) (cid:12)(cid:12) w = A , ˆ ı θ + B , ˆ ı u + c , ˆ ı and ( θ, u ) ∈ C , ˆ ı (cid:9) , Z k,i := (cid:8) ( x k , u k , w k ) (cid:12)(cid:12) w k = A k,i x k + B k,i u k + c k,i and ( x k , u k ) ∈ C k,i (cid:9) , for ˆ ı = 1 , . . . , m and for i = 1 , . . . , m k , k = 2 , . . . , N . The non-convex constraint set Z now has the form of (1b) and is decoupled in the horizon N . For an equivalentformulation, we impose x k +1 = w k by defining a set of coupling equality constraints E := N × k =1 (cid:16) R n u × (cid:8) ( w k , x k +1 ) ∈ R n x (cid:12)(cid:12) x k +1 = w k (cid:9)(cid:17) . Remark α k q k +1 ( x k +1 ) + (1 − α k ) q k +1 ( w k ) + r k ( u k ) for each OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES k and any α k ∈ (0 , w k = x k +1 this ensures that thecost remains unchanged. Without loss of generality, we have used α k = 0 . k . Remark Z ). In this work we deal with thegeneral case of Z being polyhedral non-convex, and we do not make explicit use ofstructure of Z beyond the fact that it can be represented as the Cartesian product overunions of closed convex polyhedra. In Section 4, we give local convergence guaranteesunder assumptions that additionally restrict the set Z . However, even with theserestrictions, Z can still be polyhedral non-convex, which in particular means thatit can have holes and gaps. Nevertheless, depending on the application, additionalstructure of Z may be desirable for the presented method to be useful from a practicalstandpoint. For example, when the set Z consists of many disjoint sets, then thereexist many local minima, some of which may be almost trivial to find. Indeed, wehave found that for MPC problems the method performs well if the piecewise affinedynamics are continuous, which often leads to fewer local minima.
3. KKT conditions for Problem (2).
Our aim is to develop an algorithmto find local minima of Problem (2). For convex optimization, Karush-Kuhn-Tucker(KKT) conditions give rise to necessary and sufficient conditions for global optimality,under mild assumptions. In the non-convex, non-smooth case, however, they are ingeneral neither necessary nor sufficient, even for local optimality. We use a notion ofKKT points for instances of Problem (2) that provide necessary conditions for localoptimality. Computing such KKT points is in general intractable. Therefore, in thissection, we propose two restrictions (inner approximations) that are computationallyattractive. Based on these KKT conditions, we construct an operator K ξ whoseproperties and structure allow us to derive a computationally efficient method andgive local optimality and convergence guarantees in Section 4. We denote KKT pointsby † , local optima by ◦ and global optima by (cid:63) . A point ( z † , y † , λ † ) is called ageneralized KKT point [32, Theorem 3.34, p. 127] of Problem (2) if it satisfies0 ∈ { Hz † + h } + N E ( z † ) − { λ † } , (4a) 0 ∈ N Z ( y † ) + { λ † } , (4b) 0 = z † − y † , (4c)where the regular normal cone ˆ N C ( z ) and the (limiting) normal cone N C ( z ) of a set C are defined as in [31, Proposition 6.5, p. 200]:ˆ N C ( z ) := (cid:8) x ∈ R n (cid:12)(cid:12) lim sup u −→ C z (cid:104) x, u − z (cid:105)(cid:107) u − z (cid:107) (cid:9) , and N C ( z ) := lim sup x −→ C z ˆ N C ( x ) = ∂χ C ( z ) . In particular, the KKT conditions (4) imply that z † = y † ∈ E ∩ Z . Since any KKTpoint ( z † , y † , λ † ) satisfies primal feasibility, z † = y † , we will use the shorthand notation( z † , λ † ) for KKT points in the remainder of this paper. The KKT conditions (4) giverise to necessary conditions for local optimality as detailed below. Lemma
For every local minimum z ◦ of Problem (1) there exist Lagrange multipliers λ ◦ ∈ R n such that ( z ◦ , λ ◦ ) is ageneralized KKT point of Problem (2). These KKT conditions are difficult to exploit due to the dependence on N Z ( · ), whichcannot be evaluated easily [15]. However, we can define two restrictions that admit D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI l o c a l m i n i m a g e n e r a l i z e d K K T r e g u l a r proximal ξ Fig. 1: The sets illustrate the relationship between different KKT points and representpoints z for which there exist multipliers λ such that ( z, λ ) is a generalized, regularor proximal KKT point (dashed), as well as the set of local minima (dotted).practical characterizations, resulting in what we call regular and ξ -proximal KKTpoints. These special cases of KKT points can be evaluated more efficiently and haveimportant regularity properties. Furthermore, in Section 4.2, we will show that theyprovide sufficient (but no longer necessary) conditions for local optimality. We call apoint ( z † , λ † ) a regular KKT point of Problem (2), if it satisfies0 ∈ { Hz † + h } + ˆ N E ( z † ) − { λ † } , and(5a) 0 ∈ ˆ N Z ( z † ) + { λ † } . (5b)Note that we have replaced the normal cone N Z ( z † ) from (4b) with the regular normalcone ˆ N Z ( z † ). Since ˆ N Z ( z † ) ⊆ N Z ( z † ), see [31, Proposition 6.5, p. 200], the set ofregular KKT points is an inner approximation, a restriction, of the set of generalizedKKT points. Given ξ >
0, we call ( z † , λ † ) a ξ - proximal KKT point of Problem (2), ifit satisfies 0 ∈ { Hz † + h } + ˆ N E ( z † ) − { λ † } , and(6a) z † ∈ P Z ( z † − ξ λ † ) . (6b)Compared to the definition of regular KKT points we have replaced the regular normalcone N Z ( z † ) with a condition on the projection, and we have introduced a positivescaling ξ using z † ∈ P Z ( z † − ξ λ † ) ⇒ − ξ λ † ∈ ˆ N Z ( z † ), from [31, Example 6.16, p. 212].This directly implies that every ξ -proximal KKT point ( z † , λ † ) is also a regular KKTpoint and therefore (6) is a restriction of (5).Evaluating whether a pair ( z, λ ) is a regular KKT point has an exponential worst-case complexity, because representing ˆ N Z ( z ) may be combinatorial and can quicklybecome intractable. In contrast, evaluating whether a pair ( z, λ ) is a ξ -proximal KKTpoint amounts to a projection that can be performed in polynomial-time. Further-more, for the primal variables z , this inner approximation (6) of (5) can be made tight,as shown in the following proposition. Figure 1 illustrates the conceptual relationshipbetween the different KKT points. Proposition
Consider Problem (2).i) For any regular KKT point ( z † , λ † ) , there exists a ¯ ξ > such that for all ξ ≥ ¯ ξ , ( z † , λ † ) is a ξ -proximal KKT point.ii) Given any ξ > and ξ -proximal KKT point ( z † , λ † ) , then ( z † , λ † ) is also aregular KKT point. OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES H andthe polyhedrality of E ∩ Z . Therefore, given a ξ > z † , λ † ) there exists a µ † such that ( z † , µ † ) is a ξ -proximal KKT point. The fact that set-membershipevaluation of (6) is efficient, motivates the construction of an operator K ξ that has theproperty that (i) it is cheap to evaluate and (ii) has zeros corresponding to ξ -proximalKKT points of Problem (2). We use these two properties in Section 4 to construct analgorithm based on this operator for finding local minima of Problem (1). We definea set-valued operator K ξ : R n ⇒ R n :(7a) K ξ ( s ) := { M ξ s + c ξ } − P Z ( s ) , with ξ > proximal scaling , and M ξ := ξ ( ξR − I ) − R , c ξ := ( ξR − I ) − ( R ( h + H ¯ v ) − ¯ v ) , (7b) R := V ( V (cid:62) HV ) − V (cid:62) , (7c)where V and ¯ v as in Section 2. We will show that finding an s such that 0 ∈ K ξ ( s ) isequivalent to finding a ξ -proximal KKT point ( z † , λ † ). For K ξ to be defined properly, ξR − I needs to be invertible. Since R (cid:23)
0, this holds for all ξ > (cid:0) λ +min ( R ) (cid:1) − . (8a) This implies M ξ (cid:23) λ +min ( M ξ ) > , (8b)where λ +min ( R ) is the smallest non-zero eigenvalue of R . For Problem (3), it can beshown that ξ > λ max ( H ) is necessary and sufficient for satisfying both (8a) and (8b).Using (7a) we can show that a point s † ∈ zero K ξ , called a zero of K ξ , corresponds toa ξ -proximal KKT point ( z † , λ † ) with z † , λ † as follows:(9) z † := M ξ s † + c ξ and λ † := ξ ( z † − s † ) , where the precise relationship is given in Lemma 3.3. Lemma K ξ ). Given any ξ > . A point s † ∈ R n satisfies s † ∈ zero K ξ if and only if the pair ( z † , λ † ) constructed through (9) is a ξ -proximalKKT point of Problem (2). Multiple points s † ∈ zero K ξ can result in the same z † (but different λ † ), since M ξ is only invertible when E = R n . This means in particular that there can be multiplezeros of K ξ corresponding to the same local optimum.
4. Solution method and theoretical guarantees.
In this section we presentan algorithm constructed to find ξ -proximal KKT points of Problem (2) by find-ing zeros of the operator K ξ . The basis of this algorithm is a fixed-point iteration,where every iteration is computationally inexpensive. In Theorem 4.2, we show thatif the proposed algorithm converges, then it converges to a local minimum of Prob-lem (1). Furthermore, in Theorem 4.6 we show that, under practical assumptions,the presented algorithm converges locally to such minima, almost always. By ‘almostalways’ we mean that, for any local minimum that has corresponding fixed points,there exist at most a measure zero subset of fixed points from which the method maynot converge. D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI
We define an operator T ξ : R n ⇒ R n , that hasthe zeros of K ξ as its fixed points:(10a) T ξ ( s ) := s − W K ξ ( s ) = s − W (cid:0) M ξ s + c ξ − P Z ( s ) (cid:1) , with W ∈ R n × n invertible to ensure fix T ξ = zero K ξ . W is defined as(10b) W := Q (cid:104) Λ − − I (cid:105) Q (cid:62) , where Q ∈ R n × n is orthonormal and Λ ∈ R n − m × n − m diagonal, positive definite, suchthat M ξ = Q diag(Λ , Q (cid:62) is the eigendecomposition of M ξ . In addition, the structureof W will be used to prove local nonexpansiveness of the operator T ξ , see Lemma 4.5in Section 4.4. The local nonexpansiveness of T ξ allows us to use the Krasnoselskijiteration [6, Theorem 3.2, p. 65] to find fixed points of T ξ . Given an initial iterate s ∈ R n , the Krasnoselskij iteration is defined as follows:(10c) s j +1 = (1 − γ ) s j + γT ξ ( s j ) , with step size γ ∈ (0 , Algorithm 1
Low-complexity method for Problem (2)
Require: s ∈ R n , γ ∈ (0 , (cid:15) tol > if z (cid:63) := ¯ v − R ( h + H ¯ v ) ∈ Z then return z (cid:63) (cid:46) check trivial solution repeat for j = 0 , , . . . z j +1 = M ξ s j + c ξ y j +1 ∈ P Z ( s j ) (cid:46) projection, see Alg. 2 s j +1 = s j − γW ( z j +1 − y j +1 ) (cid:46) Krasnoselskij iteration until (cid:107) z j +1 − y j +1 (cid:107) ≤ (cid:15) tol then return y j +1 (cid:46) termination criterionEach iteration of Algorithm 1 is simple and geared towards an efficient embeddedimplementation utilizing automatic code-generation. It involves only matrix-vectormultiplications and a small number of projections. The projection onto the non-convex set Z in step 4 is the most expensive step. It can be evaluated in polynomialtime using Algorithm 2, where at most (cid:80) Nk =1 m k small-scale projections onto poly-hedra are needed (Algorithm 2, step 2). State-of-the-art embedded solvers [10] oreven explicit solutions [16] can be used for these projections. For problems of theform (3), the matrices M ξ and W are block-banded with bandwidth max { n x , n u } ,therefore steps 3 and 5 of Algorithm 1 can be evaluated very efficiently and have acomputational complexity of O ( N ( n x + n u ) ). This enables the automatic generationof efficient, embeddable code that is tailored to particular (parametric) problems. Algorithm 2 P Z ( s ), projection of s onto Z Require: s = ( s , . . . , s N ) ∈ R n , Z of form (1b) for k = 1 , . . . , N do for i = 1 , . . . , m k do z k,i = P Z k,i ( s k ) (cid:46) small-scale projections z k ∈ arg min i =1 ,...,m k (cid:107) s k − z k,i (cid:107) (cid:46) select closest return z = ( z , . . . , z N ) OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES N Z ( z )ˆ N ∗Z ( z ) + { z }Z (cid:15) z ˆ N ∗Z ( z ) + { z }Z ˆ N Z ( z ) (cid:15) z Fig. 2: Two cases of local convexifications ˆ N ∗Z ( z ) + { z } of non-convex sets Z and Z at points z (left) and z (right), as described in Lemma 4.1. First, we will show that when Algorithm 1 converges, itconverges to a local minimum, and equivalently that the fixed points of the operator T ξ correspond to local minima. Any set Z of form (1b) can be transformed into a finiteunion of closed, convex polyhedra Z i ⊆ R n with Z = (cid:83) Ii =1 Z i , where I := (cid:81) Nk =1 m k .This equivalent representation helps to simplify our theoretical considerations. Theexact construction of the sets Z i does not affect the presented results. Furthermore,we define the set of active components of Z at a point z ∈ Z as I Z ( z ) := { i ∈{ , . . . , I } | z ∈ Z i } . To establish the subsequent theoretical results we will usethe polar ˆ N ∗Z of the regular normal cone, defined in [31, Equation 6(14), p. 215]as ˆ N ∗Z ( z ) = { v | (cid:104) v, w (cid:105) ≤ , ∀ w ∈ ˆ N Z ( z ) } . The following lemma establishes twoimportant properties of the set Z and its normal cone. These properties will be usedin Theorem 4.2 for proving local optimality. At any point z ∈ Z , Lemma 4.1 relatesthe set Z to a local convexification ˆ N ∗Z ( z ) + { z } , as illustrated in Figure 2. Lemma
For any z ∈ Z , the set Z and the closedconvex set ˆ N ∗Z ( z ) + { z } are related in the following ways:i) There exists an (cid:15) > such that Z ∩ B (cid:15) ( z ) ⊆ ˆ N ∗Z ( z ) + { z } .ii) Their regular normal cones at z coincide, i.e., ˆ N Z ( z ) = ˆ N ˆ N ∗Z ( z )+ { z } ( z ) . We now use Lemma 4.1 to prove the first key result.
Theorem
If Algorithm 1 converges to a point ( y ◦ , z ◦ , s ◦ ) ,then y ◦ = z ◦ and z ◦ is a local minimum of Problem (1). Moreover, ( z ◦ , ξ ( z ◦ − s ◦ )) isa regular KKT point of Problem (2).Proof. When Algorithm 1 converges to a point ( y ◦ , z ◦ , s ◦ ), we have by definitionthat z ◦ = M ξ s ◦ + c ξ , y ◦ ∈ P Z ( s ◦ ) and s ◦ = s ◦ − γW ( z ◦ − y ◦ ). Together with theinvertibility of W , it follows that z ◦ = y ◦ . Moreover, from the definition of K ξ itfollows that 0 ∈ K ξ ( s ◦ ), i.e., s ◦ ∈ zero K ξ . Lemma 3.3 thus implies that ( z ◦ , λ ◦ ), with λ ◦ := ξ ( z ◦ , s ◦ ) is a ξ -proximal KKT point of Problem (2), which implies that ( z ◦ , λ ◦ )is a regular KKT point of Problem (2), see Section 3.1. Given this, we now considerthe auxiliary problem:(11) min z,y : z = y z (cid:62) Hz + h (cid:62) z + χ E× ( ˆ N ∗Z ( z ◦ )+ { z ◦ } ) ( z, y ) . Problem (11) is strictly convex and its KKT conditions are given as z = y (primalfeasibility) and(12) 0 ∈ { Hz + h } + ˆ N E ( z ) − { λ } and 0 ∈ ˆ N ˆ N ∗Z ( z ◦ )+ { z ◦ } ( y ) + { λ } . D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI
By Lemma 4.1(ii), (12) is equivalent to the regular KKT conditions (5). Therefore,( z ◦ , λ ◦ ) also satisfies (12). By strict convexity, implied by H (cid:31)
0, ( z ◦ , λ ◦ ) is primal-dual optimal for Problem (11). Furthermore, using Lemma 4.1(i), we know that thereexists an (cid:15) > z ◦ ∈ Z ∩ B (cid:15) ( z ◦ ) ⊆ ˆ N ∗Z ( z ◦ ) + { z ◦ } , therefore, the pair y = z = z ◦ is also optimal for the non-convex problemmin z,y : z = y z (cid:62) Hz + h (cid:62) z + χ E× ( Z∩B (cid:15) ( z ◦ )) ( z, y ) , which is a local instance of Problem (2), where the feasible region is restricted to B (cid:15) ( z ◦ ). This implies that z ◦ is locally optimal for Problem (2) and Problem (1). We have seen that for any regular KKT point we can ensurethe existence of a ξ -proximal KKT point by choosing ξ large enough. Therefore, ifthere exists a regular KKT point, we can ensure the existence of fixed points of theoperator T ξ , and therefore of Algorithm 1, as demonstrated in the following theorem. Theorem
If there exists a regular KKT point ( z † , λ † ) of Problem (2), then there exists a ¯ ξ > such that Algorithm 1, and equiva-lently T ξ , has at least one fixed point for any ξ ≥ ¯ ξ .Proof. Given a regular KKT point ( z † , λ † ) of Problem (2). By Proposition 3.2(i),there exists a ¯ ξ > ξ ≥ ¯ ξ we have that ( z † , λ † ) is a ξ -proximal KKTpoint. By Lemma 3.3, s † := z † − ξ λ † ∈ zero K ξ and thus s † is a fixed point of T ξ and( z † , y † , s † ), with y † = z † = M ξ s † + c ξ ∈ P Z ( s † ) is a fixed point of Algorithm 1.In Figure 3 we illustrate four examples for the existence and nature of KKT points.Figures 3a–3c are cases where the proximal scaling ξ > ξ >
0, even though a generalized KKT point exists. Figures 3a and 3b illustratecommon cases. On the other hand Figures 3c and 3d depict more rare cases.
In this section, we prove that thepresented method converges locally to local minima. We say that a fixed point s ◦ corresponds to a local minimum z ◦ , if the ξ -proximal KKT point ( z ◦ , λ ◦ ) can beconstructed from s ◦ through (9). Note that, as shown in Lemma 3.3 and Theorem 4.2,any fixed point of T ξ can be used to reconstruct a local minimum. We denote the setof fixed points corresponding to z ◦ by fix T ξ | z ◦ := { s | ( z ◦ , ξ ( z ◦ − s )) satisfies (6) } .The set fix T ξ | z ◦ is convex for fixed z ◦ , because ˆ N E ( z ◦ ) is a convex set for any z ◦ andbecause the set of λ ’s satisfying z ◦ ∈ P Z ( z ◦ − ξ λ ) form the proximal normal cone of Z at z ◦ which is also convex, see [31, p. 213]. We make the following assumptions: (A1) Problem (2) has at least one regular KKT point. (A2)
Given any local minimum z ◦ of Problem (1) such that fix T ¯ ξ | z ◦ (cid:54) = ∅ and z ◦ (cid:54)∈ fix T ¯ ξ | z ◦ for some ¯ ξ >
0. There exists a ξ ≥ ¯ ξ and a fixed point s ◦ ∈ fix T ξ | z ◦ with multipliers λ ◦ := ξ ( z ◦ − s ◦ ) such that − λ ◦ ∈ relint ˆ N Z ( z ◦ ). (A3) The structure of set Z is such that, for any z ∈ Z , either ˆ N Z ( z ) = { } , orthere exists an (cid:15) > w ∈ ˆ N Z ( z ) ⊥ ∩ B (cid:15) , we have z + w ∈ Z .Assumption (A1) ensures the existence of regular KKT points. It furthermore impliesthe existence of fixed points of T ξ for ξ large enough, according to Theorem 4.3. As-sumption (A2) can be understood as a non-degeneracy condition on the multipliers.Assumption (A3) is an assumption on the geometry of Z and ensures mild local regu- OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES E = R Z Z z λ z λ (a) E = R Z Z z λ z λ (b) EZ Z z λ λ (c) EZ Z zλ λ (d) Fig. 3: KKT points of min z,y ∈E× ( Z ∪Z ) : x = y z (cid:62) z + h (cid:62) z . Level curves of the objectiveand the objective value evaluated on Z , Z and E are illustrated in the respective col-ors, solid where feasible, dotted where infeasible. (a) ( z , λ ) and ( z , λ ) are regularKKT points corresponding to local minima. (b) ( z , λ ), is a regular KKT point cor-responding to a local minimum, ( z , λ ) a non-regular KKT point corresponding to a(not locally optimal) critical point. (c) ( z, λ ), ( z, λ ) are a non-regular and a regularKKT point, corresponding to the same global minimum. (d) ( z, λ ), ( z, λ ) are theonly KKT points corresponding to the unique global minimum, both are non-regular. Z ˆ N Z ( z ) wz Fig. 4: A set Z where (A3) is violated at point z . We see that w is orthogonal toˆ N Z ( z ), but z + (cid:15)w (cid:54)∈ Z for any (cid:15) > E ∩ Z at critical points. In fact, we show in Section 4.5 that (A2)holds for any given Problem (1) with (almost) arbitrary convex quadratic objectiveand in Appendix B we present Algorithm 3 which provides a necessary and sufficientcondition for (A3). Algorithm 3 is a combinatorial algorithm that enumerates theactive sets and components of Z . It may therefore perform badly for anything butsmall dimensions. However, when the sets Z k,i of Z are of low dimension, which isoften the case in structured problems such as hybrid MPC problems, the check can beperformed separately for each k . This significantly reduces the computational burden.Moreover, this check can be performed once and offline for all parameters θ . Notethat (A3) does not hold for all instances of Problem (1). The sets in Figure 2 andFigure 3, as well as the numerical examples in Section 5 satisfy (A3), whereas Figure 4illustrates a simple counter-example in three dimensions. Moreover, when (A3) is vi-olated, Algorithm 1 may still converge. If it converges, the solution is guaranteed tobe a local minimum independently of whether (A3) holds, as stated in Theorem 4.2.In Algorithm 1, we apply the Krasnoselskij iteration to the operator T ξ . Thisiteration is known to converge globally when the operator T ξ is nonexpansive, whichis the case when Z is convex . To show local convergence for the non-convex case2 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI ˆ N ∗Z (0) Z Z H fix T ξ | z = 0Fig. 5: The point z = 0 is locally optimal for the problem min z ∈Z z (cid:62) z + h (cid:62) z , with Z := Z ∪ Z , if and only if h ∈ H := − ˆ N Z (0). It can be verified that for all h ∈ H there exist multipliers λ such that (0 , λ ) is a regular KKT point, i.e., (A1) holds.Furthermore, (A2) holds for all ξ > λ +min ( R ) − and h ∈ relint H (almost all h ∈ H ),as indicated by Theorem 4.7. Also, (A3) holds for Z . Finally, given any h ∈ H :fix T ξ | {− h } . If h (cid:54) = 0 (0 (cid:54)∈ fix T ξ | s ◦ := − h ∈ fix T ξ | (cid:15) > P Z ( s ) is like the projection onto theconvex set ˆ N ∗Z (0) for all s ∈ B (cid:15) ( s ◦ ), i.e., it is single-valued, continuous and firmlynonexpansive on B (cid:15) ( s ◦ ), as indicated by Lemma 4.4.in Theorem 4.6, we need T ξ to be nonexpansive in a neighborhood around fixedpoints. Starting in such a neighborhood, we will converge via the same argument.Theorem 4.6 is proved using two auxiliary lemmas. In Lemma 4.4 we show that,in a neighborhood around almost all fixed points, the projection P Z behaves likea projection onto a convex set. Therefore, it is locally firmly nonexpansive, single-valued and continuous [4, Proposition 4.8, p. 61]. Then, in Lemma 4.5, we use thisproperty, together with the particular construction of the operator T ξ , to show that T ξ is nonexpansive, single-valued and continuous in a neighborhood around almostall fixed points. This is used in Theorem 4.6 to guarantee the local convergence ofAlgorithm 1 to a local minimum , almost always. Lemma
Let Assumptions (A1)–(A3) hold. For any ξ large enough, and anylocal minimum z ◦ to Problem (1) with z ◦ (cid:54)∈ fix T ξ and fix T ξ | z ◦ (cid:54) = ∅ ; we have thatfor all fixed points s ◦ ∈ fix T ξ | z ◦ , except a measure zero subset, there exists an (cid:15) > such that the projection P Z is single-valued, continuous and firmly nonexpansive on B (cid:15) ( s ◦ ) . Note that Lemma 4.4 is weaker than prox-regularity of Z . Prox-regularity of Z isequivalent to P Z being single-valued, Lipschitz-continuous and monotone in a neigh-borhood around any point z ∈ Z , see [30, Theorem 1.3 (i)–(k), p. 5234]. Lemma 4.4makes a similar statement about P Z for points in the neighborhood B (cid:15) ( s ◦ ) of s ◦ ∈ fix T ξ | z ◦ , with (cid:15) >
0. Because Lemma 4.4 only holds when z ◦ (cid:54)∈ fix T ξ , the neighbor-hood B (cid:15) ( s ◦ ) can always be chosen such that it does not intersect with Z . In fact,Figure 5 illustrates a simple example, where Assumptions (A1)–(A3), and therebyLemma 4.4, hold for a set Z that is not prox-regular at z = 0. Lemma
Let ξ > and given any fixed point s ◦ ∈ fix T ξ such that there existsan (cid:15) > , where the projection P Z is single-valued, continuous and firmly nonexpansiveon B (cid:15) ( s ◦ ) . Then the operator T ξ is also single-valued, continuous and nonexpansiveon B (cid:15) ( s ◦ ) . Theorem
Let Assumptions (A1)–(A3) hold. For any ξ large enough, and an OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES initial iterate s sufficiently close to a fixed point of T ξ , Algorithm 1 converges almostalways to a fixed point ( z ◦ , y ◦ , s ◦ ) . When it converges, z j and y j converge to eachother, i.e., z ◦ = y ◦ , and to a local minimum of Problem (1). Moreover, s j convergesto a fixed point of T ξ , i.e., to s ◦ ∈ fix T ξ .Proof. Given ξ > λ +min ( R ) − , i.e., (8a), and large enough, according to Theo-rem 4.3, (A1) implies the existence of fixed points of T ξ , i.e., fix T ξ (cid:54) = ∅ . We firstconsider that there exists a fixed point s (cid:63) ∈ fix T ξ with corresponding local minimum z (cid:63) such that s (cid:63) = z (cid:63) . Then ( z (cid:63) ,
0) is a KKT point of Problem (2). Therefore M ξ z (cid:63) + c ξ ∈ P Z ( z (cid:63) ) = { z (cid:63) } ⇔ z (cid:63) = ( I − M ξ ) − c ξ = ¯ v − R ( h + H ¯ v ) , with I − M ξ invertible due to ξ > λ +min ( R ) − and (8b). This is checked in step 1 ofAlgorithm 1. If it holds the algorithm terminates and returns z (cid:63) , a global minimum. Ifthere is no s ◦ ∈ fix T ξ such that the corresponding local minimum z ◦ satisfies s ◦ = z ◦ ,we can apply Lemma 4.4. It says that for large enough ξ and any local minimum z ◦ ,such that fix T ξ | z ◦ (cid:54) = ∅ , we know that for all s ◦ ∈ fix T ξ | z ◦ , except a measure zerosubset, there exists an (cid:15) > P Z is single-valued, continuousand firmly-nonexpansive on B (cid:15) ( s ◦ ). We consider any such ξ and s ◦ . Using Lemma 4.5, T ξ is nonexpansive on B (cid:15) ( s ◦ ). If the initial iterate is close enough, i.e., s ∈ B (cid:15) ( s ◦ ),then by the nonexpansiveness of T ξ the iterates will satisfy s j ∈ B (cid:15) ( s ◦ ) for all j ∈ N .The Krasnoselskij iteration applied to T ξ , in step 5 of Algorithm 1, converges to a fixedpoint of T ξ for any choice of step-size parameter γ ∈ (0 ,
1) [6, Theorem 3.2, p. 65].Convergence lim j →∞ (cid:107) z j − y j (cid:107) = 0, follows by single-valuedness, and continuity of T ξ .By Theorem 4.2, z j , y j converge to a local minimum of Problem (1).Theorem 4.6 may require ξ to be large. Moreover, a lower bound of ξ satisfyingAssumption (A2) may be hard to compute. However, in practice good results canbe obtained for relatively small values of ξ . In addition, if the algorithm fails toconverge, the theoretical results indicate that choosing a larger value of ξ may improvethe convergence behavior. This is verified experimentally in Section 5.2.1, where weexamine the behavior for different ξ . In the following we show that Assumption (A2) is satis-fied for almost all instances of Problem (1). Corollary 4.8 makes the result applicableto instances of Problem (3), covering hybrid MPC.
Theorem
Consider Problem (1) with arbitrary parame-ter θ ∈ R p and cost matrix H (cid:31) . Given any point z ∈ E ∩ Z . For almost all linearcost terms h for which fix T ξ | z (cid:54) = ∅ for some ξ > , we have that (A2) holds at z .Proof. By Proposition 3.2(i), Lemma 3.3 and the definition of T ξ , there exists a¯ ξ > ξ ≥ ¯ ξ we have that fix T ξ | z (cid:54) = ∅ if and only if there exists a λ † such that ( z, λ † ) is a regular KKT point. We will show that for all linear cost terms h , except a measure zero subset, there exists a λ ◦ such that s ◦ := z − ξ λ ◦ ∈ fix T ξ | z and − λ ◦ ∈ relint ˆ N Z ( z ). This means that (A2) holds at z .The set H of h for which there exist λ † such that ( z, λ † ) is a regular KKT pointis given by the following convex set H := (cid:8) h ∈ R n | ∃ λ : 0 ∈ { Hz + h } + ˆ N E ( z ) − { λ } and − λ ∈ ˆ N Z ( z ) (cid:9) = (cid:2) I (cid:3) (cid:16)(cid:0) R n × − ˆ N Z ( z ) × ˆ N E ( z ) + { Hz } (cid:1) ∩ (cid:8)(cid:104) hλv (cid:105) | v = λ − h (cid:9)(cid:17) . D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI
Due to convexity of the regular normal cones all sets involved in the descriptionof H are convex. Therefore, we can use distributivity of the relint operator overthe Cartesian product and over the Minkowski sum [31, Exercise 2.45(a–b), p. 67],the interchangeability of linear maps and the relint operator [31, Proposition 2.44(a),p. 66], and distributivity of the relint operator over finite intersections if relint H (cid:54) = ∅ ,see [31, Proposition 2.42, p. 65]. From this we obtain thatrelint H = (cid:2) I (cid:3) (cid:16)(cid:0) R n × − relint ˆ N Z ( z ) × ˆ N E ( z ) + { Hz } (cid:1) ∩ (cid:8)(cid:104) hλv (cid:105) | v = λ − h (cid:9)(cid:17) = (cid:8) h ∈ R n | ∃ λ : 0 ∈ { Hz + h } + ˆ N E ( z ) − { λ } and − λ ∈ relint ˆ N Z ( z ) (cid:9) , if relint H (cid:54) = ∅ . Therefore, all linear cost terms for which (A2) holds are in relint H .Conversely, via [22, Theorem. 1, p. 90] and convexity of H , all linear cost terms forwhich (A2) does not hold are in a measure zero subset of H . To conclude the proof, itremains to show that relint H (cid:54) = ∅ . The set ˆ N Z ( z ) is closed, convex and non-empty,this implies that relint ˆ N Z ( z ) (cid:54) = ∅ via [31, Proposition 2.40, p. 64]. Therefore, thereexists a λ such that − λ ∈ relint ˆ N Z ( z ). Then h := λ − Hz ∈ relint H , which impliesrelint H (cid:54) = ∅ .To equivalently transform Problem (3) into Problem (1), in Remark 2.2, we haveeffectively restricted the choice of cost function to the case where its gradient Hz + h is in the set E α , for α ∈ (0 , N , where the set E α is defined as E α := × Nk =1 (cid:16) R n u × (cid:8) ( w k , x k +1 ) ∈ R n x (cid:12)(cid:12) α k w k = (1 − α k ) x k +1 (cid:9)(cid:17) . With E as defined for Problem (3) in Section 2.2, we have that E α + E ⊥ = R n , i.e.for any α ∈ (0 , N : E α and E ⊥ span R n . Theorem 4.7 does not immediately applyto this case, because the choice of cost function has been restricted. Corollary 4.8extends Theorem 4.7 to Problem (3). Corollary
Consider Problem (3) witharbitrary initial state θ ∈ R p , α ∈ (0 , N and quadratic cost matrix H (cid:31) followingRemark 2.2. Given any point z ∈ E ∩ Z and for all linear cost terms h satisfyingRemark 2.2 for which fix T ξ | z (cid:54) = ∅ for some ξ > , except a measure zero subset of E α , we have that (A2) holds at z .Proof sketch. Following Section 2.2, any instance of Problem (3) can be written asan instance of Problem (1), where, according to Remark 2.2, we require cost functionssuch that Hz + h ∈ E α , for some α ∈ (0 , N and any z . Following the proof ofTheorem 4.7, we only need to show that relint H ∩ E α (cid:54) = ∅ . This then impliesthat relint( H ∩ E α ) = relint( H ) ∩ E α , which implies the result. As in the proof ofTheorem 4.7, we have that relint ˆ N Z ( z ) (cid:54) = ∅ . We need to show that there exists an h ∈ E α such that ( { Hz + h } + ˆ N E ( z )) ∩− relint ˆ N Z ( z ) (cid:54) = ∅ . Due to [31, Theorem 6.46,p. 231] which details the presentation of the regular normal cones of polyhedral sets, wehave that ˆ N E ( z ) = E ⊥ . Furthermore we have Hz + E α = E α , for any z , by definition.Therefore, we have ( { Hz } + E α + ˆ N E ( z )) ∩ − relint ˆ N Z ( z ) = R n ∩ − relint ˆ N Z ( z ) (cid:54) = ∅ .
5. Numerical results.
We consider two numerical examples and compare ourmethod with an MIP reformulation based on disjunctions [34, Section 5], solved usingCPLEX, Gurobi and MOSEK. We used YALMIP [26] as optimization interface. Toensure a fair comparison with the MIP solvers, we have used the absolute MIP gap tolerance ∆ gap , which can be specified for all three used solvers CPLEX, Gurobi andMOSEK. The tolerance ∆ gap allows the solver to terminate as soon a solution with
OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES p is found that can be certified to be at most ∆ gap from the optimal objec-tive p (cid:63) , i.e., p + ∆ gap ≤ p (cid:63) . For each optimization problem that is solved as part ofSection 5 we perform the following steps to determine a meaningful gap: 1) Solve theproblem with the proposed method, yielding a suboptimal solution z ◦ with objectivevalue p ◦ ; 2) solve the problem to global optimality by warm-starting the MIP solverusing z ◦ . This yields the optimal objective value p (cid:63) ; 3) set the absolute MIP gaptolerance of the MIP solver to ∆ gap := p ◦ − p (cid:63) and re-solve the MIP problem withoutwarm-starting. In all instances where a global solver is used in this section, we haveset the absolute MIP gap tolerance as described above. We further compare to anefficient implementation of non-convex ADMM, similar to [33], where the splitting(2) was used. Efficient C-code for the proposed method and ADMM was generatedautomatically using the developed code generation tool [11], integrating explicit so-lutions of the individual convex projections using MPT3 [16]. For Example 5.1 theresulting binaries are less than 75 kB. For Example 5.2 they are 490 kB for predictionhorizon N = 10 and 550 kB for N = 20. The binaries for ADMM are of comparablesize. In contrast, the executables for the MIP solvers are in the range of 10 to 20 MB.The reported timings are obtained on an Intel Core i7 processor using a single core running at 2.8 GHz with 8 GB of RAM. No warm-starting or re-starting is used andwe always start with the initial iterate s = 0 ( y = λ = 0 for ADMM). We have usedstep size parameter γ = 0 . (cid:15) tol = 10 − . The same tolerancewas also used for the termination criterion of ADMM, according to [7], as absolute,primal and dual residual tolerance. Both examples are instances of Problem (3). Theytherefore satisfy (A2) for almost any linear cost term, according to Corollary 4.8. Fur-thermore, Assumption (A3) was verified to hold for both examples using Algorithm 3in Appendix B. In fact, for Example 5.1, Assumption (A3) was verified analytically,whereas for Example 5.2 Assumption (A3) was verified computationally in one hourand 18 minutes via an implementation of Algorithm 3 in MATLAB. The implemen-tation was not optimized for speed and it is likely that the computation times couldbe improved substantially with a more efficient implementation.Additionally, we have applied the proposed method to suitable problems from the MacMPEC collection [23]. This is discussed in Appendix C.
We consider a simple example taken from the hy-brid MPC literature [5, Example 4.1, p. 415]. It consists of two states, one input andPWA dynamics defined over two regions: x k +1 = (cid:40) A x k + Bu k if [ ] x k ≥ A x k + Bu k if [ ] x k ≤ , with u k ∈ [ − , (cid:80) Nk =1 (cid:107) x k +1 (cid:107) + (cid:107) u k (cid:107) and A := (cid:104) −√ √ (cid:105) , A := (cid:104) √ −√ (cid:105) , B := [ ] . For prediction horizons N = 5 , , , . . . ,
60, we apply the hybrid MPC in closed-loop for 10 steps. Each MPC problem is solved using the proposed method, with ξ = 10, and using ADMM with penalty parameter ρ = 10, where the tolerance (cid:15) tol = 10 − is used for both methods. The convergence of our method is illustratedin Figure 6a by a decreasing consensus violation (cid:107) z j − y j (cid:107) , for N = 40 and the firsttime step of the receding horizon problem. Even though not shown, the method alsoconverges for all successive time steps (and horizons N ), with similar convergence6 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI , − − − − − (cid:15) tol k z j − y j k (a) Consensus violation (cid:107) z j − y j (cid:107) ( solid ) for N = 40 at thefirst time step, with threshold( dashed ). . . . . . x x . . . . . x x start(b) Closed-loop trajectory, for N = 40 and 10 time steps. Opti-mal ( dashed ), and our method( solid ). . . . . CPLEXGurobi MOSEKADMM proposed method t max control horizon N s o l u t i o n t i m e i n [ s ] (c) Solution time in [ s ] averaged over 10 closed-loop timesteps, for different control horizons. Our method ( solid, (cid:52) ), ADMM ( solid, ∗ ) and a MIP reformulation solvedusing CPLEX ( dashed, ♦ ), Gurobi ( dash-dotted, (cid:35) )and MOSEK ( dotted, (cid:3) ). . . . . control horizon N s o l u t i o n t i m e i n [ s ] (d) Minimum, maximum and median solution time in [ s ]over 10 closed-loop time steps for our method ( solid ) withdifferent control horizons. In comparison to the mediansolution time of ADMM ( dashed ) and 5-th to 95-th per-centile ( very light ). Fig. 6: Comparison of Example 5.1 with initial state x = x = 1.characteristics. This leads to a stable closed-loop trajectory, illustrated in Figure 6b.The trajectory is almost indistinguishable from the optimum, obtained using a MIPreformulation based on disjunctions [34, Section 5]. The computational advantage ofour method is underlined by Figure 6c, where the average runtime for different controlhorizons N is shown. Our method is approximately two orders of magnitude fasterthan the considered commercial MIP solvers. Furthermore, Figure 6d illustrates, thatour method is slightly faster than ADMM in terms of the median runtime. Theaverage runtime of ADMM in Figure 6c is substantially higher than for our method,because ADMM fails to converge for two of the ten time steps and terminates whenit reaches 10 000 iterations. We were not able to achieve convergence of ADMM byadjusting ρ for these cases. In order to illustrate the localconvergence behavior of Algorithm 1 and the dependence of the method on both theinitial iterate s and the scaling ξ , we have solved Example 5.1 for N = 10 and 50 000different initial iterates s := z − ξ λ , where z was drawn uniformly randomly from[ − , and λ from [ − , . We considered three different proximal scalings ξ = { , , } and the consensus tolerance was (cid:15) tol = 10 − . The convergence OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES + ∞ (a)(c)(d) (not converged) 8.6% % of initial iterates o b j ec t i v e v a l u e o f s o l u t i o n (a) ξ = 10. The method converges in 91 .
4% of cases and does not converge in 8 .
6% of cases.Clusters with objective values (a) in [0 . , . . (c) in [0 . , . . (d) in [1 . , . . (b) is not yet present, likely because ξ is too low. + ∞ (a)(b)(c)(d)(not converged) 0.9% % of initial iterates o b j ec t i v e v a l u e o f s o l u t i o n (b) ξ = 100. The method converges in 99 .
1% of cases and does not converge in 0 .
9% of cases.Clusters with objective values (a) in [0 . , . (b) in [0 . , . . (c) in [0 . , . (d) in [1 . , . . + ∞ (a)(b)(c)(d)(not converged) 0.5% % of initial iterates o b j ec t i v e v a l u e o f s o l u t i o n (c) ξ = 1000. The method converges in 99 .
5% of cases and does not converge in 0 .
5% ofcases. Clusters with objective values (a) in [0 . , . (b) in [0 . , . . (c) in [0 . , . (d) in [1 . , . . Fig. 7: Percentage of initial iterates s achieving different objective values for differentvalues of ξ . Converged solutions are clustered according to objective values, whereeach cluster contains multiple local optima. Cluster (a) includes the optimal objective0.4189. . . start x x (a) . . x (b) . . x (c) . . x (d) Fig. 8: Trajectories resulting from different initial iterates s , for ξ = 100, in clusters (a) – (d) . The number of trajectories differing (in terms of the objective) by at most10 − is (a) (b) (c)
15 and (d)
D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI to local minima can be seen in Figure 7, where we have clustered the solutions intofour clusters (a) – (d) and illustrate the percentage of initial iterates achieving certainobjective values which correspond to local optima. In Figure 7(a), we can see thatfor ξ = 10 we converge to fewer different local solutions, compared to ξ = 100 or ξ = 1000 given in Figure 7(b)–(c). In particular, the cluster (b) is absent fromFigure 7(a) and the cluster (c) contains a more narrow range of objective values.Furthermore, comparing Figures 7(a)–(c), we see that the method converges morefrequently for larger ξ . Since the proposed method is a local method, the initialiterate s of Algorithm 1 has a much large effect than ξ on which local minima ourmethod converges to. However, if ξ is too small, some local minima will cease tocorrespond to fixed points of the method and the number of cases where the methodfails to converge increases. Moreover, the state trajectories of the solutions for eachcluster, for ξ = 100, are plotted in Figures 8a–8d. Each cluster contains multiplelocal minima that have similar objective value. The 50 000 different initial iteratesonly lead to 34 trajectories that each differ (in terms of the objective value) by at least10 − . Cluster (a) with 62% −
68% of initial iterates contains solutions that are veryclose to the global optimum. The initial iterate s = 0 used in Example 5.1 belongsto this cluster. This illustrates that the presented method is indeed a local method,and does not necessarily converge to the global optimum. In this numerical example, we demonstrate the properties of theproposed algorithm on a more complex problem. To this end, we consider a hybridMPC problem for racing miniature cars [25], where the friction forces acting on thetires are modeled as a PWA function, see [14, p. 108ff]. The forward velocity v x of thecar is fixed to 2 m / s . The state x = ( v y , ω ) of the system consists of the lateral velocity v y and the turning rate ω . The steering angle δ is the only input to the system. Thecontinuous-time dynamics are described by˙ v y = m ( F f,y ( v y , ω, δ ) + F r,y ( v y , ω ) − mv x ω ) , ˙ ω = I Z ( l f F f,y ( v y , ω, δ ) − l r F r,y ( v y , ω )) , where m, I Z , l f , l r are known model parameters and F f,y , F r,y are the lateral frictionforces acting on the front and rear tires of the vehicle, respectively. They are givenas PWA functions with 5 pieces each, giving rise to an irredundant description ofthe dynamics with 19 regions per time step. For a prediction horizon N , this leadsto 19 N possible combinations overall. The dynamics are discretized with a samplingtime of T s = 20 ms. Additionally, we impose input constraints δ ∈ [ − ◦ , ◦ ] andstate constraints:(13) v y ∈ [ − m / s , m / s ] , ω ∈ [ − rad / s , rad / s ] . The objective (cid:80) Nk =1 (cid:107) diag(1 , √ x k +1 − ¯ x k +1 ) (cid:107) + (cid:107) δ k − ¯ δ k (cid:107) , tracks a state ¯ x andinput ¯ δ reference trajectory producing an S-shaped motion of the race car.We consider closed-loop behavior, where the MPC is applied in a receding-horizonfashion. For every time step a problem with different initial state and reference issolved and only the first input u is applied to the dynamical system. A new problemwith updated initial state and reference is solved for the next time step. We comparethe closed-loop evolution for the different methods over 150 steps. The initial state is v y = ω = 0. The dynamics of the system are with respect to the lateral and angularvelocities, v y and ω . The MPC problem instances are solved for a prediction horizonof N = 10 ( ξ = 300) and N = 20 ( ξ = 400). The penalty parameter of ADMM OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES ρ = 350 and ρ = 450 for N = 10 and N = 20, respectively. Atime limit of 3 s for N = 10, and 43 s for N = 20 was imposed on all solvers. Bothtime limits are well above the computation times achieved by our algorithm, with amedian runtime of 34 ms for N = 10, and 87 ms for N = 20, as reported in Figure 10and Table 1. For N = 20, Gurobi has a median runtime of 10 s, reaching the timelimit in a few cases, CPLEX has a median runtime if 12 s, reaching the time limit insome cases and MOSEK reaches the time limit for most time steps leading to medianruntime of to 43 s. The runtime of ADMM is similar to the proposed method (medianruntime of 55 ms for N = 20), however, it fails to converge in a few cases, as shown inFigure 10. In contrast, our method converges in at most 1878 iterations for each ofthe 150 problems, solved in the closed-loop simulation. This is illustrated in Figure 9.
400 800 1 ,
200 1 , − − − k z j − y j k Fig. 9: Consensus violations (cid:107) z j − y j (cid:107) for N = 20 andtime steps 1 to 150, illustrat-ing convergence.
50 100 1500 . . . time steps k s o l u t i o n t i m e i n [ s ] Fig. 10: Solution time in [ s ]for time steps k of the closed-loop simulation, for N = 20,time limit 43 s ( dashed ).Our method ( solid ), ADMM( solid ) and Gurobi ( dash-dotted ). . . − − . − − . − − . − − . x y . . − − . − − . − − . − − . x y optimal ADMM
MOSEK o u r m e t h o d start(a) N = 10 . . − − . − − . − − . − − . x . . − − . − − . − − . − − . x Gurobi
CPLEX (b) N = 20 Fig. 11: Closed-loop position trajectory with initialcondition x = y = ϕ = 0. Optimal closed-looptrajectory ( dashed ), our method ( solid ), ADMM( solid ), CPLEX ( dashed ), Gurobi ( dash-dotted )and MOSEK (dotted).In Figure 11, we compare the position of the car, using the position dynamicsto transform the velocities into positions, via integration of ˙ ϕ = ω , ˙ x = v x cos( ϕ ) − v y sin( ϕ ) and ˙ y = v x sin( ϕ ) + v y cos( ϕ ). The proposed method ( solid ) visibly out-performs the commercial MIP solvers in terms of solution quality. This is also illus-trated in Table 1, where the relative distance to the optimal closed-loop trajectory0 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI N ourmethod ADMM Gurobi CPLEX MOSEK10 0.8% (34 ms) 0.2% (21 ms) 3.3% (1.1 s) 11% (1.8 s) 74% (3 s)20 0.7% (87 ms) 0.3% (55 ms) 2.5% (10 s) 7.7% (12 s) 48% (43 s)Table 1: Comparison of relative 2-norm distance (cid:107) z − z (cid:63) (cid:107)(cid:107) z (cid:63) (cid:107) to the optimal closed-loopposition trajectory z (cid:63) for N = 10 and N = 20. Additionally, in brackets, we reportthe median computation times for which these trajectories were achieved.is reported. Our method comes very close to the optimal trajectory, while requiringorders of magnitude less runtime. Both our method and ADMM provide near optimalsolutions for similar computational effort. However, our method additionally providesguarantees on the convergence properties of the algorithm. To better understand the proximal scal-ing ξ , we consider 2000 feasible, random instances with different initial states ( v y , ω )satisfying (13). We solve the MPC problem for 40 values of ξ , with a horizon of N = 10and step size γ = 0 .
95. Figure 12 demonstrates the relationship between ξ , (i) thenumber of solved instances and (ii) the number of iterations needed for convergence.For larger ξ more problems are solved, with only 4 unsolved problems remainingfor ξ = 5000. Furthermore, the number of iterations needed to reach the consen-sus threshold (right axis) grows only modestly with larger ξ . In all instances, theconstraint violations and relative suboptimality do not change significantly. Thesefindings are consistent with our expectations from the theory, i.e., that larger ξ usu-ally lead to more problems solved but at the expense of slower convergence.
10 100 1 , ξ % s o l v e d , o f i t e r a t i o n s Fig. 12: Effect of proximal scaling ξ . Percentage of problems solved ( solid , left axis)and number of iterations to reach 10 − consensus violation (right axis) for the solvedproblems, median ( dashed ), 5-th and 95-th percentile ( dashed ).
6. Conclusion.
We propose a low-complexity method for finding local minima ofnon-convex, non-smooth optimization problems. This simple and fast method is idealfor hybrid MPC on embedded platforms. In numerical experiments, we observe thatour method provides “good” locally optimal solutions at a fraction of the time neededby global solvers. Moreover, it is competitive with ADMM in terms of speed. Incontrast to other local methods, our algorithm additionally provides local optimalityand local convergence guarantees.
Appendix A. Auxiliary results and proofs.
OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES A.1. Preliminaries.
We first introduce preliminary results used in this ap-pendix, including the some additional notation: The distance of a point x to aclosed set C is dist( x, C ) := min z ∈C (cid:107) x − z (cid:107) . The relative boundary of a set C isrelbd C := cl C \ relint C , where cl C is the closure and relint C the relative interior .Moreover, we will make frequent use of the following facts relating the Euclideanprojection and the regular normal cone of non-convex polyhedral sets, and the pre-ceding properties of operators in the remainder of this appendix. Proposition
A.1 (Projection and regular normals, [31, Example 6.16, p.212],[31, Proposition 6.17, p.213]).
Given a set
C ⊆ R n .(i) For any x ∈ C it holds that x ∈ P C ( x − v ) ⇒ v ∈ ˆ N C ( x ) .(ii) If C is convex, then for any x ∈ C it holds that x ∈ P C ( x − v ) ⇔ v ∈ ˆ N C ( x ) .(iii) For any x ∈ C if x ∈ P C ( x − τ v ) for some τ > , then P C ( x − τ v ) = { x } forevery τ ∈ (0 , τ ) . Definition
A.2 (Nonexpansive and firmly nonexpansive operators, [4, Defini-tion 4.1 (i)–(ii), p. 59]).
Given an operator T : X ⇒ Y and a point ¯ x ∈ X .(i) T is locally nonexpansive around ¯ x if there exists an (cid:15) > such that for all x, y ∈ B (cid:15) (¯ x ) and any u ∈ T ( x ) , v ∈ T ( y ) it holds that (cid:107) u − v (cid:107) ≤ (cid:107) x − y (cid:107) . (ii) T is locally firmly nonexpansive around ¯ x if there exists an (cid:15) > such thatfor all x, y ∈ B (cid:15) (¯ x ) and any u ∈ T ( x ) , v ∈ T ( y ) it holds that (cid:107) u − v (cid:107) ≤ (cid:104) x − y, u − v (cid:105) . Finally, we elaborate on the structure of the regular normal cone of Z , where thefollowing representation is used: Z = (cid:83) Ii =1 Z i . We remind the reader that I Z ( z ) := { i ∈ { , . . . , I } | z ∈ Z i } is the set of active components of Z at a point z ∈ Z .Proposition A.3 states, that the regular normal cone of Z at z can be written as thefinite intersection of the normal cones of its convex polyhedral parts Z i , where onlythe sets Z i that contain z , i.e., the active components of Z at z , need to be considered. Proposition
A.3 ([15, p. 59f]).
The regular normal cone ˆ N Z of Z evaluated at z ∈ Z is given by ˆ N Z ( z ) = (cid:84) i ∈I Z ( z ) ˆ N Z i ( z ) . A.2. Necessary conditions for local optimality.
We start with a proofsketch for Lemma 3.1, which states that for every local minimum z ◦ of Problem (1)there exists multipliers λ ◦ such that ( z ◦ , λ ◦ ) is a generalized KKT point of Prob-lem (2). Proof sketch of Lemma 3.1.
Problem (1) is equivalent to the consensus problem,Problem (2), which can be written as an optimization problem with variational in-equality constraints (OPVIC) as followsmin z,y z (cid:62) Hz + h (cid:62) z s . t . ψ ( z, y ) := (cid:20) z − yy − z (cid:21) ≤ , ( z, y ) ∈ E × Z ,y ∈ Ω := R n , (cid:104) , y − u (cid:105) ≤ ∀ u ∈ Ω . Because the map ψ is affine, the set E × Z is polyhedral (non-convex) and the set Ω ispolyhedral convex, the constraint set of this problem satisfies a regularity condition2
D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI called a local error bound [35, Theorem 4.3, p. 952] at any feasible point. Thisimplies that the problem is calm everywhere, see [35, Proposition 4.2, p. 951]. Inparticular, this means that the problem is calm at local solutions, which togetherwith [35, Theorem 3.6, p. 950] implies that for every local solution z ◦ to Problem (2)there exists a λ ◦ ∈ R n such that ( z ◦ , z ◦ , λ ◦ ) satisfies the generalized KKT conditions(4) and therefore ( z ◦ , z ◦ , λ ◦ ) is a generalized KKT point. A.3. Proximal KKT points and zeros of K ξ . Next, we establishes a nec-essary and sufficient relationship between regular and ξ -proximal KKT points in theproof of Proposition 3.2 and relate the zeros of the operator K ξ to ξ -proximal KKTpoints of Problem (2), in the proof of Lemma 3.3. Proof of Proposition 3.2.
We consider the two statements separately.i) Any regular KKT point ( z, λ ) satisfies − λ ∈ ˆ N Z ( z ). Thus, by Proposition A.3(14a) − λ ∈ ˆ N Z ( z ) ⇒ − λ ∈ (cid:92) i ∈I Z ( z ) ˆ N Z i ( z ) . For any ξ > ⇔ − ξ λ ∈ ˆ N Z i ( z ) ∀ i ∈ I Z ( z ) ⇔ z ∈ P Z i ( z − ξ λ ) ∀ i ∈ I Z ( z ) , where (14b) follows from the convexity of Z i and the relationship between P Z i andˆ N Z i , described in Proposition A.1(ii). By construction, for any (cid:15) > Z ∩ B (cid:15) ( z ) = (cid:83) i ∈I Z ( z ) Z i ∩ B (cid:15) ( z ) and thus (14b) ⇒ z ∈ P Z∩B (cid:15) ( z ) ( z − ξ λ ).We consider ¯ ξ := (cid:15) (cid:107) λ (cid:107) . For any ξ ≥ ¯ ξ we have (cid:107) ( z − ξ λ ) − z (cid:107) = ξ (cid:107) λ (cid:107) ≤ (cid:15) .Thus, for all x ∈ Z with (cid:107) z − x (cid:107) > (cid:15) we have (cid:107) ( z − ξ λ ) − x (cid:107) > (cid:15) . Together with z ∈ P Z∩B (cid:15) ( z ) ( z − ξ λ ), this implies that z ∈ P Z ( z − ξ λ ) for all ξ ≥ ¯ ξ = (cid:15) (cid:107) λ (cid:107) , whichimplies that ( z, λ ) is a ξ -proximal KKT point for any ξ ≥ ¯ ξ .ii) Given any ξ >
0, and ξ -proximal KKT point ( z, λ ). By definition z ∈ P Z ( z − ξ λ ) which implies − ξ λ ∈ ˆ N Z ( z ) via Proposition A.1(i). This is equivalent to − λ ∈ ˆ N Z ( z ), which directly implies that ( z, λ ) is a regular KKT point. Proof of Lemma 3.3.
Given any ξ >
0, and any s † ∈ zero K ξ , by definition 0 ∈{ M ξ s † + c ξ } − P Z ( s † ), and thereby z † := M ξ s † + c ξ ∈ P Z ( s † ). We define λ † := ξ ( z † − s † ) and thus z † ∈ P Z ( z † − ξ λ † ) ⇔ (6b) holds. Furthermore, we have z † = M ξ ( z † − ξ λ † ) + c ξ which, using (7b), is equivalent to z † = R (cid:0) λ † − h − H ¯ v (cid:1) + ¯ v .By strict convexity z † is the unique solution to 0 ∈ ∂ z ( z (cid:62) Hz +( h − λ † ) (cid:62) z + χ E ( z ))( z † ),which is equivalent to 0 ∈ { Hz † + h − λ † } + ˆ N E ( z † ) and therefore (6a) holds and ( z † , λ † )is a ξ -proximal KKT point. A.4. Local convexification.
Lemma 4.1 is one of the main mathematical toolsthat we use to prove the local optimality and local convergence of the method. Forany z ∈ Z , it relates the set non-convex Z to the closed convex set ˆ N ∗Z ( z ) + { z } , ina neighborhood of z . Specifically, it states that Z is contained in ˆ N ∗Z ( z ) + { z } , ina neighborhood of z , and that the regular normal cones of the two sets at z are thesame. Proof of Lemma 4.1.
We prove the two parts individually.i) Given (cid:15) >
0, take any ¯ z ∈ Z and z ∈ Z ∩ B (cid:15) (¯ z ). Using the definition of thepolar cone we have that z ∈ ˆ N ∗Z (¯ z ) + { ¯ z } if and only if (cid:104) z − ¯ z, v (cid:105) ≤ v ∈ ˆ N Z (¯ z ). OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES (cid:15) > v ∈ ˆ N Z (¯ z )such that (cid:104) z − ¯ z, v (cid:105) >
0. Using Proposition A.3, we have that for any (cid:15) > v ∈ ˆ N Z (¯ z ) is equivalent to (cid:104) v, u − ¯ z (cid:105) ≤ u ∈ Z i , and all i ∈ I Z (¯ z ).However, since z ∈ Z , there exists an i ∈ I Z (¯ z ) such that z ∈ Z i , and therefore bychoosing u = z we have (cid:104) v, u − ¯ z (cid:105) < (cid:104) z − ¯ z, v (cid:105) , a contradiction.ii) For any z ∈ Z , we have that ˆ N Z ( z ) = ˆ N ∗∗Z ( z ), due to ˆ N Z ( z ) closed andconvex, see [31, Corollary 6.21, p. 216]. From the definition of the polar and regularnormal cones, see [31, p. 203, p. 215], and convexity of ˆ N ∗Z ( z ) we have that v ∈ ˆ N Z ( z )is equivalent to (cid:104) v, w (cid:105) ≤ w ∈ ˆ N ∗Z ( z ), which is equivalent to (cid:104) v, u − z (cid:105) ≤ u ∈ ˆ N ∗Z ( z ) + { z } . This means v ∈ ˆ N ˆ N ∗Z ( z )+ { z } ( z ) and implies the result. A.5. Local convergence to local minima.
We are now ready to prove the twocentral lemmas that lead to the local convergence result in Theorem 4.6: Lemma 4.4states that in a neighborhood around almost all fixed points of T ξ , the projection P Z behaves like a projection onto a convex set. This is used in the proof of Lemma 4.5 toshow that the operator T ξ is nonexpansive, single-valued and continuous in a neigh-borhood around almost all fixed points.To prove Lemma 4.4, the following two auxiliary propositions are needed. Propo-sition A.4 gives a local characterization of the projection around points ¯ s where it isunique. It states, that there always exists a neighborhood B (cid:15) (¯ s ), with (cid:15) >
0, such thatfor all points in the neighborhood the projection onto Z can be restricted to the pro-jection onto just the union of the active components of Z at ¯ z = P Z (¯ s ). Furthermorefor all points s ∈ B (cid:15) (¯ s ) in the neighborhood the result z ∈ P Z ( s ) of the projectionsatisfies the bound (cid:107) z − ¯ z (cid:107) ≤ (cid:107) s − ¯ s (cid:107) . Proposition A.5 states that the set fix T ξ | z ◦ offixed points corresponding to a given local optimum z ◦ of Problem (1) either is emptyfor all ξ or is non-empty for all ξ large enough. Proposition
A.4 (Projection).
Given ¯ z and ¯ s such that P Z (¯ s ) = { ¯ z } , thenthere exists an (cid:15) > such that for all s ∈ B (cid:15) (¯ s ) : P Z ( s ) = P Y (¯ z ) ( s ) ⊆ B (cid:107) s − ¯ s (cid:107) (¯ z ) ,where Y ( z ) := (cid:83) i ∈I Z ( z ) Z i .Proof. First, we will show that for all s ∈ R n , we have P Y (¯ z ) ( s ) ⊆ B (cid:107) s − ¯ s (cid:107) (¯ z ). Forall i ∈ I Z (¯ z ), the projection P Z i is nonexpansive because Z i is closed and convex.Therefore, for any i ∈ I Z (¯ z ) and any s ∈ R n we have (cid:107)P Z i ( s ) − P Z i (¯ s ) (cid:107) ≤ (cid:107) s − ¯ s (cid:107) .Consider any z ∈ P Y (¯ z ) ( s ), since ¯ z and ¯ s are such that P Z (¯ s ) = { ¯ z } it follows that P Y (¯ z ) (¯ s ) = P Z i (¯ s ) = { ¯ z } for all i ∈ I Z (¯ z ). In particular for some i ∈ I Z (¯ z ) we have(15) (cid:107) z − ¯ z (cid:107) = (cid:107)P Z i ( s ) − P Z i (¯ s ) (cid:107) ≤ (cid:107) s − ¯ s (cid:107) , which implies P Y (¯ z ) ( s ) ⊆ B (cid:107) s − ¯ s (cid:107) (¯ z ). Second, we show that there exists an (cid:15) > y ∈ Z \ Y (¯ z )(16a) (cid:107) y − s (cid:107) > (cid:107) z − s (cid:107) ∀ s ∈ B (cid:15) (¯ s ) , z ∈ P Y (¯ z ) ( s ) . We define ς := inf y ∈Z\Y (¯ z ) (cid:107) y − ¯ s (cid:107) − (cid:107) ¯ z − ¯ s (cid:107) >
0, by closedness of Z and Z i . Thetriangle inequality implies that for any y ∈ Z \Y (¯ z ) we have (cid:107) y − ¯ s (cid:107) ≤ (cid:107) y − s (cid:107) + (cid:107) s − ¯ s (cid:107) .We choose (cid:15) < ς , which implies(16b) (cid:107) y − s (cid:107) ≥ (cid:107) y − ¯ s (cid:107) − (cid:107) s − ¯ s (cid:107) ≥ ς + (cid:107) ¯ z − ¯ s (cid:107) − (cid:107) s − ¯ s (cid:107) > (cid:15) + (cid:107) ¯ z − ¯ s (cid:107) − (cid:15) , where in the second step we have used the definition of ς , and in the third step wehave used ς > (cid:15) and (cid:107) s − ¯ s (cid:107) ≤ (cid:15) . Now we consider any z ∈ P Y (¯ z ) ( s ), via the triangle4 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI inequality (cid:107) z − s (cid:107) ≤ (cid:107) z − ¯ z (cid:107) + (cid:107) s − ¯ s (cid:107) + (cid:107) ¯ z − ¯ s (cid:107) and (15) we have (cid:107) z − s (cid:107) ≤ (cid:107) s − ¯ s (cid:107) + (cid:107) ¯ z − ¯ s (cid:107) ≤ (cid:15) + (cid:107) ¯ z − ¯ s (cid:107) , which together with (16b) implies (16a). This implies the result. Proposition
A.5.
Given any local minimum z ◦ of Problem (1).Either fix T ξ | z ◦ = ∅ for all ξ > or there exists a ¯ ξ > such that fix T ξ | z ◦ (cid:54) = ∅ forall ξ ≥ ¯ ξ .Proof. Given any local minimum z ◦ . Either fix T ξ | z ◦ = ∅ for all ξ > ξ such that there exists an ¯ s ∈ fix T ¯ ξ | z ◦ (cid:54) = ∅ . By Lemma 3.3, it holds that( z ◦ , λ ), with λ := ¯ ξ ( z ◦ − ¯ s ), is a ¯ ξ -proximal KKT point. From Proposition 3.2, itfurther follows that ( z ◦ , λ ) is also a ξ -proximal KKT point for any ξ ≥ ¯ ξ , whichimplies s := z ◦ − ξ λ ∈ fix T ξ | z ◦ , and thus fix T ξ | z ◦ (cid:54) = ∅ for all ξ ≥ ¯ ξ .We use the following claim to prove Lemma 4.4. The proof of Claim A.6 ispresented immediately after the proof of Lemma 4.4. The claim states that if ξ ischosen large enough, then for any local minimum z ◦ for which the operator T ξ hasfixed points such that z ◦ itself is not a fixed point of T ξ , we have that for almostall fixed points of T ξ , the projection is unique and the fixed point lies in the relativeinterior of the shifted regular normal cone ˆ N Z ( z ◦ ) + { z ◦ } of Z at z ◦ . Claim
A.6. Given Assumptions (A1)–(A2). For any ξ > z ◦ to Problem (2), with fix T ξ | z ◦ (cid:54) = ∅ and z ◦ (cid:54)∈ fix T ξ | z ◦ , we have that(17) P Z ( s ◦ ) = { z ◦ } and s ◦ ∈ relint ˆ N Z ( z ◦ ) + { z ◦ } , for all fixed points s ◦ ∈ fix T ξ | z ◦ , except a measure zero subset. Proof of Lemma 4.4.
Given Claim A.6 and consider any pair z ◦ , s ◦ satisfying (17).We will show that in a neighborhood of s ◦ the projection onto Z is the same as pro-jecting onto an appropriate closed, convex set. This then implies local firm nonex-pansiveness, continuity and single-valuedness of the projection.Let B (cid:15) ( s ◦ ) be a neighborhood of s ◦ , for some (cid:15) > S :=lin( ˆ N Z ( z ◦ )), the linear subspace spanned by ˆ N Z ( z ◦ ), and its orthogonal complement S ⊥ . For any s ∈ B (cid:15) ( s ◦ ), we can write s = s ◦ + u + v , where u ∈ S , v ∈ S ⊥ and u + v ∈ B (cid:15) . This allows us to argue about the contribution of u and v successively.We consider ¯ s := s ◦ + u . Due to P Z ( s ◦ ) = { z ◦ } , continuity of the distance function,and Proposition A.1(iii), we have P Z (¯ s ) = P Z ( s ◦ + u ) = { z ◦ } for all u ∈ B (cid:15) , giventhat (cid:15) is small enough. Furthermore, due to openness of relint ˆ N Z ( z ◦ ) on S , we have¯ s ∈ relint ˆ N Z ( z ◦ ) + { z ◦ } for all u ∈ B (cid:15) , with (cid:15) > z ◦ , s ◦ satisfying (17), we will consider z ◦ , ¯ s . We now examinethe contribution of v , i.e., s = ¯ s + v . Similarly to Lemma 4.1, where we showed thatwe can replace the regular normal cone of Z by the normal cone of ˆ N ∗Z ( z ◦ ) + { z } , wewill now consider the projection onto that set. We do this because ˆ N ∗Z ( z ◦ ) + { z ◦ } isa local convexification of Z . We have v ∈ S ⊥ ⊆ ˆ N ∗Z ( z ◦ ). We consider the projection P ˆ N ∗Z ( z ◦ ) ( s − z ◦ ) = arg min w ∈ ˆ N ∗Z ( z ◦ ) (cid:107) ¯ s − z ◦ + v − w (cid:107) = arg min w ∈ ˆ N ∗Z ( z ◦ ) (cid:107) ¯ s − z ◦ (cid:107) + (cid:107) v − w (cid:107) − (cid:104) ¯ s − z ◦ , w (cid:105) , where we have used (cid:104) ¯ s − z ◦ , v (cid:105) = 0. By (cid:104) ¯ s − z ◦ , w (cid:105) ≤ v is a minimizer. It is unique because the set ˆ N ∗Z ( z ◦ ) is closed and convex, therefore P ˆ N ∗Z ( z ◦ ) ( s − z ◦ ) = { v } and consequently P ˆ N ∗Z ( z ◦ )+ { z ◦ } (¯ s + v ) = { z ◦ + v } . Additionally,due to Z ∩ B (cid:15) ( z ◦ ) ⊆ ˆ N ∗Z ( z ◦ ) + { z ◦ } from Lemma 4.1 and z ◦ + v ∈ Z ∩ B (cid:15) ( z ◦ ) due toAssumption (A3), it follows that z ◦ + v ∈ P Z (¯ s + v ). Finally using Proposition A.4 OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES P Z (¯ s + v ) ⊆ Z ∩ B (cid:107) v (cid:107) ( z ◦ ) ⊆ Z ∩ B (cid:15) ( z ◦ ), which immediately impliesthat P Z (¯ s + v ) = P ˆ N ∗Z ( z ◦ )+ { z ◦ } (¯ s + v ) = { z ◦ + v } . This means for all s ∈ B (cid:15) ( s ◦ )the projection onto Z is equivalent to the projection onto the closed, convex setˆ N ∗Z ( z ◦ ) + { z ◦ } and therefore enjoys all of the properties of a projection onto a closedconvex set. In particular single-valuedness, continuity and firm nonexpansiveness [4,Proposition 4.8, p. 61]. Proof of Claim A.6.
Given a local minimum z ◦ to Problem (2) and consider any¯ ξ according to Proposition A.5 such that (A2) holds. By definition of fix T ξ and byProposition A.1(iii), (A2) also holds for any larger ξ ≥ ¯ ξ . We will show that for any ξ > ¯ ξ , and any local minimum z ◦ such that fix T ξ | z ◦ (cid:54) = ∅ and z ◦ (cid:54)∈ fix T ξ | z ◦ :(18) relint fix T ξ | z ◦ = (cid:8) s ∈ fix T ξ | z ◦ (cid:12)(cid:12) P Z ( s ) = { z ◦ } , s ∈ relint ˆ N Z ( z ◦ ) + { z ◦ } (cid:9) . [22, Theorem. 1, p. 90] states that the boundary of a convex set has zero Lebesguemeasure. Convexity of fix T ξ | z ◦ together with (18) therefore imply that (17) holds forall s ∈ fix T ξ | z ◦ , except a measure zero subset.Due to fix T ¯ ξ | z ◦ (cid:54) = ∅ and (A2), we have that there exists an ¯ s ∈ fix T ¯ ξ | z ◦ with¯ s ∈ relint ˆ N Z ( z ◦ )+ { z ◦ } . Furthermore, we have that s ( ξ ) := z ◦ − ξ ¯ ξ ( z ◦ − ¯ s ) ∈ fix T ξ | z ◦ and s ( ξ ) ∈ relint ˆ N Z ( z ◦ ) + { z ◦ } for ξ ≥ ¯ ξ and by Proposition A.1(iii) we have that P Z ( s ( ξ )) = { z ◦ } for any ξ > ¯ ξ . Therefore, for any ξ > ¯ ξ we have that(19) fix T ξ | z ◦ ∩ relint (cid:0) ˆ N Z ( z ◦ ) + { z ◦ } (cid:1) ∩ (cid:8) s (cid:12)(cid:12) P Z ( s ) = { z ◦ } (cid:9) (cid:54) = ∅ . We consider the set { s | z ◦ ∈ P Z ( s ) } . We have (cid:8) s (cid:12)(cid:12) z ◦ ∈ P Z ( s ) (cid:9) = (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) ≤ dist( s, Z ) (cid:9) = (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) ≤ dist( s, Y ( z ◦ )) (cid:9) ∩ (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) ≤ dist( s, Y c ( z ◦ )) (cid:9) = (cid:0) ˆ N Z ( z ◦ ) + { z ◦ } (cid:1) ∩ (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) ≤ dist( s, Y c ( z ◦ )) (cid:9) , with Y ( z ◦ ) := (cid:83) i ∈I Z ( z ◦ ) Z i and Y c ( z ◦ ) := (cid:83) i ∈I c Z ( z ◦ ) Z i , where I c Z ( z ◦ ) := { , . . . , I } \I Z ( z ◦ ). Where we have used that by Proposition A.3 and convexity of Z i we have { s | (cid:107) s − z ◦ (cid:107) ≤ dist( s, Y ( z ◦ )) } = { s | z ◦ ∈ P Y ( z ◦ ) ( s ) } = ˆ N Y ( z ◦ ) ( z ◦ ) + { z ◦ } = ˆ N Z ( z ◦ ) + { z ◦ } . By continuity and upper-boundedness of ς ( s ) := (cid:107) s − z ◦ (cid:107) − dist( s, Y c ( z ◦ )), and byclosedness of Y c ( z ◦ ), it follows thatrelint (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) ≤ dist( s, Y c ( z ◦ )) (cid:9) = (cid:8) s (cid:12)(cid:12) (cid:107) s − z ◦ (cid:107) < dist( s, Y c ( z ◦ )) (cid:9) = (cid:8) s (cid:12)(cid:12) P Z ( s ) = { z ◦ } (cid:9) for all s such that z ◦ ∈ P Z ( s ). Therefore, by distributivity of the relint operator overfinite intersections due to (19), see [31, Proposition 2.42, p. 65], we haverelint (cid:8) s (cid:12)(cid:12) z ◦ ∈ P Z ( s ) (cid:9) = relint (cid:0) ˆ N Z ( z ◦ ) + { z ◦ } (cid:1) ∩ (cid:8) s (cid:12)(cid:12) P Z ( s ) = { z ◦ } (cid:9) (cid:54) = ∅ . By definition of fix T ξ | z ◦ , convexity of { s | z ◦ ∈ P Z ( s ) } , [31, Proposition 2.42, p. 65]and (19) we then haverelint fix T ξ | z ◦ = relint (cid:8) s (cid:12)(cid:12) ∈ { Hz ◦ + h } + ˆ N E ( z ◦ ) − { ξ ( z ◦ − s ) } , z ◦ ∈ P Z ( s ) (cid:9) = fix T ξ | z ◦ ∩ relint (cid:8) s (cid:12)(cid:12) z ◦ ∈ P Z ( s ) (cid:9) = fix T ξ | z ◦ ∩ relint (cid:0) ˆ N Z ( z ◦ ) + { z ◦ } ) ∩ (cid:8) s (cid:12)(cid:12) P Z ( s ) = { z ◦ } (cid:9) (cid:54) = ∅ . D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI
Proof of Lemma 4.5.
We consider any ξ > (cid:15) > s ◦ ∈ fix T ξ such that theprojection P Z is single-valued, continuous and firmly nonexpansive on B (cid:15) ( s ◦ ). Werecall the definition of T ξ in (10a). Clearly T ξ is single-valued and continuous on B (cid:15) ( s ◦ )by virtue of the projection. It remains to show the nonexpansiveness of T ξ on B (cid:15) ( s ◦ ).By [4, Definition 4.1 (ii), p. 59], T ξ is nonexpansive on B (cid:15) ( s ◦ ) if (cid:107) T ξ ( s ) − T ξ ( t ) (cid:107) ≤(cid:107) s − t (cid:107) for all s, t ∈ B (cid:15) ( s ◦ ). For any s, t , we have that (cid:107) T ξ ( s ) − T ξ ( t ) (cid:107) = (cid:107) ( I − W M ξ )( s − t ) (cid:107) (20) + 2 (cid:104) W (cid:62) ( I − W M ξ )( s − t ) , P Z ( s ) − P Z ( t ) (cid:105) + (cid:107) W ( P Z ( s ) − P Z ( t )) (cid:107) . To show (20) ≤ (cid:107) s − t (cid:107) for all s, t ∈ B (cid:15) ( s ◦ ) we will split (20) into two parts, the por-tions belonging to the range- and nullspace of M ξ . We recall that M ξ is positive semi-definite for ξ satisfying (8a). Therefore, we can consider the eigenvalue decompositionof M ξ , as in (10) and define q i , for i = 1 , . . . , n , to be the normalized eigenvectors of M ξ . We define δ, p ∈ R n as coefficients, such that (cid:80) ni =1 δ i q i = s − t and (cid:80) ni =1 p i q i = P Z ( s ) − P Z ( t ). We further define δ ⊥ := (cid:80) n − mi =1 δ i q i , δ := (cid:80) ni = n − m +1 δ i q i , p ⊥ := (cid:80) n − mi =1 p i q i and p := (cid:80) ni = n − m +1 p i q i , which splits s − t and P Z ( s ) − P Z ( t ) into theirrange and nullspace portions. Additionally, p ⊥ Λ := (cid:80) n − mi =1 λ − i p i q i , for λ i the non-zeroeigenvalues of M ξ . Substituting δ ⊥ , δ , p ⊥ , p and p ⊥ Λ into (20), using the eigenvaluedecomposition of M ξ and W as defined in (10b), we obtain (cid:107) T ξ ( s ) − T ξ ( t ) (cid:107) = (cid:107) δ ⊥ (cid:107) + (cid:107) δ (cid:107) + (cid:104) δ ⊥ , p ⊥ Λ (cid:105) − (cid:104) δ , p (cid:105) + (cid:107) p ⊥ Λ (cid:107) + (cid:107) p (cid:107) . (21)We want to show that (20) = (21) ≤ (cid:107) s − t (cid:107) = (cid:107) δ ⊥ (cid:107) + (cid:107) δ (cid:107) . However, this onlyhas to hold for δ, p that are related via the projection P Z where in particular we wantto exploit its firm nonexpansiveness. The projection P Z being firmly nonexpansive,and nonexpansive, on B (cid:15) ( s ◦ ), means via Definition (A.2), that for all s, t ∈ B (cid:15) ( s ◦ ) (cid:107)P Z ( s ) − P Z ( t ) (cid:107) ≤ (cid:104) s − t, P Z ( s ) − P Z ( t ) (cid:105) , and(FNE) (cid:107)P Z ( s ) − P Z ( t ) (cid:107) ≤ (cid:107) s − t (cid:107) . (NE)By applying the Cauchy-Schwarz inequality, and combining (FNE) and (NE) we get(22) 0 ≤ (cid:107)P Z ( s ) − P Z ( t ) (cid:107) ≤ (cid:104) s − t, P Z ( s ) − P Z ( t ) (cid:105) ≤ (cid:107) s − t (cid:107) . Substituting δ ⊥ , δ , p ⊥ and p into (22), we obtain(23) 0 ≤ (cid:107) p ⊥ (cid:107) + (cid:107) p (cid:107) ≤ (cid:104) p ⊥ , δ ⊥ (cid:105) + (cid:104) p , δ (cid:105) ≤ (cid:107) δ ⊥ (cid:107) + (cid:107) δ (cid:107) . To show that T ξ is nonexpansive it is therefore sufficient to show that for all δ, p thatsatisfy (23) it holds that (21) ≤ (cid:107) δ ⊥ (cid:107) + (cid:107) δ (cid:107) . Notice, that (21) and (23) do notcontain any coupling terms between range- and nullspace portions of δ or p , thereforewe can argue about them separately.i) We consider all δ ⊥ , p ⊥ such that 0 ≤ (cid:107) p ⊥ (cid:107) ≤ (cid:104) p ⊥ , δ ⊥ (cid:105) ≤ (cid:107) δ ⊥ (cid:107) . Using (8b)implies that (cid:107) Λ − (cid:107) ≤ (cid:107) p ⊥ Λ (cid:107) ≤ (cid:107) p ⊥ (cid:107) ≤ (cid:107) δ ⊥ (cid:107) and (cid:104) δ ⊥ , p ⊥ Λ (cid:105) ≤ (cid:107) δ ⊥ (cid:107) .Therefore, (cid:107) δ ⊥ (cid:107) + (cid:104) δ ⊥ , p ⊥ Λ (cid:105) + (cid:107) p ⊥ Λ (cid:107) ≤ (cid:107) δ ⊥ (cid:107) + (cid:107) δ ⊥ (cid:107) + (cid:107) δ ⊥ (cid:107) = (cid:107) δ ⊥ (cid:107) , i.e.,the result holds for the range space portion.ii) We consider all δ , p such that 0 ≤ (cid:107) p (cid:107) ≤ (cid:104) p , δ (cid:105) ≤ (cid:107) δ (cid:107) . It follows that (cid:107) δ (cid:107) − (cid:104) δ , p (cid:105) + (cid:107) p (cid:107) ≤ (cid:107) δ (cid:107) − (cid:104) δ , p (cid:105) + (cid:104) δ , p (cid:105) ≤ (cid:107) δ (cid:107) . Thus, the result alsoholds for the null space portion. OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES
Appendix B. Condition for satisfying (A3).
We give an algorithm to checkwhether Assumption (A3) is satisfied at every point in Z , representing a necessaryand sufficient condition. This conditions relies on the enumeration of all possiblecombinations of active components and their active sets, and can be checked once andoffline for all parameters θ . We consider E := { z ∈ R n | Az = b } , with A ∈ R m × n ,and Z ( θ ) with components Z i ( θ ) defined as(24) Z ( θ ) = I (cid:91) i =1 Z i ( θ ) = I (cid:91) i =1 { z ∈ R n | G i z = g i ( θ ) , F i z ≤ f i ( θ ) } , where G i ∈ R p i × n and F i ∈ R m i × n . The functions f i ( θ ) = F i,θ θ + f i and g i ( θ ) = G i,θ θ + g i depend affinely on a parameter θ ∈ R p . With Z i ( θ ) closed convex polyhedrafor fixed θ ∈ R p and non-empty for some θ .Given a parameter θ and index i ∈ { , . . . , I } . The active set J Z i ( θ ) ( z ) := (cid:8) j ∈{ , . . . , m i } (cid:12)(cid:12) s . t . F ij z = f ij ( θ ) (cid:9) of Z i ( θ ) at z ∈ Z i ( θ ) is defined in the usual way [29,Definition 12.1, p. 308], where F ij denotes the j -th row of F i . For z (cid:54)∈ Z i ( θ ), we define J Z i ( θ ) ( z ) := ∅ . For any active set J ⊆ { , . . . , m i } , F i J denotes the subset of rows of F i that belong to the active set J , f i J is defined in the same way. We further definethe active set J Z ( θ ) ( z ) of Z ( θ ) at z ∈ Z ( θ ), as the collection of all the active sets ofthe components of Z ( θ ), i.e., J Z ( θ ) ( z ) := {J Z i ( θ ) ( z ) } Ii =1 ,The set I Z ⊆ { ,...,I } denotes the set of sets of active components of Z ( θ ), i.e., I ∈ I Z if and only if there exists θ ∈ R p and z ∈ Z ( θ ) such that I ≡ I Z ( θ ) ( z ), where I Z ( θ ) ( z ) denotes the set of active components at z , as defined in Section 4.2. Similarlywe denote by J Z := {J Z i } Ii =1 the collection of the sets of all possible active sets of Z ( θ ), where J Z i := (cid:8) J ⊆ { , . . . , m i } (cid:12)(cid:12) ∃ θ ∈ R p , ∃ z ∈ Z ( θ ) s . t . J ≡ J Z i ( θ ) ( z ) (cid:9) denotes the set of all possible active sets for component i .Assumption (A3) is purely geometric and needs to be satisfied at every point in Z .Algorithm 3 will effectively enumerate all possible combinations of active componentsand their active sets and check for each such combination. This amounts to a finitenumber of checks. Checking a parametric set Z ( θ ), due to the polyhedral structure,requires checking finitely many different non-parametric sets. Algorithm 3
Check of Assumption (A3)
Require: Z ( θ ) as in (24) for each I ∈ I Z , {J i } Ii =1 ∈ J Z do N ← (cid:84) i ∈I { v | v = G i (cid:62) ν + F i (cid:62)J i µ , µ ≥ } (cid:46) construct regular normal cone if N = { } then continue R ← (cid:83) i ∈I { v | G i v = 0 , F i J i v ≤ } (cid:46) construct union of recession cones if N ⊥ ⊆ R then continue else w ← w ∈ N ⊥ \ R ( z, θ ) ← z, θ s . t . z ∈ (cid:84) Ii =1 A i ( θ ) return ( w, z, θ ) (cid:46) return counter-exampleAlgorithm 3 is a combinatorial algorithm and may perform badly for anythingbut small dimensions. However when the set Z is a Cartesian product of sets in8 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI low dimension, the check can be performed separately for each part of the Cartesianproduct. This significantly reduces the “curse of dimensionality”.
Lemma
B.1.
Given a (parametric) set Z ( θ ) as in (24) , then Algorithm 3 termi-nates in a finite number of steps. Either Z ( θ ) satisfies Assumption (A3) for all θ , orthe algorithm returns a valid counter-example.Proof of Lemma B.1. We will show that Algorithm 3 effectively checks Assump-tion (A3) for every θ and every point in Z ( θ ). We will argue, that checking (A3) forevery θ ∈ R p and every z ∈ Z ( θ ) is equivalent to checking it for every set of activeconstraints. Note that J Z the set of active sets of Z ( θ ) is a finite set with at most2 (cid:80) Ii =1 m i elements.In Algorithm 3, we iterate over the elements of I Z and J Z , which means thatthe loop is executed at most a finite number of times. Given a pair I , {J i } Ii =1 wewant to verify, or invalidate, (A3) by considering two sets N and R . The sets N , R are defined solely via the active constraints and do not depend on θ . For each activecomponent i ∈ I we define two auxiliary sets A i ( θ ) := { z | G i z = g i ( θ ) , F i J i z = f i J i ( θ ) , F i J ci z < f i J ci ( θ ) } , F i ( θ ) := { z | G i z = g i ( θ ) , F i J i z ≤ f i J i ( θ ) } , where the set J ci is simply the complement of J i , i.e., the set of constraints that arenot active. For a given θ , A i ( θ ) is the set of points z ∈ Z that belong to the active set J i , whereas F i ( θ ) ⊇ A i ( θ ) is a larger set, that is defined by only the active constraintsand in general is not included in Z . We further define A ( θ ) := (cid:84) Ii =1 A i ( θ ), the setof points that belong to all of the active sets of the active components. By checkingAssumption (A3) for each A ( θ ) we will effectively check it for each z ∈ Z .The conditions to be checked involve the regular normal cone ˆ N Z ( θ ) ( z ). For each θ , z ∈ A ( θ ) the regular normal cone ˆ N Z ( θ ) ( z ) is given as ˆ N Z ( θ ) ( z ) = (cid:84) i ∈I { G i (cid:62) ν + F i (cid:62)J i µ | µ ≥ } , using Proposition A.3 and the fact that normal cones of polyhedralsets can be represented as linear and conic combinations of their halfspaces, see [31,Theorem 6.46, p. 231]. It is independent of θ and z , and only depends the activeconstraints and components. This set, N := ˆ N Z ( θ ) ( z ), is computed in step 2 ofAlgorithm 3. A half-space representation of N can be obtained using Fourier-Motzkinelimination [8, p. 84] for each i ∈ I , after which the intersections are trivial. We nowneed to check two conditions. If N = { } , then the first part of (A3) is satisfied forall z ∈ A ( θ ), and we continue with the next active set. Otherwise, we need to checkwhether there exists an (cid:15) > w ∈ N ⊥ ∩B (cid:15) we have that z + w ∈ Z ( θ ).In order to check this condition, in step 4, we construct the set R as the union ofthe recession cones R i := rec( F i ( θ )) = { v | G i v = 0 , F i J i v ≤ } of F i ( θ ). Finally itremains to show that checking this condition for all z ∈ A ( θ ), in step 5, is equivalentto checking whether the orthogonal complement N ⊥ := { v ∈ R n | (cid:104) v, u (cid:105) = 0 , ∀ u ∈ N } is contained in R . If N ⊥ ⊆ R , consider any z ∈ A ( θ ), an (cid:15) > w ∈ N ⊥ ∩ B (cid:15) . Clearly there exists an i ∈ I such that w ∈ R i . Furthermore, byconstruction of A ( θ ) and R i , we have z + w ∈ Z i ⊆ Z . This means (A3) is satisfiedfor all z ∈ A ( θ ) and we continue with the next active set. If N ⊥ (cid:54)⊆ R , then thereexists a w ∈ N ⊥ \ R . Because N and R i are cones, this is true for all positive scalingsof w . In particular, for any (cid:15) > t > tw ∈ N ⊥ ∩ B (cid:15) and tw (cid:54)∈ R . Analogously to above, it follows that for any z ∈ A ( θ ), z + tw (cid:54)∈ Z i forall i ∈ I , and therefore z + tw (cid:54)∈ Z , which violates (A3). In this case, the algorithmterminates and returns a violating direction w , point z and parameter θ . Because OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES
Appendix C. MacMPEC benchmark problems.
The
MacMPEC library [23]is a collection of almost 200 benchmark problems for mathematical programs withequilibrium constraints (MPECs). It contains 41 MPCCs with linear constraints(including linear complementarity constraints) that can be formulated in the form ofProblem (2) with a positive definite quadratic objective. Formulating such MPCCs inthe form of Problem (2) can lead to a number of regions that is worst-case exponentialin the number of complementarity constraints. Therefore, for 12 of the 41 problemswe were unable to obtain a tractable representation in the form of Problem (2). Theremaining 29 problems were brought into the form of Problem (2). However, in onlytwo of the cases, qpec1 and qpec2 , we managed to extract the Cartesian productstructure, i.e.
N >
1, and in all cases, we used E = R n . We solved each problemusing the proposed method with ξ = 2 λ max ( H ), (cid:15) tol = 10 − and initial iterate s = 0.For all of these problems, Algorithm 1 converged and returned solutions close to theglobal minimum.In Table 2, for each of the 41 problems, we report the problem dimensions: thenumber of decision variables n , the number of regions m and the “horizon” N . Wealso report the objective value of the found local minima compared to the objectivevalue of the global optimum, and how many iterations were needed for each problemto converge to a solution with 10 − consensus tolerance. Each of the 29 problems thatcould be tractably represented converged close to the global minimum within a fewiterations, which the exception of scale3 , which needed 6815 iterations to converge.For each problem, we have additionally, checked Assumption (A3) using Algorithm 3.For the problems portfl-i- { } , checking Assumption (A3) was intractabledue to the large number of regions (more than 4000) and decision variables (74). For23 of the remaining 24 problems Assumption (A3) was determined to be satisfied,with Algorithm 3 terminating within a few seconds. For one problem, hs044-i ,Algorithm 3 terminated after 40 . hs044-i and portfl-i- { } , whereAssumption (A3) was violated or could not be checked. Moreover, they converged tolocal minima, as indicated by Theorem 4.2.0 D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARI p r o b l e m d i m e n s i o n s o b j ec t i v e v a l u e n m N o u r m e t h o d o p t i m a l ξ i t e r a t i o n s A ss u m p t i o n ( A ) bard1 § s a t i s fi e d ( < s ) bard1m s a t i s fi e d ( . s ) ex.9.2.1 § s a t i s fi e d ( < s ) ex.9.2.2 s a t i s fi e d ( < s ) ex.9.2.4 . . s a t i s fi e d ( < s ) ex.9.2.5 s a t i s fi e d ( < s ) ex.9.2.6 − − s a t i s fi e d ( . s ) ex.9.2.7 § s a t i s fi e d ( < s ) gauvin s a t i s fi e d ( < s ) hs044-i . . v i o l a t e d ¶ ( . m i n ) jr1 . . s a t i s fi e d ( < s ) jr2 . . s a t i s fi e d ( < s ) kth2 . e − s a t i s fi e d ( < s ) kth3 . . s a t i s fi e d ( < s ) liswet1- { } † –––––––– portfl-i-1 . e − . e − ‡ portfl-i-2 . e − . e − ‡ portfl-i-3 . e − . e − ‡ portfl-i-4 . e − . e − ‡ portfl-i-6 . e − . e − ‡ qpec-100- { } † –––––––– qpec-200- { } † –––––––– qpec1 s a t i s fi e d ( < s ) qpec2 s a t i s fi e d ( < s ) ralphmod † –––––––– scholtes3 . . s a t i s fi e d ( < s ) c o n t i nu e d ... OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES p r o b l e m d i m e n s i o n s o b j ec t i v e v a l u e n m N o u r m e t h o d o p t i m a l ξ i t e r a t i o n s A ss u m p t i o n ( A ) scholtes5 s a t i s fi e d ( . s ) scale1 s a t i s fi e d ( < s ) scale2 s a t i s fi e d ( < s ) scale3 s a t i s fi e d ( < s ) scale4 s a t i s fi e d ( < s ) scale5 s a t i s fi e d ( < s ) sl1 . . s a t i s fi e d ( . s ) T a b l e : R e s u l t s f o r t h e MacMPEC c o ll ec t i o n o f b e n c h m a r k p r o b l e m s [ ]. § t h e p r o b l e m s bard1 , ex.9.2.1 a nd ex.9.2.7 a r ee q u i v a l e n t w h e n f o r m u l a t e d i n t h e f o r m o f P r o b l e m ( ) . † t h e p r o b l e m s liswet1- { } , qpec-100- { } , qpec-200- { } a nd ralphmod c o v e r a t o t a l p r o b l e m i n s t a n c e s a nd w e r e e x c l ud e db e c a u s e w e w e r e un a b l e t o find a t r a c t a b l e r e p r e s e n t a t i o n i n t h e f o r m o f P r o b l e m ( ) du e t o t h e l a r g e nu m b e r o f c o m p l e m e n t a r i t y c o n s t r a i n t s a nd s l a c kv a r i a b l e s . ¶ f o r hs044-i , A l go r i t h m ( ) r e t u r n e d a c o un t e r - e x a m p l e t o A ss u m p t i o n ( A ) w i t h w = ( , , − , ) a t z = ( . , . , . , . ) , s t ill t h e p r o p o s e d m e t h o d c o n v e r g e d t oa n e a r o p t i m a l s o l u t i o n . ‡ f o r t h e p r o b l e m s portfl-i- { } , c h e c k i n g A ss u m p t i o n ( A ) w i t h A l go r i t h m ( ) w a s i n t r a c t a b l e du e t o t h e l a r g e nu m b e r o f r e g i o n s a nd d e c i s i o n v a r i a b l e s . N e v e r t h e l e ss , t h e p r o p o s e d m e t h o d c o n v e r g e d t o n e a r o p t i m a l s o l u t i o n s . D. FRICK, A. GEORGHIOU, J. L. JEREZ, A. DOMAHIDI AND M. MORARIREFERENCES[1]
A. Alessio and A. Bemporad , A Survey on Explicit Model Predictive Control , Springer BerlinHeidelberg, 2009, pp. 345–369.[2]
A. Arce, A. del Real, and C. Bordons , MPC for battery/fuel cell hybrid vehicles includingfuel cell dynamics and battery performance improvement , Journal of Process Control, 19(2009), pp. 1289–1304.[3]
D. Axehill , Integer Quadratic Programming for Control and Communication , PhD thesis,Link¨oping University, 2008.[4]
H. Bauschke and P. Combettes , Convex analysis and monotone operator theory in Hilbertspaces , Springer, 2011.[5]
A. Bemporad and M. Morari , Control of systems integrating logic, dynamics, and con-straints , Automatica, 35 (1999), pp. 407–427.[6]
V. Berinde , Iterative approximation of fixed points , Springer, 2 ed., 2007.[7]
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein , Distributed optimization andstatistical learning via the alternating direction method of multipliers , Foundations andTrends in Machine Learning, 3 (2011), pp. 1–122.[8]
G. Dantzig , Linear programming and extensions , Princeton University Press, 1963.[9]
S. Di Cairano, A. Bemporad, I. Kolmanovsky, and D. Hrovat , Model predictive controlof magnetically actuated mass spring dampers for automotive applications , InternationalJournal of Control, 80 (2007), pp. 1701–1716.[10]
A. Domahidi, A. Zgraggen, M. Zeilinger, M. Morari, and C. Jones , Efficient interiorpoint methods for multistage problems arising in receding horizon control , in IEEE Con-ference on Decision and Control, 2012, pp. 668–674.[11]
D. Frick , Automatic code-generation tool . http://github.com/dafrick/codegen, 2016.[12]
D. Frick, A. Domahidi, and M. Morari , Embedded optimization for mixed logical dynamicalsystems , Computers & Chemical Engineering, 72 (2015), pp. 21–33.[13]
T. Geyer, G. Papafotiou, R. Frasca, and M. Morari , Constrained optimal control of thestep-down dc-dc converter , IEEE Transactions on Power Electronics, 23 (2008), pp. 2454–2464.[14]
A. Hempel , Control of Piecewise Affine Systems Through Inverse Optimization , PhD thesis,ETH Z¨urich, 2015. Dissertation, ETH-Z¨urich, 2015, No. 23222.[15]
R. Henrion and J. Outrata , On calculating the normal cone to a finite union of convexpolyhedra , Optimization, 57 (2008), pp. 57–78.[16]
M. Herceg, M. Kvasnica, C. Jones, and M. Morari , Multi-parametric toolbox 3.0 , in Eu-ropean Control Conference, 2013, pp. 502–510.[17]
J.-H. Hours and C. Jones , A parametric nonconvex decomposition algorithm for real-timeand distributed NMPC , IEEE Transactions on Automatic Control, 61 (2016), pp. 287–302.[18]
B. Houska, J. Frasch, and M. Diehl , An augmented lagrangian based algorithm for dis-tributed nonconvex optimization , SIAM Journal on Optimization, 26 (2016), pp. 1101–1127.[19]
IBM , IBM ILOG CPLEX optimization studio , 2015.[20]
M. Jung , Relaxations and Approximations for Mixed-Integer Optimal Control , PhD thesis,Universit¨at Heidelberg, 2013.[21]
M. N. Jung, C. Kirches, and S. Sager , On Perspective Functions and Vanishing Constraintsin Mixed-Integer Nonlinear Optimal Control , Springer Berlin Heidelberg, 2013, pp. 387–417.[22]
R. Lang , A note on the measurability of convex sets , Archiv der Mathematik, 47 (1986),pp. 90–92.[23]
S. Leyffer , MacMPEC: AMPL collection of MPECs , Argonne National Laboratory, (2009),https://wiki.mcs.anl.gov/leyffer/index.php/MacMPEC.[24]
G. Li and T. Pong , Global convergence of splitting methods for nonconvex composite opti-mization , SIAM Journal on Optimization, 25 (2015), pp. 2434–2460.[25]
A. Liniger, A. Domahidi, and M. Morari , Optimization-based autonomous racing of1:43 scale RC cars , Optimal Control Applications and Methods, 36 (2015), pp. 628–647.[26]
J. L¨ofberg , YALMIP : a toolbox for modeling and optimization in MATLAB , in IEEE Inter-national Symposium on Computer Aided Control Systems Design, 2004, pp. 284–289.[27]
S. Magn´usson, P. C. Weeraddana, M. G. Rabbat, and C. Fischione , On the convergence ofalternating direction lagrangian methods for nonconvex structured optimization problems ,IEEE Transactions on Control of Network Systems, 3 (2016), pp. 296–309.[28]
M. Morari and J. Lee , Model predictive control: past, present and future , Computers &Chemical Engineering, 23 (1999), pp. 667–682.OW-COMPLEXITY METHOD FOR HYBRID MPC WITH LOCAL GUARANTEES [29] J. Nocedal and S. Wright , Numerical optimization , Springer Science+Business Media, 2 ed.,2006.[30]
R. A. Poliquin, R. T. Rockafellar, and L. Thibault , Local differentiability of distancefunctions , Transactions of the American Mathematical Society, 352 (2000), pp. 5231–5249.[31]
R. Rockafellar and R. Wets , Variational analysis , Springer, 1998.[32]
A. Ruszczy´nski , Nonlinear Optimization , Princeton University Press, 2006.[33]
R. Takapoui, N. Moehle, S. Boyd, and A. Bemporad , A simple effective heuristic forembedded mixed-integer quadratic programming , International Journal of Control, (2017),pp. 1–11.[34]
J. Vielma , Mixed integer linear programming formulation techniques , SIAM Review, 57 (2015),pp. 3–57.[35]