[PDF] Shortest Paths in Graphs of Convex Sets

Abstract

Given a graph, the shortest-path problem requires finding a sequence of edges with minimum cumulative length that connects a source to a target vertex. We consider a generalization of this classical problem in which the position of each vertex in the graph is a continuous decision variable, constrained to lie in a corresponding convex set. The length of an edge is then defined as a convex function of the positions of the vertices it connects. Problems of this form arise naturally in road networks, robot navigation, and even optimal control of hybrid dynamical systems. The price for such a wide applicability is the complexity of this problem, which is easily seen to be NP-hard. Our main contribution is a strong mixed-integer convex formulation based on perspective functions. This formulation has a very tight convex relaxation and allows to efficiently find globally-optimal paths in large graphs and in high-dimensional spaces.

Full PDF

11 Shortest Paths in Graphs of Convex Sets

Tobia Marcucci, Jack Umenberger, Pablo A. Parrilo, and Russ Tedrake

Abstract —Given a graph, the shortest-path problem requiresﬁnding a sequence of edges with minimum cumulative length thatconnects a source to a target vertex. We consider a generalizationof this classical problem in which the position of each vertex inthe graph is a continuous decision variable, constrained to lie ina corresponding convex set. The length of an edge is then deﬁnedas a convex function of the positions of the vertices it connects.Problems of this form arise naturally in road networks,robot navigation, and even optimal control of hybrid dynamicalsystems. The price for such a wide applicability is the complexityof this problem, which is easily seen to be NP-hard. Our maincontribution is a strong mixed-integer convex formulation basedon perspective functions. This formulation has a very tight convexrelaxation and allows to efﬁciently ﬁnd globally-optimal paths inlarge graphs and in high-dimensional spaces.

I. I

NTRODUCTION

The Shortest-Path Problem (SPP) is one of the most deeply-studied problems in combinatorial optimization. In its single-source single-target version, this problem asks to ﬁnd a pathof minimum length connecting two prescribed vertices of agraph, with the length of a path being deﬁned as the sum of thelength of its edges. Typically, the edge lengths are ﬁxed scalarsgiven as problem data and the assumptions made on their valuecan have a dramatic impact on the problem complexity [1,Chapters 6–8].In this paper we consider a generalization of the SPP inwhich the edge lengths do not have ﬁxed values but are convexfunctions of continuous variables representing the position ofthe vertices (see Figure 1). More precisely, we consider agraph in which each vertex is paired with a convex set. Thespatial position of a vertex is a continuous decision variableconstrained to lie in the associated convex set. The length of anedge is a generic convex function (e.g. the Euclidean distance)of the position of the vertices it connects. As a consequence,when looking for a path of minimum length, we now havethe extra degree of freedom of optimizing the position of thevertices visited by the path.In the past decades many variants of the SPP with general-ized path length have been studied (see Section I-A). However,to the best of our knowledge, this speciﬁc generalization hasnot been analyzed before, even though problems with thisstructure emerge naturally in many areas. Taking for examplethe canonical scenario in which we seek a shortest pathbetween two cities in a road network, we can imagine thesize of the cities in the map to be nonnegligible with respectto the length of the roads. Modeling this as an ordinary SPPwith one vertex per city might be very conservative, whereasconstructing a purely discrete problem with a precise model of

The authors are with the Massachusetts Institute of Technology,department of Electrical Engineering and Computer Science. E-mail: { tobiam,umnbrgr,parrilo,russt } @mit.edu . X s X t Fig. 1: Example of an SPP in a graph of convex sets. Thedashed blue line is the shortest path from the source set X s tothe target set X t . The position of each vertex along the path(white circles) is allowed to move within the correspondingconvex set. Transitions are allowed only between sets con-nected by an edge of the graph (black arrows).the road network might be too expensive. A natural approachis then to let the transit point in a city be a continuous variableand solve the resulting mixed-integer problem.Another application is robot motion planning, in whichcase each vertex in our graph is paired with a convex regionof space that does not intersect with obstacles (see Sec-tion VIII). Problems such as footstep planning for humanoidrobots on uneven terrain or collision-free motion planning ofautonomous vehicles under dynamic constraints are frequentlytackled via mixed-integer programming [2], [3], [4], [5], [6],[7], [8]. However, our problem formulation is substantiallydifferent as we do not parameterize the discrete state of thesystem with binary variables, but instead we use binaries toselect transitions between regions of the state space. Thisdifferent parameterization leads to slightly larger but, in ourexperience, much stronger Mixed-Integer Convex Programs(MICPs). A similar abstraction allows to reformulate optimalcontrol problems for hybrid dynamical systems [9] as SPPs(see Section VIII-C).The proposed generalization of the SPP is easily seen tobe NP-hard, therefore we do not expect to ﬁnd an exactpolynomial-time algorithm for its solution. In this paper werelax the requirement of polynomial-time solvability, and wepresent a strong mixed-integer convex formulation of theproblem that can be effectively solved to global optimality. Wederive this formulation using perspective functions: a tool fromconvex analysis which in the recent years has seen a multitudeof applications in mixed-integer programming [10], [11], [12],[13] and also in optimal control [14], [15]. Numerical resultsshow that the convex relaxation of our formulation is generallyvery tight, even when working in high-dimensional spacesand with large graphs. Furthermore, in our computational a r X i v : . [ c s . D M ] J a n experience, the proposed MICP outperforms state-of-the-artmixed-integer formulation for optimal control [15]. We alsohighlight that this MICP shares the same structure as theLinear-Programming (LP) formulation of the classical SPP:this feature might allow us to leverage a massive body of worksto devise efﬁcient approximation algorithms (see, e.g., [16]).Preliminary results suggest that the convex relaxation of thisMICP might be well suited, e.g., for the development ofrounding strategies. A. Related generalizations of the shortest-path problem

Many generalizations of the classical SPP have been studiedin the last seventy years. A taxonomy of the literature prior to1984 can be found in [17], according to which our problemwould be classiﬁed as an SPP with “generalized path length.”Within this class of problems, it is worth mentioning [18]where the length of a sequence of K edges is deﬁned re-cursively as a function of the ﬁnal edge and the length of theﬁrst K − edges. This problem admits an efﬁcient solutionalgorithm, provided that some extra assumptions (sufﬁcient forthe dynamic-programming principle to hold) are made. Knuthextended Dijkstra’s algorithm [19] to the context of formallanguage theory in [20]. An improved algorithm for this SPPhas been presented in [21], while [22] expanded the class ofedge costs that these methods can handle. When applied tographs, this more abstract setup yields a generalization of thenotion of length of a path similar to the one from [18].The closest family of problems to the one we considerhere falls under the name of Euclidean SPP (see, e.g., therecent book [23]). This problem requires ﬁnding a continuouspath that connects two points and does not collide with acollection of polygonal obstacles. In two dimensions, theshortest path is a polygonal line whose corners are vertices ofthe obstacles. By constructing a visibility graph, the problemis then reduced to a discrete graph search which is solvable inpolynomial time [24], [25]. In three dimensions or more thisstrategy breaks; in fact, the problem becomes NP-hard [26,Theorem 2.3.2]. An approximation algorithm for the three-dimensional case has been proposed [27]. Efﬁcient algorithmsfor the multidimensional case based on a grid-discretization ofthe space have been presented in [28], [29]. Exact-geometryalgorithms for problems of this nature have been recentlydiscussed in [30].An evident difference between the Euclidean SPP and theproblem we analyze in this paper is that the ﬁrst requires theidentiﬁcation of a continuous path. In our case, we only requirea ﬁnite set of points to lie in appropriate regions of space,without any conditions on the path connecting them. Anotherdifference of fundamental importance concerns the notion oflength we employ. While Euclidean shortest-path algorithmsstrongly exploit the metric structure of the underlying space,the notion of length we use here is much weaker: a convexfunction with extended-real values. As an example, this allowsus to deﬁne the distance between two points as the energyconsumed by a dynamical system to move between them, withthe length being inﬁnite in case the motion is infeasible (seeSection VIII). Finally, the graph structure beneath our problem allows to prevent undesired transitions between regions of freespace, increasing the ﬂexibility of our framework.To conclude, we mention the touring-polygon problemwhich, in its unconstrained version, requires ﬁnding the short-est path between two points that visits a set of polygons in agiven order [31], [23, Chapter 10]. Except for special cases,e.g. convex polygonal regions, this problem is NP-hard [31,Theorem 6]. Similar in spirit are some classical problems incomputational geometry: the safari [32], the zoo-keeper [33],and the watchman route [34] problems.II. P ROBLEM D EFINITION

We start this section with a formal deﬁnition of the gener-alized SPP we study in this paper. Then, in Section II-A, weuse a simple reduction argument to prove the NP-hardness ofthis problem.Let G := ( V, E ) be a directed graph with vertices V andedges E . We assume each edge ( i, j ) ∈ E to connect distinctvertices i (cid:54) = j . For each vertex i ∈ V , we have a compactconvex set X i ⊂ R d and a point x i contained in it. Thelength of an edge ( i, j ) is determined by the location of thepoints x i and x j via the expression (cid:96) ij ( x i , x j ) , where (cid:96) ij is anonnegative closed convex function with extended-real valuesthat attains ﬁnite value in at least one point. Even though wewill refer to (cid:96) ij as the “edge length,” we underline that thisfunction needs not to be a valid metric, and axioms such assymmetry or the triangle inequality are not required to hold. Apath P from a source vertex s ∈ V to a target t ∈ V −{ s } is asequence of distinct vertices ( i k ) Kk =0 such that i = s , i K = t ,and ( i k , i k +1 ) ∈ E for all k = 0 , . . . , K − . We denote the setof edges traversed by the path P as E P := { ( i k , i k +1 ) } K − k =0 .The length of the path P is then deﬁned as (cid:96) P ( x P ) := (cid:88) ( i,j ) ∈ E P (cid:96) ij ( x i , x j ) , where x P := ( x i , . . . , x i K ) .Among the paths P from s to t , we seek one of minimumlength and, in doing that, we are allowed to optimize thelocation of the points x i . Our problem can be compactlydescribed as min P min x P ∈ X P (cid:96) P ( x P ) , (1)where X P := X i × · · · × X i K . Remark 1.

While we explicitly rule out self-edges of theform ( i, i ) for all i ∈ V , a self-transition from a region X i toitself can be easily allowed. We deﬁne an auxiliary vertex i (cid:48) associated with the set X i (cid:48) = X i , and we add to E the edge ( i, i (cid:48) ) together with a copy ( i (cid:48) , j ) of any edge ( i, j ) ∈ E . Ofcourse, this generalizes to the case in which we allow a ﬁnitenumber of self-transitions within a region.Problem (1) generalizes the classical single-source single-target SPP in that the length of the edges is not know a priori,but it depends on the continuous variables x i . The classicalproblem formulation is recovered as a special case when thesets X i are singletons { θ i ∈ R d } . For what concerns the edge length (cid:96) ij , we can, for example,deﬁne it as the Euclidean distance between x i and x j : (cid:96) ij ( x i , x j ) := (cid:107) x j − x i (cid:107) . (2)In this case the optimal location of the points x P will deﬁnea polygonal line connecting x s to x t via a path as straight aspossible (perfectly straight if ( s, t ) ∈ E ). Conversely, lettingthe edge length be the squared 2-norm (cid:96) ij ( x i , x j ) := (cid:107) x j − x i (cid:107) , (3)straight paths might be suboptimal if they require long steps x i k +1 − x i k . By allowing (cid:96) ij to take inﬁnite values, we canenforce hard constraints that couple the position of the points x i and x j (see Section IV-G). This can be further generalizedto model the scenario in which the points x i describe thetime evolution a dynamical system, allowing optimal controlproblems to be stated in the form (1) (see Section VIII). A. Problem complexity

We have seen that problem (1) is a strict generalizationof the classical SPP; more precisely, because of the assump-tions (cid:96) ij ≥ , the SPP with nonnegative edge lengths. Thisfamous problem can be solved in polynomial-time using, e.g.,Dijkstra’s algorithm [19], or more efﬁcient implementations ofit [35]. The same cannot be expected for (1) since, as shownbelow, this problem is NP-hard.Recall that an s - t path is called Hamiltonian if it visits eachvertex in V , and a graph is said to be Hamiltonian if it containssuch a path. The directed Hamiltonian-Path Problem (HPP)asks if a directed graph G is Hamiltonian. As an example, thegraph in Figure 1 is not Hamiltonian. Lemma 1.

The directed HPP is reducible to problem (1) inpolynomial time.Proof.

We construct an instance of (1) which shares the samegraph G as the given HPP. We let the source X s := { θ s ∈ R d } and target X t := { θ t (cid:54) = θ s } sets be single points, while we letall the other regions X i be large enough to contain the linesegment Γ connecting θ s and θ t . We deﬁne the edge lengthas in (3). At the optimum, the points x P are arranged on thesegment Γ and have equal Euclidean distance γ := (cid:107) x i k +1 − x i k (cid:107) = (cid:107) θ t − θ s (cid:107) /K, for all k = 0 , . . . , K − . The cost associated with thisarrangement is Kγ = (cid:107) θ t − θ s (cid:107) /K . Since this valuedecreases with the number K of edges traversed by the optimalpath P , whenever possible, K will attain the maximum value | V | − . We conclude that the graph G is Hamiltonian ifand only if the solution P of this instance of problem (1) isHamiltonian. Synthesizing this instance, as well as verifyingif K = | V | − , takes polynomial time. Theorem 1.

Problem (1) is NP-hard.Proof.

This follows from the directed HPP being NP-complete [36], [1, Theorem 8.11]. Any strictly convex function of x j − x i would be equally valid. Remark 2.

The hardness of problem (1) is not related to thedimension d of the space in which the sets X i live. In fact,the reduction argument in Lemma 1 holds for any d ≥ .To sum up, problem (1) encompasses two classical problemsin combinatorial optimization: the SPP with nonnegative edgelengths and the HPP. The ﬁrst is solvable in polynomial timeand is recovered as the sets X i become zero dimensional, thesecond is NP-complete and associated with sets X i of large-enough volume. We then expect the volume of these sets toplay a crucial role in determining whether a good solution forproblem (1) can be found quickly.III. M IXED -I NTEGER C ONVEX F ORMULATION

We take two steps to reformulate problem (1) as an MICP.First we extend the LP formulation of the classical SPPto the generalized setting described in Section II: this stepinvolves the use of perspective functions and yields a bi-linear program, i.e., a program whose nonconvexity comesexclusively from bilinear equality constraints. Successively weconvert this bilinear program into an MICP. All the necessarybackground on the classical SPP and on perspective functionsis introduced. For several special cases and examples of theupcoming reformulation we refer the reader to Section IV.In the following, we let O i := { j : ( i, j ) ∈ E } and I i := { j : ( j, i ) ∈ E } be the set of vertices connected to i ∈ V byoutgoing and incoming edges, respectively. For two vertices i and j , we let δ ij := 1 if i = j and δ ij := 0 if i (cid:54) = j . A. Bilinear formulation

In the classical SPP the edge lengths are nonnegative ﬁniteconstants c ij , and the problem is typically formulated as min (cid:88) ( i,j ) ∈ E c ij ϕ ij (4a) s . t . (cid:88) j ∈ O i ϕ ij − (cid:88) j ∈ I i ϕ ji = δ si − δ ti , ∀ i ∈ V, (4b) ϕ ij ≥ , ∀ ( i, j ) ∈ E. (4c)In this LP the shortest path is parameterized by the decisionvariables ϕ ij , whose role is to take unit value if and only ifthe edge ( i, j ) is traversed by the shortest path. To this end, itis not necessary to explicitly enforce the integrality constraint ϕ ij ∈ { , } : all the basic feasible solutions of 4 can be shownto have binary value and the addition of this constraint wouldnot affect the optimal value of the problem. The multiplicationof c ij by ϕ ij in (4a) ensures that only the edges along theshortest path contribute to the cost. Interpreting ϕ ij as theunits of ﬂow traversing the edge ( i, j ) , constraint (4b) is astandard conservation of ﬂow: one unit of ﬂow is injected fromthe source and ejected from the target, while the incomingand outgoing ﬂows coincide for all the other vertices. For abinary solution, this guarantees that the set of edges for which ϕ ij = 1 actually connects the source to the target. That theseedges identify a valid path (i.e., a sequence of distinct vertices)is ensured only at optimality: since c ij is nonnegative, visitingthe same vertex more than once we would incur a nonnegativecost without making any progress towards the target. A natural extension of the LP formulation 4 to the SPP de-scribed in Section II yields the following nonconvex program: min (cid:88) ( i,j ) ∈ E (cid:96) ij ( x i , x j ) ϕ ij (5a) s . t . x i ∈ X i , ∀ i ∈ V, (5b)constraints of problem (4) . (5c)Note that here both the ﬂows ϕ ij and the vertex locations x i are decision variables, and the product of the edge length (cid:96) ij ( x i , x j ) by ϕ ij in (5a) makes this problem nonconvex ingeneral. Note also that, for the regions X i not visited by theshortest path, the position of the points x i is a free decisionvariable.As stated, problem (5) has a small ﬂaw. When ϕ ij = 0 ,we would always want the cost contribution of the edge ( i, j ) in (5a) to be zero. However, as opposed to the classical SPP,in our problem formulation the edge lengths (cid:96) ij are allowedto take inﬁnite value, and the product in (5a) is undeﬁnedif (cid:96) ij ( x i , x j ) = ∞ and ϕ ij = 0 . We momentarily put asidethis technical issue and postpone its resolution to the nextsubsection.As a second step towards our MICP formulation, we movethe nonconvexity of problem (5) from the objective (5a) to theconstraints by deﬁning two auxiliary variables per edge: y ij := ϕ ij x i , z ij := ϕ ij x j . (6)Momentarily assuming ϕ ij > , we divide and multiply thearguments of (cid:96) ij by ϕ ij to restate the objective (5a) as (cid:88) ( i,j ) ∈ E (cid:96) ij ( y ij /ϕ ij , z ij /ϕ ij ) ϕ ij . (7)For ϕ ij > , the ( i, j ) th addend in (7) is recognized to bethe perspective of the function (cid:96) ij [37, Section 3.2.6]. Theperspective operation preserves convexity: the convexity of (cid:96) ij is hence inherited by the addends in (7), which are jointlyconvex in ( ϕ ij , y ij , z ij ) . The nonconvexity of our problemis now concentrated in the bilinear equations (6), whoseconvexiﬁcation is the object of Section III-C. Before that, letus show how we can formally handle the case in which someof the ﬂows ϕ ij are zero. B. Zero ﬂows in the bilinear formulation

When a ﬂow variable ϕ ij is zero, the objective (7) isundeﬁned: this issue is easily ﬁxed by extending the domainof the perspective operation. (The deﬁnition below mightappear cumbersome for numerical calculations but, as shownin Appendix A and Section IV, in most common cases it yieldssimple expressions readily amenable to standard optimizationsolvers.) Deﬁnition 1.

Let f : R n → R m ∪ {∞} m be a closed convexfunction, and let ¯ x ∈ R n be any point such that f (¯ x ) is ﬁnite.We deﬁne the perspective of the function f as ˜ f ( λ, x ) :=  λf ( x/λ ) if λ > τ ↓ τ f (¯ x + x/τ ) if λ = 0 ∞ if λ < , where the value of the limit operation can be shown to beindependent of the point ¯ x .Importantly, for a scalar-valued function, this “extended”perspective operation still preserves convexity [38, Proposi-tions IV.2.2.1 and IV.2.2.1].With this tool at our disposal, we rephrase (5) as the bilinearprogram min (cid:88) ( i,j ) ∈ E ˜ (cid:96) ij ( ϕ ij , y ij , z ij ) (8a) s . t . y ij = ϕ ij x i , ∀ ( i, j ) ∈ E, (8b) z ij = ϕ ij x j , ∀ ( i, j ) ∈ E, (8c) x i ∈ X i , ∀ i ∈ V, (8d) (cid:88) j ∈ O i ϕ ij − (cid:88) j ∈ I i ϕ ji = δ si − δ ti , ∀ i ∈ V, (8e) ϕ ij ≥ , ∀ ( i, j ) ∈ E. (8f)When all the ﬂows ϕ ij are strictly positive, the objective (8a) isequal to (7) by deﬁnition, and this bilinear program is exactlythe problem we derived above. On the other hand, when a ﬂow ϕ ij is zero, problem (8) does not raise any issue: the ( i, j ) thaddend in (8a) is well deﬁned and correctly evaluates to zero,even when (cid:96) ij ( x i , x j ) = ∞ . To see this, notice that ϕ ij = 0 implies y ij = z ij = 0 by (8b) and (8c), and Deﬁnition 1 reads ˜ (cid:96) ij (0 , ,

0) = lim τ ↓ τ (cid:96) ij (¯ x i + 0 /τ, ¯ x j + 0 /τ ) = 0 , where ¯ x i and ¯ x j are such that (cid:96) ij (¯ x i , ¯ x j ) is ﬁnite.Before concluding that (8) is a valid formulation of ourgeneralized SPP, we are left to verify that the integralityproperty of the LP formulation (4) is actually inherited bythis bilinear program. Proposition 1.

The addition of the integrality constraints ϕ ij ∈ { , } for all ( i, j ) ∈ E does not affect the optimalvalue of problem (8) .Proof. Let (cid:96) (cid:63) be the optimal value of (8) and (cid:96) (cid:63) { , } ≥ (cid:96) (cid:63) beits cost after the addition of the integrality constraints. If (8) isinfeasible, we immediately have ∞ = (cid:96) (cid:63) = (cid:96) (cid:63) { , } . Hence, weassume (cid:96) (cid:63) to be ﬁnite. We let x (cid:63)i be the optimal location of thevertices obtained by solving (8). We ﬁx x i = x (cid:63)i in (8), and letonly the ﬂows ϕ ij be decision variables. The resulting problemis equivalent to the LP (4), provided that we deﬁne c ij := (cid:96) ij ( x (cid:63)i , x (cid:63)j ) if this value is ﬁnite and we remove the edge ( i, j ) from the graph G if (cid:96) ij ( x (cid:63)i , x (cid:63)j ) = ∞ . The cost of this LP mustequal (cid:96) (cid:63) , otherwise our solution of (8) would not be optimal. More precisely, the one deﬁned is the closure of the perspective functionof f ; the perspective function is typically deﬁned to be inﬁnite for λ = 0 [38,Section IV.2.2]. Since in this paper we only work with the former, there isno risk of misunderstanding. Furthermore, by the integrality property of (4), we can assumethe optimal ﬂows ϕ (cid:63)ij of this LP to be binary. Paired with thevariables x (cid:63)i , the ﬂows ϕ (cid:63)ij yield a feasible solution with cost (cid:96) (cid:63) for problem (8) subject to the integrality constraints. Thisimplies (cid:96) (cid:63) ≥ (cid:96) (cid:63) { , } and, consequently, (cid:96) (cid:63) { , } = (cid:96) (cid:63) . C. Convexiﬁcation of the bilinear formulation

The bilinear equalities (8b) and (8c) make it prohibitiveto solve problem (8) directly. Here we reformulate theseequalities as mixed-integer convex constraints, allowing toeffectively solve our generalized SPP (1) to global optimalityvia branch-and-bound techniques.The plan is to replace (8b) and (8c) with a collection ofconvex constraints. The role of these convex constraints istwofold: • When the ﬂows ϕ ij take binary values, they must ensurethat the bilinear equations (8b) and (8c) are correctlyenforced. This guarantees that the MICP we constructis actually a valid problem formulation. • When the ﬂows ϕ ij are not integral, they must envelopthe solution set of problem (8) (i.e. all the feasible pointswhich we cannot guarantee to be suboptimal beforehand)as tightly as possible. This increases the efﬁciency of ourformulation.The family of convex constraints we derive below is sufﬁcientto achieve the ﬁrst of these two goals. Section V is devoted tothe generation of additional convex constraints that increasethe tightness of the proposed MICP.The next lemma describes an algorithmic procedure toconstruct a convex envelope around the solution set of prob-lem (8). In its statement, we let Φ i := a + (cid:88) j ∈ O i b j ϕ ij + (cid:88) j ∈ I i c j ϕ ji . be a generic scalar afﬁne combination of the ﬂows traversingvertex i ∈ V . Lemma 2.

Any feasible solution of problem (8) that satisﬁes Φ i = 0 must also verify the linear equality constraint ax i + (cid:88) j ∈ O i b j y ij + (cid:88) j ∈ I i c j z ji = 0 . (9) Furthermore, any feasible solution of (8) that satisﬁes Φ i ≥ must also verify the convex constraint ax i + (cid:88) j ∈ O i b j y ij + (cid:88) j ∈ I i c j z ji ∈ Φ i X i . (10) Proof.

Expanding the left-hand side of the identity Φ i x i =Φ i x i , and using the bilinear constraints (8b) and (8c), we getthe bilinear equality ax i + (cid:88) j ∈ O i b j y ij + (cid:88) j ∈ I i c j z ji = Φ i x i . When Φ i = 0 , we obtain the linear equality (9). If Φ i ≥ ,the inclusion x i ∈ X i from (8d) immediately implies (10).That (10) is a convex constraint (jointly in the ﬂows and the“spatial” variables x i , y ij , and z ji ) follows from the fact thatany constraint of the form x ∈ λS is jointly convex in λ and x , provided that S is convex and λ ≥ . This can be veriﬁedby applying the deﬁnition of a convex set to { ( λ, x ) : λ ≥ , x = λy, y ∈ S } .Lemma 2 leverages the bilinearities (8b) and (8c) to translateany equality (or inequality) between the ﬂows incident to avertex into a corresponding “spatial” linear equality (9) (orconvex constraint (10)). From every relation of the form Φ i =0 or Φ i ≥ that is guaranteed to hold at optimality of (8), andwhich is not trivially implied by the ones we already used, wecan then generate a spatial constraint that tightens our convexenvelope around the solution set of problem (8).We now apply Lemma 2 to derive a family of convexconstraints sufﬁcient for the bilinear constraints (8b) and (8c)to be correctly enforced in case of a binary ﬂow. We start fromthe trivial afﬁne combinations Φ i := ϕ ij and Φ i := 1 − ϕ ij for all i ∈ V and j ∈ O i , which are certainly nonnegative atoptimality of problem (8). Applying (10) we get the convexconstraints y ij ∈ ϕ ij X i and x i − y ij ∈ (1 − ϕ ij ) X i . Forconvenience, we deﬁne the set Λ i := { ( ϕ, x, y ) : ϕ ∈ [0 , , y ∈ ϕX i , x − y ∈ (1 − ϕ ) X i } and we rewrite the latter convex constraints as ( ϕ ij , x i , y ij ) ∈ Λ i , ∀ ( i, j ) ∈ E. (11)We will use the following property of the set Λ i multiple times. Proposition 2.

The projection of the set Λ i onto the space ofthe x variables coincides with X i .Proof. The inclusion X i ⊆ proj x Λ i is veriﬁed intersecting Λ i with the hyperplane ϕ = 0 . For the reverse direction, assume x ∈ proj x Λ i . Then there exist ϕ ∈ [0 , , y , and x , x ∈ X i such that y = ϕx and x − y = (1 − ϕ ) x . Summing thesetwo equations we get x = ϕx + (1 − ϕ ) x ∈ X i , where theinclusion follows from the convexity of X i .We repeat the operation above inverting the roles of thevertices i and j , and considering i as a vertex incoming j . Welet Φ j := ϕ ij ≥ and Φ j := 1 − ϕ ij ≥ for all j ∈ V and i ∈ I j . Using (10), these result in the convex constraints ( ϕ ij , x j , z ij ) ∈ Λ j , ∀ ( i, j ) ∈ E. (12)For a binary ﬂow, the two convex constraints (11) and (12)are already sufﬁcient to correctly enforce the bilinear equa-tions (8b) and (8c). When ϕ ij = 0 , conditions (8b) and (8c)require the auxiliary variables y ij and z ij to collapse to zero,so that the ( i, j ) th cost addend vanishes. Correctly, condi-tion (11) yields y ij = 0 as well as the redundant condition x i ∈ X i . Similarly, from (12), we have z ij = 0 and x j ∈ X j .When ϕ ij = 1 , the bilinear constraints require the variables y ij and z ij to match x i and x j , respectively, and ensure that ( i, j ) th cost addend takes the value (cid:96) ij ( x i , x j ) . Accordingly,from (11) and (12) we get y ij = x i ∈ X i and z ij = x j ∈ X j . Replacing the bilinear equations (8b) and (8c) with theconvex constraints (11) and (12), we then obtain the followingmixed-integer convex formulation of problem (1): min (cid:88) ( i,j ) ∈ E ˜ (cid:96) ij ( ϕ ij , y ij , z ij ) (13a) s . t . ( ϕ ij , x i , y ij ) ∈ Λ i , ∀ ( i, j ) ∈ E, (13b) ( ϕ ij , x j , z ij ) ∈ Λ j , ∀ ( i, j ) ∈ E, (13c) (cid:88) j ∈ O i ϕ ij − (cid:88) j ∈ I i ϕ ji = δ si − δ ti , ∀ i ∈ V, (13d) ϕ ij ∈ { , } , ∀ ( i, j ) ∈ E. (13e)Notice that we dropped the inclusion (8d). This is allowed byProposition 2: assuming, without loss of generality, that eachvertex is incident to at least one edge, condition (8d) is impliedby (13b) and (13c), even when the ﬂows ϕ ij are not integer. D. Comments on the mixed-integer convex program

Overall, the size of problem (13) scales bilinearly with thenumber of vertices and edges and with the dimension d ofthe space in which the sets X i live. More precisely, we have | E | binary variables and O ( d ( | V | + | E | )) continuous variables.Assuming the number of constraints from (13b) and (13c) tobe independent of d , the constraints are O ( | V | + | E | ) .The convex relaxation of (13) is obtained simply by drop-ping condition (13e) from the problem formulation. Note that,in contrast to the bilinear program (8), we generally expect tohave a nonzero relaxation gap between the optimal values ofthe MICP (13) and of its convex relaxation.As for the classical LP formulation (4), the nonnegativityof (cid:96) ij ensures that the shortest path identiﬁed by the MICPdoes not contain cycles. However, the same is not necessarilytrue for the convex relaxation of (13), whose optimal solutionmight involve cycles with nonzero ﬂow. This is analyzed morein depth in Section V-B where we propose a set of additionalconstraints that partially counteract this phenomenon.IV. C OMMON S PECIAL C ASES

We now discuss some special cases of the procedure pre-sented in Section III. We analyze several common choices forthe sets X i and the edge-length functions (cid:96) ij , providing ex-plicit expressions for the various components of problem (13).The following property of the perspective operation willallow to derive a functional description of the set Λ i from afunctional description of X i . Proposition 3.

Let the function f be deﬁned as in Deﬁnition 1.Assume the set S to be bounded and to admit the functionaldescription S = { x : f ( x ) ≤ } . For a nonnegative scalar λ ,we have λS = { x : ˜ f ( λ, x ) ≤ } . Proof.

We denote the set on the right-hand side as ˜ S ( λ ) . For λ = 0 we have λS = { } and, using Deﬁnition 1, ˜ S (0) = { x : lim τ ↓ τ f (¯ x + x/τ ) ≤ } . The latter set can be recognizedto be the recession cone of S , which equals { } since S isbounded. For λ > , we have ˜ S ( λ ) = { x : λf ( x/λ ) ≤ } = { x : f ( x/λ ) ≤ } = { λy : f ( y ) ≤ } = λS . A. When the sets X i are singletons We start by showing that when the sets X i are singletonsthe MICP (13) is equivalent to the LP formulation (4) of theclassical SPP. The valuable integrality property of this LP, forwhich all basic feasible ﬂows ϕ ij are binary, is hence inheritedby our formulation in this limiting case.Assuming X i := { θ i } for all i ∈ V , we have Λ i = { ( ϕ, x, y ) : ϕ ∈ [0 , , y = ϕθ i , x − y = (1 − ϕ ) θ i } = { ( ϕ, x, y ) : ϕ ∈ [0 , , y = ϕθ i , x = θ i } . Therefore, constraints (13b) and (13c) can be solved out to get x i = θ i for all i ∈ V , as well as y ij = ϕ ij θ i and z ij = ϕ ij θ j for all ( i, j ) ∈ E . Using these values, the ( i, j ) th cost addendin (13a) becomes ˜ (cid:96) ij ( ϕ ij , ϕ ij θ i , ϕ ij θ j ) = (cid:40) ϕ ij (cid:96) ij ( θ i , θ j ) if ϕ ij > if ϕ ij = 0 . Assuming each constant edge length c ij := (cid:96) ij ( θ i , θ j ) to beﬁnite, otherwise we can just remove the edge ( i, j ) from theproblem, the latter equals c ij ϕ ij . Problem (13) is then reducedto the LP (4) with the additional integrality requirement (13e),which we know is redundant in this case. B. When the sets X i are polytopes Assume the sets X i are polytopes of the form { x : A i x ≤ b i } . To derive an explicit description of the set Λ i , we useProposition 3 and the expression of the perspective of an afﬁnefunction from Appendix A-C. We obtain the polytope Λ i = { ( ϕ, x, y ) : ϕ ∈ [0 , , A i y ≤ b i ϕ,A i ( x − y ) ≤ b i (1 − ϕ ) } . C. When the sets X i are afﬁne transformations of the unit ball Let X i := { x : (cid:107) A i x + b i (cid:107) ≤ } . Using Proposition (3) andthe formula from Appendix A-D, we have Λ i = { ( ϕ, x, y ) : ϕ ∈ [0 , , (cid:107) A i y + b i ϕ (cid:107) ≤ ϕ, (cid:107) A i ( x − y ) + b i (1 − ϕ ) (cid:107) ≤ (1 − ϕ ) } . For a p -norm with p ∈ { , ∞} the set Λ i is a polytope, for p = 2 it is the intersection of two Second-Order Cones (SOCs). D. When the edge length (cid:96) ij is constant Assume the edge-length function to be constant and ﬁniteas in the classical SPP: (cid:96) ij ( x i , x j ) := c ij . Using the formulafrom Appendix A-A, the addends in (13a) become c ij ϕ ij ,and the spatial variables x i , y ij , and z ij in problem (13)become free decision variables. As in the case of singletonsets X i from Section (IV-A), problem (13) simpliﬁes to theLP formulation (4) of the classical SPP. E. When the edge length (cid:96) ij is positively homogeneous Assume the edge length (cid:96) ij to be positively homogeneous,i.e., (cid:96) ij ( ax i , ax j ) = a(cid:96) ij ( x i , x j ) for all x i , x j , and a ≥ . Anexample of such a function is (cid:96) ij ( x i , x j ) := (cid:107) A ij x i + B ij x j (cid:107) ,from which (2) is recovered when the norm is the -norm and B ij := − A ij := I . As shown in Appendix A-B, in this specialcase the addends in (13a) verify ˜ (cid:96) ij ( ϕ ij , y ij , z ij ) = (cid:96) ij ( y ij , z ij ) . (14)In case of a p -norm, using slack variables, this is implementedas a linear objective subject to linear constraints if p ∈ { , ∞} or subject to SOC constraints if p = 2 . F. When the edge length (cid:96) ij is a positive semideﬁnitequadratic form Assume the edge length to be deﬁned as (cid:96) ij ( x i , x j ) := (cid:107) A ij x i + B ij x j (cid:107) . Notice that the edge length (3) is recov-ered as a special case for B ij := − A ij := I . FollowingAppendix A-E, for ϕ ij > the addends in (13a) become ˜ (cid:96) ij ( ϕ ij , y ij , z ij ) = (cid:107) A ij y ij + B ij z ij (cid:107) /ϕ ij , (15)while for ϕ ij = 0 we have ˜ (cid:96) ij (0 , y ij , z ij ) = (cid:40) if A ij y ij + B ij z ij = 0 ∞ otherwise . (16)The latter expressions can be modeled using a SOC con-straint. We introduce a nonnegative slack variable l ij per edge,we replace the objective (13a) with (cid:80) ( i,j ) ∈ E l ij , and we addthe rotated SOC constraint ϕ ij l ij ≥ (cid:107) A ij y ij + B ij z ij (cid:107) . (17)For ϕ ij > , the slack l ij is forced by the cost to coincidewith the right-hand side of (15). For ϕ ij = 0 , l ij is pushedto zero which, recalling that (13b) and (13c) imply y ij = z ij = 0 whenever ϕ ij = 0 , coincides with the right-hand sideof (16). We conclude that (17) models the two expressionsabove correctly. G. When the edge length (cid:96) ij enforces a hard constraint Imagine we want to couple the position of the two points x i and x j via a constraint of the form ( x i , x j ) ∈ X ij , with X ij closed and convex. One way to proceed is to deﬁne (cid:96) ij such that (cid:96) ij ( x i , x j ) = ∞ if ( x i , x j ) does not belong to X ij . Following Appendix A-F, we see that this results in thefollowing extra constraint for problem (13): ( y ij , z ij ) ∈ ϕ ij X ij . (18)When ϕ ij = 0 , constraints (13b) and (13c) imply y ij = z ij =0 , and condition (18) is trivially veriﬁed. When ϕ ij = 1 ,from (13b) and (13c) we have y ij = x i and z ij = x j , and (18)becomes ( x i , x j ) ∈ X ij as desired.With the hope of increasing the tightness of the convexrelaxation, one could think of preprocessing the MICP andremove the edge ( i, j ) from the graph if a transition fromvertex i to vertex j is never feasible, i.e., if the intersection of X i × X j and X ij is empty. This turns out to be unnecessary. Proposition 4.

Assume that ( X i × X j ) ∩ X ij = ∅ , thenany feasible solution of the convex relaxation of (13) , subjectto (18) , is such that ϕ ij = 0 .Proof. Assume ϕ ij > . By (13b) and (13c) we have y ij /ϕ ij ∈ X i and z ij /ϕ ij ∈ X j , whereas (18) implies ( y ij /ϕ ij , z ij /ϕ ij ) ∈ X ij . A contradiction.V. F ORMULATION S TRENGTHENING

In Section III-C we have constructed the MICP formula-tion (13) of our generalized SPP by replacing the bilinearconstraints (8b) and (8c) with their convex relaxations (13b)and (13c). Even though these convex constraints are sufﬁcientto ensure the correctness of the MICP, they envelop the solu-tion set of our problem very loosely, making our formulationweak. In this section we analyze two simple constraints thatsubstantially increase the tightness of the convex relaxation ofour MICP.

A. Spatial counterpart of the conservation-of-ﬂow constraint

After the trivial inequalities ϕ ij ≥ and − ϕ ij ≥ exploited in Section III-C, the next natural candidate to takethe role of the afﬁne combination Φ i in Lemma 2 is thedifference between the outgoing and incoming ﬂows of vertex i ∈ V . According to the conservation of ﬂow (8e), this valueis always zero at optimality of problem (8).Applying (9), we obtain d | V | linear equality constraints (cid:88) j ∈ O i y ij − (cid:88) j ∈ I i z ji = δ si x s − δ ti x t , ∀ i ∈ V. (19)At optimality of the MICP, these simplify to trivial identi-ties but, for noninteger ﬂows, they dramatically increase thestrength of our formulation. A toy example that illustrates thisclaim is given in Section IX-A. B. Degree constraints

The second group of constraints we analyze are degreeconstraints, i.e., constraints on the maximum amount of ﬂowthat can go though a vertex [39].As discussed in Section III, an optimal solution of theMICP (13) identiﬁes a sequence of edges that never visitsthe same vertex twice. More precisely, at optimality of (13),the following conditions are always veriﬁed: (cid:88) j ∈ O i ϕ ij ≤ − δ ti , (cid:88) j ∈ I i ϕ ji ≤ − δ si , ∀ i ∈ V. (20)These conditions need not to hold for the convex relaxationof (13), whose optimal solution might involve a nonzero ﬂowthrough the edges of a cycle and vertices with incomingﬂow greater than one; a simple example of this behavioris illustrated in Section IX-B. Enforcing (20) we explicitlyrule out this undesired behavior, increasing the strength ofour MICP. Note that it is not necessary to enforce boththe conditions in (20): one is implied by the other and theconservation of ﬂow (13d). We have then a total of | V | additional linear inequality constraints for our problem. Using Lemma 2, also the degree constraints (20) can betranslated into spatial constraints. Applying (10), the ﬁrst ofthe two inequalities in (20) yields the convex constraint x i − δ ti x t − (cid:88) j ∈ O i y ij ∈  − δ ti − (cid:88) j ∈ O i ϕ ij  X i , ∀ i ∈ V. Even if this constraint is not implied by the ones we alreadyhave, its addition to our problem formulation did not, empiri-cally, yield a performance increase for the problem instancesconsidered in the numerical experiments of this paper. Forthis reason, and in order to simplify the derivations below, wedecide not to include it in our formulation.VI. T HE D UAL P ROGRAM

In this section we derive the dual of the convex relaxationof the MICP (13), subject to the tightening constraints (19)and (20). This helps us gaining a deeper understanding of ourproblem formulation, and it is useful to prove bounds on itsoptimal value (see Section VII).We start by deﬁning the main tools employed in this section.

Deﬁnition 2.

For a function f : R n → R ∪ {∞} , we call f ∗ ( x ) = sup y ( x (cid:62) y − f ( y )) the conjugate function of f . Deﬁnition 3.

For a set S ⊆ R n , we deﬁne the indicatorfunction of S as ι S ( x ) = (cid:40) if x ∈ S ∞ if x (cid:54)∈ S .

Deﬁnition 4.

For a set S ⊆ R n , we deﬁne the support function of S as σ S ( x ) = sup y ∈ S x (cid:62) y. The conjugate function f ∗ is always convex. If the set S isconvex, so is the indicator function ι S . The support functionis immediately recognized to be the conjugate of the indicatorfunction ( σ S ( x ) = ι ∗ S ( x ) ) and, as such, it is always convex.Conjugate functions allow to easily derive the dual of aproblem with convex objective and linear constraints only,such as min x,y f ( x ) (21a) s . t . Ax + By = c, (21b) F x + Gy ≤ h. (21c)Retracing the steps from [37, Section 5.1.6], the dual ofproblem (21) is easily seen to be max π,ρ − f ∗ ( − A (cid:62) π − F (cid:62) ρ ) − c (cid:62) π − h (cid:62) ρ (22a) s . t . B (cid:62) π + G (cid:62) ρ = 0 , (22b) ρ ≥ , (22c)where π is the Lagrange multiplier for the equality con-straint (21b) and ρ for the inequality (21c). Our objective function (13a) is convex, and among theconstraints of (13), together with (19) and (20), the onlynonlinear conditions are (13b) and (13c). We then movethese two constraints to the objective function by adding theirindicator functions as penalties: (cid:88) ( i,j ) ∈ E (˜ (cid:96) ij ( ϕ ij , y ij , z ij ) + ι Λ i ( ϕ ij , x i , y ij )+ ι Λ j ( ϕ ij , x j , z ij )) . (23)Problem (13) is unchanged, but it now consists in the min-imization of a convex function subject to linear constraintsonly. The recipe (22) can be then applied directly.There are a couple of additional observations that facilitatethe derivation of our dual: • Following (22a), we would need to derive the conjugateof the primal objective (23). This would not yield a neatexpression since, in general, the conjugate of the sum offunctions is not the sum of the conjugates. However, ifwe have a sum of independent functions (cid:80) ni =1 f i ( x i ) itis trivially veriﬁed that its conjugate is (cid:80) ni =1 f ∗ i ( x i ) .We leverage this observation as follows. Every timewe move one of the constraints (13b) or (13c) to theobjective using its indicator function we: i) add a copyof the variables involved in this constraint; ii) add aconstraint enforcing the equality between this copy andthe original decision variables; and iii) let the copies,and not the original variables, be the arguments of theindicator function. In this way, the objective (23) becomesa sum of independent functions and its conjugate can betaken one addend at the time. • The following result from [40, Proposition 2.3(iv)] isuseful when deriving the conjugate of the addends in (23).For a closed proper convex function f , the conjugate ofthe perspective function is ( ˜ f ) ∗ ( λ, x ) = ι C ( λ, x ) , (24)where C := { ( λ, x ) : λ + f ∗ ( x ) ≤ } .The dual of our program is reported in equation (25); itsdecision variables are the following Lagrange multipliers. For i ∈ V , we associate α i ∈ R , β i ∈ R d , γ i ∈ R to theconservation of ﬂow (13d), its spatial version (19), and the ﬁrstdegree constraint in (20), respectively. As just discussed, eachone of the constraints in (13b) and (13c) requires the additionof an equality constraint, and consequently a multiplier, foreach one of its arguments. For ( i, j ) ∈ E , we let µ ϕij ∈ R , µ xij ∈ R d , and µ yij ∈ R d be the multipliers corresponding toeach argument in (13b). Similarly, for constraint (13c) we usethe multipliers ν ϕij ∈ R , ν xij ∈ R d , and ν zij ∈ R d . A. Comments on the dual program

We start by analyzing the structure of problem (25). Wecompare it to the dual of the LP formulation (4) of the classicalSPP: max α t − α s (26a) s . t . c ij ≥ α j − α i , ∀ ( i, j ) ∈ E. (26b) max α t − α s − (cid:88) i ∈ V −{ t } γ i − (cid:88) ( i,j ) ∈ E ( σ Λ i ( µ ϕij , µ xij , µ yij ) + σ Λ j ( − ν ϕij , − ν xij , − ν zij )) (25a) s . t . − (cid:96) ∗ ij ( − µ yij − β i , ν zij + β j ) ≥ α j − α i − γ i + ν ϕij − µ ϕij , ∀ ( i, j ) ∈ E, (25b) (cid:88) j ∈ O i µ xij − (cid:88) j ∈ I i ν xji = δ si β s − δ ti β t , ∀ i ∈ V, (25c) γ i ≥ , ∀ i ∈ V. (25d)Here, as in (25), α i is the multiplier of the conservation ofﬂow (4b) at vertex i . These multipliers are well-known tobe interpretable as potentials: the objective asks to maximizethe potential jump between the source s and the target t , theconstraint ensures that the potential jump along each edge doesnot exceed the length of the edge itself. Despite the manyadditional terms, problem (25) has a similar interpretation.The objective (25a) still maximizes the potential jump α t − α s , but it also includes the multipliers of the degreeconstraints (20) (which would be redundant in the classicalSPP) and the support functions conjugate to the indicators ofthe primal constraints (13b) and (13c).Constraint (25b) comes from conjugating the perspective ofthe edge length in (23): according to (24) this yields an indi-cator function in the dual objective, hence a dual constraint.This constraint clearly resembles (26b): the potential jump α j − α i , together with some extra terms, is upper bounded bya term associated to the edge length. The connection is evenmore evident noticing that, for (cid:96) ij ( x i , x j ) := c ij , we have (cid:96) ∗ ij ( x i , x j ) = − c ij and the left-hand sides of (25b) and (26b)coincide.Finally, we have constraint (25c) which yields a nice primal-dual symmetry: notice that this condition coincides with thespatial conservation of ﬂow (19) if we identify µ xij with y ij , ν xji with z ij , β s with x s , and β t with x t .Problem (25) is always feasible and its cost is nonnegative.This is seen by setting all the multipliers to zero: all theconstraints are veriﬁed and the cost is zero. By weak duality,the optimal value of problem (25) bounds from below the costof the convex relaxation of (13) subject to (19) and (20).VII. A L OWER B OUND FOR THE C ONVEX R ELAXATION

We have discussed how the addition of the constraints (19)and (20) can increase the tightness of the convex relaxationof the MICP (13). The result presented in this section canbe seen as a “sanity check” for the resulting formulation:the discussion from Section II-A suggests a simple lowerbound on the optimal cost of problem (1), here we showthat the constraints (19) and (20) are sufﬁcient for the convexrelaxation of our problem to always recover this bound.Consider a problem setup in which all the edges ( i, j ) ∈ E share the same length function (cid:96) ij ( x i , x j ) := (cid:96) ( x j − x i ) , (27)which only depends on the difference x j − x i , and noton x i and x j independently. Assume also that (cid:96) (0) = 0 .Momentarily, we focus our attention on the case in which the source and target sets are single points θ s and θ t . Withthe goal of lower bounding the optimal cost of problem (1),we can drop the constraints x i ∈ X i for all i ∈ V − { s, t } .Similarly to the proof of Lemma 1, an optimal solution forthis relaxed problem is obtained by ﬁrst detecting an s - t path P = ( i k ) Kk =0 with maximum number K of edges, and then byarranging the points x i k at equal distance along the segment Γ connecting θ s and θ t . The cost of this arrangement is K(cid:96) (cid:18) θ t − θ s K (cid:19) = ˜ (cid:96) ( K, θ t − θ s ) . On the other hand, any s - t path can contain at most K = | V | − edges, therefore a simple lower bound on the optimalcost of (1) is ˜ (cid:96) ( | V | − , θ t − θ s ) .More generally, when the sets X s and X t are allowed to befull dimensional, we have the following result. Proposition 5.

Assume the edges ( i, j ) ∈ E to have a commonlength function (27) . The optimal cost of the convex relaxationof the MICP (13) , subject to the constraints (19) and (20) , isnot smaller than min x s ∈ X s ,x t ∈ X t ˜ (cid:96) ( | V | − , x t − x s ) . (28)Before proving this proposition, we specialize the dualprogram (25) to the edge length (27). In this case, the term (cid:96) ∗ ij ( − µ yij − β i , ν zij + β j ) in the dual constraint (25b) becomes sup x i ,x j (( − µ yij − β i ) (cid:62) x i + ( ν zij + β j ) (cid:62) x j − (cid:96) ( x j − x i )) . (29)If µ yij + β i (cid:54) = ν zij + β j this supremum would be inﬁniteand the problem infeasible (this is seen by setting x i = x j and recalling that (cid:96) (0) = 0 ). Therefore, in case of the edgelength (27), we have an additional dual constraint µ yij + β i = ν zij + β j , ∀ ( i, j ) ∈ E. (30)Using this constraint, the conjugate (29) becomes sup x i ,x j (( ν zij + β j ) (cid:62) ( x j − x i ) − (cid:96) ( x j − x i )) = (cid:96) ∗ ( ν zij + β j ) , and constraint (25b) reads − (cid:96) ∗ ( ν zij + β j ) ≥ α j − α i − γ i + ν ϕij − µ ϕij , ∀ ( i, j ) ∈ E. (31) Proof of Proposition 5.

The plan is to synthesize a dual fea-sible solution whose cost coincides with (28). The thesis isthen implied by weak duality. Let s (cid:48) ∈ O s be any vertex directly connected to the source s . Similarly, let t (cid:48) ∈ I t . We consider the following assignmentfor the Lagrange multipliers of (13d), (19), and (20): α i := 0 , β i := β, γ i := (cid:96) ∗ ( β ) , ∀ i ∈ V, where β is a decision variable and, noticing that (cid:96) (0) = 0 implies (cid:96) ∗ ≥ , we see that γ i is nonnegative as requiredby (25d). We set to zero the multipliers of (13b) and (13c)corresponding to the primal variables ϕ , y , and z : µ ϕij := ν ϕij := 0 , µ yij := ν zij := 0 , ∀ ( i, j ) ∈ E ; while for the ones associated to the variable x we let µ xs ˆ s := ν x ˆ tt := β,µ xij := ν xij := 0 , ∀ ( i, j ) ∈ E − { ( s, s (cid:48) ) , ( t (cid:48) , t ) } . We substitute these multipliers in the dual problem (25), withthe additional constraint (30) and with (25b) replaced by (31).The dual problem is reduced to max β ( − ( | V | − (cid:96) ∗ ( β ) − σ Λ s (0 , β, − σ Λ t (0 , − β, . We notice that σ Λ s (0 , β,

0) = sup ( ϕ,x,y ) ∈ Λ s β (cid:62) x = sup x ∈ X s β (cid:62) x = σ X s ( β ) , where the second equality follows from Proposition 2. Anal-ogously, σ Λ t (0 , − β,

0) = σ X t ( − β ) . Our dual program is thenreduced to max β ( − ( | V | − (cid:96) ∗ ( β ) − σ X s ( β ) − σ X t ( − β )) . This can be recognized to be the dual of the minimizationin (28). Thus, by strong duality, these two problems have thesame optimal value.Note that in the proof above both the multipliers β i and γ i of (19) and (20) are nonzero. Meaning that these twoconstraints are essential for our problem formulation to passthe sanity check from Proposition 5.VIII. A PPLICATION TO O PTIMAL C ONTROL OF D YNAMICAL S YSTEMS

We now show how multiple optimal control problems fordiscrete-time dynamical systems can be formulated as the SPPdescribed in Section II.As a ﬁrst example, we consider the problem of driving alinear system across a disjunctive subset of state space. Weanalyze both the cases in which the time horizon of the controlproblem is to be optimized or it is ﬁxed a priori. Problems ofthis form are pervasive in motion planning of robots [5], [41],[8] and autonomous vehicles [2], [4], [7]; see also the recentreview [42].In Section VIII-C, we consider the more general problemof controlling a PieceWise-Afﬁne (PWA) system [43]. Thisbroad class of hybrid systems has seen application in amultitude of areas: automotive [44], power electronics [45],robotics [46], [47], and many more [48]. Since the seminalwork by Bemporad and Morari [9], mixed-integer controlalgorithms for hybrid systems have been widely developed. At present, computation times are the main limitation to awidespread application of these techniques, and their decreasehas been a central theme recently [49], [50], [51], [52], [53].The approach described below is fundamentally different fromtoday’s most efﬁcient mixed-integer formulations of theseproblems [15], and it might enable the design of higher-performance control algorithms.

A. Constrained linear systems

Consider the constrained linear control system ζ ( k + 1) = Aζ ( k ) + Bu ( k ) , where k ∈ Z ≥ is the discrete time, ζ ∈ Z ⊂ R d ζ is the systemstate, and u ∈ U ⊂ R d u is the control input. We assumethe constraint set Z to be the union of convex compact sets { Z i } i ∈ V for some ﬁnite index set V , which will representthe vertex set of our graph. For simplicity, here we assumethe control set U to be convex and compact. Our goal isto ﬁnd a feasible control sequence that transfers the systemfrom the given initial state Z s := { ζ (0) } to a target set Z t ,while minimizing some cost function. The time K at whichthe system reaches the target set is not speciﬁed a priori.We might allow the system to transition from any set Z i to any set Z j , in which case we let E := { ( i, j ) ∈ V : i (cid:54) = j } . Otherwise, we can prevent an undesired transitionby not including the edge ( i, j ) in E . As in Remark 1, self-transitions can be allowed by introducing copies of the sets Z i . We let G := ( V, E ) be the graph underlying problem (1).To each vertex i ∈ V we associate the convex compact set X i := Z i × U , and each set X i contains the point x i :=( ζ i , u i ) ∈ R d , where d := d ζ + d u .We deﬁne the edge-length function (cid:96) ij so that state transi-tions that do not agree with the system dynamics incur inﬁnitecost. This is done as in Section IV-G: (cid:96) ij ( x i , x j ) := (cid:40) (cid:96) (cid:48) ij ( x i , x j ) if ζ j = Aζ i + Bu i ∞ otherwise . (32)As an example, in case of a minimum time problem, wherewe want to reach the target Z t as soon as possible, we candeﬁne (cid:96) (cid:48) ij ( x i , x j ) := 1 . Otherwise, another popular class ofcost functions in control are positive semideﬁnite quadraticforms K − (cid:88) k =0 ( ζ (cid:62) ( k ) Qζ ( k ) + u (cid:62) ( k ) Ru ( k )) . For Z t = { } , this objective balances the magnitude of thecontrol effort and the distance of the system from the target.In this case, we let (cid:96) (cid:48) ij ( x i , x j ) := ζ (cid:62) i Qζ i + u (cid:62) i Ru i .After having solved the MICP (13), and recovered theshortest path P = ( i k ) Kk =0 , the optimal control sequence is u ( k ) = u i k for k = 0 , . . . , K − , while the optimal statetrajectory is ζ ( k ) = ζ i k for k = 0 , . . . , K . (Note that thecontrol input u t associated with the target set X t is a redundantvariable in these optimizations.) B. Problems with ﬁxed time horizon

Choosing a ﬁxed time horizon K for a control problem isa tricky compromise between performance and computationalefﬁciency. Furthermore, letting K be a decision variable hasalso several technical advantages [54], [55], [56]. Nevertheless,in some cases, we might need the value of K to be ﬁxed. Withsome extra effort, also ﬁxed-horizon control problems can beformulated as the SPP (1).Here we assume to have a ﬁnite collection of convexcompact sets Z l ⊂ R d ζ indexed by l ∈ L . The initial state ζ (0) is given. At each time step k = 1 , . . . , K − the state ζ ( k ) is allowed to lie in any of the regions Z l . The ﬁnal state ζ ( K ) must belong to the set ¯ Z ⊂ R d ζ . As in the previouscase, we require the controls u ( k ) to lie in the set U for all k and we minimize some convex function of states and controls.We construct the graph with K − | L | vertices shownin Figure 2. The source vertex s is associated with the region X s := { ζ (0) } × U , the target t with X t := ¯ Z × U . Theremaining ( K − | L | vertices have the form i = ( k, l ) for k = 1 , . . . , K − and l ∈ L , and they encode the region Z l in which the system might be at time k . For these vertices,the convex regions X i are independent of time k and equal to Z l × U . The source s is connected by an edge to each vertex (1 , l ) for l ∈ L in the ﬁrst layer. Similarly, the | L | vertices ( K − , l ) are connected to the target t . For k = 1 , . . . , K − ,we have an edge from vertex i = ( k, l ) to vertex j = ( k +1 , l (cid:48) ) for all ( l, l (cid:48) ) ∈ L . This ensures that every state transitionincreases the time count by exactly one unit.Since any s - t path in the graph we just constructed hasexactly K edges, the time available to reach the target isﬁxed. For the remaining components of problem (1), thediscussion from the previous subsection carries over withoutany modiﬁcation.When working with ﬁxed-horizon problems, it is frequentlyuseful to enforce a penalty on the magnitude of the terminalstate ζ ( K ) , e.g. ζ (cid:62) ( K ) P ζ ( K ) with P positive semideﬁnite.This is a fundamental tool to ensure closed-loop stability ofthe control system [57]. In our construction, this can be easilydone via a proper deﬁnition of (cid:96) (cid:48) ij in (32) for j = t . Remark 3.

Overall, the size of the mixed integer programwe construct scales linearly with the time horizon K andquadratically with the number | L | of sets. Conversely, classicalformulations for these problems [15] have size linear in both K and | L | . However, as we will see in Section IX-E, the higherstrength of our approach is generally worth this price. C. Piecewise-afﬁne and hybrid systems

The analysis above extends immediately to PWA systems.This is a very broad class of hybrid dynamical systems: looselyspeaking, almost any system whose nonlinearity is exclusivelydue to discrete logics can be written in PWA form [58]. APWA system has the structure ζ ( k + 1) = A l ζ ( k ) + B l u ( k ) + c l , if ( ζ ( k ) , u ( k )) ∈ D l , (33) K K Discrete time k | L |21 S t a t e - s p a c e s u b s e t s l X s X t Fig. 2: Graph G for an optimal control problem with ﬁxedtime horizon K .with l ∈ L . In words, we have a collection of | L | afﬁne dy-namics, each of which applies in a different (convex compact)portion D l of the state and control space.The optimal control of PWA systems can be easily castin shortest-path form either with free time horizon or, con-structing a graph as in Figure 2, with ﬁxed time horizon. Theonly difference lies in the deﬁnition of the edge length (32).Assume for simplicity that the initial mode l s of the systemis uniquely determined by ζ (0) , i.e. l s is the unique l ∈ L such that ( ζ (0) , u ) ∈ D l for some u . To enforce the correctdynamics at time k = 0 , we modify (32) requiring (cid:96) sj tohave ﬁnite value on the set ζ j = A l s ζ s + B l s u s + c l s .Similarly, for k = 1 , . . . , T − , we let (cid:96) ij have ﬁnite value if ζ j = A l ζ i + B l u i + c l , where i = ( k, l ) and j = ( k + 1 , l (cid:48) ) .IX. E XAMPLES

We collect in this section multiple numerical examples. Theﬁrst two (Sections IX-A and IX-B) are toy examples meant toshow the effects of the tightening constraints from Section V.Then we move to a more challenging but still relatively smallexample in Section IX-C. Section IX-D contains a statisticalanalysis of the performance of our formulation in case of large-scale random instances of the SPP. Finally, in Section IX-E,we demonstrate the applicability of our problem formulationin the context of optimal control of PWA systems.

A. Effects of the spatial conservation of ﬂow

We provide a simple example to demonstrate how theaddition of (19) to the constraints of the MICP (13) can havea dramatic effect on the tightness of the convex relaxation.As shown in Figure 3a, we let X := { θ ∈ R d } and X := { ∈ R d } , while X is a full-dimensional set, e.g. a cube,centered at θ/ . We set s := 1 and t := 3 . The edge set is E := { (1 , , (2 , , (1 , } , and all the edges have 2-normlength function (2). We analyze the tightness of the convexrelaxation of problem (13) as the cube X grows in volume.Independently of the volume of X , the shortest path haslength (cid:107) θ (cid:107) . This is achieved either by letting ϕ = 1 or,since X intersects the line between X and X , by letting ϕ = ϕ = 1 and, e.g., x = θ/ .We start by showing that, in case constraint (19) is notenforced, the cost of the convex relaxation can be made X = { } X X = {0}1 1 (a) Graph G and sets X i . Feasible ﬂows parameterized by thecontinuous variable ε ∈ [0 , . C o s t c o n v e x r e l a x a t i o n r = 10 r = 10 r = 10 r = 10 (b) Cost of the convex relaxation of problem (13) as a function of theﬂow parameter ε for different sizes r of the set X . Constraint (19)is not enforced. r C o s t MICPRelaxation without (19)Relaxation with (19) (c) Optimal cost of the MICP (13) and of its convex relaxation,without and with constraint (19), as functions of the size r of theset X . Fig. 3: Numerical results from Section IX-A.arbitrarily small by increasing the size of X . Note that,for the problem in Figure 3a, any ﬂow that veriﬁes theconservation law (13d) also veriﬁes the degree constraint (20).Thus, constraint (20) is irrelevant in this analysis.We parameterize the feasible ﬂows in the graph with ϕ = ε ∈ [0 , and, by the conservation law, we have ϕ = ϕ =1 − ε (see Figure 3a). Constraints (13b) and (13c) allow to solvefor a subset of the continuous variables: y = (1 − ε ) θ , y = εθ , and z = z = 0 . Using (14), the objective function ofthe convex relaxation can be written as (cid:88) ( i,j ) ∈ E (cid:107) z ij − y ij (cid:107) = (cid:107) z − (1 − ε ) θ (cid:107) + (cid:107) y (cid:107) + (cid:107) εθ (cid:107) . (34)The remaining conditions from (13b) and (13c) are z ∈ (1 − ε ) X , x − z ∈ εX , (35a) y ∈ (1 − ε ) X , x − y ∈ εX . (35b)The optimal cost of the convex relaxation is the minimumof (34) with respect to ε , x , z , y and subject to (35).This constrained minimization is difﬁcult to solve by hand;nevertheless, for any ﬁxed ε ∈ (0 , , we can assume the set X to be large enough so that all the constraints in (35) areredundant. In this case, the unconstrained minimum of (34)with respect to the spatial variables is ε (cid:107) θ (cid:107) and is achievedfor z = (1 − ε ) θ and y = 0 . We conclude that, as thevolume of the set X goes to inﬁnity, the optimal choice forthe ﬂow ε tends to zero and the cost of the convex relaxationapproaches zero arbitrarily close.In Figure 3 we report a two-dimensional numerical examplewhere θ := ( − , , and the size of X is parameterized viathe scalar r > as X := { θ/ x : (cid:107) x (cid:107) ∞ ≤ r } . Figure 3bshows the optimal cost of the convex relaxation as a functionof ε , for different sizes r of X . As predicted, for large sets X , the minimum cost is achieved for ε close to zero, and issensibly smaller than the optimal cost (cid:107) θ (cid:107) = 1 of the MICPwhich is obtained for ε ∈ { , } . The dependence of the costof the convex relaxation on the size r of the set X is depictedmore neatly in Figure 3c.We now add constraint (19) which, for vertex i = 2 , reads y = z . Using the triangle inequality (cid:107) z − (1 − ε ) θ (cid:107) + (cid:107) z (cid:107) ≥ (cid:107) (1 − ε ) θ (cid:107) , the cost in (34) is lower bounded by (cid:107) θ (cid:107) . Therefore, with theaddition of (19), the problem formulation has zero relaxationgap, independently of the volume of X . This is conﬁrmednumerically in Figure 3c, where the curves of the MICP andthe convex relaxation with (19) overlap for all values of r . B. Effects of the degree constraints

We now describe the effects of the degree constraint (20) onthe strength of the MICP (13). We assume that constraint (19)is already included in the problem formulation.Unfortunately, the usefulness of (20) is only observed ingraphs more involved than the one from Section IX-A. Oneexample is depicted in Figure 4a. The sets X = X s := { θ } and X = X t := { } are singletons, while X and X arecubes centered at θ/ and θ/ , respectively. The edge set is E := { (1 , , (2 , , (2 , , (3 , , (3 , } . The edge-length function is the squared 2-norm (3). Similarlyto the previous example, we analyze the cost of the convexrelaxation as the sets X and X grow in volume.Independently of the size of the sets, the optimal ﬂow iseasily seen to be ϕ = ϕ = ϕ = 1 and ϕ = ϕ = 0 ,while the optimal continuous variables are x = 2 θ/ and x = θ/ . This choice corresponds to three steps of equallength and has cost (cid:107) θ/ (cid:107) = (cid:107) θ (cid:107) / .We ﬁrst analyze the problem without constraint (20) show-ing that the resulting convex relaxation is very loose. To thisend, we consider a speciﬁc ﬂow (red values in Figure 4a): ϕ = ϕ = 1 , ϕ = ϕ = 1 − ε , ϕ = ε , with ε ∈ (0 , .From the constraints (13b) and (13c) we have y = θ, y = z = x , z = x , z = z = 0 . X = { } X X X = {0}1 1 1 11 0 11 (a) Graph G and sets X i . The red ﬂow, parameterized by ε , is usedto show the looseness of the convex relaxation when constraint (20)is not enforced. The feasible ﬂows after the addition of (20) areparameterized by α and written in green. C o s t c o n v e x r e l a x a t i o n r = 10 r = 10 r = 10 r = 10 (b) Cost of the convex relaxation of problem (13) as a function ofthe ﬂow parameter ε for different sizes r of the sets X and X .Constraint (19) is enforced, constraint (20) is not. r C o s t MICPRelaxation without (20)Relaxation with (20) (c) Optimal cost of the MICP (13) and of its convex relaxation,without and with constraint (20), as functions of the size r of thesets X and X . Constraint (19) is enforced. Fig. 4: Numerical results from Section IX-B.Whereas, constraint (19) for i = 2 , yields y = z and y + y = x . Using property (15), we can express the costof this ﬂow assignment as (cid:88) ( i,j ) ∈ E (cid:107) z ij − y ij (cid:107) ϕ ij = (cid:107) x − θ (cid:107) + (cid:107) x − x (cid:107) + (cid:107) y (cid:107) + (cid:107) y − y (cid:107) − ε + (cid:107) x − y (cid:107) ε . (36)As in the previous example, for a ﬁxed ε ∈ (0 , , assuming X and X to be sufﬁciently large, the value of the variables x , x , y , y can be picked arbitrarily. A straightforward,but a little tedious, unconstrained minimization of (36) withrespect to these variables gives the optimal cost (cid:107) θ (cid:107) / (4 − ε ) .Therefore, as X and X grow in volume, the optimal choicefor ε converges to zero and the cost of this convex programdecreases to (cid:107) θ (cid:107) / : a value signiﬁcantly smaller than theoptimal value (cid:107) θ (cid:107) / of the MICP.In Figure 4 we show the numerical results in case θ :=( − , , X := { θ/ x : (cid:107) x (cid:107) ∞ ≤ r } , and X := { θ/ x : (cid:107) x (cid:107) ∞ ≤ r } . Figure 4b shows the cost of the convexrelaxation as a function of ε for different sizes r of X and X . Figure 4c reports the cost of the convex relaxation asa function of the size r of X and X . As predicted, theminimum cost is achieved for ε close to zero and approaches (cid:107) θ (cid:107) / / asymptotically.We add the degree constraints (20) to our problem for-mulation, and we show that this is sufﬁcient to make theconvex relaxation of the MICP tight. By the conservation ofﬂow, the edge (1 , must always be traversed by a unit ﬂow,and (20) implies ϕ = 0 . We parameterize all the remainingfeasible ﬂows with ϕ = α ∈ [0 , and, consequently, ϕ = ϕ = 1 − α (green values in Figure 4a). For α = 0 weget the minimum of the MICP, while we know that the choice α = 1 is suboptimal. We can then restrict our attention tothe open interval α ∈ (0 , . A subset of the constraints (13b)and (13c) yields y = θ, z = x , z = z = 0 . (37)While, for i = 2 , , condition (19) reads y + y = x and y = z . The cost of this ﬂow assignment is (cid:107) x − θ (cid:107) + (cid:107) y − y (cid:107) + (cid:107) y (cid:107) − α + (cid:107) y − x (cid:107) α . (38)This should be minimized subject to the constraints in (13b)and (13c) that have not already been considered in (37). On theother hand, the unconstrained minimum of (38) with respectto the continuous variables x , y , y is easily found to be (cid:107) θ (cid:107) / (3 − α ) . Since for α ∈ (0 , this is strictly greater thanthe optimal cost (cid:107) θ (cid:107) / of the MICP, we conclude that α = 0 is also the optimal solution of the convex relaxation. Thus,constraint (20) leads to an MICP with zero relaxation gap forthe problem at hand, independently of the size of X and X .This is conﬁrmed in Figure 4c. Remark 4.

This example shows that, even when condi-tion (19) is enforced, the degree constraint (20) can stillimprove the strength of our problem formulation. The examplefrom Section IX-A shows that the inverse is also true: for thatproblem constraint (20) is always redundant, whereas (19)makes an arbitrarily loose formulation into a perfectly tightone.

C. A two-dimensional example

We now move to a more involved two-dimensional example:this is described in Figure 5 and has been carefully constructedto show that the convex relaxation of our formulation needs notto be tight in the regime where the sets X i are large. We havea graph with | V | = 9 vertices and | E | = 19 edges. As before,the source X s = { θ s } and target X t = { θ t } sets are singlepoints, while the remaining regions are full dimensional. Thegeometry of the sets X i and the edge set E can be deducedfrom Figure 5.We consider two edge-length functions: the 2-norm (2) andthe squared 2-norm (3). The corresponding shortest paths areshown in Figure 5 as a blue dashed line and a red dash-dotted line, respectively. As expected, the ﬁrst path is almost X s X t Fig. 5: Graph G and sets X i for the example in Section IX-C.The blue dashed (red dash-dotted) line is the shortest path incase of 2-norm (squared-2-norm) edge-length function.straight while the length of the segments in the second is betterbalanced.In Figure 6 we compare the cost of the MICP (13) and of itsconvex relaxation, equipped with constraints (19) and (20). Asin the previous examples, we analyze these values as functionsof the size of the regions X i . We parameterize the scale ofthese sets using the scalar r : we ﬁx the position of the center(centroid of the vertices) of each polytope X i , and we scaleby r the position of each vertex relative to the center. The casedepicted in Figure 5 is obtained for r = 1 .Figure 6a reports the results for the 2-norm edge length:despite the many regions and the presence of cycles, theconvex relaxation is tight for all values of r .For the edge length (3), Figure 6b shows that the convexrelaxation is not always tight, even though the relaxation gapis small even in the worst case. In agreement with the resultsfrom Section IV-A, when the size of the sets converges to zerothe convex relaxation of problem (13) becomes tight. A verysmall relaxation gap can be noticed for r approximately equalto one (as in Figure 5): this shows that our formulation isnot necessarily tight when the sets X i do not overlap. In theregime of large r , approximately r ≥ , the convex relaxationbecomes looser and, in agreement with the proof of Lemma 1,the asymptotic cost of the MICP equals (cid:107) θ t − θ s (cid:107) /K = 11 . ,where K = 7 is the maximum number of edges traversedby an s - t path in Figure 5. Furthermore, a closer inspectionof Figure 6b reveals that the curve of the convex relaxationconverges to (cid:107) θ t − θ s (cid:107) / ( | V | −

1) = 10 . , which correspondsto the lower bound derived in Proposition 5.Recalling the discussion from Section II-A and noticingthat the graph in Figure 5 is not Hamiltonian, it is to beexpected that the relaxation of our MICP might be loose in theregime of large sets X i . If our MICP was always tight in these r C o s t MICPConvex relaxation (a) 2-norm edge length (2). r C o s t MICPConvex relaxation (b) Squared-2-norm edge length (3). (This example is carefully builtto make the convex relaxation loose in the regime of large r .) Fig. 6: Numerical results from Section IX-C. Optimal cost ofthe MICP (13) and of its convex relaxation as functions of thesize r of the sets X i .scenarios, we would have a polynomial-time algorithm for theHPP, which is NP-complete. On the other hand, it is not eventrue that for these problem instances our convex relaxationalways yields the trivial lower bound from Proposition 5.As an example, the removal of the edge connecting the top-left set to the bottom set in Figure 5 does not make thegraph Hamiltonian, but it is sufﬁcient to close the asymptoticrelaxation gap in Figure 6b. D. Large-scale random instances

So far we have only discussed simple examples with a hand-ful of vertices and edges. Moreover, we have mostly focusedon analyzing the strength of our formulation as a function ofthe size of the sets X i . We now move to problems of largerscale, and we analyze the impact of multiple parameters on theefﬁciency of our formulation. We formulate a large numberof random problem instances and we analyze the resultingsolution statistics.Generating random graphs representative of the “typical”SPP on convex sets we might encounter in practice is a difﬁcultoperation. Restrictions such as requiring the source s to beconnected to all the vertices in the graph introduce strongbiases in the topology of the graph. Inevitably, the instanceswe describe below are not completely representative and ouralgorithm might perform worse or better on different classesof random graphs. With the results reported in this section wedo not want to make any claim regarding, e.g., the averagestrength of our formulation. Our purpose is instead to showthat the applicability of our formulation is not limited to small-scale problems. Relaxation gap ( % ) Convex-relaxation solve time (s) MICP solve time (s)2-norm Squared 2-norm 2-norm Squared 2-norm 2-norm Squared 2-normIncreased parameters ( × ) mean max mean max mean max mean max mean max mean maxNone (nominal) . . . . .

04 0 .

08 0 .

04 0 .

07 0 . . . . Dimensions d . . . . .

24 0 .

70 0 .

20 0 .

37 1 . . . . Edges | E | . . . . .

53 1 .

02 0 .

47 0 .

63 4 . . . . Vertices | V | and edges | E | . . . . .

21 0 .

60 0 .

18 0 .

41 1 . . . . Volume ∆ 0 . . . . .

05 0 .

15 0 .

04 0 .

08 0 . . . . TABLE I: Relaxation gap and computation times, in the average and worst case, for the random problem instances describedin Section IX-D. In each row we increase part of the problem parameters by a factor of , we generate random probleminstances, and we report the statistics of their solution. The nominal value of the parameters is d = 4 , | E | = 100 , | V | = 50 ,and ∆ = 0 . . Two edge-length functions are considered: the 2-norm (2) and the squared 2-norm (3). These statistics showthat our formulation can be used to tackle problems of signiﬁcant size. However, given the random nature of these instances,these values might be unrepresentative of the average performance of our formulation. X s X t Fig. 7: Projection onto two dimensions of a random instanceof the SPP from Section IX-D for a nominal value of theproblem parameters.We construct a random instance of problem (1) as follows.We set X s := { ∈ R d } and X t := { ∈ R d } . Each of theremaining | V | − sets X i is a cube of volume ∆ with centerdrawn from the uniform distribution over [0 , d . For a givennumber of edges | E | , we construct the edge set E in two steps.First we generate multiple s - t paths such that each vertex in V − { s, t } is traversed exactly by one path. The number ofthese paths is drawn uniformly from the interval [1 , | V | − and the vertices they traverse are determined by generating arandom partition of the set V − { s, t } : both the number ofsets in the partition and their cardinality are uniform randomvariables. Finally, we expand the edge set E by drawing edgesuniformly at random from the set { ( i, j ) ∈ V : i (cid:54) = j } until the desired cardinality | E | is achieved. We use thefollowing nominal parameters: d = 4 dimensions, | E | = 100 edges, | V | = 50 regions, and a volume ∆ = 0 . for theregions X i . To give an idea of what these problems looklike, the projection onto two dimensions of a random instancegenerated with these parameters is shown in Figure 7.We consider both the 2-norm (2) and the squared-2-norm (3)edge lengths. For each edge length, ﬁrst we solve randomproblem instances with nominal parameters. Then we considerfour subgroups of the parameters: for each subgroup, wemultiply the value of the parameters in it by , and we solveanother random instances. Table I shows the statistics ofthese trials. The ﬁrst group of columns records the average and maximum relaxation gap (percentage difference between thecost of the MICP and of its convex relaxation). The secondand third groups of columns record the average and worst-case solution times for the convex relaxation and the MICP. In support of the analysis below, we recall that, including (19)and (20), these MICPs have | E | binaries, O ( d ( | V | + | E | )) continuous variables, and O ( d | V | + | E | ) constraints.Overall, the 2-norm edge length (2) results in much simpleroptimizations: the relaxation gaps never exceed . andcomputation times are relatively low.The squared-2-norm length (3) leads to more challengingproblems even though, in the nominal case, the averagerelaxation gap is still very low and the MICP computationtimes are always within . s. The growth of the spacedimension to d = 20 increases the size of our programs, andalso deteriorates the tightness of the convex relaxation. In theworst case, we have a relaxation gap of . and an MICPsolution time greater than 3 min. A similar analysis applieswhen the number | E | of edges is increased from to :in general, we have found our MICP formulation to strugglewith graphs of high density | E | / | V | of edges. (Recall that | E | equals the number of binaries in our MICP, while the bigger | V | the more constrained the binaries are.) To show this, inthe fourth row we keep edges but we increase the verticesto | V | = 250 . This has the effect of reducing the edge densityand, even if the resulting MICPs are bigger than the onesfrom the previous case, the relaxation gap and the computationtimes are reduced. Finally, we increase the size of the cubes X i from ∆ = 0 . to ∆ = 0 . : these sets have now a totalvolume of | V | ∆ = 2 . , which is signiﬁcantly larger than theunit cube containing them. Despite this, the performance ofour formulation does not differ signiﬁcantly from the nominalcase. Notice that this is not in contradiction with the previousexamples, where we were analyzing the regime of extremelylarge sets X i . In addition, recall that the volume of the setsdoes not affect the size of these programs. E. Optimal control of a piecewise-afﬁne system

We conclude illustrating the discussion from Section VIIIvia the optimal-control problem shown in Figure 8. We con-sider a mechanical system with position q ∈ R and velocity Problems are solved with the commercial solver Mosek 9.0.96 with defaultoptions on a 2.4 GHz 8-Core Intel Core i9. q , u q , u (a) Optimal solution of the MICP. q , u q , u (b) Solution of the convex relaxation of the formulation [14], [15].Relaxation gap is : the corresponding MICP is solved in s. q , u q , u (c) Solution of the convex relaxation of the proposed formulation.Relaxation gap is : the corresponding MICP is solved in . s. Fig. 8: Control problem of driving a second-order systemfrom start (green circle) to goal (green cross). In the light-blue regions the system is highly controllable ( η = 1 ); in thered regions controllability is low ( η = 0 . ). Optimal positions ( q ( k )) Kk =0 are white circles; optimal controls ( u ( k )) K − k =0 areblue arrows. The triangles represent the auxiliary variables q l ( k ) whose convex combination yields q ( k ) . The opacity ofthe triangles equals the optimal value of the indicator variables b l ( k ) which serve as weights in this convex combination. v ∈ R . The force u ∈ R serves as control input. The systemhas the dynamics of a double integrator q ( k + 1) = q ( k ) + v ( k ) , v ( k + 1) = v ( k ) + ηu ( k ) , where η is a scalar parameter that regulates the controllabilityof the system. Representing the system state via ζ := ( q, v ) ,the transition matrices are A :=   , B :=  η η  . At time k = 0 , the system is in position q (0) := ( − . , . (bottom-left green circle in Figure 8) with zero velocity v (0) .At each time step k = 1 , . . . , K − , the position vector q ( k ) is allowed to be in one of the seven regions depictedin Figure 8, while the velocity and the controls are limited bythe constraints (cid:107) v ( k ) (cid:107) ∞ ≤ and (cid:107) u ( k ) (cid:107) ∞ ≤ . The goal isto reach the conﬁguration q ( K ) := (3 . , . (top-right greencross in Figure 8) with zero velocity v ( K ) in K := 30 timesteps. In doing this, we want minimize the quadratic form K − (cid:88) k =0 ( (cid:107) v ( k ) (cid:107) / (cid:107) u ( k ) (cid:107) ) . We let the controllability parameter η vary between the re-gions. For the regions included in the range − ≤ q ≤ (light blue in Figure 8) we set η = 1 , and the system is highlycontrollable. In the remaining two regions (red in Figure 8)we let η = 0 . , making it very expensive to apply anysigniﬁcant force. Since the matrix B varies with the state, theresulting system is PWA and the control problem falls into theclass considered in Section VIII-C. The graph G beneath thisproblem (see Figure 2) has | V | = 205 vertices and | E | = 1386 edges, the sets X i live in a space of d = 6 dimensions.Figure 8a shows the optimal trajectory ( q ( k )) Kk =0 (whitecircles) with the optimal controls ( u ( k )) K − k =0 (blue arrows).Geometrically, the red regions would represent a shortcuttowards the goal, but the reduced controllability of these areaswould make it very expensive not to fall out of the feasibleset. The optimal strategy is then to follow the S-shaped pathand incur a cost of . .As a baseline for our method, we ﬁrst solve this problemusing the strongest MICP formulation available in the controlliterature: this also employs perspective functions and has beenpresented in [14] and further developed in [15, Section 5.2.2].For each time step k , it uses | L | indicator variables b l ( k ) ∈{ , } to select which one of the afﬁne dynamics in (33) shouldbe applied. This is done by decomposing the state and thecontrols at time k into the convex combination of | L | auxiliaryvariables ( ζ l ( k ) , u l ( k )) ∈ D l : ( ζ ( k ) , u ( k )) = (cid:88) l ∈ L b l ( k )( ζ l ( k ) , u l ( k )) , (39)where (cid:80) l ∈ L b l ( k ) = 1 . Each copy ( ζ l ( k ) , u l ( k )) is used topredict the next state according to the l th dynamics, and the Equations (39) and (40) are convexiﬁed using the method from [59], [10]. actual state of the system at time k + 1 is recovered as theconvex combination of these predictions: ζ ( k + 1) = (cid:88) l ∈ L b l ( k )( A l ζ l ( k ) + B l u l ( k ) + c l ) . (40)When the binaries are relaxed, b l ( k ) ∈ [0 , , the systemevolves according to a convex combination of the variousafﬁne dynamics.Figure 8b shows the solution of the convex relaxation ofthe formulation [14], [15]. It reports the position q ( k ) , the(barely visible) controls u ( k ) , and the auxiliary copies q l ( k ) of the position vector whose convex combination yields q ( k ) .The latter have triangle markers with opacity equal to thevalue of the indicator b l ( k ) . At each time step k , the solveris allowed to select the best convex combination of the | L | afﬁne dynamics: it decides to reach the goal with a perfectly-straight trajectory and incur a cost of . , which is only of the MICP cost ( relaxation gap). Note that this behavioris completely insensitive to the geometry of the problem.The auxiliary variables are also uninformative: given the widevariety of convex combinations of the afﬁne dynamics thatyield a straight trajectory, the values of q l ( k ) and b l ( k ) makeit impossible to guess in which region the system should be at agiven time. The MICP resulting from this problem formulationrequired s to be solved to global optimality (with themachine and solver reported in Section IX-D).The convex relaxation of our formulation is much tighter: itsoptimal value is . , which is of the MICP cost ( relaxation gap). This has a dramatic effect on computationtimes which are now reduced to . s. To generate a plotcomparable with 8b we leverage the structure of the graph G in Figure 2. For each time step k = 1 , . . . , K − , this graphhas vertices i = ( k, l ) and the total ﬂow b l ( k ) := (cid:88) j ∈ O i ϕ ij traversing vertex i takes the role of the binary indicatorfrom the formulation [14], [15]. In fact, at optimality of theMICP (13) we have ( ζ ( k ) , u ( k )) ∈ D l if and only if b l ( k ) = 1 ,while the conservation of ﬂow (13d) is easily seen to imply (cid:80) l ∈ L b l ( k ) = 1 . Recalling that the spatial variables x i areobtained by stacking the state and the controls, and providedthat b l ( k ) > , the role of the auxiliary continuous variablesfrom [14], [15] can be taken by ( ζ l ( k ) , u l ( k )) := (cid:88) j ∈ O i y ij /b l ( k ) ∈ D l , where the inclusion on the right holds even for nonbinary ﬂowsand follows directly from (13b). Reconstructing the systemstate and controls as in (39), we obtain ( ζ ( k ) , u ( k )) = (cid:88) i ∈{ k }× L (cid:88) j ∈ O i y ij . By the conservation of ﬂow (13d) and its spatial version (19), we couldequivalently have deﬁned b l ( k ) := (cid:80) j ∈ I i ϕ ji and ( ζ l ( k ) , u l ( k )) := (cid:80) j ∈ I i z ji /b l ( k ) . The values just described are depicted in Figure 8c. Thesystem trajectory ( q ( k )) Kk =0 and the controls ( u ( k )) K − k =0 ob-tained from the proposed convex program resemble the S-shaped optimal solution in Figure 8a much more closely thanthe formulation [14], [15]. For the auxiliary variables, we notethat all the markers in the regions with low controllability areinvisible, meaning that our convex relaxation identiﬁes theseas regions of high cost, and sets to zero the correspondingindicators b l ( k ) . All the visible points q l ( k ) are clustered alongthe optimal trajectory of the MICP, suggesting that our convexrelaxation contains detailed information on the optimal path toreach the goal.X. C ONCLUSIONS AND F UTURE W ORKS

We have presented a generalization of the SPP in whichthe position of each vertex in the graph is a continuousdecision variable lying in a convex set, and the length ofan edge is a convex function of the position of the verticesit connects. Our main contribution is a strong mixed-integerconvex formulation for the solution of this NP-hard problem. Awide variety of numerical tests show that the convex relaxationof this MICP is often very tight. We have focused part ofour attention on control systems: many mixed-integer controlproblems turn out to be interpretable as generalized SPPs and,in our tests, the proposed MICP outperforms state-of-the-arttechniques for their solution.One of our future goals is the development of approximationalgorithms for this generalized SPP: the similarities betweenour MICP and the LP formulation of the classical SPP mightallow us to leverage a massive literature to progress in thisdirection. In terms of applications, we plan to conduct athorough test of the performance of the proposed method inthe context of robot motion planning. Finally, we underlinethat the core techniques from this paper are not bound to theSPP and can be extended to many other classical problemsin combinatorics; the bipartite-matching and the traveling-salesman problems being two examples.A

PPENDIX AP ERSPECTIVE OF C OMMONLY -U SED F UNCTIONS

In this appendix we derive the perspective functions em-ployed in this paper. Below we assume A ∈ R m × n , b ∈ R m ,and c ∈ R . A. Constants

Consider the constant function f ( x ) := b . Using Deﬁni-tion 1, for λ ≥ , the perspective of f is given by ˜ f ( λ, x ) = (cid:40) λb if λ > τ ↓ τ b if λ = 0 = bλ. B. Positively homogeneous functions

Let f ( x ) be positively homogeneous, i.e., f ( ax ) = af ( x ) for all x and a ≥ . For example, linear functions f ( x ) := Ax and norms f ( x ) := (cid:107) x (cid:107) enjoy this property. Any positivelyhomogeneous function must verify f (0) < ∞ , in fact, if not,we would have the contradiction ∞ = f (0) = 0 f (0) = 0 · ∞ .Letting ¯ x = 0 in Deﬁnition 1, it is immediately veriﬁed that ˜ f ( λ, x ) = f ( x ) for all λ ≥ . C. Afﬁne functions

Assume f ( x ) := Ax + b . We notice that the perspectiveof the sum of two real-valued functions is the sum of theirperspectives. For λ ≥ , this implies ˜ f ( λ, x ) = Ax + bλ . D. Norms

Consider a function of the form f ( x ) := (cid:107) Ax + b (cid:107) + c .Exploiting the positive homogeneity of the norm, the perspec-tive of (cid:107) Ax + b (cid:107) is immediately veriﬁed to be (cid:107) Ax + bλ (cid:107) .Therefore, we have ˜ f ( λ, x ) = (cid:107) Ax + bλ (cid:107) + cλ . E. Positive semideﬁnite quadratic forms

Assume f ( x ) := (cid:107) Ax (cid:107) . For λ > , we have ˜ f ( λ, x ) = (cid:107) Ax (cid:107) /λ . For λ = 0 and ¯ x = 0 , we obtain ˜ f (0 , x ) = lim τ ↓ τ (cid:107) Ax/τ (cid:107) = (cid:40) if Ax = 0 ∞ otherwise . F. Functions with extended-real values

Consider f ( x ) := (cid:40) f (cid:48) ( x ) if x ∈ D ∞ otherwise , where f (cid:48) and D are closed and convex. Assume ¯ x to be suchthat f (¯ x ) < ∞ . For λ > , we have ˜ f ( λ, x ) = λ (cid:40) f (cid:48) ( x/λ ) if x/λ ∈ D ∞ otherwise = (cid:40) ˜ f (cid:48) ( λ, x ) if x ∈ λD ∞ otherwise . For λ = 0 , we have ˜ f (0 , x ) = lim τ ↓ τ (cid:40) f (cid:48) (¯ x + x/τ ) if ¯ x + x/τ ∈ D ∞ otherwise = (cid:40) ˜ f (cid:48) (0 , x ) if x ∈ D ∞ ∞ otherwise , where D ∞ denotes the recession cone of D . Denoting with thesymbol + the Minkowski sum between two sets, and noticingthat λD + D ∞ = λD for λ > , we conclude that ˜ f ( λ, x ) = (cid:40) ˜ f (cid:48) ( λ, x ) if x ∈ λD + D ∞ ∞ otherwise . A CKNOWLEDGMENT

The authors would like to thank Hongkai Dai for all the timespent improving the optimization-solver interface employed inthe numerical examples presented in this paper.This research was supported by the National Science Foun-dation, Award No. EFMA-1830901, and by the Departmentof the Navy, Ofﬁce of Naval Research, Award No. N00014-18-1-2210. Any opinions, ﬁndings, and conclusions or recom-mendations expressed in this material are those of the authorsand do not necessarily reﬂect the views of the Ofﬁce of NavalResearch. R

EFERENCES[1] A. Schrijver,

Combinatorial optimization: polyhedra and efﬁciency .Springer Science & Business Media, 2003, vol. 24.[2] A. Richards and J. P. How, “Aircraft trajectory planning with collisionavoidance using mixed integer linear programming,” in

Proceedingsof the 2002 American Control Conference (IEEE Cat. No. CH37301) ,vol. 3. IEEE, 2002, pp. 1936–1941.[3] A. Richards and J. How, “Mixed-integer programming for control,” in

Proceedings of the 2005, American Control Conference, 2005.

IEEE,2005, pp. 2676–2683.[4] M. Vitus, V. Pradeep, G. Hoffmann, S. Waslander, and C. Tomlin,“Tunnel-MILP: Path planning with sequential convex polytopes,” in

AIAA guidance, navigation and control conference and exhibit , 2008,p. 7132.[5] R. Deits and R. Tedrake, “Footstep planning on uneven terrain withmixed-integer convex optimization,” in . IEEE, 2014, pp. 279–286.[6] ——, “Efﬁcient mixed-integer planning for UAVs in cluttered envi-ronments,” in . IEEE, 2015, pp. 42–49.[7] B. Landry, R. Deits, P. R. Florence, and R. Tedrake, “Aggressivequadrotor ﬂight through cluttered environments using mixed integerprogramming,” in . IEEE, 2016, pp. 1469–1475.[8] M. R. Maximo and R. J. Afonso, “Mixed-integer quadratic program-ming for automatic walking footstep placement, duration, and rotation,”

Optimal Control Applications and Methods , 2020.[9] A. Bemporad and M. Morari, “Control of systems integrating logic,dynamics, and constraints,”

Automatica , vol. 35, no. 3, pp. 407–427,1999.[10] S. Ceria and J. Soares, “Convex programming for disjunctive convexoptimization,”

Mathematical Programming , vol. 86, no. 3, pp. 595–614,1999.[11] A. Frangioni and C. Gentile, “Perspective cuts for a class of convex 0–1mixed integer programs,”

Mathematical Programming , vol. 106, no. 2,pp. 225–236, 2006.[12] O. G¨unl¨uk and J. Linderoth, “Perspective reformulations of mixedinteger nonlinear programs with indicator variables,”

Mathematicalprogramming , vol. 124, no. 1-2, pp. 183–205, 2010.[13] ——, “Perspective reformulation and applications,” in

Mixed IntegerNonlinear Programming . Springer, 2012, pp. 61–89.[14] N. Moehle and S. Boyd, “A perspective-based convex relaxation forswitched-afﬁne optimal control,”

Systems & Control Letters , vol. 86,pp. 34–40, 2015.[15] T. Marcucci and R. Tedrake, “Mixed-integer formulations for optimalcontrol of piecewise-afﬁne systems,” in

Proceedings of the 22nd ACMInternational Conference on Hybrid Systems: Computation and Control ,2019, pp. 230–239.[16] D. P. Williamson and D. B. Shmoys,

The design of approximationalgorithms . Cambridge University Press, 2011.[17] N. Deo and C.-Y. Pang, “Shortest-path algorithms: Taxonomy andannotation,”

Networks , vol. 14, no. 2, pp. 275–323, 1984.[18] A. Frieze, “Minimum paths in directed graphs,”

Journal of the Opera-tional Research Society , vol. 28, no. 2, pp. 339–346, 1977.[19] E. W. Dijkstra, “A note on two problems in connexion with graphs,”

Numerische Mathematik , vol. 1, no. 1, pp. 269–271, 1959.[20] D. E. Knuth, “A generalization of Dijkstra’s algorithm,”

InformationProcessing Letters , vol. 6, no. 1, pp. 1–5, 1977.[21] G. Ramalingam and T. Reps, “An incremental algorithm for a gener-alization of the shortest-path problem,”

Journal of Algorithms , vol. 21,no. 2, pp. 267–305, 1996.[22] L. Boguchwal, “Shortest path algorithms for functional environments,”

Discrete Optimization , vol. 18, pp. 217–251, 2015.[23] F. Li and R. Klette,

Euclidean shortest paths . Springer, 2011.[24] T. Lozano-P´erez and M. A. Wesley, “An algorithm for planning collision-free paths among polyhedral obstacles,”

Communications of the ACM ,vol. 22, no. 10, pp. 560–570, 1979.[25] D.-T. Lee and F. P. Preparata, “Euclidean shortest paths in the presenceof rectilinear barriers,”

Networks , vol. 14, no. 3, pp. 393–410, 1984.[26] J. Canny and J. Reif, “New lower bound techniques for robot motionplanning problems,” in . IEEE, 1987, pp. 49–60.[27] C. H. Papadimitriou, “An algorithm for shortest-path motion in threedimensions,”

Information processing letters , vol. 20, no. 5, pp. 259–263, 1985. [28] J. N. Tsitsiklis, “Efﬁcient algorithms for globally optimal trajectories,” IEEE Transactions on Automatic Control , vol. 40, no. 9, pp. 1528–1538,1995.[29] J. Kim and J. P. Hespanha, “Discrete approximations to continuousshortest-path: Application to minimum-risk path planning for groups ofUAVs,” in , vol. 2. IEEE, 2003, pp. 1734–1740.[30] A. Deshpande, “Exact geometry algorithms for robotic motion plan-ning,” Ph.D. dissertation, Massachusetts Institute of Technology, 2019.[31] M. Dror, A. Efrat, A. Lubiw, and J. S. Mitchell, “Touring a sequenceof polygons,” in

Proceedings of the thirty-ﬁfth annual ACM symposiumon Theory of computing , 2003, pp. 473–482.[32] S. Ntafos, “Watchman routes under limited visibility,”

ComputationalGeometry , vol. 1, no. 3, pp. 149–170, 1992.[33] C. Wei-Pang and S. Ntafos, “The zookeeper route problem,”

InformationSciences , vol. 63, no. 3, pp. 245–259, 1992.[34] W.-p. Chin and S. Ntafos, “Optimum watchman routes,” in

Proceedingsof the second annual symposium on Computational geometry , 1986, pp.24–33.[35] M. L. Fredman and R. E. Tarjan, “Fibonacci heaps and their usesin improved network optimization algorithms,”

Journal of the ACM(JACM) , vol. 34, no. 3, pp. 596–615, 1987.[36] R. M. Karp, “Reducibility among combinatorial problems,” in

Complex-ity of computer computations . Springer, 1972, pp. 85–103.[37] S. Boyd and L. Vandenberghe,

Convex optimization . CambridgeUniversity Press, 2004.[38] J.-B. Hiriart-Urruty and C. Lemar´echal,

Convex analysis and minimiza-tion algorithms I: Fundamentals . Springer science & business media,2013, vol. 305.[39] L. Taccari, “Integer programming formulations for the elementary short-est path problem,”

European Journal of Operational Research , vol. 252,no. 1, pp. 122–130, 2016.[40] P. L. Combettes, “Perspective functions: Properties, constructions, andexamples,”

Set-Valued and Variational Analysis , vol. 26, no. 2, pp. 247–264, 2018.[41] R. Deits and R. Tedrake, “Computing large convex regions of obstacle-free space through semideﬁnite programming,” in

Algorithmic founda-tions of robotics XI . Springer, 2015, pp. 109–124.[42] D. Ioan, I. Prodan, S. Olaru, F. Stoican, and S.-I. Niculescu, “Mixed-integer programming in motion planning,”

Annual Reviews in Control ,2020.[43] E. Sontag, “Nonlinear regulation: The piecewise linear approach,”

IEEETransactions on automatic control , vol. 26, no. 2, pp. 346–358, 1981.[44] F. Borrelli, A. Bemporad, M. Fodor, and D. Hrovat, “An MPC/hybridsystem approach to traction control,”

IEEE Transactions on ControlSystems Technology , vol. 14, no. 3, pp. 541–552, 2006.[45] T. Geyer, G. Papafotiou, and M. Morari, “Hybrid model predictivecontrol of the step-down DC–DC converter,”

IEEE Transactions onControl Systems Technology , vol. 16, no. 6, pp. 1112–1124, 2008.[46] T. Marcucci, R. Deits, M. Gabiccini, A. Bicchi, and R. Tedrake,“Approximate hybrid model predictive control for multi-contact pushrecovery in complex environments,” in . IEEE, 2017,pp. 31–38.[47] W. Han and R. Tedrake, “Feedback design for multi-contact push recov-ery via LMI approximation of the piecewise-afﬁne quadratic regulator,”in . IEEE, 2017, pp. 842–849.[48] E. F. Camacho, D. R. Ram´ırez, D. Lim´on, D. M. De La Pe˜na, andT. Alamo, “Model predictive control techniques for hybrid systems,”

Annual reviews in control , vol. 34, no. 1, pp. 21–31, 2010.[49] V. V. Naik and A. Bemporad, “Embedded mixed-integer quadraticoptimization using accelerated dual gradient projection,”

IFAC-PapersOnLine , vol. 50, no. 1, pp. 10 723–10 728, 2017.[50] B. Stellato, V. V. Naik, A. Bemporad, P. Goulart, and S. Boyd, “Em-bedded mixed-integer quadratic optimization using the OSQP solver,”in . IEEE, 2018, pp. 1536–1541.[51] P. Hespanhol, R. Quirynen, and S. Di Cairano, “A structure exploitingbranch-and-bound algorithm for mixed-integer model predictive con-trol,” in . IEEE, 2019,pp. 2763–2768.[52] T. Marcucci and R. Tedrake, “Warm start of mixed-integer programsfor model predictive control of hybrid systems,”

IEEE Transactions onAutomatic Control , 2020. [53] J. Liang, S. Di Cairano, and R. Quirynen, “Early termination ofconvex QP solvers in mixed-integer programming for real-time decisionmaking,”

IEEE Control Systems Letters , 2020.[54] H. Michalska and D. Q. Mayne, “Robust receding horizon controlof constrained nonlinear systems,”

IEEE Transactions on Automaticcontrol , vol. 38, no. 11, pp. 1623–1633, 1993.[55] P. O. Scokaert and D. Q. Mayne, “Min-max feedback model predictivecontrol for constrained linear systems,”

IEEE Transactions on Automaticcontrol , vol. 43, no. 8, pp. 1136–1142, 1998.[56] A. Richards and J. P. How, “Robust variable horizon model predictivecontrol for vehicle maneuvering,”

International Journal of Robust andNonlinear Control , vol. 16, no. 7, pp. 333–351, 2006.[57] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. Scokaert, “Con-strained model predictive control: Stability and optimality,”

Automatica ,vol. 36, no. 6, pp. 789–814, 2000.[58] W. P. Heemels, B. De Schutter, and A. Bemporad, “Equivalence ofhybrid dynamical models,”

Automatica , vol. 37, no. 7, pp. 1085–1091,2001.[59] E. Balas, “Disjunctive programming: Properties of the convex hull offeasible points,”