Optimal Mixed Discrete-Continuous Planning for Linear Hybrid Systems
Jingkai Chen ([email protected]), Massachusetts Institute of Technology
Brian C. Williams ([email protected]), Massachusetts Institute of Technology
Chuchu Fan ([email protected]), Massachusetts Institute of Technology
ABSTRACT
Planning in hybrid systems with both discrete and continuous control variables is important for dealing with real-world applications such as extra-planetary exploration and multi-vehicle transportation systems. Meanwhile, generating high-quality solutions given certain hybrid planning specifications is crucial to building high-performance hybrid systems. However, since hybrid planning is challenging in general, most methods use greedy search that is guided by various heuristics, which is neither complete nor optimal and often falls into blind search towards an infinite-action plan. In this paper, we present a hybrid automaton planning formalism and propose an optimal approach that encodes this planning problem as a Mixed Integer Linear Program (MILP) by fixing the action number of automaton runs. We also show an extension of our approach for reasoning over temporally concurrent goals. By leveraging an efficient MILP optimizer, our method is able to generate provably optimal solutions for complex mixed discrete-continuous planning problems within a reasonable time. We use several case studies to demonstrate the extraordinary performance of our hybrid planning method and show that it outperforms a state-of-the-art hybrid planner, Scotty, in both efficiency and solution quality.
KEYWORDS
Linear Hybrid Systems, Hybrid Planning, Optimization
ACM Reference Format:
Jingkai Chen, Brian C. Williams, and Chuchu Fan. 2021. Optimal Mixed Discrete-Continuous Planning for Linear Hybrid Systems. In Proceedings of ACM Conference (Conference'17). ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
Hybrid systems are a powerful modeling framework to capture both the physical plants and the embedded computing devices of complex cyber-physical systems. When planning the desired behaviors of a hybrid system, we have to consider both the discrete actions taken by the computing units and the continuous control inputs for the physical actuators. This poses unique and significant challenges in planning for hybrid systems, as one has to consider the change
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Conference'17, July 2017, Washington, DC, USA. © 2021 Association for Computing Machinery. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM...$15.00. https://doi.org/10.1145/nnnnnnn.nnnnnnn
Figure 1: Map of Mars (labeled regions in the figure: ground, basin, mountain, base, and a forbidden area marked as an obstacle).
Figure 2: Example of Mars transportation problems: the charge station is marked as ⊲ and is at (10, 10); the rover ◦ is at (25, 5); the astronaut ⋄ is at (35, 10); and the destination □ is at (45, 5). In the example solution, the rover path is in red, the astronaut walking path is in blue, and the astronaut taking a ride is in green.

of dynamics of the continuous flow by the control inputs, the interleaving of continuous flows and the discrete transitions between modes, resets associated with transitions, and concurrently running agents in multi-agent systems.

Planning mixed discrete-continuous strategies in hybrid systems is theoretically difficult: on the discrete side, planning with numeric state variables has been proved to be undecidable [26]; on the continuous side, computing the exact unbounded-time reachable states for hybrid systems is also a well-known undecidable problem [27]. Nevertheless, we are interested in high-quality solutions with the shortest makespan or lowest energy consumption, which is crucial to designing high-performance systems.

Motivating Example.
Consider a task where an astronaut should go to an observation location by crossing different terrains (e.g., mountain, ground, and basin) on Mars, and a rover needs to go to the charge station, as shown in Figure 2. The astronaut can either walk or take a rover. The moving speed of the rover is much faster than the walking speed of the astronaut. The rover is powered by a battery. This battery can be charged when the rover is stopped in a charge station, and the rover should always have remaining battery outside of the charge station. While the rover and astronaut should not enter the forbidden areas, the rover can move through different terrains with different velocity limits and energy consumption rates. After the rover is manually shut down, it cannot restart within 1 minute. In this mission, plans with shorter makespans are preferred. A sound solution to this problem is that the astronaut moves directly to the destination, and the rover moves directly to the charge station without picking up the astronaut. However, in this plan, the rover is not used at all. If the rover has enough battery, it can deliver the astronaut to the destination first and then go to the charge station, which saves a lot of time for the astronaut. Unfortunately, energy is always limited in reality. Intuitively, a better solution is that while the astronaut is moving towards the rover, the rover moves to the charge station for charging, then picks up and delivers the astronaut to the destination, and finally returns to the charge station. As the rover moves much faster than the astronaut, this plan requires much less time than letting the astronaut walk to the destination. Another possible faster solution is that the rover picks up and delivers the astronaut somewhere midway to the destination.
Then, the astronaut walks to the destination while the rover is moving back to the charge station for a charge. In this paper, we adopt the hybrid I/O automaton [42] framework to model mixed discrete-continuous planning problems; it is expressive enough to capture all the mentioned features, such as discrete and continuous input and state variables, linear dynamics for continuous flows, and guards and resets for discrete transitions. The crucial technique that makes optimal mixed discrete-continuous planning possible is the introduction of finite-step runs. A step is defined either as a discrete jump or as a set of continuous flows in a mode, where we need to compute the dwell time for staying in this mode. We prove that the optimal solution (also called a run) with finite steps converges to the optimal solution of the original hybrid system when the number of steps goes to infinity. By tailoring the automaton to admit only finite-step runs, we show that finding the optimal input such that the corresponding run has a minimum makespan can be encoded as a Mixed Integer Linear Program (MILP), which can be solved efficiently using off-the-shelf MILP solvers such as Gurobi [25]. To encode our mixed discrete-continuous planning problem as a MILP, we draw inspiration from the idea of modeling flow tubes [31] as linear programs [14]. We further extend these linear programs with discrete decisions to model discrete variables and actions. Our complexity analysis shows that the number of MILP variables and constraints at each step increases linearly with the product of the number of linear constraints in each condition and the number of operators and variables. To accelerate MILP solving, we introduce two types of additional constraints about conflicting operators.
We also show that our solution approach is able to plan for temporally concurrent goals known as Qualitative State Plans (QSPs) [30], which describe the desired system behavior over continuous time as tasks with time windows. To demonstrate the efficiency and solution quality of our method, we benchmarked against the Scotty planning system [14] on three domains: Mars transportation, air refueling, and truck-and-drone delivery. In addition to dealing with different dynamics under a large number of modes, all three domains require judiciously coordinating heterogeneous agent teams for cooperation and carefully reasoning over resources to decide necessary recharging or refueling. The experimental results show that our approach can find high-quality solutions for all the problems in seconds and provide optimality proofs for most examples, while Scotty fails to solve half of the problems within 600 seconds. Moreover, the makespans of our first solutions, returned within 1 second, are already better than those of Scotty, and our final solutions significantly improve on them. The remainder of this paper is organized as follows. We start by discussing the related work in both planning and controller synthesis (Section 2). In Section 3, we give the formal definition of our hybrid planning problem as well as a formulation of our motivating example. In Section 4, we introduce a tractable variant of the hybrid planning problem obtained by fixing the action number of automaton runs, which leads to a finite-step linear hybrid planning problem. In Section 5, we present our MILP encoding of this finite-step linear hybrid planning problem. Then, we introduce our extension to deal with temporally concurrent goals in Section 6. Section 7 shows the results of benchmarking our method against the Scotty planning system on three challenging domains. Finally, concluding remarks are discussed in Section 8.
Planning [1, 4, 6, 10, 14, 41] and controller synthesis [11, 24, 28, 35, 44, 47, 50–52, 54–56] methods intersect at finding (optimal) strategies for various system models with respect to different system specifications. In what follows, we briefly mention a couple of representative approaches that are related, without exhaustively listing all approaches in each category.
Discrete abstraction-based synthesis.
Discrete abstraction-based synthesis methods first compute a discrete, finite-state abstraction of control systems and then synthesize discrete controllers based on automata theories or two-player games [24, 35, 44, 50, 55, 56]. Abstraction-based synthesis tools such as CoSyMA [45], Pessoa [48], LTLMoP [36, 54], TuLiP [15, 56], and SCOTS [49] can support complex nonlinear systems, stochastic uncertainties, or non-deterministic transitions [16, 40, 46, 49, 53, 57], and general GR(1) [57] or Signal Temporal Logic [47] specifications. Our problem may be solved using abstraction-based synthesis with temporal logic specifications. However, none of the above tools can be used directly on our general linear hybrid system with both discrete input/state signals and guards/resets in transitions. Moreover, our approach aims at finding high-quality solutions with low costs or high rewards over long horizons instead of finding all valid solutions. In fact, our planning approach is efficient and effective at finding high-level plans, which is complementary to and can be combined with controller synthesis algorithms for achieving autonomy in complex hybrid systems.
Sampling-based planning.
Sampling-based methods such as Probabilistic Roadmaps (PRM) [34], Rapidly-exploring Random Trees (RRT) [37], Fast Marching Trees (FMT) [32], and the hybrid automata planner of [39] are widely used for searching for plans in nonconvex, high-dimensional, and even partially unknown environments. Researchers also combine the PRM sampling method with classical planning to solve task-and-motion planning problems, which involve both continuous motions and discrete tasks [21–23, 33, 38]. Compared with the deterministic guarantees provided by controller synthesis methods and our approach, these methods come only with probabilistic guarantees.

Hybrid planning.
The planning problems with a subset of the features considered in this paper (i.e., mixed discrete-continuous models, continuous control variables, autonomous transitions, indirect effects, and concurrency) fall into the categories PDDL2.1 [17], PDDL-S [14], or PDDL+ [18]. Some SMT-based PDDL+ planners such as SMTPlan+ [6] and dReal [5] are complete given a finite number of fixed time steps. However, PDDL+ does not support control variables, and these planners solve different problems from ours. As most of the solution approaches to PDDL2.1 and PDDL-S [9, 13, 29] use greedy search methods such as enforced hill-climbing, they are neither complete nor optimal even with finite steps. Moreover, most of their heuristics belong to the Metric-FF family [29], which is known to suffer from resource persistence or cyclical resource transfer [8] when indirect effects or obstacles are present. Thus, none of these heuristics can handle all of these problem features, and they often lead the greedy search to be blind in certain domains. These issues motivate us to develop an effective method that is guaranteed to provide high-quality solutions for mixed discrete-continuous planning problems with various features. Note that hybrid planning problems are closely related to the formalism of hybrid automata studied in model checking [27, 42]; a comparison can be found in [18]. In addition, researchers have put much effort into translating PDDL+ to hybrid automata [2, 3], which leverages advanced hybrid model checking tools [7, 19, 20] to efficiently prove plan non-existence. Our method directly plans for hybrid automata instead of any PDDL extension. These two representations can be translated to each other since snap actions are basically jumps, and the overall conditions and effects of durative actions are basically flows.
By using jumps and flows instead of durative actions, we obtain a clean MILP encoding for hybrid planning problems. Among all the hybrid extensions of PDDL, the problems this paper aims at are most relevant to PDDL-S, since it is the only planning formalism that supports continuous control variables over time [14]. Kongming [41] is the first planner able to solve PDDL-S, and a more scalable planner, Scotty [14], was later developed. Scotty is able to efficiently solve complex underwater exploration problems, and it is the current state-of-the-art PDDL-S planner. The reasons for its efficiency are: (1) Scotty encodes the cumulative effect of each control variable as a single variable, which renders a clean convex optimization problem for plan validation; these encodings are called flow tubes [31]; (2) it uses the temporal relaxed planning graph heuristic (i.e., delete relaxation) [9] to guide its greedy search. Our method is inspired by this cumulative effect encoding and extends the optimization problem to handle discrete decisions and nonconvex conditions. By using such an encoding, we do not need to discretize the timeline with a fixed time step as in [41] or discretize control parameters as in [9]. Meanwhile, our solution approach avoids the incompleteness and suboptimality caused by Scotty's greedy search.
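The cumulative-effect idea above can be sketched numerically: a bounded control input e(t) ∈ [lo, hi] over a step of duration d contributes a cumulative effect Δ = ∫ e(t) dt that is fully characterized by lo·d ≤ Δ ≤ hi·d, so a planner can reason over (Δ, d) instead of the full signal. The helper names and bounds below are illustrative assumptions, not Scotty's actual interface.

```python
# Flow-tube sketch (hypothetical single-integrator dynamics x' = e):
# the set of reachable cumulative effects over duration d is an interval.

def delta_bounds(lo, hi, d):
    """Feasible range of the cumulative control effect Delta over duration d."""
    return lo * d, hi * d

def reachable(x0, lo, hi, d, x_target):
    """Can x0 + Delta reach x_target within duration d for some admissible
    control profile? Holds iff x_target - x0 lies in the Delta interval."""
    dmin, dmax = delta_bounds(lo, hi, d)
    return dmin <= x_target - x0 <= dmax

assert delta_bounds(-1.0, 2.0, 3.0) == (-3.0, 6.0)
assert reachable(0.0, -1.0, 2.0, 3.0, 5.0)       # 5 is within [-3, 6]
assert not reachable(0.0, -1.0, 2.0, 3.0, 7.0)   # 7 exceeds the tube
```

This interval abstraction is what makes the per-step validation convex even though the underlying control signal is infinite-dimensional.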
We use a linear hybrid automaton with inputs as the model for our system and then define the hybrid planning problem on this automaton.
Definition 3.1 (Linear Hybrid Automaton).
A linear hybrid automaton with inputs is a tuple ℋ = ⟨𝑉 = (𝑄 ∪ 𝐸), 𝑞_Init, Goal, 𝐽, 𝐹⟩:
• 𝑄 = 𝐿 ∪ 𝑋 is the set of internal variables, which are also called state variables. 𝐿 is the set of discrete state variables, the values of which val(𝐿) are taken from finite sets called modes. 𝑋 is the set of continuous state variables, the values of which val(𝑋) are taken from continuous sets over the reals. We call val(𝑄) = val(𝐿) × val(𝑋) the internal state space.
• 𝐸 is the set of external variables, which are also called input variables. External variables can also contain discrete and continuous variables, which are defined analogously to the internal variables.
• 𝑞_Init ∈ val(𝐿) × val(𝑋) is an initial state, and Goal is a predicate that represents a set of goal states.
• 𝐽 is the set of jumps. A jump 𝑗 ∈ 𝐽 is associated with a condition cond and an effect eff. The condition cond is a predicate over 𝑉, where a predicate is a computable Boolean-valued function cond : val(𝑉) → 𝔹 that maps the values of the variables 𝑉 to either true or false. The condition is also known as the guard condition or the enabling condition of the jump. An effect eff : 𝑉 → 𝑄 specifies how the values of the state variables change when the jump occurs. It assigns new values to the variables in 𝑄. The variables that are not mentioned in the effect statements are assumed to remain unchanged.
• 𝐹 is the set of flows for the state variables 𝑋. 𝐹_𝑘 ⊆ 𝐹 is the set of flows for 𝑋_𝑘 ⊆ 𝑋, where {𝑋_1, 𝑋_2, .., 𝑋_𝐾} is a set of disjoint continuous variable sets such that ∪_𝑘 𝑋_𝑘 = 𝑋. A flow 𝑓 ∈ 𝐹_𝑘 is associated with a differential equation 𝑋̇_𝑘 = 𝐴_𝑘𝐸 + 𝐵_𝑘 and a condition cond over 𝑉, where 𝐴_𝑘, 𝐵_𝑘 are constant matrices, and cond is defined in the same way as in jumps and specifies when the flow 𝑓 is activated. At each time, multiple flows 𝑓 ∈ 𝐹 can be activated, with exactly one flow 𝑓_𝑘 from each 𝐹_𝑘. That is, there will always be a set of flows, which together specify the evolution of the continuous internal variables 𝑋 as linear differential equations. We call such a set of flows the flow set at each time, and it belongs to the power set of 𝐹. During the time when a flow is activated, the values of the discrete state variables stay the same.

Note that val(𝑄) also defines the invariant set of the internal variables, where val(𝑋) can be nonconvex. Therefore, we can avoid defining the unsafe set separately.

Without loss of generality, we use an integer variable with domain {0, 1, .., |val(𝑣)| − 1} to replace a discrete variable 𝑣 ∈ 𝑉, where |val(𝑣)| is the number of elements in val(𝑣). Thus, we can further assume that all conditions, initial states, and goals are represented as a propositional sentence of linear constraints:

𝜙 ::= true | (𝐺𝑉 ≥ 𝐻) | ¬𝜙 | 𝜙 ∧ 𝜙 | 𝜙 ∨ 𝜙, (1)

where 𝐺 ∈ ℝ^|𝑉| is a |𝑉|-vector of real values and 𝐻 ∈ ℝ is a real value. This propositional sentence of linear constraints can represent both convex and nonconvex regions defined by linear inequalities over both integer and continuous variables. Effects can also be represented by Equation (1), except that their linear constraints involve both 𝑉 = 𝑄 ∪ 𝐸, the state and input variables before taking the effects, and 𝑄′, the state variables after taking the effects.

Example 3.2. To formulate our motivating example, we define two discrete internal variables: the astronaut LA ∈ {0, 1} has modes Walking(0) and Riding(1); the rover LR ∈ {0, 1, 2} has modes Driving(0), Stopped(1), and Charge(2). Internal continuous variables pA represent the astronaut's position, and xR includes the rover's position pR, battery level E, and an internal clock c ∈ [0, ∞). pRx and pRy are the rover's positions over the x-axis and the y-axis, respectively. The initial state places the astronaut at pA = (35, 10) and the rover at pR = (25, 5), as shown in Figure 2. This system also takes commands as input variables, including discrete input variables cmdA ∈ {0, 1} and cmdR ∈ {0, 1, 2}, and continuous input variables vA and vR that represent the astronaut's and the rover's velocities.

Jumps
Board: cond (LA=0) ∧ (cmdA=1) ∧ (pA=pR) ∧ (vR=0); eff (LA=1)
Deboard: cond (LA=1) ∧ (cmdA=0) ∧ (vR=0); eff (LA=0)
Stop: cond (cmdR=0); eff (LR=1), (c=0)
Drive: cond (LR=1) ∧ (cmdR=1) ∧ (c>1); eff (LR=0)
Charge: cond (LR=1) ∧ (cmdR=2) ∧ (pR=(10,10)); eff (LR=2)

Flows
Ride: eq ṗA(t) = vR; cond (LA=1) ∧ (cmdA=1) ∧ (pA=pR)
Walk: eq ṗA(t) = vA; cond (LA=0) ∧ (cmdA=0)
Ground: eq ẋR(t) = [vR, −1, 0]; cond (LR=0) ∧ (cmdR=1) ∧ (pRx < 20)
Mount: eq ẋR(t) = [vR, −2, 0]; cond (LR=0) ∧ (cmdR=1) ∧ (pRx > 20) ∧ (|vR| < 2)
Stop: eq ẋR(t) = [0, 0, 0, 1]; cond (LR=1) ∧ (cmdR=0)
Charge: eq ẋR(t) = [0, 0, 0, 10]; cond (LR=2) ∧ (cmdR=2)

While flow sets can only change continuous state variables, jumps can change both discrete and continuous state variables. The conditions for both jumps and flows can depend on all variables (i.e., including state and input variables). While we call the union of jumps and flows 𝐽 ∪ 𝐹 the operators 𝑂, we call the jumps and flow sets together the actions 𝐴. We use cond_𝑎 to denote the set of states 𝑣 ∈ val(𝑉) such that the condition associated with the action 𝑎 is true: 𝜙(𝑣) = True. Given a flow set 𝑎 = ∪_𝑘 𝑓_𝑘, we denote the derivative of 𝑋 as 𝐴𝐸 + 𝐵. Such 𝐴 and 𝐵 can be easily constructed from the differential equations 𝑋̇_𝑘 = 𝐴_𝑘𝐸 + 𝐵_𝑘 of the flows. If at the beginning of the flow the value of 𝑋 is 𝑥, and the elapsed time of this flow is 𝛿, then 𝑋's value is updated as 𝑥 ← 𝑥 + 𝐴Δ + 𝐵𝛿, where Δ = ∫₀^𝛿 𝐸 𝑑𝑡 is the cumulative effect of 𝐸 during 𝛿. An input signal is a function 𝑒 : [0, ∞) → val(𝐸), which specifies the value of the input variables at any time 𝑡 ≥
0. Once an input signal is fixed, a run of the hybrid automaton is defined as follows:
Definition 3.3.
Given a linear hybrid automaton ℋ = ⟨𝑉 = (𝑄 ∪ 𝐸), 𝑞_Init, Goal, 𝐽, 𝐹⟩ and an input command 𝑒 : [0, ∞) → val(𝐸), a run of ℋ is defined as a sequence of internal states 𝑞_0, · · · , 𝑞_𝑛 ∈ val(𝐿) × val(𝑋):

𝜉_{ℋ,𝑒} = 𝑞_0 −(𝑎_0, 𝛿_0)→ 𝑞_1 · · · 𝑞_{𝑛−1} −(𝑎_{𝑛−1}, 𝛿_{𝑛−1})→ 𝑞_𝑛,

such that:
(1) 𝑞_0 ∈ 𝑞_Init and 𝑞_𝑛 ∈ Goal. We assume that Zeno behaviors are not allowed in our runs; that is, we do not allow an infinite number of jumps to occur in a finite time interval.
(2) 𝑎_0, · · · , 𝑎_{𝑛−1} are actions. Let 𝑡_𝑖 = Σ_{𝑗=0}^{𝑖−1} 𝛿_𝑗 be the accumulated time associated with 𝑞_𝑖 for each 𝑖 = 0, · · · , 𝑛; then:
(a) if 𝑎_𝑖 ∈ 𝐽 is a jump, then 𝛿_𝑖 = 0, (𝑞_𝑖, 𝑒(𝑡_𝑖)) ∈ cond(𝑎_𝑖), and 𝑞_{𝑖+1} = eff(𝑞_𝑖, 𝑒(𝑡_𝑖));
(b) if 𝑎_𝑖 is a flow set, then 𝛿_𝑖 ≥ 0, (𝑞_𝑖, 𝑒(𝑡_𝑖)) ∈ cond(𝑎_𝑖), 𝑥_{𝑖+1} = 𝑥_𝑖 + 𝐴 ∫_{𝑡_𝑖}^{𝑡_{𝑖+1}} 𝑒(𝜏) 𝑑𝜏 + 𝐵𝛿_𝑖, and ℓ_{𝑖+1} = ℓ_𝑖, where 𝑞_𝑖 = (ℓ_𝑖, 𝑥_𝑖). Moreover, for all 𝑡 ∈ [𝑡_𝑖, 𝑡_{𝑖+1}), (𝑞(𝑡), 𝑒(𝑡)) should always satisfy cond(𝑎_𝑖).

We also denote the total time Σ_{𝑖=0}^{𝑛−1} 𝛿_𝑖 of a run of 𝑒 as 𝜉_{ℋ,𝑒}.TotalTime. Note that although 𝑒 is defined on the infinite time horizon [0, ∞), we do not need the value of 𝑒(𝑡) when 𝑡 > 𝜉_{ℋ,𝑒}.TotalTime.
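Definition 3.3 can be illustrated with a toy, scalar interpreter for runs. The representation below (mode strings, lambda guards and effects, a single continuous variable) is our own sketch for intuition, not the paper's implementation.

```python
def apply_flow(x, A, B, Delta, delta):
    """Flow update x <- x + A*Delta + B*delta (scalar sketch of the vector
    update in Definition 3.3(2b), with Delta the cumulative input effect)."""
    return x + A * Delta + B * delta

def run(q0, steps):
    """Replay a sequence of (kind, payload) actions per Definition 3.3:
    jumps take zero time and must satisfy their guard when they fire;
    flows advance the continuous state and accumulate dwell time."""
    (mode, x), t = q0, 0.0
    for kind, payload in steps:
        if kind == "jump":
            guard, effect = payload
            assert guard(mode, x), "jump condition must hold when it fires"
            mode, x = effect(mode, x)        # delta_i = 0 for jumps
        else:  # "flow": payload = (A, B, Delta, delta)
            A, B, Delta, delta = payload
            x = apply_flow(x, A, B, Delta, delta)
            t += delta                        # dwell time accumulates
    return (mode, x), t
```

For example, a flow with A=1, B=0, Δ=2, δ=1 followed by a guarded "stop" jump yields final state x=2 after total time 1, matching the update rule and the zero-duration jump convention.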
Now, given a linear hybrid automaton with inputs, we can define the planning problem as finding an input signal whose run has the minimum makespan.
Definition 3.4.
Given a linear hybrid automaton ℋ = ⟨𝑉 = (𝑄 ∪ 𝐸), 𝑞_Init, Goal, 𝐽, 𝐹⟩, the planning problem is to find the optimal input signal 𝑒* such that the corresponding run's makespan is minimized: 𝑒* = argmin_𝑒 𝜉_{ℋ,𝑒}.TotalTime.
Solving the planning problem (Definition 3.4) to get the optimal input signal 𝑒* requires reasoning over all possible 𝑒(𝑡). For most hybrid automata, this is intractable, as the unbounded-time reachability problem is undecidable even for rectangular hybrid automata [27], which are simpler than ours in that the right-hand sides of the differential equations contain only constants. Essentially, to solve for the optimal 𝑒*, we need to assign values to all input variables for infinitely many 𝑡. To make this problem solvable, we fix the number of actions allowed in the run of the hybrid automaton and simplify the original problem by searching for the 𝑒(𝑡) that corresponds to each action. We introduce a finite-step linear hybrid automaton to capture this idea.

Definition 4.1. A finite-step linear hybrid automaton with inputs ℋ_𝑛 is a linear hybrid automaton ℋ (as defined in Definition 3.1) with all runs of ℋ restricted to exactly 𝑛 actions.

In Section 5, we present how to use a MILP encoding to solve the planning problem for finite-step linear hybrid automata. Next, we show that for a hybrid automaton ℋ, once we fix the number of actions 𝑛 and make it ℋ_𝑛, the feasible solution set (the set of input signals 𝑒 such that 𝜉_{ℋ_𝑛,𝑒} is a run of ℋ with 𝑛 actions) is non-decreasing as the number of actions 𝑛 increases.

Lemma 4.2. Let ℋ_𝑛 be a finite-step linear hybrid automaton for ℋ with fixed action number 𝑛. Let Ξ_{ℋ_𝑛} and ℰ_{ℋ_𝑛} be all the runs of ℋ_𝑛 and their corresponding input signals, respectively. For any 0 < 𝑛′ < 𝑛, we have ℰ_{ℋ_{𝑛′}} ⊆ ℰ_{ℋ_𝑛}.

Proof:
For any 𝑒 ∈ ℰ_{ℋ_𝑛}, let 𝑞 −(𝑎,𝛿)→ 𝑞′ be a segment of its run 𝜉_{ℋ,𝑒}, where 𝑎 is a flow set. We can replace this segment with 𝑞 −(𝑎,𝛿)→ 𝑞′ −(𝑎,0)→ 𝑞′, and the new run is still a run of ℋ given input signal 𝑒. As this new run has 𝑛 + 1 actions, 𝑒 ∈ ℰ_{ℋ_{𝑛+1}} for any 𝑒 ∈ ℰ_{ℋ_𝑛}. □

As the original hybrid automaton ℋ does not fix the action number, ℰ_ℋ = ∪_{𝑛=1}^{∞} ℰ_{ℋ_𝑛} = lim_{𝑛→∞} ℰ_{ℋ_𝑛}, which directly follows from Lemma 4.2. Let

𝑒*_𝑛 = argmin_{𝑒 ∈ ℰ_{ℋ_𝑛}} 𝜉_{ℋ_𝑛,𝑒}.TotalTime. (2)

As ℰ_{ℋ_{𝑛′}} ⊆ ℰ_{ℋ_𝑛}, it is easy to check that 𝜉_{ℋ_{𝑛′},𝑒*_{𝑛′}}.TotalTime ≥ 𝜉_{ℋ_𝑛,𝑒*_𝑛}.TotalTime. This gives us the following corollary.

Corollary 4.3.
Following Lemma 4.2, lim_{𝑛→∞} 𝑒*_𝑛 = 𝑒*, where 𝑒*_𝑛 is defined as in Equation (2).

In this section, we describe how to encode a finite-step linear hybrid planning problem as a Mixed Integer Linear Program, in which the numbers of variables and constraints at each step increase linearly with the product of the operator number and the number of disjuncts involved in each condition (Section 5.3). We first introduce a method to encode formulas with the syntax of Equation (1) (Section 5.1), and move on to the detailed encoding procedure for the finite-step linear hybrid problem (Section 5.2). Additional constraints about conflicting operators for speeding up MILP solving are discussed in Section 5.4.
First, we introduce the methods to encode constraints with the syntax of Equation (1) as MILP constraints. We start by encoding a canonicalized form and move on to the general case.
Encoding CNF Linear Constraint Formula
Note that a condition expressed using Equation (1) can always be transformed into a conjunctive normal form (CNF) of linear constraints:

cond(𝑉) ≡ ∧_{𝑟=1}^{𝑚} ∨_{𝑠=1}^{𝑚_𝑟} (cond_{𝑟𝑠}(𝑉)) ≡ ∧_{𝑟=1}^{𝑚} ∨_{𝑠=1}^{𝑚_𝑟} (𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠}), (3)

where 𝐺_{𝑟𝑠} ∈ ℝ^|𝑉| and 𝐻_{𝑟𝑠} ∈ ℝ, 𝑚 is the number of conjuncts in cond, and 𝑚_𝑟 is the number of disjuncts in the 𝑟th conjunct. For convenience, we also replace 𝐺_{𝑟𝑠}𝑉 > 𝐻_{𝑟𝑠} with 𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠} without invalidating the solutions. As disjunctions are present in cond, which result in nonconvex sets in general, we use the big-M method to handle the disjunctive logic in ∨_{𝑠=1}^{𝑚_𝑟} (𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠}). We define an 𝑚_𝑟-vector of intermediate Boolean indicator variables 𝛼_𝑟 with domain {0, 1}^{𝑚_𝑟}, where 𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠} should hold if 𝛼_{𝑟𝑠} = 1. Then, the 𝑟th disjunction ∨_{𝑠=1}^{𝑚_𝑟} (𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠}) is represented as a set of linear constraints:

(∧_{𝑠=1}^{𝑚_𝑟} (𝛼_{𝑟𝑠} = 1 ⟹ 𝐺_{𝑟𝑠}𝑉 ≥ 𝐻_{𝑟𝑠})) ∧ (Σ_{𝑠=1}^{𝑚_𝑟} 𝛼_{𝑟𝑠} ≥ 1). (4)

Let 𝑀 be a very large positive number; then each implication is encoded as a linear inequality over both 𝑉 and the indicator variables:

𝐺_{𝑟𝑠}𝑉 + 𝑀(1 − 𝛼_{𝑟𝑠}) ≥ 𝐻_{𝑟𝑠}. (5)

As cond(𝑉) = true needs all the conjuncts to hold, we end up with the following constraint:

∧_{𝑟=1}^{𝑚} (∧_{𝑠=1}^{𝑚_𝑟} (𝐺_{𝑟𝑠}𝑉 + 𝑀(1 − 𝛼_{𝑟𝑠}) ≥ 𝐻_{𝑟𝑠}) ∧ (Σ_{𝑠=1}^{𝑚_𝑟} 𝛼_{𝑟𝑠} ≥ 1)). (6)

Encoding General Linear Constraint Formula
While some regions are easier to encode using CNFs, the CNFs of some others take more space. For example, in Figure 3, the blue region is a good candidate to be encoded as a CNF of the form (∧𝜙_𝑖) ∧ (∨𝜙_𝑖) ∧ (∨𝜙_𝑖), where each 𝜙_𝑖 is a linear inequality whose direction is given in the figure. However, the green region is more intuitively encoded as (∧𝜙_𝑖) ∨ (∧𝜙_𝑖), which is not a CNF.

Figure 3: Examples of two regions to encode as linear constraint formulas.
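The big-M encoding of a single disjunction (Equations (4)-(6)) can be sanity-checked by brute force: a point satisfies the encoded constraints iff some 0/1 indicator assignment relaxes all but one satisfied disjunct. The region and constants below are hypothetical.

```python
# Brute-force check of the big-M disjunction encoding; not a MILP solver,
# just an enumeration of the indicator vector alpha over {0, 1}^m_r.
from itertools import product

M = 1e6  # big-M constant (assumed large enough for the toy bounds below)

def bigm_feasible(V, G, H):
    """True iff some indicator vector alpha satisfies, per Equation (6):
    G_s.V + M*(1 - alpha_s) >= H_s for all s, and sum(alpha) >= 1."""
    n = len(G)
    dot = lambda g: sum(gi * vi for gi, vi in zip(g, V))
    for alpha in product((0, 1), repeat=n):
        if sum(alpha) >= 1 and all(
                dot(G[s]) + M * (1 - alpha[s]) >= H[s] for s in range(n)):
            return True
    return False

# Nonconvex region (x >= 3) or (-x >= 1), i.e., x >= 3 or x <= -1:
G, H = [(1.0,), (-1.0,)], [3.0, 1.0]
assert bigm_feasible((4.0,), G, H)       # satisfies the first disjunct
assert bigm_feasible((-2.0,), G, H)      # satisfies the second disjunct
assert not bigm_feasible((0.0,), G, H)   # lies in the excluded gap
```

Setting an indicator to 0 adds M to the left-hand side, making that disjunct's inequality vacuously true; the sum constraint forces at least one disjunct to hold for real.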
We can use similar methods to encode the general case of Equation (1) in a recursive fashion. Given a set of formulas {𝜙_1, 𝜙_2, .., 𝜙_𝑚}, we assume the MILP constraint for 𝜙_𝑟 is already encoded as ∧_{𝑠=1}^{𝑚_𝑟} (LHS_{𝑟𝑠} ≥ RHS_{𝑟𝑠}), which is a set of linear constraints over 𝑉 and some indicator variables. We show both the conjunction and disjunction encodings of these formulas. To obtain the MILP constraint of the conjunction ∧_{𝑟=1}^{𝑚} 𝜙_𝑟, we have:

∧_{𝑟=1}^{𝑚} ∧_{𝑠=1}^{𝑚_𝑟} (LHS_{𝑟𝑠} ≥ RHS_{𝑟𝑠}). (7)

To encode the disjunction ∨_{𝑟=1}^{𝑚} 𝜙_𝑟, we introduce an 𝑚-vector of Boolean variables 𝛼, and the disjunction is captured by:

(∧_{𝑟=1}^{𝑚} ∧_{𝑠=1}^{𝑚_𝑟} (LHS_{𝑟𝑠} + 𝑀(1 − 𝛼_𝑟) ≥ RHS_{𝑟𝑠})) ∧ (Σ_{𝑟=1}^{𝑚} 𝛼_𝑟 ≥ 1). (8)

For each linear constraint, if some indicator variable in it is not equal to 1, the constraint trivially holds. This set of constraints can be further canonicalized into ∧_{𝑟𝑠} (LHS_{𝑟𝑠} ≥ RHS_{𝑟𝑠}) over 𝑉 and indicator variables for other logic operations.

Now we are ready to encode the entire planning problem as defined in Definition 3.4 for linear hybrid automata with 𝑛-step runs. To represent the internal states {𝑞_0, 𝑞_1, .., 𝑞_𝑛}, we define a set of variables {𝑄_0, 𝑄_1, .., 𝑄_𝑛}, where 𝑄_𝑖 corresponds to the internal state right after 𝑎_{𝑖−1} occurs and right before 𝑎_𝑖 occurs. Their domains are copied from 𝑄. To represent the input signal 𝑒, we also have variables {𝐸_0, 𝐸_1, .., 𝐸_{𝑛−1}}, where 𝐸_𝑖 corresponds to the value of 𝐸 when 𝑎_𝑖 occurs, and their domains are copied from 𝐸. To represent the actions that happen at each step, we define a set of binary activation variables {𝑃_0, 𝑃_1, .., 𝑃_{𝑛−1}}. 𝑃_𝑖 is the union of 𝑃_𝑖^𝐽 and 𝑃_𝑖^𝐹 = 𝑃_𝑖^{𝐹_1} ∪ 𝑃_𝑖^{𝐹_2} ∪ .. ∪ 𝑃_𝑖^{𝐹_𝐾}, which are the activation variables at step 𝑖 for the jumps 𝐽 and the flows {𝐹_1, 𝐹_2, .., 𝐹_𝐾}, respectively. Each 𝑝_𝑖^𝑜 ∈ 𝑃_𝑖 corresponds to an operator 𝑜 (i.e., a jump or a flow) at step 𝑖. If 𝑝_𝑖^𝑜 =
1, operator 𝑜 is activated at step 𝑖; otherwise, 𝑜 is inactivated. To fully determine the effects of flows, we need to specify the cumulative effects of the input variables and the elapsed time during these flows. Thus, we define 𝑑_𝑖 with domain [0, ∞) to represent the elapsed time during step 𝑖, and a real variable Δ_𝑖 to denote ∫_0^{𝑑_𝑖} 𝐸_𝑖 𝑑𝑡, the cumulative effect of 𝐸_𝑖 during step 𝑖. We denote the set of all these MILP variables as 𝒱. Let Π : 𝒱 → ℝ be a MILP solution that maps a MILP variable 𝑣 ∈ 𝒱 to a value Π(𝑣) ∈ val(𝑣). Then, given a MILP solution Π, we can get the values of the input command 𝑒 as well as a valid run 𝜉_{ℋ,𝑒} = 𝑞_0 −(𝑎_0,𝛿_0)→
Goal ( 𝑄 𝑛 ) = true ) Operator activation
C2 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝐹 𝑘 ⊆ 𝐹 ( (cid:205) 𝑝 𝑜𝑖 ∈( 𝑃 𝐽𝑖 ∪ 𝑃 𝐹𝑘𝑖 ) 𝑝 𝑜𝑖 = ) Jump constraint
C3 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑗 ⊆ 𝐽 ( ( 𝑝 𝑗𝑖 = ) = ⇒ cond 𝑗 ( 𝑉 𝑖 )) C4 . (cid:211) 𝑛𝑖 = (cid:211) 𝑗 ⊆ 𝐽 ( ( 𝑝 𝑗𝑖 = ) = ⇒ ( 𝑄 𝑖 = eff 𝑗 ( 𝑉 𝑖 − ))) C5 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑗 ∈ 𝐽 ( ( 𝑝 𝑗𝑖 = ) = ⇒ ( 𝑑 𝑖 = )) Flow constraint
C6 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑓 ⊆ 𝐹 ( 𝑝 𝑓𝑖 = ) = ⇒ ( (cid:211) 𝑟 (cid:212) 𝑠 ( cond 𝑄𝑓 ,𝑟𝑠 ( 𝑄 𝑖 ) ∧ cond 𝑄𝑓 ,𝑟𝑠 ( 𝑄 𝑖 + ))) C7 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑓 ⊆ 𝐹 ( 𝑝 𝑓𝑖 = ) = ⇒ cond 𝐸𝑓 ( 𝐸 𝑖 ) C8 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑓 ⊆ 𝐹 ( 𝑝 𝑓𝑖 = ) = ⇒ cond Δ 𝑓 ( Δ 𝑖 , 𝑑 𝑖 ) C9 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑓 ⊆ 𝐹 ( ( 𝑝 𝑓𝑖 = ) = ⇒ ( 𝑋 𝑘 ( 𝑖 + ) − 𝑋 𝑘𝑖 = 𝐴 𝑓 Δ 𝑖 − 𝐵 𝑓 𝑑 𝑖 )) C10 . (cid:211) 𝑛 − 𝑖 = (cid:211) 𝑓 ∈ 𝐹 ( ( 𝑝 𝐹𝑖 = ) = ⇒ ( 𝐿 𝑖 + = 𝐿 𝑖 )) Figure 4:
Figure 4: Constraints in the MILP encoding.

𝑞_1 · · · 𝑞_{𝑛−1} −(𝑎_{𝑛−1},𝛿_{𝑛−1})→ 𝑞_𝑛 as given in Definition 3.3. We extract 𝑒 over the duration [0, Σ_{𝑖=0}^{𝑛−1} Π(𝑑_𝑖)] as follows:

𝑒(𝑡) = Π(𝐸_𝑖), if 𝑡 = Σ_{𝑗=0}^{𝑖−1} Π(𝑑_𝑗);
𝑒(𝑡) = Π(Δ_𝑖)/Π(𝑑_𝑖), if Σ_{𝑗=0}^{𝑖−1} Π(𝑑_𝑗) < 𝑡 < Σ_{𝑗=0}^{𝑖} Π(𝑑_𝑗). (9)

We extract the run 𝜉_{ℋ,𝑒} of 𝑒 from Π as follows:

𝑞_𝑖 = Π(𝑄_𝑖), for 𝑖 ∈ {0, 1, .., 𝑛},
𝛿_𝑖 = Π(𝑑_𝑖), for 𝑖 ∈ {0, 1, .., 𝑛 − 1},
𝑎_𝑖 = 𝑎, if 𝑝_𝑎 ∈ 𝑃_𝑖 and Π(𝑝_𝑎) = 1, for 𝑖 ∈ {0, 1, .., 𝑛 − 1}. (10)

As specified in Definition 3.4, we aim at finding a run with minimum
TotalTime, and thus the objective function is Σ_{𝑖=0}^{𝑛−1} 𝛿_𝑖. Next, we introduce the constraints over these variables, which force exactly one jump or one flow set to be chosen at every time step, with its conditions satisfied and effects imposed, such that the goal states can be reached from the initial state through these actions. Figure 4 summarizes all constraints C1-C10 that encode the planning problem. We have the following theorem, which justifies that the pair of input command and run given by Equations (9) and (10) is valid:

Theorem 5.1.
Given a 𝑛 -step hybrid automata ℋ and a MILPsolution Π over the variables 𝒱 as defined in Section 5.2.1, the inputcommand 𝑒 and run 𝜉 ℋ ,𝑒 that are extracted from Π by Equation (9)-Equation (10) are an optimal solution of ℋ if 𝒱 satisfies constraintC1-C10 in Figure 4 and (cid:205) 𝑛 − 𝑖 = 𝛿 𝑖 is minimized. Next, we prove Theorem 5.1 by explaining C1-C10 in detail. Asthese constraints are the exact translation of Definition 3.3, oursolution approach is sound and complete and thus optimal.
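To make the extraction in Equation (9)-Equation (10) concrete, the sketch below recovers the states, actions, and durations of a run from a solved assignment Π. The dictionary layout for Π (`"Q"`, `"d"`, and `"p"` keyed by step) is a hypothetical container format for illustration, not the paper's implementation.

```python
# Sketch: recover a run (Equation (10)) from a MILP assignment Pi.
# Pi is a hypothetical dict: Pi["Q"][i] = state at step i,
# Pi["d"][i] = elapsed time d_i, Pi["p"][(o, i)] = activation of operator o.

def extract_run(Pi, n):
    """Return (q, a, delta): states q_0..q_n, actions a_0..a_{n-1},
    and durations delta_0..delta_{n-1}."""
    q = [Pi["Q"][i] for i in range(n + 1)]
    delta = [Pi["d"][i] for i in range(n)]
    # a_i is the unique operator whose activation variable is 1 at step i.
    a = []
    for i in range(n):
        active = [o for (o, j), v in Pi["p"].items() if j == i and v == 1]
        assert len(active) == 1, "constraint C2 guarantees exactly one operator"
        a.append(active[0])
    return q, a, delta

# Toy 2-step assignment: a jump 'j1' (zero duration) followed by a flow 'f1'.
Pi = {
    "Q": {0: (0.0,), 1: (1.0,), 2: (4.0,)},
    "d": {0: 0.0, 1: 3.0},
    "p": {("j1", 0): 1, ("f1", 0): 0, ("j1", 1): 0, ("f1", 1): 1},
}
q, a, delta = extract_run(Pi, 2)
print(a, delta)  # ['j1', 'f1'] [0.0, 3.0]
```

Note that the assertion mirrors constraint C2: a feasible assignment activates exactly one operator per step, so the action sequence is well defined.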
Initial and Goal States
First, constraint C1 ensures that a run starts from q_Init and that its final state Q_n satisfies Goal, so that Definition 3.3(1) is respected. We use Equation (7)-Equation (8) to encode the constraint Goal(Q_n) = true.

Operator Activation
Constraint C2 avoids the ambiguity of having multiple jumps or multiple flows over the same internal continuous variables at each step. Recall that p_{oi} = 1 means operator o is activated at step i. Constraint C2 forces one of the following conditions to hold: (1) exactly one jump is activated, and all the flows are inactivated (Definition 3.3(2a)); or (2) all jumps are inactivated, and exactly one flow is activated for each continuous variable set (Definition 3.3(2b)).

Jump Constraint
When a jump is activated, its condition should be satisfied, and its effects should be imposed (Definition 3.3(2a)). For each jump j ∈ J with condition cond_j, when j is activated at step i, i.e., p_{ji} = 1, condition cond_j should hold, which is captured by C3. Constraint C4 enforces the effect, the linear constraint Q_i = eff_j(V_{i−1}), to take place right after the jump. Note that this constraint also forces the unaffected variables to remain unchanged after the jump. In addition, C5 ensures that the elapsed time during a jump is zero; it is activated whenever some jump is chosen.

Flow Constraint
While the condition of a jump only needs to hold right before it happens, the condition of a flow should hold until the next action (Definition 3.3(2b)). Let cond_f be the condition of a flow f ∈ F, and denote the constraints of cond_f over Q and E as cond^Q_f and cond^E_f, respectively. Given our solution specification, the constraint over Q should hold throughout the execution of f, including its start and end, while the constraint over E should also hold throughout, except at the end of the flow, where the change of E may trigger other jumps or flows.

Since the considered dynamics are linear and all the conditions are sets of disjunctive linear constraints, satisfying cond^Q_f at the start and the end of flow f with respect to the same disjunct in each disjunctive linear constraint implies that cond^Q_f holds during f. Constraint C6 captures cond^Q_f at the start and end of an activated flow f. This constraint is sufficient to guarantee that the linear trajectory from Q_i to Q_{i+1} always satisfies cond^Q_f, for the following reason: the two endpoints share the same indicator variables and thus satisfy the same disjunct in each disjunction, so they lie in the same convex region; as the line segment between two points in a convex region stays in this region, the trajectory from Q_i to Q_{i+1} satisfies cond^Q_f. A similar encoding for motion planning with polytope obstacles can be seen in [12]; our encoding method extends it to handle more general linear constraint formulas.

As condition cond^E_f should hold throughout flow f except at its end, we add constraints over E when f starts, as well as constraints over Δ, the cumulative effect of E while f is happening. The former is captured in C7 and the latter in C8. For a linear constraint G^E_{f,rs} E_i ≥ H^E_{f,rs}, we obtain an equivalent linear constraint over Δ_i and d_i by integrating both sides over time and substituting in Δ_i = ∫_0^{d_i} E dt:

G^E_{f,rs} Δ_i ≥ H^E_{f,rs} d_i. (11)

By doing this for each G^E_{f,rs} E_i ≥ H^E_{f,rs} in cond^E_{f,rs}, we obtain the condition cond^Δ_{f,rs} over (Δ, d).

Then, we determine the evolution of the state variables during flows. Recall that the state variables Q consist of continuous state variables X and discrete state variables L. Let the differential equation of a flow f be Ẋ_k = A_f E + B_f; then C9 enforces the continuous dynamics by adding the effects of Δ and d to X_k. Constraint C10 makes sure that flows do not change the discrete variables.

Now, we discuss the complexity of our MILP encoding. Let n be the number of total steps, K the total number of disjoint continuous variable sets {X_1, X_2, ..., X_K}, Q the internal state variables, E the input variables, J the jumps, F the flows, and m and m′ the maximum numbers of linear inequalities and of disjuncts in each condition or effect, respectively. As shown in Section 5.2.1, the total number of variables in the MILP is the sum of the following: the variables for internal states, input signals, elapsed times, and cumulative effects, which amount to n(|Q| + 2|E| + 1) MILP variables in total; the activation variables for flows and jumps, which amount to n(|J| + |F|); and the additional Boolean variables indicating the activated disjuncts in conditions, which amount to 2m′ + nm′(|J| + 2|F|), as shown in Table 1. Thus, the number of all the variables is in 𝒪(n(|V| + m′(|J| + |F|))), where 𝒪 is the asymptotic notation. Table 1:
Numbers of indicator variables in C1-C10.
C1: 2m′   C2: 0   C3: nm′|J|   C4 or C5: 0   C6: nm′|F|   C7 or C8: nm′|F|   C9 or C10: 0

Table 2:
Numbers of linear constraints in C1-C10.
C1: 2m   C2: nK   C3 or C4: nm|J|   C5: n|J|   C6: nm|F|   C7 or C8: nm|F|   C9 or C10: n|F|

We show the total numbers of constraints in C1-C10 in Table 2. The total number of all these constraints is 2m + n(K + (2m+1)|J| + (3m+2)|F|). As K ≤ |F|, the number of total constraints is in 𝒪(nm(|J| + |F|)). At each time step, the number of constraints increases linearly with the product of the number of operators |J| + |F| and the number of linear inequalities m in each condition.

To accelerate MILP solving, we add additional constraints that encode conflicting operators, which cannot happen together or in sequence. While constraints C3, C4, C6, and C7 already prevent these conflicting operators from happening, the additional constraints help a MILP optimizer effectively prune the state space and thus reduce total runtimes.

One type of additional constraint concerns flows with mutually exclusive conditions. Let f, f′ ∈ F be two flows whose differential equations scope over different continuous variable sets X_k, X_k′, and let cond_f and cond_f′ be their conditions. If cond_f ∧ cond_f′ is always false, which means their specified states are totally disjoint, we add the constraint p_{fi} + p_{f′i} ≤ 1 for i ∈ {0, 1, ..., n−1}, which ensures that at most one of these flows is activated at every step. The total number of the added constraints is nm_f, where m_f is the number of conflicting flow pairs. Note that these constraints are redundant, as they are already implied by C6, but they help a MILP optimizer identify conflicting operators without checking the more complex constraints in C6.

Another type of additional constraint concerns subsequent conflicting operators. Let o, o′ ∈ J ∪ F be two operators, and let post^Q_o and pre^Q_o′ be the possible internal states after taking o and the possible internal states before o′, respectively. We have pre^Q_o′ = cond^Q_o′ regardless of whether o′ is a jump or a flow, where cond^Q_o′ is the condition of o′ over the internal states Q.
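The conflicting-flow cuts p_{fi} + p_{f′i} ≤ 1 can be generated mechanically once the conflicting pairs are known. The sketch below is a minimal illustration, assuming a user-supplied `conflicting` predicate (e.g., obtained from a pairwise satisfiability check of the flow conditions); the flow names are hypothetical.

```python
# Sketch: emit the redundant cuts p_{f,i} + p_{f',i} <= 1 for every
# pair of flows with mutually exclusive conditions, at every step.
from itertools import combinations

def mutex_flow_cuts(flows, conflicting, n):
    """flows: list of flow names; conflicting(f, g) -> True when
    cond_f /\\ cond_g is unsatisfiable; n: number of steps.
    Returns ((f, i), (g, i)) pairs meaning p_{f,i} + p_{g,i} <= 1."""
    cuts = []
    for f, g in combinations(flows, 2):
        if conflicting(f, g):
            for i in range(n):
                cuts.append(((f, i), (g, i)))
    return cuts

# Toy example: 'climb' and 'cruise' demand disjoint state regions,
# so they can never be activated at the same step.
conflicts = {frozenset({"climb", "cruise"})}
cuts = mutex_flow_cuts(
    ["climb", "cruise", "charge"],
    lambda f, g: frozenset({f, g}) in conflicts,
    n=4,
)
print(len(cuts))  # 4 cuts: one per step for the single conflicting pair
```

As in the analysis above, the number of emitted cuts is n·m_f, the number of steps times the number of conflicting flow pairs.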
On the other hand, post^Q_o can be different for different operator types: if o is a flow, post^Q_o is still cond^Q_o, since flow o needs this condition to hold while it is happening; if o is a jump, we set post^Q_o = {eff_o(q, u) | cond^Q_o(q) = true and (q, u) ∈ V}. If post^Q_o ∧ pre^Q_o′ is always false, we know that o and o′ cannot happen in sequence. Thus, we add the constraint p_{oi} + p_{o′,i+1} ≤ 1 for i ∈ {0, 1, ..., n−2}. The total number of the added constraints is (n−1)m_o, where m_o is the number of conflicting operator pairs that cannot happen in sequence.

In this section, we extend our method to handle a set of temporally concurrent goals by compiling them into our hybrid automata with a certain final goal. There are various types of specifications for representing desired system behaviors over continuous time in both planning and control, such as Signal Temporal Logic (STL) [43] and Qualitative State Plans (QSPs) [30]. While the former formalism has a more expressive syntax based on formal logic, a QSP is a well-known specification used in planning and is more suitable to the applications we consider in this paper: it specifies a set of tasks to complete as well as the temporal bounds between their starts and ends. We introduce our method for dealing with QSPs in this section and present related experiments in Section 7.3. We view exploring the methods and applications related to STL as future work.
Figure 5:
QSP example of four events and two episodes. Episode ep_1 constrains the astronaut to stay at the initial state between 20 and 30 minutes right after the mission begins; episode ep_2 constrains the astronaut to stay at the charge station between 20 and 30 minutes sometime during the mission.

A QSP is a tuple qsp = ⟨EV, EP⟩: EV is a set of events, and e_0 ∈ EV is the initial event, which represents the mission beginning; EP is a set of episodes. Each episode ep = ⟨e_⊢, e_⊣, lb, ub, cond⟩ is associated with start and end events e_⊢, e_⊣ ∈ EV, a duration bound [lb, ub], and a condition cond. For each e ∈ EV, we denote {ep ∈ EP | e = ep.e_⊢} as starting(e) and {ep ∈ EP | e = ep.e_⊣} as ending(e). An example of a QSP is given in Figure 5.

A schedule s: EV → R_{≥0} for qsp is a function that maps each e ∈ EV to a non-negative real value such that (1) s(e_0) = 0; and (2) lb_i ≤ s(e_⊣,i) − s(e_⊢,i) ≤ ub_i for every ep_i ∈ EP. We say a trajectory ξ satisfies qsp if there exists a schedule s such that, for every ep_i ∈ EP, cond_i(ξ(t)) = true when s(e_⊢,i) < t < s(e_⊣,i).

Given a hybrid automaton ℋ = ⟨V = (Q ∪ E), q_Init, Goal, J, F⟩ and a QSP qsp = ⟨EV, EP⟩, we compile this QSP into the original automaton as described below, and the runs of the obtained new automaton ℋ′ respect both ℋ and qsp. We denote the new automaton as ℋ′ = ⟨V′ = (Q′ ∪ E), q′_Init, Goal′, J′, F′⟩.

First, we make a clock variable c_ep with domain [−2, ∞) for each episode ep ∈ EP. While c_ep = −1 means ep has not started, c_ep = −2 means ep has been achieved. When ep is happening, c_ep ≥ 0. Thus, the continuous state variables of ℋ′ are Q′ = Q ∪ C with C = {c_ep ∈ [−2, ∞) | ep ∈ EP}. Since no episode has started in the beginning except the episodes started by the initial event e_0, the new initial state is q′_Init = q_Init ∪ {(c_ep = −1) | ep ∈ EP \ starting(e_0)} ∪ {(c_ep = 0) | ep ∈ starting(e_0)}. As all the episodes should eventually be achieved, the new goal is Goal′ = Goal ∪ {(c = −2) | c ∈ C}.

To describe the clock variables resetting at events, we add a set of jumps J_EV, and J′ = J ∪ J_EV. For each event e ∈ EV, there is a jump j_e ∈ J_EV with the following conditions: {(c_ep = −1) | ep ∈ starting(e)}, which ensures that event e has not happened before, and {(lb(ep) ≤ c_ep ≤ ub(ep)) | ep ∈ ending(e)}, which requires that e end only when every ended episode has lasted for a duration within its temporal bounds. The effects {(c_ep = 0) | ep ∈ starting(e)} and {(c_ep = −2) | ep ∈ ending(e)} capture the clock variable resets for the started and ended episodes, respectively.

To force the condition to be imposed and the clock variable to tick while an episode is happening, we have a flow f_ep for each clock variable c_ep ∈ C. This flow has differential equation ċ_ep = 1 and condition {c_ep ≥ 0} ∪ cond_ep. We also have f̄_ep with differential equation ċ_ep = 0 and condition (c_ep ≤ −1) to represent that episode ep is not happening. Thus, the new flows are F′ = F ∪ F_EP with F_EP = {f_ep | ep ∈ EP} ∪ {f̄_ep | ep ∈ EP}.
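The clock bookkeeping of this compilation can be sketched as follows, assuming the convention above (c_ep = −1 before an episode starts, c_ep ≥ 0 while it is happening, c_ep = −2 once it is achieved); the episode and event names below are hypothetical placeholders.

```python
# Sketch of the clock-variable bookkeeping when compiling a QSP:
# c_ep = -1 (not started), c_ep >= 0 (happening), c_ep = -2 (achieved).

def compile_clocks(episodes, initial_event):
    """episodes: dict ep_name -> (start_event, end_event, lb, ub).
    Returns the added initial clock values and goal clock values."""
    init = {}
    for ep, (start, _end, _lb, _ub) in episodes.items():
        # Episodes triggered by the initial event start ticking at once.
        init["c_" + ep] = 0 if start == initial_event else -1
    # Every episode must be achieved by the end of the mission.
    goal = {"c_" + ep: -2 for ep in episodes}
    return init, goal

episodes = {
    "ep1": ("e0", "e1", 20, 30),  # stay at the initial state right away
    "ep2": ("e2", "e3", 20, 30),  # charge sometime during the mission
}
init, goal = compile_clocks(episodes, "e0")
print(init)  # {'c_ep1': 0, 'c_ep2': -1}
print(goal)  # {'c_ep1': -2, 'c_ep2': -2}
```

The jump conditions and effects at each event (resetting started clocks to 0 and ended clocks to −2) would be generated analogously from starting(e) and ending(e).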
To demonstrate the capabilities of our method, we ran our MILP encoding with Gurobi 9.0.1, which is highly optimized and leverages multiple processor cores, and benchmarked against Scotty [14] on the Mars transportation domains with different initial setups, the air refueling domains with different numbers of UAVs taking photos of different numbers of regions, and the truck-and-drone delivery domains with different numbers of trucks, drones, and packages. All experiments were run on a 3.40GHz 8-core Intel Core i7-6700 CPU with 36GB RAM under a runtime limit of 600s. At the end of this section, we also discuss the sizes of these MILP encodings.
The Mars transportation domains involve reasoning over obstacle avoidance and battery consumption under different terrains, such that the astronaut reaches the destination with the help of the rover in the shortest time. A map consists of a set of regions, and each region is a polygon associated with a terrain type. A region can be a forbidden area, mountain, ground, or basin, following the terrains in Figure 1. Driving a rover in different terrains is subject to different velocity limits and energy consumption rates: driving in the mountains is limited to 10 km/h, and the
(a) The rover directly picks up and delivers the astronaut to the destination.
(b) The rover does not have enough battery for the trip or for going to the charge station, and the astronaut has to walk.
(c) The rover picks up and delivers the astronaut but has to recharge during the trip.
(d) The rover picks up and delivers the astronaut after recharging.
Figure 6:
Mars transportation examples with different initial battery levels and charge station locations: the charge station is marked as ⊲; the forbidden areas, mountain, ground, and basin are in gray, red, green, and blue, respectively; the route of the rover is in red and starts from the bottom left; the route of the astronaut walking is in blue, and the goal destination is at the bottom right; the route of the astronaut riding the rover to the destination is in green.

battery consumption rate is 3 units per hour; the velocity limit and the consumption rate in the basin are 30 km/h and 2 units/h, and those for the ground are 50 km/h and 2 units/h. The walking speed in these three terrains is 2 km/h. On the map, while we fix the initial locations and destinations of the astronaut and the rover, we vary the locations of the charge stations and the initial battery levels across the four examples in Figure 6.

In Figure 6, the forbidden areas, mountain, ground, and basin are in gray, red, green, and blue, respectively. As we can see, while the rover starts from its initial location at the bottom left and traverses mountains and basins to arrive at the destination, the astronaut walks towards the rover and joins the ride. It is interesting to note that the rover chooses the upper route, since traversing the lower mountain area costs more time and energy. While the battery is sufficient for the rover to complete the route in (a), it is insufficient in (b) and (c). In Figure 6(c), the rover carries the astronaut to the charge station and then continues the mission after getting enough battery. In Figure 6(b), the rover battery is too low, insufficient even for the trip to the charge station. Thus, the astronaut gets off and walks to the destination from the closest location after the battery is drained. In Figure 6(d), the rover also gets recharged, but this happens before picking up the astronaut due to the different charge station location. Table 3: Experimental results of twelve domains.
The three numbers after each delivery domain name are the numbers of trucks, drones, and packages, respectively; t: the total runtime in seconds; g: the makespan of the returned solution; t_0: the runtime to find the first solution; g_0: the makespan of the first solution; t*: the runtime to first find the solution that is finally returned; n: the number of actions; 𝒱_C, 𝒱_I, 𝒱′_C, 𝒱′_I: the numbers of continuous and integer variables before and after presolving.

In the Mars transportation examples, while Scotty finds a solution with makespan g = 35 within one second (t < 1), our method can also find such a first solution (g_0 = 35) within one second (t_0 < 1). In this solution, the astronaut directly moves to the destination without the help of the rover, which uses only one action but takes a very long time. While Scotty stops after finding this solution, our method keeps searching for better solutions and finds the optimal solution roughly within one second (t* < 1). These solutions are then proved to be optimal and returned once Gurobi exhausts the solution space. Thus, our method is able to quickly find both a consistent solution and an optimal solution for the Mars transportation domains.
In this domain, autonomous Unmanned Aerial Vehicles (UAVs) need to take pictures of several regions before landing at the destination. Since a UAV has limited fuel, it needs to refuel in-air from a tanker plane. This problem is difficult since it requires reasoning about the optimal order of visiting all the regions as well as coordinating the UAVs and the tanker planes for the necessary refueling. When multiple UAVs are in a mission, we should also effectively dispatch the photo-taking tasks such that the makespan is minimized. The maximum velocity of the tanker plane is 20 m/s. While flying, UAVs can fly with velocities up to 30 m/s, and the fuel decreases at 2 units/s. Refueling requires the distance between a UAV and the tanker plane to be less than 10 m. When a UAV is refueling, its maximum allowable velocity is 5 m/s, and the fuel increases at 10 units/s. While the tank capacity of a UAV is 100 units, we assume the tanker plane has enough fuel throughout a mission.

We experimented with this domain on four examples with different numbers of regions and UAVs, as shown in Figure 7. The UAVs and the plane start from the same spot and should arrive at the same destination. While there is only one UAV in examples (a), (b), and (c), we add another UAV in example (d). All the examples have only one tanker plane. While our method succeeds in finding feasible solutions within two seconds for all the examples, Scotty spends much

(a) The UAV takes photos of three regions and does not need refueling.
(b) The UAV takes photos of four regions and refuels once along the route.
(c) The UAV takes photos of ten regions and refuels twice along the route.
(d) Two UAVs take photos of eight regions along two different routes. While one UAV does not need refueling, the other one refuels once.
Figure 7:
Air refueling examples with different numbers of UAVs and regions to take photos: the regions for taking photos are gray polygons; all the examples consider one UAV (blue) and one tanker plane (red) except example (d), which has an additional UAV (green); all the UAVs (blue) are fueled up (i.e., 100 units) in the beginning except the second UAV (green) in domain (d), whose fuel is 10 units. When a plane is refueling, the routes are marked in yellow.
Figure 8:
Examples of two trucks and two drones delivering two packages: their initial positions and the package destinations are marked; the truck routes are in red or green and start from the bottom left and the top right, respectively; the routes are in green if the trucks are carrying drones; the routes of drones flying are in blue.

more time on (a) and (b) and fails to solve the other two examples within 10 minutes, which require more complex coordination for visiting a larger number of regions. It is interesting to note that our first solutions are already better than the Scotty solutions, and the makespans of our final solutions are mostly half of those of Scotty. This is because the delete-relaxation heuristics in Scotty misguide its greedy search when energy resources (i.e., fuel) are involved, which prevents Scotty from being effective or efficient in this domain.
In this domain, we consider a fleet of delivery trucks, each equipped with a couple of drones, where both the trucks and the drones make deliveries to customers. While trucks can travel between depots through highways or roads, drones can fly freely in obstacle-free regions or land on trucks to take a ride. When trucks are driving on a road, they should respect the minimum and maximum speed limits as well as the directions, which prevents trucks from violating traffic rules such as making U-turns on highways. Drones are more flexible, but they are slower, and their travel distance is limited by their battery capacity. In this domain, we look for a plan that delivers all the packages in the shortest time. Figure 8 shows an example of the truck-and-drone delivery domain between two depots, in which two trucks loaded with packages and drones are driving towards each other on a two-way street. Unfortunately, the package destinations are not on the road ahead, and the trucks cannot turn around. A reasonable plan is to swap the packages to the other truck by using the drones to cross the street, after which the truck and drone on the other side continue the delivery.

We test on a map with five depots and ten highways between these depots. Each road is straight and around 10 km long with a speed limit of 30-60 km/h. The drones can fly with a maximum speed of 5 km/h. We assume no obstacles for drones in these examples. The experimental results of four truck-and-drone delivery examples are shown in Table 3. While the packages can be delivered at any time in the first three examples, Delivery (d) requires the packages to be delivered within certain time windows specified as a QSP. As we noticed, Scotty does not make progress in carrying drones to the depot near the drop-off locations and thus fails to solve any of these problems. It can be seen from the t_0 column that our method finds a first solution very quickly, within several seconds.
The optimal solutions can also be found in a very short time t* for both (a) and (b). In examples (c) and (d), in which we have two trucks, four drones, and four packages, even though our method fails to prove the optimality of the incumbent within 10 minutes, the returned solutions largely reduce the makespans of the first returned solutions.

Now, we study the MILP models of the three benchmarked domains. Table 3 shows the numbers of integer variables 𝒱_I, continuous variables 𝒱_C, and constraints C in our original encoding (Section 5), as well as those (i.e., 𝒱′_C, 𝒱′_I, C′) in the model presolved by Gurobi. Gurobi presolves a MILP model by compiling it into a smaller model with feasible and optimal solutions equivalent to the original. As we can see in Table 3, the presolved models reduce about 20% of the continuous variables, 20∼40% of the integer variables, and 30∼50% of the constraints in most examples. We also observed that presolving takes less than 0.1s in our experiments.

As the Mars domain does not have many discrete state variables or actions, the numbers of its discrete variables and continuous variables are roughly the same. When it comes to the air refueling domain with more than 8 regions to visit, or the truck-and-drone delivery domain with a large number of discrete variables indicating that trucks are on a certain highway, we observe that the number of discrete variables is several times that of the continuous variables. While feasible solutions are found within seconds even for Delivery (d), which has 1511 variables and 34130 constraints, the largest problem for which we can prove optimality within the runtime limit is Delivery (b), which has 537 variables and 7610 constraints.

In this paper, we presented a mixed discrete-continuous planning approach that fixes the number of actions of the automaton runs and encodes the corresponding finite-step hybrid planning problem as a MILP. Our complexity analysis shows that the numbers of MILP variables and constraints at each step increase linearly with the product of the number of linear constraints involved in each condition and the number of operators and variables. By leveraging the state-of-the-art MILP optimizer Gurobi, our method is able to efficiently find provably optimal or high-quality solutions for challenging mixed discrete-continuous planning problems. This is supported by our experimental results against Scotty on the Mars transportation domains, the air refueling domains, and the truck-and-drone delivery domains.

We also showed how to deal with temporally concurrent goals modeled as QSPs with our MILP approach. For future work, we plan to extend our method to support the full features of STL, which can model more expressive desired system behaviors. We would also like to explore and solve more real-world applications with this extension.
Acknowledgements.
This project was funded by the Defense Advanced Research Projects Agency under Grant Contract No. N16A-T002-0149.
REFERENCES [1] Arthur Bit-Monnot, Luca Pulina, and Armando Tacchella. 2019. Cyber-Physical Planning: Deliberation for Hybrid Systems with a Continuous Numeric State. In Proceedings of the International Conference on Automated Planning and Scheduling, Vol. 29. 49–57. [2] Sergiy Bogomolov, Daniele Magazzeni, Stefano Minopoli, and Martin Wehrle. 2015. PDDL+ planning with hybrid automata: Foundations of translating must behavior. In
Twenty-Fifth International Conference on Automated Planning and Scheduling. [3] Sergiy Bogomolov, Daniele Magazzeni, Andreas Podelski, and Martin Wehrle. 2014. Planning as model checking in hybrid domains. In Twenty-Eighth AAAI Conference on Artificial Intelligence. [4] Daniel Bryce. 2016. A happening-based encoding for nonlinear PDDL+ planning. In
Workshops at the Thirtieth AAAI Conference on Artificial Intelligence .[5] Daniel Bryce, Sicun Gao, David J Musliner, and Robert P Goldman. 2015. SMT-Based Nonlinear PDDL+ Planning.. In
AAAI . 3247–3253.[6] Michael Cashmore, Maria Fox, Derek Long, and Daniele Magazzeni. 2016. Acompilation of the full PDDL+ language into SMT. (2016).[7] Alessandro Cimatti, Edmund Clarke, Fausto Giunchiglia, and Marco Roveri. 2000.NuSMV: a new symbolic model checker.
International Journal on Software Tools for Technology Transfer 2, 4 (2000), 410–425. [8] Amanda Coles, Maria Fox, and Derek Long. 2013. A hybrid LP-RPG heuristic for modelling numeric resource flows in planning. Journal of Artificial Intelligence Research. [9] Amanda Jane Coles, Andrew Ian Coles, Maria Fox, and Derek Long. 2012. COLIN: Planning with continuous linear numeric change. Journal of Artificial Intelligence Research 44 (2012), 1–96. [10] Giuseppe Della Penna, Daniele Magazzeni, and Fabio Mercorio. 2012. A universal planning system for hybrid domains.
Applied intelligence
36, 4 (2012), 932–959.[11] Chuchu Fan, Umang Mathur, Sayan Mitra, and Mahesh Viswanathan. 2018. Con-troller synthesis made real: reach-avoid specifications and linear dynamics. In
International Conference on Computer Aided Verification . Springer, 347–366.[12] Chuchu Fan, Umang Mathur, Sayan Mitra, and Mahesh Viswanathan. 2018.Controller Synthesis Made Real: Reachavoid Specifications and Linear Dynam-ics. In
Computer Aided Verification . Springer International Publishing, 347–366.https://doi.org/10.1007/978-3-319-96145-3_19[13] Enrique Fernandez-Gonzalez, Erez Karpas, and Brian Williams. 2017. Mixeddiscrete-continuous planning with convex optimization. In
Thirty-First AAAI Conference on Artificial Intelligence. [14] Enrique Fernández-González, Brian Williams, and Erez Karpas. 2018. ScottyActivity: Mixed discrete-continuous planning with convex optimization. Journal of Artificial Intelligence Research
62 (2018), 579–664.[15] Ioannis Filippidis, Sumanth Dathathri, Scott C. Livingston, Necmiye Ozay, andRichard M. Murray. 2016. Control design for hybrid systems with TuLiP: TheTemporal Logic Planning toolbox. In
IEEE Conference on Control Applications .1030–1041.[16] Ioannis Filippidis, Sumanth Dathathri, Scott C Livingston, Necmiye Ozay, andRichard M Murray. 2016. Control design for hybrid systems with TuLiP: Thetemporal logic planning toolbox. In . IEEE, 1030–1041.[17] Maria Fox and Derek Long. 2003. PDDL2. 1: An extension to PDDL for expressingtemporal planning domains.
Journal of artificial intelligence research
20 (2003),61–124.[18] Maria Fox and Derek Long. 2006. Modelling mixed discrete-continuous domainsfor planning.
Journal of Artificial Intelligence Research
27 (2006), 235–297.[19] Goran Frehse. 2008. PHAVer: algorithmic verification of hybrid systems pastHyTech.
International Journal on Software Tools for Technology Transfer
10, 3(2008), 263–279.[20] Goran Frehse, Colas Le Guernic, Alexandre Donzé, Scott Cotton, Rajarshi Ray,Olivier Lebeltel, Rodolfo Ripado, Antoine Girard, Thao Dang, and Oded Maler.2011. SpaceEx: Scalable verification of hybrid systems. In
International Conferenceon Computer Aided Verification . Springer, 379–395.[21] Caelan Reed Garrett, Tomás Lozano-Pérez, and Leslie Pack Kaelbling. 2015. FFRob:An efficient heuristic for task and motion planning. In
Algorithmic Foundationsof Robotics XI . Springer, 179–195.[22] Caelan Reed Garrett, Tomás Lozano-Pérez, and Leslie Pack Kaelbling. 2017.Sample-Based Methods for Factored Task and Motion Planning.. In
Robotics:Science and Systems .[23] Caelan Reed Garrett, Tomas Lozano-Perez, and Leslie Pack Kaelbling. 2018. FFRob:Leveraging symbolic planning for efficient task and motion planning.
The Inter-national Journal of Robotics Research
37, 1 (2018), 104–136.[24] Antoine Girard. 2012. Controller synthesis for safety and reachability via approx-imate bisimulation.
Automatica
48, 5 (2012), 947–953.[25] Incorporate Gurobi Optimization. 2020. Gurobi optimizer reference manual. (2020).[26] Malte Helmert. 2002. Decidability and Undecidability Results for Planning withNumerical State Variables.. In
AIPS. 44–53. [27] Thomas A Henzinger, Peter W Kopke, Anuj Puri, and Pravin Varaiya. 1998. What's decidable about hybrid automata?
Journal of computer and system sciences
57, 1 (1998), 94–124. [28] Sylvia L Herbert, Mo Chen, SooJean Han, Somil Bansal, Jaime F Fisac, and Claire J Tomlin. 2017. FaSTrack: A modular framework for fast and guaranteed safe motion planning. In . IEEE, 1517–1522. [29] Jörg Hoffmann. 2003. The Metric-FF Planning System: Translating "Ignoring Delete Lists" to Numeric State Variables. Journal of Artificial Intelligence Research 20 (2003), 291–341. [30] Jörg Hoffmann and Bernhard Nebel. 2001. The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research 14 (2001), 253–302. [31] ICAPS. 386–389. [32] Lucas Janson, Edward Schmerling, Ashley Clark, and Marco Pavone. 2015. Fast marching tree: A fast marching sampling-based method for optimal motion planning in many dimensions.
International Journal of Robotics Research
34, 7(2015), 883–921.[33] Leslie Pack Kaelbling and Tomás Lozano-Pérez. 2011. Hierarchical task andmotion planning in the now. In . IEEE, 1470–1477.[34] Lydia E Kavraki, Petr Svestka, J-C Latombe, and Mark H Overmars. 1996. Proba-bilistic roadmaps for path planning in high-dimensional configuration spaces.
IEEE Transactions on Robotics and Automation
12, 4 (1996), 566–580.[35] Marius Kloetzer and Calin Belta. 2008. A Fully Automated Framework for Controlof Linear Systems from Temporal Logic Specifications.
IEEE Trans. Automat.Control
53, 1 (2008), 287–297.[36] Hadas Kress-Gazit, Gerogios E. Fainekos, and George J. Pappas. 2009. TemporalLogic based Reactive Mission and Motion Planning.
IEEE Transactions on Robotics
25, 6 (2009), 1370–1381.[37] James J Kuffner and Steven M LaValle. 2000. RRT-Connect: An efficient approachto single-query path planning. In
IEEE International Conference on Robotics andAutomation , Vol. 2. IEEE, 995–1001.[38] Fabien Lagriffoul, Neil T Dantam, Caelan Garrett, Aliakbar Akbari, SiddharthSrivastava, and Lydia E Kavraki. 2018. Platform-independent benchmarks fortask and motion planning.
IEEE Robotics and Automation Letters
3, 4 (2018),3765–3772.[39] Morteza Lahijanian, Lydia E Kavraki, and Moshe Y Vardi. 2014. A sampling-basedstrategy planner for nondeterministic hybrid systems. In . IEEE, 3005–3012.[40] Luca Laurenti, Morteza Lahijanian, Alessandro Abate, Luca Cardelli, and MartaKwiatkowska. 2020. Formal and efficient synthesis for continuous-time linearstochastic hybrid processes.
IEEE Trans. Automat. Control (2020).[41] Hui X Li and Brian C Williams. 2008. Generative Planning for Hybrid SystemsBased on Flow Tubes.. In
ICAPS . 206–213.[42] Nancy Lynch, Roberto Segala, Frits Vaandrager, and Henri B Weinberg. 1995.Hybrid i/o automata. In
International Hybrid Systems Workshop . Springer, 496–510.[43] Oded Maler and Dejan Nickovic. 2004. Monitoring temporal properties of con-tinuous signals. In
Formal Techniques, Modelling and Analysis of Timed andFault-Tolerant Systems . Springer, 152–166.[44] Kaushik Mallik, Anne-Kathrin Schmuck, Sadegh Soudjani, and Rupak Majumdar.2018. Compositional synthesis of finite-state abstractions.
IEEE Trans. Automat.Control
64, 6 (2018), 2629–2636.[45] Sebti Mouelhi, Antoine Girard, and Gregor Gössler. 2013. CoSyMA: A Tool forController Synthesis Using Multi-scale Abstractions. In
International Conferenceon Hybrid Systems: Computation and Control . ACM, 83–88.[46] Erion Plaku, Lydia E Kavraki, and Moshe Y Vardi. 2013. Falsification of LTLsafety properties in hybrid systems.
International Journal on Software Tools forTechnology Transfer
15, 4 (2013), 305–320.[47] Vasumathi Raman, Alexandre Donzé, Dorsa Sadigh, Richard M Murray, andSanjit A Seshia. 2015. Reactive synthesis from signal temporal logic specifications.In
Proceedings of the 18th international conference on hybrid systems: Computationand control . 239–248.[48] Pritam Roy, Paulo Tabuada, and Rupak Majumdar. 2011. Pessoa 2.0: A ControllerSynthesis Tool for Cyber-physical Systems. In
International Conference on HybridSystems: Computation and Control . ACM, 315–316.[49] Matthias Rungger and Majid Zamani. 2016. SCOTS: A tool for the synthesis ofsymbolic controllers. In
Proceedings of the 19th international conference on hybridsystems: Computation and control . 99–104.[50] Paulo Tabuada. 2009.
Verification and Control of Hybrid Systems - A SymbolicApproach . Springer.[51] Paulo Tabuada and George J. Pappas. 2006. Linear Time Logic Control of Discrete-Time Linear Systems.
IEEE Trans. Automat. Control
51, 12 (2006), 1862–1877.[52] Sean Vaskov, Shreyas Kousik, Hannah Larson, Fan Bu, James Ward, Stewart Wor-rall, Matthew Johnson-Roberson, and Ram Vasudevan. 2019. Towards provablynot-at-fault control of autonomous robots in arbitrary dynamic environments. arXiv preprint arXiv:1902.02851 (2019).[53] René Vidal, Shawn Schaffert, Omid Shakernia, John Lygeros, and Shankar Sastry.2001. Decidable and semi-decidable controller synthesis for classes of discreteime hybrid systems. In
Proceedings of the 40th IEEE Conference on Decision andControl (Cat. No. 01CH37228) , Vol. 2. IEEE, 1243–1248.[54] Kai Weng Wong, Cameron Finucane, and Hadas Kress-Gazit. 2013. Provably-correct robot control with LTLMoP, OMPL and ROS. In
IEEE/RSJ InternationalConference on Intelligent Robots and Systems . 2073.[55] Tichakorn Wongpiromsarn, Ufuk Topcu, and Richard M. Murray. 2012. RecedingHorizon Temporal Logic Planning.
IEEE Trans. Automat. Control
57, 11 (2012),2817–2830. [56] Tichakorn Wongpiromsarn, Ufuk Topcu, Necmiye Ozay, Huan Xu, and Richard M.Murray. 2011. TuLiP: A Software Toolbox for Receding Horizon Temporal LogicPlanning. In
International Conference on Hybrid Systems: Computation and Control .ACM, 313–314.[57] Tichakorn Wongpiromsarn, Ufuk Topcu, Necmiye Ozay, Huan Xu, and Richard MMurray. 2011. TuLiP: a software toolbox for receding horizon temporal logicplanning. In