Symbolic Self-triggered Control of Continuous-time Non-deterministic Systems without Stability Assumptions for 2-LTL Specifications
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Sasinee Pruekprasert, Clovis Eberhart, and Jérémy Dubut
Abstract — We propose a symbolic self-triggered controller synthesis procedure for non-deterministic continuous-time nonlinear systems without stability assumptions. The goal is to compute a controller that satisfies two objectives. The first objective is represented as a specification in a fragment of LTL, which we call 2-LTL. The second one is an energy objective, in the sense that control inputs are issued only when necessary, which saves energy. To this end, we first quantise the state and input spaces, and then translate the controller synthesis problem to the computation of a winning strategy in a mean-payoff parity game. We illustrate the feasibility of our method on the example of a navigating nonholonomic robot.
I. INTRODUCTION
Not only has self-triggered control been a hot academic research topic in recent years, but it also provides a variety of practical implementations [1]. By performing sensing and actuation only when needed, self-triggered control is well known as an energy-aware control paradigm that saves communication resources in Networked Control Systems [2]. The lifespan of battery-powered devices can be prolonged by reducing their energy consumption [1], and the communication load of nonholonomic robots can be significantly reduced in comparison to periodic controllers [3]. However, previous research on self-triggered control of continuous-time systems only studies simple specifications such as stability [4] and reach-avoid or safety problems [5], [6]. The main reason for this limitation is that those approaches are based on reachability analysis.

The main novelty of our work is to use techniques from game theory to go beyond reach-avoid and safety specifications for self-triggered control. Game theory, and in particular parity games [7], is a well-known technique to deal with expressive logics like the µ-calculus [7] and CTL* [8], as the parity winning conditions express complex scenarios and strategies while keeping computability. In particular, parity games can be used for control synthesis of reactive systems under Linear Temporal Logic (LTL) specifications [9]. On the other hand, quantitative games such as mean-payoff games [10] have been adapted to quantitative control specifications [11], [12]. One such specification is the mean-payoff threshold problem for the average control-signal length of self-triggered controllers. This threshold provides guarantees for the energy-saving and communication-reduction performance of the controller: the greater the average length of a signal is, the less often the controller needs to perform sensing and actuation, and the fewer commands are sent across the network. In this paper, we deal with this threshold problem together with a logical specification using mean-payoff parity games, which combine mean-payoff games and parity games. The logical specifications are dealt with on the parity side, while the average signal length threshold can be seen as a threshold problem in a mean-payoff game.

Our procedure is based on the symbolic control approach, which synthesises correct-by-design controllers of continuous-state systems. In this approach, we first construct a symbolic model, which is a discrete abstraction of the continuous-state system, based on approximate simulation or bisimulation. Then, we synthesise a symbolic controller and leverage its control strategy to control the continuous-state system. This technique allows us to synthesise provably-correct controllers for complex specifications such as LTL specifications, which can hardly be enforced with conventional control methods. However, previous symbolic control algorithms for continuous-time nonlinear systems under LTL specifications need stability assumptions (e.g., [13], [14]), which do not hold in many systems. Symbolic control without stability assumptions has been achieved only for simpler classes of specifications such as reach-avoid [5], [15], [16]. The work closest to ours is [5], in which the authors propose an algorithm to synthesise symbolic self-triggered controllers for discrete-time deterministic systems under reach-avoid specifications.

This work proposes a symbolic self-triggered control procedure for continuous-time non-deterministic nonlinear systems without stability assumptions for specifications represented by a fragment of LTL, which we call 2-LTL.

The authors are supported by ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST. J. Dubut is also supported by Grant-in-aid No. 19K20215, JSPS. The authors are with National Institute of Informatics, Hitotsubashi 2-1-2, Tokyo 101-8430, Japan, {sasinee, eberhart, dubut}@nii.ac.jp. C. Eberhart and J. Dubut are also affiliated with the Japanese-French Laboratory for Informatics.
To the best of our knowledge, our work is the first to study symbolic control of continuous-time nonlinear systems without stability assumptions for a class of LTL specifications that is strictly more expressive than reach-avoid. Our procedure operates in several steps: (1) constructing a finite symbolic model of the continuous system, (2) translating the self-triggered controller synthesis problem on the symbolic model into a mean-payoff parity game problem, (3) constructing a winning strategy for the mean-payoff parity game, (4) translating the strategy back into a controller for the symbolic model, and (5) translating the controller for the symbolic model back into one for the continuous system.

Notation:
We denote vectors in ℝ^m by x = [x_1 ⋯ x_m]^⊤. For such a vector x, we write ‖x‖ for its infinity norm max{|x_i| | i ∈ {1, …, m}}. Given x ∈ ℝ^m and r ∈ ℝ_{>0}, we write B_r(x) for the ball {y ∈ ℝ^m | ‖x − y‖ ≤ r} of centre x and radius r. Finally, given a set X, we denote its powerset {Y | Y ⊆ X} by ℘(X).

II. CONTROL FRAMEWORK
A. System
We formalise a non-deterministic continuous-time nonlinear system as a 6-tuple Σ = (X, X_in, U, 𝒰, ξ→, ξ←), where X ⊆ ℝ^n is a bounded convex state space, X_in ⊆ X is a space of initial states, U ⊆ ℝ^m is a bounded convex space of control inputs, 𝒰 is a set of control signals of the form [0, T] → U that assign a control input to each time in the interval [0, T] with T ∈ ℝ_{>0}, and ξ→, ξ← : ℝ^n × 𝒰 × ℝ_{≥0} → ℘(ℝ^n) are functions such that ξ→_{x,u}(0) = ξ←_{x,u}(0) = {x}. Intuitively, given a state x ∈ ℝ^n, a signal u ∈ 𝒰, and a time t ≥ 0, ξ→_{x,u}(t) is the set of states reachable from x (resp. ξ←_{x,u}(t) is the set of states from which the system can reach x) under the control signal u at time t.

The system is defined on the whole Euclidean space ℝ^n, but we are only interested in its behaviour on a bounded subspace X because the quantities involved in physical systems are bounded, as observed in [16]. For technical reasons, we also assume that the distance from X_in to the boundary of X is positive.

The system Σ is said to be forward and backward complete [17] if ξ→_{x,u}(t) ≠ ∅ and ξ←_{x,u}(t) ≠ ∅ for all (x, u) ∈ ℝ^n × 𝒰 and all t ∈ [0, len(u)], where len(u) = T is the length of the signal u : [0, T] → U. In other words, under any control signal, there exist a state reachable from x and a state that reaches x at any time within the signal length.

Definition 1.
A system Σ is incrementally forward and backward complete if it is forward and backward complete, and, for each u ∈ 𝒰, there exist functions β→_u, β←_u : ℝ_{≥0} × ℝ_{≥0} → ℝ_{≥0} such that 1) for any t ∈ ℝ_{≥0}, β→_u(·, t) and β←_u(·, t) are strictly increasing and their limits at +∞ are +∞, and 2) for any x_1, x_2 ∈ ℝ^n, u ∈ 𝒰, and t ≤ len(u),

∀(x′_1, x′_2) ∈ ξ→_{x_1,u}(t) × ξ→_{x_2,u}(t), ‖x′_1 − x′_2‖ ≤ β→_u(‖x_1 − x_2‖, t),
∀(x′_1, x′_2) ∈ ξ←_{x_1,u}(t) × ξ←_{x_2,u}(t), ‖x′_1 − x′_2‖ ≤ β←_u(‖x_1 − x_2‖, t).

Notice that incremental forward and backward completeness does not depend on the state space X, but on ℝ^n. There may exist (x, u) ∈ X × 𝒰 such that ξ→_{x,u}(t) ∩ X = ∅, i.e., the system runs out of the desired state space.

Assumption 1.
The system Σ is incrementally forward and backward complete.

Assumption 1 is similar to the one used in [16], but adapted to non-deterministic systems and taking backward dynamics into account. The intuition behind Assumption 1 is that the distance between the states reached from two starting points can be bounded by an expression that depends only on the distance between those starting points, the control signal, and the run time. In addition, we require the following assumption.
Assumption 2.
For any control signal u ∈ 𝒰, we have functions α→_u, α←_u : ℝ_{≥0} × [0, len(u)] → ℝ_{≥0} such that 1) for any t ∈ ℝ_{≥0}, α→_u(·, t) and α←_u(·, t) are increasing, and 2) for any x_1, x_2 ∈ X and any t ∈ [0, len(u)], we have

for any y ∈ ξ→_{x_2,u}(t), ‖x_1 − y‖ ≤ α→_u(‖x_1 − x_2‖, t),
for any y ∈ ξ←_{x_2,u}(t), ‖x_1 − y‖ ≤ α←_u(‖x_1 − x_2‖, t).

Assumption 2 basically states that the set of reachable states cannot be arbitrarily far from the starting state (for a given input signal and run time). The functions β→_u, β←_u, α→_u, and α←_u can typically be computed using Lyapunov functions (see [16], [17], [18] for details).

B. State-transition Model
Let us first introduce general definitions for state-transition models and their controllers. In this paper, a state-transition model is given by a quadruple M = (Y, Y_in, V, →), where Y is either a continuous or a discrete state space, Y_in ⊆ Y is a set of initial states, V is a set of control signals, and → ⊆ Y × V × Y is the transition relation. For a given state y ∈ Y, a sequence y_0 u_0 y_1 u_1 … ∈ Y(VY)^ω (resp. y_0 u_0 y_1 … u_{l−1} y_l ∈ Y(VY)^*) is a run (resp. a finite run) generated by M starting from the state y if y_0 = y and (y_i, u_i, y_{i+1}) ∈ → for any i ∈ ℤ_{≥0} (resp. i ∈ {0, …, l−1}). Let Run(M, y) (resp. FRun(M, y)) denote the set of all runs (resp. finite runs) generated by M from y. Let Run(M) = ∪_{y ∈ Y_in} Run(M, y) and FRun(M) = ∪_{y ∈ Y_in} FRun(M, y). We define the state-transition model of Σ as follows.

Definition 2.
The state-transition model M(Σ) of a system Σ = (X, X_in, U, 𝒰, ξ→, ξ←) is M(Σ) = (X, X_in, 𝒰, ∆), where the transition relation ∆ ⊆ X × 𝒰 × X is given by (x, u, x′) ∈ ∆ iff x′ ∈ ξ→_{x,u}(len(u)), x ∈ ξ←_{x′,u}(len(u)), and, for all t ≤ len(u), ξ→_{x,u}(t) ⊆ X.
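For concreteness, a finite state-transition model and the finite-run condition of Section II-B can be sketched as follows; the data layout and the helper name are ours, not the paper's.

```python
# A finite state-transition model M = (Y, Y_in, V, ->) and the finite-run
# condition of Section II-B, in an illustrative encoding (data layout ours).
def is_finite_run(trans, seq):
    """trans: set of transitions (y, u, y'); seq: alternating [y0, u0, y1, ..., yl]."""
    if len(seq) % 2 == 0:        # a finite run must end on a state
        return False
    states, signals = seq[0::2], seq[1::2]
    return all((states[i], signals[i], states[i + 1]) in trans
               for i in range(len(signals)))
```

A single state is a trivial finite run, and each consecutive triple (y_i, u_i, y_{i+1}) must be a transition of M.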
Runs of a model are discrete sequences of states, but the system runs in continuous time. In order to fill this gap, we introduce the notion of trajectory to match these discrete runs to continuous sequences of states.
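As a small computational companion (helper name ours): given the signal lengths len(u_0), len(u_1), …, the segment of a run that is active at a continuous time t, and the local time inside that segment, can be located as follows.

```python
# Helper (name ours) locating which control signal u_k of a run is active at
# a continuous time t, together with the local time inside that signal.
import bisect
import itertools

def segment_index(lengths, t):
    """lengths: [len(u_0), len(u_1), ...]; requires 0 <= t < sum(lengths)."""
    cum = list(itertools.accumulate(lengths))   # cumulative segment end times
    k = bisect.bisect_right(cum, t)             # first segment ending after t
    local = t - (cum[k - 1] if k > 0 else 0.0)
    return k, local
```

This is exactly the bookkeeping used implicitly in Definition 3, where t is split into the sum of past signal lengths plus a local offset.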
Definition 3. A trajectory of a system Σ starting from a state x ∈ X induced by a run x_0 u_0 x_1 u_1 x_2 … ∈ Run(M(Σ), x) is a function σ : ℝ_{≥0} → X such that, for all k ∈ ℤ_{≥0} and all t ∈ [Σ_{i<k} len(u_i), Σ_{i≤k} len(u_i)],

σ(t) ∈ ξ→_{x_k, u_k}(t − Σ_{i<k} len(u_i)) ∩ ξ←_{x_{k+1}, u′_k}(Σ_{i≤k} len(u_i) − t),

where u′_k(s) = u_k(s + t − Σ_{i<k} len(u_i)) for all s ∈ [0, Σ_{i≤k} len(u_i) − t].

Let Traj(Σ, x, r) be the set of trajectories of Σ that are induced by a run r ∈ Run(M(Σ), x). For any finite run r_f ∈ FRun(M(Σ), x), FTraj(Σ, x, r_f) is the set of finite trajectories defined in the same way. Let Traj(Σ, x) = ∪_{r ∈ Run(M(Σ), x)} Traj(Σ, x, r), and Traj(Σ) = ∪_{x ∈ X_in} Traj(Σ, x).

C. Controlled System
In this section, we define controllers and controlled systems, and explain the self-triggered control process. First, we define model controllers of M = (Y, Y_in, V, →).

Fig. 1. Overview of the self-triggered control process. Based on previously observed states and issued control signals, the controller issues a control signal to control the system.

Definition 4. A model controller of a state-transition model M is a function C : FRun(M) → V.

Let C/M denote the state-transition model M controlled under C. A run y_0 u_0 y_1 … ∈ Run(M, y) (resp. a finite run y_0 u_0 y_1 … u_{l−1} y_l ∈ FRun(M, y)) is generated by C/M if it satisfies the following conditions: 1) y_0 = y, and 2) for all i ∈ ℤ_{≥0} (resp. for all i ∈ {0, …, l−1}), u_i = C(y_0 u_0 … y_i). Then, let Run(C/M, y) (resp. FRun(C/M, y)) denote the set of all runs (resp. finite runs) generated by C/M from y ∈ Y. Let Run(C/M) = ∪_{y ∈ Y_in} Run(C/M, y) and FRun(C/M) = ∪_{y ∈ Y_in} FRun(C/M, y).

Definition 5. A controller of Σ = (X, X_in, U, 𝒰, ξ→, ξ←) is a function C : FRun(M(Σ)) → 𝒰.

Notice that a controller C of a system Σ is defined based on its state-transition model M(Σ). This is because the controller issues control signals based on runs, which only track the states at the end of each signal. Since Definition 5 is coherent with Definition 4, we can also regard C as a model controller of M(Σ). Hence, we also use Run(C/M(Σ)) (resp. FRun(C/M(Σ))) to denote the sets of runs (resp. finite runs) of C/M(Σ) from initial states. Furthermore, we use C/Σ to denote the system Σ controlled under the controller C, and define the trajectories of C/Σ in the same way as in Definition 3. Thereby, the definitions of trajectories Traj(C/Σ) and FTraj(C/Σ) carry over to controlled systems directly.

The overview of the control process is illustrated in Fig. 1.
First, the controller observes the initial state x_0 ∈ X_in and issues the control signal u_0 = C(x_0). Then, to preserve energy, the controller is inactive throughout the duration of the control signal u_0. Namely, the longer the signal length len(u_0) is, the more energy is preserved. Since the system is non-deterministic, there are several states that can possibly be reached under the signal u_0. After the signal ends (at time len(u_0)), the controller becomes active and resolves the non-determinism by detecting the actual current state x_1 and issuing a new control signal u_1 = C(x_0 u_0 x_1). The process is then repeated.

III. PROBLEM FORMULATION
Our goal is to synthesise a controller that satisfies two control objectives. The first objective is described as a 2-LTL formula. The second one is an energy-preservation objective: to ensure that the average length of the issued control signals is above a given threshold.

A. Specification
We model the first objective using a fragment of LTL, which we call 2-LTL. Let AP denote the set of atomic propositions, i.e., assertions that can be either true or false at each state x ∈ X. Let P : X → ℘(AP) assign the set of atomic propositions that hold at each state.

Definition 6.
Let 2-LTL be the logic whose formulas are the Φ's generated by the following grammar:

ϕ ::= ⊤ | p | ¬ϕ | ϕ ∨ ϕ
Φ ::= ◇ϕ | □ϕ | □◇ϕ | ◇□ϕ | Φ ∨ Φ | Φ ∧ Φ,

where p ∈ AP is an atomic proposition.
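One illustrative encoding of the state formulas of this grammar (tuple layout and names are ours) represents them as nested tuples; satisfaction x ⊨ ϕ, as defined next, is then a small recursion over the formula against a labelling P : X → ℘(AP).

```python
# Illustrative encoding (ours) of 2-LTL state formulas as nested tuples,
# with satisfaction x |= phi evaluated recursively against a labelling P.
def sat(P, x, phi):
    op = phi[0]
    if op == "true":                        # top: always satisfied
        return True
    if op == "ap":                          # atomic proposition p
        return phi[1] in P(x)
    if op == "not":
        return not sat(P, x, phi[1])
    if op == "or":
        return sat(P, x, phi[1]) or sat(P, x, phi[2])
    raise ValueError(f"unknown operator {op!r}")
```

Path formulas (◇, □ and their nestings) quantify over continuous time and are not evaluated pointwise like this; they are handled through the game translation of Section V.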
We call ϕ's and Φ's state formulas and path formulas, respectively. A logical specification is written as a path formula. Here, □ and ◇ have the usual LTL interpretation. A state x ∈ X satisfying (resp. not satisfying) a state formula ϕ is denoted by x ⊨ ϕ (resp. x ⊭ ϕ). We use the same notations σ ⊨ Φ and σ ⊭ Φ for a trajectory σ : ℝ_{≥0} → X and a path formula Φ. For every state x ∈ X, x ⊨ ϕ is defined as follows:

x ⊨ ⊤
x ⊨ p if p ∈ P(x)
x ⊨ ¬ϕ if x ⊭ ϕ
x ⊨ ϕ_1 ∨ ϕ_2 if x ⊨ ϕ_1 or x ⊨ ϕ_2,

and for all σ : ℝ_{≥0} → X, σ ⊨ Φ is defined as follows:

σ ⊨ ◇ϕ if ∃t ∈ ℝ_{≥0}, σ(t) ⊨ ϕ
σ ⊨ □ϕ if ∀t ∈ ℝ_{≥0}, σ(t) ⊨ ϕ
σ ⊨ □◇ϕ if ∀t ∈ ℝ_{≥0}, ∃t′ > t, σ(t′) ⊨ ϕ
σ ⊨ ◇□ϕ if ∃t ∈ ℝ_{≥0}, ∀t′ > t, σ(t′) ⊨ ϕ
σ ⊨ Φ_1 ∨ Φ_2 if σ ⊨ Φ_1 or σ ⊨ Φ_2
σ ⊨ Φ_1 ∧ Φ_2 if σ ⊨ Φ_1 and σ ⊨ Φ_2.

One objective of a controller C is to control the system in such a way that all trajectories in Traj(C/Σ) satisfy a given 2-LTL path formula Φ. Notice that the class of 2-LTL specifications is more general than the reach-avoid specifications studied in [5], [15], [6]. For example, we can express the specification to reach target_region while avoiding unsafe_region using the 2-LTL formula ◇target_region ∧ □¬unsafe_region.

B. Controller Synthesis Problem
Definition 7.
Given a system Σ = (X, X_in, U, 𝒰, ξ→, ξ←), a set AP of atomic propositions, a function P : X → ℘(AP), a 2-LTL formula Φ, and a threshold ν ∈ ℝ_{>0}, the controller synthesis problem is to synthesise a controller C : FRun(M(Σ)) → 𝒰 such that
• all finite runs in FRun(C/M(Σ)) can be extended to an infinite run in Run(M(Σ)),
• σ ⊨ Φ for any σ ∈ Traj(C/Σ), and
• lim_{h→∞} (1/h) Σ_{i=0}^{h−1} len(u_i) > ν for any x_0 u_0 x_1 … ∈ Run(C/M(Σ)),
or determine that such a controller C does not exist.

The first condition simply ensures that the controlled system does not reach a deadlock, while the other two conditions are the actual control objectives.

IV. PROBLEM REDUCTION TO SYMBOLIC CONTROL
In this section, we state our symbolic controller synthesis problem, which considers a discrete system obtained by quantising states and inputs, and by restricting control signals to piecewise-constant ones. We show that a symbolic controller for this problem also satisfies the conditions in Definition 7.
A. Piecewise-constant Control Signal with Discrete Input
For a given bounded convex control input space U ⊆ ℝ^m and a discretisation parameter µ ∈ ℝ_{>0}, let

U_µ = {[u_1 ⋯ u_m]^⊤ ∈ U | u_i = µ l_i, l_i ∈ ℤ, i ≤ m}   (1)

be the input set quantised by an m-dimensional hypercube of edge length 2µ. As U is bounded, U_µ is finite. Given τ ∈ ℝ_{>0} and ℓ = [ℓ_min, ℓ_max], let us consider the set

U_{τ,ℓ,µ} = ∪_{jτ ∈ ℓ} {u : [0, jτ] → U_µ | j ∈ ℤ_{>0} and ∀i ∈ {0, …, j−1}, ∀t ∈ [iτ, (i+1)τ), u(t) = u(iτ)}

of piecewise-constant control signals. Each signal u ∈ U_{τ,ℓ,µ} is a concatenation of constant signals of length τ with values in the finite input set U_µ. We limit the length of each signal u ∈ U_{τ,ℓ,µ} to be in the range ℓ = [ℓ_min, ℓ_max]; therefore, U_{τ,ℓ,µ} is also a finite set. Hence, let us consider the system Σ_{τ,ℓ,µ} = (X, X_in, U_µ, U_{τ,ℓ,µ}, ξ→, ξ←), which is the system Σ restricted to piecewise-constant control signals in U_{τ,ℓ,µ}.

B. Symbolic Model and Symbolic Controller
A symbolic model is a state-transition model (see Section II-B) with a discrete state space and a finite set of signals. For a given η ∈ ℝ_{>0}, let

[X]_η = {x ∈ ℝ^n | x_i = η l_i, l_i ∈ ℤ, i ≤ n, and B_η(x) ∩ X ≠ ∅}.   (2)

Then, we define a symbolic model of Σ_{τ,ℓ,µ} as follows.

Definition 8.
Given a system Σ_{τ,ℓ,µ} = (X, X_in, U_µ, U_{τ,ℓ,µ}, ξ→, ξ←) and a state-space quantisation parameter η ∈ ℝ_{>0}, a symbolic model is a state-transition model S_η(Σ_{τ,ℓ,µ}) = (Q = [X]_η, Q_in = [X_in]_η, U_{τ,ℓ,µ}, δ) such that (q, u, q̃) ∈ δ if (q, u, q̃) ∈ Q × U_{τ,ℓ,µ} × Q and
1) ∀x ∈ B_η(q) and t ≤ len(u), ξ→_{x,u}(t) ⊆ X, and
2) ∃x̃ ∈ ξ→_{q,u}(len(u)), ‖x̃ − q̃‖ ≤ β→_u(η, len(u)) + η, and
3) ∃x ∈ ξ←_{q̃,u}(len(u)), ‖x − q‖ ≤ β←_u(η, len(u)) + η.

Notice that [X]_η may contain some points that are not in X, but they have no outgoing transition in δ, so they will not influence our controller synthesis algorithm. A symbolic controller is a function S : FRun(S_η(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ} that is a model controller of S_η(Σ_{τ,ℓ,µ}).

Remark 1.
We can also use multi-dimensional quantisation parameters µ = [µ_1 ⋯ µ_m]^⊤ ∈ ℝ^m_{>0} and η = [η_1 ⋯ η_n]^⊤ ∈ ℝ^n_{>0}. In this case, Equation (1) becomes U_µ = {[u_1 ⋯ u_m]^⊤ ∈ U | u_i = µ_i l_i, l_i ∈ ℤ, i ≤ m}, and similarly for Equation (2).

Fig. 2. Overview of the symbolic self-triggered control process.
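For box-shaped U and X, the finite sets U_µ of Equation (1) and [X]_η of Equation (2) can be enumerated directly; the following sketch uses our own helper names and assumes axis-aligned boxes given as lists of (lo, hi) intervals.

```python
# Enumerating the quantised sets of Equations (1) and (2) for box-shaped
# spaces (helper names ours; boxes are lists of per-axis (lo, hi) intervals).
import itertools
import math

def axis_multiples(lo, hi, step):
    """Integer multiples of `step` inside [lo, hi]."""
    l_min = math.ceil(lo / step - 1e-9)
    l_max = math.floor(hi / step + 1e-9)
    return [step * l for l in range(l_min, l_max + 1)]

def quantised_inputs(box, mu):
    """U_mu of Equation (1): grid points mu * l inside the input box."""
    return list(itertools.product(*(axis_multiples(lo, hi, mu) for lo, hi in box)))

def grid_states(box, eta):
    """[X]_eta of Equation (2): grid points whose eta-ball meets the box X.
    Under the infinity norm, B_eta(x) meets [lo, hi] per axis iff
    lo - eta <= x_i <= hi + eta."""
    return list(itertools.product(
        *(axis_multiples(lo - eta, hi + eta, eta) for lo, hi in box)))
```

Both sets are finite because U and X are bounded, which is what makes the symbolic model finite.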
C. Symbolic Control and Approximate Simulation Relation
To study the relationship between S_η(Σ_{τ,ℓ,µ}) and M(Σ_{τ,ℓ,µ}), we introduce an alternating approximate simulation relation between state-transition models, which is inspired by alternating approximate bisimulation [19].

Definition 9.
Given a pair of transition models M_1 = (Q_1, Q_{in,1}, V, ∆_1) and M_2 = (Q_2, Q_{in,2}, V, ∆_2), a metric d : Q_1 × Q_2 → ℝ_{≥0}, and a precision ε ∈ ℝ_{≥0}, M_1 alternating ε-approximately simulates M_2 if the following holds:
1) ∀x ∈ Q_{in,1}, ∃q ∈ Q_{in,2} such that d(x, q) ≤ ε, and
2) ∀(x, q) ∈ Q_1 × Q_2 such that d(x, q) ≤ ε, ∀u ∈ V such that ∃(q, u, q′) ∈ ∆_2,
   a) ∃(x, u, x′) ∈ ∆_1, and
   b) ∀x′ ∈ Q_1 such that (x, u, x′) ∈ ∆_1, ∃(q, u, q̃) ∈ ∆_2 such that d(x′, q̃) ≤ ε.

If M_1 alternating ε-approximately simulates M_2, then, for any signal defined at a state of M_2, we have the same signal at the corresponding state of M_1. Moreover, any non-deterministic behaviour of M_1 is also present in M_2. Thus, we can turn any controller of M_2 into one of M_1.

Lemma 1.
Using the metric d : X × Q → ℝ_{≥0} given by d(x, q) = ‖x − q‖, M(Σ_{τ,ℓ,µ}) alternating η-approximately simulates the transition model S_η(Σ_{τ,ℓ,µ}).

Proof. The first condition of Definition 9 is obvious. Condition 2)-a) follows from Assumption 1. For 2)-b), we consider ‖x − q‖ ≤ η and (q, u, q′) ∈ δ, and assume that (x, u, x′) ∈ ∆. Since Q = [X]_η, there exists q̃ ∈ Q such that ‖x′ − q̃‖ ≤ η. We will show that (q, u, q̃) ∈ δ. Condition 1) of Definition 8 holds by the fact that (q, u, q′) ∈ δ. Then, by Assumption 1, there exists (q, u, x̃) ∈ ∆. By Assumption 1 and the triangle inequality,

‖x̃ − q̃‖ ≤ ‖x̃ − x′‖ + ‖x′ − q̃‖ ≤ β→_u(‖x − q‖, len(u)) + η ≤ β→_u(η, len(u)) + η,

which proves Condition 2) of Definition 8. Condition 3) of Definition 8 is shown in the same way. Consequently, (q, u, q̃) ∈ δ, and therefore condition 2)-b) holds. ∎

D. Symbolic Controller Synthesis Problem
In this section, we reduce the controller synthesis problem for the system Σ_{τ,ℓ,µ} to the synthesis of a symbolic controller for S_η(Σ_{τ,ℓ,µ}). The overview of the symbolic control process is depicted in Fig. 2. By Lemma 1, we can turn any symbolic controller S : FRun(S_η(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ} into a controller C_S : FRun(M(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ}. More precisely, let π : X → Q be a mapping such that x ∈ B_η(π(x)). Then, for each run r = x_0 u_0 x_1 … u_{l−1} x_l ∈ FRun(M(Σ_{τ,ℓ,µ})), writing π(r) = π(x_0) u_0 π(x_1) … u_{l−1} π(x_l), we assign C_S(r) = S(π(r)). By Lemma 1, π(r) is a run, and so C_S(r) is well defined for any r ∈ FRun(M(Σ_{τ,ℓ,µ})).

Fig. 3. Overview of the proposed control algorithm.

Definition 10.
Given a system Σ = (X, X_in, U, 𝒰, ξ→, ξ←), a set AP of atomic propositions, a function P : X → ℘(AP), a path formula Φ, a threshold ν ∈ ℝ_{>0}, and quantisation parameters τ, ℓ, µ, η, the symbolic controller synthesis problem consists in synthesising a symbolic controller S : FRun(S_η(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ} such that
• all finite runs in FRun(C_S/M(Σ)) can be extended to an infinite run in Run(M(Σ)),
• σ ⊨ Φ for any σ ∈ Traj(C_S/Σ), and
• lim_{h→∞} (1/h) Σ_{i=0}^{h−1} len(u_i) > ν for any x_0 u_0 x_1 … ∈ Run(C_S/M(Σ)),
or determine that such an S does not exist.

By Lemma 1, we have the following theorem.
Theorem 1.
If a symbolic controller S : FRun(S_η(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ} solves the problem of Definition 10, then the controller C_S : FRun(M(Σ_{τ,ℓ,µ})) → U_{τ,ℓ,µ} solves the controller synthesis problem of Definition 7.

V. CONTROL ALGORITHM
The overview of the proposed control algorithm is presented in Fig. 3. From the given system Σ, the signal-length interval ℓ, and initial quantisation parameters η = η_0, µ = µ_0, τ = τ_0, we construct the symbolic model S_η(Σ_{τ,ℓ,µ}). Then, we transform the symbolic control problem into a threshold problem of a mean-payoff parity game. If there exists a winning strategy for the controller in the game, the algorithm translates the strategy to a symbolic controller and terminates. Otherwise, the algorithm refines the quantisation parameters (e.g., by halving η, µ, or τ) and repeats the process. The algorithm terminates without solving the problem when the parameters η, µ, τ are smaller than some given thresholds.

A. Mean-payoff Parity Game
Let us first recall known results about mean-payoff parity games (MPPGs), which we use for solving the symbolic controller synthesis problem. We invite the interested reader to see [20] for more details about MPPGs.
Definition 11. A mean-payoff parity game is a tuple G = (G, λ, c, ν), where
• G = (V = V_1 ⊔ V_2, E, s : E → V, t : E → V) is a directed graph. V is partitioned into two disjoint sets V_1 and V_2 of vertices for Player-1 and Player-2, respectively. E is its set of edges. The functions s and t map edges to their sources and targets.
• λ : E → ℤ_{≥0} maps each edge to its payoff.
• c : V → ℤ_{≥0} maps each vertex to its colour.
• ν ∈ ℤ_{≥0} is a given mean-payoff threshold.

A play on G is an infinite sequence ω = v_0 e_0 v_1 e_1 … ∈ (VE)^ω such that, for all i ≥ 0, s(e_i) = v_i and t(e_i) = v_{i+1}. A finite play is a finite sequence in V(EV)^* defined in the same way. Let FPlay be the set of all finite plays, and FPlay_1 and FPlay_2 be the sets of finite plays ending with a vertex in V_1 and V_2, respectively. Both players play the game by selecting strategies. A strategy of Player-i is a partial function σ_i : FPlay_i ⇀ E such that s(σ_i(v_0 e_0 … v_n)) = v_n, i.e., σ_i chooses an edge whose source is the ending vertex of the play if such an edge exists, and does not choose any edge otherwise. A play ω = v_0 e_0 v_1 e_1 … is consistent with σ_i if e_j = σ_i(v_0 e_0 … v_j) for all v_j ∈ V_i. For an initial vertex v and a pair of strategies σ_1 and σ_2 of the two players, there exists a unique play, denoted by play(v, σ_1, σ_2), consistent with both σ_1 and σ_2. This play may be finite if a player cannot choose an edge.

For an infinite play ω = v_0 e_0 v_1 e_1 …, we denote by Inf(ω) the maximal colour that appears infinitely often in the sequence c(v_0) c(v_1) …. Then, the mean-payoff value of the play ω is MP(ω) = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} λ(e_i). A vertex v ∈ V is winning for Player-1 if there exists a strategy σ_1 of Player-1 such that, for any strategy σ_2 of Player-2, play(v, σ_1, σ_2) is infinite, Inf(play(v, σ_1, σ_2)) is even, and MP(play(v, σ_1, σ_2)) > ν. Such a strategy σ_1 is called a winning strategy for Player-1 from the vertex v. Then, the threshold problem [21] is to compute the set of winning vertices of Player-1 for a given MPPG.

In [21], the authors propose a pseudo-quasi-polynomial algorithm that solves the threshold problem and computes a winning strategy for Player-1 from each winning state. In Section V-C, we reduce the symbolic control problem to the synthesis of a winning strategy on an MPPG, which can be solved using this algorithm.

B. Atomic Propositions along Symbolic Transitions
We introduce functions ρ_∃, ρ_∀ : δ → ℘(℘(AP) × ℘(AP)) to under-approximate the set of atomic propositions that hold along trajectories. This is needed because the information about which states are visited along a trajectory is lost in the discrete model. These functions help recover part of this information, and will be crucial in the problem translation in Section V-C. For each transition (q, u, q′) ∈ δ in the symbolic model S_η(Σ_{τ,ℓ,µ}), we require that, for ∗ ∈ {∀, ∃},

ρ_∗(q, u, q′) ⊆ {(P_+, P_−) ∈ ℘(AP) × ℘(AP) | ∀x ∈ B_η(q), ∀x′ ∈ B_η(q′), ∀σ ∈ FTraj(Σ_{τ,ℓ,µ}, x, (x u x′)), ∗t ≤ len(u), P_+ ⊆ P(σ(t)) ∧ P_− ∩ P(σ(t)) = ∅}.   (3)

The intuition is as follows. If (P_+, P_−) ∈ ρ_∀(q, u, q′) (resp. ρ_∃(q, u, q′)), then, at all times (resp. at some time) along the transition (q, u, q′), all p ∈ P_+ hold and no p ∈ P_− holds. Then, we can define ρ_∗(q, u, q′) ⊨ ϕ inductively on the state formula ϕ in a sound way.

Fig. 4. Translation to a mean-payoff parity game.
For the implementation, we use functions B_+, B_− : X × ℝ_{>0} → ℘(AP) such that, for any state x ∈ X and any radius r ∈ ℝ_{>0}, B_+(x, r) = {p ∈ AP | ∀x′ ∈ B_r(x), x′ ⊨ p} and B_−(x, r) = {p ∈ AP | ∀x′ ∈ B_r(x), x′ ⊭ p} are the sets of atomic propositions that are satisfied and not satisfied, respectively, at all states in the ball B_r(x).

Assumption 3.
For any state x ∈ X and any radius r ∈ ℝ_{>0}, the sets B_+(x, r) and B_−(x, r) can be computed.

Then, we may use the following functions ρ_∃ and ρ_∀:

ρ_∃(q, u, q′) = (B_+(q, η) ∪ B_+(q′, η), B_−(q, η) ∪ B_−(q′, η)),
ρ_∀(q, u, q′) = (B_+(q, r) ∩ B_+(q′, r), B_−(q, r) ∩ B_−(q′, r)),

where r = β→_u(η, len(u)) + α→_u(η, len(u)). By Assumptions 1 and 2, ρ_∃ and ρ_∀ satisfy Equation (3).

C. Problem Translation to Mean-payoff Game
In this section, we present a translation from the symbolic model S_η(Σ_{τ,ℓ,µ}) = (Q, Q_in, U_{τ,ℓ,µ}, δ), a path formula Φ, and a threshold ν ∈ ℝ_{≥0} to an MPPG G_{Φ,ν} = (G_Φ, λ_Φ, c_Φ, ν). The MPPG is played between the controller (as Player-1) and the non-determinism of the system (as Player-2). The parity constraint will force the controller to induce trajectories that satisfy Φ, while the mean-payoff constraint will ensure that the average length of the chosen input signals is above the threshold. The translation is roughly as illustrated in Fig. 4: Player-1 can move from state q to state (q, u) (corresponding to choosing input signal u), then Player-2 can choose to go to any q_i reachable from q following u (corresponding to a non-deterministic environmental behaviour). The costs on the edges are such that the mean payoff is equal to the average signal length. Finally, the vertices' colours are defined inductively on Φ.

More precisely, we first define a graph G_Σ = (V_Σ = V_1 ⊔ V_2, E_Σ = E_1 ⊔ E_2, s, t) and a function λ : E_Σ → ℤ_{≥0} as follows, which corresponds to what is shown in Fig. 4:
• V_1 = Q and V_2 = Q × U_{τ,ℓ,µ},
• E_1 = Q × U_{τ,ℓ,µ} and E_2 = {((q, u), q′) | (q, u, q′) ∈ δ},
• ∀e = (q, u) ∈ E_1, q —e→ (q, u) and λ(e) = len(u),
• ∀e = ((q, u), q′) ∈ E_2, (q, u) —e→ q′ and λ(e) = len(u).

This will form the base of our game G_{Φ,ν}, which will roughly consist of multiple copies of G_Σ, labelled with different colours, built inductively from Φ. Technically, we define G_{Φ,ν} = ((Z_Φ × V_Σ, Z_Φ × E_Σ, s_Φ, t_Φ), λ̃, c_Φ, ν), where λ̃(z, e) = λ(e), (z, q) —(z,(q,(q,u)))→ (z, (q, u)), and (z, (q, u)) —(z,((q,u),q′))→ (z′, q′) for some z′ ∈ Z_Φ.
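The construction of G_Σ and the winning condition of Definition 11, evaluated on an eventually-periodic play, can be sketched as follows; the data layout (sets of δ-triples, a dict for len) and function names are ours.

```python
# The translation of Fig. 4 (encoding assumptions ours): Player-1 vertices
# are symbolic states q, Player-2 vertices are pairs (q, u); both edges of a
# round carry payoff len(u), so the mean payoff of a play equals the average
# length of the chosen signals.
def build_game_graph(delta, length):
    """delta: set of (q, u, q2) from the symbolic model; length: dict u -> len(u)."""
    v1, v2, edges = set(), set(), set()
    for (q, u, q2) in delta:
        v1.update({q, q2})
        v2.add((q, u))
        edges.add((q, (q, u), length[u]))    # Player-1 picks the signal u
        edges.add(((q, u), q2, length[u]))   # Player-2 resolves non-determinism
    return v1, v2, edges

def lasso_winning(cycle_colours, cycle_payoffs, nu):
    """Winning condition of Definition 11 on an eventually-periodic play:
    the maximal colour repeated forever is even, and the mean payoff
    (here the average payoff over the repeated cycle) exceeds nu."""
    return (max(cycle_colours) % 2 == 0
            and sum(cycle_payoffs) / len(cycle_payoffs) > nu)
```

Finite games always admit eventually-periodic (lasso-shaped) witnesses of winning plays, which is why evaluating Inf and MP on the repeated cycle suffices in this sketch.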
The number of copies $Z_\Phi$, the colour function $c_\Phi$, and the target copy $z'$ are defined inductively.
For the base cases $\Phi = \Diamond\varphi,\ \Box\varphi,\ \Diamond\Box\varphi,\ \Box\Diamond\varphi$, we only present $\Box\Diamond\varphi$, as the other cases are similar. We need two copies $Z_\Phi = \{1,2\}$ of $G_\Sigma$, labelled with colours $c_\Phi(z,v) = z$. Finally, $z'$ is $2$ if $\rho_\exists(q,u,q') \models \varphi$, and $1$ otherwise. The intuition is that we jump to a state in copy 2 if we can ensure that there is a state satisfying $\varphi$ on the trajectory leading to that state, and jump to a state in copy 1 otherwise. From there, it is clear that if we can find a strategy that visits states in copy 2 infinitely often (the winning condition for parity), then we can force the discrete model to output trajectories that satisfy $\Phi$.
For $\Phi \vee \Psi$ and $\Phi \wedge \Psi$, we synchronise parity automata by remembering, for each colour (say in $G_{\Psi,\nu}$), the maximum colour (of $G_{\Phi,\nu}$) seen during the execution since a larger colour has been seen. This allows us to compute the desired $c_{\Phi\vee\Psi}$ and $c_{\Phi\wedge\Psi}$.

Theorem 2.
From a winning strategy $\sigma$ for Player-1 in $G_{\Phi,\nu}$, one can effectively compute a symbolic controller $C_\sigma$ for $S_\eta(\Sigma_{\tau,\ell,\mu})$ that solves the symbolic controller synthesis problem of Definition 10.

Proof (sketch). $C_\sigma$ copies $\sigma$'s choice of input signals. The parity condition ensures that the controlled system satisfies $\Phi$, while the mean-payoff condition ensures that the average signal length is greater than the threshold.

VI. ILLUSTRATIVE EXAMPLE
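The system studied in this section is a unicycle-type robot with a bounded random speed perturbation. As a quick, hedged illustration (our own sketch under assumed parameter names `v`, `lam_bar`, `tau`, not the paper's implementation), the following Python snippet forward-simulates such dynamics with Euler integration:

```python
import math
import random

def simulate(x, y, theta, omega, v=1.0, lam_bar=0.1, tau=1.0, steps=100):
    """Euler-integrate one constant input signal omega over duration tau.

    Dynamics: x' = v(1+lam)cos(theta), y' = v(1+lam)sin(theta), theta' = omega,
    with lam drawn uniformly from [-lam_bar, lam_bar] at every step (a crude
    stand-in for the system's non-deterministic speed perturbation).
    """
    dt = tau / steps
    for _ in range(steps):
        lam = random.uniform(-lam_bar, lam_bar)  # bounded speed perturbation
        x += v * (1 + lam) * math.cos(theta) * dt
        y += v * (1 + lam) * math.sin(theta) * dt
        theta = (theta + omega * dt) % (2 * math.pi)
    return x, y, theta

# Sanity check: with no perturbation and no steering, the robot drives
# straight along the x-axis for a distance of v * tau.
print(simulate(0.0, 0.0, 0.0, 0.0, v=1.0, lam_bar=0.0, tau=2.0))
```

A symbolic controller would of course act on the quantised model rather than on such a simulation, but this makes the role of $v$, $\bar\lambda$, and $\omega$ in the equations below concrete.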
We consider the following non-deterministic nonholonomic robot system, which is a modified version of [3]:
$$\dot x(t) = v\,(1+\lambda(t))\cos(\theta(t)), \qquad \dot y(t) = v\,(1+\lambda(t))\sin(\theta(t)), \qquad \dot\theta(t) = \omega(t),$$
where $\omega$ is the input signal for the steering angle, $v$ is the speed of the robot, and $\lambda$ is randomly selected from $[-\bar\lambda, \bar\lambda]$ for a given parameter $\bar\lambda \in \mathbb{R}_{\geq 0}$. Notice in particular that this simple system verifies no stability assumption. $\xi^{\to}$ and $\xi^{\gets}$ may be over-approximations of how the physical system behaves. The non-determinism in the system may come from the physical system or its mathematical modelling (e.g., to account for floating-point errors). For the system described above, the non-determinism comes from the physical system, where the velocity of the robot is known only up to some error bound.
Recall that we only consider piecewise-constant input signals in $U_{\tau,\ell,\mu}$ (see Section IV-A). For each signal $\omega$ of length $m\tau$, let $\omega_0, \omega_1, \dots, \omega_{m-1}$ be the constant signals of length $\tau$ such that $\omega_k(t) = \omega(k\tau + t)$ for any $t \leq \tau$ and $k \in \{0, \dots, m-1\}$. We define $\beta^{\to}_\omega$ and $\alpha^{\to}_\omega$ as follows:
$$\beta^{\to}_\omega(d, k\tau + t) = \begin{cases} d + v(1+\bar\lambda)\sin(d)\Big(k\tau + t - \sum_{i=0}^{k-1} f_{\omega_i}(\tau) - f_{\omega_k}(t)\Big) & \text{if } d < \pi, \\ d + v(1+\bar\lambda)\Big(k\tau + t - \sum_{i=0}^{k-1} f_{\omega_i}(\tau) - f_{\omega_k}(t)\Big) & \text{otherwise,} \end{cases}$$
$$\alpha^{\to}_\omega(d, k\tau + t) = d + v(1+\bar\lambda)\Big(k\tau + t - \sum_{i=0}^{k-1} f_{\omega_i}(\tau) - f_{\omega_k}(t)\Big),$$
where
$$f_\omega(t) = \begin{cases} \lfloor \omega t / \pi \rfloor \, (\pi/\omega) & \text{if } \omega \neq 0, \\ 0 & \text{otherwise.} \end{cases}$$

Fig. 5. A finite run under the synthesised controller in 7 time steps. The labels of the arrows show the input signals.

Functions $\beta^{\gets}_\omega$ and $\alpha^{\gets}_\omega$ are defined in the same way. We implement our control algorithm with $X$ a bounded box of the form $[-\,\cdot\,,\,\cdot\,] \times [-\,\cdot\,,\,\cdot\,] \times [0, 2\pi]$, $X_{\mathrm{in}} = \{(0,0,\pi)\}$, $U = [-\pi, \pi]$, $\mu = \pi$, and $\ell_{\min} = \ell_{\max} = 1$, together with fixed values of $v$, $\bar\lambda$, $\eta$, and $\tau$. For the control specification, we set a threshold $\nu$ and $\Phi = \Box\Diamond\varphi$, where $[x\ y\ \theta]^\top \models \varphi$ if and only if $x$ and $y$ both exceed fixed bounds.

VII. CONCLUSION AND FUTURE WORK
In this paper, we proposed a self-triggered control synthesis procedure for non-deterministic continuous-time nonlinear systems without stability assumptions. The two main ingredients of this procedure are 1) discretising the state and input spaces to obtain a discrete symbolic model corresponding to the original continuous system, and 2) reducing the control synthesis problem to the computation of a winning strategy in a mean-payoff parity game. We illustrated our method on the example of a nonholonomic robot navigating in an arena, under a specification requiring it to repeat some reachability tasks. As future work, we would like to expand the considered fragment of LTL.

VIII. ACKNOWLEDGEMENTS

We thank Prof. Kazumune Hashimoto from Osaka University for his fruitful comments.

REFERENCES

[1] W. P. M. H. Heemels, K. H. Johansson, and P. Tabuada, "An introduction to event-triggered and self-triggered control," in Proc. 51st IEEE Conference on Decision and Control (CDC), 2012, pp. 3270–3285.
[2] K. Hashimoto, S. Adachi, and D. V. Dimarogonas, "Energy-aware networked control systems under temporal logic specifications," in Proc. 57th IEEE Conference on Decision and Control (CDC), 2018, pp. 132–139.
[3] C. Santos, F. Espinosa, M. Martinez-Rey, D. Gualda, and C. Losada, "Self-triggered formation control of nonholonomic robots," Sensors (Basel, Switzerland), vol. 19, no. 12, Jun. 2019.
[4] A. Anta and P. Tabuada, "To sample or not to sample: Self-triggered control for nonlinear systems," IEEE Transactions on Automatic Control, vol. 55, no. 9, pp. 2030–2042, 2010.
[5] K. Hashimoto, A. Saoud, M. Kishida, T. Ushio, and D. V. Dimarogonas, "A symbolic approach to the self-triggered design for networked control systems," IEEE Control Systems Letters, vol. 3, no. 4, pp. 1050–1055, 2019.
[6] K. Hashimoto and D. V. Dimarogonas, "Synthesizing communication plans for reachability and safety specifications," IEEE Transactions on Automatic Control, vol. 65, no. 2, pp. 561–576, 2019.
[7] E. A. Emerson and C. S. Jutla, "Tree automata, mu-calculus and determinacy," in Proc. 32nd Annual Symposium on Foundations of Computer Science (SFCS'91), 1991, pp. 368–377.
[8] O. Friedmann, M. Lange, and M. Latte, "Satisfiability games for branching-time logics," Logical Methods in Computer Science, vol. 9, no. 4, 2013.
[9] M. Luttenberger, P. J. Meyer, and S. Sickert, "Practical synthesis of reactive systems from LTL specifications via parity games," Acta Informatica, vol. 57, pp. 3–36, 2019.
[10] A. Ehrenfeucht and J. Mycielski, "Positional strategies for mean payoff games," International Journal of Game Theory, vol. 8, no. 2, pp. 109–113, 1979.
[11] S. Pruekprasert, T. Ushio, and T. Kanazawa, "Quantitative supervisory control game for discrete event systems," IEEE Transactions on Automatic Control, vol. 61, no. 10, pp. 2987–3000, 2016.
[12] Y. Ji, X. Yin, and S. Lafortune, "Mean payoff supervisory control under partial observation," in Proc. 57th IEEE Conference on Decision and Control (CDC), 2018, pp. 3981–3987.
[13] K. Kido, S. Sedwards, and I. Hasuo, "Bounding errors due to switching delays in incrementally stable switched systems," IFAC-PapersOnLine, vol. 51, no. 16, pp. 247–252, 2018.
[14] M. Zamani, P. Mohajerin Esfahani, R. Majumdar, A. Abate, and J. Lygeros, "Symbolic control of stochastic systems via approximately bisimilar finite abstractions," IEEE Transactions on Automatic Control, vol. 59, no. 12, pp. 3135–3150, 2014.
[15] E. Macoveiciuc and G. Reissig, "Memory efficient symbolic solution of quantitative reach-avoid problems," in Proc. American Control Conference (ACC), 2019, pp. 1671–1677.
[16] M. Zamani, G. Pola, M. Mazo, and P. Tabuada, "Symbolic models for nonlinear control systems without stability assumptions," IEEE Transactions on Automatic Control, vol. 57, no. 7, pp. 1804–1809, 2012.
[17] D. Angeli and E. D. Sontag, "Forward completeness, unboundedness observability, and their Lyapunov characterizations," Systems and Control Letters, vol. 38, no. 4, pp. 209–217, 1999.
[18] D. Angeli, "A Lyapunov approach to incremental stability properties," IEEE Transactions on Automatic Control, vol. 47, no. 3, pp. 410–421, 2002.
[19] G. Pola and P. Tabuada, "Symbolic models for nonlinear control systems: Alternating approximate bisimulations," SIAM Journal on Control and Optimization, vol. 48, no. 2, pp. 719–733, 2009.
[20] K. Chatterjee, T. A. Henzinger, and M. Jurdziński, "Mean-payoff parity games," in Proc. 20th Annual IEEE Symposium on Logic in Computer Science (LICS'05), 2005.
[21] L. Daviaud, M. Jurdziński, and R. Lazić, "A pseudo-quasi-polynomial algorithm for mean-payoff parity games," in Proc. 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, 2018, pp. 325–334.
[22] K. Chatterjee and L. Doyen, "Energy parity games," Theoretical Computer Science, vol. 458, pp. 49–60, 2012.
[23] L. Brim, J. Chaloupka, L. Doyen, R. Gentilini, and J.-F. Raskin, "Faster algorithms for mean-payoff games," Formal Methods in System Design, vol. 38, no. 2, pp. 97–118, 2011.