A Distributed Simplex Architecture for Multi-Agent Systems
Usama Mehmood, Scott D. Stoller, Radu Grosu, Shouvik Roy, Amol Damare, Scott A. Smolka
Department of Computer Science, Stony Brook University, USA
Department of Computer Engineering, Technische Universität Wien, Austria
Abstract.
We present Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for multi-agent systems (MASs). DSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. The traditional Simplex approach is limited to single-agent systems or a MAS with a centralized control scheme. DSA addresses this limitation by extending the scope of Simplex to include MASs under distributed control. In DSA, each agent has a local instance of traditional Simplex such that the preservation of safety in the local instances implies safety for the entire MAS. We provide a proof of safety for DSA, and present experimental results for several case studies, including flocking with collision avoidance, safe navigation of ground rovers through way-points, and the safe operation of a microgrid.
Keywords:
Runtime assurance · Simplex architecture · Control Barrier Functions · Distributed flocking · Reverse switching.
A multi-agent system (MAS) is a group of autonomous, intelligent agents that work together to solve tasks and carry out missions. MAS applications include the design of power systems and smart-grids [1,2], autonomous control of robotic swarms for monitoring, disaster management, military battle systems, etc. [3], and sensor networks. Many MAS applications are safety-critical. It is therefore paramount that MAS control strategies ensure safety.

In this paper, we present the Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for MASs under distributed control. DSA is inspired by Sha et al.'s Simplex Architecture [4, 5], but differs from it in significant aspects. The Simplex Architecture provides runtime assurance of safety by switching control from an unverified (hence potentially unsafe) advanced controller (AC) to a verified-safe baseline controller (BC), if the action produced by the AC could result in a safety violation in the near future. The switching logic is implemented in a verified decision module (DM). The applicability of the traditional Simplex Architecture is limited to systems with a centralized control architecture.
[Figure 1 diagram: components Process, Advanced Controller (AC), Baseline Controller (BC), and Decision Module (DM) for agent i, with sensed neighbor states S_{i,1}, S_{i,2}, ..., S_{i,m}; the rest of the network shows agents i, j, k, and l, each running DSA.]
Fig. 1: The DSA for the MAS on the right. All agents in the MAS are homogeneous and operate under DSA, but the figure shows the DSA components for only agent $i$. The sensed state of agent $i$'s $j$th neighbor is denoted as $S_{i,j}$. The AC, BC, and DM take as input the state of the agent and its neighbors.

DSA, illustrated in Fig. 1, addresses this limitation by making necessary additions to the traditional Simplex to widen its scope to include MASs. Also, as in [6], it implements reverse switching by reverting control back to the AC when it is safe to do so.

In DSA, for each agent, there is a verified-safe BC and a certified switching logic such that if all the agents operate under DSA, then safety of the MAS is guaranteed. The BC and DM along with the AC are distributed and depend only on local information. DSA itself is distributed in the sense that it involves one local instance of traditional Simplex per agent such that the conjunction of their respective safety properties yields the desired safety property for the entire MAS. For example, consider our flocking case study, where we want to establish collision-freedom for the entire MAS. This can be accomplished in a distributed manner by showing that each local instance of Simplex, say for agent $i$, ensures collision-freedom for agent $i$ and its neighboring agents.

DSA allows agents to switch their mode of operation independently. At any given time, some agents may be operating in AC mode while others are operating in BC mode. Our approach to the design of the BC and DM leverages Control Barrier Functions (CBFs), which have been used to synthesize safe controllers [7–9], and are closely related to Barrier Certificates used for safety verification of closed dynamical systems [10, 11]. A CBF is a mapping from the state space to a real number, with its zero level-set partitioning the state space into safe and unsafe regions.
If certain inequalities on the Lie derivative of the CBF are satisfied, then the corresponding control actions are considered safe (admissible).

In DSA, the BC is designed as an optimal controller with the goal of increasing a utility function based on the Lie derivatives of the CBFs. As CBFs are a measure of the safety of a state, optimizing for control actions with higher Lie derivative values gives a direct way to make the state safer. The safety of the BC is further guaranteed by constraining the control action to remain in a set of admissible actions that satisfy certain inequalities on the Lie derivatives of the CBFs. CBFs are also used in the design of the switching logic, as they provide an efficient method for checking whether an action could lead to a safety violation during the next time step.
We demonstrate the effectiveness of DSA on several example MASs, including a flock of robots moving coherently while avoiding inter-agent collisions, ground rovers safely navigating through a series of way-points, and safe operation of a microgrid.
The Simplex Control Architecture relies on a verified-safe baseline controller (BC) in conjunction with the verified switching logic of the Decision Module (DM) to guarantee the safety of the plant (Agent $i$ in Fig. 1) while permitting the use of an unverifiable, high-performance advanced controller (AC).

Let the admissible states be those which satisfy all safety constraints and operational limits. Other states are called inadmissible. The goal of the Simplex Architecture is to ensure the system never enters an inadmissible state. The set $\mathcal{R}$ of recoverable states is a subset of the admissible states such that the BC, starting from any state in $\mathcal{R}$, guarantees that all future states are also in $\mathcal{R}$. The recoverable set takes into account the inertia of the physical system, giving the BC enough time to preserve safety.

The DM's forward switching condition (FSC) evaluates the control action proposed by the AC and decides whether to switch to the BC. A common technique to develop an FSC is to shrink the recoverable region by a margin based on the maximum time derivative of the state and the length of a timestep, and switch to the BC if the current state lies outside this smaller set.

Control Barrier Functions (CBFs) [12, 13] are an extension of the Barrier Certificates used for safety verification of hybrid systems [10, 11]. CBFs are a class of Lyapunov-like functions used to guarantee safety for nonlinear control systems by assisting in the design of a class of safe controllers that establish the forward-invariance of safe sets [9, 14]. Our presentation of CBFs is based on [13].

Consider a nonlinear affine control system

$$\dot{x} = f(x) + g(x)\,u \tag{1}$$

with state $x \in D \subset \mathbb{R}^n$, control input $u \in U$, and functions $f$ and $g$ that are locally Lipschitz. The set $\mathcal{R}$ of recoverable states is defined as the super-level set of a continuously differentiable function $h : D \subset \mathbb{R}^n \to \mathbb{R}$.
The recoverable set $\mathcal{R}$ and its boundary $\delta\mathcal{R}$ are given by:

$$\mathcal{R} = \{ x \in D \subset \mathbb{R}^n \mid h(x) \ge 0 \} \tag{2}$$

$$\delta\mathcal{R} = \{ x \in D \subset \mathbb{R}^n \mid h(x) = 0 \} \tag{3}$$
For all $x \in D$, if there exists an extended class $\mathcal{K}$ function $\alpha : \mathbb{R} \to \mathbb{R}$ (strictly increasing and $\alpha(0) = 0$) such that the following condition on the Lie derivative of $h$ is satisfied:

$$\sup_{u \in U} \left[ L_f h(x) + L_g h(x)\,u + \alpha(h(x)) \right] \ge 0 \tag{4}$$

then $h(x)$ is a valid CBF. Condition (4) implies the existence of a control action for all $x \in D$, such that the Lie derivative of $h$ is bounded from below by $-\alpha(h(x))$. Furthermore, for $x \in \delta\mathcal{R}$, condition (4) reduces to a result for set invariance known as Nagumo's theorem [15, 16]. Condition (4) is used to define the set $K(x)$ of control actions that establish the forward invariance of set $\mathcal{R}$; i.e., starting from $x \in \mathcal{R}$, the state will always remain inside $\mathcal{R}$:

$$K(x) = \{ u \in U : L_f h(x) + L_g h(x)\,u + \alpha(h(x)) \ge 0 \} \tag{5}$$

Theorem 1. [13] For the control system given in Eq. (1) and recoverable set $\mathcal{R}$ defined in (2) as the super-level set of some continuously differentiable function $h : \mathbb{R}^n \to \mathbb{R}$, if $h$ is a control barrier function for all $x \in D$ and $\frac{\partial h}{\partial x} \ne 0$ for all $x \in \delta\mathcal{R}$, then any controller $u$ such that $\forall x \in D : u(x) \in K(x)$ ensures forward-invariance of $\mathcal{R}$.

Proof. Condition (4) on the Lie derivative of $h$ reduces, on the boundary of $\mathcal{R}$, to the set invariance condition of Nagumo's theorem: for $x \in \delta\mathcal{R}$, $\dot{h} \ge -\alpha(h(x)) = 0$. Hence, according to Nagumo's theorem [15, 16], the set $\mathcal{R}$ is forward-invariant.

This section describes the Distributed Simplex Architecture (DSA). We formally introduce the MAS safety problem and then discuss the main components of DSA, namely, the distributed baseline controller (BC) and the distributed decision module (DM).

We say that an instance of DSA is symmetric if every agent uses the same switching condition and baseline controller.
Moreover, DSA, or more precisely the MAS it is controlling, is homogeneous if every constituent agent is an instance of the same plant model.

Consider a MAS consisting of $k$ homogeneous agents, denoted as $\mathcal{M} = \{1, \ldots, k\}$, where the nonlinear control-affine dynamics for the $i$th agent are:

$$\dot{x}_i = f(x_i) + g(x_i)\,u_i \tag{6}$$

Here, $x_i \in D \subset \mathbb{R}^n$ is the state of agent $i$ and $u_i \in U \subset \mathbb{R}^m$ is its control input. For an agent $i$, we define the set of its neighbors $\mathcal{N}_i \subseteq \mathcal{M}$ as the agents whose state is accessible to agent $i$ either through sensing or communication. Depending on the application, the set of neighbors could be fixed or vary dynamically. For example, in our flocking case study (Section 4), agent $i$'s neighbors (in a given state) are the agents within a fixed distance $r$ of agent $i$; we assume agent $i$ can accurately sense the positions and velocities of those agents. We denote the combined state of all the agents in the MAS as the vector $x = [x_1^T, x_2^T, \ldots, x_k^T]^T$ and denote the state of the neighbors of agent $i$ (including agent $i$) as $x_{\mathcal{N}_i}$. DSA uses discrete-time control: the DM and controllers are evaluated every $\eta$ seconds. We assume that all agents evaluate the DM and controllers simultaneously; this assumption simplifies the analysis but can be relaxed.

Admissible States
The set of admissible states $\mathcal{A} \subset \mathbb{R}^{kn}$ consists of all states that satisfy the safety constraints. A constraint $C$ is a function from $k$-agent MAS states to the reals; $C : D^k \to \mathbb{R}$. In this paper, we are primarily concerned with binary constraints (between neighboring agents) $C_{ij} : D \times D \to \mathbb{R}$, and unary constraints $C_i : D \to \mathbb{R}$. Hence, the set of admissible states $\mathcal{A} \subset \mathbb{R}^{kn}$ comprises the MAS states $x \in \mathbb{R}^{kn}$ in which all the unary and binary constraints are satisfied.

Formally, a symmetric instance of DSA aims to solve the following problem. Given a MAS defined as in Eq. (6) and $x(0) \in \mathcal{A}$, design a BC and DM to be used by all agents such that the MAS remains safe; i.e., $x(t) \in \mathcal{A}$, $\forall t > 0$.

Recoverable States
For each agent $i$, the local admissible set $\mathcal{A}_i \subset \mathbb{R}^n$ is the set of states $x_i \in \mathbb{R}^n$ which satisfy all the unary constraints. The set $\mathcal{S}_i \subset \mathcal{A}_i$ is defined as the super-level set of the CBF $h_i : \mathbb{R}^n \to \mathbb{R}$, which is designed to ensure forward-invariance of $\mathcal{A}_i$. Similarly, for a pair of neighboring agents $i, j$ where $i \in \mathcal{M}$, $j \in \mathcal{N}_i$, the pairwise admissible set $\mathcal{A}_{ij} \subset \mathbb{R}^{2n}$ is the set of pairs of states which satisfy all the binary constraints. The set $\mathcal{S}_{ij} \subset \mathcal{A}_{ij}$ is defined as the super-level set of the CBF $h_{ij} : \mathbb{R}^{2n} \to \mathbb{R}$ designed to ensure forward-invariance of $\mathcal{A}_{ij}$. The recoverable set $\mathcal{R}_{ij} \subset \mathbb{R}^{2n}$, for a pair of neighboring agents $i, j$ where $i \in \mathcal{M}$, $j \in \mathcal{N}_i$, is defined in terms of $\mathcal{S}_i$, $\mathcal{S}_j$ and $\mathcal{S}_{ij}$.

$$\mathcal{S}_i = \{ x_i \in \mathbb{R}^n \mid h_i(x_i) \ge 0 \} \tag{7}$$

$$\mathcal{S}_{ij} = \{ (x_i, x_j) \in \mathbb{R}^{2n} \mid h_{ij}(x_i, x_j) \ge 0 \} \tag{8}$$

$$\mathcal{R}_{ij} = (\mathcal{S}_i \times \mathcal{S}_j) \cap \mathcal{S}_{ij} \tag{9}$$

The recoverable set $\mathcal{R} \subset \mathcal{A}$ for the entire MAS is defined as the set of system states in which $(x_i, x_j) \in \mathcal{R}_{ij}$ for every pair of neighboring agents $i, j$. The CBFs can be computed using sum-of-squares programming [17] or the technique in [18]. An application of these techniques for the synthesis of CBFs for several systems can be found in [13]. Note that if agent $i$ and $j$'s controllers satisfy the following constraints based on the Lie derivatives of $h_i$, $h_j$ and $h_{ij}$, similar to the constraints in (5), the pairwise state of agents $i$ and $j$ will remain in $\mathcal{R}_{ij}$ according to Theorem 1.

$$L_f h_i(x_i) + L_g h_i(x_i)\,u_i + \alpha(h_i(x_i)) \ge 0 \tag{10a}$$

$$L_f h_j(x_j) + L_g h_j(x_j)\,u_j + \alpha(h_j(x_j)) \ge 0 \tag{10b}$$

$$L_f h_{ij}(x_i, x_j) + L_g h_{ij}(x_i, x_j) \begin{bmatrix} u_i \\ u_j \end{bmatrix} + \alpha(h_{ij}(x_i, x_j)) \ge 0 \tag{10c}$$
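To make the form of the Lie-derivative constraints concrete, the following is a minimal sketch of a check in the style of (10a). It assumes a toy one-dimensional single-integrator system $\dot{x} = u$ with the hypothetical CBF $h(x) = 1 - x^2$ and $\alpha(h) = \gamma h$; the system, the CBF, and the function names are illustrative assumptions and do not come from the case studies.

```python
def constraint_lhs(lf_h, lg_h, u, alpha_h):
    """Left-hand side L_f h + L_g h * u + alpha(h) of a constraint like (10a)."""
    return lf_h + lg_h * u + alpha_h


def is_admissible(x, u, gamma=1.0):
    """Check the constraint for the assumed toy system x' = u, h(x) = 1 - x^2."""
    h = 1.0 - x * x      # CBF value; h >= 0 exactly when |x| <= 1
    lf_h = 0.0           # the drift f(x) is zero, so L_f h = 0
    lg_h = -2.0 * x      # dh/dx * g(x), with g(x) = 1
    return constraint_lhs(lf_h, lg_h, u, gamma * h) >= 0.0


if __name__ == "__main__":
    print(is_admissible(0.9, -1.0))  # action moving back toward the interior
    print(is_admissible(0.9, 1.0))   # action pushing toward the boundary x = 1
```

Near the boundary, only actions that do not decrease $h$ faster than $-\gamma h$ pass the check, while at the center of the set any bounded action is accepted.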
Constraint Partitioning
Note that the constraints in (10) are linear in the control variable. For ease of notation we write the unary constraints as $A_i u_i \le b_i$ and the binary constraints as $[P_{ij} \; Q_{ij}] \begin{bmatrix} u_i \\ u_j \end{bmatrix} \le b_{ij}$.

The binary constraint in (10c) is a condition on the control action of a pair of agents. For a centralized MAS, the global controller can pick coordinated actions for agents $i$ and $j$ to ensure the binary constraint (10c) is satisfied. However, for a decentralized MAS, the distributed control of the two agents cannot independently satisfy the binary constraint without running an agreement protocol.

As DSA is a distributed control framework, we solve the problem of the satisfaction of the binary constraint by partitioning the binary constraint into two unary constraints such that the satisfaction of the unary constraints implies the satisfaction of the binary constraint (but not vice versa) [9].

$$\begin{bmatrix} P_{ij} & Q_{ij} \end{bmatrix} \begin{bmatrix} u_i \\ u_j \end{bmatrix} \le b_{ij} \;\rightarrow\; \begin{cases} P_{ij} u_i \le b_{ij}/2 \\ Q_{ij} u_j \le b_{ij}/2 \end{cases} \tag{11}$$

Satisfaction of these two unary constraints by agents $i$ and $j$ guarantees safety because the binary constraint still holds. Moreover, the equal partitioning ensures that the agents share an equal responsibility to keep the pairwise state safe. The admissible control space for an agent $i$, denoted by $\mathcal{L}_i$, is an intersection of half-spaces of the hyper-planes defined by the linear constraints.

$$\mathcal{L}_i = \{ u_i \in U \mid \forall j \in \mathcal{N}_i : A_i u_i \le b_i \,\wedge\, P_{ij} u_i \le b_{ij}/2 \} \tag{12}$$

Theorem 2.
Given a MAS indexed by $\mathcal{M}$ and with dynamics in (6), if the controller for each agent $i \in \mathcal{M}$ chooses an action $u_i \in \mathcal{L}_i$, thereby satisfying the Lie-derivative constraints on the respective CBFs, and $x(0) \in \mathcal{R}$, then the MAS is guaranteed to be safe.

Proof. If all the agents choose an action from their respective admissible control spaces $\mathcal{L}_i$, then the forward invariance of the set $\mathcal{S}_i$ for all $i \in \mathcal{M}$ and $\mathcal{S}_{ij}$ for all $i \in \mathcal{M}$, $j \in \mathcal{N}_i$ is established by Theorem 1. Therefore, $\mathcal{R}_{ij}$ is forward invariant for all $i \in \mathcal{M}$, $j \in \mathcal{N}_i$ and consequently $\mathcal{R}$ is forward invariant.

The BC is a distributed controller with the task of keeping the state of the agent in the safe region. For an agent $i$, the control law of the BC depends on its state $x_i$ and the states of its neighbors. In our design, the BC considers only the safety-critical aspects, leaving the mission-critical objectives to the AC. Specifically, the BC is designed to move the system toward safer states as quickly as possible. This reduces the width of the necessary "safety margin" between unsafe and recoverable states, allowing a looser FSC (i.e., allowing the AC to stay in control more often).
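The constraint partitioning in (11), on which Theorem 2 relies, can be checked numerically. The sketch below is illustrative only: the row vectors `P` and `Q`, the bound `b`, and the helper names are hypothetical values chosen for the example. It shows that two actions satisfying the halved unary constraints also satisfy the binary constraint, while the converse can fail.

```python
def dot(row, u):
    """Inner product of a constraint row with a control vector."""
    return sum(r * x for r, x in zip(row, u))


def binary_ok(P, Q, u_i, u_j, b):
    """Binary constraint [P Q][u_i; u_j] <= b on the pair of actions."""
    return dot(P, u_i) + dot(Q, u_j) <= b


def unary_ok(row, u, b):
    """One half of the partition in (11): row * u <= b / 2."""
    return dot(row, u) <= b / 2.0


if __name__ == "__main__":
    P, Q, b = [1.0, 0.0], [-1.0, 0.0], 2.0   # hypothetical constraint data
    u_i, u_j = [1.0, 0.0], [-1.0, 0.0]
    # Both unary halves hold, so the binary constraint holds as well.
    print(unary_ok(P, u_i, b), unary_ok(Q, u_j, b), binary_ok(P, Q, u_i, u_j, b))
    # The binary constraint can hold even when one unary half is violated.
    print(unary_ok(P, [2.0, 0.0], b), binary_ok(P, Q, [2.0, 0.0], [0.0, 0.0], b))
```

The second case illustrates why the partition is conservative: it excludes some jointly safe action pairs in exchange for letting each agent decide without coordination.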
We design the BC as the solution to the following constrained multi-objective optimization (MOO) problem, where the utility function is a weighted sum of objective functions based on the Lie derivatives of the CBFs $h_i$ and $h_{ij}$ introduced above:

$$u_i^* = \underset{u_i}{\operatorname{argmax}} \; \frac{1}{h_i} \left( L_f h_i + L_g h_i\, u_i \right) + \sum_{j \in \mathcal{N}_i} \frac{1}{h_{ij}} \left( L_f h_{ij} + L_g h_{ij} \begin{bmatrix} u_i \\ 0 \end{bmatrix} \right) \quad \text{s.t. } u_i \in \mathcal{L}_i \tag{13}$$

The bottom component of the column vector in the last term is agent $i$'s prediction for agent $j$'s next control action $u_j$. Since we consider MASs in which agents are unable to communicate planned control actions, agent $i$ simply predicts that $u_j = 0$. This approach has been shown to work well in prior work on distributed model-predictive control for flocking [19].

Recall that, by definition, the CBFs quantify the degree of safety of the state with respect to the given safety constraints, with larger (positive) values indicating safer states. For a given state, the Lie derivative of a CBF is a linear function in the control action. A positive value of the Lie derivative indicates that the proposed action will lead to a state which has a higher CBF value and therefore is safer.

The solution to the optimization problem in (13) is a control action that maximizes the weighted sum of the Lie derivatives of the CBFs. We note that in a weighted-sum formulation of a MOO problem, it is possible that some objective functions are negative in the optimal solution. We ensure the selected action $u_i$ is safe by constraining $u_i$ to be in the admissible control space $\mathcal{L}_i$, defined in (12).

The weights in the utility function in (13) prioritize certain safety constraints over others. We use state-dependent weights which are the inverses of the CBFs, thereby giving more weight to maximizing the Lie derivatives of CBFs corresponding to safety constraints that are closer to being violated.

Each agent's DM implements the switching logic for both forward switching and reverse switching.
Control is switched from the AC to the BC if the forward switching condition (FSC) is true. Similarly, control is reverted back to the AC (from the BC) if the reverse switching condition (RSC) is true. For an agent $i$, the state of the DM is denoted as $DM_i \in \{AC, BC\}$, with $DM_i = AC$ ($DM_i = BC$) indicating that the advanced (baseline) controller is in control. DSA starts with all agents in the AC mode; i.e., $DM_i(t) = AC$ for all $t \le 0$, $i \in \mathcal{M}$; this is justified by the assumption that $x(0) \in \mathcal{R}$. For $t > 0$, the DM state is given by:

$$DM_i(t) = \begin{cases} AC & \text{if } DM_i(t-1) = BC \text{ and } RSC(x_{\mathcal{N}_i}) \\ BC & \text{if } DM_i(t-1) = AC \text{ and } FSC(x_{\mathcal{N}_i}) \\ DM_i(t-1) & \text{otherwise} \end{cases} \tag{14}$$

where $x_{\mathcal{N}_i}$ is the vector containing the states of agent $i$ and its neighbors.
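The mode update in (14) is a two-state machine, which can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `dm_update` and the representation of the switching conditions as precomputed booleans are assumptions made for the example.

```python
AC, BC = "AC", "BC"


def dm_update(prev_mode, fsc, rsc):
    """One application of the update rule (14) for a single agent.

    prev_mode is DM_i(t-1); fsc and rsc are the truth values of the forward
    and reverse switching conditions in the current neighborhood state.
    """
    if prev_mode == BC and rsc:
        return AC          # reverse switch: hand control back to the AC
    if prev_mode == AC and fsc:
        return BC          # forward switch: the BC takes over
    return prev_mode       # otherwise keep the previous mode


if __name__ == "__main__":
    mode = AC  # every agent starts in AC mode
    # Hypothetical (FSC, RSC) truth values over four control periods.
    for fsc, rsc in [(False, True), (True, False), (False, False), (False, True)]:
        mode = dm_update(mode, fsc, rsc)
        print(mode)
```

Because each branch tests the previous mode, the RSC is only consulted in BC mode and the FSC only in AC mode, matching the case analysis in (14).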
We derive the switching conditions from the CBFs as follows. To ensure safety, the FSC must be true in a state $x_{\mathcal{N}_i}(t)$ if an unrecoverable state is reachable from $x_{\mathcal{N}_i}(t)$ in one time step $\eta$. For a CBF, in a given state, we define a worst-case action to be an action that minimizes the Lie derivative of the CBF. The check for one-step reachability of an unrecoverable state is based on the minimum value of the Lie derivative of the CBFs, which corresponds to the worst-case actions by the agents. Hence, for each CBF $h$, we define a minimum threshold value $\lambda_h(x_{\mathcal{N}_i})$ equal to the magnitude of the minimum of the Lie derivative of the CBF times $\eta$, and we switch to the BC if, in the current state, the value of any CBF $h$ is less than $\lambda_h(x_{\mathcal{N}_i})$. This results in an FSC of the following form:

$$FSC(x_{\mathcal{N}_i}) = \left( h_i < \lambda_{h_i}(x_{\mathcal{N}_i}) \right) \vee \left( \exists j \in \mathcal{N}_i \mid h_{ij} < \lambda_{h_{ij}}(x_{\mathcal{N}_i}) \right) \tag{15}$$

Thus, the one-step reachability check shrinks the size of the recoverable set by an amount equal to the maximum change that can occur from the current state in one control period, and the switch occurs if the current state is outside this smaller set.

We derive the RSC using a similar approach, except based on an $m$-time-step reachability check with $m > 1$, in order to prevent frequent switching between AC and BC. The RSC holds if, in the current state, the value of each CBF $h$ is greater than the threshold $m\lambda_h(x_{\mathcal{N}_i})$.

$$RSC(x_{\mathcal{N}_i}) = \left( h_i > m\lambda_{h_i}(x_{\mathcal{N}_i}) \right) \wedge \left( \forall j \in \mathcal{N}_i \mid h_{ij} > m\lambda_{h_{ij}}(x_{\mathcal{N}_i}) \right) \tag{16}$$

This definition of the RSC ensures that, when control is switched to the AC, the state is safe and the FSC will not hold for at least $m$ time steps.

Our main result is the following safety theorem for DSA.
Theorem 3.
Given a MAS indexed by $\mathcal{M}$ with dynamics in (6), if each agent operates under DSA with the BC as defined in (13), the DM as defined in (14), and $x(0) \in \mathcal{R}$, then the MAS will remain safe.

Proof. The proof proceeds by considering both DM states for an arbitrary agent $i$ and establishing that its next state is safe. First, consider an agent $i$ at time $t$ with $DM_i(t) = AC$. As the FSC is false, the one-step reachability check in the FSC ensures that the CBFs for unary and binary safety constraints are strictly positive at the next state $x_i(t + \eta)$, i.e., $h_i(x_i(t + \eta)) > 0$ and $\forall j \in \mathcal{N}_i : h_{ij}(x_i(t + \eta), x_j(t + \eta)) > 0$; hence the next state is recoverable. Next, consider an agent $i$ at time $t$ with $DM_i(t) = BC$. We divide the neighbors of $i$ into two sets based on their DM states: the sets of neighbors in AC mode and BC mode are denoted as $\mathcal{N}_i^{AC}$ and $\mathcal{N}_i^{BC}$, respectively. The agents in BC mode choose their control actions from their corresponding admissible control spaces as defined in Eq. (12). Hence, according to Theorem 2, these agents will satisfy unary safety constraints and pairwise safety constraints among themselves. As for the neighbors in AC mode, due to the one-step reachability check in their FSC, in the state $x_i(t + \eta)$, the pairwise CBFs satisfy $h_{ij}(x_i(t + \eta), x_j(t + \eta)) \ge 0$ for all $j \in \mathcal{N}_i^{AC}$. Hence, $x_i(t + \eta)$ is recoverable for $DM_i(t) = BC$. We have proven that, for any agent $i$ and time step $t$, if $x_i(t)$ is recoverable, then $x_i(t + \eta)$ is recoverable. By assumption, $x(0) \in \mathcal{R}$. Therefore, by induction, $x(t) \in \mathcal{R}$ for $t > 0$.

We evaluate DSA on the distributed flocking problem with the goal of preventing inter-agent collisions. Consider a MAS consisting of $n$ robotic agents, indexed by $\mathcal{M} = \{1, \ldots, n\}$, with double integrator dynamics:

$$\begin{bmatrix} \dot{p}_i \\ \dot{v}_i \end{bmatrix} = \begin{bmatrix} 0 & I_{2 \times 2} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} p_i \\ v_i \end{bmatrix} + \begin{bmatrix} 0 \\ I_{2 \times 2} \end{bmatrix} a_i \tag{17}$$

where $p_i, v_i, a_i \in \mathbb{R}^2$ are the position, velocity and acceleration of agent $i \in \mathcal{M}$, respectively. The magnitudes of velocities and accelerations are bounded by $\bar{v}$ and $\bar{a}$, respectively. Acceleration $a_i$ is the control input for agent $i$. As DSA is a discrete-time protocol, the state of the DM and the $a_i$'s are updated every $\eta$ seconds. The state of an agent $i$ is denoted by the vector $s_i = [p_i^T \; v_i^T]^T$. The state of the entire flock at time $t$ is denoted by the vector $s(t) = [p(t)^T \; v(t)^T]^T \in \mathbb{R}^{4n}$, where $p(t) = [p_1^T(t) \cdots p_n^T(t)]^T$ and $v(t) = [v_1^T(t) \cdots v_n^T(t)]^T$ are the vectors respectively denoting the positions and velocities of the flock at time $t$.

We assume that an agent can accurately sense the positions and velocities of nearby agents within a fixed distance $r$. The set of the spatial neighbors of agent $i$ is defined as $\mathcal{N}_i(p) = \{ j \in \mathcal{M} \mid j \ne i \wedge \|p_i - p_j\| < r \}$, where $\|\cdot\|$ denotes the Euclidean norm. For ease of notation, we sometimes use $s$ and $s_i$ to refer to the state variables $s(t)$ and $s_i(t)$, respectively, without the time index.

The MAS is characterized by a set of operational constraints which include physical limits and safety properties.
States that satisfy the operational constraints are called admissible, and are denoted by the set $\mathcal{A} \subset \mathbb{R}^{4n}$. The desired safety property is that no pair of agents is in a "state of collision". A pair of agents is considered to be in a state of collision if the Euclidean distance between them is less than a threshold distance $d_{min} \in \mathbb{R}^+$, resulting in a binary safety constraint of the form: $\|p_i - p_j\| - d_{min} \ge 0$, $\forall i \in \mathcal{M}$, $j \in \mathcal{N}_i$. Similarly, a state $s$ is recoverable if all pairs of agents can brake (decelerate) relative to each other without colliding. Otherwise, the state $s$ is considered unrecoverable. Let $\mathcal{R}_{ij} \subset \mathbb{R}^8$ be the set of recoverable states for a pair of agents $i, j \in \mathcal{M}$. The flock-wide set of recoverable states, denoted by $\mathcal{R} \subset \mathbb{R}^{4n}$, is defined in terms of $\mathcal{R}_{ij}$. As in [14], the set $\mathcal{R}_{ij}$ is defined as the super-level set of a pairwise CBF $h_{ij} : \mathbb{R}^8 \to \mathbb{R}$: $\mathcal{R}_{ij} = \{ s_i, s_j \mid h_{ij}(s_i, s_j) \ge 0 \}$. The flock-wide set of recoverable states $\mathcal{R} \subset \mathcal{A}$ is defined as the set of system states in which $(s_i, s_j) \in \mathcal{R}_{ij}$ for every pair of neighboring agents $i, j$.

In accordance with [14], the function $h_{ij}(s_i, s_j)$ is based on a safety constraint over a pair of agents $i, j \in \mathcal{M}$. The safety constraint ensures that for any pair of agents, the maximum braking force can always keep the agents at a distance greater than $d_{min}$ from each other. As introduced earlier, $d_{min}$ is the threshold distance that defines a collision. Considering that the tangential component of the relative velocity, denoted by $\Delta v$, causes a collision, the constraint regulates $\Delta v$ by application of maximum acceleration to reduce $\Delta v$ to zero.
Hence, the safety constraint can be represented as the following condition on the inter-agent distance $\|\Delta p_{ij}\| = \|p_i - p_j\|$, the braking distance $(\Delta v)^2 / 4\bar{a}$, and the safety threshold distance $d_{min}$:

$$\|\Delta p_{ij}\| - \frac{(\Delta v)^2}{4\bar{a}} \ge d_{min} \tag{18}$$

$$h_{ij}(s_i, s_j) = \sqrt{4\bar{a}\left(\|\Delta p_{ij}\| - d_{min}\right)} - \Delta v \ge 0 \tag{19}$$

Here, $(\Delta v)^2 / 4\bar{a}$ is the distance needed to reduce $\Delta v$ to zero under a deceleration of $2\bar{a}$. The constraint in Eq. (18) is re-arranged to get the CBF $h_{ij}$ given in Eq. (19).

Combining (19) and (10c), we arrive at the linear constraint on the accelerations for agents $i$ and $j$, which constrains the Lie derivative of the CBF in (19) to be greater than $-\alpha(h_{ij})$. We set $\alpha(h_{ij}) = \gamma h_{ij}$, as in [14], resulting in the following constraint on the accelerations of agents $i, j$:

$$\frac{\Delta p_{ij}^T \Delta a_{ij}}{\|\Delta p_{ij}\|} - \frac{(\Delta v_{ij}^T \Delta p_{ij})^2}{\|\Delta p_{ij}\|^3} + \frac{\|\Delta v_{ij}\|^2}{\|\Delta p_{ij}\|} + \frac{2\bar{a}\,\Delta v_{ij}^T \Delta p_{ij}}{\|\Delta p_{ij}\| \sqrt{4\bar{a}\left(\|\Delta p_{ij}\| - d_{min}\right)}} \ge -\gamma h_{ij} \tag{20}$$

where the left-hand side is the Lie derivative of the CBF $h_{ij}$ and $\Delta p_{ij} = p_i - p_j$, $\Delta v_{ij} = v_i - v_j$, and $\Delta a_{ij} = a_i - a_j$ are the vectors representing the relative position, the relative velocity, and the relative acceleration of agents $i$ and $j$, respectively. We further note that the binary constraint (20) can be represented as $[P_{ij} \; Q_{ij}] \begin{bmatrix} a_i \\ a_j \end{bmatrix} \le b_{ij}$, and hence it can be split into two unary constraints ($P_{ij} a_i \le b_{ij}/2$, $Q_{ij} a_j \le b_{ij}/2$), as in (11). The admissible control space for an agent $i \in \mathcal{M}$, denoted by $K_i(s_i) \subset \mathbb{R}^2$, is defined as the intersection of the half-planes defined by the Lie-derivative-based constraints, where each neighboring agent contributes a single constraint:

$$K_i(s_i) = \left\{ a_i \in \mathbb{R}^2 \mid P_{ij} a_i \le b_{ij}/2, \; \forall j \in \mathcal{N}_i \right\} \tag{21}$$

With the CBF for collision-free flocking defined in (19) and the admissible control space defined in (21), the BC, FSC, and RSC follow from (13), (15), and (16), respectively.
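As a rough illustration of the pairwise CBF in (19), the sketch below evaluates $h_{ij}$ for two planar agents. It assumes that $\Delta v$ is the closing speed, i.e. the rate at which the inter-agent distance shrinks, $-\Delta p_{ij}^T \Delta v_{ij} / \|\Delta p_{ij}\|$, a sign convention chosen here because it is consistent with the Lie derivative in (20). The function name and the clamping of the square-root argument are also assumptions of this example.

```python
import math


def h_ij(p_i, v_i, p_j, v_j, a_max, d_min):
    """Pairwise CBF of (19) for planar agents, assuming Delta v is the closing speed."""
    dp = (p_i[0] - p_j[0], p_i[1] - p_j[1])   # relative position
    dv = (v_i[0] - v_j[0], v_i[1] - v_j[1])   # relative velocity
    dist = math.hypot(dp[0], dp[1])           # assumed nonzero
    closing_speed = -(dp[0] * dv[0] + dp[1] * dv[1]) / dist
    # Clamped at zero so the square root stays defined when dist < d_min.
    margin = max(dist - d_min, 0.0)
    return math.sqrt(4.0 * a_max * margin) - closing_speed


if __name__ == "__main__":
    a_max, d_min = 5.0, 2.0  # acceleration bound and collision threshold
    # Head-on approach at 2 units/s each, starting 4 units apart.
    print(h_ij((0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (-2.0, 0.0), a_max, d_min))
    # The same velocities at 2.5 units apart leave too little braking room.
    print(h_ij((0.0, 0.0), (2.0, 0.0), (2.5, 0.0), (-2.0, 0.0), a_max, d_min))
```

In the first case the braking distance $(\Delta v)^2 / 4\bar{a} = 0.8$ leaves the pair above $d_{min}$, so $h_{ij}$ is positive; in the second it does not, and $h_{ij}$ is negative.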
We use the Reynolds flocking model [20] as the AC. In the Reynolds model, the acceleration $a_i$ for each agent is a weighted sum of three acceleration terms, based on simple rules of interaction with the neighboring agents: separation (move away from your close neighbors), cohesion (move towards the centroid of your neighbors), and alignment (match your velocity with the average velocity of your neighbors). The acceleration for agent $i$ is $a_i = w_s a_i^s + w_c a_i^c + w_{al} a_i^{al}$, where $w_s, w_c, w_{al} \in \mathbb{R}^+$ are the scalar weights and $a_i^s, a_i^c, a_i^{al} \in \mathbb{R}^2$ are the acceleration terms corresponding to separation, cohesion, and alignment, respectively. We note that the Reynolds model does not guarantee collision avoidance. Nevertheless, when the flock stabilizes, the average distance to the closest neighbors is determined by the choice of the weights of the interaction terms.

[Fig. 2: Results for a flock of size 15, with and without DSA. Panels: (a) Reynolds Model, (b) Reynolds Model with DSA. Each panel plots the closest-neighbor distance against time, with $d_{min}$ marked.]

The number of agents in the MAS is $n = 15$. The other parameters used in the experiments are $r = 4$, $\bar{a} = 5$, $\bar{v} = 2.$[…], $d_{min} = 2$, and $\eta = 0.$[…] […] respectively, and we ensure that the initial state is recoverable. The weights of the Reynolds model terms are picked experimentally to ensure that no pair of agents is in a state of collision in the steady state. They are set to $w_s = 3$, $w_c = 1.5$, and $w_{al} = 0.$[…]

Footnotes: https://streamable.com/zn2bl5, https://streamable.com/hetraw

To recall, the safety property is that all pairs of agents maintain a distance greater than $d_{min}$ from each other. Fig. 2 plots, for the duration of the simulations, the distance between the agents and their closest neighbors, i.e., $\min_{j \in \mathcal{N}_i} \|p_i - p_j\|$. As evident from Fig. 2(a), the Reynolds model results in multiple safety violations before the flock stabilizes at around 40 seconds. In contrast, as shown in Fig. 2(b), DSA preserves safety, maintaining a separation greater than $d_{min}$ between all agents. DSA has an additional benefit of stabilizing the flock much earlier (around 18 seconds). We further note that the average time the agents spent in BC mode is only 2.47 percent of the total duration of the simulation, indicating that DSA is largely non-invasive.

[Fig. 3: Experimental results for the way-point control study. Panel (b): distance to the closest neighbor for all agents, plotted against time (s), with $d_{min}$ marked.]
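The closest-neighbor distance plotted in Fig. 2 can be computed from a snapshot of agent positions, as in the following sketch. For simplicity it takes the minimum over all other agents rather than only those within the sensing radius $r$; the two agree whenever an agent has at least one neighbor. The function names and the sample positions are illustrative assumptions.

```python
import math


def closest_neighbor_distances(positions):
    """For each agent, the Euclidean distance to the nearest other agent."""
    distances = []
    for i, (xi, yi) in enumerate(positions):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(positions)
            if j != i
        )
        distances.append(nearest)
    return distances


def is_collision_free(positions, d_min):
    """True if no pair of agents is closer than the threshold d_min."""
    return all(d >= d_min for d in closest_neighbor_distances(positions))


if __name__ == "__main__":
    snapshot = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]  # hypothetical positions
    print(closest_neighbor_distances(snapshot))
    print(is_collision_free(snapshot, d_min=2.0))
```

Applying `is_collision_free` to every recorded time step gives a direct check of the safety property over a whole simulation trace.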
This section describes the problem setup and experimental results for the way-point (WP) control case study. For this study, the model of the agents is the same as the one used for the flocking case study, given in Eq. (17). The experimental setup is shown in Fig. 3, where the agents initially positioned on the left-hand side are to sequentially navigate through a series of WPs while maintaining a safe distance from each other. The WPs are represented by the black squares. The CBF, BC and DM are the same as those defined for the flocking problem; see Section 4. The AC is a rule-based controller where each agent accelerates towards its next WP (ignoring the other agents) until the final WP is reached. Agents are assigned one WP from each column such that they are on a collision course if they follow the AC's commands.
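The rule-based AC described above can be sketched as follows. The text only states that each agent accelerates towards its next WP; the use of the full acceleration bound, the arrival tolerance `tol`, and the function name `waypoint_ac` are assumptions made for this illustration.

```python
import math


def waypoint_ac(position, waypoints, next_index, a_max, tol=0.1):
    """Return (acceleration, next_index) for a rule-based way-point controller.

    The agent accelerates with magnitude a_max toward waypoints[next_index].
    A way-point counts as reached once the agent is within tol of it, which
    is an assumed arrival rule.
    """
    # Skip way-points that have already been reached.
    while next_index < len(waypoints) and math.hypot(
        waypoints[next_index][0] - position[0],
        waypoints[next_index][1] - position[1],
    ) <= tol:
        next_index += 1
    if next_index >= len(waypoints):
        return (0.0, 0.0), next_index  # final way-point reached
    dx = waypoints[next_index][0] - position[0]
    dy = waypoints[next_index][1] - position[1]
    d = math.hypot(dx, dy)  # strictly greater than tol here, so nonzero
    return (a_max * dx / d, a_max * dy / d), next_index


if __name__ == "__main__":
    acc, idx = waypoint_ac((0.0, 0.0), [(3.0, 4.0), (6.0, 4.0)], 0, a_max=0.8)
    print(acc, idx)
```

Because this controller ignores the other agents entirely, any collision avoidance in the case study has to come from the DM and BC.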
The number of agents used in the experiment is four as is the number of WPsan agent is required to visit. Initially, the agents are at rest with their positionsrepresented by the red dots in Fig. 3(a). The final configuration is shown ingreen. The duration of the simulation is 37 seconds. The other parameters usedin the experiments are r = 1 .
0, ¯ a = 0 .
8, ¯ v = 0 . d min = 0 .
2, and η = 0 . Distributed Simplex Architecture for Multi-Agent Systems 13 distances, indicating that the agents maintain a safe distance from one another.A video of the simulation is available online. With an increasing prevalence of distributed energy resources (DERs) such aswind and solar power, electrification using microgrids (MGs) has witnessed un-precedented growth. Unlike traditional power systems, MG DERs do not haverotating components such as turbines. The lack of rotating components can leadto low inertia, making MGs susceptible to oscillations resulting from transientdisturbances [21]. Ensuring the safe operation of an MG is thus a challengingproblem. In this case study, we demonstrate the effectiveness of DSA in main-taining MG voltage levels within safe limits.The MG we consider is a network of n droop-controlled inverters, indexedby M = { , . . . , n } . The dynamics of each inverter is modeled as [21–24]:˙ θ i = ω i (22a)˙ ω i = ω i − ω i + λ pi ( P seti − P i ) (22b)˙ v i = v i − v i + λ qi ( Q seti − Q i ) (22c)where θ i , ω i , and v i are respectively the phase angle, frequency, and voltage ofinverter i , i ∈ M . The state vector for the MG is denoted by s = [ θ T ω T v T ] T ∈ R n , where θ , ω , and v are respectively vectors representing the voltage phaseangle, frequency, and voltage at each node of the MG. A pair of inverters areconsidered neighbors if they are connected by a transmission line. Also, λ pi and λ qi are droop coefficients of “active power vs frequency” and “reactive powervs voltage” droop controllers, respectively. Finally, ω i and v i are the nominalfrequency and voltage values. P i and Q i are the active and reactive powers injected by inverter i into thesystem: P i = v i (cid:88) k ∈N i v k ( G i,k cos θ i,k + B i,k sin θ i,k ) Q i = v i (cid:88) k ∈N i v k ( G i,k sin θ i,k − B i,k cos θ i,k ) (23)where θ i,k = θ i − θ k , and N i ⊆ M is the set of neighbors. 
$G_{i,k}$ and $B_{i,k}$ are respectively the conductance and susceptance values of the transmission line connecting inverters $i$ and $k$.

$P_i^{set}$ and $Q_i^{set}$ are the active power and reactive power setpoints. The inverters have the ability to change their respective power setpoints according to the MG's operating conditions. This is modeled as:

$$P_i^{set} = P_i^0 + u_i^p, \qquad Q_i^{set} = Q_i^0 + u_i^q \tag{24}$$

where $P_i^0$ and $Q_i^0$ are the setpoints for the nominal operating condition, and $u_i^p$ and $u_i^q$ are control inputs.

Simulation video: https://streamable.com/e9rnqd

Fig. 4: Voltage graph at node 4 of the MG network. (Axes: time in s, voltage V in p.u.; curves: without DSA, with DSA, limits.)
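The inverter model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary keys (`P_nom`, `omega_nom`, and so on), and the two-inverter network with its line parameters are hypothetical and not taken from the paper; only the droop coefficients 2.43 and 0.20 match values quoted in the case study. The code follows Eqs. (22) to (24) as printed, with the nominal frequency and voltage passed as parameters.

```python
import math

def power_injections(i, theta, v, G, B, neighbors):
    """Active and reactive power injected by inverter i, as in Eq. (23)."""
    p_sum = 0.0
    q_sum = 0.0
    for k in neighbors[i]:
        th = theta[i] - theta[k]  # angle difference theta_{i,k}
        p_sum += v[k] * (G[i][k] * math.cos(th) + B[i][k] * math.sin(th))
        q_sum += v[k] * (G[i][k] * math.sin(th) - B[i][k] * math.cos(th))
    return v[i] * p_sum, v[i] * q_sum

def inverter_derivatives(i, theta, omega, v, G, B, neighbors, par,
                         u_p=0.0, u_q=0.0):
    """Right-hand side of Eq. (22) for inverter i, with setpoints from Eq. (24)."""
    p_i, q_i = power_injections(i, theta, v, G, B, neighbors)
    p_set = par["P_nom"] + u_p  # Eq. (24): nominal setpoint plus control input
    q_set = par["Q_nom"] + u_q
    d_theta = omega[i]
    d_omega = par["omega_nom"] - omega[i] + par["lam_p"] * (p_set - p_i)
    d_v = par["v_nom"] - v[i] + par["lam_q"] * (q_set - q_i)
    return d_theta, d_omega, d_v

# Hypothetical two-inverter network; all network numbers are illustrative.
theta = [0.0, 0.0]
omega = [0.0, 0.0]
v = [1.0, 1.0]
G = [[0.0, 1.0], [1.0, 0.0]]
B = [[0.0, -2.0], [-2.0, 0.0]]
neighbors = {0: [1], 1: [0]}
par = {"P_nom": 1.0, "Q_nom": 2.0, "omega_nom": 0.0, "v_nom": 1.0,
       "lam_p": 2.43, "lam_q": 0.20}

print(power_injections(0, theta, v, G, B, neighbors))
print(inverter_derivatives(0, theta, omega, v, G, B, neighbors, par, u_p=0.1))
```

In this toy setup the nominal setpoints equal the injected powers, so the state is an equilibrium when both control inputs are zero; a nonzero `u_p` then shows up directly as a frequency derivative scaled by the droop coefficient.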
The safety property for the MG network is a set of unary constraints restricting the voltages at each node to remain within safe limits. The recoverable set $\mathcal{R}_i$ for inverter $i$ is defined as the super-level set of a real-valued CBF $h_i$. We follow the SOS-optimization technique given in [24] to synthesize the CBF.

Since the power flow equations (22) are nonlinear, we apply a third-order Taylor series expansion to approximate the dynamics in polynomial form. We then follow the three-step process given in [24] to obtain the CBF for each MG node. We then calculate the admissible control space according to (12), and the BC, FSC, and RSC follow from (13), (15), and (16), respectively.

The AC sets the active/reactive power setpoints to their nominal values. Thus, the AC does not limit voltage and frequency magnitudes but is only concerned with stabilizing frequency and voltage magnitudes to their nominal values.
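To give a feel for the polynomial approximation step, the sketch below applies a third-order Taylor expansion about zero to the trigonometric factors that appear in the power injections. It is a simplified illustration, not the paper's procedure: the paper expands the full dynamics, whereas this example (with hypothetical function names) only treats sine and cosine of the angle difference and reports the approximation error at a few sample angles.

```python
import math

def sin_taylor3(x):
    """Third-order Taylor polynomial of sin about 0: x - x^3/6."""
    return x - x**3 / 6.0

def cos_taylor3(x):
    """Third-order Taylor polynomial of cos about 0: 1 - x^2/2 (cubic term is zero)."""
    return 1.0 - x**2 / 2.0

# Approximation error for a few angle differences (in radians).
for x in (0.1, 0.3, 0.5):
    err_sin = abs(math.sin(x) - sin_taylor3(x))
    err_cos = abs(math.cos(x) - cos_taylor3(x))
    print(f"x={x}: sin error={err_sin:.2e}, cos error={err_cos:.2e}")
```

The errors grow with the angle difference, which is one reason such an approximation is usually trusted only in a neighborhood of the operating point.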
We consider a 6-bus MG [24]. Disconnecting the MG from the main utility, we replace bus 0 with a droop-controlled inverter (Eq. (22)), with inverters also placed on buses 1, 4 and 5. Bus 0 is the reference bus for the phase angle. Nominal values of voltage and frequency, as well as the active/reactive power setpoints, were obtained by solving the steady-state power-flow equations given in Eq. (23); these were then used to shift the equilibrium point to the origin. Droop coefficients $\lambda_i^p$ and $\lambda_i^q$ were set to 2.43 rad/s/p.u. and 0.20 p.u./p.u., respectively, and $\tau_i$ was set to 0.5 s. Loads are modeled as constant power loads, and a Kron-reduced network [25] with only the inverter nodes was used for analysis. The unsafe set is defined in terms of the shifted (around 0 p.u.) nodal voltage magnitudes as follows: $v_i$ < − . $v_i$ > .
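As background for the Kron-reduced network mentioned above, the following sketch shows the textbook node-elimination form of Kron reduction on a nodal admittance matrix. It assumes the eliminated nodes carry no current injection, and the three-node example is hypothetical; it is not claimed to reproduce the exact reduction used for the 6-bus MG.

```python
def kron_reduce(Y, keep):
    """Eliminate every node not in `keep` from admittance matrix Y.

    Nodes are removed one at a time using
    Y[i][j] <- Y[i][j] - Y[i][k] * Y[k][j] / Y[k][k].
    """
    Y = [row[:] for row in Y]  # work on a copy
    nodes = list(range(len(Y)))
    for k in [n for n in nodes if n not in keep]:
        remaining = [n for n in nodes if n != k]
        for i in remaining:
            for j in remaining:
                Y[i][j] -= Y[i][k] * Y[k][j] / Y[k][k]
        nodes = remaining
    return [[Y[i][j] for j in keep] for i in keep]

# Hypothetical example: nodes 0 and 1 each connect to node 2 through a unit admittance.
Y = [[1.0, 0.0, -1.0],
     [0.0, 1.0, -1.0],
     [-1.0, -1.0, 2.0]]
print(kron_reduce(Y, keep=[0, 1]))
```

Eliminating the middle node leaves nodes 0 and 1 joined by the series combination of the two unit admittances, which is a quick sanity check on the update rule. Python complex numbers can be used as matrix entries when both conductance and susceptance are present.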
The original Simplex architecture [4, 26] was developed for systems comprising a single controller and a single (non-distributed) plant. With DSA, we extend the scope of Simplex to MASs under distributed control. RTA [27, 28] is a runtime assurance technique that can be applied to component-based systems. In this case, however, each RTA wrapper (i.e., each Simplex-like instance) independently ensures a local safety property of a component. For example, in [27], RTA instances for an inner-loop controller and a guidance system are uncoordinated and operate independently. In contrast, in DSA, each agent takes the states of neighboring agents into account when making control decisions, in order to ensure that pairwise safety constraints are satisfied.

A runtime verification framework for dynamically adaptive multi-agent systems (DAMS-RV) is proposed in [29]. DAMS-RV is activated every time the system adapts to a change in the system itself or its environment. It takes a feedback loop- and model-based approach to verifying dynamic agent collaboration. However, this method relies on a monitoring phase to observe and identify changes that occur in agent collaboration so that verification can be carried out on the system operating in new contexts. DSA does not require such intermediary supervision. In [30], a dynamic policy model that can be used to express constraints on agent behavior is presented. These constraints limit agent autonomy to lie within well-defined boundaries. Constraint specifications are kept simple by allowing the policy designer to decompose a specification into components and define the overall policy as a composition of these small units. In contrast, DSA uses CBFs to compute the requisite safety regions.

CBF-based methodologies [9, 13, 14, 31] have been widely used for MAS runtime safety assurance. In [9, 14], a formal framework for collision avoidance in multi-robot systems is presented.
CBFs are used to design a wrapper around an AC that guarantees forward invariance of a safe set. The wrapper solves an optimization problem involving the Lie derivative of the CBF to compute minimal changes to the advanced controller's output needed to ensure safety.
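The kind of CBF wrapper described above can be illustrated with a deliberately small example. The sketch assumes a single-integrator agent ($\dot{x} = u$), one circular keep-out region, and a linear class-K term with gain `alpha`; these modeling choices, the barrier $h(x) = \|x - x_{obs}\|^2 - d^2$, and all names are assumptions made for illustration and are not the formulation of [9, 14]. With a single affine constraint, the minimal-change quadratic program has a closed-form solution, so no solver is needed.

```python
def cbf_filter(x, x_obs, d_safe, u_ac, alpha=1.0):
    """Minimally modify u_ac so that dh/dt + alpha * h >= 0.

    For x' = u and h(x) = ||x - x_obs||^2 - d_safe^2, the Lie derivative
    along the input is grad h = 2 (x - x_obs), so the constraint is
    a . u >= b with a = grad h and b = -alpha * h.
    """
    dx = [xi - oi for xi, oi in zip(x, x_obs)]
    h = sum(c * c for c in dx) - d_safe ** 2
    a = [2.0 * c for c in dx]
    b = -alpha * h
    au = sum(ai * ui for ai, ui in zip(a, u_ac))
    aa = sum(ai * ai for ai in a)
    if au >= b or aa == 0.0:
        return list(u_ac)  # AC output already satisfies the constraint
    scale = (b - au) / aa  # projection onto the half-space a . u >= b
    return [ui + scale * ai for ui, ai in zip(u_ac, a)]

# AC command heading toward the obstacle is corrected; a safe command is untouched.
print(cbf_filter([1.0, 0.0], [0.0, 0.0], 0.5, [-1.0, 0.5]))
print(cbf_filter([1.0, 0.0], [0.0, 0.0], 0.5, [1.0, 0.0]))
```

The correction only changes the component of the command along the barrier gradient, which is what "minimal change" means in the least-squares sense. Multiple agents or constraints would generally require a proper QP solver rather than this closed form.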
We have presented the Distributed Simplex Architecture, a runtime assurance technique for the safety of multi-agent systems. DSA is distributed in the sense that it involves one local instance of traditional Simplex per agent such that the conjunction of their respective safety properties yields the desired safety property for the entire MAS. Moreover, an agent's switching logic depends only on its own state and that of neighboring agents. We demonstrated the effectiveness of DSA by successfully applying it to flocking, way-point visiting, and microgrid control. As future work, we plan to apply DSA to non-homogeneous MASs and implement it on a physical platform.
References
1. M. Nasir, Z. Jin, H. A. Khan, N. A. Zaffar, J. C. Vasquez, and J. M. Guerrero, "A decentralized control architecture applied to DC nanogrid clusters for rural electrification in developing regions," IEEE Transactions on Power Electronics, vol. 34, no. 2, pp. 1773–1785, 2019.
2. Z. Boussaada, O. Curea, H. Camblong, N. Bellaaj Mrabet, and A. Hacala, "Multi-agent systems for the dependability and safety of microgrids," International Journal on Interactive Design and Manufacturing (IJIDeM), 2016.
3. A. Tahir, J. Böling, M.-H. Haghbayan, H. T. Toivonen, and J. Plosila, "Swarms of unmanned aerial vehicles — a survey," Journal of Industrial Information Integration, vol. 16, p. 100106, 2019.
4. D. Seto and L. Sha, "A case study on analytical analysis of the inverted pendulum real-time control system," Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA, Tech. Rep. CMU/SEI-99-TR-023, 1999. [Online]. Available: http://resources.sei.cmu.edu/library/asset-view.cfm?AssetID=13511
5. L. Sha, "Using simplicity to control complexity," IEEE Software, vol. 18, no. 4, pp. 20–28, 2001.
6. D. Phan, R. Grosu, N. Jansen, N. Paoletti, S. A. Smolka, and S. D. Stoller, "Neural simplex architecture," in NASA Formal Methods Symposium (NFM 2020), 2020.
7. T. Gurriet, A. Singletary, J. Reher, L. Ciarletta, E. Feron, and A. Ames, "Towards a framework for realizable safety critical control through active set invariance," in …, 2018, pp. 98–106.
8. M. Egerstedt, J. N. Pauli, G. Notomista, and S. Hutchinson, "Robot ecology: Constraint-based control design for long duration autonomy," Annual Reviews in Control …
9. … IEEE, 2016, pp. 5213–5218.
10. S. Prajna and A. Jadbabaie, "Safety verification of hybrid systems using barrier certificates," in Hybrid Systems: Computation and Control, 7th International Workshop, HSCC 2004, Philadelphia, PA, USA, March 25-27, 2004, Proceedings, ser. Lecture Notes in Computer Science, R. Alur and G. J. Pappas, Eds., vol. 2993. Springer, 2004, pp. 477–492.
11. S. Prajna, "Barrier certificates for nonlinear model validation," Automatica, vol. 42, no. 1, pp. 117–126, 2006.
12. P. Wieland and F. Allgöwer, "Constructive safety using control barrier functions," IFAC Proceedings Volumes …
13. … IEEE, 2019, pp. 3420–3431.
14. U. Borrmann, L. Wang, A. D. Ames, and M. Egerstedt, "Control barrier certificates for safe swarm behavior," in ADHS, ser. IFAC-PapersOnLine, M. Egerstedt and Y. Wardi, Eds., vol. 48, no. 27. Elsevier, 2015, pp. 68–73.
15. F. Blanchini and S. Miani, Set-Theoretic Methods in Control, 1st ed. Birkhäuser Basel, 2007.
16. F. Blanchini, "Set invariance in control," Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.
17. L. Wang, D. Han, and M. Egerstedt, "Permissive barrier certificates for safe stabilization using sum-of-squares," in … IEEE, 2018, pp. 585–590.
18. E. Squires, P. Pierpaoli, and M. Egerstedt, "Constructive barrier certificates with applications to fixed-wing aircraft collision avoidance," in IEEE Conference on Control Technology and Applications, CCTA 2018, Copenhagen, Denmark, August 21-24, 2018. IEEE, 2018, pp. 1656–1661.
19. U. Mehmood, N. Paoletti, D. Phan, R. Grosu, S. Lin, S. D. Stoller, A. Tiwari, J. Yang, and S. A. Smolka, "Declarative vs rule-based control for flocking dynamics," in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018.
20. C. W. Reynolds, "Flocks, herds and schools: A distributed behavioral model," SIGGRAPH Comput. Graph., vol. 21, no. 4, pp. 25–34, Aug. 1987. [Online]. Available: http://doi.acm.org/10.1145/37402.37406
21. N. Pogaku, M. Prodanovic, and T. C. Green, "Modeling, analysis and testing of autonomous operation of an inverter-based microgrid," IEEE Transactions on Power Electronics, vol. 22, no. 2, pp. 613–625, 2007.
22. J. Schiffer, R. Ortega, A. Astolfi, J. Raisch, and T. Sezi, "Conditions for stability of droop-controlled inverter-based microgrids," Automatica …
23. … IEEE Transactions on Industry Applications, vol. 38, no. 2, pp. 533–542, 2002.
24. S. Kundu, S. Geng, S. P. Nandanoori, I. A. Hiskens, and K. Kalsi, "Distributed barrier certificates for safe operation of inverter-based microgrids," in …, 2019, pp. 1042–1047.
25. P. Kundur, N. Balu, and M. Lauby, Power System Stability and Control, ser. EPRI power system engineering series. McGraw-Hill Education, 1994. [Online]. Available: https://books.google.com.pk/books?id=2cbvyf8Ly4AC
26. D. Seto, B. Krogh, L. Sha, and A. Chutinan, "The simplex architecture for safe online control system upgrades," in Proceedings of the 1998 American Control Conference. ACC (IEEE Cat. No.98CH36207), vol. 6, 1998, pp. 3504–3508.
27. M. Aiello, J. Berryman, J. Grohs, and J. Schierman, Run-Time Assurance for Advanced Flight-Critical Control Systems, 2010.
28. J. Schierman, D. Ward, B. Dutoi, A. Aiello, J. Berryman, M. DeVore, W. Storm, and J. Wadley, Run-Time Verification and Validation for Safety-Critical Flight Control Systems, 2012.
29. Y. J. Lim, G. Hong, D. Shin, E. Jee, and D.-H. Bae, "A runtime verification framework for dynamically adaptive multi-agent systems," in …, 2016, pp. 509–512.
30. H. Alotaibi and H. Zedan, "Runtime verification of safety properties in multi-agents systems," in …, 2010, pp. 356–362.
31. A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, "Control barrier function based quadratic programs for safety critical systems,"