A Distributed Simplex Architecture for Multi-Agent Systems

Abstract

We present Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for multi-agent systems (MASs). DSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. The traditional Simplex approach is limited to single-agent systems or a MAS with a centralized control scheme. DSA addresses this limitation by extending the scope of Simplex to include MASs under distributed control. In DSA, each agent has a local instance of traditional Simplex such that the preservation of safety in the local instances implies safety for the entire MAS. We provide a proof of safety for DSA, and present experimental results for several case studies, including flocking with collision avoidance, safe navigation of ground rovers through way-points, and the safe operation of a microgrid.



Usama Mehmood, Scott D. Stoller, Radu Grosu, Shouvik Roy, Amol Damare, and Scott A. Smolka
Department of Computer Science, Stony Brook University, USA
Department of Computer Engineering, Technische Universität Wien, Austria


Keywords: Runtime assurance · Simplex architecture · Control Barrier Functions · Distributed flocking · Reverse switching.

A multi-agent system (MAS) is a group of autonomous, intelligent agents that work together to solve tasks and carry out missions. MAS applications include the design of power systems and smart grids [1, 2], autonomous control of robotic swarms for monitoring, disaster management, military battle systems, etc. [3], and sensor networks. Many MAS applications are safety-critical. It is therefore paramount that MAS control strategies ensure safety.

In this paper, we present the Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for MASs under distributed control. DSA is inspired by Sha et al.'s Simplex Architecture [4, 5], but differs from it in significant aspects. The Simplex Architecture provides runtime assurance of safety by switching control from an unverified (hence potentially unsafe) advanced controller (AC) to a verified-safe baseline controller (BC) if the action produced by the AC could result in a safety violation in the near future. The switching logic is implemented in a verified decision module (DM). The applicability of the traditional Simplex Architecture is limited to systems with a centralized control architecture.

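[Figure 1. Left: the DSA components for agent $i$ (Advanced Controller (AC), Baseline Controller (BC), Decision Module (DM), and Process), with the sensed neighbor states $S_{i,1}, S_{i,2}, \ldots, S_{i,m}$ as inputs. Right: the rest of the network, in which agents $i$, $j$, $k$, and $l$ each run DSA.]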

Fig. 1: The DSA for the MAS on the right. All agents in the MAS are homogeneous and operate under DSA, but the figure shows the DSA components for only agent $i$. The sensed state of agent $i$'s $j$th neighbor is denoted as $S_{i,j}$. The AC, BC, and DM take as input the state of the agent and its neighbors.

DSA, illustrated in Fig. 1, addresses this limitation by making necessary additions to the traditional Simplex to widen its scope to include MASs. Also, as in [6], it implements reverse switching by reverting control back to the AC when it is safe to do so.

In DSA, for each agent, there is a verified-safe BC and a certified switching logic such that if all the agents operate under DSA, then safety of the MAS is guaranteed. The BC and DM, along with the AC, are distributed and depend only on local information. DSA itself is distributed in the sense that it involves one local instance of traditional Simplex per agent such that the conjunction of their respective safety properties yields the desired safety property for the entire MAS. For example, consider our flocking case study, where we want to establish collision-freedom for the entire MAS. This can be accomplished in a distributed manner by showing that each local instance of Simplex, say for agent $i$, ensures collision-freedom for agent $i$ and its neighboring agents.

DSA allows agents to switch their mode of operation independently. At any given time, some agents may be operating in AC mode while others are operating in BC mode. Our approach to the design of the BC and DM leverages Control Barrier Functions (CBFs), which have been used to synthesize safe controllers [7–9], and are closely related to Barrier Certificates used for safety verification of closed dynamical systems [10, 11]. A CBF is a mapping from the state space to a real number, with its zero level-set partitioning the state space into safe and unsafe regions. If certain inequalities on the Lie derivative of the CBF are satisfied, then the corresponding control actions are considered safe (admissible).

In DSA, the BC is designed as an optimal controller with the goal of increasing a utility function based on the Lie derivatives of the CBFs. As CBFs are a measure of the safety of a state, optimizing for control actions with higher Lie derivative values gives a direct way to make the state safer. The safety of the BC is further guaranteed by constraining the control action to remain in a set of admissible actions that satisfy certain inequalities on the Lie derivatives of the CBFs. CBFs are also used in the design of the switching logic, as they provide an efficient method for checking whether an action could lead to a safety violation during the next time step.


We demonstrate the effectiveness of DSA on several example MASs, including a flock of robots moving coherently while avoiding inter-agent collisions, ground rovers safely navigating through a series of way-points, and safe operation of a microgrid.

The Simplex Control Architecture relies on a verified-safe baseline controller (BC) in conjunction with the verified switching logic of the Decision Module (DM) to guarantee the safety of the plant (Agent $i$ in Fig. 1) while permitting the use of an unverifiable, high-performance advanced controller (AC).

Let the admissible states be those that satisfy all safety constraints and operational limits. Other states are called inadmissible. The goal of the Simplex Architecture is to ensure the system never enters an inadmissible state. The set $\mathcal{R}$ of recoverable states is a subset of the admissible states such that the BC, starting from any state in $\mathcal{R}$, guarantees that all future states are also in $\mathcal{R}$. The recoverable set takes into account the inertia of the physical system, giving the BC enough time to preserve safety.

The DM's forward switching condition (FSC) evaluates the control action proposed by the AC and decides whether to switch to the BC. A common technique to develop an FSC is to shrink the recoverable region by a margin based on the maximum time derivative of the state and the length of a time step, and switch to the BC if the current state lies outside this smaller set.

Control Barrier Functions (CBFs) [12, 13] are an extension of the Barrier Certificates used for safety verification of hybrid systems [10, 11]. CBFs are a class of Lyapunov-like functions used to guarantee safety for nonlinear control systems by assisting in the design of a class of safe controllers that establish the forward-invariance of safe sets [9, 14]. Our presentation of CBFs is based on [13].

Consider a nonlinear affine control system

$$\dot{x} = f(x) + g(x)\,u, \tag{1}$$

with state $x \in D \subset \mathbb{R}^n$, control input $u \in U$, and functions $f$ and $g$ that are locally Lipschitz. The set $\mathcal{R}$ of recoverable states is defined as the super-level set of a continuously differentiable function $h : D \subset \mathbb{R}^n \to \mathbb{R}$. The recoverable set $\mathcal{R}$ and its boundary $\delta\mathcal{R}$ are given by:

$$\mathcal{R} = \{ x \in D \subset \mathbb{R}^n \mid h(x) \ge 0 \} \tag{2}$$

$$\delta\mathcal{R} = \{ x \in D \subset \mathbb{R}^n \mid h(x) = 0 \} \tag{3}$$
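The conditions that follow are stated in terms of Lie derivatives, which the text uses without defining. Under the standard convention, which we assume is the one intended, for the control-affine system (1) and a continuously differentiable $h$:

$$L_f h(x) = \nabla h(x)^T f(x), \qquad L_g h(x) = \nabla h(x)^T g(x), \qquad \dot{h}(x, u) = \nabla h(x)^T \dot{x} = L_f h(x) + L_g h(x)\,u .$$

In other words, $L_f h + L_g h\,u$ is the rate of change of $h$ along the trajectory when the control $u$ is applied, and it is linear in $u$ for a fixed state $x$.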

For all $x \in D$, if there exists an extended class $\mathcal{K}$ function $\alpha : \mathbb{R} \to \mathbb{R}$ (strictly increasing and $\alpha(0) = 0$) such that the following condition on the Lie derivative of $h$ is satisfied:

$$\sup_{u \in U} \left[ L_f h(x) + L_g h(x)\,u + \alpha(h(x)) \right] \ge 0, \tag{4}$$

then $h(x)$ is a valid CBF. Condition (4) implies the existence of a control action for all $x \in D$ such that the Lie derivative of $h$ is bounded from below by $-\alpha(h(x))$. Furthermore, for $x \in \delta\mathcal{R}$, condition (4) reduces to a result for set invariance known as Nagumo's theorem [15, 16]. Condition (4) is used to define the set $K(x)$ of control actions that establish the forward invariance of set $\mathcal{R}$; i.e., starting from $x \in \mathcal{R}$, the state will always remain inside $\mathcal{R}$:

$$K(x) = \{ u \in U : L_f h(x) + L_g h(x)\,u + \alpha(h(x)) \ge 0 \} \tag{5}$$

Theorem 1. [13] For the control system given in Eq. (1) and recoverable set $\mathcal{R}$ defined in (2) as the super-level set of some continuously differentiable function $h : \mathbb{R}^n \to \mathbb{R}$, if $h$ is a control barrier function for all $x \in D$ and $\frac{\delta h}{\delta x} \ne 0$ for all $x \in \delta\mathcal{R}$, then any controller $u$ such that $\forall x \in D : u(x) \in K(x)$ ensures forward-invariance of $\mathcal{R}$.

Proof. Condition (4) on the Lie derivative of $h$ reduces, on the boundary of $\mathcal{R}$, to the set invariance condition of Nagumo's theorem: for $x \in \delta\mathcal{R}$, $\dot{h} \ge -\alpha(h(x)) = 0$. Hence, according to Nagumo's theorem [15, 16], the set $\mathcal{R}$ is forward-invariant.

This section describes the Distributed Simplex Architecture (DSA). We formally introduce the MAS safety problem and then discuss the main components of DSA, namely, the distributed baseline controller (BC) and the distributed decision module (DM).

We say that an instance of DSA is symmetric if every agent uses the same switching condition and baseline controller. Moreover, DSA, or more precisely the MAS it is controlling, is homogeneous if every constituent agent is an instance of the same plant model.

Consider a MAS consisting of $k$ homogeneous agents, denoted as $\mathcal{M} = \{1, \ldots, k\}$, where the nonlinear control-affine dynamics for the $i$th agent are:

$$\dot{x}_i = f(x_i) + g(x_i)\,u_i \tag{6}$$

Here, $x_i \in D \subset \mathbb{R}^n$ is the state of agent $i$ and $u_i \in U \subset \mathbb{R}^m$ is its control input. For an agent $i$, we define the set of its neighbors $\mathcal{N}_i \subseteq \mathcal{M}$ as the agents whose state is accessible to agent $i$ either through sensing or communication. Depending on the application, the set of neighbors could be fixed or vary dynamically. For example, in our flocking case study (Section 4), agent $i$'s neighbors (in a given state) are the agents within a fixed distance $r$ of agent $i$; we assume agent $i$ can accurately sense the positions and velocities of those agents. We denote the combined state of all the agents in the MAS as the vector $x = [x_1^T, x_2^T, \ldots, x_k^T]^T$ and denote the state of the neighbors of agent $i$ (including agent $i$) as $x_{\mathcal{N}_i}$. DSA uses discrete-time control: the DM and controllers are evaluated every $\eta$ seconds. We assume that all agents evaluate the DM and controllers simultaneously; this assumption simplifies the analysis but can be relaxed.

Admissible States

The set of admissible states $\mathcal{A} \subset \mathbb{R}^{kn}$ consists of all states that satisfy the safety constraints. A constraint $C$ is a function from $k$-agent MAS states to the reals, $C : D^k \to \mathbb{R}$. In this paper, we are primarily concerned with binary constraints (between neighboring agents) $C_{ij} : D \times D \to \mathbb{R}$ and unary constraints $C_i : D \to \mathbb{R}$. Hence, the set of admissible states $\mathcal{A} \subset \mathbb{R}^{kn}$ consists of the MAS states $x \in \mathbb{R}^{kn}$ such that all the unary and binary constraints are satisfied.

Formally, a symmetric instance of DSA aims to solve the following problem. Given a MAS defined as in Eq. (6) and $x(0) \in \mathcal{A}$, design a BC and DM to be used by all agents such that the MAS remains safe; i.e., $x(t) \in \mathcal{A}$, $\forall t > 0$.

Recoverable States

For each agent $i$, the local admissible set $\mathcal{A}_i \subset \mathbb{R}^n$ is the set of states $x_i \in \mathbb{R}^n$ that satisfy all the unary constraints. The set $\mathcal{S}_i \subset \mathcal{A}_i$ is defined as the super-level set of the CBF $h_i : \mathbb{R}^n \to \mathbb{R}$, which is designed to ensure forward-invariance of $\mathcal{A}_i$. Similarly, for a pair of neighboring agents $i, j$, where $i \in \mathcal{M}$, $j \in \mathcal{N}_i$, the pairwise admissible set $\mathcal{A}_{ij} \subset \mathbb{R}^{2n}$ is the set of pairs of states that satisfy all the binary constraints. The set $\mathcal{S}_{ij} \subset \mathcal{A}_{ij}$ is defined as the super-level set of the CBF $h_{ij} : \mathbb{R}^{2n} \to \mathbb{R}$ designed to ensure forward-invariance of $\mathcal{A}_{ij}$. The recoverable set $\mathcal{R}_{ij} \subset \mathbb{R}^{2n}$, for a pair of neighboring agents $i, j$, where $i \in \mathcal{M}$, $j \in \mathcal{N}_i$, is defined in terms of $\mathcal{S}_i$, $\mathcal{S}_j$, and $\mathcal{S}_{ij}$:

$$\mathcal{S}_i = \{ x_i \in \mathbb{R}^n \mid h_i(x_i) \ge 0 \} \tag{7}$$

$$\mathcal{S}_{ij} = \{ (x_i, x_j) \in \mathbb{R}^{2n} \mid h_{ij}(x_i, x_j) \ge 0 \} \tag{8}$$

$$\mathcal{R}_{ij} = (\mathcal{S}_i \times \mathcal{S}_j) \cap \mathcal{S}_{ij} \tag{9}$$

The recoverable set $\mathcal{R} \subset \mathcal{A}$ for the entire MAS is defined as the set of system states in which $(x_i, x_j) \in \mathcal{R}_{ij}$ for every pair of neighboring agents $i, j$. The CBFs can be computed using sum-of-squares programming [17] or the technique in [18]. An application of these techniques for the synthesis of CBFs for several systems can be found in [13]. Note that if the controllers of agents $i$ and $j$ satisfy the following constraints based on the Lie derivatives of $h_i$, $h_j$, and $h_{ij}$, similar to the constraints in (5), the pairwise state of agents $i$ and $j$ will remain in $\mathcal{R}_{ij}$ according to Theorem 1:

$$L_f h_i(x_i) + L_g h_i(x_i)\,u_i + \alpha(h_i(x_i)) \ge 0 \tag{10a}$$

$$L_f h_j(x_j) + L_g h_j(x_j)\,u_j + \alpha(h_j(x_j)) \ge 0 \tag{10b}$$

$$L_f h_{ij}(x_i, x_j) + L_g h_{ij}(x_i, x_j) \begin{bmatrix} u_i \\ u_j \end{bmatrix} + \alpha(h_{ij}(x_i, x_j)) \ge 0 \tag{10c}$$
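As a small worked illustration of the pairwise constraint (10c), consider a hypothetical system that is not one of the paper's case studies: two single-integrator agents on a line, $\dot{x}_i = u_i$ and $\dot{x}_j = u_j$, where agent $i$ must stay at least a gap $d$ ahead of agent $j$. The barrier function, the gap $d$, and the choice $\alpha(h) = \gamma h$ with $\gamma > 0$ are all assumptions made for this sketch.

$$h_{ij}(x_i, x_j) = x_i - x_j - d, \qquad L_f h_{ij} = 0, \qquad L_g h_{ij} = \begin{bmatrix} 1 & -1 \end{bmatrix}$$

$$\text{(10c)} \;\Longrightarrow\; u_i - u_j + \gamma\,(x_i - x_j - d) \ge 0 \;\Longleftrightarrow\; u_j - u_i \le \gamma\, h_{ij}(x_i, x_j)$$

The constraint is linear in the pair $(u_i, u_j)$: the follower may close the gap, but only at a rate proportional to the remaining margin $h_{ij}$, and when $h_{ij} = 0$ the gap may not shrink at all.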

Constraint Partitioning

Note that the constraints in (10) are linear in the control variable. For ease of notation, we write the unary constraints as $A_i u_i \le b_i$ and the binary constraints as $\begin{bmatrix} P_{ij} & Q_{ij} \end{bmatrix} \begin{bmatrix} u_i \\ u_j \end{bmatrix} \le b_{ij}$.

The binary constraint in (10c) is a condition on the control action of a pair of agents. For a centralized MAS, the global controller can pick coordinated actions for agents $i$ and $j$ to ensure the binary constraint (10c) is satisfied. However, for a decentralized MAS, the distributed controllers of the two agents cannot independently satisfy the binary constraint without running an agreement protocol.

As DSA is a distributed control framework, we solve the problem of the satisfaction of the binary constraint by partitioning the binary constraint into two unary constraints such that the satisfaction of the unary constraints implies the satisfaction of the binary constraint (but not vice versa) [9]:

$$\begin{bmatrix} P_{ij} & Q_{ij} \end{bmatrix} \begin{bmatrix} u_i \\ u_j \end{bmatrix} \le b_{ij} \;\rightarrow\; \begin{cases} P_{ij}\,u_i \le b_{ij}/2 \\ Q_{ij}\,u_j \le b_{ij}/2 \end{cases} \tag{11}$$

Independent satisfaction of the two unary constraints by agents $i$ and $j$ guarantees safety because the binary constraint still holds. Moreover, the equal partitioning ensures that the agents share an equal responsibility to keep the pairwise state safe. The admissible control space for an agent $i$, denoted by $\mathcal{L}_i$, is an intersection of half-spaces of the hyperplanes defined by the linear constraints:

$$\mathcal{L}_i = \{ u_i \in U \mid \forall j \in \mathcal{N}_i : A_i u_i \le b_i \,\wedge\, P_{ij}\,u_i \le b_{ij}/2 \} \tag{12}$$

Theorem 2.

Given a MAS indexed by $\mathcal{M}$ and with dynamics in (6), if the controller for each agent $i \in \mathcal{M}$ chooses an action $u_i \in \mathcal{L}_i$, thereby satisfying the Lie-derivative constraints on the respective CBFs, and $x(0) \in \mathcal{R}$, then the MAS is guaranteed to be safe.

Proof. If all the agents choose an action from their respective admissible control spaces $\mathcal{L}_i$, then the forward invariance of the set $\mathcal{S}_i$ for all $i \in \mathcal{M}$ and $\mathcal{S}_{ij}$ for all $i \in \mathcal{M}$, $j \in \mathcal{N}_i$ is established by Theorem 1. Therefore, $\mathcal{R}_{ij}$ is forward invariant for all $i \in \mathcal{M}$, $j \in \mathcal{N}_i$, and consequently $\mathcal{R}$ is forward invariant.

The BC is a distributed controller with the task of keeping the state of the agent in the safe region. For an agent $i$, the control law of the BC depends on its state $x_i$ and the states of its neighbors. In our design, the BC considers only the safety-critical aspects, leaving the mission-critical objectives to the AC. Specifically, the BC is designed to move the system toward safer states as quickly as possible. This reduces the width of the necessary "safety margin" between unsafe and recoverable states, allowing a looser FSC (i.e., allowing the AC to stay in control more often).
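The equal split of a binary constraint into two unary constraints, on which the admissible control space and Theorem 2 rely, can be checked numerically. The following is a minimal sketch in plain Python; the row vectors `P`, `Q`, the bound `b`, and the helper names are hypothetical values chosen for illustration, not taken from the paper.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def split_binary_constraint(P, Q, b):
    """Split P.u_i + Q.u_j <= b into the unary pair P.u_i <= b/2 and Q.u_j <= b/2."""
    return (P, b / 2.0), (Q, b / 2.0)


# Hypothetical binary constraint between two agents with 2-D controls.
P, Q, b = [1.0, -2.0], [0.5, 1.5], 3.0
(row_i, bound_i), (row_j, bound_j) = split_binary_constraint(P, Q, b)

# Sufficiency: whenever both unary halves hold, the binary constraint holds.
rng = random.Random(0)
checked = 0
for _ in range(10000):
    u_i = [rng.uniform(-5.0, 5.0) for _ in range(2)]
    u_j = [rng.uniform(-5.0, 5.0) for _ in range(2)]
    if dot(row_i, u_i) <= bound_i and dot(row_j, u_j) <= bound_j:
        assert dot(P, u_i) + dot(Q, u_j) <= b + 1e-12
        checked += 1

# Not necessary: a pair of actions can satisfy the binary constraint
# even though agent i violates its own half.
u_i, u_j = [4.0, 0.0], [-4.0, 0.0]
binary_ok = dot(P, u_i) + dot(Q, u_j) <= b
unary_i_ok = dot(row_i, u_i) <= bound_i
print(checked > 0, binary_ok, unary_i_ok)
```

The second check shows the price of decentralization: some jointly safe action pairs are excluded because each agent may only reason about its own half of the budget.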

We design the BC as the solution to the following constrained multi-objective optimization (MOO) problem, where the utility function is a weighted sum of objective functions based on the Lie derivatives of the CBFs $h_i$ and $h_{ij}$ introduced above:

$$u_i^* = \operatorname*{argmax}_{u_i} \; \frac{1}{h_i} \left( L_f h_i + L_g h_i\,u_i \right) + \sum_{j \in \mathcal{N}_i} \frac{1}{h_{ij}} \left( L_f h_{ij} + L_g h_{ij} \begin{bmatrix} u_i \\ 0 \end{bmatrix} \right) \quad \text{s.t. } u_i \in \mathcal{L}_i \tag{13}$$

The bottom component of the column vector in the last term is agent $i$'s prediction for agent $j$'s next control action $u_j$. Since we consider MASs in which agents are unable to communicate planned control actions, agent $i$ simply predicts that $u_j = 0$. This approach has been shown to work well in prior work on distributed model-predictive control for flocking [19].

Recall that, by definition, the CBFs quantify the degree of safety of the state with respect to the given safety constraints, with larger (positive) values indicating safer states. For a given state, the Lie derivative of a CBF is a linear function in the control action. A positive value of the Lie derivative indicates that the proposed action will lead to a state that has a higher CBF value and therefore is safer.

The solution to the optimization problem in (13) is a control action that maximizes the weighted sum of the Lie derivatives of the CBFs. We note that in a weighted-sum formulation of a MOO problem, it is possible that some objective functions are negative in the optimal solution. We ensure the selected action $u_i$ is safe by constraining $u_i$ to be in the admissible control space $\mathcal{L}_i$, defined in (12).

The weights in the utility function in (13) prioritize certain safety constraints over others. We use state-dependent weights that are the inverses of the CBFs, thereby giving more weight to maximizing the Lie derivatives of CBFs corresponding to safety constraints that are closer to being violated.

Each agent's DM implements the switching logic for both forward switching and reverse switching. Control is switched from the AC to the BC if the forward switching condition (FSC) is true. Similarly, control is reverted back to the AC (from the BC) if the reverse switching condition (RSC) is true. For an agent $i$, the state of the DM is denoted as $DM_i \in \{AC, BC\}$, with $DM_i = AC$ ($DM_i = BC$) indicating that the advanced (baseline) controller is in control. DSA starts with all agents in the AC mode; i.e., $DM_i(t) = AC$ for all $t \le 0$ and all $i \in \mathcal{M}$; this is justified by the assumption that $x(0) \in \mathcal{R}$. For $t > 0$, the DM state is given by:

$$DM_i(t) = \begin{cases} AC & \text{if } DM_i(t-1) = BC \text{ and } RSC(x_{\mathcal{N}_i}) \\ BC & \text{if } DM_i(t-1) = AC \text{ and } FSC(x_{\mathcal{N}_i}) \\ DM_i(t-1) & \text{otherwise} \end{cases} \tag{14}$$

where $x_{\mathcal{N}_i}$ is the vector containing the states of agent $i$ and its neighbors.
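The DM update in (14) is a two-state machine, and a minimal Python sketch may make the case analysis concrete. The function name `dm_step` and the example input sequence are illustrative assumptions; the truth values of the FSC and RSC are taken as given inputs rather than computed here.

```python
AC, BC = "AC", "BC"


def dm_step(prev_mode, fsc, rsc):
    """One update of an agent's decision module, following the cases of Eq. (14).

    prev_mode: mode at the previous step (AC or BC)
    fsc, rsc:  truth values of the forward and reverse switching conditions
               evaluated on the current neighborhood state
    """
    if prev_mode == BC and rsc:
        return AC  # reverse switch: hand control back to the advanced controller
    if prev_mode == AC and fsc:
        return BC  # forward switch: the baseline controller takes over
    return prev_mode  # otherwise keep the current mode


# Hypothetical sequence of (FSC, RSC) evaluations for one agent.
conditions = [(False, True), (True, False), (False, False), (False, True)]
mode = AC  # DSA starts with every agent in AC mode
trace = []
for fsc, rsc in conditions:
    mode = dm_step(mode, fsc, rsc)
    trace.append(mode)
print(trace)
```

Note that the RSC is only consulted while in BC mode and the FSC only while in AC mode, so a state in which neither condition holds leaves the mode unchanged.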

We derive the switching conditions from the CBFs as follows. To ensure safety, the FSC must be true in a state $x_{\mathcal{N}_i}(t)$ if an unrecoverable state is reachable from $x_{\mathcal{N}_i}(t)$ in one time step $\eta$. For a CBF, in a given state, we define a worst-case action to be an action that minimizes the Lie derivative of the CBF. The check for one-step reachability of an unrecoverable state is based on the minimum value of the Lie derivative of the CBFs, which corresponds to the worst-case actions by the agents. Hence, for each CBF $h$, we define a minimum threshold value $\lambda_h(x_{\mathcal{N}_i})$ equal to the magnitude of the minimum of the Lie derivative of the CBF times $\eta$, and we switch to the BC if, in the current state, the value of any CBF $h$ is less than $\lambda_h(x_{\mathcal{N}_i})$. This results in an FSC of the following form:

$$FSC(x_{\mathcal{N}_i}) = \left( h_i < \lambda_{h_i}(x_{\mathcal{N}_i}) \right) \vee \left( \exists j \in \mathcal{N}_i \mid h_{ij} < \lambda_{h_{ij}}(x_{\mathcal{N}_i}) \right) \tag{15}$$

Thus, the one-step reachability check shrinks the size of the recoverable set by an amount equal to the maximum change that can occur from the current state in one control period, and the switch occurs if the current state is outside this smaller set.

We derive the RSC using a similar approach, except based on an $m$-time-step reachability check with $m > 1$, in order to prevent frequent switching between AC and BC. The RSC holds if, in the current state, the value of each CBF $h$ is greater than the threshold $m\lambda_h(x_{\mathcal{N}_i})$:

$$RSC(x_{\mathcal{N}_i}) = \left( h_i > m\lambda_{h_i}(x_{\mathcal{N}_i}) \right) \wedge \left( \forall j \in \mathcal{N}_i \mid h_{ij} > m\lambda_{h_{ij}}(x_{\mathcal{N}_i}) \right) \tag{16}$$

This definition of the RSC ensures that, when control is switched to the AC, the state is safe and the FSC will not hold for at least $m$ time steps.

Our main result is the following safety theorem for DSA.

Theorem 3.

Given an MAS indexed by $\mathcal{M}$ with dynamics in (6), if each agent operates under DSA with the BC as defined in (13), the DM as defined in (14), and $x(0) \in \mathcal{R}$, then the MAS will remain safe.

Proof. The proof proceeds by considering both DM states for an arbitrary agent $i$ and establishing that its next state is safe. First, consider an agent $i$ at time $t$ with $DM_i(t) = AC$. As the FSC is false, the one-step reachability check in the FSC ensures that the CBFs for unary and binary safety constraints are strictly positive at the next state $x_i(t + \eta)$, i.e., $h_i(x_i(t + \eta)) > 0$ and $\forall j \in \mathcal{N}_i : h_{ij}(x_i(t + \eta), x_j(t + \eta)) > 0$; hence the next state is recoverable. Next, consider an agent $i$ at time $t$ with $DM_i(t) = BC$. We divide the neighbors of $i$ into two sets based on their DM states: the sets of neighbors in AC mode and BC mode are denoted as $\mathcal{N}_i^{AC}$ and $\mathcal{N}_i^{BC}$, respectively. The agents in BC mode choose their control actions from their corresponding admissible control spaces as defined in Eq. (12). Hence, according to Theorem 2, these agents will satisfy unary safety constraints and pairwise safety constraints among themselves. As for the neighbors in AC mode, due to the one-step reachability check in their FSC, in the state $x_i(t + \eta)$, the pairwise CBFs satisfy $h_{ij}(x_i(t + \eta), x_j(t + \eta)) \ge 0$ for all $j \in \mathcal{N}_i^{AC}$. Hence, $x_i(t + \eta)$ is recoverable for $DM_i(t) = BC$. We have proven that, for any agent $i$ and time step $t$, if $x_i(t)$ is recoverable, then $x_i(t + \eta)$ is recoverable. By assumption, $x(0) \in \mathcal{R}$. Therefore, by induction, $x(t) \in \mathcal{R}$ for $t > 0$.

We evaluate DSA on the distributed flocking problem with the goal of preventing inter-agent collisions. Consider an MAS consisting of $n$ robotic agents, indexed by $\mathcal{M} = \{1, \ldots, n\}$, with double-integrator dynamics:

$$\begin{bmatrix} \dot{p}_i \\ \dot{v}_i \end{bmatrix} = \begin{bmatrix} 0 & I_{2 \times 2} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} p_i \\ v_i \end{bmatrix} + \begin{bmatrix} 0 \\ I_{2 \times 2} \end{bmatrix} a_i \tag{17}$$

where $p_i, v_i, a_i \in \mathbb{R}^2$ are the position, velocity, and acceleration of agent $i \in \mathcal{M}$, respectively. The magnitudes of velocities and accelerations are bounded by $\bar{v}$ and $\bar{a}$, respectively. Acceleration $a_i$ is the control input for agent $i$. As DSA is a discrete-time protocol, the state of the DM and the $a_i$'s are updated every $\eta$ seconds. The state of an agent $i$ is denoted by the vector $s_i = [p_i^T\; v_i^T]^T$. The state of the entire flock at time $t$ is denoted by the vector $s(t) = [p(t)^T\; v(t)^T]^T \in \mathbb{R}^{4n}$, where $p(t) = [p_1^T(t) \cdots p_n^T(t)]^T$ and $v(t) = [v_1^T(t) \cdots v_n^T(t)]^T$ are the vectors respectively denoting the positions and velocities of the flock at time $t$.

We assume that an agent can accurately sense the positions and velocities of nearby agents within a fixed distance $r$. The set of the spatial neighbors of agent $i$ is defined as $\mathcal{N}_i(p) = \{ j \in \mathcal{M} \mid j \ne i \wedge \| p_i - p_j \| < r \}$, where $\| \cdot \|$ denotes the Euclidean norm. For ease of notation, we sometimes use $s$ and $s_i$ to refer to the state variables $s(t)$ and $s_i(t)$, respectively, without the time index.

The MAS is characterized by a set of operational constraints, which include physical limits and safety properties.
States that satisfy the operational constraints are called admissible, and are denoted by the set $\mathcal{A} \subset \mathbb{R}^{4n}$. The desired safety property is that no pair of agents is in a "state of collision". A pair of agents is considered to be in a state of collision if the Euclidean distance between them is less than a threshold distance $d_{min} \in \mathbb{R}^+$, resulting in a binary safety constraint of the form $\| p_i - p_j \| - d_{min} \ge 0$, $\forall i \in \mathcal{M}$, $j \in \mathcal{N}_i$. Similarly, a state $s$ is recoverable if all pairs of agents can brake (decelerate) relative to each other without colliding. Otherwise, the state $s$ is considered unrecoverable. Let $\mathcal{R}_{ij} \subset \mathbb{R}^8$ be the set of recoverable states for a pair of agents $i, j \in \mathcal{M}$. The flock-wide set of recoverable states, denoted by $\mathcal{R} \subset \mathbb{R}^{4n}$, is defined in terms of $\mathcal{R}_{ij}$. As in [14], the set $\mathcal{R}_{ij}$ is defined as the super-level set of a pairwise CBF $h_{ij} : \mathbb{R}^8 \to \mathbb{R}$: $\mathcal{R}_{ij} = \{ s_i, s_j \mid h_{ij}(s_i, s_j) \ge 0 \}$. The flock-wide set of recoverable states $\mathcal{R} \subset \mathcal{A}$ is defined as the set of system states in which $(s_i, s_j) \in \mathcal{R}_{ij}$ for every pair of neighboring agents $i, j$.

In accordance with [14], the function $h_{ij}(s_i, s_j)$ is based on a safety constraint over a pair of agents $i, j \in \mathcal{M}$. The safety constraint ensures that for any pair of agents, the maximum braking force can always keep the agents at a distance greater than $d_{min}$ from each other. As introduced earlier, $d_{min}$ is the threshold distance that defines a collision. Considering that the tangential component of the relative velocity, denoted by $\Delta v$, causes a collision, the constraint regulates $\Delta v$ by application of maximum acceleration to reduce $\Delta v$ to zero. Hence, the safety constraint can be represented as the following condition on the inter-agent distance $\| \Delta p_{ij} \| = \| p_i - p_j \|$, the braking distance $(\Delta v)^2 / 4\bar{a}$, and the safety threshold distance $d_{min}$:

$$\| \Delta p_{ij} \| - \frac{(\Delta v)^2}{4\bar{a}} \ge d_{min} \tag{18}$$

$$h_{ij}(s_i, s_j) = \sqrt{4\bar{a} \left( \| \Delta p_{ij} \| - d_{min} \right)} - \Delta v \ge 0 \tag{19}$$

where the braking distance is the distance required to reduce $\Delta v$ to zero under a deceleration of $2\bar{a}$. The constraint in Eq. (18) is rearranged to get the CBF $h_{ij}$ given in Eq. (19).

Combining (19) and (10c), we arrive at the linear constraint on the accelerations for agents $i$ and $j$, which constrains the Lie derivative of the CBF in (19) to be greater than $-\alpha(h_{ij})$. We set $\alpha(h_{ij}) = \gamma h_{ij}$, as in [14], resulting in the following constraint on the accelerations of agents $i, j$:

$$\frac{\Delta p_{ij}^T \Delta a_{ij}}{\| \Delta p_{ij} \|} - \frac{(\Delta v_{ij}^T \Delta p_{ij})^2}{\| \Delta p_{ij} \|^3} + \frac{\| \Delta v_{ij} \|^2}{\| \Delta p_{ij} \|} + \frac{2\bar{a}\, \Delta v_{ij}^T \Delta p_{ij}}{\| \Delta p_{ij} \| \sqrt{4\bar{a} \left( \| \Delta p_{ij} \| - d_{min} \right)}} \ge -\gamma h_{ij} \tag{20}$$

where the left-hand side is the Lie derivative of the CBF $h_{ij}$, and $\Delta p_{ij} = p_i - p_j$, $\Delta v_{ij} = v_i - v_j$, and $\Delta a_{ij} = a_i - a_j$ are the vectors representing the relative position, the relative velocity, and the relative acceleration of agents $i$ and $j$, respectively. We further note that the binary constraint (20) can be represented as $\begin{bmatrix} P_{ij} & Q_{ij} \end{bmatrix} \begin{bmatrix} a_i \\ a_j \end{bmatrix} \le b_{ij}$, and hence it can be split into two unary constraints ($P_{ij} u_i \le b_{ij}/2$ and $Q_{ij} u_j \le b_{ij}/2$), as in (11). The admissible control space for an agent $i \in \mathcal{M}$, denoted by $K_i(s_i) \subset \mathbb{R}^2$, is defined as the intersection of the half-planes defined by the Lie-derivative-based constraints, where each neighboring agent contributes a single constraint:

$$K_i(s_i) = \left\{ a_i \in \mathbb{R}^2 \mid P_{ij} u_i \le b_{ij}/2, \; \forall j \in \mathcal{N}_i \right\} \tag{21}$$

With the CBF for collision-free flocking defined in (19) and the admissible control space defined in (21), the BC, FSC, and RSC follow from (13), (15), and (16), respectively.
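The pairwise flocking CBF, its Lie derivative, and the resulting switching checks can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the scalar $\Delta v$ is taken to be the closing speed $-\Delta p_{ij}^T \Delta v_{ij} / \|\Delta p_{ij}\|$, which is what the left-hand side of (20) implies; the worst-case relative acceleration is assumed to have magnitude $2\bar{a}$ and to point so as to close the gap; and the values of `eta`, `m`, and `gamma` are hypothetical. Only $\bar{a} = 5$ and $d_{min} = 2$ are taken from the flocking experiments.

```python
import math


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def h_ij(dp, dv, a_max, d_min):
    """Pairwise CBF in the form of Eq. (19); requires |dp| >= d_min."""
    dist = math.hypot(dp[0], dp[1])
    closing_speed = -_dot(dp, dv) / dist  # assumed meaning of the scalar delta-v
    return math.sqrt(4.0 * a_max * (dist - d_min)) - closing_speed


def h_ij_rate(dp, dv, da, a_max, d_min):
    """Left-hand side of Eq. (20): rate of change of h_ij; requires |dp| > d_min."""
    dist = math.hypot(dp[0], dp[1])
    pv = _dot(dp, dv)
    root = math.sqrt(4.0 * a_max * (dist - d_min))
    return (_dot(dp, da) / dist - pv ** 2 / dist ** 3
            + _dot(dv, dv) / dist + 2.0 * a_max * pv / (dist * root))


def threshold(dp, dv, a_max, d_min, eta):
    """lambda: |minimum Lie derivative| * eta, with the rate minimized over
    relative accelerations of magnitude at most 2 * a_max (an assumption)."""
    dist = math.hypot(dp[0], dp[1])
    worst_da = (-2.0 * a_max * dp[0] / dist, -2.0 * a_max * dp[1] / dist)
    return abs(h_ij_rate(dp, dv, worst_da, a_max, d_min)) * eta


a_max, d_min = 5.0, 2.0           # values used in the flocking experiments
eta, m, gamma = 0.05, 2, 1.0      # hypothetical values for this sketch
dp, dv, da = (3.0, 0.0), (-1.0, 0.5), (0.2, -0.1)

h = h_ij(dp, dv, a_max, d_min)
rate = h_ij_rate(dp, dv, da, a_max, d_min)

# Sanity check: compare the closed-form rate with a finite difference of h.
dt = 1e-6
dp2 = tuple(dp[k] + dv[k] * dt + 0.5 * da[k] * dt * dt for k in range(2))
dv2 = tuple(dv[k] + da[k] * dt for k in range(2))
numeric_rate = (h_ij(dp2, dv2, a_max, d_min) - h) / dt

lam = threshold(dp, dv, a_max, d_min, eta)
constraint_20_ok = rate >= -gamma * h   # Eq. (20) with alpha(h) = gamma * h
fsc = h < lam                           # pairwise term of Eq. (15)
rsc = h > m * lam                       # pairwise term of Eq. (16)
print(h, rate, numeric_rate, lam, constraint_20_ok, fsc, rsc)
```

The finite-difference comparison is a cheap way to catch sign or exponent mistakes when transcribing a Lie derivative such as (20) by hand.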
We use the Reynolds flocking model [20] as the AC. In the Reynolds model, the acceleration $a_i$ for each agent is a weighted sum of three acceleration terms, based on simple rules of interaction with the neighboring agents: separation (move away from your close neighbors), cohesion (move towards the centroid of your neighbors), and alignment (match your velocity with the average velocity of your neighbors). The acceleration for agent $i$ is $a_i = w_s a_i^s + w_c a_i^c + w_{al} a_i^{al}$, where $w_s, w_c, w_{al} \in \mathbb{R}^+$ are the scalar weights and $a_i^s, a_i^c, a_i^{al} \in \mathbb{R}^2$ are the acceleration terms corresponding to separation, cohesion, and alignment, respectively. We note that the Reynolds model does not guarantee collision avoidance. Nevertheless, when the flock stabilizes, the average distance to the closest neighbors is determined by the choice of the weights of the interaction terms.

[Figure 2. Panels: (a) Reynolds model; (b) Reynolds model with DSA. Each panel plots the closest-neighbor distance against time, with $d_{min}$ marked.]

Fig. 2: Results for a flock of size 15, with and without DSA.

The number of agents in the MAS is $n = 15$. The other parameters used in the experiments are $r = 4$, $\bar{a} = 5$, $\bar{v} = 2.$[…], $d_{min} = 2$, and $\eta = 0.$[…] […] respectively, and we ensure that the initial state is recoverable. The weights of the Reynolds model terms are picked experimentally to ensure that no pair of agents is in a state of collision in the steady state. They are set to $w_s = 3$, $w_c = 1.5$, and $w_{al} = 0.$[…]

To recall, the safety property is that all pairs of agents maintain a distance greater than $d_{min}$ from each other. Fig. 2 plots, for the duration of the simulations, the distance between the agents and their closest neighbors, i.e., $\min_{j \in \mathcal{N}_i} \| p_i - p_j \|$. As evident from Fig. 2(a), the Reynolds model results in multiple safety violations before the flock stabilizes at around 40 seconds. In contrast, as shown in Fig. 2(b), DSA preserves safety, maintaining a separation greater than $d_{min}$ between all agents. DSA has the additional benefit of stabilizing the flock much earlier (around 18 seconds). We further note that the average time the agents spent in BC mode is only 2.47 percent of the total duration of the simulation, indicating that DSA is largely non-invasive.

https://streamable.com/zn2bl5
https://streamable.com/hetraw

[Figure 3. Panel (b): distance to the closest neighbor for all agents, plotted against time (s), with $d_{min}$ marked.]

Fig. 3: Experimental results for the way-point control study.

This section describes the problem setup and experimental results for the way-point (WP) control case study. For this study, the model of the agents is the same as the one used for the flocking case study, given in Eq. (17). The experimental setup is shown in Fig. 3, where the agents initially positioned on the left-hand side are to sequentially navigate through a series of WPs while maintaining a safe distance from each other. The WPs are represented by the black squares.

The CBF, BC, and DM are the same as those defined for the flocking problem; see Section 4. The AC is a rule-based controller where each agent accelerates towards its next WP (ignoring the other agents) until the final WP is reached. Agents are assigned one WP from each column such that they are on a collision course if they follow the AC's commands.
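The rule-based AC described above can be sketched as follows. This is one plausible reading, not the authors' code: the function name `waypoint_ac`, the arrival tolerance `tol`, the choice to always command the maximum acceleration magnitude, and the zero command after the final WP are all assumptions, and the velocity bound is ignored for brevity.

```python
import math


def waypoint_ac(pos, waypoints, idx, a_max, tol):
    """Accelerate at magnitude a_max toward the next unreached way-point.

    pos:       current 2-D position (x, y)
    waypoints: ordered list of 2-D way-points for this agent
    idx:       index of the way-point currently being targeted
    tol:       hypothetical arrival tolerance (not specified in the text)
    Returns (acceleration, updated index); other agents are ignored.
    """
    # Skip every way-point that is already within the arrival tolerance.
    while idx < len(waypoints) and math.dist(pos, waypoints[idx]) <= tol:
        idx += 1
    if idx == len(waypoints):
        return (0.0, 0.0), idx  # final way-point reached: no further command
    dx = waypoints[idx][0] - pos[0]
    dy = waypoints[idx][1] - pos[1]
    d = math.hypot(dx, dy)
    return (a_max * (dx / d), a_max * (dy / d)), idx


route = [(1.0, 0.0), (1.0, 1.0)]  # hypothetical way-points
acc0, idx0 = waypoint_ac((0.0, 0.0), route, 0, 0.8, 0.05)
acc1, idx1 = waypoint_ac((1.0, 0.01), route, idx0, 0.8, 0.05)
acc2, idx2 = waypoint_ac((1.0, 1.0), route, idx1, 0.8, 0.05)
print(acc0, idx0, acc1, idx1, acc2, idx2)
```

Because this controller never looks at neighboring agents, agents whose routes cross will head straight at each other, which is exactly the situation in which the DM is expected to hand control to the BC.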

The number of agents used in the experiment is four as is the number of WPsan agent is required to visit. Initially, the agents are at rest with their positionsrepresented by the red dots in Fig. 3(a). The ﬁnal conﬁguration is shown ingreen. The duration of the simulation is 37 seconds. The other parameters usedin the experiments are r = 1 .

0, ¯ a = 0 .

8, ¯ v = 0 . d min = 0 .

2, and η = 0 . Distributed Simplex Architecture for Multi-Agent Systems 13 distances, indicating that the agents maintain a safe distance from one another.A video of the simulation is available online. With an increasing prevalence of distributed energy resources (DERs) such aswind and solar power, electriﬁcation using microgrids (MGs) has witnessed un-precedented growth. Unlike traditional power systems, MG DERs do not haverotating components such as turbines. The lack of rotating components can leadto low inertia, making MGs susceptible to oscillations resulting from transientdisturbances [21]. Ensuring the safe operation of an MG is thus a challengingproblem. In this case study, we demonstrate the eﬀectiveness of DSA in main-taining MG voltage levels within safe limits.The MG we consider is a network of n droop-controlled inverters, indexedby M = { , . . . , n } . The dynamics of each inverter is modeled as [21–24]:˙ θ i = ω i (22a)˙ ω i = ω i − ω i + λ pi ( P seti − P i ) (22b)˙ v i = v i − v i + λ qi ( Q seti − Q i ) (22c)where θ i , ω i , and v i are respectively the phase angle, frequency, and voltage ofinverter i , i ∈ M . The state vector for the MG is denoted by s = [ θ T ω T v T ] T ∈ R n , where θ , ω , and v are respectively vectors representing the voltage phaseangle, frequency, and voltage at each node of the MG. A pair of inverters areconsidered neighbors if they are connected by a transmission line. Also, λ pi and λ qi are droop coeﬃcients of “active power vs frequency” and “reactive powervs voltage” droop controllers, respectively. Finally, ω i and v i are the nominalfrequency and voltage values. P i and Q i are the active and reactive powers injected by inverter i into thesystem: P i = v i (cid:88) k ∈N i v k ( G i,k cos θ i,k + B i,k sin θ i,k ) Q i = v i (cid:88) k ∈N i v k ( G i,k sin θ i,k − B i,k cos θ i,k ) (23)where θ i,k = θ i − θ k , and N i ⊆ M is the set of neighbors. 
$G_{i,k}$ and $B_{i,k}$ are respectively the conductance and susceptance values of the transmission line connecting inverters $i$ and $k$.

$P_i^{set}$ and $Q_i^{set}$ are the active power and reactive power setpoints. The inverters have the ability to change their respective power setpoints according to the MG's operating conditions. This is modeled as:

$$P_i^{set} = P_i^0 + u_i^p, \qquad Q_i^{set} = Q_i^0 + u_i^q \tag{24}$$

where $P_i^0$ and $Q_i^0$ are the setpoints for the nominal operating condition, and $u_i^p$ and $u_i^q$ are control inputs.

Simulation video: https://streamable.com/e9rnqd

Fig. 4: Voltage graph at node 4 of the MG network (voltage in p.u. versus time in s; curves shown without DSA and with DSA, together with the safety limits).
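To make the inverter model concrete, the following is a minimal Python sketch of the right-hand side of Eqs. (22a)-(22c), with the power injections of Eq. (23) and the setpoints of Eq. (24). It assumes the nominal values appear as $\omega_i^0$, $v_i^0$, $P_i^0$, and $Q_i^0$; the function names, the `params` dictionary layout, and the two-inverter test values are hypothetical and are not taken from the paper.

```python
import numpy as np

def power_injections(theta, v, G, B, neighbors):
    """Active and reactive power injected by each inverter, per Eq. (23)."""
    n = len(v)
    P = np.zeros(n)
    Q = np.zeros(n)
    for i in range(n):
        for k in neighbors[i]:
            d = theta[i] - theta[k]  # theta_{i,k}
            P[i] += v[i] * v[k] * (G[i, k] * np.cos(d) + B[i, k] * np.sin(d))
            Q[i] += v[i] * v[k] * (G[i, k] * np.sin(d) - B[i, k] * np.cos(d))
    return P, Q

def droop_rhs(theta, omega, v, u_p, u_q, params):
    """Time derivatives of (theta, omega, v), per Eqs. (22a)-(22c) and (24)."""
    P, Q = power_injections(theta, v, params["G"], params["B"], params["neighbors"])
    P_set = params["P0"] + u_p  # Eq. (24)
    Q_set = params["Q0"] + u_q  # Eq. (24)
    theta_dot = omega
    omega_dot = params["omega0"] - omega + params["lam_p"] * (P_set - P)
    v_dot = params["v0"] - v + params["lam_q"] * (Q_set - Q)
    return theta_dot, omega_dot, v_dot

# Hypothetical two-inverter network joined by one purely susceptive line.
example_params = {
    "G": np.zeros((2, 2)),
    "B": np.array([[0.0, 1.0], [1.0, 0.0]]),
    "neighbors": {0: [1], 1: [0]},
    "P0": np.zeros(2),
    "Q0": -np.ones(2),       # chosen so that Q_set matches Q at v = 1, theta = 0
    "omega0": np.zeros(2),
    "v0": np.ones(2),
    "lam_p": 2.43,           # droop coefficients as reported in the experiments
    "lam_q": 0.20,
}
```

With zero control inputs, equal phase angles, and voltages at their nominal values, this hypothetical network sits at an equilibrium, so all three derivatives evaluate to zero.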

The safety property for the MG network is a set of unary constraints restricting the voltages at each node to remain within safe limits. The recoverable set $\mathcal{R}_i \subset \mathbb{R}^3$ for inverter $i$ is defined as the super-level set of a CBF $h_i : \mathbb{R}^3 \to \mathbb{R}$. We follow the SOS-optimization technique given in [24] to synthesize the CBF.

Since the power flow equations (22) are nonlinear, we apply a third-order Taylor series expansion to approximate the dynamics in polynomial form. We then follow the three-step process given in [24] to obtain the CBF for each MG node. We then calculate the admissible control space according to (12), and the BC, FSC, and RSC follow from (13), (15), and (16), respectively.

The AC sets the active/reactive power setpoints to their nominal values. Thus, the AC does not limit voltage and frequency magnitudes but is only concerned with stabilizing frequency and voltage magnitudes to their nominal values.
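As a rough illustration of the polynomial approximation step, the sketch below truncates the trigonometric factors of the power flow terms at third order about zero and measures the resulting error on a small interval. It covers only $\sin$ and $\cos$ in isolation; the full expansion of the dynamics would also truncate their products with the voltage terms. The helper names and the interval $[-0.5, 0.5]$ rad are assumptions chosen for illustration.

```python
import math

def sin3(x):
    """Third-order Taylor polynomial of sin about 0."""
    return x - x**3 / 6.0

def cos3(x):
    """Taylor polynomial of cos about 0, truncated at third order (cubic term is zero)."""
    return 1.0 - x**2 / 2.0

def max_abs_error(exact, approx, lo, hi, samples=1001):
    """Largest absolute gap between exact and approx on an evenly spaced grid."""
    step = (hi - lo) / (samples - 1)
    return max(abs(exact(lo + j * step) - approx(lo + j * step)) for j in range(samples))

sin_err = max_abs_error(math.sin, sin3, -0.5, 0.5)
cos_err = max_abs_error(math.cos, cos3, -0.5, 0.5)
```

On this interval the sine error stays below 3e-4 and the cosine error below 3e-3, which suggests why a low-order expansion can be adequate when phase-angle differences remain small.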

We consider a 6-bus MG [24]. Disconnecting the MG from the main utility, we replace bus 0 with a droop-controlled inverter (Eq. (22)), with inverters also placed on buses 1, 4 and 5. Bus 0 is the reference bus for the phase angle. Nominal values of voltage and frequency, as well as the active/reactive power setpoints, were obtained by solving the steady-state power-flow equations given in Eq. (23); these were then used to shift the equilibrium point to the origin. Droop coefficients $\lambda_i^p$ and $\lambda_i^q$ were set to 2.43 rad/s/p.u. and 0.20 p.u./p.u., and $\tau_i$ was set to 0.5 s. Loads are modeled as constant power loads, and a Kron-reduced network [25] with only the inverter nodes was used for analysis. The unsafe set is defined in terms of the shifted (around 0 p.u.) nodal voltage magnitudes as follows: $v_i < -[\ldots]$ or $v_i > [\ldots]$.
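The Kron reduction mentioned above can be sketched as follows. This is the standard Schur-complement reduction of a nodal admittance matrix, shown under the assumption that the eliminated nodes inject no current; how the constant power loads enter the reduction is not detailed in the text. The function name and the three-node example are hypothetical.

```python
import numpy as np

def kron_reduce(Y, keep):
    """Eliminate all nodes not in `keep` from admittance matrix Y.

    Returns Y_kk - Y_ke * inv(Y_ee) * Y_ek, the Schur complement with
    respect to the eliminated block.
    """
    keep = list(keep)
    elim = [j for j in range(Y.shape[0]) if j not in keep]
    Y_kk = Y[np.ix_(keep, keep)]
    Y_ke = Y[np.ix_(keep, elim)]
    Y_ek = Y[np.ix_(elim, keep)]
    Y_ee = Y[np.ix_(elim, elim)]
    return Y_kk - Y_ke @ np.linalg.solve(Y_ee, Y_ek)

# Hypothetical chain 0 - 1 - 2 with unit line admittances; node 1 is eliminated.
Y_chain = np.array([[1.0, -1.0, 0.0],
                    [-1.0, 2.0, -1.0],
                    [0.0, -1.0, 1.0]])
Y_reduced = kron_reduce(Y_chain, keep=[0, 2])
```

In the chain example, the two unit admittances in series collapse to a single equivalent admittance of 0.5 between the retained nodes, which is what the reduced matrix encodes.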

The original Simplex architecture [4, 26] was developed for systems comprising a single controller and a single (non-distributed) plant. With DSA, we extend the scope of Simplex to MASs under distributed control. RTA [27, 28] is a runtime assurance technique that can be applied to component-based systems. In this case, however, each RTA wrapper (i.e., each Simplex-like instance) independently ensures a local safety property of a component. For example, in [27], RTA instances for an inner-loop controller and a guidance system are uncoordinated and operate independently. In contrast, in DSA, each agent takes the states of neighboring agents into account when making control decisions, in order to ensure that pairwise safety constraints are satisfied.

A runtime verification framework for dynamically adaptive multi-agent systems (DAMS-RV) is proposed in [29]. DAMS-RV is activated every time the system adapts to a change in the system itself or its environment. It takes a feedback loop- and model-based approach to verifying dynamic agent collaboration. However, this method relies on a monitoring phase to observe and identify changes that occur in agent collaboration so that verification can be carried out on the system operating in new contexts. DSA does not require such intermediary supervision. In [30], a dynamic policy model that can be used to express constraints on agent behavior is presented. These constraints limit agent autonomy to lie within well-defined boundaries. Constraint specifications are kept simple by allowing the policy designer to decompose a specification into components and define the overall policy as a composition of these small units. In contrast, DSA uses CBFs to compute the requisite safety regions.

CBF-based methodologies [9, 13, 14, 31] have been widely used for MAS runtime safety assurance. In [9, 14], a formal framework for collision avoidance in multi-robot systems is presented. CBFs are used to design a wrapper around an AC that guarantees forward invariance of a safe set. The wrapper solves an optimization problem involving the Lie derivative of the CBF to compute minimal changes to the advanced controller's output needed to ensure safety.
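The idea of such a wrapper can be sketched for the simplest case of a single CBF constraint and no input bounds, where the minimal-change problem has a closed-form solution (a projection onto a half-space). This is an illustrative simplification, not the formulation of the cited works, which solve a quadratic program over several pairwise constraints. The function name, the linear class-K term `alpha * h`, and the argument names are assumptions.

```python
import numpy as np

def cbf_filter(u_ac, Lf_h, Lg_h, h, alpha=1.0):
    """Minimally modify u_ac so that Lf_h + Lg_h @ u + alpha * h >= 0.

    Solves min ||u - u_ac||^2 subject to one affine constraint in u.
    Lf_h and Lg_h are the Lie derivatives of the CBF h along the drift
    and input vector fields, evaluated at the current state.
    """
    u_ac = np.asarray(u_ac, dtype=float)
    Lg_h = np.asarray(Lg_h, dtype=float)
    slack = Lf_h + Lg_h @ u_ac + alpha * h
    if slack >= 0.0:
        return u_ac.copy()  # the advanced controller's output is already safe
    norm_sq = Lg_h @ Lg_h
    if norm_sq == 0.0:
        raise ValueError("constraint cannot be satisfied: Lg_h is zero")
    # Shift u_ac along Lg_h just far enough to make the constraint tight.
    return u_ac - (slack / norm_sq) * Lg_h
```

For example, with `Lf_h = 0`, `Lg_h = [1, 0]`, `h = 0.5`, and `alpha = 1`, the constraint reduces to requiring the first input component to be at least -0.5, so a proposed input `[-2.0, 0.3]` is moved to `[-0.5, 0.3]` while an input that already satisfies the constraint is returned unchanged.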

We have presented the Distributed Simplex Architecture, a runtime assurance technique for the safety of multi-agent systems. DSA is distributed in the sense that it involves one local instance of traditional Simplex per agent such that the conjunction of their respective safety properties yields the desired safety property for the entire MAS. Moreover, an agent's switching logic depends only on its own state and that of neighboring agents. We demonstrated the effectiveness of DSA by successfully applying it to flocking, way-point visiting, and microgrid control. As future work, we plan to apply DSA to non-homogeneous MASs and implement it on a physical platform.

References

1. M. Nasir, Z. Jin, H. A. Khan, N. A. Zaffar, J. C. Vasquez, and J. M. Guerrero, "A decentralized control architecture applied to DC nanogrid clusters for rural electrification in developing regions," IEEE Transactions on Power Electronics, vol. 34, no. 2, pp. 1773–1785, 2019.

2. Z. Boussaada, O. Curea, H. Camblong, N. Bellaaj Mrabet, and A. Hacala, "Multi-agent systems for the dependability and safety of microgrids," International Journal on Interactive Design and Manufacturing (IJIDeM), 2016.

3. A. Tahir, J. Böling, M.-H. Haghbayan, H. T. Toivonen, and J. Plosila, "Swarms of unmanned aerial vehicles - a survey," Journal of Industrial Information Integration, vol. 16, p. 100106, 2019.

4. D. Seto and L. Sha, "A case study on analytical analysis of the inverted pendulum real-time control system," Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA, Tech. Rep. CMU/SEI-99-TR-023, 1999. [Online]. Available: http://resources.sei.cmu.edu/library/asset-view.cfm?AssetID=13511

5. L. Sha, "Using simplicity to control complexity," IEEE Software, vol. 18, no. 4, pp. 20–28, 2001.

6. D. Phan, R. Grosu, N. Jansen, N. Paoletti, S. A. Smolka, and S. D. Stoller, "Neural simplex architecture," in NASA Formal Methods Symposium (NFM 2020), 2020.

7. T. Gurriet, A. Singletary, J. Reher, L. Ciarletta, E. Feron, and A. Ames, "Towards a framework for realizable safety critical control through active set invariance," in […], 2018, pp. 98–106.

8. M. Egerstedt, J. N. Pauli, G. Notomista, and S. Hutchinson, "Robot ecology: Constraint-based control design for long duration autonomy," Annual Reviews in Control […]

9. […] IEEE, 2016, pp. 5213–5218.

10. S. Prajna and A. Jadbabaie, "Safety verification of hybrid systems using barrier certificates," in Hybrid Systems: Computation and Control, 7th International Workshop, HSCC 2004, Philadelphia, PA, USA, March 25-27, 2004, Proceedings, ser. Lecture Notes in Computer Science, R. Alur and G. J. Pappas, Eds., vol. 2993. Springer, 2004, pp. 477–492.

11. S. Prajna, "Barrier certificates for nonlinear model validation," Autom., vol. 42, no. 1, pp. 117–126, 2006.

12. P. Wieland and F. Allgöwer, "Constructive safety using control barrier functions," IFAC Proceedings Volumes […]

13. […] IEEE, 2019, pp. 3420–3431.

14. U. Borrmann, L. Wang, A. D. Ames, and M. Egerstedt, "Control barrier certificates for safe swarm behavior," in ADHS, ser. IFAC-PapersOnLine, M. Egerstedt and Y. Wardi, Eds., vol. 48, no. 27. Elsevier, 2015, pp. 68–73.

15. F. Blanchini and S. Miani, Set-Theoretic Methods in Control, 1st ed. Birkhäuser Basel, 2007.

16. F. Blanchini, "Set invariance in control," Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.

17. L. Wang, D. Han, and M. Egerstedt, "Permissive barrier certificates for safe stabilization using sum-of-squares," in […]. IEEE, 2018, pp. 585–590.

18. E. Squires, P. Pierpaoli, and M. Egerstedt, "Constructive barrier certificates with applications to fixed-wing aircraft collision avoidance," in IEEE Conference on Control Technology and Applications, CCTA 2018, Copenhagen, Denmark, August 21-24, 2018. IEEE, 2018, pp. 1656–1661.

19. U. Mehmood, N. Paoletti, D. Phan, R. Grosu, S. Lin, S. D. Stoller, A. Tiwari, J. Yang, and S. A. Smolka, "Declarative vs rule-based control for flocking dynamics," in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018.

20. C. W. Reynolds, "Flocks, herds and schools: A distributed behavioral model," SIGGRAPH Comput. Graph., vol. 21, no. 4, pp. 25–34, Aug. 1987. [Online]. Available: http://doi.acm.org/10.1145/37402.37406

21. N. Pogaku, M. Prodanovic, and T. C. Green, "Modeling, analysis and testing of autonomous operation of an inverter-based microgrid," IEEE Transactions on Power Electronics, vol. 22, no. 2, pp. 613–625, 2007.

22. J. Schiffer, R. Ortega, A. Astolfi, J. Raisch, and T. Sezi, "Conditions for stability of droop-controlled inverter-based microgrids," Automatica […]

23. […] IEEE Transactions on Industry Applications, vol. 38, no. 2, pp. 533–542, 2002.

24. S. Kundu, S. Geng, S. P. Nandanoori, I. A. Hiskens, and K. Kalsi, "Distributed barrier certificates for safe operation of inverter-based microgrids," in […], 2019, pp. 1042–1047.

25. P. Kundur, N. Balu, and M. Lauby, Power System Stability and Control, ser. EPRI power system engineering series. McGraw-Hill Education, 1994. [Online]. Available: https://books.google.com.pk/books?id=2cbvyf8Ly4AC

26. D. Seto, B. Krogh, L. Sha, and A. Chutinan, "The simplex architecture for safe online control system upgrades," in Proceedings of the 1998 American Control Conference. ACC (IEEE Cat. No.98CH36207), vol. 6, 1998, pp. 3504–3508.

27. M. Aiello, J. Berryman, J. Grohs, and J. Schierman, Run-Time Assurance for Advanced Flight-Critical Control Systems, 2010.

28. J. Schierman, D. Ward, B. Dutoi, A. Aiello, J. Berryman, M. DeVore, W. Storm, and J. Wadley, Run-Time Verification and Validation for Safety-Critical Flight Control Systems, 2012.

29. Y. J. Lim, G. Hong, D. Shin, E. Jee, and D.-H. Bae, "A runtime verification framework for dynamically adaptive multi-agent systems," in […], 2016, pp. 509–512.

30. H. Alotaibi and H. Zedan, "Runtime verification of safety properties in multi-agents systems," in […], 2010, pp. 356–362.

31. A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, "Control barrier function based quadratic programs for safety critical systems,"