Safe CPS from Unsafe Controllers
Usama Mehmood, Stanley Bak, Scott A. Smolka, and Scott D. Stoller
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Abstract.
In this paper, we explore using runtime verification to design safe cyber-physical systems (CPS). We build upon the Simplex Architecture, where control authority may switch from an unverified and potentially unsafe advanced controller to a backup baseline controller in order to maintain system safety. New to our approach, we remove the requirement that the baseline controller is statically verified. This is important, as there are many types of powerful control techniques (model-predictive control, rapidly-exploring random trees, and neural network controllers) that often work well in practice but are difficult to statically prove correct, and therefore could not be used before as baseline controllers. We prove that, through more extensive runtime checks, such an approach can still guarantee safety. We call this approach the
Black-Box Simplex Architecture, as both high-level controllers are treated as black boxes. We present case studies where model-predictive control provides safety for multi-robot coordination, and neural networks provably prevent collisions for groups of F-16 aircraft, despite occasionally outputting unsafe actions.
1 Introduction

Modern cyber-physical systems (CPS) are found in vital domains such as transportation, autonomy, health care, energy, agriculture, and defense. As these systems perform complex functions, they require complex designs. Since CPS interact with the physical world, correctness is especially important, but formal analysis can be difficult for complex systems.

In the design of such CPS, powerful techniques such as model-predictive control and deep reinforcement learning are increasingly being considered instead of traditional high-level control design. Such trends exacerbate the safety verification problem. For example, one approach for autonomous driving is end-to-end learning, where a complex neural network directly receives sensor inputs and outputs low-level steering commands [7,6]. Additionally, there is increasing interest in systems that can learn in the field, changing their behaviors based on observations. Classical verification strategies are poorly suited for such designs.

One approach for dynamically providing safety for systems with complex and unverified components is runtime assurance [10], where the state of the plant is monitored at runtime to detect possible imminent violations of formal properties. If necessary, corrective measures are taken to avoid the violations. A well-known
[Figure 1: block diagrams of (a) the Traditional Simplex Architecture and (b) the Black-Box Simplex Architecture. In (a), the Decision Module receives sensor data and a command from each of the Advanced Controller and the Baseline Controller, and forwards one command to the Plant + Low-Level Controller. In (b), the Lookahead Baseline Controller instead sends the Decision Module a command sequence.]
Fig. 1: The Black-Box Simplex Architecture guarantees safety despite a black-box advanced controller and a black-box baseline controller.

runtime assurance technique is the Simplex Architecture [34,35], which has been applied to a wide range of systems [11,27,31]. In the original Simplex Architecture, shown in Figure 1(a), the baseline controller and the decision module are part of the trusted computing base. They must be verified correct for the system to work. The decision module monitors the state of the system and switches control from the advanced controller to the baseline controller if using the former could result in a safety violation in the near future. The advanced controller is typically concerned with mission requirements. The baseline controller, on the other hand, should be simpler in design and proven to preserve the safety of the system.

The successful application of the original Simplex Architecture requires creating a provably safe baseline controller, a difficult task for many systems. Further, many classes of controllers, such as those designed using model-predictive control, rapidly-exploring random trees, or neural networks, may work well in practice but are difficult to verify and therefore cannot be used as baseline controllers.
The main contribution of this work is to overcome this limitation.
We propose the
Black-Box Simplex Architecture, a variant of the traditional Simplex Architecture that can guarantee the safety of the system despite an unverified and even incorrect baseline controller, which is treated as a black box.

In the Black-Box Simplex Architecture, shown in Figure 1(b), the baseline controller tries to produce a sequence of commands that begins with the advanced controller's current command and brings the plant to a state where maintaining the safety property is easy (to be formally defined later). The verified decision module checks at runtime whether the baseline controller's command sequence satisfies these requirements. If so, the decision module stores it for potential use as a backup plan at the next time step, in case no further safe command sequences are produced. This is a key tradeoff one encounters with Black-Box Simplex compared to traditional Simplex: the decision module must perform more computationally expensive runtime verification, so its performance is important. Whether this tradeoff is practical depends on the specific system, and we investigate it through several case studies. We assume that the plant has a trusted low-level controller that correctly applies the command from the decision module to the plant.

We prove two theorems about this architecture: (i) safety is always guaranteed, and (ii) if the baseline and advanced controllers perform well (to be formally defined in the next section) and the decision module is fast enough, the architecture is transparent: the advanced controller appears to have full control of the system. The practicality of these assumptions is also demonstrated through our two significant case studies. In the first, a multi-robot coordination system uses a baseline controller with a model-predictive control algorithm.
In the second, a mid-air collision avoidance system for groups of F-16 aircraft is created from imperfect logic encoded in neural networks.

The rest of the paper is organized as follows. Section 2 presents a formal definition of the Black-Box Simplex Architecture, including proofs of the safety and transparency theorems. Section 3 contains two case studies implementing the architecture. Section 4 discusses related work, and Section 5 offers our concluding remarks.
2 The Black-Box Simplex Architecture

This section reviews the limitations of the traditional Simplex Architecture and then presents this paper's main contribution, the Black-Box Simplex Architecture.
The traditional Simplex Architecture, shown in Figure 1(a), preserves the safety of the system while permitting the use of an unverified advanced controller. It does this by using the advanced controller in conjunction with a verified baseline controller and a verified decision module. The goal of the Simplex Architecture is to ensure that the system state is always admissible, i.e., that it satisfies all safety constraints and operational limits.

The decision module cannot simply check whether the next state is admissible. Rather, the verified design of a Simplex system usually requires offline reasoning with respect to a trusted baseline controller. If the system dynamics are linear and the admissible states are defined with linear constraints, a state-feedback baseline controller and a decision module can be synthesized by solving a linear matrix inequality [34]. If the system or constraints are nonlinear, however, there is no easy recipe for creating a trusted baseline controller and trusted decision module. This prevents more widespread use of the traditional Simplex Architecture.
The Black-Box Simplex Architecture lifts the requirement that the baseline controller be verified, allowing provable safety with both an unverified advanced
controller and an unverified baseline controller. Its architecture is shown in Figure 1(b). Apart from eliminating the need to establish safety of the baseline controller, the Black-Box Simplex Architecture differs from the traditional Simplex Architecture in other important ways. First, the advanced controller shares its command with the baseline controller instead of passing it directly to the decision module. Second, the baseline controller uses this command as the starting point of a command sequence intended to safely recover the system. Many control techniques naturally produce command sequences, such as model-predictive control with a finite-step horizon or controllers derived from rapidly-exploring random trees (RRTs). If a model of the low-level controllers and plant is provided, a traditional single-output controller can be used to create a command sequence through repeated invocation and system simulation.

The decision module checks the baseline controller's command sequence, possibly rejecting it if safety is not ensured. As long as the advanced controller drives the system through states where the baseline controller can recover, it continues to actuate the system. However, if the baseline controller fails to compute a safe command sequence, due to a fault of either the unverified advanced controller or the unverified baseline controller, the decision module can still recover the system using the safe command sequence from the previous step.

The applicability of Black-Box Simplex depends on the feasibility of two system-specific steps: (i) constructing safe command sequences and (ii) proving their safety at runtime. For some systems, a safe command sequence can simply bring the system to a stop. An autonomous car, for example, could have safe command sequences that steer the car to the side of the road and then stop. A safe sequence for a drone might direct it to the closest emergency landing location.
For an autonomous fixed-wing aircraft swarm, a safe sequence could fly the aircraft in non-intersecting circles to buy time for a human operator to intervene.

Proving safety of a given command sequence can also be challenging and depends on the system dynamics. For nondeterministic systems, this could involve performing reachability computations at runtime [20,4,2]. Such techniques assume an accurate system model is available in order to compute reachable sets. Traditional offline control theory also requires this assumption, so we do not view it as overly burdensome.

In the Black-Box Simplex Architecture, although both controllers are unverified, we do not combine them into a single unverified controller, for two reasons. First, the design of the safety controller is easier if it is kept simple and is not burdened with fulfilling all mission-specific goals. Second, keeping them separate allows the use of off-the-shelf control strategies that are focused on either mission completion or safety.
We formalize the behavior and requirements of the components of the Black-Box Simplex Architecture in order to prove properties about the system's behavior.
Plant Model.
We consider discrete-time plant dynamics, modeled as a function

f(x_i, u_i, w_i) = x_{i+1}    (1)

where x_i ∈ X is the system state, u_i ∈ U is a control input command, w_i ∈ W is an environmental disturbance, and i ∈ Z^+ is the time step. We sometimes also consider a deterministic version of the system, where the disturbance w_i can be taken to be zero at every step.

Admissible States.
The system is characterized by a set of operational constraints, which include physical limits and safety properties. States that satisfy all the operational constraints are called admissible states.

Command Sequences.
A single-input command is some u ∈ U, and a k-length sequence of commands is written as u ∈ U^k. The length of a sequence is written u_len = k; we can also take the length of a single command, u_len = 1. We use Python-like notation for subsequences, where the first element of a sequence is u[0] and the rest of the sequence is u[1:].

Decision Module.
The decision module in Black-Box Simplex stores a command sequence s, which we sometimes call the decision module's state. The behavior of the decision module is defined through two functions, dm_update and dm_step. The dm_update function attempts to modify the decision module's stored command sequence:

dm_update(x, s, t) = s'    (2)

where x is the current state, s is the current sequence, and t is the proposed sequence. If s' = t, we say that the proposed command sequence is accepted; otherwise, s' = s and we say that it was rejected. Correctness conditions on dm_update are given in Section 2.4. Note that the decision module will accept a safe command sequence from the advanced controller, even if the previous command sequence from the advanced controller was rejected because it was unsafe. As in [25], we refer to this as reverse switching, since it switches control back to the advanced controller.

The dm_step function produces the next command u to apply to the plant, as well as the next step's command sequence s' for the decision module:

dm_step(s) = (u, s')    (3)

where u = s[0] and s' is constructed from s by removing the first command (if the current sequence s has only one command, it is repeated):

s' = s      if s_len = 1
s' = s[1:]  otherwise

Controllers.
The advanced and baseline controllers are defined using functions of the system state. In particular, the advanced controller is defined by a
U. Mehmood et al. function ac ( x ) = u , where u ∈ U is a single command. The baseline controller issimilarly defined with bc ( x ) = u , where u ∈ U k is a k-length command sequence.For Black-Box Simplex, we make use of look-ahead baseline controllers , whichoutput command sequences that start with the same command as an advancedcontroller. These can be defined with a function lbc ac ( x ) = u , with u [0] = ac ( x ).We generally drop the subscript on lbc , as it is clear from context. Execution Semantics.
At step i, given system state x_i and decision module state s_i, the next system state x_{i+1} and next decision module state s_{i+1} are computed with the following sequence of steps:

(1) ac(x_i) = z_i;
(2) lbc(x_i) = t_i, with t_i[0] = z_i;
(3) dm_update(x_i, s_i, t_i) = s'_i;
(4) dm_step(s'_i) = (u_i, s_{i+1});
(5) f(x_i, u_i, w_i) = x_{i+1}, for some disturbance w_i ∈ W.

We define several relevant concepts and then state and prove safety and transparency theorems for the Black-Box Simplex Architecture.
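Before turning to the theorems, the per-step execution semantics can be sketched in Python. Everything below (the plant f, the controllers ac and lbc, and the safety check) is a hypothetical stand-in for illustration, not one of the paper's case-study models:

```python
# Sketch of one step of the Black-Box Simplex execution semantics.
# The plant f, controllers ac/lbc, and the safety check are all
# hypothetical stand-ins supplied by the caller.

def dm_step(s):
    """Pop the next command; a one-command sequence repeats forever (Eq. 3)."""
    u = s[0]
    s_next = s if len(s) == 1 else s[1:]
    return u, s_next

def dm_update(x, s, t, is_permanently_safe):
    """Adopt the proposed sequence t only if it passes the safety check (Eq. 2)."""
    return t if is_permanently_safe(x, t) else s

def execute_step(x, s, f, ac, lbc, is_permanently_safe, w=0.0):
    """Steps (1)-(5) of the execution semantics for one time step."""
    z = ac(x)                                     # (1) advanced command
    t = lbc(x)                                    # (2) look-ahead sequence
    assert t[0] == z                              #     lbc must start with z
    s2 = dm_update(x, s, t, is_permanently_safe)  # (3) maybe adopt new backup
    u, s_next = dm_step(s2)                       # (4) command to apply
    x_next = f(x, u, w)                           # (5) plant update
    return x_next, s_next
```

Note that even when dm_update rejects t, dm_step still yields a command from the previously stored sequence, which is exactly the fallback behavior described above.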
Definition 1 (Safe System Execution).
A system execution is called safe if and only if the system state is admissible at every step.
Safety can be ensured by following a permanently safe command sequence from a given system state.
Definition 2 (Permanently Safe Command Sequence).
Given state x_i, a k-length permanently safe command sequence s_i ∈ U^k is one where the state is admissible at every step j ≥ i, where (u_j, s_{j+1}) = dm_step(s_j) and x_{j+1} = f(x_j, u_j, w_j), for every choice of disturbance w_j ∈ W.

That is, the system state will remain admissible when applying each command in the sequence s_i and then repeatedly using the last command forever, according to the semantics of dm_step. More general definitions of permanently safe command sequences could be considered, such as repeating a suffix rather than just the last command. For simplicity, we do not explore this here.

We define the notion of recoverable commands as those that result in states that have permanently safe command sequences.

Definition 3 (Recoverable Command).
Given state x_i, a recoverable command u is one where there exists a permanently safe command sequence from x_{i+1}, where x_{i+1} = f(x_i, u, w_i), for every choice of disturbance w_i ∈ W.

Optimal decision modules are defined by requiring that the dm_update function accept all sequences that can guarantee future safety.

Definition 4 (Optimal Decision Module). An optimal decision module has a dm_update function that accepts t at state x if and only if t is a permanently safe command sequence starting from x.

A correct decision module is one that only accepts sequences that guarantee future safety. A correct decision module, by this definition, could reject every command sequence.
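For deterministic dynamics, a decision module can approximate the permanent-safety check of Definition 2 by simulation. The sketch below is ours, not the paper's: it checks the sequence plus a finite tail of the repeated last command, and a real decision module must supplement the finite tail with a system-specific argument that the repeated last command is safe forever (e.g., that the agents have come to rest):

```python
def is_permanently_safe(x, seq, f, admissible, tail_steps=50):
    """Finite-horizon approximation of Definition 2 (deterministic case).

    Simulates the command sequence, then keeps repeating its last command
    for tail_steps more steps, mirroring the dm_step semantics. The finite
    tail is an approximation: a sound decision module needs an argument
    covering the infinite tail."""
    for u in seq:
        x = f(x, u, 0.0)          # apply each command in order
        if not admissible(x):
            return False
    last = seq[-1]
    for _ in range(tail_steps):   # dm_step repeats the last command
        x = f(x, last, 0.0)
        if not admissible(x):
            return False
    return True
```

With a toy scalar plant f(x, u, w) = x + u and admissibility x ≤ 10, a sequence ending in a zero command is accepted, while one ending in a positive command is eventually rejected by the tail check.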
Definition 5 (Correct Decision Module). A correct decision module has a dm_update function that accepts t at state x only if t is a permanently safe command sequence starting from x.

The role of the baseline controller is to try to keep the system safe. An optimal look-ahead baseline controller can be defined as one that always produces a permanently safe command sequence when one exists. This is optimal in the sense that, during the system execution, it will be able to override the advanced controller as late as possible while still guaranteeing safety. This controller is defined with respect to a specific advanced controller ac.

Definition 6 (Optimal Look-Ahead Baseline Controller).
Given state x with u = ac(x), if there exists a permanently safe command sequence s from x with s[0] = u, then an optimal look-ahead baseline controller will always produce a permanently safe command sequence t, with t[0] = u.

Note that t may differ from s, as there can be multiple permanently safe command sequences from the same state.

Theorem 1 (Safety).
Given an initial state x along with an initial permanently safe command sequence s, if the decision module is correct, then the system's execution is safe, regardless of the outputs of the advanced controller ac and baseline controller lbc.

Proof. The command executed at each step comes from the state of the decision module s_i, which maintains the invariant that s_i is always a permanently safe command sequence from the current system state x_i. The dm_update function can only replace a permanently safe command sequence with another permanently safe command sequence. Since s is initially permanently safe, by induction on the step number, the decision module's command sequence at every step is permanently safe, and so the system's execution is safe.

Although safety is important, achieving only safety is trivial, as a decision module can simply reject all new command sequences. A runtime assurance system must also have a transparency property, where the advanced controller retains control in sufficiently well-designed systems.

Theorem 2 (Transparency).
If (i) from every encountered state x_i the output of the advanced controller, ac(x_i) = z_i, is a recoverable command, (ii) the look-ahead baseline controller is optimal, and (iii) the decision module is optimal, then the input command used to actuate the system at every step is the advanced controller's command, z_i.
Proof.
The proof proceeds by stepping through an arbitrary step i of the execution semantics defined in Section 2.3. Since the output of the advanced controller, ac(x_i) = z_i, is assumed to be recoverable, there exists a permanently safe command sequence from x_i that starts with z_i. By the definition of an optimal look-ahead baseline controller, since there exists a permanently safe command sequence, the output lbc(x_i) = t_i must also be a permanently safe command sequence, with t_i[0] = z_i as required by the definition of a look-ahead baseline controller. In step three of the execution semantics, dm_update(x_i, s_i, t_i) = s'_i. Since t_i is a permanently safe command sequence and the decision module is optimal, the command sequence will be accepted by the decision module, and so s'_i = t_i. Step four of the execution semantics produces u_i, the first command in the sequence t_i. As shown before, this command is equal to z_i, which is used in step five of the execution semantics to actuate the system. This reasoning applies at every step, and so the advanced controller's command is always used.

There are several practical considerations with the described approach. For example, the black-box controllers may not only generate logically incorrect commands; they may fail to generate a command at all, for example, by entering an infinite loop. We can account for such behaviors by simply having a default command that is assumed in the execution semantics from Section 2.3. The default command is used if lbc does not produce a timely output. For increased protection, the black-box controllers can be isolated on dedicated hardware [3] so that they do not, for example, crash a shared operating system.

Another concern is that the decision module's analysis of the command sequence is nontrivial and could involve a runtime reachability computation.
If this takes too long to prove safety, the command sequence should also be rejected. This means that the practicality of the architecture depends on the efficiency of reachability methods, an active area of research orthogonal to this work.

Finally, it is probably impractical for many systems to create an optimal decision module or optimal baseline controller, so there is a question of whether Theorem 2 is useful. The proof of this theorem, however, deals with a single step of execution. Thus, if at a specific state the look-ahead baseline controller is able to find a permanently safe command sequence and the decision module is able to validate it, then the system will behave as if the look-ahead baseline controller and decision module were optimal for that step. In other words, as we improve the baseline controller's ability to recover the system and design more efficient reachability methods for the decision module, the architecture will become increasingly transparent.
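The default-command fallback discussed above can be sketched as follows. A thread-based timeout is only illustrative (a thread that never returns cannot be forcibly stopped in Python); as noted in the text, a real deployment would isolate the black-box controllers on dedicated hardware:

```python
# Sketch: query the (black-box) baseline controller with a deadline; if it
# is late or crashes, the decision module keeps its stored backup sequence.
from concurrent.futures import ThreadPoolExecutor

def propose_with_deadline(lbc, x, deadline_s):
    """Return lbc(x), or None if it misses the deadline or raises."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(lbc, x).result(timeout=deadline_s)
    except Exception:          # TimeoutError, controller crash, etc.
        return None
    finally:
        pool.shutdown(wait=False)
```

The decision module then treats a None result like a rejected proposal and falls back on its previously stored safe sequence.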
3 Case Studies

In this section, we consider two case studies: a multi-robot coordination system and a mid-air collision avoidance system for groups of F-16 aircraft. In Theorem 1, we established that the Black-Box Simplex Architecture guarantees the safety of the system. The goal here is to demonstrate that the theory developed is practically applicable for developing complex systems with safety guarantees.

[Figure 2: simulation snapshots with 7 robots at (a) the initial configuration, k = 1; (b) k = 10; (c) k = 11, when the baseline controller fails; and (d) k = 32.]

Fig. 2: Simulation of the MAS with 7 robots. The decision module performs system recovery after the baseline controller produces an unsafe command sequence at k = 11. (Rays extending from the final positions of agents, shown as larger red dots, in the direction of their final velocities intersect; the rays are shown as dotted red lines.) We represent the current positions as red dots, the future positions corresponding to the safe/unsafe command sequences as green/blue dots, the velocities as blue lines, and the trajectories of the agents as grey curves.

We consider a multi-agent system (MAS), indexed by M = {1, ..., n}, of planar robots modeled with discrete-time dynamics of the form:

p_i(k+1) = p_i(k) + dt · v_i(k),    |v_i(k)| < v_max
v_i(k+1) = v_i(k) + dt · a_i(k),    |a_i(k)| < a_max    (4)

where p_i, v_i, a_i ∈ R^2 are the position, velocity, and acceleration of agent i, respectively, at time step k, and dt ∈ R^+ is the time step. The magnitudes of the velocities and accelerations are bounded by v_max and a_max, respectively. The acceleration a_i is the control input for agent i. The combined state of all agents is denoted x = [p_1^T, v_1^T, ..., p_n^T, v_n^T]^T, and their accelerations are a = [a_1^T, ..., a_n^T]^T.

In the initial configuration, the agents are equally spaced on the boundary of a circle and are at rest.
Agent i's goal is to reach a target location r_i, located on the opposite side of the circle. The initial configuration of the MAS is shown in Figure 2(a), where the agents and their target locations are represented as red dots and blue crosses, respectively.

The safety property is the absence of inter-agent collisions. A pair of agents is considered to collide if the Euclidean distance between them is less than a threshold d_min. Thus, the safety property is that ∥p_i − p_j∥ ≥ d_min for all i, j ∈ M with i ≠ j.

Both the advanced controller and the baseline controller are designed using centralized model predictive control (MPC), which produces command sequences as part of the solution of an optimization problem. The advanced controller only outputs the first command of the command sequence, whereas the baseline controller produces the full command sequence. Note that numerical methods for global nonlinear optimization do not provide a guaranteed optimal solution and may fail. For this reason, such controllers could not be used as the baseline controller in the traditional Simplex Architecture, and should not be used directly when safety is important. Both the advanced controller and the baseline controller are high-level controllers that produce accelerations. In our simulations, we do not model the low-level controller and have the plant dynamics work directly with the accelerations. When implementing this on physical robots, depending on the dynamics, a trusted low-level controller will appropriately map the desired acceleration commands to the actuator inputs.

An MPC controller produces a command sequence s of length T, where T is the prediction horizon, and each command s[i] contains the accelerations for all agents to use at step i.
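The double-integrator dynamics of Eq. (4) and the pairwise-separation check can be sketched as follows. The norm-clipping used here to enforce the |v| and |a| bounds is our own modeling choice; the paper only states the bounds themselves:

```python
# Eq. (4) dynamics and the inter-agent distance check, vectorized over agents.
import numpy as np

def clip_norm(x, bound):
    """Scale rows of x so their Euclidean norms are at most `bound`."""
    n = np.linalg.norm(x, axis=-1, keepdims=True)
    return x * np.minimum(1.0, bound / np.maximum(n, 1e-12))

def step_agents(p, v, a, dt, v_max, a_max):
    """One step of Eq. (4); p, v, a are (n, 2) arrays."""
    a = clip_norm(a, a_max)
    p_next = p + dt * v                   # p_i(k+1) = p_i(k) + dt * v_i(k)
    v_next = clip_norm(v + dt * a, v_max)  # v_i(k+1) = v_i(k) + dt * a_i(k)
    return p_next, v_next

def min_pairwise_distance(p):
    """Smallest Euclidean distance between distinct agents, for checking
    the safety property ||p_i - p_j|| >= d_min."""
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    return d[~np.eye(len(p), dtype=bool)].min()
```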
The centralized MPC controller solves the following optimization problem at each time step k:

argmin over a(k|k), ..., a(k+T−1|k) of  Σ_{t=0}^{T−1} J(k+t|k) + λ · Σ_{t=0}^{T−1} ∥a(k+t|k)∥²    (5)

where a(k+t|k) and J(k+t|k) are the predictions made at time step k for the values at time step k+t of the accelerations and the centralized (global) cost function J, respectively. The first term is the sum of the centralized cost function, evaluated for T time steps starting at time step k; it encodes the control objective. The second term, scaled by a weight λ > 0, penalizes large control inputs.
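A minimal sketch of the horizon cost of Eq. (5) and its numerical minimization follows. The paper solves this with MATLAB's fmincon; here a crude finite-difference descent with backtracking stands in for that solver. Like fmincon on a nonconvex problem, it carries no global-optimality guarantee, which is exactly why such a controller cannot serve as a traditional Simplex baseline controller:

```python
import numpy as np

def mpc_cost(a_seq, p0, v0, dt, lam, stage_cost):
    """Eq. (5): sum of stage costs J plus lam * sum ||a||^2.
    a_seq has shape (T, n, 2); stage_cost maps positions to a scalar J."""
    p, v, total = p0, v0, 0.0
    for a in a_seq:
        p = p + dt * v
        v = v + dt * a
        total += stage_cost(p) + lam * np.sum(a ** 2)
    return total

def mpc_plan(p0, v0, T, dt, lam, stage_cost, iters=50, step0=1.0):
    """Minimize mpc_cost over the T-step acceleration sequence."""
    a = np.zeros((T, p0.shape[0], 2))
    cost = mpc_cost(a, p0, v0, dt, lam, stage_cost)
    eps = 1e-5
    for _ in range(iters):
        grad = np.zeros_like(a)
        for idx in np.ndindex(*a.shape):      # finite-difference gradient
            da = np.zeros_like(a)
            da[idx] = eps
            grad[idx] = (mpc_cost(a + da, p0, v0, dt, lam, stage_cost) - cost) / eps
        step = step0
        while step > 1e-8:                    # backtracking line search
            cand = a - step * grad
            c = mpc_cost(cand, p0, v0, dt, lam, stage_cost)
            if c < cost:
                a, cost = cand, c
                break
            step /= 2.0
        else:
            break                             # no descent found; stop
    return a, cost
```

Plugging in a J_ac-style stage cost yields the advanced controller's plan, of which only the first command is used, while a J_bc-style cost yields the baseline controller's full sequence.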
Advanced controller
The centralized cost function J_ac for the advanced controller contains two terms: (1) a separation term based on the inverse of the squared distance between each pair of agents; and (2) a target-seeking term based on the distance between each agent and its target location:

J_ac = ω_s Σ_{i>j} 1/∥p_i − p_j∥² + ω_t Σ_i ∥p_i − r_i∥²    (6)

where ω_s, ω_t ∈ R are the weights of the separation and target-seeking terms. The separation term promotes inter-agent spacing but does not guarantee collision avoidance. The first command in the command sequence produced by the MPC optimization is the advanced controller's command, and it is passed to the look-ahead baseline controller. The command sequence is the solution of the optimization in Eq. 5, with J replaced by J_ac.

Baseline controller
The centralized cost function J_bc for the baseline controller contains two terms. As in Eq. 6, the first term is the separation term. The second term is a divergence term, which forces the agents to move out of the circle by aligning their velocities with rays pointing radially out of the center of the circle:

J_bc = ω_s Σ_{i>j} 1/∥p_i − p_j∥² + ω_d Σ_i (1 − ((p_i − c) · v_i)/(∥p_i − c∥ ∥v_i∥))    (7)

where ω_s, ω_d ∈ R are the weights of the separation and divergence terms, and c is the center of the circle containing the initial configurations of the robots and their target locations. The control law for the baseline controller is Eq. 5, with J replaced by J_bc. A zero acceleration is appended at the end of the baseline controller's command sequence for ease in establishing collision freedom for all future time steps.

Decision module
The look-ahead baseline controller combines accelerations from the advanced controller and the baseline controller, producing the command sequence t = [ac(x), bc(x')], where x' is the next state after executing ac(x) in state x. The function dm_update(x, s, t) accepts the proposed command sequence t if and only if t is a permanently safe command sequence. For this system, a command sequence t is considered permanently safe in a state x if it satisfies the following two conditions. First, for all states in the state trajectory obtained by executing t from x, the Euclidean distance between every pair of distinct agents is at least d_min. Second, in the final state, for all pairs of distinct agents, the rays extending from their positions in the direction of their velocities do not intersect. Any pair of agents that satisfies the second condition will never collide in the future, since the last command in the sequence t has zero acceleration. The initial permanently safe command sequence is a zero acceleration for all agents, as the agents start at rest.

We first consider seven robotic agents initialized on a circle centered at the origin, with a radius of 10. The other parameters are: dt = 0.1 sec, d_min = 1.7, a_max = 1.5, and v_max = 2. The length of the prediction horizon is T_ac = T_bc = 10. The optimization problems for the MPC controllers are solved using the MATLAB fmincon function. The random seed in the initialization of fmincon causes nondeterminism in the solution.

Successful Recovery After Failure
In this experiment, we use seven robotic agents initially positioned on a circle of radius 10, as shown in Figure 2(a). At k = 11, the baseline controller produces an unsafe command sequence. The state trajectory corresponding to the unsafe sequence is shown in blue. As shown in Figure 2(c), the final velocities of the two agents corresponding to the larger red dots are converging after simulating the current state with the unsafe sequence. Hence, at k = 11, the decision module rejects the proposed command sequence and shifts control to the previously stored safe command sequence, which safely recovers the system. The last command in the stored command sequence is a zero acceleration and is repeated forever. Here, we purposefully did not return control to the advanced controller. A video of the simulation is available online (https://streamable.com/yoltx4).

[Figure 3: (a) configuration in which the agents are closest, k = 18; (b) final configuration, k = 36; (c) stress test with 12 agents.]

Fig. 3: (a,b) Simulation of the robotic MAS with 7 robots. The advanced controller safely brings the robots to their target locations. (c) Stress test of the robotic MAS with 12 robots. The agents safely reach their target locations. The trajectory segments where stored command sequences are used are shown in blue.
Target Locations Successfully Reached
In this experiment, we start from the same initial configuration as in Figure 2(a). The outcome of this experiment, shown in Figures 3(a) and 3(b), is different from the one in Figure 2, even though both start from the same initial configuration. The reason is that the local optimizer for the MPC controllers used a different random seed, and in this case the optimizer successfully found a permanently safe command sequence at all steps. The closest any pair of agents get is at k = 18; see Figure 3(a), where the minimum pairwise distance remains above d_min = 1.7. A video of the simulation is available online. This demonstrates the transparency of Black-Box Simplex, as the safe outputs of the advanced controller are always used.
Reverse Switching Scenario
We stress-test the multi-robot system by initializing 12 agents on a circle of radius 10. The path of the agents is shown in Figure 3(c). During the entire simulation, there are 10 instances where the decision module rejects the proposed command sequence and instead uses the previously stored safe command sequence until the next safe command sequence is produced. All agents reach their target locations without colliding; the minimum separation between any pair of agents is 1.724. A video of the simulation is available online.

Handling Uncertainty
Up to this point, the MAS considered has no uncertainty in the states or dynamics. We next investigate the decision module's runtime overhead when uncertainty needs to be taken into account. For this, we consider two types of uncertainty: sensor uncertainty and dynamics uncertainty. The first case arises when the sensors used to determine the positions and velocities can have error. The second case could be used to account for modeling errors, through disturbances on the positions and velocities at each step.

https://streamable.com/1c8th8
https://streamable.com/h9gv09

Fig. 4: Zonotope reachability computes future states with uncertainty. (a) Reachable states with sensor error. (b) Reachable states with disturbances.

We continue to use the same MPC strategy as before; thus, the controllers ignore the uncertainty when generating proposed command sequences. Only the logic used by the decision module to accept or reject command sequences needs to be adjusted to account for uncertainty. We examine the scenario shown before in Figure 2(b). To account for the uncertainty, we perform an online reachability computation instead of a simulation. To do this, we use efficient methods for reachability of linear systems based on zonotopes [13], which we implement in Python. In this case, each agent has four state variables, two for position and two for velocity. The composed system with seven agents has 28 variables.

In the first case, shown in Figure 4(a), the current state is assumed to have uncertainty independently in both position and velocity with an L norm of 0.1. We use a 16-sided polygon to bound this uncertainty. In the plot, the deterministic simulation is given, along with black polygons for each agent that show the states reachable at each step due to the sensor uncertainty.
The uncertainty in the velocity causes the set to expand over time, since the open-loop command sequence produced by the controller does not attempt to compensate for the uncertainty. The zonotope representation of the composed system uses 112 generator vectors to represent the set of states at each time step, which affects the method's runtime.

In the second case, shown in Figure 4(b), the initial state has very little error, but the dynamics is changed to have disturbances at each step. For each agent's position and velocity, we allow an external disturbance value to be added in the range [−. , .].

To measure runtime, we used a standard laptop with a 2.70 GHz Intel Xeon E-2176M CPU and 32 GB RAM, running Ubuntu 20.04. The method is extremely fast. For the case of sensor uncertainty, computing the box bounds of the reachable set at all the steps takes about 1.5 milliseconds. With uncertainty, even though the number of generators grows over time, it is not large enough to significantly affect the runtime. The computation with disturbances requires about 2 milliseconds to complete. Although this computation would be repeated at each step, we believe such execution times are sufficiently fast for use in the decision module logic.

One issue that arises with disturbances is that proving command sequences are permanently safe is more difficult. In both cases shown, since the position uncertainty grows over time, the agents could potentially collide if we simulate far enough into the future. There are two ways to overcome this. One is to adopt a different definition of a permanently safe command sequence, such as simply saying the agents are considered safe when we can guarantee some large separation distance. An alternative solution is to pair a closed-loop low-level controller with the plant, and use the output of model-predictive control to generate waypoints for this controller, rather than a sequence of open-loop accelerations.
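The zonotope-based reachability used by the decision module in the sensor-uncertainty case can be sketched in a few lines. Here a single double-integrator agent with a box initial-uncertainty set stands in for the 28-variable composed system and the 16-sided polygon bound used in the paper; all constants are illustrative.

```python
import numpy as np

DT = 1.0
# one double-integrator agent: state [px, py, vx, vy] (the 7-agent,
# 28-variable composed system in the text is block-diagonal in these)
A = np.array([[1.0, 0.0, DT, 0.0],
              [0.0, 1.0, 0.0, DT],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[0.0, 0.0], [0.0, 0.0], [DT, 0.0], [0.0, DT]])

def zono_step(c, G, u, W=None):
    """One step of zonotope reachability: affine map of the center and
    generators; a disturbance zonotope W (generator columns) is added
    by Minkowski sum, growing the generator count over time."""
    c = A @ c + B @ u
    G = A @ G
    if W is not None:
        G = np.hstack([G, W])
    return c, G

def box_bounds(c, G):
    """Interval hull of a zonotope: center +/- row-wise sum of |G|."""
    r = np.abs(G).sum(axis=1)
    return c - r, c + r

# initial state with a +/-0.1 box of sensor uncertainty (a box stands
# in for the 16-sided polygon bound used in the text)
c = np.array([0.0, 0.0, 1.0, 0.0])
G = 0.1 * np.eye(4)
for u in [np.zeros(2)] * 10:  # open-loop command sequence
    c, G = zono_step(c, G, u)
lo, hi = box_bounds(c, G)
# the velocity uncertainty makes the position interval grow over time
```

Passing a disturbance zonotope `W` at each step models the second case; its generators accumulate, which is why the generator count grows over time as noted above.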
Our second evaluation system involves guaranteeing collision avoidance for groups of aircraft. We use a six-degrees-of-freedom F-16 simulation model [15], based on dynamics taken from an Aerospace Engineering textbook [36]. Each aircraft is modeled with 16 state variables, including positional states, positional velocities, rotational states, rotational velocities, an engine thrust lag term, and integrator states for the low-level controllers. These controllers actuate the system using the typical aircraft control surfaces (the ailerons, elevators, and rudder), as well as by setting the engine thrust. The system evolves continuously with piecewise nonlinear differential equations, where the function that computes the derivative given the state is provided as Python code. In order to match the discrete-time plant model in Definition 1, we periodically select a control strategy with a frequency of once every two seconds. The model further includes high-level autopilot logic for waypoint following, which we reuse in the advanced controller.

For the collision-avoidance baseline controller, we build upon the ACASXu system designed for unmanned aircraft [19]. While the original system was designed using a partially observable Markov decision process (POMDP), the resultant controller was encoded in a large look-up table that used hundreds of gigabytes of storage [16]. To make the system more practical, a downsampling process followed by a lossy compression using neural networks was used [17,16]. We use these downsampled neural networks for collision avoidance.

The ACASXu system issues horizontal turn advisories based on the relative positions of two aircraft, an ownship and an intruder. The system is similar to Simplex, where the output can be either clear-of-conflict, where any command is allowed, or an override command that is one of weak-left, weak-right, strong-left, or strong-right.
We adapt this system to the multi-aircraft case by having each aircraft run an instance of ACASXu against every other aircraft. At each decision point, the ownship will use the advisory from the closest intruder aircraft that commanded a turn, if any, only producing clear-of-conflict if all outputs are clear-of-conflict. To create command sequences, we advance the plant model and re-run ACASXu from the future state multiple times in a closed-loop fashion.

As with the multi-robot scenario, we examine cases where the initial aircraft state x has all aircraft starting evenly spaced, facing towards the center of a circle with a given initial diameter. Each aircraft has an initial velocity of 807 ft/sec and an initial altitude of 1000 ft, both of which are maintained throughout the maneuver by the controllers.

The advanced controller commands each aircraft to fly towards a waypoint past the opposite side of the circle. The safety property requires maintaining horizontal separation. The near mid-air collision cylinder (NMAC) defines the minimum acceptable separation to be 500 ft [22], although we will consider various safety distances in our evaluation. If only the advanced controller is used, all aircraft fly straight, so they collide in the center.

In addition to the advanced controller being unsafe, the baseline ACASXu controller should not be fully trusted, for many reasons. The original POMDP formulation was not proven formally correct, not to mention the downsampling and lossy neural network compression. While some research has examined proving open-loop properties for the ACASXu neural networks [17,5], these do not imply closed-loop collision avoidance. Further, we use a multi-aircraft adaptation of the system, which could also lead to problems.
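The closest-intruder advisory selection described above can be sketched as follows; `acas_pairwise` is an assumed stand-in interface for the two-aircraft ACASXu network, and the distance-based fake used in the demo is purely illustrative.

```python
import math

CLEAR = "clear-of-conflict"

def own_command(own_pos, intruder_positions, acas_pairwise):
    """Multi-aircraft adaptation: run pairwise ACASXu against every
    intruder and obey the advisory of the closest intruder that
    commands a turn; otherwise output clear-of-conflict."""
    turning = []
    for intr in intruder_positions:
        advisory = acas_pairwise(own_pos, intr)
        if advisory != CLEAR:
            turning.append((math.dist(own_pos, intr), advisory))
    return min(turning)[1] if turning else CLEAR

# purely illustrative stand-in for the neural-network advisories
def fake_acas(own, intr):
    d = math.dist(own, intr)
    if d < 2.0:
        return "strong-left"
    if d < 5.0:
        return "weak-left"
    return CLEAR

advisory = own_command((0.0, 0.0),
                       [(1.0, 0.0), (3.0, 0.0), (10.0, 0.0)],
                       fake_acas)
```

With the fake advisories above, the closest intruder at distance 1 commands the turn, so the ownship follows its strong-left advisory even though a farther intruder only requested weak-left.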
Finally, the intended system response to the ACASXu outputs is that weak-left and weak-right should cause turning at 1.5 degrees per second, whereas strong-left and strong-right turn at 3.0 degrees per second [16]. However, turning an aircraft in the F-16 model, as well as in the real world, is not an instantaneous process, and requires first performing a roll maneuver before the heading angle begins to change. For these reasons, the baseline controller in this scenario is also an unverified component, and we will show scenarios where it misbehaves. Nonetheless, we will compose the incorrect advanced controller with the incorrect baseline controller to create a correct multi-aircraft collision avoidance system by using Black-Box Simplex.

For the initial permanently safe command sequence s, we have each aircraft fly in clockwise circles forever, which avoids collisions. To check whether a generated command sequence is permanently safe, the decision module simulates the system and checks that (i) each aircraft's state stays within the model limits (for example, no aircraft enters a stall), (ii) all aircraft obey the safety distance constraint at all times, and (iii) the execution ends in a state where the roll angle of each aircraft has been small (less than 15 degrees) and the distances between all pairs of aircraft have been increasing consecutively for several seconds. Presumably, if all aircraft continue to fly straight and level from such a configuration, their distances would continue to increase and no collisions would occur in the future.

We next elaborate on three scenarios: (i) a three aircraft case, which shows the safety of the system despite unsafe outputs, (ii) a four aircraft case, which shows the increased transparency of Black-Box Simplex, and (iii) a 15 aircraft case, which shows safe navigation of a complex scenario. Appendix A also includes a seven aircraft case which shows that the safety condition can be easily customized.

Fig. 5: Black-Box Simplex is safe. In the three aircraft case the multi-ACASXu system fails, whereas Black-Box Simplex maintains the 1500 ft collision distance.
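A sketch of the decision module's three-part acceptance check might look like the following; the trajectory representation, the model-limits callback, and the number of steps standing in for "several seconds" of increasing separation are all assumptions for illustration.

```python
import numpy as np

SAFETY_DIST = 1500.0  # ft, as used in the decision module
MAX_ROLL_DEG = 15.0
DIVERGE_STEPS = 3     # stand-in for "several seconds" of divergence

def min_pairwise_dist(positions):
    n = len(positions)
    return min(np.linalg.norm(positions[i] - positions[j])
               for i in range(n) for j in range(i + 1, n))

def acceptance_check(trajectory, within_model_limits):
    """Three-part check on a simulated trajectory (a list of per-step
    dicts holding horizontal positions and roll angles, an assumed
    representation): (i) model limits hold at every step, (ii) the
    safety distance holds at every step, (iii) the run ends with small
    roll angles and several consecutive steps of growing separation."""
    if not all(within_model_limits(s) for s in trajectory):
        return False                                    # (i)
    dists = [min_pairwise_dist(s["pos"]) for s in trajectory]
    if min(dists) < SAFETY_DIST:
        return False                                    # (ii)
    if np.any(np.abs(trajectory[-1]["roll"]) > MAX_ROLL_DEG):
        return False                                    # (iii) level
    tail = dists[-(DIVERGE_STEPS + 1):]
    return all(a < b for a, b in zip(tail, tail[1:]))   # (iii) diverging

def make_traj(dists):
    return [{"pos": np.array([[0.0, 0.0], [d, 0.0]]),
             "roll": np.zeros(2)} for d in dists]

diverging = make_traj([2000.0, 2100.0, 2200.0, 2300.0, 2400.0])
too_close = make_traj([2000.0, 1000.0, 2000.0, 2100.0, 2200.0])
```

The diverging trajectory passes all three checks, while the second is rejected because its minimum separation dips below the safety distance.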
Three Aircraft Scenario
The ACASXu collision avoidance system was designed with two aircraft in mind, an ownship and an intruder. We adapted it to the multi-aircraft case, but this mismatch between the system design assumptions and the usage scenario can lead to problems. While the system does usually avoid collisions, it is not difficult to find cases where it fails, especially when there are more than two aircraft. With three aircraft, we could find cases where the collision distance property was violated.

In all the plots in this section, we show snapshots in time where the distance between aircraft is minimized for a particular scenario. The two red aircraft in each image are the closest pair, and their distance is printed in the bottom right of each figure. The solid line shows the historic path of each aircraft, and the dotted line is the future trajectory.

In Figure 5, we show a three-aircraft scenario, where the initial circle diameter is 90,000 ft. Using the original ACASXu system, where a clear-of-conflict output runs the waypoint-follower logic, is unsafe. Figure 5(a) shows this case, where the distance between the top two aircraft is 175 ft, violating the near mid-air collision safety distance. The other two subplots show the system using Black-Box Simplex with a safety distance of 1500 ft, and the minimum separation is 1602 ft, which satisfies the constraint.
Four Aircraft Scenario
Figure 6 shows a four-aircraft scenario using an initial circle diameter of 70,000 ft. In this case, both designs have safe executions. Using the original ACASXu system leads to a minimum separation of 5342 ft, whereas the minimum separation with Black-Box Simplex is 1600 ft, much closer to the 1500 ft safety distance constraint used in the decision module. Although both systems are safe, from the plots it is clear that the Black-Box Simplex version is more transparent, in the sense that it produces smaller modifications to the direct-line trajectories commanded by the advanced controller.

Fig. 6: Black-Box Simplex is more transparent. For the four aircraft case, ACASXu is significantly more intrusive than Black-Box Simplex, which overrides commands just enough to guarantee the 1500 ft separation requirement. (a) Original ACASXu. (b) Black-Box Simplex.
Fifteen Aircraft Scenario
Finally, we demonstrate the system's ability to safely navigate complex scenarios. For this, we use a 15 aircraft scenario, with an initial circle diameter of 90,000 ft. With 15 aircraft, the composed system has 240 real-valued state variables, each of which evolves according to piecewise nonlinear differential equations. Figure 7 shows the system's behavior. While the original ACASXu system is unsafe, Black-Box Simplex has a minimum separation of 1500.5 ft, just above the 1500 ft safety constraint used in the decision module. Another surprising observation is that in some of the cases, such as this 15 aircraft case and the seven aircraft case shown in Figure 8(b), the aircraft perform something similar to a roundabout maneuver. This is an emergent behavior and not something we explicitly hardcoded or anticipated. A video of this case is also available online.

Discussion
The acceptance condition for dm update checked by the decision module consists of three parts described earlier: (i) the system state stays within model limits, (ii) the horizontal safety distance is maintained at all times, and (iii) the maneuver ends with straight-and-level flight, with all aircraft moving away from each other. However, there are many behaviors we would want to avoid that are not checked by this logic. For example, flying into the ground will not cause a command sequence to be rejected. The system could also command maximum afterburner, which would waste fuel unnecessarily, or actuate the system wildly in ways that would damage a real airframe or cause excessive wear. We could try to avoid these cases by enumerating them and adding appropriate checks in the decision module. However, the possibilities for misbehavior are numerous, and such an approach is likely to miss undesirable cases. Note that this problem is not unique to Black-Box Simplex, but also exists with the original Simplex Architecture. Another way to address this issue is to push it aside: assume that although the controllers are unverified, they are not malicious, and so the decision module can focus on checking the aspects of the controllers that are most critical and most likely to be incorrect.

https://streamable.com/upm0lc

Fig. 7: Black-Box Simplex safely navigates complex scenarios. In the 15-aircraft case, all aircraft cross the circle while maintaining the 1500 ft separation distance. (a) Original ACASXu (failure). (b) Black-Box Simplex. (c) Black-Box Simplex (zoomed in).

A third, perhaps better, way to handle this issue is to reformulate the problem. We can modify the "Plant + Low-Level Controller" box in Black-Box Simplex to also include trusted low-level controllers for the following six actions: (1) fly towards the waypoint, (2) fly straight, (3) turn weak left, (4) turn weak right, (5) turn strong left, and (6) turn strong right.
At each decision point, then, the advanced controller always chooses the first action, whereas the baseline controller's ACASXu system produces an output corresponding to one of the latter five actions, for each aircraft. Creating a reliable controller individually for each of these six actions is much closer to a classical control problem; we could assume such controllers are provided and can maintain altitude and airspeed, and do not fly into the ground or damage the aircraft in other ways. The problem solved by Black-Box Simplex is then about high-level control: how to safely compose these actions to reach the target waypoint while maintaining the safety constraint.

Another limitation of this case study is that it does not have disturbances, so the dynamics are deterministic. Handling disturbances with the same approach as in the previous case study, using online reachability computation, would be difficult for several reasons. First, closed-form differential equations, which are required by most reachability tools, are not provided for this system, as the derivative is computed by Python code. It would be hard to even extract differential equations from the code, as the simulator includes components like multiple look-up tables and conditional branches. Second, the simulation has nonlinear dynamics, for which reachability methods scale more poorly than with linear dynamics. Third, the number of variables can be large, for example 240 real-valued state variables in the 15-aircraft case. Systems of this complexity are beyond what is currently possible with nonlinear reachability tools [12], and even if analysis becomes possible, it is unlikely to run fast enough for use at runtime. Instead, the more feasible approach would again be to rely on control theory to provide performance guarantees for the designed controllers.
If each maneuver resulting from the six actions described earlier could be analyzed for maximum overshoot and steady-state error (typical metrics evaluated during control design), executions of the system could be computed and then bloated by the worst-case distance to obtain a lower bound on the minimum horizontal separation. Used like this, Black-Box Simplex builds upon the vast body of work from control theory to produce safe high-level behaviors from untrusted controllers.

In terms of the runtime of the decision module's simulation logic, we have not yet optimized its computation time in our implementation. Importantly, since each aircraft's dynamics are disjoint, there is significant room for speedup through parallelism, up to the number of aircraft in the system. The existing implementation simulates the dynamics using an adaptive-step explicit Runge-Kutta scheme of order 5(4) from Python's scipy package. On our laptop platform with default accuracy parameters, this runs about 55 times faster than real time per aircraft. As future work, we could consider optimizing runtime by using different numerical integration strategies, depending on factors such as the distance to the safety violation.
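A minimal sketch of this simulation step with scipy's Runge-Kutta 5(4) integrator is below; the F-16 derivative is replaced by a toy linear stand-in, since the real model is nontrivial Python code with look-up tables.

```python
import numpy as np
from scipy.integrate import solve_ivp

def aircraft_derivative(t, x):
    """Toy linear stand-in for the F-16 derivative function; the real
    model is piecewise nonlinear Python code with look-up tables."""
    return -0.1 * x

x0 = np.ones(16)  # 16 state variables per aircraft
# simulate one two-second control period with the adaptive-step
# explicit Runge-Kutta 5(4) scheme named in the text; since each
# aircraft's dynamics are disjoint, one such call per aircraft can
# run in parallel
sol = solve_ivp(aircraft_derivative, (0.0, 2.0), x0, method="RK45")
final_state = sol.y[:, -1]  # state at the end of the period
```

The decision module would chain such calls, re-running the controllers between periods, to produce the trajectories checked by the acceptance condition.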
Related Work

Reachability-based verification methods for black-box systems for waypoint following with uncertainty have recently been investigated in the ReachFlow framework [20]. Unlike Black-Box Simplex, ReachFlow requires closed-form differential equations for the plant model. Further, it is built upon the Flow* nonlinear reachability tool [9], which is unlikely to scale to large complex systems, such as the 240-variable fifteen aircraft scenario we considered.

A framework for safe trajectory planning using MILP for piecewise-linear vehicle models has been considered [32,33]. This method relies on the ability of a model-predictive controller to produce command sequences where the terminal state in the prediction horizon is constrained within a safe invariant set, thus providing a safe back-up command sequence for the next step in case the system fails to find a safe sequence. The system stores a path to the start state (return trajectory), which may be used to get out of loops. The scope of this work is limited to MPC, and it is not clear how to extend it to other applications. Moreover, the conditions for switching back and forth between the stored return trajectory and the MPC are not formalized.

In the recently-proposed Contingency Model Predictive Control framework [1], an MPC controller maintains a contingency plan in addition to the nominal or desired plan to ensure safety during an identified potential emergency. Like Black-Box Simplex, the initial command is common to both plans. The MPC is robust to stochastic disturbances by anticipating and planning in advance for the worst-case events. Due to the coupling of the nominal and contingency plans, the deployed command is continuously impacted by the possibility of the contingency, however unlikely it may be. In contrast, in Black-Box Simplex, besides the advanced controller sharing its command with the baseline controller, both controllers work independently.
Also, the theory here generalizes beyond MPC.

Designing safe switching logic for a given baseline controller is related to the concept of computing viability kernels [30] (closed controlled invariant subsets) in control theory. Like computing reachability, this requires symbolic differential equations and set operations that can be inefficient in high-dimensional spaces, although there is some progress on this [18,21].

Simplex designs have also been considered that use a combination of offline analysis with online reachability [4]. Again, though, reachability computation is currently intractable for large nonlinear systems, and requires symbolic differential equations. Other work has used Simplex to provide safety guarantees for neural network controllers with online retraining [26]. In these approaches, however, the baseline controller must be verified ahead of time.

Online simulation-based methods have also been investigated to secure power grids from insider attacks [23]. As with this work, fast online simulation is critical, although the goal there is system security, not safe high-level control design.

The design of the MPC controllers for our MAS case study is similar to control barrier certificate methods [8,14]. There, a runtime assurance system was used to provide minimally perturbed advanced controller commands, computed using a constrained-optimization problem. However, the optimization problem might become infeasible, or global nonlinear optimization could perform poorly at one of the steps at runtime, causing this method to fail. With Black-Box Simplex, failure of the baseline controller does not compromise safety.

Formal verification methods have also investigated multi-aircraft roundabout maneuvers [29] using differential dynamic logic proof systems [28]. Like that work, we also verify flyable maneuvers in our case study, although we use a simulation model.
Although this introduces numerical simulation error, the system is difficult to analyze otherwise, as the behavior is defined with source code, not differential equations.

The ModelPlex framework [24] generates runtime safety monitors to validate whether the assumptions from offline model verification hold during execution. ModelPlex monitors check whether the observed system execution fits the verified model and, if it does not, initiate the fail-safe actions necessary to avoid safety risks. The goal of ModelPlex is assuring model validity, which is different from this work, and its successful application still requires a verified baseline controller.
Conclusions

We have presented the Black-Box Simplex Architecture, a methodology for constructing safe CPS from unverified black-box high-level controllers. The main tradeoff present in Black-Box Simplex is that the decision module has increased complexity, and for the system to perform smoothly, it must be able to quickly verify command sequences. This itself is not an easy problem. For example, when using Black-Box Simplex with end-to-end machine learning, in addition to runtime verification of command sequences, the decision module would also need to run its own perception logic to make sense of the environment, further increasing the computational burden. With the proposed approach, however, we have reduced the difficult problem of proving high-level safety to a simpler problem of performance optimization of the decision module logic. Black-Box Simplex provides a feasible path for the verification of systems that are otherwise unverifiable in practice.
References
1. Alsterda, J.P., Brown, M., Gerdes, J.C.: Contingency model predictive control for automated vehicles. In: 2019 American Control Conference (ACC), pp. 717–722 (2019). https://doi.org/10.23919/ACC.2019.8815260
2. Althoff, M., Dolan, J.M.: Online verification of automated road vehicles using reachability analysis. IEEE Transactions on Robotics (4) (2014)
3. Bak, S., Chivukula, D.K., Adekunle, O., Sun, M., Caccamo, M., Sha, L.: The system-level simplex architecture for improved real-time embedded system safety. In: 2009 15th IEEE Real-Time and Embedded Technology and Applications Symposium, pp. 99–107. IEEE (2009)
4. Bak, S., Johnson, T.T., Caccamo, M., Sha, L.: Real-time reachability for verified simplex design. In: 35th IEEE Real-Time Systems Symposium (RTSS 2014). IEEE Computer Society, Rome, Italy (Dec 2014)
5. Bak, S., Tran, H.D., Hobbs, K., Johnson, T.T.: Improved geometric path enumeration for verifying ReLU neural networks. In: Proceedings of the 32nd International Conference on Computer Aided Verification (2020)
6. Bechtel, M.G., McEllhiney, E., Kim, M., Yun, H.: DeepPicar: A low-cost deep neural network-based autonomous car. In: 2018 IEEE 24th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), pp. 11–21. IEEE (2018)
7. Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016)
8. Borrmann, U., Wang, L., Ames, A.D., Egerstedt, M.: Control barrier certificates for safe swarm behavior. In: Egerstedt, M., Wardi, Y. (eds.) ADHS. IFAC-PapersOnLine, vol. 48, pp. 68–73. Elsevier (2015)
9. Chen, X., Ábrahám, E., Sankaranarayanan, S.: Flow*: An analyzer for non-linear hybrid systems. In: International Conference on Computer Aided Verification, pp. 258–263. Springer (2013)
10. Clark, M., Koutsoukos, X., Porter, J., Kumar, R., Pappas, G., Sokolsky, O., Lee, I., Pike, L.: A study on run time assurance for complex cyber physical systems. Tech. rep., Air Force Research Laboratory, Aerospace Systems Directorate (2013)
11. Desai, A., Ghosh, S., Seshia, S.A., Shankar, N., Tiwari, A.: SOTER: A runtime assurance framework for programming safe robotics systems. In: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019, Portland, OR, USA, June 24-27, 2019. IEEE (2019)
12. Geretti, L., Sandretto, J.A.D., Althoff, M., Benet, L., Chapoutot, A., Chen, X., Collins, P., Forets, M., Freire, D., Immler, F., et al.: ARCH-COMP20 category report: Continuous and hybrid systems with nonlinear dynamics. EPiC Series in Computing, 49–75 (2020)
13. Girard, A.: Reachability of uncertain linear systems using zonotopes. In: International Workshop on Hybrid Systems: Computation and Control. Springer (2005)
14. Gurriet, T., Mote, M., Ames, A.D., Feron, E.: An online approach to active set invariance. In: Conference on Decision and Control. IEEE (2018)
15. Heidlauf, P., Collins, A., Bolender, M., Bak, S.: Verification challenges in F-16 ground collision avoidance and other automated maneuvers. In: 5th International Workshop on Applied Verification of Continuous and Hybrid Systems. EPiC Series in Computing, EasyChair (2018)
16. Julian, K.D., Kochenderfer, M.J., Owen, M.P.: Deep neural network compression for aircraft collision avoidance systems. Journal of Guidance, Control, and Dynamics (3), 598–608 (2019)
17. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: International Conference on Computer Aided Verification, pp. 97–117. Springer (2017)
18. Kaynama, S., Maidens, J., Oishi, M., Mitchell, I.M., Dumont, G.A.: Computing the viability kernel using maximal reachable sets. In: Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control, pp. 55–64 (2012)
19. Kochenderfer, M.J., Chryssanthacopoulos, J.: Robust airborne collision avoidance through dynamic programming. Massachusetts Institute of Technology, Lincoln Laboratory, Project Report ATC-371 (2011)
20. Lin, Q., Chen, X., Khurana, A., Dolan, J.: ReachFlow: An online safety assurance framework for waypoint-following of self-driving cars. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2020)
21. Maidens, J.N., Kaynama, S., Mitchell, I.M., Oishi, M.M., Dumont, G.A.: Lagrangian methods for approximating the viability kernel in high-dimensional systems. Automatica (7), 2017–2029 (2013)
22. Marston, M., Baca, G.: ACAS-Xu initial self-separation flight tests. Tech. rep., NASA (2015)
23. Mashima, D., Chen, B., Zhou, T., Rajendran, R., Sikdar, B.: Securing substations through command authentication using on-the-fly simulation of power system dynamics. In: IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (2018)
24. Mitsch, S., Platzer, A.: ModelPlex: Verified runtime validation of verified cyber-physical system models. Formal Methods in System Design (1), 33–74 (2016)
25. Phan, D., Grosu, R., Jansen, N., Paoletti, N., Smolka, S.A., Stoller, S.D.: Neural simplex architecture. In: NASA Formal Methods Symposium (NFM 2020) (2020)
26. Phan, D., Grosu, R., Jansen, N., Paoletti, N., Smolka, S.A., Stoller, S.D.: Neural simplex architecture. In: NASA Formal Methods Symposium (NFM 2020), pp. 97–114. Springer (2020)
27. Phan, D., Yang, J., Grosu, R., Smolka, S.A., Stoller, S.D.: Collision avoidance for mobile robots with limited sensing and limited information about moving obstacles. Formal Methods in System Design (1), 62–86 (2017)
28. Platzer, A.: Differential dynamic logic for hybrid systems. Journal of Automated Reasoning (2), 143–189 (2008)
29. Platzer, A., Clarke, E.M.: Formal verification of curved flight collision avoidance maneuvers: A case study. In: International Symposium on Formal Methods, pp. 547–562. Springer (2009)
30. Saint-Pierre, P.: Approximation of the viability kernel. Applied Mathematics and Optimization (2), 187–209 (1994)
31. Schierman, J., DeVore, M.D., Richards, N., Gandhi, N., Cooper, J., Horneman, K.R., Stoller, S., Smolka, S.: Runtime assurance framework development for highly adaptive flight control systems (2015)
32. Schouwenaars, T., Valenti, M., Feron, E., How, J.: Implementation and flight test results of MILP-based UAV guidance. In: 2005 IEEE Aerospace Conference, pp. 1–13 (2005)
33. Schouwenaars, T.: Safe trajectory planning of autonomous vehicles. Ph.D. thesis, Massachusetts Institute of Technology (2006)
34. Seto, D., Krogh, B., Sha, L., Chutinan, A.: The simplex architecture for safe online control system upgrades. In: Proceedings of the 1998 American Control Conference, ACC (IEEE Cat. No. 98CH36207), vol. 6. IEEE (1998)
35. Sha, L.: Using simplicity to control complexity. IEEE Software (4), 20–28 (2001). https://doi.org/10.1109/MS.2001.936213
36. Stevens, B.L., Lewis, F.L., Johnson, E.N.: Aircraft control and simulation. John Wiley & Sons (2015)

A Seven Aircraft Case - Safety Condition Customization
We investigated a seven aircraft scenario with an initial circle diameter of 70,000 ft. Here, the original ACASXu system violates the horizontal separation constraint, and the minimum separation distance is 277 ft. We run Black-Box Simplex on this system using three different safety distances: 1500 ft, 1000 ft, and 500 ft. All avoid collisions, and as the safety distance is decreased, the observed minimum distance also decreases. This shows that Black-Box Simplex can be easily customized to a change in the safety requirement. Doing this for the original ACASXu system would require significant effort in recomputing the POMDPs and retraining the neural networks to perform a compression of the action tables. Plots of the seven aircraft trajectories are provided in Figure 8, and a video of the 1000 ft case is available online.

Fig. 8: Black-Box Simplex is easily customizable. In the seven aircraft case, adjusting the safety distance in the decision module results in different system behaviors. In each case, the advanced controller command is overridden only enough to guarantee the corresponding safety constraint. (a) Original ACASXu (failure). (b) Black-Box Simplex with safety distance 1500 ft. (c) Black-Box Simplex with safety distance 1000 ft. (d) Black-Box Simplex with safety distance 500 ft.