Safe CPS from Unsafe Controllers
Usama Mehmood, Stanley Bak, Scott A. Smolka, and Scott D. Stoller
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Abstract.
In this paper, we explore using runtime verification to design safe cyber-physical systems (CPS). We build upon the Simplex Architecture, where control authority may switch from an unverified and potentially unsafe advanced controller to a backup baseline controller in order to maintain system safety. New to our approach, we remove the requirement that the baseline controller is statically verified. This is important, as there are many types of powerful control techniques (model-predictive control, rapidly-exploring random trees, and neural network controllers) that often work well in practice but are difficult to statically prove correct, and therefore could not be used before as baseline controllers. We prove that, through more extensive runtime checks, such an approach can still guarantee safety. We call this approach the
Black-Box Simplex Architecture, as both high-level controllers are treated as black boxes. We present case studies where model-predictive control provides safety for multi-robot coordination, and neural networks provably prevent collisions for groups of F-16 aircraft, despite occasionally outputting unsafe actions.
1 Introduction

Modern cyber-physical systems (CPS) are found in vital domains such as transportation, autonomy, health care, energy, agriculture, and defense. As these systems perform complex functions, they require complex designs. Since CPS interact with the physical world, correctness is especially important, but formal analysis can be difficult for complex systems.

In the design of such CPS, powerful techniques such as model-predictive control and deep reinforcement learning are increasingly being considered instead of traditional high-level control design. Such trends exacerbate the safety verification problem. For example, one approach for autonomous driving is end-to-end learning, where a complex neural network directly receives sensor inputs and outputs low-level steering commands [7,6]. Additionally, there is increasing interest in systems that can learn in the field, changing their behaviors based on observations. Classical verification strategies are poorly suited for such designs.

One approach for dynamically providing safety for systems with complex and unverified components is runtime assurance [10], where the state of the plant is monitored at runtime to detect possible imminent violations of formal properties. If necessary, corrective measures are taken to avoid the violations. A well-known
[Figure 1: block diagrams of (a) the Traditional Simplex Architecture and (b) the Black-Box Simplex Architecture. In (a), the Decision Module receives sensor data and a command from each of the Advanced Controller and the Baseline Controller, and forwards one command to the Plant + Low-Level Controller. In (b), the Lookahead Baseline Controller instead sends the Decision Module a command sequence.]
Fig. 1: The Black-Box Simplex Architecture guarantees safety despite a black-box advanced controller and a black-box baseline controller.

runtime assurance technique is the Simplex Architecture [34,35], which has been applied to a wide range of systems [11,27,31]. In the original Simplex Architecture, shown in Figure 1(a), the baseline controller and the decision module are part of the trusted computing base. They must be verified correct for the system to work. The decision module monitors the state of the system and switches control from the advanced controller to the baseline controller if using the former could result in a safety violation in the near future. The advanced controller is typically concerned with mission requirements. The baseline controller, on the other hand, should be simpler in design and proven to preserve the safety of the system.

The successful application of the original Simplex Architecture requires creating a provably safe baseline controller, a difficult task for many systems. Further, many classes of controllers, such as those designed using model-predictive control, rapidly-exploring random trees, or neural networks, may work well in practice but are difficult to verify and therefore cannot be used as baseline controllers.
The main contribution of this work is to overcome this limitation.
We propose the
Black-Box Simplex Architecture, a variant of the traditional Simplex Architecture that can guarantee the safety of the system despite an unverified and even incorrect baseline controller, which is treated as a black box.

In the Black-Box Simplex Architecture, shown in Figure 1(b), the baseline controller tries to produce a sequence of commands that begins with the advanced controller's current command and brings the plant to a state where maintaining the safety property is easy (to be formally defined later). The verified decision module checks at runtime whether the baseline controller's command sequence satisfies these requirements. If so, the decision module stores it for potential use as a backup plan at the next time step, in case no further safe command sequences are produced. This is a key tradeoff one encounters with Black-Box Simplex compared to traditional Simplex: the decision module must perform more computationally expensive runtime verification, so its performance is important. Whether this tradeoff is practical depends on the specific system, and we investigate it through several case studies. We assume that the plant has a trusted low-level controller that correctly applies the command from the decision module to the plant.

We prove two theorems about this architecture: (i) safety is always guaranteed, and (ii) if the baseline and advanced controllers perform well (to be formally defined in the next section) and the decision module is fast enough, the architecture is transparent: the advanced controller appears to have full control of the system. The practicality of these assumptions is also demonstrated through our two significant case studies. In the first, a multi-robot coordination system uses a baseline controller with a model-predictive control algorithm.
In the second, a mid-air collision avoidance system for groups of F-16 aircraft is created from imperfect logic encoded in neural networks.

The rest of the paper is organized as follows. Section 2 presents a formal definition of the Black-Box Simplex Architecture, including proofs of the safety and transparency theorems. Section 3 contains two case studies implementing the architecture. Section 4 discusses related work, and Section 5 offers our concluding remarks.
2 The Black-Box Simplex Architecture

This section reviews the limitations of the traditional Simplex Architecture and then presents this paper's main contribution, the Black-Box Simplex Architecture.
The traditional Simplex Architecture, shown in Figure 1(a), preserves the safety of the system while permitting the use of an unverified advanced controller. It does this by using the advanced controller in conjunction with a verified baseline controller and a verified decision module. The goal of the Simplex Architecture is to ensure that the system state is always admissible, i.e., that it satisfies all safety constraints and operational limits.

The decision module cannot simply check whether the next state is admissible. Rather, the verified design of a Simplex system usually requires offline reasoning with respect to a trusted baseline controller. If the system dynamics are linear and the admissible states are defined with linear constraints, a state-feedback baseline controller and a decision module can be synthesized by solving a linear matrix inequality [34]. If the system or constraints are nonlinear, however, there is no easy recipe for creating a trusted baseline controller and trusted decision module. This prevents more widespread use of the traditional Simplex Architecture.
The Black-Box Simplex Architecture lifts the requirement that the baseline controller be verified, allowing provable safety with both an unverified advanced
controller and an unverified baseline controller. Its architecture is shown in Figure 1(b). Apart from eliminating the need to establish safety of the baseline controller, the Black-Box Simplex Architecture differs from the traditional Simplex Architecture in other important ways. First, the advanced controller shares its command with the baseline controller instead of passing it directly to the decision module. Second, the baseline controller uses this command as the starting point of a command sequence intended to safely recover the system. Many control techniques naturally produce command sequences, such as model-predictive control with a finite-step horizon or controllers derived from rapidly-exploring random trees (RRTs). If a model of the low-level controllers and plant is provided, a traditional single-output controller can be used to create a command sequence through repeated invocation and system simulation.

The decision module checks the baseline controller's command sequence, possibly rejecting it if safety is not ensured. As long as the advanced controller drives the system through states where the baseline controller can recover, it continues to actuate the system. However, if the baseline controller fails to compute a safe command sequence, due to a fault of either the unverified advanced controller or the unverified baseline controller, the decision module can still recover the system using the safe command sequence from the previous step.

The applicability of Black-Box Simplex depends on the feasibility of two system-specific steps: (i) constructing safe command sequences and (ii) proving their safety at runtime. For some systems, a safe command sequence can simply bring the system to a stop. An autonomous car, for example, could have safe command sequences that steer the car to the side of the road and then stop. A safe sequence for a drone might direct it to the closest emergency landing location.
For an autonomous fixed-wing aircraft swarm, a safe sequence could fly the aircraft in non-intersecting circles to buy time for a human operator to intervene.

Proving safety of a given command sequence can also be challenging and depends on the system dynamics. For nondeterministic systems, this could involve performing reachability computations at runtime [20,4,2]. Such techniques assume an accurate system model is available in order to compute reachable sets. Traditional offline control theory also requires this assumption, so we do not view it as overly burdensome.

In the Black-Box Simplex Architecture, although both controllers are unverified, we do not combine them into a single unverified controller, for two reasons. First, the design of the safety controller is easier if it is kept simple and is not burdened with fulfilling all mission-specific goals. Second, keeping them separate allows the use of off-the-shelf control strategies that are focused on either mission completion or safety.
We formalize the behavior and requirements of the components of the Black-Box Simplex Architecture in order to prove properties about the system's behavior.
Plant Model.
We consider discrete-time plant dynamics, modeled as a function

f(x_i, u_i, w_i) = x_{i+1}    (1)

where x_i ∈ X is the system state, u_i ∈ U is a control input command, w_i ∈ W is an environmental disturbance, and i ∈ Z^+ is the time step. We sometimes also consider a deterministic version of the system, where the disturbance w_i can be taken to be zero at every step.

Admissible States.
The system is characterized by a set of operational constraints, which include physical limits and safety properties. States that satisfy all the operational constraints are called admissible states.

Command Sequences.
A single-input command is some u ∈ U, and a k-length sequence of commands is written as u ∈ U^k. The length of a sequence is written u_len = k; we can also take the length of a single command, u_len = 1. We use Python-like notation for subsequences, where the first element of a sequence is u[0] and the rest of the sequence is u[1:].

Decision Module.
The decision module in Black-Box Simplex stores a command sequence s, which we sometimes call the decision module's state. The behavior of the decision module is defined through two functions, dm_update and dm_step. The dm_update function attempts to modify the decision module's stored command sequence:

dm_update(x, s, t) = s'    (2)

where x is the current state, s is the current sequence, and t is the proposed sequence. If s' = t, we say that the proposed command sequence is accepted; otherwise, s' = s and we say that it was rejected. Correctness conditions on dm_update are given in Section 2.4. Note that the decision module will accept a safe command sequence from the advanced controller, even if the previous command sequence from the advanced controller was rejected because it was unsafe. As in [25], we refer to this as reverse switching, since it switches control back to the advanced controller.

The dm_step function produces the next command u to apply to the plant, as well as the next step's command sequence s' for the decision module:

dm_step(s) = (u, s')    (3)

where u = s[0] and s' is constructed from s by removing the first command (if the current sequence s has only one command, it is repeated):

s' = s      if s_len = 1
s' = s[1:]  otherwise

Controllers.
The advanced and baseline controllers are defined using functions of the system state. In particular, the advanced controller is defined by a
U. Mehmood et al. function ac ( x ) = u , where u ∈ U is a single command. The baseline controller issimilarly defined with bc ( x ) = u , where u ∈ U k is a k-length command sequence.For Black-Box Simplex, we make use of look-ahead baseline controllers , whichoutput command sequences that start with the same command as an advancedcontroller. These can be defined with a function lbc ac ( x ) = u , with u [0] = ac ( x ).We generally drop the subscript on lbc , as it is clear from context. Execution Semantics.
At step i, given system state x_i and decision module state s_i, the next system state x_{i+1} and next decision module state s_{i+1} are computed with the following sequence of steps:

(1) ac(x_i) = z_i;
(2) lbc(x_i) = t_i, with t_i[0] = z_i;
(3) dm_update(x_i, s_i, t_i) = s'_i;
(4) dm_step(s'_i) = (u_i, s_{i+1});
(5) f(x_i, u_i, w_i) = x_{i+1}, for some disturbance w_i ∈ W.

We define several relevant concepts and then state and prove safety and transparency theorems for the Black-Box Simplex Architecture.
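Before turning to the theorems, the per-step execution semantics can be sketched in Python. Everything below (the plant f, the controllers ac and lbc, and the safety check) is a hypothetical stand-in for illustration, not one of the paper's case-study models:

```python
# Sketch of one step of the Black-Box Simplex execution semantics.
# The plant f, controllers ac/lbc, and the safety check are all
# hypothetical stand-ins supplied by the caller.

def dm_step(s):
    """Pop the next command; a one-command sequence repeats forever (Eq. 3)."""
    u = s[0]
    s_next = s if len(s) == 1 else s[1:]
    return u, s_next

def dm_update(x, s, t, is_permanently_safe):
    """Adopt the proposed sequence t only if it passes the safety check (Eq. 2)."""
    return t if is_permanently_safe(x, t) else s

def execute_step(x, s, f, ac, lbc, is_permanently_safe, w=0.0):
    """Steps (1)-(5) of the execution semantics for one time step."""
    z = ac(x)                                     # (1) advanced command
    t = lbc(x)                                    # (2) look-ahead sequence
    assert t[0] == z                              #     lbc must start with z
    s2 = dm_update(x, s, t, is_permanently_safe)  # (3) maybe adopt new backup
    u, s_next = dm_step(s2)                       # (4) command to apply
    x_next = f(x, u, w)                           # (5) plant update
    return x_next, s_next
```

Note that even when dm_update rejects t, dm_step still yields a command from the previously stored sequence, which is exactly the fallback behavior described above.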
Definition 1 (Safe System Execution).
A system execution is called safe if and only if the system state is admissible at every step.
Safety can be ensured by following a permanently safe command sequence from a given system state.
Definition 2 (Permanently Safe Command Sequence).
Given state x_i, a k-length permanently safe command sequence s_i ∈ U^k is one where the state is admissible at every step j ≥ i, where (u_j, s_{j+1}) = dm_step(s_j) and x_{j+1} = f(x_j, u_j, w_j), for every choice of disturbance w_j ∈ W.

That is, the system state will remain admissible when applying each command in the sequence s_i and then repeatedly using the last command forever, according to the semantics of dm_step. More general definitions of permanently safe command sequences could be considered, such as repeating a suffix rather than just the last command. For simplicity, we do not explore this here.

We define the notion of recoverable commands as those that result in states that have permanently safe command sequences.

Definition 3 (Recoverable Command).
Given state x_i, a recoverable command u is one where there exists a permanently safe command sequence from x_{i+1}, where x_{i+1} = f(x_i, u, w_i), for every choice of disturbance w_i ∈ W.

Optimal decision modules are defined by requiring that the dm_update function accept all sequences that can guarantee future safety.

Definition 4 (Optimal Decision Module). An optimal decision module has a dm_update function that accepts t at state x if and only if t is a permanently safe command sequence starting from x.

A correct decision module is one that only accepts sequences that guarantee future safety. A correct decision module, by this definition, could reject every command sequence.
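For deterministic dynamics, a decision module can approximate the permanent-safety check of Definition 2 by simulation. The sketch below is ours, not the paper's: it checks the sequence plus a finite tail of the repeated last command, and a real decision module must supplement the finite tail with a system-specific argument that the repeated last command is safe forever (e.g., that the agents have come to rest):

```python
def is_permanently_safe(x, seq, f, admissible, tail_steps=50):
    """Finite-horizon approximation of Definition 2 (deterministic case).

    Simulates the command sequence, then keeps repeating its last command
    for tail_steps more steps, mirroring the dm_step semantics. The finite
    tail is an approximation: a sound decision module needs an argument
    covering the infinite tail."""
    for u in seq:
        x = f(x, u, 0.0)          # apply each command in order
        if not admissible(x):
            return False
    last = seq[-1]
    for _ in range(tail_steps):   # dm_step repeats the last command
        x = f(x, last, 0.0)
        if not admissible(x):
            return False
    return True
```

With a toy scalar plant f(x, u, w) = x + u and admissibility x ≤ 10, a sequence ending in a zero command is accepted, while one ending in a positive command is eventually rejected by the tail check.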
Definition 5 (Correct Decision Module). A correct decision module has a dm_update function that accepts t at state x only if t is a permanently safe command sequence starting from x.

The role of the baseline controller is to try to keep the system safe. An optimal look-ahead baseline controller can be defined as one that always produces a permanently safe command sequence when one exists. This is optimal in the sense that, during the system execution, it will be able to override the advanced controller as late as possible while still guaranteeing safety. This controller is defined with respect to a specific advanced controller ac.

Definition 6 (Optimal Look-Ahead Baseline Controller).
Given state x with u = ac(x), if there exists a permanently safe command sequence s from x with s[0] = u, then an optimal look-ahead baseline controller will always produce a permanently safe command sequence t, with t[0] = u.

Note that t may differ from s, as there can be multiple permanently safe command sequences from the same state.

Theorem 1 (Safety).
Given an initial state x along with an initial permanently safe command sequence s, if the decision module is correct, then the system's execution is safe, regardless of the outputs of the advanced controller ac and baseline controller lbc.

Proof. The command executed at each step comes from the state of the decision module s_i, which maintains the invariant that s_i is always a permanently safe command sequence from the current system state x_i. The dm_update function can only replace a permanently safe command sequence with another permanently safe command sequence. Since s is initially permanently safe, by induction on the step number, the decision module's command sequence at every step is permanently safe, and so the system's execution is safe.

Although safety is important, achieving only safety is trivial, as a decision module can simply reject all new command sequences. A runtime assurance system must also have a transparency property, where the advanced controller retains control in sufficiently well-designed systems.

Theorem 2 (Transparency).
If (i) from every encountered state x_i the output of the advanced controller, ac(x_i) = z_i, is a recoverable command, (ii) the look-ahead baseline controller is optimal, and (iii) the decision module is optimal, then the input command used to actuate the system at every step is the advanced controller's command, z_i.
Proof.
The proof proceeds by stepping through an arbitrary step i of the execution semantics defined in Section 2.3. Since the output of the advanced controller, ac(x_i) = z_i, is assumed to be recoverable, there exists a permanently safe command sequence from x_i that starts with z_i. By the definition of an optimal look-ahead baseline controller, since there exists a permanently safe command sequence, the output lbc(x_i) = t_i must also be a permanently safe command sequence, with t_i[0] = z_i as required by the definition of a look-ahead baseline controller. In step three of the execution semantics, dm_update(x_i, s_i, t_i) = s'_i. Since t_i is a permanently safe command sequence and the decision module is optimal, the command sequence will be accepted by the decision module, and so s'_i = t_i. Step four of the execution semantics produces u_i, the first command in the sequence t_i. As shown before, this command is equal to z_i, which is used in step five of the execution semantics to actuate the system. This reasoning applies at every step, and so the advanced controller's command is always used.

There are several practical considerations with the described approach. For example, the black-box controllers may not only generate logically incorrect commands; they may fail to generate a command at all, for example, by entering an infinite loop. We can account for such behaviors by simply having a default command that is assumed in the execution semantics from Section 2.3. The default command is used if lbc does not produce a timely output. For increased protection, the black-box controllers can be isolated on dedicated hardware [3] so that they do not, for example, crash a shared operating system.

Another concern is that the decision module's analysis of the command sequence is nontrivial and could involve a runtime reachability computation.
If this takes too long to prove safety, the command sequence should also be rejected. This means that the practicality of the architecture depends on the efficiency of reachability methods, an active area of research orthogonal to this work.

Finally, it is probably impractical for many systems to create an optimal decision module or optimal baseline controller, so there is a question of whether Theorem 2 is useful. The proof of this theorem, however, deals with a single step of execution. Thus, if at a specific state the look-ahead baseline controller is able to find a permanently safe command sequence and the decision module is able to validate it, then the system will behave as if the look-ahead baseline controller and decision module were optimal for that step. In other words, as we improve the baseline controller's ability to recover the system and design more efficient reachability methods for the decision module, the architecture will become increasingly transparent.
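The default-command fallback discussed above can be sketched as follows. A thread-based timeout is only illustrative (a thread that never returns cannot be forcibly stopped in Python); as noted in the text, a real deployment would isolate the black-box controllers on dedicated hardware:

```python
# Sketch: query the (black-box) baseline controller with a deadline; if it
# is late or crashes, the decision module keeps its stored backup sequence.
from concurrent.futures import ThreadPoolExecutor

def propose_with_deadline(lbc, x, deadline_s):
    """Return lbc(x), or None if it misses the deadline or raises."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(lbc, x).result(timeout=deadline_s)
    except Exception:          # TimeoutError, controller crash, etc.
        return None
    finally:
        pool.shutdown(wait=False)
```

The decision module then treats a None result like a rejected proposal and falls back on its previously stored safe sequence.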
3 Case Studies

In this section, we consider two case studies: a multi-robot coordination system and a mid-air collision avoidance system for groups of F-16 aircraft. In Theorem 1, we established that the Black-Box Simplex Architecture guarantees the safety of the system. The goal here is to demonstrate that the theory developed is practically applicable for developing complex systems with safety guarantees.

[Figure 2: simulation snapshots with 7 robots at (a) the initial configuration, k = 1; (b) k = 10; (c) k = 11, when the baseline controller fails; and (d) k = 32.]

Fig. 2: Simulation of the MAS with 7 robots. The decision module performs system recovery after the baseline controller produces an unsafe command sequence at k = 11. (Rays extending from the final positions of agents, shown as larger red dots, in the direction of their final velocities intersect; the rays are shown as dotted red lines.) We represent the current positions as red dots, the future positions corresponding to the safe/unsafe command sequences as green/blue dots, the velocities as blue lines, and the trajectories of the agents as grey curves.

We consider a multi-agent system (MAS), indexed by M = {1, ..., n}, of planar robots modeled with discrete-time dynamics of the form:

p_i(k+1) = p_i(k) + dt · v_i(k),    |v_i(k)| < v_max
v_i(k+1) = v_i(k) + dt · a_i(k),    |a_i(k)| < a_max    (4)

where p_i, v_i, a_i ∈ R^2 are the position, velocity, and acceleration of agent i, respectively, at time step k, and dt ∈ R^+ is the time step. The magnitudes of the velocities and accelerations are bounded by v_max and a_max, respectively. The acceleration a_i is the control input for agent i. The combined state of all agents is denoted x = [p_1^T, v_1^T, ..., p_n^T, v_n^T]^T, and their accelerations are a = [a_1^T, ..., a_n^T]^T.

In the initial configuration, the agents are equally spaced on the boundary of a circle and are at rest.
Agent i's goal is to reach a target location r_i, located on the opposite side of the circle. The initial configuration of the MAS is shown in Figure 2(a), where the agents and their target locations are represented as red dots and blue crosses, respectively.

The safety property is the absence of inter-agent collisions. A pair of agents is considered to collide if the Euclidean distance between them is less than a threshold d_min. Thus, the safety property is that ∥p_i − p_j∥ ≥ d_min for all i, j ∈ M with i ≠ j.

Both the advanced controller and the baseline controller are designed using centralized model predictive control (MPC), which produces command sequences as part of the solution of an optimization problem. The advanced controller only outputs the first command of the command sequence, whereas the baseline controller produces the full command sequence. Note that numerical methods for global nonlinear optimization do not provide a guaranteed optimal solution and may fail. For this reason, such controllers could not be used as the baseline controller in the traditional Simplex Architecture, and should not be used directly when safety is important. Both the advanced controller and the baseline controller are high-level controllers that produce accelerations. In our simulations, we do not model the low-level controller and have the plant dynamics work directly with the accelerations. When implementing this on physical robots, depending on the dynamics, a trusted low-level controller will appropriately map the desired acceleration commands to the actuator inputs.

An MPC controller produces a command sequence s of length T, where T is the prediction horizon, and each command s[i] contains the accelerations for all agents to use at step i.
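The double-integrator dynamics of Eq. (4) and the pairwise-separation check can be sketched as follows. The norm-clipping used here to enforce the |v| and |a| bounds is our own modeling choice; the paper only states the bounds themselves:

```python
# Eq. (4) dynamics and the inter-agent distance check, vectorized over agents.
import numpy as np

def clip_norm(x, bound):
    """Scale rows of x so their Euclidean norms are at most `bound`."""
    n = np.linalg.norm(x, axis=-1, keepdims=True)
    return x * np.minimum(1.0, bound / np.maximum(n, 1e-12))

def step_agents(p, v, a, dt, v_max, a_max):
    """One step of Eq. (4); p, v, a are (n, 2) arrays."""
    a = clip_norm(a, a_max)
    p_next = p + dt * v                   # p_i(k+1) = p_i(k) + dt * v_i(k)
    v_next = clip_norm(v + dt * a, v_max)  # v_i(k+1) = v_i(k) + dt * a_i(k)
    return p_next, v_next

def min_pairwise_distance(p):
    """Smallest Euclidean distance between distinct agents, for checking
    the safety property ||p_i - p_j|| >= d_min."""
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    return d[~np.eye(len(p), dtype=bool)].min()
```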
The centralized MPC controller solves the following optimization problem at each time step k:

argmin over a(k|k), ..., a(k+T−1|k) of  Σ_{t=0}^{T−1} J(k+t|k) + λ · Σ_{t=0}^{T−1} ∥a(k+t|k)∥²    (5)

where a(k+t|k) and J(k+t|k) are the predictions made at time step k for the values at time step k+t of the accelerations and the centralized (global) cost function J, respectively. The first term is the sum of the centralized cost function, evaluated for T time steps starting at time step k; it encodes the control objective. The second term, scaled by a weight λ > 0, penalizes large control inputs.
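A minimal sketch of the horizon cost of Eq. (5) and its numerical minimization follows. The paper solves this with MATLAB's fmincon; here a crude finite-difference descent with backtracking stands in for that solver. Like fmincon on a nonconvex problem, it carries no global-optimality guarantee, which is exactly why such a controller cannot serve as a traditional Simplex baseline controller:

```python
import numpy as np

def mpc_cost(a_seq, p0, v0, dt, lam, stage_cost):
    """Eq. (5): sum of stage costs J plus lam * sum ||a||^2.
    a_seq has shape (T, n, 2); stage_cost maps positions to a scalar J."""
    p, v, total = p0, v0, 0.0
    for a in a_seq:
        p = p + dt * v
        v = v + dt * a
        total += stage_cost(p) + lam * np.sum(a ** 2)
    return total

def mpc_plan(p0, v0, T, dt, lam, stage_cost, iters=50, step0=1.0):
    """Minimize mpc_cost over the T-step acceleration sequence."""
    a = np.zeros((T, p0.shape[0], 2))
    cost = mpc_cost(a, p0, v0, dt, lam, stage_cost)
    eps = 1e-5
    for _ in range(iters):
        grad = np.zeros_like(a)
        for idx in np.ndindex(*a.shape):      # finite-difference gradient
            da = np.zeros_like(a)
            da[idx] = eps
            grad[idx] = (mpc_cost(a + da, p0, v0, dt, lam, stage_cost) - cost) / eps
        step = step0
        while step > 1e-8:                    # backtracking line search
            cand = a - step * grad
            c = mpc_cost(cand, p0, v0, dt, lam, stage_cost)
            if c < cost:
                a, cost = cand, c
                break
            step /= 2.0
        else:
            break                             # no descent found; stop
    return a, cost
```

Plugging in a J_ac-style stage cost yields the advanced controller's plan, of which only the first command is used, while a J_bc-style cost yields the baseline controller's full sequence.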
Advanced controller
The centralized cost function J_ac for the advanced controller contains two terms: (1) a separation term based on the inverse of the squared distance between each pair of agents; and (2) a target-seeking term based on the distance between each agent and its target location:

J_ac = ω_s Σ_{i>j} 1/∥p_i − p_j∥² + ω_t Σ_i ∥p_i − r_i∥²    (6)

where ω_s, ω_t ∈ R are the weights of the separation and target-seeking terms. The separation term promotes inter-agent spacing but does not guarantee collision avoidance. The first command in the command sequence produced by the MPC optimization is the advanced controller's command, and it is passed to the look-ahead baseline controller. The command sequence is the solution of the optimization in Eq. 5, with J replaced by J_ac.

Baseline controller
The centralized cost function J_bc for the baseline controller contains two terms. As in Eq. 6, the first term is the separation term. The second term is a divergence term, which forces the agents to move out of the circle by aligning their velocities with rays pointing radially out of the center of the circle:

J_bc = ω_s Σ_{i>j} 1/∥p_i − p_j∥² + ω_d Σ_i (1 − ((p_i − c) · v_i)/(∥p_i − c∥ ∥v_i∥))    (7)

where ω_s, ω_d ∈ R are the weights of the separation and divergence terms, and c is the center of the circle containing the initial configurations of the robots and their target locations. The control law for the baseline controller is Eq. 5, with J replaced by J_bc. A zero acceleration is appended at the end of the baseline controller's command sequence for ease in establishing collision freedom for all future time steps.

Decision module
The look-ahead baseline controller combines accelerations from the advanced controller and the baseline controller, producing the command sequence t = [ac(x), bc(x')], where x' is the next state after executing ac(x) in state x. The function dm_update(x, s, t) accepts the proposed command sequence t if and only if t is a permanently safe command sequence. For this system, a command sequence t is considered permanently safe in a state x if it satisfies the following two conditions. First, for all states in the state trajectory obtained by executing t from x, the Euclidean distance between every pair of distinct agents is at least d_min. Second, in the final state, for all pairs of distinct agents, the rays extending from their positions in the direction of their velocities do not intersect. Any pair of agents that satisfies the second condition will never collide in the future, since the last command in the sequence t has zero acceleration. The initial permanently safe command sequence is a zero acceleration for all agents, as the agents start at rest.

We first consider seven robotic agents initialized on a circle centered at the origin, with a radius of 10. The other parameters are: dt = 0.1 sec, d_min = 1.7, a_max = 1.5, and v_max = 2. The length of the prediction horizon is T_ac = T_bc = 10. The optimization problems for the MPC controllers are solved using the MATLAB fmincon function. The random seed in the initialization of fmincon causes nondeterminism in the solution.

Successful Recovery After Failure
In this experiment, we use seven robotic agents initially positioned on a circle of radius 10, as shown in Figure 2(a). At k = 11, the baseline controller produces an unsafe command sequence. The state trajectory corresponding to the unsafe sequence is shown in blue. As shown in Figure 2(c), the final velocities of the two agents corresponding to the larger red dots are converging after simulating the current state with the unsafe sequence. Hence, at k = 11, the decision module rejects the proposed command sequence and shifts control to the previously stored safe command sequence, which safely recovers the system. The last command in the stored command sequence is a zero acceleration and is repeated forever. Here, we purposefully did not return control to the advanced controller. A video of the simulation is available online (https://streamable.com/yoltx4).

[Figure 3: (a) configuration in which the agents are closest, k = 18; (b) final configuration, k = 36; (c) stress test with 12 agents.]

Fig. 3: (a,b) Simulation of the robotic MAS with 7 robots. The advanced controller safely brings the robots to their target locations. (c) Stress test of the robotic MAS with 12 robots. The agents safely reach their target locations. The trajectory segments where stored command sequences are used are shown in blue.
Target Locations Successfully Reached
In this experiment, we start from the same initial configuration as in Figure 2(a). The outcome of this experiment, shown in Figures 3(a) and 3(b), is different from the one in Figure 2, even though both start from the same initial configuration. The reason is that the local optimizer for the MPC controllers used a different random seed, and in this case the optimizer successfully found a permanently safe command sequence at all steps. The closest any pair of agents get is at k = 18; see Figure 3(a), where the minimum pairwise distance remains above d_min = 1.7. A video of the simulation is available online. This demonstrates the transparency of Black-Box Simplex, as the safe outputs of the advanced controller are always used.
Reverse Switching Scenario
We stress-test the multi-robot system by initializing 12 agents on a circle of radius 10. The path of the agents is shown in Figure 3(c). During the entire simulation, there are 10 instances where the decision module rejects the proposed command sequence and instead uses the previously stored safe command sequence until the next safe command sequence is produced. All agents reach their target locations without colliding; the minimum separation between any pair of agents is 1.724. A video of the simulation is available online.

Handling Uncertainty
Up to this point, the MAS considered has no uncertainty in the states or dynamics. We next investigate the decision module's runtime overhead when uncertainty needs to be taken into account. For this, we consider two types of uncertainty: sensor uncertainty and dynamics uncertainty. The first case arises when the sensors used to determine the positions and velocities can have error. The second case could be used to account for modeling errors, through disturbances on the positions and velocities at each step.

https://streamable.com/1c8th8
https://streamable.com/h9gv09

Fig. 4: Zonotope reachability computes future states with uncertainty. (a) Reachable states with sensor error. (b) Reachable states with disturbances.

We continue to use the same MPC strategy as before; thus, the controllers ignore the uncertainty when generating proposed command sequences. Only the logic used by the decision module to accept or reject command sequences needs to be adjusted to account for uncertainty. We examine the scenario shown before in Figure 2(b). To account for the uncertainty, we perform an online reachability computation instead of a simulation. To do this, we use efficient methods for reachability of linear systems based on zonotopes [13], which we implement in Python. In this case, each agent has four state variables, two for position and two for velocity. The composed system with seven agents has 28 variables.

In the first case, shown in Figure 4(a), the current state is assumed to have uncertainty independently in both position and velocity with an L norm of 0.1. We use a 16-sided polygon to bound this uncertainty. In the plot, the deterministic simulation is given, along with black polygons for each agent that show the states reachable at each step due to the sensor uncertainty.
The uncertainty in the velocity causes the set to expand over time, since the open-loop command sequence produced by the controller does not attempt to compensate for the uncertainty. The zonotope representation of the composed system uses 112 generator vectors to represent the set of states at each time step, which affects the method's runtime.

In the second case, shown in Figure 4(b), the initial state has very little error, but the dynamics is changed to have disturbances at each step. For each agent's position and velocity, we allow an external disturbance value to be added in the range [−. , .].

To measure runtime, we used a standard laptop with a 2.70 GHz Intel Xeon E-2176M CPU and 32 GB RAM, running Ubuntu 20.04. The method is extremely fast. For the case of sensor uncertainty, computing the box bounds of the reachable set at all the steps takes about 1.5 milliseconds. With uncertainty, even though the number of generators grows over time, it is not large enough to significantly affect the runtime. The computation with disturbances requires about 2 milliseconds to complete. Although this computation would be repeated at each step, we believe such execution times are sufficiently fast for use in the decision module logic.

One issue that arises with disturbances is that proving command sequences are permanently safe is more difficult. In both cases shown, since the position uncertainty grows over time, the agents could potentially collide if we simulate far enough into the future. There are two ways to overcome this. One is to adopt a different definition of a permanently safe command sequence, such as simply saying the agents are considered safe when we can guarantee some large separation distance. An alternative solution is to pair a closed-loop low-level controller with the plant, and use the output of model-predictive control to generate waypoints for this controller, rather than a sequence of open-loop accelerations.
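The zonotope-based reachability used by the decision module in the sensor-uncertainty case can be sketched in a few lines. Here a single double-integrator agent with a box initial-uncertainty set stands in for the 28-variable composed system and the 16-sided polygon bound used in the paper; all constants are illustrative.

```python
import numpy as np

DT = 1.0
# one double-integrator agent: state [px, py, vx, vy] (the 7-agent,
# 28-variable composed system in the text is block-diagonal in these)
A = np.array([[1.0, 0.0, DT, 0.0],
              [0.0, 1.0, 0.0, DT],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[0.0, 0.0], [0.0, 0.0], [DT, 0.0], [0.0, DT]])

def zono_step(c, G, u, W=None):
    """One step of zonotope reachability: affine map of the center and
    generators; a disturbance zonotope W (generator columns) is added
    by Minkowski sum, growing the generator count over time."""
    c = A @ c + B @ u
    G = A @ G
    if W is not None:
        G = np.hstack([G, W])
    return c, G

def box_bounds(c, G):
    """Interval hull of a zonotope: center +/- row-wise sum of |G|."""
    r = np.abs(G).sum(axis=1)
    return c - r, c + r

# initial state with a +/-0.1 box of sensor uncertainty (a box stands
# in for the 16-sided polygon bound used in the text)
c = np.array([0.0, 0.0, 1.0, 0.0])
G = 0.1 * np.eye(4)
for u in [np.zeros(2)] * 10:  # open-loop command sequence
    c, G = zono_step(c, G, u)
lo, hi = box_bounds(c, G)
# the velocity uncertainty makes the position interval grow over time
```

Passing a disturbance zonotope `W` at each step models the second case; its generators accumulate, which is why the generator count grows over time as noted above.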
Our second evaluation system involves guaranteeing collision avoidance for groups of aircraft. We use a six-degrees-of-freedom F-16 simulation model [15], based on dynamics taken from an Aerospace Engineering textbook [36]. Each aircraft is modeled with 16 state variables, including positional states, positional velocities, rotational states, rotational velocities, an engine thrust lag term, and integrator states for the low-level controllers. These controllers actuate the system using the typical aircraft control surfaces (the ailerons, elevators, and rudder), as well as by setting the engine thrust. The system evolves continuously with piecewise nonlinear differential equations, where the function that computes the derivative given the state is provided as Python code. In order to match the discrete-time plant model in Definition 1, we periodically select a control strategy with a frequency of once every two seconds. The model further includes high-level autopilot logic for waypoint following, which we reuse in the advanced controller.

For the collision-avoidance baseline controller, we build upon the ACASXu system designed for unmanned aircraft [19]. While the original system was designed using a partially observable Markov decision process (POMDP), the resultant controller was encoded in a large look-up table that used hundreds of gigabytes of storage [16]. To make the system more practical, a downsampling process followed by a lossy compression using neural networks was used [17,16]. We use these downsampled neural networks for collision avoidance.

The ACASXu system issues horizontal turn advisories based on the relative positions of two aircraft, an ownship and an intruder. The system is similar to Simplex, where the output can be either clear-of-conflict, where any command is allowed, or an override command that is one of weak-left, weak-right, strong-left, or strong-right.
We adapt this system to the multi-aircraft case by having each aircraft run an instance of ACASXu against every other aircraft. At each decision point, the ownship will use the advisory from the closest intruder aircraft that commanded a turn, if any, only producing clear-of-conflict if all outputs are clear-of-conflict. To create command sequences, we advance the plant model and re-run ACASXu from the future state multiple times in a closed-loop fashion.

As with the multi-robot scenario, we examine cases where the initial aircraft state x has all aircraft starting evenly spaced, facing towards the center of a circle with a given initial diameter. Each aircraft has an initial velocity of 807 ft/sec and an initial altitude of 1000 ft, both of which are maintained throughout the maneuver by the controllers.

The advanced controller commands each aircraft to fly towards a waypoint past the opposite side of the circle. The safety property requires maintaining horizontal separation. The near mid-air collision cylinder (NMAC) defines the minimum acceptable separation to be 500 ft [22], although we will consider various safety distances in our evaluation. If only the advanced controller is used, all aircraft fly straight, so they collide in the center.

In addition to the advanced controller being unsafe, the baseline ACASXu controller should not be fully trusted, for many reasons. The original POMDP formulation was not proven formally correct, not to mention the downsampling and lossy neural network compression. While some research has examined proving open-loop properties for the ACASXu neural networks [17,5], these do not imply closed-loop collision avoidance. Further, we use a multi-aircraft adaptation of the system, which could also lead to problems.
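The closest-intruder advisory selection described above can be sketched as follows; `acas_pairwise` is an assumed stand-in interface for the two-aircraft ACASXu network, and the distance-based fake used in the demo is purely illustrative.

```python
import math

CLEAR = "clear-of-conflict"

def own_command(own_pos, intruder_positions, acas_pairwise):
    """Multi-aircraft adaptation: run pairwise ACASXu against every
    intruder and obey the advisory of the closest intruder that
    commands a turn; otherwise output clear-of-conflict."""
    turning = []
    for intr in intruder_positions:
        advisory = acas_pairwise(own_pos, intr)
        if advisory != CLEAR:
            turning.append((math.dist(own_pos, intr), advisory))
    return min(turning)[1] if turning else CLEAR

# purely illustrative stand-in for the neural-network advisories
def fake_acas(own, intr):
    d = math.dist(own, intr)
    if d < 2.0:
        return "strong-left"
    if d < 5.0:
        return "weak-left"
    return CLEAR

advisory = own_command((0.0, 0.0),
                       [(1.0, 0.0), (3.0, 0.0), (10.0, 0.0)],
                       fake_acas)
```

With the fake advisories above, the closest intruder at distance 1 commands the turn, so the ownship follows its strong-left advisory even though a farther intruder only requested weak-left.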
Finally, the intended system response to the ACASXu outputs is that weak-left and weak-right should cause turning at 1.5 degrees per second, whereas strong-left and strong-right turn at 3.0 degrees per second [16]. However, turning an aircraft in the F-16 model, as well as in the real world, is not an instantaneous process, and requires first performing a roll maneuver before the heading angle begins to change. For these reasons, the baseline controller in this scenario is also an unverified component, and we will show scenarios where it misbehaves. Nonetheless, we will compose the incorrect advanced controller with the incorrect baseline controller to create a correct multi-aircraft collision avoidance system by using Black-Box Simplex.

For the initial permanently safe command sequence s, we have each aircraft fly in clockwise circles forever, which avoids collisions. To check whether a generated command sequence is permanently safe, the decision module simulates the system and checks that (i) each aircraft's state stays within the model limits (for example, no aircraft enters a stall), (ii) all aircraft obey the safety distance constraint at all times, and (iii) the execution ends in a state where the roll angle of each aircraft has been small (less than 15 degrees) and the distances between all pairs of aircraft have been increasing consecutively for several seconds. Presumably, if all aircraft continue to fly straight and level from such a configuration, their distances would continue to increase and no collisions would occur in the future.

We next elaborate on three scenarios: (i) a three aircraft case, which shows the safety of the system despite unsafe outputs, (ii) a four aircraft case, which shows the increased transparency of Black-Box Simplex, and (iii) a 15 aircraft case, which shows safe navigation of a complex scenario. Appendix A also includes a seven aircraft case which shows that the safety condition can be easily customized.

Fig. 5: Black-Box Simplex is safe. In the three aircraft case the multi-ACASXu system fails, whereas Black-Box Simplex maintains the 1500 ft collision distance.
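A sketch of the decision module's three-part acceptance check might look like the following; the trajectory representation, the model-limits callback, and the number of steps standing in for "several seconds" of increasing separation are all assumptions for illustration.

```python
import numpy as np

SAFETY_DIST = 1500.0  # ft, as used in the decision module
MAX_ROLL_DEG = 15.0
DIVERGE_STEPS = 3     # stand-in for "several seconds" of divergence

def min_pairwise_dist(positions):
    n = len(positions)
    return min(np.linalg.norm(positions[i] - positions[j])
               for i in range(n) for j in range(i + 1, n))

def acceptance_check(trajectory, within_model_limits):
    """Three-part check on a simulated trajectory (a list of per-step
    dicts holding horizontal positions and roll angles, an assumed
    representation): (i) model limits hold at every step, (ii) the
    safety distance holds at every step, (iii) the run ends with small
    roll angles and several consecutive steps of growing separation."""
    if not all(within_model_limits(s) for s in trajectory):
        return False                                    # (i)
    dists = [min_pairwise_dist(s["pos"]) for s in trajectory]
    if min(dists) < SAFETY_DIST:
        return False                                    # (ii)
    if np.any(np.abs(trajectory[-1]["roll"]) > MAX_ROLL_DEG):
        return False                                    # (iii) level
    tail = dists[-(DIVERGE_STEPS + 1):]
    return all(a < b for a, b in zip(tail, tail[1:]))   # (iii) diverging

def make_traj(dists):
    return [{"pos": np.array([[0.0, 0.0], [d, 0.0]]),
             "roll": np.zeros(2)} for d in dists]

diverging = make_traj([2000.0, 2100.0, 2200.0, 2300.0, 2400.0])
too_close = make_traj([2000.0, 1000.0, 2000.0, 2100.0, 2200.0])
```

The diverging trajectory passes all three checks, while the second is rejected because its minimum separation dips below the safety distance.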
Three Aircraft Scenario
The ACASXu collision avoidance system was designed with two aircraft in mind, an ownship and an intruder. We adapted it to the multi-aircraft case, but this mismatch between the system design assumptions and the usage scenario can lead to problems. While the system does usually avoid collisions, it is not difficult to find cases where it fails, especially when there are more than two aircraft. With three aircraft, we could find cases where the collision distance property was violated.

In all the plots in this section, we show snapshots in time where the distance between aircraft is minimized for a particular scenario. The two red aircraft in each image are the closest pair, and their distance is printed in the bottom right of each figure. The solid line shows the historic path of each aircraft, and the dotted line is the future trajectory.

In Figure 5, we show a three-aircraft scenario, where the initial circle diameter is 90,000 ft. Using the original ACASXu system, where a clear-of-conflict output runs the waypoint-follower logic, is unsafe. Figure 5(a) shows this case, where the distance between the top two aircraft is 175 ft, violating the near mid-air collision safety distance. The other two subplots show the system using Black-Box Simplex with a safety distance of 1500 ft, and the minimum separation is 1602 ft, which satisfies the constraint.
Four Aircraft Scenario
Figure 6 shows a four-aircraft scenario using an initial circle diameter of 70,000 ft. In this case, both designs have safe executions. Using the original ACASXu system leads to a minimum separation of 5342 ft, whereas the minimum separation with Black-Box Simplex is 1600 ft, much closer to the 1500 ft safety distance constraint used in the decision module. Although both systems are safe, from the plots it is clear that the Black-Box Simplex version is more transparent, in the sense that it produces smaller modifications to the direct-line trajectories commanded by the advanced controller.

Fig. 6: Black-Box Simplex is more transparent. For the four aircraft case, ACASXu is significantly more intrusive than Black-Box Simplex, which overrides commands just enough to guarantee the 1500 ft separation requirement. (a) Original ACASXu. (b) Black-Box Simplex.
Fifteen Aircraft Scenario
Finally, we demonstrate the system's ability to safely navigate complex scenarios. For this, we use a 15 aircraft scenario, with an initial circle diameter of 90,000 ft. With 15 aircraft, the composed system has 240 real-valued state variables, each of which evolves according to piecewise nonlinear differential equations. Figure 7 shows the system's behavior. While the original ACASXu system is unsafe, Black-Box Simplex has a minimum separation of 1500.5 ft, just above the 1500 ft safety constraint used in the decision module. Another surprising observation is that in some of the cases, such as this 15 aircraft case and the seven aircraft case shown in Figure 8(b), the aircraft perform something similar to a roundabout maneuver. This is an emergent behavior and not something we explicitly hardcoded or anticipated. A video of this case is also available online.

Discussion
The acceptance condition for dm update checked by the decision module consists of three parts described earlier: (i) the system state stays within model limits, (ii) the horizontal safety distance is maintained at all times, and (iii) the maneuver ends with straight-and-level flight, with all aircraft moving away from each other. However, there are many behaviors we would want to avoid that are not checked by this logic. For example, flying into the ground will not cause a command sequence to be rejected. The system could also command maximum afterburner, which would waste fuel unnecessarily, or actuate the system wildly in ways that would damage a real airframe or cause excessive wear. We could try to avoid these cases by enumerating them and adding appropriate checks in the decision module. However, the possibilities for misbehavior are numerous, and such an approach is likely to miss undesirable cases. Note that this problem is not unique to Black-Box Simplex, but also exists with the original Simplex Architecture. Another way to address this issue is to push it aside: assume that although the controllers are unverified, they are not malicious, and so the decision module can focus on checking the aspects of the controllers that are most critical and most likely to be incorrect.

https://streamable.com/upm0lc

Fig. 7: Black-Box Simplex safely navigates complex scenarios. In the 15-aircraft case, all aircraft cross the circle while maintaining the 1500 ft separation distance. (a) Original ACASXu (failure). (b) Black-Box Simplex. (c) Black-Box Simplex (zoomed in).

A third, perhaps better, way to handle this issue is to reformulate the problem. We can modify the "Plant + Low-Level Controller" box in Black-Box Simplex to also include trusted low-level controllers for the following six actions: (1) fly towards the waypoint, (2) fly straight, (3) turn weak left, (4) turn weak right, (5) turn strong left, and (6) turn strong right.
At each decision point, then, the advanced controller always chooses the first action, whereas the baseline controller's ACASXu system produces an output corresponding to one of the latter five actions, for each aircraft. Creating a reliable controller individually for each of these six actions is much closer to a classical control problem; we could assume such controllers are provided and can maintain altitude and airspeed, and do not fly into the ground or damage the aircraft in other ways. The problem solved by Black-Box Simplex is then about high-level control: how to safely compose these actions to reach the target waypoint while maintaining the safety constraint.

Another limitation of this case study is that it does not have disturbances, so the dynamics are deterministic. Handling disturbances with the same approach as in the previous case study, using online reachability computation, would be difficult for several reasons. First, closed-form differential equations, which are required by most reachability tools, are not provided for this system, as the derivative is computed by Python code. It would be hard to even extract differential equations from the code, as the simulator includes components like multiple look-up tables and conditional branches. Second, the simulation has nonlinear dynamics, for which reachability methods scale more poorly than with linear dynamics. Third, the number of variables can be large, for example 240 real-valued state variables in the 15-aircraft case. Systems of this complexity are beyond what is currently possible with nonlinear reachability tools [12], and even if analysis becomes possible, it is unlikely to run fast enough for use at runtime. Instead, the more feasible approach would again be to rely on control theory to provide performance guarantees for the designed controllers.
If each maneuver resulting from the six actions described earlier could be analyzed for maximum overshoot and steady-state error (typical metrics evaluated during control design), executions of the system could be computed and then bloated by the worst-case distance to obtain a lower bound on the minimum horizontal separation. Used like this, Black-Box Simplex builds upon the vast body of work from control theory to produce safe high-level behaviors from untrusted controllers.

In terms of the runtime of the decision module's simulation logic, we have not yet optimized its computation time in our implementation. Importantly, since each aircraft's dynamics are disjoint, there is significant room for speedup through parallelism, up to the number of aircraft in the system. The existing implementation simulates the dynamics using an adaptive-step explicit Runge-Kutta scheme of order 5(4) from Python's scipy package. On our laptop platform with default accuracy parameters, this runs about 55 times faster than real time per aircraft. As future work, we could consider optimizing runtime by using different numerical integration strategies, depending on factors such as the distance to the safety violation.
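A minimal sketch of this simulation step with scipy's Runge-Kutta 5(4) integrator is below; the F-16 derivative is replaced by a toy linear stand-in, since the real model is nontrivial Python code with look-up tables.

```python
import numpy as np
from scipy.integrate import solve_ivp

def aircraft_derivative(t, x):
    """Toy linear stand-in for the F-16 derivative function; the real
    model is piecewise nonlinear Python code with look-up tables."""
    return -0.1 * x

x0 = np.ones(16)  # 16 state variables per aircraft
# simulate one two-second control period with the adaptive-step
# explicit Runge-Kutta 5(4) scheme named in the text; since each
# aircraft's dynamics are disjoint, one such call per aircraft can
# run in parallel
sol = solve_ivp(aircraft_derivative, (0.0, 2.0), x0, method="RK45")
final_state = sol.y[:, -1]  # state at the end of the period
```

The decision module would chain such calls, re-running the controllers between periods, to produce the trajectories checked by the acceptance condition.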
Related Work

Reachability-based verification methods for black-box systems for waypoint following with uncertainty have recently been investigated in the ReachFlow framework [20]. Unlike Black-Box Simplex, ReachFlow requires closed-form differential equations for the plant model. Further, it is built upon the Flow* nonlinear reachability tool [9], which is unlikely to scale to large complex systems, such as the 240-variable fifteen aircraft scenario we considered.

A framework for safe trajectory planning using MILP for piecewise-linear vehicle models has been considered [32,33]. This method relies on the ability of a model-predictive controller to produce command sequences where the terminal state in the prediction horizon is constrained within a safe invariant set, thus providing a safe back-up command sequence for the next step in case the system fails to find a safe sequence. The system stores a path to the start state (return trajectory), which may be used to get out of loops. The scope of this work is limited to MPC, and it is not clear how to extend it to other applications. Moreover, the conditions for switching back and forth between the stored return trajectory and the MPC are not formalized.

In the recently-proposed Contingency Model Predictive Control framework [1], an MPC controller maintains a contingency plan in addition to the nominal or desired plan to ensure safety during an identified potential emergency. Like Black-Box Simplex, the initial command is common to both plans. The MPC is robust to stochastic disturbances by anticipating and planning in advance for the worst-case events. Due to the coupling of the nominal and contingency plans, the deployed command is continuously impacted by the possibility of the contingency, however unlikely it may be. In contrast, in Black-Box Simplex, besides the advanced controller sharing its command with the baseline controller, both controllers work independently.
Also, the theory here generalizes beyond MPC.

Designing safe switching logic for a given baseline controller is related to the concept of computing viability kernels [30] (closed controlled invariant subsets) in control theory. Like computing reachability, this requires symbolic differential equations and set operations that can be inefficient in high-dimensional spaces, although there is some progress on this [18,21].

Simplex designs have also been considered that use a combination of offline analysis with online reachability [4]. Again, though, reachability computation is currently intractable for large nonlinear systems, and requires symbolic differential equations. Other work has used Simplex to provide safety guarantees for neural network controllers with online retraining [26]. In these approaches, however, the baseline controller must be verified ahead of time.

Online simulation-based methods have also been investigated to secure power grids from insider attacks [23]. As with this work, fast online simulation is critical, although the goal there is system security, not safe high-level control design.

The design of the MPC controllers for our MAS case study is similar to control barrier certificate methods [8,14]. There, a runtime assurance system was used to provide minimally perturbed advanced controller commands, computed using a constrained-optimization problem. However, the optimization problem might become infeasible, or global nonlinear optimization could perform poorly at one of the steps at runtime, causing this method to fail. With Black-Box Simplex, failure of the baseline controller does not compromise safety.

Formal verification methods have also investigated multi-aircraft roundabout maneuvers [29] using differential dynamic logic proof systems [28]. Like that work, we also verify flyable maneuvers in our case study, although we use a simulation model.
Although this introduces numerical simulation error, the system is difficult to analyze otherwise, as the behavior is defined with source code, not differential equations.

The ModelPlex framework [24] generates runtime safety monitors to validate whether the assumptions from offline model verification hold during execution. ModelPlex monitors check whether the observed system execution fits the verified model and, if it does not, initiate the fail-safe actions necessary to avoid safety risks. The goal of ModelPlex is assuring model validity, which is different from this work, and its successful application still requires a verified baseline controller.
Conclusions

We have presented the Black-Box Simplex Architecture, a methodology for constructing safe CPS from unverified black-box high-level controllers. The main tradeoff present in Black-Box Simplex is that the decision module has increased complexity, and for the system to perform smoothly, it must be able to quickly verify command sequences. This itself is not an easy problem. For example, when using Black-Box Simplex with end-to-end machine learning, in addition to runtime verification of command sequences, the decision module would also need to run its own perception logic to make sense of the environment, further increasing the computational burden. With the proposed approach, however, we have reduced the difficult problem of proving high-level safety to a simpler problem of performance optimization of the decision module logic. Black-Box Simplex provides a feasible path for the verification of systems that are otherwise unverifiable in practice.
References
1. Alsterda, J.P., Brown, M., Gerdes, J.C.: Contingency model predictive control for automated vehicles. In: 2019 American Control Conference (ACC), pp. 717–722 (2019). https://doi.org/10.23919/ACC.2019.8815260
2. Althoff, M., Dolan, J.M.: Online verification of automated road vehicles using reachability analysis. IEEE Transactions on Robotics (4) (2014)
3. Bak, S., Chivukula, D.K., Adekunle, O., Sun, M., Caccamo, M., Sha, L.: The system-level simplex architecture for improved real-time embedded system safety. In: 2009 15th IEEE Real-Time and Embedded Technology and Applications Symposium, pp. 99–107. IEEE (2009)
4. Bak, S., Johnson, T.T., Caccamo, M., Sha, L.: Real-time reachability for verified simplex design. In: 35th IEEE Real-Time Systems Symposium (RTSS 2014). IEEE Computer Society, Rome, Italy (Dec 2014)
5. Bak, S., Tran, H.D., Hobbs, K., Johnson, T.T.: Improved geometric path enumeration for verifying ReLU neural networks. In: Proceedings of the 32nd International Conference on Computer Aided Verification (2020)
6. Bechtel, M.G., McEllhiney, E., Kim, M., Yun, H.: DeepPicar: A low-cost deep neural network-based autonomous car. In: 2018 IEEE 24th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), pp. 11–21. IEEE (2018)
7. Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016)
8. Borrmann, U., Wang, L., Ames, A.D., Egerstedt, M.: Control barrier certificates for safe swarm behavior. In: Egerstedt, M., Wardi, Y. (eds.) ADHS. IFAC-PapersOnLine, vol. 48, pp. 68–73. Elsevier (2015)
9. Chen, X., Ábrahám, E., Sankaranarayanan, S.: Flow*: An analyzer for non-linear hybrid systems. In: International Conference on Computer Aided Verification, pp. 258–263. Springer (2013)
10. Clark, M., Koutsoukos, X., Porter, J., Kumar, R., Pappas, G., Sokolsky, O., Lee, I., Pike, L.: A study on run time assurance for complex cyber physical systems. Tech. rep., Air Force Research Laboratory, Aerospace Systems Directorate (2013)
11. Desai, A., Ghosh, S., Seshia, S.A., Shankar, N., Tiwari, A.: SOTER: A runtime assurance framework for programming safe robotics systems. In: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2019, Portland, OR, USA, June 24-27, 2019. IEEE (2019)
12. Geretti, L., Sandretto, J.A.D., Althoff, M., Benet, L., Chapoutot, A., Chen, X., Collins, P., Forets, M., Freire, D., Immler, F., et al.: ARCH-COMP20 category report: Continuous and hybrid systems with nonlinear dynamics. EPiC Series in Computing, 49–75 (2020)
13. Girard, A.: Reachability of uncertain linear systems using zonotopes. In: International Workshop on Hybrid Systems: Computation and Control. Springer (2005)
14. Gurriet, T., Mote, M., Ames, A.D., Feron, E.: An online approach to active set invariance. In: Conference on Decision and Control. IEEE (2018)
15. Heidlauf, P., Collins, A., Bolender, M., Bak, S.: Verification challenges in F-16 ground collision avoidance and other automated maneuvers. In: 5th International Workshop on Applied Verification of Continuous and Hybrid Systems. EPiC Series in Computing, EasyChair (2018)
16. Julian, K.D., Kochenderfer, M.J., Owen, M.P.: Deep neural network compression for aircraft collision avoidance systems. Journal of Guidance, Control, and Dynamics (3), 598–608 (2019)
17. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: International Conference on Computer Aided Verification, pp. 97–117. Springer (2017)
18. Kaynama, S., Maidens, J., Oishi, M., Mitchell, I.M., Dumont, G.A.: Computing the viability kernel using maximal reachable sets. In: Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control, pp. 55–64 (2012)
19. Kochenderfer, M.J., Chryssanthacopoulos, J.: Robust airborne collision avoidance through dynamic programming. Massachusetts Institute of Technology, Lincoln Laboratory, Project Report ATC-371 (2011)
20. Lin, Q., Chen, X., Khurana, A., Dolan, J.: ReachFlow: An online safety assurance framework for waypoint-following of self-driving cars. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2020)
21. Maidens, J.N., Kaynama, S., Mitchell, I.M., Oishi, M.M., Dumont, G.A.: Lagrangian methods for approximating the viability kernel in high-dimensional systems. Automatica (7), 2017–2029 (2013)
22. Marston, M., Baca, G.: ACAS-Xu initial self-separation flight tests. Tech. rep., NASA (2015)
23. Mashima, D., Chen, B., Zhou, T., Rajendran, R., Sikdar, B.: Securing substations through command authentication using on-the-fly simulation of power system dynamics. In: IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (2018)
24. Mitsch, S., Platzer, A.: ModelPlex: Verified runtime validation of verified cyber-physical system models. Formal Methods in System Design (1), 33–74 (2016)
25. Phan, D., Grosu, R., Jansen, N., Paoletti, N., Smolka, S.A., Stoller, S.D.: Neural simplex architecture. In: NASA Formal Methods Symposium (NFM 2020) (2020)
26. Phan, D., Grosu, R., Jansen, N., Paoletti, N., Smolka, S.A., Stoller, S.D.: Neural simplex architecture. In: NASA Formal Methods Symposium (NFM 2020), pp. 97–114. Springer (2020)
27. Phan, D., Yang, J., Grosu, R., Smolka, S.A., Stoller, S.D.: Collision avoidance for mobile robots with limited sensing and limited information about moving obstacles. Formal Methods in System Design (1), 62–86 (2017)
28. Platzer, A.: Differential dynamic logic for hybrid systems. Journal of Automated Reasoning (2), 143–189 (2008)
29. Platzer, A., Clarke, E.M.: Formal verification of curved flight collision avoidance maneuvers: A case study. In: International Symposium on Formal Methods, pp. 547–562. Springer (2009)
30. Saint-Pierre, P.: Approximation of the viability kernel. Applied Mathematics and Optimization (2), 187–209 (1994)
31. Schierman, J., DeVore, M.D., Richards, N., Gandhi, N., Cooper, J., Horneman, K.R., Stoller, S., Smolka, S.: Runtime assurance framework development for highly adaptive flight control systems (2015)
32. Schouwenaars, T., Valenti, M., Feron, E., How, J.: Implementation and flight test results of MILP-based UAV guidance. In: 2005 IEEE Aerospace Conference, pp. 1–13 (2005)
33. Schouwenaars, T.: Safe trajectory planning of autonomous vehicles. Ph.D. thesis, Massachusetts Institute of Technology (2006)
34. Seto, D., Krogh, B., Sha, L., Chutinan, A.: The simplex architecture for safe online control system upgrades. In: Proceedings of the 1998 American Control Conference, ACC (IEEE Cat. No. 98CH36207), vol. 6. IEEE (1998)
35. Sha, L.: Using simplicity to control complexity. IEEE Software (4), 20–28 (2001). https://doi.org/10.1109/MS.2001.936213
36. Stevens, B.L., Lewis, F.L., Johnson, E.N.: Aircraft control and simulation. John Wiley & Sons (2015)

A Seven Aircraft Case - Safety Condition Customization
We investigated a seven aircraft scenario with an initial circle diameter of 70,000 ft. Here, the original ACASXu system violates the horizontal separation constraint, and the minimum separation distance is 277 ft. We run Black-Box Simplex on this system using three different safety distances: 1500 ft, 1000 ft, and 500 ft. All avoid collisions, and as the safety distance is decreased, the observed minimum distance also decreases. This shows that Black-Box Simplex can be easily customized to a change in the safety requirement. Doing this for the original ACASXu system would require significant effort in recomputing the POMDPs and retraining the neural networks to perform a compression of the action tables. Plots of the seven aircraft trajectories are provided in Figure 8, and a video of the 1000 ft case is available online.

Fig. 8: Black-Box Simplex is easily customizable. In the seven aircraft case, adjusting the safety distance in the decision module results in different system behaviors. In each case, the advanced controller command is overridden only enough to guarantee the corresponding safety constraint. (a) Original ACASXu (failure). (b) Black-Box Simplex with safety distance 1500 ft. (c) Black-Box Simplex with safety distance 1000 ft. (d) Black-Box Simplex with safety distance 500 ft.