Programming and Reasoning with Partial Observability
ERIC ATKINSON,
Massachusetts Institute of Technology, USA
MICHAEL CARBIN,
Massachusetts Institute of Technology, USA

Computer programs are increasingly being deployed in partially observable environments. A partially observable environment is an environment whose state is not completely visible to the program, but from which the program receives partial observations. Developers typically deal with partial observability by writing a state estimator that, given observations, attempts to deduce the hidden state of the environment. In safety-critical domains, to formally verify safety properties developers may write an environment model. The model captures the relationship between observations and hidden states and is used to prove the software correct.

In this paper, we present a new methodology for writing and verifying programs in partially observable environments. We present belief programming, a programming methodology where developers write an environment model that the program runtime automatically uses to perform state estimation. A belief program dynamically updates and queries a belief state that captures the possible states the environment could be in. To enable verification, we present Epistemic Hoare Logic that reasons about the possible belief states of a belief program the same way that classical Hoare logic reasons about the possible states of a program. We develop these concepts by defining a semantics and a program logic for a simple core language called BLIMP. In a case study, we show how belief programming could be used to write and verify a controller for the Mars Polar Lander in BLIMP. We present an implementation of BLIMP called CBLIMP and evaluate it to determine the feasibility of belief programming.

CCS Concepts: • Software and its engineering → General programming languages; Semantics; • Theory of computation → Operational semantics; Pre- and post-conditions; Invariants; Modal and temporal logics; Hoare logic; Program reasoning; • Computing methodologies → Reasoning about belief and knowledge.

Additional Key Words and Phrases: programming languages, logic, partial observability, uncertainty
ACM Reference Format:
Eric Atkinson and Michael Carbin. 2020. Programming and Reasoning with Partial Observability. Proc. ACM Program. Lang. 4, OOPSLA, Article 200 (November 2020), 43 pages. https://doi.org/10.1145/3428268
Authors' addresses: Eric Atkinson, Massachusetts Institute of Technology, USA, [email protected]; Michael Carbin, Massachusetts Institute of Technology, USA, [email protected].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
© 2020 Copyright held by the owner/author(s).
2475-1421/2020/11-ART200
https://doi.org/10.1145/3428268

Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 200. Publication date: November 2020.

Computer systems are increasingly deployed in partially observable environments in which the system cannot exactly determine the environment's true state [Russel and Norvig 2020; Smallwood and Sondik 1973]. For example, the software that controls an uncrewed aerial vehicle (UAV) cannot exactly determine the vehicle's true altitude above the ground. Instead, the vehicle's software receives a measurement from a GPS altimeter that estimates the vehicle's altitude.

This measurement or observation reveals only partial information about the environment's true state, such as that the UAV's true altitude is within 25 feet of the reported measurement. The primary challenge for a system deployed in such an environment is therefore that it must leverage the partial information provided by an observation to meet its goals, which, in contrast, are typically expressed in terms of the environment's true state.

Consider, for example, a UAV tasked with avoiding a collision with the ground. Its controller software will include a state estimator that must infer if it is possible that the vehicle may soon have an altitude of 0 given only estimated altitude measurements. If the vehicle is indeed at risk, then the controller must take action to ensure the vehicle's altitude stays strictly positive. However, the discrepancy between the measurements and the vehicle's true altitude introduces the risk that the state estimator's inferences may indicate a strictly positive altitude when the true altitude is in fact 0. Moreover, the controller's reasoning must soundly work with the state estimator's inferences to intervene whenever the true altitude is dangerously near 0.

In safety-critical domains that desire formal guarantees, such as robotics and vehicle navigation, the best-available approach to formally verify the system is to 1) formally specify an environment model: the specific relationship between an observation and the system's true state; 2) implement the state estimator and verify the correctness of its inferences in relation to the environment model; and 3) implement the remainder of the controller and verify that the composition of the environment model and controller meets the system's requirements.
In this paper, we present belief programming, a programming methodology in which the developer writes a program for the controller that includes a specification of the environment model. From that specification, the program runtime automatically provides a state estimator, eliminating the need to manually implement the state estimator and verify its behavior against the environment model.

To instantiate the concepts of this programming methodology, we present a new language, BeLief IMP (BLIMP), a variant of the pedagogical language IMP [Winskel 1993]. BLIMP provides first-class abstractions for environment modeling, for observations, and to interface with the automatically generated state estimator.
Environment Model. The belief programming methodology extends IMP with a nondeterministic choice statement, x = choose(𝑝), that nondeterministically updates the program variable x to a value that satisfies the predicate 𝑝. Unlike a traditional choose statement [Back 1978], the value of x is not immediately observable to the controller. Instead the semantics of x is the set of all possible values that satisfy 𝑝. The programming methodology permits such nondeterministic values to be composed with additional computation to produce a jointly nondeterministic and unobserved set of program variables whose potential values correspond to the partially observable values of the system's physical state.

Observations. To reveal the true value of an unobserved program variable, the controller must explicitly perform an observation. Belief programming extends IMP with an observation statement, observe y, that makes the value of an unobserved variable y visible to the program. If, for example, y is a measurement that is derived from another unobservable value x, such as an altitude measurement derived from the UAV's true altitude, then the true value of y reveals partial information about the true value of x.

State Estimation. The belief programming runtime system dynamically maintains a belief state that captures the set of all possible values of all unobserved variables. The belief programming methodology also extends IMP with an inference statement, infer 𝑝□, that computes a boolean inference over the program's belief state and enables the controller to guide its actions given the validity of a proposition. For example, if ◇(altitude < 1) is true, which states that it is possible that the UAV's altitude is less than 1 foot, then the controller can intervene to avoid a collision.

Together our abstractions for environment modeling, observations, and state estimation enable belief programming to provide runtime capabilities for state estimation that eliminate the need to implement and verify the state estimator itself.

While the belief programming methodology automates the construction of a sound state estimator, a developer must still verify that the controller's actions and state estimator soundly work together to meet the system's requirements. To address this problem, we present the Epistemic Hoare Logic (EHL), a variant of Hoare Logic that supports modal propositions in its assertion logic that model a belief program's dynamically tracked belief state.

EHL includes the modal propositions ◇𝑝, "it is possible that 𝑝 is true", and □𝑝, "it is always the case that 𝑝 is true", that quantify 𝑝 over the set of all possible values of the program's variables as captured by its belief state. These propositions, along with EHL's inference rules, enable a developer to represent the state estimator's inferences as propositions in the logic, e.g. ◇(altitude < 1) meaning "it is possible that the true altitude is less than 1 foot", and also specify and verify the system's requirements, e.g. □(altitude >= 1) meaning "it is always the case that the true altitude is at least 1 foot."

In this paper, we present the following contributions:

• Belief Programming. We introduce belief programming, a programming methodology that makes it possible for the program runtime to automatically provide a state estimator given an environment model specification. Specifically, the program runtime tracks the program's belief state: all possible values of the program's unobservable state.
• Language. We present the syntax and semantics of Belief IMP (BLIMP), a language designed for belief programming. We establish basic properties of BLIMP semantics that should be true of any belief programming language. Namely, we show the state estimator that a BLIMP program provides soundly and precisely captures the environment's true state.
• Epistemic Hoare Logic. We present Epistemic Hoare Logic for verifying properties of BLIMP programs. We show that our logic is sound with respect to BLIMP's semantics.
• Case Study. We present a case study showing how belief programming can be used to develop a verified implementation of the Mars Polar Lander's flight control software. The Mars Polar Lander (MPL) is a lost space probe, hypothesized to have crashed into the surface of Mars during descent due to a control software error [JPL Special Review Board 2000]. We present a controller implemented with belief programming, and formally prove using EHL that it does not have the error that caused the MPL crash.
• Implementation. We evaluate the feasibility of belief programming by presenting an implementation of BLIMP in C called CBLIMP. Our results show that belief programming is feasible for problems in robotics and vehicle navigation domains.

The dual contributions of belief programming and Epistemic Hoare Logic enable developers to more easily program in partially observable environments where correctness is paramount. Such developers must currently hand-write an environment model and a state estimator, and belief programming enables them to omit the state estimator. Epistemic Hoare Logic enables developers to reason about the correctness of the resulting belief program, just as Hoare logic allows them to reason about the correctness of classical hand-written control software.
In this section we show how a developer uses belief programming to implement a controller for a UAV. The controller's objective is to maintain the UAV's altitude at 500 feet above the ground. While the UAV can precisely control its altitude, it has to contend with measurement error from its altitude sensing equipment and wind gusts that can blow it off course.

     1  cmd = 0;
     2  t_max = 10000;
     3  t = 0;
     4  while (t < t_max) {
     5    input obs;
     6    // Controller loop start
     7    if (obs < 475) { cmd = 50 }
     8    else if (obs > 525) { cmd = -50 }
     9    else { cmd = 0 }
    10    // Controller loop end
    11    t = t + 1
    12  }

Fig. 1. An altitude controller for a UAV.

The listing in Figure 1 shows how a developer can write such a controller in a traditional programming language. At every time step, up to a maximum of 10000 steps, the controller receives an altitude observation obs (Line 5). We assume the observation comes from a sensor which has some inherent measurement error, so that the value stored in obs is not precisely the UAV's true altitude. If the observation is sufficiently low, the controller issues a command to climb by 50 feet (Line 7). Conversely, if the observation is sufficiently high, the controller issues a command to descend by 50 feet (Line 8). The conditions on Lines 7 and 8 form a coarse-grained state estimator that determines if the UAV is too high, too low, or at an acceptable altitude. We assume the command is stored in cmd, and that there is an external process that reads cmd and modifies the UAV's altitude by exactly cmd.

The developer needs to ensure the UAV maintains a consistent altitude for safety reasons. We will assume that the developer wants to provide this assurance using formal verification, which requires an environment model of how the true and observed altitude are related. The developer can write such a model as a program that specifies the set of environments the UAV may be in, including the set of values the UAV's true and observed altitude may take on.

     1  alt = 500; cmd = 0;
     2  t_max = 10000; t = 0;
     3  while (t < t_max)
     4  { 450 <= alt && alt <= 550 }
     5  {
     6    alt = choose(alt - 25 <= . && . <= alt + 25);
     7
     8    obs = choose(alt - 25 <= . && . <= alt + 25);
     9
    10    // Controller loop start
    11    ...
    12    // Controller loop end
    13    alt = alt + cmd;
    14    t = t + 1;
    15  }

Fig. 2. Environment model for the UAV altitude controller.

The listing in Figure 2 shows how a developer can write such a model. The model is composed with the controller by inlining the annotated lines in Figure 1 into Line 11 of Figure 2. Note that this replaces the input obs in Figure 1 with the value of obs specified by the model, and we have added a loop invariant on Line 4.

The modeling language includes a nondeterministic assignment operator choose, which takes a predicate and nondeterministically chooses a value that satisfies that predicate. The placeholder . stands for the new value of the variable, so that x = choose(x - 1 <= . && . <= x + 1) picks a new value for x that is within a distance of 1 of its previous value.

At every time step, the model chooses the current altitude alt from a value within a distance of 25 of the previous altitude (Line 6). This models a wind gust causing a change of up to 25 feet of altitude per step. The model then chooses the observation obs from within a distance of 25 of the true altitude (Line 8). This models the altitude instrumentation as having a measurement error of up to 25 feet. After the controller runs, the model alters the altitude by adding to it the resulting command cmd (Line 13).

     1  alt = 500; cmd = 0; t_max = 10000; t = 0;
     2  while (t < t_max)
     3  { □(450 <= alt && alt <= 550) }
     4  {
     5    alt = choose(alt - 25 <= . && . <= alt + 25);
     6    obs = choose(alt - 25 <= . && . <= alt + 25);
     7    observe obs;
     8
     9    infer ◇(alt < 450) { cmd = 50 }
    10    else infer ◇(alt > 550) { cmd = -50 }
    11    else { cmd = 0 };
    12
    13    alt = alt + cmd;
    14    t = t + 1
    15  }

Fig. 3. Implementation of the UAV controller using belief programming in BLIMP.

The condition the developer needs to ensure is given by the loop invariant on Line 4 of Figure 2: 450 <= alt && alt <= 550. This means that the UAV maintains its target altitude of 500 feet within an error margin of 50 feet. The developer can prove that the composition of the environment model and the controller satisfies this condition using classical verification techniques.
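As a rough illustration of what the classical obligation amounts to, the following sketch checks by exhaustive enumeration that one loop iteration of Figure 2, with the controller of Figure 1 inlined, preserves the invariant. It is not the proof technique the text refers to: it assumes altitudes are integers, and the function names (`controller`, `step_outcomes`, `invariant_is_inductive`) are introduced here for illustration only.

```python
def controller(obs):
    """Coarse state estimator of Figure 1: map an observation to a command."""
    if obs < 475:
        return 50
    if obs > 525:
        return -50
    return 0


def invariant(alt):
    """Loop invariant of Figure 2."""
    return 450 <= alt <= 550


def step_outcomes(alt):
    """All altitudes reachable after one loop iteration that starts at alt."""
    outcomes = set()
    for gust_alt in range(alt - 25, alt + 26):           # alt = choose(...)
        for obs in range(gust_alt - 25, gust_alt + 26):  # obs = choose(...)
            outcomes.add(gust_alt + controller(obs))     # alt = alt + cmd
    return outcomes


def invariant_is_inductive():
    """True if every altitude satisfying the invariant leads only to such altitudes."""
    return all(
        invariant(new_alt)
        for alt in range(450, 551)
        for new_alt in step_outcomes(alt)
    )


print(invariant_is_inductive())  # prints True
```

The enumeration covers only the integer altitudes in [450, 550]; a proof in a program logic would establish the same fact symbolically rather than by testing each case.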
We will now explain, alternatively, how the developer implements this program using belief programming. The listing in Figure 3 shows the code to implement the controller in our belief programming language BLIMP. As this program executes, it maintains a belief state of the set of possible environments it could be in. Instead of the conditions over concrete observations in Figure 1, the code in Figure 3 uses conditions over belief states to determine control actions.

To explain how belief programming operators work, we will walk through the execution of the first iteration of the while loop in Figure 3. At the start of the loop, the belief state contains a single environment that has alt = 500, cmd = 0, t_max = 10000, and t = 0.

Choose Statements. The choose statements on Lines 5 and 6 expand the belief state to include all possible environments the nondeterminism could generate. After the first assignment to alt on Line 5, the belief state contains all environments such that alt ∈ [475, 525]. After the assignment to obs on Line 6, the belief state contains all environments such that obs ∈ [450, 550], with the additional constraint that the distance between alt and obs is less than or equal to 25. This means that, for example, the belief state does not contain an environment in which alt and obs differ by more than 25, such as one where alt = 475 and obs = 525.

Observe Statements. The observe statement on Line 7 implicitly receives an input that is the observed altitude obs. It updates the belief state to contain only environments that are consistent with that observed altitude. For example, if the program receives the value 𝑣, then observe modifies the belief state to only contain environments where obs = 𝑣 and, correspondingly, where alt ∈ [max(475, 𝑣 − 25), min(525, 𝑣 + 25)].

Infer Statements. The infer statements on Lines 9-11 branch based on belief state conditions. Conditions use the □ and ◇ modal operators to quantify over environments in the belief state, with □ meaning "for all environments in the belief state" and ◇ meaning "there exists an environment in the belief state". The condition ◇(alt < 450) means that there is an environment in the belief state such that alt is smaller than 450. Similarly ◇(alt > 550) means that there is an environment such that alt is larger than 550. Compared to the state estimator on Lines 7 and 8 of Figure 1, the infer statements provide more intuition for why the controller was constructed this way. If one of these conditions is true, then it is possible for the true environment to be outside of the desired range of 500 ± 50 feet, and immediate action is needed to correct the situation. Since, assuming the example observation 𝑣, our belief state contains only the environments where alt lies within [475, 525], neither of these conditions is true. Thus, the cmd = 0 branch of the infer statements is executed. This constrains every environment in the belief state to include cmd = 0. Note that under the assumption that the observation localizes alt to within 25 feet, at most one condition of the infer statements will be true in any given belief state.

Assignments. The final line of the loop that updates the altitude, Line 13, sets alt to its previous value plus the value of cmd in every environment in the belief state. Because the belief state contains the environments such that cmd = 0 and alt ∈ [max(475, 𝑣 − 25), min(525, 𝑣 + 25)], after the update the belief state contains all environments such that alt lies in that same interval. The loop invariant on Line 3 states that any environment in the belief state must have a value for alt in the range [450, 550]. Because our belief state constrains alt to a smaller range contained in [475, 525], our belief state satisfies the invariant.

The previous section describes a single execution of the belief program given concrete observations, but in general developers need to reason about all potential executions given any observations that are allowed under the environment model. In this section, we show how a developer can use Epistemic Hoare Logic to reason about the potential executions of the belief program in Figure 3. We first present a small example that showcases how Epistemic Hoare Logic can be used to reason about a single statement in the program. We then explain at a high level how similar reasoning can be used to verify the program maintains its altitude within limits, with the details in Appendix C.
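The walkthrough of the first loop iteration can be mimicked with a minimal sketch that represents a belief state as a Python set of environments. This is an illustration only, not the BLIMP runtime or CBLIMP: the finite `DOMAIN`, the helper names, and the observed value 510 are assumptions made here, and the variables t and t_max are omitted because they do not affect the belief state within one iteration.

```python
DOMAIN = range(0, 1001)  # assumed finite set of values a variable may take


def env(**bindings):
    """An environment: an immutable, hashable map from variable names to values."""
    return frozenset(bindings.items())


def choose(belief, var, pred):
    """x = choose(p): each environment yields one successor per value satisfying p."""
    return {
        env(**{**dict(e), var: v})
        for e in belief
        for v in DOMAIN
        if pred(dict(e), v)
    }


def observe(belief, var, value):
    """observe x: keep only the environments consistent with the observed value."""
    return {e for e in belief if dict(e)[var] == value}


def assign(belief, var, expr):
    """x = E: update x in every environment of the belief state."""
    return {env(**{**dict(e), var: expr(dict(e))}) for e in belief}


def possibly(belief, pred):
    """Diamond: some environment in the belief state satisfies pred."""
    return any(pred(dict(e)) for e in belief)


def necessarily(belief, pred):
    """Box: every environment in the belief state satisfies pred."""
    return all(pred(dict(e)) for e in belief)


def within_25(e, v):
    """The predicate alt - 25 <= . && . <= alt + 25, with v standing for '.'."""
    return e["alt"] - 25 <= v <= e["alt"] + 25


def loop_body(belief, observed):
    """One iteration of the loop body of Figure 3 (Lines 5-13)."""
    belief = choose(belief, "alt", within_25)
    belief = choose(belief, "obs", within_25)
    belief = observe(belief, "obs", observed)
    if possibly(belief, lambda e: e["alt"] < 450):
        belief = assign(belief, "cmd", lambda e: 50)
    elif possibly(belief, lambda e: e["alt"] > 550):
        belief = assign(belief, "cmd", lambda e: -50)
    else:
        belief = assign(belief, "cmd", lambda e: 0)
    return assign(belief, "alt", lambda e: e["alt"] + e["cmd"])


final = loop_body({env(alt=500, cmd=0)}, observed=510)  # 510 is a hypothetical input
alts = sorted({dict(e)["alt"] for e in final})
print(alts[0], alts[-1])                                            # prints 485 525
print(necessarily(final, lambda e: 450 <= e["alt"] <= 550))         # prints True
```

Enumerating environments explicitly is only practical for tiny domains; it is used here to make the set-based reading of choose, observe, and infer concrete.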
In this section, we consider proving a simple property about the statement on Line 5 of Figure 3 that specifies altitude updates. We will assume, in accordance with the loop invariant, that the altitude immediately before executing this statement is at least 450. We will then show that after executing this statement, the altitude is at least 425. This property is expressed by the following judgment in Epistemic Hoare Logic:

    𝐷 ⊢ { □(450 <= alt) } alt = choose(alt - 25 <= . && . <= alt + 25) { □(425 <= alt) }

The context 𝐷 indicates that the judgment applies when the statement is executed under deterministic control flow. The pre-condition, □(450 <= alt), states that in any environment in the belief state, the variable alt is at least 450. The post-condition, □(425 <= alt), states that in any environment in the belief state, the variable alt is at least 425.

Epistemic Hoare Logic Rules. To prove this deduction, we first instantiate the Epistemic Hoare Logic rule for choose statements that we have designed. We present the general rule in Figure 10 of Section 4. By default, the rule allows for nondeterministic control flow, but we can specialize it to the 𝐷 context via a subtyping rule. Instantiating the choose rule and subtyping with our original pre-condition yields the following judgment, using the notation 𝑠∗ for the statement on Line 5:

    𝐷 ⊢ { □(450 <= alt) } 𝑠∗ { □(450 <= a) && □(a - 25 <= alt && alt <= a + 25) }

The rule produces a post-condition by first renaming all instances of the assigned variable alt with a fresh variable a in the pre-condition. It then conjoins this with the choose statement's predicate under a □, with alt replaced with a and the placeholder . replaced with alt.

Implications. To rewrite the post-condition into the desired form, we prove that another predicate is implied by the post-condition and use the rule of consequence (formalized in general in Figure 10) to replace the post-condition with the new predicate.

First, we use the general principle that the □ operator commutes with &&. We have formalized this principle as Theorem B.1 in Appendix B. This gives us the following judgment:

    𝐷 ⊢ { □(450 <= alt) } 𝑠∗ { □(450 <= a && a - 25 <= alt && alt <= a + 25) }

We then use the principle of lifting theorems about environments to theorems about belief states. This states that if we have an implication over environments, we can wrap the premise and conclusion of this implication with □, and obtain an implication over belief states. We have formalized this principle as Theorems B.4 and B.5 in Appendix B. In this example, we use the fact that if the environment satisfies 450 <= a && a - 25 <= alt && alt <= a + 25, then it also satisfies 425 <= alt. The principle of lifting says that as a result, if the belief state satisfies □(450 <= a && a - 25 <= alt && alt <= a + 25), then it also satisfies □(425 <= alt). Applying the rule of consequence gives us the original judgment we set out to prove:

    𝐷 ⊢ { □(450 <= alt) } 𝑠∗ { □(425 <= alt) }

The developer would like to ensure that the loop body in Figure 3 maintains an altitude of 500 feet within a tolerance of 50 feet. This means that the loop body must preserve its invariant. This corresponds to the Epistemic Hoare Logic deduction

    𝐷 ⊢ { □(450 <= alt && alt <= 550) } 𝑠 { □(450 <= alt && alt <= 550) }

where the pre- and post-conditions are both equal to the loop invariant on Line 3 of Figure 3. The program 𝑠 is the loop body on Lines 5-14 of Figure 3. Thus, this deduction states that if the loop body starts in a belief state satisfying the loop invariant, it produces a belief state that also satisfies the loop invariant. We consider the case of deterministic control flow, as signified with the 𝐷 context. This is appropriate because the loop condition, t < t_max, is consistently either true or false across all environments in the belief state.

The high-level procedure is to assume the loop invariant as the pre-condition for the loop body. We then use the logic's rules to derive an appropriate post-condition for the loop body, and finally prove that the post-condition implies the loop invariant using the reasoning principles in Appendix B. The details of this process are in Appendix C.

    𝐸 ::= 𝑥 | 𝑐 | 𝐸 + 𝐸 | 𝐸 - 𝐸 | 𝐸 * 𝐸 | 𝐸 / 𝐸 | 𝑦 | .
    𝑃 ::= 𝑏 | 𝐸 < 𝐸 | 𝐸 == 𝐸 | 𝑃 && 𝑃 | 𝑃 || 𝑃 | !𝑃
    𝑃□ ::= □𝑃 | ◇𝑃 | 𝑃□ && 𝑃□ | 𝑃□ || 𝑃□ | !𝑃□
    𝑃∃ ::= ∃𝑦∗. 𝑃□

Fig. 4. Syntax of expressions and propositions.
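The reasoning steps used for the choose statement on Line 5 of Figure 3 (□ commuting with &&, lifting an environment-level implication, and the resulting Hoare triple) can be sanity-checked by brute force on small domains. The sketch below is a test, not a proof, and does not implement the rules of Figure 10: the bounds, the sample predicates, and the function names are assumptions chosen here, and an environment is reduced to a single integer value.

```python
from itertools import combinations


def env_implication_holds(lo=300, hi=700):
    """Check that 450 <= a && a - 25 <= alt && alt <= a + 25 implies 425 <= alt."""
    return all(
        425 <= alt
        for a in range(lo, hi + 1)
        for alt in range(lo, hi + 1)
        if 450 <= a and a - 25 <= alt <= a + 25
    )


def box(belief, pred):
    """Box: pred holds in every environment of the belief state."""
    return all(pred(e) for e in belief)


def box_commutes_with_and(values=range(4)):
    """Check box(p && q) == (box(p) && box(q)) on every belief state over a tiny domain."""
    p = lambda x: x >= 1  # sample predicate, chosen for illustration
    q = lambda x: x <= 2  # sample predicate, chosen for illustration
    envs = list(values)
    for size in range(len(envs) + 1):
        for belief in combinations(envs, size):
            both = box(belief, lambda x: p(x) and q(x))
            if both != (box(belief, p) and box(belief, q)):
                return False
    return True


def choose_alt(belief):
    """Belief-state effect of alt = choose(alt - 25 <= . && . <= alt + 25)."""
    return {new for alt in belief for new in range(alt - 25, alt + 26)}


def triple_holds(lo=450, hi=460):
    """Check {box(450 <= alt)} choose {box(425 <= alt)} on small sample belief states."""
    candidates = range(lo, hi + 1)  # every candidate satisfies the pre-condition
    for size in range(1, 4):
        for belief in combinations(candidates, size):
            if not box(choose_alt(belief), lambda alt: 425 <= alt):
                return False
    return True


print(env_implication_holds(), box_commutes_with_and(), triple_holds())
# prints True True True
```

Such checks can catch a mistaken side condition early, but only the derivation in the logic covers all belief states.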
In this section, we present the belief programming language BLIMP. We first present BLIMP's syntax and semantics, and then state and prove properties of the semantics. We then present an execution model. While the semantics describes both the environment modeling and state estimation behavior of BLIMP programs, the execution model projects out the state estimation operations.
In this section, we present the syntax of BLIMP, which gives the constructs of belief programming.
Figure 4 gives the syntax of expressions and propositions.
Expressions. We use the notation 𝐸 to refer to expressions. An expression may be a variable 𝑥, a numeric constant 𝑐, or formed using one of the binary operators +, -, *, /, which have standard interpretations. An expression may also contain a quantified variable 𝑦 or the placeholder . (a period). Quantified variables only make sense when the variable is bound by an outer quantifier, and the placeholder value only makes sense in the context of an enclosing choose statement.

Propositions. We use the notation 𝑃 to refer to propositions. Propositions may be boolean constants 𝑏 ∈ {true, false} or comparisons between expressions using the comparison operators < and ==. Propositions may also be combined by conjunction, disjunction, and negation through the boolean operators &&, ||, and !, respectively.

Modal Propositions. We use the notation 𝑃□ to refer to modal propositions. Modal propositions are propositions that are modified using the □ and ◇ operators. They quantify over environments in the belief state, and are used to query the belief state for state estimation. Modal propositions may be combined by conjunction, disjunction, and negation, using the same syntax as for non-modal propositions. Note that in contrast to many modal logics, BLIMP's modal operators may only be applied once; hence, propositions such as □□𝑃 are not in the language. This is a design decision we have made to succinctly capture properties of the domain. The alternative would be to allow propositions such as □□𝑃, and include a theorem of the form □□𝑃 ⇒ □𝑃.

Existential Propositions. We use the notation 𝑃∃ to refer to existentially quantified modal propositions. An existential proposition is a modal proposition prepended with the ∃ symbol and a comma-separated list of quantified variables 𝑦∗ (which could potentially be empty). Existential propositions form the core propositional language for reasoning with Epistemic Hoare Logic, and appear in BLIMP in assertions and loop invariants.

Figure 5 presents the syntax of statements. We use the notation 𝑆 to refer to the set of statements, which is specified by the grammar in Figure 5. A statement may be an assignment, a choose statement, an assertion, an observation, an if statement, an infer statement, a while loop, a composition of two statements, or a skip statement.

    𝑆 ::= 𝑥 = 𝐸
        | 𝑥 = choose(𝑃)
        | assert 𝑃∃
        | observe 𝑥
        | if 𝑃 { 𝑆 } else { 𝑆 }
        | infer 𝑃□ { 𝑆 } else { 𝑆 }
        | while 𝑃 { 𝑃∃ } { 𝑆 }
        | 𝑆 ; 𝑆
        | skip

Fig. 5. Syntax of statements.
Assignment and Choose. An assignment statement 𝑥 = 𝐸 and a choose statement 𝑥 = choose(𝑃) both assign to the program variable 𝑥. The assignment statement does so using an expression while the choose statement does so using a proposition that contains the . placeholder value.

Assertions. The keyword assert signifies an assertion statement. An assertion includes an existential proposition that specifies a property that must be true at the statement's program point.

Observation. The keyword observe signifies an observation statement. An observation includes the program variable, 𝑥, to be observed at the statement's program point.

If and Infer Statements. The keywords if and infer signify an if statement and an infer statement, respectively. Both statements select branches based on a condition. The distinction between an if and an infer statement is that an if statement branches on an ordinary proposition, whereas an infer statement branches on a modal proposition. If statements facilitate conditional environment models, whereas infer statements facilitate state estimation by branching on belief state queries.

While Loops. The keyword while signifies a while loop. Such a loop consists of a body, a condition regarding whether to continue or not, and a loop invariant. The loop invariant is an existential proposition that must be true at the start and end of each loop iteration, and may be used to express a safety property that must be true at every iteration of a time-step loop.

Composition and Skip. Statements may be sequentially composed with the ; operator. The skip keyword signifies a skip statement, a null statement that performs no operations.

In this section, we illustrate how belief programming works by presenting the semantics of BLIMP. The semantics precisely specify how BLIMP manipulates belief states with the goal that the belief states always capture all possible behaviors and only capture realizable behaviors.
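One way to see how the grammars of Figures 4 and 5 fit together is to transcribe a fragment of them into an abstract syntax. The sketch below is a hypothetical encoding, not the representation used by CBLIMP: it covers only a subset of the constructs, all class names are invented here, and the runtime check in `Diamond` is just one possible way to express the restriction that modal operators are applied only once.

```python
from dataclasses import dataclass


class Expr: ...    # expressions (E in Figure 4)
class Prop: ...    # non-modal propositions (P in Figure 4)
class Modal: ...   # modal propositions (P-box in Figure 4)
class Stmt: ...    # statements (S in Figure 5)


@dataclass(frozen=True)
class Var(Expr):
    name: str


@dataclass(frozen=True)
class Const(Expr):
    value: int


@dataclass(frozen=True)
class Less(Prop):
    left: Expr
    right: Expr


@dataclass(frozen=True)
class Diamond(Modal):
    body: Prop

    def __post_init__(self):
        # Modal operators may be applied only once, so the body must be non-modal.
        if not isinstance(self.body, Prop):
            raise TypeError("modal operator applied to a non-Prop body")


@dataclass(frozen=True)
class Assign(Stmt):
    target: str
    value: Expr


@dataclass(frozen=True)
class Skip(Stmt):
    pass


@dataclass(frozen=True)
class Infer(Stmt):
    cond: Modal
    then: Stmt
    orelse: Stmt


# infer ◇(alt < 450) { cmd = 50 } else { skip }
example = Infer(
    cond=Diamond(Less(Var("alt"), Const(450))),
    then=Assign("cmd", Const(50)),
    orelse=Skip(),
)
print(example.cond.body.right.value)  # prints 450
```

Keeping `Prop` and `Modal` as separate base classes mirrors the separation of 𝑃 and 𝑃□ in the grammar, so a nested modality such as ◇◇𝑃 is rejected at construction time.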
Environments.
An environment 𝜎 ∈ Σ = ( 𝑋 → C) is a finite map from program variable namesto their numeric values, where each value belongs to a finite subset C of the integers. We use thenotation 𝜎 [ 𝑥 ↦→ 𝑐 ] to mean the environment 𝜎 with the variable 𝑥 mapped to the value 𝑐 . An optional environment 𝜇 ∈ Σ ∪ {·} may be either an environment 𝜎 or a null value · . The use of nulloptional environments is described in more detail in Section 3.2.5. Belief States.
A belief state 𝛽 ∈ P ( Σ ) is an element of the powerset of Σ , i.e. it is a set of envi-ronments. The interpretation is that if an environment 𝜎 is in 𝛽 , then the program believes that 𝜎 is a possibly true environment. Figure 6 presents the semantics of expressions. Our approach is a big-step op-erational semantics that states what value an expression computes when evaluated under a givenenvironment. We use the notation h 𝜎, 𝑒 i ⇓ 𝑐 to mean that the expression 𝑒 evaluated under the en-vironment 𝜎 yields the value 𝑐 . The meaning of a variable 𝑥 is the value of the variable in the input Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 200. Publication date: November 2020. h 𝜎, 𝑥 i ⇓ 𝜎 ( 𝑥 ) h 𝜎, 𝑐 i ⇓ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 + 𝑒 i ⇓ 𝑐 + 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 - 𝑒 i ⇓ 𝑐 − 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 * 𝑒 i ⇓ 𝑐 ∗ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 i ⇓ 𝑐 h 𝜎, 𝑒 / 𝑒 i ⇓ 𝑐 / 𝑐 Fig. 6. Semantics of expressions. We use the notation h 𝜎, 𝑒 i ⇓ 𝑐 to mean that the expression 𝑒 evaluatedunder the environment 𝜎 yields the value 𝑐 . 𝜎 (cid:15) 𝑏 ⇐⇒ 𝑏 = true 𝜎 (cid:15) 𝑝 && 𝑝 ⇐⇒ 𝜎 (cid:15) 𝑝 ∧ 𝜎 (cid:15) 𝑝 𝜎 (cid:15) 𝑝 || 𝑝 ⇐⇒ 𝜎 (cid:15) 𝑝 ∨ 𝜎 (cid:15) 𝑝 𝜎 (cid:15) ! 𝑝 ⇐⇒ 𝜎 𝑝𝜎 (cid:15) 𝑒 < 𝑒 ⇐⇒ there exist 𝑐 , 𝑐 s . t . h 𝜎, 𝑒 i ⇓ 𝑐 ∧ h 𝜎, 𝑒 i ⇓ 𝑐 ∧ 𝑐 < 𝑐 𝜎 (cid:15) 𝑒 == 𝑒 ⇐⇒ there exist 𝑐 , 𝑐 . s . t . h 𝜎, 𝑒 i ⇓ 𝑐 ∧ h 𝜎, 𝑒 i ⇓ 𝑐 ∧ 𝑐 = 𝑐 Fig. 7. Semantics of propositions. We use the notation 𝜎 (cid:15) 𝑝 to mean 𝜎 , which must be an environment,satisfies 𝑝 , a proposition. environment 𝜎 . The meaning of a constant 𝑐 is the value of the constant. The arithmetic operators + , - , * , and / have their standard interpretation. We assume that / denotes integer division. Figure 7 presents the semantics of propositions. The meaning of a proposi-tion is whether or not, for a given environment, the environment satisfies the proposition (i.e. theproposition is true in the environment). 
We use the notation $\sigma \models p$ to mean that the environment $\sigma$ satisfies the proposition $p$. The meaning of an equality (==) or less-than (<) comparison is that an environment satisfies the comparison iff evaluating the expressions yields numbers that satisfy the comparison. Note that the existential quantifiers in the definition are trivial because the expression semantics is a total function of the environment. The meaning of each of the operators &&, ||, and ! is its standard interpretation in propositional logic.

\begin{align*}
\beta \models \Box p &\iff \text{for every } \sigma \in \beta,\ \sigma \models p \\
\beta \models \Diamond p &\iff \text{there is some } \sigma \in \beta \text{ s.t. } \sigma \models p \\
\beta \models \exists.\ p_{\Box} &\iff \beta \models p_{\Box} \\
\beta \models \exists y, \hat{y}'.\ p_{\Box} &\iff \text{there is some } c \text{ s.t. } \beta \models \exists \hat{y}'.\ p_{\Box}[c/y]
\end{align*}
Fig. 8. Semantics of modal and existential propositions. We use the notation $\beta \models p_{\Box}$ and $\beta \models p_{\exists}$ to mean that the belief state $\beta$ satisfies $p_{\Box}$ or $p_{\exists}$.

Figure 8 presents the semantics of modal and existential propositions. The meaning of a modal or existential proposition is whether or not a given belief state satisfies the proposition (i.e. the proposition is true in the belief state). We use the notation $\beta \models p_{\Box}$ and $\beta \models p_{\exists}$ to mean that the belief state $\beta$ satisfies $p_{\Box}$ or $p_{\exists}$. The meaning of $\Box$ is to universally quantify over all environments in the belief state, and the meaning of $\Diamond$ is to existentially quantify over environments in the belief state. The operators &&, ||, and ! (elided in Figure 8) have their standard propositional logic interpretations, which are the same as in Figure 7 with $\sigma$ replaced by $\beta$. The meaning of $\exists$ is its standard meaning in first-order logic. We use the notation $p_{\Box}[c/y]$ to mean the proposition $p_{\Box}$ with $c$ substituted for $y$, and the notation $\exists.\ p_{\Box}$ to mean an existential proposition with an empty set of quantified variables.

Figure 9 presents the semantics of statements. We follow a big-step operational approach, where every statement updates the belief state as well as an optional true environment. While the program execution only updates the belief state, we model programs as simultaneously nondeterministically updating the true environment to provide an easy specification of the set of legal observation inputs. Namely, a legal observation is one that could have come from the true environment. A null true environment signifies that, due to nondeterministic control flow, the true environment took a different branch than the belief state. Observations under a null true environment are illegal. The nondeterminism of the true environment impacts the belief state through the observations, but for fixed observation inputs, the belief state update is deterministic. Every statement produces, given an initial belief state and true environment, a configuration, which is either a new belief state and new optional true environment or an error $\bot$.
We augment the result with an observation list, which documents all observations and is an element of the grammar
\[ O ::= x : c :: O \mid \texttt{nil} \]
In other words, an observation list is a list of associations of variable names to values. We use the notation $\langle \beta, \mu, s \rangle \Downarrow \langle C \mid o \rangle$ to mean that the statement $s$, run from the belief state $\beta$ and optional true environment $\mu$, produces the configuration $C$ augmented with the observation list $o$.

Assignment and Choose.
The meaning of either an assignment or choose statement is a new environment, if an original environment was present, and a new belief state with the statement variable rebound to a new value. In the case of assignment, the value is given by evaluating the expression. In the case of a choose statement, the value must be consistent with the proposition, meaning that if we replace the placeholder . with the new value, the proposition must hold. In both cases, the new belief state is obtained by applying this process to each environment in the initial belief state. A choose statement is nondeterministic with respect to the true environment, but deterministic with respect to the belief state.

Assertions.
The meaning of an assertion, if its predicate is true, is to return the input belief state and environment. If its predicate is false, the assertion returns an error.
Observations.
An observation observe $x$ does not modify the true environment. It does modify the belief state to be consistent with the true environment on the observed variable $x$ by keeping only those environments in the initial belief state that have the same value for $x$ as the true environment. The semantics also specify that the value of $x$ is in the observation list, and that an error occurs if the true environment is null.

If Statements.
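If the if statement's condition is deterministic (i.e., it is either true in all environments in the belief state or false in all environments), then the execution takes the appropriate branch. This is specified by semantic rules that require $\beta \models \Box p$ or $\beta \models \Box(!p)$, where $\beta$ is the initial belief state and $p$ is the if statement's condition.

If the statement's condition is nondeterministic (as specified by requiring $\beta \models \Diamond p \mathbin{\&\&} \Diamond(!p)$), the if statement executes both branches, sending as belief-state input to each the set of environments in which the condition has the appropriate value. It sends the true environment as input to the branch it actually takes and the null environment to the other branch. The resulting belief state is then the union of environments resulting from either branch, and the resulting true environment is from the branch that the initial true environment actually took.

If an if statement's condition is nondeterministic, then neither of its branches can make observations, or else the result is an error. This is because it is unclear what interaction with the true environment means within a branch that environment did not necessarily take.

\[
\frac{\beta' = \{\sigma_\beta[x \mapsto c_\beta] \mid \sigma_\beta \in \beta \land \langle \sigma_\beta, e \rangle \Downarrow c_\beta\}}{\langle \beta, \cdot, x \mathbin{\texttt{=}} e \rangle \Downarrow \langle \beta', \cdot \mid \texttt{nil} \rangle}
\qquad
\frac{\langle \beta, \cdot, x \mathbin{\texttt{=}} e \rangle \Downarrow \langle \beta', \cdot \mid \texttt{nil} \rangle \quad \langle \sigma, e \rangle \Downarrow c}{\langle \beta, \sigma, x \mathbin{\texttt{=}} e \rangle \Downarrow \langle \beta', \sigma[x \mapsto c] \mid \texttt{nil} \rangle}
\]
\[
\frac{\beta' = \{\sigma_\beta[x \mapsto c_\beta] \mid \sigma_\beta \in \beta \land \sigma_\beta \models p[c_\beta / .]\}}{\langle \beta, \cdot, x \mathbin{\texttt{=}} \texttt{choose}(p) \rangle \Downarrow \langle \beta', \cdot \mid \texttt{nil} \rangle}
\qquad
\frac{\langle \beta, \cdot, x \mathbin{\texttt{=}} \texttt{choose}(p) \rangle \Downarrow \langle \beta', \cdot \mid \texttt{nil} \rangle \quad \sigma \models p[c / .]}{\langle \beta, \sigma, x \mathbin{\texttt{=}} \texttt{choose}(p) \rangle \Downarrow \langle \beta', \sigma[x \mapsto c] \mid \texttt{nil} \rangle}
\]
\[
\frac{\beta \models p_{\Box}}{\langle \beta, \mu, \texttt{assert } p_{\Box} \rangle \Downarrow \langle \beta, \mu \mid \texttt{nil} \rangle}
\qquad
\frac{\beta \not\models p_{\Box}}{\langle \beta, \mu, \texttt{assert } p_{\Box} \rangle \Downarrow \langle \bot \mid \texttt{nil} \rangle}
\]
\[
\frac{\beta' = \{\sigma_\beta \mid \sigma_\beta \in \beta \land \sigma_\beta(x) = \sigma(x)\}}{\langle \beta, \sigma, \texttt{observe } x \rangle \Downarrow \langle \beta', \sigma \mid x : \sigma(x) \rangle}
\qquad
\frac{}{\langle \beta, \cdot, \texttt{observe } x \rangle \Downarrow \langle \bot \mid \texttt{nil} \rangle}
\]
\[
\frac{\begin{array}{c}
\beta \models \Diamond p \mathbin{\&\&} \Diamond(!p) \\
\beta_T = \{\sigma_\beta \mid \sigma_\beta \in \beta \land \sigma_\beta \models p\} \qquad \beta_F = \{\sigma_\beta \mid \sigma_\beta \in \beta \land \sigma_\beta \not\models p\} \\
\mu_T = \sigma \text{ if } \mu = \sigma \land \sigma \models p \text{ else } \cdot \qquad \mu_F = \sigma \text{ if } \mu = \sigma \land \sigma \not\models p \text{ else } \cdot \\
\langle \beta_T, \mu_T, s_1 \rangle \Downarrow \langle \beta'_T, \mu'_T \mid \texttt{nil} \rangle \qquad \langle \beta_F, \mu_F, s_2 \rangle \Downarrow \langle \beta'_F, \mu'_F \mid \texttt{nil} \rangle \\
\mu' = \begin{cases} \mu'_T & \mu = \sigma \land \sigma \models p \\ \mu'_F & \mu = \sigma \land \sigma \not\models p \\ \cdot & \text{else} \end{cases}
\end{array}}{\langle \beta, \mu, \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle \beta'_T \cup \beta'_F, \mu' \mid \texttt{nil} \rangle}
\]
\[
\frac{\begin{array}{c}
\beta \models \Diamond p \mathbin{\&\&} \Diamond(!p) \\
\beta_T = \{\sigma_\beta \mid \sigma_\beta \in \beta \land \sigma_\beta \models p\} \qquad \beta_F = \{\sigma_\beta \mid \sigma_\beta \in \beta \land \sigma_\beta \not\models p\} \\
\mu_T = \sigma \text{ if } \mu = \sigma \land \sigma \models p \text{ else } \cdot \qquad \mu_F = \cdot \text{ if } \mu = \sigma \land \sigma \models p \text{ else } \sigma \\
\langle \beta_T, \mu_T, s_1 \rangle \Downarrow \langle \beta'_T, \mu'_T \mid o_T \rangle \qquad \langle \beta_F, \mu_F, s_2 \rangle \Downarrow \langle \beta'_F, \mu'_F \mid o_F \rangle \\
o_T \neq \texttt{nil} \lor o_F \neq \texttt{nil}
\end{array}}{\langle \beta, \mu, \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle \bot \mid \texttt{nil} \rangle}
\]
\[
\frac{\beta \models \Box p \quad \langle \beta, \mu, s_1 \rangle \Downarrow \langle C \mid o \rangle}{\langle \beta, \mu, \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle C \mid o \rangle}
\qquad
\frac{\beta \models \Box(!p) \quad \langle \beta, \mu, s_2 \rangle \Downarrow \langle C \mid o \rangle}{\langle \beta, \mu, \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle C \mid o \rangle}
\]
\[
\frac{\beta \models p_{\Box} \quad \langle \beta, \mu, s_1 \rangle \Downarrow \langle C \mid o \rangle}{\langle \beta, \mu, \texttt{infer } p_{\Box}\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle C \mid o \rangle}
\qquad
\frac{\beta \not\models p_{\Box} \quad \langle \beta, \mu, s_2 \rangle \Downarrow \langle C \mid o \rangle}{\langle \beta, \mu, \texttt{infer } p_{\Box}\ \{ s_1 \} \texttt{ else } \{ s_2 \} \rangle \Downarrow \langle C \mid o \rangle}
\]
\[
\frac{\langle \beta, \mu, \texttt{assert } p^{I}_{\exists};\ \texttt{if}(p)\ \{ s;\ \texttt{while } p\ \{ p^{I}_{\exists} \}\ \{ s \} \} \texttt{ else } \{ \texttt{skip} \} \rangle \Downarrow \langle C \mid o \rangle}{\langle \beta, \mu, \texttt{while } p\ \{ p^{I}_{\exists} \}\ \{ s \} \rangle \Downarrow \langle C \mid o \rangle}
\]
\[
\frac{\langle \beta, \mu, s_1 \rangle \Downarrow \langle \beta', \mu' \mid o_1 \rangle \quad \langle \beta', \mu', s_2 \rangle \Downarrow \langle \beta'', \mu'' \mid o_2 \rangle}{\langle \beta, \mu, s_1; s_2 \rangle \Downarrow \langle \beta'', \mu'' \mid o_1 \mathbin{+\!+} o_2 \rangle}
\qquad
\frac{}{\langle \beta, \mu, \texttt{skip} \rangle \Downarrow \langle \beta, \mu \mid \texttt{nil} \rangle}
\]
Fig. 9. Semantics of statements. We use the notation $\langle \beta, \mu, s \rangle \Downarrow \langle C \mid o \rangle$ to mean that the belief state $\beta$ and optional true environment $\mu$ produce the configuration $C$ augmented with the observation list $o$.

Infer Statements.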
The semantics of infer statements are similar to the semantics of if statements in an ordinary language, where infer operates solely on the belief state. If the belief state satisfies the condition, it evaluates the first branch and otherwise evaluates the second branch.
While Loops.
The semantics of while loops is defined recursively using if statements. This is similar to a standard equivalence notion for while-loop programs (see [Winskel 1993] Section 2.5). We additionally include an assertion that requires the loop invariant to be true.
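The recursive unrolling can be sketched as follows. This is a belief-state-only approximation, not the semantics itself: each environment is reduced to the value of a single variable x, the loop body is assumed never to fail or observe, and `None` stands in for ⊥. The name `run_while` is hypothetical.

```python
def run_while(beta, cond, invariant, body):
    """while p {I} {s}  ==  assert I; if p { s; while p {I} {s} } else { skip }"""
    if not invariant(beta):          # assert I
        return None                  # error: the invariant does not hold
    beta_t = frozenset(x for x in beta if cond(x))
    beta_f = beta - beta_t
    if not beta_t:                   # condition false everywhere: skip
        return beta
    # Run the body and the rest of the loop on the environments where the
    # condition holds; environments where it fails take the skip branch.
    after = run_while(body(beta_t), cond, invariant, body)
    if after is None:
        return None                  # errors propagate
    return after | beta_f

def increment(beta):
    return frozenset(x + 1 for x in beta)

start = frozenset({0, 1, 2})
ok = run_while(start, lambda x: x < 2,
               lambda b: all(x <= 2 for x in b), increment)
bad = run_while(start, lambda x: x < 2,
                lambda b: all(x <= 1 for x in b), increment)
print(ok, bad)
```

With the invariant □(x <= 2) the loop ends in the belief state {2}; with the stronger invariant □(x <= 1) the initial assertion already fails.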
Composition and Skip.
The semantics of statement composition and skips are standard, except that they have been appropriately extended to include the observation list. Sequencing concatenates observation lists using the ++ operator.

Errors.
For clarity, we have elided from Figure 9 the full semantics of how errors propagate. We assume that errors propagate maximally throughout the program, so if at any point the semantics produce $\bot$, the whole program produces $\bot$.

In this section, we establish several properties of the semantics in Figure 9 that we posit any belief programming system should satisfy. These properties constrain the semantics so that the belief state updates respect the true environment updates.

The first property is that beliefs should be sound. This means that for each environment in the belief state, and each new environment and new belief state that are reachable according to the semantics, the new environment is in the new belief state. We formalize this as follows:
Theorem 3.1 (Belief Soundness). If $\sigma \in \beta$ and $\langle \beta, \sigma, s \rangle \Downarrow \langle \beta', \sigma' \mid o \rangle$, then $\sigma' \in \beta'$.

Proof.
By structural induction on derivations of $\Downarrow$. Details are in Appendix A.1. $\Box$

This means that it is impossible for the true environment to lie outside of the belief state.

In addition, beliefs should be precise. This means that for every environment in a new belief state reachable from an initial belief state according to the semantics, there is an environment in the initial belief state from which the new environment is reachable. We formalize this as follows:
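Theorem 3.2 (Belief Precision). If $\langle \beta, \mu, s \rangle \Downarrow \langle \beta', \mu' \mid o \rangle$, then for every $\sigma'_\beta \in \beta'$, there is some $\sigma_\beta \in \beta$ such that $\langle \beta, \sigma_\beta, s \rangle \Downarrow \langle \beta', \sigma'_\beta \mid o \rangle$.

Proof.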
By structural induction on derivations of $\Downarrow$. Details are in Appendix A.2. $\Box$

This means that every environment in the belief state could be the true environment.

Finally, beliefs should be deterministic given observations. This means that the belief state depends on the true environment only through observations. We formalize this as follows:
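Theorem 3.3 (Belief Determinism). If $\langle \beta, \mu_1, s \rangle \Downarrow \langle \beta'_1, \mu'_1 \mid o_1 \rangle$ and $\langle \beta, \mu_2, s \rangle \Downarrow \langle \beta'_2, \mu'_2 \mid o_2 \rangle$ and $o_1 = o_2$, then $\beta'_1 = \beta'_2$.

Proof.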
By structural induction on derivations of $\Downarrow$ using a strengthened induction hypothesis. Details are in Appendix A.3. $\Box$

Here, we discuss how belief state updates execute in the absence of the true environment. This models how the belief program executes. According to Theorem 3.3, the semantics in Figure 9 depends on the true environment $\mu$ only through the observation list $o$. By projecting out the operations on belief states, we can use the semantics in Figure 9 to compute the new belief state using only the initial belief state and the sequence of observations. In other words, if we define the belief execution $\Downarrow_\beta$ as follows
\[
\frac{\langle \beta, \mu, s \rangle \Downarrow \langle \beta', \mu' \mid o \rangle}{\langle \beta, o, s \rangle \Downarrow_\beta \beta'}
\]
then $\Downarrow_\beta$ is a partial function of $\beta$, $o$, and $s$. The function $\Downarrow_\beta$ gives the semantics of the belief program's execution on a concrete sequence of observations $o$.

In this section, we present Epistemic Hoare Logic and sketch a proof of its soundness. Figure 10 gives the inference rules for Epistemic Hoare Logic. Each rule yields a deduction $PC \vdash \{p_\exists\}\ s\ \{p'_\exists\}$, where the context $PC$ is drawn from the grammar $PC ::= N \mid D$. The purpose of the context is to ensure observations do not occur under nondeterministic control flow, as that would result in an error according to the semantics. The deduction $D \vdash \{p_\exists\}\ s\ \{p'_\exists\}$ means that, assuming the statement $s$ is executed under deterministic control flow and terminates, $s$ maps belief states satisfying the pre-condition $p_\exists$ to new belief states satisfying the post-condition $p'_\exists$. The deduction $N \vdash \{p_\exists\}\ s\ \{p'_\exists\}$ means the same thing, but where the enclosing control flow may be nondeterministic.

Assignment and Choose.
Assignment and choose statements conjoin the pre-condition with a new proposition. In the case of assignment, the new proposition encodes that the value of the variable is equal to the result of the expression. In the case of a choose statement, the new proposition encodes that the value of the variable must be consistent with the choose statement's proposition. In either case the previous value of the variable is encoded with a fresh variable.
Assertions.
An assertion has identical pre- and post-condition propositions. In order to apply the rule, the developer must show that the pre-condition implies the asserted proposition.
Observations.
An observation adds a new existentially quantified variable $y_{n+1}$ to the variable list from the pre-condition. The post-condition ensures that the value of the observed variable is deterministic by stating that in every environment it is equal to $y_{n+1}$. Moreover, $y_{n+1}$ is not completely unrestricted; it must satisfy all properties that the observed variable satisfied in the pre-condition. Observations always require that the enclosing control flow is deterministic.

Composition and Skip.
The rules for statement sequencing and skip statements are standard; they are the same as in classical Hoare logic [Floyd 1967; Hoare 1969].
If and Infer.
Both if and infer statements require that both branches satisfy the same post-condition. For if, the pre-condition of each branch includes the statement's condition under the $\Box$ modality. For infer, the pre-condition of each branch includes the statement's condition, which is itself a modal proposition.

\[
\frac{x_0 \text{ fresh in } p_{\Box}, e \qquad p'_{\Box} = p_{\Box}[x_0/x] \qquad e' = e[x_0/x]}{N \vdash \{\exists \hat{y}.\ p_{\Box}\}\ x \mathbin{\texttt{=}} e\ \{\exists \hat{y}.\ p'_{\Box} \mathbin{\&\&} \Box(x \mathbin{==} e')\}}
\qquad
\frac{N \vdash \{p_\exists\}\ s\ \{p'_\exists\}}{D \vdash \{p_\exists\}\ s\ \{p'_\exists\}}
\]
\[
\frac{x_0 \text{ fresh in } p_{\Box}, p \qquad p'_{\Box} = p_{\Box}[x_0/x] \qquad p' = p[x_0/x][x/.]}{N \vdash \{\exists \hat{y}.\ p_{\Box}\}\ x \mathbin{\texttt{=}} \texttt{choose}(p)\ \{\exists \hat{y}.\ p'_{\Box} \mathbin{\&\&} \Box(p')\}}
\qquad
\frac{\forall \beta.\ \beta \models p_\exists \Rightarrow \beta \models p^{a}_{\exists}}{N \vdash \{p_\exists\}\ \texttt{assert } p^{a}_{\exists}\ \{p_\exists\}}
\]
\[
\frac{PC \vdash \{p_\exists\}\ s_1\ \{p'_\exists\} \qquad PC \vdash \{p'_\exists\}\ s_2\ \{p''_\exists\}}{PC \vdash \{p_\exists\}\ s_1; s_2\ \{p''_\exists\}}
\qquad
\frac{}{N \vdash \{p_\exists\}\ \texttt{skip}\ \{p_\exists\}}
\]
\[
\frac{y_{n+1} \text{ fresh in } p_{\Box} \qquad p'_{\Box} = p_{\Box}[y_{n+1}/x]}{D \vdash \{\exists y_1, y_2, \ldots, y_n.\ p_{\Box}\}\ \texttt{observe } x\ \{\exists y_1, y_2, \ldots, y_n, y_{n+1}.\ p'_{\Box} \mathbin{\&\&} \Box(x \mathbin{==} y_{n+1})\}}
\]
\[
\frac{N \vdash \{\exists \hat{y}.\ (\Box p) \mathbin{\&\&} p_{\Box}\}\ s_1\ \{\exists \hat{y}'.\ \Box p'\} \qquad N \vdash \{\exists \hat{y}.\ (\Box\, !p) \mathbin{\&\&} p_{\Box}\}\ s_2\ \{\exists \hat{y}'.\ \Box p'\}}{N \vdash \{\exists \hat{y}.\ p_{\Box}\}\ \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \}\ \{\exists \hat{y}'.\ \Box p'\}}
\]
\[
\frac{\begin{array}{c}
\forall \beta.\ \beta \models \exists \hat{y}.\ p_{\Box} \Rightarrow \beta \models \Box p \mathbin{||} \Box(!p) \\
D \vdash \{\exists \hat{y}.\ (\Box p) \mathbin{\&\&} p_{\Box}\}\ s_1\ \{p'_\exists\} \qquad D \vdash \{\exists \hat{y}.\ (\Box\, !p) \mathbin{\&\&} p_{\Box}\}\ s_2\ \{p'_\exists\}
\end{array}}{D \vdash \{\exists \hat{y}.\ p_{\Box}\}\ \texttt{if } p\ \{ s_1 \} \texttt{ else } \{ s_2 \}\ \{p'_\exists\}}
\]
\[
\frac{PC \vdash \{\exists \hat{y}.\ p^{i}_{\Box} \mathbin{\&\&} p_{\Box}\}\ s_1\ \{p'_\exists\} \qquad PC \vdash \{\exists \hat{y}.\ !(p^{i}_{\Box}) \mathbin{\&\&} p_{\Box}\}\ s_2\ \{p'_\exists\}}{PC \vdash \{\exists \hat{y}.\ p_{\Box}\}\ \texttt{infer } p^{i}_{\Box}\ \{ s_1 \} \texttt{ else } \{ s_2 \}\ \{p'_\exists\}}
\]
\[
\frac{\begin{array}{c}
N \vdash \{p'_\exists\}\ s\ \{p''_\exists\} \qquad \forall \beta.\ \beta \models p_\exists \Rightarrow \beta \models \exists \hat{y}.\ \Box p_I \\
\forall \beta.\ \beta \models \exists \hat{y}.\ \Box p_I \mathbin{\&\&} \Box p \Rightarrow \beta \models p'_\exists \qquad \forall \beta.\ \beta \models p''_\exists \Rightarrow \beta \models \exists \hat{y}.\ \Box p_I
\end{array}}{N \vdash \{p_\exists\}\ \texttt{while } p\ \{ \exists \hat{y}.\ \Box p_I \}\ \{ s \}\ \{\exists \hat{y}.\ \Box(!p \mathbin{\&\&} p_I)\}}
\]
\[
\frac{\begin{array}{c}
D \vdash \{p'_\exists\}\ s\ \{p''_\exists\} \qquad \forall \beta.\ \beta \models \exists \hat{y}.\ p^{I}_{\Box} \Rightarrow \beta \models \Box p \mathbin{||} \Box(!p) \\
\forall \beta.\ \beta \models p_\exists \Rightarrow \beta \models \exists \hat{y}.\ p^{I}_{\Box} \qquad \forall \beta.\ \beta \models \exists \hat{y}.\ p^{I}_{\Box} \mathbin{\&\&} \Box p \Rightarrow \beta \models p'_\exists \\
\forall \beta.\ \beta \models p''_\exists \Rightarrow \beta \models \exists \hat{y}.\ p^{I}_{\Box}
\end{array}}{D \vdash \{p_\exists\}\ \texttt{while } p\ \{ \exists \hat{y}.\ p^{I}_{\Box} \}\ \{ s \}\ \{\exists \hat{y}.\ \Box\, !(p) \mathbin{\&\&} p^{I}_{\Box}\}}
\]
\[
\frac{PC \vdash \{p'_\exists\}\ s\ \{p''_\exists\} \qquad \forall \beta.\ \beta \models p_\exists \Rightarrow \beta \models p'_\exists \qquad \forall \beta.\ \beta \models p''_\exists \Rightarrow \beta \models p'''_\exists}{PC \vdash \{p_\exists\}\ s\ \{p'''_\exists\}}
\]
Fig. 10. Epistemic Hoare Logic rules. We use the notation $PC \vdash \{p_\exists\}\ s\ \{p'_\exists\}$ to mean that in a context $PC$ the statement $s$ maps belief states satisfying $p_\exists$ to new belief states satisfying $p'_\exists$.

Furthermore, the post-condition of an if statement must use the $\Box$ modality, whereas the post-condition of an infer statement may be any existential proposition. A more conventional approach would be to have both branches of the if statement imply the same predicate as the post-condition. For this to be sound, we would need to show that if $\beta_1 \models p_\exists$ and $\beta_2 \models p_\exists$ then $\beta_1 \cup \beta_2 \models p_\exists$, which is in fact false. However, it is true that if $\beta_1 \models \Box p$ and $\beta_2 \models \Box p$, then $\beta_1 \cup \beta_2 \models \Box p$.

To preserve a deterministic context, developers must ensure that the if statement's condition has the same value in all environments.
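The remark that satisfaction is not preserved under union for arbitrary propositions, but is preserved for propositions of the form □p, can be checked on a tiny example. The sketch below is an illustration only: each environment is reduced to the value of one variable x, and the particular proposition □(x == 0) || □(x == 1) is our own choice of counterexample, not one taken from the paper.

```python
from itertools import combinations

def box(beta, prop):
    """beta |= []prop: prop holds in every environment of the belief state."""
    return all(prop(x) for x in beta)

def p_modal(beta):
    """The modal proposition [](x == 0) || [](x == 1)."""
    return box(beta, lambda x: x == 0) or box(beta, lambda x: x == 1)

beta_1, beta_2 = frozenset({0}), frozenset({1})
# Each belief state satisfies the proposition, but their union does not.
union_counterexample = (
    p_modal(beta_1) and p_modal(beta_2) and not p_modal(beta_1 | beta_2)
)

# Exhaustively confirm that [](x <= 1) is closed under union on a small domain.
values = [0, 1, 2]
subsets = [frozenset(c) for r in range(len(values) + 1)
           for c in combinations(values, r)]

def is_small(x):
    return x <= 1

box_closed = all(
    box(a | b, is_small)
    for a in subsets for b in subsets
    if box(a, is_small) and box(b, is_small)
)
print(union_counterexample, box_closed)
```

This is the situation that arises after a nondeterministic if: each branch produces its own belief state, and only a □ post-condition is guaranteed to survive merging them.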
While.
To verify a while loop with Epistemic Hoare Logic, the developer must prove several properties of the initial pre-condition $p_\exists$, the loop invariant $p^{I}_{\exists}$, the loop body's pre-condition $p'_\exists$, and the loop body's post-condition $p''_\exists$.

The developer must show that the pre-condition implies the loop invariant, that the invariant together with the loop condition implies the body's pre-condition, and that the body's post-condition implies the invariant. Implications such as these are a standard feature of proving properties using Hoare rules, although the overall proof rules are sometimes structured differently.

To preserve a deterministic context, developers must also ensure that the while loop's condition has the same value in all environments.

Rule of Consequence.
The rule of consequence states that any proposition that implies the pre-condition may be substituted for the pre-condition. Conversely, any proposition implied by the post-condition may be substituted for the post-condition.
In this section, we formalize and establish the soundness of Epistemic Hoare Logic with respect to the semantics of BLIMP. We start by explaining what supporting lemmas are needed, and then state the main theorem and sketch the proof.
The main theorem depends on a number of lemmas that relate the substitutions in Figure 10 to the environment mapping in Figure 9.

The following lemma gives the required property for expressions. It states that if we evaluate an expression under a new environment with a fresh variable $x_0$ that is a rebinding of $x$, we are free to rename $x$ to $x_0$ in the expression without changing the result.

Lemma 4.1 (Expression Substitution). If $x_0$ is fresh in $e$, then $\langle \sigma, e \rangle \Downarrow c \iff \langle \sigma[x_0 \mapsto \sigma(x)], e \rangle \Downarrow c$ and $\langle \sigma, e \rangle \Downarrow c \iff \langle \sigma[x_0 \mapsto \sigma(x)], e[x_0/x] \rangle \Downarrow c$.

Proof.
By structural induction on expressions. $\Box$
By similar means, we establish analogous properties for propositions, modal propositions, and existential propositions. The full set of lemmas is in Appendices A.4.1-A.4.3.
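The expression substitution lemma can be spot-checked mechanically on a small example. The sketch below is a test on sample inputs, not a proof; it assumes a tuple encoding of expressions, omits division to avoid dividing by zero, and uses the hypothetical name `x0` for the fresh variable.

```python
def evaluate(sigma, e):
    """Big-step evaluation of an expression e under the environment sigma."""
    tag = e[0]
    if tag == "var":
        return sigma[e[1]]
    if tag == "const":
        return e[1]
    left, right = evaluate(sigma, e[1]), evaluate(sigma, e[2])
    return {"+": left + right, "-": left - right, "*": left * right}[tag]

def substitute(e, new, old):
    """e[new/old]: rename every occurrence of the variable old to new."""
    tag = e[0]
    if tag == "var":
        return ("var", new) if e[1] == old else e
    if tag == "const":
        return e
    return (tag, substitute(e[1], new, old), substitute(e[2], new, old))

# e = x * x + y - 3, in which the variable x0 does not occur (x0 is fresh).
e = ("-", ("+", ("*", ("var", "x"), ("var", "x")), ("var", "y")), ("const", 3))

lemma_holds = True
for x in range(-3, 4):
    for y in range(-3, 4):
        sigma = {"x": x, "y": y}
        extended = {**sigma, "x0": sigma["x"]}      # sigma[x0 -> sigma(x)]
        c = evaluate(sigma, e)
        lemma_holds &= evaluate(extended, e) == c
        lemma_holds &= evaluate(extended, substitute(e, "x0", "x")) == c
print(lemma_holds)
```

Both equivalences hold on every sampled environment: adding a binding for a variable that does not occur in the expression changes nothing, and renaming x to a variable bound to the same value changes nothing either.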
The main theorem depends upon a lemma that states that observations cannot happen under nondeterministic control flow. We have formalized this property as follows:
Lemma 4.2 (Observation List Emptiness). If $N \vdash \{p_\exists\}\ s\ \{p'_\exists\}$ and $\langle \beta, \mu, s \rangle \Downarrow \langle C \mid o \rangle$, then $o = \texttt{nil}$.

Proof.
By structural induction on derivations of $\Downarrow$. The case of sequencing uses the fact that nil ++ nil = nil. $\Box$

In this section, we establish the soundness of the logic. Our approach is to show partial correctness, meaning that if the program terminates, its final state is described by the post-condition.

The theorem has two parts. First, we establish the soundness of the logic in the case where control flow may be nondeterministic. The theorem states that if the belief state satisfies the pre-condition, and the semantics produces a configuration, the configuration includes a new belief state that satisfies the post-condition. Second, we establish the soundness of the logic under deterministic control flow. The theorem states that if the initial belief state satisfies the pre-condition, and the true environment is in the belief state, then the program executes without error and the new belief state satisfies the post-condition.
Theorem 4.3 (Logic Soundness).
(1) If $N \vdash \{p_\exists\}\ s\ \{p'_\exists\}$, $\beta \models p_\exists$, and $\langle \beta, \mu, s \rangle \Downarrow \langle C \mid o \rangle$, then $C = (\beta', \mu')$ and $\beta' \models p'_\exists$.
(2) If $D \vdash \{p_\exists\}\ s\ \{p'_\exists\}$, $\sigma \in \beta$, $\beta \models p_\exists$, and $\langle \beta, \sigma, s \rangle \Downarrow \langle C \mid o \rangle$, then $C = (\beta', \sigma')$ and $\beta' \models p'_\exists$.

Proof.
(1) By structural induction on derivations of $\Downarrow$. The cases for assignments and choose statements rely on substitution lemmas. The cases for deterministic if, infer, and while follow from induction hypotheses, although while loops require destructing the semantic rule. For non-deterministic if statements, we show that $\beta_1 \models \exists \hat{y}.\ \Box p \land \beta_2 \models \exists \hat{y}.\ \Box p \Rightarrow \beta_1 \cup \beta_2 \models \exists \hat{y}.\ \Box p$, which follows from standard principles of first-order logic. The details are in Appendix A.4.4.
(2) By structural induction on derivations of $\Downarrow$. The cases except for observe are similar to those above, except that we use Theorem 3.1 to ensure the premises of the induction hypotheses. The case for observe relies on the assumption that $\sigma \in \beta$ to instantiate $y_{n+1}$. The details are in Appendix A.4.5. $\Box$

In this section, we show how belief programming and Epistemic Hoare Logic could be used to implement and verify the control software of the Mars Polar Lander (MPL). The MPL is a lost space probe, hypothesized to have crashed into the surface of Mars during descent due to a control software error [JPL Special Review Board 2000]. We do not claim that belief programming is the first or only technique for preventing the loss of the MPL. However, the notoriety and subsequent investigation of the MPL's loss has resulted in ample documentation [JPL Special Review Board 2000] useful for illustrating in detail how belief programming and Epistemic Hoare Logic work.

The code presented in this section is written in BLIMP, except that for convenience we define two pieces of syntactic sugar. The syntax $p_1$ => $p_2$ desugars to !$p_1$ || $p_2$ and the syntax $x$ = $p$ desugars to if $p$ { $x$ = 1 } else { $x$ = 0 }.

The code in Figure 11 is the piece of MPL's software [JPL Special Review Board 2000] responsible for the final phase of its Martian descent. The code uses a radar altimeter as well as two touch sensors on the landing legs to monitor its progress along the descent. Note that this is a simplification from the original software, which used three touch sensors. The code consists of a state estimator to determine when it reaches the Martian surface, and control code to shut off its engine once it does.

Reading Observations.
On Line 8 the controller reads the value of the radar altimeter into the variable radar_alt. On Lines 10 and 11 the code reads the values of the touchdown sensors into cur_td_1 and cur_td_2. It also stores their previous values in prev_td_1 and prev_td_2.

State Updates.
The block of code from Lines 13-21 sets the state variables state_1 and state_2 based on the values of the touchdown sensors. Specifically, if an individual sensor has indicated a 1 for two iterations in a row, its state is set to 1. Otherwise, its state is set to 0. Notably the annotated lines (Lines 13 and 14) were missing from the original software, meaning that any two positive sensor readings in a row were sufficient to permanently set the state to 1. It is hypothesized that this was part of the sequence of events that caused the MPL to crash, and these two lines are the recommended fix [JPL Special Review Board 2000].

    // Controller Initialization
    prev_td_1 = 0; cur_td_1 = 0;
    prev_td_2 = 0; cur_td_2 = 0;
    health_1 = 1; health_2 = 1;
    engine_enabled = 1; event_enabled = 0;
    while (engine_enabled == 1) {
      // Controller loop start
      input radar_alt;

      prev_td_1 = cur_td_1; input cur_td_1;
      prev_td_2 = cur_td_2; input cur_td_2;

      state_1 = 0; // Missing, probable cause of crash.
      state_2 = 0; // Missing, probable cause of crash.
      if (prev_td_1 == 1 && cur_td_1 == 1) {
        state_1 = 1
      };
      if (prev_td_2 == 1 && cur_td_2 == 1) {
        state_2 = 1
      };

      if ((state_1 == 1 && health_1 == 1) ||
          (state_2 == 1 && health_2 == 1) &&
          event_enabled == 1) {
        engine_enabled = 0;
      }
      // Indicator health check
      if (radar_alt < 40 && event_enabled == 0) {
        if (prev_td_1 == 1 && cur_td_1 == 1) {
          health_1 = 0;
        };
        if (prev_td_2 == 1 && cur_td_2 == 1) {
          health_2 = 0;
        }
        event_enabled = 1;
      }
    }

Fig. 11. Code of the Mars Polar Lander
Engine Shutdown.
The block of code from Lines 22-26 determines when to shut down the engine. If the health check has completed (see below) and at least one of the healthy indicators registers a touchdown, then the program sets engine_enabled to 0, shutting down the engine.

Health Check.
The block of code from Lines 28-36 performs a health check on the touchdown indicators. The health check assigns into the health variables health_1 and health_2, and is performed the first time the radar altimeter indicates an altitude lower than 40 meters. At this point, the lander is off of the ground so both touchdown indicators should read 0. If an indicator reads 1 on both the current and previous time step, it is assumed to be defective, and its health variable is set to 0. After the health check completes, the program sets the flag event_enabled to 1 to allow the touchdown indicators to shut down the engine.

From the above code, we can deduce several sources of error the MPL control software was designed to be robust with respect to:

• Transient False Positives. If a touch sensor is momentarily triggered, the software will not immediately assume the lander has contacted the ground. The software requires two time steps in a row where the sensor was positive before it sets its state variable to true and thereby registers that it has contacted the ground. The assumption here is that the duration of the transient false positive is no longer than one time step.
• Permanent False Positives. If a touch sensor is defective, it may constantly send out an indication that the lander has contacted the ground. This is detected and corrected for by the indicator health check on Lines 28-36. This block of code checks if the touch sensor yields a non-transient positive contact signal when the lander is just under 40 meters above the ground. If so, the code assumes that sensor is defective and ignores its output for the engine shutoff decision. The assumption here is that at most one sensor will be defective.
• Permanent False Negatives. If a touch sensor is defective, it may constantly send out an indication that the lander has not contacted the ground when in fact it has. The above code accounts for this by using two sensors and shutting off the engine when either one indicates a touchdown. The assumption here is that at most one sensor will be defective.
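The effect of the two state-reset lines in Figure 11 can be seen by replaying the controller on a hand-made sensor trace. The Python below is an informal transliteration of the controller loop, not the flight software: the sensor trace is invented for illustration, the state variables are initialized to 0, the shutdown condition is grouped the way the prose describes it (a healthy indicator registers touchdown and the health check has completed), and the function name `run_controller` is hypothetical.

```python
def run_controller(readings, reset_state):
    """Replay (radar_alt, td_1, td_2) readings through the controller loop.

    Returns the radar altitude at which the engine is shut down, or None if
    the engine is still enabled when the readings run out.
    """
    cur_td = [0, 0]
    health = [1, 1]
    state = [0, 0]
    event_enabled = 0
    for radar_alt, td_1, td_2 in readings:
        prev_td, cur_td = cur_td, [td_1, td_2]
        for i in range(2):
            if reset_state:
                state[i] = 0  # the two lines missing from the original code
            if prev_td[i] == 1 and cur_td[i] == 1:
                state[i] = 1
        # Engine shutdown (grouping assumed from the prose description).
        touchdown = any(state[i] == 1 and health[i] == 1 for i in range(2))
        if touchdown and event_enabled == 1:
            return radar_alt
        # Indicator health check, performed once below 40 meters.
        if radar_alt < 40 and event_enabled == 0:
            for i in range(2):
                if prev_td[i] == 1 and cur_td[i] == 1:
                    health[i] = 0
            event_enabled = 1
    return None

# Invented trace: sensor 1 reads 1 for two steps in a row high above the
# surface, then both sensors read 0 for the rest of the descent.
trace = [(1500, 1, 0), (1480, 1, 0), (1000, 0, 0), (35, 0, 0), (20, 0, 0)]
print(run_controller(trace, reset_state=False))
print(run_controller(trace, reset_state=True))
```

Without the reset, the state variable latched at high altitude survives until the health check enables the shutdown event, and the engine is cut at a radar altitude of 20 meters. With the reset, the stale reading is discarded and the engine stays on for the whole trace.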
Landing Leg Deployment.
Another source of error for the MPL that is not obvious from the code above but that has been well-documented is landing leg deployment. When the landing legs deploy about 1500 meters above the surface, the process can result in false positives that exceed the one time step assumed for transients [JPL Special Review Board 2000]. Without the two annotated lines (Lines 13 and 14), this would cause the sensor's state variable to be permanently set to 1, causing the engine to shut down immediately after the health check completed.

The nondeterministic program in Figure 12 formalizes these sources of error. It provides inputs to the control software by setting the variables cur_td_1, cur_td_2, and radar_alt. It can be composed with the control software by inlining the code in Figure 11 on Lines 11 and 43. After composing with the control software, the resulting overall program could then be verified to prove the loop invariant on Lines 13 and 14 of Figure 12 always holds. We now describe each piece of the formal model in more detail.

Permanent Errors.
The block of code on Lines 6-9 of Figure 12 models permanent errors, both false positive and false negative. Each of the variables perm_1 and perm_2 is 1 if its sensor has suffered a permanent failure, with the assumption being that these two variables will not both be 1 at the same time. The variables perm_1_v and perm_2_v store the permanent error values, which are the constant values that each of the sensors will read after suffering a permanent error.

Loop Condition.
The code on Lines 12-14 of Figure 12 specifies the same loop condition as in Figure 11, but also adds a loop invariant. The invariant specifies that when the lander is off the ground, its engine is enabled. Verifying this property would ensure that the software does not have the bug that caused the MPL to crash. The invariant also specifies on Line 14 that the lander should spend less than 2 time steps on the ground with the engine on. This is another constraint the MPL was designed to satisfy [JPL Special Review Board 2000]. However, the model in Figure 12 admits transient false negative errors, which can violate this constraint in extreme cases. Thus, on Line 14 of Figure 12, we assume this holds only in the case where there are no transient false negatives.

    // Model Initialization
    prev_err_1 = 0; prev_err_2 = 0; trans_td = 0;
    alt = 8000; time_on_ground = 0;

    // Permanent errors
    perm_1 = choose(. == 0 || . == 1);
    perm_2 = choose((. == 0 || . == 1) && (perm_1 == 1 => . == 0));
    perm_1_v = choose(. == 0 || . == 1);
    perm_2_v = choose(. == 0 || . == 1);

    // Controller initialization ...
    while (engine_enabled == 1) {
      (alt > 0 => engine_enabled == 1) &&
      (trans_td == 0 => time_on_ground < 2)
    } {
      // Model start
      if (alt == 0 && engine_enabled == 1) {
        time_on_ground = time_on_ground + 1
      };
      alt_rate = choose(0 <= . && . <= 39 && . <= alt);
      alt = alt - alt_rate;
      radar_alt = choose(-38 <= . - alt && . - alt <= 38);

      err_1 = choose((prev_err_1 == 1 => . == 0) && (. == 0 || . == 1));
      prev_err_1 = err_1;
      err_2 = choose((prev_err_2 == 1 => . == 0) && (. == 0 || . == 1));
      prev_err_2 = err_2;
      if (alt == 0 && (err_1 == 1 || err_2 == 1)) { trans_td = 1; };

      leg_err = 1400 <= alt && alt <= 1600;
      if perm_1 { cur_td_1 = perm_1_v }
      else if (leg_err == 1 || err_1 == 1) {
        cur_td_1 = choose(. == 0 || . == 1)
      } else { cur_td_1 = alt == 0; };

      if perm_2 { cur_td_2 = perm_2_v }
      else if (leg_err == 1 || err_2 == 1) {
        cur_td_2 = choose(. == 0 || . == 1)
      } else { cur_td_2 = alt == 0; };

      // Model end

      // Controller loop start ...
    }

Fig. 12. Model for verification of the Mars Polar Lander.
Time on Ground.
The code on Lines 17-19 of Figure 12 measures the amount of time the lander has spent on the ground with the engine on and stores it in time_on_ground. This is measured purely to evaluate the loop invariant and is not passed to the controller.
Altitude.
The code on Lines 20-22 of Figure 12 specifies how altitude changes and the error model of the radar altimeter. It stores the rate of altitude change in alt_rate, the new altitude in alt, and the altimeter reading in radar_alt. We assume that the altitude changes by at most 39 meters per time step and that the altimeter is accurate to within 38 meters. The original motivation for including touchdown sensors on the MPL was that the radar altimeter is inaccurate below about 40 meters [JPL Special Review Board 2000]. This model is designed to conservatively capture this property while still ensuring that the condition on Line 28 of Figure 11 triggers the indicator health check.
Furthermore, note that on Line 3 of Figure 12 we have specified that the entry point to the program is when the lander is at an altitude of 8 kilometers.
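As an illustrative sketch (not part of the paper's artifact), the altitude portion of the model can be enumerated directly in Python. The function names `step_altitude` and `consistent_altitudes` are hypothetical, and the radar constraint is read here as |radar_alt - alt| <= 38, following the accuracy assumption stated above.

```python
MAX_DESCENT = 39   # largest altitude change per time step, taken from the model
RADAR_ERROR = 38   # assumed bound on |radar_alt - alt|


def step_altitude(alts):
    """Return every altitude reachable in one step from any altitude in `alts`.

    Mirrors `alt_rate = choose(0 <= . && . <= 39 && . <= alt); alt = alt - alt_rate`.
    """
    return {alt - rate
            for alt in alts
            for rate in range(0, min(MAX_DESCENT, alt) + 1)}


def consistent_altitudes(alts, radar_alt):
    """Keep only the altitudes that could have produced the reading `radar_alt`."""
    return {alt for alt in alts if abs(radar_alt - alt) <= RADAR_ERROR}


if __name__ == "__main__":
    possible = step_altitude({8000})   # one step from the entry point
    print(min(possible), max(possible))
    narrowed = consistent_altitudes(possible, radar_alt=7950)
    print(min(narrowed), max(narrowed))
```

Under these assumptions, a single radar reading still leaves a window of candidate altitudes rather than one value, which is one way to see why the model also includes touchdown indicators.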
Transient Errors.
The code on Lines 24-28 of Figure 12 models transient errors. The variables err_1 and err_2 are set to 1 if a transient error occurred for the first or second touchdown sensor, respectively. The previous values of these variables are stored in prev_err_1 and prev_err_2. This code specifies that a transient error can occur for a sensor only if it did not occur at the previous time step. Furthermore, if a transient error occurs after touchdown (i.e. a false negative), the code specifies that the variable trans_td is set to 1.
Landing Leg Deployment.
To account for landing leg deployment errors, the model sets the flag leg_err whenever the landing gear are deploying. This occurs at about 1500 meters for the MPL [JPL Special Review Board 2000]. We have modeled a deployment window of 100 meters around this nominal value, so that landing leg deployment occurs between 1400 and 1600 meters.
Touchdown Indicators.
The code on Lines 31-34 of Figure 12 specifies how the first touchdown indication is generated from the various sources of error. The result is stored in cur_td_1. If the indicator suffered a permanent error, then it returns its permanent error value. Otherwise, during landing leg deployment or a transient error, it may output either 0 or 1. If none of these errors are present, then it indicates 1 iff the lander has touched the surface (i.e. the altitude is 0 meters).
The code on Lines 36-39 of Figure 12 specifies how the second touchdown indication is generated and is entirely symmetric.

We now show how to use the model in Figure 12 to construct a belief program to control the MPL. We show a fragment of the belief program in Figure 13. This fragment can be completed by inlining the appropriate parts of Figure 12 into Lines 2 and 9 of Figure 13.
The belief program executes by observing each of the sensor readings generated from the model. It then determines, on Line 15, whether these are sufficient to guarantee the lander is on the ground. If so, it shuts down the engine. The belief program also modifies the loop invariant to have a □ modality, stating that it must be true in every environment in the belief state (Lines 5 and 6).
    // Model Initialization
    ...
    engine_enabled = 1;
    while (engine_enabled == 1)
    { □((alt > 0 => engine_enabled == 1) &&
        (trans_td == 0 => time_on_ground < 2)) } {

      // Model start
      ...
      // Model end

      observe radar_alt;
      observe cur_td_1;
      observe cur_td_2;
      infer □(alt == 0) { engine_enabled = 0 }
    }

Fig. 13. Implementation of the Mars Polar Lander with belief programming
In this section, we will explain how to verify the loop invariant on Lines 5-6 of Figure 13. We simplify the problem here by only considering the first condition on Line 5. The remaining condition is considered in Appendix D.
Initialization Post-condition.
As specified by the rules in Figure 10, we must show that the initialization code generates a post-condition that satisfies the loop invariant. This post-condition can be written as □(engine_enabled == 1) && ..., where the □-proposition is generated by the code on Line 3 of Figure 13. Now, we can apply the fact that

    ∀𝜎. 𝜎 ⊨ (engine_enabled == 1) ⇒ 𝜎 ⊨ (alt > 0 => engine_enabled == 1)

and Theorems B.4 and B.5 to see that

    𝛽 ⊨ □(engine_enabled == 1) && ··· ⇒ 𝛽 ⊨ □(alt > 0 => engine_enabled == 1)

Loop Body Post-condition.
According to the rules in Figure 10, we must show that the loop body’s post-condition implies the loop invariant. We can summarize the post-condition as

    □(engine_enabled_0 == 1) &&
    (□(alt == 0) && □(engine_enabled == 0)) ||
    (◇(alt != 0) && □(engine_enabled == engine_enabled_0))

where the first line comes from the loop condition on Line 4 and the second and third lines come from the infer statement on Line 15. We assume we have applied the standard rule for strongest-postcondition predicates and taken the disjunction of each branch of the infer [Floyd 1967].
We can now show the post-condition implies the invariant. The disjunction gives us two cases. In the first, we can assume that □(alt == 0). Now, we can apply the fact that by vacuous truth,

    ∀𝜎. 𝜎 ⊨ (alt == 0) ⇒ 𝜎 ⊨ (alt > 0 => engine_enabled == 1)

and Theorems B.4 and B.5 to see that

    𝛽 ⊨ □(alt == 0) ⇒ 𝛽 ⊨ □(alt > 0 => engine_enabled == 1)

In the second case, we follow a similar logic to the initialization post-condition argument above, with the additional premise that □(engine_enabled == engine_enabled_0).

We have demonstrated the feasibility of a belief programming implementation that directly represents the belief state with a set. Specifically, we wrote the implementation in C using a hash set data structure to represent beliefs.
Called CBLIMP, our implementation is a shallow embedding of BLIMP into C; i.e. it is a C library that implements each of the core BLIMP primitives as a C function. CBLIMP’s observe function takes, in addition to the current belief state and the variable to be observed, a parameter that is the observed value of the variable.
CBLIMP’s infer function returns a boolean to be used as a branching condition by the C program, and CBLIMP’s if function takes as parameters callbacks for each branch that execute on modified belief states. All regular C variables in a CBLIMP program can be considered to be deterministic with respect to the belief state (i.e. they have the same value in all environments in the belief state).
CBLIMP includes functions that extend BLIMP primitives to simulate the true environment via random sampling. The simulated observe function presents a different interface: instead of taking the observed value as a parameter, it uses the value from the simulated true environment.
We ensure that the set of possible environments is finite by augmenting choose statements to include a range that the newly assigned variable must belong to. For example, as part of the MPL model, Line 24 of Figure 12 gives the choose statement

    err_1 = choose((prev_err_1 == 1 => . == 0) && (. == 0 || . == 1));

This specifies that err_1 is a boolean value (i.e. it is either 0 or 1) and that whenever prev_err_1 is 1, err_1 must be 0. In our implementation, we have alternatively specified this as

    err_1 = choose(0, 1, prev_err_1 == 1 => . == 0);

where the arguments 0 and 1 of the choose statement are the lower and upper bounds of the range that err_1 must belong to. Note that these statements are equivalent; whereas in the original statement we specified that err_1 is a boolean using the choose statement’s proposition, in the new statement we have instead specified this using the range bounds.
Research Question.
We evaluated CBLIMP to answer the question: does the direct implementation achieve practical performance given the latency requirements of the domain? For the UAV example, a step latency of 1 second is required to match the latency of common GPS receivers [Sparkfun 2020]. For the MPL example, a latency of 10ms is required [JPL Special Review Board 2000].
Benchmarks.
We used the UAV example with a 100-step time horizon and the MPL example as benchmarks. The code is nearly identical to that in the paper, with the following changes:
• The MPL benchmark includes an additional intermediate variable that captures the error between the true altitude and the radar altitude.
• We manually determined bounds for every variable specified by a choose statement to facilitate use of our implementation’s augmented choose statements.
We also implemented another version of MPL that uses a different grid size. Note that the original version of MPL defines a uniform discretization grid for both true and radar altitude at a resolution of 1 meter. We modified this to be a non-uniform grid on both true and observed altitude with higher-altitude grid cells exponentially larger than lower-altitude grid cells. We call this benchmark "MPL-Exp". We have provided its BLIMP source code and verified it in Appendix E.
Methodology.
We ran each benchmark 5 times and recorded the mean and standard deviation of the time to execute a single iteration of the main while loop. We also recorded the maximum time taken by an iteration across all runs.
To construct observations to send as inputs to the belief program, we simulated the true environment alongside the belief program’s execution, sampling the new true environment uniformly at random when we encountered a choose statement. While the latency measurements include both the time to update the belief state and run the simulation, we expect them to be dominated by belief state operations.
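The measurement procedure above could be organized along the following lines. This is a hedged sketch rather than the authors' harness: the helper names `time_iterations` and `summarize` are hypothetical, the workload is a stand-in, and pooling iterations across runs and using the sample standard deviation are assumptions the text does not settle.

```python
import statistics
import time


def time_iterations(step, num_iterations):
    """Run `step` repeatedly and return the wall-clock latency of each call in seconds."""
    latencies = []
    for _ in range(num_iterations):
        start = time.perf_counter()
        step()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(runs):
    """Pool per-iteration latencies from several runs into (mean, std dev, max)."""
    pooled = [latency for run in runs for latency in run]
    return statistics.mean(pooled), statistics.stdev(pooled), max(pooled)


if __name__ == "__main__":
    # Stand-in workload; a real harness would execute one loop iteration of the belief program.
    runs = [time_iterations(lambda: sum(range(1000)), 100) for _ in range(5)]
    mean, std, worst = summarize(runs)
    print(f"{mean * 1e3:.3f} +- {std * 1e3:.3f} ms, max {worst * 1e3:.3f} ms")
```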
Table 1. Results of performance experiments

Benchmark   Mean ± Std. Dev.   Maximum
UAV         2.49 ± 0.37 ms     4.41 ms
MPL         2.45 ± 1.42 s      11.9 s
MPL-Exp     0.76 ± 0.56 ms     2.37 ms
Results.
Table 1 summarizes the step latency for each benchmark. We can see that with the UAV benchmark, the direct implementation of belief programming is practical in the sense that the step latency is well under the 1s threshold. However, with the MPL example, the direct implementation is not practical as-is: the worst-case latency of our implementation is about 1000x longer than the 10ms latency requirement. By contrast, our modified "MPL-Exp" benchmark is practical for the MPL problem because its worst-case latency is considerably below the 10ms threshold.
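The comparison above is simple arithmetic on the figures in Table 1 and the stated latency requirements. The short sketch below reproduces it; the dictionary and function names are hypothetical and introduced only for illustration.

```python
# Latency requirements and measured worst cases, all in seconds (values from the text and Table 1).
REQUIREMENT = {"UAV": 1.0, "MPL": 10e-3, "MPL-Exp": 10e-3}
WORST_CASE = {"UAV": 4.41e-3, "MPL": 11.9, "MPL-Exp": 2.37e-3}


def headroom(benchmark):
    """Return requirement / worst-case latency; values above 1 meet the requirement."""
    return REQUIREMENT[benchmark] / WORST_CASE[benchmark]


if __name__ == "__main__":
    for name in REQUIREMENT:
        ratio = headroom(name)
        verdict = "meets" if ratio >= 1 else "misses"
        print(f"{name}: {verdict} the requirement (headroom {ratio:.4g}x)")
```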
Threats to validity.
We ran these benchmarks on a 2017 MacBook Pro with an i7-7920HQ CPU at 3.10GHz and 16 GB of 2133 MHz DDR3. Both of our benchmarks operate in domains that necessitate embedded computers that are less powerful. The UAV example enjoys a comfortable margin over the required latency; it is likely that the processors common on larger drones could meet the latency requirements. The MPL-Exp benchmark would typically be run on a small embedded processor, both for reliability benefits and because such processors are typically the ones hardened against cosmic radiation. Although we expect such a processor to be slower than a modern laptop computer, with a comfortable 5x slowdown margin until it violates the latency requirement, we speculate that this benchmark could meet the requirement with standard performance engineering techniques.
There is an open question of how to design an efficient implementation for the belief programming runtime. CBLIMP directly implements the semantics of Figure 9 using an exhaustive representation of belief states. This implementation scales poorly with the number of variables in the program, and we discuss here more efficient potential runtime implementation approaches.
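To make the exhaustive representation concrete, the following is a minimal Python sketch of a belief state stored as a set of environments, with range-bounded choose, observe, and the two modalities used by infer. It is an illustration under stated assumptions, not CBLIMP's C API: every function name and signature here is hypothetical.

```python
def freeze(env):
    """Turn a dict environment into a hashable value so it can be stored in a set."""
    return frozenset(env.items())


def assign(belief, var, expr):
    """x = e: evaluate `expr` in every environment and store the result in `var`."""
    return {freeze({**dict(e), var: expr(dict(e))}) for e in belief}


def choose(belief, var, lo, hi, pred):
    """x = choose(lo, hi, p): every value in [lo, hi] satisfying pred(env, value) is possible."""
    result = set()
    for frozen in belief:
        env = dict(frozen)
        for value in range(lo, hi + 1):
            if pred(env, value):
                result.add(freeze({**env, var: value}))
    return result


def observe(belief, var, value):
    """observe x: discard the environments that disagree with the observed value."""
    return {e for e in belief if dict(e)[var] == value}


def box(belief, pred):
    """Box modality: pred holds in every environment of the belief state."""
    return all(pred(dict(e)) for e in belief)


def diamond(belief, pred):
    """Diamond modality: pred holds in at least one environment of the belief state."""
    return any(pred(dict(e)) for e in belief)


if __name__ == "__main__":
    belief = {freeze({"alt": 2})}
    belief = choose(belief, "alt_rate", 0, 39, lambda env, v: v <= env["alt"])
    belief = assign(belief, "alt", lambda env: env["alt"] - env["alt_rate"])
    belief = assign(belief, "td", lambda env: int(env["alt"] == 0))
    print(box(belief, lambda env: env["alt"] == 0))
    belief = observe(belief, "td", 1)
    print(box(belief, lambda env: env["alt"] == 0))
```

The number of environments grows multiplicatively with each choose, which illustrates why this direct approach scales poorly as the number of variables increases.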
Runtime SMT.
One approach could symbolically execute BLIMP constructs to collect constraints that an SMT solver would use to evaluate modal propositions. This approach would make use of the enhanced performance of SMT solvers compared to an exhaustive approach, and bears similarities to other languages that deploy solvers at runtime (e.g. [Samimi et al. 2010; Yang et al. 2012]).
Restricted belief states.
Known efficient implementations exist when the belief states admitted by the language are more restricted than the full powerset of environments. Examples of restricted classes of belief states studied in the literature include ellipsoids [Bertsekas 1971] and polytopes [Wan et al. 2018].
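As a rough illustration of a restricted belief state, the sketch below tracks only an interval of possible altitudes, the one-dimensional special case of a polytope. It is not one of the algorithms from the cited works; the class and method names are hypothetical, and the update rules are assumptions derived from the altitude model of Figure 12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Belief state restricted to a closed integer interval [lo, hi] of altitudes."""
    lo: int
    hi: int

    def descend(self, max_rate):
        """Altitude after one step that loses between 0 and max_rate meters, never below 0."""
        return Interval(max(self.lo - max_rate, 0), self.hi)

    def observe_radar(self, reading, error):
        """Intersect with the altitudes that lie within `error` meters of the reading."""
        lo, hi = max(self.lo, reading - error), min(self.hi, reading + error)
        if lo > hi:
            raise ValueError("observation is inconsistent with the belief state")
        return Interval(lo, hi)

    def box_equals(self, value):
        """Box modality for `alt == value`: true only when the interval is that single point."""
        return self.lo == self.hi == value


if __name__ == "__main__":
    belief = Interval(8000, 8000).descend(39)
    print(belief)
    print(belief.observe_radar(7950, 38))
```

Each operation here costs constant time regardless of how many altitudes the interval contains, in contrast to enumerating every environment.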
Synthesis.
Using the semantics of Figure 9 as a specification, a variety of existing synthesis tools [Delaware et al. 2015; Lau et al. 2000; Solar-Lezama et al. 2006] could potentially generate more efficient implementations than naive enumeration. The goal would be to translate infer statements to ordinary if statements, using synthesis to construct a predicate for the if statement that is a function of the observed values and is semantically equivalent to the infer statement’s predicate.

The logic in Figure 10 and the theorems in Section B enable sound, manual reasoning about the behavior of belief programs. As with many program logics, the gap from manual to automated reasoning is the need for automated techniques for invariant inference and implication checking.
Invariant Inference.
As in traditional program logics, a while loop requires a loop invariant. Although propositions in the Epistemic Hoare Logic include modalities, classic approaches such as template-based invariant inference may be directly applicable [Flanagan and Leino 2001] via templates that include modalities. An additional difference from many traditional program logics is that the rules for if statements require developers to manually determine a suitable post-condition or, in other words, provide an invariant. Here too template-based techniques may be directly applicable. In either case, there may also be new opportunities for analysis-based invariant inference techniques that account for modalities.
Discharging Implications.
A classic approach to discharge implications that appear in the premises of Hoare logic rules is to employ an automatic theorem prover such as Z3 [Moura and Bjørner 2008]. To apply this approach to Epistemic Hoare Logic, one would need to contend with the modalities in propositions. Recent work holds out the promise of automated reasoning techniques for modal implications via a reduction to SMT [Areces et al. 2015; Caridroit et al. 2017].
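For very small finite domains, an implication between □-propositions can also be sanity-checked by brute force over every belief state. The sketch below does exactly that; it is exhaustive enumeration, not the SMT reduction cited above, and all function names are hypothetical.

```python
from itertools import chain, combinations, product


def environments(domains):
    """Enumerate every environment over the given variable domains."""
    names = sorted(domains)
    return [dict(zip(names, values))
            for values in product(*(domains[name] for name in names))]


def belief_states(envs):
    """Yield every non-empty subset of `envs`, i.e. every candidate belief state."""
    indices = range(len(envs))
    subsets = chain.from_iterable(
        combinations(indices, size) for size in range(1, len(envs) + 1))
    return ([envs[i] for i in subset] for subset in subsets)


def box(belief, pred):
    """Box modality: pred holds in every environment of the belief state."""
    return all(pred(env) for env in belief)


def box_implication_holds(domains, p, q):
    """Check that box(p) implies box(q) in every belief state over the finite domains."""
    return all(box(belief, q)
               for belief in belief_states(environments(domains))
               if box(belief, p))


if __name__ == "__main__":
    domains = {"alt": [0, 1, 2], "engine_enabled": [0, 1]}
    # The implication `alt > 0 => engine_enabled == 1`, written as a disjunction.
    invariant = lambda env: not env["alt"] > 0 or env["engine_enabled"] == 1
    print(box_implication_holds(domains, lambda env: env["engine_enabled"] == 1, invariant))
    print(box_implication_holds(domains, lambda env: env["alt"] == 0, invariant))
```

The number of belief states is exponential in the number of environments, so this check is only useful as a small-scale cross-check of a manual proof.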
Set-based Uncertainty.
[Combettes 1993] is a survey paper that gives an overview of how set-based uncertainty is used in the signal processing domain. It explains how programs can over-approximate the true belief state (using e.g. ellipsoids [Schweppe 1973]; more recent work has studied polytopes [Wan et al. 2018]), and how the quality of approximation can be measured [Schweppe 1973] to determine if the resulting belief state is too large. It also gives efficient algorithms [Cimmino 1938; Kaczmarz 1937] for a restricted set of operations on approximate belief states. By contrast, belief programming reasons about the exact belief state and provides a richer set of operations. However, it cannot achieve the same computational efficiency as an approximation.
Classical Verification.
In Sections 2 and 5, we alluded to how the UAV and MPL examples could be verified using classical techniques. Here, we explain this process in more detail.
The developer would first compose their handwritten environment model (Figures 2 and 12) with their handwritten state estimator (Figures 1 and 11). The resulting program is in the language IMP [Winskel 1993] with the addition of choose statements that provide nondeterminism. The developer could obtain a Hoare logic for this language by either extending the logic of IMP [Winskel 1993] to include choose statements, or by rewriting it to a language such as GCL [Dijkstra 1975] which supports nondeterminism natively and also has a Hoare logic. Finally, the developer would apply the Hoare logic to the program, which requires discharging verification conditions. Because the proposition language is the standard propositional calculus, we expect there are many techniques in the literature that the developer could apply to this end.
The advantage of classical verification, relative to belief programming, is that developers can expect to draw on a wide variety of verification literature to solve the problem, as the formulation is relatively standard. The disadvantage is that developers must write the raw code of the state estimator by hand, which is a tedious process. By contrast, in belief programming, the state estimator consists of infer statements, which are more intuitive to use.
Epistemic and Belief Revision Logics.
The Epistemic Hoare Logic we present in Section 4 is similar to dynamic epistemic logics such as public announcement logic [Plaza 2007] and action-based logics [Baltag and Moss 2004]. Whereas these logics typically use either propositions or abstract action spaces to modify the belief state, our logic uses a belief program to do so.
Synthesis.
To avoid writing state estimators by hand, developers might generate a state estimator by synthesizing it directly from the environment model [Delaware et al. 2015; Lau et al. 2000; Solar-Lezama et al. 2006]. Such a problem would be challenging due to having hidden state encoded in the belief program that must be explicit in the state estimator.
For example, in the desired handwritten state estimator in Figure 11, the program contains the variables prev_td_1 and prev_td_2, which are not related to any of the variables in the model in Figure 12. We anticipate that existing synthesis tools will have difficulty automatically inferring the existence of such hidden variables.
Dynamic Constraint Solving.
Some existing programming languages [Samimi et al. 2010; Yang et al. 2012] perform constraint solving at runtime using an SMT solver. Such approaches are necessarily similar to a belief program, which also represents a constrained set of program states at runtime. Existing systems were designed for other domains, and do not articulate the complete set of choose, observe, and infer constructs that BLIMP has.

Probabilistic programming languages (PPLs) are also designed to enable developers to reason about uncertainty. We compare BLIMP to PPLs on three core axes: language features, contemporary reasoning techniques, and the practicality of inference (i.e., the implementation of infer).
Language Features.
BLIMP’s primary programming constructs are choose, observe, and infer, which have probabilistic analogs in probabilistic programming languages. BLIMP’s choose has the classic interpretation of non-probabilistic, nondeterministic choice. The analog in probabilistic programming is the probabilistic sample construct that randomly samples a value according to a distribution. BLIMP’s observe has a similar semantics to observe constructs in PPLs. BLIMP’s infer, which enables the program itself to perform inference, has some support in PPLs as well [Baudart et al. 2020; Staton 2017].
In general, probabilistic programming provides a more flexible modeling mechanism than BLIMP’s set-based uncertainty, enabling a developer to specify distributions for nondeterministic outcomes. For applications for which probabilistic models of outcomes are available, probabilistic programming can be an appropriate choice. However, probabilistic models of outcomes are not available for all applications (e.g., our MPL application) and introducing distributions can complicate reasoning, as we discuss below. BLIMP is an additional design point for applications that do not necessarily benefit from probabilistic modeling.
Reasoning.
The verification of probabilistic programs is an active area of research [Sampson et al. 2014; Sato et al. 2019]. This work typically provides the ability to express and verify the probability that an assertion is true of the program. In principle, it is possible to verify BLIMP-like modal assertions in such a framework. Specifically, □ maps to an assertion that a proposition is true with probability 1; ◇ maps to an assertion that a proposition is true with positive probability.
However, a significant challenge with reasoning about probabilistic programs is that the distribution over states at any given point in a program may not have an analytical characterization. Specifically, the composition of a standard, well-known probability distribution (e.g., a Gaussian distribution) with a computation can result in a distribution that is not well-characterized by a standard, well-known distribution for which standard statistical quantities (such as mean and variance) are easily accessible.
Therefore the full modeling, programming, and reasoning workflow must carefully consider the distributions used in sample statements such that they adhere to the application’s uncertainty model and that the resulting distributions in the rest of the program can be precisely reasoned about at an acceptable level of complexity.
BLIMP is, instead, an additional design point for modeling uncertainty that need not rely on an appropriate selection of sample distributions or, more generally, the complexity of techniques for reasoning about probabilistic constructs.
Practicality of Inference.
BLIMP’s runtime implementation to support inference (i.e. infer) tracks all possible environments. The resulting implementation is sound. A probabilistic programming language can take a similar strategy (i.e., exact inference) if it desires sound inference.
However, probabilistic programming languages can also leverage approximate inference algorithms for probabilistic programs such as Sequential Monte Carlo methods [Del Moral et al. 2006] that need only track a high-probability subset of the possible environments. These methods are approximate in that the probability of an event is estimated with a fidelity that is a function of the size and diversity of the tracked state. Designing algorithms to select this state efficiently is application-specific. Contemporary diagnostics for Monte Carlo methods (e.g. [Cowles and Carlin 1996; Liu 1996]) are typically not sound in that they can indicate a good approximation when the true approximation is poor. Establishing bounds on the quality of the resulting approximation is still an active area of research [Chatterjee and Diaconis 2018].
Therefore, while approximate inference methods can in practice provide empirically good results, reasoning about the soundness of their results is still an active area of research.
In this paper, we presented belief programming and Epistemic Hoare Logic. Belief programming enables developers to write programs that can be directly executed to give state estimators that are derived from environment models. Epistemic Hoare Logic enables developers to reason about belief programs. We discussed both by reference to the BLIMP language, with belief programming described by BLIMP’s semantics and Epistemic Hoare Logic operating over BLIMP statements. We determined that belief programming is feasible by evaluating our BLIMP implementation, CBLIMP. Taken together, this work lays new foundations for soundly reasoning about the behavior of software that executes in partially-observable environments.
ACKNOWLEDGMENTS
We would like to thank Alex Renda, Deokhwan Kim, Ben Sherman, Jesse Michel, Cambridge Yang, Jonathan Frankle, and anonymous reviewers for their helpful comments and suggestions. This work was supported in part by the Office of Naval Research (ONR-N00014-17-1-2699). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the Office of Naval Research.
REFERENCES
Carlos Areces, Pascal Fontaine, and Stephan Merz. 2015. Modal Satisfiability via SMT Solving. In Software, Services, and Systems. Springer.
Ralph-Johan Back. 1978. On the Correctness of Refinement Steps in Program Development. Ph.D. Dissertation. University of Helsinki.
Alexandru Baltag and Lawrence S. Moss. 2004. Logics for Epistemic Programs. Synthese 139 (03 2004), 165–224.
Guillaume Baudart, Louis Mandel, Eric Atkinson, Benjamin Sherman, Marc Pouzet, and Michael Carbin. 2020. Reactive Probabilistic Programming. In Conference on Programming Language Design and Implementation.
Dimitri P. Bertsekas. 1971. Control of uncertain systems with a set-membership description of the uncertainty. Ph.D. Dissertation. MIT.
Thomas Caridroit, Jean-Marie Lagniez, Daniel Le Berre, Tiago de Lima, and Valentin Montmirail. 2017. A SAT-based Approach to Solving the Modal Logic S5-Satisfiability Problem. In AAAI Conference on Artificial Intelligence.
Sourav Chatterjee and Persi Diaconis. 2018. The Sample Size Required in Importance Sampling. Annals of Applied Probability 28 (04 2018), 1099–1135. Issue 2.
G. Cimmino. 1938. Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari. La Ricerca Scientifica, Series II [...]
Combettes. 1993. [...] Proc. IEEE 81 (02 1993), 182–208. Issue 2.
Mary Kathryn Cowles and Bradley P. Carlin. 1996. Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review. J. Amer. Statist. Assoc. 91 (06 1996), 883–904. Issue 434.
Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. 2006. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68 (06 2006), 411–436. Issue 3.
Benjamin Delaware, Clément Pit-Claudel, Jason Gross, and Adam Chlipala. 2015. Fiat: Deductive Synthesis of Abstract Data Types in a Proof Assistant. In Symposium on Principles of Programming Languages.
Edsger W. Dijkstra. 1975. Guarded Commands, Nondeterminacy, and Formal Derivations of Programs. Commun. ACM [...]
Flanagan and Leino. 2001. [...] In International Symposium on Formal Methods Europe.
R.W. Floyd. 1967. Assigning Meanings to Programs. In Symposium in Applied Mathematics.
C.A.R. Hoare. 1969. An Axiomatic Basis for Computer Programming. Commun. ACM 12 (10 1969), 576–580. Issue 10.
JPL Special Review Board. 2000. Report on the Loss of the Mars Polar Lander and Deep Space 2 Missions. Technical Report. Jet Propulsion Laboratory.
Stefan Kaczmarz. 1937. Angenaherte Auflosung von Systemen linearer Gleichungen. In Bulletin International de l’Academie Polonaise des Sciences et des Lettres. Classe des Sciences Mathematiques et Naturelles. Serie A, Sciences Mathematiques.
Tessa Lau, Pedro Domingos, and Daniel S. Weld. 2000. Version Space Algebra and its Application to Programming by Demonstration. In International Conference on Machine Learning.
Jun S. Liu. 1996. Metropolized Independent Sampling with Comparisons to Rejection Sampling and Importance Sampling. Statistics and Computing [...]
Moura and Bjørner. 2008. [...] In Tools and Algorithms for the Construction and Analysis of Systems.
Jan Plaza. 2007. Logics of Public Communications. Synthese 158 (09 2007), 165–179.
Stuart Russell and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach (4 ed.). Pearson.
Hesam Samimi, Ei Darli Aung, and Todd Millstein. 2010. Falling Back on Executable Specification Languages. In European Conference on Object-Oriented Programming.
Adrian Sampson, Pavel Panchekha, Todd Mytkowicz, Kathryn S. McKinley, Dan Grossman, and Luis Ceze. 2014. Expressing and Verifying Probabilistic Assertions. In Conference on Programming Language Design and Implementation.
Tetsuya Sato, Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Justin Hsu. 2019. Formal Verification of Higher-order Probabilistic Programs: Reasoning About Approximation, Convergence, Bayesian Inference, and Optimization. In Symposium on Principles of Programming Languages.
Fred C. Schweppe. 1973. Uncertain Dynamic Systems. Prentice-Hall.
Richard D. Smallwood and Edward J. Sondik. 1973. The Optimal Control of Partially Observable Markov Processes Over a Finite Horizon. Operations Research 21 (10 1973), 1071–1088. Issue 5.
Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Vijay Saraswat, and Sanjit Seshia. 2006. Combinatorial Sketching for Finite Programs. In International Conference on Architectural Support for Programming Languages and Operating Systems.
Staton. 2017. [...] In European Symposium on Programming.
Jian Wan, Sanjay Sharma, and Robert Sutton. 2018. Guaranteed State Estimation for Nonlinear Discrete-Time Systems via Indirectly Implemented Polytopic Set Computation. IEEE Trans. on Automatic Control 63 (12 2018), 4317–4322. Issue 12.
Glynn Winskel. 1993. The Formal Semantics of Programming Languages. MIT Press.
Jean Yang, Kuat Yessenov, and Armando Solar-Lezama. 2012. A Language for Automatically Enforcing Privacy Policies. In Symposium on Principles of Programming Languages.
A PROOFS
A.1 Theorem 3.1: Belief Soundness
Proceed by structural induction on derivations of ⇓. Individual cases are as follows:
• Nondeterministic If Statements. We case split on which branch the true environment takes. If it is the true branch, then 𝜎 ∈ 𝛽_T. Then, by the induction hypothesis, 𝜎′ ∈ 𝛽′_T ⇒ 𝜎′ ∈ 𝛽′_T ∪ 𝛽′_F. The case for the false branch is symmetric.
• All other Statements. This follows directly from the definition of the new belief state and the induction hypotheses.
A.2 Theorem 3.2: Belief Precision
Proceed by structural induction on derivations of ⇓. In each case, we identify the choice of 𝜎_𝛽 given 𝜎′_𝛽 that satisfies the property.
• Assignment. Choose 𝜎_𝛽 ∈ 𝛽 such that ⟨𝜎_𝛽, 𝑒⟩ ⇓ 𝑐_𝛽 and 𝜎′_𝛽 = 𝜎_𝛽[𝑥 ↦ 𝑐_𝛽].
• Choose. Choose 𝜎_𝛽 ∈ 𝛽 such that there exists a 𝑐_𝛽 such that 𝜎_𝛽 ⊨ 𝑝[𝑐_𝛽/.] and 𝜎′_𝛽 = 𝜎_𝛽[𝑥 ↦ 𝑐_𝛽].
• Observe. Choose 𝜎_𝛽 = 𝜎′_𝛽.
• Nondeterministic If. Case split on whether 𝜎′_𝛽 is in 𝛽′_T or 𝛽′_F. If it is in 𝛽′_T, choose 𝜇′_T = 𝜎_𝛽 and apply the inductive hypothesis for 𝑠_1. Otherwise, apply the inductive hypothesis for 𝑠_2. In either case, the resulting 𝜎_𝛽 must be in 𝛽.
• Other Statements. Apply inductive hypotheses.
A.3 Theorem 3.3: Belief Determinism
First, we define the following prefix property:
Property 1 (Prefix). If ⟨β, σ, s⟩ ⇓ ⟨β′ | o⟩ and ⟨β, σ, s⟩ ⇓ ⟨β′ | o′⟩ where o ≠ o′, then neither is o a prefix of o′ nor o′ a prefix of o.
The prefix property states that if a belief program can generate two different observation lists, the lists have to fundamentally differ and cannot simply be extensions of each other.
Proceed by structural induction on derivations of ⇓. We strengthen the inductive hypothesis with the prefix property. Specific cases are as follows:
• Sequencing. We consider here the rule that produces non-nil observation lists through the concatenate operator. First, we show the prefix property. If o₁ is different between the two executions, then the resulting concatenated list must satisfy the prefix property according to the mechanics of the concatenate operator. Otherwise, we apply the determinism inductive hypothesis on s₁ to show that the intermediate belief state β′ is the same between executions. This means we can directly apply the prefix property inductive hypothesis on s₂ to show the result. Now, we show belief determinism. We can assume the lengths of all observation lists in this rule are the same across both executions. Otherwise, o₁ in one execution would be a prefix of o₁ in the other execution, violating the prefix property. Due to the mechanics of the list concatenate operation, this means we can assume the values of the lists are the same across both executions. Thus, we can apply the determinism inductive hypotheses to show the result.
• Observe. The list always has length 1, which ensures the prefix property. The new belief state is a function of the original belief state and the observed value, which ensures belief determinism.
Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 200. Publication date: November 2020.
• Other cases. In the remaining cases, either the observation list is constant and the new belief state is a function of the original one, or the conclusion follows directly from induction hypotheses.
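The prefix property can be made concrete with a small executable check. The following is a minimal Python sketch, not part of BLIMP or CBLIMP: it models observation lists as Python lists, and the helper names `is_prefix` and `satisfies_prefix_property` are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def is_prefix(a, b):
    """Return True if sequence a is a prefix of sequence b (a == b counts)."""
    return len(a) <= len(b) and list(b[:len(a)]) == list(a)


def satisfies_prefix_property(observation_lists):
    """Check Property 1 on a finite collection of observation lists.

    Identical lists are collapsed first, because the property only
    constrains pairs of *different* lists.
    """
    distinct = {tuple(o) for o in observation_lists}
    return all(
        not is_prefix(a, b) and not is_prefix(b, a)
        for a, b in combinations(distinct, 2)
    )
```

For example, `[[1, 2], [1, 3]]` satisfies the property because the two lists diverge at the second observation, whereas `[[1], [1, 2]]` violates it because the first list is an extension point of the second.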
A.4 Logic
A.4.1 Substitution Lemmas.
Here, we restate and reprove the substitution lemma from Section 4 and state and prove the other substitution lemmas we will need.
Lemma A.1 (Expression Substitution). If x′ is fresh in e, then
    ⟨σ, e⟩ ⇓ c ⟺ ⟨σ[x′ ↦ σ(x)], e⟩ ⇓ c
and
    ⟨σ, e⟩ ⇓ c ⟺ ⟨σ[x′ ↦ σ(x)], e[x′/x]⟩ ⇓ c
Proof. By structural induction on the language of expressions. □
Lemma A.2 (Proposition Substitution). If x′ is fresh in p, then
    σ ⊨ p ⟺ σ[x′ ↦ σ(x)] ⊨ p
and
    σ ⊨ p ⟺ σ[x′ ↦ σ(x)] ⊨ p[x′/x]
Proof. By structural induction on the language of propositions. The base case employs the expression substitution lemma above. □
In the following definition, we use the notation β[x′ ↦ x] to mean a belief state β′ such that β′ = { σ[x′ ↦ σ(x)] | σ ∈ β }.
Lemma A.3 (Modal Proposition Substitution). If x′ is fresh in p_□, then
    β ⊨ p_□ ⟺ β[x′ ↦ x] ⊨ p_□
and
    β ⊨ p_□ ⟺ β[x′ ↦ x] ⊨ p_□[x′/x]
Proof. By structural induction on the language of propositions. The base cases employ the proposition substitution lemma above. □
A.4.2 Agreement.
In this section, we show some agreement lemmas that are also required to show the main soundness result.
Lemma A.4 (Expression Agreement). If σ₁ agrees with σ₂ on every location except x, and e does not depend on x, then ⟨σ₁, e⟩ ⇓ c ⟺ ⟨σ₂, e⟩ ⇓ c.
Proof. By structural induction on the language of expressions. □
Lemma A.5 (Proposition Agreement). If σ₁ agrees with σ₂ on every location except x, and p does not depend on x, then σ₁ ⊨ p ⟺ σ₂ ⊨ p.
Proof. By structural induction on the language of propositions. The base case follows from the expression agreement lemma above. □
Lemma A.6 (Modal Proposition Agreement). If β₁ agrees with β₂ on every location except x (i.e. σ₁ ∈ β₁ iff σ₂ ∈ β₂ and σ₁ agrees with σ₂ on every location except x), then β₁ ⊨ p_□ ⟺ β₂ ⊨ p_□.
Proof. By structural induction on the language of modal propositions. The base case follows from the proposition agreement lemma above, and we also use the fact that belief agreement is symmetric to prove the only-if case. □
A.4.3 Constant Substitution.
The following is an additional lemma used in the proof of the soundness of Epistemic Hoare logic:
Lemma A.7 (Expression Constant Substitution). If σ(x) = c_σ, then ⟨σ, e⟩ ⇓ c iff ⟨σ, e[c_σ/x]⟩ ⇓ c.
Proof. By structural induction on the language of expressions. □
Lemma A.8 (Proposition Constant Substitution). If σ(x) = c_σ, then σ ⊨ p iff σ ⊨ p[c_σ/x].
Proof. By structural induction on the language of propositions. For the base cases, we apply the expression constant substitution lemma above. □
Lemma A.9 (Modal Proposition Substitution). If for all σ ∈ β, σ(x) = c, and there exists a σ ∈ β such that σ(x) = c, then β ⊨ p_∃ ⟺ β ⊨ p_∃[c/x].
Proof. By structural induction on the language of modal propositions. For the base cases, we apply the proposition constant substitution lemma above. □
A.4.4 Proof of Theorem 4.3: Part 1.
Proceed by structural induction on derivations of ⇓. Specific cases are as follows:
• Assignment. The first part of the post-condition follows from the agreement and substitution lemmas for modal propositions. The second part follows from expression substitution.
• Choose. The first part of the post-condition follows from the agreement and substitution lemmas. Then, we can show that since p[x′/x] does not depend on x, σ ⊨ p[x′/x][./c] ⟺ σ[x ↦ c] ⊨ p[./x]. This can be shown by structural induction, similarly to the proof for the agreement and substitution lemmas. The second part of the post-condition follows from this fact.
• Nondeterministic If. By definition, β_T ⊨ ∃ŷ. (□p) && p_□ and β_F ⊨ ∃ŷ. (□!p) && p_□. After applying inductive hypotheses, we can use the fact that β_T ⊨ □p ∧ β_F ⊨ □p ⇒ β_T ∪ β_F ⊨ □p to show the overall result.
• Other Cases. The other cases follow from applying inductive hypotheses and inlining definitions. In the case of while loops, the premises must be destructed, and the extra premises of the logic rule ensure that the semantics do not evaluate to ⊥.
A.4.5 Proof of Theorem 4.3: Part 2.
Proceed by structural induction on derivations of ⇓. Most cases follow the same reasoning as part 1 above, and we cover here the cases that are unique to part 2.
• Observe. Choose y_{n+1} = σ(x). We show the first part of the post-condition by applying the constant substitution lemma for modal propositions. The first premise of this lemma follows from the definition of β′, and the second condition follows from the assumption that σ ∈ β. The second part of the post-condition follows directly from the definition of β′.
• Subtyping. Because this part of the theorem is a strengthened version of the first part, any derivation produced by subtyping must be sound.
• Other Cases. The result follows from applying inductive hypotheses.
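The union fact used in the Nondeterministic If case of Part 1 (if □p holds in two belief states, it holds in their union) can be sanity-checked by brute force on a small finite model. The sketch below is only an illustration in Python, under the assumption that a belief state is a finite set of environments and that □p means p holds in every environment of the set. The names `box`, `all_belief_states`, and `union_fact_holds` are hypothetical.

```python
from itertools import combinations


def box(p, beta):
    """□p: the predicate p holds in every environment of belief state beta."""
    return all(p(sigma) for sigma in beta)


def all_belief_states(environments):
    """Enumerate every subset of a finite list of environments."""
    envs = list(environments)
    for size in range(len(envs) + 1):
        for subset in combinations(envs, size):
            yield frozenset(subset)


def union_fact_holds(environments, p):
    """Check that □p in beta_t and □p in beta_f imply □p in their union."""
    states = list(all_belief_states(environments))
    return all(
        box(p, beta_t | beta_f)
        for beta_t in states
        for beta_f in states
        if box(p, beta_t) and box(p, beta_f)
    )


# Environments are modeled as (x, y) pairs purely for illustration.
ENVIRONMENTS = [(x, y) for x in range(3) for y in range(2)]
```

Calling `union_fact_holds(ENVIRONMENTS, lambda sigma: sigma[0] >= 1)` exercises all 64 × 64 pairs of belief states over this six-environment model. A finite check of this kind is evidence for one model only and does not replace the inductive proof.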
B THEOREMS ABOUT PROPOSITIONS
In addition to the logical rules in Figure 10, developers need to prove the implications in the premises of those rules. We propose that developers do so by lifting propositional reasoning to the modal operators. The theorems in this section give a general set of reasoning tools for performing this lifting.
First, we show that the modal operator □ commutes with conjunctions:
Theorem B.1 (□ commutes with &&). β ⊨ □(p₁ && p₂) ⟺ β ⊨ □p₁ && □p₂
Proof. From first-order logic instantiation. □
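To make the statement of Theorem B.1 concrete, the following Python sketch enumerates every belief state over a tiny environment space and compares both sides of the equivalence. It assumes, as in the semantics of the paper, that □ quantifies universally over the environments of a belief state. The environment encoding as `(a, b)` pairs and the function names are hypothetical and serve only this illustration.

```python
from itertools import combinations


def box(p, beta):
    """□p: p holds in every environment of the belief state beta."""
    return all(p(sigma) for sigma in beta)


def belief_states(environments):
    """Yield every subset of the finite environment list as a tuple."""
    envs = list(environments)
    for size in range(len(envs) + 1):
        yield from combinations(envs, size)


def box_commutes_with_and(environments, p1, p2):
    """Check □(p1 && p2) <=> (□p1 && □p2) on every belief state."""
    return all(
        box(lambda s: p1(s) and p2(s), beta) == (box(p1, beta) and box(p2, beta))
        for beta in belief_states(environments)
    )


# Each environment assigns 0 or 1 to two variables a and b.
ENVS = [(a, b) for a in (0, 1) for b in (0, 1)]
holds = box_commutes_with_and(ENVS, lambda s: s[0] == 1, lambda s: s[1] == 1)
```

Here `holds` is computed over all 16 subsets of `ENVS`, including the empty belief state, where both sides are vacuously true.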
We can similarly show that the modal operator □ partially commutes with disjunctions, but the implication only applies in one direction.
Theorem B.2 (□ commutes with ||). β ⊨ □p₁ || □p₂ ⇒ β ⊨ □(p₁ || p₂)
Proof. By case analysis on the premise. Follows from the fact that σ ⊨ p₁ ⇒ σ ⊨ p₁ || p₂. □
The □ and ◇ operators are also related by negation, in the same way that ∀ and ∃ are related in first-order logic. This means that, while we present all theorems in this section as applying to □, there are analogous theorems that apply to ◇.
Theorem B.3 (□-◇ duality). β ⊨ □p ⟺ β ⊨ !◇!p
Proof. Follows from definitions and ∀-∃ duality. □
Next, we show two theorems that enable developers to push implications through modal operators. The first states that the □ modality applies to any formulas that are true across all environments.
Theorem B.4 (Knowledge of Theorems). (∀σ. σ ⊨ p₁ ⇒ σ ⊨ p₂) ⇒ (β ⊨ □(!p₁ || p₂))
Proof. Follows from β being a subset of all possible σ. □
The second theorem states that implications under □ can be lifted to modal propositions.
Theorem B.5 (Knowledge Instantiation). β ⊨ □(!p₁ || p₂) ⇒ (β ⊨ □p₁ ⇒ β ⊨ □p₂)
Proof. Follows from first-order instantiation and transitivity of ⇒. □
Finally, we show that for propositions that do not depend on environment variables (i.e. they only depend on quantified variables and constants), truth in one environment implies truth in any.
Theorem B.6 (Environment Independence). If p does not contain any x, then β ⊨ ∃ŷ. p_□ && ◇p ⇒ β ⊨ ∃ŷ. p_□ && □p
Proof. By structural induction, we can show that σ ⊨ p is independent of σ. Thus, if p is true, it is true under all environments. □
C VERIFICATION OF THE UAV EXAMPLE
This section is structured as follows. Section C.0.1 explains how propositions are generated from belief programs, with an emphasis on how the rules of Epistemic Hoare logic differ from the rules of classical Hoare logic. Section C.0.2 gives examples of proving an implication with modal propositions. In each section, we illustrate concepts in the context of verifying the invariant preservation property of the loop body in Figure 3.
C.0.1 Post-condition.
Epistemic Hoare logic is most closely related to strongest-postcondition logics such as Floyd [1967] that generate a post-condition given a pre-condition and a program. This means there is a deduction
    D ⊢ { □(450 <= alt && alt <= 550) } s { p′_∃ }
where the post-condition p′_∃ is generated fairly directly by the program s. Figure 14 shows such a p′_∃, and here we explain how it corresponds to s, the loop body from Figure 3.
Loop Invariant. The proposition on Line 2 of Figure 14 corresponds to the loop invariant on Line 3 of Figure 3 that we assumed as our pre-condition. Because the variable alt is reassigned later in the loop body, the logic renames it to alt_0, a fresh variable that represents the previous value of alt at the start of the loop. Note that this differs from the standard approach to name conflicts, which uses existential quantification [Floyd 1967]. This is because Epistemic Hoare logic needs to preserve the original quantification with □ and/or ◇ when referring to variables in the environment.

    ∃ y.
    □(450 <= alt_0 && alt_0 <= 550) &&
    □(alt_0 - 25 <= alt_1 && alt_1 <= alt_0 + 25) &&
    □(alt_1 - 25 <= y && y <= alt_1 + 25) &&
    □(obs == y) &&
    ( (◇(alt_1 < 450) && □(cmd == 50)) ||
      (!◇(alt_1 < 450) && ◇(alt_1 > 550) && □(cmd == -50)) ||
      (!◇(alt_1 < 450) && !◇(alt_1 > 550) && □(cmd == 0))
    ) &&
    □(alt == alt_1 + cmd)

Fig. 14. Post-condition of the belief program at the end of the loop.
Choose Statements. The choose statement on Line 5 of Figure 3 generates the proposition on Line 3 of Figure 14. This proposition states that the previous altitude alt_0 is within a distance of 25 of the new altitude alt_1, and is generated from the choose statement's proposition by replacing the placeholder . with the new altitude. The updated altitude is renamed to the fresh variable alt_1 because of the reassignment of alt on Line 13 of Figure 3.
The choose statement for obs on Line 6 of Figure 3 similarly generates the proposition on Line 4 of Figure 14, though as we explain next, the new value of obs is renamed to the existentially quantified variable y.
Observations. The observation on Line 7 of Figure 3 generates the existentially quantified variable y. This variable stands for the input observed value of obs, and thus must satisfy any constraints that obs was originally under. In this case, that means that y must be within a distance of 25 of the altitude alt_1. Furthermore, because obs is being observed, we know that every environment must have it bound to its true value y. Thus, the observation generates the proposition on Line 5 of Figure 14.
Infer. The infer statement on Lines 9-11 of Figure 3 generates the disjunction on Lines 6-8 of Figure 14. Each term in the disjunction corresponds to one of the branches of the infer statement. Each term itself is a conjunction of the predicate that causes the branch to be taken and a proposition describing the actions of the branch. In every case, the predicate is a combination of (possibly negated) ◇-propositions and the branch specifies that the value of cmd is some constant in every environment.
Assignment. The assignment on Line 13 of Figure 3 generates the proposition on Line 10 of Figure 14. This proposition specifies that in any environment in the belief state, the new altitude alt is equal to the previous altitude alt_1 plus the command cmd.
C.0.2 Modal Implications.
In this section, we show how implications of ordinary propositions can be lifted to implications of modal propositions. We further demonstrate how this can be used to show that the invariant p′_∃ presented in Figure 14 and derived in Section C.0.1 implies the loop invariant in Figure 3. This completes the proof that the program in Figure 3 preserves its loop invariant.
The proof that p′_∃ implies the loop invariant is structured as follows. The first step is to treat each term of the disjunction on Lines 6-8 of Figure 14 as a separate case. For clarity, we only discuss the first case on Line 6 in this section. In this case, in addition to the post-condition, we can assume the belief state satisfies the proposition ◇(alt_1 < 450) && □(cmd == 50). Our approach will be to a) show that this implies a ◇-proposition about only y, b) show that we can convert this ◇-proposition to a □-proposition, and c) show that the resulting □-proposition implies the loop invariant.
Implied ◇-Proposition. The first technique for lifting proposition implications to modal implications is that if for any environment P ⇒ Q where P and Q are propositions over environments, then in any belief state ◇P ⇒ ◇Q. In this case, we take P to be the proposition alt_1 < 450 and Q to be the proposition alt_1 < y - 25 || y <= 475. This means we can assume that ◇(alt_1 < y - 25 || y <= 475) is true of our belief state. Furthermore, by the proposition on Line 4, we know that every environment in the belief state satisfies the negation of alt_1 < y - 25, meaning that if an environment satisfies Q it must be because it satisfies y <= 475. Thus, we can assume that ◇(y <= 475).
Conversion to □-proposition. Note that □ and ◇ quantify over environments, and the proposition y <= 475 depends only on the quantified variable y and not on any variables in the environment. This means that if the proposition is true in some environment, it must be true in every environment. Thus, we can assume □(y <= 475).
Implied □-Proposition. Another technique for lifting proposition implications to modal implications is that if for any environment P ⇒ Q, then for any belief state □P ⇒ □Q. We now apply this to the propositions in Figure 14 and the assumption □(y <= 475). In this case, we take P to be
    425 <= alt_1 && alt_1 <= 575 && alt_1 - 25 <= y && y <= alt_1 + 25 && y <= 475
and Q to be
    475 <= alt_1 + 50 && alt_1 + 50 <= 550
By similar logic, we can see that the proposition □(y <= 475) combined with the propositions on Lines 2-4 implies □P and that □Q combined with the proposition on Line 13 implies the loop invariant.
D VERIFYING THE MPL WITH EPISTEMIC HOARE LOGIC
In this section, we will show a piece of the verification process of the MPL belief program in Figure 13. Specifically, we will show how to verify that the belief program's while loop preserves the loop invariant. Note that some of the variable names in this section are slightly different from the names in Section 5.
First, we strengthen the loop invariant to the following proposition. This strengthened condition contains the additional properties:
• Altitude and time-on-ground nonnegativity. The variables altitude and time_on_ground never go below 0.
• Permanent Error Single-upset. It is never the case that both touchdown sensors suffer permanent failures.
• Existential Variables. There are two existentially quantified variables that stand in for the observed touchdown sensor values. This means that if there are no transients on touchdown and the landing legs are not deploying, these sensors will properly indicate a touchdown.
• Engine Disabled Soundness. On the second time step after touchdown, if there are no transients on touchdown, the controller will disable the engine.

    ∃ y1, y2.
    □(altitude >= 0) &&
    □(time_on_ground >= 0) &&
    □(altitude > 0 => time_on_ground == 0) &&
    □(!(permanent_1 == 1 && permanent_2 == 1)) &&
    □(altitude > 0 => engine_enabled == 1) &&
    □((((y1 != 1 && permanent_1 != 1) ||
        (y2 != 1 && permanent_2 != 1)) &&
       transient_on_touchdown == 0 &&
       landing_leg_deployment == 0) => altitude != 0)
    ) &&
    □((transient_on_touchdown == 0 && time_on_ground == 1) => engine_enabled == 0) &&
    □(transient_on_touchdown == 0 => time_on_ground < 2) &&

The strengthened invariant can be shown to be correct on the first iteration of the loop by applying our logic to the model initialization code. Our goal here is to show that this invariant is preserved by the belief program's control loop.
Calling this proposition p^I_∃, we can propagate it through the loop body using the rules in Figure 10. For the post-conditions of each if statement, we assume a natural proposition that includes a disjunction representing both branches and wraps the disjunction in a □ modality. Similarly, for each infer statement, we include a disjunction that uses the modality of the infer statement's condition. This yields the following formula, which we call p^MPL_∃. We will not explain the contents of p^MPL_∃ at length, but will refer to it piecemeal in the remainder of the proof.

    ∃ y_radar, y_td1, y_td2, y1, y2.
    □(altitude_0 >= 0) &&
    □(time_on_ground_0 >= 0) &&
    □(altitude_0 > 0 => time_on_ground_0 == 0) &&
    □(!(permanent_1 == 1 && permanent_2 == 1)) &&
    □(altitude_0 > 0 => engine_enabled_0 == 1) &&
    □((((y1 != 1 && permanent_1 != 1) ||
        (y2 != 1 && permanent_2 != 1)) &&
       transient_on_touchdown == 0 &&
       landing_leg_deployment_0 == 0) => altitude_0 != 0)
    ) &&
    □((transient_on_touchdown_0 == 0 && time_on_ground_0 == 1) => engine_enabled_0 == 0) &&
    □(transient_on_touchdown_0 == 0 => time_on_ground_0 < 2) &&
    ( □(altitude_0 == 0 &&
        engine_enabled_0 == 1 &&
        time_on_ground == time_on_ground_0 + 1) ||
      (!(altitude_0 == 0 && engine_enabled_0 == 1) &&
       time_on_ground == time_on_ground_0)
    ) &&
    □(0 <= altitude_rate && altitude_rate <= 39 && altitude_rate <= altitude_0) &&
    □(altitude == altitude_0 - altitude_rate) &&
    □((prev_error_1_0 == 1 => error_1 == 0) &&
      (error_1 == 0 || error_1 == 1)) &&
    □(prev_error_1 == error_1) &&
    □((prev_error_2_0 == 1 => error_2 == 0) &&
      (error_2 == 0 || error_2 == 1)) &&
    □(prev_error_2 == error_2) &&
    □((transient_on_touchdown == 0 && altitude == 0) => (error_1 != 1 && error_2 != 1)) &&
    ( □(1400 <= altitude &&
        altitude <= 1600 &&
        landing_leg_deployment == 1) ||
      ((1400 > altitude || 1600 < altitude) &&
       landing_leg_deployment == 0)
    ) &&
    □((permanent_1 == 1 && y_td1 == permanent_1_val) ||
      (permanent_1 != 1 &&
       (landing_leg_deployment == 1 || error_1 == 1) &&
       y_td1 == 0 || y_td1 == 1
      ) ||
      (permanent_1 != 1 &&
       landing_leg_deployment != 1 &&
       error_1 != 1 &&
       altitude == 0 => y_td1 == 1 &&
       altitude != 0 => y_td1 == 0
      ) ) &&
    □((permanent_2 == 1 && y_td2 == permanent_2_val) ||
      (permanent_2 != 1 &&
       (landing_leg_deployment == 1 || error_2 == 1) &&
       y_td2 == 0 || y_td2 == 1
      ) ||
      (permanent_1 != 1 &&
       landing_leg_deployment != 1 &&
       error_2 != 1 &&
       altitude == 0 => y_td1 == 1 &&
       altitude != 0 => y_td1 == 0
      ) ) &&
    □(40 <= y_radar - altitude && y_radar - altitude <= 40) &&
    □(y_radar == radar_altitude) &&
    □(y_td1 == current_touchdown_indicator_1) &&
    □(y_td2 == current_touchdown_indicator_2) &&
    ( (□(altitude == 0) && □(engine_enabled == 0)) ||
      (◇(altitude != 0) && □(engine_enabled == engine_enabled_0)) )

We will now show that ∀β. β ⊨ p^MPL_∃ ⇒ β ⊨ p^I_∃. We will proceed by handling each conjunct in p^I_∃ individually.
D.0.1 Altitude nonnegativity.
Here, we will show that
    ∀β. β ⊨ p^MPL_∃ ⇒ β ⊨ □(altitude >= 0)
We rely on the fact that, based on standard properties of numerical comparisons and arithmetic operators,
    ∀σ. σ ⊨ altitude_rate <= altitude_0 ⇒ σ ⊨ altitude_0 - altitude_rate >= 0
Thus, by applying Theorem B.4, Theorem B.5, and the assumptions on Lines 20 and 21, we have the result.
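The arithmetic behind this step can be illustrated by enumerating environments over a bounded range. The Python sketch below is an illustration only: it assumes integer altitudes up to a hypothetical bound `max_altitude`, builds every environment that satisfies the constraints quoted from Lines 20 and 21 of p^MPL_∃, and then evaluates □(altitude >= 0) on the resulting belief state. The dictionary encoding of environments is an assumption of this sketch.

```python
def box(p, beta):
    """□p: p holds in every environment of the belief state beta."""
    return all(p(sigma) for sigma in beta)


def candidate_environments(max_altitude=60):
    """Yield environments that satisfy the Line 20 and Line 21 constraints.

    Line 20: 0 <= altitude_rate <= 39 and altitude_rate <= altitude_0.
    Line 21: altitude == altitude_0 - altitude_rate.
    """
    for altitude_0 in range(max_altitude + 1):
        for altitude_rate in range(0, 40):
            if altitude_rate <= altitude_0:
                yield {
                    "altitude_0": altitude_0,
                    "altitude_rate": altitude_rate,
                    "altitude": altitude_0 - altitude_rate,
                }


# The largest belief state consistent with the two constraints.
beta = list(candidate_environments())
altitude_nonnegative = box(lambda sigma: sigma["altitude"] >= 0, beta)
```

Because □ is universal, the property also holds for every belief state that is a subset of `beta`, which mirrors the way Theorems B.4 and B.5 lift the pointwise implication to the modal level.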
D.0.2 Time on ground non-negativity.
Here, we will show that
    ∀β. β ⊨ p^MPL_∃ ⇒ β ⊨ □(time_on_ground >= 0)
From Lines 3 and 14 of p^MPL_∃, we can deduce that
    □(time_on_ground_0 >= 0) &&
    □(time_on_ground == time_on_ground_0 || time_on_ground == time_on_ground_0 + 1)
We will now analyze the two cases of the disjunction separately. In the first case we use the following implication, which must be true by a simple substitution argument.
    ∀σ. σ ⊨ (time_on_ground_0 >= 0 && time_on_ground == time_on_ground_0) ⇒ σ ⊨ time_on_ground >= 0
In the second case, we can use the following implication, which follows from the interaction of >= and +.
    ∀σ. σ ⊨ (time_on_ground_0 >= 0 && time_on_ground == time_on_ground_0 + 1) ⇒ σ ⊨ time_on_ground >= 0
Combining these theorems together and then applying Theorems B.4 and B.5 shows that □(time_on_ground >= 0).
D.0.3 Single upset for permanent errors.
Here, we will show that □(!(permanent_1 == 1 && permanent_2 == 1)). This is directly assumed on Line 5 of p^MPL_∃.
D.0.4 Engine enabled soundness.
Here, we will show that ∀β. β ⊨ p^MPL_∃ ⇒ β ⊨ □(altitude > 0 => engine_enabled == 1). Notably, the original MPL software did not satisfy this condition and as a result likely shut off its engine too early, resulting in a crash.
We case split on the disjunction on Line 61. In the first case on Line 61, the assumption that □(altitude == 0) means that any environment in the belief state will vacuously satisfy altitude > 0 => engine_enabled == 1.
In the second case on Line 61, we can assume that ◇(altitude != 0) => □(engine_enabled == 1). Inlining definitions and using Theorem B.3, we can see that
    □(altitude == 0) || □(engine_enabled == 1)
Then, applying Theorem B.2, we can deduce that
    □(altitude == 0 || engine_enabled == 1)
which is equivalent to
    □(altitude != 0 => engine_enabled == 1)
Combining this with the fact that □(altitude >= 0) and using Theorems B.4 and B.5 proves the result.
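The second case of this argument moves from a statement that mixes ◇ and □ to a single □-proposition. The following Python sketch checks that step exhaustively on a small model; it is an illustration under assumptions, not a proof. Environments are hypothetical `(altitude, engine_enabled)` pairs with altitude restricted to 0, 1, or 2, and the function names are introduced only for this example.

```python
from itertools import combinations

# Hypothetical finite model: altitude in {0, 1, 2}, engine_enabled in {0, 1}.
ENVIRONMENTS = [(altitude, engine) for altitude in range(3) for engine in (0, 1)]


def box(p, beta):
    """□p: p holds in every environment of beta."""
    return all(p(sigma) for sigma in beta)


def diamond(p, beta):
    """◇p: p holds in at least one environment of beta."""
    return any(p(sigma) for sigma in beta)


def nonempty_belief_states(environments):
    """Yield every nonempty subset of the environment list."""
    for size in range(1, len(environments) + 1):
        yield from combinations(environments, size)


def step_holds(beta):
    """(◇(altitude != 0) => □(engine_enabled == 1)) implies
    □(altitude > 0 => engine_enabled == 1) on the belief state beta."""
    premise = (not diamond(lambda s: s[0] != 0, beta)) or box(lambda s: s[1] == 1, beta)
    conclusion = box(lambda s: (not s[0] > 0) or s[1] == 1, beta)
    return (not premise) or conclusion


all_steps_hold = all(step_holds(beta) for beta in nonempty_belief_states(ENVIRONMENTS))
```

The model only contains nonnegative altitudes, which plays the role of the fact □(altitude >= 0) used in the last step of the proof.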
D.0.5 Time on ground vs. altitude.
Here, we will show that □(altitude > 0 => time_on_ground == 0). The assumptions on Lines 2, 4, 6, 14, and 21 mean that any environment in the belief state satisfies
    altitude_0 > 0 => time_on_ground_0 == 0 &&
    altitude_0 > 0 => engine_enabled_0 == 1 &&
    altitude_0 == 0 => altitude == 0 &&
    (altitude_0 == 0 || engine_enabled_0 != 1 || altitude_0 > 0 && time_on_ground == time_on_ground_0)
Thus, any environment in the belief state satisfies altitude_0 > 0 => time_on_ground == 0. Because altitude_0 >= 0 and altitude_0 == 0 => altitude == 0, any environment with altitude > 0 also has altitude_0 > 0, which gives the result.
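The pointwise reasoning in this section can be checked by exhaustive search over small integer ranges. The Python sketch below is an illustration under two stated assumptions: the second conjunct is read as constraining engine_enabled_0 (matching Line 6 of p^MPL_∃), and altitudes are nonnegative (matching Line 2). The function names and the ranges are hypothetical.

```python
from itertools import product


def assumptions(altitude_0, altitude, tog_0, tog, engine_0):
    """The four per-environment facts used in Section D.0.5."""
    return (
        (not altitude_0 > 0 or tog_0 == 0)
        and (not altitude_0 > 0 or engine_0 == 1)
        and (not altitude_0 == 0 or altitude == 0)
        and (altitude_0 == 0 or engine_0 != 1 or (altitude_0 > 0 and tog == tog_0))
    )


def goal(altitude, tog):
    """altitude > 0 => time_on_ground == 0."""
    return not altitude > 0 or tog == 0


# Search order: altitude_0, altitude, time_on_ground_0, time_on_ground, engine_enabled_0.
search_space = product(range(3), range(3), range(3), range(4), (0, 1))
counterexamples = [
    env for env in search_space
    if assumptions(*env) and not goal(env[1], env[3])
]
```

An empty `counterexamples` list means no environment in the searched ranges satisfies the four facts while violating the goal. Applying □ to the goal then follows as in the text, since every environment in the belief state satisfies the assumptions.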
D.0.6 Existential Variables.
In this section, we will show that
    β ⊨ p^MPL_∃ ⇒ β ⊨ ∃ y1, y2. ((y1 || y2) && transient_on_touchdown == 0 && landing_leg_deployment == 0) => altitude == 0
The conditions on Lines 28, 35, and 46 can be rearranged to show that
    β ⊨ p^MPL_∃ ⇒ β ⊨ ∃ y_td1, y_td2.
        □(transient_on_touchdown == 0 &&
          landing_leg_deployment == 0 &&
          !(permanent_1 == 1 && permanent_2 == 1) =>
          ((y_td1 || y_td2) <=> altitude == 0))
Because by the assumption on Line 5 we have that the belief state satisfies □(!(permanent_1 == 1 && permanent_2 == 1)), we can α-rename y_td1 to y1 and y_td2 to y2, which implies the result.
D.0.7 Engine Disabled Soundness.
In this section we will show that
    ∀β. β ⊨ p^MPL_∃ ⇒ β ⊨ □(time_on_ground == 2 => engine_enabled == 0)
We case split on the disjunction on Line 62. In the first case, we can assume that the belief state satisfies □(engine_enabled == 0), which ensures the result.
In the second case, we can assume that the belief state satisfies ◇(altitude != 0). We note the following implication:
    ∀σ. σ ⊨ altitude != 0 ⇒ σ ⊨ y_radar > 1000 ||
        ((y1 != 1 || y_td1 != 1) && (y2 != 1 || y_td2 != 1)) ||
        !((((y1 == 1 && y_td1 == 1) || (y2 == 1 && y_td2 == 1)) && y_radar <= 1000) => altitude == 0)
Next, we lift this implication to operate over ◇. By taking the contrapositive of Theorem B.5 and applying Theorem B.3 we can show that (∀σ. σ ⊨ p₁ ⇒ σ ⊨ p₂) ⇒ (∀β. β ⊨ ◇p₁ ⇒ β ⊨ ◇p₂). Applying this, we see that
    ∀β. β ⊨ altitude != 0 ⇒ β ⊨ ◇(y_radar > 1000) ||
        ◇((y1 != 1 || y_td1 != 1) && (y2 != 1 || y_td2 != 1)) ||
        ◇!((((y1 == 1 && y_td1 == 1) || (y2 == 1 && y_td2 == 1)) && y_radar <= 1000) => altitude == 0)
We start by addressing the final term of the disjunction. This is equivalent to saying the belief state satisfies
    !□((((y1 == 1 && y_td1 == 1) || (y2 == 1 && y_td2 == 1)) && y_radar <= 1000) => altitude == 0)
While we omit the full derivation, we note that from p^MPL_∃ we can derive that the belief state satisfies the negated proposition:
    □((((y1 == 1 && y_td1 == 1) || (y2 == 1 && y_td2 == 1)) && y_radar <= 1000) => altitude == 0)
Thus, by contradiction, we can disregard this last term of the disjunction.
Because the remaining terms only contain quantified variables, we can apply Theorem B.6 to see that
    ∀β. β ⊨ altitude != 0 ⇒ β ⊨ □(y_radar > 1000) ||
        □((y1 != 1 || y_td1 != 1) && (y2 != 1 || y_td2 != 1))
We now consider each case of the disjunction separately.
In the first case, the approach is to show that for all β,
    β ⊨ □(y_radar > 1000) ⇒ β ⊨ □(altitude > 0) ⇒ β ⊨ □(time_on_ground == 0)
where the first implication uses the assumptions on Lines 58 and 57 of p^MPL_∃ and the second implication uses the result from Section D.0.5.
In the second case, we will assume that the belief state satisfies □(y_radar <= 1000), because otherwise by Theorem B.6 we would have β ⊨ ◇(y_radar > 1000) ⇒ β ⊨ □(y_radar > 1000), which by the above argument must ensure the result. Because of this, we can assume without reservation that □(landing_leg_deployment == 0). We will also assume for now that □(permanent_1 != 1). In any environment σ in a belief state satisfying this, we have that
    σ ⊨ y_td1 != 1 && transient_on_touchdown == 0 ⇒ σ ⊨ altitude != 0 ⇒ σ ⊨ time_on_ground == 0
where the first implication uses assumptions from Lines 28 and 35 and the second one uses the result from Section D.0.5.
Similarly, we have that
    σ ⊨ y_t1 != 1 && transient_on_touchdown == 0 ⇒ σ ⊨ altitude_0 != 0 ⇒ σ ⊨ time_on_ground_0 == 0 ⇒ σ ⊨ time_on_ground == 0
This means that in any belief state satisfying p^MPL_∃, we will have that
    □(transient_on_touchdown == 0 => time_on_ground == 0)
If the assumption □(permanent_1 != 1) is not true, then by the assumption on Line 5 we can assume that □(permanent_2 != 1) and apply a symmetric line of reasoning to y_td2 and y2.
D.0.8 Bounded time-on-ground.
We will now show that
    β ⊨ p^MPL_∃ ⇒ β ⊨ □(transient_on_touchdown => time_on_ground < 2)
We can summarize the proposition on Line 14 as
    ∀β. β ⊨ □(engine_enabled_0 == 1 => time_on_ground == time_on_ground_0 + 1 &&
              engine_enabled_0 != 1 => time_on_ground == time_on_ground_0)
We can now apply the fact that
    ∀σ. σ ⊨ transient_on_touchdown == 0
        engine_enabled_0 == 1 => time_on_ground == time_on_ground_0 + 1 &&
        engine_enabled_0 != 1 => time_on_ground == time_on_ground_0 &&
        time_on_ground_0 >= 0 &&
        time_on_ground_0 < 2 &&
        time_on_ground_0 == 1 => engine_enabled_0 == 0
        ⇒ σ ⊨ transient_on_touchdown => time_on_ground < 2
which can be proven using standard techniques for inequalities and propositional logic. We can lift this to the □ modality using Theorems B.4 and B.5 and then further apply the assumptions on Lines 3, 12, and 13 to show the result.
E THE MPL-EXP BENCHMARK
In this section, we describe the MPL-Exp benchmark. This section is laid out similarly to Section 5. Figure 15 presents a modified version of Figure 12, and Figure 16 presents a modified version of Figure 13. Section E.1 is analogous to Section D and explains how to verify the modified programs using Epistemic Hoare logic.
Altitude Model.
Instead of the uniform 1-meter discretization in Figure 12, the MPL-Exp benchmark uses an exponential discretization. This modified discretization applies to the alt and radar_alt variables, and is defined as follows:
• When an altitude variable is 4, the altitude is between 1000 and 10000 meters.
• When an altitude variable is 3, the altitude is between 100 and 1000 meters.
• When an altitude variable is 2, the altitude is between 10 and 100 meters.
• When an altitude variable is 1, the altitude is between 1 and 10 meters.
• When an altitude variable is 0, the altitude is between 0 and 1 meters.
• When an altitude variable is -1, the altitude is exactly 0 meters.
The model in Figure 15 is the same as that of Figure 12 except that the altitude model has been modified to reflect the exponential discretization. The update rules are designed to conservatively over-approximate the model in Figure 12. This means that after applying the definition of the discretization, any possible true environment in Figure 12 is also possible under Figure 15.
The belief program in Figure 16 is the same as that of Figure 13 except that all references to the alt variable have been modified to use the exponential discretization.
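The exponential discretization can be written as a small mapping from an altitude in meters to its level. The Python sketch below is an illustration only. The text does not say which level a boundary value such as exactly 10 meters belongs to, so the half-open intervals used here are an assumption of this sketch, as is the function name `discretize_altitude`.

```python
def discretize_altitude(meters):
    """Map an altitude in meters to its exponential discretization level.

    Assumed boundary convention (not fixed by the text): each level covers
    a half-open interval [lower, upper), except level 4, which includes
    10000, and level -1, which is exactly 0 meters.
    """
    if meters < 0 or meters > 10000:
        raise ValueError("altitude outside the modeled range of 0 to 10000 meters")
    if meters == 0:
        return -1
    if meters < 1:
        return 0
    if meters < 10:
        return 1
    if meters < 100:
        return 2
    if meters < 1000:
        return 3
    return 4
```

Under this convention, an altitude of 1500 meters maps to level 4 and an altitude of 0.5 meters maps to level 0. Only a landed vehicle, at exactly 0 meters, maps to level -1, which is why comparisons with -1 identify touchdown in the MPL-Exp programs.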
E.1 Verification
In this section, we explain how to verify the loop invariant of the MPL-Exp example, which describes the same property as that of the MPL example, but with the exponential altitude discretization. The process proceeds the same as in Section D. We first construct a strengthened invariant, which is the same as the strengthened invariant in Section D but with altitude comparisons to 0 replaced with comparisons to -1. Then, using the rules of Epistemic Hoare logic, we construct a formula p^MPL-Exp_∃ by propagating the strengthened invariant through the loop body, and show that p^MPL-Exp_∃ implies the invariant itself.
To show that p^MPL-Exp_∃ implies the strengthened invariant, we follow the same general line of reasoning as in Section D. All of the proof except the engine disabled soundness condition follows the exact same reasoning as Section D with appropriate changes to take into account the new strengthened invariant and the use of p^MPL-Exp_∃.
Showing engine disabled soundness requires more care because the proof makes use of intermediate propositions that critically depend on the altitude model.

    // Model Initialization
    prev_err_1 = 0; prev_err_2 = 0;
    trans_td = 0;
    alt = 4;
    time_on_ground = 0;
    // Permanent errors
    perm_1 = choose(. == 0 || . == 1);
    perm_2 = choose((. == 0 || . == 1) && (perm_1 == 1 => . == 0));
    perm_1_v = choose(. == 0 || . == 1);
    perm_2_v = choose(. == 0 || . == 1);
    // Controller initialization
    ...
    while (engine_enabled == 1) {
      (alt > -1 => engine_enabled == 1) && (trans_td == 0 => time_on_ground < 2)
    } {
      // Model start
      if (alt == -1 && engine_enabled == 1) {
        time_on_ground = time_on_ground + 1
      };
      if (alt >= 3) {
        alt_rate = choose(. == 0 || . == 1)
      } else {
        alt_rate = choose(. <= prev_alt + 1)
      };
      alt = alt - alt_rate;
      if (alt >= 3) {
        radar_alt = choose(alt - 1 <= . && . <= alt + 1)
      } else if (alt == 2) {
        radar_alt = choose(-1 <= . && . <= 3)
      } else {
        radar_alt = choose(-1 <= . && . <= 2)
      };
      err_1 = choose((prev_err_1 == 1 => . == 0) && (. == 0 || . == 1));
      prev_err_1 = err_1;
      err_2 = choose((prev_err_2 == 1 => . == 0) && (. == 0 || . == 1));
      prev_err_2 = err_2;
      if (alt == -1 && (err_1 == 1 || err_2 == 1)) {
        trans_td = 1;
      };
      leg_err = alt == 4;
      if perm_1 {
        cur_td_1 = perm_1_v
      } else if (leg_err == 1 || err_1 == 1) {
        cur_td_1 = choose(. == 0 || . == 1)
      } else {
        cur_td_1 = alt == -1;
      };
      if perm_2 {
        cur_td_2 = perm_2_v
      } else if (leg_err == 1 || err_2 == 1) {
        cur_td_2 = choose(. == 0 || . == 1)
      } else {
        cur_td_2 = alt == -1;
      };
      // Model end
      // Controller loop start
      ...
    }

Fig. 15. Model for the MPL-Exp benchmark.

    // Model Initialization
    ...
    engine_enabled = 1;
    while (engine_enabled == 1) {
      □((alt > -1 => engine_enabled == 1) && (trans_td == 0 => time_on_ground < 2))
    } {
      // Model start
      ...
      // Model end
      observe radar_alt;
      observe cur_td_1;
      observe cur_td_2;
      infer □(alt == -1) { engine_enabled = 0 }
    }

Fig. 16. Belief program of the MPL-Exp benchmark.
E.1.1 Engine Disabled Soundness.
To apply an analogous line of reasoning to Section D.0.7, we need a value c∗ that satisfies the conditions
    β ⊨ □(y_radar > c∗) ⇒ β ⊨ □(alt > -1)
and
    β ⊨ □(y_radar <= c∗) ⇒ β ⊨ □(landing_leg_deployment == 0)
While we omit the full derivation here, we note that choosing c∗ = , applying p^MPL-Exp_∃ and Theorems B.4 and B.5 can prove these implications.
The remaining pieces of the proof are the same as in Section D.0.7.