Certifying Differential Equation Solutions from Computer Algebra Systems in Isabelle/HOL

Abstract

The Isabelle/HOL proof assistant has a powerful library for continuous analysis, which provides the foundation for verification of hybrid systems. However, Isabelle lacks automated proof support for continuous artifacts, which means that verification is often manual. In contrast, Computer Algebra Systems (CAS), such as Mathematica and SageMath, contain a wealth of efficient algorithms for matrices, differential equations, and other related artifacts. Nevertheless, these algorithms are not verified, and thus their outputs cannot, of themselves, be trusted for use in a safety critical system. In this paper we integrate two CAS systems into Isabelle, with the aim of certifying symbolic solutions to ordinary differential equations. This supports a verification technique that is both automated and trustworthy.



Thomas Hickman, Christian Pardillo Laursen, and Simon Foster

University of York


Verification of Cyber-Physical and Autonomous Systems requires that we can verify both discrete control and continuous evolution, as envisaged by the hybrid systems domain [1]. Whilst powerful bespoke verification tools exist, such as the KeYmaera X [2] proof assistant, software engineering requires a general framework, which can support a variety of notations and paradigms [3]. Isabelle/HOL [4] is such a framework. Its combination of an extensible frontend for syntax processing, and a plug-in oriented backend, based in ML, which supports a wealth of heterogeneous semantic models and proof tools, provides a flexible platform for software development, verification, and assurance [5,6,7].

Verification of hybrid systems in Isabelle is supported by several detailed libraries of Analysis, including Multivariate Analysis [8], Affine Arithmetic [9], and HOL-ODE [10], which supports reasoning for Systems of Ordinary Differential Equations (SODEs). These libraries essentially build all of calculus from the ground up, and thus provide the highest level of rigour. However, Isabelle currently does not offer many automated proof facilities for hybrid systems. KeYmaera X [2], in contrast, is highly automated and thus very usable, even by non-experts. This is partly due to the inclusion of efficient algorithms for differential equation solving and quantifier elimination, which are both vital techniques. Several of these techniques are supported by integration with Computer Algebra Systems (CAS), in particular the Wolfram Engine, which is the foundation for Mathematica.

Nevertheless, whilst CAS systems are efficient, they do not achieve the same level of rigour as Isabelle, and thus the results cannot be used without care in a high assurance development. In Isabelle, all results are certified using a small kernel against the core axioms of the object logic, following the LCF architecture. The correctness of external tools does not need to be demonstrated, but only that particular results can be certified. This approach has been highly successful, and in particular has allowed the creation of the famous sledgehammer tool [11,12], which soundly integrates external automated theorem proving systems.

In this paper, we apply this approach to the integration of CAS systems into Isabelle/HOL. We focus on generation and certification of solutions to SODEs, though our approach is more generally applicable. We integrate two CAS systems: the Wolfram Engine and SageMath, the latter of which is open source. We show how SODEs and their solutions can be described and certified in Isabelle. We then show how we have integrated the two CAS systems, using their APIs, and several new high level Isabelle commands.
We evaluate our approach using a large test set of SODEs, including a large fragment of the KeYmaera X example library. Our approach is largely successful, but we highlight some future work for improving the certification proof process in Isabelle. The code supporting our approach can be found in the following GitHub repository: https://github.com/ThomasHickman/Isabelle-CAS-Integration

The structure of our paper is as follows. In §2, we highlight related work and necessary context. In §3, we describe our tactic for certification of SODEs. In §4 and §5, we present our integrations with SageMath and Wolfram, respectively. In §6, we evaluate our approach using our test set, and in §7 we conclude.

The dominant approach for CPS verification is differential dynamic logic (dL), a proof calculus for reasoning about hybrid programs [13]. Hybrid programs allow modelling of hybrid systems by providing operators for discrete transitions, such as assignment and nondeterministic composition, together with modelling dynamics via continuous evolution of a SODE.

The most advanced tool for deductive verification of hybrid systems is KeYmaera X [2], a theorem prover for dL. Its capabilities have been shown in numerous case studies, such as in [14] for verifying various classes of robotic collision-avoidance algorithms, and in [15] for proving that the ACAS X aircraft collision avoidance system provides safe guidance under a set of assumptions. KeYmaera X uses the Wolfram Engine for SODE solving and quantifier elimination.

KeYmaera X is, however, restricted to reasoning about dL hybrid programs, and cannot be applied directly to other notations. In particular, we cannot show that a controller specification is refined by a given implementation in a language like C [16], although tools such as VeriPhy [17] and ModelPlex [18] somewhat bridge this gap. It also cannot handle transcendental functions, such as sin and log, which are often used by control engineers.

dL has also been implemented [19,20,21] in the Isabelle proof assistant [4], as both a deep [19] and shallow embedding [20,21]. Verification in Isabelle brings the advantage of generality, whereby the hybrid systems proof could be used to show correctness of an implementation [16], or used in a larger proof about a complex system.
It also allows integration with several notations in a single development, which is the goal of our target verification framework, Isabelle/UTP [22].

A present disadvantage of Isabelle is the lack of automated proof, in comparison to KeYmaera X. Consequently, our goal is to improve automation by safe integration of a CAS. Mathematica has previously been integrated into Isabelle for quantifier elimination problems over univariate polynomials [23]. Here, we integrate two CAS systems for the purpose of certifying SODE solutions.

Plugins in Isabelle, like our CAS integration, are written using the ML language, and manipulate terms of the logic. Terms are used to encode a typed λ-calculus, and are encoded using the following ML data type.

    datatype term =
      Const of string * typ | Free of string * typ |
      Var of indexname * typ | Bound of int |
      Abs of string * typ * term | $ of term * term

The type typ describes Isabelle types. The basic constructors include constants (Const), free variables (Free), schematic variables (Var), and bound variables (Bound) with de Bruijn indices. With the exception of Bound, these all consist of a name and a type. The remaining two constructors represent λ-abstractions and applications of one term to another. For example, the term $\lambda x\,y : \mathbb{R}.\ x + y$, a function that adds together two real numbers, is represented as follows:

    Abs ("x", "real", Abs ("y", "real",
      Const ("Groups.plus_class.plus", "real => real => real") $ Bound 1 $ Bound 0))

As usual, $\lambda x\,y.\ e$ is syntactic sugar for $\lambda x.\ \lambda y.\ e$. Predefined functions, such as +, are represented by constant terms and are fully qualified.

Our CAS plugin takes as input a SODE encoded as a term, which it turns into input for a CAS. The CAS returns a solution in its own internal representation, if one exists. This is turned into another Isabelle term, and certification of the solution is attempted. Our approach builds on Immler's library for representing SODEs and their solutions [10,9] (HOL-ODE).

In the next section we describe the approach for SODE certification.
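The term encoding described above can be mirrored in a few lines of Python to show how de Bruijn indices are resolved when a term is turned back into readable syntax. This is only an illustrative sketch, not the plugin's ML code: the class names follow the ML constructors, `App` stands in for the infix `$` constructor, the schematic `Var` constructor is omitted, and the `show` function and `INFIX` table are hypothetical helpers introduced here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    name: str
    typ: str


@dataclass(frozen=True)
class Free:
    name: str
    typ: str


@dataclass(frozen=True)
class Bound:
    index: int  # de Bruijn index: 0 refers to the innermost enclosing Abs


@dataclass(frozen=True)
class Abs:
    name: str
    typ: str
    body: "Term"


@dataclass(frozen=True)
class App:  # plays the role of the ML infix constructor `$`
    fun: "Term"
    arg: "Term"


Term = Union[Const, Free, Bound, Abs, App]

# Hypothetical table of constants printed infix; only the constant
# mentioned in the text is listed.
INFIX = {"Groups.plus_class.plus": "+"}


def show(term, env=()):
    """Render a term as a string; env holds binder names, innermost first."""
    if isinstance(term, Bound):
        return env[term.index]
    if isinstance(term, (Const, Free)):
        return term.name
    if isinstance(term, Abs):
        return f"(λ{term.name}. {show(term.body, (term.name,) + env)})"
    if isinstance(term, App):
        head = term.fun
        if isinstance(head, App) and isinstance(head.fun, Const) and head.fun.name in INFIX:
            op = INFIX[head.fun.name]
            return f"{show(head.arg, env)} {op} {show(term.arg, env)}"
        return f"{show(term.fun, env)} {show(term.arg, env)}"
    raise TypeError(f"not a term: {term!r}")


plus = Const("Groups.plus_class.plus", "real => real => real")
add = Abs("x", "real", Abs("y", "real", App(App(plus, Bound(1)), Bound(0))))
rendered = show(add)
```

Here `Bound(1)` resolves to `x` and `Bound(0)` to `y` because the environment is extended at each abstraction, which is the same bookkeeping any traversal of an Isabelle term has to perform before emitting CAS input.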

In this section we describe how SODE solutions can be certified using our ode_cert proof tactic. We assume a SODE of the form $\dot{x}(t) = f\,t\,(x\,t)$, described by a function $f : \mathbb{R} \to \mathbb{R}^n \to \mathbb{R}^n$, which gives a vector of derivatives for each continuous variable at time $t$ and current state $x\,t$. A candidate solution to this SODE is a function $x : \mathbb{R} \to \mathbb{R}^n$, which can potentially have a restricted domain $T \subseteq \mathbb{R}$ and range $D \subseteq \mathbb{R}^n$. For example, consider the following SODE:

Example 1. $(\dot{x}(t), \dot{y}(t), \dot{z}(t)) = (t, x(t), 1)$

This is represented by the function $f \triangleq (\lambda\,t\,(x, y, z).\ (t, x, 1))$, whose type is $\mathbb{R} \to \mathbb{R}^3 \to \mathbb{R}^3$. Its representation in Isabelle is shown in Figure 1, where a name is introduced for it by an abbreviation.

Fig. 1. SODE representation in Isabelle

The goal of the ode_cert tactic, then, is to prove conjectures of the form

    x solves-ode f T D

which specifies that $x$ is indeed a solution to $f$, and is defined within the HOL-ODE package [10]. It requires that we solve the following two predicates:

$$(\forall t \in T.\ (x\ \textit{has-vector-derivative}\ (f\,t\,(x\,t)))\ (\textit{at}\ t\ \textit{within}\ T)) \quad \text{and} \quad x \in T \to D$$

We need to show that at every $t$ in the domain, the derivative of $x$ matches the one predicted by $f$, and that $x$ has the correct domain and range. The predicate has-vector-derivative is defined within the HOL-Analysis package [8], which also provides a large library of differentiation theorems. For brevity, we use the syntax $f \vartriangleright f' \ [t \in T]$ to mean $(f\ \textit{has-vector-derivative}\ f')\ (\textit{at}\ t\ \textit{within}\ T)$.

Theorem 1 (Derivative Introduction Theorems).

(a) $(\lambda x.\ c) \vartriangleright 0 \ [t \in T]$

(b) $(\lambda x.\ x) \vartriangleright 1 \ [t \in T]$

(c) $\dfrac{f \vartriangleright f' \ [t \in T] \qquad g \vartriangleright g' \ [t \in T]}{(\lambda x.\ (f\,x,\ g\,x)) \vartriangleright (f', g') \ [t \in T]}$

(d) $\dfrac{f \vartriangleright f' \ [t \in T] \qquad g \vartriangleright g' \ [t \in T]}{(\lambda x.\ f\,x + g\,x) \vartriangleright f' + g' \ [t \in T]}$

(e) $\dfrac{f \vartriangleright f' \ [t \in T]}{(\lambda x.\ \sin(f\,x)) \vartriangleright (f' \cdot \cos(f\,t)) \ [t \in T]}$

(f) $\dfrac{f \vartriangleright f' \ [t \in T] \qquad f\,t > 0}{(\lambda x.\ \sqrt{f\,x}) \vartriangleright (f' \cdot 1/(2 \cdot \sqrt{f\,t})) \ [t \in T]}$

(g) $\dfrac{f \vartriangleright f' \ [t \in T] \qquad g \vartriangleright g' \ [t \in T] \qquad g\,t \neq 0}{(\lambda x.\ f\,x / g\,x) \vartriangleright (-f\,t \cdot (1/(g\,t) \cdot g' \cdot 1/(g\,t)) + f'/g\,t) \ [t \in T]}$

These are standard laws, but in a deductive rather than equational form. A constant function $\lambda x.\ c$ has derivative 0 (a), and the identity function $\lambda x.\ x$ has derivative 1 (b). If a derivative is composed of a pair $(f', g')$ then it can be decomposed into two derivative proofs (c).
This law is particularly useful for decomposing a SODE into its component ODEs. A function composed of two summed components can similarly be decomposed (d). The derivative of sin is cos (e). The remaining two rules are for square root (f) and division (g). They both have additional provisos to avoid undefinedness. Square root $\sqrt{x}$ can be differentiated only when $x > 0$. Similarly, a division requires that the denominator is non-zero, hence the extra proviso.

The strategy employed by ode_cert is as follows:

Algorithm 1 (SODE Certification Method)

1. Decompose a SODE in $n$ variables to $n$ subgoals of the form $f_i \vartriangleright f'_i \ [t \in T]$ for $1 \le i \le n$;
2. Replace every such goal with two goals: $f_i \vartriangleright X_i \ [t \in T]$ and $X_i = f'_i$, using a fresh meta-variable $X_i$. The latter goal is used to prove equivalence between the expected and actual derivative in $f'$;
3. For each remaining derivative goal, recursively apply the derivative introduction laws (Theorem 1). If any derivative goals remain, the method fails;
4. The remaining subgoals are equalities and inequalities in the real variables. Attempt to discharge them all using real arithmetic and field laws, using the simplifier tactic for recursive equational rewriting;
5. If no goals remain, the ODE is certified.

We exemplify this method with Example 1, using $f \triangleq (\lambda\,t\,(x, y, z).\ (t, x, 1))$ and the candidate solution $x \triangleq (\lambda t.\ (t^2/2 + x_0,\ t^3/6 + x_0 \cdot t + y_0,\ z_0 + t))$, where $x_0$, $y_0$, $z_0$ are integration constants, or initial values for the variables. We form the goal x solves-ode f ℝ ℝ³ and execute ode_cert. The domain and range constraints are trivial in this case. Following step (1), we obtain 3 subgoals:

1. $(\lambda t.\ t^2/2 + x_0) \vartriangleright t \ [t \in T]$;
2. $(\lambda t.\ t^3/6 + x_0 \cdot t + y_0) \vartriangleright t^2/2 + x_0 \ [t \in T]$;
3. $(\lambda t.\ z_0 + t) \vartriangleright 1 \ [t \in T]$.

We focus on the second subgoal. Having applied the derivative introduction laws, we receive two proof obligations. The first is $6 \neq 0$, which is required since 6 is the denominator in a division, and is trivial. The second is the following equality:

$$-(t^3) \cdot (1/6 \cdot 0 \cdot 1/6) + 3 \cdot 1 \cdot t^{3-1}/6 + (x_0 \cdot 1 + 0 \cdot t) + 0 = t^2/2 + x_0$$

Though seemingly complex, it simplifies to give the desired result, since every summand containing a factor of 0 vanishes. This, and more complex goals, can be solved using the built-in simplification sets algebra_simps and field_simps. The other two derivative subgoals similarly reduce, and so the solution is certified.

The interface for the CAS tools is through two Isabelle commands:

    ode_solve ⟨sode⟩
    ode_solve_thm (⟨name⟩:)? ⟨sode⟩ ⟨domain⟩? ⟨codomain⟩? ⟨assumption⟩?

The ode_solve command takes a SODE in the form used in Example 1, and sends this to the CAS for processing. If a solution is found, the tool suggests a lemma that can be inserted, of the form x solves-ode f T D, with a concrete solution x, in the style of the sledgehammer tool [11,12]. The given lemma is proved using ode_cert. The ode_solve_thm command produces a lemma directly, with the given name. It also optionally allows specification of an explicit domain, codomain, and assumption. The assumption is necessary if the SODE contains constants that are locally constrained. Different CAS systems can be selected using the Isabelle variable SODE_solver, which can take the value fricas, maxima, sympy, or wolfram. An example showing our tool can be seen in Figure 4.

In the next two sections we describe our integrations of Isabelle with SageMath and the Wolfram Engine.

Fig. 2. Overview of the SageMath pipeline
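The certification of Example 1 described in this section can be cross-checked outside Isabelle with exact rational arithmetic. The sketch below is an informal sanity check, not the ode_cert tactic: it assumes polynomial solutions only, fixes arbitrary numeric values for the integration constants, and uses hypothetical helper names (`derivative`, `normalise`, `solves_ode`). It checks componentwise that the derivative of the candidate solution equals the right-hand side of the SODE evaluated on that solution.

```python
from fractions import Fraction as F


def derivative(p):
    """Differentiate a polynomial in t given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]


def normalise(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def solves_ode(candidate, rhs):
    """Check d/dt candidate_i == rhs_i(t, candidate) for every component."""
    t = [F(0), F(1)]  # the polynomial "t"
    expected = rhs(t, candidate)
    return all(
        normalise(derivative(x_i)) == normalise(f_i)
        for x_i, f_i in zip(candidate, expected)
    )


def example_1_rhs(t, state):
    """Right-hand side of Example 1: (x', y', z') = (t, x, 1)."""
    x, y, z = state
    return [t, x, [F(1)]]


# Arbitrary sample values for the integration constants (an assumption of
# this sketch; the Isabelle proof treats them symbolically).
x0, y0, z0 = F(3), F(-1), F(5)
x = [x0, F(0), F(1, 2)]        # t^2/2 + x0
y = [y0, x0, F(0), F(1, 6)]    # t^3/6 + x0*t + y0
z = [z0, F(1)]                 # z0 + t
certified = solves_ode([x, y, z], example_1_rhs)

# A deliberately wrong candidate (t^3/3 instead of t^3/6) is rejected.
bad_y = [y0, x0, F(0), F(1, 3)]
rejected = not solves_ode([x, bad_y, z], example_1_rhs)
```

Unlike this check, the Isabelle proof holds for all values of the constants and is validated by the kernel, which is what makes the CAS output trustworthy.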

SageMath [24] is an open source competitor to the Wolfram Engine. Its functionality is accessed via calls to a Python API. It integrates several open source CAS systems in order to provide its functionality, in each case choosing the best implementation for a particular symbolic computation. This makes SageMath an ideal target for integration with Isabelle. In the latest version of SageMath (version 9.1), Maxima [25] is the default CAS for solving SODEs. FriCAS [26] is also an option, though this is not bundled with SageMath by default. Our plugin also supports the CAS SymPy [27], which is implemented using the SymPy to SageMath translation functions.

An overview of the SageMath pipeline is shown in Figure 2. The distinct steps that the SageMath integration uses are Steps 2 to 6.

Steps 2 and 6: Conversion.

In Step 2, the SageMath integration code receives a term for the input SODE. This is traversed, converted to a string containing Python code, passed to a Python integration script over the command line, and evaluated using Python's eval function. In Step 6 the opposite happens: the Python integration script traverses the expression, converts it to a string containing Isabelle code, and returns it on standard output. This string is then evaluated using the Isabelle function Syntax.read_prop : context -> string -> term, which parses and type checks a proposition term in a given proof context.

Converting between the two representations is mostly a task of mapping between function names. However, there are several exceptions to this rule:

1. Numbers in Isabelle are in a decomposed binary format. The Isabelle function HOLogic.dest_number is used to convert these to integer values.
2. There are different operators in Isabelle for integer, rational and real powers, whereas SageMath uses one operator. When converting from SageMath to Isabelle, the plugin chooses the simplest type of power function.
3. Isabelle does not contain a representation of the mathematical constant $e$, but rather the exponential function $e^n$. When $e$ is used on its own in SageMath, this is converted into exp(1).

Step 3: Preprocessing.

In many CAS systems, the SODE solving functionality is less powerful than the single equation ODE solving functionality. This often means that SODEs can be solved by the CAS system only when rewritten as ODEs. The preprocessing step, described in Algorithm 2, takes advantage of this by rewriting two different types of SODEs:

1. SODEs formed from a higher order ODE, where a variable is introduced to represent a higher derivative, so that the SODE can conform to the format specified in ode_cert (for example $(\dot{x}, \dot{y}) = (2x + y, x)$). This can be preprocessed back into the higher order ODE which was originally intended.
2. SODEs formed from two distinct systems. For example, the SODE of a particle acting under gravity with a constant horizontal velocity, $(\dot{v}_x, \dot{v}_y) = (2, -g)$. This can be preprocessed into multiple independent ODEs, and solved using the CAS's ODE solving functionality.

Algorithm 2: Preprocessing Step Algorithm

    initialise simple-equations to an empty array;
    repeat
        foreach equation in the sode do
            if equation is of the form ẏ = x for any variables x and y then
                replace any occurrence of x in sode with ẏ;
                append equation to simple-equations;
                remove equation from sode
            end
            if equation is of the form ẏ = f(t, y) (the equation is solely a
               function of its independent and dependent variables) then
                solved-equation ← SolveODE(equation);
                replace any occurrence of y in sode with solved-equation;
                remove equation from sode;
                output solved-equation as a solution to equation
            end
        end
    until sode is unchanged;
    solve sode and output the solutions;
    foreach equation as ẏ = x in simple-equations do
        find the solution for y from the existing outputs, differentiate it,
        and output this as the solution for x
    end

As an example of Algorithm 2, consider again Example 1, $(\dot{x}, \dot{y}, \dot{z}) = (t, x, 1)$. The equation $\dot{y} = x$ exists in this SODE. This means we can transform the equation $\dot{x} = t$ into $\ddot{y} = t$, yielding a new SODE of the form $(\ddot{y}, \dot{z}) = (t, 1)$. Both equations ($\ddot{y} = t$ and $\dot{z} = 1$) in this SODE are expressed solely in terms of their independent and dependent variables, so they can be solved without considering the other equations. This yields the solution:

$$(y, z) = \left(\frac{t^3}{6} + c_1 t + c_2,\ t + c_3\right) \qquad (1)$$

We can now find $x$ by calculating $\dot{y}$. The final solution is:

$$(x, y, z) = \left(\frac{t^2}{2} + c_1,\ \frac{t^3}{6} + c_1 t + c_2,\ t + c_3\right) \qquad (2)$$

Step 4: Solving.

In Step 4, the input SODE is fed into one of three SODE solvers: SymPy, Maxima and FriCAS. These three solvers were considered as potential CAS systems to use, in the order SymPy, then Maxima, then FriCAS. FriCAS performs best on the test set, but the option to use the other CAS systems is preserved.
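The first rewrite of the preprocessing step (folding an equation of the form ẏ = x back into a higher order ODE) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the plugin's implementation: right-hand sides are plain strings, the function name `fold_simple_equations` is hypothetical, and the fold is only attempted when x does not occur in any other right-hand side, so the substitution of ẏ for x required by the full algorithm is not covered.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def fold_simple_equations(sode):
    """Fold equations of the form y' = x into higher order equations.

    sode maps each state variable to the right-hand side of its first order
    equation. The result maps each remaining variable to (order, rhs),
    meaning d^order(var)/dt^order = rhs, together with the list of folded
    equations (y, order, x), each recording that x = d^order(y)/dt^order.
    """
    system = {var: (1, rhs.strip()) for var, rhs in sode.items()}
    simple = []
    changed = True
    while changed:
        changed = False
        for y, (order, rhs) in list(system.items()):
            x = rhs
            if x == y or x not in system:
                continue  # the right-hand side is not another state variable
            used_elsewhere = any(
                x in IDENT.findall(other_rhs)
                for var, (_, other_rhs) in system.items()
                if var != y
            )
            if used_elsewhere:
                continue  # would need x replaced by a derivative of y
            x_order, x_rhs = system.pop(x)
            # y^(order) = x and x^(x_order) = x_rhs give y^(order + x_order) = x_rhs
            system[y] = (order + x_order, x_rhs)
            simple.append((y, order, x))
            changed = True
            break  # the system changed, so restart the scan
    return system, simple


# Example 1: (x', y', z') = (t, x, 1)
system, simple = fold_simple_equations({"x": "t", "y": "x", "z": "1"})
```

For Example 1 this produces the second order equation for y and the untouched equation for z, and records that x is recovered afterwards by differentiating the solution for y, matching the final loop of Algorithm 2.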

Step 5: Domain finding.

When verifying the solution of a SODE, ode_cert requires a domain for which the solution is valid. When SageMath returns a solution, it does not return this information, so the domain needs to be generated from the solution. The strategy we have taken is to assume that the domain for which the solution is valid is the maximal domain of the solution.

SageMath does not have any maximal domain finding functionality, so we have used SymPy for this part of the pipeline. The function that finds maximal domains in SymPy was also patched so that larger maximal domains can be calculated.

We evaluate our SageMath integration in §6, but first, in the next section, we describe our Wolfram integration.

Using the Wolfram Engine over SageMath comes with the main disadvantage of losing open-source status, but it also has a few advantages. In our implementation, it is notably faster at producing solutions, and the SODEs require no preprocessing before solving. Our Wolfram interface is written entirely in SML, which makes it easier for those familiar with Isabelle system programming to use and extend.

The implementation of the Wolfram interface is illustrated in Figure 3. First, the Isabelle term that represents the SODE is translated to an equivalent Wolfram expression. This is passed to the Wolfram Engine for solving. The Wolfram solution to the SODE is lexed and parsed, and stored as an ML datatype. The Wolfram interface is used to retrieve and parse the solution domain, and all parsed expressions are translated to Isabelle. Finally, the plugin combines the domain, the solution, and the original SODE to produce the solution theorem.

The pull request for the SymPy maximal domain patch can be found at https://github.com/sympy/sympy/pull/19024 . This is merged at https://github.com/sympy/sympy/pull/19047

Fig. 3. Wolfram plugin workflow (stages shown: SODE term, Wolfram expression, Wolfram solution, expression datatype, Isabelle solution, solution domain, solution lemma)
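The first stage of this workflow, translating a SODE into a call to the Wolfram Engine's DSolve function, can be sketched as follows. This is a hedged illustration in Python rather than the plugin's SML code: right-hand sides are plain strings, the function name `build_dsolve_call` is hypothetical, and variables are renamed to fresh single letters in alphabetical order, with the independent variable first. Constants that happen to share a name with one of the fresh letters are not handled in this sketch.

```python
import re
import string

IDENT = re.compile(r"[A-Za-z_]\w*")


def build_dsolve_call(indep, sode):
    """Build a DSolve call for a first order system.

    indep is the independent variable; sode maps each dependent variable to
    the right-hand side of its equation, written as a string.
    """
    dependents = sorted(sode)
    # Fresh names: independent variable first, then dependents alphabetically.
    fresh = dict(zip([indep] + dependents, string.ascii_lowercase))
    a = fresh[indep]

    def rename(match):
        name = match.group(0)
        if name == indep:
            return a
        if name in sode:
            return f"{fresh[name]}[{a}]"  # dependent variables become functions of a
        return name  # anything else (function names, constants) is left alone

    equations = []
    for var in dependents:
        rhs = IDENT.sub(rename, sode[var])
        equations.append(f"{fresh[var]}'[{a}]=={rhs}")
    functions = [f"{fresh[var]}[{a}]" for var in dependents]
    return "DSolve[{" + ", ".join(equations) + "},{" + ",".join(functions) + "}," + a + "]"


# Example 1: (x', y', z') = (t, x, 1)
call = build_dsolve_call("t", {"x": "t", "y": "x", "z": "1"})
```

Keeping the renaming as an explicit dictionary makes the later reverse translation straightforward, since the same mapping can be inverted when the solution comes back.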

Default Wolfram expressions are typeset and difficult to parse, so we instead retrieve solutions from the Wolfram Engine in "full form" (please see https://reference.wolfram.com/language/ref/FullForm.html ). This format presents the expression in a similar style to an algebraic datatype, with explicit constructors, and is implemented in our tool as the following ML datatype.

    datatype expr =
      Int of int | Real of real | Id of string |
      Fun of string * expr list | CurryFun of string * expr list list

We distinguish between functions with only one set of arguments (Fun) and those with several (CurryFun), as the latter are uncommon and dealing with them clutters the code.

To illustrate the implementation stages, the internal representation at each stage is shown for Example 1. The approach for translation of the SODE to Wolfram is the following:

1. Generate an alphabetically ordered variable mapping, for each of the SODE variables, to avoid name clashes and ease solution reconstruction.
2. Translate the term to an equivalent Wolfram expression by traversing the expression tree.
3. Construct a DSolve call using the expression.

DSolve is the general differential equation solver for the Wolfram Engine [28], which can solve a list of differential equations for given dependent and independent variables.

We exemplify the translation, again using Example 1. To represent this system, the following variable mapping is used:

$$t \to a, \quad x \to b, \quad y \to c, \quad z \to d$$

Using this mapping, the system is translated to the following DSolve call:

    DSolve[{b'[a]==a, c'[a]==b[a], d'[a]==1}, {b[a],c[a],d[a]}, a]

The Wolfram Engine is called using its command-line interface wolframscript, which takes a function call as an argument and returns the result. Warnings are suppressed to facilitate parsing. The Wolfram Engine represents solutions as a list of rules, which are simply functions on expressions. Many solutions may be returned, but we only use the first one, which is lexed and parsed. The maximal domain of this solution is retrieved in another call to the Wolfram Engine, similar to the SageMath integration.

The solution to the test ODE after lexing and parsing is the following:

    Fun ("List", [Fun ("List", [
      Fun ("Rule", [Fun ("b", [Id "a"]), Fun ("Plus", ...)]),
      Fun ("Rule", [Fun ("c", [Id "a"]), Fun ("Plus", ...)]),
      Fun ("Rule", [Fun ("d", [Id "a"]), Fun ("Plus", ...)])])])

Here, the innermost list gives values for each of the continuous variables. The translation from such a Wolfram expression to an Isabelle term is done by reversing the variable mapping and then traversing the expression tree. There are special cases for the constant $e$, which is translated to the exponential function, and negative powers, similar to the SageMath integration. Solutions may be provided by the Wolfram Engine which use functions not available in Isabelle, such as those containing integrals. These are reported as errors. Finally, the lemma is assembled by combining the domain, solution, and original SODE.

This completes our description of the two CAS integrations. In the next section we evaluate them both.

In this section, we evaluate our approach to certifying SODEs. We consider two test sets, to which we apply both CAS integrations, and then evaluate the results.

Fig. 4. Wolfram SODE Solver Integration in Isabelle/HOL

The first test set is generated programmatically by searching the KeYmaera X example repository for any lines containing fragments of the form {}, which describe a SODE in KeYmaera X. These examples can be found here: https://github.com/LS-Lab/KeYmaera-release/tree/master/examples/hybrid . Any duplicate SODEs are then combined. In total, 148 SODEs were found, of which 79 were duplicates, leaving 69 unique SODEs to add to the test set. To represent these equations, and from now on, $t$ is the independent variable, meaning $(\dot{x}, \dot{y}) = \left(\frac{dx}{dt}, \frac{dy}{dt}\right)$.

The KeYmaera X tests contain many "simple" SODEs. For example, the basic gravity SODE:

$$(\dot{h}, \dot{v}) = (v, -g) \qquad (3)$$

was present in many variations. A few of the test cases also contain more complex SODEs, such as the following example taken from the test set:

    ( ˙q x , ˙q y , ˙f x , ˙f y ) = ( f x Kq x D , f y Kq y D , f xp , f yp )    (4)

We class complex SODEs as those that contain at least one operator, excluding unary minus. All of the test cases could be solved and then certified using ode_cert, apart from five test cases:

    ˙x = x + x                              (5)
    ( ˙x, ˙y ) = ( x − + y , y )            (6)
    ˙x = x − + a                            (7)
    ( ˙x, ˙y ) = ( v, g + dv )              (8)
    ( ˙x, ˙y ) = ( y, − w x − dwy )         (9)

Both SageMath/FriCAS+ and Wolfram could not solve the first three equations. (SageMath/FriCAS+ refers to using SageMath/FriCAS with the preprocessing step.) In KeYmaera X [2] and also Isabelle/HOL [21], such SODEs can be verified using differential induction [13] rather than explicit solutions. Equation 8 was solved correctly only by the Wolfram Engine, but ode_cert was unable to prove this. Equation 9 was solved by both Wolfram and SageMath, but ode_cert was again unable to certify it. In all the cases where ode_cert was unable to prove the result, this was due to a failure to prove a large algebraic proposition containing more than 50 operator applications.

In addition, 7 of the test cases required an assumption to be specified in the ode_solve_thm statement. For example, the SODE $(\dot{x}, \dot{t}) = (c + b(u - x), 1)$ requires the assumption $b > 0$. This means that out of the 20 "complex" test cases, 15 could be automatically solved and verified, and all of the 49 "simple" test cases could be automatically solved and verified.

Table 1. Test cases

    Number  ODE system: ˙x, ( ˙y, ...) = ...   Rationale
    1       x + t                              Inhomogeneous polynomial
    2       tan(t)                             Tangent function
    3       x^2                                Second order polynomial
    4       −y, x                              Trigonometric solution
    5       1/t                                Domain issues at 0
    6       1/(2x − 1)                         Has two solutions
    7       xy, xy
    8       x + y, x                           Homogeneous 2nd order SODE
    9       2x + y + t, x                      Inhomogeneous 2nd order SODE
    10      arcsin(t)                          Inverse trigonometric function
    11      √t                                 Square root
    12      √t                                 Higher roots
    13      t √                                Non-rational powers
    14      x + y, y + 2z, x + 1               Higher dimensional SODE
    15      x − t                              Bessel function
    16      y, e t                             Imaginary error function
    17      sin(x) / ln(x)                     Impossible to solve
    18      ln(t), x                           Logarithmic

Table 2. Results from the SODE tests.

            SageMath                          Wolfram
    Number  Solved   Correct  Proved          Solved   Correct  Proved
            by the   domain   automatically   by the   domain   automatically
            CAS      found    in Isabelle     CAS      found    in Isabelle
    1       ✓        ✓        ✓               ✓        ✓        ✓
    2       ✓        ✗        ✗               ✓        ✗        ✓
    3       ✓        ✓        ✓               ✓        ✓        ✓
    4       ✓        ✓        ✓               ✓        ✓        ✓
    5       ✓        ✓        ✓               ✓        ✓        ✓
    6       ✓        ✗        ✗               ✓        ✗        ✗
    7       ✓        ✓        ✓               ✓        ✓        ✓
    8       ✓        ✗        ✗               ✓        ✗        ✗
    9       ✓        ✗        ✗               ✗        ✗        ✗
    10      ✓        ✗        ✗               ✓        ✗        ✗
    11      ✓        ✗        ✓               ✓        ✗        ✓
    12      ✓        ✗        ✓               ✓        ✗        ✓
    13      ✓        ✗        ✗               ✓        ✗        ✗
    14      ✗        N/A      N/A             ✗        N/A      N/A
    15      ✗        N/A      N/A             ✗        N/A      N/A
    16      ✗        N/A      N/A             ✗        N/A      N/A
    17      ✗        N/A      N/A             ✗        N/A      N/A
    18      ✓        ✓        ✓               ✓        ✓        ✓

The results from our additional SODE test cases are presented in Table 2. In these results, 10 of the 18 test cases could not be automatically proved by Isabelle. There are four distinct reasons behind these failures:

1. The tactic ode_cert cannot automatically prove the stated theorem. In all of the test cases where this occurs, this is due to ode_cert failing to prove an algebraic proposition. This occurs in test case 2 for SageMath's result, and in test cases 8, 9, 10 and 13 for both CAS systems. We have been able to prove test cases 8, 9 and 2 correct manually with the help of the sledgehammer tool [12]. This work can be found here: https://github.com/ThomasHickman/Isabelle-CAS-Integration/blob/master/manually_solved_cases.thy . The goals left to prove in cases 8, 9, 10 and 13 contained more than 50 operators.
2. The CAS system cannot produce the correct answer, but if it did, Isabelle could not prove the answer, as appropriate derivative laws have not been implemented. This occurs in test cases 15 and 16.
3. The CAS system cannot produce the correct answer, and it is theoretically impossible for it to do so. This occurs in test case 17.
4. The CAS system cannot produce the correct answer, but Isabelle does contain the derivative laws to prove the answer, if one was produced. This occurs in test case 14.

Consequently, if we exclude the cases where the CAS system cannot provide a solution, and include those where a proof using sledgehammer was required, the rate of success is 11 out of 14, with 3 uncertifiable solutions.

In this paper, we described our work on integrating Isabelle with SODE solving in SageMath and Wolfram, to support verification of hybrid systems. In §3, we described our tactic ode_cert for the automatic certification of SODE solutions. In §4 and §5, we described the two CAS integrations, and in §6 we evaluated them. Our work builds on HOL-Analysis [8] and HOL-ODE [10,9], which allow certification to be substantially automated.

However, as we have noted, five of our test cases (two from KeYmaera X, and three of our own) produced solutions that could not be certified. This could be due to a lack of proof rules for differentiation and real arithmetic in Isabelle. Alternatively, it could be that the solutions returned by the CAS systems are in reality approximations, as indicated by their size compared to the actual SODE. We plan to investigate this further in the future. Either way, we note that Isabelle places a high bar on the artifacts that are accepted as mathematically sound, which gives confidence that they can be used in safety critical applications.

In future work, we plan to use our integration as part of an Isabelle-based hybrid systems verification tool, using our implementation of dL and related calculi [21,29]. We aim to apply it to a number of example projects, such as the KeYmaera X examples. This may expose areas in the Isabelle/HOL hybrid systems infrastructure that require improvement. In addition, we will investigate the integration of other CAS features into Isabelle. One example of this is quantifier elimination, which would further improve automation.

Acknowledgements. This work is supported by the EPSRC-UKRI Fellowship project CyPhyAssure, grant reference EP/S001190/1.

References

1. Alur, R.: Formal verification of hybrid systems. In: Proc. 9th ACM Intl. Conf. on Embedded Software (EMSOFT), New York, NY, USA, ACM (2011) 273–278
2. Fulton, N., Mitsch, S., Quesel, J.D., Völp, M., Platzer, A.: KeYmaera X: An axiomatic tactical theorem prover for hybrid systems. In Felty, A.P., Middeldorp, A., eds.: CADE. Volume 9195 of LNCS., Springer (2015) 527–538
3. Gleirscher, M., Foster, S., Woodcock, J.: New opportunities for integrated formal methods. ACM Comput. Surv. (6) (2019)
4. Nipkow, T., Wenzel, M., Paulson, L.C.: Isabelle/HOL: A Proof Assistant for Higher-Order Logic. Volume 2283 of LNCS. Springer (2002)
5. Wenzel, M., Wolff, B.: Building formal method tools in the Isabelle/Isar framework. In: TPHOLs. Volume 4732 of LNCS., Springer (2007)
6. Brucker, A., Wolff, B.: Using ontologies in formal developments targeting certification. In: iFM. Volume 11918 of LNCS., Springer (2019) 65–82
7. Foster, S., Nemouchi, Y., O'Halloran, C., Tudor, N., Stephenson, K.: Formal model-based assurance cases in Isabelle/SACM: An autonomous underwater vehicle case study. In: FormaliSE, ACM (2020)
8. Harrison, J.: A HOL theory of Euclidean space. In Hurd, J., Melham, T., eds.: Theorem Proving in Higher Order Logics, 18th International Conference, TPHOLs 2005. Volume 3603 of LNCS., Oxford, UK, Springer (August 2005)
9. Immler, F.: A verified ODE solver and the Lorenz attractor. J. Autom. Reasoning (1) 73–111
10. Immler, F., Hölzl, J.: Numerical analysis of ordinary differential equations in Isabelle/HOL. In Beringer, L., Felty, A., eds.: ITP. Volume 7406 of LNCS., Springer (2012) 377–392
11. Blanchette, J.C., Bulwahn, L., Nipkow, T.: Automatic proof and disproof in Isabelle/HOL. In: FroCoS. Volume 6989 of LNCS., Springer (2011) 12–27
12. Blanchette, J.C., Kaliszyk, C., Paulson, L.C., Urban, J.: Hammering towards QED. Journal of Formalized Reasoning (1) (2016)
13. Platzer, A.: Differential dynamic logic for hybrid systems. J. Autom. Reasoning (2) (2008) 143–189
14. Mitsch, S., Ghorbal, K., Vogelbacher, D., Platzer, A.: Formal verification of obstacle avoidance and navigation of ground robots. The International Journal of Robotics Research (12) (2017) 1312–1340
15. Jeannin, J.B., Ghorbal, K., Kouskoulas, Y., Schmidt, A., Gardner, R., Mitsch, S., Platzer, A.: A formally verified hybrid system for safe advisories in the next-generation airborne collision avoidance system. Software Tools for Technology Transfer (6) 717–741
16. Tuong, F., Wolff, B.: Deeply integrating C11 code support into Isabelle/PIDE. In: F-IDE. Volume 310 of EPTCS. (2019) 13–28
17. Bohrer, B., Tan, Y.K., Mitsch, S., Myreen, M.O., Platzer, A.: VeriPhy: Verified controller executables from verified cyber-physical system models. SIGPLAN Not. (4) (2018) 617–630
18. Mitsch, S., Platzer, A.: ModelPlex: Verified runtime validation of verified cyber-physical system models. Form. Methods Syst. Des. (1) (2016) 33–74. Special issue of selected papers from RV'14.
19. Bohrer, B., Rahli, V., Vukotic, I., Platzer, A.: Formally verified differential dynamic logic. In Bertot, Y., Vafeiadis, V., eds.: Proc. 6th ACM SIGPLAN Conf. on Certified Programs and Proofs (CPP), ACM (2017) 208–221
20. Munive, J.H., Struth, G.: Verifying hybrid systems with modal Kleene algebra. In: RAMiCS. Volume 11194 of LNCS., Springer (2018)
21. Munive, J.H., Struth, G., Foster, S.: Differential Hoare logics and refinement calculi for hybrid systems with Isabelle/HOL. In: RAMiCS. Volume 12062 of LNCS., Springer (April 2020)
22. Foster, S., Baxter, J., Cavalcanti, A., Woodcock, J., Zeyda, F.: Unifying semantic foundations for automated verification tools in Isabelle/UTP. Science of Computer Programming (October 2020)
23. Li, W., Passmore, G., Paulson, L.: Deciding univariate polynomial problems using untrusted certificates in Isabelle/HOL. J. Autom. Reasoning (2019) 29–91
24. The Sage Developers: SageMath, the Sage Mathematics Software System (Version 9.0). (2020)
25. Maxima: Maxima, a computer algebra system. Version 5.34.1 (2014) Available at http://maxima.sourceforge.net/.
26. FriCAS team: FriCAS, an advanced computer algebra system (2019) Available at http://fricas.sf.net.
27. Meurer, A., Smith, C.P., Paprocki, M., Čertík, O., Kirpichev, S.B., Rocklin, M., Kumar, A., Ivanov, S., Moore, J.K., Singh, S., Rathnayake, T., Vig, S., Granger, B.E., Muller, R.P., Bonazzi, F., Gupta, H., Vats, S., Johansson, F., Pedregosa, F., Curry, M.J., Terrel, A.R., Roučka, Š., Saboo, A., Fernando, I., Kulal, S., Cimrman, R., Scopatz, A.: SymPy: symbolic computing in Python. PeerJ Computer Science (January 2017) e103
28. Wolfram Research, Inc.: Wolfram language documentation. Available at https://reference.wolfram.com