FunGrim: a symbolic library for special functions

Abstract

We present the Mathematical Functions Grimoire (FunGrim), a website and database of formulas and theorems for special functions. We also discuss the symbolic computation library used as the backend and main development tool for FunGrim, and the Grim formula language used in these projects to represent mathematical content semantically.


arXiv preprint [cs.MS]

Fredrik Johansson, LFANT, Inria Bordeaux, Talence, France. http://fredrikj.net


Keywords:

Special functions · Symbolic computation · Mathematical databases · Semantic mathematical markup

The Mathematical Functions Grimoire (FunGrim, http://fungrim.org/) is an open source library of formulas, theorems and data for mathematical functions. It currently contains around 2600 entries. As one example entry, the modular transformation law of the Eisenstein series $G_{2k}$ on the upper half-plane $\mathbb{H}$ is given in http://fungrim.org/entry/0b5b04/ as follows:

$$G_{2k}\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^{2k} G_{2k}(\tau)$$

Assumptions: $k \in \mathbb{Z}_{\ge 2}$ and $\tau \in \mathbb{H}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \operatorname{SL}_2(\mathbb{Z})$

FunGrim stores entries as symbolic expressions with metadata, in this case:

    Entry(ID("0b5b04"),
        Formula(Equal(EisensteinG(2*k, (a*tau+b)/(c*tau+d)),
            (c*tau+d)**(2*k) * EisensteinG(2*k, tau))),
        Variables(k, tau, a, b, c, d),
        Assumptions(And(Element(k, ZZGreaterEqual(2)), Element(tau, HH),
            Element(Matrix2x2(a, b, c, d), SL2Z))))
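As a plausibility check of the transformation law above, the following minimal Python sketch compares both sides numerically. It is not part of FunGrim or Pygrim: the helper `eisenstein_g`, the truncation bound and the sample values are assumptions made for illustration. The series is taken to be the lattice sum of $(m + n\tau)^{-2k}$ over $(m, n) \ne (0, 0)$, which may differ from other normalizations by a constant factor that does not affect the identity.

```python
def eisenstein_g(weight, tau, bound=60):
    """Truncated lattice sum of (m + n*tau)**(-weight) over (m, n) != (0, 0),
    restricted to |m|, |n| <= bound (hypothetical helper, not a Pygrim function)."""
    total = 0j
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if m == 0 and n == 0:
                continue
            total += (m + n * tau) ** (-weight)
    return total

# Sample values: k >= 2, tau in the upper half-plane, integer matrix with det 1.
k = 4
tau = 0.3 + 1.2j
a, b, c, d = 2, 1, 1, 1
assert a * d - b * c == 1

lhs = eisenstein_g(2 * k, (a * tau + b) / (c * tau + d))
rhs = (c * tau + d) ** (2 * k) * eisenstein_g(2 * k, tau)

# The two sides agree up to truncation and rounding error.
relative_error = abs(lhs - rhs) / abs(rhs)
print(relative_error)
```

A small relative error supports the formula at this one sample point only; it is not a proof and says nothing about points where the assumptions fail.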

Formulas are fully quantified (assumptions give conditions for the free variables such that the formula is valid) and context-free (symbols have a globally consistent meaning), giving precise statements of mathematical theorems. The metadata may also include bibliographical references. Being easily computer-readable, the database may be used for automatic term rewriting in symbolic algorithms. This short paper discusses the semantic representation of mathematics in FunGrim and the underlying software. (A grimoire is a book of magic formulas.)

FunGrim is in part a software project and in part a reference work for mathematical functions in the tradition of Abramowitz and Stegun [1] but with updated content and a modern interface. There are many such efforts, notably the NIST Digital Library of Mathematical Functions (DLMF) [4] and the Wolfram Functions Site (WFS) [10], which have two rather different approaches:

– DLMF uses LaTeX together with prose for its content. Since many formulas depend on implicit context and LaTeX is presentation-oriented rather than semantic (although DLMF adds semantic extensions to LaTeX to alleviate this problem), the content is not fully computer-readable and can also sometimes be ambiguous to human readers. DLMF is edited for conciseness, giving an overview of the main concepts and omitting in-depth content.
– WFS represents the content as context-free symbolic expressions written in the Wolfram Language. The formulas can be parsed by Mathematica, whose evaluation semantics provide concrete meaning. Most formulas are computer-generated, sometimes exhaustively (for example, WFS lists tens of thousands of transformations between elementary functions and around 200,000 formulas for special cases of hypergeometric functions).

FunGrim uses a similar approach to that of WFS, but does not depend on the proprietary Wolfram technology.
Indeed, one of the central reasons for starting FunGrim is that neither DLMF nor WFS is open source (though both are freely accessible). Another central idea behind FunGrim is to provide even stronger semantic guarantees; this aspect is discussed in a later section.

Part of the motivation is also to offer complementary content: in the author's experience, the DLMF and WFS are strong in some areas and weak in others. For example, both have minimal coverage of some important functions of number theory and they cover inequalities far less extensively than equalities. At this time, FunGrim has perhaps 10% of the content needed for a good general reference on special functions, but as a proof of concept, it has detailed content for some previously neglected topics. The reader may compare the following:

– http://fungrim.org/topic/Modular_lambda_function/ versus http://functions.wolfram.com/EllipticFunctions/ModularLambda/ versus formulas for $\lambda(\tau)$ in https://dlmf.nist.gov/23.15 + https://dlmf.nist.gov/23.17.
– http://fungrim.org/topic/Barnes_G-function/ versus https://dlmf.nist.gov/5.17. (The Barnes G-function is not covered in WFS.)

Most FunGrim content is hand-written so far; adding computer-generated entries in the same fashion as WFS is a future possibility.

We mention three other related projects:

– FunGrim shares many goals with the NIST Digital Repository of Mathematical Formulas (DRMF) [3], a companion project to the DLMF. We will not attempt to compare the projects in depth since DRMF is not fully developed, but we mention one important difference: DRMF represents formulas using a semantic form of LaTeX which is hard to translate perfectly to symbolic expressions, whereas FunGrim (like WFS) uses symbolic expressions as the source representation and generates LaTeX automatically for presentation.
– The Dynamic Dictionary of Mathematical Functions (DDMF) [2] generates information about mathematical functions algorithmically, starting ab initio only from the defining differential equation of each function. This has many advantages: it enables a high degree of reliability (human error is removed from the equation, so to speak), the presentation is uniform, and it is easy to add new functions. The downside is that the approach is limited to a restricted class of properties for a restricted class of functions.
– The LMFDB [7] is a large database of L-functions, modular forms, and related objects. The content largely consists of data tables and does not include “free-form” symbolic formulas and theorems.

Grim is the symbolic mathematical language used in FunGrim. Grim is designed to be easy to write and parse and to be embeddable within a host programming language such as Python, Julia or JavaScript using the host language's native syntax (similar to SymPy [8]). The reference implementation is Pygrim, a Python library which implements Grim-to-LaTeX conversion and symbolic evaluation of Grim expressions. Formulas are converted to HTML using KaTeX for display on the FunGrim website; Pygrim also provides hooks to show Grim expressions as LaTeX-rendered formulas in Jupyter notebooks. The FunGrim database itself is currently part of the Pygrim source code.

Grim has a minimal core language, similar to Lisp S-expressions and Wolfram Language M-expressions. The only data structure is an expression tree composed of function calls f(x, y, ...) and atoms (integer literals, string literals, alphanumerical symbol names). For example, Mul(2, Add(a, b)) represents $2(a + b)$. For convenience, Pygrim uses operator overloading in Python so that the same expression may be written more simply as 2*(a+b).

On top of the core language, Grim provides a vocabulary of hundreds of builtin symbols (For, Exists, Matrix, Sin, Integral, etc.) for variable-binding, logical operations, structures, mathematical functions, calculus operations, etc. The following dummy formula is a more elaborate example:
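To make the idea of an expression tree with operator overloading concrete, here is a minimal Python sketch. It is a toy written for this illustration and is not Pygrim's implementation: the class `Expr` and the helper `to_expr` are hypothetical names, and only addition and multiplication are overloaded.

```python
class Expr:
    """An atom (symbol name or integer) or a function call head(arg1, ...)."""

    def __init__(self, head, args=None):
        self.head = head    # symbol name or integer literal
        self.args = args    # None for atoms, tuple of Expr for function calls

    def __call__(self, *args):
        # Calling a symbol builds a function-call node; nothing is evaluated.
        return Expr(self.head, tuple(to_expr(arg) for arg in args))

    def __add__(self, other):
        return Add(self, other)

    def __radd__(self, other):
        return Add(other, self)

    def __mul__(self, other):
        return Mul(self, other)

    def __rmul__(self, other):
        return Mul(other, self)

    def __repr__(self):
        if self.args is None:
            return str(self.head)
        return f"{self.head}({', '.join(map(repr, self.args))})"


def to_expr(value):
    """Wrap plain Python values (such as integers) as atoms."""
    return value if isinstance(value, Expr) else Expr(value)


Add, Mul = Expr("Add"), Expr("Mul")
a, b = Expr("a"), Expr("b")

tree = 2 * (b + a)
print(tree)  # Mul(2, Add(b, a))
```

The tree keeps the operands in the order in which they were written, and no expansion or sorting takes place when it is built.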

    Where(Sum(1/f(n), For(n, -N, N), NotEqual(n, 0)),
        Def(f(n), Cases(Tuple(n**2, CongruentMod(n, 0, 3)), Tuple(1, Otherwise))))

$$\sum_{\substack{n=-N \\ n \ne 0}}^{N} \frac{1}{f(n)} \quad \text{where } f(n) = \begin{cases} n^2, & n \equiv 0 \pmod{3} \\ 1, & \text{otherwise} \end{cases}$$

Documentation of the Grim language is available at http://fungrim.org/grim/. Pygrim is currently in early development and does not have an official release. The source code is publicly available at https://github.com/fredrik-johansson/fungrim.
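To show what the dummy formula above denotes, the following short sketch evaluates it directly in plain Python with exact rational arithmetic. It does not use Pygrim; the function names `f` and `dummy_sum` are chosen here to mirror the formula.

```python
from fractions import Fraction


def f(n):
    # Cases(Tuple(n**2, CongruentMod(n, 0, 3)), Tuple(1, Otherwise))
    return n ** 2 if n % 3 == 0 else 1


def dummy_sum(N):
    # Sum(1/f(n), For(n, -N, N), NotEqual(n, 0))
    return sum(Fraction(1, f(n)) for n in range(-N, N + 1) if n != 0)


print(dummy_sum(3))  # 38/9
```

For $N = 3$ the terms with $n = \pm 1, \pm 2$ contribute 1 each and the terms with $n = \pm 3$ contribute $1/9$ each, giving $4 + 2/9 = 38/9$.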

Grim can be used both as a mathematical markup language and as a simple functional programming language. Its design is deliberately constrained:

– Grim is not intended to be a typesetting language: the Grim-to-LaTeX converter takes care of most presentation details automatically. (The results are not always perfect, and Grim does allow including typesetting hints where the default rendering is inadequate.)
– Grim is not intended to be a general-purpose programming language. Unlike full-blown Lisp-like programming languages, Grim is not meant to be used to manipulate symbolic expressions from within, and it lacks concrete data structures for programming, being mainly concerned with representing immutable mathematical objects. Grim is rather meant to be embedded in a host programming language where the host language can be used to traverse expression trees or implement complex algorithms.

Grim formulas entered in Pygrim are preserved verbatim until explicitly evaluated. This contrasts with most computer algebra systems, which automatically convert expressions to “canonical” form. For example, SymPy automatically rewrites $2(b + a)$ as $2a + 2b$ (distributing the numerical coefficient and sorting the terms). SymPy's behavior can be overridden with a special “hold” command, but this can be a hassle to use and might not be recognized by all functions.

FunGrim and the Grim language have the following fundamental semantic rules:

– Every mathematical object or operator must have an unambiguous interpretation, which cannot vary with context. In principle, every syntactically valid constant expression should represent a definitive mathematical object (possibly the special object Undefined when a function is evaluated outside its domain of definition). This means, for example, that multivalued functions have fixed branch cuts (analytic continuation must be expressed explicitly), and removable singularities do not cancel automatically.
Many symbols which have an overloaded meaning in standard mathematical notation require disambiguation; for example, Grim provides separate SequenceLimit, RealLimit and ComplexLimit operators to express $\lim_{x \to c} f(x)$, depending on whether the set of approach is meant as $\mathbb{Z}$, $\mathbb{R}$ or $\mathbb{C}$.
– The standard logical and set operators ($=$ and $\in$, etc.) compare identity of mathematical objects, not equivalence under morphisms. The mathematical universe is constructed to have few, orthogonal “types”: for example, the integer 1 and the complex number 1 are the same object, with $\mathbb{Z} \subset \mathbb{C}$.
– Symbolic evaluation (rewriting an expression as a simpler expression, e.g. $2 + 2 \to 4$) must preserve the exact value of the input expression. Formulas containing free variables are implicitly quantified over the whole universe unless explicit assumptions are provided, and may only be rewritten in ways that preserve the value for all admissible values of the free variables. For example, $yx \to xy$ is not a valid rewrite operation a priori since the universe contains noncommutative objects such as matrices, but it is valid when quantified with assumptions that make $x$ and $y$ commute, e.g. $x, y \in \mathbb{C}$.

These semantics are stronger than in most symbolic computing environments. Computer algebra systems traditionally ignore “exceptional cases” when rewriting expressions. For example, many computer algebra systems automatically simplify $x/x$ to 1, ignoring the exceptional case $x = 0$ where a division by zero occurs. (The simplification is valid if $x$ is viewed as a formal indeterminate generating $\mathbb{C}[x]$ rather than a free variable representing a complex number. The point remains that some computer algebra systems overload variables to serve both purposes, and this ambiguity is a frequent source of bugs. In Grim, the distinction is explicit.) A more extreme example is to blindly simplify $\sqrt{x^2} \to x$ (invalid for negative numbers), and more generally to ignore branch cuts or complex values.

Indeed, one section of the Wolfram Mathematica documentation helpfully warns users: “The answer might not be valid for certain exceptional values of the parameters.” As a concrete illustration, we can use Mathematica to “prove” that $e = 2$ by evaluating the hypergeometric function ${}_1F_1(a, b, 1)$ at $a = b = -1$:

– ${}_1F_1(a, b, 1) \to [a = b] \to e \to [b = -1] \to e$
– ${}_1F_1(a, b, 1) \to [a = -1] \to 1 - \frac{1}{b} \to [b = -1] \to 2$

The contradiction arises because different rules are applied for the ${}_1F_1$ function, and the rules are inconsistent with each other in the exceptional case $a = b \in \mathbb{Z}_{\le 0}$. (SymPy has the same issue. In WFS, corresponding contradictory formulas are http://functions.wolfram.com/07.20.03.0002.01 and http://functions.wolfram.com/07.20.03.0118.01.)

Our aspiration for the Grim formula language and the FunGrim database is to make such contradictions impossible through strong semantics and pedantic use of assumptions. This should aid human understanding (a user can inspect the source code of a formula and look up the definitions of the symbols) and help support symbolic computation, automated testing, and possibly formal theorem-proving efforts. Perfect consistency is particularly important for working with multivariate functions, where corner cases can be extremely difficult to spot.

In reality, eliminating inconsistencies is an asymptotic goal: there are certainly present and future mathematical errors in the FunGrim database and bugs in the Pygrim reference implementation. We believe that such errors can be minimized through randomized testing (ideally combined with formal verification in the future, where such methods are applicable).

Pygrim has rudimentary support for evaluating and simplifying Grim expressions. It is able to perform basic logical and arithmetic operations, expand special cases of mathematical functions, perform simple domain inferences, partially simplify symbolic arithmetic expressions, evaluate and compare algebraic numbers using an exact implementation of $\overline{\mathbb{Q}}$ arithmetic, and compare real or complex numbers using Arb enclosures [5] (only comparisons of unequal numbers can be decided in this way; equal numbers have overlapping enclosures and can only be compared conclusively when an algebraic or symbolic simplification is possible). Calling the .eval() method in Pygrim returns an evaluated expression:

    >>> Element(Pi, SetMinus(OpenInterval(3, 4), QQ)).eval()
    True_
    >>> Zeros(x**5 - x**4 - 4*x**3 + 4*x**2 + 2*x - 2,
    ...     ForElement(x, CC), Greater(Re(x), 0)).eval()
    ...
    Set(Sqrt(Add(2, Sqrt(2))), 1, Sqrt(Sub(2, Sqrt(2))))
    >>> ((DedekindEta(1 + Sqrt(-1)) / Gamma(Div(5, 4))) ** 12).eval()
    Div(-4096, Pow(Pi, 9))

To simplify formulas involving free variables, the user needs to supply sufficient assumptions:

    >>> (x / x).eval()
    Div(x, x)
    >>> (x / x).eval(assumptions=Element(x, CC))
    Div(x, x)
    >>> (x / x).eval(assumptions=And(Element(x, CC), NotEqual(x, 0)))
    1
    >>> Sin(Pi * n).eval()
    Sin(Mul(Pi, n))
    >>> Sin(Pi * n).eval(assumptions=Element(n, ZZ))
    0
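The reason assumptions must exclude exceptional parameter values can be seen numerically in the confluent hypergeometric function at $a = b = -1$, where the value obtained depends on how the point is approached. The following plain-Python sketch (no Pygrim or Arb) sums the standard series $\sum_n \frac{(a)_n}{(b)_n} \frac{z^n}{n!}$; the helper name `hyp1f1`, the number of terms and the offset `eps` are assumptions made for this illustration.

```python
import math


def hyp1f1(a, b, z, terms=60):
    """Partial sum of the 1F1 series; b must not be a nonpositive integer."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: (a + n) / (b + n) * z / (n + 1).
        term *= (a + n) / (b + n) * z / (n + 1)
    return total


eps = 1e-6

# Approach (a, b) = (-1, -1) along the diagonal a = b: the value stays at e.
along_diagonal = hyp1f1(-1 + eps, -1 + eps, 1.0)

# Fix a = -1 first, then let b approach -1: the series is 1 - 1/b, close to 2.
a_fixed_first = hyp1f1(-1.0, -1 + eps, 1.0)

print(along_diagonal, a_fixed_first)
```

Both results are correct for the parameters actually supplied; the inconsistency only appears when two rules, each valid away from the exceptional point, are both applied at that point.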

In some cases, Pygrim can output conditional expressions: for example, the evaluation ${}_2F_1(1, 1, 2, x) = -\log(1 - x)/x$ is made with an explicit case distinction for the removable singularity at $x = 0$ (the singularity at $x = 1$ is consistent with $\log(0) = -\infty$ and does not require a case distinction).

    >>> f = Hypergeometric2F1(1, 1, 2, x); f.eval()
    Hypergeometric2F1(1, 1, 2, x)

Pygrim is not a complete computer algebra system; its features are tailored to developing FunGrim and exploring special function identities. Users may also find it interesting as a symbolic interface to Arb (the .n() method returns an arbitrary-precision enclosure of a constant expression).
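The case distinction for ${}_2F_1(1, 1, 2, x)$ mentioned above can be illustrated with ordinary floating-point code. This is a sketch for real $x < 1$ only and is unrelated to how Pygrim represents conditional expressions; both function names are hypothetical. The closed form is compared against the series $\sum_{n \ge 0} x^n/(n+1)$, which follows from the hypergeometric series because $(1)_n (1)_n / ((2)_n \, n!) = 1/(n+1)$.

```python
import math


def hyp2f1_1_1_2(x):
    """Closed form of 2F1(1, 1, 2, x) for real x < 1, with an explicit case
    for the removable singularity at x = 0."""
    if x == 0:
        return 1.0
    return -math.log1p(-x) / x


def hyp2f1_1_1_2_series(x, terms=200):
    """Partial sum of sum_n x**n / (n + 1), which converges for |x| < 1."""
    return sum(x ** n / (n + 1) for n in range(terms))


for x in (0.5, -0.3, 1e-9, 0.0):
    print(x, hyp2f1_1_1_2(x), hyp2f1_1_1_2_series(x))
```

Without the explicit branch, the expression `-math.log1p(-x) / x` would raise `ZeroDivisionError` at `x = 0`, even though the function has the well-defined value 1 there.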

To test a formula $P(x_1, \ldots, x_n)$ with free variables $x_1, \ldots, x_n$ and corresponding assumptions $Q(x_1, \ldots, x_n)$, we generate pseudorandom values $x_1, \ldots, x_n$ satisfying $Q(x_1, \ldots, x_n)$, and for each such assignment we evaluate the constant expression $P(x_1, \ldots, x_n)$. If $P$ evaluates to False, the test fails (a counterexample has been found). If $P$ evaluates to True or cannot be simplified to True/False (the truth value is unknown), the test instance passes.

As an example, we test $P(x) = [\sqrt{x^2} = x]$ with assumptions $Q(x) = [x \in \mathbb{R}]$:

    >>> formula = Equal(Sqrt(x**2), x)
    >>> formula.test(variables=[x], assumptions=Element(x, RR))
    {x: 0} ... True
    {x: Div(1, 2)} ... True
    {x: Sqrt(2)} ... True
    {x: Pi} ... True
    {x: 1} ... True
    {x: Neg(Div(1, 2))} ... False

The test passes for $x = 0, \tfrac{1}{2}, \sqrt{2}, \pi, 1$, but $x = -\tfrac{1}{2}$ is a counterexample. With correct assumptions $x \in \mathbb{C} \wedge (\operatorname{Re}(x) > 0 \vee (\operatorname{Re}(x) = 0 \wedge \operatorname{Im}(x) > 0))$, the test passes:

    >>> formula.test(variables=[x], assumptions=And(Element(x, CC),
    ...     Or(Greater(Re(x), 0), And(Equal(Re(x), 0), Greater(Im(x), 0)))))
    ...
    Passed 77 instances (77 True, 14 Unknown, 0 False)

It currently takes two CPU hours to test the FunGrim database with up to 100 test instances (assignments $x_1, \ldots, x_n$ that satisfy the assumptions) per entry. We estimate that around 75% of the entries are effectively testable. For the other 25%, either the symbolic evaluation code in Pygrim is not powerful enough to generate any admissible values (for which $Q$ is provably True), or $P$ contains constructs for which Pygrim does not yet support symbolic or numerical evaluation. For 30% of the entries, Pygrim is able to symbolically simplify $P$ to True in at least one test instance (in the majority of cases, it is only able to check consistency via Arb). We aim to improve all these statistics in the future.

The test strategy is effective: the first run to test the FunGrim database found errors in 24 out of 2618 entries. Of these, 4 were mathematically wrong formulas (for example, a Bernoulli number inequality) and 6 had incorrect assumptions (for example, the Lambert W-function identity $W(x \log(x)) = \log(x)$ was given with assumptions $x \in [-e^{-1}, \infty)$ instead of the correct $x \in [e^{-1}, \infty)$); the remaining errors were due to incorrect metadata or improperly constructed symbolic expressions.

A similar number of additional errors were found and corrected after improving Pygrim's evaluation code further. An error rate near 2% seems plausible for untested formulas entered by hand (by this author!). We did not specifically search for errors in the literature used as reference material for FunGrim; however, many corrections were naturally made when the entries were first added, prior to the development of the test framework.
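The testing loop described above can be sketched in a few lines of Python. This is a floating-point stand-in written for illustration, not Pygrim's test framework: the function names, the tolerance and the sampling ranges are assumptions, and the second sampler only covers the part of the corrected assumptions with $\operatorname{Re}(x) > 0$. Since numerical agreement cannot prove equality, the predicate returns either False (a counterexample) or None (unknown), never True.

```python
import cmath
import random


def sqrt_square_equals_x(x, tol=1e-9):
    """Return False if Sqrt(x**2) and x clearly differ, otherwise None (unknown)."""
    difference = abs(cmath.sqrt(x * x) - x)
    return False if difference > tol * max(1.0, abs(x)) else None


def run_test(predicate, sampler, instances=100, seed=0):
    """Evaluate the predicate on pseudorandom admissible values and tally results."""
    rng = random.Random(seed)
    counts = {True: 0, None: 0, False: 0}
    for _ in range(instances):
        counts[predicate(sampler(rng))] += 1
    return counts


def sample_real(rng):
    # Assumption Q(x) = [x in RR]
    return complex(rng.uniform(-5, 5), 0.0)


def sample_right_half_plane(rng):
    # Assumption Re(x) > 0 (a subset of the corrected assumptions)
    return complex(rng.uniform(0.01, 5), rng.uniform(-5, 5))


counts_real = run_test(sqrt_square_equals_x, sample_real)
counts_half_plane = run_test(sqrt_square_equals_x, sample_right_half_plane)
print(counts_real)
print(counts_half_plane)
```

With real samples, every negative value is reported as a counterexample; with samples from the right half-plane, no counterexample appears and every instance is counted as unknown.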

The FunGrim database can be used for term rewriting, most easily by applying a specific entry as a rewrite rule. For example, FunGrim entry ad6c1c is the trigonometric identity $\sin(a) \sin(b) = \tfrac{1}{2}(\cos(a - b) - \cos(a + b))$:

    >>> (Sin(2) * Sin(Sqrt(2))).rewrite_fungrim("ad6c1c")
    Div(Sub(Cos(Sub(2, Sqrt(2))), Cos(Add(2, Sqrt(2)))), 2)

This depends on pattern matching. To ensure correctness, a match is only made if parameters in the input expression satisfy the assumptions for free variables listed in the FunGrim entry. The pattern matching is currently implemented naively and will fail to match expressions that are mathematically equivalent but structurally different (better implementations are possible [6]).

A rather interesting idea is to search the whole database automatically for rules to apply to simplify a given formula. We have used this successfully on toy examples, but much more work is needed to develop a useful general-purpose simplification engine; this would require stronger pattern matching as well as heuristics for applying sequences of rewrite rules. Rewriting using a database is perhaps most likely to be successful for specific tasks and in combination with advanced hand-written search heuristics (or heuristics generated via machine learning). A prominent example of the hand-written approach is Rubi [9], which uses a decision tree of thousands of rewrite rules to simplify indefinite integrals.
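A naive structural matcher of the kind described above can be sketched as follows. This is a toy model, not the `rewrite_fungrim` implementation: expressions are nested tuples of the form `(head, arg1, ...)`, all function names are hypothetical, and the assumption $a, b \in \mathbb{C}$ is approximated by requiring that the matched subexpressions contain no free symbols.

```python
def match(pattern, expr, variables, bindings):
    """Structurally match expr against pattern, recording variable bindings."""
    if isinstance(pattern, str) and pattern in variables:
        if pattern in bindings:
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    if isinstance(pattern, tuple):
        return (isinstance(expr, tuple) and len(expr) == len(pattern)
                and all(match(p, e, variables, bindings)
                        for p, e in zip(pattern, expr)))
    return pattern == expr


def substitute(template, bindings):
    """Replace pattern variables in the template by their bound expressions."""
    if isinstance(template, tuple):
        return tuple(substitute(part, bindings) for part in template)
    if isinstance(template, str):
        return bindings.get(template, template)
    return template


def is_constant(expr):
    """Toy stand-in for an assumption check: no free symbols remain."""
    if isinstance(expr, tuple):
        return all(is_constant(arg) for arg in expr[1:])
    return isinstance(expr, (int, float, complex))


def rewrite(expr, pattern, replacement, variables):
    """Apply the rule once at the top level if it matches and the check passes."""
    bindings = {}
    if match(pattern, expr, variables, bindings) and all(
            is_constant(value) for value in bindings.values()):
        return substitute(replacement, bindings)
    return expr


# The product-to-sum identity used in the example above, written as a rule.
PATTERN = ("Mul", ("Sin", "a"), ("Sin", "b"))
REPLACEMENT = ("Div", ("Sub", ("Cos", ("Sub", "a", "b")),
                       ("Cos", ("Add", "a", "b"))), 2)

expr = ("Mul", ("Sin", 2), ("Sin", ("Sqrt", 2)))
result = rewrite(expr, PATTERN, REPLACEMENT, {"a", "b"})
print(result)
```

An input such as `("Mul", ("Sin", 2), ("Mul", 1, ("Sin", 3)))` is mathematically of the same form but is returned unchanged, which is the limitation of purely structural matching noted in the text.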

References

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York (1964)
2. Benoit, A., Chyzak, F., Darrasse, A., Gerhold, S., Mezzarobba, M., Salvy, B.: The dynamic dictionary of mathematical functions (DDMF). In: International Congress on Mathematical Software. pp. 35–41. Springer (2010). https://doi.org/10.1007/978-3-642-14128-7_2
3. Cohl, H.S., McClain, M.A., Saunders, B.V., Schubotz, M., Williams, J.C.: Digital repository of mathematical formulae. In: Intelligent Computer Mathematics, pp. 419–422. Springer (2014). https://doi.org/10.1007/978-3-319-08434-3_30
4.