Compilation of mathematical expressions in Kotlin
Iaroslav Postovalov
JetBrains Research (Nuclear Physics Methods group)
Novosibirsk, Russia
ORCID: [email protected]
Abstract: Interpreting mathematical expressions at runtime is a standard task in scientific software engineering. Approaches to this problem range from an embedded domain-specific language (eDSL) with its own parser and interpreter built specifically for the task to a full-fledged embedded compiler. This article is dedicated to a middle-ground solution implemented in the KMath library, which uses the Kotlin object builder DSL and its own algebraic abstractions to generate an AST for mathematical operations. This AST is then compiled just-in-time to JVM bytecode. A similar approach is tested on other Kotlin platforms, and its performance is compared across the supported platforms; results are reported for the JVM and JavaScript.
Index Terms: Dynamic compiler, Software libraries, Software performance
I. Introduction
A common task in scientific software development is the dynamic interpretation of mathematical expressions, where an expression, defined either as source code or as an external string, must be evaluated at runtime.

There are two broad approaches to dynamic execution: pure interpretation is more portable but typically slower, and dynamic compilation (i.e., JIT [1], or just-in-time compilation) is often faster but more complex. As the goal of this research is to provide a universal and high-performance dynamic execution engine, pure interpretation does not satisfy our requirements, and JIT compilation is too complex, requiring many tradeoffs between the level of optimization and compiler performance.

Various runtimes offer competitive performance and provide a mechanism to load an intermediate representation (IR) dynamically, among them the well-known Java Virtual Machine (JVM). In addition to the Java language itself, the JVM supports a variety of other languages, including a modern, expressive language called Kotlin [2].

A reasonable question to ask is: why implement a custom compiler infrastructure instead of using an existing solution? For example, Kotlin provides a scripting engine for compiling expressions on the fly. However, such general-purpose compilers are limited by their size and complexity (e.g., kotlin-compiler-embeddable is around 40 MiB), and compilation is usually slow.

The alternative described in this paper is implemented in a library called KMath [3], [4], a Kotlin-based mathematical library that implements a generic set of elements and abstract algebraic operations over them.

The approach taken by KMath offers a number of benefits for defining and evaluating mathematical expressions. In particular, users can easily implement expressions on double-precision floating-point numbers and define custom operators on all supported algebraic structures.
II. KMath Principles
Mathematical operations in KMath are separated from mathematical objects. To perform an operation, e.g. +, one needs two generic objects of type T and a polymorphic algebraic context parameterized by T, e.g. Space<T>.
• KMath supports binding to external libraries that may be used interchangeably.
• The context (called Algebra or algebraic context) could store information required to provide additional runtime guarantees. For example, it is possible to guarantee that only a specific shape of n-dimensional arrays is valid in a given context or lexical scope [5].
Mathematical contexts have the following hierarchy:
Field <: Ring <: Space <: Algebra
These interfaces loosely follow the standard mathematical definitions designated by their names:
1) Space defines +, an additive identity (i.e., 0), and ×.
2) Ring adds a multiplicative identity (i.e., 1).
3) Field defines a multiplicative inverse, i.e. ÷.
A typical implementation of Field is RealField, which works on doubles, and a typical implementation of Space is VectorSpace.
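The separation of operations from objects can be illustrated with a minimal sketch. The interfaces below are simplified, hypothetical stand-ins for the hierarchy described in this section; they are not the KMath declarations, and the names scale, RealFieldSketch, and squarePlusOne are assumptions made for illustration only.

```kotlin
// Simplified, hypothetical stand-ins for the context hierarchy described above.
// They are NOT the KMath declarations; names and signatures are assumptions.
interface Algebra<T>

interface Space<T> : Algebra<T> {
    val zero: T                          // additive identity
    fun add(a: T, b: T): T
    fun scale(a: T, k: Double): T        // multiplication by a number

    operator fun T.plus(other: T): T = add(this, other)
}

interface Ring<T> : Space<T> {
    val one: T                           // multiplicative identity
    fun multiply(a: T, b: T): T

    operator fun T.times(other: T): T = multiply(this, other)
}

interface Field<T> : Ring<T> {
    fun divide(a: T, b: T): T

    operator fun T.div(other: T): T = divide(this, other)
}

// A context over doubles, playing the role the text assigns to RealField.
object RealFieldSketch : Field<Double> {
    override val zero: Double = 0.0
    override val one: Double = 1.0
    override fun add(a: Double, b: Double): Double = a + b
    override fun scale(a: Double, k: Double): Double = a * k
    override fun multiply(a: Double, b: Double): Double = a * b
    override fun divide(a: Double, b: Double): Double = a / b
}

// A generic expression that relies only on the Field contract: x * x + 1.
fun <T> Field<T>.squarePlusOne(x: T): T = x * x + one

fun main() {
    println(RealFieldSketch.squarePlusOne(3.0)) // 10.0
}
```

The generic function never mentions Double; the operators are resolved through the context it is called on, so the same expression could be evaluated in any other Field implementation.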
The KMath abstract algebra offers additional benefits for expression compilation. One can create a generic expression that uses only the operations provided by an Algebra contract, and then use a specific Algebra instance to perform the operations and compute the value of the expression. Still, enabling dynamic expression compilation requires some additional work.

The first major change to the KMath core API was the addition of dynamic operation dispatching to the primary marker interface
Algebra
Algebra
Expression are the so-called functional expressions, which are organized as a tree of Expression objects, which we describe below.
IV. The MST Structure
The MST (Mathematical Syntax Tree) is a primitive abstract syntax tree representing a language of mathematical expressions. Loosely, it follows this grammar:

⟨terminal⟩ ::= ⟨symbol⟩ | ⟨number⟩
⟨unaryEx⟩ ::= ⟨unaryOp⟩ ⟨mst⟩
⟨binaryEx⟩ ::= ⟨mst⟩ ⟨binaryOp⟩ ⟨mst⟩
⟨mst⟩ ::= ⟨terminal⟩ | ⟨unaryEx⟩ | ⟨binaryEx⟩

KMath provides three ways to obtain MST instances:
1) Parse a string using a more specific grammar (stored in the ArithmeticsEvaluator.g4 file in [4]) that can produce a certain set of MST nodes from strings like sin(x) ^ 2 + 25.
2) Construct it explicitly.
3) Use a special KMath context in which all operations create an MST.

The MST can be interpreted recursively, although this is the slowest method of evaluation, as will be shown in Section IX. KMath provides such an interpreter for three reasons: dynamic compilation is restricted in several VM environments, it is not implemented for Kotlin/Native, and the interpreter is useful for testing and debugging.

MST APIs are connected to the Expression API via the MstExpression class, which consists of a pair of an MST node and a reference to an algebraic structure. The only difference between MstExpression and direct MST interpretation is that in the Expression implementation, symbolic nodes are loaded not only with the constants and literals of the corresponding algebraic context but also with the expression symbols.

Four other approaches to MST translation were considered, two of which were performant and could compile MST instances for any algebraic structure, both user-declared and those loaded from any KMath module.
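As an illustration of the grammar and of recursive interpretation, the following minimal sketch defines a tree with the four node kinds and evaluates it over doubles. The class and function names (Mst, Symbolic, Numeric, Unary, Binary, evaluate) and the set of supported operations are assumptions chosen for this example and do not reproduce the KMath classes.

```kotlin
import kotlin.math.sin

// Hypothetical node types mirroring the grammar; not the KMath classes.
sealed class Mst {
    data class Symbolic(val name: String) : Mst()
    data class Numeric(val value: Double) : Mst()
    data class Unary(val op: String, val arg: Mst) : Mst()
    data class Binary(val op: String, val left: Mst, val right: Mst) : Mst()
}

// Recursive interpretation over doubles: operation names are
// re-dispatched at every node on every evaluation.
fun Mst.evaluate(args: Map<String, Double>): Double = when (this) {
    is Mst.Symbolic -> args[name] ?: error("Unbound symbol: $name")
    is Mst.Numeric -> value
    is Mst.Unary -> {
        val a = arg.evaluate(args)
        when (op) {
            "sin" -> sin(a)
            "-" -> -a
            else -> error("Unknown unary operation: $op")
        }
    }
    is Mst.Binary -> {
        val l = left.evaluate(args)
        val r = right.evaluate(args)
        when (op) {
            "+" -> l + r
            "-" -> l - r
            "*" -> l * r
            "/" -> l / r
            else -> error("Unknown binary operation: $op")
        }
    }
}

fun main() {
    // x + 2, constructed explicitly (the second way listed above)
    val tree = Mst.Binary("+", Mst.Symbolic("x"), Mst.Numeric(2.0))
    println(tree.evaluate(mapOf("x" to 1.0))) // 3.0
}
```

The repeated string dispatch inside evaluate is the cost that the compilation backends discussed in the following sections try to remove.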
V. Java Class Generation
The goal of MST compilation to Java bytecode is to generate and load Java classes dynamically from an MST instance. The generated class should implement the Expression interface with valid type parameters, be consistent with the interpreter, and, to be universal, delegate all operations directly to a KMath algebraic structure.

ObjectWeb ASM [6] was chosen as the bytecode manipulation framework, as it was considered to be the most lightweight and widely used in the compiler industry.

This approach presented two major problems.

The first is boxing. Boxing and unboxing conversions [7] are often performed on the JVM due to type erasure, but they degrade calculation performance. Since Expression is a generic interface, boxing conversions are performed at that interface; the aim is to keep the number of such conversions to a minimum. It is possible to optimize out boxing with escape analysis and scalar replacement, but the performance of generated expressions on alternative JVM implementations such as GraalVM [8] must be investigated further.

The second problem concerns how to acquire a method signature that will call the needed algebraic operation. Four potential solutions were considered:
1) The method to invoke is searched for within the given Algebra object with Java reflection.
2) Rather than calling algebraic operation functions directly, the unaryOperation and binaryOperation routines are called each time.
3) Direct calls are inserted, but only if the user provides a dictionary mapping Algebra operation identifiers to method signatures.
4) unaryOperationFunction / binaryOperationFunction are used, and the functions they return are stored within the expression object.

Each of the four options has advantages and disadvantages, which are contrasted in Table I.

TABLE I
Approaches to Calling Operations on JVM

| Criterion | Reflection lookup | Direct dynamic calls | Method calls by the table | Indirect dynamic calls |
|---|---|---|---|---|
| Boxing problem | Only return value is boxed | Both arguments and return value are boxed | Only return value is boxed, or everything is boxed | Both arguments and return value are boxed, but some optimizations will be possible |
| Fails if operation name doesn’t match the method name | Yes | No | No | No |
| An extra parameter should be passed to the compiler | No | No | Yes | No |
| Performs tableswitch operations lookup | Only if method can’t be found | At each call | No | Only at compilation |

Two attempts were made to implement Java bytecode generation. The first used reflection-based lookup; however, this approach failed when the operation name did not match the name of the function retrieved, or when the semantics of the operation did not match those of the corresponding function. Implementing this algorithm proved cumbersome, requiring coercion of stack values and careful use of reflection.

In the second attempt, Option 4 was implemented: function type objects were collected from unaryOperationFunction-like methods. This algorithm incurred a larger boxing allocation overhead; however, it was much more stable and universal.

We now compare the bytecode generated by both generation algorithms in their decompiled forms: the legacy one in Listing 1 and the current one in Listing 2. Both classes are generated from the expression x + 2 inside a
RealField context, which implements
ExtendedField
Expression
Symbol from
String) are the same. The first difference is in the stored fields. The legacy bytecode generator emitted either one or two fields: the first stores a reference to the Algebra object, and the second stores constants (as Object[]) that are required by the expression but cannot be placed into the class file’s constant pool directly. In contrast, the new generator emits only one field, constants, which stores dynamic constants as well as Kotlin function objects produced by the unaryOperationFunction and binaryOperationFunction methods. The second difference is in the expression constructor: where the legacy class takes a reference to the Algebra that serves as the receiver of all operations, the new one takes only the elements of the constants array. Both generated classes are instantiated with reflection.
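The idea behind the indirect dynamic calls of Option 4 can be sketched without bytecode generation. In the example below, Kotlin closures stand in for the generated class: each operation name is resolved to a function object once, when the tree is translated, and only the captured function objects are invoked during evaluation. This substitutes closures for ASM-generated bytecode and is not the KMath implementation; DynamicAlgebra, DoubleAlgebra, Compiled, and the helper functions are hypothetical names, and only binaryOperationFunction is taken from the text.

```kotlin
sealed class Mst {
    data class Symbolic(val name: String) : Mst()
    data class Numeric(val value: Double) : Mst()
    data class Binary(val op: String, val left: Mst, val right: Mst) : Mst()
}

// Hypothetical, reduced algebra contract: operations are looked up by name.
interface DynamicAlgebra<T> {
    fun number(value: Double): T
    fun binaryOperationFunction(operation: String): (T, T) -> T
}

object DoubleAlgebra : DynamicAlgebra<Double> {
    private val binaryOps: Map<String, (Double, Double) -> Double> = mapOf(
        "+" to { a: Double, b: Double -> a + b },
        "*" to { a: Double, b: Double -> a * b }
    )

    override fun number(value: Double): Double = value

    override fun binaryOperationFunction(operation: String): (Double, Double) -> Double =
        binaryOps[operation] ?: error("Unknown operation: $operation")
}

// A translated expression: arguments in, value out.
// Values of type T cross this generic boundary boxed on the JVM.
typealias Compiled<T> = (Map<String, T>) -> T

fun <T : Any> lookup(name: String): Compiled<T> =
    { args -> args[name] ?: error("Unbound symbol: $name") }

fun <T : Any> constant(value: T): Compiled<T> = { _ -> value }

fun <T : Any> call(op: (T, T) -> T, left: Compiled<T>, right: Compiled<T>): Compiled<T> =
    { args -> op(left(args), right(args)) }

// The name-based lookup happens here, once per node, not at evaluation time.
fun <T : Any> compile(node: Mst, algebra: DynamicAlgebra<T>): Compiled<T> = when (node) {
    is Mst.Symbolic -> lookup<T>(node.name)
    is Mst.Numeric -> constant(algebra.number(node.value))
    is Mst.Binary -> call(
        algebra.binaryOperationFunction(node.op),
        compile(node.left, algebra),
        compile(node.right, algebra)
    )
}

fun main() {
    val tree = Mst.Binary("+", Mst.Symbolic("x"), Mst.Numeric(2.0))
    val expression = compile(tree, DoubleAlgebra)
    println(expression(mapOf("x" to 1.0))) // 3.0
}
```

As in the "Indirect dynamic calls" column of Table I, both the arguments and the return value pass through generic function types, so they are boxed, while the lookup by operation name is performed only at translation time.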
VI. JavaScript Source Code Generation
By applying Kotlin Multiplatform to the library, it was easy to port MST features to Kotlin/JS (Kotlin for JavaScript) and Kotlin/Native (Kotlin for Native) while supporting functionality similar to JVM dynamic compilation.

The development of the Kotlin/JS backend was straightforward. The idea of storing functions instead of Algebra references was carried over from the Java bytecode backend: the generated function is passed a similar constants array storing both the constant values of the expression and function references to the operations used in the expression.

As for tooling, ESTree [9] was selected as the JavaScript AST class package and astring [10] as the code generation framework. The only implementation choice was between creating sources by appending fragments to a string and building an AST and then rendering it.

The MST to JS compiler generates a function, then wraps it as a KMath Expression. An example of such a function is shown in Listing 3.

    var executable = function (constants, arguments) {
      return constants[1](constants[0](arguments, "x"), 2);
    };

Listing 3. Example of generated JavaScript function.
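For comparison, the alternative mentioned in this section, creating the source by appending fragments to a string, can be sketched as follows. This is not the ESTree and astring pipeline used by the backend. The names JsSourceBuilder, slot, jsNumber, and generate are hypothetical, and plain strings stand in for the function objects that would be placed in the constants array at runtime.

```kotlin
import kotlin.math.abs

sealed class Mst {
    data class Symbolic(val name: String) : Mst()
    data class Numeric(val value: Double) : Mst()
    data class Binary(val op: String, val left: Mst, val right: Mst) : Mst()
}

// Prints whole numbers without a fractional part, as in Listing 3.
fun jsNumber(value: Double): String =
    if (value % 1.0 == 0.0 && abs(value) < 1e15) value.toLong().toString()
    else value.toString()

class JsSourceBuilder {
    // Stand-ins (labels only) for the runtime values passed as `constants`.
    val constants = mutableListOf<String>()

    // Returns the index of a constant, registering it on first use.
    private fun slot(key: String): Int {
        val existing = constants.indexOf(key)
        if (existing >= 0) return existing
        constants.add(key)
        return constants.size - 1
    }

    fun emit(node: Mst): String = when (node) {
        is Mst.Symbolic -> {
            val index = slot("symbol lookup")
            "constants[$index](arguments, \"${node.name}\")"
        }
        is Mst.Numeric -> jsNumber(node.value)
        is Mst.Binary -> {
            // Operands are emitted first, so their constants get lower indices.
            val left = emit(node.left)
            val right = emit(node.right)
            val index = slot("operation ${node.op}")
            "constants[$index]($left, $right)"
        }
    }
}

// Produces the JavaScript source and the list of constants it expects.
fun generate(node: Mst): Pair<String, List<String>> {
    val builder = JsSourceBuilder()
    val body = builder.emit(node)
    val source = "var executable = function (constants, arguments) {\n  return $body;\n};"
    return source to builder.constants
}

fun main() {
    val (source, constants) = generate(Mst.Binary("+", Mst.Symbolic("x"), Mst.Numeric(2.0)))
    println(source)
    println(constants) // [symbol lookup, operation +]
}
```

For the expression x + 2 the emitted body matches the shape of Listing 3. String appending keeps the sketch short, whereas building an AST first makes it harder to emit syntactically invalid code.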
VII. WebAssembly IR Generation
WebAssembly [11] (also known as WASM) is an open standard defining a portable IR for executable programs. In the context of this study, WebAssembly code generation was attempted; however, the WASM compilation path is much more limited than JVM bytecode or JS generation.

To facilitate dynamic compilation, Kotlin/JS was used to prototype the backend implementation. In doing so, several trade-offs and problems were encountered:
1) Due to the slowness and lack of interoperability between JS and WASM, the Kotlin/JS built-in mathematical functions (which are simply delegated to the JavaScript Math object) were unavailable, as was the ability to invoke KMath context functions.
2) Only the f64 and i32 WASM types were supported, as i64 is unavailable without an experimental V8 feature that maps i64 to JavaScript’s bigint type. f32 was not available for a similar reason: JavaScript does not provide a type for single-precision floating-point values.
3) Basic mathematical functions for f64 (e.g. sin and cos) required to support RealField operations were taken from libm (also known as math.h [12]), which was compiled to WASM and partially appended to the initial state of the WASM module. All other f64 arithmetic is available through WASM opcodes.

    (func $executable (param $0 f64) (result f64)
      (f64.add
        (local.get $0)
        (f64.const 2)))

Fig. 1. Example of emitted WASM IR in the WAT form.

    import java.util.*;
    import scientifik.kmath.asm.internal.*;
    import scientifik.kmath.expressions.*;
    import scientifik.kmath.operations.*;
    public final class AsmCompiledExpression_1073786867_0
    implements Expression

Listing 1. Legacy bytecode generation result (decompiled).

    import java.util.*;
    import kotlin.jvm.functions.*;
    import kscience.kmath.asm.internal.*;
    import kscience.kmath.expressions.*;
    public final class AsmCompiledExpression_45045_0

Listing 2. Current bytecode generation result (decompiled).

The backend described uses the binaryen [13] library to simplify IR generation and perform various optimizations. While the upcoming Kotlin/WASM (Kotlin for WebAssembly) toolchain would have been a much more suitable target due to its lower interoperability overhead, the project is still in early development at the time of writing.
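To make the mapping from an expression tree to the IR shown in Fig. 1 concrete, the sketch below emits the WAT text form directly for expressions restricted to f64 arithmetic. It is only an illustration: the backend described here builds the module through binaryen rather than by printing text, and the names Mst, watOps, watNumber, emitWat, and emitFunction are assumptions.

```kotlin
import kotlin.math.abs

sealed class Mst {
    data class Symbolic(val name: String) : Mst()
    data class Numeric(val value: Double) : Mst()
    data class Binary(val op: String, val left: Mst, val right: Mst) : Mst()
}

// f64 arithmetic that is available directly as WASM instructions.
val watOps = mapOf("+" to "f64.add", "-" to "f64.sub", "*" to "f64.mul", "/" to "f64.div")

// Prints whole numbers without a fractional part, as in Fig. 1.
fun watNumber(value: Double): String =
    if (value % 1.0 == 0.0 && abs(value) < 1e15) value.toLong().toString()
    else value.toString()

// Each symbol becomes a read of the function parameter with the same position.
fun emitWat(node: Mst, parameters: List<String>): String = when (node) {
    is Mst.Symbolic -> {
        val index = parameters.indexOf(node.name)
        require(index >= 0) { "Unbound symbol: ${node.name}" }
        "(local.get \$$index)"
    }
    is Mst.Numeric -> "(f64.const ${watNumber(node.value)})"
    is Mst.Binary -> {
        val opcode = watOps[node.op] ?: error("Unsupported operation: ${node.op}")
        "($opcode ${emitWat(node.left, parameters)} ${emitWat(node.right, parameters)})"
    }
}

// Wraps the expression body in a function taking one f64 per parameter.
fun emitFunction(node: Mst, parameters: List<String>): String {
    val params = parameters.indices.joinToString(" ") { "(param \$$it f64)" }
    return "(func \$executable $params (result f64) ${emitWat(node, parameters)})"
}

fun main() {
    val tree = Mst.Binary("+", Mst.Symbolic("x"), Mst.Numeric(2.0))
    println(emitFunction(tree, listOf("x")))
    // (func $executable (param $0 f64) (result f64) (f64.add (local.get $0) (f64.const 2)))
}
```

Functions such as sin have no WASM opcode, which is why the text above describes linking routines from libm into the module; this sketch simply rejects such operations.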
VIII. LLVM IR Generation
The
LLVM [14] (Low-level Virtual Machine) compilerinfrastructure project is a set of compiler and toolchaintechnologies designed around an IR which serves as aportable, high-level assembly language. LLVM was usedas the backend of Kotlin/Native and was investigated asa possible expression compilation target.However, the decision was made to forgo this feature fortwo reasons:1) This generation target could not be made uni-versal due to the difficulty of interoperation withKotlin /Native as the host platform.2) LLVM’s monolithic scope and poor compilation per-formance, particularly with higher optimization lev-els, is unsuitable for the dynamic compilation, atleast for primitive typed computations. All the pre-viously mentioned IRs have a lightweight or built-ininfrastructure for runtime code generation.
IX. Benchmark Results
The new expression APIs were microbenchmarked. Two measurements are included in this paper.

Environment data:
• CPU: Intel Core i5 6400, 3.196 GHz, Skylake
• RAM: 15.977 GiB
• OS: Ubuntu 20.10 Groovy

All tested expression API implementations calculate the following formula one million times using double-precision floating-point arithmetic:

x + 2 x − sin(x). (1)

Java runtime:
1) OpenJDK Hotspot (build 11.0.10+9-LTS)
2) OpenJDK GraalVM CE 21.0.0 (build 11.0.10+8-jvmci-21.0-b06)

The JMH [15] (Java Microbenchmark Harness) shipped with the kotlinx-benchmark [16] toolkit was used in throughput mode with 5 warm-up iterations and 5 measurement iterations. JVM hosted measurements are presented in Table II.

TABLE II
JVM Hosted Measurements

| Benchmark | Description | Average throughput, Hotspot | Average throughput, GraalVM |
|---|---|---|---|
| functional | Functional expression | 2.6 Hz | Hz |
| mst | Interpretation of MST | Hz | 0.177 Hz |
| asm | ASM compiled expression | 2.994 Hz | Hz |
| raw | Statically compiled implementation in Kotlin | 4.012 Hz | Hz |

JS runtime:
1) NodeJS 12.16.1 (V8 7.8.279.23-node.31)

JS hosted measurements are presented in Table III.

TABLE III
JS Hosted Measurements

| Benchmark | Description | Single shot time, NodeJS |
|---|---|---|
| functional | Functional Expression | 3.61 s |
| mst | Interpretation of MST | 254 s |
| wasm | WASM compiled expression | 4.22 s |
| estree | ESTree compiled expression | s |
| raw | Statically written implementation in Kotlin | 4.78 s |
The benchmark data suggest the following:
1) GraalVM is generally faster than Hotspot.
2) Functional expressions are optimized well by both the JVM and V8.
3) Raw expressions on the JVM perform almost no boxing conversions, which is why they are the fastest.
4) WASM interoperability (passing even primitive data) is very expensive.
5) Statically created expressions are significantly faster than dynamically compiled ones.
6) The JVM hosted benchmarks will need to be rerun on future JVM implementations after the Project Valhalla [17] release. Valhalla implements JEP 218 [18], so boxing caused by generic interfaces can be eliminated by the JVM.
X. Conclusion
The research on dynamic interpretation and code generation for generic algebras in KMath is a work in progress. Much work remains to be done to stabilize the API and improve performance. Still, current results show the potential of dynamic expression building even for performance-critical parts. While ASM code generation did not provide a significant performance boost, it is still useful for research.

While the MST representation was a side product of this research, it proved to be a valuable tool on its own. As a syntax tree that can be extended with support for future symbols, it can be used for simple symbolic computations. For example, there is experimental support for automatic differentiation based on Kotlin∇ [5].

XI. Acknowledgment
The author would like to thank the members of the KMath development team (Alexander Nozik, Peter Klimai, and Roland Grinis) and Breandan Considine for the discussion and revision of the work.

The KMath project is developed in cooperation between MIPT and JetBrains Research.
References
[2] Kotlin Programming Language, version 1.4.30, Feb. 4, 2021. [Online]. Available: https://kotlinlang.org.
[3] A. Nozik, “Kotlin language for science and kmath library,” AIP Conference Proceedings, vol. 2163, no. 1, p. 040004, Oct. 2019. doi: 10.1063/1.5130103. eprint: https://aip.scitation.org/doi/pdf/10.1063/1.5130103. [Online]. Available: https://aip.scitation.org/doi/abs/10.1063/1.5130103 (visited on 02/15/2021).
[4] I. Postovalov, A. Nozik, P. Klimai, B. Considine, A. Mischenko, B. Mittelbach, A. Trifonov, A. Radke, A. Antipov, and N. Klimenko, Mipt-npm/kmath: 0.2.0, version v0.2.0, Feb. 22, 2021. doi: 10.5281/zenodo.4554034. [Online]. Available: https://doi.org/10.5281/zenodo.4554034.
[5] B. Considine, M. Famelis, and L. Paull, “Kotlin∇: A shape-safe DSL for differentiable programming,” 2019.
[6] E. Bruneton, E. Kuleshov, R. Forax, A. Loskutov, J. Zaugg, L. Rytz, E. Mandrikov, J. Mansfield, É. McManus, S. Tsypanov, T. Stalnaker, M. Benson, and T. McCormick, ObjectWeb ASM, version 9.0, Sep. 22, 2020. [Online]. Available: https://asm.ow2.io.
[7] T. Lindholm, F. Yellin, G. Bracha, and A. Buckley, The Java® Language Specification, Java SE 11 Edition. Oracle Corporation, Sep. 2018. [Online]. Available: https://docs.oracle.com/javase/specs/jls/se11/jls11.pdf (visited on 02/13/2021).
[8] Oracle Corporation, GraalVM
[10] Astring, version 1.4.3, Oct. 14, 2019. [Online]. Available: https://github.com/davidbonnet/astring.
[11] A. Haas, A. Rossberg, D. Schuff, B. Titzer, M. Holman, D. Gohman, L. Wagner, A. Zakai, and J. Bastien, “Bringing the web up to speed with webassembly,” Jun. 2017, pp. 185–200. doi: 10.1145/3062341.3062363.
[12] ISO, ISO/IEC 9899:2011 Information technology - Programming languages - C
[13] Binaryen, version 98.0.0-nightly.20201113, Nov. 13, 2020. [Online]. Available: https://github.com/WebAssembly/binaryen.
[14] C. Lattner and V. Adve, “LLVM: a compilation framework for lifelong program analysis & transformation,” in International Symposium on Code Generation and Optimization, 2004. CGO 2004., 2004, pp. 75–86. doi: 10.1109/CGO.2004.1281665.
[15] A. Shipilev, S. Kuksenko, J. Zaugg, V. Sitnikov, D. Chuyko, J. Vernee, Bernd, A. Biboudis, C. Dreis, S. Ponomarev, J. Kearney, E. Mandrikov, V. Tolstopyatov, E. Caspole, V. Ozerov, M. Mirwaldt, P. Stefan, A. Agrawal, C. Redestad, J. Whiting, D. Karnok, H. Tremblay, C. Kozak, N. G. ilya-g and, B. Nobakht, E. Duveblad, and M. Vala, jmh, version 1.21, May 4, 2018. [Online]. Available: https://openjdk.java.net/projects/code-tools/jmh/.
[16] I. Ryzhenkov, A. Qurbonzoda, S. Igushkin, ilya-g, L. Elena, I. Goncharov, S. M., and V. Tolstopyatov, kotlinx-benchmark