Array Operators Using Multiple Dispatch
A design methodology for array implementations in dynamic languages

Jeff Bezanson, Jiahao Chen, Stefan Karpinski, Viral Shah, Alan Edelman
MIT Computer Science and Artificial Intelligence Laboratory
[email protected], [email protected], [email protected], [email protected], [email protected]
Abstract
Arrays are such a rich and fundamental data type that they tend to be built into a language, either in the compiler or in a large low-level library. Defining this functionality at the user level instead provides greater flexibility for application domains not envisioned by the language designer. Only a few languages, such as C++ and Haskell, provide the necessary power to define n-dimensional arrays, but these systems rely on compile-time abstraction, sacrificing some flexibility. In contrast, dynamic languages make it straightforward for the user to define any behavior they might want, but at the possible expense of performance.

As part of the Julia language project, we have developed an approach that yields a novel trade-off between flexibility and compile-time analysis. The core abstraction we use is multiple dispatch. We have come to believe that while multiple dispatch has not been especially popular in most kinds of programming, technical computing is its killer application. By expressing key functions such as array indexing using multi-method signatures, a surprising range of behaviors can be obtained, in a way that is both relatively easy to write and amenable to compiler analysis. The compact factoring of concerns provided by these methods makes it easier for user-defined types to behave consistently with types in the standard library.

Keywords
Julia, multiple dispatch, type inference, array indexing, static analysis, dynamic dispatch
1. Array libraries

"Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which abstractions embedded in the language are not sufficient." – Ref. [22]

n-arrays (arrays of rank n, or simply arrays) are an essential data structure for technical computing, but are challenging to implement efficiently [24, 26, 27]. There is a long history of special-purpose compiler optimizations to make operations over arrays efficient, such as loop fusion for array traversals and common subexpression elimination for indexing operations [3, 24]. Many language implementations therefore choose to build array semantics into compilers.

Only a few of the languages that support n-arrays, however, have sufficient power to express the semantics of n-arrays for general rank n without resorting to hard-coding array behavior into a compiler. Single Assignment C [10] is a notable language with built-in n-array support. Other languages have well-established array libraries, like the C++ libraries Blitz++ [30] and Boost.MultiArray [9], and Haskell's Repa (Regular Parallel Arrays) [15, 20, 21]. These libraries leverage the static semantics of their host languages to define n-arrays inductively as the outer product of a 1-array with an (n−1)-array [1]. Array libraries typically handle dimensions recursively, one at a time; knowing array ranks at compile time allows the compiler to infer the amount of storage needed for the shape information, and to unroll index computations fully.

Array libraries built using compile-time abstraction have good performance, but also some limitations. First, language features like C++ templates are not available at run time, so these libraries do not support n-arrays where n is known only at run time. Second, code using these features is notoriously difficult to read and write; it is effectively written in a separate sublanguage. Third, the recursive strategy for defining n-arrays naturally favors only certain indexing behaviors. For example, Repa's reductions like sum are only defined naturally over the last index [15]; reducing over a different index requires permutations. However, it is worth noting that Haskell's type system encourages elegant factoring of abstractions. While the syntax may be unfamiliar, we feel that Repa ought to hold much interest for the technical computing community.

Some applications call for semantics that are not amenable to static analysis. Some may require arrays whose ranks are known only at run time, e.g. when reading in arrays from disk or from a data stream where the rank is specified as part of the input data. In these programs, the data structures cannot be guaranteed to fit in a constant amount of memory. Others may wish to dynamically dispatch on the rank of an array, a need which a library must anticipate by providing appropriate virtual methods.
Applications requiring run-time flexibility are better expressed in dynamic languages such as Mathematica, MATLAB, R, and Python/NumPy, where all operations are dynamically dispatched (at least semantically, if not in actual implementation). Such flexibility, however, has traditionally come at the price of slower execution. To improve performance, dynamic languages typically resort to static analysis at some level. One strategy is to implement arrays in an external library written in a static language. The Python NumPy package is a prominent example, implementing array operations as well as its own type system and internal abstraction mechanisms within a large C code base [29]. As a result, NumPy ndarrays are superficially Python objects, but implementation-wise are disjoint from the rest of the Python object system, since little of Python's native object semantics is used to define their behavior.

Figure 1. Above: most dynamic languages are designed without consideration for program analysis, leaving it to future compiler writers if the language becomes sufficiently popular. Below: Julia is designed with analysis in mind, with a single community responsible for both language design and performant implementation. This approach is natural for statically typed languages, but dynamic languages need static analysis too.

Another approach is to implement static analyses de novo for dynamic languages. However, the flexibility of these languages' programs limits the extent of analysis in practice. For example, MATLAB's array semantics allow an array to be enlarged automatically whenever a write occurs to an out-of-bounds index, and also allow certain operations to automatically promote the element type of an array from real to complex numbers. This poses implementation challenges for static MATLAB compilers like
FALCON, which have to implement a complete type system with multiple compiler passes and interprocedural flow analyses to check for such drastic changes to arrays [18, 25]. In fact, MATLAB's (and APL's) semantics are so flexible that shape inference on arrays is impossible to compute using ordinary dataflow analysis on bounded lattices [13]. Additionally, type checking is essential to disambiguate MATLAB expressions like A*B, which, depending on the dimensions of A and B, could represent a scaling, inner product, outer product, matrix-matrix multiplication, or matrix-vector multiplication [25]. Similar work has been done for other dynamic languages, as in Hack, a PHP implementation with a full static type system [31].

The conflicting requirements of performance and flexibility pose a dilemma for language designers and implementers. Most current languages choose either to support only programs that are amenable to static analysis for the sake of performance, like C++ and Haskell, or to support more general classes of programs, like MATLAB, Mathematica, and Python/NumPy. While dynamic languages nominally give up static analysis in the process, many implementations of these languages still resort to it in practice, either by hard-coding array semantics post hoc in a compiler, or by implementing arrays in an external library written in a static language.
2. Julia arrays
Julia [2] is dynamically typed and is based on dynamic multiple dispatch. However, the language and its standard library have been designed to take advantage of the possibility of static analysis (Figure 1), especially dataflow type inference [5, 14]. Such type inference, when combined with multiple dispatch, allows users and library writers to produce a rich array of specialized methods to handle different cases performantly. In this section we describe how this language feature is used to implement indexing for Julia's Array data type.

The Array type is parameterized by an element type and a rank (an integer). For purposes of this paper, its representation can be considered a tuple of a contiguous memory region and a shape (a tuple of integers giving the size of the array in each dimension). This simple representation is already enough to require nontrivial design decisions.
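For instance, in current Julia syntax (a small illustration, not from the paper), the element type and rank appear as type parameters, while the shape is an ordinary tuple of integers:

```julia
A = zeros(Float64, 3, 4, 5)   # an Array{Float64,3}: element type and rank as type parameters
size(A)                        # the shape tuple: (3, 4, 5)
ndims(A)                       # the rank: 3, known from the type itself
```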
Rules must be defined for how various operators act on array dimensions. Here we will focus on indexing, since selecting parts of arrays has particularly rich behavior with respect to dimensionality. For example, if a single row or column of a matrix is selected, does the result have one or two dimensions? Array implementations prefer to invoke general rules to answer such questions. Such a rule might say "dimensions indexed with scalars are dropped", or "trailing dimensions of size one are dropped", or "the rank of the result is the sum of the ranks of the indexes" (as in APL [7]).

The recursive, one-dimension-at-a-time approach favored in static languages limits which indexing behaviors can be chosen. For example, an indexing expression of a 3-array in C++ might be written as A[i][j][k]. Here there are three applications of operator[], each of which will decide whether to drop a dimension based on the type of a single index. The second rule described above, and others like it, cannot be implemented in such a scheme.

Julia's dispatch mechanism permits a novel approach that encompasses more rules, and does not require array rank to be known statically, yet benefits when it is. This solution is still a compromise among the factors outlined in the introduction, but it is a new compromise that we find compelling.
Our goal here is a bit unusual: we are not concerned with which rules might work best, but merely with how they can be specified, so that domain experts can experiment.

In fact, different applications may desire different indexing behaviors. For example, applications employing arrays with units or other semantic meaning associated with each dimension may not want to have the dimensions dropped or rearranged. For example, tomographic imaging applications may want arrays representing stacks of images as the imaging plane moves through a three-dimensional object. The resulting array would have associated space/time dimensions on top of the dimensions indexing into color planes. In other applications, the dimensions are not semantically distinguishable and it may be desirable to drop singleton dimensions. For example, a statistical computation may find it convenient to represent an n-point correlation function in an n-array, and integrate over k points to generate the lower-order (n−k)-correlation functions; the indistinguishability of the points means that the result is most conveniently expressed with rank (n−k) rather than as an n-array with k singleton dimensions. In practice we may have to reach a consensus on what rules to use, but this should not be forced by technical limitations.

Multiple dispatch (also known as generic functions, or multi-methods) is an object-oriented paradigm where methods are defined on combinations of data types (classes), instead of encapsulating methods inside classes (Figure 2). Methods are grouped into generic functions. A generic function can be applied to several arguments, and the method with the most specific signature matching the arguments is invoked.

Figure 2. Class-based method dispatch (above) vs. multiple dispatch (below).

One can invent examples where multiple dispatch is useful in classic object-oriented domains such as GUI programming. For example, a method for drawing a label onto a button might look like this in Julia syntax:

function draw(target::Button, obj::Label)
    ...
end
In numerical computing, binary operators are ubiquitous and we can easily imagine defining special behavior for some combination of two arguments:

+(x::Real, z::Complex) = complex(x + real(z), imag(z))
But how much more is there? Would we ever need to define a method on three different types at once? Indeed, most language designers and programmers seem to have concluded that multiple dispatch might be nice, but is not essential, and the feature is not often used [23]. Perhaps the few cases that seem to need it can be handled using tricks like Python's __add__ and __radd__ methods.

However, in technical computing the need for polymorphic, multi-argument operators goes further. In fact we have found a need for additional dispatch features that are not always found in multi-method implementations. For array semantics, support for variadic methods is perhaps the most important such feature. Combining multiple dispatch and variadic methods seems straightforward, yet permits surprisingly powerful definitions, and entails a surprising amount of complexity. For example, consider a variadic sum function that adds up its arguments. We could write the following two methods for it (note that in Julia, Real is the abstract supertype of all real number types, and Integer is the abstract supertype of all integer types):

sum(xs::Integer...)
sum(xs::Real...)
The syntax ... allows an argument slot to match any number of trailing arguments (currently, Julia only allows this at the end of a method signature). In the first case, all arguments are integers and so we can use a naive summation algorithm. In the second case, we know that at least one argument is not an integer (otherwise the first method would be used), so we might want to use some form of compensated summation instead. Notice that these modest method signatures capture a subtle property (at least one argument is non-integral) declaratively, without needing to explicitly loop over the arguments to examine their types. The signatures also provide useful type information: at the very least, a compiler could know that all argument values inside the first method are of type Integer. Yet the type annotations are not redundant: they are necessary to specify the desired behavior. There is also no loss of flexibility, since sum can be called with any combination of number types, as users of dynamic technical computing languages would expect.

While the author of these definitions does not write a loop to examine argument types, such a loop of course still must take place somewhere inside the dispatch system. Such a complex dispatch system is naturally at risk of performing badly. However, Julia pervasively applies dataflow type inference, so that argument types are often known in advance, in turn allowing method lookup to be done at compile time. Technically this is just an optimization, but in practice it has a profound impact on how code is written.
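Filled out with bodies, the pair of definitions might look like the following sketch (named mysum here to avoid shadowing Base's sum; the compensated variant uses Kahan summation as one possible choice):

```julia
# All-Integer calls dispatch to the naive loop.
function mysum(xs::Integer...)
    s = 0
    for x in xs
        s += x
    end
    return s
end

# Any call with at least one non-Integer argument falls through to
# this method, which uses Kahan (compensated) summation.
function mysum(xs::Real...)
    s = 0.0
    c = 0.0              # running compensation for lost low-order bits
    for x in xs
        y = x - c
        t = s + y
        c = (t - s) - y
        s = t
    end
    return s
end
```

A call like mysum(1, 2, 3) selects the first method; mysum(1, 2.5) can only match the second.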
Multiple dispatch appears at first to be about operator overloading: defining the behavior of functions on new, user-defined types. But the fact that the compiler "knows" the types of function arguments leads to a surprising, different application: performing elaborate, declarative transformations of argument tuples.

Determining the result shape of an indexing operation is just such a transformation. In Julia's standard library, we have a function index_shape that accepts index arguments (which, for present purposes, may be scalars or arrays of any rank), and returns the shape (a tuple of integers) of the result. The length of the shape determines the rank of the result array. Many different behaviors are possible, but currently we use the rule that trailing dimensions indexed with scalars are dropped. For example:
A[1:m, 1:n, 2]
Two method definitions express this behavior. (The ... ellipsis syntax within an expression, on the right-hand side of a definition, performs "argument splicing": the elements of a container are spread into multiple arguments to the called function. Formal arguments that lack a :: type specializer match any value.) The first definition traps and collapses runs of Real arguments of any length. The second definition ensures that the first definition only applies to the tail of an argument tuple, by keeping indexes as long as some non-scalar arguments remain.

Since all indexing functions call this function, changing these two lines is sufficient to change how indexing works. For example, another rule one might want is to drop all dimensions indexed with scalars.
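The two definitions can be reconstructed as the following sketch (consistent with the description here, though the exact standard-library source may differ):

```julia
# Rule: trailing dimensions indexed with scalars are dropped.
# The first method traps and collapses a trailing run of scalar (Real)
# indexes; the second keeps one dimension per index as long as some
# non-scalar arguments remain. (length of a scalar is 1 in Julia.)
index_shape(I::Real...) = ()
index_shape(i, I...) = tuple(length(i), index_shape(I...)...)
```

The drop-all-scalars alternative can be sketched the same way (given a distinct name here only so both variants can run side by side; in the library it would simply replace the two definitions):

```julia
index_shape_all() = ()
index_shape_all(i::Real, I...) = index_shape_all(I...)   # drop every scalar-indexed dimension
index_shape_all(i, I...) = tuple(length(i), index_shape_all(I...)...)
```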
Or we could imitate APL's behavior, where the rank of the result is the sum of the ranks of the indexes. This rule has been the subject of some debate in the Julia community [11]; fortunately it is easy to change.
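A sketch of the APL rule, under a hypothetical name index_shape_apl so it remains a self-contained example:

```julia
# APL rule: the rank of the result is the sum of the ranks of the indexes.
# A scalar index has size () and so contributes rank zero.
index_shape_apl() = ()
index_shape_apl(i, I...) = tuple(size(i)..., index_shape_apl(I...)...)
```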
Here size (as opposed to length) gives the shape tuple of an array, so we are just concatenating shapes.

Julia's multi-methods were designed with the idea that dataflow type inference would be applied to almost all concrete instances of methods, based on run-time argument types or compile-time estimated argument types. Our definitions exploit the dataflow operation of matching inferred argument types against method signatures, thereby destructuring and recurring through argument tuples at compile time. As a result, the compiler is able to infer sharp result types for many variadic calls, and to optimize away argument splicing that would otherwise be prohibitively expensive. More sophisticated method signatures lead to more sophisticated type deductions.

Tuple types (or product types) are crucial to this analysis. Since the type of each element of a tuple is tracked, it is possible to deduce that the type of f(x...), where x has tuple type (A,B), is equal to the type of f applied to arguments of types A and B. Variadic methods introduce unbounded tuple types, written as (T...). Unbounded tuple types form a lattice of infinite height, since new subtypes can always be constructed in the sequence (T...), (T,T...), (T,T,T...), etc. This adds significant complexity to our lattice operators.

Julia's multi-methods resemble symbolic pattern matching, such as that found in computer algebra systems. Pattern matching systems effectively allow dispatch on the full structure of values, and so are in some sense even more powerful than our generic functions. However, they lack a clear separation between the type and value domains, leading to performance opacity: it is not clear what the system will be able to optimize effectively and what it won't. Such a separation could be addressed by designating some class of patterns as the "types" that the compiler will analyze.
However, more traditional type systems could be seen as doing this already, while also gaining data abstraction in the bargain.
In many array languages, a function like index_shape would be implemented inside the run-time system (possibly scattered among many functions), and separately embodied in a hand-written transfer function inside the compiler. Our design shows that such arrangements can be replaced by a combination of high-level code and a generic analysis. Similar conclusions on the value of incorporating analyzed library code into a compiler were drawn by the Telescoping Languages project [16]. Yet other languages like Single Assignment C allow great flexibility in user-defined functions but require the built-in shape functions to be implemented with special-purpose type functions [10, 28].

From the programmer's perspective, Julia's multi-methods are convenient because they provide run-time and compile-time abstraction in a single mechanism. Julia's "object" system is also its "template" system, without different syntax or reasoning about binding time. Semantically, methods always dispatch on run-time types, so the same definitions are applicable whether types are known statically or not. This makes it possible to use popular dynamic constructs such as A[I...] where I is a heterogeneous array of indexes. In such a case the compiler will need to generate a dynamic dispatch, but only the performance of the call site is affected.

One price of this flexibility is that not all such definitions are well-founded: it is possible to write methods that yield tuples of indeterminate length. The compiler must recognize this condition and apply widening operators [5, 6]. In these cases, the deduced types are still correct but imprecise, and in a way that depends on somewhat arbitrary choices of widening operators (for example, such a type might look like (Int...) or (Int,Int...)). Nevertheless, we believe that the flexibility of Julia's multi-methods is of net benefit to programmers.

Language          DR     CR     DoS
Gwydion           1.74   18.27  2.14
OpenDylan         2.51   43.84  1.23
CMUCL             2.03   6.34   1.17
SBCL              2.37   26.57  1.11
McCLIM            2.32   15.43  1.17
Vortex            2.33   63.30  1.06
Whirlwind         2.07   31.65  0.71
NiceC             1.36   3.46   0.33
LocStack          1.50   8.92   1.02
Julia             5.86   51.44  1.54
Julia operators   28.13  78.06  2.01

Table 1. Comparison of Julia (1208 functions exported from the Base library) to other languages with multiple dispatch. The "Julia operators" row describes 47 functions with special syntax (binary operators, indexing, and concatenation). Data for other systems are from Ref. [23].
3. Discussion
Multiple dispatch is used heavily throughout the Julia ecosystem. To quantify this statement, we use the following metrics for evaluating the extent of multiple dispatch [23]:

1. Dispatch ratio (DR): the average number of methods in a generic function.
2. Choice ratio (CR): for each method, the total number of methods over all generic functions it belongs to, averaged over all methods. This is essentially the sum of the squares of the number of methods in each generic function, divided by the total number of methods. The intent of this statistic is to give more weight to functions with a large number of methods.
3. Degree of specialization (DoS): the average number of type-specialized arguments per method.

Table 1 shows the mean of each metric over the entire Julia Base library, showing a high degree of multiple dispatch compared with corpora in other languages [23]. Compared to most multiple dispatch systems, Julia functions tend to have a large number of definitions. To see why this might be, it helps to compare results from a biased sample of only operators. These functions are the most obvious candidates for multiple dispatch, and as a result their statistics climb dramatically. Julia is focused on technical computing, and so is likely to have a large proportion of functions with this character.
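As a toy illustration of how these metrics combine (hypothetical data, not the Base corpus), suppose each generic function is represented by a vector giving, per method, its number of type-specialized arguments:

```julia
# Three toy generic functions with 2, 1, and 4 methods respectively;
# each entry is a method's count of type-specialized arguments.
funcs = [[2, 1], [0], [3, 2, 1, 0]]

nmethods = sum(length, funcs)                   # total number of methods
DR  = nmethods / length(funcs)                  # methods per generic function
CR  = sum(f -> length(f)^2, funcs) / nmethods   # sum of squares over total methods
DoS = sum(sum, funcs) / nmethods                # specialized arguments per method
```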
To give a better sense of how multi-methods are or could be usedin our domain, we will briefly describe a few examples.
In certain instances of array indexing, it is possible to keep the data in place and return just a view (pointer) into the data instead of copying it. This functionality is implemented in a Julia package called ArrayViews.jl [19]. A crucial property of an array view is its contiguous rank: the number of leading dimensions for which the strides equal the cumulative product of the shape. When a view is constructed of another view, the type of view constructed depends on the indexes used and the contiguous rank of the argument. In favorable cases, a more efficient ContiguousView is returned. This determination is made by a definition similar to the following:

function view(a::DenseArray, I::Subs...)
    shp = vshape(a, I...)
    make_view(a, restrict_crank(contrank(a, I...), shp), shp, I...)
end

contrank essentially counts the number of leading "colons" in the indexing expression. make_view selects (via dispatch) what kind of view to return based on the result type of restrict_crank, which is set up to return the smaller of its two argument shapes. This is an excellent example of a library that needs to define behaviors actually exceeding the complexity of what is provided in the standard library.
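The colon-counting idea can be sketched in a simplified, value-level form (the real ArrayViews.jl tracks contiguous rank at the type level; the helper name below is hypothetical):

```julia
# Count the leading run of Colon (":") indexes: leading colons preserve
# contiguity, and the first non-colon index ends the run.
contrank_demo() = 0
contrank_demo(i::Colon, I...) = 1 + contrank_demo(I...)
contrank_demo(i, I...) = 0
```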
Other classes of array indexing rules are needed in distributed array implementations. The Star-P system [4, 12] let users "tag" dimensions as potentially distributed using the notation *p, which constructed a special type of object tracked by the system. Indexing leads to questions of whether to take the first instance of *p as the distributed dimension, the last instance, or perhaps just the last dimension. Such distribution rules could be implemented and experimented with readily using an approach similar to that used for index_shape.

A package providing unitful computations (SIUnits.jl [8]) makes use of the same kinds of tradeoffs as array semantics. Unitful computations are another case where the relevant metadata (exponents on units) can be known at compile time in many cases, but not always. The SIUnits library is free to express the general case, and have the overhead of tagging and dispatching removed where possible.

The method signatures of operators on unit quantities ensure that they only apply to arguments with the same units. Because of this design, if an operator is applied to two arguments where the units are only statically known for one, the compiler can infer that the other must have the same units for the operation to succeed. Another implication is that mismatched units can be handled with a one-line fallback definition that simply raises an informative error.
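This signature discipline can be illustrated with a stripped-down quantity type (a hypothetical sketch in current Julia syntax; SIUnits.jl itself is more elaborate):

```julia
import Base: +

# A quantity whose unit tag U (e.g. a symbol or a tuple of exponents)
# is part of the type, so dispatch can see it.
struct Quantity{U,T<:Real}
    val::T
end

# Addition applies only when both arguments carry the same units U...
+(a::Quantity{U}, b::Quantity{U}) where {U} =
    Quantity{U,typeof(a.val + b.val)}(a.val + b.val)

# ...and a one-line fallback turns any mismatch into an informative error.
+(a::Quantity, b::Quantity) = error("units do not match")
```

With these two methods, Quantity{:m,Int}(3) + Quantity{:m,Int}(4) succeeds, while adding meters to seconds raises the error.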
In this work, we focus on the performance that arises from eliminating abstraction overhead. Our goal is to convert general definitions (e.g. an indexing function that can handle many kinds of arguments) into the rough equivalent of handwritten C code for any particular case. This is a useful performance target, since programmers often resort to rewriting slow high-level code in C. Further speedups are possible, but entail techniques beyond the current scope. Additionally, we reuse the optimization passes provided by LLVM [17], allowing us to ignore many lower-level performance issues. Experience so far suggests that we are close to meeting this performance goal [2].
4. Conclusion
Programming languages must compromise between the ability to perform static analyses and allowing maximal flexibility in user programs. Performance-critical language features like arrays benefit greatly from static analyses, and so even dynamic languages that initially lack static analyses eventually want them one way or another.

We speculate that, historically, computer scientists developing multiple dispatch were not thinking about technical computing, and those who cared about technical computing were not interested in the obscurer corners of object-oriented programming. However, we believe that the combination of dataflow type inference, sophisticated method signatures, and the need for high-productivity technical environments is explosive. In this context, multi-methods, while still recognizable as such, can do work that departs significantly from familiar uses of operator overloading. By itself, this mechanism does not address many of the concerns of array programming, such as memory access order and parallelism. However, we feel it provides a useful increment of power to dynamic language users who would like to begin to tackle these and other related problems.
Acknowledgments
The authors gratefully acknowledge the enthusiastic participation of the Julia developer community in many stimulating discussions, in particular Dahua Lin and Keno Fischer for the ArrayViews.jl [19] and SIUnits.jl [8] packages, respectively. This work was supported by the MIT Deshpande Center, an Intel Science and Technology award, grants from VMWare and Citibank, a Horizontal Software Fellowship in Computational Engineering, and NSF DMS-1035400.
References

[1] G. Bavestrelli. A class template for N-dimensional generic resizable arrays. C/C++ Users Journal, 18(12):32–43, December 2000.
[2] J. Bezanson, S. Karpinski, V. B. Shah, and A. Edelman. Julia: A fast dynamic language for technical computing. arXiv:1209.5145v1, 2012.
[3] V. A. Busam and D. E. Englund. Optimization of expressions in Fortran. Communications of the ACM, 12(12):666–674, 1969.
[4] R. Choy and A. Edelman. Parallel MATLAB: Doing it right. In Proceedings of the IEEE, volume 93, pages 331–341, 2005.
[5] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL '77, pages 238–252, New York, NY, USA, 1977. ACM.
[6] P. Cousot and R. Cousot. Comparing the Galois connection and widening/narrowing approaches to abstract interpretation. In M. Bruynooghe and M. Wirsing, editors, Programming Language Implementation and Logic Programming, volume 631 of Lecture Notes in Computer Science, pages 269–295. Springer Berlin / Heidelberg, 1992.
[7] A. D. Falkoff and K. E. Iverson. The design of APL. SIGAPL APL Quote Quad, 6:5–14, April 1975.
[8] K. Fischer. SIUnits.jl. URL https://github.com/loladiro/SIUnits.jl.
[9] R. Garcia and A. Lumsdaine. MultiArray: a C++ library for generic programming with arrays. Software: Practice and Experience, 35(2):159–188, 2005.
[10] C. Grelck and S.-B. Scholz. SAC: A functional array language for efficient multi-threaded execution. International Journal of Parallel Programming, 34(4):383–427, 2006. ISSN 0885-7458.
[11] T. E. Holy. Drop dimensions indexed with a scalar? URL https://github.com/JuliaLang/julia/issues/5949.
[12] P. Husbands. Interactive Supercomputing. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1999.
[13] P. G. Joisha and P. Banerjee. An algebraic array shape inference system for MATLAB. ACM Transactions on Programming Languages and Systems, 28(5):848–907, Sept. 2006.
[14] M. A. Kaplan and J. D. Ullman. A scheme for the automatic inference of variable types. Journal of the ACM, 27(1):128–145, January 1980.
[15] G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP '10, pages 261–272, New York, NY, USA, 2010. ACM.
[16] K. Kennedy, B. Broom, K. Cooper, J. Dongarra, R. Fowler, D. Gannon, L. Johnsson, J. Mellor-Crummey, and L. Torczon. Telescoping languages: A strategy for automatic generation of scientific problem-solving systems from annotated libraries. Journal of Parallel and Distributed Computing, 61(12):1803–1826, 2001.
[17] C. Lattner and V. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), pages 75–86, Palo Alto, California, Mar 2004.
[18] X. Li and L. Hendren. Mc2For: a tool for automatically transforming MATLAB to Fortran 95. Technical Report SABLE-TR-2013-4, Sable Research Group, School of Computer Science, McGill University, Montréal, Québec, Canada, 2013.
[19] D. Lin. ArrayViews.jl. URL https://github.com/lindahua/ArrayViews.jl.
[20] B. Lippmeier and G. Keller. Efficient parallel stencil convolution in Haskell. In Proceedings of the 4th ACM Symposium on Haskell, Haskell '11, pages 59–70, New York, NY, USA, 2011. ACM.
[21] B. Lippmeier, M. M. T. Chakravarty, G. Keller, and S. Peyton Jones. Guiding parallel array fusion with indexed types. In Haskell '12: Proceedings of the 2012 Haskell Symposium, pages 25–36, New York, 2012. ACM.
[22] B. H. Liskov and S. N. Zilles. Programming with abstract data types. In Proceedings of the ACM SIGPLAN Symposium on Very High Level Languages, volume 9, pages 50–59, New York, 1974. ACM.
[23] R. Muschevici, A. Potanin, E. Tempero, and J. Noble. Multiple dispatch in practice. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications, OOPSLA '08, pages 563–582, New York, NY, USA, 2008. ACM.
[24] B. Randell and L. J. Russell. ALGOL 60 Implementation, volume 5 of The Automatic Programming Information Centre Studies in Data Processing. Academic Press, London, 1964.
[25] L. D. Rose and D. Padua. Techniques for the translation of MATLAB programs into Fortran 90. ACM Transactions on Programming Languages and Systems, 21:286–323, 1999.
[26] K. Sattley. Allocation of storage for arrays in ALGOL 60. Communications of the ACM, 4(1):60–65, 1961.
[27] K. Sattley and P. Z. Ingerman. The allocation of storage for arrays in ALGOL 60. ALGOL Bulletin, Sup 13.1:1–15, 1960.
[28] S.-B. Scholz. Single Assignment C: Efficient support for high-level array operations in a functional setting. Journal of Functional Programming, 13(6):1005–1059, 2003.
[29] S. van der Walt, S. C. Colbert, and G. Varoquaux. The NumPy array: A structure for efficient numerical computation. Computing in Science & Engineering, 13(2):22–30, 2011.
[30] T. L. Veldhuizen. Arrays in Blitz++. In D. Caromel, R. R. Oldehoeft, and M. Tholburn, editors, Computing in Object-Oriented Parallel Environments, volume 1505 of Lecture Notes in Computer Science, pages 223–230. Springer Berlin Heidelberg, 1998.
[31] J. Verlaguet and A. Menghrajani. Hack: a new programming language for HHVM. March 2014. URL http://code.facebook.com/posts/264544830379293.