Component Specification in the Cactus Framework: The Cactus Configuration Language
Gabrielle Allen, Tom Goodale, Frank Löffler, David Rideout, Erik Schnetter, Eric L. Seidel
Gabrielle Allen
Center for Computation & Technology, Department of Computer Science, Louisiana State University, Baton Rouge, Louisiana 70803. Email: [email protected]
Tom Goodale
Frank Löffler
Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana 70803. Email: [email protected]
David Rideout
Perimeter Institute for Theoretical Physics, 31 Caroline St. N., Waterloo, Ontario N2L 2Y5, Canada. Email: [email protected]
Erik Schnetter
Center for Computation & Technology, Department of Physics & Astronomy, Louisiana State University, Baton Rouge, Louisiana 70803. Email: [email protected]
Eric L. Seidel
City College of New York, New York, New York 10031; Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana 70803. Email: [email protected]
Abstract: Component frameworks are complex systems that rely on many layers of abstraction to function properly. One essential requirement is a consistent means of describing each individual component and how it relates to both other components and the whole framework. As component frameworks are designed to be flexible by nature, the description method should be simultaneously powerful, lead to efficient code, and be easy to use, so that new users can quickly adapt their own code to work with the framework.

In this paper, we discuss the Cactus Configuration Language (CCL) which is used to describe components (“thorns”) in the Cactus Framework. The CCL provides a description language for the variables, parameters, functions, scheduling and compilation of a component and includes concepts such as interface and implementation which allow thorns providing the same capabilities to be easily interchanged. We include several application examples which illustrate how community toolkits use the CCL and Cactus and identify needed additions to the language.
I. INTRODUCTION
Component frameworks provide a mechanism for efficiently developing and deploying scientific applications in high-performance computing environments. Such frameworks provide for efficient code reuse, community code development and abstraction of specialized capabilities such as adaptive mesh refinement or parallel linear solvers.

Component specification is an important part of component frameworks, with the specification providing the definition of the interfaces between components, including for example a description of the variables and functions both provided by and required by the different components. The choice of specification language impacts the scope of capabilities of components which can be implemented and exposed as well as the ease of use of components by both developers and users. If the component specification is too general it can hinder easy sharing of components, and if the specification is too narrow it will reduce the potential functionality of components and thus the application.

This paper describes the current specification of components in the Cactus Framework via the Cactus Configuration Language or CCL. Cactus is an open-source component framework designed for collaborative development of complex codes in high-performance computing environments. The largest user base for Cactus is in the field of numerical relativity where, for example, over 100 components are now shared among over fifteen different groups through the Einstein Toolkit [17] (Section IV-C). In other application areas, Cactus is used by researchers in fields including quantum gravity (Section IV-B), computational fluid dynamics, coastal modeling and computer science.

However, as simulation codes grow more complex, for example requiring multi-physics capabilities, there is now a need to extend or possibly re-architect the CCL to react to new features required by Cactus application developers.
Further, as the number of Cactus components grows, an increasing problem is how to provide user tools for component assembly, application debugging, and verification and validation. This paper provides a review of the CCL focusing on how it describes the interactions between thorns and implications for the development of user tools.

In Section II we describe the architecture of the Cactus Framework that particularly relates to its handling and orchestration of components, including the Cactus Scheduler, memory allocation, data types provided by Cactus, and existing and planned tools for component management. In Section III we describe the Cactus Thorn configuration files using the Cactus Configuration Language, the methods of thorn interaction, and built-in testing options. In Section IV we examine several different Cactus applications, the WaveToy Demo, a community toolkit for quantum gravity, and the Einstein Toolkit, with respect to the dependence among components enforced by the CCL. In Section V we describe some “missing” features of the CCL that will need to be addressed for future Cactus applications.

Fig. 1. Cactus components are called thorns and the integrating framework is called the flesh. The interface between thorns and the flesh is provided by a set of configuration files written in the Cactus Configuration Language (CCL).

II. CACTUS
The Cactus Framework [16], [3] is an open source, modular, portable programming environment for high-performance computing (HPC). It was designed and written specifically to enable scientists and engineers to develop and perform the large-scale simulations needed for modern scientific discoveries across a broad range of disciplines. Cactus is well suited for use in large, international research collaborations.
A. Architecture
Cactus is a component framework. Its components are called thorns whereas the framework itself is called the flesh (Figure 1). The flesh is the core of Cactus: it provides the APIs for thorns to communicate with each other, and performs a number of administrative tasks at build-time and run-time.

Cactus depends on three configuration files and two optional files provided by each thorn to direct these tasks and provide inter-thorn APIs. These files are:

• interface.ccl Defines the thorn interface and inheritance along with variables and aliased functions.
• param.ccl Defines parameters which can be specified in a Cactus parameter file and are set at the start of a Cactus run.
• schedule.ccl Defines when and how scheduled functions provided by thorns should be invoked by the Cactus scheduler.
• configuration.ccl (optional) Defines build-time dependencies in terms of provided and required capabilities, e.g. interfaces to Cactus-external libraries.
• test.ccl (optional) Defines how to test a thorn’s correctness via regression tests.

[Figure 2: diagram of a Cactus Thorn with four parts. Configuration Files (CCL): Interface, Parameters, Schedule, Configuration. Source Code: Fortran/C/C++, include files, Makefile. Verification & Validation: Testsuites. Documentation: Thorn guide, Examples, Metadata.]

Fig. 2. Cactus thorns comprise source code, documentation, and test suites for regression testing, along with a set of configuration files written in the Cactus Configuration Language (CCL) which define the interface with other thorns and the Cactus flesh.

The flesh is responsible for parsing the configuration files at build-time, generating source code to instantiate the different required thorn variables, parameters and functions, as well as checking required thorn dependencies.

At run-time the flesh parses a user provided parameter file which defines which thorns are required and provides key-value pairs of parameter assignments. The flesh then activates only the required thorns, sets the given parameters, using default values for parameters which are not specified in the parameter file, and creates the schedule of which functions provided by the activated thorns to run at which time.

The Cactus flesh provides the main iteration loop for simulations (although this can be overloaded by any thorn) but does not handle memory allocation for variables or parallelization; this is performed by a driver thorn. The flesh performs no computation of its own; this is all done by thorns. It simply orchestrates the computations defined by the thorns.

The thorns are the basic modules of Cactus. They are largely independent of each other and communicate via calls to the flesh API. Thorns are collected into logical groupings called arrangements. This is not strictly required, but strongly recommended to aid with their organization. An important concept is that of an interface. Thorns do not define relationships with other specific thorns, nor do they communicate directly with other thorns. Instead they define relationships with an interface, which may be provided by multiple thorns. This distinction exists so that thorns providing the same interface may be independently swapped without affecting any other thorns.
Interfaces in Cactus are fairly similar to abstract classes in Java or virtual base classes in C++, with the important distinction that in Cactus the interface is not explicitly defined anywhere outside of the thorn.

(Note that the run-time parameter file is different from the file param.ccl: param.ccl is used to define which parameters exist, while the parameter file is used to assign values to those parameters at run-time.)

This ability to choose among multiple thorns providing the same interface is important for introducing new capabilities in Cactus with minimal changes to other thorns, so that different research groups can implement their own particular solver for some problem, yet still take advantage of the large number of community thorns. For example, the original driver thorn for Cactus which handles domain decomposition and message passing is a unigrid driver called PUGH. More recently, a driver thorn which implements adaptive mesh refinement (AMR) was developed called Carpet [8], [7], [1]. Carpet makes it possible for simulations to run with multiple levels of mesh refinement, which can be used to achieve greater accuracy than unigrid simulations. Both PUGH and Carpet provide the interface driver, and application thorns can migrate relatively straightforwardly from unigrid to using the advanced AMR thorn.

Thorns providing the same interface may also be compiled together in the same executable, with the user choosing in the parameter file, at run-time, which implementation to use. This allows users to switch among various thorns without having to recompile Cactus.

Thorns include a doc directory which provides the documentation for the thorn in LaTeX format. This allows users to build one single reference guide to all thorns via a simple command.
B. Scheduling
The Cactus flesh provides a rule-based scheduler. Thorn functions can be specified to be called by the scheduler at different points in the simulation, in standard time bins. A scheduled routine can be requested to occur before/after other functions in the same time bin. It is also possible for thorns to define their own schedule groups, which may be thought of as user-defined time bins. The specification of scheduled functions in thorns is described in Section III-A2. At run time, the flesh builds a schedule tree and provides an API that allows this schedule tree to be traversed such that the functions are called in their desired order. Cactus provides the argument lists for calling these scheduled functions, and provides information about which variables need storage allocated and when.
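The before/after rules within one time bin can be read as edges of a directed graph, so one plausible way to obtain a calling order is a topological sort. The following Python sketch illustrates that idea only; it is not the flesh's actual scheduler, and the function `order_time_bin` and all routine names are hypothetical.

```python
from graphlib import TopologicalSorter

def order_time_bin(routines, before=(), after=()):
    """Return one valid calling order for the routines of a single time bin.

    before: pairs (a, b) meaning "a runs BEFORE b"
    after:  pairs (a, b) meaning "a runs AFTER b"
    """
    # TopologicalSorter expects, for each node, the set of its predecessors.
    predecessors = {name: set() for name in routines}
    for a, b in before:
        predecessors[b].add(a)
    for a, b in after:
        predecessors[a].add(b)
    # static_order() raises graphlib.CycleError if the rules contradict.
    return list(TopologicalSorter(predecessors).static_order())

# Hypothetical routine names, invented for this example.
order = order_time_bin(
    ["ApplyBoundaries", "EvolveField", "ComputeSources"],
    before=[("ComputeSources", "EvolveField")],
    after=[("ApplyBoundaries", "EvolveField")],
)
print(order)
```

Contradictory rules (a before b and b before a) surface as an exception rather than an arbitrary order, which is the behavior one would want from any rule-based scheduler.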
C. Memory Allocation
Memory allocation for Cactus variables is handled by the driver thorn, using information from the schedule and interface configuration files. Memory can be allocated for variables throughout the simulation, or allocated only during the execution of a function or schedule group. This provides a mechanism for reducing and tracking the memory footprint of a simulation. Incorrect memory allocation and the use of uninitialized variables can easily lead to bugs in codes which are hard to detect. Various Cactus thorns provide tools which help locate such errors, for example by initializing variables to have a value of NaN and then checking for these values during the simulation. (A full explanation of NaN may be found online: http://en.wikipedia.org/wiki/NaN)
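The NaN technique mentioned above can be illustrated in a few lines of Python. This is a minimal sketch of the general idea, not the code of any Cactus thorn; `allocate_poisoned`, `find_nans` and the variable `phi` are hypothetical names.

```python
import math

def allocate_poisoned(n):
    """Allocate storage for n reals, filled with NaN instead of zeros."""
    return [math.nan] * n

def find_nans(values):
    """Return the indices in `values` that still hold NaN."""
    return [i for i, v in enumerate(values) if math.isnan(v)]

# Hypothetical grid variable with 6 points; a routine fills only the interior.
phi = allocate_poisoned(6)
for i in range(1, 5):
    phi[i] = 0.5 * i

# Any index reported here was read-ready but never written.
untouched = find_nans(phi)
print(untouched)
```

Because NaN propagates through arithmetic, a forgotten initialization shows up in the check instead of silently producing plausible-looking numbers, which is what makes poisoning preferable to zero-filling.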
D. Data Types
Cactus defines its own data types for thorns. These data types include standard integer and real types, and a complex number data type. Supported Cactus data types include Byte, Int, Real, Complex, String, Keyword and Pointer, but the use of some of them is restricted (e.g. Keyword and String to parameters). An optional trailing number to the type can be used to set the size in bytes, where applicable. The motivation to provide Cactus data types comes from the fact that there is not a standard size for data types across all platforms. Providing Cactus-specific data types allows the framework to maintain an explicit variable size across all platforms, and provides maximum code portability. In addition it allows users to select the size of these standard types at build time across all thorns.
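To make the "optional trailing number" convention concrete, the sketch below splits a type name into its base type and a size in bytes, falling back to a build-time default when no size is given. It is only an illustration: the `DEFAULT_SIZE` values are assumptions chosen for the example, and `parse_type` is a hypothetical helper, not part of Cactus.

```python
import re

# Assumed build-time defaults in bytes; illustrative values only.
DEFAULT_SIZE = {"Byte": 1, "Int": 4, "Real": 8, "Complex": 16}

def parse_type(spec):
    """Split a type name such as 'Real4' into (base, size_in_bytes).

    The size is None for types where a size does not apply.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)?", spec)
    if match is None:
        raise ValueError(f"not a type name: {spec!r}")
    base, size = match.group(1), match.group(2)
    if size is not None:
        return base, int(size)          # explicit size wins
    return base, DEFAULT_SIZE.get(base)  # otherwise the configured default

print(parse_type("Real4"), parse_type("Int"), parse_type("Keyword"))
```

Changing one entry of the default table changes the size of every unsuffixed use of that type, which mirrors the build-time selection described above.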
E. Tools
As a distributed software framework, Cactus can make use of some additional tools to assemble the code and manage the simulations. Oftentimes each arrangement of thorns resides in its own source control repository, as they are mostly independent of each other. This leads to a retrieval process that would quickly become unmanageable for end-users (for example the Einstein Toolkit comprises 135 thorns). To facilitate this process we use a thornlist written using the Component Retrieval Language [9], which allows the maintainers of a distributed framework to distribute a single file containing the URLs of the components and the desired directory structure. This file can then be processed by a program such as our own GetComponents script, and the entire retrieval process becomes automated.

In addition to the complex retrieval process, compiling Cactus and managing simulations can be a difficult task, especially for new users. There are a large number of options that may be required for a successful compilation, and these will vary across architectures. To assist with this process a tool called the Simulation Factory [10], [15] was developed. Simulation Factory provides a central means of control for managing access to different resources, configuring and building the Cactus codebase, and also managing the simulations created using Cactus. Simulation Factory uses a database known as the Machine Database, which allows Simulation Factory to be resource agnostic, allowing it to run consistently across any pre-configured HPC resource.

III. CACTUS CONFIGURATION LANGUAGE
The Cactus Configuration Language (CCL) was provided with the first Cactus 4.0 release in 1999. The language has evolved since then with the addition of function aliasing (Section III-A2) and the configuration CCL file (Section II-A), along with a small number of minor changes. The well-designed initial capabilities and ensuing stability of the CCL have contributed to the success of Cactus across different scientific fields and to its ability to enable the growth of application communities.

Schedule Bin       Description
CCTK_STARTUP       For routines which need to be run before the grid hierarchy is set up, for example, for function registration.
CCTK_PARAMCHECK    For routines that check parameter combinations for potential errors. Routines registered here only have access to the grid size and the parameters.
CCTK_INITIAL       For routines which generate initial data.
CCTK_PRESTEP       Tasks performed before the main evolution step.
CCTK_EVOL          The evolution step.
CCTK_POSTSTEP      Tasks performed after the evolution step.
CCTK_ANALYSIS      Routines which can analyze data at each iteration. This time bin is special in that ANALYSIS routines are only called if output from the routine is requested, e.g. in the parameter file.

Fig. 3. Scheduled functions in Cactus can be assigned to run in standard time bins, the most important of which are described in this table.
In this section we outline the structure of the Cactus Configuration Language and provide syntax definitions for many of the elements of CCL. A complete specification and discussion of the language may be found in the Cactus User’s Guide.

A. Thorn Configuration

1) Groups: Cactus variables are placed in variable groups with homogeneous attributes, where the attributes describe properties such as the data type, variable group type, rank, dimensions, and number of time levels. Many Cactus functions operate on groups of variables, for example storage allocation, synchronization between processors, and output functions. For example, a vector field containing individual variables for fluid flow in different directions would typically include all the vector components in a single variable group. By default, all variable groups are private; however, the public keyword can be used to change the access level for each subsequent variable group in the ccl file.
2) Functions: Cactus provides two types of functions, scheduled and aliased. Scheduled functions are declared in the schedule.ccl file and are defined to be called at certain stages in the Cactus simulation by prescribing a time bin, a specific time during a simulation, in which to run. Standard Cactus time bins are defined which are invoked in a well defined order, and a list of the most important Cactus standard time bins is provided in Figure 3.

Additionally, thorn developers can define their own time bins or schedule groups. It is possible to specify the order in which two scheduled functions are called, as well as simple conditionals and loops. Memory allocation of Cactus variables can be restricted to only the time of execution of a certain function. Figure 4 shows a subset of the syntax which is used to define a scheduled function.

(Cactus User’s Guide: http://cactuscode.org/documentation/UsersGuide.pdf)

SCHEDULE [GROUP]
Fig. 4. Subset of the syntax for declaring scheduled functions or schedule groups of functions. A function can be scheduled at a certain time bin or in a schedule group. It can be called while or if a condition is fulfilled. Functions or schedule groups can be scheduled before or after other functions or schedule groups, within the same time bin or schedule group. Storage for Cactus variables might only be allocated for a certain function or schedule group, to save overall memory. Variables distributed over multiple processes can be automatically synchronized after a certain function or schedule group, if specified in the ccl file.
Aliased functions are functions that can be shared between thorns. They are declared in the interface.ccl file and may be called by a thorn at any point during the simulation. In order to call an aliased function it is not important to know the programming language used for its implementation. The Cactus API takes care of any necessary conversions.
3) Variables: Grid variables are Cactus variables that are passed between thorns by the flesh, and are declared in the interface.ccl file. They are generally collected into variable groups of the same data type. There are three types of variable groups: grid functions, arrays, and scalars. Grid functions (GFs), the most common variable group type, are arrays with a specific size set by the parameter file, which are distributed across processors. All GFs must have the same array size, typically defining the shape and size of the computational domain. Arrays are a more general form of GFs in that each array group may have a distinct size which can be given by Cactus parameters. Scalars are single variables of a given basic type, much like rank-zero arrays. Cactus variables can specify a number of timelevels, which means a certain number of copies of this variable for use in time-evolution schemes where data at a past time is needed to calculate the new data at a later time. Part of the syntax for declaring a variable group is shown in Figure 5.
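The role of timelevels can be sketched as a small ring of buffers that is rotated once per step, so that the oldest storage is reused for the newest data. The Python below is an illustration under that assumption, not the driver's implementation; the class `TimeLevels` and the toy update rule are invented for the example.

```python
from collections import deque

class TimeLevels:
    """Hold `count` copies of a variable; index 0 is the newest level."""

    def __init__(self, count, initial):
        self.levels = deque((list(initial) for _ in range(count)), maxlen=count)

    def rotate(self):
        """Advance one step: the oldest buffer becomes the new level 0."""
        self.levels.rotate(1)    # last (oldest) buffer moves to the front
        return self.levels[0]    # caller overwrites it with new data

# Three timelevels of a one-point variable: new, current, old.
phi = TimeLevels(3, [0.0])
phi.levels[0][0] = 1.0           # current data; both past levels stay 0.0

for _ in range(2):
    new = phi.rotate()
    cur, old = phi.levels[1], phi.levels[2]
    new[0] = 2.0 * cur[0] - old[0]   # toy update that needs two past levels

print([level[0] for level in phi.levels])
```

No data is copied when the levels rotate; only the roles of the buffers change, which is why a scheme needing two past times costs three allocations and nothing more.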
4) Parameters: Parameters are used to specify the runtime behavior of Cactus and are defined in the param.ccl file. They have a specific data type and scope, a range of allowed values, and a default value. Once parameters have been set, they cannot be modified unless specifically declared to be steerable, in which case they may be dynamically changed throughout the simulation. The allowed data types for parameters are Int, Real, Keyword, Boolean, and String. Thorns can use and extend parameters of other thorns. The syntax for declaring Cactus parameters is shown in Figure 6.
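The parameter rules described above (a default inside an allowed range, values fixed after start-up unless steerable) can be modeled in a few lines. This Python class is a hedged sketch of those rules only; `Parameter`, its methods and the parameter names are hypothetical and do not correspond to the Cactus API.

```python
class Parameter:
    """A numeric parameter with an inclusive allowed range and a default."""

    def __init__(self, name, default, allowed, steerable=False):
        self.name = name
        self.allowed = allowed        # (low, high), both inclusive
        self.steerable = steerable
        self.frozen = False
        self._check(default)          # the default must lie in the range
        self.value = default

    def _check(self, value):
        low, high = self.allowed
        if not low <= value <= high:
            raise ValueError(f"{self.name}={value} outside [{low}, {high}]")

    def set(self, value):
        if self.frozen and not self.steerable:
            raise RuntimeError(f"{self.name} is not steerable")
        self._check(value)
        self.value = value

    def freeze(self):
        """Called once the parameter file has been processed."""
        self.frozen = True

# Hypothetical parameters, invented for this example.
amplitude = Parameter("amplitude", 1.0, (0.0, 10.0), steerable=True)
npoints = Parameter("npoints", 10, (1, 1000))
npoints.set(64)                       # allowed: still reading the parameter file
for p in (amplitude, npoints):
    p.freeze()
amplitude.set(2.5)                    # allowed: steerable
try:
    npoints.set(128)                  # rejected: frozen and not steerable
    changed = True
except RuntimeError:
    changed = False
print(amplitude.value, npoints.value, changed)
```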
data_type>

Fig. 5. Part of the syntax for declaring Cactus variables. Cactus variables have to be one of the data types Cactus defines and are part of a variable group. They can have different Cactus variable types, sizes, and number of time levels. Each variable group needs to have a human-readable description.

[EXTENDS|USES]

Fig. 6. Syntax for declaring Cactus parameters. Thorns might use or extend parameters of other thorns, and define their own. A parameter needs to have a data type. A human-readable description needs to be given, as well as an allowed range with a description for the range and a default value within that range.

5) Include Files: Header files can be shared between thorns if specified in the interface.ccl file. It is not only possible to share a single include file, but also to concatenate multiple include files (also from multiple thorns), and use them like a single include file. During the build process, Cactus copies all of the source files located in each thorn’s include directory to a central location from which they may be accessed by any other thorn using one of two methods shown in Figure 7. USES INCLUDE requests an include file from another thorn, and INCLUDE adds the code in

Fig. 7. Syntax for using include files in Cactus. Thorns might provide a specific header file to another thorn (the first example), or might provide one part of a concatenation of multiple header files, possibly from multiple thorns (the latter example).
B. Thorn Interaction

1) Scope: Cactus provides different levels of access for variables and parameters. Variables can be defined as public or private. Public variables can be inherited by a thorn when that thorn inherits an interface. Thorn inheritance will be described in greater detail below. Private variables can only be seen by the thorn which defines them.

Similarly, parameters may be defined as restricted or private. Restricted parameters are available to thorns which request access. Private parameters, like private variables, are only visible to the thorn which defines them. The access levels here only specify if those parameters are directly accessible in the source code; it is possible to access information about any parameter through Cactus API functions regardless of the parameter scope defined in the param.ccl file.
2) Inheritance: Cactus provides an inheritance mechanism similar to Java’s abstract classes. It allows thorns to gain access to variables provided elsewhere by inheriting from the interface. A key point here is that the thorns are not inheriting from other specific thorns; any number of thorns may declare themselves as implementing an interface. These thorns may all be compiled together, allowing the user to decide at run-time which thorn should be used. The interface is only specified by the thorns implementing it. This means that thorns declaring the same interface-name need to have an identical interface, which is checked by Cactus.

Cactus also provides capabilities which may be declared in the configuration.ccl file. Capabilities differ slightly from interfaces in that while any number of thorns providing the same interface may be compiled together, only one thorn providing a capability may be compiled into a specific configuration. In this sense, while interfaces define run-time dependencies, capabilities define build-time dependencies. This can be useful for providing external libraries or functions which are too complex for aliasing. Also, capabilities play a role in configuring thorns and external libraries since they interact with the build system of Cactus.

Many design decisions are based on the distinction between interfaces and capabilities. For example, the concept of capabilities is important for application performance: knowing an inter-thorn relationship at build time allows optimizations to be included that are not possible at run time.

The syntax for declaring and requiring a capability is shown in Figure 8.
PROVIDES
Fig. 8. Part of the syntax for declaring and requiring capabilities in Cactus. Capabilities can be required and provided by thorns. If a thorn provides a capability it interacts with the make system through the output of a script which needs to be specified in the ccl file, together with its programming language so that it can be called correctly.
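The difference between interfaces (many implementing thorns may be compiled together) and capabilities (only one provider per configuration) can be expressed as a simple consistency check. The Python sketch below is illustrative only: `check_configuration` is a hypothetical helper, the thorns HDF5_A and HDF5_B are invented, and only the fact that PUGH and Carpet both provide the interface driver comes from the text.

```python
from collections import defaultdict

def check_configuration(thorns):
    """Group thorns by interface and report capability conflicts.

    thorns: dict name -> {"implements": str, "provides": [capability, ...]}
    Returns (implementations, conflicts), where conflicts maps each
    capability claimed by more than one thorn to the list of claimants.
    """
    implementations = defaultdict(list)
    providers = defaultdict(list)
    for name, info in thorns.items():
        implementations[info["implements"]].append(name)
        for capability in info.get("provides", []):
            providers[capability].append(name)
    conflicts = {cap: names for cap, names in providers.items() if len(names) > 1}
    return dict(implementations), conflicts

thorns = {
    "PUGH":   {"implements": "driver"},
    "Carpet": {"implements": "driver"},
    # Two invented thorns that both claim the same capability.
    "HDF5_A": {"implements": "hdf5_a", "provides": ["HDF5"]},
    "HDF5_B": {"implements": "hdf5_b", "provides": ["HDF5"]},
}
implementations, conflicts = check_configuration(thorns)
print(implementations["driver"])   # both drivers may coexist in one build
print(conflicts)                   # a duplicate capability is a build-time error
```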
The interface.ccl file also provides a low-level include mechanism, described in Section III-A5, similar to that found in C/C++. Thorns may request access to any include file within the Cactus source tree without specifying which thorn or interface should provide it. This is used primarily for optimization reasons as the compiler can then replace inline functions, and in some cases for providing access to external libraries such as HDF5.
C. Testing
It is strongly recommended, although not required, that thorns come with one or more test suites. These consist of sample parameter files and the expected output for those parameters. These files should be located within the test directory in the thorn, so that the test suites may be run using gmake

IV. EXAMPLES
In this section we show some examples of the dependencies among Cactus thorns which are generated by the CCL files for different applications: a simple example application for the scalar wave equation with a minimal set of thorns; a small community toolkit for quantum gravity; and a large community toolkit for numerical relativity. The interest in thorn dependencies arises for two core reasons:

1) Cactus is particularly targeted at enabling communities to generate shared toolkits for solving a variety of problems in a particular field. The standard computational toolkit which is distributed with Cactus is further used by many different applications. Thorn dependencies and interfaces thus need to be carefully thought out and periodically revisited to make sure that the plug-and-play aim of Cactus, where different thorns can provide the same functionality, is achieved with interfaces which are as simple, flexible and general as possible. This design usually involves a delicate balance, taking into account the speed of implementation, complexity of the interface etc.

2) Long-time Cactus users work with standard thorn lists which are built up from experience and shared with collaborators. These thorn lists are amended as new thorns become available or are no longer used, and can contain several hundred thorns. For new users in particular, there is an increasing issue with providing a procedure for users to select the appropriate set of thorns for their application, and to understand the capabilities of different thorns. One big simplification which could be made would be to reduce the number of thorns in thorn lists by removing thorns on which others depend and which could therefore be added automatically. Ideally, a tool would be built which would allow a user to start from an abstract description of their problem and automatically select appropriate thorns, for example “Evolving Gaussian initial data using the 3-D scalar wave equation and outputting 3D data”, or “Evolving two black holes using Einstein’s equations and calculating gravitational waveforms”. The question is then whether there is currently enough information in the CCL files to achieve this, or how additional information could be provided.

In this section, we use the dependencies among the sets of thorns described in the CCL files for these three example applications to view the complete set of thorn dependencies and to investigate how the thorn set could potentially be generated from an initial minimal set of thorns. The dependencies used for the figures are taken from a file generated during the Cactus build process which contains a complete database of the contents of the different thorn configuration files.

A Perl script is used to parse this database and generate a file in dot format, which can then be processed by a program like graphviz [12] and turned into a directed graph like that in Figure 9. This graph shows five different types of dependencies. Inheritance is denoted by a regular arrow, dependencies due to a required function are denoted by an arrow with a square head, direct thorn dependencies are denoted by a dotted arrow, shared variable dependencies are denoted by an arrow with a circular head, and dependencies due to a required capability are denoted by an arrow with a diamond head. There are also shaded and unshaded thorns, the distinction being that the shaded thorns have no other thorns depending on them.

This Perl script does not show the dependencies generated by a single thorn, so we also use a set of two Python scripts, the first of which parses the actual CCL files and generates an XML file containing all of the dependencies. This file can then be queried by the second script, which will search for a single thorn and find all thorns upon which the queried thorn depends. It will also output a graph in dot format, as seen in Figure 10. The second script will also allow users to choose between alternate implementations of the same interface (e.g. PUGH or Carpet). The motivation here is that this script should allow the user to generate a complete thornlist that could then be used to build a simulation.

A. Simple Example: Scalar Waves
The set of Cactus thorns to solve the 3-D scalar wave equation (WaveToy Demo) was developed as a pedagogical example for understanding Cactus, and as a simple and well understood test case for new developments. These thorns solve the hyperbolic wave equation in 3D Cartesian coordinates with different boundary conditions for a chosen set of initial data and include different output formats and a web interface. This example is described on the Cactus web pages [16], which also provide a thorn list with information about the 22 thorns that are used. The example application includes two initial data thorns which specify the initial scalar field and sources (idscalarwavec and wavebinarysource), a scalar field evolver (wavetoyc) along with additional thorns from the standard Cactus Computational Toolkit. The example uses the unigrid driver pugh with associated thorns pughslab for hyperslabbing and pughreduce which provides a set of standard reduction operations that can calculate for example the maximum value or L2 norm over the grid for any grid variable.

A complete set of dependencies between these thorns as specified in the CCL files is shown in Figure 9. In this diagram we can see for example the central nature of the ioutil thorn which provides functionality that can be used by thorns implementing different I/O methods, for example providing a parameter which sets when data for all I/O methods should be output and the directory in which to write data.

[Figure 9: dependency graph. Thorns shown: WaveToyC, Boundary, CartGrid3D, IDScalarWaveC, IsoSurfacer, WaveBinarySource, CoordBase, HTTPDExtra, HTTPD, IOAscii, IOBasic, IOJpeg, IOUtil, jpeg6b, LocalInterp, LocalReduce, PUGHReduce, PUGH, PUGHSlab, Socket, SymBase, Time. Legend: Interface Inheritance, Function Requirement, Direct Thorn Dependency, Shared Variable Dependency, Capability Requirement.]

Fig. 9. Dependency graph for the complete set of thorns in the simple example application WaveToy Demo. The shaded items indicate that the thorns are ‘leaves’ and have no thorns depending on them.

[Figure 10: dependency graph. Thorns shown: Boundary, SymBase, PUGH, WaveToyC, CartGrid3D, IDScalarWaveC, CoordBase.]

Fig. 10. Dependency graph for the WaveToy Demo thornlist. This graph is generated using dependencies of thorn IDScalarWaveC which defines initial data for the fields evolved by the scalar wave equation.
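A query of the kind that produces a graph like Figure 10 can be sketched as a reachability search over dependency edges, followed by output in dot format. The Python below is a minimal illustration and not the authors' script: the function names are hypothetical, and the edges in `DEPENDS_ON` are placeholders invented for the example rather than the actual contents of any CCL file.

```python
def required_thorns(start, depends_on):
    """All thorns reachable from `start` by following dependency edges."""
    found, stack = set(), [start]
    while stack:
        thorn = stack.pop()
        if thorn not in found:
            found.add(thorn)
            stack.extend(depends_on.get(thorn, ()))
    return found

def leaves(depends_on):
    """Thorns that no other thorn depends on (the shaded nodes)."""
    depended_on = {dep for deps in depends_on.values() for dep in deps}
    return (set(depends_on) | depended_on) - depended_on

def to_dot(thorns, depends_on):
    """Render the dependency edges among `thorns` in Graphviz dot format."""
    lines = ["digraph thorns {"]
    for thorn in sorted(thorns):
        for dep in sorted(depends_on.get(thorn, ())):
            if dep in thorns:
                lines.append(f'  "{thorn}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)

# Placeholder edges, NOT taken from real CCL files.
DEPENDS_ON = {
    "IDScalarWaveC": ["WaveToyC", "CartGrid3D"],
    "WaveToyC": ["Boundary", "CartGrid3D"],
    "CartGrid3D": ["CoordBase", "SymBase"],
    "Boundary": ["SymBase"],
    "IOBasic": ["IOUtil"],
}

selected = required_thorns("IDScalarWaveC", DEPENDS_ON)
print(sorted(selected))
print(sorted(leaves(DEPENDS_ON)))
print(to_dot(selected, DEPENDS_ON))
```

In this toy graph the I/O thorns are never reached from the initial data thorn, which illustrates why a dependency search alone needs several starting thorns to recover a complete thornlist.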
The dependency diagram also shows that any method to automatically generate this set of thorns using dependency information would need 11 thorns specified as a starting point; these are the shaded thorns in the diagram. For example, if we simply started from the thorn that specifies the initial scalar field (idscalarwavec) as shown in Figure 10, which could be the obvious starting point for a user who knows they want to evolve a particular scalar field, then working only with dependencies would result in a set of thorns without any coordinate time (time), any I/O, or the possibility to include scalar source terms.

Adding additional metadata to thorns is one mechanism to supplement the current CCL information to enable the generation of thorn lists for a particular application. For example, explicitly tagging thorns as providing I/O methods would allow these thorns to be automatically added or to be selected by a user. In other cases, these diagrams show that additional interfaces or dependencies may need to be added. In Figure 10 attention needs to be given to the compile time dependencies that would include thorns time (which should in fact be inherited by the evolution thorn) and PUGHReduce and localreduce.

B. Small Community Code: The CausalSets Toolkit
The CausalSets Toolkit is an example of a small community codebase, which implements a wide variety of computations in discrete quantum gravity, in particular with regard to Causal Set Theory [13]. The toolkit is based upon two major components. One is a MonteCarlo arrangement, which provides a generic API for providing parallel random numbers, i.e. pseudo random numbers which are independent on all processes. A second is a CausetBase API, provided by the BinaryCauset thorn, which abstracts the mathematical notion of a causal set (a locally finite partially ordered set [13]), providing myriad routines for working with such objects.

One of the challenges in supporting computations in Causal Set Theory is that there is not a single sort of computation, such as finding approximate solutions to PDEs by finite difference or spectral methods, which one would like to perform. Instead a physicist will ask many different sorts of questions about the behavior of discrete partial orders. A given computation will share aspects with others, but the overall structure may differ considerably. Furthermore, the community is in general not terribly experienced with large scale computation, and thus benefits from software which insulates the physicist from many complications of parallel computing. The component based approach provided by the Cactus Framework is well suited to address both of these challenges, by allowing the physicist to mix and match individual components to build up the particular computation desired, working with familiar abstract mathematical concepts, rather than having to work directly with source code. Additionally, the components are designed to run readily on large scale hybrid architectures, without the user needing detailed knowledge of how the computation is implemented.

The dependency diagram for a collection of thorns which implements a sample computation is shown in Figure 11. This is a computation of spatial homology of a sprinkled causal set, as described in [4].
Here the BinaryCauset thorn implements the core CausetBase API, which provides the causal set along with a high level abstract interface to it. The MonteCarlo thorn provides parallel random numbers to CFlatSprinkle, which generates a random causal set, and RandomAntichain, which selects a random antichain within the causal set provided by CFlatSprinkle. The MonteCarlo arrangement gets the actual pseudorandom numbers from thorn RNGs, and also provides a thorn Distributions to provide samples from a variety of distributions, such as Poisson and Gaussian. AntichainEvol provides a sequence of ‘thickened antichains’, which are then read by the Nerve thorn, which computes a nerve simplicial complex from each thickened antichain. The homology groups of these simplicial complexes are then computed by a separate standalone homology package chomp [2]. The whole computation relies on PUGH as a standard Cactus driver, and uses Cactus’ IOUtil to provide metadata for IO routines.
[Figure 11 graph. Thorns shown: Nerve, AntichainEvol, BinaryCauset, CFlatSprinkle, RandomAntichain, Distributions, IOUtil, MonteCarlo, PUGH, RNGs. Edge types: Interface Inheritance, Function Requirement, Direct Thorn Dependency, Shared Variable Dependency, Capability Requirement.]
Fig. 11. Dependency graph for a sample computation in Causal Set Quantum Gravity. The computation is described in detail in [4].
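As a concrete picture of the objects these thorns exchange, the following Python sketch builds a small causal set and picks an antichain from it. It is an illustration under stated assumptions rather than the toolkit's implementation: the function names are hypothetical, the points are placed in a unit square of 1+1 dimensional flat spacetime with c = 1, and a fixed number of points is used where a true sprinkling would draw the count from a Poisson distribution. Two elements are related when one lies strictly inside the other's future light cone, and an antichain is a set of mutually unrelated elements.

```python
import random


def sprinkle(n, rng):
    """Place n points uniformly at random in the unit square of (t, x)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def precedes(a, b):
    """True if b lies strictly inside the future light cone of a (c = 1)."""
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    return dt > abs(dx)


def related(a, b):
    """True if a and b are comparable in the causal order."""
    return precedes(a, b) or precedes(b, a)


def random_antichain(points, rng):
    """Greedily build an antichain by visiting elements in random order.

    An element is kept only if it is unrelated to every element kept so far,
    so the result cannot be extended by any remaining element.
    """
    order = list(range(len(points)))
    rng.shuffle(order)
    chosen = []
    for i in order:
        if all(not related(points[i], points[j]) for j in chosen):
            chosen.append(i)
    return chosen


rng = random.Random(2024)          # one seeded generator for reproducibility
points = sprinkle(100, rng)
antichain = random_antichain(points, rng)
```

The greedy selection is one simple way to obtain a random inextendible antichain; the RandomAntichain thorn may use a different procedure.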
C. Large Community Code: The Einstein Toolkit
The Einstein Toolkit [17] is an open, community developed software infrastructure for relativistic astrophysics. The Einstein Toolkit is a collection of software components and tools for simulating and analyzing general relativistic astrophysical systems that builds on numerous software efforts in the numerical relativity community. The Cactus Framework is used as the underlying computational infrastructure providing large-scale parallelization, general computational components, and a model for collaborative, portable code development. The toolkit includes modules to build complete codes for simulating black hole spacetimes as well as systems governed by relativistic hydrodynamics. Current development in the consortium is targeted at providing additional infrastructure for general relativistic magnetohydrodynamics.

The Einstein Toolkit uses a distributed software model and its different modules are developed, distributed, and supported either by the core team of Einstein Toolkit Maintainers, or by individual groups. When modules are provided by external groups, the Einstein Toolkit Maintainers provide quality control for modules for inclusion in the toolkit and help coordinate support.

With such a large set of components and a distributed team of developers, implementing appropriate standards is crucial to maintain coherence across the code base, and to enable future development. This is achieved in some part by defining base thorns that act to define application specific standards, providing default variables, parameters, functions and schedule bins that are common across an application. For example, in the Einstein Toolkit application specific base thorns include ADMBase (for the vacuum spacetimes), HydroBase (for matter spacetimes) and EOSBase (for equations of state) [6].

Figure 12 shows the complete dependency graph for the Einstein Toolkit, which is so extensive that it isn’t possible to examine in detail in print; however, we include the graph here to illustrate its complexity. Of the 135 thorns, 9 have no dependency on other thorns, and 78 thorns (including these independent thorns) are needed as the starting point to generate the whole toolkit using CCL dependency information. The clusters of dependencies for ADMBase, HydroBase and EOSBase are apparent in the diagram.

The Einstein Toolkit dependency diagram also shows a number of direct thorn dependencies, indicated by the black dotted lines. This means that thorns depend not on an interface but on a specific thorn. In some cases this is due to missing general interfaces such as appropriate aliased functions which either need to be carefully designed or perhaps have simply not been added where they should have been. A large number of these direct dependencies are associated with the Carpet adaptive mesh refinement set of thorns, where the nature of the driver thorn typically enforces a direct dependency, for example for associated I/O or reduction operations. The need to support direct dependency on thorns was one reason why the configuration.ccl file was introduced as an extension to the original CCL.

Figure 13 shows an example of the direct dependencies for an initial data thorn in the Einstein Toolkit. The thorn IDAnalyticBH provides initial data for several different black hole spacetimes with analytic solutions. Starting from this thorn, only seven other thorns are picked up directly with dependency information. Given that most production runs for numerical relativity simulations include of order 100 thorns, it is clear that automatically generating appropriate thorn lists will require additional metadata and physics insight.

V. FUTURE WORK
The original Cactus Configuration Language was released as part of the Cactus 4.0b distribution in 1999 and has since that time been extended in different ways as new features were required. Despite serving the Cactus user community well since this time, it is clearly time to reexamine the requirements for the CCL in the light of current and future needs and to take into account new technologies and possibilities. In this section we describe new features required in the CCL and their motivation.

[Footnote: Note that if viewing this paper as a PDF document it is possible to zoom in to see features in detail.]

[Figure 12 graph of the 135 Einstein Toolkit thorns. Edge types: Interface Inheritance, Function Requirement, Direct Thorn Dependency, Shared Variable Dependency, Capability Requirement.]
Fig. 12. Complete dependency graph for the Einstein Toolkit.

[Figure 13 graph. Thorns shown: CartGrid3D, Carpet, IDAnalyticBH, ADMBase, StaticConformal, IOUtil, InitBase, CoordBase.]
Fig. 13. Dependency graph for the Einstein Toolkit starting from the IDAnalyticBH thorn. For this graph the thorn Carpet was chosen to provide the driver interface; however, PUGH could have been used instead.

Cactus (and the set of thorns in the Cactus Computational Toolkit) currently best supports finite difference, finite volume, or finite element methods implemented on structured grids. Extensions to the CCL are required to support meshless methods (e.g. particle methods such as smoothed particle hydrodynamics or particle-in-cell, used for example in many astrophysics codes) and unstructured meshes where additional connectivity information is required to specify how grid points are connected (e.g. unstructured grids are important in coastal modeling to resolve the fine details of the coastline). Implementing both these features in Cactus requires developing appropriate parallel driver and associated infrastructure thorns in addition to changes to the CCL.

Cactus currently operates with a single computational grid so that all physical models need to run on a single domain. Comprehensive multiphysics support is needed where different physical models can be configured and run on different domains, for example for coupling together wind and current models in coastal science, or modeling different physical components of a relativistic star.

Constants (e.g. π or the solar mass) are commonly used in scientific codes. Currently in Cactus constants are handled via include files; for example, the Einstein Toolkit contains a thorn which provides commonly used astrophysical constants in an include file. These constants are then only available in source code and not in CCL files.
A preferable approach would be to define such constants directly as part of the CCL specification.

Similar to constants, the CCL needs to support enumerations and user-defined structures, so that e.g. a hydrodynamical state vector consisting of density, velocity, and temperature can be handled as a combined entity instead of as a set of five separate variables. This should include the ability to handle vectors and tensors in a natural manner, a feature that is missing in many computer languages, but which is nevertheless important in physics simulations. Tensor support would need to include support for symmetries (so that e.g. only 6 out of 9 components of the stress tensor are stored). In implementing this, it is important that the abstract specification of data types is decoupled from the decision of how to lay them out in memory, which needs to be left to the driver to ensure the highest possible performance on modern architectures that may offer vectorization and deep cache hierarchies.

While Cactus, through the CCL, contains information on how thorns fit together computationally, the CCL does not contain information on the scientific content of the thorns. This issue needs some attention as the number of thorns in particular domains grows and models become more complex. Options to handle this could include extending the CCL, adding descriptive metadata separate from the CCL, or investigating whether enough information can be provided from the CCL and base thorns for a particular application. Such additional information is important, for example, to be able to automatically construct appropriate thornlists for a particular physical model.

A further issue related to the growth in both the number of thorns and the complexity of applications is constructing and editing CCL files.
CCL files for some thorns are now very long and complex and difficult to read and comprehend. This issue could be addressed by restructuring the CCL itself or by providing intuitive and flexible higher level tools for interpreting, checking and editing files.

A final consideration is the syntax for the CCL. Changing the CCL syntax could improve the ease with which the files could be constructed and edited, and importantly provide more options for standard tools which could be used to construct, investigate, debug and edit the CCL files. As an example, using a standardized syntax for CCL would allow users to take advantage of the extensive features of the Eclipse Platform [5]. Eclipse is an advanced Integrated Development Environment (IDE) that includes features such as customizable syntax highlighting, auto-completion of code, and dynamic syntax checking for languages it recognizes. One option for revising the CCL syntax would be to use an existing data markup language that incorporates metadata such as the Resource Description Framework (RDF) [14]. RDF is a widely used standard for describing data in internet tools. It uses URIs to describe the relationship between two objects as well as the two ends of the link, which is commonly known as a triple. This would be a natural method for describing the dependencies between thorns; however, RDF is generally used as an extension of XML, which is not easily readable by humans. As the CCL files must be generated by hand, it would be preferable to use an alternate format that focuses on readability. One such example is YAML (YAML Ain’t Markup Language) [11], a data serialization language with a strong emphasis on human readability. YAML represents data as a series of sequences and mappings, both of which can be nested within others. While YAML does not inherently support metadata, it would be quite simple to add metadata to the thorns by adding extra mappings to the CCL files.

VI. CONCLUSION
We have presented an overview of the Cactus Configuration Language (CCL) that describes Cactus thorns and have shown how the CCL is used in three different applications. The dependency information included in the CCL specification can be used to identify potential issues in designing complex codebases, and to build high-level tools to better assist users in constructing codes for particular applications.

New features needed in the CCL specification have been identified, including support for more numerical methods, multiple physical models, user-defined structures, and scientific metadata, as well as ways to address the growing complexity of interfaces.

[Footnote on Integrated Development Environment: http://en.wikipedia.org/wiki/Integrated_development_environment]

ACKNOWLEDGMENT
The development of Cactus and the CCL has been a long term and ongoing effort with many contributors and funders. In particular we acknowledge the contributions of Gerd Lanfermann, Joan Massó, Thomas Radke, and John Shalf, and funding from the National Science Foundation, Max-Planck-Gesellschaft, and Louisiana State University. We also acknowledge colleagues in the Einstein Toolkit Consortium whose thorns provide the motivation and core use case for this work. Work on thorn dependencies was funded by NSF
REFERENCES

[2] Chomp, http://chomp.rutgers.edu.
[3] T. Goodale, G. Allen, G. Lanfermann, J. Massó, T. Radke, E. Seidel, and J. Shalf, The Cactus framework and toolkit: Design and applications, Vector and Parallel Processing – VECPAR’2002, 5th International Conference, Lecture Notes in Computer Science (Berlin), Springer, 2003.
[4] Seth Major, David Rideout, and Sumati Surya, Stable homology as an indicator of manifoldlikeness in causal set theory, Class. Quant. Grav. [...]
[6] [...] Multi-physics coupling of Einstein and hydrodynamics evolution: a case study of the Einstein Toolkit, CBHPC ’08: Proceedings of the 2008 compFrame/HPC-GECO workshop on Component based high performance (New York, NY, USA), ACM, 2008, pp. 1–9.
[7] Erik Schnetter, Peter Diener, Nils Dorband, and Manuel Tiglio, A multi-block infrastructure for three-dimensional time-dependent numerical relativity, Class. Quantum Grav. (2006), S553–S578, eprint gr-qc/0602104, URL http://stacks.iop.org/CQG/23/S553.
[8] Erik Schnetter, Scott H. Hawley, and Ian Hawke, Evolutions in 3D numerical relativity using fixed mesh refinement, Class. Quantum Grav. (2004), no. 6, 1465–1488, eprint gr-qc/0310042.
[9] Eric L. Seidel, Gabrielle Allen, Steven Brandt, Frank Löffler, and Erik Schnetter, Simplifying complex software assembly: the component retrieval language and implementation [...]
[...] Causal sets: Discrete gravity