A Simple Way to Distribute Mathematica Evaluations

MPP–2009–019
arXiv:yymm.nnnn [hep-ph]

M. Bruhnke (a), T. Hahn (b)
(a) Universität Würzburg, Am Hubland, D–97074 Würzburg, Germany
(b) Max-Planck-Institut für Physik, Föhringer Ring 6, D–80805 Munich, Germany

February 11, 2009
Abstract
We present a simple package for distributing evaluations of a Mathematica function for many arguments on a cluster of computers. After setting up the hosts, the only change is to replace Map[f, points] by MapCore[f, points].

With the fairly recent arrival of low-cost multi-core CPUs, institutes often have significant computing power at their disposal. Mathematica 7, whose main motto is parallel computing, makes it relatively simple to send a calculation to the fellow cores on the same machine, though it is still not exactly straightforward to distribute a calculation on a larger cluster. The package we present in the following fills this gap. After a one-time setup of the cluster, it allows one to distribute calculations easily to as many hosts as there are Mathematica licenses available (both ordinary licenses and Mathematica 7's sublicenses).

We certainly do not propose to parallelize 'atomic' Mathematica operations, like Simplify, which is a daunting task even at the conceptual level. Rather, we focus on lengthy evaluations of one function over many arguments, for example the evaluation of a cross-section for many points in phase and/or parameter space. Incidentally, our package is not restricted to numerical evaluations, but can handle any kind of Mathematica expressions.

Many physicists would argue that at least numerical evaluations of a certain volume should be done in a compiled language for performance reasons. This is at best partially true, as Mathematica has a formidable arsenal of functions, e.g. for numerical analysis, which are not easily available elsewhere, and it is the choice of algorithm that influences the computation time much more than the speed of a single evaluation. Furthermore, in conjunction with MathLink, e.g. through FormCalc's Mathematica interface [1], the execution speed is essentially that of a compiled language and Mathematica's part is 'governing' the calculation.

The package we present in this paper is remarkably short and contains one main function, MapCore, which substitutes Map in serial calculations. Sect. 2.1 describes usage of the package, Sect. 3 provides a function reference, and Sect. 5 describes installation and system setup.
The MultiCore package is loaded with
    << MultiCore`
The next step is to add cores∗ on which evaluations can be distributed. This can be done directly with e.g.
    AddCore["pc123.mppmu.mpg.de"]
or, if login under a different user name is required,
    AddCore["[email protected]"]
This explicit method becomes cumbersome, however, if many cores with varying loads are involved. The alternate invocation
    AddCore[10]
takes up to ten of the currently 'free' cores. This information is supplied by the findcores shell script (part of the MultiCore package), which in turn reads the admissible cores from a .submitrc file and invokes ruptime to determine the load. The .submitrc file has the simple syntax
    pc380 4
    pc381 4
    pc339b 2
    pc472
[…] rwhod daemon, since then its load will be reported through ruptime and findcores will use only the free cores.

∗ A note on nomenclature: we refer to a 'core' as the fundamental computation unit, i.e. a processor able to run a single thread. A physical CPU may have several cores and similarly a host may have several CPUs.

In the case of a Linux cluster, the .submitrc file can be generated (more or less) automatically with the help of the setupcores script, as in:
    ./setupcores > $HOME/.submitrc
This script assumes that the hosts are listed via ruptime, that a password-free login via ssh is possible, and that each host is running a flavour of Linux where /proc/cpuinfo can meaningfully be read out. The file generated in this way constitutes a 'raw' version and should be reviewed by hand.

Each core launched requires a Mathematica license, i.e. a Kernel license. From Mathematica 7 on, each (main) license includes four sublicenses and it is possible to use these sublicenses for parallelization (cf. Sect. 3.11, $SublicenseFactor).

One can further take care not to invoke more slave processes than there are licenses available. To this end AddCore is invoked with an integer n ≤ 0, meaning that it should spawn at most so many slaves that |n| (main) licenses are left for other users. One can also provide a second integer argument m ≤ 0 to leave |m| sublicenses unused. This mode really makes sense only for network licenses. For non-network licenses, AddCore silently assumes that the other machines listed in .submitrc have similar licenses.

MultiCore generally works in a master–slave model, requiring one license (but hardly any CPU time) for the master and one main license or sublicense for each slave. We assume that all cores in the cluster run the same Mathematica version, i.e. that the master's version number is the same as all slaves'. In particular we assume that subkernels on slave cores can be launched if and only if the master is running Mathematica 7.

Quitting the master's Mathematica Kernel automatically closes all links, so explicitly 'removing' registered cores is usually not necessary unless one wants to free Mathematica licenses. Each slave session is characterized by an identifier of the form host[id], where host is the host name and id an integer link id. The syntax for RemoveCore is
    RemoveCore[host]
    RemoveCore[host[id]]
where both host and id may be a pattern. Thus, RemoveCore["pc123"] closes all slaves on host pc123 and RemoveCore[_] closes all current slave sessions.

Once the cores are registered, the only necessary substitution is to replace Map (/@) by MapCore to make multiple evaluations execute in parallel.
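A minimal sketch of a complete session may help to see how these steps fit together. It assumes that the MultiCore package is installed and that the hosts are reachable; the function f and the list of points are hypothetical stand-ins for a lengthy evaluation, and the host name is the example used above.

```mathematica
(* Sketch only: requires the MultiCore package and a configured cluster. *)
<< MultiCore`

(* register slaves: one explicitly named host, plus up to ten free cores *)
AddCore["pc123.mppmu.mpg.de"]
AddCore[10]

(* hypothetical stand-in for a lengthy evaluation *)
f[x_] := NIntegrate[Exp[-x t^2], {t, 0, 1}]
points = Range[0, 10, 0.1];

serial = Map[f, points];            (* evaluated locally, one point after another *)
distributed = MapCore[f, points];   (* the same evaluation spread over the registered cores *)

ListCore[_]     (* show the registered cores *)
RemoveCore[_]   (* close all slave sessions, e.g. to free the licenses *)
```

The final RemoveCore[_] is optional, since quitting the master's Kernel closes all links anyway.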
Important: The only slightly non-straightforward aspect is the remote definition of the function being evaluated. MapCore sends the definition of this function to the slave much as the Save function would save it in a file. This fails to work (for both MapCore and Save) if the function depends on a LinkObject in the master's session, i.e. if the function is or invokes a MathLink function. Even if the slave session has the same MathLink executable installed, it will in general not communicate via the syntactically same LinkObject.

To work around such cases, the AddCore function has an optional second argument. This argument is sent to the slave upon opening of the link as an initialization command. In our opinion the best procedure in the MathLink case mentioned above is not to install the MathLink executable in the master's session at all, to prevent sending any explicit LinkObject pointing to the master's installed MathLink executables, and instead to include the Install statement in the AddCore invocation, as in
    AddCore[0, Install["LoopTools"]]
Also, if the function has a very lengthy definition one might want to place it in a file and load that via the initialization command, e.g.
    AddCore[0, << myfunction.m]
Of course one would have to copy this file to each slave first if they do not have access to the master's filesystem. Note, however, that the slaves' working directory is the user's home directory, not the current working directory on the master. In other words, the file to be loaded must include a path unless it resides in the home directory anyway.
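As an illustration of the remark on paths, the initialization command could name the file with its full path. This is a sketch only: the path is hypothetical and has to exist on every slave, and Get["file"] is simply the full form of << file.

```mathematica
(* Sketch only: requires the MultiCore package; the path is hypothetical. *)
(* Get["file"] is what << file is parsed into. *)
AddCore[0, Get["/home/user/project/myfunction.m"]]
```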
MapCore tries to have the given points calculated as quickly as possible. Therefore it distributes more (fewer) than patchsize points to faster (slower) cores by evaluating its internal timing statistics. Once all points of the list are distributed, MapCore redistributes the unfinished points until the results for all points are available. It also automatically decreases the patchsize according to the remaining list size. Although due to the competition N − […] MapCore returns, the time until all slaves are again ready is negligible. The identifier $CallID helps MapCore to distinguish between new and old data of multiply distributed points.
Especially during long parallelized calculations of many CPU-time-expensive points, link error handling plays an important role. If the link to one host, i.e. one or more cores, is lost, MapCore redistributes the as yet uncalculated points to the remaining hosts, executes the equivalent RemoveCore call and prints a warning message. After MapCore has returned one might want to add the lost host again by re-invoking AddCore.

3 MultiCore package Function Reference
3.1 AddCore

AddCore adds (registers) cores, i.e. opens links to remote machines for subsequent distributed evaluation with MapCore. It is invoked in one of the following ways:
• AddCore[hostname] adds one core on hostname using a main license.
• AddCore[hostname, "subkernel"] adds one core on hostname using a sublicense.
• AddCore[n] (n > 0, integer) adds up to n cores using the findcores script (described below) using a ratio $SublicenseFactor : 1 of sublicenses to main licenses (cf. Sect. 3.11).
• AddCore[n] (n ≤ 0, integer) adds as many cores as there are main licenses using findcores, but leaves at least |n| main licenses for other users.
• AddCore[n, m] (n, m integer) same as above, with n for main licenses and m for sublicenses.

The last two invocations really make sense only for network licenses. For non-network licenses, it is silently assumed that the information taken from $LicenseProcesses and $MaxLicenseProcesses (in the master's session) holds also for the remote cores.

Each link corresponds to one core on a remote machine. It is hence permissible to add the same host more than once, to account for its number of cores. The links are identified, apart from the hostname, by a unique integer link id. This id is also sent to each slave process as $CoreID and can be used to e.g. construct unique filenames. Core additions are cumulative. Links are released either through explicit removal with RemoveCore or by quitting the master's Mathematica Kernel.

The findcores script is part of the MultiCore package. It needs a .submitrc file in which the admissible cores for distributed computing are listed. Each line has the syntax
    hostname [ […]
Comment lines starting with a […] are allowed. Cores are processed in sequential order, i.e. the fastest machine should appear at the top of this list. The .submitrc file is searched for in the following order:
• ./.submitrc,
• $HOME/.submitrc,
• (MultiCore installation directory)/submitrc,
• /usr/local/share/submitrc.

findcores invokes ruptime to determine the load on a remote machine. This works only if the remote machine is running an rwhod daemon. If not, the load is assumed to be zero, i.e. all cores are taken.

3.2 RemoveCore

RemoveCore removes (unregisters) cores from the internal list, shuts down the corresponding remote kernels and closes the links. Each core is identified by two quantities, the hostname and the link id. Calling RemoveCore is usually not necessary, as quitting the master's Mathematica Kernel automatically closes all links.
• RemoveCore[hostname[id]] removes all cores matching hostname and id, where either may contain a pattern. For example, RemoveCore[_] removes all links, and RemoveCore["pc456"[_]] removes all links to pc456.
• RemoveCore[hostname] is equivalent to RemoveCore[hostname[_]].

3.3 ListCore

ListCore lists the currently registered cores.
• ListCore[hostname[id]] lists all cores matching hostname and id, where either may contain a pattern. ListCore[_] thus lists all cores.
• ListCore[hostname] is equivalent to ListCore[hostname[_]].

3.4 MapCore

MapCore is the main function of the MultiCore package. It substitutes Map in serial calculations.
• MapCore[f, points, patchsize] distributes the computation of f for all items in points to the cores previously registered with AddCore.
The integer argument patchsize is optional (default value: 5) and tells MapCore how many points on average should be sent to each core. As every set of results returned by a slave contains timing information, the master distributes points according to the slaves' performance. Until the master has gathered enough statistics about the slaves' timings it sends exactly patchsize points to each core.

The larger the computation time for a single point is, the smaller patchsize should be chosen. A smaller value may also be profitable if the participating cores have significant differences in speed. A patchsize of 1 achieves the best load-levelling but incurs the highest communication overhead. We have generally found the communication overhead to be negligible if the computation time for one patch is several seconds or more (see also the performance tests in Sect. 4).

3.5 RemoteMath

RemoteMath encodes the invocation of a remote Mathematica Kernel. It receives one argument and one flag, the hostname and the type of license which shall be used while launching the kernel. If required one can define different invocation strings for different hosts.
• RemoteMath[host, opt] := remotestring defines remotestring as the command for invoking a remote Mathematica Kernel on host. Options for the remote kernel are given in opt, which is presently restricted to -subkernel for launching a subkernel.
The default command is
    ssh (host) 'exec /bin/sh -lc \"test `uname -s` = Darwin && nice -19 MathKernel (opt) -mathlink \|| nice -19 math (opt) -mathlink"'
This is an ssh command which starts a remote login shell that executes, with nice 19, MathKernel on MacOS and math on other systems. Starting a login shell is important as it sources the shell's initialization files, which may modify the PATH.

If the Mathematica Kernel executable cannot be started using this command because it is not on the PATH, we recommend adding the appropriate directories to the PATH on the remote system rather than modifying the RemoteMath definition.
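For a cluster that is reached by something other than ssh, a redefinition might look as follows. This is a sketch under stated assumptions, not the package's own definition: it assumes that host and opt arrive as strings, that rsh access is available, and that math is on the remote PATH; the host name pc472 is taken from the .submitrc example and used here purely for illustration.

```mathematica
(* Sketch only: assumes host and opt are passed as strings. *)
(* To be evaluated in the master session after loading MultiCore and
   before any AddCore invocation. *)

(* general rule: use rsh instead of ssh *)
RemoteMath[host_String, opt_String] :=
  "rsh " <> host <> " nice -19 math " <> opt <> " -mathlink"

(* a more specific rule for a single host takes precedence over the general one *)
RemoteMath["pc472", opt_String] :=
  "rsh pc472 nice -19 MathKernel " <> opt <> " -mathlink"
```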
3.6 RemoteMap

With RemoteMap one can specify a mapping function which shall be applied on all remote hosts, i.e. slave sessions, to the point patches they receive from the master. Its default
    RemoteMap[f_, points_] := Map[f, points]
is the usual Map function. This may be overwritten with an individual function which must have the same argument structure as Map[f, points]. This feature could for example be used to leave a part of the parallelization to Mathematica 7 using the ParallelMap function. In that case one would of course set the number of cores in .submitrc to 1 for all hosts.

3.7 $FindCores

$FindCores contains the full path to the findcores script, including (if necessary) any options. The full syntax of findcores is:
    findcores [-f rcfile] [-h ruptimehost]
rcfile specifies the explicit location of the submitrc file (see Sect. 3.1) and ruptimehost specifies the host on which to invoke ruptime to find out the load of the machines listed in the submitrc file. The latter is necessary if running the master process on a machine not connected to the cluster, e.g. a laptop.

Note: changing $FindCores modifies subsequent invocations of AddCore only, i.e. links once established are not changed by a different value of $FindCores.

3.8 $MsgLevel

$MsgLevel specifies how verbosely the master–slave communication is reported on screen.
• $MsgLevel = n sets the message level to n.
The default message level is 1, which just reports the adding and removing of cores as well as link failures.

3.9 $CoreID

$CoreID is a unique identifier for each slave session.
• $CoreID (in the master's session) is the id of the last slave session spawned. This number should not be tampered with.
• $CoreID (in the slave's session) is a unique identifier of the session.

3.10 $CallID

$CallID is available in both the master and slave session. In the master session it counts the total number of calls to MapCore. In the slave session it identifies the particular call to MapCore which invoked the last computation on this slave. Note that the two values need not be equal (see Sect. 2.2).
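The global parameters $FindCores and $MsgLevel described above could be set in the master session along the following lines. The sketch assumes that $FindCores holds the findcores command line as a string; the host name gateway and the message level 2 are hypothetical values chosen for illustration.

```mathematica
(* Sketch only: requires the MultiCore package; all values are illustrative. *)

(* Let findcores run ruptime on a cluster host, e.g. when the master is a
   laptop. Assumes $FindCores is a string; "gateway" is a hypothetical host. *)
$FindCores = $FindCores <> " -h gateway";

(* a level above the default 1, assumed here to give more verbose reports *)
$MsgLevel = 2;

(* only AddCore calls made from now on see the new $FindCores *)
AddCore[10]
```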
3.11 $SublicenseFactor

The integer $SublicenseFactor is a global parameter in the master session which is set to 4 if the Mathematica version is 7 or above, and 0 otherwise. Only AddCore[n] with n > […] $SublicenseFactor manually only makes sense if one uses Mathematica 7 and wants to optimize it to the mean ratio of unused sublicenses to unused main licenses, which might be greater than 4 in some cluster networks.

3.12 $ListPositions

$ListPositions is available in the slave session only. This list contains the positions of the points in the original list which are to be evaluated by the slave.

Both $CallID and $ListPositions can e.g. be used to construct unique filenames. For example, if a single evaluation is very costly in CPU time, one may want to store each result immediately after computation. This could be solved through a wrapper function
    RemoteMap[f_, points_] :=
      MapThread[store[f], {points, $ListPositions}]

    store[f_, dir_:"results"][x_, i_] :=
      Block[ {file = ToFileName[dir, ToString[i]]},
        If[ FileType[file] === File,
          Get[file],
        (* else *)
          If[ FileType[dir] === None, CreateDirectory[dir] ];
          (* the source breaks off after "(Put["; the rest of this line is an assumed completion *)
          (Put[#, file]; #)&[ f[x] ]
        ] ]
Results for each point would be stored in results/n, where n is each point's index in the original list. In addition to $ListPositions one could use $CallID to generate unique filenames over multiple invocations of MapCore in the same master session.
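The last remark can be sketched as a variant of the wrapper whose file names also contain $CallID. The name storeCall and the naming scheme call-index are assumptions made for this illustration; in a slave session $CallID and $ListPositions are set by MultiCore.

```mathematica
(* Sketch only: storeCall is a hypothetical name, not part of MultiCore. *)
storeCall[f_, dir_:"results"][x_, i_] :=
  Block[ {file = ToFileName[dir, ToString[$CallID] <> "-" <> ToString[i]]},
    (* create the output directory on first use *)
    If[ FileType[dir] === None, CreateDirectory[dir] ];
    (* evaluate f[x], write the result to file, and return it *)
    (Put[#, file]; #)&[ f[x] ]
  ]

RemoteMap[f_, points_] :=
  MapThread[storeCall[f], {points, $ListPositions}]
```

With this naming, results of the third MapCore call would end up in files such as results/3-17, so that later calls do not overwrite earlier ones. As the slaves' working directory is the user's home directory, the results directory is created there unless dir contains a full path.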
We tested the performance and scalability properties of MultiCore on both a homogeneous and an inhomogeneous cluster of 25 cores for different evaluation times per point (tpp) and different patchsizes. As a testing function we used a simple pause directive
    f[p_][x_] := (Pause[p]; x)
and mapped it over 10000 resp. 1000 arbitrary points for different numbers of cores, ranging from 0 (local evaluation) and 1 (slave) to 25 (slaves), and pausing times p = […]

[…] T depends on the number and performance of the added cores:
$$\frac{1}{T} = \frac{1}{n} \sum_{i=1}^{N} \frac{1}{\mathrm{tpp}_i}$$
with tpp_i being the tpp of core i, N the number of cores, and n the total number of points. The three plots on the right-hand side of Figure 1 show the test results for different tpp's (of the fastest core) and for different patchsizes. Again, the patchsize is not a crucial parameter. As before, deviations occur for the small tpp = 0.01 sec. The scaling behaviour for large numbers of cores seems to be at best satisfactory, since MultiCore's parallelization takes about twice as long as the ideal case predicts. But if one compares the total timings of 25 unequal cores to the corresponding timings on the left-hand side, one sees that it takes only about 10 cores from the homogeneous cluster to do the same job. One should therefore weigh the performance gain before adding much slower cores to one's cluster.

The MultiCore package is available from […]. Installation is as simple as unpacking the tar file. MultiCore requires Mathematica version 5 or above (version 7 preferred).

To be able to load MultiCore regardless of the current directory, the MultiCore installation directory has to be added to Mathematica's $Path, for example by placing a statement like
    PrependTo[$Path, "/my/path/to/MultiCore"]
in prefdir/Kernel/init.m, where prefdir is one of
• /usr/share/Mathematica (system-wide, Linux),
• $HOME/.Mathematica (user-specific, Linux),
• /Library/Mathematica (system-wide, MacOS),
• $HOME/Library/Mathematica (user-specific, MacOS),
• $ALLUSERSPROFILE/Application Data/Mathematica (system-wide, Cygwin),
• $USERPROFILE/Application Data/Mathematica (user-specific, Cygwin).

[Figure 1: six plots of s/time versus # cores, each for one tpp and one # points, with curves for several patchsizes and for the ideal case.]

Figure 1: Reciprocal total timings as a function of the number of cores for different evaluation times per point (tpp), different numbers of points (see heading of the corresponding plot) and patchsizes. The left column shows the results for the homogeneous cluster (tpp_i = tpp = const). The right column shows those for the inhomogeneous cluster, i.e. tpp_i = tpp (1 + 3 i − […]) for i = 1, …, 25.

The package has been tested under Linux, MacOS, and Windows/Cygwin, both as master and as slave. The communication with remote Mathematica Kernels requires attention to a few details that may not be obvious:
• An sshd daemon must be running on the remote machine and access must not be restricted by a firewall. On Cygwin one has to start sshd once with "net start sshd" (as Administrator) and on MacOS one has to open the ssh port in the firewall (System Preferences – Sharing – Remote Login).
• ssh access to remote machines must be possible without password authentication. This requires that a key pair is generated with ssh-keygen and its public part (typically $HOME/.ssh/id_rsa.pub) is appended to $HOME/.ssh/authorized_keys on the remote machine.
• If remote access other than by ssh is required, one needs to redefine the RemoteMath function, which encodes the command string used to execute remote Mathematica Kernels (see Sect. 3.5). This can either be done in the master session before any AddCore invocations, or once and for all in
MultiCore.m.

The MultiCore package provides a simple mechanism to distribute (parallelize) evaluations of a single function over many points. After setting up the cores participating in the calculation with AddCore, the single replacement of Map by MapCore suffices to distribute the calculation. MapCore is not limited to numerical evaluations, but can handle any type of Mathematica expression.

From Mathematica 7 on, parallelization on several cores of a single host is a built-in functionality. Distributing calculations over more than one host is not straightforward, however, but can be done with the same ease using the MultiCore package.

The package is open source and is licensed under the GPL. It can be downloaded from […] and runs on Mathematica versions 5 and up (version 7 recommended).
Acknowledgements
We thank A. Hoang for playing our guinea pig in the beta stage and apologize to the MPIusers for using up too many Mathematica licenses during testing.
References

[1] T. Hahn, Comp. Phys. Commun. 178