Assessing the Hierarchical Hamiltonian Splitting Integrator for Collisionless N-body Simulations
G. Aguilar-Argüello, O. Valenzuela, J. C. Clemente, H. Velázquez, J. A. Trelles
MMNRAS , 000–000 (0000) Preprint 15 September 2020 Compiled using MNRAS L A TEX style file v3.0
Assessing the Hierarchical Hamiltonian Splitting Integrator forCollissionless N -body Simulations Aguilar-Argüello, G. , (cid:63) Valenzuela, O. , Clemente, J. C. , Velázquez, H. , Trelles, J. A. Instituto de Astronomía, Universidad Nacional Autónoma de México, A.P. 70–264, 04510, México, CDMX, México
15 September 2020
ABSTRACT
The N -body problem has become one of the hottest topics in the fields of computational dy-namics and cosmology. The large dynamical range in some astrophysical problems led to theuse of adaptive time steps to integrate particle trajectories, however, the search of optimalstrategies is still challenging. We quantify the performance of the hierarchical time step in-tegrator Hamiltonian Splitting (HamSp) for collisionless multistep simulations. We comparewith the constant step Leap-Frog (LeapF) integrator and the adaptive one (AKDK). Addition-ally, we explore the impact of di ff erent time step assigning functions. There is a computationaloverhead in HamSp however there are two interesting advantages: choosing a convenient time-step function may compensate and even turn around the e ffi ciency compared with AKDK. Wetest both reversibility and time symmetry. The symmetrized nature of the HamSp integra-tion is able to provide time-reversible integration for medium time scales and overall deliverbetter energy conservation for long integration times, and the linear and angular momentumare preserved at machine precision. We address the impact of using di ff erent integrators inastrophysical systems. We found that in most situations both AKDK and HamSp are able tocorrectly simulate the problems. We conclude that HamSp is an attractive and competitivealternative to AKDK, with, in some cases, faster and with better energy and momentum con-servation. The use of recently discussed Bridge splitting techniques with HamSp may allowto reach considerably high e ffi ciency. Key words: methods: numerical – software: simulations – gravitation
Historically, fully self-consistent realistic astrophysical N -bodysimulations are a challenging problem (Aarseth 1971; Efstathiouet al. 1985; Stadel 2001; Springel et al. 2001; Klypin et al. 2017;Dehnen & Read 2011). On the purely gravitational case, direct N -body codes su ff er from the bottleneck of the amount of computa-tional work needed to accurately evolve the orbits of particles in thesystem on a particle-by-particle basis, commonly used for simulat-ing dense stellar environments or planetary systems. Such limita-tions triggered the development of sophisticated approximated hy-brid collisionelss methods, like TreePM / P3M / P3T codes (Xu 1995;Bode et al. 2000; Bagla 2002; Bode & Ostriker 2003) where theshort range component of the force is carried out by direct or treeforce solvers (Couchman 1991; Oshino et al. 2011; Habib et al.2013), and AMR methods are used to compute the long range com-ponent of the force (Villumsen 1989; Jessop et al. 1994; Kravtsovet al. 1997; Teyssier 2002), appealing a balance between accuracyand computational e ffi ciency. N -body simulations require both, afast way to calculate the accelerations and an accurate and e ffi cientintegration method to evolve particles in time. The second-order (cid:63) E-mail:[email protected]
Leap-Frog (LeapF) symplectic integrator is the most widely usedin collisionles N -body simulations. It is strictly symplectic when aglobal common and constant time-step is adopted, however, this isnot e ffi cient for problems with a large dynamical range which arestudied with modern hybrid N -body codes. In the quest for improv-ing e ffi ciency, adopting multiple or adaptive time steps is necessary,that implies to use a modified version of the Leap-Frog scheme orsome alternative. Several proposals have been presented in order toeither improve the LeapF integrator or to provide an adaptive-stepsymplectic or symmetric integrator (see Dehnen (2017) for a reviewof the current state of such e ff orts). Notably Hut et al. (1995) pro-posed an iterative time-step selection with a block time-step schemefor astrophysical direct integration problems, that requires an extracalculation and memory overhead. Quinn et al. (1997) suggested atime-step selection in the LeapF (KDK) integrator for massive N -body simulations, however, they pointed out that particle migrationacross time-step blocks may ruin the symmetry and occasionallyimplies a backwards integration, which is hard to reconcile withcollisional components like gas. Several other schemes forward,implicit, continuous, have been studied and many are reviewed byDehnen (2017). The situation is that there is not a general solu-tion, many of the proposed integrators truly restore the symme-try, however the specific selection of time step precludes the sym- © a r X i v : . [ a s t r o - ph . C O ] S e p Aguilar-Argüello et. al metrization in some conditions or alternatively the computationaloverhead makes impractical the proposal. The specific case of cur-rent collisionless simulations codes commonly use the KDK Leap-Frog implementation (Klypin et al. 2017) with adaptive time-stepsthat we call hereafter AKDK. In this paper we explore and quan-tify the Hierarchical Hamiltonian Splitting (HamSp) strategy pro-posed by Pelupessy et al. (2012) which is, as the LeapF, a secondorder scheme, mostly tested for small number of bodies or colli-sional simulations. We extend the discussion for collisionless sim-ulations and for di ff erent step selection functions. We also comparewith both the LeapF and the adaptive version AKDK. HamSp o ff ersan interesting compromise between accuracy in energy, momentumconservation and e ffi ciency.In section 2 we briefly introduce the integrators that we com-pare, and in section 3 the time step selection functions that we use.In section 4 we briefly describe the implementation of the integra-tors. In section 5 we show accuracy tests with the two-body andisolated halo problems and a minor merger case. Throw sections6, 7, 8 and 9, we quantify reversibility and time symmetry, the ef-fect of time step functions, performance and the stability on longintegrations and across realizations. Finally section 10 presents ourconclusions. As we have already discussed, large dynamical range problems re-quire multi or adaptive time stepping. Symplectic integrators al-though allowing accurate and e ffi cient integration, mostly requirea global constant time step. Some adaptive extensions have beenproposed however, the computational overhead may constrain theiruse (Farr & Bertschinger 2007). It is common to express integra-tors as a composition of operators using the Hamiltonian splittingtechnique in potential ( Drift ) and kinetic energy (
Kick ), althoughthere are other possibilities (Oshino et al. 2011).We implemented three di ff erent integrators in a direct integra-tion code dubbed as NP splitt (Aguilar-Argüello et al. in preparation),the Leap-Frog (LeapF), Adaptive-KDK (AKDK) and the Hamilto-nian Splitting integrator (HamSp). Below, we describe each inte-grator. The Leap-Frog (LeapF) integrator is a second-order widely usedsymplectic integrator. Symplectic integrators are designed to nu-merically preserve the integrals of motion and the phase volume ofthe simulated system.In the Leap-Frog method, the evolution of the gravitationalsystem can be written as a sequence of
Kick (advance of velocities)and
Drift (advance of positions) operators (Klypin et al. 2009), de-fined by: K ( dt ) : v ( t n + dt ) = v ( t n ) + dt a ( t n ) D ( dt ) : x ( t n + dt ) = x ( t n ) + dt v ( t n ) (1)where x , v and a are the position, velocity and acceleration of aparticle, respectively, and dt is the time step. In this paper we usethe operator sequence called KDK Leap-Frog: KDK : K ( dt / D ( dt ) K ( dt /
2) (2)where we considered that the evolution is for one time step, i.e.from t n to t n + dt . Given the symplectic properties of the Leap-Frog integrator, andthe potential computational overhead of symplectic adaptive stepimplementations (Farr & Bertschinger 2007), contemporary codeshave extensively used KDK (eq. 2) combined with a block time-step scheme, frequently using rungs which are power of two: δ t r = δ t ( − r ) and di ff erent assigning step functions, most frequently anacceleration based one (Springel 2005). We will use as a referencesuch implementation. The hierarchical Hamiltonian Splitting (HamSp) method is asecond-order integrator that, unlike Leap-Frog, allows individualparticle time-stepping (Pelupessy et al. 2012) through recursivelysplitting the Hamiltonian. Although it accurately preserves linearand angular momentum and has a good energy conservation, thesymmetry properties may be lost for the N -body problem if parti-cles change time-step at di ff erent moments.This integrator consists in splitting the Hamiltonian adaptivelyand recursively based on the current time step, dt , assigned to theparticles such that the called Slow system (hereinafter S ) containsall the particles with a time step larger than dt , and the called Fast system (hereinafter F ) contains all the particles with a time stepsmaller than dt . So, the splitting is as follows: H S = P S + V SS + V S F H F = P F + V FF (3)where, P X ≡ (cid:88) i ∈ X p i m i V XX ≡ − (cid:88) i , j ∈ X , i < j Gm i m j (cid:12)(cid:12)(cid:12) r i − r j (cid:12)(cid:12)(cid:12) (4)are, respectively, the kinetic and potential energies, and X in (4)can be either S or F , hence V S F is the potential generated by S particles due to F particles. The previous splitting scheme is knownas HOLD (Pelupessy et al. 2012).The S system is solved using the DKD scheme (Quinn et al.1997; Springel 2005), which consists of drifts of the particles in thissystem (due to P S ) and kicks on the particles of both systems (due to V SS + V S F ). For the F system, the same procedure as for the originalsystem is applied but using a halved time-step. Hence, the splittingis applied recursively to the F systems with time-step dt / r . Therecursion ends when the system F (of the rung r ) has no particles.Both the Kick and
Drift operators are symplectic, however usingmultisteps or adaptive steps may not preserve such properties in ageneral way. Nevertheless, using a block step scheme do not nec-essarily breaks symplecticity as discussed in Farr & Bertschinger(2007), therefore we need to investigate HamSp behaviour.
Besides the formulation of integrators with individual time stepsbased on symmetric operators, the choice for each particle step ismade through the so called time step function. There is not a uniquechoice, arguably the most commonly used step function in contem-porary collisionless N -body codes is based on the acceleration as: MNRAS , 000–000 (0000) ssessing Hamiltonian Splitting τ i = η (cid:114) (cid:15) a i (5)where a i is the acceleration acting on the particle i , giving the codethe possibility of adapting to high / low accelerations. Improvementshave been recently discussed by using a dynamical time proxy(Zemp et al. 2007) and tidal force time scale (Grudi´c & Hopkins2020), specifically establishing a balance between short and longtime steps, which may translate into higher e ffi ciency. Extensivecomparisons of AKDK with both choices have been discussed inZemp et al. (2007) and Grudi´c & Hopkins (2020).In our study, in order to preserve the symplecticity of theHamSp integrator while allowing adaptive time-steps, followingPelupessy et al. (2012), we use a time-symmetrized time-step ex-trapolation criterion for each particle, this is a first order perturba-tive expansion of: τ ± sym ( t ) = τ ( t ) / + τ ( t ± τ ± sym ) / τ + sym ( t ) ≈ τ ( t ) + d τ dt τ + sym (7)so that, the time-step we will use is given by, τ i = min j τ ij (cid:16) − d τ ij dt (cid:17) (8)It is important to state that the minimization indicated aboveis across the so called Slow particles. For a time-step proportionalto the inter-particle free-fall times: τ ij = η (cid:115) r ij m i + m j d τ ij dt = v ij · r ij r ij τ ij (9)The former options is a two-body based proxy for the dynam-ical time motivated step function suggested by Zemp et al. (2007).For a time-step proportional to the inter-particle fly-by times: τ ij = η r ij v ij d τ ij dt = v ij · r ij r ij τ ij + m i + m j v ij r ij (10)Such implementation is tuned to collisional problems, we willquantify the e ffi ciency of such time step functions, however thehigh acceleration derivatives may require going beyond the first or-der in the perturbative expansion.Across the paper we will use the symmetrized free-fall timestep for Hamiltonian Splitting (defined by eqs. 8 and 9) or alterna-tively the minimum of this and the symmetrized fly-by time step(eqs. 8 and 10). For AKDK we will use the standard step functiongiven by eq. 5. In order to make a comparison we implemented the di ff erent in-tegrators in a direct summation N -body code (Aguilar-Argüello etal. in preparation). It is not our purpose to extensively test the im-plementation in hybrid codes but to compare the integrators on thesame grounds, although it is important to say that several adaptive codes (but not all) use a particle-particle approach as the high reso-lution solver which is also the more time consuming, therefore webelieve our choice is general. Splitting strategies like the BRIDGEone discussed in Fujii et al. (2007); Oshino et al. (2011) or the hi-erarchical splitting (Jänes et al. 2014) show that our results can beextended and remain. Also choosing a direct integration code forthe implementation we preserve the symmetry in force calculation,which is not trivially guaranteed in hybrid codes (see however theFMM technique based codes as gyrfalcON Dehnen (2002), ABA-CUS Garrison et al. (2019), PKDGRAV3 Potter et al. (2017)).For the tests presented in this work, the Leap-Frog and itsadaptive version are entirely implemented on a single GPU, in or-der to avoid the bottleneck generated from CPU-GPU communi-cations. Implementing HamSp for the GPU have its subtleties, be-cause the recursive splitting is not easily handled inside the GPU,for that reason we decided to make the recursion (inherent of thealgorithm) in the CPU and the rest of calculation in the GPU. Formore details and discussion of the implementation and tests usingdi ff erent possibilities of GPU-CPU communication technology seeAguilar-Argüello et al. (in prep.). In this section, we present the results of the HamSp algorithm testsin terms of accuracy by simulating binaries, an isolated halo andsinking satellites, and compare them with the constant-step Leap-Frog and Adaptive-KDK.
The most basic but still very useful test for N -body codes is the two-body problem, because it posses an analytic solution. In order toexplore such a test we simulated binaries, either circular or eccen-tric ( e = .
9) ones for 2 × and 10 orbital periods, respectively,since Hernandez (2019) has showed that long integration times bet-ter constrains the integrator nature.In this set of tests we have no step rungs, therefore it is mostlyan adaptive time step test. For the tests, we decided to compare theintegrators by constraining them to an energy conservation limit.Because the step selection function for HamSp and AKDK are dif-ferent, we compensate it tuning the accuracy parameters.For the circular binary case, we constrained an energy conser-vation limit of 10 − and we used the accuracy parameters η HamS p = .
016 and η AKDK = .
037 for HamSp and AKDK, respectively. Leftpanel in figure 1 shows, from top to bottom, the fractional energyerror, the semi-major axis error, eccentricity error and position frac-tional error with respect to the analytical solution, for the circularbinary. All quantities show oscillations along the orbit integration,all but energy error show a linearly growing lower limit, relatedwith the sampling frequency. After 500 orbital periods, the energyerror starts to grow linearly for the AKDK (blue) solution whilefor the HamSp (black) occurs after 10 orbital periods. In contrastLeapF (red) only keeps oscillating. The other parameters similarlystart oscillating but the eccentricity start growing linearly at 10 or-bits, the residual regarding the analytic solution drifts linearly at10 periods. We observe that for the circular case, the energy con-servation in the LeapF and HamSp case is considerably better thanfor the AKDK after 500 orbital times, however all integrators areable to handle the problem.In the eccentric binary case ( e = . η HamS p = .
095 and η AKDK = .
35. Figure 2 shows
MNRAS000
MNRAS000 , 000–000 (0000)
Aguilar-Argüello et. al − − × − × − × − ∆ E / E LeapFaKDKHamSp − × − × − . × − a − a − − − − e − e − t / t orb − − − − ∆ R / R a n a l y t i c − . − . . . − . − . . . . y − . − . . . x − . − . . . . V x Figure 1.
Circular binary simulated over 2 × orbital periods using the three integrators: Leap-Frog (constant-step, red line), Hamiltonian Splitting (black),and Adaptive-KDK (blue). The left side show, from top to bottom, the energy error, change in semi-mayor axis, a , change in eccentricity, e , and position errorwhen comparing to the analytic solution, as a function of time. The right side show the XY plane (top) and the phase space (bottom) for the first 1000 orbits.LeapF and HamSp handle very accurate the evolution of the binary, while AKDK shows a systematic deviation from conservation after 10 orbital periods. Figure 2.
Same as figure 1, but for the eccentric binary ( e = .
9) simulated over 10 orbital periods. The XY plane (the two upper-right rows) and the phasespace (the two lower-right rows) are shown for the first 20 orbits (middle column) and for the last 20 orbits (periods) after 10 periods (right column). LeapFand HamSp handle with good accuracy the evolution of the binary, while AKDK shows a systematic deviation from conservation after 20 orbital periods andhas unstable orbits. MNRAS , 000–000 (0000) ssessing Hamiltonian Splitting − . − . . . . − . − . . . . t ∼ . × t orb LeapFAKDKHamSp0 < t t orb − . − . . . . x − . − . − . . . . . − . − . . . . − . − . . . . y t ∼ . × t orb − . − . . . . x − . − . − . . . . . V x Figure 3.
Projections of the phase space at di ff erent moments of the ec-centric binary of figure 2. Left panel shows one orbit period after evolvingthe binary for 9 . × periods, and right panels after 9 . × periods.We choose this moments to show that the phase space projections does notchange over integration for the LeapF and HamSp integrators, while forAKDK it is notorious the change. that when a 10 − energy conservation level is constrained, we no-tice that after 20 orbital times AKDK energy, semi-major axis andeccentricity (left panels) start to drift systematically. Near to 20 or-bital periods, all integrators agree. This is evident also from theorbit locus and from the phase space projection (right panels). Forsuch test, HamSp is 20 times faster than AKDK, suggesting that ismore accurate and stable but not as LeapF. HamSp starts to driftalso after 5000 orbital periods. The rightmost panel shows the or-bit locus (upper panel) and phase space projection (lower panel)at around 10 orbital periods, where we can clearly see that orbitsprecessed for LeapF and HamSp, but the semi-major axis is closelypreserved while for AKDK the semi-major axis shrank. The middlecolumn show the same phase space projection at 20 orbital times.Since the orbits precessed, it is not obvious to visually track driftor agreement with initial conditions of the phase space projections.For that reason, in figure 3 we selected specific moments when theplane of initial conditions and the orbit coincide. We can see (leftpanel) that HamSp besides preserving semi-major axis it preservesthe phase space projection, the right panel shows a case where bothLeapF and AKDK plane coincide with initial conditions. As ex-pected LeapF preserves the phase space locus but AKDK modifiedthe area and semi-major axis. We adopted as a reference model an equal particle mass, iso-lated halo following the NFW cuspy density profile predicted bycollisionless dark matter cosmological simulations (Navarro et al.1997). The large density range and the corresponding di ff erent dy-namical times makes it a suitable system for an adaptive time-stepcode. Such tests depend on resolution in order to actually capturethe benefit of individual time steps as compared with a global con-stant time step scheme. We will use as a reference time scale the t / t dyn ( r s )10 − − − − − | ∆ E / E | LeapF − NP splitt LeapF − gyrfalcONAKDK − NP splitt AKDK − gyrfalcONAKDK − GADGETHamSp − NP splitt Figure 4.
Comparison of implementations of LeapF (red), AKDK (blue)and HamSp (black) for our direct integration code (NP splitt , solid lines),gyrfalcON (dashed lines) and GADGET (dotted line). Overall all LeapFand AKDK implementations are consistent, the small di ff erences are relatedwith the lack of symmetry in force calculation. dynamical time at the NFW characteristic radius ( r s ), since it hasbeen used to study the stability of the halo in other works (e.g.Klypin et al. (2017)).For the integration of our fiducial model, we adopted G = M T = r s = splitt , Aguilar-Argüello et al. (inprep.)), and compared it with GADGET (Springel 2005) and gyr-falcON (Dehnen 2002), constraining the spatial resolution to 0 . η AKDK = . , η HamS p = . ff erence comes out of the approximated force calculation.The rest of the curves in the same figure show: our constant stepLeapF (solid red), our HamSp direct integration implementation(solid black) and Gyrfalcon constant time-step LeapF (dashed red).All of them are comparable and with better energy conservationthan AKDK, which is expected because symplecticity is broken.However, we should further investigate and quantify di ff erences inaccuracy and performance.Once we tested that our AKDK implementation is consistentwith well known ones, we simulated the fiducial halo, up to 40dynamical times, using the same accuracy parameters η = . A dynamical time, also called crossing time, is the time taken for a typ-ical particle to cross the system. In this paper, a dynamical time is definedas t dyn ( r ) = (cid:104) r / GM ( r ) (cid:105) / , where r and M ( r ) are the radius and mass,respectively.MNRAS , 000–000 (0000) Aguilar-Argüello et. al − − − − | ∆ E / E | − − − − | ∆ R c m | LeapFAKDK HamSpHamSp − sTSS t / t dyn ( r s ) − − − − − | ∆ P / P | t / t dyn ( r s ) − − − − − | ∆ L / L | Figure 5.
Error in conserved quantities for an isolated NFW halo with N = particles, simulated up to 40 dynamical times (at scale radius, r s ). Shown is theenergy error (upper left panel), change in center of mass position (upper right panel), linear momentum (lower left panel) and change in angular momentum(lower right panel) for the three integrators: Leap-Frog (red line), AKDK (blue line) and Hamiltonian Splitting original (black line) and saving the time-stepstructure (yellow line) versions. plummer softening (cid:15) = . . × − with the same rung number equal to 5. For AKDK weobtain | dE / E | max ∼ . × − and wall-clock time (wct) ∼ | dE / E | max ∼ . × − andwct ∼ | dE / E | max ∼ . × − but with a lower performancewct ∼ ffi cient but less accurate inenergy conservation. In order to investigate such di ff erences, it isimportant to mention that the time selection function is di ff erentfor each case, therefore the relative number of particles with smalland large steps is quite di ff erent, not to mention that AKDK timestep function has a dependence with softening and HamSp dependson inter-particle separation.Now we consider a more meaningful comparison, assuminginstead an energy conservation threshold of 10 − , that implies us-ing di ff erent accuracy η parameters for HamSp ( η HamS p = . η AKDK = . particles for 40 dy-namical times at the characteristic radius r s , t dyn . Figure 5 showsthe result of such tests. The upper left panel shows the energy er-ror. The LeapF integrator (red) fluctuates and HamSp (black) staysvery close during the first 20 t dyn , afterwards it shows a small driftsuggesting that may be approximately reversible, AKDK drifted al-most linearly and after 20 t dyn it slightly flattens. We decided to con-sider a complementary test, we delayed one time step the construc-tion and update of the time step hierarchy in HamSp, denoted byHamSp-sTSS, it represents a computational overhead with respectto AKDK, such an action results in important savings in computa-tion time. The result is represented by the yellow line. The accu-racy is as we can expect lower but it quite resembles the AKDKbehaviour.Regarding the conservation of other dynamical quantities like − − − − ρ LeapFAKDKHamSp t = 0 t = 5 t dyn ( r s ) t = 10 t dyn ( r s ) − . − . − . . . . r / r s )0 . . . . | ∆ ρ | / ρ Figure 6.
Density profile, at di ff erent times, of the isolated NFW halo with2 × particles simulated up to 10 dynamical times for the three integrators:Leap-Frog (solid lines), AKDK (dotted lines) and Hamiltonian Splitting(dashed lines). linear and angular momentum or the system barycenter the sit-uation is di ff erent (see figure 5). Linear and angular momentumare clearly preserved almost to machine precision by LeapF andHamSp integrators (red and black, bottom panels), whereas AKDK(blue) presents smaller accuracy although a slope until 10 t dyn , af-terwards it flattens. Interestingly HamSp with a delay in updatingstep hierarchy is almost indistinguishable from HamSp and LeapF.The upper right panel shows the accuracy in preserving the halocentroid. Once again LeapF and both HamSp versions accuratelykeep the centroid while AKDK accuracy degraded two orders ofmagnitude. Wall-clock time for LeapF experiment was wct ∼ ∼ ∼ MNRAS , 000–000 (0000) ssessing Hamiltonian Splitting − − − − − | ∆ E / E | × − × − × − | ∆ Z c m | LeapFAKDKHamSp t / t dyn − − − − | ∆ P / P | t / t dyn − − − − | ∆ L / L | Figure 7.
Error in conserved quantities for a NFW halo with N = particles plus a satellite simulated up to 10 dynamical times. Shown is the energy error(upper left panel), change in center of mass position (upper right panel), linear momentum (lower left panel) and change in angular momentum (lower rightpanel) for the three di ff erent integrators: Leap-Frog (red line), AKDK (blue line) and Hamiltonian Splitting (black line). Finally, we emphasize that all integrators accurately preserve thedensity profile as we can see in figure 6 with obvious dependenceon the number of particles, we decided to show the test with 2 mil-lion particles in order to minimize discreetness e ff ects. Satellite accretion onto larger galaxies is an astrophysical problemcommonly simulated by both isolated and cosmological N -bodysimulations (Miller et al. 2020; Arca-Sedda & Capuzzo-Dolcetta2016). We simulated, for 10 dynamical times and with di ff erentrealizations, a rigid softened satellite falling into a spherical systemrepresented by collisionless particles. The spherical system consistsof N = equal mass particles spatially distributed accordingto the NFW cuspy density profile (Navarro et al. 1997), and weadopted the same units as in our previous fiducial isolated halo caseand a softening parameter (cid:15) = .
026 (in model units). The satelliteinitial separation from the center of the spherical system is R sat = .
6, and its mass is m sat = .
01, which is ∼ ff ects start to play a role and energy has an increment.HamSp integrator starts to jump at one dynamical time, afterwardsit stays flat. AKDK shows a lower accuracy in energy conserva-tion and presents a systematic energy growth. As in the case ofthe isolated halo the energy drift slope is considerably flatter forthe HamSp. Linear and angular momentum are less accurate forthe AKDK integrator for several orders of magnitude, while sys-tem barycenter behaves essentially the same for the three methods.These results indicate that HamSp is an excellent alternative fordynamical friction studies.The evolution of the satellite radial position shows di ff erences . . . . . . R s a t / r s LeapFAKDKHamSp t / t dyn − − − | ∆ R s a t | / R s a t , L e a p F Figure 8.
Evolution of the satellite radial position (upper panel) for the sink-ing satellite test and for the three integrators: LeapF (red), AKDK (blue) andHamSp (black). When comparing with LeapF (lower panel), the di ff erencesare below 1% between integrators. below the one percent level (see figure 8). We conclude that for areduced number of dynamical times the three integrators can pro-vide an accurate description of the sinking process. Despite AKDK is an integrator widely used in N -body codes, itis well known that using adaptive time steps breaks the LeapFsymplecticity and therefore looses the related benefits of being acanonical mapping (Hernandez & Bertschinger 2018). One of the MNRAS000
Evolution of the satellite radial position (upper panel) for the sink-ing satellite test and for the three integrators: LeapF (red), AKDK (blue) andHamSp (black). When comparing with LeapF (lower panel), the di ff erencesare below 1% between integrators. below the one percent level (see figure 8). We conclude that for areduced number of dynamical times the three integrators can pro-vide an accurate description of the sinking process. Despite AKDK is an integrator widely used in N -body codes, itis well known that using adaptive time steps breaks the LeapFsymplecticity and therefore looses the related benefits of being acanonical mapping (Hernandez & Bertschinger 2018). One of the MNRAS000 , 000–000 (0000)
Aguilar-Argüello et. al . . . . . . R s a t / r s LeapF , FWLeapF , BW1LeapF , BW2LeapF , BW3AKDK , FWAKDK , BW1 AKDK , BW2AKDK , BW3HamSp , FWHamSp , BW1HamSp , BW2HamSp , BW3 t / t dyn − − − − | ∆ R s a t | / R s a t , F W Figure 9.
Reversibility test. For the sinking satellite experiment we reversedthe velocity sign at di ff erent moments, indicated by BW1, BW2 and BW3,and continue the integration. Upper panel shows the normalized satellitedistance as a function of time for the truly symplectic LeapF, the adaptiveAKDK and HamSp integrators. We can see that all integrators are somehowreversible during the studied period. Bottom panel shows the fractional dif-ference regarding the forward solution. It is clear that HamSp has a shorterdrift with respect LeapF. The di ff erence seems smaller when the reversibil-ity is made at larger times, likely because the di ff erence is now dominatedby cumulative integration errors. consequences of non being symplectic is loosing energy conserva-tion. A common strategy has been looking for procedures to restorethe reversibility or time symmetry (Dehnen 2017). The hierarchi-cal Hamiltonian Splitting proposed by Pelupessy et al. (2012) thatwe analyze in our study, chooses a perturbative time step functionthat restores the time symmetry (section 3). Recently Hernandez& Bertschinger (2018) analyzed time symmetry integrators com-monly used in astrophysics, they establish that for a global timestep symplecticity, time symmetry and reversibility are the same.They found that for an adaptive time step symplecticity, time sym-metry and reversibility are not identical, but they overlap. Someintegrators may be reversible but not time symmetric. Such di ff er-ence is critical when we talk about long term integration. Motivatedby the previous discussion we performed symmetry and reversibil-ity tests using both HamSp and AKDK integrators and we usedLeapF as a reference case. We chose the sinking satellite systemas the test bed because the satellite orbit allows easily to track thesystem response. At three di ff erent moments, termed BW1, BW2and BW3, we reversed velocity signs for all particles and for onecase instead of velocities we reversed time sign, after that we con-tinued the integration. Figure 9 shows the global result of forward(FW) and backward (BW) evolution (i.e. inverting the sign of ve-locities) for all integrators. Overall the di ff erences are quite smallat this level of energy conservation (10 − ), although somehow un-expected this is consistent with Hernandez & Bertschinger (2018),when they discuss that AKDK preserves quantities like angular andlinear momentum. There are still some di ff erences as we can see inthe lower panel, it is important to mention that at less accurate en-ergy conservation di ff erences may be large as it happens with thebinary tests. The fractional distance change with respect to its own − − − − − | ∆ E / E | LeapF
FWBW1BW2BW3 − − − − − | ∆ E / E | AKDK
FWBW1BW2BW3 t / t dyn − − − − − | ∆ E / E | HamSpHamSp − sTSSnSymHamSp FWBW1BW2BW3
Figure 10.
Fractional energy error before and after reversibility, for thesame test of figure 9. Constant time step LeapF (top panel) is truly re-versible. AKDK (middle panel) has an almost linear growth of energy, sug-gesting that it is not reversible. HamSp (black, bottom panel) is almost flatwith a small drift of energy after ∼ t dyn of the backward integrations, in-dicating that it is approximately reversible, whereas for the HamSp-sTSS(yellow) and nSymHamSp (green) versions the energy grows almost lin-early, hence both are not reversible. forward evolution (lower panel) is a good diagnostic of symmetry.Clearly, we can see that HamSp (black) is very close to the trulysymplectic LeapF (red) at least for such medium integration timesand AKDK (blue) is about two orders of magnitude away. We alsotracked the fractional energy change (figure 10). The first outstand-ing fact is the truly symmetrical behaviour of the constant time stepLeapF after the reversal (upper panel), almost every peak and valleyis reproduced. The second panel shows a roughly linear growth (inblue) for both forward integration and right after reversing veloci-ties, suggesting that AKDK is quite no reversible. HamSp is almostflat (black, bottom panel) similar to LeapF. We run a test wherewe change the time variable sign, the energy shows a systematicgrowth showing HamSp is not time symmetric. We performed sim-ilar tests for the fiducial isolated halo with similar results. The smallslope showed in the energy drift indicates that HamSp is approxi-mately reversible, therefore energy drift is quite small comparedto AKDK that is no reversible. Such property may be relevant toperformance, we will comeback to such point in a further section. MNRAS , 000–000 (0000) ssessing Hamiltonian Splitting HamSpHamSp − sTSSnSymHamSp FWBW1BW2BW3 t / t dyn . . . . . . R s a t / r s AKDKSymAKDKnSymAKDK FWBW1BW2BW3 t / t dyn − − − | ∆ R s a t | / R s a t , F W Figure 11.
As figure 9, but for the AKDK (blue, left column) and HamSp (black, right column) integrators and their corresponding modified versions:SymAKDK (grey, left), nSymAKDK (orange, left), HamSp-sTSS (yellow, right) and nSymHamSp (green, right). AKDK versions are in agreement. HamSpand its versions are more stable than any AKDK variant.
As it has already been discussed some hierarchical / adaptive timestep integrators like HamSp and AKDK, include a time step se-lection function, such a function may help restoring the integratorsymmetry. We may question if the symmetrized time step selectionfunction given by eq. 8 is useful for a particular simulation, do weloose all the convenient properties of HamSp observed until now?In order to quantify such e ff ect we evolved the fiducial isolatedhalo including a sinking rigid satellite with both the HamSp andAKDK integrators but changing the time step selection function.For HamSp integrator we removed the derivative term from eq. 8which represents the symmetrizing correction, we name a such testnSymHamSp. Additionally we exchanged the acceleration criterion(eq. 5) in AKDK and replaced it by the symmetryzed free-fall stepfunction (eqs. 8 and 9), we named this test SymAKDK, for com-pleteness purposes we considered the plain no-symmetrized free-fall step function in AKDK and we called it nSymAKDK. Fig-ure 10 shows the energy behaviour in our tests. Specifically fornSymHamSp the symmetry is now broken as expected, for both theforward and backwards integration due the energy grows system-atically, in a similar way to AKDK (middle panel) but the normal-ization is still a factor of three smaller. Although the drift is similarto the HamSp-sTSS, where the update of the time step hierarchy isdelayed one time step, the normalization is also smaller. Figure 11shows the evolution of the satellite radial positions for both inte-grators (HamSp and AKDK) and their corresponding modified ver-sions. Left upper panel shows the distance normalized to the char-acteristic radius, before 6 t dyn all AKDK versions are in agreement.Lower left panel corresponds to the fractional position residual withrespect to the forward integration case and, also, they are quiteclose. Right panels show the equivalent for HamSp. We can seethat there is not a big di ff erence between HamSp and nSymHamSp,even HampSp-sTSS is more stable than any of the AKDK variants.Our main conclusion is that HamSp preserves to some degree itsstability even if we do not use the symmetrized time step selec-tion function. We can speculate that the fact that the splitting of the Hamiltonian at each time step rung is ultimately expressed in termsof Kick and
Drift operators which are symplectic, helps to preservesome of the symmetry. Energy rise in the modified HamSp ver-sion is clearly no reversible but still its drift slope is smaller thanin any AKDK version. Situations where the HamSp may be com-bined with di ff erent time step functions therefore may be appealingto explore (e.g. Zhu (2017)). Although the adaptive nature of the HamSp and AKDK integra-tors may imply a higher e ffi ciency compared with the global stepLeapF, the benefit of taking adaptive steps is evident only whenthe dynamical range is large. In comparison with AKDK, HamSphas a computational overhead due to the recursive splitting of theHamiltonian needed to build the time step hierarchy.We decided to make a short exploration of the simulationparameters (accuracy, rung number, and minimum and maximumtime steps) and we present the results on figure 12, which showsa pragmatic diagnostic of the integrator performance: energy con-servation vs wall-clock time. The upper and middle panels corre-spond to the fiducial isolated cuspy halo, and the same halo witha rigid satellite, the lowest panel corresponds to the cuspy halo in-cluding four live cuspy satellites, a common situation in cosmo-logical simulations that requires a large dynamical range. In manycases HamSp outperforms AKDK. We may wonder what is the rea-son given the extra computations related with the splitting process.One possible suspect is the individual time step distribution. In or-der to investigate that we built the histogram of particles for initialconditions and at later times (see figure 13). For HamSp (solid cir-cle) the particle distribution in rungs almost does not change. It isclear that only a moderate fraction of particles are in the deepestrung. For AKDK the distribution is almost the opposite, there is apeak of particles in the two deepest rungs, which translates into alot more of time steps than HamSp. Tuning parameters for both in-tegrators may improve the situation. The di ff erence in performance MNRAS000
Drift operators which are symplectic, helps to preservesome of the symmetry. Energy rise in the modified HamSp ver-sion is clearly no reversible but still its drift slope is smaller thanin any AKDK version. Situations where the HamSp may be com-bined with di ff erent time step functions therefore may be appealingto explore (e.g. Zhu (2017)). Although the adaptive nature of the HamSp and AKDK integra-tors may imply a higher e ffi ciency compared with the global stepLeapF, the benefit of taking adaptive steps is evident only whenthe dynamical range is large. In comparison with AKDK, HamSphas a computational overhead due to the recursive splitting of theHamiltonian needed to build the time step hierarchy.We decided to make a short exploration of the simulationparameters (accuracy, rung number, and minimum and maximumtime steps) and we present the results on figure 12, which showsa pragmatic diagnostic of the integrator performance: energy con-servation vs wall-clock time. The upper and middle panels corre-spond to the fiducial isolated cuspy halo, and the same halo witha rigid satellite, the lowest panel corresponds to the cuspy halo in-cluding four live cuspy satellites, a common situation in cosmo-logical simulations that requires a large dynamical range. In manycases HamSp outperforms AKDK. We may wonder what is the rea-son given the extra computations related with the splitting process.One possible suspect is the individual time step distribution. In or-der to investigate that we built the histogram of particles for initialconditions and at later times (see figure 13). For HamSp (solid cir-cle) the particle distribution in rungs almost does not change. It isclear that only a moderate fraction of particles are in the deepestrung. For AKDK the distribution is almost the opposite, there is apeak of particles in the two deepest rungs, which translates into alot more of time steps than HamSp. Tuning parameters for both in-tegrators may improve the situation. The di ff erence in performance MNRAS000 , 000–000 (0000) Aguilar-Argüello et. al − − − − | ∆ E / E | m a x HamSpAKDK dt min , dt min , dt min , − − | ∆ E / E | m a x HamSpAKDKRungs = 3Rungs = 4Rungs = 5Rungs = 6 Wall − clock Time (s)10 − − − | ∆ E / E | m a x
10 min 30 min 1 hour
HamSpAKDK dt min , dt min , Figure 12.
Energy conservation as a function of wall-clock time for thefiducial isolated cuspy halo (upper and middle panels) and for a cuspy haloincluding four live cuspy satellites (bottom panel), for HamSp (solid circles)and AKDK (stars) integrators. In upper panel we used a plummer softening (cid:15) = .
007 and we varied the minimun time step ( dt min , = . × − , dt min , = . × − and dt min , = . × − ) but kipping fixed the numberof rungs (6 and 5 for HamSp and AKDK, respectively). In middle panelwe used (cid:15) = .
01 and we varied the number of rungs but kipping fixed theminimum time step ( dt min = . × − ). For the bottom panel we used (cid:15) = .
004 and we varied the minimun time step ( dt min , = . × − and dt min , = . × − ) but kipping fixed the number of rungs (6 and 3 forHamSp and AKDK, respectively). is partly due to the time step selection function, in agreement withwhat was alredy discussed by Zemp et al. (2007); Grudi´c & Hop-kins (2020). Other helpful characteristic is the stability of energyas we can see in figures 7 and 10, this is related with the approxi-mated reversibility that we observed in HamSp. Although tweakingparameters we may obtain di ff erences in performance and accu-racy, we noticed that for a defined energy conservation thresholdin many cases HamSp outperforms AKDK. Recently, some stud-ies have published successful results using HamSp but keeping the − − − dt/t dyn ( r s )01 × × × × a c t i v e p a r t . HamSpAKDK t =0 t =20 t dyn ( r s ) t =40 t dyn ( r s ) Figure 13.
Particle distribution across time step rungs for di ff erent selectionfunctions: acceleration criteria (eq. 5) stars and the symmetrized free-fall(eqs.8 and 9) solid circles, for the fiducial isolated halo using the same pa-rameters as figure 5. It is notorious that the acceleration criteria has moreparticles with small time step and that free-fall criteria is almost the oppositewith potential consequences for the performance. standard acceleration based time step selection function to equa-tion 5 (Wang et al. 2019), based in our study results we may specu-late that Bridge strategies like the one discussed by Portegies Zwartet al. (2020) may indicate that reaching even better performance ispossible. At this point we have compared the integrators accuracy and perfor-mance for some dynamical times. A natural question arises aboutif HamSp advantages are relevant for realistic long term integra-tion. Dark matter halos survive around 30 −
200 dynamical times incosmological simulations depending on the merger / accretion his-tory (Klypin et al. 2015). As before we investigated the stabilityof energy, linear and angular momentum and density centroid forour fiducial isolated halo model, this time for hundreds of dynam-ical times, results are presented in figure 14. Energy conservationis quite close to the global step LeapF behaviour during the first20 −
40 dynamical times (consistent with our previous tests), how-ever after that it starts to slowly drift in agreement to the reversibil-ity tests presented above. The drift seems to get smaller at the endof the simulation ( ∼ t dyn ). AKDK quickly drifts to a consid-erably larger energy error and keeps systematically growing, al-though it does not flattens like HamSp. The yellow curve representsthe HamSp − sTSS version that delays the step hierarchy updating,as we observed before it behaves as AKDK but with smaller accu-racy, although it is quite faster. For linear and angular momentumall integrators preserve such quantities, however while HamSp pre-serves almost at machine accuracy, AKDK preservation is at almost8 orders of magnitude worst, a similar disparity is obtained track-ing the halo centroid. Seeking for a consequence of the di ff erencein accuracy we tracked the circular velocity in experiments withsmaller number of particles ( N = . r s where circular velocity peaks. Figure15 shows the circular velocity profile at several times. Despite wefind some di ff erences, they are all inside 10%. We conclude that atthe level of high accuracy we do not expect important di ff erencesbetween integrators and performance is the most relevant di ff er-ence. As far as N -body simulations reach larger dynamical range, MNRAS , 000–000 (0000) ssessing Hamiltonian Splitting − − − − | ∆ E / E | | ∆ R c m | t / t dyn ( r s ) − − − − − | ∆ P / P | t / t dyn ( r s ) − − − − − | ∆ L / L | LeapFAKDK HamSpHamSp − sTSS Figure 14.
As figure 5, but following the fiducial isolated halo up to 1000 dynamical times (at scale radius). t / t dyn (2 . r s )0 . . . . . V m a x / V m a x , i n i LeapF AKDK HamSp . r / r s . . . . . . . . . V c i r / V m a x , i n i LeapFAKDKHamSp t = 0 t = 50 t dyn (2 . r s ) t = 100 t dyn (2 . r s ) t = 200 t dyn (2 . r s ) t = 400 t dyn (2 . r s ) Figure 15.
Circular velocity curves for the fiducial isolated halo with 2000particles following the system for long integration times. There is not sys-tematic di ff erence between di ff erent integrators. The energy conservationlevel is: . Although the accuracy is not high the maximum circular velocityis scattered inside 7% at di ff erent moments. the smallest structures may live for a larger number of dynamicaltimes, HamSp may o ff er a more stable possibility in such a case.Recent studies regarding long-term N -body evolution (Her-nandez et al. 2020), suggest that long term N -body integration maybe quite unstable to the particular realization, this may be partic-ularly critical for highly non-linear situations like the three-bodyproblem. We generated a small ensemble of realizations for thefiducial isolated halo and for the sinking satellite problem, resultsare shown in figure 16 only for LeapF and HamSp. Indeed some scatter is found, however overall we found that results are quiterobust.
10 DISCUSSION AND CONCLUSIONS
As it has recently been discussed there is no general solution for asymplectic adaptive time step integrator, although there are severalproposals. Using a GPU direct integration N -body code, we testedand characterized the Hierarchical Hamiltonian Splitting integratorproposed by Pelupessy et al. (2012), but we focused on collisionlesssimulations. As a reference we compared with the global time-stepsympletic Leap-Frog (LeapF) and the Adaptive KDK (AKDK).(i) We explored the e ff ect of the time-step selection function.We demonstrate that time-step selection functions like a dynam-ical time proxy are quite e ffi cient and may help to compensatecomputational overheads for HamSp. As introduced by Pelupessyet al. (2012), it adopts a symmetryzed free-fall particle pair func-tion, which is suitable for direct integration codes like the one weused. In all our experiments the smallest time step rung is not over-populated in contrast to the traditional acceleration based selectionfunction, making HamSp more e ffi cient.(ii) Based on reversibility and time symmetry tests we con-cluded that HamSp is not symplectic, it is not either time sym-metric but it is only approximately reversible, although quite sta-ble in comparison to AKDK. Although the exact correspondencebetween forward and backwards integration lasts only for few dy-namical times even for ten dynamical times the order or magni-tude energy drift is preserved as well as some features. In contrastAKDK energy drift grows linearly with time, showing clearly thatis not reversible. Based in our tests HamSp reversibility combinedwith the selection function explains the advantage in performance.(iii) Although AKDK is neither reversible or symplectic, at highenergy conservation accuracy and large number of particles, there isno di ff erence in maximum circular velocity, density profile or satel-lite dynamical friction time between LeapF, AKDK and HamSp. MNRAS000
As it has recently been discussed there is no general solution for asymplectic adaptive time step integrator, although there are severalproposals. Using a GPU direct integration N -body code, we testedand characterized the Hierarchical Hamiltonian Splitting integratorproposed by Pelupessy et al. (2012), but we focused on collisionlesssimulations. As a reference we compared with the global time-stepsympletic Leap-Frog (LeapF) and the Adaptive KDK (AKDK).(i) We explored the e ff ect of the time-step selection function.We demonstrate that time-step selection functions like a dynam-ical time proxy are quite e ffi cient and may help to compensatecomputational overheads for HamSp. As introduced by Pelupessyet al. (2012), it adopts a symmetryzed free-fall particle pair func-tion, which is suitable for direct integration codes like the one weused. In all our experiments the smallest time step rung is not over-populated in contrast to the traditional acceleration based selectionfunction, making HamSp more e ffi cient.(ii) Based on reversibility and time symmetry tests we con-cluded that HamSp is not symplectic, it is not either time sym-metric but it is only approximately reversible, although quite sta-ble in comparison to AKDK. Although the exact correspondencebetween forward and backwards integration lasts only for few dy-namical times even for ten dynamical times the order or magni-tude energy drift is preserved as well as some features. In contrastAKDK energy drift grows linearly with time, showing clearly thatis not reversible. Based in our tests HamSp reversibility combinedwith the selection function explains the advantage in performance.(iii) Although AKDK is neither reversible or symplectic, at highenergy conservation accuracy and large number of particles, there isno di ff erence in maximum circular velocity, density profile or satel-lite dynamical friction time between LeapF, AKDK and HamSp. MNRAS000 , 000–000 (0000) Aguilar-Argüello et. al − − − − − | ∆ E / E | LeapFHamSp × − × − × − | ∆ Z c m | t / t dyn − − | ∆ P / P | t / t dyn − − − − − | ∆ L / L | Figure 16.
Error in conserved quantities for a NFW halo with N = particles plus a satellite simulated up to 10 dynamical times. Shown is the energy error(upper left panel), change in center of mass position (upper right panel), linear momentum (lower left panel) and change in angular momentum (lower rightpanel) for the Leap-Frog (red line), and Hamiltonian Splitting (black line) integrators. Shaded portions represent one sigma standard deviation propagatedfrom di ff erent realizations run under di ff erent random seeds. This opens the possibility that HamSp is a more e ffi cient alterna-tive to AKDK.(iv) The Bridge techniques like the ones discussed in PortegiesZwart et al. (2020) make appealing the possibility of using HamSpin the way we discussed in this paper keeping the e ffi ciency on ac-celeration calculation and as a consequence reaching considerablyhigh performance. ACKNOWLEDGEMENTS
GAA and JCC thanks Leobardo Ithehua Rico for the fruitful tech-nical support and discussion. OV and GAA acknowledge supportfrom UNAM PAPIIT grant IN112518. GAA and JAT thanks sup-port from CONACyT graduate fellowships 455336 and 825451,respectively. HV acknowledges support from IN101918 PAPIIT-UNAM Grant and from a Megaproyecto DGTIC-LANCAD. Theauthors acknowledge DGTIC-UNAM for access to the Supercom-puter MIZTLI. This research was partially supported through com-putational and human resources provided by the LAMOD UNAMproject through the clusters Atocatl and Tochtli. LAMOD is a col-laborative e ff ort between the IA, ICN and IQ institutes at UNAM,and it acknowldges benefit from the MX28274 agreement withIBM. REFERENCES
Aarseth S. J., 1971, Ap&SS, 14, 20Arca-Sedda M., Capuzzo-Dolcetta R., 2016, MNRAS, 461, 4335Bagla J. S., 2002, Journal of Astrophysics and Astronomy, 23, 185Bode P., Ostriker J. P., 2003, ApJS, 145, 1Bode P., Ostriker J. P., Xu G., 2000, ApJS, 128, 561Couchman H. M. P., 1991, ApJ, 368, L23Dehnen W., 2002, Journal of Computational Physics, 179, 27Dehnen W., 2017, MNRAS, 472, 1226Dehnen W., Read J. I., 2011, European Physical Journal Plus, 126, 55Efstathiou G., Davis M., White S. D. M., Frenk C. S., 1985, ApJS, 57, 241 Farr W. M., Bertschinger E., 2007, ApJ, 663, 1420Fujii M., Iwasawa M., Funato Y., Makino J., 2007, PASJ, 59, 1095Garrison L. H., Eisenstein D. J., Pinto P. A., 2019, MNRAS, 485, 3370Grudi´c M. Y., Hopkins P. F., 2020, MNRAS, 495, 4306Habib S., Morozov V., Frontiere N., Finkel H., Pope A., Heitmann K., 2013,in SC ’13 Proceedings of SC13: International Conference for High Per-formance Computing. p. 6, doi:10.1145 / / , 000–000 (0000) ssessing Hamiltonian Splitting Zhu Q., 2017, arXiv e-prints, p. arXiv:1712.10116MNRAS000