tRecX -- an environment for solving time-dependent Schrödinger-like problems
ttRecX — an environment for solving time-dependentSchr¨odinger-like problems
Armin Scrinzi ∗ Ludwig Maximilian University, Theresienstrasse 37, 80333 Munich, Germany
Abstract tRecX is a C++ code for solving generalized inhomogeneous time-dependentSchr¨odinger-type equations id Ψ /dt = H [ t, Ψ] + Φ in arbitrary dimensions andin a variety of coordinate systems. The operator H [ t, Ψ] may have simple non-linearities, as in Gross-Pitaevskii and Hartree(-Fock) problems. Primary ap-plication of tRecX has been non-perturbative strong-field single and doublephoto-electron emission in atomic and molecular physics. The code is designedfor large-scale ab initio calculations, for exploring models, and for advancedteaching in computational physics. Distinctive numerical methods are the time-dependent surface flux method for the computation of single and double emissionspectra and exterior complex scaling for absorption. Wave functions and oper-ators are handled by tree-structures with the systematic use of recursion on thecoarse-grain level. Numerical, analytic, and grid-based discretizations can becombined and are treated on the same abstract level. Operators are specifiedin the input using a script language including symbolic algebra. User-friendlyin- and output, error safety, and documentation are integrated by design.
Keywords: S chr¨odinger solver, strong field physics, attosecond physics,recursive structure PROGRAM SUMMARY
Program title: tRecX — time-dependent Recursive indeXing (tRecX=tSurff+irECS)
CPC Library link to program files: (to be added by Technical Editor)
Developer’s repository link: https://gitlab.physik.uni-muenchen.de/AG-Scrinzi/tRecX
Code Ocean capsule: (to be added by Technical Editor)
Licensing provisions:
GNU General Public License 2
Programming language:
C++
Nature of problem : tRecX is a general solver for time-dependent Schr¨odinger-like prob-lems, with applications mostly in strong field and attosecond physics. There are notechnical restrictions on the spatial dimension of the problem with up to 6 spatialdimensions realized in the strong-field double ionization of Helium. A selection of co-ordinate systems is available and any Hamiltonian involving up to second derivatives ∗ E-mail address:
Preprint submitted to Journal of L A TEX Templates January 21, 2021 a r X i v : . [ phy s i c s . c o m p - ph ] J a n nd arbitrary up to three dimensional potentials can be defined on input by simplescripts. Solution method:
The method of lines is used with spatial discretization by a flexiblecombination of one dimensional basis sets, DVR representations, discrete vectors, ex-pansions into higher-dimensional eigenfunctions of user-defined operators and multi-center basis sets. Photo-emission spectra are calculated using the time-dependentsurface flux method (tSurff) in combination with infinite range exterior complex scal-ing (irECS) for absorption. The code is object oriented and makes extensive use oftree-structures and recursive algorithms. Parallelization is by MPI. Code design andperformance allow use in production as well as for graduate level training.
Contents1 Introduction 3
Index constructors . . . . . . . . . . . . . . . . . 324.1.2
Discretization classes . . . . . . . . . . . . . . . . . . . 334.2 The template class Tree . . . . . . . . . . . . . . . . . . . . . . . 334.3 Operators classes . . . . . . . . . . . . . . . . . . . . . . . . . . . 342.3.1 Construction and optimization of an
OperatorTree . . . 354.3.2 Further important operator classes . . . . . . . . . . . . . 364.4 Recursive algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 364.5 Basis sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384.5.1
BasisIntegrable . . . . . . . . . . . . . . . . . . . . . . 384.5.2
BasisDVR . . . . . . . . . . . . . . . . . . . . . . . . . . . 394.5.3
BasisGrid . . . . . . . . . . . . . . . . . . . . . . . . . . 394.5.4
BasisVector . . . . . . . . . . . . . . . . . . . . . . . . . 404.5.5
BasisSub . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.5.6 Multi-dimensional basis functions —
BasisNdim . . . . . 404.6 Input, units conversion, and algebraic expressions . . . . . . . . . 414.6.1 Class
Units . . . . . . . . . . . . . . . . . . . . . . . . . . 424.6.2
Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424.7
TimePropagator and
TimePropagatorOutput classes . . . . . . . 424.7.1
Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434.8 Python scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
The tRecX code package is designed to be a high-performance, yet flexibleand robust code with good maintainability and usability for Schr¨odinger-liketime-dependent problems. It is in use for computing the interaction of atomicand molecular systems in non-perturbatively strong laser fields. It implementsa range of techniques such as irECS (infinite-range exterior complex scaling[1]), tSurff (the time-dependent surface flux method [2, 3]), general and mixedgauges [4], and the FE-DVR method for complex scaling [5, 6]. The hybrid anti-symmetrized Coupled Channels method (haCC [7]) is going to be made publiclyavailable with the next release. The code has been developed for and applied tosolving several problems in strong field physics. The most outstanding applica-tions of tRecX are the computation of fully-differential double electron emissionspectra of the Helium atom [8, 9] at laser wave length from 10 to 800 nm, includ-ing also elliptically polarized fields [10], as well as strong field ionization ratesand photo-emission spectra for di- and tri-atomic linear molecules [11, 12, 13]with arbitrary alignment between the direction of laser polarization and themolecular axis.During the development of the code a conscious effort has been and still isbeing made to adhere to good programming practice for ensuring re-usabilityand maintainability. The object-oriented C++ code systematically uses abstractand template classes for ensuring uniform and transparent code structure. Foreasier accessibility by physicists these classes reflect concepts that are familiarin physics such as the linear and more specifically Hilbert space, operators thatare usually but not necessarily linear maps, and wave functions. Discretizationof the wave function is in terms of an abstract basis set class, whose specificimplementation covers the whole range from discrete sets of vectors, over grids,3nite-elements, standard basis sets such as spherical harmonics, all the way toexpansions in terms of eigenfunctions of a user-defined operator. These can becombined in a tree-structured hierarchy that admits building correlated (non-product) bases from one-dimensional factors. For performance, numerical li-braries such as Lapack [14], Eigen [15], or FFTW [16] are used on the low level.Parallelization is through MPI with some degree of automatic load-balancingbased on self-measurement of the code.The development of tRecX was initially motivated by several simultaneousPhD projects all related to the time-dependent Schr¨odinger equation (TDSE),but varying in dimension from 1 to 6 with different coordinate systems and dis-cretization strategies. Using math-type strings for input of discretizations andoperators allowed covering all these projects within the same framework, re-ducing supervision overhead, code redundancy, and programming errors. Also,non-trivial model Hamiltonians can be implemented quickly with little com-promise in computational performance. This includes, for example, Floquetcalculations or simple many-body systems.For the use with research students and for graduate level teaching, but alsofor productivity in research, error-safety and usability are important designgoals, as is adhering to good programming practice. Due to its origin in multi-ple research projects the code does contain important sections that do not con-form to such best practice, but there is an ongoing effort to re-implement thosesections with modern standards. Documentation relies on code-readability andDoxygen [17] inline documentation. Input is exclusively through a dedicatedclass, which only allows documented input and machine-generates up-to-datehelp. Input can be as numbers or algebraic expressions with standard mathe-matical functions and combining SI, cgs(ESU), or atomic units (a.u.).
We give an overview of typical uses of tRecX that do not require any exten-sions to the code. In addition, the code’s potential is made clear and possibleadvanced use with or without code extensions is indicated. Far from attemptingcomplete documentation in this place, we expose the mathematical background,logical structure, and the principles for mapping equations into the code. Someroom is given to describing code structure and selected classes. This infor-mation, apart from being useful in its own right, is meant to illustrate designphilosophy and principles, which we consider as a defining constituent of thetRecX project.The aim is to provide answers and/or useful information regarding the fol-lowing questions: • What has been done and what is typically done using tRecX? • Is there a possibility of using or adapting tRecX for my problem? • What are the most important methods in tRecX? Which of them arespecific for tRecX? 4
What is the code structure? How could I extend this for a new use?We do not discuss here specific algorithms or numerical methods in greaterdetail, giving the relevant references instead.In the following, we first list examples of applications and discuss the corre-sponding inputs, before introducing the main methods used in the code. Finally,code concept and structure are illustrated at the example of its main classes.Independent reading of the sections is aided by ample cross-referencing andminor redundancies between the sections. No effort is made to provide a com-plete manual for the code here or elsewhere. Rather, all examples shown andfurther introductory and advanced examples are provided as tutorials with thecode. This together with code readability and generous Doxygen annotation isintended to serve as a source of full documentation.
2. Application examples
The code source resides on a git repository [18] from where up-to-date infor-mation on file structure and compilation should be drawn. We only single outthe subdirectory tutorial that contains input files for a range of applications,named , , etc., where tutorial/00 through systematically introduce the most important input features and codefunctionalities. We choose the single-electron system for introducing the general character-istics of strong field physics problems, the discretization strategy, and the formof operators in tRecX. Complete input at slightly different parameters is givenin tutorial/11shortPulseIR .The single-electron time-dependent Schr¨odinger equation (TDSE) in strongfields is, in atomic units (a.u. ) i ddt Ψ( (cid:126)r, t ) = [ −
12 ∆ + iA z ( t ) ∂ z + V ( | (cid:126)r | )]Ψ( (cid:126)r, t ) . (1)This describes an electron bound by a rotationally symmetric potential, wherethe laser field is linearly polarized in z -direction (cid:126) E ( t ) = (0 , , E z ( t ) and theinteraction is written in dipole approximation and velocity gauge with the vectorpotential (cid:126)A ( t ) = (cid:90) t −∞ dτ (cid:126) E ( τ ) . (2)At optical or near-infrared wave length the duration of one field oscillation ison the scale of 100 a.u. and pulse durations reach 1000’s of a.u. . In ionization,a wide range of momenta appears and the wave function expands to very largesize during the pulse. This requires reliable absorption at the simulation boxboundaries, if exceeding simulation sizes are to be avoided. In tRecX, the stan-dard method for absorption is irECS (Sec. 3.1), which allows to work with box5izes of only a few 10’s of a.u. , although the underlying problem expands to1000’s of a.u. .Technically, ”strong field” also means that rotational symmetry is stronglybroken. Still, the use of polar coordinates and an expansion into sphericalharmonics Y ml often remains convenient and efficient. The ansatz isΨ( (cid:126)r, t ) = M (cid:88) m = − M L (cid:88) l = | m | Y ml ( ϕ, θ ) 1 r χ ml ( r, t ) . (3)In linear polarization the m -quantum number is conserved and the problem iseffectively two-dimensional. The radial functions χ ml need to support a broadrange of momenta, which suggests the use of higher order grid methods withsufficient density of points. The standard choice in tRecX is a finite-elementdiscrete-variable method (FE-DVR) with K=10-20 collocation points per ele-ment. The density of points is problem-dependent, typical average densities are2 points per atomic unit. Such an expansion is written as χ ml ( r, t ) = N − (cid:88) n =0 K − (cid:88) k =0 b nk ( r ) C mlnk ( t ) . (4)We remark here that indices of coefficients and partial wave functions are gener-ally written as superscripts, while basis functions are labeled by a subscript thatcounts the basis, and a superscript, that designates the set of basis functionsto which the individual function belongs. That principle is loosely adhered tothroughout the paper and broken occasionally for aesthetic reasons.FE-DVR can be considered as a local basis set discretization with Lagrangepolynomials as the basis functions on intervals [ r n , r n +1 ] b nk ( r ) = L k (cid:18) r − r n r n +1 − r n (cid:19) for r ∈ [ r n , r n +1 ] , L k ( y ) = K − (cid:89) j =0 ,j (cid:54) = k y − y j y k − y j . (5)The y j are the quadrature points for a Lobatto quadrature rule on the interval[0 , r n , which amounts to a linearconstraint on the expansion coefficients of the form C ml,n − ,K − = C ml,n, .Using polar coordinates for (cid:126)r , the full expansion can be written as a hierarchyof sumsΨ( ϕ, cos θ, r ; t ) = M (cid:88) m = − M e imϕ L (cid:88) l = | m | P | m | l (cos θ ) N − (cid:88) n =0 K − (cid:88) k =0 b nk ( r ) r C mlnk ( t ) , (6)where P | m | l are properly normalized associated Legendre functions. For thecomputation of matrix elements all operators involved can be written as (shortsums of) tensor products, for example − ∆ = − ⊗ ⊗ r ∂ r r − (cid:18) ⊗ ∂∂ cos θ sin θ ∂ cos θ + 1sin θ ⊗ ∂ ϕ (cid:19) ⊗ r . (7)6n this form matrix elements only involve one-dimensional integrations, which, inFE-DVR, are performed using the underlying Lobatto quadrature scheme. Wedenote quadrature schemes by pairs of nodes and weights, in present example as( r j , w j ). For correct results in FE-DVR one must use the explicitly symmetricform of any operator involving derivatives. For example, one writes (cid:90) r n r n r dr r b nk ( r )[ − r ∂ r r r b nl ( r )] → (cid:90) r n r n dr [ ∂ r b nk ( r )][ ∂ r b nl ( r )] = K − (cid:88) j =0 w j [ ∂ r b nk ( r j )][ ∂ r b nl ( r j )] , (8)and similarly for the other coordinates. Note that in this example the Lobattoquadrature rule gives the exact integral.For product bases, matrices corresponding to tensor products are tensorproducts of matrices. Typical bases in tRecX are not tensor products, butrather show tree-like interdependence (Sec. 3.2). Still, matrix-vector multiplica-tions can be performed with essentially the same operations count as for stricttensor products (cf. Sec. 3.3). In the given case, rotational symmetry of thepotential and dipole selection rules reduces operator matrices to simple block-tridiagonal matrices and there is no computational advantage in exploiting thetensor-product form.The negative Laplacian Eq. (7) can be specified on input by the string <1 > <1 > < d_1_d >+ <1 > < d_ (1 - Q*Q)_d > <1/( Q*Q ) >......+ < d_1_d > <1/(1 - Q*Q ) > <1/( Q*Q ) >. The pairs of “ . . . ” in subsequent lines are for typesetting only and indicate thatthe lines in actual input should be joined into a single line. The symbols
The
Laser:shape and
FWHM parameters determine (cid:126)α ( t ), which by default pointsinto z -direction. Any desired polarization angle can be input with additionalparameters. Shape cos8 indicates a pulse envelope function cos , which approx-imates a Gaussian pulse but maintains strictly finite pulse duration, in this caseabout 3000 a.u. . At the carrier envelope offset phase φ = 0 the vector potential | (cid:126)A | has a node at t = 0. The field (cid:126) E ( t ) then has its peak approximately at t = 0except for very short pulses, where the factorization into carrier and envelopebecomes ill-defined and extra contributions from the time-derivative of (cid:126)α ( t ), see(2), become non-negligible.The discretization is specified in the form Axis : name , nCoefficients ,...... lower end , upper end , functions , orderPhi ,1Eta ,30 , -1 ,1 , assocLegendre { Phi }Rn ,80 , 0, 40 , polynomial ,20Rn ,20 , 40 , Infty , polExp [0.5]
This means that we use 30 angular momenta ( L max = 29) and FE-DVR func-tions b nk ( r ) , n = 0 , , , , × b k starting at 40, con-sists 20 polynomials with exponential damping exp( − . r ). The single functionon the ϕ -coordinate is trivially constant and the associated Legendre functionshere effectively reduce to the ordinary Legendre polynomials. Specifying the ra-dial coordinate as Rn instructs the code to use the Dirichlet boundary conditionsat 0 and a warning will be issued, if the basis does not start from r = 0. Theremaining inputs for time-propagation and complex scaling, will be discussed inlater examples.At the given laser parameters tSurff was first demonstrated for a realisticscale problem in a prototype implementation [2]. With tRecX results are ob-tained within (cid:46) ∼
20% in themain part of the spectrum, see also discussion in [2]. Computation times canbe further reduced by parallelization, but gains of a factor (cid:46) tutorial/11 .8 igure 1: Dependence of calculated photo-emission spectra on the radius R c (c.f. Sec. 3.1).Calculation for the Hydrogen atom with a laser pulse duration of 3 optical cycles at wavelength 800 nm and peak intensity 3 × W/cm . The relative differences of (cid:46)
10% betweenthe calculations arise, as the Coulomb tail is effectively cut off at R c . Numerical convergenceis well below these differences. A script for producing this and similar plots directly frommultiple tRecX outputs is provided with the code, see Sec. 4.8. This much larger problem is used to illustrate the input of higher-dimensionaland more complex discretizations that contain basis constraints in the form ofinter-dependencies between the coordinates. Also, with electron repulsion an op-erator appears that does not have tensor-product form. The tutorial/23Helium3DSpectrum elaborates further on the following by computing double-emission spectra, al-though for less demanding parameters.The Hamiltonian of the Helium atom is H ( t ) = H ( t ) ⊗ + ⊗ H ( t ) + 1 | (cid:126)r − (cid:126)r | , (10)where H ( t ) is the single-electron Hamiltonian from Eq. (1) with V ( r ) = − r .We generalize the expansion (6) to two electrons with the ansatzΨ( (cid:126)r , (cid:126)r ) = M (cid:88) m = − M M (cid:88) m = − M L (cid:88) l = | m | L (cid:88) l = | m | Y m l ( ϕ , θ ) Y m l ( ϕ , θ ) χ m m l l ( r , r )(11)and the radial functions χ m m l l ( r , r ) = N − (cid:88) n ,n =0 K − (cid:88) k ,k =0 b n k ( r ) b n k ( r ) C m m n n l l k k ( t ) . (12)In a complete expansion for Ψ( (cid:126)r , (cid:126)r ) analogous to Eq. (6), the 8-index coeffi-cients appear within a hierarchy of 8 sums, with the number of indices related tothe dimension of the problem. In tRecX, such a discretization can be specifiedby the lines 9 define BOX 40 For convenience, the input files allow local macros, here used to define
ANG as20 for the number of angular momenta,
BOX for the simulation box size, and
NABS for number of functions for absorption. The radial axes
Rn1 and
Rn2 arehere cut into three different regions, the section [0 ,
10] with 20 points, the regionwith lower density of 40 points on [10 , ϕ and ϕ coordinates we have the first three functions fromthe basis expIm which is defined as { , e − iϕ , e iϕ , e − iϕ , e iϕ , . . . } . Nominally, the above basis has a daunting size, given by the product ofthe size of each of the axes, which would be impractical for calculations. tRecXallows to impose constraints on the bases by letting the basis one hierarchy leveldepend on the preceding levels. In fact, a first such constraint has tacitly beenintroduced by using the spherical harmonics l i ≥ | m i | , where the P | m i | l i (cos θ i )depend on the value of m i . For the given problem further constraints wereadded by the input BasisConstraint : axes , kindPhi1 . Phi2 ,M =0Eta1 . Eta2 , Lshape [3;24]
The first line simply constrains the z -component of total angular momentumto 0 = m + m , which reduces the 6-dimensional problem to 5 dimensions,and, in our example reduces the basis size by a factor 3. The second constraintaccounts for the fact that because of the particular dynamics of photo-ionizationpairs of angular momenta ( l , l ) with both values large do not occur and thebasis can be constrained to an L-shaped region near the axes in the l l -plane,Fig. 2. Examples and numerical demonstration of such constraints can be foundin [8, 10]. This reduces the effective dimension to near 4. The possibility toflexibly impose constraints of this kind is one of the important features of thetree-structures in tRecX and has been used extensively in applications.10s a result of the constraints the expansion coefficients C no longer are thecomponents of a tensor. Rather, the indices become inter-dependent, wherewe use the convention that any index can only depend on the indices to theleft of it. While operator matrices cease to be tensor products of matrices, thehierarchy of indices still allows efficient operator application, see Sec. 3.3. Coulomb repulsion cannot be written as a finite tensor product and requiresspecial treatment. We use a multipole expansion and apply the radial part bymultiplication on a quadrature grid. Although this can be made exact withinthe given polynomial basis, it turns out that the approximate DVR quadraturedoes not compromise computation accuracy. Details of the scheme are given inRef. [8] for finite elements, which can be readily transferred to FE-DVR nowused by default in tRecX.While tensor product operators can be defined through simple scripting,Coulomb repulsion is custom-implemented. The Hamiltonian (10) can be spec-ified as
Operator : hamiltonian =1/2 < < Laplacian > >...... -2. < < Coulomb > >+[[ eeInt6DHelium ]]Operator : interaction = iLaserAz [t]<
Phi2 must appear above the
Eta2 which carries the associated Legendre functions assocLegendre{Phi2} . The sequence determines the layout of the indices ofthe C ’s, where storage is such that lowest axis corresponds to the rightmostindex, which runs fastest. Storage arrangement can be modified when definingthe parallel layout, See. 3.6. The Floquet method converts a time-periodic problem into a stationaryproblem by discrete Fourier expansion in time. The resulting operator hascontinuous spectrum on the whole real axis, but underlying resonances can beaccessed by complex scaling. The tutorial/90Floquet shows how the methodcan be used within tRecX.The TDSE for a single-electron system in a cw field polarized in z -directionis, in velocity gauge, i ddt Ψ( (cid:126)r, t ) = [ H + i sin( ωt ) A z ∂ z ] Ψ( (cid:126)r, t ) . (13)11 igure 2: Preponderance rules and the use of constraints. Panel (a) shows the maximalamplitudes of angular momentum pairs ( l , l ) during laser ionization of Helium atom by alinearly polarized field. An Lshape constraint discards unneeded pairs. Reproduced fromRef. [8]. Panel (b): maximal amplitudes in the spherical waves Y ml during ionization of asingle-electron atom by a near-circularly polarized field. A pronounced preponderance forsmall l + m is seen. Reproduced from Ref. [10]. The Ψ( (cid:126)r, t ) can be expanded intoΨ α ( (cid:126)r, t ) = e − i(cid:15) α t Φ α ( (cid:126)r, t ) , (14)where the Φ α ( (cid:126)r, t ) = Φ α ( (cid:126)r, t + T ) are strictly time-periodic and in turn can beexpanded into a discrete Fourier seriesΦ α ( (cid:126)r, t ) = ∞ (cid:88) n = −∞ e inωt Φ αn ( (cid:126)r ) , ω = 2 πT . (15)Inserting into the TDSE and arranging the Φ αn into a vector (cid:126) Φ α one finds theeigenvalue equation (cid:98) H F (cid:126) Φ α = (cid:15) α (cid:126) Φ α (16)with ( (cid:98) H F ) mn = δ nm ( H + nω ) + 12 ( δ n,m +1 − δ n,m − ) A z ∂ z (17)The Floquet Hamiltonian H F has the complete real axis as its continuous spec-trum, into which the bound states of H are embedded. For non-zero A z allbound states experience an ac-Stark shift to a resonance energy E r with a de-cay width Γ r . Upon complex scaling these two quantities appear as complexeigenvalue E r − i Γ / H F .We define a discretization for the expansion (15) as Axis : name , nCoefficients , lower end , upper end , functions , orderVec ,18Phi , 1Eta , 7 , -1 ,1 , assocLegendre { Phi }Rn , 60 , 0. , BOX , polynomial ,30Rn , 30 , BOX , Infty , polExp [0.5]
Vec labels a total of 18 Floquet blocks, i.e. the Fouriercomponents Φ n , n <
18. The Floquet Hamiltonian (17) is input as where the define macros are used for better readability. The factor
Vec with entries ( i − δ ij .The potential − (1 + e − . r ) /r models the screened potential seen by oneelectron in a Helium atom. That model gives qualitatively meaningful re-sults for single-ionization processes and approximately reproduces the first fewground and excited state energies of the Helium atom. We use it to illustratenon-perturbative ac-Stark shifts and the resulting intensity-dependent n -photonFreeman resonances [19]. We trace the resonance positions E r − i Γ as a func-tions of A z from field intensity I = 0 to 2 × W/cm . The function A[I] that is used in the Hamiltonian string together with tracing range and step sizeare defined in the input as
Trace : eigenvalues , from , to , steps , function-0.903 , 0, 2 e14 W/ cm2 , 71 , A[I ]= sqrt (I )/0.1155 where − .
903 is the initial guess eigenvalue and the function A [ I ] defines theconversion from intensity to A z in a.u. for the given photon energy of 0 . au ∼ eV . The eigenproblem is solved by inverse iteration and roots are selected forlargest overlap with the preceding solution. Fig. 3 shows traces for ground andexcited states, where crossings near intensities 1 . × indicate an 8-photonresonance. These lead to characteristic structural changes in differential doubleemission spectra, as discussed in Ref. [9]. A popular model for inspirational studies in strong field physics is the “two-dimensional Helium atom” defined by the Hamiltonian H ( x , x ) = (cid:88) i =1 , (cid:34) − ∂ ∂x i + iA ( t ) ∂∂x i − (cid:112) x i + a (cid:35) + 1 (cid:112) ( x − x ) + b , (18)which with values a = 0 . b = 0 . tutorial/20Helium2d A Cartesian grid extending symmetrically around the origin is input as13 igure 3: Floquet energies and widths as functions of laser intensity at photon energy0.1115 a.u. ( ≈ nm ). States can be identified by their initial field-free energies. Thecrossings near 1 . × W/cm occur at 8-photon transitions from the ground to the 3s and3d-states, respectively. Energies are w.r.t. the continuum threshold. The coordinates
X1,X2 illustrate the general tRecX feature that coordinatescan be numbered. Equivalently one can use, e.g., the axis names
X,Y . Complexscaling is input as
Absorption : kind , axis , theta , upperECS ,X1 ,0.3 , BOXECS ,X2 ,0.3 , BOX with a complex scaling radius of R = 20 at positive coordinates. The complexscaling radius at negative coordinates defaults to − R , but can also be setexplicitly by specifying a value for Absorption:lower .Using input macros for brevity, the Hamiltonian is
This illustrates how to define electron repulsion, which is a multiplicative oper-ator that is not a tensor product w.r.t. x and x : one defers the definition of14 igure 4: Photo-electron spectra for the 2 × φ , see Eq. (9). All graphs share the same color code withnormalization to overall maximum 1. Only the quadrant where both electrons are emittedinto the positive axis direction is shown. The pulse is single-cycle at wave length 800 nm andintensity 3 × W/cm . Spectra strongly depend on φ because of the pronounced asymmetryof the single-cycle pulse. the potential by putting a placeholder factor <{}> until one reaches the hierar-chy level of the lowest coordinate axis, here X2 . On that last level one definesthe function using the axis names as the variables. Simple multi-dimensionalpotentials can be input easily in this way. For more complicated dependenciesone may consider writing a specialized class instead. A larger class of generalthree-dimensional potentials is covered by the Pot3d discussed in section 2.6below.Spectra for emission into the first quadrant R + × R + can be computed byinputs analogous to the full 6-dimensional case. Other quadrants are not sup-ported at present, but spectra can be obtained by computations with reflectedcoordinate axes ( x , x ) → ( ± x , ± x ). Fig. 4 shows the dependence of spectraon the carrier-envelope phase φ , Eq. (9), for a single-cycle pulse. In tRecX one can use hybrid bases where different types of basis functionsare combined to discretize the same space. A typical example is the haCCmethod [7] for molecules in strong fields, which combines a Gaussian-based CIwith the numerical basis described above. Another example is a multi-centerbasis, where spherical bases with different centers are combined.For the introduction of the concept of hybrid bases we use a model that ispopular in strong field physics, realized in tutorial/221CO2Free . In that typeof model one assumes that a single or a few bound states {| α (cid:105) , α = 0 , , . . . , A − } of some complicated Hamiltonian H a are essential, but the strong field dynamicson the rest of the space can be described by a simplified Hamiltonian H b with15he total Hamiltonian H ( t ) = P H a P + QH b Q + i (cid:126)A ( t ) · (cid:126) ∇ , and Q = (1 − P ) , P = A − (cid:88) α =0 | α (cid:105)(cid:104) α | . (19)For H b one typically uses free motion or motion in a Coulomb field. Note thatthe interaction among the | α (cid:105) states and between | α (cid:105) and the rest of the spaceis taken fully into account in H ( t ). With a single bound state A = 1 and H b = − ∆ this is very nearly the so-called “strong field approximation” [20],which is behind much of the theoretical understanding of strong field physics.Hamiltonian (19) was used to investigate attosecond (1 as = 10 − s ) delays inphoto-emission from CO . We choose a highly simplified CO model Hamilto-nian H a = −
12 ∆ − γ (1 + 5 e − r/c ) | (cid:126)r | − (1 − γ )(1 + 7 e −| (cid:126)r + (cid:126)b | /a )2 | (cid:126)r + (cid:126)b | − (1 − γ )(1 + 7 e −| (cid:126)r − (cid:126)b | /a )2 | (cid:126)r − (cid:126)b | , (20)where γ = 0 . C and O atoms and screening was chosen as c = 3 , a = 1 .
73. The O-atoms are locatedalong the z -axis at the equilibrium C − O bond length (cid:126)b = (0 , , . au ). Withthat one finds a Π-gerade state at the CO HOMO energy of ≈ − . a.u. . Forthe purpose of this study it suffices to compute the eigenstates | α (cid:105) of H a in asingle-center expansion. We pick the HOMO and the next higher Σ-state asfollows: The potential parameters
BOX,GAM,CSCR,ASCR are set by define ’s. The func-tion itself was hard-coded into tRecX for efficiency, although it can be, in prin-ciple, written as in the example of Sec. 2.4. The additional input subset sepa-rately specifies the discretization on a
Subspace and its
Complement . The basison the subspace are two
Orbital s { Φ ( (cid:126)r ) , Φ ( (cid:126)r ) } , which are three-dimensional Eigenbasis functions of the Hamiltonian
HAM , which are computed in the dis-cretization defined in the subset named
Complement . Complement is a standardspherical expansion. The axial symmetry around z is broken by the field, whichis why a total of 7 ϕ -functions m = 1 , ± , ± ω and its 13thand 15th harmonic as (cid:126)A ( t ) = ˆ (cid:15)A f cos ( tτ ) sin( ωt + φ ) + ˆ (cid:15)A h cos ( tτ H )[sin(13 ωt ) + sin(15 ωt )] (21)16ith the polarization vector ˆ (cid:15) in the xz -plane. The field is input in terms ofpeak intensities and FWHM as Laser : shape ,I(W/ cm2 ), FWHM , lambda ( nm ), phiCEO , polarAnglecos2 , 1 e10 , 4 OptCyc , 800 , pi /2 , 45cos4 , 1 e11 , 3 OptCyc , 800/13 , 0, 45cos4 , 1 e11 , 3 OptCyc , 800/15 , 0, 45
Here pi/2 for phiCEO at the fundamental means that node of the fundamentalfield falls onto the peak intensity of the harmonics. Note that
OptCyc is w.r.t.to the first wave length in the list. A warning issued by the code will remindthe user of this fact.The Hamiltonian (19) is specified as
Operator : hamiltonian = <0 ,0 > HAM + <1 ,1 >(1/2 < < Laplacian > >)Operator : interaction =< allOnes >......( iLaserAx [t]<
The factor
Subspace&Complement and indicates that all sub-blocks of theinteraction on the subspace and its complement are to be computed: H I = i (cid:126)A · (cid:126) ∇ = P H I P + P H I (1 − P ) + (1 − P ) H I P + (1 − P ) H I (1 − P ) . (22)Also note that polarization is no longer along the z -axis but rather in the xz -plane, which is why x and z -components of the dipole interaction are bothpresent. Delays in the laser-emission of electrons have drawn some attention as pos-sible indicators of a delay in tunneling emission (see, e.g., [21, 22]), which mayoccur on the time scale of attoseconds. In order to correctly pose the question,one must disentangle any possible such delay from delays not related to tunnel-ing that are well known to appear in scattering after emission. The model aboveallows to give meaning to the notion of “scattering after emission” by restrictingthe action of the binding potential to the initial state and use the free particleHamiltonian everywhere outside the bound initial state. This can be comparedto the full problem, or a partially restricted problem, e.g. using only the shortrange part of the molecular potential or motion in the Coulomb field insteadof than free motion. The experimental definition of delay is related to a beatin a so-called RABITT spectrogram, see, e.g. [23] for a general discussion ofattosecond techniques.Here we only illustrate the use of tRecX for comparing alternative modelswithin the same computational framework without any deeper discussion ofthe underlying physics. Fig. 5 shows RABITT delays computed with threedifferent models, the full single-electron Hamiltonian (20), the strong field-likeapproximation Eq. (19) with free motion H b = − ∆ / H b = − ∆ / − /r . If there were any dependence of thedelays on the alignment of the laser field with the molecular axis, this would be17 igure 5: Dependence of RABITT emission delays on alignment between of the molecularaxis and laser polarization direction. The three calculations are for the CO single-electronmodel (20, dots) and the Hamiltonian (19) with H b the free motion (diamonds) and motion inthe Coulomb field (squares), respectively. Size of the dots indicates emission yield. Alignment-dependent delays are largely are due to scattering in the molecular potential. Delays by theCoulomb potential are nearly independent of alignment. considered as an effect of tunneling through the orientation-dependent barrier.While the full model shows strong orientation dependence, no such effect isseen with free motion or motion in the Coulomb field. The conclusion from thissimple study is that any possible effects of tunneling delays would be completelydominated by delays incurring after emission. When a system has singularities at several points in space the use of a multi-center basis is advisable. The tutorial/510OffCenterScatter was used as thestarting point for the calculations published in [24].We consider one scatterer at some larger distance from the origin. Such apotential cannot be written as tensor product w.r.t. the original polar coordi-nates, but rather is treated as a general three-dimensional potential which isinput in a category
Pot3d . It is referenced in the operator definition as the spe-cial operator [[Pot3d]] . An off-center radial potential can be specified by theCartesian coordinates of its origin and an algebra string for the radial function,as for − / (cid:112) | (cid:126)r − (cid:126)r | with (cid:126)r = (0 , ,
65) in
Pot3d : potential = radial [0 ,0 ,65 , -1/ Q]
A matching off-center basis with the spherical harmonics l ≤ PolarOffCenter : radius =5 , origin =[0 ,0 ,65] , Lmax =2 , Mmax =0 , Nmax =5 which uses polynomials of degree 4 on a sphere of radius=5 around (cid:126)r . Withthe center placed on the z -axis, we have axial symmetry around the z -axis and m -quantum numbers remain conserved. In that case one may constrain theoff-center basis to m = 0, as in the example above.That basis is to be combined with a standard spherical basis centered at theorigin into a hybrid basis as in 18 xis : subset , name , functions , nCoefficients ,...... lower end , upper end , orderOff , Ndim , PolarOffCenterCenter , Phi , ,1,Eta , assocLegendre { Phi } ,4 , -1, 1, Rn , polynomial , 160 , 0 ,80 ,10, Rn , polExp [1.] , 20 , 80 , Infty The off-center potential and the off-center basis both break rotational symme-try and cause partial fill-in of overlap and operator matrices. Note that here,different from Sec. 2.5, the bases of the two subsets are not orthogonal. Theinverse of the overlap is applied through a specialized class that implements theWoodbury formula for low-dimensional updates of an inverse (cf. Sec. 4.3.2).Possible linear dependency and ill-conditioning of the overlap is monitored, butdoes not usually pose a problem for a rather well-localized off-center basis as inthis example.The fill-in of operator matrices occurs where the off-center functions over-lap with the radial sections of the origin-centered basis. For that reason it isrecommended to minimize the number of radial elements [ r n , r n +1 ] where the Center -basis overlaps with the
Off -basis. In the given example, the off-centerbasis has overlap with two radial sections r ∈ [60 , Center discretization, as in
Operator : hamiltonian =0.5 < allOnes ><< Laplacian > >...... - < allOnes ><< Coulomb > >+[[ Pot3d ]]
The factor
3. Methods and general framework
Strong field problems involve photo-emission all the way to total ionizationof the initial system. Pulse durations are long on the atomic time scale and themomentum spectrum can be very broad. In this situation efficient absorption ofoutgoing flux is provided by “infinite range exterior complex scaling” (irECS) [1].Complex scaling is an analytic continuation technique for Schr¨odinger operatorsby which the continuous energy spectrum is rotated around the single or multiplecontinuum thresholds into the lower complex plane leading to damping of thecontinuous energies in forward time-evolution. Bound state energies remainunaffected by the transformation and a new class of discrete eigenvalues W r =19 r − i Γ r / E r with decay widths Γ r .The transformation is achieved by scaling the coordinates (cid:126)r → e iθ (cid:126)r . If thescaling is only applied outside a finite radius R one speaks of exterior complexscaling (ECS). As a consequence of analyticity, exterior complex scaling leavesthe solution in the region r ≤ R strictly unchanged and allows direct directphysics interpretation — it is a perfect absorber. The usual discretization errorsarise but any dependence on the complex scaling angle θ can be reduced tomachine precision and in that sense there are no adjustable parameters. Thechoice of θ does matter for efficiency with θ ∼ π/ − π/ r > R b k ( r ) = L k ( r ) e − αr . (23)The L k can is any set of orthogonal or sufficiently well-conditioned polynomialssuch as Laguerre or Lagrange polynomials. In tRecX, we use for L k the La-grange polynomials at the Radau quadrature points for the weight e − αr , whichis a DVR basis (Sec. 4.5.2). The rational of this discretization is to simultane-ously accommodate short and long wave length: short wave-length require finersampling but get damped by complex scaling over a short range. Long wavelength penetrate deeper into the absorbing region, but need fewer discretizationfunctions over the range. This discretization reduces the number of functionsneeded for absorption per coordinate by factors (cid:46) Absorption : kind , axis , theta , upperECS , Rn1 ,0.3 ,20ECS , Rn2 ,0.3 ,20
The name upper indicates the complex scaling radius R for interval [ R , ∞ ).For Cartesian coordinates (Sec. 2.4) one also needs absorption towards negativeinfinity ( −∞ , X − ] which defaults to X − = − X + , but can be set independentlyby lower if so desired. The exponentially damped functions are chosen withthe axes, as shown in the applications of Sec. 2.The time-dependent surface flux (tSurff) method constructs spectra from theflux through a surface at some sufficiently large radius R c . It is specific for thedipole approximation used in laser-ionization that momenta will get modifiedalso after they pass any remote surface. This can be taken into account if one hasan analytic solution for the time-evolution outside R c . With R c large enough forneglecting the potentials, these are the Volkov solutions for electronic motionin a dipole field, here given in velocity gauge and δ -normalized χ V(cid:126)k ( (cid:126)r, t ) = (2 π ) − / e − i Φ( (cid:126)k,t ) e i(cid:126)k(cid:126)r , (24)with the (cid:126)k -dependent Volkov phases Φ( (cid:126)k, t ) = (cid:82) t dτ [ (cid:126)k − (cid:126)A ( τ )] /
2. With thesethe complete spectral amplitude at a given (cid:126)k can be written as an integral over20he surface and time b ( (cid:126)k, T ) = (cid:90) TT (cid:104) χ V(cid:126)k ( t ) | [ −
12 ( − i(cid:126) ∇ − (cid:126)A ( t )) , h ( r − R c )] | Ψ( t ) (cid:105) dt (25)As h is the Heaviside function, the commutator leads to δ -functions at r = R c and the integral is only over the surface. T is the begin time of the pulse, and T is some time large enough such that all relevant flux has passed R c . The schemewritten here for the single-particle emission can be generalized to the emissionof two or more particles. In tRecX, the general form is implemented, but inpractice three-particle emission has not been studied for reasons of problemsize. Further details on the tSurff method can be found in Refs. [2, 3].tRecX computes values and derivatives of Ψ( t ) on the surface and savesthem to disk. In a second sweep, the integral (25) for the spectral amplitudes b ( (cid:126)k, T ) is computed. For multi-particle spectra the process is recursively iter-ated. One can specify the desired grid for (cid:126)k using the input category Spectrum with a choice of points and optionally a momentum range. If
Spectrum is found,tRecX automatically initiates the amplitude computation. Alternatively, onecan restart tRecX with the output directory as input and the momentum gridspecified by command line parameters.tSurff and irECS are the two defining techniques of tRecX which also havephonetically inspired the name as tRecX=tSurff+irECS. An alternative inter-pretation of the acronym is related to the recursive discretization discussedbelow in Sec. 3.2.
The matrix representing a complex scaled Hamiltonian non-hermitian andhas the desired complex eigenvalues. If one uses strictly real basis functions,the matrix for the unscaled Hamiltonian will usually be real. In that case, theHamiltonian matrix will become complex symmetric upon scaling, i.e. (cid:98) H ij = (cid:98) H ji without complex conjugation. This is a computationally useful property: theright eigenvectors of (cid:98) H are identical to the left-eigenvectors (cid:98) H (cid:126)C n = (cid:126)C n W n ⇔ (cid:126)C Tn (cid:98) H = W n (cid:126)C Tn (26)and the eigenvectors (cid:126)C n can be selected to be pseudo-orthonormal (cid:126)C Tm (cid:126)C n = δ mn . (27)A modification of that general approach is used for irECS: one starts fromstrictly real basis functions on the rhs. , but in the scaled region r ≥ R theseare multiplied by a complex factor. This creates a complex-valued discontinu-ity of the logarithmic derivative in rhs. basis at R , that is required by ECS.The analogous, but complex conjugated discontinuity is required for the lhs.basis. Mathematical and implementation details of this realization of ECS,and its numerical advantages compared to commonly used alternatives are dis-cussed in Refs. [1, 6]. With the lhs. differing from the rhs. basis, also the21verlap matrix becomes complex symmetric rather than hermitian. However,algorithms remain unchanged from the hermitian case, if some care is taken toproperly use transposed instead of adjoint matrices and vectors. For example,a pseudo-Schmidt-orthonormalization can be performed if only one replaces thestandard scalar product with its complex-symmetric counterpart (cid:126)C † (cid:126)C → (cid:126)C T (cid:126)C ,and even a pseudo-Cholesky decomposition exists and is used. In tRecX, a key-word pseudo indicates that the unconjugated rather than standard operation isperformed. The organization of operators, wave-functions, expansion coefficients, basissets, and multi-indices in trees is central to the design of tRecX. Trees are used torecursively generate the objects and in virtually all other algorithms. This makesthe code largely independent of specific coordinate systems and dimensions andallows to handle all multi-dimensional expansions of the examples above withinthe same scheme. Program uniformity is ensured by deriving all trees from atemplate abstract base class Tree , Sec. 4.2.
We denote the L -tuple of all coordinates by Q = Q = ( q , q , . . . , q L − )and the sub-tuple starting at l by Q l = ( q l , . . . , q L − ). There is some flexibilityas to what is considered as a “coordinate”: on the one hand, the finite-elementindex n in Eq. (6) can assume the role of a coordinate, but also all three spatialcoordinates (cid:126)r of the orbitals Φ α ( (cid:126)r ) in the hybrid discretization of Sec. 2.5 canbe subsumed in a single q l := (cid:126)r .One or several sets of basis functions b J l = (cid:16) b J l ( q l ) , b J l ( q l ) , . . . (cid:17) are definedfor a coordinate q l , where we arranged the set as a row vector, indicated bythe underscore. The multi-index J l = ( j , . . . , j l − ) unites the labels j k ofall functions b J k j k ( q k ) , k < l preceding the basis set b J l ( q l ). In that way b J l can be made to depend on the sequence of basis functions preceding it in thecoordinate hierarchy. The tuple J = J L is the complete set of indices for a singleexpansion coefficient C J = C j ,j ,...,j L − . The basis function matching C J is B J ( Q ) = (cid:81) L − l =0 b J l j l ( q l ). As with coordinates, we use the word “basis function”in a rather wide sense: basis functions in the proper sense are trigonometricfunctions, associated Legendre functions, or the Lagrange polynomials for FE-DVR discretization, etc. but we also consider Kronecker δ : b j ( n ) = δ jn as a“basis function” for a discrete index n , e.g. the photon index in the Floquetmodel of Sec. 2.3. By that principle all discretization methods are treateduniformly in tRecX.All basis expansions discussed in Sec. 2 fit into the scheme, e.g. Eqs. (6)and (11). It is important to note that, while individual functions B J are prod-ucts of functions of the coordinates, the total basis B (again considered asa row vector of B J ’s) is not a product basis because of the dependence offactor functions b J l k on the complete preceding hierarchy. A well known setof two-dimensional functions with this structure are the spherical harmonics22 ml ( φ, θ ) ∝ e imφ P | m | l (cos θ ). Another example for the hierarchical dependencein the products is the implementation of angular constraints as discussed inSec. 2.2.With the above definitions, the wave function for a given tuple of coordinates Q l is expanded recursively asΨ J l ( Q l ) = K Jl − (cid:88) j =0 b J l j ( q l )Ψ J l j ( Q l +1 ) , (28)where we use the notation J l j := ( j , . . . , j l − , j ). Ψ( Q ) := Ψ J ( Q ) is thecomplete wave function at the point Q and Ψ J L =: C J L = C j ,j ,...,j L − is avector of length one, i.e. the complex valued expansion coefficient at the multi-index J l = ( j , j , . . . , j L − ). The recursion (28) defines the discretization as atree whose nodes are labeled by an index J l , as in Fig. 6. The subtree startingat J l defines a multi-coordinate wave function component Ψ J l ( Q l ). Every nodehosts a basis b J l = (cid:16) b J l j ( q l ) , j = 0 , . . . , K J l − (cid:17) and each function of the basisconnects to one branch of the node. Both, the number K J l of basis functions b J l j and their kind can be different on every node, as, e.g., for the associatedLegendre functions b J l −| m | = P | m | l in Fig. 6. Usually basis sets at given level l have equal coordinate q l . An exception are hybrid discretizations as in Secs. 2.5and 2.6. In Fig. 6, on level l = 1 the node at J = (0) hosts three-dimensionaleigenfunctions b = (Φ ( (cid:126)r ) . Φ ( (cid:126)r )), while its neighbor at J = (1) has the node-basis b = (cid:0) , e − iφ , e iφ (cid:1) . The recursive hierarchy is also reflected in the expansion coefficients. Everysubtree wave function Ψ J l is associated with a vector of expansion coefficients C J l . The overline indicates a column vector and emphasizes its duality to thebasis B J l . C J l is the direct sum of the coefficient vectors at l + 1: C J l = C J l C J l ... C J l K Jl . (29)The recursion can be phrased as “a coefficient vector is a vector of coefficientvectors”.Finally, the recursive hierarchy of the overall multi-dimensional basis setbelonging to C can be exploited for the computation of the operator matricesand in matrix-vector multiplication. The row-vector of multi-dimensional basisfunctions B J l ( Q l ) for the subtree J l is defined recursively as B J l ( Q l ) = (cid:16) b J l B J l , b J l B J l , . . . , b J l K Jl − B J l K Jl − (cid:17) . (30)23 igure 6: Index tree for the hybrid discretization of Sec. 2.5 (abbreviated). Axis names areindicated in yellow. Nodes on the respective levels are labeled by J l , factor basis functionsconnect a node to the next-lower level. The recursion starts from B J L ≡ ∀ J L = ( j , j , . . . , j L − ) and B J = B is thecomplete multi-dimensional basis. The full Hamiltonian matrix can be denotedas (cid:98) H = (cid:104) B | H | B (cid:105) if we interpret (cid:104) B | as a column vector of bra-functions. Thefull wave function is Ψ = BC .Here one can clearly see that the basis Eq. (30) reduces to a tensor product,only if the bases at all subnodes of J l are equal, B J l j ≡ B J l , ∀ j : B J l = (cid:16) b J l B J l , b J l B J l , . . . , b J l K Jl − B J l (cid:17) = b J l ⊗ B J l . (31)For each pair I l , J l of index subtrees we define a sub-block (cid:98) H I j J j of (cid:98) H = (cid:98) H I ,J , where the 1 × (cid:98) H I L J L are the matrix elements. The recursivestructure of the coefficients induces a recursive block structure of the matrix as (cid:98) H I l ,J l = (cid:98) H I l ,J l (cid:98) H I l ,J l · · · (cid:98) H I l ,J l K Jl − (cid:98) H I l ,J l (cid:98) H I l ,J l · · · (cid:98) H I l ,J l K Jl − ... ... (cid:98) H I l K Il − ,J l (cid:98) H I l K Il − ,J l · · · . (32)This structure can be exploited for construction of the operator matrices andalso for simple representation of block-sparsity, e.g. in presence of selectionrules. One can phrase this recursively as “an operator matrix is a matrix ofoperator matrices”. 24f the operator H is a tensor product H = h ⊗ h ⊗ . . . ⊗ h L − =: h ⊗ . . . ⊗ h l ⊗ H l +1 H l := h l ⊗ H l +1 , l = 0 , , . . . , L − I l , J l on level l can be assembled from all blocks at thenext-lower level ( I l +1 , J l +1 ) = ( I l i, J l j ) as (cid:2) (cid:104) B I l | H l | B J l (cid:105) (cid:3) ij = (cid:104) (cid:98) H I l ,J l (cid:105) ij = (cid:104) b I l i | h l | b J l j (cid:105) (cid:98) H I l i,J l j , (34)where [ . . . ] ij designates the ij -block of (cid:98) H I l ,J l and the indices i and j range from0 to K I l − K J l −
1, respectively. In practice, the matrix is not usuallyconstructed explicitly.The tensor-product form of H l implies a tensor-product form of (cid:98) H I l J l = (cid:104) B I l | H l | B J l (cid:105) , only if also B I l and B J l are strict tensor products as in Eq. (31),in which case Eq. (34) reduces to (cid:98) H I l J l = (cid:104) b I l | h l | b J l (cid:105) ⊗ (cid:98) H I l ,J l . (35)Yet, also when the matrix is not a tensor product, the recursive structureEq. (34) largely preserves the computational advantages of tensor products interms of the floating point count and, to a lesser degree, data compression. Atypical algorithm for matrix-vector multiplication is discussed in Sec. 3.3.Many operators in physics can be written as short sums of tensor prod-ucts and allow efficient and transparent computation of the matrices using thisscheme. In some cases it is advantageous to exploit the recursive structurefor applying the operator matrices to coefficient vectors, as for the radial ki-netic energy in two-particle problems. If the matrix is very block-sparse, ase.g. in case of dipole selection rules, direct block-wise application performs bet-ter. The choice between these options is made automatically in tRecX basedon the actual operator, using non-rigorous heuristics. When operators do nothave tensor-product structure, such as electron repulsion | (cid:126)r − (cid:126)r | − in the He-lium atom, the recursive scheme is still used in tRecX for bookkeeping and forensuring a uniform construction of operator matrices.For numerical efficiency, operators are not usually expanded to the lowestlevel, but rather recursion is terminated at a “floor”level F ≤ L such that thesmallest operator block has typical sizes of 10 × ∼ × r n , r n +1 , ] × [ r m , r m +1 , ], with atypical number of K = 20 functions for each radial coordinate. Operators onthe floor level are usually not represented by full matrices. In the Helium Hamil-tonian Eq. (10) the first two terms are trivial tensor products. With basis size K on both coordinates r , r , the operations count of matrix-vector multipliesis O ( K ) when one exploits the tensor-product form, rather than O ( K ) forgeneral full matrix. Such structures are automatically recognized by tRecX and25mplemented using derived classes of an abstract base class OperatorFloor .Electron repulsion on this lowest level requires application of matrices that arediagonal for each multipole term with matrix-vector operations count O ( K ).As mentioned above and discussed in Ref. [8], this is not exact, but turns out tobe an excellent approximation. The high computational cost of electron repul-sion arises not from the radial part, but from the significant fill-in of the blocksparse matrix by widely coupling the angular momenta of the two individualelectrons. This can be controlled to some extent by truncating the multipoleexpansion at less than maximal order (input OperatorFloorEE:lambdaMax ).The recursive scheme for operators and coefficients translates into simple andtransparent algorithms for matrix setup and matrix-vector operations, which areimplemented in the C++ class OperatorTree discussed in Sec. 4.3.
The code makes extensive use of numerical quadrature. This is so, by def-inition, for FE-DVR basis functions, but we apply it throughout: integralsinvolving trigonometric functions, the associated Legendre functions P | m | l orthe general multi-dimensional basis functions of Sec. 2.6 are usually all com-puted by quadratures. Wherever possible, exact quadrature is used. Apartfrom providing a uniform and comparatively error-safe computational schemein the code, exact quadratures are often numerically more stable the evaluationof complicated algebraic expressions for analytic integrals.The tree-structure of the expansion provides for efficient conversion to andfrom product grids that tRecX uses in multi-dimensional quadratures. Thewave-function value at one point Q A = ( q α , . . . , q L − α L − ) of an L -dimensionalproduct grid isΨ( q α , . . . , q L − α L − ) = K J (cid:88) j =0 b J j ( q α ) K J (cid:88) j =0 b J j ( q α ) · · · K JL − (cid:88) j L − =0 b J L − j L − ( q L − α L − ) C j ,j ,...,j L − (36)We abbreviate the matrix of basis function values at the grid points as b J l j l ( q lα l ) =: b J l α l j l and introduce the intermediate vectors G J l A l = G J l α l A l +1 = K Jl (cid:88) j l =0 b J l α l j l · · · K JL − (cid:88) j L − =0 b J L − α L − j L − C j ,j ,...,j L − = K Jl (cid:88) j l =0 b J l α l j l G J l j l A l +1 ∀ α l (37)with A l = ( α l , . . . , α L − ) and the previously defined J l = ( j , . . . , j l − ). The lastequality defines a recursion starting from coefficients C J L = G J L A L and ending atthe vector G J A of the values of Ψ at all grid points.26he analogous recursion can be set up for the back-transformation from gridto basis functions. With quadrature weights w lα l , α l = 0 , . . . , S l − q lα l one computes the overlap matrices for coordinate q l at the nodes J l s J l ij = (cid:88) α l ( b J l ) † iα l w lα l b J l α l j and from that the factor matrices for back-transformation d J l j l α l = (cid:2) ( s J l ) − b J l † (cid:3) j l α l w lα l . On complex-scaled coordinates, the adjoint b J l † must be replaced by the trans-pose b J l T , see Sec. 3.1.1. The recursion for back-transformation from G = G J A to C J L = G J L A L proceeds by G J l +1 A l +1 = G J l j l A l +1 = K Jll − (cid:88) α l =0 d J l j l α l G J l α l A l +1 ∀ j l (38)Both recursions (37) and (38) share the same structure and are implemented ina class OperatorMap , Sec. 4.3.2.The recursive algorithm for the transformation to a product grid is verysimilar to the algorithm for applying a tensor product of operators to a vectorand it has the same favorable operations count. For an ideal quadrature gridwith S l = K J l , the transformation maintains size len( C ) = len( G ) and theoperations count for the transformation C → G is (cid:32) L − (cid:88) l =0 K J l (cid:33) × len( C ) . (39)The computational gain increases exponentially with dimension L comparingto direct application of the full transformation matrix ( (cid:81) L − l =0 K J l ) × len( C ).In practice the number of quadrature points often exceeds the number of basisfunctions, S l > K j l , with a corresponding increase of operations count. Oneprominent example are the associated Legendre functions P | m | l ( η ) , l ≤ L wherewe use a Legendre quadrature grid η k , k = 0 , . . . , L which is shared among all m and is exact for all overlaps, but inflates the vector length from L − | m | + 1to L + 1. These are more points than, e.g., in a Lebedev quadrature grid [25],but the product structure is maintained and with it the efficient algorithm fortransformation to the grid.For simplicity we have treated the case where all coordinates are transformedto a grid, but obviously transformations can be limited to a given subset of thecoordinates, as needed. In tRecX, the creation of product grids and transforma-tions to and from them are handled by a specialized class DiscretizationGrid ,see Sec. 4. 27 .4. Adaptive features
In problems that are strongly driven by the external field, time step size andrequired basis size can change significantly as the system evolves. Step sizesdecrease near field peaks and increase near field nodes. By default, the codeautomatically controls the size of the time steps based on a standard single-to-double step estimate, which has an overhead slightly above 50%. We havedecided to use this simple but universal control algorithm, which only requires awell-defined consistency order of the underlying time-stepper, in order to main-tain flexibility in choosing the time-stepper. In strongly driven systems, gainby adaptive step size can be up to a factor of 2 compared to a step fixed at themaximal stable size. The maybe more important advantage of step size controlin tRecX is that well-defined accuracies are achieved without the need of carefultime-step adjustment. At the end of time propagation average step size and itsvariance are printed, based on which one can fix the step size once a system’sbehavior in a given parameter range and discretization is known.A typical phenomenon of strong-field physics is a large increase in angu-lar momenta as the field ramps up. After the end of the pulse, those angularmomentum components gradually decay and the operator does not need tobe applied to them. Also, in absence of the pulse the interaction part of theoperator is zero. These developments are monitored in the code and opera-tors are only applied in regions where there is non-negligible contribution tothe time-evolution. Control is achieved by estimating the contribution to thederivative vector based on the norm of the floor operator block || (cid:98) H I F J F || , whichis precomputed at setup, and a norm of the rhs. vector C J F . As the vectornorm needs to be evaluated at every time-step, we use the simple estimate || C J F || a := max i [ |(cid:60) ( C i ) | + |(cid:61) ( C i ) | ]. If the contribution to the total vector normis below a threshold || (cid:98) H I F J F |||| C J F || a ≤ (cid:15) th application of the block is skipped.The procedure requires some care with choosing (cid:15) th , but can speed up compu-tations by factors (cid:46) tutorial/13 . The code will print some advice when (cid:15) th may have been chosentoo large or too small, but at present heuristics for the choice of (cid:15) th is incom-plete. By default (cid:15) th = 0, i.e. blocks are only skipped when the operator blockor the vector become exactly zero, which happens, for example, after the end ofa laser pulse with strictly finite duration. For time-propagation at present only explicit methods are used, whose ef-ficiency notoriously deteriorates as the norm of the operator matrix increases.The main origin of large norm in Schr¨odinger-like problems is the Laplacian,whose matrix norm grows as p ∼ δx − , where p max and δx are the character-istic scales of maximal momentum and spatial resolution, respectively. Usuallyone does not manage to restrict the momenta in the discretization to the phys-ically relevant level and spurious, very high eigenvalues appear that can dra-matically slow down explicit time-steps to the level of numerical breakdown ofthe propagation. This stiffness problem can be fixed, if one manages to remove28purious eigenvalues from the operators. In tRecX, high-lying eigenvalues of thefield-free Hamiltonian are suppressed by spectral projections. In the simplestform one replaces the full Hamiltonian matrix with a projected one (cid:98) H ( t ) → ( − (cid:98) P ) (cid:98) H ( t )( − (cid:98) P ) , (cid:98) P = (cid:88) i | i (cid:105)(cid:104) i | , (40)where the | i (cid:105) are orthonormal eigenvectors for large eigenvalues of the field-freeHamiltonian.With more challenging Hamiltonians like for the Helium atom or molecu-lar systems, the full field-free Hamiltonian has many high-lying spurious states,they are expensive to compute, and application of the projection becomes costly.In such cases one can use for the | i (cid:105) eigenvectors of a different operator, for ex-ample the Laplacian. Eigenvectors of the Laplacian are sparse due to rotationalsymmetry and in case of multi-particle systems they can be given as tensorproducts of single-electron vectors. This renders calculation of the eigenvectorsas well as application of the projection computationally cheap.The cutoff energy for removal of high-lying states is characteristically setaround 100 a.u. . This is far larger than the actual energy scale of typically (cid:46) a.u. However, choosing the threshold that low would compromise the resultsand raises the cost of applying the projection to the point where no computetime is gained, in spite of the fact that step-size increases inversely proportionalto the cutoff energy. The energy cutoff is first introduced in tutorial/07 .Examples for using the Laplacian instead of the full Hamiltonian for projectingare in tutorial/21 and .Comparing to an outright spectral representation of the operators, removinga small number of outlier eigenvalues from a local representation maintains allsparsity deriving from locality of operators represented in a local basis. Thecost of removal remains low because the number of removed eigenvalues is smallcompared to the basis size and the vectors may have tensor product form, asfor the Laplacian of the He atom. tRecX is parallelized using MPI, but it will also compile without MPI, if noMPI library is detected by Cmake. The code is aware of hardware hierarchy inthat it can distinguish between “compute nodes” assumed connected throughswitches, “boards” connected by a bus, and “CPUs” assumed to have fast sharedmemory access. This hierarchy, although present in the code, is not at presentexploited by the distribution algorithm. For local operators, communicationbetween non-overlapping elements of the FE-DVR is low. Operator localitybetween elements is detected during setup and taken into account by the defaultdistribution algorithms for the respective coordinate systems.The finest MPI grains are the OperatorFloor blocks (cid:98) H I F J F . These oper-ate between subsections of the coefficient vectors C I F and C J F with typicaldimensions 10 to 400. The operator blocks can be distributed arbitrarily acrossall MPI nodes, but communication overhead must be taken into consideration.29he corresponding class OperatorFloor has a member cost() that deter-mines the CPU load for its application by self-measurement during setup. Aheuristic algorithm uses these numbers to create a load-balanced distributionof the operator. For containing communication cost, care is taken to arrangeblocks into groups that share either I F or J F . At least one of the respectivesections of coefficient vectors C J F or C J F reside on the same parallel process,which then “owns” the corresponding I F or J F .Actual communication cost is not measured by the code. Rather, it assumesthere is a sorting of the C I F such that compunction is dominantly short range, ase.g. sorting by increasing angular momenta in case of dipole interactions. Thenneighboring C I F ’s are preferably assigned to the same thread. The default forthis sorting is the sequence how the Axis:name are input. For some coordinatesystems this is overridden by internal defaults and the user can in turn canoverride by the input
Parallel:sort . The sorting actually used is shown inthe output.
Problems that can be solved with tRecX vary widely in structure and alsoin the methods employed. Scaling behavior strongly depends on these choices.Memory is, in general, not a limiting factor for tRecX calculations. Paralleliza-tion strategy focuses on large problems where run times in sequential modewould be days or weeks, while little effort has been made to boost paralleliza-tion for small problems with runtimes on the scale of minutes. Into the lattercategory fall many problems in tRecX that would be on time scales of hourswith more traditional approaches. These gains in program efficiency are throughcomplex features such as exploiting tensor products and block-sparsity, by stiff-ness control, the use of high order methods, or the tSurff box-size reduction. Allthese features, while at times dramatically reducing compute times and prob-lem sizes, tend to lead to coarser graining and enhanced communication, whichnecessarily deteriorates scalability. Specifically the haCC method is inherentlynon-local with large communication and therefore mostly restricted to sharedmemory use.Most problems treated with tRecX are best solved on small parallel ma-chines in the range from 4 to 64 cores. Only large problems such as the double-ionization of the Helium atom can profit from more extensive parallelization.Fig. 3.6.1 shows the scaling behavior for fixed-size problems (“strong scaling”).The two examples are Hydrogen in an IR field discussed in Sec. 2.1 and a He-lium atom computation with 20 l -functions, m = − , , × . Computations wereperformed at the LMU Theory machine KCS hosted at the Leibnitz Rechenzen-trum (LRZ), which consists of compute nodes connected by infiniband and dualboards with 2 ×
16 cores on each node. Parallelization gains can be seen up to256 cores. Scaling remains away from linear and as always in this situation onehas to weigh time gains for individual computations against overall throughputfor multiple runs. 30 igure 7: Scaling of tRecX for a double-ionization calculation of the Helium atom at totalbasis size of 3 × (dots) and single-electron problem discussed in Sec. 2.1 at basis size 1460(squares).
4. Main classes
Here we discuss the classes that form the functional and conceptional back-bone of tRecX. A complete listing of all classes is provided by the code’s Doxygendocumentation. In general, many classes in the code have a .write() memberfor dumping to file and a matching .read() or constructor for recovery fromfile. Mostly for debugging purposes, there is usually a .str() member thatreturns a human-readable string . Also, for critical classes, there are test() functions that provide cross-checks and usage examples. A key role is played bythe abstract template class
Tree . The C++ class Index represents the complete recursive basis tree definedthrough (30). The class and its member data are declared as class Index : public Tree < Index > {mutable uint32_t _size ;uint16_t _indexBas ;uint8_t _indexAx ;char _indexKind ;....}
These member data refer to the given node and are the only index-specificdata of the tree. The complete tree-structure, such as tree iterators, pruning,transposition, and other tree transformations are implemented in the templateclass Tree , Sec. 4.2, which is used for all tree classes of tRecX. The
Index data is squeezed into 8 bytes in an attempt to minimize storage, as
Index trees can become very large. This limits the number of different single-levelbasis sets b that are pointed to by _indexBas to 2 . In practice, also in verylarge computations only a few tens of different bases appear. Whenever a tensorproduct basis is used, the same basis re-appears at many nodes and has the same31 indexBas , as for example the product bases for the r , r -discretization. Therange of the axis pointer _indexAx is 2 , which is sufficient as the number of axesis intimately related to the dimension of the problem and hardly ever exceeds 10. _size gives the length of C J at the node. This information is redundant, butis cached here for fast access, and similarly _indexKind is cached informationabout the node’s function and position within the tree.The Index class, as one of the code’s oldest classes, is burdened by legacycode. In order to disentangle the current from legacy code, primary constructionis through an auxiliary derived class
IndexNew which takes an
AxisTree as itsinput.
AxisTree , in turn, reflects the definitions read from input. The standard
AxisTree is trivial with a single branch per node, equivalent to a vector. Onlywhen hybrid discretizations are used, as for the molecular problem (Sec. 2.5)and for off-centers bases (Sec. 2.6), the tree becomes non-trivial.An
Index contains the complete information about the basis B , Eq. (30).It also has a member function overlap() that returns a pointer to (cid:104) B J l | B J l (cid:105) as long as this is a meaningful entity for a single B J l , i.e. when the subspaceon level l does not have overlap with any other subspace on the same level (cid:104) B I l | B J l (cid:105) = 0 for I l (cid:54) = J l . Index constructors
Figure 8: Hierarchy of
Index classes, sec 4.1.1.
There is a number of specialized constructors for indices, see Fig. 8. Fordisentangling from the legacy code, these are usually given as the constructorof a derived class. One useful constructor is
IndexG for transforming from basisfunctions to representation by grid values. The grid can be equidistant, use-ful for plotting, or a quadrature grid, which is convenient for various forms ofbasis transformations and quadratures. The tree-structure of the basis allowsto perform these transformations computationally efficiently, cf. Sec. 3.3. The32 ndexS represents value and radial derivative at the tSurff surface. It is con-structed from a standard
Index by specifying the coordinate axis name andthe surface radius R s . IndexProd constructs a new index tree as the tensorproduct of two
Index trees.
IndexQuot forms the “quotient” of an
Index full by a “denominator”
Index den by eliminating from full all basis levels thatappear in den , ensuring consistency of the result. This allows to extract, e.g., asingle-electron factor from a two-electron basis. Maps to these derived indicesare generated automatically by a class
OperatorMap (see below) and are notinvertible in general
Discretization classes
Various classes derived form class Discretization are wrappers aroundspecific indices and serve as an interface for
Index construction. Examples are
DiscretizationGrid (Sec. 3.3) and
DiscretizationTsurffSpectra (Sec. 4.3.2).An important derived class is
DiscretizationSpectral , which constructsall or a selected part of the eigenvalues and eigenvectors of any diagonaliz-able operator matrix (cid:98) A and presents them in the form of a diagonal operator (cid:98) d A ( class OperatorDiagonal ). Further it contains transformations (cid:98) U and (cid:98) V ( class OperatorMap ) from and to the original basis, respectively. Note thatin general (cid:98) U (cid:54) = (cid:98) V † as the original basis as a rule is not orthonormal and also (cid:98) A may not be hermitian. Block-diagonal structure of the original operator isrecognized and translated into block-diagonal transformation. Arbitrary func-tions of the eigenvalues can be formed using OperatorDiagonal . E.g. one canform exp( − it (cid:98) A ) = (cid:98) U exp( − i (cid:98) d A ) (cid:98) V for time-integration of small problems, or,similarly, to implement rotations in a spherical harmonic basis. The principaluse in tRecX is in stiffness control, Sec. 3.5.For operators of the special form H = H I ⊗ + ⊗ H J the class DiscretizationSpectralProduct constructs a spectral representation taking full advantage of the fact that thereis an eigenbasis of H in the form of a tensor product of eigenbases of H I and H J .Transformations to and from that spectral representation have tensor productform. The class can also be used when the basis B is not tensor product, but isrelated to a tensor product B I ⊗ B J by a constraint as in Sec. 2.2.1. All trees in the code are derived from class Tree by the ”curiously recursivetemplate pattern” exemplified in class Index:public Tree
Tree has the private data template < typename T > class Tree {const T* _parent ;vector
Origin .Nodes without branches are called leafs. A standard sorting of leafs is bytheir position along the lower edge of the tree. The functions firstLeaf() and nextLeaf() return leftmost leaf descending from a given node and theiterator through the leafs. Note that in general nextLeaf() is not equivalentto nodeRight() as a tree’s lower edge does not need to remain at the samelevel depth, as in the example of Fig. 6 The index of a node is returned by vector
Index , this is exactly the tuple J l defined inSec. 3.2.Functions to add and remove branches include childAdd(T* C) and childPop() .For re-sorting trees there is a permute(...) , which takes a permutation of thetree levels as its argument and returns at tree with the levels permuted. A typ-ical case would be the transposition of tensor indices. With non-tensor objects,as e.g. in Sec. 2.2.1, it may not be possible to interchange certain indices in anunambiguous way and an exception will be raised upon the attempt.Finally, trees can also be realized as “views”, which do not actually owncopies of their data, but rather point do data of another tree. This is particularlyuseful for re-arranging tree data into a new tree by permuting indices withoutactually moving the data. All operator classes are derived from an abstract base class with the followingkey data and member function: class OperatorAbstract {const Index * iIndex , * jIndex ;void apply ( complex < double > A , const Coefficients &Xcomplex < double >B , Coefficients &Y) const =0;
It symbolizes a map (cid:126)X → (cid:126)Y = Op ( (cid:126)X ). The class containing the coefficients isa tree class Coefficients: public Tree
OperatorAbstract ,part of who are shown in the Doxygen-generated class hierarchy in Fig. 9. Par-ticularly important is class OperatorTree : public Tree < OperatorTree >, igure 9: Hierarchy of operators in tRecX (incomplete). An OperatorAbstract is an instanceof a map between linear spaces
LinSpaceMap , specifically between
Coefficients . Implemen-tations
OperatorTree , OperatorMap , and
Resolvent are briefly described in the text. public OperatorAbstract {protected :OperatorFloor * oFloor ;... which implements the hierarchy of block-matrices (32). The oFloor pointeris only non-null at the leafs of the operator tree. The class OperatorFloor implements all forms of maps in a numerically efficient way, for example, mul-tiplication of a vector by a full or diagonal matrix, multiplication by a tensorproduct of small matrices, but also more complicated maps as, for example inthe electron-electron interaction for a given multipole-contribution. Again, themap may be also non-linear, as in a Gross-Pitaevskii operator. These variousforms are realized as derived classes of the abstract base class
OperatorFloor . OperatorTree
The primary constructor of
OperatorTree takes an operator definition stringas in the examples of Sec. 2 that matches
Index and recursively sets up the fulloperator. As a rule, no complete matrix is constructed. Mostly, the tree containsonly the non-zero
OperatorFloor ’s. If tensor product structure is detected inthe operator, it is exploited if found to be numerically advantageous by some(approximate) internal algorithm. When multiple terms contribute to the same
OperatorFloor these are summed into a single
OperatorFloor where this ispossible and numerically profitable. 35 .3.2. Further important operator classes
From the whole list of operators we further single out the following classesfor their more general relevance:
OperatorInverse . Calculates the inverses of overlap matrices using Woodbury-like methods consisting of a cheap direct inverse with some low rank update forcompleting the exact inverse.
MapGauge . Implements general Gauge transformations.
OperatorMap . Given two
Index objects for discretizations B and B (cid:48) this isthe map B → B (cid:48) , where this is logically possible and meaningful. Typicalexamples are maps to and from grids, Sec. 3.3. Another application is in classDiscretizationTsurffSpectra for the transformation from surface values to agrid of momentum points, where the momentum spectra are accumulated. Suchtransformations are not necessarily lossless. Resolvent . Given (cid:98) H and an overlap (cid:98) S as OperatorAbstract ’s, this class con-structs the resolvent operator ( (cid:98) H − z (cid:98) S ) − with a complex z . At present theimplementation is through Eigen ’s sparse LU-decomposition and is limited bybasis sizes. For banded matrices, like in the Floquet example discussed above,
Resolvent can be constructed for very large dimensions.
Recursive structures provide for tRecX’s flexibility, but in addition theygenerate compact and comparatively transparent code. The basic pattern is: function ( tree ):if isLeaf : specific action on leafelse : for c in children : function (c)
As examples, we discuss the apply member function of the class
OperatorTree and an
OperatorTree constructor. For apply , which implements (cid:126)Y ← b(cid:126)Y + a (cid:98) O [ (cid:126)X ], the simplified pseudo-code is
1: O. apply (a ,X ,b ,Y ):2: Y <-b*Y3: if isLeaf :4: OperatorFloor . apply (a ,X ,1 , Y)5: else :6: for cO in O. children :0: cO . apply (a ,X. child ( cO . jdx ) ,1 , Y. child ( cO . idx ))
Here an
OperatorFloor class implements the specialized action in a efficientway, typically through LAPACK or Eigen. The code plays out its efficiencyfor block-sparse matrices, where zero-blocks never appear in the loop, line 6.The price to pay is that one needs to locate the block operator’s left- andright hand indices in the coefficient vectors (cid:126)X and (cid:126)Y , here symbolical writtenas
X.child(cO.jdx) and
Y.child(cO.idx) , respectively. If one ensures that36 peratorFloor.apply is a sufficiently coarse-grain operation, say multiplica-tion by a 20 ×
20 matrix, the overhead from the recursion remains small. Clearly,for full matrices or matrices with very regular structure such as band-matrices,the algorithm is at a disadvantage. Where such performance losses are identi-fied,
OperatorTree should be replaced by a more specialized class derived from
OperatorAbstract .After all setup is done the
OperatorTree a “flattened” view of the treeis created for use in propagation. This is a vector of pointers to the leafsof the
OperatorTree , which are automatically distributed for parallelization(see sec. 3.6). In the process, direct pointers from the operator indices to therespective sections of the (cid:126)X and (cid:126)Y vectors are set up, eliminating all overheadfrom that place.A second example of recursion in tRecX is the pseudo-code of a basic
OperatorTree constructor, for the case of a strict tensor product opDef="0.5
OperatorTree ( opDef , iIndex , jIndex , mult ):
The opDef strings are split into the scalar prefactor a , the first tensor fac-tor f , and the remainder r by getFactors . This is mainly located in class OperatorDefinition , with a few additional classes due to legacy code. Then getMatrix interprets the tensor factor string s=
All bases are derived from a class
BasisAbstract with the pure virtual func-tion size() giving the number of functions in the basis. The word “basis” isused in a general way for any set of defining properties for the discrete repre-sentation on a given q l . This includes a discrete set of functions, but also grids,or an orthonormal set of unit vectors in a discrete space. Bases need not beorthonormal, although this is ensured wherever it is possible and meaningful. BasisIntegrableBasisIntegrable is an abstract class is for single-variable basis functionsthat can be integrated over: class BasisIntegrable : public BasisAbstract {protected :double _lowBound , _upBound ;public :virtual void valDer (const vector < complex < double > > &X ,vector < complex < double >> &V ,vector < complex < double >> &D ,...) const =0;virtual void quadRule (int N , vector < double > & QuadX ,vector < double > & QuadW ) const =0;virtual unsigned int order () const =0;...
The functions are supported on the interval [ _lowBound , _upBound ], which mayalso be infinite. The pure virtual function valDer(...) must be implemented toreturn the value and first derivative matrices V ij = b j ( x i ) and D ij = b (cid:48) j ( x i ). Any BasisIntegrable must provide N -point quadrature rules q k , w k in QuadX,QuadW through quadRule(...) . Finally, there is the concept “order” of a
BasisIntegrable .This can be understood as the minimal number of quadrature points needed forthe correct evaluation of overlap matrix elements. For example, in a DVR basiswith Dirichlet boundary conditions an the lower boundary, the first Lagrangepolynomial is omitted, leading to size()=order()-1 .A simple example of a
BasisIntegrable that is only used for debuggingpurposes are the monomials { , x, x , . . . } : class BasisMonomial : public BasisIntegrable {int _order ;public :BasisMonomial ( int Order , double Low , double Up ): BasisIntegrable ( Low , Up ), _order ( Order ){}unsigned int size () const { return _order ;}void quadRule (...) const {... shift - and - scale Legendre quadrature ...} oid valDer ( const vector < complex < double >> &X ,vector < complex < double >> &V ,vector < complex < double >> &D ,...) const {V. assign (X. size () ,1.);D. assign (X. size () ,0.);for ( int k =1; k < size (); k ++){for ( int i =0; i BasisDVR An important BasisIntegrable implementation is BasisDVR , where themost important data members are class BasisDVR : public BasisIntegrable {vector < double > _dvrX , _dvrW ;int _nBeg ;int _size ;... _dvrX and _dvrW are the nodes and weights for quadrature rule. The rule is Lo-batto for finite intervals and Radau for semi-infinite intervals. There are at most _dvrX.size() different Lagrange polynomials, for which values and derivativescan be evaluated anywhere within the basis’ interval. Dirichlet boundary condi-tions are determined through _nBeg and _size . _nBeg =0 means the Lagrangepolynomial for _dvrX[0]=_lowBound is included, and =1, where that polyno-mial is omitted. Similarly, _nBeg+_size=_dvrX.size()-1 means the Lagrangepolynomial at _dvrX.back()=_upBound is omitted for Dirichlet condition atthat point. The values are set upon construction. BasisGridBasisGrid is not a BasisIntegrable , rather it derives directly from BasisAbstract : class BasisGrid : public BasisAbstract {vector < double > _mesh ;unsigned int size () const { return _mesh . size ();}... The only class-specific member data is _mesh , which holds the grid points.Values of a BasisGrid are only defined at the grid points, but a member functionfor Newton-interpolation between these points is provided.39he class is mostly for transforming BasisIntegrable ’s to grids. Assumea wave function | Ψ (cid:105) = | B (cid:105) C is given in terms of an Index cIdx contain-ing BasisIntegrable ’s. A new IndexG gridIdx(cIdx) is created where the BasisIntegrable ’s are replaced by the desired BasisGrid ’s. In the process an OperatorMap mapFrom is automatically created which transforms Ψ to its rep-resentation on the multi-dimensional grid. Assuming Coefficients X(cIdx) contains the C , then mapFromParent () - > apply (1 ,X ,0 , Y) fills Coefficients Y(gridIdx) with Ψ( (cid:126)x i ) at the multi-dimensional grid points (cid:126)x i . The assignment between values Ψ( (cid:126)x i ) and (cid:126)x i is given through the structureinformation contained in gridIdx .A class BasisGridQuad is derived from BasisGrid , with an additional mem-ber vector BasisIntegrable to a Gauss quadra-ture grid that is exact for the basis. This procedure is used on several occasions,e.g., for the efficient multiplication by the Volkov phases on a grid of (cid:126)k -values(see Sec. 3.1), when the spectral amplitudes are given in terms of sphericalharmonics. BasisVectorBasisVector is a simple and useful class for discrete coordinate indices,where only l matters and the value of q l has no significance. It is fully definedby its size class BasisVector : public BasisAbstract {unsigned int _size ;unsigned int size () const { return _size ;}... This is used, for example to label the Floquet blocks in sec. 2.3. BasisSub A subset of a given BasisAbstract is selected by BasisSub class BasisSub : public BasisAbstract {vector < int > _subset ;const BasisAbstract * _bas ;... where the vector BasisSub . This is used when imposing basis constraints or in general whenpruning branches from an Index . BasisNdim As illustrated in Sec. 2.5 and further discussed in Sec. 3.2, formal coordinates q l may also be multi-dimensional. Functions with higher-dimensional argumentsappear as orbitals but also as intermediate objects when mixing coordinates40ystems, for example for a multi-center expansion. The class is more complexthan the examples given so far. The general strategy is to store the valuesand partial derivatives of all functions at a suitable quadrature grid. This mayrequire substantial memory, but we have not exhausted standard size storage ofa few GB in applications so far. The quadrature grid may refer to a differentcoordinate system _quadCoor than the basis’s coordinate system _ndimCoor .From this follows the class signature: class BasisNdim : public BasisAbstract {string _ndimCoor ;string _quadCoor ;vector < vector < double >> _quadGrid ;vector < double > _quadWeig ;vector < vector < vector < complex < double >>>> _valDer ;... Matrix elements can be computed for operators given in terms of standardstrings, where the transformation between different coordinate systems is doneautomatically adhering to the philosophy of “automatic differentiation”. Forfurther details we refer to the in-line documentation of the code. All user input is controlled by a class ReadInput with a prescribed des-ignation of any input item in the format Category: name as illustrated in theexamples of Sec. 2. An overloaded read(...) method requires to supply adefault value or to state explicitly that there cannot be a default and brief doc-umentation for every input item. Inputs can be of all standard types, whichalso includes vector ’s. Input is usually read from file but can be overruledby command line flags of the format -Category:name=value by default or ab-breviated flags, that can be specified in read(...) . In the input file, several name ’s can follow the Category specifier, the sequence of the names is arbitrary.Also, the same category can appear in repeated lines, such as in Sec. 2.5 wherewe have Operator:hamiltonian and Operator:interaction . We follow theconvention of having Category ’s start with upper case and name ’s with lowercase letters.There is a simple syntax to restrict admissible input values. A member func-tion ReadInput::finish() checks all inputs from file and from the commandline for correct Category and name and will stop if a given pair Category:name in the file does not actually appear in the code, reducing the likelihood of mis-print errors. In addition, finish() a list of all admissible inputs in a file tRecX.doc , which also explains the input as documented in read(...) andthe default input values. The contents of this file is shown as a help whenrunning tRecX without any input.The above is meant to illustrate the general strategy for enforcing documen-tation and enhancing usability and error safety. Full features can be found inthe inline-documentation and are illustrated by a usage example in the test() member function. 41nother feature for productivity and error safety is the possibility to freelychoose input units and to use algebraic expressions as inputs. Default are a.u.unless the input name specifies a different unit. In the example of section 2.5 Laser : shape ,I(W/ cm2 ), FWHM , lambda ( nm ), phiCEO , polarAnglecos2 , 1 e10 , 4 OptCyc , 800 , pi /2 , 45cos4 , 1 e11 , 3 OptCyc , 800/13 , 0, 45cos4 , 1 e11 , 3 OptCyc , 800/15 , 0, 45 the intensity is expected with the strong-field convention as W/cm . Theunits in brackets at I(W/cm2) form a functional part of the Category:name .The name ’s input units can be overruled by specifying e.g. instead.That value will be converted by read(...) to W/cm , 10 − a.u. = 3 . . . . × − W/cm with full available precision. Another example is with lambda(nm) ,where, e.g. one could equivalently use the input string . Units Unit conversions are performed by a class Units which at present recog-nizes atomic units au , cgs ESU , and SI units plus a few units that customar-ily used in strong field physics such as W/cm , nm and Rydberg energy Ry .The duration of an optical cycle OptCyc is computed from the wave-length ofthe field component in the first line after Laser : the listing above produces1 OptCyc = 2 π (800 nm ) /c (converted to a.u. ). Algebra Input values can also be specified as algebraic expressions of constants, as in or pi/2 . The strings are interpreted by the same class Algebra that isused for the definition of operators. It can do standard complex algebra, wherecomplex numbers are specified as in . It recognizes a few constantssuch as pi and hbar ( (cid:126) in SI units). Further constants can be added from theinput, as documented in the command line help. TimePropagator and TimePropagatorOutput classes Time propagation is controlled through a wrapper class TimePropagator .It takes start and end times, accuracy or step size and output intervals as itsmain control parameters. For solving the ordinary differential equation in timeit needs a class of abstract type ODEstep . A range of those steppers have beenimplemented including a general (explicit) Runge-Kutta, a specialized classical4-stage Runge-Kutta, and Arnoldi solver, and several experimental solvers. Atpresent, only the classical Runge-Kutta is used, as it was found to be the overallmost efficient across the large variety of problems treated with tRecX. Thenotorious stiffness problem of explicit methods is controlled by removing fewextremely high-lying spectral values from the problem, see Sec. 3.5. Althoughthis does deliver a workable and rather efficient solution, we do not consider thedevelopment of time-steppers as concluded.42he class TimePropagatorOutput controls which information is outputduring time-propagation. One category of outputs are expectation values of op-erators, by default the overlap (cid:104) Ψ( t ) | Ψ( t ) (cid:105) and field-free Hamiltonian (cid:104) Ψ( t ) | H | Ψ( t ) (cid:105) ,where H is the operator specified as Operator:hamiltonian . Further expec-tation values can be defined at the input, for example the dipole values invarious gauges. Another category are Coefficients for A | Ψ( t ) (cid:105) . In this waythe values and derivatives at the tSurff-radius R c are written to disc, but A can also be user-defined. More transformations can be easily added by editing main_trecx.cpp . Plot One can plot densities of the kind | Ψ( (cid:126)q, t ) | or more generally Ψ( (cid:126)q, t )[ A Ψ]( (cid:126)q, t )with a user-defined operator A . This is handled by class Plot , which is con-structed from input as, for example, Plot : axis , points , lowerBound , upperBoundRn ,101 ,0. ,20.Eta ,31 , -1. ,1. where the density is plotted w.r.t. the discretization’s axes Rn and Eta thetwo-dimensional region [0 , × [ − , 1] with 101 × 31 equidistant grid points.Coordinates not listed are assumed to be integrated or summed over. Outputwill be in ASCII format and readable, e.g., by Gnuplot, but also by tRecX’s plot.py script. The order of inputs lines in Plot determines the sorting ofthe density values, such that the first axis , Rn in the example, runs fastest.For higher-dimensional plots the further dimensions will appear as additionalcolumns in the two-dimensional output file. Explanation of the input for plotscan be found in tRecX.doc , for the full features of the class we refer to theDoxygen and inline documentation of the code. There is a limited number of convenience python scripts in the SCRIPTS subdirectory. These have mostly grown out of practice and certainly do notcomply with good programming requirements. Yet, given their proven usefulnessin practice, we include them with the distribution.For submission to compute queues one can adjust submit_tRecX.py , whichis currently set up for SLURM and should be adaptable to similar queuing sys-tems with little effort. It ensures generation of properly named run-directoriesbefore starting the actual tRecX code. By this one can submit multiple jobswithout the need to manually ensure proper run-directory numbering. It alsocreates a short submit name for the job for display by the SLURM queueoverview.Virtually all ASCII files that appear in the run directory can be plotted using plot.py . It produces one- and two-dimensional graphs from selected columns ofa file, compares multiple runs, can annotate curves with the actual parametersused in the run etc. Brief instructions and a full list of command line flags aredisplayed by running plot.py without any arguments.43unning multiple calculations with varying parameters, either for ensuringconvergence or for analyzing a physical phenomenon is a frequent mode of usingtRecX. The script lRuns.py lists all or a selected subset of runs showing basicinformation such as status of the computation, run time, wave function norm,and energy. In addition, the user can select any set of input parameters fordisplay. Usage instructions are shown when running lRuns.py on the commandline without any parameters. 5. Conclusions The purpose of tRecX is three-fold: applications, training and education,and community development.The code produces accurate solutions for TDSEs that appear in ultrafastand strong field physics. In the present public version a wide range of standardproblems such as high harmonic generation, fully differential spectra for singleionization, Floquet and various model systems can be solved by adapting thegiven tutorial inputs. Also, with the use of significant computer resources,fully differential double emission spectra can be computed. With tSurff as oneof its key methods, computer resource consumption remains low, on the scaleof a few minutes for single-electron calculation of standard tasks, and withinthe range of the feasible for long-wavelength double emission. Forthcomingreleases will include haCC, which integrates Gaussian-based quantum chemicalwave functions with the discretizations discussed here. This allows to computeemission from multi-electron systems.A designated part of the tRecX development is to ensure user experience thatis acceptable to a somewhat wider range of specialist users, including experi-mentalists who want to generate standard results or study simple models as wellas theorists with more complex demands. We consider error safe and intuitiveinput, extensive consistency checks, and structurally enforced documentation asessential for achieving that goal.Finally, on the developer level, the systematic C++ object orientation hasallowed development and maintenance of the code by a very small group. Thefull research code is also used in training and education on the undergraduateand graduate level. In course of such projects, attention to understandable andconsistent code structure it taught and enforced. Student projects have non-trivially contributed to the code in specialized applications, such as the use ofparabolic coordinates, Coulomb scattering, and double- and triple breakup (notincluded in the public release yet).The experience with student projects shows that substantial structural con-tributions from a community are possible without endangering code integrityor maintainability. Possible first such projects would likely be collaborative,but also unsupervised extensions may well be feasible. A formal invitation forcontributions is extended here. 44 cknowledgment Key initial contributions to the code were made by Vinay Pramod Majetyand Alejandro Zienlinski, with further contributions by, in alphabetic order,Christoph Berger, Jonas Bucher, Florian Egli, Jacob Liss, Mattia Lupetti, J¨ornSt¨ohler, Jonathan Rohland, Andreas Swoboda, Hakon Volkmann, Markus andMichael Weinmueller, and Jinzhen Zhu. Funding was provided by the Mu-nich Center for Advanced Photonics (MAP), the Austrian Science Foundationproject ViCoM (F41) and the DFG priority program 1840 (QUTIF). References [1] Armin Scrinzi. Infinite-range exterior complex scaling as a perfect absorberin time-dependent problems. Physical Review A , 81(5):1–10, May 2010.[2] Liang Tao and Armin Scrinzi. Photo-electron momentum spectra fromminimal volumes: the time-dependent surface flux method. New Journalof Physics , 14(1):013021, January 2012.[3] Armin Scrinzi. t-surff: fully differential two-electron photo-emission spec-tra. New Journal of Physics , 14(8):085008, 2012.[4] Vinay Pramod Majety, Alejandro Zielinski, and Armin Scrinzi. Mixedgauge in strong laser-matter interaction. Journal of Physics B: Atomic,Molecular and Optical Physics , 48(2):025601, 2015.[5] T. N. Rescigno and C. W. McCurdy. Numerical grid methods for quantum-mechanical scattering problems. Phys. Rev. A , 62(3):032706, Aug 2000.[6] Markus Weinm¨uller, Michael Weinm¨uller, Jonathan Rohland, and ArminScrinzi. Perfect absorption in schrdinger-like problems using non-equidistant complex grids. Journal of Computational Physics , 333:199 –211, 2017.[7] Vinay Pramod Majety, Alejandro Zielinski, and Armin Scrinzi. Photoion-ization of few electron systems: a hybrid coupled channels approach. New.J. Phys. , 17, JUN 1 2015.[8] Alejandro Zielinski, Vinay Pramod Majety, and Armin Scrinzi. Doublephotoelectron momentum spectra of helium at infrared wavelength. Phys.Rev. A , 93:023406, Feb 2016.[9] Jinzhen Zhu and Armin Scrinzi. Electron double-emission spectra for he-lium atoms in intense 400-nm laser pulses. Phys. Rev. A , 101:063407, Jun2020.[10] Vinay Pramod Majety and Armin Scrinzi. Absence of electron correlationeffects in the helium attoclock setting. Journal of Modern Optics , 0(0):1,2017. 4511] Vinay Pramod Majety and Armin Scrinzi. Dynamic exchange in the strongfield ionization of molecules. Phys. Rev. Lett. , 115:103002, Sep 2015.[12] Vinay Pramod Majety and Armin Scrinzi. Static field ionization rates formulti-electron atoms and small molecules. Journal of Physics B: Atomic,Molecular and Optical Physics , 48(24):245603, 2015.[13] Vinay Pramod Majety and Armin Scrinzi. Multielectron effects in strong-field ionization of co : Impact on differential photoelectron spectra. Phys.Rev. A , 96:053421, Nov 2017.[14] Lapack. . Accessed: 2021-01-12.[15] Eigen - a c++ template library for linear algebra. https://eigen.tuxfamily.org . Accessed: 2021-01-12.[16] FFTW. . Accessed: 2021-01-12.[17] Doxygen. . Accessed: 2021-01-12.[18] The tRecX git repository. https://gitlab.physik.uni-muenchen.de/AG-Scrinzi/tRecX . Accessed: 2021-01-12.[19] R. R. Freeman, P. H. Bucksbaum, H. Milchberg, S. Darack, D. Schumacher,and M. E. Geusic. Above-Threshold Ionization with Subpicosecond LaserPulses. Phys. Rev. Lett. , 59(10):1092–1095, 1987.[20] M Lewenstein, Ph Balcou, M Yu Ivanov, Anne L Huillier, and P Corkum.Theory of high-harmonic generation by low-frequency laser fields. Physics ,49(3):2117, 1994.[21] Adrian N. Pfeiffer, Claudio Cirelli, Mathias Smolarski, Darko Dimitrovski,Mahmoud Abu-samha, Lars Bojer Madsen, and Ursula Keller. Attoclockreveals natural coordinates of the laser-induced tunnelling current flow inatoms. NATURE PHYSICS , 8(1):76–80, JAN 2012.[22] Lisa Torlina, Jivesh Kaushal, and Olga Smirnova. Time-resolving electron-core dynamics during strong-field ionization in circularly polarized fields. PHYSICAL REVIEW A , 88(5), NOV 4 2013.[23] A Scrinzi, MY Ivanov, R Kienberger, and DM Villeneuve. Attosecondphysics. J. Phys. B , 39(1):R1–R37, JAN 14 2006.[24] Denis Jelovina, Armin Scrinzi, Hans Jakob Wrner, and Axel Schild. Nonlo-cal mechanisms of attosecond interferometry in three-dimensional systems. Journal of Physics: Photonics , 2020.[25] V.I. Lebedev. Quadratures on a sphere.