[PDF] BOPfox program for tight-binding and analytic bond-order potential calculations

Abstract

Bond-order potentials (BOPs) provide a local and physically transparent description of the interatomic interaction. Here we describe the efficient implementation of analytic BOPs in the BOPfox program and library. We discuss the integration of the underlying non-magnetic, collinear-magnetic and noncollinear-magnetic tight-binding models that are evaluated by the analytic BOPs. We summarize the flow of an analytic BOP calculation including the determination of self-returning paths for computing the moments, the self-consistency cycle, the estimation of the band-width from the recursion coefficients, and the termination of the BOP expansion. We discuss the implementation of the calculations of forces, stresses and magnetic torques with analytic BOPs. We show the scaling of analytic BOP calculations with the number of atoms and moments, present options for speeding up the calculations and outline different concepts of parallelisation. In the appendix we compile the implemented equations of the analytic BOP methodology and comments on the implementation. This description should be relevant for other implementations and further developments of analytic bond-order potentials.

Full PDF

BBOPfox program for tight-binding and analytic bond-order potential calculations

T. Hammerschmidt,

1, 2

B. Seiser,

1, 2

M. E. Ford,

1, 2

A.N. Ladines, S. Schreiber, N. Wang, J. Jenke, Y. Lysogorskiy, C. Teijeiro, M. Mrovec, M. Cak, E. R. Margine,

2, 4

D. G. Pettifor, and R. Drautz

1, 2 Atomistic Modelling and Simulation, ICAMS, Ruhr-Universit¨at Bochum, D-44801 Bochum, Germany Department of Materials, University of Oxford, Parks Road, Oxford OX1 3PH, United Kingdom High-Performance Computing in Materials Science, ICAMS,Ruhr-Universit¨at Bochum, D-44801 Bochum, Germany Department of Physics, Applied Physics and Astronomy, Binghamton University,State University of New York, Vestal, New York 13850, USA

Bond-order potentials (BOPs) provide a local and physically transparent description of the inter-atomic interaction. Here we describe the eﬃcient implementation of analytic BOPs in the BOPfoxprogram and library. We discuss the integration of the underlying non-magnetic, collinear-magneticand noncollinear-magnetic tight-binding models that are evaluated by the analytic BOPs. We sum-marize the ﬂow of an analytic BOP calculation including the determination of self-returning pathsfor computing the moments, the self-consistency cycle, the estimation of the band-width from therecursion coeﬃcients, and the termination of the BOP expansion. We discuss the implementation ofthe calculations of forces, stresses and magnetic torques with analytic BOPs. We show the scalingof analytic BOP calculations with the number of atoms and moments, present options for speedingup the calculations and outline diﬀerent concepts of parallelisation. In the appendix we compile theimplemented equations of the analytic BOP methodology and comments on the implementation.This description should be relevant for other implementations and further developments of analyticbond-order potentials.

I. INTRODUCTION

A key requirement for reliable atomistic simulationsis a robust description of the interatomic interaction.Density-functional theory (DFT) calculations provide areliable treatment of the bond chemistry in many systemsbut the accessible length- and time-scales are limiteddue to the computational eﬀort. Larger systems and/orlonger time scales become accessible by coarse-grainingthe electronic structure in DFT to the tight-binding (TB)approximation and further on to the analytic bond-orderpotentials (BOPs) . This leads to a transparent and in-tuitive framework for modelling the interatomic interac-tion, including covalent bond formation, charge transferand magnetism.The analytic BOPs are closely related to the nu-merical BOPs as discussed in Refs. . Both have beenapplied in simulations of diﬀerent materials, see Ref. foran overview. Here we describe our implementation of an-alytic BOPs in the software package BOPfox . BOPfoxhas already been used in several publications andis being continuously extended and optimised. We pointout similarities of TB/BOP calculations and computa-tions carried out using other electronic structure meth-ods, and discuss the peculiarities of analytic BOPs indetail. This comprehensive description of the algorith-mic framework of analytic BOPs should be of use forother implementations and further developments of ana-lytic BOPs.In Sec. II we outline the program ﬂow of TB/BOP cal-culations in BOPfox. Section III is devoted to the discus-sion of the performance with regard to scaling, speed-upand parallelisation. The full set of equations that is eval-uated during an analytic BOP calculation is compiled in the appendix with details of the implementation andreferences to the original derivations. II. PROGRAM FLOWA. Overview

The typical ﬂow for computing the bond energy witha non-magnetic analytic BOP is sketched in Fig. 1 anddiscussed in detail in the following. The real-space BOPcalculations can easily be complemented by reciprocal-space TB calculations that employ the same Hamiltonianmatrix elements.

B. Input ﬁles

The initial stage of TB and BOP calculations inBOPfox is (i) reading the central control ﬁle ( infox.bx ),(ii) the speciﬁed structure ﬁle (default: structure.bx )and (iii) the speciﬁed model ﬁle with the TB/BOP pa-rameters (default: models.bx ). The presently avail-able TB/BOP models in BOPfox include parameters formagnetic calculations for Fe , Fe-C , for non-magnetic calculations for V , Cr , Nb , Mo ,Ta , W , Ir , Si-N , and a canonical d -bandmodel . The set of TB/BOP parametrisations availablein BOPfox is being constantly extended. a r X i v : . [ phy s i c s . c o m p - ph ] J u l FIG. 1: Overview of the calculation of the bond energy for anon-magnetic system with analytic BOPs in BOPfox.

C. Initialisation

Two neighbour-lists of the crystal structure are createdby setting up ghost cells and constructing cell linked-lists. The implementation scales linearly with the num-ber of atoms. The short-ranged neighbour-list is used forthe construction of the intersite matrix elements of theHamiltonian ( H iαjβ in Fig. 1), interference paths ( ξ ( n ) iαjβ in Fig. 1) and transfer paths ( T ( n,m ) iαjβ in Fig. 1). The sec-ond, long-range neighbour-list is used for the evaluationof the repulsive energy. D. Hamiltonian

For each pair of atoms, the Hamiltonian matrix ele-ments H iαjβ are constructed (Eq. A13) with the spec-iﬁed tight-binding model and rotated to the global co-ordinate system (Eq. A15). TB/BOP calculations tak-ing into account collinear or non-collinear magnetism useHamiltonians with spin-dependent onsite levels as givenin Eq. A17 and A18, respectively. The implementationof collinear magnetism in BOPfox uses a loop over the ↑ and ↓ spin channels. The calculations for the indi-vidual spin channels are very similar to non-magneticBOP calculations. The similar processes involved in non-collinear magnetic calculations, collinear magnetic cal-culations and non-magnetic calculations (see C) allowreuse of large portions of the code for each type of calcu-lation. Switching the implementation to non-collinearmagnetism is controlled by a preprocessor ﬂag in the Makefile that includes the relevant parts of the sourcecode.

E. DOS and Fermi energy

A key diﬀerence between the TB and BOP implemen-tations is the calculation of the local density of states(DOS) n iα ( E ): (i) In analytic BOP calculations the pair-wise H iαjβ are used to construct n iα ( E ) in real space asoutlined in B 1. (ii) In TB calculations the H iαjβ areused to generate a Hamiltonian with periodic boundaryconditions that is diagonalised in reciprocal space usingLAPACK routines .The local DOS n iα ( E ), whether obtained using TBcalculations in reciprocal space or using BOP calculationsin real space, is integrated up to the Fermi energy E F .The Fermi energy is determined by the bisection methodto match the sum of electrons in all orbitals with thetotal number of electrons in the system. F. Self-consistency

The onsite levels H iαiα are optimised in the self-consistency loop (Eq. A30 or Eq. A29) until the con-tributions to the binding energy (Eqs. A1- A11) and theforces (Eq.C14) can be computed. The self-consistencycondition in TB and BOP calculations is approached it-eratively. The onsite levels E ( n +1) iα of step n + 1 in theself-consistency loop are computed according to Eqs. A30and A29 from n ( n ) iα ( E ) that was obtained for the Hamil-tonian with onsite levels E ( n ) iα . With the new E ( n +1) iα , theHamiltonian is updated and the new n ( n +1) iα ( E ) is com-puted. In BOPfox, the input and output values of the on-site levels can be mixed (i) linearly, (ii) with the Broydenmethod , (iii) with the FIRE algorithm or (iv) withmolecular dynamics of onsite levels using a damped Ver-let algorithm. In all mixers, the self-consistency loop iscarried out until the speciﬁed convergence limit or max-imum number of steps is reached. The convergence ofthe diﬀerent mixers depends on the particular system athand, particularly for magnetic systems . G. Energy and force contributions

In TB, the bond energy is obtained by integrating thelocal electronic DOS n iα ( E ) of the eigenvalues, whichresult from diagonalisation of the Hamiltonian, with theMethfessel-Paxton scheme or the improved tetrahedronmethod . In analytic BOPs, the bond energy is deter-mined analytically from the local electronic DOS n iα ( E )and the Fermi energy E F , see A.In both TB and BOP calculations, the forces can beused for structural relaxation and MD simulations withinBOPfox. The current implementation includes severalrelaxation algorithms (e.g. damped MD, conjugate gra-dient , L-BFGS , FIRE ) as well as standard MDschemes (e.g. Verlet , velocity Verlet ). H. BOPfox as library: BOPlib

BOPfox provides an application programming inter-face (API) for communication with external software.The API takes the system conﬁguration (species, posi-tions, onsite levels, etc.) as arguments, starts a TB/BOPcalculation and returns atomic binding energies, forces,stresses and torques. The combination of API andBOPfox subroutines can be compiled to a static or dy-namic library called BOPlib. With BOPlib the TB/BOP

FIG. 2: Combination of BOPfox with ASE , openKIM andLAMMPS by the BOPlib API. calculations can be fully integrated with other externalsoftware as sketched in Fig. 2. In particular, BOPfox can be addressed from ASE as calculator with ei-ther BOPfox as system call or BOPlib as linked library.BOPlib can also be conﬁgured as KIM model to be linkedto openKIM and as pair style potential to be linkedwith LAMMPS . III. PERFORMANCEA. Scalability

The computational eﬀort of energy and force calcu-lations with analytic BOPs is largely dominated by theevaluation of interference paths (Eq. B19) and transferpaths (Eq. C6). The theoretical scalability of the compu-tational eﬀort with respect to the number of atoms andthe number of moments is discussed in a detailed com-plexity analysis and systematic benchmarks in Ref. .For typical choices of the number of moments, the com-plexity of the calculations increases with the number ofmoments to the power of approximately 4.5. The im-plementation of analytic BOPs in BOPfox reaches thistheoretical scaling limit . The increase in the computa-tional eﬀort with the number of atoms is linear (Fig. 3)due to the use of linear-scaling linked-cell lists and thelocality of the BOP expansion. FIG. 3: Linear scaling of the execution time with the numberof atoms in the analytic BOP simulations. The dashed lineindicates a linear ﬁt of the data points. Technical details ofthe benchmark are given in Ref. . B. Speed-ups

BOPfox provides several options to accelerate the en-ergy and force calculations with analytic BOPs:(i) The interference paths that are determined to eval-uate the moments of the DOS are also needed to computethe bond-order type term ˜Θ iανjβµ for the self-consistency(Eq. A29) and the forces (Eq. C14). An obvious approachto improve the computational speed is therefore to storethe interference paths. The resulting increase in memorylimits this optimisation to moderate system sizes.(ii) The self-consistency cycle involves the modiﬁca-tion of onsite levels E iα which necessitates the repeatedcomputation of new interference paths (Eq. B25). Thiscan hardly be avoided. However, small changes in the lo-cal atomic structure typically lead to only small changesin the self-consistent onsite levels. Hence for relaxationsand MD simulations, the computation time can be re-duced by initializing the onsite levels to the values of theprevious step. For typical step sizes of relaxations or MDsimulations, this leads to signiﬁcant speed-ups in succes-sive self-consistent energy or force evaluations as fewerself-consistency steps need to be carried out.(iii) In many cases the interatomic interaction is dom-inated by the inﬂuence of the local environment of agiven atom rather than eﬀects due to atoms located fur-ther away. In the BOP framework, this expected short-sightedness of the interaction corresponds to a greaterimportance of the interference paths which sample thenearby environment as compared to those that reach outto more distant atoms. A straight-forward improvementin performance is, therefore, to introduce a maximum ra-dius for the interference paths. In this way the immedi-ate neighbourhood is fully sampled, while the paths thatreach beyond a speciﬁed maximum radius are neglected.This introduces an additional level of approximation. C. Parallelisation

The computation of forces and energies using analyticBOPs is perfectly suited for parallel execution. BOPfoxprovides diﬀerent concepts of parallelisation. Here weprovide only an overview, the details and performanceanalysis are discussed in detail in the respective refer-ences given below. Switching between diﬀerent paralleli-sations is performed during compilation time with pre-processor ﬂags.(i) The shared-memory parallelisation based onOpenMP provides a straight-forward parallelisation ofthe loops for computing the interference paths (Eqs. B26-B28) and the transfer matrices (Eqs. C6-C9). In this im-plementation all operations make use of the same arrayswhich are allocated for the whole simulation cell. There-fore the maximum size of the simulation cell is limitedby the available memory.(ii) The shared-memory parallelisation based on MPIuses a TODO list of operations that is distributed todiﬀerent threads. As for the shared-memory OpenMPparallelisation, the working arrays are allocated for thewhole simulation cell, which leads to a memory limita-tion. This parallelisation approach is also suitable andimplemented for GPU processing.(iii) The distributed-memory parallelisation based onMPI performs a domain decomposition of the simulationcell and thereby reduces the memory required per thread of the parallel execution. This implementation was opti-mised to reduce communication and to avoid redundantoperations due to the overlap of interference-paths calcu-lations in the distributed domains. The implementation FIG. 4: Strong scaling of execution time with the numberof processes for ﬁxed system size (top) and weak scaling ofexecution time with the number of processes for ﬁxed sizeof individual processes (bottom). The dashed lines indicateideal strong scaling and ideal weak scaling. Technical detailsof the benchmark are given in Ref. . in BOPfox reaches excellent strong scaling (Fig. 4, top),i.e. a linear decrease of the computation time for a ﬁxedsystem size with the number of processes. At the sametime it also shows excellent weak scaling (Fig. 4, bottom),i.e. a constant execution time for increasing system sizeat a constant number of atoms per process.(iv) The hybrid parallelisation is a combination ofshared-memory and distributed-memory parallelisationthat was developed to make use of the multi-core CPUarchitectures and multi-threading-capabilities of modernsupercomputers. Here, the system is decomposed intodomains that are distributed to diﬀerent nodes usingMPI. On each node the operations are then carried outon the same memory using OpenMP. IV. CONCLUSIONS

Analytic BOPs provide a local and physically trans-parent description of the interatomic interaction. TheBOPfox program package provides an implementation ofanalytic BOPs for non-magnetic, collinear-magnetic andnoncollinear-magnetic calculations. It computes analyticforces, stresses and magnetic torques. For completeness,we compiled the implemented equations of the analyticBOPs with references to the original publications andcomments on the implementation in the appendix. Thiscomprehensive description of the algorithmic frameworkshould prove beneﬁcial for a broader community of usersand developers of analytic BOPs.The implementation is highly eﬃcient and provideslinear scaling of the computation time for energies andforces with the number of atoms. The diﬀerent paral-lelisations make it possible to run the calculations withoptimum use of the hardware resources for a given prob-lem size. The program can be compiled as standaloneprogram or as library with an API for linking with anexternal software.

Acknowledgements

We wish to dedicate this paper to the memory of ourcoauthor Professor David G. Pettifor CBE FRS, whosadly passed away before the work was completed. Weare grateful to Ting Qin, Paul Kamenski, Johnny Drain,Jan Gehrman, Aleksey Kolmogorov, Thomas Schablitzki,Martin Staadt and Jutta Rogal for discussions and for their feedback using BOPfox. TH, BS, RD, and DGPacknowledge funding from the Engineering and Physi-cal Sciences Research Council (EPSRC) of the UnitedKingdom through the project

Alloys by Design . AL, THand RD acknowledge ﬁnancial support by the GermanResearch Foundation (DFG) through research grant HA6047/4-1 and project C1 and C2 of the collaborative re-search centre SFB/TR 103. SS, MC, TH and RD ac-knowledge funding through the project

Damage TolerantMicrostructures in Steel by thyssenkrupp Steel EuropeAG and Benteler Steel Tube GmbH. MEF acknowledgesfunding from the EPSRC through a University of Ox-ford, Department of Materials Doctoral Training Award(DTA). CT acknowledges funding from thyssenkruppSteel Europe AG through the HPC group. MC acknowl-edges ﬁnancial support by the DFG through researchgrant CA 1553/1-1. E.R.M. acknowledges the NSF sup-port (Award No. OAC-1740263). Part of the work ofNW and AL was carried out in the framework of theInternational Max-Planck Research School SurMat. D. G. Pettifor, Bonding and Structure of Molecules andSolids, Oxford Science Publications, 1995. R. Drautz, D. G. Pettifor, Valence-dependent analyticbond-order potential for transition metals, Phys. Rev. B74 (2006) 174117. M. W. Finnis, Bond-order potentials through the ages,Prog. Mat. Sci. 52 (2007) 133. R. Drautz, D. G. Pettifor, Valence-dependent analyticbond-order potential for magnetic transition metals, Phys.Rev. B 84 (2011) 214114. R. Drautz, T. Hammerschmidt, M. Cak, D. G. Pettifor,Bond-order potentials: Derivation and parameterizationfor refractory elements, Mod. Sim. Mat. Sci. Eng. 23 (2015)074004. A. Horsﬁeld, A. M. Bratkovsky, M. Fearn, D. G. Pettifor,M. Aoki, Bond-order potentials: Theory and implementa-tion, Phys. Rev. B 53 (1996) 12694. T. Hammerschmidt, R. Drautz, Bond-order potentials forbridging the electronic to atomistic modelling hierarchies,in: J. Grotendorst, N. Attig, S. Bl¨ugel, D. Marx (Eds.),NIC Series 42 - Multiscale Simulation Methods in Molec-ular Science, J¨ulich Supercomputing Centre, 2009, p. 229. M. Cak, T. Hammerschmidt, R. Drautz, Comparison ofanalytic and numerical bond-order potentials for W andMo, J. Phys.: Cond. Mat. 25 (2013) 265002. T. Hammerschmidt, R. Drautz, D. G. Pettifor, Atomisticmodelling of materials with bond-order potentials, Int. J.Mat. Sci. 100 (2009) 1479. T. Hammerschmidt, B. Seiser, R. Drautz, D. G. Petti-for, Modelling topologically close-packed phases in super-alloys: Valence-dependent bond-order potentials based onab-initio calculations, in: R. C. Reed, K. Green, P. Caron,T. Gabb, M. Fahrmann, E. Huron, S. Woodward (Eds.),Superalloys 2008, The Metals, Minerals and Materials So-ciety, 2008, p. 847. Y. Chen, A. N. Kolmogorov, D. G. Pettifor, J.-X. Shang, Y. Zhang, Theoretical analysis of structural stability ofTM Si transition metal silicides, Phys. Rev. B 82 (2010)184104. B. Seiser, T. Hammerschmidt, A. N. Kolmogorov,R. Drautz, D. G. Pettifor, Theory of structural trendswithin 4d and 5d transition metals topologically close-packed phases, Phys. Rev. B 83 (2011) 224116. T. Hammerschmidt, G. K. H. Madsen, J. Rogal, R. Drautz,From electrons to materials, Phys. Stat. Sol. B 248 (2011)2213. T. Hammerschmidt, B. Seiser, M. Cak, R. Drautz, D. G.Pettifor, Structural trends of topologically close-packedphases: Understanding experimental trends in terms of theelectronic structure, in: E. S. Huron, R. C. Reed, M. C.Hardy, M. J. Mills, R. E. Montero, P. D. Portella, J. Teles-man (Eds.), Superalloys 2012, The Metals, Minerals andMaterials Society, 2012, p. 135. T. Schablizki, J. Rogal, R. Drautz, Topological ﬁngerprintsfor intermetallic compounds for the automated classiﬁca-tion of atomistic simulation data, Mod. Sim. Mat. Sci. Eng.21 (2013) 0755008. M. Cak, T. Hammerschmidt, J. Rogal, V. Vitek,R. Drautz, Analytic bond-order potentials for the bcc re-fractory metals Nb, Ta, Mo and W, J. Phys.: Cond. Mat.26 (2013) 195501. J. F. Drain, R. Drautz, D. G. Pettifor, Magnetic analyticbond-order potential for modeling the diﬀerent phases ofMn at zero Kelvin, Phys. Rev. B 89 (2014) 134102. M. Ford, R. Drautz, T. Hammerschmidt, D. G. Petti-for, Convergence of an analytic bond-order potential forcollinear magnetism in Fe, Modelling Simul. Mater. Sci.Eng. 22 (2014) 034005. C. Teijeiro, T. Hammerschmidt, R. Drautz, G. Sutmann,Parallel bond order potentials for materials science simula-tions, in: P. Iv´anyi, B. Topping (Eds.), Proceedings of theFourth International Conference on Parallel, Distributed,Grid and Cloud Computing for Engineering, Civil-Comp

Press, Edinburgh, UK, 2015. M. Ford, R. Drautz, D. G. Pettifor, Non-collinear mag-netism with analytic bond-order potentials, J. Phys.:Cond. Mat. 27 (2015) 086002. T. Hammerschmidt, A. Ladines, J. Koßmann, R. Drautz,Crystal-structure analysis with moments of the density-of-states: Application to intermetallic topologically close-packed phases, Crystals 6 (2016) 18. C. Teijeiro, T. Hammerschmidt, B. Seiser, R. Drautz,G. Sutmann, Complexity analysis of simulations with an-alytic bond-order potentials, Mod. Sim. Mat. Sci. Eng. 24(2016) 025008. C. Teijeiro, T. Hammerschmidt, R. Drautz, G. Sutmann,Eﬃcient parallelisation of analytic bond-order potentialsfor large atomistic simulations, Comp. Phys. Comm. 204(2016) 64. C. Teijeiro, T. Hammerschmidt, R. Drautz, G. Sutmann,Optimized parallel simulations of analytic bond-order po-tentials on hybrid shared/distributed memory with MPIand OpenMP, Int. J. High Perf. Comp. App. (in print:https://doi.org/10.1177/1094342017727060). J. Jenke, A. Subramanyam, M. Densow, T. Hammer-schmidt, D. G. Pettifor, R. Drautz, Chemistry informedstructure map for measuring similarity of atomic environ-ments, Phys. Rev. B, under review. M. Mrovec, D. Nguyen-Manh, C. Els¨asser, P. Gumbsch,Magnetic bond-order potential for iron, Phys. Rev. Lett.106 (2011) 246302. G. K. H. Madsen, E. McEniry, R. Drautz, Optimized or-thogonal tight-binding basis: Application to iron, Phys.Rev. B 83 (2011) 184119. N. Hatcher, G. K. H. Madsen, R. Drautz, DFT-based tight-binding modeling of iron-carbon, Phys. Rev. B 86 (2012)155115. S. Schreiber, M. Cak, T. Hammerschmidt, R. Drautz, inpreparation. Y.-S. Lin, M. Mrovec, V. Vitek, A new method for devel-opment of bond-order potentials for transition bcc metals,Mod. Sim. Mat. Sci. Eng. 22 (2014) 034022. M. Mrovec, R. Gr¨oger, A. G. Bailey, D. Nguyen-Manh,C. Els¨asser, V. Vitek, Bond-order potential for simulationsof extended defects in tungsten, Phys. Rev. B 75 (2007)104119. M. J. Cawkwell, D. Nguyen-Manh, D. G. Pettifor,V. Vitek, Construction, assessment and application ofbond-order potential for iridium, Phys. Rev. B 73 (2006)064104. J. Gehrmann, D. G. Pettifor, A. Kolmogorov, M. Reese,M. Mrovec, C. Els¨asser, R. Drautz, Reduced tight-bindingmodels for elemental Si and N, and ordered binary Si-Nsystems, Phys. Rev. B 91 (2015) 054109. O. K. Andersen, W. Klose, H. Nohl, Electronic structureof Chevrel-phase high-critical-ﬁeld superconductors, Phys.Rev. B 17 (1978) 1209. E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel,J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling,A. McKenney, D. Sorensen, LAPACK Users’ Guide, 3rdEdition, Society for Industrial and Applied Mathematics,Philadelphia, PA, 1999. C. G. Broyden, A class of methods for solving nonlinearsimultaneous equations, Math. Comp. 19 (1965) 577. E. Bitzek, P. Koskinen, F. G¨ahler, M. Moseler, P. Gumb-sch, Structral relaxation made simple, Phys. Rev. Lett. 97(2006) 170201. P. Soin, A. Horsﬁeld, D. Nguyen-Manh, Eﬃcient self-consistency for magnetic tight-binding, Comp. Phys.Comm. 182 (2011) 1350. M. Methfessel, A. Paxton, High-precision sampling forBrillouin-zone integration in metals, Phys. Rev. B 40(1989) 3616. P. E. Bl¨ochl, O. Jepsen, O. K. Andersen, Improved tetra-hedron method for Brillouin-zone integrations, Phys. Rev.B 49 (1994) 16223. J. Gilbert, J. Nocedal, Global convergence properties ofconjugate gradient methods, SIAM J. Optimization 2(1992) 21. C. Zhu, R. Byrd, P. Lu, J. Nocedal, L-BFGS-B: A limitedmemory FORTRAN code for solving bound constrainedoptimization problems, Tech. Report, NAM-11, EECS De-partment, Northwestern University. R. Byrd, P. Lu, J. Nocedal, C. Zhu, A limited memoryalgorithm for bound constrained optimization, SIAM J.Sci. Comp. 16 (1995) 16 (1995) 1190. L. Verlet, Computer experiments on classical ﬂuids. I.Thermodynamical properties of lennardjones molecules,Phys. Rev. 159 (1967) 98. W. Swope, H. Andersen, P. Berens, K. Wilson, A computersimulation method for the calculation of equilibrium con-stants for the formation of physical clusters of molecules:Application to small water clusters, J. Chem. Phys. 76(1982) 648. S. Bahn, K. Jacobsen, An object-oriented scripting inter-face to a legacy electronic structure code, Comput. Sci.Eng. 4 (2002) 56. E. Tadmor, R. Elliott, J. Sethna, R. Miller, C. Becker, Thepotential of atomistic simulations and the knowledgebaseof interatomic models, JOM 63 (2011) 17. S. Plimpton, Fast parallel algorithms for short-rangemolecular dynamics, J. Comp. Phys. 117 (1995) 1. A. P. Sutton, M. W. Finnis, D. G. Pettifor, Y. Ohta, Thetight-binding bond model, J. Phys. C 21 (1988) 35. E. Margine, D. G. Pettifor, Competition between crystal-ﬁeld, overlap, and three-center contributions in h n eigen-spectra, Phys. Rev. B 89 (2014) 235134. J. Hubbard, Electron correlations in narrow energy bands,Proc. R. Soc. A 276 (1963) 238. L. Goodwin, A. J. Skinner, D. G. Pettifor, Generatingtransferable tight-binding parameters - Application to sil-icon, Europhys. Lett. 9 (1989) 701. E. Stoner, Collective electron ferromagnetism, Proc. R.Soc. A 169 (1939) 339. J. K¨ubler, K.-H. H¨ock, J. Sticht, A. Williams, Densityfunctional theory of non-collinear magnetism, J. Phys. F:Met. Phys. 18 (1988) 469. D. Nguyen-Manh, D. G. Pettifor, V. Vitek, Analyticenvironment-dependent tight-binding bond-integrals: Ap-plication to MoSi , Phys. Rev. Lett. 85 (2000) 4136. P.-O. L¨owdin, On the nonorthogonality problem connectedwith the use of atomic wave functions in the theory ofmolecules and crystals, J. Chem. Phys. 18 (1950) 365. B. Seiser, D. G. Pettifor, R. Drautz, Analytic bond-orderpotential expansion of recursion-based methods, Phys.Rev. B 87 (2013) 094105. R. Haydock, Recursive solution of the Schr¨odinger equa-tion, Comp. Phys. Comm. 20 (1980) 11. R. Haydock, V. Heine, M. J. Kelly, Electronic structurebased on the local atomic environmentfor tight-bindingbands, J. Phys. C: Sol. Stat. Phys. 5 (1972) 2845. C. Lanczos, An iteration method for the solution of theeigenvalue problem of linear diﬀerential and integral oper-ators, J. Res. Natl. Bur. Stand. 45 (1950) 225. R. Haydock, V. Heine, M. Kelly, Electronic structure basedon the local atomic environment for tight-binding bands :Ii, J. Phys. C: Solid State Phys. 8 (1975) 2591. P. E. A. Turchi, F. Ducastelle, G. Treglia, Band gaps andasymptotic behaviour of continued fraction coeﬃcients, J.Phys. C: Solid State Phys. 15 (1982) 2891. F. Cryot-Lackmann, On the electronic structure of liquidtransition metals, Adv. Phys. 16 (1967) 393. P. E. A. Turchi, Interplay between local environment eﬀectand electronic structure properties in close packed struc-tures, Mat. Res. Soc. Symp. Proc. 206 (1991) 265. M. Aoki, Rapidly convergent bond order expansion foratomistic simulations, Phys. Rev. Lett. 71 (1993) 3842. R. Haydock, Recursive solution of Schr¨odinger’s equation,Sol. Stat. Phys. 35 (1980) 215. A. P. Horsﬁeld, A computationally eﬃcient diﬀerentiabletight-binding energy functional, Mater. Sci. Eng. B 37(1996) 219. A. Weiße, G. Wellein, A. Alvermann, H. Fehske, The kernelpolynomial method, Rev. Mod. Phys. 78 (2006) 275. R. Haydock, R. Johannes, The electronic structure of tran-sition metal laves phases, J. Phys. F: Met. Phys. 5 (1975)2055. N. Beer, D. G. Pettifor, The recursion method and theestimation of local densities of states, in: P. Phariseau,W. M. Temmermann (Eds.), The Electronic Structure ofComplex Systems, Plenum Press, New York, 1984, p. 769. S. A. Gerschogorin, ¨Uber die Abgrenzung der Eigenwerteeiner Matrix, Bulletin der L’Acad´emie des Sciences del’URSS 6 (1931) 749. H. Hellmann, Einf¨uhrung in die Quantenchemie, Deuticke,Leipzig, 1937. R. P. Feynman, Forces in molecules, Phys. Rev. 56 (1939)340. T. Gilbert, A phenomenological theory of damping in fer-romagnetic materials, IEEE Trans. Mag. 40 (2004) 3433.

Appendix A: Binding energy in TB and BOP1. Energy contributions

The TB and BOP calculations within BOPfox arebased on the TB bond model that can be obtainedas a second-order expansion of the DFT energy . In theabsence of external ﬁelds the total binding energy is givenby U B = U bond + U prom + U ion + U es + U rep + U X . (A1)The covalent bond energy U bond summarizes the energythat originates from the formation of chemical bonds be-tween the atoms. Its onsite representation U bond = (cid:88) iαν E F (cid:90) ( E − E iαν ) n iαν ( E )d E (A2)is the integral of the local electronic DOS n iαν ( E ) upto the Fermi energy E F for each orbital α and spin ν of atom i with onsite level E iαν . The equivalent intersiterepresentation U bond = iαν (cid:54) = jβµ (cid:88) iανjβµ β iανjβµ n jβµiαν (A3)is expressed in terms of the density-matrix elements n iανjβµ (Eq. B20) that are identical to the bond orderΘ iανjβµ ( φ F ) aside from a factor of two for non-magneticsystems. The bond integrals β iανjβµ = H iανjβµ −

12 ( E iαν + E jβµ ) S iανjβµ (A4)include the Hamiltonian matrix elements H iανjβµ andoverlap matrix elements S iανjβµ . The promotion energy U prom accounts for the redistribution of electrons acrossorbitals upon bond formation. It is given by U prom = (cid:88) iαν E (0) iαν (cid:16) N iαν − N (0) iαν (cid:17) (A5)with (0) indicating the non-magnetic free atom as refer-ence and the number of electrons N iαν = E F (cid:90) n iαν ( E )d E . (A6)The deviation from charge-neutral atoms upon bond for-mation leads to charges q iαν = N iαν − N (0) iαν . (A7)The energies associated with charge redistribution areapproximated to depend only on the total atomic charge q i = (cid:88) αν q iαν . (A8)The energy to charge an atom is given by the onsite ionicenergy U ion = ¯ E i q i + 12 (cid:88) i J ii q i (A9)that is determined by the electronegativity ¯ E i and theresistance against charge transfer J ii that is related to theHubbard U . The energy ¯ E i q i is obtained by a weightedaverage of the reference onsite levels ¯ E i = (cid:88) α E (0) iα ∆ q iα (A10)where ∆ q iα is the amount of charge which is gained orlost by orbital iα due to minimization of the bindingenergy U B . The interaction of the charged atoms is givenby the intersite electrostatic energy U es = 12 i (cid:54) = j (cid:88) ij J ij q i q j (A11)with the Coulomb parameter J ij . The repulsive energy U rep includes all further terms of the second-order expan-sion of DFT and is usually parametrised by empiricalfunctions. The exchange energy U X due to magnetism isapproximated by the typically dominating onsite contri-butions U X = − (cid:88) i I i m i (A12)with m i the magnetic moment and I i the Stoner exchangeparameter of atom i . The preparation energy (Eq. 92 inRef. ) vanishes in an unscreened calculation. Furthercontributions to the energy due to external magnetic orelectric ﬁelds can be included .

2. Hamiltonian a. Construction

For each interacting pair of atoms i and j with orbitals α and β , the structure of the pairwise Hamiltonian H ( b ) ij in the coordinate system of the bond is given by H ( b ) ij = js jp jdisipid  σ σ σ σ σ σ π π π π σ σ σ π π π π δ

00 0 0 0 0 0 0 0 δ  (A13)for the general case of an spd -valent atom i interactingwith an spd -valent atom j . The superscript ( b ) indicatesthe coordinate system of the bond aligned along the z axis with ordering of the p and d orbitals as p z , p x , p y and d z − r , d zx , d yz , d x − y , d xy , respectively. The values ofthe matrix elements σ and π diﬀer in general for diﬀer-ent combinations of orbitals (e.g. σ ( is, js ) (cid:54) = σ ( ip, jp ))and atoms (e.g. σ ( ip, jd ) (cid:54) = σ ( id, jp )). For combina-tions of atoms with fewer types of valence orbitals, theHamiltonian reduces accordingly. The values of the ma-trix elements H ( b ) iαjβ are determined for the interatomicdistance r ij = | r ij | = | r i − r j | from the values of thedistance-dependent bond integrals β iαjβ ( r ij ). The func-tional form of β iαjβ ( r ij ) depends on the speciﬁc TB/BOPmodel and is, for example, power-law, exponential, orGoodwin-Skinner-Pettifor type. The interaction rangecan be smoothly forced to zero at r cut by multiplicationof β iαjβ ( r ij ) with a cosine function f cut ( r ij ) = 12 (cid:18) cos (cid:18) π (cid:20) r ij − ( r cut − d cut ) d cut (cid:21)(cid:19) + 1 (cid:19) (A14) for r cut - d cut ≤ r ij ≤ r cut . For each bond, the pairwiseHamiltonian initialised in the bond coordinate system isrotated to the global coordinate system H ij = R ( θ ij , φ ij ) H ( b ) ij ( r ij ) R ( θ ij , φ ij ) T (A15)using rotation matrices R ( θ ij , φ ij ) with polar and az-imuthal angles θ ij and φ ij determined from the orien-tation of the bond r ij in the global coordinate system(see D). b. Magnetism Magnetism enters the Hamiltonian H iαµjβν via the ex-plicit spin-dependence of the onsite levels E iαµν . Thespin indices µ and ν span the four quadrants of neigh-bouring electron spin ↑↑ , ↑↓ , ↓↑ and ↓↓ . The globalonsite-level matrix of orbitals α of atom i E iα = (cid:32) E iα ↑↑ E iα ↑↓ E iα ↓↑ E iα ↓↓ (cid:33) (A16)with onsite levels E iαµν = H iαµiαν (A17)= H (0) iαiα δ µν + B i · σ µν − I i m i · σ µν + J i q i depends on the non-magnetic onsite levels H (0) iαiα , anyexternal magnetic ﬁeld B i , the Pauli matrices σ µν , theStoner exchange integral I i and the charge q i .In the case of collinear magnetism with identical axisof spin quantization for all atoms the magnetic momentsare parallel or antiparallel to one another. In this casethe global magnetic moment direction can be taken tolie along the z-axis of the unit cell. Then the ↑↓ and ↓↑ modiﬁcations to H ( b ) iαjβ vanish and the global onsite-levelmatrix takes a diagonal form with decoupled ↑↑ and ↓↓ modiﬁcations. Therefore, we may use separate ↑ and ↓ spin channels ν with onsite elements E iαν = H (0) iαiα − ( − ν B z + 12 ( − ν I i m i + J i q i (A18)for a magnetic moment of m i = (cid:88) α ( N iα ↑ − N iα ↓ ) . (A19)In the case of non-collinear magnetism , the axis ofspin quantization is diﬀerent for diﬀerent atoms i . How-ever, with a unitary transformation U iα , the diagonalform of E iα can be enforced E (local) iα = U iα E iα U † iα = (cid:32) E ↑ (local) iα E ↓ (local) iα (cid:33) (A20)by a rotation into a local coordinate system that is ori-ented along the local magnetic moment. The transfor-mation matrix U iα is deﬁned in terms of the angle α between the z direction in the global space, s z , and thedirection of the local magnetic moment, s iα cos( α ) = s z · s iα (A21)and a vector n iα that is orthogonal to s z and s iα . Acomputationally convenient way to express the transfor-mation matrix is U iα = cos (cid:16) α (cid:17) − i ( σ · n iα ) sin (cid:16) α (cid:17) (A22)with the identity matrix and the vector of Pauli spinmatrices σ . c. Screening The analytic BOP calculations in BOPfox employ or-thogonal TB models that can be obtained by approx-imate transformations of non-orthogonal TB models.This transformation leads to an environment dependencyof the bond integrals β iαjβ ( r ij ) in the orthogonal TBmodel in terms of screening by environment atoms k with orbitals γ .The transformation to an orthogonal basis is achievedby a L¨owdin transformation ˜ H iαjβ = S − / iαkγ H kγlδ S − / lδjβ . (A23)The diagonal elements of the overlap matrix are one, itcan therefore be written as S iαjβ = δ iαjβ + O iαjβ . (A24)where O ij = S ij for i (cid:54) = j and zero otherwise. Similarly,we can write S − / iαjβ = δ iαjβ − S iαjβ . (A25)The screened orthogonal Hamiltonian matrix elementsare given as ˜ H (0) iαjβ = H (0) iαjβ − (cid:16) H (0) iαkγ S kγjβ + S iαkγ H (0) kγjβ (cid:17) + 14 S iαkγ H (0) kγlδ S lδjβ (A26)where the bond between atoms i and j is screened byatom k . The matrices O iαjβ are constructed analogouslyto the Hamiltonian (Eq. A13) with pairwise distance-dependent parametrisations. In BOPfox, the screeningis implemented up to the linear term in S , while S isapproximated to ﬁrst order as S iαjβ = O iαjβ . (A27)

3. Self-consistency

The onsite levels E iα of the diﬀerent atoms i in thesystem are optimised in a self-consistency loop in orderto minimise the binding energy (Eq. A1). The targetquantity ∆ SCFiα that is minimised with respect to onsitelevels , deﬁned by ∂U B ∂E iα = ∆ SCFiα → SCFiα = ˜Θ iαiα − N iα = (cid:88) m Ξ ( m − ,m ) iαiα − N iα . (A29)The bond-order like term Ξ ( m − ,m ) iαiα that includes gra-dients of the moments with respect to onsite levels isexplained in detail in C.The corresponding minimisation target for TB calcu-lations can be written as∆ SCFiα = E iα −  E (0) iα + (cid:88) jβ J iαjβ q j  (A30)= ∆ E iα − (cid:88) jβ J iαjβ q j . Local-charge neutrality can be enforced by the alter-native target quantity∆

SCFiα = N (0) iα − N iα (A31)or, implicitly, by large values of J iαiα .For non-collinear magnetism , the gradient of thebinding energy with respect to local onsite levels E (local) iαν (Eq. A20), i.e., ∂U B ∂E (local) iαν = ˜ Θ (local) iανiαν − N iαν (A32)involves the unitary transformation ˜ Θ (local) iανiαν = U iαν ˜ Θ iανiαν U † iαν (A33) Appendix B: Bond energy in analytic BOPs1. Density of states

In analytic BOPs, the local density of states n iα ( (cid:15) )required for the calculation of the bond energy (Eq. A2), n iα ( (cid:15) ) = 2 π (cid:112) − (cid:15) (cid:88) m g m σ ( m ) iα P m ( (cid:15) ) (B1)is determined analytically using Chebyshev poly-nomials of the second kind P m ( (cid:15) ) (see B 2), structure-dependent expansion coeﬃcients σ ( m ) iα (see B 3), and0damping factors g m (see B 4). The expansion of the DOSis based on a transformation of the Hamiltonian to atridiagonal form (cid:104) u n | ˆ H | u m (cid:105) =  a (0) b (1) b (1) a (1) b (2) b (2) a (2) b (3) b (3) a (3) . . .. . . . . . . . .. . . . . .  with all other entries identical to zero. This Hamilto-nian corresponds to a one-dimensional chain with onlynearest-neighbour matrix elements, see Fig. 1. that can FIG. 1: Graphical representation of the recursion Hamilto-nian as a one-dimensional chain: the Lanczos chain. be solved by recursion using the Lanczos algorithm to obtain the local DOS n iα ( E ) = − π Im 1 E − a (0) iα − b (1) iα E − a (1) iα − b (2) iα . . . (B2)in terms of the recursion coeﬃcients a ( m ) iα and b ( m ) iα . Inpractice, the recursion is terminated at some level n bymaking assumptions for the values of a ( m ) iα and b ( m ) iα for m > n . This corresponds to taking the energy calculationto a local scheme which requires convergence with respectto n . In BOPfox the required recursion coeﬃcients a ( m ) iα and b ( m ) iα for m > n can be taken (i) as constant, (ii) asweighted average and (iii) as oscillating.Taking the recursion coeﬃcients as constant values a ( m ) iα = a ( ∞ ) iα , b ( m ) iα = b ( ∞ ) iα for m > n (B3)corresponds to the so-called square-root terminator asthe tail of the continued fraction can then be given an-alytically as a square-root function . The diﬀerent ap-proaches to obtain the values of the asymptotic recursioncoeﬃcients a ( ∞ ) iα and b ( ∞ ) iα in BOPfox are summarizedin B 5. Taking a ( m ) iα and b ( m ) iα for m > n as weighted averages over m maxrec recursion levels a (approx) iα = m maxrec (cid:80) m =0 w m a ( m ) iαm maxrec (cid:80) m =0 w m , b (approx) iα = m maxrec (cid:80) m =1 w m b ( m ) iαm maxrec (cid:80) m =1 w m (B4)with w m = 1 / [ β ( m maxrec − m ) + 1] can provide smootherconvergence for values of β ≥ for a ( m ) iα and b ( m ) iα can be chosen totreat, e.g., systems with band-gaps .

2. Chebyshev polynomials

The Chebyshev polynomials of the second kind inEq. B1 are expressed as P m ( (cid:15) ) = m (cid:88) n =0 p mn (cid:15) n (B5)with p ( m +1) n = 2 p m ( n − − p ( m − n (B6)(unless n < n > m when p mn = 0). They presentthe basis of the expansion of n iα (Eq. B1) . The valuesof P m ( (cid:15) ) are computed iteratively P m +1 ( (cid:15) ) = 2 (cid:15)P m ( (cid:15) ) − P m − ( (cid:15) ) (B7)with P = 1 and P = 2 (cid:15) . The phase (cid:15) = − cos φ (B8)transforms the Chebyshev polynomials P m ( (cid:15) ) = sin( m + 1) φ sin φ (B9)to sine functions with a corresponding DOS n iα ( (cid:15) ) = (cid:88) m g m σ ( m ) iα sin( m + 1) φ . (B10)This expression can be integrated to provide analytic ex-pressions for the bond energy of orbital α of atom i , U bond ,iα = b ( ∞ ) iα (cid:88) m g m σ ( m ) iα [ ˆ χ m +2 ( φ F ) − γ ˆ χ m +1 ( φ F ) + ˆ χ m ( φ F )] , (B11)and the number of electrons N iα ( φ F ) = (cid:88) m g m σ ( m ) iα ˆ χ m +1 ( φ F ) . (B12)The structure-independent response functionsˆ χ ( φ F ) = 0 (B13)1ˆ χ ( φ F ) = 1 − φ F π + 12 π sin (2 φ F ) (B14)ˆ χ m ( φ F ) = 1 π (cid:20) sin( m + 1) φ F m + 1 − sin( m − φ F m − (cid:21) (B15)with the Fermi phasecos φ F = E F − a ( ∞ ) iα b ( ∞ ) iα (B16)correspond to a weighting of the contribution of thestructure-dependent expansion coeﬃcients σ ( m ) iα to thebond energy.

3. Expansion coeﬃcients and moments

The expansion coeﬃcients σ ( m ) iα = m (cid:88) n =0 p mn ˆ µ ( n ) iα (B17)in Eq. B1 carry the information on the atomic structurein the normalised moments ˆ µ ( n ) iα = 1 (cid:16) b ( ∞ ) iα (cid:17) n n (cid:88) l =0 (cid:32) nl (cid:33) ( − l a ( ∞ ) iα l µ ( n − l ) iα (B18)with terminator coeﬃcients a ( ∞ ) iα and b ( ∞ ) iα of orbital α ofatom i . The moments provide the direct link between theelectronic structure, n iα ( E ), and the atomic structure bythe moments theorem µ ( n ) iα = (cid:90) E n n iα ( E ) dE = (cid:104) iα | ˆ H n | iα (cid:105) (B19)= (cid:88) j β ...j n − β n − H iαj β H j β j β ...H j n − β n − iα This link is schematically illustrated in Fig. 2 for the sec-ond, third and fourth moment: The self-returning pathsof length two, three and four in the atomic structuresare linked to the root mean square (RMS) width, theskewness and the bimodality of the electronic DOS, re-spectively.In the intersite representation (Eq. A3), the informa-tion on the individual bonds is contained in the bondorder Θ iαjβ ( (cid:15) ) or the density matrix n iαjβ ( (cid:15) ) that canbe expressed in terms of Chebyshev polynomials Θ iαjβ ( (cid:15) ) = 2 n iαjβ ( (cid:15) )= 2 2 π (cid:112) − (cid:15) (cid:88) m g m σ ( m ) iαjβ P m ( (cid:15) ) (B20)with σ ( m ) iαjβ deﬁned equivalently to Eq. B17 as σ ( m ) iαjβ = m (cid:88) n =0 p mn ˆ ξ ( n ) iαjβ (B21) FIG. 2: Schematic illustration of the direct link between theatomic structure in terms of self-returning paths (top) andelectronic density of states (bottom). The second momentthat is linked to the RMS width of the DOS (bottom left) isdetermined by self-returning paths of length two (top, left redatom). The third moment that relates to the skewness of theDOS (bottom middle) is given by paths of length three (top,middle red atom) and the fourth moment that is linked to thebimodality of the DOS (bottom right) by paths of length four(top, right red atom). and normalization of the interference paths (Eq. B25)ˆ ξ ( l ) iαjβ = 1 (cid:16) b ( ∞ ) iα (cid:17) l l (cid:88) n =0 (cid:32) nl (cid:33) ( − l a ( ∞ ) iα n − l ξ ( n ) iαjβ . (B22)A relation between the moments and the atomic struc-ture is established in the second equality by the self-returning paths iα → j β → j β → ... → j n − β n − → iα from orbital α on atom i along orbitals β k of atoms j k ( k = 1 ...n − iH iαiα = (cid:104) iα | ˆ H | iα (cid:105) = E iα (B23)and the interatomic interactions between the atomic or-2bitals on neighbouring atoms i and jH iαjβ = (cid:104) iα | ˆ H | jβ (cid:105) . (B24)Higher moments correspond to longer paths and thus toa more far-sighted sampling of the atomic environment.As diﬀerent crystal structures have diﬀerent sets of self-returning paths of a given length, the moments may beseen as ﬁngerprints of the crystal structure and usedto construct maps of structural similarity .The paths can be computed eﬃciently by realizing that(1) only the sum of all paths is relevant (Eq. B19) andthat (2) the sums across the whole paths can be repre-sented as sums along path segments. The path segmentsare the interference paths ξ ( n ) iαjβ = (cid:104) iα | ˆ H n | jβ (cid:105) (B25)of length n between atom i and j . The computation ofinterference paths can be simpliﬁed after realising thatthey can be (i) constructed iteratively ξ ( n ) iαjβ = (cid:88) kγ H iαkγ ξ ( n − kγjβ (B26)for all interaction neighbours k with orbitals γ , (ii) in-verted in their direction by taking the transpose ξ ( n ) iαjβ = ξ ( n ) jβiαT (B27)and (iii) merged by multiplication of segments ξ ( n ) iαjβ = (cid:88) kγ ξ ( l ) iαkγ ξ ( n − l ) kγjβ (B28)of length 0 < l < n for all common endpoint atoms k with orbital γ . Using these properties, the summation ofmatrix multiplications along the individual self-returningpaths can be decomposed to segments that representsummations of matrix multiplications along shorter par-tial paths. It is, therefore, not necessary to determineeach possible path ξ ( n ) iαjβ between atoms i and j individu-ally, but instead suﬃcient to determine the set of shortersegments that is needed for their construction. The im-plementation of this approach in BOPfox reaches the the-oretical scaling limits of the required execution time andis discussed in detail in Ref. .A relation between the moments and the electronicstructure is due to the expansion coeﬃcients a ( n ) iα and b ( n ) iα . These coeﬃcients determine the electronic struc-ture in terms of n iα , the local DOS, as given in Eq. B2.The ﬁrst four moments of the local DOS are given by µ (0) iα = 1 (B29) µ (1) iα = a (0) iα (B30) µ (2) iα = a (0) iα + b (1) iα (B31) µ (3) iα = a (0) iα + 2 a (0) iα b (1) iα + a (1) iα b (1) iα (B32) which is easily veriﬁed by identifying all self-returningpaths of corresponding length in Fig. 1. Vice-versa, therecursion coeﬃcients can be determined from the mo-ments for each iα by a n = n (cid:88) j =0 n (cid:88) l =0 c nj c nl µ j + l +1 (B33)and b n = n (cid:88) j =0 n − (cid:88) l =0 c nj c n − l µ j + l +1 (B34)where we dropped the common index iα for readability.The coeﬃcients c nj are given by c = 1 ,c nj = 0 if j > n or j < n < ,b n +1 c n +1 j = c nj − − a n c nj − b n c n − j and determined iteratively.

4. Damping factors

The damping factors g m in Eq. B1, together with ap-proximate higher expansion coeﬃcients, were introducedto suppress Gibbs ringing and ensure strictly positive val-ues of the DOS . Therefore the calculation of the DOS(Eq. B1) with expansion coeﬃcients σ ( m ) iα from the mo-ments up to m = n max (Eq. B17) is expanded up to n max + 1 < m < n exp with estimated higher expansioncoeﬃcients σ ( m ) iα n ( n max ) iα ( (cid:15) ) = π √ − (cid:15) (cid:20) n max (cid:80) m =1 g m σ ( m ) iα P m ( (cid:15) )+ n exp (cid:80) m = n max +1 g m σ ( m ) iα P m ( (cid:15) ) (cid:21) . (B35)The higher expansion coeﬃcients σ ( m ) iα are obtained byrecursive calculation of the interference paths along thesemi-inﬁnite chain, ζ ( m +1) k = 2 (cid:104) ˆ a k ζ ( m ) k + ˆ b k ζ ( m ) k − + ˆ b k +1 ζ ( m ) k +1 (cid:105) − ζ ( m − k (B36)with ˆ a k = a k − a ( ∞ ) iα b ( ∞ ) iα and ˆ b k = b k b ( ∞ ) iα (B37)and σ ( n ) iα = ζ ( n )0 . The relative importance of the higher,approximated expansion coeﬃcients with respect to thelower, computed ones is balanced by the damping factors3 g m in Eq. B1 that vary smoothly from 1 to 0. In BOPfox,the Jackson kernel g Jm = ( n max − m + 1) cos πmn max +1 + sin πmn max +1 cot πn max +1 n max + 1 (B38)for an expansion m = 1 . . . n max is adapted to Chebyshevpolynomials of the second kind by g m = g Jm +1 /g J (B39)as described in Ref. .

5. Band-width estimates

The diﬀerent terminators (Eqs. B3 and B4) of the con-tinued fraction (Eq. B2) require the recursion coeﬃcients a ( m ) iα and b ( m ) iα beyond the ones that can be computedfrom the µ ( n ) iα with m > n by Eqs. B33 and B34. Wedetermine approximate values of a ( m ) iα and b ( m ) iα from es-timates of the centre and the width of the DOS a ( ∞ ) iα = A ( ∞ ) iα (B40) b ( ∞ ) iα = B ( ∞ ) iα (B41)with A ( ∞ ) iα = ( E top iα + E bottom iα ) (B42) B ( ∞ ) iα = ( E top iα − E bottom iα ) . (B43)The values of A ( ∞ ) iα and B ( ∞ ) iα can be estimated in BOPfoxin several ways based on the computed recursion coeﬃ-cients a ( n ) iα and b ( n ) iα for n levels of orbital α on atom i .The simple approximations are (i) the lowest computedrecursion coeﬃcients, i.e., A ( ∞ ) iα = a (1) iα , B ( ∞ ) iα = b (1) iα , (B44)(ii) the highest computed recursion coeﬃcients A ( ∞ ) iα = a ( n max ) iα , B ( ∞ ) iα = b ( n max ) iα , (B45)(iii) averaged values similar to Haydock and Jo-hannes , A ( ∞ ) iα = n max (cid:80) n =0 a ( n ) iα n max + 1 , B ( ∞ ) iα = (cid:118)(cid:117)(cid:117)(cid:117)(cid:116) n max (cid:80) n =1 b ( n ) iα n max , (B46)(iv) the average band-centre with the band-width fromthe highest computed recursion level A ( ∞ ) iα = n max (cid:80) n =0 a ( n ) iα n max + 1 , B ( ∞ ) iα = b ( n max ) iα , (B47) or (v) lowest computed band-bottom and highest com-puted band-top A ( ∞ ) iα = max (cid:16) a ( n ) iα (cid:17) +min (cid:16) a ( n ) iα (cid:17) , (B48) B ( ∞ ) iα = max (cid:16) a ( n ) iα (cid:17) − min (cid:16) a ( n ) iα (cid:17) +4max (cid:16) b ( n ) iα (cid:17) . (B49)Further choices are (vi) the approach of Beer et al. thatminimises the band-width with preserved moments of theDOS and (vii) Gershogorin’s circle theorem which leadsto estimates of the band-edges E bottom iα = min (cid:16) a ( n ) iα − b ( n ) iα − b ( n +1) iα (cid:17) (B50) E top iα = max (cid:16) a ( n ) iα + b ( n ) iα + b ( n +1) iα (cid:17) . (B51)For testing purposes the user can also deﬁne (viii) globalvalues of A ( ∞ ) and B ( ∞ ) that hold for all atoms.

6. Example with typical settings: bcc Ta

As an example of an analytic BOP calculation, we usedthe parametrisation of Ref. to determine the DOS ofbcc Ta shown in Fig. 3. This non-magnetic BOP calcu- FIG. 3: DOS of bcc Ta computed by reciprocal-spaceTB (black) and real-space BOP (red) computed with theparametrisation of Ref. . The integrated DOS of TB andBOP (divided by a factor of ﬁve for plotting convenience) aregiven as dashed lines. The dotted line marks the Fermi level. lation with a d -band model uses 9 moments (Eq. B19), asquare-root terminator (Eq. B3), the Gershogorin band-width estimate (Eq. B50), and estimated expansion co-eﬃcients up to moment 200 (Eq. B37) that are dampedwith a Jackson kernel (Eq. B38). The DOS obtained byanalytic BOPs is in good agreement with the TB refer-ence (20 × × k -point mesh, tetrahedron integration).In both cases, the Fermi level is in the pseudo-gap of the4bimodal DOS that is typical for bcc transition metals.The bandwidth of the DOS, as well as the position andheight of the two most prominent peaks are well cap-tured. The integrated DOS of analytic BOPs is in excel-lent agreement with the TB reference which is the basisfor reproducing DOS-integral quantities like the bond en-ergy. Appendix C: Forces and torques in analytic BOPs1. General binding-energy derivative

The minimisation of the binding energy in the self-consistency cycle (see A 3) is based on the derivative ofthe binding energy with respect to onsite-levels E jβµ .The computation of forces and stresses requires thederivative of the binding energy with respect to the po-sition r j , while determining the torques makes use of thederivative of the binding energy with respect to the spinorientation s jβµ of atom j . These are all speciﬁc exam-ples of derivatives of the binding energy which can bewritten in a generic form as derivatives with respect to ageneral parameter Λ, dU B d Λ = (cid:88) iαν n max (cid:88) n =0 w ( n ) iαν dµ ( n ) iαν d Λ − (cid:88) iαν N iαν dE iαν d Λ + dU rep d Λ . (C1)The total derivative of the bond energy U bond (Eq. A3)with respect to Λ is transformed to partial derivativeswith respect to moments µ ( n ) iαν and associated partialderivatives of the moments with respect to Λ. This al-lows the derivative of the bond energy to be expressed inthe form of Hellmann-Feynman-type forces dU bond d Λ = (cid:88) iανjβµ ˜Θ iανjβµ dH jβµiαν d Λ (C2)with a bond-order-like term˜Θ iανjβµ = n max (cid:88) n =1 Ξ ( n − ,n ) iανjβµ . (C3)Inserting E jβµ for Λ leads to the self-consistency condi-tion of Eq. A29. Replacing Λ with r j or s jβ yields forcesand torques as described in C 2 and C 3, respectively.The derivatives of U bond with respect to the moments w ( n ) iαν = ∂U bond ∂µ ( n ) iαν (C4)enter ˜Θ iανjβµ as weights w ( m ) iαν inΞ ( n − ,m ) i α ν i n α n ν n = (cid:80) i α ν ...i n − α n − ν n − (cid:16)(cid:80) nl =1 w ( m ) i l α l ν l (cid:17) H i α ν i α ν . . . H i n − α n − ν n − i n α n ν n (C5) and are given in detail in C 4. This compact form leadsto an eﬃcient recursive computation of ˜Θ iαjβ byΞ ( n − ,m ) iαjβ = T ( n − ,m ) iαjβ + w ( m ) iα ξ ( n − iαjβ (C6)with transfer paths T ( n,m ) iαjβ . The transfer paths are closelyrelated to the interference paths (Eq. B19) and exhibitsimilar properties (Eqs. B26-B28). In particular, thetransfer paths can also be (i) constructed iteratively T ( n,m ) iαjβ = (cid:88) kγ H iαkγ T ( n − ,m ) kγjβ + w ( m ) iα ξ ( n ) iαjβ , (C7)(ii) inverted by taking the transpose T ( n,m ) iαjβ = T ( n,m ) jβiα T , (C8)and (iii) merged by a product rule T ( n − ,m ) iαjβ = (cid:88) kγ T ( l − ,m ) iαkγ ξ ( n − l ) kγjβ + (cid:88) kγ ξ ( l − iαkγ T ( n − l,m ) kγjβ . (C9)These properties of the transfer paths are the basis forthe eﬃcient and parallel implementation of self-consistency, forces and torques in analytic BOPs.For non-collinear magnetism , the above equations aretransformed by rewriting the moments and weights as2 × dU B d Λ = (cid:88) iα n max (cid:88) n =0 Tr (cid:32) w ( n ) iα d µ ( n ) iα d Λ (cid:33) − (cid:88) iαν N iαν dE iαν d Λ + dU rep d Λ (C10)with d µ ( n ) iα d Λ = (cid:32) dµ ↑↑ ( n ) iα d Λ dµ ↑↓ ( n ) iα d Λ dµ ↓↑ ( n ) iα d Λ dµ ↓↓ ( n ) iα d Λ (cid:33) (C11)and weights that are constructed in the local frame w ( n, local) iα = U iα w ( n ) iα U † iα = (cid:32) w ↑ ( n, local) iα w ↓ ( n, local) iα (cid:33) (C12)from the global counterparts by a unitary transformationlike the onsite levels (Eq. A20). The transformation ofthe bond order term ˜Θ iανjβµ and the transfer matrices T ( n − ,m ) iαjβ to 2 × ξ ( n ) iαjβ = (cid:32) ξ ↑↑ ( n ) iαjβ ξ ↑↓ ( n ) iαjβ ξ ↓↑ ( n ) iαjβ ξ ↓↓ ( n ) iαjβ (cid:33) . (C13)5

2. Forces

Replacing the derivative d/d

Λ in Eq. C1 with the gra-dient ∇ k leads to the analytic forces. With self-consistentonsite levels dU/dE iα = 0 (Eq. A29), the forces on atom k in TB and BOP calculations are given by F k = −∇ k U B = − iα (cid:54) = jβ (cid:88) iαjβ ˜Θ iαjβ ∇ k H jβiα − (cid:88) iαjβ ( ∇ k J iαjβ ) q jβ q iα + 14 (cid:88) iαjβ ( ∇ k I iαjβ )m jβ m iα −∇ k U rep . (C14)(This expression also holds for non-collinear magnetismas there only the onsite levels are aﬀected by the rota-tion .) In TB calculations this expression correspondsto Hellmann-Feynman forces and˜Θ iαjβ = ∂U B ∂H iαjβ . (C15)becomes the density matrix n iαjβ (Eq. A3). In analyticBOP calculations, in contrast, the approximate evalu-ation of the DOS means that a self-consistent set ofcharges and magnetic moments does not correspond to astationary point in the BOP energy (as it does in DFTor TB approaches) . However, taking exact derivativesof the energy with respect to atomic positions, this formcan still be used to represent forces (Eq. C14) andstresses . The contribution of the bond energy to theatomic virial stress is σ ( i )bond = 12 (cid:88) αjβ ˜Θ iαjβ ( ∇ j H jβiα − ∇ i H jβiα ) ⊗ r ij . (C16)

3. Torques

Inserting the local spin direction s iα for Λ in the gen-eral derivative (Eq. C1) leads to the torques, i.e., to thechange in binding energy due to rotation of local spindirections. The derivatives of the rotation matrices canbe taken into account by expressing the weights in termsof their local counterparts w ( n ) iα = (cid:20) (cid:16) w ↑ iα + w ↓ iα (cid:17) + 12 (cid:16) w ↓ iα − w ↑ iα (cid:17) s iα · σ (cid:21) (C17)where we dropped the index ( n, local) of w iα for brevity.With ∆ iα = E ↓ iα − E ↑ iα , the derivative of the bond energy with respect to s iα is given by d U B d s iα = (cid:16) Tr (cid:16) ˜Θ iαiα σ (cid:17) ∆ iα − I i m i m i + n max (cid:80) n (cid:16) w ↓ iα − w ↑ iα (cid:17) Tr (cid:16) σ µ ( n ) iα (cid:17)(cid:19) (C18)where m i is the spin direction on atom i and σ is thevector of Pauli spin matrices. The cross product with thespin direction leads to the magnetic torque given by t iα = d U B d s iα × s iα (C19)for orbital α on atom i where s iα × m iα = 0.

4. Common partial derivatives

The weights (Eq. C4) can be determined analytically.To this end, the band and onsite contributions are sepa-rated as w ( n ) iαν = ∂U band ∂µ ( n ) iαν + ∂∂µ ( n ) iαν E F (cid:90) n iαν d E · (cid:18) E iαν − (cid:80) jβµ E jβµ n jβµ ( E F ) (cid:80) jβµ n jβµ ( E F ) J i q i − (cid:80) jβ J j q j n jβ ( E F ) (cid:80) jβ n jβ ( E F ) − (cid:18) ( − ν I i m i − (cid:80) jβµ ( − µ I j m j n jβµ ( E F ) (cid:80) jβµ n jβµ ( E F ) (cid:19)(cid:19) with ∂U band ∂µ ( n ) = n m ax (cid:88) m =0 (cid:18) ∂b ( ∞ ) ∂µ ( n ) σ ( m ) (cid:2) ˆ χ m +2 − (cid:15) F ˆ χ m +1 + ˆ χ m (cid:3) + b ( ∞ ) (cid:18) ∂σ ( n ) ∂µ ( n ) + ∂σ ( n ) ∂a ( ∞ ) ∂a ( ∞ ) ∂µ ( n ) + ∂σ ( n ) ∂b ( ∞ ) ∂b ( ∞ ) ∂µ ( n ) (cid:19) · (cid:2) ˆ χ m +2 − (cid:15) F ˆ χ m +1 + ˆ χ m (cid:3) + (cid:20) ∂ ˆ χ m +2 ∂a ( ∞ ) − ∂(cid:15) F ∂a ( ∞ ) ˆ χ m +1 − (cid:15) F ∂ ˆ χ m +1 ∂a ( ∞ ) + ∂ ˆ χ m ∂a ( ∞ ) (cid:21) · b ( ∞ ) σ ( m ) ∂a ( ∞ ) ∂µ ( n ) + (cid:20) ∂ ˆ χ m +2 ∂b ( ∞ ) − ∂(cid:15) F ∂b ( ∞ ) ˆ χ m +1 − (cid:15) F ∂ ˆ χ m +1 ∂b ( ∞ ) + ∂ ˆ χ m ∂b ( ∞ ) (cid:21) · b ( ∞ ) σ ( m ) ∂b ( ∞ ) ∂µ ( n ) (cid:19) (C20)where a constant terminator (Eq. B3) was assumed and ∂∂µ ( n ) E F (cid:90) n ( E ) dE = n m ax (cid:88) m =0 (cid:18) ˆ χ m +1 ( φ F ) · (cid:18) ∂σ ( m ) ∂µ ( n ) + ∂σ ( m ) ∂a ( ∞ ) ∂a ( ∞ ) ∂µ ( n ) + ∂σ ( m ) ∂b ( ∞ ) ∂b ( ∞ ) ∂µ ( n ) (cid:19) + σ ( m ) (cid:18) ∂ ˆ χ m +1 ∂a ( ∞ ) ∂a ( ∞ ) ∂µ ( n ) + ∂ ˆ χ m +1 ∂b ( ∞ ) ∂b ( ∞ ) ∂µ ( n ) (cid:19)(cid:19) . (C21)6where we omitted the common index iαν for brevity. Thepartial derivatives of the expansion coeﬃcients with re-spect to the moments are given by ∂σ ( m ) ∂µ ( n ) = m (cid:88) k = n p mk ∂ ˆ µ ( k ) ∂µ ( n ) = m (cid:88) k = n p mk (cid:0) b ( ∞ ) (cid:1) k ( kn ) (cid:16) − a ( ∞ ) (cid:17) ( k − n ) (C22)and with respect to the asymptotic recursion coeﬃcients ∂σ ( m ) ∂a ( ∞ ) = m (cid:88) k =0 p mk ∂ ˆ µ ( k ) ∂a ( ∞ ) = − m (cid:88) k = n p mk (cid:0) b ( ∞ ) (cid:1) kk − (cid:88) n =0 ( k − n ) (cid:32) kn (cid:33) µ ( n ) (cid:16) − a ( ∞ ) (cid:17) ( k − n − (C23) ∂σ ( m ) ∂b ( ∞ ) = m (cid:88) k =0 p mk ∂ ˆ µ ( k ) ∂b ( ∞ ) = − m (cid:88) k =1 k p mk b ( ∞ ) ˆ µ ( k ) (C24)(Note that Eq. C24 corrects a misprint in Eq. A8 ofRef. .) The derivatives of the response functions aregiven in terms of the Fermi phase by ∂ ˆ χ m ∂a ( ∞ ) = ∂ ˆ χ m ∂ cos ( φ F ) ∂ cos ( φ F ) ∂a ( ∞ ) (C25) ∂ ˆ χ m ∂b ( ∞ ) = ∂ ˆ χ m ∂ cos ( φ F ) ∂ cos ( φ F ) ∂b ( ∞ ) (C26)with partial derivatives of Eqs. B15 and B16 ∂ ˆ χ m ∂ cos ( φ F ) = − cos( m + 1) φ F − cos( m − φ F π sin ( φ F ) (C27) ∂ cos ( φ F ) ∂a ( ∞ ) = − b ( ∞ ) (C28) ∂ cos ( φ F ) ∂b ( ∞ ) = − cos ( φ F )2 b ( ∞ ) . (C29)The derivatives of the recursion coeﬃcients are given by ∂a n ∂µ m = b n +1 n +1 (cid:88) j =0 n (cid:88) l =0 c n +1 j c nl δ l + j,m − b n n (cid:88) j =0 n − (cid:88) l =0 c nj c n − l δ l + j,m (C30) ∂b n ∂µ m = b n  n (cid:88) j =0 n (cid:88) l =0 c nj c nl δ l + j,m − n − (cid:88) j =0 n − (cid:88) l =0 c n − j c n − l δ l + j,m  . (C31)This set of partial derivatives is computed (i) in ev-ery self-consistency step to optimise the onsite-levels(Eq. A29) and (ii) in every force (Eq. C15) or torquecalculation (Eq. C19). Appendix D: Rotation matrices

The rotation matrices R ( θ, φ ) in Eq. A15 are con-structed from the polar and azimuthal angles θ and φ between the interatomic bond and the global coordinatesystem. ( θ is the angle to the xy -plane and φ the angleto the x -axis in the xy -plane.) For the orbital orderingin H ij of Eq. A13, the elements of the rotation matrixfor p -orbitals are given by R ( θ, φ ) , = cos( θ ) (D1) R ( θ, φ ) , = − sin( θ ) R ( θ, φ ) , = 0 . R ( θ, φ ) , = cos( φ ) sin( θ ) R ( θ, φ ) , = cos( φ ) cos( θ ) R ( θ, φ ) , = − sin( φ ) R ( θ, φ ) , = sin( φ ) sin( θ ) R ( θ, φ ) , = sin( φ ) cos( θ ) R ( θ, φ ) , = cos( φ )7The matrix entries of the rotation matrix for d -orbitalsare given by R ( θ, φ ) , = cos ( θ ) − / ( θ ) (D2) R ( θ, φ ) , = −√ θ ) cos( θ ) R ( θ, φ ) , = 0 R ( θ, φ ) , = (cid:112) / ( θ ) R ( θ, φ ) , = 0 R ( θ, φ ) , = √ φ ) sin( θ ) cos( θ ) R ( θ, φ ) , = cos( φ )(cos ( θ ) − sin ( θ )) R ( θ, φ ) , = − sin( φ ) cos( θ ) R ( θ, φ ) , = − cos( φ ) sin( θ ) cos( θ ) R ( θ, φ ) , = sin( φ ) sin( θ ) R ( θ, φ ) , = √ φ ) sin( θ ) cos( θ ) R ( θ, φ ) , = sin( φ )(cos ( θ ) − sin ( θ )) R ( θ, φ ) , = cos( φ ) cos( θ ) R ( θ, φ ) , = − sin( φ ) sin( θ ) cos( θ ) R ( θ, φ ) , = − cos( φ ) sin( θ ) R ( θ, φ ) , = (cos ( φ ) − sin ( φ )) (cid:112) / ( θ ) R ( θ, φ ) , = (cos ( φ ) − sin ( φ )) sin( θ ) cos( θ ) R ( θ, φ ) , = − φ ) cos( φ ) sin( θ ) R ( θ, φ ) , = (cos ( φ ) − sin ( φ ))(cos ( θ ) + 1 / ( θ )) R ( θ, φ ) , = − φ ) cos( φ ) cos( θ ) R ( θ, φ ) , = √ φ ) cos( φ ) sin ( θ ) R ( θ, φ ) , = 2 sin( φ ) cos( φ ) sin( θ ) cos( θ ) R ( θ, φ ) , = (cos ( φ ) − sin ( φ )) sin( θ ) R ( θ, φ ) , = sin( φ ) cos( φ )(cos ( θ ) + 1) R ( θ, φ ) , = (cos ( φ ) − sin ( φ )) cos( θ ) Both rotation matrices become identity matrices forsin( φ ) = 0 and sin( θθ