Black-box inhomogeneous preconditioning for self-consistent field iterations in density functional theory
BBlack-box inhomogeneous preconditioning for self-consistent field iterations in densityfunctional theory
Michael F. Herbst ∗ and Antoine Levitt † Inria Paris and CERMICS, Ecole des Ponts, 6 & 8 avenue Blaise Pascal, 77455 Marne-la-Vallee, France.
We propose a new preconditioner based on the local density of states for computing the self-consistent problem in Kohn-Sham density functional theory. This preconditioner is inexpensive andable to cure the long-range charge sloshing known to hamper convergence in large, inhomogeneoussystems such as clusters and surfaces. It is based on a parameter-free and physically motivatedapproximation to the independent-particle susceptibility operator, appropriate for both metals andinsulators. It can be extended to semiconductors by using the macroscopic electronic dielectricconstant as a parameter in the model. We test our preconditioner successfully on inhomogeneoussystems containing metals, insulators, semiconductors and vacuum.
I. INTRODUCTION
Since its original formulation in the 1960s
Kohn-Sham density-functional theory (DFT) has become one ofthe most widespread methods for simulating properties insolid-state physics and chemistry. Its advantageous bal-ance balance between accuracy and computational costas well as the continuous development of more efficientmethods has continuously increased its range of appli-cability. Nowadays DFT-based ab initio treatments arestandard in many fields . Driven by the availabilityof ever-growing amounts of computational power, sev-eral groups have developed high-throughput DFT frame-works where the properties of a large numbers of sys-tems are computed systematically . In domains suchas catalysis or battery research , where systematicexperiments are expensive or time-consuming, large-scalescreening studies represent an interesting alternative tocomputationally discover novel compounds or boil downa large candidate list to a tractable set for more detailedfollow-up investigation. Compared to the early yearswhere the focus was on performing a small number ofcomputations on hand-picked systems, high-throughputstudies have much stronger requirements. In particularany form of manual parameter selection is not practical.The typical approach to tackle this problem is to pro-vide a set of heuristics, which automatically select thecomputational parameters based on prior experience .This empirical process is, however, problematic, becauseresulting methods can still fall short or break down incases with unexpected behavior, amongst which one of-ten finds exactly the interesting cases such studies werehoped to excavate in the first place. A clear objectivefor improving DFT algorithms is therefore to employ assmall a number of parameters as possible, ideally leadingto a simulation workflow being composed only of black-box building blocks that make use of the appropriatephysics to adapt to the system at hand.With this idea in mind, this work will consider theself-consistent field (SCF) problem as a fundamentalstep in all Kohn-Sham DFT simulations. As it is well-known, simple SCF schemes suffer from a number ofinstabilities, which in the linear regime can be seen to arise from the properties of the dielectric operator ε = 1 − ( v c + K XC ) χ . This operator is built up fromthree terms: the independent-particle susceptibility χ ,the Coulomb kernel v c and the exchange-correlation ker-nel K XC . Based on an analysis of these three compo-nents, we classify the possible sources of slow conver-gence in three types: (a) systems close to an electronicphase transition, for instance ferromagnetic, due to thenegativity of K XC ; (b) localized d or f orbitals, givinglarge contributions to χ ; or (c) the long-range “charge-sloshing” due to the divergence of the Coulomb operator v c at large wavelengths, especially prominent as largersystems are considered. The latter forms the main focusof this work. For an early discussion of these aspects, werefer to the work of Dederichs and Zeller . In applica-tions such as the ones mentioned above, all these sourcesof instabilities may occur and need to be accounted for bya preconditioner to yield fast and reliably converging SCFmethods for high-throughput calculations. Moreover, infields such as catalysis one is typically concerned with in-terfaces, surface effects or surface defects, i.e. situationsin which simulations on large and/or inhomogeneous sys-tems are to be performed.Based on the successful application of the precondi-tioner suggested by Kerker for bulk metallic systemsand (with minor modifications) for bulk semiconductors ,several attempts have been made to generalize Kerker’sapproach to inhomogeneous systems. For example, thework of Raczkowski et al. discusses an approach bysolving a Thomas-Fermi-von Weizs¨acker problem at ev-ery SCF step. More recently, Zhou et al. suggested amore generalized parametrization of the Kerker model formixed systems along with various strategies to adjust therespective parameters. A more general approach, solvinga partial differential equation with variable coefficients,is the “elliptic preconditioner” by Lin and Yang . Re-lated is the work by Hasnip and Probert where SCFconvergence is improved by mapping at each SCF stepthe Kohn-Sham problem to a simpler “auxiliary system”and solving that. Yet another alternative approach is todirectly compute an approximate dielectric matrix basedon the Kohn-Sham orbitals and use this operator forpreconditioning. We refer to Woods et al. for a recent a r X i v : . [ phy s i c s . c o m p - ph ] N ov and thorough review of these approaches.While these methods have been successfully appliedto different classes of inhomogeneous systems like metal-lic slabs or clusters, none of them has so far proven tobe both robust and general enough to prevent charge-sloshing in all kinds of inhomogeneous systems. Ad-ditionally some of these methods can be either diffi-cult to implement, too expensive to employ in big sys-tems or require the selection of many or complicatedsystem-specific parameters, making automated compu-tations non-trivial. In this work, we present an inex-pensive and easy to implement preconditioning strategybased on the local density of states. For systems involv-ing only metals, insulators and vacuum, this precondi-tioner is parameter-free and thus completely black-box.Its range of applicability can be extended to systems in-volving semiconductors by introducing an additional pa-rameter, the dielectric constant — which for most mate-rials is a priori known. For all tested systems, we findthat our preconditioner is able to cure long-range chargesloshing (cause (c) in the above classification), leavingonly localized states and electronic phase transitions assources of slow convergence to be investigated in futurework.The structure of this manuscript is as follows. Sec-tion II reviews the mathematical structure of the SCFproblem in Kohn-Sham DFT and provides key resultsabout rates of convergence, which are illustrated by ex-ample calculations in Section III. Section IV reviews com-mon strategies for SCF preconditioning in bulk systemsfor which Section V provides computational results. Fi-nally Sections VI and VII introduce the LDOS precondi-tioner in contrast to other approaches for preconditioningheterogeneous systems and presents an extensive studyillustrating its performance in large mixed systems. II. SELF-CONSISTENT FIELD ITERATIONS
Assume an isolated system of 2 N electrons with peri-odic boundary conditions in a (possibly nonlocal) pseu-dopotential V ext created by nuclei and core electrons.The Kohn-Sham equations for N electron pairs at finitetemperature T are (in atomic units) (cid:18) − ∇ + V ext + V HXC ( ρ ) (cid:19) ψ i = ε i ψ i ρ = ∞ (cid:88) i =1 f (cid:18) ε i − ε F T (cid:19) | ψ i | V HXC ( ρ ) = v c ρ + V XC ( ρ ) (cid:90) ρ ( r ) d r = 2 N (1)where f ( x ) = 2 / (1+ e x ) is twice the Fermi-Dirac function, V XC ( ρ ) is the exchange-correlation potential, and v c ρ isthe Coulomb potential associated to the charge density ρ in a uniform neutralizing background: −∇ ( v c ρ ) = 4 πρ. (2)We define the potential-to-density mapping F by F ( V ) = ∞ (cid:88) i =1 f (cid:18) ε i − ε F T (cid:19) | ψ i | (3)where ( ε i , ψ i ) are the eigenpairs of − ∇ + V ext + V andthe Fermi level ε F is uniquely defined by the condition (cid:90) (cid:16) F ( V ) (cid:17) ( r ) d r = 2 N. (4)The Kohn-Sham equations can then be written as thefixed-point equation F ( V HXC ( ρ )) = ρ. (5)The simplest self-consistent iteration is ρ n +1 = F ( V HXC ( ρ n )) . (6)Except on very simple systems, this iteration usually doesnot converge. A more useful scheme is the damped iter-ation ρ n +1 = ρ n + α (cid:2) F (cid:0) V HXC ( ρ n ) (cid:1) − ρ n (cid:3) , (7)where α > ρ ∗ . As is well-known, the iteration converges locally if and only if allthe eigenvalues of the Jacobian matrix J α = 1 − α (1 − χ K ) (8)have magnitude less than one. In this case, the conver-gence rate is equal to the eigenvalue of largest magnitudeof J α . In this equation, χ = F (cid:48) (cid:0) V HXC ( ρ ∗ ) (cid:1) (9)is the independent-particle susceptibility, and K = V (cid:48) HXC ( ρ ∗ ) = v c + K XC ( ρ ∗ ) (10)is the derivative of the Hartree-exchange-correlation po-tential (the Hartree-exchange-correlation kernel).Note that J α = 1 − αε † , where ε † = 1 − χ K (11)is the adjoint of the dielectric matrix ε = 1 − Kχ . Aswill be shown in Section IV, the susceptibility χ is anon-positive self-adjoint operator, and K is self-adjoint.It follows that χ K = − ( − χ ) / ( − χ ) / K (12)has the same spectrum (except possibly for 0) than thesymmetrized self-adjoint operator ( − χ ) / K ( − χ ) / ,and therefore that ε † has real spectrum.It is often a good approximation to neglect the term K XC = V (cid:48) XC ( ρ ∗ ) in K , yielding the random phase ap-proximation (RPA) K RPA = v c . (13)Since v c has non-negative spectrum, ε † = 1 − χ v c haspositive spectrum. If we do not assume the RPA ap-proximation, K does not have a guaranteed sign dueto the non-convexity introduced in the model by theexchange-correlation term. The eigenvalues of χ K canthus be in principle arbitrary. It can, however, be shownthat, at an energy local minimizer, ε † has non-negativespectrum .It follows that, near a non-degenerate energy min-imizer, by selecting α small enough the spectrum of J α = 1 − αε † will be in ( − , et al. ,Theorem 3.6). However, this convergence may be unac-ceptably slow in practice. Letting λ min > λ max bethe lowest and largest eigenvalues of ε † , the convergencerate is R = max( | − αλ min | , | − αλ max | ). Assuming that α is selected optimally to minimize this convergence rate,we obtain α = 2 λ min + λ max (14)and R = κ − κ + 1 , (15)where κ = λ max λ min (16)is the (metric-independent) “spectral condition number”.Note that this number is distinct from the usual (metric-dependent) condition number of ε † , equal to the ratio oflarge to lowest singular values, because this operator isnot self-adjoint. The spectral condition number κ canhowever be identified with the condition number of ε † under the metric induced by − χ , for which ε † is self-adjoint. With an abuse of language, we will use the term“condition number” to refer to the “spectral conditionnumber” (16) throughout this work.For κ large enough, R ≈ − /κ and therefore thenumber of iterations tol log R needed to achieve a fixed tol-erance tol grows proportionally to κ . Slowly converg-ing SCF iterations are therefore linked to large valuesof κ , originating from small and large eigenvalues of ε † = 1 − χ ( v c + K XC ). In practice, this can be at-tributed to three different classes of instabilities, linkedto large modes of K XC , χ and v c respectively: (a) The first case is when ε † has small eigenvalues. Thiscan only happen in systems with strong exchange-correlation effects, because under the RPA alleigenvalues of ε † are greater than 1. This phe-nomenon is usually (but not necessarily) associatedwith symmetry breaking. To see this, consider fer-romagnetic systems, where ε † has negative eigen-values at the spin-unpolarized state (a solution ofthe Kohn-Sham equations which is not an energyminimum). Paramagnetic systems close to a fer-romagnetic transition will display spin-unpolarizedenergy minima with small but positive eigenvaluesfor ε † . The modes associated with the slow conver-gence are spin-polarization modes.(b) A second source of instabilities are large eigenval-ues of ε † brought about by large modes of χ . Insolids, these modes can for instance be due to lo-calized states, such as d or f electrons, close to theFermi level. In this case the associated modes arelocalized around atoms.(c) A final source are large eigenvalues of ε † originat-ing from the long-range divergence of the Coulombinteraction (large modes of v c ). Indeed, if (cid:98) ρ ( q ) arethe Fourier coefficients of ρ , then (cid:91) ( v c ρ )( q ) = 4 π (cid:98) ρ ( q ) | q | (17)which diverges for small q (long wavelengths). De-pending on whether this effect is compensated ornot by χ , this can manifest as large eigenvalues of ε † . As we will see, no compensation takes place formetals in large supercells. In this case, the condi-tion number increases with the square of the systemsize length. Associated modes are long-wavelengthtransfer of charge from one end of the system toanother (“charge sloshing”).Among these three sources of instabilities, the third one ispossibly the most serious, because there the convergencedegrades with system size.In this paper, we focus on curing this source of slowconvergence using preconditioning. Preconditioning con-sists in replacing (7) by ρ n +1 = ρ n + αP − (cid:16) F ( V HXC ( ρ n )) − ρ n (cid:17) (18)where P is a preconditioner. The Jacobian of this methodat convergence ( ρ = ρ ∗ ) is J αP = 1 − αP − ε † . (19)It follows that the optimal preconditioner P is simply theadjoint dielectric operator ε † . Indeed, in this case, when α = 1, the iteration (18) turns into the standard Newtonmethod: J P = 0, and the method converges in one stepin the linear approximation. In practice however, invert-ing this explicitly (or computing its action on a vector)is too costly. Therefore it is desirable to find a precon-ditioner P that can be inverted efficiently and such that P − (1 − Kχ ) has a condition number close to 1. III. ANDERSON ACCELERATION
Before turning our attention to the problem of findinggood preconditioners, we address the convergence behav-ior of acceleration techniques, which are commonly usedto speed up SCF convergence, and which we will use inour tests.Acceleration techniques treat the fixed-point equa-tion (7) as a black box iteration, and use the history ρ , . . . , ρ n in order to extrapolate a better ρ n +1 thanthe one obtained by the simple (preconditioned) damp-ing method (18). There are many variants of these, themost used being the method variously known as Pu-lay/DIIS/Anderson mixing/acceleration, which we willrefer to as Anderson acceleration. We refer to Woods et al. for an extensive review in the context of Kohn-Sham DFT, and to Chupin et al. and references withinfor a mathematical analysis of these acceleration tech-niques. In particular, in the linear regime and with infi-nite history, Anderson acceleration is known to be equiv-alent to the well-known GMRES method to solve linearequations. Assuming non-linear effects to be negligible,we can expect that Anderson acceleration inherits the im-proved convergence properties of Krylov methods . Inparticular, the convergence rate of the GMRES methodis R = √ κ − √ κ + 1 , (20)and therefore the number of iterations grows only as thesquare root of κ . Furthermore, an important property ofKrylov methods is that they converge to the exact solu-tion in N steps if the matrix has N distinct eigenvalues ,and therefore we can expect Anderson acceleration to bevery efficient in the case where eigenvalues are clustered.In practice, Anderson methods are truncated to a finitehistory size, and the impact of the nonlinearity on theefficiency of the extrapolation is complex. Unlike withthe simple damping method, the convergence behavior ofAnderson acceleration can not be understood solely fromthe condition number κ , and the number of iterationsdoes not always follow the law predicted by the rate (20).In Figure 1 we exhibit the error in energy for threerepresentative cases, selected to illustrate the possible be-haviors. We compute the extreme eigenvalues of P − ε † at convergence, and use them to determine the optimalstep using (14). For comparison, the plot also shows theconvergence estimated from the rates (15) for damped it-erations and (20) for Anderson acceleration. For detailson the computational procedures, we refer to Sections Vand VII.In cases (a) and (c) the simple mixing method ( m =0) converges with a rate consistent with the theoretical FIG. 1. Convergence of self-consistent field iterations onthree selected systems using different history lengths m forAnderson acceleration ( m = 0 corresponds to simple mixing).From top to bottom the considered systems are (a) 12 layersof aluminium using a Kerker preconditioner; (b) 10 layers ofaluminium with an equally wide portion of vacuum; and (c)20 layers of gallium arsenide, in both latter cases using nopreconditioner. prediction. In case (b), using the optimal step computedat the solution proves too optimistic, and the iteration isdivergent.The accelerated methods track approximately the con-vergence rate (20) in case (a). Case (b) displays a char-acteristic “plateau” behavior: after an initial stage ofrapid convergence, the error stagnates, then very rapidconvergence is obtained once the accelerated scheme has“learned” the dominant clusters in the eigenvalues of P − ε † . The plateau is particularly pronounced in thiscase because P − ε † has a very clustered spectrum: thelargest eigenvalue is about 50, but all eigenvalues exceptfive are between 0.9 and 2.0. In case (c) the error shootsup abruptly around iterations 4 and 5. This is becausethe linear model assumed by Anderson acceleration doesnot hold in the initial steps. In more complex cases, thiscan lead to complete divergence, but here the iterationrecovers and then displays the plateau behavior seen incase (b). Notice that for m = 5, where the history sizeis too small to capture all dominant clusters, the plateauhas more the shape of a chain of half-arcs.Overall one can conclude that while for non-acceleratedfixed-point iterations the condition number and the re-sulting theoretical rates are very closely related to theobserved number of iterations, the situation is less clearfor the Anderson-accelerated procedure. In the followingwe will still employ Anderson acceleration for all our cal-culations for two reasons. Firstly because this representsthe typical situation encountered in practice and secondlybecause otherwise, especially for cases with bad precon-ditioning, convergence can become extremely slow. Tosimplify our analysis we will also largely assume that thecondition number of the preconditioned adjoint of thedielectric matrix P − ε † is a good indicator for the ex-pected number of iterations in the Anderson-acceleratedSCF, even though this discussion shows that the correla-tion is not as clear as one would expect from the simpleestimate (20).Note that, although the Anderson method is some-times divergent, convergence can always be achieved (al-beit very slowly) with the simple damping method byselecting a sufficiently small damping parameter. Thismeans that it should be possible to stabilize the Andersonmethod by selecting stepsizes and extrapolation param-eters more conservatively. This is a worthwhile researchdirection, but falls outside the scope of this paper. IV. RESPONSE OPERATORS ANDPRECONDITIONINGA. The independent-particle susceptibility χ To examine the impact of the Coulomb divergence onthe eigenvalues of the dielectric operator, we study theproperties of the independent-particle susceptibility op-erator χ in periodic systems. Consider a perfect crys-tal with lattice R , reciprocal lattice R ∗ , Brillouin zone B , unit cell Ω and N el electron pairs per unit cell. Wechoose a domain Ω N , with periodic boundary conditions,composed of N = L × L × L copies of the unit cell ofthe crystal, and containing N N el electron pairs. Then,as is standard, the eigenvectors of the periodic Hamilto-nian − ∇ + V ext + V HXC ( ρ ∗ ) can be chosen as the Bloch waves ψ n k ( r ) = 1 √N e i k · r u n k ( r ) , (21)where u n k is a R -periodic function normalized as (cid:104) u n k | u m k (cid:105) Ω N = N (cid:104) u n k | u m k (cid:105) Ω = N (cid:90) Ω u ∗ n k ( r ) u m k ( r ) d r = N δ nm . (22)The index k runs over the discrete set B N ⊂ B , whichcontains the N wavevectors in the Brillouin zone thatare compatible with the periodic boundary conditions onΩ N .We now derive an explicit sum-over-states expressionfor χ = F (cid:48) ( V HXC ( ρ ∗ )). We consider an infinitesimalperturbation δV , and compute the response δρ = χ δV .We get δρ = (cid:88) k ∈B N ∞ (cid:88) n =1 (cid:0) δf n k | ψ n k | + f n k ( δψ ∗ n k ψ n k + ψ ∗ n k δψ n k ) (cid:1) (23)with f n k = f (cid:0) ε n k − ε F T (cid:1) . From first-order perturbationtheory, δf n k = ( (cid:104) ψ n k | δV ψ n k (cid:105) Ω N − δε F ) f (cid:48) n k (24) δψ n k = (cid:88) k (cid:48) ∈B N ∞ (cid:88) (cid:48) m =1 (cid:104) ψ m k (cid:48) | δV ψ n k (cid:105) Ω N ε n k − ε m k (cid:48) ψ m k (cid:48) (25)with f (cid:48) n k = 1 T f (cid:48) (cid:18) ε n k − ε F T (cid:19) (26)and where Σ (cid:48) indicates that the term m = n , k (cid:48) = k is omitted in the summation. Note that strictly speak-ing, these intermediary results are only valid for non-degenerate eigenvalues (otherwise δψ n k is not well-defined), but the end result is also valid for degenerateeigenvalues, see Levitt for an alternative derivation us-ing contour integrals. We can determine δε F by the con-dition that the total number of electrons is unchanged: (cid:80) k ∈B N (cid:80) ∞ n =1 δf n k = 0 and therefore δε F = (cid:104) D loc | δV (cid:105) Ω N N D (27)where we have introduced D = − N | Ω | (cid:88) k ∈B N ∞ (cid:88) n =1 f (cid:48) n k (28) D loc ( r ) = − N | Ω | (cid:88) k ∈B N ∞ (cid:88) n =1 f (cid:48) n k | u n k ( r ) | (29)the total density of states per unit volume at the Fermilevel, and the local density of states.We can now obtain the integral kernel of χ bothin real space and in reciprocal space. For the for-mer, we simply identify using the formula δρ ( r ) = (cid:82) Ω N χ ( r , r (cid:48) ) δV ( r (cid:48) ) d r (cid:48) . For the latter, note that χ is invariant by the lattice translations R , and can there-fore be described by its Bloch matrix χ ( q , G , G (cid:48) ), where q ∈ B N , G , G (cid:48) ∈ R ∗ , describing the density response atwavelength q + G to a total potential perturbation atwavelength q + G (cid:48) . Using the convention f n k − f n k ε n k − ε n k = f (cid:48) n k ,we obtain χ ( r , r (cid:48) ) = (cid:88) k (cid:48) ∈B N (cid:88) k ∈B N ∞ (cid:88) m =1 ∞ (cid:88) n =1 f n k − f m k (cid:48) ε n k − ε m k (cid:48) ψ ∗ n k ( r ) ψ m k (cid:48) ( r ) ψ n k ( r (cid:48) ) ψ ∗ m k (cid:48) ( r (cid:48) ) + | Ω |N D loc ( r ) D loc ( r (cid:48) ) D (30) χ ( q , G , G (cid:48) ) = 1 N | Ω | (cid:88) k ∈B N ∞ (cid:88) m =1 ∞ (cid:88) n =1 f n k − f m, k − q ε n k − ε m, k − q (cid:104) u m, k − q | e − i G · r u n k (cid:105) Ω (cid:104) u n k | e i G (cid:48) · r u m, k − q (cid:105) Ω (31)+ δ q0 D (cid:104) e i G · r | D loc (cid:105) Ω (cid:104) D loc | e i G (cid:48) · r (cid:105) Ω . In both these expressions, the second term, arisingfrom the variable Fermi level, imposes charge neutrality: (cid:82) Ω N δρ ( r ) d r = 0 for all δV .For q (cid:54) = , we can take the thermodynamic limit N → ∞ and obtain the usual Adler-Wiser formulalim
N →∞ χ ( q , G , G (cid:48) )= 1(2 π ) (cid:90) k ∈B ∞ (cid:88) n =1 ∞ (cid:88) m =1 f n k − f m, k − q ε n k − ε m, k − q A mn k ( q , G , G (cid:48) ) d k . (32)with matrix elements A mn k ( q , G , G (cid:48) )= (cid:104) u m, k − q | e − i G · r u n k (cid:105) Ω (cid:104) u n k | e i G (cid:48) r u m, k − q (cid:105) Ω . (33)Of particular interest is the long-wavelength limit q → , G = , G (cid:48) = , in which case A mn k = δ mn andlim q → lim N →∞ χ ( q , , ) = 1(2 π ) (cid:90) B ∞ (cid:88) n =1 f (cid:48) n, k d k = − lim N →∞ D, (34)the negative of the density of states at the Fermi levelof this system. In the limit of zero temperature, f (cid:48) n, k localizes on the Fermi surface, and D converges to a finitenegative value if the system is conducting, and to zero ifit is not. For insulating and semiconducting systems,expanding u m, k + q to first order (the “ k · p perturbationtheory”), we obtain that for q small, χ ( q , , ) is of theorder of | q | and both χ ( q , G , ) and χ ( q , G (cid:48) , ) are oforder of | q | for G , G (cid:48) (cid:54) = .This has strong consequences for the eigenvalues of theoperator ε † = 1 − χ K . Indeed, in the RPA, ε † ( q , G , G (cid:48) ) = δ GG (cid:48) − χ ( q , G , G (cid:48) ) 4 π | q + G (cid:48) | (35) has the same eigenvalues as the symmetrized operator(1 − v / c χ v / c )( q , G , G (cid:48) )= δ GG (cid:48) − π | q + G | χ ( q , G , G (cid:48) ) 1 | q + G (cid:48) | . (36)It follows from the considerations above that this oper-ator diverges as 1 / | q | for q small enough in the caseof metals or systems at finite temperature, but not forinsulators and semiconductors (see Canc`es and Lewin as well as Levitt for a careful analysis). When com-puting for instance on a supercell L × × q allowed in B L areof order 1 /L . The result is that the condition number κ of ε † grows like 1 /L for metals: this is the source ofcharge sloshing. For insulators and semiconductors, nosuch divergence is present and the condition number isindependent of the system size. In this case, ε r = lim q → ε − ( q , , ) > ε − ( q , , ) is an upperbound on the lowest eigenvalue of ε − , it follows that κ is at least higher than the dielectric constant. Therefore, κ can be rather large for extended systems even if theyare not conducting. B. Neglect of local field effects and dielectricfunctions
Since bad convergence is linked to long-wavelengthmodes, it is useful to ignore lattice-scale details (“neglectof local field effects”). Under this approximation, the op-erator χ is modeled as a translation-invariant operatorcharacterized by its Fourier multiplier χ ( q ). Under theRPA, the dielectric operator is also translation-invariant,with ε ( q ) = 1 − πχ ( q ) | q | (38)This relatively crude approximation is still able to repro-duce at a qualitative level the difference between metals,semiconductors and insulators.For metals, lim q → χ metal0 ( q ) = − D for q small, where D is the density of states per unit volume at the Fermilevel. It follows that an appropriate approximation of ε for q small is ε metal ( q ) = 4 πD + | q | | q | . (39)Note that this result is the same as that obtained inthe Thomas-Fermi theory of the electron gas. Indeed,in Thomas-Fermi theory, the potential-to-density map-ping is given by ρ = C TF ( ε F − V ) / where C TF is auniversal constant, and therefore χ TF0 ( r , r (cid:48) ) = − C TF ρ ( r ) δ ( r − r (cid:48) ) . (40)In the case of the homogeneous electron gas where ρ isconstant, we recover (39), with D TF = 32 C TF ρ . (41)The equation (39) is also consistent with the small- q behavior of the dielectric function derived from the ex-act independent-particle susceptibility of the free electrongas (the Lindhard formula).For insulators or semiconductors, as we have seenabove, a more appropriate approximation of χ ( q ) for q small is − q T σ q , where σ is a unitless material-dependent symmetric positive definite matrix, with thephysical meaning of a polarizability per unit volume. Thedielectric function is then ε ( q ) = 1 + 4 π q T σ q | q | (42)In the isotropic case where σ is a scalar, ε r = ε ( ) =1 + 4 πσ is the dielectric constant of the material.The concept of a translation-invariant dielectric func-tion is only meaningful at long wavelengths (small q ),where local field effects can be ignored. At larger q onecan simply extrapolate to a reasonable function. In Fig-ure 2 we plot approximate dielectric functions for threebulk materials: aluminium (Al), a metal, gallium ar-senide (GaAs), a semiconductor, and silica (SiO ), aninsulator. For the metal, we used the formula (39) above,with 4 πD = 1. For the semiconductor and the insulator,we used the empirical formula χ dielectric0 ( q ) = − ( ε r − | q | π (cid:16) ε r − | q | k (cid:17) ε dielectric ( q ) = ε r + ( ε r − | q | k ε r − | q | k (43) This is a two-parameter simplification of the Restamodel chosen to reproduce the correct behavior at zeroand at infinity. We used for ε r the macroscopic isotropicelectronic dielectric constants (2 . and 14 forGaAs), and set k TF = 1. FIG. 2. Approximate dielectric functions of aluminium,gallium arsenide and silica.
C. Preconditioners for homogeneous systems
Several preconditioners have been proposed in the lit-erature. The simplest class is that of homogeneous pre-conditioners, which simply act as multipliers in reciprocalspace. The first such preconditioner is Kerker’s scheme ,derived using Thomas-Fermi theory: P − ( q ) = 1 ε metal ( q ) = | q | k + | q | , (44)where k TF is the Thomas-Fermi screening wavevector.This is exactly the metallic dielectric function (39), with k TF = √ πD .This preconditioner is appropriate for metals butnot for insulators and semiconductors, and thereforethe Kerker preconditioner has been modified to accom-modate different types of dielectric behavior, for in-stance by the truncated Kerker scheme P trunc ( q ) = P Kerker (max( | q | , q min )) for some q min , or by using theResta model as preconditioner . We use a dielectricpreconditioner based on the dielectric model prescribedin (43): P − ( q ) = 1 ε dielectric ( q ) = 1 + ( ε r − | q | k ε r + ( ε r − | q | k . (45)When a preconditioner is chosen, the condition numberof the iterative scheme is equal to the ratio of largest tolowest eigenvalues of P − ε † . It follows that the conditionnumber κ of P − ε † is small (and therefore convergenceis rapid) when (a) ε is well-approximated by a homo-geneous dielectric function (and therefore the system ishomogeneous enough), and (b) the dielectric function iswell-approximated by the selected preconditioner (andtherefore the parameters are selected appropriately). Se-lecting the wrong preconditioner risks damping too littleor too much the long-wavelength modes, resulting in slowconvergence. V. COMPUTATIONAL RESULTS ONHOMOGENEOUS SYSTEMS
To compare different strategies of preconditioning nu-merically we will consider six testcases in the formof elongated supercells of silica (SiO ), gallium ar-senide (GaAs) and aluminium (Al) as simple examples ofa bulk insulator, semiconductor and metal respectively.To generate these test systems we repeated theconventional cubic (for GaAs and Al) or trigo-nal (for SiO ) unit cells of the respective materi-als roughly 20 or 40 times along the (001) surface.All calculations have been performed in the density-functional toolkit (DFTK) , in a plane-wave basis, usingthe Perdew-Burke-Ernzerhof (PBE) exchange-correlationfunctionals as implemented in the libxc library, andGodecker-Teter-Hutter pseudopotentials . We used akinetic energy cutoff of 20 Hartree and Monkhorst-Packgrids with a maximal spacing of 0.3 Bohr − . For alu-minium a Gaussian smearing scheme with width of0.001 Hartree was used. We chose tight convergence tol-erances for the diagonalization during the SCF procedureto avoid any slowdown due to inaccurately representedbands. To avoid symmetry effects which would result infaster convergence not representative of larger-scale com-putations, we slightly perturbed the atomic positions andthe initial guess density. We refer to our github repos-itory of supporting information for full computationaldetails, including instructions on how to reproduce allthe results in this paper.We show in Table I the results of the SCF procedurewith Anderson acceleration (history size m = 10) in fourcases: without preconditioning ( P = 1), with the dielec-tric preconditioner (45) (using the dielectric constants ofSiO and GaAs), and with the Kerker preconditioner.In all cases we took k TF = 1. In order to avoid se-lecting good stepsizes manually, we estimated the low-est and largest eigenvalues λ min and λ max of P − ε † atconvergence, and used these to determine the optimalstepsize from (14). The eigenvalues were determined it-eratively using the Arnoldi method. At each step, theaction of ε † = 1 − Kχ on a vector was computed bysolving the Sternheimer equations. A similar method wasused by Wilson and coworkers to approximate thedielectric operator. For technical reasons, we used theTeter reparametrization of the local-density approxima-tion (LDA) in the exchange-correlation kernel for thecomputation of K instead of the PBE functional used inthe SCF computations themselves.From the eigenvalues we also computed the condition number κ = λ max λ min . This gives a simple intrinsic measureof the quality of the preconditioner in the linear regime,free from the noise associated with the Anderson method(nonlinear effects, stepsize, history length, implementa-tion details). Since the eigenvalues can be slow to con-verge, we only report approximate values. Using a crudebound based on the residuals obtained in our Arnoldischeme we estimate the error in the values for κ to beless than 13 % over all cases considered in this work.Both the number of iterations required to converge theSCF to an energy error of 10 − as well as the conditionnumber κ are summarized in Table I. The correspondingplots are shown in Figure 3.The combinations of systems and preconditioners forwhich there is no significant growth in the number of iter-ations and condition numbers are highlighted in orangecolor in Table I. As predicted from the theory, Kerkerpreconditioning leads to system size independence forthe metal Al, and no preconditioning or dielectric pre-conditioning for the insulator SiO or the semiconductorGaAs. In the other cases, the condition number growsquadratically with the system size. On semiconductors,even if the unpreconditioned condition number is inde-pendent on system size, it can still be rather large (onthe order of the dielectric constant) and therefore benefitfrom appropriate preconditioning.On the diagonal in Table I, using a preconditioneradapted to the system at hand leads to a condition num-ber between 2 and 4, and convergence in less than 20 iter-ations. Using the wrong preconditioner leads to degradedconvergence. On SiO and Al, convergence is eventu-ally achieved even for inadequate preconditioners, albeitslowly. On GaAs however, the interference of strong non-linear effects on the Anderson scheme leads to a quickdivergence (see discussion in Section III).The long-wavelength charge sloshing is responsible forslow convergence, as illustrated Figure 4 on the largesteigenmodes of ε † in the case of unpreconditioned Al. Be-ing eigenmodes of the almost periodic operator ε † , thesemodes are close to Bloch waves e i q · r u ( r ) with q con-sistent with periodic boundary conditions on the com-putational domain and u lattice-periodic. The largesteigenvalues are associated with small q . VI. PRECONDITIONERS FORHETEROGENEOUS SYSTEMS
As we have seen, for simple homogeneous systems atranslation-invariant approximation of χ and therefore ε − is appropriate. For more complex systems (for in-stance, periodic materials with large unit cells, clustersand surfaces), these are very often unsuccessful in sig-nificantly reducing the number of iterations, and severalalternative preconditioners have been proposed.The scheme in Raczkowski et al. employs a modi-fied Thomas-Fermi-von Weizs¨acker (TFW) model to de-rive an inexpensive approximation to the dielectric opera- None Dielectric Dielectric Kerker(SiO ) (GaAs) N it κ it κ it κ it κ SiO
21 13 3.2 13 3.5 22 10.2 33 103.139 12 3.2 16 3.5 22 10.7 n.c. 351.6GaAs a
20 n.c. 14.7 n.c. 9.3 10 2.4 23 31.240 n.c. 18.0 n.c. 11.3 10 2.7 38 100.9Al 20 29 58.1 22 30.5 15 7.2 10 3.940 n.c. 261.6 n.c. 141.3 24 27.8 10 2.5 a System exhibits strong non-linear effects leading to issues with Anderson acceleration.
TABLE I. Number of iterations and condition number κ for several homogeneous materials and preconditioners. N denotesthe number of repeats of the conventional unit cell and “n.c.” indicates that the SCF has not converged after 50 iterations.The cases where κ increases by less than a factor of two when the system size is doubled are colored. tor. However, the TFW model, like the simpler Thomas-Fermi model in (40), predicts metallic-like full screen-ing (see Canc`es and Ehrlacher for a detailed analy-sis), and is therefore unable to account for the differencein dielectric behavior between insulators, semiconductorsand metals. In practice, we observed the convergencewith this preconditioner to strongly degrade with sys-tem size for semiconductors and (especially) insulators,exactly as the Kerker scheme. A slightly generalized ver-sion of this approach has been proposed by Hasnip andProbert , where the Kohn-Sham problem is dynami-cally mapped to a related auxiliary problem, which issolved self-consistently at each iteration in order to de-termine the preconditioned density for the next itera-tion. The “extrapolar method” of Anglade and Gonze ,building on the earlier HIJ method , is based on a fullcomputation and inversion of the dielectric operator ona reduced system. It performs well, but is very costly,scaling like the fourth power of the number of electrons.The “elliptic preconditioner” of Lin and Yang uses amodel elliptic equation with variable coefficients as a di-electric model, approximating ε − by the solution of theequation (cid:2) ∇ · (cid:0) a ( r ) ∇ (cid:1) + 4 πb ( r ) (cid:3) ( ε − V ) = −∇ V. (46)In the case of constant coefficients a and b , this method isable to reproduce the dielectric behavior of metals (39) bytaking a = 1, b = D and of insulators and semiconductors(42) by taking b = 0 and a equal to the dielectric con-stant. By manually constructing appropriate switchingfunctions for a and b , they were able to demonstrate goodperformance on hybrid systems consisting of a metallicand a vacuum region. However, this method requiresmanual tuning for the system at hand, in a fashion thatis hard to automatize.In this work, we propose to build an inhomoge-neous preconditioner by approximating the independent-particle susceptibility χ instead of the dielectric opera-tor ε . Consider an arbitrary finite system (not necessarilyperiodic) with orthonormal eigenfunctions ϕ i , eigenval-ues ε i , Fermi level ε F , occupations f i = f (cid:0) ε i − ε F T (cid:1) and temperature T . Then, following the computations in Sec-tion IV A, χ can be written as χ ( r , r (cid:48) ) = ∞ (cid:88) i =1 ∞ (cid:88) j =1 f i − f j ε i − ε j ϕ ∗ i ( r ) ϕ j ( r ) ϕ i ( r (cid:48) ) ϕ ∗ j ( r (cid:48) )+ D loc ( r ) D loc ( r (cid:48) ) D , (47)where D = − ∞ (cid:88) i =1 f (cid:48) i D loc ( r ) = − ∞ (cid:88) i =1 f (cid:48) i | ϕ i | ( r ) (48)are the density of states (not per volume, in contrast tothe computations in IV A) and localized density of statesat the Fermi level.The second term χ (2)0 in (47) relates to the global vari-ation of the Fermi level, is fully nonlocal and easily com-putable. The first term χ (1)0 with its double sum over allstates is where the complexity of building precondition-ers lies. However, if we are interested only in describinglong-range modes, we can write (cid:90) χ (1)0 ( r , r (cid:48) ) V ( r (cid:48) ) d r (cid:48) ≈ V ( r ) (cid:90) χ (1)0 ( r , r (cid:48) ) d r (cid:48) = V ( r ) ∞ (cid:88) i =1 ∞ (cid:88) j =1 f i − f j ε i − ε j ϕ ∗ i ( r ) ϕ j ( r ) δ ij = − D loc ( r ) V ( r ) (49)where we have assumed that r (cid:48) (cid:55)→ χ (1)0 ( r , r (cid:48) ) is localizedaround r at a scale smaller than the characteristic scaleof variation of V . This process corresponds to condens-ing all of χ (1)0 ( r , r (cid:48) ) on its diagonal (similar to the “masslumping” used in the finite element community).0 FIG. 3. Convergence plots for bulk systems. From top tobottom the plots show (a) 39 repeats of silica, (b) 40 repeatsof gallium arsenide and (c) 40 repeats of aluminium. ForGaAs the iterations for no preconditioner and for the dielec-tric (SiO ) preconditioner start to diverge after 3 iterations. Algorithm 1
The LDOS preconditioner P LDOS withina preconditioned SCF scheme (18)
Input: δF = F ( V ( ρ n )) − ρ n , current { ε n k } and { ψ n k } Output: δρ to construct ρ n +1 = ρ n + αδρ function ldos preconditioner ( δF, { ε n k } , { ψ n k } ) Compute D and D loc , see (48) Solve (1 − χ LDOS0 v c ) δρ = δF for δρ using GMRES return δρ end function Remarkably, this large-scale averaging process involves z direction y d i r ec t i o n FIG. 4. The five most dominant eigenmodes of the dielectricadjoint operator ε † for bulk aluminium with 40 repeats. Themodes are plotted in real space, averaged over the x direc-tion, corresponding approximately (from top to bottom) toeigenvalues 24, 46, 47, 261 and 261. The modes display typi-cal charge-sloshing behavior, moving electron density betweenextremal parts of the cell. only the “diagonal” excitations i = j in χ (1)0 , resulting ina tremendous reduction in complexity.The final expression (and the central result of this pa-per) is χ LDOS0 ( r , r (cid:48) ) = − D loc ( r ) δ ( r − r (cid:48) )+ D loc ( r ) D loc ( r (cid:48) ) D . (50)We use this to build a preconditioner P LDOS = (1 − χ LDOS0 v c (cid:1) . (51)In our tests, including the exchange-correlation kernel K XC in the preconditioner did not improve it, and there-fore we stay within the RPA.To apply P − to a vector, we solve the linear equationiteratively using the GMRES method , see Algorithm 1.We first compute the local density of states, which is ofthe same complexity as computing the density. At eachstep of the iterative solver (line 3 of Algorithm 1), wehave to apply v c and χ LDOS0 to a test vector. These areinexpensive operations since they scale linearly with thenumber of atoms. More precisely, applying χ LDOS0 onlyrequires multiplications in real space and the Coulomboperator is applied as usual (using Fourier transformsand a multiplication in Fourier space). Both operationsare needed for other steps in DFT computations, andtherefore this method can readily be implemented in anyDFT code, independently of the basis set used.This preconditioner is completely parameter-free, andcan be seen as a localized version of Kerker’s precondi-tioning, to which it reduces when D loc is constant. Byusing the density of states of the system, it automati-cally adjusts the constant k appearing in the Kerkerpreconditioner using the properties of the system underconsideration. This systematizes the findings in Zhou et al. where this constant was adapted to the num-ber of metallic atoms in a hybrid Au-MoS2 system. In1contrast to the homogeneous preconditioners discussedin Section IV C, the LDOS-based approach furthermoredoes not assume that local field effects can be neglected,since the inverse of P is computed iteratively, which ismore appropriate to heterogenous systems.Since the LDOS is localized on metallic regions, we canexpect the approximation (50) to be accurate on systemsmade of simple, free electron-like metals and vacuum,and we will see in Section VII that this is indeed thecase. On regions containing insulators and semiconduc-tors, D loc ( r ) will be zero, such that (50) also performswell for almost inert materials like SiO , where χ ≈ et al. based on Thomas-Fermi theory, which incorrectly predicts a non-zero lo-cal density of states proportional to the cube root of thedensity (see (41)), and therefore in particular overscreensinsulators. However, on semiconductors like gallium ar-senide, our approximation corresponds to not precondi-tioning at all, which we saw in Section V yields a ratherlarge condition number of the order of its dielectric con-stant. This can be traced to the fact that the approxi-mation (49) completely neglects any type of polarizabilityand only considers the bulk reaction of free electrons.To account for this and obtain a model which is suit-able for semiconductors as well, we can simply add thesusceptibility model of (43): χ LDOS+dielectric0 = χ LDOS0 + χ dielectric0 . (52)The rationale behind this empirical modification is thatadding the dielectric model corrects for χ LDOS0 being zeroin semiconducting regions while at the same time nothaving a large effect in metallic regions, where the localdensity of states term of χ LDOS0 is anyway dominating.Compared to the pure χ LDOS0 model, which is completelyfree of empirical parameters, this “LDOS+dielectric”preconditioner introduces the two free parameters k TF and ε r from χ dielectric0 . As we will see in the following thisdoes, however, allow the resulting LDOS+dielectric pre-conditioner to be much more effective on mixed systemscontaining semiconductors. Naturally further improve-ments are possible, at the expense of introducing addi-tional empirical parameters. One aspect we have consid-ered is the use of a switching function L ( r ) going from 0to 1 inside the semiconducting region of the system. Thiseffectively activates χ dielectric0 only in the semiconductingregion, resulting in the “localized” preconditioner χ localized0 ( r , r (cid:48) ) = χ LDOS0 ( r , r (cid:48) )+ (cid:112) L ( r ) χ dielectric0 ( r , r (cid:48) ) (cid:112) L ( r (cid:48) ) . (53) VII. COMPUTATIONAL RESULTS ONHETEROGENEOUS SYSTEMS
In this section we consider a number of interfaces in-volving a combination of the materials we used previously (SiO , GaAs and Al) as well as vacuum. The computa-tional details are the same as in Section V, with the ex-ception of the k -point mesh, for which a maximal spacingof 0.15 Bohr − was used, and the initial guess and struc-tures, which were not randomized.We built the surface systems by stacking N monolay-ers of the first material with N monolayers of the secondmaterial and, for Al+GaAs+SiO , additionally N mono-layers of the third material. For SiO and GaAs the (110)surface was employed, while the (100) surface was usedfor Al. For cases involving vacuum the vacuum regionwas taken of the same width as the material layer. Toavoid the emergence of surface states, which are highlylocalized and thus form another source of instability notof focus here, all silica surfaces were passivated with hy-drogen. To make the lattices compatible, Al and SiO were compressed or expanded in the appropriate direc-tion to adapt to GaAs. Note that we did not deformGaAs, where the band gap is most susceptible to strain.We tested the following preconditioners: • no preconditioner ( P = 1); • the dielectric preconditioner (43), with ε r = 14, k TF = 1; • the Kerker preconditioner (44), with k TF = 1; • the parameter-free LDOS preconditioner based on(50); • the LDOS+dielectric preconditioner based on (52),with ε r = 14, k TF = 1; • on selected systems, the localized preconditionerbased on (53), with the same parameters asthe LDOS+dielectric preconditioner and L ( r ) con-structed as a linear combination of error functionschosen to be 1 inside the GaAs region and 0 outside; • on selected systems, the TFW preconditioner ofRaczkowski et al. , using the implementation inQuantum ESPRESSO with the same parame-ters as for the other computations as far as pos-sible. For these computations we use QuantumESPRESSO’s default damping value of 0 .
7, butfound results to be similar for other choices ofdamping. In contrast to the DFTK-computedcurves, where the absolute error in the total energyis used directly to estimate the SCF error at each it-eration, the Quantum ESPRESSO curves show the estimated scf accuracy quoted in the QuantumESPRESSO output (which is the v c -norm of thedensity difference squared).As previously, we show the number of iterations toreach an energy error of 10 − and the condition numbers κ in Table II. Cases where κ increases by less than a factorof two as the system size is doubled are again highlightedin orange. In every system except for GaAs+SiO , theSCF procedures converge or are on the way to very slow2 None Dielectric Kerker LDOS LDOS+ LocalizedDielectric N it κ it κ it κ it κ it κ it κ SiO +vacuum 10 11 3.3 26 19.7 50 95.7 11 3.3 26 19.720 12 3.4 30 24.4 n.c. 351.5 12 3.4 30 21.7GaAs+vacuum 10 17 13.4 18 6.2 23 67.0 17 12.4 18 10.4 11 3.520 20 15.5 22 12.9 n.c. 312.2 20 15.5 22 12.9 13 4.6Al+vacuum 10 19 51.5 24 44.3 22 64.4 9 3.7 16 10.320 47 170.8 49 168.5 n.c. 323.9 9 3.5 20 10.5GaAs+SiO
10 45 13.7 19 8.9 34 52.4 45 13.4 19 8.8 16 6.120 n.c. 18.2 20 10.2 n.c. 170.1 n.c. 18.2 20 10.2 24 10.3Al+SiO
10 43 93.1 29 33.6 30 50.9 17 6.1 20 9.220 n.c. 316.6 n.c. 118.4 n.c. 159.4 14 5.4 20 10.1Al+GaAs 10 n.c. 144.0 24 22.4 16 9.0 15 7.2 11 3.520 n.c. 485.0 40 59.0 26 28.8 26 21.4 13 5.0Al+GaAs+SiO
10 n.c. 149.5 34 50.4 36 62.9 26 21.5 19 9.0 21 10.4 a System exhibits strong non-linear effects leading to issues with Anderson acceleration.
TABLE II. Number of iterations and condition number κ for several heterogeneous materials and preconditioners. N denotesthe number of repeats of the conventional unit cell and “n.c.” indicates that the SCF has not converged after 50 iterations.The cases where κ increases by less than a factor of two when the system size is doubled are colored. convergence when they are stopped after 50 iterations. InGaAs+SiO however, as in bulk GaAs, the interferenceof strong nonlinear effects on the Anderson scheme leadsto a quick divergence for some preconditioners.None of the homogeneous strategies are able to per-form well on all surfaces. This is particularly clear inthe case of systems containing aluminium: for instance,on the large Al+SiO and Al+vacuum test case, all havecondition numbers in the hundreds, with the SCF con-verging very slowly (in around 50 iterations or more).Even on nonmetallic systems, the homogeneous precon-ditioners are not able to achieve condition numbers lessthan 10 on the large testcases, with the exception of theunpreconditioned SiO surface.By contrast, our parameter-free LDOS-based schemeperforms fairly well, achieving condition numbers lessthan 10 for all systems not containing the semiconduc-tor GaAs, with associated quick convergence of the SCF.Even for the systems containing GaAs, condition num-bers are reasonable, never exceeding 25. The convergenceis accordingly relatively quick, with the exception of theGaAs+SiO system mentioned above.On systems containing GaAs, we tested theLDOS+dielectric and the localized preconditioner. InAl+GaAs, adding the dielectric susceptibility model re-sults in a low condition number, suggesting that it doesnot significantly hamper the good model in the metallicregion. In the GaAs surface (where the LDOS is con-stant zero), plainly adding the susceptibility model doesnot significantly improve things. However, localizing iton the GaAs results in a low condition number. The im-provement is not as marked on the GaAs+SiO , wheresurface effects might play a larger role. Finally, the largerAl+GaAs+SiO is adequately preconditioned with theLDOS+dielectric model, without need of localizing it.In Figure 5, we compare our preconditioner against FIG. 5. Convergence plots for (a) the Al+vacuum and (b)the Al+SiO systems with 20 repeats. the inhomogeneous TFW preconditioner of Raczkowski et al. on two specific metallic systems, for which theTFW model should be adequate in the metallic region:the Al surface, and the Al+SiO interface. On both thesesystems, our LDOS preconditioner converges in about 10iterations. By contrast, the TFW preconditioner is less3efficient, the effect being especially pronounced in theAl+SiO system where the TFW model is very inade-quate in the SiO region, being based on the density in-stead of the LDOS for determining dielectric properties.We illustrate this point by plotting in Figure 6 the pseu-dodensity and the LDOS in these two systems. As is ap-parent, the density is drastically different in the vacuumand the SiO regions, although both have very similarscreening properties, having no free electrons. By con-trast, the LDOS is able to correctly recognize the lack offree electrons in the insulator region.In Figure 6, we also plot the LDOS computed at twodifferent temperatures: that used in the SCF, and ahigher temperature. The LDOS at the SCF tempera-ture is very noisy because of the insufficient Brillouinzone sampling relative to the small temperature used. Incontrast the LDOS at the elevated temperature is moreregular, closer to the actual LDOS of this system. Nev-ertheless, we found the LDOS preconditioning to be rel-atively insensitive to the noise at the SCF temperaturein our test cases. Should this manifest as an issue inother calculations, a higher smearing temperature couldbe used for determining the LDOS, or the LDOS couldbe artificially smoothed.Apart from the rather artificial (but illustrative) testsystems considered here, we have also applied the LDOSpreconditioner to further tests taken from the scf-xn testsuite . These similarly demonstrate that this precondi-tioner, with its ability to switch on and off according tothe character of the system (metallic or not), is especiallyapplicable to mixed systems containing metals, insulatorsand vacuum. For example, using a kinetic energy cutoffof 35 Hartree and otherwise the computational parame-ters from Woods et al. , we found that the LDOS pre-conditioner, gives a rapid and steady convergence to anenergy error of 10 − in around 10 iterations for a varietyof test cases such as a caffeine molecule, a slab of gold(Au supercell), or a cluster of 51 water molecules, andis therefore more appropriate as a “black-box” precondi-tioner than the homogeneous schemes. It was howeveras ineffective as homogeneous schemes in “complicated”inhomogeneous metals with localized orbitals close to theFermi level. VIII. CONCLUSION
A novel preconditioning scheme for the self-consistentKohn-Sham equations based on the local density ofstates (LDOS) has been constructed and its performanceevaluated in representative test systems. We showed ourscheme to be broadly applicable for large inhomogeneoussystems involving metals, vacuum and insulators. Thekey idea of the proposed preconditioning scheme is toapproximate the independent-particle susceptibility χ by its large-scale component (50), which only involvesthe readily computable LDOS. The result is a precondi-tioner which is simple to implement, inexpensive to com- FIG. 6. Local density of states (LDOS) and pseudodensity ρ (thick line) of the (a) Al+vacuum system and the (b) Al+SiO system. Values are averaged in the xy -plane. The LDOS isshown both for the temperature used for preconditioning andduring the SCF ( T = 0 .
001 Hartree) and at a higher temper-ature, where it is more accurate due to insufficient Brillouinzone sampling. The part of the cell occupied by aluminiumis shown in grey background and atomic positions projectedonto the z -axis are shown with a tick. pute and free of any parameters. By nature of the LDOSthe scheme automatically adapts to the local environ-ment in mixed systems, effectively interpolating betweena Kerker-like treatment in metallic regions and no pre-conditioning where the local density of states is zero.We believe our proposed LDOS-based preconditioningscheme to be an important step towards a generally ap-plicable, parameter-free and black-box preconditioner forSCF iterations. In its present state there are, however, anumber of limitations. Firstly, as we have discussed theLDOS is not able to distinguish insulators and semicon-ductors. As a remedy we have investigated an empiricalcorrection by adding a dielectric model parametrized inthe macroscopic dielectric constant. The resulting pre-conditioner performs very well in cases with semiconduct-ing regions, but an appropriate dielectric constant has tobe selected.In order to go beyond the current approach, one couldtry to improve on the diagonal approximation for χ torepresent polarizability effects. Unlike the LDOS, thepolarizability is however more expensive to compute and4cannot be easily localized , making it difficult to developa fully transferable preconditioning scheme. Another lim-itation of our current preconditioner is that it only fo-cuses on preventing long-range charge-sloshing and noneof the other sources of ill-conditioning in SCF schemes(electronic phase transitions and localized states close tothe Fermi level). We intend to tackle this in future work.We believe the idea of approximating χ instead of di-rectly tackling the inverse dielectric operator to be fruit-ful for this endeavor as well, because χ is susceptibleto additive approximations (unlike ε − ). Therefore dif-ferent physics (charge transfer, polarizability...) can besummed up, resulting in a composable dielectric model.Finally, we mention that inexpensive approximations ofthe dielectric operator can be useful in other contexts as well, such as for preconditioning geometry optimization,in response computations, or for constructing screenedCoulomb interactions. ACKNOWLEDGMENTS
This project has received funding from the ISCD(Sorbonne Universit´e) and from the European ResearchCouncil (ERC) under the European Union’s Horizon2020 research and innovation program (grant agreementNo 810367). We gratefully acknowledge stimulation dis-cussions with E. Canc`es, X. Gonze, P. Hasnip, L. Lin,and C. Yang. ∗ [email protected] † [email protected] P. Hohenberg and W. Kohn, Physical Review , B864(1964). W. Kohn and L. J. Sham, Physical Review , A1133(1965). G. Kresse and J. Furthm¨uller, Physical Review B , 11169(1996). J. K. Norskov, F. Abild-Pedersen, F. Studt, and T. Bli-gaard, Proceedings of the National Academy of Sciences , 937 (2011). P. J. Hasnip, K. Refson, M. I. J. Probert, J. R. Yates, S. J.Clark, and C. J. Pickard, Philosophical Transactions of theRoyal Society A: Mathematical, Physical and EngineeringSciences , 20130270 (2014). D. Morgan, G. Ceder, and S. Curtarolo, MeasurementScience and Technology , 296 (2005). A. Jain, G. Hautier, C. J. Moore, S. Ping Ong, C. C. Fis-cher, T. Mueller, K. A. Persson, and G. Ceder, Computa-tional Materials Science , 2295 (2011). W. Setyawan, R. M. Gaume, S. Lam, R. S. Feigelson, andS. Curtarolo, ACS Combinatorial Science , 382 (2011). J. Greeley, T. F. Jaramillo, J. Bonde, I. Chorkendorff, andJ. K. Nørskov, Nature Materials , 909 (2006). F. Studt, F. Abild-Pedersen, T. Bligaard, R. Z. Sorensen,C. H. Christensen, and J. K. Norskov, Science , 1320(2008). E. Sk´ulason, T. Bligaard, S. Gudmundsd´ottir, F. Studt,J. Rossmeisl, F. Abild-Pedersen, T. Vegge, H. J´onsson,and J. K. Nørskov, Phys. Chem. Chem. Phys. , 1235(2012). L. R. Johnson, S. Sridhar, L. Zhang, K. D. Fredrick-son, A. S. Raman, J. Jang, C. Leach, A. Padmanabhan,C. C. Price, N. C. Frey, A. Raizada, V. Rajaraman, S. A.Saiprasad, X. Tang, and A. Vojvodic, ACS Catalysis ,253 (2020). G. Hautier, A. Jain, H. Chen, C. Moore, S. P. Ong,and G. Ceder, Journal of Materials Chemistry , 17147(2011). S. Curtarolo, W. Setyawan, G. L. Hart, M. Jahnatek,R. V. Chepulskii, R. H. Taylor, S. Wang, J. Xue, K. Yang,O. Levy, M. J. Mehl, H. T. Stokes, D. O. Demchenko,and D. Morgan, Computational Materials Science , 218 (2012). P. H. Dederichs and R. Zeller, Physical Review B , 5462(1983). G. P. Kerker, Physical Review B , 3082 (1981). D. Raczkowski, A. Canning, and L. W. Wang, PhysicalReview B , 121101 (2001). Y. Zhou, H. Wang, Y. Liu, X. Gao, and H. Song, PhysicalReview E , 033305 (2018). L. Lin and C. Yang, SIAM Journal on Scientific Computing , S277 (2013). P. Hasnip and M. Probert, arXiv preprintarXiv:1503.01420 (2015). K.-M. Ho, J. Ihm, and J. D. Joannopoulos, Physical Re-view B , 4260 (1982). P.-M. Anglade and X. Gonze, Physical Review B ,045126 (2008). N. Woods, M. Payne, and P. Hasnip, Journal of Physics:Condensed Matter , 453001 (2019). X. Gonze, Physical Review B , 4383 (1996). E. Canc`es, G. Kemlin, and A. Levitt, arXiv preprintarXiv:2004.09088 (2020). M. Chupin, M.-S. Dupuy, G. Legendre, and ´E. S´er´e, arXivpreprint arXiv:2002.12850 (2020). Y. Saad,
Iterative Methods for Sparse Linear Systems , 2nded., edited by Y. Saad (SIAM Publishing, 2003). A. Levitt, Archive for Rational Mechanics and Analysis ,1 (2020). ´E. Canc`es and M. Lewin, Archive for Rational Mechanicsand Analysis , 139 (2010). R. Resta, Physical Review B , 2717 (1977). S. Kumar, Q. Xu, and P. Suryanarayana, ChemicalPhysics Letters , 136983 (2020). M. F. Herbst and A. Levitt, “Density-functional toolkit(DFTK), version v0.1.8,” (2020), https://dftk.org . J. P. Perdew, K. Burke, and M. Ernzerhof, Physical Re-view Letters , 3865 (1996). S. Lehtola, C. Steigemann, M. J. Oliveira, and M. A.Marques, SoftwareX , 1 (2018). S. Goedecker, M. Teter, and J. Hutter, Physical ReviewB , 1703 (1996). M. F. Herbst and A. Levitt, “Computationalscripts and raw data for the presented study ofpreconditioners,” https://github.com/mfherbst/ supporting-ldos-preconditioning . H. F. Wilson, F. Gygi, and G. Galli, Physical Review B (2008), 10.1103/physrevb.78.113303. H. F. Wilson, D. Lu, F. Gygi, and G. Galli, PhysicalReview B (2009), 10.1103/physrevb.79.245106. E. Canc`es and V. Ehrlacher, Archive for rational mechanicsand analysis , 933 (2011). P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car,C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococ-cioni, I. Dabo, A. Dal Corso, S. de Gironcoli, S. Fab-ris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougous-sis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari,F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello,L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P.Seitsonen, A. Smogunov, P. Umari, and R. M. Wentz-covitch, Journal of Physics: Condensed Matter , 395502 (2009). P. Giannozzi, O. Andreussi, T. Brumme, O. Bunau,M. Buongiorno Nardelli, M. Calandra, R. Car, C. Cavaz-zoni, D. Ceresoli, M. Cococcioni, N. Colonna, I. Carnimeo,A. Dal Corso, S. de Gironcoli, P. Delugas, R. A. DiStasio,A. Ferretti, A. Floris, G. Fratesi, G. Fugallo, R. Gebauer,U. Gerstmann, F. Giustino, T. Gorni, J. Jia, M. Kawa-mura, H.-Y. Ko, A. Kokalj, E. K¨u¸c¨ukbenli, M. Lazzeri,M. Marsili, N. Marzari, F. Mauri, N. L. Nguyen, H.-V. Nguyen, A. Otero-de-la Roza, L. Paulatto, S. Ponc´e,D. Rocca, R. Sabatini, B. Santra, M. Schlipf, A. P. Seitso-nen, A. Smogunov, I. Timrov, T. Thonhauser, P. Umari,N. Vast, X. Wu, and S. Baroni, Journal of Physics: Con-densed Matter , 465901 (2017). J. Sipe and J. Van Kranendonk, Molecular Physics35