Differentiable but exact formulation of density-functional theory
Simen Kvaal,1,a) Ulf Ekström,1 Andrew M. Teale,2,1 and Trygve Helgaker1
1) Centre for Theoretical and Computational Chemistry, Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, N-0315 Oslo, Norway
2) School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
The universal density functional $F$ of density-functional theory is a complicated and ill-behaved function of the density. In particular, $F$ is not differentiable, which makes many formal manipulations more complicated. Whilst $F$ has been well characterized in terms of convex analysis as forming a conjugate pair $(E, F)$ with the ground-state energy $E$ via the Hohenberg–Kohn and Lieb variation principles, $F$ is nondifferentiable and subdifferentiable only on a small (but dense) set of its domain. In this article, we apply a tool from convex analysis, Moreau–Yosida regularization, to construct, for any $\epsilon > 0$, pairs of conjugate functionals $({}^{\epsilon}E, {}^{\epsilon}F)$ that converge to $(E, F)$ pointwise everywhere as $\epsilon \to 0^+$, and such that ${}^{\epsilon}F$ is (Fréchet) differentiable. For technical reasons, we limit our attention to molecular electronic systems in a finite but large box. It is noteworthy that no information is lost in the Moreau–Yosida regularization: the physical ground-state energy $E(v)$ is exactly recoverable from the regularized ground-state energy ${}^{\epsilon}E(v)$ in a simple way. All concepts and results pertaining to the original $(E, F)$ pair have direct counterparts in results for $({}^{\epsilon}E, {}^{\epsilon}F)$. The Moreau–Yosida regularization therefore allows for an exact, differentiable formulation of density-functional theory. In particular, taking advantage of the differentiability of ${}^{\epsilon}F$, a rigorous formulation of Kohn–Sham theory is presented that does not suffer from the noninteracting representability problem in standard Kohn–Sham theory.

I. INTRODUCTION
Modern density-functional theory (DFT) was introduced by Hohenberg and Kohn in a classic paper and is now the workhorse of quantum chemistry and other fields of quantum physics. Subsequently, DFT was put on a mathematically firm ground by Lieb using convex analysis. The central quantity of DFT is the universal density functional $F(\rho)$, which represents the electronic energy of the system consistent with a given density $\rho$. Clearly, the success of DFT hinges on the modelling of $F$, an extremely complicated function of the electron density. It is an interesting observation that, over the last two or three decades, $F$ has been modelled sufficiently accurately to make DFT the most widely applied method of quantum chemistry, in spite of the fact that Schuch and Verstraete have shown how considerations from the field of computational complexity place fundamental limits on exact DFT: if $F(\rho)$ could be found efficiently, all NP-hard problems would be solvable in polynomial time, which is highly unlikely.

From a mathematical point of view, DFT is neatly formulated using convex analysis: the universal density functional $F(\rho)$ and the ground-state energy $E(v)$ are related by a conjugation operation, with the density $\rho$ and external potential $v$ being elements of a certain Banach space $X$ and its dual $X^*$, respectively. The functionals $F$ and $E$ are equivalent in the sense that they contain the same information: each can be generated exactly from the other.

a) Electronic mail: [email protected]

The universal density functional $F$ is convex and lower semi-continuous but otherwise highly irregular and ill behaved. Importantly, $F$ is everywhere discontinuous and not differentiable in any sense that justifies taking the functional derivative in formal expressions, even for the $v$-representable densities, as pointed out by Lammert.
For example, it is common practice to formally differentiate $F$ with respect to the density, interpreting the functional derivative "$-\delta F(\rho)/\delta\rho(\mathbf{r})$" as a scalar potential at $\mathbf{r}$. However, this derivative, a Gâteaux derivative, does not exist.

Together with the problem of $v$-representability, conventional DFT is riddled with mathematically unfounded assumptions that are, in fact, probably false. For example, conventional Kohn–Sham theory assumes, in addition to differentiability of $F$, that, if $\rho$ is $v$-representable for an interacting $N$-electron system, then $\rho$ is also $v$-representable for the corresponding noninteracting system. While providing excellent predictive results with modelled approximate density functionals, it is, from a mathematical perspective, unclear why Kohn–Sham DFT works at all.

It is the goal of this article to remedy this situation by introducing a family of regularized DFTs based on a tool from convex analysis known as the Moreau envelope or Moreau–Yosida regularization. For $\epsilon > 0$, the idea is to introduce a regularized energy functional ${}^{\epsilon}E$ related to the usual ground-state energy $E$ by

$$ {}^{\epsilon}E(v) = E(v) - \tfrac{1}{2}\epsilon \|v\|_2^2 , \tag{1} $$

where $\|\cdot\|_2$ is the usual $L^2$-norm. The convex conjugate of ${}^{\epsilon}E$ is the Moreau envelope ${}^{\epsilon}F$ of $F$, from which the regularized ground-state energy can be obtained by a Hohenberg–Kohn minimization over densities:

$$ {}^{\epsilon}E(v) = \inf_{\rho} \left( {}^{\epsilon}F(\rho) + (v|\rho) \right) , \tag{2} $$

where $(v|\rho) = \int v(\mathbf{r})\rho(\mathbf{r})\,\mathrm{d}\mathbf{r}$. The usual Hohenberg–Kohn variation principle is recovered as $\epsilon \to 0^+$. Importantly, the Moreau envelope ${}^{\epsilon}F(\rho)$ is everywhere differentiable and converges pointwise from below to $F(\rho)$ as $\epsilon \to 0^+$. We use the term "regularized" for both ${}^{\epsilon}E$ and ${}^{\epsilon}F$, although it is ${}^{\epsilon}F$ that, as will be shown below, becomes differentiable through the procedure.

A remark regarding the Banach spaces of densities and potentials is here in order. If $v$ is a Coulomb potential, then the regularization term in Eq. (1) becomes infinite. Moreover, the strongest results concerning the Moreau–Yosida regularization are obtained in a reflexive setting. The usual Banach spaces $X = L^3(\mathbb{R}^3) \cap L^1(\mathbb{R}^3)$ and $X^* = L^{3/2}(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$ for densities and potentials, respectively, are therefore abandoned, and both replaced with the Hilbert space $L^2(B_\ell)$, where $B_\ell = [-\ell/2, \ell/2]^3$ is an arbitrarily large but finite box in $\mathbb{R}^3$. As is well known, domain truncation represents a well-behaved approximation: as $\ell$ increases, all eigenvalues converge to the $\mathbb{R}^3$-limit. Moreover, the continuous spectrum is approximated by an increasing number of eigenvalues whose spacing converges to zero.

We observe that, in the box, the difference $E(v) - {}^{\epsilon}E(v) = \tfrac{1}{2}\epsilon\|v\|_2^2$ is arbitrarily small and explicitly known: it does not relate to the electronic structure of the system and is easily calculated from $v$.
Nothing is therefore lost in the transition from $(E, F)$ to $({}^{\epsilon}E, {}^{\epsilon}F)$. On the contrary, we obtain a structurally simpler theory that allows taking the derivative of expressions involving the universal functional. Moreover, the differentiability of ${}^{\epsilon}F$ implies $v$-representability of any $\rho$, for noninteracting as well as interacting systems, as needed for a rigorous formulation of Kohn–Sham theory. In this paper, we explore the Moreau envelope as applied to DFT, demonstrating how every concept of standard DFT has a counterpart in the Moreau-regularized formulation of DFT and vice versa.

The remainder of the article is organized as follows: In Sec. II, we review formal DFT and discuss the regularity issues of the universal functional within the nonreflexive Banach-space setting of Lieb. In preparation for the Moreau–Yosida regularization, we next reformulate DFT in a truncated domain in Sec. III, introducing the Hilbert space $L^2(B_\ell)$ as density and potential space.

The Moreau–Yosida regularization is a standard technique of convex analysis, applicable to any convex function such as the universal density functional. We introduce this regularization in Sec. IV, reviewing its basic mathematical properties. To establish notation, a review of convex analysis is given in the Appendix; for a good textbook of convex analysis in a Hilbert space, with an in-depth discussion of the Moreau–Yosida regularization, see Ref. 7.

Following the introduction of the Moreau–Yosida regularization, we apply it to DFT in Sec. V and subsequently to Kohn–Sham theory in Sec. VI. Finally, Sec. VII contains some concluding remarks.

II. PRELIMINARIES

A. Formal DFT
In DFT, we express the Born–Oppenheimer ground-state problem of an $N$-electron system in the external electrostatic potential $v(\mathbf{r})$ as a problem referring only to the one-electron density $\rho(\mathbf{r})$. The Born–Oppenheimer $N$-electron molecular Hamiltonian is given by

$$ H_\lambda(v) = \hat{T} + \lambda\hat{W} + \hat{v} , \tag{3} $$

where $\hat{T}$ and $\hat{W}$ are the kinetic-energy and electron-electron repulsion operators, respectively, while $\hat{v}$ is a multiplicative $N$-electron operator corresponding to the scalar potential $v(\mathbf{r})$. The scalar $\lambda$ is introduced to distinguish between the interacting ($\lambda = 1$) and noninteracting ($\lambda = 0$) systems.

By Levy's constrained-search argument, the (fully interacting) ground-state energy,

$$ E(v) = \inf_{\Psi} \langle \Psi | H_1(v) | \Psi \rangle , \tag{4} $$

can be written in the form of a Hohenberg–Kohn variation principle,

$$ E(v) = \inf_{\rho \in \mathcal{I}_N} \left( F(\rho) + (v|\rho) \right) , \tag{5} $$

where $\mathcal{I}_N$ is the set of $N$-representable densities. That is, $\rho \in \mathcal{I}_N$ if and only if there exists a normalized $N$-electron wave function with finite kinetic energy and density $\rho$. In Eq. (4), the infimum extends over all properly symmetrized and normalized $\Psi \in H^1(\mathbb{R}^{3N})$, the first-order Sobolev space consisting of those functions in $L^2(\mathbb{R}^{3N})$ that have first-order derivatives also in $L^2(\mathbb{R}^{3N})$ and therefore have a finite kinetic energy.

Different universal density functionals $F$ can be used in Eq. (5), the only requirement of an admissible functional being that the correct ground-state energy $E(v)$ is recovered.

Given that $\int \rho(\mathbf{r})\,\mathrm{d}\mathbf{r} = N$, it follows that $\mathcal{I}_N \subset L^1(\mathbb{R}^3)$. As demonstrated by Lieb in Ref. 2, the universal density functional $F$ can be chosen as a unique lower semi-continuous convex function with respect to the $L^1(\mathbb{R}^3)$ topology. (By definition, therefore, $F(\rho) = +\infty$ for any $\rho \notin \mathcal{I}_N$; see Appendix A for remarks on extended-valued functions.)
Moreover, by a Sobolev inequality, we may embed the $N$-representable densities in the Banach space $X = L^1(\mathbb{R}^3) \cap L^3(\mathbb{R}^3)$, with norm $\|\cdot\|_X = \|\cdot\|_{L^1} + \|\cdot\|_{L^3}$ and topological dual $X^* = L^\infty(\mathbb{R}^3) + L^{3/2}(\mathbb{R}^3)$. Given that this Banach space $X$ has a stronger topology than $L^1(\mathbb{R}^3)$, a convergent sequence in $X$ converges also in $L^1$. From the lower semi-continuity of $F$ in $L^1(\mathbb{R}^3)$, we then obtain

$$ \|\rho_n - \rho\|_X \to 0 \;\Rightarrow\; \|\rho_n - \rho\|_1 \to 0 \;\Rightarrow\; \liminf_n F(\rho_n) \ge F(\rho) , \tag{6} $$

implying that $F$ is lower semi-continuous also in the topology of $X$. We note that the choice $X = L^1 \cap L^3$ is not unique, but it has the virtue that all Coulomb potentials are contained in $X^*$.

On the chosen Banach spaces, the (concave and continuous) ground-state energy $E : X^* \to \mathbb{R} \cup \{-\infty\}$ and the (convex and lower semi-continuous) universal density functional $F : X \to \mathbb{R} \cup \{+\infty\}$ are related by the variation principles

$$ E(v) = \inf_{\rho \in X} \left( F(\rho) + (v|\rho) \right) , \quad v \in X^* , \tag{7a} $$
$$ F(\rho) = \sup_{v \in X^*} \left( E(v) - (v|\rho) \right) , \quad \rho \in X . \tag{7b} $$

In the terminology of convex analysis (see Appendix A), $\rho \mapsto F(\rho)$ and $v \mapsto -E(-v)$ are each other's convex Fenchel conjugates. To reflect the nonsymmetric relationship between $E$ and $F$ in Eqs. (7a) and (7b), we introduce the nonstandard but useful mnemonic notation

$$ F = E^\vee , \tag{8a} $$
$$ E = F^\wedge , \tag{8b} $$

which is suggestive of the "shape" of the resulting functions: $F^\wedge = E$ is concave, whereas $E^\vee = F$ is convex.

The density functional $F$ in Eq. (7b) is an extension of the universal functional $F_{\mathrm{HK}}$ derived by Hohenberg and Kohn, the latter functional having from our perspective the problem that it is defined only for ground-state densities ($v$-representable densities) in $\mathcal{A}_N$, an implicitly defined set that we do not know how to characterize.

It can be shown that the functional $F$ defined by Eq. (7b) is identical to the constrained-search functional $F(\rho) = \inf_{\Gamma \mapsto \rho} \mathrm{Tr}\,(\hat{T} + \lambda\hat{W})\Gamma$, where the minimization is over all ensemble density matrices $\Gamma$ corresponding to a density $\rho$, constructed from $N$-electron wave functions with a finite kinetic energy. A related functional is the (nonconvex) Levy–Lieb constrained-search functional, $F_{\mathrm{LL}}(\rho) = \inf_{\Psi \mapsto \rho} \langle \Psi | (\hat{T} + \lambda\hat{W}) | \Psi \rangle$, obtained by minimizing over pure states only. In any case, Eq. (7b) defines the unique lower semi-continuous, convex universal functional such that $F = (F^\wedge)^\vee$. In fact, any $\bar{F}$ that satisfies the condition $(\bar{F}^\wedge)^\vee = F$ is an admissible density functional. In particular, $F_{\mathrm{LL}}$ and $F_{\mathrm{HK}}$ are both admissible, satisfying this requirement when extended from their domains ($\mathcal{I}_N$ and $\mathcal{A}_N$, respectively) to all of $X$ by setting them equal to $+\infty$ elsewhere.

B. Nondifferentiability of F

The Hohenberg–Kohn variation principle in Eq. (5) is appealing, reducing the $N$-electron problem to a problem referring only to one-electron densities. However, as discussed in the introduction, $F$ is a complicated function. In particular, here we consider its nondifferentiability.

The Gâteaux derivative is closely related to the notion of directional derivatives, see Appendix A.
A function $F$ is Gâteaux differentiable at $\rho \in X$ if the directional derivative $F'(\rho; \sigma)$ is linear and continuous in all directions $\sigma \in X$, meaning that there exists a $\delta F(\rho)/\delta\rho \in X^*$ such that

$$ F'(\rho; \sigma) := \left. \frac{\mathrm{d}F(\rho + s\sigma)}{\mathrm{d}s} \right|_{s = 0^+} = \left( \frac{\delta F(\rho)}{\delta\rho} \,\middle|\, \sigma \right) . \tag{9} $$

However, $F$ is finite only on $\mathcal{I}_N$. In a direction $\sigma \in X$ such that $\int (\rho(\mathbf{r}) + \sigma(\mathbf{r}))\,\mathrm{d}\mathbf{r} \neq N$, $F(\rho + s\sigma) = +\infty$ for all $s > 0$, implying that $F'(\rho; \sigma) = +\infty$ and hence that $F$ is not continuous in the direction of $\sigma$. The same argument shows that $F$ is discontinuous also in directions $\sigma$ such that the density $\rho + s\sigma$ is negative in a volume of nonzero measure for all $s > 0$.

One might therefore consider restricting attention to $X_N^+$, the subset of $X$ containing all nonnegative functions that integrate to $N$ electrons. After all, the discontinuity of $F$ in directions that change the particle number is typically dealt with using a Lagrange multiplier for the particle number constraint. However, Lammert has demonstrated that, even within $X_N^+$, there are, for each $\rho$, directions such that $F'(\rho; \sigma) = +\infty$, associated with short-scale but very rapid spatial oscillations in the density (and an infinite kinetic energy).

C. Subdifferentiability of F

Apart from lower (upper) semi-continuity of a convex (concave) function, the minimal useful regularity is not Gâteaux differentiability but subdifferentiability (superdifferentiability), see Appendix A. Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be convex lower semi-continuous. The subdifferential of $f$ at $x$, $\partial f(x) \subset X^*$, is by definition the collection of slopes of supporting continuous tangent functionals of $f$ at $x$, known as the subgradients of $f$ at $x$, see Fig. 2 in Appendix A. If the graph of $f$ has a "kink" at $x$, then there exists more than one such subgradient. At a given point $x \in \mathrm{dom}(f)$, the subdifferential $\partial f(x)$ may be empty. We denote by $\mathrm{dom}(\partial f)$ the set of points $x \in \mathrm{dom}(f)$ such that $\partial f(x) \neq \emptyset$. It is a fact that $\mathrm{dom}(\partial f)$ is dense in $\mathrm{dom}(f)$ when $f$ is a proper lower semi-continuous convex function.
The superdifferential of a concave function is similarly defined.

Together with convexity, subdifferentiability is sufficient to characterize minima of convex functions: a convex lower semi-continuous functional $f : X \to \mathbb{R} \cup \{+\infty\}$ has a global minimum at $x \in X$ if and only if $0 \in \partial f(x)$. Similarly, $x \mapsto f(x) + \langle \varphi, x \rangle$ has a minimum at $x$ if and only if $-\varphi \in \partial f(x)$.

Subdifferentiability is a substantially weaker concept than that of Gâteaux (or directional) differentiability. Clearly, if $f(x)$ is Gâteaux differentiable at $x$, then $\partial f(x) = \{\delta f(x)/\delta x\}$. However, the converse is not true: in infinite-dimensional spaces, it is possible that $\partial f(x) = \{y\}$, a singleton, while $f(x)$ is not differentiable at $x$. This is so because $\partial f(x)$ being a singleton is not enough to guarantee continuity of $f$.

In DFT, subdifferentiability has an important interpretation. Suppose $\rho$ is an ensemble ground-state density of $v$, meaning that, for all $\rho' \in \mathcal{I}_N$, we have the inequality

$$ E(v) = F(\rho) + (v|\rho) \le F(\rho') + (v|\rho') . \tag{10} $$

Then, the subdifferential of $F$ at $\rho$ is

$$ \partial F(\rho) = \{ -v + \mu : \mu \in \mathbb{R} \} , \tag{11} $$

which is a restatement of the first Hohenberg–Kohn theorem: the potential for which $\rho$ is a ground-state density is unique up to a constant shift. On the other hand, if $\rho$ is not a ground-state density for any $v \in X^*$, then $\partial F(\rho) = \emptyset$. Thus, a nonempty subdifferential is equivalent to (ensemble) $v$-representability: $\rho \in \mathrm{dom}(\partial F)$ if and only if $\rho$ is $v$-representable. Denoting the set of ensemble $v$-representable densities by $\mathcal{B}_N$, we obtain

$$ \rho \in \mathcal{B}_N \iff \partial F(\rho) \neq \emptyset . \tag{12} $$

We note that $\mathcal{B}_N$ is dense in $X_N^+$, the subset of $X$ containing all nonnegative functions that integrate to $N$ electrons.

However, even though subdifferentiability is sufficient for many purposes, differentiability of $F$ would make formal manipulations easier. Moreover, the characterization of $v$-representable $\rho \in \mathcal{B}_N$ is unknown and probably dependent on the interaction strength $\lambda$.
These observations motivate the search for a differentiable regularization of the universal functional.

D. Superdifferentiability of E

Let us briefly consider the superdifferential of $E$, a concave continuous (and hence upper semi-continuous) function over $X^*$. A fundamental theorem of convex analysis states that

$$ -v \in \partial F(\rho) \iff \rho \in \partial E(v) , \tag{13} $$

where we use the same notation for sub- and superdifferentials. Thus, the potential $v$ has a ground state with density $\rho$ if and only if $\rho \in \partial E(v)$; if $v$ does not support a ground state, then $\partial E(v)$ is empty. Denoting the set of potentials in $X^*$ that support a ground state by $\mathcal{V}_N$, we obtain

$$ v \in \mathcal{V}_N \iff \partial E(v) \neq \emptyset . \tag{14} $$

If a ground state is nondegenerate, then $\partial E(v) = \{\rho\}$ is a singleton; together with the fact that $E$ is continuous, it then follows that $E$ is Gâteaux differentiable at $v$. On the other hand, if the ground state is degenerate, then the superdifferential is the convex hull of $g$ ground-state densities:

$$ \partial E(v) = \mathrm{co}\{\rho_1, \rho_2, \ldots, \rho_g\} , \tag{15} $$

and $E$ is not differentiable at this $v$ unless all the $\rho_i$ are equal, that is, unless the degenerate ground states have the same density. For example, in the absence of a magnetic field, the hydrogen atom has the degenerate ground states $1s\alpha$ and $1s\beta$, with the same density.

III. DOMAIN TRUNCATION
In Sec. IV, we outline the mathematical background for the Moreau–Yosida regularization. Many useful results, such as differentiability of the Moreau envelope ${}^{\epsilon}F(\rho)$, are only available when the underlying vector space $X$ is reflexive or, even better, when $X$ is a Hilbert space. However, the Banach space $X = L^1(\mathbb{R}^3) \cap L^3(\mathbb{R}^3)$ used in Lieb's formulation of DFT is nonreflexive. In this section, we truncate the full space $\mathbb{R}^3$ to a box $B_\ell = [-\ell/2, \ell/2]^3$ of finite volume $\ell^3$, so large that the ground-state energy of every system of interest is sufficiently close to the $\mathbb{R}^3$ limit. What is lost from this truncation is well compensated for by the fact that we may now formulate DFT using the Hilbert space

$$ \mathcal{H}_\ell := L^2(B_\ell) \tag{16} $$

for both potentials and densities, as we shall now demonstrate.

A. The ground-state problem
For the spatial domain $B_\ell$, the $N$-electron ground-state problem is a variational search for the lowest-energy wave function $\Psi \in H_0^1(B_\ell^N)$, the first-order Sobolev space with vanishing values on the boundary of $B_\ell^N$, the $N$-fold Cartesian product of $B_\ell$. The search is carried out only over the subset of $H_0^1(B_\ell^N)$ which is also normalized and properly symmetrized: for a total spin projection of $\hbar(N_\uparrow - N_\downarrow)/2$, the corresponding subset of wave functions is antisymmetric in the $N_\uparrow$ first and the $N_\downarrow$ last particle coordinates separately.

Any potential in the full space, $\tilde{v} \in L^{3/2}(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$, induces a potential $v = \tilde{v}\restriction_{B_\ell} \in L^{3/2}(B_\ell) + L^\infty(B_\ell)$ in the truncated domain. We remark that $L^{3/2}(B_\ell) + L^\infty(B_\ell) = L^{3/2}(B_\ell)$, with equivalent topologies. Since the domain is bounded, the Rellich–Kondrakov theorem states that $H_0^1(B_\ell^N)$ is compactly embedded in $L^2(B_\ell^N)$, which in turn implies that the spectrum of the Hamiltonian $H_\lambda(v)$ in Eq. (3) is purely discrete. Thus, for any potential $v$ in the box, one or more ground-state wave functions $\Psi_v \in H_0^1(B_\ell^N)$ exist.

We next observe that, if $\tilde{v}$ is a Coulomb potential, then the truncated potential $v$ belongs to $L^2(B_\ell)$. Moreover, $L^2(B_\ell) \subset L^{3/2}(B_\ell)$ since $B_\ell$ is bounded. It is therefore sufficient to consider the ground-state energy as a function

$$ E_\ell : \mathcal{H}_\ell \to \mathbb{R} . \tag{17} $$

Regarding the continuity of $E_\ell$, we note that the proof given in Ref. 2 for the continuity of $E$ in the $L^{3/2}(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$ topology is equally valid for $E_\ell$ in the $L^{3/2}(B_\ell) + L^\infty(B_\ell)$ topology. Convergence in $L^2(B_\ell)$ implies convergence in $L^{3/2}(B_\ell) + L^\infty(B_\ell)$. Therefore, $E_\ell$ is continuous in the $L^2(B_\ell)$ topology.

We remark that, as $\ell \to \infty$, $E_\ell(v)$ converges to the exact, full-space ground-state energy $E(v)$. On the other hand, the associated eigenfunctions converge if and only if the full-space ground-state energy $E(v)$ is an eigenvalue, with $v = 0$ as a counterexample.

B. Densities and the universal density functional
Invoking the usual ensemble constrained-search procedure, we obtain

$$ E_\ell(v) = \inf_{\rho \in \mathcal{I}_N(B_\ell)} \left( F_\ell(\rho) + (v|\rho) \right) , \tag{18} $$

where $\mathcal{I}_N(B_\ell)$ is the set of $N$-representable densities: $\rho \in \mathcal{I}_N(B_\ell)$ if and only if there exists a properly symmetrized and normalized $\Psi \in H_0^1(B_\ell^N)$ such that $\Psi \mapsto \rho$. It is straightforward to see that

$$ \mathcal{I}_N(B_\ell) = \left\{ \rho \in L^1(B_\ell) : \rho \ge 0 ,\; \sqrt{\rho} \in H_0^1(B_\ell) ,\; \int \rho(\mathbf{r})\,\mathrm{d}\mathbf{r} = N \right\} . \tag{19} $$

The density functional $F_\ell$ is completely analogous to the full-space functional $F$. In particular, $F_\ell$ is lower semi-continuous in the $L^1(B_\ell)$ topology by Theorem 4.4 and Corollary 4.5 in Ref. 2.

We remark that $F_\ell(\rho) = F(\rho)$ for any $\rho \in \mathcal{I}_N(B_\ell)$, as seen from the fact that, if $\Psi \in H^1(\mathbb{R}^{3N})$ and $\Psi \mapsto \rho$ with $\sqrt{\rho} \in H_0^1(B_\ell)$, then we must have $\Psi \in H_0^1(B_\ell^N)$.

Since $B_\ell$ is bounded, the Cauchy–Schwarz inequality gives, for any measurable $u$,

$$ \|u\|_1 = (1 \,|\, |u|) \le \|1\|_2 \|u\|_2 = |B_\ell|^{1/2} \|u\|_2 . \tag{20} $$

By an argument similar to that of Eq. (6), $F_\ell$ is now seen to be lower semi-continuous also with respect to the $L^2(B_\ell)$ topology. Note that

$$ \mathcal{I}_N(B_\ell) \subset L^1(B_\ell) \cap L^3(B_\ell) = L^3(B_\ell) \subset L^2(B_\ell) , \tag{21} $$

so that every $N$-representable density is in $L^2(B_\ell)$. Since $F_\ell$ is convex and lower semi-continuous on $\mathcal{H}_\ell = L^2(B_\ell)$, we may now formulate DFT in the Hilbert space $\mathcal{H}_\ell$ as

$$ E_\ell(v) = \inf_{\rho \in \mathcal{H}_\ell} \left( F_\ell(\rho) + (v|\rho) \right) , \tag{22a} $$
$$ F_\ell(\rho) = \sup_{v \in \mathcal{H}_\ell} \left( E_\ell(v) - (v|\rho) \right) . \tag{22b} $$

Given that Hilbert spaces possess a richer structure than Banach spaces, this formulation of DFT is particularly convenient: densities and potentials are now elements of the same vector space $\mathcal{H}_\ell$ and reflexivity is guaranteed.

Even for the full space, $\mathcal{I}_N(\mathbb{R}^3) \subset L^2(\mathbb{R}^3)$, indicating that it is possible to avoid the use of the box. Indeed, we may restrict the ground-state energy to potentials $v \in L^2(\mathbb{R}^3) \subset L^{3/2}(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$:

$$ \tilde{E} : L^2(\mathbb{R}^3) \to \mathbb{R} , \quad \tilde{E} = E\restriction_{L^2(\mathbb{R}^3)} , \tag{23} $$

a concave and continuous map.
Invoking the theory of conjugation within this reflexive Hilbert-space setting, we have a convex lower semi-continuous universal functional

$$ \tilde{F} : L^2(\mathbb{R}^3) \to \mathbb{R} \cup \{+\infty\} , \quad \tilde{F} = \tilde{E}^\vee = (\tilde{F}^\wedge)^\vee . \tag{24} $$

However, Coulomb potentials are not contained in $L^2(\mathbb{R}^3)$. On the other hand, this theory is sufficient for dealing with all truncated Coulomb potentials, obtained, for example, from the usual Coulomb potentials by setting them equal to zero outside the box $B_\ell$; it is also sufficient when working with Yukawa rather than Coulomb potentials.

The optimality conditions for the Hohenberg–Kohn and Lieb variation principles in Eqs. (22a) and (22b) are

$$ -v \in \partial F_\ell(\rho) \iff \rho \in \partial E_\ell(v) . \tag{25} $$

Denoting the set of densities for which $F_\ell$ is subdifferentiable by $\mathcal{B}_\ell$ (by analogy with $\mathcal{B}_N$ in $X$) and the set of potentials for which $E_\ell$ is superdifferentiable by $\mathcal{V}_\ell$ (by analogy with $\mathcal{V}_N$ in $X^*$), we obtain

$$ \mathcal{B}_\ell \subsetneq \mathcal{H}_\ell , \quad \mathcal{V}_\ell = \mathcal{H}_\ell , \tag{26} $$

where $\mathcal{B}_\ell$ is dense in the subset of $\mathcal{H}_\ell$ containing all nonnegative functions that integrate to $N$ electrons. The differentiability properties of $F_\ell$ are the same as those of $F$ discussed in Section II B. To introduce differentiability, a further regularization is necessary.

IV. MOREAU–YOSIDA REGULARIZATION
In this section, we present the basic theory of Moreau–Yosida regularization, introducing infimal convolutions in Section IV A, Moreau envelopes in Section IV B, proximal mappings in Section IV C, and conjugates of Moreau envelopes in Section IV D. The results are given mostly without proofs; for these proofs, we refer to the book by Bauschke and Combettes, whose notation we follow closely.

A. Infimal convolution
In preparation for the Moreau–Yosida regularization, we introduce the concept of infimal convolution in this section and discuss its properties on a Hilbert space $\mathcal{H}$.

Definition 1. For $f, g : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$, the infimal convolution is the function $f \,\square\, g : \mathcal{H} \to \mathbb{R} \cup \{\pm\infty\}$ given by

$$ (f \,\square\, g)(x) = \inf_{y \in \mathcal{H}} \left( f(y) + g(x - y) \right) . \tag{27} $$

In the context of convex conjugation, the infimal convolution is analogous to the standard convolution in the context of the Fourier transform. Here are some basic properties of the infimal convolution for functions that do not take on the value $-\infty$:

Theorem 1. Let $f, g : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$. Then:
1. $f \,\square\, g = g \,\square\, f$;
2. $\mathrm{dom}(f \,\square\, g) = \mathrm{dom} f + \mathrm{dom}\, g = \{x + x' : x \in \mathrm{dom} f,\; x' \in \mathrm{dom}\, g\}$;
3. $(f \,\square\, g)^\wedge = f^\wedge + g^\wedge$;
4. if $f$ and $g$ are convex, then $f \,\square\, g$ is convex.

Proof. See Ref. 7, Props. 12.6, 12.11 and 13.21.

Henceforth, we restrict our attention to all lower semi-continuous proper convex functions $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$, denoting the set of all such functions by $\Gamma_0(\mathcal{H})$, see Appendix A. We also need the concepts of coercivity and supercoercivity: a function $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ is coercive if $f(x) \to +\infty$ whenever $\|x\|_{\mathcal{H}} \to +\infty$ and supercoercive if $f(x)/\|x\|_{\mathcal{H}} \to +\infty$ whenever $\|x\|_{\mathcal{H}} \to +\infty$. For example, $F_\ell \in \Gamma_0(\mathcal{H}_\ell)$ is coercive, whereas $-E_\ell \in \Gamma_0(\mathcal{H}_\ell)$ is not coercive.

For functions in $\Gamma_0(\mathcal{H})$, we have the following stronger properties of the infimal convolution:

Theorem 2. Let $f, g \in \Gamma_0(\mathcal{H})$ such that either $g$ is supercoercive or $f$ is bounded from below and $g$ is coercive. Then:
1. $f \,\square\, g \in \Gamma_0(\mathcal{H})$;
2. $(f^\wedge + g^\wedge)^\vee = f \,\square\, g$;
3. for each $x \in \mathcal{H}$, there exists $x^* \in \mathcal{H}$ such that

$$ (f \,\square\, g)(x) = f(x^*) + g(x - x^*) , \tag{28} $$

where $x^*$ is unique if $g$ is strictly convex.

Proof. Point 1 follows from Ref. 7, Prop. 12.14. Point 2 follows from Theorem 1 above. Finally, Point 3 follows from the fact that strictly convex functions have unique minima; the existence of a minimum follows from the (super)coerciveness of the mapping $y \mapsto \|x - y\|_{\mathcal{H}}^2/2$.

B. The Moreau envelope
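Before specializing to the Moreau envelope, Definition 1 can be made concrete with a small numerical sketch. The example below is our own illustration and not part of the cited theory: it works on the real line rather than on a general Hilbert space, replaces the infimum by a minimum over grid points, and uses a helper name (`infimal_convolution`) that is purely hypothetical. It takes $f$ as the indicator function of $[-1, 1]$ and $g = |\cdot|$, for which the infimal convolution is the distance to the interval, and it also illustrates Point 2 of Theorem 1, since the result is finite on the whole grid.

```python
import numpy as np

def infimal_convolution(f_vals, g, grid):
    """Grid approximation of (f box g)(x) = min_y [f(y) + g(x - y)]."""
    diff = grid[:, None] - grid[None, :]          # diff[i, j] = x_i - y_j
    return np.min(f_vals[None, :] + g(diff), axis=1)

# Uniform grid -4.00, -3.99, ..., 4.00 on the real line (an assumption of the sketch).
grid = np.arange(-400, 401) / 100.0

# f: indicator of [-1, 1] (0 inside, +inf outside); g: absolute value.
indicator = np.where(np.abs(grid) <= 1.0, 0.0, np.inf)
conv = infimal_convolution(indicator, np.abs, grid)

# Known closed form for this pair: the distance from x to the interval [-1, 1].
distance = np.maximum(np.abs(grid) - 1.0, 0.0)
max_error = float(np.max(np.abs(conv - distance)))
```

Although $f$ is infinite outside $[-1, 1]$, the convolution is finite everywhere because $\mathrm{dom}\, g$ is the whole line; the same mechanism is what makes the regularized functional finite on all of the Hilbert space.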
In the following, we introduce the Moreau envelope of functions in $\Gamma_0(\mathcal{H})$ and review its properties.

Definition 2. For $f \in \Gamma_0(\mathcal{H})$ and $\epsilon > 0$, the Moreau–Yosida regularization or the Moreau envelope ${}^{\epsilon}f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ is the infimal convolution of $f$ with $x \mapsto \frac{1}{2\epsilon}\|x\|_{\mathcal{H}}^2$:

$$ {}^{\epsilon}f(x) = \inf_{y \in \mathcal{H}} \left( f(y) + \frac{1}{2\epsilon}\|x - y\|_{\mathcal{H}}^2 \right) . \tag{29} $$

Since $f \in \Gamma_0(\mathcal{H})$ and since $x \mapsto \frac{1}{2\epsilon}\|x\|_{\mathcal{H}}^2$ is strictly convex and supercoercive, it follows from Theorem 2 that ${}^{\epsilon}f \in \Gamma_0(\mathcal{H})$. In fact, ${}^{\epsilon}f$ is much more well behaved than a general function in $\Gamma_0(\mathcal{H})$, as the following theorem shows.

Theorem 3. The Moreau envelope ${}^{\epsilon}f$ of $f \in \Gamma_0(\mathcal{H})$ with $\epsilon > 0$ satisfies the following properties:
1. ${}^{\epsilon}f \in \Gamma_0(\mathcal{H})$ with $\mathrm{dom}\,{}^{\epsilon}f = \mathcal{H}$;
2. $\inf f(\mathcal{H}) \le {}^{\epsilon}f(x) \le {}^{\gamma}f(x) \le f(x)$ for all $x \in \mathcal{H}$ and all $0 \le \gamma \le \epsilon$;
3. $\inf {}^{\epsilon}f(\mathcal{H}) = \inf f(\mathcal{H})$;
4. for all $x \in \mathcal{H}$, ${}^{\epsilon}f(x) \to f(x)$ from below as $\epsilon \to 0^+$ (even if $x \notin \mathrm{dom} f$);
5. ${}^{\epsilon}f$ is continuous;
6. ${}^{\epsilon}f$ is Fréchet differentiable: for every $x \in \mathcal{H}$, there exists $\nabla\,{}^{\epsilon}f(x) \in \mathcal{H}$ such that, for all $y \in \mathcal{H}$,

$$ {}^{\epsilon}f(x + y) = {}^{\epsilon}f(x) + (\nabla\,{}^{\epsilon}f(x) \,|\, y) + o(\|y\|_{\mathcal{H}}) ; \tag{30} $$

7. the subdifferential of ${}^{\epsilon}f$ at $x$ is given by

$$ \partial\,{}^{\epsilon}f(x) = \{ \nabla\,{}^{\epsilon}f(x) \} . \tag{31} $$

Proof. Point 1 follows from Theorems 1 and 2. For Points 2 and 3, see Ref. 7, Prop. 12.9. For Point 4, see Prop. 12.32. For Points 5–7, see Props. 12.15, 12.28, and 12.29.

In Figure 1, the Moreau envelope is illustrated for a convex function $f$ on the real axis. We observe that the minimum value of $f(x)$ is preserved by the Moreau envelope ${}^{\epsilon}f(x)$ and that the second argument $x \mapsto \|x - x'\|_{\mathcal{H}}^2/(2\epsilon)$ to the infimal convolution removes all kinks, giving a curvature equal to that of this function.

C. The proximal mapping
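Before defining the proximal mapping, it may help to see Points 2 to 4 of Theorem 3 at work in the simplest possible setting. The sketch below is an illustration under our own assumptions: the Hilbert space is replaced by the real line, the infimum in Eq. (29) by a minimum over a uniform grid, and the test function $f(x) = |x|$ and the helper name `moreau_envelope` are chosen only for this example.

```python
import numpy as np

def moreau_envelope(f_vals, grid, eps):
    """Grid approximation of Eq. (29): min_y [f(y) + (x - y)^2 / (2 eps)]."""
    diff = grid[:, None] - grid[None, :]          # diff[i, j] = x_i - y_j
    return np.min(f_vals[None, :] + diff**2 / (2.0 * eps), axis=1)

grid = np.arange(-300, 301) / 100.0               # points -3.00, ..., 3.00
f_vals = np.abs(grid)                             # f(x) = |x| has a kink at 0

env_coarse = moreau_envelope(f_vals, grid, eps=1.0)
env_fine = moreau_envelope(f_vals, grid, eps=0.1)

# Largest distance between f and each envelope on the grid.
gap_coarse = float(np.max(f_vals - env_coarse))
gap_fine = float(np.max(f_vals - env_fine))
```

For this particular $f$, the envelope is the Huber function, so the gap to $f$ is at most $\epsilon/2$: the envelopes are ordered as in Point 2, share the minimum value of $f$ as in Point 3, and approach $f$ from below as $\epsilon$ decreases, in line with Point 4.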
FIG. 1. Illustration of the Moreau envelope of a simple convex function $f : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$. The function ${}^{\epsilon}f(x)$ is plotted in thick lines, whereas $f(x)$ is shown in a thinner line. Finally, for a chosen value of $x'$, the function $x \mapsto \|x - x'\|^2/(2\epsilon)$ is superposed on $f(x)$ and ${}^{\epsilon}f(x)$ using a dashed line.

From Theorem 2, it follows that the infimum of ${}^{\epsilon}f(x)$ in Eq. (29) is attained with a unique minimizer. We make the following definitions:

Definition 3. Let $f \in \Gamma_0(\mathcal{H})$ and $\epsilon > 0$. The proximal mapping $\mathrm{prox}_{\epsilon f} : \mathcal{H} \to \mathcal{H}$ is defined by

$$ \mathrm{prox}_{\epsilon f}(x) = \operatorname*{argmin}_{y \in \mathcal{H}} \left( f(y) + \frac{1}{2\epsilon}\|x - y\|_{\mathcal{H}}^2 \right) , \tag{32} $$

where $\mathrm{prox}_{\epsilon f}(x)$ is the proximal point of $f$ at $x \in \mathcal{H}$.

The usefulness of the proximal mapping follows from the following theorem:

Theorem 4. Let $f \in \Gamma_0(\mathcal{H})$ and $\epsilon > 0$. Then:
1. if $x \in \mathrm{dom} f$ and $\epsilon \to 0^+$, then

$$ \left\| \mathrm{prox}_{\epsilon f}(x) - x \right\|_{\mathcal{H}}^2 = O(\epsilon) ; \tag{33} $$

2. the Fréchet (and Gâteaux) derivative of ${}^{\epsilon}f$ at $x$ is given by

$$ \frac{\delta\,{}^{\epsilon}f(x)}{\delta x} = \nabla\,{}^{\epsilon}f(x) = \epsilon^{-1} \left( x - \mathrm{prox}_{\epsilon f}(x) \right) ; \tag{34} $$

3. for all $p, x \in \mathcal{H}$, it holds that

$$ p = \mathrm{prox}_{\epsilon f}(x) \iff \epsilon^{-1}(x - p) \in \partial f(p) ; \tag{35} $$

4. if $x \in \mathcal{H}$, then

$$ \nabla\,{}^{\epsilon}f(x) \in \partial f(\mathrm{prox}_{\epsilon f}(x)) . \tag{36} $$

Proof. For Point 1, see the proof of Prop. 12.32 in Ref. 7. For Point 2, see Prop. 12.29; for Point 3, see Prop. 12.26. Point 4 follows from Points 2 and 3.
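The statements of Theorem 4 can be checked by hand for $f(y) = |y|$ on the real line, whose proximal mapping is the well-known soft-thresholding operation. The sketch below is an illustration only: the function names are hypothetical, and the one-dimensional setting is an assumption made to keep the example short. It compares the gradient formula of Eq. (34) with a central finite difference of the envelope and verifies the inclusion of Eq. (36), using the fact that the subdifferential of $|\cdot|$ is $\{\mathrm{sign}(p)\}$ for $p \neq 0$ and $[-1, 1]$ at $p = 0$.

```python
import numpy as np

def prox_abs(x, eps):
    """Proximal point of f(y) = |y| in Eq. (32): soft-thresholding at eps."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def envelope_abs(x, eps):
    """Moreau envelope of |.|, evaluated at the minimizer of Eq. (29)."""
    p = prox_abs(x, eps)
    return np.abs(p) + (x - p) ** 2 / (2.0 * eps)

def grad_envelope_abs(x, eps):
    """Gradient of the envelope from Eq. (34): (x - prox(x)) / eps."""
    return (x - prox_abs(x, eps)) / eps

eps = 0.5
x = np.linspace(-2.0, 2.0, 9)

# Central finite difference of the envelope, to compare with Eq. (34).
h = 1e-6
fd = (envelope_abs(x + h, eps) - envelope_abs(x - h, eps)) / (2.0 * h)
grad = grad_envelope_abs(x, eps)
prox = prox_abs(x, eps)
```

Where the proximal point is nonzero, the gradient equals its sign; where the proximal point is zero, the gradient is $x/\epsilon \in [-1, 1]$. In both cases the gradient of the envelope at $x$ is a subgradient of $f$ at the proximal point, which is the content of Eq. (36).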
D. The conjugate of the Moreau envelope
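Given that ${}^{\epsilon}f \in \Gamma_0(\mathcal{H})$, there exists a concave ${}^{\epsilon}g \in -\Gamma_0(\mathcal{H})$ such that $({}^{\epsilon}f)^\wedge = {}^{\epsilon}g$ and $({}^{\epsilon}g)^\vee = {}^{\epsilon}f$. The following theorem gives the basic properties of this conjugate:

Theorem 5. If ${}^{\epsilon}f$ is the Moreau envelope of $f \in \Gamma_0(\mathcal{H})$, then their conjugates and the superdifferentials of these conjugates are related as

$$ ({}^{\epsilon}f)^\wedge(x) = f^\wedge(x) - \tfrac{1}{2}\epsilon\|x\|_{\mathcal{H}}^2 , \tag{37a} $$
$$ \partial({}^{\epsilon}f)^\wedge(x) = \partial f^\wedge(x) - \epsilon x . \tag{37b} $$

Proof. Eq. (37a) follows from the fact that the convex conjugate of $x \mapsto \|x\|_{\mathcal{H}}^2/(2\epsilon)$ is $x \mapsto \epsilon\|x\|_{\mathcal{H}}^2/2$, together with Point 3 of Theorem 1. Eq. (37b) follows from $\partial(\epsilon\|x\|_{\mathcal{H}}^2/2) = \{\epsilon x\}$.

Being related in such a simple manner, $f^\wedge$ and $({}^{\epsilon}f)^\wedge$ share many properties. We note, however, that $({}^{\epsilon}f)^\wedge$ is strictly concave, whereas $f^\wedge$ may be merely concave.

We remark that the Moreau envelope is not defined for a concave function $g \in -\Gamma_0(\mathcal{H})$, only for convex functions. Thus, the notation ${}^{\epsilon}g$ for a $g \in -\Gamma_0(\mathcal{H})$ is not to be interpreted as a Moreau envelope, but as the concave conjugate of a Moreau envelope, ${}^{\epsilon}g = ({}^{\epsilon}(g^\vee))^\wedge$.

V. MOREAU–YOSIDA REGULARIZED DFT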
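Before the regularization is applied to DFT, the conjugate relation of Eq. (37a) can be verified numerically in one dimension. The following sketch rests on several assumptions of our own: the Hilbert space is the real line, the infimum is taken over a finite grid, the test function is $f(y) = |y|$ (whose envelope is the Huber function), and the concave conjugate is taken in the sign convention of Eq. (7a), $g^\wedge(x) = \inf_y (g(y) + xy)$. The arguments $x$ are restricted to $|x| < 1$, where $f^\wedge$ is finite.

```python
import numpy as np

eps = 0.5
y = np.arange(-500, 501) / 100.0                  # grid for the infimum over y
x = np.arange(-9, 10) / 10.0                      # arguments with |x| < 1

# Closed-form Moreau envelope of f(y) = |y|: the Huber function.
envelope = np.where(np.abs(y) <= eps, y**2 / (2.0 * eps), np.abs(y) - eps / 2.0)

# Concave conjugates, g^(x) = min_y [g(y) + x * y], evaluated on the grid.
conj_envelope = np.min(envelope[None, :] + x[:, None] * y[None, :], axis=1)
conj_f = np.min(np.abs(y)[None, :] + x[:, None] * y[None, :], axis=1)

# Eq. (37a): the two conjugates differ by the explicit term (eps / 2) x^2.
residual = float(np.max(np.abs(conj_envelope - (conj_f - 0.5 * eps * x**2))))
```

In this toy setting $f^\wedge$ vanishes for $|x| < 1$, so the conjugate of the envelope is the strictly concave parabola $-\epsilon x^2/2$; this mirrors how the regularized energy differs from the exact one only by an explicitly known quadratic term.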
Having introduced Moreau–Yosida regularization in the preceding section, we are ready to apply it to DFT on the Hilbert space $\mathcal{H}_\ell = L^2(B_\ell)$.

A. Moreau–Yosida regularized DFT
Applying Eqs. (29) and (37a) with $f = F_\ell$ and $f^\wedge = E_\ell$, we obtain the regularized Lieb functional ${}^{\epsilon}F_\ell : \mathcal{H}_\ell \to \mathbb{R}$ and ground-state energy ${}^{\epsilon}E_\ell : \mathcal{H}_\ell \to \mathbb{R}$,

$$ {}^{\epsilon}F_\ell(\rho) = \inf_{\rho' \in \mathcal{H}_\ell} \left( F_\ell(\rho') + \frac{1}{2\epsilon}\|\rho - \rho'\|_2^2 \right) , \tag{38a} $$
$$ {}^{\epsilon}E_\ell(v) = E_\ell(v) - \tfrac{1}{2}\epsilon\|v\|_2^2 . \tag{38b} $$

Importantly, these functions are related to each other as conjugate functions, just as we have already encountered for the $(E, F)$ and $(E_\ell, F_\ell)$ conjugate pairs. As such, the following Hohenberg–Kohn and Lieb variation principles hold on the Hilbert space $\mathcal{H}_\ell$:

$$ {}^{\epsilon}E_\ell(v) = \inf_{\rho \in \mathcal{H}_\ell} \left( {}^{\epsilon}F_\ell(\rho) + (v|\rho) \right) , \quad \forall v \in \mathcal{H}_\ell , \tag{39a} $$
$$ {}^{\epsilon}F_\ell(\rho) = \sup_{v \in \mathcal{H}_\ell} \left( {}^{\epsilon}E_\ell(v) - (v|\rho) \right) , \quad \forall \rho \in \mathcal{H}_\ell . \tag{39b} $$

However, unlike $F$ and $F_\ell$, which are finite only for $N$-representable densities, the Moreau–Yosida regularized Lieb functional ${}^{\epsilon}F_\ell$ is finite on the whole Hilbert space:

$$ \mathrm{dom}({}^{\epsilon}F_\ell) = \mathcal{H}_\ell , \tag{40} $$

since, in Eq. (38a), a finite value is always found on the right-hand side, even when $\rho \notin \mathcal{I}_N$. A curious side effect of the regularization is therefore that the minimizing density in the regularized Hohenberg–Kohn variation principle in Eq. (39a) (which exists for all $v \in \mathcal{H}_\ell$) may not be $N$-representable: it may be negative in a region of finite measure or contain an incorrect number of electrons.

To illustrate the behaviour of the regularized functional for nonphysical densities, consider ${}^{\epsilon}F_\ell(\rho + c)$ when $\rho$ is $N$-representable and $c \in \mathbb{R}$. From the definition of the Moreau envelope in Eq. (38a), we obtain straightforwardly that

$$ {}^{\epsilon}F_\ell(\rho + c) = {}^{\epsilon}F_\ell(\rho) + \frac{1}{2\epsilon}\ell^3 c^2 . \tag{41} $$

The regularized density functional thus depends on $c$ in a simple quadratic manner, with a minimum at $c = 0$. As $\epsilon$ tends to zero from above, ${}^{\epsilon}F_\ell(\rho + c)$ increases more and more rapidly with increasing $|c|$, approaching $F_\ell(\rho + c) = +\infty$ more closely.
As expected, the regularized functional is differentiable in the direction that changes the number of electrons.

On the face of it, the existence of minimizing 'pseudo-densities' in the Hohenberg–Kohn variation principle that are not $N$-representable may seem to be a serious shortcoming of the Moreau–Yosida regularization. Ideally, we would like the minimizing density to arise from some $N$-electron wave function. However, the appearance of nonphysical pseudo-densities is an inevitable consequence of the regularization: differentiability in all directions cannot be achieved without extending the effective domain of $F_\ell$ to all of $\mathcal{H}_\ell$. Alternatively, we may retain the effective domain of $N$-representable densities and instead work with restricted functional derivatives, defined only in directions that conserve some properties of the density. Such an approach is straightforward for directions that change the number of electrons in the system but much more difficult for directions that lead to negative densities or to an infinite kinetic energy.

The existence of minimizing pseudo-densities that are not $N$-representable is less important than the fact that ${}^\epsilon F_\ell$ converges pointwise to $F_\ell$ from below as $\epsilon \to 0^+$, even when $\rho \notin \mathcal{I}_N(B_\ell)$. Also, we shall see in the next subsection that every $\rho \in \mathcal{H}_\ell$ is linked to a unique physical ground-state density $\rho_\epsilon \in \mathcal{B}_\ell$. It is therefore possible to regard (and to treat) the Hohenberg–Kohn minimization over pseudo-densities in $\mathcal{H}_\ell$ as a minimization over physical densities in $\mathcal{B}_\ell$, as discussed below.

We also observe that ${}^\epsilon E_\ell$ converges pointwise to $E_\ell$ from below as $\epsilon \to 0^+$. More importantly, for any chosen $\epsilon > 0$, we may recover the exact ground-state energy $E_\ell$ from the regularized energy ${}^\epsilon E_\ell$ simply by adding the term $\frac{1}{2}\epsilon \|v\|_2^2$, which does not depend on the electronic structure of the system. Indeed, this term is no more relevant for the molecular electronic system than the neglected nuclear–nuclear repulsion term; its purpose is merely to make the ground-state energy strictly concave and supercoercive in the external potential so that the universal density functional becomes differentiable and continuous. No information regarding the electronic system is lost in the regularization beyond what is lost upon truncation of the domain from $\mathbb{R}^3$ to an arbitrarily large cubic box $B_\ell$, needed to make $\frac{1}{2}\epsilon \|v\|_2^2$ finite for all potentials.

B. The proximal density and potential
According to the general theory of Moreau–Yosida regularization, a unique minimizer exists in the regularized Lieb functional of Eq. (38a) for any $\rho \in \mathcal{H}_\ell$. We shall here call it the proximal (ground-state) density,

$$ \rho_\epsilon = \operatorname{prox}_{\epsilon F_\ell}(\rho). \tag{42} $$

The regularized Lieb functional may therefore be written as

$$ {}^\epsilon F_\ell(\rho) = F_\ell(\rho_\epsilon) + \frac{1}{2\epsilon} \|\rho - \rho_\epsilon\|_2^2. \tag{43} $$

From Eq. (35), we conclude that the standard Lieb functional is subdifferentiable at $\rho_\epsilon$ and hence that $\rho_\epsilon$ is an ensemble $v$-representable ground-state density in $\mathcal{H}_\ell$:

$$ \rho_\epsilon \in \mathcal{B}_\ell. \tag{44} $$

We also see from Eq. (35) that every $\rho \in \mathcal{H}_\ell$ and associated proximal ground-state density $\rho_\epsilon$ together satisfy the subgradient relation

$$ \epsilon^{-1}(\rho - \rho_\epsilon) \in \partial F_\ell(\rho_\epsilon), \tag{45} $$

implying that

$$ v_\epsilon = \epsilon^{-1}(\rho_\epsilon - \rho) \tag{46} $$

is an external potential with ground-state density $\rho_\epsilon \in \mathcal{B}_\ell$. In the following, we refer to $v_\epsilon$ as the proximal potential associated with $\rho$. We recall that, by the Hohenberg–Kohn theorem, the density determines the potential up to a constant. The subdifferential of $F_\ell$ at the proximal density $\rho_\epsilon$ is therefore

$$ \partial F_\ell(\rho_\epsilon) = -v_\epsilon + \mathbb{R}, \tag{47} $$

where $v_\epsilon$ is the proximal potential of Eq. (46).

Conversely, suppose that $\rho \in \mathcal{B}_\ell$. There then exists an external potential $v$ such that $-v \in \partial F_\ell(\rho)$. Expressing $v$ in the form $v = \epsilon^{-1}(\rho - \tilde{\rho})$ for some $\tilde{\rho} \in \mathcal{H}_\ell$, we obtain $\epsilon^{-1}(\tilde{\rho} - \rho) \in \partial F_\ell(\rho)$, which by Eqs. (35) and (45) implies that $\rho$ is the proximal density of $\tilde{\rho}$. Thus, every ensemble $v$-representable density $\rho \in \mathcal{B}_\ell$ is the proximal density of $\rho - \epsilon v \in \mathcal{H}_\ell$, where $v$ is such that $-v \in \partial F_\ell(\rho)$:

$$ \rho = \operatorname{prox}_{\epsilon F}(\rho - \epsilon v). \tag{48} $$

In short, we have the important fact that the set of proximal densities in $\mathcal{H}_\ell$ is precisely the set of ensemble ground-state densities $\mathcal{B}_\ell$. A density $\rho \in \mathcal{H}_\ell$ whose proximal density is $\rho_\epsilon$ is called a carrier density of $\rho_\epsilon$. By the Hohenberg–Kohn theorem, the potential $v$ in Eq. (48) is unique up to a constant $c \in \mathbb{R}$. The carrier density is therefore uniquely determined up to an additive constant.
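The relations in Eqs. (42), (45), and (46) can be checked on a one-dimensional toy problem. The sketch below is only an analogy, under the assumption that the convex function is $f(y) = |y|$, whose proximal point is given by the well-known soft-thresholding formula; the names `prox_abs`, `subdifferential_abs`, and `proximal_pair` are hypothetical helpers introduced for this illustration. It computes the proximal point $x_\epsilon$ and the associated 'potential' $v_\epsilon = \epsilon^{-1}(x_\epsilon - x)$, and verifies that $-v_\epsilon$ lies in the subdifferential at $x_\epsilon$.

```python
def prox_abs(x, eps):
    """Proximal point of f(y) = |y|: argmin_y |y| + (x - y)^2 / (2 eps)."""
    if x > eps:
        return x - eps
    if x < -eps:
        return x + eps
    return 0.0


def subdifferential_abs(p):
    """Subdifferential of |.| at p, returned as a closed interval (lo, hi)."""
    if p > 0:
        return (1.0, 1.0)
    if p < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)


def proximal_pair(x, eps):
    """Return (x_eps, v_eps) with v_eps = (x_eps - x) / eps, cf. Eq. (46)."""
    x_eps = prox_abs(x, eps)
    return x_eps, (x_eps - x) / eps


if __name__ == "__main__":
    eps = 0.5
    for x in (-2.0, 0.2, 3.0):
        x_eps, v_eps = proximal_pair(x, eps)
        lo, hi = subdifferential_abs(x_eps)
        # Subgradient relation, cf. Eq. (45): -v_eps must lie in [lo, hi].
        print(x, x_eps, v_eps, lo <= -v_eps <= hi)
```

In this toy setting, the input is recovered as $x = x_\epsilon - \epsilon v_\epsilon$, and applying `prox_abs` twice generally gives a different result from applying it once.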
The nonuniqueness of the carrier density also follows directly from Eq. (41), which shows that $\rho$ and $\rho + c$, where $\rho \in \mathcal{H}_\ell$ and $c \in \mathbb{R}$, have the same proximal ground-state density $\rho_\epsilon \in \mathcal{B}_\ell$.

To summarize, even though the densities in the regularized Hohenberg–Kohn variation principle in Eq. (39a) are pseudo-densities (not associated with any $N$-electron wave function), every such density $\rho \in \mathcal{H}_\ell$ is uniquely mapped to a ground-state density by the surjective proximal operator

$$ \operatorname{prox}_{\epsilon F} : \mathcal{H}_\ell \to \mathcal{B}_\ell. \tag{49} $$

This operator performs the decomposition

$$ \rho = \rho_\epsilon - \epsilon v_\epsilon, \tag{50} $$

where the proximal density $\rho_\epsilon \in \mathcal{B}_\ell$ may be viewed as the 'projection' of $\rho$ onto $\mathcal{B}_\ell$ with potential $v_\epsilon \in \mathcal{V}_\ell$. We note that $\rho_\epsilon \neq \rho$, even when $\rho \in \mathcal{B}_\ell$. The proximal operator is therefore not a true projector.

For any $\rho \in \mathcal{H}_\ell$, the proximal density $\rho_\epsilon$ and proximal potential $v_\epsilon$ together satisfy the usual reciprocal relations for the standard Lieb functional and ground-state energy,

$$ -v_\epsilon \in \partial F_\ell(\rho_\epsilon) \iff \rho_\epsilon \in \partial E_\ell(v_\epsilon), \tag{51} $$

see Eq. (13), and therefore satisfy the relation

$$ E_\ell(v_\epsilon) = F_\ell(\rho_\epsilon) + (v_\epsilon | \rho_\epsilon). \tag{52} $$

Thus, to every solution of the regularized Hohenberg–Kohn variation principle with $-v \in \partial\,{}^\epsilon F_\ell(\rho)$ in Eq. (39a), there corresponds a proximal solution to the standard variation principle with $-v_\epsilon \in \partial F_\ell(\rho_\epsilon)$.

C. Differentiability of ${}^\epsilon F_\ell$

Regarding the differentiability of the regularized Lieb functional, we note from Theorems 3 and 4 that ${}^\epsilon F_\ell$ is Fréchet differentiable so that

$$ {}^\epsilon F_\ell(\rho + \sigma) = {}^\epsilon F_\ell(\rho) - (v_\epsilon | \sigma) + o(\|\sigma\|_2), \tag{53} $$

with the derivative given by Eq. (46):

$$ \nabla\,{}^\epsilon F_\ell(\rho) = -v_\epsilon. \tag{54} $$

Gâteaux differentiability follows from Fréchet differentiability: the existence of $\nabla\,{}^\epsilon F_\ell(\rho)$ implies that the directional derivatives at $\rho$ exist in all directions $\sigma \in \mathcal{H}_\ell$ and are equal to

$$ \left. \frac{\mathrm{d}\,{}^\epsilon F_\ell(\rho + t\sigma)}{\mathrm{d}t} \right|_{t=0} = \left( \nabla\,{}^\epsilon F_\ell(\rho) \,|\, \sigma \right). \tag{55} $$

Hence the functional derivative of ${}^\epsilon F_\ell$ is well defined and given by

$$ \frac{\delta\,{}^\epsilon F_\ell(\rho)}{\delta \rho(\mathbf{r})} = -v_\epsilon(\mathbf{r}), \tag{56} $$

justifying the formal manipulations involving functional derivatives in DFT, recalling that ${}^\epsilon F_\ell(\rho)$ tends to $F_\ell(\rho)$ pointwise from below as $\epsilon \to 0^+$. (However, $v_\epsilon$ need not converge to anything.)

D. The optimality conditions of regularized DFT
The optimality conditions of the regularized DFT variation principles in Eqs. (39a) and (39b) are the reciprocal relations

$$ -v \in \partial\,{}^\epsilon F_\ell(\rho) \iff \rho \in \partial\,{}^\epsilon E_\ell(v), \tag{57} $$

which for the regularized Hohenberg–Kohn variation principle may now be written in the form of a stationary condition:

$$ \nabla\,{}^\epsilon F_\ell(\rho) = -v. \tag{58} $$

In combination with Eq. (56), we obtain $v_\epsilon = v$ and hence from Eq. (46) the following Hohenberg–Kohn stationary condition:

$$ \rho = \rho_\epsilon - \epsilon v, \tag{59} $$

suggestive of an iterative scheme with the repeated calculation of the proximal density until self-consistency.

By contrast, the Lieb optimality condition $\rho \in \partial\,{}^\epsilon E_\ell(v)$ in Eq. (57) cannot be written as a stationary condition since the ground-state energy ${}^\epsilon E_\ell$ (just like $E$ and $E_\ell$) is differentiable only when $v$ has a unique ground-state density. From Theorem 5, we obtain

$$ \partial\,{}^\epsilon E_\ell(v) = \partial E_\ell(v) - \epsilon v, \tag{60} $$

which shows that the degeneracy of the ground-state energy is preserved by the Moreau–Yosida regularization.

For any $\rho \in \mathcal{H}_\ell$ in Eq. (58), an explicit expression for the potential $v_\epsilon$ in terms of the proximal density is given in Eq. (46), yielding the regularized ground-state energy

$$ {}^\epsilon E_\ell(v_\epsilon) = {}^\epsilon F_\ell(\rho) + (v_\epsilon | \rho). \tag{61} $$

Hence, for every $\rho \in \mathcal{H}_\ell$, there exists a potential $v_\epsilon$ for which $\rho$ is the ground-state density. Stated differently, the set of ensemble $v$-representable pseudo-densities ${}^\epsilon \mathcal{B}_\ell$ is equal to the full Hilbert space:

$$ {}^\epsilon \mathcal{B}_\ell = \mathcal{H}_\ell. \tag{62} $$

We recall that the proximal density $\rho_\epsilon$ is the exact (standard) ground-state density of $v_\epsilon$, see Eq. (52).

VI. REGULARIZED KOHN–SHAM THEORY
In the present section, we apply Moreau–Yosida regularization to Kohn–Sham theory, beginning with a discussion of the adiabatic connection. The essential point of the regularized Kohn–Sham theory is the existence of a common ground-state pseudo-density for the interacting and noninteracting systems, thereby solving the representability problem of Kohn–Sham theory.

In the present section, we simplify notation by omitting the subscript that indicates the length of the box from all quantities, writing $\mathcal{H}$, for instance, rather than $\mathcal{H}_\ell$ everywhere.

A. Regularized adiabatic connection
The presentation of Moreau–Yosida regularized DFT given in Section V was for the fully interacting electronic system, with an interaction strength $\lambda = 1$ in the Hamiltonian of Eq. (3). However, given that nothing in the development of the theory depends on the value of $\lambda$, it may be repeated without modification for $\lambda \neq 1$. In particular, we note that the set of ground-state pseudo-densities is equal to the whole Hilbert space and hence is the same for all interaction strengths, see Eq. (62). Consequently, every $\rho \in \mathcal{H}$ is the ground-state pseudo-density of some $v^\lambda \in \mathcal{H}$, for each $\lambda$.

To set up the adiabatic connection, we select $\rho \in \mathcal{H}$. Denoting by ${}^\epsilon F^\lambda : \mathcal{H} \to \mathbb{R}$ the regularized universal density functional at interaction strength $\lambda$, we obtain from Eq. (58) the unique external potential

$$ v^\lambda_\epsilon = -\nabla\,{}^\epsilon F^\lambda(\rho), \tag{63} $$

for which the regularized ground-state energy at that interaction strength, ${}^\epsilon E^\lambda : \mathcal{H} \to \mathbb{R}$, is given by

$$ {}^\epsilon E^\lambda(v^\lambda_\epsilon) = {}^\epsilon F^\lambda(\rho) + (v^\lambda_\epsilon | \rho). \tag{64} $$

As $\lambda$ changes, the potential $v^\lambda_\epsilon$ can be adjusted to set up an adiabatic connection of systems with the same ground-state pseudo-density $\rho$ at different interaction strengths.

In the Moreau–Yosida regularized adiabatic connection, the pseudo-density $\rho$ has a proximal ground-state density that depends on $\lambda$,

$$ \rho^\lambda_\epsilon = \operatorname{prox}_{\epsilon F^\lambda}(\rho) = \rho + \epsilon v^\lambda_\epsilon, \tag{65} $$

which is the true ground-state density in the potential $v^\lambda_\epsilon$ at that interaction strength:

$$ E^\lambda(v^\lambda_\epsilon) = F^\lambda(\rho^\lambda_\epsilon) + (v^\lambda_\epsilon | \rho^\lambda_\epsilon). \tag{66} $$

In short, in the adiabatic connection, the effective potential $v^\lambda_\epsilon$ has the same ground-state pseudo-density $\rho$ but different ground-state densities $\rho^\lambda_\epsilon = \rho + \epsilon v^\lambda_\epsilon$ for different interaction strengths. In the next subsection, we shall see how this decomposition makes it possible to calculate the true ground-state energy by (regularized) Kohn–Sham theory in a rigorous manner, with no approximations except those introduced by domain truncation.

B. Regularized Kohn–Sham theory
Consider an $N$-electron system with external potential $v_\mathrm{ext} \in \mathcal{H}$. We wish to calculate the ground-state energy and to determine a ground-state density of this system,

$$ \rho \in \partial E^1(v_\mathrm{ext}), \tag{67} $$

where superscripts 1 and 0 denote the interaction strengths $\lambda = 1$ (interacting) and $\lambda = 0$ (noninteracting). This can be achieved by solving the interacting many-body Schrödinger equation in some approximate manner. In Kohn–Sham theory, we proceed differently, solving instead a noninteracting problem with the same density.

We begin by transforming Eq. (67) into a regularized many-body problem, noting that the energy and superdifferential of the exact and regularized ground-state energies are related according to Eqs. (37a) and (37b) as

$$ E^1(v_\mathrm{ext}) = {}^\epsilon E^1(v_\mathrm{ext}) + \frac{1}{2}\epsilon \|v_\mathrm{ext}\|_2^2, \tag{68} $$

$$ \partial E^1(v_\mathrm{ext}) = \partial\,{}^\epsilon E^1(v_\mathrm{ext}) + \epsilon v_\mathrm{ext}. \tag{69} $$

From these relations, it follows that the pseudo-density

$$ \rho_\mathrm{c} = \rho - \epsilon v_\mathrm{ext} \tag{70} $$

is a ground-state density of the regularized system:

$$ \rho_\mathrm{c} \in \partial\,{}^\epsilon E^1(v_\mathrm{ext}). \tag{71} $$

The subscript 'c' indicates that $\rho_\mathrm{c}$ is the carrier density of both the physical ground-state density $\rho$ of the system, according to Eq. (70), and the ground-state density $\rho_\mathrm{s}$ of the Kohn–Sham system:

$$ \rho_\mathrm{c} = \rho_\mathrm{s} - \epsilon v_\mathrm{s}. \tag{72} $$

Our task is to determine the carrier density and regularized ground-state energy by solving Eq. (71). The solution will subsequently be transformed to yield the physical ground-state density and energy.

We observe that the carrier density $\rho_\mathrm{c}$ is obtained from the physical density $\rho$ by subtracting $\epsilon v_\mathrm{ext}$ with $\epsilon > 0$, see Eq. (70). In practice, $v_\mathrm{ext} < 0$ and therefore $\rho_\mathrm{c} > 0$. For any $\rho_\mathrm{c} \in \mathcal{H}$, there exists a Kohn–Sham potential $v_\mathrm{s} \in \mathcal{H}$ such that $\rho_\mathrm{c}$ is the ground-state density of a noninteracting system in this potential:

$$ \rho_\mathrm{c} \in \partial\,{}^\epsilon E^0(v_\mathrm{s}). \tag{73} $$

To determine the regularized Kohn–Sham potential $v_\mathrm{s}$, we first note that the potentials $v_\mathrm{ext}$ and $v_\mathrm{s}$ satisfy the stationary condition in Eq. (63):

$$ v_\mathrm{ext} = -\nabla\,{}^\epsilon F^1(\rho_\mathrm{c}), \tag{74} $$

$$ v_\mathrm{s} = -\nabla\,{}^\epsilon F^0(\rho_\mathrm{c}). \tag{75} $$

To proceed, we next introduce the regularized Hartree–exchange–correlation energy and potential as

$$ {}^\epsilon E_\mathrm{Hxc}(\rho) = {}^\epsilon F^1(\rho) - {}^\epsilon F^0(\rho), \tag{76} $$

$$ {}^\epsilon v_\mathrm{Hxc}(\rho) = \nabla\,{}^\epsilon E_\mathrm{Hxc}(\rho), \tag{77} $$

yielding the following expression for the Kohn–Sham potential as a function of the density:

$$ v_\mathrm{s} = v_\mathrm{ext} + {}^\epsilon v_\mathrm{Hxc}(\rho_\mathrm{c}). \tag{78} $$

To solve the regularized Kohn–Sham problem in Eq. (73), we first note that it is related in a simple manner to the standard Kohn–Sham problem,

$$ \partial\,{}^\epsilon E^0(v_\mathrm{s}) = \partial E^0(v_\mathrm{s}) - \epsilon v_\mathrm{s}. \tag{79} $$

We then proceed in an iterative fashion. From some trial pseudo-density $\rho_0$, we iterate

$$ v_i = v_\mathrm{ext} + {}^\epsilon v_\mathrm{Hxc}(\rho_{i-1}), \tag{80a} $$

$$ \rho_i \in \partial E^0(v_i) - \epsilon v_i, \tag{80b} $$

until convergence, beginning with $i = 1$ and terminating when self-consistency has been established. We emphasize that the regularized Kohn–Sham iterations in Eqs. (80a) and (80b) are identical to the iterations in standard Kohn–Sham theory except for the use of a regularized Hartree–exchange–correlation potential in the construction of the Kohn–Sham matrix and the subtraction of $\epsilon v_i$ from the density generated by diagonalization of the resulting Kohn–Sham matrix.

Having determined the ground-state carrier density $\rho_\mathrm{c}$ and the corresponding Kohn–Sham potential $v_\mathrm{s}$ by iterating Eqs. (80a) and (80b) until self-consistency, we calculate the interacting regularized ground-state energy as

$$ {}^\epsilon E^1(v_\mathrm{ext}) = {}^\epsilon F^1(\rho_\mathrm{c}) + (v_\mathrm{ext} | \rho_\mathrm{c}) = {}^\epsilon F^0(\rho_\mathrm{c}) + {}^\epsilon E_\mathrm{Hxc}(\rho_\mathrm{c}) + (v_\mathrm{ext} | \rho_\mathrm{c}) = {}^\epsilon E^0(v_\mathrm{s}) + (v_\mathrm{ext} - v_\mathrm{s} | \rho_\mathrm{c}) + {}^\epsilon E_\mathrm{Hxc}(\rho_\mathrm{c}), \tag{81} $$

from which the physical ground-state energy $E^1(v_\mathrm{ext})$ is recovered by adding $\frac{1}{2}\epsilon \|v_\mathrm{ext}\|_2^2$ according to Eq. (68), while the ground-state density $\rho$ is recovered by adding $\epsilon v_\mathrm{ext}$ to the pseudo-density $\rho_\mathrm{c}$ according to Eq. (70). We note that the pair $(\rho_\mathrm{c}, v_\mathrm{s})$ is uniquely determined to the extent that $\rho$ in Eq. (67) is unique; for systems with degenerate ground-state densities, several equivalent pairs $(\rho_\mathrm{c}, v_\mathrm{s})$ exist.

By means of Moreau–Yosida regularization, we have thus set up Kohn–Sham theory in a rigorous manner, where the interacting and noninteracting ground-state densities are different (by an amount proportional to $\epsilon$) but related by the same carrier density $\rho_\mathrm{c}$, thereby solving the noninteracting representability problem of standard Kohn–Sham theory. Moreover, differentiability of the regularized universal density functional means that the potentials associated with this pseudo-density at different interaction strengths are well defined as the (negative) derivatives of the density functional. In the limit where $\epsilon \to 0^+$, standard Kohn–Sham theory is approached, although the limit itself is not expected to be well behaved.

VII. CONCLUSION
The possibility of setting up DFT follows from the mathematical properties of the ground-state energy $E(v)$, which is continuous and concave in the external potential $v$. By convex conjugation, it may be exactly represented by the lower semi-continuous and convex universal density functional $F(\rho)$, whose properties reflect those of the ground-state energy. Unfortunately, $F(\rho)$ depends on the density $\rho$ in a highly irregular manner, being everywhere discontinuous and nowhere differentiable. These characteristics of $F$ arise in part because $E$ is concave but not strictly concave and not supercoercive. By modifying $E$ in a way that introduces strict concavity and supercoercivity without losing information about the electronic system, we obtain an alternative DFT, where the universal density functional is much better behaved, being everywhere differentiable (and therefore also continuous). This is achieved by Moreau–Yosida regularization, where we apply convex conjugation not to $E(v)$ itself but to the strictly concave function $E(v) - \frac{1}{2}\epsilon \|v\|_2^2$, where $\epsilon > 0$. The resulting density functional ${}^\epsilon F(\rho)$ is convex and differentiable. Standard DFT is recovered as $\epsilon \to 0^+$, but this limit need not be taken for the theory to be exact: for any chosen value of $\epsilon$, we can perform DFT as usual, and the exact ground-state energy is recovered as $E(v) = {}^\epsilon E(v) + \frac{1}{2}\epsilon \|v\|_2^2$. The only restriction on the exact theory is the truncation of the domain from $\mathbb{R}^3$ to a box of finite (but arbitrarily large) volume; such a domain truncation simplifies the Moreau–Yosida formulation of DFT by introducing (reflexive) Hilbert spaces of densities and potentials.

The densities that occur naturally in regularized DFT are not physical densities since they cannot be generated from an $N$-electron wave function in the usual manner. Nevertheless, each 'pseudo-density' $\rho$ has a clear physical interpretation: it can be uniquely decomposed as $\rho = \rho_\epsilon - \epsilon v_\epsilon$, where $\rho_\epsilon$ is a physical ground-state density (the 'proximal density') and $v_\epsilon$ the associated potential.

This density decomposition justifies Kohn–Sham theory: a given pseudo-density $\rho$ is uniquely decomposed as $\rho = \rho^\lambda_\epsilon - \epsilon v^\lambda_\epsilon$ at each interaction strength $\lambda$. As $\lambda$ changes, the decomposition of $\rho$ changes accordingly. For the fully interacting system, $\rho = \rho^1 - \epsilon v_\mathrm{ext}$, where $\rho^1$ is the physical ground-state density and $v_\mathrm{ext}$ the external potential; for the noninteracting system, $\rho = \rho_\mathrm{s} - \epsilon v_\mathrm{s}$, where $\rho_\mathrm{s}$ and $v_\mathrm{s}$ are the Kohn–Sham density and potential. The decomposition thereby solves the noninteracting representability problem of Kohn–Sham theory. The working equations of regularized Kohn–Sham theory are essentially identical to those of standard Kohn–Sham theory.

Here, we have considered standard Moreau–Yosida regularization. However, we may also consider a generalized approach, in which the regularizing term $\frac{1}{2}\epsilon \|v\|_2^2$ is replaced by $\frac{1}{2}\epsilon \|Av\|_2^2$, where the operator $A$ is chosen based on some a priori knowledge of the desired solution.
Indeed, some choices of $A$ result in approaches closely related to known regularization techniques, such as the Zhao–Morrison–Parr approach to calculating the noninteracting universal density functional and the "smoothing-norm" regularization approach of Heaton-Burgess et al., used both in the context of optimized effective potentials and in Lieb optimization methods. These and related approaches will be discussed in a forthcoming paper. We expect such Moreau–Yosida techniques to be of great practical value in the implementation of procedures that attempt to determine either the ground-state energy or the universal density functional by direct optimization techniques using their derivatives, bearing in mind that both the derivatives and the objective functions are well defined in the regularized context.

ACKNOWLEDGMENTS
This work was supported by the Norwegian Research Council through the CoE Centre for Theoretical and Computational Chemistry (CTCC) Grant No. 179568/V30 and the Grant No. 171185/V30 and through the European Research Council under the European Union Seventh Framework Program through the Advanced Grant ABACUS, ERC Grant Agreement No. 267683. A. M. T. is also grateful for support from the Royal Society University Research Fellowship scheme.
Appendix A: Mathematical Supplement
In this section, we review some important concepts of convex analysis and the calculus of variations. Suggested reading for convex analysis is van Tiel's book and the classic text by Ekeland and Témam. The present article relies on additional information gathered in the book by Bauschke and Combettes, which focuses on the Hilbert-space formulation of convex analysis. For functional analysis, the monograph by Kreyszig is recommended.
1. Convex functions
We are here concerned with extended real-valued functions $f : X \to \mathbb{R} \cup \{\pm\infty\}$ over a Banach or Hilbert space $(X, \|\cdot\|_X)$. Note that we define $x \pm \infty = \pm\infty$ for any $x \in \mathbb{R}$, and $x \cdot (\pm\infty) = \pm\infty$ for positive real numbers $x$, but that $+\infty - \infty$ is not defined.

We recall that $X^*$, the topological dual of $X$, is the set of continuous linear functionals over $X$: if $\varphi \in X^*$, then $\varphi$ is a real-valued map, continuous and linear in $x \in X$. We denote by $\langle \varphi, x \rangle$ the value of $\varphi$ at $x$, except in the DFT setting, where the notation $(\cdot|\cdot)$ is used. For simplicity, we assume in this section that $X$ is reflexive so that $X^{**} = X$. Ultimately, we shall work with Hilbert spaces, which are reflexive Banach spaces so that $X^* = X$ by the Riesz representation theorem of functional analysis.

Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be an extended-valued function. The (effective) domain $\operatorname{dom} f$ is the subset of $X$ where $f$ is not $+\infty$. The function $f$ is said to be proper if $\operatorname{dom} f \neq \emptyset$. The function $f$ is convex if, for all $x$ and $y$ in $X$, and for all $\lambda \in (0, 1)$,

$$ f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y). \tag{A1} $$

Note that this formula also makes sense if, say, $f(x) = +\infty$. The interpretation of convexity is that a linear interpolation between two points always lies on or above the graph of $f$. We say that $f$ is strictly convex if strict inequality holds for $x \neq y$ in Eq. (A1). Moreover, $f$ is said to be concave if the inequality is reversed in Eq. (A1), and strict concavity is defined similarly.

Perhaps the most important property of a convex $f$ is that any local minimum is also a global minimum. Moreover, if $f$ is strictly convex, the global minimizer, if it exists, is unique. Convex optimization problems are in this sense well behaved.
2. Proper lower semi-continuous convex functions
The minimal useful regularity of convex functions is not continuity but lower semi-continuity. In a metric space $X$, a function $f$ is said to be lower semi-continuous if, for every sequence $\{x_n\} \subset X$ converging to some $x \in X$, we have

$$ f(x) \leq \liminf_n f(x_n). \tag{A2} $$

The importance of lower semi-continuity is that it guarantees the existence of a global minimum if $A = \operatorname{dom} f$ is compact: $\inf_{x \in A} f(x) = f(x_\mathrm{min})$ for some $x_\mathrm{min} \in A$. For concave functions, upper semi-continuity is the corresponding useful notion; $f$ is upper semi-continuous if $-f$ is lower semi-continuous, by definition.

We are particularly interested in lower semi-continuous proper convex functions. The set $\Gamma(X)$ is defined as consisting of all functions that can be written in the form

$$ f(x) = \sup_{\alpha \in I} \{ \langle \varphi_\alpha, x \rangle - g_\alpha \} \tag{A3} $$

for some family $\{\varphi_\alpha\}_{\alpha \in I} \subset X^*$ of dual functions and some $\{g_\alpha\}_{\alpha \in I} \subset \mathbb{R}$. The set $\Gamma(X)$ contains precisely all lower semi-continuous proper convex functions on $X$ and the functions identically equal to $\pm\infty$. In other words, $f$ is lower semi-continuous proper convex or identically equal to $\pm\infty$ if and only if it is the pointwise supremum of a set of continuous affine ("straight-line") functions over $X$. We denote by $\Gamma_0(X)$ all proper lower semi-continuous convex functions on $X$: $\Gamma_0(X) = \Gamma(X) \setminus \{x \mapsto -\infty, x \mapsto +\infty\}$. It is a fact that any $f \in \Gamma(X)$ is also weakly lower semi-continuous.

On the dual space $X^*$, we denote by $\Gamma^*(X^*)$ the set of all functions that can be written in the form

$$ g(\varphi) = \sup_{\alpha \in I} \{ \langle \varphi, x_\alpha \rangle - g_\alpha \}. \tag{A4} $$

These functions are precisely the weak-$*$ lower semi-continuous proper convex functions on $X^*$ and the improper functions $\pm\infty$. The proper functions are $\Gamma^*_0(X^*) = \Gamma^*(X^*) \setminus \{\varphi \mapsto -\infty, \varphi \mapsto +\infty\}$.

Theorem 6 (Convex conjugates). There is a one-to-one correspondence between the functions $f \in \Gamma_0(X)$ and the functions $g \in \Gamma^*_0(X^*)$ given by

$$ f(x) = \sup_{\varphi \in X^*} \left( \langle \varphi, x \rangle - g(\varphi) \right), \tag{A5a} $$

$$ g(\varphi) = \sup_{x \in X} \left( \langle \varphi, x \rangle - f(x) \right). \tag{A5b} $$

The unique function $g$ is said to be the convex conjugate of $f$ and is denoted by $g = f^*$; likewise, $f = g^*$ is the convex conjugate of $g$. A pair of functions $f \in \Gamma(X)$ and $g \in \Gamma^*(X^*)$ that are each other's convex conjugates are said to be dual functions. The dual functions contain the same information, only coded differently: each property of $f$ is reflected, in some manner, in the properties of $f^*$ and vice versa. We note the relations

$$ f = (f^*)^* = f^{**}, \quad g = (g^*)^* = g^{**} \tag{A6} $$

for functions $f \in \Gamma(X)$ and $g \in \Gamma^*(X^*)$. In fact, the conjugation operation is a bijective map between $\Gamma(X)$ and $\Gamma^*(X^*)$; these sets contain precisely those functions that satisfy the biconjugation relations in Eq. (A6).

Because of sign conventions, we work with functions $f \in \Gamma_0(X)$ and $g \in -\Gamma^*_0(X^*)$. It is then convenient to adopt the notation

$$ f^\wedge(\varphi) = \inf_{x \in X} \left( f(x) + \langle \varphi, x \rangle \right), \tag{A7a} $$

$$ g^\vee(x) = \sup_{\varphi \in X^*} \left( g(\varphi) - \langle \varphi, x \rangle \right), \tag{A7b} $$

for which $f = (f^\wedge)^\vee$ and $g = (g^\vee)^\wedge$ hold. In particular, in DFT as developed by Lieb, the density functional and ground-state energy

$$ F \in \Gamma_0(X), \quad X = L^1 \cap L^3, \tag{A8a} $$

$$ E \in -\Gamma^*_0(X^*), \quad X^* = L^\infty + L^{3/2}, \tag{A8b} $$

are related as $E = F^\wedge$ and $F = E^\vee$.
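The skew conjugation pair of Eqs. (A7a) and (A7b) can be tried out numerically in one dimension. The following Python sketch is only an illustration under stated assumptions: the test function $f(x) = x^2/2$, the finite grids `XS` and `PHIS`, and the helper names `concave_conjugate` and `convex_conjugate_skew` are all hypothetical choices, and the infimum and supremum are replaced by a minimum and maximum over grid points. For this choice, $f^\wedge(\varphi) = -\varphi^2/2$ is concave, and $(f^\wedge)^\vee$ reproduces $f$ on the grid.

```python
def concave_conjugate(func, xs, phis):
    """Tabulate f^(phi) = inf_x (f(x) + phi * x) on the grid phis, cf. Eq. (A7a)."""
    return [min(func(x) + phi * x for x in xs) for phi in phis]


def convex_conjugate_skew(g_vals, phis, xs):
    """Tabulate g_v(x) = sup_phi (g(phi) - phi * x) on the grid xs, cf. Eq. (A7b)."""
    return [max(g - phi * x for g, phi in zip(g_vals, phis)) for x in xs]


def f(x):
    """Toy convex function used for the check (an assumption of this sketch)."""
    return 0.5 * x * x


# Symmetric grids on [-2, 2]; the range and spacing are arbitrary choices.
XS = [-2.0 + k / 100 for k in range(401)]
PHIS = list(XS)

F_HAT = concave_conjugate(f, XS, PHIS)            # approximates -phi^2 / 2
F_BACK = convex_conjugate_skew(F_HAT, PHIS, XS)   # approximates f again

if __name__ == "__main__":
    worst = max(abs(fb - f(x)) for fb, x in zip(F_BACK, XS))
    print("largest deviation of (f^)v from f on the grid:", worst)
```

The round trip works here because the minimizer $x = -\varphi$ and the maximizer $\varphi = -x$ both lie inside the chosen grids; for functions whose conjugate needs slopes outside the grid range, a finite grid would only recover $f$ approximately.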
3. Subdifferentiation
A dual function $\varphi \in X^*$ is said to be a subgradient to $f$ at a point $x$ where $f(x)$ is finite if

$$ f(y) \geq f(x) + \langle \varphi, y - x \rangle, \quad \forall y \in X, \tag{A9} $$

meaning that the affine function $y \mapsto f(x) + \langle \varphi, y - x \rangle$ is nowhere above the graph of $f$. The subdifferential $\partial f(x)$ is the set of all subgradients to $f$ at $x$, see Figure 2. Note that $\partial f(x)$ may be empty. The function $f$ is said to be subdifferentiable at $x \in X$ if $\partial f(x) \neq \emptyset$. A function $f \in \Gamma_0(X)$ has a global minimum at $x \in X$ if and only if $0 \in \partial f(x)$. Similarly, $x \mapsto f(x) + \langle \varphi, x \rangle$ has a minimum if and only if $-\varphi \in \partial f(x)$. A function $f \in \Gamma(X)$ is subdifferentiable on a dense subset of its domain $\operatorname{dom}(f)$.

FIG. 2. Illustration of the subdifferential for an $f \in \Gamma_0(\mathbb{R})$. For an $x \in \mathbb{R}$, $\partial f(x)$ is a collection of slopes of tangent functionals. One such slope $\varphi$ and its affine mapping are shown explicitly; the rest are indicated with stippled lines. $\varphi$ is not unique since the graph of $f$ has a "kink" at $x$.

In the context of DFT, $F$ is subdifferentiable at $\rho \in X$ if and only if $\rho$ is the ground-state density of a potential $v \in X^*$,

$$ E(v) = F(\rho) + (v|\rho) = \inf_{\rho'} \left( F(\rho') + (v|\rho') \right), \tag{A10} $$

so that

$$ F(\rho') \geq F(\rho) - (v | \rho' - \rho), \quad \forall \rho' \in X. \tag{A11} $$

By the Hohenberg–Kohn theorem, we know that

$$ \partial F(\rho) = \{ -v + \mu : \mu \in \mathbb{R} \} \tag{A12} $$

if $\rho$ is $v$-representable and that $\partial F(\rho) = \emptyset$ otherwise. Thus, from the point of view of convex analysis, the notion of $v$-representability of $\rho$ is equivalent to subdifferentiability of $F$ at $\rho$. It follows that the $v$-representable densities are dense in the set of $N$-representable densities, the effective domain of $F$.
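The subgradient inequality in Eq. (A9) is easy to probe numerically for a function with a kink. The Python sketch below uses $f(x) = |x|$ as an assumed example; the helper names `is_subgradient` and `subgradients_on_grid` and the sample grids `YS` and `SLOPES` are hypothetical. Because the inequality is tested only at finitely many points $y$, the check is a necessary condition rather than a proof, although for this simple function it identifies the subdifferential on the slope grid correctly.

```python
def is_subgradient(func, x, phi, test_points, tol=1e-12):
    """Check f(y) >= f(x) + phi * (y - x) at every sampled y, cf. Eq. (A9)."""
    return all(func(y) >= func(x) + phi * (y - x) - tol for y in test_points)


def subgradients_on_grid(func, x, slopes, test_points):
    """Return the candidate slopes that pass the sampled subgradient test at x."""
    return [phi for phi in slopes if is_subgradient(func, x, phi, test_points)]


# Sample points y in [-3, 3] and candidate slopes in [-2, 2] (arbitrary choices).
YS = [-3.0 + k / 50 for k in range(301)]
SLOPES = [-2.0 + 0.25 * k for k in range(17)]

if __name__ == "__main__":
    # At the kink x = 0, every slope in [-1, 1] is a subgradient of |x|.
    print(subgradients_on_grid(abs, 0.0, SLOPES, YS))
    # Away from the kink, the subgradient is unique.
    print(subgradients_on_grid(abs, 1.0, SLOPES, YS))
```

Since the slope 0 passes the test at $x = 0$, the example also illustrates the criterion that $f$ has a global minimum at $x$ if and only if $0 \in \partial f(x)$.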
4. Gâteaux differentiability
Let $x, y \in X$. The directional derivative of $f$ at $x$ in the direction of $y$ is defined by

$$ f'(x; y) := \lim_{\epsilon \to 0^+} \epsilon^{-1} \left[ f(x + \epsilon y) - f(x) \right] \tag{A13} $$

if this limit exists ($+\infty$ is accepted as a limit). For $f \in \Gamma_0(X)$, the directional derivative $f'(x; y)$ always exists. Let $x \in X$ be given. If there is a $\varphi \in X^*$ such that

$$ f'(x; y) = \langle \varphi, y \rangle, \quad \forall y \in X, \tag{A14} $$

then $f$ is said to be Gâteaux differentiable at $x$. In other words, a function is Gâteaux differentiable if its various directional derivatives may be assembled into a linear functional at $x$. The Gâteaux derivative is the usual notion of functional derivative encountered in the calculus of variations, for which we write $\varphi = \delta f(x) / \delta x$.

If $f$ is continuous and has a unique subgradient at $x$, then it is also Gâteaux differentiable at $x$; the converse statement is also true. Note, however, that a unique subgradient alone is not enough to ensure Gâteaux differentiability: continuity is not implied by a unique subgradient.
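The distinction between the existence of directional derivatives and Gâteaux differentiability can be seen in a small numerical experiment. The Python sketch below is a hedged illustration: the function $f(x_1, x_2) = |x_1| + x_2^2$, the step size `t`, and the helper names `directional_derivative` and `looks_gateaux_differentiable` are assumptions of this example. It approximates Eq. (A13) by a one-sided difference quotient and tests a necessary consequence of Eq. (A14), namely that $f'(x; -y) = -f'(x; y)$.

```python
def directional_derivative(func, x, y, t=1e-6):
    """One-sided difference quotient approximating f'(x; y), cf. Eq. (A13)."""
    shifted = [xi + t * yi for xi, yi in zip(x, y)]
    return (func(shifted) - func(x)) / t


def f(x):
    """Toy convex function on R^2 with a kink along x[0] = 0."""
    return abs(x[0]) + x[1] ** 2


def looks_gateaux_differentiable(func, x, directions, tol=1e-4):
    """Necessary check only: a linear f'(x; .) must satisfy f'(x; -y) = -f'(x; y)."""
    for y in directions:
        forward = directional_derivative(func, x, y)
        backward = directional_derivative(func, x, [-yi for yi in y])
        if abs(forward + backward) > tol:
            return False
    return True


DIRECTIONS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

if __name__ == "__main__":
    print(looks_gateaux_differentiable(f, [1.0, 1.0], DIRECTIONS))  # smooth point
    print(looks_gateaux_differentiable(f, [0.0, 1.0], DIRECTIONS))  # on the kink
```

On the kink, the directional derivative in the first coordinate behaves like $|y_1|$, which is not linear in the direction, so no single $\varphi$ can satisfy Eq. (A14) there.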
5. Fréchet differentiability
A stronger notion of differentiability is given by the Fréchet derivative. Let $x \in X$. If there exists $\varphi \in X^*$ such that, for all sequences $h_n \to 0$ in $X$ as $n \to \infty$,

$$ \lim_{n \to \infty} \frac{| f(x + h_n) - f(x) - \langle \varphi, h_n \rangle |}{\|h_n\|} = 0, \tag{A15} $$

then $f$ is Fréchet differentiable at $x$, and $\nabla f(x) = \varphi$ is the Fréchet derivative.

Clearly, Fréchet differentiability implies Gâteaux differentiability, but not the other way around. In fact, if $\nabla f(x)$ exists at $x$, then

$$ f(x + h) = f(x) + \langle \nabla f(x), h \rangle + o(\|h\|), \tag{A16} $$

so that $f$ is approximated by its linearization around $x$. This is not true if $f$ is merely Gâteaux differentiable.

P. Hohenberg and W. Kohn, Phys. Rev., B864 (1964).
E. H. Lieb, Int. J. Quant. Chem., 243 (1983).
N. Schuch and F. Verstraete, Nature Physics, 732 (2009).
M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (W.H. Freeman and Company, 1979).
P. Lammert, Int. J. Quant. Chem., 1944 (2005).
W. Kohn and L. J. Sham, Phys. Rev., A1133 (1965).
H. Bauschke and P. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces (Springer, New York, Dordrecht, Heidelberg, London, 2011).
M. Levy, Proc. Natl. Acad. Sci., 6062 (1979).
L. Evans, Partial Differential Equations (American Mathematical Society, Providence, R.I., 1998).
I. Babuska and J. Osborn, Math. Comp., 275 (1989).
Q. Zhao, R. Morrison, and R. Parr, Phys. Rev. A, 2138 (1994).
W. Yang and Q. Wu, Phys. Rev. Lett., 143002 (2002).
T. Heaton-Burgess, F. A. Bulat, and W. Yang, Phys. Rev. Lett., 256401 (2007).
Q. Wu and W. Yang, J. Chem. Phys., 2498 (2003).
F. A. Bulat, T. Heaton-Burgess, A. J. Cohen, and W. Yang, J. Chem. Phys., 174101 (2007).
A. M. Teale, S. Coriani, and T. Helgaker, J. Chem. Phys., 104111 (2009).
J. van Tiel, Convex Analysis, an Introductory Text (Wiley, Chichester, 1984).
I. Ekeland and R. Témam, Convex Analysis and Variational Problems (SIAM, Philadelphia, 1999).
E. Kreyszig,