HHartree–Fock &Density Functional Theory
Klaas Giesbertzversion 2020 a r X i v : . [ phy s i c s . e d - ph ] O c t ontents N -particle spaces . . . . . . . . . . . . . . . . . . . . . 71.2 Full configuration(s) interaction (CI) . . . . . . . . . . . . . . 91.3 Slater–Condon rules . . . . . . . . . . . . . . . . . . . . . . . 101.3.1 0-body operators . . . . . . . . . . . . . . . . . . . . . 131.3.2 1-body operators . . . . . . . . . . . . . . . . . . . . . 141.3.3 2-body operators . . . . . . . . . . . . . . . . . . . . . 161.3.4 More integral notation . . . . . . . . . . . . . . . . . . 201.3.5 Extra information on exercise 1.10 . . . . . . . . . . . 231.4 The HF self-consistent field (SCF) equations . . . . . . . . . . 241.5 Roothaan–Hall equations . . . . . . . . . . . . . . . . . . . . 321.6 Properties and specialities of the HF system . . . . . . . . . . 351.6.1 Brillouin’s theorem (1934) . . . . . . . . . . . . . . . . 351.6.2 The HF Hamiltonian . . . . . . . . . . . . . . . . . . . 351.6.3 Koopmans’ theorem (1934) . . . . . . . . . . . . . . . 361.6.4 Restricted HF . . . . . . . . . . . . . . . . . . . . . . . 361.6.5 Finite gap & aufbau in completely unrestricted HF . . 401.7 Basis sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411.7.1 Atomic basis sets: Slater type orbitals (STOs) . . . . . 451.7.2 Atomic basis sets: Gaussian type orbitals (GTOs) . . 461.7.3 Read Atkins section 9.4 (Atkins and Friedman 2010) . 47 molecule . . . . . . . . . . . . . . . . . 712.5 Approximations to the xc energy . . . . . . . . . . . . . . . . 722.5.1 The local density approximation (LDA) . . . . . . . . 732.5.2 The generalized gradient approximations (GGAs) . . . 772.5.3 The meta-GGAs . . . . . . . . . . . . . . . . . . . . . 792.5.4 The hybrid functionals . . . . . . . . . . . . . . . . . . 802.5.5 The B3LYP functional . . . . . . . . . . . . . . . . . . 812.6 Is there any meaning to the Kohn–Sham (KS) system? . . . . 822.6.1 The occupied KS orbital energies . . . . . . . . . . . . 822.6.2 The unoccupied KS orbital energies . . . . . . . . . . 832.7 Epilogue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 A Prolate spheroidal coordinate system 86B Lagrange multipliers 89 hapter 1 Hartree–Fock
In quantum chemistry the quantum mechanical properties of molecules arestudied. The full non-relativistic molecular Hamiltonian treats both theelectrons and nuclei quantum mechanically, so reads ˆ H full = N nuc (cid:88) µ =1 − (cid:126) M µ ∇ R µ + e π(cid:15) N nuc (cid:88) µ =1 N nuc (cid:88) ν = µ +1 Z µ Z ν | R µ − R ν | − (1.1) e π(cid:15) N nuc (cid:88) µ =1 N (cid:88) i =1 Z µ | R µ − r i | + − (cid:126) m e N (cid:88) i =1 ∇ r i + e π(cid:15) N (cid:88) i =1 i − (cid:88) j =1 | r i − r j | . As a convention, we will use latin letters to refer to electrons and greek lettersto indicate nuclei. Further, we used the following quantities• N nuc total number of nuclei,• N total number of electrons,• (cid:126) Planck’s constant, h , divided by π ,• e elementary charge,• (cid:15) vacuum permitivitty,• M µ mass of nucleus µ ,• Z µ atom number of atom µ , i.e. the number of protons,• R µ position of nucleus µ ,• m e mass of an electron,• r i position of electron i . 4he solution to the Schr¨odinger equation of the full molecular Hamiltonian isthe full molecular wavefunction, depending on all positions and spin variablesof all nuclei and electrons Ψ full (cid:0) X , . . . , X N nuc , x , . . . , x N (cid:1) , (1.2)where we used combined space-spin coordinates X µ := R µ Σ µ and x i := r i σ i . (1.3)In these combined space-spin coordinates, Σ µ and σ i denote the spin co-ordinates of nucleus µ and electron i respectively.The nuclei are much heavier than the electrons, so we expect the nuc-lear quantum effects to be small compared to the electronic ones. We willtherefore focus on the electronic part of the Hamiltonian by taking thelimit M µ → ∞ , which turns the nuclei into classical point charges. Thisis called the Born–Oppenheimer approximation . In fact, Born and Oppen-heimer showed more precisely under which assumptions this approximationis valid and how to include the nuclear effects in a second step (Born andOppenheimer 1927). It has become clear that the nuclear part plays a crucialrole in chemical reactions and alternative approaches have been put forward(Abedi, Maitra and Gross 2010; Kyl¨anp¨a¨a and Rantala 2011). Nevertheless,we will focus on the electronic part in this course, as it already provides asignificant challenge.The (electronic) Hamiltonian central in this course will be ˆ H = − N (cid:88) i =1 ∇ r i + N (cid:88) i =1 N nuc (cid:88) µ =1 − Z µ | R µ − r i | (cid:124) (cid:123)(cid:122) (cid:125) = v ( r i ) + (cid:88) ≤ i Show that ψ i = (cid:104) φ i | ψ (cid:105) if we assume that { φ i } forms anorthonormal basis, i.e. (cid:104) φ i | φ j (cid:105) = δ ij . N -particle spaces To construct an N -particle Hilbert space, we simply glue N H N := N (cid:79) i =1 H = H ⊗ · · · ⊗ H (cid:124) (cid:123)(cid:122) (cid:125) N times , (1.12)which is similar to the construction of R = R × R × R out R . We can readilyconstruct a basis for H N by considering all possible products of 1-body basisfunctions, so a general function f ∈ H N can be expanded as f ( x , . . . , x N ) = (cid:88) i ,...,i N f i ...i N φ i ( x ) φ i ( x ) · · · φ i N ( x N ) (cid:124) (cid:123)(cid:122) (cid:125) =:Φ i i ...iN ( x , x ,..., x N ) . (1.13)7hat is all we need for non-identical particles. For identical quantum particles,however, we need to do some more work, since the expectation value of anyoperator is not allowed to change upon permutation. The spin-statistics the-orem (Fierz 1939; Pauli 1940) states that in 3D there are only two options.The wavefunction is either symmetric which corresponds to integer spin particles, i.e. bosons, or anti-symmetric which are half-integer particles, i.e. fermions.As we should either have a fully symmetric or anti-symmetric wavefunctionto describe bosons or fermion respectively, the only thing we need to do isto adapt the product basis to the required symmetry Φ ± i i ...i N ( x , . . . , x N ) := (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) φ i ( x ) φ i ( x ) . . . φ i N ( x ) φ i ( x ) φ i ( x ) . . . φ i N ( x ) ... ... . . . ... φ i ( x N ) φ i ( x N ) . . . φ i N ( x N ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ± . (1.14)The minus sign means that we use the determinant (fermions) and the plussign means that we use the permanent (bosons). The permanent is simply adeterminant without the alternating signs. The basis states for the fermionsare often referred to as Slater determinants (Slater 1929), though they wereused earlier by Heisenberg (Heisenberg 1926) and Dirac (Dirac 1926).As we now are working with (anti-)symmetric functions, we need to re-define the inner product as Ψ( x , x ) refers to the same points as Ψ( x , x ) .As we should only take the unique part into account, the inner product nowbecomes (cid:104) f | g (cid:105) := (cid:90) d x (cid:90) x > x d x · · · (cid:90) x N > x N − d x N f ∗ ( x , x , . . . , x N ) g ( x , x , . . . , x N ) (1.15) = 1 N ! (cid:90) d x (cid:90) d x · · · (cid:90) d x N f ∗ ( x , x , . . . , x N ) g ( x , x , . . . , x N ) . In virtually all textbooks and papers, the inner product is not taken over theunique sector, so does not have the ( N !) − factor. To compensate for thiserror, the Slater determinants gets a / √ N ! normalisation factor. A moreextensive explanation why the inner product defined in (1.15) is the correctone for the probabilistic interpretation of the square of the wavefunction canbe found in (Stefanucci and Leeuwen 2013). Exercise 1.2. Show that the symmetrised basis functions in (1.14) are or-thonormal, if the one-particle functions φ i ( x ) are also orthonormal.8 .2 Full configuration(s) interaction (CI) Combined with the variational principle, the basis expansion quickly leadsto the following idea to build approximate wavefunctions, which is known inthe quantum chemistry community as full configuration(s) interaction (CI). Step 1 ) Select a finite number of one-electron basis functions which youdeem important. Step 2 ) Construct the corresponding anti-symmetric N -electron basis (Slaterdeterminants), Φ I . Step 3 ) Use the variational principle to optimise the expansion parameters c I in Ψ( x , . . . , x N ) = (cid:88) I c I Φ I ( x , . . . , x N ) , (1.16)where I := i , i , . . . , i N with i < i < · · · < i N . That is, we de-mand that the gradient of the energy with respect to the expansioncoefficients, c I , vanishes which leads to the secular equations (cid:88) J (cid:0) H IJ − E S IJ (cid:1) c J = 0 . (1.17) Exercise 1.3. Derive the secular full CI equations. Exercise 1.4. How many N -body determinants can be constructed from M one-electron basis functions? How many permanents can be constructed?The number of determinants grows very quickly in full CI with the numberof one-electron basis functions and number of electrons. There are manyselection schemes to use only a subset of the determinants, which lead to alarge variety of CI methods. Now, let us to try to get away with the a singledeterminant: the ‘best’ one. With the variational principle at our disposal,we will define the ‘best’ Slater determinant as the one that yields the lowestenergy. Only the orbitals can be varied in the Slater determinant, so we willneed to optimise the energy with respect to the orbitals. So we would like tohave an explicit expression of the energy in terms of the orbitals, E HF [ { φ i } ] .As E HF is now a function of functions, we call it a functional.This approximation to retain only one Slater determinant is known asHartree–Fock (HF). It was originally proposed by Hartree (Hartree 1928), The expansion of a solution in a finite number of basis functions, including the conver-gence considerations of the expansion are known as the Galerkin approximation (Galerkin1915) and was introduced by Walther Ritz in 1908 to whom Galerkin refers. We could equally well have taken i > i > · · · > i N , or a more complicated scheme,as long as the set { Φ I } is linearly independent. The Slater–Condon rules (Slater 1929; Condon 1930) which deal with theexpectation values of (possibly different) Slater determinants (both N -body) (cid:104) Φ I | ˆ O | Φ J (cid:105) . (1.18)The Slater–Condon rules only deal with the case where both determinants areconstructed from the same orthonormal basis. They can readily be extendedto a non-orthonormal bases (L¨owdin 1955), but we will not need them andare therefore out of the scope of this course.First we need to introduce the concept of an n -body operator. An n -body operator is an operator which acts only on n different particles simul-taneously ˆ O = o = constant , (1.19a) ˆ O = (cid:88) i ˆ o i = (cid:88) i ˆ o ( x i ) , (1.19b) ˆ O = (cid:88) i Why are the elements which have some of the indices equalexcluded from a general n -body operator. More concrete, why is the term i = j not included in a two-body operator? Exercise 1.6. Derive the relations in (1.21). Start with the 0-body operator,then consider the 1-body operator and the 2-body operator. Finally arguethat the formula for a general n -body operator is correct.The Slater–Condon rules can be derived by simply working out the expect-ation values. For 0-body (constant) and 1-body operators this is relativelystraightforward, but more more-body operators this becomes quite a dirtybusiness, in particular to keep track of all the phase factors when dealing withfermions, i.e. determinants. Therefore, we introduce a graphical representa-tion of a determinant (which also works for permanents of course) (Stefanucciand Leeuwen 2013). A permanent/determinant of an n × n matrix, A , isdefined as | A | ± := (cid:88) ℘ ( ± ) ℘ n (cid:89) i =1 A i ℘ ( i ) , (1.22)where the “ + ” sign refers to the permanent and the “ − ” sign to the determin-ant. The symbol ℘ denotes a permutation of the indices, so ℘ (cid:0) , , . . . , n (cid:1) = (cid:0) ℘ (1) , ℘ (2) , . . . , ℘ ( n ) = (cid:0) (cid:48) , (cid:48) , . . . , n (cid:48) (cid:1) . For the permanent, we have (+) ℘ = 1 for any permutation. The notation ( − ) ℘ denotes the sign of the permutation,so ( − ) ℘ = 1 if an even number of pairs needed to interchanged to achievethe permutation and ( − ) ℘ = − if an odd number of swaps were needed forthe permutation. For example, consider the permutation ℘ (1 , , , , 5) =(2 , , , , . This permutation can be built up by swapping first positions4 and 5, which we will denote as (4 , . Next we swap positions 3 and 4,so (3 , and finally we perform (1 , . The complete permutation operation11an therefore be expressed as ℘ = (1 , , , . So 3 swaps are neededwhich tells us that the sign of the permutation is − . It is helpful to make agraphical representation of the permutation. The right column denotes theindices i and the left column their positions. (1 , , , 5) 12345 12345 = (1 , , 4) 12345 12345= (1 , 2) 12345 12345 = 12345 12345 . (1.23)You might notice that the number of crossings of the lines, n c , is exactlyequal to the number of permutations we made. This is not always the case,but it is easy to convince oneself that the parity is always the same. Considerthe right-hand side of a permutation graph and interchange two vertices i and j ij ( ij ) −−→ ji . (1.24)You see that the interchanges of nodes i and j leads to one additional crossingplus an even number of crossings, since both lines attached to i and j needto cross any other line. Therefore, we find the important rule ( ± ) n c = ( ± ) ℘ . (1.25)As an example, the permanent/determinant of a × matrix can be workedout with the help of these graphs as | A | ± = 12 12 A A + 12 12 A A A A ± A A . (1.26) Exercise 1.7. Use the graphical representation to work out the permanent/determinant for a general × matrix.Now we have the necessary ingredients to work out the Slater–Condon rulesfor determinants . For permanents the rules are more complicated, since thereis no anti-symmetry which forbids orbitals to be occupied multiple times. Wewill therefore skip the derivation of the bosonic Slater–Condon rules, as weaim to deal with electrons (fermions) in this course. The 0-body operators simply reduce to the calculation of the overlap (cid:104) Φ I | Φ J (cid:105) ,as the constant can be pulled out of the integration. Let us first consider theterm of the determinant without any permutations (cid:90) d x (cid:90) d x · · · (cid:90) d x N φ ∗ i ( x ) φ ∗ i ( x ) · · · φ ∗ i N ( x N ) φ j ( x ) φ j ( x ) · · · φ j N ( x N )= i i i i x x x x j j j j i N x N j N ... ... (cid:90) d x φ ∗ i k ( x k ) φ j k ( x k ) = i i i i (cid:104) φ i | φ j (cid:105)(cid:104) φ i | φ j (cid:105)(cid:104) φ i | φ j (cid:105)(cid:104) φ i | φ j (cid:105) j j j j i N (cid:104) φ i N | φ j N (cid:105) j N ... ... = i i i i δ i j δ i j δ i j δ i j j j j j i N δ i N j N j N ... ... = δ i j δ i j · · · δ i N j N =: δ IJ , (1.27)where we used the orthonormality of the orbitals in the last step, (cid:104) φ i | φ j (cid:105) = δ ij . Now consider two different permutations of the indices in both determ-13nants i i i i δ i j δ i j δ i j δ i j j j j j i N δ i N j N j N ... ... = 0 , (1.28)as we work with indices in increasing order. For example, i can never beequal to j , since combined with ordering this implies that i > i = j > j .We also need i = j for the integral to be non-zero, which cannot be.As the indices are ordered, we need the I = J and only when we have thesame permutation in both determinants, Φ I and Φ J , we get a contributionto the integral. For example i i i i i N δ i j δ i j δ i j δ i j δ i N j N j j j j j N ... ... = i i i i i N δ i j δ i j δ i j δ i j δ i N j N j j j j j N ... ... = δ IJ . (1.29)The only remaining question is how many of these terms we have. This issimply the number of terms generated by a single Slater determinant, N ! , asthe term in the other determinant needs to correspond to exactly the samepermutation. Hence, we find that the Slater determinants are orthonormal,if we use an orthonormal orbital basis (cid:104) Φ I | Φ J (cid:105) = δ IJ . (1.30) Now we proceed with the one-body operators. The difference compared tothe 0-body operators is that the integration over the first coordinate, x ,now contains an operator, (cid:104) φ i k | ˆ o | φ i l (cid:105) , so the orbitals it connect do not needto be equal anymore for a finite expectation value. All the other orbitalsstill need to be equal, so there are two options to consider: 1) one orbital isdifferent in the determinants, or 2) all orbitals are the same.14et us first consider the case that the orbitals in both determinants arethe same, i.e. I = J . If we only consider the terms where the operator workson i , we find i i i i i N (cid:104) φ i | ˆ o | φ i (cid:105) δ i i δ i i δ i i δ i N i N i i i i i N ... ... = i i i i i N (cid:104) φ i | ˆ o | φ i (cid:105) δ i i δ i i δ i i δ i N i N i i i i i N ... ... = (cid:104) φ i | ˆ o | φ i (cid:105) . (1.31)As we keep the orbital φ i fixed, we can only make permutations of theremaining i , . . . , i N elements, so we have ( N − of those elements. Theseare the only graphs which lead to a contribution of the form (cid:104) φ i | ˆ o | φ i (cid:105) . If wewould choose different permutations for both determinants, the contributionof the graph is zero.Here we have singled out the orbital φ i , but the same argument worksfor any orbital φ i k as i i i i k i N (cid:104) φ i k | ˆ o | φ i k (cid:105) δ i i δ i i i i i i k i N ... ...... ... = i k i i i k − i N (cid:104) φ i k | ˆ o | φ i k (cid:105) δ i i δ i i i k i i i k − i N ... ...... ... = (cid:104) φ i k | ˆ o | φ i k (cid:105) . (1.32)Again, we can permute the other indices i m (cid:54) = i k in ( N − ways withoutaffecting the value. The expectation value for J = I therefore becomes (cid:104) Φ I | ˆ O | Φ I (cid:105) = N ( N − N ! N (cid:88) k =1 (cid:104) φ i k | ˆ o | φ i k (cid:105) = N (cid:88) k =1 (cid:104) φ i k | ˆ o | φ i k (cid:105) , (1.33)where we included the pre-factor for the 1-body operators (1.21b).Now consider the case that the orbitals i k and j l are different. The onlyoption to get a non-zero contribution to the integral is to connect them to15he operator ˆ o , so we have i i i k i k +1 i l i N (cid:104) φ i k | ˆ o | φ j l (cid:105) δ i j δ i k +1 j k δ i N j N j j j k j k +1 j l j N ... ...... ...... ... = ( − k − l i k i i k − i k +1 i l i N (cid:104) φ i k | ˆ o | φ j l (cid:105) δ i j δ i k +1 j k δ i N j N j k j j k − j k j l − j N ... ...... ...... ... = ( − k − l (cid:104) φ i k | ˆ o | φ j l (cid:105) , (1.34)where the ( − k − l phase factor is the result of the permutation (12)(23) · · · ( k − k ) → ( − k − on the left side to get i k on the first position without affectingthe ordering of the other indices and the permutation (12)(23) · · · ( l − l ) → ( − l − on the left side to get j l on the first position.As in the previous diagrams, we can permute the other indices in ( N − without affecting (zeroing) the integral and we find (cid:104) Φ I | ˆ O | Φ J (cid:105) = ( − k − l (cid:104) φ i k | ˆ o | φ j l (cid:105) , (1.35)if only the orbitals i k and j l differ. Collecting the results for the 1-bodyoperators, we find as Slater–Condon rule (cid:104) Φ I | ˆ O | Φ J (cid:105) = N (cid:88) k =1 (cid:104) φ i k | ˆ o | φ i k (cid:105) if I = J , ( − k − l (cid:104) φ i k | ˆ o | φ j l (cid:105) if only i k (cid:54) = j l , otherwise . (1.36) Now we consider the 2-body operators, so we have a non-trivial integral withtwo pairs of orbitals (cid:104) i k i l | ˆ o | j m j n (cid:105) := (cid:104) φ i k φ i l | ˆ o | φ j m φ j n (cid:105) (1.37) := (cid:90) d x (cid:90) d x φ ∗ i k ( x ) φ ∗ i l ( x )ˆ o ( x , x ) φ j m ( x ) φ j n ( x ) , where we have omitted the explicit notation of φ , as it is clear that we arealways referring to the orbitals φ and it makes the notation less bulky. For16he two-body operators we will find that the integrals always come in pairsand we will use the following notation for such a pair (cid:104) kl (cid:107) ˆ o (cid:107) mn (cid:105) := (cid:104) kl | ˆ o | mn (cid:105) − (cid:104) kl | ˆ o | nm (cid:105) . (1.38)As we have now a two body operator even three orbitals in the determinantsneed to be different to make the integral identically zero. So now we needto consider three cases: 1) no orbitals different, 2) one orbital different, 3)two orbitals different.Let us start again when both determinants are comprised of the sameorbital set, so I = J . For simplicity we will first consider only terms wherethe operator works on i and i i i i i i N (cid:104) i i | ˆ o | i i (cid:105) δ i i δ i i δ i N i N i i i i i N ... ... = i i i i i N (cid:104) i i | ˆ o | i i (cid:105) δ i i δ i i δ i N i N i i i i i N ... ... = (cid:104) i i | ˆ o | i i (cid:105) , (1.39)where we used the symmetry of the 2-body operator. As the remainingindices can be permuted in ( N − ways without affecting the integral, wehave N − of the terms. Due to the operator in the integral, it is notnecessary to directly connect i to i and i to i to have a possible non-zerocontribution to the expectation value. Also connecting i to i can yield acontribution. These are graphs of the form i i i i i N (cid:104) i i | ˆ o | i i (cid:105) δ i i δ i i δ i N i N i i i i i N ... ... = i i i i i N (cid:104) i i | ˆ o | i i (cid:105) δ i i δ i i δ i N i N i i i i i N ... ... = (cid:104) i i | ˆ o | i i (cid:105) . (1.40)Again, by interchanging the remaining nodes we see the there are N − of these terms. Taking into account that in the last diagrams we got 117dditional crossing, connecting the nodes i and i to the operator yields thefollowing contribution to the expectation value N − (cid:0) (cid:104) i i | ˆ o | i i (cid:105) − (cid:104) i i | ˆ o | i i (cid:105) (cid:1) . (1.41)To get any other pair on the first positions, we always make an even numberof permutations and the contribution will have exactly the same form as forthe i , i pair. So to get the full expectation value, we simply need to sumover all possible combinations (cid:104) Φ I | ˆ O | Φ I (cid:105) = (cid:88) r Show that the HF energy functional can be written as E HF [ { φ i } ] = N (cid:88) i =1 (cid:104) φ i | ˆ h | φ i (cid:105) + 12 N (cid:88) i,j =1 (cid:104) ij (cid:107) ij (cid:105) + E nuc , (1.52)where ˆ h := − ∇ + v ( r ) . Exercise 1.9. Show that (cid:104) ij (cid:107) kl (cid:105) = ( ik (cid:107) jl ) . Exercise 1.10 [restricted HF (RHF) for H in a minimal basis]. Inthis exercise we will calculate the (singlet) restricted HF (RHF) energy forthe H molecule in a minimal basis, i.e. two s orbitals on each hydrogenatom χ A ( r ) = N ζ e − ζ | r − R A | χ B ( r ) = N ζ e − ζ | r − R B | (1.53)20) Show that the normalisation constant of the s orbitals is N ζ = (cid:114) ζ π . (1.54)b) Show that the overlap of the s functions is s := (cid:104) χ A | χ B (cid:105) = (cid:18) ρ + ρ (cid:19) e − ρ , (1.55)where ρ := ζR and R := | R A − R B | is the internuclear distance. Hint: You can use different coordinate systems to solve the integral.You might be inclined to use the more familiar spherical coordinatesystem, but the integrals are not very straightforward. The most nat-ural one which leads to the easiest integrals is the prolate spheroidalcoordinate system (Ap. A), as there are two nuclei which can be placedat the foci of ellipses. First show that χ A/B = N ζ e − ρ ( ξ ∓ η ) and thenperform the integral over ξ , η and φ .The s functions on the different hydrogen atoms are not orthogonal ( s (cid:54) = 0 ).One way to orthogonalise them is to make symmetry adapted combination( σ g / σ u ). Due to the symmetry of the system, we expect one of them to bethe HF solution.c) Show that the normalised symmetry adapted basis functions are σ g = χ A + χ B (cid:112) s ) and σ u = χ A − χ B (cid:112) − s ) . (1.56)Why are these functions orthogonal?d) Argue that a doubly occupied σ g orbital, i.e. a determinant with spinup and a spin down electron in the σ g orbital would yield the lowestenergy.To calculate the RHF energy, we need to evaluate the expectation values ofthe different parts of the Hamiltonian. We will first consider the one-bodypart.e) Show that (cid:104) χ A | ˆ h | χ A (cid:105) = (cid:104) χ B | ˆ h | χ B (cid:105) = ζ − ζ + ζρ (cid:0) e − ρ (1 + ρ ) − (cid:1) , (1.57a) (cid:104) χ A | ˆ h | χ B (cid:105) = (cid:20) ζ (cid:18) ρ − ρ (cid:19) − ζ (1 + ρ ) (cid:21) e − ρ . (1.57b)21 int: If you do not want to evaluate the Laplacian explicitly, you canuse that the orbitals χ A/B solve a hydrogenic Schr¨odinger equationwith charge ζ , i.e. (cid:18) − ∇ − ζr A/B (cid:19) χ A/B = − ζ χ A/B . (1.58)f) Evaluate the long bond distance limit, R → ∞ , of the one-body terms.Did you expect this result? Explain.g) Show that (cid:104) σ g | ˆ h | σ g (cid:105) = (cid:104) χ A | ˆ h | χ A (cid:105) + (cid:104) χ A | ˆ h | χ B (cid:105) s . (1.59)Evaluation of the two-body part is more challenging in general, since thetwo-electron integrals are 6D. To ease the computation, we use the Mullikenapproximation (Mulliken 1949). The idea of the Mulliken approximation isto regard a two-electron integral as an interactions between the two chargedensities φ ∗ i ( r ) φ l ( r ) and φ ∗ j ( r ) φ k ( r ) . These charge densities will be propor-tional to the overlap between the orbitals, so Mulliken approximated themas φ ∗ i ( r ) φ l ( r ) ≈ S il | φ i ( r ) | + | φ l ( r ) | (1.60)and similar for φ ∗ j ( r ) φ k ( r ) of course. Using this approximation to the chargedensities, one finds for the integrals ( il | jk ) ≈ ( il | jk ) M := S il S jk (cid:2) ( ii | jj ) + ( ii | kk ) + ( ll | jj ) + ( ll | kk ) (cid:3) . (1.61)h) Show that ( AA | AB ) M = s (cid:2) ( AA | AA ) + ( AA | BB ) (cid:3) , (1.62a) ( AB | AB ) M = s (cid:2) ( AA | AA ) + ( AA | BB ) (cid:3) , (1.62b) ( σ g σ g | σ g σ g ) = 12(1 + s ) (cid:2) ( AA | AA ) + 4( AA | AB ) +( AA | BB ) + 2( AB | AB ) (cid:3) , (1.62c) ( σ g σ g | σ g σ g ) M = 12 (cid:2) ( AA | AA ) + ( AA | BB ) (cid:3) , (1.62d)where we used the abbreviations A/B = χ A/B .So we find that in the Mulliken approximation, only ( AA | BB ) needs to becalculated, as ( AA | AA ) follows by taking the limit R B → R A .22) Show that ( AA | BB ) = ζ (cid:20) ρ − e − ρ (cid:18) ρ + 118 + 3 ρ ρ (cid:19)(cid:21) . (1.63) Hint: First evaluate the Coulomb potential due to one of the chargedensities, e.g. the density | χ A ( r ) | yields the Coulomb potential v A ( r (cid:48) ) = (cid:90) d r | χ A ( r ) | | r (cid:48) − r | . (1.64)Note that you already calculated this integral for r (cid:48) = R B in part 1.10.e.Next you only need to work out (cid:82) d r | χ B ( r ) | v A ( r ) , whose parts youalso already did before.j) Show that ( AA | AA ) = lim R → ( AA | BB ) = 5 ζ . (1.65)Now we have all the ingredients to calculate the restricted HF energy for thedoubly occupied σ g orbitalk) Show that E RHF [( σ g ) ] = 2 (cid:104) σ g | ˆ h | σ g (cid:105) + ( σ g σ g | σ g σ g ) + 1 R . (1.66)The following exercises are easiest to do with some math program, e.g. Mathematica .l) Optimise the orbital exponent, ζ , at each distance R to minimise theenergy. Plot the exponent as a function of the bonding distance. Whatdo you notice in the dissociation limit?m) Plot also the optimised RHF energy (with optimised ζ ) as a functionof R . Is the dissociation limit as you would expect? The Mulliken approximation for the two-electron integrals can be avoided ifone is willing to evaluate two additional integrals. The ( AA | AB ) integralcan be handled in the same manner as the ( AA | BB ) integral and yields ( AA | AB ) = ζ e − ρ ρ (cid:16) ρ + 16 ρ − (5 + 2 ρ )e − ρ (cid:17) . (1.67)23he other integral turns out to be a nasty one, which probably can only behandled in the prolate spheroidal coordinate system. The Coulomb interac-tion needs to expressed in the Von Neumann expansion | r − r | = 2 R ∞ (cid:88) l =0 l (cid:88) m = − l α ml P | m | l ( ξ < ) Q | m | l ( ξ > ) × P | m | l ( η ) P | m | l ( η ) e i m ( φ − φ ) , (1.68)where P | m | l ( x ) and Q | m | l ( x ) are the associated Legendre polynomials of thefirst and second kind respectively. The expansion coefficients are given as α lm = ( − m (2 l + 1) (cid:18) ( l − | m | )!( l + | m | )! (cid:19) (1.69)and ξ > := max( ξ , ξ ) and ξ < := min( ξ , ξ ) . After a long massage, theintegral becomes ( AB | AB ) = ζ e − ρ ρ (cid:104) ρ − ρ − ρ − ρ +16 (cid:0) γ + ln( ρ ) (cid:1) (9 + 18 ρ + 15 ρ + 6 ρ + ρ ) − − ρ + ρ )e ρ Ei( − ρ ) +16(3 − ρ + ρ ) e ρ Ei( − ρ ) (cid:105) , (1.70)where the Euler–Mascheroni constant is defined as γ := − (cid:90) ∞ d z e − z ln( z ) = 0 . 577 215 664 9 . . . (1.71)and the exponential integral is defined as Ei( − x ) := − (cid:90) ∞ x d z e − z z . (1.72) In the derivation of the HF energy functional we assumed that the orbitals areorthonormal, so we need to respect this constraint. There are different waysto handle this constraint. We will use the method of Lagrange multipliers todo this. If you do not know about Lagrange multipliers or need a refresh ofyour memory, read Appendix B. Here are two exercises to practice your skill Exercise 1.11. Find the point on the parabola y ( x ) = ( x − closest tothe point ( x, y ) = (1 , in the Euclidean norm (Nocedal and Wright 2006,problem 12.18). So consider the following minimisation problem min x,y ∈ R f ( x, y ) = ( x − + ( y − subjet to ( x − = 5 y. (1.73)24) Construct the Lagrangian and find the points which satisfy the firstorder optimality conditions.b) Which of these points are solutions?c) It is tempting to eliminate the ( x − term to transform the probleminto an unconstraint minimisation. Show that this procedure does not yield the correct minimum. Exercise 1.12. Find the maximum and minimum of f ( x, y, z ) = 4 y − z subject to the constraints x − y − z = 2 and x + y = 1 (Dawkins n.d.,example 5).To enforce the orthonormality of the orbitals we introduce the followingLagrangian L [ { φ i , φ ∗ i } , (cid:15) ] = E HF [ { φ i , φ ∗ i } ] − N (cid:88) r,s =1 (cid:15) rs (cid:0) (cid:104) φ s | φ r (cid:105) − δ sr (cid:1) . (1.74)Thanks to the Lagrange multipliers, we can now vary over φ i and φ ∗ i as ifthey were independent. Now consider small perturbations in the orbitals δφ i and δφ ∗ i . The first order variation in the Lagrangian is readily worked outas δL = N (cid:88) i =1 (cid:104) δφ i | ˆ h | φ i (cid:105) + 12 N (cid:88) i,j =1 (cid:0) (cid:104) δφ i φ j (cid:107) φ i φ j (cid:105) + (cid:104) φ i δφ j (cid:107) φ i φ j (cid:105) (cid:1) + N (cid:88) i =1 (cid:104) φ i | ˆ h | δφ i (cid:105) + 12 N (cid:88) i,j =1 (cid:0) (cid:104) φ i φ j (cid:107) δφ i φ j (cid:105) + (cid:104) φ i φ j (cid:107) φ i δφ j (cid:105) (cid:1) − N (cid:88) i,j =1 (cid:15) ji (cid:0) (cid:104) δφ j | φ i (cid:105) + (cid:104) φ j | δφ i (cid:105) (cid:1) . (1.75)Collecting now all variations due to δφ ∗ i and δφ i respectively, the first ordervariation in the Lagrangian can be expressed as δL = N (cid:88) i =1 (cid:90) d x (cid:18) δφ ∗ i ( x ) δLδφ ∗ i ( x ) + δLδφ i ( x ) δφ i ( x ) (cid:19) , (1.76)where δLδφ ∗ i ( x ) = ˆ hφ i ( x ) − N (cid:88) j =1 φ j ( x ) (cid:15) ji + N (cid:88) j =1 (cid:90) d x (cid:48) φ ∗ j ( x (cid:48) ) φ j ( x (cid:48) ) φ i ( x ) − φ j ( x ) φ i ( x (cid:48) ) | r − r (cid:48) | , (1.77a)25 Lδφ i ( x ) = ˆ hφ ∗ i ( x ) − N (cid:88) j =1 (cid:15) ij φ ∗ j ( x ) + N (cid:88) j =1 (cid:90) d x (cid:48) φ ∗ i ( x ) φ ∗ j ( x (cid:48) ) − φ ∗ i ( x (cid:48) ) φ ∗ j ( x ) | r − r (cid:48) | φ j ( x (cid:48) ) . (1.77b)The quantities δL/δφ i ( x ) and δL/δφ ∗ i ( x ) are called functional derivativesand behave quite similarly as ordinary derivatives (chain-rule and productrule apply). The general mathematical theory behind it is quite involvedand is beyond the scope of this course.Now let us analyse the terms in the functional derivative and in particularthe terms with the interaction. In the first interaction term you can recognisethe classical Coulomb potential due to the interaction with the (electroniccharge) density ρ ( x ) = N (cid:88) i =1 | φ ( x ) | → v H [ ρ ]( x ) := (cid:90) d x (cid:48) ρ ( x (cid:48) ) | r − r (cid:48) | . (1.78)Due to historical reasons, this term is called the Hartree term. Probably be-cause Hartree (Hartree 1928) only had this interaction term in his equations,as he did not take the anti-symmetry properly into account. He started fromonly an orbital product.The second part in the interaction terms in (1.77) were recovered byFock (Fock 1930), so sometimes called the Fock part. He started correctlyfrom a Slater determinant, so got this additional term which has no classicalanalogue. Since this term is caused by the permutation symmetry of theparticles, this term is also called the exchange term. Note that as this termis caused by the permutation symmetry of the particles, the exchange termwould have a plus sign for bosons.Since the index i also appears in the integral over x (cid:48) , the exchange poten-tial can not be expressed as a simple local (multiplicative) potential. Instead,we need to express the exchange potential as an integral kernel v x [ { φ i } ]( x , x (cid:48) ) := − N (cid:88) j =1 φ j ( x ) φ ∗ j ( x (cid:48) ) | r − r (cid:48) | . (1.79)The exchange part in (1.77a) can now be written as − N (cid:88) j =1 (cid:90) d x (cid:48) φ ∗ j ( x (cid:48) ) φ j ( x ) φ i ( x (cid:48) ) | r − r (cid:48) | = (cid:90) d x (cid:48) v x ( x , x (cid:48) ) φ i ( x (cid:48) ) =: (cid:0) ˆ v x φ i (cid:1) ( x ) . (1.80)One-body operators which cannot be expressed as a simple multiplicationare called non-local. An other example of a non-local one-body operator isthe kinetic energy. Every non-local operator can be expressed as an integral26ernel. For example, the integral kernel for the kinetic energy can be writtenas t ( x , x (cid:48) ) = 12 (cid:88) r ∇ χ r ( x ) · ∇ χ ∗ r ( x (cid:48) ) , (1.81)where { χ r } is an arbitrary complete orthonormal basis. Exercise 1.13. Check that the kinetic energy integral kernel (1.81) indeedcorresponds to the kinetic energy operator. So you need to check whether (cid:90) d x (cid:48) t ( x , x (cid:48) ) ψ ( x (cid:48) ) = − ∇ ψ ( x ) . (1.82)Note that all local one-body operators can be regarded as a special caseof a non-local operator with the help of the Dirac delta distribution. Forexample, the Coulomb potential can be written as v H ( x , x (cid:48) ) = v H ( x ) δ ( x − x (cid:48) ) (1.83)The non-local one-body potential are equivalent to matrices. When you actwith a matrix (non-local operator) on a vector (function), you can get anyvector back. However, when the matrix is diagonal (local potential), youget the same vector (function) back with scaled components (values in thepoints).Now let us turn our attention back to functional derivatives of the Lag-rangian. All terms coming from the HF energy are collected in the Fockoperator ˆ f [ { φ i } ] := ˆ h + ˆ v H [ { φ i } ] + ˆ v x [ { φ i } ] . (1.84)The functional derivative can be compactly expressed as δLδφ ∗ i ( x ) = ˆ f φ i ( x ) − N (cid:88) j =1 φ j ( x ) (cid:15) ji , (1.85a) δLδφ i ( x ) = ˆ f ∗ φ ∗ i ( x ) − N (cid:88) j =1 (cid:15) ij φ ∗ j ( x ) . (1.85b)Since the Lagrangian needs to be stationary with respect to any variation inthe orbital, δφ ∗ i ( x ) and δφ i ( x ) , both functional derivatives need to be zero.Just as if we were dealing with functions instead of functionals. So as firstorder optimality conditions, we obtain ˆ f φ i ( x ) = N (cid:88) j =1 φ j ( x ) (cid:15) ji , ˆ f ∗ φ ∗ i ( x ) = N (cid:88) j =1 (cid:15) ij φ ∗ j ( x ) . (1.86) In case you are not scared of derivatives of delta distributions, you can also expressthe kernel as t ( x , x (cid:48) ) = ∇ x (cid:48) · ∇ x δ ( x − x (cid:48) ) . (cid:15) -matrix on the right-hand side. We would rather like itto be diagonal.To show that (cid:15) can be made diagonal, we multiply (1.86) φ i ( x ) and φ j ( x ) respectively and integrating over x , we find (cid:104) φ j | ˆ f | φ i (cid:105) = (cid:15) ji , (cid:104) φ j | ˆ f | φ i (cid:105) ∗ = (cid:15) ij , (1.87)so we find that the Lagrange multiplier matrix is hermitian, (cid:15) = (cid:15) † . We usedhere that at the solution point, we satisfy also the constraint (cid:104) φ i | φ j (cid:105) = δ ij .Combing now these two equations, we have (cid:15) ij = (cid:104) φ j | ˆ f | φ i (cid:105) ∗ = (cid:15) ∗ ji . (1.88)Additionally, one can show that the HF wavefunction does not change ifwe make a unitary transformation among the HF orbitals, except for anirrelevant overall phase factor. In particular, the HF energy will not changeunder such a unitary transformation and we are allowed to diagonalise (cid:15) .This yields the canonical HF equations ˆ f [ { φ } ] φ i ( x ) = ε i φ i ( x ) , (1.89)where the eigenvalues of the Lagrange multiplier matrix, ε i , are called HForbital energies. The HF orbitals that also diagonalise the Fock operator arecalled the canonical HF orbitals. Exercise 1.14. Show that either δL/δφ i ( x ) = 0 or δL/δφ ∗ i ( x ) = 0 is re-dundant if you use that (cid:15) = (cid:15) † in 1.86. Exercise 1.15. Show that the HF wavefunction remains the same up to anoverall phase factor, when making a unitary transformation among the HForbitals.Hint: The HF wavefunction is Φ HF = det( Φ ) , where Φ is the matrix com-posed of all HF orbitals at all possible coordinates. Next show that theHF wavefunction with the transformed orbitals can be written as Φ (cid:48) HF =det( Φ U ) , where the transformation matrix U is defined as φ (cid:48) i ( x ) = N (cid:88) j =1 φ j ( x ) U ji . Exercise 1.16. Check that both δL/δφ i ( x ) = 0 as well as δL/δφ ∗ i ( x ) = 0 indeed reduce to the expressions in terms of the Fock operators (1.85). Exercise 1.17 [unrestricted HF (UHF) for H in a minimal basis]. In exercise 1.10 we assumed that the HF solution would have the samesymmetry as the system itself, so we used symmetry adapted orbitals as28rial orbitals. We argued that both electrons should occupy the σ g orbitalto get the lowest energy. Since the Fock operator is not linear, the solutionsdo not necessarily exhibit the symmetry of the system. Since electrons repeleach other, we might be able to lower the HF energy by allowing the electronsto localise in their own orbital. One with spin up and the other with spindown. For the two orbital (minimal basis) model for H we will first assumethat they localise in the atomic orbitals, so the orbitals would be χ A ( r ) α ( s ) and χ B ( r ) β ( s ) . Note that χ A ( r ) and χ B ( r ) are not orthogonal. For laterconvenience, we will perturb them to be orthogonal.a) Explain why it is not necessary that (cid:104) χ A | χ B (cid:105) (cid:54) = 0 .b) Given the overlap matrix, S ij = (cid:104) χ i | χ j (cid:105) , show that the orbitals φ i ( x ) = (cid:88) j χ j ( x ) S − / ji (1.90)are orthonormal. This method to generate an orthonormal basis froma non-orthonormal one is called L¨owdin orthonormalisation (L¨owdin1950).c) In the minimal H basis, the overlap matrix is of the form S = (cid:18) ss (cid:19) . (1.91)Calculate the inverse, S − .d) The inverse square root of the overlap matrix is S − / = 1 √ − s (cid:18) cos θ − sin θ − sin θ cos θ (cid:19) , (1.92)where s =: sin(2 θ ) . Check this by showing that (cid:0) S − / (cid:1) = S − .e) Show that φ a/b ( r ) = 1 √ − s (cid:0) cos( θ ) χ A/B ( r ) − sin( θ ) χ B/A ( r ) (cid:1) . (1.93)Now that we have orthogonalized the orbitals, we need to transform theintegrals to the new basis.f) Show that (cid:104) φ a | ˆ h | φ a (cid:105) = (cid:104) φ b | ˆ h | φ b (cid:105) = (cid:104) χ A | ˆ h | χ A (cid:105) − s (cid:104) χ A | ˆ h | χ B (cid:105) − s , (1.94a) (cid:104) φ a | ˆ h | φ b (cid:105) = (cid:104) φ a | ˆ h | φ b (cid:105) = (cid:104) χ A | ˆ h | χ B (cid:105) − s (cid:104) χ A | ˆ h | χ A (cid:105) − s . (1.94b)29) Show that the unique two-electron integrals can be expressed as ( aa | aa ) = 1(1 − s ) (cid:20)(cid:18) − s (cid:19) ( AA | AA ) + s AA | BB ) − s ( AA | AB ) + s ( AB | AB ) (cid:21) , (1.95a) ( aa | bb ) = 1(1 − s ) (cid:20)(cid:18) − s (cid:19) ( AA | BB ) + s AA | AA ) − s ( AA | AB ) + s ( AB | AB ) (cid:21) , (1.95b) ( aa | ab ) = 1(1 − s ) (cid:20) (1 + s )( AA | AB ) − s ( AB | AB ) − s (cid:0) ( AA | AA ) = ( AA | BB ) (cid:1)(cid:21) , (1.95c) ( ab | ab ) = 1(1 − s ) (cid:20) ( AB | AB ) − s ( AA | AB ) − s (cid:0) ( AA | AA ) = ( AA | BB ) (cid:1)(cid:21) . (1.95d)h) Show that in the Mulliken approximation, the integrals reduce to ( aa | aa ) M = ( AA | AA ) + 12(1 − s ) (cid:2) ( AA | AA ) − ( AA | BB ) (cid:3) , (1.96a) ( aa | bb ) M = ( AA | BB ) + 12(1 − s ) (cid:2) ( AA | BB ) − ( AA | AA ) (cid:3) , (1.96b) ( aa | ab ) M = ( ab | ab ) M = 0 . (1.96c)Now we put one electron with spin up in φ a and the other electron with spindown in φ b .i) Show that the HF energy becomes E locHF = 2 (cid:104) φ a | ˆ h | φ a (cid:105) + ( φ a φ a | φ b φ b ) + 1 R . (1.97)j) Using the integrals you already calculated in exercise 1.10, optimisethe exponent of the orbitals for this localised HF solution, e.g. with Mathematica . Compare to the exponent in the RHF exercise 1.10.l.k) Plot the HF energy with the localised orbitals. Compare with the RHFenergy of 1.10.m. What do you notice?l) Put now two spin up electrons in the localised orbitals and work outthe HF energy in the Mulliken approximation and compare with 1.17.iin the dissociation limit. 30) Show that occupying φ a and φ b with two spin up electrons yields inthe dissociation limit the same HF wavefunction (up to a sign) asoccupying the σ g and σ u orbitals with two spin up electrons.So far we assumed that the HF orbitals are either localised or fully deloc-alised (symmetry adapted). In a fully unrestricted HF (UHF) calculation,the HF orbitals can also be a mixture between these two extremes, i.e. weneed to consider linear combinations. Since the final charge density will besymmetric, the linear combinations are restricted to the form ψ ( r ) a/b = cos( α ) φ a/b ( r ) + sin( α ) φ b/a ( r ) . (1.98)In principle, we should now solve the HF equations. However, since there isonly one additional parameters, it is easier to write the energy as a functionof α as well and to optimise.n) Show that the unrestricted HF (UHF) energy with the orbitals ψ a/b can be expressed as E UHF = 2 (cid:0) (cid:104) a | ˆ h | a (cid:105) + sin(2 α ) (cid:104) a | ˆ h | b (cid:105) (cid:1) +( aa | bb ) + sin (2 α )2 (cid:0) ( aa | aa ) − ( aa | bb ) (cid:1) + 1 R . (1.99)o) Optimise (numerically) both the orbital exponents, ζ , and the orbitalmixing angle, α . Plot α as a function of the internuclear distance, R .What do you notice?p) Plot the fully optimised UHF energy and compare to the previousresults (RHF from exercise 1.10.m and the localised solution 1.17.i). Exercise 1.18. The kernel of the spin-density operator can be written as ρ ( x , x (cid:48) ) = N (cid:88) i =1 δ ( x − x (cid:48) ) . (1.100)a) Show that this expression is correct. That means, you need to showthat it gives the expected expression for the spin-density of a general wavefunction.b) Show that the expectation value of the spin-density simplifies to ρ ( x ) = N (cid:88) i =1 | φ i ( x ) | , (1.101)if the wavefunction is a Slater determinant.31 .5 Roothaan–Hall equations The canonical HF equations derived before are differential equations whichneed to be solved self-consistently. As with the many-body Schr¨odingerequation, it is basically impossible to do this for general φ i ( x ) . Hence, weresort to the same approach which leads to full CI: expand the orbitals in abasis φ i ( x ) = m (cid:88) ν =1 χ ν ( x ) C νi , (1.102)where m denotes the number of functions in our basis. This is exactly whatRoothaan (Roothaan 1951) and Hall (Hall 1951) did independently in 1951.Simply insert the expansion (1.102) in the HF equations (1.89), multiplyfrom the left by χ ∗ µ ( x ) and integrate over x m (cid:88) ν =1 (cid:104) χ µ | ˆ f | χ ν (cid:105) (cid:124) (cid:123)(cid:122) (cid:125) =: f µν C νi = (cid:88) ν (cid:104) χ µ | χ ν (cid:105) (cid:124) (cid:123)(cid:122) (cid:125) = S µν C νi ε i , (1.103)where we did not assume that the basis { χ } is orthonormal. So you seethat the general Roothaan–Hall equations have the form of a generalisedeigenvalue equation f C = S C diag( ε ) , (1.104)where diag( ε ) stands for a diagonal matrix with ε on its diagonal. In the caseof an orthonormal basis, the Roothaan–Hall equations reduce to an ordinaryeigenvalue equation f C = C diag( ε ) . (1.105)So starting from some non-orthonormal (atomic) basis, one can either usethe general form (1.104), or first orthonormalise the basis and use the simplerform (1.105).The Fock operator itself depends on the HF orbitals, so we should expressit also in terms of the expansion coefficients f µν [ C ] := (cid:104) χ µ | ˆ f [ C ] | χ ν (cid:105) = (cid:104) χ µ | ˆ h + ˆ v H [ C ] + ˆ v x [ C ] | χ ν (cid:105) . (1.106)The one-body part of the Hamiltonian, ˆ h , does not depend on the HF orbit-als, so nothing to be done for that term. For the Hartree (classical Coulomb)potential, we need the spin-density. The spin-density in terms of the expan-sion coefficients becomes ρ ( x ) = N (cid:88) i =1 | φ i ( x ) | = (cid:88) µ,ν χ µ ( x ) N (cid:88) i =1 C µi C † iν (cid:124) (cid:123)(cid:122) (cid:125) =: γ HF µν χ ∗ ν ( x ) . (1.107)32he quantity γ HF µν is called the HF one-body reduced density matrix (1RDM),or ‘density matrix’ for short if only used in the context of HF. Let us considersome properties of the HF 1RDM. As the spin-density integrates to the totalnumber of electrons, we have N = N (cid:88) i =1 (cid:104) φ i | φ i (cid:105) = (cid:88) µ,ν N (cid:88) i =1 C † iν (cid:104) χ µ | χ ν (cid:105) C µi = (cid:88) µ,ν γ HF µν S νµ = Tr { γ HF S } . (1.108)If we use an atomic basis set, each basis function belongs to a certain atom,so the trace can be partitioned into atomic contributions as Tr { γ HF S } = (cid:88) A (cid:88) µ ∈ A ( γ HF S ) µµ . (1.109)The combined contribution per atom is called the Mulliken population (Mul-liken 1955a,b) ρ Mulliken A = (cid:88) µ ∈ A ( γ HF S ) µµ . (1.110)The Mulliken charge is now obtained by adding the nuclear charge Q Mulliken A = Z A − ρ Mulliken A . (1.111)Note that this decomposition only makes sense for basis functions whichare well localised on the individual atoms. Large basis sets with diffusefunctions, this decomposition scheme breaks down. There are more soph-isticated decomposition schemes which do not suffer from this, e.g. naturalpopulation analysis (Reed, Weinstock and Weinhold 1985), Bader’s atomsin molecules (Bader 1990) and Vornoi deformation density (Voronoi 1908;Bickelhaupt et al. 1996; Fonseca Guerra et al. 2004). These more sophistic-ated decomposition schemes can also used in combination other basis sets,such as plane waves. The Mulliken decomposition is easy to generalise tomore fine-grained populations, e.g. separate s , p , d , etc. contributions. Theprime advantage of the Mulliken analysis is its low computational cost andsimplicity to implement.Now let us turn our attention back to the Hartree potential. Insertingthe expansion for the density (1.107) in the Hartree potential (1.78), we find v H ( x ) = (cid:90) d x (cid:48) ρ ( x (cid:48) ) | r − r (cid:48) | = (cid:88) µ,ν γ HF µν (cid:90) d x (cid:48) χ µ ( x (cid:48) ) χ ∗ ν ( x (cid:48) ) | r − r (cid:48) | , (1.112)or in terms of its matrix elements (cid:104) χ µ | ˆ v H | χ ν (cid:105) = N (cid:88) i =1 (cid:104) χ µ φ i | χ ν φ i (cid:105) = (cid:88) ρσ γ HF σρ (cid:104) χ µ χ ρ | χ ν χ σ (cid:105) . (1.113)33or the exchange potential (1.79) we need a ‘density with two coordinates’ γ HF ( x , x (cid:48) ) = N (cid:88) i =1 φ i ( x ) φ ∗ i ( x (cid:48) ) = (cid:88) µ,ν χ µ ( x ) γ HF µν χ ∗ ν ( x (cid:48) ) , (1.114)which is the HF 1RDM in coordinate representation. Note that the spin-density is just the diagonal of the 1RDM, ρ ( x ) = γ HF ( x , x ) . The exchangekernel now becomes v x ( x , x (cid:48) ) = − γ HF ( x , x (cid:48) ) | r − r (cid:48) | = − (cid:88) µ,ν χ µ ( x ) γ HF µν | r − r (cid:48) | χ ∗ ν ( x (cid:48) ) , (1.115)or in its matrix representation (cid:104) χ µ | ˆ v x | χ ν (cid:105) = − (cid:88) ρ,σ γ HF σρ (cid:104) µρ | σν (cid:105) . (1.116)We have now seen the HF 1RDM in two representations: in the space-spincoordinate representation and 2) in and arbitrary basis ( { χ i } ) representation.In the canonical HF basis the HF 1RDM becomes particularly simple as itis diagonal γ HF rs = (cid:104) φ r | ˆ γ HF | φ s (cid:105) = (cid:40) if r = s ≤ N otherwise. (1.117)The HF 1RDM is therefore idempotent, i.e. γ HF Sγ HF = γ HF . (1.118) Exercise 1.19. a) Prove that an idempotent matrix M can only have 0 and 1 as itseigenvalues. Assume that an orthonormal basis is used, so S = .b) Show that if the HF 1RDM is obtained in a non-orthogonal basis, so C † SC = , that the idempotency property changes to (1.118). Usethat the HF 1RDM in a general basis can be written as, γ HF = CρC † ,where ρ is the 1RDM in the canonical HF basis (1.117).Since the Roothaan–Hall equations have the form of a(n) (generalised) ei-genvalue equation, an algorithm to find the optimal HF solution would beto start with an initial guess for the HF orbitals, construct the Fock matrixand diagonalise it. Then select the orbitals with the lowest orbital energies(eigenvalues) to construct a new HF 1RDM. That selecting the orbitals withthe lowest orbital energies leads to the lowest HF energy is called the auf-bau principle . For the completely unrestricted form described here, we willproof that the aufbau principle always works and leads to the lowest HF34nergy. However, as HF is usually implemented with additional restriction,that proof does not apply anymore. One needs to generalise the aufbau prin-ciple to handle degenerate orbital energies as well. This generalisation andits proof are beyond the scope of this course, but can be found in Ref. (Gies-bertz and Baerends 2010).The N orbitals used to construct the new HF determinant / 1RDM arecalled the occupied orbitals. The other orbitals obtained from diagonalisingthe Fock matrix are called the unoccupied/empty/virtual orbitals. Exercise 1.20. Make a diagram of the self-consistent field (SCF) procedureto solve the HF equations. Consider a singly excited determinant. To be more precise, with a singlyexcited determinant, Φ ai , we mean a determinant where an occupied orbital φ i ( x ) is replaced by a virtual orbital, φ a ( x ) . Brillouin’s theorem states (cid:104) Φ | ˆ H | Φ ai (cid:105) = 0 , (1.119)where Φ is the unperturbed HF determinant. This result will be useful laterin the course, when we add additional determinants to the HF determinantto improve our approximation of the wavefunction. Exercise 1.21. Prove Brillouin’s theorem (1.119). Use the Slater–Condonrules to work out the left-hand side of (1.119) and relate the result to theFock matrix. Since the HF orbitals are eigenfunctions of the Fock operator, the HF wave-function is an eigenfunction of the following Hamiltonian ˆ H (0) = N (cid:88) i =1 ˆ f ( x i ) . (1.120)Note that E HF (cid:54) = (cid:104) Φ HF | ˆ H (0) | Φ HF (cid:105) , but instead E HF = (cid:104) Φ HF | ˆ H | Φ HF (cid:105) (1.121) = (cid:104) Φ HF | ˆ H (0) | Φ HF (cid:105) + (cid:104) Φ HF | ˆ H − ˆ H (0) | Φ HF (cid:105) = E (1) . The HF energy can therefore be regarded as the first order corrected energy,use ˆ H (0) as a zeroth order Hamiltonian. It is obvious that this perturbationexpansion can be pushed to higher orders. This is called M ø ller–Plesset (MP)perturbation theory and will be explained later in the the course in moredetail. One could call HF ‘MP1’: first order corrected perturbation theory.35 xercise 1.22. a) Show that ˆ H (0) | Φ HF (cid:105) = E (0) | Φ HF (cid:105) where E (0) = N (cid:88) i =1 (cid:15) i . (1.122)b) Show that E HF = N (cid:88) i =1 (cid:15) i − N (cid:88) i,j =1 (cid:104) ij (cid:107) ij (cid:105) = 12 N (cid:88) i =1 (cid:0) (cid:104) i | ˆ h | i (cid:105) + (cid:15) i (cid:1) . (1.123) This theorem on the interpretation of the HF solution was published byTjalling Koopmans (Koopmans 1934). He received a Nobel prize in econom-ics in 1975. Koopmans’ theorem states that the occupied HF orbital energiescan be regarded as approximations to ionisation energies/potentials and thatthe unoccupied ones serve as approximations to affinities. The assumptionis that the orbitals do not relax when an electron is removed from or addedto the system. Under this assumption one can readily show thatIP HF i = E N − HF − E N HF ≈ − (cid:15) i , (1.124a)EA HF a = E N HF − E N +1 HF ≈ − (cid:15) a . (1.124b)Apart from the intrinsic approximations in HF, the approximation of norelaxation introduces an error of several eVs (hundreds kcal / mol ). Since ion-isation energies are typically quite large, the relative error is not too largeand − (cid:15) i often gets you in the right ball park. On the contrary, affinities aretypically small, which renders − (cid:15) a as an approximation to affinities prac-tically useless. The HF virtuals typically have positive orbital energies, soKoopmans’ theorem predicts many negatively charged ions not to be stable. Exercise 1.23. Proof Koopmans’ theorem. Often we are interested in systems for which the ground state is a singlet (cid:104) ˆ S (cid:105) = 0 , so we will have an equal amount of electrons in spin up and spindown orbitals. One can show that a single Slater determinant can only be asinglet, if we have a closed shell solution. Closed shell means that the spinup and spin down orbitals span the same spatial function space. This iseasiest to implement as both having the same spatial part φ i ( x ) = (cid:40) ψ ( i +1) / ( r ) α ( s ) for i odd, ψ i/ ( r ) β ( s ) for i even . (1.125)36ince the only degree of freedom is now the spatial part of the HF orbitals,we can integrate out the spin part from the HF expressions and only N/ spatial HF orbitals remain to be determined. This form of HF is calledrestricted HF (RHF). The RHF energy becomes E RHF = 2 N/ (cid:88) i =1 (cid:104) ψ i | ˆ h | ψ i (cid:105) + N/ (cid:88) i,j =1 (cid:0) (cid:104) ψ i ψ j | ψ i ψ j (cid:105) − (cid:104) ψ i ψ j | ψ j ψ i (cid:105) (cid:1) . (1.126)The RHF SCF equations retain the same basic form (cid:15) i ψ i ( r ) = ˆ f ψ i ( r ) = (cid:0) ˆ h + ˆ v H + ˆ v x (cid:1) ψ i ( r ) , (1.127)though the Hartree and exchange potential are now v H ( r ) = (cid:90) d r (cid:48) ρ ( r (cid:48) ) | r − r (cid:48) | , (1.128a) v x ( r , r (cid:48) ) = − γ HF ( r , r (cid:48) ) | r − r (cid:48) | , (1.128b)where the spin-integrated HF density and one-body reduced density matrix(1RDM) are defined as γ HF ( r , r (cid:48) ) = (cid:88) σ γ HF ( r σ, r (cid:48) σ ) = 2 N/ (cid:88) i =1 ψ i ( r ) ψ ∗ i ( r (cid:48) ) , (1.129a) ρ ( r ) = (cid:88) σ ρ ( r σ ) = 2 N/ (cid:88) i =1 | ψ i ( r ) | = γ HF ( r , r ) . (1.129b)As one is most of the time interested in closed shell systems, RHF is themost used form of HF. In the case of open shell there exist many variantswith varying restrictions. If one uses the same spatial parts for the spin-upand spin-down orbitals, one calls this restricted open shell HF (ROHF). Ifone only fixes the number of occupied spin-up and spin-down orbitals (the S z value) and allows for different spatial parts of the spin-up and spin-downorbitals, the method is called unrestricted HF (UHF). The HF as introducedin the beginning of this chapter is even less restrictive, as it only fixes thenumber of electrons and finds itself the optimal distribution between thenumber of spin-up and spin-down electrons. One could call this completelyunrestricted HF. Typically, for small S z the UHF and completely unrestric-ted HF solutions coincide. For S z = 0 both often yield the same solution asthe RHF method for organic molecules in close to their equilibrium geometry. Exercise 1.24. a) Derive the RHF energy expression (1.126) .37) Derive the RHF equations, i.e. the RHF expression for the the Fockoperator and the corresponding potentials (1.128). Exercise 1.25. a) Consider a Slater determinant with one spin up and one spin downorbital φ ( x ) = ψ ( r ) α ( s ) and φ ( x ) = ψ ( r ) β ( s ) . Show that thisdeterminant is only an eigenfunction of ˆ S if ψ ( r ) = ψ ( r ) . Hint: Write the total spin operator as ˆ S = ˆ S · ˆ S = (cid:88) i,j S ( σ i ) · S ( σ j )= (cid:88) i,j (cid:2) ˆ S x ( σ i ) ˆ S x ( σ j ) + ˆ S y ( σ i ) ˆ S y ( σ j ) + ˆ S z ( σ i ) ˆ S z ( σ j ) (cid:3) = (cid:88) i,j (cid:20) (cid:16) ˆ S + ( σ i ) ˆ S − ( σ j ) + ˆ S − ( σ i ) ˆ S + ( σ j ) (cid:17) + ˆ S z ( σ i ) ˆ S z ( σ j ) (cid:21) . b) Show that a Slater determinant is only a singlet state, if the spin-up and spin-down parts span the same volume (determinant). Therestricted solution is a particular realisation of this, since we can makearbitrary unitary transformations between functions building up thedeterminant without affecting its absolute value. Hint: So you need to work out ˆ S | Φ (cid:105) , where Φ is a general determin-ant. First note that one can only have singlet state if the amount ofspin-up and spin-down orbitals are the same. Second, note that the ˆ S does not really care about the spatial part, but only about spin. So itis convenient to group the spin-up and spin-down parts and write thedeterminant in a more abstract manner as | Φ (cid:105) = | ψ α, . . . , ψ N/ α, ψ N/ β, . . . , ψ N β (cid:105) . Now work out ˆ S | Φ (cid:105) with the help of ˆ S = (cid:0) ˆ S + ˆ S − + ˆ S − ˆ S + (cid:1) + ˆ S z anddraw the conclusions.c) Show that the UHF solution for H in the dissociation limit is halfsinglet and half triplet. You can use the UHF dissociation limit foundin exercise 1.17, as it is exact in the infinite basis limit. To do this,first show that Ψ HL = 1 √ (cid:0) χ A ( r ) χ B ( r ) + χ A ( r ) χ B ( r ) (cid:1)(cid:0) α ( σ ) β ( σ ) − α ( σ ) β ( σ ) (cid:1) , (1.130a) Ψ HL = 1 √ (cid:0) χ A ( r ) χ B ( r ) − χ A ( r ) χ B ( r ) (cid:1)(cid:0) α ( σ ) β ( σ ) + α ( σ ) β ( σ ) (cid:1) (1.130b)38re a singlet ( S = 0 ) and a triplet ( S = 1 ) respectively. Then show thatthe UHF solution is a linear combination of these two Heitler–Londonwavefunctions. Exercise 1.26 [Full CI for H in a minimal basis]. It is clear that HFcannot give both a good energy and the correct spin state in the dissoci-ation limit. The remedy is simple: one Slater determinant is apparently notenough. So we need to include more determinants in the description, i.e. todo a small CI.a) Use symmetry to argue that only the ( σ u ) determinant needs to beconsidered in the CI.b) Show that the required matrix elements for the full CI calculation are E u = (cid:104) ( σ u ) | ˆ H | ( σ u ) (cid:105) = 2 (cid:104) σ u | ˆ h | σ u (cid:105) + ( σ u σ u | σ u σ u ) , (1.131a) V = (cid:104) ( σ u ) | ˆ H | ( σ g ) (cid:105) = ( σ g σ u | σ g σ u ) , (1.131b) E g = E RHF = 2 (cid:104) σ g | ˆ h | σ g (cid:105) + ( σ g σ g | σ g σ g ) , (1.131c)where you have already evaluated E g in exercise 1.66.c) Show that these matrix elements can be expressed in the atomic basisas (cid:104) σ u | ˆ h | σ u (cid:105) = (cid:104) A | ˆ h | A (cid:105) − (cid:104) A | ˆ h | B (cid:105) − s , (1.132a) ( σ u σ u | σ u σ u ) = 12(1 − s ) (cid:2) ( AA | AA ) + ( AA | BB ) − AA | AB ) − AB | AB ) (cid:3) , (1.132b) ( σ u σ u | σ u σ u ) M = 12 (cid:2) ( AA | AA ) + ( AA | BB ) (cid:3) , (1.132c) ( σ g σ u | σ g σ u ) = 12(1 − s ) (cid:2) ( AA | AA ) − ( AA | BB ) (cid:3) . (1.132d)d) Setup the full CI secular equations (1.17) with (1.131) and solve forthe energy.e) Construct the lowest eigenvector. It is convenient to express the fullCI ground state solution as | Ψ CI (cid:105) = √ (cid:0) cos( α ) | ( σ g ) (cid:105) + sin( α ) | ( σ u ) (cid:105) (cid:1) (1.133)and to solve for α . This immediately ensures that your ground stateis normalised.For the following computer exercises use the Mulliken approximation againfor the two-electron integrals. 39) Optimise the orbital exponent, ζ , for the full CI ground state (1.133).Plot the orbital exponent as a function of the bound distance. Comparewith your results for RHF and UHF. What do you notice?g) Plot the optimised full CI energy as a function of the internucleardistance and compare to the RHF and UHF energies.h) Plot the CI coefficients (the coefficients of the eigenvector in exer-cise 1.26.e) as a function of the bond distance.The following exercises help you to explain the behaviour of the CI coeffi-cients.i) Write out the RHF determinant for H in the atomic orbital basis(minimal s basis).j) What is wrong with the RHF wavefunction in the dissociation limit?In other words, which terms should not be there or are missing?k) Write out the full CI wavefunction for H in the dissociation limit inthe atomic orbital basis (minimal s basis). How does full CI fix thedissociation limit?l) Argue why the equilibrium bond length of H is predicted too short byHF. Do you expect this trend to persist in other systems? To be able to construct the aufbau solution, it is important to have a finitegap. The gap is defined as the difference between the highest occupiedmolecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO)energies. In Ref. (Bach et al. 1994) it is rigorously shown that the completelyunrestricted HF gap is always finite, i.e. larger than zero, so the aufbausolution always works. Additionally they showed that the aufbau solutionindeed leads to the lowest completely unrestricted HF energy.An other important consequence of the finite gap is that completely un-restricted HF cannot describe metals. Because HF is a simple orbital theory,the conductance is primarily related to the HOMO-LUMO gap. Only whenthe gap closes, we have a metal. HF therefore predicts all material to beinsulators (large gap) or semi-conductors (small gap).Here is the proof, which actually works for any strictly positive definiteinteraction w ( x , x (cid:48) ) . With strictly positive definite we mean (cid:90) d x (cid:90) d x (cid:48) w ( x , x (cid:48) ) | ψ ( x , x (cid:48) ) | > . (1.134)40enote by (cid:15) ≤ (cid:15) ≤ · · · ≤ (cid:15) N the occupied orbital energies of the corres-ponding HF orbitals which constitute Φ HF . Further, we will introduce thenotations h k = (cid:104) φ k | ˆ h | φ k (cid:105) , (1.135a) W k,l = 12 (cid:90) d x (cid:90) d x (cid:48) | φ k ( x ) φ l ( x (cid:48) ) − φ l ( x ) φ k ( x (cid:48) ) | w ( x , x (cid:48) ) . (1.135b)Note that W k,k = 0 and that W k,l = W l,k > for k (cid:54) = l .Now assume that there exists a HF orbital φ N +1 with (cid:15) N +1 ≤ (cid:15) N , so thatthe minimum E HF was obtained with a non aufbau Φ HF . Let ˜Φ be the Slaterdeterminant constructed from φ , . . . , φ N − , φ N +1 . For the total energies wehave (cid:104) Φ HF | ˆ H | Φ HF (cid:105) = N (cid:88) k =1 h k + 12 N (cid:88) k,l =1 W k,l , (1.136a) (cid:104) ˜Φ | ˆ H | ˜Φ (cid:105) = N − (cid:88) k =1 h k + 12 N − (cid:88) k,l =1 W k,l + h N +1 + N − (cid:88) l =1 W l,N +1 . (1.136b)Since the Fock operator is constructed from Φ HF we have (cid:15) k = (cid:104) φ k | ˆ f | φ k (cid:105) = h k + N (cid:88) l =1 W l,k . (1.137)Now we can rewrite the energy of ˜Φ in terms of the energy of Φ HF as (cid:104) ˜Φ | ˆ H | ˜Φ (cid:105) = (cid:104) Φ HF | ˆ H | Φ HF (cid:105) + h N +1 − h N + N − (cid:88) l =1 (cid:0) W l,N +1 − W l,N (cid:1) = (cid:104) Φ HF | ˆ H | Φ HF (cid:105) + (cid:15) N +1 − (cid:15) N − W N,N +1 (1.138) ≤ (cid:104) Φ HF | ˆ H | Φ HF (cid:105) − W N,N +1 , where we used the assumption (cid:15) N +1 ≤ (cid:15) N for the last inequality. However,we end up with a contradiction, since we find that ˜Φ yields a lower energythan Φ HF . The assumption that an occupied HF orbital with a lower energythan the HOMO can exist, is wrong. Exercise 1.27. Check each step in the proof. We will first discuss some properties of basis sets on a more general level.Since quantum mechanics is more concerned about the properties of oper-ators (spectra) than their exact representation, we will push the abstract41ra-ket notation introduced by Dirac (Grassmann 1862; Dirac 1939) a bitfurther. The elements of a Hilbert space are now represented by kets, | φ i (cid:105) .The inner product between two elements of the Hilbert space is now denotedas (cid:104) φ | ψ (cid:105) . Depending on the setting, the precise formula to calculate the in-ner product differs. For example in an N -dimensional vector space we wouldhave (cid:104) v | w (cid:105) = N (cid:88) i =1 v ∗ i w i = v † w . (1.139)The part (cid:104) v | is called the ‘bra’ and includes the complex conjugation. Whenthe bra and ket combine in the proper manner, (cid:104)·|·(cid:105) , their combination im-plies summation to form the inner product. In the case of L ( R ) the innerproduct is implemented as (cid:104) f | g (cid:105) = (cid:90) d r f ∗ ( r ) g ( r ) . (1.140)You see that the inner product is basically the same as in the vector case,except that we sum (integrate) over a continuous index.The most important properties of a basis is completeness. We will firstdiscuss this for the usual vector space, so in the case we have a discreteindex for the components. This means that the unit operator can be exactlyrepresented in the basis. In bra-ket notation this becomes very elegantly ˆ1 = (cid:88) i,j | φ i (cid:105) (cid:0) S − (cid:1) ij (cid:104) φ j | , (1.141)where S ij := (cid:104) φ i | φ j (cid:105) . Its importance comes from the fact that this meansthat every element in the Hilbert space can be represented as a unique linearcombination, since | f (cid:105) = ˆ1 | f (cid:105) = (cid:88) i,j | φ i (cid:105) (cid:0) S − (cid:1) ij (cid:104) φ j | f (cid:105) , (1.142)where (cid:80) j (cid:0) S − (cid:1) ij (cid:104) φ j | f (cid:105) are the expansion (Fourier) coefficients (see exer-cise 1.1). There are two things which basically can go wrong for a basis tobe complete. Either you miss some elements, so you do not cover the wholespace. Or there are too many elements, so you cover some parts of your spacemultiple times. This is called overcompleteness and destroys the uniquenessof the expansion (Fourier) coefficients. In other words, some of your basisstates are linearly dependent.An other convenient property is orthonormality (cid:104) φ i | φ j (cid:105) = S ij = δ ij . (1.143)42ou see that the unit operator in (1.141) becomes diagonal and the expansion(Fourier) coefficients simplify to (cid:104) φ i | f (cid:105) . There are different ways to generateorthonormal basis sets. One way is to diagonalise a hermitian operators,since the eigenstates belonging to different eigenvalues are orthogonal. Inthe the degenerate subspace we can use an other hermitian operator or useone of the following techniques Gramm–Schmidt orthogonalisation Simply follow the following algorithmstep 1: | u (cid:105) = | v (cid:105) , step 2: | u (cid:105) = | v (cid:105) − (cid:104) u | v (cid:105)(cid:104) u | u (cid:105) | u (cid:105) , step 3: | u (cid:105) = | v (cid:105) − (cid:104) u | v (cid:105)(cid:104) u | u (cid:105) | u (cid:105) − (cid:104) u | v (cid:105)(cid:104) u | u (cid:105) | u (cid:105) (1.144)...step n: | u n (cid:105) = | v n (cid:105) − n − (cid:88) i =1 (cid:104) u i | v n (cid:105)(cid:104) u i | u i (cid:105) | u i (cid:105) = (cid:32) ˆ1 − n − (cid:88) i =1 | u i (cid:105)(cid:104) u i |(cid:104) u i | u i (cid:105) | u i (cid:105) (cid:33) | v n (cid:105) . So at each step, you simply project out all the previously found com-ponents with the projector | u i (cid:105)(cid:104) u i | / (cid:107) u i (cid:107) . L¨owdin orthonormalisation which was introduced in exercise 1.17 u i = (cid:88) j | v j (cid:105) (cid:0) S − / (cid:1) ji . (1.145)The inverse square root is defined via the spectral decomposition, so S − / := U † diag (cid:0) s − / (cid:1) U , where U is the unitary matrix which diag-onalises S , i.e. SU = U diag( s ) and s i are its eigenvalues. Cholesky decomposition Any hermitian positive definite matrix can bewritten as S = LL † , where L is a lower triangular matrix and isunique. An orthonormal basis is now readily constructed as | u i (cid:105) = (cid:88) j | v j (cid:105) (cid:0) L −† (cid:1) ji , (1.146)where L −† := ( L − ) † = ( L † ) − . The advantage is that only a Choleskydecomposition needs to be performed which is computationally moreefficient than a full diagonalisation. Because L is lower triangular, thesolution of (1.146) is very fast. Any other decomposition of the form S = QQ † . An additional advantage of an orthonormalisation procedure is that theyautomatically provide a check for linear dependency and even provide aremedy. 43 ramm–Schmidt breaks down if the generated | u n (cid:105) ≈ | (cid:105) and can simplybe skipped. L¨owdin leave out the eigenvalues (close to) zero. To do this, one ratherworks with ˜ S := U † diag (cid:0) s − / (cid:1) instead of S − / . This type of or-thonormalisation is sometimes called canonical orthonormalisation. Cholesky depends on the actual implementation, but the basic strategy isto prevent the diagonal elements of L from becoming small. Exercise 1.28. Check that indeed the L¨owdin, canonical and Choleskymethods yield orthonormal bases. Exercise 1.29. A nice feature of the L¨owdin orthonormalisation is that ityields the smallest perturbation (in L norm) of the original basis whichmakes it orthonormal. In this sense the L¨owdin orthonormalisation is su-perior to any other method. Proof this. That is, show that (cid:107) χ − χT (cid:107) isminimised for T = S − / , under the constraint that (cid:104) χT | χT (cid:105) = .All these definitions are also useful for bases with continuous indices like theposition basis, | r (cid:105) , or the momentum basis, | k (cid:105) , though we need to redefinethem with continuous analoguesdiscrete, | φ i (cid:105) continuous, | r (cid:105) completeness (cid:88) i | φ i (cid:105)(cid:104) φ i | = ˆ1 (cid:90) d r | r (cid:105)(cid:104) r | = ˆ1 (1.147)orthonormality (cid:104) φ i | φ j (cid:105) = δ ij (cid:104) r | r (cid:48) (cid:105) = δ ( r − r (cid:48) ) Basis transformations now work in exactly the same manner as before | φ i (cid:105) = ˆ1 | φ i (cid:105) = (cid:90) d x | r (cid:105)(cid:104) r | φ i (cid:105) = (cid:90) d r | r (cid:105) φ i ( r ) , (1.148a) | r (cid:105) = ˆ1 | r (cid:105) = (cid:88) i | φ i (cid:105)(cid:104) φ i | r (cid:105) = (cid:88) i | φ i (cid:105) φ ∗ i ( r ) . (1.148b)So you see that the orbitals and wavefunctions we have been working withare regarded as expansion (Fourier) coefficients in the abstract bra-ket form-alism. Similarly, for the matrix elements of the operators we have (cid:104) φ i | ˆ O | φ j (cid:105) = (cid:90) d r (cid:90) d r (cid:48) (cid:104) φ i | r (cid:105)(cid:104) r | ˆ O | r (cid:48) (cid:105)(cid:104) r (cid:48) | φ j (cid:105) = (cid:90) d r (cid:90) d r (cid:48) φ ∗ i ( r ) (cid:104) r | ˆ O | r (cid:48) (cid:105) φ j ( r (cid:48) ) . (1.149)The advantage of the position basis is that we typically have a good intuitionhow the matrix elements should be defined, e.g. local potentials (cid:104) r | ˆ v loc | r (cid:48) (cid:105) = v loc ( r , r (cid:48) ) = v loc ( r ) δ ( r − r (cid:48) ) . (1.150)44or the kinetic energy, momentum space is easier, since the momentum andhence also the kinetic energy operator will be multiplicative (cid:104) k | ˆ p | k (cid:48) (cid:105) = k δ ( k − k (cid:48) ) and (cid:104) k | ˆ T | k (cid:48) (cid:105) = | k | δ ( k − k (cid:48) ) . (1.151)Since we know the Fourier coefficients, (cid:104) r | k (cid:105) = (2 π ) − d / e i k · r , where d isthe dimensionality of the space, we can transform these matrix elements toposition space and find (cid:104) r | ˆ p | r (cid:48) (cid:105) = i ∇ r (cid:48) δ ( r − r (cid:48) ) = − i ∇ r δ ( r − r (cid:48) ) , (1.152a) (cid:104) r | ˆ T | r (cid:48) (cid:105) = − ∇ r (cid:48) δ ( r − r (cid:48) ) = 12 ∇ r · ∇ r (cid:48) δ ( r − r (cid:48) ) . (1.152b) Exercise 1.30. Derive the matrix elements for the momentum and kineticenergy operator in the position basis (1.152) starting from their matrix ele-ments in the momentum basis (1.151). You need to use that δ ( r − r (cid:48) ) = 1(2 π ) d (cid:90) d k e i k · ( r − r (cid:48) ) . (1.153) In practice we need to work with a finite basis, so we are always infinitelyfar from a complete basis. However, the basis does not need to be able torepresent any state, but only the ground state or some low excited state. Weexpect the ground state of a molecule or solid to be small distortions of theground states of the atoms. So it is natural to start from an atomic basisset.Since the kinetic energy dominates in the outer region of the molecularSchr¨odinger equation, its bound states need to decay exponentially as e − αr .As the Coulomb potential becomes infinite at the nuclei, the solutions eitherneed a cusp to compensate with an infinite kinetic energy ( s -orbitals) orthey need to be zero at the nuclei, i.e. have a nodal surface ( p , d , etc.). Forexample, when we consider the complete set of hydrogenic solutions we have bounded ψ nlm ( r, θ, φ ) = N nl e − ρ/ ρ l L l +1 n − l − ( ρ ) Y ml ( θ, φ ) , (1.154)where• ρ = 2 Zr/n ,• N nl is a normalisation constant,• L αn ( x ) is a generalized Laguerre polynomial,• Y ml ( θ, φ ) is a spherical harmonic,45 n = 1 , , . . . and l = 0 , , . . . , n − and m = − l, − l + 1 , . . . , l . unbounded The unbounded/ionised states are often forgotten/neglected inthe treatment of the hydrogen atom, but they are also part of the spec-trum. The unbounded solutions are the ones with positive energies, soone needs to solve (cid:18) − ∇ + Zr (cid:19) φ k ( r ) = k φ k ( r ) , (1.155)where k / | k | / is the kinetic energy of the electron far awayform the nucleus. The radial part of the solutions are the Coulombwavefunctions F ηl ( ρ ) = ˜ N ηl ρ l +1 e ∓ i ρ M( l + 1 ∓ i η, l + 2 , ± i ρ ) , (1.156)where• η = Z/k ∈ [0 , ∞ ) is a continuous quantum number, (continuousequivalent of n ).• ˜ N ηl is a normalisation constant,• M( a, b, z ) confluent hypergeometric function (further generalisa-tion of the Laguerre functions).So the Coulomb wavefunctions are similar to the radial part of thebound states, R nl ( ρ ) . The full solutions are now obtained by glueingthe Coulomb wavefunctions to the spherical harmonics ˜ ψ ηlm ( r, θ, φ ) = F ηl ( ρ ) Y ml ( θ, ρ ) . (1.157)The completeness relation for the hydrogenic solutions is therefore ˆ1 = (cid:88) n,l,m | ψ nlm (cid:105)(cid:104) ψ nlm | (cid:124) (cid:123)(cid:122) (cid:125) discrete part + (cid:90) ∞ d η (cid:88) l,m | ˜ ψ ηlm (cid:105)(cid:104) ˜ ψ ηlm | (cid:124) (cid:123)(cid:122) (cid:125) continuous part . (1.158)When people talk about Slater type orbitals (STOs), they only mean thesolutions which decay exponentially. Since they only form the discrete partof the spectrum, the STO set is not complete. The main problem of the STO basis set is not its incompleteness, but to eval-uation of the 3 and 4 centre integrals. With 3/4 centre integrals we meantwo-electron integrals where the different STOs are located at 3/4 atoms.46n 1950 Boys therefore proposed to use Gaussian type orbitals (GTOs) in-stead (Boys 1950). Because the product of two Gaussians is a new Gaussianlocated between the two original Gaussians, all 3/4 centre integrals reduceto 2 centre integrals Exercise 1.31. Show for two Gaussians centred at A and B e − αr A e − βr B = e − µR AB e − pr P , (1.159)where r A = r − A , r A = | r A | and idem for r B and r P . Further, p = α + β, P = α A + β B p ,µ = αβp , R AB = | A − B | . Now you might be concerned how to deal with a Gaussian 2 centre integral.We will not consider this in detail, but the basic trick is to express also theCoulomb interaction as a Gaussian integral r C = 1 √ π (cid:90) ∞−∞ d t e − r C t . (1.160)The Gaussian product rule can then be used to reduce all Coulomb integralsto one special function, the Boys function F n ( x ) = (cid:90) d t t n e − xt , (1.161)which can be integrated numerically/fitted/tabulated.An additional formal advantage of the GTOs is that this set is complete.Since Gaussians are the eigenfunctions of the harmonic oscillator, the elec-trons cannot escape to infinity. So there are no ionisation/continuum statesto worry about.The disadvantages are obvious from the discussion on the STOs• too fast decay ( e − αr instead of e − αr ),• no cusp at the nuclei.Therefore, one needs typically more GTOs than STOs to get the same ac-curacy. The important points are• contractions 47 split valence• polarization functions. They are important to give directions to bonds.For example a basis with only s & p functions predicts ammonia to beplanar. With d functions it gets the correct umbrella shape.• The counterpoise correction is not always the right thing to do. A morecomplete error analysis shows that the problem is more delicate. Thereis an additional error of opposite sign due to the basis incompletenessin describing the chemical bond. So do not take this blindly. A moredetailed account can be found in Ref. (Sheng et al. 2011).48 hapter 2 Density Functional Theory Prologue These lecture notes provide a concise introduction to density functional the-ory (DFT). As with most theories, the historical developments are not in alogical order, so the lecture notes do not follow the historical time-path ofthe development of DFT to give a more logical presentation. The classical approach to quantum mechanics is to solve for the wavefunc-tion, Ψ , via the Schr¨odinger equation. However, in general we are onlyinterested in a reduced quantity in the sense of the amount of informationit contains. Examples are:• the density, ρ ( r ) • the energy, E ••• Exercise 2.1. Add some more observables of interest.It would be convenient to calculate these reduced quantities directly, insteadof having to calculate the full Ψ first. Important reduced quantities fromwhich a lot of other reduced quantities can be calculated are• the density, ρ ( r ) • the one-body reduced density matrix (1RDM), γ ( r , r (cid:48) ) 49 the pair-density, P ( r , r ) The density is a well-known quantity, but the other two reduced quantitiesmight be less familiar. We will define them shortly. These quantities can beused to calculate all the individual components of the total energy E = T [ γ ] + V [ ρ ] + W [ P ] . (2.1)As an example consider the kinetic energy, T . We will use x := r σ as acombined space-spin coordinate and the integration over x implies also thesummation over the spin-variable (cid:90) d x := (cid:88) σ (cid:90) d r . (2.2)The kinetic energy can now be worked out as T = 1 N ! (cid:90) d x · · · d x N Ψ ∗ ( x , x , . . . , x N ) N (cid:88) i =1 − ∇ i Ψ( x , x , . . . , x N )= − N ! (cid:90) d x · · · d x N Ψ ∗ ( x , x , . . . , x N ) ∇ r Ψ( x , x , . . . , x N ) + · · · + − N ! (cid:90) d x · · · d x N Ψ ∗ ( x , x , . . . , x N ) ∇ r N Ψ( x , x , . . . , x N )= − N N ! (cid:90) d x · · · d x N Ψ ∗ ( x , x , . . . , x N ) ∇ r Ψ( x , x , . . . , x N )= − (cid:90) d x (cid:2) ∇ r γ ( x , x (cid:48) ) (cid:3) x (cid:48) = x , (2.3)where x := r σ is the combined space-spin coordinate and we used the per-mutational symmetry of the wave function. The one-body reduced densitymatrix (1RDM) in the last line appears naturally, which is defined as γ ( x , x (cid:48) ) := 1( N − (cid:90) d x · · · d x N Ψ( x , x , . . . , x N ) × Ψ ∗ ( x (cid:48) , x , . . . , x N ) . (2.4)Since the kinetic energy operator does not depend on spin, we can alsointegrate out (sum over) the remaining spin-degree of freedom, which givesthe spin-integrated 1RDM γ ( r , r (cid:48) ) := (cid:88) σ γ ( r σ, r (cid:48) σ ) . (2.5)The kinetic energy can now be calculated directly from the 1RDM as T [ γ ] = − (cid:90) d r (cid:2) ∇ r γ ( r , r (cid:48) ) (cid:3) r (cid:48) = r = 12 (cid:90) d r (cid:2) ∇ r · ∇ r (cid:48) γ ( r , r (cid:48) ) (cid:3) r (cid:48) = r , (2.6)50here in the last step we used partial integration. The spin-density and thespin-pair-density can be expressed in terms of the wavefunction respectivelyas ρ ( x ) := 1( N − (cid:90) d x · · · x N | Ψ( x , x , . . . , x N ) | , (2.7) P ( x , x ) := 1( N − (cid:90) d x · · · x N | Ψ( x , x , . . . , x N ) | . (2.8) Exercise 2.2. Show that the other two components of the total energy canbe written as V [ ρ ] = (cid:90) d r v ( r ) ρ ( r ) , (2.9) W [ P ] = 12 (cid:90) d r (cid:90) d r w ( | r − r | ) P ( r , r ) , (2.10)where v ( r ) is a local (external) potential, such as the Coulomb interactionwith the nuclei in a molecule and w ( | r − r | ) is the interaction, which willbe the Coulomb interaction / | r − r | for non-relativistic electrons.The two-body reduced density matrix (2RDM) is defined in a similar fashionas the 1RDM Γ( x x , x (cid:48) x (cid:48) ) := 1( N − (cid:90) d x · · · d x N Ψ( x , x , x , . . . , x N ) × Ψ ∗ ( x (cid:48) , x (cid:48) , x , . . . , x N ) . (2.11) Exercise 2.3. Check that P ( x , x ) = Γ( x x , x x ) , (2.12) γ ( x , x (cid:48) ) = 1 N − (cid:90) d x Γ( x x , x x (cid:48) ) , (2.13) ρ ( x ) = γ ( x , x ) = 1 N − (cid:90) d x (cid:48) Γ( xx (cid:48) , x (cid:48) x ) . (2.14)Since we can calculate ρ , γ and P from the 2RDM, we actually only need the2RDM to calculate the total energy and not the full many-body wavefunction Ψ (L¨owdin 1955; Mayer 1955), which is just a 4-point function. We say thatthe total energy is a functional of the 2RDM, [Γ] .Using the variational principle, we can try to find the ground state energyby minimising the energy functional over all 4-point functions, Ξ( x x , x (cid:48) x (cid:48) ) one can think of. It turns out, however, that this minimum does not exist,since the functional is not bounded from below inf Ξ E [Ξ] = −∞ . (2.15)51he problem is that we cannot freely vary over every possible 4-point func-tion that we can imagine. We also should guarantee that there exists awavefunction that actually corresponds to this 4-point function, so that itis an actual 2RDM. Only when we guarantee that the 4-point functions aretrue 2RDMs, we can invoke the variational principle to argue that E [Γ] willbe bounded from below by the true ground state energy.Otherwise, we cannot invoke the variational principle to argue that E [Γ] will be bounded from below by the true ground state energy. So if we do notenforce that Ξ is a proper 2RDM, the calculated energy will be lower thanthe actual ground state energy (Tredgold 1957; Mizuno and Izuyama 1957;Ayres 1958; Bopp 1959; Coleman 1963). A 2RDM which can be generatedby a wavefunction via (2.11) is called an N -representable 2RDM. Limitingour search over only N -representable 2RDMs we actually have E gs = min N -representable Γ E [Γ] . (2.16)Unfortunately, this is not a practical solution to determine the ground stateenergy. It turns out that it is very hard to tell for a given 2RDM if itis N -representable or not (Coulson 1960; Klyachko 2006). Some necessaryconditions N -representability conditions are known, but not all of them.Probably, imposing all N -representability conditions is equally or even moredifficult than solving the Schr¨odinger equation itself. There are some effortsto impose only some of the N -representability conditions and to hope for agood energy, although there is no proof that the energy would not collapseto −∞ . By imposing more and more N -representability conditions the trueground state energy is approached from below. In this sense this strategy iscomplementary to CI.That the minimum values is now determined by constraints rather thanthe functional itself, becomes quite obvious when we inspect the functional E [Γ] more closely E [ γ ] = Tr (cid:8) ˆ L ˆΓ (cid:9) = (cid:90) d x (cid:90) d x (cid:0) ˆ L Γ (cid:1) (Γ( x x , x x ) , (2.17)where the operator ˆ L can be defined in different ways. The most symmetricmanner is ˆ L = 12( N − (cid:18) − (cid:0) ∇ r + ∇ r (cid:1) + v ( r ) + v ( r ) (cid:19) + 1 | r − r | . (2.18)The key point here is that the functional is just linear in the 2RDM, whichmeans that the functional itself is just straight hyper plane, and the operator ˆ L is the normal of this hyper plane in some advanced mathematical sense.Since the functional is just a plane, you can just keep on sliding down till youfinally hit a boundary. This is illustrated in Fig. 2.1 using a low dimensionalrepresentation of the 2RDM. 52 E [Γ]Γ ˆ L Γ E [Γ]Γ ˆ L Figure 2.1: An artistic impression of the 2RDM optimization. In blue, the plane issketched given by the normal vector ˆ L and its boundaries are set by the constraints.In the left figure, the external potential in ˆ L is set such that there is a uniqueminimum, dictated by the boundaries. The active ones are shown in red. In theright figure, a different external potential leads to a different ˆ L , i.e. orientationof the plane. In this case, only one constraint is active and we have a degenerateminimum, i.e. a set of degenerate ground state 2RDMs. The N -representability conditions for the density and the 1RDM (Cole-man 1963) are actually known and quite simple, so one can hope that thefunctionals E [ ρ ] and E [ γ ] might exist. They will definitely be more complic-ated than E [Γ] , but it might be possible to find some good approximations toparts of the energy which are not readily expressible in terms of the densityor 1RDM respectively. Exercise 2.4. Some of the N -representability conditions for the 2RDM areactually quite easy to derive directly from its definition (2.11). Find these N -representability conditions for Γ( x x , x (cid:48) x (cid:48) ) . You can find four differentkind of conditions in this manner. Consider the permutation symmetry ofthe wavefunction (2 conditions) and complex conjugation (1 condition). Alsoconsider the ‘diagonal’, x i = x (cid:48) i . This should give you a positivity condition(an inequality). The existence of the functional E [ ρ ] has been proved by Hohenberg and Kohnand additionally that the potential generating the ground state density isunique up to a constant shift (Hohenberg and Kohn 1964). These theoremsform therefore the basis of DFT and will be considered in detail. Firstconsider the composite mapping v ( r ) ˆ H Ψ= E Ψ (cid:55)−−−−−→ Ψ (cid:82) d r (cid:55)−−→ ρ ( r ) . (2.19)53ow we ask ourselves the question if these maps are invertible. If this is thecase, then we can always go back to the potential and reconstruct everythingwe need to know. The proof consists of two parts, showing the invertibility ofeach map separately. For simplicity we only consider non-degenerate groundstates, though the results can be generalised to degenerate ground stateswithout too much difficulty (Kohn 1985; Dreizler and Gross 1990). You areasked to do this yourself in Exercise 2.6. Theorem (HK-I) . The map from local potentials to ground states, v (cid:55)→ Ψ is invertible modulo a constant (shift) in the potential.Proof. Suppose that there are two potential v and v which both yield thesame ground state Ψ , then from the Schr¨odinger equation we have (cid:0) ˆ T + ˆ V + ˆ W (cid:1) Ψ = E Ψ , (2.20a) (cid:0) ˆ T + ˆ V + ˆ W (cid:1) Ψ = E Ψ . (2.20b)Subtracting both equations from each other, we find ( E − E )Ψ = (cid:0) ˆ V − ˆ V (cid:1) Ψ = N (cid:88) i =1 (cid:0) v ( r i ) − v ( r i ) (cid:1) Ψ . (2.21)Under the assumption that Ψ does not vanish on any finite region in space, we can divide by Ψ and obtain N (cid:88) i =1 (cid:0) v ( r i ) − v ( r i ) (cid:1) = E − E = constant . (2.22)So we find that potentials which yield the same ground state, that they canonly differ by a constant. Theorem (HK-II) . The map from non-degenerate ground states, generatedby local potentials, to ground state densities, Ψ (cid:55)→ ρ is invertible.Proof. The proof goes by reductio ad absurdum. Suppose that the statementis not true, so that there exist two different non-degenerate ground statewavefunctions, Ψ (cid:54) = Ψ , that both yield the same ground state density ρ .Then using the variational principle, we have E = (cid:104) Ψ | ˆ T + ˆ V + ˆ W | Ψ (cid:105) < (cid:104) Ψ | ˆ T + ˆ V + ˆ W | Ψ (cid:105) = (cid:104) Ψ | ˆ T + ˆ V + ˆ W | Ψ (cid:105) + (cid:104) Ψ | ˆ V − ˆ V | Ψ (cid:105) = E + (cid:90) d r ρ ( r ) (cid:0) v ( r ) − v ( r ) (cid:1) . (2.23) There is strong evidence that this is the case for Coulomb systems, though very hardto proof (Lieb 1983). E < E + (cid:90) d r ρ ( r ) (cid:0) v ( r ) − v ( r ) (cid:1) . (2.24)Adding both inequalities we find E + E < E + E (2.25)and therefore, our initial assumption that both ground states Ψ and Ψ canyield the same density is incorrect.The combination of the Hohenberg–Kohn (HK) theorems (HK-I and HK-II)forms the foundation of density functional theory (DFT), since they showthat v ( r ) HK-I ←−−− (cid:112) Ψ HK-II ←−−− (cid:112) ρ ( r ) , (2.26)so that we are allowed to write v [ ρ ] as well as Ψ[ ρ ] . Since the ground statewave function is a functional of the density, Ψ[ ρ ] , also every observable is afunctional of the density O [ ρ ] = (cid:104) Ψ[ ρ ] | ˆ O | Ψ[ ρ ] (cid:105) . (2.27)In particular the ground state energy is a functional of the density E [ ρ ] = (cid:104) Ψ[ ρ ] | ˆ H | Ψ[ ρ ] (cid:105) = F HK [ ρ ] + (cid:90) d r ρ ( r ) v ( r ) , (2.28)where F HK [ ρ ] := (cid:104) Ψ[ ρ ] | ˆ T + ˆ W | Ψ[ ρ ] (cid:105) is the HK functional and simply collectsthe parts of the energy which are not explicit density functionals. The HKfunctional is often called a ‘universal’ functional, because it does not dependon the particular system considered. The system (the positions of the nucleiand local external fields) only enters via the local potential v . Exercise 2.5. Try to proof the Hohenberg–Kohn theorems in the same man-ner for the 1RDM (2.4) and non-local one-body potentials. Non-local one-body potentials are similar to the exchange potential in the sense that theyact via an integral kernel on one-body states ˆ vψ ( x ) = (cid:0) ˆ vψ (cid:1) ( x ) = (cid:90) d x (cid:48) v ( x , x (cid:48) ) ψ ( x (cid:48) ) . (2.29)On a many-body state these non-local potentials act on each coordinate indi-vidually. So for a many-body wave function the non-local one-body potentialbecomes ˆ V = N (cid:88) i =1 ˆ v ( x i , x (cid:48) i ) . (2.30)55ue to the one-body nature, the expectation value reduces to a contractionwith the 1RDM (cid:104) Ψ | ˆ V | Ψ (cid:105) = (cid:90) d x (cid:90) d x (cid:48) v ( x (cid:48) , x ) γ ( x , x (cid:48) ) . (2.31)Consider now the mappings ˆ v (cid:55)→ Ψ (cid:55)→ γ . Investigate whether you can reusethe proofs for HK-I and HK-II that we used to establish DFT. If you can notreuse the proofs of the HK theorems to establish a 1RDM functional theory,why does it not work? Exercise 2.6. In this exercise we consider in which sense the Hohenberg–Kohn theorems can be generalised to degenerate ground states.a) What changes in HK-I if we allow for degenerate ground states?b) Is it possible to generalise HK-II to degenerate states? Consider thetwo different cases separately: the non-degenerate case ( E (cid:54) = E ) andthe degenerate case ( E = E ).c) Is it still possible to establish DFT when allowing for degenerate states?In particular, are we still allowed to write E [ ρ ] ? Exercise 2.7. Show E [ ρ gs ] ≤ E [ ρ ] , where ρ gs is the ground state density. Though we have shown that the functional E [ ρ ] exists on a formal level, wedo not have an explicit form for practical calculations. The HK functional, F HK [ ρ ] , only provides a very abstract expression for the universal functional.Here we will construct a more explicit expression for the universal functional,which has better mathematical properties and serves as a more convenientstarting point to derive approximations. In fact, we will always want toresort to approximations to make DFT useful, as an exact functional shouldalways be too complicated (more complicated than the Schr¨odinger equationprobably) to use in practice.The mathematical motivation to construct a different expression for theuniversal functional is that the HK functional F HK [ ρ ] is only defined for v -representable densities. With v -representable densities we mean densitieswhich can be generated by a ground state , so that Ψ[ ρ ] in HK constructionexists. This is important when we want to minimise the energy by makingvariations in the density. We should never hit a density for which F HK [ ρ ] doesnot exists, since then we would get stuck in our optimisation attempt. Onewould expect that all reasonably well-behaved densities (positive, smoothand normalisable) are v -representable, but this turns out to be not the caseunfortunately (Englisch and Englisch 1983).56 RON J. COHEN AND PAULA MORI-S ´ANCHEZ PHYSICAL REVIEW A , 042511 (2016) 0 1 2 -1 0 1 0 0.5 1 1.5 E ne r g y F Levy + γ .vE FCIv γ γ E ne r g y 0 1 2 -1 0 1 0 0.5 1 1.5 γ .v γ γ 0 1 2 γ -1 0 1 γ γ -1 0 1 γ E ne r g y Levy [ γ ](a) (b)(c) (d) ,Missing regions of γ from FCI caclulationsE vFCI - γ .v γ γ Levy [ γ ](a) (b)(c) (d) ,Missing regions of γ from FCI caclulations γ γ FIG. 1. Energy landscape of the exact functional (a) F Levy [ γ ] for all allowable density matrices of the two-site Hubbard model. (b) The oneelectron term γ · v for t = . "ϵ = . 9, which is purely a flat plane. (c) Illustration of the minimization of the exact functional addingon the same γ · v term to give the FCI energy and density matrix { E FCI v , γ FCI v } . (d) F Levy [ γ ] and 6552 points of { F HK [ γ FCI v ] , γ FCI v } that show the E FCI v subtracting the one electron term [Eq. (11)] at γ FCI v for many different v . ϵ − ϵ and the ratio U/t , therefore, in this work U is fixed at 1and t and "ϵ are the chosen variables. The kinetic and on-sitepotential part of the Hamiltonian, which together we denoteas v , is a real symmetric 2 × t and "ϵ , v = ! "ϵ / − t − t − "ϵ / " (8)and the 2 × γ ij = σ ⟨ % | c † i σ c j σ | % ⟩ is γ = ! γ γ γ ∗ (2 − γ ) " (9)leading to a total energy for real density matrices E v = − γ t + γ "ϵ / − (2 − γ ) "ϵ / + F [ γ ] . (10)The exact functional can be obtained and understood fromdifferent perspectives. First, for any γ that comes froman exact diagonalization full configuration interaction (FCI)calculation with one-electron Hamiltonian v , the Hohenberg-Kohn functional is given by F HK [ γ v ] = E FCI v + γ t − γ "ϵ / + (2 − γ ) "ϵ / . (11)The second way is the constrained search over real singletwave functions % = a √ A ( φ αφ β ) + A ( φ αφ β )] + b A ( φ αφ β ) + c A ( φ αφ β ) , (12) which can be simplified to an expression (see Refs. [10,12]and Supplemental Material (SM) for more details [17]) F Levy [ γ ] = γ $ − % − γ − [ γ − & + γ − $ γ + [ γ − & . (13)Third, it can be viewed as the exact functional in density matrixfunctional theory for two electrons. From the work of L ¨owdinand Shull in 1956 [18] using the natural orbitals | a ⟩ and | b ⟩ ( | p ⟩ = i = , C pi ˆ c † i | vac ⟩ ) and their occupation numbers n a and n b that diagonalize γ , it can be derived that F LS [ γ ] = n a ⟨ aa | aa ⟩ + n b ⟨ bb | bb ⟩ − √ n a n b ⟨ aa | bb ⟩ , (14)where the two-electron integral is ⟨ pp | qq ⟩ = U i = , C pi C qi . This gives exact agreement with theconstrained search expression Eq. (13) and has beenutilized in functionals such as the AGP natural orbitalfunctional [19,20] and PNOF5 [21] (see SM [17]). Thereare two further possible routes to the exact functional(details in the SM [17]): the extension over pure-statewave functions to complex, and the Lieb maximization [3] F Lieb [ γ ] = sup v { E v − γ · v } . F Levy [ γ ] is shown in Fig. 1(a) for the allowable densitymatrices ( γ − + γ ! 1. It is represented as a uniquesurface of hills and a valley in a bowl type shape, with achannel through the center (at γ = 1) and hills on both sides Figure 2.2: Taken from Cohen and Mori-S´anchez 2016 without permission. The3D lanscape is the constrained-search universal 1RDM functional as a function ofthe two independent 1RDM components for a two-orbital Hubbard system. Theprojection indicates the regions where the 1RDM is v -representable, i.e. the 1RDMsfor which there exists a (non-local) potential generating this 1RDM via the groundstate. In the two red oval encircled regions [Missing regions of γ from FCI caclula-tions], the 1RDM is not v -representable. An example is shown in Fig. 2.2 taken from Ref. (Cohen and Mori-S´anchez 2016). For technical reasons, this is not the DFT functional, butthe 1RDM functional. In the xy -plane, the region of 1RDMs is plotted forwhich a (non-local) potential could be found which generates that particu-lar 1RDM. In the two red encircled oval regions, no such potential can befound, so the 1RDM is not v -representable there and the HK functional doesnot exist in those regions. It nicely demonstrates that the topology of thedomain of the HK functional can be quite nasty.To avoid this problem, the domain of the universal functional was exten-ded by Levy (Levy 1979). He observed that the search for the ground stateenergy can be split as E gs = min Ψ (cid:104) Ψ | ˆ T + ˆ V + ˆ W | Ψ (cid:105) = min ρ (cid:18) min Ψ → ρ (cid:104) Ψ | ˆ T + ˆ W | Ψ (cid:105) + (cid:90) d r ρ ( r ) v ( r ) (cid:19) . (2.32)So the only thing we did is to write the minimisation over all wave functionsinto two parts. First we vary over all densities and inside these densityvariations we consider all wave functions which yield this density, Ψ → ρ .57he universal functional is now readily extracted as F LL [ ρ ] := min Ψ → ρ (cid:104) Ψ | ˆ T + ˆ W | Ψ (cid:105) . (2.33)The advantage is now that the domain of this functional consists of all N -representable densities, i.e. densities that can be produced by some wavefunction. In fact all reasonable densities (positive, smooth and normalis-able) are N -representable (Harriman 1981). Since the characterization of N -representable densities is so much easier than for v -representable densit-ies, making proper variations also becomes easier.The v -representability of the density is still an issue when taking a func-tional derivative of F LL with respect to the density and therefore remains atopic of research (Lieb 1983; Englisch and Englisch 1984a,b; Leeuwen 2003;Eschrig 2003; Lammert 2006b,a, 2010; Kvaal et al. 2014). One needs tomention that v -representability is only a problem in the continuum formula-tion of DFT. On a lattice it has been proved that reasonably well-behaveddensity are v -representable (Chayes, Chayes and Ruskai 1985). For a morethorough introduction to these technical matters, consult the excellent text-book by Dreizler and Gross (Dreizler and Gross 1990) or the review by VanLeeuwen (Leeuwen 2003).As mentioned before, this formulation of the exact functional is not onlyuseful from a theoretical point of view, but can also be used as a startingpoint to formulate different types of approximations. Most well-known isthe Kohn–Sham (KS) approximation to the kinetic energy, which will bediscussed in Sec. 2.3.2. The constrained-search formulation can also be usedto approximate the interaction part, though this is quite involved (Lieb 1983;Seidl, Perdew and Levy 1999; Seidl 1999; Seidl, Gori-Giorgi and Savin 2007;Gori-Giorgi, Vignale and Seidl 2009). For the interaction part, we will onlyconsider the traditional route.Since the interaction term is mainly electrostatic, a reasonable startingpoint is the classical Coulomb term W H [ ρ ] := 12 (cid:90) d r (cid:90) d r ρ ( r ) ρ ( r ) | r − r | , (2.34)This term is commonly called the Hartree term, because Hartree only tookthis classical interaction term into account (Hartree 1928). We will considerrefinements to the Hartree term, the exchange-correlation (xc) term, W xc ,later. First we will turn our attention to the kinetic energy, since it is not soeasy to find a reasonable approximation for the kinetic energy which capturesthe important quantum effects. Probably the most important reason not to call this term the Coulomb term is thatCoulomb starts with a ‘c’ which is already in use as an abbreviation for correlation (seeHartree–Fock and later in these lecture notes). xercise 2.8. In 1RDM functional theory, the kinetic energy is an explicitfunctional of the 1RDM, cf. (2.3), so only the interaction part of the energyhas no explicit form. Use the construction by Levy to write down the two-body term of the energy, W (2.10), as an exact functional of the 1RDM. The most important part to approximate reasonably well is the kinetic en-ergy term, because this term is fundamentally different from its classicalcounterpart. The interactions with the nuclei and between the electrons isstill simply the Coulomb interaction in non-relativistic quantum mechanicsand crude approximations often suffice. Already in the early days of quantum mechanics, physicists tried to simplifyquantum mechanics to make calculations with pen and paper more feasible. The first DFT approximations, therefore, date back long ago before the HKtheorems were formulated and were even formulated before HF. The oldestapproximation to the kinetic energy in terms of the density alone is due toThomas (Thomas 1927) and Fermi (Fermi 1927). Around 1926 they derivedindependently the following approximation for the kinetic energy T TF [ ρ ] := C TF (cid:90) d r ρ ( r ) / , C TF := 310 (cid:0) π (cid:1) / . (2.35)This expression might look very strange, but it is simply the kinetic energy ofa non-interacting homogeneous electron gas (HEG), where the homogeneousdensity has been replaced by the inhomogeneous density, ρ → ρ ( r ) . TheHEG is a fictitious system with a constant electron density in an infinite box.The ‘nuclei’ are also smeared out as a homogeneous background charge overthe whole space, to counteract the Coulomb repulsion between the electrons.so the HEG is also sometimes called the Jellium model, as the smeared outnuclei resemble a jelly. The HEG provides a reasonable model for the electrondelocalization in metals and qualitatively reproduces features of real metals,e.g. plasmons and Wigner crystallization.Combining the Thomas–Fermi kinetic energy with the Hartree approx-imation to the interaction term, we have a very simply approximation to theenergy functional E [ ρ ] ≈ E TF [ ρ ] := T TF [ ρ ] + V [ ρ ] + W H [ ρ ] . (2.36) The computer did not exist at that time. Programmable computers were only de-veloped in during the 2 nd world war mainly to break the Enigma code used by the Nazis.The computers around that time were on par with human speed to perform basic mathem-atical operations (multiplication, addition, etc.). The main advantage was that computersdo not tire and do not need coffee breaks and sleep (Feynman 2005). E TF [ ρ ] when solved self-consistently are• too low total energy for atoms (54% for hydrogen),• density decays as r − instead of e − αr ,• no shell structure in atoms,• all negative ions are predicted to be unstable,• molecules do not exist (Teller non-binding theorem (Teller 1962)).The main term to blame for its bad performance is the Thomas–Fermi ap-proximation to the kinetic energy. This statement can be validated by com-paring with Hartree calculations (HF without exchange) (Hartree 1928). Exercise 2.9. Derive the Thomas–Fermi approximation for the kinetic en-ergy by deriving the kinetic energy for the HEG. If you have never seencalculations on the HEG, this can be quite a challenge, so here are somesteps to help you out.a) The first step is to solve the non-interacting Schr¨odinger equation forthe electrons in a finite box with sides of length L and periodic bound-ary conditions. The one-electron solutions are ψ k ( r ) = Ω − / e i k · r ,where Ω = L is the volume and the wave vectors, k , are quantized as k i = 2 πn i L i = x, y, z n i = 0 , ± , ± , . . . (2.37)with energies (cid:15) ( k ) = | k | . Since I have already given you the solu-tions, you only need to check whether they are correct.b) Check that the density is constant.c) Calculate the total number of electrons in the limit of a large box.Assume that the system is spin compensated, so there is an equalamount of spin up and spin down electrons. In the limit of a large box,you can replace the summation over the states by integrals, so (cid:88) n x ,n y ,n z = Ω(2 π ) (cid:88) k x ,k y ,k z L →∞ −−−−→ Ω(2 π ) (cid:90) d k . (2.38)In the first equality we compensated for the fact that the intervalbetween k -points is ∆ k i = 2 π/L instead of unity, so the volume inthe second summation increased by (2 π/L ) which is compensated forby the prefactor. For large volumes, L → ∞ , we replaced the sum byan integral, because ∆ k i = k i +1 − k i → , cf. (2.37). Note that we still60etained the volume, Ω , in front of the integral. All expectation valuesof size-extensive operators diverge due to this volume term. We willtherefore retain this volume factor and divide by it on both sides, toobtain a finite expectation value for density-like (size-intensive) quant-ities.Further note that the orbital energies behave as (cid:15) ( k ) = | k | , so thelowest (so occupied) states will have k -vectors contained in a spherewith some radius k F . Show that given a density ρ = N/ Ω , the length ofthe maximum occupied wave vector, the Fermi wave vector, is relatedto the density as k F := (3 π ρ ) / .d) Calculate the kinetic energy density τ = T / Ω in the limit of a largebox.e) Use the calculated total kinetic energy to determine the kinetic energydensity of the HEG. To derive the Thomas–Fermi approximation tothe kinetic energy (2.35) from the HEG kinetic energy density, simplyuse the kinetic energy density expression of the HEG at each point inspace and integrate to get the total kinetic energy. The Kohn–Sham (KS) system is a system composed of non-interacting particleswith a prescribed density (Kohn and Sham 1965). The energy functional forthe KS system is simply the functional F LL (2.33) without the interactionterm, so only the kinetic energy is left T s [ ρ ] := min Φ → ρ (cid:104) Φ | ˆ T | Φ (cid:105) . (2.39)This functional is simple enough to allow us to actually perform the min-imisation. Since only a one-body part is present, we suspect that the wave-function Φ which achieves this minimisation will be a Slater determinantcomposed of single-particle orbitals Φ( x , . . . , x N ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) φ ( x ) φ ( x ) . . . φ N ( x ) φ ( x ) φ ( x ) . . . φ N ( x ) ... ... . . . ... φ ( x N ) φ ( x N ) . . . φ N ( x N ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . (2.40) Since we do now have the constraint that Φ should yield a prescribed density ρ , weare not certain anymore that the minimiser will be a Slater determinant. Nevertheless, itseems that the densities typically under consideration allow for a Slater determinant asminimiser, though exceptions are known (Levy 1982; Schipper, Gritsenko and Baerends1998a). γ s ( x , x (cid:48) ) = N (cid:88) i =1 φ i ( x ) φ ∗ i ( x (cid:48) ) . (2.41)So the kinetic energy can be expressed in terms of the orbitals as ˜ T s [ { φ, φ ∗ } ] = − (cid:90) d x (cid:2) ∇ r γ s ( x , x (cid:48) ) (cid:3) x (cid:48) = x = − N (cid:88) i =1 (cid:104) φ i |∇ | φ i (cid:105) , (2.42)where [ { φ, φ ∗ } ] means that ˜ T s is a functional depending on the orbitals con-stituting the Slater determinant and their complex conjugates. To calculatethe functional T s [ ρ ] , we need to minimise ˜ T s not only under the constraintthat the orbitals { φ } are orthonormal, but also under the constraint of aprescribed density, so we introduce the following Lagrangian L ρ [ { φ, φ ∗ } , v s , (cid:15) ] := ˜ T s [ { φ, φ ∗ } ] − (cid:88) ij (cid:15) ji (cid:0) (cid:104) φ i | φ j (cid:105) − δ ij (cid:1) + (cid:90) d r v s ( r ) (cid:18)(cid:88) i,σ | φ i ( r σ ) | − ρ ( r ) (cid:19) . (2.43)A necessary condition for a minimum is that if we make small variations, δL ρ = 0 to first order, so if we consider variations due to perturbations inthe orbitals, we have δL ρ = (cid:90) d x N (cid:88) i =1 (cid:18) δφ ∗ i ( x ) δL ρ δφ ∗ i ( x ) + δL ρ δφ i ( x ) δφ i ( x ) (cid:19) . (2.44)We have seen a similar expression in the derivation of the HF equations.Since δL ρ needs to vanish for arbitrary variations, the functional derivativesneed to vanish. For the derivative with respect to φ ∗ i ( x ) , we find δL ρ δφ ∗ i ( x ) = − ∇ φ i ( x ) − N (cid:88) j =1 φ j ( x ) (cid:15) ji + v s ( r ) φ i ( x ) . (2.45)which can be rearranged to (cid:16) − ∇ + v s ( r ) (cid:17) φ i ( x ) = N (cid:88) j =1 φ j ( x ) (cid:15) ji . (2.46)The derivative with respect to φ i ( x ) is less straightforward, due to the Lapla-cian. There are different ways to proceed. Probably the easiest way is to An alternative is to use the the functional derivative of a functional of the simple form (cid:82) d s L [ f ( s ) , ∇ f ( s ) , ∇ f ( s ) , . . . ] can be calculated as δL ρ δf ( s ) = ∂L∂f ( s ) − ∇ · ∂L∂ ∇ f ( s ) + ∇ ∂L∂ ∇ f ( s ) + · · · , ˜ T s = 12 N (cid:88) i =1 (cid:104)∇ φ i |∇ φ i (cid:105) = − N (cid:88) i =1 (cid:104)∇ φ i | φ i (cid:105) . (2.47)Now it is easy to take the derivative with respect to φ i ( x ) which gives (cid:16) − ∇ + v s ( r ) (cid:17) φ ∗ i ( x ) = N (cid:88) j =1 (cid:15) ij φ ∗ j ( x ) (2.48)when equated to zero. In both equations (2.45) and (2.48) we still have thesum over the Lagrange multiplier matrix (cid:15) , which we would rather like to bediagonal to interpret them as orbital energies. We can proceed in exactly thesame manner as in the derivation of the HF SCF equations (Sec. 1.4). Di-agonalising the Lagrange multiplier matrix brings the stationarity equationsto canonical form (cid:16) − ∇ + v s ( r ) (cid:17) φ k ( x ) = ε k φ k ( x ) . (2.49)These are the KS equations which yield KS orbitals and KS orbital energies.Some remarks are in order.• To obtain the minimum expectation value for T s , we select the KS or-bitals with the lowest orbital energies to constitute our Slater determ-inant (2.40). This is the Aufbau principle, which intuitively makessense, but it takes a more careful derivation to mathematically justifythis assumption and to handle the possibility of fractionally occupiedorbitals (Giesbertz and Baerends 2010). The KS orbitals from whichthe Slater determinant is composed (2.40) are called occupied orbitalsand unused orbitals are called unoccupied/virtual orbitals.• The Lagrange multiplier for the density, v s ( r ) , has the form of a po-tential (KS potential). This is not so strange, since by modifying thepotential we can influence the density profile: too much density → in-crease the potential to push the particles away and visa versa. In thismanner the KS potential, v s ( r ) , can be obtained self-consistently bysolving the KS equations (2.49), calculating the density from the orbit-als and by comparing to the target density, one increases/decreases topotential accordingly till the orbitals yield the required density. Fur-ther, the HK theorems tell us that this potential, v s ( r ) , is unique for a where s is some arbitrary vector. This can easily be established with successive partialintegrations. Note that for s = t , f ( s ) = q ( t ) , L = T − V and only derivatives up to firstorder, one gets the usual Euler–Lagrange equations from classical mechanics. (cid:15) .• Often we deal with closed shell systems, so the amount of spin-up andspin-down electrons is the same. If there are no magnetic interactions,the wave function will be an eigenstates of ˆ S z operator, so the orbitalscome in pairs (spin-up and spin-down) with the same spatial part φ k ( x ) = φ k ( r σ ) = (cid:40) ψ ( k +1) / ( r ) α ( σ ) for k odd ψ k/ ( r ) β ( σ ) for k even . (2.50)In all expressions, the summation over the spin can now be performedexplicitly, so all expressions simplify somewhat, because only half ofthe orbitals needs to be calculated. For example, the spin-integrated1RDM (2.5) becomes γ s ( r , r (cid:48) ) = 2 N/ (cid:88) i =1 ψ i ( r ) ψ ∗ i ( r (cid:48) ) . (2.51)It turns out that T s [ ρ ] provides a very good approximation to the real kineticenergy, T [ ρ ] . The main reason is that the quantum nature of the kinetic en-ergy operator is properly taken into account. Though originally not intendedby Kohn and Sham in 1965, their approximation to the kinetic energy (Kohnand Sham 1965) has been crucial for the success of DFT in practice. Exercise 2.10. Check that the 1RDM for a Slater determinant has indeedthe simple form as is stated in (2.41). Note that the task is almost identicalto the derivation of the Slater–Condon rules for one-body operators, so youcould follow the same procedure. Exercise 2.11. Check that the noninteracting, spin-integrated 1RDM (2.5)for a non-magnetic closed shell system is indeed given by (2.51). Exercise 2.12. Show that the functional derivative of the classical Coulombinteraction (2.34) with respect to the density, gives the classical Coulombpotential. There are two ways of obtaining this derivative. The first one is tofollow the same approach as I used in class for the energetic contribution fromthe local potential, i.e. to consider the functional as the continuous analogueof the gradient. The other option is to work out δW due to variations in thedensity and to collect the first order terms in the form (cid:90) d r δW H δρ ( r ) δρ ( r ) (2.52)64 xercise 2.13. Calculate the KS potential, v s ( r ) , for a singlet two-electronsystem. Note that only one spatial KS orbital is occupied in this case, soyou can express the KS potential in the terms of the density and the orbitalenergy. Our current formulation of the KS system requires an input density and thisshould be the density of the real interacting system of course. We will as-sume that every interacting v -representable density is also non-interacting v -representable, i.e. that a potential v s ( r ) exists which is able to make the twodensities equal. This is still an open question for the T s [ ρ ] under considera-tion here (2.39). However, there exists a suitable generalisation for the kin-etic energy functional, which avoids this potential problem partially (Dreizlerand Gross 1990; Lieb 1983).Assuming that always a v s ( r ) can be found which makes the densityof the non-interacting system equal to the one of the interacting system ρ s ( r ) = ρ ( r ) , we will derive an expression for the KS potential that takescare of this. We first note that for a given density, the optimal orbitals andLagrange multipliers for the Lagrangian L ρ are functionals of the density, sowe write these optimal quantities as φ k [ ρ ]( x ) , v s [ ρ ]( r ) and (cid:15) [ ρ ] . Further notethat the Lagrangian at these optimal values exactly equals T s [ ρ ] , i.e. T s [ ρ ] = L ρ (cid:2) { φ i [ ρ ] , φ ∗ i [ ρ ] } , v s [ ρ ] , (cid:15) [ ρ ] (cid:3) . (2.53)Since the values of T s and L ρ are at each density the same, also the derivativeswith respect to the density are the same, so using the chain rule we have δT s δρ ( r ) = (cid:88) i (cid:90) d x (cid:48) (cid:32) ∂L ρ ∂φ i ( x (cid:48) ) (cid:12)(cid:12)(cid:12)(cid:12) ρ δφ i ( x (cid:48) ) δρ ( r ) + ∂L ρ ∂φ ∗ i ( x (cid:48) ) (cid:12)(cid:12)(cid:12)(cid:12) ρ δφ ∗ i ( x (cid:48) ) δρ ( r ) (cid:33) + (2.54) (cid:90) d r (cid:48) ∂L ρ ∂v s ( r (cid:48) ) (cid:12)(cid:12)(cid:12)(cid:12) ρ δv s ( r (cid:48) ) δρ ( r ) + (cid:88) i ∂L ρ ∂(cid:15) ij (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ρ δ(cid:15) ij δρ ( r ) + ∂L ρ ∂ρ ( r ) (cid:12)(cid:12)(cid:12)(cid:12) ρ = − v s [ ρ ]( r ) . We used here that the Lagrangian L ρ is stationary (derivatives zero) withrespect to the orbitals and Lagrange multipliers at the optimum values φ k [ ρ ]( x ) , v s [ ρ ]( r ) and ε k [ ρ ] , so only the term where the density appearsexplicitly survives.The next step is to rewrite the total energy of the real interacting systemin terms of T s as E [ ρ ] = F [ ρ ] + V [ ρ ] = T s [ ρ ] + V [ ρ ] + F [ ρ ] − T s [ ρ ] (cid:124) (cid:123)(cid:122) (cid:125) =: E Hxc [ ρ ] , (2.55)where E Hxc [ ρ ] is the Hartree-exchange-correlation energy. Often the clas-sical Coulomb (Hartree) part (2.34) is treated explicitly and the remaining65xchange-correlation (xc) energy is decomposed in a kinetic and an inter-action part E xc [ ρ ] := E Hxc [ ρ ] − W H [ ρ ] = T [ ρ ] − T s [ ρ ] (cid:124) (cid:123)(cid:122) (cid:125) =: T c [ ρ ] + W xc [ ρ ] . (2.56)Since we want to minimise the total energy of the fully interacting system, weactually want to optimise the energy of the interacting system with respectto density variations, δρ ( r ) . We should keep in mind, however, that we fixedthe number of particles, so we have the following constraint on the alloweddensity variations (cid:90) d r δρ ( r ) = 0 . (2.57)This means that if we consider first order variations in the energy due tosuch variations in the density, we find δE = (cid:90) d r δEδρ ( r ) δρ ( r ) ⇒ c = δEδρ ( r ) = δFδρ ( r ) + v ( r ) . (2.58)So the functional derivative of E with respect to the density needs to bea constant. This constant reflects the fact that the potential is uniquelydetermined by the density up to a constant. This called gauge freedom:shifting the potential by a constant does not change the physics of the system.A convenient convention to fix the value of the constant in the potential isto demand that v ( r ) → if | r | → ∞ .Working out the stationarity condition in terms of KS quantities, we have c = δEδρ ( r ) = δT s δρ ( r ) + v ( r ) + δE Hxc δρ ( r ) . (2.59)Combining with (2.54), we find that the KS potential should be set to v s [ ρ ]( r ) = v ( r ) + v H [ ρ ]( r ) + v xc [ ρ ]( r ) , (2.60)where we assumed now that all the potentials involved vanish at infinity,so we can set c = 0 . Additionally we used the classical Coulomb (Hartree)potential and the exchange-correlation (xc) potential, defined respectively as v H [ ρ ]( r ) := δW H δρ ( r ) = (cid:90) d r (cid:48) ρ ( r (cid:48) ) | r − r (cid:48) | , v xc [ ρ ]( r ) := δE xc δρ ( r ) . (2.61)The exact E xc will be a very complicated functional of the density and thexc potential will be even more complicated. Nevertheless, if we can find agood approximation to the xc energy, we are in business and solve the KSequations directly without solving the fully interacting Schr¨odinger equation. Note that xc energy is just a fancy name for all the difficult parts of the energy whichare hard to calculate for us, so it is a measure of our inability to do the real calculation.Richard Feynman therefore prefers to call the xc energy the stupidity energy (Feynman1972). .4 The exchange-correlation energy To transform the KS equations into a practical scheme, we need to approxim-ate the xc energy. Therefore, we need to understand the exact E xc better torationalise the performance of various approximations, which will be treatedlater. To make the discussion simpler, we will first assume that the kineticenergy correction, T c , is small and can be neglected. Later, we will showhow the contribution from the kinetic energy can be included again. To introduce the concept of holes, we will first only consider the interactionpart of the xc energy, W xc . The full interaction written in terms of thepair-density is W = 12 (cid:90) d r (cid:90) d r P ( r , r ) | r − r | . (2.62)Since the pair-density integrates to N ( N − particles, this term describesthe interaction between all the particles. So that are in total N ( N − interactions, since the particles do not interact with themselves and the halftakes care that we only count the unique pairs.To express more explicitly that the particles only interact with otherparticles, we introduce the conditional probability density ρ ( r | r ref ) := P ( r , r ref ) ρ ( r ref ) . (2.63)The conditional probability is the probability to find a particle at r , if weknow that an other particle is located at the reference position r ref . Exercise 2.14. Check that the conditional probability is normalised as youwould expect ( N − ).The interaction term can now be expressed with the help of the condi-tional probability as W = 12 (cid:90) d r (cid:90) d r ref ρ ( r | r ref ) ρ ( r ref ) | r − r ref | , (2.64)So given the density profile of the electrons, only the interactions with the other electrons should be taken into account, which is exactly what theconditional density achieves.The classical Hartree part (2.34), however, takes the interaction of ρ ( r ref ) with the full density ρ ( r ) into account instead of only the conditional density ρ ( r | r ref ) . The Hartree term therefore does not only contain the interactionbetween the particles, but also contains an interaction of the particles with67hemselves. The main task of W xc is to remove this self-interaction. We canimagine this correction as an interaction of the particles with a − particle,so we rewrite the xc interaction part as W xc := W − W H = 12 (cid:90) d r (cid:90) d r ref ρ ( r | r ref ) ρ ( r ref ) − ρ ( r ) ρ ( r ref ) | r − r ref | = 12 (cid:90) d r (cid:90) d r ref ρ ( r ref ) ρ xc ( r | r ref ) | r − r ref | , (2.65)where the xc-hole is defined as the correction to the density to obtain theconditional density ρ xc ( r | r ref ) := ρ ( r | r ref ) − ρ ( r ) = P ( r , r ref ) ρ ( r ref ) − ρ ( r ) . (2.66)The xc-hole has the property that it contains exactly − particle (cid:90) d r ρ xc ( r | r ref ) = − (2.67)as would be expected from our discussion on the main purpose of W xc .One could forget about the conditional density and consider the xc-holeas a quantity which discribes (minus) the shape of one electron at the refer-ence position, among all the other electrons. Since the electrons are quantumparticles, the electron is not localized at the reference position, but delocal-ized. Subtracting the interaction of the electrons with their xc-hole from theHartree term therefore eliminates the self-interaction.The definition for the xc-hole (2.66) can also be used to define other holes.In particular, the hole corresponding to the KS wave function is called theexchange hole (x-hole). Since the KS wave function is consists of only oneSlater determinant, its pair-density simplifies to P s ( r , r ) = ρ ( r ) ρ ( r ) − | γ s ( r , r ) | . (2.68)Using this pair-density in the definition of the xc-hole, we find that the x-holehas the simple form ρ x ( r | r ref ) := − | γ s ( r , r ref ) | ρ ( r ref ) . (2.69)The correlation hole (c-hole) is now simply defined as the difference betweenthe xc-hole and the x-hole ρ xc ( r | r ref ) =: ρ x ( r | r ref ) + ρ c ( r | r ref ) . (2.70)In the next section we derive how the kinetic energy can be included in thexc hole description. This has not been treated in the lecture, so will not be68art of the exam. If you are interested, you can read the following section,otherwise, you can perfectly skip it. The holes including the kinetic energyeffects are indicated with an additional bar, ¯ ρ c and ¯ ρ xc for the c-hole andxc-hole respectively. Exercise 2.15. Check that (cid:90) d r (cid:90) d r P ( r , r ) = N ( N − . (2.71) Exercise 2.16. Check that the definition of ρ xc (2.66) is consistent with W xc and the xc-hole integrates to exactly − particle (2.67). Exercise 2.17. In this exercise we will check some properties of the pairdensity when the wave function is a simple Slater determinant.a) Show that P s ( x , x ) = ρ ( x ) ρ ( x ) −| γ s ( x , x ) | . This relation is onlyvalid if the pair-density is calculated from a single Slater determinant, Ψ s , so you need to use this fact.b) Using the previous result, show P s ( r , r ) = ρ ( r ) ρ ( r ) − | γ s ( r , r ) | in the restricted case, so the spin-up and spin-down orbitals share thespatial parts as in (2.50). Exercise 2.18. Check that the x-hole, ρ x ( r | r ref ) , integrates to − particle.Assume that the KS 1RDM is of restricted form (2.51). To what number ofparticles does the c-hole integrate? Exercise 2.19. Show that the x-hole for a singlet two-electron system canbe calculated to be ρ x ( r | r ref ) = − ρ ( r ) / . What do you notice? What about the kinetic energy part of E xc ? We expect T c to be small, so wewill aim to write it as a small modification of the xc-hole introduced before.To calculate T c we need to connect the non-interacting KS system with thefully interacting system. We do this by considering systems with a rescaledinteraction, λ ˆ W , and a local potential, v λ , which is adjusted such that thedensity is equal to the fully interacting density at all coupling strengths, ρ λ = ρ = ρ . Note that for λ = 0 we exactly recover the KS system. Theenergy at arbitrary λ is E λ = (cid:10) Ψ λ (cid:12)(cid:12) ˆ H λ (cid:12)(cid:12) Ψ λ (cid:11) = (cid:10) Ψ λ (cid:12)(cid:12) ˆ T + ˆ V λ + λ ˆ W (cid:12)(cid:12) Ψ λ (cid:11) . (2.72)The variation in the energy when changing the coupling strength, λ , can beevaluated as ∂E λ ∂λ = (cid:28) Ψ λ (cid:12)(cid:12)(cid:12)(cid:12) ∂ ˆ H λ ∂λ (cid:12)(cid:12)(cid:12)(cid:12) Ψ λ (cid:29) (Hellman–Feynman)69 (cid:28) Ψ λ (cid:12)(cid:12)(cid:12)(cid:12) ∂ ˆ V λ ∂λ (cid:12)(cid:12)(cid:12)(cid:12) Ψ λ (cid:29) + (cid:10) Ψ λ (cid:12)(cid:12) ˆ W (cid:12)(cid:12) Ψ λ (cid:11) = (cid:90) d r ρ ( r ) ∂v λ ( r ) ∂λ + 12 (cid:90) d r (cid:90) d r P λ ( r , r ) | r − r | (2.73)Now we use the fundamental theorem of calculus (Almbladh 1972; Langrethand Perdew 1975; Gunnarsson and Lundqvist 1976). To write the energydifference between the KS system and the interacting system as E − E = (cid:90) d λ ∂E∂λ = (cid:90) d r ρ ( r ) (cid:0) v ( r ) − v ( r ) (cid:1) + 12 (cid:90) d r (cid:90) d r ¯ P ( r , r ) | r − r | , (2.74)where the coupling constant integrated/averaged pair-density is defined as ¯ P ( r , r ) := (cid:90) d λ P λ ( r , r ) . (2.75)Subtracting the integral over the potential difference on both sides of thisequation, gives the following expression for the Hartree (H)xc energy E Hxc = T − T s + W Hxc = 12 (cid:90) d r (cid:90) d r ¯ P ( r , r ) | r − r | . (2.76)Subtracting the classical Coulomb (Hartree) term from both sides, we fin thefollowing expression for the xc energy E xc = 12 (cid:90) d r (cid:90) d r ¯ P ( r , r ) − ρ ( r ) ρ ( r ) | r − r | . (2.77)Comparing this expression with W xc (2.65), we find that we only need toreplace the fully interacting pair-density by the coupling constant integratedpair-density. The corresponding averaged xc-hole, which includes the kineticenergy effects, becomes ¯ ρ xc ( r | r ref ) := ¯ P ( r , r ref ) ρ ( r ref ) − ρ ( r ) . (2.78)Approximate functionals based on the HEG typically include the the kin-etic energy effects. The kinetic energy effects on the shape of the hole ininhomogeneous systems (anything else than the HEG) such as molecules arenot well known actually. They are expected to be small nevertheless. Exercise 2.20. Check the Hellman–Feynman step in (2.73).70 � Figure 2.3: The different holes of the H molecule at R H–H = 1 . Bohr andthe reference electron at 0.3 Bohr to the left of the right nucleus along the bondaxis ( r ref = (0 , , . Bohr). The positions of the nuclei are indicated by the bluelines and the position of the reference electron is by red. The left panel shows thex-hole, ρ x ( r | r ref ) , the middle panel shows the c-hole, ρ c ( r | r ref ) , which provides asmall correction to have the more localized real hole, ρ xc ( r | r ref ) . �� Figure 2.4: Similar to the previous plots, but now for R H–H = 5 . Bohr. Thereference electron is still at 0.3 Bohr to the left of the right nucleus along the bondaxis ( r ref = (0 , , . Bohr now). molecule In Exercise 2.19 you have already calculated the x-hole of a singlet two-electron system, so also H , to be ρ x ( r | r ref ) = − ρ ( r ) , (2.79)so the x-hole of a singlet two-electron system is independent of the referenceposition. The xc-hole can be obtained from accurate CI calculations. Theholes for the H molecule at equilibrium distance are shown in Fig. 2.3. Thec-hole only provides a small correction to the x-hole. The reference electronis located close to the right nucleus (0.3 Bohr to the left). The main effectof the c-hole is to localize the full xc-hole more on the right nucleus. This isa very intuitive effect, since if the reference electron is located near the rightnucleus, you also expect that the hole is located near the reference electron.Some delocalization remains due to the quantum nature of the electrons. To include the kinetic energy effects, the averaged xc-hole should have been calculated.This is actually quite involved procedure and this data is currently not yet available. Asmentioned before, the effect of the kinetic energy is probably small and the λ = 1 holealready shows the most important physics. molecule with R H–H = 5 . Bohr. The c-hole is not a smallcorrection to the x-hole anymore. The c-hole actually needs to be of equalmagnitude to completely eliminate the hole amplitude on the left nucleus.The c-hole has the same peak on the right nucleus with opposite sign toensure that the full xc-hole correctly integrates to − electron (2.67).The holes give a different view on the failure of restricted HF to describethe dissociation of the H molecule. Since HF only includes exchange, thehole by the HF model remains completely delocalized, even when the hy-drogen molecule is dissociated. Simple physical intuition immediately tellsyou that this is an incorrect description, since it is energetically unfavour-able for two electrons to be near the same nucleus simultaneously. Instead,when one electron is on the left nucleus, the other should be at the rightnucleus and visa versa. A fancy name for this strong correlation between thewhereabouts of the electrons is ‘quantum entanglement’. Exercise 2.21. Use the x-hole to argue that the HF energy behaves asymp-totically as E HF ( R H–H → ∞ ) = C − R H–H . (2.80)Do not forget to include the interaction between the nuclei!Assuming that the HF orbital becomes just a linear (gerade) combinationof the hydrogenic 1s orbitals on the two hydrogen atoms, argue that theconstant can be approximated as C = lim R →∞ E RHF ( R ) ≈ (cid:104) A | ˆ h | A (cid:105) + 12 ( AA | AA ) = ζ − ζ + 5 ζ , (2.81)where in the last step you can reuse the results from exercise 7 (e + k) fromthe HF part. Since this is the restricted HF energy in the dissociation limit,you can simply optimise the exponent to get the numbers ( ζ opt = 27 / and C opt = − / ≈ − . ). Now that we have some understanding of what the exact xc energy shoulddo, we can take a look at different approximations which are used for E xc [ ρ ] .One of the main disadvantages of current DFT is that there does not anultimate xc functional, which works in all cases. Therefore, hundreds of dif-ferent approximations have been published all optimised for different systemsand physical situations. Although many of these functionals have intimid-ating acronyms based on the author names, they are often just some slightreparametrizations without any newly captured physics. We will limit the72iscussion to the most basic xc functional classes, which encompasses mostof the functionals used in daily practice. The local density approximation (LDA) is the oldest of the density function-als and its history actually dates back far before the foundations of DFTwere laid. Some people like to refer to the LDA as the mother of all func-tionals. The LDA based on the idea that if the density does not vary toostrongly, we can assume that ¯ ρ xc closely resembles the xc-hole of the HEGat the reference position. The xc-hole of the HEG only depends on the (con-stant) density, ρ , of the gas and the distance between the electron, | r − r ref | ,since the HEG is an isotropic system. The LDA xc-hole is defined as ¯ ρ LDAxc ( r | r ref ) := ρ HEGxc (cid:0) ρ ( r ref ) , | r − r ref | (cid:1) . (2.82)The remaining task is to calculate ρ HEGxc . The exchange part is not toodifficult to obtain, since you can use the non-interacting solution you alreadyfound in Exercise 2.9, where you calculated the kinetic energy of the HEG.The 1RDM of the non-interacting HEG can be worked out to be γ HEG s ( k F , r ) = k F π J ( k F r ) , J ( y ) = sin( y ) − y cos( y ) y , (2.83)where k F := 3 π ρ is the Fermi wave vector. From the non-interacting 1RDM,the x-hole of the HEG can readily be calculated (2.69). In fact, the exchangepart of the LDA energy was already evaluated in 1930 by P. Dirac (Dirac1930) to be W LDAx [ ρ ] = − C x (cid:90) d r ρ ( r ) / , C x := 34 (cid:18) π (cid:19) / . (2.84)Since the original treatment by Thomas and Fermi did not include exchangehe proposed add this term to take exchange approximately into account. Un-fortunately, the additional term W LDAx only makes the result from Thomas–Fermi theory even worse (Eschrig 2003), indicating that the poor kineticenergy functional, T TF is the main source of the bad performance and notthe lack of exchange and correlation effects.History demands we should mention a resurgence of interest in W LDAx already in 1951. The first electronic computers started to appear around thattime and performing an actual HF calculation for small chemical systemsbecame feasible. The main bottleneck however, was the calculation of thenon-local exchange potential of HF. Therefore, J.C. Slater proposed to usea local approximation to the exchange potential based on the HEG in the73ame way as the LDA (Slater 1951). He called this method the X α method,where α refers to a constant in his expression for the exchange energy W X α x [ ρ ] = 32 αW LDAx [ ρ ] , (2.85)so α = 2 / . Strangely enough, the X α approximation by Slater actually gavebetter results than HF and got even better if the value of the parameter α was set to α = 0 . . At first sight it is counterintuitive that an approximationgives better numbers than the method it is supposed to approximate. Thishas upset many scientists and the use of the X α method has remained con-troversial for a very long time. Only when the LDA functional was studied inmore detail, people started to understand why the simplistic X α performedbetter than HF and could even explain why raising the value of α wouldimprove the results even further.Before we explain the superior performance of LDA and X α over HF,we should mention that an analytic expression for the correlation part isnot available for the HEG. Although the HEG appears to be a very simplesystem with its constant density, all the many-body effects responsible forcorrelation turn out to be very complicated, even in this ‘simple’ isotropicsystem. The main advantage of the HEG is that we do not have to deal withreal density functionals, but only functions of the density, since the density isconstant in the HEG. The asymptotic behaviour of the xc-hole of the HEGhas been studied to great detail and is well understood (Gori-Giorgi andPerdew 2001). Accurate quantum Monte Carlo calculations have suppliedxc holes for intermediate values (Ceperley and Alder 1980; Ortiz, Harris andBallone 1999), which have been combined with the asymptotic behaviourto construct accurate fits of the correlation part of the HEG (Wang andPerdew 1991; Gori-Giorgi and Perdew 2002). In practical LDA functionals,these fits are used to construct the correlation part of the LDA functional.The most used fit for the correlation part of the LDA is the one by Vosko,Wilk and Nusair (VWN) and is denoted as VWN5, where the 5 stands forthe 5 th variant in the (same!) article (Vosko, Wilk and Nusair 1980). Thephysics captured by the LDA does not really change when using differentapproximations, so we will not go into the details of different versions for thefit of the correlation part.Now let us consider the LDA holes for a stretched H molecule at R H–H =5 . Bohr depicted in Fig. 2.5. There are couple of important things to notice:• The x-hole is oscillatory due to the sines and cosines in the 1RDM from Quantum Monte Carlo is a different approach to find the ground state. Instead ofusing Slater determinants, one uses much more complicated ansatz forms which capturesimportant analytic features of the wave function exactly. The resulting integral with theHamiltonian remains high dimensional and is solved by stochastic generation of integralpoints. This stochastic manner of solving integrals is called Monte Carlo, a code themethod got during the Manhattan project (the atomic bomb). � Figure 2.5: The LDA holes for the H molecule at R H–H = 5 . Bohr. The referenceelectron is again at 0.3 Bohr to the left of the right nucleus along the bond axis( r ref = (0 , , . Bohr). the HEG (2.83) (see also Fig. 2.7 later on). Correlation removes theseoscillations.• The LDA holes are spherical (only depend on | r − r ref | ) and are centeredaround the reference electron.• The LDA x-hole and c-hole are very bad approximations to the exactholes and a direct comparison does not make much sense. The totalLDA hole, ¯ ρ xc ( r | r ref ) , however, is not a too bad approximation to theexact xc-hole (see Fig. 2.4).• The total LDA xc-hole is a much better approximation to the exactxc-hole than the x-hole alone which is used in the HF approximation.Although the shape of the LDA xc-hole is not particularly good, at leastLDA is able to describe the localization of the hole near the referenceelectron. This is why LDA often outperforms HF.• The correction from the LDA c-hole is small. Its main contribution isto deepen the hole near the reference electron, which causes a decreaseof the exchange energy, due to the / | r − r ref | factor in the exchangeenergy functional (check (2.65) with xc → x). This explains why Slatergot even better results by increasing the α factor to slightly largervalues, since effectively he incorporated more correlation effects.Due to a better localization of the hole, LDA outperforms HF for moleculardissociation (see Fig. 2.6). The most important features of LDA in practicalcalculations are:• The LDA favours homogeneous systems too much, so is too eager toform bonds between atoms and molecules. The bond lengths are there-fore too short in LDA calculations, so LDA overbinds in this sense.Nevertheless, the bond lengths are still within 1–2% accuracy.• Though better than HF, the energy still increases to much upon dis-sociation (Fig. 2.6), so binding energies and transition states tend tobe too high in energy. 75he LDA only brought partial success for DFT. The LDA energies where notaccurate enough for chemists to make useful predictions on molecules. Onlywhen the accuracy of the energies increased sufficiently with the introductionof functionals which also depend on the gradient (GGAs), DFT started tobe useful for chemistry. The solid state physicists were very happy with theLDA already. In an infinite solid, it makes no sense to talk about the totalenergy, so the lack of accuracy of LDA on this part was irrelevant. The mostimportant feature of LDA compared to HF for the solid state physicists wasthat LDA can describe metals whereas HF can not. One can even proofrigorously that unrestricted HF predicts all materials to be insulators (Bachet al. 1994). The ability of LDA to describe both insulators and metals wastherefore a major breakthrough in the solid state community. Exercise 2.22. Show that the 1RDM of the non-interacting HEG is indeedgiven by (2.83). Due to the isotropy of the HEG, the KS orbitals will remainplane waves, so you can reuse the orbitals from Exercise 2.9. Exercise 2.23. Give the expression for the x-hole of the LDA. Exercise 2.24. Calculate the LDA exchange energy, W LDAx . The integral isnot so easy to solve. Use q ( x ) := − sin( x ) /x to rewrite the integral in termsof q ( x ) , q (cid:48) ( x ) and q (cid:48)(cid:48) ( x ) only. You can rewrite the integrant now as a totalderivative, which allows you to do the integration easily. You should get thesame answer as Dirac in 1930 (2.84). Exercise 2.25. Show explicitly that the LDA x-hole integrates to exactly − electron. There are two ways to solve this exercise brute-force Use the same trick as in the previous exercise to show that (cid:90) d r ρ LDAx ( r | r ref ) = − π (cid:90) ∞ d x (cid:0) q ( x ) (cid:1) . (2.86)If you did a course in complex analysis, you can solve the remainingintegral over q by contour integration, which gives π/ . detour Write the 1RDM back in its integral form over the wave vectors γ HEG s ( k F , r ) = 14 π (cid:90) Ω F d k e i k · r , (2.87)where Ω F is the Fermi sphere: the part of k -space which correspondsto occupied orbitals. Insert this expression in the definition for thex-hole (2.69) and evaluate the integration condition for the x-hole by first performing the integration over u = r − r ref . You also need to usethat (cid:90) ∞−∞ d x e i kx = 2 π δ ( k ) , (2.88)where δ ( x ) is the Dirac delta-function.76 ull CIRHF VWN5BP86TPSSB3LYP t o t a l ene r g y [ a . u .] − − − − − H-H [a.u.]2 4 6 8 10 Figure 2.6: Comparison of several approximations to the xc energy to full CI andHF in the aug-cc-pVQZ basis for the hydrogen molecule. The hydrogen moleculeis special, since the total energies from DFT seem to be very good. For othermolecules the total energies are not so good, but the relative energies are. The most logical step to improve the accuracy of the LDA is to include alsothe gradient of the density. An approximate hole can be built by using aslightly perturbed HEG ¯ ρ GEAxc ( r | r ref ) := ¯ ρ HEGxc (cid:0) ρ ( r ref ) , |∇ ρ ( r ref ) | , | r − r ref | (cid:1) , (2.89)the gradient expansion approximation (GEA) hole. Unfortunately this ap-proach did not work, since the GEA functionals always gave results worsethan the LDA. It took a long time before Perdew (Perdew 1985, 1986) real-ized that the long-range oscillations — already present in the LDA x-hole(see Fig. 2.5 and discussion) — were hugely enhanced and that the GEAx-hole is even not negative definite anymore. The solution by Perdew wassimple: just remove the positive part of the x-hole. The x-hole integratesnow to less than − electron, so he limited the extend of the x-hole by onlytaking the part which integrates to − electron within a sphere centered atthe reference position. In this way he could both maintain the integration77 xcGGA @ n " , n E d r f ~ n " , n , π n " , π n ! , ~ ! often start from the GEA for the hole n xcGEA and cut off itslarge- u contributions to restore exact conditions such as Eqs. ~ ! , ~ ! , and ~ ! . Since only the system average of Eq. ~ ! isneeded, π n contributions to the GEA are first transformedvia integration by parts on r . GGA’s may be applied directlyor hybridized with exact exchange. In Sec. II, we present our GGA model for the exchangehole. The first such model was that of Perdew and Wang in1986 ~ PW86 ! , who used sharp cutoffs on n xGEA to enforceEqs. ~ ! and ~ ! , yielding E xGGA @ n " , n E xGGA @ n " E xGGA @ n , ~ ! E xGGA @ n E d r n e xunif ~ n ! F x ~ s ! , ~ ! where e xunif ~ n ! k F /4 p , ~ ! k F ~ p n ! , ~ ! s u π n u /2 k F n . ~ ! The real-space cutoff gave a numerical function F x ( s ) ~ seeFig. 1 of Ref. 12 ! , which was fitted to an analytic form, F xPW86 ( s ). In the later work of Perdew and Wang in 1991 ~ PW91 ! , Becke’s semiempirical refinements plus addi-tional theoretical constraints were included in F xPW91 ( s ), al-though F xPW91 ( s ) was a worse fit to the numerical functionthan was F xPW86 ( s ). Both the PW86 and PW91 parametriza-tions were contorted at small s to recover the expectedGEA of Eq. ~ ! .Recently, Perdew, Burke, and Ernzerhof ~ PBE ! pre-sented a simplified construction of a simplified GGA for ex-change and correlation, in which all parameters ~ other thanthose in LSD ! are fundamental constants. Although indepen-dent of PW91 or any model for the hole, the PBE functionalis numerically equivalent to PW91 for most purposes, and F xPBE ~ s ! k k / ~ m s / k ! , ~ ! where m ~ to preserve the good LSD description ofthe exchange-correlation energy in the linear response of theuniform gas ! and k ~ ! , by applying adamping factor to the PW86 exchange hole. The dampingfactor, used only for exchange, reflects the more pathologicallarge- u behavior of the n xGEA , and the ‘‘double’’ nature of itsGGA cutoffs, which enforce both Eqs. ~ ! and ~ ! .In Sec. III, we present our model for the GGA correlationhole. The first such model, which led to the PW91 corre-lation energy functional, was based upon sharp cutoffs ofcrude approximations for both the LSD and gradient contri-butions to the hole. We refine these crude approximations,but find essentially the same correlation energy, which canbe accurately represented by the PBE functional E cPBE @ n " , n E d r n $ e c ~ r s , z ! H PBE ~ r s , z , t ! % , ~ ! where r s ~ p n ! , ~ !z ~ n " n " ! / n , ~ ! t u π n u /2 k s f n , ~ !f @~ z ! ~ z ! , ~ ! k s ~ k F / p ! , ~ ! H PBE gf ln H bg t F At At A t G J , ~ ! A bg @ exp $ e cunif / gf % ~ ! and g b s and t measure how fast n ( r ) is varying on the scales of thelocal Fermi wavelength 2 p / k F and the local Thomas-Fermiscreening length 1/ k s , respectively.In Ref. 30, Eq. ~ ! was derived from three limits: H PBE ! bf t ~ t ! ! , ~ ! H PBE ! e Cunif ~ t ! ` ! , ~ ! and E cPBE @ n " g , n g ! const ~ g ! ` ! , ~ ! where n sg ( r ) g n s ( g r ) is a uniformly scaled density. These limits also emerge naturally from the real-space cutoffconstruction of Sec. III, as shown in Ref. 32. The high-density limit of Eq. ~ ! is violated by both LSD andPW91. Thus the PBE correlation energy functional of Eq. ~ ! can be derived either from various limits, as in Ref. 30, orfrom a real-space construction of the GEA correlation hole,as in Sec. III. The PBE exchange energy functional of Eq. ~ ! is derived from its limits in Ref. 30, and is then used toimprove the real-space cutoff of the GEA exchange hole inSec. II. FIG. 1. Spherically averaged exchange hole density n X for s ~ circles ! , GEA ~ crosses ! , and GGA ~ solid line ! .16 534 54JOHN P. PERDEW, KIERON BURKE, AND YUE WANG Figure 2.7: Spherically averaged x-hole for the LDA (=LSD, circles), the GEA(crosses) and the GGA (solid line). The plot is taken from (Perdew, Burke andWang 1996), where n x = ρ x denotes the x-hole and u = | r − r ref | is the inter-electronic distance and s := |∇ ρ | / (2 k F ρ ) is a dimensionless version of the gradient. condition and remove the long-range oscillations from the GEA x-hole. Thisprocedure gives the GGA.As an example to illustrate all these features, we show spherically av-eraged x-holes for the LDA (=LSD), GEA and GGA in Fig. 2.7 which istaken from Ref. (Perdew, Burke and Wang 1996). You clearly see that theoscillations present in the LDA x-hole are enhanced in the GEA x-hole tosuch extend that the GEA x-hole has positive parts. In the GGA x-holethese are removed and only the inner part is retained which integrates to − electron. This gives a rather ridiculous shape for the GGA x-hole, but thatis mainly in the outer region. Since the exchange energy mainly probes theinner region due to the / | r − r ref | factor, these irregularities in the outerpart do not affect the exchange energy too much. The inner region is ofthe x-hole is actually improved by including gradient dependent terms andhence, also the prediction for the energy compared to the LDA. The mainfeatures of the GGAs are:• The gradient part of the GGAs favours inhomogeneous systems more,so corrects for the overbinding of the LDA. The GGAs tend to over-correct the bond lengths, so the accuracy remains the same 1–2% ofLDA. This is exactly what you expect for a perturbative expansion. By including higherorder derivatives you improve the description close to your reference and the description faraway can be better or worse (almost exclusively worse in practice, e.g. the MP perturbationseries or the failing perturbative approach to deal with the strong force between quarks(chromodynamics). 78 Since homogeneous systems are not favoured so much anymore, bindingenergies and transition state barriers are improved, which made theGGA useful for chemistry.• Core electrons are treated better, so GGAs give better total energiesthan the LDA.The total energy for the oldest successful GGA functional, the BP86, areshown also in Fig. 2.6 for our test system, H . The correlation part ofthe BP86 is the one originally proposed by Perdew (Perdew 1986) and theexchange part was replaced by the B88 exchange functional proposed byBecke (Becke 1988), since it gave better numbers. During the years therehave been efforts to simplify the parametrization of the BP86 and has lead tothe PW91 functional (Wang and Perdew 1991; Perdew et al. 1992) and wassimplified even more in the PBE functional (Perdew, Burke and Ernzerhof1996). The physical idea remains the same. There is only a difference inparametrization strategy, which results in different numbers. The next logical step is to include the second order derivative of the density,the Laplacian ∇ ρ ( r ) . Approximate functionals which also include higherorder derivatives of the density are called A functional one step beyond theGGA, so it also includes the Laplacian of the densitys (meta-GGAs). Directcalculation of ∇ ρ ( r ) leads to numerical problems for code which are basedon Gaussian basis sets. To avoid these numerical problems, one use oftenthe KS kinetic energy-density instead, which is defined as τ s ( x ) := 12 N (cid:88) i =1 |∇ φ i ( x ) | . (2.90)The use of the KS kinetic energy density leads to the difficulty that thefunctional now becomes an orbital dependent functional. In practice, thisadditional complication is simply neglected.The meta-GGAs do not such a big improvement for covalent bonds, butdo do improve the description of weak bonds Sun et al. 2013. In practice,their performance for weak bonds still requires the additional use of empiricaldispersion correction schemes, limiting the actual use of meta-GGAs in prac-tice. The most well-known older meta-GGAs is the TPSS functional (Taoet al. 2003) and a more recent popular meta-GGA is the SCAN functional.A key ingredient of the SCAN functional is recognition that with the help of τ s ( r ) regions of single-orbital character, slowly varying density and overlapof closed shells can be recognized by the following parameter α ( r ) = τ s ( r ) − τ W ( r ) τ HEG ( r ) , (2.91)79here τ W is the single orbital (Von Weizs¨acker) kinetic energy density (cf.exercise 2.13) τ W ( r ) = 18 | ∇ ρ ( r ) | ρ ( r ) (2.92)and τ HEG ( r ) is the kinetic energy density of the non-interacting HEG τ HEG ( r ) = 310 (cid:0) π (cid:1) / ρ ( r ) / . (2.93)That a meta-GGA would be able to distinguish between regions with singleorbital character ( α ≈ ) and slowly varying density ( α ≈ ) was alreadyrecognized very early by Becke Becke 1998 and incorporated in the earlymeta-GGAs such as the TPSS functional. About a decade later, it wasrealized that the description of weak bonds (overlap of closed shells) couldbe regonized by the parameter α , since in that case α (cid:29) Sun et al. 2013.Still, dispersion interactions require Exercise 2.26. Derive a relation between τ s ( r ) and ∇ ρ ( r ) . First show that τ s ( x ) and ∇ ρ ( x ) are related as τ s ( x ) = N (cid:88) i =1 ε i | φ i ( x ) | − v s ( r ) ρ ( x ) + 14 ∇ ρ ( x ) . (2.94)Now you can integrate out the spin coordinate to find the desired relation. Typically we are interested in molecules at their equilibrium geometries. ForH we saw that the full xc-hole does not completely localize on the thenucleus where the reference electron is located (Fig. 2.3). A small peak ofthe x-hole remains behind on the other nucleus. To incorporate this effect,one can include a small percentage of ρ x (also called ‘exact’ exchange) inthe functional. The amount is typically fixed. It is clear from the holesof the H molecule that the inclusion of exact exchange gives an improveddescription of the xc-hole at short distances compare Figs 2.3, 2.5 and 2.8).For stretched bonds, however, we inherit the delocalization error of HF andthe description becomes worse (compare Figs 2.4, 2.5 and 2.9).The improved description of the xc-hole at short bond distances alsoimproves the energy near equilibrium. The worse description of the xc-hole atlong bond distances deteriorates the energy, as is clear from the comparisonof total energy from the B3LYP functional with the other functionals andHF in Fig. 2.6. The B3LYP functional performs extremely good near theequilibrium distance of H , but upon dissociation the B3LYP hole does notfully localize and is outperformed by one of the oldest GGAs: the BP86.80 igure 2.8: 80% of the LDA xc-holemixed with 20% of the exact x-hole at R H–H = 2 . Bohr. The reference elec-tron is again located at at 0.3 Bohrto the left of the right nucleus ( r ref =(0 , , . Bohr). Figure 2.9: The same xc hole model(80% LDA and 20% exact exchange) for R H–H = 5 . Bohr. The B3LYP functional is the most used and well known functional in chem-istry, though one might wonder if it deserves this honor. The tale of theB3LYP functional is a strange one. It starts with a hybrid functional origin-ally proposed by Becke in 1993 (Becke 1993) with 3 empirical parameters tomix several GGAs and LDA with exact exchange E B3PW91xc = E VWN5xc + a (cid:0) E exactx − E VWN5x (cid:1) + a x (cid:0) E B88x − E VWN5x (cid:1) + a c (cid:0) E PW91c − E VWN5c (cid:1) , (2.95)Becke fitted the parameters a , a x and a c to a set of thermodynamic data.This functional from Becke has been implemented in Gaussian — the mostpopular quantum chemistry package — with the following modifications• The GGA correlation part from the PW91 functional was replaced bythe correlation from the LYP functional.• The VWN5 LDA parametrization was replaced by the VWN3. Thiswas probably a mistake, since VWN3 has the wrong asymptotic beha-viour and VWN recommended not to use this parametrization.Such a large modification of the B3PW91 would normally require a refittingof the empirical parameters, but this has not been done. Although theproper motivations for the B3LYP functional are virtually absent, the B3LYPfunctional is the most used approximation for the xc energy in chemistry.81 .6 Is there any meaning to the KS system? The KS system was introduced to calculate T s [ ρ ] as an approximation to thereal kinetic energy functional T [ ρ ] . Is there more physical meaning to theKS system? The energy of the HOMO in finite systems is exactly equal to minus the firstionisation energy ε HOMO = − I = E ( N ) − E ( N − . (2.96)The first step to prove this relation is to show that the density decays as ρ ( r → ∞ ) ∼ e − √ I r (2.97)far away from the system. The derivation is beyond the scope of this lecture,but can be found in (Almbladh and Barth 1985). The next step is to realizethat the asymptotic behaviour of the density in the KS system is governedby the asymptotic decay of the HOMO, since the KS orbitals asymptoticallydecay as ψ k ( r → ∞ ) ∼ e −√− (cid:15) k r . (2.98)All the other occupied orbitals have a more negative orbital energy, so theydecay faster than the HOMO when r → ∞ . The asymptotic behaviour ofthe KS density is therefore dictated by the HOMO.In practice, however, the LDA & GGA HOMO energies are ∼ − eVtoo high. This too high value for the HOMO energy is caused by the too fastdecay of their corresponding xc potentials. The exact v xc ( r ) decays as − /r for neutral systems, since if the electron is pulled away, it leaves a systembehind with one electron less, so an effective positive charge.The other occupied orbital energies provide good approximations to theionisation to excited ion states, I k = E k ( N − − E ( N ) , if these excited ionstates are well described by a single Slater determinant (Chong, Gritsenkoand Baerends 2002; Gritsenko and Baerends 2002). This statement onlyholds for the exact xc potential, which is in general badly approximated byLDA & GGA functionals.For this reason, special approximations directly to the xc potential havebeen constructed which have explicitly built in the correct asymptotic decay.The first potential with the correct asymptotic decay is the LB94 poten-tial (Leeuwen and Baerends 1994). Later the SAOP potential (Gritsenko,Schipper and Baerends 1999; Schipper et al. 2000) was developed to also im-prove the description of the inner part of the xc potential, which is importantof the other occupied orbital energies. Unfortunately, only an expression for82he xc potential is provided and a corresponding E xc is not available. Nev-ertheless, if only the orbital energies are of interest, especially the SAOPpotential typically provides very good KS orbital energies. Exercise 2.27. Show that the LDA x-potential decays as e − / √ I r for theexact density. Exercise 2.28. Derive that ε HOMO = − I by considering the KS equationin the asymptotic limit. You first need to show that the KS orbitals indeeddecay as in (2.98) and subsequently you can use the the KS density should beequal to the exact density (by construction), so also its asymptotic behaviour. Since the KS orbitals are generated by a local potential, also the unoccupiedKS orbitals feel an N − system for large r . Note that this situation isdifferent from the unoccupied HF orbitals which feel an N particle system.This causes the HF unoccupied orbital energies to provide approximations toaffinities via Koopmans’ theorem (Koopmans 1934). Since the unoccupiedKS orbitals feel an N − system, their energies do not provide approxima-tions to affinities, but their energy differences with the occupied KS energiesprovide good approximations to local excitations, if the excited state can bewell described by a single Slater determinant. Local valence excitations arealready quite good on the LDA & GGA level, but the Rydberg excitationsare problematic. Because the LDA & GGA potentials decay too fast theunoccupied KS are too high in energy which cause the Rydberg excitationsto be unbound states on the LDA & GGA level. As you might expect, themodel potentials LB94 and SAOP are very effective to cure this deficiency ofthe LDA & GGA potentials. Especially SAOP, which is a more sophisticatedversion of LB94. These lecture notes provided a short introduction into DFT, too short tohighlight all its subtleties and difficulties. Though great care has been takento maximize validity while retaining simplicity for an introductory course,there are some incorrect assumptions which should at least be mentioned.• There are two important classes of functionals that would deserve at-tention in a more extended course on DFT. The first class of func-tionals are the orbital depend functionals. Since the KS orbitals arealso dependent on the density, orbital functionals can also be used asdensity functionals, though great care is needed when taking functionalderivatives and one typically needs to use the optimized effective poten-tial (OEP) method which is a numerically unstable procedure which83eeds additional care to stabilize. The advantage of orbital dependfunctionals is that the can be constructed in a more systematic man-ner.The second class of functionals are the so-called weighted density ap-proximations (WDAs). Instead of pretending the density to be contantaround the reference position, these density functional approximationcut the hole out of the true density (Gunnarsson, Jonson and Lun-dqvist 1977, 1979). The WDAs performs about equally good as theGGAs, but at much more computational cost, so they have been notbeen very popular. However, recently there has been a resurgence of in-terest (Bahmann and Ernzerhof 2008; Cuevas-Saavedra, Chakrabortyand Ayers 2012; Cuevas-Saavedra et al. 2012; Giesbertz, Leeuwen andBarth 2013; Antaya, Zhou and Ernzerhof 2014; Pecechtlov´a et al. 2014;Pˇrecechtˇelov´a et al. 2015).• The Levy–Lieb functional F LL [ ρ ] is not convex, so a global minimiseris not guaranteed. A suitable convex generalisation is (Lieb 1983) F L [ ρ ] = min ˆΓ → ρ Tr (cid:8) ˆΓ( ˆ T + ˆ W ) (cid:9) , (2.99)where ˆΓ = (cid:80) K w K | Ψ K (cid:105)(cid:104) Ψ K | is a density operator with w K ≥ and (cid:80) K w K = 1 .• The functional F L [ ρ ] is only differentiable at v -representable densities,so the v -representability question is still an important issue (Leeuwen2003; Lammert 2006b).• There are several regularisation techniques to deal with v represent-ability: course graining by Lammert (Lammert 2006a, 2010) and Moreau–Yosida regularisation (Kvaal et al. 2014).• In the KS construction it is assumed that all interacting N -represent-able densities are non-interacting v -representable. This is not true (Lieb1983).• The KS kinetic energy should be generalised by extending the search todensity matrices instead of only pure states. The KS system will not besolved by a single Slater determinant anymore (Lieb 1983). This caneven be an issue for real systems (Schipper, Gritsenko and Baerends1998b).• In finite basis sets the KS typically degenerates to 1RDM functionaltheory. As a simple example, consider the ground state of H in aminimal basis. The KS orbital is readily constructed from the CIdensity as φ ( r ) = (cid:113) cos ( θ ) | σ g ( r ) | + sin ( θ ) | σ u ( r ) | . (2.100)84ypically, only a complete basis will manage to reproduce this orbitalat all distances, i.e. for all θ . Otherwise, the KS system will yield asuperposition of the ( σ g ) and ( σ u ) states and reproduce even theexact 1RDM.• In the recent years much progress has been made in the investigationof the properties of the functional (Seidl, Perdew and Levy 1999; Seidl1999; Seidl, Gori-Giorgi and Savin 2007; Gori-Giorgi, Vignale and Seidl2009). W SCE [ ρ ] = min ˆΓ → ρ Tr (cid:8) ˆΓ ˆ W (cid:9) , (2.101)which can be considered as the counterpart of the KS kinetic energyfunctional. Indeed we have the obvious inequality F [ ρ ] ≥ T s [ ρ ] + W SCE [ ρ ] . (2.102)As only the interaction remains, the electrons become effectively aclassical system constrained to yield a smooth density. To minimisethe electrostatic energy, the electrons are forced to move in a strictlycorrelated manner, sometimes referred to as a ‘floating’ Wigner crys-tal. This is called the strictly correlated electrons Seidl, Perdew andLevy 1999; Seidl 1999; Seidl, Gori-Giorgi and Savin 2007; Gori-Giorgi,Vignale and Seidl 2009 (SCE) limit.An important feature is that instead of higher order derivatives ofthe density, the functional use a cumulant function as an importantingredient (at least in 1D) N e ( x ) = (cid:90) x −∞ d y ρ ( x ) . (2.103)85 ppendix A Prolate spheroidal coordinatesystem For quantum problems with two nuclei, its is often convenient to work withthe prolate spheroidal coordinate system. As there are two nuclei, a sphericalcoordinate system seems out of place as we have two nuclei which we wouldlike to place at the focus. Therefore, one rather starts from ellipses whichhave two foci at which we can place both nuclei. By adding hyperbolae weget a 2D coordinate system which is called an elliptic coordinate system.Simply revolving it around the bonding axis (long axes of the ellipses) weobtain the prolate spheroidal coordinate system.Let us first focus on the elliptic system in the plane. In Fig. A.1 weshow two points, r and r (cid:48) and the corresponding intersection ellipses andhyperbolae. The elliptic coordinates of a points r are easy to calculate fromthe distance to the foci, r A/B := | r − R A/B | , ξ := r A + r B R and η := r B − r A R , (A.1)where R := | r A − r B | is the distance between the foci.You see that each elliptic coordinate ( ξ, η ) results in an intersection inboth the upper half plane and the lower half plane, so it corresponds to twopoints. With some fiddling around, you can also establish the elliptic reversetransformation to be x = R (cid:112) ( ξ − − η ) and z = R ξη. (A.2)Note that we only recover the points in the upper half plane with this para-metrisation. To generate the full 3D space, we turn the elliptic system aroundthe z -axis with an angle φ ∈ [0 , π ) . This is called the prolate spheroidalcoordinate system. This immediately also generates the coordinates in the Turning the elliptical system around the x -axis results in the oblate spheroidal co-ordinate system. x R A R B ξ = ξ = η = η = r r a r b r (cid:48) r (cid:48) a r (cid:48) b Figure A.1: Elliptic coordinates for the points r and r (cid:48) . negative half plane r = xyz = R (cid:112) ( ξ − − η ) cos( φ ) (cid:112) ( ξ − − η ) sin( φ ) ξη . (A.3)The prolate spheroidal basis vectors are readily obtained as (cid:0) e ξ e η e φ (cid:1) = J = (cid:16) ∂ r ∂ξ ∂ r ∂η ∂ r ∂φ (cid:17) = R ξ (cid:113) − η ξ − cos( φ ) − η (cid:113) ξ − − η cos( φ ) −√ ( ξ − − η ) sin( φ ) ξ (cid:113) − η ξ − sin( φ ) − η (cid:113) ξ − − η sin( φ ) √ ( ξ − − η ) cos( φ ) η ξ . (A.4)It is readily checked that this is an orthonormal coordinate system, so wehave a diagonal metric and the scaling factors are the lengths of the basisvectors λ ξ = | e ξ | = R (cid:115) ξ − η ξ − , (A.5a) λ η = | e η | = R (cid:115) ξ − η − η , (A.5b) λ φ = | e φ | = R (cid:112) ( ξ − − η ) . (A.5c)87he volume element is now simply obtained as the product of all scalingfactors d r = λ ξ λ η λ φ d ξ d η d φ = R ξ − η ) d ξ d η d φ (A.6)and the Laplacian becomes ∇ = 1 λ ξ λ η λ φ (cid:18) ∂∂ξ λ η λ φ λ ξ ∂∂ξ + ∂∂η λ ξ λ φ λ η ∂∂η + ∂∂φ λ ξ λ η λ φ ∂∂φ (cid:19) = 4 R ( ξ − η ) (cid:20) ∂∂ξ ( ξ − ∂∂ξ + ∂∂η (1 − η ) ∂∂η + (A.7) (cid:18) ξ − − η (cid:19) ∂ ∂φ (cid:21) . ppendix B Lagrange multipliers This appendix explains how Lagrange multipliers can be used to formu-late the first order optimality conditions for an optimization problem underequality constraints, i.e. conditions that only involve first order derivatives.It is a slight modification of Appendix A from (Giesbertz 2010), with somemodifications to limit the discussion to equality constraints. The Lagrangemultiplier technique can also be generalized to inequality constraints, whichleads to the Karush–Kuhn–Tucker (KKT) conditions (Karush 1939; Kuhnand Tucker 1951).First the problem has to be formulated more mathematically min r ∈ R N f ( r ) subject to h j ( r ) = 0 for j = 1 , . . . , l, where f : R N → R and h j : R N → R are continuously differentiable. Strictly,it is not necessary for the domains of f and h j to be R N , but a more gen-eral domain would require to specify some restrictions on it, which wouldonly cloud the discussion. We will now illustrate the concept of Lagrangemultipliers with a daily problem.A student goes to the pub to meet his friend. Just after he enters thepub he sees his friend sitting at table. However, he can not go there withouta beer, so he needs to go to the bar first. He is a bit drunk already, so the bardoes not seem to be completely straight. A schematic view of the situationis given in Fig. B.1. The student enters at the origin, O , his friend is sittingat point F and the edge of the bar is described by the function h ( x, y ) = 0 .The student has to find a point B on the edge of the bar [ h ( B ) = 0 ] suchthat the distance from O to B , d OB , plus the distance from B to F , d BF isas short as possible. So we can introduce the following objective function f = d OB + d BF . (B.1) Every occurrence of he/his may be replaced by she/her, if the other gender is preferred. y bar h ( x, y ) = 0 O F xy bar h ( x, y ) = 0 O FB Figure B.1: Schematic representationof the problem. The student is locatedat O and wants to find the shortest pathto his friend at point F via the bar. Figure B.2: Graphical solution of theproblem. Without constraining B to beon the bar, all points B that give thesame total distance. The solution is theellipse that has only one point in com-mon with the bar. The problem can be solved graphically. Suppose we take the total distanceto be some fixed value. If we plot all possible combinations of d OB and d BF which sum to this fixed value, we obtain an ellipse. By increasing the sizeof this ellipse till it just hits the bar ( h ( x, y ) = 0 ), we find the optimal paththat the student should take. In Fig. B.2 we show a couple of these ellipses.The outer ellipse is just large enough to touch the edge of the bar, so this isthe point B that minimises the total distance f .Note that at the point B that the ellipse is tangent to the the bar. Infact, this is not specific for this problem. For all optimization problems withequality constraints, the objective function will be tangent to the constraints.A more mathematical way to formulate this is to say that the normal vectorsof both curves (surfaces in higher dimensions) are parallel. The normalvector of a curve or surface is given by the gradient, so this condition can beexpressed mathematically as ∇ f ( P ) + λ ∇ h ( P ) = 0 . (B.2)The unknown constant multiplier λ is known as the Lagrange multiplier andit is necessary, because the magnitudes of the two gradients might be differ-ent. This expression, including the Lagrange multiplier has a nice physicalinterpretation. Consider a particle at r and f to be its potential energy, sothe force at r is given by −∇ f ( r ) . Since the constraint prevents the particlefrom going to the unconstraint minimum, it has to generate an opposingforce, − λ ∇ h ( r ) . If the forces are not parallel to each other, there will remain90 y h = 0 f = const. −∇ f − λ ∇ h F net xy h = 0 f = const. − λ ∇ h −∇ f Figure B.3: The force of the object-ive function −∇ f and the force of theconstraint − λ ∇ h are imbalanced. A netforce F net remains. Figure B.4: At the optimal point theforces of the constraint − λ ∇ h exactlycancels the force of the objective func-tion −∇ f , so there is no net force. a net force pushing the particle to a region with a lower potential energy,without violating the constraint (Fig. B.3). At the point where the forcesexactly cancel each other, r ∗ , the particle is at a minimum of the potentialenergy f , satisfying the constraint h ( r ∗ ) = 0 . The Lagrange multiplier canbe thought of a measure how hard h ( r ∗ ) has to pull in order to balance theforce generated by f (Fig. B.4).Usually the use of Lagrange multipliers is formulated by introducing aLagrangian. It is simply defined to be the objective function f , plus all therequired equality constraints weighted by Lagrange multipliers L ( r , λ ) = f ( r ) + l (cid:88) j =1 λ j h j ( r ) . (B.3)By taking the partial derivatives with respect to all coordinates (including λ ), all the optimality conditions are obtained ∇ f ( r ) + l (cid:88) j =1 λ j ∇ h j ( r ) = , (B.4a) h j ( r ) = 0 for j = 1 , . . . , l. (B.4b) Example B.1. The problem of the student can be solved if he is not toodrunk, so the bar can be described by a simple straight line. However, dueto the square roots in the objective function, the algebra is quite formidable,so it hardly serves as an example. Therefore, as a first example we considera more simple mathematical problem min x,y ∈ R f ( x, y ) = x y subject to x + y = 3 . (B.5)91 y λ f ( x, y ) −√ − − −√ − √ −√ √ − − √ − Table B.1: All six critical points of the function f ( x, y ) = x y with x and y constraint to lay on a circle with radius √ . As the constraint function we will take h ( x, y ) = x + y − . The Lagrangiancan be written as L ( x, y, λ ) = f ( x, y ) + λh ( x, y ) = x y + λ ( x + y − . (B.6)Now we take all the partial derivates of the Lagrangian L∂L∂x = 2 xy + 2 λx = 0 , (B.7a) ∂L∂y = x + 2 λy = 0 , (B.7b) ∂L∂λ = x + y − . (B.7c)Equation (B.7a) gives x = 0 or y = − λ . In the first case Eq. (B.7c) gives y = ±√ , so by Eq. (B.7b) we have λ = 0 .In the second case Eq. (B.7b) gives x − y = 0 ⇒ x = 2 y . (B.8)Using this result in Eq. (B.7c) gives y = 3 ⇒ y = ± . (B.9)So the Lagrange multiplier is λ = ∓ and x = ±√ , where the sign of x is arbitrary. So the Lagrange equations have six critical points which aresummarised in table B.1. As can be seen from the table, the optimisationproblem has two global solutions at ( −√ , − and ( √ , − . From theHessian of the Lagrangian L it may be determined that (0 , √ is a localminimum. Example B.2. In this example we will work out the problem for the studentwhen he is not too drunk, so the bar is still straight. In this case, theconstraint function can in general be defined as h ( x, y ) = y − ( αx + h ) , (B.10)92here x and y will be the coordinates of point B . Its gradient is ∇ h ( x, y ) = ( − α, T . (B.11)An explicit expression for the objective function, i.e. the total length of thepath can be written as f ( x, y ) = d OB + d BF = (cid:112) x + y + (cid:112) ( x − c ) + y , (B.12)where c is the direct distance between the student and his friend ( c := | F − O | ). The square roots in this function might seem pretty harmless, but theywill make the task of solving this problem quite formidable, even though weonly treat a straight bar.The derivatives of f can be worked out as ∂f∂x = 2 x (cid:112) x + y + 2( x − c )2 (cid:112) ( x − c ) + y = xd OB + x − cd BF , (B.13a) ∂f∂y = 2 y (cid:112) x + y + 2 y (cid:112) ( x − c ) + y = yd OB + yd BF . (B.13b)Using these derivatives, the stationarity conditions become (at the point B = P ) ∂L∂x = ∂f∂x + λ ∂g∂x = xd OP + x − cd P C − αλ = 0 , (B.14a) ∂L∂y = ∂f∂y + λ ∂g∂y = yd OP + yd P C + λ = 0 , (B.14b) αx + h = y. (B.14c)Since the partial derivative of the Lagrangian L with respect to y [Eq. (B.14b)]directly gives an expression for the Lagrange multiplier λ in terms of x and y ,the Lagrange multiplier in Eq. (B.14a) can be eliminated. In principle, usingthe equality condition [Eq. (B.14c)] to eliminate y , we obtain an equationwith only the variable x . However, this equation is quite formidable to solvedue to all the square root terms. So feel free to skip the algebra and to jumpto the answer at the end of this section.It is convenient only to substitute for λ only and not yet for y . Theequation needs to be reordered, so that all the terms containing a squareroot, d OP and d P C , are isolated on one site of the equation (cid:18) d OP + 1 d P C (cid:19) x − cd P C + α (cid:18) d OP + 1 d P C (cid:19) y = 0 ⇔ (cid:18) d OP + 1 d P C (cid:19) ( x + αy ) = cd P C ⇔ (cid:18) d P C d OP + 1 (cid:19) ( x + αy ) = c d P C d OP = cx + αy − ⇔ (cid:115) ( x − c ) + y x + y = c − x − αyx + αy . (B.15)Now the square root can simply be eliminated by taking the square on bothsides. Later, we have to check our solution in the original equation, sincethe number of solutions of the squared equation is twice as large. Beforeeliminating y , the equation can be cleaned up a bit further ( x − c ) + y x + y = ( x + αy − c ) ( x + αy ) ⇔ (cid:0) ( x − c ) + y (cid:1) ( x + αy ) = (cid:0) x + y (cid:1) ( x + αy − c ) ⇔ ( x + y )( x + αy ) + ( c − cx )( x + αy ) = ( x + y )( x + αy ) + ( x + y ) (cid:0) c − c ( x + αy ) (cid:1) ⇔ ( c − x )( x + αy ) = ( x + y ) (cid:0) c − x + αy ) (cid:1) . (B.16)To proceed, the equality condition y = αx + h has to be inserted. However,the equation become quite formidable, so we will deal with the left- andrighthand side of the equation separately.l.h. = ( c − x ) (cid:0) x + α ( αx + h ) (cid:1) = ( c − x ) (cid:0) (1 + α ) x + αh (cid:1) = ( c − x ) (cid:0) (1 + α ) x + 2 αh (1 + α ) x + α h (cid:1) = − α ) x − αh (1 + α ) x + c (1 + α ) x − α h x + 2 αch (1 + α ) x + α ch , (B.17a)r.h. = (cid:0) x + ( αx + h ) (cid:1)(cid:0) c − x + α ( αx + h )) (cid:1) = (cid:0) (1 + α ) x + 2 αhx + h (cid:1)(cid:0) c − αh − α ) x (cid:1) = − α ) x − αh (1 + α ) x + ( c − αh )(1 + α ) x − h (1 + α ) x + 2 αh ( c − α ) x + h ( c − αh ) . (B.17b)Comparing the left- and righthand side, we see that the first two termsare equal. Therefore, we will only be left with a polynomial of order two.Subtracting the righthand side from the lefthand side gives l.h. − r.h. = (cid:2) (1 + α ) (cid:0) c (1 + α ) − ( c − αh ) (cid:1)(cid:3) x + (cid:2) α ch − h ( c − αh ) (cid:3) +2 (cid:2) cαh (1 + α ) − α h + h (1 + α ) − αh ( c − α ) (cid:3) x = (1 + α )( c + α c − c + 2 αh ) x + (cid:2) α ch − ch + 2 αh (cid:3) αch + α ch − α h + h + α h − αch + 2 α h ) x α (1 + α )( αc + 2 h ) x + 2 h ( α c + 2 α h + h ) x + h ( α c + 2 αh − c ) . (B.18)In principle the equation is now easy to solve. However, the coefficients inthe polynomial are rather cumbersome, so it is actually still a tough task.First consider the discriminant D . D = h ( α c + 2 α h + h ) + h ( c − α c − αh ) α (1 + α )( αc + 2 h )= h (cid:2) ( α c + 2 α h + h ) { α c + 2 α h + h } + (cid:0) αc − ( α c + 2 α h ) (cid:1)(cid:0) { α c + 2 α h + h } + h + αc (cid:1)(cid:3) = h (cid:2) h ( α c + 2 α h + h ) + αc ( α c + 2 α h + 2 h + αc ) − ( α c + 2 α h )( h + αc )= h ( h + 2 αch + α c )= h ( αc + h ) . (B.19)With this result for the discriminant, the final solution becomes quite simple x = − h ( α c + 2 αh + h ) + h ( αc + h ) α (1 + α )( αc + 2 h ) = h c − αh − α c (1 + α )( αc + 2 h ) . (B.20)The y coordinate of B is now simply found using the equality condition (B.14c) y = h (cid:18) α c − αh − α c (1 + α )( αc + 2 h ) + 1 (cid:19) = h αc − α h − α c + αc + 2 h + α c + 2 α h (1 + α )( αc + 2 h )= 2 h ( αc + h )(1 + α )( αc + 2 h ) . (B.21)The expression for the total distances d OP and d P C and the Lagrange mul-tiplier are quite horrendous, so we do not show them.95 cronyms one-body reduced density matrix . . . . . . . . . . . . . 37 two-body reduced density matrix . . . . . . . . . . . . . 51 a.u. atomic units: (cid:126) = e = m e = 4 π(cid:15) . . . . . . . . . . . . . 5 B3LYP Replacement of the PW91 correlation part by the LYPfunctional in the B3PW91 Kim and Jordan 1994; Stephenset al. 1994 and originally also the replacement of VWN5 byVWN3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 B3PW91 A hybrid functional by Becke Becke 1993 . . . . . . . . 81 B88 GGA approximation to exchange by Becke Becke 1988 . 79 BP86 GGA composed of B88 and the correlation functional byPerdew from 1986 Perdew 1986 . . . . . . . . . . . . . . 79 c-hole correlation hole . . . . . . . . . . . . . . . . . . . . . . . 68 CI configuration(s) interaction . . . . . . . . . . . . . . . . 9 DFT density functional theory . . . . . . . . . . . . . . . . . 49 GEA gradient expansion approximation . . . . . . . . . . . . 77 GGA generalized gradient approximation . . . . . . . . . . . . 3 GTO Gaussian type orbital . . . . . . . . . . . . . . . . . . . 47 H Hartreeclassical Coulomb . . . . . . . . . . . . . . . . . 70 HEG homogeneous electron gas . . . . . . . . . . . . . . . . . 59 HF Hartree–Fock . . . . . . . . . . . . . . . . . . . . . . . . 9 HK Hohenberg–Kohn . . . . . . . . . . . . . . . . . . . . . . 55 KKT Karush–Kuhn–Tucker . . . . . . . . . . . . . . . . . . . 89 HOMO highest occupied molecular orbital . . . . . . . . . . . . 40 KS Kohn–Sham . . . . . . . . . . . . . . . . . . . . . . . . . 3 LB94 The model xc potential by Van Leeuwen andBaerends Leeuwen and Baerends 1994 . . . . . . . . . . 82 LDA local density approximation . . . . . . . . . . . . . . . . 396 UMO lowest unoccupied molecular orbital . . . . . . . . . . . 40 LYP GGA correlation functional by Lee, Yang and Parr Lee, Yangand Parr 1988 . . . . . . . . . . . . . . . . . . . . . . . . 81 meta-GGA A functional one step beyond the GGA, so it also includesthe Laplacian of the density . . . . . . . . . . . . . . . . 79 MP M ø ller–Plesset . . . . . . . . . . . . . . . . . . . . . . . 35 OEP optimized effective potential . . . . . . . . . . . . . . . . 83 PBE The GGA by Perdew, Burke and Ernzerhof Perdew, Burkeand Ernzerhof 1996 . . . . . . . . . . . . . . . . . . . . . 79 PW91 The GGA by Perdew and Wang Wang and Perdew 1991;Perdew et al. 1992 . . . . . . . . . . . . . . . . . . . . . 79 RHF restricted HF . . . . . . . . . . . . . . . . . . . . . . . . 37 ROHF restricted open shell HF . . . . . . . . . . . . . . . . . . 37 SAOP The statistical averaging of (model) orbitalpotentials Gritsenko, Schipper and Baerends 1999; Schipperet al. 2000 . . . . . . . . . . . . . . . . . . . . . . . . . . 82 SCAN Strongly Constrained and Appropriately Normed Sun,Ruzsinszky and Perdew 2015 . . . . . . . . . . . . . . . 79 SCE strictly correlated electrons Seidl, Perdew and Levy 1999;Seidl 1999; Seidl, Gori-Giorgi and Savin 2007; Gori-Giorgi,Vignale and Seidl 2009 . . . . . . . . . . . . . . . . . . . 85 SCF self-consistent field . . . . . . . . . . . . . . . . . . . . . 35 STO Slater type orbital . . . . . . . . . . . . . . . . . . . . . 46 TPSS The meta-GGA by Tao, Perdew, Staroverov and Scuseria Taoet al. 2003 . . . . . . . . . . . . . . . . . . . . . . . . . . 79 UHF unrestricted HF . . . . . . . . . . . . . . . . . . . . . . . 37 VWN Vosko, Wilk and Nusair . . . . . . . . . . . . . . . . . . 74 WDA weighted density approximation . . . . . . . . . . . . . . 84 x-hole exchange hole . . . . . . . . . . . . . . . . . . . . . . . . 68 xc exchange-correlation . . . . . . . . . . . . . . . . . . . . 5897 ibliography Born, M. and R. Oppenheimer (Dec. 1927). “Zur Quantentheorie der Molekeln”.German. In: Ann. Physik doi : 10 . 1002 / andp .19273892002 .Abedi, Ali, Neepa T. Maitra and E. K. U. Gross (Sept. 2010). “Exact Fac-torization of the Time-Dependent Electron-Nuclear Wave Function”. In: Phys. Rev. Lett. doi : 10 . 1103 / PhysRevLett . 105 .123002 .Kyl¨anp¨a¨a, Ilkka and Tapio T Rantala (Sept. 2011). “First-principles simula-tion of molecular dissociation-recombination equilibrium”. In: J. Chem.Phys. doi : .Fierz, Markus (1939). “ ¨Uber die relativistische Theorie kr¨aftefreier Teilchenmit beliebigem Spin”. German. In: Helv. Phys. Acta doi : .Pauli, W. (Oct. 1940). “The Connection Between Spin and Statistics”. In: Phys. Rev. Phys. Rev. doi : .Heisenberg, W. (June 1926). “Mehrk¨orperproblem und Resonanz in der Quant-enmechanik”. German. In: Z. Phys. doi : .Dirac, P. A. M. (Oct. 1926). “On the Theory of Quantum Mechanics”. In: Proc. Roy. Soc. A url : .Stefanucci, Gianluca and Robert van Leeuwen (2013). Nonequilibrium Many-Body Theory of Quantum Systems: A Modern Introduction . New York:Cambridge University Press. isbn : 978-0-521-76617-3.Galerkin, Boris G. (1915). “Стержни и пластинки. Ряды в некоторыхвопросах упругого равновесия стержней и пластинок”. In: Vestnik In-zhenerov i Tekhnikov 19. (English translation: NTIS Rept. TT-63-18924),pp. 897–908.Hartree, D. R. (Jan. 1928). “The Wave Mechanics of an Atom with a Non-Coulomb Central Field. Part I. Theory and Methods”. In: Math. Proc.Cambr. Phil. Soc. doi : .98ock, V. (Jan. 1930). “N¨aherungsmethode zur L¨osung des quantenmech-anischen Mehrk¨orperproblems”. In: Z. Phys. doi : .Condon, E. U. (Oct. 1930). “The Theory of Complex Spectra”. In: Phys. Rev. doi : .L¨owdin, P.-O. (Mar. 1955). “Quantum Theory of Many-Particle Systems.I. Physical Interpertations by Means of Density Matrices, Natural Spin-Orbitals, and Convergence Problems in the Method of ConfigurationalInteraction”. In: Phys. Rev. doi : .Mulliken, R. S. (1949). French. In: J. Chim. Phys. 46, pp. 497–542.Nocedal, Jorge and Stephen J. Wright (2006). Numerical Optimization . Ed.by Thomas V. Mikosch, Sidney I. Resnick and Stephen M. Robinson.2nd. Springer Series in Operations Research and Financial Engineering.233 Spring Street, New York, NY 10013, USA: Springer New York. isbn :978-0387-30303-1. doi : .Dawkins, Paul (n.d.). Paul’s Online Math Notes . url : http://tutorial.math.lamar.edu/Classes/CalcIII/LagrangeMultipliers.aspx .L¨owdin, Per-Olov (Mar. 1950). “On the Non-Orthogonality Problem Connec-ted with the Use of Atomic Wave Functions in the Theory of Moleculesand Crystals”. In: J. Chem. Phys. doi : .Roothaan, C. C. J. (Apr. 1951). “New Developments in Molecular OrbitalTheory”. In: Rev. Mod. Phys. doi : .Hall, G. G. (Mar. 1951). “The Molecular Orbital Theory of Chemical Valency.VIII. A Method of Calculating Ionization Potentials”. In: Proc. R. Soc.Lond. A doi : .Mulliken, R. S. (Oct. 1955a). “Electronic Population Analysis on LCAO–MOMolecular Wave Functions. I”. In: J. Chem. Phys. doi : .— (Oct. 1955b). “Electronic Population Analysis on LCAO–MO MolecularWave Functions. II. Overlap Populations, Bond Orders, and CovalentBond Energies”. In: J. Chem. Phys. doi : .Reed, Alan E., Robert B. Weinstock and Frank Weinhold (July 1985). “Nat-ural population analysis”. In: J. Chem. Phys. doi : .Bader, R. F. W. (1990). Atoms in Molecules - A Quantum Theory . Oxford,UK: Oxford University Press.Voronoi, G. (1908). “Nouvelles applications des param`etres continus `a lath´eorie des formes quadratiques. Deuxi`eme m´emoire. Recherches sur lesparall´ello`edres primitifs.” French. In: J. reine angew. Math. doi : .99ickelhaupt, F. Matthias et al. (1996). “The Carbon-Lithium Electron PairBond in (CH Li) n ( n = 1 , , )”. In: Organometallics doi : .Fonseca Guerra, C´elia et al. (2004). “Voronoi deformation density (VDD)charges: Assessment of the Mulliken, Bader, Hirshfeld, Weinhold, andVDD methods for charge analysis”. In: J. Comput. Chem. doi : .Giesbertz, K. J. H. and E. J. Baerends (May 2010). “Aufbau derived from aunified treatment of occupation numbers in Hartree–Fock, Kohn–Sham,and natural orbital theories with the Karush–Kuhn–Tucker conditionsfor the inequality constraints n i ≤ and n i ≥ ”. In: J. Chem. Phys. doi : .Koopmans, T. (1934). “ ¨Uber die Zuordnung von Wellenfunktionen und Ei-genwerten zu den Einzelnen Elektronen Eines Atoms”. In: Physica doi : .Bach, Volker et al. (May 1994). “There are no unfilled shells in unrestrictedHartree-Fock theory”. In: Phys. Rev. Lett. doi : .Grassmann, Hermann (1862). Die Ausdehnungslehre. Vollstandig Und inStrenger Form . German. Ed. by T. C. F. Enslin. Berlin.Dirac, P. A. M. (July 1939). “A new notation for quantum mechanics”. In: Math. Proc. Cambridge doi : .Boys, S. F. (1950). “Electronic Wave Functions. I. A General Method ofCalculation for the Stationary States of Any Molecular System”. In: Proc.Roy. Soc. A doi : .Atkins, Peter and Ronald Friedman (2010). Molecular Quantum Mechanics .5th ed. USA: Oxford University Press. isbn : 978-0199541423.Sheng, Xiao Wei et al. (Oct. 2011). “Counterpoise Correction is Not Usefulfor Short and Van derWaals Distances but May Be Useful at Long Range”.In: J. Comput. Chem. doi : .Mayer, J. E. (Dec. 1955). “Electron Correlation”. In: Phys. Rev. doi : .Tredgold, R. H. (Mar. 1957). “Density Matrix and the Many-Body Problem”.In: Phys. Rev. doi : .Mizuno, Y. and T. Izuyama (July 1957). “Remarks on Mayer’s ReducedDensity Matrix Method”. In: Prog. Theor. Phys. doi : 10 .1143/PTP.18.33 .Ayres, R. U. (Sept. 1958). “Variational Approach to the Many-Body Prob-lem”. In: Phys. Rev. doi : .Bopp, Fritz (Aug. 1959). “Ableitung der Bindungsenergie von N -Teilchen-Systemen aus 2-Teilchen-Dichtematrizen”. German. In: Z. Phys. doi : .Coleman, A. J. (1963). “Structure of Fermion Density Matrices”. In: Rev.Mod. Phys. doi : .100oulson, C. A. (Apr. 1960). “Present State of Molecular Structure Calcula-tions”. In: Rev. Mod. Phys. doi : .Klyachko, Alexander A. (2006). “Quantum marginal problem and N-representability”.In: J. Phys. Conf. Ser. doi : .Hohenberg, P. and W. Kohn (Nov. 1964). “Inhomogeneous Electron Gas”.In: Phys. Rev. doi : .Kohn, W. (1985). “Density Functional Theory: Fundamentals and Applic-ations”. In: Highlights of condensed-matter theory . Ed. by F. Bassani,F. Fumi and M. P. Tosi. Vol. LXXXIX. International School of PhysicsEnrico Fermi. North-Holland, Amsterdam, p. 1.Dreizler, R. M. and E. K. U. Gross (1990). Density Functional Theory:An Approach to the Quantum Many-Body Problem . Berlin Heidelberg:Springer-Verlag. isbn : 3-540-51993-9.Lieb, Elliott H. (Sept. 1983). “Density Functionals for CouIomb Systems”. In: Int. J. Quantum Chem. doi : .Cohen, Aron J. and Paula Mori-S´anchez (Apr. 2016). “Landscape of an exactenergy functional”. In: Phys. Rev. A doi : 10 . 1103 /PhysRevA.93.042511 .Englisch, H. and R. Englisch (Aug. 1983). “Hohenberg–Kohn theorem andnon- V -representable densities”. In: Physica A doi : .Levy, M. (Dec. 1979). “Universal variational functionals of electron densities,first order density matrices, and natural-spinorbitals and solutions ofthe v -representability problem”. In: Proc. Natl. Acad. Sci. USA doi : .Harriman, John E. (Aug. 1981). “Orthonormal orbitals for the representationof an arbitrary density”. In: Phys. Rev. A doi : .Englisch, H. and R. Englisch (1984a). “Exact Density Functionals for Ground-State Energies. I. General Results”. In: Phys. Stat. Sol. (b) doi : .— (July 1984b). “Exact Density Functionals for Ground-State Energies II.Details and Remarks”. In: Phys. Stat. Sol. (b) doi : 10 .1002/pssb.2221240140 .Leeuwen, Robert van (2003). “Density functional approach to the many-bodyproblem: key concepts and exact functionals”. In: Adv. Quant. Chem. Ed.by John R. Sabin and Erkki J. Braendas. Vol. 43. Academic Press, p. 25. doi : .Eschrig, H. (2003). The Fundamentals of Density Functional Theory (revisedand extended version) . Teubner. isbn : 3815430305.101ammert, Paul E. (Feb. 2006b). “Differentiability of Lieb Functional in Elec-tronic Density Functional Theory”. In: Int. J. Quantum Chem. doi : .— (Aug. 2006a). “Coarse-grained V representability”. In: J. Chem. Phys. doi : .— (July 2010). “Well-behaved coarse-grained model of density-functionaltheory”. In: Phys. Rev. A doi : .Kvaal, Simen et al. (Mar. 2014). “Differentiable but exact formulation ofdensity-functional theory”. In: J. Chem. Phys. doi : .Chayes, J. T., L. Chayes and Mary Beth Ruskai (Feb. 1985). “Density Func-tional Approach to Quantum Lattice Systems”. In: J. Stat. Phys. doi : .Seidl, Michael, John P. Perdew and Mel Levy (Jan. 1999). “Strictly correlatedelectrons in density-functional theory”. In: Phys. Rev. A doi : .Seidl, Michael (Dec. 1999). “Strong-interaction limit of density-functionaltheory”. In: Phys. Rev. A doi : .Seidl, Michael, Paola Gori-Giorgi and Andreas Savin (Apr. 2007). “Strictlycorrelated electrons in density-functional theory: A general formulationwith applications to spherical densities”. In: Phys. Rev. A doi : .Gori-Giorgi, Paola, Giovanni Vignale and Michael Seidl (Mar. 2009). “Elec-tronic Zero-Point Oscillations in the Strong-Interaction Limit of DensityFunctional Theory”. In: J. Chem. Theory Comput. doi : .Feynman, Richard P. (Nov. 2005). Classic Feynman: All the Adventures ofa Curious Character . Ed. by Ralph Leighton. W. W. Norton. isbn : 978-0-393-06132-1.Thomas, L. H. (Jan. 1927). “The calculation of atomic fields”. In: Math. Proc.Cambridge Phil. Soc. doi : .Fermi, E. (1927). “Un metodo statistico per la determinazione di alcuneproprieta dell atomo”. In: Rend. Accad. Naz. Licei 6, pp. 602–607.Teller, Edward (Oct. 1962). “On the Stability of Molecules in the Thomas–Fermi Theory”. In: Rev. Mod. Phys. doi : .Kohn, W. and L. J. Sham (Nov. 1965). “Self-Consistent Equations IncludingExchange and Correlation Effects”. In: Phys. Rev. doi : .Levy, M. (Sept. 1982). “Electron densities in search of Hamiltonians”. In: Phys. Rev. A doi : .102chipper, P. R. T., O. V. Gritsenko and E. J. Baerends (Sept. 1998a). “One-determinantal pure state versus ensemble Kohn–Sham solutions in thecase of strong electron correlation: CH and C ”. In: Theor. Chem. Acc. 99, p. 329. doi : .Feynman, R. P. (1972). “Statistical Mechanics”. In: Benjamin, Reading, p. 249.Almbladh, C. O. (1972). Tech. rep. Univeristy of Lund.Langreth, D. C. and J. P. Perdew (Dec. 1975). “The exchange-correlationenergy of a metallic surface”. In: Solid State Commun. doi : .Gunnarsson, O. and B. I. Lundqvist (May 1976). “Exchange and correlationin atoms, molecules, and solids by the spin-density-functional formalism”.In: Phys. Rev. B , 6006 (1977), p. 4274. doi : .Dirac, P. A. M. (July 1930). “Note on Exchange Phenomena in the ThomasAtom”. In: Math. Proc. Cambridge Phil. Soc. doi : .Slater, J. C. (Feb. 1951). “A simplification of the Hartree–Fock method”. In: Phys. Rev. doi : .Gori-Giorgi, Paola and John P. Perdew (Sept. 2001). “Short-range correlationin the uniform electron gas: Extended Overhauser model”. In: Phys. Rev.B doi : .Ceperley, D. M. and B. J. Alder (Aug. 1980). “Ground State of the ElectronGas by a Stochastic Method”. In: Phys. Rev. Lett. doi : .Ortiz, G., M. Harris and P. Ballone (June 1999). “Zero Temperature Phasesof the Electron Gas”. In: Phys. Rev. Lett. doi : .Wang, Yue and John P. Perdew (Dec. 1991). “Correlation hole of the spin-polarized electron gas, with exact small-wave-vector and high-densityscaling”. In: Phys. Rev. B doi : .Gori-Giorgi, Paola and John P. Perdew (Oct. 2002). “Pair distribution func-tion of the spin-polarized electron gas: A first-principles analytic modelfor all uniform densities”. In: Phys. Rev. B doi : .Vosko, S. H., L. Wilk and M. Nusair (1980). “Accurate spin-dependent elec-tron liquid correlation energies for local spin density calculations: a crit-ical analysis”. In: Can. J. Phys. doi : .Perdew, John P. (Oct. 1985). “Accurate Density Functional for the Energy:Real-Space Cutoff of the Gradient Expansion for the Exchange Hole”.In: Phys. Rev. Lett. , 2370, pp. 1665–1668. doi : . 103 (June 1986). “Density-functional approximation for the correlation en-ergy of the inhomogeneous electron gas”. In: Phys. Rev. B , 7406 (1986), pp. 8822–8824. doi : .Perdew, John P., Kieron Burke and Yue Wang (Dec. 1996). “Generalizedgradient approximation for the exchange-correlation hole of a many-electron system”. In: Phys. Rev. B doi : 10 .1103/PhysRevB.54.16533 .Becke, A. D. (Sept. 1988). “Density-functional exchange-energy approxima-tion with correct asymptotic behavior”. In: Phys. Rev. A doi : .Perdew, John P. et al. (Sept. 1992). “Atoms, molecules, solids, and surfaces:Applications of the generalized gradient approximation for exchange andcorrelation”. In: Phys. Rev. B , 4978 (1993), p. 6671. doi : .Perdew, John P., Kieron Burke and Matthias Ernzerhof (Oct. 1996). “Gener-alized Gradient Approximation Made Simple”. In: Phys. Rev. Lett. , 1396 (1997), pp. 3865–3868. doi : .Sun, Jianwei et al. (Sept. 2013). “Density Functionals that Recognize Cova-lent, Metallic, and Weak Bonds”. In: Phys. Rev. Lett. doi : .Tao, Jianmin et al. (Sept. 2003). “Climbing the Density Functional Ladder:Nonempirical Meta–Generalized Gradient Approximation Designed forMolecules and Solids”. In: Phys. Rev. Lett. doi : 10 .1103/PhysRevLett.91.146401 .Becke, Axel D. (Sept. 1998). “A new inhomogeneity parameter in density-functional theory”. In: J. Chem. Phys. doi : .— (Apr. 1993). “Density-functional thermochemistry. III. The role of exactexchange”. In: J. Chem. Phys. doi : .Almbladh, C.-O. and U. von Barth (Mar. 1985). “Exact results for thecharge and spin densities, exchange-correlation potentials, and density-functional eigenvalues”. In: Phys. Rev. B doi : .Chong, D. P., O. V. Gritsenko and E. J. Baerends (Feb. 2002). “Interpreta-tion of the Kohn – Sham orbital energies as approximate vertical ioniza-tion potentials”. In: J. Chem. Phys. doi : .Gritsenko, O. V. and E. J. Baerends (Nov. 2002). “The analog of Koopman’stheorem in spin-density functional theory”. In: J. Chem. Phys. doi : .Leeuwen, R. van and E. J. Baerends (Apr. 1994). “Exchange-correlation po-tential with correct asymptotic behavior”. In: Phys. Rev. A doi : .104ritsenko, O. V., P. R. T. Schipper and E. J. Baerends (Mar. 1999). “Ap-proximation of the exchange-correlation Kohn–Sham potential with astatistical average of different orbital model potentials”. In: Chem. Phys.Lett. doi : .Schipper, P. R. T. et al. (Jan. 2000). “Molecular calculations of excitationenergies and (hyper)polarizabilities withe a statistical average of orbitalmodel exchange-correlation potentials”. In: J. Chem. Phys. doi : .Gunnarsson, O., M. Jonson and B. I. Lundqvist (Dec. 1977). “Exchange andcorrelation in inhomogeneous electron systems”. In: Solid State Commun. doi : .— (Oct. 1979). “Descriptions of exchange and correlation effects in inhomo-geneous electron systems”. In: Phys. Rev. B doi : .Bahmann, Hilke and Matthias Ernzerhof (June 2008). “Generalized-gradientexchange-correlation hole obtained from a correlation factor ansatz”. In: J. Chem. Phys. doi : .Cuevas-Saavedra, Rogelio, Debajit Chakraborty and Paul W. Ayers (Apr.2012). “Symmetric two-point weighted density approximation for ex-change energies”. In: Phys. Rev. A doi : 10 . 1103 /PhysRevA.85.042519 .Cuevas-Saavedra, Rogelio et al. (Oct. 2012). “Symmetric Nonlocal WeightedDensity Approximations from the Exchange-Correlation Hole of the Uni-form Electron Gas”. In: J. Comput. Theory Chem. doi : .Giesbertz, K. J. H., R. van Leeuwen and U. von Barth (Feb. 2013). “To-wards nonlocal density functionals by explicit modeling of the exchange-correlation hole in inhomogeneous systems”. In: Phys. Rev. A doi : .Antaya, H´el`ene, Yongxi Zhou and Matthias Ernzerhof (Sept. 2014). “Approx-imating the exchange energy through the nonempirical exchange-factorapproach”. In: Phys. Rev. A doi : functional .Pecechtlov´a, Jana et al. (Sept. 2014). “A non-empirical correlation factormodel for the exchange-correlation energy”. In: J. Chem. Phys. doi : .Pˇrecechtˇelov´a, Jana Pavl´ıkov´a et al. (Oct. 2015). “Design of exchange-correlationfunctionals through the correlation factor approach”. In: J. Chem. Phys. doi : .Schipper, P. R. T., O. V. Gritsenko and E. J. Baerends (Sept. 1998b). “One-determinantal pure state versus ensemble Kohn-Sham solutions in thecase of strong electron correlation: CH and C ”. In: Theor. Chem. Acc. 99, p. 329. doi : .Giesbertz, K. J. H. (Nov. 2010). “Time-Dependent One-Body Reduced Dens-ity Matrix Functional Theory; Adiabatic Approximations and Beyond”.105hD thesis. De Boelelaan 1105, Amsterdam, The Netherlands: Vrije Uni-versiteit. isbn : 978-90-9025619-1.Karush, W. (1939). “Minima of Functions of Several Variables with Inequal-ities as Side Constraints”. MA thesis. Chicago, Illinois.: Dept. of Math-ematics, Univ. of Chicago.Kuhn, H. W. and A. W. Tucker (1951). “Nonlinear programming”. In: Pro-ceedings of 2nd Berkeley Symposium . Ed. by J. Neyman. Berkeley: Uni-versity of California Press., p. 481.Kim, K. and K. D. Jordan (Oct. 1994). “Comparison of Density Functionaland MP2 Calculations on the Water Monomer and Dimer”. In: J. Phys.Chem. doi : .Stephens, P. J. et al. (Nov. 1994). “Ab Initio Calculation of VibrationalAbsorption and Circular Dichroism Spectra Using Density FunctionalForce Fields”. In: J. Phys. Chem. doi : .Lee, Chengteh, Weitao Yang and Robert G. Parr (Jan. 1988). “Developmentof the Colle–Salvetti correlation-energy formula into a functional of theelectron density”. In: Phys. Rev. B doi : 10 . 1103 /PhysRevB.37.785 .Sun, Jianwei, Adrienn Ruzsinszky and John P. Perdew (July 2015). “StronglyConstrained and Appropriately Normed Semilocal Density Functional”.In: Phys. Rev. Lett. doi :10.1103/PhysRevLett.115.036402