General technique for analytical derivatives of post-projected Hartree-Fock
GGeneral technique for analytical derivatives of post-projected Hartree-Fock
Takashi Tsuchimochi a) and Seiichiro Ten-no
1, 2, b) Graduate School of Science, Technology, and Innovation, Kobe University, Kobe,Hyogo 657-0025 Japan Graduate School of System Informatics, Kobe University, Kobe, Hyogo 657-0025 Japan
In electronic structure theory, the availability of analytical derivative is one of the desired features for amethod to be useful in practical applications, as it allows for geometry optimization as well as computationof molecular properties. With the recent advances in the development of symmetry-projected Hartree-Fock(PHF) methods, we here aim at further extensions by devising the analytic gradients of post-PHF approacheswith a special focus on spin-extended (spin-projected) configuration interaction with single and double substi-tutions (ECISD). Just like standard single-reference methods, the mean-field PHF part does not require thecorresponding coupled-perturbed equation to be solved, while the correlation energy term needs the orbitalrelaxation effect to be accounted for, unless the underlying molecular orbitals are variationally optimized inthe presence of the correlation energy. We present a general strategy for post-PHF analytical gradients, whichclosely parallels that for single-reference methods, yet addressing the major difference between them. Thesimilarity between ECISD and multi-reference CI not only in the energy but also in the optimized geometryis clearly demonstrated by the numerical examples of ozone and cyclobutadiene.
I. Introduction
Recently, we have proposed and developed novel wavefunction methods based on symmetry-projected Hartree-Fock (PHF) in analogy with the traditional single ref-erence approaches.
It appears PHF successfully treatsthe major static (nondynamical) correlation effect aris-ing due to electronic degeneracies in a system, by break-ing and restoring the symmetries that a wave functionought to possess. The key feature of symmetry-breakingis that a broken-symmetry HF determinant such as spin-unrestricted HF (UHF) can be written as a linear combi-nation of multiple configuration state functions, each ofwhich is highly multi-reference (MR) possessing a propersymmetry. As a result, such an “effectively MR” pic-ture hidden in a broken-symmetry Slater determinantis potentially capable of providing an efficient meansto account for static correlation.
A symmetry pro-jection operator then eliminates the undesired compo-nents with the irrelevant symmetries, retrieving a gen-uine
MR wave function with the designated symmetry.Hence, PHF stands as an improved alternative to RHFand UHF, while being black-box unlike most traditionalMR methods. This fact has motivated us to developpost-PHF methods to account for the residual dynami-cal correlation by applying perturbation theory (EMP2)or configuration interaction (ECI), without necessiat-ing very expensive canonical MR calculations. Combin-ing PHF with density functional correlations was alsoshown to improve the quantitative accuracy by Garzaand coworkers. Another direction of post-PHF devel-opments includes a time-dependent extension (TDPHF)and introduction of spin-flip (SF) excitations forexcited states. However, most benchmark calculations in a) Electronic mail: [email protected] b) Electronic mail: [email protected] these works have focused only on energetics, and molec-ular properties such as dipole moment have been hardlystudied. This is due to the fact that all these post-PHFmethods do not satisfy the Hellmann-Feynman theorem,despite they have well-defined wave functions or energyfunctionals.To account for the non-Hellmann-Feynman contribu-tion, one needs to find the energy derivative with re-spect to an infinitesimal perturbation. To see this,let us introduce a one-electron perturbation λ ˆ O = (cid:80) µν (cid:104) φ µ | λ ˆ O | φ ν (cid:105) a † µ a ν into the Hamiltonian ˆ H ,ˆ H ( λ ) = ˆ H + λ ˆ O, (1)where we assume basis functions do not depend on λ forthe sake of simplicity. The Taylor expansion of E ( λ ) = (cid:104) Ψ | ˆ H ( λ ) | Ψ (cid:105) around λ = 0 becomes E ( λ ) = E (0) + λ d E dλ (cid:12)(cid:12)(cid:12)(cid:12) λ =0 + 12 λ d E dλ (cid:12)(cid:12)(cid:12)(cid:12) λ =0 + · · · . (2)Since approximate wave functions do not in general sat-isfy the Hellmann-Feynman theorem, the expectationvalue of ˆ O is different from the first-order energy deriva-tive. It is widely accepted that the derivative approachis more appropriate because it accounts for the responseof perturbation. The one-electron property with ˆ O canbe computed by d E ( λ ) dλ (cid:12)(cid:12)(cid:12)(cid:12) λ =0 = (cid:88) µν (cid:104) φ µ | ˆ O | φ ν (cid:105) D rel νµ , (3)where D rel is the relaxed density. The molecular proper-ties computed without the appropriate response correc-tion are known to be inaccurate for approximate meth-ods, and the importance of the non-Hellmann-Feynmancontribution in calculation of molecular properties is welldocumented. PHF does not suffer from this problem since it satisfiesthe Hellmann-Feynman theorem if the λ dependence is a r X i v : . [ phy s i c s . c h e m - ph ] D ec not present in basis functions. If, on the other hand, λ is anuclear displacement x , then PHF also requires the non-Hellmann-Feynman term, the so-called Pulay force. Yet, the fully variational nature of the PHF wave functionmakes the derivation straightforward without the neces-sity of solving the coupled-perturbed (CP) equations or equivalently the Z-vector equation for PHF. In fact,the analytical nuclear derivative of PHF was first derivedby Schutski et al. except for the point-group symme-try projection, which was later incorporated by Uejimaand Ten-no. The latter authors also derived the nucleargradient for the projection-after-variation scheme, wherethe orbitals used are variationally determined by UHF,and therefore the corresponding CPHF equation was re-quired in general with less computational cost for theself-consistent field (SCF) part.Taking these works into consideration, our next stepis to devise the analytical gradients of post-PHF meth-ods with respect to nuclear displacements, because notonly is it straightforward to obtain the relaxed densitymatrix D rel from the resulting expressions, but also itwill enable us to perform geometry optimization. In thispaper, therefore we will give a detailed discussion onthe analytical derivative technique with a special focuson the recently developed ECI with singles and doubles(ECISD) as an illustrative example. Hence, the projec-tion operator we consider is of collinear spin-symmetry,i,e, the reference is spin-projected UHF (SUHF); thisis reasonable as the most post-PHF schemes developedso far are based on collinear spin-projection. Nonethe-less, we first give a general perspective for post-PHFgradients, including other symmetry-projections, in or-der to point out that it is straightforward to generalizethe working equations of ECISD although the resultingderivations would be necessarily cumbersome. Also thestrategy outlined below is applicable to any other levelsof theory than ECISD; for example, geometry optimiza-tion of TDPHF excited states are possible. Their spe-cific derivations and performances will be discussed inthe forthcoming papers.This paper is organized as follows. In Section II A, wewill outline the general workflow for analytical derivativeof post-PHF methods, with some notes on the differencefrom post-HF ones. SUHF, the nonorthogonal Wick the-orem, and ECISD are briefly reviewed in Section II B todefine our notations used throughout this paper. Sec-tion II C and II D provide the working equations for theECISD orbital gradient and CPPHF equations. SectionII E deals with the explicit dependence of the energy onthe nuclear displacement. A short remark on the size-consistent correction to ECISD is given in Section II G.Finally, we demonstrate the validity and performance ofECISD energy gradients for ozone and cyclobutadiene. II. TheoryA. General view for analytical derivative of post-PHFmethods
We first consider the analytical nuclear gradients ofpost-PHF methods whose total energy E may be varia-tional with respect to particle-hole amplitudes c I , e.g.,CI coefficients. Due to the restriction that MO coeffi-cients C are optimal with PHF energy E PHF but notwith E itself, one has to take into account the effect of thepossible orbital change upon a small perturbation x , ei-ther explicitly or implicitly. One way is to formulate andsolve linear equations to compute ∂ C ∂x for 3 N atom times,just like CPHF. For traditional post-HF methods, Handyand Schaefer have shown that these linear equations canbe simplified to only one set of equations, using their Z-vector method. A different formulation of the Z-vectormethod was realized by Helgaker et al., which usesthe Lagrangian multiplier approach and has later gainedpopularity in electronic structure theory because it hasextended the applicability of the Z-vector method to anyorder of energy derivatives.
We will closely followHelgaker’s scheme as it is intuitive, simple, and equallyapplicable to PHF based methods.In our problem, C is determined by minimizing E PHF instead of E HF under the condition that orbitals are mu-tually orthonormal. This means the generalized Brillouintheorem holds; given the PHF energy E PHF = (cid:104) Ξ J , M | ˆ H | Ξ J , M (cid:105)(cid:104) Ξ J , M | Ξ J , M (cid:105) , (4)with the corresponding PHF state | Ξ J , M (cid:105) = (cid:88) K f K ˆ P JMK | Φ (cid:105) , (5)the PHF Fock matrix elements are all zero at the self-consistency of a PHF state | Ξ J , M (cid:105) : F ai = (cid:80) KK (cid:48) f ∗K f K (cid:48) (cid:104) Φ | a † i a a ( ˆ H − E PHF ) ˆ P JKK (cid:48) | Φ (cid:105) (cid:80) KK (cid:48) f ∗K f K (cid:48) (cid:104) Φ | ˆ P JKK (cid:48) | Φ (cid:105) = 0 , (6)where we have used the fact that ˆ P JMK is commutablewith ˆ H and is orthogonal to each other. Here | Φ (cid:105) isa broken-symmetry HF determinant and J , K , · · · rep-resent different symmetry components. Also, through-out this paper, we will use the spin -orbitals unless oth-erwise noted, with i, j, ... to indicate occupied orbitals, a, b, ... virtual orbitals, p, q, ... general orbitals, and µ, ν, ... atomic orbitals (AO). Therefore, generally, each indexruns over both α and β spins. We should mention that,in our notation, a matrix may be represented in eitherAO or MO, depending on the characters used for indices.In addition to the orbital condition Eq.(6), f K are alsovariationally determined by diagonalizing the Hamilto-nian matrix H KK (cid:48) = (cid:104) Φ | ˆ H ˆ P KK (cid:48) | Φ (cid:105) under metric N KK (cid:48) = (cid:104) Φ | ˆ P KK (cid:48) | Φ (cid:105) . Hence, similarly to many post-HF methods,we set up the following Lagrangian, L = E [ c , C , f ] + (cid:88) ia ( z ai F ai + z ia F ia )+ (cid:88) KK (cid:48) (cid:104) Z K ( H KK (cid:48) − E PHF N KK (cid:48) ) f K (cid:48) − Λ ( N KK (cid:48) f ∗K f K (cid:48) − (cid:105) − Tr[ (cid:15) (cid:0) C † SC − (cid:1) ] , (7)where z = z † and Z are the Lagrange multipliers to ac-count for the first-order relaxation effect of C and f inthe presence of correlation, and therefore will play a roleof response. Note that nonzero z and Z are due to theabsence of the Hellmann-Feynman theorem in E . Thedimension of z changes depending on the PHF schemeemployed; for non-collinear PHF, the αβ and βα com-ponents are nonzero, while for collinear SUHF, they areredundant and necessarily zero. The terms with Λ and (cid:15) in Eq.(7) are added due to the orthonormal conditionof f and C under metric N and AO overlap matrix S .Finally, note that the normalization condition of c is notexplicitly treated here for the sake of simplicity, but it isimplicitly included in the denominator of E .With parameters c determined variationally, i.e., ∂ E ∂ c =0, we find ∂ L ∂c I = ∂ L ∂(cid:15) pq = ∂ L ∂z ia = ∂ L ∂ Z K = ∂ L ∂ Λ = 0 , (8)and therefore using the chain-rule d E dx ≡ d L dx = ∂ L ∂x + ∂ L ∂ C d C dx + ∂ L ∂ f d f dx . (9)The troublesome d C dx and d f dx terms shall not enter theequation if we enforce ∂ L ∂ C = ∂ L ∂ f = .Below, instead of using C itself, we parametrize C as C [ κ ] = C exp( κ ) , (10)for convenience, where κ is an anti-Hermitian matrix andexp( κ ) performs an orbital rotation from the reference C , which is set to constant. Then our task boils downto finding appropriate Lagrange multipliers z , Z , Λ, and (cid:15) by requiring ∂ L ∂ κ = ∂ L ∂ κ ∗ = ∂ L ∂ f = . Henceforth, we willonly consider κ ∗ derivatives, as κ derivatives are justtheir complex conjugates.The orbital rotation on a projected wave function isexpressed simply by ˆ P e ˆ κ | Ψ (cid:105) withˆ κ = (cid:88) pq κ pq ˆ E pq , (11)where | Ψ (cid:105) is the underlying broken-symmetry wave func-tion, and both anti-symmetric matrix κ and ˆ E pq = a † p a q are spin-dependent. The orbital gradient and f gradientof E are hence defined by L pq = ∂ E ∂κ ∗ pq (cid:12)(cid:12)(cid:12)(cid:12) κ = , (12) L K = ∂ E ∂f K (cid:12)(cid:12)(cid:12)(cid:12) f = , (13)and, together with the derivatives of the second and lastterms of Eq.(7), they will constitute the Z-vector equa-tions for post-PHF, which we refer to as CPPHF equa-tions. Note that Λ is easily identified as a constant E ,which can be verified by multiplying ∂ L ∂f K by f K and thenby summing over all K .Once z , Z , and (cid:15) are all determined, the energy gra-dient for perturbation x becomes E x = E ( x ) + (cid:88) ia z ai (cid:16) F ( x ) ai + F ( x ) ia (cid:17) + (cid:88) KK (cid:48) (cid:104) Z K (cid:16) H ( x ) KK (cid:48) − E ( x )PHF N KK (cid:48) − E PHF N ( x ) KK (cid:48) (cid:17) f K (cid:48) − EN ( x ) KK (cid:48) f K f K (cid:48) (cid:105) − (cid:88) pq (cid:15) pq ( C † S x C ) qp , (14)where superscripts x and ( x ) respectively indicate thetotal derivative and partial derivative with C [ κ ] fixed.If E is not variational with respect to c , we also haveto treat c in a similar way to C as above. This hap-pens when E has an additional correction term like theDavidson correction to CI, or when E and c are de-termined by a projective way as in coupled-cluster (CC).In such cases, one also needs to treat c in a similar way to C . While the computational cost will surely increase, itsformulation is straightforward, as shown in Section II Gfor ECISD+Q.As expected, the above scheme closely resembles theone for regular post-HF methods. The main difference,however, lies in the density matrices . The post-PHF den-sity matrices in a molecular orbital (MO) basis dependnot only on c but also on x explicitly through S andimplicitly through C . This is not the case in the tradi-tional schemes, where density matrices in a certain MObasis are only a function of c because a transition den-sity matrix element (cid:104) Φ I | a † p a q | Φ J (cid:105) is always either 0 or 1.In Eq.(14), E ( x ) therefore requires a special treatment,which we will discuss in Section II E. Also, L pq definedas above is of the broken-symmetry representation (thusit has α and β components) and is not the same as thestandard generalized Fock matrix, which has been ex-tensively used as orbital gradients. In other words, P pq L pq (cid:54) = P pq (cid:16)(cid:88) r h pr P rq + (cid:88) rst (cid:104) pt | rs (cid:105) P qt,rs (cid:17) , (15)where P pq = 1 − ( p ↔ q ) is the permutation operatorand P are the unrelaxed density matrices of the methodin question. This is essentially due to the same reasonas above, that is, the unrelaxed density matrices for thecorrelated wave function are not solely determined by c but they depend on the MO coefficients. This somewhatcomplicates our derivation as will be seen.One could entirely avoid the broken-symmetry pic-ture by using the internally-contracted spin-free basiswhere exp(ˆ κ ) is placed after the projection operator, i.e. (cid:104) Ψ | ˆ P e ˆ κ † ˆ He ˆ κ ˆ P | Ψ (cid:105) / (cid:104) Ψ | ˆ P e ˆ κ † e ˆ κ ˆ P | Ψ (cid:105) . This is possible pro-vided that ˆ κ is spin-free, written in, for example, thenatural orbital (NO) basis. Formulated in this way, theequality between the left hand and right hand sides ofEq.(15) is now satisfied, if P are also spin-free. However,the broken-symmetry representation offers numerous ad-vantages over a spin-free basis in both the derivation andimplementation of analytical gradient. First, the SCFcondition for PHF is given in the broken-symmetry basis(Eq.(6)), and hence z is also effectively spin-polarized.Second, the nonorthogonal Wick theorem makes it easyto evaluate the required matrix elements as we have pre-viously shown. Third, there is a clear distinction be-tween occupied and virtual blocks, which enables us toseparate the working spaces and to reduce the computa-tional effort. Fourth, there is no need to deal with thedouble-integration that arises due to the presence of twoprojection operators, whose computational cost is an or-der of magnitude higher. Of course, one can reformulatethe spin-free formalism within the single integration byadopting the well-known Wigner-Eckart theorem ;however, all the expressions will ultimately become a lin-ear combination of the half-projected, broken-symmetryrepresentation, so from the algebraic point of view, itseems pointless to employ the spin-free basis. Indeed, inmost situations, we only need half-projected quantities,and therefore we could utilize the Wigner-Eckart theoremonly when necessary, as will be done for the spin-adaptedrelaxed density matrix. Finally, it is expected that mostingredients that will be needed for analytical derivativesare readily available in the broken-symmetry basis fromthe existing post-PHF programs and therefore are likelyto require only minor modifications.In what follows, we apply the above scheme to derivethe ECISD nuclear gradient. We abbreviate the collinear-spin projection operator as ˆ P ≡ ˆ P JMM , as the spin quan-tum number J and multiplicity M are both obvious.Also note that there is no dependence on f K and one canremove the corresponding terms from L in Eq.(7). Again,once the derivative is given for ECISD, similar derivationscan be obtained for ECIS and TDPHF excited states aswell as spin-flip ECIS. For EMP2, care must be taken be-cause it uses UHF-like orbital energies, but its derivationessentially follows along the same lines. B. Nomenclature
Before going into detailed derivations, we shall brieflysummarize our nomenclature to make this paper self-contained. We basically follow the same notations withRef.[9], but specifically clarify them here as well.Throughout this work, we will employ the projectionoperator in the following form:ˆ P = (cid:90) ˆ R (Ω) d Ω ≈ (cid:88) g w g ˆ R g , (16) where ˆ R g and w g are a rotation operator at some dis-cretized grid point g and its weight. Hence, one cangenerically work on the transition elements between un-rotated and rotated states such as (cid:104) Θ | and ˆ R g | Θ (cid:105) = | Θ g (cid:105) ,followed by the summation over g .
1. SUHF
Using Eq.(16), the SUHF energy is given by E SUHF = (cid:104) Φ | ˆ H ˆ P | Φ (cid:105)(cid:104) Φ | ˆ P | Φ (cid:105) = (cid:80) g w g n g E g (cid:80) g w g n g , (17a)where we have used the fact that ˆ P is Hermitian, idem-potent, and commutable with spin-free operators such asˆ H , and defined E g ≡ (cid:104) Φ | ˆ H | Φ g (cid:105)(cid:104) Φ | Φ g (cid:105) = (cid:104) Φ | ˆ H | Φ g (cid:105) N , (17b) n g ≡ (cid:104) Φ | Φ g (cid:105) . (17c)Hereinafter we will simplify notations by adding sub-script N to transition elements in order to indicate theelements are intermediate-normalized, i.e., divided by n g as in Eq.(17b). The SUHF transition one-particle den-sity matrix (1PDM) ρ g and transition Fock matrix F g defined by ( ρ g ) pq = (cid:104) Φ | a † q a p | Φ g (cid:105) N , (18)( F g ) pq = h pq + (cid:88) rs (cid:104) pr || qs (cid:105) ( ρ g ) sr , (19)are fundamental quantities for our discussion below. Forexample, the SUHF transition energy is expressed as E g = (cid:88) pq h pq ( ρ g ) qp + 12 (cid:88) pqrs (cid:104) pr || qs (cid:105) ( ρ g ) qp ( ρ g ) sr . (20)For more details, we refer the reader to the original PHFpaper (Ref.[4]).
2. Nonorthogonal Wick theorem
The nonorthogonal Wick theorem allows one to writeHamiltonian at any grid g asˆ H = E g + (cid:88) pq ( F g ) pq { a † p a q } g + 14 (cid:88) pqrs (cid:104) pq || rs (cid:105){ a † p a † q a s a r } g , (21)where curly brackets mean the normal-ordering in thesense of the nonorthogonal Wick theorem with respectto the left- and right-vacua (cid:104) Φ | and | Φ g (cid:105) . In other words, (cid:104) Φ |{ ˆ O } g | Φ g (cid:105) N ≡
0. Then, it proves convenient to definethe following quantities in the MO basis:( W g ) ij = (cid:104) Φ | a † j ˆ R g a i | Φ (cid:105) N = ( R oo g ) − ij , (22a)( W g ) ai = (cid:104) Φ | a † i a a ˆ R g | Φ (cid:105) N = (cid:0) R vo g ( R oo g ) − (cid:1) ai , (22b)( W g ) ia = (cid:104) Φ | ˆ R g a † a a i | Φ (cid:105) N = (cid:0) ( R oo g ) − R ov g (cid:1) ia , (22c)( W g ) ab = (cid:104) Φ | a a ˆ R g a † b | Φ (cid:105) N = (cid:0) R vv g − R vo g ( R oo g ) − R ov g (cid:1) ab , (22d)where o and v stand for the occupied and virtual orbitalblocks, respectively. The above equations manifest ρ g = (cid:18) oo ov W vo g vv (cid:19) . (23)In the nonorthogonal Wick theorem, Eqs.(22-23) are re-alized as the contractions of two different Fermion oper-ators, a † and ˆ R g a † ˆ R − g , between (cid:104) Φ | and | Φ g (cid:105) . Note thatall other possible contractions result in either the Kro-necker delta or zero. Then, the standard Wick theoremand its generalization for a product of normal-orderedstrings can be completely replaced by their generaliza-tion to nonorthogonal bases.In the previous work, we introduced the left- andright-transformation matrices, given by L g = (cid:18) W oo g ov − W vo g vv (cid:19) , (24) R g = (cid:18) oo ov W vo g W vv g (cid:19) , (25)to manipulate the transition Fock matrix and the baretwo-electron integrals as˜ F g := L g F g R g , (26)( ¯ V g ) pqrs := (cid:88) tuvw ( L g ) pt ( L g ) qu (cid:104) tu || vw (cid:105) ( R g ) vr ( R g ) ws . (27)It is sometimes convenient to treat these matrices to-gether with the MO transformation in some cases,˜ L g = L g C † , (28)˜ R g = C R g , (29)so that both E and F ai are solely expressed as a functionof W g , ˜ L g , and ˜ R g , without explicit C [ κ ] dependence.
3. ECISD energy and density matrices
Similarly to the SUHF energy, the ECISD energy canbe expressed as E = (cid:80) g w g n g E g (cid:80) g w g n g N g , (30a)where we have defined the transition energy and overlap E g = (cid:104) Ψ | ˆ H | Ψ g (cid:105) N = (cid:88) µν h µν (cid:104) Ψ | a † µ a ν | Ψ g (cid:105) N + 14 (cid:88) µνλσ (cid:104) µν || λσ (cid:105)(cid:104) Ψ | a † µ a † ν a σ a λ | Ψ g (cid:105) N , (30b) N g = (cid:104) Ψ | Ψ g (cid:105) N , (30c)for which, we have presented the expression of σ -vectorsin the previous work. We will not repeat their derivationsbut the supporting material is available for the final ex-pressions, which are factorized and thus more compactthan those presented in Refs.[8 and 9].In what follows, we assume both (cid:104) Ψ | ˆ P | Ψ (cid:105) and (cid:104) Φ | ˆ P | Φ (cid:105) are properly normalized, i.e., (cid:80) g w g n g N g = (cid:80) g w g n g =1, for brevity; however, we stress that these norms needbe taken into account in derivative evaluations (as therederivatives are typically nonzero).Note that half-projected density matrices, such as P νµ ≡ (cid:104) Ψ | a † µ a ν ˆ P | Ψ (cid:105) = (cid:80) g w g n g (cid:104) Ψ | a † µ a ν | Ψ g (cid:105) N , are nei-ther relaxed nor spin-adapted. As these “unrelaxed”ECISD density matrices will repeatedly appear in thefollowing derivations, it is useful to analyze these objectsfor latter discussions. 1PDM P qp is obtained simply byintegrating the corresponding transition 1PDM, P g , P qp = (cid:88) g w g n g ( P g ) qp = (cid:88) g w g n g (cid:104) Ψ | a † p a q | Ψ g (cid:105) N , (31)and similarly for two-particle density matrix (2PDM).The nonorthogonal Wick theorem applied to the one-particle operator a † p a q = ( ρ g ) qp + { a † p a q } g (32)suggests P g be separated into two terms, namely, theoverlap-weighted SUHF contribution ( ρ g ) qp N g and thenormal-ordered transition 1PDM defined by( γ g ) pq = (cid:104) Ψ |{ a † q a p } g | Ψ g (cid:105) N , (33)which accounts for the correction due to the ECISD cor-relation contribution. The programmable expression ofEq.(33) is easily identified from the ECISD equations, as γ g is contracted only with F g because of the structure ofthe normal-ordered Hamiltonian Eq.(21). Namely, it suf-fices to replace all ( ˜ F g ) rs with ( R g ) rp ( L g ) qs and neglectall other contractions. Note that the correlated (non-separable) 2PDM contribution, (cid:104) Ψ |{ a † p a † q a s a r } g | Ψ g (cid:105) N , issimilarly obtained, but with ( ¯ V g ) pqrs replaced appropri-ately (see Eq.(27)). C. ECISD orbital gradient
Having established our notations above, now we are ina position to derive the ECISD orbital gradient L . Herethe generalized nonorthogonal Wick theorem proves use-ful. Using Eq.(32) and the normal ordered HamiltonianEq.(21), we write L pq = (cid:104) Ψ | ˆ E † pq ( ˆ H − E ) ˆ P | Ψ (cid:105) = (cid:88) g w g n g (cid:34) ( ρ g ) pq (cid:104) Ψ | ˆ H | Ψ g (cid:105) N + (cid:104) Ψ |{ a † q a p } g | Ψ g (cid:105) N E g + (cid:104) Ψ |{ a † q a p } g { a † r a s } g | Ψ g (cid:105) N ( F g ) rs + 14 (cid:104) rs || tu (cid:105)(cid:104) Ψ |{ a † q a p } g { a † r a † s a u a t } g | Ψ g (cid:105) N − ( P g ) pq E (cid:35) = (cid:88) g w g n g (cid:34) ( P g ) pq ( E g − E ) + ( ρ g ) pq ( E g − E g N g ) − [ γ g F g ρ g ] pq + [ η g F g P g ] pq + [ η g G g ρ g ] pq + ( ζ g ) pq − ( ω g ) pi δ iq + δ pa ( ˜ ω g ) aq (cid:35) . (34)Hereafter, the Einstein summation convention for re-peated indices is assumed for the sake of visual simpli-fication, except g integration in order to specifically in-dicate that the symmetry-projection is being performed.In Eq.(34), we have additionally defined the transitionhole matrix ( η g ) pq = δ pq − ( ρ g ) pq , (35)and the following g -dependent quantities:( G g ) pq : = (cid:104) pr || qs (cid:105) ( γ g ) sr , (36)( ζ g ) pq : = ( F g ) rs (cid:104) Ψ |{ a † q a † r a s a p } g | Ψ g (cid:105) N + 14 (cid:104) rs || tu (cid:105)(cid:104) Ψ |{ a † q a † r a † s a u a t a p } g | Ψ g (cid:105) N , (37)( ω g ) pq : = 12 (cid:104) rs || tu (cid:105) ( R g ) tq (cid:104) Ψ |{ a † r a † s a u a p } g | Ψ g (cid:105) N , (38)( ˜ ω g ) pq : = 12 (cid:104) rs || tu (cid:105) ( L g ) pr (cid:104) Ψ |{ a † q a † s a u a t } g | Ψ g (cid:105) N . (39)This is a general result for any excitation levels, includ-ing SUHF and ECIS. For example, L will reduce to F inthe case of SUHF. For ECISD, the explicit expressionsof Eqs.(37-39) are rather complex as given in the supple-mental material, but can be straightforwardly evaluatedusing the existing ECISD subroutines with minor modifi-cations. We just note here that we avoid the computationof the correlated 3PDM in Eq.(37) by performing the in-tegral contraction on-the-fly, and L can be thus evaluatedfor the same cost as the ECISD energy, which scales as O (o v ).Some observations on the structure of L are in order.Since the ECISD energy is invariant with respect to an or-bital rotation within the occupied space as well as withinthe virtual space, clearly L ij = L ab = 0. Furthermore, L ia must also vanish because it apparently constitutesonly (cid:104) Φ | ( ˆ H − E ) ˆ P | Ψ (cid:105) and (cid:104) Φ ai | ( ˆ H − E ) ˆ P | Ψ (cid:105) , which are allguaranteed to be zero due to the variationality of ECISDwith respect to the CI coefficients. Therefore, only the L ai block contains nonzero elements, as confirmed nu-merically in our calculations. D. Coupled-perturbed PHF
With L derived above, we also need the orbital deriva-tive of SUHF Fock matrix (6) for computing z ai . Usingthe notations defined in Section II B, F ai is explicitlywritten as F ai = ∂E SUHF ∂κ ∗ ai = (cid:88) g w g n g (cid:104) ( E g − E PHF ) W g + L g F g R g (cid:105) ai , (40)and F ia is its complex conjugate. As previously shown, its derivative with respect to an orbital rotation becomesthe Hessian components, ∂ F ia ∂κ ∗ bj = A ai,bj = (cid:88) g w g n g (cid:104) ( E g − E SUHF ) (cid:16) W ai W jb + W ab W ji (cid:17) + W ai ˜ F jb + ˜ F ai W jb + ˜ F ab W ji − W ab ˜ F ji + ¯ V ajib (cid:105) , (41a) ∂ F ai ∂κ ∗ bj = B ai,bj = (cid:88) g w g n g P ( ab ) P ( ij ) (cid:104)
12 ( E g − E SUHF ) W ai W bj + W ai ˜ F bj + 14 ¯ V abij (cid:105) , (41b)where we have dropped g subscripts in W g , ˜ F g , and¯ V g for brevity. All the derivatives with respect to κ ∗ pq in the oo, ov, and vv spaces are rigorously zero for asimilar reason to the aforementioned discussion for L . Inderiving Eqs.(41), we have used the fact that F = fora converged SUHF state.Hence, the stationary condition ∂ L ∂ κ ∗ = reads thefollowing set of equations; L ij = (cid:15) ij , (42a) L bj + ( A ai,bj + B ai,bj ) z ai = (cid:15) bj , (42b) L jb = (cid:15) jb , (42c) L ab = (cid:15) ab . (42d)Keeping the structure of L in mind ( L vo is the onlynonzero block) and requiring (cid:15) = (cid:15) † , we find simply (cid:15) = as is the case in SUHF. On the other hand, Eq.(42b)results in the Z-vector (CPPHF) equation,( A ai,bj + B ai,bj ) z ai = − L bj . (43)Again, for general projection operators in the form ofEq.(5), this equation will be coupled with the correspond-ing f response, similarly to MRCI. Orbital rotations can often contain linear depen-dencies, i.e., Hessian A + B is singular in PHF,since a projection operator may produce the identicalsymmetry-adapted state from different broken-symmetrydeterminants. If this is the case, such linear dependen-cies also appear in L in exactly the same way, so onecan easily identify this redundant space as a mathemat-ically null space. The reader may then wonder if z isleft arbitrary and there are infinite numbers of solutionsthat satisfy the Z-vector equation (43). However, thisredundancy is simply due to the working space that wehave adopted; had we chosen an appropriate space fororbital rotations other than the broken-symmetry repre-sentation, such a null space would completely disappearfrom the equation. Therefore, the correct approach totreat these linear dependencies in the present Z-vectorequation is to simply remove them; in practice, we takethe pseudo-inverse of A + B with a threshold of 10 − in order to ensure a numerical stability. This proceduredetermines only one unique z . In passing, if other z thatyet satisfy Eq.(43) are used, gradients would still be cor-rectly evaluated but the relaxed density matrix may be-come different.For ECISD, one can solve Eq.(43) by explicitly form-ing and pseudo-inverting the Hessian matrix A + B , be-cause its cost is O (o v ), which is typically less thanthat of ECISD itself. For other low-scaling methods suchas EMP2, one should resort to iterative linear-equationsolvers like GMRES for efficient computations. E. Explicit dependence1. E ( x ) In Section II A, we argued that the explicit depen-dence on x in E comes not only from Hamiltonian butalso from density matrices through S . Given the ECISDenergy Eqs.(30), one would have to consider terms like (cid:104) Ψ | a † p a q | Ψ g (cid:105) ( x ) N and (cid:104) Ψ | a † p a † q a s a r | Ψ g (cid:105) ( x ) N , which are bothexactly zero if | Ψ g (cid:105) = | Ψ (cid:105) (i.e., the regular single-reference limit) but are complicated functions of both C and S x in the symmetry-projection methods. However,one can completely avoid constructing these derivativesand simplify the derivation by formally writing the ex-plicit dependence of E on x as E ( x ) = ∂ E ∂h µν h xµν + ∂ E ∂ (cid:104) µν || λσ (cid:105) (cid:104) µν || λσ (cid:105) x + (cid:88) g (cid:16) ∂ E ∂ W g W ( x ) g + ∂ E ∂ ˜ L g ˜ L ( x ) g + ∂ E ∂ ˜ R g ˜ R ( x ) g (cid:17) . (44)Then, the last summation over g takes into account thefact that the density matrices of ECISD are dependent on x . Now, recall that we require E be completely expressedwith W g , ˜ L g , and ˜ R g , and hence allow no explicit de-pendence on C . Therefore, the previously obtained ECI orbital gradient Eq.(34) is equivalently expressed as L pq = ∂ E ∂κ ∗ pq = (cid:88) g (cid:16) ∂ E ∂ W g ∂ W g ∂κ ∗ pq + ∂ E ∂ ˜ L g ∂ ˜ L g ∂κ ∗ pq + ∂ E ∂ ˜ R g ∂ ˜ R g ∂κ ∗ pq (cid:17) , (45)which may be compared with Eq.(44) for their similarity,implying that the latter can be rewritten in terms of L pq .It is not difficult to show that the explicit derivatives of W g , ˜ L g , and ˜ R g with respect to x critically resemblethose with respect to κ ∗ pq , W ( x ) g = ∂ W g ∂κ ∗ pq ( C † S x C ) qp , (46a)˜ L ( x ) g = ∂ ˜ L g ∂κ ∗ pq ( C † S x C ) qp − ˜ L g S x S − , (46b)˜ R ( x ) g = ∂ ˜ R g ∂κ ∗ pq ( C † S x C ) qp . (46c)Therefore, noting that only E g depends on ˜ L g in E ,one can substitute them into Eq.(44) and then useEqs.(42,45) to obtain E ( x ) = ∂ E ∂h µν h xµν + ∂ E ∂ (cid:104) µν || λσ (cid:105) (cid:104) µν || λσ (cid:105) x + L pq ( C † S x C ) qp − (cid:88) g (cid:32) ∂ E ∂ ( ˜ L g ) rµ ( ˜ L g S x S − ) rµ (cid:33) = h xµν (cid:104) Ψ | a † µ a ν ˆ P | Ψ (cid:105) + 14 (cid:104) µν || λσ (cid:105) x (cid:104) Ψ | a † µ a † ν a σ a λ ˆ P | Ψ (cid:105)− ( A ai,bj + B ai,bj ) z ai ( C † S x C ) jb − ( C † S x C ) qp X pq . (47)The third term will cancel out the same term in F ( x ) ( vide infra ). For the last term of Eq.(47), it is relativelyeasy to derive X pq = (cid:88) g w g n g C ∗ µp ∂ E g ∂ ( ˜ L g ) rµ ( L g ) rq = (cid:88) g w g n g (cid:16) [ F g P g ] pq + [ G g ρ g ] pq + [ L − g ˜ ω g ] pq (cid:17) , (48)and this will be in part recognized as the energy-weighteddensity matrix upon a grid integration. We should men-tion that the existence of L − g is always guaranteed. F ( x ) One can use the same simplification as presented aboveto ease the evaluation of the explicit nuclear gradientcontribution of the PHF Fock; F ( x ) is closely related tothe orbital gradients A and B through Eqs.(46). Ourresult is F ( x ) ia = (cid:88) g w g n g (cid:104) L g F (¯ x ) g R g + E (¯ x ) g ( W g − N ) (cid:105) ia + ( C † S x C ) qp ( A ai,bj δ pb δ jq − Y ia,pq ) , (49a) F ( x ) ai = (cid:88) g w g n g (cid:104) L g F (¯ x ) g R g + E (¯ x ) g ( W g − N ) (cid:105) ai + ( C † S x C ) qp ( B ai,bj δ pb δ jq − Y ai,pq ) , (49b)where Y is the residual effect, Y vw,pq = (cid:88) g w g n g C ∗ µp ∂ F vw ∂ ( ˜ L g ) rµ ( L g ) rq = (cid:88) g w g n g (cid:40) [ F g ρ g ] pq ( W g − N ) vw + ( L g ) vq [ F g R g ] pw + (cid:104) pr || us (cid:105) ( ρ g ) uq ( L g ) vr ( R g ) sw (cid:41) , (50)and the bars on x indicate that only Hamiltonian is sub-ject to the differentiation, that is to say, in the AO basis,( F (¯ x ) g ) λσ = h xλσ + (cid:104) λµ || σν (cid:105) x ( ρ g ) νµ , (51) E (¯ x ) g = h xλσ ( ρ g ) σλ + 12 (cid:104) λµ || σν (cid:105) x ( ρ g ) νµ ( ρ g ) σλ . (52)(Note that ρ g is an explicit function of S .) Notice thatin Eqs.(49), F (¯ x ) g is given in the MO basis. We have alsointroduced the SUHF norm derivatives, N ai = (cid:88) g w g n g W ai = (cid:104) Φ ai | ˆ P | Φ (cid:105) , (53) N ia = (cid:88) g w g n g W ia = (cid:104) Φ | ˆ P | Φ ai (cid:105) . (54)As was mentioned above, A and B terms in Eq.(49) arecanceled out with those in E ( x ) , when contracted with z ai . F. Final assembly
Putting altogether, we finally arrive at the completeexpression, L ( x ) = E (¯ x ) + (cid:88) g w g n g (cid:34)(cid:16) ( L g F (¯ x ) g R g ) ia + ( L g F (¯ x ) g R g ) ai (cid:17) z ai + E (¯ x ) g (cid:16) ( W g ) ia − N ia + ( W g ) ai − N ai (cid:17) z ai (cid:35) − ( C † S x C ) qp (cid:40) X pq + ( Y ia,pq + Y ai,pq ) z ai (cid:41) = h xµν P rel νµ + 14 (cid:104) µν || λσ (cid:105) x P rel νµ,σλ + S xµν W νµ (55)where P rel νµ and P rel νµ,σλ are spin-incomplete relaxed one-and two-particle density matrices, and hence includes not only αα and ββ but also nonzero βα and αβ sectors. Us-ing Eq.(48), W is the generalized energy-weighted den-sity matrix explicitly given by W pq = − (cid:88) g w g n g (cid:34) ( F g ) pr ( P g + P corr g ) rq + (cid:104) pr || us (cid:105) ( ρ g ) uq (cid:16) ( R g ) sa z ai ( L g ) ir + ( R g ) si z ia ( L g ) ar (cid:17) + [ G g ρ g ] pq + [ L − g ˜ ω g ] pq (cid:35) (56)in the MO basis, with the relaxation correction at grid g defined as( P corr g ) pq = ( ρ g ) pq (cid:16) ( W g ) ia − N ia + ( W g ) ai − N ai (cid:17) z ia + ( R g ) pi z ia ( L g ) aq + ( R g ) pa z ai ( L g ) iq . (57)One can assure that Eq.(55) is consistent with SUHF for | Ψ (cid:105) = | Φ (cid:105) and z = .Comparing the terms in Eq.(55), one can easily iden-tify the spin-incomplete relaxed 1PDM as the sum of theunrelaxed density matrix and relaxation correction (in-tegrated), P rel pq = (cid:88) g w g n g ( P g + P corr g ) pq . (58)To gain some physical insights, one can elegantly rewritethe integrated correction term as (cid:88) g w g n g ( P corr g ) pq = z ai ∂ (cid:104) Φ | a † q a p ˆ P | Φ (cid:105) ∂κ ∗ ai + z ia ∂ (cid:104) Φ | a † q a p ˆ P | Φ (cid:105) ∂κ ai . (59)The meaning of Eq.(59) is striking; it accounts for thefirst-order orbital relaxation effect on the ECISD densitymatrix via the reference (SUHF) density matrix. Aftersome simple algebra, one can verify that the Wigner-Eckart theorem can be directly applied so as to obtainthe spin-adapted relaxed density; in other words, D rel α and D rel β can be derived from a linear combination of P rel αα , P rel βα , P rel αβ , and P rel ββ (see Appendix).Similarly, the relaxed 2PDM can be easily derived, andmay be explicitly symmetrized in the AO basis withoutloss of generality, i.e., ˜ P µν,λσ = P µν P λσ P µν,λσ . How-ever, for ease of computations, the non-separable (un-relaxed) term should be directly contracted with two-electron integrals in the AO basis to avoid prohibitivelylarge memory requirement and disk storages. G. ECISD+Q gradient
We should mention the gradient of the Davidsoncorrection, ∆ E Q , can be formulated. In this case,the total energy E + ∆ E Q is stationary with respect nei-ther to C nor to c . Hence, our Lagrangian takes a morecomplicated form L Q = E + ∆ E Q + (cid:88) ai ( F ai + F ia ) z ai − Tr (cid:2) (cid:15) (cid:0) CS † C − (cid:1)(cid:3) + (cid:88) I ˜ z I (cid:16) (cid:104) Φ I | ( ˆ H − E ) ˆ P | Ψ (cid:105) + (cid:104) Ψ | ( ˆ H − E ) ˆ P | Φ I (cid:105) (cid:17) (60)where Φ I ∈ Φ , Φ ai , Φ abij , requiring to solve the resultingCPECISD equation, ∂ L Q /∂ c = , for ˜ z . While it isstraightforward to derive the equations, this will signifi-cantly complicate the algorithm and thus is beyond thescope of the present work. III. Numerical examplesA. Computational details
The above scheme has been implemented in the GEL-LAN suite of programs, which can also handle the an-alytical gradient of SUHF. All of the calculations wereperformed with all-electrons correlated unless otherwisespecified. While incorporating the frozen-core approxi-mation is possible, the implementation is not as straight-forward as that for the conventional post-HF methods,because ECISD requires a constrained SCF optimizationto define frozen-core orbitals.
We will address thisissue elsewhere.For the number of grid points for the spin-projection, N grid = 4 was found to be enough to make (cid:104) ˆ S (cid:105) of aSUHF wave function precise to at least 10 − for all thecalculations presented in this work. If a precision of10 − is requested for (cid:104) ˆ S (cid:105) as in the previous studies, only three grid points is sufficient, while the resulting en-ergy error is less than 1 µE h . Also, increasing N grid didnot change both the SUHF and ECISD wave functions.All the spin-projected calculations employed GELLAN,and single reference methods including CC singles anddoubles (CCSD) and CCSD with perturbative triples,CCSD(T), were performed using Gaussian . We alsoused
Molpro for the MR calculations. B. Ozone
We take the ozone molecule as our first example, as ithas been extensively studied by other authors due to itsdegenerate electronic structure.
We use Dunning’sDZP basis set following the earlier work of reduced MR-CCSD (RMR-CCSD) calculations, for a direct compar-ison with our results.We have performed geometry optimizations on thissystem with various methods including RHF, SUHF,ECISD, CISD, CCSD, and MRCISD, and the results arelisted in Table I. The results of reduced MR-CCSD are FIG. 1: Selected natural orbitals of O . All the orbitalslook qualitatively similar for ECISD and SUHF. Thecorresponding natural occupations are listed in thefollowing order; ECISD (relaxed), ECISD (unrelaxed) inbold, and SUHF in italic.taken from Ref.[42] for comparison. All the MR methodsemploy the minimum (2 e ,2 o ) active space. Using the fi-nite difference of analytical gradients, we also evaluatedthe vibrational frequencies. RHF significantly underesti-mates the bond length due to the lack of static correla-tion, while SUHF, in spite of its mean-field nature, gainsa large amount of correlation energy, predicting a muchbetter geometry. It is noteworthy that SUHF outper-forms CASSCF for the energy and geometry. However,it turned out that SUHF fails to predict correct frequen-cies even qualitatively for this simple molecule. Espe-cially, the 1 b mode becomes unreasonably small com-pared to the other methods. Although the frequency forthe 2 a mode is overestimated, that for 1 a results ina too low value. This indicates that SUHF’s potentialenergy surface is unphysically shallow for this particularsystem, which is somewhat astonishing, given its goodperformance on the energy and geometry. We note thatUHF, without spin-projection, provides more reasonablefrequencies, and so does the non-collinear spin-projectionon generalized HF (although not listed in the table). Hence, this is an SUHF-specific failure and a precautionis required when SUHF is used for frequency calculations.On the other hand, ECISD certainly improves theSUHF results in all of the aspects. Not only does itcorrect the ill-behaved frequencies of SUHF, but also itsenergy and geometry are comparable to those of MR-0TABLE I: Total energy and geometry of ozone computed with the DZP basis.
Method Energy (a.u.) R OO (˚A) ∠ OOO ( ◦ ) ω (cm − ) µ ( D )1 a a b RHF –224.320 897 1.207 118.9 1541 842 1432 0.874SUHF –224.438 884 1.284 114.4 722 936 226 0.191CASSCF(2,2) –224.403 040 1.258 115.1 1181 776 1492 0.220RCISD –224.858 121 1.247 117.7 1388 783 1562 0.702ECISD –224.898 361 1.274 116.2 1209 735 1349 0.439 a MRCISD(2,2) –224.890 282 1.271 116.2 1226 746 1358 0.347 b RCCSD –224.943 339 1.275 117.1 1249 729 1244 0.595RMR-CCSD c — 1.277 116.7 1187 727 1156 —Exp. — 1.272 116.8 1135 716 1089 0.532 a b Unrelaxed. c s orbitals are frozen and basis exponent α D is modified to 1.211. CISD. Employing the same active space as MRCISD,RMR-CCSD delivers as accurate results especially for vi-brational frequencies. Despite its single-reference nature,CCSD also shows an excellent performance, yet amelio-rating RHF only partially. Hence, we conjecture that thediscrepancies between the ECISD and MRCISD frequen-cies and those of RMR-CCSD as well as experimentalvalues are mostly attributed to the lack of linked discon-nected terms in the former, although the use of largerbasis sets could change the results greatly.We also computed the dipole moment µ with the un-relaxed and relaxed densities of ECISD. Without the or-bital relaxation effect, ECISD gives 0.355 Debye ( D ); asolid improvement is obtained over SUHF (0.191 D ) andCASSCF (0.220 D ). We should stress that this value isalso comparable to the MRCISD result (0.347 D ), com-puted with its unrelaxed density. It is noteworthy thatthe underestimation of the unrelaxed density on µ is fur-ther improved by the response correction, whose contri-bution is substantial, giving a total µ of 0.439 D .To see the main difference between SUHF and ECISDrelaxed and unrelaxed densities, in Figure 1, we have vi-sualized the NOs with their occupation numbers. Forcomparison, here we use the ECISD geometry also forthe SUHF results, which yield µ = 0 . D . The appear-ances of NOs computed with the relaxed and unrelaxedECISD densities and SUHF density are indistinguishablewith one another, so we only depict the relaxed ECISDorbitals. It is worth mentioning that, while the most or-bitals have almost the same occupancies for the relaxedand unrelaxed densities, the degenerate orbitals are moresensitive to the orbital relaxation effect, i.e., the occu-pation numbers of 12th (non-bonding) and 13th (anti-bonding) NOs noticeably change. The relaxed occupa-tion of the latter is smaller than that of the unrelaxedcalculation, reflecting the more “dynamical” character ofthe wave function. This is responsible for the descrip-tion of µ , because the density is more polarized withoutoccupying electrons in the anti-bonding NO as clearly ( b g ) A g ( D h ) B g ( D h ) ( b u )( b g ) ( a u ) ( a u )( e g ) ( e g )( b u ) FIG. 2: Automerization of singlet cyclobutadiene.seen in Figure 1. The non-bonding and anti-bondingorbital occupations of SUHF are even more fractionalthan the unrelaxed ECISD density due to the neglect ofthe vast majority of dynamical correlation effects. TheSUHF density becomes even less polarized by promotingmore electrons than necessary, from the non-bonding NOto the anti-bonding one.
C. Cyclobutadiene
Cyclobutadiene is another suitable test case thatexhibits a degenerate electronic structure. Althoughthis molecule is highly distabilized due to the anti-aromaticity of 4 nπ electrons, the characterization of itsautomerization has posed a significant challenge and thusgained considerable attention both experimentally and theoretically. As depicted in Figure 2, the A g ground state has the distorted rectangular equilibriumgeometry because of the Jahn-Teller effect, and mostsingle-reference methods should offer a reasonable ver-1 D h D h (1.074)(134.9)(1.566)(1.343) FIG. 3: Optimized geometries of the ground statesinglet states computed with ECISD, SUHF (italic),and MRCI (bold) in ˚A and degree. For D h , CCSD(T)results are also shown in parentheses.tical singlet-triplet gap ∆ E ST as the static correlationis not strong enough. However, at the transition struc-ture (square D h ) of the automerization, its electronicstructure is entangled, requiring MR treatments or ex-plicit inclusion of triple excitations such as CCSDT.Therefore, single-reference methods limited to doublestypically overestimate the the activation barrier ∆ E ‡ and significantly underestimate the singlet-triplet gapat the square structure ∆ E ∗ ST . Balkov´a and Bartlettstudied this system using MR-CCSD(T) with a smallDZP basis and frozen orbitals, and reported ∆ E ‡ = 6 . E ∗ ST = 6 . On the otherhand, Levchenko and Krylov used the larger cc-pVTZ ba-sis functions to perform all-electron equation-of-motionSF-CCSD (EOM-SF-CCSD), which yielded 7.5 and 8.5kcal/mol for ∆ E ‡ and ∆ E ∗ ST . Many other computa-tional studies focused on ∆ E ‡ but not on ∆ E ∗ ST . Herewe will perform geometry optimizations to calculate allthese quantities using ECISD and MRCI.Figure 3 shows the optimized geometries of the rectan-gular and square structures computed with ECISD andSUHF (in italic). For comparison, we list the result ofMRCI in bold, where we used an active space of (4 e, o )that consists of the orbitals shown in Figure 2. For D h ,CCSD(T) results are also provided in parentheses. WhileSUHF geometries are reasonable, the agreement betweenECISD and MRCI geometries is excellent. Their groundstate energies are also close to each other, as tabulatedin Table II, indicating their wave functions are similarboth qualitatively and quantitatively. However, it turnsout that both ECISD and MRCI predict slightly shorterbond lengths for D h compared to CCSD(T). This ispartly attributed to the fact that both ECISD and MRCIare size-inextensive. In addition, CCSD(T) also has itsown deficiency; the method is not suitable for describingstatic correlation, which is found even in D h . AlthoughCCSD(T) performs very well compared to MR-CCSD(T)at this structure as shown by Balkov´a and Bartlett (the R e l a t i v e ene r g y ( kc a l / m o l ) Reaction coordinate SUHF ECISD ECISD+Q CCSD(T)MRCI+Q xx x x
SingletTriplet
FIG. 4: Energy profiles along with the automerizationpathway. The solid and broken curves represent singletand triplet states, respectively.energy difference is less than 2 mH at a DZP basis), theoptimized geometry can be largely affected by the pres-ence of static correlation; for example, MR-CCSD pre-dicted 1.570 and 1.367 ˚A for R CC , while CCSD gave 1.582and 1.359 ˚A. Nonetheless, these optimized geometries allow us todraw energy diagrams for the automerization at eachlevel of theory. Figure 4 presents the lowest singlet andtriplet potential curves (solid and broken curves, respec-tively) along with the reaction coordinate that is definedas a linear interpolation between the two structures. Wechoose the low-spin triplet state for spin-projection inthis study, because the low-spin SUHF energy was foundto be much lower than the high-spin one (the stabilitygained in the former is 35 mH compared to the latterat D h ), suggesting the former is a qualitatively betterstarting point for subsequent ECISD calculations. Also,the use of low-spin states was recommended for the cal-culations of singlet-triplet gaps in our previous work. Note that, nevertheless, the ECISD+Q results for ∆ E ST and ∆ E ∗ ST in this case did not significantly change whena high-spin state was used; the largest difference betweenhigh-spin and low-spin triplet energies is less than 3 mHwith a non-parallelity error of about 0.1 mH.The activation barrier of SUHF is found to be toolow with only 1.5 kcal/mol, which is consistent with thefinding in Section III B that the SUHF energy potentialsufraces appear to be unphysically shallow. Inclusion ofsingles and doubles as well as the Davidson correction dueto Pople et al. increases ∆ E ‡ to 4.8 and 7.4 kcal/mol,the latter almost coinsiding the result of EOM-SF-CCSD(7.5 kcal/mol). However, as tabulated in Table II, theECISD+Q value is lower than that of MRCI+Q by 1.7kcal/mol. Nevertheless, we note that the correlation en-ergy gained by ECISD is comparable to that of MRCI at D h , while the former is much larger than the latter at2TABLE II: Total energy of ground state at the D h geometry ( a.u. ), activation barrier, and verticalsinglet-triplet gaps at both D h and D h geometries(kcal/mol). Method Energy ( D h ) ∆ E ‡ ∆ E ST ∆ E ∗ ST SUHF –153.790 05 1.5 40.6 27.2ECISD –154.344 81 4.8 34.3 11.3ECISD+Q a –154.431 67 7.4 31.3 6.1MRCI b –154.341 53 8.4 34.5 6.3MRCI+Q b,c –154.433 56 9.1 33.0 4.7CCSD(T) d,e –154.453 26 18.4 33.8 –6.1EOM-SF-CCSD e,f –154.424 95 7.5 38.3 8.5 a ECISD geometries are used. b The active space used is (4 e, o ). c MRCI geometries are used. d RHF/ROHF reference. e The singlet ( D h ) and triplet ( D h ) CCSD(T) geometries areused. f UHF reference. Taken from Ref.[53]. D h , as can be seen from the large difference in the com-puted barrier heights. This suggests that ECISD spansa better reference space than does MRCI (for singlet) forthe subsequent Davidson correction, and therefore, weconclude the singlet ECISD+Q results should be morereliable than those of MRCI+Q with this active space.On the other hand, CCSD(T) produces a kink at the D h point as expected, because of the exact degeneracyin the e g orbitals. As a result, its activation barrier is pre-dicted to be 18.4 kcal/mol, which is significantly largerthan the experimental estimate of 1.6–10 kcal/mol, re-sulting in the incorrect ordering of the singlet and tripletenergies. However, again, the small energy difference be-tween CCSD(T) and MR-CCSD(T) in Ref.[52] validatesthe accuracy of CCSD(T) at the D h geometry. Henceits ∆ E ST , 33.8 kcal/mol, is expected to be reliable, whichis indeed close to the MRCI+Q result (33.0 kcal/mol).While EOM-SF-CCSD gives a somewhat larger value of38.3 kcal/mol, the ECISD+Q prediction agrees well withCCSD(T) and MRCI+Q. We observe a similar behav-ior for ∆ E ∗ ST , where the EOM-SF-CCSD value is againslightly larger than those of ECISD+Q and MRCI+Q. Itis likely that this discrepancy is attributed to the spin-contamination inevitable in SF calculations. At the D h structure, ECISD+Q and MRCI+Q are energeti-cally very close to each other; the singlet energies of themare -154.41992 and -154.41910, whereas the triplet en-ergies are -154.41018 and -154.41166, in E h . Therefore,ECISD+Q slightly outperforms MRCI+Q for singlet, butgets worse for triplet, resulting in the increase of ∆ E ∗ ST compared the latter, although small (1.4 kcal/mol).Overall, ECISD produces similar results to those ob-tained with MRCI, especially when a size-consistencycorrection is introduced. It appears ECISD is satisfac-tory for this system, considering the simplicity of its wavefunction ansatz . IV. Conclusions
The availability of analytical derivative is critical inelectronic structure calculations, as it broadens the ap-plication range of a method by providing a means tocompute molecular properties and nuclear gradients. Inthis manuscript, we have demonstrated that, using the Z-vector technique, the first derivative of post-SUHF meth-ods can be derived in a manner analogous to single-reference post-HF. The chief difference is that, in theMO basis, the density matrices of spin-projected ap-proaches explicitly depend on the underlying molecularorbitals and AO overlap matrix, while only particle-holeamplitudes such as CI coefficients are relevant in single-reference cases. We showed the calculation of densitymatrix derivatives with respect to nuclear coordinatescan be avoided by expressing the total energy as a solefunction of contractions in terms of the nonorthogonalWick theorem, which allows for complete cancellation insuch terms. The resulting relaxed density matrices arenot spin-adapted, if a projection operator is not explicitlypresent at both bra and ket states. However, since therelaxation correction is written as the gradient of SUHF1PDM with respect to orbital change, one can retrievethe spin-adapted form using the Wigner-Eckart theorem.Our results revealed the SUHF frequencies can besometimes even much worse than UHF as seen in ozone,but this deficiency can be largely ameliorated by the in-troduction of dynamic correlation in ECISD. The dipolemoment of ozone was also greatly improved when com-puted with the relaxed density. For the automerizationof cyclobutadiene, the ECISD+Q results agreed quitewell with MRCI, as well as with EOM-SF-CCSD andCCSD(T) reference values. However, our results alsostrongly indicate the inclusion of size-extensive (consis-tent) correction is indispensable for accurate descriptionsof these systems. While the Davidson correction greatlyimproves the energy with a negligible computational cost,its derivative is evidently cumbersome. Hence it is moreadvantageous to renormalize the effect of ∆ E Q into theenergy functional, so that the total energy derivative withrespect to c I remains to be zero. Research for such ex-tension is currently underway; some preliminary resultsare reported elsewhere. Acknowledgements
We would like to thank Motoyuki Uejima for help-ful discussions, and Roman Schutski for performing non-collinear projected HF calculations. This work was sup-ported by MEXT’s FLAGSHIP2020 as priority issue 5(development of new fundamental technologies for high-efficiency energy creation, conversion/storage and use).We are also indebted to the HPCI System Researchproject for the use of computer resources (Project ID:hp150228, hp150278).3
AppendixA. Wigner-Eckart theorem for relaxed density matrix
The relaxed density matrix in the form of Eq.(58), P rel , is not spin-adapted because the perturbation hasbeen applied to the half-projected Hamiltonian ˆ H ( λ ) ˆ P but not to the explicitly projected one ˆ P ˆ H ( λ ) ˆ P . On thecontrary, it can be even asymmetric. This section showsthe Wigner-Eckart theorem can be equally applicable toderive D rel from P rel .For our purpose, let us split the spin-adapted relaxed density matrix as D rel = D + D ( vo ) + D ( ov ) , (A1)where the first term is the unrelaxed spin-adapted ECISDdensity matrix ( (cid:104) Ψ | ˆ P a † p a q ˆ P | Ψ (cid:105) ), and superscripts ( vo )and ( ov ) stand for the contributions from the correspond-ing space z ai and z ia , respectively.For the explicitly spin-projected Hamiltonian, ˆ P ˆ H ˆ P = (cid:80) g,g (cid:48) w g w g (cid:48) ˆ R g (cid:48) ˆ H ˆ R g , it can be shown that PHF Fock forthe ai component ( κ = α, β spin) is¯¯ F aκ,iκ = 1 (cid:104) Φ | ˆ P | Φ (cid:105) (cid:88) g,g (cid:48) w g w g (cid:48) n g (cid:48) g (cid:32) ( ¯¯ E g (cid:48) g − ¯¯ E PHF ) (cid:104) Φ | a † iκ a aκ ˆ R g (cid:48) ˆ R g | Φ (cid:105) n g (cid:48) g + ( ¯¯ F g (cid:48) g ) pσ,qσ (cid:104) Φ | a aκ ˆ R g (cid:48) a † pσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:104) Φ | a † iκ ˆ R g (cid:48) a qσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:33) , (A2)where n g (cid:48) g = (cid:104) Φ | ˆ R g (cid:48) ˆ R g | Φ (cid:105) , (A3)¯¯ E g (cid:48) g = (cid:88) pq (cid:88) σ h pq (cid:104) Φ | ˆ R g (cid:48) a † pσ a qσ ˆ R g | Φ (cid:105) n g (cid:48) g + 12 (cid:88) pqrs (cid:88) σκ (cid:104) pq || rs (cid:105) (cid:104) Φ | ˆ R g (cid:48) a † pσ a rσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:104) Φ | ˆ R g (cid:48) a † qκ a sκ ˆ R g | Φ (cid:105) n g (cid:48) g (A4)¯¯ E PHF = 1 (cid:104) Φ | ˆ P | Φ (cid:105) (cid:88) g,g (cid:48) w g w g (cid:48) n g (cid:48) g ¯¯ E g (cid:48) g , (A5)( ¯¯ F g (cid:48) g ) pσ,qσ = h pσ,qσ + (cid:88) rs,κ (cid:104) pr || qs (cid:105) (cid:104) Φ | ˆ R g (cid:48) a † rκ a sκ ˆ R g | Φ (cid:105) n g (cid:48) g . (A6)Note that the spin-adapted relaxed density contribution in the MO basis D ( vo ) qσ,pσ then comes from ¯¯ E (¯ x ) g (cid:48) g , ¯¯ E (¯ x )PHF , and( ¯¯ F (¯ x ) g (cid:48) g ) pσ p ,qσ q . Therefore, we have D ( vo ) qσ,pσ = (cid:88) κ z aκ,iκ (cid:88) gg (cid:48) w g w (cid:48) g n g (cid:48) g (cid:32) (cid:104) Φ | a aκ ˆ R g (cid:48) a † pσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:104) Φ | a † iκ ˆ R g (cid:48) a qσ ˆ R g | Φ (cid:105) n g (cid:48) g + (cid:104) Φ | ˆ R g (cid:48) a † pσ a qσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:104) Φ | a † iκ a aκ ˆ R g (cid:48) ˆ R g | Φ (cid:105) n g (cid:48) g − (cid:104) Φ | ˆ R g (cid:48) a † pσ a qσ ˆ R g | Φ (cid:105) n g (cid:48) g (cid:88) G (cid:48) G w G w G (cid:48) n G (cid:48) G (cid:104) Φ | a † iκ a aκ ˆ R G (cid:48) ˆ R G | Φ (cid:105) n G (cid:48) G (cid:33) . (A7)The last term is simply a product of two projected ele-ments; both n g (cid:48) g and n G (cid:48) G cancel out between the nu-merator and denominator, and one can use the definitionˆ P ≡ (cid:80) g w g ˆ R g . On the other hand, n g (cid:48) g in the the first two terms also cancel out by using the nonorthogonalWick theorem in a backward manner, and these termscan be cast as a single term. Finally, we find D ( vo ) qσ,pσ = (cid:88) κ z aκ,iκ (cid:32)(cid:88) gg (cid:48) w g w (cid:48) g n g (cid:48) g (cid:104) Φ | a † iκ a aκ ˆ R g (cid:48) a † pσ a qσ ˆ R g | Φ (cid:105) n g (cid:48) g − (cid:104) Φ | a † iκ a aκ ˆ P | Φ (cid:105)(cid:104) Φ | ˆ P a † pσ a qσ ˆ P | Φ (cid:105) (cid:33) = (cid:88) κ z aκ,iκ (cid:32) (cid:104) Φ | a † iκ a aκ ˆ P a † pσ a qσ ˆ P | Φ (cid:105) − (cid:104) Φ | a † iκ a aκ ˆ P | Φ (cid:105)(cid:104) Φ | ˆ P a † pσ a qσ ˆ P | Φ (cid:105) (cid:33) , (A8)for which we can resort to the Wigner-Eckart theorem for the expansion of ˆ P a † pσ a qσ ˆ P to a linear combination4of half-projected, mixed-spin operators a † pλ a qτ ˆ P . B. Alternative derivation of energy-weighted densitymatrix in PHF
Here we check if the energy-weighted density matrix W in PHF can also be obtained by only considering thedifference between ˜ L ( x ) g and ∂ ˜ L g /∂κ ∗ pq , that is, Eq.(46).Again, for the κ ∗ pq differentiation, we have to write thePHF energy as a function of only W g , ˜ L g , and ˜ R g toassure no explicit C † dependence, so that one can usethe chain-rule as in Eq.(45). For PHF, it is evident that z → and E g → E g , so the energy-weighted densitymatrix W is defined as W pq = − C ∗ µp (cid:88) g w g n g ∂E g ∂ ( ˜ L g ) rµ ( L g ) rq . (B1)Then, the ˜ L g dependence of E PHF (or E g ) only appearsnaturally in the transition density matrix in the AO basiswritten as ( ρ g ) µν = ( ˜ R g ) µi ( W oo g ) − ij ( ˜ L g ) jµ . (B2)and there is no dependence in n g , because it can be com-pletely expressed by W g . Thus, the energy-weighteddensity matrix for PHF gradient may be given simplyby W pq = − C ∗ µp (cid:88) g w g n g ∂E g ∂ ( ρ g ) αβ ∂ ( ρ g ) αβ ∂ ( ˜ L g ) rµ ( L g ) rq = − C ∗ µp (cid:88) g w g n g ( F g ) βα [ S − R g C o ] αk δ µβ ( L g ) kq = − (cid:88) g w g n g ( F g ρ g ) pq , (B3)where we have used ∂ ( ρ g ) αβ ∂ ( ˜ L g ) kµ = [ S − R g C o ] αk δ µβ . (B4) ∂ ( ρ g ) αβ ∂ ( ˜ L g ) aµ = 0 . (B5)This is consistent with the energy-weighted density ma-trix of PHF previously derived in Refs.[23 and 24], W = (cid:88) g w g n g (cid:16) ( E g − E PHF ) ρ g − ρ g F g ρ g (cid:17) . (B6)which, using the variational condition of the PHF Fockmatrix F = ∂E PHF ∂ κ ∗ = (cid:88) g w g n g (cid:16) ( E g − E PHF ) ρ g + η g F g ρ g (cid:17) = , (B7) becomes W = − (cid:88) g w g n g ( η g F g ρ g + ρ g F g ρ g ) = − (cid:88) g w g n g F g ρ g . (B8)Hence, the equivalence is proven. P.-O. L¨owdin, Phys. Rev. , 1474 (1955). P.-O. L¨owdin, Phys. Rev. , 1509 (1955). G. E. Scuseria, C. A. Jim´enez-Hoyoz, T. M. Henderson,K. Samanta, and J. K. Ellis, J. Chem. Phys. , 124108 (2011). C. A. Jim´enez-Hoyoz, T. M. Henderson, T. Tsuchimochi, andG. E. Scuseria, J. Chem. Phys. , 164109 (2012). T. Tsuchimochi and T. Van Voorhis, J. Chem. Phys. , 164117(2014). T. Tsuchimochi and T. Van Voorhis, J. Chem. Phys. , 124103(2015). T. Tsuchimochi, J. Chem. Phys. , 144114 (2015). T. Tsuchimochi and S. Ten-no, J. Chem. Phys. , 011101(2016). T. Tsuchimochi and S. Ten-no, J. Chem. Theory Comput. ,1741 (2016). T. Tsuchimochi and G. E. Scuseria, J. Chem. Phys. , 121102(2009). G. E. Scuseria and T. Tsuchimochi, J. Chem. Phys. , 164119(2009). A. J. Garza, C. A. Jim´enez-Hoyoz, and G. E. Scuseria, J. Chem.Phys. , 134102 (2013). A. J. Garza, C. A. Jim´enez-Hoyoz, and G. E. Scuseria, J. Chem.Phys. , 140 (2014). A. I. Krylov, Chem. Phys. Lett. , 375 (2001). A. I. Krylov, Chem. Phys. Lett. , 522 (2001). A. I. Krylov, Acc. Chem. Res. , 83 (2006). G. H. Diercksen, B. O. Roos, and A. J. Sadlej, Chem. Phys. ,29 (1981). J. Alml¨of and P. R. Taylor, Int. J. Quantum Chem. , 743(1985). P. Pulay, Mol. Phys. , 197 (1969). P. Pulay, Mol. Phys. , 473 (1970). J. Gerratt and I. M. Mills, J. Chem. Phys. , 1719 (1968). N. C. Handy and H. F. S. III, J. Chem. Phys. , 5031 (1984). R. Schutski, C. A. Jim´enez-Hoyoz, and G. E. Scuseria, J. Chem.Phys. , 204101 (2014). M. Uejima and S. Ten-no, to be submitted . T. Helgaker and P. Jørgensen, Theor. Chim. Acta , 111 (1989). P. Jørgensen and T. Helgaker, J. Chem. Phys. , 1560 (1988). K. Hald et al., J. Chem. Phys. , 2985 (2003). P. Celani and H.-J. Werner, J. Chem. Phys. , 5044 (2003). B. Kaduk, T. Tsuchimochi, and T. Van Voorhis, J. Chem. Phys. , 18A503 (2014). S. R. Langhoff and E. R. Davidson, Int. J. Quantum Chem. ,61 (1974). E. R. Davidson,
The World of Quantum Chemistry , Reidel,Dordrecht, 1974. W. Duch and G. H. F. Diercksen, J. Chem. Phys. , 3018(1994). T. Helgaker, P. Jørgensen, and J. Olsen,
Molecular Electronic-Structure Theory , Wiley, Chichester, U. K., 2000. M. Tinkham,
Group Theory and Quantum Mechanics , McGraw-Hill, New York, 1964, J. E. Rice and R. E. Amos, Chem. Phys. Lett. , 585, MRCIGrad. T. Tsuchimochi and G. E. Scuseria, J. Chem. Phys. , 141102(2010). T. Tsuchimochi and G. E. Scuseria, J. Chem. Phys. , 064101(2011). M. J. Frisch et al., Gaussian 09 Revision D.01, Gaussian Inc.Wallingford CT 2009. H.-J. Werner, P. J. Knowles, G. Knizia, F. R. Manby, andM. Sch¨utz, WIREs Comput Mol Sci , 242 (2012). T. J. Lee, W. D. Allen, and H. F. Schaefer III, J. Chem. Phys. , 7062 (1987). P. Borowski, K. Andersson, p. ˚A. Malmqvist, and B. O. Roos, J.Chem. Phys. , 5568 (1992). X. Li and J. Paldus, J. Chem. Phys. , 2844 (1999). K. Kowalski, J. Chem. Phys. , 014102 (2005). O. Hino, T. Kinoshita, G. K.-L. Chan, and R. J. Bartlett, J.Chem. Phys. , 114311 (2006). A. Kalemos and A. Mavridis, J. Chem. Phys. , 054312 (2008). P. J. Hay, T. H. Dunning Jr., and W. A. Goddard III, J. Chem.Phys. , 3912 (1975). R. Schutski, private communication. O. L. Chapman, C. L. McIntosh, and J. Pacansky, J. Am. Chem.Soc. , 614 (1973). O. L. Chapman, D. De la Cruz, R. Roth, and R. Pacansky, J.Am. Chem. Soc. , 1337 (1973). D. W. Whitman and B. K. Carpenter, J. Am. Chem. Soc. ,6473 (1982). B. K. Carpenter, J. Am. Chem. Soc. , 1700 (1983). A. Balkov´a and R. J. Bartlett, J. Chem. Phys. , 8972 (1994). S. V. Levchenko and A. I. Krylov, J. Chem. Phys. , 175(2004). M. Eckert-Maksi´c, M. Vazdar, M. Barbatti, H. Lischka, and Z. B.Maksi´c, J. Chem. Phys. , 064310 (2006). K. Bhaskaran-Nair, O. Demel1, and J. Pittner, J. Chem. Phys. , 184105 (2008). X. Li and J. Paldus, J. Chem. Phys. , 114103 (2009). J. Shen and P. Piecuch, J. Chem. Phys. , 144104 (2012). T. Tsuchimochi and S. Ten-no, arXiv:1612.02945 (2016). I. A. Pople, J. S. Binkley, and R. Seeger, Int. J. Quantum Chem.
S11 , 149 (1977). J. S. Sears, C. D. Sherrill, and A. I. Krylov, J. Chem. Phys.118