Is Externally Corrected Coupled Cluster Always Better than the Underlying Truncated Configuration Interaction?

Abstract

The short answer to the question in the title is 'no'. We identify classes of truncated configuration interaction (CI) wave functions for which the externally corrected coupled-cluster (ec-CC) approach using the three-body (T_{3}) and four-body (T_{4}) components of the cluster operator extracted from CI does not improve the results of the underlying CI calculations. Implications of our analysis, illustrated by numerical examples, for the ec-CC computations using truncated and selected CI methods are discussed. We also introduce a novel ec-CC approach using the T_{3} and T_{4} amplitudes obtained with the selected CI scheme abbreviated as CIPSI, correcting the resulting energies for the missing T_{3} correlations not captured by CIPSI with the help of moment expansions similar to those employed in the completely renormalized CC methods.



Ilias Magoulas,† Karthik Gururangan,† Piotr Piecuch,∗,†,‡ J. Emiliano Deustua,† and Jun Shen†
†Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, USA
‡Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan 48824, USA

E-mail: [email protected]

arXiv [physics.chem-ph]

1 INTRODUCTION

It is well-established that methods based on the exponential wave function ansatz of coupled-cluster (CC) theory,

$$|\Psi\rangle = e^{T} |\Phi\rangle, \tag{1}$$

where

$$T = \sum_{n=1}^{N} T_n \tag{2}$$

is the cluster operator, $T_n$ is the $n$-body component of $T$, $N$ is the number of correlated electrons, and $|\Phi\rangle$ is the reference determinant defining the Fermi vacuum, are among the most efficient ways of incorporating many-electron correlation effects in molecular applications. However, the conventional and most practical single-reference CC methods, including the CC singles and doubles (CCSD) approach, where $T$ is truncated at $T_2$, and the quasi-perturbative correction to CCSD due to $T_3$ clusters defining the widely used CCSD(T) approximation, fail in multi-reference situations, such as bond breaking and strongly correlated systems (cf., e.g., refs 7,8,12). In fact, no traditional truncation in the cluster operator at a given many-body rank, including higher-order CC methods, such as the CC approach with singles, doubles, and triples (CCSDT), where $T$ is truncated at $T_3$, and the CC approach with singles, doubles, triples, and quadruples (CCSDTQ), where $T$ is truncated at $T_4$, can handle systems with larger numbers of strongly correlated electrons.
Since conventional multi-reference methods of quantum chemistry may be inapplicable to such problems as well (in part due to rapidly growing dimensionalities of the underlying active spaces), it is worth exploring various alternative ideas, including those that combine different wave function ansätze, which would allow us to provide an accurate and balanced description of nondynamic and dynamic correlations in a wide range of many-electron systems encountered in chemical applications.

One of the interesting ways of improving the results of single-reference CC calculations in multi-reference and strongly correlated situations, which is based on combining the CC and non-CC (e.g., configuration interaction (CI)) concepts and which is the main topic of this study, is the externally corrected CC (ec-CC) framework (see ref 32 for a review). The ec-CC approaches are based on the observation that as long as the electronic Hamiltonian $H$ does not contain higher–than–two-body interactions, the CC amplitude equations projected on the singly and doubly excited determinants,

$$\langle \Phi_{i}^{a} | (H_N e^{T})_C | \Phi \rangle = 0 \tag{3}$$

and

$$\langle \Phi_{ij}^{ab} | (H_N e^{T})_C | \Phi \rangle = 0, \tag{4}$$

respectively, in which no approximations are made, do not engage the higher-rank $T_n$ components of the cluster operator $T$ with $n > 4$. Thus, by solving these nonlinear, energy-independent, equations, which can also be written as

$$\langle \Phi_{i}^{a} | [F_N + (F_N T_1)_C + (F_N T_2)_C + (F_N \tfrac{1}{2} T_1^2)_C + (V_N T_1)_C + (V_N T_2)_C + (V_N T_3)_C + (V_N \tfrac{1}{2} T_1^2)_C + (V_N T_1 T_2)_C + (V_N \tfrac{1}{6} T_1^3)_C] | \Phi \rangle = 0, \tag{5}$$

$$\langle \Phi_{ij}^{ab} | [(F_N T_2)_C + (F_N T_3)_C + (F_N T_1 T_2)_C + V_N + (V_N T_1)_C + (V_N T_2)_C + (V_N T_3)_C + (V_N T_4)_C + (V_N T_1 T_2)_C + (V_N \tfrac{1}{2} T_1^2)_C + (V_N \tfrac{1}{2} T_2^2)_C + (V_N T_1 T_3)_C + (V_N \tfrac{1}{6} T_1^3)_C + (V_N \tfrac{1}{2} T_1^2 T_2)_C + (V_N \tfrac{1}{24} T_1^4)_C] | \Phi \rangle = 0, \tag{6}$$

for the singly and doubly excited clusters, $T_1$ and $T_2$, respectively, in the presence of their exact triply ($T_3$) and quadruply ($T_4$) excited counterparts extracted from full CI (FCI), one obtains the exact $T_1$ and $T_2$ and the exact correlation energy $\Delta E \equiv E - \langle \Phi | H | \Phi \rangle$, which in the case of the Hamiltonians with two-body interactions is given by the expression

$$\Delta E = \langle \Phi | (H_N e^{T})_{FC} | \Phi \rangle = \langle \Phi | [(F_N T_1)_{FC} + (V_N T_2)_{FC} + (V_N \tfrac{1}{2} T_1^2)_{FC}] | \Phi \rangle. \tag{7}$$

This suggests that by using external wave functions capable of generating an accurate representation of $T_3$ and $T_4$ clusters, and subsequently solving for $T_1$ and $T_2$ using eqs 5 and 6, one should not only produce correlation energies that are much better than those obtained with CCSD, where $T_3$ and $T_4$ are zero, but also substantially improve the results of the calculations used to provide $T_3$ and $T_4$.

One of the two main objectives of this study is to examine the validity of the latter part of the above suggestion. By performing the appropriate mathematical analysis, backed by numerical examples, we point out that, in addition to the exact, FCI or full CC, states, there exist large classes of truncated CI and CC wave functions which, after extracting $T_1$ through $T_4$ from them via the cluster analysis procedure adopted in all ec-CC considerations, satisfy eqs 5 and 6.
In all those cases, which are further elaborated on in Section 2, the ec-CC calculations return the energies obtained in the calculations that provide $T_3$ and $T_4$ clusters. While it is obvious that any CC state with $T = \sum_{n=1}^{M} T_n$, where $2 \leq M \leq N$, including the conventional CCSD ($M = 2$), CCSDT ($M = 3$), CCSDTQ ($M = 4$), etc. truncations and their active-space CCSDt, CCSDtq, etc. analogs, which all treat the $T_1$ and $T_2$ components of $T$ fully, satisfies eqs 5 and 6, the finding that there are truncated CI states that result in the $T_n$ operators with $n = 1$–$4$ which are solutions of eqs 5 and 6 is less trivial, while having interesting consequences for the ec-CC methods using CI wave functions to determine $T_3$ and $T_4$.

Throughout this article, we use the notation in which $|\Phi_{i_1 \ldots i_n}^{a_1 \ldots a_n}\rangle$ are the $n$-tuply excited determinants, with $i, j, \ldots$ and $a, b, \ldots$ representing the occupied and unoccupied spin-orbitals in $|\Phi\rangle$, respectively, $H_N = H - \langle \Phi | H | \Phi \rangle = F_N + V_N$ is the Hamiltonian in the normal-ordered form, with $F_N$ and $V_N$ representing its one- and two-body components, and $(AB)_{FC}$, $(AB)_C$, and $AB$ are the fully connected, connected but not fully connected, and disconnected products of operators $A$ and $B$, respectively. In the language of diagrams, $(AB)_{FC}$ is a connected operator product having no external fermion lines, $(AB)_C$ is a connected operator product with some fermion lines left uncontracted, and $AB$ implies that no fermion lines connect $A$ and $B$. It should be noted that the connected operator product is not necessarily the same as a connected quantity in a diagrammatic sense of the many-body perturbation theory (MBPT), since operators $A$ and $B$ involved in forming $(AB)_{FC}$ or $(AB)_C$ may themselves be disconnected or, even, unlinked.
For example, the $T_n$ components of the cluster operator $T$, which enter eqs 5–7, are connected in a sense of MBPT diagrammatics if they originate from standard CC computations or cluster analysis of FCI, but they may no longer be connected if obtained from cluster analysis of truncated CI wave functions. The latter remark is particularly relevant to the subject of this study. The connected operator product only means that operators $A$ and $B$, each treated as a whole, are connected with at least one fermion line, but the connectedness of $(AB)_C$ in the MBPT sense depends on the contents of $A$ and $B$.

The formal and numerical results reported in this study may have implications for the development of future methods based on the ec-CC ideas. The external sources of $T_3$ and $T_4$ clusters adopted in the ec-CC methods developed to date include projected unrestricted Hartree–Fock wave functions, which were used in the past to rationalize diagram cancellations defining the approximate coupled-pair approaches, when applied to certain classes of strongly correlated model systems (see, also, ref 32 and references therein), and wave functions obtained with methods designed to capture nondynamic correlation effects relevant to molecular applications, such as bond breaking and polyradical species, including valence-bond, complete-active-space self-consistent-field (CASSCF), multi-reference CI (MRCI), perturbatively selected CI (PSCI), FCI quantum Monte Carlo (FCIQMC), and adaptive CI (ACI) approaches. One could also develop extensions of the ec-CC formalism by considering projections of the CC equations on higher–than–doubly excited determinants and extracting the relevant $T_n$ components with $n \geq$ […] $T$- and $T$-containing terms extracted from CASSCF.
While some of the above ec-CC methods, especially the reduced multi-reference CCSD (RMRCCSD) approach, which uses $T_3$ and $T_4$ clusters extracted from MRCI, and its RMRCCSD(T) extension correcting the RMRCCSD energies for certain types of $T_3$ correlations missing in MRCI wave functions, the PSCI-based ec-CC scheme introduced in ref 26, and the cluster-analysis-driven FCIQMC (CAD-FCIQMC) approach of refs 30,35, which utilizes $T_3$ and $T_4$ amplitudes extracted from stochastic wave function propagations defining the FCIQMC framework, offer considerable improvements compared to both CCSD and the underlying CI calculations providing $T_3$ and $T_4$ clusters (cf., e.g., refs 27–29,41–46 for illustrative examples of successful RMRCC computations), there are situations where the improvements are minimal or none.

The most recent example demonstrating that the ec-CC computations do not necessarily outperform the underlying CI calculations is the ACI-CC method implemented in ref 31. In the ACI-CC method, which we suggested two years earlier but did not pursue, one uses $T_3$ and $T_4$ clusters extracted from the wave functions obtained with the ACI approach. As shown, for example, in Table 2 of ref 31, the ACI-CCSD calculations worsen the underlying ACI results for the automerization of cyclobutadiene so much that there is virtually no difference between the ACI-CCSD and poor CCSD energetics (see, also, Figure 5 in ref 31). The enrichment of the ACI wave functions using MRCI-like arguments through the extended ACI approach abbreviated by the authors of ref 31 as xACI, followed by cluster analysis to obtain $T_3$ and $T_4$ and ec-CC iterations to determine $T_1$ and $T_2$, defining the xACI-CCSD method, improves the ACI-CCSD barrier heights, but the xACI-CCSD computations do not improve the corresponding ACI and xACI results.
Similar remarks apply to the potential energy curve and vibrational term values of the beryllium dimer, shown in Figure 2 and Table 1 of ref 31, where there is virtually no difference between the CCSD(T) and ACI-CCSD(T) data (ACI-CCSD(T) stands for the ACI-CCSD calculations corrected for the $T_3$ correlations missing in the ACI wave functions). This shows that there are classes of truncated CI wave functions that are either good enough in their own right or that have a specific mathematical structure such that the ec-CC calculations using them do not offer any significant benefits, while adding to the computational costs.

This leads us to the second objective of the present study, discussed mostly in Section 3, namely, the exploration of an alternative to ACI, abbreviated as CIPSI, which stands for the CI method using perturbative selection made iteratively, as a source of $T_3$ and $T_4$ clusters in the ec-CC considerations. In analogy to ACI, CIPSI belongs to the broader category of selected CI approaches, which date back to the late 1960s and early 1970s and which have recently attracted renewed attention. In addition to using CIPSI to support parts of our mathematical analysis presented in Section 2, we demonstrate that the ec-CC approach using $T_3$ and $T_4$ cluster components extracted from the CIPSI wave functions improves the underlying CIPSI results, especially after correcting the ec-CC energies for the missing $T_3$ correlations. On the other hand, as further discussed in Section 3, the energies obtained with CIPSI corrected using multi-reference second-order MBPT can be competitive with the corresponding CIPSI-driven ec-CC computations, at least when smaller molecular systems are examined. This agrees with the excellent performance of the perturbatively corrected CIPSI approach, when compared with other methods aimed at near-FCI energetics, including the ec-CC-based CAD-FCIQMC scheme, observed, for example, in ref 62, which reinforces the importance of the question posed in the title of this work.
We begin our considerations with the formal analysis, supported by the numerical evidence shown in Tables 1–3, aimed at identifying the non-exact ground-state wave functions $|\Psi\rangle$ that, after performing cluster analysis on them, result in the $T_1$ through $T_4$ components satisfying eqs 5 and 6 and returning the energies associated with these $|\Psi\rangle$ states. The mathematics of the ec-CC framework, focusing on the ec-CC approaches using truncated CI wave functions to generate $T_3$ and $T_4$ clusters, and its implications are discussed in Section 2.1. The calculations illustrating the ec-CC theory aspects examined in Section 2.1 are discussed in Section 2.2.

2.1 Mathematical Analysis of the ec-CC Formalism

We have already noted that any state resulting from the CC calculations using $T = \sum_{n=1}^{M} T_n$, where $2 \leq M \leq N$, starting with the basic CCSD approximation and including the remaining members of the conventional CCSD, CCSDT, CCSDTQ, etc. hierarchy, satisfies eqs 5 and 6. In fact, any wave function $|\Psi\rangle$ that uses the exponential CC ansatz defined by eq 1 and treats $T_1$ and $T_2$ clusters fully satisfies these equations too. This alone, while obvious to CC practitioners, might already be a potential issue in the context of ec-CC considerations, since, at least in principle, one can envision situations where some non-exact, non-CC approaches recreate, to a good approximation, such CC states and energies associated with them, diminishing the value of the corresponding ec-CC calculations.

As elaborated on in this subsection, and as demonstrated in Appendices A and B which contain the relevant mathematical proofs, similar statements apply to certain classes of truncated CI approaches, when used as providers of $T_3$ and $T_4$ clusters for ec-CC computations. In particular, if we use any conventional CI truncation to define the ground state $|\Psi\rangle$ in which singly and doubly excited contributions are treated fully, as in the CISD, CISDT, CISDTQ, etc.
approaches, i.e.,

$$|\Psi\rangle = (1 + C) |\Phi\rangle, \tag{8}$$

where we assumed the intermediate normalization and where

$$C = \sum_{n=1}^{M} C_n \tag{9}$$

is the corresponding excitation operator, with $C_n$ representing its $n$-body components and $2 \leq M \leq N$, and then, after determining the CI excitation amplitudes by diagonalizing the Hamiltonian, define the cluster operator $T$ as

$$T = \ln(1 + C) = \sum_{m=1}^{N} \frac{(-1)^{m-1}}{m} C^{m}, \tag{10}$$

to bring the CI expansion, eq 8, to an exponential form, eq 1, which leads to the well-known definitions of the $T_1$ through $T_4$ components adopted in all ec-CC methods,

$$\begin{aligned} T_1 &= C_1, \\ T_2 &= C_2 - \tfrac{1}{2} C_1^2, \\ T_3 &= C_3 - C_1 C_2 + \tfrac{1}{3} C_1^3, \\ T_4 &= C_4 - C_1 C_3 - \tfrac{1}{2} C_2^2 + C_1^2 C_2 - \tfrac{1}{4} C_1^4, \end{aligned} \tag{11}$$

the resulting $T_n$, $n = 1$–$4$, amplitudes satisfy eqs 5 and 6. In other words, if we extract the $T_3$ and $T_4$ components of $T$ through the cluster analysis of the CI state $|\Psi\rangle$ defined by eqs 8 and 9, using the relationships between the $C_n$ and $T_n$ operators given by eq 11, and solve for $T_1$ and $T_2$ in the presence of $T_3$ and $T_4$ obtained in this way, we recover the truncated CI energy associated with $|\Psi\rangle$, without improving it at all. This means that if we follow the above recipe, without making any additional a posteriori modifications in $T_3$ and $T_4$, which we refer to as variant I of ec-CC, abbreviated throughout this paper as ec-CC-I, the ec-CC calculations using the CISD, CISDT, CISDTQ, etc. wave functions to generate $T_3$ and $T_4$ with the help of eq 11 reproduce the corresponding CI energies, nothing more. At first glance, the ec-CC-I calculations using the CISD and CISDT states as external sources of $T_3$ and $T_4$ amplitudes seem strange, but, as further clarified by the proofs presented in Appendices A and B, there is nothing strange about it. If we do not impose any constraints on the $T_3$ and $T_4$ components resulting from the cluster analysis defined by eq 11, the ec-CC calculations using the CISD or CISDT wave functions are as legitimate as all others.
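The relations in eq 11 follow from the logarithmic series of eq 10 because excitation operators commute. A minimal sketch of this bookkeeping is given below; it is only an illustration, assuming each $C_n$ can be replaced by a single scalar stand-in (real $C_n$ operators carry many amplitudes), and the names `cluster_analysis`, `eq11`, `full_like`, and `cisd_like` are hypothetical. The series coefficients of $\ln(1 + C)$ are generated by a standard power-series recurrence and compared with eq 11 in exact rational arithmetic.

```python
from fractions import Fraction


def cluster_analysis(c):
    """Return [T_1, ..., T_n] for T = ln(1 + C), with scalar stand-ins c = [C_1, ..., C_n].

    Uses the power-series identity n*C_n = sum_{k=1}^{n} k*T_k*C_{n-k} (with C_0 = 1),
    which holds because all excitation operators commute.
    """
    coef = [Fraction(1)] + [Fraction(x) for x in c]  # coef[0] = 1 (intermediate normalization)
    t = [Fraction(0)] * len(coef)
    for n in range(1, len(coef)):
        acc = sum((k * t[k] * coef[n - k] for k in range(1, n)), Fraction(0))
        t[n] = coef[n] - acc / n
    return t[1:]


def eq11(c1, c2, c3, c4):
    """Explicit T_1 through T_4 expressions of eq 11 for scalar stand-ins."""
    t1 = c1
    t2 = c2 - c1**2 / 2
    t3 = c3 - c1 * c2 + c1**3 / 3
    t4 = c4 - c1 * c3 - c2**2 / 2 + c1**2 * c2 - c1**4 / 4
    return [t1, t2, t3, t4]


# Invented numbers, chosen only to exercise the formulas.
full_like = [Fraction(1, 10), Fraction(-1, 5), Fraction(1, 20), Fraction(3, 100)]
cisd_like = [Fraction(1, 10), Fraction(-1, 5), Fraction(0), Fraction(0)]  # C_3 = C_4 = 0

t_full = cluster_analysis(full_like)
t_cisd = cluster_analysis(cisd_like)
print(t_full == eq11(*full_like))  # the recurrence and eq 11 should agree
print(t_cisd[2], t_cisd[3])        # nonzero T_3 and T_4 built from C_1 and C_2 only
```

In the CISD-like case the resulting $T_3$ and $T_4$ are nonzero even though $C_3 = C_4 = 0$, i.e., they consist solely of products of lower-rank contributions.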
The unconstrained ec-CC-I calculations using the CISD and CISDT wave functions in a cluster analysis return the corresponding CISD and CISDT energies, since the ec-CC-I scheme, as summarized above, allows for the purely disconnected $T_3$ and $T_4$ amplitudes, such as

$$T_3 = -C_1 C_2 + \tfrac{1}{3} C_1^3 \tag{12}$$

or

$$T_4 = -C_1 C_3 - \tfrac{1}{2} C_2^2 + C_1^2 C_2 - \tfrac{1}{4} C_1^4. \tag{13}$$

This undesirable feature of the ec-CC-I scheme is a consequence of artificially imposing the exponential structure of the wave function on a truncated CI expansion, which does not have it. The only conventional CI state that can be represented by the connected $T_n$ components by exploiting the relations between the $C_n$ and $T_n$ operators given by eq 11 is a FCI state.

In reality, the ec-CC-I scheme, as defined above, is never used in the context of ec-CC calculations based on $T_3$ and $T_4$ extracted from truncated CI, since allowing the purely disconnected forms of $T_3$ and $T_4$ operators, when the corresponding $C_3$ and $C_4$ amplitudes are zero, as in eqs 12 and 13, is problematic (we recall that in the exact, FCI, description, all many-body components of the cluster operator $T$ are connected). To eliminate the risk of introducing the purely disconnected three- and four-body components of the cluster operator $T$ into the ec-CC equations for $T_1$ and $T_2$, eqs 5 and 6, in all practical implementations of the ec-CC methodology employing truncated CI wave functions, such as those reported in refs 26–29,31, one keeps only those $T_3$ and $T_4$ amplitudes resulting from the cluster analysis for which the corresponding $C_3$ and $C_4$ excitation coefficients are nonzero. This does not guarantee a complete elimination of disconnected diagrams from the resulting $T_3$ and $T_4$ amplitudes, since all Hamiltonian diagonalizations using conventional CI truncations result in unlinked wave function contributions that do not cancel out, but it does take care of the negative consequences associated with the presence of the purely disconnected $T_3$ and $T_4$ terms that fall into the category of expressions represented by eqs 12 and 13.
The resulting ec-CC protocol, in which one does not allow the problematic $T_3$ and $T_4$ components that do not have the companion $C_3$ and $C_4$ amplitudes, is called in this paper variant II of ec-CC, abbreviated as ec-CC-II. The RMRCCSD approach introduced in ref 27, the ACI-CCSD scheme implemented in ref 31, and the CIPSI-driven ec-CC-II algorithm discussed in Section 3 are the examples of ec-CC methods in this category. The removal of certain classes of triples and quadruples from the ec-CC considerations, which results from imposing the above constraint on the $T_3$ and $T_4$ amplitudes allowed in eqs 5 and 6, can be compensated by correcting the ec-CC-II energies for the missing $T_3$ or $T_3$ and $T_4$ correlations, as in the CCSD(T)-like triples corrections to RMRCCSD and ACI-CCSD adopted in refs 29 and 31, defining the RMRCCSD(T) and ACI-CCSD(T) approaches, respectively, and the CIPSI-based ec-CC-II method introduced in Section 3, which relies on the more accurate moment corrections. We will return to the issue of correcting the ec-CC-II energies for the missing $T_3$ correlations when discussing our new CIPSI-driven ec-CC-II approach in the next section.

The ec-CC-II protocol eliminates problems resulting from allowing the purely disconnected forms of $T_3$ and $T_4$ operators in the ec-CC calculations, when the corresponding $C_3$ and $C_4$ amplitudes are zero, but it does not prevent the collapse of the ec-CC energies onto their CI counterparts. The ec-CC-II computations, in which the $T_3$ and $T_4$ clusters entering eqs 5 and 6 are extracted from CI calculations using a complete treatment of singles, doubles, triples, and quadruples, as in the CISDTQ, CISDTQP, CISDTQPH, etc. truncations, where $M$ in eq 9 is at least 4 and letters 'P' and 'H' in the CISDTQP and CISDTQPH acronyms stand for pentuples ($C_5$) and hextuples ($C_6$), respectively, return the corresponding CI energies.
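The selection rule that distinguishes ec-CC-II from ec-CC-I can be sketched in a few lines. The snippet below is a schematic illustration only, not the implementation used in any of the cited codes; it assumes that amplitudes are stored in dictionaries keyed by spin-orbital index tuples, and the function name `ec_cc_ii_filter`, the threshold `tol`, and all numerical values are hypothetical.

```python
def ec_cc_ii_filter(t_amps, c_amps, tol=1e-12):
    """Keep only T_n amplitudes (n = 3 or 4) that have a nonzero companion C_n coefficient.

    t_amps: amplitudes produced by the cluster analysis, keyed by index tuple.
    c_amps: CI excitation coefficients of the same rank, keyed the same way.
    """
    return {idx: t for idx, t in t_amps.items()
            if abs(c_amps.get(idx, 0.0)) > tol}


# Hypothetical triples data; keys are (i, j, k, a, b, c) spin-orbital indices.
c3 = {(0, 1, 2, 4, 5, 6): 0.012}       # the only triple present in the CI expansion
t3_from_cluster_analysis = {
    (0, 1, 2, 4, 5, 6): 0.010,         # has a companion C_3 coefficient: kept
    (0, 1, 3, 4, 5, 7): -0.003,        # purely disconnected (no C_3 companion): dropped
}

t3_ec_cc_i = dict(t3_from_cluster_analysis)                  # variant I keeps everything
t3_ec_cc_ii = ec_cc_ii_filter(t3_from_cluster_analysis, c3)  # variant II applies the rule
print(t3_ec_cc_ii)
```

When every triple and quadruple is present in the CI expansion, the filter removes nothing and the two variants coincide.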
In all of these cases, the ec-CC methodology, including even its most proper ec-CC-II variant, offers no benefits whatsoever, while adding to the computational costs associated with cluster analysis of the underlying CI wave functions and dealing with eqs 5 and 6 after the respective CI Hamiltonian diagonalizations. This is because once all singles, doubles, triples, and quadruples are included in CI calculations, every nonzero $T_n$, $n = 3, 4$, amplitude resulting from the cluster analysis of the underlying CI state has a companion, also nonzero, $C_n$ excitation coefficient, i.e., the ec-CC-I and ec-CC-II schemes become equivalent. One might argue that the above observation does not diminish the usefulness of ec-CC computations, since one never uses high-level single-reference CI methods, such as CISDTQ, as sources of $T_3$ and $T_4$ amplitudes in practical applications, which is a correct statement, but a remark like that could be misleading. Nowadays, one can generate approximate CI-type wave functions that provide a highly accurate representation of the $C_n$ […] and semi-stochastic implementations of selected CI techniques, as in heat-bath CI and modern implementations of CIPSI. As shown in Section 3, the ec-CC-II approach using the $T_3$ and $T_4$ clusters extracted from the CIPSI wave functions and corrected for the missing $T_3$ correlations does not have to improve the corresponding CIPSI energetics corrected using second-order MBPT when smaller many-electron problems are examined.

If we limit ourselves to conventional CI truncations defined by eqs 8 and 9, the only situations where the ec-CC-I and ec-CC-II protocols differ, giving the ec-CC-II approach a chance to improve the results of the corresponding ec-CC-I and CI computations, are those in which $T_3$ and $T_4$ clusters are extracted from CISD or CISDT. To be more specific, when one uses the CISD wave function, where $C_3$ and $C_4$ are by definition zero, in the ec-CC-II calculations, which means that $T_3$ and $T_4$ in eqs 5 and 6 are set at zero as well, the ec-CC-II energy becomes equivalent to that obtained with CCSD, as opposed to the usually less accurate CISD value obtained with ec-CC-I.
When the CISDT wave function is employed in the ec-CC-II computations, one solves eqs 5 and 6 for $T_1$ and $T_2$ in the presence of $T_3$ contributions having companion $C_3$ excitation amplitudes extracted from CISDT, which may lead to major improvements compared to the corresponding ec-CC-I calculations that return the CISDT energy and the results obtained with CCSD, in which $T_3$ is ignored, but since $C_4 = 0$ in CISDT, i.e., $T_4$ correlations are not accounted for, the CISDT-based ec-CC-II approach may produce erratic results in more complex multi-reference situations. We will return to the discussion of these accuracy patterns, including the equivalence of the ec-CC-I, ec-CC-II, and the underlying CI approaches when the ec-CC calculations use wave functions characterized by a full treatment of $C_n$ amplitudes with $n = 1$–$4$, during the examination of the numerical results shown in Table 1 in Section 2.2.

As shown in Appendices A and B, the above relationships between the results of the ec-CC calculations using $T_3$ and $T_4$ clusters extracted from truncated CI computations and the corresponding CI energetics can be generalized to unconventional truncations in the linear excitation operator $C$ defining the wave function $|\Psi\rangle$ via eq 8 as long as the $C_1$ and $C_2$ components of $C$ contain a complete set of singles and doubles. To state this generalization, which is the heart of the mathematical and numerical analysis presented in this section, more precisely, let us consider the CI eigenvalue problem in which the ground electronic state is defined as follows:

$$|\Psi\rangle = (1 + C^{(P_A)} + C^{(P_B)}) |\Phi\rangle, \tag{14}$$

where the singly and doubly excited components of $|\Psi\rangle$, described by the $C^{(P_A)}$ operator, are treated fully, i.e.,

$$C^{(P_A)} = C_1 + C_2, \tag{15}$$

and where the remaining wave function contributions, if any, are represented by

$$C^{(P_B)} = \sum_{n \geq 3} C_n^{(P_B)} \tag{16}$$

(as in the case of eq 8, we impose the intermediate normalization on $|\Psi\rangle$).
We do not make any specific assumptions regarding the $C^{(P_B)}$ operator other than the requirement that it does not contain the one- and two-body components of $C$, which are included in $C^{(P_A)}$. The above definitions encompass the previously discussed conventional CI truncations, starting from CISD, where $C^{(P_B)} = 0$, and including the remaining members of the CISD, CISDT, CISDTQ, etc. hierarchy, for which the relevant $n$-body components of $C^{(P_B)}$ are treated fully, and the various selected CI approaches with all singles and doubles and subsets of higher–than–double excitations. To facilitate our discussion below, and to aid the presentation of the proofs in Appendices A and B, we designate the subspace of the many-electron Hilbert space $\mathscr{H}$ spanned by the singly and doubly excited determinants, $|\Phi_{i}^{a}\rangle$ and $|\Phi_{ij}^{ab}\rangle$, respectively, which are jointly abbreviated as $|\Phi_\alpha\rangle$ and which match the content of the $C^{(P_A)}$ operator defined by eq 15, as $\mathscr{H}^{(P_A)}$. The subspace of $\mathscr{H}$ spanned by the determinants corresponding to the content of $C^{(P_B)}$, eq 16, denoted as $|\Phi_\beta\rangle$, is designated as $\mathscr{H}^{(P_B)}$, and the orthogonal complement to $\mathscr{H}^{(P)} \oplus \mathscr{H}^{(P_A)} \oplus \mathscr{H}^{(P_B)}$, which contains the remaining determinants $|\Phi_\gamma\rangle$ not included in the CI wave function $|\Psi\rangle$, is denoted as $\mathscr{H}^{(Q)}$ ($\mathscr{H}^{(P)}$ is a one-dimensional subspace of $\mathscr{H}$ spanned by the reference determinant $|\Phi\rangle$).
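The partitioning of the determinant space into the four subspaces introduced above can be illustrated with a small enumeration. The sketch below makes several simplifying assumptions that are not part of the formalism: determinants are represented as sets of occupied spin-orbital indices, spin and spatial symmetries are ignored, and the function names, the orbital count, and the two selected higher-rank determinants are hypothetical.

```python
from itertools import combinations


def excitation_rank(det, ref):
    """Number of reference spin-orbitals that are vacated in determinant det."""
    return len(ref - det)


def partition(n_spin_orbitals, ref, selected_higher):
    """Sort all determinants into the P, P_A, P_B, and Q subspaces.

    P   : the reference determinant
    P_A : all singly and doubly excited determinants (always treated fully)
    P_B : higher-than-doubly excited determinants present in the CI expansion
    Q   : the remaining determinants, absent from the CI expansion
    """
    spaces = {"P": [], "P_A": [], "P_B": [], "Q": []}
    for occ in combinations(range(n_spin_orbitals), len(ref)):
        det = frozenset(occ)
        rank = excitation_rank(det, ref)
        if rank == 0:
            spaces["P"].append(det)
        elif rank <= 2:
            spaces["P_A"].append(det)
        elif det in selected_higher:
            spaces["P_B"].append(det)
        else:
            spaces["Q"].append(det)
    return spaces


# Toy example: 4 electrons in 8 spin-orbitals, reference occupying orbitals 0-3.
reference = frozenset({0, 1, 2, 3})
selected = {frozenset({0, 5, 6, 7}),   # one triple picked up by a selected CI run
            frozenset({4, 5, 6, 7})}   # the single quadruple of this toy space
spaces = partition(8, reference, selected)
print({name: len(dets) for name, dets in spaces.items()})
```

Setting `selected` to the empty set corresponds to CISD, where $C^{(P_B)} = 0$, while including every higher-rank determinant empties the Q space and corresponds to FCI.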
Using the above notation, we can write the CI eigenvalue problem for the ground-state wave function $|\Psi\rangle$ given by eq 14 and the corresponding correlation energy $\Delta E^{(\mathrm{CI})}$ as follows:

$$\langle \Phi_\alpha | H_N (1 + C^{(P_A)} + C^{(P_B)}) | \Phi \rangle = \Delta E^{(\mathrm{CI})} \langle \Phi_\alpha | C^{(P_A)} | \Phi \rangle, \tag{17}$$

$$\langle \Phi_\beta | H_N (1 + C^{(P_A)} + C^{(P_B)}) | \Phi \rangle = \Delta E^{(\mathrm{CI})} \langle \Phi_\beta | C^{(P_B)} | \Phi \rangle, \tag{18}$$

where $|\Phi_\alpha\rangle \in \mathscr{H}^{(P_A)}$, $|\Phi_\beta\rangle \in \mathscr{H}^{(P_B)}$, and

$$\Delta E^{(\mathrm{CI})} = \langle \Phi | H_N (1 + C^{(P_A)} + C^{(P_B)}) | \Phi \rangle \tag{19}$$

(in the case of CISD calculations, where $C^{(P_B)} = 0$ and the set of determinants $|\Phi_\beta\rangle$ is empty, we only have to write eqs 17 and 19).

The main theorem of this work states that the ec-CC-I calculations, in which we obtain the $T_3$ and $T_4$ components entering eqs 5 and 6 by defining the cluster operator $T$ as

$$T = \ln\left(1 + C^{(P_A)} + C^{(P_B)}\right) \tag{20}$$

and then solve eqs 5 and 6 for the $T_1$ and $T_2$ clusters in the presence of $T_3$ and $T_4$ extracted from the CI state $|\Psi\rangle$, eq 14, determined by using eqs 17–19, without eliminating any $T_3$ and $T_4$ amplitudes resulting from the cluster analysis of $|\Psi\rangle$, return the CI correlation energy $\Delta E^{(\mathrm{CI})}$, eq 19, independent of truncations in $C^{(P_B)}$.
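Before turning to the proof, the energy expression of eq 19 can be checked numerically on a toy problem. The sketch below is purely illustrative and rests on invented inputs: a 4 × 4 symmetric matrix stands in for the Hamiltonian in a basis consisting of the reference, one "single", one "double", and one "triple", with the reference-triple element set to zero, as implied by the Slater rules for Hamiltonians with at most two-body interactions. The lowest eigenpair is obtained with a plain shifted power iteration, and the function names are hypothetical.

```python
def matvec(a, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def lowest_eigenpair(h, n_iter=500):
    """Lowest eigenpair of a small symmetric matrix via power iteration on (shift*1 - h)."""
    n = len(h)
    shift = max(sum(abs(x) for x in row) for row in h)  # bounds every |eigenvalue| from above
    v = [1.0] + [0.0] * (n - 1)                         # start from the reference
    for _ in range(n_iter):
        hv = matvec(h, v)
        w = [shift * v_i - hv_i for v_i, hv_i in zip(v, hv)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    energy = sum(v_i * hv_i for v_i, hv_i in zip(v, matvec(h, v)))
    return energy, v


# Invented matrix; basis order: reference, "single", "double", "triple".
h = [[-1.00, 0.10, 0.20, 0.00],   # the reference does not couple directly to the triple
     [0.10, 0.00, 0.05, 0.10],
     [0.20, 0.05, 0.50, 0.15],
     [0.00, 0.10, 0.15, 1.00]]

energy, v = lowest_eigenpair(h)
c = [x / v[0] for x in v]                               # intermediate normalization, c[0] = 1
delta_e = sum(h[0][k] * c[k] for k in range(1, len(h)))  # energy expression with H_N = H - H_00
print(delta_e, energy - h[0][0])                        # the two values should coincide
```

In this toy model the triple has a nonzero coefficient and influences the energy indirectly, through its coupling to the single and the double, but it does not appear explicitly in the energy expression.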
In order to prove this theorem, we have to focus on the subset of CI equations represented by eq 17, which corresponds to projections on the singly and doubly excited determinants,

$$\langle \Phi_{i}^{a} | H_N (1 + C_1 + C_2 + C^{(P_B)}) | \Phi \rangle = \Delta E^{(\mathrm{CI})} \langle \Phi_{i}^{a} | C_1 | \Phi \rangle \tag{21}$$

and

$$\langle \Phi_{ij}^{ab} | H_N (1 + C_1 + C_2 + C^{(P_B)}) | \Phi \rangle = \Delta E^{(\mathrm{CI})} \langle \Phi_{ij}^{ab} | C_2 | \Phi \rangle, \tag{22}$$

respectively, and the energy formula, eq 19, which, given the absence of higher–than–two-body interactions in the electronic Hamiltonian and the use of normal ordering in $H_N$, can also be written as

$$\Delta E^{(\mathrm{CI})} = \langle \Phi | H_N (C_1 + C_2) | \Phi \rangle. \tag{23}$$

As demonstrated, in two different ways, in Appendices A and B, the subsystem of CI equations represented by eq 17 or eqs 21 and 22, with the correlation energy $\Delta E^{(\mathrm{CI})}$ given by eq 19 or 23 and the cluster operator $T$ defined by eq 20, so that the $T_n$ components with $n = 1$–$4$ are obtained using eq 11, is equivalent to the CC amplitude equations projected on singles and doubles, represented by eqs 3 and 4 or, more explicitly, 5 and 6. In the proofs of this equivalence, the CI correlation energy $\Delta E^{(\mathrm{CI})}$, eq 19 or 23, becomes the corresponding CC energy given by eq 7.
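The logic behind this equivalence can be condensed into a few lines. What follows is only an informal sketch under the definitions of eqs 14, 17, 19, and 20, not a substitute for the proofs in Appendices A and B. It assumes the standard identity $H_N e^{T}|\Phi\rangle = e^{T}(H_N e^{T})_C|\Phi\rangle$ and uses the fact that $e^{T}$ can only raise the excitation rank; the symbols $M_{\alpha\alpha'}$ and $R_{\alpha'}$ are shorthand introduced here for illustration.

```latex
% Step 1: with e^{T} = 1 + C^{(P_A)} + C^{(P_B)}, eqs 17 and 19 become
\begin{align*}
\langle \Phi_\alpha | H_N e^{T} | \Phi \rangle
  &= \Delta E^{(\mathrm{CI})} \langle \Phi_\alpha | e^{T} | \Phi \rangle , \\
\Delta E^{(\mathrm{CI})}
  &= \langle \Phi | H_N e^{T} | \Phi \rangle
   = \langle \Phi | e^{T} (H_N e^{T})_C | \Phi \rangle
   = \langle \Phi | (H_N e^{T})_C | \Phi \rangle ,
\end{align*}
% because <Phi| e^{T} = <Phi| and <Phi_alpha| C^{(P_B)} |Phi> = 0.

% Step 2: insert the resolution of the identity between e^{T} and (H_N e^{T})_C.
% Only |Phi> and the singles/doubles |Phi_alpha'> of rank not exceeding that of
% |Phi_alpha> survive, since e^{T} cannot lower the excitation rank:
\begin{align*}
\langle \Phi_\alpha | e^{T} | \Phi \rangle \, \Delta E^{(\mathrm{CI})}
  + \sum_{\alpha'} M_{\alpha\alpha'} R_{\alpha'}
  &= \Delta E^{(\mathrm{CI})} \langle \Phi_\alpha | e^{T} | \Phi \rangle , \\
M_{\alpha\alpha'} = \langle \Phi_\alpha | e^{T} | \Phi_{\alpha'} \rangle , \qquad
R_{\alpha'} &= \langle \Phi_{\alpha'} | (H_N e^{T})_C | \Phi \rangle .
\end{align*}

% Step 3: the energy terms cancel, leaving sum_{alpha'} M_{alpha alpha'} R_{alpha'} = 0.
% M is unit triangular when determinants are ordered by excitation rank
% (M_{alpha alpha} = 1), hence invertible, so that
\begin{equation*}
R_{\alpha'} = \langle \Phi_{\alpha'} | (H_N e^{T})_C | \Phi \rangle = 0
\quad \text{for all singles and doubles,}
\end{equation*}
% i.e., eqs 3 and 4, with the energy of Step 1 taking the form of eq 7.
```

Nothing in these steps requires the $T_n$ components to be connected in the MBPT sense, which is why the argument applies to cluster operators obtained from truncated CI as well.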
The first proof of the above statement, presented in Appendix A, uses the formal definition of $T$, eq 20, which brings the CI expansion, eq 14, to a CC-like form given by eq 1, the well-known property of the exponential ansatz that reads (cf., e.g., refs 4,5,7,63–66)

$$H_N e^{T} |\Phi\rangle = e^{T} (H_N e^{T})_C |\Phi\rangle, \tag{24}$$

and the resolution of the identity in the many-electron Hilbert space,

$$P + P_A + P_B + Q = \mathbf{1}, \tag{25}$$

where

$$P = |\Phi\rangle\langle\Phi|, \tag{26}$$

$$P_A = \sum_{\alpha'} |\Phi_{\alpha'}\rangle\langle\Phi_{\alpha'}|, \tag{27}$$

$$P_B = \sum_{\beta} |\Phi_{\beta}\rangle\langle\Phi_{\beta}|, \tag{28}$$

and

$$Q = \sum_{\gamma} |\Phi_{\gamma}\rangle\langle\Phi_{\gamma}| \tag{29}$$

are the projection operators on the aforementioned $\mathscr{H}^{(P)}$, $\mathscr{H}^{(P_A)}$, $\mathscr{H}^{(P_B)}$, and $\mathscr{H}^{(Q)}$ subspaces and $\mathbf{1}$ is the unit operator, to convert the CI eqs 17 and 19 to the CC form represented by eqs 3, 4, and 7 (or 5–7). The second proof, shown in Appendix B, which relies on a diagrammatic approach, follows the opposite direction. It starts from the CC amplitude equations projected on the singly and doubly excited determinants, eqs 5 and 6, and the associated correlation energy formula, eq 7, which are subsequently transformed into the corresponding CI amplitude and energy equations, eqs 21–23, after expressing the $T_n$ components of $T$ with $n = 1$–$4$ in terms of the corresponding CI excitation operators $C_1$, $C_2$, and $C_n^{(P_B)}$, $n = 3, 4$, using eq 11.

In analogy to the previously discussed case of conventional CI truncations at a given many-body rank in the excitation operator $C$, defined by eqs 8 and 9, the relationship between the ec-CC calculations based on the more general form of the wave function $|\Psi\rangle$ defined by eq 14, which encompasses a wide variety of CI approximations outside the CISD, CISDT, CISDTQ, etc. hierarchy, and the underlying CI computations, as summarized above, has several implications. The most apparent one is the observation that the ec-CC-I calculations based on solving eqs 5 and 6, in which the $T_3$ and $T_4$ components of $T$ are obtained by cluster analysis of the CI wave functions that describe singles and doubles fully and higher–than–double excitations in a partial manner, return the underlying CI energies, without any improvements, if the purely disconnected $T_3$ and $T_4$ amplitudes of the type of eqs 12 and 13, for which the corresponding $C_3^{(P_B)}$ and $C_4^{(P_B)}$ contributions are zero, are not eliminated. More importantly, the ec-CC-II approach, which is what one normally uses in the CI-based ec-CC computations, in which such purely disconnected $T_3$ and $T_4$ amplitudes are disregarded when setting up eqs 5 and 6, may offer improvements over the corresponding CI calculations, but only if the triply and quadruply excited manifolds considered in CI are incomplete, i.e., the $C_3^{(P_B)}$ and $C_4^{(P_B)}$ components of $|\Psi\rangle$ include fractions of triples and quadruples. Once the $C_3^{(P_B)}$ and $C_4^{(P_B)}$ operators capture all triples and quadruples (which in practice may mean their significant fractions), the ec-CC-I and ec-CC-II schemes based on the wave functions $|\Psi\rangle$ defined by eq 14 become equivalent and the resulting ec-CC energies become identical (with the large fractions of triples and quadruples, similar) to those obtained with CI.
In other words, assuming that singles and doubles are treated in CI fully, one might say that the ec-CC approach improves the energies obtained in the underlying CI calculations only if the treatment of triples and quadruples in the latter calculations is incomplete. In that case, after performing the ec-CC-II computations using subsets of triples and quadruples provided by CI and selecting the $T_3$ and $T_4$ amplitudes accordingly, to match these subsets, one can obtain the desired improvements over the corresponding CI calculations, improving the CCSD energetics at the same time. This is especially true when the ec-CC-II energies are corrected for the remaining $T_3$ (ideally, $T_3$ and $T_4$) correlations, as in the aforementioned RMRCCSD(T) method and the CIPSI-driven ec-CC-II$_3$ approach discussed in Section 3. On the other hand, as mentioned in the Introduction, and as already alluded to above, the benefits offered by the CI-based ec-CC calculations compared to modern variants of selected CI techniques, represented in this work by the semi-stochastic CIPSI approach described in refs 50,51 or the ACI scheme of refs 47,48 used in the recently implemented ACI-CCSD and ACI-CCSD(T) approaches, may not be as substantial as desired.

Before discussing the numerical evidence supporting the key aspects of the above mathematical analysis in Section 2.2, it is worth mentioning that while in this study we focus on the most popular form of the ec-CC formalism, in which one solves the CC equations projected on singles and doubles, eqs 3 and 4 or 5 and 6, for the $T_1$ and $T_2$ clusters in the presence of $T_3$ and $T_4$ extracted from the external, non-CC source, one can extend our considerations to higher-order ec-CC variants that solve for higher-than-two-body components of the cluster operator $T$ (see, e.g., the ecCCSDt-CASSCF approach, discussed in ref 36, for an example of such a higher-order ec-CC method).
This can be done by reusing eqs 17–19 and redefining subspaces $\mathcal{H}^{(P_A)}$ and $\mathcal{H}^{(P_B)}$ and the corresponding excitation operators $C^{(P_A)}$ and $C^{(P_B)}$ that enter the CI wave function $|\Psi\rangle$ through eq 14. To illustrate this, let us consider the CI eigenvalue problem in which

$$C^{(P_A)} = \sum_{n=1}^{m_A} C_n, \tag{30}$$

with $m_A \geq 2$, so that the corresponding subspace $\mathcal{H}^{(P_A)}$ is spanned by all determinants $|\Phi_\alpha\rangle$ with the excitation ranks ranging from 1 to $m_A$, and

$$C^{(P_B)} = \sum_{n \geq m_A + 1} C_n^{(P_B)}, \tag{31}$$

where the many-body components $C_n^{(P_B)}$ with $n \geq m_A + 1$ describe contributions from the remaining determinants included in the CI calculations, designated as $|\Phi_\beta\rangle$ and spanning subspace $\mathcal{H}^{(P_B)}$. As demonstrated in Appendix A, the ec-CC-I calculations, in which one solves the CC system

$$\langle\Phi_\alpha| (H_N e^{T})_C |\Phi\rangle = 0, \tag{32}$$

where $|\Phi_\alpha\rangle \in \mathcal{H}^{(P_A)}$, for the $T_n$ components of $T$ with $n = 1, \ldots, m_A$ in the presence of the $T_{m_A+1}$ and $T_{m_A+2}$ clusters extracted from the CI state $|\Psi\rangle$ determined by using eqs 17–19, without eliminating any $T_{m_A+1}$ and $T_{m_A+2}$ amplitudes resulting from the cluster analysis of $|\Psi\rangle$, return the CI correlation energy $\Delta E^{(\text{CI})}$, eq 19. The ec-CC-II approach, where the purely disconnected $T_{m_A+1}$ and $T_{m_A+2}$ amplitudes of the type of eqs 12 and 13, for which the corresponding CI excitation coefficients in $C_{m_A+1}^{(P_B)}$ and $C_{m_A+2}^{(P_B)}$ are zero, are disregarded when setting up the CC system represented by eq 32, may improve the energies obtained in the CI calculations used to determine $T_{m_A+1}$ and $T_{m_A+2}$, but only if the excitation manifolds defining $C_{m_A+1}^{(P_B)}$ and $C_{m_A+2}^{(P_B)}$ are not treated fully. Once all $n$-tuply excited determinants with $n = m_A + 1$ and $m_A + 2$ are captured by the $C_{m_A+1}^{(P_B)}$ and $C_{m_A+2}^{(P_B)}$ operators, the ec-CC-I and ec-CC-II schemes based on the wave functions $|\Psi\rangle$ defined by eq 14 become equivalent and the resulting ec-CC energies become identical to those obtained with CI.
In making the above statements, we took advantage of the fact that the $T_n$ clusters with $n > m_A + 2$ do not enter eq 32, since electronic Hamiltonians do not contain higher-than-two-body interactions and the excitation ranks of determinants $|\Phi_\alpha\rangle$ do not exceed $m_A$.

The validity of the mathematical considerations discussed in Section 2.1 and Appendices A and B, and of the above remarks about the anticipated accuracy patterns in the ec-CC and the underlying CI computations, is supported by the numerical data shown in Tables 1–3. Our numerical example is the $C_{2v}$-symmetric double bond dissociation of the water molecule, as described by the cc-pVDZ basis set, in which both O–H bonds are simultaneously stretched without changing the ∠(H–O–H) angle. In addition to the equilibrium geometry, designated as $R = R_e$, we considered two stretches of the O–H bonds, by factors of 2 and 3, designated in our tables as $R = 2R_e$ and $3R_e$, respectively. All three geometries adopted in our calculations were taken from ref 68. Following ref 68, in all of the post-SCF computations reported in this work, we correlated all electrons and employed the spherical components of the $d$ functions contained in the cc-pVDZ basis. In all of the CI, CC, and ec-CC computations carried out in this study, we used the restricted Hartree–Fock (RHF) determinant as a reference $|\Phi\rangle$.

We use the H$_2$O/cc-pVDZ system as our molecular example, since it is small enough to allow all kinds of CI and CC computations, including high-level methods, such as CISDTQ and beyond and CCSDTQ, as well as FCI, which are all critical for the analysis of the ec-CC formalism presented in this work. At the same time, the $C_{2v}$-symmetric double bond dissociation of the water molecule creates significant challenges to many quantum chemistry approaches.
In particular, the stretched nuclear geometries considered in this study are characterized by substantial multi-reference correlation effects, which result in large triply and quadruply excited CI and CC amplitudes when a single-reference framework is employed, and which require a well-balanced description of nondynamic and dynamic correlations (see, e.g., refs 68,69). As shown, for example, in Table 1, the CISDTQ approach, which is very accurate at $R = R_e$, recovering the FCI energy to within a small fraction of a millihartree, struggles when the stretched geometries are considered, increasing the errors relative to FCI to 5.819 and 16.150 millihartree at $R = 2R_e$ and $3R_e$, respectively. The $R = 3R_e$ geometry is so demanding that even the CCSDTQ method, which is virtually exact at $R = R_e$ and $2R_e$, faces a challenge, producing the sizable − . 733 millihartree error relative to FCI when the RHF reference determinant is employed. CCSDTQ improves the erratic behavior of the CCSDT approach at $R = 3R_e$, which produces the energy 40.126 millihartree below FCI, but is not sufficient if one aims at a highly accurate description, pointing to the significance of higher-than-quadruply excited clusters in this case. A similar remark applies to the CI computations, which require an explicit inclusion of six-fold excitations if we are to recover the FCI energetics to within a millihartree at all three geometries of water considered in this study (as can be seen in Table 1, errors in the CISDTQP energies relative to FCI at $R = 2R_e$ and $3R_e$ exceed 2 and 6 millihartree, respectively).

In performing the various ec-CC computations reported in Tables 1–3, we relied on our in-house CC and cluster analysis codes, interfaced with the RHF, restricted open-shell Hartree–Fock, and integral routines in the GAMESS package. The CISD, CISDT, CISDTQ, CISDTQP, and CISDTQPH wave functions, which formed the non-CC sources of the three- and four-body clusters for the subsequent ec-CC-I and ec-CC-II calculations based on conventional CI truncations, presented in Table 1, were obtained using the occupation restricted multiple active space determinantal CI (ORMAS) code available in GAMESS. The selected CI wave functions used to provide the $T_3$ and $T_4$ cluster components for the ec-CC-I, ec-CC-II, and ec-CC-II$_3$ computations based on the CIPSI methodology, shown in Tables 2 and 3, which are further elaborated on in Section 3, were determined with the Quantum Package 2.0 software. As in the case of other post-SCF calculations reported in this study, all of our CIPSI runs relied on the transformed one- and two-electron integrals in an RHF molecular orbital basis generated with GAMESS.
While the authors of ref 68 obtained the FCI/cc-pVDZ energies of the water molecule at $R = R_e$, $2R_e$, and $3R_e$, we recalculated them in this study using the GAMESS determinantal FCI routines, since the FCI results at the latter two geometries reported in ref 68 were not converged tightly enough. The CCSD, CCSDT, and CCSDTQ energies were taken from ref 69, although we recalculated them here as well using our in-house CC codes interfaced with GAMESS.

As shown in Table 1, and in agreement with our mathematical analysis in Section 2.1 and Appendices A and B, the ec-CC-I energies obtained by solving eqs 5 and 6 for the singly and doubly excited clusters in the presence of the $T_3$ and $T_4$ components extracted from the CISD, CISDT, CISDTQ, CISDTQP, and CISDTQPH wave functions, without making any a posteriori modifications in $T_3$ and $T_4$ obtained in this way, perfectly match their CI counterparts. This happens because all of the above CI truncations are characterized by a complete treatment of the $C_1$ and $C_2$ operators. A similar behavior is observed in the ec-CC-I computations relying on the selected CI wave functions, obtained in this work with CIPSI, as sources of the triply and quadruply excited clusters, when the CI diagonalization spaces are large enough to capture all or nearly all singles and doubles. This can be seen in Tables 2 and 3. Indeed, when the CIPSI calculations initiated from the RHF wave functions, shown in Table 2, capture nearly all singly and doubly excited determinants at all three geometries of water considered in this study, which happens when the input dimension parameter $N_{\text{det(in)}}$ utilized by the CIPSI methodology to terminate the buildup of the CI diagonalization spaces, defined in Section 3, is 100,000 or more, the resulting ec-CC-I energies match their CIPSI counterparts to within a microhartree.
When $N_{\text{det(in)}}$ is set at 100,000, the final CI diagonalization spaces, which are used to obtain the wave functions that generate the $T_3$ and $T_4$ clusters for the ec-CC computations, contain about 200,000 $S_z = 0$ determinants of the $A_1(C_{2v})$ symmetry (see the $N_{\text{det(out)}}$ values in Table 2) and the corresponding CIPSI runs capture about 94 % of all singles and 98 % of all doubles at $R = R_e$, 100 % of singles and ∼92 % of doubles at $R = 2R_e$, and about 91 % of all singly excited and 79 % of all doubly excited determinants at $R = 3R_e$. Interestingly, the CIPSI and CIPSI-based ec-CC-I energies agree to within a millihartree when the CI diagonalization spaces contain as little as ∼ T and T .

Although one does not do it in typical applications of the CIPSI approach, we also performed a numerical experiment in which the process of building up the CI diagonalization spaces in CIPSI runs was initiated from the CISD wave function. We did this for the $R = 2R_e$ geometry. In this case, each CIPSI run was forced to provide a complete treatment of the $C_1$ and $C_2$ operators, so that, based on our mathematical considerations in Section 2.1, the resulting ec-CC-I energies and their CIPSI counterparts should be identical. As shown in Table 3, they are indeed identical for all the values of the input parameter $N_{\text{det(in)}}$ that permit CIPSI runs beginning with all singly and doubly excited determinants in the initial diagonalization space (in the case of the all-electron calculations for water, as described by the spherical cc-pVDZ basis set, the CISD wave function contains 3,416 $S_z = 0$ determinants of the $A_1(C_{2v})$ symmetry, so that $N_{\text{det(in)}}$ must be at least 3,416).

The above relationships between ec-CC-I and CI provide useful insights, but, as already pointed out, the realistic applications of the CI-based ec-CC methodology adopt the ec-CC-II protocol, where one keeps only those $T_3$ and $T_4$ amplitudes resulting from the cluster analysis of the underlying CI wave function for which the corresponding $C_3$ and $C_4$ excitation coefficients are nonzero. The ec-CC-II algorithm takes care of the problems resulting from the presence of the purely disconnected forms of the $T_3$ and $T_4$ clusters, such as those represented by eqs 12 and 13, which emerge when the corresponding $C_3$ and $C_4$ amplitudes are zero, but it does not prevent the collapse of the ec-CC energies onto their CI counterparts.
As implied by our formal analysis, the CI-based ec-CC-II calculations are capable of improving the corresponding CI energetics, but in order for this to happen, the triply and quadruply excited manifolds included in the CI diagonalizations used to determine $T_3$ and $T_4$ must be incomplete. Otherwise, i.e., when the underlying CI computations capture all triples and quadruples, the ec-CC-I and ec-CC-II schemes become equivalent, returning the corresponding CI energies.

We can see all of the above patterns in our tables. Indeed, as demonstrated in Table 1, variant II of the ec-CC methodology improves the CI energetics when one uses the CISD and CISDT wave functions in the corresponding cluster analyses, but once all triples and quadruples are included in CI, as in the case of the CISDTQ-, CISDTQP-, and CISDTQPH-based ec-CC-II calculations carried out in this study, the ec-CC-II and the associated CI energies do not differ. This may result in unusual and non-systematic accuracy patterns, or even in an erratic behavior of the ec-CC-II computations. For example, one normally anticipates that when the quality of the wave function improves, the resulting energies improve as well, but this is not the case when we examine the CI-based ec-CC-II energies of the water molecule at the stretched $R = 2R_e$ geometry shown in Table 1. The CISD-based ec-CC-II calculation, where $C_3 = C_4 = 0$, so that the $T_3$ and $T_4$ clusters entering eqs 5 and 6 are set at zero as well, returns the energy obtained with CCSD, reducing the massive, 72.017 millihartree, error resulting from the CISD diagonalization to 22.034 millihartree.
The ec-CC-II computation, in which one solves eqs 5 and 6 for $T_1$ and $T_2$ using the $T_3$ amplitudes extracted from the higher-rank CISDT wave function, offers further error reduction, to a 2.920 millihartree level, but the next scheme in the ec-CC-II hierarchy in Table 1, which uses a much better wave function in the cluster analysis than CISDT, by returning the CISDTQ energy, worsens the previous CISDT-based ec-CC-II result, increasing the error by a factor of 2. The situation at $R = 3R_e$ is even more peculiar. In this case, the replacement of the CISD wave function by its higher-level CISDT counterpart in the ec-CC-II calculations not only worsens the CISD-based ec-CC-II, i.e., CCSD energy, increasing the 10.849 millihartree unsigned error obtained with CCSD by a factor of 7, but also places the resulting energy 77.317 millihartree below FCI. By accounting for $T_4$ correlations, the ec-CC-II computation employing the CISDTQ wave function in the cluster analysis improves the erratic CISDT-based ec-CC-II result, but since the CISDTQ-based ec-CC-II and CISDTQ energies are identical and the CISDTQ energy, which differs from FCI by 16.150 millihartree, is rather poor, the benefits of using the ec-CC-II methodology are virtually none in this case. This points to the need for care in choosing how the truncated CI wave functions used in the context of ec-CC computations evolve. The CI algorithms that, when going from one truncation level to the next, capture the excitation spaces through quadruples too rapidly are not the best candidates for the ec-CC work.

Based on our formal considerations and numerical evidence, the truncated CI wave functions that are expected to benefit most from the subsequent ec-CC computations are those which attempt to probe the many-electron Hilbert space without saturating the lower-rank excitation manifolds, especially the excitations through quadruples, too early.
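The error factors quoted above follow directly from the Table 1 values cited in the text. The short check below only redoes that arithmetic; the dictionary keys are hypothetical labels introduced here for illustration and do not appear in the original tables.

```python
# Errors relative to FCI, in millihartree, exactly as quoted in the text.
# The labels are illustrative names, not notation from the paper.
errors_2re = {
    "CISD": 72.017,
    "ecCC-II from CISD (CCSD)": 22.034,
    "ecCC-II from CISDT": 2.920,
    "ecCC-II from CISDTQ (CISDTQ)": 5.819,
}
errors_3re = {
    "CCSD (unsigned)": 10.849,
    "ecCC-II from CISDT": -77.317,
    "CISDTQ": 16.150,
}


def error_ratio(new_error, old_error):
    """Ratio of unsigned errors; values above 1 mean the result got worse."""
    return abs(new_error) / abs(old_error)


# R = 2Re: moving from the CISDT-based to the CISDTQ-based ec-CC-II result.
worsening_2re = error_ratio(
    errors_2re["ecCC-II from CISDTQ (CISDTQ)"], errors_2re["ecCC-II from CISDT"]
)
# R = 3Re: moving from the CISD-based (CCSD) to the CISDT-based ec-CC-II result.
worsening_3re = error_ratio(
    errors_3re["ecCC-II from CISDT"], errors_3re["CCSD (unsigned)"]
)

print(f"R = 2Re worsening factor: {worsening_2re:.2f}")
print(f"R = 3Re worsening factor: {worsening_3re:.2f}")
```

Both ratios are consistent with the rounded factors of 2 and 7 stated in the text.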
We have seen this in our semi-stochastic CAD-FCIQMC work, which relies on the cluster analysis of FCIQMC wave functions, and we can see it again in Tables 2 and 3, where we examine the performance of the CIPSI-based ec-CC-II algorithm and its ec-CC-II$_3$ extension that corrects the ec-CC-II energies for the missing $T_3$ correlations that are not accounted for in CIPSI diagonalizations. The ec-CC-II and ec-CC-II$_3$ approaches that rely on the CIPSI wave functions to extract the information about the leading $T_3$ and $T_4$ clusters are discussed next.

CIPSI-DRIVEN ec-CC

The purpose of this section is to present and test a novel form of the ec-CC approach, focusing on the ec-CC-II protocol and its ec-CC-II$_3$ counterpart, in which the wave functions used to generate $T_3$ and $T_4$ clusters are obtained in the Hamiltonian diagonalizations defining the CIPSI approach, as implemented in the Quantum Package 2.0. As in the case of the numerical analysis discussed in Section 2.2, we used the water molecule, as described by the cc-pVDZ basis set, at the equilibrium and two displaced geometries in which both O–H bonds were simultaneously stretched by factors of 2 and 3 without changing the ∠(H–O–H) angle, to illustrate the performance of the CIPSI-driven ec-CC-II and ec-CC-II$_3$ methods.

We recall that, in analogy to many other selected CI schemes, the main idea of CIPSI is to perform a series of CI calculations using increasingly large, iteratively defined, diagonalization spaces, designated as $\mathcal{V}_{\text{int}}$. The construction of the $\mathcal{V}_{\text{int}}$ space for a given CIPSI iteration is carried out using a perturbative selection of the singly and doubly excited determinants from the previously determined $\mathcal{V}_{\text{int}}$. To be more precise, if $|\Psi^{(\text{CIPSI})}\rangle = \sum_{|\Phi_I\rangle \in \mathcal{V}_{\text{int}}} c_I |\Phi_I\rangle$ is a CI wave function associated with a given CIPSI iteration, where the coefficients $c_I$ and the corresponding energy $E_{\text{var}}$ are obtained by diagonalizing the Hamiltonian in the current $\mathcal{V}_{\text{int}}$ space, the diagonalization space for the subsequent CIPSI iteration is constructed using a perturbative selection of the singly and doubly excited determinants out of $|\Psi^{(\text{CIPSI})}\rangle$.
Thus, if $\mathcal{V}_{\text{ext}}$ designates the space of singly and doubly excited determinants out of $|\Psi^{(\text{CIPSI})}\rangle$, we calculate the second-order MBPT energy correction associated with each individual determinant $|\Phi_\rho\rangle \in \mathcal{V}_{\text{ext}}$, $e_\rho^{(2)} = |\langle\Phi_\rho| H |\Psi^{(\text{CIPSI})}\rangle|^2 / (E_{\text{var}} - \langle\Phi_\rho| H |\Phi_\rho\rangle)$, and use the resulting $e_\rho^{(2)}$ values to determine how to enlarge the current space $\mathcal{V}_{\text{int}}$. One can initiate CIPSI iterations, each consisting of the diagonalization of the Hamiltonian in the current space $\mathcal{V}_{\text{int}}$ to determine $|\Psi^{(\text{CIPSI})}\rangle$ and the identification of the associated $\mathcal{V}_{\text{ext}}$ space needed to construct $\mathcal{V}_{\text{int}}$ for the subsequent iteration, by starting from a single determinant, such as the RHF wave function, or a multi-determinantal state obtained, for example, in some preliminary CI calculation.

Following refs 50,51, in the specific CIPSI model adopted in this work, used to perform the CIPSI calculations for water reported in Tables 2 and 3, the actual $\mathcal{V}_{\text{ext}}$ spaces were obtained by stochastic sampling of the most important singles and doubles out of the $|\Psi^{(\text{CIPSI})}\rangle$ wave functions and the sampled determinants $|\Phi_\rho\rangle$ generated in each CIPSI iteration were arranged in descending order according to their $|e_\rho^{(2)}|$ values. We then enlarged each space $\mathcal{V}_{\text{int}}$, to be used in the subsequent CI diagonalization, starting from the determinants $|\Phi_\rho\rangle$ with the largest $|e_\rho^{(2)}|$ contributions and moving toward those with the smaller values of $|e_\rho^{(2)}|$, until the dimension of $\mathcal{V}_{\text{int}}$ was increased by the user-defined factor $f$ (in reality, this increase in the dimension of $\mathcal{V}_{\text{int}}$, from one CIPSI iteration to the next, was always slightly larger to ensure that the resulting CI wave function remained an eigenfunction of total spin).
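The selection loop described above can be illustrated with a deliberately simplified sketch. It is not the Quantum Package 2.0 implementation: the Hamiltonian is a small explicit matrix, the external space is taken to be every determinant outside the current internal space (rather than stochastically sampled singles and doubles), and no spin adaptation is performed. The names `toy_hamiltonian` and `cipsi_toy` are hypothetical helpers introduced only for this illustration; the run stops once the dimension of the internal space exceeds the input parameter, mirroring the role of $N_{\text{det(in)}}$ in the text.

```python
import numpy as np


def toy_hamiltonian(n, seed=0):
    """Hypothetical model matrix: rising diagonal plus weak symmetric coupling."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=0.1, size=(n, n))
    return np.diag(np.arange(n, dtype=float)) + 0.5 * (noise + noise.T)


def cipsi_toy(H, n_det_in, f=2, start=(0,)):
    """Grow an internal space by ranking external states by |e2| (toy model).

    Returns the final internal space and a history of
    (dimension, E_var, total second-order correction) per diagonalization.
    """
    n = H.shape[0]
    internal = list(start)
    history = []
    while True:
        # Variational step: lowest eigenpair of H in the current internal space.
        evals, evecs = np.linalg.eigh(H[np.ix_(internal, internal)])
        e_var, c = evals[0], evecs[:, 0]

        in_space = set(internal)
        external = [i for i in range(n) if i not in in_space]
        if not external:
            history.append((len(internal), e_var, 0.0))
            break

        # e2_rho = |<rho|H|Psi>|^2 / (E_var - <rho|H|rho>) for each external state.
        # Assumes E_var differs from every external diagonal element.
        coupling = H[np.ix_(external, internal)] @ c
        e2 = coupling**2 / (e_var - H[external, external])
        history.append((len(internal), e_var, float(e2.sum())))

        # Stop once the dimension exceeds the input threshold.
        if len(internal) > n_det_in:
            break

        # Add the states with the largest |e2| until the space grows by factor f.
        n_add = min(len(external), (f - 1) * len(internal))
        ranked = np.argsort(-np.abs(e2))[:n_add]
        internal.extend(external[k] for k in ranked)
    return internal, history


H = toy_hamiltonian(64, seed=0)
space, history = cipsi_toy(H, n_det_in=10)
for dim, e_var, de2 in history:
    print(dim, e_var, e_var + de2)
```

Because the internal spaces are nested, the variational energies in `history` cannot increase from one iteration to the next, and with `f = 2` the final dimension exceeds `n_det_in` by at most a factor of 2, as long as enough external states remain.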
In all of the CIPSI calculations reported in this study, we set $f$ at 2 (which is the default value of $f$ in Quantum Package 2.0), forcing the CIPSI wave function $|\Psi^{(\text{CIPSI})}\rangle$ to grow in a tempered manner, without saturating the lower-rank excitation manifolds too rapidly, while probing the many-electron Hilbert space more effectively at the same time. To obtain the CI wave function of a given CIPSI run used to determine the $T_3$ and $T_4$ clusters employed in the ec-CC computations, we chose to terminate the sequence of CIPSI diagonalizations when the dimension of space $\mathcal{V}_{\text{int}}$ exceeded the input parameter $N_{\text{det(in)}}$. Due to the aforementioned dimension-doubling growth mechanism, the size of the CI wave function at the end of a given CIPSI calculation, denoted as $N_{\text{det(out)}}$, always exceeded $N_{\text{det(in)}}$, but never by more than a factor of 2.

As a byproduct of calculating energy corrections $e_\rho^{(2)}$ associated with the sampled determinants $|\Phi_\rho\rangle \in \mathcal{V}_{\text{ext}}$ generated in each CIPSI iteration, in addition to the variational energies $E_{\text{var}}$, one has immediate access to the total second-order multi-reference MBPT corrections $\Delta E^{(2)} = \sum_{|\Phi_\rho\rangle \in \mathcal{V}_{\text{ext}}} e_\rho^{(2)}$. For each CIPSI run carried out in this study, as defined by the aforementioned wave function termination parameter $N_{\text{det(in)}}$, we report the uncorrected energy $E_{\text{var}}$ corresponding to the CI wave function obtained in the last Hamiltonian diagonalization of that run and its perturbatively corrected $E_{\text{var}} + \Delta E^{(2)}$ counterpart. The determination of $\Delta E^{(2)}$ is also important for a different reason. Although the stopping criterion adopted in our CIPSI runs executed with Quantum Package 2.0 relies on the above wave function termination parameter $N_{\text{det(in)}}$, the CIPSI iterations can also stop if the magnitude of the total second-order MBPT correction $\Delta E^{(2)}$ falls below a threshold parameter $\eta$.
To prevent this from happening, we used a very tight $\eta$ value of 10 − hartree.

After the completion of each CIPSI run, we cluster analyzed the wave function $|\Psi^{(\text{CIPSI})}\rangle$ obtained in the final Hamiltonian diagonalization of that run and used the resulting $T_3$ and $T_4$ components to perform the corresponding ec-CC computations. Since in the ec-CC-II calculations that interest us in this section the purely disconnected $T_3$ and $T_4$ amplitudes of the type of eqs 12 and 13, for which the corresponding $C_3$ and $C_4$ contributions are zero, are disregarded when setting up eqs 5 and 6, we corrected the ec-CC-II correlation energies $\Delta E$, determined using eq 7, for the missing $T_3$ effects not captured by ec-CC-II. In order to do this, we adopted the formulas that we previously used to develop the deterministic and semi-stochastic CC($P$;$Q$) approaches. According to the biorthogonal moment expansions behind the CC($P$;$Q$) framework, the corrections to the ec-CC-II correlation energies $\Delta E$ due to the $T_3$ effects not captured by the CIPSI wave functions, subjected to the so-called two-body approximation introduced in ref 76 that has been shown to provide a highly accurate representation of these effects, can be defined as follows:

$$\delta = \sum_{|\Phi_{ijk}^{abc}\rangle \notin \mathcal{V}_{\text{int}}} \ell_{ijk}^{abc}(2)\, M_{abc}^{ijk}(2). \tag{33}$$

Here,

$$M_{abc}^{ijk}(2) = \langle\Phi_{ijk}^{abc}| \overline{H}_N(2) |\Phi\rangle \tag{34}$$

are the moments of CC equations corresponding to projections on the triply excited determinants missing in the final CI diagonalization space $\mathcal{V}_{\text{int}}$ of a given CIPSI run, where the similarity-transformed Hamiltonian

$$\overline{H}_N(2) = (H_N e^{T_1 + T_2})_C = e^{-T_1 - T_2} H_N e^{T_1 + T_2} \tag{35}$$

is obtained using the singly and doubly excited clusters resulting from the ec-CC-II calculations.
The $\ell_{ijk}^{abc}(2)$ coefficients, which are given by the expression

$$\ell_{ijk}^{abc}(2) = \langle\Phi| (\Lambda_1 + \Lambda_2) \overline{H}_N(2) |\Phi_{ijk}^{abc}\rangle / (\Delta E - \langle\Phi_{ijk}^{abc}| \overline{H}_N(2) |\Phi_{ijk}^{abc}\rangle), \tag{36}$$

are the deexcitation amplitudes that require solving the companion left CC equations for the one- and two-body components of the operator $\Lambda$ that generates the CC bra state $\langle\Phi| (1 + \Lambda) e^{-T}$ (cf., e.g., ref 8), i.e.,

$$\langle\Phi| (1 + \Lambda_1 + \Lambda_2) [\overline{H}_N(2) - \Delta E] |\Phi_\alpha\rangle = 0, \tag{37}$$

where $|\Phi_\alpha\rangle$ represents the singly and doubly excited determinants.

As already alluded to in Section 2, the ec-CC approach, in which the ec-CC-II correlation energy $\Delta E$, eq 7, is corrected for the missing $T_3$ effects not captured by the underlying CI calculations using correction $\delta$ given by eq 33, defines the ec-CC-II$_3$ method. We use the above correction $\delta$ rather than its simplified CCSD(T)-like analog adopted in the RMRCCSD(T) and ACI-CCSD(T) work, since it is well established that, in analogy to the completely renormalized CC approaches, such as CR-CC(2,3), the CC($P$;$Q$) moment corrections are considerably more robust. Furthermore, there are molecular applications where the CCSD(T)-type corrections, instead of improving the underlying results, make the results worse (see, e.g., ref 77 for a discussion).
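The structure of the sum in eq 33 can be shown with a minimal sketch. It assumes that the moments and the deexcitation amplitudes have already been computed elsewhere and are stored in dictionaries keyed by a triple-excitation label; the function name `delta_correction`, the tuple labels, and the numerical values are all hypothetical and serve only to show that the sum runs over the triples missing from the final diagonalization space.

```python
def delta_correction(ell, moments, triples_in_v_int):
    """Sum ell[t] * moments[t] over the triples t absent from the CI space.

    ell, moments: dicts keyed by a triple-excitation label (i, j, k, a, b, c).
    triples_in_v_int: set of labels already present in the final CI space.
    """
    return sum(
        ell[t] * m for t, m in moments.items() if t not in triples_in_v_int
    )


# Made-up labels and numbers, for illustration only.
t1 = (0, 1, 2, 5, 6, 7)
t2 = (0, 1, 3, 5, 6, 8)
t3 = (1, 2, 3, 6, 7, 8)
moments = {t1: 0.02, t2: -0.01, t3: 0.05}
ell = {t1: -0.5, t2: 0.3, t3: -0.1}

# Triple t3 is already in the CI space, so it does not contribute.
delta = delta_correction(ell, moments, {t3})
print(delta)
```

If every triple is already contained in the diagonalization space, the sum is empty and the correction vanishes, which is the limit in which only the ec-CC-II energy remains.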
It should also be emphasized that the computational cost of determining the $\delta$ correction to the ec-CC-II correlation energy is only twice the cost of calculating the corresponding (T) correction, i.e., the replacement of $\delta$, eq 33, by its (T) counterpart offers no significant computational benefits either.

At this time, our deterministic ec-CC-II and ec-CC-II$_3$ codes, along with the routines that allow one to perform the corresponding ec-CC-I calculations examined in Section 2.2, are capable of reading the non-CC wave functions for the subsequent cluster analysis and the CC computations based on eqs 5–7 and 33–37 from GAMESS (the determinantal ORMAS and FCI runs) and Quantum Package 2.0 (CIPSI calculations). We also have the semi-stochastic variant of the ec-CC codes that can work with the wave functions obtained with FCIQMC, as in the CAD-FCIQMC methodology introduced in ref 30 and further elaborated on in the Supporting Information to ref 35. The remaining information about our ec-CC codes used in this work can be found in Section 2.2.

We now move to the discussion of our CIPSI-based all-electron ec-CC-II and ec-CC-II$_3$ calculations for the $C_{2v}$-symmetric double bond dissociation of the water molecule, as described by the cc-pVDZ basis set, summarized in Tables 2 and 3. The information about the nuclear geometries used in these calculations has been provided in Section 2.2. In analogy to the ec-CC-I computations discussed in Section 2.2, we performed two sets of CIPSI-driven ec-CC-II and ec-CC-II$_3$ calculations. In the first set, summarized in Table 2, where we applied the CIPSI-based ec-CC-II and ec-CC-II$_3$ approaches to the equilibrium geometry, $R = R_e$, and two stretches of both O–H bonds, including $R = 2R_e$ and $3R_e$, we initiated each CIPSI run from the corresponding RHF wave function.
In the second set, summarized in Table 3, where we focused on a single stretch of both O–H bonds, namely, $R = 2R_e$, we forced each CIPSI calculation preceding the ec-CC-II and ec-CC-II$_3$ steps to provide a complete treatment of the $C_1$ and $C_2$ operators by initiating the CIPSI runs from the wave function obtained with CISD. In each case, we considered multiple values of the input parameter $N_{\text{det(in)}}$ used to terminate the CIPSI runs by sampling $N_{\text{det(in)}}$ in a semi-logarithmic manner. In the case of the CIPSI calculations initiated from the single-determinantal RHF wave function (Table 2), the smallest value of $N_{\text{det(in)}}$ was 1. As already mentioned in Section 2.2, the smallest value of $N_{\text{det(in)}}$ used in the CIPSI calculations initiated from the CISD state had to be at least 3,416 (the number of $S_z = 0$ determinants defining the CISD ground-state problem if the $A_1(C_{2v})$ symmetry is employed). The smallest value of $N_{\text{det(in)}}$ that we used in this case was 5,000.

The results in Tables 2 and 3 demonstrate that both ec-CC-II and ec-CC-II$_3$, especially the latter method, offer significant improvements over the underlying CIPSI computations. This is particularly true when the relatively small values of $N_{\text{det(in)}}$ and the similarly small CIPSI diagonalization spaces, as defined by $N_{\text{det(out)}}$, are employed. They also significantly improve the corresponding CCSD results, where $T_3$ and $T_4$ are assumed to be zero. As shown in Table 2, with as little as about 5,000–6,000 determinants in the CIPSI calculations initiated from RHF, which capture approximately 40–50 % of singles, 20–70 % of doubles, 1–2 % of triples, and 0.2 % of quadruples, the ec-CC-II approach reduces the approximately 8, 14, and 10 millihartree errors relative to FCI obtained with CIPSI at $R = R_e$, $2R_e$, and $3R_e$, respectively, to about 2–4 millihartree. The triples correction $\delta$, eq 33, used in the ec-CC-II$_3$ calculations, reduces these errors even further, to 0.012, 0.226, and 1.507 millihartree, respectively.
These are impressive improvements, especially if we realize that the CISD and CCSD calculations, which use the same numbers of singly and doubly excited amplitudes as ec-CC-II and only slightly smaller numbers of excitation amplitudes than the $N_{\text{det(in)}}$ = 5 , $R = R_e$, more than 72 millihartree at $R = 2R_e$, and almost 165 millihartree at $R = 3R_e$ in the case of CISD and about 4, 22, and 11 millihartree, respectively, when the CCSD approach is employed (cf. Table 1). As a matter of fact, by comparing the results in Tables 1 and 2, we can see that the $N_{\text{det(in)}}$ = 5,000 ec-CC-II$_3$ calculations initiated from RHF are considerably more accurate than the CCSDT, CISDTQ, and even CISDTQP calculations, which are more expensive by orders of magnitude and which use 90,279, 1,291,577, and 10,502,233 $S_z = 0$ excitation amplitudes, when the $A_1(C_{2v})$ symmetry is employed, as opposed to 3,145 amplitudes used by the underlying ec-CC-II approach and about 5,000–6,000 determinants included in the CIPSI diagonalizations needed to extract the $T_3$ and $T_4$ clusters for the ec-CC computations. Interestingly, while the CIPSI-driven ec-CC-II$_3$ method using $N_{\text{det(in)}}$ = 5,000 is less accurate than the CCSDTQ approach in the $R = R_e$–$2R_e$ region, it becomes competitive with it when the largest stretch of both O–H bonds in water considered in this study, i.e., $R = 3R_e$, is examined, reducing the − . 733 millihartree error relative to FCI obtained with CCSDTQ by more than a factor of 3. Compared to ec-CC-II, there is an extra cost associated with the determination of the triples correction $\delta$ in the ec-CC-II$_3$ calculations, but the computational costs associated with this noniterative correction, being similar to those of CCSD(T), are less than the cost of a single iteration of CISDT or CCSDT. One also has to keep in mind that, in analogy to CCSD(T), one does not have to store higher-than-two-body quantities when forming the triples correction defined by eqs 33–37.

The results of the ec-CC-II and ec-CC-II$_3$ calculations employing the $T_3$ and $T_4$ amplitudes extracted from the CIPSI wave functions become even more accurate when the CIPSI diagonalization spaces start growing. For example, when the wave function termination parameter $N_{\text{det(in)}}$ is set at 50,000, which translates into about 80,000–90,000 determinants participating in the final Hamiltonian diagonalizations of the corresponding CIPSI runs, the ec-CC-II calculations reduce the 2.612, 2.436, and 0.906 millihartree errors relative to FCI at $R = R_e$, $2R_e$, and $3R_e$, respectively, obtained with CIPSI, by factors of 3–4, to 0.626 millihartree at $R = R_e$, 0.788 millihartree at $R = 2R_e$, and 0.341 millihartree at $R = 3R_e$. The $\delta$ correction reduces the already small errors obtained with the CIPSI-driven ec-CC-II approach at $R = R_e$ and $2R_e$ even further, to 0.168 and 0.515 millihartree, respectively. The ec-CC-II$_3$ calculations do not improve the underlying ec-CC-II result at $R = 3R_e$ any longer, most likely because the $\delta$ correction defined by eq 33 takes care of only the missing $T_3$ correlations not captured by CIPSI, without correcting the ec-CC-II energies for the missing $T_4$ effects, which at $R = 3R_e$ may become substantial (cf., e.g., the large difference between the CCSDTQ and CCSDT energies in Table 1).
On the other hand, the 0.358 millihartree error obtained with ec-CC-II$_3$ in a highly challenging multi-reference situation created by the $R = 3R_e$ structure of the water molecule is a very accurate result. One has to keep in mind that the much more expensive CISDTQ and CCSDTQ methods, which use 1,291,577 excitation amplitudes, as opposed to $N_{\text{det(out)}}$ = 92,707 determinants in the last CIPSI diagonalization space corresponding to $N_{\text{det(in)}}$ = 50,000 and 3,145 singles and doubles participating in the ec-CC steps, combined with the relatively inexpensive noniterative correction $\delta$, produce the 16.150 and −[…].733 millihartree errors, respectively, when the $R = 3R_e$ geometry is considered. As in the previously discussed $N_{\text{det(in)}}$ = 5,000 case, the $N_{\text{det(in)}}$ = 50,000 CIPSI-based ec-CC-II$_3$ calculations are also more accurate than CISDTQP, which uses 10,502,233 excitation amplitudes.

When we look at the overall picture emerging from the results reported in Tables 2 and 3, it is quite clear that the CIPSI-driven ec-CC-II and ec-CC-II$_3$ computations, especially the latter ones, offer a rapid convergence toward FCI with relatively small CIPSI diagonalization spaces. As shown, for example, in Tables 2 and 3, when one uses about 1,000,000 determinants in the final diagonalizations of the CIPSI runs, both the uncorrected ec-CC-II and the corrected ec-CC-II$_3$ calculations recover the FCI energetics at all three geometries of the water molecule examined in this work, including the most challenging $R = 3R_e$ structure, to within 0.1 millihartree. To appreciate this result, one has to keep in mind that 1,000,000 determinants in the diagonalization space is nowhere near the dimension of the FCI ground-state problem, which is 451,681,246 if the $S_z = 0$ $A_1(C_{2v})$-symmetric determinants are considered. What certainly helps the ec-CC-II and ec-CC-II$_3$ calculations in achieving this remarkable performance is the aforementioned tempered growth of the wave function in the consecutive CIPSI iterations, which allows the CIPSI algorithm, as implemented in the Quantum Package 2.0 software utilized in this study, to efficiently sample the many-electron Hilbert space, without saturating the lower-rank excitation manifolds, especially the excitations through quadruples, too early. As already demonstrated in Section 2, mathematically and numerically, if CIPSI saturated the lower-rank excitation manifolds too rapidly, without bringing information about higher-than-quadruply excited contributions, our ec-CC-II computations would collapse onto the results of the respective Hamiltonian diagonalizations.
This emphasizes the significance of the appropriate design of the CI (in general, non-CC) methodologies used to provide information about the $T_3$ and $T_4$ clusters in ec-CC considerations. Our computations suggest that the current design of the CIPSI algorithm in Quantum Package 2.0 is well suited for the ec-CC-II and ec-CC-II$_3$ approaches developed in this study.

Having stated all of the above, one cannot ignore the fact that the CIPSI approach can be very efficient in its own right, especially when the variational energies $E_{\text{var}}$ resulting from the underlying CI diagonalizations are corrected for the remaining correlation effects with the help of the aforementioned and easy-to-determine second-order multi-reference MBPT corrections $\Delta E^{(2)}$. As shown, for example, in Table 2, when the final diagonalizations of the CIPSI runs involve about 1,000,000 determinants, the $E_{\text{var}} + \Delta E^{(2)}$ energies are within a few microhartree of FCI, independent of the nuclear geometry, despite the fact that the FCI space is about 500 times larger. When $N_{\text{det(in)}}$ = 1,[…], the $E_{\text{var}} + \Delta E^{(2)}$ energies reported in Table 2 are competitive with their ec-CC-II and ec-CC-II$_3$ counterparts. Even with as few as about 80,000–90,000 determinants in the final diagonalizations of the CIPSI runs, which result from setting the $N_{\text{det(in)}}$ parameter at 50,000, the $E_{\text{var}} + \Delta E^{(2)}$ energies are still within 0.1 millihartree of FCI. While the perturbatively corrected CIPSI calculations for larger many-electron systems and larger basis sets may require additional extrapolations to achieve similarly accurate results, the fact of the matter remains that CIPSI represents a powerful computational tool capable of generating high-quality results by itself.
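The size comparison between the FCI space and a CIPSI diagonalization space of about 1,000,000 determinants can be sanity-checked as follows. This is a rough sketch under assumptions that are not stated in this passage: we take the water calculation to correlate 10 electrons in 24 orbitals and use the fact that the $C_{2v}$ group has four irreducible representations, so that roughly one quarter of the $S_z = 0$ determinants have $A_1$ symmetry. Only the value 451,681,246 and the figure of 1,000,000 determinants come from the text.

```python
from math import comb

# Assumed (hypothetical here) orbital and electron counts for the water calculation.
n_orbitals = 24
n_alpha = 5
n_beta = 5
n_irreps_c2v = 4  # A1, A2, B1, B2

# Number of S_z = 0 determinants without any spatial-symmetry restriction.
sz0_determinants = comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)

# Crude estimate of the A1 block: about one quarter of all S_z = 0 determinants.
a1_estimate = sz0_determinants / n_irreps_c2v

# Values quoted in the text.
fci_dimension = 451_681_246
cipsi_dimension = 1_000_000

relative_gap = abs(a1_estimate - fci_dimension) / fci_dimension
size_ratio = fci_dimension / cipsi_dimension

print(f"S_z = 0 determinants: {sz0_determinants:,}")
print(f"A1 estimate: {a1_estimate:,.0f} (relative gap to quoted value: {relative_gap:.1e})")
print(f"FCI / CIPSI size ratio: {size_ratio:.0f}")
```

Under these assumptions the quarter-of-the-space estimate agrees with the quoted FCI dimension to better than 0.1%, and the size ratio comes out near 450, consistent with the "about 500 times larger" statement.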
The CIPSI-driven ec-CC-II and ec-CC-II$_3$ approaches are capable of substantially improving the purely variational CIPSI energies and the results of lower-rank CC calculations, but one has to keep in mind that CIPSI and other modern variants of selected CI techniques can be made very accurate too, i.e., the benefits offered by the ec-CC framework utilizing such techniques may not always be as great as desired.

CONCLUSIONS

One of the most interesting ways of extending the applicability of single-reference CC approaches to multi-reference and strongly correlated systems is offered by the ec-CC formalism. The key idea of all ec-CC methods is to solve the CC equations for the lower-rank cluster components, such as $T_1$ and $T_2$, in the presence of their higher-order $T_n$ counterparts (typically, $T_3$ and $T_4$) extracted from a non-CC source that behaves well in situations characterized by stronger nondynamic correlations. In this paper, we have focused on the ec-CC methods in which one solves the CC equations projected on the singly and doubly excited determinants, eqs 5 and 6, respectively, for the $T_1$ and $T_2$ clusters using the $T_3$ and $T_4$ contributions obtained via cluster analysis of truncated CI wave functions.

The present study has had two main objectives. The first objective has been a thorough examination of the mathematical content of the ec-CC equations, backed by the appropriate numerical analysis, in which we have attempted to identify the truncated CI states that, after extracting the $T_n$ components of the cluster operator $T$ with $n$ = 1–4 from them via the cluster analysis procedure adopted in all ec-CC considerations, satisfy eqs 5 and 6. This is an important topic, since, by solving eqs 5 and 6 for the $T_1$ and $T_2$ clusters in the presence of the $T_3$ and $T_4$ amplitudes extracted from such states, the ec-CC procedure can only return the corresponding CI energies, without improving them at all.
The second objective has been the exploration of a novel flavor of the ec-CC approach in which the wave functions used to generate the required $T_3$ and $T_4$ contributions are obtained by the cluster analysis of the truncated CI wave functions resulting from one of the most successful selected CI methods, abbreviated as CIPSI.

We have demonstrated that the ec-CC calculations performed by solving eqs 5 and 6, where the $T_3$ and $T_4$ components are obtained by cluster analysis of the CI wave functions that describe singles and doubles fully and higher-than-double excitations in a complete or partial manner and where all $T_3$ and $T_4$ amplitudes generated in this way are kept, as in the ec-CC-I protocol discussed in Section 2, return the underlying CI energies. This means that the ec-CC computations, which use the wave functions obtained with the conventional CI truncations, such as CISD, CISDT, CISDTQ, etc., or with any other CI method that provides a complete treatment of the single and double excitation manifolds, offer no improvements over CI if no a posteriori modifications are made in $T_3$ and $T_4$ extracted from CI. In reality, i.e., in typical applications of the CI-based ec-CC methodology, one disregards the purely disconnected $T_3$ and $T_4$ amplitudes of the type of eqs 12 and 13, for which the corresponding CI excitation coefficients are zero, as in the ec-CC-II algorithm discussed in Sections 2 and 3, but this does not prevent the collapse of the resulting ec-CC energies onto their CI counterparts. As shown in this work, mathematically and through numerical examples, the ec-CC-II approach may offer improvements over the underlying CI calculations, but only if the triply and quadruply excited manifolds considered in CI are incomplete.
Once the CI calculation captures all triples and quadruples, as in CISDTQ, CISDTQP, CISDTQPH, etc., or in any other CI truncation that treats singles through quadruples fully, the ec-CC-I and ec-CC-II schemes examined in our study become equivalent and the ec-CC computations offer no benefits compared to CI, while adding to the computational cost. In other words, in order for the ec-CC computations based on eqs 5 and 6 to significantly improve the energetics obtained with the underlying CI approach, it is essential to avoid a complete or nearly complete treatment of the triple and quadruple excitation manifolds. In that case, after solving eqs 5 and 6 for $T_1$ and $T_2$ in the presence of $T_3$ and $T_4$ extracted from CI, in which the amplitudes that do not have the companion triple and quadruple excitation CI coefficients are ignored, it is useful to correct the resulting ec-CC-II energies for the remaining $T_3$ and $T_4$ or at least $T_3$ correlations, as in the RMRCCSD(T) approach of ref 29 or the CIPSI-driven ec-CC-II$_3$ method introduced in this work, to name representative examples. We have also considered higher-order ec-CC-I and ec-CC-II variants that solve for higher-than-two-body components of the cluster operator $T$ and examined their relations with the underlying CI approaches.

Our mathematical and numerical analyses imply that the truncated CI wave functions that are best suited for the ec-CC computations are those that attempt to efficiently sample the many-electron Hilbert space without saturating the lower-rank excitation manifolds, especially the excitations through quadruples, too rapidly. As shown in this work, the modern formulation of the CIPSI approach, developed in refs 50,51, which achieves a tempered growth of the wave function through a systematic sequence of CI calculations, combined with perturbative and stochastic analyses of the excitation spaces used in Hamiltonian diagonalizations, is capable of providing such CI states.
By examining the $C_{2v}$-symmetric double bond dissociation of the water molecule, including the very challenging region where both O–H bonds are stretched by a factor of 3, so that even the sophisticated levels of the CC theory, such as full CCSDT and CCSDTQ, struggle, we have demonstrated that the CIPSI-based ec-CC-II method, described in Section 3, is capable of providing highly accurate results at relatively low computational cost, especially when the ec-CC-II energies are corrected for the missing $T_3$ correlations via the ec-CC-II$_3$ scheme. The CIPSI-driven ec-CC-II energies are so accurate that they are competitive with those obtained with the much more expensive high-level CC and CI methods, such as CCSDTQ, CISDTQP, or even CISDTQPH. Most remarkably, the ec-CC-II and ec-CC-II$_3$ computations, especially the latter ones, offer a rapid convergence toward FCI, reaching submillihartree accuracies with the relatively small CIPSI diagonalization spaces used to determine $T_3$ and $T_4$. The fast convergence of the energies obtained in the CIPSI-enabled ec-CC-II runs toward FCI is reminiscent of the FCIQMC-enabled ec-CC computations using the CAD-FCIQMC method, observed in refs 30,35. This only reinforces our view that the CI approaches which are capable of efficiently sampling the many-electron Hilbert space through a tempered evolution of the wave function, without populating the lower-rank excitation manifolds, especially the excitations through quadruples, too fast, benefit the ec-CC computations most.

While our initial tests of the CIPSI-driven ec-CC-II and ec-CC-II$_3$ approaches reported in this study are very promising, encouraging us to continue our work in this direction, the present study has also pointed out that the underlying CIPSI method, especially when one adds the second-order multi-reference MBPT corrections to the variational energies obtained in the CIPSI Hamiltonian diagonalizations, can be made very accurate as well, being competitive with the ec-CC-II results.
This is not a criticism of the idea of ec-CC, but, rather, a recognition of the fact that the new generations of selected CI techniques, such as those developed in refs 47,48,50,51,55–61, and the stochastic CI approaches, such as FCIQMC, have become highly competitive with the best CC solutions (cf., e.g., refs 35,62 for selected recent examples).

It would be important to investigate if our initial observations regarding the performance of the CIPSI-based ec-CC-II and ec-CC-II$_3$ approaches reported in this study remain true in a wider range of molecular applications. Furthermore, it would be interesting to examine if the ec-CC approaches based on other selected CI methods developed by various groups in recent years, especially the semi-stochastic CI approaches enabling a highly efficient sampling of the many-electron Hilbert space, such as the heat-bath CI framework of refs 59–61, are as accurate and as efficient as the CIPSI-enabled ec-CC-II and ec-CC-II$_3$ schemes examined here. It would also be important to investigate if our ec-CC-II$_3$ results could be further improved by correcting the CIPSI-driven ec-CC-II energies for the missing $T_3$ as well as $T_4$ correlation effects not captured during CIPSI runs, rather than the missing $T_3$ correlations only. In analogy to the triples correction $\delta$ adopted in this work, we could take advantage of the formulas that we previously used to develop and benchmark the CC($P$;$Q$) approaches correcting the active-space CC energies, such as CCSDtq, for the missing triples and quadruples. Such corrections might also benefit the aforementioned semi-stochastic CAD-FCIQMC methodology based on the cluster analysis of FCIQMC wave functions, which we have been pursuing in parallel with the ec-CC development work reported in this article.

Acknowledgement

This work has been supported by the Chemical Sciences, Geosciences and Biosciences Division, Office of Basic Energy Sciences, Office of Science, U.S. Department of Energy (Grant No. DE-FG02-01ER15228 to P.P.) and Phase I and II Software Fellowships awarded to J.E.D. by the Molecular Sciences Software Institute funded by the National Science Foundation grant ACI-1547580.

APPENDIX A: PROOF OF THE EQUIVALENCE OF EQ 17, WITH THE CORRELATION ENERGY DEFINED BY EQ 19, AND EQS 5–7 BASED ON EQS 20, 24, AND 25

The main purpose of this appendix is to demonstrate that the subsystem of CI equations for the ground-state wave function $|\Psi\rangle$ defined by eqs 14–16, corresponding to the projections on the singly and doubly excited determinants, as in eq 17, where the correlation energy $\Delta E^{(\text{CI})}$ is calculated using eq 19, can be transformed into the CC amplitude equations projected on singles and doubles, represented by eqs 3 and 4 or 5 and 6, with the CC energy given by eq 7, if the cluster operator $T$ is defined by eq 20. We begin by rewriting eq 17 with the help of eq 20, which allows us to convert the CI expansion for $|\Psi\rangle$, eq 14, to a CC-type expression, eq 1, while taking advantage of the property of the exponential ansatz given by eq 24. We obtain

$$\langle\Phi_\alpha| e^{T} (H_N e^{T})_C |\Phi\rangle = \Delta E^{(\text{CI})} \langle\Phi_\alpha| e^{T} |\Phi\rangle, \tag{A.1}$$

where the CI correlation energy $\Delta E^{(\text{CI})}$, eq 19, becomes

$$\Delta E^{(\text{CI})} = \langle\Phi| H_N e^{T} |\Phi\rangle = \langle\Phi| (H_N e^{T})_{FC} |\Phi\rangle, \tag{A.2}$$

in agreement with the CC energy formula given by eq 7. Let us recall that the $|\Phi_\alpha\rangle$ states entering eq A.1 are the determinants that span subspace $\mathcal{H}^{(P_A)}$ matching the $C^{(P_A)}|\Phi\rangle$ component of $|\Psi\rangle$, which, in this particular analysis, are a short-hand notation for the singly and doubly excited determinants, $|\Phi_i^a\rangle$ and $|\Phi_{ij}^{ab}\rangle$, respectively.

The next step is the insertion of the resolution of the identity in the many-electron Hilbert space $\mathcal{H}$, eq 25, between $e^{T}$ and $(H_N e^{T})_C$ on the left-hand side of eq A.1.
This allows us to rewrite eq A.1 as follows:

$$\Lambda_\alpha^{(1)} + \Lambda_\alpha^{(2)} + \Lambda_\alpha^{(3)} + \Lambda_\alpha^{(4)} = \Delta E^{(\text{CI})} \langle\Phi_\alpha| e^{T} |\Phi\rangle, \tag{A.3}$$

where

$$\Lambda_\alpha^{(1)} = \langle\Phi_\alpha| e^{T} |\Phi\rangle \langle\Phi| (H_N e^{T})_{FC} |\Phi\rangle \equiv \Delta E^{(\text{CI})} \langle\Phi_\alpha| e^{T} |\Phi\rangle, \tag{A.4}$$

$$\Lambda_\alpha^{(2)} = \sum_{\alpha'} \langle\Phi_\alpha| e^{T} |\Phi_{\alpha'}\rangle \langle\Phi_{\alpha'}| (H_N e^{T})_C |\Phi\rangle, \tag{A.5}$$

$$\Lambda_\alpha^{(3)} = \sum_{\beta} \langle\Phi_\alpha| e^{T} |\Phi_\beta\rangle \langle\Phi_\beta| (H_N e^{T})_C |\Phi\rangle, \tag{A.6}$$

and

$$\Lambda_\alpha^{(4)} = \sum_{\gamma} \langle\Phi_\alpha| e^{T} |\Phi_\gamma\rangle \langle\Phi_\gamma| (H_N e^{T})_C |\Phi\rangle, \tag{A.7}$$

with $|\Phi_\alpha\rangle, |\Phi_{\alpha'}\rangle \in \mathcal{H}^{(P_A)}$, $|\Phi_\beta\rangle \in \mathcal{H}^{(P_B)}$, and $|\Phi_\gamma\rangle \in \mathcal{H}^{(Q)}$. We recall that $\mathcal{H}^{(P_B)}$ is a subspace spanned by the determinants $|\Phi_\beta\rangle$ that match the content of the $C^{(P_B)}$ operator, eq 16, and $\mathcal{H}^{(Q)} = (\mathcal{H}^{(P)} \oplus \mathcal{H}^{(P_A)} \oplus \mathcal{H}^{(P_B)})^{\perp}$, with $\mathcal{H}^{(P)}$ spanned by the reference determinant $|\Phi\rangle$, is a subspace spanned by the remaining determinants, designated as $|\Phi_\gamma\rangle$, which are not included in the CI wave function $|\Psi\rangle$ defined by eqs 14–16. In writing eq A.4, we took advantage of the energy formula given by eq A.2.

Let us analyze each contribution $\Lambda_\alpha^{(k)}$, $k$ = 1–4, to eq A.3. Since we have made the assumption that the $|\Phi_\alpha\rangle$ states represent the singly and doubly excited determinants and $|\Phi_\beta\rangle$ and $|\Phi_\gamma\rangle$ are at least the triples, the $\langle\Phi_\alpha| e^{T} |\Phi_\beta\rangle$ and $\langle\Phi_\alpha| e^{T} |\Phi_\gamma\rangle$ matrix elements in eqs A.6 and A.7 vanish, i.e., $\Lambda_\alpha^{(3)} = \Lambda_\alpha^{(4)} = 0$. At the same time, $\Lambda_\alpha^{(1)}$, eq A.4, cancels out the right-hand side of eq A.3. This means that eq A.3 reduces to

$$\Lambda_\alpha^{(2)} = 0, \tag{A.8}$$

with $\Lambda_\alpha^{(2)}$ defined by eq A.5.
Since in the specific case considered here $|\Phi_\alpha\rangle$ and $|\Phi_{\alpha'}\rangle$ are the singly and doubly excited determinants, the definition of the cluster operator $T$, eq 20, implies that $e^{T} = 1 + C^{(P_A)} + C^{(P_B)}$, and the $C_2$ and $C^{(P_B)}$ operators generate at least double excitations, we can rewrite the $\langle\Phi_\alpha| e^{T} |\Phi_{\alpha'}\rangle$ term seen in eq A.5 in the following manner:

$$\langle\Phi_\alpha| e^{T} |\Phi_{\alpha'}\rangle \equiv \langle\Phi_\alpha| (1 + C_1 + C_2 + C^{(P_B)}) |\Phi_{\alpha'}\rangle = \delta_{\alpha\alpha'} + \langle\Phi_\alpha| C_1 |\Phi_{\alpha'}\rangle, \tag{A.9}$$

where $\delta_{\alpha\alpha'}$ is the usual Kronecker delta. By substituting eq A.9 into eq A.5 and using eq A.8, we immediately obtain

$$\Lambda_\alpha^{(2)} = \langle\Phi_\alpha| (H_N e^{T})_C |\Phi\rangle + \sum_{\alpha'} \langle\Phi_\alpha| C_1 |\Phi_{\alpha'}\rangle \langle\Phi_{\alpha'}| (H_N e^{T})_C |\Phi\rangle = 0. \tag{A.10}$$

We now examine the system of equations represented by eq A.10. Let us start with the case in which $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$. Since $C_1$ generates single excitations and the $|\Phi_{\alpha'}\rangle$ states in eq A.10 are at least the singly excited determinants, $\langle\Phi_\alpha| C_1 |\Phi_{\alpha'}\rangle = 0$.
We can, therefore, conclude that the CI equations projected on the singly excited determinants, eq 17 with $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$ or eq 21, which are equivalent to eq A.10 in which $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$, reduce to the CC equations projected on singles given by eq 3 or, more explicitly, eq 5.

In the case of the projections on the doubly excited determinants $|\Phi_{ij}^{ab}\rangle$, i.e., when $|\Phi_\alpha\rangle = |\Phi_{ij}^{ab}\rangle$, we can give eq A.10 the following form:

$$\langle\Phi_{ij}^{ab}| (H_N e^{T})_C |\Phi\rangle + \sum_{k,c} \langle\Phi_{ij}^{ab}| C_1 |\Phi_k^c\rangle \langle\Phi_k^c| (H_N e^{T})_C |\Phi\rangle = 0, \tag{A.11}$$

where we utilized the fact that the $\langle\Phi_{ij}^{ab}| C_1 |\Phi_{\alpha'}\rangle$ matrix element vanishes unless $|\Phi_{\alpha'}\rangle$ is a singly excited determinant. Since we have already demonstrated that the cluster operator $T$, eq 20, satisfies eq 3, i.e., $\langle\Phi_k^c| (H_N e^{T})_C |\Phi\rangle = 0$, eq A.11 simplifies to eq 4. This means that the CI equations projected on the doubly excited determinants, eq 17 with $|\Phi_\alpha\rangle = |\Phi_{ij}^{ab}\rangle$ or eq 22, which are equivalent to eq A.10 in which $|\Phi_\alpha\rangle = |\Phi_{ij}^{ab}\rangle$, reduce to the CC equations projected on doubles given by eq 4 or, more explicitly, eq 6. This concludes our first proof of the equivalence of the CI eqs 17 and 19 and their CC counterparts represented by eqs 3, 4, and 7 or 5–7.

As mentioned at the end of Section 2.1, one can extend the above considerations to higher-order ec-CC variants, in which the excitation operators $C^{(P_A)}$ and $C^{(P_B)}$ that enter the CI wave function $|\Psi\rangle$ through eq 14 are defined by eqs 30 and 31.
In this case, subspace $\mathcal{H}^{(P_A)}$, which matches the content of $C^{(P_A)}$, is spanned by all determinants $|\Phi_\alpha\rangle$ with the excitation ranks ranging from 1 to $m_A$, where $m_A \geq 2$, and determinants $|\Phi_\beta\rangle \in \mathcal{H}^{(P_B)}$ that match the many-body components $C_n^{(P_B)}$ of operator $C^{(P_B)}$, assuming that $C^{(P_B)} \neq 0$, have the excitation ranks exceeding $m_A$. In order to prove the equivalence of eq 17, with the correlation energy defined by eq 19, and the CC system defined by eq 32 in this generalized case, we follow the same procedure as described above, adjusting it to the contents of operators $C^{(P_A)}$ and $C^{(P_B)}$ and subspaces $\mathcal{H}^{(P_A)}$ and $\mathcal{H}^{(P_B)}$. It is immediately obvious that eqs A.1–A.8 still hold. In particular, $\Lambda_\alpha^{(1)}$ cancels out the right-hand side of eq A.3 and the matrix elements $\langle\Phi_\alpha| e^{T} |\Phi_\beta\rangle$ and $\langle\Phi_\alpha| e^{T} |\Phi_\gamma\rangle$ that enter the $\Lambda_\alpha^{(3)}$ and $\Lambda_\alpha^{(4)}$ contributions to eq A.3 vanish, since the many-body ranks of determinants $|\Phi_\alpha\rangle \in \mathcal{H}^{(P_A)}$ do not exceed $m_A$, the excitation levels of $|\Phi_\beta\rangle$ and $|\Phi_\gamma\rangle$, which belong to $\mathcal{H}^{(P_B)}$ and $\mathcal{H}^{(Q)}$, respectively, are at least $m_A + 1$, and $e^{T} = 1 + C^{(P_A)} + C^{(P_B)}$.
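The rank-counting argument used throughout this appendix can be made concrete with a small sketch. It relies only on the fact that an $n$-body excitation operator $C_n$ raises the excitation rank of a determinant by exactly $n$, so that $\langle\Phi_\alpha| C_n |\Phi_{\alpha'}\rangle$ can be nonzero only if the rank of $|\Phi_\alpha\rangle$ equals the rank of $|\Phi_{\alpha'}\rangle$ plus $n$. The function names and the choice $m_A = 4$ are hypothetical and serve illustration only.

```python
def can_couple(bra_rank, ket_rank, n):
    """True if <bra| C_n |ket> is allowed by excitation-rank counting."""
    return bra_rank == ket_rank + n


def coupling_table(m_a):
    """For bra and ket ranks 1..m_a, list the C_n (n >= 1) that may connect them."""
    table = {}
    for bra in range(1, m_a + 1):
        for ket in range(1, m_a + 1):
            table[(bra, ket)] = [
                n for n in range(1, m_a + 1) if can_couple(bra, ket, n)
            ]
    return table


m_a = 4  # illustrative choice: P_A space containing singles through quadruples
table = coupling_table(m_a)
for bra in range(1, m_a + 1):
    row = ", ".join(
        f"ket {ket}: C_{table[(bra, ket)][0]}" if table[(bra, ket)] else f"ket {ket}: 0"
        for ket in range(1, m_a + 1)
    )
    print(f"bra rank {bra} -> {row}")
```

The table is strictly lower triangular: a bra of a given rank couples only to kets of lower rank, and the largest operator that ever appears is $C_{m_A - 1}$. This is why the projections can be handled one rank at a time, starting from singles, in the manner of a forward substitution.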
With the generalized definitions of the excitation operators $C^{(P_A)}$ and $C^{(P_B)}$ considered here, eq A.8 can be given the following form:

$$\Lambda_\alpha^{(2)} = \langle\Phi_\alpha| (H_N e^{T})_C |\Phi\rangle + \sum_{\alpha'} \langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle \langle\Phi_{\alpha'}| (H_N e^{T})_C |\Phi\rangle = 0, \tag{A.12}$$

where

$$\widetilde{C} = \sum_{n=1}^{m_A - 1} C_n, \tag{A.13}$$

since the excitation ranks of determinants $|\Phi_\alpha\rangle$ and $|\Phi_{\alpha'}\rangle$ belonging to $\mathcal{H}^{(P_A)}$ range from 1 to $m_A$ and operator $C^{(P_B)}$ generates higher-than-$m_A$-fold excitations, so that

$$\langle\Phi_\alpha| e^{T} |\Phi_{\alpha'}\rangle \equiv \langle\Phi_\alpha| \Big(1 + \sum_{n=1}^{m_A} C_n + C^{(P_B)}\Big) |\Phi_{\alpha'}\rangle = \delta_{\alpha\alpha'} + \langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle, \tag{A.14}$$

with $\widetilde{C}$ defined by eq A.13.

As in the previously considered $m_A = 2$ case, we examine the system represented by eq A.12, which we can do in a recursive manner starting from $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$. When $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$, matrix element $\langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle$ vanishes, since $|\Phi_{\alpha'}\rangle$ is at least a singly excited determinant and $\widetilde{C}$ defined by eq A.13 generates at least single excitations. This immediately leads to the CC equations projected on the singly excited determinants, eq 3, i.e., eq 32 is satisfied when $|\Phi_\alpha\rangle = |\Phi_i^a\rangle$. Moving on, when $|\Phi_\alpha\rangle = |\Phi_{ij}^{ab}\rangle$, $\langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle = 0$ unless $|\Phi_{\alpha'}\rangle$ is a singly excited determinant, but since the CC equations projected on singles are already satisfied, the summation over $\alpha'$ in eq A.12 vanishes and eq A.12 reduces to the CC equations projected on doubles, eq 4, which means that eq 32 remains true for $|\Phi_\alpha\rangle = |\Phi_{ij}^{ab}\rangle$.
Continuing (assuming that $m_A > 2$), when $|\Phi_\alpha\rangle = |\Phi_{ijk}^{abc}\rangle$, $\langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle$ is zero unless $|\Phi_{\alpha'}\rangle$ is a singly or doubly excited determinant. Again, since the CC equations projected on singles and doubles are already satisfied, the summation over $\alpha'$ in eq A.12 becomes zero and we obtain $\langle\Phi_{ijk}^{abc}| (H_N e^{T})_C |\Phi\rangle = 0$, i.e., eq 32 is satisfied in the $|\Phi_\alpha\rangle = |\Phi_{ijk}^{abc}\rangle$ case. One can repeat the same procedure for the projections on higher-rank determinants $|\Phi_\alpha\rangle$ in eq A.12 that belong to subspace $\mathcal{H}^{(P_A)}$. In the final stage of this recursive analysis, i.e., when $|\Phi_\alpha\rangle$ is an $m_A$-tuply excited determinant, eq 32 remains true as well, since $\langle\Phi_\alpha| \widetilde{C} |\Phi_{\alpha'}\rangle$ vanishes unless the excitation rank of $|\Phi_{\alpha'}\rangle$ is at most $m_A - 1$, and the CC equations projected on the determinants with excitation ranks not exceeding $m_A - 1$ are already satisfied, so that $\langle\Phi_\alpha| (H_N e^{T})_C |\Phi\rangle$ is zero, i.e., eq 32 holds once again. In other words, eq 32 remains true for all determinants $|\Phi_\alpha\rangle \in \mathcal{H}^{(P_A)}$, i.e., eq 17, with the correlation energy defined by eq 19, is equivalent to the CC system defined by eq 32 when the excitation operators $C^{(P_A)}$ and $C^{(P_B)}$ that enter the CI wave function $|\Psi\rangle$ through eq 14 are defined by eqs 30 and 31 and subspace $\mathcal{H}^{(P_A)}$, which matches the content of $C^{(P_A)}$, is spanned by all determinants $|\Phi_\alpha\rangle$ with the excitation ranks ranging from 1 to $m_A$.

APPENDIX B: DIAGRAMMATIC PROOF OF THE EQUIVALENCE OF EQ 17, WITH THE CORRELATION ENERGY DEFINED BY EQ 19, OR EQS 21–23 AND EQS 5–7

In this appendix, we provide an alternative proof of the equivalence of eqs 5–7 and 21–23 using a diagrammatic approach. As in the case of the algebraic derivation presented in Appendix A, the only assumption that we make regarding the CI state $|\Psi\rangle$ used to provide information about the $T_3$ and $T_4$ clusters for ec-CC considerations is the full treatment of the $C_1$ and $C_2$ components. This means that all of the mathematical manipulations in this appendix apply to conventional as well as unconventional truncations in the CI excitation operator, as defined by eqs 14–16, in addition to FCI. The diagrammatic derivation of the equivalence of eqs 5–7 and 21–23 is accomplished by starting from the CC equations corresponding to projections on singles and doubles, eqs 5 and 6, respectively, and, after performing cluster analysis of the CI wave function $|\Psi\rangle$ with the help of eq 11, converting them to the analogous eqs 21 and 22, with the correlation energy defined by eq 23, which are part of the CI eigenvalue problem for $|\Psi\rangle$ used to determine the three- and four-body clusters. To facilitate our presentation, throughout this appendix we drop the '($P_B$)' superscript in the $C_n^{(P_B)}$ components of operator $C^{(P_B)}$, eq 16, associated with higher-than-doubly excited contributions to $|\Psi\rangle$.

The first step is to express eqs 5 and 6 in terms of the $C_1$–$C_4$ operators by using the relationships between the $T_n$ and $C_n$ components given by eq 11. In the case of the singles projections, the correspondence between the various $(H_N e^{T})_C$ terms appearing in eq 5 and their counterparts resulting from the application of eq 11 is provided in Chart B.1. As shown in Chart B.1, all contributions containing products of $C_n$ components other than the unlinked terms, marked in red, in which the fully connected operator products $(F_N C_1)_{FC}$ and $(V_N C_2)_{FC}$ that contribute to the correlation energy multiply $C_1$, cancel out.
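The cluster analysis invoked in this first step can be illustrated with a short sketch. Because excitation operators commute with one another, the relation $e^{T} = 1 + C$ can be inverted as an ordinary power series, $T = \ln(1 + C)$, and sorted by excitation rank. The code below does this through rank 4 using exact rational arithmetic. We assume, without reproducing it here, that eq 11 has this standard form; the representation of operators as commuting symbols and all function names are choices made for this illustration only.

```python
from collections import defaultdict
from fractions import Fraction

MAX_RANK = 4  # keep terms through four-body clusters


def rank(mono):
    """Excitation rank of a monomial C1^e1 C2^e2 C3^e3 C4^e4, stored as (e1, e2, e3, e4)."""
    return sum((n + 1) * e for n, e in enumerate(mono))


def multiply(p, q):
    """Product of two polynomials in commuting C_n, truncated at MAX_RANK."""
    out = defaultdict(Fraction)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            if rank(m) <= MAX_RANK:
                out[m] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def cluster_analysis():
    """Return T_n (n = 1..4) as polynomials in C_1..C_4 from T = ln(1 + C)."""
    # C = C1 + C2 + C3 + C4, each with unit coefficient.
    c_op = {
        tuple(1 if i == n else 0 for i in range(MAX_RANK)): Fraction(1)
        for n in range(MAX_RANK)
    }
    t_op = defaultdict(Fraction)
    power = {(0,) * MAX_RANK: Fraction(1)}
    for k in range(1, MAX_RANK + 1):
        power = multiply(power, c_op)  # C^k, truncated
        sign = 1 if k % 2 else -1      # ln(1 + x) = x - x^2/2 + x^3/3 - ...
        for m, c in power.items():
            t_op[m] += Fraction(sign, k) * c
    by_rank = {n: {} for n in range(1, MAX_RANK + 1)}
    for m, c in t_op.items():
        if c != 0:
            by_rank[rank(m)][m] = c
    return by_rank


def term_to_str(mono, coeff):
    factors = " ".join(
        f"C{n + 1}" + (f"^{e}" if e > 1 else "") for n, e in enumerate(mono) if e
    )
    return f"({coeff}) {factors}"


clusters = cluster_analysis()
for n, terms in clusters.items():
    print(f"T{n} =", " + ".join(term_to_str(m, c) for m, c in sorted(terms.items())))
```

The output reproduces the familiar relations $T_1 = C_1$, $T_2 = C_2 - \tfrac{1}{2}C_1^2$, $T_3 = C_3 - C_1 C_2 + \tfrac{1}{3}C_1^3$, and $T_4 = C_4 - C_1 C_3 - \tfrac{1}{2}C_2^2 + C_1^2 C_2 - \tfrac{1}{4}C_1^4$, whose nonlinear terms are the source of the operator products tracked in Charts B.1 and B.2.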
As a result, the CC equations corresponding to projections on the singly excited determinants, eq 5, become

$$\langle\Phi_i^a| [F_N + (F_N C_1)_C + (F_N C_2)_C + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C + \Theta] |\Phi\rangle = 0, \tag{B.1}$$

where

$$\Theta = -[(F_N C_1)_{FC} C_1 + (V_N C_2)_{FC} C_1] \tag{B.2}$$

represents the terms highlighted in Chart B.1 in red. Focusing on the $\Theta$ contribution to eq B.1, we obtain

$$\langle\Phi_i^a| \Theta |\Phi\rangle = -\langle\Phi_i^a| [(F_N C_1)_{FC} C_1 + (V_N C_2)_{FC} C_1] |\Phi\rangle = -\langle\Phi_i^a| [(F_N C_1)_{FC} + (V_N C_2)_{FC}] C_1 |\Phi\rangle = -\Delta E^{(\text{CI})} \langle\Phi_i^a| C_1 |\Phi\rangle, \tag{B.3}$$

where we used eq 23 for the CI correlation energy $\Delta E^{(\text{CI})}$, which, after replacing $H_N$ by the sum of $F_N$ and $V_N$, is equivalent to

$$\Delta E^{(\text{CI})} = \langle\Phi| [(F_N C_1)_{FC} + (V_N C_2)_{FC}] |\Phi\rangle. \tag{B.4}$$

Inserting eq B.3 for $\langle\Phi_i^a| \Theta |\Phi\rangle$ back into eq B.1 and moving the energy-dependent term $\Delta E^{(\text{CI})} \langle\Phi_i^a| C_1 |\Phi\rangle$ to the right-hand side of the resulting expression, we arrive at

$$\langle\Phi_i^a| [F_N + (F_N C_1)_C + (F_N C_2)_C + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C] |\Phi\rangle = \Delta E^{(\text{CI})} \langle\Phi_i^a| C_1 |\Phi\rangle, \tag{B.5}$$

which is equivalent to eq 21, when expressed in terms of the one- and two-body components of $H_N$. This completes the proof of the equivalence of eqs 5 and 21.

A similar analysis can be performed for the CC equations corresponding to projections on the doubly excited determinants, eq 6. The correspondence between the various $(H_N e^{T})_C$ terms contributing to eq 6 and their counterparts obtained by using eq 11 is shown in Chart B.2. In this case, despite the cancellation of the majority of the nonlinear terms in the $C_n$ components resulting from the application of eq 11, the emergence of the CI equations projected on doubles, eq 22, from eq 6 is not as obvious as in the previously examined singles projections.
As shown in Chart B.2, in addition to the unlinked terms, marked in red and green, in which the fully connected operator products $(F_N C_1)_{FC}$ and $(V_N C_2)_{FC}$ that contribute to the correlation energy multiply $C_1^2$ and $C_2$, we see the appearance of the disconnected quantities, marked in blue, where the $(F_N C_1)_C$, $(F_N C_2)_C$, $(V_N C_1)_C$, $(V_N C_2)_C$, and $(V_N C_3)_C$ connected operator products multiply $C_1$. The Hugenholtz diagrams emerging from the $V_N C_1 C_3$ and $V_N C_2^2$ operator products, which result from the application of eq 11 to the $\langle\Phi_{ij}^{ab}| (V_N T_4)_C |\Phi\rangle$ contribution to eq 6 and which correspond to the second through fifth expressions contributing to $(V_N T_4)_C$ in Chart B.2, are shown in Figure B.1. It should be noted that the $T_4$ component (emphasized in Figure B.1 by a dashed oval) resulting from the cluster analysis defined by eq 11 is a strictly connected quantity (in an MBPT sense) only in the FCI limit.

After removing the nonlinear terms in the $C_n$ components that cancel out and grouping the remaining contributions to the CC equations corresponding to projections on the doubly excited determinants according to their color in Chart B.2, eq 6 becomes

$$\langle\Phi_{ij}^{ab}| [(F_N C_2)_C + (F_N C_3)_C + V_N + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C + (V_N C_4)_C + \Theta' + \Theta'' + \Theta] |\Phi\rangle = 0, \tag{B.6}$$

where

$$\Theta' = -[(F_N C_1)_C C_1 + (F_N C_2)_C C_1 + (V_N C_1)_C C_1 + (V_N C_2)_C C_1 + (V_N C_3)_C C_1], \tag{B.7}$$

$$\Theta'' = 2 (F_N C_1)_{FC}\, \tfrac{1}{2} C_1^2 + 2 (V_N C_2)_{FC}\, \tfrac{1}{2} C_1^2, \tag{B.8}$$

and, in the doubles equations considered here,

$$\Theta = -[(F_N C_1)_{FC} C_2 + (V_N C_2)_{FC} C_2]. \tag{B.9}$$

After adding and subtracting $F_N C_1$ on the left-hand side of eq B.6, we obtain

$$\langle\Phi_{ij}^{ab}| [F_N C_1 + (F_N C_2)_C + (F_N C_3)_C + V_N + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C + (V_N C_4)_C + \widetilde{\Theta}' + \Theta'' + \Theta] |\Phi\rangle = 0, \tag{B.10}$$

where

$$\widetilde{\Theta}' = \Theta' - F_N C_1. \tag{B.11}$$

Using eq B.7, factoring out $C_1$, and taking advantage of the already obtained CI equations projected on the singly excited determinants, eq B.5, the contribution from the $\widetilde{\Theta}'$ term, defined by eq B.11, to eq B.10 can be rewritten as follows:

$$\langle\Phi_{ij}^{ab}| \widetilde{\Theta}' |\Phi\rangle = -\langle\Phi_{ij}^{ab}| [F_N C_1 + (F_N C_1)_C C_1 + (F_N C_2)_C C_1 + (V_N C_1)_C C_1 + (V_N C_2)_C C_1 + (V_N C_3)_C C_1] |\Phi\rangle = -\langle\Phi_{ij}^{ab}| [F_N + (F_N C_1)_C + (F_N C_2)_C + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C] C_1 |\Phi\rangle = -\Delta E^{(\text{CI})} \langle\Phi_{ij}^{ab}| C_1^2 |\Phi\rangle. \tag{B.12}$$

At the same time, the contributions from the unlinked $\Theta''$ and $\Theta$ terms to eq B.10, after factoring out $C_1^2$ and $C_2$, respectively, and using eq B.4 for the CI correlation energy, become

$$\langle\Phi_{ij}^{ab}| \Theta'' |\Phi\rangle = \langle\Phi_{ij}^{ab}| [2 (F_N C_1)_{FC}\, \tfrac{1}{2} C_1^2 + 2 (V_N C_2)_{FC}\, \tfrac{1}{2} C_1^2] |\Phi\rangle = \langle\Phi_{ij}^{ab}| [(F_N C_1)_{FC} + (V_N C_2)_{FC}] C_1^2 |\Phi\rangle = \Delta E^{(\text{CI})} \langle\Phi_{ij}^{ab}| C_1^2 |\Phi\rangle \tag{B.13}$$

and

$$\langle\Phi_{ij}^{ab}| \Theta |\Phi\rangle = -\langle\Phi_{ij}^{ab}| [(F_N C_1)_{FC} C_2 + (V_N C_2)_{FC} C_2] |\Phi\rangle = -\langle\Phi_{ij}^{ab}| [(F_N C_1)_{FC} + (V_N C_2)_{FC}] C_2 |\Phi\rangle = -\Delta E^{(\text{CI})} \langle\Phi_{ij}^{ab}| C_2 |\Phi\rangle. \tag{B.14}$$

Note that after all of these manipulations the $\widetilde{\Theta}'$ and $\Theta''$ contributions to eq B.10, eqs B.12 and B.13, respectively, cancel each other.
Thus, after inserting eq B.14 back into eq B.10 and moving the energy-dependent $\Delta E^{(\mathrm{CI})} \langle \Phi_{ij}^{ab} | C_2 | \Phi \rangle$ contribution to the right-hand side of the resulting formula, we arrive at

$$\langle \Phi_{ij}^{ab} | [ F_N C_1 + (F_N C_2)_C + (F_N C_3)_C + V_N + (V_N C_1)_C + (V_N C_2)_C + (V_N C_3)_C + (V_N C_4)_C ] | \Phi \rangle = \Delta E^{(\mathrm{CI})} \langle \Phi_{ij}^{ab} | C_2 | \Phi \rangle , \tag{B.15}$$

which is equivalent to eq 22, when expressed in terms of the one- and two-body components of $H_N$. This completes the proof of the equivalence of eqs 6 and 22 and the diagrammatic derivation of the CI eqs 21–23 from the CC eqs 5–7. The equivalence of eqs 7 and 23 for the correlation energy is an obvious consequence of replacing $T_1$ and $T_2$ in eq 7 by the formulas in terms of $C_1$ and $C_2$ originating from eq 11, which leads directly to eq B.4 and its analog, eq 23.

References

(1) Hubbard, J. The description of collective motions in terms of many-body perturbation theory. Proc. R. Soc. London A , , 539–560.
(2) Hugenholtz, N. M. Perturbation theory of large quantum systems. Physica , , 481–532.
(3) Coester, F. Bound states of a many-particle system. Nucl. Phys. , , 421–424.
(4) Čížek, J. On the correlation problem in atomic and molecular systems. Calculation of wavefunction components in Ursell-type expansion using quantum-field theoretical methods. J. Chem. Phys. , , 4256–4266.
(5) Čížek, J. On the use of the cluster expansion and the technique of diagrams in calculations of correlation effects in atoms and molecules. Adv. Chem. Phys. , , 35–89.
(6) Paldus, J.; Čížek, J.; Shavitt, I. Correlation problems in atomic and molecular systems. IV. Extended coupled-pair many-electron theory and its application to the BH3 molecule. Phys. Rev. A , , 50–67.
(7) Paldus, J.; Li, X. A critical assessment of coupled cluster method in quantum chemistry. Adv. Chem. Phys. , , 1–175.
(8) Bartlett, R. J.; Musiał, M.
Coupled-cluster theory in quantum chemistry. Rev. Mod. Phys. , , 291–352.
(9) Purvis, G. D., III; Bartlett, R. J. A full coupled-cluster singles and doubles model: The inclusion of disconnected triples. J. Chem. Phys. , , 1910–1918.
(10) Cullen, J. M.; Zerner, M. C. The linked singles and doubles model: An approximate theory of electron correlation based on the coupled-cluster ansatz. J. Chem. Phys. , , 4088–4109.
(11) Raghavachari, K.; Trucks, G. W.; Pople, J. A.; Head-Gordon, M. A fifth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett. , , 479–483.
(12) Piecuch, P.; Zarrabian, S.; Paldus, J.; Čížek, J. Coupled-cluster approaches with an approximate account of triexcitations and the optimized-inner-projection technique. II. Coupled-cluster results for cyclic-polyene model systems. Phys. Rev. B , , 3351–3379.
(13) Noga, J.; Bartlett, R. J. The full CCSDT model for molecular electronic structure. J. Chem. Phys. , , 7041–7050, , , 3401 [Erratum].
(14) Scuseria, G. E.; Schaefer, H. F., III A new implementation of the full CCSDT model for molecular electronic structure. Chem. Phys. Lett. , , 382–386.
(15) Oliphant, N.; Adamowicz, L. Coupled-cluster method truncated at quadruples. J. Chem. Phys. , , 6645–6651.
(16) Kucharski, S. A.; Bartlett, R. J. The coupled-cluster single, double, triple, and quadruple excitation method. J. Chem. Phys. , , 4282–4288.
(17) Podeszwa, R.; Kucharski, S. A.; Stolarczyk, L. Z. Electronic correlation in cyclic polyenes. Performance of coupled-cluster methods with higher excitations. J. Chem. Phys. , , 480–493.
(18) Degroote, M.; Henderson, T. M.; Zhao, J.; Dukelsky, J.; Scuseria, G. E. Polynomial similarity transformation theory: A smooth interpolation between coupled cluster doubles and projected BCS applied to the reduced BCS Hamiltonian. Phys. Rev. B , , 125124.
(19) Lyakh, D. I.; Musiał, M.; Lotrich, V. F.; Bartlett, R. J. Multireference nature of chemistry: The coupled-cluster view. Chem. Rev.
, , 182–243.
(20) Evangelista, F. A. Perspective: Multireference coupled cluster theories of dynamical electron correlation. J. Chem. Phys. , , 030901.
(21) Paldus, J.; Čížek, J.; Takahashi, M. Approximate account of the connected quadruply excited clusters in the coupled-pair many-electron theory. Phys. Rev. A , , 2193–2209.
(22) Piecuch, P.; Toboła, R.; Paldus, J. Approximate account of connected quadruply excited clusters in single-reference coupled-cluster theory via cluster analysis of the projected unrestricted Hartree-Fock wave function. Phys. Rev. A , , 1210–1241.
(23) Paldus, J.; Planelles, J. Valence bond corrected single reference coupled cluster approach I. General formalism. Theor. Chim. Acta , , 13–31.
(24) Stolarczyk, L. Z. Complete active space coupled-cluster method. Extension of single-reference coupled-cluster method using the CASSCF wavefunction. Chem. Phys. Lett. , , 1–6.
(25) Peris, G.; Planelles, J.; Paldus, J. Single-reference CCSD approach employing three- and four-body CAS SCF corrections: A preliminary study of a simple model. Int. J. Quantum Chem. , , 137–151.
(26) Peris, G.; Planelles, J.; Malrieu, J.-P.; Paldus, J. Perturbatively selected CI as an optimal source for externally corrected CCSD. J. Chem. Phys. , , 11708–11716.
(27) Li, X.; Paldus, J. Reduced multireference CCSD method: An effective approach to quasidegenerate states. J. Chem. Phys. , , 6257–6269.
(28) Li, X.; Paldus, J. Reduced multireference couple cluster method. II. Application to potential energy surfaces of HF, F2, and H2O. J. Chem. Phys. , , 637–648.
(29) Li, X.; Paldus, J. Reduced multireference coupled cluster method with singles and doubles: Perturbative corrections for triples. J. Chem. Phys. , , 174101.
(30) Deustua, J. E.; Magoulas, I.; Shen, J.; Piecuch, P. Communication: Approaching exact quantum chemistry by cluster analysis of full configuration interaction quantum Monte Carlo wave functions. J. Chem. Phys. , , 151101.
(31) Aroeira, G. J. R.; Davis, M. M.; Turney, J.
M.; Schaefer, H. F., III Coupled cluster externally corrected by adaptive configuration interaction. J. Chem. Theory Comput. , , 182–190.
(32) Paldus, J. Externally and internally corrected coupled cluster approaches: An overview. J. Math. Chem. , , 477–502.
(33) Piecuch, P. Active-space coupled-cluster methods. Mol. Phys. , , 2987–3015.
(34) Piecuch, P.; Paldus, J. On the solution of coupled-cluster equations in the fully correlated limit of cyclic polyene model. Int. J. Quantum Chem. Symp. , , 9–34.
(35) Eriksen, J. J.; Anderson, T. A.; Deustua, J. E.; Ghanem, K.; Hait, D.; Hoffmann, M. R.; Lee, S.; Levine, D. S.; Magoulas, I.; Shen, J.; Tubman, N. M.; Whaley, K. B.; Xu, E.; Yao, Y.; Zhang, N.; Alavi, A.; Chan, G. K.-L.; Head-Gordon, M.; Liu, W.; Piecuch, P.; Sharma, S.; Ten-no, S. L.; Umrigar, C. J.; Gauss, J. The ground state electronic energy of benzene. J. Phys. Chem. Lett. , , 8922–8929.
(36) Xu, E.; Li, S. The externally corrected coupled cluster approach with four- and five-body clusters from the CASSCF wave function. J. Chem. Phys. , , 094119.
(37) Booth, G. H.; Thom, A. J. W.; Alavi, A. Fermion Monte Carlo without fixed nodes: A game of life, death, and annihilation in Slater determinant space. J. Chem. Phys. , , 054106.
(38) Cleland, D.; Booth, G. H.; Alavi, A. Communications: Survival of the fittest: Accelerating convergence in full configuration-interaction quantum Monte Carlo. J. Chem. Phys. , , 041103.
(39) Ghanem, K.; Lozovoi, A. Y.; Alavi, A. Unbiasing the initiator approximation in full configuration interaction quantum Monte Carlo. J. Chem. Phys. , , 224108.
(40) Ghanem, K.; Guther, K.; Alavi, A. The adaptive shift method in full configuration interaction quantum Monte Carlo: Development and applications. J. Chem. Phys. , , 224115.
(41) Li, X.; Paldus, J. Dissociation of N2 triple bond: A reduced multireference CCSD study. Chem. Phys. Lett. , , 145–154.
(42) Li, X.; Paldus, J. Reduced multireference coupled cluster method IV: Open-shell systems. Mol. Phys.
, , 1185–1199.
(43) Li, X.; Paldus, J. Reduced multireference coupled cluster method: Ro-vibrational spectra of N2. J. Chem. Phys. , , 9966–9977.
(44) Li, X.; Paldus, J. Full potential energy curve for N2 by the reduced multireference coupled-cluster method. J. Chem. Phys. , , 054104.
(45) Li, X.; Gour, J. R.; Paldus, J.; Piecuch, P. On the significance of quadruply excited clusters in coupled-cluster calculations for the low-lying states of BN and C2. Chem. Phys. Lett. , , 321–326.
(46) Li, X.; Paldus, J. Electronic structure of organic diradicals: Evaluation of the performance of coupled-cluster methods. J. Chem. Phys. , , 174101.
(47) Schriber, J. B.; Evangelista, F. A. Communication: An adaptive configuration interaction approach for strongly correlated electrons with tunable accuracy. J. Chem. Phys. , , 161106.
(48) Schriber, J. B.; Evangelista, F. A. Adaptive configuration interaction for computing challenging electronic excited states with tunable accuracy. J. Chem. Theory Comput. , , 5354–5366.
(49) Huron, B.; Malrieu, J. P.; Rancurel, P. Iterative perturbation calculations of ground and excited state energies from multiconfigurational zeroth-order wavefunctions. J. Chem. Phys. , , 5745–5759.
(50) Garniron, Y.; Scemama, A.; Loos, P.-F.; Caffarel, M. Hybrid stochastic-deterministic calculation of the second-order perturbative contribution of multireference perturbation theory. J. Chem. Phys. , , 034101.
(51) Garniron, Y.; Applencourt, T.; Gasperich, K.; Benali, A.; Ferté, A.; Paquier, J.; Pradines, B.; Assaraf, R.; Reinhardt, P.; Toulouse, J.; Barbaresco, P.; Renon, N.; David, G.; Malrieu, J.-P.; Véril, M.; Caffarel, M.; Loos, P.-F.; Giner, E.; Scemama, A. Quantum Package 2.0: An open-source determinant-driven suite of programs. J. Chem. Theory Comput. , , 3591–3609.
(52) Whitten, J. L.; Hackmeyer, M. Configuration interaction studies of ground and excited states of polyatomic molecules. I. The CI formulation and studies of formaldehyde. J. Chem. Phys. , , 5584–5596.
(53) Bender, C.
F.; Davidson, E. R. Studies in configuration interaction: The first-row diatomic hydrides. Phys. Rev. , , 23–30.
(54) Buenker, R. J.; Peyerimhoff, S. D. Individualized configuration selection in CI calculations with subsequent energy extrapolation. Theor. Chim. Acta , , 33–58.
(55) Tubman, N. M.; Lee, J.; Takeshita, T. Y.; Head-Gordon, M.; Whaley, K. B. A deterministic alternative to the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. , , 044112.
(56) Tubman, N. M.; Freeman, C. D.; Levine, D. S.; Hait, D.; Head-Gordon, M.; Whaley, K. B. Modern approaches to exact diagonalization and selected configuration interaction with the adaptive sampling CI method. J. Chem. Theory Comput. , , 2139–2159.
(57) Liu, W.; Hoffmann, M. R. iCI: Iterative CI toward full CI. J. Chem. Theory Comput. , , 1169–1178, , , 3000 [Erratum].
(58) Zhang, N.; Liu, W.; Hoffmann, M. R. Iterative configuration interaction with selection. J. Chem. Theory Comput. , , 2296–2316.
(59) Holmes, A. A.; Tubman, N. M.; Umrigar, C. J. Heat-bath configuration interaction: An efficient selected configuration interaction algorithm inspired by heat-bath sampling. J. Chem. Theory Comput. , , 3674–3680.
(60) Sharma, S.; Holmes, A. A.; Jeanmairet, G.; Alavi, A.; Umrigar, C. J. Semistochastic heat-bath configuration interaction method: Selected configuration interaction with semistochastic perturbation theory. J. Chem. Theory Comput. , , 1595–1604.
(61) Li, J.; Otten, M.; Holmes, A. A.; Sharma, S.; Umrigar, C. J. Fast semistochastic heat-bath configuration interaction. J. Chem. Phys. , , 214110.
(62) Loos, P.-F.; Damour, Y.; Scemama, A. The performance of CIPSI on the ground state electronic energy of benzene. J. Chem. Phys. , , 176101.
(63) Paldus, J. In Methods in Computational Molecular Physics; Wilson, S., Diercksen, G. H. F., Eds.; NATO Advanced Study Institute, Series B: Physics; Plenum: New York, 1992; Vol. 293; pp 99–194.
(64) Piecuch, P.; Kowalski, K. In Computational Chemistry: Reviews of Current Trends; Leszczyński, J., Ed.; World Scientific: Singapore, 2000; Vol. 5; pp 1–104.
(65) Kowalski, K.; Piecuch, P. The method of moments of coupled-cluster equations and the renormalized CCSD[T], CCSD(T), CCSD(TQ), and CCSDT(Q) approaches. J. Chem. Phys. , , 18–35.
(66) Piecuch, P.; Kowalski, K.; Pimienta, I. S. O.; McGuire, M. J. Recent advances in electronic structure theory: Method of moments of coupled-cluster equations and renormalized coupled-cluster approaches. Int. Rev. Phys. Chem. , , 527–655.
(67) Dunning, T. H., Jr. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. , , 1007–1023.
(68) Olsen, J.; Jørgensen, P.; Koch, H.; Balkova, A.; Bartlett, R. J. Full configuration-interaction and state of the art correlation calculations on water in a valence double-zeta basis with polarization functions. J. Chem. Phys. , , 8007–8015.
(69) Bauman, N. P.; Shen, J.; Piecuch, P. Combining active-space coupled-cluster approaches with moment energy corrections via the CC(P;Q) methodology: Connected quadruple excitations. Mol. Phys. , , 2860–2891.
(70) Deustua, J. E.; Shen, J.; Piecuch, P. High-level coupled-cluster energetics by Monte Carlo sampling and moment expansions: Further details and comparisons. J. Chem. Phys., submitted (2021).
(71) Schmidt, M. W.; Baldridge, K. K.; Boatz, J. A.; Elbert, S. T.; Gordon, M. S.; Jensen, J. H.; Koseki, S.; Matsunaga, N.; Nguyen, K. A.; Su, S.; Windus, T. L.; Dupuis, M.; Montgomery, J. A., Jr. General atomic and molecular electronic structure system. J. Comput. Chem. , , 1347–1363.
(72) Barca, G. M. J.; Bertoni, C.; Carrington, L.; Datta, D.; De Silva, N.; Deustua, J. E.; Fedorov, D. G.; Gour, J. R.; Gunina, A. O.; Guidez, E.; Harville, T.; Irle, S.; Ivanic, J.; Kowalski, K.; Leang, S. S.; Li, H.; Li, W.; Lutz, J. J.; Magoulas, I.; Mato, J.; Mironov, V.; Nakata, H.; Pham, B. Q.; Piecuch, P.; Poole, D.; Pruitt, S. R.; Rendell, A. P.; Roskop, L. B.; Ruedenberg, K.; Sattasathuchana, T.; Schmidt, M. W.; Shen, J.; Slipchenko, L.; Sosonkina, M.; Sundriyal, V.; Tiwari, A.; Galvez Vallejo, J. L.; Westheimer, B.; Włoch, M.; Xu, P.; Zahariev, F.; Gordon, M. S. Recent developments in the general atomic and molecular electronic structure system. J. Chem. Phys. , , 154102.
(73) Ivanic, J. Direct configuration interaction and multiconfigurational self-consistent-field method for multiple active spaces with variable occupations. I. Method. J. Chem. Phys. , , 9364–9376.
(74) Ivanic, J. Direct configuration interaction and multiconfigurational self-consistent-field method for multiple active spaces with variable occupations. II. Application to oxoMn(salen) and N2O4. J. Chem. Phys. , , 9377–9385.
(75) Ivanic, J.; Ruedenberg, K. Identification of deadwood in configuration spaces through general direct configuration interaction. Theor. Chem. Acc. , , 339–351.
(76) Shen, J.; Piecuch, P. Biorthogonal moment expansions in coupled-cluster theory: Review of key concepts and merging the renormalized and active-space coupled-cluster methods. Chem. Phys. , , 180–202.
(77) Shen, J.; Piecuch, P. Combining active-space coupled-cluster methods with moment energy corrections via the CC(P;Q) methodology, with benchmark calculations for biradical transition states. J. Chem. Phys. , , 144104.
(78) Deustua, J. E.; Shen, J.; Piecuch, P. Converging high-level coupled-cluster energetics by Monte Carlo sampling and moment expansions. Phys. Rev. Lett. , , 223003.
(79) Yuwono, S. H.; Chakraborty, A.; Deustua, J. E.; Shen, J.; Piecuch, P.
Accelerating convergence of equation-of-motion coupled-cluster computations using the semi-stochastic CC(P;Q) formalism. Mol. Phys. , , e1817592.
(80) Piecuch, P.; Włoch, M. Renormalized coupled-cluster methods exploiting left eigenstates of the similarity-transformed Hamiltonian. J. Chem. Phys. , , 224105.
(81) Piecuch, P.; Włoch, M.; Gour, J. R.; Kinal, A. Single-reference, size-extensive, non-iterative coupled-cluster approaches to bond breaking and biradicals. Chem. Phys. Lett. , , 467–474.
(82) Magoulas, I.; Bauman, N. P.; Shen, J.; Piecuch, P. Application of the CC(P;Q) hierarchy of coupled-cluster methods to the beryllium dimer. J. Phys. Chem. A , , 1350–1368.
(83) Yuwono, S. H.; Magoulas, I.; Shen, J.; Piecuch, P. Application of the coupled-cluster CC(P;Q) approaches to the magnesium dimer. Mol. Phys. , , 1486–1506.

Table 1: A comparison of the energies resulting from the various CI and CC all-electron calculations for the H2O molecule, as described by the cc-pVDZ basis set, at the equilibrium and two displaced geometries in which both O–H bonds are stretched by factors of 2 and 3.^a

wave function    CI/CC energy    ec-CC energy (I)    ec-CC energy (II)

R = R_e
CISD             12.023          12.023              3.…^b
CISDT            9.043           9.043               0.…
…                ….327           0.327               0.…
…                ….139           0.139               0.…
…                ….003           0.003               0.…
…^c              ….744           3.744               3.…
…^c              ….493           0.493               0.…
…^c              ….019           0.019               0.…
…^d              −….…

R = 2R_e
CISD             72.017          72.017              22.…^b
CISDT            56.096          56.096              2.…
…                ….819           5.819               5.…
…                ….236           2.236               2.…
…                ….059           0.059               0.…
…^c              ….034           22.034              22.…
…^c              −….…            −….…                −….…
…^c              ….032           0.032               0.…
…^d              −….…

R = 3R_e
CISD             164.949         164.949             10.…^b
CISDT            118.119         118.…               −….…
…                ….150           16.150              16.…
…                ….432           6.432               6.…
…                ….159           0.159               0.…
…^c              ….849           10.849              10.…
…^c              −….…            −….…                −….…
…^c              −….…            −….…                −….…
…^d              −….…

^a The equilibrium geometry, R = R_e, and the geometries that represent a simultaneous stretching of both O–H bonds by factors of 2 and 3 without changing the ∠(H–O–H) angle were taken from ref 68. Unless otherwise stated, all energies are errors relative to FCI in millihartree.
^b Equivalent to CCSD.
^c Taken from ref 69.
^d Total FCI energy in hartree.

Table 2: Convergence of the energies resulting from the all-electron CIPSI calculations initiated from the RHF wave function and the corresponding CIPSI-based ec-CC energies toward FCI for the H2O molecule, as described by the cc-pVDZ basis set, at the equilibrium and two displaced geometries in which both O–H bonds are stretched by factors of 2 and 3.^a

Columns: N_det(in) / N_det(out); %S^b; %D^b; %T^b; %Q^b; CIPSI^c (E_var, E_var + ΔE^(2)); ec-CC^c (I, II, II_3).

[Table body: the numerical entries for the three geometry blocks are not recoverable from the extracted text.]

^a The equilibrium geometry, R = R_e, and the geometries that represent a simultaneous stretching of both O–H bonds by factors of 2 and 3 without changing the ∠(H–O–H) angle were taken from ref 68.
^b %S, %D, %T, and %Q are, respectively, the percentages of the singly, doubly, triply, and quadruply excited S_z = 0 determinants of A_1 symmetry captured during the CIPSI computations.
^c Errors relative to FCI in millihartree (see Table 1 for the FCI energies).
^d Equivalent to RHF.
^e Equivalent to CCSD.
^f Equivalent to CR-CC(2,3).

Table 3: Convergence of the energies resulting from the all-electron CIPSI calculations initiated from the CISD wave function and the corresponding CIPSI-based ec-CC energies toward FCI for the H2O molecule, as described by the cc-pVDZ basis set, at R = R_e.^a

Columns: N_det(in) / N_det(out); %S^b; %D^b; %T^b; %Q^b; CIPSI^c (E_var, E_var + ΔE^(2)); ec-CC^c (I, II, II_3).

[Table body: the numerical entries are not recoverable from the extracted text.]