Many Electrons and the Photon Field -- The many-body structure of nonrelativistic quantum electrodynamics
Florian Buchholz
Max Planck Institute for the Structure and Dynamics of Matter
Basic Research Community for Physics

Dissertation approved by the Faculty II – Mathematics and Natural Sciences of Technische Universität Berlin in fulfillment of the requirements for the degree of
DOCTOR OF SCIENCE
Dr. rer. nat.

Digital Edition

EXAMINATION BOARD:
Chairwoman: Prof. Dr. Ulrike Woggon
Referee: Prof. Dr. Andreas Knorr
Referee: Dr. Michael Ruggenthaler
Referee: Prof. Dr. Angel Rubio
Referee: Prof. Dr. Dieter Bauer

Berlin 2021

ACKNOWLEDGEMENTS
The work presented here is the result of a long process that naturally involved many different people. They all contributed importantly to it, though in more or less explicit ways.

First of all, I want to express my gratitude to Chiara, all my friends and my (German and Italian) family. Thank you for being there!

Next, I want to name Michael Ruggenthaler, who not only supervised me over many years, contributing crucially to the research presented here, but also has become a dear friend. I cannot imagine how my time as a PhD candidate would have been without you! In the same breath, I want to thank Iris Theophilou, who saved me so many times from despair over non-converging codes and other difficult moments. Without you, too, my PhD would not have been the same.

Then, I thank Angel Rubio for his supervision and for making possible my unforgettable and formative time at the Max Planck Institute for the Structure and Dynamics of Matter. There are few people who have the gift of making others feel so excited about physics as Angel does.

Heiko Appel supervised my first steps in the scientific environment and always remained close afterwards. His excitement for unusual questions and ideas greatly influenced my projects and work. Nobody inspired my idea of what science really is more than Markus Penz. He regularly shows with great confidence how to think outside the box and to question things beyond traditional boundaries. Then I want to name Micael Oliveira, the great tamer of the Octopus. He taught me not only to implement, but to develop. I thank Henning Glawe for being patient with me and for using his supernatural powers to make the penguin regain his feet whenever I made it fall. Warm thanks go to my dear friends Björn Bembnista and Wilhelm Bender for the many stimulating nights of discussion about physics and for sharing their great knowledge about coding and minimization. I also want to thank Vasilis Rokaj, Davis Welakuh, Christian Schäfer, Guillem Albareda, Arun Debnath, Nicolas Tancogne-Dejean, Florian Eich, Michael Sentef, Enrico Ronca, Massimo Altarelli and Martin Lüders for many interesting discussions.

Warm thanks go to Uliana Mordovina, Teresa Reinhard, Christian Schäfer, Mary-Leena Tchenkoue Djouom, Norah Hoffmann and Alexandra Göbel for sharing the precious moments with me when we needed to forget science. I thank Fabio Covito, Simone Latini, Matteo Vandelli, and Enrico Ronca for sharing coffee and the feeling of sunny places. And I thank Ute Ramseger, Graciela Sanguino, Kathja Schroeder and Frauke Kleinwort for their great efforts to support us PhD students at the Institute.

Sarah Loos contributed not only to my scientific world view in many inspiring discussions, but also directly to this thesis through her indispensable feedback, understanding what I wanted to say before I did. I also want to thank David Licht and Chiara for valuable feedback on my thesis, and Uliana for help with the layout. Furthermore, I want to mention Michael Duszat and the “deep readers,” who made me aware of how much information a single sentence may contain and who were the very first audience for this thesis. I also want to thank all the many authors who have contributed to making the various StackExchange pages such a valuable platform with answers to so many difficult questions.

Finally, I want to mention the other members of the Basic Research Community for Physics. It is so inspiring to know so many people who share the belief in a cooperative, respectful and open-minded scientific research inside and outside traditional institutions.
ZUSAMMENFASSUNG (SUMMARY)

Recent experimental progress in the field of cavity quantum electrodynamics makes it possible to explore the strong interaction between quantized light and complex matter systems. Due to the coherent coupling between photons and matter degrees of freedom, polaritons emerge, hybrid light-matter quasiparticles that can decisively influence matter properties and complex processes such as chemical reactions. This strong-coupling regime opens up possibilities to control materials and chemistry in an unprecedented way. However, the precise mechanisms behind many such phenomena are not fully understood. One important reason is that the physical problem is often described with highly simplified methods, in which the matter is reduced to a few effective levels. More accurate first-principles methods that treat photons on the same footing as electrons emerge only slowly, since research on such methods is held back both by the increased complexity of the combined electron-photon wave functions and by the fact that two different particle species have to be included.

In this thesis we propose to circumvent these problems by reformulating the coupled electron-photon problem exactly in a different, purpose-built Hilbert space. By representing a system consisting of N electrons and M modes with an N-polariton wave function, we can show explicitly how an electronic-structure method can be turned into a polaritonic-structure method that is accurate from weak to strong coupling strengths. We rationalize this paradigm shift within a comprehensive review of light-matter interaction and by highlighting the connection between different electronic-structure methods and quantum-optical models. This extensive discussion emphasizes that the polariton construction is not merely a mathematical trick but is based on a simple and physical argument: if the excitations of a system have a hybrid character, then it is only natural to formulate the corresponding theory in terms of these new entities.

Finally, we discuss in detail how standard algorithms of electronic-structure methods have to be adapted to do justice to the new Fermi-Bose statistics. To guarantee the corresponding nonlinear inequality constraints, careful development, implementation and validation of the numerical algorithms are necessary. This additional numerical complexity is the price we pay for making the coupled electron-photon problem accessible to first-principles methods.

ABSTRACT

Recent experimental progress in the field of cavity quantum electrodynamics allows the study of the regime of strong interaction between quantized light and complex matter systems. Due to the coherent coupling between photons and matter degrees of freedom, polaritons – hybrid light-matter quasiparticles – emerge, which can significantly influence matter properties and complex processes such as chemical reactions. This strong-coupling regime opens up possibilities to control materials and chemistry in an unprecedented way. However, the precise mechanisms behind many of these phenomena are not yet entirely understood. One important reason is that the physical problem is often described with highly simplified models, where the matter system is reduced to a few effective levels. More accurate first-principles approaches that consider photons on the same footing as electrons emerge only slowly. Their development is hampered by the increased complexity of the combined electron-photon wave functions and by the fact that we have to deal with two different species of particles.

In this thesis we propose a way to overcome these problems by reformulating the coupled electron-photon problem in an exact way in a different, purpose-built Hilbert space, where the basic physical entities are no longer electrons and photons but polaritons. Representing an N-electron-M-mode system by an N-polariton wave function with hybrid Fermi-Bose statistics, we show explicitly how to turn electronic-structure methods into polaritonic-structure methods that are accurate from the weak to the strong-coupling regime. We elucidate this paradigmatic shift by a comprehensive review of light-matter coupling, as well as by highlighting the connection between different electronic-structure methods and quantum-optical models. This extensive discussion accentuates that the polariton description is not only a mathematical trick, but is grounded in a simple and intuitive physical argument: when the excitations of a system are hybrid entities, a formulation of the theory in terms of these new entities is natural.

Finally, we discuss in great detail how to adapt standard algorithms of electronic-structure methods so that they adhere to the new hybrid Fermi-Bose statistics. Guaranteeing the corresponding nonlinear inequality constraints in practice requires a careful development, implementation and validation of numerical algorithms. This extra numerical complexity is the price we pay for making the coupled matter-photon problem feasible for first-principles methods.

LIST OF PUBLICATIONS
Some of the results of my research as a PhD candidate were published prior to this thesis. The following publications are part of this thesis:

[1] Buchholz, F., Theophilou, I., Nielsen, S. E., Ruggenthaler, M., and Rubio, A.
Reduced Density-Matrix Approach to Strong Matter-Photon Interaction
ACS Photonics, American Chemical Society, 2019, 6, 2694
DOI:10.1021/acsphotonics.9b00648

[2] Buchholz, F., Theophilou, I., Giesbertz, K. J. H., Ruggenthaler, M., and Rubio, A.
Light-matter hybrid-orbital-based first-principles methods: the influence of the polariton statistics
J. Chem. Theory Comput., American Chemical Society, 2020, 16, 24
DOI:10.1021/acs.jctc.0c00469

[3] Tancogne-Dejean, N., Oliveira, M. J. et al.
Octopus, a computational framework for exploring light-driven phenomena and quantum dynamics in extended and finite systems
J. Chem. Phys., American Institute of Physics Inc., 2020, 152, 124119
Ch. 4: Dressed reduced density matrix functional theory for ultra-strongly coupled light-matter systems
Ch. 14: Conjugate gradient implementation in RDMFT
DOI:10.1063/1.5142502

The following publication is not part of this thesis:

[4] Theophilou, I., Buchholz, F., Eich, F. G., Ruggenthaler, M., and Rubio, A.
Kinetic-Energy Density-Functional Theory on a Lattice
J. Chem. Theory Comput., American Chemical Society, 2018, 14, 4072
DOI:10.1021/acs.jctc.8b00292
REMARKS ON NOTATION AND TERMINOLOGY

For better readability, we try whenever possible to refrain from abbreviating words. However, the following few abbreviations are used several times in this text (note that we do not always use the abbreviated form):

NR-QED   non-relativistic quantum electrodynamics
QED      quantum electrodynamics
HF       Hartree-Fock
DFT      density functional theory
QEDFT    quantum-electrodynamical density functional theory
KS       Kohn-Sham
RDM      reduced density matrix
RDMFT    reduced density matrix functional theory
MF       mean field
pXC      photon-exchange-correlation (only part IV)

We want to comment on the terms electronic-structure theory, many-body theory, and first-principles, which are used almost interchangeably. However, there is a hierarchy between them. The idea of describing a system from first principles is to make use of as little scenario-specific knowledge as possible. For instance, to describe the equilibrium properties of a helium atom, the standard first-principles approach would consider a doubly positively charged nucleus and two electrons. To describe such a 2-electron-1-nucleus system, we need methods that efficiently approximate the interaction between all the particles. We call this in general a many-body problem and the research area connected to it many-body theory. Thus, a first-principles description of microscopic systems is done with many-body methods. Importantly, we can often separate the electronic from the nuclear dynamics (Born-Oppenheimer approximation, see Sec. 1.3.3), and for a plethora of phenomena it is sufficient to describe only the former part, i.e., the electronic structure, accurately. This research area is generally called electronic-structure theory and it comprises most of the known many-body methods (see also Sec. 2).

We will sometimes quote from publications that are written in German. In this case, my translations are provided below (in italic letters).

CONTENTS
List of publications
Contents
Introduction

I Light, Matter and Strong Coupling

II Dressed Orbitals - Old Theory in a New Basis

III Numerics
[...] Octopus
8.2 Hybrid Statistics

IV Concluding Remarks

A.1 The bosonic symmetry of the photon wave function
A.2 Conjugate gradients algorithm for real orbitals
A.3 Gradient of the electronic natural occupation numbers

B Validation of the occupation number optimization in RDMFT
B.1 Definition of the test setting
B.2 Visualization of the algorithm
B.3 The identification of a resolved bug of polaritonic RDMFT
B.4 A simple resolution of the issue

C Convergence of dressed orbitals in Octopus
C.1 Validation of the exact dressed many-body ground-state
C.2 Validation of the dHF routine
C.3 Validation of dRDMFT
C.4 Protocol for the convergence of a dHF/dRDMFT calculation

Bibliography
INTRODUCTION
In the natural sciences, the phenomenon of electricity and its relation to charged particles has been a research topic since at least the 16th century, and its importance has continuously increased since then. In the 19th century, researchers started to understand not only the connection between charge and electric forces more quantitatively, but also the relation of both to magnetism and even light. Nowadays, all these phenomena are understood as different aspects of the electromagnetic field, whose dynamics and interaction with charge is well described by one set of four coupled equations, named after James Clerk Maxwell. Rapid advances in experimental techniques at the turn of the twentieth century revealed that charge is a characteristic of all materials, not only of certain materials, as had been thought until then. To the best of our present knowledge, all matter consists of atoms, and all atoms consist of negatively charged electrons and positively charged nuclei. To understand matter on these atomic scales, physicists had to give up the classical concept of point particles governed by Newton’s laws, which is accurate only in the macroscopic world, and replace it by the set of tools and laws of quantum mechanics. For a consistent description, not only matter but also the electromagnetic field had to be quantized, and finally in 1938, Wolfgang Pauli and Markus Fierz formulated the theory that accounts for the full quantum nature of electrodynamics and charged particles on atomic scales [7]. Today, we call this theory non-relativistic quantum electrodynamics (NR-QED) (or Pauli-Fierz theory after its developers). It is claimed that NR-QED describes “any physical phenomenon in between [gravity on the Newtonian level and nuclear- and high-energy physics], including life on Earth” [9, p. 157]. In other words, physicists have reduced all life on Earth to the interaction between charged particles and the electromagnetic field.

However, as the full theory of NR-QED is too difficult for strict mathematical deductions, the limit cases have become much more important in the following years. For example, the quantized electromagnetic field, possibly controlled by some classical external charges or charge currents, is very well understood today. A very active field of research is quantum optics, where physicists investigate the interaction between simple models of matter and photons, i.e., the quanta of the electromagnetic field. This allows one to study fundamental atomic processes, such as light emission and absorption, to characterize the quantum nature of light, or to develop important devices, e.g., lasers or single-photon emitters. The crucial approximation behind quantum-optical models is the simplification of the matter degrees of freedom, which makes it possible to study the photon field in detail.

If we instead perform the static limit, that is, we turn off the interaction with the quantized part of the electromagnetic field [...]

Footnotes:
See for example the very well-documented article on Wikipedia: https://en.wikipedia.org/wiki/Electricity , accessed 12.06.2020.
Naturally, there were many people involved in the discovery of these equations, but it was Maxwell who added a last term to make the system of equations consistent [5, part 6.3]. For more information on the history of Maxwell’s equations and a well-readable introduction to the topic, we recommend the Wikipedia article and the references therein: https://en.wikipedia.org/wiki/Maxwell’s_equations , accessed 12.06.2020.
According to the standard model of particle physics [6], nuclei can be even further divided into smaller constituents, but this does not influence our statement.
The term “non-relativistic” refers here to the matter description that does not include high-energy phenomena like particle creation or annihilation processes.
See Ref. [8].
Note that this limit case is very important for the theoretical understanding of field quantization [10], but in practice it is basically not relevant. The reason is that the quantum nature is barely visible without the interaction with matter. See for example the discussion in part 3 of the book by Keller [11].

To really follow this programme, i.e., to describe matter from first principles in practice, many further approximations are necessary. The static limit of the Pauli-Fierz theory, i.e., the Schrödinger theory of, say, N interacting electrons, is still too difficult for exact solutions if N is large. The reason is that the theory describes the state of the system by the many-body wave function, which depends on N coordinates. This means that the configuration space, i.e., the space of all these functions, grows exponentially with N, which makes it entirely inaccessible for larger N (we will discuss this in detail in Ch. 2). This is the so-called (quantum) many-body problem, which is so severe that the Nobel laureate Walter Kohn questioned (for large systems) the legitimacy of the many-electron wave function (and thus of the whole theory) as a scientific concept [13]. However, in the research field of electronic-structure theory (or more generally many-body theory), many accurate approximation strategies and even alternative formulations have been developed to describe matter on the microscopic level, i.e., from first principles.
Nowadays, thanks to these methods and high-performance computers, we can predict the structure of many molecules and crystalline solids, calculate many of their properties like excitation spectra, and even understand complex processes like chemical reactions.

Interestingly, quantum opticians and many-body theorists (and the same is true for others) have conducted their research without much overlap despite their common origin. Both communities study different aspects of the quantized theory of charged particles and the electromagnetic field. Only recently has this changed, and the full Pauli-Fierz theory has started to attract renewed interest. One reason is that new experimental techniques nowadays allow one to probe systems and parameter regimes where many-body effects and the quantum nature of the electromagnetic field both play a role [14]. In this strong-coupling regime, hybrid light-matter particles (so-called polaritons) emerge that are capable of modifying the properties of the coupled system significantly in comparison to the separate subsystems. Applications include the possibility of building polariton lasers [15], the modification of chemical landscapes [16], the control of long-range energy transfer between different matter systems [17], or the emergence of entirely new states of matter [14, 18, 19]. In such scenarios, fundamental approximations of the traditional methods and models break down, and consequently “many theoretical works [have been proposed] which diverge significantly in their predictions compared to experiments” [20]. This suggests taking the more general perspective of Pauli-Fierz theory and reevaluating in a less-biased way the assumptions and approximations of the standard methods. 
We believe that this is the natural task of a generalized first-principles approach that treats matter and photons on the same footing.

To describe all the degrees of freedom of Pauli-Fierz theory from first principles, it is necessary to find ways to efficiently describe the interaction between electrons, or more generally charged particles, and photons (in a similar way as researchers in electronic-structure theory once found ways to deal with the Coulomb interaction between electrons). It is clear that adding the quantized electromagnetic field to the already difficult many-electron problem is a very challenging task. In fact, the many-body problem in Pauli-Fierz [...]

Footnotes:
This model is in impressively many cases sufficient to understand the structure of atoms, molecules, but also condensed matter, their spectra and many more properties. However, there are phenomena such as molecular vibrations or the heat capacity of solids that require taking the dynamics and often also the quantum nature of the nuclei into account, and there are generalizations of the model to account for that. The inner structure of the nuclei, e.g., the dynamics of protons, neutrons or even quarks, in contrast very rarely influences the properties of matter in a direct way. It is usually sufficient to describe the atomic nucleus as an effective point charge with a mass and a spin.
To the best of our knowledge, the largest system that has been described exactly until today, i.e., in the basis-set limit, consisted of N = 54 electrons in a very special scenario [12].
Note that in the literature the term ab-initio is often used equivalently to first principles.

[...] The dipole and few-mode approximation are very common in the field of cavity QED, where researchers study matter systems inside optical cavities, i.e., resonators that “trap” photons with selected frequencies [21]. Most of the aforementioned strong-coupling phenomena have been observed in such cavity settings. A very important motivation for our work is a set of open debates in the only recently established field of polaritonic chemistry [22]. Here researchers study the (possibly considerable) influence of cavity photons on molecular systems and complex chemical processes such as reactions. A reasonable first step to understand this influence is to study the physical setting of cavity QED from first principles. Importantly, this limit case of the full Pauli-Fierz theory already exhibits many fundamental issues and challenges of the coupled electron-photon problem and thus defines a very good starting point for the development of new methods. We will hereby focus on equilibrium scenarios, which play an important role in the current debates in polaritonic chemistry but are considerably less well studied than the time-dependent case [24].

One of the most important challenges for the description from first principles is the simultaneous inclusion of (at least) two particle species, e.g., electrons and photons. Most existing methods are geared to the accurate description of only one particle species, such as electrons and their interaction. For instance, most many-body methods explicitly take the particle statistics, e.g., the Fermi statistics of electrons, into account. This is usually a very important part of an accurate description (see Ch. 2). 
Any kind of generalization of such methods to treat more species thus faces the problem of describing more degrees of freedom, which have different statistics (and other properties) and a different type of interaction. A prominent example is the accurate description of non-adiabatic effects between electrons and nuclei, which is an unresolved problem for many relevant scenarios [25]. We face a similar challenge in cavity QED when we want to describe the physics of polaritons, where matter and light degrees of freedom are strongly mixed. For instance, the emergence of polaritons induces correlation between the matter degrees of freedom [2, 26], which can make the accurate description considerably more difficult (see Ch. 3).

Motivated by the challenges of a multi-species description and the prominent role of polaritons in strong-coupling physics, we propose to reformulate the coupled electron-photon problem in a new purpose-built Hilbert space. The basic entities here are no longer electrons and photons, but polaritons. This allows us to represent a system that consists of, say, N electrons and M photon modes in an exact manner by an N-polariton wave function that adheres to hybrid Fermi-Bose statistics. Importantly, the new Hamiltonian structurally resembles the Hamiltonian of an N-electron system, i.e., both operators consist of a one-body part (kinetic and potential energy) and a two-body part (interaction energy). This allows for a straightforward application of established electronic-structure methods to describe electrons and photons on the same footing and makes this mathematical reformulation of cavity QED practical.

The derivation of this dressed-orbital construction, on the one hand, can be explained purely with mathematical similarities. On the other hand, we can rationalize the approach by combining the basic principles [...]

Footnotes:
Thus we ignore, e.g., the spatial dependence of the electron-photon interaction.
In their simplest form, one can imagine a cavity as two (high-quality) mirrors that are positioned opposite each other at a certain distance. The distance selects a light mode with a certain frequency, and the corresponding photons are reflected back and forth very often before they can dissipate. Every time they cross the volume of the cavity, the photons can interact with the matter system, which can be effectively described by an increased light-matter coupling strength.
Note that there are cavity experiments which require a theoretical description beyond the dipole and few-mode approximation. However, in most cases these approximations are very accurate [21].
A “full” first-principles perspective would explicitly describe the cavity as a part of the matter system. This would, however, require describing the electromagnetic field fully spatially resolved.
See Ref. [23].
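The exponential growth of the configuration space with N, which underlies the many-body problem described in this introduction, can be illustrated with a rough estimate. The following Python sketch is not part of the thesis; it assumes a hypothetical real-space grid with a fixed number of points per axis, three spatial coordinates per electron, and 16 bytes per complex amplitude, and it ignores spin and the antisymmetry of the wave function. The function name `wavefunction_memory_bytes` and all default values are illustrative assumptions.

```python
# Illustrative estimate only (assumed grid size and storage format):
# memory needed to store an N-electron wave function on a real-space grid.

def wavefunction_memory_bytes(n_electrons: int,
                              points_per_axis: int = 10,
                              spatial_dims: int = 3,
                              bytes_per_amplitude: int = 16) -> int:
    """Return the bytes needed for one amplitude per configuration-space grid point."""
    # Each electron contributes `spatial_dims` coordinates to the configuration space.
    n_coordinates = spatial_dims * n_electrons
    # One grid axis per coordinate, so the number of amplitudes is a power of the axis size.
    n_amplitudes = points_per_axis ** n_coordinates
    return n_amplitudes * bytes_per_amplitude


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"N = {n}: {wavefunction_memory_bytes(n):.3e} bytes")
```

Under these assumptions, each additional electron multiplies the storage requirement by the cube of the number of grid points per axis, which is the scaling that makes a direct representation of the wave function infeasible for larger N.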
Outline
This thesis consists of three parts that reflect the different aspects of our research. The topic of part I is the general theoretical analysis of coupled light-matter systems. We start (Ch. 1) with the introduction of the physical setting of cavity QED. We discuss the phenomenology of strongly coupled electron-photon systems, the standard way to understand their principal features, and the limitations of this perspective, which we illustrate with concrete examples. We then discuss these standard approaches in more detail (quantum-optics perspective) and define a framework that allows for a more general description from first principles. In the second chapter (Ch. 2), we turn to the static limit of NR-QED and introduce first-principles electronic-structure theory. We outline the challenges of the accurate description of many-electron systems and discuss three specific approaches to deal with these. In the last chapter of part I (Ch. 3), we generalize these approaches to coupled light-matter systems and discuss them in detail. The analysis reveals why, in particular in equilibrium scenarios, many powerful concepts of electronic-structure theory are less useful in the coupled setting.

Motivated by the results of the analysis of the first part, we propose in part II a new strategy to describe coupled light-matter problems by introducing a purpose-built Hilbert space (dressed-orbital construction, Ch. 4). This approach allows us to restructure the coupled electron-photon many-body space such that we can describe a system by a “many-polariton” wave function. We thus define polaritons as their own particle species that has electronic (fermionic) and photonic (bosonic) degrees of freedom and consequently adheres to hybrid Fermi-Bose statistics. In the polariton description, the coupled light-matter Hamiltonian resembles the electronic-structure Hamiltonian, which allows us to generalize electronic-structure methods to the coupled problem in a very straightforward way (Ch. 5). We show this explicitly with the example of Hartree-Fock (HF) theory and reduced density matrix functional theory (RDMFT) and present first results for model systems in Ch. 6. Despite their reduced dimensionality, these example systems already exhibit a rich spectrum of nontrivial behavior that is accurately described by the newly proposed methods. This highlights the potential of the polariton description.

After the discussion of the theory and the results, we concentrate in part III on the details of the numerical part of the research. The gain of making first-principles methods applicable to strongly coupled light-matter systems is accompanied by the need for new algorithms and an increased numerical complexity. We first present specific algorithms to solve the electronic HF and RDMFT equations in real space, including a newly developed conjugate-gradients algorithm (Ch. 7). We then explain how to modify these algorithms to describe coupled light-matter systems by means of the dressed construction (Ch. 8). We present in great detail the validation of our implementation in the electronic-structure code Octopus [3] and show how the results presented in part II have been converged. This implementation was geared toward the two-polariton case. In Sec. 8.2, we finish the numerical part by presenting an algorithm for the general case.

We conclude the work by presenting in part IV the conclusions and perspective.
PART I
LIGHT, MATTER AND STRONG COUPLING
Few distinctions in quantum mechanics are as important as that between fermions and bosons. [...] I do not have the authority to assert that God agrees with me as to the importance of this distinction, but I am sure that most happy humans will since, as noted by Eddington, if there were no fermions there would be no electrons, so no molecules, so no DNA, no humans!
(A.J. Coleman, 2007 [27, Ch. 1])

Chapter 1
STRONG LIGHT-MATTER COUPLING: EXPERIMENTS, THEORY AND MORE THEORY
This chapter aims to motivate the need for first-principles approaches to describe strong-coupling phenomena and to define a theoretical framework that is suited to investigate such approaches. This framework is given by the non-relativistic Pauli-Fierz theory, i.e., an interacting quantum-field theory that allows us to define the equilibrium properties of coupled many-electron-photon systems. The theory includes as a limit case the physical setting of cavity QED, which entails the dipole approximation and the restriction to a few effective modes. This level of theory is still enough to capture the possibly strong modifications of atoms, molecules and solids due to the coupling to the modes of an optical cavity.
In electrodynamics, we can differentiate between several effects and interactions, whose strength depends on the physical setting. Electric charges, for instance, attract or repel each other by the Coulomb force, which is the dominant interaction on atomic scales. It is thus impossible to understand the properties of condensed matter or molecular systems without the Coulomb interaction. However, when we want to study electrically neutral entities (like the atoms of a gas), Coulomb forces typically play a negligible role. Then, there are magnetic forces that are connected to electric currents or spins. Such interactions are crucial to understand phenomena such as ferromagnetism or the quantum Hall effect, but do not play an important role in, e.g., spin-saturated (closed-shell) systems or in thermodynamic equilibrium. Besides the role of the electromagnetic field as mediator of interaction, there is another important degree of freedom, which we call electromagnetic radiation or simply light. Since light can move freely, it is treated in the theoretical description as a separate entity that can interact with matter (charge) via absorption and emission processes. Usually, this interaction is so small in comparison to, e.g., Coulomb or magnetic forces that we can treat absorption and emission processes perturbatively. For example, this means that the emission of a photon by a matter system usually does not influence its properties, i.e., there is a negligible back reaction. This is expressed in the fact that the coupling constant between the free electromagnetic field and charged particles is small independently of the system of units.
However, the combined efforts of researchers from many research communities have revealed that, although difficult, it is indeed possible to overcome this fundamental limitation and reach strong interaction [...] coupling between light and the matter system. This makes (strong) modifications of matter properties possible for very small field strengths or even only the vacuum [35, 22]. Thus, strong-interaction phenomena can be studied, but without, e.g., the heating due to strong lasers and also with additional nontrivial quantum effects. With the first breakthrough experiments [36, 37] only about two decades ago, it is a relatively young research topic, but because of its high potential for applications, the investigation of this strong-coupling regime literally “has exploded [...] in the past few years” [38]. Nowadays, strong coupling has been demonstrated in various systems with not only distinct basic entities, i.e., the employed matter system and the degrees of freedom of field and matter that are coupled to each other, but even different mechanisms that allow one to reach strong coupling. However, all these systems share two key features, which we can loosely define in the following way: for a matter system to reach strong coupling with certain modes of the electromagnetic field,
1. these modes have to be confined to very small volumes, and
2. the matter system has to be chosen such that it responds especially strongly to these confined modes.
It is difficult to make this definition more concrete because, on the one hand, there are so many ways to reach strong coupling. On the other hand, many different communities participate in the research and there is no ultimate consensus on the correct definition of strong coupling.
Lacking such a consensus, we start in the next paragraph with the (relatively) unquestioned part: the experimentally determined facts and their basic interpretation. According to this interpretation, all the phenomena of the strong-coupling regime are related by the emergence of hybrid light-matter quasi-particle states, called polaritons. These determine the properties of the combined light-matter system, which can be significantly different from the properties of the separate subsystems. This mechanism can be understood impressively well by a minimal model. In fact, most of the strong-coupling effects can be classified according to the different parameters of this model. These parameters were and still are a very important guideline for experimentalists to design new setups that reach the strong-coupling regime.
However, the explanatory power of this and similar cavity-QED models is limited when the matter systems are sufficiently complex. Take for instance the field of polaritonic chemistry, where researchers modify the properties of molecular systems by coupling them to photons, e.g., they control chemical reactions by letting them take place inside a cavity. Nowadays conducted routinely, these experiments usually take place at room temperature, involving complex molecules in some solvent, and often with lossy cavities. Even outside the cavity, the accurate description of such settings requires sophisticated first-principles methods.
Footnote: Strictly speaking, light denotes electromagnetic radiation in a certain frequency (or wavelength) interval that can be perceived by the human eye. However, it has become customary to extend this definition and denote, e.g., the spectra with smaller and larger wavelengths than the visible range as infra-red and ultra-violet light, respectively.
Footnote: It is called the fine-structure constant, which is dimensionless and has approximately the value $\alpha \approx 1/137$. See for example [28].
Footnote: As we will see in the following, the topic of strong coupling is located between many established research fields. This sometimes leads to situations where scientists have to explain and defend basic concepts of their field to (in this regard) non-specialists. This can be a formidable task, as many long and heated discussions at conferences and in peer-review processes have testified.
The experimental breakthrough of strong coupling
In one of the most cited reviews of the field, the experimentalist Thomas Ebbesen names two publications of 1998 as the breakthrough experiments that “generated increased interest among physicists” [22]. In the first one, Lidzey et al. [36] achieved strong coupling by fabricating a so-called microcavity, which is a very small resonator that is capable of trapping the photons of a mode with frequency $\omega_{\mathrm{ph}}$ inside a very small volume of space (key feature one). The breakthrough, however, was not achieved by their advances in cavity fabrication, but by the matter system (a special type of organic semiconductor) that they coupled to the cavity modes (key feature two). To get an idea of the setting, we show a simplified sketch in Fig. 1.1. To prove that their system was able to reach strong coupling, they pumped the cavity mode by an external laser pulse for a series of values of $\omega_{\mathrm{ph}}$. In Fig. 1.2, we have depicted a sketch of the “typical” outcome of this type of experiment. When they tuned $\omega_{\mathrm{ph}}$ close to the frequency $\omega_{\mathrm{m}}$ of a certain excitation of the matter system, they observed a splitting of the absorption peak, i.e., two peaks symmetrically distributed to the left and right of $\omega_{\mathrm{m}} \approx \omega_{\mathrm{ph}}$ (orange solid line in Fig. 1.2 (a)). When they measured the same absorption spectrum of the matter system, but outside the cavity, they instead observed only one peak at $\omega_{\mathrm{m}}$ (blue dashed line in Fig. 1.2 (a)). Collecting the positions of all those peaks as a function of the mode frequency $\omega_{\mathrm{ph}}$ in one graph, they observed two lines that approach each other until they reach a minimal distance at resonance, $\omega_{\mathrm{ph}} = \omega_{\mathrm{m}}$, and then move apart again, as schematically depicted in Fig. 1.2 (b). Without light-matter coupling, both curves would cross each other (blue, dashed lines in the plot) and thus the observed anti-crossing or “Rabi splitting” is considered the principal indicator for (strong) coupling.
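The anti-crossing described above can be reproduced with a very small numerical sketch. The following Python snippet is only an illustration and not the analysis of Ref. [36]: it assumes a generic 2×2 coupled-mode Hamiltonian with a hypothetical coupling energy `g` (in units where $\hbar = 1$), and the names `polariton_branches`, `omega_m`, `g` and the chosen numbers are assumptions of this example. How `g` relates to the Rabi frequency depends on the convention used.

```python
import numpy as np


def polariton_branches(omega_ph, omega_m, g):
    """Return (lower, upper) eigenvalues of the 2x2 coupled-mode Hamiltonian."""
    hamiltonian = np.array([[omega_ph, g], [g, omega_m]], dtype=float)
    lower, upper = np.linalg.eigvalsh(hamiltonian)  # eigenvalues in ascending order
    return lower, upper


# Hypothetical parameters in arbitrary energy units (hbar = 1).
omega_m = 1.0  # frequency of the matter transition
g = 0.05       # light-matter coupling energy (assumption of this sketch)

# Scan the cavity frequency through the matter resonance.
omega_ph_values = np.linspace(0.8, 1.2, 41)
gaps = np.array([
    upper - lower
    for lower, upper in (polariton_branches(w, omega_m, g) for w in omega_ph_values)
])

# The branches never cross: the gap is smallest at resonance.
i_min = int(np.argmin(gaps))
print(f"minimal gap {gaps[i_min]:.4f} at omega_ph = {omega_ph_values[i_min]:.2f}")
```

In this sketch the gap between the two branches is $\sqrt{(\omega_{\mathrm{ph}} - \omega_{\mathrm{m}})^2 + 4g^2}$, so it never closes for $g \neq 0$ and takes its minimum $2g$ at resonance. Setting `g = 0` recovers the two crossing lines of the uncoupled system.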
The minimal distance $d_{\min} = \hbar\Omega_R$ between the two lines is proportional to the Rabi frequency $\Omega_R$, which measures the strength of the light-matter coupling ($\hbar$ denotes, as usual, the reduced Planck constant). For their experiment, Lidzey et al. found a maximal value of $\hbar\Omega_R =$ [...] any Rabi splitting reported before and which explains the work’s impact.
In the other paper that Ebbesen cited, Fujita et al. [37] measured a Rabi splitting of $\hbar\Omega_R =$ [...]
How can we understand the phenomenon of strong coupling?
Although Rabi splitting had already been achieved experimentally before 1998, it was especially the new materials employed in Refs. [36, 37] that allowed for the large values of $\hbar\Omega_R$. The typical (excitonic) transitions in organic semi-conductors respond very strongly to the driving by electromagnetic radiation, which is expressed by a large value of their (dimensionless) oscillator strength $f$. The oscillator strength is thus one (and in fact the most common) measure to quantify the second key feature of strong coupling. A very good measure of the first key feature is the so-called mode volume $V$, which loosely speaking denotes the “space in which the mode is confined.” For, say, a cubic cavity with side length $a$, we can simply calculate $V = a^3$.
Figure 1.1: [...] $\mathbf{d}$ of a matter system (here illustrated by a benzene molecule) couples strongly to the electric field $\mathbf{E}$ of the confined modes (red) inside a cavity (here illustrated by two concave mirrors in blue).
Footnote: Note that strong coupling had already been achieved before 1998. See below.
Footnote: To be precise, the cavity does not trap photons of exactly one mode but of a narrow band of modes around a center with frequency $\omega_{\mathrm{ph}}$. Considering only $\omega_{\mathrm{ph}}$ instead of the full band is usually justified for this kind of cavity.
Footnote: In this type of cavity, the trapped mode frequency $\omega_{\mathrm{ph}}$ is controlled by the incidence angle of the laser pulse. Thus, to perform a measurement series for $\omega_{\mathrm{ph}}$, they merely had to vary this incidence angle.
Footnote: Theoretically, the anti-crossing happens for every light-matter system that has a non-vanishing coupling. However, the splitting may be too small to be spectroscopically resolvable, i.e., it is smaller than the linewidth of the two peaks. In this case, the system is said to be in the weak-coupling regime.
Footnote: See for instance Refs. [41, 42, 43, 44]. In these experiments, lower effective coupling strengths were reached and they were usually conducted under very restrictive conditions such as cryostatic temperatures.
Nowadays, we call basically every experimental setup that accomplishes such a confinement a cavity. Therefore, strong coupling is often considered a subfield of cavity QED, which generally deals with coupled matter-cavity systems [21]. We can derive an explicit expression for the Rabi frequency from the oscillator strength and the mode volume by a minimal model. The model considers one matter transition between, say, an energetically lower state $|m_1\rangle$ and a higher state $|m_2\rangle$ with energy difference $\hbar\omega_{\mathrm{m}}$, and an electromagnetic mode with frequency $\omega_{\mathrm{ph}}$. Importantly, we do not need to further specify the nature of this transition, and $|m_1\rangle$, $|m_2\rangle$ may represent, for example, electronic, vibrational or some collective states. The only necessary (external) parameter is the transition dipole moment $\mathbf{d}$ between the two states, whose absolute value is proportional to the square root of the oscillator strength, $|\mathbf{d}| \propto \sqrt{f}$. A straightforward extension of the model takes $N$ identical matter systems into account by simply exchanging $f \to N f$. The matter dipole couples to the electric field $\mathbf{E}$ of the mode, whose magnitude $|\mathbf{E}| \propto 1/\sqrt{V}$ is proportional to the inverse of the square root of $V$. Close to resonance and neglecting losses for the moment, the Rabi frequency for this so-called Jaynes-Cummings model [45] (or Tavis-Cummings model for $N > 1$) is
$$\hbar\Omega_R = \mathbf{d} \cdot \mathbf{E} \propto \sqrt{\frac{N f}{V}}. \tag{1.1}$$
Despite its simplicity, this model, or one of its generalizations that we collectively subsume under the term cavity-QED models, describes the principal physics of the strong-coupling regime well. This accordance is very important, because it indicates that the dominant phenomenon of the regime is the hybridization between two energy transitions.
Footnote: In other setups, this simple formula is not valid anymore, but has to be replaced by a more general mode volume that can be assigned to, e.g., nanoplasmonic cavities.
Footnote: We want to remark that the matter states are often referred to as ground and excited state in the literature. However, this nomenclature is quite misleading, since in most experiments, e.g., for the excitonic transitions of our two examples, both states are actually excited states.
Footnote: Another important example is the Rabi-(Dicke-)model that includes so-called off-resonant terms for one (many identical) matter transition(s). Other generalizations may also include more than two (but not many more) electronic energy levels. See Sec. 1.2 for further details on cavity models.
Figure 1.2 (panels (a)-(d); panel titles: weak coupling, $g < \gamma$; strong coupling, $g > \gamma$): We depict the idealized data of a strong-coupling experiment according to the Jaynes-Cummings model, which provides a good description of the dynamics of coupled light-matter systems close to resonance, i.e., $\omega_{\mathrm{m}} \approx \omega_{\mathrm{ph}}$, where $\omega_{\mathrm{ph}}$ and $\omega_{\mathrm{m}}$ are the frequencies of the (high-Q) cavity mode and a matter transition, respectively. This allows us to differentiate two coupling regimes depending on the ratio between the coupling constant $g$ and the spontaneous emission rate $\gamma$.
If $g < \gamma$, the light-matter coupling manifests in an effective line broadening, i.e., an increase of the spontaneous emission rate of the combined system $\gamma_P$ (the combined system corresponds in all plots to the orange, solid line) with respect to the matter system outside of the cavity $\gamma$ (in all plots the blue, dashed line). The ratio $P = \gamma_P / \gamma$ is called the Purcell factor and it is related to the coupling strength, $P \propto g^2$ [40]. This can be explained by the energy eigenspectrum of the Jaynes-Cummings Hamiltonian (part c): the light-matter coupling leads to an anti-crossing between the electronic and photonic energy eigenvalues. This anti-crossing is so small that it cannot be resolved spectroscopically, but becomes visible only as the broadening.
If instead $g > \gamma$, two separated peaks can be distinguished (part b), which characterizes the strong-coupling regime. The coupling between light and matter is here so strong that the Rabi splitting $2\hbar\Omega_R \propto \sqrt{\gamma^2 + g^2}$ and thereby the coupling constant $g$ is measurable. In the spectrum (part d), we can observe how two clearly separated lines emerge inside the cavity, which describe the dispersion relation of the two polaritons.
The Jaynes-Cummings model, which we will employ as a prototype for cavity-QED models, provides us not only with a handy formula for the Rabi frequency but also with an interpretation of the new transitions, which have a hybrid electron-photon character. As is common in quantum physics, we interpret such effective degrees of freedom as quasiparticles, which in this case are called polaritons. In the Jaynes-Cummings model, polaritons “emerge” as the eigenstates of the model Hamiltonian and they have the simple form
$$P_n^{+/-} = \alpha \, (|m_1\rangle \otimes |n+1\rangle) \pm \beta \, (|m_2\rangle \otimes |n\rangle), \tag{1.2}$$
where $|n\rangle$ denotes the $n$-photon state of the mode and the coefficients $\alpha$, $\beta$ depend on the details of the model. The power of the Jaynes-Cummings model lies in its extreme simplification of the, in general, arbitrarily complex phenomenology of light-matter interaction. It provides us with simple concepts and thus a vocabulary to describe and discuss strong-coupling physics. To this day, it is the most important tool for the interpretation of strong-coupling experiments, which is remarkable in the light of the variety of the field. An exhaustive overview of this variety is clearly beyond the scope of this thesis and the interested reader is referred to, e.g., the reviews of Litinskaya et al. [47], Törmä and Barnes [35], Ebbesen [22], Kockum et al. [48], Ruggenthaler et al. [14] and the references therein.
We content ourselves instead with the presentation of some selected examples that give the reader a flavour of the richness of polaritonic physics and illustrate the challenges for the theoretical description.
The variety of strong-coupling phenomena: some selected experimental examples
We start with some general considerations. The relation (1.1) presents the three major “knobs” that have been turned in the past twenty years to reach the strong light-matter coupling regime with many different systems inside cavities:
1. the mode volume $V$,
2. the oscillator strength $f$, and connected to $f$,
3. the number of oscillators $N$ that couple collectively to the mode.
From a broad perspective, the most important of these knobs to control the light-matter coupling is the mode volume $V$. As we have mentioned in the introduction, the light-matter coupling strength is fundamentally determined by the fine-structure constant $\alpha$, which is small independently of the system of units used. This is the reason why strong coupling is very difficult to reach in practice, and the phenomena that we discuss here can only occur because of the modern cavities that strongly reduce $V$. Thus, all strong-coupling experiments use some form of a cavity or, more precisely, they confine some spectral band of the electromagnetic field in small volumes, in which they position some matter system. However, even with state-of-the-art cavities, we cannot reach the strong-coupling regime with all materials, but we also need to turn the second and third knob. Regarding the oscillator strength $f$, especially organic materials showed to [...]
Footnote: Note that if we assume that both transitions stem from a harmonic oscillator, we cannot differentiate the classical from the quantum description (Hopfield model) anymore in a spectroscopic experiment. Hence, there are still debates on the “quantumness” of many of the strong-coupling phenomena [46].
Footnote: See Sec. 1.2.2 for a discussion on the limitations of this and other cavity-QED models.
Footnote: The Jaynes-Cummings model is one of the very few QED models that can be diagonalized analytically.
[...] $V$ as much as possible, but the crucial step to achieve such large Rabi frequencies was due to the employed materials.
They utilized organic semi-conductors, since “large oscillator strengths are a characteristic feature of these materials” [36]. In terms of the knobs, this refers not only to a large value of $f$ but also of $N$, since, e.g., all the excitons of the conduction band couple to the mode. In the following years, many of the most striking experiments have been conducted with this material class, including polariton lasing [15] and condensation [49], both of which had been predicted before [50]. Orgiu et al. [51] reported that they increased the conductivity of an organic semi-conductor by an order of magnitude by coupling it to only the vacuum field of a cavity, and Coles et al. [17] modified the energy transfer pathways in a light-harvesting complex with the help of strong coupling.
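The role of the third knob, the number $N$ of collectively coupled oscillators, can be checked numerically. The following Python snippet is a minimal sketch and not taken from any of the cited works: it assumes the single-excitation sector of a Tavis-Cummings-type model with $N$ identical, resonant two-level emitters, each coupled to one mode with the same hypothetical coupling energy `g` (units with $\hbar = 1$). The function names and parameter values are assumptions of this example.

```python
import numpy as np


def single_excitation_hamiltonian(n_emitters, omega_ph, omega_m, g):
    """Hamiltonian in the basis {one photon, emitter 1 excited, ..., emitter N excited}."""
    dim = n_emitters + 1
    hamiltonian = np.zeros((dim, dim))
    hamiltonian[0, 0] = omega_ph            # one photon, all emitters in the lower state
    for j in range(1, dim):
        hamiltonian[j, j] = omega_m         # emitter j excited, no photon
        hamiltonian[0, j] = g               # every emitter couples to the mode
        hamiltonian[j, 0] = g
    return hamiltonian


def rabi_splitting(n_emitters, omega=1.0, g=0.01):
    """Energy difference between the highest and lowest eigenstate at resonance."""
    energies = np.linalg.eigvalsh(
        single_excitation_hamiltonian(n_emitters, omega, omega, g)
    )
    return energies[-1] - energies[0]


# Hypothetical emitter numbers; the splitting is compared with 2 * g * sqrt(N).
splittings = {n: rabi_splitting(n) for n in (1, 4, 16, 100)}
for n, splitting in splittings.items():
    print(f"N = {n:3d}: splitting = {splitting:.4f}, 2*g*sqrt(N) = {2 * 0.01 * np.sqrt(n):.4f}")
```

Within this sketch, the splitting between the two outermost eigenvalues grows as $\sqrt{N}$, in line with the replacement $f \to N f$ in relation (1.1), while the remaining $N - 1$ eigenstates stay at the bare matter frequency and do not mix with the photon.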
Strong coupling with molecular systems: polaritonic chemistry
But not only the excitons in organic semi-conductors are suitable to reach the strong-coupling regime. In 2011, when Schwartz et al. used the photochemical properties of an organic molecule to “switch” its coupling to a cavity mode, the field of polaritonic chemistry emerged. In contrast to semi-conductors, where the most important degrees of freedom often approximately resemble free electrons, molecules exhibit an enormous spectrum of qualitatively different degrees of freedom. Schwartz et al. [52], for example, used an electronic transition for their experiment, but Thomas et al. [53] coupled the vibronic transition of a molecule to a cavity to demonstrate one of the most striking opportunities that polaritonic chemistry offers: they changed the reaction rate of a chemical reaction “just” by letting the reaction take place inside a cavity.
Molecular strong coupling has not only been achieved for molecular liquids, where many molecules of the same type are injected into the cavity (as in the latter two examples), but even on the single- and few-molecule level. Since a few molecules have a considerably smaller total oscillator strength than molecular liquids ($N$ is small), either cryostatic conditions are necessary to observe the Rabi splitting [54] or the mode volume $V$ needs to be substantially decreased. That such single-molecule strong coupling at room temperature is indeed possible was shown for the first time by Chikkaraddy et al. [55]. They manufactured a so-called plasmonic nano-cavity, which confined the photon field to effective volumes of $V \leq$ [...]
Footnote: Though the exact coupling strength depends on the wave vector of the charge carriers.
The challenges of an interdisciplinary field
These (in comparison to the total number of publications) few examples illustrate how diverse the field is. We saw at least the research fields of nanophotonics, plasmonics, quantum optics, materials science (including 2d materials), solid-state physics and (quantum) chemistry appearing. All of them play an important role, and all of them have their own focus and consequently also their own point of view on the topic. In the nanophotonics and plasmonics communities, people try to understand the behaviour of the electromagnetic field on the nano-scale. They are interested in understanding and optimizing geometries, and as a theoretical tool they usually just solve the classical Maxwell equations, where the matter often only enters in the form of complicated boundary conditions. The fields of materials science, solid-state physics and quantum chemistry instead focus on matter properties on atomic length scales. Crucially, this requires an encompassing quantum-mechanical description and thus, in principle, solving the many-body Schrödinger equation. Since this is impossible in practice, the main challenge here is to find approximations or alternative descriptions that are numerically feasible but still “sufficiently” accurate. The question “what is sufficient?” is one of the important research questions in the field. Electromagnetic fields normally enter this description merely as “external potentials” or “perturbations” without shape and spatial extension. Probably the most important contribution to strong-coupling physics comes from the quantum optics community. Quantum opticians study light-matter interaction on the smallest scales and they developed the first theories to describe the strong-coupling phenomenology, including the aforementioned cavity-QED models. Still, these descriptions usually emphasize the accurate description of the electromagnetic field, including the complex phenomenology of its quantum statistics. Matter, though treated quantum mechanically, is almost exclusively reduced to two (or a few) levels. To this day, these cavity-QED models are the basis for most of our understanding of strong-coupling physics and chemistry.
The theoretical challenge: complex systems exhibit strong-coupling effects that require complex theory
It is obvious that there are many scenarios where such simplified descriptions cannot account for the complexity of the involved processes. Especially in the realm of polaritonic chemistry, where complex molecular systems are coupled to the photon field, the explanatory power of cavity-QED models is limited. There are several open questions, such as
• What is the influence of the strong electron-photon interaction if the electronic structure changes, as it happens in a chemical reaction [63, 64]?
• Can the vacuum fluctuations of a cavity mode really modify equilibrium properties of matter systems as claimed by experimentalists [16, 53]?
• Is the mechanism that leads to collective strong coupling really as simple as predicted by the Dicke model [65]? And can local properties be substantially modified due to this pathway to reach strong coupling [66, 67]?
To answer such questions, we need genuine first-principles methods such as QEDFT [39] that are capable of describing the inner structure of matter systems and its interaction with the field [14]. However, developing such methods is a delicate task that involves finding entirely new approximation strategies (for the electron-photon interaction) and deriving and solving highly nonlinear equations. This requires not only an accurate mathematical treatment, but also a careful development of numerical methods. In this thesis, we address and analyze these theoretical and numerical challenges from a very general perspective. Based on this, we present the dressed-orbital construction that allows us to circumvent many of the identified difficulties by an exact reformulation of (equilibrium) cavity QED in terms of polaritonic particles. It is noteworthy that, in contrast to Eq. (1.2), this definition of a polariton in terms of dressed orbitals is general and does not require restrictive assumptions such as the few-level approximation.
We present two specific examples of dressed-orbital-based methods, including the details of the corresponding numerical algorithms and implementations. With the help of these methods, we provide explicit evidence for phenomena that cannot be described with standard model approaches. For instance, we demonstrate in Sec. 6.4.2 how a simple chemical reaction in one spatial dimension is influenced differently by the cavity, depending on the reaction coordinate. Lacking spatial resolution, methods that rely on the few-level approximation to account for the electron-photon interaction cannot capture such an effect. However, there are indications that such local effects might play an important role for the understanding of many strong-coupling phenomena [26], and our results of Sec. 6.4.4 provide additional evidence for this. We find there that both electronic localization and correlation strongly influence the light-matter interaction, which points toward a mechanism behind collective strong coupling that differs from the Dicke-type one.
Unresolved questions in polaritonic chemistry: the limitations of cavity-QED models
Let us illustrate the limitations of model descriptions with a concrete example that concerns the just mentioned “collective contribution” to the coupling strength, appearing in relation (1.1) as the simple factor $\sqrt{N}$. The assumption that leads to this dependence is that $N$ two-level molecules couple to the same cavity and thus have a $\sqrt{N}$ times larger effect on the Rabi splitting, but at the same time every molecule itself only “feels” the small coupling for $N = 1$. Accordingly, the coupling does not have strong local effects and cannot significantly change, e.g., the electronic structure. But if this is true, how are chemical reactions modified by strong matter-photon coupling, as experimentalists claimed [16] and asked [68]?
Feist and Garcia-Vidal [69] and Cwik et al. [70] tried to answer this question with their models and showed that some observables are collective and others are not, however partly contradicting each other. One prominent controversy arose around the question whether the ground-state potential energy surface (ground-state PES), which is a crucial quantity in chemical reactions (see Sec. 2), is modified by the collective or merely the single-molecule coupling. Feist and Garcia-Vidal [69] and Herrera and Spano [71] showed for different models that modifications of the ground-state PES proportional to the collective coupling are possible. Martínez-Martínez et al. [66] instead showed that such modifications for their model are only proportional to the single-molecule coupling strength. They explicitly state that their results contradict Ref. [69] and Ref. [71], but are in line with Ref. [70]. One year later, Galego et al. [67] reinforced with improved methods the argument that ground-state PES modifications due to collective effects are indeed possible. They argue that Ref. [66] did not take ground-state dipole moments into account, which would be crucial for a correct description of chemical reactions under strong coupling.
This debate reflects the inherent challenges of understanding and describing complex systems: several effects play a role at the same time, and some of them might be cooperative, others in competition. In particular, this might change with certain system parameters. Thus, it is not enough to identify these effects, but one also needs to accurately quantify their importance.
As we have seen, model descriptions like the Jaynes-Cummings model are very efficient and powerful in describing one or a few features of a system. However, with increasing complexity, i.e., with an increasing number of such effects, this strength becomes a weakness. We somehow have to decide which features we include in the description and which not, and depending on that, the model may provide different answers. In some cases this might be quite clear, but the aforementioned example shows that this is not always the case.
Another very old example is the not-yet resolved debate on the (non-)existence of a so-called superradiant phase that Dicke predicted already in 1954. The model proposed there (which is today known as the Dicke model) describes $N$ two-level systems coupled to a photon mode and predicts a transition into the superradiant phase [73], where all the dipole moments of the atoms align and the photon mode occupation reaches values much bigger than $N$. There have been several publications trying to answer whether a real system can undergo such a phase transition. Certain “no-go theorems” have been derived, e.g., Refs. [74, 75], and contested several times, e.g., Refs. [65, 76]. Recently, the superradiant transition could be demonstrated in artificial realizations of the Dicke model [77, 78], but the question whether the transition can occur in the more realistic situations that Dicke originally had in mind is still not resolved. For a good summary of the topic, the reader is referred to the recent review by Kirton et al. [79].
It is this kind of problem that has motivated our research on a first-principles description of light-matter systems. How can we describe the details of strong-coupling physics in a less-biased way, i.e., without deciding a priori which features we include in the description? Cavity-QED models have proven their explanatory power, but what are their limits?
And if we reach these limits, how can we improve the models in a systematic way?
The range of validity of cavity-QED models
So one might ask how it is possible that so many different and highly complex materials can be modelled by cavity-QED models the moment we put them into a cavity that itself might be a complicated plasmonic nano-structure. It took decades of research to develop the machinery that allows us to accurately describe the properties of these materials and cavities. How can an additional interaction between two already complex systems simplify things? In fact, exactly this is what (in many cases) happens. Tuning the cavity in resonance with one matter excitation can be seen as a sort of selection process. If the other matter transitions are energetically well separated from the selected transition and only this energy range is probed in an experiment, one can observe the Rabi splitting exactly as it is described by the Jaynes-Cummings model. Wang et al. [54] showed this explicitly in a recent experiment, where they “turn[ed] a molecule into a coherent two-level quantum system.” But even if some other matter degrees of freedom play a role, it is often enough to extend the model by, e.g., some more matter or photon levels to match theory and experiment.
Nevertheless, it is clear that there must be a limit to such a procedure. With the increasing complexity of the effects that have to be described, more and more features have to be added to the models to fit the experimental data. This will not only become computationally difficult at a certain point, but more importantly, such a path heads toward a situation where so many parameters have to be introduced that interpretations and predictions might become difficult. For instance, when George et al. [80] wanted to interpret their experimental data, they first tried to employ a Jaynes-Cummings-like model, which they could not fit to their results.
They say in the publication that it was necessary to add several extra matter and photon states and amore thorough description of the electron-photon interaction to the model to properly interpret their data.The reason for the break-down of the simple model in this case is two-fold: first, they fabricated a systemwith very strong electron-photon coupling (for smaller couplings in the same experiment, the simple modelworked) and second, the energy structure of the molecule they employed was such that several matter exci-tations strongly coupled to the mode (what they called “multimode splitting effect”). Consequently, they didnot only have to describe the two polaritonic states ( P n + / − for a fixed n ), but a “genuine ladder of vibrationalpolaritonic states” ( P n + / − for a series of n = Perspective: connecting models and first-principles approaches
It is clear that answering such questions is anything but easy, and in many cases first-principles approaches might not be (directly) applicable simply because of numerical limitations. This kind of problem is well known in other fields of quantum physics such as strongly-correlated electron systems. To this category belong materials like high-temperature superconductors [81], Mott insulators [82] or many important catalysts [83], all of which promise huge possibilities for applications. Modelling such effects is among the hardest problems of material science, because efficient many-body methods like (approximate) density functional theory are typically too inaccurate. Thus, a crucial role for the understanding of strongly-correlated electrons has been played by effective models, most importantly the Hubbard model [85]. The Hubbard model exhibits a wide range of correlated electron behavior including all the above mentioned phenomena and thus allows one to study the basic mechanisms behind these phenomena. Such studies have revealed the complexity of the effects but also their extreme dependence on tiny variations of system parameters, many of which cannot directly be determined by experiments and thus require first-principles calculations. Triggered by this insight, the field of strongly-correlated electrons nowadays provides a plethora of examples where models and first-principles descriptions have been successfully combined.

Judging from what experiments have revealed, one can expect that the field of strong electron-photon coupling, and especially the sub-field of polaritonic chemistry, exhibits a phenomenology similarly complex to that of strongly-correlated electrons. A general perspective on the problem will allow us to put the connection between electronic strong correlation and strong coupling in quite concrete terms (see Sec. 3.1.2). The aforementioned examples, where different cavity-QED models contradict each other, additionally indicate the need for new, less-biased methods. We believe that a combination of model and first-principles approaches, which was so successful in other areas of physics like strongly-correlated electrons, could also be fruitful in the field of strong electron-photon coupling.

Footnote: This is considered as more or less basic knowledge in quantum chemistry and solid state physics, which is the motivation behind many new theory developments. However, for a recent specification of this statement, the reader is referred to Ref. [84].

Footnote: To provide an example for this statement, we refer the reader to the very comprehensive study of the two-dimensional Hubbard model with different methods by LeBlanc et al. [86].

Footnote: For example, in Ref. [87] the author explains very well how to connect models to first-principles calculations. Another even more explicit example is the LDA+U method [88] that connects the Hubbard model with the local-density approximation of density functional theory. It was built exactly with the purpose of unifying the advantages of both the model and the first-principles world.

Before we come to first-principles methods, we want to make a brief detour to the standard way to describe the phenomena of strongly coupled electron-photon systems. As we have mentioned in the last section, the researchers who historically first investigated such phenomena stem from the community of quantum optics. They introduced the cavity-QED models such as the Jaynes-Cummings model, which allowed them to identify the basic mechanism behind the emergence of polaritons and to satisfactorily describe many of the experiments in the field.

In this section, we briefly present the derivation of this set of models, which has two purposes. First, presenting this standard description helps to acquaint the reader with the concepts and tools that are necessary to describe light-matter interaction on the quantum level. In the next section, we present a big part of this derivation a second time, but including all the technicalities that are necessary to derive a proper framework for a first-principles perspective. We hope that the preliminary discussion in this section facilitates reading and understanding of the general derivation. The second reason for such a detailed presentation of cavity-QED models is to make their limitations more concrete. Thus, we put a special emphasis on the approximations that enter the models. After the presentation of the general derivation and some important special cases, we briefly analyze their range of validity. Importantly, the models accurately describe experimental data, even in cases where certain approximations are not strictly justified. This important fact reveals the universality of the models, which we put into the context of their obvious limitations in the realm of quantum chemistry. We then briefly explain why first-principles methods are a valuable tool to overcome these limitations and finish the subsection with a short summary.
To describe the physics of cavity experiments, we essentially need to model the matter systems, the photon modes, and the interaction between both. The standard starting point for the discussion of quantum atom-field interaction is the Hamiltonian [89, part 6.1]
\[
\hat{H} = \hat{H}_m + \hat{H}_{ph} - e\,\hat{\mathbf{r}}\cdot\hat{\mathbf{E}},
\tag{1.3}
\]
where $\hat{H}_m$ is the matter Hamiltonian and $\hat{H}_{ph}$ is the electromagnetic-field Hamiltonian. The last term describes the interaction between both subsystems, given by the inner product of the electron dipole $-e\hat{\mathbf{r}}$ ($e$ denotes the elementary charge and $\hat{\mathbf{r}}$ is the position operator) and the electric field operator $\hat{\mathbf{E}}$. However, one needs to be aware that Hamiltonian (1.3) already involves many assumptions (such as the dipole approximation and the neglect of the dipole-self energy), which are in many textbooks discussed in the semiclassical theory [89, part 5]. This means that $\hat{H}_{ph}$ is neglected and $\hat{\mathbf{E}} \to \mathbf{E}$ and all other descriptors of the electromagnetic field are treated as external classical vector fields.

If we consider this semiclassical theory for a single-electron atom, i.e., one electron with mass $m$ that is confined by the electrostatic potential $V$ of the nucleus with mass $m_n \to \infty$ (Born-Oppenheimer approximation, see Sec. 1.3.3), the corresponding Hamiltonian reads
\[
\hat{H}_{sc} = -\frac{\hbar^2}{2m}\left[\nabla - i\frac{e}{\hbar}\mathbf{A}(\mathbf{r},t)\right]^2 + e\phi(\mathbf{r},t) + V(\mathbf{r}),
\tag{1.4}
\]
where $\phi$ and $\mathbf{A}$ are the scalar and vector potentials of the electromagnetic field, respectively. The form of the matter-field interaction in $\hat{H}_{sc}$ can be derived from basic principles (see [89, part 5.1.1]) and has a very large range of validity.

Footnote: To account for the quantum nature of the electromagnetic field, the electric field vector is promoted to an operator. See also Sec. 1.3.2.

Then one chooses the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$ together with $\phi = 0$, which is strictly speaking only possible far away from any charge. However, since only one electron is considered, this term contributes only by the constant self-energy of the electron and thus can be neglected, as we will discuss in Sec. 1.3.3. For a many-particle system, $\phi$ actually leads to the Coulomb interaction between the (charged) particles. After applying the Coulomb gauge, Hamiltonian (1.4) reads
\[
\hat{H}^{C}_{sc} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) + \frac{i e\hbar}{m}\mathbf{A}_\perp(\mathbf{r},t)\cdot\nabla + \frac{e^2}{2m}\mathbf{A}_\perp^2(\mathbf{r},t)
= \hat{H}_m + \frac{i e\hbar}{m}\nabla\cdot\mathbf{A}_\perp(\mathbf{r},t) + \frac{e^2}{2m}\mathbf{A}_\perp^2(\mathbf{r},t),
\tag{1.5}
\]
where $\mathbf{A}_\perp$ is the transversal part of the vector potential. We can generalize $\hat{H}^{C}_{sc}$ straightforwardly to $N_e$ electrons if we drop the assumption $\phi = 0$. This generalization is considered the basic Hamiltonian of electronic structure theory (see Ch. 2).
The long-wavelength limit and the diamagnetic term
The next step is to introduce the dipole approximation by assuming
\[
\mathbf{A}(\mathbf{r},t) \approx \mathbf{A}(\mathbf{r}_0,t),
\tag{1.6}
\]
where $\mathbf{r}_0$ is the center of charge (which is equal to the center of mass) of the matter system. This approximation is also called the long-wavelength limit and it is well justified if the spatial extension of the atom is much smaller than the wavelength of the considered modes of the electromagnetic field. This is the case in most cavity-matter systems (see Sec. 1.3.3). The derivation is finished by applying the gauge transformation $U_d = \exp\!\left(i\frac{e}{\hbar}\mathbf{A}(\mathbf{r}_0,t)\cdot\mathbf{r}\right)$ to the Hamiltonian (1.5), which after some standard rearrangements has the form
\[
\hat{H}^{C}_{sc} \xrightarrow[U_d]{\mathbf{A}(\mathbf{r},t)\,\approx\,\mathbf{A}(\mathbf{r}_0,t)} \hat{H}_d = -\frac{\hbar^2}{2m}\nabla^2 + e\phi(\mathbf{r},t) + V(\mathbf{r}) - e\,\mathbf{r}\cdot\mathbf{E} = \hat{H}_m - e\,\hat{\mathbf{r}}\cdot\mathbf{E}.
\tag{1.7}
\]
To arrive from here at the (fully quantized) starting Hamiltonian (1.3), we add $\hat{H}_{ph}$ and follow the prescription of canonical quantization. We reserve the details of this procedure for Sec. 1.3.2 and just assume that $(\mathbf{E},\mathbf{A}) \to (\hat{\mathbf{E}},\hat{\mathbf{A}})$ are now operators.

It is important to realize that within the “rearrangements,” another approximation has been employed. The diamagnetic contribution $\frac{e^2}{2m}\mathbf{A}_\perp^2(\mathbf{r},t)$ that was still present in Eq. (1.5) is indeed removed by the transformation $U_d$, but at the same time a new term that is proportional to $\mathbf{r}^2$ is introduced (see Sec. 1.3.3). This term is called the dipole-self energy and it is obviously neglected in Eq. (1.7). That this standard approximation in [...] $\mathbf{A} \propto 1/c$ is inversely proportional to the speed of light $c$. Thus, the $\mathbf{A}^2$-term has a prefactor of $1/c^2$ that makes its contribution to the Hamiltonian usually small in comparison to all the other terms and justifies its neglect. For instance, if we want to investigate the interaction of an atom with the electromagnetic vacuum to understand lifetimes or energy level shifts due to the photon field like the Lamb shift, $\langle\mathbf{A}\rangle$ is small and we can perform a calculation in terms of lowest-order perturbation theory, where the $\mathbf{A}^2$-term naturally would drop out. However, when some of the electromagnetic field modes are strongly enhanced by a cavity, this approximation might break down. We discuss the $\mathbf{A}^2$-term in Sec. 1.3.3 again.

Footnote: The theory of electrodynamics exhibits (on the classical and on the quantum level) a so-called gauge symmetry. This means that the theoretical description has a certain redundancy, which usually is removed in a concrete application. This is done by choosing one of many possible gauges. In electrodynamics, there are many established standard gauges, e.g., the Coulomb gauge, which can crucially simplify the description of certain problems. We discuss this in more detail in Sec. 1.3.2.

Footnote: See also Def. 1.1.

Footnote: Both together is called radiation gauge in this context.

Footnote: According to Helmholtz’s theorem, every vector field $\mathbf{X} = \mathbf{X}_\parallel + \mathbf{X}_\perp$ can uniquely be decomposed into its longitudinal component $\mathbf{X}_\parallel$ and its transversal component $\mathbf{X}_\perp$, with $\nabla\times\mathbf{X}_\parallel = \nabla\cdot\mathbf{X}_\perp = 0$. In the Coulomb gauge, where $\nabla\cdot\mathbf{A} = 0$, the longitudinal contribution of $\mathbf{A}$ is thus explicitly removed.

Footnote: This also induces $[\nabla, \mathbf{A}_\perp] = 0$.

Footnote: [...] $-i\hbar\nabla$ is the particle momentum operator.

The making of the cavity-QED Hamiltonian: the diagonalization of the matter Hamiltonian and the single-mode approximation
To finish the derivation of the atom-field model of quantum optics, we still need to perform one crucial step, which is the diagonalization of the matter Hamiltonian, i.e.,
\[
\hat{H}_m |m_i\rangle = \epsilon_i |m_i\rangle.
\tag{1.8}
\]
If this decomposition is known, we can rewrite
\[
\hat{H}_m = \sum_i \epsilon_i |m_i\rangle\langle m_i| \equiv \sum_i \epsilon_i \sigma_{ii}
\tag{1.9}
\]
in its diagonal matrix form. For the single-electron system that is considered here, this is not problematic. Although in most cases there is no analytic solution, we can always find a suitable approximate basis for the problem and diagonalize it numerically (see Sec. 2.1.1). For more than one or a few particles, the situation changes drastically because of the already mentioned many-body problem. This manifests here in the cost of such a numerical diagonalization, which grows exponentially with the particle number (independently of the details of the problem, see Ch. 2). We discuss some implications of this with respect to cavity-QED models in Sec. 1.2.2.

Additionally, we diagonalize the free photon Hamiltonian, which is not as problematic as the electronic problem. There is no (direct) confining potential and no (direct) interaction in $\hat{H}_{ph}$ and thus its diagonal form can even be derived analytically by going to $\mathbf{k}$-space. We find
\[
\hat{H}_{ph} = \sum_{\mathbf{k},s} \hbar\omega_k \left(\hat{a}^\dagger_{\mathbf{k},s}\hat{a}_{\mathbf{k},s} + \tfrac{1}{2}\right),
\tag{1.10}
\]
where $\omega_k = c|\mathbf{k}| = ck$ is the mode frequency and $\hat{a}^{(\dagger)}_{\mathbf{k},s}$ are the annihilation (creation) operators for a photon in the mode with wave vector $\mathbf{k}$ and the polarization index $s = 1, 2$. The operators $\hat{a}^{(\dagger)}_{\mathbf{k},s}$ obey the commutation relations $[\hat{a}_{\mathbf{k},s}, \hat{a}^\dagger_{\mathbf{k}',s'}] = \delta_{\mathbf{k}\mathbf{k}'}\delta_{ss'}$ and $[\hat{a}^{(\dagger)}_{\mathbf{k},s}, \hat{a}^{(\dagger)}_{\mathbf{k}',s'}] = 0$, which define the quantum mechanical properties of the photon field.

These definitions are sufficient to write the whole Hamiltonian (1.3) in matrix form. For that, we expand
\[
e\hat{\mathbf{r}} = \sum_{i,j} e\,|m_i\rangle\langle m_i|\hat{\mathbf{r}}|m_j\rangle\langle m_j| \equiv \sum_{i,j} e\,\mathbf{d}_{ij}\sigma_{ij}
\tag{1.11}
\]
and the electric field
\[
\hat{\mathbf{E}} = \sum_{\mathbf{k},s} \mathbf{e}_{\mathbf{k},s} E_{\omega_k}\left(\hat{a}^\dagger_{\mathbf{k},s} + \hat{a}_{\mathbf{k},s}\right),
\tag{1.12}
\]
where $\mathbf{e}_{\mathbf{k},s}$ is the polarization vector and $E_{\omega_k} = (\hbar\omega_k/2\epsilon_0 V)^{1/2}$ is a prefactor that includes the vacuum permittivity $\epsilon_0$ and, importantly, the mode volume $V$ that plays a crucial role in polaritonic physics, as we have discussed in the previous section. Using (1.9), (1.10) and (1.11), we arrive at the matrix form of (1.3)
\[
\hat{H} = \sum_{\mathbf{k}} \hbar\omega_k \hat{a}^\dagger_{\mathbf{k}}\hat{a}_{\mathbf{k}} + \sum_i \epsilon_i\sigma_{ii} + \hbar\sum_{i,j}\sum_{\mathbf{k}} g_{\mathbf{k},ij}\,\sigma_{ij}\left(\hat{a}^\dagger_{\mathbf{k}} + \hat{a}_{\mathbf{k}}\right),
\tag{1.13}
\]
with the coupling matrix
\[
g_{\mathbf{k},ij} = -\frac{e E_{\omega_k}}{\hbar}\sum_s \mathbf{d}_{ij}\cdot\mathbf{e}_{\mathbf{k},s},
\tag{1.14}
\]
where we summed over the 2 polarization directions, assuming polarized light, such that only one of the two polarizations contributes. Additionally, we neglected the zero-point energy of the electromagnetic field. Although it regards only a one-electron system on the matter side, Hamiltonian (1.13) has a broad range of applicability and, in general, its diagonalization is nontrivial.

Footnote: The connection between real space, where quantities are parametrized by position vectors $\mathbf{r}$, and $\mathbf{k}$-space is given by the Fourier transformation. Specifically, for a function $f(\mathbf{r})$, we have $f(\mathbf{k}) = \frac{1}{\sqrt{(2\pi)^3}}\int d^3r\, f(\mathbf{r})\exp(-i\mathbf{r}\cdot\mathbf{k})$.

Footnote: For example, $s = $ [...] $\mathbf{k}$.

For the cavity setting that we are interested in, we assume that there is one dominant mode with frequency $\omega_{ph}$. The cavity strongly confines the mode to the volume $V$, which leads to a large prefactor $E_{\omega_k} \propto 1/\sqrt{V}$.
Consequently, it holds for the coupling elements $g_{ij} \equiv g_{ij,ph}$ of this mode that $g_{ij} \gg g_{\mathbf{k},ij}$ for all other $\mathbf{k}$ and we can neglect these coupling elements. The resulting Hamiltonian is the origin of the cavity-QED models and it reads
\[
\hat{H}_{cQED} = \hbar\omega_{ph}\hat{a}^\dagger\hat{a} + \sum_i \epsilon_i\sigma_{ii} + \hbar\sum_{i,j} g_{ij}\,\sigma_{ij}\left(\hat{a}^\dagger + \hat{a}\right).
\tag{1.15}
\]
Note that we dropped the index of the operators $\hat{a}^{(\dagger)}$ that refer to the cavity mode.

The crucial step: the few-level approximation
We want to stress again that the assumption of a single-electron atom was not necessary to arrive at $\hat{H}_{cQED}$. In principle, we can assume a Hamiltonian of the same form to study a complicated many-electron system, because the possibility of the underlying eigendecomposition simply results from the linear structure of quantum mechanics. However, the corresponding eigenvalue problems are so high-dimensional that in order to solve them in practice, we have to significantly restrict the underlying configuration spaces, i.e., the bases. The choice of the basis is thus a crucial step in any quantum mechanical calculation (see Sec. 2.1.1).

The most important approximation in cavity-QED models is to restrict this choice from the beginning by assuming that a very small matter basis is sufficient to describe the phenomena that we are interested in. This few-level approximation reduces the dimension of $\hat{H}_{cQED}$ such that it can be diagonalized exactly. The most important special case of this approximation is the two-level system that we discuss in the next paragraph [90]. However, if we wanted to employ more electronic levels, the standard approach consists of a two-step procedure. In the first step, we use some first-principles method to identify the most important electronic states, their energies and the corresponding coupling elements. These enter in the second step as input parameters in $\hat{H}_{cQED}$, which is diagonalized. In the realm of polaritonic chemistry, several hybrid approaches of this type have been proposed [91, 92, 93, 94, 95]. However, to make the diagonalization of $\hat{H}_{cQED}$ numerically feasible, most of these methods rely on the projection on the single-excitation space, i.e., the rotating-wave approximation that we discuss below. It is very difficult to extend such methods in a simple way.

Footnote: For dynamical problems, we usually have to take the influence of the bath represented by the other modes into account. In many cases (for example to include spontaneous emission processes), this can be done approximately by the introduction of a damping factor. The reader is referred to, e.g., the corresponding chapters in Ref. [89].

The standard case: the two-level atom
The most common models reduce the matter description to a minimum and consider a two-level atom (or molecule), i.e., they neglect all but two effective matter states $|m_1\rangle$ and $|m_2\rangle$. Assuming a real transition dipole $\mathbf{d}_{12} = \mathbf{d}_{21}$ between these states and no permanent dipole moments ($\mathbf{d}_{11} = \mathbf{d}_{22} = 0$), we arrive at one of the most important cavity-QED models, the Rabi model. The corresponding Hamiltonian reads
\[
\begin{aligned}
\hat{H}_R &= \hbar\omega_{ph}\hat{a}^\dagger\hat{a} + \sum_{i=1}^{2}\epsilon_i\sigma_{ii} + \hbar\sum_{i,j=1}^{2} g_{ij}\,\sigma_{ij}\left(\hat{a}^\dagger + \hat{a}\right) \\
&\equiv \hbar\omega_{ph}\hat{a}^\dagger\hat{a} + \frac{\hbar\omega_0}{2}\sigma_z + \hbar\Omega_R\left(\sigma_+ + \sigma_-\right)\left(\hat{a}^\dagger + \hat{a}\right),
\end{aligned}
\tag{1.16}
\]
where in the second line, we have renamed the coupling constant $g_{12} = g_{21} = \Omega_R$, as usual in this context. $\Omega_R$ is the famous Rabi frequency that we have introduced in the previous section. Since it is common practice, we additionally have introduced the Pauli matrices $\sigma_z = \sigma_{22} - \sigma_{11}$, $\sigma_+ = \sigma_{21}$, $\sigma_- = \sigma_{12}$. To rewrite the electronic Hamiltonian in the second line, we have utilized the equality $\sigma_{11} + \sigma_{22} = 1$, have introduced the transition frequency $\omega_0 = (\epsilon_2 - \epsilon_1)/\hbar$ and have removed the constant energy contribution of $(\epsilon_1 + \epsilon_2)/2$. Despite its seeming simplicity, the Rabi model has only a semi-analytic solution, which has only been known since 2011 [98]. This reflects how intricate the coupled electron-photon problem really is.

Finally, there is the aforementioned rotating-wave approximation to $\hat{H}_R$ that allows for a complete analytical diagonalization. Neglecting the so-called counter-rotating terms $\sigma_+\hat{a}^\dagger$ and $\sigma_-\hat{a}$, we arrive at the Jaynes-Cummings Hamiltonian [45]
\[
\hat{H}_{JC} = \hbar\omega_{ph}\hat{a}^\dagger\hat{a} + \frac{\hbar\omega_0}{2}\sigma_z + \hbar\Omega_R\left(\sigma_+\hat{a} + \sigma_-\hat{a}^\dagger\right),
\tag{1.17}
\]
that we have introduced in the previous section. It is one of the very few light-matter problems that are analytically solvable, which explains its key role for the understanding of electron-photon interaction, especially for the cavity system.

Collective coupling: the Dicke construction
All cavity-QED models can be generalized in a simple manner to describe $N$ identical molecules (or more general matter systems) if we assume that the wave functions of different atoms do not overlap. In this [...] $N$-molecule-one-mode problem again as a single molecule that is coupled to one mode with an effective coupling strength $g_{ij} \to \sqrt{N} g_{ij}$. This assumption is for example well justified in a molecular gas, which was the scenario that Dicke [72] had in mind when he introduced his construction for $N$ two-level systems in 1954. In the realm of strong-coupling physics, this collective superposition of the matter systems is regarded as one of the basic mechanisms that lead to strong coupling. The collective coupling strength is $\sqrt{N}$ times larger than the single-molecule coupling, which leads to a huge increase for a macroscopic number of such systems, i.e., $N \approx$ [...].

We conclude with a brief discussion on the justification and range of validity of the cavity-QED models. First of all, we have seen that to arrive at $\hat{H}_{cQED}$, we have started at a very general level and followed a well-defined hierarchy of approximations. We have discussed already that, as written in Eq. (1.15), the Hamiltonian has a wide range of validity and this fact is often stressed in the literature. Cavity-QED models are thus often seen as a kind of first-principles method. One conclusion from this point of view is that the different perspective that, e.g., first-principles methods would provide is superfluous. The impressive success of cavity-QED models in explaining the phenomena of strong-coupling physics supports this argument. However, this point of view basically disregards the quantum many-body problem and the complexity that matter systems present.

To illustrate this, let us briefly reflect on the most common cavity-QED models, which describe molecules as two-level systems. Strictly speaking, the two-level approximation is only valid for a single spin, but its validity can be convincingly generalized to any kind of transition in a matter system that is energetically well separated from all other transitions of the system (at least for not too strong coupling strengths). Nevertheless, the two-level approximation has been extensively used way beyond this range of validity. Frasca [90] summarizes this fact in his review of the two-level approximation as: “It is safe to say that the foundations of quantum optics are built on the concept of a few level atom.” Hence, the justification for the approximation is obviously not its well-defined range of validity, but the very good agreement of the corresponding models with many experimental results. In their review on strong coupling, Kockum et al. [48] even present the cavity-QED models in a generalized version, explicitly noting that only special parameter choices can be “derived from first principles.” Again, their justification comes from experiment. This shows that the strength of cavity-QED models is not their (only sometimes possible and often hardly justifiable) connection to the fundamental level of theory, but their obvious universality.

Take for instance the absorption spectrum of a complex system like a molecule in a cavity in resonance with a certain energy transition of the molecule, which shows a Rabi splitting (see Fig. 1.2 (a)). If we can describe this accurately by the Jaynes-Cummings model, then we have learned that the principal physics of this process can be understood in terms of the hybridization between one electronic transition and a harmonic oscillator. This is extremely valuable, because it reveals the underlying mechanism of this phenomenon.

Footnote: In Ref. [96], the authors propose a method that goes beyond the rotating-wave approximation. They discuss the difficulties and the (strong) limitations of such a generalization.

Footnote: In fact, the original publication by Rabi [97] from 1936 considered a nuclear spin and not an atom.

Footnote: A very detailed derivation of this can be found in the review of Kirton et al. [79] (see especially Sec. 3.2).

Footnote: This is typically regarded as a common fact. See for instance the review of Kockum et al. [48] or Keeling [65].

Footnote: However, it is important to realize that one cannot just increase the number of basis states in Eq. (1.15). Due to having no $\mathbf{r}^2$-term, this model has no eigenstates in the limit of large basis sets [99].

Footnote: See for example the section on “models” in Ref. [48].

Footnote: The model additionally allows one to differentiate the “character” of the electronic transition. For example, if the dominant matter oscillator is a two-level system, a so-called quantum blockade [100], which is experimentally measurable, occurs.
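The hybridization picture referred to above can be made concrete with a small numerical sketch. It builds the matrix of the cavity-QED Hamiltonian of Eq. (1.15) in a truncated photon Fock space and compares the two-level case with and without the rotating-wave approximation, i.e., the Jaynes-Cummings and Rabi models. The function name `cqed_hamiltonian`, the Fock-space cutoff and all numerical values are assumptions made for illustration only; they are not taken from the text, and units are chosen such that $\hbar = 1$.

```python
import numpy as np


def cqed_hamiltonian(energies, g, omega_ph, n_fock, rwa=False):
    """Matrix of Eq. (1.15) with hbar = 1, basis |matter level> x |photon number>.

    energies : matter energies eps_i, sorted in ascending order
    g        : real symmetric coupling matrix g_ij
    n_fock   : number of photon Fock states kept (a truncation chosen for this sketch)
    rwa      : if True, keep only sigma_ij a with i > j and the adjoint terms
    """
    energies = np.asarray(energies, dtype=float)
    g = np.asarray(g, dtype=float)
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)  # photon annihilation operator
    h = np.kron(np.eye(len(energies)), omega_ph * (a.T @ a))  # photon energy
    h += np.kron(np.diag(energies), np.eye(n_fock))           # matter energy
    if rwa:
        raising = np.tril(g, k=-1)  # sum over i > j of g_ij |m_i><m_j|
        h += np.kron(raising, a) + np.kron(raising.T, a.T)
    else:
        h += np.kron(g, a + a.T)
    return h


# Two-level example on resonance (illustrative numbers, not from the text).
omega_0 = omega_ph = 1.0
omega_r = 0.01
energies = [-omega_0 / 2, omega_0 / 2]
g = [[0.0, omega_r], [omega_r, 0.0]]

e_jc = np.linalg.eigvalsh(cqed_hamiltonian(energies, g, omega_ph, 20, rwa=True))
e_rabi = np.linalg.eigvalsh(cqed_hamiltonian(energies, g, omega_ph, 20))

split_jc = e_jc[2] - e_jc[1]      # splitting of the lowest polariton pair
split_rabi = e_rabi[2] - e_rabi[1]
print(split_jc, split_rabi, e_rabi[0] - e_jc[0])
```

For this weak coupling, both models give a splitting of the lowest polariton pair close to $2\Omega_R$, while the counter-rotating terms slightly lower the ground-state energy of the Rabi model. The same function accepts more than two matter levels, which corresponds to the second step of the two-step procedure described earlier; the energies and couplings would then have to come from a separate calculation.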
Why the first-principles perspective is useful
We have seen in the previous subsection how important this principal understanding of polariton emergence was for the progress in strong-coupling physics. However, we have also discussed the (current) limitations of the models, which are difficult to define but non-negligible. The controversy about the collective nature of strong-coupling effects in the realm of polaritonic chemistry exemplified some of these limitations: to describe the complex setting of a chemical reaction that takes place inside a cavity, simple approaches like the Jaynes-Cummings model have to be extended, because there are several matter degrees of freedom that play a role. However, such extensions are not straightforward, but require a detailed knowledge of the matter system, which often is not provided by experimental data. Sometimes this situation can be remedied by physical argumentation, but the discussed controversies show that this is not always the case.

The history of the field of quantum chemistry knows a multitude of such controversies and in the majority of the resolved cases, the answers were nontrivial. Properties of molecules might depend on tiny variations of the geometry or the electronic configuration, which in turn might be the result of an intricate interplay of different effects that counteract each other. Many open questions regarding chemical reactions could only be resolved by an exhaustive use of first-principles methods, many of which nowadays have become standard tools. Clearly, all practical methods to describe quantum many-particle systems have to employ approximations because of the many-body problem. But the first-principles approach can provide (if applicable) a less-biased perspective than effective models, because the explicit description of the particles allows for approximations on a more general level. In the case of coupled electron-photon systems, this means that we treat electrons and their Coulomb interaction on the same footing as photons and the electron-photon interaction. Thus, we are able to study the interplay of both forms of interaction. We will see in Ch. 6 how this perspective allows us to identify new effects that are not easy to describe with methods that are based on cavity-QED models. For instance, we will present some results suggesting a further mechanism that influences the coupling strength between electrons and photons. This puts the Dicke-type mechanism of collective strong coupling in a different perspective.
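For reference, the Dicke-type collective enhancement mentioned above can be checked in a few lines. The sketch below is a minimal illustration under stated assumptions: $N$ identical, non-overlapping two-level systems couple with the same strength $g$ to one resonant mode, the rotating-wave approximation is applied, and the problem is restricted to the single-excitation subspace (one photon, or one excited molecule). The function names and parameter values are hypothetical and units are chosen such that $\hbar = 1$.

```python
import numpy as np


def single_excitation_hamiltonian(n_mol, g, omega_ph=1.0, omega_0=1.0):
    """Single-excitation block for n_mol two-level systems and one mode (hbar = 1).

    Basis: index 0 is "one photon, all molecules in the lower state",
    index j >= 1 is "no photon, molecule j excited". Energies are measured
    relative to the state without any excitation.
    """
    h = np.zeros((n_mol + 1, n_mol + 1))
    h[0, 0] = omega_ph
    h[1:, 1:] = omega_0 * np.eye(n_mol)
    h[0, 1:] = g  # photon <-> molecule j, identical coupling for all j
    h[1:, 0] = g
    return h


def collective_splitting(n_mol, g):
    """Energy difference between upper and lower polariton."""
    evals = np.linalg.eigvalsh(single_excitation_hamiltonian(n_mol, g))
    return evals[-1] - evals[0]


g = 0.01
for n_mol in (1, 4, 100):
    print(n_mol, collective_splitting(n_mol, g), 2 * g * np.sqrt(n_mol))
```

On resonance, the two printed columns agree, i.e., the polariton splitting follows $2g\sqrt{N}$, while the remaining $N-1$ eigenstates stay at the bare transition energy. Effects beyond this simple construction, such as the additional mechanism announced for Ch. 6, are by design not captured by this sketch.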
The problem of standard model approximations: can we neglect the dipole-self energy?
Finally, we want to remark on one other approximation that we have made in the derivation of the models. This is the neglect of the dipole-self energy, i.e., the $\mathbf{r}^2$-term. We will see in the next sections that in order to find the ground state of a coupled electron-photon system with a first-principles method, this term is of utmost importance and its neglect would lead to useless results. The reason is that the corresponding Hamiltonian is unbounded from below without this term [102]. In a practical calculation, this means that the ground state is not well-defined but depends on the basis, and its energy can in principle be shifted to arbitrarily low values. Interestingly, this is a considerably less severe issue for cavity-QED models. The reason is that within the few-level approximation, every system becomes finite and if we fit the electronic energy levels to some experimental data, there is basically no issue. However, neglecting the diamagnetic term can also lead to issues for few-level systems. The Rabi model, for example, loses its gauge invariance for large coupling strengths [76, 103]. And connected to that, there are several indications that the transition to the superradiant phase predicted by the Dicke model cannot occur in equilibrium if the dipole-self energy is properly taken into account [104].

In contrast, one of the main goals of first-principles methods is to determine these electronic energy [...]

Footnote: For a general introduction to the theoretical description of chemical reactions and its challenges, the reader is referred to, e.g., the textbook by Moore and Pearson [101].

Footnote: An exhaustive discussion about the diamagnetic term can be found in Ref. [99].
Summary
In summary, the set of cavity-QED models constitutes a very powerful toolbox that has been successfully used to accurately describe a wide range of phenomena in coupled light-matter systems. With only very few fitting parameters, cavity-QED models describe many experimental results quantitatively. Their simplicity is thereby an important strength, because it allows for the definition of simple and clear concepts to interpret and not only to fit the data, e.g., a polariton-model of the Jaynes-Cummings model. However, there are limits to the applicability of cavity-QED models and determining and overcoming these limits is crucial for progress in the research field of cavity QED. In the next section, we thus want to approach the problem from “the other side” and introduce the framework for a first-principles description of coupled matter-cavity systems.

Footnote: See for example the discussion on p. 3 in Ref. [67], where the authors defend the neglect of the dipole self-energy in their model. They refer explicitly to Ref. [102].
The goal of this section is to find a good starting point for our first-principles description of coupled light-matter systems. For that, Schäfer et al. [99] have defined the following three “basic constraints [...] a theory of light-matter interactions [should] adhere to”:

1. All physical observables should be independent of the gauge choice and of the choice of coordinate system (for instance, it would be unphysical that the properties of atoms and molecules would depend on the choice of the origin of the laboratory reference frame).
2. The theory should support stable ground states (else we could not define equilibrium properties and identify specific atoms and molecules).
3. The coupled light-matter ground state should have a zero transversal electric field (else the system would radiate and cascade into lower-energy states).

Having these constraints in mind, we briefly discuss in the following subsections the principles and mathematical issues that enter the (quantum) theory of light-matter interaction on different levels of accuracy. We start at the most fundamental level, which is the (fully relativistic) theory of quantum electrodynamics (QED). We keep the discussion of QED very short, summarizing its basic concepts, its fundamental character and especially its mathematical challenges. We will see that formulating a theory in the realm of QED that adheres to the three basic constraints is very difficult. However, the wide range of energy scales that QED theoretically covers is not necessary for an accurate description of the typical phenomena of the field of polaritonic chemistry. We therefore reserve the principal part of the section for the (semi-)nonrelativistic limit of QED, where we assume that the active particles of the considered matter systems have considerably lower energies than their rest energy. This approximation is very well justified for most processes in condensed matter and chemical systems. In the last subsection, we derive the long-wavelength limit of the Pauli-Fierz Hamiltonian and introduce the Born-Oppenheimer approximation. The resulting cavity-QED Hamiltonian and the corresponding mathematical framework serve as the basis for our discussion of first-principles methods.
The general theory of quantum mechanics is now almost complete ... The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation. (Dirac, 1929 [106])

The interaction between light and matter as we understand it today from a physics perspective is such a vast topic that it concerns to a smaller or greater extent all natural sciences. The usual explanation behind this

Note that many cavity QED models and also typical approaches based on perturbation theory do not adhere to all of these constraints. This is in many cases related to the neglect of the diamagnetic contribution [99].

This means that matter as well as photon degrees of freedom are described within the framework of the theory of special relativity. See for example Ref. [105].

Where with most general, we mean to include all effects that are somehow related to the electromagnetic interaction. Other fundamental forms of interaction like gravity, weak or strong interaction are excluded in this picture.

relativistic quantum field theory, which means that it is consistent with the geometry of special relativity, accounts for quantum effects of the microscopical world and is formulated only in terms of fields (instead of, e.g., particles that here emerge as effective degrees of freedom from the field). There are many reasons for this unification of theories, with the most important being consistency. Since Maxwell’s equations of the electromagnetic field (Eq. 1.20) are relativistically invariant, it seems to be a logical step to also generalize the quantum description of matter to the full-relativistic level, i.e., to consider the Dirac equation. Combining both theories then requires leaving the realm of standard quantum mechanics based on wave functions and generalizing Dirac’s theory to a quantum field theory [10]. For further details on these considerations and other very interesting discussions, the reader is referred to one of the many textbooks on this topic, e.g., Refs. [105, 8, 10].

Guided by the idea of consistency, researchers have pursued the unification of quantum mechanics and electrodynamics at least since the 1930s. In this process, they encountered so many severe mathematical and connected conceptual problems that the first successful calculations were not undertaken before the late 1940s. The last important steps were made by Tomonaga, Schwinger and Feynman, who were rewarded with the Nobel Prize “for their fundamental work in quantum electrodynamics”. Most strikingly, on the basis of QED one can explain why the “electron’s magnetic moment proved to be somewhat larger than expected,” with a precision of twelve decimal places. This and many other impressive predictions undoubtedly established today’s reputation of QED as the “best-tested” theory or the “jewel” of physics. This success story clearly illustrates the power of the idea of consistency between different physical theories. And indeed, encouraged by the success of QED, the story of unification went on, leading to the standard model of particle physics. This theoretical framework unifies three of the four known fundamental interactions (electromagnetic, weak and strong). Until today, many researchers work on including the missing gravitational interaction, which is described by the theory of general relativity.

However, there is also an opposing point of view on the consistency of QED, which puts the theory in a different light. Most importantly, there is still no practical way known to remove the occurring divergences in a non-perturbative approach [110].
Additionally, the mere concept of a many-particle wave function leads to several inconsistencies in a relativistic situation, since each particle should have its own time coordinate [111]. Many years after the formulation by Tomonaga, Feynman and Schwinger (1948/1949), Iwo and Zofia Białynicki-Birula summarize the state of the research on QED in the following way:

In quantum electrodynamics, we have a theory that is not complete. Owing to the enormous

We recommend the very well documented article on the history of QED on Wikipedia: https://en.wikipedia.org/wiki/Quantum_electrodynamics.

See facts about J. Schwinger on

There is a set of standard effects that can be described and understood by QED and that are nowadays textbook knowledge. See for example [107, part 6]. For a recent application of QED, see [108].

These kinds of statements can be found in basically every book about QED or other quantum field theories. For a good summary of this point of view and the according predictions, we recommend, e.g., Ref. [109].

Note that Iwo and Zofia Białynicki-Birula are strong supporters of “this beautiful theory[, i.e., QED]” [112]. They contributed importantly to formulating QED more rigorously.

However, scattering processes happen typically in very special experimental situations or for very high energies. Their theoretical description is based on some well-defined and simple reference states, which describe the system before and after the scattering event has taken place [8]. But not all types of systems can be represented within the strict boundaries of this assumption. Most importantly, it is very impractical to describe ground states of (strongly) coupled light-matter systems as a scattering process. One (but not the only) important reason that hampers applying QED to processes other than scattering is that the standard technique of renormalization can only be performed order by order in a perturbation series [8].

One of the few attempts to go beyond scattering problems by introducing a “space-time resolved approach” to QED has been undertaken by Wagner et al. [115]. Within this approach, they also investigated the possibility of a purely computational renormalization scheme [116], yet only for very simplified models. To employ the usual methods of electronic-structure theory, we would need a well-defined Hamiltonian formulation of QED. Such formulations do indeed exist, but the problems that Iwo and Zofia Białynicki-Birula mentioned remain. There is for example no definite answer on how to deal with many of the occurring divergences, besides removing them by artificial cutoffs (which makes calculations depend on these cutoffs) [118].

The energy scales that are involved in usual chemical and materials-science processes are usually small in comparison to the involved rest masses, such that relativistic effects of the matter degrees of freedom can be neglected. However, the description of the electromagnetic field needs to be relativistic on these energy scales. This suggests considering the (semi-)nonrelativistic “limit” of QED (NR-QED), which describes nonrelativistic charged particles coupled to the electromagnetic field. This scenario is very well studied and thus especially its mathematical properties are comparatively well understood. We summarize in the following how NR-QED can be constructed and discuss the most important conceptual and mathematical issues. We follow hereby the excellent book by Spohn [9].

Note that basically all of the aforementioned impressive predictions belong to this group.

In principle, one could perform a scattering description of bound states as well. The problem is then more practical. Assume we had a QED theory that captures bound-state resonances (which live long, but not infinitely long). The spectrum goes (from −∞ if no positron interpretation is done) to ∞ and we have no idea where the special “ground-state resonance” lies. So we would need to probe all in/out states and minimization is useless. If we instead have a Hamiltonian that is bounded from below, the variational principle tells us “where to look” for the ground state, i.e., we just need to minimize the energy.

See for example Ref. [117, part 5] for a physicist’s definition and Ref. [118] for a mathematically rigorous discussion and definition of the QED Hamiltonian.

Note that there are also nonrelativistic formulations of electrodynamics, which however have a limited practical applicability. See, e.g., Ref. [119].
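To make the separation of energy scales concrete, the following minimal sketch compares one Hartree (a typical electronic energy scale) with the electron rest energy. It assumes atomic units (ħ = m_e = e = 4πε_0 = 1), in which the speed of light is c = 1/α, and uses an approximate value of the fine-structure constant; the function names are purely illustrative.

```python
# Illustrative estimate only, not part of the derivation.
# Assumption: atomic units, where the speed of light is c = 1/alpha.
ALPHA = 1.0 / 137.035999  # fine-structure constant (approximate value)


def rest_energy_hartree(mass_in_electron_masses=1.0):
    """Rest energy m*c^2 in Hartree for a mass given in electron masses."""
    c = 1.0 / ALPHA
    return mass_in_electron_masses * c**2


def fraction_of_rest_energy(energy_hartree, mass_in_electron_masses=1.0):
    """Ratio of a given energy (in Hartree) to the rest energy of the particle."""
    return energy_hartree / rest_energy_hartree(mass_in_electron_masses)


if __name__ == "__main__":
    print(f"electron rest energy: {rest_energy_hartree():.1f} Hartree")
    print(f"1 Hartree / (m_e c^2): {fraction_of_rest_energy(1.0):.2e}")
```

The ratio equals α², i.e., it is of the order of 10⁻⁵ for electrons and even smaller for the much heavier nuclei, which is the quantitative content of the statement that relativistic corrections to the matter degrees of freedom are small on chemical energy scales.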
Definition 1.1:
Maxwell-Lorentz Equations
For a charge density of $N_e$ electrons and $N_n$ nuclei with masses $m_i$, charges $q_i$ and positions $\mathbf{r}_i$, $i = 1, \dots, N_e + N_n$,
\[ \rho(\mathbf{r}, t) = \sum_{i=1}^{N_e + N_n} q_i \, \delta(\mathbf{r} - \mathbf{r}_i), \tag{1.18} \]
we associate a current $\mathbf{j}(\mathbf{r}, t)$. Both are linked by the continuity equation
\[ \partial_t \rho(\mathbf{r}, t) + \nabla \cdot \mathbf{j}(\mathbf{r}, t) = 0. \tag{1.19} \]
For simplicity, we constrain the description explicitly to length scales where the contribution of the aforementioned form factor is negligible [9] and thus consider point particles in this definition.
The evolution of the $N_e + N_n$ particles, the electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$ is governed by the four Maxwell equations and the Newton-Lorentz equations for the particles. The first two Maxwell equations describe the evolution of $\mathbf{E}$ and $\mathbf{B}$ under the influence of the current by
\[ \partial_t \mathbf{B}(\mathbf{r}, t) = - \nabla \times \mathbf{E}(\mathbf{r}, t), \tag{1.20a} \]
\[ \partial_t \mathbf{E}(\mathbf{r}, t) = c^2 \, \nabla \times \mathbf{B}(\mathbf{r}, t) + \mu_0 c^2 \, \mathbf{j}(\mathbf{r}, t), \tag{1.20b} \]
where $c$ is the speed of light and $\mu_0$ the vacuum permeability. The other two Maxwell equations are constraints on the evolution that read
\[ \nabla \cdot \mathbf{E} = \rho(\mathbf{r}, t), \tag{1.20c} \]
\[ \nabla \cdot \mathbf{B} = 0. \tag{1.20d} \]
Newton-Lorentz’s equation for the evolution of the i-th particle under the influence of E and B readsm i d d t r i ( t ) = e E ( r i , t ) + c − t r i × B ( r i , t ). (1.21) The fundamental inconsistency of the classical theory of coupled light-matter systems
To derive the Hamiltonian or equivalently the equations of motion of NR-QED, we can apply the correspondence principle [120] to its classical analogue. Thus, there is no need of, e.g., performing a limiting procedure from a more fundamental theory and we will follow this route here. The classical counterpart of NR-QED is Lorentz-Maxwell theory, which defines a set of coupled equations of motion (Def. 1.1) that govern the dynamics of $N$ charged particles coupled to their own electromagnetic radiation field. Already on the classical level, this coupling is problematic. To see this, let us briefly consider the well-defined equations of motion of a point charge in an external electromagnetic field (Newton-Lorentz equations). Neglecting magnetic fields for the moment, we can write the Newton-Lorentz equation for a nonrelativistic point charge of mass $m$ and charge $e$ as
\[ m \frac{\mathrm{d}^2}{\mathrm{d}t^2} \mathbf{r}(t) = e \, \mathbf{E}(\mathbf{r}, t), \tag{1.22} \]
where $\mathbf{E}(\mathbf{r}, t)$ is the electric field. Since we assumed that $\mathbf{E}(\mathbf{r}, t)$ is external, i.e., prescribed, this equation together with a suitable initial condition (e.g., $\mathbf{r} = \dot{\mathbf{r}} = 0$) constitutes a well-defined initial value problem. It is however important to know $\mathbf{E}(\mathbf{r}, t)$ exactly at the position $\mathbf{r}$ of the particle for all times $t \geq t_0$ after the initial time $t_0$. In Lorentz-Maxwell theory, we then want to couple Eq. (1.22) with Maxwell’s equations (see Def. 1.1). This means that $\mathbf{E}(\mathbf{r}, t)$ is no longer a prescribed function, but a dynamical quantity that has to be determined self-consistently by the coupled equations (this is of course also true for all other variables). This allows for example for back-reactions of the field on the particles and vice versa. On the other hand, there is the well-defined theory of Maxwell’s equations that react to a prescribed charge density $\rho(\mathbf{r}, t)$. For instance, the electric field of a static charge distribution is obtained simply by solving Gauss’s law, cf. Eq. (1.20c) (the other equations do not play a role in this simple case),
\[ \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi} \int \mathrm{d}^3 r' \, \rho(\mathbf{r}') \, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}. \tag{1.23} \]
In the case that $\rho$ is just a point charge, i.e., a delta function located at the position of the particle, $\mathbf{E}$ is no longer well defined at every point, but diverges exactly at the position of the charge. We have to conclude that connecting Eq. (1.23) and Eq. (1.22) leads to an inconsistency. Note that this issue is resolved in Schrödinger quantum mechanics for matter systems with the Coulomb interaction, but without taking into account the transversal degrees of freedom of the electromagnetic interaction. The debate on this has a long and interesting history and for further details, the reader is referred to, e.g., the essay by the mathematical physicist John Baez [121].

The resolution: regularization and renormalization
Nevertheless, we need to “cure” this inconsistency for our theory and the usual way to do so is by a so-called regularization procedure for short distances. This means in practice that we consider a small sphere instead of a point charge, which directly removes the divergence in Eq. (1.23). Still, we need to decide about the exact shape and size of this sphere, which is formally done by introducing a form factor in the equations. By the introduction of the cutoff, we remove small length scales from the description that “come from a more refined theory” [9, p. 15]. The regularization removes the divergence of the self-energy, i.e., the electrostatic energy of the particles corresponding to the Coulomb force (1.23). But it still contributes to the energy-momentum relation of the coupled theory, which consequently differs from the one of a free particle. Since the latter is an experimental fact, we need to “fix” the coupled energy-momentum relation by performing a renormalization. We see here that although mostly discussed in the area of quantum field theory, the technique of renormalization is already required on the classical level. In the case of Lorentz-Maxwell theory, this means that we choose the particle’s mass as a parameter that we utilize to fit the theory to the experiment [9, part 6 and part 7].
We see that the description of coupled light-matter systems is a very challenging topic, even on the “simplest,” i.e., the classical, level and there are many other subtleties and interesting issues.
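The role of the regularization can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the point charge is smeared into a uniformly charged ball of radius a (one possible form factor among many) and works in units with 4πε_0 = 1. The field energy is then finite, equal to (3/5) q²/a, and grows without bound as a → 0. The function names are hypothetical.

```python
# Sketch under stated assumptions: uniformly charged ball, units with 4*pi*eps0 = 1.

def field_energy_uniform_ball(radius, charge=1.0, n_steps=2000):
    """Electrostatic field energy U = (1/2) * integral_0^inf E(r)^2 r^2 dr."""
    # Inside the ball the field is E(r) = q r / a^3; integrate with the midpoint rule.
    h = radius / n_steps
    inner = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * h
        e_field = charge * r / radius**3
        inner += e_field**2 * r**2 * h
    # Outside the ball E(r) = q / r^2, so the integral of E^2 r^2 from a to infinity is q^2 / a.
    outer = charge**2 / radius
    return 0.5 * (inner + outer)


def field_energy_closed_form(radius, charge=1.0):
    """Known closed form (3/5) q^2 / a for a uniformly charged ball."""
    return 0.6 * charge**2 / radius


if __name__ == "__main__":
    for a in (1.0, 1e-1, 1e-2, 1e-3):
        print(a, field_energy_uniform_ball(a), field_energy_closed_form(a))
```

The numbers grow like 1/a, which is the classical self-energy divergence that the cutoff removes and that the mass renormalization subsequently has to absorb.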
The quantization of the classical theory: gauge dependence
Starting from the classical Lorentz-Maxwell theory, we now introduce NR-QED by the correspondence principle. The standard way to do this is to employ the canonical quantization procedure, which requires the Hamiltonian formulation of the classical theory. Regarding the matter part, this is straightforward, but for the Hamiltonian formulation of Maxwell’s theory, care has to be taken. To describe the electromagnetic field in terms of canonical variables, one usually introduces the vector and scalar potentials $\mathbf{A}$ and $\phi$. These are related to the physical fields $\mathbf{E}$, $\mathbf{B}$ by
\[ \mathbf{E}(\mathbf{r}, t) = - \frac{1}{c} \partial_t \mathbf{A}(\mathbf{r}, t) - \frac{1}{c} \nabla \phi(\mathbf{r}, t), \tag{1.24a} \]
\[ \mathbf{B}(\mathbf{r}, t) = \frac{1}{c} \nabla \times \mathbf{A}(\mathbf{r}, t), \tag{1.24b} \]
which is not a unique correspondence but leaves a so-called gauge freedom. This means that the quantization procedure depends on the gauge [10]. In the nonrelativistic description, it is very convenient to employ the so-called Coulomb gauge
\[ \nabla \cdot \mathbf{A}(\mathbf{r}, t) = 0. \tag{1.25} \]
The Coulomb gauge removes the longitudinal part of the vector potential, which by means of Gauss’s law (Eq. (1.20c)) also fixes the scalar potential $\phi(\mathbf{r}, t)$. The remaining two transversal degrees of freedom of $\mathbf{A} = \mathbf{A}_\perp$ describe the two known polarizations of the electromagnetic field. This means that in the Coulomb gauge, there is no “superfluous” field component and thus it can be seen as a kind of maximal gauge. That is why basically all derivations of NR-QED are done in the Coulomb gauge. However, there are also other gauges that are maximal in the same sense, e.g., the Poincaré gauge, and employing these might be advantageous in certain scenarios [11].

There are different ways to do this and certain connected issues. See [9, part 2/part 4] for details.

The Hamiltonian of NR-QED: how to construct a well-defined mathematical theory of light-matter interaction
However, to discuss the quantization of the Maxwell-Lorentz theory we stick to the standard case. We employ the Coulomb gauge and follow the prescription of canonical quantization, which leads to the so-called Pauli-Fierz Hamiltonian. Since this is a standard procedure that can be found in several textbooks, we only summarize the most important aspects. We again follow closely the book of Herbert Spohn [9]. The Pauli-Fierz Hamiltonian describes the quantum dynamics of $N$ charged particles that are coupled to the (vacuum) electromagnetic field. Since we are interested in molecular systems, we further divide $N = N_e + N_n$ into $N_e$ electrons and $N_n$ nuclei. The quantum formulation does not remedy the inconsistencies of the classical theory, but adds further problems. The most important property of a Hamiltonian of a quantum theory is hermiticity (which in the mathematical literature is usually called self-adjointness). Importantly, it is possible to prove the self-adjointness of the Pauli-Fierz Hamiltonian under the following two conditions [9, part 13.3].
1. We need to remove the self-energy of the particles analogously to the classical case.
2. We need to introduce a further “suitable” ultraviolet cutoff. This comes on top of the classical cutoff that we already included by quantizing extended instead of point charges. Still, the new cutoff just further renormalizes the mass and in most situations, we can include it as in the classical case by using the physical instead of the bare mass of the charge carriers. However, although the proof of self-adjointness provides a definition of the cutoff, it remains arbitrary up to a certain degree.
A Hamiltonian that obeys these conditions is the Pauli-Fierz Hamiltonian of Def. 1.2.

A remark on the validity of Pauli-Fierz theory
We want to conclude this subsection with a small remark on the interpretation of Pauli-Fierz theory. This paragraph is not necessary to follow the course of this text and can be skipped by the fast reader.

To see this, compare Eq. (1.20c) with the definition of $\mathbf{E}$ in terms of $\mathbf{A}$, $\phi$, cf. Eq. (1.24a): $\rho = \nabla \cdot \mathbf{E} = \nabla \cdot \left( - \frac{1}{c} \partial_t \mathbf{A} - \frac{1}{c} \nabla \phi \right) = - \frac{1}{c} \nabla^2 \phi$.

Definition 1.2:
Pauli-Fierz Hamiltonian
A system consisting of $N_e$ electrons and $N_n$ nuclei with masses $m_i$ and charges $q_i$, $i = 1, \dots, N_n + N_e$, that are minimally coupled to the electromagnetic field in the Coulomb gauge is described in the nonrelativistic energy regime by the Pauli-Fierz Hamiltonian
\[ \hat{H}_{PF} = \sum_{i}^{N_e + N_n} \frac{1}{2 m_i} \left( - \mathrm{i} \hbar \nabla_i - \frac{q_i}{c} \hat{\mathbf{A}}_\perp(\mathbf{r}_i) \right)^2 + \frac{1}{8 \pi \epsilon_0} \sum_{i \neq j}^{N_e + N_n} \frac{q_i q_j}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\mathbf{k}, s} \frac{1}{2} \left( \left( - \mathrm{i} \partial_{q_{\mathbf{k},s}} \right)^2 + \omega_k^2 q_{\mathbf{k},s}^2 \right). \tag{1.26} \]
Here $\mathrm{i}$ is the imaginary unit, $\hbar$ the reduced Planck constant and $\epsilon_0$ the vacuum permittivity. We denoted the field operator of the (transversal) vector potential with $\hat{\mathbf{A}}_\perp$, which reads
\[ \hat{\mathbf{A}}_\perp(\mathbf{r}) = \sqrt{\frac{\hbar c^2}{\epsilon_0}} \sum_{\mathbf{k}, s} \frac{1}{\sqrt{2 \omega_k}} \left( \hat{a}_{\mathbf{k},s} \mathbf{S}_{\mathbf{k},s}(\mathbf{r}) + \hat{a}^\dagger_{\mathbf{k},s} \mathbf{S}^*_{\mathbf{k},s}(\mathbf{r}) \right), \tag{1.27} \]
where
\[ \mathbf{S}_{\mathbf{k},s}(\mathbf{r}) = \frac{1}{\sqrt{V}} \, \mathbf{n}_{\mathbf{k},s} \, e^{\mathrm{i} \mathbf{k} \cdot \mathbf{r}} \tag{1.28} \]
are the mode functions for a plane wave with wave vector $\mathbf{k}$ and frequency $\omega_k = c k$ ($k \equiv |\mathbf{k}|$). We denote with $V = a^3$ the volume of the quantization box with side length $a$. Furthermore, $\mathbf{n}_{\mathbf{k},s}$ denotes the two polarization vectors ($s = 1, 2$) for every mode. To arrive at the form (1.26), we have defined the displacement coordinates
\[ q_{\mathbf{k},s} = \sqrt{\frac{\hbar}{2 \omega_k}} \left( \hat{a}^\dagger_{\mathbf{k},s} + \hat{a}_{\mathbf{k},s} \right), \tag{1.29} \]
\[ - \mathrm{i} \partial_{q_{\mathbf{k},s}} = \mathrm{i} \sqrt{\frac{\hbar \omega_k}{2}} \left( \hat{a}^\dagger_{\mathbf{k},s} - \hat{a}_{\mathbf{k},s} \right). \tag{1.30} \]

Pauli-Fierz theory as defined in Def. 1.2 is mathematically a well-defined quantum theory, i.e., the Hamiltonian of Def. 1.2 is self-adjoint and it has eigenstates that can be described by a many-particle wave function. However, to remove the occurring divergences of the theory, we have to introduce a cutoff that is in principle arbitrary. Thus, for high-energy processes the accuracy of the theory is somewhat unclear. Nevertheless, the range of validity, on which these mathematical issues are not problematic, is huge:

The claimed range of validity of the Pauli-Fierz Hamiltonian is flabbergasting... As the bold claim goes, any physical phenomenon in between [gravity on the Newtonian level and nuclear- and high-energy physics], including life on Earth, is accurately described through the Pauli-Fierz Hamiltonian. (Spohn [9, p. 157])

The conclusion that Pauli-Fierz theory provides a useful description of, e.g., molecules as complex as DNA or whole organisms is indeed “flabbergasting.” Thus, it seems that Dirac’s vision has come true (see the quotation in the beginning of Sec. 1.3.1): we have found a mathematical theory that describes “a large part of physics and the whole of chemistry.” We just need to solve this theory by “approximate practical methods of applying quantum mechanics [...], which can lead to an explanation of the main features of complex atomic systems without too much computation.”

Interestingly, although Pauli-Fierz theory has been known since the 1930s, no practical method has yet been developed that is capable of solving the theory for any relativistic scenarios. Spohn remarks on this reductionist’s perspective:

Of course, our trust is not based on strict mathematical deductions from the Pauli-Fierz Hamiltonian. This is too difficult a program. Our confidence comes from well-studied limit cases. [...] In the static limit we imagine turning off the interaction to the quantized part of the Maxwell field. This clearly results in Schrödinger particles interacting through a purely Coulombic potential, for which many predictions are accessible to experimental verification. But beware, even there apparently simple questions remain to be better understood. For example, the size of atoms as we see them in nature remains mysterious if only the Coulomb interaction and the Pauli exclusion principle are allowed. (Spohn [9, p. 157])

Another example for such a limit is the standard form of cavity QED that we will derive in the next section from Pauli-Fierz theory by performing the long-wavelength approximation.
We will see in the course of this thesis that merely finding the (approximate) ground state of cavity QED is an extremely difficult task. Without the great knowledge of the many-electron problem that we have nowadays, this would probably be an impossible task. This explicitly shows the limitation of the very common (reductionist’s) point of view that defines a hierarchy between more general theories and their limit cases. Although Pauli-Fierz theory is more fundamental in the sense that it is more general than, e.g., many-electron Schrödinger theory, it is the predictions of the lower member that are the most important justification for the higher-lying member.

This illustrates how long the way from writing down a Hamiltonian to describing nature really is. And most importantly, it is not “just mathematics” that is required to go this way, as the reductionist’s hypothesis suggests. This was pointed out already in 1972 by Anderson [122] with several examples from molecular and solid-state physics. The impressive success of the cavity-QED models is another example that proves this far too simple idea wrong: the power of these models lies especially in their validity in settings where the explicit mathematical derivation from first principles, i.e., from the Pauli-Fierz Hamiltonian, is not possible (see Sec. 1.2). Spohn summarizes this in the following way:

[If t]he Hamiltonian is a self-adjoint linear operator, [... o]f course this does not mean that we have solved any physical problem. It just assures us of a definite mathematical framework within which consequences can be explored. (Spohn [9, p. 158])

This is the reason for the long discussion of the mathematical foundations of Pauli-Fierz theory in this subsection. To develop new methods that accurately describe electron-photon interaction from first principles, we need a (mathematically and physically) firm ground. With the corresponding methods, we will however not “solve” Pauli-Fierz theory in the sense that we diagonalize $\hat{H}_{PF}$. There are no analytical solutions known and even the simplest case of one particle coupled to the Maxwell field is basically numerically inaccessible. But we need such a “definite mathematical framework” to derive approximate theories.

Definition 1.3:
Atomic units
The system of atomic units (denoted by [a.u.]) is defined by the following base units:

mass           $m_e$               the electron mass
charge         $e$                 the electronic charge
action         $\hbar$             the reduced Planck constant
permittivity   $4 \pi \epsilon_0$  the inverse Coulomb constant

This means that we measure mass in multiples of the electron mass etc. This is formally done by setting these four fundamental constants to one. Derived from these base units, we have the following derived units (we list only the most important ones):

length   $a_0 = 4 \pi \epsilon_0 \hbar^2 / (m_e e^2)$   $1\, a_0$ = “bohr”
energy   $E_h = \hbar^2 / (m_e a_0^2)$                  $1\, E_h$ = “Hartree”

Furthermore, we can derive the numerical value of the speed of light $c = 1/\alpha \approx 137$ [a.u.], where we denoted with $\alpha$ the fine-structure constant $\alpha = e^2 / (4 \pi \epsilon_0 \hbar c)$.

In this subsection, we derive the cavity-QED Hamiltonian from Pauli-Fierz theory. The physical setting defined by this limit is the basis for our analysis in the following chapters. Since we aim to understand strong-coupling phenomena that require the strong mode confinement of cavities, we constrain the electromagnetic field description to only a few modes. These modes are assumed to have long wavelengths in comparison to the spatial extension of the considered matter systems, such that the dipole approximation is valid. Regarding the matter description, we only consider the (adiabatic) electronic structure in the electrostatic potential of (quasi)static or clamped nuclei. This means that we constrain our description to settings where the Born-Oppenheimer approximation can be applied and disregard the dynamics of the nuclei and their quantum nature. This setting is however sufficient to describe a large range of chemical processes [123]. Despite these many simplifications, we are still confronted with an extraordinarily difficult problem. Also in this limit, there are currently only two special cases for which analytical solutions are known.

The starting point: the macroscopic description of Pauli-Fierz theory
The starting point of the derivation is the nonrelativistic theory of quantum electrodynamics, which is described by the Pauli-Fierz Hamiltonian $\hat{H}_{PF}$ (Def. 1.2). Since we are interested in processes with subatomic length and energy scales, we use from now on a suitable unit system, so-called atomic units (see Def. 1.3).
We start the derivation with the macroscopic description of Maxwell’s equations, which separates the charge current
\[ \mathbf{j}(\mathbf{r}, t) = \mathbf{j}_b(\mathbf{r}, t) + \mathbf{j}_f(\mathbf{r}, t) \]
into the bound current $\mathbf{j}_b$ and the free current $\mathbf{j}_f$. The former is generated by the magnetization $\mathbf{M}$ and polarization $\mathbf{P}$ of the matter system as
\[ \mathbf{j}_b(\mathbf{r}, t) = \nabla \times \mathbf{M}(\mathbf{r}, t) + \partial_t \mathbf{P}(\mathbf{r}, t). \]
We can then define the displacement field $\mathbf{D} = \epsilon_0 \mathbf{E} + \mathbf{P}$ and the magnetic field $\mathbf{H} = \frac{1}{\mu_0} \mathbf{B} - \mathbf{M}$, which are generated by the free current as
\[ \mathbf{j}_f(\mathbf{r}, t) = - \nabla \times \mathbf{H}(\mathbf{r}, t) + \partial_t \mathbf{D}(\mathbf{r}, t). \]
This construction is in a sense artificial, because the actual forces that are exerted by the field are only due to the total electric and magnetic fields $\mathbf{E}$, $\mathbf{B}$. But it is very well suited for the following theoretical considerations, because it separates out the part of the electromagnetic field that is exclusively produced by the (bound) charges of our system.
The connection to the macroscopic form of (1.26) is given by the unitary Power-Zienau-Woolley (PZW) transformation
\[ \hat{U}_{PZW} = \exp \left( - \mathrm{i} \alpha \int \mathrm{d}^3 r \, \hat{\mathbf{P}}_\perp(\mathbf{r}) \cdot \hat{\mathbf{A}}_\perp(\mathbf{r}) \right), \]
where we only need to transform the transversal fields, because of the Coulomb gauge. With the choice of our matter system, we can explicitly define the polarization
\[ \mathbf{P}(\mathbf{r}) = - \sum_{i=1}^{N_e} \mathbf{r}_i \int_0^1 \delta(\mathbf{r} - s \mathbf{r}_i) \, \mathrm{d}s + \sum_{i=1}^{N_n} Z_i \mathbf{R}_i \int_0^1 \delta(\mathbf{r} - s \mathbf{R}_i) \, \mathrm{d}s, \tag{1.31} \]
where we separated electronic coordinates $\mathbf{r}_i$ and nuclear coordinates $\mathbf{R}_i$ and denoted the charge of the $i$-th nucleus by $Z_i$. Since we included all physical charges in the definition of $\mathbf{P}$, we trivially have $\mathbf{j}_f = 0$.

Note that the dipole approximation is accurate also in more general settings in the context of cavity QED, e.g., nanoplasmonic environments. See also Ch. 2.

The first considers one electron in a harmonic potential [9, part 13.7] and the second $N$ electrons without interaction in a box with periodic boundary conditions [124].

The dipole approximation
We can now introduce the dipole approximation by disregarding the integration over $s$ in the definition of $\mathbf{P}$ (Eq. (1.31)) or equivalently by setting
\[ \mathbf{A}(\mathbf{r}) \approx \mathbf{A}(0), \qquad \text{(cf. 1.6)} \]
where we assumed that the origin of the coordinate frame is located at the center of charge. Specifically, this leads to the transformation
\[ \hat{U}_{PZW, dip} = \exp \left( - \mathrm{i} \sqrt{4 \pi} \sum_{\mathbf{k}, s} S_{\mathbf{k},s}(0) \, \mathbf{n}_{\mathbf{k},s} \cdot \mathbf{R} \; q_{\mathbf{k},s} \right), \tag{1.32} \]
where $\mathbf{n}_{\mathbf{k},s}$ is the polarization vector and $S_{\mathbf{k},s}(0) \propto 1/\sqrt{V}$ the (absolute value of the) mode function of the photon mode with wave vector $\mathbf{k}$ and polarization $s$ (see Def. 1.2). Additionally, we have defined the total dipole
\[ \mathbf{R} = - \sum_{i=1}^{N_e} \mathbf{r}_i + \sum_{i=1}^{N_n} Z_i \mathbf{R}_i. \tag{1.33} \]
Further, we constrain the description to a finite number $M$ of photon modes, where $M$ cannot be chosen arbitrarily, since the dipole approximation breaks down for very large $M$ [124]. In the following, we will usually assume that $M$ is very small (corresponding to the enhanced modes of an optical cavity) and thus, this issue will not be problematic. Note that Eq. (1.32) is also called the length-gauge transformation and it is the generalized form of the transformation $U_d$ (defined above Eq. (1.7)) that was employed in the derivation of the cavity-QED models. Furthermore, we subsume $\alpha = (\mathbf{k}, s)$ and apply a canonical transformation that exchanges the role of the photon coordinates and momenta,
\[ - \mathrm{i} \partial_{q_\alpha} \to - \omega_\alpha p_\alpha, \qquad q_\alpha \to - \mathrm{i} \omega_\alpha^{-1} \partial_{p_\alpha}. \]
The nonrelativistic QED Hamiltonian in the long-wavelength limit then reads
\[ \hat{H}_{LW} = \hat{H}_n + \hat{H}_e + \hat{H}_{ne} + \hat{H}_p + \hat{H}_{ep} + \hat{H}_{np}, \tag{1.35} \]
where we defined the nuclear Hamiltonian
\[ \hat{H}_n = - \sum_{i}^{N_n} \frac{1}{2 M_i} \nabla^2_{\mathbf{R}_i} + \frac{1}{2} \sum_{i \neq j}^{N_n} \frac{Z_i Z_j}{|\mathbf{R}_i - \mathbf{R}_j|}, \]
with the nuclear masses $M_i$. The electronic Hamiltonian reads
\[ \hat{H}_e = - \sum_{i}^{N_e} \frac{1}{2} \nabla^2_{\mathbf{r}_i} + \frac{1}{2} \sum_{i \neq j}^{N_e} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \]
and the interaction Hamiltonian between both matter degrees of freedom is given by
\[ \hat{H}_{ne} = - \sum_{i}^{N_e} \sum_{j}^{N_n} \frac{Z_j}{|\mathbf{r}_i - \mathbf{R}_j|}. \]
The Hamiltonians for the contributions of all the $M$ photon modes can be collected together as
\[ \hat{H}_p + \hat{H}_{ep} + \hat{H}_{np} = \sum_{\alpha}^{M} \frac{1}{2} \left[ - \partial^2_{p_\alpha} + \omega_\alpha^2 \left( p_\alpha - \frac{\boldsymbol{\lambda}_\alpha}{\omega_\alpha} \cdot \mathbf{R} \right)^2 \right], \]
where the light-matter interaction becomes explicit by the appearance of the total dipole $\mathbf{R}$. We also introduced the fundamental light-matter coupling constant for mode $\alpha$,
\[ \boldsymbol{\lambda}_\alpha = \sqrt{4 \pi} \, S_\alpha(0) \, \boldsymbol{\epsilon}_\alpha. \tag{1.36} \]

This follows because we treat a closed system here. In a more general setting, the free current could, e.g., represent an external current that pumps the cavity mode.

The Born-Oppenheimer approximation
The Hamiltonian (1.35) describes the electrons and nuclei of a matter system, which are basically all degrees of freedom of molecules and solids. It is thus a suitable basis for a first-principles description of, e.g., the phenomena of polaritonic chemistry. Nevertheless, the accurate description of these degrees of freedom is very challenging, and in many cases the electronic and nuclear dynamics can be separated (Born-Oppenheimer approximation). The electrons are then described in the electrostatic potential of the nuclei with fixed coordinates $R_i$ (clamped nuclei). The electronic energy as a function of the $R_i$ (potential energy surface) then defines an effective potential for the nuclear dynamics. In this work, we focus on the electronic part of the matter description (electronic structure, see also Ch. 2). However, since electrons and nuclei couple in exactly the same way (with their dipole operator) to the photon modes, a large part of our discussion can be generalized straightforwardly to the nuclear theory. For further details about additional implications of the approximation for the coupled light-matter systems and possible improvements, the reader is referred to Ref. [125].
To derive the electron-photon theory formally, we let $M_i \to \infty$ for $i = 1, \dots, N_n$. Since $m_e \ll M_i$, this approximation has a wide range of validity. Thus, the kinetic energy of the nuclei vanishes, $\sum_i \frac{1}{2M_i}\nabla_{R_i}^2 \to 0$, and the nuclear positions become fixed parameters $R_i$. Under the Born-Oppenheimer approximation (abbreviated by BOA in the equations), the nuclear contribution
\[ \hat{H}_n \xrightarrow{\text{BOA}} \tfrac{1}{2}\sum_{i \neq j}^{N_n} \frac{Z_i Z_j}{|R_i - R_j|} \]
is merely a constant that we can neglect in many scenarios. We will consider the term only for comparisons of different nuclear configurations, e.g., to study simple chemical reactions. The electron-nuclear interaction then becomes a prescribed local potential,
\[ \hat{H}_{ne} \xrightarrow{\text{BOA}} v(r) = -\sum_{i=1}^{N_n} \frac{Z_i}{|R_i - r|}, \]
that we include in the electronic Hamiltonian, $\hat{H}_e \xrightarrow{\text{BOA}} \hat{H}_e + v(r)$.
We also neglect the constant contribution of the nuclear charge to the total dipole, $R \xrightarrow{\text{BOA}} -\sum_{i=1}^{N} r_i$, where we have removed the index of $N_e \to N$ for convenience. We note that the Hamiltonian (1.35) is symmetric with respect to the simultaneous inversion of $r \to -r$ and $p_\alpha \to -p_\alpha$. Additionally, there is the ambiguity of the sign of the electric charge [126]. That is why one can find different definitions of the bilinear coupling term in the literature. Importantly, both choices, $p_\alpha \pm \frac{\lambda_\alpha}{\omega_\alpha} \cdot R$, lead to the same expectation values of the observables. We make use of this freedom and redefine
\[ R = +\sum_{i=1}^{N} r_i, \]
which is the more common choice in the literature.
Collecting all the terms, we arrive at the coupled electron-photon Hamiltonian in the long-wavelength limit, or the cavity-QED Hamiltonian, that is explicitly given in Def. 1.4. It will be the basis for (most of) the discussions of the rest of this thesis.
Footnote: According to the standard microscopic model of matter, which neglects the inner structure of the nuclei.
Footnote: However, there are also many situations in which the Born-Oppenheimer approximation fails, but these cases are beyond the scope of this work. The interested reader is referred to, e.g., the review by Worth and Cederbaum [123].
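As a minimal sketch of the clamped-nuclei quantities introduced above, the following Python snippet evaluates the constant nuclear repulsion and the local potential $v(r)$ for a given geometry. It assumes Hartree atomic units and the usual sign convention (electron charge $-1$, nuclear charges $+Z_i$); the function names and the two-proton geometries are hypothetical and serve only as an illustration of how different nuclear configurations can be compared.

```python
import math

def nuclear_repulsion(charges, positions):
    """Constant term (1/2) * sum_{i != j} Z_i Z_j / |R_i - R_j| (atomic units)."""
    energy = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            # Counting every pair once equals (1/2) * sum over i != j.
            energy += charges[i] * charges[j] / math.dist(positions[i], positions[j])
    return energy

def external_potential(r, charges, positions):
    """Local potential v(r) = -sum_i Z_i / |R_i - r| felt by one electron."""
    return -sum(z / math.dist(R, r) for z, R in zip(charges, positions))

# Hypothetical example: two protons (Z = 1) at two different separations.
charges = [1.0, 1.0]
near = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]
far = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]

print(nuclear_repulsion(charges, near))   # repulsion at the shorter distance
print(nuclear_repulsion(charges, far))    # smaller repulsion at the larger distance
print(external_potential((0.0, 0.0, 0.7), charges, near))  # attractive, hence negative
```

Only the difference between the two repulsion values matters when two geometries are compared; for a single fixed geometry the term shifts all energies by the same constant.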
Definition 1.4:
Cavity-QED Hamiltonian
A system consisting of $N$ nonrelativistic electrons that are coupled via their dipole to $M$ modes of the electromagnetic field in the length gauge is described by the cavity-QED Hamiltonian
\[ \hat{H} = \sum_{i}^{N} \left[ -\tfrac{1}{2}\nabla_{r_i}^2 + v(r_i) \right] + \tfrac{1}{2}\sum_{i \neq j}^{N} \frac{1}{|r_i - r_j|} + \sum_{\alpha}^{M} \tfrac{1}{2}\left[ -\partial_{p_\alpha}^2 + \omega_\alpha^2 \left( p_\alpha + \frac{\lambda_\alpha}{\omega_\alpha} \cdot R \right)^2 \right]. \tag{1.37} \]
Here $R = +\sum_{i=1}^{N} r_i$ is the total dipole of the matter system and, for each mode $\alpha$, $\omega_\alpha$ denotes the frequency and
\[ \lambda_\alpha = \sqrt{4\pi}\, S_\alpha\, \epsilon_\alpha \tag{1.38} \]
is the light-matter coupling constant, which includes the polarization vector $\epsilon_\alpha$ and the (effective) mode function $S_\alpha$. In free space, the modes are plane waves and thus $S_\alpha$ is given by the simple expression $S_\alpha = 1/\sqrt{V}$ (cf. Eq. (1.28)). However, inside a cavity the mode functions may have a nontrivial form [127].
For later convenience, we further differentiate the contributions $\hat{H} = \hat{T} + \hat{V} + \hat{W} + \hat{H}_{ph} + \hat{H}_{int} + \hat{H}_{self}$, where
\[ \hat{T} = -\sum_{i}^{N} \tfrac{1}{2}\nabla_{r_i}^2, \quad \hat{V} = \sum_{i}^{N} v(r_i), \quad \hat{W} = \tfrac{1}{2}\sum_{i \neq j}^{N} \frac{1}{|r_i - r_j|}, \]
\[ \hat{H}_{ph} = \sum_{\alpha}^{M} \tfrac{1}{2}\left[ -\partial_{p_\alpha}^2 + \omega_\alpha^2 p_\alpha^2 \right], \quad \hat{H}_{int} = \sum_{i}^{N}\sum_{\alpha}^{M} \omega_\alpha p_\alpha\, \lambda_\alpha \cdot r_i, \quad \hat{H}_{self} = \tfrac{1}{2}\sum_{\alpha}^{M}\sum_{i,j=1}^{N} (\lambda_\alpha \cdot r_i)(\lambda_\alpha \cdot r_j). \]
A remark on the gauge choice
We conclude this subsection with a remark on the specific gauge choice in $\hat{H}$. In the length (PZW) gauge, we describe the photon degrees of freedom with the (transversal) displacement field
\[ \hat{D}_\perp = \sum_{\alpha=1}^{M} \frac{\omega_\alpha}{4\pi} \lambda_\alpha p_\alpha \]
as canonical variable with corresponding momentum
\[ \hat{B} = \sum_{\alpha=1}^{M} (-i\partial_{p_\alpha})\, \lambda_\alpha \times \epsilon_\alpha. \]
We recover the electric field as $\hat{E} = 4\pi \left( \hat{D}_\perp - \hat{P}_\perp \right)$ with the (transversal) polarization $\hat{P}_\perp = \sum_\alpha \frac{1}{4\pi} \lambda_\alpha (\lambda_\alpha \cdot R)$.
The original creation and annihilation operators for the electromagnetic field modes now have the form
\[ \hat{a}_\alpha = \frac{1}{\sqrt{2\omega_\alpha}} \left( -i\partial_{p_\alpha} - i\omega_\alpha p_\alpha + i\lambda_\alpha \cdot R \right), \qquad \hat{a}_\alpha^+ = \frac{1}{\sqrt{2\omega_\alpha}} \left( -i\partial_{p_\alpha} + i\omega_\alpha p_\alpha - i\lambda_\alpha \cdot R \right). \]
Footnote: Note that $(-i\partial_{p_\alpha})^+ = -i\partial_{p_\alpha}$ is the canonical momentum operator, which is Hermitian.
This has consequences for the interpretation of photon observables. Most importantly, the expectation value
\[ \tilde{E}_{ph} = \langle \hat{H}_{ph} \rangle = \left\langle \sum_{\alpha}^{M} \tfrac{1}{2}\left[ -\partial_{p_\alpha}^2 + \omega_\alpha^2 p_\alpha^2 \right] \right\rangle \tag{1.40} \]
does not correspond to the energy of the electromagnetic field, which is defined as $\frac{1}{8\pi}\int \mathrm{d}^3 r\, (E^2 + B^2)$ in terms of the electric field, but is instead connected to the displacement field $D$. To calculate the photon energy (in its standard definition), we have to consider instead
\[ E_{ph} = \langle \hat{H}_{ph} + \hat{H}_{int} + \hat{H}_{self} \rangle. \tag{1.41} \]
Correspondingly, the photon number of mode $\alpha$ is calculated as $N_{ph} = \frac{1}{\omega_\alpha} E_{ph} - \frac{1}{2} = \sum_\alpha \langle \hat{N}_{ph,\alpha} \rangle$, where we defined the photon number operator for mode $\alpha$,
\[ \hat{N}_{ph,\alpha} = \frac{1}{2\omega_\alpha} \left( -\frac{\partial^2}{\partial p_\alpha^2} + \omega_\alpha^2 \left( p_\alpha - \frac{\lambda_\alpha}{\omega_\alpha} \cdot R \right)^2 \right) - \frac{1}{2}. \tag{1.42} \]
For the sake of completeness, let us also define the "photon-number" operator with respect to Eq. (1.40), i.e.,
\[ \tilde{N}_{ph,\alpha} = \frac{1}{2\omega_\alpha} \left[ -\partial_{p_\alpha}^2 + \omega_\alpha^2 p_\alpha^2 \right] - \frac{1}{2}, \tag{1.43} \]
that we call the mode occupation to differentiate it from $\hat{N}_{ph,\alpha}$.
This illustrates that in the length gauge, the electronic and photonic degrees of freedom are (partly) mixed. Consequently, a part of the electron-photon interaction is described only by matter quantities, as, e.g., the occurrence of the dipole self-interaction $\hat{H}_{self}$ shows. This is a convenient starting point to develop new first-principles methods for light-matter interaction, since the description of matter is much better understood. For instance, one could approximate the light-matter interaction by only taking $\hat{H}_{self}$ into account and would end up with a matter-only theory, which provides a good (qualitative) description of certain scenarios [128, 125]. Without the PZW transformation, cf. Eq. (1.32), such a mixing does not occur and we describe the electromagnetic field by the vector potential and the electric field exactly as in the Coulomb gauge of NR-QED. Within the long-wavelength description, this is called the velocity gauge. Instead of $\hat{H}_{self}$, we would find in this gauge the diamagnetic term, mentioned in Sec. 1.2, that is proportional to $\hat{A}^2$.
Chapter 2
MATTER FROM FIRST PRINCIPLES: ELECTRONIC-STRUCTURE THEORY
Before we investigate how the coupled light-matter problem can be approached from a first-principles perspective (which is the topic of Sec. 3), we discuss in this chapter the "simpler" problem of matter-only systems. How to accurately describe the microscopic details of matter is a topic so well studied that many techniques have already become basic textbook knowledge. Nevertheless, we discuss this topic in the following in great detail and with a particular focus on the conceptual level. On the one hand, this serves to provide a complete picture also for the unacquainted reader. On the other hand, we aim at highlighting for each method the essential ingredients that make it numerically efficient, yet accurate. We then show that this efficiency is lost once we consider the standard form of the light-matter problem based on individual electrons and photons (Sec. 3). An analysis of this drawback already provides the physical rationale why a polariton picture would be preferable (part II).
The "keystone" [123] of the first-principles description of matter is the Born-Oppenheimer approximation that we have also employed in the derivation of the cavity-QED Hamiltonian in the previous section. Within the Born-Oppenheimer approximation, the dynamics of the electrons and nuclei are decoupled, which allows us to separate the description of matter into methods that focus on nuclear effects (such as vibrations or rotations) and so-called electronic-structure theory. The latter comprises all approaches to describe the electronic degrees of freedom of matter systems, including the majority of all known first-principles methods. A very prominent role in electronic-structure theory is played by the ground state, which determines fundamental properties of matter systems such as the equilibrium geometry or the effective screening of the Coulomb interaction inside a solid.
It also plays a key part in understanding chemical processes like reactions, which can often be described in the quasistatic picture of the Born-Oppenheimer approximation. Many first-principles methods exclusively aim to accurately describe electronic ground states, and we put the focus on such methods in the following.
Footnote: Note that there are also first-principles methods that, e.g., focus explicitly on the description of the coupled electron-nuclei dynamics [129] or only the nuclei [130].
The accurate description of the ground state of a many-electron system is an extraordinarily challenging problem that has been a research topic since the beginnings of quantum mechanics. Consequently, many approaches have been developed for this task, and it depends strongly on the specific scenario which one is the best choice. However, there are certain generic features that play a role in all approaches.
The most important example is the complexity that arises from the many-body problem and that basically prohibits the exact description of most many-electron systems. The only explicitly known way for such a description is to solve the Schrödinger equation for the many-body wave function. For a system consisting of $N$ electrons, the wave function depends on $4N$ coordinates (three spatial and one spin coordinate for each electron), and thus the configuration space grows exponentially with $N$. For large $N$, this makes the concept of the many-body wave function useless in practice. The complexity of the many-body problem occurs in some way or other in all electronic-structure approaches, and it is the reason why practical methods always rely on approximations.
Another example of a generic feature of all electronic-structure methods is the particle-exchange symmetry. Electrons are fermions, and a fermionic wave function is antisymmetric with respect to the exchange of the coordinates of any two electrons.
As a consequence, fermions adhere to the Pauli principle, i.e., two fermions cannot have the same set of quantum numbers. Accounting for the Pauli principle is fundamental for an accurate description of the electronic structure.
The third example is the variational principle that defines the ground state as the minimizer of the total energy. This suggests determining the ground state by an explicit minimization of the energy, which is the starting point of every electronic-structure method. To carry out this minimization, it is necessary to characterize the state of the system, which in turn defines the energy functional. In fact, there are several ways to do this characterization, and each of these defines a class of electronic-structure methods. In the following, we present three different electronic-structure approaches that adhere to the above principles:
1. Hartree-Fock (HF) theory (Sec. 2.2)
2. Density functional theory (DFT, Sec. 2.3)
3. Reduced density matrices (RDMs) that are the basis of variational RDM theory (Sec. 2.4.1) or one-body RDM functional theory (RDMFT, Sec. 2.4.2)
Let us now briefly explain these approaches, highlighting the just defined generic principles.
HF theory is one of the simplest examples of the class of wave-function methods, which directly approximate the many-body wave function. The HF wave function is one Slater determinant (single-reference ansatz), which is the simplest many-body wave function that is antisymmetric and already provides a good qualitative description of many systems. In general, wave-function methods guarantee the particle-exchange symmetry by employing Slater determinants. One can improve on the accuracy of HF by considering wave functions built from more than one Slater determinant (multi-reference ansatz). Important examples for this approach are configuration interaction, coupled cluster theory or multiconfigurational self-consistent field theory [132, part 5].
More recently, tensor-network methods such as density matrix renormalization group [133] theory have additionally enriched the spectrum of this approach. To this day, such wave-function methods yield the most precise results for many systems. However, their high precision usually comes with equivalently high numerical costs. Systems with larger particle numbers of, say, $N \simeq 100$ can rarely be described accurately with such methods. This is the manifestation of the many-body problem in wave-function methods.
Footnote: Note however that in the most common practical coupled cluster methods, the wave function is not entirely reconstructable [131].
In DFT instead, we drop the concept of the many-body wave function and describe the state of a system with the (one-body) electron density $\rho(r)$. The many-body problem manifests itself here more indirectly than for wave-function methods. The reformulation of many-body quantum mechanics in terms of $\rho$ leads to an unknown quantity, which is called the exchange-correlation functional. Constructing approximations for this functional is a very difficult task and severely limits the applicability of functional methods to certain system classes. The most common starting point for the development of functionals is the Kohn-Sham (KS) construction that describes the state of the system with one Slater determinant exactly as in HF theory. An important reason for this choice is that the Slater determinant accounts for the fermionic antisymmetry.
Employing RDMs as a basic variable provides yet another perspective on many-electron systems. For instance, we can describe the expectation value of the energy of an $N$-electron system exactly in terms of the 2-body RDM (2RDM). The configuration space of such a description does not grow with the particle number, and thus a variational minimization in terms of the 2RDM is in principle feasible. However, the many-body problem also arises in this description in the form of conditions that determine the exact configuration space, i.e., the set of all 2RDMs. The number of these so-called $N$-representability conditions grows exponentially with the particle number $N$.
Thus, in practical variational minimizations of the energy with respect to the 2RDM (variational 2RDM theory), only a subset of conditions is considered.
A special role is here taken by the 1-body RDM (1RDM) $\gamma$ that has comparatively simple $N$-representability conditions. These reflect the exchange symmetry of the particle species and thus provide an alternative to employing Slater determinants to enforce the antisymmetry of electrons. Although $\gamma$ is not sufficient to describe the energy of a many-electron system in a linear way, it carries all information on the system by virtue of a generalized Hohenberg-Kohn theorem (Gilbert's theorem). This establishes RDMFT as an alternative to DFT that employs $\gamma$ instead of $\rho$ as the basic variable. Usual approximate RDMFTs are numerically more expensive than DFT, but conversely, they can account for multi-reference effects in an easier way.
We want to remark on a fourth and last common feature of all electronic-structure methods, which is their computational complexity. Independently of the details of a certain method, we always exchange a linear problem on a huge configuration space, i.e., solving the many-body Schrödinger equation, for a nonlinear problem on a comparatively small configuration space. Importantly, the equations of electronic-structure (and in general many-body) methods usually involve nonlinear operators and some form of self-consistency, which is a challenging combination. To solve them in practice, it is not sufficient to employ, e.g., a solver for partial differential equations from a standard library. Instead, we usually need specific algorithms that have been designed to solve the equations of a specific theory. Thus, electronic-structure methods are fundamentally connected to computational mathematics. That we are nowadays able to apply a method like HF to problems that involve many hundreds of electrons is a consequence more of algorithmic improvements than of the increase of computer power.
Footnote: State-of-the-art DFT methods can describe systems with more than 1000 particles. See for instance Ref. [134], where the authors present results for more than 8000 electrons.
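The exponential growth of the configuration space mentioned above can be made tangible with a small counting exercise. The following sketch assumes that every spatial coordinate is sampled on the same number of grid points and that each amplitude is stored as one double-precision complex number; the reduction due to antisymmetry is ignored. The function name and the grid size are hypothetical.

```python
def wavefunction_amplitudes(n_electrons, grid_points_per_axis):
    """Number of amplitudes of an N-electron wave function on a product grid.

    Each electron contributes three spatial coordinates (grid_points_per_axis
    values each) and one spin coordinate (two values). Antisymmetry, which
    would reduce the count, is deliberately ignored in this estimate.
    """
    per_electron = 2 * grid_points_per_axis ** 3
    return per_electron ** n_electrons

BYTES_PER_AMPLITUDE = 16  # assumption: one double-precision complex number

for n in (1, 2, 4, 10):
    count = wavefunction_amplitudes(n, grid_points_per_axis=10)
    gigabytes = count * BYTES_PER_AMPLITUDE / 1e9
    print(f"N = {n:2d}: {count:.3e} amplitudes, about {gigabytes:.3e} GB")
```

Even with the very coarse grid of ten points per axis, the storage requirement leaves the range of any computer after a handful of electrons, which is the practical reason why all methods discussed in this chapter avoid the full wave function.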
We start our survey on (equilibrium) electronic-structure theory with the proper definition of the problem that we want to solve. We aim to accurately describe the ground state of a system consisting of $N$ electrons that repel each other via the Coulomb interaction and are exposed to some local potential $v(r)$. Usually $v$ is thought of as the electrostatic potential of some classical charges at fixed positions that represent the nuclei in the Born-Oppenheimer approximation (see Sec. 1.3.3), but we do not have to specify this for the general theory. This setting is described by the (electronic-structure) Hamiltonian
\[ \hat{H} = \sum_{i}^{N} \left[ -\tfrac{1}{2}\nabla_{r_i}^2 + v(r_i) \right] + \tfrac{1}{2}\sum_{i \neq j}^{N} \frac{1}{|r_i - r_j|}, \tag{2.1} \]
that can be derived in several ways, e.g., by the canonical quantization of the corresponding classical theory or as the static limit of the Pauli-Fierz Hamiltonian (Def. 1.2), i.e., by turning off the interaction with the quantized part of the Maxwell field. Eq. (2.1) defines a linear operator that has a well-defined spectrum and is bounded from below. Hence, we can define the ground state $\Psi_0$ by a variational minimization of the energy expectation value $E = \langle \Psi | \hat{H} \Psi \rangle$ over all "physical" wave functions $\Psi$, i.e.,
\[ E_0 = \langle \Psi_0 | \hat{H} \Psi_0 \rangle = \inf_{\Psi} \langle \Psi | \hat{H} \Psi \rangle, \tag{2.2} \]
where we will discuss the term "physical" in the following paragraphs. This variational principle is of central importance in electronic-structure theory, and it will play a fundamental role for all methods that we present in the following. Loosely speaking, the goal of (equilibrium) electronic-structure theory is to (approximately) solve the minimization problem (2.2) with Hamiltonian (2.1).
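The content of the variational principle (2.2) is that any admissible trial wave function yields an upper bound to $E_0$. The following sketch illustrates this for a deliberately simple test case that is not the full Hamiltonian (2.1): a single particle on the interval $[0, L]$ with $\hat{H} = -\frac{1}{2}\partial_x^2$ and zero boundary conditions, whose exact ground-state energy is $\pi^2/(2L^2)$. The quadrature (midpoint rule), the finite-difference second derivative and all names are assumptions made for this illustration.

```python
import math

def energy_expectation(psi, length, n_grid=2000):
    """Approximate <psi|H psi> / <psi|psi> for H = -1/2 d^2/dx^2 on [0, length].

    Midpoint rule for the integrals, central finite difference for psi''.
    psi is evaluated slightly outside the interval at the edge points, which
    is harmless for the smooth trial functions used below.
    """
    h = length / n_grid
    numerator = 0.0
    denominator = 0.0
    for k in range(n_grid):
        x = (k + 0.5) * h
        second_derivative = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h**2
        numerator += psi(x) * (-0.5 * second_derivative) * h
        denominator += psi(x) ** 2 * h
    return numerator / denominator

LENGTH = 1.0
exact = math.pi**2 / (2.0 * LENGTH**2)

def trial_parabola(x):
    return x * (LENGTH - x)                 # vanishes at both boundaries

def trial_sine(x):
    return math.sin(math.pi * x / LENGTH)   # the exact ground state

e_parabola = energy_expectation(trial_parabola, LENGTH)
e_sine = energy_expectation(trial_sine, LENGTH)
print(exact, e_parabola, e_sine)
```

The parabola gives an energy of about $5/L^2$, slightly above the exact value of about $4.93/L^2$, whereas the sine reproduces the exact value up to the discretization error. Improving the trial function can only lower the energy towards $E_0$.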
The general definition of the electronic-structure problem by Eqs. (2.1) and (2.2) is quite abstract, and for the unacquainted reader it might be hard to imagine how difficult solving this problem really is. To make things more tangible, we will put Hamiltonian (2.1) aside for a moment and consider the problem of two electrons in one spatial dimension and without spin.
The one-particle problem
We start with one particle in a "box," which we define simply by constraining the domain of all quantities to an interval $[0, L] \subset \mathbb{R}$ with zero-boundary conditions. The energy of this system is described by the Hamiltonian
\[ \hat{H} = \hat{H}(x) = -\tfrac{1}{2}\partial_x^2, \tag{2.3} \]
that is just the kinetic energy operator $\hat{t}(x) = -\frac{1}{2}\partial_x^2$ on the domain. According to the prescription of standard many-body quantum mechanics, we can describe the state of the system by a wave function
\[ \Psi(x) \in h^1[L], \tag{2.4} \]
where we denote with $h^1[L]$ the Hilbert space of square-integrable functions. To determine the ground state $\Psi_0$ of this system, we employ the variational principle
\[ E_0 = \langle \Psi_0 | \hat{H} \Psi_0 \rangle = \inf_{\Psi \in C} \langle \Psi | \hat{H} \Psi \rangle. \tag{cf. 2.2} \]
This means (assuming no degeneracy) that $\Psi_0$ is the one (normalized) wave function with the lowest energy expectation value chosen from the set $C \subset h^1[L]$ of all allowed wave functions, which we call the configuration space.
We can solve this minimization problem simply by diagonalizing the kinetic energy operator, i.e.,
\[ -\tfrac{1}{2}\nabla_x^2 \psi_i(x) = \frac{k_i^2}{2} \psi_i(x), \tag{2.5} \]
where the $\psi_i(x) = \sqrt{2/L}\, \sin(x k_i)$ are the "box states" with the momenta $k_i = i\pi/L$. Thus, the configuration space is $C = \mathrm{span}(\psi_1, \psi_2, \dots)$ and the ground state $\Psi_0 = \psi_1(x)$ is just the box state with the lowest allowed momentum $k_1 = \pi/L$.
Footnote: Equivalently, we can derive Hamiltonian (2.1) from the cavity-QED Hamiltonian, cf. Eq. (1.37).
Footnote: We generally indicate operators by the '^' symbol. However, because we work in the real-space picture, we will often drop the symbol for operators that obviously only depend on the position, such as $\hat{r} \equiv r$ or $\hat{v}(\hat{r}) \equiv v(r)$. The reason is that they act multiplicatively and thus are represented by a function.
Footnote: Both follow from the proof of self-adjointness of $\hat{H}$, which is valid for a very large class of local potentials $v(r)$ (Kato-Rellich theorem). See for example Ref. [135, part 6].
Footnote: Note that for certain potentials $v$, it can be proven that the ground state exists and thus we can replace the inf by a min [135, part 6.2]. This includes the important case where $v(r) = -\sum_{i}^{N_n} \frac{Z_i}{|r - R_i|}$ describes the electrostatic potential of $N_n$ nuclei at the positions $R_i$.
The two-particle problem on a first glance
Let us now consider two particles in a box. Their energy is described by the Hamiltonian
\[ \hat{H} = \hat{H}(x_1, x_2) = \sum_{i=1}^{2} -\tfrac{1}{2}\partial_{x_i}^2, \tag{2.6} \]
that is just the sum of the kinetic energy operators for each particle (with index $i$) on the domain. To describe the state of the system, we need a wave function that depends on two coordinates, i.e.,
\[ \Psi(x_1, x_2) \in h^2. \tag{2.7} \]
The linear structure of the theory now suggests that we define the two-body Hilbert space
\[ h^2[L] = h^1[L] \otimes h^1[L] \tag{2.8} \]
simply as the product of the one-particle space $h^1[L]$. This is equivalent to employing the wave-function ansatz
\[ \Psi(x_1, x_2) = \sum_{i,j=1}^{\infty} c_{ij}\, \psi_i(x_1) \psi_j(x_2), \tag{2.9} \]
with coefficients $c_{ij} \in \mathbb{C}$. Normalization requires $\|\Psi\|^2 = \sum_{i,j} |c_{ij}|^2 = 1$. Thus, if we perform the minimization over $C = h^2[L]$, we find the ground state
\[ \Psi_0(x_1, x_2) = \psi_1(x_1) \psi_1(x_2), \]
with eigenvalue $E_0 = \pi^2/L^2$. Obviously, this state does not describe a valid configuration of electrons, because it does not adhere to the Pauli principle.
Footnote: A function $f: [a, b] \to \mathbb{C}$ is square-integrable if its ($L^2$-)norm is finite, i.e., $\int_a^b |f(x)|^2\, \mathrm{d}x < \infty$. This is a necessary condition for the interpretation of $|\psi(x)|^2$ as probability density.
Footnote: Note that if $\Psi_0$ exists and is not degenerate, it must be an eigenstate of $\hat{H}$, because $\hat{H}$ is self-adjoint on $h^1[L]$ and thus has, according to the spectral theorem [135, part 1], an eigenrepresentation that is bounded from below, i.e., $\hat{H} = \sum_{i=0} E_i |\tilde{\Psi}_i\rangle\langle\tilde{\Psi}_i|$ with $E_0 < E_1 \leq E_2 \dots$. If we expand $\Psi = \sum_i c_i \tilde{\Psi}_i$, we see that the lowest energy is given by $\Psi_0 \equiv \tilde{\Psi}_0$. If there is degeneracy, i.e., $E_0 = E_1 = \dots = E_m$ for some finite $m$, then $\Psi_0 = \sum_{i=0}^{m} c_i \tilde{\Psi}_i$. However, it might be that $\Psi_0 \notin h^1[L]$ and thus the system cannot reach its infimum. For instance, this happens in the free-space case, where $L \to \infty$.
What differentiates many from one: exchange symmetry
The reason is that we have not performed the minimization over the correct configuration space. Characterizing this space is a crucial step to solve the many-electron minimization problem. In fact, the physically correct configuration space is considerably smaller than $h^2$. The true ground state of the "two electrons in a box" problem is the lowest eigenstate of $\hat{H}$ that is also antisymmetric under the exchange of the coordinates, i.e.,
\[ \Psi(x_1, x_2) = -\Psi(x_2, x_1). \tag{2.10} \]
We call this the exchange symmetry, which plays a fundamental role in all theoretical descriptions of quantum many-body systems. This is especially true in the realm of first-principles methods, where we do not only utilize the variational principle as a formal definition, but actually perform minimizations of the form (2.2) with large-scale numerical algorithms. To do so, we need to parametrize $C$ in some way, and for that, the antisymmetry of electrons is one of the most important tools, as we will see in the following.
How to parametrize the antisymmetric space: Slater determinants
Let us therefore regard again our wave-function ansatz (2.9) with the expansion coefficients $c_{ij}$. Since the indices correspond explicitly to the particle coordinates, we can simply transfer the condition (2.10) to the coefficients by demanding $c_{ij} = -c_{ji}$. We can subsume this condition into the expansion and get the more common form
\[ \Psi(x_1, x_2) = \frac{1}{\sqrt{2}} \sum_{i,j} c_{ij} \left( \psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2) \right) = \frac{1}{\sqrt{2}} \sum_{i,j} c_{ij}\, |\psi_i(x_1)\psi_j(x_2)|_- \tag{2.11} \]
of an expansion in terms of the (two-body) Slater determinants $\Psi_{ij}(x_1, x_2) = \frac{1}{\sqrt{2}} \left( \psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2) \right) = \frac{1}{\sqrt{2}} |\psi_i \psi_j|_-$. The set of all these two-body Slater determinants spans the antisymmetric two-particle Hilbert space
\[ h^2_A = h^1 \wedge h^1 \equiv \mathcal{A} h^2, \tag{2.12} \]
where $\mathcal{A}$ turns the tensor product $\otimes$ into an antisymmetric tensor product $\wedge$ or "antisymmetrizes" a given tensor [136].
Footnote: We redefine here the coefficients $c_{ij} \to c_{ij}/\sqrt{2}$.
Thus, we can parametrize the correct configuration space by employing the ansatz (2.11). This means that we calculate the matrix elements $H_{ij} = \langle \psi_i | \hat{H} \phi_j \rangle$ and reformulate the original minimization problem (2.2) in terms of the coefficients:
\[ E[c_{ij}] = \langle \Psi[c_{ij}] | \hat{H} \Psi[c_{ij}] \rangle. \tag{2.13} \]
To calculate the 2-body matrix elements $H_{ij}$, let us see how $\hat{H}$ acts on one Slater determinant built from the box states.
We have
\[ \hat{H} \left( \psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2) \right) = \left[ \sum_{l=1}^{2} -\tfrac{1}{2}\partial_{x_l}^2 \right] \left( \psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2) \right) \]
\[ = \left[ -\tfrac{1}{2}\partial_{x_1}^2 \psi_i(x_1) \right] \psi_j(x_2) - \left[ -\tfrac{1}{2}\partial_{x_1}^2 \psi_j(x_1) \right] \psi_i(x_2) + \left[ -\tfrac{1}{2}\partial_{x_2}^2 \psi_j(x_2) \right] \psi_i(x_1) - \left[ -\tfrac{1}{2}\partial_{x_2}^2 \psi_i(x_2) \right] \psi_j(x_1) \]
\[ = \left( \frac{k_i^2}{2} + \frac{k_j^2}{2} \right) \left( \psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2) \right). \]
We see that Slater determinants constructed from the eigenstates of $\hat{t}(x)$ are the eigenfunctions of the many-body Hamiltonian $\hat{H}(x_1, x_2) = \hat{t}(x_1) + \hat{t}(x_2)$. Thus, we know that the ground state is given by $c_{ij} = 1$ for $i = 1$, $j = 2$ and $c_{kl} = 0$ otherwise, i.e., the ground state of $\hat{H}$ is a single Slater determinant constructed from the lowest two eigenstates of $\hat{t}(x)$.
Including a local potential
Let us now generalize the description and consider a local potential $\hat{v}(x)$. Since the potential should now describe the geometry of the problem, we can formally assume $L \to \infty$. The many-body Hamiltonian then reads
\[ \hat{H} = \sum_{i=1}^{2} \left[ \hat{t}(x_i) + \hat{v}(x_i) \right]. \tag{2.14} \]
We can straightforwardly generalize our solution strategy from before if, instead of Eq. (2.5), we consider the eigenvalue problem
\[ [\hat{t}(x) + \hat{v}(x)]\, \psi_i(x) = e_i \psi_i(x). \tag{2.15} \]
The physical ground state is simply given as the Slater determinant $\Psi_0(x_1, x_2) = \frac{1}{\sqrt{2}} |\psi_1 \psi_2|_-$ with ground-state energy $E_0 = e_1 + e_2$.
Solving a minimization problem in practice: the basis set
For almost all choices of $v$, Eq. (2.15) cannot be solved analytically and requires a numerical solver. This means that we need to parametrize the single-particle states (orbitals) $\psi_i$ in a known basis set with, say, $B$ elements, i.e., $\psi_i = \sum_{j=1}^{B} b_{ji} \xi_j$. An obvious choice would be a (sufficiently large) discretized interval in real space, such that $\xi_j(x) = \theta_j(x)$, where $j = 1, \dots, B$ denote the $B$ grid points and $\theta_j(x)$ defines the discretization. The differential operators would then be approximated by, e.g., finite differences (see Ch. 7). With increasing $B$, the approximate solutions will converge to the exact results. We call such a procedure convergence with respect to the basis set.
Footnote: A possible choice would be $\theta_j(x) = d_x^{-1}\, \Theta(d_x/2 - |x_j - x|)$, where $d_x$ is the spacing and $\Theta$ denotes the Heaviside step function.
Of course, we can also choose a different basis, and in fact, discretizing real space is often not the best choice. For example, we might need a very large $B$ to converge with respect to the basis set. Another option is to employ the eigenfunctions of the kinetic energy operator, which for $L \to \infty$ are plane waves [137, part 2]. Employing this basis is equivalent to describing the whole problem in Fourier space. The obvious advantage of such a choice is the analytically known form of the basis for all $x \in \mathbb{R}$, and thus the description is not constrained by a box of finite volume. However, the accurate description of small distances will be inefficient in this basis. In fact, plane waves are a standard tool to describe the bulk of many condensed-matter systems, which are modelled as infinitely extended periodic systems [137]. These two examples demonstrate how strongly the basis choice depends on our knowledge of the problem.
Plane waves and discretized real-space bases are in this sense the most generic bases. To describe, for instance, finite systems, usually neither of the two is used, but so-called atomic orbitals. These are specially designed bases with the aim to reduce the necessary number of orbitals to a minimum. The basic idea is to make use of the analytically known eigenstates of the hydrogen atom, and the most straightforward approach is thus to use the lowest, say, $B_a$ of these eigenfunctions centered at the positions of all the $N_n$ nuclei of the system. The basis set then consists of $B = B_a N_n$ such atomic orbitals, which is in most cases considerably smaller than any basis set obtained from a parametrization of the discretized real space. Another very common approach is to employ Gaussian functions instead of the hydrogen orbitals because of their convenient mathematical properties. These are the two most employed strategies to obtain quantum-chemical basis sets. However, we want to stress that real-space bases are also used in practice, because they allow for a description of systems with less foreknowledge and a considerably easier visualization [138].
The interacting problem
To finalize our survey, there is one last missing piece, which is the inclusion of the interaction. The repulsion between two electrons in 3d is described by the Coulomb operator that reads (employing the coordinates $\mathbf{r}_1, \mathbf{r}_2$)
$$\hat{w}(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{|\mathbf{r}_1-\mathbf{r}_2|}. \tag{2.16}$$
This is a two-body operator, because it depends on two coordinates, but in contrast to, e.g., Eq. (2.14), it cannot be expressed as a sum of two one-body operators. We call this the order of an operator: $\hat{w}$ has order two, but for instance $\sum_{i=1}^{2}[\hat{t}(x_i)+\hat{v}(x_i)]$ has order one. It is a fundamental property of interaction operators that their order is larger than one. Importantly, this spoils our solution strategy from above.
To demonstrate this, let us consider a generic interaction operator in 1d, $\hat{w}(x_1,x_2)$, that we add to Hamiltonian (2.14), i.e.,
$$\hat{H} = \sum_{i=1}^{2}\left[\hat{t}(x_i)+\hat{v}(x_i)\right] + \hat{w}(x_1,x_2). \tag{2.17}$$
Footnote: The interested reader is referred to, e.g., the corresponding chapters in the textbook of Helgaker et al. [132].
Footnote: Also in this work, see Ch. 6.
Footnote: Note that the one-dimensional version of the Coulomb operator, $w(x_1,x_2) = 1/|x_1-x_2|$, has very different properties than its 3d version. Thus, in 1d-model calculations the soft-Coulomb potential $w(x_1,x_2) = 1/\sqrt{(x_1-x_2)^2+\epsilon^2}$ with $\epsilon > 0$ [...]
[...] $\hat{t}+\hat{v}$ that we know how to calculate numerically. However, Slater determinants of these orbitals cannot be eigenstates of the interacting problem and thus, we have to consider the general ansatz
$$\Psi(x_1,x_2) = \frac{1}{\sqrt{2}}\sum_{i,j} c_{ij}\left(\psi_i(x_1)\psi_j(x_2)-\psi_j(x_1)\psi_i(x_2)\right) = \sum_{i,j} c_{ij}\Psi_{ij}. \tag{cf. 2.11}$$
If we apply the full Hamiltonian to one of these Slater determinants $\Psi_{ij}$,
$$\hat{H}\Psi_{ij} = \Big[\sum_{k=1}^{2}\big(\hat{t}(x_k)+\hat{v}(x_k)\big)+\hat{w}(x_1,x_2)\Big]\Psi_{ij} = \underbrace{\Big[\sum_{k=1}^{2}\big(\hat{t}(x_k)+\hat{v}(x_k)\big)\Big]\Psi_{ij}}_{(e_i+e_j)\Psi_{ij}} + \hat{w}(x_1,x_2)\frac{1}{\sqrt{2}}\left(\psi_i(x_1)\psi_j(x_2)-\psi_j(x_1)\psi_i(x_2)\right),$$
we see that the interaction term couples the two orbitals of $\Psi_{ij}$ for every $i,j$. This means that (without further information on the system, such as symmetries) we cannot simplify the general ansatz (2.11) for an interacting problem. Even if we choose a highly optimized orbital basis, we have to perform the minimization
$$E[c_{ij}] = \langle\Psi[c_{ij}]|\hat{H}\Psi[c_{ij}]\rangle \tag{cf. 2.13}$$
with respect to, in principle, all entries of the coefficient matrix $c_{ij}$.
Let us now transfer the concepts and tools from the two-electron example to the general case of $N$ electrons in three spatial dimensions, including spin. Thus, we consider spin-spatial coordinates $x = (\mathbf{r},\sigma)$, where $\mathbf{r}\in\mathbb{R}^3$ describes a position and $\sigma\in\{\uparrow,\downarrow\}$ is the spin quantum number.
The antisymmetric N-electron space
We start with the definition of the antisymmetric $N$-particle space, i.e., the generalization of Eq. (2.12). To this end, we consider again the single-particle Hilbert space $\mathfrak{h}$ that is spanned by some spin-spatial orbital basis $\{\bar{\psi}_i(x)\}$ (where we assume the basis states to be orthonormalized, i.e., $\int\mathrm{d}\mathbf{r}\,\bar{\psi}^*_i(\mathbf{r})\bar{\psi}_j(\mathbf{r}) = \delta_{ij}$), such that every single-particle state can be expressed as
$$\Psi(x) = \sum_{i=1}^{\infty} c_i\bar{\psi}_i(x), \tag{2.18}$$
where the $c_i\in\mathbb{C}$ are the expansion coefficients (with the corresponding sum rule $\sum_i |c_i|^2 = 1$). From $\mathfrak{h}$, we construct the Hilbert spaces for more particles as antisymmetric product spaces, i.e.,
$$\mathfrak{h}^N_A = \bigwedge_{l=1}^{N}\mathfrak{h}. \tag{2.19}$$
The basis states of $\mathfrak{h}^N_A$ are the $N$-body Slater determinants
$$\Psi_I(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\sum_{\pi_j\in P_N}(-1)^j\,\bar{\psi}_{\pi_j(I_1)}(x_1)\cdots\bar{\psi}_{\pi_j(I_N)}(x_N) \equiv \frac{1}{\sqrt{N!}}\,|\bar{\psi}_{I_1},\dots,\bar{\psi}_{I_N}|_-, \tag{2.20}$$
where $P_N$ denotes the permutation group on $N$ elements and the index $j$ is chosen such that it is even (odd) for an even (odd) permutation $\pi_j\in P_N$. Thus, every wave function $\Psi(x_1,\dots,x_N)\in\mathfrak{h}^N_A$ can be parametrized as
$$\Psi(x_1,\dots,x_N) = \sum_I c_I\Psi_I(x_1,\dots,x_N) = \Psi[c_I], \tag{2.21}$$
where $c_I = c_{I_1,\dots,I_N}\in\mathbb{C}$ is the coefficient of the $I$-th Slater determinant, which includes the orbitals with indices $I_1,\dots,I_N$. The corresponding sum rule is $\sum_I |c_I|^2 = 1$. Note that the introduction of Slater determinants is one of the few known ways to guarantee the antisymmetry of a wave function and it is thus a fundamental tool of most electronic-structure theories.
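The antisymmetry built into Eq. (2.20) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes spinless one-dimensional coordinates and uses particle-in-a-box functions as a hypothetical orthonormal orbital basis (the names `orbital` and `slater_determinant` are illustrative). It evaluates a three-particle Slater determinant, exchanges two coordinates, and also builds a determinant in which one orbital appears twice.

```python
import numpy as np
from math import factorial


def orbital(i, x):
    """Hypothetical orthonormal 1d orbitals: particle-in-a-box states on [0, 1]."""
    return np.sqrt(2.0) * np.sin((i + 1) * np.pi * x)


def slater_determinant(indices, coords):
    """Evaluate Psi_I(x_1, ..., x_N) = det[psi_{I_k}(x_l)] / sqrt(N!)."""
    n = len(indices)
    matrix = np.array([[orbital(i, x) for x in coords] for i in indices])
    return np.linalg.det(matrix) / np.sqrt(factorial(n))


coords = [0.13, 0.42, 0.77]
psi = slater_determinant([0, 1, 2], coords)

# Exchanging the coordinates of particles 1 and 2 flips the sign.
psi_swapped = slater_determinant([0, 1, 2], [coords[1], coords[0], coords[2]])

# Using the same orbital twice gives a vanishing wave function.
psi_pauli = slater_determinant([0, 1, 1], coords)

print(psi, psi_swapped, psi_pauli)
```

The determinant form makes both properties automatic: swapping two coordinates swaps two columns of the matrix, and a repeated orbital index produces two identical rows.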
The closed-shell setting
In the following, we will usually consider the special case of closed-shell systems, which is very common in electronic-structure theory [132] and covers a broad range of practically relevant systems. For that, we assume that $N$ is even and define
$$\bar{\psi}_{2i}(\mathbf{r},\sigma) = \psi_i(\mathbf{r})\alpha(\sigma), \qquad \bar{\psi}_{2i+1}(\mathbf{r},\sigma) = \psi_i(\mathbf{r})\beta(\sigma), \tag{2.22}$$
where $\{\alpha,\beta\}\equiv\{\uparrow,\downarrow\}$ denote the two spin functions. This means that each spatial orbital is occupied twice and the system is spin-saturated. We can thus remove the spin coordinate from our description (by a trivial integration).
The ground state of a many-electron system
With the ansatz (2.21), we can reformulate the original minimization problem (2.2) in the following way: First, we choose a specific basis $\{\bar{\psi}_j(x)\}$, expand $\Psi = \Psi[c_I]$ in Slater determinants of this basis according to Eq. (2.21) and calculate the general energy expression as a functional of the expansion coefficients $c_I$, i.e.,
$$E[c_I] = \langle\Psi[c_I]|\hat{H}\Psi[c_I]\rangle. \tag{2.23}$$
To find the ground state, we then minimize $E[c_I]$ with respect to the $c_I$. The solution $c^0_I$ defines $\Psi_0 = \Psi[c^0_I]$.
Footnote: For instance, this excludes systems where certain magnetic effects or spin-orbit coupling play a role [137]. The generalization to odd $N$ and spin-dependent Hamiltonians is in principle straightforward. However, approximate methods for the spin-restricted case are generally better understood.
Footnote: We choose this nomenclature because it is very common in the quantum-chemistry literature.
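To make the minimization of Eq. (2.23) concrete, here is a minimal sketch under stated assumptions: the expansion is truncated to a handful of determinants and the Hamiltonian matrix $H_{IJ} = \langle\Psi_I|\hat{H}\Psi_J\rangle$ is replaced by a random Hermitian stand-in (`h_matrix` is hypothetical, not a physical Hamiltonian). For a linear parametrization with the sum rule $\sum_I|c_I|^2=1$, minimizing $E[c_I]$ is equivalent to finding the lowest eigenvector of this matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6  # number of Slater determinants kept in the truncated expansion

# Hypothetical Hamiltonian matrix H_IJ; a random Hermitian matrix as stand-in.
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
h_matrix = (a + a.conj().T) / 2


def energy(c):
    """E[c] = <Psi[c]| H Psi[c]>, with c rescaled to satisfy sum_I |c_I|^2 = 1."""
    c = c / np.linalg.norm(c)
    return float(np.real(c.conj() @ h_matrix @ c))


# The minimizer of E[c] is the eigenvector belonging to the lowest eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(h_matrix)
e0, c0 = eigenvalues[0], eigenvectors[:, 0]

# No trial coefficient vector can yield an energy below e0.
trial_energies = [
    energy(rng.normal(size=dim) + 1j * rng.normal(size=dim)) for _ in range(1000)
]
print(e0, energy(c0), min(trial_energies))
```

In a real calculation, building `h_matrix` is the expensive step, since the number of determinants grows rapidly with the number of particles and orbitals.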
The non-interacting case
Importantly, the strategy just described simplifies strongly for every system that is described by operators of order one, i.e., operators that can be decomposed as
$$\hat{O}(\mathbf{r}_1,\dots,\mathbf{r}_N) = \sum_{i=1}^{N}\hat{o}(\mathbf{r}_i). \tag{2.24}$$
An important example are non-interacting systems with the Hamiltonian
$$\hat{H} = \sum_{i=1}^{N}\left[-\tfrac{1}{2}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i)\right]. \tag{2.25}$$
To find the ground state of $\hat{H}$, we simply have to calculate the $N/2$ (assuming that $N$ is even) lowest eigenstates of $-\tfrac{1}{2}\nabla^2_{\mathbf{r}} + v(\mathbf{r})$ and construct a Slater determinant out of those, using Eq. (2.22). This construction of a many-body state by successively "filling up" one-body states is often called the aufbau principle (see Sec. 2.1.3).
Thus, the non-interacting $N$-electron problem (and any other problem of order one) effectively reduces to $N$ one-electron or orbital problems. With state-of-the-art algorithms and high-performance clusters, such eigenvalue problems for orbitals (and nonlinear versions of them) can be solved very efficiently even for spatially very extended systems (that in turn require large orbital bases). We call a wave function consisting of only one Slater determinant a single-reference wave function.
Interaction and the exponential wall
As in the two-particle case, any form of interaction spoils this simple solution strategy. For instance, the electronic-structure Hamiltonian (Eq. (2.1)) includes the Coulomb interaction operator $\frac{1}{2}\sum_{i\neq j}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|}$ that has order two. This operator in principle couples all Slater determinants of the general (or multi-reference) ansatz
$$\Psi(x_1,\dots,x_N) = \sum_I c_I\Psi_I(x_1,\dots,x_N), \tag{cf. 2.21}$$
with the $N$-dimensional coefficient tensor
$$c_I = c_{I_1,\dots,I_N}. \tag{2.26}$$
Thus, we cannot simplify the expansion as we have done for the non-interacting case and need to consider the full coefficient tensor. If we consider $M$ basis states, this tensor has a dimension of
$$D = M^N, \tag{2.27}$$
which grows exponentially with the particle number. If we wanted to compute the wave function for, say, a benzene molecule (C$_6$H$_6$) that has $N = 6\cdot 6+6 = 42$ electrons with a small basis set of, say, $M = 10$ orbitals, the coefficient tensor would have $D = 10^{42}$ entries, which could hardly be determined by any imaginable computation system. For a system with $N = 80$ electrons, we approach the estimated number of particles in the universe. Clearly, we can exclude some of these entries by smart basis choices and symmetry considerations. State-of-the-art methods have been employed to calculate the (real-space) many-body wave function [...] However, the scaling of Eq. (2.27) remains and represents an "exponential wall", as Nobel laureate Walter Kohn called it [13].
Footnote: Source: , accessed 15.07.2020.
The problem of the exponential wall is so severe that the concept of the many-body wave function for larger systems has basically no value beyond its use for certain theoretical considerations. Walter Kohn went so far as to suggest that
"in general the many-electron wavefunction $\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N)$ for a system of $N$ electrons is not a legitimate concept, when $N > N_0$, where $N_0 \approx$ [...]
[...] $N_0$ seems even understated. In practice, this means that most of our knowledge of electronic structure relies on alternative descriptions that carry less information than the full many-body wave function.
Since exchange symmetry will play a central role in the course of this thesis, we want to make a little detour and recapitulate the concept of exchange symmetry. This more historical excursion supplements the chapter and is not necessary to follow the course of this text. The fast reader can directly continue with Sec. 2.2.
The origin of quantum indistinguishability
To the best of our knowledge, all electrons have identical physical properties, i.e., the same mass, charge and spin, and thus a valid definition of a many-body Hamiltonian like Eq. (2.6) or Eq. (2.1) must be symmetric with respect to the exchange of particle coordinates. From the symmetry of the Hamiltonian we can deduce the exchange symmetry or "special circumstances" of the wave function, as Wolfgang Pauli called it in one of his famous articles on quantum mechanics:
Wenn wir es mit mehreren gleichartigen Teilchen zu tun haben, treten besondere Verhältnisse ein, die daher rühren, daß der Hamiltonoperator stets invariant ist bei irgendwelchen Vertauschungen der Teilchen.
If we deal with many similar [indistinguishable] particles, special circumstances occur that arise from the Hamilton operator being invariant under any permutations of particles. (Pauli, 1933 [140])
It is clear that a classical system consisting of many identical particles would also be described by a symmetric Hamiltonian. However, such classical particles (imagine billiard balls) are distinguishable, even if they have identical physical properties. The reason is that we can in principle locate their position at any instant of time and collect these positions in a trajectory. This allows us to define a label for every trajectory, which in turn distinguishes the balls. The situation changes if we employ, e.g., a statistical description. This means that instead of describing explicitly the trajectory of each billiard ball, we employ a probability distribution that describes the probability to find a billiard ball at a certain position at a given instant of time. In such a description, we obviously lose the distinguishability.
The most common formulations of quantum mechanics are also probabilistic and thus they are (at [...]
Footnote: In certain (usually effectively one-dimensional) model descriptions, the many-body wave function has been calculated even for larger $N$. See for example Motta et al. [139].
Footnote: For a discussion of possible alternative approaches, see for example Ref. [141].
A typical argument why quantum particles must be indistinguishable extends the trajectory example above: the knowledge of a trajectory in time and space contradicts Heisenberg's uncertainty principle [142]. A less "hand-wavy" argument by means of the counting of states in a fictitious experiment was given by Dirac already in the very beginnings of the development of quantum mechanics [143]. For a very detailed discussion of this argument, the reader is referred to the book by Stefanucci and Van Leeuwen [144, part 1].
The consequence of indistinguishability: particle statistics
To the best of our knowledge, the indistinguishability of quantum particles has remained unquestioned until today and is usually regarded as a fact [142, 137, 144]. This means in practice that although we formally need to introduce (labelled) coordinates to describe $N$ electrons, we have to make sure that the predictions of the theory cannot be used to differentiate the electrons. Since only expectation values of observable operators correspond to possible measurements in experiments, we must guarantee indistinguishability on this level. For instance, to obtain the expectation value of the energy of a two-particle system, we need to calculate the integral
$$E = \int\mathrm{d}x_1\,\mathrm{d}x_2\,\Psi^*(x_1,x_2)\hat{H}(x_1,x_2)\Psi(x_1,x_2). \tag{2.28}$$
Since the wave function occurs quadratically in such expressions, we have more freedom to perform such a symmetrization on the level of $\Psi$ than on the operator level. Dirac [143] concluded already in 1926 that, as a consequence, many-body wave functions need to be either symmetric or antisymmetric under a pair-wise permutation $P_{x,y}$ of two coordinates $x, y$:
$$P_{x_1,x_2}\Psi(x_1,x_2) = \Psi(x_2,x_1) \overset{!}{=} \Psi(x_1,x_2) \quad\longleftrightarrow\quad \text{symmetry} \tag{2.29a}$$
$$P_{x_1,x_2}\Psi(x_1,x_2) = \Psi(x_2,x_1) \overset{!}{=} -\Psi(x_1,x_2) \quad\longleftrightarrow\quad \text{antisymmetry.} \tag{2.29b}$$
These symmetry properties define the two fundamental particle classes that we consider until today: we call particles that are described by a symmetric or antisymmetric many-particle wave function bosons or fermions, respectively. In Ref. [143], Dirac further connected the fundamental exchange symmetry of the particle to its statistical properties: symmetric particles follow Bose-Einstein statistics [145, 146] and antisymmetric particles follow Fermi-Dirac statistics [147]. The different particle statistics are undoubtedly one of the most important elements of quantum theory and alone, i.e., without taking into account many other elements of the theory, they have an impressive explanatory power. Einstein showed for example already in 1925 [146], i.e., before Heisenberg and Schrödinger formulated the first well-defined theories of quantum mechanics [148, 149], that Bose-Einstein gases exhibit a phase state, the condensate, which is characterised [...]
Footnote: However, ontologically there are huge differences between classical mechanics and quantum mechanics, where the details depend on the specific interpretation of quantum mechanics. We refer the interested reader to the well-documented Wikipedia article on the topic and the references therein: https://en.wikipedia.org/wiki/Interpretations_of_quantum_mechanics .
Footnote: From this argument it follows that quantum particles that are very far away from each other (such that they can be described by separate wave functions that do not overlap) are indeed distinguishable. Dicke used this argument when he introduced his model of $N$ two-level systems coupled to one photon mode [72].
Footnote: Note that there are effective Hamiltonians that have even more complicated symmetries, e.g., the effective Hamiltonian of certain topologically nontrivial systems. See Ref. [137, part 9] for a good overview of such scenarios.
Footnote: Note that particles with different symmetry properties can be defined (see also the previous footnote). However, such definitions require some kind of effective theory and thus the corresponding particles are usually not considered as fundamental. This also holds for the polaritons that we will introduce in part II. They are electron-photon hybrid particles.
Pauli's principle
Regarding the explanatory potential of fermion statistics, many phenomena could be named, especially in the realm of electronic-structure theory, where electrons are described as fermions. Most importantly, one can show that fermions obey Pauli's exclusion principle, which is a crucial element of the modern model of atoms. Pauli defined his principle in the following way:
Es kann niemals zwei oder mehrere äquivalente Elektronen im Atom geben, für welche [...] die Werte aller Quantenzahlen [...] übereinstimmen. Ist ein Elektron im Atom vorhanden, für das diese Quantenzahlen bestimmte Werte haben, so ist dieser Zustand "besetzt".
There can never be two or more equivalent electrons in the atom for which [...] the values of all quantum numbers [...] coincide. If there is an electron in the atom for which these quantum numbers have certain values, this state is "occupied." (Pauli, 1925 [151])
A remarkable deduction from Pauli's principle is, for example, the explanation of the stability of matter. Lieb [152] showed under very weak assumptions, employing the atom model of electrons and static nuclei, that Pauli's principle is necessary and sufficient to guarantee that any matter system built from such constituents is stable, i.e., it has a well-defined ground state.
A historical example of the importance of Pauli’s principle
To illustrate the importance of Pauli's principle, let us make a very brief detour to the 1930s that is based on the first chapters of the book by Gavroglu and Simões [153]. At that time, chemistry and physics were two (almost completely) separate research fields. But with the formulation of quantum mechanics and the corresponding model of the atom, physicists paved the way for establishing quantum chemistry. In this research field, which has been growing ever since, researchers try to understand physical models and experiments of chemical systems by connecting quantum theory with the theoretical knowledge of chemistry. However, the development of quantum chemistry was initially very slow and it took a lot of time, and especially modern computers, until the real success story of the field started. Physicists like Dirac started to promote the point of view that chemistry was merely an application of physical laws, but they had to realize very quickly that this application is much more difficult than anybody could have ever imagined. Walter Kohn summarized this in his Nobel lecture in the following way:
"There is an oral tradition that [...] Dirac declared that chemistry had come to an end - its content was entirely contained in that powerful [i.e., the Schrödinger] equation. Too bad, he is said to have added, that in almost all cases, this equation was far too complex to allow solution." (Kohn, 1999 [13])
On the other hand, chemistry has always relied (and still relies) on heuristic rules and tables, like Mendeleev's periodic table [154], Hund's rule [155] or Mulliken's "correlation diagram" [156]. And most chemists doubted that the mathematically very challenging quantum theory could ever be useful for understanding such rules. The only exception is Pauli's principle, which is somewhat special in comparison to other concepts like the wave function or the Hamilton operator. In fact, Pauli's principle is a rule: Only by utilizing [...]
Footnote: Einstein already predicted this peculiar phase state in the cited paper, but it took until 1995 for its first experimental realization [150]. Since then, many more Bose-Einstein condensates have been realized experimentally and one can find a considerable amount of literature about the topic. A good introduction to the topic can be found in the textbook by Altland and Simons [137, part 6.3].
Bohr’s aufbau principle
Another consequence of fermion statistics that plays an important role in electronic-structure theory is Bohr's aufbau principle (building-up principle), which Pauli defined in the following way:
Dieses von Bohr aufgestellte Prinzip besagt, daß bei Anlagerung eines weiteren Elektrons an ein Atom die Quantenzahlen der schon gebundenen Elektronen dieselben Werte behalten, die ihnen im zugehörigen stationären Zustand des freien Atomrestes zukommen.
This principle, established by Bohr, states that when a further electron is attached to an atom, the quantum numbers of the already bound electrons keep the same values that they have in the corresponding stationary state of the free atomic core. (Pauli, 1925 [151])
The strength of the aufbau principle is especially visible in effective single-particle methods for describing many-body systems, such as HF theory (see Sec. 2.2). In such methods, we solve (nonlinear) eigenvalue equations for single-particle states (orbitals). The aufbau principle then tells us how to "occupy" these orbitals with electrons to obtain the many-body state: if our system consists of $N$ electrons, we will occupy the $N$ energetically lowest orbitals. In this sense, the aufbau principle combines the exclusion principle (we populate every state with only one electron) with the variational principle (we try to obtain the lowest energy). However, other aufbau principles inspired by Bohr's principle have been developed, which in one way or another fill up some levels. This shows the strength of the very concept of an aufbau principle in electronic-structure theory.
Footnote: See for instance Refs. [157, 158, 159].
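The aufbau filling for a non-interacting closed-shell system can be sketched in a few lines. The example below is an illustration under stated assumptions, not a method taken from this text: it uses a one-dimensional harmonic potential, a second-order finite-difference kinetic operator on a uniform grid, and the hypothetical helper names `one_body_hamiltonian` and `aufbau_ground_state`. In the spin-restricted setting, occupying the $N$ lowest spin orbitals means occupying the $N/2$ lowest spatial orbitals twice.

```python
import numpy as np


def one_body_hamiltonian(grid, potential):
    """Finite-difference matrix of -1/2 d^2/dx^2 + v(x) on a uniform grid."""
    dx = grid[1] - grid[0]
    off_diagonal = np.full(grid.size - 1, -0.5 / dx**2)
    return (np.diag(1.0 / dx**2 + potential(grid))
            + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1))


def aufbau_ground_state(grid, potential, n_electrons):
    """Occupy the N/2 lowest spatial orbitals twice (closed-shell filling)."""
    if n_electrons % 2 != 0:
        raise ValueError("closed-shell filling requires an even electron number")
    dx = grid[1] - grid[0]
    energies, orbitals = np.linalg.eigh(one_body_hamiltonian(grid, potential))
    n_occ = n_electrons // 2
    total_energy = 2.0 * energies[:n_occ].sum()
    # Rescale so that each orbital satisfies sum_k |psi(x_k)|^2 dx = 1.
    occupied = orbitals[:, :n_occ] / np.sqrt(dx)
    density = 2.0 * (occupied**2).sum(axis=1)
    return total_energy, density


grid = np.linspace(-10.0, 10.0, 801)
dx = grid[1] - grid[0]
energy, density = aufbau_ground_state(grid, lambda x: 0.5 * x**2, n_electrons=4)
print(energy, density.sum() * dx)
```

For the harmonic potential the exact orbital energies are $n + 1/2$, so four electrons should give a total energy close to $2\,(0.5 + 1.5) = 4$, and the density should integrate to the electron number.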
In the last section, we have discussed the problems connected to the $N$-electron wave function, which for a general interacting system is a superposition of arbitrary Slater determinants, i.e.,
$$\Psi(x_1,\dots,x_N) = \sum_I c_I\Psi_I(x_1,\dots,x_N). \tag{cf. 2.21}$$
There is, however, the special setting of non-interacting electrons, for which the $N$-body problem basically reduces to a one-body problem. The corresponding many-body state is given by one Slater determinant. In HF theory, we try to find the "best fit" between the non-interacting and the interacting problem by simply truncating the expansion (2.21) after the first element, i.e., we consider a single-reference wave function ansatz
$$\Psi(x_1,\dots,x_N) \approx \Psi_{\mathrm{HF}}(x_1,\dots,x_N) = \frac{1}{\sqrt{N!}}\,|\bar{\psi}_1,\dots,\bar{\psi}_N|_- \tag{2.30}$$
to describe the interacting problem.
The HF minimization problem
For our derivations, we assume spin-restriction, i.e., we consider systems with an even particle number and define
$$\bar{\psi}_{2i}(\mathbf{r},\sigma) = \psi_i(\mathbf{r})\alpha(\sigma), \qquad \bar{\psi}_{2i+1}(\mathbf{r},\sigma) = \psi_i(\mathbf{r})\beta(\sigma), \tag{cf. 2.22}$$
where $\{\alpha,\beta\}$ denote the two spin functions (see the discussion around Eq. (2.22)). This restricted HF is the most common version of HF theory. We saw already that wave functions like $\Psi_{\mathrm{HF}}$ are the solutions of non-interacting problems, i.e., $\Psi_{\mathrm{HF}}$ is the ground state of any Hamiltonian of the form
$$\hat{H}_{\mathrm{ni}} = \sum_{i=1}^{N}\left[\hat{t}(\mathbf{r}_i)+\hat{v}(\mathbf{r}_i)\right], \tag{2.31}$$
if we identify $\psi_1,\dots,\psi_{N/2}$ according to the aufbau principle with the lowest $N/2$ eigenstates of $\hat{t}+\hat{v}$. One can thus say that in single-reference methods, we approximate the interacting wave function with a non-interacting one. However, there are many possibilities for such an approximation and we have to define a qualifier that determines one out of these. In wave-function methods such as HF theory, this qualifier is always the energy (we will employ a different qualifier in Sec. 2.3). We can thus define the HF ground state by the variational principle, i.e.,
$$E_{\mathrm{HF}} = \langle\Psi_{\mathrm{HF}}|\hat{H}\Psi_{\mathrm{HF}}\rangle = \inf_{\Psi_{\mathrm{HF}}}\langle\Psi_{\mathrm{HF}}|\hat{H}\Psi_{\mathrm{HF}}\rangle, \tag{2.32}$$
with the $N$-electron electronic-structure Hamiltonian that we defined in the previous section,
$$\hat{H} = \sum_{i=1}^{N}\left[\hat{t}(\mathbf{r}_i)+\hat{v}(\mathbf{r}_i)\right] + \frac{1}{2}\sum_{i\neq j}^{N}\hat{w}(\mathbf{r}_i,\mathbf{r}_j). \tag{cf. 2.1}$$
Footnote: This statement is independent of the assumed spin-restriction of our example.
The features of HF theory: the upper energetic bound and the definition of correlation
It is clear that the ansatz $\Psi\approx\Psi_{\mathrm{HF}}$ constrains the full configuration space that we would have to consider to find the exact ground state $\Psi_0$ of $\hat{H}$. Since, due to the variational principle, $\Psi_0$ must be the wave function with the lowest possible energy expectation value, we can deduce that the HF energy
$$E_{\mathrm{HF}} \geq E_0 \tag{2.33}$$
is an upper bound of the exact many-body energy $E_0$. We call methods with this property variational. The difference between the exact and the HF energy, the so-called correlation energy
$$E_c = E_0 - E_{\mathrm{HF}} \leq 0, \tag{2.34}$$
thus must be zero or negative. Consequently, HF defines a kind of "baseline" for all electronic-structure methods, which from this perspective "only" aim to accurately describe $E_c$. Because the HF wave function corresponds to a non-interacting system, which as we have seen in the last section can be reduced to an effective one-body problem, the correlation energy is often called many-body energy.
The HF energy: the "quantum" contribution and exchange symmetry
Let us now calculate this energy expression explicitly. We have
$$E_{\mathrm{HF}} = \langle\Psi_{\mathrm{HF}}|\hat{H}\Psi_{\mathrm{HF}}\rangle = \sum_{i=1}^{N}\langle\Psi_{\mathrm{HF}}|\left[\hat{t}(\mathbf{r}_i)+\hat{v}(\mathbf{r}_i)\right]\Psi_{\mathrm{HF}}\rangle + \frac{1}{2}\sum_{i\neq j}^{N}\langle\Psi_{\mathrm{HF}}|\hat{w}(\mathbf{r}_i,\mathbf{r}_j)\Psi_{\mathrm{HF}}\rangle,$$
where the first, i.e., the one-body part (after some algebra, see Ref. [132, part 5.4]) reads
$$E^{(1)}[\{\psi_i\}] = 2\sum_{i=1}^{N/2}\int\mathrm{d}\mathbf{r}\,\psi^*_i(\mathbf{r})\left[\hat{t}(\mathbf{r})+\hat{v}(\mathbf{r})\right]\psi_i(\mathbf{r}) = 2\sum_{i=1}^{N/2}\langle\psi_i|\hat{t}+\hat{v}|\psi_i\rangle,$$
where we have defined the one-body integrals $\langle\psi_i|\hat{o}^{(1)}|\psi_i\rangle = \int\mathrm{d}\mathbf{r}\,\psi^*_i(\mathbf{r})\hat{o}^{(1)}(\mathbf{r})\psi_i(\mathbf{r})$ for a given one-body operator $\hat{o}^{(1)}$. Note that the factor 2 is due to the spin-restriction. The second term involves the two-body operator $\hat{w}(\mathbf{r},\mathbf{r}')$ and reads [132, part 5.4]
$$E^{(2)}[\{\psi_i\}] = \sum_{i,j=1}^{N/2}\left[2\int\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\,\psi^*_i(\mathbf{r})\psi^*_j(\mathbf{r}')\hat{w}(\mathbf{r},\mathbf{r}')\psi_i(\mathbf{r})\psi_j(\mathbf{r}') - \int\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\,\psi^*_i(\mathbf{r})\psi^*_j(\mathbf{r}')\hat{w}(\mathbf{r},\mathbf{r}')\psi_j(\mathbf{r})\psi_i(\mathbf{r}')\right] \equiv \sum_{i,j=1}^{N/2}\left[2\langle\psi_i\psi_j|\hat{w}|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|\hat{w}|\psi_j\psi_i\rangle\right].$$
Here, we have introduced the notation $\langle\psi_i\psi_j|\hat{o}^{(2)}|\psi_k\psi_l\rangle = \int\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\,\psi^*_i(\mathbf{r})\psi^*_j(\mathbf{r}')\hat{o}^{(2)}(\mathbf{r},\mathbf{r}')\psi_k(\mathbf{r})\psi_l(\mathbf{r}')$ to abbreviate the integrals with respect to a two-body operator $\hat{o}^{(2)}$. We call the first term of $E^{(2)}$ the Hartree or mean-field energy contribution,
$$E_H = 2\sum_{i,j=1}^{N/2}\langle\psi_i\psi_j|\hat{w}|\psi_i\psi_j\rangle, \tag{2.35}$$
which can be evaluated in two steps by first calculating the Hartree potential
$$\hat{v}_H(\mathbf{r}) = 2\sum_{j=1}^{N/2}\int\mathrm{d}\mathbf{r}'\,\psi^*_j(\mathbf{r}')\hat{w}(\mathbf{r},\mathbf{r}')\psi_j(\mathbf{r}') \tag{2.36}$$
and then the energy expression $E_H = \sum_{i=1}^{N/2}\langle\psi_i|\hat{v}_H|\psi_i\rangle$. The name mean-field stems from the usual interpretation of $\langle\hat{v}_H\rangle$ as the mean Coulomb potential of all the electrons. Importantly, $\langle\hat{v}_H\rangle$ is equivalent to the Coulomb potential of a classical charge distribution. The second term of $E^{(2)}$ instead is called the exchange (or Fock) contribution,
$$E_X = -\sum_{i,j=1}^{N/2}\langle\psi_i\psi_j|\hat{w}|\psi_j\psi_i\rangle, \tag{2.37}$$
which arises because of the negative terms of the Slater determinant and thus has no classical counterpart. The name stems from the fact that the two-body integrals occurring in Eq. (2.37) can be obtained from the integrals in Eq. (2.35) by exchanging the indices on one of the two sides. The exchange contribution thus represents a quantum-mechanical correction to the classical or mean-field energy. Although this correction is usually much smaller than the other energy terms in $E_{\mathrm{HF}}$, it makes the HF description considerably more accurate than the older Hartree theory that only takes the mean field into account [132].
Collecting all the terms, we get the following expression for the HF energy:
$$E_{\mathrm{HF}}[\{\psi_i\}] = 2\sum_{i=1}^{N/2}\langle\psi_i|\hat{t}+\hat{v}|\psi_i\rangle + \sum_{i,j=1}^{N/2}\left[2\langle\psi_i\psi_j|\hat{w}|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|\hat{w}|\psi_j\psi_i\rangle\right]. \tag{2.38}$$
We see that the HF energy is a functional of the $N/2$ orbitals $\{\psi_1,\dots,\psi_{N/2}\}$.
The HF minimization problem and the Fock operator
To minimize $E_{\mathrm{HF}}$ with respect to the orbitals, we need to guarantee their orthonormality, i.e.,
$$\langle\psi_i|\psi_j\rangle = \delta_{ij} \quad \forall\, i,j. \tag{2.39}$$
The standard way to do so is to constrain the minimization of the functional by introducing a Lagrange multiplier, $\tilde{\epsilon}_{ij}$, for every condition. Instead of the original minimization problem of Eq. (2.32), we minimize the Lagrangian
$$L_{\mathrm{HF}}[\{\psi_i\},(\tilde{\epsilon}_{ij})] = E_{\mathrm{HF}}[\{\psi_i\}] - 2\sum_{i,j=1}^{N/2}\tilde{\epsilon}_{ij}\left(\langle\psi_i|\psi_j\rangle-\delta_{ij}\right) \tag{2.40}$$
with respect to the orbitals and the Lagrange multipliers. As is customary, we treat the orbitals $\psi_i$ and their complex conjugates $\psi^*_i$ as independent and consider $L_{\mathrm{HF}}[\{\psi_i,\psi^*_i\},(\tilde{\epsilon}_{ij})]$. A necessary condition for a minimum of $L_{\mathrm{HF}}$ is stationarity with respect to all variables,
$$0 = \delta L_{\mathrm{HF}}[\{\psi_i\},(\tilde{\epsilon}_{ij})] = \sum_{i=1}^{N/2}\left[\int\mathrm{d}\mathbf{r}\,\frac{\delta L_{\mathrm{HF}}}{\delta\psi_i(\mathbf{r})}\delta\psi_i(\mathbf{r}) + \int\mathrm{d}\mathbf{r}\,\frac{\delta L_{\mathrm{HF}}}{\delta\psi^*_i(\mathbf{r})}\delta\psi^*_i(\mathbf{r})\right] + \sum_{i,j=1}^{N/2}\frac{\partial L_{\mathrm{HF}}}{\partial\tilde{\epsilon}_{ij}}\mathrm{d}\tilde{\epsilon}_{ij}. \tag{2.41}$$
This leads to three independent sets of conditions, where the last one just gives back the orthonormality conditions of Eq. (2.39) and the first two are equivalent to each other. For notational convenience, we utilize the second set of conditions (with respect to $\psi^*_i$) and obtain
$$\sum_{j=1}^{N/2}\tilde{\epsilon}_{ij}\psi_j(\mathbf{r}) = \left[\hat{t}+\hat{v}\right]\psi_i(\mathbf{r}) + \sum_{j=1}^{N/2}\left[2\int\mathrm{d}\mathbf{r}'\,\psi^*_j(\mathbf{r}')\hat{w}(\mathbf{r},\mathbf{r}')\psi_j(\mathbf{r}')\psi_i(\mathbf{r}) - \int\mathrm{d}\mathbf{r}'\,\psi^*_j(\mathbf{r}')\hat{w}(\mathbf{r},\mathbf{r}')\psi_i(\mathbf{r}')\psi_j(\mathbf{r})\right]. \tag{2.42}$$
To simplify this expression, we can define the Coulomb operator $\hat{J}_j$ that acts as
$$\hat{J}_j\psi_i(\mathbf{r}) = \int\mathrm{d}\mathbf{r}'\,\psi^*_j(\mathbf{r}')w(\mathbf{r},\mathbf{r}')\psi_j(\mathbf{r}')\psi_i(\mathbf{r}) \tag{2.43}$$
and the exchange operator $\hat{K}_j$ that acts as [160]
$$\hat{K}_j\psi_i(\mathbf{r}) = \int\mathrm{d}\mathbf{r}'\,\psi^*_j(\mathbf{r}')w(\mathbf{r},\mathbf{r}')\psi_i(\mathbf{r}')\psi_j(\mathbf{r}). \tag{2.44}$$
The right-hand side of Eq. (2.42) can then be expressed as
$$\hat{H}^1\psi_i(\mathbf{r}) = \left[\hat{t}+\hat{v}\right]\psi_i(\mathbf{r}) + \sum_{j=1}^{N/2}\left[2\hat{J}_j-\hat{K}_j\right]\psi_i(\mathbf{r}), \tag{2.45}$$
where we introduced the Fock operator $\hat{H}^1$, which acts on one orbital and thus can be seen as an effective one-body Hamiltonian.
Solving the HF equations: a nonlinear orbital eigenvalue problem
Importantly, $\hat{H}^1$ is hermitian and thus, we can apply a unitary transformation to Eq. (2.42) such that $\tilde{\epsilon}_{ij} = \epsilon_i\delta_{ij}$ becomes diagonal. In this canonical formulation, Eq. (2.42) takes the form of an eigenvalue equation
$$\hat{H}^1\psi_i(\mathbf{r}) = \epsilon_i\psi_i(\mathbf{r}) \tag{2.46}$$
for the HF orbitals $\psi_i$ and the orbital energies
$$\epsilon_i = \langle\psi_i|\hat{H}^1\psi_i\rangle = \langle\psi_i|\left[\hat{t}+\hat{v}\right]\psi_i\rangle + \langle\psi_i|\sum_{j=1}^{N/2}\left[2\hat{J}_j-\hat{K}_j\right]\psi_i\rangle = \langle\psi_i|\left[\hat{t}+\hat{v}\right]\psi_i\rangle + \sum_{j=1}^{N/2}\left[2\langle\psi_i\psi_j|\hat{w}|\psi_i\psi_j\rangle - \langle\psi_i\psi_j|\hat{w}|\psi_j\psi_i\rangle\right]. \tag{2.47}$$
Footnote: Note that we introduced the factor 2 in front of the Lagrange-multiplier term for later convenience.
Footnote: It is important to realize that our goal is to perform a free minimization over all possible functions $f(\mathbf{r})$ that depend on one variable. In this space, there are in principle functions with $f(\mathbf{r}) \neq f^*(\mathbf{r})$. However, our formalism is constructed such that the solution $f$ has the correct property $f = f^*$.
Footnote: Note that the Coulomb operator is connected to the Hartree potential by $\hat{v}_H = 2\sum_j\hat{J}_j$.
The set of these equations for all $N/2$ orbitals is called the Hartree-Fock equations, which have to be solved (numerically) to obtain the Hartree-Fock ground state of the Hamiltonian (2.1).
Importantly, Eq. (2.46) is not a usual, i.e., linear, eigenvalue equation. The Fock operator $\hat{H}^1 = \hat{H}^1[\{\psi_i\}]$ depends on its own eigenstates and thus the eigenvalue problem of Eq. (2.46) is in fact nonlinear. We call $\hat{H}^1$ a nonlinear one-body operator, which has very different properties from linear operators. We can utilize $\hat{H}^1$ to find the HF ground state or to determine approximate ionization potentials and electron affinities (Koopmans' theorem [132, Sec. 10.5]). However, the higher-lying "eigenstates" of $\hat{H}^1$, the so-called unoccupied orbitals, have only limited physical meaning. In fact, the validity of Koopmans' theorem is a very special feature of the Fock operator, and similar nonlinear one-body operators (see Sec. 2.4) in other electronic-structure theories do not have the same level of explanatory power. In comparison, the eigenfunctions and eigenvalues of the many-body Hamiltonian have a clear physical interpretation: they simply denote the possible states, with corresponding energy levels, that the physical system can adopt.
Another practical consequence of the nonlinearity of $\hat{H}^1$ is that solving the HF equations is numerically a very challenging task that is not comparable with an ordinary eigenvalue equation. For instance, there are simple many-body systems for which the Schrödinger equation is analytically solvable. But there is no known analytic solution of the HF equations. The only known way to solve Eq. (2.46) is a so-called self-consistent field (SCF) procedure. For that, we linearize $\hat{H}^1[\{\psi_i\}]\approx\hat{H}^1[\{\psi^0_i\}]$ locally around some starting guess of orbitals $\{\psi^0_i\}$, solve Eq. (2.46) for this $\hat{H}^1[\{\psi^0_i\}]$ to obtain new orbitals $\{\psi^1_i\}$ and update $\hat{H}^1[\{\psi^1_i\}]$. We proceed with this iterative procedure until the orbitals $\{\psi^{n+1}_i\}\approx\{\psi^n_i\}$ do not change significantly, i.e., until self-consistency of Eq. (2.46) is reached.
Although we know from experience that this procedure converges in most cases, we can neither guarantee convergence, nor can we be sure that the converged result is the global minimum of $E_{\mathrm{HF}}$ that we are searching for. The reason is again the nonlinearity of the problem, which makes a mathematical analysis very difficult. There are certain hints that $E_{\mathrm{HF}}$ is not convex, which would suggest that solutions of the HF equations could correspond to local minima, but this has not yet been proven.
The cornerstone of (molecular) electronic-structure theory
Let us finish our brief discussion of HF theory with a quotation from one of the most important textbooks on molecular electronic-structure theory:
"The Hartree-Fock wave function is the cornerstone of ab initio electronic-structure theory. [... It] yields total electronic energies that are in error by less than 1% and a wide range of important molecular properties such as dipole moments, electric polarizabilities, electronic excitation energies, magnetizabilities, force constants and nuclear magnetic shieldings are usually reproduced to within 5-10 % accuracy. Molecular geometries are particularly well reproduced and are mostly within a few picometres of the true equilibrium structure.
The Hartree-Fock wave function is often used in qualitative studies of molecular systems, particularly larger systems. Indeed, the Hartree-Fock wave function is still the only wave function that can be applied routinely to large systems, and systems containing several hundred atoms [...]
Footnote: For example the Harmonium model, which consists of two interacting electrons in a harmonic potential.
Footnote: In part III, we will present some algorithms that are capable of performing HF minimizations.
Footnote: See for example the discussion in Ref. [161] and the references therein.
Being one of the conceptually and computationally simplest electronic-structure methods, HF theory will be considered repeatedly in the following chapters, when we discuss the coupled electron-photon problem.
However, as Helgaker et al. have pointed out, in many cases HF is not sufficient for an accurate description of many-body systems and we need more accurate methods. In the next two sections, we will thus discuss some approaches that go beyond HF theory and try to include the aforementioned correlation energy in the description.
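As a supplement to the self-consistent field procedure described for Eq. (2.46), the following is a minimal sketch under stated assumptions and not the algorithm of part III: a one-dimensional model on a uniform grid, a harmonic external potential, a soft-Coulomb interaction with a hypothetical softening parameter `eps`, the closed-shell Fock operator $\hat{t}+\hat{v}+\sum_j(2\hat{J}_j-\hat{K}_j)$, and a plain fixed-point iteration without damping. The function name `restricted_hf_1d` is illustrative. Convergence of such a plain iteration is not guaranteed in general.

```python
import numpy as np


def restricted_hf_1d(grid, v_ext, n_electrons, eps=1.0, tol=1e-8, max_iter=200):
    """Plain SCF iteration for spin-restricted HF on a uniform 1d grid."""
    dx = grid[1] - grid[0]
    n_occ = n_electrons // 2
    off_diagonal = np.full(grid.size - 1, -0.5 / dx**2)
    # One-body part t + v (second-order finite differences).
    h = (np.diag(1.0 / dx**2 + v_ext(grid))
         + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1))
    # Soft-Coulomb interaction w(x, x') tabulated on the grid (model assumption).
    w = 1.0 / np.sqrt((grid[:, None] - grid[None, :])**2 + eps**2)

    # Starting guess: orbitals of the non-interacting problem.
    orbitals = np.linalg.eigh(h)[1][:, :n_occ] / np.sqrt(dx)
    energy_old = np.inf
    for iteration in range(1, max_iter + 1):
        gamma = orbitals @ orbitals.T                  # sum_j psi_j(x) psi_j(x')
        v_hartree = 2.0 * dx * (w @ np.diag(gamma))    # 2 sum_j J_j as local potential
        exchange = dx * gamma * w                      # integral kernel of sum_j K_j
        fock = h + np.diag(v_hartree) - exchange
        # E_HF = sum_i (<psi_i|t+v|psi_i> + <psi_i|Fock|psi_i>) for these orbitals.
        energy = dx * np.trace(orbitals.T @ (h + fock) @ orbitals)
        if abs(energy - energy_old) < tol:
            return energy, iteration, True
        energy_old = energy
        # Solve the linearized eigenvalue problem and refill by the aufbau principle.
        orbitals = np.linalg.eigh(fock)[1][:, :n_occ] / np.sqrt(dx)
    return energy, max_iter, False


grid = np.linspace(-8.0, 8.0, 201)
energy, n_iter, converged = restricted_hf_1d(grid, lambda x: 0.5 * x**2, n_electrons=2)
print(energy, n_iter, converged)
```

The energy is evaluated from the same orbitals that define the Fock matrix, so each printed value is a genuine expectation value of a single Slater determinant and therefore an upper bound in the sense of Eq. (2.33) for this model. Production codes usually add damping or more elaborate mixing schemes to stabilize the iteration.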
In the last section, we have discussed how we can describe an interacting many-body system by a single Slater determinant, i.e., an effective non-interacting wave function. This is an approximate description that neglects correlation, i.e., all effects that intrinsically require a multi-reference description. However, if we give up the concept of the many-body wave function (and with this the standard description of quantum states), we can in principle describe any many-particle system by such a single-reference wave function. This is called the Kohn-Sham (KS) construction of DFT and for a huge class of systems it represents the best trade-off between efficiency and accuracy in comparison to all alternative descriptions. It is no overstatement to say that KS-DFT "is indispensable for modern quantum-chemical modeling of materials and molecules" [162].
Because of the single-reference ansatz, KS-DFT is in practice very similar to HF. A KS-DFT calculation means solving the KS equations, which constitute a nonlinear eigenvalue problem. In many cases, this is computationally even less demanding than solving the HF equations. Conceptually, however, DFT is very different from any type of wave-function method. In fact, one can see DFT as an alternative formulation of quantum mechanics.
The density as basic variable
As the name suggests, DFT is based on the (single-particle) electron density, which is defined as

$$\rho(\mathbf{r}) = N \sum_{\sigma_1,\dots,\sigma_N} \int \mathrm{d}\mathbf{r}_2 \cdots \mathrm{d}\mathbf{r}_N \, \Psi^*(\mathbf{r}\sigma_1,\mathbf{x}_2,\dots,\mathbf{x}_N)\, \Psi(\mathbf{r}\sigma_1,\mathbf{x}_2,\dots,\mathbf{x}_N) \tag{2.48}$$

for an $N$-electron system with ground-state wave function $\Psi(\mathbf{x}_1,\dots,\mathbf{x}_N)$ that is the lowest eigenstate of the electronic-structure Hamiltonian, cf. (2.1),

$$\hat{H} = \sum_{i=1}^{N} \left[\hat{t}(\mathbf{r}_i) + v(\mathbf{r}_i)\right] + \frac{1}{2}\sum_{i \neq j}^{N} w(\mathbf{r}_i,\mathbf{r}_j) = \hat{T} + \hat{V} + \hat{W}. \tag{2.49}$$

We renamed the three occurring terms, $\hat{T} = \sum_{i=1}^{N} \hat{t}(\mathbf{r}_i)$, etc., for later convenience. Already in 1964, Walter Kohn and Pierre Hohenberg proved that $\rho(\mathbf{r})$ entirely determines the ground state of $\hat{H}$ [165]. At first glance, it might sound counter-intuitive that all the relevant information of the $4N$-dimensional wave function $\Psi$ is contained in a merely 3-dimensional quantity. However, the wave function is simply an auxiliary quantity that allows us to formulate the standard theory of quantum mechanics, based on linear operators. The information that determines the physical setting in the electronic-structure Hamiltonian is actually the local potential $v(\mathbf{r})$, which is also merely three-dimensional.

Footnote: Note that HF plays a less pronounced role for extended systems, such as solids. One reason is simply that performing HF calculations in $k$-space is in most cases numerically very expensive. Here, the local-density approximation of KS-DFT has a comparable role to HF for molecular systems (see next section).
Footnote: For instance, Tokatly argues that DFT can be interpreted as a hydrodynamic formulation of quantum mechanics [163, 164].

For instance, if we want to describe a diatomic molecule with nuclei at the positions $\mathbf{R}_1, \mathbf{R}_2$ and charges $Z_1, Z_2$, we would model this by considering the local potential $v(\mathbf{r}) = \frac{Z_1}{|\mathbf{r}-\mathbf{R}_1|} + \frac{Z_2}{|\mathbf{r}-\mathbf{R}_2|}$. We can arbitrarily enlarge this system with further nuclei by adding Coulomb terms to the local potential.
But for all these systems, the kinetic and the interaction operators are the same. These operators are thus said to be universal. Any many-electron system in the non-relativistic setting is thus completely defined by the 3-dimensional local potential and the number of electrons $N$. Judged from this perspective, the concept of a theory based on the electronic density might sound more reasonable.

In the realm of DFT, $v(\mathbf{r})$ is called the external variable that determines the corresponding internal variable $\rho(\mathbf{r})$. This variable pair $(\rho, v)$ is defined by the structure of the Hamiltonian (2.49). To see this, let us calculate the expectation value of the potential term

$$\begin{aligned}
\langle \hat{V} \rangle &= \langle \Psi | \sum_{i=1}^{N} v(\mathbf{r}_i) \Psi \rangle \\
&= \sum_{\sigma_1,\dots,\sigma_N} \int \mathrm{d}\mathbf{r}_1 \cdots \mathrm{d}\mathbf{r}_N \, \Psi^*(\mathbf{x}_1,\dots,\mathbf{x}_N) \left[\sum_{i=1}^{N} v(\mathbf{r}_i)\right] \Psi(\mathbf{x}_1,\dots,\mathbf{x}_N) \\
&= \int \mathrm{d}\mathbf{r}\, v(\mathbf{r})\, N \sum_{\sigma_1,\dots,\sigma_N} \int \mathrm{d}\mathbf{r}_2 \cdots \mathrm{d}\mathbf{r}_N \, \Psi^*(\mathbf{r}\sigma_1,\mathbf{x}_2,\dots,\mathbf{x}_N)\, \Psi(\mathbf{r}\sigma_1,\mathbf{x}_2,\dots,\mathbf{x}_N) \\
&= \int \mathrm{d}\mathbf{r}\, v(\mathbf{r})\, \rho(\mathbf{r}),
\end{aligned}$$

where we used the exchange symmetry of $\Psi$ from the second to the third line. This term defines the pair of external and internal variables and it is at the center of the Hohenberg-Kohn proof [166] of the existence of the one-to-one mapping

$$v(\mathbf{r}) \longleftrightarrow \rho(\mathbf{r}). \tag{2.50}$$

The simplicity of this argument illustrates the fact that there are many possible functional theories, depending on the structure of the Hamiltonian. Famous examples are current-density functional theory [167], which can describe systems with an external magnetic field, and reduced density matrix functional theory, which we will introduce in Sec. 2.4.2. All these functional theories are based on the same type of proof.

Footnote: For a very general proof of this statement, see Penz et al. [166].
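The identity $\langle \hat{V} \rangle = \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r})$ can be checked numerically. The following is a minimal sketch under assumptions that are not part of the text: a one-dimensional grid, two particles in a symmetric spatial wave function (spin already summed out), and an arbitrarily chosen harmonic potential. All names (`phi_a`, `phi_b`, `v_full`, `v_rho`) are hypothetical.

```python
import numpy as np

# Hypothetical 1D toy setup (an assumption, not taken from the text).
x = np.linspace(-6.0, 6.0, 121)
dx = x[1] - x[0]

def normalized(f):
    """Normalize a real orbital on the grid: sum |f|^2 dx = 1."""
    return f / np.sqrt(np.sum(f**2) * dx)

phi_a = normalized(np.exp(-0.5 * (x - 1.0) ** 2))
phi_b = normalized(np.exp(-0.5 * (x + 1.0) ** 2))

# Symmetric spatial part Psi(x1, x2) of a two-particle spin singlet.
psi = np.outer(phi_a, phi_b) + np.outer(phi_b, phi_a)
psi /= np.sqrt(np.sum(psi**2) * dx**2)

# Density as in Eq. (2.48) for N = 2: rho(x) = 2 * int dx2 |Psi(x, x2)|^2.
rho = 2.0 * np.sum(psi**2, axis=1) * dx

# Arbitrary local potential and its expectation value, computed in two ways.
v = 0.5 * x**2
v_full = np.sum(psi**2 * (v[:, None] + v[None, :])) * dx**2  # <Psi|v(x1)+v(x2)|Psi>
v_rho = np.sum(v * rho) * dx                                 # int dx v(x) rho(x)

print(v_full, v_rho)
```

Both numbers agree up to rounding because the wave function is symmetric under particle exchange, which is exactly the step from the second to the third line of the derivation.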
The universal density functional
The Hohenberg-Kohn theorem establishes DFT with the density $\rho$ as the fundamental variable to describe many-electron systems. We thus can replace the wave function in the variational principle, cf. Eq. (2.2), by the density and write

$$E = \inf_{\rho} E[\rho], \tag{2.51}$$

where $E[\rho]$ describes the energy of the system with Hamiltonian (2.49) as a functional of the density $\rho$. We can further specify this expression, because also the ground-state wave function $\Psi = \Psi[\rho]$ is uniquely connected to $\rho$ and thus we have

$$\begin{aligned}
E[\rho] &= \langle \Psi[\rho] | \hat{H} \Psi[\rho] \rangle \\
&= \langle \Psi[\rho] | \hat{T} \Psi[\rho] \rangle + \langle \Psi[\rho] | \hat{V} \Psi[\rho] \rangle + \langle \Psi[\rho] | \hat{W} \Psi[\rho] \rangle \\
&= T[\rho] + \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r}) + W[\rho].
\end{aligned} \tag{2.52}$$

We used here that the potential energy $\langle \hat{V} \rangle = \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r})$ can be explicitly written in terms of $\rho$. The remaining part of $E[\rho]$ is called the universal functional

$$F[\rho] = T[\rho] + W[\rho]. \tag{2.53}$$

This reformulation has, however, a fundamental problem: Although the Hohenberg-Kohn theorem proves the existence of $F$, it does not provide an explicit expression for it, because the proof is not constructive (see Sec. 3.3.2). As a matter of fact, most of the research in the field of DFT aims at understanding what this functional looks like. This is a very difficult task and only many years of research led to the remarkable accuracy of state-of-the-art DFT.

The Kohn-Sham construction
The first crucial step toward a useful density functional is the KS construction [170], which makes use of the increased flexibility of the density description. The basic idea is to employ, as in HF, a non-interacting wave function, the KS wave function $\Psi_s$ (we denote all quantities in the auxiliary system with an s). We have seen in HF theory that this is a very good starting point to describe the interacting problem and, at the same time, a single Slater determinant can be calculated relatively cheaply by effective orbital equations. However, in contrast to HF theory, the fundamental variable in DFT is the density and not the wave function. Thus, the KS wave function is a purely auxiliary quantity that in principle is not related to the many-body wave function of the system. The connection between the non-interacting Kohn-Sham system described by the Hamiltonian

$$\hat{H}_s = \sum_{i=1}^{N} \left[\hat{t}(\mathbf{r}_i) + v_s(\mathbf{r}_i)\right] \tag{2.54}$$

and the physical system is that both have the same density,

$$\rho_s(\mathbf{r}) = \rho(\mathbf{r}). \tag{2.55}$$

Since the connection between density and potential is unique, i.e., there is one and only one interacting system with the density $\rho$, there is also one and only one non-interacting system with the density $\rho_s = \rho$. Thus, the Hohenberg-Kohn theorem also proves that the KS potential is unique.

Footnote: For details on mathematical issues of the universal functional as defined by Hohenberg and Kohn and an alternative more general definition, the reader is referred to the publications by Levy [168] and Lieb [169].

Connecting the physical and the KS system
The original construction due to Kohn and Sham [170] to connect the KS and the physical system is based on the energy. For that we define the Kohn-Sham energy (using Eq. (2.55))

$$E_s[\rho] = T_s[\rho] + \int \mathrm{d}\mathbf{r}\, v_s(\mathbf{r}) \rho(\mathbf{r}) \tag{2.56}$$

and then re-express the physical (many-body) energy as

$$\begin{aligned}
E[\rho] &= \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r}) + F[\rho] \\
&= T_s[\rho] + \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r}) + F[\rho] - T_s[\rho] \\
&\equiv T_s[\rho] + \int \mathrm{d}\mathbf{r}\, v(\mathbf{r}) \rho(\mathbf{r}) + E_{Hxc}[\rho].
\end{aligned} \tag{2.57}$$

This defines the Hartree-exchange-correlation energy

$$E_{Hxc}[\rho] = T[\rho] - T_s[\rho] + W[\rho] = E_H[\rho] + E_{xc}[\rho], \tag{2.58}$$

which usually is further divided into the mean-field or Hartree part $E_H[\rho] = \frac{1}{2} \int \mathrm{d}\mathbf{r}\, \mathrm{d}\mathbf{r}'\, \rho(\mathbf{r}) w(\mathbf{r},\mathbf{r}') \rho(\mathbf{r}')$, which is an explicit functional of $\rho$, and the remaining unknown part $E_{xc}$, which is called the exchange-correlation energy (in analogy to the HF energy definitions, see Sec. 2.2). If we assume that this functional is differentiable with respect to $\rho$-variations, we can define the ground state by the stationarity condition

$$0 = \frac{\delta E[\rho]}{\delta \rho} = \frac{\delta T_s[\rho]}{\delta \rho} + v(\mathbf{r}) + \frac{\delta E_H[\rho]}{\delta \rho} + \frac{\delta E_{xc}[\rho]}{\delta \rho}.$$

The first term is $\frac{\delta T_s[\rho]}{\delta \rho} = -v_s[\rho]$, the third term is the Hartree potential

$$v_H[\rho](\mathbf{r}) = \int \mathrm{d}\mathbf{r}'\, \rho(\mathbf{r}') w(\mathbf{r},\mathbf{r}'), \qquad \text{(cf. 2.36)}$$

and the last term defines the unknown exchange-correlation potential

$$v_{xc}[\rho] = \frac{\delta E_{xc}[\rho]}{\delta \rho}. \tag{2.59}$$

Footnote: In fact, it can be shown that this functional is not differentiable. There are ways to circumvent this problem (see, e.g., [166]), but from a mathematical point of view this issue is not yet resolved. In practice, this seems to play only a secondary role. However, one should not underrate such mathematical issues. Many advances in DFT were in fact triggered by mathematical research such as the Levy-Lieb constrained search formulation [168, 171, 169]. For a good introduction to this topic from a physicist’s perspective, the reader is referred to the respective section in the textbook by Dreizler and Gross [172].

It follows that the potential of the auxiliary system is

$$v_s[\rho](\mathbf{r}) = v(\mathbf{r}) + v_H[\rho](\mathbf{r}) + v_{xc}[\rho](\mathbf{r}). \tag{2.60}$$

Under our assumption that $E[\rho]$ is differentiable, this relation uniquely defines the KS system. We want to remark at this point that there are alternative constructions to define the KS system that avoid certain mathematical subtleties, for instance via the force-balance equations [173] or by considering more general definition spaces [166].

The KS equations
Since $\hat{H}_s$ does not contain interaction terms, its ground state is (assuming no degeneracy) described by one Slater determinant

$$\Psi_s(\mathbf{x}_1,\dots,\mathbf{x}_N) = |\psi_1 \cdots \psi_N|_-, \tag{2.61}$$

which can be determined by a set of orbital equations

$$\left[\hat{t}(\mathbf{r}) + v_s(\mathbf{r})\right] \psi_i(\mathbf{x}) = \epsilon_i \psi_i(\mathbf{x}). \tag{2.62}$$

These are called the KS equations. We see that, in contrast to HF theory, the KS equations do not include a nonlocal term like the exchange operator (cf. Eq. (2.44)) and are thus formally simpler. However, approximations to the unknown exchange-correlation potential usually have strong nonlinear dependencies on the density or on derivatives of the density. A very sophisticated class of approximate functionals, so-called hybrids, even lifts the assumption of a local $v_s$ completely and includes the HF exchange energy. Nevertheless, the KS equations define a nonlinear eigenvalue problem for the effective one-body Hamiltonian $\hat{H}_s$. This is very similar to HF theory and accordingly the solution strategies are basically the same. For example, KS-DFT calculations also require an SCF procedure (see Sec. 2.2) that is called the KS scheme.

The quest for the universal functional: approximations to the exchange-correlation potential
Although the KS scheme simplifies the functional construction, we have not gained much so far with respect to HF theory, because we do not know the form of the exchange-correlation functional. The formalism itself does not give us any hint what $v_{xc}$ could look like and thus, there is literally nothing else than “intuition” or [...]

Footnote: This assumption makes the derivations in the following simpler, but it is not necessary for the KS construction. For details, see Ref. [172].
Footnote: See for example the review by Perdew and Kurth [174] for an overview of the main classes of known functionals.
Footnote: Although the locality of most exchange-correlation potentials makes a large difference in practical calculations. Especially, KS-DFT can be applied to condensed matter systems considerably more easily than HF.

The local-density approximation (LDA) is based on one of the few many-electron systems that can be solved analytically, i.e., the homogeneous electron gas (HEG). Importantly, the electron density of the HEG $\rho_{HEG}$ is constant and thus, we can explicitly parametrize the total energy $E_{HEG}(\rho_{HEG})$ as a function of the density. Utilizing definition (2.57), we can then (analytically) derive the exchange-correlation energy expression [175]

$$E^{HEG}_{xc}(\rho) \equiv E^{LDA}_{X}[\rho] = -\frac{3}{4} \left(\frac{3}{\pi}\right)^{1/3} \int \mathrm{d}\mathbf{r}\, \rho^{4/3}(\mathbf{r}), \tag{2.63}$$

where we formally reintroduced the spatial dependence of $\rho(\mathbf{r}) = \rho_{HEG} = \text{const}$. Additionally, we removed the index “C,” because a non-interacting system such as the HEG by definition does not have correlation. This procedure can be generalized to the interacting homogeneous electron gas, yielding a functional $E^{LDA}_{xc}$ that includes correlation effects. The local-density approximation is then simply to apply $E^{LDA}_{xc}$ to systems that are not homogeneous, i.e., to consider $\rho = \rho(\mathbf{r})$ as we have already anticipated in Eq. (2.63). Despite its simplicity, the LDA is impressively accurate for a large range of systems, especially in the realm of condensed matter, where the LDA has a similar significance as HF for molecular systems. But also for certain classes of chemical systems, the LDA is very accurate and often also better than HF [174]. However, the LDA is by far not sufficient to reach “chemical accuracy,” i.e., the accuracy required in electronic-structure calculations to make realistic predictions. This has only been achieved with the historically second class of functionals, which are the generalized gradient approximations (GGAs).
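To make the local-density idea concrete, the exchange expression of Eq. (2.63) can be evaluated for an inhomogeneous density. The sketch below is an illustration only: it assumes atomic units, a simple radial Riemann sum, and the hydrogen ground-state density $\rho(r) = e^{-2r}/\pi$ as test input. The function name `lda_exchange_energy` and the grid parameters are hypothetical choices.

```python
import numpy as np

def lda_exchange_energy(rho, weights):
    """Evaluate -(3/4) (3/pi)^(1/3) * int rho^(4/3), cf. Eq. (2.63), in atomic units."""
    c_x = 0.75 * (3.0 / np.pi) ** (1.0 / 3.0)
    return -c_x * np.sum(weights * rho ** (4.0 / 3.0))

# Radial grid with volume weights 4 pi r^2 dr (assumed discretization).
r = np.linspace(1e-6, 30.0, 200001)
dr = r[1] - r[0]
weights = 4.0 * np.pi * r**2 * dr

# Test density: hydrogen 1s ground state, rho(r) = exp(-2 r) / pi.
rho_h = np.exp(-2.0 * r) / np.pi
e_x_numeric = lda_exchange_energy(rho_h, weights)

# For this density the integral is elementary: int rho^(4/3) = (27/64) pi^(-1/3).
c_x = 0.75 * (3.0 / np.pi) ** (1.0 / 3.0)
e_x_analytic = -c_x * (27.0 / 64.0) * np.pi ** (-1.0 / 3.0)

print(e_x_numeric, e_x_analytic)
```

Both values are roughly -0.213 Hartree. The point of the sketch is only that the functional is an explicit and cheap integral over the density; it makes no statement about the accuracy of the LDA for this system.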
The GGA functionals are conceptually a straightforward generalization of the LDA, because they consider besides $\rho$ also its gradient, i.e.,

$$E^{GGA}_{xc} = E^{GGA}_{xc}[\rho, \nabla\rho]. \tag{2.64}$$

However, from the definition of the LDA in 1966 until 1985, when Perdew [178] constructed the first GGA functional that provided significantly better results, almost 20 years passed. This illustrates how difficult functional construction in DFT really is. However, once an accurate functional is constructed, its potential is enormous because the KS equations that have to be solved remain basically the same. Thus it is probably no wonder that the publication of Perdew marks a turning point in the history of DFT. Since 1985, the number of proposed exchange-correlation functionals and publications has grown exponentially [179], which started the “incredible success story” [180] of DFT. Modern functionals include, e.g., even higher derivatives of the density (for example the meta-GGAs) and the Fock or exchange contribution of HF (hybrid functionals) [174, 172]. Such theoretical ideas define the principal form of the functionals, but the precise contribution of the different theoretical levels usually cannot be predicted. Many successful KS-DFT functionals have thus been constructed by fitting to experimental data.

Despite these huge efforts and certain systematic improvements, it is difficult to compare and judge the performance of the many existing functionals. Theoretically, there is one universal functional and one could expect that the known functionals somehow “converge” in one direction, i.e., that they approximate the exact functional increasingly better. However, the opposite seems to be true: there is basically no functional [...] Consequently, applying KS-DFT in practice is anything but easy and requires experience and careful literature research. Discussing functional construction in more depth is beyond the scope of this text and thus, we refer the reader for further details to the already cited literature. To name a few further examples, the review of Jones [179] provides a very good introduction to the topic of density functionals including its very interesting history and the review by Burke [180] provides a perspective on the recent state of DFT.

Footnote: Note that $E_{HEG}(\rho_{HEG})$ is really a function and not a functional, because $\rho_{HEG}$ is just a number.
Footnote: This model cannot be solved analytically anymore, but it still can be reduced to the numerical calculation of one integral in the many-body coordinate space. This can be done very efficiently by Monte-Carlo integration. For an exhaustive discussion of the homogeneous electron gas, the reader is referred to Ref. [176].
Footnote: See for example the Nobel lecture of Pople [177].
Footnote: The interested reader is referred to the recent report by Medvedev et al. [162] that highlights the history of functional construction and reflects critically on the strong focus on fitting that has become common practice nowadays.

As a last example of an exchange-correlation functional, we want to mention the exact exchange (EXX) approximation, which will play an important role in the generalization of DFT to coupled electron-photon systems (see Sec. 3.2). The EXX approximation is exceptional in comparison to most other known KS-DFT functionals, because it does not rely on empirical (or in the special case of the LDA analytical) knowledge. The idea is to approximate the interacting system with a single-reference wave function exactly as in HF. However, the EXX functional is local, which requires solving an auxiliary equation inside the SCF routine. This is a non-unique problem and thus there are several possibilities to define this auxiliary equation. The most common way is the optimized effective potential (OEP) approach together with the Krieger-Li-Iafrate (KLI) approximation [172], but there are also other possible constructions [181]. Importantly, the EXX functional is usually more accurate than HF but at the same time as generic. This makes it an important tool for developing first-principles methods in settings that are not covered by the electronic-structure Hamiltonian.

We conclude our brief survey of KS-DFT functionals with a small example that shows the significance of density functionals in modern material science and chemistry. Despite all the problems and difficulties with KS-DFT, accurate functionals are known nowadays and they are basically the only tool that exists to calculate the properties of many realistic matter systems. This is well reflected in the extraordinary number of citations of the corresponding publications. The most famous example is the paper [182] in which Perdew, Burke and Ernzerhof propose their PBE functional, which has become within merely two decades the most cited paper of all physics. According to Google Scholar, there are more than 110000 publications that refer to this publication.
A remark on the KS scheme and alternatives
With the KS scheme, we have reduced the problem of calculating the $4N$-dimensional wave function to finding the $N$ lowest eigenstates of the nonlinear one-particle operator $\hat{H}_s$. This is an obvious computational advantage over the exponential wall of the many-body problem. Still, it is much more expensive than our starting point, i.e., a direct minimization of $E[\rho]$ (Eq. (2.51)). In fact, people have tried to construct theories that are directly based on $\rho$ (nowadays known under the term orbital-free DFT) even before DFT was developed. A famous example is Thomas-Fermi theory, developed already in 1927, which however provides a very poor description of quantum systems. For example, matter is not stable in Thomas-Fermi theory [183]. There have been advances in the field, but still the accuracy of known explicit density functionals is very low. The crucial point of the KS construction is that the contribution of the unknown $v_{xc}[\rho]$ to the total energy is considerably smaller than the contribution of the universal functional $F[\rho]$ (cf. Eq. (2.52)). The reason is that the Slater determinant already includes a very large part of the quantum mechanical problem in our description (as we know from HF theory). Thus, DFT is almost exclusively utilized in the KS picture and often DFT and KS-DFT are used interchangeably.

Footnote: See for example the extensive assessment of density functionals in quantum chemistry by Mardirossian and Head-Gordon [84].
Footnote: See for example the review by Jones [179] and the references therein.
Footnote: See Ref. [184] for a recent review on the topic.
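The KS scheme, i.e., the self-consistent solution of Eqs. (2.60) and (2.62), can be sketched in a few lines. The example below is not an implementation from this work but a toy model under loudly stated assumptions: one spatial dimension, two electrons in one doubly occupied orbital, a soft-Coulomb interaction $w(x,x') = 1/\sqrt{(x-x')^2+1}$, second-order finite differences, linear density mixing, and the exchange-only choice $v_{xc} = -v_H/2$, which coincides with exact exchange only for a two-electron singlet. The function name `ks_scf_two_electrons` and all parameters are hypothetical.

```python
import numpy as np

def ks_scf_two_electrons(x, v_ext, mix=0.5, tol=1e-8, max_iter=200):
    """Self-consistent solution of Eq. (2.62) for one doubly occupied orbital."""
    n, dx = x.size, x[1] - x[0]
    off = np.ones(n - 1)
    # Kinetic operator -1/2 d^2/dx^2 with second-order finite differences.
    t = -0.5 * (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx**2
    # Assumed soft-Coulomb interaction w(x, x').
    w = 1.0 / np.sqrt((x[:, None] - x[None, :]) ** 2 + 1.0)
    rho = np.full(n, 2.0 / (n * dx))              # initial guess: uniform density
    for iteration in range(1, max_iter + 1):
        v_h = w @ rho * dx                        # Hartree potential
        v_xc = -0.5 * v_h                         # exchange-only, two-electron singlet
        v_s = v_ext + v_h + v_xc                  # KS potential, Eq. (2.60)
        eps, orbitals = np.linalg.eigh(t + np.diag(v_s))
        psi = orbitals[:, 0] / np.sqrt(dx)        # lowest orbital, normalized on the grid
        rho_new = 2.0 * psi**2                    # density of the doubly occupied orbital
        if np.max(np.abs(rho_new - rho)) < tol:
            return rho_new, eps[0], iteration
        rho = (1.0 - mix) * rho + mix * rho_new   # linear density mixing
    raise RuntimeError("SCF did not converge")

x = np.linspace(-8.0, 8.0, 161)
rho, eps0, n_iter = ks_scf_two_electrons(x, 0.5 * x**2)
print(n_iter, eps0, np.sum(rho) * (x[1] - x[0]))
```

The loop makes the nonlinearity explicit: the potential depends on the density, which in turn depends on the orbitals obtained from that potential. Production codes replace the simple linear mixing used here by more robust convergence accelerators.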
When DFT fails: the strong-correlation regime
In this last paragraph, we want to discuss the (actual) limitations of KS-DFT (and other single-reference methods). Although density functionals can in principle exactly describe every physical setting that is included in the electronic-structure Hamiltonian (Eq. (2.1)), there are scenarios where all known KS-DFT functionals are severely inaccurate. The research on functional construction has shown that in such settings, the form of the exact functional is very intricate. For instance, the exact functional of the non-interacting HEG is given by a formula as simple as Eq. (2.63). Systems that do not deviate too much from the HEG, like typical conductors, are usually very well described by KS-DFT. However, for other seemingly simple systems, such as two separate Hydrogen atoms, it turns out that the exact functional has a very complicated form.

The principal reason for this difference is the single Slater determinant ansatz in the KS system, which is a very good starting point for the HEG, but a very bad starting point for two separate atoms. The (spatial part of the) KS wave function for the ground state of two electrons reads simply

$$\Psi_s(\mathbf{r}_1,\mathbf{r}_2) = \psi(\mathbf{r}_1)\psi(\mathbf{r}_2), \tag{2.65}$$

where $\psi$ is the doubly occupied spatial orbital of a spin-singlet. The exact ground state of two separate Hydrogen atoms after a proper symmetrization reads instead

$$\Psi(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{\sqrt{2}} \left( \psi_a(\mathbf{r}_1)\psi_b(\mathbf{r}_2) + \psi_a(\mathbf{r}_2)\psi_b(\mathbf{r}_1) \right), \tag{2.66}$$

where $\psi_{a(b)} = \exp(-|\mathbf{r}-\mathbf{R}_{a(b)}|)/\sqrt{\pi}$ is the ground state of the Hydrogen atom located at $\mathbf{R}_a$ ($\mathbf{R}_b$), where $|\mathbf{R}_a - \mathbf{R}_b| \gg 1$. This $\Psi$ also describes a spin-singlet, but it requires two different spatial orbitals, which are degenerate, i.e., they have the same energy eigenvalue. From the wave-function perspective, this special situation thus requires a multi-reference ansatz, i.e., a wave function constructed from more than one Slater determinant.
This is clearly an extreme case but the problem is quite generic: whenever the (valence) electrons of a system are sufficiently localized, we can assign orbitals to them that have only small overlaps and we recover a similar scenario as described by Eq. (2.66). This happens in chemistry whenever orbital energies are (nearly) degenerate, e.g., when bonds are stretched [133]. Other examples are materials with partially filled electron shells such as transition metals or their oxides [185], which exhibit very special properties [186] such as high-temperature superconductivity [81], Mott metal-insulator transitions [82] or colossal magnetoresistance [187]. We call such physical systems strongly correlated and they constitute the hardest challenge for electronic-structure methods.

Footnote: Ground states are normally singlets, i.e., spin-saturated.
Footnote: Note that the antisymmetry is contained in the spin function: $\frac{1}{\sqrt{2}}(\alpha(\sigma_1)\beta(\sigma_2) - \alpha(\sigma_2)\beta(\sigma_1))$.
Footnote: Note that in the realm of electronic-structure theory, one often calls the contribution of such near-degeneracies static correlation (see for instance Ref. [188]). Additionally, there is the dynamical contribution to the correlation energy that corrects the HF energy of a tightly bound electron pair such as in Helium [189]. This type of correlation can usually be described very well with single-reference methods such as KS-DFT.
Footnote: A description of the electronic structure in full real space is usually not possible for strongly correlated systems. Instead, they are often described with effective models like the Hubbard model [85] (see also the discussion in Sec. 1.1) or a lattice of Hydrogen molecules with stretched bonds [133]. However, even such simplified models are extremely challenging for electronic-structure methods.

We have so far only discussed the wave-function perspective, where strong correlation manifests in the multi-reference character of the ansatz. In KS-DFT, however, we describe every system with a single-reference ansatz of the form (2.65). The multi-reference character thus needs to be captured by the exchange-correlation potential. In this simple case, the potential would need to create a kind of barrier between the [...] intra-system steepening [190]. This leads to one effective orbital of the form

$$\psi(\mathbf{r}) \approx \psi_a(\mathbf{r}) + \psi_b(\mathbf{r}), \tag{2.67}$$

which then is occupied twice. It is clear that the exact form of this potential is very sensitive to specific parameters such as the precise distance between the atoms and the orbitals that are involved. Consequently, it is very difficult to construct general exchange-correlation potentials for strongly correlated electrons.

However, the failure for the strongly correlated system that we just described is related to the KS construction and not to DFT per se. Complementarily to the non-interacting KS system that is obtained by setting the interaction $W = 0$, one can consider the strong-correlation limit with zero kinetic energy, $T = 0$. For Coulomb systems, this limit can even be solved analytically and one can define a DFT based on this auxiliary system [191, 192]. This approach has several promising analytically derivable features, but the functional construction has not been very fruitful so far. As the long history of KS-DFT shows, functional construction is a very delicate task and one basically has to start from scratch in every new setting. In part II, we will also make use of the flexibility of DFT and construct an auxiliary system for coupled electron-photon systems that is explicitly correlated. This unusual auxiliary system will, in contrast to strongly-correlated DFT, even facilitate functional construction.

Discussing strong correlation and the corresponding methods in more detail is clearly beyond the scope of this work and we instead refer the reader to the literature, e.g., Refs. [137, 133, 82, 86, 185]. However, we will discuss in the next section another approach to describe many-body problems that in fact has been used to construct methods that showed excellent accuracy in certain strongly correlated systems. This approach will play an important role for the auxiliary construction in part II.
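The difference between the single-reference form (2.65) with the orbital (2.67) and the exact state (2.66) can be made explicit in a minimal two-site model. This is only an illustrative sketch under the assumption that the two atomic orbitals have zero overlap, so that each is represented by one basis vector. The names `psi_ks`, `psi_exact`, `site_density` and `ionic_weight` are hypothetical.

```python
import numpy as np

# Two-site model (assumption): atom a -> index 0, atom b -> index 1, zero overlap.
psi_a = np.array([1.0, 0.0])
psi_b = np.array([0.0, 1.0])

# Single-reference form, Eq. (2.65), with the normalized orbital of Eq. (2.67).
psi = (psi_a + psi_b) / np.sqrt(2.0)
psi_ks = np.outer(psi, psi)

# Exact spatial wave function of two separated atoms, Eq. (2.66).
psi_exact = (np.outer(psi_a, psi_b) + np.outer(psi_b, psi_a)) / np.sqrt(2.0)

def site_density(wf):
    """Density per site, rho_i = 2 * sum_j |Psi(i, j)|^2."""
    return 2.0 * np.sum(wf**2, axis=1)

def ionic_weight(wf):
    """Probability of finding both electrons on the same atom."""
    return float(np.sum(np.diag(wf) ** 2))

print(site_density(psi_ks), site_density(psi_exact))   # identical densities
print(ionic_weight(psi_ks), ionic_weight(psi_exact))   # 0.5 versus 0.0
```

Both wave functions yield the same density of one electron per atom, but the single-reference form assigns half of its weight to configurations with both electrons on one atom, which are absent in the exact state. In this toy picture, this is the part that the exchange-correlation functional has to compensate.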
To complete our discussion of electronic-structure methods, we will introduce in this section the concept of reduced density matrices (RDMs). We can utilize these to describe many-body systems in an exact way without the explicit use of the wave function. For instance, we can describe the expectation value of the electronic-structure Hamiltonian exactly in terms of the 2-body RDM (2RDM). The configuration space of such a description does not grow with the particle number and thus a variational minimization in terms of the 2RDM is in principle feasible. However, the many-body problem also arises in this description in the form of conditions that determine the exact configuration space, i.e., the set of all 2RDMs. The number of these so-called $N$-representability conditions grows exponentially with the particle number $N$ and thus, one can only consider a subset of conditions in practice.

A special role is here taken by the 1-body RDM (1RDM) $\gamma$ that has comparatively simple $N$-representability conditions. These reflect the exchange symmetry of the particle species and thus provide an alternative to employing Slater determinants to ensure, e.g., the Pauli principle. Although $\gamma$ is not sufficient to describe the energy of a many-electron system in a linear way, it carries all information on the system by virtue of a generalized Hohenberg-Kohn theorem (Gilbert’s theorem). This establishes RDMFT as an alternative to DFT, employing $\gamma$ instead of $\rho$ as the basic variable.

Importantly, methods based on RDMs are usually more efficient than wave-function methods for describing strongly correlated systems beyond models [27, 193].

To illustrate the role of RDMs, we consider a system consisting of two electrons that can move freely in some volume $V \subset \mathbb{R}^3$. The corresponding Hamiltonian is simply the kinetic energy operator $\hat{T}$.
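Let us now calculate the energy

$$\begin{aligned}
E &= \langle \Psi | \hat{T} \Psi \rangle = \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ \sum_{i=1}^{2} -\tfrac{1}{2}\nabla^2_{\mathbf{r}_i} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) \\
&= \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) + \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_2} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2).
\end{aligned} \tag{2.68}$$

We see that the energy expression separates into one term whose operator acts only on $\mathbf{x}_1$ and another term whose operator acts only on $\mathbf{x}_2$. However, we know that the indices are just place-holders and we can arbitrarily exchange them if we respect the exchange symmetry. If we exchange $\mathbf{x}_1$ and $\mathbf{x}_2$ in the second term, we get

$$\begin{aligned}
\int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_2} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2)
&= -\int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_2,\mathbf{x}_1) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_2} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) \\
&= \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_2,\mathbf{x}_1) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_2} \right] \Psi(\mathbf{x}_2,\mathbf{x}_1) \\
&= \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2),
\end{aligned} \tag{2.69}$$

where in the last line we renamed the variables for our convenience (we also changed the index of the $\nabla$-operator). We see that exchanging the indices of the (antisymmetric) wave function within an expectation value does not change its sign. Inserting this equality (2.69) back into the energy expression (2.68), we get the simplified expression

$$\begin{aligned}
E &= \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) + \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_2} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) \\
&= \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) + \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2) \\
&= 2 \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ -\tfrac{1}{2}\nabla^2_{\mathbf{r}_1} \right] \Psi(\mathbf{x}_1,\mathbf{x}_2).
\end{aligned}$$

It is obvious that we can generalize this example to any many-body operator of order one, $\hat{O} = \sum_{i=1}^{N} \hat{o}(\mathbf{r}_i)$, cf. Eq. (2.24). The expectation value of $\hat{O}$ reads for two particles

$$\begin{aligned}
O &= \langle \Psi | \hat{O} \Psi \rangle \\
&= 2 \int \mathrm{d}\mathbf{x}_1 \mathrm{d}\mathbf{x}_2\, \Psi^*(\mathbf{x}_1,\mathbf{x}_2) \left[ \hat{o}(\mathbf{r}_1) \right] \Psi(\mathbf{x}_1,\mathbf{x}_2).
\end{aligned} \tag{2.70}$$

Importantly, in order to get from the first to the second line in this equation, we do not need to have any knowledge about the system besides the exchange symmetry of $\Psi$, the number of particles $N$ and the order of $\hat{O}$. This is the key insight to understand the role of RDMs: Although we have to introduce coordinates for all the $N$ particles of a system to define the quantum-mechanical wave function and corresponding operators, the expectation value of any of these operators depends only on the order of the operator. For operators of type (2.24), we only need one spatial coordinate and this motivates the definition of the (spin-summed) one-body reduced density matrix (1RDM)

$$\Gamma^{(1)}(\mathbf{r};\mathbf{r}') = 2 \sum_{\sigma_1,\sigma_2} \int \mathrm{d}\mathbf{r}_2\, \Psi^*(\mathbf{r}'\sigma_1,\mathbf{x}_2)\, \Psi(\mathbf{r}\sigma_1,\mathbf{x}_2). \tag{2.71}$$

If we know $\Gamma^{(1)}$, we can calculate the expectation value of any one-body operator,

$$O = \int \mathrm{d}\mathbf{r}\, \hat{o}(\mathbf{r})\, \Gamma^{(1)}(\mathbf{r};\mathbf{r}') \big|_{\mathbf{r}'=\mathbf{r}}, \tag{2.72}$$

where the subscript $|_{\mathbf{r}'=\mathbf{r}}$ means that we first apply $\hat{o}$ and then set $\mathbf{r}' = \mathbf{r}$. The 1RDM is exactly the part of the expectation value (2.70) that is independent of the actual operator and in this sense, it carries all “one-body information” of a system.

The 1RDM with respect to local and nonlocal operators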
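Definitions (2.71) and (2.72) and the distinction between local and nonlocal one-body operators treated in this subsection can be illustrated on a grid. The sketch below rests on assumptions that are not part of the text: one spatial dimension, two electrons in a singlet with spatial part $\Psi(x_1,x_2) = \phi(x_1)\phi(x_2)$, a Gaussian orbital $\phi$, and a second-order finite-difference Laplacian. The names `G`, `T_op` and `T_band` are hypothetical.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 161)
dx = x[1] - x[0]
n = x.size

# Assumed test state: both electrons in the harmonic-oscillator ground orbital.
phi = np.exp(-0.5 * x**2)
phi /= np.sqrt(np.sum(phi**2) * dx)
psi = np.outer(phi, phi)                  # spatial part Psi(x1, x2), real-valued

# 1RDM as in Eq. (2.71): G[i, j] = 2 * int dx2 Psi(x_j, x2) Psi(x_i, x2).
G = 2.0 * (psi @ psi.T) * dx

# Local operator: only the diagonal (the density) is needed.
rho = np.diag(G)
V = np.sum(0.5 * x**2 * rho) * dx

# Nonlocal operator: kinetic energy via Eq. (2.72), apply t to r, then set r' = r.
off = np.ones(n - 1)
T_op = -0.5 * (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx**2
T = np.trace(T_op @ G) * dx

# The same number using only the diagonal and the first off-diagonals of G.
T_band = -0.5 * (np.sum(np.diag(G, 1)) + np.sum(np.diag(G, -1))
                 - 2.0 * np.sum(np.diag(G))) / dx**2 * dx

print(V, T, T_band)
```

For this test state both the potential and the kinetic energy are close to 0.5 in atomic units (0.25 per electron). The last lines show that, with a second-order stencil, the kinetic energy only involves the first off-diagonals of the discretized 1RDM, whereas the potential energy needs the diagonal alone.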
We want to remark briefly on the one-body nature of $\Gamma^{(1)} = \Gamma^{(1)}(r; r')$, which in fact depends on two coordinates. The definition (2.71) explicitly distinguishes between the coordinates of the wave function and those of its conjugate in the expectation value. This is crucial for calculating expectation values of so-called nonlocal one-body operators like the kinetic energy
\[ E_{\mathrm{kin}} = \int dr\, \hat{t}(r)\, \Gamma^{(1)}(r; r') \big|_{r'=r} = 2 \sum_{\sigma_1} \int dr\, dx_2\, \Psi^*(r'\sigma_1, x_2) \left[ -\tfrac{1}{2} \nabla_r^2 \right] \Psi(r\sigma_1, x_2) \Big|_{r'=r}. \]
The differential operator acts only on $\Psi$ but not on $\Psi^*$, and thus we need to introduce a second coordinate $r'$ that is not affected by $\hat{t}(r)$. If we instead want to calculate the expectation value of the local potential $\hat{V} = \sum_{i=1}^{2} v(r_i)$, we find
\[ V = \langle \Psi | \hat{V} | \Psi \rangle = 2 \int dx_1\, dx_2\, \Psi^*(x_1, x_2)\, v(r_1)\, \Psi(x_1, x_2) = \int dr\, v(r)\, \Gamma^{(1)}(r; r). \]
The diagonal of the 1RDM is nothing else than the one-body density,
\[ \Gamma^{(1)}(r; r) = \rho^{(1)}(r)\ (\equiv \rho(r)), \tag{2.73} \]
that we defined in Sec. 2.3. We see that the expectation values of local operators can be calculated with the one-body density, but for nonlocal operators we need the full 1RDM. (Footnote: Note that the calculation of the kinetic-energy expectation value does not require the full 1RDM. This can be illustrated in a discretized picture, where we approximate the Laplace operator with finite differences of some order. The matrix form of the 1RDM then becomes explicit by, e.g., defining $G_{ij} = \Gamma^{(1)}(r_i; r_j)$. The density, i.e., the local part of $G$, is the diagonal $\rho_i = G_{ii}$, and to apply the Laplace operator we need the first $n$ off-diagonals, where $n$ is the order of the finite-difference approximation. Importantly, we do not need all off-diagonals, and this is why the Laplace operator is called a semilocal operator. This fact inspired the formulation of kinetic energy functional theory, in which the author was involved [4].)

This seemingly subtle difference plays an important role in the many-body description. For instance, the exact kinetic energy is not known as an explicit functional of the density and is thus part of the universal functional $F[\rho]$ of DFT (Eq. (2.53)). In fact, this kinetic part of $F$ typically makes up the largest contribution, and thus approximations have to model especially this part. This insight is the basis of one-body reduced density matrix functional theory (RDMFT), where the 1RDM is employed as the basic variable instead of the density (see Sec. 2.4.2). The unknown part of the corresponding universal functional thus contributes considerably less to the total energy than in DFT.

The hierarchy of RDMs

Having motivated the definition of the 1RDM, let us now generalize the concept to systems with $N$ particles and (non-)local operators of any order $p$. Let $\Psi(x_1, \dots, x_N)$ be the wave function of a system of $N$ electrons and
\[ \Gamma^{N}(x_1, \dots, x_N; x'_1, \dots, x'_N) = \Psi^*(x'_1, \dots, x'_N)\, \Psi(x_1, \dots, x_N) \tag{2.74} \]
the corresponding ($N$-body) density matrix. We define a general nonlocal operator of order $p$,
\[ \hat{O}^{(p)}_{nl} = \frac{1}{p!} \sum_{i_1, \dots, i_p} o^{(p)}_{nl}(r_{i_1}, \dots, r_{i_p}; r'_{i_1}, \dots, r'_{i_p}), \tag{2.75} \]
and a general local operator of order $p$,
\[ \hat{O}^{(p)}_{l} = \frac{1}{p!} \sum_{i_1, \dots, i_p} o^{(p)}_{l}(r_{i_1}, \dots, r_{i_p}). \tag{2.76} \]
(Footnote: Note that local operators are included in the definition of nonlocal operators. We explicitly differentiate between both to stress the difference between RDMs and densities.)

Accordingly, we define the $p$-body reduced density matrix ($p$RDM)
\[ \Gamma^{(p)}(r_1, .., r_p; r'_1, .., r'_p) = \frac{N!}{(N-p)!} \sum_{\sigma_1, \dots, \sigma_N} \int dr_{p+1} \cdots dr_N\, \Gamma^{N}(r_1\sigma_1, \dots, r_p\sigma_p, x_{p+1}, \dots, x_N; r'_1\sigma_1, \dots, r'_p\sigma_p, x_{p+1}, \dots, x_N). \tag{2.77} \]
Additionally, because of its special importance, we denote the 1RDM in the following by
\[ \gamma(r; r') \equiv \Gamma^{(1)}(r; r'). \tag{2.78} \]
We call the diagonal of the $p$RDM,
\[ \rho^{(p)}(r_1, .., r_p) = \Gamma^{(p)}(r_1, .., r_p; r_1, .., r_p), \tag{2.79} \]
the $p$-density. Note that $\rho^{(1)} = \rho$, defined in Eq. (2.48). With these definitions, the expectation value of any operator $\hat{O}$ with respect to $\Psi$ can be expressed as
\[ O = \langle \Psi | \hat{O} \Psi \rangle = \int dr_1 \cdots dr_N\, \hat{O}(r_{i_1}, \dots, r_{i_p}; r'_{i_1}, \dots, r'_{i_p})\, \Gamma^{N}(r_1, .., r_N; r'_1, .., r'_N) \big|_{r'_1 = r_1, \dots, r'_N = r_N} \equiv \mathrm{Tr}[\Gamma^{N} \hat{O}], \tag{2.80} \]
where in the last step we formally rewrote the contraction as a trace operation for later convenience. This expression can be reduced for operators of order $p$ with the aid of the RDMs. In the nonlocal case, we have for an operator $\hat{O}^{(p)}_{nl}$
\[ O^{(p)}_{nl} = \langle \Psi | \hat{O}^{(p)}_{nl} \Psi \rangle = \int dr_1 \cdots dr_p\, \hat{O}^{(p)}_{nl}(r_1, \dots, r_p; r'_1, \dots, r'_p)\, \Gamma^{(p)}(r_1, .., r_p; r'_1, .., r'_p) \big|_{r'_1 = r_1, \dots, r'_p = r_p} \equiv \mathrm{Tr}[\hat{O}^{(p)}_{nl} \Gamma^{(p)}]. \tag{2.81} \]
The expectation value of a local operator $\hat{O}^{(p)}_{l}$ is calculated as
\[ O^{(p)}_{l} = \langle \Psi | \hat{O}^{(p)}_{l} \Psi \rangle = \int dr_1 \cdots dr_p\, \hat{O}^{(p)}_{l}(r_1, \dots, r_p)\, \rho^{(p)}(r_1, .., r_p) \equiv \mathrm{Tr}[\hat{O}^{(p)}_{l} \rho^{(p)}]. \tag{2.82} \]
We want to stress that the above definitions are straightforward generalizations of our two-particle example and there is nothing new to understand here. We will need these expressions in the following for some theoretical considerations. In practice, we will mostly be confronted with operators of order 1 and only one local operator of order 2, which is the Coulomb interaction. See Tab. 2.1 for an overview of the cases that are important for us.

Table 2.1: Examples of operator classes with important examples and the corresponding RDM. For instance, to calculate the expectation value of a local operator of order 1, such as the local potential $v$, we need the 1-density.

operator class | example | operator | required RDM | expectation value
1-body, local | local potential | $\hat{V} = \sum_{i=1}^{N} v(r_i)$ | 1-density | $\langle \hat{V} \rangle = \int dr\, v(r)\, \rho(r)$
1-body, nonlocal | kinetic energy | $\hat{T} = -\tfrac{1}{2} \sum_{i=1}^{N} \nabla^2_{r_i}$ | 1RDM | $\langle \hat{T} \rangle = -\tfrac{1}{2} \int dr\, \nabla^2_r\, \gamma(r; r') \big|_{r'=r}$
2-body, local | Coulomb interaction | $\hat{W} = \tfrac{1}{2} \sum_{i \neq j} \frac{1}{|r_i - r_j|}$ | 2-density | $\langle \hat{W} \rangle = \tfrac{1}{2} \int dr\, dr'\, \frac{\rho^{(2)}(r, r')}{|r - r'|}$

From the normalization of $\Psi$, i.e., $\langle \Psi | \Psi \rangle = 1$, follows the normalization of the $p$RDMs, which corresponds to the following sum rule:
\[ \int dr_1 \cdots dr_p\, \Gamma^{(p)}(r_1, .., r_p; r_1, .., r_p) = \frac{N!}{(N-p)!}. \tag{2.83} \]
Additionally, the $p$RDM and the $(p+1)$RDM are connected via
\[ \Gamma^{(p)}(r_1, .., r_p; r'_1, .., r'_p) = \frac{1}{N-p} \int dr_{p+1}\, \Gamma^{(p+1)}(r_1, .., r_p, r_{p+1}; r'_1, .., r'_p, r_{p+1}). \tag{2.84} \]
(Footnote: This follows directly from the definition and the sum rule (2.83).)

We conclude this overview with a comment on the prefactors in Eq. (2.77), which essentially stem from the possible permutations of the coordinates. We chose them such that the expressions for the expectation values, (2.81) and (2.82), are prefactor-free. Other definitions are possible and employed in the literature. (Footnote: See for example the definitions in Ref. [194, part 2].)
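The discretized picture mentioned above can be made concrete with a few lines of NumPy. The following is only a minimal sketch under stated assumptions: a hypothetical 1d model of two spinless fermions in a harmonic potential, a second-order finite-difference Laplacian, and real orbitals. None of the names (`G`, `t_mat`, `t_band`, ...) stem from the text. The sketch builds the 1RDM of a single Slater determinant as the matrix $G_{ij}$, checks the sum rule (2.83) for $p = 1$, evaluates a local operator with the diagonal only, and shows that the kinetic energy needs just the first off-diagonals of $G$.

```python
import numpy as np

# Hypothetical 1d model in atomic units (illustration only).
n_grid, box, n_el = 201, 20.0, 2
x = np.linspace(-box / 2, box / 2, n_grid)
dx = x[1] - x[0]

# Second-order finite-difference kinetic operator t = -1/2 d^2/dx^2.
lap = (np.diag(np.ones(n_grid - 1), -1)
       - 2.0 * np.eye(n_grid)
       + np.diag(np.ones(n_grid - 1), 1)) / dx**2
t_mat = -0.5 * lap
v = 0.5 * x**2  # harmonic local potential, chosen only for illustration

# Orbitals: lowest n_el eigenvectors of t + v, normalized as sum |phi|^2 dx = 1.
_, vecs = np.linalg.eigh(t_mat + np.diag(v))
phi = vecs[:, :n_el] / np.sqrt(dx)

# 1RDM of the Slater determinant: G_ij = sum_k phi_k(x_i) phi_k(x_j).
G = phi @ phi.T

rho = np.diag(G).copy()           # 1-density = diagonal of the 1RDM
n_from_sum_rule = rho.sum() * dx  # sum rule (2.83) for p = 1

# Local operator: only the density is needed.
v_expect = np.sum(v * rho) * dx

# Nonlocal (semilocal) operator: apply t to the first argument, then set r' = r.
t_full = np.trace(t_mat @ G) * dx

# Keep only the diagonal and the first off-diagonals of G.
idx = np.arange(n_grid)
band = np.abs(idx[:, None] - idx[None, :]) <= 1
t_band = np.trace(t_mat @ np.where(band, G, 0.0)) * dx

print(n_from_sum_rule, v_expect, t_full, t_band)
```

With this stencil, `t_band` agrees with `t_full` up to rounding, although all other off-diagonals of `G` were discarded. A higher-order stencil would require correspondingly more off-diagonals.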
Coulson’s Challenge: RDMs as basic variables
Let us recall at this point the basic task of electronic-structure methods, which is to describe the electronic state that minimizes the $N$-electron energy expectation value,
\[ E_0 = \inf_{\Psi} E[\Psi]. \]
We have introduced this energy functional in Sec. 2.1 in terms of the many-body wave function, i.e.,
\[ E[\Psi] = \langle \Psi | \hat{H} \Psi \rangle = \langle \Psi | \hat{T} \Psi \rangle + \langle \Psi | \hat{V} \Psi \rangle + \langle \Psi | \hat{W} \Psi \rangle. \]
Then, we discussed in Sec. 2.3 that by means of the Hohenberg-Kohn theorem, we can identify $\Psi = \Psi[\rho]$ and thus reformulate the energy as a functional of $\rho$:
\[ E = E[\rho]. \]
However, from Eq. (2.77) and Tab. 2.1 we can derive another form of the energy functional, expressed only with respect to RDMs, i.e.,
\[ E = E[\rho, \gamma, \Gamma^{(2)}] = T[\gamma] + V[\rho] + W[\rho^{(2)}]. \tag{2.85} \]
Due to relation (2.84), we can even rewrite the energy
\[ E = E[\Gamma^{(2)}] \]
solely in terms of the 2RDM. (Footnote: Note that the 2-density $\rho^{(2)}$ is not sufficient for such a re-expression, because the 1RDM cannot be connected to $\rho^{(2)}$ by Eq. (2.84).) This reformulation is exact and has no unknown part. In fact, this functional is even linear. It is thus very tempting to reformulate the variational principle (Eq. (2.2)) as
\[ E_0 = \inf_{\Gamma^{(2)} \in C_\Gamma} E[\Gamma^{(2)}] \tag{2.86} \]
and determine $E_0$ by a functional variation over the space of 2RDMs $C_\Gamma$. Since $\Gamma^{(2)}$ depends on merely four spatial coordinates (independently of the number of electrons that are described), this functional minimization should be possible even for very large systems. In particular, $\Gamma^{(2)}$ is not limited by a single-reference construction and can thus describe the properties of strongly correlated systems exactly [195]. This was first pointed out in 1955 by Löwdin [196], and in the following years variational 2RDM theory became a highly popular research area [195]. However, the variational calculations based on Eq. (2.86) usually resulted in energies that were considerably lower than the exact reference values. The reason was that it was not known how to characterize the configuration space $C_\Gamma$. This led to violations of the Pauli principle through bosonic contributions to the energy. C. A. Coulson summarized the problem some years later in the following way:

"It has frequently been pointed out that a conventional many-electron wave function tells us more than we need to know. [...] There is an instinctive feeling that matters such as electron correlation should show up in the two-particle density matrix [...] but we still do not know the conditions that must be satisfied by the density matrix. Until these conditions have been elucidated, it is going to be very difficult to make much progress along these lines." (Coulson, 1960 [197])

The determination of these conditions, called the $N$-representability conditions, has been named as one of the ten most prominent research challenges in quantum chemistry [198].

The $N$-representability problem

Although we have discussed several minimization problems in this chapter, the problem of $N$-representability has not yet occurred, at least not explicitly. Strictly speaking, we could have called the task of parametrizing the space of antisymmetric wave functions an $N$-representability problem. However, we were able to solve this problem by going to the special basis of Slater determinants (see Sec. 2.1). The next parametrization that we considered was that of the space of single-particle densities, needed to perform DFT minimizations. Clearly, not every function $f(r)$ that depends on one variable is a one-body density according to Eq. (2.48). Practitioners of the field used DFT for many years without being entirely conscious of this question, and yet no issues occurred. The reason is that the $N$-representability conditions of the density are very simple, as Gilbert [199] proved in 1975: every non-negative function with a finite integral (which can thus be normalized to $N$) is an $N$-representable density. Astonishingly, there is no trace of quantum effects like the exchange symmetry or the Pauli principle in these conditions.
The electron density is basically equivalent to a classical charge density. This reflects one important advantage of the KS construction: it allows the antisymmetry of the electrons to be included explicitly. In, e.g., orbital-free formulations of DFT this is much more difficult.

In this sense, the one-particle density is an exceptional RDM: the properties of all other RDMs are crucially influenced by the exchange symmetry of the system's particles. In fact, the antisymmetry plays the key role in Coulson's challenge. To see this, let us recapitulate the definition of the $p$RDM,
\[ \Gamma^{(p)}(r_1, .., r_p; r'_1, .., r'_p) = \frac{N!}{(N-p)!} \sum_{\sigma_1, \dots, \sigma_N} \int dr_{p+1} \cdots dr_N\, \Psi^*(r'_1\sigma_1, \dots, r'_p\sigma_p, x_{p+1}, \dots, x_N)\, \Psi(r_1\sigma_1, \dots, r_p\sigma_p, x_{p+1}, \dots, x_N) = \Gamma^{(p)}[\Psi], \tag{cf. 2.77} \]
which is a functional of $\Psi$ in the sense that for every $\Psi$, Eq. (2.77) defines the map
\[ \Psi \rightarrow \Gamma^{(p)}. \tag{2.87} \]
The question of $N$-representability, or more precisely pure-state $N$-representability, concerns the other direction of this map,
\[ \Gamma^{(p)} \rightarrow \Psi. \tag{2.88} \]
Given a function $g$ that depends on $2p$ coordinates, the pure-state conditions are necessary and sufficient for the existence of an $N$-body wave function $\Psi$ with $g = \Gamma^{(p)}[\Psi]$ according to Eq. (2.77). If there was no exchange symmetry, the only $N$-representability condition would be the positivity (or positive semi-definiteness) of $\Gamma^{(p)}$, which follows directly from the quadratic structure of the definition. However, the coordinates of a (single-species) wave function are symmetric or antisymmetric under permutations, which is a crucial feature of the quantum-mechanical description, as we have discussed in detail in Sec. 2.1.

(Footnote: Coleman introduced the name $N$-representability when he proved the first known set of conditions for $\gamma$ [136]. $N$ stands for the number of particles in the system.)

(Footnote: Note that besides the $N$-representability problem, there is the $v$-representability problem, which concerns the more specific question of which densities (or RDMs) can be "produced" by physically meaningful potentials $v$. Importantly, it would be sufficient to determine all $v$-representable quantities, since these are automatically $N$-representable. The $v$-representability problem has stimulated many conceptual advances of DFT [172, part 2.3] and RDMFT [200, 201], but a direct application similar to the $N$-representability conditions is very difficult.)

(Footnote: See for example Ref. [202, 203].)

The pure-state $N$-representability problem has only been (formally) solved in its full generality for the 1RDM
\[ \gamma(r, r') = N \sum_{\sigma_1, \dots, \sigma_N} \int dr_2 \cdots dr_N\, \Psi^*(r'\sigma_1, x_2, \dots, x_N)\, \Psi(r\sigma_1, x_2, \dots, x_N). \]
Klyachko [204] published in 2006 a prescription to construct a set of conditions, known as the generalized Pauli constraints [205], that guarantee that $\gamma$ stems from an (antisymmetric) many-body wave function. However, the number of these conditions grows exponentially with the particle number $N$ and the number of basis states $B$, and hence utilizing them in practice is basically impossible. There is ongoing research on approximation strategies that might result in numerically feasible methods, so far, however, only for very simplified model problems [207, 208].
Ensembles
The whole problem can be significantly simplified if we generalize our description from pure states to statistical ensembles, which occur, e.g., in the theory of open quantum systems [209]. Such mixed states cannot be described by one wave function $\Psi$ but require the introduction of the (von Neumann) density matrix [209]
\[ \Gamma^{N}_{E} = \sum_i w_i\, \Gamma^{N}_{i}, \tag{2.89} \]
where $\Gamma^{N}_{i}$ is the $N$-body density matrix corresponding to a pure state $\Psi_i$, $0 \le w_i \le 1$, and $\sum_i w_i = 1$. This defines $\Gamma^{N}_{E}$ as a convex combination of the pure-state density matrices $\Gamma^{N}_{i}$. $\Gamma^{N}_{E}$ describes the statistically mixed state with probability $w_i$ to find the system in the state $\Psi_i$. A specific example is the canonical ensemble of statistical physics with temperature $T$. The weights are then given as $w_i = \exp(-E_i/(k_B T))/Z$, where $Z = \sum_i \exp(-E_i/(k_B T))$ is the partition function, $E_i$ is the energy expectation value of the state $\Psi_i$, and $k_B$ is the Boltzmann constant. We can calculate the expectation value of an $N$-body operator $\hat{O}$ in the ensemble by generalizing Eq. (2.80), i.e.,
\[ O = \sum_i w_i \langle \Psi_i | \hat{O} \Psi_i \rangle = \sum_i w_i\, \mathrm{Tr}[\Gamma^{N}_{i} \hat{O}] \equiv \mathrm{Tr}[\Gamma^{N}_{E} \hat{O}]. \tag{2.90} \]
If we replace $\Gamma^{N}$ with $\Gamma^{N}_{E}$ in Eq. (2.77), we can straightforwardly generalize the concept of RDMs to ensembles.

The advantage of the ensemble picture lies in the mathematical structure of Eq. (2.89). The set $C^{N}_{\Gamma_E} \equiv \mathcal{E}^{N}$ of all ensemble $N$-body density matrices $\Gamma^{N}_{E}$ is convex, and the pure-state density matrices $\Gamma^{N}$ are simply its extreme elements: every $\Gamma^{N}_{E}$ can be written as a convex combination of the $\Gamma^{N}$. Importantly, this property carries over to the sets $\mathcal{E}^{p}$ of the (ensemble) $p$RDMs, because all RDMs are linearly connected. This makes the parametrization of $\mathcal{E}^{p}$ considerably easier than that of the corresponding set of pure-state $p$RDMs.

(Footnote: A $p$RDM is positive semi-definite if for any wave function $\psi(x_1, \dots, x_p)$ it holds that $\langle \psi | \Gamma^{(p)} \psi \rangle \ge 0$.)

(Footnote: $N$-representability also plays an important role in quantum information theory, where it is usually called the quantum marginal problem.)

(Footnote: See for example the work by Theophilou et al. [206], where the application of the generalized Pauli constraints has been explored numerically.)

Ensemble $N$-representability of the 1RDM

Let us demonstrate this with the example of the 1RDM $N$-representability conditions, which were derived by Coleman [136] already in 1963.
Consider the 1RDM of a fermionic (bosonic) system, $\gamma(r; r') = \sum_i n_i\, \phi^*_i(r')\, \phi_i(r)$, in its diagonal representation, i.e., $\int dr'\, \gamma(r, r')\, \phi_i(r') = n_i\, \phi_i(r)$. Then the (necessary and sufficient) ensemble $N$-representability conditions are
\[ 0 \le n_i\ (\le 1), \tag{2.91a} \]
\[ \sum_i n_i = N, \tag{2.91b} \]
where the upper bound holds only for fermions. We call $n_i$ and $\phi_i$ natural occupation numbers and natural orbitals, respectively. This means that if we have a basis set with $B$ elements, we have exactly $B + 1$ conditions to check, independently of the particle number $N$. This enormous simplification in comparison to the generalized Pauli constraints for pure states makes the conditions (2.91) applicable in practice. Additionally, the conditions have a direct physical interpretation, because we can connect them to the Pauli principle [205]: bosonic natural orbitals can be occupied arbitrarily often, but fermionic natural occupation numbers are bounded by a maximal value of one. Here, the special case of $n_i = 1$ for $i \le N$ and $n_j = 0$ for $j > N$ corresponds to a single Slater determinant, whereas fractional occupation numbers are a signature of the multi-reference character (or the correlation) of a system.

There are several ways to prove the above statement [136, 27, 200], and we want to outline one of them for the fermionic case. This very instructive proof has been published by Giesbertz and Ruggenthaler [200] for the special case of finite basis sets. The crucial step in the proof is the explicit knowledge of the connection between $\gamma$ and $\Psi$ in the case that $\Psi$ is a single Slater determinant. We assume a basis set of $B$ orbitals, from which we can choose $\binom{B}{N}$ different combinations to construct an $N$-body Slater determinant. We denote each of these sets with the collective index $I_k = (1_k, \dots, N_k)$, where $k = 1, \dots, \binom{B}{N}$, and define $\Psi_{I_k} = \frac{1}{\sqrt{N!}} |\psi_{1_k} \cdots \psi_{N_k}|_{-}$. A simple calculation yields the corresponding 1RDM
\[ \gamma_{I_k}(r; r') = \sum_{i_k \in I_k} \psi^*_{i_k}(r')\, \psi_{i_k}(r). \]
Thus, the natural orbitals $\phi_{i_k} = \psi_{i_k}$ are identical to the orbitals of $\Psi_{I_k}$, and all natural occupation numbers are one. This establishes for each Slater determinant $\Psi_{I_k}$ a bijective map (or a one-to-one correspondence) between the pure-state $N$RDM $\Gamma^{N}_{I_k} = \Psi^*_{I_k} \Psi_{I_k}$ and its 1RDM $\gamma_{I_k}$,
\[ \gamma_{I_k} \leftrightarrow \Gamma^{N}_{I_k}. \tag{2.92} \]
Next, we note that the $\gamma_{I_k}$ must be the extreme elements of the space of ensemble 1RDMs $\mathcal{E}^{1}$, because we can write any $\gamma \in \mathcal{E}^{1}$ as
\[ \gamma = \sum_k n_k\, \gamma_{I_k}, \]
if $0 \le n_k \le 1$ and $\sum_k n_k = 1$, i.e., as a convex combination of the $\gamma_{I_k}$. (Footnote: For a good introduction to the topic of convex sets in the realm of RDMs, the reader is referred to Ref. [210]. Mathematical details are well described in Ref. [211].) On the other hand, we can construct every ensemble $N$RDM
\[ \Gamma^{N}_{E} = \sum_k n_k\, \Gamma^{N}_{I_k} \]
as a convex combination of the $N$-body density matrices $\Gamma^{N}_{I_k}$, since Slater determinants form a basis of the antisymmetric $N$-body space. Crucially, we can employ the same prefactors $n_k$ because of Eq. (2.92), which completes the proof.

$N$-representability beyond the 1RDM

Unfortunately, such a simple connection between RDM space and $\mathcal{E}^{N}$ is only possible for the 1RDM. To see this, let us recapitulate the essential ingredients of the proof. On the one hand, we need the convex structure of the spaces of ensemble RDMs $\mathcal{E}^{N}$ and $\mathcal{E}^{1}$ (and the linear connection between these spaces). On the other hand, the proof relies crucially on the properties of Slater determinants, which allow us to parametrize the antisymmetric $N$-body space by single-particle wave functions, i.e., orbitals. Slater determinants thus connect the $N$-body with the one-body space, which transfers to the RDM picture by connecting the extreme elements of $\mathcal{E}^{N}$ and $\mathcal{E}^{1}$.

Generalizing this construction to, e.g., the 2RDM thus does not work. The eigenfunctions of the 2RDM, so-called geminals, depend on two coordinates, and there is no practical way known to construct an antisymmetric many-body wave function (or ensemble $N$-body density matrix) from geminals. (Footnote: At least not without further approximations like the strong orthogonality assumption [212].) This is one way to understand the nowadays well-known fact that even in the ensemble case, the $N$-representability conditions are only simple for the 1RDM. In 1967, shortly after Coleman published his proof of the 1RDM conditions, Kummer [213] formally defined these conditions, but it took almost another 50 years until this formal solution could be translated into a practical prescription: Mazziotti [214] published the solution of the (ensemble) $N$-representability problem only in 2012. However, for all $p$RDMs with $p > 1$, the number of conditions grows exponentially, and thus they are not applicable in practice. (Footnote: In fact, the number of conditions grows even factorially, i.e., super-exponentially [215].) The exponential wall of the many-body problem thus manifests itself in the number of $N$-representability conditions instead of the dimensionality of the configuration space. Nevertheless, these many years of research have been fruitful, and methods that take only a subset of all conditions into account have been successfully developed and implemented in quantum-chemistry codes [27, 216, 217]. Such methods have proven capable of accurately describing comparatively large strongly correlated systems. (Footnote: For example, Fosso-Tande et al. [217] applied their variational 2RDM method to a system of 50 strongly correlated electrons in 50 orbitals, which is comparable to state-of-the-art methods.)
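The ensemble conditions (2.91) and the convex-combination argument of the proof can be checked numerically in a finite basis. The sketch below assumes that the 1RDM is given as a Hermitian matrix in an orthonormal basis of $B = 4$ orbitals. The helper names (`natural_occupations`, `is_ensemble_n_representable`, `slater_1rdm`) and the chosen weights are hypothetical illustrations and not taken from any existing code.

```python
import numpy as np

def natural_occupations(gamma):
    """Eigenvalues of a Hermitian 1RDM matrix, sorted in decreasing order."""
    return np.linalg.eigvalsh(gamma)[::-1]

def is_ensemble_n_representable(gamma, n_el, fermions=True, tol=1e-10):
    """Check Eq. (2.91): 0 <= n_i (<= 1 for fermions) and sum_i n_i = N."""
    if not np.allclose(gamma, gamma.conj().T, atol=tol):
        return False
    occ = natural_occupations(gamma)
    if occ.min() < -tol:
        return False
    if fermions and occ.max() > 1.0 + tol:
        return False
    return bool(abs(occ.sum() - n_el) <= tol * max(1.0, n_el))

def slater_1rdm(basis, occupied):
    """1RDM of a single Slater determinant built from the given basis columns."""
    c = basis[:, list(occupied)]
    return c @ c.conj().T

# Orthonormal model basis with B = 4 orbitals (random, for illustration only).
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((4, 4)))
n_el = 2

# Convex combination of determinant 1RDMs with weights that sum to one.
index_sets = [(0, 1), (2, 3), (0, 2)]
weights = np.array([0.6, 0.3, 0.1])
gamma_mix = sum(w * slater_1rdm(basis, I) for w, I in zip(weights, index_sets))
occ_mix = natural_occupations(gamma_mix)

# Counter-example: two particles in one orbital (allowed for bosons only).
gamma_bad = 2.0 * slater_1rdm(basis, (0,))

print(occ_mix)
print(is_ensemble_n_representable(gamma_mix, n_el))
print(is_ensemble_n_representable(gamma_bad, n_el))
print(is_ensemble_n_representable(gamma_bad, n_el, fermions=False))
```

In this example, the mixture has fractional natural occupation numbers that still sum to $N = 2$, whereas the doubly occupied orbital violates the fermionic upper bound of (2.91a) while satisfying the bosonic conditions.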
Although $\gamma$ is not sufficient to describe an interacting many-body system in a straightforward linear way, as the 2RDM is (Eq. (2.86)), there exists an exact but nonlinear energy functional
\[ E = E[\gamma]. \tag{2.93} \]
This was first realized by Gilbert [199], who generalized the Hohenberg-Kohn theorem to this case in 1975 and established 1RDM functional theory (RDMFT). Exactly as in DFT, the exact functional $E[\gamma]$ is not known, but the contribution of the unknown part to $E[\gamma]$ is smaller than in DFT. This is obvious from the definition of the exact energy in terms of RDMs, i.e.,
\[ E = T[\gamma] + V[\rho] + W[\Gamma^{(2)}]. \tag{cf. 2.85} \]
Since $\rho(r) = \gamma(r; r)$ is included in $\gamma$, only the interaction contribution $W$ (and not $T + W$) must be approximated in RDMFT. Additionally, we can generalize the potential term
\[ v(r) \rightarrow v(r, r') \tag{2.94} \]
to nonlocal potentials such that
\[ V = \int dr\, v(r, r')\, \gamma(r, r') \big|_{r'=r}. \tag{2.95} \]
Nonlocal potentials thus constitute the "natural" external partner to the internal variable $\gamma$ (see Eq. (2.50) and the surrounding text). Although they do not have a direct physical interpretation, nonlocal potentials occur in some effective descriptions, e.g., pseudopotentials [218], and are thus a useful extension. The obvious price to pay for these advantages is the necessity to deal with the full 1RDM as the basic variable instead of the simple density. Although not obvious at first glance, the 1RDM severely complicates both theory and numerics.

This complication starts with the generalization of the Hohenberg-Kohn theorem. To make use of the simple $N$-representability conditions (2.91), Gilbert's theorem considers ensembles instead of pure states. This means that the many-body system is described by the ensemble $N$-body density matrix $\Gamma^{N}_{E}$, and the expectation value of an operator $\hat{O}$ is calculated by $O = \mathrm{Tr}[\Gamma^{N}_{E} \hat{O}]$, cf. Eq. (2.90).
Conceptually, this is not problematic, because the pure ground state of a system is included in the ensemble representation and thus, by means of the variational principle
\[ E_0 = \inf_{\gamma \in G_E} E[\gamma], \tag{2.96} \]
the ground state will also be the solution of a variation over ensembles. (Footnote: Note that this is strictly only true for the exact functional, and there are indications that the performance of approximate functionals can in some cases be increased by employing the pure-state conditions [206].) Further, considering nonlocal potentials $v(r; r')$, there is no full one-to-one correspondence to $\gamma(r; r')$: there are many potentials that lead to the same ground-state 1RDM. (Footnote: Note that the one-to-one correspondence is restored if we consider the equilibrium states of grand-canonical ensembles instead of pure ground states [200].) This changes the mathematical details of the construction, but it does not affect the definition of the functional $E[\gamma]$, which only requires
\[ \Gamma^{N} \longleftrightarrow \gamma. \tag{2.97} \]
This is proven by Gilbert's theorem and allows us to define $\Gamma^{N} = \Gamma^{N}[\gamma]$ and thus $W = W[\Gamma^{N}] = W[\gamma]$.

The exchange-correlation functional of RDMFT
Obviously, the definition of $E[\gamma]$ is not sufficient for a practical electronic-structure method; we have to find accurate approximate expressions for the unknown part $W[\gamma]$. Although the 1RDM covers a larger part of the energy in an exact way than the density does, it is the quality of the approximation of $W[\gamma]$ that matters. In analogy to DFT and HF, we can define as a first step
\[ W[\gamma] = E_H[\gamma] + E_{xc}[\gamma], \tag{2.98} \]
where $E_H$ is the classical or Hartree contribution (Eq. (2.35)) and $E_{xc}$ denotes the unknown exchange-correlation functional (which is different from $E_{xc}[\rho]$ in DFT, cf. Eq. (2.57)). One interesting difference to DFT is that the 1RDM is general enough to include exchange terms like in HF. Thus, HF theory is a special case of RDMFT with the functional
\[ E_{xc} = E_{HF} = -\frac{1}{2} \int dr\, dr'\, \gamma(r; r')\, w(r, r')\, \gamma(r', r). \tag{2.99} \]
The challenge of RDMFT is thus to find functionals that go beyond $E_{HF}$. Unfortunately, $E_{HF}$ is the only known RDMFT functional that can be expressed explicitly in terms of $\gamma$, i.e., orbital-free. All practical RDMFT functionals are instead implicit functionals, i.e., they can only be formulated in terms of the natural orbitals and natural occupation numbers,
\[ E_{xc}[\gamma] = E_{xc}[\phi_i, n_i], \tag{2.100} \]
where $\gamma = \sum_i n_i\, \phi^*_i \phi_i$. In contrast to DFT, it is very difficult to connect this description in a useful way to a KS system, i.e., a non-interacting auxiliary system. This has very severe consequences: although we solve orbital equations in practical RDMFT algorithms, there is no clear underlying single-particle picture. For example, we cannot define an effective one-body Hamiltonian, which usually provides good approximations for ionization potentials and can be diagonalized efficiently.

Functional construction in RDMFT
Despite the additional challenges of RDMFT in comparison to DFT, accurate exchange-correlation functionals have been successfully constructed and implemented in electronic-structure codes. A comprehensive [...] $0 < n_i < 1$ [...] any $E_{xc}$ that depends only linearly on the occupation numbers. To go beyond single-reference, we therefore need to employ functionals $E_{xc}$ that depend nonlinearly on the occupation numbers.

(Footnote: See Sec. 3.3.2 for the explicit proof of Gilbert's theorem for the more general (but in terms of the proof analogous) case of coupled electron-photon systems.)

(Footnote: Although we can formally define such a one-body Hamiltonian, it has been proven by Pernal [219] that the spectrum of this Hamiltonian is infinitely degenerate if the functional $E_{xc} = E_{xc}[\phi_i, n_i]$ is implicit. This makes the one-body Hamiltonian useless in practice, and one important consequence is that standard RDMFT algorithms are usually numerically considerably less efficient than DFT algorithms for the same number of orbitals.)

(Footnote: In HF, this is justified by Koopmans' theorem, which can be generalized (in a slightly modified form) to KS-DFT [220].)

(Footnote: There have been efforts to define a local RDMFT [221], for which such a one-body Hamiltonian can be constructed. A first implementation showed promising results while being computationally considerably more efficient than standard (nonlocal) RDMFT.)

Another very important feature of typical RDMFT functionals is that they are more generic than, e.g., density functionals. Most DFT functionals are based on the paradigmatic homogeneous electron gas, which defines a very special physical setting. In RDMFT, there is also such a paradigmatic study case, which is, however, much less specific: any two-electron wave function can be exactly parametrized in terms of the 1RDM. In this problem, only the particle number is specified, but the form of the Hamiltonian can be arbitrary. This is a considerable advantage when we want to apply RDMFT to problems that are not as well studied as the electronic-structure Hamiltonian (Eq. (2.1)). For example, to apply the LDA to a one-dimensional problem, as we will do in Sec. 2.5, we cannot employ Eq. (2.63), which has been derived for the Coulomb interaction in 3d, but need a different functional form [225]. Usual RDMFT functionals instead need some two-body matrix elements as input parameters, which can be calculated for an arbitrary type of interaction. For instance, we can apply the same functional in 1d or in 3d or in settings where the Coulomb interaction is modified [127].

The parametrization of the two-electron problem in terms of natural orbitals and natural occupation numbers was discovered already in 1956 by Löwdin and Shull [226].
The corresponding "Löwdin-Shull" (LS) exchange-correlation functional reads
\[ E_{xc} = E_{LS} = \min_{f_i, f_j} \left( -\frac{1}{2} \sum_{i,j} f_i f_j \sqrt{n_i n_j}\, \langle \phi_i \phi_j | \hat{w} | \phi_j \phi_i \rangle \right), \tag{2.101} \]
where $f_i = \pm 1$. $E_{LS}$ is (up to the phase factors) exact for two-electron systems and has been studied extensively in the last two decades to obtain a general understanding of the universal RDMFT functional [193]. For larger electron numbers, the functional is not exact anymore and needs to be adapted. In particular, the minimization over the phase factors, which is quite difficult in practice, is usually either removed completely or replaced by simple rules.

In this work, we will explicitly consider the simplest version of the "LS-type" of functional, which is obtained by setting $f_i = 1$ for all $i$. This functional was constructed by Müller [227] already in 1984 in a different context and without even a reference to the Löwdin-Shull construction. Later, in 2002, when the interest in RDMFT had increased, Buijse and Baerends [228] rederived the Müller functional from more physical considerations. (Footnote: The Müller functional is thus often called BB functional.) It provides a reasonable qualitative description of a large range of electronic-structure problems and reads
\[ E_{xc} = E_{M} = -\frac{1}{2} \sum_{i,j} \sqrt{n_i n_j}\, \langle \phi_i \phi_j | \hat{w} | \phi_j \phi_i \rangle. \tag{2.102} \]
(Footnote: One way to explain why it is not possible to define a one-body Hamiltonian in RDMFT is exactly the nonlinearity of $E_{xc}$ in the occupation numbers (see Ref. [219] and Sec. 7.3).)

Nowadays, $E_M$ has become a reference in RDMFT, comparable to the LDA in DFT. A practical feature of $E_M$ that very few other RDMFT (and DFT) functionals share is its convexity, which guarantees a unique global minimum [161]. Thus, $E_M$ is especially well suited to test new numerical implementations.

Another interesting property of the Müller functional and many other RDMFT functionals concerns the corresponding ground-state energies, which have been observed to be lower bounds to the exact reference [229, 222].
Thus, energetic improvements have to be positive, which is in a sense the opposite of variational methods such as HF, where improvements are rigorously negative. This "negative variational" behavior is an important guiding principle for the construction of RDMFT functionals [193].
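To illustrate how an implicit functional such as Eq. (2.102) is evaluated from occupation numbers and two-body matrix elements, consider the following minimal sketch. It assumes a hypothetical 1d grid model with real, non-optimized model orbitals and a soft-Coulomb interaction $w(x, y) = 1/\sqrt{(x-y)^2 + 1}$, and it uses the prefactor $-\tfrac{1}{2}$ for both $E_M$ and the natural-orbital form of $E_{HF}$. All names (`exchange_integrals`, `mueller_xc`, `hf_exchange`) are illustrative and do not refer to an existing implementation.

```python
import numpy as np

# Hypothetical 1d model: harmonic-oscillator orbitals on a grid (real-valued).
n_grid, box, n_orb = 101, 16.0, 4
x = np.linspace(-box / 2, box / 2, n_grid)
dx = x[1] - x[0]
lap = (np.diag(np.ones(n_grid - 1), -1)
       - 2.0 * np.eye(n_grid)
       + np.diag(np.ones(n_grid - 1), 1)) / dx**2
_, vecs = np.linalg.eigh(-0.5 * lap + np.diag(0.5 * x**2))
phi = vecs[:, :n_orb] / np.sqrt(dx)  # columns: orbitals, sum |phi|^2 dx = 1

# Soft-Coulomb interaction kernel (an assumption made for this 1d example).
w = 1.0 / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)

def exchange_integrals(phi, w, dx):
    """K_ij = <phi_i phi_j| w |phi_j phi_i> for real orbitals on a 1d grid."""
    pair = np.einsum("xi,xj->ijx", phi, phi)  # pair densities phi_i(x) phi_j(x)
    return np.einsum("ijx,xy,ijy->ij", pair, w, pair) * dx**2

def mueller_xc(occ, k_mat):
    """E_M = -1/2 sum_ij sqrt(n_i n_j) K_ij, cf. Eq. (2.102)."""
    s = np.sqrt(occ)
    return -0.5 * s @ k_mat @ s

def hf_exchange(occ, k_mat):
    """E_HF = -1/2 sum_ij n_i n_j K_ij, cf. Eq. (2.99) in natural orbitals."""
    return -0.5 * occ @ k_mat @ occ

K = exchange_integrals(phi, w, dx)
occ_int = np.array([1.0, 1.0, 0.0, 0.0])      # single Slater determinant
occ_frac = np.array([0.95, 0.90, 0.10, 0.05])  # fractional occupations, N = 2

print(mueller_xc(occ_int, K), hf_exchange(occ_int, K))
print(mueller_xc(occ_frac, K), hf_exchange(occ_frac, K))
```

For integer occupations both expressions coincide, since $\sqrt{n_i n_j} = n_i n_j$ for $n_i \in \{0, 1\}$. For the fractional occupations of this example, the Müller expression lies below the HF expression, because $\sqrt{n_i n_j} \ge n_i n_j$ and the matrix elements $K_{ij}$ are non-negative for this interaction. The same two functions can be reused for any other interaction kernel by replacing `w`, which reflects the generic character of such functionals.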
The RDMFT minimization
In this final paragraph of the subsection, we discuss the last missing piece for applying RDMFT in practice, which is the minimization algorithm. We will employ a Lagrangian approach as for HF in Sec. 2.2 (cf. Eq. (2.38) and the following paragraph), which is, however, more involved for RDMFT. We do not calculate the energy with respect to the $N$ orbitals of a Slater determinant, but with respect to $B$ natural orbitals $\{\phi_1, \dots, \phi_B\}$ and natural occupation numbers $\{n_1, \dots, n_B\}$, where $B \ge N$ depends on the system and is thus a convergence parameter. The generic energy functional of RDMFT reads
\[ E[\{\phi_i\}, \{n_i\}] = \sum_{i=1}^{B} n_i \langle \phi_i | [\hat{t} + \hat{v}] \phi_i \rangle + \frac{1}{2} \sum_{i,j=1}^{B} n_i n_j \langle \phi_i \phi_j | \hat{w} | \phi_i \phi_j \rangle + E_{xc}[\{n_i\}, \{\phi_i\}], \tag{2.103} \]
where $E_{xc}$ has to be replaced by a specific exchange-correlation functional such as $E_M$, cf. Eq. (2.102). The goal is to minimize $E[\{\phi_i\}, \{n_i\}]$ under the constraint that the natural orbitals are orthonormalized, i.e.,
\[ c_{ij}[\{\phi_j\}] = \int dr\, \phi^*_i(r)\, \phi_j(r) - \delta_{ij} = 0, \tag{2.104} \]
which is similar to the constraint (2.39) for the HF orbitals. (Footnote: We want to remind the reader that the HF algorithm can be almost directly transferred to KS-DFT problems, because both theories employ a single-reference picture. See Sec. 2.3.) Additionally, we need to enforce the $N$-representability conditions (2.91), which guarantee that the $\{\phi_i\}, \{n_i\}$ are connected to a fermionic 1RDM. The first condition on the individual eigenvalues, cf. Eq. (2.91a), can be incorporated explicitly by, e.g., the substitution
\[ n_i = \sin^2(2\pi\theta_i), \tag{2.105} \]
allowing for a free minimization of the $\theta_i$. However, the second condition (Eq. (2.91b)),
\[ S[\{n_i\}] = \sum_{i}^{B} n_i - N = 0, \tag{2.106} \]
has to be enforced explicitly. We therefore introduce the Lagrange multipliers $\epsilon_{ij}$ and $\mu$ and consider the Lagrangian
\[ L[\{\phi_i\}, \{\theta_i\}; \{\epsilon_{ij}\}, \mu] = E[\{\phi_i\}, \{n_i\}] - \mu S[\{\theta_i\}] - \sum_{i,j} \epsilon_{ij}\, c_{ij}[\{\phi_i\}, \{\phi_j\}]. \tag{2.107} \]
(Footnote: The usual explanation for the lower-bound behavior of approximate RDMFT energies is that the exchange-correlation functional needs to enforce (in some indirect way as a functional of the 1RDM) the $N$-representability conditions of the 2RDM. Approximate functionals often do not accomplish this, and thus the variation is performed over a too large configuration space, which leads to lower energies than the exact reference.)

To find the RDMFT ground state defined by the variational principle, cf. Eq. (2.96), we optimize $L[\{\phi_i\}, \{\theta_i\}; \{\epsilon_{ij}\}, \mu]$ with respect to all variables. A necessary condition for an optimum of $L$ is stationarity,
\[ \delta L = 0, \tag{2.108} \]
which is also sufficient for a minimum if a convex functional such as $E_M$ is employed. (Footnote: Nevertheless, the equations are nonlinear, and thus caution is in order here as well. In Sec. B, we present an explicit example that demonstrates the challenges of such nonlinearities.) If we consider $\phi_i$ and $\phi^*_i$ as independent, we find
\[ 0 = \delta L = \sum_{i=1}^{B} \frac{\delta L}{\delta \phi^*_i} \delta\phi^*_i + \sum_{i=1}^{B} \frac{\delta L}{\delta \phi_i} \delta\phi_i + \sum_{i=1}^{B} \frac{\partial L}{\partial n_i} dn_i \]
\[ = 2\pi \sum_{i=1}^{B} \sin(4\pi\theta_i) \left[ \frac{\partial E}{\partial n_i} - \mu \right] d\theta_i + \sum_{i=1}^{B} \int dr\, \delta\phi^*_i(r) \left[ \frac{\delta E}{\delta \phi^*_i(r)} - \sum_{k=1}^{B} \epsilon_{ki}\, \phi_k(r) \right] + \sum_{i=1}^{B} \int dr \left[ \frac{\delta E}{\delta \phi_i(r)} - \sum_{k=1}^{B} \epsilon_{ik}\, \phi^*_k(r) \right] \delta\phi_i(r), \]
which leads to the three sets of coupled equations
\[ 0 = \frac{\partial E}{\partial n_i} - \mu, \tag{2.109a} \]
\[ 0 = \frac{\delta E}{\delta \phi^*_i(r)} - \sum_{k=1}^{B} \epsilon_{ki}\, \phi_k(r), \tag{2.109b} \]
\[ 0 = \frac{\delta E}{\delta \phi_i(r)} - \sum_{k=1}^{B} \epsilon_{ik}\, \phi^*_k(r). \tag{2.109c} \]
In practice, we will thus need a self-consistent procedure, in which we alternately solve the equations for the $n_i$ and for the $\phi_i$, keeping the respective other variables constant. Since the $n_i$ are just numbers, their optimization can usually be done with a routine from a standard library. For the orbital optimization, we have a nonlinear operator equation similar to the HF equations, and thus an SCF procedure is required.

Similarly to HF, where the stationarity condition defines the Fock operator (Eq. (2.45)), Eq. (2.109b) and Eq. (2.109c) define a nonlinear one-body operator $\hat{H}^{(1)}$. For Eq. (2.109b), we have
\[ \frac{\delta E}{\delta \phi^*_i(r)} = \hat{H}^{(1)} \phi_i = n_i [\hat{t} + \hat{v}] \phi_i + n_i \hat{v}_H \phi_i + \frac{\delta E_{xc}}{\delta \phi^*_i}. \tag{2.110} \]
Importantly, this operator is not hermitian and thus cannot be interpreted as an effective one-body Hamiltonian [219]. An important practical consequence of this fact is that we cannot transform Eq. (2.109b) (or equivalently Eq. (2.109c)) into a nonlinear eigenvalue equation, as we have done in Sec. 2.2 to derive the HF equations.
[...] $B^2$ components of the Lagrange multiplier matrix $(\epsilon_{ij})$. We will discuss how to accomplish this in practice in Sec. 7.

(Footnote: We will discuss this with the concrete example of the Müller functional in Sec. 7.2.)

It should be stressed that the usual RDMFT minimization algorithms are not only more expensive than DFT calculations, but also considerably less stable, which is probably the most important bottleneck of state-of-the-art RDMFT. The reasons for the unsatisfactory convergence of the proposed algorithms are not entirely resolved, and numerics is thus an especially important part of current research in the field [193, part 4].

Despite all the challenges of the 1RDM as a basic variable, one should not forget that RDMFT is much younger than DFT and is investigated by a considerably smaller community. Thus, many promising research directions have only been identified but not yet fully investigated [193], and further significant improvements of the theory are quite probable. Especially for entirely new types of problems, such as the accurate description of the electron-photon interaction, RDMFT could be an interesting starting point, because the contribution of the unknown part of the universal functional is smaller than in DFT. We will come back to this idea briefly in Sec. 3.3.2 and more specifically in Sec. 5.3.

Figure 2.1: Ground-state potential energy surface $E(d)$ of the 1d H$_2$ model (Eq. (2.113)), calculated exactly (exact) and with three different electronic-structure methods: HF, KS-DFT with the LDA functional (LDA), and RDMFT with the Müller functional (Mueller). The deviations between the different methods are discussed in detail in the text.

To get a feeling for the different approximation schemes that we have defined in the previous sections, we want to conclude our survey of electronic-structure theory with a simple example.
We consider a 1d model of a hydrogen molecule (H$_2$) that consists of two hydrogen atoms located at $x = \pm d$, i.e., with distance (bond length) $2d$. We can describe this scenario by the local potential
$$v_d(x) = -\frac{1}{\sqrt{(x-d)^2 + 1}} - \frac{1}{\sqrt{(x+d)^2 + 1}}, \tag{2.111}$$
where $-1/\sqrt{(x-d)^2 + \epsilon}$ with $\epsilon = 1$ is the "soft Coulomb potential" of one elementary charge. The soft Coulomb potential is commonly employed in 1d studies, because it reproduces many essential features of the standard 3d Coulomb potential [225]. Accordingly, we also approximate the interaction $w$ by a soft Coulomb expression,
$$w(x, x') = \frac{1}{\sqrt{|x - x'|^2 + 1}}. \tag{2.112}$$
The many-body Hamiltonian of this problem reads
$$\hat{H} = \hat{T} + \hat{V} + \hat{W} = \sum_{i=1}^{2}\left[-\frac{1}{2}\frac{\partial^2}{\partial x_i^2} + v_d(x_i)\right] + w(x_1, x_2). \tag{2.113}$$
We can now apply any of the methods discussed so far to approximate the ground state of $\hat{H}$ for different values of $d$. The resulting energy function $E(d)$ is called the ground-state potential energy surface (ground-state PES), which plays a central role in electronic-structure theory. One of its most important applications is the structure prediction of matter systems. In our example, this reduces to the (equilibrium) bond length of the H$_2$ molecule, which is simply $2d_{\mathrm{min}}$, where $d_{\mathrm{min}}$ is the position of the minimum of $E(d)$. But this is only one application of the ground-state PES, and there are many more. For instance, the shape of $E(d)$ around the minimum provides information about the energetic cost of bond stretching, and the large-$d$ limit describes the dissociated molecule. Thus, with the knowledge of the full function $E(d)$, we can understand complicated processes such as chemical reactions. This shows that there are research questions that only require the knowledge of a small part of the ground-state PES and consequently methods that are accurate in this part. But there are other problems, such as the description of the complete process of a dissociation, that necessitate methods which accurately describe a large part of the ground-state PES.

Let us now see how the methods that we have discussed in this chapter perform in describing the ground-state PES $E(d)$ of the 1d H$_2$ model. Since the Hamiltonian (2.113) is very low-dimensional, we can calculate the exact many-body wave function with a simple eigensolver even for large $d$. This makes the model an ideal and often employed system to test the accuracy of new electronic-structure methods [225, 230].
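The remark that the exact wave function of the 1d H$_2$ model is accessible with a simple eigensolver can be illustrated with a minimal sketch. Everything about the discretization here is an assumption made for illustration only (a coarse grid, a second-order finite-difference Laplacian, hard-wall boundaries, and the hypothetical function names `soft_coulomb` and `h2_ground_state_energy`); it is not the setup that was used to produce Fig. 2.1. The sketch also returns only the electronic energy of Eq. (2.113) and adds no nuclear repulsion term.

```python
import numpy as np


def soft_coulomb(x, eps=1.0):
    """Soft Coulomb kernel 1/sqrt(x^2 + eps), cf. Eqs. (2.111) and (2.112)."""
    return 1.0 / np.sqrt(x**2 + eps)


def h2_ground_state_energy(d, box_length=16.0, n_grid=33):
    """Lowest eigenvalue of the two-electron Hamiltonian (2.113) on a grid.

    Assumptions of this sketch: nuclei at x = -d and x = +d, hard-wall
    boundaries, second-order finite differences, atomic units.
    """
    x = np.linspace(-box_length / 2, box_length / 2, n_grid)
    dx = x[1] - x[0]

    # One-body part: -1/2 d^2/dx^2 + v_d(x)
    laplacian = (
        np.diag(-2.0 * np.ones(n_grid))
        + np.diag(np.ones(n_grid - 1), 1)
        + np.diag(np.ones(n_grid - 1), -1)
    ) / dx**2
    v_d = -soft_coulomb(x - d) - soft_coulomb(x + d)
    h1 = -0.5 * laplacian + np.diag(v_d)

    # Two-body Hamiltonian on the product grid (index = i1 * n_grid + i2)
    identity = np.eye(n_grid)
    w = soft_coulomb(x[:, None] - x[None, :])
    hamiltonian = (
        np.kron(h1, identity) + np.kron(identity, h1) + np.diag(w.ravel())
    )

    # The lowest state of the spatial problem is nodeless, hence symmetric
    # under x1 <-> x2; it corresponds to the spin-singlet ground state.
    return np.linalg.eigvalsh(hamiltonian)[0]


if __name__ == "__main__":
    for d in (0.0, 1.0, 3.0):
        print(f"d = {d:3.1f}  E_el = {h2_ground_state_energy(d):.4f}")
```

Scanning `d` with this function yields a rough electronic energy curve; a production calculation would need a finer grid, a larger box, and a sparse eigensolver.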
We will consider a generalized form of this model in part II to test our new methods for coupled electron-photon problems.

In this section, we compare three levels of theory, namely HF and one exemplary functional each of KS-DFT and RDMFT. For KS-DFT, we employ a 1d version of the paradigmatic LDA [225], and for RDMFT, we choose the Müller functional. All calculations are performed in real space in a box with length L =
30 a.u. and discretized with a spacing of d x = M =
20 natural orbitals. We have plotted $E(d)$ for the three cases together with the exact reference in Fig. 2.1.

A first glance at the figure already reveals how challenging an accurate description of the electronic structure is: no method performs well over the whole range of bond lengths. Clearly, there are more sophisticated functionals for DFT and RDMFT that perform much better than our chosen examples, but H$_2$ is also just a very simple problem. The accurate description of the ground-state PES over the [...]

(Footnote: For nuclei with a bigger charge, the soft Coulomb parameter $\epsilon$ needs to be adjusted to guarantee certain properties. We will make use of this in Sec. 6.)

(Footnote: This can be understood by a simple gedanken experiment: without external driving, any initial configuration of nuclei will relax to the configuration with minimal energy.)

(Footnote: See App. C.3 for the details on the convergence of RDMFT calculations.)

[...] and KS-DFT leading the way in terms of efficiency. However, methods based on only one Slater determinant usually become inaccurate for large bond lengths. We have discussed this in the last paragraph of Sec. 2.3 considering the dissociation limit $d \to \infty$, where we have to describe the two degenerate orbitals of the separate atoms. To describe this limit, we need two Slater determinants, which corresponds to a multi-reference wave function and makes the system strongly correlated.

In HF and KS-DFT, we try to describe this inherently multi-reference scenario with only one Slater determinant. We can observe the consequence of this approximation in Fig. 2.1: with increasing $d$, $E^{\mathrm{HF/LDA}}(d)$ increases constantly, introducing an artificial attraction. Thus, according to the single-reference description, the molecule would never dissociate but always feel a force that pushes the nuclei back to the equilibrium. The exact solution instead saturates at the dissociation plateau with $E_{\mathrm{diss}} \approx -$ [...] $d_{\mathrm{min}}$ almost exactly for a large class of molecules [84].
In contrast to its bad performance in terms of energetics, the LDA reproduces the correct equilibrium bond length of $d_{\mathrm{eq}} =$ [...] $d^{\mathrm{HF}}_{\mathrm{eq}} \approx$ [...] $E_0^{\mathrm{Müller}}(d)$ is closest to the exact $E(d)$. Importantly, we see that $E_0^{\mathrm{Müller}}(d)$ does not increase arbitrarily with $d$, but saturates at a constant value of about $E^{\mathrm{Müller}}_{\mathrm{diss}} = -$ [...] $d^{\mathrm{Müller}}_{\mathrm{min}} \approx$ [...] H$_2$ are described very well by improved functionals [222, 232], more complicated electronic structures are often better approximated by state-of-the-art DFT functionals [193]. As a last remark, we note that $E_0^{\mathrm{Müller}}(d)$ is always lower than the exact reference, which is a typical feature of many RDMFT functionals (see Sec. 2.4).

(Footnote: One of the most precise electronic-structure methods for this regime is coupled cluster theory [131], which systematically improves upon the HF ansatz and usually is significantly more accurate than typical DFT functionals. However, coupled cluster is also considerably more expensive than DFT.)

Chapter 3 LIGHT AND MATTER FROM FIRST PRINCIPLES
In the previous chapter, we have discussed the electronic many-body problem and presented a selection of strategies from the large repertoire of state-of-the-art electronic-structure theory to deal with it. In the following, we generalize these strategies to the coupled electron-photon space. We will see that this is straightforward in the sense that all tools and concepts from electronic-structure theory have a clear counterpart in the coupled theory. All the features of the matter-only problem reappear as well, i.e., we will have to deal with the complexity of the many-electron-photon problem. For that, the particle-exchange symmetry plays an important role, and we perform variational minimizations to find the ground state, in complete analogy to equilibrium electronic-structure theory. This requires characterizing the system's state, and here, too, we can utilize the same concepts: the wave function, the electron density together with the corresponding photon quantity, which is the displacement field, and (generalized) RDMs.

However, the exploration of the coupled electron-photon space using these tools is considerably more difficult. One important reason is the enormous size of the corresponding Hilbert space, which is a direct product of the electronic and photonic Hilbert (sub)spaces. Additionally, there is no further symmetry restriction between these subspaces, e.g., there is no exchange symmetry between electronic and photonic coordinates. Thus, the exponential wall grows considerably "faster" here than in the separate theories, which in particular limits wave-function methods and the access to exact reference solutions. At the same time, simple approximation schemes, such as the generalization of the Hartree-Fock ansatz (the "mean field"), do not account for quantum effects of the interaction between electrons and photons, i.e., there is no quantum-mechanical exchange.
This is especially severe in equilibrium scenarios, because the mean-field (or classical) contribution of the electron-photon interaction is trivially zero in many important cases. Using the vocabulary of electronic-structure theory, there is only correlation between static electrons and photons.

To generalize electronic-structure methods to the coupled setting, we are therefore confronted with two complications. First, the coupled electron-photon space is substantially larger than the electronic space alone. Second, the mean-field description, which is one of the most powerful tools of electronic-structure theory, is considerably less useful for coupled systems. Thus, we have to approximate a larger configuration space with more expensive tools. A further substantial difference between the electronic and the coupled electron-photon problem is the new type of interaction operator in the latter case: whereas the Coulomb interaction acts as a 2-body operator that pairwise correlates all the particles, the electron-photon interaction acts as a so-called 3/2-body operator (Sec. 3.3.1). This shows up in the fact that the photon number of a system is a priori not determined, in contrast to the electron number, which is defined by the physical problem. This has important consequences for the characterization of the photonic state, which manifest differently in the wave-function, density-functional or RDM description.

(Footnote: Note that in time-dependent scenarios, the mean-field approximation of the electron-photon interaction is capable of describing important effects, such as the formation of polaritons [23, 233]. Understanding such effects also in equilibrium scenarios is one important motivation for this work.)
In analogy to Sec. 2.1, let us start the discussion with the proper definition of the problem that we aim to solve. We consider the cavity-QED setting (see Sec. 1.3.3) that is described by the Hamiltonian (Def. 1.4)
$$\hat{H} = \sum_{i}^{N}\left[-\frac{1}{2}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i)\right] + \frac{1}{2}\sum_{i \neq j}^{N}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \frac{1}{2}\sum_{\alpha}^{M}\left[-\frac{\partial^2}{\partial p_\alpha^2} + \omega_\alpha^2\left(p_\alpha + \frac{\boldsymbol{\lambda}_\alpha}{\omega_\alpha}\cdot\mathbf{R}\right)^2\right]. \tag{3.1}$$
For the matter part of the system, this corresponds to the electronic-structure Hamiltonian (Eq. (2.1)), and for the photon part, we consider $M$ modes (with known mode functions) that are coupled to the total dipole $\mathbf{R}$ of the electrons. For each mode $\alpha$, $\omega_\alpha$ denotes the frequency and $\boldsymbol{\lambda}_\alpha = \sqrt{4\pi}\, S_\alpha(0)\, \boldsymbol{\epsilon}_\alpha$ the coupling vector. Here, $\boldsymbol{\epsilon}_\alpha$ is the polarization direction of the mode and $S_\alpha(\mathbf{r}) \propto 1/\sqrt{V}$ is the mode function, evaluated at the center of charge $\mathbf{r} = 0$ (see Def. 1.4). To control the electron-photon coupling strength, we regard the absolute value $\lambda_\alpha = |\boldsymbol{\lambda}_\alpha| = \sqrt{4\pi}\, S_\alpha(0)$ as a tunable parameter. Although $\hat{H}$ can be seen as an abstract operator (and we will sometimes make use of this picture), we have already anticipated in Eq. (3.1) the coordinate choice that we will mostly employ in the following: as in Sec. 2, we describe the matter system in real space with coordinates $\mathbf{r}$. For the photon modes, we instead consider the canonical harmonic-oscillator coordinates $p_\alpha$, $-\mathrm{i}\partial_{p_\alpha}$, which in the chosen gauge are proportional to the displacement field, $p_\alpha \propto D_\alpha$, and the magnetic field, $-\mathrm{i}\partial_{p_\alpha} \propto M_\alpha$, respectively (see Sec. 1.3.3). Thus, any eigenstate of $\hat{H}$ can be described by a wave function
$$\Psi(\mathbf{x}_1,\dots,\mathbf{x}_N, p_1,\dots,p_M), \tag{3.2}$$
that depends on $M$ photon coordinates $p_\alpha$ and $N$ spin-spatial electron coordinates $\mathbf{x} = (\mathbf{r}, \sigma)$, where $\sigma \in \{\uparrow, \downarrow\}$ denotes the spin degree of freedom.
Additionally, $\Psi$ is normalized, i.e.,
$$1 = \int \mathrm{d}\mathbf{r}_1 \dots \mathrm{d}\mathbf{r}_N\, \mathrm{d}p_1 \dots \mathrm{d}p_M\, \Psi^*(\mathbf{x}_1,\dots,\mathbf{x}_N, p_1,\dots,p_M)\, \Psi(\mathbf{x}_1,\dots,\mathbf{x}_N, p_1,\dots,p_M) \equiv \langle \Psi | \Psi \rangle, \tag{3.3}$$
where we generalized the "braket" notation to the coupled case in the last step. With these coordinates, $\Psi$ is antisymmetric under the exchange of electronic coordinates, i.e.,
$$\Psi(\mathbf{x}_1,..,\mathbf{x}_i,..,\mathbf{x}_j,..,\mathbf{x}_N, p_1,..,p_M) = -\Psi(\mathbf{x}_1,..,\mathbf{x}_j,..,\mathbf{x}_i,..,\mathbf{x}_N, p_1,..,p_M) \quad \forall\, i,j = 1,\dots,N. \tag{3.4}$$
Importantly, there is no symmetry with respect to the exchange of the displacement-field coordinates $p_\alpha \leftrightarrow p_\beta$, since these are clearly distinguishable. We discuss this special coordinate choice in the next paragraph.

To complete the problem definition, we collect all $\Psi$ that adhere to Eq. (3.4) in the set $C$. This constitutes a domain on which $\hat{H}$ is bounded from below (and self-adjoint), and thus we can define the ground state $\Psi_0$ with energy $E_0 = \langle \Psi_0 | \hat{H} \Psi_0 \rangle$ via the variational principle
$$E_0 = \inf_{\Psi \in C} \langle \Psi | \hat{H} \Psi \rangle. \tag{3.5}$$

(Footnote: This is justified, because the effective mode volume $V$ enters the Hamiltonian via $S_\alpha(0) \propto 1/\sqrt{V}$. This is one of the crucial parameters to reach strong coupling, as we have discussed in Sec. 1.1.)

(Footnote: The indices $\alpha$, $\beta$ correspond to modes with different frequencies $\omega_\alpha$, $\omega_\beta$.)

(Footnote: This follows directly from the self-adjointness of the Pauli-Fierz Hamiltonian, see Sec. 1.3.3.)

[...] $\Psi$ will be the topic of the following sections.

In Sec. 2.1, we have already discussed the antisymmetric many-electron space in great detail. Its parametrization in terms of Slater determinants turned out to be a very important tool for basically all the electronic-structure methods that we have discussed afterwards. Thus, we now want to follow the same route for the many-photon space, which, however, leaves us more options. In Eq. (3.2), we have explicitly parametrized the many-photon space by the displacement coordinates $p_\alpha$. The index $\alpha$ corresponds here to the mode and not to a specific photon.
The advantage of these coordinates is their independence of the photon number. Since the Hamiltonian (3.1) does not conserve the photon number, these coordinates allow us to describe its eigenstates with only one wave function (and not with a superposition of many wave functions that have a different number of coordinates). This is practical for the first-principles perspective in general and especially in the cavity-QED setting, where the number of modes is not too large.

However, the most common parametrization of the many-photon space directly considers the photons, which correspond to the quantized excitations of the electromagnetic field modes. It is an empirically known fact that photons are indistinguishable (like electrons) but do not adhere to the Pauli principle, i.e., many photons can occupy the same state. Thus, the many-photon wave function must be symmetric under particle exchange. If we replace the antisymmetrization by a symmetrization, we can simply follow the procedure that we used in Sec. 2.1 to construct the many-electron space. To do so, we choose some one-photon space $h$ and define the $n$-photon space
$$\mathcal{H}^n_S = \mathcal{S} \bigotimes_{l=1}^{n} h \tag{3.6}$$
as the symmetric product (denoted by $\mathcal{S}$) of orbital spaces. Any $\phi_n \in \mathcal{H}^n_S$ thus describes the state of $n$ photons (or bosons in general), exactly as any $\psi \in \mathcal{H}^N_A$ describes an $N$-electron (fermion) state (cf. Eq. (2.19)). However, when we want to describe the eigenstates of Eq. (3.1), we do not know $n$ a priori, and thus we have to allow in principle for all possible values. The appropriate Hilbert space for this is the Fock space
$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \mathcal{H}^n_S. \tag{3.7}$$
A general state $\Phi \in \mathcal{F}$ is a superposition of $\phi_n$ with arbitrary $n$.

In contrast to the electronic problem, the one-photon or orbital basis of $h$, i.e., the starting point of the many-body construction, is very easy to choose.
The reason is that for photons, we usually do not have to consider external currents or variations of the refractive index [234] that would shape the photon landscape in a similar way as the local potential shapes the electronic orbitals. Once we have solved the classical Maxwell's equations to obtain the mode functions, the quantum-mechanical part of the photon problem reduces to a sum of harmonic oscillators with the Hamiltonian
$$\hat{H}_{ph} = \sum_{\alpha}^{M} \frac{1}{2}\left[-\frac{\partial^2}{\partial p_\alpha^2} + \omega_\alpha^2 p_\alpha^2\right] = \sum_{\alpha}^{M} \omega_\alpha\left(\hat{a}^\dagger_\alpha \hat{a}_\alpha + \frac{1}{2}\right),$$
where we have rewritten $\hat{H}_{ph}$ in the last step by introducing the annihilation operators $\hat{a}_\alpha = \sqrt{\frac{\omega_\alpha}{2}}\left(p_\alpha + \frac{1}{\omega_\alpha}\frac{\partial}{\partial p_\alpha}\right)$ and creation operators $\hat{a}^\dagger_\alpha = \sqrt{\frac{\omega_\alpha}{2}}\left(p_\alpha - \frac{1}{\omega_\alpha}\frac{\partial}{\partial p_\alpha}\right)$. The eigenfunctions of $\hat{H}_{ph}$ are known analytically (see also Sec. 4.2.2), which makes them the standard basis choice in quantum optics. Additionally, there is usually no photon-photon interaction term analogous to the Coulomb interaction, and thus the description of the entire photonic part of the system is basically as simple as the electrons-in-a-box problem (Sec. 2.1.1).

(Footnote: This is how the photon many-body space is introduced in most textbooks, e.g., [10, 137, 144].)

This analytical structure can be used to describe photon states in a very elegant manner. Any eigenstate $|n_\alpha\rangle$ of the individual $\hat{a}^\dagger_\alpha \hat{a}_\alpha$ is given by multiple applications of creation operators to the vacuum state $|0\rangle$, i.e., $|n_\alpha\rangle = (\hat{a}^\dagger_\alpha)^{n_\alpha}|0\rangle$. A general $n$-photon state $|\varphi_n\rangle$ then reads
$$|\varphi_n\rangle = |n_1,\dots,n_M\rangle = (\hat{a}^\dagger_1)^{n_1} \cdots (\hat{a}^\dagger_M)^{n_M}|0\rangle, \tag{3.8}$$
where $n = n_1 + \dots + n_M$. This representation is already geared to the symmetry of bosons, which can occupy one state configuration with several particles; this manifests here in the fact that the mode occupations $n_i$ can take values different from zero or one. In this way, we have completely avoided an explicit coordinate representation and remain in the abstract state space.
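The ladder-operator form of a single-mode Hamiltonian can be checked numerically in a truncated number basis. The following is a minimal sketch under stated assumptions: the helper names `mode_operators` and `oscillator_hamiltonian` and the cutoff `n_fock` are hypothetical, and the operators follow the standard convention in which the annihilation operator is $\hat{a} = \sqrt{\omega/2}\,(p + \omega^{-1}\partial_p)$.

```python
import numpy as np


def mode_operators(omega, n_fock):
    """Matrices of p, d/dp and the number operator in a truncated Fock basis."""
    # <n-1| a |n> = sqrt(n): annihilation operator on the superdiagonal
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)
    a_dag = a.T
    p = (a_dag + a) / np.sqrt(2.0 * omega)    # displacement coordinate
    dp = np.sqrt(omega / 2.0) * (a - a_dag)   # derivative d/dp
    number = a_dag @ a
    return p, dp, number


def oscillator_hamiltonian(omega, n_fock):
    """Single-mode H_ph = 1/2 (-d^2/dp^2 + omega^2 p^2) built from p and d/dp."""
    p, dp, _ = mode_operators(omega, n_fock)
    return 0.5 * (-dp @ dp + omega**2 * p @ p)


if __name__ == "__main__":
    omega, n_fock = 2.0, 8
    h = oscillator_hamiltonian(omega, n_fock)
    _, _, number = mode_operators(omega, n_fock)
    reference = omega * (number + 0.5 * np.eye(n_fock))
    # The highest level is affected by the basis cutoff and is excluded.
    print(np.allclose(h[:-1, :-1], reference[:-1, :-1]))
```

Only the highest retained level deviates from $\omega(n + 1/2)$, which is an artifact of the truncation. The matrix of $p$ connects neighboring number states, so any operator that is linear in $p$ changes the photon number.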
We see that the problem of the basis-set choice, which was an important part of the many-electron description, is basically not present for the photon subsystem. This is an important difference between the first-principles description of electrons and photons.

However, this means that any nontrivial behavior of the photon system stems from the electron-photon interaction
$$\hat{H}_{int} = -\sum_{i}^{N}\sum_{\alpha}^{M} \omega_\alpha p_\alpha \boldsymbol{\lambda}_\alpha \cdot \mathbf{r}_i$$
that couples the electronic and the photonic Hilbert spaces. In other words, there is no photon-only many-body theory, at least if we remain within the Pauli-Fierz picture and do not consider theories with an effective photon-photon interaction. The special form of $\hat{H}_{int}$ is also the reason why eigenstates of $\hat{H}$ are superpositions of $|\varphi_n\rangle$ with different $n$: the photon part of the operator, $p_\alpha = \frac{1}{\sqrt{2\omega_\alpha}}(\hat{a}^\dagger_\alpha + \hat{a}_\alpha)$, does not conserve the particle number.

(Footnote: Note that the form of the electron-photon interaction depends on the gauge. For instance, in the velocity gauge, one would also have to consider the diamagnetic term that, e.g., renormalizes the photon frequency [99, 124].)

(Footnote: Note that although we have restricted the discussion to the cavity-QED setting, most of the general features, such as the construction of the coupled Hilbert space, can be generalized straightforwardly to full minimal coupling. See for example Ref. [100].)

The states $|n_1,\dots,n_M\rangle$ are connected to the displacement representation that we have employed in Eq. (3.2) by $\varphi_{n_\alpha}(p_\alpha) = \langle p_\alpha | n_\alpha \rangle = \langle p_\alpha | (\hat{a}^\dagger_\alpha)^{n_\alpha} | 0 \rangle$. We can then express an $M$-mode state as
$$\varphi_{n_1,\dots,n_M}(p_1,\dots,p_M) = \langle p_1 \dots p_M | n_1,\dots,n_M \rangle = \langle p_1 \dots p_M | (\hat{a}^\dagger_1)^{n_1} \cdots (\hat{a}^\dagger_M)^{n_M} | 0 \rangle. \tag{3.9}$$
It is now crucial to realize that although this is a valid representation of the $n = n_1 + \dots + n_M$-photon state, it is not the coordinate representation of the photons. Thus, there is no exchange symmetry between the different $p_\alpha$.

(Footnote: Note that following Refs. [200, 144], we explicitly chose here a non-normalized basis $\{|\varphi_{n_\alpha}\rangle\}$ of the $n$-photon sector, with $\langle \varphi_{n_\alpha} | \varphi_{n_\alpha} \rangle = n!$, for later convenience. The missing normalization factor is shifted to the resolution of identity in this basis, i.e., $\mathbb{1} = \frac{1}{n!}\sum_{\alpha_1,\dots,\alpha_n=1}^{M} |\alpha_1,\dots,\alpha_n\rangle\langle\alpha_1,\dots,\alpha_n|$, where $|\alpha_1,\dots,\alpha_n\rangle = \hat{a}^\dagger_{\alpha_1} \cdots \hat{a}^\dagger_{\alpha_n}|0\rangle$ as defined later in the text.)

Although less common, we can also employ a coordinate representation for photons in analogy to the standard representation of the electronic wave function in terms of electron coordinates. For that, we associate every such multi-mode eigenstate $|n_1,\dots,n_M\rangle$ with a specific photon-number sector, i.e., the zero-photon sector is spanned by the vacuum state $|0_1,\dots,0_M\rangle \equiv |0\rangle$, the single-photon sector is $M$-dimensional and corresponds to the span of $\hat{a}^\dagger_\alpha|0\rangle \equiv |\alpha\rangle$ for all $\alpha$, and so on. For the multi-photon sectors, the bosonic exchange symmetry appears due to the commutation relations of the ladder operators, e.g., $\hat{a}^\dagger_{\alpha_1}\hat{a}^\dagger_{\alpha_2}|0\rangle = \hat{a}^\dagger_{\alpha_2}\hat{a}^\dagger_{\alpha_1}|0\rangle \equiv |\alpha_1, \alpha_2\rangle$ for $\alpha_1, \alpha_2 \in \{1,\dots,M\}$. A general photon state can therefore be represented by a sum over all photon-number sectors as
$$|\Phi\rangle = \sum_{n=0}^{\infty}\left(\frac{1}{\sqrt{n!}}\sum_{\alpha_1,\dots,\alpha_n=1}^{M} \tilde{\Phi}(\alpha_1,\dots,\alpha_n)\,|\alpha_1,\dots,\alpha_n\rangle\right), \tag{3.10}$$
where $\tilde{\Phi}(\alpha_1,\dots,\alpha_n) = \frac{1}{\sqrt{n!}}\langle \alpha_1,\dots,\alpha_n | \Phi \rangle$. It is no accident that the bosonic symmetry becomes explicit in this representation, since the different modes $\alpha$ determine what the photon wave function looks like in real space (for further details on this topic, see App. A.1).

The coupled electron-photon space
Having illustrated the principal differences between the photonic and the electronic part of the problem, we now want to discuss the coupled many-body Hilbert space. Since every many-body space depends on the underlying one-body space and thus also on the coordinate choice, we have several options here, and we choose the displacement coordinates. We choose a photonic one-body (or orbital) basis $\{\chi_{n_\alpha}(p_\alpha)\}$ that corresponds to the Hilbert space $h_\alpha$ of mode $\alpha$ and construct the photonic space simply as
$$\mathcal{H}_{ph} = \bigotimes_{\alpha=1}^{M} h_\alpha. \tag{3.11}$$
Importantly, the description in terms of $p_\alpha$ allows for an easier definition of the wave function, which is the main reason why we employ it. The coupled $N$-electron-photon space is hence given by
$$C = \mathcal{H}_e \otimes \mathcal{H}_{ph} = \mathcal{A}\left(\otimes_{i=1}^{N} h\right) \otimes \bigotimes_{\alpha=1}^{M} h_\alpha, \tag{3.12}$$
where $h$ is the electronic one-body space and $\mathcal{A}$ is the antisymmetrizer (see Sec. 2.1). $C$ is a proper configuration space of the minimization problem (3.5), which we have to explore to find the ground-state wave function of the QED Hamiltonian (3.1).

Let us now see how we can parametrize wave functions of $C$ to perform the variational minimization. For simplicity, we consider a system consisting of 2 electrons that are coupled to one cavity mode with frequency $\omega$ and coupling vector $\boldsymbol{\lambda}$. This setting is already sufficient to illustrate the main challenges of the coupled problem. The many-body wave function is represented in real space with two electronic variables and one photonic variable, $\Psi(\mathbf{x}_1, \mathbf{x}_2, p)$, where $\mathbf{x} = (\mathbf{r}, \sigma)$ is a spin-spatial coordinate. We choose an electronic basis set $\{\psi_k(\mathbf{x})\}$ and a photonic basis $\{\chi_n(p)\}$ and expand
$$\Psi(\mathbf{x}_1, \mathbf{x}_2, p) = \frac{1}{\sqrt{2}}\sum_{kl,n} A^n_{kl}\left(\psi_k(\mathbf{x}_1)\psi_l(\mathbf{x}_2) - \psi_k(\mathbf{x}_2)\psi_l(\mathbf{x}_1)\right)\chi_n(p) = \Psi[A^n_{kl}]. \tag{3.13}$$
The normalization of the wave function then manifests in the sum rule $\sum_{k,l,n}|A^n_{kl}|^2 = 1$ for the coefficient tensor elements $A^n_{kl} \in \mathbb{C}$.
We see that in addition to the electronic Slater determinant, which adds one index for each electron to the coefficient tensor, we get one photonic index for every mode. The Hamiltonian (3.1) for this scenario reads explicitly
$$\hat{H} = \underbrace{\sum_{k=1}^{2}\left[-\frac{1}{2}\nabla_k^2 + v(\mathbf{r}_k) + \frac{1}{2}(\boldsymbol{\lambda}\cdot\mathbf{r}_k)^2\right]}_{\hat{H}_e = \sum_k h_e(\mathbf{r}_k) = \sum_k [t + v + h_{self,1}](\mathbf{r}_k)} \underbrace{- \sum_{k=1}^{2}\omega p\, \boldsymbol{\lambda}\cdot\mathbf{r}_k}_{\hat{H}_{ep} = \sum_{k=1}^{2} h_{ep}(\mathbf{r}_k, p)} + \underbrace{\frac{1}{2}\left(-\frac{\mathrm{d}^2}{\mathrm{d}p^2} + \omega^2 p^2\right)}_{\hat{H}_{ph}} + \underbrace{\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} + (\boldsymbol{\lambda}\cdot\mathbf{r}_1)(\boldsymbol{\lambda}\cdot\mathbf{r}_2)}_{\hat{H}_{ee} = h_{ee}(\mathbf{r}_1, \mathbf{r}_2) = [w + h_{self,2}](\mathbf{r}_1, \mathbf{r}_2)}. \tag{3.14}$$
For the chosen gauge (see Sec. 1.3.3), the interaction between electrons and photons appears in the bilinear interaction term $\hat{H}_{ep}$ and in the purely electronic dipole self-interaction $\hat{H}_{self} \equiv \frac{1}{2}\sum_{k,l=1}^{2}(\boldsymbol{\lambda}\cdot\mathbf{r}_k)(\boldsymbol{\lambda}\cdot\mathbf{r}_l)$, which has a one-body contribution (last term of $\hat{H}_e$) and a two-body contribution (last term of $\hat{H}_{ee}$).
The energy expectation value of this Hamiltonian computed with the ansatz (3.13) leads to an expression of the form
$$E = E[A^n_{kl}] = \langle \Psi[A^n_{kl}] | \hat{H}\, \Psi[A^n_{kl}] \rangle = E_e[A^n_{kl}] + E_{ph}[A^n_{kl}] + E_{ep}[A^n_{kl}] + E_{ee}[A^n_{kl}]. \tag{3.15}$$
Here, we have defined the electronic and photonic one-body energies $E_e$ and $E_{ph}$,
$$E_e[A^n_{kl}] = \sum_{k',k}\sum_{l,r}\left(A^{r*}_{k'l}A^{r}_{kl} - A^{r*}_{k'l}A^{r}_{lk}\right)\langle \psi_{k'} | h_e \psi_k \rangle, \tag{3.16}$$
$$E_{ph}[A^n_{kl}] = \sum_{r',r}\sum_{kl} A^{r'*}_{kl}A^{r}_{kl}\,\langle \chi_{r'} | h_p \chi_r \rangle, \tag{3.17}$$
the electron-photon interaction energy $E_{ep}$,
$$E_{ep}[A^n_{kl}] = -\omega\sum_{r',r}\sum_{k',k}\sum_{l}\left(A^{r'*}_{k'l}A^{r}_{kl} - A^{r'*}_{k'l}A^{r}_{lk}\right)\langle \psi_{k'} | \boldsymbol{\lambda}\cdot\mathbf{r} | \psi_k \rangle \langle \chi_{r'} | p\, \chi_r \rangle, \tag{3.18}$$
and the electron-electron interaction energy $E_{ee}$,
$$E_{ee}[A^n_{kl}] = \sum_{r}\sum_{k',k}\sum_{l',l}\left(A^{r*}_{k'l'}A^{r}_{kl} - A^{r*}_{k'l'}A^{r}_{lk}\right)\langle \psi_{k'}\psi_{l'} | h_{ee} | \psi_k \psi_l \rangle. \tag{3.19}$$
We see that the (exact) minimization of $E[A^n_{kl}]$ is comparable to the minimization of a 3-electron problem and merely requires the implementation of the extra energy terms. In fact, on the exact wave-function level, electronic and photonic degrees of freedom are very similar. However, such a description is numerically infeasible already for most electronic problems. If the photon modes have to be included as well, such descriptions are even more limited. This is illustrated by the fact that the largest matter-photon problem that has been solved exactly so far consists of three particles (electrons or nuclei) and one photon mode [235].

(Footnote: We remind the reader that including more modes in this representation would not require any further symmetrization, because the $p$ coordinates are distinguishable.
For example, the 2-electron-2-mode wave function reads $\Psi_{2e2m}(\mathbf{x}_1, \mathbf{x}_2, p_1, p_2) = \frac{1}{\sqrt{2}}\sum_{kl,n_1,n_2} A^{n_1 n_2}_{kl}\left(\psi_k(\mathbf{x}_1)\psi_l(\mathbf{x}_2) - \psi_k(\mathbf{x}_2)\psi_l(\mathbf{x}_1)\right)\chi_{n_1}(p_1)\chi_{n_2}(p_2)$.)

(Footnote: We remind the reader that this term is gauge-dependent. In the velocity gauge, for example, the Hamiltonian would exhibit the diamagnetic term instead of the dipole self-interaction. The bilinear interaction term, however, is always present (though the physical meaning of the photon observables changes). See Sec. 1.3.3.)
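To make the growth of the coefficient tensor concrete, the following minimal sketch counts the independent basis states of a truncated coupled space. The function name `coupled_space_dimension`, the use of `n_orbitals` spin orbitals per electron and the cutoff `n_fock` basis functions per mode are all assumptions of this illustration; the count follows from one antisymmetrized electronic index set combined with one distinguishable index per photon mode.

```python
from math import comb


def coupled_space_dimension(n_orbitals, n_electrons, n_fock, n_modes):
    """Number of basis states of the truncated coupled electron-photon space.

    n_orbitals  : number of electronic spin orbitals (assumed finite basis)
    n_electrons : number of electrons (antisymmetric, hence a binomial count)
    n_fock      : number of basis functions kept per photon mode
    n_modes     : number of photon modes (distinguishable, hence a power)
    """
    electronic = comb(n_orbitals, n_electrons)
    photonic = n_fock**n_modes
    return electronic * photonic


if __name__ == "__main__":
    for n_modes in (0, 1, 2, 5):
        dim = coupled_space_dimension(
            n_orbitals=20, n_electrons=2, n_fock=10, n_modes=n_modes
        )
        print(f"modes = {n_modes}: dimension = {dim}")
```

Each additional mode multiplies the dimension by the photonic cutoff, without the reduction that the exchange symmetry provides on the electronic side.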
The mean field approximation: “Hartree-Fock for QED”
The substantial differences of the coupled electron-photon space become explicitly visible when we try to approximate the wave function. Let us consider as a first example the equivalent of the single-Slater-determinant ansatz of HF theory, which is called the mean-field (MF) approximation. This means that we have to consider the simplest wave function that adheres to the symmetry (3.4), i.e., one basis element of the coupled many-body space $C$. This is simply the product of one Slater determinant (the HF ansatz) and a photon orbital,
$$\Phi_{MF}(\mathbf{x}_1, \mathbf{x}_2, p) = \Psi_{HF}(\mathbf{x}_1, \mathbf{x}_2)\chi(p) = \frac{1}{\sqrt{2}}\left(\psi_1(\mathbf{x}_1)\psi_2(\mathbf{x}_2) - \psi_1(\mathbf{x}_2)\psi_2(\mathbf{x}_1)\right)\chi(p), \tag{3.20}$$
with the corresponding energy expression
$$\begin{aligned}
E_{MF} &= \langle \Phi_{MF} | \hat{H}\, \Phi_{MF} \rangle \\
&= \sum_{k=1}^{2}\langle \psi_k | h_e \psi_k \rangle + \langle \chi | h_p \chi \rangle - \omega\sum_{k=1}^{2}\langle \psi_k | \boldsymbol{\lambda}\cdot\mathbf{r} | \psi_k \rangle \langle \chi | p\, \chi \rangle + \left(\langle \psi_1\psi_2 | h_{ee} | \psi_1\psi_2 \rangle - \langle \psi_1\psi_2 | h_{ee} | \psi_2\psi_1 \rangle\right).
\end{aligned} \tag{3.21}$$
We see that the MF ansatz (3.20) entirely decouples electron and photon degrees of freedom, which has severe consequences for the quality of the approximation. This is well illustrated by discussing the electronic and photonic subsystems separately. The electronic part of the energy expression is given by
$$E_{MF,e} = \sum_{k=1}^{2}\langle \psi_k | (h_e + v_{ph}) \psi_k \rangle + \left(\langle \psi_1\psi_2 | h_{ee} | \psi_1\psi_2 \rangle - \langle \psi_1\psi_2 | h_{ee} | \psi_2\psi_1 \rangle\right), \tag{3.22}$$
where we have defined $v_{ph}(\mathbf{r}) = -\omega\langle \chi | p\, \chi \rangle\, \boldsymbol{\lambda}\cdot\mathbf{r}$. This is structurally the same expression as in HF, but with a modified local potential and two-body interaction kernel. The photon part of the energy reads
$$E_{MF,ph} = \langle \chi | h_p \chi \rangle - \omega\, \boldsymbol{\lambda}\cdot\mathbf{R}\, \langle \chi | p\, \chi \rangle, \tag{3.23}$$
where $\mathbf{R} = \sum_{k=1}^{2}\langle \psi_k | \mathbf{r}\, \psi_k \rangle$ is the total dipole of the electronic system. We see that on the photon side, we just have a harmonic oscillator that is shifted by the total electronic dipole moment. The eigenstates of shifted oscillators are coherent states, which are very closely connected to classical fields [89].
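The statement that the mean-field photon problem is a shifted harmonic oscillator can be checked numerically. The sketch below minimizes an energy of the form of Eq. (3.23) for a fixed value of $\boldsymbol{\lambda}\cdot\mathbf{R}$ by diagonalizing $h_p - \omega(\boldsymbol{\lambda}\cdot\mathbf{R})\,p$ in a truncated number basis. The function name `mean_field_photon_ground_state`, the scalar argument `lam_dot_R` and the cutoff `n_fock` are assumptions of this illustration. Completing the square in $h_p - \omega(\boldsymbol{\lambda}\cdot\mathbf{R})\,p$ gives the analytic reference values $E = \omega/2 - (\boldsymbol{\lambda}\cdot\mathbf{R})^2/2$ and $\langle p \rangle = \boldsymbol{\lambda}\cdot\mathbf{R}/\omega$.

```python
import numpy as np


def mean_field_photon_ground_state(omega, lam_dot_R, n_fock=40):
    """Lowest state of h_p - omega * (lambda . R) * p for one photon mode.

    Returns the photon energy contribution and the expectation value <p>.
    The electronic dipole enters only through the fixed number lam_dot_R.
    """
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)       # annihilation operator
    p = (a + a.T) / np.sqrt(2.0 * omega)                  # displacement coordinate
    h_p = omega * (a.T @ a + 0.5 * np.eye(n_fock))        # bare oscillator
    h_shifted = h_p - omega * lam_dot_R * p               # mean-field coupling
    energies, states = np.linalg.eigh(h_shifted)
    chi = states[:, 0]
    return energies[0], chi @ p @ chi


if __name__ == "__main__":
    omega = 1.0
    for lam_dot_R in (0.0, 0.5):
        energy, p_mean = mean_field_photon_ground_state(omega, lam_dot_R)
        print(lam_dot_R, energy, omega / 2 - lam_dot_R**2 / 2, p_mean)
```

For a vanishing dipole the lowest state is the bare vacuum with energy $\omega/2$ and $\langle p \rangle = 0$, so the photon mode carries no information about the matter system at this level of theory.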
Many systems, such as atoms but also many molecules, do not have a static dipole, i.e., $\mathbf{R} = 0$. In this case, $E_{MF,ph} = \langle\chi|h_p\chi\rangle$, which yields the photon vacuum state $\chi_0$ as the lowest-energy state. The photon contribution to the total energy is then the trivial vacuum energy. This important result can be generalized to arbitrarily many modes and particles [39]. We see that the MF approximation entirely neglects the quantum nature of the electron-photon interaction.

With an ansatz of the form (3.20), we have derived coupled “Maxwell-HF” theory. Within this approach, we cannot describe quantum properties of the electromagnetic field beyond coherent states, which are essentially classical [89, Ch. 2.1]. This is very different for the electronic mean-field theory, i.e., HF. We have seen in Sec. 2.2 that the antisymmetry of the HF ansatz alone adds the Fock term to the equations, which [...] multi-reference (correlated) wave function.

(Footnote: Note that the dipole moment becomes the full electronic current in the general minimal-coupling setting.)

This does not mean that coupled Maxwell-HF theory or, more generally, coupled Maxwell-matter theories are not a reasonable extension of electronic structure theory. A prominent example is the coupled Maxwell-Kohn-Sham approach, which calculates the Slater determinant of $\Phi_{MF}$ with the KS-DFT machinery [236, 237, 23]. Possible applications include the theoretical description of high-harmonic generation [238], attosecond physics [239] or molecular systems weakly coupled to the modes of a cavity [233]. The semiclassical approach is especially powerful for investigating time-dependent phenomena, where the solution of Maxwell’s equations and their self-consistent implementation with the KS equations is highly nontrivial.

Inclusion of correlation: the generalized mean-field ansatz
To investigate the role of the quantum nature of the photon field in cavity systems, we have to go beyond the MF approximation. The straightforward way to do so is a more general wave function ansatz, which we briefly want to discuss in the following.

We have concluded already that the configuration space of the ansatz (3.13) is not practically useful beyond very simple systems. If we nevertheless want to employ a wave function, we need to truncate the configuration space in a reasonable way. For that, we can utilize our knowledge about the electronic system, which is quite accurately described by, e.g., one Slater determinant (HF or KS-DFT). To make use of this information, we could try to extend the mean-field wave function in a systematic way, with the idea to remain as close as possible to the single-reference ansatz in the electronic subsystem. For instance, we could use the mean-field ansatz $\Phi_{MF}$ as a kind of basis and build the full wave function from superpositions of different such $\Phi_{MF} \to \Phi_r$ that we denote by an index $r$. One can see this as the “many-body generalization” of the cavity-QED models of Sec. 1.2 (from Eq. (1.16) onwards): the $\Phi_r$ represent the electronic “levels” that are coupled to the states of a photon mode. We call this the generalized MF ansatz, and for our 2-electron-1-mode example the corresponding wave function reads
\[
\Phi_{gMF}(\mathbf{x}_1,\mathbf{x}_2,p) = \sum_{r}^{B} A_r\,\Phi_r(\mathbf{x}_1,\mathbf{x}_2,p) = \frac{1}{\sqrt{2}}\sum_{r}^{B} A_r\big(\psi^r_1(\mathbf{x}_1)\psi^r_2(\mathbf{x}_2) - \psi^r_1(\mathbf{x}_2)\psi^r_2(\mathbf{x}_1)\big)\chi_r(p), \tag{3.24}
\]
where the expansion coefficients $A_r \in \mathbb{C}$ satisfy the normalization condition $\sum_{r}^{B}|A_r|^2 = 1$. In this expansion, we have sorted the problem according to the photon states, which define different sectors labelled by $r$. Note that only the full Slater determinants between different photon sectors are orthogonal, $\langle\Phi_r|\Phi_{r'}\rangle = \delta_{r,r'}$, but not the electronic orbitals alone, i.e., $\langle\psi^r_i|\psi^{r'}_j\rangle \neq \delta_{r,r'}$. Only within one sector $r$ do we have $\langle\psi^r_i|\psi^r_j\rangle = \delta_{ij}$.

(Footnote: If we assumed instead that $\langle\psi^r_i|\psi^{r'}_j\rangle = \delta_{ij}\delta_{r,r'}$, the electronic and photon subsystems would decouple as in the MF description.)

Clearly, this expansion is equivalent to the full many-body ansatz (3.13) for a complete basis set ($B \to \infty$). The central idea behind Eq. (3.24) is that the number $B$ of included MF states $\Phi_r$ is small. The corresponding generalized MF energy expression reads
\[
\begin{aligned}
E_{gMF} &= \langle\Phi_{gMF}|\hat{H}\Phi_{gMF}\rangle = E^{e}_{gMF} + E^{ph}_{gMF} + E^{ep}_{gMF} + E^{ee}_{gMF}\\
&= \sum_{r}^{B}\sum_{k=1}^{2}|A_r|^2\langle\psi^r_k|h_e\psi^r_k\rangle + \sum_{r',r}^{B}\sum_{k',k=1}^{2}A^*_{r'}A_r\,\langle\chi_{r'}|h_p\chi_r\rangle\langle\psi^{r'}_{k'}|\psi^r_k\rangle\\
&\quad - \omega\sum_{r',r}^{B}\sum_{k\neq l=1}^{2}A^*_{r'}A_r\Big(\langle\psi^{r'}_k|\boldsymbol{\lambda}\cdot\mathbf{r}|\psi^r_k\rangle\langle\psi^{r'}_l|\psi^r_l\rangle - \langle\psi^{r'}_k|\boldsymbol{\lambda}\cdot\mathbf{r}|\psi^r_l\rangle\langle\psi^{r'}_l|\psi^r_k\rangle\Big)\langle\chi_{r'}|p\,\chi_r\rangle\\
&\quad + \sum_{r}^{B}|A_r|^2\big(\langle\psi^r_1\psi^r_2|w|\psi^r_1\psi^r_2\rangle - \langle\psi^r_1\psi^r_2|w|\psi^r_2\psi^r_1\rangle\big).
\end{aligned}
\]
Let us briefly analyze the occurring terms. The purely electronic contribution is just the sum of the standard MF terms of the involved Slater determinants,
\[
E^{e}_{gMF} + E^{ee}_{gMF} = \sum_{r}^{B}|A_r|^2E^{e}_r + \sum_{r}^{B}|A_r|^2E^{ee}_r,
\]
where we have defined $E^{e}_r = \sum_{k=1}^{2}\langle\psi^r_k|h_e\psi^r_k\rangle$ and $E^{ee}_r = \big(\langle\psi^r_1\psi^r_2|h_{ee}|\psi^r_1\psi^r_2\rangle - \langle\psi^r_1\psi^r_2|h_{ee}|\psi^r_2\psi^r_1\rangle\big)$. This corresponds to $B$ HF problems, which is manageable even for large $B$.

We continue with the two terms that involve the photon states.
Interestingly, we see that the photon energy term
\[
E^{ph}_{gMF} = \sum_{r',r}^{B}\sum_{k',k=1}^{2}A^*_{r'}A_r\,\langle\chi_{r'}|h_p\chi_r\rangle\langle\psi^{r'}_{k'}|\psi^r_k\rangle,
\]
which in the full many-body description (Eq. (3.17)) was a one-body term (involving only the $\langle\chi_{r'}|h_p\chi_r\rangle$), now also requires the calculation of the overlap integrals $\langle\psi^{r'}_{k'}|\psi^r_k\rangle$ between all the electronic basis elements of all the photon sectors. Thus, the $B$ HF equations are seemingly coupled due to this term, which spoils the advantage of the construction. However, we have not yet made use of the freedom to choose the basis functions $\chi_r$. For instance, if we choose the eigenstates of $\hat{H}_{ph}$, i.e., $\hat{H}_{ph}|\chi_r\rangle = \omega(r+\tfrac{1}{2})|\chi_r\rangle$, we have $\langle\chi_{r'}|\hat{H}_{ph}\chi_r\rangle \propto \delta_{r',r}$. The overlap integrals between different $r$-sectors then drop out and only those within one sector remain, $\langle\psi^{r'}_{k'}|\psi^r_k\rangle \to \langle\psi^r_{k'}|\psi^r_k\rangle = \delta_{k',k}$. The energy expression then reduces to
\[
E^{ph}_{gMF} \xrightarrow{\;\hat{H}_{ph}|\chi_r\rangle = \omega(r+\frac{1}{2})|\chi_r\rangle\;} \sum_{r}^{B}|A_r|^2\,\omega\big(r+\tfrac{1}{2}\big). \tag{3.25}
\]
This is exactly the representation that we have employed to derive the cavity-QED models, and it also makes the photon-energy term in this generalized MF description manageable.

The only missing term is the electron-photon interaction, which reads
\[
E^{ep}_{gMF} = -\omega\sum_{r',r}^{B}\sum_{k\neq l=1}^{2}A^*_{r'}A_r\Big(\langle\psi^{r'}_k|\boldsymbol{\lambda}\cdot\mathbf{r}|\psi^r_k\rangle\langle\psi^{r'}_l|\psi^r_l\rangle - \langle\psi^{r'}_k|\boldsymbol{\lambda}\cdot\mathbf{r}|\psi^r_l\rangle\langle\psi^{r'}_l|\psi^r_k\rangle\Big)\langle\chi_{r'}|p\,\chi_r\rangle. \tag{3.26}
\]
This most complicated term of the energy has three different contributions: the transition dipole matrix elements $\langle\psi^{r'}_k|\mathbf{r}|\psi^r_k\rangle$, again the electronic overlap integrals $\langle\psi^{r'}_l|\psi^r_k\rangle$, and the photon-displacement matrix elements $\langle\chi_{r'}|p\,\chi_r\rangle$. Since we have chosen a photonic basis that we know analytically, we can further simplify this expression. We identify $p = \frac{1}{\sqrt{2\omega}}(\hat{a} + \hat{a}^\dagger)$ and utilize $\hat{a}^\dagger|\chi_r\rangle = \sqrt{r+1}\,|\chi_{r+1}\rangle$, $\hat{a}|\chi_r\rangle = \sqrt{r}\,|\chi_{r-1}\rangle$ to re-express
\[
\langle\chi_{r'}|p\,\chi_r\rangle \xrightarrow{\;\hat{H}_{ph}|\chi_r\rangle = \omega(r+\frac{1}{2})|\chi_r\rangle\;} \frac{1}{\sqrt{2\omega}}\big(\sqrt{r+1}\,\delta_{r',r+1} + \sqrt{r}\,\delta_{r',r-1}\big), \tag{3.27}
\]
which reduces the double sum over $r, r'$ to one sum that couples only neighboring photon sectors. However, we still have to compute the overlap integrals $\langle\psi^r_l|\psi^{r\pm1}_k\rangle$ for all combinations of $(k,r)$ and $(l,r+1)$. Instead of the eigenstates of $\hat{H}_{ph}$, we could also choose the eigenstates of $\hat{p}$, i.e., coherent states, as a photon basis. This would remove all the coupling elements in the $E^{ep}_{gMF}$ expression, $\langle\chi_{r'}|p\,\chi_r\rangle \to \delta_{r,r'}$, in the same way as the eigenstates of $\hat{H}_{ph}$ removed the overlaps in the photon energy before. However, since $[\hat{H}_{ph}, p] \neq 0$, there is no basis that diagonalizes both operators at the same time, and thus we cannot avoid the calculation of the overlap integrals: we are fundamentally confronted with the problem of determining $B$ coupled Slater determinants.

This might seem simpler than solving the standard ansatz (3.13), but the configuration space here grows even faster with the particle number. To see this, we consider the $N$-electron-1-mode wave function
\[
\Psi(\mathbf{x}_1,\dots,\mathbf{x}_N,p) = \sum_r A_r\,\Phi_r(\mathbf{x}_1,\dots,\mathbf{x}_N,p) = \sum_r A_r\,\Psi_r(\mathbf{x}_1,\dots,\mathbf{x}_N)\,\chi_r(p), \tag{3.28}
\]
with the electronic Slater determinants $\Psi_r(\mathbf{x}_1,\dots,\mathbf{x}_N) = \frac{1}{\sqrt{N!}}\sum_{\pi_j\in P_N}(-1)^j\,\psi^r_{\pi_j(1)}(\mathbf{x}_1)\cdots\psi^r_{\pi_j(N)}(\mathbf{x}_N)$, where $P_N$ denotes the permutation group on $N$ elements and the index $j$ is chosen such that it is even (odd) for an even (odd) permutation $\pi_j \in P_N$. With this ansatz, we calculate the electron-photon interaction term
\[
E^{ep} = -\frac{\omega}{N!}\sum_{r',r}A^*_{r'}A_r\,\langle\chi_{r'}|p\,\chi_r\rangle \times \sum_{\pi_i,\pi_j\in P_N}(-1)^{i+j}\Big(\langle\psi^{r'}_{\pi_i(1)}|\boldsymbol{\lambda}\cdot\mathbf{r}|\psi^r_{\pi_j(1)}\rangle\langle\psi^{r'}_{\pi_i(2)}|\psi^r_{\pi_j(2)}\rangle\cdots\langle\psi^{r'}_{\pi_i(N)}|\psi^r_{\pi_j(N)}\rangle\Big). \tag{3.29}
\]
The permutation group $P_N$ has $N!$ elements, over which the double sum in the second factor runs. This means that even if we take only two different photon states into account, we have to calculate in principle $N!$ integrals. This number is enormous, since the factorial function $x! \approx \sqrt{2\pi x}\,x^x e^{-x}$ grows even faster than the exponential. In a practical calculation, many of these integrals would be zero, since the coupled Slater determinants usually do not differ in every basis function.
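To get a feeling for the growth rate mentioned above, the following short sketch compares $N!$ with Stirling's formula and with the exponential $2^N$. It only illustrates the scaling argument; the helper name `stirling` and the chosen values of $N$ are assumptions of this sketch and do not appear in the original text.

```python
import math


def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * n**n * exp(-n) to n!."""
    return math.sqrt(2.0 * math.pi * n) * n**n * math.exp(-n)


# Compare the number of permutations N! with an exponential reference 2**N.
for n_el in (2, 5, 10, 20):
    exact = math.factorial(n_el)
    print(f"N = {n_el:2d}   N! = {exact:.3e}   "
          f"Stirling = {stirling(n_el):.3e}   2^N = {2**n_el:.3e}")
```

Already for a few tens of electrons the number of permutations exceeds the exponential reference by many orders of magnitude, which is the scaling problem that the text refers to.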
Nevertheless, the principal problem remains, which illustrates the intrinsic difficulty of the (many-body) description of polaritons.

(Footnote: According to Stirling’s approximation for factorials [240, part 4].)

The only practical way to utilize the efficient description of electronic structure methods in the coupled setting is to employ the rotating-wave approximation that we have introduced in Sec. 1.2. This means that we project the wave function onto the single-excitation space [241], which would remove the $r$-index in Eq. (3.26). This is a reasonable approach for certain physical settings, especially for time-dependent systems, if the photon mode is close to resonance with a matter excitation. However, for the description of ground states, this approximation breaks down [24] and generalizations are very difficult [96].

3.2 Quantum-electrodynamical density functional theory

The analysis of the previous section showed that a wave function description of coupled electron-photon problems beyond the mean-field level is quite challenging for larger systems and many modes. However, the mean-field wave function, i.e., the generalization of the HF ansatz, cannot capture the quantum effects due to the photon interaction and is thus not sufficient to describe, e.g., strong-coupling phenomena. For that, [...] $\rho(\mathbf{r})$ as fundamental variable, which can be very accurately approximated by considering the KS system, i.e., a non-interacting auxiliary system. We have mentioned already in Sec. 2.3 that the Hohenberg-Kohn theorem that justifies DFT can be generalized to many other scenarios [166], including Pauli-Fierz theory and its limits [39, 242]. This allows us to define QEDFT, which we want to discuss in the following.

Ruggenthaler et al. formally defined time-dependent QEDFT [39] and ground-state QEDFT [242] for the full hierarchy from the fully relativistic regime (neglecting the mathematical problems that we mentioned in Sec. 1.3.1) to model systems in the long-wavelength limit. These works connect the two precursory publications by Ruggenthaler et al. [245] in the relativistic regime and Tokatly [246], who first considered the cavity-QED setting. A possible theoretical framework for the equilibrium phenomena of polaritonic chemistry is thus ground-state QEDFT [242].

QEDFT and the KS construction
Let us start with the basic theory of QEDFT in the cavity-QED setting described by Hamiltonian (3.1). We formally include the time-derivative of an additional external mode-resolved current $\dot{j}_\alpha$ that couples to the photon field. The Hamiltonian then reads
\[
\hat{H} = \sum_{i}^{N}\Big[-\tfrac{1}{2}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i)\Big] + \frac{1}{2}\sum_{i\neq j}^{N}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\alpha}^{M}\Big[-\tfrac{1}{2}\partial^2_{p_\alpha} + \frac{\omega_\alpha^2}{2}\Big(p_\alpha + \frac{\boldsymbol{\lambda}_\alpha}{\omega_\alpha}\cdot\mathbf{R}\Big)^2 + \frac{\dot{j}_\alpha}{\omega_\alpha}\,p_\alpha\Big]. \tag{3.30}
\]
The ground state of $\hat{H}$ is described by a wave function of the form
\[
\Psi(\mathbf{x}_1,\dots,\mathbf{x}_N,p_1,\dots,p_M). \tag{3.31}
\]
Note that the external current $\dot{j}_\alpha$ plays the same role for the photons as $v(\mathbf{r})$ does for the electrons. Correspondingly, the pairs of internal and external variables that are in one-to-one correspondence are [246]
\[
(\rho(\mathbf{r}),\{p_\alpha\}) \longleftrightarrow (v(\mathbf{r}),\{\dot{j}_\alpha\}), \tag{3.32}
\]
where
\[
\rho(\mathbf{r}) = \langle\Psi|\hat{\rho}(\mathbf{r})\Psi\rangle = N\int\mathrm{d}^{3(N-1)}r\,\mathrm{d}^{M}p\;\Psi^*(\mathbf{r},\mathbf{r}_2,\dots,\mathbf{r}_N,p_1,\dots,p_M)\,\Psi(\mathbf{r},\mathbf{r}_2,\dots,\mathbf{r}_N,p_1,\dots,p_M) \tag{3.33a}
\]
is the one-body density as in standard DFT and
\[
p_\alpha = \langle\Psi|p_\alpha\Psi\rangle = \int\mathrm{d}^{3N}r\,\mathrm{d}^{M}p\;\Psi^*(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N,p_1,\dots,p_M)\,p_\alpha\,\Psi(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N,p_1,\dots,p_M) \tag{3.33b}
\]
[...] $\alpha$.

(Footnote: The time-dependent setting is not captured by the Hohenberg-Kohn theorem, which needs to be generalized. This was first shown by Runge and Gross [243] for the special case of analytic external potentials and was generalized to a very broad range of potentials by van Leeuwen [244].)
(Footnote: Note that we have to consider the time-derivative because of the chosen length gauge, where $p_\alpha$ is proportional to the displacement field. In the velocity gauge, for example, we would use the vector potential as principal variable, which couples directly to external currents. Thus, strictly speaking, we consider a quasi-static situation here. In practice, this is just a technical detail and we will not comment further on the time-derivative.)
(Footnote: Note that in the time-independent case, $\dot{j}_\alpha$ is equivalent to an external vector potential [242]. We thus do not have to consider external vector potentials in addition to $\dot{j}_\alpha$, which is in contrast to the (most general) time-dependent case.)

To facilitate the functional construction, the obvious next step is to consider an auxiliary system that we can describe efficiently. The straightforward generalization of the KS system in DFT is a non-interacting or mean-field system in the coupled space, i.e., a system described by the mean-field wave function
\[
\Phi_s(\mathbf{r}_1,\dots,\mathbf{r}_N,p_1,\dots,p_M) = |\psi_1(\mathbf{r}_1)\cdots\psi_N(\mathbf{r}_N)|_{-}\;\chi_1(p_1)\cdots\chi_M(p_M) \tag{3.34}
\]
that we discussed in the previous section for $N = 2$, $M = 1$. We call this the KS construction for QEDFT (KS-QEDFT), and for the electronic orbitals $\psi_i$ the corresponding KS equations read
\[
\epsilon_i\,\psi_i(\mathbf{r}) = \Big[-\tfrac{1}{2}\nabla^2 + v_s(\mathbf{r}) + \sum_{\alpha}^{M}\big(\omega_\alpha\langle p_\alpha\rangle + \boldsymbol{\lambda}_\alpha\cdot\langle\mathbf{R}\rangle\big)\,\boldsymbol{\lambda}_\alpha\cdot\hat{\mathbf{R}}\Big]\psi_i(\mathbf{r}). \tag{3.35}
\]
Note that we explicitly included the expectation value of the displacement $\langle p_\alpha\rangle = \langle\Phi_s|p_\alpha\Phi_s\rangle$ of the photon-field modes and the expectation value of the total dipole $\langle\mathbf{R}\rangle$ of the matter system. We explicitly added the $\langle\cdot|\cdot\rangle$ symbol to differentiate these mean values from the operator $\hat{\mathbf{R}}$. The whole last term in Eq. (3.35) thus represents the mean-field contribution of the electron-photon interaction term (see the corresponding term in Eq. (3.21)), i.e., the straightforward generalization of the electronic Hartree potential (see Eq. (2.36)). The KS potential
\[
v_s(\mathbf{r}) = v(\mathbf{r}) + v^{e}_{Hxc}[\rho,\{p_\alpha\}](\mathbf{r}) + \sum_{\alpha}^{M}v^{ph}_{\alpha,xc}[\rho,\{p_\alpha\}](\mathbf{r}) \tag{3.36}
\]
of KS-QEDFT consists of three main parts. The first two terms are equivalent to KS-DFT, where $v(\mathbf{r})$ is the external potential and $v^{e}_{Hxc}$ is the usual Hartree-exchange-correlation potential (cf. Eq. (2.57)) that describes the Coulomb interaction between the electrons. The third term is new and constitutes the (mode-resolved) photonic exchange-correlation potential $v^{ph}_{\alpha,xc}$ that accounts for the electron-photon correlation.

The equations for the photonic subsystem are much simpler, because the coherent states of the mean-field ansatz can be equivalently described by their mean values, which are the (classical) displacement coordinates $p_\alpha$. The corresponding eigenvalue problem is given by the time-independent Maxwell’s equations. These reduce to the trivial equality
\[
\omega_\alpha^2\,p_\alpha = \omega_\alpha\,\boldsymbol{\lambda}_\alpha\cdot\langle\mathbf{R}\rangle + \frac{\dot{j}_\alpha}{\omega_\alpha}. \tag{3.37}
\]

The problem of a non-interacting KS system
This shows again the difficulty of the time-independent electron-photon problem: if we consider the case where $\dot{j}_\alpha = \mathbf{R} = 0$, the photon part of the KS system does not contribute to the solution. Additionally, we lose the entire mean-field part of the Kohn-Sham equations, which in this scenario read
\[
\epsilon_i\,\phi_i(\mathbf{r}) = \Big[-\tfrac{1}{2}\nabla^2 + v_s(\mathbf{r})\Big]\phi_i(\mathbf{r}). \tag{3.38}
\]
It is important to realize how general this situation is, since it includes all atoms and also all molecules that are aligned in an inversion-symmetric way with respect to the polarization direction such that $\boldsymbol{\lambda}\cdot\langle\mathbf{R}\rangle \equiv 0$.

(Footnote: This is the case for, e.g., all centrosymmetric potentials $v(\mathbf{r}) = v(-\mathbf{r})$, where $\langle\mathbf{R}\rangle = \sum_{i=1}^{N}\langle\mathbf{r}_i - \mathbf{r}_0\rangle \equiv 0$ if we choose the center of mass as the reference $\mathbf{r}_0$ for the dipole operator.)

[...] $v^{ph}_{\alpha,xc}$, which we do not know. Additionally, we do not have direct access to non-classical observables of the photon field such as the photon number. The reason behind this is obviously our choice of a non-interacting auxiliary system. Since there is no symmetrization between the electronic and photonic Hilbert spaces, we cannot gain “quantumness” merely by considering a properly symmetrized ansatz. In other words, there is no exchange contribution of the electron-photon interaction.

Dynamics vs Statics in QEDFT
As we have mentioned in the beginning, QEDFT was first formulated for the time-dependent case, which means that we have to solve the time-dependent version of the coupled KS equations (3.35) and (3.37) self-consistently. So far, QEDFT has mainly been applied in this setting, which we therefore briefly want to discuss. Interestingly, the accurate description of many time-dependent problems is actually easier with QEDFT than that of ground states. The principal reason is that in the time-dependent case, already the mean field, i.e., the classical electromagnetic field, contributes nontrivially to the problem. The simplest approximation of such a time-dependent QEDFT just disregards $v^{ph}_{\alpha,xc}$ completely, which is equivalent to the Maxwell-Kohn-Sham approach that we have mentioned in Sec. 3.1 and which leads to a very good description of the weak (and sometimes intermediate) coupling regime. For instance, the linear response of many systems is captured well [233]. It has also been shown how one can in principle include the nuclear dynamics in the description and obtain accurate results on the level of Ehrenfest dynamics (nuclear mean field) [247, 248]. A good overview of the applicability and the current state of (time-dependent) QEDFT can be found in the review by Flick et al. [249].

As a final comment on the time-dependent case, we would like to mention the publication of Wang et al. [250], in which the authors connect QEDFT to cavity-QED models. They show that for small coupling strengths the cavity-QED models indeed describe the principal physics very accurately. This is a first step in the direction of one of the important applications of first-principles theory, namely to define the range of applicability of model descriptions.

Photon-exchange-correlation
Despite the interesting results that have been obtained on the mean-field level and the many conceptual insights that we could gain merely by establishing the general theory of QEDFT, we have to go beyond that for our ground-state problem. Thus, we are faced with the most difficult part of every type of DFT, i.e., the approximation of the unknown exchange-correlation functional $v^{ph}_{\alpha,xc}$. Only one functional has been proposed so far, by Pellegrini et al. [251], which is a generalization of the optimized effective potential (OEP) approach of standard DFT (see Sec. 2.3). This approach allows one to construct a functional on the basis of perturbation theory by using a connection to the Green’s function formalism [144]. Based on this connection, Pellegrini et al. developed the photon OEP, which considers a lowest-order perturbative correction of the effective potential, i.e., it takes one-photon processes into account. This is the straightforward generalization of the exact-exchange approximation in KS-DFT that we have mentioned in Sec. 2.3. As in the electronic case, the photon OEP goes beyond standard perturbation theory, which is used to calculate a correction to an already determined ground state. Instead, the photon OEP is a nonlinear functional of $\rho$ that has to be included self-consistently when we solve the KS equations.

Unfortunately, using the photon OEP in practice is numerically very expensive and difficult to converge [247]. This is a well-known problem of the OEP approach (for photonic as well as electronic corrections) and the reason why the OEP is rarely used explicitly. Instead, the Krieger-Li-Iafrate (KLI) approximation [252] is very common; it captures the main features of the OEP quite accurately at relatively low computational cost and with good convergence properties [253].
Unfortunately, the generalization of the KLI approximation to the photon OEP is considerably less accurate than in the electronic case and thus not very useful in practice [247]. Another severe drawback of the OEP functional with regard to strong-coupling physics is its limitation to one-photon processes. Flick et al. [247] analyzed this for an exactly solvable system and came to the conclusion that the photon OEP starts to fail when two-photon processes become important in the electron-photon interaction. Since this is a hallmark of the strong-coupling regime, we need functionals that go beyond the photon OEP to describe many of the phenomena of polaritonic chemistry.

Comparing with the history of DFT, it seems very probable that once an accurate photon-exchange-correlation functional has been found, KS-QEDFT will become the standard tool to describe coupled electron-photon problems. However, there are also alternative routes, as we will show in part II, that provide a valuable additional perspective. This might be helpful not only for the functional construction (see also the Outlook in part IV) but also for possible scenarios where such a future functional is inaccurate. For instance, the emergence of polaritons in the strong-coupling regime shows that the electronic and photonic subsystems are strongly mixed, which corresponds to a strongly correlated character of the wave function. The experience from KS-DFT shows that the non-interacting auxiliary system is not a good starting point to describe such systems. It is thus possible that we will face a similar challenge when we want to describe strongly coupled systems with KS-QEDFT.

3.3 Reduced density matrices in QED

In this section, we generalize the concept of reduced density matrices (RDMs) to the coupled electron-photon space. Although in principle straightforward, the coupled light-matter problem poses many new challenges to an RDM description because of the special form of the electron-photon interaction, which is not particle-conserving.
Nevertheless, it is possible to define a variational RDM theory and RDMFT (QED-RDMFT) also in the coupled setting.
We start with the generalization of the concept of RDMs to the coupled electron-photon problem. For that, we analyze an $N$-electron-$M$-mode system in terms of the RDM description, following the same procedure as in Sec. 2.4.1 for the $N$-electron problem. We follow here the section “Reduced density matrices for coupled light-matter systems” of Ref. [1]. To ease reading, we briefly recapitulate the most important definitions of Sec. 3.1.

We consider the cavity-QED Hamiltonian
\[
\hat{H} = \sum_{i}^{N}\Big[-\tfrac{1}{2}\nabla^2_{\mathbf{r}_i} + v(\mathbf{r}_i)\Big] + \frac{1}{2}\sum_{i\neq j}^{N_e}\frac{q_iq_j}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\alpha}^{M}\Big[-\tfrac{1}{2}\partial^2_{p_\alpha} + \frac{\omega_\alpha^2}{2}\Big(p_\alpha + \frac{\boldsymbol{\lambda}_\alpha}{\omega_\alpha}\cdot\mathbf{R}\Big)^2\Big], \tag{cf. 3.1}
\]
which we split for later convenience into the contributions
\[
\hat{H} = \hat{T} + \hat{V} + \hat{W} + \hat{H}_{ph} + \hat{H}_{int} + \hat{H}_{self,1} + \hat{H}_{self,2}, \tag{3.39}
\]
where
\[
\begin{aligned}
\hat{T} &= -\frac{1}{2}\sum_{i}^{N_e}\nabla^2_{\mathbf{r}_i}, &
\hat{V} &= \sum_{i}^{N_e}v(\mathbf{r}_i), &
\hat{W} &= \frac{1}{2}\sum_{i\neq j}^{N_e}\frac{q_iq_j}{|\mathbf{r}_i - \mathbf{r}_j|},\\
\hat{H}_{ph} &= \frac{1}{2}\sum_{\alpha}^{M_{ph}}\Big[-\partial^2_{p_\alpha} + \omega_\alpha^2p_\alpha^2\Big], &
\hat{H}_{int} &= -\sum_{i}^{N_e}\sum_{\alpha}^{M_{ph}}\omega_\alpha p_\alpha\,\boldsymbol{\lambda}_\alpha\cdot\mathbf{r}_i, &&\\
\hat{H}_{self,1} &= \frac{1}{2}\sum_{i=1}^{N_e}\sum_{\alpha}^{M_{ph}}\big(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r}_i\big)^2, &
\hat{H}_{self,2} &= \frac{1}{2}\sum_{i\neq j=1}^{N_e}\sum_{\alpha}^{M_{ph}}(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r}_i)\big(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r}_j\big). &&
\end{aligned}
\]
The Hamiltonian (3.1) has eigenstates of the form
\[
\Psi(\mathbf{x}_1,\dots,\mathbf{x}_N,p_1,\dots,p_M), \tag{cf. 3.2}
\]
where $\mathbf{x} = (\mathbf{r},\sigma)$ are spin-spatial coordinates. The wave function $\Psi$ is antisymmetric with respect to the exchange of any two $\mathbf{x}_j \leftrightarrow \mathbf{x}_k$ and depends on $M$ photon-mode displacement coordinates $p_\alpha$ (see Sec. 3.1.1).

Though RDMs and their properties are quite general objects that can be defined with respect to any wave function or density matrix, their form and their role are most obvious when we calculate expectation values. The prime example is the energy expression that we use to define the ground state of a system.
According to the variational principle, the ground state of the Hamiltonian (3.1) is the (possibly degenerate) state that has the lowest energy expectation value
\[
E[\Psi] = \inf_{\Psi}\langle\Psi|\hat{H}\Psi\rangle. \tag{3.40}
\]
We have discussed already that this minimization is not useful in practice, since it has to be performed over the configuration space spanned by all possible many-body wave functions, which gives rise to the exponential wall. However, the full wave function is usually not necessary to compute the energy expectation value; typically, RDMs are sufficient. This reduces the configuration space enormously, because the coordinate dependence of RDMs is independent of the particle number (see Sec. 2.4).

The RDM perspective and coupled light-matter systems
In analogy to Def. (2.77), let us now define the $q$RDM for photons or, more generally, bosons by integrating over all but $q$ particle coordinates. For that, we need a corresponding wave function representation as discussed in Sec. 3.1.1 ($\tilde{\Phi}$ in Eq. (3.10)). Thus, we consider an $N_b$-boson state in the representation $\psi_b(\alpha_1,\dots,\alpha_{N_b})$ and define the corresponding bosonic $q$RDM
\[
\Gamma^{(q)}_b(\alpha_1,\dots,\alpha_q;\alpha'_1,\dots,\alpha'_q) = \frac{N_b!}{(N_b-q)!}\sum_{\alpha_{q+1},\dots,\alpha_{N_b}=1}^{M}\psi^*_b(\alpha'_1,\dots,\alpha'_q,\alpha_{q+1},\dots,\alpha_{N_b})\,\psi_b(\alpha_1,\dots,\alpha_q,\alpha_{q+1},\dots,\alpha_{N_b}). \tag{3.41}
\]
Equivalently to the electronic case, we denote the 1RDM by $\gamma_b(\alpha,\beta) = \Gamma^{(1)}_b(\alpha;\beta)$. But this definition is not sufficient to describe expectation values of the Hamiltonian (1.37). Because of the form of the electron-photon interaction, the number of photons is undetermined and we need to work with Fock-space wave functions $|\Phi\rangle$. For the 1RDM, we generalize the definition (3.41) as
\[
\gamma_b(\alpha,\beta) = \langle\Phi|\hat{a}^\dagger_\beta\hat{a}_\alpha\Phi\rangle = \sum_{N_b=1}^{\infty}N_b\Bigg(\sum_{\alpha_2,\dots,\alpha_{N_b}=1}^{M}\psi^*_b(\beta,\alpha_2,\dots,\alpha_{N_b})\,\psi_b(\alpha,\alpha_2,\dots,\alpha_{N_b})\Bigg). \tag{3.42}
\]
We see here why the coordinate representation is normally not used for photon states. Since their particle number is usually not fixed, the description becomes highly cumbersome. The abstract definition of $|\Phi\rangle$ together with the annihilation (creation) operators $\hat{a}^{(\dagger)}_\alpha$ is much simpler. Thus, we consider in the following the general bosonic Fock-space $q$RDM directly via
\[
\Gamma^{(q)}_b(\alpha_1,\dots,\alpha_q;\alpha'_1,\dots,\alpha'_q) = \langle\Phi|\hat{a}^\dagger_{\alpha'_1}\cdots\hat{a}^\dagger_{\alpha'_q}\hat{a}_{\alpha_q}\cdots\hat{a}_{\alpha_1}\Phi\rangle. \tag{3.43}
\]
The fermionic and bosonic RDMs can be extended to the coupled fermion-boson case straightforwardly by just integrating/summing out the other degrees of freedom. That is, if we have a general electron-boson state of the form of Eq. (3.2), we can accordingly define $\Gamma^{(q)}_e \equiv \frac{N!}{(N-q)!}\sum_{\sigma_1,\dots,\sigma_N}\int\mathrm{d}^{3(N-q)}r\,\mathrm{d}^{M}p\;\Psi^*\Psi$ as well as $\Gamma^{(q)}_b \equiv \langle\Psi|\hat{a}^\dagger_{\alpha'_1}\cdots\hat{a}^\dagger_{\alpha'_q}\hat{a}_{\alpha_q}\cdots\hat{a}_{\alpha_1}\Psi\rangle$.

(Footnote: Note that to actually calculate the latter case, we have to employ the connection (3.9).)

In a next step, we check whether these standard ingredients of RDM theories are sufficient to express the energy expectation value of the Hamiltonian of Eq. (3.1). For the purely electronic part, the different contributions can be expressed explicitly either by the electronic 1RDM or by the electronic 2RDM. As the general prescription of Def. (2.77) tells us, all expectation values of the single-particle operators of $\hat{H}$ are given in terms of the 1RDM. These are the standard electronic operators $\hat{T}$ and $\hat{V}$, but also the single-particle part of the dipole self-energy, $\hat{H}_{self,1}$. We can thus write
\[
\begin{aligned}
T[\gamma_e] &= \int\mathrm{d}^3r\,\Big[-\tfrac{1}{2}\nabla^2_{\mathbf{r}}\Big]\gamma_e(\mathbf{r};\mathbf{r}')\Big|_{\mathbf{r}'=\mathbf{r}}, &\text{(3.44a)}\\
V[\gamma_e] &= \int\mathrm{d}^3r\,\big[v(\mathbf{r})\big]\gamma_e(\mathbf{r};\mathbf{r}), &\text{(3.44b)}\\
H_{self,1}[\gamma_e] &= \int\mathrm{d}^3r\,\Bigg[\frac{1}{2}\sum_{\alpha=1}^{M}(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r})^2\Bigg]\gamma_e(\mathbf{r};\mathbf{r}) &\text{(3.44c)}
\end{aligned}
\]
as functionals of $\gamma_e$. The latter two energies are actually functionals of merely the electronic density $\rho(\mathbf{r}) = \gamma_e(\mathbf{r};\mathbf{r})$. As before, the subscript $|_{\mathbf{r}'=\mathbf{r}}$ indicates that $\mathbf{r}'$ is set to $\mathbf{r}$ after the application of the semi-local single-particle operator $-\tfrac{1}{2}\nabla^2_{\mathbf{r}}$. The expectation values of the electronic interaction energy $\hat{W}$ and of the two-body part of the dipole self-energy $\hat{H}_{self,2}$ are given in terms of the diagonal of the 2RDM by
\[
\begin{aligned}
W[\Gamma^{(2)}_e] &= \frac{1}{2}\int\mathrm{d}^3r\,\mathrm{d}^3r'\,w(\mathbf{r},\mathbf{r}')\,\Gamma^{(2)}_e(\mathbf{r},\mathbf{r}';\mathbf{r},\mathbf{r}'), &\text{(3.45a)}\\
H_{self,2}[\Gamma^{(2)}_e] &= \frac{1}{2}\int\mathrm{d}^3r\,\mathrm{d}^3r'\,\Bigg[\sum_{\alpha=1}^{M}(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r})\big(\boldsymbol{\lambda}_\alpha\cdot\mathbf{r}'\big)\Bigg]\Gamma^{(2)}_e(\mathbf{r},\mathbf{r}';\mathbf{r},\mathbf{r}'). &\text{(3.45b)}
\end{aligned}
\]
Hence, for the electronic operator expectation values little changes in comparison to a purely fermionic problem, except that we have a coupled electron-boson wave function and the extra contributions of the dipole self-energy. For the purely bosonic part of the coupled Hamiltonian we can do the same and find (see [...])
\[
H_{ph}[\gamma_b] = \langle\Psi|\Bigg\{\frac{1}{2}\sum_{\alpha=1}^{M}\Big[-\frac{\partial^2}{\partial p_\alpha^2} + \omega_\alpha^2p_\alpha^2\Big]\Bigg\}\Psi\rangle = \sum_{\alpha=1}^{M}\omega_\alpha\Big(\gamma_b(\alpha,\alpha) + \frac{1}{2}\Big). \tag{3.46}
\]
We see that the photon-energy functional structurally resembles exactly the electronic one-body expressions (3.44a). This is of course due to the construction of RDMs, but it is nevertheless remarkable, since we deal with two different particle classes. The underlying exchange symmetry of an RDM is instead encoded in what we introduced in Sec. 2.4.1 as $N$-representability conditions. For the 1RDM of (fermionic or bosonic) ensembles, the conditions are simple and they are most easily expressed in the eigenbases $\gamma_{e/b} = \sum_i n^{e/b}_i\big(\phi^{e/b}_i\big)^*\phi^{e/b}_i$, where the $\phi^{e/b}_i$ are called the natural orbitals and the $n^{e/b}_i$ the natural occupation numbers. Then, the conditions are
\[
\begin{aligned}
0 \leq n^{e}_i &\leq 1, &\text{(3.47a)}\\
0 \leq n^{b}_i, & &\text{(3.47b)}
\end{aligned}
\]
for fermions and bosons, respectively. Let us recall our example of the $n = n_1 + \dots + n_M$-boson state that describes, e.g., photons occupying mode 1 with $n_1$ photons, mode 2 with $n_2$ photons, etc. From the definition of the bosonic 1RDM it is obvious that $n_i \equiv n^{b}_i$ in the above expression, and thus we see how the bosonic character of photons is directly transferred to the $N$-representability condition. In practice, this means that any positive-semidefinite matrix can be a bosonic 1RDM, which makes the conditions much less stringent than in the fermionic case, where the Pauli principle translates into the upper bound for the $n^{e}_i$. Additionally, if the particle number $N_{e/b}$ of one species of the system is conserved, the respective sum rule
\[
\sum_{i=1}^{\infty}n^{e/b}_i = N_{e/b} \tag{3.48}
\]
becomes a second part of the $N$-representability conditions. Thus, in our case, we have yet another condition for the electronic part of the system, but no further bounds for the photons. We mention this here to stress that the 1RDM $N$-representability conditions for electrons reduce the configuration space of valid 1RDMs quite strongly. If we want to build some kind of RDMFT, this is on the one hand numerically challenging, because we have to test these conditions, but on the other hand it is really helpful, because then any approximation to the interaction functional needs to carry less “information.” We discussed in Sec. 2.4 a similar example of the more stringent $N$-representability conditions that refer to pure states and not only ensembles. Theophilou et al. [206] showed that enforcing the pure-state conditions in RDMFT yields more accurate results than enforcing only the ensemble conditions for the same functional. This indicates that the functional construction in a theory that is based on the photonic 1RDM might be more difficult.

The problem of the electron-photon coupling: we need a new type of RDM
However, the coupled electron-photon theory confronts us with even more intricate problems in terms of RDMs and representability conditions. The bilinear coupling term $\hat{H}_{int}$, which is the key quantity of the coupled theory, cannot be treated by the $q$RDMs that we have defined in Def. (2.77) and Eq. (3.43). Formally, we could write

\[ H_I[\Gamma^{(3/2)}_{e,b}] = \langle \Psi | \left[ \sum_{\alpha=1}^{M} -\omega_\alpha p_\alpha \boldsymbol{\lambda}_\alpha \cdot \hat{\mathbf{D}} \right] \Psi \rangle = \sum_{\alpha=1}^{M} -\omega_\alpha \langle \Psi | \left[ \sqrt{\frac{1}{2\omega_\alpha}} \left( \hat{a}^\dagger_\alpha + \hat{a}_\alpha \right) \boldsymbol{\lambda}_\alpha \cdot \hat{\mathbf{D}} \right] \Psi \rangle, \tag{3.49} \]

with a new reduced quantity that mixes light and matter degrees of freedom and can be interpreted as a “3/2-body” operator $\Gamma^{(3/2)}_{e,b}(\alpha; \mathbf{r}, \mathbf{r}')$. This can be best understood if we also lift the continuous fermionic problem into its own Fock space and introduce genuine field operators $\hat{\psi}^\dagger_e(\mathbf{r}\sigma)$ and $\hat{\psi}_e(\mathbf{r}\sigma)$ with the usual anti-commutation relations. Similarly to the discussed bosonic case, the electronic RDMs can then be written in terms of strings of creation and annihilation field operators [254]. We re-express

\[ \langle \Psi | \left[ \left( \hat{a}^\dagger_\alpha + \hat{a}_\alpha \right) \boldsymbol{\lambda}_\alpha \cdot \hat{\mathbf{D}} \right] \Psi \rangle = \sum_\sigma \int \mathrm{d}^3 r \, \langle \Psi | \left[ \left( \hat{a}^\dagger_\alpha + \hat{a}_\alpha \right) \hat{\psi}^\dagger_e(\mathbf{r}\sigma) \hat{\psi}_e(\mathbf{r}\sigma) (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}) \right] \Psi \rangle \]

and define

\[ \Gamma^{(3/2)}_{e,b}(\alpha; \mathbf{r}, \mathbf{r}') = \sum_\sigma \langle \Psi | \left[ \left( \hat{a}^\dagger_\alpha + \hat{a}_\alpha \right) \hat{\psi}^\dagger_e(\mathbf{r}\sigma) \hat{\psi}_e(\mathbf{r}'\sigma) \right] \Psi \rangle. \tag{3.50} \]

We can now re-write Eq. (3.49) as

\[ H_I[\Gamma^{(3/2)}_{e,b}] = \sum_{\alpha=1}^{M} -\sqrt{\frac{\omega_\alpha}{2}} \int \mathrm{d}^3 r \, (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}) \, \Gamma^{(3/2)}_{e,b}(\alpha; \mathbf{r}, \mathbf{r}). \tag{3.51} \]

The “property” of the bilinear interaction term to create or annihilate bosons by interacting with the electronic subsystem is thus directly connected to the half-integer index of the corresponding RDM. This is clearly the straightforward generalization of the $q$RDMs. However, this definition is only useful if we understand its properties.
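To make the definition (3.50) and the energy expression (3.51) more tangible, the following minimal sketch evaluates both for a toy model. All modelling choices are assumptions made only for illustration and are not taken from the text: a single spinless electron on a one-dimensional lattice (so that $\hat{\psi}^\dagger_e(r)\hat{\psi}_e(r')$ reduces to $|r\rangle\langle r'|$), one photon mode truncated to a finite Fock space, and the hypothetical function names `annihilation`, `gamma_three_half`, `interaction_energy_from_rdm` and `interaction_energy_direct`. The sketch checks numerically that contracting the diagonal of the 3/2-body RDM reproduces the directly computed expectation value of the bilinear coupling.

```python
import numpy as np


def annihilation(n_fock):
    """Truncated photon annihilation operator with a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, n_fock)), k=1)


def gamma_three_half(psi):
    """3/2-body RDM of Eq. (3.50) for ONE electron and ONE mode (toy assumption).

    psi[r, n] is the amplitude for the electron on site r with n photons.
    Returns Gamma[r, r'] = sum_{n,m} psi*[r, n] (a^dag + a)[n, m] psi[r', m].
    """
    a = annihilation(psi.shape[1])
    displacement = a + a.T  # (a^dag + a), real and symmetric
    return psi.conj() @ displacement @ psi.T


def interaction_energy_from_rdm(gamma32, positions, lam, omega):
    """Discrete analogue of Eq. (3.51): only the diagonal Gamma(r, r) enters."""
    return -np.sqrt(omega / 2.0) * np.sum(lam * positions * np.diag(gamma32)).real


def interaction_energy_direct(psi, positions, lam, omega):
    """<Psi| -omega p (lam * r) |Psi> with p = (a^dag + a) / sqrt(2 omega)."""
    a = annihilation(psi.shape[1])
    p = (a + a.T) / np.sqrt(2.0 * omega)
    # Site index is the slow index of psi.reshape(-1), matching np.kron(site, fock).
    h_int = -omega * np.kron(np.diag(lam * positions), p)
    vec = psi.reshape(-1)
    return np.vdot(vec, h_int @ vec).real


# Illustrative parameters (arbitrary choices, not from the text).
rng = np.random.default_rng(1)
n_sites, n_fock = 5, 8
psi = rng.normal(size=(n_sites, n_fock)) + 1j * rng.normal(size=(n_sites, n_fock))
psi /= np.linalg.norm(psi)
positions = np.linspace(-2.0, 2.0, n_sites)
lam, omega = 0.1, 0.5

gamma32 = gamma_three_half(psi)
print(interaction_energy_from_rdm(gamma32, positions, lam, omega))
print(interaction_energy_direct(psi, positions, lam, omega))
```

In this toy setting the two printed numbers agree up to floating-point error, which mirrors the statement that the interaction energy is a linear functional of the diagonal $\Gamma^{(3/2)}_{e,b}(\alpha; \mathbf{r}, \mathbf{r})$ alone. The sketch says nothing about the representability of such an object, which is the actual difficulty discussed in the text.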
First of all, we note that the 3/2-body RDM has in general no simple connection to any $q$RDM that would be similar to the connection formula (2.84), even if we extend the definitions to include combined matter-boson $q$RDMs. If we consider for example the integration $\int \mathrm{d}^3 r \, \Gamma^{(3/2)}_{e,b}(\alpha; \mathbf{r}, \mathbf{r})$, we obtain a function that depends only on $\alpha$ and thus is related to some kind of bosonic 1/2-body RDM. We refrain at this point from a precise definition of such half-body objects since we will not discuss them further. However, one obvious reason for the missing connection is that $q$RDMs conserve particle numbers, while half-body RDMs do not.

We want to finish this discussion with a simple example, which shows that the information of the photonic 1RDM $\gamma_b(\alpha,\beta) = \langle \Phi | \hat{a}^\dagger_\beta \hat{a}_\alpha \Phi \rangle$ is different from that of the photonic 1/2RDM $\Gamma^{(1/2)}_b(\beta) = \langle \Phi | \hat{a}^\dagger_\beta \Phi \rangle$. In the special case that $|\Phi\rangle$ consists only of coherent states for each mode (which essentially means that we have treated the photons in mean field), and since the coherent states are eigenfunctions of the annihilation operators, we find $\gamma_b(\alpha,\beta) = d^*_\beta d_\alpha$, where $d_\alpha$ is the total displacement of the coherent state of mode $\alpha$. In this case we also know $\langle \Phi | \hat{a}^\dagger_\beta \Phi \rangle = d^*_\beta$. If we now assume that all but one mode, say mode 1, have zero displacement, then we only know $\gamma_b(1,1) = |d_1|^2$ from the bosonic 1RDM. We do, however, in general not know what $d^*_1$ is, because its phase is lost. For other states, such a connection is even less explicit.

Variational RDM theory for NR-QED: The representability problem
Putting the interrelations among the different RDMs aside for the moment, the minimization for the coupled matter-boson problem can be reformulated as

\[ E_0 = \inf_{\Psi} \langle \Psi | \hat{H} \Psi \rangle = \inf_{\{\gamma_e, \Gamma^{(2)}_e, \gamma_b, \Gamma^{(3/2)}_{e,b}\} \to \Psi} \left\{ (T + V)[\gamma_e] + (W + H_d)[\Gamma^{(2)}_e] + H_{ph}[\gamma_b] + H_I[\Gamma^{(3/2)}_{e,b}] \right\}. \tag{3.52} \]

So in principle, we could replace the variation over all wave functions $\Psi$ by a variation over the respective set of RDMs needed to define the energy expectation values. Instead of varying over the full configuration space $(\mathbf{x}_1, \dots, \mathbf{x}_N, p_1, \dots, p_M)$, the above reformulation seems to indicate that we can vary over $(\mathbf{r}, \mathbf{r}')$ for the diagonal of $\Gamma^{(2)}_e$ and also for the 1RDM $\gamma_e$, together with a variation over $(\alpha, \beta)$ for $\gamma_b$ and over $(\alpha, \mathbf{r})$ for $\Gamma^{(3/2)}_{e,b}$. Such a reformulation is the basis of any RDM theory, and for electronic systems the properties of RDMs have been studied for more than 50 years [195]. As we have seen in Sec. 2.4 for the electronic case, this seeming reduction of complexity is deceptive. The many-body problem has merely been shifted to the representability conditions of the RDMs. Although the conditions are simple for 1RDMs of ensembles, they are extremely complicated for any higher-order RDM.

Let us transfer the representability question to the present case. In order to find physically sensible results, we cannot vary arbitrarily over the above RDMs but need to ensure that they are consistent with each other and that they are all connected to the same physical wave function. This is indicated in Eq. (3.52), where $\{\gamma_e, \Gamma^{(2)}_e, \gamma_b, \Gamma^{(3/2)}_{e,b}\} \to \Psi$ highlights that the RDMs are contractions of a common wave function. Thus, we are confronted with two new difficulties in the coupled case: consistency between the RDMs and representability with respect to one wave function.
In the purely electronic problem, all the occurring RDMs are connected and thus, we can calculate the energy expectation value from one single object, the 2RDM. Consequently, we just need to bother about the representability problem $\Gamma^{(2)}_e \to \Psi$. For the coupled case, we still have the connection between $\gamma_e$ and $\Gamma^{(2)}_e$, but we cannot connect them to the other RDMs. As we discussed before, there is also no kind of coupled higher-order RDM that yields all the others by different contractions. Thus, we need to treat all three objects $\{\Gamma^{(2)}_e, \gamma_b, \Gamma^{(3/2)}_{e,b}\}$ together and find representability conditions for this set. The insights from electronic theory might be helpful here, but they are not generalizable in a straightforward way. One of the crucial new problems is the underlying Fock space for the photon degrees of freedom. As the name suggests, $N$-representability conditions are always defined with respect to a constant particle number. These considerations show from yet another perspective how intricate a many-body description of the coupled electron-photon problem is, and this even in the simplest dipole-approximated case.

From the analysis of the last subsection, we gained yet another point of view on the difficulties of the electron-photon interaction. Transferring concepts from electronic-structure theory is far from trivial also from an RDM perspective and provides us with many interesting research questions. In this last part of the section, let us see how far we can get with a “conservative” approach, that is, to generalize electronic RDMFT to its QED version in the same way as Ruggenthaler [242] generalized DFT. To the best of our knowledge, the proof of the corresponding Hohenberg-Kohn-like theorem has not been published in the literature and thus, we present it in the following.

We will show the proof for the full Pauli-Fierz theory with minimal coupling and only afterwards show its form for the cavity setting of Hamiltonian (3.1).
We follow Gilbert’s proof [199] of a one-to-one mapping between the von Neumann density matrix of the system and the “internal” pair $(\gamma(\mathbf{r}, \mathbf{r}'), \mathbf{A}(\mathbf{r}))$, where $\gamma$ is the electronic 1RDM and $\mathbf{A}$ the vector potential of the coupled electron-photon ground state. The corresponding “external” pair is $(v(\mathbf{r}, \mathbf{r}'), \mathbf{j}(\mathbf{r}))$, where $v$ is a possibly non-local external potential and $\mathbf{j}$ an external charge current. As in electronic RDMFT, there is no unique $v$ corresponding to a given $\gamma$. For the sake of generality, we will present the proof in SI units.

Basic Definitions
We consider non-relativistic QED in the setting that we introduced in Sec. 1.3.1, i.e., we neglect any form of spin-dependence of the Hamiltonian (e.g., due to the Stern-Gerlach term) and quantize the theory in the Coulomb gauge, where the vector potential satisfies $\nabla \cdot \mathbf{A} = 0$. Thus, the corresponding vector-potential operator is purely transversal. We then expand

\[ \hat{\mathbf{A}}(\mathbf{r}, t) = \sqrt{\frac{\hbar c^2}{\epsilon_0 L^3}} \sum_{\mathbf{k},\sigma} \frac{\boldsymbol{\epsilon}_{\mathbf{k},\sigma}}{\sqrt{2\omega_k}} \left( \hat{a}_{\mathbf{k},\sigma} e^{i\mathbf{k}\cdot\mathbf{r}} + \hat{a}^\dagger_{\mathbf{k},\sigma} e^{-i\mathbf{k}\cdot\mathbf{r}} \right), \tag{3.53} \]

where we assume a quantization box with side length $L$, leading to allowed wave vectors $\mathbf{k} = 2\pi\mathbf{n}/L$ ($\mathbf{n} \in \mathbb{Z}^3$) and corresponding frequencies $\omega_k = c|\mathbf{k}|$. The index $\sigma \in \{1, 2\}$ denotes the transversal polarization directions, and the corresponding polarization vectors obey $\boldsymbol{\epsilon}_{\mathbf{k},\sigma} \cdot \boldsymbol{\epsilon}_{\mathbf{k},\sigma'} = \delta_{\sigma,\sigma'}$ and $\boldsymbol{\epsilon}_{\mathbf{k},\sigma} \cdot \mathbf{k} = 0$. The expansion coefficients become the usual creation (annihilation) operators $\hat{a}^{(\dagger)}_{\mathbf{k},\sigma}$ with (transversal) commutation relations $[\hat{a}_{\mathbf{k},\sigma}, \hat{a}^\dagger_{\mathbf{k}',\sigma'}] = \delta_{\sigma,\sigma'} \delta_T(\mathbf{k} - \mathbf{k}')$, where $\delta_T$ is the transversal delta distribution. We denote the vacuum permittivity by $\epsilon_0$, the speed of light by $c$ and the reduced Planck constant by $\hbar$. We consider a classical external charge current $\mathbf{j}$ that, because of the transversality of $\mathbf{A}$, also enters only by its transversal component. We expand $\mathbf{j}$ in the modes of the quantization box, i.e.,

\[ \mathbf{j}(\mathbf{r}) = \sqrt{\frac{\epsilon_0 \hbar}{L^3}} \sum_{\mathbf{k},\sigma} \omega_k \frac{\boldsymbol{\epsilon}_{\mathbf{k},\sigma}}{\sqrt{2\omega_k}} \left( j_{\mathbf{k},\sigma} e^{i\mathbf{k}\cdot\mathbf{r}} + j^*_{\mathbf{k},\sigma} e^{-i\mathbf{k}\cdot\mathbf{r}} \right), \tag{3.54} \]

with expansion coefficients $j_{\mathbf{k},\sigma} = j^*_{-\mathbf{k},\sigma} = (2\omega_k \epsilon_0 \hbar L^3)^{-1/2} \int \mathrm{d}^3 r \, \boldsymbol{\epsilon}_{\mathbf{k},\sigma} \cdot \mathbf{j}(\mathbf{r}) \exp(-i\mathbf{k}\cdot\mathbf{r})$. We couple this field to $N$ non-relativistic electrons. The Hamiltonian of the full coupled system then reads (cf. Def. 1.2)

\[ \hat{H} = \sum_{i=1}^{N} \left\{ \frac{1}{2m} \left( -i\hbar\nabla_i + \frac{e}{c} \hat{\mathbf{A}}(\mathbf{r}_i) \right)^2 - e\, v(\mathbf{r}_i; \mathbf{r}'_i) \right\} + \frac{1}{2} \sum_{\substack{i,j=1 \\ i \neq j}}^{N} \frac{e^2}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\mathbf{k},\sigma} \hbar\omega_k \left( \hat{a}^\dagger_{\mathbf{k},\sigma} \hat{a}_{\mathbf{k},\sigma} + \frac{1}{2} \right) - \sum_{\mathbf{k},\sigma} \hbar\omega_k \left( \hat{a}_{\mathbf{k},\sigma} j^*_{\mathbf{k},\sigma} + \hat{a}^\dagger_{\mathbf{k},\sigma} j_{\mathbf{k},\sigma} \right), \tag{3.55} \]

where $m$ is the electron mass and $e$ the elementary charge. We have introduced here a general non-local external potential $v(\mathbf{r}; \mathbf{r}')$ that acts on a function $f(\mathbf{r})$ in an integral sense as $\hat{v} f(\mathbf{r}) = \int \mathrm{d}^3 r' \, v(\mathbf{r}; \mathbf{r}') f(\mathbf{r}')$.
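The action of a non-local potential "in an integral sense" can be illustrated with a short numerical sketch. It assumes, purely for illustration, a one-dimensional uniform grid, so that the kernel $v(r; r')$ becomes a matrix and the integral a matrix-vector product weighted by the grid spacing; the function names `apply_nonlocal` and `local_kernel` are hypothetical. The sketch also shows how an ordinary local potential is recovered as the special case $v(r; r') = v(r)\,\delta(r - r')$.

```python
import numpy as np


def apply_nonlocal(v_kernel, f, dr):
    """Discretized (v f)(r_i) = sum_j v(r_i; r_j) f(r_j) dr on a uniform 1D grid."""
    return (v_kernel @ f) * dr


def local_kernel(v_local, dr):
    """Kernel of a local potential: v(r) delta(r - r') becomes diag(v) / dr."""
    return np.diag(v_local) / dr


# Illustrative grid and functions (arbitrary choices, not from the text).
r = np.linspace(-5.0, 5.0, 201)
dr = r[1] - r[0]
f = np.exp(-r**2)

# Local case: the integral operator reduces to pointwise multiplication.
v_harmonic = 0.5 * r**2
local_result = apply_nonlocal(local_kernel(v_harmonic, dr), f, dr)

# Genuinely non-local case: a separable kernel v(r; r') = g(r) g(r').
g = np.exp(-0.5 * r**2)
nonlocal_result = apply_nonlocal(np.outer(g, g), f, dr)
```

The local result equals `v_harmonic * f` on the grid, whereas the separable kernel mixes values of `f` from all grid points. This extra freedom of non-local potentials is what the non-uniqueness argument in the proof below relies on.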
Note that for the time-independent case, introducing an external vector potential is equivalent to introducing an external current [242]. The physical charge current that is conserved in a time-independent setting (because it obeys the continuity equation) reads

\[ \hat{\mathbf{J}}(\mathbf{r}) = \hat{\mathbf{j}}_P(\mathbf{r}) - \frac{1}{mc} \hat{n}(\mathbf{r}) \hat{\mathbf{A}}(\mathbf{r}), \tag{3.56} \]

where

\[ \hat{\mathbf{j}}_P(\mathbf{r}) = \frac{i e \hbar}{2m} \sum_{k=1}^{N} \left( \delta(\mathbf{r} - \mathbf{r}_k) \overrightarrow{\nabla}_k - \overleftarrow{\nabla}_k \delta(\mathbf{r} - \mathbf{r}_k) \right) \tag{3.57} \]

is the paramagnetic current and $\hat{n}(\mathbf{r}) = -e \sum_{k=1}^{N} \delta(\mathbf{r} - \mathbf{r}_k)$ is the charge-density operator. (Note that by introducing the Coulomb term $e^2/|\mathbf{r}_i - \mathbf{r}_j|$ in Eq. (3.55), we already assumed the $L \to \infty$ limit. For notational convenience, we use in this subsection the symbol $\mathbf{A}$ to denote the transversal vector potential.)

The Hamiltonian acts on the Hilbert space $\mathcal{H} = \mathcal{H}^N_A \otimes \mathcal{F}$, where $\mathcal{H}^N_A$ is the antisymmetric Hilbert space of $N$ electrons and $\mathcal{F}$ is the photon Fock space (see Sec. 3.1.1). We denote the eigenstates of $\hat{H}$ by $|\Psi_i\rangle$, i.e., they satisfy the static Schrödinger equation $\hat{H} |\Psi_i\rangle = E_i |\Psi_i\rangle$. The $|\Psi_i\rangle$ can be parametrized by $N$ electronic coordinates $\mathbf{x} = (\mathbf{r}, \sigma)$ and some parametrization of the photon degrees of freedom that we denote by $\alpha = (\mathbf{k}, \sigma)$. We choose this representation in analogy to the real-space representation of the electrons. Note however that this description is only well-defined in mode space and that the “back-transformation” to real space is problematic. We then define

\[ \Psi(\mathbf{r}_1, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots) = \langle \mathbf{r}_1, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots | \Psi \rangle, \tag{3.58} \]

where we leave the exact number of photons arbitrary, because it is not determined by the Hamiltonian. More generally, we will not only consider pure states, i.e., the eigenstates $|\Psi_i\rangle$ of $\hat{H}$, but ensembles of such eigenstates that are described by the density matrix

\[ D(\mathbf{r}_1, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots; \mathbf{r}'_1, \dots, \mathbf{r}'_N, \alpha'_1, \alpha'_2, \dots) = \sum_i w_i \Psi_i(\mathbf{r}_1, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots) \Psi^*_i(\mathbf{r}'_1, \dots, \mathbf{r}'_N, \alpha'_1, \alpha'_2, \dots), \tag{3.59} \]

where $0 \le w_i \le 1$ and $\sum_i w_i = 1$, making the set $\mathcal{P}^N_\infty$ of all such $D$ convex (see Sec. 2.4.1). In the following, we are interested in the electronic one-body reduced density matrix (1RDM)

\[ \gamma(\mathbf{r}, \mathbf{r}') = N \int \mathrm{d}\mathbf{x}_2 \cdots \mathrm{d}\mathbf{x}_N \, \mathrm{d}\alpha_1 \mathrm{d}\alpha_2 \cdots D(\mathbf{r}, \mathbf{r}_2, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots; \mathbf{r}', \mathbf{r}_2, \dots, \mathbf{r}_N, \alpha_1, \alpha_2, \dots) = \gamma[D], \tag{3.60} \]

and the expectation value of the vector-potential operator from Eq. (3.53), i.e.,

\[ \mathbf{A}(\mathbf{r}) = \mathrm{tr}[D \hat{\mathbf{A}}(\mathbf{r})]. \tag{3.61} \]

Gilbert’s theorem for NR-QED
With these definitions, we are now set to show the following correspondence:

\[ (v, \mathbf{j}) \longrightarrow D \longleftrightarrow (\gamma, \mathbf{A}). \tag{3.62} \]

We will do this in two steps. First, we show that the pair $(v, \mathbf{j})$ determines the (non-degenerate) ground-state density matrix $D = \Psi_0 \Psi_0^*$ (though the opposite is not true in general), and then we show the one-to-one correspondence between $D$ and the pair $(\gamma, \mathbf{A})$, where $\gamma = \gamma[D]$ and $\mathbf{A} = \mathrm{Tr}[D \hat{\mathbf{A}}]$. This proof is a direct generalization of Gilbert’s proof [199] and establishes QED-RDMFT.

(A remark on the mode-space parametrization of the photon degrees of freedom: The concept of a real-space photon wave function is the object of a long-standing debate, and a good summary of this debate can be found in [89, Chap. 1.5.4]. For the concrete example of a one-photon wave function in real space and its connection to mode space, the reader is referred to, e.g., Ref. [255]. The author considers there the spontaneous emission process of an atom and shows that the photon wave function diverges at the position of the atom. However, for all other positions, it is well-defined and makes sense physically. This is a good example of the problem: there are several reasonable ways to define a photon wave function in real space, but they are all limited in a similar way as the example of Ref. [255].)

The first part is to show that we can associate to every pair $(v, \mathbf{j})$ a corresponding ground-state density matrix $D$. This is however simple, because the pair $(v, \mathbf{j})$ consists of the only external parameters in the Hamiltonian, i.e., $\hat{H} = \hat{H}(v, \mathbf{j})$. Obviously, we can also label the (unique) ground state of $\hat{H}(v, \mathbf{j})$ in the same way, i.e., $D = D(v, \mathbf{j})$. However, the opposite is not true, i.e., we cannot label $v(D)$, because there are $v \neq v'$ that have the same ground-state density matrix $D = D'$. Take for instance a non-interacting $N$-electron system that consists only of one-body terms.
The Hamiltonian $\hat{H} = \sum_{i=1}^{N} \left[ -\frac{1}{2}\nabla_i^2 + v(\mathbf{r}_i; \mathbf{r}'_i) \right]$ of such a system commutes by construction with $\gamma = \sum_i n_i \phi^*_i(\mathbf{r}') \phi_i(\mathbf{r})$ and thus the ground state is simply the Slater determinant of the lowest $N$ natural orbitals $\phi_1, \dots, \phi_N$. Consider now the nonlocal potential $v + \delta v$, with $\delta v(\mathbf{r}; \mathbf{r}') = \sum_{i,j=N+1}^{\infty} v_{ij} \phi^*_i(\mathbf{r}') \phi_j(\mathbf{r})$, which only depends on the unoccupied natural orbitals. Obviously, $v + \delta v$ is different from $v$, but both potentials lead to the same ground state, since they differ only in the unoccupied subspace. Thus, there are infinitely many non-interacting systems with different nonlocal potentials that have the same ground state and consequently, one cannot construct a KS-like auxiliary system in (zero-temperature) RDMFT. (Note that conversely, such a construction is not possible if all natural orbitals have a non-zero occupation number. In this case, the mapping is unique. It has been conjectured that this is the case for Coulomb systems, but there is still no general proof [256]. If we however generalize the description to grand-canonical ensembles, the mapping between $\gamma$ and $v$ is indeed unique for all systems [200].) In DFT, there is instead a unique KS system for every density, since we consider only local potentials that cannot distinguish between the occupied and the unoccupied space [165]. Thus there is a one-to-one correspondence between such potentials and the density. However, to define RDMFT (or QED-RDMFT), we do not need this unique mapping, as Gilbert has pointed out [199].

Let us now prove the second part of the theorem, i.e., the one-to-one correspondence between the ground-state density matrix $D$ and the internal variables $(\gamma, \mathbf{A})$. We do so by reductio ad absurdum and show that the opposite assumption leads to a contradiction. Thus, we consider two Hamiltonians $\hat{H}(v_1, \mathbf{j}_1)$, $\hat{H}(v_2, \mathbf{j}_2)$ that have two different ground states $D_1$, $D_2$ but the same $(\gamma_1, \mathbf{A}_1) = (\gamma_2, \mathbf{A}_2) = (\gamma, \mathbf{A})$. We have then

\[ E_1 = \mathrm{Tr}[D_1 \hat{H}(v_1, \mathbf{j}_1)] < \mathrm{Tr}[D_2 \hat{H}(v_1, \mathbf{j}_1)] = E_2 + \int \mathrm{d}^3 r \, \mathrm{d}^3 r' \, \gamma(\mathbf{r}, \mathbf{r}') \left( v_1(\mathbf{r}, \mathbf{r}') - v_2(\mathbf{r}, \mathbf{r}') \right) - \int \mathrm{d}^3 r \, \mathbf{A}(\mathbf{r}) \cdot \left( \mathbf{j}_1(\mathbf{r}) - \mathbf{j}_2(\mathbf{r}) \right), \]

\[ E_2 = \mathrm{Tr}[D_2 \hat{H}(v_2, \mathbf{j}_2)] < \mathrm{Tr}[D_1 \hat{H}(v_2, \mathbf{j}_2)] = E_1 + \int \mathrm{d}^3 r \, \mathrm{d}^3 r' \, \gamma(\mathbf{r}, \mathbf{r}') \left( v_2(\mathbf{r}, \mathbf{r}') - v_1(\mathbf{r}, \mathbf{r}') \right) - \int \mathrm{d}^3 r \, \mathbf{A}(\mathbf{r}) \cdot \left( \mathbf{j}_2(\mathbf{r}) - \mathbf{j}_1(\mathbf{r}) \right). \]

Since $(\gamma_1, \mathbf{A}_1) = (\gamma_2, \mathbf{A}_2)$, the integrals cancel when we add both inequalities, which leads to

\[ E_1 + E_2 < E_2 + E_1, \]

which is obviously a contradiction. Thus, we have proven the claim.

We can trivially transfer this proof to the dipole Hamiltonian (3.1) including an external current, cf. Eq. (3.30), which we have employed also for the QEDFT mapping. Therefore, we have

\[ (v(\mathbf{r}, \mathbf{r}'), \{p_\alpha\}) \longrightarrow D \longleftrightarrow (\gamma(\mathbf{r}, \mathbf{r}'), \{\dot{j}_\alpha\}). \tag{3.63} \]

QED-RDMFT in practice
We want to conclude the subsection with a short discussion of the applicability of QED-RDMFT. We employ our basic setting, i.e., Hamiltonian (3.1), and consider again wave functions, since the ground state of a closed system is pure and does not require the ensemble construction.

The proof of the last paragraph is the justification to employ QED-RDMFT in practice. The first step for this is the definition of the energy functional to identify the corresponding unknown parts, especially the photon exchange-correlation functional. We can directly conclude from our discussion in Sec. 3.3 that we will have to find approximations for

\[ E_{XC,ph}[\gamma, \{\dot{j}_\alpha\}] = \langle \Psi[\gamma, \{\dot{j}_\alpha\}] | \left[ \hat{H}_{int} + \hat{H}_{self,2} \right] \Psi[\gamma, \{\dot{j}_\alpha\}] \rangle, \tag{3.64} \]

where

\[ \hat{H}_{int} = -\sum_{i}^{N} \sum_{\alpha}^{M_{ph}} \omega_\alpha p_\alpha \boldsymbol{\lambda}_\alpha \cdot \mathbf{r}_i, \tag{3.65} \]

\[ \hat{H}_{self,2} = \frac{1}{2} \sum_{i \neq j = 1}^{N_e} (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}_i) (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}_j). \tag{3.66} \]

Since both these terms are local in space, the 1RDM perspective does not directly provide an advantage in comparison to the alternative QEDFT description. However, typical RDMFT functionals are more generic than DFT functionals, as we have discussed in Sec. 2.4.2. We can for example straightforwardly generalize the Müller functional to describe the self-interaction term, simply by exchanging $w(\mathbf{r}, \mathbf{r}') \to (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}) (\boldsymbol{\lambda}_\alpha \cdot \mathbf{r}')$. The reason is that the Müller functional (like many other RDMFT functionals) is basically an approximation of the 2RDM in terms of the 1RDM.

Unfortunately, since we do not yet have a good understanding of the 3/2-body RDM (Eq. (3.50)) that is connected to the interaction term, we cannot use a similar “trick” for the first term in Eq. (3.64). However, in contrast to QEDFT, we have some concrete tools that we can try to apply to this problem. For instance, this suggests investigating the still unknown representability conditions of coupled RDMs (see the Outlook in part IV).

PART II
DRESSED ORBITALS - OLD THEORY IN A NEW BASIS
You can look at something with a microscope and see it a certain way, you can look at it with a naked eye and see it in a certain way, you look at it with a telescope and you see it in another way. Now, which level of magnification is the correct one? Well, obviously, they’re all correct, but they’re just different points of view. (Alan Watts, 1960 [257])

We now have discussed at length the challenges of accurately describing equilibrium many-body states of non-relativistic QED. Even if we consider only one photon mode, a wave-function description of the many-body space is already infeasible for very small systems. Since the electronic many-body problem is a very well-studied topic with a plethora of theories and optimized implementations that are capable of treating many systems extremely accurately, it would be very desirable to extend such methods to the coupled electron-photon problem. However, we have shown in Sec. 3.1.2 with the example of Hartree-Fock theory that such an extension is not as straightforward as one might hope. A direct generalization of the Hartree-Fock (single-reference) wave-function ansatz is not capable of describing any quantum effects of the photon field. When we try to go beyond that, even a seemingly very restrictive ansatz leads to an (over-)exponentially growing problem.

In order to study the role of the quantum nature of the electron-photon interaction, we cannot avoid dealing with the multi-reference character of the coupled states. This is an important difference to electronic problems, where a single-reference ansatz, i.e., one Slater determinant, already includes quantum-mechanical exchange because of its antisymmetry. Understanding the role of such quantum effects of the photon field is especially important for polaritonic ground states, because static classical electric and magnetic fields are zero for many equilibrium settings.
In the time-dependent case instead, the classical description is very accurate when it is combined with a quantum-mechanical description of the electrons. We made this observation when we were discussing KS-QEDFT (Sec. 3.2). QEDFT is in principle capable of treating quantum fluctuations of the photon field by means of corresponding exchange-correlation potentials. However, it is very difficult to construct an accurate functional, and the only known functional that goes beyond the MF, the “photon OEP,” is numerically very challenging and can only account for one-photon processes. For instance, we cannot apply the photon OEP to larger systems, and one-photon processes are not sufficient to accurately describe polaritonic ground states in the strong-coupling regime [247]. Consequently, we need new methods to appropriately study such settings.

Let us therefore take one step back and analyze again the origin of the multi-reference character of polaritonic systems. We describe electrons and photons as separate species in their own antisymmetric and symmetric many-body spaces. These are then coupled by the interaction operator, which introduces most of the challenges of the coupled problem (see Ch. 3). There is a clear analogy to correlated electronic systems, which we also describe starting from non-interacting electrons that then are coupled by the interaction. However, this coupling takes place within one antisymmetric many-body space, and this is an important difference to the electron-photon coupling. The central idea of the dressed(-orbital) construction is now to change the basic description of coupled electron-photon systems such that there is only one (effective) many-body space with one (effective) symmetry. This means specifically that we define a single-particle space with already coupled electron-photon orbitals, i.e., polaritonic or dressed orbitals, and construct the many-body space as a product of such orbital spaces.
We can then describe non-interacting polaritons that are coupled due to the corresponding dressed interaction. Such a programme naturally defines polaritons as their own particle class with its own statistics that needs to be respected to construct the many-polariton space.

Clearly, restructuring the many-electron-photon space in such a way is not straightforward. In fact, the dressed construction considers an auxiliary system that is even higher-dimensional than the cavity-QED Hamiltonian. However, this extension provides the necessary flexibility to introduce an exact reformulation of the cavity-QED systems in terms of dressed orbitals. Most importantly, this allows us to approach the challenges of describing coupled light-matter systems from a different perspective. Approximations to the wave function, the density description and reduced density matrices can be defined also in the dressed setting. This is yet another way to utilize these state descriptors, and they will again have different properties with respect to the already discussed versions of the previous chapter. A practical consequence of this different characterization of light-matter states is that it allows for relatively simple approximation schemes. For instance, the HF approximation to the many-polariton wave function explicitly accounts for correlation between the electronic and photonic subsystems in the original description.

A key role is hereby played by the new structure of the auxiliary Hamiltonian with respect to the standard cavity-QED Hamiltonian. Expressed in terms of polaritonic orbitals, the new Hamiltonian consists only of one-body and two-body operators. It is thus structurally equivalent to the electronic-structure Hamiltonian (Sec. 2.1). Consequently, we can in principle generalize every electronic-structure method to a “polaritonic-structure method” to describe cavity-QED problems.
For instance, this allows us to apply RDM methods to coupled light-matter systems without the need to deal with the little-understood 3/2-body RDM. Also the functional construction for a polariton-based QEDFT is simpler, since one can more directly apply the successful strategies of electronic-structure theory.

Another important advantage of the structural resemblance between polaritonic and electronic-structure methods is the numerical implementation. Since these methods always require solving nonlinear equations that do not allow for an analytical treatment, we need numerical solvers to apply any method in practice. By employing the polariton description, we can simply use the existing electronic-structure codes as a basis and extend them accordingly. We do not have to develop entirely new codes.

However, to be able to introduce polaritonic orbitals, we have to increase their dimension by one with every photon mode that we take into account. For instance, to describe the standard case of cavity QED that considers one effective photon mode, the polaritonic orbitals depend on three electronic coordinates and one photonic coordinate and are thus four-dimensional. Considering such orbitals (and, depending on the system size, also orbitals with more modes) is still numerically feasible on modern high-performance clusters. Thus, polaritonic-structure methods will be especially useful in such cavity-QED settings.

A further important feature of the polariton description is the exchange symmetry and the corresponding statistics of polaritonic orbitals, which have a fermion-boson hybrid character. Such hybrid statistics are not usual in quantum many-body theory, which leads to interesting new research questions at a very basic level. In fact, we also had underestimated their role when we started the investigation of dressed orbitals. Only after we had obtained reproducible numerical results that violated the Pauli principle did we understand the significance of the hybrid statistics (see part III).
On the one hand, this leads to new challenges, especially with regard to enforcing the statistics in practice. On the other hand, the hybrid statistics are a new tool to describe many-particle spaces of coupled species in general, which could be very valuable also for other coupled problems.

Chapter 4
THE DRESSED-ORBITAL CONSTRUCTION
In this chapter, we introduce the dressed-orbital construction that allows us to describe coupled electron-photon systems in terms of new orbitals. This is the basis for all the following discussions. The construction requires several nontrivial steps, and we try to explain each of them as simply and slowly as possible. Specifically, we start in Sec. 4.1 with a brief motivation for a polariton-based description of cavity-QED problems. Then, we introduce the construction with an example (Sec. 4.2), which we generalize in Sec. 4.3. We note that the dressed construction has been discussed in three publications, the first of which was written by my colleagues Nielsen et al. [258], while I was the main author of the other two ([1, 2]). This chapter is based on Ref. [2], which is the most recent publication and thus presents the most complete picture of the construction.
We want to start the discussion with some general comments about polaritons and the multi-reference character of coupled electron-photon systems. We have discussed in the introduction (Sec. 1.1) that the hallmark of strong electron-photon coupling is the emergence of light-matter hybrid states or polaritons. The basic physics of such hybrid states is described by a minimal model (see Sec. 1.2) that considers two relevant electronic states, labelled by $|m_1\rangle$ (“ground” state) and $|m_2\rangle$ (“excited” state), and two photonic states, the vacuum $|0\rangle$ and the one-photon state $|1\rangle$. The resulting polaritons are then denoted as

\[ P_\pm = \alpha |m_1\rangle \otimes |1\rangle \pm \beta |m_2\rangle \otimes |0\rangle, \tag{4.1} \]

with coefficients $\alpha$, $\beta$ that depend on details of the model (see Sec. 1.2 for further details about the family of cavity-QED models). From a first-principles perspective this means that polaritons correspond to multi-reference or correlated wave functions, even if we describe the matter states within a single-reference approach. We have discussed this in detail in Sec. 3.1.2. For strongly coupled ground states, which are the focus of this thesis, the results of more general cavity-QED models [24] predict that an accurate description requires even more terms, i.e., more references, than the two of the “minimal” polariton (4.1). Systems that require multi-reference states are well known in electronic-structure theory (see Ch. 2, especially the last paragraph of Sec. 2.3 on strong correlation), and their accurate description is among the most difficult challenges in the field. A classical example is the dissociation of the hydrogen molecule (Sec. 2.5), where the wave function acquires a multi-reference character for larger bond distances. This prototypical system is commonly used as a challenging test case for first-principles methods [259, 221, 260, 261].

However, the term multi-reference depends crucially on the basic entities, i.e., the “single references” that are used to build the “multi-reference” state. Transferred to the problem of polaritonic physics, we note that single references built from separate electronic and photonic states (as in the simple example above) can be very inefficient in describing polaritonic states of the coupled system. We saw this for example in Sec. 3.1.2, when we were discussing a possible generalized HF wave function. Even a very simplified ansatz brings back the exponential wall. We also saw that the (single-reference) KS system in equilibrium QEDFT reduces the description basically to an effective matter-only system. All effects of the photon field (besides some trivial static shift for matter systems with a permanent dipole) have to be carried by the unknown exchange-correlation potential, and there is no known approximation that is accurate in the strong-coupling regime [247].

One important challenge for the equilibrium description is that the exchange-correlation functional only depends on matter quantities, since the photon displacement coordinate is trivial (cf. Eq. (3.37) of Ch. 3). From a general perspective, this is rooted in the choice of a non-interacting auxiliary system, where the electron and photon spaces decouple. Thus, we are confronted with a dilemma: On the one hand, strongly coupled light-matter systems fundamentally require a multi-reference description, because polaritons, i.e., hybrid matter-photon particles, emerge as the principal degrees of freedom. On the other hand, a multi-reference description is especially difficult for coupled systems, because the product space of the separate electron and photon spaces is considerably more difficult to describe than the single-species spaces. In particular, typical (and straightforward) approximations such as an effective single-reference description capture considerably fewer effects than the equivalent approaches in a matter-only theory.

In this chapter, we want to propose a construction that tries to mediate between the opposing sides of this dilemma.
The key idea is to build the theory not, as usual, from separate electron and photon states, but to introduce electron-photon hybrid states as the basic entities. Before we lay out our construction of such a “many-polariton” theory, we want to illustrate the challenges that arise when we try to modify the standard description. Let us therefore briefly recapitulate how many-body spaces are constructed, using the electronic problem as an example. We start with the single-particle Hilbert space $\tilde{h}^1$, which is spanned by a corresponding single-particle basis $\{\tilde\psi_i(\mathbf r)\}$ (where we always assume bases to be orthonormalized, i.e., $\int \mathrm d\mathbf r\, \tilde\psi_i^*(\mathbf r)\tilde\psi_j(\mathbf r) = \delta_{ij}$), such that every single-particle state can be expressed as a superposition
$$\Psi(\mathbf r) = \sum_{i=1}^{\infty} c_i \tilde\psi_i(\mathbf r),$$
where the $c_i$ are expansion coefficients that need to satisfy the sum rule $\sum_i |c_i|^2 = 1$. In the case of electrons, we also need to include the spin degree of freedom, which we do by a tensor product. The spin space $S$ for a one-electron problem is just two-dimensional and we denote its basis elements by $\{\alpha(\sigma), \beta(\sigma)\}$. The tensor product $h^1 = \tilde h^1 \otimes S$ then combines both sets into one larger basis, taking into account all possible combinations. We subsume both in a spin-spatial coordinate $\mathbf r \to x = (\mathbf r, \sigma)$, where $\sigma$ is the spin coordinate. The expansion then becomes
$$\Psi(x) = \sum_j c_j \psi_j(x) = \sum_{s=\alpha,\beta}\sum_{i=1}^{\infty} c_{i,s}\,\psi_{i,s}(\mathbf r,\sigma) = \sum_{i=1}^{\infty}\left[ c_{i,\alpha}\,\tilde\psi_i(\mathbf r)\otimes\alpha(\sigma) + c_{i,\beta}\,\tilde\psi_i(\mathbf r)\otimes\beta(\sigma)\right].$$
For two electrons, the corresponding two-particle space is the tensor product $h^2 = h^1 \otimes h^1$.

This two-particle Hilbert space is, however, still too general, because it includes distinguishable states (Sec. 2.1). Since electrons are fermions, we have to constrain $h^2$ to only antisymmetric combinations, i.e., $h^2_A = h^1 \wedge h^1$. We can make this explicit in the state parametrization by employing Slater determinants
$$\Psi_{i,j}(x_1,x_2) = \frac{1}{\sqrt 2}\left(\psi_i(x_1)\psi_j(x_2) - \psi_j(x_1)\psi_i(x_2)\right) = |\psi_i\psi_j|_-$$
as basis of the two-particle space. If we label every combination of $i$ and $j$ by a new index $I = (i,j)$, we can describe a general state $\Psi \in h^2_A$ as
$$\Psi(x_1,x_2) = \sum_I c_I \Psi_I(x_1,x_2),$$
where $\sum_I |c_I|^2 = 1$. The advantage of this construction is obvious: the basis already takes care of the fundamental particle symmetry. We have discussed at length that only this additional symmetry information includes nontrivial quantum effects in the description.

When we want to describe coupled electron-photon problems, we lose this advantage. Electrons and photons are two different species and thus distinguishable. Accordingly, we have to consider the full tensor product between the matter and photon many-body spaces without further symmetry restrictions (Sec. 3.1.1).

Let us contrast this with a hypothetical many-polariton space. In analogy to before, we start with the single-polariton Hilbert space, which is built from objects that have an electron coordinate $x$ and a photon coordinate $p$. Similarly to the extension of the spatial coordinates by a spin component for electrons, we could define $h^1_p = h^1_e \otimes h^1_{ph}$, where $h^1_e$ is the electronic one-particle space from before and $h^1_{ph}$ is some one-particle photon space. (We want to remind the reader of the subtleties regarding the definition of a photon wave function that we have discussed in Sec. 3.1.1.) Introducing a photon orbital basis $\{\chi_\alpha(p)\}$, the elements of $h^1_p$ are thus given by
$$\phi(x,p) = \sum_{i,\alpha} c_{i\alpha}\left(\psi_i(x)\otimes\chi_\alpha(p)\right), \tag{4.2}$$
where $\sum_{i\alpha} |c_{i\alpha}|^2 = 1$. Since electrons and photons are distinguishable, no further symmetrization is required.

We have discussed in Sec. 2.1.2 that many electronic-structure problems can be simplified if we assume that the orbitals are the eigenfunctions of the one-body part of the Hamiltonian, $h(\mathbf r) = -\tfrac12\nabla^2_{\mathbf r} + v(\mathbf r)$. To get a feeling for the challenges of this new kind of orbital (4.2), let us try to find a similar “one-body” Hamiltonian. The straightforward choice is to consider the cavity-QED Hamiltonian (Eq. (1.37)) for one electron, i.e.,
$$\hat h_p = -\tfrac12\nabla^2 + v(\mathbf r) + \sum_{\alpha=1}^{M}\left[-\tfrac12\partial^2_{p_\alpha} + \frac{\omega_\alpha^2}{2}\left(p_\alpha - \frac{\boldsymbol\lambda_\alpha}{\omega_\alpha}\cdot\mathbf r\right)^2\right].$$
The eigenfunctions $\phi(x,\mathbf p)$ of $\hat h_p$ would define our photon variable as $M$-dimensional, i.e., $\mathbf p = (p_1,\dots,p_M)$. This dimension is determined by the system that we want to describe and, as we have discussed, we can often assume even $M = 1$. The corresponding two-polariton space is then $h^2 = h^1_p \otimes h^1_p$, with elements
$$\Phi(x_1 p_1, x_2 p_2) = \sum_{I,J} c_{I,J}\,\phi_I(x_1 p_1)\,\phi_J(x_2 p_2).$$
Now we need to understand how we can utilize $\Phi$ to describe the ground state of the actual Hamiltonian that describes two electrons coupled to one mode ($M = 1$), i.e.,
$$\hat H = \sum_{i=1}^{2}\left[-\tfrac12\nabla^2_{\mathbf r_i} + v(\mathbf r_i)\right] + \left[-\tfrac12\partial^2_p + \frac{\omega^2}{2}\left(p - \frac{\boldsymbol\lambda}{\omega}\cdot(\mathbf r_1 + \mathbf r_2)\right)^2\right]. \qquad \text{(cf. 3.14)}$$
This confronts us with two fundamental questions:
1. How can we enforce the correct symmetry on this level?
2. How can we deal with the additional $p$-coordinates that are not present in Eq. (3.14)?

In the following, we show how the dressed-orbital construction can in principle resolve both of the above questions. This allows us to construct a many-polariton space (almost) of the form
$$h^N_p = S_p(\underbrace{h^1_p \otimes \cdots \otimes h^1_p}_{N\ \text{times}}),$$
where $S_p$ enforces the underlying exchange symmetry, which is of a fermion-boson hybrid nature.

Before we discuss the general case, we want to illustrate the dressed construction with the example of the 2-electron-1-mode system that we have considered already in Sec. 3.1.1.
Since the construction requires many (sometimes quite technical) steps, we reserve this whole section for the example system. This allows us to go through every step in detail and to motivate its purpose. In the next section (Sec. 4.3), we then recapitulate all these steps for the general case of $N$ electrons coupled to $M$ photon modes. The basic idea of the construction is sketched in Fig. 4.1. (Note that the contents of this section are part of Ref. [2].)

The Hamiltonian of the 2-electron-1-mode system (introduced in Sec. 3.1.1) reads
$$\hat H = \sum_{k=1}^{2}\left[-\tfrac12\Delta_{\mathbf r_k} + v(\mathbf r_k)\right] + \frac{1}{|\mathbf r_1 - \mathbf r_2|} + \left[-\tfrac12\partial^2_p + \frac{\omega^2}{2}\left(p - \frac{\boldsymbol\lambda}{\omega}\cdot(\mathbf r_1 + \mathbf r_2)\right)^2\right]. \qquad \text{(cf. 3.14)}$$

Figure 4.1: Wave function $\Psi(x,y,p)$ of two one-dimensional electrons $(x,y)$ coupled to one photon mode with displacement coordinate $p$. The coupling is indicated by the double arrows $x \leftrightarrow p$, $y \leftrightarrow p$. The electronic orbital wave functions are symbolized by the ground and first excited state of a box with zero-boundary conditions and the photon mode by a wiggly line. The corresponding dressed wave function $\Psi'(x,q_1,y,q_2)$ has instead two photon coordinates $q_1$, $q_2$ that are related to the physical coordinate $p$ by $p = \tfrac{1}{\sqrt2}(q_1 + q_2)$. On the wave-function level, this connection can be utilized to introduce two polariton orbitals with coordinates $(x,q_1)$ and $(y,q_2)$, respectively, that are interacting.
This new interaction is indicated by a double arrow between the two orbitals.

We start by expanding $\hat H$ into its individual constituents,
$$\hat H = \overbrace{\sum_{k=1}^{2}\left[-\tfrac12\Delta_{\mathbf r_k} + v(\mathbf r_k) + \tfrac12(\boldsymbol\lambda\cdot\mathbf r_k)^2\right]}^{\hat H_e = \sum_{k=1}^{2} h_e(\mathbf r_k)} \; \overbrace{- \sum_{k=1}^{2}\omega p\,\boldsymbol\lambda\cdot\mathbf r_k}^{\hat H_{ep} = \sum_{k=1}^{2} h_{ep}(\mathbf r_k,p)} + \underbrace{\left(-\tfrac12\frac{\partial^2}{\partial p^2} + \frac{\omega^2}{2}p^2\right)}_{\hat H_{ph} = h_{ph}(p)} + \underbrace{\frac{1}{|\mathbf r_1 - \mathbf r_2|} + (\boldsymbol\lambda\cdot\mathbf r_1)(\boldsymbol\lambda\cdot\mathbf r_2)}_{\hat H_{ee} = h_{ee}(\mathbf r_1,\mathbf r_2)}, \tag{4.3}$$
where we grouped terms according to the coordinates that appear. (Note that we employ in this section the symbol $\Delta \equiv \nabla^2$ for the Laplacian.) We observe that there is an asymmetry between electron and photon coordinates: there are two electron coordinates that appear in a completely symmetric way in all terms, but there is only one mode-displacement coordinate, independently of the number of electrons.

The basic idea of the dressed construction is now to simply introduce as many artificial photon coordinates as electron coordinates, which formally “lifts” the asymmetry. In this case, this means introducing one further photon coordinate $p_2$. (For a system of $N$ electrons, we therefore need to introduce $N-1$ further photon coordinates.) This has to be done such that the original system is not influenced. We can imagine another photon cavity that is very far away from our system, but that we want to describe at the same time. If we further assume that this artificial mode has the same frequency as the physical one, the auxiliary system is described by the Hamiltonian (we denote the quantities of the auxiliary system by a prime)
$$\hat H' = \hat H + h_{ph}(p_2) = \sum_{k=1}^{2} h_e(\mathbf r_k) + h_{ee}(\mathbf r_1,\mathbf r_2) - \sum_{k=1}^{2}\omega p\,\boldsymbol\lambda\cdot\mathbf r_k + h_{ph}(p) + h_{ph}(p_2). \tag{4.4}$$
If we denote by $\chi(p_2)$ an eigenstate of the additional photon Hamiltonian $h_{ph}(p_2)$ and $\Psi(x_1,x_2,p)$ is an eigenstate of $\hat H$, then
$$\Psi'(x_1,x_2,p,p_2) = \Psi(x_1,x_2,p)\,\chi(p_2) \tag{4.5}$$
is an eigenstate of $\hat H'$. The ground state of $\hat H'$ is then uniquely defined and given by $\Psi'_0 = \Psi_0\chi_0$ with the harmonic-oscillator vacuum state $\chi_0$.

With $\hat H'$, we have almost achieved our goal of symmetrizing electron and photon coordinates, but there is still the electron-photon interaction term $h_{ep}(\mathbf r_k,p)$ that depends only on $p$, but not on $p_2$. The basic idea to remedy this stems from the similarity of the Hamiltonian (4.4) to a molecular center-of-mass Hamiltonian [262]. Transferred to this case, we interpret $p$ as the “center-of-displacement” coordinate of two independent displacements $q_1$, $q_2$. The second key ingredient of the dressed construction is therefore a coordinate transformation $(p,p_2)\to(q_1,q_2)$ that sets
$$p \propto (q_1 + q_2). \tag{4.6}$$
Obviously, the transformation should be norm-conserving so as not to increase the displacement artificially, but more importantly, it needs to keep the photon-energy part form-invariant, i.e.,
$$h_{ph}(p) + h_{ph}(p_2) \to h_{ph}(q_1) + h_{ph}(q_2). \tag{4.7}$$
Since $h_{ph}(p) = \tfrac12\left(-\frac{\mathrm d^2}{\mathrm dp^2} + \omega^2p^2\right) = \left[\tfrac{1}{\sqrt2}\left(\mathrm i\frac{\mathrm d}{\mathrm dp}, \omega p\right)\right]\cdot\left[\tfrac{1}{\sqrt2}\left(\mathrm i\frac{\mathrm d}{\mathrm dp}, \omega p\right)\right]$ has the form of an inner product, we can simply employ an orthogonal transformation for that. Specifically, a transformation of a set of variables $(x_1,x_2,\dots)\to(x'_1,x'_2,\dots)$ is orthogonal if it leaves any expression of the form $\sum_i x_i^2 \to \sum_i x_i'^2$ invariant. In the two-particle case, there is exactly one possible choice for that, which reads (see also Fig. 4.1)
$$p = \tfrac{1}{\sqrt2}(q_1 + q_2), \qquad p_2 = \tfrac{1}{\sqrt2}(q_1 - q_2). \tag{4.8}$$
As constructed, the transformation changes only $\hat H_{ep} = \sum_{k=1}^{2}h_{ep}(\mathbf r_k,p)$, which becomes
$$\hat H_{ep} \to -\sum_{k=1}^{2}\frac{\omega}{\sqrt2}(q_1 + q_2)\,\boldsymbol\lambda\cdot\mathbf r_k = -\sum_{k=1}^{2}\frac{\omega}{\sqrt2}\,q_k\,\boldsymbol\lambda\cdot\mathbf r_k - \frac{\omega}{\sqrt2}\left(q_1\,\boldsymbol\lambda\cdot\mathbf r_2 + q_2\,\boldsymbol\lambda\cdot\mathbf r_1\right). \tag{4.9}$$
Again, we grouped the terms according to the particle indices, which we can now do also for the photon coordinates. Since the auxiliary system after the transformation is symmetric under the exchange of photon indices, we can arbitrarily exchange the indices of the $q$-coordinates. Putting all terms together, we get the following new form of the auxiliary Hamiltonian:
$$\hat H' = \sum_{k=1}^{2}\left[-\tfrac12\left(\Delta_{\mathbf r_k} + \frac{\mathrm d^2}{\mathrm dq_k^2}\right) + v(\mathbf r_k) + \frac{\omega^2}{2}q_k^2 - \frac{\omega}{\sqrt2}q_k\,\boldsymbol\lambda\cdot\mathbf r_k + \tfrac12(\boldsymbol\lambda\cdot\mathbf r_k)^2\right] + \frac{1}{|\mathbf r_1 - \mathbf r_2|} - \frac{\omega}{\sqrt2}\left(q_1\,\boldsymbol\lambda\cdot\mathbf r_2 + q_2\,\boldsymbol\lambda\cdot\mathbf r_1\right) + (\boldsymbol\lambda\cdot\mathbf r_1)(\boldsymbol\lambda\cdot\mathbf r_2). \tag{4.10}$$
Interestingly, this Hamiltonian has the same structure as the electronic many-body Hamiltonian (2.1). We can see this more explicitly when we introduce the new dressed coordinate
$$\mathbf z = (\mathbf r, q) \tag{4.11}$$
and the corresponding dressed Laplacian
$$\Delta'_{\mathbf z} = \Delta_{\mathbf r} + \frac{\mathrm d^2}{\mathrm dq^2}. \tag{4.12}$$
We can then write
$$\hat H' = \sum_{k=1}^{2}\left[-\tfrac12\Delta'_{\mathbf z_k} + v'(\mathbf z_k)\right] + \frac12\sum_{k\neq l=1}^{2} w'(\mathbf z_k,\mathbf z_l), \tag{4.13}$$
where the dressed local potential reads
$$v'(\mathbf z) = v(\mathbf r) + \frac{\omega^2}{2}q^2 - \frac{\omega}{\sqrt2}\,q\,\boldsymbol\lambda\cdot\mathbf r + \tfrac12(\boldsymbol\lambda\cdot\mathbf r)^2 \tag{4.14}$$
and the dressed interaction kernel is
$$w'(\mathbf z_k,\mathbf z_l) = \frac{1}{|\mathbf r_k - \mathbf r_l|} - \frac{\omega}{\sqrt2}\left(q_k\,\boldsymbol\lambda\cdot\mathbf r_l + q_l\,\boldsymbol\lambda\cdot\mathbf r_k\right) + (\boldsymbol\lambda\cdot\mathbf r_k)(\boldsymbol\lambda\cdot\mathbf r_l). \tag{4.15}$$

With this construction, we have found a way to introduce new coordinates and to rewrite our basic Hamiltonian in a form that is completely symmetric with respect to the new coordinates. The next step is to define the corresponding wave function. Let us therefore recall the quote by Pauli from Sec. 2.1.3:
If we deal with many similar (indistinguishable, A/N) particles, special circumstances occur that arise from the Hamilton operator being invariant under any permutations of particles. (Pauli, 1933 [140])

According to Pauli, exchange symmetry is a consequence of the symmetry of the Hamiltonian, which indicates that we are quite close to our goal of formulating a many-polariton theory. We only need to derive the details of the “special circumstances,” i.e., the type of exchange symmetry that the auxiliary wave function exhibits. In the standard description, there are only two choices: symmetric and antisymmetric wave functions. However, we will see that the situation here is a bit more involved.

(Note on the dressed Laplacian, Eq. (4.12): We do not need further differential operators here, but of course we can generalize every differential operator to the new coordinates. The dressed nabla operator would read, for example, $\nabla_{\mathbf z} = (\nabla_{\mathbf r}, \frac{\mathrm d}{\mathrm dq})$.)

Polaritonic symmetry
Let us briefly recapitulate the steps that led to the dressed Hamiltonian (4.10). We start with the physical Hamiltonian (Eq. (4.3)) that depends on the photon coordinate $p$. We then add another harmonic-oscillator term, depending on the auxiliary coordinate $p_2$, which is still distinguishable from the original $p$ (because of $\hat h_{ep}$), and obtain Eq. (4.4). Finally, the coordinate transformation (4.8) introduces new photon coordinates $q_1$, $q_2$, which enter the transformed Hamiltonian (4.10) in a completely symmetric way. Since the Hamiltonian is also symmetric in the electronic coordinates, we can deduce the polaritonic symmetry, i.e., the Hamiltonian (4.10) is invariant under the exchange of the two polaritonic coordinates $\mathbf z_1 \leftrightarrow \mathbf z_2$.

With that we have defined all symmetries on the Hamiltonian level. Regarding the wave function $\Psi'(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2)$, we still have to understand whether it is symmetric or antisymmetric under $\mathbf z_1\sigma_1 \leftrightarrow \mathbf z_2\sigma_2$. For that, we first note that the photon ground state is inversion-symmetric, i.e.,
$$\chi_0(p) = \chi_0(-p). \tag{4.16}$$
Further, we want to preserve the antisymmetry of the physical wave function in the electronic coordinates, i.e., $\Psi(x_1,x_2,p) = -\Psi(x_2,x_1,p)$. Taking both together, we deduce for the exchange $\mathbf z_1\sigma_1 \leftrightarrow \mathbf z_2\sigma_2$ that
$$\begin{aligned}
\Psi'(x_1,q_1,x_2,q_2) &= \Psi\!\left(x_1,x_2,\tfrac{1}{\sqrt2}(q_1+q_2)\right)\chi_0\!\left(\tfrac{1}{\sqrt2}(q_1-q_2)\right)\\
&= -\Psi\!\left(x_2,x_1,\tfrac{1}{\sqrt2}(q_2+q_1)\right)\chi_0\!\left(\tfrac{1}{\sqrt2}(q_1-q_2)\right)\\
&\overset{\text{Eq. (4.16)}}{=} -\Psi\!\left(x_2,x_1,\tfrac{1}{\sqrt2}(q_2+q_1)\right)\chi_0\!\left(\tfrac{1}{\sqrt2}(q_2-q_1)\right) = -\Psi'(x_2,q_2,x_1,q_1).
\end{aligned} \tag{4.17}$$
The dressed wave function $\Psi'$ is thus antisymmetric with respect to the exchange of the dressed coordinates $(x,q) = (\mathbf r,\sigma,q) = (\mathbf z,\sigma)$.

Crucially, this allows us to describe the coupled electron-photon Hilbert space in a different many-body basis, i.e., in terms of Slater determinants of dressed orbitals (see also Fig. 4.1). Therefore, let us go back to our polariton-orbital basis $\{\phi(\mathbf z,\sigma)\}$ of Sec. 4.1.
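The recapitulated steps can be checked numerically. The following is a minimal sketch, assuming one-dimensional electrons (as in Fig. 4.1), a scalar coupling `lam`, and the sign convention of Eq. (4.3); all function names are hypothetical. It compares the multiplicative photon-dependent terms of Eq. (4.4) with their dressed form from Eqs. (4.14) and (4.15) at random points, and also tests the invariance under $\mathbf z_1 \leftrightarrow \mathbf z_2$. The kinetic terms are not tested here; their form invariance follows from the orthogonality of the transformation (4.8).

```python
import math
import random

def physical_terms(r1, r2, p, p2, omega, lam):
    """Photon-dependent multiplicative terms of H' before the transformation.

    The first term is the physical mode including coupling and dipole
    self-energy, the second term is the decoupled auxiliary mode p2.
    """
    return (0.5 * omega**2 * (p - lam / omega * (r1 + r2)) ** 2
            + 0.5 * omega**2 * p2**2)

def dressed_terms(r1, r2, q1, q2, omega, lam):
    """The same terms written as v'(z1) + v'(z2) + w'(z1, z2), without v and Coulomb."""
    s = math.sqrt(2.0)

    def v_dressed(r, q):
        return 0.5 * omega**2 * q**2 - omega / s * q * lam * r + 0.5 * (lam * r) ** 2

    w_dressed = -omega / s * (q1 * lam * r2 + q2 * lam * r1) + (lam * r1) * (lam * r2)
    return v_dressed(r1, q1) + v_dressed(r2, q2) + w_dressed

def to_physical(q1, q2):
    """Orthogonal transformation of Eq. (4.8): returns (p, p2)."""
    s = math.sqrt(2.0)
    return (q1 + q2) / s, (q1 - q2) / s

def max_deviation(n_samples=1000, seed=0):
    """Largest violation of form equality and of z1 <-> z2 symmetry."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n_samples):
        r1, r2, q1, q2 = (rng.uniform(-3.0, 3.0) for _ in range(4))
        omega, lam = rng.uniform(0.1, 2.0), rng.uniform(0.0, 1.0)
        p, p2 = to_physical(q1, q2)
        before = physical_terms(r1, r2, p, p2, omega, lam)
        after = dressed_terms(r1, r2, q1, q2, omega, lam)
        swapped = dressed_terms(r2, r1, q2, q1, omega, lam)
        worst = max(worst, abs(before - after), abs(after - swapped))
    return worst

if __name__ == "__main__":
    print(max_deviation())
```

A deviation at the level of floating-point round-off indicates that the regrouping of terms is consistent for the sampled points; it is of course not a proof.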
We can now explicitly follow the steps of the electronic theory: we consider a basis $\phi_i(\mathbf z,\sigma)$ of $h^1_p$. For instance, this could be the eigenstates of the single-particle Hamiltonian $h(\mathbf z) = -\tfrac12\Delta'_{\mathbf z} + v'(\mathbf z)$. The two-particle space is then given as the antisymmetrized tensor product
$$h^2_p = A(h^1_p \otimes h^1_p) \tag{4.18}$$
and the elements of $h^2_p$ can be parametrized as
$$\Psi'(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2) = \sum_{ij} c_{ij}\left[\phi_i(\mathbf z_1\sigma_1)\phi_j(\mathbf z_2\sigma_2) - \phi_j(\mathbf z_1\sigma_1)\phi_i(\mathbf z_2\sigma_2)\right] \overset{!}{=} \sum_{i,j,\alpha,\beta} C'^{\,\alpha,\beta}_{ij}\,\psi_i(x_1)\psi_j(x_2)\chi_\alpha(q_1)\chi_\beta(q_2). \tag{4.19}$$
It is clear that, since the auxiliary configuration space is much larger than the original configuration space, we have made a numerical problem that is infeasible beyond simple systems even more infeasible. What the dressed construction offers in compensation is a new structure of the configuration space. In terms of the above expansion, we hope that the new coefficients $c_{ij}$ are easier to approximate than the original $C^{\alpha}_{ij}$. The main advantage is that the auxiliary wave function can be expanded in terms of Slater determinants, and we have many methods at our disposal that are geared toward such a situation, i.e., first-principles electronic-structure theories. The dressed formulation thus allows for relatively simple approximation schemes, e.g., HF theory in terms of a single polaritonic Slater determinant (see Sec. 2). Importantly, such simple wave functions in terms of polaritonic orbitals correspond to correlated (multi-determinant) wave functions in physical space. In Sec. 6, we present calculations based on dressed orbitals that agree remarkably well with exact references even for very strong coupling. We also illustrate there the explicitly correlated character of the electronic subsystem.

The explicit form of the auxiliary wave function
Let us now take a second, closer look at the wave function. In fact, we can explicitly construct the dressed wave function in the new coordinates,
$$\Psi'(x_1,x_2,p,p_2) = \Psi(x_1,x_2,p)\,\chi_0(p_2) \to \Psi\!\left(x_1,x_2,\tfrac{1}{\sqrt2}(q_1+q_2)\right)\chi_0\!\left(\tfrac{1}{\sqrt2}(q_1-q_2)\right), \tag{4.20}$$
if we know $\Psi$. To see this, we consider some electronic orbital basis $\{\psi_i(x)\}$ and the eigenfunctions $\{\chi_\alpha(p)\}$ of the photon Hamiltonian $\hat h_{ph}$ and expand
$$\Psi(x_1,x_2,p) = \sum_{i,j,\alpha} C^{\alpha}_{ij}\,\psi_i(x_1)\psi_j(x_2)\chi_\alpha(p).$$
The first step of the construction of the auxiliary ground state is simple. We (tensor-)multiply $\Psi$ by the ground state $\chi_0(p_2)$ of the extra harmonic oscillator, which by construction is also an eigenstate of $\hat h_{ph}$, i.e.,
$$\Psi'(x_1,x_2,p,p_2) = \left[\sum_{i,j,\alpha} C^{\alpha}_{ij}\,\psi_i(x_1)\psi_j(x_2)\chi_\alpha(p)\right]\chi_0(p_2). \tag{4.21}$$
The nontrivial step is the transformation (4.8) to the new coordinates. However, since we know the analytical expression of the photon basis,
$$\chi_\alpha(p) = \frac{1}{\sqrt{2^\alpha\alpha!}}\left(\frac{\omega}{\pi}\right)^{1/4} e^{-\omega p^2/2}\,H_\alpha(\sqrt\omega\,p), \tag{4.22}$$
where $H_\alpha(z) = (-1)^\alpha e^{z^2}\frac{\mathrm d^\alpha}{\mathrm dz^\alpha}\left(e^{-z^2}\right)$ are Hermite polynomials, we can perform the transformation (4.8) explicitly. Specifically, we have to calculate terms of the form
$$\chi_\alpha\!\left(\tfrac{1}{\sqrt2}(q_1+q_2)\right)\chi_0\!\left(\tfrac{1}{\sqrt2}(q_1-q_2)\right) \propto e^{-\omega(q_1+q_2)^2/4}\,H_\alpha\!\left(\sqrt{\omega/2}\,(q_1+q_2)\right)H_0\,e^{-\omega(q_1-q_2)^2/4} = e^{-\omega(q_1+q_2)^2/4}\,H_\alpha\!\left(\sqrt{\omega/2}\,(q_1+q_2)\right)e^{-\omega(q_1-q_2)^2/4}$$
for all $\alpha$, where we used that $H_0 = 1$. We start with the Gaussian part of the oscillator states and calculate explicitly
$$e^{-\omega p^2/2}\,e^{-\omega p_2^2/2} \to e^{-\omega(q_1+q_2)^2/4}\,e^{-\omega(q_1-q_2)^2/4} = e^{-\omega q_1^2/2}\,e^{-\omega q_2^2/2}.$$
The transformation merely exchanges the coordinates, because it is orthogonal (cf. Eq. (4.7)). (Thus, also for more than two particles, the Gaussian part of the states remains form-invariant under the transformation.) For the remaining part involving the Hermite polynomial, the transformation is more involved, but we can employ the addition formula
$$H_\alpha(z_1 + z_2) = 2^{-\alpha/2}\sum_{\beta=0}^{\alpha}\binom{\alpha}{\beta}H_{\alpha-\beta}(z_1\sqrt2)\,H_\beta(z_2\sqrt2). \tag{4.23}$$
With $z_i = \sqrt{\omega/2}\,q_i$, we arrive after some algebra at
$$\Psi'(x_1,x_2,q_1,q_2) = \sum_{i,j,\alpha} C^{\alpha}_{ij}\,\psi_i(x_1)\psi_j(x_2)\sum_{\beta=0}^{\alpha}\sqrt{\frac{\alpha!}{\beta!(\alpha-\beta)!}}\,\frac{1}{\sqrt{2^\alpha}}\,\chi_{\alpha-\beta}(q_1)\chi_\beta(q_2) = \sum_{i,j,\alpha,\beta} C'^{\,\alpha,\beta}_{ij}\,\psi_i(x_1)\psi_j(x_2)\chi_\alpha(q_1)\chi_\beta(q_2). \tag{4.24}$$
In the last line, we introduced the polaritonic expansion coefficients $C'^{\,\alpha,\beta}_{ij}$, which are uniquely determined by the above equation.

First of all, Eq. (4.24) highlights how the value of $p$ is “distributed” over the new coordinates $q_1$ and $q_2$. This shows how 2 (or in general $N$) polaritonic orbitals can collectively carry the information of only one mode ($M$ modes). Beyond that, this explicit construction illustrates how intricately the coordinate transformation acts on the system. It is true that Eq. (4.24) is antisymmetric under $\mathbf z_1\sigma_1 \leftrightarrow \mathbf z_2\sigma_2$ and thus it is covered by our just-derived ansatz $\Psi'(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2)$. However, the converse is clearly not true, i.e., not all polaritonic wave functions that are antisymmetric are also of the form (4.24). This is a first indication that the ansatz (4.19) might be limited. Nevertheless, there is one generic feature in Eq. (4.24): it is symmetric with respect to the new coordinates $q_1$, $q_2$. We discuss this in the next subsection.
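The photonic part of Eq. (4.24) can be verified numerically. The sketch below is an illustration under the conventions of Eq. (4.22), with hypothetical function names; it evaluates the transformed product $\chi_\alpha\chi_0$ directly and compares it with the expansion in $\chi_{\alpha-\beta}(q_1)\chi_\beta(q_2)$, whose weight $\sqrt{\alpha!/(\beta!(\alpha-\beta)!)}\,2^{-\alpha/2}$ is written as $\sqrt{\binom{\alpha}{\beta}/2^\alpha}$.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def chi(n, p, omega):
    """Harmonic-oscillator eigenfunction in the form of Eq. (4.22)."""
    norm = (omega / math.pi) ** 0.25 / math.sqrt(2.0**n * math.factorial(n))
    return norm * math.exp(-0.5 * omega * p * p) * hermite(n, math.sqrt(omega) * p)

def transformed_product(alpha, q1, q2, omega):
    """chi_alpha(p) * chi_0(p2) with p, p2 expressed through q1, q2 (Eq. (4.8))."""
    s = math.sqrt(2.0)
    return chi(alpha, (q1 + q2) / s, omega) * chi(0, (q1 - q2) / s, omega)

def expansion(alpha, q1, q2, omega):
    """Right-hand side of the photonic part of Eq. (4.24)."""
    total = 0.0
    for beta in range(alpha + 1):
        weight = math.sqrt(math.comb(alpha, beta) / 2.0**alpha)
        total += weight * chi(alpha - beta, q1, omega) * chi(beta, q2, omega)
    return total

if __name__ == "__main__":
    for alpha in range(5):
        lhs = transformed_product(alpha, 0.3, -0.7, 1.3)
        rhs = expansion(alpha, 0.3, -0.7, 1.3)
        print(alpha, lhs, rhs)
```

The squared weights sum to one for every $\alpha$, which reflects that the orthogonal transformation preserves the norm, and the left-hand side is manifestly symmetric under $q_1 \leftrightarrow q_2$ because $\chi_0$ is even.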
In fact, the antisymmetric polariton space $h^2_p$ that we constructed in the previous subsection includes states that do not have a counterpart in the physical system. Let us illustrate this with our two-electron-one-mode example, neglecting electron-electron and electron-photon interactions. The corresponding independent-particle (IP) Hamiltonian reads
$$\hat H_{IP} = \sum_{k=1}^{2}\left[-\tfrac12\Delta_{\mathbf r_k} + v(\mathbf r_k)\right] + \tfrac12\left(-\frac{\mathrm d^2}{\mathrm dp^2} + \omega^2p^2\right). \tag{4.25}$$
The ground state of $\hat H_{IP}$ is simply
$$\Psi_0(\mathbf r_1\sigma_1,\mathbf r_2\sigma_2,p) = \frac{1}{\sqrt2}\left[\psi_1(\mathbf r_1\sigma_1)\psi_2(\mathbf r_2\sigma_2) - \psi_2(\mathbf r_1\sigma_1)\psi_1(\mathbf r_2\sigma_2)\right]\chi_0(p),$$
a product of a Slater determinant consisting of the two lowest eigenfunctions $\psi_1$, $\psi_2$ of $[-\tfrac12\Delta_{\mathbf r} + v(\mathbf r)]$ and $\chi_0(p) = (\omega/\pi)^{1/4}e^{-\omega p^2/2}$, which is the ground state of a harmonic oscillator with frequency $\omega$ (cf. Eq. (4.22)). As shown in the last subsection, we obtain the dressed version of $\Psi_0$ by multiplication with another oscillator ground state $\chi_0(p_2)$ with the same frequency. Performing the coordinate transformation (4.8) is especially simple in this case, because $\chi_0(p)$ is also a ground state. Consequently, the transformation does nothing else than replace $(p,p_2)$ with $(q_1,q_2)$.

(Note on Eq. (4.24): We can in principle derive a similar, but considerably more cumbersome, expression for $N$ electrons and $M$ modes. To derive Eq. (4.24), we only have to make use of the orthogonality and the explicit transformation of $p$. The auxiliary coordinates $p_2, p_3, \dots$ do not play a role, because only for $p$ do we have to take into account Hermite polynomials $H_\alpha$ with $\alpha > 0$.)
The complete auxiliary ground state then reads
$$\begin{aligned}
\Psi'_0(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2) &= \frac{1}{\sqrt2}\left[\psi_1(\mathbf r_1\sigma_1)\psi_2(\mathbf r_2\sigma_2) - \psi_2(\mathbf r_1\sigma_1)\psi_1(\mathbf r_2\sigma_2)\right]\chi_0(q_1)\chi_0(q_2)\\
&= \frac{1}{\sqrt2}\left[\psi_1(\mathbf r_1\sigma_1)\chi_0(q_1)\,\psi_2(\mathbf r_2\sigma_2)\chi_0(q_2) - \psi_2(\mathbf r_1\sigma_1)\chi_0(q_1)\,\psi_1(\mathbf r_2\sigma_2)\chi_0(q_2)\right]\\
&= \frac{1}{\sqrt2}\left[\phi_{10}(\mathbf z_1\sigma_1)\phi_{20}(\mathbf z_2\sigma_2) - \phi_{20}(\mathbf z_1\sigma_1)\phi_{10}(\mathbf z_2\sigma_2)\right],
\end{aligned}$$
where in the last line we combined the electronic and photonic orbitals with the same coordinate index into a dressed orbital, i.e., $\psi_i(\mathbf r_j\sigma_j)\chi_n(q_j) \equiv \phi_{in}(\mathbf z_j\sigma_j)$. This wave function is obviously antisymmetric with respect to the exchange of $\mathbf z_1\sigma_1$ and $\mathbf z_2\sigma_2$. Now let us consider another wave function,
$$\begin{aligned}
\tilde\Psi'(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2) &= \frac{1}{\sqrt2}\left[\phi_{10}(\mathbf z_1\sigma_1)\phi_{11}(\mathbf z_2\sigma_2) - \phi_{11}(\mathbf z_1\sigma_1)\phi_{10}(\mathbf z_2\sigma_2)\right]\\
&= \frac{1}{\sqrt2}\left[\psi_1(\mathbf r_1\sigma_1)\chi_0(q_1)\,\psi_1(\mathbf r_2\sigma_2)\chi_1(q_2) - \psi_1(\mathbf r_1\sigma_1)\chi_1(q_1)\,\psi_1(\mathbf r_2\sigma_2)\chi_0(q_2)\right]\\
&= \frac{1}{\sqrt2}\,\psi_1(\mathbf r_1\sigma_1)\psi_1(\mathbf r_2\sigma_2)\left[\chi_0(q_1)\chi_1(q_2) - \chi_1(q_1)\chi_0(q_2)\right],
\end{aligned}$$
which is a dressed Slater determinant and thus part of $h^2_p$. This state is, however, unphysical, because it violates the Pauli principle: both electrons occupy the same electronic orbital. Depending on their energy expectation values $E[\Psi'] = \langle\Psi'|\hat H'_{IP}\Psi'\rangle$, a minimization of the dressed Hamiltonian without ensuring the hybrid statistics could determine either of the two wave functions as the ground state. Only if $E[\Psi'_0] = \epsilon_1 + \epsilon_2 + \omega < 2\epsilon_1 + 2\omega = E[\tilde\Psi']$, where $\epsilon_1$ and $\epsilon_2$ are the eigenenergies corresponding to $\psi_1$ and $\psi_2$, does a simple minimization yield the correct solution. If, however, $\epsilon_2 - \epsilon_1 > \omega$, then a minimization of the dressed problem leads to the state $\tilde\Psi'$ that violates the Pauli principle. Within the dressed Slater determinant, the antisymmetry can be “carried” by the electronic or the photonic part of the orbital, which allows for states like $\tilde\Psi'$.
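The energy comparison above can be made concrete with a small sketch. It assumes, purely for illustration, that the electronic levels are those of a one-dimensional box of length `length` in atomic units (the picture used in Fig. 4.1); the function names are hypothetical. The two energies are $E[\Psi'_0] = \epsilon_1 + \epsilon_2 + \omega$ and $E[\tilde\Psi'] = 2\epsilon_1 + 2\omega$, as stated in the text.

```python
import math

def box_level(n, length=1.0):
    """Energy of the n-th state (n = 1, 2, ...) of a 1D box in atomic units."""
    return (n * math.pi) ** 2 / (2.0 * length**2)

def dressed_ip_energies(eps1, eps2, omega):
    """Independent-particle energies of the two dressed Slater determinants."""
    physical = eps1 + eps2 + omega         # psi_1, psi_2 combined with chi_0, chi_0
    unphysical = 2.0 * eps1 + 2.0 * omega  # psi_1 twice combined with chi_0, chi_1
    return physical, unphysical

def fermion_ansatz_is_safe(eps1, eps2, omega):
    """True if the physical state is lowest, i.e. eps2 - eps1 < omega."""
    physical, unphysical = dressed_ip_energies(eps1, eps2, omega)
    return physical < unphysical

if __name__ == "__main__":
    eps1, eps2 = box_level(1), box_level(2)
    for omega in (5.0, 20.0):
        print(omega, dressed_ip_energies(eps1, eps2, omega),
              fermion_ansatz_is_safe(eps1, eps2, omega))
```

For the unit box the gap $\epsilon_2 - \epsilon_1 = 3\pi^2/2$ is roughly 14.8, so in this toy setting a mode frequency below that value would let an unrestricted minimization end up in the Pauli-violating state.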
For an interacting problem, both cases cannot be separated so easily, but the problem remains in principle the same, as we show numerically in Sec. 6. Note that this is very similar to the violation of the $N$-representability conditions in variational 2RDM theory (see Sec. 2.4.1). This means that we either have to make sure that $\omega$ is large compared to the electronic excitations, such that the unrestricted minimization with only fermionic symmetry in $\mathbf z\sigma$ picks the right wave function [1] (fermion ansatz), or we have to enforce the hybrid statistics.

To guarantee the Pauli principle, we actually have to build the dressed many-body space by requiring antisymmetry only with respect to the electronic coordinates, together with symmetry with respect to the photonic coordinates. We can summarize this as
$$\mathbf r_1\sigma_1 \leftrightarrow \mathbf r_2\sigma_2 \;\rightarrow\; \Psi' \leftrightarrow -\Psi' \tag{4.26a}$$
$$q_1 \leftrightarrow q_2 \;\rightarrow\; \Psi' \leftrightarrow \Psi'. \tag{4.26b}$$
This does not mean that we have to discard the wave-function expansion in terms of dressed Slater determinants (4.19), because from (4.26) also follows
$$\mathbf z_1\sigma_1 \leftrightarrow \mathbf z_2\sigma_2 \;\rightarrow\; \Psi' \leftrightarrow -\Psi'. \tag{4.27}$$
If we require (4.27), which builds $h^2_p$, and then constrain the space by either (4.26a) or (4.26b), we fulfil (4.26) equivalently. In Sec. 4.3, we show for the general case of $N$ electrons and $M$ modes that (4.26) is indeed necessary and sufficient to guarantee a one-to-one correspondence between the physical and auxiliary ground state. This establishes the polariton description of cavity QED.

Let us now illustrate how the extra condition indeed rules out the unphysical state $\tilde\Psi'$.
We can enforce the constitutive relations on $\tilde\Psi'$ by adding two extra terms that exchange either electronic or photonic coordinates, which leads to
$$\begin{aligned}
\tilde\Psi'_{hybrid}(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2) &= \overbrace{\phi_{10}(\mathbf r_1q_1\sigma_1)\phi_{11}(\mathbf r_2q_2\sigma_2) - \phi_{11}(\mathbf r_1q_1\sigma_1)\phi_{10}(\mathbf r_2q_2\sigma_2)}^{\propto\,\tilde\Psi'} + \phi_{10}(\mathbf r_1q_2\sigma_1)\phi_{11}(\mathbf r_2q_1\sigma_2) - \phi_{11}(\mathbf r_1q_2\sigma_1)\phi_{10}(\mathbf r_2q_1\sigma_2)\\
&= \psi_1(\mathbf r_1\sigma_1)\psi_1(\mathbf r_2\sigma_2)\left[\chi_0(q_1)\chi_1(q_2) - \chi_1(q_1)\chi_0(q_2)\right] + \psi_1(\mathbf r_1\sigma_1)\psi_1(\mathbf r_2\sigma_2)\left[\chi_0(q_2)\chi_1(q_1) - \chi_1(q_2)\chi_0(q_1)\right] = 0.
\end{aligned}$$

The next step is to construct a theory based on these polaritons. Unfortunately, the constitutive relation (4.26) cannot be enforced on the wave-function level in a practical way. For that, we would need to generalize the concept of a Slater determinant to polaritonic and electronic coordinates, as we have done with $\tilde\Psi'$ to construct $\tilde\Psi'_{hybrid}$. In the general case, such a “generalized Slater determinant” for two particles is given by
$$\Psi'_{ab}(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2) = \phi_a(\mathbf r_1,q_1,\sigma_1)\phi_b(\mathbf r_2,q_2,\sigma_2) - \phi_b(\mathbf r_1,q_1,\sigma_1)\phi_a(\mathbf r_2,q_2,\sigma_2) + \phi_a(\mathbf r_1,q_2,\sigma_1)\phi_b(\mathbf r_2,q_1,\sigma_2) - \phi_b(\mathbf r_1,q_2,\sigma_1)\phi_a(\mathbf r_2,q_1,\sigma_2), \tag{4.28}$$
where $\phi_{a/b}$ are some (orthonormal) polariton orbitals. To see why such an ansatz is problematic, we calculate the norm square of $\Psi'_{ab}$, i.e.,
$$\begin{aligned}
\|\Psi'_{ab}\|^2 &= \sum_{\sigma_1,\sigma_2}\int\mathrm d\mathbf z_1\mathrm d\mathbf z_2\,\Psi'^{*}_{ab}(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2)\,\Psi'_{ab}(\mathbf z_1\sigma_1,\mathbf z_2\sigma_2)\\
&= 4\left(\langle\phi_a|\phi_a\rangle\langle\phi_b|\phi_b\rangle - \langle\phi_a|\phi_b\rangle\langle\phi_b|\phi_a\rangle\right)\\
&\quad + \sum_{\sigma_1,\sigma_2}\int\mathrm d\mathbf z_1\mathrm d\mathbf z_2\,\phi^*_a(\mathbf r_1,q_1,\sigma_1)\phi^*_b(\mathbf r_2,q_2,\sigma_2)\,\phi_a(\mathbf r_1,q_2,\sigma_1)\phi_b(\mathbf r_2,q_1,\sigma_2) + \text{c.c.}\\
&\quad - \sum_{\sigma_1,\sigma_2}\int\mathrm d\mathbf z_1\mathrm d\mathbf z_2\,\phi^*_a(\mathbf r_1,q_1,\sigma_1)\phi^*_b(\mathbf r_2,q_2,\sigma_2)\,\phi_b(\mathbf r_1,q_2,\sigma_1)\phi_a(\mathbf r_2,q_1,\sigma_2) - \text{c.c.}\\
&\quad + \sum_{\sigma_1,\sigma_2}\int\mathrm d\mathbf z_1\mathrm d\mathbf z_2\,\phi^*_b(\mathbf r_1,q_1,\sigma_1)\phi^*_a(\mathbf r_2,q_2,\sigma_2)\,\phi_b(\mathbf r_1,q_2,\sigma_1)\phi_a(\mathbf r_2,q_1,\sigma_2) + \text{c.c.}\\
&\quad - \sum_{\sigma_1,\sigma_2}\int\mathrm d\mathbf z_1\mathrm d\mathbf z_2\,\phi^*_b(\mathbf r_1,q_1,\sigma_1)\phi^*_a(\mathbf r_2,q_2,\sigma_2)\,\phi_a(\mathbf r_1,q_2,\sigma_1)\phi_b(\mathbf r_2,q_1,\sigma_2) - \text{c.c.}
\end{aligned}$$
In this expression, only the first line is what would appear in a standard Slater determinant. Since we consider orthonormal orbitals, i.e., $\langle\phi_a|\phi_b\rangle = \delta_{ab}$, the integrals of the first term are all one and the integrals of the second term are all zero. Thus, for an ordinary Slater determinant, the orthonormality condition is sufficient to fix its norm, independently of the orbitals. However, with the terms that stem from the additional symmetry requirements, new “mixed-index” terms arise whenever one or more of the orbitals have coordinates with different indices. All these (in this case) 8 terms are actually two-body-like integrals, because we cannot separate the integrations over one set of coordinates as we have done in the first line. The computation of these integrals is nontrivial and their values are not fixed by an orthonormality condition. This means that the norm of $\Psi'_{ab}$ depends on the specific form of $\phi_{a/b}$ and thus has to be calculated explicitly to normalize $\Psi'_{ab}$.

The number of such terms for an $N$-body generalized Slater determinant is given by all permutations of polaritonic and electronic coordinates, which is $(N!)^2$, minus the $N!$ “ordinary” terms. The total number of “mixed-index” terms, $(N!)^2 - N!$, therefore grows factorially with the number of particles. The normalization (and in the same way also the calculation of expectation values) of a wave function that explicitly exhibits the symmetry (4.26) requires the numerical calculation of (over)exponentially many nontrivial terms. Such an explicit Slater-determinant ansatz is thus infeasible in practice. We have found yet another exponential wall.

Comparing the challenges of the dressed wave function with the difficulties of constructing a “polariton-like” wave function within generalized MF theory (Sec. 3.1.2), it seems that any many-body description based on polaritons leads not only to an exponential but even to a “factorial” wall.
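To get a feeling for this growth, the following short sketch tabulates the count of “mixed-index” terms, reading the expression in the text as $(N!)^2 - N!$ (all coordinate permutations minus the ordinary ones). The function name is hypothetical and the count is taken from the text at face value.

```python
import math

def mixed_index_terms(n_particles):
    """Number of mixed-index terms, (N!)^2 - N!, of a generalized Slater determinant."""
    ordinary = math.factorial(n_particles)
    return ordinary**2 - ordinary

if __name__ == "__main__":
    for n in range(2, 9):
        print(n, mixed_index_terms(n))
```

Already for a handful of particles the count reaches the millions and beyond, which is why the text turns to reduced density matrices instead of an explicit ansatz.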
However, the dressed construction is more flexible than the standard description, and we show in the following how we can exploit this flexibility to construct simple but accurate approximation strategies based on polaritonic orbitals. The key role will be played by the (ensemble) 1RDM, which allows us to enforce exchange symmetry without the explicit use of, e.g., Slater determinants.

For that, we generalize the description from pure states (described by single wave functions) to ensembles, characterized by the $N$-body density matrix $\Gamma^N_E = \sum_i w_i\Psi^*_i\Psi_i$, cf. Eq. (2.89), where the $\Psi_i$ are eigenstates of the Hamiltonian and the $w_i$ are weight coefficients with $0 \le w_i \le 1$ and $\sum_i w_i = 1$. It is considerably simpler to connect a given 1RDM to an ensemble density matrix $\Gamma^N_E$ than to a pure-state density matrix $\Gamma^N = \Psi^*\Psi$. This connection is given by the (ensemble) $N$-representability conditions. In other words, if a given matrix is bosonic (fermionic) $N$-representable, then a corresponding $N$-body density matrix exists that is composed of only (anti)symmetric pure-state wave functions. The idea is now to rule out unphysical states such as $\tilde\Psi'$ by utilizing the $N$-representability conditions of the 1RDM instead of an explicit ansatz such as the generalized Slater determinant (Eq. (4.28)).

We start by noting that we can straightforwardly generalize the ensemble description to the dressed setting if we utilize the eigenstates $\Psi'_i$ of $\hat H'$. However, as we are only interested in the pure ground state $\Psi'$, we will only implicitly make use of the ensemble description. This is comparable to the use of the ensemble $N$-representability conditions in RDMFT (see Sec. 2.4.2). We therefore define the dressed 1RDM
$$\gamma[\Psi'](\mathbf z\sigma,\mathbf z'\sigma') = 2\sum_{\sigma_2}\int\mathrm d\mathbf z_2\,\Psi'^{*}(\mathbf z'\sigma',\mathbf z_2\sigma_2)\,\Psi'(\mathbf z\sigma,\mathbf z_2\sigma_2) \tag{4.29}$$
in terms of the dressed wave function $\Psi'$.
Since the pure state $\Psi'$ is an extreme case of an ensemble state, its Fermi statistics with respect to the polaritonic coordinates $\mathbf z\sigma$ (cf. Eq. (4.44)) is also apparent in $\gamma[\Psi']$ in the form of fermionic $N$-representability conditions. (The generalization to ensembles would simply consist in replacing $\Gamma^{N\prime} = \Psi'^{*}\Psi'$ by $\Gamma^{N\prime}_E = \sum_i w_i\Gamma^{N\prime}_i$ for a set of pure-state density matrices $\Gamma^{N\prime}_i$.) By using the natural orbitals $\phi_i$ and the natural occupation numbers $n_i$, which are defined by the eigenvalue equation $n_i\phi_i = \hat\gamma\phi_i$, we represent $\gamma$ in its diagonal form
$$\gamma[\Psi'](\mathbf z\sigma,\mathbf z'\sigma') = \sum_{i=1}^{\infty} n_i\,\phi^*_i(\mathbf z'\sigma')\,\phi_i(\mathbf z\sigma). \tag{4.30}$$
The fermionic $N$-representability conditions become especially simple in this representation and are given by (cf. Eq. (2.91))
$$0 \le n_i \le 1 \quad \forall i, \qquad \sum_i n_i = N. \tag{4.31}$$
However, we still need to take care that the antisymmetry remains in the electronic subsystem. For that, we now go one step further and define from the dressed 1RDM the electronic 1RDM
$$\gamma_e[\Psi'](\mathbf r\sigma,\mathbf r'\sigma') = \int\mathrm d^Mq\,\gamma[\Psi'](\mathbf rq\sigma,\mathbf r'q\sigma') = \sum_{i=1}^{\infty} n^e_i\,\psi^{e*}_i(\mathbf r'\sigma')\,\psi^e_i(\mathbf r\sigma) \tag{4.32}$$
by an integration over the photonic coordinates. In the second equality, we introduced the diagonal representation of $\gamma_e$ with the corresponding natural orbitals $\psi^e_i$ and natural occupation numbers $n^e_i$. The Fermi statistics with regard to only the electronic coordinates $\mathbf r\sigma$ thus becomes apparent by considering the electronic natural occupation numbers,
$$0 \le n^e_i \le 1 \quad \forall i \tag{4.33a}$$
$$\sum_i n^e_i = N. \tag{4.33b}$$
With Eqs. (4.31) and (4.33), together with the definition of $\gamma_e$ in terms of $\gamma$ (Eq. (4.32)), we have derived a set of necessary conditions to guarantee the hybrid statistics.
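The conditions (4.33) can be tested on the two independent-particle states discussed earlier. The sketch below is an illustration under several assumptions: the two-polariton wave function is stored as real coefficients `coeff[(i, a, j, b)]` of the products $\psi_i(x_1)\chi_a(q_1)\psi_j(x_2)\chi_b(q_2)$ in an orthonormal basis with two electronic and two photonic orbitals (indices starting at 0), it is antisymmetric under particle exchange, and the 1RDM is normalized to $N = 2$. All names are hypothetical.

```python
import math

def electronic_1rdm(coeff, n_el=2, n_ph=2):
    """Electronic 1RDM in the orbital basis: trace over q1 and over particle 2."""
    gamma = [[0.0] * n_el for _ in range(n_el)]
    for i in range(n_el):
        for ip in range(n_el):
            gamma[i][ip] = 2.0 * sum(
                coeff.get((i, a, j, b), 0.0) * coeff.get((ip, a, j, b), 0.0)
                for a in range(n_ph) for j in range(n_el) for b in range(n_ph))
    return gamma

def occupations_2x2(gamma):
    """Eigenvalues of a real symmetric 2x2 matrix in closed form."""
    mean = 0.5 * (gamma[0][0] + gamma[1][1])
    radius = math.hypot(0.5 * (gamma[0][0] - gamma[1][1]), gamma[0][1])
    return mean - radius, mean + radius

def satisfies_conditions(occ, n_particles=2, tol=1e-12):
    """Check 0 <= n_i <= 1 and sum_i n_i = N, cf. Eq. (4.33)."""
    in_range = all(-tol <= n <= 1.0 + tol for n in occ)
    return in_range and abs(sum(occ) - n_particles) <= tol

s = 1.0 / math.sqrt(2.0)
# Physical state: electrons in psi_1 and psi_2, both photon orbitals chi_0.
physical = {(0, 0, 1, 0): s, (1, 0, 0, 0): -s}
# Unphysical state: both electrons in psi_1, antisymmetry carried by chi_0, chi_1.
unphysical = {(0, 0, 0, 1): s, (0, 1, 0, 0): -s}

if __name__ == "__main__":
    for name, state in (("physical", physical), ("unphysical", unphysical)):
        occ = occupations_2x2(electronic_1rdm(state))
        print(name, occ, satisfies_conditions(occ))
```

In this toy basis the physical state yields occupations of one for both orbitals, while the unphysical state puts an occupation of two into a single electronic orbital and therefore fails the upper bound.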
Further conditions would be necessary for a sufficient characterization of the set of polaritonic wave functions that exhibit the exchange symmetry (4.26). This is comparable to the $N$-representability conditions of the 2RDM [214], since the dressed 1RDM $\gamma(\mathbf z; \mathbf z') = \gamma(\mathbf r, \mathbf q; \mathbf r', \mathbf q')$ written in electronic and photonic variables depends on four particle coordinates, exactly as the 2RDM. Importantly, the conditions (4.33) are sufficient to guarantee the antisymmetry in the fermionic sector, which is the only symmetry of the physical system and thus the most important one. We therefore focus in the following on the conditions (4.33).

Let us illustrate this new point of view with the example from before. The correct auxiliary ground state $\Psi'$ satisfies the conditions of Eq. (4.33), since

$$\gamma_e[\Psi'] = 2 \sum_{\sigma_2} \int \mathrm d\mathbf z_2\, \mathrm d\mathbf q\, \Psi'^*(\mathbf r'\mathbf q\sigma', \mathbf z_2\sigma_2)\, \Psi'(\mathbf r\mathbf q\sigma, \mathbf z_2\sigma_2) = \sum_{i=1}^{2} \psi_i^*(\mathbf r'\sigma')\, \psi_i(\mathbf r\sigma).$$

The two electronic orbitals are the natural orbitals of $\gamma_e[\Psi']$ with natural occupation numbers $n_1 = n_2 = 1$. (Footnote: Note that the normalization of $\gamma_e$ to the electron number $N$ is a direct consequence of the dressed construction, which considers exactly $N$ polaritons for a system with $N$ electrons.) For the unphysical state $\tilde\Psi'$, we get instead

$$\gamma_e[\tilde\Psi'] = 2\, \psi_1^*(\mathbf r'\sigma')\, \psi_1(\mathbf r\sigma),$$

which violates the $N$-representability conditions of Eq. (4.33) and thus the Pauli principle. For the wave functions of an interacting system, the diagonalization of $\gamma_e[\Psi']$ will not be as trivial as for this simple example, but the conditions (4.33) are nevertheless sufficient to ensure the Pauli exclusion principle in the sense that at most one fermion can occupy a single quantum state (upper bound in Eq. (4.33a)).

Most importantly, we can employ these conditions to obtain a computationally tractable procedure (presented in Sec. 5.1) to approximately ensure the polariton statistics implied by Eq. (4.43), instead of resorting to the factorially growing number of "mixed-index" orbitals.

Dressed construction: the general case

We now recapitulate the dressed construction for the general case of $N$ electrons that are coupled to $M$ photon modes, following Ref. [2, Sec. 2]. We consider the setting of cavity QED with the Hamiltonian (reordering its terms for the purposes of this section)

$$\hat H = \underbrace{\sum_{k=1}^{N} \left[ -\tfrac12 \Delta_{\mathbf r_k} + v(\mathbf r_k) \right] + \tfrac12 \sum_{k \neq l}^{N} w(\mathbf r_k, \mathbf r_l)}_{\hat H_m = \hat T[t] + \hat V[v] + \hat W[w]} + \underbrace{\sum_{\alpha=1}^{M} \left( -\tfrac12 \frac{\partial^2}{\partial p_\alpha^2} + \frac{\omega_\alpha^2}{2} p_\alpha^2 \right)}_{\hat H_{ph}} + \underbrace{\sum_{\alpha=1}^{M} \left( -\omega_\alpha p_\alpha\, \boldsymbol\lambda_\alpha \cdot \hat{\mathbf D} \right)}_{\hat H_I} + \underbrace{\tfrac12 \sum_{\alpha=1}^{M} \left( \boldsymbol\lambda_\alpha \cdot \hat{\mathbf D} \right)^2}_{\hat H_d}. \tag{cf. 1.37}$$

The first three terms constitute the usual matter Hamiltonian of quantum mechanics, with the kinetic and external one-body parts, $\hat T[t]$ and $\hat V[v]$, respectively, and the two-body interaction term $\hat W[w]$. Here the kinetic term is the usual Laplacian $t(\mathbf r) = -\tfrac12 \Delta_{\mathbf r}$, the external potential $v(\mathbf r)$ is due to the attractive nuclei/ions, and $w(\mathbf r, \mathbf r')$ is the electron-electron repulsion. Usually this is just taken as the free-space Coulomb interaction $w(\mathbf r, \mathbf r') = 1/|\mathbf r - \mathbf r'|$, but inside a cavity the interaction can be modified [127]. The fourth term $\hat H_{ph}$ is the free field energy of $M$ effective modes of the cavity. The effective modes are characterized by their displacement coordinate $p_\alpha$, frequency $\omega_\alpha$ and polarization vectors $\boldsymbol\lambda_\alpha$. The latter already include the effective coupling strength $g_\alpha = |\boldsymbol\lambda_\alpha| \sqrt{\omega_\alpha/2} \propto 1/\sqrt{V}$ [48, 14], which is proportional to the inverse square root of the cavity mode volume $V$.
In the dipole approximation, the coupling between light and matter is described by the bilinear term $\hat H_I$ together with the dipole self-energy term $\hat H_d$. Here the dipole operator is defined by $\hat{\mathbf D} = \sum_{k=1}^{N} \mathbf r_k$.

The ground state of $\hat H$ is given by a wave function

$$\Psi(\mathbf x_1, \dots, \mathbf x_N, p_1, \dots, p_M) \tag{4.34}$$

that depends on $N$ spin-spatial electron coordinates and $M$ photon mode-displacement coordinates and that is antisymmetric with respect to the exchange of any two electron coordinates.

To turn this coupled electron-photon problem into an equivalent and exact dressed problem, we follow three steps:

1. For each mode $\alpha = 1, \dots, M$ and all but the first electron, $i = 2, \dots, N$, we introduce extra auxiliary coordinates $p_{\alpha,i}$. This adds $(N-1)M$ extra degrees of freedom to the problem. In this higher-dimensional auxiliary configuration space, we now consider wave functions depending on $4N + NM$ coordinates, i.e.,
$$\Psi'(\mathbf r_1\sigma_1, \dots, \mathbf r_N\sigma_N, p_1, \dots, p_M, p_{1,2}, \dots, p_{1,N}, \dots, p_{M,2}, \dots, p_{M,N}).$$
Here and in the following, we denote all quantities in the auxiliary configuration space with a prime.

2. We next construct an auxiliary Hamiltonian in the extended configuration space of the form
$$\hat H' = \hat H + \sum_{\alpha=1}^{M} \hat\Pi_\alpha, \quad \text{where} \quad \hat\Pi_\alpha = \sum_{i=2}^{N} \left( -\tfrac12 \frac{\partial^2}{\partial p_{\alpha,i}^2} + \frac{\omega_\alpha^2}{2} p_{\alpha,i}^2 \right) \tag{4.35}$$
depends only on these new auxiliary coordinates. This construction guarantees that the auxiliary degrees of freedom do not mix with the physical ones, which will ensure a simple and explicit connection between the physical and auxiliary system.

3. Finally, we perform an orthogonal coordinate transformation of the physical and auxiliary photon coordinates $(p_1, \dots, p_{M,N}) \to (q_{1,1}, \dots, q_{M,N})$ such that
$$p_\alpha = \frac{1}{\sqrt N} \left( q_{\alpha,1} + \dots + q_{\alpha,N} \right), \qquad -\tfrac12 \frac{\partial^2}{\partial p_\alpha^2} + \frac{\omega_\alpha^2}{2} p_\alpha^2 + \hat\Pi_\alpha = \sum_{i=1}^{N} \left( -\tfrac12 \frac{\partial^2}{\partial q_{\alpha,i}^2} + \frac{\omega_\alpha^2}{2} q_{\alpha,i}^2 \right). \tag{4.36}$$
Note that the second relation is automatically satisfied for any orthogonal transformation, and the first relation defines $p_\alpha$ as the "center-of-mass" of all the $q_{\alpha,i}$ with uniform relative masses $1/\sqrt N$.

In total, we then find the auxiliary Hamiltonian in the higher-dimensional configuration space given as

$$\hat H' = \sum_{k=1}^{N} \left[ -\tfrac12 \Delta_{\mathbf r_k} + v(\mathbf r_k) \right] + \tfrac12 \sum_{k \neq l} w(\mathbf r_k, \mathbf r_l) - \sum_{\alpha=1}^{M} \omega_\alpha p_\alpha\, \boldsymbol\lambda_\alpha \cdot \hat{\mathbf D} + \tfrac12 \sum_{\alpha=1}^{M} \left( \boldsymbol\lambda_\alpha \cdot \hat{\mathbf D} \right)^2 + \sum_{\alpha=1}^{M} \left( -\tfrac12 \frac{\partial^2}{\partial p_\alpha^2} + \frac{\omega_\alpha^2}{2} p_\alpha^2 \right) + \sum_{\alpha=1}^{M} \sum_{i=2}^{N} \left( -\tfrac12 \frac{\partial^2}{\partial p_{\alpha,i}^2} + \frac{\omega_\alpha^2}{2} p_{\alpha,i}^2 \right)$$
$$\overset{(4.36)}{=} \sum_{k=1}^{N} \left\{ -\tfrac12 \Delta_{\mathbf r_k} + v(\mathbf r_k) + \sum_{\alpha=1}^{M} \left[ -\tfrac12 \frac{\partial^2}{\partial q_{\alpha,k}^2} + \frac{\omega_\alpha^2}{2} q_{\alpha,k}^2 - \frac{\omega_\alpha}{\sqrt N} q_{\alpha,k} (\boldsymbol\lambda_\alpha \cdot \mathbf r_k) + \tfrac12 (\boldsymbol\lambda_\alpha \cdot \mathbf r_k)^2 \right] \right\} + \tfrac12 \sum_{k \neq l} \left[ w(\mathbf r_k, \mathbf r_l) + \sum_{\alpha=1}^{M} \left( -\frac{\omega_\alpha}{\sqrt N} q_{\alpha,k}\, \boldsymbol\lambda_\alpha \cdot \mathbf r_l - \frac{\omega_\alpha}{\sqrt N} q_{\alpha,l}\, \boldsymbol\lambda_\alpha \cdot \mathbf r_k + \boldsymbol\lambda_\alpha \cdot \mathbf r_k\, \boldsymbol\lambda_\alpha \cdot \mathbf r_l \right) \right],$$

where we inserted the definition of the total dipole operator and reordered the expressions such that the terms with only one index and the terms with two different indices are grouped together. Introducing then a $(3+M)$-dimensional polaritonic vector of space and transformed photon coordinates, $\mathbf z = \mathbf r\mathbf q$ with $\mathbf q \equiv (q_1, \dots, q_M)$, we can rewrite the above Hamiltonian as

$$\hat H' = \sum_{k=1}^{N} \left[ -\tfrac12 \Delta'_k + v'(\mathbf z_k) \right] + \tfrac12 \sum_{k \neq l} w'(\mathbf z_k, \mathbf z_l) = \hat T[t'] + \hat V[v'] + \hat W[w'], \tag{4.37}$$

where we introduced the dressed one-body terms

$$t'(\mathbf z) = -\tfrac12 \Delta' \equiv -\tfrac12 \sum_{i=1}^{3} \frac{\partial^2}{\partial r_i^2} - \tfrac12 \sum_{\alpha=1}^{M} \frac{\partial^2}{\partial q_\alpha^2}, \tag{4.38}$$
$$v'(\mathbf z) = v(\mathbf r) + \sum_{\alpha=1}^{M} \left[ \frac{\omega_\alpha^2}{2} q_\alpha^2 - \frac{\omega_\alpha}{\sqrt N} q_\alpha\, \boldsymbol\lambda_\alpha \cdot \mathbf r + \tfrac12 (\boldsymbol\lambda_\alpha \cdot \mathbf r)^2 \right], \tag{4.39}$$

and the dressed two-body interaction term

$$w'(\mathbf z, \mathbf z') = w(\mathbf r, \mathbf r') + \sum_{\alpha=1}^{M} \left[ -\frac{\omega_\alpha}{\sqrt N} q_\alpha\, \boldsymbol\lambda_\alpha \cdot \mathbf r' - \frac{\omega_\alpha}{\sqrt N} q'_\alpha\, \boldsymbol\lambda_\alpha \cdot \mathbf r + \boldsymbol\lambda_\alpha \cdot \mathbf r\, \boldsymbol\lambda_\alpha \cdot \mathbf r' \right]. \tag{4.40}$$

We see here that only the conditions (4.36), but not the details of the coordinate transformation of step 3, are important for our construction [1]. The crucial part of this coordinate transformation is the replacement of $p_\alpha$ in the interaction terms $p_\alpha\, \boldsymbol\lambda_\alpha \cdot \hat{\mathbf D}$. Instead of $p_\alpha$ only, now all $q_{\alpha,i}$ couple to the dipole of the matter system, just with a coupling strength rescaled by the factor $1/\sqrt N$.

The hybrid statistics of the dressed wave function
Let us next discuss the wave function $\Psi'$ in the auxiliary configuration space. The wave function $\Psi$ in the usual configuration space is a (normalized) solution of the (time-independent) Schrödinger equation $E\Psi = \hat H\Psi$. Since $\hat H' = \hat H + \sum_{\alpha=1}^{M} \hat\Pi_\alpha$ and $\hat\Pi_\alpha$ acts only on the auxiliary coordinates, we can simply construct

$$\Psi'(\mathbf r_1\sigma_1, \dots, p_{M,N}) = \Psi(\mathbf r_1\sigma_1, \dots, \mathbf r_N\sigma_N; p_1, \dots, p_M)\, \chi(p_{1,2}, \dots, p_{M,N}), \tag{4.41}$$

with $\chi$ being the (normalized) ground state of $\sum_{\alpha=1}^{M} \hat\Pi_\alpha$, which is a product of individual harmonic-oscillator ground states. Clearly, $\Psi'$ is a normalized solution of the auxiliary Schrödinger equation $E'\Psi' = \hat H'\Psi'$. In principle, any combination of eigenstates of the auxiliary harmonic oscillators would lead to a new eigenfunction of $\hat H'$, but since we focus here on the ground state, the natural choice is the lowest-energy solution. Rewriting this wave function in the new coordinates and employing the polaritonic coordinates $\mathbf z\sigma \equiv \mathbf r\mathbf q\sigma$, we arrive at

$$\Psi'(\mathbf r_1\sigma_1, \dots, \mathbf r_N\sigma_N, q_{1,1}, \dots, q_{M,N}) = \Psi'(\mathbf z_1\sigma_1, \dots, \mathbf z_N\sigma_N). \tag{4.42}$$

This polaritonic wave function, as the ground state of (4.37), is the reformulation of the original electron-photon problem of (1.37) we were looking for. Since all the new photonic coordinates belong to harmonic-oscillator ground states, exchanging $p_{\alpha,i}$ with $p_{\alpha,j}$ does not change the total wave function $\Psi'$, and this property transfers to the exchange of any coordinates $q_{\alpha,i}$ and $q_{\alpha,j}$. Hence, we now have a bosonic symmetry with respect to the $\mathbf q$ coordinates. Since the electronic part of the auxiliary system is not affected by the coordinate transformation, the electronic symmetries are the same in the physical and auxiliary system, i.e., we have a fermionic symmetry with respect to $\mathbf r\sigma$.
Together these two fundamental symmetries imply that the polaritonic coordinates $\mathbf z\sigma$ have fermionic character. The symmetries of the polaritonic wave function $\Psi'$ can be summarized as

$$\mathbf r_k\sigma_k \leftrightarrow \mathbf r_l\sigma_l \;\rightarrow\; \Psi' \leftrightarrow -\Psi' \tag{4.43a}$$
$$\mathbf q_k \leftrightarrow \mathbf q_l \;\rightarrow\; \Psi' \leftrightarrow \Psi' \tag{4.43b}$$

from which follows

$$\mathbf z_k\sigma_k \leftrightarrow \mathbf z_l\sigma_l \;\rightarrow\; \Psi' \leftrightarrow -\Psi'. \tag{4.44}$$

This means that, although the dressed wave function has fermionic statistics (4.44) in terms of the polaritonic coordinates $\mathbf z\sigma$, due to the constitutive relations (4.43) it actually consists of two types of particles: one with fermionic character and another with bosonic character. Consequently, the polariton wave function $\Psi'$ has a hybrid Fermi-Bose statistics. As a consequence of these symmetries, we find the Pauli exclusion principle for the electrons, whereas many photonic auxiliary entities can occupy the same quantum state.

Indeed, we can prove that the conditions of Eq. (4.43) are necessary and sufficient to establish a one-to-one mapping between the dressed ground states $\Psi'$ and the physical ground states $\Psi$. The first part, "$\Psi \to \Psi'$", is given by the dressed construction. For the second part, "$\Psi' \to \Psi$", we show that the minimal-energy state $\Psi'$ has the form $\Psi' = \Psi\chi$, cf. Eq. (4.41). For that, we consider a trial wave function in the dressed space,

$$\Upsilon'(\mathbf z_1\sigma_1, \dots, \mathbf z_N\sigma_N) \equiv \Upsilon'(\mathbf r_1\sigma_1, \dots, p_M; p_{1,2}, \dots, p_{M,N}),$$

that fulfils (4.43). Since $\hat H$ only acts on $(\mathbf r_1, \dots, \mathbf r_N, p_1, \dots, p_M)$ and $\sum_{\alpha=1}^{M} \hat\Pi_\alpha$ only acts on $(p_{1,2}, \dots, p_{M,N})$, it then holds that

$$\inf_{\Upsilon'} \langle \Upsilon' | \hat H' \Upsilon' \rangle \ge \inf_{\Upsilon'} \langle \Upsilon' | \hat H \Upsilon' \rangle + \inf_{\Upsilon'} \langle \Upsilon' | \sum_{\alpha=1}^{M} \hat\Pi_\alpha \Upsilon' \rangle \overset{(4.43)}{=} \langle \Psi | \hat H \Psi \rangle + \langle \chi | \sum_{\alpha=1}^{M} \hat\Pi_\alpha \chi \rangle = \langle \Psi' | \hat H' \Psi' \rangle.$$

We have thus proven the assumption.
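The two properties of the coordinate transformation (4.36) that the argument above relies on can be checked numerically. The following is a minimal sketch for a single mode, not the construction used in the thesis: the Helmert-type matrix is only one arbitrary orthogonal completion of the first row, and the function names and the sample values of q are hypothetical. It verifies that the first new coordinate is the scaled sum of all q, that the sum of squares (and hence the harmonic potential) keeps its form, and that the exponent of the auxiliary Gaussian ground state is unchanged under any permutation of the q coordinates.

```python
import itertools
import math


def helmert_matrix(n):
    """Orthogonal n x n matrix whose first row is (1/sqrt(n), ..., 1/sqrt(n)).

    The remaining rows (Helmert contrasts) are an arbitrary choice; any
    orthonormal completion of the first row would serve equally well.
    """
    rows = [[1.0 / math.sqrt(n)] * n]
    for k in range(1, n):
        norm = math.sqrt(k * (k + 1))
        rows.append([1.0 / norm] * k + [-k / norm] + [0.0] * (n - k - 1))
    return rows


def to_p(q):
    """Map q_{alpha,1..N} to (p_alpha, p_{alpha,2}, ..., p_{alpha,N}) for one mode."""
    matrix = helmert_matrix(len(q))
    return [sum(m * x for m, x in zip(row, q)) for row in matrix]


def auxiliary_exponent(q):
    """Sum of squares of the auxiliary coordinates p_{alpha,2..N}.

    The auxiliary ground state chi is proportional to
    exp(-omega_alpha / 2 * auxiliary_exponent(q)).
    """
    return sum(x * x for x in to_p(q)[1:])


q_sample = [0.3, -1.2, 0.7, 2.0]  # hypothetical displacement values
p_sample = to_p(q_sample)
print("p_alpha              :", p_sample[0])
print("sum p^2 versus sum q^2:", sum(x * x for x in p_sample), sum(x * x for x in q_sample))
print("auxiliary exponent   :", auxiliary_exponent(q_sample))
```

Because the exponent equals the sum of all squared q minus the squared center-of-mass coordinate, it is symmetric in the q by construction; the check merely makes this explicit for one concrete transformation.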
For excited states, the constitutive relations are necessary but not sufficient to single out the eigenfunctions of $\hat H'$ that correspond to the original Hamiltonian in terms of simple products. The reason is that we cannot make use of the variational principle as above. Let us illustrate this for the first excited state $\Psi_1$ with corresponding dressed version $\Psi'_1 = \Psi_1\chi$. We could try to follow a similar line of proof as above by considering only trial wave functions $\tilde\Upsilon' \perp \Psi'$ that are orthogonal to the ground state. If we now search for the lowest-energy solution among the $\tilde\Upsilon'$, the constitutive relations cannot differentiate the correct state $\Psi'_1 = \Psi_1\chi$ from, e.g., the state $\tilde\Psi' = \Psi\chi_1$, where $\chi_1$ is the first excited state of the auxiliary modes. However, for ground states the above proof, and thus the one-to-one correspondence between physical and dressed system, holds. The reader is referred to the work of Nielsen et al. [258] for further details on excited states and the time-dependent case in general.

N-representability

We have discussed in the previous section that enforcing the physical conditions (4.43) directly on the polaritonic wave function is not practical. As an alternative, the physical conditions (4.43) are visible in the dressed 1RDM, which is given explicitly by

$$\gamma[\Psi'](\mathbf z\sigma, \mathbf z'\sigma') = N \sum_{\sigma_2, \dots, \sigma_N} \int \mathrm d^{N-1} z\, \Psi'^*(\mathbf z'\sigma', \dots, \mathbf z_N\sigma_N)\, \Psi'(\mathbf z\sigma, \dots, \mathbf z_N\sigma_N). \tag{4.45}$$

The Fermi statistics of the wave function $\Psi'$ with respect to the polaritonic coordinates $\mathbf z\sigma$ (4.44) is also apparent in $\gamma[\Psi']$ in the form of the $N$-representability conditions.
By using the natural orbitals $\phi_i$ and the natural occupation numbers $n_i$, which are defined by the eigenvalue equation $n_i\phi_i = \hat\gamma\phi_i$, we represent $\gamma$ in its diagonal form $\gamma[\Psi'](\mathbf z\sigma, \mathbf z'\sigma') = \sum_{i=1}^{\infty} n_i\, \phi_i^*(\mathbf z'\sigma')\, \phi_i(\mathbf z\sigma)$. The fermionic $N$-representability conditions become especially simple in this representation and are given by

$$0 \le n_i \le 1 \quad \forall i, \qquad \sum_i n_i = N. \tag{4.46}$$

From the dressed 1RDM, we can define the electronic 1RDM

$$\gamma_e[\Psi'](\mathbf r\sigma, \mathbf r'\sigma') = \int \mathrm d^M q\, \gamma[\Psi'](\mathbf r\mathbf q\sigma, \mathbf r'\mathbf q\sigma') \tag{4.47}$$

and the auxiliary photonic 1RDM

$$\gamma_p[\Psi'](\mathbf q, \mathbf q') = \sum_\sigma \int \mathrm d^3 r\, \gamma[\Psi'](\mathbf r\mathbf q\sigma, \mathbf r\mathbf q'\sigma). \tag{4.48}$$

Again, we can define the corresponding natural orbitals $\psi^{e/p}_i$ and natural occupation numbers $n^{e/p}_i$ by the eigenvalue equations $n^{e/p}_i \psi^{e/p}_i = \hat\gamma_{e/p} \psi^{e/p}_i$ and go into their diagonal representations

$$\gamma_e[\Psi'](\mathbf r\sigma, \mathbf r'\sigma') = \sum_{i=1}^{\infty} n^e_i\, \psi^{e*}_i(\mathbf r'\sigma')\, \psi^e_i(\mathbf r\sigma) \tag{4.49}$$

and

$$\gamma_p[\Psi'](\mathbf q, \mathbf q') = \sum_{i=1}^{\infty} n^p_i\, \psi^{p*}_i(\mathbf q')\, \psi^p_i(\mathbf q). \tag{4.50}$$

The Fermi statistics with regard to only the electronic coordinates $\mathbf r\sigma$ thus becomes apparent by considering the electronic natural occupation numbers $n^e_i$:

$$n^e_i \ge 0 \quad \forall i, \tag{4.51a}$$
$$n^e_i \le 1 \quad \forall i, \tag{4.51b}$$
$$\sum_i n^e_i = N, \tag{4.51c}$$

where we split the conditions into three parts for later convenience. The equivalent bosonic symmetry of the auxiliary photonic coordinates leads instead to the conditions

$$0 \le n^p_i \quad \forall i, \tag{4.52a}$$
$$\sum_i n^p_i = N. \tag{4.52b}$$

Note that the normalization of $\gamma_{e/p}$ to the electron number $N$ is a direct consequence of the auxiliary construction, which considers exactly $N$ polaritons for a system with $N$ electrons.
This becomes explicitly visible in the fact that the normalization of $\gamma$ by definition transfers to $\gamma_{e/p}$, since $N = \sum_\sigma \int \mathrm d\mathbf z\, \gamma(\mathbf z\sigma, \mathbf z\sigma) = \sum_\sigma \int \mathrm d^3 r\, \gamma_e(\mathbf r\sigma, \mathbf r\sigma) = \int \mathrm d^M q\, \gamma_p(\mathbf q, \mathbf q)$. Additionally, the lower bounds on the occupation numbers of $\gamma_{e/p}$, cf. Eqs. (4.51a) and (4.52a), transfer from $\gamma$, because the partial trace operation is a completely positive map [263]. We can conclude that if (4.46) is enforced, only the upper bound of the electronic 1RDM, cf. Eq. (4.51b), provides a nontrivial additional constraint.

This now shows explicitly, also for an interacting wave function, that at most one electron can occupy a specific quantum state, while many auxiliary photon quantities can occupy a single quantum state. Further, the dressed 1RDM $\gamma[\Psi']$ itself has only natural occupation numbers between zero and one and is therefore fermionic, yet it contains a fermionic and a bosonic subsystem. It is important to note that the original wave function $\Psi$ did not have this simple hybrid statistics but only fermionic symmetry, since the physical $p_\alpha$ did not follow any specific statistics. That we have genuinely formulated the coupled electron-photon problem in terms of hybrid quasi-particles becomes most evident when we actually use single-particle (polariton) orbitals $\phi_i(\mathbf z\sigma)$ to expand the dressed 1RDM of $\Psi'$.

To obtain a computationally tractable procedure, we therefore use the construction presented in Sec. 5.1 to (approximately) ensure the polariton statistics implied by Eq. (4.43), instead of resorting to the factorially growing number of "mixed-index" orbitals. We will consider all fermionic density matrices in the auxiliary configuration space, which we characterize by the conditions of Eq. (4.46) in terms of polaritonic orbitals $\phi_i(\mathbf z\sigma)$. We then constrain this space by enforcing the $N$-representability conditions of Eq. (4.51) for the 1RDM of the electronic subsystem.
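Why the upper bound (4.51b) is the only nontrivial extra constraint can be seen in a very small model. The sketch below is an illustration under stated assumptions, not the thesis implementation: polariton orbitals are real coefficient tables c[a][m] in a hypothetical product basis of two electronic functions psi_a and two photonic functions chi_m, and all names are made up for this example. Both orbital pairs are orthonormal and singly occupied, so the dressed occupations satisfy (4.46) in either case, yet only the first pair yields electronic occupations that respect (4.51b).

```python
import math


def partial_traces(orbitals, occupations):
    """Return (gamma_e, gamma_p) for gamma = sum_i n_i |phi_i><phi_i|.

    Each orbital is a real table c[a][m] of coefficients in the product
    basis psi_a(r) * chi_m(q). gamma_e sums over the photonic index m,
    gamma_p sums over the electronic index a.
    """
    n_el = len(orbitals[0])
    n_ph = len(orbitals[0][0])
    gamma_e = [[0.0] * n_el for _ in range(n_el)]
    gamma_p = [[0.0] * n_ph for _ in range(n_ph)]
    for c, occ in zip(orbitals, occupations):
        for a in range(n_el):
            for b in range(n_el):
                gamma_e[a][b] += occ * sum(c[a][m] * c[b][m] for m in range(n_ph))
        for m in range(n_ph):
            for n in range(n_ph):
                gamma_p[m][n] += occ * sum(c[a][m] * c[a][n] for a in range(n_el))
    return gamma_e, gamma_p


def eigenvalues_2x2(mat):
    """Eigenvalues (ascending) of a real symmetric 2 x 2 matrix."""
    mean = 0.5 * (mat[0][0] + mat[1][1])
    radius = math.hypot(0.5 * (mat[0][0] - mat[1][1]), mat[0][1])
    return [mean - radius, mean + radius]


# phi_1 = psi_1 chi_0, phi_2 = psi_2 chi_0: antisymmetric in r, symmetric in q.
physical = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]]]
# phi_1 = psi_1 chi_0, phi_2 = psi_1 chi_1: symmetric in r, antisymmetric in q.
unphysical = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 1.0], [0.0, 0.0]]]

for label, orbitals in (("physical", physical), ("unphysical", unphysical)):
    gamma_e, gamma_p = partial_traces(orbitals, [1.0, 1.0])
    print(label, "n^e:", eigenvalues_2x2(gamma_e), "n^p:", eigenvalues_2x2(gamma_p))
```

In both cases the traces of the two reduced matrices equal the particle number and all occupations are nonnegative, in line with Eqs. (4.51a), (4.51c) and (4.52). The second pair places two electrons in the same electronic orbital, which is exactly the kind of state the constraint (4.51b) excludes.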
Since this guarantees that only (ensembles of) fermionic wave functions are allowed, also the minimal-energy solution (corresponds to an ensemble that) has fermionic symmetry with respect to $\mathbf r\sigma$. We remind the reader that this is the only symmetry of the physical system. Thus, it is especially important to enforce this symmetry also within an accurate approximation scheme in the dressed system. Together with the $\mathbf z\sigma$ antisymmetry, this additionally guarantees the correct zero-coupling limit. We call this construction the polariton ansatz for strong light-matter interaction. In the next section, we will provide, based on the polariton ansatz, a detailed prescription to generalize a given electronic-structure theory to treat ground states of coupled electron-photon systems from first principles.

To conclude, we want to briefly discuss the role of observables in the auxiliary system. Although we constructed the auxiliary space explicitly in such a way that the physical wave function $\Psi$ can be reconstructed exactly from its dressed counterpart $\Psi'$ by integration over all auxiliary coordinates, this does not hold for all types of operators. For operators that depend only on electronic coordinates, there is no difference and we have $\langle \Psi | \hat O \Psi \rangle = \langle \Psi' | \hat O \Psi' \rangle$. This is not surprising, because the coordinate transformation (4.36) acts only on the photonic part of the system. For photonic observables instead, the transformation changes the respective operators, and thus the connection between physical and auxiliary space becomes nontrivial in general. However, at least for all observables that depend on photonic 1/2- or 1-body expressions, there is an analytical connection. For half-body operators, i.e., any operator that depends only on the displacement

$$p_\alpha = \frac{1}{\sqrt N} \left( q_{\alpha,1} + \dots + q_{\alpha,N} \right) \tag{4.53}$$

and its conjugate

$$\frac{\partial}{\partial p_\alpha} = \frac{1}{\sqrt N} \left( \frac{\partial}{\partial q_{\alpha,1}} + \dots + \frac{\partial}{\partial q_{\alpha,N}} \right), \tag{4.54}$$

the coordinate transformation itself provides us with the connection. For 1-body operators, this becomes slightly more involved. For example, consider the mode energy operator $\hat H_{ph} = \sum_{\alpha=1}^{M} \left[ -\tfrac12 \frac{\partial^2}{\partial p_\alpha^2} + \frac{\omega_\alpha^2}{2} p_\alpha^2 \right] \equiv \sum_{\alpha=1}^{M} \hat h_\alpha$, which we can straightforwardly generalize in the auxiliary space to $\hat H'_{ph} = \sum_{\alpha=1}^{M} \sum_{i=1}^{N} \left[ -\tfrac12 \frac{\partial^2}{\partial q_{\alpha,i}^2} + \frac{\omega_\alpha^2}{2} q_{\alpha,i}^2 \right] \equiv \sum_{\alpha=1}^{M} \hat h'_\alpha$. The connection between $\hat H_{ph}$ and $\hat H'_{ph}$ is given by the definition of the coordinate transformation (4.36),

$$\hat H_{ph} = \hat H'_{ph} - \sum_{\alpha=1}^{M} \hat\Pi_\alpha. \tag{4.55}$$

Since the expectation value of $\hat\Pi_\alpha$ is known analytically, $\langle \Psi' | \hat\Pi_\alpha(p_{\alpha,2}, \dots, p_{\alpha,N}) \Psi' \rangle = \langle \chi | \hat\Pi_\alpha \chi \rangle = \frac{N-1}{2} \omega_\alpha$, we have

$$E_{ph} = \langle \Psi | \hat H_{ph} \Psi \rangle = \langle \Psi' | \hat H'_{ph} \Psi' \rangle - \frac{N-1}{2} \sum_{\alpha=1}^{M} \omega_\alpha$$

and, analogously, for the total energy

$$E = \langle \Psi | \hat H \Psi \rangle = \langle \Psi' | \hat H' \Psi' \rangle - \frac{N-1}{2} \sum_{\alpha=1}^{M} \omega_\alpha.$$

Similar connections hold for the expressions $p_\alpha p_\beta$, $p_\alpha \frac{\partial}{\partial p_\beta}$, and $\frac{\partial}{\partial p_\alpha} \frac{\partial}{\partial p_\beta}$, where $\alpha$ and $\beta$ denote any two modes. The reason is that the transformation (4.36) preserves the standard inner product of the Euclidean space of the mode plus extra coordinates. This also transfers to their conjugates and combinations of both. From the above, it is straightforward to derive also the expression for the occupation of mode $\alpha$, $\hat N^\alpha_{ph} = \frac{1}{\omega_\alpha} \hat h_\alpha - \frac12$. In the auxiliary system, we have

$$\hat N^\alpha_{ph} = \frac{1}{\omega_\alpha} \hat h'_\alpha - \frac{N}{2}.$$

Chapter 5
POLARITONS FROM FIRST PRINCIPLES
In the last chapter, we defined the theoretical framework to describe coupled matter-photon systems with polaritons as the fundamental entity. Now we need to discuss how to use the framework in practice. The key to applying the dressed construction lies in the fact that the polaritons are "almost electrons" (with an additional symmetry) and that the Hamiltonian in terms of these polaritons is also almost the Hamiltonian of standard electronic-structure theory. In this chapter, we lay out how we can exploit this by providing a general prescription of how to turn any given electronic-structure theory into a "polaritonic-structure theory" (Sec. 5.1). This means that we keep the approximations to the Coulomb interaction within the electronic theory exactly as they are, but use them to approximate the interaction between the polaritons that we derived in the previous section. Sec. 5.1 is based on chapter 3 of [2]. Then, we exemplify this prescription for two cases. In Sec. 5.2, we introduce polaritonic HF, followed by polaritonic RDMFT in Sec. 5.3.
In this section, we lay out in detail how one can transform a given electronic-structure theory that meets some minimal requirements into its polaritonic version. The goal of such a "polaritonic-structure theory" is to find the ground state of the cavity-QED Hamiltonian of Eq. (1.37) by considering the ground state of the auxiliary Hamiltonian of Eq. (4.37). Let us start by defining the corresponding variational principle for the ground-state energy $E'$ as

$$E' = \inf_{\Psi' \in \mathcal P} \langle \Psi' | \hat H' \Psi' \rangle, \tag{5.1}$$

where $\mathcal P = \{ \Psi' : \Psi' \leftrightarrow (4.43) \}$ is the set of all normalized many-polariton wave functions that obey the constitutive relations of Eq. (4.43). For our purposes, as explained in Sec. 4.3, we will instead consider the larger set $\mathcal M$ of all (mixed-state) density matrices $\Gamma = \sum_j w_j | \Psi'_j \rangle \langle \Psi'_j |$ with $\sum_j w_j = 1$ that obey the hybrid Fermi-Bose statistics. The minimal energy also in this more general set corresponds to the pure state of Eq. (5.1), i.e.,

$$E' = \inf_{\Gamma \in \mathcal M} \mathrm{Tr}\{ \Gamma \hat H' \}. \tag{5.2}$$

The main trick now lies in how we approximate this set. We do so by first considering the yet larger set $\tilde{\mathcal M} = \{ \tilde\Gamma : \tilde\Psi'_j = \sum_K C_{j,K} \Phi_K \}$, i.e., density matrices made of superpositions of Slater determinants $\Phi_K = \det(\phi_{K,1} \cdots \phi_{K,N}) / \sqrt{N!}$ of polariton orbitals $\phi_{K,i}$. This guarantees the overall Fermi statistics in terms of the polaritonic coordinates $\mathbf z\sigma$. We then constrain this larger set to

$$\mathcal M' = \{ \tilde\Gamma \in \tilde{\mathcal M} : n^e_i[\tilde\Gamma] \le 1 \}, \tag{5.3}$$

where $n^e_i[\tilde\Gamma]$ are the natural occupation numbers, cf. Eq. (4.49), of the electronic 1RDM $\gamma_e[\tilde\Gamma] = \sum_j w_j \gamma_e[\tilde\Psi'_j]$ that depend on $\tilde\Gamma$. This enforces the fermionic statistics with respect to the electronic coordinates $\mathbf r\sigma$. The rest of the $N$-representability conditions (Eqs. (4.51a) and (4.51c)) are satisfied automatically by choosing $\Psi' \in \tilde{\mathcal P}$ as fermionic with respect to the polariton coordinates, so that the corresponding dressed 1RDM satisfies the $N$-representability conditions (Eq. (4.46)). This guarantees that the electronic and photonic 1RDMs of the system are $N$-representable, and thus, e.g., the electronic Pauli principle is enforced. However, higher-order RDMs are not treated exactly, which is an interesting topic for future research. We thus avoid the direct construction of the exponentially growing correlated electron-photon states.

[Figure 5.1, scheme:]
$\hat H_m = \hat T[t] + \hat V[v] + \hat W[w]$, basis $\{ \phi_k(\mathbf r\sigma) \}$ $\rightarrow$ $L_m[\Psi] = E^{t,v,w}_m[\Psi] + C[\Psi]$
$\downarrow$
$\hat H = \hat H_m + \hat H_{ph} + \hat H_I + \hat H_d$, basis $\{ \phi_k(\mathbf r\sigma), \chi_\alpha(p) \}$ $\rightarrow$ "new theory necessary"
$\downarrow$
$\hat H' = \hat T[t'] + \hat V[v'] + \hat W[w']$, basis $\{ \phi'_k(\mathbf r\sigma, \mathbf q) \}$ $\rightarrow$ $L'[\Psi'] = E^{t',v',w'}_m[\Psi'] + C[\Psi'] + G[\Psi']$

Figure 5.1: Graphical illustration of the polariton construction and its connection to an electronic-structure theory (EST). Here $E_m$ indicates the energy expression of the EST, such as the HF, configuration interaction, or coupled cluster energy functional, and $C$ indicates the constraints of the EST, such as orthonormality of the orbitals. They are enforced on a (possibly multideterminantal) wave function $\Psi$ constructed from an electronic single-particle basis $\phi_k$. Further, $G$ indicates the new constraints that arise due to the hybrid statistics of the polaritons, which are now enforced on a (possibly multideterminantal) wave function $\Psi'$ of polaritonic single-particle orbitals $\phi'_k$. The usual coupled electron-photon problem (second line) has a Hamiltonian with a different structure and is built on separate orbitals $\phi_k$ and $\chi_\alpha$. Thus, a new (efficient and accurate) approximate energy expression would be needed.

As we have pointed out already, the polariton picture gives any coupled problem that is described by the Hamiltonian (1.37) the same structure as a purely electronic problem with two-body interactions, i.e., the electronic-structure Hamiltonian (Eq. (2.1)). Consequently, we can transfer every type of electronic-structure theory to the coupled electron-photon problem, provided the theory yields an expression for the 1RDM (since we need the 1RDM to test the $N$-representability constraints). The main steps of how to do so are depicted in Fig. 5.1. We assume that the theory provides us with an energy expression $E_m$ with respect to a set of electronic basis states $\{ \phi_k \}$. This requirement is met by basically every electronic-structure theory, for instance the single-reference methods KS-DFT or HF, but also more involved (and numerically expensive) approaches like coupled cluster, valence bond theory or configuration interaction.
Depending on the specific theory, $E_m$ might have quite different forms, but it is always derived from some many-body Hamiltonian $\hat H_m = \hat T[t] + \hat V[v] + \hat W[w]$. More specifically, the connection between $\hat H_m$ and $E_m$ is given by the particle number $N$ and the integral kernels $(t, v, w)$ of the three energy operators. For the matter Hamiltonian $\hat H_m$ of Eq. (1.37), for example, these kernels are given by $t = -\tfrac12 \nabla^2_{\mathbf r}$, $v = v(\mathbf r)$ and $w = w(\mathbf r, \mathbf r')$. The goal of any electronic-structure theory is then to find the minimum of $E^{t,v,w}_m[\Psi]$, where $\Psi$ is a (possibly multi-determinantal) wave function constructed from the orbital set $\{ \phi_k \}$. (Footnote: A standard reference for these and other quantum chemical methods is Helgaker et al. [132].) Typically, one needs to impose some constraints on the parametrization of the wave function $\Psi$ to make it physical,

$$c_k[\Psi] = 0, \tag{5.4}$$

so that the generic electronic-structure minimization problem is formulated as

$$\text{minimize } E^{t,v,w}_m[\Psi] \quad \text{subject to } c_k[\Psi] = 0. \tag{5.5}$$

We can solve Eq. (5.5) by, e.g., minimizing the Lagrangian

$$L^{t,v,w}_m[\Psi, \{ \epsilon_k \}] = E^{t,v,w}_m[\Psi] + C[\Psi, \{ \epsilon_k \}], \tag{5.6}$$

where $C[\Psi, \{ \epsilon_k \}] = \sum_k \epsilon_k c_k[\Psi]$ is a Lagrange-multiplier term. Instead of minimizing $E_m$ directly, one minimizes $L_m$ with respect to the orbitals and the Lagrange multipliers $\epsilon_k$. Today, a plethora of standard electronic-structure codes exists that solve (5.5) very efficiently for many different levels of theory and thus allow for a highly accurate description of the electronic structure.

If we consider the coupled electron-photon Hamiltonian of Eq. (1.37) instead of the purely electronic Hamiltonian $\hat H_m$, we find that we need to build new approximation strategies and implementations to deal with the coupled electron-photon Hamiltonian directly. However, by transforming the problem into its dressed counterpart, i.e., by considering (4.37), we can utilize the full existing machinery for the electronic case. In particular, this means that we now have polaritonic orbitals $\phi'_i(\mathbf z\sigma)$ as fundamental entities, which have as coordinate $\mathbf z \equiv \mathbf r\mathbf q$, where $\mathbf q$ is an $M$-dimensional (number of photon modes) vector. Additionally, the one- and two-body terms are replaced by their polaritonic counterparts, i.e., $(t, v, w) \to (t', v', w')$ as given in Eqs. (4.38), (4.39) and (4.40). We can then straightforwardly transform the energy expression of a given electronic-structure theory into a polariton energy expression, $E^{t,v,w}_m[\Psi] \to E^{t',v',w'}_m[\Psi']$, because the connection between $E_m$ and $\hat H_m$ is defined by the one- and two-body terms and the particle number alone. The constraints also transfer directly to the polariton system, leading to the Lagrangian term $C[\Psi, \{ \epsilon_k \}] \to C[\Psi', \{ \epsilon_k \}]$. Lastly, since polaritons are particles with a more complicated hybrid statistics than electrons (see Ch. 4), we need to add to the Lagrangian a further constraint term $G[\gamma_e]$ to enforce the constraints

$$g_i[\gamma_e] = 1 - n^e_i \ge 0 \quad \forall i = 1, \dots, N. \tag{5.7}$$

(Footnote: The precise form of this term depends on the method that is used. See Sec. 8.2.2.) With this definition, the energy expression $E^{t',v',w'}_m[\Psi']$ and the constraints, we are now able to generalize the minimization problem of Eq. (5.5) to the generic polaritonic minimization problem

$$\text{minimize } E^{t',v',w'}_m[\Psi'] \quad \text{subject to } c_k[\Psi'] = 0, \quad g_i\left[ \gamma_e[\Psi'] \right] \ge 0. \tag{5.8}$$

In certain settings, the constraints $g_i$ are trivially fulfilled (see Sec. 4.2.3). In this case, a minimization that does not explicitly guarantee $g_i \ge 0$ yields the same result. We refer to such calculations as the fermion ansatz to differentiate them from the generic polariton ansatz. We will numerically investigate the settings in which the fermion ansatz and the polariton ansatz lead to the same results in Sec. 6.1.3.
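The difference between minimizing with and without the constraints $g_i$ can be illustrated with a deliberately simple toy problem. This is a sketch under explicit assumptions and not the algorithm of the thesis: the coupling is set to zero, two polaritons occupy the orbitals phi_1 = psi_1 chi_0 and phi_2(theta) = cos(theta) psi_2 chi_0 + sin(theta) psi_1 chi_1, and the orbital energies EPS_1, EPS_2 and the mode frequency OMEGA are hypothetical numbers chosen such that OMEGA is smaller than the electronic gap. A plain grid search stands in for a real optimizer.

```python
import math

# Hypothetical model parameters (atomic units); OMEGA < EPS_2 - EPS_1.
EPS_1, EPS_2, OMEGA = -1.0, 0.0, 0.4


def energy(theta):
    """One-body energy of the two-orbital determinant at zero coupling.

    phi_1 = psi_1 chi_0 contributes EPS_1 + OMEGA/2; phi_2(theta) mixes
    psi_2 chi_0 (energy EPS_2 + OMEGA/2) with psi_1 chi_1 (EPS_1 + 3 OMEGA/2).
    """
    first = EPS_1 + 0.5 * OMEGA
    second = (math.cos(theta) ** 2 * (EPS_2 + 0.5 * OMEGA)
              + math.sin(theta) ** 2 * (EPS_1 + 1.5 * OMEGA))
    return first + second


def constraints(theta):
    """g_i = 1 - n_i^e for the two electronic natural occupation numbers.

    Cross terms vanish because chi_0 and chi_1 are orthogonal, so gamma_e is
    diagonal with occupations 1 + sin^2(theta) and cos^2(theta).
    """
    occupations = [1.0 + math.sin(theta) ** 2, math.cos(theta) ** 2]
    return [1.0 - n for n in occupations]


def minimize(enforce_g, steps=200, tol=1e-12):
    """Grid search over theta in [0, pi/2]; optionally keep only g_i >= 0."""
    grid = [0.5 * math.pi * k / steps for k in range(steps + 1)]
    if enforce_g:
        grid = [t for t in grid if min(constraints(t)) >= -tol]
    return min(grid, key=energy)


theta_fermion = minimize(enforce_g=False)    # only z-antisymmetry is imposed
theta_polariton = minimize(enforce_g=True)   # additionally g_i >= 0
print("fermion ansatz  :", theta_fermion, energy(theta_fermion), constraints(theta_fermion))
print("polariton ansatz:", theta_polariton, energy(theta_polariton), constraints(theta_polariton))
```

In this toy setting the unconstrained search lowers the energy by placing both polaritons in the same electronic orbital, which violates $g_i \ge 0$, whereas the constrained search stays at the physical configuration. If OMEGA were chosen larger than the electronic gap, both searches would return the same orbitals, which is the situation in which the constraints are trivially fulfilled.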
In this section, we apply the general rules of the previous section to HF theory, which leads to polaritonic HF theory. This means that we approximate the density matrix of the exact dressed wave function of Eq. (4.42) by the density matrix of a single Slater determinant with orbitals $\phi'_1, \dots, \phi'_N$, i.e.,

$$\Phi'(\mathbf z_1\sigma_1, \dots, \mathbf z_N\sigma_N) = \frac{1}{\sqrt{N!}} \left| \phi'_1, \dots, \phi'_N \right|_-.$$

Further, we consider a spin-restricted formalism, i.e., we assume that the number of electrons $N$ is even and define $\phi'_{2k-1}(\mathbf z\sigma) = \phi'_k(\mathbf z)\alpha(\sigma)$, $\phi'_{2k}(\mathbf z\sigma) = \phi'_k(\mathbf z)\beta(\sigma)$ for $k = 1, \dots, N/2$, where $\alpha, \beta$ are the usual spin functions (cf. Eq. (2.22)). We again note that our constraints do not necessarily enforce the right symmetry on the auxiliary Slater determinant itself, but rather on its 1RDM. In this regard, polaritonic HF actually becomes a 1RDM functional theory for polaritonic problems rather than a wave-function-based method [1]. With this ansatz, we calculate the energy expectation value for the Hamiltonian of Eq. (4.37), which reads

$$E'_{HF} = 2 \sum_i \langle \phi'_i | (\hat T[t'] + \hat V[v']) \phi'_i \rangle + \sum_{i,k} \left[ 2 \langle \phi'_k | \hat J'_i[w'] \phi'_k \rangle - \langle \phi'_k | \hat K'_i[w'] \phi'_k \rangle \right], \tag{5.9}$$

where we introduced the "dressed" Coulomb operator $\hat J'_i$, which acts as

$$\hat J'_i \phi'_k(\mathbf z) = \int \mathrm d\mathbf z'\, \phi'^*_i(\mathbf z')\, w'(\mathbf z; \mathbf z')\, \phi'_i(\mathbf z')\, \phi'_k(\mathbf z), \tag{5.10a}$$

and the "dressed" exchange operator $\hat K'_i$, which acts as

$$\hat K'_i \phi'_k(\mathbf z) = \int \mathrm d\mathbf z'\, \phi'_i(\mathbf z)\, w'(\mathbf z; \mathbf z')\, \phi'^*_i(\mathbf z')\, \phi'_k(\mathbf z'). \tag{5.10b}$$

The polaritonic one- and two-body terms are given by (4.38), (4.39) and (4.40), respectively.
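As a consistency check of the dressed kernels that enter these operators, the multiplicative part of the dressed Hamiltonian (4.37) can be compared numerically with the same terms written in the original variables. The following is a minimal sketch under simplifying assumptions that are not part of the source: electrons move in one spatial dimension, so that $\boldsymbol\lambda_\alpha \cdot \mathbf r$ reduces to a product of scalars, and the soft-Coulomb forms of v and w as well as all numerical values are hypothetical.

```python
import math


def v_dressed(z, v, omegas, lambdas, n_particles):
    """Dressed one-body potential v'(z) of Eq. (4.39) for z = (r, q)."""
    r, q = z
    total = v(r)
    for omega, lam, q_a in zip(omegas, lambdas, q):
        total += (0.5 * omega ** 2 * q_a ** 2
                  - omega / math.sqrt(n_particles) * q_a * lam * r
                  + 0.5 * (lam * r) ** 2)
    return total


def w_dressed(z1, z2, w, omegas, lambdas, n_particles):
    """Dressed two-body kernel w'(z, z') of Eq. (4.40)."""
    (r1, q1), (r2, q2) = z1, z2
    total = w(r1, r2)
    for omega, lam, qa1, qa2 in zip(omegas, lambdas, q1, q2):
        scale = omega / math.sqrt(n_particles)
        total += -scale * qa1 * lam * r2 - scale * qa2 * lam * r1 + lam * r1 * lam * r2
    return total


def potential_dressed(zs, v, w, omegas, lambdas):
    """sum_k v'(z_k) + 1/2 sum_{k != l} w'(z_k, z_l), cf. Eq. (4.37)."""
    n = len(zs)
    one_body = sum(v_dressed(z, v, omegas, lambdas, n) for z in zs)
    two_body = 0.5 * sum(w_dressed(zs[k], zs[l], w, omegas, lambdas, n)
                         for k in range(n) for l in range(n) if k != l)
    return one_body + two_body


def potential_original(zs, v, w, omegas, lambdas):
    """Same terms with p_alpha = sum_i q_{alpha,i} / sqrt(N) and the total dipole."""
    n = len(zs)
    rs = [z[0] for z in zs]
    dipole = sum(rs)
    total = sum(v(r) for r in rs)
    total += 0.5 * sum(w(rs[k], rs[l]) for k in range(n) for l in range(n) if k != l)
    for a, (omega, lam) in enumerate(zip(omegas, lambdas)):
        qs = [z[1][a] for z in zs]
        p_alpha = sum(qs) / math.sqrt(n)
        total += 0.5 * omega ** 2 * sum(x * x for x in qs)  # physical plus auxiliary oscillators
        total += -omega * p_alpha * lam * dipole + 0.5 * (lam * dipole) ** 2
    return total


def soft_nuclear(r):
    """Hypothetical one-dimensional soft-Coulomb attraction."""
    return -2.0 / math.sqrt(r * r + 1.0)


def soft_coulomb(r1, r2):
    """Hypothetical one-dimensional soft-Coulomb repulsion."""
    return 1.0 / math.sqrt((r1 - r2) ** 2 + 1.0)


# Three polaritons, two modes; each z is (r, (q_1, q_2)). All values are made up.
sample = [(0.4, (0.1, -0.3)), (-1.1, (0.7, 0.2)), (2.3, (-0.5, 0.9))]
print(potential_dressed(sample, soft_nuclear, soft_coulomb, [0.5, 1.3], [0.05, 0.1]))
print(potential_original(sample, soft_nuclear, soft_coulomb, [0.5, 1.3], [0.05, 0.1]))
```

Both functions evaluate the same quantity, so the two printed numbers should agree up to rounding. Passing `v_dressed` and `w_dressed` in place of the bare kernels is, in this simplified setting, all that changes on the side of the potential terms when an electronic energy expression is reused for polaritons.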
With this we find that $E'_{\mathrm{HF}} = E^{t',v',w'}_{\mathrm{HF}}(\{\phi'_k\})$ (see Ch. 2.2). Consequently, we also find structurally the same derivative, which reads $\nabla_{\phi_k^{*}} E'_{\mathrm{HF}} = \hat{H}'\phi'_k =$
2( ˆ T [ t (cid:48) ] + ˆ V [ v (cid:48) ]) φ (cid:48) k + (cid:88) i (cid:163) J (cid:48) i [ w (cid:48) ] φ (cid:48) k − ˆ K (cid:48) i [ w (cid:48) ] φ (cid:48) k (cid:164) , (5.11)where ˆ H (cid:48) is the generalization of the Fock operator (Eq. (2.45)). Since we consider only one Slater deter-minant, the orbitals φ (cid:48) i are also the eigenfunctions of the system’s dressed 1RDM γ . Because of the spin-restriction, it suffices also to consider the spin-summed version γ ( z , z (cid:48) ) = (cid:80) N /2 i = φ (cid:48) i ∗ ( z (cid:48) ) φ (cid:48) i ( z ), which we de-note with the same symbol. We see that γ [ Φ (cid:48) ] has occupations (eigenvalues) of 2 instead of 1 because of thespin-summation. This transfers to the natural occupation numbers of the electronic 1RDM γ e ( r , r (cid:48) ) = (cid:90) d q γ ( rq , r (cid:48) q ).Now, we have defined all the terms that enter the minimization problem (5.8).142.3. POLARITONIC RDMFT As mentioned in the last section, we consider polaritonic HF as a test-case. HF theory is conceptually (andnumerically almost) the simplest electronic-structure theory and thus it is well-suited to explore the fea-tures and possible issues of polaritonic-structure theory. Judging from the first calculations (see Ch. 6), po-laritonic HF captures despite its simplicity many phenomena of coupled light matter systems surprisinglywell. However, for large coupling strengths, polaritonic HF becomes inaccurate. To improve upon the HFdescription, a logical next step is DFT based on an auxiliary system of polaritons. For that, we merely haveto remove the exchange-part of the HF-implementation and employ a functional of our choice. This hasbeen already tested in a simple setting by Nielsen et al. [258]. 
We only want to mention that the (simple) functional employed there, formulated in terms of the dressed KS system, was already more accurate than KS-QEDFT with the photon OEP.
In this section, we instead go conceptually one step further and derive polaritonic RDMFT, i.e., we construct a theory based on the polaritonic 1RDM. This case is especially interesting with regard to the difficulties that we encountered in Sec. 3.3.1 when we tried to express the energy expectation value of the Hamiltonian in terms of RDMs. By employing polaritonic orbitals, we can set aside the complicated 3/2-body RDM that accounts for the electron-photon interaction in the standard picture, including its representability conditions. The polaritonic and electronic 1RDMs are quite similar and, most importantly, many concepts that were important for the construction of RDMFT functionals transfer to the polaritonic case.
The RDM perspective on polaritonic-structure theory
Let us therefore analyze again the structure of ˆ H (cid:48) , given in Eq. (4.37). It consists of only polaritonic one-body terms ˆ h (1) ( z ) = − ∆ + v (cid:48) ( z ), and two-body terms ˆ h (2) ( z , z (cid:48) ) = w (cid:48) ( z , z (cid:48) ). It commutes with the polaritonicparticle-number operator ˆ N (cid:48) = (cid:82) d + M z ˆ n ( z ), where we used the definition of the polaritonic local densityoperator ˆ n ( z ) = (cid:80) Ni = δ + M ( z − z i ). This means that the auxiliary system has a constant polaritonic particlenumber N . Additionally, the physical wave function of the dressed system Ψ (cid:48) ( z σ ,..., z N σ N ) is per con-struction antisymmetric (but it has in addition the q -symmetry). These properties allow for the definitionof the polaritonic (spin-summed) 1RDM in exactly the same way as for an electronic state. We repeat herethe definition of Sec. 4.3, i.e., γ ( z , z (cid:48) ) = N (cid:88) σ ,..., σ N (cid:90) d (3 + M )( N − z (cf. 4.45) Ψ (cid:48)∗ ( z (cid:48) σ , z σ ,..., z N σ N ) Ψ (cid:48) ( z σ , z σ ,..., z N σ N ).Furthermore, we introduce the (spin-summed) dressed 2RDM Γ (2) ( z , z ; z (cid:48) , z (cid:48) ) = N ( N − (cid:88) σ ,..., σ N (cid:90) d (3 + M )( N − z Ψ (cid:48)∗ ( z (cid:48) σ , z (cid:48) σ , z σ ,.., z N σ N ) Ψ (cid:48) ( z σ , z σ , z σ ,.., z N σ N ). Most of the contents of this section are part of Ref. 
[1] E (cid:48) =〈 Ψ (cid:48) | ˆ H (cid:48) | Ψ (cid:48) 〉 = 〈 Ψ (cid:48) | N (cid:88) k = ˆ h (1) ( z k ) + (cid:88) k (cid:54)= l ˆ h (2) ( z k , z l ) | Ψ (cid:48) 〉= (cid:90) d + M z ˆ h (1) ( z ) γ ( z , z (cid:48) ) | z (cid:48) = z + (cid:90) d + M z d + M z (cid:48) ˆ h (2) ( z , z (cid:48) ) Γ (2) ( z , z (cid:48) , z , z (cid:48) ).Thus, we can define the variational principle for the ground state only with respect to well-defined reducedquantities, E (cid:48) = inf { γ , Γ (2) } → Ψ (cid:48) E [ γ , Γ (2) ].Notice the difference to the according variational principle in the standard picture E = inf { γ e , Γ (2) e , γ b , Γ (3/2) e , b } → Ψ (cid:110) ( T + V )[ γ e ] + ( W + H d )[ Γ (2) e ] + H ph [ γ b ] + H I [ Γ (3/2) e , b ] (cid:111) , (cf. 3.52)that we have defined formally in Sec. 3.3.1. We discussed there that we don’t know how to enforce the con-nection between the RDMs and the wave function, { γ e , Γ (2) e , γ b , Γ (3/2) e , b } → Ψ , and thus cannot use it. In con-trast to that, we know what we have to do in principle to perform the minimization (cf. 3.52): We need toconstrain the configuration space to the physical dressed RDMs that connect to an antisymmetric wavefunction with the extra q -exchange symmetry by testing the appropriate N -representability conditions ofthe dressed 2RDM and the dressed 1RDM. Besides the by now well-known conditions for the fermionic2RDM [214] and the fermionic 1RDM [136] we would in principle get further conditions to ensure the ex-tra exchange symmetry. However, already for the usual electronic 2RDM the number of conditions growsexponentially with the number of particles, and it is out of the scope of this work to discuss possible ap-proximations. The interested reader is referred to, e.g., Ref. [214]. Instead, we want to stick to the dressed1RDM γ and approximate the 2-body part as a functional of the γ . 
We will hereby treat $\gamma$ as approximately fermionic and thus consider only the $N$-representability conditions (4.46). Additionally, we guarantee the fermionic character of $\gamma_e$ as described in Sec. 5.1.
The polaritonic 1RDM as a basic variable
The mathematical justification of RDMFT is given by Gilbert's theorem [199], which is a generalization of the Hohenberg-Kohn theorem of DFT [165]. More specifically, Gilbert proves that the ground-state energy of any Hamiltonian with only 1-body and 2-body terms is a unique functional of its 1RDM (see Sec. 2.4.2). Following this idea, we will express the ground-state energy of the dressed system as a partly unknown energy functional $E'$ of only the system's dressed 1RDM
$$E' = \inf_{\gamma} E'[\gamma], \tag{5.12}$$
where
$$E'[\gamma] = \int d^{3+M}z\, \hat{h}^{(1)}(z)\, \gamma(z,z')\big|_{z'=z} + \underbrace{\iint d^{3+M}z\, d^{3+M}z'\, \hat{h}^{(2)}(z,z')\, \Gamma^{(2)}([\gamma]; z,z',z,z')}_{=W'[\gamma]} \tag{5.13}$$
is the dressed RDMFT energy functional. For this minimization, we need a functional of the diagonal of the dressed 2RDM in terms of the dressed 1RDM, and we have to adhere to the corresponding $N$-representability conditions when varying over $\gamma$. Analogously to the discussion about the extra symmetry, we now need to consider polariton ensembles to have simple conditions. However, since the ground state is a pure state and pure states are special cases of ensembles, the definition (5.12) is still exact with the ensemble conditions.
Another advantage of RDMFT in general is the direct access to all one-body observables. This also transfers to polaritonic RDMFT. The calculation of expectation values of purely electronic one-body observables is trivial with the knowledge of the dressed 1RDM, but photonic one-body (and half-body) observables can also be calculated, using the connection formula shown at the end of Sec. 4.3. 
Thus, we are able to calculatevery interesting properties of the cavity photons like the mode occupation or quantum fluctuations of theelectric and magnetic field.To see whether our approach is practical and accurate, we employ simple approximations to the un-known part W (cid:48) [ γ ] that have been developed for the electronic case. To do so, we further, similarly to theelectronic case, decompose W (cid:48) [ γ ] = E H [ γ ] + E xc [ γ ]into a classical Hartree part E H [ γ ] = (cid:90) (cid:90) d + M z d + M z (cid:48) γ ( z , z ) γ ( z (cid:48) , z (cid:48) ) w (cid:48) ( z , z (cid:48) )and an unknown exchange-correlation part E xc [ γ ]. Almost all known functionals E xc [ γ ] are expressed interms of the eigenbasis and eigenvalues of the 1RDM. In our case the dressed natural orbitals φ i ( z ) and occu-pation numbers n i are found by solving (cid:82) d + M z (cid:48) γ ( z , z (cid:48) ) φ i ( z (cid:48) ) = n i φ i ( z ). One interesting feature of RDMFTis (in contrast to KS-DFT), that it includes HF theory as a special case with the functional E xc [ γ ] = E HF [ γ ] = − (cid:88) i , j n i n j (cid:90) (cid:90) d + M z d + M z (cid:48) φ ∗ i ( z ) φ ∗ j ( z (cid:48) ) w (cid:48) ( z , z (cid:48) ) φ i ( z (cid:48) ) φ j ( z ). (5.14)As the HF functional depends linearly on the natural occupation numbers, any kind of minimization willlead to the single-Slater-determinant HF ground state (which corresponds to occupations of 1 and 0) [224].We have recovered polaritonic HF. We can go beyond the single Slater determinant in polaritonic RDMFT,if we employ a nonlinear occupation-number dependence in the exchange-correlation functional. We haveonly considered the oldest and best tested functional that was introduced by Müller in 1984 [227] that readsin the dressed setting (cf. Eq. 
(2.102))
$$E_{xc}[\gamma] = E_M[\gamma] = -\sum_{i,j}\sqrt{n_i n_j}\iint d^{3+M}z\, d^{3+M}z'\, \phi_i^*(z)\,\phi_j^*(z')\, w'(z,z')\, \phi_i(z')\,\phi_j(z), \tag{5.15}$$
and later re-derived by Buijse and Baerends [228] from a different perspective. The Müller functional has been studied for many physical systems [229, 228] and gives a qualitatively reasonable description of electronic ground states (see also Sec. 2.5). Additionally, it has many advantageous mathematical properties [227, 161] (see also Sec. 2.4.2). A thorough discussion of different functionals goes beyond the scope of this work, and we only want to remark that a variety of functionals were proposed after $E_M[\gamma]$ and that more elaborate functionals are likely to yield even better agreement with the exact solution.
Chapter 6
POLARITONIC STRUCTURE METHODS IN PRACTICE
In this chapter, we finally apply our new polariton toolbox to concrete problems and show some first results. Specifically, we have two implementations available that are capable of performing calculations with polaritonic orbitals. In Sec. 6.1, we introduce the first one. Here, the polaritonic orbitals are expanded in a combined basis of a one-dimensional lattice for the electronic subsystem and the Fock-number states for the photon mode (see Sec. 6.1.2). In this setting, we can perform polaritonic HF calculations taking explicitly into account the hybrid statistics of the polaritonic orbitals, i.e., the polariton ansatz (see Sec. 5.1). Importantly, this implementation allows us to assess our construction to approximately guarantee the hybrid statistics of polaritonic orbitals (polariton ansatz) and to define settings in which we can use the fermion ansatz. In Sec. 6.2, we introduce our second implementation in the real-space electronic structure code OCTOPUS. It is an extension of the RDMFT routine of OCTOPUS and is capable of performing minimizations with the Müller and HF approximations employing the fermion ansatz.
For both implementations, we present accuracy studies, which is a standard step in first-principles theory. Before we can apply a method to a new problem, we have to systematically compare the numerical results to exact references (or analytical limit cases). Since this is a standard procedure in electronic structure theory, there are well-defined test sets with accurate reference data. For instance, Curtiss et al. introduced the frequently used G2/97 [264] and G3/99 [265] theoretical thermochemistry test sets. For coupled electron-photon systems, we do not have such databases and it will be considerably more difficult to generate them because of the larger configuration space. However, we can still follow the same route and assess our approximations with exact references. With the machinery that we had accessible when we produced our data, we could calculate the (regarding the basis set) exact many-body ground states of one-dimensional two-electron systems that are coupled to one photon mode. In Sec. 6.3, we conclude the assessment with a brief discussion of the numerical challenges of polaritonic-orbital-based methods, including the scaling of the computational costs.
The benchmark studies then justify going beyond problems that we can still treat with exact methods and doing some first calculations of nontrivial systems. Our implementations are not yet general enough to treat realistic three-dimensional matter systems, but are constrained to one spatial dimension. Nevertheless, this setting already allows us to study nontrivial effects of the electron-photon interaction, which we present in Sec. 6.4.
In Sec. 6.5, we summarize these results and comment on possible implications with respect to the standard description in terms of cavity-QED models (Sec. 1.2).
The chapter is based on Refs. [1, 2].
See also the corresponding publication Ref. [part 4][3].
Note that to ease reading in this section, we only briefly introduce both algorithms, reserving their appropriate introduction for part III. The first implementation is discussed in Ch. 8.2, and for the details on the second implementation, the reader is referred to Ch. 8.
The largest system that has been described exactly considers three particles and one mode [235].
In this section, we introduce and assess the setting of our first implementation, which numerically minimizes the polaritonic HF energy functional (5.9) (Sec. 5.2) within the polariton ansatz. This means that the implementation is capable of explicitly enforcing the extra conditions (5.7) that arise from the hybrid statistics of the polaritonic orbitals, and thus of (approximately) enforcing the hybrid statistics. For that, we have developed a new algorithm that we briefly outline in Sec. 6.1.1. In Sec. 6.1.2, we introduce the lattice model that this implementation considers to describe the polaritonic orbitals. In Sec. 6.1.3, we assess this implementation by a comparison to exact results and exemplify the influence of the hybrid statistics.
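As a rough illustration of how inequality constraints of the type (5.7) can be enforced during a minimization, the following Python sketch applies a textbook augmented-Lagrangian iteration to a toy problem. Everything in it is an assumption made for illustration: the quadratic objective, the single constraint g(x) = 1 - x_0 >= 0, the penalty parameter `mu`, the function names, and the use of SciPy's BFGS routine for the inner loop. It is not the algorithm of Sec. 6.1.1 or Ch. 8.2.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Toy "energy" (hypothetical): unconstrained minimum at x = (2, 0.5).
    return (x[0] - 2.0) ** 2 + (x[1] - 0.5) ** 2

def constraint(x):
    # Toy inequality constraint g(x) >= 0, equivalent to x[0] <= 1.
    return 1.0 - x[0]

def augmented_lagrangian(x, lam, mu):
    # Standard form for one inequality constraint g(x) >= 0: a linear
    # multiplier term plus a quadratic penalty while the shifted constraint
    # g - lam/mu is non-positive, and a constant shift otherwise.
    g = constraint(x)
    if g - lam / mu <= 0.0:
        return objective(x) - lam * g + 0.5 * mu * g ** 2
    return objective(x) - 0.5 * lam ** 2 / mu

def solve(x0, mu=10.0, outer_iterations=20):
    x, lam = np.asarray(x0, dtype=float), 0.0  # multiplier initialized to zero
    for _ in range(outer_iterations):
        # Inner loop: unconstrained minimization at fixed multiplier.
        x = minimize(augmented_lagrangian, x, args=(lam, mu), method="BFGS").x
        # Outer loop: multiplier update, kept non-negative.
        lam = max(lam - mu * constraint(x), 0.0)
    return x, lam

x_opt, lam_opt = solve([0.0, 0.0])
```

In this toy problem the constraint is active at the solution, so the iteration should approach x = (1, 0.5) with a positive multiplier. The appeal of the scheme is that the inner loop can be any existing unconstrained (or equality-constrained) minimizer.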
Algorithm-wise, we are confronted with enforcing the additional inequality constraints (5.7) in addition to the original (HF) minimization problem in (5.5). We start by noting that the constraint functions $g_i$ depend on $\gamma\{\phi'_k\}$, which can be directly calculated from the polariton orbitals, via the eigendecomposition of $\gamma_e$. Since the diagonalization of $\gamma_e$ is a nontrivial step for large systems (or in real space) and thus can be a bottleneck of the minimization, it is helpful to consider natural and dressed orbitals as independent variables of the minimization and enforce their connection as an additional constraint. We thus define $g_i = g_i[\gamma_e\{\phi'_k, \psi^e_i\}]$ and include the necessary orthonormality of the $\psi^e_i$ by a third set of conditions
$$f_{ij} = \langle \psi^e_i | \psi^e_j \rangle - \delta_{ij} = 0, \tag{6.1}$$
that we include in the minimization by a third Lagrange-multiplier term $-\sum_{ij} \bar{\theta}_{ij} f_{ij}$. Note that this construction automatically linearizes the constraints (5.7) during one minimization step, where the $\psi^e_i$ are fixed.
To now enforce these inequality constraints, we use an augmented Lagrangian algorithm, following the book of Nocedal and Wright [266], part 17.3. We have chosen this algorithm since it simply extends a given Lagrangian with penalty terms. Hence, we can make use of any existing implementation that solves the minimization problem of Eq. (5.5) and just add the extra terms with corresponding extra iteration loops. To test this, we employed a standard electronic-structure algorithm [267], which we extended by the augmented-Lagrangian method for the inequality constraints. This extension involves two extra terms. A linear (so-called augmented) term, $-\sum_i \nu_i g_i$ with Lagrange multipliers $\nu_i$ that are initialized to zero and updated to values ν i > g i =
0. And a second nonlinear term, that adds a penalty function P = µ /2 (cid:80) i ([ g i ] − ) , where [ y ] − denotesmax( − y ,0), which penalizes violations of condition (5.7) quadratically, but has no effect in the so-called feasible region of configuration space, where the conditions (5.7) are satisfied.Specifically for our example, the extra Lagrangian term of the translation rules depicted in Fig. 5.1 isgiven by G [ γ e { φ (cid:48) k , ψ ei }] = − (cid:88) i λ i g i [ γ e { φ (cid:48) k , ψ ei }] + µ (cid:88) i ([ g i ] − [ γ e { φ (cid:48) k , ψ ei }]) − (cid:88) i j ¯ θ i j f i j [ γ e { ψ ei }]. (6.2) We present the algorithm with all its details in the last part in Ch. 8.2. This is similar to considering φ and φ ∗ as independent. E of the cavity mode yet restricted in the perpendicular directions, i.e., we consider aone-dimensional discretized matter subsystem. Since the extension in the perpendicular directions is smallcompared to the wave length of the dominant cavity mode, the coupling is mediated via the total dipole d of the electrons. 
If the mode volume (distance between the mirrors) is small or the number of particlesincreased, new hybrid light-matter quasi-particles, i.e., polaritons, emerge.The full Lagrangian for the polaritonic HF minimization problem reads then L (cid:48) HF [ γ { φ (cid:48) k , ψ ei }] = E (cid:48) HF − (cid:88) i j ¯ (cid:178) i j h i j [ γ { φ (cid:48) k }] + G [ γ e { φ (cid:48) k , ψ ei }] (6.3)and the corresponding first order conditions for a minimum (stationary point) of L (cid:48) HF are0 =∇ φ (cid:48) k ∗ L (cid:48) HF = ˆ H φ (cid:48) k − (cid:88) j ¯ (cid:178) k j φ (cid:48) j + (cid:88)(cid:163) λ i − µ [ g i ] − (cid:164) ˆ G i φ (cid:48) k (6.4a)0 =∇ ψ ei ∗ L (cid:48) HF = ( µ [ g i ] − − λ i ) (cid:90) d r (cid:48) γ e ( r (cid:48) , r ) ψ ei ( r (cid:48) ) − (cid:88) j ¯ θ i j ψ ej , (6.4b)where we considered φ (cid:48) k and φ (cid:48) k ∗ as independent and defined ˆ G i φ (cid:48) k ( rq ) = n ek (cid:82) d r (cid:48) ψ ei ∗ ( r (cid:48) ) φ (cid:48) k ( r (cid:48) q ) ψ ei ( r σ ).Additionally, we can diagonalize the Lagrange-multiplier matrices ¯ (cid:178) i j = δ i j (cid:178) j and ¯ θ i j = δ i j θ j , since theorbital-dependent Hamiltonian ˆ H and the electronic 1RDM γ e are hermitian. We also want to remark onthe second gradient equation, cf. Eq. (6.4b), which is much simpler than it looks like on a first glance. Infact, solving Eq. (6.4b) is equivalent to solving first the eigenvalue equation for γ e (see the paragraph aboveEq. (4.51)) and then replacing θ i = n ei ( µ [ g i ] − − λ i ). With these definitions, we are able to perform polaritonicHF calculations by numerically solving the Eqs. (6.4a) and (6.4b) with the expressions (5.9) and (5.11).For a more detailed discussion of this algorithm, the reader is referred to Ch. 8.2. For the exemplification of polaritonic HF, we consider a one-dimensional lattice system that couples to onephoton mode in dipole approximation with frequency ω . We depicted a sketch of the setup in Fig. 6.1. 
We149HAPTER 6. POLARITONIC STRUCTURE METHODS IN PRACTICEhave chosen such a simple lattice model, since this allows us to have still exact numerical reference datafor more than 2 electrons to compare to. We stress again that there are no (numerically exact) referencessolutions currently available for realistic three-dimensional matter plus cavity systems. To the best of ourknowledge there are only QEDFT simulations (at several levels of approximations) for such systems [247,233].The Hamiltonian is of the form of Eq. (1.37) and readsˆ H = ˆ H m (cid:122) (cid:125)(cid:124) (cid:123) − t B m (cid:88) i (cid:88) σ =↑ , ↓ ( ˆ c † i , σ ˆ c i + σ + ˆ c † i + σ ˆ c i , σ ) + B m (cid:88) i = v i ˆ n i + ˆ H sel f (cid:122) (cid:125)(cid:124) (cid:123) λ B m (cid:88) i , j = ˆ n i ˆ n j x i x j − (cid:114) ω a † + ˆ a ) λ B m (cid:88) i = x i ˆ n i (cid:124) (cid:123)(cid:122) (cid:125) ˆ H int + ω ( ˆ a + ˆ a + (cid:124) (cid:123)(cid:122) (cid:125) ˆ H ph , (6.5)with hopping t = ∆ x corresponding to a second-order finite difference approximation for a grid with spac-ing ∆ x , where we choose t = ∆ x = v i on site i . We have set the Coulomb repulsion tozero in this example to highlight the influence of the matter-photon coupling and how well the polaritonicHF approach can capture it. However, we show results for larger systems including the Coulomb interactionin Sec. 6.4. Nevertheless, due to the dipole self-energy term ˆ H d we have a mode-induced dipole-dipole in-teraction among the electrons. This type of interaction is important in many fundamental quantum-opticalquestions, such as the quest for a super-radiant phase in the strong-coupling case [268, 269, 270] (see alsoSec. 1.1). 
Further, the electron basis B m is determined by the number of sites, x i = i − x is the position withrespect to the middle of our lattice x , ˆ c (†) i , σ are the fermionic creation (annihilation) operators that satisfy theanticommutation relation [ ˆ c i , σ , ˆ c † j , σ (cid:48) ] + = δ i j δ σσ (cid:48) , and ˆ n i = ˆ c † i , ↑ ˆ c i , ↑ + ˆ c † i , ↓ ˆ c i , ↓ is the density operator.For the implementation and to go to the polariton picture, we express the matter ˆ H m plus dipole part ˆ H d of our Hamiltonian in matrix form by using the basis states | ˜ ψ i , σ 〉 = ˆ c † i , σ | 〉 . As a basis for the photon subsys-tem, we utilize the eigenstates χ i of the photon energy operator, i.e., ˆ H ph χ α = ( α + χ α , which are photonnumber states. To calculate the coupling term of the energy expression H I , cf. Eq. (5.9), we express the dis-placement operator p α = (cid:112) ω α ( ˆ a † α + ˆ a α ) in this basis as well. To then construct the auxiliary Hamiltonianˆ H (cid:48) to Eq. (6.5) according to the rules from Sec. 5.2, we would need to define the auxiliary terms t (cid:48) , v (cid:48) , w (cid:48) , cf.Eqs. (4.38)-(4.40). Since in this section we employ a second-quantized picture, it is less convenient to de-fine these kernel-like quantities, but directly the many-body operators ˆ T [ t (cid:48) ], ˆ V [ v (cid:48) ], ˆ W [ w (cid:48) ]. For the one-bodyterms T [ t (cid:48) ] + V [ v (cid:48) ], this is straightforward and the expression reads T [ t (cid:48) ] + V [ v (cid:48) ] = ˆ H m + ˆ H ph + λ B m (cid:88) i = ˆ n i ˆ n i x i − (cid:114) ω N ( ˆ a † + ˆ a ) λ B m (cid:88) i = x i ˆ n i . (6.6)However, the interpretation of the operators is different from before, because we have to apply them topolaritonic basis states. Since we consider the spin-restricted formalism as introduced in Sec. 5.2, we neglectthe spin-dependency of the electronic part of the basis ˜ ψ i , σ → ψ i and define | φ (cid:48) i α 〉 = | ψ i χ α 〉 . 
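The product basis $|\psi_i \chi_\alpha\rangle$ can be made concrete with Kronecker products. The Python sketch below assembles a one-body matrix on a lattice-times-Fock space with nearest-neighbour hopping, a photon-number term, a quadratic dipole self-energy, and a bilinear coupling between position and $\hat{a}^\dagger + \hat{a}$. The prefactors (1/2 in the self-energy and in the zero-point energy, $\sqrt{\omega/(2N)}$ in the coupling), all parameter values, and all names are assumptions chosen for illustration; the sketch shows the matrix structure only and is not a transcription of the thesis implementation.

```python
import numpy as np

def one_body_matrix(n_sites, n_fock, t=0.5, omega=1.0, lam=0.1, n_el=2, v=None):
    """Illustrative one-body matrix in the basis |psi_i chi_alpha>,
    with combined index i * n_fock + alpha."""
    v = np.zeros(n_sites) if v is None else np.asarray(v, dtype=float)
    # Site positions measured from the middle of the lattice.
    x = np.arange(n_sites) - 0.5 * (n_sites - 1)

    # Electronic part: nearest-neighbour hopping plus local potential.
    h_el = -t * (np.eye(n_sites, k=1) + np.eye(n_sites, k=-1)) + np.diag(v)

    # Photonic part in the number basis, using a truncated annihilation operator.
    a = np.diag(np.sqrt(np.arange(1, n_fock)), k=1)
    h_ph = omega * (a.T @ a + 0.5 * np.eye(n_fock))

    # Dipole self-energy acts on the electronic index only (assumed prefactor 1/2).
    self_energy = 0.5 * lam ** 2 * np.diag(x ** 2)
    # Bilinear coupling: position times (a^dagger + a) (assumed prefactor).
    coupling = -lam * np.sqrt(omega / (2.0 * n_el)) * np.kron(np.diag(x), a.T + a)

    return (np.kron(h_el + self_energy, np.eye(n_fock))
            + np.kron(np.eye(n_sites), h_ph)
            + coupling)

h = one_body_matrix(n_sites=5, n_fock=4)
```

The coupling block has nonzero entries only between photon numbers that differ by one, with weights $\sqrt{\beta+1}$ and $\sqrt{\beta}$, which is the pattern of Kronecker deltas that appears in matrix elements of this type.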
We can thenderive the kernel expression ( t + v )( x , q ) → ( t (cid:48) + v (cid:48) ) j β i α = 〈 ψ i χ α | ( ˆ T (cid:48) + ˆ V (cid:48) ) ψ j χ β 〉 as matrix elements( t (cid:48) + v (cid:48) ) j β i α = − t ( δ i , j + + δ i + j ) + v i δ i j + ω ( δ α , β + ) + λ ( x i − x ) δ i j δ α , β − λ (cid:113) ω N ( x i − x ) δ i j ( (cid:113) β + δ α , β + + (cid:113) βδ α , β − ). (6.7)For the two-body term, a definition analogously to (6.6) is more difficult, since we have to differentiate thetwo polaritonic coordinates. For the sake of the analogy, we formally writeˆ W (cid:48) = λ B m (cid:88) i (cid:54)= j = ˆ n i ˆ n j x i x j − (cid:114) ω N ( ˆ a + ˆ a ) λ B m (cid:88) i = x i ˆ n i − (cid:114) ω a + ˆ a ) λ B m (cid:88) i = x i ˆ n i , (6.8)where the upper indices differentiate the two polaritonic orbitals that both have an electronic and photonicpart. This is to be understood in the following sense: To define the corresponding kernel w ( x , q , x , q ) → w i α i α j β j β = 〈 ψ i χ α ψ i χ α | ˆ W ψ j χ β ψ j χ β 〉 , the operators only act on the basis elements with the sameindices. As an example, let us state the kernel ( w self ) i α i α j β j β for the self-interaction part ˆ W self = λ (cid:80) B m i (cid:54)= j = ˆ n i ˆ n j x i x j that reads ( w self ) i α i α j β j β = λ ( x i − x )( x i − x ) δ i j δ i j δ α β δ α β . (6.9)With these definitions, we can calculate the polaritonic HF energy expression, cf. (5.9) and the polaritonicHF Fock-matrix, cf. (5.11). Then, we employ the augmented Lagrangian algorithm as discussed in Sec. 6.1.1to find the polaritonic HF ground state of the model system. As a first example, we illustrate the violation of the Pauli principle if we do not enforce the right symmetries(see Sec. 5.1). 
To this end, we compare ground-state energies, electronic 1RDMs and the photon number ofa small 4-electron system obtained with the two different HF ground states, i.e., polaritonic HF using densitymatrices with the exact symmetry, cf. (4.43), and polaritonic HF with only fermionic symmetry (which wecall in this section fermionic HF). We can expect deviations between both polaritonic-HF theory levels forsystems that contain more than one orbital. In our spin-restricted case this corresponds to more than twoelectrons and that is why we chose here N =
4. Further we set the external potential to zero, i.e., v i = ∀ i .Since we need to calculate the exact coupled electron-photon many-body ground state from a configurationspace that grows exponentially fast with the size of the basis sets and the electron number, we choose asmall box of length L = B m = B ph = Despite the small basis sets and electron number employed,the many-body configuration space has the considerable size of (2 B m ) N ∗ B ph ≈ , which is already at theedge of standard exact diagonalization solvers: matrices of this size can still be diagonalized without specialefforts like parallelization. Since we only aim for a benchmark study here, this limitation is not problematic,but it shows how expensive the exact solutions of coupled electron-photon systems computationally are.The need for numerically manageable approximations is evident here.We first compare the electronic 1RDMs γ e , cf. Eq. (4.47), and the photon numbers N ph = 〈 ˆ N ph 〉 , cf.Eq. (1.42) using the connection formula of Eq. (4.57), for varying coupling strengths g / ω = λ / (cid:112) ω and ω = For example, deviations in the energy or photon number between B ph = B ph = − . N ph , which is an equivalently good measure for the quality in the photonic sector. In Fig. 6.2 (a),we display the difference of the exact electronic 1RDM from the one of the polaritonic HF and fermionicHF approximations, measured by the Frobenius norm (cid:107) A (cid:107) = (cid:113)(cid:80) i j A i j for a matrix A i j . We see that forall coupling strengths the polaritonic HF 1RDM (dashed-dotted orange line), which enforces the right hy-brid statistics, remains very close to the exact solution indicating that the electronic subsystem is capturedvery well within this approximation. 
The fermionic HF (solid blue line) approximation, however, deviatesstrongly due to its wrong purely fermionic character.(a) (b)Figure 6.2: Comparison of the electronic 1RDM γ e (a) and the photon number N ph (b) for the 4-electronsystem with ω = g / ω . In (a) the norm difference between the exact1RDM and the polaritonic HF (pHF) 1RDM (dashed-dotted orange line) and between the exact 1RDM andfermionic HF 1RDM (fHF) (solid blue line) are displayed. In (b) the exact photon number (dashed green line)and the polaritonic HF (dashed-dotted orange line) and fermionic HF (solid blue line) photon numbers areshown. In both cases, fermionic HF deviates much stronger from the exact reference than polaritonic HFdue to the wrong symmetry.The same behavior is also encountered in the photonic subsector, where in Fig. 6.2 (b) the photon num-ber of the exact calculation (dashed green line) is compared to the polaritonic HF (dashed-dotted orangeline) and to the fermionic HF photon number (solid blue line). We therefore find, similarly to the simpleuncoupled problem in Sec. 5.1 (for g / ω = E = 〈 ˆ H 〉 of thecoupled system as a function of the coupling strength g / ω . While the polaritonic HF (dashed-dotted orangeline) is variational, i.e., due to the right statistics we are always equal or above the exact energy (dashedgreen line), the fermionic HF (blue solid line) breaks the proper symmetry and thus can reach energiesbelow the physically accessible ones. However, again in close analogy to the uncoupled example in Sec. 5.1,if we increase the frequency of the photon field such that it is much more costly to excite photons thanelectrons, the minimal-energy conditions can single out the correct statistics, as displayed in Fig. 6.3(b).That is, for ω large enough the constraints g i [ γ { φ (cid:48) k , ψ ei }] ≥ ω = ω = g / ω . 
While in (a) the fermionic HF approximation (solid blue line) can achieve unphysically low energies when compared to the exact solution (dashed green line) due to the wrong statistics, in (b) the minimal-energy condition singles out the right statistics without further constraints. The polaritonic HF (dashed-dotted orange line) by construction always has the right hybrid statistics and thus is variational, i.e., the energy is always above the exact energy.
this feature later, when we use the other implementation for a 4-electron system. We want to stress again that for a 2-electron singlet system, we always trivially satisfy the additional constraints, because there is only one occupied orbital. Thus, in the following benchmark study, we do not have to take care of the photon frequency.
In this section, we present our second numerical implementation of polaritonic orbitals, which minimizes the RDMFT energy functional (5.13) within the fermion ansatz. We can employ the HF (Eq. (5.14)) as well as the Müller approximation (Eq. (5.15)) to the energy functional. The implementation is part of the real-space electronic structure code OCTOPUS [3]. This implementation is not yet extended to explicitly guarantee the extra conditions of Eq. (5.7) and thus can only be applied in settings where the conditions are trivially fulfilled (see Sec. 6.1.3). Within this limitation, the implementation is capable of studying one-dimensional electronic systems coupled to one photon mode in full real space. This means that the electronic and photonic coordinates of the polaritonic orbitals are approximated on discretized grids and the differential operators are approximated accordingly (see Sec. 8.1). This allows for a highly accurate description of coupled light-matter systems (see also App. C).
In Sec. 6.2.1, we briefly introduce the RDMFT algorithm. The appropriate introduction of the implementation, including a detailed convergence study, is presented instead in Ch. 8 of the numerics part. In Sec. 6.2.2, we validate the methods for two example systems.
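To make the statement about discretized grids concrete, the sketch below builds a second-order finite-difference Hamiltonian for a single orbital that depends on one electronic coordinate x and one photonic coordinate q. The kinetic prefactor 1/2 for both coordinates, the grid sizes, the harmonic test potential, and all names are assumptions for illustration. The actual discretization used in OCTOPUS is described in Sec. 8.1 and may differ, for instance in stencil order and boundary conditions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def laplacian_1d(n, h):
    # Second-order central differences; the function is assumed to vanish
    # outside the grid (zero boundary conditions).
    return sp.diags([np.ones(n - 1), -2.0 * np.ones(n), np.ones(n - 1)],
                    [-1, 0, 1]) / h ** 2

def hamiltonian_xq(x, q, potential):
    # H = -1/2 (d^2/dx^2 + d^2/dq^2) + potential(x, q). The combined grid
    # index is ix * len(q) + iq, matching the C-ordered ravel below.
    lap_x = sp.kron(laplacian_1d(len(x), x[1] - x[0]), sp.identity(len(q)))
    lap_q = sp.kron(sp.identity(len(x)), laplacian_1d(len(q), q[1] - q[0]))
    xx, qq = np.meshgrid(x, q, indexing="ij")
    return (-0.5 * (lap_x + lap_q) + sp.diags(potential(xx, qq).ravel())).tocsr()

# Test case (hypothetical): two uncoupled unit-frequency harmonic oscillators.
x = np.linspace(-6.0, 6.0, 41)
q = np.linspace(-6.0, 6.0, 41)
h_xq = hamiltonian_xq(x, q, lambda xx, qq: 0.5 * xx ** 2 + 0.5 * qq ** 2)
ground_energy = eigsh(h_xq, k=1, which="SA")[0][0]
```

For this test potential the lowest eigenvalue should lie close to the analytic value 1, up to a discretization error that is of second order in the grid spacing. Replacing the test potential by a dressed potential turns the same operator into a one-orbital polaritonic problem.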
OCTOPUS-implementation of polaritonic RDMFT
An important advantage of the polariton formulation of cavity QED is that we can re-use most of the numerical techniques developed for quantum chemistry and materials science. We demonstrate this explicitly with our working implementation of dressed RDMFT in the electronic-structure code
Octopus [3], which is publicly available in the current developer's version (see https://octopus-code.org/wiki/Developers:Starting_to_develop). Specifically, we rewrite the approximated energy functional in the natural orbital basis as

$$E[\gamma] = \sum_{i=1}^{\infty} n_i \int \mathrm{d}^{3+M}z\, \phi_i^*(z) \left[ -\tfrac{1}{2}\Delta + v'(z) \right] \phi_i(z) + \tfrac{1}{2} \sum_{i,j} n_i n_j \iint \mathrm{d}^{3+M}z\, \mathrm{d}^{3+M}z'\, |\phi_i(z)|^2 |\phi_j(z')|^2\, w'(z,z') + E_{xc}[\gamma].$$

We use this form to minimize the energy functional by varying the natural orbitals as well as the natural occupation numbers. To impose fermionic ensemble $N$-representability, we first represent the occupation numbers as the squared sine of auxiliary angles, i.e., $0 \le n_i = 2\sin^2(\alpha_i) \le 2$, to satisfy Eq. (3.47a). (Note that the $n_i$ are bounded by 2 because we employ a spin-summed formulation. If we considered natural spin-orbitals instead, the upper bound would be 1.) The second part of the conditions (Eq. (3.48)), i.e., $\sum_{i=1}^{\infty} n_i = N$, as well as the orthonormality of the dressed natural orbitals, i.e., $\int \mathrm{d}^{3+M}z\, \phi_i^*(z) \phi_j(z) = \delta_{ij}$, are imposed via Lagrange multipliers as, e.g., explained in Ref. [271]. We have two different orbital-optimization methods available: a conjugate-gradient algorithm (see Sec. 7.3) and an alternative method that was introduced by Piris and Ugalde [158] (see Sec. 7.2). The latter expresses the $\phi_i$ in a basis set and can use this representation to considerably speed up calculations in comparison to the conjugate-gradient algorithm. It was used for all results presented in the following. However, it is not trivial to converge such calculations in practice, and we developed a protocol to obtain properly converged results. The interested reader is referred to App. C.

We now validate our real-space implementation by comparing to exact solutions of simple atomic and molecular systems. The different systems are described by a local potential $v(x)$ and coupled to one photon mode. We transform the systems to the dressed basis, which leads to the dressed local potential

$$v'(x,q) = v(x) + \tfrac{1}{2}(\lambda x)^2 + \tfrac{\omega^2}{2} q^2 - \tfrac{\omega}{\sqrt{2}}\, q\, (\lambda x). \qquad (6.10)$$

Specifically, we consider a one-dimensional model of a helium atom (He), i.e.,

$$v_{He}(x) = -\frac{2}{\sqrt{x^2 + \epsilon^2}}, \qquad (6.11)$$

and a one-dimensional model of a hydrogen molecule (H$_2$), i.e.,

$$v_{H_2}(x) = -\frac{1}{\sqrt{(x-d)^2 + \epsilon^2}} - \frac{1}{\sqrt{(x+d)^2 + \epsilon^2}}, \qquad (6.12)$$

at its equilibrium position $d = d_{eq} =$ […]. We use the soft-Coulomb interaction

$$w(x,x') = \frac{1}{\sqrt{|x-x'|^2 + \epsilon_C^2}} \qquad (6.13)$$

for all test systems.
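The dressed potential of Eq. (6.10) and the angle parametrization of the occupation numbers can be sketched in a few lines. This is only an illustration, not the Octopus implementation: the function names and the values of `EPS`, `LAM` and `OMEGA` are hypothetical choices, and the factor $1/\sqrt{2}$ assumes the two-electron form of Eq. (6.10).

```python
import math


def soft_coulomb(x, charge, eps):
    """Soft-Coulomb attraction -Z / sqrt(x^2 + eps^2) of a nucleus at x = 0."""
    return -charge / math.sqrt(x * x + eps * eps)


def dressed_potential(x, q, v, lam, omega):
    """Dressed local potential v'(x, q) of Eq. (6.10) for a bare potential v."""
    return (v(x)
            + 0.5 * (lam * x) ** 2
            + 0.5 * omega ** 2 * q ** 2
            - omega / math.sqrt(2.0) * q * (lam * x))


def occupations_from_angles(alphas):
    """Spin-summed occupations n_i = 2 sin^2(alpha_i), always within [0, 2]."""
    return [2.0 * math.sin(a) ** 2 for a in alphas]


# Hypothetical illustration values (not the parameters used in the text).
EPS, LAM, OMEGA = 1.0, 0.1, 0.5


def v_he(x):
    """One-dimensional helium model of Eq. (6.11) with softening EPS."""
    return soft_coulomb(x, 2.0, EPS)


if __name__ == "__main__":
    print(dressed_potential(0.0, 0.0, v_he, LAM, OMEGA))
    print(occupations_from_angles([0.0, math.pi / 4, math.pi / 2]))
```

Parametrizing the occupations by angles turns the box constraint $0 \le n_i \le 2$ into an unconstrained problem in the $\alpha_i$, so that only the particle-number and orthonormality conditions remain to be handled by Lagrange multipliers.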
Note that the softening parameters $\epsilon$ and $\epsilon_C$ are a standard tool for one-dimensional models of atoms. (See, for example, Gebremedhin and Weatherford [273] for an introduction to the topic and further details about the mathematical properties of the soft-Coulomb approximation.) In contrast to the 3d case, the divergence of the potential $1/|x|$ for $x = 0$ […]. We set $\epsilon = \epsilon_C =$ […] $\epsilon$ to guarantee properly bound electrons (Sec. 6.4.3). In Sec. 6.4.4, we will even use $\epsilon$ as an additional parameter to explicitly control the confinement of the electrons. (Note that we introduced $v_{H_2}$ already in Sec. 2.5, Eq. (2.111).)

Finally, we choose the photon frequency in resonance with the lowest excitations of the respective “bare” systems, i.e., outside of the cavity. (Note, however, that such a resonance is not an important feature of ground states; we use these values only because we have to choose one.) For that, we calculate the ground and first excited state of each system with the exact solver and find the corresponding excitation frequencies $\omega_{He} =$ […] and $\omega_{H_2} =$ […]. […] $L_x = L_q = 16$ a.u. and spacings of $d_x = d_q =$ […] or less. All the following results require a maximal precision of the order of $10^{-[…]}$ in energy as well as in the density, and thus we can safely use the given parameters. Details can be found in Ch. 8. For […] $L_x = L_q = 20$ a.u. and $d_x = d_q =$ […], $M =$ […] ($M = 71$) natural orbitals for He (H$_2$). The reader is referred to App. C for the details on how to determine these numbers.

We first show (see Fig. 6.4) the deviations of the ground-state energies for dressed RDMFT and dressed HF from the exact dressed calculation as a function of the dimensionless ratio between effective coupling strength and photon frequency, $g/\omega$, for He and H$_2$, respectively. We thereby go from weak to very strong coupling with $g/\omega =$ […]. We see that while dressed HF deviates strongly for large couplings, dressed RDMFT remains very accurate over the whole range of coupling strengths. Still, a more severe test of the accuracy of our method is to compare, instead of merely energies, spatially resolved quantities like the ground-state density $\rho(x,q) \equiv \gamma(x,q;x,q)$. To simplify this discussion, we separate the electronic and photonic parts of the two-dimensional density by integration, i.e., $\rho(x) = \int \mathrm{d}q\, \rho(x,q)$ and $\rho(q) = \int \mathrm{d}x\, \rho(x,q)$. The exact reference solutions show that with increasing $g/\omega$ the electronic part of the density becomes more localized, while the photonic part becomes broadened. This behavior is captured qualitatively by dressed HF as well as by dressed RDMFT. For the electronic density, the latter performs considerably better over the whole range of coupling strengths, whereas for the photonic densities both levels of theory deviate in a similar way from the exact result.

Figure 6.4 (panels: He, H$_2$): Differences of dressed HF (dHF) and dressed RDMFT (dRDMFT) from the exact ground-state energies (in Hartree) as a function of the coupling $g/\omega$ for the (one-dimensional) He atom (left) and the (one-dimensional) H$_2$ molecule (right) in the dressed orbital description. Dressed RDMFT improves considerably upon dressed HF. For both systems, the energy of dressed RDMFT remains close to the exact one, whereas the error of dressed HF increases with the coupling strength.
This is shown for $g/\omega =$ […] in Fig. 6.5 […] $\rho(x)$ (here visible for H$_2$). (Note: Such strong coupling strengths are sometimes referred to as the deep-strong coupling regime [48], which has been observed experimentally in different systems, for instance for Landau polaritons [62]. For (organic) molecules, the highest reported coupling strengths are in the ultra-strong regime of $g/\omega \approx$ […].)

An even more stringent test of the accuracy of the dressed RDMFT approach is to compare the dressed 1RDMs. The essential ingredients of the dressed 1RDMs are their natural orbitals $\phi_i(x,q)$. Again, we separate electronic and photonic contributions and show their reduced electronic density $\rho_i(x) = \int \mathrm{d}q\, |\phi_i(x,q)|^2$. Fig. 6.6 depicts the first three dressed natural orbital densities of dressed RDMFT in comparison with the exact ones for both test systems. For both systems, the lowest natural orbital density of the dressed RDMFT approximation is almost the same as the exact one, and the second natural orbital densities are only slightly different, but the third natural orbital densities of H$_2$ differ even qualitatively. For He, similarly strong deviations are visible for the fourth natural orbital. However, as long as such strong deviations only occur for natural orbitals with small natural occupation numbers, like in these cases (H$_2$: $n =$ […]), […]. […] the photonic natural orbital densities $\rho_i(q) = \int \mathrm{d}x\, |\phi_i(x,q)|^2$, the first 3 of which are plotted in Fig. 6.7 for He and H$_2$. Here, the dressed RDMFT results agree even better with the exact solution than their electronic counterparts. Apparently, dressed RDMFT captures the photonic properties of the tested systems very accurately in the ultra-strong coupling regime. The accuracy drops with increasing $g/\omega$.

As an example of a photonic observable, we show in Fig. 6.8 the mode occupation $N_{ph}$ as a function of the coupling strength $g/\omega$, which we calculated by using Eq. (4.58), i.e., $N_{ph} = E_{ph}/\omega - N/2$, with the photon mode energy $E_{ph} = \sum_{i=1}^{M} n_i \int \mathrm{d}x\, \mathrm{d}q\, \phi_i^*(x,q) \left( -\tfrac{1}{2} \tfrac{\mathrm{d}^2}{\mathrm{d}q^2} + \tfrac{\omega^2}{2} q^2 \right) \phi_i(x,q)$. (Note that this is not the usual photon number. See Sec. 1.3.3.) From weak coupling to the beginning of the ultra-strong coupling regime ($g/\omega \approx$ […]), […] $N_{ph}$ well. For very large coupling strengths, the deviations from the exact mode occupation become sizeable. This might sound counter-intuitive, as the photonic density is described comparatively well. The reason is that the photon occupation, in contrast to the density, is mainly determined by the second and third natural orbital, because the first natural orbital resembles a photonic ground state with occupation number zero in the studied cases. Dressed HF does not consider a second orbital (the first is instead doubly occupied) and thus cannot capture the effect. For dressed RDMFT, the error in the second and third natural orbital is much larger than in the first (see Fig. 6.7). However, it is probable that this can be improved by better functionals.

Although we can only assess our theories on small model systems, we can draw some interesting conclusions. First of all, for coupling strengths of $g/\omega \lesssim$ […] $g/\omega =$ […]

Figure 6.5 (panels: He, H$_2$): Deviations of dressed HF (dHF) and dressed RDMFT (dR) ground-state densities from the exact solution (depicted in the insets) for the He atom (top) and the H$_2$ molecule (bottom) with coupling $g/\omega =$ […]

Figure 6.6 (panels: He, H$_2$): The first three natural orbital densities $\rho^{(i)}_{ex/dR}(x)$ of the exact (ex) and dressed RDMFT (dR) calculations are depicted for the He atom (top) and the H$_2$ molecule (bottom) with coupling $g/\omega =$ […] $\rho^{(1)}_{ex}(x)$ is almost exactly reproduced by dressed RDMFT, but $\rho^{(2)}_{dR}(x)$ already deviates visibly from $\rho^{(2)}_{ex}(x)$ (left). However, it is in both cases qualitatively correct. This changes for $\rho^{(3)}_{dR}(x)$ of H$_2$, which has one node more than $\rho^{(3)}_{ex}(x)$. Nevertheless, $\rho^{(3)}_{dR}(x)$ of He is reproduced correctly (right).
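The reduced densities $\rho(x)$, $\rho(q)$ and the mode occupation $N_{ph} = E_{ph}/\omega - N/2$ discussed in this section can be evaluated on a grid as in the following minimal sketch. It is not the Octopus implementation: it assumes real orbitals stored as nested lists `phi[ix][iq]`, a uniform grid, and a simple three-point stencil for the second derivative; all names and grid values are hypothetical.

```python
import math


def normalize(phi, dx, dq):
    """Scale a real orbital phi[ix][iq] so that its grid norm is one."""
    norm2 = sum(v * v for row in phi for v in row) * dx * dq
    return [[v / math.sqrt(norm2) for v in row] for row in phi]


def reduced_densities(orbitals, occs, dx, dq):
    """Return rho(x) and rho(q), obtained by integrating out q or x."""
    nx, nq = len(orbitals[0]), len(orbitals[0][0])
    rho_x, rho_q = [0.0] * nx, [0.0] * nq
    for n, phi in zip(occs, orbitals):
        for ix in range(nx):
            for iq in range(nq):
                weight = n * phi[ix][iq] ** 2
                rho_x[ix] += weight * dq
                rho_q[iq] += weight * dx
    return rho_x, rho_q


def mode_occupation(orbitals, occs, q_grid, dx, dq, omega, n_el):
    """N_ph = E_ph / omega - N / 2, with a 3-point stencil for d^2/dq^2."""
    e_ph = 0.0
    for n, phi in zip(occs, orbitals):
        for row in phi:
            for iq, q in enumerate(q_grid):
                left = row[iq - 1] if iq > 0 else 0.0
                right = row[iq + 1] if iq < len(row) - 1 else 0.0
                d2 = (left - 2.0 * row[iq] + right) / dq ** 2
                h_phi = -0.5 * d2 + 0.5 * omega ** 2 * q ** 2 * row[iq]
                e_ph += n * row[iq] * h_phi * dx * dq
    return e_ph / omega - 0.5 * n_el


# Toy check: one doubly occupied product orbital whose photonic factor is
# the oscillator ground state, so the mode occupation should vanish.
omega, dx, dq = 1.0, 0.5, 0.1
x_grid = [-5.0 + dx * i for i in range(21)]
q_grid = [-6.0 + dq * j for j in range(121)]
phi0 = normalize([[math.exp(-0.5 * x * x) * math.exp(-0.5 * omega * q * q)
                   for q in q_grid] for x in x_grid], dx, dq)
rho_x, rho_q = reduced_densities([phi0], [2.0], dx, dq)
n_ph = mode_occupation([phi0], [2.0], q_grid, dx, dq, omega, n_el=2)
```

In this toy case the only occupied orbital carries a photonic ground state, so `n_ph` vanishes up to the discretization error. This mirrors the observation above that the mode occupation is determined by the higher natural orbitals rather than by the first one.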
Figure 6.7 (panels: He, H$_2$): We show the differences $\Delta\rho^{(i)} = \rho^{(i)}_{dR}(q) - \rho^{(i)}_{ex}(q)$ between the dressed RDMFT (dR) and the exact (ex) photonic natural orbital densities $\rho^{(i)}_{ex/dR}(q)$ for the 3 highest occupied natural orbitals for the He atom (left) and the H$_2$ molecule (right) for coupling strength $g/\omega =$ […] $\rho^{(i)}_{ex}(q)$ have a similar shape as the density (see inset). We see in both cases that dressed RDMFT captures the exact solution very well.

Figure 6.8 (panels: He, H$_2$): The total mode occupation $N_{ph}$, calculated from the exact, dressed HF and dressed RDMFT solutions, is shown for He (left) and H$_2$ (right). We see that both dressed RDMFT and dressed HF underestimate $N_{ph}$. In the ultra-strong coupling regime for $g/\omega >$ […]

We have now discussed the dressed construction in all its details, presented two examples of its application (polaritonic HF and polaritonic RDMFT), and showed that these provide an accurate description of a range of simple cavity-QED systems. Let us now briefly analyze the prospects of applying such methods to larger systems, which principally depends on how their numerical costs scale with the system size.

We start by remarking that any polaritonic-structure theory scales exactly as the corresponding electronic-structure method. This means that the numerically most expensive step of the algorithm has the same dependence on the size of the orbital basis for the electronic and the polaritonic version of the method. In particular, enforcing the hybrid statistics does not increase the scaling of the method. For instance, a very common algorithm to solve the HF equations iteratively diagonalizes the matrix representation of the Fock operator. If the basis set consists of $B$ elements, the algorithm scales with $B^3$, independently of whether the basis elements correspond to polaritonic or electronic orbitals.
From a practitioner’s perspective, this is probably the most important feature of the polariton description.

However, despite its importance, the scaling is not sufficient to characterize the numerical challenges of the polariton description. The two most important further challenges are the increase of the dimension of the basic orbitals of the theory and the inclusion of the new constraints that enforce the right statistics. The former challenge is reflected in the fact that although polaritonic HF has the same scaling with the basis size as electronic HF, we have to consider a larger basis. If the photon mode(s) are satisfactorily described by a basis of size $B_{ph}$ and the matter part of the system by a basis of size $B_m$, then polaritonic HF scales as $(B_m B_{ph})^3$. Thus, we get a factor of $B_{ph}^3$ in addition to the matter description. Note that this holds for 1D, 2D and 3D systems equivalently. A similar consideration also holds for a real-space description, where the Fock matrix cannot be explicitly constructed, but is diagonalized iteratively by, e.g., a conjugate-gradient algorithm like the one of our implementation [267]. The scaling of such a method in the electronic case is of the order $O(B_m \ln B_m\, N^{[…]})$ and can even be reduced with state-of-the-art algorithms to $O(B_m \ln B_m\, N^{[…]})$ [276], where $N$ is the number of electrons. As usually one single effective mode is considered, $B_{ph}$ should not become too large, and the photonic part is therefore not overly expensive numerically.

Note: Such methods thus scale better than a direct diagonalization of the Fock matrix, but at the same time the underlying basis size $B_m$ describes the number of grid points and thus is typically much larger than $B_m$ in orbital-based codes. However, for very large systems, both descriptions can have comparable $B_m$. A real-space-like description of the displacement coordinates of the photon modes would in a similar manner be quite inefficient for a few photons, but might even become advantageous for large photon numbers.

Note: We do not want to underestimate the influence of an entirely new dimension in the problem. The given statement depends of course on the necessary size of the photon basis. However, in the cases that we have studied so far, we obtained converged results with photon bases that were of the order of the number of particles.

The second challenge is instead on the level of the algorithm. We have to solve a nonlinear minimization problem under nonlinear inequality constraints. It is not clear a priori which algorithms are best for such delicate problems, and thus different approaches have to be tested carefully. To develop the algorithm for our implementation (see also Sec. 8.2), we have already discarded certain approaches, but there are still many open questions (see part IV).
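The extra cost caused by the photonic basis can be made concrete with a small estimate. The sketch below assumes that the dominant step is a dense diagonalization with cubic cost in the basis size; the function names and the example sizes are hypothetical and only illustrate the argument.

```python
def dense_diag_cost(basis_size, exponent=3):
    """Relative cost of a dense diagonalization (assumed basis_size**3)."""
    return basis_size ** exponent


def polaritonic_overhead(b_m, b_ph, exponent=3):
    """Cost ratio of the polaritonic to the purely electronic problem."""
    return dense_diag_cost(b_m * b_ph, exponent) / dense_diag_cost(b_m, exponent)


if __name__ == "__main__":
    # Hypothetical sizes: 30 matter basis functions, 5 photon basis functions.
    print(polaritonic_overhead(30, 5))
```

Under this assumption the overhead depends only on the photon basis size and not on the matter basis, which is why a small $B_{ph}$ keeps the polaritonic calculation affordable.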
In this chapter, we present first results for systems that are not (easily) accessible by exact methods to demonstrate the possibilities of our newly developed machinery. We do not (yet) aim to study experimentally observed effects directly, because our implementations are not yet at the level needed to treat realistic three-dimensional matter systems. Nevertheless, our two implementations already allow us to identify a range of nontrivial effects that arise from the complex interplay of electronic correlations and light-matter interaction. Importantly, these effects are local, which makes their description with usual model approaches difficult. Thus, the presented results will allow us to draw some first conclusions on the limitations of such approaches.
As a starting example, we go one small step beyond our assessment study and present results for a many-body system that cannot easily be solved exactly: the one-dimensional beryllium atom (Be) in a cavity, which is described by

$$v_{Be}(x) = -\frac{4}{\sqrt{x^2 + \epsilon^2}}. \qquad (6.14)$$

In this case, we consider a smaller softening parameter of $\epsilon =$ […]. (This is a technicality for one-dimensional systems: the softening parameter is a fitting parameter used to obtain in 1d an energy and wave-function behavior similar to the real 3d system. For corresponding references, see the second section of Ch. 6.) Since we use here the Octopus implementation, which does not explicitly enforce the hybrid statistics, we cannot choose an arbitrary cavity frequency, but need to give it a sufficiently high value such that the constraints are trivially fulfilled (see Sec. 5.1 and Sec. 6.1.3). We tested this and found that $\omega =$ […] $E_h$ is large enough to make sure that no solutions that violate the Pauli principle can occur.

In Fig. 6.9, we see the total energy as a function of the coupling strength $g/\omega$ for dressed HF and dressed RDMFT, respectively. As in the two-electron systems, the deviation between both curves increases for larger $g/\omega$ and, as expected, the dressed RDMFT energies are lower than the dressed HF results. Analyzing the ground-state densities, we see a similar trend as in the 2-particle systems. With increasing $g/\omega$, the electronic (photonic) part of the density becomes more (less) localized, though the details differ, as we show in the last part of this chapter (see Fig. 6.13 and the corresponding part in the main text). Comparing dressed RDMFT with dressed HF, we observe that the variation of the electronic (photonic) density with increasing coupling strength is less (more) pronounced for dressed RDMFT, as Fig. 6.10 shows. We conclude the survey of Be with the mode occupation under variation of the coupling strength (see Fig. 6.11). We see that the value of $g/\omega \approx$ […] $g/\omega <$ […] $g/\omega >$ […] $g/\omega \approx$ […]: $g/\omega \approx$ […]

Figure 6.9: […] $g/\omega$.
We observe the same trend as for the two-electron systems: for both levels of theory, the energy grows with increasing $g/\omega$, though faster for dressed HF than for dressed RDMFT.

Figure 6.10: Shown are the electronic ($\rho^{dHF/dR}_{g/\omega}(x)$, left) and photonic ($\rho^{dHF/dR}_{g/\omega}(q)$, right) densities of Be for dressed HF (dHF) and dressed RDMFT (dR) for 2 different coupling strengths, subtracted from their counterparts in the no-coupling limit ($\rho^{dHF/dR}_{g/\omega=0}(x/q)$). We see in the electronic (photonic) case that the dressed RDMFT deviations are less (more) pronounced than for dressed HF.

Figure 6.11: The total mode occupation $N_{ph}$ for dressed HF and dressed RDMFT is shown for Be. We see that dressed RDMFT exhibits larger $N_{ph}$ until a coupling strength of $g/\omega \approx$ […]

Figure 6.12: We show the differences in the electronic density of the H$_2$ molecule for 3 different bond lengths $d$ (as examples of the dissociation) for $g/\omega =$ […] $g/\omega = 0$, calculated exactly ($\rho^{ex}_{g/\omega}(x)$, left) and with dressed RDMFT ($\rho^{dR}_{g/\omega}(x)$, right). We see that for small $d$, the cavity mode reduces the electronic repulsion and localizes the charges at the bond center ($d =$ […] $< d_{eq} =$ […]) […] $d$, the electronic repulsion is locally enhanced such that the charge deviations are separated in two peaks ($d =$ […]) […] $d$, this interplay between local suppression and enhancement of repulsion becomes more pronounced ($d =$ […]).

[…] can be well modelled in a quasi-static picture with the potential

$$v_d(x) = -\frac{1}{\sqrt{(x-d)^2 + […]}} - \frac{1}{\sqrt{(x+d)^2 + […]}}$$

[…] $d$. We employed this model already in Sec. 2.5.

In Fig. 6.12, we see the density of two H atoms under variation of the distance $d$ with and without the (strong) coupling to the cavity. We see that the influence of the cavity mode strongly depends on the exact electronic structure. The interaction with the cavity mode can locally reduce or enhance the electronic repulsion due to the Coulomb interaction, where the exact interplay between both effects depends on the interatomic distance. Thus, we can observe a number of different effects like pure localization of the density toward the center of charge ($d = 1$) or localization combined with a local enhancement of repulsion such that the density deviations exhibit a double-peak structure ($d =$ […] $d =$ […] $g/\omega =$ […] $g/\omega =$ […]). However, as every observable depends on the density, such deviations are significant. Remarkably, dressed RDMFT reproduces the effects very accurately.

Figure 6.13 (panels: He, Be): We show the differences in the electronic density ($\rho_{g/\omega}(x)$) of He (left) and Be (right) for 3 different coupling strengths compared to the atoms outside the cavity (insets), calculated with dressed RDMFT. We see that the effect of the cavity is very different for both systems: the strong localization of the electronic density for He indicates the suppression of electronic repulsion for all coupling strengths. For Be instead, we additionally see a local enhancement of the repulsion. The interplay of enhancement and suppression changes with increasing coupling strength.

6.4. POLARITONIC-STRUCTURE METHODS: SOME FIRST RESULTS

In the next example, we compare the behavior of the He and Be atoms under the influence of the cavity. The shapes of the electronic density of the two bare systems are very similar (see insets in Fig. 6.13). Since we know that the density determines a system uniquely, one could naively expect that both atoms behave qualitatively similarly. However, it is well known that He is a noble gas, which has a very low chemical reactivity and is found in nature almost exclusively as an atomic gas. Beryllium instead occurs naturally only in combination with other elements. It is, for example, contained in emeralds [277]. In electronic-structure theory, beryllium can be described extremely accurately [278] because of its few electrons, and it has been studied in many different scenarios. However, from a (cavity-QED) model perspective, atoms are just energy levels that interact with the photon field, and all the electronic structure that contributes to the specific energy is not represented.
Accordingly, these (many-body) energy levels are assumed to remain constant even if they interact strongly with the cavity.

Having this in mind, let us compare the response of our one-dimensional Be and He atoms to the interaction with the cavity mode. We see in Fig. 6.13 that they actually behave very differently under the influence of the cavity. The electronic density of He is pushed toward its center of charge with increasing coupling strength, which can be understood as a suppression of the electronic repulsion related to the Coulomb interaction. As He can be understood very well with only one orbital, this is to be expected. Things change for Be, where we have several dominant orbitals. With increasing coupling strength, we see, as in the dissociation example, a subtle interplay between suppression and local enhancement of the electronic repulsion that depends on the coupling strength. Thus, for the same coupling strength, we can observe opposite […] ($g/\omega =$ […] $g/\omega =$ […])

Note: Although the one-dimensional setting is very different, there are still many traces of this different behavior, as our results will show.

Note: For $g/\omega =$ […], $n =$ […], $n =$ […]

In our last example, we leave real space and go back to our site-based model implementation, which allows for the study of four-electron systems with arbitrary photon frequencies $\omega$. As we have discussed before, $\omega$ is not very important for polaritonic ground states. Of course, different $\omega$ will lead to different results, but qualitatively we can observe the same physics. Since we do not try to study specific experimental setups, we can just use $\omega$ as a kind of “convergence” parameter, as for the Be atom in the last sections. However, we want to show with this last example that our method is capable of producing results for systems that are not accessible by exact methods. Specifically, we consider a box length of 30 $a_{bohr}$, which corresponds to an electronic basis of $B_m = 30$ or to a real-space grid from $x = -$[…] $a_{bohr}$ to $x =$ […] $a_{bohr}$. We consider the case of a 2-electron and a 4-electron system, respectively, and set $\omega =$ […] $E_h$, which is far away from the regime where the fermion ansatz is valid. Thus the right hybrid statistics of the polaritons are crucial. We consider again $B_{ph} = 5$, for which all the results are converged. This system might seem quite small, but it is still practically inaccessible by exact diagonalization. The corresponding many-body space for 4 particles has a dimension of $(2B_s)^N \cdot B_{ph} =$ […]. Only highly optimized methods on a high-performance cluster might still be able to explore such a configuration space. In contrast, all the calculations that are presented in the following have been performed on a laptop, although the code is by no means optimized.

In this system, we employ polaritonic HF to study the effect of electron-photon coupling versus electron localization. We consider a matter system with a local potential $v(x) = N/\sqrt{x^2 + \epsilon^2}$, which represents a potential well that is deep (shallow) for small (large) $\epsilon$. The softening parameter $\epsilon$ thus represents the level of confinement of the potential $v(x)$, which is depicted in green (for various values of $\epsilon$) in Fig. 6.14. Note that we need the large simulation box to reduce boundary effects, which also represent a form of confinement.

Let us first consider how the electronic ground-state density changes when the coupling and the localization are varied. To facilitate the comparison between the $N = 2$ and $N = 4$ cases, […] $\rho(x)/N$, where $\rho = \gamma_e(x,x)$ is the diagonal of the electronic 1RDM (in blue), and the normalized confinement $v(x)/N$ (in green). In Fig. 6.14 (a) we show the uncoupled 2-particle case and in (b) we use $g/\omega =$ […] $\epsilon$. In Fig. 6.14 (c) and (d) we show the same plots for the 4-particle case. In both cases we see that for strongly confined electrons, i.e., for small values of $\epsilon$, the influence of the strong light-matter coupling on the density is negligible. This is in agreement with the usual assumption underlying, e.g., the Jaynes–Cummings model, that the ground state for atomic systems is only slightly affected by coupling to the photons of a cavity mode.
Much higher coupling strengths would need to be employed in order to see a sizeable effect for strong localization. In contrast, once we lift the confinement and the electrons get delocalized, the influence of the light-matter coupling becomes appreciable. The induced changes are not uniform but depend on the details of the electronic structure, i.e., in the $N =$ […] $N =$ […]

Figure 6.14: […] $\rho(x)/N$ (blue) with corresponding local potentials $v(x)/N = 1/\sqrt{x^2 + \epsilon^2}$ (green, rescaled by a factor of 0.25 for better visibility) for a series of softening parameters $\epsilon$ of the 2- (upper row) and 4-electron (lower row) system and for coupling strength $g/\omega =$ […] $g/\omega =$ […] $\epsilon$. Note that the legends only depict three example lines (the smallest, largest and middle values of $\epsilon$), respectively.

To make these observations more quantitative, we display in Fig. 6.15 the normalized changes in the electronic 1RDMs depending on the confinement and the coupling strength, i.e., $\Delta\gamma_e = \| \gamma^e_{g/\omega,\epsilon} - \gamma^e_{g/\omega=0,\epsilon} \| / N$, in panels (a) and (d) for the 2-particle case and the 4-particle case, respectively. Also, we show the photon number $N_{ph}$ in the ground state as a function of confinement and coupling strength in (b) and (e) for the 2-particle case and the 4-particle case, respectively. As a third quantity we consider $\Delta n_e = \sum_i | n^e_{i,g/\omega,\epsilon} - n^e_{i,g/\omega=0,\epsilon} |$, where the $n_i$ are the natural occupation numbers. For the zero-coupling case they are all either zero or one, which corresponds to a single Slater determinant in the electronic subspace. If they are between zero and one, they indicate a correlated (multi-determinantal) electronic state.
Therefore $\Delta n_e$ measures the photon-induced correlations and also highlights that although polaritonic HF is a single-determinant method in the polaritonic space, for the electronic system it is a correlated (multi-determinantal) method. For both the 2- and the 4-particle case, we find consistently that the more delocalized the uncoupled matter system is, the more strongly the coupling modifies the ground state. Although this effect depends on the details of the electronic structure, as we saw in Fig. 6.14, the plots of Fig. 6.15 indicate that this behavior is quite generic. The reason is that within a small energy range many states with different electronic configurations are available, as opposed to a strongly bound (and hence energetically separated) ground-state wave function (see Fig. 6.16). A glance at the correlation measure $\Delta n_e$ in panels (c) and (f) of Fig. 6.15 strengthens this explanation: for large $\epsilon$ and $g/\omega$ the electronic correlation is strongest, and thus many electronic configurations contribute to the states of this parameter regime. This also indicates that the effective one-body description of the ground state of many cavity-QED models might be inaccurate in this regime. Additionally, we observe that the maximal values of $\Delta\gamma_e$ in the 2-particle case are slightly larger than in the 4-particle case, which again is related to the different electronic structures of the two systems.

For the photon numbers, however, which are depicted in panels (b) and (e) of Fig. 6.15, we see that in general the number of photons is larger in the 4-particle case. This is due to the simple reason that the more charge we have, the more photons are created. Nevertheless, the number of photons does not just double (as expected from a simple linear relation) but is almost three times higher. This highlights the nonlinear regime of electron-photon coupling that we consider here.
Again, the number of photons also increases with the delocalization, and hence the parameter $\epsilon$ is a very decisive quantity. All these results point toward an interesting parameter in the context of strong light-matter coupling: the localization of the matter wave function. In agreement with a recent case study for simple 2-particle problems [26], systems that are less confined react much more strongly to a cavity mode.

Figure 6.15 (panels (a)-(f)): The plots show key quantities of the model system as a function of the coupling strength $g/\omega$ and the localization parameter $\epsilon$ for the 2-electron (upper row) and 4-electron (lower row) case. In the first column, we depict the normalized deviation $\Delta\gamma_e = \| \gamma^e_{g/\omega,\epsilon} - \gamma^e_{g/\omega=0,\epsilon} \| / N$ of the electronic 1RDM from a reference for the same $\epsilon$ and $g/\omega = 0$ […] $N_{ph}$ (green), and in the third column, the total deviation of the electronic natural occupation numbers $\Delta n_e = \sum_i | n^e_{i,g/\omega,\epsilon} - n^e_{i,g/\omega=0,\epsilon} |$ is displayed. This is a measure of the induced electron-photon correlation. We observe in all cases that when the bare matter wave function becomes more delocalized ($\epsilon$ larger), the modifications due to the matter-photon coupling become stronger.

Figure 6.16: The first four eigenenergies $e_i$ of the electronic one-body equation $[-\tfrac{1}{2}\partial_x^2 - v(x)]\psi_i = e_i \psi_i$ for $N =$ […] $\epsilon$. We see that the energies approach each other with increasing $\epsilon$ and thus decreasing confinement. This means that for a fixed coupling strength, the less confined the system is, the more states are available in a small energy range for photon-induced modifications of the ground state.
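The two deviation measures $\Delta\gamma_e$ and $\Delta n_e$ used in Fig. 6.15 can be sketched as follows. The text does not specify which matrix norm is used for $\Delta\gamma_e$; the Frobenius norm below is an assumption made for illustration, and the function names and the example numbers are hypothetical.

```python
import math


def delta_gamma(gamma, gamma_ref, n_el):
    """Norm of the 1RDM difference divided by N (Frobenius norm assumed)."""
    squared = sum(abs(a - b) ** 2
                  for row, row_ref in zip(gamma, gamma_ref)
                  for a, b in zip(row, row_ref))
    return math.sqrt(squared) / n_el


def delta_occupations(occs, occs_ref):
    """Sum of absolute changes of the natural occupation numbers."""
    return sum(abs(a - b) for a, b in zip(occs, occs_ref))


if __name__ == "__main__":
    # Hypothetical numbers: a single-determinant reference (occupations 0 or 1)
    # and a weakly correlated state at finite coupling.
    reference = [1.0, 1.0, 0.0, 0.0]
    coupled = [0.98, 0.97, 0.03, 0.02]
    print(delta_occupations(coupled, reference))
```

A vanishing `delta_occupations` means that the electronic state is still a single Slater determinant, whereas any fractional occupation at finite coupling increases the measure.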
6.5. SUMMARY OF THE RESULTS

Let us conclude that although we have only regarded matter systems in one spatial dimension with at most four electrons, we have made some interesting observations.

Specifically, we observed in the first sections (Sec. 6.4.2 and Sec. 6.4.3) an intricate interplay between Coulomb-induced and photon-induced correlations: in some cases the Coulomb repulsion was suppressed, but in others it was effectively enhanced due to the photon interaction. Although the changes were seemingly small, one should keep in mind that the correlation contribution to the many-body energy is typically small but nevertheless plays a major role for many important phenomena such as high-temperature superconductivity [81]. That this observation transfers to the coupled electron-photon case is highly probable. For instance, Schäfer et al. [125] have shown some possible implications explicitly for a one-dimensional system coupled to a cavity mode. Whether these modifications of the underlying electronic structure are indeed a major player in the changes of chemical and physical properties still needs to be seen. However, capturing such modifications in the first place (and studying their influence) clearly requires a first-principles theory that is able to treat both types of (strong) correlations accurately and is predictive inside as well as outside of a cavity. We have shown here that polaritonic HF and polaritonic RDMFT are viable options to predict and analyze these intricate structural changes. We want to stress again that polaritonic HF explicitly accounts for electronic correlation and thus goes beyond standard electronic HF.

Besides the interplay between the two types of interaction in coupled light-matter systems, we identified an interesting parameter that especially influences the electron-photon interaction: the localization of the matter wave function.
We saw that delocalized wave functions respond considerably more strongly to the perturbation by the photon mode than localized ones, an effect that is very difficult to capture with cavity-QED models. This suggests that if we want to observe genuine modifications of the ground state due to strong light-matter coupling, we should consider matter systems that have a spatially extended wave function. One way would be large molecular or solid-state systems, the other way would be an ensemble of emitters. In the latter case the strong influence would lead to local changes, in contrast to the Dicke-like description of collective strong coupling, where we have N independent replicas of the same, perfectly localized system. In both cases, it seems plausible that there are strong modifications of the electronic structure even if there is only coupling to the vacuum of a cavity. Modifications of chemistry by merely the vacuum therefore seem to be feasible. At the same time an interesting perspective with respect to collective strong coupling arises: Maybe it is not the simple Dicke-type collectivity [79], where an excitation is delocalized over many replicas, that drives the changes in chemistry, but rather a genuine cavity-photon mediated spatial delocalization of the ensemble wave function. To answer this question, further investigations, especially for realistic three-dimensional systems and ensembles including the Coulomb interaction, have to be conducted. Polaritonic first-principles methods as introduced in this work seem specifically well-suited to answer these interesting and fundamental questions.
PART III
NUMERICS
The subject of optimization is a fascinating blend of heuristics and rigour, of theory and experiment. It can be studied as a branch of pure mathematics, yet has applications in almost every branch of science and technology. (R. Fletcher, 1987 [279])
The last part of this thesis highlights the price we pay to make first-principles methods directly applicable to strong matter-photon problems: we need to develop new algorithms as well as to validate the accuracy of well-established algorithms applied to this new setting. Furthermore, this part clarifies how numerical necessity dictates the choice of the particular algorithms that we have presented in part II.
Before we start the discussion, we want to remark on the role of numerics as a nowadays indispensable part of research in physics. This is true in all fields, but in particular in first-principles theory, which is impossible to do without the development of efficient algorithms and modern advances in computer technology. Nowadays, everybody with a minimal knowledge of scripting can use standard implementations to determine, e.g., the equilibrium structure of a molecule. Such a service is without equal, and it is only possible because of the concerted efforts of mathematicians, physicists and software engineers who have painstakingly validated the numerical implementations. Also for our implementations, we have performed exhaustive testing procedures, which are presented in detail in appendix C. We are therefore confident that the available codes are as reliable as other standard implementations.
Chapter 7
RDMFT IN REAL SPACE
Before we come to the coupled problems, we need to explain the matter-only case, which is the basis for polaritonic-structure methods. We therefore discuss the two RDMFT algorithms of the electronic-structure code OCTOPUS. The first one is a standard (and the default) method [271] and the second one was built specifically to investigate the dressed orbitals (but is also useful for matter-only calculations) [3, Ch. 14].
Let us briefly explain the difference between the two methods. The RDMFT equations (Sec. 2.4.2) consist of two coupled sets of equations for the natural orbitals and the natural occupation numbers that have to be solved self-consistently. Thus, usual algorithms consist of two major parts: the occupation number optimization and the orbital optimization. The default solver for the latter optimization problem cannot use the full flexibility of the real-space grid but requires a two-step procedure. In the first step, one has to perform a preliminary calculation on the real-space grid that generates a basis set, for which Hamiltonian matrix elements are computed in the second step. The code then performs the actual optimization by orbital rotations within this basis set (see Ref. [271]). Thus, the flexibility of the underlying real-space grid is only present in the preliminarily generated basis. In principle, this flexibility could be exploited to adapt the basis to specific scenarios and thereby to overcome the limitations of usual quantum-chemistry basis sets (see Sec. 2.1.1). However, it is very difficult to construct a generic algorithm for this purpose. Currently, there is only one method to generate a basis, namely a preliminary calculation with a less expensive theory level such as KS-DFT or even independent particles, which just solves the non-interacting Schrödinger equation (see Sec. 2.1.2). Such basis sets are usually considerably less efficient than optimized quantum-chemistry basis sets [3, Ch. 14] and they have further shortcomings when we generalize the description to dressed orbitals (see Sec.
8).For this reason, we have implemented and validated a new orbital optimization method that is a gener-alization of the conjugate-gradients algorithm by Payne et al. [267]. This algorithm uses the full flexibilityof the grid and needs considerably less natural orbitals for a converged result than the default solver [3].However, RDMFT calculations with the conjugate gradients algorithm are numerically considerably moreexpensive than with the orbital-based algorithm. Thus, it is useful to have the choice between both algo-rithms, which our implementation provides. In this section, we discuss a general strategy to solve an RDMFT minimization problem that is independentof the details of the orbital handling. The algorithm assumes spin-restriction, i.e., the electron-number N iseven and all B > N electronic orbitals can be doubly occupied (see the discussion around Eq. (2.22)). Details on the code can be found on the webpage: https://octopus-code.org/ . As discussed in Sec. 2.4.2, all known RDMFT energy functionals (besides the trivial HF functional) are notexplicitly given in terms of the 1RDM, but can only be expressed with respect to the natural orbitals { φ i ( r )}and occupation numbers { n i }. The generic RDMFT functional reads in this representation (cf. Eq. (2.103)) E [{ φ i },{ n i }] = B (cid:88) i = n i (cid:90) d r φ ∗ i ( r ) h ( r ) φ i ( r ) + B (cid:88) i , j = n i n j (cid:90) d r d r (cid:48) (cid:175)(cid:175) φ i ( r ) (cid:175)(cid:175) (cid:175)(cid:175) φ j ( r (cid:48) ) (cid:175)(cid:175) w ( r , r (cid:48) ) + E xc [{ φ i },{ n i }], (7.1)where h ( r ) = − m ∇ + v ( r ) is the one-body kernel, w ( r , r (cid:48) ) is the two-body interaction kernel and E xc [{ φ i },{ n i }]is the unknown exchange-correlation part of the functional. In the current state of the O CTOPUS imple-mentation, there are two exchange-correlation functionals available: the HF exchange-only and the Müllerfunctional [227] (cf. Eq. 
(2.102)) E xc [{ φ i },{ n i }] = E M [{ φ i },{ n i }] ≡ − B (cid:88) i , j = (cid:112) n i n j (cid:90) d r d r (cid:48) φ ∗ i ( r ) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ i ( r (cid:48) ) φ j ( r ). (7.2)In the following, we will present all equations explicitly for the Müller functional. To obtain the equivalentexpressions in terms of the HF functional, one simply has to replace (cid:112) n i n j → n i n j in Eq. (7.2). We also willassume spin-restriction, which is the standard case in RDMFT. The goal of the RDMFT algorithm is to minimize the energy with respect to the natural orbitals and naturaloccupation numbers, respecting the following three additional constraints:0 ≤ n i ≤ ∀ i (7.3a) S [{ n i }] = B (cid:88) i n i − N = C [{ φ i },{ φ j }] = (cid:90) d d r φ ∗ i ( r ) φ j ( r ) − δ i j = N -representability constraints of the (spin-summed) 1RDM, and thelast line guarantees the orthonormality of the natural orbitals. Since the first condition is simple in thenatural orbital basis, it is enforced with the help of an auxiliary construction: We set n i = (2 πθ i ) ∀ i , (7.4) Note that in practice this leads to integer occupation numbers [224] and therefore, the code simply fixes n = ... = n N = n N + = n N + = ... = Note that treating systems that are not spin-saturated is very challenging for many electronic-structure methods, but in particularfor RDMFT. θ i instead of the n i directly. The second and the third conditions are imposedby the Lagrange multiplier technique. For that, we introduce the Lagrangian L [{ φ i },{ θ i };{ (cid:178) i j }, µ ] = E [{ φ i },{ n i }] − µ S [{ θ i }] − (cid:88) i , j (cid:178) i j C [{ φ i },{ φ j }], (7.5)with the Langrange multipliers µ and (cid:178) i j . The extrema of E under the constraints S and C are then foundat stationary points of L, i.e., for δ L =
0. (7.6)The condition (7.6) leads to three sets of independent equations (cf. Eq. (2.109)), i.e..0 = ∂ E ∂ n i − µ (7.7a)0 = δ E δφ ∗ i ( r ) − B (cid:88) k = (cid:178) ki φ k ( r ) (7.7b)0 = δ E δφ i ( r ) − B (cid:88) k = (cid:178) ik φ ∗ k ( r ) (7.7c)for all i . The first set of equations, cf. (7.7a) defines the gradient with respect to n i ∂ E ∂ n i = (cid:90) d r φ ∗ i ( r ) h ( r ) φ i ( r ) + B (cid:88) j = n j (cid:90) d r d r (cid:48) φ ∗ i ( r ) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ j ( r (cid:48) ) φ i ( r ) − B (cid:88) j = (cid:113) n j n i (cid:90) d r d r (cid:48) φ ∗ i ( r ) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ j ( r ) φ i ( r (cid:48) ) ≡ (cid:90) d r φ ∗ i ( r ) h ( r ) φ i ( r ) + (cid:90) d r ρ i ( r ) v H ( r ) − (cid:112) n i (cid:90) d r φ ∗ i ( r ) v iXC ( r ), (7.8)where we defined the orbital density ρ i ( r ) = φ ∗ i ( r ) φ i ( r ) = | φ i | ( r ), (7.9)the Hartree-potential v H ( r ) = B (cid:88) j = n j d r (cid:48) ρ j ( r (cid:48) ) w ( r , r (cid:48) ), (7.10)and the exchange-correlation potential of orbital i v iXC ( r ) = B (cid:88) j = (cid:112) n j (cid:90) d r (cid:48) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ j ( r ) φ i ( r (cid:48) ). (7.11) Note that since the Müller functional is convex [161], the stationarity condition is sufficient to determine the global minimum of E . δ E δφ ∗ i ( r ) = n i h ( r ) φ i ( r ) + n i v H ( r ) φ i ( r ) − (cid:112) n i v iXC ( r ) (7.12a) δ E δφ i ( r ) = n i φ ∗ i ( r ) h ( r ) + n i φ ∗ i ( r ) v H ( r ) − (cid:112) n i ( v iXC ) ∗ ( r ). (7.12b) We have now derived and discussed all the necessary ingredients to define a generic RDMFT minimizationalgorithm that solves Eq. (7.7a) and Eq. (7.7b) (or alternatively Eq. (7.7c)) self-consistently. The structure ofthis minimization problem suggests the separation into thea (natural) occupation number optimization ,where we solve Eq. 
(7.7a) for fixed { φ i },{ (cid:178) i j } and theb (natural) orbital optimization ,where we solve Eq. (7.7b) for fixed { n i }, µ .Both optimizations have to be performed alternately within an outer loop until self-consistence.For our convenience in the following, we rewrite the energy functional with the Müller approximation(7.2) as E [{ φ i },{ n i }] = B (cid:88) i = n i 〈 φ i | h φ i 〉 + B (cid:88) i , j = n i n j 〈 φ i φ i | w | φ j φ j 〉 − B (cid:88) i , j = (cid:112) n i n j 〈 φ i φ j | w | φ j φ i 〉 , (7.13)where we introduced the abbreviations for one-body integrals 〈 φ i | h φ i 〉 = (cid:82) d r φ ∗ i ( r ) h ( r ) φ i ( r ) and two-bodyintegrals 〈 φ i φ j | w | φ k φ l 〉 = (cid:82) d r d r (cid:48) w ( r , r (cid:48) ) φ ∗ i ( r (cid:48) ) φ j ( r (cid:48) ) φ ∗ k ( r ) φ l ( r ). Additionally we subsume φ = { φ ,..., φ B }and n = ( n ,..., n B ). We now summarize the algorithm in pseudocode form [280]. Algorithm 1. (RDMFT)Set B.Initialize φ (depends on the orbital algorithm).Initialize n , µ . (see Sec. 7.1.4)Calculate the initial matrix elements h ii = 〈 φ i | h φ i 〉 , w i jkl = 〈 φ i φ j | w | φ k φ l 〉 . for l=1,2,...(a) Occupation number optimization (algorithm 2)Solve Eq. (7.7a) for fixed { φ i },{ (cid:178) i j } self-consistently with h l − ii , w l − i jkl for n l , µ l (b) Orbital optimization (algorithm 3 or 5)Solve Eq. (7.7b) for fixed { n i }, µ self-consistently for φ l Calculate h lii , w li jkl break if convergence criterion is fulfilled (depends on the orbital algorithm) end (for) We now explain part (a) of algorithm 1, i.e., the occupation number optimization. This is independent of thetype of orbital optimization routine that we employ for (b) and thus part of the general algorithm.We start by combining Eq. (7.7a) and Eq. 
(7.8) in one set of equations that we aim to solve, i.e., µ = ∂ E ∂ n i = (cid:90) d r φ ∗ i ( r ) h ( r ) φ i ( r ) + (cid:90) d r ρ i ( r ) v H ( r ) − (cid:112) n i (cid:90) d r φ ∗ i ( r ) v iXC ( r ) =〈 φ i | h φ i 〉 + B (cid:88) j n j 〈 φ i φ i | w | φ j φ j 〉 − B (cid:88) j (cid:113) n j / n i 〈 φ i φ j | w | φ j φ i 〉 . (7.14)Instead of directly solving Eq. (7.14) for the set n , µ , the algorithm performs a two-step procedure. For that,we define the auxiliary function S ( µ ) = min n L ( n ; µ ) (7.15)with the “occupation Lagrangian” L ( n ; µ ) = E [ n ] − µ (cid:195) B (cid:88) i = n i − N (cid:33) (7.16)where E [ n ] is the energy functional (7.13) with fixed orbitals. The advantage of this construction is that wecan solve Eq. (7.15) with a standard (unconstrained) minimization algorithm. Since we find a set of optimal n ∗ for every given µ , we can label n ∗ = n ∗ ( µ ). To find the µ = µ ∗ that extremalizes ˜ S , we use then the sidecondition S ( µ ) = (cid:195) B (cid:88) i = n ∗ i ( µ ) − N (cid:33) . (7.17)Specifically, we solve S ( µ ) = Algorithm 2. (Occupation number optimization)Set the convergence criterion (cid:178) µ > .Find an initial interval [ µ , µ ] that satisfies S ( µ ) < and S ( µ > . for k=1,2,...i Calculate the interval center µ k − m = µ k − + µ k − ,calculate ˜ S ( µ k − m ) andcheck the side condition S ( µ k − m ) .ii Set the interval for the next iteration by the prescriptionS ( µ k − m ) S ( µ k − ) < → µ k = µ k − , µ k → µ k − m S ( µ k − m ) S ( µ k − ) ≥ → µ k → µ k − m , µ k = µ k − break if min( S ( µ ), | µ − µ | /2) < (cid:178) µ . end (for) We want to remark that algorithm 2 is still not ready to be implemented as it is written here. For instance,we need yet another small algorithm to determine a good guess for the initial interval and take care thattoo small occupation numbers are discarded in the optimization since they occur in the denominator inEq. (7.14). We refer the reader for these minor details to the O
CTOPUS code directly.
We see here impressively how solving even a comparatively simple equation such as (7.14) may require an involved algorithm. The reason for this is its nonlinearity. There are basically no “safe” algorithms for nonlinear equations, for which, e.g., convergence can be guaranteed, and debugging nonlinear solvers is thus always a very delicate matter. Since we had to debug the occupation number optimization during the implementation of the dressed orbitals in OCTOPUS, we can provide a concrete example for the latter statement, which also serves well to visualize the algorithm. We present this in Sec. B.
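The two-step structure of algorithm 2 (an inner minimization over the occupations at fixed µ, wrapped in a bisection on µ) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the OCTOPUS implementation: it uses a hypothetical toy functional E[n] = Σ_i (h_i n_i − k_i √n_i) with fixed numbers h_i and k_i > 0, which mimics the square-root dependence of the Müller functional but decouples the orbitals, so that the inner minimization can be done in closed form. The upper bound of 2 on the occupations corresponds to the spin-restricted case; all function names and the simple bracketing strategy are assumptions made for this example.

```python
import math


def optimal_occupations(mu, h, k, n_max=2.0):
    """Minimise L(n; mu) for the toy functional at fixed mu (closed form)."""
    occupations = []
    for h_i, k_i in zip(h, k):
        gap = h_i - mu
        if gap <= 0.0:
            # L decreases monotonically in n_i, so the upper bound is optimal.
            occupations.append(n_max)
        else:
            # Stationarity: gap - k_i / (2 sqrt(n_i)) = 0, clipped to n_max.
            occupations.append(min(n_max, (k_i / (2.0 * gap)) ** 2))
    return occupations


def side_condition(mu, h, k, n_electrons):
    """S(mu) = sum_i n_i*(mu) - N, the particle-number condition."""
    return sum(optimal_occupations(mu, h, k)) - n_electrons


def optimise_occupations(h, k, n_electrons, tol=1e-10, max_iter=200):
    """Bisection on mu until S(mu) = 0 is satisfied within tol."""
    # Upper bracket: all orbitals fully occupied, so S > 0 if 2B > N.
    mu_hi = max(h)
    # Lower bracket: decrease mu until too few electrons are present.
    mu_lo = min(h) - 1.0
    while side_condition(mu_lo, h, k, n_electrons) >= 0.0:
        mu_lo -= 1.0
    mu_mid = 0.5 * (mu_lo + mu_hi)
    for _ in range(max_iter):
        mu_mid = 0.5 * (mu_lo + mu_hi)
        s_mid = side_condition(mu_mid, h, k, n_electrons)
        if abs(s_mid) < tol or 0.5 * (mu_hi - mu_lo) < tol:
            break
        # Keep the half interval in which S changes sign.
        if s_mid * side_condition(mu_lo, h, k, n_electrons) < 0.0:
            mu_hi = mu_mid
        else:
            mu_lo = mu_mid
    return optimal_occupations(mu_mid, h, k), mu_mid


if __name__ == "__main__":
    occ, mu = optimise_occupations([-1.0, -0.5, 0.2, 0.8],
                                   [0.6, 0.5, 0.4, 0.3], 2.0)
    print("mu =", mu, "occupations =", occ, "sum =", sum(occ))
```

In the toy model the occupations vanish like k_i² / (4 (h_i − µ)²) for weakly occupied orbitals, which also shows why very small occupation numbers need special care in the real functional, where they appear in a denominator.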
In this section, we present the default RDMFT algorithm of OCTOPUS that is publicly available from version 10.0 onwards [271, 3]. The algorithm has been introduced by Piris and Ugalde [158] for orbital-based codes and is adapted to fit also the real-space setting.
Before we lay out the algorithm (Sec. 7.2.2), we start this section with a brief discussion about the special challenges of the orbital minimization in RDMFT.
The difference of RDMFT in comparison to single-reference methods: the single-particle Hamiltonian is not hermitian
In single-reference methods, the orbital gradients of Eq. (7.12) would define effective one-particle Hamilto-nians, which are hermitian. We have discussed in Sec. 2.4.2 that this is not the case in RDMFT, which hasstrong implications for the numerical minimization. To see this, let us briefly recapitulate the HF gradient182.2. ORBITAL-BASED RDMFT IN REAL SPACEequation that defines the Fock operator (cf. Eq. (2.45)). In the here employed notation, the HF gradient reads δ E HF δφ ∗ i ( r ) = ˆ H HF φ i ( r ) = n i h ( r ) φ i ( r ) + n i v H ( r ) φ i ( r ) − n i v iX ( r ),where v iX ( r ) = (cid:80) Bj = n j (cid:82) d r (cid:48) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ j ( r ) φ i ( r (cid:48) ) is the “exchange-only” version of Eq. (7.11). The sta-tionarity condition leads then to the N coupled equations (cf. Eq. (2.46))ˆ H HF φ i ( r ) = N (cid:88) k = (cid:178) ki φ k ( r ) i = N , (7.19)which we can transform to a nonlinear eigenvalue equation by a division by n i and a projection on theorbital φ ∗ l , i.e., (cid:178) li / n i = (cid:90) d r φ ∗ l ˆ H HF φ i ( r ) = (cid:90) d r φ ∗ l h ( r ) φ i ( r ) + (cid:90) d r φ ∗ l v H ( r ) φ i ( r ) − (cid:90) d r φ ∗ l v iX ( r ) ≡ ( ˆ H HF ) il Crucially, the matrix-form of the single-particle Hamiltonian on the right-hand side is hermitian( ˆ H HF ) il = ( ˆ H ) ∗ li . (7.20)This is obvious for the one-body and the Hartree part. 
However, also for the exchange term ( ˆ H HF , X ) il = (cid:82) d r φ ∗ k v iX ( r ) we find ( ˆ H HF , X ) ∗ li = (cid:181)(cid:90) d r φ ∗ i ( r ) v lX ( r ) (cid:182) ∗ = (cid:195) B (cid:88) j = n j (cid:90) d r d r (cid:48) φ ∗ i ( r ) φ ∗ j ( r (cid:48) ) w ( r , r (cid:48) ) φ j ( r ) φ l ( r (cid:48) ) (cid:33) ∗ = B (cid:88) j = n j (cid:90) d r d r (cid:48) φ ∗ l ( r (cid:48) ) φ ∗ j ( r ) w ( r , r (cid:48) ) φ j ( r (cid:48) ) φ i ( r ) = (cid:90) d r (cid:48) φ ∗ l ( r (cid:48) ) B (cid:88) j = n j (cid:90) d r φ ∗ j ( r ) w ( r , r (cid:48) ) φ j ( r (cid:48) ) φ i ( r ) = (cid:90) d r (cid:48) φ ∗ l ( r (cid:48) ) v iX ( r (cid:48) ) = ( ˆ H HF , X ) ik . (7.21)Solving the HF equations, cf. Eq. (7.19), is thus equivalent to diagonalizing the Fock-matrix ( ˆ H HF ) li self-consistently: we start with a guess for the orbitals { φ i }, calculate ( ˆ H HF ) li [{ φ i }] and diagonalize it to obtainnew orbitals { φ i } for the next iteration. This procedure is repeated until some convergence criterion is sat-isfied (see also Sec. 2.2). In a similar way, we can solve the equations of other single-reference methods, e.g.,the KS equations in KS-DFT. Note that in standard text books, e.g., Ref. [132, 142] the occupations are usually neglected, since they are fixed. We explicitly includethe n i here to highlight the difference to RDMFT. n i , which appear linear in the gradient equation. For an RDMFTdescription beyond HF, this is not possible, because the exchange-correlation functional must depend non-linearly on the occupation numbers [224]. We can see this explicitly in Eq. (7.12) for the Müller functional.We therefore cannot reformulate the RDMFT equations as an eigenvalue problem and consequently, wehave to employ also different algorithms for the RDMFT orbital minimization. An effective eigenvalue equation for the RDMFT minimization
Nevertheless, we can define the one-body Hamiltonian from the gradient equationˆ H φ i ( r ) ≡ δ E δφ ∗ i ( r ) = n i h ( r ) φ i ( r ) + n i v H ( r ) φ i ( r ) − (cid:112) n i v iXC ( r ), (7.22)but it is not a hermitian operator (see also Sec. 2.4.2). However, it is possible to manipulate the orbitalequations, cf. Eq. (7.12) such that we can solve an equivalent auxiliary eigenvalue problem [219, 158]. TheRDMFT routine of O
CTOPUS employs such an auxiliary construction, which specifically was introduced byPiris and Ugalde [158] and which we want to briefly present in the following.We start by noting that in order to solve Eq. (7.12), we have to consider the full Langrange-multipliermatrix (cid:178) i j , since we cannot diagonalize ˆ H . Although ( ˆ H ) i j is not hermitian, the matrix (cid:178) i j that fulfils thestationarity condition is hermitian. To show this, we first note that the two energy gradients with the sameindex (cid:181) δ E δφ i ( r ) (cid:182) ∗ = δ E δφ ∗ i ( r ) (7.23)are connected by complex conjugation. Comparing now the two gradient equations (7.7b) and (7.7c), wehave for all i B (cid:88) k = (cid:178) ik φ ∗ k ( r ) (7.7c) = δ E δφ i ( r ) = (cid:195) δ E δφ ∗ i ( r ) (cid:33) ∗ (7.7b) = B (cid:88) k = (cid:178) ∗ ki φ ∗ k ( r ). One might think now that this is a technicality and that there should be a way to “cure” this unusual behavior by, e.g., a more generaldefinition of the gradient. Indeed, Pernal [219] investigated this and connected issues quite thoroughly considering a generic RDMFTfunctional. She showed there that the problem is directly connected to a fundamental property that most common RDMFT functionalshave: they depend only implicitly on the 1RDM, i.e., we can only express their functional form explicitly in terms of natural orbitalsand natural occupation numbers. One way to explain this is that the eigenset of the 1RDM is unique besides possible rotations in thespace of orbitals that have an identical occupation number. Thus, we cannot apply a unitary transformation to find the eigenbasis of the one-body Hamiltonian. In a single-reference theory instead, the 1RDM is idempotent , i.e., all eigenvalues are one (or two in thespin-restricted case) or zero, respectively and hence such a transformation can be applied within the occupied space, which is the onlyrelevant for the one-body Hamiltonian. 
Note that at the stationary point also ( ψ ∗ ) ∗ = φ must hold, although we treat φ and φ ∗ in principle as independent. φ k are all orthogonal, we have proven that (cid:178) ik = (cid:178) ∗ ki (7.24)is symmetric at the stationary point, where δ L =
0. This suggests to define the matrix˜ F ki = (cid:178) ik − (cid:178) ∗ ki ,that must be zero at the stationary point. We therefore have˜ F ki = ⇐⇒ δ L = (cid:178) ki by projecting one of the gradient equations, say (7.7b) on orbital φ ∗ k , (cid:178) ki = (cid:90) d r φ ∗ k ( r ) δ E δφ ∗ i ( r ) = n i (cid:90) d r φ ∗ k ( r ) h ( r ) φ i ( r ) + n i (cid:90) d r φ ∗ k ( r ) v H ( r ) φ i ( r ) − (cid:112) n i (cid:90) d r φ ∗ k ( r ), v iXC ( r ) (7.26)then Eq. (7.25) is equivalent to the orbital-gradient conditions, cf. Eqs. (7.7b)-(7.7c).The method of Piris and Ugalde exploits this by considering an auxiliary matrix F ki = (cid:178) ik − (cid:178) ∗ ki i (cid:54)= kF i i = k (7.27)with the same entries as ˜ F ki on the off-diagonals but additionally with some in principle arbitrary diagonalterms. The stationarity condition ˜ F ki = F ki = F i δ ki being in diagonal form and we candiagonalize F iteratively to minimize the RDMFT functional. A remark on hermiticity, the one-body Hamiltonian and the Lagrange-multiplier matrix
Before we lay out the concrete algorithm for this minimization, we want to address the difference betweenthe hermiticity of ( (cid:178) ik ) and ˆ H , which is subtle but crucial. When we define ( (cid:178) ik ) by the gradient equation (cid:178) ki = (cid:90) d r φ ∗ k ( r ) δ E δφ ∗ i ( r ) = (cid:90) d r φ ∗ k ( r ) ˆ H φ i ( r ),it seems as if (cid:178) ki ? = ( ˆ H ) ki would just be the matrix representation of ˆ H and thus from (cid:178) ki = (cid:178) ∗ ik we can followthat ˆ H is indeed hermitian, contrary to what we claimed. The resolution of this seeming contradiction isthat ( ˆ H ) ki = ( ˆ H ) ∗ ik holds only for exactly one point of the definition space of ˆ H , which is the stationarypoint of L . Only if the indices i , k correspond exactly to the natural orbitals that extremalize L , the matrix( ˆ H ) ki is symmetric. For all other points, this is not the case. To show this, let us calculate (cid:178) ki again, but thistime from the other gradient equation, cf. Eq. (7.7c). We have˜ (cid:178) ik = (cid:90) d r δ E δφ i ( r ) φ k ( r ) (7.28)185HAPTER 7. RDMFT IN REAL SPACE = n i (cid:90) d r φ ∗ i ( r ) h ( r ) φ k ( r ) + n i (cid:90) d r φ ∗ i ( r ) v H ( r ) φ k ( r ) − (cid:112) n i (cid:90) d r v iXC ( r ) φ k ( r ) (7.29)and thus ˜ (cid:178) ki = n k (cid:90) d r φ ∗ k ( r ) h ( r ) φ i ( r ) + n k (cid:90) d r φ ∗ k ( r ) v H ( r ) φ i ( r ) − (cid:112) n k (cid:90) d r v kXC ( r ) φ i ( r ) (cid:54)= n i (cid:90) d r φ ∗ k ( r ) h ( r ) φ i ( r ) + n i (cid:90) d r φ ∗ k ( r ) v H ( r ) φ i ( r ) − (cid:112) n i (cid:90) d r φ ∗ k ( r ) v iXC ( r ) = (cid:178) ki .Obviously, both definitions cannot agree in general. In fact, we could rephrase the task of the RDMFT orbitaloptimization as “to search for the one set of n i and φ i (assuming no degeneracy) for that ˜ (cid:178) ki = (cid:178) ki .” CTOPUS
We have now derived and discussed all the necessary ingredients to complete algorithm 1 and introduce theorbital optimization. Therefore, we employ the algorithm due to Piris and Ugalde [158].The goal of the routine is to diagonalize the matrix F ki = (cid:178) ik − (cid:178) ∗ ki i (cid:54)= kF i i = k , (cf. 7.27)for which we still need to specify the F i , which in principle may have arbitrary values. However, a goodchoice will crucially influence the stability of the algorithm. In their original paper, Piris and Ugalde [158]describe this issue and provide a prescription for initializing and updating them. A good choice are theeigenvalues of ( (cid:178) ik ) in ascending order and we refer for more details to the publication. The complete algo-rithm is summarized in pseudocode as Algorithm 3. (Orbital optimization 1)Set the convergence criteria (cid:178) F > (cid:178) E > .Construct (cid:178) i j by Eq. (7.26) from the initial orbitals φ .Diagonalize ( (cid:178) i j ) to obtain the eigenvalues F i for k=1,2,...i Construct (cid:178) ki j by Eq. (7.26) from φ k − .ii Set the off-diagonals F ki j = (cid:178) ki j − (cid:178) kji ∗ anddetermine F kmax = max F ki j iii Diagonalize F k to obtain a new set of orbitals φ k iv Calculate the total energy E k + by Eq. (7.13) break if F kmax < (cid:178) F and k > | E k + − E k | / E k < (cid:178) E . end (for) Again, we skipped some additional steps that assure a better convergence of the code and that are discussedin [158].186.2. ORBITAL-BASED RDMFT IN REAL SPACE
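The convergence measure F_max of algorithm 3, i.e., the largest off-diagonal asymmetry of the Lagrange-multiplier matrix, can be illustrated with a small sketch. The following Python example is neither the algorithm of Piris and Ugalde nor OCTOPUS code. It assumes a hypothetical two-orbital model with real orbitals in a two-function basis, fixed occupations, and an occupation-dependent operator n_i h − √n_i x, where x is an arbitrary fixed symmetric matrix standing in for the exchange-correlation term of Eq. (7.22). In this model the orbitals are parametrized by a single rotation angle t, and the asymmetry of the multiplier matrix is proportional to dE/dt, so it vanishes exactly where the energy is stationary.

```python
import math


def rotated_orbitals(t):
    """Two real orthonormal orbitals, parametrised by a rotation angle t."""
    return [(math.cos(t), math.sin(t)), (-math.sin(t), math.cos(t))]


def orbital_operator(n_i, h, x):
    """Model gradient operator n_i*h - sqrt(n_i)*x (x is hypothetical)."""
    s = math.sqrt(n_i)
    return [[n_i * h[r][c] - s * x[r][c] for c in range(2)] for r in range(2)]


def apply(mat, vec):
    return (mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def lagrange_matrix(t, occ, h, x):
    """lam[k][i] = <phi_k| (n_i h - sqrt(n_i) x) |phi_i>."""
    phi = rotated_orbitals(t)
    return [[dot(phi[k], apply(orbital_operator(occ[i], h, x), phi[i]))
             for i in range(2)] for k in range(2)]


def energy(t, occ, h, x):
    """Model energy: the trace of the Lagrange-multiplier matrix."""
    lam = lagrange_matrix(t, occ, h, x)
    return lam[0][0] + lam[1][1]


def asymmetry(t, occ, h, x):
    """|lam_01 - lam_10|, the two-orbital analogue of F_max."""
    lam = lagrange_matrix(t, occ, h, x)
    return abs(lam[0][1] - lam[1][0])


def stationary_angle(occ, h, x):
    """Angle of the energy minimum, from the difference of the operators."""
    a0 = orbital_operator(occ[0], h, x)
    a1 = orbital_operator(occ[1], h, x)
    d00, d01, d11 = a0[0][0] - a1[0][0], a0[0][1] - a1[0][1], a0[1][1] - a1[1][1]
    return 0.5 * math.atan2(2.0 * d01, d00 - d11) + 0.5 * math.pi


if __name__ == "__main__":
    occ = [1.6, 0.4]
    h = [[-1.0, 0.3], [0.3, 0.5]]
    x = [[0.4, 0.1], [0.1, 0.2]]
    t_star = stationary_angle(occ, h, x)
    print("asymmetry at t = 0:  ", asymmetry(0.0, occ, h, x))
    print("asymmetry at minimum:", asymmetry(t_star, occ, h, x))
```

For equal occupations the two operators coincide and the asymmetry vanishes for every angle, which is the single-reference situation in which any rotation within the occupied space is allowed.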
The explicit RDMFT algorithm of OCTOPUS
Now we have discussed the major steps of the orbital-based RDMFT algorithm that is implemented in O C - TOPUS . We want to conclude the section by mentioning the last missing pieces, which are the initializationof φ and the convergence criteria in algorithm 1.For the former, we first do a preliminary calculation with the independent particles routine of O CTOPUS ,that is we solve the orbital equation h ( r ) φ i ( r ) = e i φ i ( r ) i = B (7.30)to generate a set of orbitals φ − . We denote them with the index “-1,” because we perform then a furtherstep before we start with the iteration. Additionally, we need a starting value n − for the occupations, whichwe obtain by the rule n − i = − n thresh ∀ i = N /2 n − i = n thresh ∀ i = N /2 + B (7.31)where n thresh is a threshold that is fixed to n thresh = − φ − and obtain n i . Then, we calculate (cid:178) i j and diagonalize( (cid:178) i j + (cid:178) ji ∗ )/2 to obtain φ . As an overall convergence criterion we calculate the total energy after everyoccupation number optimization E kocc and then slightly generalize the criterion of the orbital optimizationas F k − max < (cid:178) F and | E k − E kocc | / E k < (cid:178) E with the same thresholds (cid:178) F , (cid:178) E .The complete algorithm is then summarized as 187HAPTER 7. RDMFT IN REAL SPACE Algorithm 4. (RDMFT)Set convergence criteria (cid:178) E , (cid:178) F .Set B.Generate φ − by solving Eq. (7.30) and calculate h − ii , w − i jkl .Initialize n − according to Eq. (7.31) .Solve Eq. (7.14) self-consistently with h l − ii , w l − i jkl for n , µ .Calculate (cid:178) i j according to Eq. (7.26) .Diagonalize ( (cid:178) i j + (cid:178) ji ∗ )/2 to obtain φ and calculate h ii , w i jkl . for l=1,2,...a Occupation number optimization
Solve Eq. (7.14) self-consistently with h l − ii , w l − i jkl for n l , µ l , employingalgorithm 2Save total energy as E occ b Orbital optimization
Solve Eq. (7.25) with n l for φ l , employing algorithm 3Calculate h lii , w li jkl Save total energy E and the F max = max F i j (Eq. (7.27) ) break if F k − max < (cid:178) F and | E k − E kocc | / E k < (cid:178) E end (for) This algorithm has been assessed for a 3d H -molecule comparing to another RDMFT code based onGaussian orbitals and for further details we refer to Ref. [271]. Since we employed this implementationto study dressed orbitals, we did additional high-accuracy benchmark studies for one-dimensional modelsystems. The results are presented in App. C.188.3. CONJUGATE-GRADIENTS ALGORITHM FOR RDMFT As we have discussed in the introduction, the implementation of the orbital-based algorithm can not takeadvantage of the full flexibility of the real space grid. The straightforward way to remedy this, is to employanother orbital optimization algorithm that does not depend on an orbital basis.Before we discuss this, we want to briefly remark on the fundamental difference between orbital and real-space-based electronic-structure codes. In principle, one could see the grid-points as an effective orbitalbasis, since all (approximate) bases of a Hilbert space are theoretically equivalent for properly convergedcalculations. However, the number of grid points that are necessary to accurately describe even the small-est molecular systems is orders of magnitude larger than any standard orbital basis. Imagine for instancethat we approximate one spatial dimension with only 10 grid points, which is not sufficient for an accu-rate description of any realistic scenario. Despite this low quality of the description, we have in 3d already10 = = aug-cc-pVQZ [281] that consists of merely 55 elements. However, onewould consider such a basis set only for numerical tests or comparisons. In practice, considerably smallerbasis sets suffice to obtain very accurate RDMFT results for Helium. 
From this example, it is obvious that both approaches require entirely different numerical techniques.
The standard algorithm for the orbital optimization for most theory levels, including standard KS-DFT or HF in OCTOPUS, is the conjugate-gradients algorithm by Payne et al. [267]. Originally proposed and optimized for DFT calculations of solids, which require a parametrization of the reciprocal or k-space, it is ideally suited for the real space, which is very similar from a numerical point of view [138]. The main complication of a real-space description is the accurate description of the core region of atoms, close to the divergence of the Coulomb potential. The strongly increasing slope would require a very fine grid, which would spoil the numerical efficiency. An accurate and efficient solution of this problem is given by so-called pseudopotentials, which describe the effective (classical) electrostatic potential of the nuclei together with the core electrons. Pseudopotentials do not diverge at the position of the nuclei and have in general much smaller slopes in the core region than the exact Coulomb potential. The quasi-classical treatment of the interaction between these core electrons and the remaining so-called valence electrons is in most cases very accurate. An in-depth discussion about the advantages and disadvantages of real-space codes is beyond the scope of this thesis and the reader is referred for further details to the review of Beck [138] and the first OCTOPUS paper by Marques et al. [282] and references therein.
In conclusion, we have to describe in a real-space code N active or valence electrons under the influence of pseudopotentials, instead of all the N_all > N electrons of the matter system in the local Coulomb potential of the nuclei (which is called an all-electron calculation). In single-reference methods, this corresponds to the calculation of N orbital wave functions φ = (φ_1, ..., φ_N) on the (real-space) grid. The conjugate-gradients algorithm by Payne et al. [267] accomplishes this by a direct (iterative) minimization of the corresponding energy functional E[φ]. This means that the code minimizes E[φ] iteratively along the direction of the gradient or the steepest descent ζ = δE[φ]/δφ, taking into account prior steps by what is called conjugation [266, part 5]. Since the gradient with respect to orbital i still has to be calculated on the whole grid, it is usually the bottleneck of the algorithm. Taking into account, e.g., the (approximate) Hessian is usually computationally
For a good introduction to the theory of pseudopotentials, the reader is referred to the review of Schwerdtfeger [218].
The idea behind the real-space optimization algorithm for RDMFT is to use the already existing well-tested conjugate-gradients implementation for single-reference methods to a maximal degree. The hope is that the good convergence properties of the algorithm persist also in the more involved theory. Importantly, the algorithm directly minimizes the energy functional and not, e.g., the KS or HF eigenvalue equation. Thus, we just had to replace the respective functional with the RDMFT energy and redo the subsequent derivations. We present this derivation in the following subsection, which in shortened form is published in Ref. [3].
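To make the idea of a direct, gradient-based minimization of an energy functional on a grid concrete, the following Python sketch minimizes E[ψ] = ⟨ψ|H ψ⟩ for a single normalized orbital. It is only an illustration in the spirit of the scheme described above, under several assumptions that are not taken from the source: a single orbital instead of N orbitals, a fixed linear operator (a finite-difference second derivative on eight grid points, in arbitrary units) instead of a self-consistent functional, no preconditioning, a Fletcher-Reeves conjugation factor, and a restart safeguard. It is not the RDMFT generalization and not OCTOPUS code; all names are hypothetical.

```python
import math


def matvec(H, v):
    return [sum(H[r][c] * v[c] for c in range(len(v))) for r in range(len(v))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cg_lowest_state(H, psi, tol=1e-9, max_iter=500):
    """Conjugate-gradients minimisation of <psi|H psi> with <psi|psi> = 1."""
    psi = normalise(psi)
    direction, grad_sq_old = None, None
    for _ in range(max_iter):
        h_psi = matvec(H, psi)
        energy = dot(psi, h_psi)
        # Steepest descent -(H - energy) psi; orthogonal to psi by construction.
        zeta = [energy * p - hp for p, hp in zip(psi, h_psi)]
        grad_sq = dot(zeta, zeta)
        if math.sqrt(grad_sq) < tol:
            break
        if direction is None:
            direction = zeta[:]
        else:
            # Conjugation with the previous search direction.
            gamma = grad_sq / grad_sq_old
            direction = [z + gamma * d for z, d in zip(zeta, direction)]
        # Keep the search direction orthogonal to the current orbital.
        overlap = dot(psi, direction)
        direction = [d - overlap * p for d, p in zip(direction, psi)]
        if dot(direction, zeta) <= 0.0:
            direction = zeta[:]  # safeguard: restart with steepest descent
        unit = normalise(direction)
        # Exact line minimisation of E(theta) for psi*cos(theta) + unit*sin(theta).
        b = dot(unit, h_psi)
        c = dot(unit, matvec(H, unit))
        theta = 0.5 * math.atan2(-2.0 * b, c - energy)
        psi = normalise([math.cos(theta) * p + math.sin(theta) * u
                         for p, u in zip(psi, unit)])
        grad_sq_old = grad_sq
    return dot(psi, matvec(H, psi)), psi


def laplacian_matrix(n):
    """Finite-difference operator tridiag(-1, 2, -1) on n grid points."""
    return [[2.0 if r == c else -1.0 if abs(r - c) == 1 else 0.0
             for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    e0, _ = cg_lowest_state(laplacian_matrix(8), [1.0] * 8)
    print("lowest eigenvalue:", e0)
    print("analytic value:   ", 2.0 - 2.0 * math.cos(math.pi / 9.0))
```

The line minimization is exact here only because the operator is linear; for a self-consistent functional the energy along the search direction has to be approximated, and for several orbitals the orthogonality between different orbitals has to be taken into account as well.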
We now derive and comment on the conjugate-gradients method for the orbital optimization in real-space RDMFT. For that, we recapitulate the most important steps of the algorithm by Payne et al. [267], explicitly highlighting the necessary modifications for the RDMFT minimization. As the general definition of the RDMFT minimization problem and the corresponding equations have been presented in Sec. 7.2, we recapitulate here only the most important equations.

The goal is to find the set of natural occupation numbers $n_i$ and natural orbitals $\phi_i$ that extremalize the RDMFT energy functional (7.1) under the constraints (7.3). Within the Müller approximation (7.2), the functional reads for $B$ natural orbitals
$$
E[\{\phi_i\},\{n_i\}] = \sum_{i=1}^{B} n_i \int \mathrm{d}\mathbf{r}\, \phi_i^*(\mathbf{r})\, h(\mathbf{r})\, \phi_i(\mathbf{r})
+ \frac{1}{2}\sum_{i,j=1}^{B} n_i n_j \int \mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\, |\phi_i(\mathbf{r})|^2\, |\phi_j(\mathbf{r}')|^2\, w(\mathbf{r},\mathbf{r}')
- \frac{1}{2}\sum_{i,j=1}^{B} \sqrt{n_i n_j} \int \mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\, \phi_i^*(\mathbf{r})\, \phi_j^*(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}')\, \phi_i(\mathbf{r}')\, \phi_j(\mathbf{r}). \tag{7.32}
$$
We will only discuss the orbital optimization and thus assume that the $n_i$ and $\mu$ are constant. Consequently, we neglect the occupation-number dependence of $E$ and consider the reduced version of the Lagrangian (7.5), i.e.,
$$
L[\{\phi_i\};\{\epsilon_{ij}\}] = E[\{\phi_i\}] - \sum_{j,k=1}^{B} \epsilon_{jk} \left( \int \mathrm{d}\mathbf{r}\, \phi_j^*(\mathbf{r})\, \phi_k(\mathbf{r}) - \delta_{jk} \right). \tag{7.33}
$$
To present the algorithm, we assume that we are at a certain step $m$ of algorithm 1 with the current orbitals $\phi^m = (\phi_1^m,\dots,\phi_B^m)$. To minimize $L$, we want to vary $\phi^m$ along the direction of the local steepest descent of $L$, which is given by the negative gradient with respect to, e.g., $\phi_i^*$, cf. Eq. (7.12a).
The steepest-descent vector therefore reads
$$
\zeta_i^m = -\frac{\delta L}{\delta \phi_i^{m*}(\mathbf{r})} = -\left( \hat{H}\phi_i^m(\mathbf{r}) - \sum_{k=1}^{B} \epsilon_{ki}^m\, \phi_k^m(\mathbf{r}) \right), \tag{7.34}
$$
where we used the definition of the one-body Hamiltonian $\hat{H}\phi_i^m(\mathbf{r}) = \delta E/\delta\phi_i^{m*}(\mathbf{r})$, cf. Eq. (7.22). Here, we have to take the whole sum $\sum_{k=1}^{B} \epsilon_{ki}^m \phi_k^m(\mathbf{r})$ into account to find the steepest descent. This is in contrast to the corresponding definition for single-reference methods, where we can assume $\epsilon_{ki} = \delta_{ki}\epsilon_i$ (cf. Eq. (5.10) of Ref. [267]). This is the first of three modifications that are necessary to extend the algorithm to RDMFT.

[Footnote on the “well-tested” conjugate-gradients implementation: It is difficult to make this statement general, because there is no database that compares the performance of certain algorithms in standard codes. For other features, such databases do exist. However, at least for Octopus we can confirm that the algorithm is usually reliable.]

Next, we need to find an estimate for $\epsilon_{ki}^m$, for which we proceed in an analogous way to [267] by exploiting the stationarity conditions $\delta L = 0$,
$$
0 = \frac{\delta E}{\delta\phi_i^*(\mathbf{r})} - \sum_{k=1}^{B} \epsilon_{ki}\, \phi_k(\mathbf{r}), \tag{cf. 7.7b}
$$
$$
0 = \frac{\delta E}{\delta\phi_i(\mathbf{r})} - \sum_{k=1}^{B} \epsilon_{ik}\, \phi_k^*(\mathbf{r}). \tag{cf. 7.7c}
$$
We can now derive an expression for $\epsilon_{ki}$ from both equations, cf. Eqs. (7.26) and (7.28). As we have discussed in Sec. 7.2.1, these expressions are not equal, because the single-particle “Hamiltonian” in RDMFT is not hermitian. In the single-reference case that is considered by Payne et al. [267], the single-particle Hamiltonian is instead hermitian and thus the corresponding two sets of orbital equations are equivalent. This is equivalent to the fact that we can diagonalize $\epsilon_{ik} = \delta_{ki}\epsilon_i$ for single-reference methods. Thus, in the RDMFT case, we have to make sure that the “information” of both equations, (7.7b) and (7.7c), enters the algorithm. This is the second modification of the RDMFT algorithm.

Whereas in the single-reference case we merely have to evaluate the diagonal element $\epsilon_i^m$ to calculate the steepest descent with respect to orbital $\psi_i^m$ (Eq. (5.11) of Ref. [267]), we now have to calculate the $B$ elements $\epsilon_{ki}$ (with $k = 1,\dots,B$). For that, we have to choose the “correct” equation from our two choices, Eq. (7.7b) and Eq. (7.7c). Since we have used the gradient with respect to $\phi_i^*$ (Eq. (7.7b)) for the definition of $\zeta_i$, we have to employ the equation from the other gradient with respect to $\phi_i$, i.e., Eq. (7.7c), to define
$$
\epsilon_{ki}^m = \int \mathrm{d}\mathbf{r}\, \frac{\delta E}{\delta\phi_k^m(\mathbf{r})}\, \phi_i^m(\mathbf{r}). \tag{7.35}
$$
We stress this point, because it is very unusual to work with non-hermitian operators in electronic structure theory.
As a matter of fact, the RDMFT algorithm cannot converge to the minimum unless both equations are taken into account. This can be understood from another perspective if we derive the equations assuming real orbitals $\phi^* = \phi$. In this case, there is only one gradient equation and thus no ambiguity. The derivation can be found in appendix A.2.

[Footnote: Note that the definition (7.35) is analogous to Eq. (7.28), where we used the tilde $\tilde{\epsilon}_{ik}$ to stress the difference to $\epsilon_{ki}$ from the other equation. We drop the tilde in this section.]

Having determined the steepest-descent vector, we need to perform certain orthogonalization steps, the preconditioning step for a faster convergence, and the conjugation. All these steps are independent of the explicit theory and thus, we do not present them in this subsection. We summarize these steps in the next subsection, when we present the full algorithm in pseudo-code. For the details, the reader is referred to Ref. [267].

We conclude this subsection with the last missing modification to extend the algorithm to RDMFT, which concerns the line minimization. We assume that we have calculated the normalized conjugate-gradients vector $\xi_i^m$, which determines the “direction” in which we want to minimize the energy functional. However, we do not know “how far” we need to “go” in this direction and thus we need to perform a line minimization, i.e., we need to find the minimal value of the energy $E$ along the line that is defined by $\xi_i^m$. In principle, the vector $\xi_i^m$ defines an infinite line that we can parametrize by, e.g., a multiplication with a scalar. However, we can here explicitly take the normalization constraint into account. All orbitals are normalized to one (and orthogonal to each other) and thus they can be seen as unit or basis vectors of a $B$-dimensional subspace of the Hilbert space. The algorithm takes care that this property is preserved at every iteration step, i.e., $\langle\phi_i^m|\phi_j^m\rangle = \delta_{ij}$.
Additionally, when we optimize orbital $\phi_i^m$ in the direction $\xi_i^m$, the algorithm also ensures that $\langle\phi_i^m|\xi_i^m\rangle = 0$ and thus, we can parametrize all points of the possible descent by the angle $\Theta$ between $\phi_i^m$ and $\xi_i^m$. This reduces the line-minimization problem to merely the closed interval $[0,\pi/2]$, which can be exploited by the algorithm. The optimized state has the form
$$
\tilde{\phi}_i^m(\Theta) = \phi_i^m \cos\Theta + \xi_i^m \sin\Theta. \tag{7.36}
$$
We can now evaluate the energy as a function of $\Theta$ by inserting
$$
E[\Theta] = E[\{\phi_k^m, k \neq i\}, \tilde{\phi}_i^m(\Theta)] \tag{7.37}
$$
and find the optimized state by a minimization over $\Theta$, i.e.,
$$
\phi_i^{m+1} = \phi_i^m \cos\Theta^* + \xi_i^m \sin\Theta^*, \tag{7.38}
$$
where $\Theta^*$ is determined by
$$
\min_{\Theta} E[\Theta]. \tag{7.39}
$$
Since the evaluation of $E[\Theta]$ requires the application of operators to orbitals, it is as expensive as calculating the gradient, which is the bottleneck of the whole procedure. Consequently, we need to find an approximate solution to (7.39). The form of the parametrization (7.36) suggests an expansion of $E$ in a Fourier series and for single-reference theories, a truncation after the first order is already quite accurate [267]. We assume that this also holds for RDMFT and thus make the ansatz
$$
E[\Theta] = E_0 + A\cos(2\Theta) + B\sin(2\Theta). \tag{7.40}
$$
First test calculations have confirmed that this assumption is reasonable [3]. We can now analytically determine the stationary point $\Theta^*$ of $E[\Theta]$ by the usual derivative condition $\mathrm{d}E/\mathrm{d}\Theta = 0$, which yields
$$
\Theta^* = \frac{1}{2}\tan^{-1}\left(\frac{B}{A}\right). \tag{7.41}
$$
There are several possibilities to determine the coefficients $A$ and $B$ (see Ref. [267]) and we employ the one that is implemented in Octopus, which reads
$$
\frac{B}{A} = -\frac{\left.\frac{\mathrm{d}E}{\mathrm{d}\Theta}\right|_{\Theta=0}}{\frac{1}{2}\left.\frac{\mathrm{d}^2E}{\mathrm{d}\Theta^2}\right|_{\Theta=0}}. \tag{7.42}
$$

[Footnote: Note that there is a further subtle difference between the standard algorithm and the RDMFT version. In the former, we can additionally orthogonalize $\xi_i^m$ to all other states $\phi_j^m$ for $j \neq i$, because arbitrary rotations are allowed within this subspace. This is not possible in RDMFT, because the index of every $\phi_j^m$ is uniquely defined by its occupation number. This is another consequence of the impossibility to diagonalize the Lagrange-multiplier matrix in RDMFT.]

To calculate the derivatives occurring in this expression, we have to consider the RDMFT energy functional, which yields
$$
\left.\frac{\mathrm{d}E}{\mathrm{d}\Theta}\right|_{\Theta=0} = \int \mathrm{d}\mathbf{r} \left( \frac{\delta E}{\delta\phi_i^*}\frac{\mathrm{d}\phi_i^*}{\mathrm{d}\Theta} + \frac{\delta E}{\delta\phi_i}\frac{\mathrm{d}\phi_i}{\mathrm{d}\Theta} \right)
= \langle\xi_i|\hat{H}\phi_i\rangle + \langle\phi_i|\hat{H}\xi_i\rangle - \sum_k \left( \epsilon_{ki}\langle\xi_i|\phi_k\rangle + \epsilon_{ik}\langle\phi_k|\xi_i\rangle \right) \tag{7.43a}
$$
for the first derivative and
$$
\left.\frac{\mathrm{d}^2E}{\mathrm{d}\Theta^2}\right|_{\Theta=0} = \int \frac{\delta^2 E}{\delta\phi_i^{*2}}\left(\frac{\mathrm{d}\phi_i^*}{\mathrm{d}\Theta}\right)^2 + \int \frac{\delta^2 E}{\delta\phi_i^{2}}\left(\frac{\mathrm{d}\phi_i}{\mathrm{d}\Theta}\right)^2 + 2\int \frac{\delta^2 E}{\delta\phi_i^*\,\delta\phi_i}\frac{\mathrm{d}\phi_i^*}{\mathrm{d}\Theta}\frac{\mathrm{d}\phi_i}{\mathrm{d}\Theta} + \int \frac{\delta E}{\delta\phi_i^*}\frac{\mathrm{d}^2\phi_i^*}{\mathrm{d}\Theta^2} + \int \frac{\delta E}{\delta\phi_i}\frac{\mathrm{d}^2\phi_i}{\mathrm{d}\Theta^2}
= 2\left( \langle\xi_i|\hat{H}\xi_i\rangle - \langle\phi_i|\hat{H}\phi_i\rangle \right) + \alpha_i^{H} + \alpha_i^{XC} \tag{7.43b}
$$
for the second derivative. Here, we defined
$$
\alpha_i^{H} = 2 n_i \int \mathrm{d}\mathbf{r}\, \Re[\phi_i^*(\mathbf{r})\xi_i(\mathbf{r})] \left( 2 n_i \int \mathrm{d}\mathbf{r}'\, \Re[\phi_i^*(\mathbf{r}')\xi_i(\mathbf{r}')]\, w(\mathbf{r},\mathbf{r}') \right), \tag{7.44a}
$$
$$
\alpha_i^{XC} = -2 n_i \int \mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\, \Re\left[ \phi_i^*(\mathbf{r})\phi_i^*(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}')\, \xi_i(\mathbf{r})\xi_i(\mathbf{r}') \right]
- 2 n_i \int \mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\, \phi_i^*(\mathbf{r})\xi_i^*(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}')\, \xi_i(\mathbf{r})\phi_i(\mathbf{r}')
- 2\sqrt{n_i} \sum_{j=1}^{B} \sqrt{n_j} \int \mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\, \xi_i^*(\mathbf{r})\phi_j^*(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}')\, \phi_j(\mathbf{r})\xi_i(\mathbf{r}'), \tag{7.44b}
$$
and used $\left.\mathrm{d}\phi_i^{(*)}(\Theta)/\mathrm{d}\Theta\right|_{\Theta=0} = \xi_i^{(*)}$ and $\left.\mathrm{d}^2\phi_i^{(*)}(\Theta)/\mathrm{d}\Theta^2\right|_{\Theta=0} = -\phi_i^{(*)}$. We see that $\epsilon_{ik}$ appears only in the first-derivative term, which hence is the only one that needs to be modified for the RDMFT algorithm. We summarize the algorithm in the next subsection.
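The line minimization of Eqs. (7.40) to (7.42) can be sketched in a few lines of Python. This is a minimal illustration and not the Octopus implementation: the function name `optimal_angle` and the model coefficients are assumptions chosen for the example. The sketch uses $\mathrm{d}E/\mathrm{d}\Theta|_{\Theta=0} = 2B$ and $\mathrm{d}^2E/\mathrm{d}\Theta^2|_{\Theta=0} = -4A$, and it evaluates the arctangent with `atan2`, so that the returned stationary point is the minimum of the model. For $A < 0$, which is the typical case of a descent direction with positive curvature, the result coincides with Eq. (7.41).

```python
import math

def optimal_angle(dE0, d2E0):
    """Minimizer of the model E(theta) = E0 + A*cos(2*theta) + B*sin(2*theta).

    dE0 and d2E0 are the first and second derivative of E at theta = 0,
    so that B = dE0 / 2 and A = -d2E0 / 4 (equivalent to Eq. (7.42)).
    """
    a = -0.25 * d2E0
    b = 0.5 * dE0
    # Stationarity requires tan(2*theta) = B/A. Using atan2(-B, -A) picks the
    # branch on which the model has its minimum rather than its maximum.
    return 0.5 * math.atan2(-b, -a)

# Hypothetical model coefficients, for illustration only.
E0, A, B = 1.0, -0.3, -0.1

def energy(theta):
    """Truncated Fourier model of Eq. (7.40)."""
    return E0 + A * math.cos(2.0 * theta) + B * math.sin(2.0 * theta)

dE0 = 2.0 * B       # first derivative of the model at theta = 0
d2E0 = -4.0 * A     # second derivative of the model at theta = 0
theta_star = optimal_angle(dE0, d2E0)
```

In an actual calculation, `dE0` and `d2E0` would be obtained from Eqs. (7.43a) and (7.43b), so that the line minimization requires no further evaluations of the energy functional.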
We now want to formulate the complete conjugate-gradients algorithm for the orbital optimization of the RDMFT routine of Octopus. The routine can be chosen in the developer’s version of Octopus as an alternative to algorithm 3. The general part (algorithm 1), including the occupation-number optimization (algorithm 2), is not modified.

Before we formulate the complete algorithm, some remarks are appropriate. The crucial advantage of the whole routine is that we do not need to explicitly calculate the one-body Hamiltonian $\hat{H}$ in some basis, but only need to apply it to a state $\phi_i$ according to
$$
\hat{H}\phi_i(\mathbf{r}) = n_i\, h(\mathbf{r})\, \phi_i(\mathbf{r}) + n_i\, v_H(\mathbf{r})\, \phi_i(\mathbf{r}) - \sqrt{n_i}\, v_{XC}^i(\mathbf{r}). \tag{cf. 7.22}
$$
This operation can be done very efficiently on the grid for a couple of reasons. One important reason is a crucial property of all the operators occurring in Eq. (7.22) with the exception of $v_{XC}^i$. The operators $h(\mathbf{r})$ and $v_H(\mathbf{r})$ are (semi)local, i.e., they effectively depend only on one coordinate and thus their application to a state has the cost of an inner product, instead of a full matrix-vector multiplication. For the one-body part
$$
h(\mathbf{r}) = -\frac{1}{2}\nabla^2 + v(\mathbf{r}), \tag{7.45}
$$
this is obvious for the local potential $v(\mathbf{r})$. The Laplacian $\nabla^2$ is instead what is called semi-local. This means for vectors on the grid that we can approximate $\nabla^2$ with, e.g., finite differences of some order and then apply it to a state by a so-called stencil that is the same for every point and thus “almost” local [138, Sec. IV.A].

[Footnote: Note that the former is denoted by $\beta$ and the latter by $\alpha$ in the source code of Octopus.]

The Hartree potential
$$
v_H(\mathbf{r}) = \sum_{j=1}^{B} n_j \int \mathrm{d}\mathbf{r}'\, \rho_j(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}') \quad \text{(cf. 7.10)} \quad = \int \mathrm{d}\mathbf{r}'\, \rho(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}'), \tag{7.46}
$$
with the total density $\rho(\mathbf{r}) = \sum_{j=1}^{B} n_j \rho_j(\mathbf{r})$, is also local during every iteration step, which is the basic approximation of our self-consistent field procedure. It is crucial here that we can exchange the sum with the integration and thus have to calculate the integral only once, which in practice is done in $k$-space, i.e., after a Fourier transformation.
This is considerably faster for a large number of grid points and is a main advantage of the conjugate-gradients algorithm. However, the exchange-correlation potential that we employ, e.g., in RDMFT, HF or for hybrid functionals in KS-DFT (see Sec. 2.3) is truly nonlocal. We recall the definition
$$
v_{XC}^i(\mathbf{r}) = \sum_{j=1}^{B} \sqrt{n_j} \int \mathrm{d}\mathbf{r}'\, \phi_j^*(\mathbf{r}')\, w(\mathbf{r},\mathbf{r}')\, \phi_j(\mathbf{r})\, \phi_i(\mathbf{r}'). \tag{cf. 7.11}
$$
Importantly, we cannot exchange summation and integration here, because both operations involve different orbitals. This is a severe bottleneck of the algorithm and for a long time prevented real-space and $k$-space codes from treating larger systems with these methods [276]. In RDMFT this limitation is especially pronounced, because we need to calculate the exchange-correlation potential not only for the one orbital that we optimize but also for all the other orbitals to determine $(\epsilon_{ik})$. Consequently, it is still very expensive to do RDMFT calculations with the conjugate-gradients algorithm and we have so far mostly performed tests with one-dimensional systems. However, it has been shown recently that this problem can be solved with a newly developed approximation for the exchange-correlation operator [276]. Although the author explicitly considers HF theory, the method should be generalizable to RDMFT in a straightforward way.

Next, we want to remark briefly on preconditioning, which we mentioned already in the first part of this section. This is a very common part of large-scale minimization algorithms and a thorough discussion is beyond the scope of this text. The basic problem in the realm of electronic structure calculations is that the kinetic-energy operator, which we apply to a state to calculate the gradient, has a larger error for energetically higher-lying than for lower-lying states, because the former have more nodes and thus, in general, bigger slopes.
The standard preconditioner in Octopus thus applies a kind of low-pass filter to the gradient. We denote the corresponding operator by $\hat{P}$. For further details, we refer to [267] and references therein.

As a final remark before we present the algorithm, we want to refer the reader to the excellent explanation of the conjugate-gradients method in Sec. 5.A of Ref. [267]. The authors explain there how a conjugation of the steepest-descent vector leads to a maximally fast convergence of the algorithm, because it cannot get “trapped” in a zig-zag trajectory. This is illustrated in Fig. 14 of the reference. Note that there are different possibilities to do the conjugation. In the following, we only present the Fletcher-Reeves [283] method, but the Polak-Ribière scheme is also available in Octopus. Both methods are elucidated in, e.g., part 5 of Nocedal and Wright [266].

[Footnote on the advantage of the conjugate-gradients algorithm for a large number of grid points: For this reason, also orbital-based electronic structure codes employ conjugate-gradients algorithms for systems that require very large basis sets.]

[Footnote on the Polak-Ribière scheme: The original paper [284] is unfortunately only available in French.]
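The difference between the two conjugation schemes can be illustrated with a short Python sketch. It uses the textbook forms of the Fletcher-Reeves and Polak-Ribière factors as given by Nocedal and Wright [266], written in terms of the steepest-descent vector $\zeta$ and the preconditioned vector $\eta$. The function names and the example vectors are assumptions made for this illustration, real vectors are assumed, and the text does not state whether Octopus uses exactly this Polak-Ribière variant.

```python
def dot(u, v):
    """Inner product of two real vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def gamma_fletcher_reeves(eta, zeta, eta_prev, zeta_prev):
    """Fletcher-Reeves factor: ratio of the current to the previous <eta|zeta>."""
    return dot(eta, zeta) / dot(eta_prev, zeta_prev)

def gamma_polak_ribiere(eta, zeta, eta_prev, zeta_prev):
    """Polak-Ribiere factor: uses the change of the steepest-descent vector."""
    change = [z - zp for z, zp in zip(zeta, zeta_prev)]
    return dot(eta, change) / dot(eta_prev, zeta_prev)

# Hypothetical steepest-descent vectors of two consecutive steps.
zeta_prev = [1.0, 0.0]
zeta = [0.5, 0.5]
# Without a preconditioner, the preconditioned vectors equal the gradients.
eta_prev, eta = zeta_prev, zeta

gamma_fr = gamma_fletcher_reeves(eta, zeta, eta_prev, zeta_prev)
gamma_pr = gamma_polak_ribiere(eta, zeta, eta_prev, zeta_prev)
```

For these vectors the Fletcher-Reeves factor is 0.5, whereas the Polak-Ribière factor vanishes, which corresponds to an automatic restart along the steepest-descent direction.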
Algorithm 5. (Orbital optimization 2)
Set the convergence criterion $\epsilon_\phi > 0$.
for $k = 1,\dots,B$
  a. Minimization of the orbital with index $k$. Initialize ${\phi_k^{l,0}}' = \phi_k^l$ from the current set of orbitals $\phi^l$.
  b. Orthogonalize to previously optimized states: $\phi_k^{l,0} = {\phi_k^{l,0}}' - \sum_{i=1}^{k-1} \langle\phi_i^l|{\phi_k^{l,0}}'\rangle\, \phi_i^l$
  c. Calculate the relevant entries of $(\epsilon_{ik})$: $\epsilon_{ik}^{l,0} = \langle\hat{H}\phi_i^l|\phi_k^{l,0}\rangle\ \forall i = 1,\dots,B$ and $\epsilon_{ki}^{l,0} = \langle\phi_k^{l,0}|\hat{H}\phi_i^l\rangle\ \forall i = 1,\dots,B$
  for $m = 0,1,2,\dots$
    i. If $k > 1$, update $\epsilon_{in}^{l,m}$, $\epsilon_{ni}^{l,m}$ for all $n < k$.
    ii. Calculate the steepest-descent vector (Eq. (7.34)): $\zeta_k^m = -\left( \hat{H}\phi_k^{l,m} - \sum_{i=1}^{B} \epsilon_{ik}^{l,m}\, \phi_i^l \right)$
    iii. Apply the preconditioner: ${\eta_k^m}' = \hat{P}\zeta_k^m$
    iv. Orthogonalize to the current and previously optimized states: $\eta_k^m = {\eta_k^m}' - \langle\phi_k^{l,m}|{\eta_k^m}'\rangle\, \phi_k^{l,m} - \sum_{i=1}^{k-1} \langle\phi_i^l|{\eta_k^m}'\rangle\, \phi_i^l$
    v. Calculate the conjugation factor: $\gamma_k^m = \langle\eta_k^m|\zeta_k^m\rangle / \langle\eta_k^{m-1}|\zeta_k^{m-1}\rangle\ \forall m > 0$, $\gamma_k^0 = 0$
    vi. Calculate the conjugate-gradients vector: $\xi_k^m = \eta_k^m + \gamma_k^m\, \xi_k^{m-1}$
    vii. Calculate $B/A$ according to Eq. (7.42) with the expressions of Eq. (7.43) and determine the optimal angle $\Theta^*$.
    viii. Calculate the optimized orbital: $\phi_k^{m+1} = \phi_k^m \cos\Theta^* + \xi_k^m \sin\Theta^*$
    ix. Calculate the residue $R^{m+1} = |\hat{H}\phi_k^{m+1} - \epsilon_{kk}\, \phi_k^{m+1}|$; break if $m > 0$ and $R^{m+1} < \epsilon_\phi$ and $R^m < \epsilon_\phi$.
  end (for)
end (for)

We want to remark that we left out some details that are not crucial for the algorithm to work. We refer here again to [267] or directly to the source code of Octopus.
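The structure of the inner loop of algorithm 5 can be illustrated with a strongly simplified Python sketch. It is not the RDMFT routine of Octopus but the single-reference analogue for one orbital: the Hamiltonian is a fixed real symmetric matrix (here a hypothetical $4\times 4$ finite-difference matrix), the Lagrange multiplier is diagonal, and no preconditioner is applied. The function name `minimize_orbital`, the tolerance and the periodic restart of the conjugation are assumptions made for this illustration. For such a quadratic functional, the ansatz (7.40) is exact, so that every line minimization is exact as well.

```python
import math

def dot(u, v):
    """Inner product of two real vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def matvec(H, v):
    """Apply the matrix H (list of rows) to the vector v."""
    return [dot(row, v) for row in H]

def minimize_orbital(H, phi, tol=1e-9, max_iter=500):
    """Minimize <phi|H|phi> over normalized real vectors by conjugate gradients."""
    norm = math.sqrt(dot(phi, phi))
    phi = [p / norm for p in phi]
    xi_prev, zz_prev = None, None
    for m in range(max_iter):
        h_phi = matvec(H, phi)
        eps = dot(phi, h_phi)            # diagonal Lagrange multiplier
        # Step ii: steepest-descent vector (orthogonal to phi by construction).
        zeta = [eps * p - hp for p, hp in zip(phi, h_phi)]
        zz = dot(zeta, zeta)
        if math.sqrt(zz) < tol:          # step ix: residue small enough
            break
        # Steps iii-v: no preconditioner (eta = zeta); Fletcher-Reeves factor,
        # restarted every len(phi) steps as a simple safeguard.
        restart = xi_prev is None or m % len(phi) == 0
        gamma = 0.0 if restart else zz / zz_prev
        # Step vi: conjugate-gradients vector.
        xi = zeta if restart else [z + gamma * x for z, x in zip(zeta, xi_prev)]
        # Orthogonalize to the current orbital and normalize.
        overlap = dot(phi, xi)
        xi_n = [x - overlap * p for x, p in zip(xi, phi)]
        norm = math.sqrt(dot(xi_n, xi_n))
        xi_n = [x / norm for x in xi_n]
        # Step vii: derivatives at theta = 0 and optimal angle.
        dE0 = 2.0 * dot(xi_n, h_phi)
        d2E0 = 2.0 * (dot(xi_n, matvec(H, xi_n)) - eps)
        theta = 0.5 * math.atan2(-0.5 * dE0, 0.25 * d2E0)
        # Step viii: rotate the orbital by the optimal angle.
        phi = [math.cos(theta) * p + math.sin(theta) * x
               for p, x in zip(phi, xi_n)]
        xi_prev, zz_prev = xi, zz
    return dot(phi, matvec(H, phi)), phi

# Hypothetical test matrix: finite-difference Laplacian on four grid points.
H = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
energy, orbital = minimize_orbital(H, [1.0, 0.0, 0.0, 0.0])
```

The lowest eigenvalue of this matrix is $2 - 2\cos(\pi/5)$, which the sketch should reproduce. The RDMFT version differs in the points discussed above: the full matrix $(\epsilon_{ik})$ enters the steepest-descent vector and the first derivative, and the second derivative contains the additional terms of Eq. (7.44).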
7.4 Comparison of both orbital optimization methods

Having two different orbital-optimization techniques at hand, we are able to study the implications of the real-space description for practical RDMFT calculations. We want to stress here again that there is no other real-space implementation of RDMFT, which naturally leads to new types of problems that have to be solved.

The most pronounced limitation of the conjugate-gradients algorithm is the high computational cost of evaluating the exchange-correlation term that we have mentioned in Sec. 7.3.2. This has so far prevented us from performing all-electron real-space calculations in three spatial dimensions, which would be necessary for a comparison with an orbital-based code. However, there are new developments that might overcome this limitation (see Sec. 7.3.2). Besides that, all-electron calculations without pseudopotentials are very inefficient in a real-space code, because they require a very high resolution around the divergence of the Coulomb potential of the nuclei (see the introduction of this section). In RDMFT, this problem becomes even more pronounced than in single-reference methods, because we need to determine considerably more orbitals. Since all orbitals are orthogonal to each other, the number of nodes increases with the number of orbitals, which again requires a finer grid for a good representation. This is not only problematic for all-electron calculations, but could turn out to be a general limitation of real-space RDMFT.

Nevertheless, the RDMFT implementation in Octopus works and we have tested it extensively for one-dimensional systems, where we basically can afford arbitrarily fine grids. We thus want to finish this chapter about real-space RDMFT with a comparison of the different methods for a 1d model system. The following discussion is based on Ref. [3, Ch. 14]. To do such a calculation, we need to converge the two basic numerical parameters of every real-space calculation, i.e., the box length $L_x$ and the spacing $\delta x$. In RDMFT, we additionally have to converge the number of natural orbitals $B$, which is system-dependent but, in contrast to, e.g., the number of KS or HF orbitals, not equal to the particle number $N$. Since we have two methods for the orbital optimization at hand, it is especially interesting to compare these in terms of the convergence with respect to $B$.

The default method due to Piris and Ugalde (Piris method in the following) uses the orthonormality constraint of the natural orbitals, which implies that the “F-matrix” constructed from the Lagrange multipliers $\epsilon_{jk}$ is diagonal at the solution point (see Sec. 7.2). As an immediate consequence of the Piris method, the natural orbitals at the solution point are linear combinations of the orbitals used as the starting point for the minimization. In other words, the initial orbitals serve as a basis and the convergence of the method will depend on this basis. The conjugate-gradients algorithm also requires a set of initial orbitals to start the self-consistent calculation, but at convergence the results are independent of that starting point. Therefore, while a calculation using the Piris method requires a set of initial states that serve as the basis, the conjugate-gradients algorithm can be started from an initial set of random states. In our tests of the conjugate-gradients implementation, the quality of the initial states only had an influence on the number of iterations necessary for convergence, but not on the final result.
We suggest using the orbitals obtained from an independent-particle calculation as initial states, since they can be obtained at low numerical cost and can simultaneously serve as a basis set in the Piris implementation.

We thus will compare different starting points for the Piris method with the conjugate-gradients implementation. For the former, the initial orbitals are taken to be the solutions obtained with a different level of theory, like independent particles (IP) or KS-DFT. In order to better understand the effect of the choice of basis, we tested the following choices: (i) independent particles, KS-DFT within (ii) the local density approximation (LDA) or (iii) the exact exchange (EXX) approximation, as well as (iv) HF theory. In all cases, we have to ensure that the number of unoccupied states in the calculation is sufficient to cover all the natural orbitals that will obtain a significant occupation in the following RDMFT calculation. The results for the convergence of the total energy of a one-dimensional (1D) hydrogen molecule (see Sec. 2.5) using the Müller functional [227] are given in Fig. 7.1. The calculations were performed on a 1D grid extending from − v(x) = − √(x − d) + − √(x + d) + d = w(x, x′) = √(x − x′) +

Figure 7.1: Total energy for the RDMFT calculation for one-dimensional H$_2$ using the Piris method with different basis sets and using the conjugate-gradients implementation. The inset shows a zoom into the area where convergence is reached.

Chapter 8 POLARITONIC STRUCTURE THEORY: A NUMERICAL PERSPECTIVE
This second chapter of the numerics part deals with the extension of electronic-structure methods to coupled light-matter problems. The principal strategy to do this has been laid out in Sec. 5.1. In this chapter, we show how to do this in practice in two steps. We first present how to extend an electronic-structure code to treat dressed orbitals with the modified interaction and potential, which is already sufficient to describe specific situations such as the two-polariton case (fermion ansatz, Sec. 8.1). We exemplify this simple way to implement polaritonic-structure methods with our implementation of polaritonic HF and polaritonic RDMFT in Octopus [3, Ch. 4]. Then, we show how to do this also in the general case with the example of a purpose-built implementation (Sec. 8.2).

At this point, we want to remark briefly on the particular numerical challenge of modifying well-known electronic-structure algorithms. We present in App. B an example that highlights that even simple modifications (such as adding the dressed term to the interaction kernel) of a sufficiently complex code require extensive testing. Besides revealing such issues, the validation procedure of a numerical implementation often also influences the development of the theory itself. For instance, we only realized the importance of the hybrid statistics during the exhaustive convergence studies that are presented in App. C.
In this section, we present how to apply the general prescription of Sec. 5.1 for turning an electronic-structure method into a polaritonic-structure method, using the example of our implementation of dressed orbitals in Octopus. The current state of the code supports the inclusion of one photon mode and matter systems of one spatial dimension. We therefore present the method explicitly for the 1+1-dimensional case. The method is part of Octopus version 10.0 or higher, which is publicly available (https://octopus-code.org/wiki/Octopus_10).

[Footnote: This section is based on Ref. [3, part 4].]

[Footnote: Note that the implementation is in principle general enough to treat 3d matter systems, but this has not yet been tested.]

As we have mentioned already, our implementation of the dressed orbitals is based on the RDMFT implementation in Octopus and can thus only be used for RDMFT and HF calculations. However, a generalization to other methods is possible with little effort, because the only crucial change is in the Poisson solver that calculates integrals of the form
$$
\tilde{v}(\mathbf{r}) = \int \mathrm{d}^z r'\, \tilde{\rho}(\mathbf{r}')\, \tilde{w}(\mathbf{r},\mathbf{r}'). \tag{8.1}
$$
Here, $\mathbf{r} \in V \subset \mathbb{R}^z$ (in the present case $\mathbf{r} = (x,q)$ and thus $z = 2$) and $\tilde{w}(\mathbf{r},\mathbf{r}')$ is the two-body integral kernel. In the standard 3d case, this is the Coulomb kernel $\tilde{w}(\mathbf{r},\mathbf{r}') = w(\mathbf{r},\mathbf{r}') = 1/|\mathbf{r}-\mathbf{r}'|$, which makes Eq. (8.1) the solution of Poisson’s equation $\nabla^2\tilde{v} = -4\pi\tilde{\rho}$ and explains the name of the routine. For 1d, it is by default the soft-Coulomb kernel $w(x,x') = 1/\sqrt{(x-x')^2+\epsilon^2}$ with a softening parameter $\epsilon$. The function $\tilde{\rho}(\mathbf{r})$ can be some orbital density $\tilde{\rho}(\mathbf{r}) = \rho_{ij}(\mathbf{r}) = \phi_i^*(\mathbf{r})\phi_j(\mathbf{r})$ that occurs in the exchange-correlation potential (cf. Eq. (7.11)) or the total density $\tilde{\rho} = \rho(\mathbf{r}) = \sum_{i=1}^{M} n_i \rho_{ii}(\mathbf{r})$ that occurs in the Hartree potential (cf. Eq. (7.10)). The Poisson solver is thus utilized by all methods in Octopus to include the contribution of the (approximate) Coulomb interaction.

To generalize the code to dressed orbitals, we have to define one additional coordinate $q$ that parametrizes the photon degrees of freedom and replace the Coulomb kernel
$$
w(x,x') \rightarrow w'(xq,x'q') = w(x,x') + \underbrace{\left[ -\frac{\omega}{\sqrt{N}}\lambda q x' - \frac{\omega}{\sqrt{N}}\lambda q' x + (\lambda x)(\lambda x') \right]}_{w_d(xq,x'q')}, \tag{8.2}
$$
where $\omega$ is the frequency of the photon mode and $\lambda = \lambda_x$ is the component of the polarization vector in the direction of the spatial dimension $x$ of the matter system. Obviously, the extra term $w_d(xq,x'q')$ is not related to Poisson’s equation and thus, it does not really “belong” in the Poisson solver.
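The structure of the dressed part of the kernel can be made explicit with a small Python sketch. It evaluates Eq. (8.1) for $w_d$ of Eq. (8.2) once as a direct sum over the grid and once in factorized form, which is possible because $w_d$ contains only products of single coordinates. All numerical values, the grids and the model density are hypothetical and serve only as an illustration; the last term of $w_d$ is taken as $(\lambda x)(\lambda x')$, which is an assumption of this sketch. The sketch does not reproduce the Octopus routine.

```python
import math

# Hypothetical parameters: mode frequency, coupling strength, particle number.
omega, lam, n_el = 1.2, 0.1, 2
c = omega * lam / math.sqrt(n_el)

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]     # matter grid
qs = [-0.8, 0.0, 0.8]                # grid for the photon coordinate
dx, dq = 0.5, 0.8                    # grid spacings
points = [(x, q) for x in xs for q in qs]

# Hypothetical model density on the (x, q) grid.
rho = {(x, q): math.exp(-x * x - q * q) * (1.0 + 0.3 * x) for (x, q) in points}

def w_dressed(x, q, xp, qp):
    """Dressed part of the kernel, assumed form of w_d in Eq. (8.2)."""
    return -c * q * xp - c * qp * x + lam * lam * x * xp

def direct_sum(x, q):
    """Eq. (8.1) for w_d as a plain sum over all grid points (x', q')."""
    return sum(rho[(xp, qp)] * w_dressed(x, q, xp, qp)
               for (xp, qp) in points) * dx * dq

# Factorized evaluation: two moments of the density are computed only once.
m_x = sum(rho[(xp, qp)] * xp for (xp, qp) in points) * dx * dq
m_q = sum(rho[(xp, qp)] * qp for (xp, qp) in points) * dx * dq

def factorized(x, q):
    """Same quantity as direct_sum, using the precomputed moments."""
    return -c * q * m_x - c * x * m_q + lam * lam * x * m_x

max_diff = max(abs(direct_sum(x, q) - factorized(x, q)) for (x, q) in points)
```

The direct sum needs one pass over the grid for every point $(x,q)$, whereas the factorized form needs two passes in total, which is the cost of two inner products.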
To avoid changing the structure of the code too much, the application of $w_d(xq,x'q')$ is nevertheless implemented in this part of the code. As a consequence, it is not possible to choose the specific method to evaluate Eq. (8.1) when dressed orbitals are used. One is constrained to the (inefficient) direct-sum method, which just calculates the integral as a summation on the grid. We want to stress here that the evaluation of Eq. (8.1) for the extra part $w_d(xq,x'q')$ is considerably less involved than solving Poisson’s equation, because its two-coordinate dependence is only due to simple products. To calculate for example
$$
\int \mathrm{d}x'\,\mathrm{d}q'\, \rho_{ij}(x',q') \left[ -\frac{\omega}{\sqrt{N}}\lambda q' x \right] = -\frac{\omega}{\sqrt{N}}\lambda x \int \mathrm{d}x'\,\mathrm{d}q'\, \rho_{ij}(x',q')\, q',
$$
we can pull out $x$ and thus need to evaluate the integral only once for all values of $x$, which has merely the (very low) computational cost of an inner product. For the Coulomb kernel instead, we have to evaluate the integral separately for every value of $x$ when we perform a naive direct-sum calculation. This operation scales quadratically with the grid size and is the bottleneck of basically every method. More sophisticated methods can reduce this scaling considerably. For example, the standard Poisson solver of Octopus performs a Fourier transformation of Eq. (8.1) to turn the integral into a multiplication and then transforms back to real space. Using the fast Fourier transformation, this procedure reduces the scaling to $B \ln B$ [138].

[Footnote: The calculation of an inner product scales linearly with the grid size $B$.]

Besides the two-body integral kernel, also the local potential has to be modified by adding a dressed extra term, as discussed in Sec. 5.1. This is not yet implemented explicitly in Octopus. Instead, the user has to make use of the option of a user-defined potential, following, e.g., the example of the test suite of Octopus. Finally, we have also implemented the output of some photonic expectation values at the end of the calculation. They are written to the static_information.

[Footnote: See https://octopus-code.org/wiki/Developers:Starting_to_develop.]

The modifications just described allow one to perform polaritonic RDMFT and HF calculations with the fermion ansatz, which does not explicitly guarantee the Pauli principle (see Sec. 5.1). This means that in order to guarantee results that do not violate the Pauli principle, we either have to restrict the system to one active orbital (thus the two-particle singlet case) or make sure that the photon frequency $\omega$ is large enough to guarantee the Pauli principle trivially (see Sec. 4.2.3 and Fig. 6.3).

To go beyond that and consider the polariton ansatz, we need to enforce the $N$-representability of the electronic 1RDM with an extra routine, which would require a considerable change of the code. We have so far only tested this in a small purpose-built code that we present in the next section.

In the following, we will show how to go beyond the fermion ansatz of dressed orbitals in practice, i.e., how to (approximately) enforce the hybrid statistics of polaritonic orbitals in a numerical minimization (polariton ansatz, see Sec. 5.1). The additional challenge in comparison to the fermion ansatz is here to enforce a set of nonlinear (only implicitly given) inequality constraints during the minimization of the (also nonlinear) energy functional of a given method. Including such complicated constraints requires considerable modifications of standard first-principles algorithms and it is a priori not clear which algorithm is suited.
We therefore analyze the problem from a general perspective and specifically present the augmented-Lagrangian algorithm, which is accurate, yet simple to implement and validate (Sec. 8.2.1).

We then show with the example of polaritonic HF how to obtain a general polaritonic-structure method that accounts for the hybrid statistics (polariton ansatz) by combining the augmented-Lagrangian algorithm with a standard electronic-structure algorithm. To test the algorithm, we have implemented it in a simple purpose-built code, which we have employed to show that the polariton ansatz indeed overcomes the limitations of the fermion ansatz (see Sec. 6.1) and to highlight the influence of localization on the light-matter interaction (Sec. 6.4.4). To get accustomed to the new challenges, it is important to reduce the complexity of the minimization problem as much as possible. Therefore, we have developed and tested the algorithm in a simplified setting with 2 electron orbitals.

[Footnote: The code is written in the programming language Python, employing the numerical routines of the scientific programming library NumPy (https://numpy.org/).]

To start the discussion, let us briefly recapitulate the generic minimization problem (5.8) that we have to solve to guarantee the Pauli principle within polaritonic-structure theory (polariton ansatz). Irrespective of the particular method that is considered, the principal numerical challenge of the polariton ansatz is to enforce a set of $M$ nonlinear inequality constraints $g_i \geq 0$ ($i = 1,\dots,M$) during the minimization. Without loss of generality, we assume that the state of the system is described by $N$ orbitals $\phi = \{\phi_1,\dots,\phi_N\}$. The goal is to minimize the energy
$$
E = E[\phi], \tag{8.3}
$$
which is a (nonlinear) functional of the $N$ polaritonic orbitals $\phi = (\phi_1,\dots,\phi_N)$, under a set of equality constraints (cf. Eq. (5.4))
$$
c_{ik}[\phi] = 0 \quad \forall\, i,k \tag{8.4}
$$
and $B_m$ inequality constraints (cf. Eq. (5.7))
$$
g_i[\phi] = 1 - n_i[\phi] \geq 0 \quad \forall\, i, \tag{8.5}
$$
where the $n_i$ are the (electronic) natural occupation numbers, i.e., the eigenvalues of the electronic 1RDM
$$
\gamma_e[\phi](\mathbf{r},\mathbf{r}') = \int \mathrm{d}q\, \gamma[\phi](\mathbf{r}q,\mathbf{r}'q), \tag{8.6}
$$
where $\gamma[\phi](z,z') = \sum_{k=1}^{N} \phi_k^*(z')\phi_k(z)$ is the polaritonic 1RDM. With these definitions, the minimization problem reads
$$
\begin{aligned}
\text{minimize} \quad & E[\phi] \\
\text{subject to} \quad & c_{ik}[\phi] = 0, \\
& g_i[\phi] \geq 0.
\end{aligned} \tag{8.7}
$$
Here, the dependence $g_i = g_i[\phi]$ is only implicitly known. Thus, the nonlinearity of the constraints $g_i$ is comparable to, e.g., the nonlinearity of the HF equations, which suggests a similar solution strategy, i.e., another SCF procedure. Additionally, we have to account for the inequality character of the new constraints, which, as we will see below, requires different methods than equality constraints. The polariton ansatz thus demands to nest both types of algorithms, which is nontrivial and requires considerable modifications of the standard algorithms.

[Footnote on the description by $N$ orbitals: For instance, this applies to HF and KS-DFT. However, we can straightforwardly generalize the argument to, e.g., RDMFT if we additionally consider the natural occupation numbers. As we have shown in Sec. 7, this only influences the equality constraints, but not the inequality part. For other methods, a similar argument holds.]

In order to solve problem (8.7), it is possible to generalize the Lagrange-multiplier technique and derive a set of necessary conditions for a stationary point, which are known as the Karush-Kuhn-Tucker (KKT) conditions. Additionally, we will assume that our functional has only one stationary point, which is a minimum. In this case, the KKT conditions are not only necessary but also sufficient to characterize this minimum. As usual in first-principles theory, it is very difficult to prove such an assumption and we can only hope that it is justified in most cases. We remark that in the tested scenarios, the code always converged to the same solution, independently of the starting point.
The KKT conditions
To present the KKT conditions, we introduce the Lagrange multipliers $\bar\epsilon = (\bar\epsilon_{11}, \bar\epsilon_{12},\dots,\bar\epsilon_{NN})$ and the KKT multipliers $\bar\nu = (\bar\nu_1,\dots,\bar\nu_{B_m})$ and define the Lagrangian
$$L[\phi;\bar\epsilon,\bar\nu] = E[\phi] - \sum_{i,j=1}^{N} \bar\epsilon_{ij}\, c_{ij}[\phi] - \sum_{i=1}^{B_m} \bar\nu_i\, g_i[\phi]. \tag{8.8}$$
Note that there are in principle as many $g_i$ as matter basis states, the number of which we have denoted in the previous section by $B_m$. In practice, it is however not necessary to consider all these constraints, as we discuss below. (For a short summary of the original publications and later generalizations, the reader is referred to the "Notes and References" of [266, part 12, p. 349f.].) The KKT theorem then states that if $\phi^s$ is a stationary point of the minimization problem (8.7), there exist $(\bar\epsilon^s,\bar\nu^s)$ such that the following four sets of conditions hold (under some regularity conditions on $E$):
1. Stationarity
$$\frac{\partial}{\partial\phi_k^*} E[\phi^s] = \sum_{ij} \bar\epsilon^s_{ij}\, \frac{\partial}{\partial\phi_k^*} c_{ij}[\phi^s] + \sum_i \bar\nu^s_i\, \frac{\partial}{\partial\phi_k^*} g_i[\phi^s], \tag{8.9a}$$
2. Primal feasibility
$$g_i[\phi^s] \geq 0 \quad \forall\, i, \tag{8.9b}$$
$$c_{ij}[\phi^s] = 0 \quad \forall\, i,j, \tag{8.9c}$$
3. Dual feasibility
$$\bar\nu^s_i \geq 0 \quad \forall\, i, \tag{8.9d}$$
4. Complementary slackness
$$\bar\nu^s_i\, g_i[\phi^s] = 0 \quad \forall\, i. \tag{8.9e}$$
Note that the KKT conditions reduce to the Lagrange conditions for equality constraints if all the $g_i = 0$.

Before we discuss how to determine $(\phi^s,\bar\epsilon^s,\bar\nu^s)$ for a given problem in practice, let us have a closer look at the KKT conditions. From a practical point of view, stationarity is the most important KKT condition, because it provides us with the basic equations that have to be solved in order to determine the minimum. Primal feasibility is the trivial KKT condition that is defined by the minimization problem itself, and primal feasibility together with stationarity are equivalent to the Lagrange conditions for equality constraints. The KKT conditions that are not present in Lagrange theory are dual feasibility and complementary slackness. The former is necessary to fix the sign of the term containing the inequality constraints (if $\bar\nu_i \leq 0$, the conditions would describe the constraint $g_i \leq 0$ instead of $g_i \geq 0$). The latter states that either $\bar\nu^s_i = 0$ or $g_i(\phi^s) = 0$. If $\phi^s$ is in the strictly feasible region where $g_i > 0$, there is no need for "correction" and thus the corresponding KKT multiplier $\bar\nu^s_i = 0$. Only if the solution is on the boundary of the feasible region, i.e., $g_i(\phi^s) = 0$, can we have $\bar\nu^s_i > 0$. This is exactly the case when the unconstrained minimization would have a solution $\phi^{s\prime}$ that is infeasible, i.e., when $g_i(\phi^{s\prime}) < 0$. In this case, the KKT multiplier plays the same role as the Lagrange multiplier and in this sense KKT theory is a straightforward generalization of Lagrange theory (see [266, part 12.3] for the mathematical details).

The major difference between dealing with inequality constraints in comparison to equality constraints is thus that we do not need to determine as many KKT multipliers as there are constraints. For our specific minimization problem, most of the multipliers will indeed be zero. The reason is that the particle number is conserved, i.e., $\sum_{i=1}^{B_m} n_i = N$, and consequently the bound $g_i = 0$ can be attained for at most $N/2$ conditions (which corresponds to the non-interacting case, where the leading natural occupation numbers take their maximal value and all remaining ones vanish). This is especially important when $B_m$ is big: we will never need to determine all $g_i$, which would require determining all the $B_m$ eigenvalues of the electronic 1RDM, but only $M$ eigenvalues, where $M \lesssim N$.

The basic approaches to solve nonlinear (inequality) constrained minimization problems
However, we still need a method to determine the necessary KKT multipliers, and the fact that we do not know a priori the active set, i.e., the $\bar\nu_i$ that are non-zero, crucially influences the corresponding algorithm. Nowadays one can find a plethora of such algorithms that have been implemented and applied successfully. For an overview of the topic, we refer the reader to the book of Nocedal and Wright [266], which we also use as the main reference for this chapter. The minimization problem (8.7) considers a nonlinear energy functional and nonlinear constraints and thus belongs to the hardest class of minimization problems. To solve such problems, there are basically three different classes of algorithms.
• The sequential quadratic programming approach: The idea of this approach is to constrain the minimization to a subspace such that a quadratic approximation of the energy functional becomes very accurate. The challenge is then to choose this subspace in a smart way, taking the constraints into account.
• The barrier or interior-point methods: Here, one enforces the inequality constraints during the minimization by using a barrier function that depends on the barrier parameter $\mu$. During the minimization, $\mu$ is reduced successively until the KKT conditions are fulfilled.
• The penalty methods: Instead of employing a barrier that constrains the minimization to the feasible region and needs to be reduced for convergence, here violations of the constraint are penalized. For convergence, the penalty is successively increased.
All of these have certain advantages and disadvantages, which are in principle well-studied. However, it is not easy to apply this knowledge to our specific problem. We therefore make the following pragmatic choices: Since our aim is to extend the framework of a working electronic-structure code such as OCTOPUS, we try to choose a method that is as minimally invasive as possible. Therefore, we exclude sequential quadratic programming approaches, which are conceptually very different from the standard electronic-structure algorithms. For instance, it is not straightforward to combine active sets with the line-minimization of the conjugate-gradients method by Payne et al. [267]. Unfortunately, this means excluding the class of methods that are considered to "show their strength when solving problems with significant nonlinearities in the constraints" [266, part 18, p. 529].

The other two classes of methods can in principle be employed to extend standard electronic-structure methods. Conceptually, barrier and penalty methods are very similar, but in practice, the latter are easier to handle. Usual penalty functions are simpler to implement and to debug than typical barrier functions. Therefore, the final version of our proposed algorithm employs the augmented Lagrangian method, which is based on a penalty function. Before we discuss the full algorithm in Sec. 8.2.2, we briefly outline this method in the following.
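The conceptual difference between the barrier and the penalty class can be made concrete with a one-dimensional toy problem. The following minimal sketch is not taken from the code of this work; the toy energy $E(x) = (x-2)^2$, the constraint $g(x) = 1 - x \geq 0$, the logarithmic barrier, and all function names are assumptions chosen for illustration. The constrained minimum is at $x = 1$; the barrier iterates approach it from inside the feasible region while the barrier parameter is reduced, and the quadratic-penalty iterates approach it from the infeasible side while the penalty parameter is increased.

```python
import math


def g(x):
    """Toy inequality constraint g(x) = 1 - x >= 0 (hypothetical example)."""
    return 1.0 - x


def barrier_minimizer(mu):
    """Minimizer of (x - 2)^2 - mu * log(1 - x) on x < 1.

    Setting the derivative 2(x - 2) + mu / (1 - x) to zero gives the quadratic
    2x^2 - 6x + 4 - mu = 0; we return its root below 1.
    """
    return (6.0 - math.sqrt(4.0 + 8.0 * mu)) / 4.0


def penalty_minimizer(mu):
    """Minimizer of (x - 2)^2 + (mu / 2) * max(x - 1, 0)^2.

    On the penalized branch x > 1 the derivative 2(x - 2) + mu (x - 1)
    vanishes at x = (4 + mu) / (2 + mu), which is always larger than 1.
    """
    return (4.0 + mu) / (2.0 + mu)


# Barrier parameter is reduced, penalty parameter is increased.
barrier_path = [barrier_minimizer(mu) for mu in (1.0, 0.1, 0.01)]
penalty_path = [penalty_minimizer(mu) for mu in (1.0, 10.0, 100.0)]

for x in barrier_path:
    print(f"barrier: x = {x:.5f}, g = {g(x):+.5f}")
for x in penalty_path:
    print(f"penalty: x = {x:.5f}, g = {g(x):+.5f}")
```

In this toy setting, every barrier iterate is strictly feasible and every penalty iterate slightly violates the constraint, which is the behavior described in the list above.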
The basic quantity of every penalty method is a function that penalizes violations of the constraints. For simplicity, we discuss in the following only the case of inequality constraints (penalty functions can also be applied to equality-constrained minimization problems), i.e., we consider the minimization problem
$$\text{minimize } E(\phi) \quad \text{subject to } g_i(\phi) \geq 0. \tag{8.10}$$

Figure 8.1: The unconstrained minimum $\phi^s_E$ of the energy lies in the infeasible region, where $g < 0$. The solution $\phi^s$ of the constrained problem is thus on the boundary, where $g = 0$. Penalty methods construct an auxiliary minimization problem by adding a penalty function $P(\mu)$ to the energy (blue solid lines). For every value of the penalty parameter $\mu$, we can find the minimum $\phi^s(\mu)$ of the auxiliary problem, which approaches the true minimum $\phi^s$ for increasing $\mu$. This is illustrated for two example values $\mu_1$ (light-blue) and $\mu_2$ (dark-blue), where $\mu_2 > \mu_1$.

For that, we define the quadratic penalty function
$$P(\phi;\mu) = \frac{\mu}{2} \sum_i \left([g_i(\phi)]^-\right)^2, \tag{8.11}$$
where $\mu > 0$ is the penalty parameter and $[\cdot]^-$ is defined as $[x]^- = \max(-x, 0)$ for any real number $x$. Therefore, $P$ penalizes constraint violations ($g_i < 0$) quadratically and proportional to $\mu$, but has no effect if $g_i \geq 0$. It can be shown that a minimization with this penalty function converges to the exact solution of the minimization problem (8.10) for $\mu \to \infty$ [266, Theorem 17.3]. The nonsmoothness of $P$ is a necessary price to pay for this property, but in practice, this is not a big issue. Most importantly, one can generalize the stationarity KKT conditions (Eq. (8.9a)) to this case by making use of directional derivatives (for details, see [266, part 17.2]). In a concrete algorithm, we consider the Lagrangian
$$L[\phi;\mu] = E[\phi] + P[\phi;\mu], \tag{8.12}$$
and minimize $L[\phi;\mu_m]$ to obtain approximate minima $\phi^m = \phi^s(\mu_m)$ for increasing $\mu_m > \mu_{m-1}$ until convergence. We have illustrated this procedure in Fig. 8.1.

To calculate the derivative of $P$, we have to deal with its non-differentiability. For the contribution of condition $i$, we can calculate (using $[g_i]^- = -g_i$ for $g_i < 0$)
$$\frac{\partial}{\partial\phi_k^*}\left(\frac{\mu}{2}\left([g_i[\phi]]^-\right)^2\right) = \begin{cases} \mu\, g_i[\phi]\, \dfrac{\partial}{\partial\phi_k^*} g_i[\phi] & g_i < 0 \\[1ex] 0 & g_i > 0, \end{cases} \tag{8.13}$$
but for $g_i = 0$, there is no well-defined derivative. As we have mentioned before, the formally correct way to define the stationarity condition (8.9a) for nonsmooth penalty functions is by means of the directional derivative. Sophisticated algorithms take care of this by "smoothing procedures" [266, part 17.2] that effectively remove the non-differentiable point. However, a simpler way to deal with this issue is to explicitly keep the step in the gradient and to define
$$\left.\frac{\partial}{\partial\phi_k^*}\left(\frac{\mu}{2}\left([g_i(\phi)]^-\right)^2\right)\right|_{g_i = 0} = \mu\, g_i[\phi]\, \frac{\partial}{\partial\phi_k^*} g_i[\phi], \tag{8.14}$$
where $-\mu g_i[\phi]$ is assumed to remain finite (see below). We therefore have the following stationarity conditions
$$\frac{\partial}{\partial\phi_k^*} E[\phi] = -\sum_i \Theta(-g_i[\phi])\, \mu\, g_i[\phi]\, \frac{\partial}{\partial\phi_k^*} g_i[\phi], \tag{8.15}$$
where we employed the Heaviside step function, $\Theta(x) = 1$ for $x \geq 0$ and $\Theta(x) = 0$ for $x < 0$. The principal disadvantage of the penalty method is that we need in principle an infinitely large $\mu$ to satisfy the constraints $g_i \geq 0$ exactly. To see this, we compare Eq. (8.15) with the KKT conditions (8.9a). The KKT multiplier corresponding to the inequality constraint $g_i$ reads for the penalty method (assuming $g_i < 0$)
$$\bar\nu^p_i = -\mu\, g_i(\phi). \tag{8.16}$$
When $g_i \to 0$, we need $\mu \to \infty$ to have a finite KKT multiplier $\bar\nu_i$. This can be problematic in practice, because $\mu$ is part of the gradient $\frac{\partial L}{\partial\phi_k^*} = \frac{\partial}{\partial\phi_k^*} E[\phi] + \sum_i \Theta(-g_i[\phi])\, \mu\, g_i[\phi]\, \frac{\partial}{\partial\phi_k^*} g_i[\phi]$ (and of the higher derivatives such as the Hessian). If $\mu$ is large, even small errors in $\frac{\partial L}{\partial\phi_k^*}$ are amplified strongly, which might prevent an algorithm that employs $\frac{\partial L}{\partial\phi_k^*}$ (or other derivatives) to determine the minimization steps from converging. This well-known problem of penalty (and many other) methods is called ill conditioning (see p. 505f of [266] for a more general discussion).

The standard way to reduce the ill conditioning of the quadratic penalty method is to "augment" the Lagrangian by a further linear term. This augmented Lagrangian method has considerably better convergence properties than the penalty method. The corresponding Lagrangian reads
$$L[\phi;\nu,\mu] = E[\phi] - \sum_i \nu_i\, g_i + \frac{\mu}{2}\sum_i \left([g_i]^-\right)^2, \tag{8.17}$$
where we introduced a new set of Lagrange multipliers $\nu = (\nu_1,\dots,\nu_{B_m})$. The corresponding stationarity conditions read
$$\frac{\partial}{\partial\phi_k^*} E[\phi] = \sum_i \left[\nu_i - \mu\, g_i[\phi]\right] \frac{\partial}{\partial\phi_k^*} g_i[\phi], \tag{8.18}$$
and a comparison with Eq. (8.9a) yields the KKT multiplier of the augmented Lagrangian method,
$$\bar\nu^a_i = \nu_i - \mu\, g_i[\phi]. \tag{8.19}$$
Assuming that the algorithm is in the $m$-th iteration step close to the solution point, we have
$$g_i[\phi^m] \approx -\frac{1}{\mu}\left(\bar\nu^a_i - \nu^m_i\right).$$
Thus, if $\nu^m_i \approx \bar\nu^a_i$, the constraint $g_i \approx 0$ is satisfied already for moderate values of $\mu$. Instead, in the simple penalty method, we have close to the solution
$$g_i[\phi^m] \approx -\frac{1}{\mu}\,\bar\nu^p_i,$$
and accordingly need much larger $\mu$ to satisfy $g_i \approx 0$. We therefore see how the problem of ill conditioning is considerably reduced by the linear term of the augmented Lagrangian method.

To guarantee $\nu^m_i \to \bar\nu^s_i$ in practice, Eq. (8.19) even provides us with an update formula, i.e.,
$$\nu^{k+1}_i = \nu^k_i - \mu_k\, g_i[\phi^k]. \tag{8.20}$$
The additional Lagrange multipliers $\nu$ are initialized to zero and updated to values $\nu_i > 0$ only for those constraints that are violated during the minimization and therefore end up on the boundary $g_i = 0$. The $\nu_i$ hence play the same role as the usual Lagrange multipliers for equality constraints.

Having introduced the basic ingredients of the augmented Lagrangian method to guarantee inequality constraints in minimization algorithms, we present in this section a concrete algorithm to solve the minimization problem (8.7). We have implemented this algorithm in a purpose-built code to produce the results of Ref. [2] that we have presented and discussed in Sec. 6. Specifically, we combine the augmented Lagrangian method with a simplified version of an algorithm of the LANCELOT software package (see the corresponding book by the head developers Conn et al. [286]), following Ref. [266, algorithm 17.4].

We now present our newly developed algorithm to solve the minimization problem (8.7) for the polaritonic HF equations with the polariton ansatz (polariton-HF algorithm). We start by recapitulating the most important definitions and reformulating the inequality constraints (8.5) in a more suitable form. We then apply the augmented Lagrangian method introduced in the previous subsection to the specific problem. We derive all the necessary equations and present the basic idea of an algorithm to solve these.
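Before turning to the specific problem, the difference between the quadratic penalty method, Eqs. (8.11) and (8.16), and the augmented Lagrangian method, Eqs. (8.17) and (8.20), can be illustrated on a one-dimensional toy problem. The sketch below is an assumption-laden illustration and not the implementation of this work: the toy energy $E(x) = (x-2)^2$, the constraint $g(x) = 1 - x \geq 0$, and all function names are hypothetical, and the inner minimization is done in closed form. The KKT solution of the toy problem is $x = 1$ with multiplier $\bar\nu = 2$.

```python
def g(x):
    """Toy inequality constraint g(x) = 1 - x >= 0 (hypothetical example)."""
    return 1.0 - x


def minimize_lagrangian(nu, mu):
    """Closed-form minimizer of the convex toy Lagrangian

        L(x) = (x - 2)^2 - nu * (1 - x) + (mu / 2) * max(x - 1, 0)^2,

    i.e., Eq. (8.17) for one constraint; nu = 0 gives the pure penalty method.
    """
    x = (4.0 - nu + mu) / (2.0 + mu)   # stationary point on the penalized branch
    if x > 1.0:
        return x
    return 2.0 - nu / 2.0              # stationary point on the feasible branch


def quadratic_penalty(mu):
    """One penalty minimization; returns x and the estimate -mu * g, Eq. (8.16)."""
    x = minimize_lagrangian(0.0, mu)
    return x, -mu * g(x)


def augmented_lagrangian(mu, n_iter):
    """Fixed-mu augmented Lagrangian iteration with the update of Eq. (8.20)."""
    nu = 0.0                           # multipliers are initialized to zero
    x = minimize_lagrangian(nu, mu)
    for _ in range(n_iter):
        x = minimize_lagrangian(nu, mu)
        if g(x) < 0.0:                 # update only while the constraint is violated
            nu = nu - mu * g(x)
    return x, nu


x_pen, nu_pen = quadratic_penalty(10.0)
x_al, nu_al = augmented_lagrangian(10.0, 30)
print(f"penalty   (mu = 10): x = {x_pen:.6f}, g = {g(x_pen):+.6f}")
print(f"augmented (mu = 10): x = {x_al:.6f}, g = {g(x_al):+.6f}, nu = {nu_al:.6f}")
```

With the same moderate penalty parameter, the pure penalty solution keeps a finite constraint violation, whereas the multiplier update drives the violation to zero without increasing $\mu$, which is the reduced ill conditioning discussed above.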
Reformulating the minimization problem
We describe the polaritonic orbitals in a basis set with finite dimension $B = B_m B_{ph}$, where $B_m$ ($B_{ph}$) is the size of the matter (photon) basis with corresponding index $i$ ($\alpha$). We assume spin-restriction (see Sec. 2.1.2) and thus consider for a system of $N$ electrons $N/2$ polaritonic orbitals $\phi = (\phi_1,\dots,\phi_{N/2})$ that are represented as vectors $\phi_k = (\phi^{i\alpha}_k) \in \mathbb{R}^{B_m B_{ph}}$. Consequently, $\phi$ can be regarded as a super-vector (a vector of vectors) or equivalently as a matrix. The goal is to minimize the HF energy functional
$$E[\phi] = \sum_i \langle\phi_i|(\hat T[t'] + \hat V[v'])\phi_i\rangle + \sum_{i,k}\left[\langle\phi_k|\hat J_i[w']\phi_k\rangle - \langle\phi_k|\hat K_i[w']\phi_k\rangle\right], \tag{cf. 5.9}$$
where we used the definitions of the "dressed" Coulomb operator $\hat J_i$, which acts as $\hat J_i\phi_k(z) = \int \mathrm{d}z'\, \phi_i^*(z')\, w'(z;z')\, \phi_i(z')\, \phi_k(z)$ (cf. Eq. (5.10a)), and the "dressed" exchange operator $\hat K_i$, which acts as $\hat K_i\phi_k(z) = \int \mathrm{d}z'\, \phi_i(z)\, w'(z;z')\, \phi_i^*(z')\, \phi_k(z')$ (cf. Eq. (5.10b)). Here $t'$, $v'$, $w'$ are the dressed integral kernels, cf. Eqs. (4.38), (4.39) and (4.40).

Using the definition of the polaritonic 1RDM
$$\gamma^{i\alpha,i'\alpha'} = \sum_{k=1}^{N/2} \phi^{*\,i'\alpha'}_k\, \phi^{i\alpha}_k, \tag{8.21}$$
we define
$$\hat J[\gamma]\phi_k(z) \equiv \sum_{i=1}^{N/2}\hat J_i\phi_k(z) = \sum_{i=1}^{N/2}\int \mathrm{d}z'\, \phi_i^*(z')\, w'(z;z')\, \phi_i(z')\, \phi_k(z) = \int \mathrm{d}z'\, w'(z;z')\, \gamma[\phi](z',z')\, \phi_k(z), \tag{8.22a}$$
$$\hat K[\gamma]\phi_k(z) \equiv \sum_{i=1}^{N/2}\hat K_i\phi_k(z) = \sum_{i=1}^{N/2}\int \mathrm{d}z'\, \phi_i(z)\, w'(z;z')\, \phi_i^*(z')\, \phi_k(z') = \int \mathrm{d}z'\, w'(z;z')\, \gamma[\phi](z,z')\, \phi_k(z'). \tag{8.22b}$$
Thus, the energy can be expressed as
$$E[\phi] = \sum_i \langle\phi_i|(\hat T + \hat V)\phi_i\rangle + \sum_k\left[\langle\phi_k|\hat J[\gamma]\phi_k\rangle - \langle\phi_k|\hat K[\gamma]\phi_k\rangle\right], \tag{8.23}$$
where we abbreviated $\hat T \equiv \hat T[t']$ and $\hat V \equiv \hat V[v']$. The equality constraints (8.4) guarantee the orthonormality of the polaritonic orbitals and read explicitly
$$c_{ik}[\phi] = \langle\phi_i|\phi_k\rangle - \delta_{ik} = 0, \quad i,k = 1,\dots,N/2. \tag{8.24}$$
Because of the spin-restriction, the $M$ inequality constraints (8.5) read
$$g_i(\phi) = 2 - n_i[\phi] \geq 0, \quad i = 1,\dots,M, \tag{8.25}$$
where the $n_i$ are the (electronic) natural occupation numbers. The number $M \leq B_m$ is smaller than or equal to the size of the electronic basis $B_m$ and depends on the specific problem (see Sec. 8.2.1). To determine the $n_i$ and therefore the $g_i$ from a given set of states $\phi$, we first calculate the polaritonic 1RDM
$$\gamma^{i\alpha,i'\alpha'} = \sum_{k=1}^{N/2} \phi^{*\,i'\alpha'}_k\, \phi^{i\alpha}_k. \tag{8.26}$$
From $\gamma$, we obtain the electronic 1RDM (cf. Eq. (8.6))
$$\gamma_e^{i,i'} = \sum_{\alpha=1}^{B_{ph}} \gamma^{i\alpha,i'\alpha} = \sum_{\alpha=1}^{B_{ph}}\sum_{k=1}^{N/2} \phi^{*\,i'\alpha}_k\, \phi^{i\alpha}_k \tag{8.27}$$
by a contraction over the photon index, which means that we set $\alpha = \alpha'$ and perform the sum $\sum_{\alpha=1}^{B_{ph}}$.
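The contraction of Eq. (8.27) and the evaluation of the constraint functions of Eq. (8.25) can be sketched with NumPy as follows. This is a minimal illustration rather than the code of this work: the array layout `(N/2, B_m, B_ph)`, the function names, and in particular the explicit occupation factor `occ = 2.0` (assumed here to account for the spin summation of the restricted setting) are assumptions. The eigenvalues are obtained with `numpy.linalg.eigh`.

```python
import numpy as np


def electronic_1rdm(phi, occ=2.0):
    """Electronic 1RDM from real polaritonic orbitals phi[k, i, alpha].

    The photon index alpha is contracted as in Eq. (8.27); the factor `occ`
    is an assumed spin-occupation factor of the restricted setting.
    """
    return occ * np.einsum("kia,kja->ij", phi, phi)


def constraint_values(phi, occ=2.0):
    """Natural occupations (descending) and the constraints g_i = 2 - n_i."""
    n, _ = np.linalg.eigh(electronic_1rdm(phi, occ))
    n = n[::-1]                      # eigh returns ascending eigenvalues
    return n, 2.0 - n


B_m, B_ph = 3, 2

# Two orthonormal polaritonic orbitals with different electronic parts
# and the same photon state: a fermion-like, feasible configuration.
phi_feasible = np.zeros((2, B_m, B_ph))
phi_feasible[0, 0, 0] = 1.0
phi_feasible[1, 1, 0] = 1.0

# Two orthonormal polaritonic orbitals with the SAME electronic part and
# different photon states: orthogonal as polaritons, but the electronic
# occupation exceeds the Pauli bound, so one g_i becomes negative.
phi_infeasible = np.zeros((2, B_m, B_ph))
phi_infeasible[0, 0, 0] = 1.0
phi_infeasible[1, 0, 1] = 1.0

n_ok, g_ok = constraint_values(phi_feasible)
n_bad, g_bad = constraint_values(phi_infeasible)
print("feasible:   n =", n_ok, " g =", g_ok)
print("infeasible: n =", n_bad, " g =", g_bad)
```

The second configuration shows why orthonormality of the polaritonic orbitals alone does not guarantee the Pauli principle for the electronic subsystem, which is the motivation for the inequality constraints.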
Then, we have to diagonalize $\gamma_e$ by solving the system of equations
$$\sum_{i'=1}^{B_m} \gamma_e^{i,i'}\, \psi_j^{e,i'} = n_j\, \psi_j^{e,i}$$
for the natural occupation numbers $n_j$ and the natural orbitals $\psi^e_j$. Diagonalizing a matrix can be performed numerically very efficiently, but there is no explicit formula that relates $\gamma_e$ to its eigenvalues, i.e., the $n_i$ are implicit functions of $\phi$. (Note that this is very similar to the nonlinearity of the HF equations themselves, which we have to solve by a self-consistent diagonalization; see the discussion in Sec. 7.2 and below.)

To disentangle the diagonalization step of $\gamma_e$ from the rest of the algorithm, we consider natural orbitals and dressed orbitals as independent variables of the minimization (this is similar to considering $\phi$ and $\phi^*$ as independent) and enforce their connection as an additional constraint. We collect the natural orbitals in the vector $\psi^e = (\psi^e_1,\dots,\psi^e_{B_m})$ and define (cf. Eq. (4.49))
$$n_i[\phi,\psi^e]\, \psi^e_i = \hat\gamma_e[\phi]\, \psi^e_i, \tag{8.28}$$
where
$$n_i[\phi,\psi^e] = \langle\psi^e_i|\hat\gamma_e\psi^e_i\rangle. \tag{8.29}$$
The inequality constraints (8.25) therefore become explicit functionals of $\phi$ and $\psi^e$, i.e.,
$$g_i = g_i[\phi,\psi^e] = 2 - n_i[\phi,\psi^e]. \tag{8.30}$$
Since the $\psi^e$ are independent variables now, we also have to enforce their orthonormality by a third set of conditions
$$f_{ij} = \langle\psi^e_i|\psi^e_j\rangle - \delta_{ij} = 0. \tag{8.31}$$
Importantly, this construction allows us to consider either $\phi$ or $\psi^e$ as constant while optimizing the other, and to connect both in a self-consistent field procedure.

We have thus transformed the original minimization problem (5.8) with the implicit inequality constraints (8.25) into the new minimization problem
$$\text{minimize } E[\phi] \quad \text{subject to } c_{ij}[\phi] = 0, \quad g_i[\phi,\psi^e] \geq 0, \quad f_{ij}[\psi^e] = 0, \tag{8.32}$$
with the explicit inequality constraints (8.30).

Enforcing the inequality constraints: the augmented Lagrangian method
To enforce the inequality constraints, we consider an augmented Lagrangian algorithm, which we have introduced and motivated in the previous section (especially Sec. 8.2.1.2). The augmented Lagrangian method extends a given Lagrangian with extra penalty terms, which allows us to do a similar "trick" as we have shown before for the RDMFT generalization (see Sec. 7.3): we use the conjugate-gradients method by Payne et al. [267], but exchange the Lagrangian. However, we will see in the following that the inequality constraints require a considerably stronger modification of the algorithm (and several further steps) than the generalization to RDMFT, where we only had to consistently exchange the diagonal elements with the full Lagrange-multiplier matrix.

Consequently, instead of minimizing $E$ directly, we consider the Lagrangian
$$L[\phi,\psi^e] = E[\phi] + C[\phi] + G[\phi,\psi^e], \tag{8.33}$$
where
$$C[\phi] = -\sum_{ij} \bar\epsilon_{ij}\, c_{ij}[\phi] \tag{8.34}$$
is a standard Lagrange term to enforce the equality constraints (8.24) that introduces the Lagrange multipliers $\bar\epsilon_{ij}$. We used here the notation according to the translation rules depicted in Fig. 5.1. The other term $G$ includes all extra terms that are related to the inequality constraints. For our specific construction, these are on the one hand a standard Lagrange term, $-\sum_{ij}\bar\theta_{ij} f_{ij}$, to guarantee the constraints (8.31) and on the other hand the two terms, $-\sum_i \nu_i g_i + \frac{\mu}{2}\sum_i([g_i]^-)^2$, that are introduced by the augmented Lagrangian method (see Sec. 8.2.1.2). The $\bar\theta_{ij}$ and $\nu_i$ are Lagrange multipliers, $\mu$ is the penalty parameter and $[y]^- = \max(-y, 0)$. Collecting all three terms, we have
$$G[\phi,\psi^e] = -\sum_i \nu_i\, g_i[\phi,\psi^e] + \frac{\mu}{2}\sum_i\left([g_i]^-[\phi,\psi^e]\right)^2 - \sum_{ij}\bar\theta_{ij}\, f_{ij}[\psi^e], \tag{8.35}$$
which completes the definition of the Lagrangian (8.33).

To derive the corresponding stationarity conditions, we need to evaluate functional derivatives of the form
$$\frac{\partial}{\partial\phi_k^*} g_i(\phi) = \frac{\partial}{\partial\phi_k^*}(2 - n_i) = -\frac{\partial}{\partial\phi_k^*} n_i,$$
which is nontrivial to compute. Using operator perturbation theory, one can derive
$$\frac{\partial}{\partial\phi_k^*} n_i = \sum_{j'=1}^{B_m} \psi_i^{e,j'\,*}\, \phi_k^{j',\alpha}\, \psi_i^{e,j}, \tag{8.36}$$
which can be seen as a projection of only the electronic part of $\phi_k$ onto the natural orbital $\psi^e_i$ that corresponds to $n_i$. The derivation of Eq. (8.36) is shown in appendix A.3 for the general case of real-space orbitals. For convenience, we introduce the operator $\hat G_i$ that acts as
$$\hat G_i\phi_k = \sum_{j'=1}^{B_m} \psi_i^{e,j'\,*}\, \phi_k^{j',\alpha}\, \psi_i^{e,j} \tag{8.37}$$
on a given polaritonic orbital $\phi_k$ and thus $\frac{\partial}{\partial\phi_k^*} g_i = -\hat G_i\phi_k$.

The complete stationarity conditions (cf. Eq. (6.4), Eq. (8.9a)) read
$$0 = \frac{\partial}{\partial\phi_k^*} L = \hat H\phi_k - \sum_j \bar\epsilon_{kj}\,\phi_j + \sum_i\left[\nu_i - \mu[g_i]^-\right]\hat G_i\phi_k, \tag{8.38a}$$
$$0 = \frac{\partial}{\partial\psi_i^{e\,*}} L = \left(\mu[g_i]^- - \nu_i\right)\hat\gamma_e\psi^e_i - \sum_j \bar\theta_{ij}\,\psi^e_j, \tag{8.38b}$$
where we considered $\phi_k$, $\phi_k^*$, $\psi^e_i$, $\psi_i^{e\,*}$ as independent variables and employed the definition of the Fock operator (cf. Eq. (5.11))
$$\hat H[\gamma] = 2(\hat T + \hat V) + \hat J[\gamma] - \hat K[\gamma]. \tag{8.39}$$
Looking at Eq. (8.38a), we observe a structural similarity to the stationarity conditions for electronic single-reference methods. We have a Lagrange-multiplier matrix $\bar\epsilon_{kj}$ and nonlinear operators that depend on the orbitals $\phi$ of a Slater determinant. Since the one-body Hamiltonian $\hat H$ and the new operators $\hat G_i$ are all hermitian, we can also here diagonalize the Lagrange-multiplier matrix, $\bar\epsilon_{ij} = \delta_{ij}\epsilon_j$, and bring Eq. (8.38a) into the form of an eigenvalue equation
$$\epsilon_k\phi_k = \hat H\phi_k + \sum_i\left[\nu_i - \mu[g_i]^-\right]\hat G_i\phi_k. \tag{8.40a}$$
The same is possible for the second gradient equation, cf. Eq. (8.38b), because the electronic 1RDM $\gamma_e$ is also hermitian. We choose $\bar\theta_{ij} = \delta_{ij}\theta_j$ and rewrite
$$\theta_i\psi^e_i = \left(\mu[g_i]^- - \nu_i\right)\hat\gamma_e\psi^e_i. \tag{8.40b}$$
This equation is in principle nontrivial to solve. However, as the $\phi$ and $\psi^e$ are treated as independent variables, we can equivalently solve the eigenvalue problem for $\gamma_e$ (Eq. (8.28)) and then simply set $\theta_i = n_i\left(\mu[g_i]^- - \nu_i\right)$. With these definitions, we are able to perform HF calculations with the polariton ansatz by numerically solving Eqs. (8.40a) and (8.40b) with the expressions (8.23) and (8.39).

The basic polariton-HF algorithm
We split the full algorithm into the following two principal parts, which we discuss separately.
1. The inner part that (approximately) minimizes the subproblem
$$L^{(\mu_l,\nu_l)}[\phi,\psi^e;\epsilon,\theta] = L[\phi,\psi^e;\epsilon,\theta;\mu = \mu_l,\nu = \nu_l]$$
for fixed penalty parameters $(\mu_l,\nu_l)$ that are updated by the outer part. This part can be solved in principle by any minimization method and we will employ a modified version of the conjugate-gradients algorithm by Payne et al. [267]. We call this the penalty-corrected conjugate-gradients (PCG) method.
2. The outer part that constitutes the actual augmented Lagrangian method. Here we iteratively update $(\mu_l,\nu_l) \to (\mu_{l+1},\nu_{l+1})$ until the approximate solution $(\phi^{(\mu_l,\nu_l)},\psi^{e,(\mu_l,\nu_l)})$ of the subproblem is sufficiently feasible, $g_i^{(\mu_l,\nu_l)} = g_i[\phi^{(\mu_l,\nu_l)},\psi^{e,(\mu_l,\nu_l)}] \simeq 0$. For this, we employ a simplified version of an algorithm of the LANCELOT software package, following Ref. [266, algorithm 17.4].
(Note on Eq. (8.39): the several factors of 2 that appear in this expression are the occupation numbers of the polaritonic orbitals, which arise from the spin-summation in the restricted setting. These numbers also occur in electronic HF. However, they are usually neglected, because they enter every term of the HF equations linearly. Here, this is not possible anymore because of the nonlinear penalty function.)

We start with the second part of the algorithm, i.e., the augmented Lagrangian method that constitutes the "penalty loop" with iteration index $l$. This outer algorithm has two tasks. First, it needs to provide a set of rules to update $(\mu_l,\nu_l) \to (\mu_{l+1},\nu_{l+1})$, and second, it needs to control the convergence threshold $\epsilon_{\mathrm{PCG}}$ of the inner part of the algorithm. As long as we are far away from the overall solution, there is no point in finding the minimum of $L^{(\mu_l,\nu_l)}[\phi,\psi^e;\epsilon,\theta]$ with a high precision.
On the contrary, a too small $\epsilon_{\mathrm{PCG}}$ far away from the solution can even prevent the code from converging, especially due to the strongly nonlinear character of $L$. Thus, the penalty loop should increase or decrease $\epsilon_{\mathrm{PCG}}$, depending on the constraint functions $g_i^{(\mu_l,\nu_l)}$ and the gradient of $L^{(\mu_l,\nu_l)}$.

Let us first discuss the updates of the penalty parameters. These are determined by the "measure of feasibility," i.e., the value of the constraint functions
$$g_i^l = g_i[\phi^{(\mu_l,\nu_l)},\psi^{e,(\mu_l,\nu_l)}]$$
for the current approximate set of orbitals. If we have not found the overall solution, where $\delta L = 0$, there must be at least one function $g_i^l < 0$, and the magnitude of $g_i^l$ indicates how far we are away from the feasible region. If the violation is weak, i.e., if $-g_i^l < \epsilon_g^l$ is smaller than a threshold $\epsilon_g^l > 0$, the current $\mu_l$ is producing an acceptable level of constraint violation and we update $\mu_l \to \alpha_\mu\mu_l$ by a factor $\alpha_\mu \gtrsim 1$, where $\alpha_\mu = 1$ leaves $\mu$ unchanged. At this point, the "augmented" Lagrange multiplier $\nu_i$ comes into play, which we update according to the rule (cf. (8.20) and [266, Eq. 17.39])
$$\nu_{i,l+1} = \nu_{i,l} - \mu_l\, g_i^l. \tag{8.41}$$
As discussed in Sec. 8.2.1.2, this update formula is not chosen arbitrarily but is prescribed by the augmented Lagrangian formalism. It is a crucial ingredient for the convergence of the method (see also [266, part 17.3]). If instead $-g_i^l \geq \epsilon_g^l$, the minimization has not been penalized "enough" and thus $\mu_l \to \beta_\mu\mu_l$ is raised by a factor $\beta_\mu > 1$. We choose a conservative value of $\beta_\mu = 20$, but very stable codes can even afford values up to $\beta_\mu = 100$ to speed up the convergence [266, part 17.3]. If $\mu$ is updated in this way, we should not update the Lagrange multipliers, i.e., we keep $\nu_{l+1} = \nu_l$, because $L$ is nonlinear and the update formula (8.41) is only valid close to the solution. We would risk increasing $\nu_l$ too much.

Now, we still need to define the threshold $\epsilon_g^l$ that determines whether $\mu$ or $\nu$ is updated. Clearly, $\epsilon_g^l$ should depend on how close we are to the solution, and one can show that a good measure for that is the current value of $g_i^l$ and $\mu_l$ [266, Theorems 17.5 and 17.6]. We employ the update formulas of [266, Algorithm 17.4], which have a form that can be derived analytically (this derivation is presented in [287, part 14.4]) but involve certain free parameters, which have been determined empirically. Additionally, [266, Algorithm 17.4] provides us with an update rule for the convergence criterion $\epsilon_{\mathrm{PCG}}$ of the inner part of the algorithm. In total, we have the following update formulas:
$$\begin{aligned} &\text{if } -g_i^l \leq \epsilon_g^l: && \nu_{i,l+1} = \nu_{i,l} - \mu_l\, g_i^l, \quad \mu_{l+1} = \alpha_\mu\mu_l, \quad \epsilon_g^{l+1} = (1/\mu_{l+1}), \quad \epsilon_{\mathrm{PCG}}^{l+1} = \epsilon_{\mathrm{PCG}}^{l}/\mu_{l+1}, \\ &\text{else:} && \nu_{i,l+1} = \nu_{i,l}, \quad \mu_{l+1} = \beta_\mu\mu_l, \quad \epsilon_g^{l+1} = (1/\mu_{l+1}), \quad \epsilon_{\mathrm{PCG}}^{l+1} = \epsilon_{\mathrm{PCG}}^{l}/\mu_{l+1}. \end{aligned} \tag{8.42}$$
The above prescription is completed by the initial value of the penalty parameter $\mu$. We initialize the $\nu$ to zero, because they are non-negative and can only grow due to the update formula (8.41). Thus, if we choose them too big in the beginning, the algorithm cannot converge.

The last missing piece of the polaritonic HF algorithm is the overall convergence criterion $\epsilon_{\mathrm{total}}$, which must be set by the user. Since we aim to find the stationary point of $L$, the eigenvalue equations (8.40a) and (8.40b) provide the natural convergence criterion. However, Eq. (8.40b) is much simpler than Eq. (8.40a) and therefore it is sufficient to test only Eq. (8.40a) (see below).
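The update prescription of Eq. (8.42) can be sketched as a single function. This is a hedged illustration, not the implementation of this work: the function and argument names are hypothetical, the threshold updates are written without additional exponents (Nocedal and Wright's Algorithm 17.4 uses fractional powers of $\mu$ at this point), and the multiplier update is restricted to violated constraints so that the $\nu_i$ cannot decrease, which is an assumption consistent with the statement that they can only grow.

```python
import numpy as np


def update_penalty_parameters(nu, mu, g, eps_g, eps_pcg,
                              alpha_mu=1.0, beta_mu=20.0):
    """One outer-loop update in the spirit of Eq. (8.42).

    nu      : array of multipliers nu_i
    mu      : current penalty parameter
    g       : array of constraint values g_i (negative means violated)
    eps_g   : current feasibility threshold
    eps_pcg : current convergence threshold of the inner minimization
    """
    violation = max(0.0, float(np.max(-g)))      # largest constraint violation
    if violation <= eps_g:
        # Acceptable violation: update the multipliers, Eq. (8.41),
        # only for violated constraints (assumption, see lead-in).
        nu_new = nu - mu * np.minimum(g, 0.0)
        mu_new = alpha_mu * mu
    else:
        # Violation too strong: keep the multipliers, raise the penalty.
        nu_new = nu.copy()
        mu_new = beta_mu * mu
    eps_g_new = 1.0 / mu_new                     # assumed exponent 1
    eps_pcg_new = eps_pcg / mu_new
    return nu_new, mu_new, eps_g_new, eps_pcg_new


nu0 = np.zeros(2)
weak = update_penalty_parameters(nu0, 10.0, np.array([-0.05, 0.3]), 0.1, 0.1)
strong = update_penalty_parameters(nu0, 10.0, np.array([-0.5, 0.3]), 0.1, 0.1)
print("weak violation:  ", weak)
print("strong violation:", strong)
```

Keeping both branches in one function makes it easy to log, for every penalty iteration, which branch was taken, which is helpful when a run does not converge.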
To test Eq. (8.40a), we define the residue sum
$$R_l = \sum_{k=1}^{N/2}\sqrt{\left\langle \phi_k^{(\mu_l,\nu_l)} \,\middle|\, \left.\frac{\partial}{\partial\phi_k^*} L\right|_{\phi_k = \phi_k^{(\mu_l,\nu_l)}} \right\rangle}, \tag{8.43}$$
which should go to zero when we approach the overall solution. (The quantity $\sqrt{\langle\phi_k|\frac{\partial}{\partial\phi_k^*}L\rangle}$ is called the residue of the $k$-th gradient $\nabla_{\phi_k^*}L$.) However, it may happen that $R_l$ is very small although the code has not yet converged. This situation occurs for instance when we are close to the unconstrained minimum of $E$ in the infeasible region, i.e., the solution of the polaritonic-HF minimization under the fermion approximation. Thus, we consider a second convergence test based on the value of the Lagrangian function $L_l = L^{(\mu_l,\nu_l)}$ that we (approximately) calculate in the PCG algorithm (see below). We simply compare the differences
$$\Delta L_l = \left|L^{(\mu_{l-1},\nu_{l-1})} - L^{(\mu_l,\nu_l)}\right| \tag{8.44}$$
and consider the calculation converged if
$$R_l < \epsilon_{\mathrm{total},R}, \tag{8.45}$$
$$\Delta L_l < \epsilon_{\mathrm{total},L}, \tag{8.46}$$
where we usually simply set $\epsilon_{\mathrm{total}} = \epsilon_{\mathrm{total},R} = \epsilon_{\mathrm{total},L}$.

Algorithm 6. (augmented Lagrangian)
Set the overall convergence criterion $\epsilon_{\mathrm{total}} > 0$.
Initialize $\phi^{(\mu_0,\nu_0)}$.
Initialize the penalty parameters $\mu_0$ and $\nu_0 = 0$.
Initialize $\epsilon_g$, $\epsilon_{\mathrm{PCG}}$ according to (8.42).
(Approximately) calculate and save $L_l$ for the overall convergence test.
for $l = 0, 1, 2, \dots$ (penalty loop)
    Inner part (PCG algorithm): Solve $\frac{\partial}{\partial\phi_k^*}L < \epsilon^l_{\mathrm{PCG}}$ and $\frac{\partial}{\partial\psi_i^{e\,*}}L < \epsilon^l_{\mathrm{PCG}}$ to obtain $\phi^{(\mu_{l+1},\nu_{l+1})}$, $\psi^{e,(\mu_{l+1},\nu_{l+1})}$.
    Then,
    (A) Calculate $g_i = 2 - n_i$.
    (B) Update $\mu_{l+1}$, $\nu_{l+1}$, $\epsilon_g^{l+1}$, $\epsilon_{\mathrm{PCG}}^{l+1}$ according to (8.42).
    (C) Calculate $L_{l+1}$ (see PCG algorithm) and the residue sum $R_{l+1}$ according to Eq. (8.43).
    break if $\max(\Delta L_{l+1}, R_{l+1}) < \epsilon_{\mathrm{total}}$
end for (penalty loop)

Let us now discuss the inner part of the pHF algorithm, i.e., the PCG algorithm.
The goal is to approximately solve the two intermediate coupled eigenvalue problems
$$\epsilon_k\phi_k \approx \hat H[\gamma[\phi]]\phi_k + \sum_i\left[\nu_{l,i} - \mu_l[g_i[\phi,\psi^e_i]]^-\right]\hat G_i[\psi^e_i]\phi_k, \tag{8.47a}$$
$$\theta_i\psi^e_i \approx \left(\mu_l[g_i[\phi,\psi^e_i]]^- - \nu_{l,i}\right)\hat\gamma_e\psi^e_i, \tag{8.47b}$$
up to the threshold $\epsilon_{\mathrm{PCG}}$ (we show below how $\epsilon_{\mathrm{PCG}}$ enters the algorithm). We have explicitly highlighted the functional dependence of the occurring operators:
• $\hat H = \hat H[\gamma]$ depends only on the polaritonic 1RDM $\gamma$ (Eq. (8.39)),
• $\hat G_i = \hat G_i[\psi^e_i]$ depends on the natural orbitals $\psi^e_i$ (Eq. (8.37)) and
• $g_i = g_i[\phi,\psi^e_i] = 2 - n_i[\phi,\psi^e_i]$ depends on both $\phi$ and $\psi^e_i$ (Eq. (8.29)) and thus couples the two eigenvalue problems (8.47a) and (8.47b).
These intricate dependencies suggest splitting the minimization into three parts that define the following three nested loops, each with its own iteration index and with the convergence thresholds $\epsilon_\phi$, $\epsilon_{\psi^e}$, $\epsilon_{\mathrm{PCG}}$.
1. DO loop (threshold $\epsilon_\phi$): In the innermost loop, Eq. (8.47a) is (approximately) solved for fixed $\psi^e$ and fixed $\hat H[\gamma]$ to update the dressed orbitals, $\phi^m \to \phi^{m+1}$.
2. NO loop (threshold $\epsilon_{\psi^e}$): The next loop (approximately) solves Eq. (8.47b) for fixed $\hat H[\gamma]$ and $\phi$ to update the natural orbitals, $\psi^{e,m} \to \psi^{e,m+1}$.
3. PCG loop (threshold $\epsilon_{\mathrm{PCG}}$): In the outermost loop of the PCG algorithm, $\hat H[\gamma^m] \to \hat H[\gamma^{m+1}]$ is updated.
In principle, every quantity that is updated depends on three iteration indices, one for each loop. To simplify the notation, we explicitly keep track within one loop only of the corresponding index $m$. For example, we denote the polaritonic orbitals for the orbital optimization by $\phi^m_k$. Once we have completed the orbital optimization, we can update the set $\phi^m \to \phi^{m+1}$ in the next higher loop, which is the NO loop in this case. Once the NO loop has converged, we can update the set $\phi^m \to \phi^{m+1}$ of the outermost loop of the inner minimization, the PCG loop (which plays the role of the SCF loop). Once the PCG loop is converged, we can update the penalty parameters $(\mu_l,\nu_l) \to (\mu_{l+1},\nu_{l+1})$ in the second part of the algorithm. Correspondingly, we have to update the polaritonic orbitals in this loop, i.e., $\phi^{(\mu_l,\nu_l)} \to \phi^{(\mu_{l+1},\nu_{l+1})}$.

The dressed-orbital optimization (DO loop)
Instead of solving Eq. (8.40a) directly, we employ a modified version of the conjugate-gradients algorithm of Payne et al. [267] to minimize the Lagrangian $L[\phi]$ with fixed $\psi^e$. To ease reading, we denote the approximate orbitals during the orbital optimization only by $\phi^m = (\phi^m_1, \ldots, \phi^m_{N/2})$, where $m$ denotes the iteration step. We reserve the upper index $(\mu_l,\nu_l)$ only for the solutions of the complete inner minimization. The steepest-descent vector for orbital $\phi^m_k$ is given by the negative gradient (cf. Eq. (7.34))

$$\zeta^m_k = -\frac{\partial}{\partial \phi^*_k} L^{(\mu_l,\nu_l)} = -\left( \hat{H} + \sum_i \left[ \nu_{l,i} - \mu_l\,[g_i]^- \right] \hat{G}_i - \epsilon^m_k \right) \phi^m_k, \tag{8.48}$$

where the Lagrange multiplier $\epsilon^m_k$ is estimated in every step by projecting the eigenvalue equation (8.47a) onto $\phi^m_k$, i.e.,

$$\epsilon^m_k = \langle \phi^m_k | \hat{H} \phi^m_k \rangle + \sum_i \left[ \nu_{l,i} - \mu_l\,[g_i]^- \right] \langle \phi^m_k | \hat{G}_i \phi^m_k \rangle. \tag{8.49}$$

Up to this point, we have generalized the conjugate-gradients algorithm in a very similar way as shown for RDMFT in Sec. 7.3. We changed the basic Lagrangian in comparison to the one originally considered in Ref. [267] and, accordingly, calculated a modified steepest-descent vector. We then follow the procedure of Ref. [267] to obtain the corresponding conjugate-gradients vector $\xi^m_k$, which is analogous to how we proceeded for RDMFT. The most important steps are summarized in algorithm 5 of Sec. 7.3 and we do not repeat them here.

However, we have to modify the line-minimization to properly account for the penalty. This is nontrivial and we have developed a new algorithm for this task that we present below (see algorithm 8). We remark that our test implementation skips the preconditioning step, which is not crucial for small systems and the tight-binding-like approximation of the kinetic energy.

As the convergence criterion for the orbitals $\phi_k$ in the DO loop, we define

$$|\phi^{m+1}_k - \phi^m_k| < \epsilon^l_{\mathrm{PCG}}. \tag{8.50}$$

The natural orbital optimization (NO loop)
The next higher iteration loop is the “NO loop” that updates the natural orbitals $\psi^{e,m}$ after the $\phi$-optimization (DO loop) has converged for fixed $\hat{H}$ and $\psi^{e,m}$. Instead of solving Eq. (8.47b) directly, we determine the first $M$ eigenvalues of $\gamma^m_e = \gamma_e[\phi^m]$, i.e., we solve

$$n^{m+1}_i \psi^{e,m+1}_i = \hat{\gamma}_e[\phi^m]\,\psi^{e,m+1}_i \qquad \text{(cf. 8.28)}$$

for a set of natural occupation numbers $n^{m+1} = (n^{m+1}_1, \ldots, n^{m+1}_M)$ and the new set of natural orbitals $\psi^{e,m+1}_i$. This is a linear eigenvalue problem and thus we can employ any standard eigensolver to solve it. We use the function “eigh” of the NumPy library.

If we then set $\theta^{m+1}_i = n^{m+1}_i (\mu\,[g_i]^- - \nu_i)$, we have solved Eq. (8.40b). In practice, we do not need to perform the latter step explicitly, because we only utilize the updated natural orbitals in the rest of the algorithm.

The convergence criterion of the NO loop is

$$|\gamma^{m+1}_e - \gamma^m_e| < \epsilon^l_{\mathrm{NO}}. \tag{8.51}$$

The update of the Fock-matrix (PCG loop)
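Before turning to the Fock-matrix update, the diagonalization step of the NO loop described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the helper name `update_natural_orbitals`, the toy matrix, and the choice to read "the first $M$ eigenvalues" as the $M$ largest occupation numbers are assumptions made for illustration only.

```python
import numpy as np


def update_natural_orbitals(gamma_e, n_orbitals):
    """Diagonalize a Hermitian 1RDM and keep n_orbitals eigenpairs.

    Hypothetical helper: "first M" is interpreted as the M largest
    occupation numbers, which is an assumption of this sketch.
    """
    # eigh is meant for Hermitian matrices; eigenvalues are returned
    # in ascending order, eigenvectors as the columns of `orbitals`.
    occupations, orbitals = np.linalg.eigh(gamma_e)
    # Reorder so that the largest occupation numbers come first.
    order = np.argsort(occupations)[::-1][:n_orbitals]
    return occupations[order], orbitals[:, order]


# Toy real-symmetric "1RDM" (illustrative numbers, not from the thesis).
gamma_e = np.array([[1.2, 0.3, 0.0],
                    [0.3, 0.6, 0.1],
                    [0.0, 0.1, 0.2]])
n_new, psi_new = update_natural_orbitals(gamma_e, n_orbitals=2)
```

Each column of `psi_new` then satisfies the linear eigenvalue equation with the corresponding entry of `n_new`, which is all the rest of the algorithm needs from this step.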
After the NO loop has converged, we have obtained a set of orbitals $(\phi^m, \psi^m)$ that (approximately) solve both Eq. (8.40a) and Eq. (8.40b) for a fixed $\hat{H}^m$. The PCG loop then updates

$$\hat{H}^{m+1} = \hat{H}[\gamma^m] = \hat{H}[\gamma[\phi^m]].$$

After the PCG loop has converged, we have found an approximate solution $(\phi^{(\mu_l,\nu_l)}, \psi^{e(\mu_l,\nu_l)})$ for the current values of the penalty parameters $(\mu_l,\nu_l)$.

To test the convergence of the PCG loop and thus of the inner part of the algorithm, we employ the criterion

$$|\gamma^{m+1} - \gamma^m| < \epsilon^l_{\mathrm{SCF}}. \tag{8.52}$$

Convergence thresholds
Since the outer algorithm only determines $\epsilon^l_{\mathrm{PCG}}$, we need to adapt the convergence thresholds of the other loops accordingly. Since the loops are nested, it is crucial that the criteria of inner loops are stricter than the thresholds of outer loops. In our implementation, we employ

$$\epsilon^l_{\mathrm{NO}} = 10^{-\ldots}\,\epsilon^l_{\mathrm{PCG}}, \qquad \epsilon^l_{\mathrm{DO}} = 10^{-\ldots}\,\epsilon^l_{\mathrm{NO}} = 10^{-\ldots}\,\epsilon^l_{\mathrm{PCG}}. \tag{8.53}$$

The full PCG algorithm
We now summarize the full PCG-algorithm in pseudo-code.
Algorithm 7. (PCG)

Initialize $\phi^0$ and calculate
    $\gamma^0 = \gamma[\phi^0]$ from Eq. (8.26),
    $\hat{H}^0 = \hat{H}[\gamma^0]$ from Eq. (8.39),
    $\gamma^0_e = \gamma_e[\gamma^0]$ from Eq. (8.6),
    $\psi^{e,0}$ by solving Eq. (8.28).
Initialize $\epsilon_{\mathrm{DO}}$, $\epsilon_{\mathrm{NO}}$ according to (8.53).
for $m_3 = 0, 1, 2, \ldots$ (PCG loop)
    for $m_2 = 0, 1, 2, \ldots$ (NO loop)
        Initialize $\phi^{m_2=0} = \phi^{m_3}$, $\psi^{e,m_2=0} = \psi^{e,m_3}$.
        for $k = 1, \ldots, N/2$ (state iteration)
            Initialize $\phi^{m_1=0}_k = \phi^{m_2}_k$.
            for $m_1 = 0, 1, 2, \ldots$ (DO loop)
                (a) Calculate $\zeta^{m_1}_k$ from Eq. (8.48) with the current $\hat{H}^{m_3}$, $\hat{G}^{m_2}_i$ and $\psi^{e,m_2}$.
                (b) Calculate $\xi^{m_1}_k$ according to the algorithm by Payne et al. [267] (see algorithm 5).
                (c) Perform a line-minimization along the direction of $\xi^{m_1}_k$ to obtain $\phi^{m_1+1}_k$ (algorithm 8).
                break if $|\phi^{m_1+1}_k - \phi^{m_1}_k| < \epsilon^l_{\mathrm{PCG}}$
            end for (DO loop)
        end for (state iteration)
        (i) Collect all final states of the DO loop into the new $\phi^{m_2+1}$.
        (ii) Calculate $\gamma^{m_2+1}_e = \gamma_e[\phi^{m_2+1}]$ from Eq. (8.6).
        (iii) Diagonalize $\gamma^{m_2+1}_e$, i.e., solve Eqs. (8.28), to obtain the new set $\psi^{e,m_2+1}$.
        break if $|\gamma^{m_2+1}_e - \gamma^{m_2}_e| < \epsilon^l_{\mathrm{NO}}$
    end for (NO loop)
    (1) Update $\phi^{m_3+1}$, $\psi^{e,m_3+1}$ with the final set of $\phi^{m_2}$, $\psi^{e,m_2}$ for which the NO loop has converged.
    (2) Update $\gamma^{m_3+1} = \gamma[\phi^{m_3+1}]$.
    (3) Calculate $\hat{H}^{m_3+1} = \hat{H}[\gamma^{m_3+1}]$ from Eq. (8.39).
    break if $|\gamma^{m_3+1} - \gamma^{m_3}| < \epsilon^l_{\mathrm{SCF}}$
end for (PCG loop)

The line-minimization
In order to employ the conjugate-gradients algorithm by Payne et al. [267] together with the augmented Lagrangian method, we need to modify the line-minimization (LM). The first-order Fourier approximation of the energy functional that Payne et al. [267] employ (see Sec. 7.3) is not justified here because we have to take the nonlinear penalty term into account. Thus, instead of minimizing $E$, we consider the “penalized energy” functional

$$\tilde{E}^{(\mu_l,\nu_l)} = E + P^{(\mu_l,\nu_l)}, \tag{8.54}$$

where $P^{(\mu_l,\nu_l)} = -\sum_i^M \nu_{l,i}\, g_i[\phi,\psi^e] + \frac{\mu_l}{2} \sum_i^M \left( [g_i]^-[\phi,\psi^e] \right)^2$. The goal of the LM is then to minimize $\tilde{E}^{(\mu_l,\nu_l)}$ along the direction that is defined by the conjugate-gradients vector, where we especially need to take the nonlinearity of the penalty part into account.

Specifically, we parametrize the conjugate-gradients vector $\xi^m_k$ by the angle $\Theta_k \in [0, \pi]$ (cf. Eq. (7.36)),

$$\tilde{\phi}^m_k(\Theta_k) = \cos\Theta_k\, \phi^m_k + \sin\Theta_k\, \xi^m_k,$$

and accordingly $\tilde{E}^{(\mu_l,\nu_l)} = \tilde{E}^{(\mu_l,\nu_l)}(\Theta_k)$, which we approximate by

$$\begin{aligned} \tilde{E}^{(\mu_l,\nu_l)}(\Theta_k) &\approx \tilde{E}^{(\mu_l,\nu_l)}(0) + \left\langle \tilde{\phi}^m_k(\Theta_k) \,\Big|\, \nabla_{\phi^*_i} \tilde{E}^{(\mu_l,\nu_l)} \big|_{\phi_k = \tilde{\phi}^m_k(\Theta_k)} \right\rangle \\ &= \tilde{E}^{(\mu_l,\nu_l)}(0) + \left\langle \tilde{\phi}^m_k(\Theta_k) \big| \hat{H} \tilde{\phi}^m_k(\Theta_k) \right\rangle + \sum_i \left[ \nu_{l,i} - \mu_l\, [g_i[\tilde{\phi}^m_k(\Theta_k)]]^- \right] \left\langle \tilde{\phi}^m_k(\Theta_k) \big| \hat{G}_i \tilde{\phi}^m_k(\Theta_k) \right\rangle. \end{aligned} \tag{8.55}$$

Note that this is not a first-order Taylor expansion, as one might think at first glance. Expression (8.55) has been developed specifically for this algorithm and it cannot be understood out of its context. We explain this in detail in the following.

We start by remarking that the operators $\hat{H}$ and $\hat{G}_i$ are constant during the DO optimization. Nevertheless, Eq. (8.55) is nonlinear because of the constraint functions $g_i$, which depend on $\phi$ through the occupation numbers (cf. Eq. (8.29))

$$n_i[\phi] = \langle \psi^e_i | \hat{\gamma}_e[\phi]\, \psi^e_i \rangle.$$

This formulation allows us to keep the natural orbitals $\psi^e_i$ fixed during the optimization of the $\phi^m$, which is consistent with the rest of the algorithm, but to consider

$$\gamma_e = \gamma_e[\tilde{\phi}^m_k(\Theta_k)] = \gamma_e[\Theta_k]$$

as a function of $\Theta_k$. To find the optimal angle $\Theta^*_k$, the natural next step would be to derive some approximate expression similar to Eq. (7.41) in Sec. 7.3, which would be crucial for larger systems. However, for our test implementation, we could afford to employ an algorithm that directly calculates $\tilde{E}^{(\mu_l,\nu_l)}(\Theta_k)$ with the approximate expression (8.55) for a not too large set of test values of $\Theta_k$. It is important to realize that to do so, we only need to construct but not to diagonalize $\gamma_e[\Theta_k]$, which is numerically cheap for not too large bases.

In practice, we perform the LM by a newly developed method that is inspired by so-called divide-and-conquer algorithms. The idea is to sample the interval of possible values of the angle $\Theta$, such that we can roughly locate the (global) minimum of $\tilde{E}^{(\mu_l,\nu_l)}(\Theta)$. We then define a smaller interval around the approximate position of the minimum, which we sample in the next iteration with the same number of test values to localize the minimum with a higher precision. We repeat this procedure until the interval is smaller than a prescribed threshold $\epsilon_{\mathrm{LM}}$ that we adapt with respect to the threshold of the DO loop, i.e.,

$$\epsilon^l_{\mathrm{LM}} = 10^{-\ldots}\,\epsilon^l_{\mathrm{PCG}} = 10^{-\ldots}\,\epsilon^l_{\mathrm{SCF}}. \tag{8.56}$$

Specifically, we define the interval

$$\Theta \in I^j = [\Theta^j_{\min}, \Theta^j_{\max}],$$

and initialize the search with the maximal possible interval, i.e., $\Theta^0_{\min} = 0$ and $\Theta^0_{\max} = \pi/2$. We then sample the energy $\tilde{E}^{(\mu_l,\nu_l)}(\Theta)$ over a set of $N_{\mathrm{LM}}$ linearly distributed test values in $I^j$.
To this end, we define $\Delta^j = (\Theta^j_{\max} - \Theta^j_{\min})/N_{\mathrm{LM}}$ and the test set

$$T^j = \{ \Theta^j_{\min},\ \Theta^j_{\min} + \Delta^j,\ \Theta^j_{\min} + 2\Delta^j,\ \ldots,\ \Theta^j_{\max} \}, \tag{8.57}$$

calculate

$$E^j = \{ \tilde{E}^{(\mu_l,\nu_l)}(\Theta) \mid \Theta \in T^j \}, \tag{8.58}$$

and determine the approximate minimum $E^j_{\min} = \min E^j$. The corresponding approximate minimizer $\Theta^j$, for which

$$E^j_{\min} = \tilde{E}^{(\mu_l,\nu_l)}(\Theta^j) \tag{8.59}$$

holds, then defines the center of the interval for the next iteration. We define the new (reduced) interval half-length $D^j_\Theta = \alpha_{\mathrm{LM}} |\Theta^j_{\min} - \Theta^j_{\max}|/2$, where $0 < \alpha_{\mathrm{LM}} < 1$, and set

$$\Theta^{j+1}_{\min} = \max(\Theta^j - D^j_\Theta,\, 0), \qquad \Theta^{j+1}_{\max} = \min(\Theta^j + D^j_\Theta,\, \pi/2). \tag{8.60}$$

Here the min and max functions are necessary to constrain $\Theta$ within the maximally allowed interval $[0, \pi/2]$. Depending on the values for $N_{\mathrm{LM}}$ and $\alpha_{\mathrm{LM}}$, the algorithm may perform quite differently and these values should be tested with care. We found that $N_{\mathrm{LM}} = 10$ and $\alpha_{\mathrm{LM}} = \ldots$ allowed us to minimize $\tilde{E}^{(\mu_l,\nu_l)}(\Theta)$ with a very high precision, which was very important in the debugging process. The algorithm is summarized in pseudo-code as follows.

Algorithm 8. (PCG line-minimization)

Set the convergence threshold $\epsilon_{\mathrm{LM}}$ according to Eq. (8.56), the sampling number $N_{\mathrm{LM}}$ and $\alpha_{\mathrm{LM}}$ for the interval reduction.
Initialize $\Theta^0_{\min} = 0$ and $\Theta^0_{\max} = \pi/2$ and set $I^0 = [\Theta^0_{\min}, \Theta^0_{\max}]$.
for $j = 0, 1, 2, \ldots$
    a. Sample $I^j$ according to Eq. (8.57)
    b. Calculate $E^j$ according to Eq. (8.58)
    c. Determine the approximate minimum $\Theta^j$ according to Eq. (8.59)
    d. Update $\Theta^{j+1}_{\min}$ and $\Theta^{j+1}_{\max}$ according to Eq. (8.60)
    break if $(\Theta^{j+1}_{\max} - \Theta^{j+1}_{\min})/2 < \epsilon_{\mathrm{LM}}$
end (for)

Remarks
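As a first remark, the interval-refinement idea of Algorithm 8 can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the thesis code: the function name `line_minimize`, the default values `alpha_lm=0.5` and `eps_lm=1e-8`, and the quadratic toy energy are hypothetical choices for illustration; only the structure (sample, locate the minimum, shrink, clamp to $[0, \pi/2]$) follows Eqs. (8.57)-(8.60).

```python
import math


def line_minimize(energy, n_lm=10, alpha_lm=0.5, eps_lm=1e-8,
                  theta_lo=0.0, theta_hi=math.pi / 2):
    """Locate a minimizer of energy(theta) by repeated interval refinement.

    alpha_lm and eps_lm are illustrative defaults, not values from the text.
    """
    if not 0.0 < alpha_lm < 1.0:
        raise ValueError("alpha_lm must lie strictly between 0 and 1")
    lo, hi = theta_lo, theta_hi
    while True:
        # Test set T^j: n_lm + 1 equally spaced angles, cf. Eq. (8.57).
        step = (hi - lo) / n_lm
        thetas = [lo + i * step for i in range(n_lm + 1)]
        # Sampled energies E^j and their approximate minimizer, Eqs. (8.58)-(8.59).
        energies = [energy(theta) for theta in thetas]
        theta_best = thetas[energies.index(min(energies))]
        # Reduced half-length and clamped new interval, cf. Eq. (8.60).
        half = alpha_lm * (hi - lo) / 2
        lo = max(theta_best - half, theta_lo)
        hi = min(theta_best + half, theta_hi)
        if (hi - lo) / 2 < eps_lm:
            return theta_best


# Toy energy with its minimum at theta = 0.7 (purely illustrative).
theta_star = line_minimize(lambda theta: (theta - 0.7) ** 2)
```

Because every evaluation is independent, the cost per refinement is simply `n_lm + 1` energy evaluations; the sketch makes no attempt to reuse samples between refinements.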
We want to conclude the section with some final remarks regarding the algorithm and our implementation.

• The convergence threshold $\epsilon_{\mathrm{LM}}$ of the line-minimization should be smaller than the threshold of the DO loop (cf. Eq. (8.56)).
• Although we formally defined all different sets of states for all iteration loops, we do not have to save them separately. Thus, when we write, e.g., $\phi^{m_3=0} = \phi^{(\mu_l,\nu_l)}$, we just pass a pointer between the routines.
• We want to stress that we could converge the code for basically every tested setting up to $\epsilon_{\mathrm{total}} = 10^{-\ldots}$. Higher accuracies are possible for small systems, but become difficult with increasing system size, such as a large number of lattice sites. A possible reason could be the simple treatment of the non-differentiable point of the penalty function (see Eq. (8.14)).
• The current version of the code works only for 4 particles and with the additional approximation that we do not apply the penalty to the lowest occupied orbital, but only to the second one. Without this approximation, we only converge very small test systems (see also the Outlook in Sec. IV).

PART IV
CONCLUDING REMARKS

Chapter 9
CONCLUSIONS
In the course of this thesis, we have analyzed the new challenges that arise in a first-principles description of (equilibrium) coupled light-matter systems in comparison to the well-studied problem of describing matter from first principles. Motivated by this, we have proposed the dressed-orbital construction that allows one to overcome many of these challenges by using established (matter-only) methods to also investigate strongly-coupled electron-photon systems. We have illustrated by several examples that this approach stays accurate even for very strong coupling between light and matter, while being numerically feasible.

This still very young research area confronted us with many unexpected challenges. Already the starting point of our analysis was not clear. For instance, the usual cavity-QED model Hamiltonian (see Sec. 1.2) that has been used extensively to study coupled light-matter systems is not useful for a first-principles description because it is not bounded from below. On the other hand, the full theory of QED is also not useful, because we do not know how to renormalize the theory non-perturbatively. To account for the specific needs of a first-principles approach, we have then identified Pauli-Fierz theory as a suitable theoretical framework that includes cavity QED as a well-defined limit case. We have presented most of our discussion for this simpler setting that constitutes a good starting point to describe many phenomena of strong-coupling experiments.

The next challenge awaited us when we tried to generalize standard electronic-structure methods to the cavity-QED setting: we first had to understand how such a generalization can be done. Therefore, we analyzed three conceptually different approaches to describe the many-electron ground state, i.e., HF theory (as an example for wave-function methods), KS-DFT and RDM theory. We then showed that all these approaches have a clear counterpart in the coupled setting.
However, it turned out that especially for equilibrium scenarios, many of these generalizations are considerably less efficient. For instance, the MF description, i.e., the generalization of HF theory, cannot account for the quantum nature of the electron-photon interaction. This is a direct consequence of the fact that there is no exchange symmetry between the electronic and photonic Hilbert space, and thus no “Fock-term”. The remaining classical electron-photon interaction consists of merely electrostatic fields induced by the total dipole of the matter. This means that in order to understand nontrivial effects, such as the formation of polaritons and their influence on equilibrium properties of the system, we fundamentally have to describe photon-induced correlation.

This is in principle possible in terms of the MF wave function, if we consider (KS-)QEDFT. The challenges of the coupled space manifest here in the unknown photon-exchange-correlation potential, for which we still do not have useful approximations. To describe the electron-photon interaction energy instead within the RDM approach, we had to introduce the 3/2-body RDM, which is a new type of RDM of mixed electron-photon nature. The 3/2-body RDM has very different properties than the usual RDMs, e.g., it is not connected to the usual RDMs by a sum rule, because of its photonic half-body part. Without investigating this and other issues such as the unknown representability conditions, it is very difficult to make practical use of this 3/2-body RDM. Nevertheless, it is possible to define QED-RDMFT that describes the system’s state with the electronic 1RDM and the displacement coordinates of the photon modes (or the vector potential for full Pauli-Fierz theory). We have explicitly shown that this description is in principle exact by generalizing Gilbert’s theorem to Pauli-Fierz theory.
However, to capture the quantum nature of the electron-photon interaction with QED-RDMFT, one faces similar challenges as in KS-QEDFT, since the corresponding exchange-correlation functional is also not known.

Our thorough analysis allowed us to identify a similarity in these challenges of the many-electron-photon problem: many of the difficulties are directly connected to the structure of the electron-photon interaction operator, which couples both particle species and leads to a non-conserved photon number. Established tools to approximate the Coulomb interaction between electrons often seem to be less powerful if there is a second particle species involved. And most approaches are directly geared toward the description of systems with a fixed particle number. We have thus proposed an alternative strategy that approaches the problem on a more fundamental level. Instead of further investigating the interaction between the two species directly, we have presented the dressed-orbital construction, which allows one to describe cavity-QED systems in terms of only one polaritonic particle species. We have shown that we can describe the ground state of an N-electron-M-mode system in terms of an N-polariton wave function. These polaritons are hybrid particles that depend on electronic and photonic coordinates and adhere to a Fermi-Bose hybrid statistics. The mere possibility to introduce a new particle class is a striking feature of the coupled space, which to the best of our knowledge has never been explored before, and opens up many interesting research directions.

One important consequence of describing the coupled problem with only one particle species is the structural equivalence between the dressed version of the cavity-QED Hamiltonian and the electronic-structure Hamiltonian: both consist of only one-body and two-body terms that conserve the particle number.
Thus, the restructuring of the many-body space allows one to circumvent basically all fundamental problems of the light-matter first-principles approaches that we have discussed. The polaritonic-HF ansatz leads to an exchange contribution and explicitly accounts for correlation between the electronic and photonic subspaces. In polaritonic DFT, we can in principle employ any known exchange-correlation functional to describe “polaritonic correlation,” i.e., all effects beyond polaritonic HF. And the necessary RDMs to calculate the polaritonic energy conserve the particle number. Thus, they are connected to each other in an analogous way as in the single-species case. Although the polaritonic 1RDM is not as simple as the electronic or photonic one, we can straightforwardly employ it in polaritonic RDMFT as basic variable together with known exchange-correlation functionals. But the dressed construction is not limited to these three particular approaches. We have explicitly shown how to turn basically any electronic-structure theory into a polaritonic-structure theory. Importantly, this also means that we can generalize existing implementations of electronic-structure methods in a straightforward way to describe coupled systems.

However, the advantages of dressed orbitals also come at a price. First, we have to raise the dimension of the electronic orbitals by one for every photon mode that is considered. This clearly limits the application of polaritonic-structure methods to settings where the explicit description of only one or a few modes is sufficient. Then, we have to add additional terms to the kernels of the potential and interaction operators. This should not be a big issue in most cases, as we have shown explicitly for our implementation in Octopus (see Sec. 8). Finally, we have to enforce the hybrid statistics of the polaritons, which is an entirely unexplored issue in the realm of first-principles theory.
The standard approach, which is to expand the many-polariton state in a properly symmetrized basis (such as Slater determinants for many-fermion systems), is not applicable in practice because of the occurrence of “mixed-index” orbitals (see Sec. 4.2.3). Thus, new strategies have to be explored. As a first step, we have proposed two practical approximations. The simplest one is the fermion ansatz, which treats polaritons as fermions and thus completely neglects the bosonic part of the statistics. This allows for the most straightforward implementation of dressed orbitals and makes it possible to study the practical consequences of the hybrid statistics. With the help of the comprehensive convergence studies with our first implementation of the fermion ansatz (see App. C), we could understand that the ansatz allows for violations of the Pauli principle in the electronic subsystem. Conversely, it seems that polaritonic-structure methods with the fermion ansatz describe coupled light-matter systems very accurately in all cases where the Pauli principle is satisfied trivially (such as two-polariton singlet states that are described by one doubly-occupied orbital).

This observation motivates the second proposed approximation to the hybrid statistics. The idea of the polariton ansatz is to remain as close as possible to the fermion ansatz, but to remedy its most severe shortcoming by explicitly enforcing the Pauli principle. To do this in practice, the wave-function description is not helpful, because enforcing the Pauli principle is here basically equivalent to enforcing antisymmetry. To define the polariton ansatz, we therefore consider the electronic (ensemble) 1RDM to guarantee the Pauli principle explicitly (but not strict antisymmetry) by enforcing its ensemble N-representability conditions. To test this idea, we have developed and implemented an algorithm to perform polaritonic-HF calculations with the polariton ansatz.
Although this algorithm is considerably more involved than any standard HF algorithm, our first results have confirmed that the polariton ansatz is not only numerically feasible but also accurate. In scenarios where the fermion ansatz trivially fulfils the Pauli principle and provides a very accurate description, the polariton ansatz reproduces these results. However, the polariton ansatz also remains accurate in the scenarios where violations of the Pauli principle may occur with the fermion ansatz. We have also explicitly shown how to “switch” between both cases by varying the photon mode frequency ω. This illustrates from yet another perspective the differences between a single-species and a coupled two-species problem: a concept such as the electronic 1RDM might be useful for both problems, but in considerably different ways.

Besides the interesting new perspective on many-species problems that the dressed construction offers, we have shown that polaritonic-structure methods have a big potential to shed light on the complicated mechanisms of polaritonic physics. For instance, we have demonstrated a nontrivial interplay between local suppression and enhancement of the Coulomb-induced repulsion between the electrons (Sec. 6.4.2 and 6.4.3). This is reflected in the natural orbitals and occupation numbers of the light-matter system and thus influences all possible observables. Such small changes have been shown theoretically to strongly affect chemical properties and reactions, which are determined by an intricate interplay between Coulomb- and photon-induced correlations [125]. Whether these modifications of the underlying electronic structure are indeed a major player in the changes of chemical and physical properties still needs to be seen.
However, capturing such modifications in the first place (and studying their influence) clearly requires a first-principles theory that is able to treat both types of (strong) correlations accurately and is predictive inside as well as outside of a cavity.

As a last example, we have applied polaritonic HF to investigate the interplay of electron localization and electron-photon coupling. We found for nontrivial problems that the more delocalized the uncoupled matter wave function is, the stronger it reacts to the modes of a cavity, which comes along with an increase of electronic correlation. This is a first result that directly contributes to the debate on whether the ground state can be measurably influenced by only coupling to the vacuum of a cavity (see Sec. 1.1). The findings indicate that the spatial extension of the electronic wave function might be another important parameter that influences the light-matter coupling. The ground state of spatially extended systems could indeed be strongly modified by the vacuum field of an optical cavity. Importantly, the influence of localization cannot be studied easily with standard approaches such as cavity-QED models, which is one probable explanation why it has not yet played a role in the debate. Additionally, this raises the question whether an ensemble of emitters with a spatially extended wave function might show collective strong-coupling effects that also modify the local electronic structure, in contrast to the Dicke-type collective coupling picture that does not consider electronic correlation.

These first results illustrate the prospects of the new first-principles perspective on coupled light-matter systems that the polariton picture provides (see Ch. 10). However, this comes at the price of an increased numerical complexity, which demands the development of efficient and robust algorithms, a careful implementation and comprehensive validation procedures.
Nevertheless, with the example of our polaritonic-RDMFT implementation in Octopus, we have shown that even complex electronic-structure code environments can be extended to describe polaritons. Although the nonlinearities of the involved equations may lead to hardly predictable side effects (App. B), our convergence studies have revealed that polaritonic-structure methods are as accurate as their electronic counterparts (App. C). Importantly, many synergetic effects arise from such an extension of standard routines. For instance, we now understand the accuracy and limitations of the default RDMFT routine of Octopus (Sec. 7.2.2) much better because of the convergence studies with polaritonic RDMFT. This understanding led to the development and implementation of the new conjugate-gradients method (Sec. 7.3), which allows one to perform both electronic-RDMFT and polaritonic-RDMFT calculations with the full flexibility of the grid. On the one hand, this opens new interesting research directions in the entirely unexplored field of real-space RDMFT (see Ch. 10), while on the other hand it has crucially contributed to our understanding of the hybrid statistics and eventually to the development of the polariton ansatz.

Although generalizing an electronic-structure algorithm to treat general cavity-QED systems with the polariton ansatz is numerically nontrivial, we have shown with the example of polaritonic HF that it is indeed possible and also has affordable computational costs (Sec. 8.2). This algorithm and its successful implementation represent an important first step toward the goal of describing realistic strongly-coupled light-matter systems from first principles. However, to reach this goal, further optimization of the algorithm is still necessary (see Ch. 10). We have therefore carefully analyzed the numerical challenge of enforcing nonlinear inequality constraints in addition to the nonlinear minimization problems of typical electronic-structure methods.

Chapter 10
OUTLOOK: OPEN QUESTIONS AND PROSPECTS
During the years of my PhD, I have faced many interesting research questions. Some of them have been answered in this manuscript or by my colleagues, but many others are still open. The present work will be an important step forward to understand and resolve some of these open questions. We want to address these in the following.
Extension of existing dressed-orbital implementation to 3d
The next obvious step is to extend the current implementation of dressed orbitals in Octopus [3, Ch. 4] to treat matter in three spatial dimensions. Most prerequisites are already fulfilled for that and the last missing piece is to optimize the calculation of the dressed interaction integrals, which should be done separately from the Coulomb integrals (see Sec. 8.1). Once this straightforward modification is accomplished, one can directly perform polaritonic-HF and polaritonic-RDMFT calculations with the fermion ansatz. For instance, systems with only two active electrons, such as many diatomic molecules, can be described without further limitations, but as we have shown, systems with more electrons can also be accurately described if the precise value of the photon-mode frequency plays a subordinate role and can be adjusted (see Sec. 6.1.3). Additionally, such an extension allows one to investigate the accuracy of standard exchange-correlation functionals for polaritonic QEDFT without further modifications of the code. In Octopus, we have direct access to the full Libxc library to do so. It is crucial to realize here that besides the necessarily larger orbital bases, the scaling of these methods is not affected by the inclusion of the photon field (see Sec. 6.3).
Large-scale calculations with the polariton ansatz
The next evident research question is how to extend a standard electronic-structure code to treat dressed orbitals with the polariton ansatz. The polariton-HF algorithm that we presented in Sec. 8.2 is in principle applicable if not too large electronic bases are considered and a direct diagonalization of the electronic 1RDM is feasible. This applies to, e.g., medium-sized systems described by standard quantum-chemistry basis sets. Note that in contrast to the matter description, the extra photon orbitals do not necessarily have to be constructed in the code. The analytically known structure of the photon-number states allows one to calculate all necessary matrix elements analytically. We have also exploited this in our implementation (see Sec. 6.1.2).

Nevertheless, to apply the polariton-HF algorithm to systems involving many particles, it clearly needs to be optimized. The current algorithm of the line-minimization should be replaced by a similar approximation as laid out in Ref. [267]. Additionally, we have to investigate new practical ways to determine the Lagrange multipliers $\epsilon_i$ and $\nu_i$ corresponding to the equality and inequality constraints, respectively. In the current algorithm, the $\nu_i$ are calculated according to the prescription of the augmented Lagrangian method, which works well. However, the $\epsilon_i$ are determined according to a direct generalization of the conjugate-gradients algorithm by Payne et al. [267] (see Sec. 8.2.2.3). As a consequence, the $\epsilon_i$ depend nontrivially on the $\nu_i$, which may lead to numerical instabilities. This issue requires a careful analysis, but should be resolvable by employing a different method to determine the $\epsilon_i$.

Once we have optimized the algorithm, a reasonable next step is to extend the existing implementation of the dressed construction in Octopus [3, Ch. 4] to the polariton ansatz. For that, one would not explicitly diagonalize the electronic 1RDM, which is numerically prohibitively expensive, but resort to semidefinite programming methods [288] that exploit the reformulation of the constraints in terms of positivity conditions (see also Sec. 6.3).

It is clear that developing, implementing and validating such an optimized polariton-HF algorithm is nontrivial. We believe, however, that such extensions, though they require a certain amount of work, are indeed worthwhile. At the moment, polaritonic-structure methods are one of the most promising candidates to describe molecule-cavity systems even for very large coupling strengths. The principal reason for that is that we do not have to develop entirely new functionals, but can “recycle” the ones of the corresponding electronic-structure methods. This will be extensively tested with the planned extension of the Octopus implementation to 3d systems with the fermion ansatz, before we develop the generalized algorithm.

Footnote: Note that the calculation of the dressed-interaction integrals is much simpler than the solution of the Poisson equation. The separation of both computations thus allows one to do the latter more efficiently with (already implemented) standard solvers.
Describing vibrational strong coupling with dressed orbitals
The fact that many strong-coupling phenomena have been experimentally observed by coupling the vibrational degrees of freedom of molecular systems to a cavity [80, 64, 296] poses the question whether we can generalize the dressed-orbital approach to this setting. This means that we have to include the motion of the nuclei and their coupling to the cavity mode in our description.
To answer this question, it is important to realize that in the cavity-QED setting the coupling of the photon modes to the electrons is structurally identical to their coupling to the nuclear degrees of freedom (see Sec. 1.3.3). Thus, in principle, we can employ the same strategy and dress the nuclear orbitals with the photon modes. The straightforward way to do so is to employ the Born-Oppenheimer approximation to decouple the electronic from the nuclear-photonic degrees of freedom (cavity Born-Oppenheimer approximation [128]). However, one could also investigate different approaches such as the one discussed in Ref. [125]. Note that also in such settings, the dressed construction should allow us to transfer accurate first-principles methods for matter-only problems to the description of coupled systems.
Polaritonic structure methods for extended systems
Another interesting research question is whether we can apply the polaritonic-structure methods also to extended systems. This would allow us to study strong-coupling phenomena of solid-state systems. For instance, Rokaj et al. [289] have recently presented an approach based on the cavity-QED Hamiltonian that allows one to describe interesting phenomena such as Landau physics from first principles. Polaritonic-structure methods based on this approach would allow us to study, e.g., how quantum fluctuations modify phenomena such as the Landau levels [290], the integer [291, 292] and the fractional quantum Hall effect [293]. Another very promising research direction is to investigate how properties of solids can be controlled by the interaction with the cavity. Examples are the stabilization of topological states [294] or the control of the electron-phonon coupling strength and thus phonon-mediated superconductivity [295]. Since such phenomena usually depend on a complex interplay of different effects, there are few alternatives to an unbiased description from first principles for their investigation. Especially if strongly-correlated materials are considered, the dressed-orbital approach might be advantageous in comparison to other methods, simply because of its flexibility. Following our prescription of Sec. 4.3, one could generalize an established accurate method for strongly-correlated electrons to the cavity setting.
To extend the dressed-orbital approach to such cases, one has to consider periodic local potentials. Usually, this setting is simplified according to the Bloch theorem by a kind of mode expansion that takes the periodicity of the potential into account. Since the (dipole) coupling to the photon modes breaks the translational symmetry, the Bloch theorem cannot be applied straightforwardly. Here, a strategy similar to the recently proposed generalized Bloch theorem [289] should be applicable within the dressed-orbital construction. If this is indeed the case, one could straightforwardly make use of the k-space routines of Octopus to perform, e.g., polaritonic QEDFT calculations with extended systems (strongly) coupled to a cavity mode.
Development of a KS-QEDFT photon-exchange-correlation functional
Although methods based on dressed orbitals are a promising candidate to describe the strong-coupling regime of the light-matter interaction, their practical applicability is limited to a few modes. The most promising first-principles method to describe systems with many relevant modes is KS-QEDFT (Sec. 3.2). One important reason is that calculations with KS-QEDFT have the same numerical costs as the semiclassical Maxwell-KS approach, which in equilibrium settings often even reduces further to the cost of a standard KS-DFT calculation (see Sec. 3.1.2). However, to account for the quantum nature of the electron-photon interaction, new approximation strategies for the unknown photon-exchange-correlation (pXC) functional have to be investigated. Known strategies of KS-DFT either cannot be generalized in a straightforward way, such as the LDA, or their generalizations are less useful, such as the KLI approximation [247] for the photon-OEP (see Sec. 3.2 for details).
Here, the insights from the dressed construction can contribute in two practical ways. First, we can try to derive useful pXC functionals by investigating the connection between polaritonic QEDFT and KS-QEDFT. For instance, such a connection can be established via the equations of motion of both approaches, assuming the same density [173]. Another option is to study the polaritonic and KS auxiliary systems of an analytically solvable reference system, such as the homogeneous electron gas coupled to a few cavity modes [124]. This would allow us to study the connection between both auxiliary constructions by comparing how they describe the same system. In the best case, we will identify some general rules that allow us to derive pXC functionals from polaritonic exchange-correlation functionals.
Another very interesting approach to construct an approximate pXC functional is to employ the Breit approximation to the light-matter description. Like the dressed construction, the Breit approximation removes the difficult electron-photon interaction term from the theory. However, in the latter case this is done very differently, by an approximate expression of the vector potential in terms of the electric current, which is derived from Maxwell's equations. This in turn allows us to remove the photon degrees of freedom entirely from the description and leads to a (purely electronic) Hamiltonian that conserves the particle number. The electron-photon interaction manifests here in an additional 2-body term that consequently can be approximated with standard methods for the Coulomb interaction. This idea was motivated by the encouraging results of polaritonic-structure methods that also employ approximations for the Coulomb-interaction operator to describe the electron-photon interaction. The research in this direction is already ongoing and the first results are promising.
Dressed orbitals for full minimal coupling
Another research question is whether we can generalize the dressed-orbital approach in a useful way beyond the cavity-QED setting and consider, e.g., full minimal coupling (Pauli-Fierz theory). It is clear that such a generalization could not be directly applicable in the sense of a polaritonic-structure theory, because the corresponding dressed orbitals would depend on too many photon coordinates. However, if we generalize the approach and construct the corresponding many-polariton space, this can already be useful to analyze this still very little understood setting from a different perspective.
To apply the dressed construction to, e.g., the Pauli-Fierz Hamiltonian that considers full minimal coupling, we have to consider the full spatially dependent (transversal) vector potential $A_\perp(r)$. The polaritonic coordinates then read $(r, A_\perp(r))$ and the polaritonic density is $n(r, A_\perp(r))$, which includes the information of both the electronic one-body density and $A_\perp(r)$. We would then only need to generalize the coordinate transformation that symmetrizes the auxiliary $A_\perp$-fields in a reasonable way (see Sec. 5.1). This should in principle be possible.
The dressed-orbital construction in different contexts
Another very interesting application of the dressed-orbital construction concerns other multi-species situations. Provided we can define species with the same number of particles, it seems possible to work, instead of with complicated multi-species wave functions, with the combined density matrices and to enforce the ensemble representability conditions on the subsystems. This does not necessarily need any dressed construction. For instance, think about the Schrödinger equation for electrons and nuclei/ions. Assuming that we have one kind of nuclei/ions, we could express the combined density matrix in terms of electron-nuclei/ion pairs. It seems interesting to investigate the above procedure also in the context of such cases.
The connection between hybrid statistics and RDM representability conditions
A research question that came up when we developed the polariton ansatz concerns the representability conditions of the polaritonic 1RDM $\gamma$ (and other polaritonic RDMs). With respect to the combined coordinates, $\gamma$ is a 1RDM, but if we separate the electronic and photonic parts, $\gamma$ becomes a (1,1)-body RDM (one electronic and one photonic coordinate) that is very similar to the electronic 2RDM. Importantly, only from the latter point of view do we know how the hybrid statistics of the polaritonic orbitals manifest in terms of the coordinates. This suggests that $\gamma$ has more complicated representability conditions than the simple ones that we enforce in polaritonic RDMFT (Sec. 5.3). Understanding these conditions would be valuable to improve polaritonic-structure methods. For instance, we have observed that all polaritonic HF calculations were variational, although the corresponding wave-function ansatz may violate the hybrid statistics (even if the Pauli principle is fulfilled). If this is not a coincidence (which further calculations with polaritonic HF will show), it suggests that the single-reference ansatz in polaritonic coordinates with the polariton ansatz reduces the configuration space of the minimization with respect to the full many-polariton space. This is different from electronic variational 2RDM theory, which usually leads to energies that are a lower bound to the exact one. The reason is that not all $N$-representability conditions can be tested in practice and thus, [...] $N$-electron wave function.
(Footnote: Specifically, one can construct polaritonic Slater determinants with the polariton ansatz (i.e., every electronic orbital has an occupation number between zero and one) that do not correspond to an ensemble of wave functions that adhere to the full hybrid statistics.)
To answer such questions, we need to understand the representability conditions of the (1,1)RDM. Although it depends on electronic and photonic coordinates, the (1,1)RDM is much simpler than the 3/2-body RDM, because it can be connected to all the other relevant RDMs of the system by sum rules. The (1,1)RDM is also a positive-semidefinite matrix, like the usual single-species RDMs. Thus, we believe that we can derive the representability conditions of the (1,1)RDM, and more generally of any $(p,q)$RDM, with a strategy similar to the one that Mazziotti [214] employed to derive the $N$-representability conditions of the 2RDM.
Representability conditions of the 3/2-body RDM
A research question that accompanied me from the beginning of my PhD and that is directly connected to the previous question concerns the representability conditions of the 3/2-body RDM. In Sec. 3.3.1, we have briefly discussed the RDM approach to study the coupled electron-photon space, which to the best of our knowledge has never been investigated. We believe that understanding the representability conditions for the 3/2-body RDM could be very valuable for further progress in the field. For instance, one could perform variational minimizations of the energy without the wave function, analogously to variational 2RDM theory (see Sec. 3.3.1). Importantly, such a method would not be limited to cavity QED, but could be generalized straightforwardly to the Pauli-Fierz level. A very interesting opportunity that opens up on this level is to investigate the full gauge freedom of the theory. There are gauges, such as the temporal gauge [117], for which the Coulomb interaction does not explicitly occur in the Hamiltonian, but instead is carried by the longitudinal part of the vector potential. Thus, the energy could be described completely in terms of the electronic and photonic 1RDMs and the 3/2-body RDM. If the representability conditions of the 3/2-body RDM were easier to handle than the $N$-representability conditions of the 2RDM, this reformulation could be useful not only for coupled light-matter systems but also for difficult electronic-structure problems. For instance, most methods that accurately describe the strong-correlation regime can efficiently take only short-range interactions into account, but there are many systems where this approximation breaks down. To include also the long-range part of the Coulomb interaction, there have been proposals to introduce additional artificial degrees of freedom (see, e.g., Ref. [297]). The "natural" candidates for this are longitudinal photons that could be described by the 3/2-body RDM.
Although the 3/2-body RDM has very different properties than the 2RDM, one could derive its representability conditions with an approach similar to the one proposed in Ref. [214]. The first step of this programme would be to generalize the bipolar theorem [213], which is the basis of the construction in Ref. [214], to the more general case of coupled electron-photon systems. Based on this theorem, one can derive representability conditions from the positivity of linear combinations of RDMs. There is already work in progress in this direction and the first results are promising.
RDMFT in real space
Finally, we want to address another question that came up during my PhD and that concerns our real-space implementation of RDMFT in Octopus. With the new conjugate-gradients algorithm, we are in the position to explore natural orbitals with the full flexibility of a real-space grid. This would allow us to study, e.g., the limitations of standard quantum-chemical basis sets in RDMFT. We suppose that there are many scenarios in which the flexibility of the grid would allow us to achieve converged results with fewer natural orbitals than such [...] k-space). Another limitation of the current RDMFT implementation is the lack of pseudopotentials. These are crucial for real-space first-principles calculations, because they allow us to converge the results with considerably larger grid spacings than an all-electron calculation (see Sec. 7).
(Footnote: Note that this follows directly from its definition.)
Recently, a new method has been proposed to calculate exchange integrals considerably more efficiently [276]. Although this method has been tested only for HF theory and hybrid functionals of KS-DFT, it is highly probable that we can also apply it to RDMFT calculations, since the corresponding exchange integrals are equivalent to the Fock-exchange ones (see Sec. 2.4.2). There is already work in progress in this direction (by a colleague) and the first tests seem to confirm this assumption. Naturally, this method would then also be applicable to polaritonic RDMFT.
Once the implementation of this new method is finished, we can straightforwardly generalize the usual strategies to generate pseudopotentials [218] to RDMFT functionals.
Appendix A
APPENDIX
A.1 The bosonic symmetry of the photon wave function
In this appendix we discuss in a little more detail the mode representation, which makes the bosonic symmetry explicit and which we introduced in Sec. 3.1.1. We introduce in this setting the usual bosonic density matrices [200]. Instead of starting with the displacement representation, we start with the definition of the single-particle Hilbert space and its Hamiltonian. We choose the single-particle Hilbert space $\mathcal{H}_1$ to consist of $M$ orthogonal states $|\alpha\rangle$. These states are defined by the eigenstates of the Laplacian with fixed boundary conditions and geometry and correspond to the Fourier modes of the electromagnetic field [10, 39]. This real-space perspective is a natural choice if one either wants to connect to quantum mechanics and deduce the Maxwell field from gauge independence of the electronic wave function [10], or when deducing the theory in analogy to the Dirac equation [298]. It is this analogy of Maxwell's equations as a single-photon wave function with spin 1 that makes the appearance of a bosonic symmetry most explicit when quantizing the theory [11]. We note, however, that in general the concept of a photon wave function can become highly nontrivial [89]. Since we work directly in the dipole approximation, we do not go through all the steps of the usual quantization procedure of QED but assume from the start that we have chosen a few of these modes $|\alpha\rangle$ (with a certain frequency and polarization) in Coulomb gauge [39]. The single-particle Hamiltonian in this representation is then given by
$$\hat{h}'_{\mathrm{ph}} = \sum_{\alpha=1}^{M} \omega_\alpha\, |\alpha\rangle\langle\alpha| .$$
Since a total shift of energy does not change the physics and for later reference, we can equivalently use $\hat{h}_{\mathrm{ph}} = \sum_{\alpha=1}^{M} \left(\omega_\alpha + \tfrac{1}{2}\right) |\alpha\rangle\langle\alpha|$. Therefore, the energy of a single-photon wave function $|\phi\rangle = \sum_{\alpha=1}^{M} \phi(\alpha)\,|\alpha\rangle$ (corresponding to the classical Maxwell field in Coulomb gauge [11]) is given by
$$E[\phi] = \sum_{\alpha,\beta=1}^{M} \phi^*(\beta)\, \underbrace{\langle\beta|\hat{h}_{\mathrm{ph}}|\alpha\rangle}_{=\,\hat{h}_{\mathrm{ph}}(\beta,\alpha)}\, \phi(\alpha) = \sum_{\alpha,\beta} \hat{h}_{\mathrm{ph}}(\beta,\alpha)\, \gamma_b(\alpha,\beta) = \sum_{\alpha} \left(\omega_\alpha + \tfrac{1}{2}\right) \underbrace{\gamma_b(\alpha,\alpha)}_{=\,|\phi(\alpha)|^2} .$$
Here we have introduced the single-particle photonic 1RDM $\gamma_b(\alpha,\beta) = \phi^*(\beta)\,\phi(\alpha)$. We can then extend the single-particle space and introduce photonic many-body spaces $\mathcal{H}_{N_b}$, which are the span of all symmetric tensor products of single-particle states of the form [200, 254]
$$|\alpha_1,\dots,\alpha_{N_b}\rangle = \frac{1}{\sqrt{N_b!}} \sum_{\wp} |\wp(\alpha_1)\rangle \cdots |\wp(\alpha_{N_b})\rangle ,$$
where $\wp$ goes over all permutations of $\alpha_1,\dots,\alpha_{N_b}$. This construction is completely analogous to the typical construction of the fermionic many-body space, with the only difference that the latter has minus signs in front of odd permutations. Such a many-body basis is not normalized for bosons, as states can be occupied by more than one particle. Thus, the normalization factor occurs in the corresponding resolution of identity, i.e.,
$$\mathbb{1} = \frac{1}{N_b!} \sum_{\alpha_1,\dots,\alpha_{N_b}=1}^{M} |\alpha_1,\dots,\alpha_{N_b}\rangle\langle\alpha_1,\dots,\alpha_{N_b}| .$$
This approach is explained in great detail in Ref. [144]. An $N_b$-particle Hamiltonian is then given by a sum of individual single-particle Hamiltonians (interactions among the photons will only come about due to the coupling with the electrons). For a general $N_b$-photon state
$$|\tilde\phi\rangle = \frac{1}{\sqrt{N_b!}} \sum_{\alpha_1,\dots,\alpha_{N_b}=1}^{M} \tilde\phi(\alpha_1,\dots,\alpha_{N_b})\, |\alpha_1,\dots,\alpha_{N_b}\rangle \quad\text{with}\quad \tilde\phi(\alpha_1,\dots,\alpha_{N_b}) = \frac{1}{\sqrt{N_b!}} \langle\alpha_1,\dots,\alpha_{N_b}|\tilde\phi\rangle ,$$
we introduce the corresponding 1RDM according to Eq. (3.42) as
$$\gamma_b(\alpha,\beta) = N_b \sum_{\alpha_2,\dots,\alpha_{N_b}} \tilde\phi^*(\beta,\alpha_2,\dots,\alpha_{N_b})\, \tilde\phi(\alpha,\alpha_2,\dots,\alpha_{N_b}) .$$
The energy of that state is then given by
$$E[\tilde\phi] = \sum_{\alpha,\beta=1}^{M} \hat{h}_{\mathrm{ph}}(\beta,\alpha)\, \gamma_b(\alpha,\beta) = \sum_{\alpha=1}^{M} \left(\omega_\alpha + \tfrac{1}{2}\right) \gamma_b(\alpha,\alpha) .$$
Such a state can be constructed, for instance, as a permanent of $N_b$ single-photon states $\phi(\alpha)$. Note further that the 1RDM of an $N_b$-photon state obeys $N_b = \sum_\alpha \gamma_b(\alpha,\alpha)$.
Finally, since we want to have a simplified form of a field theory without a fixed number of photons, we make a last step and represent the problem on a Hilbert space with an indeterminate number of particles, i.e., a Fock space. By defining the vacuum state $|0\rangle$, which spans the one-dimensional zero-photon space, the Fock space is defined by a direct sum of $N_b$-photon spaces, $\mathcal{F} = \bigoplus_{N_b=0}^{\infty} \mathcal{H}_{N_b}$. Introducing the ladder operators between the different photon-number sectors of $\mathcal{F}$ by [200]
$$\hat{a}^+_\alpha\, |\alpha_1,\dots,\alpha_{N_b}\rangle = |\alpha_1,\dots,\alpha_{N_b},\alpha\rangle ,$$
$$\hat{a}_\alpha\, |\alpha_1,\dots,\alpha_{N_b}\rangle = \sum_{k=1}^{N_b} \delta_{\alpha_k,\alpha}\, |\alpha_1,\dots,\alpha_{k-1},\alpha_{k+1},\dots,\alpha_{N_b}\rangle ,$$
with the usual commutation relations, we can lift the single-particle Hamiltonian to the full Fock space and arrive at Eq. (3.46). The Fock-space 1RDM for a general Fock-space wave function $|\Phi\rangle$ can then be expressed as
$$\gamma_b(\alpha,\beta) = \langle\Phi|\hat{a}^+_\beta \hat{a}_\alpha \Phi\rangle ,$$
and $\sum_{\alpha=1}^{M} \gamma_b(\alpha,\alpha) = N_b$ now corresponds to the average number of photons. Finally, since we know that Eq. (3.46) is equivalent to $\hat{H}_{\mathrm{ph}} = \sum_{\alpha=1}^{M} \left( -\tfrac{1}{2}\tfrac{\partial^2}{\partial p_\alpha^2} + \tfrac{\omega_\alpha^2}{2}\, p_\alpha^2 \right)$, we also see that the Fock space $\mathcal{F}$ is isomorphic to $L^2(\mathbb{R}^M)$, which closes our small detour.
A.2 Conjugate gradients algorithm for real orbitals
In this appendix, we present an alternative derivation of the conjugate-gradients algorithm for RDMFT that is presented in Sec. 7.3. Since we are only interested in ground states, we can restrict the configuration space of the RDMFT Lagrangian to real orbitals, so we have $\phi_i = \phi_i^*$ and
$$\tilde{L} = \tilde{L}[\phi_i, n_i] = \tilde{E}[n_i,\phi_i] - \mu\left(\sum_{i=1}^{M} n_i - N\right) - \sum_{i,j=1}^{M} \lambda_{ji} \left(\int \mathrm{d}r\, \phi_i(r)\,\phi_j(r) - \delta_{ij}\right), \tag{A.1}$$
where we have to define the total energy functional and the respective Hartree and XC potentials,
$$\tilde{E}[n_i,\phi_i] = \sum_{i=1}^{M} n_i \int \mathrm{d}r\, \phi_i(r)\, h(r)\, \phi_i(r) + \frac{1}{2}\sum_{i=1}^{M} n_i \int \mathrm{d}r\, \phi_i(r)\, \tilde{v}_H(r)\, \phi_i(r) - \frac{1}{2}\sum_{i=1}^{M} \sqrt{n_i} \int \mathrm{d}r\, \phi_i(r)\, \tilde{v}^i_{XC}(r),$$
$$\tilde{v}_H(r) = \sum_{j=1}^{M} n_j \int \mathrm{d}r'\, \phi_j(r')\, w(r,r')\, \phi_j(r'), \tag{A.2}$$
$$\tilde{v}^i_{XC}(r) = \sum_{j=1}^{M} \sqrt{n_j} \int \mathrm{d}r'\, \phi_j(r)\, \phi_i(r')\, w(r,r')\, \phi_j(r'). \tag{A.3}$$
The differential of $\tilde{L}$ then reads
$$\delta\tilde{L} = \sum_{i=1}^{M} \sin(\theta_i) \left[\frac{\partial\tilde{E}}{\partial n_i} - \mu\right] \mathrm{d}\theta_i + \sum_{i=1}^{M} \int \mathrm{d}r \left[\frac{\delta\tilde{E}}{\delta\phi_i(r)} - \sum_{k=1}^{M} \lambda_{ik}\,\phi_k(r)\right] \delta\phi_i(r). \tag{A.4}$$
When we now require stationarity, $\delta\tilde{L} = 0$, we arrive at two different Euler-Lagrange equations for every orbital $i$:
$$\frac{\partial\tilde{E}}{\partial n_i} - \mu = 0, \tag{A.5a}$$
$$\frac{\delta\tilde{E}}{\delta\phi_i(r)} - \sum_{k=1}^{M} (\lambda_{ik} + \lambda_{ki})\,\phi_k(r) = 0. \tag{A.5b}$$
[...] $\Lambda$ enters the equations, which is the main difference between the two formulations. For the sake of completeness, we also provide the modified orbital equations (the equations for $n_i$ remain structurally unchanged; we just remove the stars from the orbitals):
$$\frac{\delta\tilde{E}}{\delta\phi_i(r)} = n_i \big(\phi_i(r)\, h(r) + h(r)\, \phi_i(r)\big) + 2 n_i\, \tilde{v}_H(r)\, \phi_i(r) - 2\sqrt{n_i}\, \tilde{v}^i_{XC}(r)$$
$$\overset{h = h^+}{=} 2\left[ n_i\, h(r)\, \phi_i(r) + n_i\, \tilde{v}_H(r)\, \phi_i(r) - \sqrt{n_i}\, \tilde{v}^i_{XC}(r) \right] \tag{A.6}$$
$$= 2\,\frac{\delta E}{\delta\phi_i(r)}. \tag{A.7}$$
The steepest-descent vector is now given as
$$\tilde\zeta_i = -\frac{\delta\tilde{L}}{\delta\phi_i(r)} = -\left(\frac{\delta\tilde{E}}{\delta\phi_i(r)} - \sum_{k=1}^{M} (\lambda_{ki} + \lambda_{ik})\,\phi_k(r)\right). \tag{A.8}$$
We can then calculate $\lambda_{ki} + \lambda_{ik}$ directly from Eq. (A.5b). We have also tested this version in Octopus and it seems to work as well as the other definition of Sec. 7.3.
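The descent direction of Eq. (A.8) can be illustrated with a minimal sketch. It assumes real orbitals stored as plain lists on a uniform grid with unit weight, and it reads Eq. (A.5b) as a projection, i.e., $\lambda_{ki} + \lambda_{ik} = \int \mathrm{d}r\, \phi_k(r)\, \delta\tilde{E}/\delta\phi_i(r)$ for orthonormal orbitals. The one-body-only toy functional and all function names (`apply_h`, `energy_gradient`, `descent_directions`) are hypothetical and are not taken from the Octopus implementation.

```python
import math


def dot(u, v):
    """Discrete inner product on a grid with unit weight (assumption)."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Gram-Schmidt orthonormalization of real grid vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def apply_h(phi):
    """Toy one-body operator: finite-difference kinetic term plus harmonic potential."""
    g = len(phi)
    out = []
    for j in range(g):
        left = phi[j - 1] if j > 0 else 0.0
        right = phi[j + 1] if j < g - 1 else 0.0
        x = j - (g - 1) / 2.0
        out.append(-0.5 * (left - 2.0 * phi[j] + right) + 0.5 * x * x * phi[j])
    return out


def energy_gradient(orbitals, occupations):
    """dE/dphi_i = 2 n_i h phi_i for the toy functional E = sum_i n_i <phi_i|h|phi_i>."""
    return [[2.0 * n * v for v in apply_h(phi)]
            for phi, n in zip(orbitals, occupations)]


def descent_directions(orbitals, gradients):
    """zeta_i = -(g_i - sum_k (lambda_ki + lambda_ik) phi_k), cf. Eq. (A.8)."""
    directions = []
    for g_i in gradients:
        # Symmetric multiplier combination obtained by projecting g_i on each orbital.
        lam = [dot(phi_k, g_i) for phi_k in orbitals]
        zeta = [-(gi - sum(l * phi_k[j] for l, phi_k in zip(lam, orbitals)))
                for j, gi in enumerate(g_i)]
        directions.append(zeta)
    return directions
```

With this choice of multipliers, every direction $\tilde\zeta_i$ is orthogonal to all current orbitals, so a small step along it preserves orthonormality to first order; only the symmetric combination $\lambda_{ki} + \lambda_{ik}$ is ever needed.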
A.3 Gradient of the electronic natural occupation numbers
In this appendix, we show how to derive expression (8.36) for the gradient of the inequality constraints (8.25). We consider a symmetrized perturbation of the electronic 1RDM $\gamma_e$ by the $k$-th polaritonic orbital,
$$\delta\gamma_k(r,r') = \int \mathrm{d}q\, \phi_k^*(r',q)\, \delta\phi_k(r,q) + \int \mathrm{d}q\, \delta\phi_k^*(r',q)\, \phi_k(r,q), \tag{A.9}$$
such that $\delta\gamma_k(r,r') = \delta\gamma_k^*(r',r)$. We perform first-order perturbation theory on the equation for the $i$-th eigenvalue of $\gamma_e$,
$$n_i\, \phi^e_i(r) = \int \mathrm{d}r'\, \left[\gamma_e(r,r') + \delta\gamma_k(r,r')\right] \phi^e_i(r'), \tag{A.10}$$
which results in a correction of $n_i$, i.e.,
$$\delta n_i = \int \mathrm{d}r\, \mathrm{d}r'\, \phi^{e*}_i(r)\, \delta\gamma_k(r,r')\, \phi^e_i(r')$$
$$= \int \mathrm{d}r\, \mathrm{d}r'\, \mathrm{d}q\, \phi^{e*}_i(r)\, \phi_k^*(r',q)\, \delta\phi_k(r,q)\, \phi^e_i(r') + \int \mathrm{d}r\, \mathrm{d}r'\, \mathrm{d}q\, \phi^{e*}_i(r)\, \delta\phi_k^*(r',q)\, \phi_k(r,q)\, \phi^e_i(r')$$
$$= \int \mathrm{d}r\, \mathrm{d}q\, \phi^{e*}_i(r) \left[\int \mathrm{d}r'\, \phi_k^*(r',q)\, \phi^e_i(r')\right] \delta\phi_k(r,q) + \int \mathrm{d}r\, \mathrm{d}q\, \phi^e_i(r) \left[\int \mathrm{d}r'\, \phi^{e*}_i(r')\, \phi_k(r',q)\right] \delta\phi_k^*(r,q).$$
From this expression, we can derive
$$\frac{\delta n_i}{\delta\phi_k(r,q)} = \phi^{e*}_i(r) \left[\int \mathrm{d}r'\, \phi_k^*(r',q)\, \phi^e_i(r')\right], \tag{A.11}$$
$$\frac{\delta n_i}{\delta\phi_k^*(r,q)} = \phi^e_i(r) \left[\int \mathrm{d}r'\, \phi^{e*}_i(r')\, \phi_k(r',q)\right]. \tag{A.12}$$
Appendix B
VALIDATION OF THE OCCUPATION NUMBER OPTIMIZATION IN RDMFT
In this appendix, we discuss the validation of the occupation number optimization part (algorithm 2) of the RDMFT routine. Although the generalization from electronic to dressed orbitals directly influences only the orbital optimization, there might still be an indirect influence on other parts of the routine. This is a typical challenge of nonlinear programming.
In fact, during the implementation of the polaritonic RDMFT routine, an issue of this type has occurred: some approximations of the occupation number optimization algorithm can lead to pronounced inaccuracies in polaritonic RDMFT, whereas a similar problem has not been observed in electronic RDMFT. Specifically, the algorithm converges to a set of occupation numbers $n = (n_1,\dots,n_M)$ that do not sum up to the total particle number $N$. Depending on the specific scenario, this sum might deviate very strongly from $N$, including the extreme case $\sum_i n_i = 0$. This issue is so severe that the polaritonic RDMFT routine cannot converge. If we instead apply the electronic RDMFT routine to the same matter system but outside the cavity, we find a converged result.
We want to use this issue to exemplify the challenges of the numerical implementation of first-principles methods and how we can confront them. A very important role is played here by the proper definition of a test case (Sec. B.1) that allows for a simple visualization (Sec. B.2) of the involved quantities. This considerably facilitates the identification of the origin of the issue, i.e., the "bug" (Sec. B.3). In this and many other cases, it is not difficult to fix the bug once it is identified (Sec. B.4).
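A failure of this kind is cheap to detect automatically. The following minimal sketch checks a converged set of occupation numbers against the box constraint $0 \le n_i \le 2$ and the sum rule $\sum_i n_i = N$; the function name `check_occupations` and the tolerance are assumptions made for illustration and do not correspond to a routine of Octopus.

```python
def check_occupations(occupations, n_particles, tol=1e-6):
    """Return a list of violated conditions (empty if the set is admissible)."""
    problems = []
    # Box constraint on every natural occupation number.
    for i, n in enumerate(occupations, start=1):
        if n < -tol or n > 2.0 + tol:
            problems.append(f"n_{i} = {n:.6g} violates 0 <= n_i <= 2")
    # Sum rule: the occupations have to add up to the particle number.
    total = sum(occupations)
    if abs(total - n_particles) > tol:
        problems.append(
            f"sum_i n_i = {total:.6g} differs from N = {n_particles:.6g}"
        )
    return problems
```

Called after every occupation number optimization, such a check would flag the pathological result $\sum_i n_i = 0$ immediately instead of letting it propagate into the orbital optimization.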
B.1 Definition of the test setting
The issue occurs in all versions of Octopus before 10.0 that support RDMFT and can be reproduced in the following test setting. We consider for the matter part a 1d helium-atom model, i.e., 2 electrons confined in a soft-Coulomb potential $v(x) \propto 1/\sqrt{x^2 + \ldots}$ [...] $w(x,x') \propto 1/\sqrt{(x-x')^2 + \ldots}$ [...] $\omega = $ [...] $\lambda = $ [...] $M = 3$ [...] $\{\phi_1, \phi_2, \phi_3\}$ are fixed during the occupation number optimization. Thus, the energy functional (7.13) reduces to
$$E[n] = \sum_{i=1}^{3} n_i\, \epsilon^{\mathrm{one}}_i + \frac{1}{2} \sum_{i,j=1}^{3} n_i n_j\, \epsilon^{H}_{ij} - \frac{1}{2} \sum_{i,j=1}^{3} \sqrt{n_i n_j}\, \epsilon^{X}_{ij}, \tag{B.1}$$
where we introduced the vector of the occupation numbers $n = (n_1, n_2, n_3)$, and $\epsilon^{\mathrm{one}}_i = \langle\phi_i|h\,\phi_i\rangle$, $\epsilon^{H}_{ij} = \langle\phi_i\phi_i|w|\phi_j\phi_j\rangle$, $\epsilon^{X}_{ij} = \langle\phi_i\phi_j|w|\phi_j\phi_i\rangle$ (see definitions below Eq. (7.13)). In the dressed setting, the one-body kernel $h(x,x')$ is replaced by its dressed version $h'(xq,x'q') = t'(xq,x'q') + v'(xq,x'q')$ (Eqs. (4.38) and (4.39)), and the Coulomb kernel $w(x,x')$ that enters the latter two integrals is replaced by $w'(xq,x'q')$ as discussed in the previous section. The Lagrangian (7.5), which we denote in this section by $F$, becomes
$$F[n;\mu] = E[n] - \mu\, S[n] \tag{B.2}$$
$$= E[n] - \mu \left(\sum_{i=1}^{3} n_i - 2\right), \tag{B.3}$$
and we have the stationarity conditions
$$\delta F = 0 \iff \nabla_n E[n] = \mu\,(1,1,1), \quad S[n] = 0.$$
B.2 Visualization of the algorithm
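The reduced functional (B.1) and the Lagrangian (B.2)-(B.3) are simple enough to sketch the two-level scheme that this section visualizes: an inner minimization of $F[n;\mu]$ over $0 \le n_i \le 2$ for fixed $\mu$, and an outer bisection on the side condition $S(\mu) = 0$. The sketch below is a toy under stated assumptions and not the Octopus routine: the matrix elements are hypothetical numbers (not those of the helium test case), the inner minimizer is a coordinate-wise golden-section search rather than BFGS, and the bracket $[-5, 5]$ for $\mu$ is chosen by hand.

```python
import math

# Hypothetical matrix elements for M = 3 fixed orbitals (not from the thesis).
EPS_ONE = [-1.5, -0.8, -0.3]
EPS_H = [[0.7, 0.3, 0.2], [0.3, 0.6, 0.2], [0.2, 0.2, 0.5]]
EPS_X = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.1], [0.1, 0.1, 0.5]]
N_ELECTRONS = 2.0


def energy(n):
    """Reduced energy functional of the form of Eq. (B.1)."""
    m = range(len(n))
    one = sum(n[i] * EPS_ONE[i] for i in m)
    hartree = 0.5 * sum(n[i] * n[j] * EPS_H[i][j] for i in m for j in m)
    exchange = 0.5 * sum(math.sqrt(n[i] * n[j]) * EPS_X[i][j] for i in m for j in m)
    return one + hartree - exchange


def lagrangian(n, mu):
    """F[n; mu] = E[n] - mu * (sum_i n_i - N), cf. Eqs. (B.2)-(B.3)."""
    return energy(n) - mu * (sum(n) - N_ELECTRONS)


def golden_min(f, lo=0.0, hi=2.0, iters=45):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def minimize_occupations(mu, sweeps=100, tol=1e-7):
    """Inner problem: minimize F[n; mu] over the box by cyclic coordinate descent."""
    n = [N_ELECTRONS / 3.0] * 3  # start away from n = 0, where the square roots are singular
    for _ in range(sweeps):
        change = 0.0
        for i in range(3):
            def f(x, i=i):
                return lagrangian(n[:i] + [x] + n[i + 1:], mu)
            new = golden_min(f)
            change = max(change, abs(new - n[i]))
            n[i] = new
        if change < tol:
            break
    return n


def side_condition(mu):
    """S(mu) = sum_i n_i^*(mu) - N."""
    return sum(minimize_occupations(mu)) - N_ELECTRONS


def solve(mu_lo=-5.0, mu_hi=5.0, iters=40):
    """Outer problem: bisection on mu, assuming S(mu) is monotone on the bracket."""
    for _ in range(iters):
        mu = 0.5 * (mu_lo + mu_hi)
        if side_condition(mu) < 0.0:
            mu_lo = mu
        else:
            mu_hi = mu
    mu = 0.5 * (mu_lo + mu_hi)
    return mu, minimize_occupations(mu)
```

Calling `solve()` returns a multiplier and a set of occupation numbers that respect the box constraint and sum to $N = 2$ within the accuracy of the inner search. The outer loop only ever looks at the sign of $S(\mu)$, so any inaccuracy of the inner minimizer translates directly into a wrong root.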
Let us begin the discussion with a small illustration of algorithm 2 for the electronic test system. As discussed in Sec. 7.1.4, the minimization of $F[n;\mu]$ is performed with the help of an auxiliary function $\tilde{F}$ that reads for this test case
$$\tilde{F}(\mu) = \min_{n_1,n_2,n_3} F[n;\mu] \tag{B.4}$$
$$\equiv F[n^*;\mu], \tag{B.5}$$
where in the second line we denote by $n^*$ the set of occupation numbers that minimizes $F[n;\mu]$ for a prescribed $\mu$. The problem of finding $\nabla_{n,\mu} L = 0$ [...] $\tilde{F}(\mu)$ and then finding the stationary point of $F[n^*;\mu]$ that fulfils $\mathrm{d}F/\mathrm{d}\mu = 0$. The algorithm in Octopus exploits this reformulation by using a bisection algorithm to find the set of occupation numbers $n^*_1(\mu), n^*_2(\mu), n^*_3(\mu)$ that satisfies the side condition $S(\mu) = \sum_{i=1}^{3} n^*_i(\mu) - 2 = 0$. [...] Mathematica to solve $\min_{n_1,n_2,n_3} F[n;\mu]$ with its highly accurate internal routine. The matrix elements $\epsilon^{\mathrm{one}}_i$, $\epsilon^{H}_{ij}$, $\epsilon^{X}_{ij}$ that enter $F$ are extracted from the RDMFT routine of Octopus.
We start by noting that $F[n;\mu]$ for constant $\mu$ must always have at least one (and possibly more than one) minimum because of the constraint $0 \le n_i \le 2$. Put differently, $F$ can have "normal" minima, where $\nabla F = 0$ [...] constrained minimization, or we replace the occupations as variables by the angles $\theta_i$ via $n_i = 2\sin^2(2\pi\theta_i)\ \forall i$ (cf. Eq. (7.4)) and perform an unconstrained minimization. The latter is a very common approach, because there are many standard algorithms for unconstrained optimization. In this case, $F$ as a function of $\theta = (\theta_1, \theta_2, \theta_3)$ has a true minimum at the boundary, as shown in Fig. B.2.
(Footnote: Note that this is equivalent to Eq. (7.16), where we denoted $F$ with $L$.)
(Footnote: Note that this expression is equivalent to Eq. (7.15) for $M = 3$, where the auxiliary function is denoted by $S$.)
[Figure B.1, panels (a) and (b): $F(n_1)$ for fixed $n_2$, $n_3$ and $\mu$; horizontal axis $n_1$, vertical axis $F$.]
Figure B.1: Shown is the Lagrangian $F$ as a function of the first occupation number $n_1$, keeping the other occupation numbers $n_2$, $n_3$ and the Lagrange multiplier $\mu$ fixed. We see the two different types of minima that occur depending on the value of $\mu$. In figure (a), we see a normal minimum for $\mu = -$[...] $\mu = -$[...] $n_2$, $n_3$ do not vary much with $\mu$ and can therefore be ignored.
[Figure B.2: $F$ for fixed $n_2$, $n_3$ and $\mu$; horizontal axis $\theta_1$, vertical axis $F$.]
Figure B.2: We show the Lagrangian $F$ for the same parameters as in Fig. B.1 (b), but re-expressing the $n_i$ by Eq. (7.4). We see that the border minimum of $F(n)$ turns into a regular minimum of $F(\theta)$.
Since our test system is very small, we can afford enough calculations to actually plot the auxiliary function $\tilde{F}(\mu)$ (shown in Fig. B.3 (a)), which allows us to visualize its properties. The bisection algorithm of Octopus of course calculates $\tilde{F}(\mu)$ only for certain values of $\mu$. We see in Fig. B.3 (a) that $\tilde{F}(\mu)$ indeed has a stationary point at $\mu^* \approx -$[...] $\tilde{F}(\mu)$ is not related to the energy; in fact, $\tilde{F}(\mu)$ corresponds by definition to a minimal-energy solution for all $\mu$. The advantage of the algorithm is that it does not matter which type of stationary point $\tilde{F}(\mu)$ has, because we do not directly consider $\tilde{F}(\mu)$. Instead, the problem is transferred to finding the root of the side condition $S(\mu)$. If $S(\mu)$ is monotone, the bisection algorithm will always converge. The monotonicity cannot be proven rigorously, but judging from the experience with another code it seems to be a reasonable assumption. For our test case, $S(\mu)$ is indeed monotone, as we can clearly see in Fig. B.3 (b). Another interesting observation is that, while $\tilde{F}(\mu)$ is smooth everywhere, $S(\mu)$ has a kink, which occurs exactly when the [...] $F(n,\mu)$ touches the border.
(Footnote: The algorithm was copied from the well-tested HIPPO code. It is not publicly available and the interested reader is referred to [email protected].)
[Figure B.3, panel (a): the auxiliary function, horizontal axis $\mu$, vertical axis $\tilde{F}$; panel (b): side condition evaluated for $n^*$, horizontal axis $\mu$, vertical axis $S$.]
Figure B.3: The auxiliary function $\tilde{F}_{n^*}(\mu)$ (a) and the side condition $S_{n^*}(\mu)$ (b) are shown. $\tilde{F}_{n^*}(\mu)$ has its maximum at the point $\mu^*$, where the side condition is satisfied, i.e., $S_{n^*}(\mu^*) = 0$. [...] $S_{n^*}(\mu)$ is monotone over the whole region of $\mu$ and has a kink at $\mu \approx -$[...] $n^*(\mu)$ becomes a border minimum, i.e., the first occupation number reaches $n^*_1 = $ [...] $\mu$ because of the constraint ("pinning").
B.3 The identification of a resolved bug of polaritonic RDMFT
We now apply the just introduced visualization tools to study the failure of the occupation number opti-mization algorithm of polaritonic RDMFT. This means that the algorithm converges to n ∗ with S [ n ∗ ] (cid:54)= ATHEMATICA , we have manipulated the occupation numberroutine slightly to evaluate the function S ( µ ) for a larger range of µ than the algorithm would require.The RDMFT routine of O CTOPUS of the GSL library (called BFGS2 ). Fig. B.4 (a) shows the comparison of BFGS2with the results from the accurate routine of M
ATHEMATICA : We see that while the function S ( µ ), obtainedfrom M ATHEMATICA smoothly increases with µ (blue line), S ( µ ) obtained from the BFGS2 algorithm jumpsfor µ > − S ≈
0. This coincides with a jump in the solution occupation number to n ≈ ATHEMATICA considerably better than BFGS2, butstill not exactly. Also here the occupation number n gets pinned for too small values of µ .Nevertheless, this comparison shows that the minimization algorithm, employed in the RDMFT routineis the origin of the inaccuracy. B.4 A simple resolution of the issue
Let us therefore briefly discuss the BFGS-method. The basic idea behind the algorithm is the Newtonmethod for minimization: we approximate the minimum of a function (stationary point) with a second See for example [266, part 6.1]. - - - - - - - - μ Σ i n i - N dressed RDMFT: BFGS2 ( red ) vs. exact ( blue ) (a) - - - - - - - μ Σ i n i - N Exact ( blue ) vs. BFGS ( red ) , ( n - ) ( dashed ) (b)Figure B.4: (a) Comparison of the function S ( µ ) , calculated by the very accurate routine of M ATHMATICA (exact, blue) and the BFGS2 method (red) from the GSL-library. The latter fails to find the correct minimumof ˜ F [ n ; µ ] for µ > − S ( µ ).(b) Same as (a) but comparing M ATHEMATICA (exact, blue solid line) with the BFGS method (red solid line).Additionally the function n − ATHEMATICA : dashed (dark) blue line, BFGS:dashed red line). We see that BFGS leads to a continuous S ( µ ), but it still predicts a slightly wrong root.Additionally, it predicts a boundary minimum of n ( n − =
0) for too small values of µ.
order Taylor expansion. In the 1d case, this requires merely the second-order derivative of the function (since the first-order derivative vanishes at a stationary point), i.e., the calculation of one term. However, in a multi-dimensional setting, the second derivative has to be generalized to the Hessian H_F. In this case, the second-order Taylor expansion around the extremum n∗ reads
$F(n^* + n) \approx F(n^*) + \tfrac{1}{2}\, n^T H_F\, n$.
For the very high-dimensional minimization problems in electronic-structure theory, already the calculation of the gradient is the bottleneck, i.e., the most expensive step, of the whole method. Calculating the Hessian is usually numerically too expensive and hence, one either needs to employ directly gradient-based methods or approximate the Hessian. The conjugate-gradients algorithm that is employed for the orbital optimization in OCTOPUS is an example of the former option and the BFGS and BFGS2 algorithms are examples of the latter category. In particular, the BFGS algorithms are so-called quasi-Newton methods, which approximate the Hessian H_F from gradient information such that the so-called secant equation,
$\nabla F(n^* + n) = \nabla F(n^*) + H_F\, n$,
is satisfied. As this equation does not fully determine H_F, there are many different quasi-Newton methods that are adapted to certain minimization problems. Note that typical quasi-Newton methods (including BFGS and BFGS2) can only perform unconstrained minimizations.
We can conclude that certain approximations of the quasi-Newton algorithms BFGS and BFGS2 must lead to the observed inaccuracies. The best way to avoid this issue is to employ a method that does not approximate the Hessian. However, the inaccuracies are much more pronounced for BFGS2 than for BFGS and thus, in a first attempt to resolve the issue, one could simply employ BFGS. We have tested this extensively and indeed, we have not observed convergence issues of the RDMFT routine.
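The role of the secant equation can be made concrete with a short sketch of the textbook BFGS update of the approximate Hessian. It is an illustration under stated assumptions and neither the GSL nor the OCTOPUS code: the function names, the simple backtracking line search and the quadratic test function are hypothetical choices. Here `s` plays the role of the step n and `y` of the gradient difference.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Textbook BFGS update of the Hessian approximation B.

    For symmetric B and y @ s > 0, the result is symmetric and satisfies
    the secant equation B_new @ s == y.
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Unconstrained quasi-Newton minimization with Armijo backtracking."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                    # initial guess for the Hessian
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -np.linalg.solve(B, g)        # Newton-like step with approximate Hessian
        t = 1.0
        for _halving in range(60):        # backtracking: halve until sufficient decrease
            if f(x + t * d) <= f(x) + 1e-4 * t * (g @ d):
                break
            t *= 0.5
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        # update only if the curvature condition holds (keeps B positive definite)
        if y @ s > 1e-12 * np.linalg.norm(y) * np.linalg.norm(s):
            B = bfgs_update(B, s, y)
        x, g = x_new, g_new
    return x, B

if __name__ == "__main__":
    # hypothetical convex quadratic test function F(x) = x^T A x / 2 - b^T x
    A = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 3.0]])
    b = np.array([1.0, -1.0, 0.5])
    x_min, _ = bfgs_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                             lambda x: A @ x - b, np.zeros(3))
    print(x_min, np.linalg.solve(A, b))
```

Note that the sketch, like the methods discussed above, performs an unconstrained minimization; bounds on the variables have to be enforced separately, for example by a reparametrization.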
We have concluded that the For further details, the reader is referred to any standard textbook about optimization, e.g., the book by Nocedal and Wright [266].
From OCTOPUS version 10.0 onwards, BFGS is therefore employed as default in the RDMFT occupation number optimization routine.
Appendix C
CONVERGENCE OF DRESSED ORBITALS IN OCTOPUS
In this appendix, we present in detail how we have validated the dressed-orbital implementation in OCTOPUS. To this end, we have developed a strategy to systematically converge the numerical calculations with dressed orbitals for HF theory and RDMFT, respectively. All numerical results of Ch. 6 in the real-space setting have been converged according to this strategy. We present the validation and the convergence study with an example system (Sec. C.1-C.3) and then provide a summary in the form of a “HowTo” for such calculations in Sec. C.4. This appendix is based on the supporting information of Ref. [1].
Validation with the help of a second implementation
To find the accuracy threshold of our implementation, we compared the OCTOPUS results to calculations of a private code, DYNAMICS, that was used and validated for Ref. [258] by S.E.B. Nielsen. In DYNAMICS, the polaritonic orbitals are approximated on discretized real-space boxes in the x and q coordinates exactly as in OCTOPUS. However, the two codes employ different boundary conditions, a different order of finite differences for the approximation of the differential operators (fourth order in OCTOPUS vs. sixth order in DYNAMICS), a different method of grid point evaluation (on-point in OCTOPUS vs. mid-point in DYNAMICS), and also a different orbital optimization technique. Both codes are only able to perform calculations under the fermion ansatz, i.e., solutions may violate the Pauli principle for electrons. In DYNAMICS, we have the option to calculate the exact many-body ground state (for very small systems) as well as the KS-DFT ground state under the exact exchange (EXX) approximation. Since EXX coincides exactly with HF theory for a two-electron spin-singlet system, we can assess the electronic HF and the dressed HF routine (that we abbreviate in the following by dHF) with DYNAMICS.
The test system
We present the convergence studies at the example of the one-dimensional Helium atom inside a cavity(He) that we employed already before (Sec. 6.2.2 and Sec. B.1). The He atom is described by the nuclearpotential v He ( x ) = − (cid:112) x + and is coupled to one cavity mode with frequency ω and coupling parameter λ .The modified local potential due to the dressed auxiliary construction reads v (cid:48) He ( x , q ) = v He ( x ) + v d ( x , q )with v d ( x , q ) = ( λ x ) + ω q − ω (cid:112) q ( λ x ) and the modified interaction kernel is w (cid:48) ( xq , x (cid:48) q (cid:48) ) = w ( x , x (cid:48) ) + w d ( xq , x (cid:48) q (cid:48) ) with w ( x , x (cid:48) ) = (cid:112) ( x − x (cid:48) ) + and w d ( xq , x (cid:48) q (cid:48) ) = − ωλ (cid:112) ( xq (cid:48) + x (cid:48) q ) + λ xx (cid:48) . The principal steps of the convergence study
The convergence study and validation of our implementation in OCTOPUS requires several steps, which correspond to the following sections. We start with converging the exact dressed many-body ground-state and comparing it to DYNAMICS (Sec. C.1). Although we are limited to a very small Hilbert space for such exact calculations, we only have to solve a linear eigenvalue problem, so it is easy to converge the results to a very high precision within the accessible parameter range. Thus, we obtain an upper bound for the accuracy from these exact calculations. In the next section, we present the convergence of dressed HF (dHF) in all its details and compare the converged ground-state to DYNAMICS (Sec. C.2). This allows us to determine the accuracy of dHF calculations, which are (because of their nonlinear nature) harder to converge than the exact solver. Additionally, the comparison of the two codes is a good validation of the correct implementation of the dressed modifications. Since these modifications are exactly the same for dressed RDMFT (that we abbreviate by dRDMFT in the following) and dHF, validating the dHF implementation also validates the corresponding changes for dRDMFT. In Sec. C.3, we discuss the convergence of dRDMFT, which requires two different minimization procedures that are interdependent and thus is again harder to converge than dHF.
Contact: [email protected]
All these sections are organized similarly. We start with the separated problem of the electronic system outside the cavity and converge first the electronic counterparts of the routines, i.e., the electronic exact solver, HF or RDMFT. The purely photonic part of the separated problem remains the same at all the discussed levels of theory. We therefore discuss it only once in the first section. Then, we analyze the convergence of the dressed theories in the no-coupling (λ = 0) limit, which theoretically means that the electronic and photonic problems are perfectly separated, but solved in only one large calculation. By comparing only the electronic (photonic) part of the dressed solutions to the results of the purely electronic (photonic) theories, we can measure whether there is a decrease in accuracy due to the simultaneous description of electronic and photonic coordinates. We finish the sections with a convergence study for λ > 0.
C.1 Validation of the exact dressed many-body ground-state
We start with the validation of the exact many-body ground-state. In both codes, this is calculated by mini-mizing directly the energy expression of the full many-body wave function (denoted by Ψ (cid:48) in the main text),discretized on the grid. For the He test system, Ψ (cid:48) = Ψ (cid:48) ( x , q , x , q ) is four-dimensional. The actual mini-mization is performed in O CTOPUS by conjugate-gradients (see Sec. 7.3), whereas D
YNAMICS makes use of aLanczos algorithm. At the beginning of every calculation, we need to find the proper grid, which is defined by the box sizes L x , L q and the spacings ∆ x , ∆ q in both dimensions. For the minimization in O CTOPUS , we use two differ-ent convergence criteria, (cid:178) E = − and (cid:178) ρ = − . The former tests the energy deviations and the latterthe integrated absolute value of the density deviations between subsequent iteration steps. These are thecriteria already available in O
CTOPUS , so we here show that these are sufficient to produce reliable results.Note that we choose (cid:178) E = (cid:178) ρ , because (cid:178) ρ is a much stricter criterion. For the box size and spacing con-vergence, we perform series C = { C , C ,...}, where C can be L x , L q , ∆ x , or ∆ q in the following. We per-form two types of convergence tests for every parameter C . In the first one, we investigate the deviationsin the energy between subsequent elements ∆ E C i = E C i − E C i − . We denote the corresponding thresh-olds with (cid:178) E C . The second type of convergence series considers the maximal deviations in the absolute- Note that we use the term dressed HF/RDMFT in stead of polaritonic HF/RDMFT in this appendix to stress the fact that we employdressed orbitals with the fermion approximation. This algorithm can be seen as a generalization of the conjugate-gradients methods. It was originally proposed in 1950 by Lanczos [299] and is widely used today. It is explained in most standard textbooks about optimization, e.g., in Nocedal and Wright [266]. Details can be found on http://octopus-code.org/doc/develop/html/vars.php?page=alpha with the keywords
Eigen-solverTolerance and
ConvRelDens . This is also recommended by the O
CTOPUS
Variable Reference. ∆ ρ C i = max x / q | ρ C i ( x / q ) − ρ C i − ( x / q ) | , with ρ C i ( x ) = (cid:82) d q ρ C i ( x , q ) for the series L x , ∆ x and ρ C i ( q ) = (cid:82) d x ρ C i ( x , q ) for the series L q , ∆ q (see Fig. C.1 foran illustration of these quantities). We denote the corresponding thresholds with (cid:178) ρ C .(a) (b)Figure C.1: As an example for the convergence tests, we show here the two spatially-resolved convergenceparameters ∆ ρ L x and ∆ ρ ∆ x in (a) and (b), respectively. We see for in figures that the deviations can changethe sign and are typically most pronounced in the center region. We also observe how the deviations ∆ ρ L x ( ∆ ρ ∆ x ) decrease at all positions with increasing (decreasing) L x ( ∆ x ). These are generic features of all per-formed series. However, this decrease continues until ∆ ρ L x < − , reaching the limit set by the convergenceparameter (cid:178) ρ for the L x -series (not shown). The ∆ x -series instead saturates at ∆ ρ ∆ x ≤ − . The reason isthe finite difference approximation that effectively leads to different Hamiltonians for every value ∆ x of thespacing. To push the accuracy beyond (cid:178) ρ ∆ x = − , one would need to adjust the finite-differences stencil ac-cordingly. However, this accuracy is clearly beyond our needs, since the goal is to converge the numericallymore involved theory levels HF and RDMFT, which are in general less accurate.We start with the electronic part of the example (corresponding to the He atom outside the cavity),choose ∆ x = L x = {8,10,12,...}. The ground-state energy E L x drops with increasing L x (be-cause boundary effects become less important) and we find ∆ E L x < − and ∆ ρ L x < − for L x ≥
20. Forillustration, we have depicted ∆ ρ L x for a choice of L x in Fig. C.1 (a). In the following, we choose L x = (cid:178) ρ Lx = − and (cid:178) E Lx = − are already stricter than the maximal accuracy of thelater nonlinear (dHF, dRDMFT) calculations. Next, we perform ∆ x = {0.2,0.19,0.18,...,0.06}. We find that ∆ E ∆ x also drops with decreasing ∆ x until ∆ E ∆ x < − for ∆ x = ∆ ρ ∆ x for a choice of ∆ x in Fig. C.1(b). ∆ ρ ∆ x instead decreases (slowly) until the lowest tested value of ∆ x = ∆ ρ ∆ x < − alreadyfor ∆ x < ∆ ρ ∆ x < − , we need to decrease the spacing until ∆ x = (cid:178) ρ ∆ x = − .We repeat the same series for the harmonic oscillator system of the q-coordinate with frequency ω = ω res ≈ ∆ E L q < − and ∆ ρ L q < − for L q ≥
14. The spacingis converged in the energy ∆ E ∆ q < − and in the density ∆ ρ ∆ q < − for ∆ q ≤ L x = L q = ∆ x = ∆ q = λ > λ = ω = ω res . To have a uniform grid distribution, we also set ∆ x = ∆ q , although our preliminarycalculations suggest that we could choose a larger ∆ q . Unfortunately, we cannot explore this space com-pletely, but we are limited with L x , L q ≤ However, we can confirm that the energy is converged in theq-direction for L q =
14 and in the x-direction, we have ∆ E L x ≈ − for L x = CTOPUS results with the results from the D
YNAMICS code. For that, we considerthe differences in the total energy E OD = | E D ynamics − E Octopus | and the maximum deviations in the polari-tonic density ρ OD ≡ max x , q | ρ Octopus ( x , q ) − ρ D ynamics ( x , q ) | between the two codes. Due to the optimiza-tion of D YNAMICS , we are limited to a box length of L x = L q =
14 to have a spacing of ∆ x = ∆ q = CTOPUS increase to ∆ E L x , L q , ∆ x , ∆ q = ∆ ρ L x , L q , ∆ x , ∆ q ≈ − . When we compare the ground-state energies of both codes for this maximum possible mesh, we find E OD ≈ − and ρ OD ≈ − . We thus were able to confirm that both codes agree on the level of accuracythat we estimated from the O CTOPUS calculations before and this although they are quite different from anumerical perspective. We conclude from these calculations that the implementation of the exact many-body routine for dressed two-electron systems in O CTOPUS is reliable and we can use it as benchmark forthe dHF and dRDMFT approximations within O
CTOPUS . C.2 Validation of the dHF routine
In this section, we present the validation of dHF, which requires several steps. We start with the electronic HF routine and converge the He atom model (outside the cavity) in box size and spacing, where we proceed as in the previous section. Then, we converge the second HF routine implemented in OCTOPUS (that we call HF_basis in this section), which is based on the RDMFT implementation and thus makes use of a basis set (Sec. 7.2.2). As mentioned there, the difference of such a basis-set implementation is that the routine calculates all the integral kernels of the total energy as matrix elements of a chosen basis and then searches for the energy minimum by varying only the corresponding coefficients. This routine can be considerably faster than the standard one of OCTOPUS, i.e., the conjugate-gradients algorithm, because the problematic exchange term needs to be calculated only once, at the beginning, for the matrix elements. Such a basis-set implementation requires a convergence study with respect to the basis size, which we explain in detail in the second part of this section together with a comparison between HF and HF_basis. We conclude the section with the discussion of dHF, which also uses a basis set. However, the basis-set convergence for dHF is more involved than for the purely electronic HF_basis and we explain this in detail. Afterwards, we compare dHF in the no-coupling limit to HF/HF_basis and we discuss the comparison of OCTOPUS and DYNAMICS on the level of dHF, where we follow the same strategy as in the previous section.
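The idea behind a basis-set routine, i.e., computing all matrix elements once and then varying only the expansion coefficients, can be sketched for a closed-shell two-electron system as follows. This is a minimal illustration and not the OCTOPUS implementation: the soft-Coulomb potentials, the three-point finite-difference stencil, the grid parameters, the density-matrix mixing and all function names are assumptions chosen for brevity.

```python
import numpy as np

def build_basis_integrals(n_basis, box=20.0, dx=0.1):
    """Precompute one- and two-body matrix elements in an IP eigenbasis (done once)."""
    x = np.arange(-box / 2, box / 2 + dx / 2, dx)
    n = x.size
    # three-point finite-difference Laplacian (assumption; OCTOPUS uses higher order)
    lap = (np.eye(n, k=1) - 2.0 * np.eye(n) + np.eye(n, k=-1)) / dx**2
    v = -2.0 / np.sqrt(x**2 + 1.0)                  # illustrative soft-Coulomb potential
    h_grid = -0.5 * lap + np.diag(v)
    w_grid = 1.0 / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)
    _, vecs = np.linalg.eigh(h_grid)                # independent-particle orbitals
    phi = vecs[:, :n_basis] / np.sqrt(dx)           # normalized on the grid
    h = dx * phi.T @ h_grid @ phi                   # one-body matrix elements
    pair = np.einsum('xi,xj->ijx', phi, phi)        # products phi_i(x) phi_j(x)
    eri = dx**2 * np.einsum('ijx,xy,kly->ijkl', pair, w_grid, pair, optimize=True)
    return h, eri

def hf_energy(c, h, eri):
    """HF energy of two electrons in one spatial orbital with coefficients c."""
    d = np.outer(c, c)
    j = np.einsum('ijkl,kl->ij', eri, d)
    return 2.0 * c @ h @ c + c @ j @ c

def scf_in_basis(h, eri, eps_e=1e-10, mixing=0.5, max_iter=500):
    """Repeated diagonalization of the Fock matrix in the fixed basis."""
    c = np.zeros(h.shape[0])
    c[0] = 1.0                                      # start from the lowest IP orbital
    d = np.outer(c, c)
    e_old = hf_energy(c, h, eri)
    e_new = e_old
    for _ in range(max_iter):
        # for a two-electron singlet, Hartree plus exchange reduce to one J term
        fock = h + np.einsum('ijkl,kl->ij', eri, d)
        _, vecs = np.linalg.eigh(fock)
        c = vecs[:, 0]
        e_new = hf_energy(c, h, eri)
        d = (1.0 - mixing) * d + mixing * np.outer(c, c)   # damped density update
        if abs(e_new - e_old) < eps_e:              # energy criterion
            break
        e_old = e_new
    return e_new, c

if __name__ == "__main__":
    for m in (4, 8, 12):                            # simple basis-size series
        h, eri = build_basis_integrals(m)
        print(m, scf_in_basis(h, eri)[0])
```

In this sketch the grid is touched only in `build_basis_integrals`; the self-consistency loop works exclusively with small matrices, which is why such a routine can be much faster than a minimization on the grid, at the price of an additional convergence parameter, the basis size.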
C.2.1 HF minimization on the grid
We start with the He atom outside the cavity and calculate the HF ground-state with the standard routine ofO
CTOPUS , which uses a conjugate gradients algorithm for the orbital optimization. The convergence criteriafor the conjugate gradient algorithm are defined exactly as before for the exact many-body ground-state. Weset them to (cid:178) E = − and (cid:178) ρ = − and determine L x and ∆ x . We start with a spacing of ∆ x = The precise reasons is that the memory of one node of the cluster we are using is too small. We would consequently need todistribute the wave function over several nodes, which is possible, but would demand a considerable programming effort. Larger systems are in principal also possible, but numerically infeasible in real space. ∆ E L x ≈ − and ∆ ρ L x < − for all L x ≥
20. We choose L x =
20 and perform the spacing series, which isconverged in energy with ∆ E ∆ x < − for ∆ x ≤ ∆ ρ ∆ x < − for ∆ x ≤ ∆ x ≤ ∆ x = C.2.2 HF minimization with a basis set
In this section, we present the convergence of the He test system in the HF basis routine with respect tothe basis set. It is based on the orbital-based RDMFT implementation that we introduced and discussedin Sec. 7.2, employing the HF functional (see paragrpah below Eq. 7.1.1). To generate a basis, we perform apreliminary calculation with the independent particle (IP) routine, which solves a simple 1-body Schrödingerequation to generate the basis ((1/2 ∇ + v (cid:48) He ) φ i = e i φ i , see Sec. 7.4). To allow for enough variational freedom,we calculate besides the occupied orbitals (that form the HF ground state and are denoted by GS ) alsoexcited orbitals, which are called extra states ( ES ) in O CTOPUS . The so generated basis set has the size M = GS + ES .As discussed in Sec. 7.2, the routine performs the minimization by representing the coefficients of thebasis-set expansion in a matrix and diagonalizing it repeatedly until self-consistence. The correspondingenergy convergence criterion (cid:178) E remains the same in HF basis like in HF. But the second convergence crite-rion (cid:178) F of the orbital minimization that tests the hermiticity of the Lagrange multiplier matrix needs to beadapted. As (cid:178) F is a considerably stricter criterion than (cid:178) E (and (cid:178) ρ ), it is set as default to (cid:178) Λ = · (cid:178) E .For the convergence of the size of the basis set M or equivalently the parameter ES , we perform a se-ries ES = {10,20,...,100} and test deviations in energy ∆ E ES , ES ref = E ES − E ES ref and density ∆ ρ ES , ES ref = max x | ρ ES ( x ) − ρ ES ref ( x ) | from the reference value ES re f that yields the lowest total energy and that typicallyoccurs for the largest basis. We use L x and ∆ x that we determined before in HF and find that ∆ E ES ,100 and ∆ ρ ES ,100 decrease with increasing ES and for ES ≥ ∆ E ES ,100 < − and ∆ ρ ES ,100 ≈ − . 
The latter valueis considerably smaller than the one we obtained with the standard solver and the reason is the bad qualityof the basis set. We see that for ES ≥
40 the extra basis states cannot improve the accuracy anymore but in-stead add only noise. We will see later for RDMFT, where we also have to optimized the occupation numbersthat this effect leads to a generic non-variational behavior, i.e., the energy does not always decrease or stayconstant with increasing ES , but will first drop and than rise again for very large ES . Thus, we define therewhat we have called an optimal region for the basis size, in which the energy remains constant within a cer-tain threshold. For the sake of completeness, we want to mention that after we had published the paper, wefound such kind of non-variationality even for HF, however only for very large ES (cid:39) basis cannot be as strictly converged as HF, which is especially visible for the electrondensity deviations that saturate of the order of ∆ ρ ES ,100 ≈ − even for large basis sets. Nevertheless, thisaccuracy is sufficient to interpret and compare numerical data in the following. Comparing HF with HF basis with these parameters, we find that | E HF − E HF basis | ≈ − and max x | ρ HF ( x ) − ρ HF basis ( x ) | ≈ − . So bothmethods are consistent. Note that in our case, we have always GS = N , where N is the number of electrons, because we only look at closed-shell systems,which distribute two electrons to every spatial orbital. C.2.3 Validation of dHF
After having properly understood the convergence of the electronic HF methods, we can now turn to dHF. All the calculations shown in Ch. 6 are done with the basis-set type of implementation, whose convergence is more involved than in the purely electronic HF_basis case. Thus, we start in Sec. C.2.4 by discussing the new issues that arise when one needs to converge a system in the dressed space with respect to the basis set. In Sec. C.2.5, we present the convergence of dHF with zero coupling (λ = 0), and in Sec. C.2.6, we compare OCTOPUS with DYNAMICS.
C.2.4 Basis-set convergence in the dressed auxiliary space
In the dressed auxiliary space that we explore with dHF and dRDMFT, the basis-set convergence is evenmore difficult than in the HF basis case. We illustrate this additional difficulty for dHF in the no-coupling( λ =
0) case. λ = ω ,because it is generated by a preliminary calculation. This can be illustrated as follows:For λ =
0, the electronic and photonic part of the system separate. Thus, the coupled (dHF) Hamilto-nian is a direct sum of the electronic (photonic) Hamiltonian ˆ H e ( ˆ H p ) and the two-dimensional dressedorbitals ψ i α ( x , q ) can be exactly decomposed in their one-dimensional electronic φ i ( x ) and photonic χ α ( q )constituents, ψ i α ( x , q ) = φ i ( x ) ⊗ χ α ( q ). Here, φ i ( χ α ) is an eigenfunction with eigenvalue e ei ( e p α ) of the elec-tronic (photonic) Hamiltonian ˆ H e ( ˆ H p ). Consequently, we know that we can calculate the eigenvalues of thedressed orbitals as sum of the uncoupled ones, i.e., e epi α = e ei + e p α . The basis set for a dHF calculation is thenconstructed using the ground and the first ES excited orbitals of a preliminary IP calculation. These orbitalsare ordered by their eigenvalue e epi α and for that, the relation between the individual energies of the electronand photon space is crucial.Table C.1 shows the decomposition of such a basis with ES = M = λ =
0, that trivially needsonly ground-state contributions for the photonic coordinate, these 3 states would be entirely unnecessaryand waste computational resources. Before we discuss this further, we want to explain how ω influencesthis distribution between electronic and photonic contributions: If we chose for example a larger ω ≈ ω , the opposite would happen and the first states would vary in the photonic contribution. A similar kindof argument could be done for all other ingredients of the Hamiltonian of course, but for ω this influenceis most directly visible and comparatively strong. Note that this effect is independent from the issue of thefermion ansatz that we discussed in Sec. 4.2.3. Also there, we can increase ω to trivial satisfy the extra con-ditions due to the hybrid statistics of the polaritonic orbitals. However, it is very difficult to disentangle botheffects (employing the Piris orbital-optimization routine).At this point, one might ask the question why at all we use the basis-set implementation and the an-swer remains the same as for the purely electronic case: Although we need large basis sets for the properlyconverged dHF calculations which makes them numerically very expensive, these computations are still248.2. VALIDATION OF THE DHF ROUTINEcontributionindex e ei e p α e epi α i α ES = M = N /2 + ES = ω ≈ λ = e ei ( e p α ) of the pure electronic (photonic) Hamiltonian ˆ H e ( ˆ H p ) shown. On the rightside, we see the orbital energies e epi α = e ei + e p α of the combined Hamiltonian ˆ H ep and the decomposition ofthe combined index: The first eigenenergy e ep = e ep = e e + e p is the sum of the two first uncoupled energies.The second energy e ep = e ep instead is formed from the electronic ground but photonic first excited state,etc. In this example, we see that both contributions are similar, although there are slightly more electronicorbitals included. 
This tendency continues also for increasing ES .relatively inexpensive compared to calculations with a conjugate gradients algorithm that calculates all in-tegrals on the grid. In the case of the He atom, the dHF calculation with conjugate gradients and ES = ES = Finally, we want to mention that the statements of this subsection carry over straightforwardly to thecoupled ( λ >
0) case, we just do not have the direct connection to the decoupled spaces and thus cannotvisualize this case as well. It is clear that we need more variational freedom for the photonic subspace thanin the λ = λ > C.2.5 Validation of dHF for λ = Now we can address the ES -convergence of dHF, which we present for λ =
0, such that we can comparethe electronic part of the converged result to HF afterwards. As we set λ =
0, we know that for the photoncomponent, the IP ground-state orbital χ ( q ) is already the exact solution, because the photons do not in-teract directly with each other and HF (or any other level of approximation) will not change this. The dHFground-state orbital thus is φ ( x , q ) = ψ HF ( x ) ⊗ χ ( q ) with the electronic HF ground-state orbital ψ HF ( x ). Aswe saw in the last section, when we converged the He test system using the HF basis routine, ψ HF ( x ) canbe approximated by an IP basis with ES =
40. So we know that in the no-coupling limit of dHF, we do notneed a larger basis set, because Kronecker-multiplying the basis-states of the HF basis -calculation with χ ( q )would be exactly sufficient. However, here and in the following, we do not want to choose by hand sucha well-adapted state space (which for very strong coupling strength is nontrivial), but instead rely on thepolaritonic IP-calculations. From our considerations before, we expect to need more basis states to reachthe same level of accuracy than using HF basis and the exact number will depend on ω . Indeed, for ω = ω res ,we need ES =
80 to have ∆ E ES ,100 < − and ∆ ρ ES ,100 ≈ − , where we calculate the electron density ofthe dHF-calculation by ρ ( x ) = (cid:82) d q ρ ( x , q ). When we instead choose a very high value of ω = ∆ E ES ,100 < − and ∆ ρ ≈ − already for ES ≈
40. For both calculations, we do not reach ∆ E ES ,100 < − Note that both algorithms are not yet fully optimized and thus, the efficiency might still significantly change. See also the Outlookin part IV. ES . This is due to the many "unnecessary" states that are taken into account in the mini-mization. These then only introduce numerical noise without providing useful variational information. Wefind this confirmed by analyzing the “photonic density” ρ ( q ) = (cid:82) d x ρ ( x , q ), which deviates from the correctdensity ρ ex ( q ) = | χ | ( q ) although χ ( q ) is explicitly part of the basis. Adding basis states consequentlycannot improve the photonic orbital. For instance, for ω = ω res , the density deviations remain at approxi-mately ∆ ρ ES , ES re f ≈ − for all ES and ES re f . When we instead choose ω = ∆ ρ ES ,100 ≈ − for ES ≥
40. Despite being less accurate, we used ω = ω res for the results in Ch. 6, because ∆ ρ ES ≈ − is still two orders of magnitude smaller than the deviations from the exact or the dRDMFT so-lution. Again, we can safely use these “less” accurate results, because we know “where it comes from,” i.e.,we understand how the accuracy is controlled.This is confirmed by the comparison of the electronic part of dHF to HF that we are now able to perform.For the energy comparison, we need to subtract the photon part E p = ω from the total dHF energy E dHF , E dHF , e = E dHF − E p . However, this analytical expression is also not exact, because of the just mentioned errorin the photonic part of the orbitals. Consequently, we have deviations for ω = ω res of | E HF − E ω = ω res dHF , e | ≈ · − and max x | ρ HF ( x ) − ρ ω = ω res dHF , e ( x ) | ≈ − . However, for the “better” value of ω = | E HF − E ω = dHF , e | ≈ · − and max x | ρ HF ( x ) − ρ ω = dHF , e ( x ) | ≈ − . C.2.6 Validation of dHF with D
YNAMICS
We conclude this section by comparing the results of our implementation of dHF in O
CTOPUS with the D Y - NAMICS code, which uses an imaginary time propagation [300] algorithm to calculate the HF ground-state.This comparison allows us to validate also the case of λ >
0. We choose ω = ω res and λ = ES -convergence, confirming that we need ES =
90 for the same level of accuracylike before in the no-coupling case. For the sake of completeness, we want to mention that for ω = λ = ω = ω res the level of convergence is an order of magnitude better thanthe expected deviations between the codes that we estimated before.So we compare the ground-state for these parameters to the result of D YNAMICS and find for the energythe expected deviations of E OD ≈ − . For the density, we find instead deviations of ρ OD ≈ − . This dis-crepancy is probably due to the different convergence criteria of the two codes. D YNAMICS tests the eigen-value equation of the one-body Hamiltonian for a certain subset of all the grid points, which is a muchstronger criterion than the one of O
CTOPUS , explained before. The influence of these different criteria on aself-consistent calculation are naturally stronger than on the calculation of a linear eigenvalue-problem likethe many-body calculation. Still, density errors of the order of 10 − are small enough for our purposes. Wefind similar errors also for other values of λ and conclude that both codes are sufficiently consistent. C.3 Validation of dRDMFT
In this section, we can finally turn to dRDMFT. This is the first implementation of this theory, so we cannot validate it with a reference code any more. However, the difference between HF and RDMFT (using the Müller functional) on the implementation level lies essentially in the treatment of the occupation numbers, which are fixed to 2 and 0 in the former case but are allowed to be non-integer in the latter. The 1-body and 2-body terms, which implementation-wise are the only modifications due to the dressed auxiliary construction (v(r) → v′(r, q), w(r, r′) → w′(rq, r′q′)), are the same for HF and RDMFT and thus also for dHF and dRDMFT. This means that the validation of dHF that we presented in the previous section at the same time largely validates the implementation of dRDMFT.
Still, we need to analyze and understand the convergence with respect to the basis set in dRDMFT and check for the consistency between the results of RDMFT and dRDMFT in the no-coupling limit.
C.3.1 Basis-set convergence of RDMFT
In RDMFT, we have to perform two interdependent minimizations: one for the natural orbitals φ_i and one for the natural occupation numbers n_i (where always i = 1, ..., M). This is done by alternately fixing φ_i or n_i while optimizing the other until overall convergence is achieved (see Sec. 7.1.3).

We can define a different convergence criterion for each minimization, ε_E (which is connected to ε_Λ, see Sec. C.2.2) and ε_µ. The latter tests the convergence of the Lagrange multiplier µ that appears in the RDMFT functional to fix the total number of particles in the system (see algorithm 2). The occupation-number optimization routine at iteration step m sets µ = µ_m, minimizes the total energy with respect to the n_i = n_i^m, and calculates the particle number N_m = Σ_{i=1}^M n_i. Based on the deviation from the correct particle number of the system, µ_{m+1} is increased or decreased. The routine exits if |µ_m − µ_{m−1}| < ε_µ. For the remainder of this section, we set ε_E = ε_µ = − .

However, the ES-convergence of RDMFT is again more difficult than in the HF_basis case. Contrary to the (largely) monotonic dependence between ES and the energy that we found for HF_basis, the current RDMFT implementation in OCTOPUS shows a clear non-variational behavior with respect to the number of basis states. For all tested systems, the energy went down with increasing ES until a certain value ES_min < 100 and then up again. Therefore, it seems that the interplay of the two minimization processes and the relatively soft convergence criteria introduce such large errors for big ES that they exceed the gain in accuracy due to more variational freedom. In Tab. C.2, we show the variation of the ground-state energy of He (outside the cavity) for a series ES = {10, 20, ..., 80}. The energy first decreases until its lowest value at ES_min ≈ 40 and then increases again.

ES    Energy        ∆E_ES = E_ES − E_{ES−10}
10    -2.2421837    -
20    -2.2426908    −5.071 · 10^−4
30    -2.2427080    −1.72 · 10^−5
40    -2.2427085    −5 · 10^−7
50    -2.2427049    +3.6 · 10^−6
60    -2.2427035    +1.4 · 10^−6
70    -2.2426928    +1.07 · 10^−5
80    -2.2426937    −9 · 10^−7

Table C.2: Ground-state energies of the He atom (outside the cavity) calculated by RDMFT with the parameters mentioned in the main text. The energy first goes down with increasing ES until it reaches its lowest value at ES_min ≈ 40 and then goes up again.

This non-strictly variational behavior makes a clear definition of the convergence difficult. However, we find that for every considered system there is an optimal region
ES_opt = {ES | ES_min ≤ ES ≤ ES_max}. By optimal region, we mean an interval of ES in which the solutions vary minimally among each other, i.e., their energy and density deviations are minimal. For the energies, we define ∆E_{ES,ES′} = |E_ES − E_ES′|, the corresponding threshold ε^E_opt, and require ∆E_{ES,ES′} < ε^E_opt for all ES, ES′ ∈ ES_opt. For the densities, we define the point-wise density deviations ρ_{ES,ES′}(x) = ρ_ES(x) − ρ_ES′(x), their maximal deviations ∆ρ_{ES,ES′} = max_x |ρ_{ES,ES′}(x)|, and the corresponding threshold ε^ρ_opt. As the second condition on ES_opt, we require ∆ρ_{ES,ES′} < ε^ρ_opt for all ES, ES′ ∈ ES_opt.

We start with the investigation of the energies and find that ∆E_{ES,ES′} < · − for all combinations 30 ≤ ES, ES′ ≤ 60, but ∆E_{ES,ES′} > − when we choose 30 ≤ ES ≤ 60 and 70 ≤ ES′ ≤ 80 or 10 ≤ ES′ ≤ 20. We conclude that the first condition for ES_opt is met by the interval 30 ≤ ES ≤ 60 with threshold ε^E_opt = · − .

For the investigation of the second condition, we depict in Fig. C.2 the density deviations ρ_{ES,ES′}(x) with ES′ = 60, the upper boundary of the interval just found, and ES′ = 80, which corresponds to the largest basis set of this example. For better visibility, the curve for ES = 10 is not shown, but it deviates more strongly than all the other curves from both ES′. We conclude that ∆ρ_{ES,ES′} goes down until ES = 20, independently of the reference. However, for ES ≥ 30 this is no longer the case. We find ∆ρ_{ES,80} ≈ − but ∆ρ_{ES,60} < · − . Additionally, we observe two different types of deviations: the curves ρ_{ES,80}(x) have a similar form for all 30 ≤ ES ≤ 60, but when we change the reference point to ES′ = 60, we cannot find pronounced similarities, which is what we would expect from fluctuations. When we also test the other possible values for ES′, we find ∆ρ_{ES,ES′} < · − for all 30 ≤ ES, ES′ ≤ 60. Thus, the second condition for the optimal region is met by the same interval as the first condition, with threshold ε^ρ_opt = · − . Therefore, we have ES_opt = {ES | 30 ≤ ES ≤ 60} and conclude that the maximum possible accuracy of RDMFT calculations is already reached for ES = 30 and that it is generally lower than for HF_basis.

Figure C.2: Differences in the ground-state density ∆ρ_{ES,ES′}(x) for RDMFT calculations of the He atom with ES′ = 80 (left) and ES′ = 60 (right). ρ_{ES=}(x) deviates similarly for both reference points, but for ES_ref = ∆ρ_ES(x) for 30 ≤ ES ≤ 60. The deviations merely drop under 10 − , except for ES = 70, which is very close to the reference. When we instead use ES′ = 60, the calculations for ES = 30 to ES = 60 deviate only on the order of 10 − and the deviations have a random character.

The strong decrease in accuracy of RDMFT in comparison to HF_basis suggests that the occupation-number minimization adds a significant error to the calculation. As we have discussed in Sec. B, even the more accurate BFGS algorithm that is now the default solver exhibits certain inaccuracies. It is very difficult to estimate the exact error that is introduced by the method because of the strongly nonlinear character of the minimization (especially the interdependence between the φ_i and n_i optimizations). For a thorough understanding of this issue, one needs to implement and test different numerical solvers. However, for the purposes of this text, we consider the current accuracy as sufficient.

C.3.2 Basis-set convergence of dRDMFT and comparison to RDMFT
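The ES-series of this subsection inherits the occupation-number routine of Sec. C.3.1, whose exit test |µ_m − µ_{m−1}| < ε_µ is one source of the inaccuracies discussed here. As a minimal sketch of that routine, and not the OCTOPUS implementation, the following Python example adjusts µ by bisection until the particle number matches the target. The toy energy Σ_i (e_i n_i − k_i √n_i), the function names, and the bracket [mu_lo, mu_hi] are assumptions made purely for illustration; the inner minimization is analytic here, whereas the actual code minimizes the RDMFT functional numerically.

```python
import numpy as np


def occupations_for_mu(mu, e, k, n_max=2.0):
    """Toy inner step (assumed model): minimize
    sum_i (e_i n_i - k_i sqrt(n_i)) - mu * sum_i n_i  with 0 <= n_i <= n_max.
    For e_i > mu the stationary point is n_i = (k_i / (2 (e_i - mu)))**2,
    otherwise the energy decreases monotonically and n_i = n_max."""
    gap = e - mu
    n = np.full_like(e, n_max, dtype=float)
    pos = gap > 0
    n[pos] = np.minimum(n_max, (k[pos] / (2.0 * gap[pos])) ** 2)
    return n


def optimize_occupations(e, k, n_target, mu_lo, mu_hi, eps_mu=1e-8, max_iter=200):
    """Bisection on the Lagrange multiplier mu.
    Exits when successive mu values differ by less than eps_mu."""
    if occupations_for_mu(mu_lo, e, k).sum() > n_target:
        raise ValueError("mu_lo already yields too many particles")
    if occupations_for_mu(mu_hi, e, k).sum() < n_target:
        raise ValueError("mu_hi yields too few particles")
    mu_prev = mu_lo
    for _ in range(max_iter):
        mu = 0.5 * (mu_lo + mu_hi)
        n = occupations_for_mu(mu, e, k)
        # Too few particles: raise mu; too many: lower mu.
        if n.sum() < n_target:
            mu_lo = mu
        else:
            mu_hi = mu
        if abs(mu - mu_prev) < eps_mu:
            return mu, n
        mu_prev = mu
    raise RuntimeError("mu did not converge within max_iter steps")


# Illustrative input (not taken from the text): four orbitals, two particles.
e = np.array([-1.0, -0.5, 0.0, 0.5])
k = np.full(4, 0.2)
mu, n = optimize_occupations(e, k, n_target=2.0, mu_lo=-5.0, mu_hi=0.0)
print(mu, n, n.sum())
```

Tightening eps_mu narrows the final bracket for µ, but it cannot remove errors made inside the inner minimization, which is why a soft inner criterion can still limit the attainable accuracy.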
For the ES-series of dRDMFT, we need to deal with the combination of the inaccuracies introduced by the n_i-minimization, explained in the previous subsection, and the additional errors due to extra-large basis sets that contain many redundant degrees of freedom, which we found for dHF before (Sec. C.2). We already know that a large photon frequency is advantageous in terms of the latter. So we choose ω = ES = 50, which deviates from ES = 40 and ES = 60 by |E_{ES=} − E_{ES=}| > − . So we cannot find a region where the energy is converged better than 10 − . Nevertheless, if we accept an accuracy of ∆E_ES < · − as sufficient, we find a region as large as 20 ≤ ES ≤ 100 that satisfies the criterion. A look at the density deviations reveals that we can slightly tighten this region and exclude ES = 20, such that we have deviations ∆ρ_{ES,50} ≈ − for all 30 ≤ ES ≤ ω < λ = |E_RDMFT − E_dRDMFT| ≈ − and max_x |ρ_RDMFT(x) − ρ_dRDMFT(x)| ≈ − , which means that the deviations between the levels of theory are of the same order as the maximal accuracy that dRDMFT provides. We conclude that both theories are consistent.

C.4 Protocol for the convergence of a dHF/dRDMFT calculation
We conclude the convergence study with a step-by-step guide for the proper usage of the dressed-orbital implementation in OCTOPUS. All the real-space calculations presented in Ch. 6 in the main part of this work were performed according to this protocol.

1. Box length L_x and spacing ∆x convergence for the purely electronic part of the system on the level of IP and electronic HF and for the uncoupled photonic system on the level of IP.
   • Test the deviations in energy ∆E_{Lx} < ε^E_{Lx} and density ∆ρ_{Lx} < ε^ρ_{Lx}, as explained in Sec. C.1. We chose ε^E_{Lx} = − and ε^ρ_{Lx} = − to exclude any numerical artefacts. However, as the dHF and dRDMFT calculations typically do not reach such precision, one can relax these criteria in general.
   • For the ∆x-series, test only the deviations in energy ∆E_{∆x} < ε^E_{∆x} due to the larger density errors. We chose ε^E_{∆x} = − , but as for the box length, this criterion can be relaxed.
2. Basis-size convergence for the HF_basis routine with the purely electronic part of the system.
   • Perform an ES-series and test the deviations in energy ∆E_{ES,ESref} < ε^E_ES and in density ∆ρ_{ES,ESref} < ε^ρ_ES as explained in Sec. C.2.2. Here, we were typically able to reach ε^E_ES = − and ε^ρ_ES = − . Note that in Sec. C.2.2, we wrote instead ∆ρ_{ES,ESref} ≈ − because some values were slightly larger than 10 − . Thus, ε^E_ES = − for sure is satisfied.
   • Compare the HF_basis and the electronic HF results in energy and density and make sure that both are consistent on their level of accuracy.
3. Basis-size convergence of the desired dressed theory (dHF or dRDMFT) in the no-coupling (λ = 0) limit.
   • Perform an ES-series as for HF_basis. Note that one needs to expect considerably larger basis sets for the same level of convergence (see Sec. C.2.5 for details).
   • Check consistency of the electronic sub-part of the system with HF_basis as mentioned in Sec. C.2.5. If this check fails drastically, this is very probably due to the violation of the extra exchange symmetry in the photonic coordinates (see Sec. 5.1).
4. The convergence study is finished with another basis-set convergence for λ > 0. Typically, we also performed another small box-length series with the converged basis set to make sure that the coupling does not increase the size of the system so crucially that boundary effects could influence the results.
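The ES-series tests of steps 2 to 4 can be sketched in a few lines. The following Python example is a minimal illustration and not part of OCTOPUS: the function names are hypothetical, energies and densities are assumed to have been collected by hand from separate runs, and the threshold 1e-5 in the usage lines is chosen only for illustration. It searches for the largest contiguous ES interval in which all pairs of calculations agree in energy (and optionally in density) within given thresholds, following the optimal-region criterion of Sec. C.3.1.

```python
from itertools import combinations

import numpy as np


def max_density_deviation(rho_a, rho_b):
    """max_x |rho_a(x) - rho_b(x)| for two densities on the same grid."""
    diff = np.asarray(rho_a, dtype=float) - np.asarray(rho_b, dtype=float)
    return float(np.max(np.abs(diff)))


def pair_is_consistent(a, b, energy, eps_energy, density=None, eps_density=None):
    """True if runs a and b agree in energy and, if given, in density."""
    if abs(energy[a] - energy[b]) >= eps_energy:
        return False
    if density is not None:
        return max_density_deviation(density[a], density[b]) < eps_density
    return True


def optimal_region(es_values, energies, eps_energy, density=None, eps_density=None):
    """Longest contiguous run of ES values whose members are pairwise consistent.
    `density` is an optional dict mapping each ES value to a density array."""
    energy = dict(zip(es_values, energies))
    es_sorted = sorted(energy)
    best = []
    for i in range(len(es_sorted)):
        # Only windows longer than the current best can improve the result.
        for j in range(i + len(best) + 1, len(es_sorted) + 1):
            window = es_sorted[i:j]
            if all(pair_is_consistent(a, b, energy, eps_energy, density, eps_density)
                   for a, b in combinations(window, 2)):
                best = window
            else:
                break  # any larger window starting at i contains the failing pair
    return best


# Energies of Table C.2 (He atom, RDMFT); the threshold is illustrative.
es = [10, 20, 30, 40, 50, 60, 70, 80]
energies = [-2.2421837, -2.2426908, -2.2427080, -2.2427085,
            -2.2427049, -2.2427035, -2.2426928, -2.2426937]
print(optimal_region(es, energies, eps_energy=1e-5))  # -> [30, 40, 50, 60]
```

With the energies of Table C.2 and this illustrative threshold, the energy criterion alone selects the interval 30 ≤ ES ≤ 60; passing densities in addition can only shrink the region.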
IBLIOGRAPHY [1] Buchholz, F., Theophilou, I., Nielsen, S. E., Ruggenthaler, M., and Rubio, A.
Reduced Density-MatrixApproach to Strong Matter-Photon Interaction . ACS Photonics, 6:2694, 2019.[2] Buchholz, F., Theophilou, I., Giesbertz, K. J. H., Ruggenthaler, M., and Rubio, A.
Light–Matter Hybrid-Orbital-Based First-Principles Methods: The Influence of Polariton Statistics . J. Chem. Theory Comput.,16:24, 2020.[3] Tancogne-Dejean, N., Oliveira, M. J., Andrade, X., Appel, H., Borca, C. H., Le Breton, G., Buchholz,F., Castro, A., Corni, S., Correa, A. A., De Giovannini, U., Delgado, A., Eich, F. G., Flick, J., Gil, G.,Gomez, A., Helbig, N., Hübener, H., Jestädt, R., Jornet-Somoza, J., Larsen, A. H., Lebedeva, I. V., Lüders,M., Marques, M. A., Ohlmann, S. T., Pipolo, S., Rampp, M., Rozzi, C. A., Strubbe, D. A., Sato, S. A.,Schäfer, C., Theophilou, I., Welden, A., and Rubio, A.
Octopus, a computational framework for explor-ing light-driven phenomena and quantum dynamics in extended and finite systems . J. Chem. Phys.,152(12):124119, 2020.[4] Theophilou, I., Buchholz, F., Eich, F. G., Ruggenthaler, M., and Rubio, A.
Kinetic-Energy Density-Functional Theory on a Lattice . J. Chem. Theory Comput., 14(8):4072, 2018.[5] Jackson, J. D.
Classical electrodynamics . John Wiley & Sons, 2007.[6] Cottingham, W. N. and Greenwood, D. A.
An Introduction to the Standard Model of Particle Physics .Cambridge University Press, 2007.[7] Pauli, W. and Fierz, M.
Zur Theorie der Emission langwelliger Lichtquanten . Nuovo Cim., 15(3):167,1938.[8] Greiner, W. and Reinhardt, J.
Quantum Electrodynamics . Springer-Verlag Berlin Heidelberg, 2009.[9] Spohn, H.
Dynamics of charged particles and their radiation field . Cambridge university press, 2004.[10] Greiner, W. and Reinhardt, J.
Field quantization . Springer Science & Business Media, 2013.[11] Keller, O.
Quantum theory of near-field electrodynamics . Springer Science & Business Media, 2012.[12] Shepherd, J. J., Booth, G., Grüneis, A., and Alavi, A.
Full configuration interaction perspective on thehomogeneous electron gas . Phys. Rev. B - Condens. Matter Mater. Phys., 85(8):81103, 2012.[13] Kohn, W.
Nobel Lecture: Electronic Structure of Matter . 1999. .[14] Ruggenthaler, M., Tancogne-Dejean, N., Flick, J., Appel, H., and Rubio, A.
From a quantum-electrodynamical light–matter description to novel spectroscopies . Nat. Rev. Chem., 2(3):0118, 2018.[15] Kéna-Cohen, S. and Forrest, S. R.
Room-temperature polariton lasing in an organic single-crystal mi-crocavity . Nat. Photonics, 4(6):371, 2010. 255IBLIOGRAPHY[16] Hutchison, J. A., Schwartz, T., Genet, C., Devaux, E., and Ebbesen, T. W.
Modifying chemical landscapesby coupling to vacuum fields . Angew. Chemie - Int. Ed., 51(7):1592, 2012.[17] Coles, D. M., Somaschi, N., Michetti, P., Clark, C., Lagoudakis, P. G., Savvidis, P. G., and Lidzey, D. G.
Polariton-mediated energy transfer between organic dyes in a strongly coupled optical microcavity . Nat.Mater., 13(7):712, 2014.[18] Kiffner, M., Coulthard, J. R., Schlawin, F., Ardavan, A., and Jaksch, D.
Manipulating quantum materialswith quantum light . Phys. Rev. B, 99(8), 2019.[19] Ashida, Y., Imamoglu, A., Faist, J., Jaksch, D., Cavalleri, A., and Demler, E.
Quantum ElectrodynamicControl of Matter: Cavity-Enhanced Ferroelectric Phase Transition . arXiv Prepr., 2020.[20] Hirai, K., Hutchison, J. A., and Uji-i, H.
Recent Progress of Vibropolaritonic Chemistry . Chempluschem,cplu.202000411, 2020.[21] Dutra, S. M.
Cavity quantum electrodynamics: the strange theory of light in a box . John Wiley & Sons,2005.[22] Ebbesen, T. W.
Hybrid light–matter states in a molecular and material science perspective . Accounts ofChemical Research, 49(11):2403, 2016.[23] Jestädt, R., Ruggenthaler, M., Oliveira, M. J. T., Rubio, A., and Appel, H.
Light-matter interactionswithin the Ehrenfest–Maxwell–Pauli–Kohn–Sham framework: fundamentals, implementation, andnano-optical applications . Adv. Phys., 68(4):225, 2019.[24] De Liberato, S.
Virtual photons in the ground state of a dissipative system . Nat. Commun., 8(1):1, 2017.[25] Baer, M.
Beyond Born-Oppenheimer: Electronic Nonadiabatic Coupling Terms and Conical Intersec-tions . Wiley, 2006.[26] Schäfer, C., Ruggenthaler, M., Appel, H., and Rubio, A.
Modification of excitation and charge transferin cavity quantum-electrodynamical chemistry . Proc. Natl. Acad. Sci., 116(11):4883, 2019.[27] Mazziotti, D. A. (editor).
Reduced-Density-Matrix Mechanics: With Application to Many-ElectronAtoms and Molecules , vol. 134 of
Advances in Chemical Physics . John Wiley & Sons, Inc., Hoboken,NJ, USA, 134 ed., 2007.[28] Webb, J. K., Flambaum, V. V., Churchill, C. W., Drinkwater, M. J., and Barrow, J. D.
Search for TimeVariation of the Fine Structure Constant . Phys. Rev. Lett., 82(5):884, 1999.[29] Eden, J. G.
High-order harmonic generation and other intense optical field-matter interactions: Reviewof recent experimental and theoretical advances . Prog. Quantum Electron., 28(3-4):197, 2004.[30] Ivanov, M. Y., Spanner, M., and Smirnova, O.
Anatomy of strong field ionization . J. Mod. Opt., 52(2-3):165, 2005.[31] Mitrano, M., Cantaluppi, A., Nicoletti, D., Kaiser, S., Perucchi, A., Lupi, S., Di Pietro, P., Pontiroli, D.,Riccò, M., Clark, S. R., Jaksch, D., and Cavalleri, A.
Possible light-induced superconductivity in K3C60at high temperature . Nature, 530(7591):461, 2016.256IBLIOGRAPHY[32] Oka, T. and Kitamura, S.
Floquet Engineering of Quantum Materials . Annu. Rev. Condens. MatterPhys., 10(1):387, 2019.[33] Mahmood, F., Chan, C.-K., Alpichshev, Z., Gardner, D., Lee, Y., Lee, P. A., and Gedik, N.
Selective scat-tering between Floquet–Bloch and Volkov states in a topological insulator . Nat. Phys., 12(4):306, 2016.[34] McIver, J. W., Schulte, B., Stein, F.-U., Matsuyama, T., Jotzu, G., Meier, G., and Cavalleri, A.
Light-induced anomalous Hall effect in graphene . Nat. Phys., 16(1):38, 2020.[35] Törmä, P. and Barnes, W. L.
Strong coupling between surface plasmon polaritons and emitters: a review .Reports Prog. Phys., 78(1):013901, 2015.[36] Lidzey, D. G., Bradley, D. D., Skolnick, M. S., Virgili, T., Walker, S., and Whittaker, D. M.
Strong exciton-photon coupling in an organic semiconductor microcavity . Nature, 395(6697):53, 1998.[37] Fujita, T., Sato, Y., Kuitani, T., and Ishihara, T.
Tunable polariton absorption of distributed feedbackmicrocavities at room temperature . Phys. Rev. B - Condens. Matter Mater. Phys., 57(19):12428, 1998.[38] Barnes, B., García Vidal, F., and Aizpurua, J.
Special issue on "strong coupling of molecules to cavities" .ACS Photonics, 5(1):1, 2018.[39] Ruggenthaler, M., Flick, J., Pellegrini, C., Appel, H., Tokatly, I. V., and Rubio, A.
Quantum-electrodynamical density-functional theory: Bridging quantum optics and electronic-structure theory .Phys. Rev. A, 90(1):1, 2014.[40] De Liberato, S.
Light-matter decoupling in the deep strong coupling regime: The breakdown of thepurcell effect . Phys. Rev. Lett., 112(1):1, 2014.[41] Haroche, S. and Kleppner, D.
Cavity quantum electrodynamics . Phys. Today, 42(1):24, 1989.[42] Pockrand, I., Brillante, A., and Möbius, D.
Exciton-surface plasmon coupling: An experimental investi-gation . J. Chem. Phys., 77(12):6289, 1982.[43] Skolnick, M. S., Fisher, T. A., and Whittaker, D. M.
Strong coupling phenomena in quantum microcavitystructures . Semicond. Sci. Technol., 13(7):645, 1998.[44] Raimond, J. M., Brune, M., and Haroche, S.
Colloquium: Manipulating quantum entanglement withatoms and photons in a cavity . Rev. Mod. Phys., 73(3):565, 2001.[45] Jaynes, E. T. and Cummings, F. W.
Comparison of Quantum and Semiclassical Radiation Theories withApplication to the Beam Maser . Proc. IEEE, 51(1):89, 1963.[46] Scheel, S. and Buhmann, S. Y.
Macroscopic QED - concepts and applications . preprint arXiv:0902.3586,2009.[47] Litinskaya, M., Reineker, P., and Agranovich, V. M.
Exciton-polaritons in organic microcavities . In
J.Lumin. , vol. 119-120, 277–282. North-Holland, 2006.[48] Kockum, A. F., Miranowicz, A., De Liberato, S., Savasta, S., and Nori, F.
Ultrastrong coupling betweenlight and matter . Nat. Rev. Phys., 1(1):19, 2019.[49] Plumhof, J. D., Stöferle, T., Mai, L., Scherf, U., and Mahrt, R. F.
Room-temperature Bose-Einstein con-densation of cavity exciton-polaritons in a polymer . Nat. Mater., 13(3):247, 2014. 257IBLIOGRAPHY[50] Imamoglu, A., Ram, R. J., Pau, S., and Yamamoto, Y.
Nonequilibrium condensates and lasers withoutinversion: Exciton-polariton lasers . Phys. Rev. A - At. Mol. Opt. Phys., 53(6):4250, 1996.[51] Orgiu, E., George, J., Hutchison, J. A., Devaux, E., Dayen, J. F., Doudin, B., Stellacci, F., Genet, C.,Schachenmayer, J., Genes, C., Pupillo, G., Samorì, P., and Ebbesen, T. W.
Conductivity in organic semi-conductors hybridized with the vacuum field . Nat. Mater., 14(11):1123, 2015.[52] Schwartz, T., Hutchison, J. A., Genet, C., and Ebbesen, T. W.
Reversible switching of ultrastrong light-molecule coupling . Phys. Rev. Lett., 106(19):1, 2011.[53] Thomas, A., George, J., Shalabney, A., Dryzhakov, M., Varma, S. J., Moran, J., Chervy, T., Zhong, X.,Devaux, E., and Genet, C.
Ground-state chemical reactivity under vibrational coupling to the vacuumelectromagnetic field . Angewandte Chemie, 128(38):11634, 2016.[54] Wang, D., Kelkar, H., Martin-Cano, D., Rattenbacher, D., Shkarin, A., Utikal, T., Götzinger, S., and San-doghdar, V.
Turning a molecule into a coherent two-level quantum system . Nat. Phys., 15(5):483, 2019.[55] Chikkaraddy, R., De Nijs, B., Benz, F., Barrow, S. J., Scherman, O. A., Rosta, E., Demetriadou, A., Fox,P., Hess, O., and Baumberg, J. J.
Single-molecule strong coupling at room temperature in plasmonicnanocavities . Nature, 535(7610):127, 2016.[56] Pendry, J. B., Schurig, D., and Smith, D. R.
Controlling electromagnetic fields . Science, 312:1780, 2006.[57] Niemczyk, T., Deppe, F., Huebl, H., Menzel, E. P., Hocke, F., Schwarz, M. J., Zueco, D., Hümmer,T., Solano, E., Marx, A., and Gross, R.
Circuit quantum electrodynamics in the ultrastrong-couplingregime . Nat. Phys., 6(10):772, 2010.[58] Yoshihara, F., Fuse, T., Ashhab, S., Kakuyanagi, K., Saito, S., and Semba, K.
Superconducting qubit-oscillator circuit beyond the ultrastrong-coupling regime . Nat. Phys., 13(1):44, 2017.[59] Liu, X., Galfsky, T., Sun, Z., Xia, F., Lin, E. C., Lee, Y. H., Kéna-Cohen, S., and Menon, V. M.
Stronglight-matter coupling in two-dimensional atomic crystals . Nat. Photonics, 9(1):30, 2014.[60] Forn-Díaz, P., Lamata, L., Rico, E., Kono, J., and Solano, E.
Ultrastrong coupling regimes of light-matterinteraction . Rev. Mod. Phys., 91(2):025005, 2019.[61] Scalari, G., Maissen, C., Turˇcinková, D., Hagenmüller, D., De Liberato, S., Ciuti, C., Reichl, C., Schuh,D., Wegscheider, W., Beck, M., and Faist, J.
Ultrastrong coupling of the cyclotron transition of a 2Delectron gas to a THz metamaterial . Science, 335(6074):1323, 2012.[62] Bayer, A., Pozimski, M., Schambeck, S., Schuh, D., Huber, R., Bougeard, D., and Lange, C.
TerahertzLight-Matter Interaction beyond Unity Coupling Strength . Nano Lett., 17(10):6340, 2017.[63] Lethuillier-Karl, L., Devaux, E., Genet, C., Chervy, T., George, J., Ebbesen, T. W., Nagarajan, K., Shalab-ney, A., Moran, J., Thomas, A., and Vergauwe, R. M. A.
Tilting a ground-state reactivity landscape byvibrational strong coupling . Science, 363(6427):615, 2019.[64] Lather, J., Bhatt, P., Thomas, A., Ebbesen, T. W., and George, J.
Cavity Catalysis by Cooperative Vibra-tional Strong Coupling of Reactant and Solvent Molecules . Angew. Chemie - Int. Ed., 58(31):10635,2019.258IBLIOGRAPHY[65] Keeling, J.
Coulomb interactions, gauge invariance, and phase transitions of the Dicke model . J. Phys.Condens. Matter, 19(29):8, 2007.[66] Martínez-Martínez, L. A., Ribeiro, R. F., Campos-González-Angulo, J., and Yuen-Zhou, J.
Can Ultra-strong Coupling Change Ground-State Chemical Reactions?
ACS Photonics, 5(1):167, 2018.[67] Galego, J., Climent, C., Garcia-Vidal, F. J., and Feist, J.
Cavity Casimir-Polder forces and their effects inground state chemical reactivity . Phys. Rev. X, 9(021057):1, 2019.[68] George, J., Wang, S., Chervy, T., Canaguier-Durand, A., Schaeffer, G., Lehn, J. M., Hutchison, J. A.,Genet, C., and Ebbesen, T. W.
Ultra-strong coupling of molecular materials: Spectroscopy and dynam-ics . Faraday Discuss., 178(0):281, 2015.[69] Feist, J. and Garcia-Vidal, F. J.
Extraordinary exciton conductance induced by strong coupling . Phys.Rev. Lett., 114(19):1, 2015.[70] Cwik, J. A., Kirton, P., De Liberato, S., and Keeling, J.
Excitonic spectral features in strongly coupledorganic polaritons . Phys. Rev. A, 93(3):1, 2016.[71] Herrera, F. and Spano, F. C.
Cavity-Controlled Chemistry in Molecular Ensembles . Phys. Rev. Lett.,116(23):1, 2016.[72] Dicke, R. H.
Coherence in spontaneous radiation processes . Phys. Rev., 93(1):99, 1954.[73] Hepp, K. and Lieb, E. H.
On the superradiant phase transition for molecules in a quantized radiationfield: the dicke maser model . Ann. Phys. (N. Y)., 76(2):360, 1973.[74] Rzaewski, K., Wódkiewicz, K., and Zakowicz, W.
Phase transitions, two-level atoms, and the A2 term .Phys. Rev. Lett., 35(7):432, 1975.[75] Viehmann, O., Von Delft, J., and Marquardt, F.
Superradiant phase transitions and the standard de-scription of circuit QED . Phys. Rev. Lett., 107(11):1, 2011.[76] De Bernardis, D., Pilar, P., Jaako, T., De Liberato, S., and Rabl, P.
Breakdown of gauge invariance inultrastrong-coupling cavity QED . Phys. Rev. A, 98(5):1, 2018.[77] Baumann, K., Guerlin, C., Brennecke, F., and Esslinger, T.
Dicke quantum phase transition with asuperfluid gas in an optical cavity . Nature, 464(7293):1301, 2010.[78] Zhiqiang, Z., Lee, C. H., Kumar, R., Arnold, K. J., Masson, S. J., Parkins, A. S., and Barrett, M. D.
Nonequi-librium phase transition in a spin-1 Dicke model . Optica, 4(4):424, 2017.[79] Kirton, P., Roses, M. M., Keeling, J., and Dalla Torre, E. G.
Introduction to the Dicke Model: FromEquilibrium to Nonequilibrium, and Vice Versa . Adv. Quantum Technol., 2(1-2):1800043, 2019.[80] George, J., Chervy, T., Shalabney, A., Devaux, E., Hiura, H., Genet, C., and Ebbesen, T. W.
Multiple RabiSplittings under Ultrastrong Vibrational Coupling . Phys. Rev. Lett., 117(October):153601, 2016.[81] Fujita, S. and Godoy, S.
Theory of High Temperature Superconductivity . Springer Netherlands, Dor-drecht, 2001.[82] Phillips, P.
Mottness . Ann. Phys. (N. Y)., 321(7):1634, 2006. 259IBLIOGRAPHY[83] van Santen, R. and Sautet, P.
Computational Methods in Catalysis and Materials Science: An Introduc-tion for Scientists and Engineers . Wiley, 2015.[84] Mardirossian, N. and Head-Gordon, M.
Thirty years of density functional theory in computationalchemistry: An overview and extensive assessment of 200 density functionals . Mol. Phys., 115(19):2315,2017.[85] Hubbard, J. and Flowers, B. H.
Electron correlations in narrow energy bands . Proceedings of the RoyalSociety of London. Series A. Mathematical and Physical Sciences, 276(1365):238, 1963.[86] LeBlanc, J. P. F., Antipov, A. E., Becca, F., Bulik, I. W., Chan, G. K.-L., Chung, C.-M., Deng, Y., Ferrero,M., Henderson, T. M., Jiménez-Hoyos, C. A., Kozik, E., Liu, X.-W., Millis, A. J., Prokof’ev, N. V., Qin, M.,Scuseria, G. E., Shi, H., Svistunov, B. V., Tocchio, L. F., Tupitsyn, I. S., White, S. R., Zhang, S., Zheng,B.-X., Zhu, Z., and Gull, E.
Solutions of the Two-Dimensional Hubbard Model: Benchmarks and Resultsfrom a Wide Range of Numerical Algorithms . Phys. Rev. X, 5(4):041041, 2015.[87] Fanfarillo, L.
Transport properties in multichannel systems . Ph.D. thesis, La Sapienza, 2012.[88] Anisimov, V. I., Aryasetiawan, F., and Lichtenstein, A. I.
First-principles calculations of the electronicstructure and spectra of strongly correlated systems: The LDA + U method . J. Phys. Condens. Matter,9(4):767, 1997.[89] Scully, M. O. and Zubairy, M. S.
Quantum optics . Cambridge University Press, 1999.[90] Frasca, M.
A modern review of the two-level approximation . Ann. Phys. (N. Y)., 306(2):193, 2003.[91] Saurabh, P. and Mukamel, S.
Two-dimensional infrared spectroscopy of vibrational polaritons ofmolecules in an optical cavity . J. Chem. Phys., 144(12):124115, 2016.[92] Luk, H. L., Feist, J., Toppari, J. J., and Groenhof, G.
Multiscale molecular dynamics simulations ofpolaritonic chemistry . Journal of chemical theory and computation, 13(9):4324, 2017.[93] Vendrell, O.
Collective jahn-teller interactions through light-matter coupling in a cavity . Physical re-view letters, 121(25):253001, 2018.[94] Zhang, Y., Nelson, T., and Tretiak, S.
Non-adiabatic molecular dynamics of molecules in the presence ofstrong light-matter interactions . J. Chem. Phys., 151(15):154109, 2019.[95] Groenhof, G., Climent, C., Feist, J., Morozov, D., and Toppari, J. J.
Tracking Polariton Relaxation withMultiscale Molecular Dynamics Simulations . J. Phys. Chem. Lett., 10(18):5476, 2019.[96] Bennett, K., Kowalewski, M., and Mukamel, S.
Novel photochemistry of molecular polaritons in opticalcavities . Faraday Discuss., 194(0):259, 2016.[97] Rabi, I. I.
On the process of space quantization . Phys. Rev., 49(4):324, 1936.[98] Braak, D.
Integrability of the Rabi model . Phys. Rev. Lett., 107(10):2, 2011.[99] Schäfer, C., Ruggenthaler, M., Rokaj, V., and Rubio, A.
Relevance of the Quadratic Diamagnetic andSelf-Polarization Terms in Cavity Quantum Electrodynamics . ACS Photonics, 7(4):975, 2020.[100] Imamoglu, A., Schmidt, H., Woods, G., and Deutsch, M.
Strongly interacting photons in a nonlinearcavity . Phys. Rev. Lett., 79(8):1467, 1997.260IBLIOGRAPHY[101] Moore, J. W. and Pearson, R. G.
Kinetics and mechanism . John Wiley & Sons, 1981.[102] Rokaj, V., Welakuh, D. M., Ruggenthaler, M., and Rubio, A.
Light–matter interaction in the long-wavelength limit: no ground-state without dipole self-energy . Journal of Physics B: Atomic, Molecularand Optical Physics, 51(3):034005, 2018.[103] Di Stefano, O., Settineri, A., Macrì, V., Garziano, L., Stassi, R., Savasta, S., and Nori, F.
Resolution ofgauge ambiguities in ultrastrong-coupling cavity quantum electrodynamics . Nat. Phys., 2019.[104] Andolina, G. M., Pellegrino, F. M. D., Giovannetti, V., MacDonald, A. H., and Polini, M.
Cavity quan-tum electrodynamics of strongly correlated electron systems: A no-go theorem for photon condensation .Phys. Rev. B, 100(12):121109, 2019.[105] Greiner, W.
Relativistic Quantum Mechanics . Springer Berlin Heidelberg, Berlin, Heidelberg, 1990.[106] Dirac, P. A. M.
Quantum mechanics of many-electron systems . Proc. R. Soc. London. Ser. A, Contain.Pap. a Math. Phys. Character, 123(792):714, 1929.[107] Kaku, M.
Quantum field theory: a modern introduction . Oxford Univ. Press, 1993.[108] Aoyama, T., Hayakawa, M., Kinoshita, T., and Nio, M.
Tenth-order QED contribution to the electron g-2and an improved value of the fine structure constant . Phys. Rev. Lett., 109(11):111807, 2012.[109] Henson, B. M.
The First Measurement of the → P − P Tune-Out Wavelength in He ∗ . Ph.D. thesis,Australian National University, 2017.[110] Berestetskii, V., Lifshitz, E., and Pitaevskii, L. Quantum Electrodynamics: Volume 4 . Course of theoret-ical physics. Elsevier Science, 1982.[111] Lienert, M., Petrat, S., and Tumulka, R.
Multi-time wave functions . In
J. Phys. Conf. Ser. , vol. 880, 12006.Institute of Physics Publishing, 2017.[112] Białynicki-Birula, I.
Triumphs and failures of quantum electrodynamics . In
Acta Phys Pol. B , vol. 27,2403–2408. 1996.[113] Białynicki-Birula, I. and Białynicka-Birula, Z.
Quantum electrodynamics , vol. 70. Elsevier, 1975.[114] Tachibana, A.
New Aspects of Quantum Electrodynamics . Springer Singapore, Singapore, 2017.[115] Wagner, R. E., Ware, M. R., Shields, B. T., Su, Q., and Grobe, R.
Space-time resolved approach for inter-acting quantum field theories . Phys. Rev. Lett., 106(2):1, 2011.[116] Wagner, R. E., Su, Q., and Grobe, R.
Computational renormalization scheme for quantum field theories .Phys. Rev. A - At. Mol. Opt. Phys., 88(1):1, 2013.[117] Barut, A. O. (editor).
Quantum electrodynamics and quantum optics , vol. 110. Plenum Press, NewYork, 1984.[118] Takaesu, T.
On the spectral analysis of quantum electrodynamics with spatial cutoffs. i.
Journal ofMathematical Physics, 50(6):062302, 2009.[119] Le Bellac, M. and Lévy-Leblond, J. M.
Galilean electromagnetism . Nuovo Cim., 14(2):217, 1973. 261IBLIOGRAPHY[120] Van Vleck, J. H.
The Correspondence Principle in the Statistical Interpretation of Quantum Mechanics .Proc. Natl. Acad. Sci., 14(2):178, 1928.[121] Baez, J. C.
Struggles with the Continuum . arXiv:1609.01421, 2016.[122] Anderson, P. W.
More Is Different . Science, 177(4047):347, 1972.[123] Worth, G. A. and Cederbaum, L. S.
BEYOND BORN-OPPENHEIMER: Molecular Dynamics Through a Conical Intersection. Annu. Rev. Phys. Chem., 55(1):127, 2004. [124] Rokaj, V., Ruggenthaler, M., Eich, F. G., and Rubio, A.
The Free Electron Gas in Cavity Quantum Electrodynamics. preprint arXiv:2006.09236, 2020. [125] Schäfer, C., Ruggenthaler, M., and Rubio, A.
Ab initio nonrelativistic quantum electrodynamics: Bridging quantum chemistry and quantum optics from weak to strong coupling. Physical Review A, 98(4):043801, 2018. [126] Kirilyuk, A. P.
Universal Concept of Complexity by the Dynamic Redundance Paradigm: Causal Randomness, Complete Wave Mechanics, and the Ultimate Unification of Knowledge. Naukova Dumka, 1997. [127] Power, E. A. and Thirunamachandran, T.
Quantum electrodynamics in a cavity. Phys. Rev. A, 25(5):2473, 1982. [128] Flick, J., Appel, H., Ruggenthaler, M., and Rubio, A.
Cavity Born-Oppenheimer Approximation for Correlated Electron-Nuclear-Photon Systems. J. Chem. Theory Comput., 13(4):1616, 2017. [129] Albareda, G., Kelly, A., and Rubio, A.
Nonadiabatic quantum dynamics without potential energy surfaces. Phys. Rev. Mater., 3(2):023803, 2019. [130] Beck, M.
The multiconfiguration time-dependent Hartree (MCTDH) method: a highly efficient algorithm for propagating wavepackets. Phys. Rep., 324(1):1, 2000. [131] Bartlett, R. J. and Musiał, M.
Coupled-cluster theory in quantum chemistry. Rev. Mod. Phys., 79(1):291, 2007. [132] Helgaker, T., Jørgensen, P., and Olsen, J.
Molecular Electronic-Structure Theory. John Wiley & Sons, Ltd, Chichester, UK, 2000. [133] Chan, G. K.-L. and Sharma, S.
The Density Matrix Renormalization Group in Quantum Chemistry. Annu. Rev. Phys. Chem., 62(1):465, 2011. [134] Kanungo, B. and Gavini, V.
Large-scale all-electron density functional theory calculations using an enriched finite-element basis. Phys. Rev. B, 95(3):035112, 2017. [135] de Oliveira, C. R.
Intermediate Spectral Theory and Quantum Dynamics, vol. 54. Birkhäuser Basel, Basel, 2009. [136] Coleman, A. J.
Structure of Fermion Density Matrices. Rev. Mod. Phys., 35:668, 1963. [137] Altland, A. and Simons, B. D.
Condensed Matter Field Theory. Cambridge University Press, 2010. [138] Beck, T. L.
Real-space mesh techniques in density-functional theory. Rev. Mod. Phys., 72(4):1041, 2000. [139] Motta, M., Ceperley, D. M., Chan, G. K. L., Gomez, J. A., Gull, E., Guo, S., Jiménez-Hoyos, C. A., Lan, T. N., Li, J., Ma, F., Millis, A. J., Prokof’ev, N. V., Ray, U., Scuseria, G. E., Sorella, S., Stoudenmire, E. M., Sun, Q., Tupitsyn, I. S., White, S. R., Zgid, D., and Zhang, S.
Towards the solution of the many-electron problem in real materials: Equation of state of the hydrogen chain with state-of-the-art many-body methods. Phys. Rev. X, 7(3):031059, 2017. [140] Pauli, W.
Die allgemeinen Prinzipien der Wellenmechanik. In Bethe, H., Hund, F., Mott, N. F., Pauli, W., Rubinowicz, A., Wentzel, G., and Smekal, A. (editors), Quantentheorie, chap. 2, 83–271. Springer Berlin Heidelberg, 1933. [141] Holland, P.
The Quantum Theory of Motion: An Account of the de Broglie-Bohm Causal Interpretation of Quantum Mechanics. Cambridge University Press, 1995. [142] Gross, E., Runge, E., and Heinonen, O.
Many-particle Theory. Adam Hilger, 1991. [143] Dirac, P. A. M.
On the Theory of Quantum Mechanics. Proc. R. Soc. A Math. Phys. Eng. Sci., 112(762):661, 1926. [144] Stefanucci, G. and Van Leeuwen, R.
Nonequilibrium many-body theory of quantum systems: a modern introduction. Cambridge University Press, 2013. [145] Bose, S. N.
Plancks Gesetz und Lichtquantenhypothese. Zeitschrift für Phys., 26(1):178, 1924. [146] Einstein, A.
Quantentheorie des idealen einatomigen Gases, zweite Abhandlung. Sitzungsberichte der Preußischen Akademie der Wissenschaften, Physikalisch-Mathematische Klasse, Berlin, 3–14, 1925. [147] Fermi, E.
Statistical method to determine some properties of atoms. Rend. Accad. Naz. Lincei, 6:620, 1927. Translated by Giovanni Gallavotti, May 2011. [148] Heisenberg, W.
Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen. Zeitschrift für Phys., 33(1):879, 1925. [149] Schrödinger, E.
An undulatory theory of the mechanics of atoms and molecules. Phys. Rev., 28(6):1049, 1926. [150] Anderson, M. H., Ensher, J. R., Matthews, M. R., Wieman, C. E., and Cornell, E. A.
Observation of Bose-Einstein condensation in a dilute atomic vapor. Science, 269(5221):198, 1995. [151] Pauli, W.
Über den Zusammenhang des Abschlusses der Elektronengruppen im Atom mit der Komplexstruktur der Spektren. Zeitschrift für Phys., 31(1):765, 1925. [152] Lieb, E. H.
The stability of matter: From atoms to stars: Fourth edition. Springer Berlin Heidelberg, 2005. [153] Gavroglu, K. and Simões, A.
Neither Physics nor Chemistry - A history of Quantum Chemistry. MIT Press, 2012. [154] Mendelejew, D.
Über die Beziehungen der Eigenschaften zu den Atomgewichten der Elemente. Zeitschrift für Chemie, 12(405-406):173, 1869. [155] Hund, F.
Atomtheoretische Deutung des Magnetismus der seltenen Erden. Zeitschrift für Phys., 33(1):855, 1925. [156] Mulliken, R. S.
The interpretation of band spectra part III. Electron quantum numbers and states of molecules and their atoms. Rev. Mod. Phys., 4(1):1, 1932. [157] Giesbertz, K. J. and Baerends, E. J.
Aufbau derived from a unified treatment of occupation numbers in Hartree-Fock, Kohn-Sham, and natural orbital theories with the Karush-Kuhn-Tucker conditions for the inequality constraints n_i ≤ 1 and n_i ≥ 0. J. Chem. Phys., 132(19), 2010. [158] Piris, M. and Ugalde, J. M.
Iterative diagonalization for orbital optimization in natural orbital functional theory. J. Comput. Chem., 30(13):2078, 2009. [159] Rao, C. N., Natarajan, S., Choudhury, A., Neeraj, S., and Ayi, A. A.
Aufbau principle of complex open-framework structures of metal phosphates with different dimensionalities. Acc. Chem. Res., 34(1):80, 2001. [160] Szabo, A. and Ostlund, N.
Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. Dover Books on Chemistry. Dover Publications, 2012. [161] Frank, R. L., Lieb, E. H., Seiringer, R., and Siedentop, H.
Müller’s exchange-correlation energy in density-matrix-functional theory. Phys. Rev. A, 76(5):1, 2007. [162] Medvedev, M. G., Bushmarinov, I. S., Sun, J., Perdew, J. P., and Lyssenko, K. A.
Density functional theory is straying from the path toward the exact functional. Science, 355(6320):49, 2017. [163] Tokatly, I. V.
Quantum many-body dynamics in a Lagrangian frame: I. Equations of motion and conservation laws. Phys. Rev. B - Condens. Matter Mater. Phys., 71(16), 2005. [164] Tokatly, I. V.
Quantum many-body dynamics in a Lagrangian frame: II. Geometric formulation of time-dependent density functional theory. Phys. Rev. B - Condens. Matter Mater. Phys., 71(16), 2005. [165] Hohenberg, P. and Kohn, W.
Inhomogeneous electron gas. Phys. Rev., 136(3b), 1964. [166] Penz, M., Laestadius, A., Tellgren, E. I., and Ruggenthaler, M.
Guaranteed Convergence of a Regularized Kohn-Sham Iteration in Finite Dimensions. Phys. Rev. Lett., 123(3), 2019. [167] Vignale, G. and Rasolt, M.
Density-functional theory in strong magnetic fields. Phys. Rev. Lett., 59(20):2360, 1987. [168] Levy, M.
Universal variational functionals of electron densities, first-order density matrices, and natural spin-orbitals and solution of the v-representability problem. Proc. Natl. Acad. Sci. U. S. A., 76(12):6062, 1979. [169] Lieb, E. H.
Density functionals for Coulomb systems. Int. J. Quantum Chem., 24(3):243, 1983. [170] Sham, L. J. and Kohn, W.
One-particle properties of an inhomogeneous interacting electron gas. Phys. Rev., 145(2):561, 1966. [171] Levy, M.
Electron densities in search of Hamiltonians. Phys. Rev. A, 26(3):1200, 1982. [172] Dreizler, R. M. and Gross, E. K. U.
Density Functional Theory. Springer Berlin Heidelberg, Berlin, Heidelberg, 1990. [173] Tchenkoue, M. L. M., Penz, M., Theophilou, I., Ruggenthaler, M., and Rubio, A.
Force balance approach for advanced approximations in density functional theories. J. Chem. Phys., 151(15), 2019. [174] Perdew, J. P. and Kurth, S.
Density Functionals for Non-relativistic Coulomb Systems in the New Century. In A Prim. Density Funct. Theory, 1–55. Springer, Berlin, Heidelberg, 2003. [175] Parr, R. G.
Density functional theory of atoms and molecules. Springer, 1980. [176] Giuliani, G. and Vignale, G.
Quantum theory of the electron liquid. Cambridge University Press, 2005. [177] Pople, J. A.
Nobel Lecture: Quantum chemical models. Rev. Mod. Phys., 71(5):1267, 1999. [178] Perdew, J. P.
Accurate Density Functional for the Energy: Real-Space Cutoff of the Gradient Expansion for the Exchange Hole. Phys. Rev. Lett., 55(16):1665, 1985. [179] Jones, R. O.
Density functional theory: Its origins, rise to prominence, and future. Rev. Mod. Phys., 87(3), 2015. [180] Burke, K.
Perspective on density functional theory. J. Chem. Phys., 136(15):150901, 2012. [181] Ruggenthaler, M. and Bauer, D.
Local Hartree-exchange and correlation potential defined by local force equations. Phys. Rev. A - At. Mol. Opt. Phys., 80(5):1, 2009. [182] Perdew, J. P., Burke, K., and Ernzerhof, M.
Generalized gradient approximation made simple. Phys. Rev. Lett., 77(18):3865, 1996. [183] Lieb, E. H.
Thomas-Fermi and related theories of atoms and molecules. Rev. Mod. Phys., 53(4):603, 1981. [184] Karasiev, V. V. and Trickey, S. B.
Frank Discussion of the Status of Ground-State Orbital-Free DFT. In Adv. Quantum Chem., vol. 71, 221–245. Academic Press Inc., 2015. [185] Georges, A., Kotliar, G., Krauth, W., and Rozenberg, M. J.
Dynamical mean-field theory of strongly correlated fermion systems and the limit of infinite dimensions. 1996. [186] Vollhardt, D., Byczuk, K., and Kollar, M.
Dynamical mean-field theory. In Avella, A. and Mancini, F. (editors), Strongly Correlated Systems: Theoretical Methods, 203–236. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012. [187] Imada, M., Fujimori, A., and Tokura, Y.
Metal-insulator transitions. Rev. Mod. Phys., 70(4):1039, 1998. [188] Scuseria, G. E., Jiménez-Hoyos, C. A., Henderson, T. M., Samanta, K., and Ellis, J. K.
Projected quasi-particle theory for molecular electronic structure. J. Chem. Phys., 135(12):124108, 2011. [189] Mok, D. K., Neumann, R., and Handy, N. C.
Dynamical and nondynamical correlation. J. Phys. Chem., 100(15):6225, 1996. [190] Dimitrov, T., Appel, H., Fuks, J. I., and Rubio, A.
Exact maps in density functional theory for lattice models. New J. Phys., 18(8):083004, 2016. [191] Seidl, M., Perdew, J. P., and Levy, M.
Strictly correlated electrons in density-functional theory. Phys. Rev. A - At. Mol. Opt. Phys., 59(1):51, 1999. [192] Gori-Giorgi, P., Seidl, M., and Vignale, G.
Density-functional theory for strongly interacting electrons. Phys. Rev. Lett., 103(16), 2009. [193] Pernal, K. and Giesbertz, K. J. H.
Reduced density matrix functional theory (RDMFT) and linear response time-dependent RDMFT (TD-RDMFT). In Ferré, N., Filatov, M., and Huix-Rotllant, M. (editors), Density-Functional Methods for Excited States, 125–183. Springer International Publishing, Cham, 2016. [194] Bonitz, M.
Quantum Kinetic Theory. Springer International Publishing, 2 ed., 2016. [195] Coleman, A. J. and Yukalov, V. I.
Reduced density matrices: Coulson’s challenge, vol. 72. Springer Science & Business Media, 2000. [196] Löwdin, P.-O.
Quantum Theory of Many-Particle Systems. I. Physical Interpretations by Means of Density Matrices, Natural Spin-Orbitals, and Convergence Problems in the Method of Configurational Interaction. Phys. Rev., 97(6):1474, 1955. [197] Coulson, C.
Present State of Molecular Structure Calculations. Rev. Mod. Phys., 32(2):170, 1960. [198] National Research Council.
Mathematical Challenges from Theoretical/Computational Chemistry. National Academies Press, Washington, D.C., 1995. [199] Gilbert, T. L.
Hohenberg-Kohn theorem for nonlocal external potentials. Phys. Rev. B, 12(6), 1975. [200] Giesbertz, K. J. H. and Ruggenthaler, M.
One-body reduced density-matrix functional theory in finite basis sets at elevated temperatures. Phys. Rep., 806:1, 2019. [201] Benavides-Riveros, C. L., Wolff, J., Marques, M. A., and Schilling, C.
Reduced Density Matrix Functional Theory for Bosons. Phys. Rev. Lett., 124(18), 2020. [202] Ayers, P. W. and Levy, M.
Generalized density-functional theory: Conquering the N-representability problem with exact functional for the electron pair density and the second-order reduced density matrix. J. Chem. Sci., 117(5):507, 2005. [203] Verdozzi, C.
Time-dependent density-functional theory and strongly correlated systems: Insight from numerical studies. Phys. Rev. Lett., 101(16), 2008. [204] Klyachko, A. A.
Quantum marginal problem and n-representability. Journal of Physics: Conference Series, 36(1):72, 2006. [205] Altunbulak, M. and Klyachko, A.
The Pauli Principle Revisited. Commun. Math. Phys., 282(2):287, 2008. [206] Theophilou, I., Lathiotakis, N. N., Marques, M. A., and Helbig, N.
Generalized Pauli constraints in reduced density matrix functional theory. The Journal of Chemical Physics, 142(15):154108, 2015. [207] Schilling, C., Gross, D., and Christandl, M.
Pinning of fermionic occupation numbers. Phys. Rev. Lett., 110(4):1, 2013. [208] Schilling, C. and Schilling, R.
Diverging Exchange Force and Form of the Exact Density Matrix Functional. Phys. Rev. Lett., 122(1):013001, 2019. [209] Blum, K.
Density Matrix Theory and Applications, vol. 64 of Springer Series on Atomic, Optical, and Plasma Physics. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012. [210] Coleman, A. J.
The convex structure of electrons. Int. J. Quantum Chem., 11(6):907, 1977. [211] Giesbertz, K.
Time-Dependent One-Body Reduced Density Matrix Functional Theory. Ph.D. thesis, Vrije Universiteit Amsterdam, 2010. [212] Surján, P. R., Szabados, Á., Jeszenszki, P., and Zoboki, T.
Strongly orthogonal geminals: Size-extensive and variational reference states. J. Math. Chem., 50(3):534, 2012. [213] Kummer, H. n-Representability Problem for Reduced Density Matrices. J. Math. Phys., 8(10):2063, 1967. [214] Mazziotti, D. A.
Structure of fermionic density matrices: Complete n-representability conditions. Phys. Rev. Lett., 108:263002, 2012. [215] Mazziotti, D. A.
Significant conditions for the two-electron reduced density matrix from the constructive solution of N representability. Phys. Rev. A, 85(6):062507, 2012. [216] Lackner, F., Březinová, I., Sato, T., Ishikawa, K. L., and Burgdörfer, J.
Propagating two-particle reduced density matrices without wave functions. Phys. Rev. A - At. Mol. Opt. Phys., 91(2):023412, 2015. [217] Fosso-Tande, J., Nguyen, T. S., Gidofalvi, G., and Deprince, A. E.
Large-Scale Variational Two-Electron Reduced-Density-Matrix-Driven Complete Active Space Self-Consistent Field Methods. J. Chem. Theory Comput., 12(5):2260, 2016. [218] Schwerdtfeger, P.
The pseudopotential approximation in electronic structure theory. ChemPhysChem, 12(17):3143, 2011. [219] Pernal, K.
Effective potential for natural spin orbitals. Phys. Rev. Lett., 94(23):1, 2005. [220] Chong, D. P., Gritsenko, O. V., and Baerends, E. J.
Interpretation of the Kohn–Sham orbital energies as approximate vertical ionization potentials. J. Chem. Phys., 116(5):1760, 2002. [221] Lathiotakis, N. N., Helbig, N., Rubio, A., and Gidopoulos, N. I.
Local reduced-density-matrix-functional theory: Incorporating static correlation effects in Kohn-Sham equations. Phys. Rev. A - At. Mol. Opt. Phys., 90(3), 2014. [222] Gritsenko, O., Pernal, K., and Baerends, E. J.
An improved density matrix functional by physically motivated repulsive corrections. J. Chem. Phys., 122(20), 2005. [223] Sharma, S., Dewhurst, J. K., Shallcross, S., and Gross, E. K. U.
Spectral density and metal-insulator phase transition in Mott insulators within reduced density matrix functional theory. Phys. Rev. Lett., 110(11):1, 2013. [224] Lieb, E. H.
Variational principle for many-fermion systems. Physical Review Letters, 46(7):457, 1981. [225] Helbig, N., Fuks, J. I., Casula, M., Verstraete, M. J., Marques, M. A., Tokatly, I. V., and Rubio, A.
Density functional theory beyond the linear regime: Validating an adiabatic local density approximation. Phys. Rev. A - At. Mol. Opt. Phys., 83(3):1, 2011. [226] Löwdin, P. O. and Shull, H.
Natural orbitals in the quantum theory of two-electron systems. Phys. Rev., 101(6):1730, 1956. [227] Müller, A. M. K.
Explicit approximate relation between reduced two- and one-particle density matrices. Phys. Lett., 105(9):446, 1984. [228] Buijse, M. A. and Baerends, E. J.
An approximate exchange-correlation hole density as a functional of the natural orbitals. Mol. Phys., 100(4):401, 2002. [229] Goedecker, S. and Umrigar, C. J.
Natural orbital functional for the many-electron problem. Phys. Rev. Lett., 81(4):866, 1998. [230] Mordovina, U., Reinhard, T. E., Theophilou, I., Appel, H., and Rubio, A.
Self-Consistent Density-Functional Embedding: A Novel Approach for Density-Functional Approximations. J. Chem. Theory Comput., 15(10):5209, 2019. [231] Pernal, K., Gritsenko, O. V., and Van Meer, R.
Reproducing benchmark potential energy curves of molecular bond dissociation with small complete active space aided with density and density-matrix functional corrections. J. Chem. Phys., 151(16):164122, 2019. [232] Piris, M.
Global Method for Electron Correlation. Phys. Rev. Lett., 119(6):1, 2017. [233] Flick, J., Welakuh, D. M., Ruggenthaler, M., Appel, H., and Rubio, A.
Light-Matter Response in Nonrelativistic Quantum Electrodynamics. ACS Photonics, 6(11):2757, 2019. [234] Li, C.
Nonlinear Optics. Springer Singapore, Singapore, 2017. [235] Sidler, D., Ruggenthaler, M., Appel, H., and Rubio, A.
Chemistry in Quantum Cavities: Exact Results, the Impact of Thermal Velocities and Modified Dissociation. arXiv preprint, 2020. [236] Lorin, E., Chelkowski, S., and Bandrauk, A.
A numerical Maxwell-Schrödinger model for intense laser-matter interaction and propagation. Comput. Phys. Commun., 177(12):908, 2007. [237] Yabana, K., Sugiyama, T., Shinohara, Y., Otobe, T., and Bertsch, G. F.
Time-dependent density functional theory for strong electromagnetic fields in crystalline solids. Phys. Rev. B - Condens. Matter Mater. Phys., 85(4):045134, 2012. [238] Floss, I., Lemell, C., Wachter, G., Smejkal, V., Sato, S. A., Tong, X. M., Yabana, K., and Burgdörfer, J.
Ab initio multiscale simulation of high-order harmonic generation in solids. Phys. Rev. A, 97(1):011401, 2018. [239] Sommer, A., Bothschafter, E. M., Sato, S. A., Jakubeit, C., Latka, T., Razskazovskaya, O., Fattahi, H., Jobst, M., Schweinberger, W., Shirvanyan, V., Yakovlev, V. S., Kienberger, R., Yabana, K., Karpowicz, N., Schultze, M., and Krausz, F.
Attosecond nonlinear polarization and light-matter energy transfer in solids. Nature, 534(7605):86, 2016. [240] Freitag, E. and Busam, R.
Funktionentheorie 1. Springer-Lehrbuch. Springer-Verlag, Berlin/Heidelberg, 2006. [241] Herrera, F. and Owrutsky, J.
Molecular polaritons for controlling chemistry with quantum optics. J. Chem. Phys., 152(10):100902, 2020. [242] Ruggenthaler, M.
Ground-State Quantum-Electrodynamical Density-Functional Theory. preprint arXiv:1509, 1–6, 2015. [243] Runge, E. and Gross, E. K. U.
Density-Functional Theory for Time-Dependent Systems. Phys. Rev. Lett., 52(12):997, 1984. [244] van Leeuwen, R.
Mapping from Densities to Potentials in Time-Dependent Density-Functional Theory. Phys. Rev. Lett., 82(19):3863, 1999. [245] Ruggenthaler, M., Mackenroth, F., and Bauer, D.
Time-dependent Kohn-Sham approach to quantum electrodynamics. Physical Review A, 84(4):042107, 2011. [246] Tokatly, I. V.
Time-dependent density functional theory for many-electron systems interacting with cavity photons. Phys. Rev. Lett., 110(23):1, 2013. [247] Flick, J., Schäfer, C., Ruggenthaler, M., Appel, H., and Rubio, A.
Ab Initio Optimized Effective Potentials for Real Molecules in Optical Cavities: Photon Contributions to the Molecular Ground State. ACS Photonics, 5(3):992, 2018. [248] Flick, J. and Narang, P.
Cavity-Correlated Electron-Nuclear Dynamics from First Principles. Phys. Rev. Lett., 121(11), 2018. [249] Flick, J., Rivera, N., and Narang, P.
Strong light-matter coupling in quantum chemistry and quantum photonics. Nanophotonics, 7(9):1479, 2018. [250] Wang, D. S., Neuman, T., Flick, J., and Narang, P.
Weak-to-Strong Light-Matter Coupling and Dissipative Dynamics from First Principles. preprint arXiv:2002.10461, 2020. [251] Pellegrini, C., Flick, J., Tokatly, I. V., Appel, H., and Rubio, A.
Optimized Effective Potential for Quantum Electrodynamical Time-Dependent Density Functional Theory. Phys. Rev. Lett., 115(9):1, 2015. [252] Krieger, J. B., Li, Y., and Iafrate, G. J.
Construction and application of an accurate local spin-polarized Kohn-Sham potential with integer discontinuity: Exchange-only theory. Phys. Rev. A, 45(1):101, 1992. [253] Kim, Y.-H., Städele, M., and Martin, R. M.
Density-functional study of small molecules within the Krieger-Li-Iafrate approximation. Phys. Rev. A, 60(5):3633, 1999. [254] van Leeuwen, R. and Stefanucci, G.
Nonequilibrium Many-Body Theory of Quantum Systems. Cambridge University Press, 2013. [255] Sipe, J. E.
Photon wave functions. Phys. Rev. A, 52(3):1875, 1995. [256] Giesbertz, K. J. and Van Leeuwen, R.
Natural occupation numbers: When do they vanish? J. Chem. Phys., 139(10):104109, 2013. [257] Watts, A.
Eastern wisdom and modern life. 1960. KQED public television series, San Francisco, transcription accessible at , accessed 17.09.2020. [258] Nielsen, S. E. B., Schäfer, C., Ruggenthaler, M., and Rubio, A.
Dressed-Orbital Approach to Cavity Quantum Electrodynamics and Beyond. preprint arXiv:1812.00388, 2018. [259] Lee, A. M. and Handy, N. C.
Dissociation of hydrogen and nitrogen molecules studied using density functional theory. J. Chem. Soc., Faraday Trans., 89:3999, 1993. [260] Vuckovic, S., Wagner, L. O., Mirtschink, A., and Gori-Giorgi, P.
Hydrogen Molecule Dissociation Curve with Functionals Based on the Strictly Correlated Regime. J. Chem. Theory Comput., 11(7):3153, 2015. [261] Mordovina, U., Bungey, C., Appel, H., Knowles, P. J., Rubio, A., and Manby, F. R.
Polaritonic coupled-cluster theory. Phys. Rev. Res., 2(2):023262, 2020. [262] Watson, J. K.
Simplification of the molecular vibration-rotation Hamiltonian. Mol. Phys., 15(5):479, 1968. [263] de Pillis, J. E.
Linear transformations which preserve Hermitian and positive semidefinite operators. Pacific Journal of Mathematics, 23(1):129, 1967. [264] Curtiss, L. A., Raghavachari, K., Redfern, P. C., and Pople, J. A.
Assessment of Gaussian-2 and density functional theories for the computation of enthalpies of formation. J. Chem. Phys., 106(3):1063, 1997. [265] Curtiss, L. A., Raghavachari, K., Redfern, P. C., and Pople, J. A.
Assessment of Gaussian-3 and density functional theories for a larger experimental test set. J. Chem. Phys., 112(17):7374, 2000. [266] Nocedal, J. and Wright, S.
Numerical optimization, series in operations research and financial engineering. Springer, 2006. [267] Payne, M. C., Teter, M. P., Allan, D. C., Arias, T. A., and Joannopoulos, J. D.
Iterative minimization techniques for ab initio total-energy calculations: Molecular dynamics and conjugate gradients. Rev. Mod. Phys., 64(4):1045, 1992. [268] Dzsotjan, D., Kästel, J., and Fleischhauer, M.
Dipole-dipole shift of quantum emitters coupled to surface plasmons of a nanowire. Phys. Rev. B - Condens. Matter Mater. Phys., 84(7), 2011. [269] Grießer, T., Vukics, A., and Domokos, P.
Depolarization shift of the superradiant phase transition. Phys. Rev. A, 94(3):33815, 2016. [270] Plankensteiner, D., Sommer, C., Ritsch, H., and Genes, C.
Cavity Antiresonance Spectroscopy of Dipole Coupled Subradiant Arrays. Phys. Rev. Lett., 119(9), 2017. [271] Andrade, X., Strubbe, D. A., De Giovannini, U., Larsen, A. H., Oliveira, M. J. T., Alberdi-Rodriguez, J., Varas, A., Theophilou, I., Helbig, N., Verstraete, M., Stella, L., Nogueira, F., Aspuru-Guzik, A., Castro, A., Marques, M. A. L., and Rubio, Á.
Real-space grids and the Octopus code as tools for the development of new simulation approaches for electronic systems. Phys. Chem. Chem. Phys., 17:31371, 2015. [272] Loudon, R.
One-Dimensional Hydrogen Atom. Am. J. Phys., 27(9):649, 1959. [273] Gebremedhin, D. H. and Weatherford, C. A.
Calculations for the one-dimensional soft Coulomb problem and the hard Coulomb limit. Phys. Rev. E - Stat. Nonlinear, Soft Matter Phys., 89(5), 2014. [274] Ruggenthaler, M. and Bauer, D.
Rabi oscillations and few-level approximations in time-dependent density functional theory. Phys. Rev. Lett., 102(23):2, 2009. [275] Fuks, J. I., Helbig, N., Tokatly, I. V., and Rubio, A.
Nonlinear phenomena in time-dependent density-functional theory: What Rabi oscillations can teach us. Phys. Rev. B, 84(7), 2011. [276] Lin, L.
Adaptively Compressed Exchange Operator. J. Chem. Theory Comput., 12(5):2242, 2016. [277] Perera, L. C., Raymond, O., Henderson, W., Brothers, P. J., and Plieger, P. G.
Advances in beryllium coordination chemistry, vol. 352. Elsevier B.V., 2017. [278] Bunge, C. F.
Accurate wavefunction for atomic beryllium. At. Data Nucl. Data Tables, 18(3):293, 1976. [279] Fletcher, R.
Practical Methods of Optimization. John Wiley & Sons, Ltd, 1987. [280] Zobel, J.
Writing for Computer Science. Springer London, 2014. [281] Dunning, T. H.
Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys., 90(2):1007, 1989. [282] Marques, M. A., Castro, A., Bertsch, G. F., and Rubio, A.
Octopus: A first-principles tool for excited electron-ion dynamics. Comput. Phys. Commun., 151(1):60, 2003. [283] Fletcher, R. and Reeves, C. M.
Function minimization by conjugate gradients. Comput. J., 7(2):149, 1964. [284] Polak, E. and Ribiere, G.
Note sur la convergence de méthodes de directions conjuguées. Rev. française d’informatique Rech. opérationnelle. Série rouge, 3(16):35, 1969. [285] Morrell, M. M., Parr, R. G., and Levy, M.
Calculation of ionization potentials from density matrices and natural functions, and the long-range behavior of natural orbitals and electron density. J. Chem. Phys., 62(2):549, 1975. [286] Conn, A. R., Gould, N. I. M., and Toint, P. L.
Lancelot, vol. 17 of Springer Series in Computational Mathematics. Springer Berlin Heidelberg, Berlin, Heidelberg, 1992. [287] Conn, A. R., Gould, N. I. M., and Toint, P. L.
Trust Region Methods. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, 2000. [288] Vandenberghe, L. and Boyd, S.
Semidefinite programming. SIAM Rev., 38(1):49, 1996. [289] Rokaj, V., Penz, M., Sentef, M. A., Ruggenthaler, M., and Rubio, A.
Quantum Electrodynamical Bloch Theory with Homogeneous Magnetic Fields. arXiv Prepr., 1–6, 2019. [290] Landau, L. D. and Lifshitz, E. M.
Quantum mechanics: non-relativistic theory, vol. 3. Elsevier, 2013. [291] Klitzing, K. V., Dorda, G., and Pepper, M.
New Method for High-Accuracy Determination of the Fine-Structure Constant Based on Quantized Hall Resistance. Phys. Rev. Lett., 45(6):494, 1980. [292] Kohmoto, M.
Topological invariant and the quantization of the Hall conductance. Ann. Phys. (N. Y.), 160(2):343, 1985. [293] Laughlin, R. B.
Anomalous Quantum Hall Effect: An Incompressible Quantum Fluid with Fractionally Charged Excitations. Phys. Rev. Lett., 50(18):1395, 1983. [294] Hübener, H., Sentef, M. A., De Giovannini, U., Kemper, A. F., and Rubio, A.
Creating stable Floquet–Weyl semimetals by laser-driving of 3D Dirac materials. Nat. Commun., 8(1):13940, 2017. [295] Sentef, M. A., Ruggenthaler, M., and Rubio, A.
Cavity quantum-electrodynamical polaritonically enhanced electron-phonon coupling and its influence on superconductivity. Sci. Adv., 4(11), 2018. [296] Ojambati, O. S., Chikkaraddy, R., Deacon, W. D., Horton, M., Kos, D., Turek, V. A., Keyser, U. F., and Baumberg, J. J.
Quantum electrodynamics at room temperature coupling a single vibrating molecule with a plasmonic nanocavity. Nat. Commun., 10(1):1, 2019. [297] Lanatà, N.
Local bottom-up effective theory of non-local electronic interactions. preprint arXiv:2004.05384, 2020. [298] Gersten, A.
Maxwell’s Equations as the One-Photon Quantum Equation. Foundations of Physics Letters, 12(3):291, 1999. [299] Lanczos, C.
An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Res. Nat. Bur. Stand., 45:255, 1950. [300] Feit, M. D., Fleck Jr., J. A., and Steiger, A.