Adjoint methods for stellarator shape optimization and sensitivity analysis
ABSTRACT
Title of dissertation:
ADJOINT METHODS FOR STELLARATOR SHAPE OPTIMIZATION AND SENSITIVITY ANALYSIS
Elizabeth Joy Paul
Doctor of Philosophy, 2020
Dissertation directed by:
Professor William Dorland
Department of Physics
Stellarators are a class of device for the magnetic confinement of plasmas without toroidal symmetry. As the confining magnetic field is produced by clever shaping of external electro-magnetic coils rather than through internal plasma currents, stellarators enjoy enhanced stability properties over their two-dimensional counterpart, the tokamak. However, the design of a stellarator with acceptable confinement properties requires numerical optimization of the magnetic field in the non-convex, high-dimensional spaces describing their geometry. Another major challenge facing the stellarator program is the sensitive dependence of confinement properties on electro-magnetic coil shapes, necessitating the construction of the coils under tight tolerances. In this Thesis, we address these challenges with the application of adjoint methods and shape sensitivity analysis.

Adjoint methods enable the efficient computation of the gradient of a function that depends on the solution to a system of equations, such as linear or nonlinear PDEs. Rather than perform a finite-difference step with respect to each parameter, one additional adjoint PDE is solved to compute the derivative with respect to any parameter. This enables gradient-based optimization in high-dimensional spaces and efficient sensitivity analysis. We present the first applications of adjoint methods for stellarator shape optimization.

The first example we discuss is the optimization of coil shapes based on the generalization of a continuous current potential model. We optimize the geometry of the coil-winding surface using an adjoint-based method, producing coil shapes that can be more easily constructed. Understanding the sensitivity of coil metrics to perturbations of the winding surface allows us to gain intuition about features of configurations that enable simpler coils. We next consider solutions of the drift-kinetic equation, a kinetic model for collisional transport in curved magnetic fields. An adjoint drift-kinetic equation is derived based on the self-adjointness property of the Fokker-Planck collision operator. This adjoint method allows us to understand the sensitivity of neoclassical quantities, such as the radial collisional transport and self-driven plasma current, to perturbations of the magnetic field strength. Finally, we consider functions that depend on solutions of the magneto-hydrodynamic (MHD) equilibrium equations. We generalize the well-known self-adjointness property of the MHD force operator to include perturbations of the rotational transform and the currents outside the confinement region. This self-adjointness property is applied to develop an adjoint method for computing the derivatives of such functions with respect to perturbations of coil shapes or the plasma boundary. We present a method of solution for the adjoint equations based on a variational principle used in MHD stability analysis.
ADJOINT METHODS FOR STELLARATOR SHAPE OPTIMIZATION AND SENSITIVITY ANALYSIS
by
Elizabeth Joy Paul
Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy
2020

Advisory Committee:
Professor William Dorland, Chair/Advisor
Dr. Matthew Landreman, Co-Advisor
Professor Thomas M. Antonsen, Jr.
Professor Adil Hassam
Professor Ricardo Nochetto

© Copyright by
Elizabeth Joy Paul
2020

Preface
In an effort to promote open science, all data and the associated post-processing scripts used to produce the figures in this Thesis have been preserved in a Zenodo archive with citeable DOI 10.5281/zenodo.3745635.

Acknowledgments
I owe many thanks to the individuals who have made my graduate career fruitful and enjoyable. Most importantly, I would like to thank my advisors, Bill Dorland and Matt Landreman, who guided me toward interesting and important physics problems and made the completion of this Thesis possible. Bill, your positive outlook on life and constant curiosity are an inspiration to me. I walk away from every interaction with you with a smile on my face and a new interesting idea in my head. Matt, thank you for your generosity and meticulous attention to detail. From deriving the drift-kinetic equation on the board to providing detailed comments on every manuscript, I could never thank you enough for your investment in my graduate career. As an incoming graduate student I took a bit of a leap of faith when I decided to come to Maryland, and I could not have asked for a better pair of (award-winning!) advisors. Thank you for believing in me and supporting my career at every step of the way.

Many thanks go to the other members of the dissertation committee. To Tom Antonsen, for giving me the opportunity to teach plasma physics and contributing to our games of “dungeons and plasmas” with your top-secret notes. I feel honored to be able to work with a great mind such as yours. I hope we can continue to collaborate and spread the good news about ALPO. To Adil Hassam, for never ceasing to ask thought-provoking questions during group meeting. Your math methods course laid the perfect foundation for plasma physics research. To Ricardo Nochetto, for introducing our group to the methods of shape optimization. I appreciate the time you took in making the mathematical literature accessible to us physicists. Our interactions have contributed to much of the work in this Thesis. Thank you all for agreeing to serve on my committee.

I would also like to give a special acknowledgement to Ian Abel, who introduced our group to adjoint methods, which formed the basis for this Thesis work.

This work was supported by the ARCS Foundation and the US Department of Energy FES grants DE-FG02-93ER-54197 and DE-FC02-08ER-54964. The computations presented in this Thesis have used resources at the National Energy Research Scientific Computing Center (NERSC).

Publication List
1. L. M. Imbert-Gerard, E. J. Paul, and A. Wright, “An introduction to symmetries and stellarators,” in preparation (2019).
2. E. J. Paul, T. Antonsen, Jr., M. Landreman, and W. A. Cooper, “Adjoint approach to calculating shape gradients for 3D magnetic confinement equilibria,” Journal of Plasma Physics 86, 905860103 (2020).
3. E. J. Paul, I. G. Abel, M. Landreman, and W. Dorland, “An adjoint method for neoclassical stellarator optimization,” Journal of Plasma Physics 85, 795850501 (2019).
4. T. Antonsen, Jr., E. J. Paul, and M. Landreman, “Adjoint approach to calculating shape gradients for 3D magnetic confinement equilibria,” Journal of Plasma Physics 85, 905850207 (2019).
5. M. Landreman and E. J. Paul, “Computing local sensitivity and tolerances for stellarator physics properties using shape gradients,” Nuclear Fusion 58, 076023 (2018).
6. E. J. Paul, M. Landreman, A. Bader, and W. Dorland, “An adjoint method for gradient-based optimization of stellarator coil shapes,” Nuclear Fusion 58, 076015 (2018).
7. E. J. Paul, M. Landreman, F. M. Poli, D. A. Spong, H. M. Smith, and W. Dorland, “Rotation and neoclassical ripple transport in ITER,” Nuclear Fusion 57, 116044 (2017).

Table of Contents

f and the adjoint method
3.5 Winding surface optimization results
3.5.1 Trends with optimization parameters
3.5.2 Optimal W7-X winding surface
3.5.3 Optimal HSX winding surface
3.6 Local winding surface sensitivity
3.7 Metrics for configuration optimization
3.8 Conclusions
$\beta$
5.5.2 Rotational transform
5.5.3 Vacuum magnetic well
5.5.4 Ripple on magnetic axis
5.5.5 Effective ripple in the $1/\nu$ regime
5.5.6 Departure from quasi-symmetry
5.5.7 Neoclassical figures of merit
5.6 Conclusions
$m = 0$, $n = 0$ mode
6.3.2 $n = 0$, $m \neq 0$ modes
6.3.3 $m = 0$, $n \neq 0$ modes
6.3.4 $m \neq 0$, $n \neq 0$ modes
6.4 Tokamak shape gradient
6.5 Conclusions
A Toroidal coordinate systems
A.1 Toroidal coordinates
A.2 Flux coordinates
A.3 Magnetic coordinates
A.4 Boozer coordinates
B Justification for current potential
C Adjoint derivative at fixed $J_{\max}$
F.0.1 DKES trajectories
F.0.2 Full trajectories
G Symmetry of the sensitivity function
G.0.1 Symmetry of $S_R$ implied by Fourier derivatives
G.0.2 Symmetry of Fourier derivatives
H Derivatives at ambipolarity
I Derivation of generalized MHD self-adjointness relation
J Alternate derivation of fixed-boundary adjoint relation
K Interpretation of the displacement vector
L Details of axis ripple calculation
M Details of effective ripple in the $1/\nu$ regime calculation
N Details of departure from quasi-symmetry calculation
O Details of neoclassical figures of merit calculation
P Linearized equilibrium energy functional and coefficient matrices
P.1 Further simplification of energy functional
P.2 Explicit forms of coefficient matrices
P.3 Invertibility of $A_{\alpha\alpha}$
Q Constraint on bulk force perturbation
R Near-axis expansion of screw pinch equilibria

Chapter 1
Introduction
This Chapter aims to motivate and place in context the work of this Thesis. We begin with an introduction to the stellarator concept of toroidal confinement in Section 1.1, including the necessity of optimization of the magnetic field. We then discuss important properties of a stellarator device in Section 1.2. To put stellarator optimization in perspective, we briefly discuss the relevant history in Section 1.3. We then, in Section 1.4, provide a detailed introduction to stellarator optimization, including typical assumptions, numerical methods, and associated challenges. We conclude with an overview of this Thesis in Section 1.5.

Throughout this Chapter, we use terminology related to magnetic field geometry and toroidal coordinate systems, which are introduced in Appendix A.
The fusion community must face several significant scientific challenges to demonstrate a viable magnetic fusion reactor. A large fraction of the present research in magnetic fusion is dedicated to the tokamak, a concept that relies on a large plasma current for confinement. Driving such a current requires a significant amount of recirculated power and necessitates either pulsed operation or non-inductive current drive, both of which are disadvantageous for a fusion reactor. This large current makes tokamaks susceptible to current-driven instabilities that can limit plasma performance. These instabilities, such as tearing and kink instabilities, can result in catastrophic terminations of the discharge (Chapter 7.9 in [235]). Runaway electrons formed due to disruptions can be accelerated by the inductive electric field, possibly causing damage to plasma-facing components and applying large electro-magnetic forces to the vacuum vessel. The effect of runaway electrons will be much more harmful in large reactor-scale tokamaks due to the exponential dependence of the density of relativistic electrons on the plasma current [104]. Thus in a reactor, disruptions must be mitigated by active feedback and operation within a safe margin of stability limits. However, such control will be difficult when alpha particles provide a significant fraction of the heating power [93].

Figure 1.1: A schematic image of a tokamak (a) and stellarator (b). The electro-magnetic coils are shown in blue, and the plasma domain is shown in green. Magnetic field lines lying on the outermost magnetic surface are shown in black.

Remarkably, Lyman Spitzer predicted these possible difficulties of tokamak confinement in 1952 [210], before the first toroidal confinement experiment,

“... a large induced current is open to the two practical objections that it cannot be sustained in a steady equilibrium and that the rapid generation of such a current is likely to lead to plasma oscillations.”

These observations led to the development of the stellarator concept. In contrast to the tokamak, a stellarator generates a poloidal magnetic field through clever shaping by external currents rather than internal plasma currents. A small amount of current in the plasma is self-driven due to pressure gradients, though this is typically not large enough to result in significant MHD modes. There is some experimental evidence that stellarator configurations may be able to operate above the linear MHD stability pressure threshold [234] rather than being terminated by a disruption. The Large Helical Device (LHD) has operated up to a volume-averaged $\beta$ of 5% without any disruptive MHD phenomena, though the heat transport increases due to low-$n$ mode activity [201]. Here $\beta = p/(B^2/(2\mu_0))$ is the ratio of the plasma pressure, $p$, to the magnetic pressure, and $n$ is the toroidal mode number. Similarly, high-beta discharges in the Wendelstein 7-Advanced Stellarator (W7-AS) have shown saturation of low-$n$ and interchange modes at a low level that merely slowly degrades confinement [234]. Stellarators can also operate at higher density than tokamaks due to the absence of the Greenwald limit [72]. While in tokamaks the Greenwald and MHD stability limits set hard boundaries on the density and pressure of the operating points, in a stellarator much softer limits exist. Performance at high beta is often instead limited by equilibrium properties, such as magnetic field stochasticity near the edge.
For example, if the Shafranov shift becomes comparable to the minor radius of the plasma, this can lead to loss of magnetic surfaces [212]. The ability to operate at high beta is critical for an economical fusion reactor: in the temperature range of 10-20 keV, the fusion power density scales as $P \sim \beta^2 B^4$ [208]. See Figure 1.1 for schematics of a tokamak and stellarator configuration.

Despite these clear advantages, much care must be taken to design a stellarator with acceptable confinement properties. Due to its continuous toroidal symmetry, the tokamak enjoys confinement of collisionless single-particle trajectories and the existence of closed, nested magnetic surfaces. However, in the general three-dimensional field of a stellarator, these properties are not always present. The trajectories of energetic ions, such as the alpha particles produced in a fusion reaction, may therefore be lost, resulting in damage to material surfaces. Stellarators can experience enhanced neoclassical transport, the collisional transport of thermal particles due to the magnetic field geometry, leading to increased transport of heat and particles, especially at low collisionality (Figure 1.2). The presence of large magnetic islands or chaotic regions in a three-dimensional field can also severely limit performance by locally flattening the temperature profile.

Figure 1.2: The neoclassical diffusion coefficient, $D^*$, as a function of the normalized collisionality, $\nu^* = \nu R/(\iota v)$, where $\nu$ is the collision frequency, $\iota$ is the rotational transform, $v$ is the speed, and $R$ is the major radius. An axisymmetric field exhibits a low-collisionality regime in which $D^* \sim \nu$, while a stellarator exhibits $D^* \sim 1/\nu$. Thus the neoclassical transport in a general three-dimensional field can be especially deleterious at low collisionality. Figure reproduced from [101] with permission.

However, none of these challenges appear to be showstoppers for stellarator confinement. The success of modern stellarators can be attributed to the ability to design the magnetic field with numerical optimization. While tokamak optimization is also possible [107], it is much more difficult as confinement properties become very sensitive to the current density and pressure profiles. These profiles can be determined with multi-scale modeling on turbulent and transport time scales, which is very computationally intensive.
On the other hand, the physical properties of stellarators are relatively insensitive to these profiles, as they primarily rely on the externally produced magnetic field for confinement [27]. Given the ability to numerically optimize the magnetic field of a stellarator, in Section 1.2, we discuss the properties one should consider in a design.

Figure 1.3: A Poincaré surface computed from the NCSX coil shapes [236]. To produce this Figure, magnetic field lines are integrated toroidally around the device. Each time they hit a plane at constant toroidal angle, a point is plotted with color indicating the field line. A general 3D field contains regions of chaotic field lines and magnetic island chains along with a volume of nested toroidal magnetic surfaces. Figure adapted from [121].

We now outline the desired physical properties of a stellarator and standard proxy functions applied during their design. We will reserve any discussion of coils, the external currents that produce the magnetic field, until Section 1.4.3.
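The field-line construction described in the caption of Figure 1.3 can be illustrated with a minimal sketch. This is not the calculation used for that Figure, which traces field lines of the NCSX coil field. It assumes instead a hypothetical model in flux-like coordinates $(\psi, \theta, \phi)$ with field-line Hamiltonian $\chi = \iota_0 \psi + s\psi^2/2 + \varepsilon \cos(m\theta - n\phi)$, so that $d\theta/d\phi = \partial\chi/\partial\psi$ and $d\psi/d\phi = -\partial\chi/\partial\theta$. The function names, parameter values, and the fixed-step RK4 integrator are all illustrative choices.

```python
import numpy as np


def field_line_rhs(phi, state, iota0, shear, eps, m, n):
    """Right-hand side of the model field-line equations (assumed toy model)."""
    psi, theta = state
    dpsi = eps * m * np.sin(m * theta - n * phi)   # -d(chi)/d(theta)
    dtheta = iota0 + shear * psi                   # d(chi)/d(psi) = iota(psi)
    return np.array([dpsi, dtheta])


def poincare_section(psi0, theta0, n_transits, iota0=0.3, shear=0.5,
                     eps=0.0, m=3, n=1, steps_per_transit=64):
    """Follow one field line and record (psi, theta) at each crossing of phi = 0."""
    h = 2.0 * np.pi / steps_per_transit
    state = np.array([psi0, theta0], dtype=float)
    points = np.empty((n_transits, 2))
    args = (iota0, shear, eps, m, n)
    for k in range(n_transits):
        # The model is 2*pi-periodic in phi, so phi can restart each transit.
        phi = 0.0
        for _ in range(steps_per_transit):
            # Classical fourth-order Runge-Kutta step in the toroidal angle.
            k1 = field_line_rhs(phi, state, *args)
            k2 = field_line_rhs(phi + h / 2, state + h / 2 * k1, *args)
            k3 = field_line_rhs(phi + h / 2, state + h / 2 * k2, *args)
            k4 = field_line_rhs(phi + h, state + h * k3, *args)
            state = state + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            phi += h
        points[k] = (state[0], state[1] % (2.0 * np.pi))
    return points


# One field line started near the assumed iota = n/m = 1/3 resonance.
points = poincare_section(psi0=0.05, theta0=0.0, n_transits=200, eps=1e-3)
```

With `eps = 0` every field line stays on its surface of constant $\psi$, the analogue of nested magnetic surfaces. A nonzero `eps` opens an island chain near the surface where $\iota = n/m$, and plotting the returned points for many starting values of `psi0` gives a picture qualitatively like Figure 1.3.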
Equilibrium properties
The operating space of stellarators is often restricted due to MHD equilibrium properties rather than stability limits. For example, when $\beta \sim \epsilon\iota^2/2$, where $\epsilon$ is the inverse aspect ratio and $\iota$ is the rotational transform, the Shafranov shift becomes comparable to the minor radius, which may result in flux-surface break-up [97, 212]. There is a tendency of the edge magnetic field to become stochastic at large beta [201], so a design should try to maximize the volume of continuously nested flux surfaces [119]. One should also minimize the island width at low-order rational surfaces, which can be estimated using analytic expressions [38, 147], assuming the magnetic field is close to having perfect magnetic surfaces. Such islands can also be minimized by controlling the rotational transform, either by maintaining low magnetic shear and eliminating low-order rational surfaces altogether or by taking advantage of large magnetic shear, as the magnetic island width scales as $1/\sqrt{\iota'(\psi)}$ [26]. See Figure 1.3 for a visualization of magnetic surfaces, magnetic islands, and chaotic field lines in the NCSX stellarator.

Pressure-driven currents
There are several sources of self-driven plasma current [97]: the parallel bootstrap current arises due to collisions between trapped and passing particles in the presence of density and temperature gradients, and the parallel Pfirsch-Schlüter and perpendicular diamagnetic currents occur due to equilibrium pressure gradients. The bootstrap current can cause shifts in the rotational transform toward low-order rational values, which must especially be avoided in low-shear devices. Control of the edge rotational transform is also vital for designs with an island divertor [75]. In the presence of reduced bootstrap current, the magnetic field structure becomes less sensitive to changes in beta. For these reasons, the Wendelstein 7-X (W7-X) configuration was designed for minimal bootstrap current [86]. Often optimization is performed with a low-collisionality semi-analytic bootstrap current model [205]. Bootstrap current optimization will be described further in Chapter 4. The Pfirsch-Schlüter current does not provide any net current and therefore does not shift the rotational transform. However, it can give rise to a Shafranov shift and thus affect the equilibrium beta limit [232]. The Pfirsch-Schlüter current can be reduced by minimizing the magnitude of the geodesic curvature. The net diamagnetic current will only be non-zero in the presence of another source of net current; thus, the reduction of the bootstrap current will automatically reduce the diamagnetic current.

While the presence of self-driven current can give rise to unfavorable shifts in the rotational transform, there are situations in which significant bootstrap current may be desirable. If the bootstrap current provides a source of rotational transform in addition to the external coils, the coil complexity may be reduced and a more compact device may be possible. Plasma current can also provide island healing [95], reducing the width of islands in comparison with those in the vacuum configuration. For these reasons, the National Compact Stellarator Experiment (NCSX) was designed to be quasi-axisymmetric with a significant fraction of rotational transform provided by the plasma current [114].
Energetic-particle confinement
A successful stellarator reactor must confine energetic alpha particles for at least their slowing-down time such that their energy can be deposited with the thermal population. Prompt losses of fast particles should especially be avoided because they can lead to damage to material surfaces. Collisional diffusion and deflection are minimal at energies near the birth energy of 3.5 MeV, so the collisionless trajectories are of primary interest. For these trajectories, the parallel adiabatic invariant,

$$J = \oint dl \, v_{\|}, \tag{1.1}$$

is a conserved quantity, where $v_{\|}$ is the velocity parallel to the magnetic field and $l$ measures length along a field line. For trapped particles, the integral is taken along a closed trajectory between bounce points. For passing particles, it is taken along a field line until it comes infinitesimally close to its starting point. If $J$ is constant on a magnetic surface, then the collisionless trajectories will experience no net radial drift, a property known as omnigeneity [39]. Thus several properties involving $J$, such as its variation within a flux surface, have been considered during the design process [58, 213]. There is evidence that targeting quasi-symmetry (defined shortly) near the half-radius may also improve energetic particle confinement [105].

Quasi-symmetry

Quasi-symmetric magnetic fields are a subset of omnigeneous magnetic fields. A quasi-symmetric magnetic field possesses a symmetry direction of the magnetic field strength when expressed in Boozer coordinates (Appendix A.4),

$$B(\psi, \vartheta_B, \phi_B) = B(\psi, M\vartheta_B - N\phi_B), \tag{1.2}$$

for fixed integers $M$ and $N$. If $M = 0$, the contours of the magnetic field strength close poloidally, known as quasi-poloidal symmetry. If $N = 0$, the contours of the magnetic field strength close toroidally, known as quasi-axisymmetry. If both $M$ and $N$ are non-zero, the contours of the field strength close both toroidally and poloidally, known as quasi-helical symmetry.

This symmetry implies guiding center confinement [24] and neoclassical properties that are comparable to those of an equivalent tokamak [97], including the ability to rotate in the direction of quasi-symmetry [100]. A quasi-symmetric field is omnigeneous, though the converse is not necessarily true. Quasi-symmetry is typically targeted by minimizing the symmetry-breaking Fourier harmonics of the magnetic field strength.

Neoclassical transport
Stellarators experience enhanced neoclassical transport at low collisionality in comparison with tokamaks (Figure 1.2). Neoclassical transport is typically the dominant transport channel in classical (unoptimized) stellarators. It is common to employ the effective ripple ($\epsilon_{\text{eff}}$) proxy, which quantifies the geometric dependence of the radial fluxes in the low-collisionality $1/\nu$ regime [168]. A discussion of $\epsilon_{\text{eff}}$ and neoclassical diffusion in the $1/\nu$ regime is given in Chapter 5 and Appendix M. Neoclassical optimization will be discussed in more depth in Chapter 4. A review of neoclassical optimization strategies is given in [165].

Stability
Although stellarators may be able to operate above linear MHD stability limits, it is desirable to design a stellarator with an increased beta limit to reduce enhanced transport caused by MHD modes. It is common to employ the magnetic well [85] (discussed in Chapter 5) or Mercier criterion [157] as proxies for the stability of low-$n$ interchange modes. One can also try to increase magnetic shear, the radial derivative of the rotational transform $\iota'(\psi)$, to improve large-$n$ ballooning stability and Mercier stability [95]. It appears that stellarators can also be designed with reduced microturbulence, though turbulence optimization has yet to be demonstrated experimentally. Some proxies have been proposed, such as reducing the overlap between bad curvature and trapping regions [239] or increasing nonlinear energy transfer between unstable and damped modes [96].

Figure 1.4: A diagram of the figure-eight stellarator design from Lyman Spitzer’s 1951 Project Matterhorn report. Figure reproduced from [209].

Lyman Spitzer’s first stellarator concept used a simple figure-eight design (Figure 1.4), which produced rotational transform by “twisting the torus out of the plane” [211]. Spitzer and his team experimentally demonstrated that external shaping could produce rotational transform in a vacuum field with the Model A, B, and C series stellarators at Princeton [215]. Results from the Model B1 demonstrated confinement of energetic electrons for several milliseconds, much longer than would be possible with a purely toroidal field. However, the observed diffusion of thermal particles was much larger than that predicted from Bohm scaling [46]. The Model C, using a racetrack configuration with helically wound coils, was able to demonstrate the existence of nested magnetic surfaces [207]. Nonetheless, the Model C experienced poor confinement with Bohm-like diffusion [241].
These early stellarator experiments operated until the late 1960s when promising results from the Soviet T-3 tokamak became available, and it was decided that Princeton’s Model C would be converted to a tokamak [1].

Meanwhile, the Wendelstein line of stellarators was active at IPP Garching, initially adopting Princeton’s racetrack design. Experiments on WII-A provided insight into the benefits of low magnetic shear and accurate construction of the coil system for avoiding magnetic islands [19]. The performance continued, however, to be limited by neoclassical transport at low collisionality and low equilibrium pressure limits due to the Shafranov shift [108].

A significant breakthrough in the stellarator program came with the design of W7-AS, which aimed to improve confinement with equilibrium optimization. To demonstrate the stellarator optimization concept, W7-AS was partially optimized for minimal geodesic curvature. Such an objective was predicted to minimize radial magnetic drifts and pressure-driven parallel currents. For the first time, the magnetic field shaping was supplied by non-planar, modular coils (Figure 1.5) that provided the freedom to tailor the magnetic field more carefully than helical coils. The experiment operated from 1988 to 2002, demonstrating the improved equilibrium and stability properties and reduction of neoclassical transport enabled through equilibrium optimization [108, 117].

Figure 1.5: The modular field (MF) coils, toroidal field (TF) coils, and flux surfaces of the W7-AS stellarator. Figure reproduced from [108] with permission.

The success of W7-AS paved the way for the W7-X experiment [233], which was fully optimized for nested magnetic surfaces, fast-particle confinement, reduced parallel currents, minimal neoclassical transport at low collisionality, and MHD stability up to an average $\beta$ of 5% [15].
The early optimization efforts of the Wendelstein team benefited greatly from the discovery that guiding center confinement could be achieved with a quasi-symmetric [24] magnetic field. Nührenberg and Zille of the Wendelstein team then demonstrated that quasi-symmetric equilibria could be obtained from numerical optimization of MHD equilibria [175]. The W7-X configuration was designed based on one of their quasi-helical configurations, modified to achieve the objectives outlined above. The resulting configuration was quasi-isodynamic, a quasi-omnigenous magnetic field with poloidally closed contours of the magnetic field strength [98, 176]. Experiments from the initial campaigns of W7-X have demonstrated the success of the stellarator equilibrium optimization concept, confirming the desired magnetic topology to within a tolerance of $10^{-5}$ [188]. High-beta operation will not be demonstrated until an actively-cooled divertor is installed for the next operating campaign. However, there is initial evidence that recent high-performance shots could not have been achieved without neoclassical optimization [237].

Figure 1.6: Modular field coils (silver), toroidal field coils (bronze), and magnetic surfaces of the W7-X stellarator. Figure reproduced from [223] with permission.

W7-X was not, however, the first experimental demonstration of a fully optimized stellarator. The Helically Symmetric eXperiment (HSX) was designed to have quasi-helical symmetry, Mercier stability, and low magnetic shear [8] using the equilibrium optimization tools developed by the Wendelstein team [6]. HSX has demonstrated a reduction of electron thermal diffusivity [35] due to the decrease in neoclassical transport and a reduction of flow damping in the symmetry direction [77]. The inward-shifted configuration of LHD was partially optimized for reduced neoclassical transport and energetic particle confinement [163], though its ideal MHD stability is worsened in comparison with the standard configuration.
Experiments have demonstrated higher electron temperatures and improved energetic ion confinement in the inward-shifted configuration as compared with the standard configuration [164].

There continues to be an effort toward advanced stellarator designs. Construction has commenced for the Chinese First Quasi-symmetric Stellarator (CFQS) [206], which will be the first quasi-axisymmetric device in operation. The quasi-axisymmetric NCSX [242] was designed and partially constructed at the Princeton Plasma Physics Laboratory (PPPL), but its funding was terminated before its completion. As the field of stellarator optimization has developed, several other stellarator equilibria have been optimized to be quasi-symmetric [12, 57, 70, 106, 134, 135, 167] and quasi-omnigenous [122, 159].

Historically, stellarator optimization has largely used a two-stage approach. In the first step, the magnetic field in the confinement region is optimized to obtain the desirable physics properties. The magnetic field must satisfy the MHD equilibrium equations; thus this task amounts to optimization in the space of free parameters that describe the MHD equilibrium. Often a fixed-boundary MHD calculation is performed, in which an outer flux surface is prescribed, as opposed to a free-boundary calculation, in which the currents in the vacuum region are prescribed. As a second step, the currents in the vacuum region are optimized to be consistent with the boundary obtained in the first step. As numerical MHD equilibrium calculations form the foundation of stellarator optimization, these will be described in Section 1.4.1. The two stages of the optimization process are described in Sections 1.4.2 and 1.4.3. We will conclude with a discussion of the present challenges associated with the design of stellarators and how this Thesis will address them in Section 1.4.4.
The MHD equilibrium equations,
\begin{align}
\mathbf{J} \times \mathbf{B} &= \nabla p \tag{1.3a} \\
\nabla \times \mathbf{B} &= \mu_0 \mathbf{J} \tag{1.3b} \\
\nabla \cdot \mathbf{B} &= 0, \tag{1.3c}
\end{align}
describe the steady-state behavior of the magnetic field in strongly magnetized plasmas. Many assumptions are made in arriving at (1.3), such as small plasma resistivity, low frequency in comparison with the cyclotron and collision frequencies, and small electron inertia. In practice, these equations describe the long-wavelength, low-frequency behavior of magnetic fusion plasma very well [64].

Finding solutions to (1.3) is non-trivial in a general three-dimensional field, as well-posedness requires a set of constraints to be satisfied on every closed field line unless the pressure profile is locally flattened ([84], Section 10.3 in [121]). An alternative is to rely on the assumption that there exists a set of continuously nested toroidal magnetic surfaces, $\Gamma(\psi)$, labeled by the toroidal flux label, $\psi$. Although magnetic surfaces are not guaranteed to exist in general three-dimensional geometry, any stellarator configuration of physical interest will possess a large region of continuously nested surfaces, and making this assumption will allow for tractable MHD equilibrium calculations.

Under the assumption of continuously nested toroidal magnetic surfaces, solutions of (1.3) can be shown to be stationary points of an energy functional [133],
\[
W[\mathbf{B}] = \int_{V_P} d^3x \left( \frac{B^2}{2\mu_0} - p \right), \tag{1.4}
\]
where $V_P$ is the volume of the confinement region bounded by a magnetic surface $S_P$. Variations of $W$ are computed at prescribed and fixed pressure ($p(\psi)$), rotational transform ($\iota(\psi)$), and the toroidal flux label on $S_P$ ($\psi$) ([97], Section 11.1 in [121]). Solutions to (1.3) under these assumptions can be computed efficiently and robustly using gradient-descent methods to obtain local minima of $W[\mathbf{B}]$.
This approach is implemented in the VMEC [111] and NSTAB [69] codes.

Sometimes another function of flux is prescribed instead of the rotational transform, such as the net toroidal current inside a constant-$\psi$ surface,
\[
I_T(\psi) = \int_{S_T(\psi)} d^2x \, \mathbf{J} \cdot \hat{n}, \tag{1.5}
\]
where $S_T(\psi)$ is a surface at constant toroidal angle bounded by $\Gamma(\psi)$ (Figure A.2) and $\hat{n}$ is the unit normal. This choice of flux function is more common in the context of optimization, as $I_T(\psi)$ can be chosen to vanish for a vacuum field or to be consistent with a bootstrap current model at finite pressure [206, 214].

PDE | BC | Given
$(\nabla \times \mathbf{B}) \times \mathbf{B} = \mu_0 \nabla p(\psi)$ | $\mathbf{B} \cdot \hat{n}|_{S_P} = 0$ | $p(\psi)$, $\psi$, & $S_P$
$\nabla \cdot \mathbf{B} = 0$ | | $\iota(\psi)$ or $I_T(\psi)$

Table 1.1: Summary of fixed-boundary equilibrium PDE.

We can consider (1.3) to be an equation determining the magnetic field $\mathbf{B}$, as the current density is computed from Ampere’s law (1.3b) and the pressure is given as a function of flux, $p(\psi)$. The MHD equilibrium equations are solved with a Dirichlet boundary condition,
\[
\mathbf{B} \cdot \hat{n}|_{S_P} = 0. \tag{1.6}
\]
In the fixed-boundary approach, $S_P$ is given and fixed during the equilibrium calculation. The relevant equations for a fixed-boundary calculation are summarized in Table 1.1.

In the free-boundary approach, the current density, $\mathbf{J}_C$, in the vacuum region, $\mathbb{R}^3 \setminus V_P$, is prescribed instead of $S_P$. The magnetic field due to this current is computed from the Biot-Savart law,
\[
\mathbf{B}_C(\mathbf{x}) = \frac{\mu_0}{4\pi} \int_{\mathbb{R}^3 \setminus V_P} d^3x' \, \frac{\mathbf{J}_C(\mathbf{x}') \times (\mathbf{x} - \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|^3}. \tag{1.7}
\]
For a given $S_P$, the plasma current, $\mathbf{J}_P$, is computed from (1.3). The magnetic field due to the plasma current, $\mathbf{B}_P$, can similarly be computed from the Biot-Savart law or more efficiently with the application of the virtual casing principle [143]. The total magnetic field must be tangent to the boundary,
\[
(\mathbf{B}_P + \mathbf{B}_C) \cdot \hat{n}|_{S_P} = 0. \tag{1.8}
\]
Furthermore, the total pressure must be continuous across $S_P$,
\[
\left[\!\left[ \frac{B^2}{2\mu_0} + p \right]\!\right]_{S_P} = 0, \tag{1.9}
\]
to ensure force balance.

In the free-boundary approach, $S_P$ is varied until (1.8) and (1.9) are satisfied. These conditions (1.8)-(1.9) can also be obtained from a variational principle similar to (1.4) including the vacuum region [14]. The free-boundary equilibrium problem is summarized in Table 1.2. Figure 1.7 shows the geometry of equilibrium calculations.

PDE | BC | Given
$(\nabla \times \mathbf{B}) \times \mathbf{B} = \mu_0 \nabla p(\psi)$ | $\mathbf{B} \cdot \hat{n}|_{S_P} = 0$ | $p(\psi)$, $\psi$, & $\mathbf{J}_C$
$\nabla \cdot \mathbf{B} = 0$ | $S_P$ s.t. $(\mathbf{B}_P + \mathbf{B}_C) \cdot \hat{n}|_{S_P} = 0$ and $\left[\!\left[ B^2/(2\mu_0) + p \right]\!\right]_{S_P} = 0$ | $\iota(\psi)$ or $I_T(\psi)$

Table 1.2: Summary of free-boundary equilibrium PDEs. The magnetic field due to the plasma current, $\mathbf{B}_P$, is computed from the Biot-Savart law (1.7) or the virtual casing principle. The magnetic field due to the coil current, $\mathbf{B}_C$, is computed from the Biot-Savart law.

Figure 1.7: An equilibrium is computed with a fixed plasma boundary, $S_P$, or prescribed external currents, $\mathbf{J}_C$. We assume the existence of a set of closed, nested toroidal surfaces, $\Gamma(\psi)$.

Due to its efficiency and robustness, equilibrium optimization has primarily relied on this variational approach. There are several alternative approaches to obtaining numerical solutions to (1.3) in a three-dimensional field. For example, sometimes the pressure is assumed to be piece-wise constant [120], or the magnetic field is taken to resistively relax to an equilibrium [90, 115]. For a review of other 3D equilibrium models, see Chapter 11 in [121].

The goal of stellarator optimization is ultimately to obtain the currents in the vacuum region needed to produce a stellarator configuration with desired physical properties. In this sense, it is logical to optimize the coils directly based on a free-boundary equilibrium. However, fixed-boundary optimization has been predominantly used for several practical reasons.
Free-boundary equilibrium calculations tend to be more expensive, as they require iterations between an equilibrium solve and vacuum field calculations. This iterative scheme will not always converge in practice, hence the historical use of the more robust fixed-boundary method. It has also been suggested that fixed-boundary optimization may yield better equilibrium properties, as the model assumes the existence of at least one magnetic surface. With this approach, considerations of the physics properties of a configuration are largely decoupled from engineering considerations of the coils. As a second step, the electro-magnetic coils are designed, as described in Section 1.4.3.

The fixed-boundary optimization problem is,
\[
\min_{S_P} f(S_P, \mathbf{B}(S_P)), \tag{1.10}
\]
where $\mathbf{B}$ is seen as a function of $S_P$ through the fixed-boundary equations (Table 1.1). Here, the objective function, $f$, quantifies physics or engineering properties of an equilibrium, such as those outlined in Section 1.2. It is common to consider several objectives during an optimization, taking the objective function to be a sum of squares,
\[
f(S_P, \mathbf{B}(S_P)) = \sum_i \frac{\left( f_i(S_P, \mathbf{B}(S_P)) - f_i^{\text{target}} \right)^2}{\sigma_i^2}. \tag{1.11}
\]
Here $f_i^{\text{target}}$ is the target value for objective $i$ and the $\sigma_i$ parameters quantify the relative weighting of the objectives.

Sometimes additional equality or inequality constraints are imposed,
\begin{align}
g(S_P, \mathbf{B}(S_P)) &= 0 \tag{1.12a} \\
h(S_P, \mathbf{B}(S_P)) &\leq 0. \tag{1.12b}
\end{align}
For example, the rotational transform might be constrained to be equal to a target value, or a maximum plasma volume may be imposed. Depending on the choice of optimization method, a local or global minimum will be sought. We will delay discussion of specific optimization algorithms until Section 1.4.4.
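As an illustration of the weighted sum of squares in (1.11), the following minimal Python sketch assembles such an objective from a list of metric functions. The names `make_objective`, `metric_a`, and `metric_b` are hypothetical, and the two toy metrics are placeholders chosen for illustration rather than physics quantities; in an actual equilibrium optimization each $f_i$ would require an equilibrium solve.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector], float]


def make_objective(metrics: List[Metric],
                   targets: Sequence[float],
                   sigmas: Sequence[float]) -> Callable[[Vector], float]:
    """Build f(x) = sum_i ((f_i(x) - f_i^target) / sigma_i)^2, as in (1.11)."""
    if not (len(metrics) == len(targets) == len(sigmas)):
        raise ValueError("metrics, targets, and sigmas must have equal length")

    def objective(x: Vector) -> float:
        total = 0.0
        for f_i, target, sigma in zip(metrics, targets, sigmas):
            # A smaller sigma gives the corresponding objective a heavier weight.
            residual = (f_i(x) - target) / sigma
            total += residual * residual
        return total

    return objective


# Hypothetical toy metrics of a two-parameter "shape" x (illustration only).
def metric_a(x: Vector) -> float:
    return x[0] + x[1]


def metric_b(x: Vector) -> float:
    return x[0] * x[1]


f = make_objective([metric_a, metric_b], targets=[3.0, 2.0], sigmas=[1.0, 0.5])
print(f([1.0, 2.0]))  # both targets are met exactly
print(f([2.0, 2.0]))  # both metrics miss their targets
```

Because the second metric is given the smaller $\sigma_i$, the same absolute miss in that metric is penalized more heavily, which is the sense in which the $\sigma_i$ set the relative weighting.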
The fixed-boundary optimization method is implemented in the STELLOPT [197, 213] and ROSE codes [59].

1.4.3 Coil optimization

Once a target plasma boundary, $S_P$, and equilibrium magnetic field, $\mathbf{B}$, are identified from equilibrium optimization, electro-magnetic coils that are consistent with this equilibrium must be identified. The total magnetic field, $\mathbf{B}$, can be decomposed into that which results from the target equilibrium plasma current, $\mathbf{B}_P$, and that which results from the coil currents, $\mathbf{B}_C$, computed from the Biot-Savart law. If the two are consistent, then the following relation will be satisfied,
\[
0 = \mathbf{B}_P(\mathbf{x}) \cdot \hat{n}(\mathbf{x}) + \frac{\mu_0}{4\pi} \int_{\mathbb{R}^3 \setminus V_P} d^3x' \, \frac{\mathbf{J}_C(\mathbf{x}') \times (\mathbf{x} - \mathbf{x}') \cdot \hat{n}(\mathbf{x})}{|\mathbf{x} - \mathbf{x}'|^3}, \tag{1.13}
\]
for all $\mathbf{x} \in S_P$. In other words, the coils must be consistent with the last magnetic surface of the target equilibrium.

We note that the above is in the form of an integral equation of the first kind,
\[
g(t) = \int_a^b ds \, K(t, s) f(s), \tag{1.14}
\]
where $g(t)$ is given in some domain $t \in [c, d]$, $K(t, s)$ is a known kernel function, and $f(s)$ must be inferred. It is well-known that such problems are ill-posed [131], in the sense that small changes in the prescribed data, $g(t)$, can result in large changes in the solution, $f(s)$, and a unique solution may not exist.

Thus finding a solution for $\mathbf{J}_C$ in (1.13) is not well-posed. In some ways, this is advantageous, as there may be many possible coil arrangements that provide the desired plasma configuration, and the one with the most favorable engineering properties can be chosen. However, one must be careful when obtaining numerical solutions to this problem so that noise in the prescribed data is not amplified. A classical technique for such problems is Tikhonov regularization [225], in which (1.14) is replaced by the optimization problem,
\[
\min_{f} \int_c^d dt \left( \int_a^b ds \, K(t, s) f(s) - g(t) \right)^2 + \lambda \int_a^b ds \, \left( f(s) \right)^2. \tag{1.15}
\]
When $\lambda = 0$, the above is equivalent to (1.14). In order for the problem to be well-posed, additional information about the nature of the solution is provided. In (1.15), the assumption is made that the norm of the solution will be small. The regularization parameter, $\lambda$, describes the trade-off between obtaining a solution of (1.14) and satisfying the expected or desired behavior of the solution. The regularized problem now has a unique solution and depends continuously on $g(t)$ for all $\lambda > 0$. Applied to (1.13), the regularized coil optimization problem takes the form
\[
\min_{\mathbf{J}_C} \left( \int_{S_P} d^2x \, \left( \left( \mathbf{B}_P + \mathbf{B}_C \right) \cdot \hat{n} \right)^2 + \lambda \int_{\mathbb{R}^3 \setminus V_P} d^3x \, F(\mathbf{J}_C) \right), \tag{1.16}
\]
where $\mathbf{B}_C$ is the magnetic field due to $\mathbf{J}_C$ computed from the Biot-Savart law (1.7) and $F(\mathbf{J}_C)$ is some function of the coil currents that characterizes desired engineering properties.

Coil properties
Given the freedom inherent in designing stellarator coils, we now outline some desired properties for a set of stellarator coils.

• Physics objectives - Our primary interest is to find a coil set consistent with our target fixed-boundary equilibrium. This objective is typically quantified by the error in obtaining the last magnetic surface, as in (1.13). In practice, some physics metrics depend very sensitively on coil perturbations, so other critical physics properties of the equilibrium can be included in the coil optimization, such as the magnetic ripple on axis (a measure of quasi-symmetry) or the rotational transform [56].

• Manufacturability - Coil shapes have a minimum allowable radius of curvature due to their finite build, and overly complex coils may be difficult to manufacture without excessive cost [220]. There are many metrics suggested for quantifying complexity, such as length [243], torsion [118], and curvature [32].

• Stresses - Complex support structures must be built to maintain coil locations and shapes under their large electro-magnetic, thermal, and gravitational stresses. As coils tend to become more circular and planar under electro-magnetic stresses [129], it is advantageous to minimize curvature and non-planarity when possible.

• Access to the plasma chamber - There should be sufficient distance between coils to allow for diagnostic ports and ease of machine assembly and maintenance. Coils with relatively straight sections on the outboard side may particularly provide improved access [32].

• Coil-plasma separation - In a reactor, coils should be designed sufficiently far from the plasma boundary to allow space for neutron shielding, a blanket, the first wall, coil casing, and the vacuum vessel. Increased coil-plasma distance can also reduce the magnetic field ripple due to the finite number of coils. The minimum coil-plasma distance effectively sets the required size of a reactor, as ≈ […].

Current potential methods
The first stellarator coil design code, NESCOIL [158], assumes that all currents in the vacuum region lie on a closed toroidal surface called the winding surface, $S_C$. This method was used to design the modular coils of W7-AS [108], W7-X [15], and HSX [5] and was later generalized to include regularization in the REGCOIL [136] code. In the limit of a large number of coils, we can describe a set of discrete coils by a continuous current density on $S_C$,
\[
\mathbf{J} = \delta(b(\mathbf{x})) \, \mathbf{J}_C(\theta, \phi). \tag{1.17}
\]
Here $b(\mathbf{x})$ is the signed-distance function [179],
\[
b(\mathbf{x}) =
\begin{cases}
-d(\mathbf{x}, S_C) & \mathbf{x} \in V_C \\
0 & \mathbf{x} \in S_C \\
d(\mathbf{x}, S_C) & \mathbf{x} \notin V_C.
\end{cases} \tag{1.18}
\]
The volume enclosed by $S_C$ is $V_C$ and $d(\mathbf{x}, S_C)$ is the shortest distance from $\mathbf{x}$ to any point on $S_C$. The signed distance function is also discussed in Section 2.1. The surface current $\mathbf{J}_C$ is a function of the two angles, $\theta$ and $\phi$, parameterizing the position on $S_C$. As a consequence of Ampere’s law (Appendix B), the continuous surface current can be written as,
\[
\mathbf{J}_C = \hat{n} \times \nabla \Phi. \tag{1.19}
\]
We can note that current will flow along the contours of $\Phi$, as $\mathbf{J}_C \cdot \nabla \Phi = 0$. In this way, once $\Phi$ is computed, the coil shapes can be chosen to be a set of the contours of $\Phi$. As we will see in Section 3, it is possible to construct an objective function that is a convex function of $\Phi$, possessing a unique global minimum that can be obtained through linear least-squares. Thus current potential methods are particularly robust and efficient, though based on some severe assumptions. Coil complexity can be approximated from the properties of the current potential. In REGCOIL, this is done with the norm of the current density,
\[
\chi_J^2 = \int_{S_C} d^2x \, |\mathbf{J}_C|^2, \tag{1.20}
\]
as large values of $\chi_J^2$ indicate small coil-coil spacing. An example REGCOIL calculation is shown in Figure 1.8.

Filamentary methods
Other coil design codes instead assume that all currents in the vacuum region are confined to filamentary lines, $\{C_k\}$, taken to be the center of each winding pack. This assumption is again an idealization, as stellarator coils have a finite build consisting of several layers, each with several turns of the conducting material. However, the filamentary method is more realistic than current potential methods, as it accounts for the ripple due to the finite nature of coils. The lines and the current through each are optimized to minimize some objective function that includes the normal field error on $S_P$ in addition to engineering objectives, which serve as a form of regularization. For example, the FOCUS code [243] uses the coil length as a form of regularization, and the COILOPT code [216] includes the coil-plasma separation, coil-coil separation, and the coil curvature. These optimization problems are generally nonlinear and non-convex so that the resulting local minimum will depend on the initial guess. For this reason, a current potential solution can be used to initialize the optimization with filamentary methods.

Figure 1.8: An example of a REGCOIL calculation for the W7-X standard configuration equilibrium. The winding surface is taken to be a surface uniformly offset from $S_P$ by 0.5 m. (a) The normalized current potential and the uniformly-spaced contours taken for the coil set. (b) The coil set computed from the contours on the winding surface. (c) The 5 unique coils in one half period and the plasma surface.

Although there have arguably been significant successes in optimized stellarator design, there is still room for improvement in the algorithms and numerical methods. Specifically, we aim to address several major challenges that arise in the optimization of stellarator configurations.

1.
Coil complexity - In the standard two-step approach, coil design is decoupled from equilibrium optimization. While this may allow for improved physics properties, the resulting equilibrium may require overly complex coils that cannot be manufactured economically or are not consistent with engineering constraints. As was stated in the 2018 report of the National Stellarator Coordinating Committee [73],

“The highest priority for technology is to better integrate the engineering design with the physics design at the earliest possible stage.”

For this reason, it is favorable to include coil complexity metrics in equilibrium optimization. As an example, one approach is to compute the properties of the current potential (Section 1.4.3) on a winding surface that is uniformly offset from the plasma surface [59] during fixed-boundary optimization. It has also been proposed that properties of the optimal filamentary coils for a given plasma boundary be included in equilibrium optimization [118]. Alternatively, the coils can be directly optimized with a free-boundary method. This approach was implemented in the late stages of the NCSX design [119, 217] and in the QPS (Quasi-Poloidally Symmetric Stellarator) design [218], resulting in simultaneous attainment of engineering feasibility and desired plasma properties. Another tactic to reduce coil complexity is replacing non-planar modular coils by permanent magnets [103, 246].

2.
Non-convexity - The optimization problems that arise in stellarator design are often non-convex (except for the current potential methods described in Section 1.4.3). While convex optimization problems can be solved in polynomial time (Chapter 1 in [29]), obtaining the global optimum of a non-convex optimization problem is generally NP-hard. As global optima are difficult to locate, it is common to apply algorithms that instead converge to local optima. Such methods are sensitive to the initial conditions and tend to get “stuck” in small local minima or saddle points. For this reason, it is very valuable to have initial configurations that are close to the desired configuration. One approach is to begin with an analytic construction of an equilibrium close to quasi-symmetry or omnigeneity by employing an expansion about the magnetic axis [139, 142, 193].

Gradient information is invaluable for obtaining the local minimum of an objective function. While there are some algorithms for derivative-free local optimization, they typically are only effective for small problems (Chapter 9 in [170]). Gradient information is also useful for global optimization; for example, with a multi-start approach, many local optimization problems are solved to approximately obtain the global minimum. As considerations of the gradient will be central to this Thesis, we will discuss this topic further in Chapter 2.

In Figure 1.9 we show a benchmark of several optimization methods on the Rosenbrock function,
\[
f\left(\{x_i\}_{i=1}^{N}\right) = \sum_{i=1}^{N-1} \left[ 100 \left( x_{i+1} - x_i^2 \right)^2 + \left( x_i - 1 \right)^2 \right], \tag{1.21}
\]
with $N = 2$, a non-convex function with a long, thin valley that is often used to benchmark optimization algorithms. We can note that the gradient-based BFGS method converges rather directly toward the optimum. In contrast, the gradient-free particle swarm method takes a scattered trajectory and requires many additional function evaluations.

3. High-dimensionality - Often, the optimization problems that arise in stellarator design require navigation through the high-dimensional spaces that describe the outer boundary of the plasma or coil shapes. While such shapes are infinite-dimensional in reality, often they are parameterized with Fourier series, and only a finite number of modes are retained during the optimization.
The number of parameters used in practice to describe such shapes is typically $\mathcal{O}(10^{\ldots})$ [242]. We show a benchmark of the $N$-dimensional Rosenbrock function (1.21) in Figure 1.10, noting that the number of function evaluations required to obtain the optimum scales poorly with $N$ for gradient-free methods and for gradient-based methods with finite-difference gradients. As computing the gradient with a finite-difference method requires $\mathcal{O}(N)$ function evaluations, the associated cost is reduced significantly if analytic derivatives are available. Stellarator equilibrium optimization has historically proceeded with gradient-free methods, such as genetic algorithms [161] and the Brent algorithm [59], or gradient-based methods with finite-difference gradient calculations [213]. Recently, gradient-based optimization of coil shapes has begun to take advantage of analytic gradient and Hessian calculations [243, 244]. However, for many functions of interest, it is not so simple to compute the analytic derivative, as the objective function may depend on the solution to a system of equations. For such objectives, analytic derivatives can be computed with an adjoint method. This topic will be discussed in detail in Chapter 2 and throughout the Thesis.

4. Tight engineering tolerances - Once an optimal design is identified, engineering and metrology coil tolerances must be determined from the allowable deviations of physics parameters. In the NCSX design, it was determined that coil tolerances of ≈ […]

“[…] schedule stretch-out which has a large management overhead cost.”

One approach to address this challenge is to optimize the expected value of an objective function over a distribution of possible deviations, known as stochastic optimization. This technique has been shown to increase the tolerances of an optimized coil set [150, 151]. There has also been a recent development of tools for the efficient evaluation of tolerance information to avoid costly parameter scans or Monte Carlo sampling methods [31, 88]. The eigenvectors of the Hessian matrix illuminate the most sensitive perturbation directions at a local minimum [243, 245], and in this Thesis, we will discuss the shape gradient approach [138].

Figure 1.9: […] $(x_1, x_2) = (10, 10)$ and converges to the optimum at $(1, 1)$ in 58 function evaluations, using an analytic gradient to obtain the descent direction. The particle swarm optimization is initialized with a swarm of 20 particles at $(10, 10)$ and converges to the optimum at $(1, 1)$ in 3400 evaluations. The gradient-based method converges more directly toward a minimum, while the gradient-free method converges in a scattered way requiring excessive function evaluations. For (a), the optimization was terminated when the maximum of the absolute value of the gradient elements was less than $10^{-\ldots}$, and for (b), the optimization was terminated when the relative change in the objective function over the previous 20 iterations was less than $10^{-\ldots}$.

Figure 1.10: The number of function evaluations required for convergence to the minimum of the $N$-dimensional Rosenbrock function (1.21) as a function of the dimension. Results are shown for the gradient-based BFGS algorithm with finite-difference and analytic gradients and the gradient-free particle swarm method. We note that the gradient-free and finite-difference gradient-based methods scale poorly with the dimension. Knowledge of analytic gradients reduces the associated cost by several orders of magnitude in comparison. The cost reduction provided by analytic derivatives increases with increasing dimension. For the BFGS algorithm the optimization was terminated when the maximum of the absolute value of the gradient elements was less than $10^{-\ldots}$, and for the particle swarm algorithm the optimization was terminated when the relative change in the objective function over the previous 20 iterations was less than $10^{-\ldots}$.

This Thesis aims to address each of the challenges outlined in the previous Section. The focus will be on adjoint methods, which allow for efficient analytic gradient calculations. With such gradient information available, we can navigate through high-dimensional, non-convex spaces that arise in stellarator design with gradient-based methods, addressing challenges 2 and 3. Derivatives obtained from the adjoint method can also be used to analyze local sensitivity to perturbations using the shape gradient, addressing challenge 4. Specific applications of the adjoint method described in this Thesis will enable efficient free-boundary coil optimization or coupled coil-plasma optimization, addressing challenge 1.

We begin in Chapter 2 with an introduction to some mathematical fundamentals that lay the groundwork for this Thesis, including an overview of shape optimization and adjoint methods. Chapter 3 describes an adjoint method for the optimization of the coil winding surface for minimal coil complexity. Chapter 4 describes an adjoint method for the optimization of several neoclassical figures of merit local to a magnetic surface, including radial fluxes and the bootstrap current. Chapter 5 describes an adjoint method for the optimization of functions which depend on MHD equilibrium solutions, such as those that arise in fixed and free-boundary optimization. The adjoint method discussed in Chapter 5 requires the solution of linearized MHD equilibrium equations, which are discussed in Chapter 6. In Chapter 7, we summarize and discuss ongoing and future research related to this Thesis.

Chapter 2 Mathematical fundamentals
The design of a stellarator requires optimizing in the space of shapes: equilibrium designinvolves optimization of the shape of the plasma boundary, S P , and coil design involves op-timization of the shapes of filamentary coils or toroidal winding surfaces. The mathematicalfield of shape optimization has developed to study such problems, contributing to the designof aerodynamic car bodies [180] and airplane wings with increased lift [162]. In this Section,we briefly outline several concepts from this field. We refer to several fundamental textbooks[40, 52, 91, 191] and a Ph.D. thesis with a gentler introduction [47]. Consider some functional, f , which depends on the shape of some domain, Γ. In order tocompute the derivative of f , we must first identify a deformation field, δ x , which describesthe change of the shape. If the shape begins in a state Γ, the shape deformed in the direction δ x by magnitude (cid:15) is Γ (cid:15) = { x + (cid:15)δ x ( x ) : x ∈ Γ } . In this way, we can define the shapederivative of f as, δf (Γ; δ x ) ≡ lim (cid:15) → f (Γ (cid:15) ) − f (Γ) (cid:15) . (2.1)This is a functional derivative in the direction δ x (a Gateaux functional derivative).We can prove some useful properties of the shape derivative for specific choices of func-tional, J (Γ) = (cid:90) Γ d x j (Γ) (2.2a) J (Γ) = (cid:90) ∂ Γ d x j (Γ) , (2.2b)volume and surface integrals.For volume-integrated functionals, the shape derivative can be evaluated by noting theJacobian of the transformation x ∈ Γ → x ∈ Γ (cid:15) is given by I + (cid:15) ∇ δ x , where I is the identity23ensor. This allows us to relate the volume integral over Γ (cid:15) to a volume integral over Γ, δJ (Γ; δ x ) = lim (cid:15) → (cid:15) (cid:32)(cid:90) Γ (cid:15) d x j (Γ (cid:15) ) − (cid:90) Γ d x j (Γ) (cid:33) = lim (cid:15) → (cid:15) (cid:90) Γ d x (cid:2) det ( I + (cid:15) ∇ δ x ) j (Γ (cid:15) ) | x + (cid:15)δ x − j (Γ) (cid:3) . 
(2.3)Noting that j (Γ (cid:15) ) | x + (cid:15)δ x = j (Γ) | x + (cid:15)δj (Γ; δ x ) + (cid:15)δ x · ∇ j (Γ) + O ( (cid:15) ) we have, δJ (Γ; δ x ) = (cid:90) Γ d x (cid:32) δj (Γ; δ x ) + δ x · ∇ j (Γ) + dd(cid:15) (cid:0) det( I + (cid:15) ∇ δ x ) (cid:1) (cid:12)(cid:12)(cid:12)(cid:12) (cid:15) =0 j (Γ) (cid:33) . (2.4)The derivative of the determinant of a matrix can be computed from Jacobi’s formula, d/dt (cid:0) det( A ( t )) (cid:1) = det( A ( t ))tr( A ( t ) − A (cid:48) ( t )), δJ (Γ; δ x ) = (cid:90) Γ d x (cid:2) δj (Γ; δ x ) + δ x · ∇ j (Γ) + ( ∇ · δ x ) j (Γ) (cid:3) . (2.5)From the divergence theorem, we arrive at the following form for the shape derivative ofvolume-integrated functionals, δJ (Γ; δ x ) = (cid:90) Γ d x δj (Γ; δ x ) + (cid:90) ∂ Γ d x δ x · ˆ n j (Γ) . (2.6)The first term accounts for the Eulerian change to j while the second term accounts forthe motion of the boundary. In fluid mechanics, this relation is sometimes referred to asthe Reynolds transport theorem (Chapter 2 in [145]), which describes the time derivativeof integrated quantities associated with a moving fluid. A physical picture of this result isgiven in Figure 2.1.We can now use (2.6) to obtain the shape derivative of the surface-integrated functional(2.2b). To do so, we recall that the normal vector can be expressed as ˆ n = ∇ b | ∂ Γ , where b is the signed distance function [179], b ( x ) = − d ( x , ∂ Γ) x ∈ Γ0 x ∈ ∂ Γ d ( x , ∂ Γ) x (cid:54)∈ Γ , (2.7)and d ( x , ∂ Γ) is the shortest distance from x to any point on ∂ Γ. This can be seen by notingthat ˆ n points outward, in the direction of increasing b ( x ), and the shortest path betweena point near ∂ Γ and ∂ Γ will be along the normal direction. As b ( x ) measures Euclidiandistance, ∇ b has unit length.We can now apply the divergence theorem to write (2.2b) as J (Γ) = (cid:90) Γ d x ∇ · (cid:0) j (Γ) ∇ b (Γ) (cid:1) . 
(2.8)

We apply the transport theorem for volume-integrated functionals (2.6) to obtain,

$$\delta J_2(\Gamma; \delta\mathbf{x}) = \int_{\partial\Gamma} d^2x \left[ \delta\mathbf{x} \cdot \hat{\mathbf{n}} \left( \hat{\mathbf{n}} \cdot \nabla j_2 + j_2 \nabla^2 b \right) + \nabla b \cdot \nabla \delta b(\Gamma; \delta\mathbf{x}) + \delta j_2(\Gamma; \delta\mathbf{x}) \right].$$ (2.9)

Figure 2.1: (a) An unperturbed volume, $\Gamma$. (b) The normal perturbation field of magnitude $\epsilon\,\delta\mathbf{x} \cdot \hat{\mathbf{n}}$ (black) and the perturbed volume, $\Gamma_\epsilon$ (green). We can see that the linear change in volume associated with the perturbation field is $\delta V = \int_{\partial\Gamma} d^2x \, \delta\mathbf{x} \cdot \hat{\mathbf{n}}$.

We can interchange shape and spatial derivatives to see that $\nabla b \cdot \nabla \delta b = \tfrac{1}{2} \delta(\nabla b \cdot \nabla b) = 0$, as $\nabla b$ will remain a unit vector. We can also recognize that the mean curvature, $H$, is related to the normal vector by $H = \tfrac{1}{2} \nabla_{\partial\Gamma} \cdot \hat{\mathbf{n}}$, where $\nabla_{\partial\Gamma} \cdot \mathbf{f} = \nabla \cdot \mathbf{f} - \hat{\mathbf{n}} \cdot (\nabla \mathbf{f}) \cdot \hat{\mathbf{n}}$ is the tangential divergence operator. (Sometimes $H$ is defined with the opposite sign.) For surface-integrated functionals we therefore obtain the following shape derivative,

$$\delta J_2(\Gamma; \delta\mathbf{x}) = \int_{\partial\Gamma} d^2x \left[ \delta j_2(\Gamma; \delta\mathbf{x}) + \left( \hat{\mathbf{n}} \cdot \nabla j_2 + 2 H j_2 \right) \delta\mathbf{x} \cdot \hat{\mathbf{n}} \right].$$ (2.10)

The first term accounts for the Eulerian change to $j_2$, while the second and third terms account for the motion of the boundary. As one would expect, an outward perturbation of a surface with large mean curvature leads to a large change in the area. See Figure 2.2 for a physical picture.

We can already see from (2.6) and (2.10) that the shape derivatives of volume and surface-integrated functionals involve integrals over the boundary. It may appear that to understand the form of these shape derivatives, we will need to specify the structure of $j_1(\Gamma)$ and $j_2(\Gamma)$. However, we can make a more general statement about shape derivatives of any form. The Hadamard-Zolesio structure theorem [52, 87] states that the shape derivative of a general functional of the domain $\Gamma$ with sufficient smoothness can be expressed as,

$$\delta J(\Gamma; \delta\mathbf{x}) = \int_{\partial\Gamma} d^2x \, \delta\mathbf{x} \cdot \hat{\mathbf{n}} \, \mathcal{G},$$ (2.11)

where $\mathcal{G}$ is called the shape gradient.
This is an example of the Riesz representation theorem, which (roughly) states that any linear functional can be expressed as an inner product with an element of the appropriate space (Chapter 4 in [199]). The shape derivative is a linear functional of the normal perturbation to the boundary, $\delta\mathbf{x} \cdot \hat{\mathbf{n}}$, and can be expressed as a surface integral with the shape gradient. This form is especially powerful for computation, as the deformation field only needs to be defined on the boundary, and the derivative can be written in terms of a surface integral rather than a volume integral. Intuitively, linear changes to a functional only depend on normal perturbations of the boundary. If the shape gradient can be determined, then for any possible deformation field, $\delta\mathbf{x}$, the corresponding change to the functional, $\delta J(\Gamma; \delta\mathbf{x})$, is known. We can think of $\mathcal{G}$ as being a measure of the local sensitivity: regions of increased $|\mathcal{G}|$ correspond to regions of increased sensitivity of $J(\Gamma)$ with respect to normal perturbations.

Footnote: Under the assumption of sufficient smoothness, spatial and shape derivatives can be shown to commute by noting that $\mathbf{x}$ and $\Gamma$ are independent variables (Chapter 6 in [40]).

Figure 2.2: ... $\partial\Gamma$, shown as the blue and red lines, with curvatures $\kappa_1$ and $\kappa_2$, respectively. The unperturbed surface area element bounded by the principal directions is given by $dA = l_1 l_2$. Upon a normal displacement of magnitude $\epsilon\,\delta\mathbf{x} \cdot \hat{\mathbf{n}}$, the new area element is given by $(dA)_\epsilon = l_1 l_2 (1 + \kappa_1 \epsilon\,\delta\mathbf{x} \cdot \hat{\mathbf{n}})(1 + \kappa_2 \epsilon\,\delta\mathbf{x} \cdot \hat{\mathbf{n}})$, so the linear change in the area element is $\delta A = (dA)\, 2H\, \delta\mathbf{x} \cdot \hat{\mathbf{n}}$, where $H = (\kappa_1 + \kappa_2)/2$ is the mean curvature.

For stellarator optimization, we are also interested in functionals which depend on the shape of a set of filamentary lines, $C = \{C_k\}$. We expect that perturbations of the coils in the tangential direction will not result in a linear change to the functional.
We can, therefore, write the shape derivative in a form analogous to the structure theorem (2.11) by the Riesz representation theorem,

$$\delta f(C; \delta\mathbf{x}_{C_k}) = \sum_k \oint_{C_k} dl \, \delta\mathbf{x}_{C_k} \times \hat{\mathbf{t}} \cdot \mathbf{G}_k,$$ (2.12)

where $\hat{\mathbf{t}}$ is the tangent vector, integration is taken along each coil, and the sum is taken over all coils. As a curve has two independent directions perpendicular to the tangent vector, the shape gradient is now a vector, $\mathbf{G}_k$. Its direction indicates the direction of perturbation which leads to the largest increase in the functional, and its magnitude indicates the level of sensitivity to a given perturbation.

To motivate this form of the coil shape gradient, we consider the example of the magnetic field computed from the Biot-Savart law applied to a set of filamentary coils $\{C_k\}$,

$$\mathbf{B}(\mathbf{x}, C) = \frac{\mu_0}{4\pi} \sum_k I_{C_k} \oint_{C_k} dl \, \frac{\hat{\mathbf{t}}(l) \times (\mathbf{x} - \mathbf{x}_k(l))}{|\mathbf{x} - \mathbf{x}_k(l)|^3},$$ (2.13)

where $\mathbf{x}_k$ is the position along the $k$th coil and $\hat{\mathbf{t}} = \mathbf{x}_k'(l)$ is the unit tangent vector. The shape derivative of the magnetic field can now be computed with respect to a coil perturbation field $\delta\mathbf{x}$ by considering the perturbation of a general closed line integral $Q_L(C) = \oint_C dl \, Q(C)$ [9, 138],

$$\delta Q_L(C; \delta\mathbf{x}) = \oint_C dl \left( \delta\mathbf{x} \cdot \left( -\boldsymbol{\kappa} Q + \left( \mathbf{I} - \hat{\mathbf{t}}\hat{\mathbf{t}} \right) \cdot \nabla Q \right) + \delta Q(C; \delta\mathbf{x}) \right),$$ (2.14)

where $\boldsymbol{\kappa}(l) = \hat{\mathbf{t}}'(l)$ is the curvature vector.

Upon application of this identity and integration by parts, we obtain,

$$\delta\mathbf{B}(\mathbf{x}, C; \delta\mathbf{x}_k) = \frac{\mu_0}{4\pi} \sum_k I_{C_k} \oint_{C_k} dl \, \delta\mathbf{x}_k \times \hat{\mathbf{t}}(l) \cdot \left( -\frac{\mathbf{I}}{|\mathbf{x} - \mathbf{x}_k(l)|^3} + \frac{3 (\mathbf{x} - \mathbf{x}_k(l)) (\mathbf{x} - \mathbf{x}_k(l))}{|\mathbf{x} - \mathbf{x}_k(l)|^5} \right),$$ (2.15)

where $\mathbf{I}$ is the identity tensor. Thus the shape derivative of a figure of merit that depends on the vacuum magnetic field through the Biot-Savart law can be expressed in the coil shape gradient form (2.12). In Chapter 5 we will show explicit examples of other figures of merit that can be expressed in this form.
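As a concrete illustration of the Biot-Savart law (2.13), the following Python sketch evaluates the line integral for a single filamentary coil that has been discretized into straight segments with a midpoint rule, and compares the result with the textbook on-axis field of a circular loop, $B_z = \mu_0 I R^2 / [2 (R^2 + z^2)^{3/2}]$. The function names (`biot_savart`, `circular_coil`), the segment count, and the current value are illustrative assumptions and are not taken from any coil code discussed in this Thesis.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability [T m / A]


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def biot_savart(x, coil_points, current):
    """Midpoint-rule approximation of (2.13) for one closed filament.

    coil_points is an ordered list of points on the curve; consecutive points
    (and the last with the first) are joined by straight segments.
    """
    B = [0.0, 0.0, 0.0]
    n = len(coil_points)
    for i in range(n):
        p0 = coil_points[i]
        p1 = coil_points[(i + 1) % n]                       # close the curve
        dl = tuple(p1[j] - p0[j] for j in range(3))         # t_hat * dl
        mid = tuple(0.5 * (p0[j] + p1[j]) for j in range(3))
        r = tuple(x[j] - mid[j] for j in range(3))          # x - x_k(l)
        r3 = math.sqrt(sum(c * c for c in r)) ** 3
        c = cross(dl, r)
        for j in range(3):
            B[j] += MU0 * current / (4.0 * math.pi) * c[j] / r3
    return tuple(B)


def circular_coil(radius, n_points):
    """Points on a circle in the z = 0 plane, traversed counter-clockwise."""
    return [(radius * math.cos(2.0 * math.pi * i / n_points),
             radius * math.sin(2.0 * math.pi * i / n_points),
             0.0) for i in range(n_points)]


R, I, z = 1.0, 1.0e5, 0.5                                   # assumed test values
B_num = biot_savart((0.0, 0.0, z), circular_coil(R, 400), I)
B_exact = MU0 * I * R**2 / (2.0 * (R**2 + z**2) ** 1.5)
rel_err = abs(B_num[2] - B_exact) / B_exact
```

With 400 segments the relative error of the axial component is well below $10^{-3}$, and it decreases quadratically with the number of segments. A production code would typically use the Fourier representation of the coil and a spectrally accurate quadrature instead of straight segments.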
In practice, it may be convenient to describe a shape by a set of parameters, $\Omega$. We can relate the shape derivative and shape gradient defined in the previous Section to derivatives with respect to such parameters.

Suppose that we have a surface described by a set of parameters, $\Omega$. For example, in the context of stellarator equilibrium calculations, the plasma boundary is often described by a set of Fourier coefficients of the cylindrical coordinates, $\{R^c_{m,n}, Z^s_{m,n}\}$,

$$R = \sum_{m,n} R^c_{m,n} \cos(m\theta - n N_P \phi)$$ (2.16a)

$$Z = \sum_{m,n} Z^s_{m,n} \sin(m\theta - n N_P \phi).$$ (2.16b)

Here $\theta$ is a poloidal angle, $\phi$ is a toroidal angle, and the configuration is assumed to possess stellarator symmetry, which implies that $R(-\theta, -\phi) = R(\theta, \phi)$ and $Z(-\theta, -\phi) = -Z(\theta, \phi)$ [53]. The number of periods is $N_P$, representing the discrete rotational symmetry of the equilibrium (Section 12 in [121]). This is the representation of the boundary shape used in the VMEC code [111].

In this case, we can compute the shape derivative corresponding to perturbations of each parameter, $\delta\mathbf{x} = \big( \partial\mathbf{x}(\Omega)/\partial\Omega_i \big) \delta\Omega_i$,

$$\delta J(\Gamma(\Omega); \delta\mathbf{x}) = \frac{\partial J(\Gamma(\Omega))}{\partial\Omega_i} \delta\Omega_i,$$ (2.17)

by expressing our functional as a function of the parameters. We apply the structure theorem (2.11) to obtain the following expression,

$$\frac{\partial J(\Gamma(\Omega))}{\partial\Omega_i} = \int_{\partial\Gamma} d^2x \, \frac{\partial\mathbf{x}(\Omega)}{\partial\Omega_i} \cdot \hat{\mathbf{n}} \, \mathcal{G}.$$ (2.18)

Given $\partial J(\Gamma(\Omega))/\partial\Omega_i$ and $\partial\mathbf{x}(\Omega)/\partial\Omega_i$, we can consider this to be a linear system for $\mathcal{G}$. For numerical calculation, the above can be discretized using a collocation method or by expanding $\mathcal{G}$ in a set of basis functions. Often the linear system is not square, in which case an SVD or QR decomposition can be used.

Now suppose that our coils are described by a set of parameters, $\Omega$.
For example, the Cartesian components of the filamentary line can be described by a Fourier series,

$$x^k = \sum_m X^k_{cm} \cos(m\theta) + X^k_{sm} \sin(m\theta)$$ (2.19a)

$$y^k = \sum_m Y^k_{cm} \cos(m\theta) + Y^k_{sm} \sin(m\theta)$$ (2.19b)

$$z^k = \sum_m Z^k_{cm} \cos(m\theta) + Z^k_{sm} \sin(m\theta),$$ (2.19c)

where $\theta \in [0, 2\pi]$ is an angle parameterizing each curve. Again we compute the shape derivative corresponding to perturbations of each parameter, $\delta\mathbf{x}_{C_k} = \big( \partial\mathbf{x}_{C_k}(\Omega)/\partial\Omega_i \big) \delta\Omega_i$,

$$\delta f(C; \delta\mathbf{x}_{C_k}) = \frac{\partial f(\{C_k(\Omega)\})}{\partial\Omega_i} \delta\Omega_i,$$ (2.20)

to obtain,

$$\frac{\partial f(C)}{\partial\Omega_i} = \sum_k \oint_{C_k} dl \, \frac{\partial\mathbf{x}_{C_k}(\Omega)}{\partial\Omega_i} \times \hat{\mathbf{t}} \cdot \mathbf{G}_k.$$ (2.21)

As with the case of functionals of surfaces, we can consider the above to be a linear system for $\mathbf{G}_k$ that can be solved numerically. An overview of this method and examples of its application for figures of merit relevant for stellarator optimization are provided in [138].

The shape derivatives computed in this Section are quite general, applying to any functional of surfaces, volumes, or lines. For some problems we will be able to use the expressions for the shape derivatives, (2.6) and (2.10), to obtain an explicit expression for the shape gradient. For example, if we consider the volume functional, (2.2a) with $j_1 = 1$, then we see from (2.6) that the shape gradient will be $\mathcal{G} = 1$. If we consider the surface functional, (2.2b) with $j_2 = 1$, then we see from (2.10) that the shape gradient will be $\mathcal{G} = 2H$. However, for many functionals, this type of explicit calculation is not possible. We are often interested in functionals which depend on solutions of a PDE, in which case we can compute the shape gradient by solving an additional PDE, known as an adjoint equation.
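The two explicit shape gradients quoted above, $\mathcal{G} = 1$ for the volume and $\mathcal{G} = 2H$ for the surface area, can be checked on a sphere of radius $r$, for which a uniform unit normal displacement ($\delta\mathbf{x} \cdot \hat{\mathbf{n}} = 1$) is simply a change of radius. The sketch below is a minimal check under the assumption that $H$ denotes the average of the two principal curvatures, so that $H = 1/r$ on a sphere; the variable names are illustrative.

```python
import math


def volume(r):
    """Volume enclosed by a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r**3


def area(r):
    """Surface area of a sphere of radius r."""
    return 4.0 * math.pi * r**2


r, eps = 2.0, 1.0e-6

# Shape derivatives by central finite differences in the radius.
dV_fd = (volume(r + eps) - volume(r - eps)) / (2.0 * eps)
dA_fd = (area(r + eps) - area(r - eps)) / (2.0 * eps)

# Structure theorem (2.11) with delta_x . n = 1 everywhere on the sphere:
# the surface integral reduces to (surface area) * (shape gradient).
H = 1.0 / r                      # assumed convention: H = (kappa_1 + kappa_2) / 2
dV_shape = area(r) * 1.0         # shape gradient G = 1 for the volume
dA_shape = area(r) * 2.0 * H     # shape gradient G = 2H for the area
```

Both pairs agree to within the finite-difference error, recovering $dV/dr = 4\pi r^2$ and $dA/dr = 8\pi r$.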
We describe the adjoint method in more detail in the following Section.

For other problems, it may be more convenient to compute the shape derivative from parameter derivatives, as in (2.17) and (2.20), rather than applying the transport theorems. The shape gradient can then be inferred by solving the corresponding linear systems, (2.18) and (2.21). Sometimes these parameter derivatives can be obtained analytically or with an adjoint method; otherwise, they are obtained with a finite-difference method.

As the shape gradient measures the local sensitivity of a figure of merit to perturbations of a shape, we can use it to quantify the uncertainty in a figure of merit given a distribution of small perturbations to the shape. As shown in [138], the plasma surface or coil shape gradient can be used to determine the allowable deformations of a shape given a permissible change to a figure of merit. Suppose a figure of merit $f$ has an allowable deviation $\Delta f$ (in either direction). If we define a local tolerance for the $k$th coil as,

$$T_k(l) = \frac{w_k(l) \Delta f}{\sum_{k'} \oint_{C_{k'}} dl' \, w_{k'}(l') |\mathbf{G}_{k'}(l')|},$$ (2.22)

such that the perturbation amplitude satisfies $|\delta\mathbf{x}_{C_k}(l) \times \hat{\mathbf{t}}(l)| \le T_k(l)$ along the $k$th coil, then the change of the figure of merit will be,

$$|\delta f(C; \delta\mathbf{x}_{C_k})| \le \sum_k \oint_{C_k} dl \, |\delta\mathbf{x}_{C_k} \times \hat{\mathbf{t}} \cdot \mathbf{G}_k| \le \sum_k \oint_{C_k} dl \, T_k |\mathbf{G}_k| = \Delta f,$$ (2.23)

upon application of the triangle and Cauchy-Schwarz inequalities. Here $w_k(l)$ is a weight function which allows for the distribution of tolerance to be non-uniform along the coil. In identifying such a tolerance we have relied on a local approximation of the function, considering small-amplitude perturbations such that a linear approximation is valid.

Similarly, a tolerance with respect to perturbations of a surface can be defined with respect to the surface shape gradient,

$$T = \frac{w \Delta f}{\int_{\partial\Gamma} d^2x \, w |\mathcal{G}|},$$ (2.24)

where $w$ is a weight function defined on the surface $\partial\Gamma$.
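The coil tolerance (2.22) and the bound (2.23) can be illustrated on a single discretized coil. In the Python sketch below, the coil is a unit circle sampled uniformly in arclength, and both $\delta\mathbf{x} \times \hat{\mathbf{t}}$ and $\mathbf{G}$ are stored by their two components in the plane perpendicular to the tangent. The particular shape gradient, weight function, and allowable deviation are made-up test data, and `coil_tolerance` and `delta_f_linear` are hypothetical helper names.

```python
import math
import random


def coil_tolerance(G, w, dl, delta_f):
    """Local tolerance (2.22) for one coil sampled with uniform spacing dl."""
    norm = sum(wi * math.hypot(Gi[0], Gi[1]) * dl for wi, Gi in zip(w, G))
    return [wi * delta_f / norm for wi in w]


n = 200
dl = 2.0 * math.pi / n                       # arclength element on a unit circle
theta = [i * dl for i in range(n)]

# Made-up samples of the shape gradient (two normal-plane components) and weight.
G = [(2.0 + math.cos(3.0 * t), 0.3 * math.sin(2.0 * t)) for t in theta]
w = [1.0 + 0.5 * math.cos(t) for t in theta]
delta_f = 1.0e-2                             # allowable deviation of the figure of merit
T = coil_tolerance(G, w, dl, delta_f)


def delta_f_linear(a):
    """Quadrature of (2.12), where a holds samples of delta_x cross t_hat."""
    return sum((ai[0] * Gi[0] + ai[1] * Gi[1]) * dl for ai, Gi in zip(a, G))


# Worst case: perturbation aligned with G and of maximal allowed amplitude.
a_worst = [(Ti * Gi[0] / math.hypot(Gi[0], Gi[1]),
            Ti * Gi[1] / math.hypot(Gi[0], Gi[1])) for Ti, Gi in zip(T, G)]
df_worst = delta_f_linear(a_worst)

# Random perturbations with amplitude at most T(l) stay within the bound.
random.seed(0)
trials = []
for _ in range(100):
    a = []
    for Ti in T:
        amp = Ti * random.random()
        ang = 2.0 * math.pi * random.random()
        a.append((amp * math.cos(ang), amp * math.sin(ang)))
    trials.append(delta_f_linear(a))
```

The aligned perturbation saturates the bound, giving a linear change equal to $\Delta f$ up to round-off, while uncorrelated random perturbations of the same maximal amplitude typically produce a much smaller change. This is why such a tolerance is conservative.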
For example, we could consider the tolerance of a figure of merit that depends on the position of the plasma boundary, $S_P$. If we constrain perturbations of the surface such that $|\delta\mathbf{x} \cdot \hat{\mathbf{n}}| \le T$, then we find that the corresponding change to the figure of merit is $\delta f \le \Delta f$. However, the deformation of a magnetic surface is not a quantity that can be directly controlled in an experiment, and inferring it requires equilibrium reconstruction methods [89].

A more practically relevant quantity is computed from the sensitivity to perturbations of the magnetic field, $S_B$, defined through,

$$\delta f(S_P; \delta\mathbf{x}) = \langle \mathcal{G} \rangle_\psi \, \delta V(\delta\mathbf{x}) + \int_{S_P} d^2x \, S_B \, \delta\mathbf{B}(\delta\mathbf{x}) \cdot \hat{\mathbf{n}},$$ (2.25)

where $\delta V$ and $\delta\mathbf{B}$ are the perturbations to the volume enclosed by $S_P$ and to the magnetic field resulting from a surface displacement of $\delta\mathbf{x}$, and $\langle \ldots \rangle_\psi$ is the flux-surface average (A.10). The quantity $S_B$, which quantifies the local sensitivity to perturbations of the magnetic field, is computed from the shape gradient as,

$$\mathbf{B} \cdot \nabla S_B = \langle \mathcal{G} \rangle_\psi - \mathcal{G}.$$ (2.26)

A tolerance with respect to magnetic field perturbations can then be constructed as,

$$T_B = \frac{w \Delta f}{\int_{S_P} d^2x \, w |S_B|},$$ (2.27)

for a chosen weight function $w$, such that if the normal magnetic perturbations satisfy $|\delta\mathbf{B} \cdot \hat{\mathbf{n}}| \le T_B$, then $\delta f \le \Delta f$. The tolerance with respect to magnetic perturbations can inform allowable coil deformations, location of trim coils, and position of current leads. In this way, important engineering tolerances are inferred, addressing objective 4 from Section 1.4.4.

An adjoint method is a numerical method for the efficient calculation of derivatives of an objective function that depends on the solution to some set of equations, known as the forward system.
At the heart of the adjoint method is the adjoint equation, in which the adjoint of the linearized forward operator appears in addition to an inhomogeneous term that depends on the objective function of interest.

There are other instances in which the adjoint operator may become useful. An adjoint Fokker-Planck equation is used to compute the quasilinear generation of current by RF waves [9] or to study runaway electron dynamics [148]. An adjoint gyrokinetic equation can also be used to analyze the evolution of free energy [141]. Finally, adjoint operators are used to predict and correct discretization error [78, 189] and perform efficient grid adaptation [231]. In this Chapter, we focus our attention on adjoints for efficient derivative calculations.

Adjoint methods were introduced by the optimal control theory community in the 1960s [74, 126], and were later adopted by the fluid dynamics community [190]. They have since been popularized for aeronautical design [123], car aerodynamics [180], geophysics [192], and nuclear fission reactor design [68]. Aside from the body of work associated with this Thesis, there is only one other example of the use of adjoint methods in fusion sciences: for the shape optimization of tokamak divertors based on adjoint fluid equations [47, 49, 50, 51]. We refer to several introductory articles on adjoint methods [4, 79, 192].

We begin our overview of adjoint methods with their application to objective functions that depend on the solution of finite-dimensional, discrete linear systems in Section 2.2.1. We will then generalize to objective functions that depend on the solution of infinite-dimensional, possibly nonlinear systems in Section 2.2.2. The two approaches are compared in Section 2.2.3.
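Suppose we would like to solve the optimization problem,

$$\min_\Omega f(\Omega, \vec{x}),$$ (2.28)

where $\vec{x}$ is the solution of a linear system,

$$\overleftrightarrow{A}(\Omega) \vec{x} = \vec{b}(\Omega).$$ (2.29)

Here $\overleftrightarrow{A}$ is an $N \times N$ matrix and $\vec{x}$ and $\vec{b}$ are $N \times 1$ vectors. Let $\Omega = \{\Omega_i\}_{i=1}^{N_\Omega}$ be a set of design parameters defining our optimization space. To minimize (2.28) with a gradient-based method, we compute the derivative with respect to $\Omega$ using the chain rule,

$$\frac{d f(\Omega, \vec{x}(\Omega))}{d\Omega} = \frac{\partial f(\Omega, \vec{x})}{\partial\Omega} + \left( \frac{\partial f(\Omega, \vec{x})}{\partial\vec{x}} \right)^T \frac{\partial\vec{x}(\Omega)}{\partial\Omega}.$$ (2.30)

Here $\partial f(\Omega, \vec{x})/\partial\vec{x}$ is the gradient of $f$ with respect to $\vec{x}$, a column vector. To evaluate $\partial\vec{x}(\Omega)/\partial\Omega$, we must compute linear perturbations of (2.29),

$$\frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega} \vec{x}(\Omega) + \overleftrightarrow{A}(\Omega) \frac{\partial\vec{x}(\Omega)}{\partial\Omega} = \frac{\partial\vec{b}(\Omega)}{\partial\Omega}.$$ (2.31)

We schematically evaluate the perturbation to the solution as,

$$\frac{\partial\vec{x}(\Omega)}{\partial\Omega} = \overleftrightarrow{A}(\Omega)^{-1} \left( \frac{\partial\vec{b}(\Omega)}{\partial\Omega} - \frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega} \vec{x}(\Omega) \right).$$ (2.32)

Inserting the result into (2.30), we obtain

$$\frac{d f(\Omega, \vec{x}(\Omega))}{d\Omega} = \frac{\partial f(\Omega, \vec{x})}{\partial\Omega} + \left( \frac{\partial f(\Omega, \vec{x})}{\partial\vec{x}} \right)^T \overleftrightarrow{A}(\Omega)^{-1} \left( \frac{\partial\vec{b}(\Omega)}{\partial\Omega} - \frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega} \vec{x}(\Omega) \right).$$ (2.33)

This approach to computing the derivative, the forward-sensitivity method, requires computing $N_\Omega + 1$ solutions to a linear system of size $N \times N$: we must solve (2.29) once for $\vec{x}$, and we must solve,

$$\overleftrightarrow{A}(\Omega) \vec{y} = \frac{\partial\vec{b}(\Omega)}{\partial\Omega_i} - \frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega_i} \vec{x}(\Omega),$$ (2.34)

for $\vec{y}$ once for each $\Omega_i$.

By rearranging parentheses, (2.33) is equivalent to,

$$\frac{d f(\Omega, \vec{x}(\Omega))}{d\Omega} = \frac{\partial f(\Omega, \vec{x})}{\partial\Omega} + \left( \left( \overleftrightarrow{A}(\Omega)^T \right)^{-1} \frac{\partial f(\Omega, \vec{x})}{\partial\vec{x}} \right)^T \left( \frac{\partial\vec{b}(\Omega)}{\partial\Omega} - \frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega} \vec{x}(\Omega) \right),$$ (2.35)

where we have noted that the transpose and inverse operations can be interchanged for any invertible matrix.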
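The equivalence of (2.33) and (2.35) can be checked numerically on a small system. The Python sketch below is a minimal illustration under several assumptions: a made-up $3 \times 3$ problem in which the matrix and right-hand side depend affinely on $N_\Omega = 2$ parameters, the objective $f = \vec{x} \cdot \vec{x}/2$ (so that $\partial f/\partial\vec{x} = \vec{x}$ and $\partial f/\partial\Omega = 0$), and a small dense Gaussian-elimination routine standing in for whatever linear solver would be used in practice. All names are illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def transpose(A):
    return [list(col) for col in zip(*A)]


# Made-up test problem: A(Omega) = A0 + sum_i Omega_i dA[i], likewise for b.
A0 = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 5.0]]
b0 = [1.0, 2.0, 3.0]
dA = [[[1.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]],
      [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]]
db = [[1.0, 0.0, -1.0], [0.0, 2.0, 0.0]]


def system(omega):
    n = len(b0)
    A = [[A0[r][c] + sum(o * dAi[r][c] for o, dAi in zip(omega, dA))
          for c in range(n)] for r in range(n)]
    b = [b0[r] + sum(o * dbi[r] for o, dbi in zip(omega, db)) for r in range(n)]
    return A, b


def objective(omega):
    """f = x.x / 2, with x the solution of the linear system (2.29)."""
    A, b = system(omega)
    x = solve(A, b)
    return 0.5 * dot(x, x)


omega = [0.1, -0.2]
A, b = system(omega)
x = solve(A, b)
dfdx = x[:]                                            # gradient of f with respect to x
# Right-hand sides of (2.34), one per parameter.
rhs = [[bi - ai for bi, ai in zip(dbi, matvec(dAi, x))] for dAi, dbi in zip(dA, db)]

grad_forward = [dot(dfdx, solve(A, r)) for r in rhs]   # (2.33): one solve per parameter
z = solve(transpose(A), dfdx)                          # one solve with the transposed matrix
grad_transposed = [dot(z, r) for r in rhs]             # (2.35)

# Independent check with central finite differences.
h = 1.0e-6
grad_fd = []
for i in range(len(omega)):
    up, dn = omega[:], omega[:]
    up[i] += h
    dn[i] -= h
    grad_fd.append((objective(up) - objective(dn)) / (2.0 * h))
```

The two analytic gradients agree to round-off, and both agree with the finite-difference estimate to within its truncation and round-off error. Note that `grad_transposed` needs a single additional linear solve regardless of the number of parameters, whereas `grad_forward` needs one per parameter.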
Thus we can see that if we compute the solution to the following adjoint equation,

$$\overleftrightarrow{A}(\Omega)^T \vec{z} = \frac{\partial f(\Omega, \vec{x})}{\partial\vec{x}},$$ (2.36)

then we can compute the derivative of the objective function in a more convenient way,

$$\frac{d f(\Omega, \vec{x}(\Omega))}{d\Omega} = \frac{\partial f(\Omega, \vec{x})}{\partial\Omega} + \vec{z}^T \left( \frac{\partial\vec{b}(\Omega)}{\partial\Omega} - \frac{\partial\overleftrightarrow{A}(\Omega)}{\partial\Omega} \vec{x}(\Omega) \right).$$ (2.37)

This method for computing the derivative, known as the adjoint method, only requires two solutions of a linear system of size $N \times N$: (2.29) and (2.36). In general, the partial derivatives of $\vec{b}(\Omega)$ and $\overleftrightarrow{A}(\Omega)$ can be computed analytically. In this way, no approximations are made in obtaining (2.37). The power of this approach becomes apparent in high-dimensional spaces: the adjoint method requires only two solutions of such linear systems, while the forward-sensitivity method requires $N_\Omega + 1$ solutions. Approximating the derivative with a finite-difference method also requires at least $N_\Omega + 1$ solutions, depending on the size of the stencil.

The approach presented in this Section can be understood as a linear algebra trick. We want to solve a linear system for many right-hand sides, as in (2.34). Moreover, we are only interested in a specific inner product with these solutions, (2.33). As we are allowed to interchange the transpose and inverse operations, we arrive at the adjoint form (2.36). If the partial derivatives of $\overleftrightarrow{A}(\Omega)$ and $\vec{b}(\Omega)$ can be computed analytically, and the adjoint equation is solved exactly, then no approximations are made here. In this sense, we can consider the adjoint-based derivative to be the exact analytic derivative. In practice, there may be a small amount of error introduced due to the finite tolerance of the linear solve.

Computational complexity comparison
We now compare the computational complexity of the forward-sensitivity method, the finite-difference method, and the adjoint method for computing the derivative. Here we will ignore any cost associated with constructing $\overleftrightarrow{A}(\Omega)$, $\vec{b}(\Omega)$, or their derivatives. For some matrix types (e.g. sparse) the number of required operations may be reduced from what is given here, but we simply try to estimate the relative costs. The flop counts for matrix computations can be found in standard references such as [226].

Table 2.1: Approximate flop counts for the forward-sensitivity, finite-difference, and adjoint method for calculation of the derivative.

Forward sensitivity: $4 N_\Omega N^2 + \tfrac{2}{3} N^3$
Finite difference: $\tfrac{2}{3} N_\Omega N^3$
Adjoint: $2 N_\Omega N^2 + \tfrac{2}{3} N^3$

For both the forward and adjoint sensitivity methods, we must form the right-hand side of (2.34) for each $\Omega_i$, each of which requires a matrix-vector product and a vector-vector sum for a combined cost of $\approx 2N^2 + N$ flops. The forward-sensitivity method requires solving (2.34) $N_\Omega$ times. For example, an LU factorization method can be used, which requires $\approx \tfrac{2}{3} N^3$ flops. Once the factorization is known, solving the system (2.31) via forward and backward substitution costs $\approx 2N^2$ flops for each $\Omega_i$. Once $\partial\vec{x}/\partial\Omega$ is obtained, $N_\Omega$ vector-vector products must be performed to obtain the derivatives of $f$ as in (2.33), each of which requires $2N$ flops. Thus the composite number of flops is $\approx 4 N_\Omega N^2 + \tfrac{2}{3} N^3$. With a finite-difference method, the total cost of computing $\partial\vec{x}/\partial\Omega$ requires at least $\approx \tfrac{2}{3} N_\Omega N^3$ flops, assuming that the linear solve is the most expensive step and a one-sided stencil is used.

Alternatively, the adjoint method for computing the derivative requires two linear solves. If an LU factorization method is used, then the matrix factorization $\overleftrightarrow{A} = \overleftrightarrow{L}\,\overleftrightarrow{U}$ can be reused to solve the adjoint system (2.36), as $\overleftrightarrow{A}^T = \overleftrightarrow{U}^T \overleftrightarrow{L}^T$, where $\overleftrightarrow{U}^T$ is lower-triangular and $\overleftrightarrow{L}^T$ is upper-triangular. Thus the cost of computing the two solutions requires $\approx \tfrac{2}{3} N^3 + 4N^2$ flops.
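The reuse of a single LU factorization for both the forward system (2.29) and the adjoint system (2.36) can be sketched as follows. This is a minimal pure-Python illustration, assuming a small, diagonally dominant test matrix so that a Doolittle factorization without pivoting is adequate; a practical implementation would use a pivoted factorization from a numerical library. The matrix, right-hand sides, and helper names are made up for the example.

```python
def lu_factor(A):
    """Doolittle factorization A = L U without pivoting (L has unit diagonal)."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for c in range(n):
        for r in range(c + 1, n):
            L[r][c] = U[r][c] / U[c][c]
            for k in range(c, n):
                U[r][k] -= L[r][c] * U[c][k]
    return L, U


def forward_sub(lower, b):
    """Solve lower y = b, using only the lower triangle and diagonal."""
    y = []
    for i, row in enumerate(lower):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def back_sub(upper, y):
    """Solve upper x = y, using only the upper triangle and diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(upper[i][k] * x[k] for k in range(i + 1, n))) / upper[i][i]
    return x


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [1.0, 0.0, 3.0]]   # made-up nonsymmetric matrix
b = [1.0, 2.0, 3.0]                                       # forward right-hand side
g = [0.5, -1.0, 2.0]                                      # stands in for df/dx

L, U = lu_factor(A)                                       # O(N^3) work, done once

# Forward system A x = b: L y = b, then U x = y.
x = back_sub(U, forward_sub(L, b))

# Adjoint system A^T z = g with A^T = U^T L^T: U^T w = g, then L^T z = w.
z = back_sub(transpose(L), forward_sub(transpose(U), g))

res_forward = max(abs(ri - bi) for ri, bi in zip(matvec(A, x), b))
res_adjoint = max(abs(ri - gi) for ri, gi in zip(matvec(transpose(A), z), g))
```

Both residuals are at the level of round-off, confirming that the adjoint solve costs only two additional triangular substitutions once the factors are available.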
Once the adjoint solution is obtained, $N_\Omega$ matrix-vector products and vector-vector sums must be computed in (2.37), each with cost $\approx 2N^2 + N$ flops. Again, $N_\Omega$ vector-vector products are required, each of which requires $\approx 2N$ flops. Thus the total complexity is $\approx 2 N_\Omega N^2 + \tfrac{2}{3} N^3$ flops, assuming large $N$. A summary of these approximate flop counts is given in Table 2.1.

We see that the adjoint method provides modest savings over the forward-sensitivity method when $N_\Omega$ is comparable to $N$. However, for many problems the assumptions made in this Section do not apply. In particular, if $\overleftrightarrow{A}$ is sparse, $\overleftrightarrow{L}$ and $\overleftrightarrow{U}$ will generally be dense, in which case the matrix-vector multiplication that appears on the right-hand side of (2.37) will be significantly cheaper than the back substitution needed to solve (2.34), and there will be a more significant savings with the application of the adjoint method over the forward-sensitivity method. For very large matrices it may be impractical to LU factorize $\overleftrightarrow{A}$. Instead, a preconditioner may be factorized, and the linear system is solved with a Krylov subspace iterative method. Again for such systems, solving the factorized system will be significantly more expensive than matrix-vector multiplication.

In comparison with finite differences, the adjoint method offers a reduction of complexity by $O(N_\Omega)$. The accuracy of the finite-difference method depends on the size of the stencil and choice of step size. While a wider stencil provides a more accurate derivative, it increases the number of required function evaluations. The step size must also be chosen carefully to avoid the introduction of noise: a large step size will introduce nonlinearity, while a small step size will introduce round-off error. For these reasons, the adjoint method is preferable over a finite-difference method.

2.2.2 Continuous approach

The adjoint method presented in the previous Section applies only to functions that depend on the solution of a linear system in a finite-dimensional space. We now generalize this result to obtain an adjoint equation in an infinite-dimensional space. Often in optimization, we are interested in an objective function which depends on the solution of a PDE,

$$L(\Omega, u) = 0,$$ (2.38)

such as the MHD equilibrium equations (1.3). Here $L$ is some linear or nonlinear operator, and $u$ is an unknown. We are optimizing with respect to a set of parameters, $\Omega$, which may generally be infinite-dimensional; for example, $\Omega$ may describe the shape of some domain. Our differential operator may depend on these parameters. We assume that $u$ is a member of some Hilbert space, $\mathcal{H}$, which possesses an inner product structure denoted by $\langle \cdot, \cdot \rangle$. If this PDE is linear, then the discretized form of this problem can generally be written as (2.29), and the adjoint equation can be obtained after discretization as described in the previous Section. The method described in this Section will allow us to obtain an adjoint equation before discretization.

We can consider $u$ to depend on $\Omega$ through the solution to (2.38). We perform linear perturbations about the base state (2.38) corresponding to perturbations of $\Omega$,

$$\delta L(\Omega, u; \delta\Omega) + \delta L\big(\Omega, u; \delta u(\Omega; \delta\Omega)\big) = 0.$$ (2.39)

Our objective function, $f(\Omega, u)$, is some linear or nonlinear scalar functional of $\Omega$ and $u$. Linear perturbations of $f(\Omega, u)$ can generally be written as an inner product with $\delta u$,

$$\delta f(\Omega, u; \delta u) = \left\langle \tilde{f}, \delta u \right\rangle.$$ (2.40)

This is another example of the Riesz representation theorem: as $\delta f$ is a linear functional of $\delta u$, we can express it as an inner product with $\tilde{f} \in \mathcal{H}$.

We are interested in computing linear perturbations to $f$ such that $u(\Omega)$ satisfies the PDE. The constrained problem is expressed through the objective function, $f(\Omega, u(\Omega))$, whose derivative with respect to $\Omega$ is computed to be,

$$\delta f(\Omega, u(\Omega); \delta\Omega) = \delta f(\Omega, u; \delta\Omega) + \left\langle \tilde{f}, \delta u(\Omega; \delta\Omega) \right\rangle,$$ (2.41)

and $\delta u(\Omega; \delta\Omega)$ satisfies (2.39).
This is an analogous expression to (2.33) in the discrete linear case. Computing the derivative in this way requires many solutions of a PDE: one solution of the initial base state (2.38) and one solution of (2.39) for each perturbation of the optimization parameters, $\delta\Omega$.

A more efficient method of computing these derivatives is by application of Lagrange multipliers, enforcing (2.38) as a constraint. We now define the corresponding Lagrangian as,

$$\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}) = f(\Omega, \tilde{u}) + \left\langle \tilde{\lambda}, L(\Omega, \tilde{u}) \right\rangle,$$ (2.42)

where $\tilde{\lambda} \in \mathcal{H}$ is a Lagrange multiplier. In the above expression, $\tilde{u} \in \mathcal{H}$ but it does not necessarily satisfy (2.38), hence the distinction by the tilde. If $\mathcal{L}$ is stationary with respect to $\tilde{\lambda}$, then $\tilde{u}$ is a weak solution of the PDE, indicated by $u$. If $\mathcal{L}$ is stationary with respect to $\tilde{u}$, then $\tilde{\lambda}$ will satisfy the weak form of an adjoint PDE, at which point we denote $\tilde{\lambda}$ by $\lambda$. If $\mathcal{L}$ is stationary with respect to both $\tilde{u}$ and $\tilde{\lambda}$, or $\tilde{u} = u$ and $\tilde{\lambda} = \lambda$, then derivatives of $\mathcal{L}$ with respect to $\Omega$ are equal to derivatives of $f$ with respect to $\Omega$,

$$\delta\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}; \delta\Omega) \big|_{\tilde{u} = u, \tilde{\lambda} = \lambda} = \delta f(\Omega, u(\Omega); \delta\Omega).$$ (2.43)

We will show this directly in a moment.

We now look for a stationary point of $\mathcal{L}$ with respect to $\tilde{u}$,

$$\delta\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}; \delta\tilde{u}) = \left\langle \tilde{f}, \delta\tilde{u} \right\rangle + \left\langle \tilde{\lambda}, \delta L(\Omega, \tilde{u}; \delta\tilde{u}) \right\rangle = 0.$$ (2.44)

We note that $\delta L(\Omega, \tilde{u}; \delta\tilde{u})$ is linear in $\delta\tilde{u}$, so we can write this schematically as,

$$\delta L(\Omega, \tilde{u}; \delta\tilde{u}) = \hat{L}(\Omega, \tilde{u}) \, \delta\tilde{u},$$ (2.45)

where $\hat{L}(\Omega, \tilde{u})$ is a linear operator. The adjoint of an operator $A$, which we denote by $A^\dagger$, is defined by $\langle A y, x \rangle = \langle y, A^\dagger x \rangle$ for $x, y \in \mathcal{H}$.
Thus we can rewrite the above as,

$$\delta\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}; \delta\tilde{u}) = \left\langle \tilde{f} + \hat{L}(\Omega, \tilde{u})^\dagger \tilde{\lambda}, \delta\tilde{u} \right\rangle = 0.$$ (2.46)

This is a weak form of the adjoint PDE,

$$\tilde{f} + \hat{L}(\Omega, \tilde{u})^\dagger \lambda = 0.$$ (2.47)

We indicate its solution by $\lambda$, as it corresponds with a stationary point of $\mathcal{L}$ with respect to $\tilde{u}$. We now see that if $\tilde{u}$ satisfies (2.38) and $\tilde{\lambda}$ satisfies (2.47), then derivatives of $f$ with respect to $\Omega$ are equal to derivatives of $\mathcal{L}$ with respect to $\Omega$,

$$\delta\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}; \delta\Omega) \big|_{\tilde{u} = u, \tilde{\lambda} = \lambda} = \delta f(\Omega, u; \delta\Omega) + \big\langle \lambda, \delta L(\Omega, u; \delta\Omega) \big\rangle = \delta f(\Omega, u; \delta\Omega) - \big\langle \lambda, \delta L\big(\Omega, u; \delta u(\Omega; \delta\Omega)\big) \big\rangle,$$ (2.48)

where we have used (2.39). If we now apply the definition of the adjoint operator, $\langle \lambda, \hat{L} \delta u \rangle = \langle \hat{L}^\dagger \lambda, \delta u \rangle$, and enforce that $\lambda$ satisfy the adjoint PDE (2.47), then we indeed obtain (2.41), as desired.

The adjoint method for computing the derivative of $f$ with respect to the parameters $\Omega$ is,

$$\delta f(\Omega, u(\Omega); \delta\Omega) = \delta\mathcal{L}(\Omega, \tilde{u}, \tilde{\lambda}; \delta\Omega) \big|_{\tilde{u} = u, \tilde{\lambda} = \lambda} = \delta f(\Omega, u; \delta\Omega) + \big\langle \lambda, \delta L(\Omega, u; \delta\Omega) \big\rangle.$$ (2.49)

This is the continuous analogue of (2.37). The first term corresponds with the explicit dependence of $f$ on $\Omega$, while the second term corresponds with the dependence through $u$. Note that, if (2.38) is satisfied, then we can choose $\lambda$ to be whatever we would like, as the second term in the Lagrangian functional (2.42) will always vanish. For some problems, other choices for $\lambda$ may be convenient, although (2.49) will no longer hold. In Chapter 5, a slightly different choice for the adjoint variable will be made. Rather than being a stationary point, boundary terms remain in the expression for $\delta\mathcal{L}(\Omega, u, \lambda; \delta u)$ (see (5.42)-(5.43) and (5.52)-(5.53)).

In practice, the infinite-dimensional optimization space may be approximated by a discrete set of parameters, $\Omega = \{\Omega_i\}_{i=1}^{N_\Omega}$.
Thus with the solution of only two PDEs, the forward (2.38) and adjoint (2.47) problems, we obtain the derivative of our objective function with respect to an arbitrary number of parameters. An alternative is the forward-sensitivity method, using (2.39) and (2.41), which requires $N_\Omega$ linear PDE solutions and one (possibly) nonlinear PDE solution, (2.38).

The finite-difference method requires at least $N_\Omega + 1$ (possibly) nonlinear PDE solutions, depending on the size of the stencil. Thus the adjoint method provides a significant advantage when $N_\Omega$ is large, assuming that the PDE solve is expensive in comparison with other operations, such as performing the inner products. It is not straightforward to compare the complexity of these methods as in Section 2.2.1, as the flop count will depend on the numerical methods used to solve a PDE. However, we can see that the adjoint method provides a reduction in the number of required PDE solves by $O(N_\Omega)$ over both the forward-sensitivity and finite-difference methods.

Of course, both the forward and adjoint PDEs are typically solved numerically by approximation in a finite-dimensional space. The accuracy of the derivative computed with the adjoint method will, therefore, depend on the tolerance to which the base state and adjoint PDEs are solved in addition to the discrepancy between the infinite-dimensional inner product and its finite-dimensional approximation.

We now see that there are two general strategies to the application of the adjoint method: obtaining the adjoint before discretization, the continuous adjoint approach, or obtaining the adjoint after discretization, the discrete approach. There are relative merits to each. With the discrete adjoint method, the accuracy of the derivative only depends on the tolerance to which the forward and adjoint systems are solved.
On the other hand, with the continuous method, it also depends on the discretization error of the PDE due to the difference between the infinite-dimensional inner product and its finite-dimensional approximation. The two approaches must agree in the limit of infinite resolution. In practice, the difference between the two is relatively small, though it has been suggested that the discrepancy between the continuous and discrete gradients may become important near a local minimum [47], where the gradient obtained from the continuous approach may not be a descent direction of the discretized problem.

The continuous approach offers the advantage that the adjoint equation can be derived independently of the choice of discretization; thus, if the adjoint equation has a significantly different structure from the forward equation, a distinct discretization scheme can be applied. It also may offer further insight into the structure of the adjoint equation and its boundary conditions. For this reason, the continuous approach may be preferable in the presence of shocks or singularities [79], as we demonstrate in Chapter 6. For both approaches, the resulting adjoint equation is linear. Implementation of the discrete method is sometimes more straightforward, as the adjoint and forward operators have the same eigenvalues, so the same numerical linear algebra methods can typically be used to solve both problems. As we will see in Chapter 4, if an LU factorization method is used to solve the linear system, then the factorization of the matrix or its preconditioner can be reused to solve the discrete adjoint problem. There is not a clear consensus in the literature as to which approach is preferable, and the choice usually depends on the application of interest.

With an adjoint method, optimization within a high-dimensional space is no longer a significant challenge.
An adjoint-based derivative provides a reduction of computational complexity over finite differences by approximately the optimization dimension, $N_\Omega$, as summarized in Table 2.1. Given that the cost of computing the gradient becomes comparable to the cost of the forward solve, we can easily take advantage of gradient-based optimization methods. For line-search gradient-based methods, each iteration reduces to a one-dimensional line search once a descent direction is identified [170]. Therefore with adjoint methods, high-dimensional, non-convex optimization becomes feasible, allowing us to address objectives 2 and 3 from Section 1.4.4.

In the following Chapters, we will demonstrate the application of shape calculus and adjoint methods for several problems arising in stellarator optimization. In Chapter 3 we describe a discrete adjoint method for the optimization of coil shapes based on the current potential method described in Section 1.4.3. With the derivatives obtained from the adjoint method, we compute a shape gradient with respect to perturbations of the coil-winding surface, allowing us to identify regions where figures of merit become sensitive to coil perturbations. In Chapter 4, we compare a continuous and discrete adjoint method for computing geometric derivatives of several neoclassical quantities. These geometric derivatives allow us to compute a sensitivity function for local magnetic field strength perturbations that is analogous to the shape gradient. In Chapter 5, we describe a continuous adjoint method for computing the shape gradient of quantities that depend on MHD equilibrium solutions. These shape gradients can be used for equilibrium optimization of the plasma boundary or coil shapes and sensitivity analysis. For this application, the adjoint equation contains singular behavior, so a distinct discretization and solution scheme are required, discussed in Chapter 6.

Chapter 3
Adjoint winding surface optimization
In this Chapter, we apply the linear adjoint approach described in Section 2.2.1 for the optimization of coil shapes. We assume that coils are confined to a winding surface using the current potential method introduced in Section 1.4.3. The application of the adjoint method will allow us to efficiently optimize in the space of the geometry of the coil-winding surface and study the sensitivity to local perturbations using the shape gradient.

The material in this Chapter has been adapted from [185] with permission.
In the traditional stellarator optimization method, coils are designed to produce a target outer plasma boundary. The plasma boundary is separately optimized for various physics quantities, including magnetohydrodynamic (MHD) stability, neoclassical confinement, and profiles of rotational transform and pressure [175]. The coil shapes are then optimized such that one of the magnetic surfaces approximately matches the desired plasma surface. In general, the desired plasma configuration cannot be produced exactly due to engineering constraints on the coil complexity. Additional difficulty is introduced by the ill-posedness of solving Laplace's equation numerically in the vacuum region for a prescribed normal magnetic field on the plasma boundary [25, 158].

In addition to the minimization of the magnetic field error, several factors should be considered in the design of coil shapes. The winding surface upon which the currents lie should be sufficiently separated from the plasma surface to allow for neutron shielding to protect the coils, the vacuum vessel, and a divertor system. In a reactor, the coil-plasma distance is closely tied to the tritium-breeding ratio and overall cost of electricity, as it determines the allowable blanket thickness. The coil-plasma distance was targeted in the ARIES-CS study to reduce machine size [60]. In practice, the minimum feasible coil-plasma separation is a function of the desired plasma shape. Concave regions (such as the bean-shaped W7-X cross-section) are especially challenging to produce [137] and require the winding surface to be near the plasma surface. While decreasing the inter-coil spacing minimizes ripple fields, increasing coil-coil spacing allows adequate space for removal of blanket modules, heat transport plumbing, diagnostics, and support structures. The curvature of a coil should be below a certain threshold to allow for the finite thickness of the conducting material and to avoid prohibitively high manufacturing costs.
The length of each coil should also be considered, as the expense will grow with the amount of conducting material that needs to be produced. For these reasons, identifying coils with suitable engineering properties can impact the size and cost of a stellarator device.

Most coil design codes have assumed the coils to lie on a closed toroidal winding surface enclosing the desired plasma surface. In NESCOIL [158], the currents on this surface are determined by minimizing the integral-squared normal magnetic field on the target plasma surface. The current density is computed using a stream function approach, where the current potential on the winding surface is decomposed in Fourier harmonics. The optimization takes the form of a least-squares problem that can be solved with the solution of a single linear system. The coil filament shapes are then obtained from the contours of the current potential. Because it is guaranteed to find a global minimum, NESCOIL is often used in the preliminary stages of the design process [57, 135, 212]. NESCOIL was used for the initial coil configuration studies for NCSX [194], and the W7-X coils were designed using an extension of NESCOIL, which modified the winding surface geometry for quality of magnetic surfaces and engineering properties of the coils [15]. However, the inversion of the Biot-Savart integral by NESCOIL is fundamentally ill-posed, resulting in solutions with amplified noise. The REGCOIL [136] approach addresses this problem with Tikhonov regularization. Here the surface-average-squared current density, corresponding to the squared-inverse distance between coils, is added to the objective function. With the addition of this regularization term, REGCOIL can simultaneously increase the minimum coil-coil distances and improve the reconstruction of the desired plasma surface over NESCOIL solutions.
In this Chapter, we build on the REGCOIL method to optimize the current distribution in three dimensions. The current distribution on a single winding surface is computed with REGCOIL, and the winding surface geometry is optimized to reproduce the plasma surface with fidelity and improve the engineering properties of the coil shapes.

Other nonlinear coil optimization tools exist which evolve discrete coil shapes rather than continuous surface current distributions. Drevlak's ONSET code [154] optimizes coils within limiting inner and outer coil surfaces. The COILOPT [216, 218] code, developed for the design of the NCSX coil set [242], optimizes coil filaments on a winding surface which is allowed to vary. COILOPT++ [32] improved upon COILOPT by defining coils using splines, which enables one to straighten modular coils to improve access to the plasma. The need for a winding surface was eliminated with the FOCUS [243] code, which represents coils as three-dimensional space curves. The FOCUS approach employs analytic differentiation for gradient-based optimization, as we do in this Chapter. As the design of optimal coils is central to the development of an economical stellarator, it is important to have several approaches. The current potential method could have several advantages, including the possible implementation of adjoint methods. Furthermore, the complexity of the nonlinear optimization is reduced over other approaches, as the current distribution on the winding surface is efficiently and robustly computed by solving a linear system. By optimizing the winding surface, it is possible to gain insight into what features of plasma surfaces require coils to be close to the plasma, and what features allow coils to be placed farther away [137].

Parallels can be drawn between the design of stellarator coils and the design of magnetic resonance imaging (MRI) coils.
MRI gradient coils which lie on a cylindrical winding surface must provide a specified spatial variation in the magnetic field within a region of interest. This inverse problem is often solved with a linear least-squares system by minimizing the squared departure from the desired field at specified points with respect to the current in differential surface elements [228]. This method is comparable to the NESCOIL [158] approach for stellarator coil design. Gradient coil design was improved by the addition of a regularization term related to the integral-squared current density [63] or the integral-squared curvature [62], comparable to the REGCOIL approach. The adjoint method has been applied to compute the sensitivity of an objective function with respect to the current potential on the MRI winding surface. Here the Biot-Savart law is written in terms of a matrix equation using the least-squares finite element method, and the adjoint of this matrix is inverted to compute the derivatives [124]. As the adjoint formalism has proven fruitful in this field, we anticipate that it could have similar applications in the closely-related field of stellarator coil design.

In the Sections that follow, we present a new method for the design of the coil-winding surface using adjoint-based optimization. An adjoint solve is performed to obtain gradients of several figures of merit, the integral-squared normal magnetic field on the plasma surface and root-mean-squared current density on the winding surface, with respect to the Fourier components describing the coil surface. A brief overview of the REGCOIL approach is given in Section 3.2. The optimization method and objective function are described in Section 3.3. The adjoint method for computing gradients of the objective function is outlined in Section 3.4. Optimization results for the W7-X and HSX winding surfaces are presented in Section 3.5.
In Section 3.6 we demonstrate a method for computing local sensitivity of figures of merit to perturbations of the winding surface using the shape gradient. We discuss properties of optimized winding surface configurations in Section 3.7. In Section 3.8 we summarize our results and conclude.
First, we review the problem of determining coil shapes once the plasma boundary and coil-winding surface have been specified. Given the winding surface geometry, our task is to obtain the surface current density, $\mathbf{J}$. The divergence-free surface current density can be related to a scalar current potential $\Phi$, the stream function for $\mathbf{J}$,
$$\mathbf{J} = \hat{\mathbf{n}} \times \nabla \Phi. \tag{3.1}$$
Here $\hat{\mathbf{n}}$ is the unit normal on the winding surface. The current potential $\Phi$ can be decomposed into single-valued and secular terms,
$$\Phi(\theta, \phi) = \Phi_{\text{sv}}(\theta, \phi) + \frac{G\phi}{2\pi} + \frac{I\theta}{2\pi}. \tag{3.2}$$
Here $\phi$ is the cylindrical azimuthal angle and $\theta$ is a poloidal angle. The quantities $G$ and $I$ are the currents linking the surface poloidally and toroidally, respectively. The single-valued term ($\Phi_{\text{sv}}$) is determined by solving the REGCOIL system. It is chosen to minimize the primary objective function,
$$\chi^2 = \chi^2_B + \lambda \chi^2_J. \tag{3.3}$$
Here $\chi^2_B$ is the surface-integrated-squared normal magnetic field on the desired plasma surface,
$$\chi^2_B = \int_{S_P} d^2x \, (\mathbf{B} \cdot \hat{\mathbf{n}})^2. \tag{3.4}$$
The normal component of the magnetic field on the plasma surface, $\mathbf{B} \cdot \hat{\mathbf{n}}$, includes contributions from currents in the plasma, current density $\mathbf{J}$ on the winding surface, and currents in other external coils. The quantity $\chi^2_J$ is the surface-integrated-squared current density on the winding surface,
$$\chi^2_J = \int_{S_{\text{coil}}} d^2x \, |\mathbf{J}|^2. \tag{3.5}$$
As discussed in Section 1.4.3, minimization of $\chi^2_B$ by itself ($\lambda = 0$) is fundamentally ill-posed, as very different coil shapes can provide almost identical normal field on the plasma surface. (Oppositely directed currents cancel in the Biot-Savart integral.) The addition of $\chi^2_J$ to the objective function is a form of Tikhonov regularization. As we will show, minimization of $\chi^2_J$ also simplifies coil shapes. While the NESCOIL formulation relies on Fourier series truncation for regularization, the formulation in REGCOIL allows for finer control of regularization while improving engineering properties of the coil set.
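As a rough illustration of the Tikhonov regularization in (3.3), the following Python sketch solves a small, deliberately ill-conditioned least-squares problem for several values of $\lambda$. The matrix `G`, the target `t`, and the helper `solve_regularized` are hypothetical stand-ins, not the REGCOIL Biot-Savart operators, and the identity is assumed as the regularization operator. The sketch only shows the generic trade-off between a misfit term (playing the role of $\chi^2_B$) and a penalty term (playing the role of $\chi^2_J$).

```python
import numpy as np

def solve_regularized(G, t, lam):
    """Minimize |G x - t|^2 + lam * |x|^2 via the normal equations.

    Returns the minimizer x together with the misfit and penalty terms.
    """
    n = G.shape[1]
    A = G.T @ G + lam * np.eye(n)              # regularized normal matrix
    b = G.T @ t
    x = np.linalg.solve(A, b)
    misfit = float(np.sum((G @ x - t) ** 2))   # analogue of chi^2_B
    penalty = float(np.sum(x ** 2))            # analogue of chi^2_J
    return x, misfit, penalty

# Toy ill-conditioned problem (assumed data): columns of G decay in scale.
rng = np.random.default_rng(0)
G = rng.standard_normal((40, 12)) * np.logspace(0, -3, 12)
t = rng.standard_normal(40)

lambdas = [1e-8, 1e-4, 1e-1]
results = [solve_regularized(G, t, lam) for lam in lambdas]
for lam, (_, misfit, penalty) in zip(lambdas, results):
    print(f"lambda={lam:.0e}  misfit={misfit:.4f}  penalty={penalty:.3e}")
```

For this kind of quadratic problem the penalty decreases and the misfit grows as $\lambda$ increases, so $\lambda$ acts as a single knob trading field accuracy against the size of the solution.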
The regularization parameter $\lambda$ can be chosen to obtain a target maximum current density $J_{\max}$, corresponding to a minimum tolerable inter-coil spacing. A 1D nonlinear root finding algorithm is typically used for this process.

The single-valued part of the current potential $\Phi_{\text{sv}}$ is represented using a finite Fourier series,
$$\Phi_{\text{sv}}(\theta, \phi) = \sum_{m,n} \Phi_{m,n} \sin(m\theta - n N_P \phi), \tag{3.6}$$
where $N_P$ is the number of periods. Only a sine series is needed if stellarator symmetry is imposed on the current density ($J(-\theta, -\phi) = J(\theta, \phi)$). As the minimization of $\chi^2$ with respect to $\Phi_{m,n}$ is a linear least-squares problem, it can be solved via the normal equations to obtain a unique solution. The Fourier amplitudes $\Phi_{m,n}$ are determined by the minimization of $\chi^2$,
$$\frac{\partial \chi^2}{\partial \Phi_{m,n}} = \frac{\partial \chi^2_B}{\partial \Phi_{m,n}} + \lambda \frac{\partial \chi^2_J}{\partial \Phi_{m,n}} = 0, \tag{3.7}$$
which takes the form of a linear system,
$$\sum_{m,n} A_{m',n';m,n} \Phi_{m,n} = b_{m',n'}. \tag{3.8}$$
We will use the notation $\overleftrightarrow{\mathbf{A}} \, \overrightarrow{\boldsymbol{\Phi}} = \overrightarrow{\mathbf{b}}$. Throughout, bold-faced type with a right-facing arrow will denote the vector space of basis functions for $\Phi_{\text{sv}}$ unless otherwise noted. For additional details see [136].

We use REGCOIL to compute the distribution of current on a fixed, two-dimensional winding surface. To design coil shapes in three-dimensional space, we modify the winding surface geometry by minimizing an objective function (3.10). This objective function quantifies fundamental physics and engineering properties and is easy to calculate from the REGCOIL solution. Optimal coil geometries are obtained by nonlinear, constrained optimization.

The cylindrical components of the winding surface are decomposed in Fourier harmonics,
$$R = \sum_{m,n} R^c_{m,n} \cos(m\theta + n N_p \phi), \tag{3.9a}$$
$$Z = \sum_{m,n} Z^s_{m,n} \sin(m\theta + n N_p \phi), \tag{3.9b}$$
where stellarator symmetry of the winding surface is assumed ($R(-\theta, -\phi) = R(\theta, \phi)$ and $Z(-\theta, -\phi) = -Z(\theta, \phi)$).
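The representation (3.9) can be evaluated directly. The sketch below is illustrative only: the coefficient values, the dictionary layout keyed by $(m, n)$, the choice of five periods, and the function name `surface_rz` are assumptions made for the example and do not reflect the REGCOIL input format. It evaluates $R$ and $Z$ on a grid and checks the stellarator symmetry stated above.

```python
import numpy as np

def surface_rz(theta, phi, rc, zs, n_p):
    """Evaluate R and Z from cosine/sine Fourier amplitudes, cf. (3.9).

    rc, zs : dicts mapping (m, n) -> amplitude (hypothetical layout)
    n_p    : number of field periods
    """
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    R = np.zeros(np.broadcast(theta, phi).shape)
    Z = np.zeros_like(R)
    for (m, n), amp in rc.items():
        R += amp * np.cos(m * theta + n * n_p * phi)
    for (m, n), amp in zs.items():
        Z += amp * np.sin(m * theta + n * n_p * phi)
    return R, Z

# Hypothetical amplitudes: a circular torus with a small helical ripple.
rc = {(0, 0): 5.5, (1, 0): 1.0, (1, 1): 0.2}
zs = {(1, 0): 1.0, (1, 1): -0.2}
n_p = 5

theta, phi = np.meshgrid(
    np.linspace(0.0, 2.0 * np.pi, 32),
    np.linspace(0.0, 2.0 * np.pi / n_p, 16),
    indexing="ij",
)
R, Z = surface_rz(theta, phi, rc, zs, n_p)
R_m, Z_m = surface_rz(-theta, -phi, rc, zs, n_p)

# Stellarator symmetry: R is even and Z is odd under (theta, phi) -> (-theta, -phi).
print(np.allclose(R_m, R), np.allclose(Z_m, -Z))
```

Because only cosine terms appear in $R$ and only sine terms in $Z$, the symmetry holds for any choice of amplitudes, which is why the optimization parameters can be restricted to these two sets of coefficients.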
We take the Fourier components of the winding surface, $\Omega = \{R^c_{m,n}, Z^s_{m,n}\}$, as our optimization parameters and assume that the desired plasma surface is held fixed. Throughout, $\Omega$ displayed with a subscript index will refer to a single Fourier component, while in the absence of a subscript, it refers to the set of Fourier components. For a given winding surface geometry, $\Omega$, and desired plasma surface, the current potential $\Phi(\Omega)$ can be determined by solving the REGCOIL system to obtain a solution which both reproduces the desired plasma surface with fidelity and maximizes coil-coil distance, as described in Section 3.2.

We define an objective function, $f$, which will be minimized with respect to $\Omega$,
$$f(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega)) = \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega)) - \alpha_V V_{\text{coil}}^{1/3}(\Omega) + \alpha_S S_p(\Omega) + \alpha_J \|J\|_2(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega)). \tag{3.10}$$
The coefficients $\alpha_V$, $\alpha_S$, and $\alpha_J$ are positive constants that weigh the relative importance of the terms in $f$. We take $\chi^2_B$ (3.4) as our proxy for the desired physics properties of the plasma surface. The normal magnetic field depends on $\overrightarrow{\boldsymbol{\Phi}}$, the single-valued current potential on the surface, and $\Omega$, the geometric properties of the coil-winding surface. The quantity $V_{\text{coil}}$ is the total volume enclosed by the coil-winding surface,
$$V_{\text{coil}} = \int_{S_{\text{coil}}} d^3x. \tag{3.11}$$

The adjoint method and winding-surface optimization tools are implemented in the main branch of the REGCOIL code https://github.com/landreman/regcoil.
We use $V_{\text{coil}}^{1/3}$ as a proxy for the coil-plasma separation. Our objective function decreases with increasing $V_{\text{coil}}$, as we desire a winding surface which allows for increased coil-plasma separation. This minimizes coil ripple and provides increased access for neutral beams and diagnostics. We recognize that increasing $V_{\text{coil}}$ implies increased coil length and experiment size, which may not always be desired.

The quantity $S_p$ is a measure of the spectral width of the Fourier series describing the coil-winding surface [110],
$$S_p = \sum_{m,n} m^p \left( (R^c_{m,n})^2 + (Z^s_{m,n})^2 \right). \tag{3.12}$$
Smaller values of $S_p$ correspond to Fourier spectra which decay rapidly with increasing $m$. We take advantage of the non-uniqueness of the representation in (3.9) to obtain surface parameterizations that are more efficient. As $\chi^2_B$, $\|J\|_2$, and $V_{\text{coil}}$ are coordinate-independent, these terms remain unchanged if the surface is reparameterized ($\theta$ is redefined). Minimization of $S_p$ removes this zero-gradient direction in parameter space. We use a typical value of $p = 2$. One could also remove the redundancy in the definition of $\theta$ by using the unique and spectrally condensed representation of Hirshman and Breslau [109] or by solving the nonlinear constraint equation of Hirshman and Meier [110] once the optimal surface has been obtained.

The quantity $\|J\|_2 = \sqrt{\chi^2_J / A_{\text{coil}}}$ is the 2-norm of the current density, where $A_{\text{coil}}$ is the winding surface area,
$$A_{\text{coil}} = \int_{\text{coil}} d^2x. \tag{3.13}$$
Although we are using a current potential approach rather than directly optimizing coil shapes, including $\|J\|_2$ in the objective function allows us to obtain coils with good engineering properties. Derivatives of coil-specific metrics (such as curvature) could be computed from the current potential if desired. For example, consider $N$ contours beginning at equally-spaced toroidal angles $\phi_i$ and $\theta = 0$.
The $i$th contour is defined by functions $\theta_i(s)$ and $\phi_i(s)$ for parameter $s$, where $\partial \Phi / \partial s = 0$. The derivatives of coil metrics which depend on $\mathbf{x}(\theta_i(s), \phi_i(s))$ could be computed with the adjoint method which will be described in Section 3.4. As the direct targeting of coil metrics introduces additional arbitrary weights in the objective function and the solution to another adjoint equation must be obtained to compute its gradient, we instead include $\|J\|_2$ in our objective function.

To demonstrate this correlation between $\|J\|_2$ and coil shape complexity, we compute the coil set on the actual W7-X winding surface using REGCOIL. The regularization parameter $\lambda$ is varied to achieve several values of $\|J\|_2$. Coil shapes are obtained from the contours of $\Phi$. In Figure 3.1, two of the W7-X non-planar coils computed in this way are shown, and the corresponding coil metrics are given in Table 3.1. (These correspond to the two leftmost coils in Figure 3.5.) We consider the average and maximum length $l$, toroidal extent $\Delta\phi$, curvature $\kappa$, and the minimum coil-coil distance $d^{\min}_{\text{coil-coil}}$. The average, maximum, and minimum are taken over the set of 5 unique coils. The coil shapes become more complex as $\|J\|_2$ increases, quantified by increasing $\kappa$ and $\Delta\phi$ and decreasing $d^{\min}_{\text{coil-coil}}$. Here the curvature, $\kappa$, of a three-dimensional parameterized curve, $\mathbf{x}(t)$, is
$$\kappa = \frac{\left| \mathbf{x}'(t) \times \mathbf{x}''(t) \right|}{\left| \mathbf{x}'(t) \right|^3}. \tag{3.14}$$

Figure 3.1: (a) (b) (c) Two non-planar W7-X coils (corresponding to the two leftmost coils in Figure 3.5) computed with REGCOIL using the actual W7-X winding surface. The regularization parameter $\lambda$ is chosen to achieve the shown values of $\|J\|_2$. As $\|J\|_2$ increases, the average length, toroidal extent, and curvature increase. Figure adapted from [185] with permission.

We have compared coil shapes on a single winding surface, finding them to become simpler as $\|J\|_2$ decreases. As $\|J\|_2 = \left( \chi^2_J / A_{\text{coil}} \right)^{1/2}$, we would find similar trends with $\chi^2_J$. We have chosen to include $\|J\|_2$ in the objective function as it is normalized by $A_{\text{coil}}$, so it is a more useful quantity for comparison of coil shapes on different winding surfaces.

To minimize $f$, the relative weights in (3.10) ($\alpha_V$, $\alpha_S$, and $\alpha_J$) are chosen such that each of the terms in the objective function has a similar magnitude, though much tuning of these parameters is required to obtain results which simultaneously improve the physics properties (decrease $\chi^2_B$) and engineering properties (increase $V_{\text{coil}}$ and $d^{\min}_{\text{coil-coil}}$, decrease $\kappa$ and $\Delta\phi$). Minimization of $f$ is performed subject to the inequality constraint $d_{\min} \ge d_{\min}^{\text{target}}$. Here $d_{\min}$ is the minimum distance between the coil-winding surface and the plasma surface,
$$d_{\min} = \min_{\theta, \phi} \left( d_{\text{coil-plasma}} \right) = \min_{\theta, \phi} \left( \min_{\theta_p, \phi_p} |\mathbf{x}_C - \mathbf{x}_P| \right), \tag{3.15}$$
and $d_{\min}^{\text{target}}$ is the minimum tolerable coil-plasma separation. The quantities $\theta_p$ and $\phi_p$ are poloidal and toroidal angles on the plasma surface, $\mathbf{x}_P$ and $\mathbf{x}_C$ are the position vectors on the plasma and winding surface, and $d_{\text{coil-plasma}}$ is the coil-plasma distance as a function of $\theta$ and $\phi$.

$\|J\|_2$ [MA/m]                       2.20      2.70       3.20
$J_{\max}$ [MA/m]                      4.55      9.50       29.1
$\chi^2_B$ [T$^2$ m$^2$]               1.89      5 . × −    . × −
Average $l$ [m]                        8.03      9.18       9.81
Max $l$ [m]                            8.26      10.5       11.8
Average $\Delta\phi$ [rad.]            0.146     0.222      0.253
Max $\Delta\phi$ [rad.]                0.161     0.282      0.372
Average $\kappa$ [m$^{-1}$]            1.04      1.29       1.32
Max $\kappa$ [m$^{-1}$]                2.54      20.3       56.1
$d^{\min}_{\text{coil-coil}}$ [m]      0.353     0.182      0.0758

Table 3.1: Comparison of metrics for coils computed with REGCOIL using the actual W7-X winding surface. Average and max are evaluated for the set of 5 unique coils. The regularization parameter $\lambda$ is varied to achieve these values of $\|J\|_2$.
Table adapted from [185] with permission.

The maximum current density $J_{\max}$ is also constrained,
$$J_{\max} = \max_{\theta, \phi} J. \tag{3.16}$$
This roughly corresponds to a fixed minimum coil-coil spacing. This constraint is enforced by fixing $J_{\max}$ to obtain the regularization parameter $\lambda$ in the REGCOIL solve, so we avoid the need for an equality constraint or the inclusion of $J_{\max}$ in the objective function. Rather, $\overrightarrow{\boldsymbol{\Phi}}(\Omega)$ is determined such that $J_{\max}$ is fixed. The inequality-constrained nonlinear optimization is performed using the NLOPT [125] software package using a conservative convex separable quadratic approximation (CCSAQ) [224]. While there are several gradient-based inequality-constrained algorithms available, we choose to use CCSAQ as it is relatively insensitive to the bound constraints imposed on the optimization parameters. We recognize that there are many possible combinations of constraints, objective functions, and regularization conditions that could be used. For example, $\|J\|_2$ could be fixed to determine $\lambda$ while $J_{\max}$ could be included in the objective function. We found that the formulation we have presented produces the best coil shapes.

f and the adjoint method

We must compute derivatives of $f$ with respect to the geometric parameters $\Omega$ in order to use gradient-based optimization methods. The spectral width $S_p$ and the volume $V_{\text{coil}}$ are explicit functions of $\Omega$, so their analytic derivatives can be obtained. On the other hand, $\chi^2_B$ and $\|J\|_2$ depend both explicitly on coil geometry and on $\overrightarrow{\boldsymbol{\Phi}}(\Omega)$. One approach to obtain the derivatives of these quantities could be to solve the REGCOIL linear system $N_\Omega + 1$ times, taking a finite-difference step in each Fourier coefficient. However, if $N_\Omega$ is large, the computational cost of this method could be prohibitively expensive. Instead, we will apply the adjoint method to compute derivatives.
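This technique will be demonstrated below.

The derivative of $\chi^2_B$ can be computed using the chain rule,
$$\frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega))}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \Omega_{m,n}} + \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \overrightarrow{\boldsymbol{\Phi}}} \cdot \frac{\partial \overrightarrow{\boldsymbol{\Phi}}(\Omega)}{\partial \Omega_{m,n}}, \tag{3.17}$$
where $\overrightarrow{\boldsymbol{\Phi}}(\Omega)$ is understood to vary with $\Omega$ such that (3.8) is satisfied. The dot product is a contraction over the current potential basis functions, $\{\Phi_{m,n}\}$. We can compute $\partial \overrightarrow{\boldsymbol{\Phi}}(\Omega) / \partial \Omega_{m,n}$ by differentiating the linear system (3.8) with respect to $\Omega_{m,n}$,
$$\frac{\partial \overleftrightarrow{\mathbf{A}}(\Omega)}{\partial \Omega_{m,n}} \overrightarrow{\boldsymbol{\Phi}} + \overleftrightarrow{\mathbf{A}} \frac{\partial \overrightarrow{\boldsymbol{\Phi}}(\Omega)}{\partial \Omega_{m,n}} = \frac{\partial \overrightarrow{\mathbf{b}}(\Omega)}{\partial \Omega_{m,n}}, \tag{3.18}$$
and formally solving this equation to obtain
$$\frac{\partial \overrightarrow{\boldsymbol{\Phi}}(\Omega)}{\partial \Omega_{m,n}} = \overleftrightarrow{\mathbf{A}}^{-1} \left( \frac{\partial \overrightarrow{\mathbf{b}}(\Omega)}{\partial \Omega_{m,n}} - \frac{\partial \overleftrightarrow{\mathbf{A}}(\Omega)}{\partial \Omega_{m,n}} \overrightarrow{\boldsymbol{\Phi}} \right). \tag{3.19}$$
Equation (3.19) is inserted into (3.17),
$$\frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega))}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \Omega_{m,n}} + \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \overrightarrow{\boldsymbol{\Phi}}} \cdot \overleftrightarrow{\mathbf{A}}^{-1} \left( \frac{\partial \overrightarrow{\mathbf{b}}(\Omega)}{\partial \Omega_{m,n}} - \frac{\partial \overleftrightarrow{\mathbf{A}}(\Omega)}{\partial \Omega_{m,n}} \overrightarrow{\boldsymbol{\Phi}} \right). \tag{3.20}$$
This expression could be evaluated by solving the linear system (3.18) for $\partial \overrightarrow{\boldsymbol{\Phi}} / \partial \Omega_{m,n}$ and performing the inner product with $\partial \chi^2_B / \partial \overrightarrow{\boldsymbol{\Phi}}$. However, the computational cost of this method scales similarly to that of finite differencing, as described in Section 2.2.1. Instead, we can exploit the adjoint property of the operator to obtain
$$\frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega))}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \Omega_{m,n}} + \left[ \left( \overleftrightarrow{\mathbf{A}}^{-1} \right)^T \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \overrightarrow{\boldsymbol{\Phi}}} \right] \cdot \left( \frac{\partial \overrightarrow{\mathbf{b}}(\Omega)}{\partial \Omega_{m,n}} - \frac{\partial \overleftrightarrow{\mathbf{A}}(\Omega)}{\partial \Omega_{m,n}} \overrightarrow{\boldsymbol{\Phi}} \right). \tag{3.21}$$
For any invertible matrix, $\left( \overleftrightarrow{\mathbf{A}}^{-1} \right)^T = \left( \overleftrightarrow{\mathbf{A}}^T \right)^{-1}$. Hence we can instead solve a linear system involving the matrix $\overleftrightarrow{\mathbf{A}}^T$ to compute an adjoint variable $\overrightarrow{\mathbf{q}}$, defined as the solution of
$$\overleftrightarrow{\mathbf{A}}^T \overrightarrow{\mathbf{q}} = \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \overrightarrow{\boldsymbol{\Phi}}}. \tag{3.22}$$
Rather than compute a finite-difference derivative for each $\Omega_{m,n}$ or solve a linear system to compute each $\partial \overrightarrow{\boldsymbol{\Phi}} / \partial \Omega_{m,n}$ as in (3.19), we solve two linear systems: the forward (3.8) and adjoint (3.22).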
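The steps (3.17) through (3.22), followed by an inner product with the adjoint variable, can be checked on a small synthetic problem. In the Python sketch below, the matrices `A0`, `dA`, `db`, and `C`, the affine dependence of the system on the parameters, and the quadratic objective are all assumptions chosen for illustration; none of them is taken from REGCOIL. The adjoint gradient is compared with central finite differences.

```python
import numpy as np

rng = np.random.default_rng(1)
n_phi, n_omega = 6, 4

# Hypothetical parameter-dependent system A(omega) phi = b(omega), affine in omega.
A0 = 10.0 * np.eye(n_phi) + rng.standard_normal((n_phi, n_phi))
dA = 0.1 * rng.standard_normal((n_omega, n_phi, n_phi))  # dA/d(omega_k)
b0 = rng.standard_normal(n_phi)
db = rng.standard_normal((n_omega, n_phi))               # db/d(omega_k)
C = rng.standard_normal((3, n_phi))

def assemble(omega):
    A = A0 + np.tensordot(omega, dA, axes=1)
    b = b0 + omega @ db
    return A, b

def objective(omega):
    """chi(omega, phi(omega)) = 0.5 |C phi|^2 + 0.5 |omega|^2 (toy objective)."""
    A, b = assemble(omega)
    phi = np.linalg.solve(A, b)
    return 0.5 * np.sum((C @ phi) ** 2) + 0.5 * np.sum(omega ** 2)

def adjoint_gradient(omega):
    A, b = assemble(omega)
    phi = np.linalg.solve(A, b)            # forward solve, cf. (3.8)
    dchi_dphi = C.T @ (C @ phi)            # partial chi / partial phi
    q = np.linalg.solve(A.T, dchi_dphi)    # adjoint solve, cf. (3.22)
    explicit = omega                       # partial chi / partial omega at fixed phi
    rhs = db - dA @ phi                    # rows: db/d(omega_k) - dA/d(omega_k) phi
    return explicit + rhs @ q              # one inner product per parameter

def fd_gradient(omega, h=1e-6):
    g = np.zeros_like(omega)
    for k in range(omega.size):
        e = np.zeros_like(omega)
        e[k] = h
        g[k] = (objective(omega + e) - objective(omega - e)) / (2.0 * h)
    return g

omega = rng.standard_normal(n_omega)
g_adj = adjoint_gradient(omega)
g_fd = fd_gradient(omega)
print("max abs difference:", np.max(np.abs(g_adj - g_fd)))
```

The adjoint route uses two linear solves regardless of the number of parameters, whereas the finite-difference check above needs two solves per parameter.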
The adjoint equation is similar to the forward equation ($\overleftrightarrow{\mathbf{A}}^T$ has the same dimensions and eigenspectrum as $\overleftrightarrow{\mathbf{A}}$), so the same computational tools can be used to solve the adjoint problem. We then perform an inner product with $\overrightarrow{\mathbf{q}}$ to obtain the derivatives with respect to each $\Omega_{m,n}$,
$$\frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}}(\Omega))}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \overrightarrow{\boldsymbol{\Phi}})}{\partial \Omega_{m,n}} + \overrightarrow{\mathbf{q}} \cdot \left( \frac{\partial \overrightarrow{\mathbf{b}}(\Omega)}{\partial \Omega_{m,n}} - \frac{\partial \overleftrightarrow{\mathbf{A}}(\Omega)}{\partial \Omega_{m,n}} \overrightarrow{\boldsymbol{\Phi}} \right). \tag{3.23}$$
The derivatives $\partial \overrightarrow{\mathbf{b}} / \partial \Omega_{m,n}$, $\partial \overleftrightarrow{\mathbf{A}} / \partial \Omega_{m,n}$, $\partial \chi^2_B / \partial \Omega_{m,n}$, and $\partial \chi^2_B / \partial \overrightarrow{\boldsymbol{\Phi}}$ can be computed analytically. In the above discussion, the regularization parameter $\lambda$ has been assumed to be fixed. A similar method can be used if a $\lambda$ search is performed to obtain a target $J_{\max}$ (see Appendix C). The same method is used to compute derivatives of $\|J\|_2$.

We note that adjoint methods provide the most significant reduction in computational cost when the linear solve is expensive. For the REGCOIL system, this is not the case, as the cost of constructing $\overleftrightarrow{\mathbf{A}}$ and $\overrightarrow{\mathbf{b}}$ exceeds that of the solve. We have implemented OpenMP multithreading for the construction of $\partial \overleftrightarrow{\mathbf{A}} / \partial \Omega$ and $\partial \overrightarrow{\mathbf{b}} / \partial \Omega$ such that the cost of computing the gradients via the adjoint method is cheaper than computing finite-difference derivatives serially.

The constraint functions, $d_{\min}$ and $J_{\max}$, must also be differentiated with respect to $\Omega_{m,n}$. As $d_{\min}$ is defined in terms of the minimum function, we approximate it using the smooth log-sum-exponent function [29],
$$d_{\min,\text{lse}} = -\frac{1}{q} \log \left( \frac{\int_{S_C} d^2x_C \int_{S_P} d^2x_P \, \exp\left( -q |\mathbf{x}_C - \mathbf{x}_P| \right)}{\int_{S_C} d^2x_C \int_{S_P} d^2x_P} \right). \tag{3.24}$$
This function can be analytically differentiated with respect to $\Omega_{m,n}$. As $q$ approaches infinity, $d_{\min,\text{lse}}$ approaches $d_{\min}$. For $q$ very large, the function obtains very sharp gradients. A typical value of $q = 10$ m$^{-1}$ was used. The log-sum-exponent function is also used to approximate $J_{\max}$, as described in Appendix C.
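A discretized version of the log-sum-exponent approximation (3.24) can be sketched as follows. The point sets, the uniform quadrature weights, and the function name `smooth_min_distance` are assumptions for illustration (two concentric circles stand in for the winding and plasma surfaces), and the shift by the smallest distance is a numerical-stability choice of this sketch rather than something stated in the text.

```python
import numpy as np

def smooth_min_distance(x_c, x_p, w_c, w_p, q):
    """Log-sum-exp approximation to min |x_c - x_p|, cf. (3.24).

    x_c, x_p : (N, 3) arrays of points on the two surfaces
    w_c, w_p : quadrature weights standing in for the area elements
    q        : sharpness parameter with units of inverse length
    """
    d = np.linalg.norm(x_c[:, None, :] - x_p[None, :, :], axis=-1)
    w = w_c[:, None] * w_p[None, :]
    d0 = d.min()  # shift so the exponentials do not underflow all at once
    mean_exp = np.sum(w * np.exp(-q * (d - d0))) / np.sum(w)
    return d0 - np.log(mean_exp) / q

# Toy geometry (assumed): concentric circles of radius 1.00 and 1.37.
angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)], axis=1)
x_p = 1.00 * circle
x_c = 1.37 * circle
w = np.ones(angles.size)

vals = [smooth_min_distance(x_c, x_p, w, w, q) for q in (10.0, 100.0, 1000.0)]
for q, v in zip((10.0, 100.0, 1000.0), vals):
    print(f"q = {q:7.1f}  smooth min = {v:.5f}")
```

With normalized weights the approximation lies above the true minimum and decreases toward it as $q$ grows, which illustrates the trade-off noted above between accuracy and the sharpness of the gradients.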
Beginning with the actual W7-X winding surface, we perform scans over the coefficients $\alpha_V$ and $\alpha_S$ in the objective function (3.10). The plasma surface was obtained from a fixed-boundary VMEC solution that predated the coil design and is free from modular coil ripple. The constraint target is set to be the minimum coil-plasma distance on the initial winding surface, $d_{\min}^{\text{target}} = 0.37$ m. The cross-sections of the optimized surfaces in the poloidal plane are shown in Figures 3.2 and 3.3 along with the last-closed flux surface (red), a constant offset surface at $d_{\min}^{\text{target}}$ (black solid), and the initial winding surface (black dashed).

We perform a scan over $\alpha_S$ with $\alpha_V = \alpha_J = 0$. For optimal values of $\alpha_S$, the addition of the spectral width term should simply reparameterize the surface, eliminating the zero-gradient direction in parameter space. Thus we expect that when $\chi^2_B$ is the only other term in the objective function, the winding surface should collapse to a constant offset surface. When $\alpha_S$ is too large, the surface shape changes to favor a condensed Fourier series. When $\alpha_S$ is too small, the optimization may terminate prematurely in a local minimum due to the non-uniqueness of the representation. Indeed we find that with increasing $\alpha_S$, the winding surface approaches a torus with a circular cross-section, which has a minimal Fourier spectrum. At moderately small values of $\alpha_S$ ($\sim 0.3$) the surface approaches a constant offset surface at $d_{\min}^{\text{target}}$, as $\chi^2_B$ is dominant in the objective function. For very small values of $\alpha_S$ ($\sim$ . $\alpha_S = 0.$ $\alpha_V$ is performed at fixed $\alpha_S = 0.$ $\alpha_J = 0$ such that the spectral width does not greatly increase. As $\alpha_V$ increases, $d_{\text{coil-plasma}}$ increases significantly on the outboard side while it remains fixed in the inboard concave regions. This trend is not surprising, as concave plasma shapes have been shown to be inefficient to produce with coils [137]. Interestingly, the winding surface obtains a somewhat pointed shape at the triangle cross-section ($\phi = 0.\ \pi/N_p$), becoming elongated at the tip of the triangle and "pinching" toward the plasma surface at the edges.

We now include nonzero $\alpha_J$ and attempt a comprehensive optimization. The $J_{\max}$ constraint is selected such that the metrics ($l$, $\kappa$, and $\Delta\phi$) of the coils computed on the initial surface roughly match those of the actual non-planar coil set. The coil-plasma distance constraint $d_{\min}^{\text{target}}$ is set to be the minimum $d_{\text{coil-plasma}}$ on the initial winding surface. Parameters $\alpha_V = 0.$ , $\alpha_S = 0.24$, and $\alpha_J = 1.\ \times\ -$ were used in the objective function. Optimization was performed over 118 Fourier coefficients ($|n| \le$ , $m \le$ ) and the objective function was evaluated a total of 5165 times to reach the optimum (1 . × linear solves rather than 6 . × required for finite-difference derivatives). The optimal surface and coil set are shown in Figures 3.4 and 3.5, and the corresponding metrics are shown in Table 3.2. We find a solution which increases $V_{\text{coil}}$ by 22% and decreases $\chi^2_B$ by 52% over the initial winding surface. (Note that it is numerically impossible to obtain a current distribution that exactly reproduces the plasma surface, so $\chi^2_B$ is nonzero when computed from the REGCOIL solution on the initial winding surface.) In addition, the optimized coil set features a smaller average and maximum $\Delta\phi$ and $\kappa$ and larger $d^{\min}_{\text{coil-coil}}$. The length of the coils increases to accommodate the increase in $V_{\text{coil}}$. Again we find that the increase in $V_{\text{coil}}$ is most pronounced in the outboard convex regions while $d_{\text{coil-plasma}}$ is maintained in the concave regions of the bean-shaped cross-sections. The "pinching" feature of the winding surface is again present in the triangle cross-section ($\phi = 0.\ \pi/N_p$).

It should be noted that the decrease in $d_{\text{coil-plasma}}$ at the bottom and top of the bean cross-section ($\phi = 0$) might interfere with the current W7-X divertor baffles. However, the increase in volume on the outboard side would allow for increased flexibility for the neutral beam injection duct [200]. We have performed this optimization to show that a winding surface could be constructed that increases $V_{\text{coil}}$ (and thus the average $d_{\text{coil-plasma}}$), improves coil shapes, and decreases $\chi^2_B$. If further engineering considerations were necessary, these could be implemented.
The surface we have obtained is optimal with respect to the engineering considerations and constraints we have imposed, which differ from those of the W7-X team [15]. Thus the direct comparison between our method and those of [15] cannot be made based on these results.

[Figure 3.2 panels: cross-sections at four toroidal angles (labels 0.0, 0.25, 0.5, 0.75, each followed by $N_p$); vertical axis $Z$ [meters]; legend: Offset surface, $\alpha_S$ = 0.003, $\alpha_S$ = 0.3, $\alpha_S$ = 3 × 10, Initial surface, Plasma surface.]
Figure 3.2: Optimized winding surfaces obtained with $\alpha_V = \alpha_J = 0$ and the values of $\alpha_S$ shown. The actual W7-X winding surface is used as the initial surface in the optimization (black dashed). As $\alpha_S$ increases, the magnitude of the spectral-width term in the objective function increases, and the winding surface approaches a cylindrical torus with a minimal Fourier spectrum. For moderately small values of $\alpha_S$, the winding surface approaches a uniform offset surface from the plasma surface (black solid). Figure adapted from [185] with permission.

[Figure 3.3 panels: cross-sections at four toroidal angles (labels 0.0, 0.25, 0.5, 0.75, each followed by $N_p$); vertical axis $Z$ [meters]; legend: Offset surface, $\alpha_V$ = 0.5, $\alpha_V$ = 1.0, $\alpha_V$ = 2.0, Initial surface, Plasma surface.]
Figure 3.3: Optimized winding surfaces obtained with $\alpha_S = 0.$ , $\alpha_J = 0$, and the values of $\alpha_V$ shown. The actual W7-X winding surface is used as the initial surface in the optimization (black dashed). As $\alpha_V$ increases, $d_{\text{coil-plasma}}$ increases on the outboard side while it remains fixed in the concave region. Figure adapted from [185] with permission.

[Figure 3.4 panels: cross-sections at four toroidal angles (labels 0.0, 0.25, 0.5, 0.75, each followed by $N_p$); vertical axis $Z$ [meters]; legend: Actual surface, Optimized surface, Plasma surface.]
Figure 3.4: The actual W7-X coil-winding surface and plasma surface are shown with our optimized winding surface. In comparison with the actual surface, the optimized surface reduces $\chi^2_B$ by 52% and increases $V_{\text{coil}}$ by 22%. Figure adapted from [185] with permission.

Figure 3.5: (a) (b) Comparisons of coil set computed with REGCOIL using the actual W7-X winding surface (dark blue) and the optimized surface (light blue). Figure reproduced from [185] with permission.

                                       Initial    Optimized    Actual coil set
$\chi^2_B$ [T$^2$ m$^2$]               0.115      0.0711
$V_{\text{coil}}$ [m$^3$]              156        190
$\|J\|_2$ [MA/m]                       2.21       2.16
$J_{\max}$ [MA/m]                      7.70       7.70
Average $l$ [m]                        8.51       8.95         8.69
Max $l$ [m]                            8.84       9.14         8.74
Average $\Delta\phi$ [rad.]            0.190      0.179        0.198
Max $\Delta\phi$ [rad.]                0.222      0.197        0.208
Average $\kappa$ [m$^{-1}$]            1.21       1.10         1.20
Max $\kappa$ [m$^{-1}$]                9.01       4.84         2.59
$d^{\min}_{\text{coil-coil}}$ [m]      0.223      0.271        0.261

Table 3.2: Comparison of metrics of the actual W7-X winding surface and our optimized surface. We also show metrics of the coil set computed on the winding surfaces using REGCOIL and the metrics for the actual W7-X nonplanar coils. Regularization in REGCOIL is chosen such that the coil metrics computed on the initial surface roughly match those of the actual coil set. Coil complexity improves from the initial to the final surface (decreased average and max $\Delta\phi$ and $\kappa$, increased $d^{\min}_{\text{coil-coil}}$). The average and max $l$ increases to allow for the increase in $V_{\text{coil}}$. Table adapted from [185] with permission.

[Figure 3.6 panels: cross-sections at four toroidal angles (labels 0.0, 0.25, 0.5, 0.75, each followed by $N_p$); vertical axis $Z$ [meters]; legend: Actual surface, Optimized surface, Plasma surface.]
Figure 3.6: The actual HSX coil-winding surface and plasma surface are shown with our optimized winding surface. In comparison with the actual surface, the optimized surface has decreased $\chi^2_B$ by 4% and increased $V_{\text{coil}}$ by 18%. Figure adapted from [185] with permission.

We perform the same procedure for the optimization of the HSX winding surface. Parameters $\alpha_V = 3.\ \times\ -$ , $\alpha_S = 0$, and $\alpha_J = 3\ \times\ -$ were used in the objective function. We found that the spectral width term was not necessary to obtain a satisfying optimum in this case. The initial winding surface was taken to be a toroidal surface on which the actual modular coils lie. The plasma equilibrium used is a fixed-boundary VMEC solution without coil ripple. Optimization was performed over 100 Fourier coefficients ($|n| \le$ , $m \le$ ) and the objective function was evaluated a total of 560 times to reach the optimum (1 . × linear solves rather than 5 . × required for forward-difference derivatives). The coil-plasma distance constraint was set to be $d_{\min}^{\text{target}} = 0.14$ m, the minimum coil-plasma distance on the actual winding surface. The optimal surface and coil set are shown in Figures 3.6 and 3.7, and the corresponding coil metrics are shown in Table 3.3. We find a solution that increases $V_{\text{coil}}$ by 18% and decreases $\chi^2_B$ by 4% over the initial winding surface.
The coil set computed with REGCOIL using the optimized surface appears qualitatively similar to that computed with the initial surface but with increased $d_{\text{coil-plasma}}$ on the outboard side. The average and maximum $\Delta\phi$ and $\kappa$ decreased while $d^{\min}_{\text{coil-coil}}$ increased for the coil set computed on the optimal surface in comparison to that of the initial surface. As was observed in the W7-X optimization (Figure 3.4), the optimized HSX winding surface obtains a somewhat pinched shape near the triangle cross-section ($\phi = 0.5\,(2\pi/N_p)$).

[Figure: panels (a) and (b).]
Figure 3.7: The coils obtained from REGCOIL using the actual HSX winding surface (dark blue) and optimized surface (light blue). Figure reproduced from [185] with permission.

Table 3.3: Comparison of metrics of the actual HSX winding surface and our optimized surface. We also show metrics of the coil set computed on the winding surfaces using REGCOIL and the metrics for the actual HSX modular coils. Regularization in REGCOIL is chosen such that the coil metrics computed on the initial surface roughly match those of the actual coil set. Coil complexity improves from the initial to the final surface (decreased average and max $\Delta\phi$ and $\kappa$, increased $d^{\min}_{\text{coil-coil}}$). The average and max $l$ increase to allow for the increase in $V_{\text{coil}}$. Table adapted from [185] with permission.

Quantity                                  Initial                        Optimized                    Actual coil set
$\chi^2_B$ [T$^2$ m$^2$]                  1.[…]$\times 10^{-[\ldots]}$   […]$\times 10^{-[\ldots]}$
$V_{\text{coil}}$ [m$^3$]                 2.60                           3.07
$\|J\|$ [MA/m]                            0.956                          0.891
$J_{\max}$ [MA/m]                         1.84                           1.84
Average $l$ [m]                           2.26                           2.39                         2.24
Max $l$ [m]                               2.49                           2.46                         2.33
Average $\Delta\phi$ [rad.]               0.372                          0.365                        0.362
Max $\Delta\phi$ [rad.]                   0.530                          0.505                        0.478
Average $\kappa$ [m$^{-1}$]               5.15                           4.80                         5.05
Max $\kappa$ [m$^{-1}$]                   33.4                           25.8                         11.7
$d^{\min}_{\text{coil-coil}}$ [m]         0.0850                         0.0853                       0.0930

3.6 Local winding surface sensitivity

With the adjoint method we have computed derivatives of the objective function with respect to Fourier components of the winding surface, $\partial f/\partial\Omega$. While this representation of derivatives is convenient for gradient-based optimization, the sensitivity to local displacements of the surface is obscured. Alternatively, it is possible to represent the sensitivity of $f$ with respect to normal displacements of surface area elements of a given winding surface $S_C$,

$$\delta f(S_C; \delta\mathbf{x}) = \int_{S_C} d^2x \, G \, \delta\mathbf{x}\cdot\hat{\mathbf{n}}. \tag{3.25}$$

The shape gradient and shape derivatives are described in detail in Section 2.1. As both $\chi^2_B$ and $\|J\|$ are defined in terms of surface integrals over the winding surface, it can be shown that the shape derivative of these functions can be written in the Hadamard form [171]. The shape gradients $G_{\chi^2_B}$ and $G_{\|J\|}$ can be computed from the Fourier derivatives ($\partial\chi^2_B/\partial\Omega$ and $\partial\|J\|/\partial\Omega$) using a singular value decomposition method [138]. Here the perturbations $\delta f$ and $\delta\mathbf{x}$ are written in terms of the Fourier derivatives, and $G$ is also represented in a finite Fourier series,

$$\frac{\partial f(\Omega)}{\partial\Omega_{m,n}} = \int_{S_C} d^2x \sum_{m,n} G_{m,n}\cos(m\theta + nN_p\phi)\,\frac{\partial\mathbf{x}(\Omega)}{\partial\Omega_{m,n}}\cdot\hat{\mathbf{n}}. \tag{3.26}$$

After discretizing in $\theta$ and $\phi$, (3.26) takes the form of a (generally not square) matrix equation which can be solved using the Moore-Penrose pseudoinverse to obtain $G_{m,n}$.

We compute $G_{\chi^2_B}$ and $G_{\|J\|}$ (Figure 3.9) at fixed $\lambda$. These quantities are computed on the actual W7-X winding surface and a surface uniformly offset from the plasma surface with $d_{\text{coil-plasma}} = 0.61$ m (the area-averaged $d_{\text{coil-plasma}}$ over the actual surface). We consider surfaces that are equidistant from the plasma surface on average as $G$ scales inversely with $A_{\text{coil}}$. The poloidal cross-sections of these surfaces are shown in Figure 3.8. For each surface $\lambda$ is chosen to achieve $J_{\max} = 7.7$ MA/m. […] $G_{\chi^2_B}$, indicating that $d_{\text{coil-plasma}}$ should decrease at that location in order that $\chi^2_B$ decreases. This corresponds to locations on the plasma surface with significant concavity (Figure 3.11b). The maximum $G_{\chi^2_B}$ occurs at $\phi = 0.15\,(2\pi/N_p)$ on both surfaces (Figure 3.4). In comparison with this region, the magnitude of $G_{\chi^2_B}$ is relatively small over the majority of the area of the surfaces shown, demonstrating that engineering tolerances might be more relaxed in these locations. There is also a region of negative $G_{\chi^2_B}$ near $\phi = \pi/N_p$ and $\theta = 0$. This is the "tip" of the triangle-shaped cross-section, where $d_{\text{coil-plasma}}$ was increased over the course of the optimization (Figures 3.2, 3.3, and 3.4). We find that $G_{\chi^2_B}$ computed on the actual winding surface has similar trends to that computed on the surface uniformly offset from the plasma. This indicates that the shape gradient depends only weakly on the specific geometry of the winding surface. We have computed $G_{\chi^2_B}$ for several other winding surfaces with varying $d_{\text{coil-plasma}}$. Regardless of the winding surface chosen, we observe increased sensitivity in the concave regions.

[Figure: poloidal cross-sections of the offset, actual, and plasma surfaces at four toroidal angles; vertical axis Z [meters].]
Figure 3.8: The cross-sections of the two winding surfaces used to compute $G_{\chi^2_B}$ and $G_{\|J\|}$ are shown in the poloidal plane. Figure adapted from [185] with permission.

The quantity $G_{\|J\|}$ roughly quantifies how coil complexity changes with normal displacements of the coil surface. In view of Figure 3.10, the locations of large $G_{\|J\|}$ overlap with areas of increased $J$. On the actual winding surface, the maximum of $G_{\|J\|}$ occurs near the location of the closest approach between coils (two rightmost coils in Figure 3.5(a)). The shape gradients $G_{\|J\|}$ and $G_{\chi^2_B}$ have very similar trends. The concave regions of the plasma surface are difficult to produce with external coils, resulting in increased coil complexity and $J$.
Therefore, $\|J\|$ is most sensitive to displacements of the coil-winding surface in these regions.

We recognize several ways that the shape gradient technique could be improved to provide more relevant diagnostics for experimental design. With a winding surface representation, the shape gradient does not allow for calculation of the sensitivity to lateral coil displacements. Also, our analysis does not account for field ripple due to the finite number of coils. Although Figure 3.9 indicates that the coils should move toward the plasma to reduce the field error, the ripple fields might be significant with a filamentary model. A similar calculation could be performed using the filamentary coil sensitivity techniques presented in Section 2.1 and discussed further in Chapter 5. Finally, $\chi^2_B$ does not account for the sensitivity to resonant fields that could cause the formation of islands, though there is ongoing work toward computing the shape gradient for such a metric [76].

[Figure: panels (a) Offset from plasma, (b) Actual, (c) Offset from plasma, (d) Actual.]
Figure 3.9: Shape gradient for $\chi^2_B$ ((a) and (b)) and $\|J\|$ ((c) and (d)). These functions are computed using the W7-X plasma surface and a uniform offset winding surface from the plasma surface with $d_{\text{coil-plasma}} = 0.61$ m ((a) and (c)) and the actual winding surface ((b) and (d)). The region of increased $G_{\chi^2_B}$ corresponds with concave regions of the plasma surface (Figure 3.11b). Regions of large positive $G_{\|J\|}$ correspond to regions with increased $J$ (Figure 3.10). Figure adapted from [185] with permission.

[Figure: panels (a) Offset from plasma, (b) Actual.]
Figure 3.10: Current density magnitude, $J$, computed from REGCOIL using the W7-X plasma surface and (a) a uniform offset winding surface from the plasma surface with $d_{\text{coil-plasma}} = 0.61$ m and (b) the actual winding surface. Figure adapted from [185] with permission.

Sensitivity studies on NCSX similarly found that coil errors on the inboard side in regions of small $d_{\text{coil-plasma}}$ had a significant effect on flux surface quality [236]. The necessity of small $d_{\text{coil-plasma}}$ for bean-shaped plasmas has been noted in many coil optimization efforts [60, 216] and has been demonstrated by evaluating the singular value decomposition of the discretized Biot-Savart integral operator [137]. Using the shape gradient, we can identify these regions where the fidelity of the plasma surface requires tighter tolerance on coil positions.

The results presented here and in [137] indicate that the concave regions of the surface are both the regions where a small coil-plasma distance is required and the regions where the sensitivity to the winding surface position is highest. The regions of concavity can be determined by considering the principal curvatures of the plasma surface. Let $\hat{\mathbf{n}}(\mathbf{x})$ represent the normal vector at the plasma surface at some point $\mathbf{x}$, and let $A_n$ represent a plane that includes this normal vector. The intersection of the plane and the surface makes a curve $\mathbf{x}(l)$, which has curvature $\kappa$ at the point $\mathbf{x}$, as calculated from (3.14). The two principal curvatures $\kappa_1$ and $\kappa_2$ represent the maximum and minimum curvatures, $\kappa$, from all possible planes $A_n$. We choose the convention for the principal curvatures such that convex curves have positive curvature and concave curves have negative curvature. Therefore, small values of the second principal curvature, $\kappa_2$, represent regions on the surface where the concavity is increased.

[Figure: panels (a) and (b).]
Figure 3.11: (a) The minimum distance between the W7-X plasma surface and the optimized winding surface obtained in Section 3.5.2 and (b) the second principal curvature $\kappa_2$ are shown as a function of location on the plasma surface. Locations of large negative $\kappa_2$ coincide with regions where the optimization resulted in small $d_{\text{coil-plasma}}$. Figure adapted from [185] with permission.

The second principal curvature for the W7-X plasma surface is shown in Figure 3.11b. Although $\kappa_2$ and the shape gradients are evaluated on different surfaces, we note that regions of high concavity (negative $\kappa_2$) coincide with regions of large, positive $G$ (Figure 3.9). The regions of high concavity also correspond to the regions where the optimization procedure tends to place the winding surface closest to the plasma (Figure 3.11). We recognize that our winding surface optimization accounts for several engineering considerations in addition to reproducing the desired plasma surface. However, for a wide range of parameters the winding surfaces we obtain feature small $d_{\text{coil-plasma}}$ in the bean-shaped cross-sections (Figures 3.2 and 3.3). Thus $\kappa_2$, which is exceedingly fast to compute, may serve as a target for optimization of the plasma configuration. By minimizing the regions of high concavity, it may be possible to find stellarator equilibria that are more amenable to coils that are positioned farther from the plasma. Any increase in the minimal distance between the plasma and the coils has implications for the size of a reactor, where $d_{\text{coil-plasma}}$ is set by the required blanket width. Similar metrics are considered in the ROSE code, such as the integrated absolute value of the Gaussian curvature and integrated absolute value of the maximum curvature [59].

We have outlined a new method for the optimization of the stellarator coil-winding surface using a continuous current potential approach. Rather than evolving filamentary coil shapes, we use REGCOIL to obtain the current density on a winding surface and optimize the winding surface using analytic gradients of the objective function. We have shown that we can indirectly improve the coil curvature and toroidal extent by targeting the root-mean-squared current density in our objective function (Figure 3.1).
This approach offers several potential advantages over other nonlinear coil optimization tools.

1. The difficulty of the optimization is reduced by the application of the REGCOIL method, which takes the form of a linear least-squares system. The optimal coil shapes on a given winding surface can thus be efficiently and robustly computed.
2. By fixing the maximum current density to obtain the regularization in REGCOIL, we eliminate the need to implement an additional equality constraint or arbitrary weight in the objective function.
3. By using REGCOIL to compute coil shapes on a given surface, we can apply the adjoint method for computing derivatives (Section 3.4). This allows us to reduce the number of function evaluations required during the nonlinear optimization by a factor of $\approx$ […]

Chapter 4
Adjoint-based optimization of neoclassical properties

Several critical quantities for stellarator design arise from neoclassical physics, the kinetic theory of collisional transport in the presence of magnetic field gradients and curvature. This so-called neoclassical transport results from the random walk of charged particles as they exhibit guiding center motion. Due to the complicated guiding center orbits present in a 3D field, neoclassical transport is generally enhanced in a stellarator. One of the primary goals of stellarator optimization is to reduce this transport. Furthermore, the bootstrap current, driven by collisional processes, should be minimized in low-shear designs or if an island divertor system is to be used. These neoclassical properties are described by solutions of the drift-kinetic equation (DKE),

$$\left(v_{\|}\hat{\mathbf{b}} + \mathbf{v}_d\right)\cdot\nabla f = C(f), \tag{4.1}$$

where $f$ is the distribution function, $v_{\|} = \mathbf{v}\cdot\hat{\mathbf{b}}$ is the parallel component of the velocity, $\mathbf{v}_d$ is the guiding center drift velocity, and $C$ is the collision operator. The DKE is obtained from the Fokker-Planck equation under the assumption that the plasma is strongly magnetized such that (4.1) describes length scales much longer than the gyroradius and frequencies much smaller than the gyrofrequency. We have taken the equilibrium limit, assuming time scales longer than the gyroperiod but shorter than the transport time scale on which the profiles relax. In this Chapter we make an additional assumption of local thermodynamic equilibrium, such that $f \approx f_M$, a Maxwellian distribution (defined in Section 4.2), to lowest order. This assumption is valid in stellarator configurations, provided that the collisionless orbits are sufficiently confined and the collision frequency is not too low [33, 227]. The departure from a Maxwellian, $f_1$, is driven by gradients in $f_M$ due to variations in the density, temperature, and electrostatic potential.
The drift-kinetic equation is described in many references, including Chapter 7 in [99] and [94, 97].

In this Chapter, we will apply both the discrete and continuous adjoint methods described in Chapter 2 to efficiently compute derivatives of functions that depend on such solutions of the drift kinetic equation. This analysis will allow us to efficiently optimize the local magnetic field for several neoclassical quantities in addition to analyzing their sensitivity to changes in the magnetic field.

The material in this Chapter has been adapted from [186].

Neoclassical transport is governed by solutions of the drift kinetic equation (DKE) (5.131) from which moments (e.g., radial fluxes and bootstrap current) are computed. The DKE local to a flux surface can be solved numerically [18, 140]. However, this four-dimensional problem is expensive to solve within an optimization loop, especially in low-collisionality regimes for which increased pitch-angle resolution is required to resolve the collisional boundary layer.

Therefore, it is sometimes desirable to consider an analytic reduction of the DKE. Under the assumption of low collisionality, a bounce-averaged DKE can be considered [17, 34]. While bounce-averaging can significantly reduce the computational cost by decreasing the spatial dimensionality, this approach typically requires restrictions on the geometry, such as closeness to omnigeneity or a model magnetic field. Additional reduction of the DKE can be made in low-collisionality regimes, resulting in semi-analytic expressions. For example, the effective ripple, $\epsilon_{\text{eff}}$ [168], quantifies the geometric dependence of the $1/\nu$ radial transport ($\nu$ is the collision frequency) and has been widely used during optimization studies [106, 134, 242].
(The effective ripple will be discussed further in Chapter 5 and Appendix M.) The $1/\nu$ regime, though, is only relevant when $E_r$ is small enough that the typical poloidal rotation frequency is much smaller than the typical collision frequency [116], which is not always an experimentally-relevant regime. A low-collisionality semi-analytic bootstrap current model [205] is also commonly adopted for stellarator design [15, 114]. However, this analytic expression is known to be ill-behaved near rational surfaces. Furthermore, benchmarks with numerical solutions of the DKE in the low-collisionality limit have been shown to differ significantly from the semi-analytic model [16, 127]. Any analytic reduction of the DKE implies additional assumptions, such as on the collisionality, size of $E_r$, or on the magnetic geometry.

Due to the limitations of bounce-averaged and semi-analytic models, there are benefits to computing neoclassical quantities using numerical solutions to the DKE without approximation. With the numerical methods currently used for stellarator optimization, this approach becomes computationally challenging within an optimization loop. Due to their fully three-dimensional nature, optimization of stellarator geometry requires navigation through high-dimensional spaces, such as the space of the shape of the outer boundary of the plasma or the shapes of electromagnetic coils. The number of parameters required to describe these spaces, $N$, is often quite large ($O(10^{[\ldots]})$). Knowledge of the gradient of the objective function with respect to these parameters can significantly improve the convergence to a local minimum. Once a descent direction is identified, each iteration reduces to a one-dimensional line search.
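
The role of a descent direction and a one-dimensional line search can be illustrated with a minimal sketch. The objective below is a hypothetical smooth test function, not a stellarator figure of merit, and the steepest-descent direction with Armijo backtracking is chosen only for simplicity; all names (`SCALES`, `objective`, `backtracking_line_search`, `steepest_descent`) are assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical test problem: 20 parameters with different curvature scales.
SCALES = np.linspace(1.0, 10.0, 20)

def objective(x):
    """Smooth test objective with a unique minimum at x = 0."""
    return 0.5 * np.sum(SCALES * x**2) + 0.25 * np.sum(x**4)

def gradient(x):
    """Analytic gradient of the test objective."""
    return SCALES * x + x**3

def backtracking_line_search(f, x, direction, slope, step=1.0, shrink=0.5, c=1e-4):
    """One-dimensional search along `direction`: shrink the step until the
    Armijo sufficient-decrease condition is satisfied."""
    f0 = f(x)
    while f(x + step * direction) > f0 + c * step * slope:
        step *= shrink
    return step

def steepest_descent(f, grad, x0, tol=1e-8, max_iter=2000):
    """Repeat: pick a descent direction from the gradient, then line search."""
    x = np.asarray(x0, dtype=float)
    n_iter = 0
    for n_iter in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        direction = -g  # steepest-descent direction
        step = backtracking_line_search(f, x, direction, slope=g @ direction)
        x = x + step * direction
    return x, n_iter

x_min, n_iter = steepest_descent(objective, gradient, np.ones(20))
print("iterations:", n_iter, " |x_min| =", np.linalg.norm(x_min))
```

Each iteration needs one gradient and a handful of objective evaluations along a single direction, which is why the cost of obtaining the gradient dominates when the number of parameters is large.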
Gradient-based optimization with the Levenberg-Marquardt algorithm in the STELLOPT code [218] has been widely used in the stellarator community and led to the design of NCSX [197].

Although derivative information is valuable, numerically computing the derivative of a figure of merit $f$ (for example, with finite-difference derivatives) can be prohibitively expensive, as $f$ must be evaluated $O(N)$ times. For neoclassical optimization, this implies solving the DKE $O(N)$ times; thus including finite-collisionality neoclassical quantities in the objective function is often impractical. In this Chapter, we describe an adjoint method for neoclassical optimization. With this method, the computation of the derivatives of $f$ with respect to $N$ parameters has cost comparable to solving the DKE twice, thus making the inclusion of these quantities possible within an optimization loop. In this Chapter, we obtain derivatives of neoclassical figures of merit with respect to local geometric parameters on a surface rather than the outer boundary or coil shapes. However, the geometric derivatives we compute provide an important step toward adjoint-based optimization of MHD equilibria, as discussed in Section 4.5.2 and Chapter 5.

In Section 4.2, we provide an overview of the numerical solution of the DKE local to a flux surface. In Section 4.3 the adjoint neoclassical method is described. The continuous and discrete approaches for this problem are presented, and their implementation and benchmarks are discussed in Section 4.4. The adjoint method is used to compute derivatives of moments of the neoclassical distribution function with respect to local geometric quantities. The derivative information can be used to identify regions of increased sensitivity to magnetic perturbations, as discussed in Section 4.5.1. We demonstrate adjoint-based optimization in Section 4.5.2 by locally modifying the field strength on a flux surface.
A discussion of the application of this method for optimization of MHD equilibria is presented in Section 4.5.2. Finally, the adjoint method is applied to accelerate the calculation of the ambipolar electric field in Section 4.5.3.

The local drift kinetic equation is

$$\left(v_{\|}\hat{\mathbf{b}} + \mathbf{v}_E\right)\cdot\nabla f_{1s} - C_s(f_{1s}) = -\mathbf{v}_{ms}\cdot\nabla\psi\,\frac{\partial f_{Ms}}{\partial\psi}. \tag{4.2}$$

Here $\hat{\mathbf{b}} = \mathbf{B}/B$ is a unit vector in the direction of the magnetic field, $v_{\|} = \mathbf{v}\cdot\hat{\mathbf{b}}$ is the parallel component of the velocity, and $2\pi\psi$ is the toroidal flux. The Fokker-Planck collision operator is $C_s(f_{1s})$, linearized about a Maxwellian $f_{Ms} = n_s v_{ts}^{-3}\pi^{-3/2}e^{-v^2/v_{ts}^2}$, where $v_{ts} = \sqrt{2T_s/m_s}$ is the thermal speed, $n_s$ is the density, $T_s$ is the temperature, $m_s$ is the mass, and the subscript indicates species. In (4.2), derivatives are performed holding $W_s = m_s v^2/2 + q_s\Phi$ and $\mu = v_\perp^2/(2B)$ fixed, where $v = \sqrt{\mathbf{v}\cdot\mathbf{v}}$ is the magnitude of velocity, $\Phi$ is the electrostatic potential, $v_\perp = \sqrt{v^2 - v_{\|}^2}$ is the perpendicular velocity, and $q_s$ is the charge. The radial magnetic drift is

$$\mathbf{v}_{ms}\cdot\nabla\psi = \frac{m_s}{q_s B^2}\left(v_{\|}^2 + \frac{v_\perp^2}{2}\right)\hat{\mathbf{b}}\times\nabla B\cdot\nabla\psi, \tag{4.3}$$

assuming a magnetic field in MHD force balance, and $\mathbf{v}_E$ is the $\mathbf{E}\times\mathbf{B}$ velocity,

$$\mathbf{v}_E = \frac{\mathbf{B}\times\nabla\Phi}{B^2}. \tag{4.4}$$

Throughout we assume $\Phi = \Phi(\psi)$ such that (4.2) is linear. In (4.2) we will not consider the effect of inductive electric fields, as these can be assumed to be small for stellarators without inductive current drive. We also do not consider the effects of magnetic drifts tangential to the flux surface in (4.2), as these only become important when $E_r$ is small [184]. We can assume radial locality, manifested by the absence of any radial derivatives of $f_{1s}$ in (4.2), when $\nu_* \gg \rho_*$ [33], where $\nu_* = \nu/(v_t/L) \ll$ […] $L$ and $\rho_* = v_t m/(LqB)$ is the normalized gyroradius.
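
As a quick consistency check of the Maxwellian normalization and the thermal-speed convention $v_{ts} = \sqrt{2T_s/m_s}$ given under (4.2), the density and pressure moments can be evaluated numerically. This is only an illustrative sketch: the plasma parameters (a 1 keV proton population at $10^{20}$ m$^{-3}$), the velocity grid, and the helper `trapezoid` are assumptions chosen for the example and are not taken from the text.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on a 1D grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Assumed, illustrative plasma parameters (SI units).
n_s = 1.0e20              # density [m^-3]
T_s = 1.602176634e-16     # temperature: 1 keV expressed in joules
m_s = 1.67262192e-27      # proton mass [kg]

v_ts = np.sqrt(2.0 * T_s / m_s)              # thermal speed convention of (4.2)
v = np.linspace(0.0, 10.0 * v_ts, 20001)     # speed grid, truncated at 10 v_ts
f_M = n_s * v_ts**-3 * np.pi**-1.5 * np.exp(-(v / v_ts) ** 2)

shell = 4.0 * np.pi * v**2                   # isotropic velocity-space volume element
X2 = (v / v_ts) ** 2                         # normalized speed squared, X_s^2

density = trapezoid(shell * f_M, v)                         # should recover n_s
pressure = trapezoid(shell * f_M * m_s * v**2 / 3.0, v)     # should recover n_s * T_s
x2_moment = trapezoid(shell * f_M * X2, v) / n_s            # analytic value 3/2
x4_moment = trapezoid(shell * f_M * X2**2, v) / n_s         # analytic value 15/4

print(density / n_s, pressure / (n_s * T_s), x2_moment, x4_moment)
```

The factor of 2 inside the square root is what makes the pressure moment equal to $n_s T_s$ with this form of $f_{Ms}$; the $X_s^2$ and $X_s^4$ moments are the ones that enter when velocity-space weights such as $X_s^2 - 5/2$ and $X_s^2 - 3/2$ are integrated against the Maxwellian.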
Numerical solutions to (4.2) are computed with the Stellarator Fokker-Planck Iterative Neoclassical Solver (SFINCS) [140] code, which allows for general stellarator geometry with flux surfaces. SFINCS solves (4.2) locally on a flux surface $\psi$, a four-dimensional system. The SFINCS coordinates include two angles (poloidal angle $\theta$ and toroidal angle $\phi$), speed $X_s = v/v_{ts}$, and pitch angle $\xi_s = v_{\|}/v$. Specifics about the implementation of (4.2) in the SFINCS code are described in Appendix D. We will refer to two choices of implementation: the full trajectory model and the DKES trajectory model. The full trajectory model maintains $\mu$ conservation as radial coupling (terms involving $\partial f_{1s}/\partial\psi$) is dropped. While the DKES model does not conserve $\mu$ when $E_r \neq 0$, the adjoint operator under the DKES model takes a particularly simple form, as discussed in Section 4.3.1. This model also does not introduce any unphysical constraints on the distribution function when $E_r \neq 0$, as occurs for the full trajectory model [140]. These constraints motivate the introduction of particle and heat sources, which are discussed in the following Section. We will discuss details of the implementation of the DKE in the SFINCS code, as these need to be considered in arriving at the adjoint equation. However, the adjoint neoclassical approach is quite general and could be implemented in other drift-kinetic codes with slight modification.

From solutions of (4.2), several neoclassical quantities are computed, including the flux-surface averaged parallel flow,

$$V_{\|,s} = \frac{\left\langle B\int d^3v\, f_{1s} v_{\|}\right\rangle_\psi}{n_s\left\langle B^2\right\rangle_\psi^{1/2}}, \tag{4.5}$$

the radial particle flux,

$$\Gamma_s = \left\langle\int d^3v\,(\mathbf{v}_{ms}\cdot\nabla\rho)\, f_{1s}\right\rangle_\psi, \tag{4.6}$$

and the radial heat flux (sometimes referred to as an energy flux),

$$Q_s = \left\langle\int d^3v\,\frac{m_s v^2}{2}(\mathbf{v}_{ms}\cdot\nabla\rho)\, f_{1s}\right\rangle_\psi. \tag{4.7}$$

Here the flux-surface average of a quantity $A$ is

$$\langle A\rangle_\psi = \frac{\int_0^{2\pi}d\theta\int_0^{2\pi}d\phi\,\sqrt{g}\,A}{V'(\psi)}, \tag{4.8a}$$

$$V'(\psi) = \int_0^{2\pi}d\theta\int_0^{2\pi}d\phi\,\sqrt{g}, \tag{4.8b}$$

and $\sqrt{g} = (\nabla\psi\times\nabla\theta\cdot\nabla\phi)^{-1}$ is the Jacobian. We will also consider species-summed quantities including the bootstrap current, $J_b = \sum_s q_s n_s V_{\|,s}$, the radial current, $J_r = \sum_s q_s\Gamma_s$, and the total heat flux, $Q_{\text{tot}} = \sum_s Q_s$. Here the effective normalized radius is $\rho = \sqrt{\psi/\psi_0}$, where $2\pi\psi_0$ is the toroidal flux at the boundary.

To avoid unphysical constraints on $f_{1s}$ implied by the moment equations of (4.2) in the presence of a non-zero $E_r$ [140], particle and heat sources are added to the DKE (D.1),

$$L_{0s}f_{1s} - C_s(f_{1s}) - f_{Ms}\left(X_s^2 - \frac{5}{2}\right)S^f_{1s}(\psi) - f_{Ms}\left(X_s^2 - \frac{3}{2}\right)S^f_{2s}(\psi) = S_{0s}, \tag{4.9}$$

where $S^f_{1s}(\psi)$ and $S^f_{2s}(\psi)$ are unknowns such that $S^f_{1s}$ provides a particle source and $S^f_{2s}$ provides a heat source. The collisionless trajectory operator in SFINCS coordinates is

$$L_{0s} = \dot{\mathbf{x}}\cdot\nabla + \dot{X}_s\frac{\partial}{\partial X_s} + \dot{\xi}_s\frac{\partial}{\partial\xi_s}, \tag{4.10}$$

and the inhomogeneous drive term is $S_{0s} = -(\mathbf{v}_{ms}\cdot\nabla\psi)\,\partial f_{Ms}/\partial\psi$. The source functions are determined via the requirement that $\langle\int d^3v\, f_{1s}\rangle_\psi = 0$ and $\langle\int d^3v\, X_s^2 f_{1s}\rangle_\psi = 0$ (i.e. $f_{1s}$ does not provide net density or pressure). So, the following system of equations is solved,

$$\underbrace{\begin{bmatrix} L_{0s} - C_s & -f_{Ms}\left(X_s^2 - \frac{5}{2}\right) & -f_{Ms}\left(X_s^2 - \frac{3}{2}\right) \\ L_{1s} & 0 & 0 \\ L_{2s} & 0 & 0 \end{bmatrix}}_{\mathbb{L}_s}\underbrace{\begin{bmatrix} f_{1s} \\ S^f_{1s} \\ S^f_{2s}\end{bmatrix}}_{F_s} = \underbrace{\begin{bmatrix} S_{0s} \\ 0 \\ 0\end{bmatrix}}_{\mathbb{S}_s}. \tag{4.11}$$

The velocity-space averaging operations are denoted $L_{1s}f_{1s} = \langle\int d^3v\, f_{1s}\rangle_\psi$ and $L_{2s}f_{1s} = \langle\int d^3v\, f_{1s}X_s^2\rangle_\psi$. The full multi-species system can be written as

$$\begin{bmatrix}\mathbb{L}_1 & & \\ & \ddots & \\ & & \mathbb{L}_{N_{\text{species}}}\end{bmatrix}\begin{bmatrix}F_1 \\ \vdots \\ F_{N_{\text{species}}}\end{bmatrix} = \begin{bmatrix}\mathbb{S}_1 \\ \vdots \\ \mathbb{S}_{N_{\text{species}}}\end{bmatrix}. \tag{4.12}$$

Here the linear systems corresponding to each species as in (4.11) are coupled through the collision operator. We use the following notation to refer to the above system,

$$\mathbb{L}F = \mathbb{S}. \tag{4.13}$$

The goal of the adjoint neoclassical approach is to efficiently compute derivatives of a moment of the distribution function, $\mathcal{R}$ (e.g., $V_{\|,s}$, $\Gamma_s$, $Q_s$, $J_b$, $J_r$, $Q_{\text{tot}}$), with respect to many parameters. Consider a set of parameters, $\Omega = \{\Omega_i\}_{i=1}^{N_\Omega}$, on which $\mathcal{R}$ depends. Computing a forward-difference derivative with respect to $\Omega$ requires $N_\Omega + 1$ solutions of (4.13). With the adjoint approach, $\partial\mathcal{R}/\partial\Omega$ can be computed with one solution of (4.13) and one solution of a linear adjoint equation of the same size as (4.13). Thus if $N_\Omega$ is very large and the solution to (4.13) is computationally expensive to obtain, the adjoint approach can reduce the cost by a factor of order $N_\Omega$. For stellarator optimization, it is desirable to compute derivatives with respect to parameters that describe the magnetic geometry. In fully three-dimensional geometry, $N_\Omega$ is $O(10^{[\ldots]})$ and solving (4.13) is the most expensive part of computing $\mathcal{R}$ (rather than constructing the linear system or taking a moment of the distribution function). The discretized linear system is typically very large ($N \sim$ […] for the calculations shown in the Chapter) and sparse. Thus matrix-matrix products are significantly less expensive than the linear solve, which is performed with a preconditioned Krylov iterative method. Consequently, the adjoint method provides a factor of $N_\Omega \sim$ […] savings over both the forward sensitivity and finite-difference methods, as described in Section 2.2.1. The adjoint method also allows us to avoid additional round-off or truncation error arising from finite-difference derivatives. In what follows, we consider $\Omega$ to be a set of parameters describing the magnetic geometry, which will be specified in Section 4.4.

We compute the derivatives of $\mathcal{R}$ using two approaches.
In the first approach, we define aninner product that involves integrals over the distribution function, and an adjoint operatoris obtained with respect to this inner product. This is the continuous approach introducedin Section 2.2.2. In the second approach, we consider the DKE after discretization, definingan adjoint operator with respect to the Euclidean dot product. This is the discrete approachintroduced in Section 2.2.1. While these approaches should provide identical results withindiscretization error, the advantages and drawbacks of each method will be discussed at theend of Section 4.3.2. Let F = { F s } N species s =1 be the set of unknowns computed with SFINCS before discretization,denoted by the column vector in (4.12) with F s given by (4.11). That is, F consists of a setof N species distribution functions over ( θ, φ, X s , ξ s ) and their associated source functions. Wedefine an inner product between two such quantities in the following way, (cid:104) F, G (cid:105) = (cid:88) s (cid:28)(cid:90) d v f s g s f Ms (cid:29) ψ + S f s S g s + S f s S g s . (4.14)Here the superscript on S s and S s denotes the distribution function with which the sourcefunctions are associated and the sum is over species. The space of continuous functions, F ,of this form such that (cid:104) F, F (cid:105) is bounded will be denoted by H . It can be seen that (4.14)is indeed an inner product, as it satisfies conjugate symmetry ( (cid:104) G, F (cid:105) = (cid:104) F, G (cid:105) ∀
F, G ∈ H ),linearity ( (cid:104) F + G, H (cid:105) = (cid:104) F, H (cid:105) + (cid:104) G, H (cid:105) ∀
F, G, H ∈ H and (cid:104)
F, aG (cid:105) = a (cid:104) F, G (cid:105) ∀
$F, G \in H$, $a \in \mathbb{R}$), and positive definiteness ($\langle F, F \rangle \geq 0$, with $\langle F, F \rangle = 0$ only if $F = 0$, $\forall F \in H$) [199]. This implies that if $H$ is finite-dimensional, then for any linear operator $\mathbb{L}$ there exists a unique adjoint operator $\mathbb{L}^{\dagger}$ such that $\langle \mathbb{L} F, G \rangle = \langle F, \mathbb{L}^{\dagger} G \rangle$ for all $F, G \in H$. While here $H$ is not finite-dimensional, we will show that such an adjoint operator exists for this inner product.

Note that the norm associated with this inner product, $\|F\| = \sqrt{\langle F, F \rangle}$, is similar to the free energy norm,

$$ W = \sum_s \left\langle \int d^3 v \, \frac{T_s f_s^2}{2 f_{Ms}} \right\rangle_{\psi}, \tag{4.15} $$

which obeys a conservation equation in gyrokinetic theory [2, 132, 141]. The choice of inner product (4.14) is advantageous, as the linearized Fokker-Planck collision operator becomes self-adjoint for species linearized about Maxwellians with the same temperature. In what follows, we assume that all included species are of the same temperature. This assumption could be lifted, with a modification to the collision operator that appears in the adjoint equation (Appendix E). This assumption is not necessary when using the discrete approach (Section 4.3.2).

Consider a moment of the distribution function $\mathcal{R} \in \{ V_{\|,s}, \Gamma_s, Q_s, J_b, J_r, Q_{\text{tot}} \}$, which can be written as an inner product with a vector $\tilde{\mathcal{R}} \in H$,

$$ \mathcal{R} = \langle F, \tilde{\mathcal{R}} \rangle, \tag{4.16} $$

according to (4.14). For example,

$$ \tilde{J}_r = \left\{ \begin{bmatrix} q_s \mathbf{v}_{\mathrm{m}s} \cdot \nabla \psi \, f_{Ms} \\ 0 \\ 0 \end{bmatrix} \right\}_{s=1}^{N_{\text{species}}}, \tag{4.17} $$

where the column structure corresponds with that in (4.11) and (4.12).

We are interested in computing the derivative of $\mathcal{R}$ with respect to a set of parameters, $\Omega = \{ \Omega_i \}_{i=1}^{N_\Omega}$, such that the DKE is satisfied. Computing such a derivative with the forward sensitivity method requires that we compute $\partial F(\Omega)/\partial \Omega_i$ from the linearized DKE,

$$ \frac{\partial \mathbb{L}(\Omega)}{\partial \Omega_i} F + \mathbb{L} \frac{\partial F(\Omega)}{\partial \Omega_i} = \frac{\partial \mathbb{S}(\Omega)}{\partial \Omega_i}, \tag{4.18} $$

for each $\Omega_i$ and evaluate the derivative using the chain rule,

$$ \frac{\partial \mathcal{R}(\Omega, F(\Omega))}{\partial \Omega_i} = \frac{\partial \mathcal{R}(\Omega, F)}{\partial \Omega_i} + \left\langle \tilde{\mathcal{R}}, \frac{\partial F(\Omega)}{\partial \Omega_i} \right\rangle. \tag{4.19} $$

We see that the forward sensitivity method requires solutions of $N_\Omega$ linear systems of the same dimension as the DKE (4.13).

To avoid this additional computational cost, we instead apply the adjoint method by constructing the Lagrangian functional, enforcing (4.13) as a constraint,

$$ \mathcal{L}(\Omega, F, \lambda^{\mathcal{R}}) = \mathcal{R}(\Omega, F) + \left\langle \lambda^{\mathcal{R}}, \mathbb{L} F - \mathbb{S} \right\rangle. \tag{4.20} $$

Here $\lambda^{\mathcal{R}}$ is the Lagrange multiplier. We obtain the adjoint equation by finding a stationary point of $\mathcal{L}$ with respect to $F$,

$$ \delta \mathcal{L}(\Omega, F, \lambda^{\mathcal{R}}; \delta F) = \langle \delta F, \tilde{\mathcal{R}} \rangle + \left\langle \lambda^{\mathcal{R}}, \mathbb{L} \, \delta F \right\rangle = 0. \tag{4.21} $$

We can now use the adjoint property to express the above as

$$ \delta \mathcal{L}(\Omega, F, \lambda^{\mathcal{R}}; \delta F) = \langle \delta F, \tilde{\mathcal{R}} + \mathbb{L}^{\dagger} \lambda^{\mathcal{R}} \rangle. \tag{4.22} $$

A stationary point of $\mathcal{L}$ with respect to $F$ corresponds to $\lambda^{\mathcal{R}}$ which satisfies the weak form of the adjoint equation,

$$ \mathbb{L}^{\dagger} \lambda^{\mathcal{R}} + \tilde{\mathcal{R}} = 0. \tag{4.23} $$

With this adjoint variable, we can now compute derivatives of $\mathcal{R}$ with respect to any parameter by computing the corresponding perturbations of $\mathcal{L}$,

$$ \frac{\partial \mathcal{R}(\Omega, F(\Omega))}{\partial \Omega_i} = \frac{\partial \mathcal{L}(\Omega, F, \lambda^{\mathcal{R}})}{\partial \Omega_i} = \frac{\partial \mathcal{R}(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{\mathcal{R}}, \frac{\partial \mathbb{L}(\Omega)}{\partial \Omega_i} F - \frac{\partial \mathbb{S}(\Omega)}{\partial \Omega_i} \right\rangle. \tag{4.24} $$

The first term on the right hand side accounts for the explicit dependence on $\Omega_i$ while the second accounts for the implicit dependence on $\Omega_i$ through $F$. Thus, using (4.24), the derivative with respect to $\Omega$ can be computed with the solution to two linear systems, (4.13) and (4.23). The partial derivatives on the right hand side of (4.24) can be computed analytically by considering the explicit geometric dependence of $\mathcal{R}$, $\mathbb{L}$, and $\mathbb{S}$.

When $N_\Omega$ is large, the cost of computing $\partial \mathcal{R}/\partial \Omega$ using (4.24) is dominated not by the linear solve but by constructing $\partial \mathbb{S}/\partial \Omega$ and $\partial \mathbb{L}/\partial \Omega$ and computing the inner product. Thus the cost still scales with $N_\Omega$.
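The contrast between the forward sensitivity route (4.18)-(4.19) and the adjoint route (4.23)-(4.24) can be illustrated on a small dense system. The following is a minimal sketch, not the SFINCS implementation: the 2x2 operator, the source, the vector `R_TILDE`, and all function names are hypothetical, the Euclidean dot product stands in for the inner product (4.14), and the moment is assumed to have no explicit dependence on the parameters.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def matvec(a, x):
    return [a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

# Hypothetical operator L(Omega) and source S(Omega) with three parameters.
def operator(om):
    return [[2.0 + om[0], 0.5 * om[1]], [om[2], 3.0 + om[0] * om[1]]]

def source(om):
    return [1.0 + om[1], om[0] - om[2]]

def d_operator(om, i):
    """Analytic dL/dOmega_i."""
    return [[[1.0, 0.0], [0.0, om[1]]],
            [[0.0, 0.5], [0.0, om[0]]],
            [[0.0, 0.0], [1.0, 0.0]]][i]

def d_source(om, i):
    """Analytic dS/dOmega_i."""
    return [[0.0, 1.0], [1.0, 0.0], [0.0, -1.0]][i]

R_TILDE = [1.0, -2.0]  # moment R = <F, R_TILDE>

def moment(om):
    return dot(solve2(operator(om), source(om)), R_TILDE)

def forward_sensitivity(om):
    """One extra linear solve per parameter, as in (4.18)-(4.19)."""
    L = operator(om)
    F = solve2(L, source(om))
    grad = []
    for i in range(len(om)):
        dLF = matvec(d_operator(om, i), F)
        dS = d_source(om, i)
        dF = solve2(L, [dS[0] - dLF[0], dS[1] - dLF[1]])
        grad.append(dot(R_TILDE, dF))
    return grad

def adjoint_gradient(om):
    """One adjoint solve in total, as in (4.23)-(4.24)."""
    L = operator(om)
    F = solve2(L, source(om))
    # Adjoint equation L^T lambda + R_TILDE = 0.
    lam = solve2(transpose(L), [-R_TILDE[0], -R_TILDE[1]])
    grad = []
    for i in range(len(om)):
        dLF = matvec(d_operator(om, i), F)
        dS = d_source(om, i)
        grad.append(dot(lam, [dLF[0] - dS[0], dLF[1] - dS[1]]))
    return grad

omega = [0.3, -0.2, 0.7]
g_fwd = forward_sensitivity(omega)
g_adj = adjoint_gradient(omega)
print(g_fwd)
print(g_adj)
```

Both routines return the same gradient up to rounding. The forward route performs one linear solve per parameter, while the adjoint route performs a single additional solve and then only inner products, which is the scaling argument made in the text.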
However, we obtain a significant savings in comparison with forward-difference derivatives, as shown in Section 4.4.

The adjoint operator for each species takes the following form,

$$ \mathbb{L}^{\dagger}_s = \begin{bmatrix} L^{\dagger}_{0s} - C_s & f_{Ms} & f_{Ms} X_s^2 \\ L^{\dagger}_{1s} & 0 & 0 \\ L^{\dagger}_{2s} & 0 & 0 \end{bmatrix}, \tag{4.25} $$

where $L^{\dagger}_{1s} = \tfrac{5}{2} L_{1s} - L_{2s}$ and $L^{\dagger}_{2s} = \tfrac{3}{2} L_{1s} - L_{2s}$. The same column structure is used as for the forward operator (4.12), $\mathbb{L}^{\dagger} = \{ \mathbb{L}^{\dagger}_s \}_{s=1}^{N_{\text{species}}}$. The quantity $L^{\dagger}_{0s}$ satisfies $\left\langle \int d^3 v \, g_s L_{0s} f_s / f_{Ms} \right\rangle_{\psi} = \left\langle \int d^3 v \, f_s L^{\dagger}_{0s} g_s / f_{Ms} \right\rangle_{\psi}$ and depends on which trajectory model is applied. The expression (4.25) can be verified by noting that

$$ \langle \mathbb{L} F, G \rangle = \sum_s \left[ \left\langle \int d^3 v \, \frac{f_s \left( (L^{\dagger}_{0s} - C_s) g_s + f_{Ms} \left( S^{g}_{1s} + S^{g}_{2s} X_s^2 \right) \right)}{f_{Ms}} \right\rangle_{\psi} + S^{f}_{1s} L^{\dagger}_{1s} g_s + S^{f}_{2s} L^{\dagger}_{2s} g_s \right] = \langle F, \mathbb{L}^{\dagger} G \rangle. \tag{4.26} $$

For the DKES trajectories the adjoint operator is

$$ L^{\dagger}_{0s} = - L_{0s}. \tag{4.27} $$

This anti-self-adjoint property is used in obtaining the variational principle which provides bounds on neoclassical transport coefficients in the DKES code [230]. For full trajectories it is

$$ L^{\dagger}_{0s} = - L_{0s} + \frac{q_s}{T_s} \Phi'(\psi) \, \mathbf{v}_{\mathrm{m}s} \cdot \nabla \psi. \tag{4.28} $$

The anti-self-adjoint property does not hold for this trajectory model as the $E \times B$ drift (F.9) is no longer divergenceless. Appendix F contains details on obtaining these adjoint operators.

Next, we consider the discrete adjoint approach. Let $\overrightarrow{F}$ be the set of unknowns computed with SFINCS after discretization of $F$. The linear DKE (4.13) upon discretization can then be written schematically as

$$ \overleftrightarrow{\mathbb{L}} \, \overrightarrow{F} = \overrightarrow{\mathbb{S}}. \tag{4.29} $$

In this case, we can define an inner product as the vector dot product,

$$ \langle \overrightarrow{F}, \overrightarrow{G} \rangle = \overrightarrow{F} \cdot \overrightarrow{G}. \tag{4.30} $$

In real Euclidean space, the adjoint operator, $\left( \overleftrightarrow{\mathbb{L}} \right)^{\dagger}$, which satisfies

$$ \left\langle \overleftrightarrow{\mathbb{L}} \, \overrightarrow{F}, \overrightarrow{G} \right\rangle = \left\langle \overrightarrow{F}, \left( \overleftrightarrow{\mathbb{L}} \right)^{\dagger} \overrightarrow{G} \right\rangle, \tag{4.31} $$

is simply the transpose of the matrix, $\left( \overleftrightarrow{\mathbb{L}} \right)^{T}$.
Again, the moments of the distribution function, $\mathcal{R}$, can be expressed as an inner product with a vector $\overrightarrow{\mathcal{R}}$,

$$ \mathcal{R} = \langle \overrightarrow{F}, \overrightarrow{\mathcal{R}} \rangle. \tag{4.32} $$

Using the discrete approach, the following adjoint equation must be solved,

$$ \left( \overleftrightarrow{\mathbb{L}} \right)^{T} \overrightarrow{\lambda}^{\mathcal{R}} = \overrightarrow{\mathcal{R}}. \tag{4.33} $$

The adjoint variable, $\overrightarrow{\lambda}^{\mathcal{R}}$, can again be used to compute the derivative of $\mathcal{R}$ with respect to $\Omega$,

$$ \frac{\partial \mathcal{R}\left( \Omega, \overrightarrow{F}(\Omega) \right)}{\partial \Omega_i} = \frac{\partial \mathcal{R}\left( \Omega, \overrightarrow{F} \right)}{\partial \Omega_i} + \left\langle \overrightarrow{\lambda}^{\mathcal{R}}, \left( \frac{\partial \overrightarrow{\mathbb{S}}(\Omega)}{\partial \Omega_i} - \frac{\partial \overleftrightarrow{\mathbb{L}}(\Omega)}{\partial \Omega_i} \overrightarrow{F} \right) \right\rangle. \tag{4.34} $$

As with the continuous approach, the partial derivatives on the right hand side can be computed analytically. In this way, the derivative of $\mathcal{R}$ with respect to $\Omega$ can be computed with only two linear solves, (4.29) and (4.33).

In the SFINCS implementation, the DKE is typically solved with the preconditioned GMRES algorithm. In the continuous approach, a preconditioner matrix for both the forward and adjoint operator must be $LU$-factorized. Here the preconditioner matrix is the same as the full matrix but without cross-species or speed coupling. As the adjoint matrix is sufficiently different from the forward matrix, we do not obtain convergence when the same preconditioner is used for both problems. However, in the discrete approach, the $LU$-factorization for the preconditioner of the forward matrix can be reused for the preconditioner of the adjoint matrix. (If a matrix $A$ has been factorized as $A = LU$, then $A^T = U^T L^T$, where $U^T$ is lower triangular and $L^T$ is upper triangular.) This provides a significant reduction in memory and computational cost for the discrete approach.

Furthermore, the discrete adjoint approach provides the exact derivatives for the discretized problem. With this method, the adjoint equation is obtained using the vector dot product and matrix transpose, which can be computed without any numerical approximation.
The error in the derivatives obtained by the adjoint method is therefore only limited by the tolerance to which the linear solve is performed with GMRES. On the other hand, the continuous adjoint approach relies on a continuous inner product that must ultimately be approximated numerically. Thus the continuous approach provides the exact derivatives only in the limit that the discrete approximation of the inner product exactly reproduces the continuous inner product. Therefore we expect the results of the discrete and continuous approaches to agree within discretization error, as will be demonstrated in Section 4.4.

The continuous approach can be advantageous in that an adjoint equation may be prescribed independently of the discretization scheme. Note that in the discrete approach, the adjoint operator is obtained from the matrix transpose of the discretized forward operator, which implies that the same spatial and velocity resolution parameters must be used for both the forward and adjoint solutions. In this Chapter, we will employ the same discretization parameters for both the adjoint and forward problems, but this restriction is not required for the continuous approach.

The adjoint method has been implemented in the main branch of the SFINCS code (https://github.com/landreman/sfincs) using both the discrete and continuous approaches. The magnetic geometry is specified in Boozer coordinates (Appendix A.4) such that the covariant form of the magnetic field is

$$ \mathbf{B} = I(\psi) \nabla \vartheta_B + G(\psi) \nabla \varphi_B + K(\psi, \vartheta_B, \varphi_B) \nabla \psi, \tag{4.35} $$

where $I(\psi) = \mu_0 I_T(\psi)/2\pi$ and $G(\psi) = \mu_0 I_P(\psi)/2\pi$, $I_T(\psi)$ is the toroidal current enclosed by $\psi$, and $I_P(\psi)$ is the poloidal current outside of $\psi$. The contravariant form is

$$ \mathbf{B} = \nabla \psi \times \nabla \vartheta_B - \iota(\psi) \nabla \psi \times \nabla \varphi_B, \tag{4.36} $$

where $\iota(\psi)$ is the rotational transform. The Jacobian is obtained from dotting (4.35) with (4.36),

$$ \sqrt{g} = \frac{G(\psi) + \iota(\psi) I(\psi)}{B^2}. \tag{4.37} $$
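The step from (4.35) and (4.36) to (4.37) can be spelled out as follows. This is a sketch that assumes the Jacobian convention $\sqrt{g} = \left[ \nabla \psi \cdot (\nabla \vartheta_B \times \nabla \varphi_B) \right]^{-1}$, which is the usual choice for these coordinates but is not stated explicitly in this Section.

```latex
\begin{align}
B^2 = \mathbf{B} \cdot \mathbf{B}
  &= \left( I \nabla\vartheta_B + G \nabla\varphi_B + K \nabla\psi \right) \cdot
     \left( \nabla\psi \times \nabla\vartheta_B - \iota \, \nabla\psi \times \nabla\varphi_B \right) \\
  &= G \, \nabla\varphi_B \cdot \left( \nabla\psi \times \nabla\vartheta_B \right)
     - \iota I \, \nabla\vartheta_B \cdot \left( \nabla\psi \times \nabla\varphi_B \right) \\
  &= \frac{G}{\sqrt{g}} + \frac{\iota I}{\sqrt{g}}
   = \frac{G + \iota I}{\sqrt{g}} .
\end{align}
```

Every triple product containing a repeated gradient vanishes, which removes all terms involving $K$. The two surviving triple products equal $+1/\sqrt{g}$ and $-1/\sqrt{g}$ respectively by cyclic permutation, and solving for $\sqrt{g}$ gives (4.37).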
Since $K(\psi, \vartheta_B, \varphi_B)$ does not appear in any of the trajectory coefficients ((D.2) and (D.4)), in the drive term in (D.1), or in the geometric factors used to define the moments of the distribution function ((4.5), (4.6), and (4.7)), all the geometric dependence enters through $B(\psi, \vartheta_B, \varphi_B)$, $G(\psi)$, $I(\psi)$, and $\iota(\psi)$. We choose to use Boozer coordinates for these computations as it reduces the number of geometric parameters that must be considered, but the neoclassical adjoint method is not limited to this choice of coordinate system.

We approximate $B$ by a truncated Fourier series,

$$ B = \sum_{m,n} B^c_{m,n} \cos(m \vartheta_B - n N_P \varphi_B), \tag{4.38} $$

where the sum is taken over Fourier modes $m \leq m_{\max}$ and $|n| \leq n_{\max}$ and $N_P$ is the number of periods. In (4.38), we have assumed stellarator symmetry such that $B(-\vartheta_B, -\varphi_B) = B(\vartheta_B, \varphi_B)$, and $N_P$ symmetry such that $B(\vartheta_B, \varphi_B + 2\pi/N_P) = B(\vartheta_B, \varphi_B)$. Thus we compute derivatives with respect to the parameters $\Omega = \{ B^c_{m,n}, I(\psi), G(\psi), \iota(\psi) \}$. Additionally, derivatives with respect to $E_r$ are computed, which are used for efficient ambipolar solutions and computing derivatives of geometric quantities at ambipolarity (Section 4.5.3) rather than at fixed $E_r$.

To demonstrate, we compute $\partial \mathcal{R}/\partial B^c_{,}$ for moments of the ion distribution function using the discrete and continuous adjoint methods. We use a 3-mode model of the standard configuration W7-X geometry at $\rho = \sqrt{\psi/\psi_0} =$ 0 . ,

$$ B = B^c_{0,0} + B^c_{0,1} \cos(N_P \varphi_B) + B^c_{1,1} \cos(\vartheta_B - N_P \varphi_B) + B^c_{1,0} \cos(\vartheta_B), \tag{4.39} $$

where $B^c_{,}$ = 0 . $B^c_{,}$, $B^c_{,}$ = − . $B^c_{,}$, and $B^c_{,}$ = − . $B^c_{,}$. Electron and ion ($q_i = e$) species are included, and the derivatives are computed at the ambipolar $E_r$ with the full trajectory model. The derivatives are also computed with a forward-difference approach with varying step size $\Delta B^c_{,}$. In Figure 4.1 we show the fractional difference between $\partial \mathcal{R}/\partial B^c_{,}$ computed using the adjoint method and with forward-difference derivatives.
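The behavior of a forward-difference benchmark as the step size is varied can be previewed on a scalar function with a known derivative. The sketch below is only an illustration and is unrelated to SFINCS: it uses `math.exp`, whose derivative is known exactly, in place of a moment of the distribution function, and the function name `forward_difference_error` is hypothetical.

```python
import math

def forward_difference_error(f, dfdx, x, h):
    """Fractional error of the one-sided difference (f(x + h) - f(x)) / h."""
    approx = (f(x + h) - f(x)) / h
    exact = dfdx(x)
    return abs(approx - exact) / abs(exact)

# Step sizes from 1e-1 down to 1e-14.
steps = [10.0 ** (-k) for k in range(1, 15)]
errors = {h: forward_difference_error(math.exp, math.exp, 1.0, h) for h in steps}

for h in steps:
    print(f"h = {h:.0e}   fractional error = {errors[h]:.2e}")
```

For the larger steps the error falls roughly in proportion to the step size, as expected for a first-order formula. For the smallest steps it grows again because the subtraction in the numerator loses precision to rounding.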
We see that at large values of $\Delta B^c_{,}$, the adjoint and numerical derivatives begin to differ significantly due to discretization error from the forward-difference approximation. The fractional error decreases proportional to $\Delta B^c_{,}$ as expected until the rounding error begins to dominate [203] when $\Delta B^c_{,}/B^c_{,}$ is approximately $10^{-}$, where $B^c_{,}$ is the value of the unperturbed mode. The discrete and continuous approaches show qualitatively similar trends. However, the minimum fractional difference is lower in the discrete approach due to the additional discretization error that arises with the continuous approach. With sufficient resolution parameters (41 $\theta$ grid points, 61 $\phi$ grid points, 85 $\xi$ basis functions, and 7 $X$ basis functions), the fractional error of the continuous approach is $\leq$ .1% and should not be significant for most applications. We find similar agreement for other derivatives and with the DKES trajectory model.

To demonstrate that the discrete and continuous methods indeed produce the same derivative information, we compute the fractional difference between the derivatives computed with the two methods as a function of the resolution parameters. As an example, in Figure 4.2a we show the fractional difference in $\partial Q_i/\partial \iota$, where $Q_i$ is the radial ion heat flux, as a function of the number of Legendre polynomials used for the pitch angle discretization, $N_\xi$, keeping the other resolution parameters fixed. As $N_\xi$ is increased, the fractional differences converge to a finite value, approximately $10^{-}$, due to the discretization error in the other resolution parameters. Similar resolution parameters are required for the convergence of the moment itself, $Q_i$, and its derivative computed with the continuous method, $\partial Q_i/\partial \iota$. Convergence of $Q_i$ within 5% is obtained with $N_\xi = 38$, similar to that required for the convergence of $\partial Q/\partial \iota$, as can be seen in Figure 4.2a.

[Figure 4.1. Panel (a): discrete approach; panel (b): continuous approach. Vertical axis: fractional difference.]

Figure 4.1: Fractional difference between derivatives with respect to $B^c_{,}$ computed with the adjoint method and with a forward-difference derivative with step size $\Delta B^c_{,}$. The full trajectory model was used with (a) the discrete and (b) the continuous adjoint approaches. Figure adapted from [186] with permission.

In Figure 4.2b, we compare the cost of calculating derivatives of one moment with respect to $N_\Omega$ parameters using the continuous and discrete adjoint methods and forward-difference derivatives. All computations are performed on the Edison computer at NERSC using 48 processors, and the elapsed wall time is reported.
Here we include the cost of solving the linear system and computing diagnostics $N_\Omega + 1$ times for the forward-difference approach, and the cost of solving the forward and adjoint linear systems and computing diagnostics for the adjoint approaches. The cost of the continuous approach is slightly more than that of the discrete approach due to the cost of factorizing the adjoint preconditioner. However, at large $N_\Omega$ the cost of computing diagnostics for the adjoint approach (e.g., computing $\partial \mathbb{S}/\partial \Omega$ and $\partial \mathbb{L}/\partial \Omega$ and performing the inner product in (4.24)) dominates that of solving the adjoint linear system; thus the discrete and continuous approaches become comparable in cost. In this regime, the adjoint approach provides speed-up by a factor of approximately 50.

[Figure 4.2. Panel (a): fractional difference; panel (b): wall clock time [s] for forward difference, continuous adjoint, and discrete adjoint.]
Figure 4.2: (a) The fractional difference between $\partial Q_i/\partial \iota$ computed with the continuous and discrete approaches converges with the number of pitch angle Legendre modes, $N_\xi$. (b) Comparison of the computational cost of computing $\partial \mathcal{R}/\partial \Omega$ with forward-difference derivatives and the adjoint approach as a function of $N_\Omega$, the number of parameters in the gradient. Figure reproduced from [186] with permission.

With the adjoint method, it is possible to compute derivatives of a moment of the distribution function with respect to the Fourier amplitudes of the field strength, $\{ \partial \mathcal{R}/\partial B^c_{m,n} \}$. Rather than consider sensitivity in Fourier space, we would like to compute the sensitivity to local perturbations of the field strength. We now quantify the relationship between these two representations of sensitivity information.

Consider the Gateaux functional derivative [52] of $\mathcal{R}$ with respect to $B$,

$$ \delta \mathcal{R}(B(\mathbf{x}); \delta B) = \lim_{\epsilon \to 0} \frac{\mathcal{R}(B(\mathbf{x}) + \epsilon \, \delta B(\mathbf{x})) - \mathcal{R}(B(\mathbf{x}))}{\epsilon}. \tag{4.40} $$

Here the field strength is perturbed at fixed $I(\psi)$, $G(\psi)$, and $\iota(\psi)$. As $\delta \mathcal{R}(B(\mathbf{x}); \delta B)$ is a linear functional of $\delta B$, by the Riesz representation theorem [199], $\delta \mathcal{R}$ can be expressed as an inner product with $\delta B$ and some element of the appropriate space. The function $\delta B$ is defined on a flux surface, $\psi$; thus it is sensible to express $\delta \mathcal{R}$ in the following way,

$$ \delta \mathcal{R}(B(\mathbf{x}); \delta B) = \left\langle S_{\mathcal{R}} \, \delta B(\mathbf{x}) \right\rangle_{\psi}. \tag{4.41} $$

Here $\delta \mathcal{R}$ quantifies the change in the moment $\mathcal{R}$ associated with a local perturbation to the field strength, $\delta B(\mathbf{x})$. The function $S_{\mathcal{R}}$ is analogous to the shape gradient introduced in Section 2.1, which will be discussed further in Section 4.5.2.

Suppose that $B$ is stellarator symmetric and $N_P$ symmetric. If $E_r = 0$, then $S_{\mathcal{R}}$ must also possess stellarator and $N_P$ symmetry (Appendix G). However, when $E_r \neq 0$, $S_{\mathcal{R}}$ is no longer guaranteed to have stellarator symmetry.
Nonetheless, it may be desirable to ignore the stellarator-asymmetric part of $S_{\mathcal{R}}$ if an optimized stellarator-symmetric configuration is desired. For the remainder of this Chapter, we will make this assumption, though the analysis could be extended to consider the effect of breaking of stellarator symmetry. A truncated Fourier series can approximate the quantity $S_{\mathcal{R}}$ under these assumptions,

$$ S_{\mathcal{R}} = \sum_{m,n} S_{m,n} \cos(m \vartheta_B - n N_P \varphi_B), \tag{4.42} $$

where the sum is taken over $m \leq m_{\max}$ and $|n| \leq n_{\max}$. The quantity $\delta B(\mathbf{x})$ can be written in terms of perturbations to the Fourier coefficients,

$$ \delta B(\mathbf{x}) = \sum_{m,n} \delta B^c_{m,n} \cos(m \vartheta_B - n N_P \varphi_B), \tag{4.43} $$

and now $\delta \mathcal{R}$ can be written in terms of these perturbations to the Fourier coefficients,

$$ \delta \mathcal{R} = \sum_{m,n} \frac{\partial \mathcal{R}}{\partial B^c_{m,n}} \delta B^c_{m,n}. \tag{4.44} $$

In this way, (4.41) can be expressed as a linear system,

$$ \frac{\partial \mathcal{R}}{\partial B^c_{m,n}} = \sum_{m',n'} D_{m,n;m',n'} S_{m',n'}, \tag{4.45} $$

where

$$ D_{m,n;m',n'} = V'(\psi)^{-1} \int_0^{2\pi} d\vartheta_B \int_0^{2\pi} d\varphi_B \, \sqrt{g} \cos(m \vartheta_B - n N_P \varphi_B) \cos(m' \vartheta_B - n' N_P \varphi_B). \tag{4.46} $$

If the same number of modes is used to discretize $\delta \mathcal{R}$ and $S_{\mathcal{R}}$, then the linear system is square.

In contrast to derivatives with respect to the Fourier modes of $B$, the sensitivity function, $S_{\mathcal{R}}$, is a spatially local quantity, quantifying the change in a figure of merit resulting from a local perturbation of the field strength. In this way, $S_{\mathcal{R}}$ can inform where perturbations to the magnetic field strength can be tolerated. The sensitivity function could be related directly to a local magnetic tolerance, as described in Section 2.1.3. In contrast with the work in [138], here we are considering perturbations to the field strength on any flux surface rather than at the plasma boundary. However, $S_{\mathcal{R}}$ still provides insight into where trim coils should be placed or coil displacements can be tolerated without sacrificing desired neoclassical properties.
The sensitivity function can also be used for gradient-based optimization in the space of the field strength on a flux surface, as demonstrated in Section 4.5.2.

We compute $S_{J_b}$ for the W7-X standard configuration at $\rho = 0.70$, shown in Figure 4.3a. We use a fixed-boundary equilibrium that preceded the coil design and does not include coil ripple, and the full equilibrium is used rather than the truncated Fourier series considered in Section 4.4. The same resolution parameters are used as in Section 4.4, and derivatives with respect to $B^c_{m,n}$ are computed for $m_{\max} = n_{\max} = 20$. The largest modes for this configuration are the helical curvature $B^c_{1,1}$, the toroidal curvature $B^c_{1,0}$, and the toroidal mirror $B^c_{0,1}$. We find that $S_{J_b}$ is large and negative on the inboard side, indicating that increasing the magnitude of the toroidal curvature component of $B$ would lead to an increase in $J_b$. This result is in agreement with previous analysis [155], which found that at low collisionality, the bootstrap current coefficients depend strongly on the toroidal curvature. Additionally, we note a localized region of strong sensitivity on the inboard side near the bean-shaped cross-section. Experimental [55] and numerical [75] evidence indicates that the magnitude of the bootstrap current is increased in the lower mirror-ratio configuration of W7-X, where the mirror-ratio is defined as $(B_{\max} - B_{\min})/(B_{\max} + B_{\min})$. Our result appears to be consistent with these observations: we note that the localized region of strongly positive $S_{J_b}$ is near the maximum of the magnetic field strength (Figure 4.3c), indicating that increasing the mirror-ratio would lead to a decrease in the magnitude of bootstrap current, as $J_b < 0$.

The sensitivity function for the ion particle flux, $S_{\Gamma_i}$, computed for the same configuration using $m_{\max} = 20$ and $n_{\max} = 20$, is shown in Figure 4.3b. We find that the particle flux is more sensitive to perturbations on the outboard side in localized regions, while on the inboard side the sensitivity is relatively small in magnitude.

[Figure 4.3, panels (a), (b), and (c).]

Figure 4.3: (a) The local magnetic sensitivity function for the bootstrap current, $S_{J_b}$, is shown for the W7-X standard configuration. Positive values indicate that increasing the field strength at a given location will increase $J_b$ through (4.41). (b) The local sensitivity function for the ion particle flux, $S_{\Gamma_i}$. (c) The magnetic field strength on the $\rho = 0.70$ surface.

Optimization of the magnetic field strength
As a second demonstration of the adjoint neoclassical method, we consider optimizing in the space of the field strength on a surface, taking $\Omega = \{ B^c_{m,n} \}$. As Boozer coordinates are used, the covariant form (4.35) satisfies $(\nabla \times \mathbf{B}) \cdot \nabla \psi = 0$ and the contravariant form (4.36) satisfies $\nabla \cdot \mathbf{B} = 0$. As we will artificially modify the field strength while keeping other geometry parameters fixed, the resulting field will not necessarily satisfy both of these conditions with both the covariant and contravariant forms. While there is no guarantee that the resulting field strength will be consistent with a global equilibrium solution, it provides insight into how local changes to the field strength can impact neoclassical properties. As a second step, the outer boundary could be optimized to match the desired field strength on a single surface. In Section 4.5.2, we discuss how the derivatives computed in this Chapter could be coupled to the optimization of an MHD equilibrium.

We perform optimization with a BFGS quasi-Newton method (Chapter 6 in [170]) using an objective function $\chi^2 = J_b^2$, implemented in the sfincs adjoint branch of the STELLOPT code. A backtracking line search is used at each iteration to find a step size that satisfies a condition of sufficient decrease of $\chi^2$. We use the same equilibrium as in Section 4.5.1, retaining modes $m \leq 12$ and $|n| \leq 12$, and compute derivatives with respect to these modes. Convergence to $\chi^2 \leq$ − was obtained within 8 BFGS iterations (28 function evaluations), as shown in Figure 4.4a. The difference in field strength between the initial and optimized configuration, $B_{\text{opt}} - B_{\text{init}}$, is shown in Figure 4.4b. As expected from the analysis in Section 4.5.1, the field strength increased on the outboard side and decreased on the inboard side in comparison with $B_{\text{init}}$. (Note that $J_b < 0$.)

[Figure 4.4. Panel (a): horizontal axis is the iteration number, with the initial value marked; panel (b).]

Figure 4.4: (a) Convergence of $\chi^2 = J_b^2$ for optimization over $\Omega = \{ B^c_{m,n} \}$ with an adjoint-based BFGS method. (b) The change in field strength from the initial to optimized configuration. Figure adapted from [186] with permission.

Optimization of MHD equilibria
The local sensitivity function, $S_{\mathcal{R}}$, along with $\partial \mathcal{R}/\partial I$, $\partial \mathcal{R}/\partial G$, and $\partial \mathcal{R}/\partial \iota$, can be used to determine how perturbations to the outer boundary of the plasma, $S_P$, result in perturbations to $\mathcal{R}$. This is quantified through the idea of the shape gradient, introduced in Section 2.1. The partial derivatives of $\mathcal{R}$ can be computed with the adjoint method outlined in Section 4.3, and the shape gradient can be obtained with only one additional MHD equilibrium solution through the application of another adjoint method.

Consider a figure of merit which is integrated over the toroidal confinement volume, $V_P$,

$$ f_{\mathcal{R}}(S_P) = \int_{V_P} d^3 x \, w(\psi) \mathcal{R}(\psi), \tag{4.47} $$

where $w(\psi)$ is a weighting function. That is, SFINCS is run on a set of $\psi$ surfaces within $V_P$ and the volume integral is computed numerically. Here we consider $S_P$ to be the plasma boundary used for a fixed-boundary MHD equilibrium calculation. From the Hadamard-Zolesio structure theorem (Section 2.1), the perturbation to $f_{\mathcal{R}}$ resulting from normal perturbation to $S_P$ can be written in the following form,

$$ \delta f_{\mathcal{R}}(S_P; \delta \mathbf{x}) = \int_{S_P} d^2 x \, (\delta \mathbf{x} \cdot \hat{\mathbf{n}}) \, \mathcal{G}, \tag{4.48} $$

under certain assumptions of smoothness [52]. This can be thought of as another instance of the Riesz representation theorem, as $\delta f_{\mathcal{R}}$ is a linear functional of $\delta \mathbf{x}$. Here $\hat{\mathbf{n}}$ is the outward unit normal on $S_P$ and $\delta \mathbf{x}$ is a vector field describing the perturbation to the surface. Intuitively, only normal perturbations to $S_P$ result in a change to $f_{\mathcal{R}}$. The shape gradient is $\mathcal{G}$, which quantifies the contribution of a local normal perturbation of the boundary to the change in $f_{\mathcal{R}}$. The shape gradient can be used for fixed-boundary optimization of equilibria or analysis of sensitivity to perturbations of magnetic surfaces. It can be computed using a second adjoint method, where a perturbed MHD force balance equation is solved with the addition of a bulk force that depends on derivatives computed from the neoclassical adjoint method.
This will be described in detail in Chapter 5. While the continuous neoclassical adjoint method described in this Chapter arises from the self-adjointness of the linearized Fokker-Planck operator, the adjoint method for MHD equilibria arises from the self-adjointness of the MHD force operator. In practice, these two adjoint methods could be coupled by first computing an MHD equilibrium solution, computing neoclassical transport and its geometric derivatives from this equilibrium with the neoclassical adjoint method, and passing these derivatives back to the equilibrium code to compute the shape gradient with the perturbed MHD adjoint method. In this way, derivatives of neoclassical quantities with respect to the shape of the outer boundary are computed with only two equilibrium solutions and two DKE solutions.

Rather than solve an additional adjoint equation, the outer boundary could be optimized by numerically computing derivatives of $\{ B^c_{m,n}(\psi), G(\psi), I(\psi) \}$ with respect to the double Fourier series describing the outer boundary shape in cylindrical coordinates, $\{ R^c_{m,n}, Z^s_{m,n} \}$, using a finite-difference method. This could be done using the STELLOPT code [197, 213] with BOOZ_XFORM [202] to perform the coordinate transformation. For example, if the rotational transform is held fixed in the VMEC equilibrium calculation [111], the derivative of a moment, $\mathcal{R}$, with respect to a boundary coefficient, $R^c_{m,n}$, can be computed as

$$ \frac{\partial \mathcal{R}(\psi)}{\partial R^c_{m,n}} = \sum_{m',n'} \frac{\partial \mathcal{R}(\psi)}{\partial B^c_{m',n'}(\psi)} \frac{\partial B^c_{m',n'}(\psi)}{\partial R^c_{m,n}} + \frac{\partial \mathcal{R}(\psi)}{\partial G(\psi)} \frac{\partial G(\psi)}{\partial R^c_{m,n}} + \frac{\partial \mathcal{R}(\psi)}{\partial I(\psi)} \frac{\partial I(\psi)}{\partial R^c_{m,n}}, \tag{4.49} $$

where $\partial \mathcal{R}(\psi)/\partial B^c_{m,n}(\psi)$, $\partial \mathcal{R}(\psi)/\partial G(\psi)$, and $\partial \mathcal{R}(\psi)/\partial I(\psi)$ are computed with the neoclassical adjoint method and $\partial B^c_{m,n}(\psi)/\partial R^c_{m,n}$, $\partial G(\psi)/\partial R^c_{m,n}$, and $\partial I(\psi)/\partial R^c_{m,n}$ are computed with finite-difference derivatives using STELLOPT.
Similarly, derivatives of $\{ B^c_{m,n}(\psi), G(\psi), I(\psi) \}$ could be computed with respect to coil parameters using a free-boundary equilibrium solution, allowing for direct optimization of neoclassical quantities with respect to coil shapes. The neoclassical calculation with SFINCS is typically significantly more expensive than the equilibrium calculation (for the geometry discussed in Section 4.5.1, fixed-boundary VMEC took 54 seconds while SFINCS took 157 seconds on 4 processors of the NERSC Edison computer). As such, combining adjoint-based with finite-difference derivatives can still result in a significant computational savings.

As stellarators are not intrinsically ambipolar, the radial electric field is not truly an independent parameter. The ambipolar $E_r$ must be obtained which satisfies the condition $J_r(E_r) = 0$. The application of adjoint-based derivatives for computing the ambipolar solution is discussed in Section 4.5.3. An adjoint method to compute derivatives with respect to geometric parameters at fixed ambipolarity is discussed in Section 4.5.3.

Accelerating ambipolar solve

A nonlinear root-finding algorithm must be used to compute the ambipolar $E_r$. This root-finding can be accelerated with derivative information, such as with a Newton-Raphson method [195]. The derivative required, $\partial J_r/\partial E_r$, can be computed with the discrete or continuous adjoint method as described in Section 4.3 with the replacement $\Omega_i \to E_r$, considering $\mathcal{R} = J_r$.

We implement three nonlinear root finding methods: Brent's method [30], the Newton-Raphson method, and a hybrid between the bisection and Newton-Raphson methods [195]. Brent's method guarantees at least linear convergence by combining inverse quadratic interpolation with bisection and does not require derivatives. The Newton-Raphson method can provide quadratic convergence under certain assumptions but in general is not guaranteed to converge.
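A hybrid between bisection and Newton-Raphson can be sketched as a Newton iteration that falls back to bisection whenever the Newton step leaves the current bracket. The code below is a generic illustration, not the SFINCS or STELLOPT implementation: `newton_bisection` and the toy functions `toy_jr` and `toy_djr` are hypothetical, with the toy function standing in for $J_r(E_r)$ and its analytic derivative standing in for the adjoint-based $\partial J_r/\partial E_r$.

```python
import math

def newton_bisection(f, dfdx, lo, hi, x0=None, tol=1e-10, max_iter=100):
    """Safeguarded Newton iteration on a bracket [lo, hi] with a sign change.

    Returns (root, number_of_iterations).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    x = 0.5 * (lo + hi) if x0 is None else x0
    for it in range(1, max_iter + 1):
        fx = f(x)
        if abs(fx) < tol:
            return x, it
        # Shrink the bracket so that it always contains a sign change.
        if f_lo * fx < 0.0:
            hi = x
        else:
            lo, f_lo = x, fx
        slope = dfdx(x)
        x_new = x - fx / slope if slope != 0.0 else None
        # Accept the Newton step only if it stays strictly inside the bracket;
        # otherwise take a bisection step.
        if x_new is None or not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Toy stand-in for J_r(E_r): nearly flat far from the root at E_r = -3,
# which makes unguarded Newton steps overshoot.
def toy_jr(e):
    return math.tanh(e + 3.0) + 0.1 * (e + 3.0)

def toy_djr(e):
    return 1.0 - math.tanh(e + 3.0) ** 2 + 0.1

root, iterations = newton_bisection(toy_jr, toy_djr, -100.0, 100.0, x0=-10.0)
print(root, iterations)
```

The bracket evaluation at the two bounds is the extra cost that the pure Newton-Raphson method avoids, at the price of losing the convergence guarantee.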
If an iterate lies near a stationary point or a poor initial guess is given, the method can fail. For this reason, we implement the hybrid method, which combines the possible quadratic convergence properties of Newton-Raphson with the guaranteed linear convergence of the bisection method. Both Brent's method and the hybrid method require the root to be bracketed and therefore may require additional function evaluations to obtain the bracket.

We compare these methods in Figure 4.5, using the W7-X standard configuration considered in Section 4.5.1 with the full trajectory model and the discrete adjoint approach, beginning with an initial guess of $E_r = -10$ kV/m with bounds at $E_r^{\min} = -100$ kV/m and $E_r^{\max} = 100$ kV/m. The root is located at $E_r =$ − .84 kV/m. For this example, the hybrid and Newton methods had nearly identical convergence properties. However, the Newton method is less expensive as it does not require $J_r$ to be evaluated at the bounds of the interval. The Newton method provides a 22% savings in wall clock time over Brent's method to obtain the root within the same tolerance.

In the above discussion, we have assumed that there is only one stable root of interest. Of course, a given configuration may possess several roots, especially if the ions and electrons are in different collisionality regimes [92]. Multiple roots can be obtained by performing several root solves with different initial values and brackets, which could be trivially parallelized. Thus the adjoint method could still provide an acceleration in this more general case.

Derivatives at ambipolarity
The adjoint method described in Section 4.3 assumes that $E_r$ is held constant when computing derivatives with respect to $\Omega$. However, $E_r$ cannot truly be determined independently from geometric quantities, as the ambipolar solution should be recomputed as the geometry is altered. It is therefore desirable to compute derivatives at fixed ambipolarity (fixed $J_r = 0$) rather than at fixed $E_r$. This is performed by solving an additional adjoint equation,

$$ \mathbb{L}^{\dagger} \lambda^{J_r} + \tilde{J}_r = 0, \tag{4.50} $$

in the continuous approach or,

$$ \left( \overleftrightarrow{\mathbb{L}} \right)^{T} \overrightarrow{\lambda}^{J_r} = \overrightarrow{J}_r, \tag{4.51} $$

in the discrete approach. Details are described in Appendix H.

[Figure 4.5. Horizontal axis: iteration. Legend: Brent, Newton hybrid, Newton, initial guess.]

Figure 4.5: The ambipolar root is obtained with Brent, Newton-Raphson, and Newton hybrid nonlinear root solvers. The derivatives obtained with the adjoint method provide better convergence properties for the Newton methods. Figure adapted from [186] with permission.

It should be noted that by computing derivatives at ambipolarity, we assume that a given moment $\mathcal{R}$ is a differentiable function of the geometry at fixed $J_r = 0$. That is, this method cannot be applied to cases in which a stable root disappears as the geometry varies. As this will occur at a stationary point of $J_r(E_r)$, this situation could be avoided within an optimization loop by computing derivatives at constant $E_r$ rather than constant $J_r$ if $|\partial J_r/\partial E_r|$ falls below a given threshold at ambipolarity.

Although an additional adjoint solve is required, this method of computing derivatives at ambipolarity is advantageous as several linear solves are typically needed to obtain the ambipolar root. A comparison of the computational cost between the adjoint method and the forward-difference method for derivatives at ambipolarity is shown in Figure 4.6a. Here the full trajectory model is used, and the results for both the discrete and continuous adjoint methods are shown. For the finite-difference derivative, the ambipolar solve is performed with Brent's method at each step in $\Omega$. As in Figure 4.2b, we find that for large $N_\Omega$, the costs of the continuous and discrete approaches are essentially the same, as the cost is no longer dominated by the linear solve. When computing the derivatives at ambipolarity, both adjoint methods decrease the cost by a factor of approximately 200 for large $N_\Omega$.

In Figure 4.6b we show a benchmark between derivatives at ambipolarity, $(\partial \mathcal{R}/\partial B^c_{,})_{J_r}$, computed with the discrete adjoint method and with forward-difference derivatives.
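The relationship between a derivative at fixed $J_r = 0$ and derivatives at fixed $E_r$ can be sketched with the implicit function theorem. The block below is a generic derivation under the assumption that $\mathcal{R}$ and $J_r$ are differentiable functions of $(\Omega, E_r)$; it is not taken from Appendix H, which may organize the calculation differently.

```latex
\begin{align}
% Ambipolarity defines E_r(\Omega) implicitly:
J_r\left(\Omega, E_r(\Omega)\right) = 0
\quad &\Rightarrow \quad
\frac{d E_r}{d \Omega_i}
  = - \left( \frac{\partial J_r}{\partial E_r} \right)_{\Omega}^{-1}
      \left( \frac{\partial J_r}{\partial \Omega_i} \right)_{E_r} , \\
% Chain rule for the moment evaluated along the ambipolar root:
\left( \frac{\partial \mathcal{R}}{\partial \Omega_i} \right)_{J_r}
  &= \left( \frac{\partial \mathcal{R}}{\partial \Omega_i} \right)_{E_r}
   - \left( \frac{\partial \mathcal{R}}{\partial E_r} \right)_{\Omega}
     \left( \frac{\partial J_r}{\partial E_r} \right)_{\Omega}^{-1}
     \left( \frac{\partial J_r}{\partial \Omega_i} \right)_{E_r} .
\end{align}
```

Each factor on the right is a derivative at fixed $E_r$ of either $\mathcal{R}$ or $J_r$, which is consistent with the need for one additional adjoint solve with $J_r$ as the moment. The expression is singular where $\partial J_r/\partial E_r = 0$, matching the caveat about stationary points of $J_r(E_r)$.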
For the forward-difference method, the Newton solver is used to obtain the ambipolar $E_r$ as $B^c_{,}$ is varied. As the forward difference step size $\Delta B^c_{,}$ decreases, the fractional difference again decreases proportional to $\Delta B^c_{,}$ until it reaches a minimum when $\Delta B^c_{,}/B^c_{,}$ is approximately $10^{-}$. In comparison with Figure 4.1, we see that the minimum fractional difference is slightly larger at fixed ambipolarity than at fixed $E_r$, as the tolerance parameters associated with the Newton solver introduce an additional source of error to the forward-difference approach.

[Figure 4.6. Panel (a): wall clock time [s] for forward difference, continuous adjoint, and discrete adjoint; panel (b): fractional difference.]

Figure 4.6: (a) The cost of computing the gradient $\partial \mathcal{R}/\partial \Omega$ at ambipolarity scales with $N_\Omega$, the number of parameters in $\Omega$. (b) The fractional difference between $\partial \mathcal{R}/\partial B^c_{,}$ at constant ambipolarity obtained with the adjoint method and with finite-difference derivatives. Figure adapted from [186] with permission.

In Figures 4.7a and 4.7b we compare the sensitivity function for the particle flux, $S_{\Gamma_i}$, computed using derivatives at constant $E_r$ with that computed at constant $J_r$. Here derivatives are computed using the discrete adjoint method with full trajectories, and the sensitivity function is constructed as described in Section 4.5.1. The configuration and numerical parameters are the same as described in Section 4.5.1. At constant $J_r$, the large region of increased sensitivity on the outboard side that appears at constant $E_r$ remains, though the overall magnitude of the sensitivity decreases. Thus it may be important to account for the effect of the ambipolar $E_r$ when optimizing for radial transport. In Figures 4.7c and 4.7d we perform the same comparison for $S_{J_b}$, finding the derivatives at fixed $E_r$ and at fixed $J_r$ to be virtually identical.
This is to be expected, as numerical calculations of neoclassical transport coefficients for W7-X have found that the bootstrap coefficients are much less sensitive to $E_r$ than those for the radial transport (Figures 18 and 26 in [16]). Furthermore, the bootstrap current in the $1/\nu$ regime is independent of $E_r$, and the finite-collisionality correction is small for optimized stellarators, such as W7-X [102]. Therefore, the ambipolarity corrections to the derivatives are less important for $J_b$ than for the radial transport.

Figure 4.7: The sensitivity function for the ion particle flux, $S_{\Gamma_i}$, is computed at (a) constant $E_r$ and (b) constant $J_r$. Similarly, $S_{J_b}$ is computed at (c) constant $E_r$ and (d) constant $J_r$. Figure adapted from [186] with permission.

We have described a method by which moments $\mathcal{R}$ of the neoclassical distribution function can be differentiated efficiently with respect to many parameters. The adjoint approach requires defining an inner product from which the adjoint operator is obtained. We consider two choices for this inner product. One choice corresponds with computing the adjoint of the linear operator after discretization, and the other corresponds with computing it before discretization. In the case of the former, the Euclidean dot product can be used, and in the case of the latter, an inner product whose corresponding norm is similar to the free energy norm (4.14) is defined. In Section 4.4, we show that these approaches provide the same derivative information within discretization error, as expected. Both methods provide a reduction in computational cost by a factor of approximately 50 in comparison with forward-difference derivatives when differentiating with respect to many (O(10 )) parameters. In Section 4.5.3 the adjoint method is extended to compute derivatives at ambipolarity. This method provides a reduction in cost by a factor of approximately 200 over a forward-difference approach.
We have implemented this method in the SFINCS code, and similar methods could be applied to other drift kinetic solvers.

In this Chapter, we consider derivatives with respect to geometric quantities that enter the DKE through Boozer coordinates. However, the adjoint neoclassical method we have described is much more general, allowing for many possible applications. For example, derivatives of the radial fluxes with respect to the temperature and density profiles could be used to accelerate the solution of the transport equations using a Newton method [13]. The transport solution could furthermore be incorporated into the optimization loop to self-consistently evolve the macroscopic profiles in the presence of neoclassical fluxes. Rather than simply optimizing for minimal fluxes, an objective function such as the total fusion power could be considered [107], with optimization accelerated by adjoint-based derivatives.

Another application of the continuous adjoint formulation is the correction of discretization error. The same solution obtained in Section 4.3.1 can be used to quantify and correct for the error in a moment, $\mathcal{R}$, providing similar accuracy to that computed with a higher-order stencil or finer mesh without the associated cost. This method has been applied in the field of computational fluid dynamics by solving adjoint Euler equations [189, 231] and could prove useful for efficiently obtaining solutions of the DKE in low-collisionality regimes.

In Section 4.5.2, we have shown an example of adjoint-based neoclassical optimization, where the optimization space is taken to be the Fourier modes of the field strength on a surface, $\{B^c_{m,n}\}$. While optimization within this space is not necessarily consistent with a global equilibrium solution, it demonstrates the adjoint neoclassical method for efficient optimization. In Section 4.5.2, two approaches to self-consistently optimize MHD equilibria are discussed.
Further discussion and demonstration will be provided in Chapter 5.

In Appendix G we show that when $E_r = 0$ and the unperturbed geometry is stellarator symmetric, the sensitivity functions for moments of the distribution function are also stellarator symmetric. However, when $E_r \neq 0$ this is no longer true. This implies that obtaining minimal neoclassical transport in the $\sqrt{\nu}$ regime may require breaking of stellarator symmetry. In this Chapter, we have ignored the effects of stellarator symmetry-breaking, though we hope to extend this work to study these effects in the future.

Chapter 5 Adjoint shape gradient for MHD equilibria
Most stellarator optimization to date has assumed that the magnetic field satisfies the MHD equilibrium equations with either a fixed- or free-boundary approach, as detailed in Section 1.4.2. If a gradient-based optimization approach is applied, derivatives of quantities that depend on the equilibrium solutions must be computed with respect to the shapes of the filamentary coils or plasma boundary. In this Chapter, we demonstrate an adjoint approach for obtaining the coil or surface shape gradient of such functions. With the shape gradient efficiently computed, shape derivatives with respect to any shape perturbation can be calculated.

The material in this Chapter has been adapted with permission from [10] and [187].
Several figures of merit quantifying confinement must be considered in the numerical optimization of stellarator MHD equilibrium. These figures of merit describing a configuration depend on the shape of the outer plasma boundary or the shape of the electro-magnetic coils. It is thus desirable to obtain derivatives with respect to these shapes for optimization of equilibria or identification of sensitivity information. These so-called shape derivatives can be computed by directly perturbing the shape, recomputing the equilibrium, and computing the resulting change to a figure of merit that depends on the equilibrium solution. However, this direct finite-difference approach requires recomputing the equilibrium for each possible perturbation of the shape. For stellarators whose geometry is described by a set of $N_\Omega \sim$ parameters, this requires $N_\Omega$ solutions to the MHD equilibrium equations. Despite this computational complexity, gradient-based optimization of stellarators has proceeded with the direct approach (e.g. [134, 196, 197]).

As the target optimized configuration can never be realized exactly, an analysis of the sensitivity to perturbations, such as errors in coil fabrication or assembly, is central to the success of a stellarator. Tight tolerances have proven to be a significant driver of the cost of stellarator experiments [130, 220]; thus an improvement to the algorithms used to conduct sensitivity studies can have a substantial impact on the field. In studies of the coil tolerances for flux surface quality of LHD [240] and NCSX [31, 236], perturbations of several distributions were manually applied to each coil. Sensitivity analysis can also be performed with analytic derivatives. Numerical derivatives with respect to tilt angle and coil translation of the CNT coils have been used to compute the sensitivity of the rotational transform on axis [88].
Analytic derivatives have recently been applied to study coil sensitivities of the CNT stellarator by considering the eigenvectors of the Hessian matrix [243]. Thus, in addition to gradient-based optimization, derivatives with respect to shape can be applied to sensitivity analysis.

The shape gradient quantifies the change in a figure of merit associated with a local perturbation to a shape. Thus, if the shape gradient can be obtained, the shape derivative with respect to any perturbation is known (more precise definitions of the shape derivative and gradient are given in Sections 2.1 and 5.2). The shape gradient representation can be computed from parameter derivatives by solving a small linear system (Section 2.1.2). However, computing parameter derivatives can often be computationally expensive, as numerical derivatives require evaluating the objective function at least $N_\Omega + 1$ times if one-sided finite-difference derivatives are used, or $2N_\Omega$ times for centered differences. As computing the objective function often involves solving a linear or nonlinear system, such as the MHD equilibrium equations, this implies solving the system of equations $\geq N_\Omega + 1$ times. Numerical derivatives also introduce additional noise, and the finite-difference step size must be chosen carefully.

Rather than use parameter derivatives, in this Chapter we will use an adjoint method to compute the shape gradient. This is sometimes termed adjoint shape sensitivity or adjoint shape optimization, which has its origins in aerodynamic engineering and computational fluid dynamics [82, 190]. As with adjoint methods for parameter derivatives, this technique only requires the solution of two linear or nonlinear systems of equations. This technique has been applied to magnetic confinement fusion for the design of tokamak divertor shapes by solving forward and adjoint fluid equations [48, 49, 50].
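The cost comparison above can be made concrete in a finite-dimensional setting. The sketch below is an illustration only and does not involve an MHD equilibrium code: it assumes a hypothetical objective $f = \mathbf{c}\cdot\mathbf{x}$ constrained by a small linear system $A(\mathbf{p})\mathbf{x} = \mathbf{b}$, where the matrix `A0`, the list `ENTRIES` of perturbed matrix entries, and the vectors `B` and `C` are made-up data. One-sided finite differences use $N+1$ solves, while the adjoint gradient, $\partial f/\partial p_i = -\boldsymbol{\lambda}\cdot(\partial A/\partial p_i)\mathbf{x}$ with $A^T\boldsymbol{\lambda} = \mathbf{c}$, uses two.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Hypothetical toy data: A(p) = A0 with p_i added to the entry ENTRIES[i].
A0 = [[4.0, 1.0, 0.0], [1.0, 5.0, 1.0], [0.0, 1.0, 6.0]]
ENTRIES = [(0, 0), (0, 1), (1, 2), (2, 0), (2, 2)]
B = [1.0, 2.0, 3.0]
C = [1.0, -1.0, 2.0]


def operator(p: Vector) -> Matrix:
    a = [row[:] for row in A0]
    for (r, c), p_i in zip(ENTRIES, p):
        a[r][c] += p_i
    return a


def objective(p: Vector) -> float:
    """f(p) = c . x(p), which costs one linear solve."""
    x = solve(operator(p), B)
    return sum(c_i * x_i for c_i, x_i in zip(C, x))


def gradient_forward_difference(p: Vector, h: float = 1e-6) -> Tuple[Vector, int]:
    """One-sided differences: returns the gradient and the number of solves."""
    f0 = objective(p)
    grad = []
    for i in range(len(p)):
        q = p[:]
        q[i] += h
        grad.append((objective(q) - f0) / h)
    return grad, len(p) + 1


def gradient_adjoint(p: Vector) -> Tuple[Vector, int]:
    """Adjoint gradient: one forward solve and one adjoint solve."""
    a = operator(p)
    x = solve(a, B)                              # forward problem
    a_t = [list(col) for col in zip(*a)]
    lam = solve(a_t, C)                          # adjoint problem A^T lam = c
    # dA/dp_i has a single unit entry at ENTRIES[i], so the contraction
    # lam . (dA/dp_i) x reduces to lam[r] * x[c].
    grad = [-lam[r] * x[c] for (r, c) in ENTRIES]
    return grad, 2
```

In this toy problem the number of adjoint solves is independent of the number of parameters, which is the property exploited throughout this Chapter.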
As stellarators require many parameters to describe their shape, adjoint shape sensitivity could significantly decrease the cost of computing the shape gradient. If one is optimizing in the space of parameters describing the boundary of the plasma or the shape of coils, the shape gradient representation obtained from the adjoint method can be converted to parameter derivatives upon multiplication with a small matrix (Section 2.1).

We begin in Section 5.2 with a brief review of shape calculus concepts in the context of MHD equilibria. In Section 5.3, the fundamental adjoint relations for perturbations to MHD equilibria are derived and discussed. These relations take a form that is similar to that of transport coefficients that are related by Onsager symmetry [177, 178]. Specifically, perturbations to the equilibrium are characterized as a set of generalized responses to a complementary set of generalized forces. The responses and forces can be thought of as being related by a matrix operator, which is symmetric. The resulting relations among forces and responses can be used to compute the shape gradient of functions of the equilibria with respect to displacements of the plasma boundary or the coil shapes. In Section 5.4, the continuous adjoint method that takes advantage of the generalized self-adjointness relations is discussed. Several applications to stellarator figures of merit will be demonstrated in Section 5.5.

Although the adjoint relations are based on the equations of linearized MHD, we perform numerical calculations in this Chapter with nonlinear MHD solutions with the addition of a small perturbation. Demonstration is performed using nonlinear stellarator MHD equilibrium codes based on a variational principle, VMEC [111] and ANIMEC [43].
We obtain expressions for the shape gradients of the volume-averaged $\beta$ (Section 5.5.1), rotational transform (Section 5.5.2), vacuum magnetic well (Section 5.5.3), magnetic ripple (Section 5.5.4), effective ripple in the $1/\nu$ neoclassical regime [168], where $\nu$ is the collision frequency (Section 5.5.5), and departure from quasi-symmetry (Section 5.5.6). Finally, we demonstrate that the adjoint method for neoclassical optimization outlined in Chapter 4 can be coupled with a linearized adjoint MHD solution to compute derivatives of several neoclassical quantities with respect to the shape of the plasma boundary (Section 5.5.7). We present calculations of the shape gradient with the adjoint approach for the volume-averaged $\beta$, rotational transform, and vacuum magnetic well figures of merit, which do not require modification to VMEC. The calculation for the magnetic ripple is computed with a minor modification of the ANIMEC code. The adjoint force balance equations needed to compute the shape gradient for the other figures of merit require the addition of a bulk force that will necessitate further modification of an equilibrium or linearized MHD code. Numerical calculations for these figures of merit will, therefore, not be presented in this Chapter.

We now review shape calculus fundamentals introduced in Chapter 2 in the context of functions that depend on MHD equilibrium quantities. Consider a functional, $F(S_P)$, that depends implicitly on the plasma boundary, $S_P$, through the solution to the fixed-boundary MHD equilibrium equations (Section 1.4.1) with boundary condition $\mathbf{B}\cdot\hat{\mathbf{n}}|_{S_P} = 0$, where $\hat{\mathbf{n}}$ is the outward unit normal on $S_P$. We define a functional integrated over the plasma volume, $V_P$,
$$ f(S_P) = \int_{V_P} d^3x \, F(S_P), \tag{5.1} $$
where $S_P$ is the boundary of $V_P$. Consider a vector field describing displacements of the surface, $\delta\mathbf{x}$, and a displaced surface $S_{P,\epsilon} = \{\mathbf{x} + \epsilon\,\delta\mathbf{x} : \mathbf{x} \in S_P\}$.
The shape derivative of $F$ is defined as,
$$ \delta F(S_P; \delta\mathbf{x}) = \lim_{\epsilon \to 0} \frac{F(S_{P,\epsilon}) - F(S_P)}{\epsilon}. \tag{5.2} $$
The shape derivative of $f$ is defined by the same expression with $F \to f$. Under certain assumptions of smoothness of $\delta F$ with respect to $\delta\mathbf{x}$, the shape derivative of the volume-integrated quantity, $f$, can be written in the following way (Section 2.1),
$$ \delta f(S_P; \delta\mathbf{x}) = \int_{V_P} d^3x \, \delta F(S_P; \delta\mathbf{x}) + \int_{S_P} d^2x \, \delta\mathbf{x}\cdot\hat{\mathbf{n}} \, F. \tag{5.3} $$
The first term accounts for the Eulerian perturbation to $F$ while the second accounts for the motion of the boundary. This is referred to as the transport theorem for domain functionals and will be used throughout this Chapter to compute the shape derivatives of figures of merit of interest.

According to the Hadamard-Zolesio structure theorem [52], the shape derivative of a functional of $S_P$ (not restricted to the form of (5.1)) can be written in the following form,
$$ \delta f(S_P; \delta\mathbf{x}) = \int_{S_P} d^2x \, \delta\mathbf{x}\cdot\hat{\mathbf{n}} \, \mathcal{G}, \tag{5.4} $$
assuming $\delta f$ exists for all $\delta\mathbf{x}$ and is sufficiently smooth. In the above expression, $\mathcal{G}$ is the shape gradient. This is an instance of the Riesz representation theorem, which states that any linear functional can be expressed as an inner product with an element of the appropriate space [199]. As the shape derivative of $f$ is linear in $\delta\mathbf{x}$, it can be written in the form of (5.4). Intuitively, the shape derivative does not depend on tangential perturbations to the surface. The shape gradient can be computed from derivatives with respect to the set of parameters, $\Omega$, used to discretize $S_P$,
$$ \frac{\partial f(\Omega)}{\partial \Omega_i} = \int_{S_P} d^2x \, \frac{\partial \mathbf{x}(\Omega)}{\partial \Omega_i}\cdot\hat{\mathbf{n}} \, \mathcal{G}. \tag{5.5} $$
For example, $\Omega = \{R^c_{m,n}, Z^s_{m,n}\}$ could be assumed, where these are the Fourier coefficients (5.70) in a cosine and sine representation of the cylindrical coordinates $(R, Z)$ of $S_P$. Upon discretization of the right-hand side on a surface, the above takes the form of a linear system that can be solved for $\mathcal{G}$ [138].
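The relation (5.5) between the shape gradient and parameter derivatives can be checked on a simple two-dimensional analogue. The sketch below is an assumption-laden illustration rather than a stellarator calculation: the "surface" is a closed planar curve with hypothetical Fourier coefficients `a` and `b` (analogous to $R^c_{m,n}$ and $Z^s_{m,n}$), and the figure of merit is the enclosed area, for which $F = 1$ and $\delta F = 0$ in (5.3), so that the shape gradient is $\mathcal{G} = 1$.

```python
import math
from typing import List, Tuple

N_THETA = 64  # periodic quadrature points (exact here for the low mode numbers used)


def curve(a: List[float], b: List[float], theta: float) -> Tuple[float, float, float, float]:
    """Closed curve x = sum_m a_m cos(m t), y = sum_m b_m sin(m t), m = 1, 2, ...

    Returns (x, y, dx/dt, dy/dt). The coefficients are hypothetical analogues
    of the boundary Fourier coefficients.
    """
    x = y = dx = dy = 0.0
    for i, (a_m, b_m) in enumerate(zip(a, b)):
        m = i + 1
        x += a_m * math.cos(m * theta)
        y += b_m * math.sin(m * theta)
        dx -= m * a_m * math.sin(m * theta)
        dy += m * b_m * math.cos(m * theta)
    return x, y, dx, dy


def area(a: List[float], b: List[float]) -> float:
    """Figure of merit f: enclosed area, (1/2) * closed integral of (x dy - y dx)."""
    dt = 2.0 * math.pi / N_THETA
    total = 0.0
    for k in range(N_THETA):
        x, y, dx, dy = curve(a, b, k * dt)
        total += 0.5 * (x * dy - y * dx) * dt
    return total


def parameter_derivatives(a: List[float], b: List[float]) -> Tuple[List[float], List[float]]:
    """Evaluate the analogue of (5.5) with shape gradient G = 1.

    For a counterclockwise curve, n dl = (dy/dt, -dx/dt) dt, so that
    (dx/da_m . n) dl = cos(m t) dy/dt dt and (dx/db_m . n) dl = -sin(m t) dx/dt dt.
    """
    shape_gradient = 1.0
    dt = 2.0 * math.pi / N_THETA
    df_da = [0.0] * len(a)
    df_db = [0.0] * len(b)
    for k in range(N_THETA):
        theta = k * dt
        _, _, dx, dy = curve(a, b, theta)
        for i in range(len(a)):
            m = i + 1
            df_da[i] += math.cos(m * theta) * dy * shape_gradient * dt
            df_db[i] += -math.sin(m * theta) * dx * shape_gradient * dt
    return df_da, df_db
```

For this curve the area is $\pi\sum_m m\,a_m b_m$, so the derivatives obtained from the shape gradient can be compared with $\partial f/\partial a_m = \pi m b_m$ and $\partial f/\partial b_m = \pi m a_m$, or with finite differences of `area`.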
However, obtaining the parameter derivatives in (5.5) with a finite-difference approach requires performing at least one additional equilibrium calculation for each parameter.

The shape gradient can also be computed with respect to perturbations of currents in the vacuum region. We now consider $f$ to depend on the shape of a set of filamentary coils, $C = \{C_k\}$, through a free-boundary solution to the MHD equilibrium equations (Section 1.4.1). We consider a vector field of displacements to the coils, $\delta\mathbf{x}_C$. The shape derivative of $f$ can also be written in shape gradient form,
$$ \delta f(C; \delta\mathbf{x}_C) = \sum_k \oint_{C_k} dl \, \delta\mathbf{x}_{C_k}\cdot\widetilde{\mathbf{G}}_k, \tag{5.6} $$
where $\widetilde{\mathbf{G}}_k$ is the shape gradient for coil $k$, $\oint_{C_k}$ denotes the line integral along coil $k$, and the sum is taken over coils. Again, $\widetilde{\mathbf{G}}_k$ can be computed from derivatives with respect to a set of parameters describing coil shapes (5.84), analogous to (5.5). Note that we have defined the shape gradient in a slightly different way here than that introduced in Chapter 2 (2.12) (without the cross with $\hat{\mathbf{t}}$), although we will find in this Chapter that $\widetilde{\mathbf{G}}_k$ is perpendicular to $\hat{\mathbf{t}}$ for the functionals under consideration. We distinguish the shape gradient as defined in (5.6) from that defined in (2.12) with a tilde.

To avoid the cost of direct computation of the shape gradient, we apply an adjoint approach. The shape gradient is thus obtained without perturbing the plasma surface or coil shapes directly, but instead by solving an additional adjoint equation that depends on the figure of merit of interest.
We perform the calculation with the direct approach to demonstrate that the same derivative information is computed with either method.

5.3 Adjoint relations for MHD equilibria

The goal of this Section is to generalize the well-known self-adjointness [20] of the MHD force operator,
$$ \int_{V_P} d^3x \left( \boldsymbol{\xi}_1\cdot\mathbf{F}[\boldsymbol{\xi}_2] - \boldsymbol{\xi}_2\cdot\mathbf{F}[\boldsymbol{\xi}_1] \right) - \frac{1}{\mu_0}\int_{S_P} d^2x \, \hat{\mathbf{n}}\cdot\left( \boldsymbol{\xi}_2 \, \delta\mathbf{B}[\boldsymbol{\xi}_1]\cdot\mathbf{B} - \boldsymbol{\xi}_1 \, \delta\mathbf{B}[\boldsymbol{\xi}_2]\cdot\mathbf{B} \right) = 0, \tag{5.7} $$
to allow for perturbations of interest for stellarator optimization. In this expression, the perturbed magnetic field is expressed in terms of the displacement vector,
$$ \delta\mathbf{B}[\boldsymbol{\xi}_{1,2}] = \nabla\times\left( \boldsymbol{\xi}_{1,2}\times\mathbf{B} \right), \tag{5.8} $$
which follows from the assumption that the rotational transform is fixed by the perturbation (flux-freezing). The MHD force operator,
$$ \mathbf{F}[\boldsymbol{\xi}_{1,2}] = \frac{\left( \nabla\times\delta\mathbf{B}[\boldsymbol{\xi}_{1,2}] \right)\times\mathbf{B}}{\mu_0} + \frac{(\nabla\times\mathbf{B})\times\delta\mathbf{B}[\boldsymbol{\xi}_{1,2}]}{\mu_0} - \nabla\left( \delta p[\boldsymbol{\xi}_{1,2}] \right), \tag{5.9} $$
is a linearization of the MHD equilibrium equation,
$$ \frac{(\nabla\times\mathbf{B})\times\mathbf{B}}{\mu_0} = \nabla p, \tag{5.10} $$
with boundary condition,
$$ \mathbf{B}\cdot\hat{\mathbf{n}}|_{S_P} = 0, \tag{5.11} $$
under the assumption that the magnetic field is perturbed according to (5.8) and the pressure is perturbed according to,
$$ \delta p[\boldsymbol{\xi}_{1,2}] = -\boldsymbol{\xi}_{1,2}\cdot\nabla p - \gamma p \, \nabla\cdot\boldsymbol{\xi}_{1,2}, \tag{5.12} $$
where $\gamma$ is the adiabatic index. As $\boldsymbol{\xi}$ describes the motion of field lines, modes which perturb the plasma boundary exhibit non-zero $\boldsymbol{\xi}\cdot\hat{\mathbf{n}}|_{S_P}$. The self-adjointness provides a relationship between two perturbations about an MHD equilibrium state described by (5.10)-(5.11). This relation is incredibly valuable for ideal MHD stability analysis, forming the basis for the energy principle.

As described in Section 2.2.2, when formulating a continuous adjoint approach, the adjoint of the linearized operator appearing in the forward PDE must be obtained. However, we cannot directly apply the self-adjointness relation from MHD stability theory (5.7) for the stellarator optimization problem.
While MHD perturbations assume fixed rotational transform, stellarator optimization is often performed instead at fixed toroidal current. While the MHD self-adjointness relation allows for perturbations of the plasma boundary, we would also like to consider linearized equilibrium states corresponding to perturbations of coils in the vacuum region. We now form the appropriate generalized self-adjointness relations corresponding to fixed-boundary perturbations (applied perturbations to the plasma boundary) and free-boundary perturbations (applied perturbations to electro-magnetic coils). Even though the boundary shape changes in the former case, we refer to it as “fixed boundary” since the equilibrium code is run in fixed-boundary mode, and since the associated adjoint problem will turn out to have no boundary perturbation.

The resulting expressions will allow us to relate the “direct perturbations,” those corresponding to a linearized equilibrium state associated with the direct perturbation of the plasma boundary or coil shapes, and “adjoint perturbations,” with which we can compute the shape gradient efficiently. The adjoint perturbation will correspond to the change in the equilibrium when an additional bulk force acts on the plasma or the toroidal current profile is changed. For the adjoint perturbation, there is no change to the outer flux surface in the fixed-boundary case or to the coil currents in the free-boundary case. In this Section, we will show that aspects of the direct and adjoint changes are related to each other in a manner similar to Onsager symmetry.
Thus, it will be shown that by calculating the adjoint perturbation, with a judiciously chosen added force or change in the toroidal current profile, the solution to the direct problem can be determined.

We consider equilibria in which the magnetic field in the plasma can be expressed in terms of scalar functions $\psi(\mathbf{x})$, $\chi(\psi)$, $\vartheta(\mathbf{x})$, and $\varphi(\mathbf{x})$,
$$ \mathbf{B} = \nabla\psi\times\nabla\vartheta - \nabla\chi\times\nabla\varphi = \nabla\psi\times\nabla\alpha, \tag{5.13} $$
where $(\psi, \vartheta, \varphi)$ form any magnetic coordinate system (Appendix A.3). We will regard $\psi$ as labeling the flux surfaces and consider toroidal geometries for which,
$$ \alpha = \vartheta - \iota(\psi)\varphi, \tag{5.14} $$
labels field lines in a flux surface, where $\vartheta$ is a poloidal angle, $\varphi$ is a toroidal angle, and $\iota(\psi) = \chi'(\psi)$ is the rotational transform, with $\chi(\psi)$ being the poloidal flux function. With these definitions, the magnetic flux passing toroidally through a poloidally closed curve of constant $\psi$ is $2\pi\psi$, and the flux passing poloidally between the magnetic axis and the surface of constant $\psi$ is $2\pi\chi(\psi)$. Thus, we assume that good flux surfaces exist and leave aside the issues of islands and chaotic field lines. In addition to the representation of the magnetic field, we assume that MHD force balance (5.10) is satisfied with a scalar pressure, $p(\psi)$.

As mentioned, we will consider two cases, a fixed-boundary case in which the shape of the outer flux surface is prescribed, and a free-boundary case for which outside the plasma, whose surface is defined by a particular value of toroidal flux, the force balance equation (5.10) does not apply, but rather, the magnetic field is determined by Ampere's law,
$$ \nabla\times\mathbf{B} = \mu_0 \mathbf{J}_C, \tag{5.15} $$
with a given current density $\mathbf{J}_C$, representing current flowing outside the confinement region. The fixed-boundary and free-boundary equations are discussed in detail in Section 1.4.1.

From (5.10) it follows that current density streamlines also lie in the $\psi = $ constant surfaces.
The toroidal current passing through a surface, $S_T(\psi)$ (Figure A.2), whose perimeter is a closed poloidal loop at constant $\psi$ is given by,
$$ I_T(\psi) = \int_{S_T(\psi)} d^2x \, \hat{\mathbf{n}}\cdot\mathbf{J} = \int_{S_T(\psi)} d\psi \, d\vartheta \, \sqrt{g} \, \nabla\varphi\cdot\mathbf{J}, \tag{5.16} $$
where $\sqrt{g}^{-1} = \nabla\psi\times\nabla\vartheta\cdot\nabla\varphi$.

Equations (5.10) and (5.13) to (5.16) describe our base equilibrium configuration. We now consider small changes in the equilibrium that are assumed to yield a second equilibrium state of the same form as (5.13), but with new functions such that $\mathbf{B}' = \nabla\psi'\times\nabla\vartheta' - \nabla\chi'(\psi')\times\nabla\varphi'$. Each of the primed variables is assumed to differ from the corresponding unprimed variables by a small amount (e.g. $\psi' = \psi + \delta\psi(\mathbf{x})$). The perturbed magnetic field can then be expressed $\mathbf{B}' = \mathbf{B} + \delta\mathbf{B}$, where,
$$ \delta\mathbf{B} = \nabla\delta\psi\times\nabla\vartheta + \nabla\psi\times\nabla\delta\vartheta - \nabla\chi(\psi)\times\nabla\delta\varphi - \nabla\left( \iota(\psi)\delta\psi + \delta\chi(\psi) \right)\times\nabla\varphi. \tag{5.17} $$
We write the perturbed poloidal flux as the sum of a term resulting from the perturbation of toroidal flux at fixed rotational transform, $\iota(\psi)\delta\psi$, and a term representing the perturbed rotational transform, $\delta\chi(\psi)$. Thus, we can regroup the terms in (5.17) as follows,
$$ \delta\mathbf{B} = \nabla\times\left( \delta\psi\nabla\vartheta - \iota(\psi)\delta\psi\nabla\varphi - \delta\vartheta\nabla\psi + \delta\varphi\nabla\chi(\psi) \right) - \nabla\delta\chi(\psi)\times\nabla\varphi. \tag{5.18} $$
The group of terms in parentheses in (5.18) corresponds to perturbations of the magnetic field allowed by ideal MHD, which is constrained by the “frozen-in law”, and which preserves the rotational transform ($\delta\iota(\psi) = 0$). The last term in (5.18) allows for changes in the rotational transform ($\delta\iota(\psi) = \delta\chi'(\psi)$). Note also that the expression in parentheses in (5.18) can be written as a sum of terms parallel to $\nabla\psi$ and $\nabla\alpha$, and hence it is perpendicular to $\mathbf{B}$.
The group of terms in parentheses in (5.18) can thus be expressed in terms of a vector potential that is perpendicular to the equilibrium magnetic field, while the last term in (5.18) can be represented in terms of a vector potential in the toroidal direction, which thus has a component parallel to the equilibrium field. We can therefore write $\delta\mathbf{B}[\boldsymbol{\xi}, \delta\chi(\psi)] = \nabla\times\delta\mathbf{A}[\boldsymbol{\xi}, \delta\chi(\psi)]$, where,
$$ \delta\mathbf{A}[\boldsymbol{\xi}, \delta\chi(\psi)] = \boldsymbol{\xi}\times\mathbf{B} - \delta\chi(\psi)\nabla\varphi. \tag{5.19} $$
Here, the variable $\boldsymbol{\xi}$ can be taken to be perpendicular to the applied magnetic field, as the perturbed magnetic field,
$$ \delta\mathbf{B}[\boldsymbol{\xi}, \delta\chi(\psi)] = \nabla\times(\boldsymbol{\xi}\times\mathbf{B}) - \delta\chi'(\psi)\nabla\psi\times\nabla\varphi, \tag{5.20} $$
does not depend on $\boldsymbol{\xi}\cdot\hat{\mathbf{b}}$. We emphasize that this departs from the typical assumption made in ideal MHD stability theory that $\nabla\cdot\boldsymbol{\xi} = 0$.

We define a vector field of the displacement of a field line, $\delta\mathbf{x}$, such that the perturbations to the field line label $\alpha = \vartheta - \iota(\psi)\varphi$ and toroidal flux satisfy,
$$ \delta\psi + \delta\mathbf{x}\cdot\nabla\psi = 0 \tag{5.21a} $$
$$ \delta\alpha + \delta\mathbf{x}\cdot\nabla\alpha = 0, \tag{5.21b} $$
and $\delta\mathbf{x}\cdot\mathbf{B} = 0$. Noting that $\delta\alpha = \delta\vartheta - \iota(\psi)\delta\varphi - \left( \iota'(\psi)\delta\psi + \delta\chi'(\psi) \right)\varphi$, we find,
$$ \delta\mathbf{x} = \boldsymbol{\xi} + \frac{\hat{\mathbf{b}}\times\nabla\delta\chi(\psi)}{B}\varphi, \tag{5.22} $$
which follows from (5.18). As one would expect, in the limit $\delta\chi(\psi) = 0$, we recover the MHD displacement vector.

As the pressure profile is often assumed to be held fixed during a configuration optimization, we assume that the local pressure changes such that $p(\psi)$ is unchanged,
$$ \delta p[\boldsymbol{\xi}] = -\boldsymbol{\xi}\cdot\nabla p, \tag{5.23} $$
which follows from (5.22). We would similarly like to consider direct perturbations that fix the toroidal current. The change in toroidal current flowing through the perturbed surface is computed using (5.3) by expressing (5.16) as a volume integral,
$$ \delta I_T(\psi) = \int_{\partial S_T(\psi)} d\vartheta \, \sqrt{g} \, \boldsymbol{\xi}\cdot\nabla\psi \, \mathbf{J}\cdot\nabla\varphi + \int_{S_T(\psi)} d\psi \, d\vartheta \, \sqrt{g} \, \delta\mathbf{J}[\boldsymbol{\xi}, \delta\chi(\psi)]\cdot\nabla\varphi, \tag{5.24} $$
where $S_T(\psi)$ is a surface at constant toroidal angle (Figure A.2) bounded by the $\psi$ surface and $\partial S_T(\psi)$ is the boundary of such surface, a closed poloidal loop.
The perturbed current density is $\delta\mathbf{J}[\boldsymbol{\xi}, \delta\chi(\psi)] = \nabla\times\delta\mathbf{B}[\boldsymbol{\xi}, \delta\chi(\psi)]/\mu_0$. Here the first term accounts for the displacement of the flux surface and the second term accounts for the change in toroidal current density.

A linearized equilibrium state satisfies,
$$ \mathbf{F}[\boldsymbol{\xi}, \delta\chi(\psi)] + \delta\mathbf{F} = 0, \tag{5.25} $$
where $\delta\mathbf{F}$ is an additional perturbed force to be prescribed and $\mathbf{F}[\boldsymbol{\xi}, \delta\chi(\psi)]$ is the generalized force operator,
$$ \mathbf{F}[\boldsymbol{\xi}, \delta\chi(\psi)] = \delta\mathbf{J}[\boldsymbol{\xi}, \delta\chi(\psi)]\times\mathbf{B} + \mathbf{J}\times\delta\mathbf{B}[\boldsymbol{\xi}, \delta\chi(\psi)] - \nabla\delta p[\boldsymbol{\xi}]. \tag{5.26} $$
We now consider two distinct perturbations of the equilibrium of the type described by (5.19), (5.20) and (5.23) to (5.26), which we denote with subscripts 1 and 2. In general, variables with subscript 1 will be associated with the direct perturbation, and those with subscript 2 will be associated with the adjoint perturbation. We then form the quantity,
$$ U_T = \int_{V_T} d^3x \left( \delta\mathbf{J}_1\cdot\delta\mathbf{A}_2 - \delta\mathbf{J}_2\cdot\delta\mathbf{A}_1 \right) = 0, \tag{5.27} $$
where we use the notation $\delta\mathbf{J}_{1,2} = \delta\mathbf{J}[\boldsymbol{\xi}_{1,2}, \delta\chi_{1,2}(\psi)]$ and $\delta\mathbf{A}_{1,2} = \delta\mathbf{A}[\boldsymbol{\xi}_{1,2}, \delta\chi_{1,2}(\psi)]$ and the integral is, for the time being, over all space. The above is seen to vanish by expressing $\delta\mathbf{J}_{1,2}$ in terms of $\delta\mathbf{B}_{1,2}$ using Ampere's law (5.15) and applying the divergence theorem.

We now express the volume integral in (5.27) as the sum of three terms,
$$ U_T = U_P + U_B + U_C = 0. \tag{5.28} $$
Here $U_P$ is the contribution from the plasma volume, integrated just up to the plasma-vacuum boundary. For this term we represent the vector potentials using (5.19),
$$ U_P = \int_{V_P} d^3x \left( \delta\mathbf{J}_1\cdot\left( \boldsymbol{\xi}_2\times\mathbf{B} - \delta\chi_2(\psi)\nabla\varphi \right) - \delta\mathbf{J}_2\cdot\left( \boldsymbol{\xi}_1\times\mathbf{B} - \delta\chi_1(\psi)\nabla\varphi \right) \right). \tag{5.29} $$
To evaluate (5.29) we use the perturbed force balance relation (5.25).

The term $U_B$ comes from integrating over a thin layer at the plasma-vacuum boundary. At the boundary, the difference between the perturbed and unperturbed current density has the character of a current sheet due to the displacement of the outermost flux surface.
This effective current sheet causes a jump in the tangential components of the perturbation to the magnetic fields at the surface. This jump implies that care must be taken in evaluating the perturbed magnetic fields at the surface as they have different values on either side of the plasma-vacuum surface. However, the vector potential is continuous at the plasma-vacuum boundary. Thus, we write,
$$ U_B = \int_{S_P} \frac{d^2x}{|\nabla\psi|} \left( \boldsymbol{\xi}_1\cdot\nabla\psi \, \mathbf{J}\cdot\delta\mathbf{A}_2 - \boldsymbol{\xi}_2\cdot\nabla\psi \, \mathbf{J}\cdot\delta\mathbf{A}_1 \right), \tag{5.30} $$
where the vector potentials are expressed as in (5.19). Using this expression for the vector potentials and expressing the surface integral as an integral over the toroidal and poloidal angles gives,
$$ U_B = \int_{S_P} d\vartheta \, d\varphi \, \sqrt{g} \, \mathbf{J}\cdot\nabla\varphi \left( -\boldsymbol{\xi}_1\cdot\nabla\psi \, \delta\chi_2(\psi) + \boldsymbol{\xi}_2\cdot\nabla\psi \, \delta\chi_1(\psi) \right). \tag{5.31} $$
Here we note the terms in the vector potential coming from the MHD displacement cancel.

Last, the quantity $U_C$ represents the contribution from the integral over the volume outside the plasma where only the coil currents need to be included,
$$ U_C = \int_{V_V} d^3x \left( \delta\mathbf{J}_{C_1}\cdot\delta\mathbf{A}_{V_2} - \delta\mathbf{J}_{C_2}\cdot\delta\mathbf{A}_{V_1} \right), \tag{5.32} $$
where $\delta\mathbf{A}_{V_{1,2}}$ is the change in the vacuum vector potential, and $\delta\mathbf{J}_{C_{1,2}}$ is the change in the coil current density.

Combining $U_P$, $U_B$, and $U_C$ gives the following relation appropriate to the free-boundary case, $U_T = U_P + U_B + U_C = 0$, or
$$ \int_{V_P} d^3x \left( \boldsymbol{\xi}_1\cdot\mathbf{F}_2 - \boldsymbol{\xi}_2\cdot\mathbf{F}_1 \right) + 2\pi\int_{V_P} d\psi \left( \delta\chi_1(\psi)\,\delta I'_{T,2}(\psi) - \delta\chi_2(\psi)\,\delta I'_{T,1}(\psi) \right) + \int_{V_V} d^3x \left( \delta\mathbf{J}_{C_1}\cdot\delta\mathbf{A}_{V_2} - \delta\mathbf{J}_{C_2}\cdot\delta\mathbf{A}_{V_1} \right) = 0, \tag{5.33} $$
where we use the notation $\mathbf{F}_{1,2} = \mathbf{F}[\boldsymbol{\xi}_{1,2}, \delta\chi_{1,2}(\psi)]$. This is the generalized free-boundary adjoint relation. The steps leading to (5.33) are outlined in Appendix I.
When the coil currents are confined to filaments, the integral over the vacuum region can be expressed in terms of changes to the coil currents, fluxes through the coils, and integrals along the coils,
$$ \int_{V_V} d^3x \, \delta\mathbf{J}_{C_{1,2}}\cdot\delta\mathbf{A}_{V_{2,1}} = \sum_k \left( \delta\Phi_{C_{2,1},k} \, \delta I_{C_{1,2},k} + I_{C_k}\oint_{C_k} dl \, \delta\mathbf{x}_{1,2,C_k}(\mathbf{x})\cdot\hat{\mathbf{t}}\times\delta\mathbf{B}_{2,1} \right). \tag{5.34} $$
Here $\delta\Phi_{C_k}$ and $\delta I_{C_k}$ are the change in magnetic flux through and change in current in coil $k$, respectively, and $I_{C_k}$ is the current through the unperturbed coil. The unit tangent vector along $C_k$ is $\hat{\mathbf{t}}$, and $\delta\mathbf{x}_{C_k}$ is a vector field of perturbations to the $k$th coil. The above expression is obtained upon application of Stokes' theorem and the expression for the perturbation of a line integral (2.14).

A similar relation can be obtained in the fixed-boundary case. Here the integral over the plasma volume (5.29) can be written as a surface integral by applying the divergence theorem,
$$ U_P = \frac{1}{\mu_0}\int_{S_P} d^2x \, \hat{\mathbf{n}}\cdot\left( \delta\mathbf{B}_1\times\delta\mathbf{A}_2 - \delta\mathbf{B}_2\times\delta\mathbf{A}_1 \right). \tag{5.35} $$
Again, following steps outlined in Appendix I, this may be rewritten in the following form,
$$ \int_{V_P} d^3x \left( \boldsymbol{\xi}_1\cdot\mathbf{F}_2 - \boldsymbol{\xi}_2\cdot\mathbf{F}_1 \right) - 2\pi\int_{V_P} d\psi \left( \delta I_{T,2}(\psi)\,\delta\chi_1'(\psi) - \delta I_{T,1}(\psi)\,\delta\chi_2'(\psi) \right) - \frac{1}{\mu_0}\int_{S_P} d^2x \, \hat{\mathbf{n}}\cdot\left( \boldsymbol{\xi}_2 \, \delta\mathbf{B}_1 - \boldsymbol{\xi}_1 \, \delta\mathbf{B}_2 \right)\cdot\mathbf{B} = 0. \tag{5.36} $$
The fixed-boundary adjoint relation can also be obtained by applying the self-adjointness (5.7) of the MHD force operator (Appendix J). If the second term in (5.36) is integrated by parts in $\psi$, we see that the fixed and free-boundary adjoint relations share the terms involving the products of displacements with bulk forces and perturbed fluxes with perturbed toroidal currents.
The integral over the vacuum region in (5.33) is replaced by an integral over the plasma boundary and a boundary term from the integration by parts in $\psi$ in (5.36).

We now have two integral relations between perturbations 1 and 2, (5.33) and (5.36). They have a common form in that they each are the sum of three integrals: the first involving forces and displacements, the second involving the toroidal current and poloidal flux profiles, and the third involving the manner in which the plasma boundary is prescribed. In (5.33), the free-boundary case, the changes in coil current densities are specified. In (5.36), the fixed-boundary case, the displacement of the outer flux surface is prescribed. Equations (5.33) and (5.36) can also be viewed as the difference in sums of generalized forces and responses. For example, in (5.33) we can consider the quantities $\delta\mathbf{F}$, $\delta\chi(\psi)$, $\delta\mathbf{J}_C$ as forces and $\boldsymbol{\xi}$, $\delta I'_T(\psi)$, $\delta\mathbf{A}_V$ as responses. The fact that the sum of the products of direct forces and adjoint responses less the products of adjoint forces and direct responses vanishes is similar to the relation between forces and fluxes related by Onsager symmetry [177, 178]. In the case of Onsager symmetry, this relation follows from the self-adjoint property of the collision operator. In this case, the symmetry follows from the generalized self-adjointness relation.

We now demonstrate how these relations (5.33) and (5.36) can be used to compute the shape gradient efficiently with a continuous adjoint method.
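Before turning to the continuous problem, the structure of the argument can be seen in a finite-dimensional analogue. The sketch below is a hypothetical illustration and not a discretization of (5.33) or (5.36): a symmetric matrix `K` plays the role of the self-adjoint operator relating responses to forces, the vectors in `FORCINGS` stand in for forces produced by different direct perturbations, and `C` defines a linear figure of merit of the response. Symmetry gives the reciprocity $\mathbf{x}_1\cdot\mathbf{f}_2 - \mathbf{x}_2\cdot\mathbf{f}_1 = 0$, so a single adjoint solve with force `C` yields the change in the figure of merit for every direct forcing.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dot(u: Vector, v: Vector) -> float:
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def matvec(a: Matrix, x: Vector) -> Vector:
    return [dot(row, x) for row in a]


# Hypothetical symmetric operator (forces = K @ responses) and toy data.
K = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
C = [1.0, 0.5, -2.0]                      # figure of merit: delta_f = C . x1
FORCINGS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0], [0.3, -0.2, 0.7]]


def reciprocity_residual(f1: Vector, f2: Vector) -> float:
    """x1 . f2 - x2 . f1, which vanishes because K is symmetric."""
    x1, x2 = solve(K, f1), solve(K, f2)
    return dot(x1, f2) - dot(x2, f1)


def delta_f_direct() -> Vector:
    """One solve per direct forcing."""
    return [dot(C, solve(K, g)) for g in FORCINGS]


def delta_f_adjoint() -> Vector:
    """A single adjoint solve K x2 = C, reused for every direct forcing."""
    x2 = solve(K, C)
    return [dot(x2, g) for g in FORCINGS]
```

In this analogue the adjoint "force" is chosen to be the vector defining the figure of merit, which parallels the choice of the added bulk force or toroidal current perturbation for the adjoint equilibrium problem.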
Consider a general figure of merit which involves a volume integral over the plasma domain,
\[
f(S_P, \mathbf{B}) = \int_{V_P} d^3x \, F(\mathbf{B}), \tag{5.37}
\]
where $F(\mathbf{B})$ depends on the plasma surface through the fixed-boundary MHD equilibrium equations (Table 1.1). We are interested in computing perturbations of $f$ such that (5.10) is satisfied. This constraint is enforced using the following Lagrangian functional,
\[
\mathcal{L}(S_P, \mathbf{B}, \boldsymbol{\xi}_2) = f(S_P, \mathbf{B}) + \int_{V_P} d^3x \, \boldsymbol{\xi}_2 \cdot \left( \frac{(\nabla \times \mathbf{B}) \times \mathbf{B}}{\mu_0} - \nabla p \right), \tag{5.38}
\]
where $\boldsymbol{\xi}_2$ is a Lagrange multiplier and we have defined our inner product to be a volume integral over the domain. To obtain the adjoint equation that $\boldsymbol{\xi}_2$ must satisfy, we compute the functional derivative of (5.38) with respect to $\mathbf{B}$, where we note that perturbations to the magnetic field satisfy (5.20). As $\delta f\left(S_P, \mathbf{B}; \delta\mathbf{B}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)]\right)$ is a linear functional of $\boldsymbol{\xi}_1 \in V_P$, $\delta\chi_1'(\psi)$, and $\boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}}|_{S_P}$, from the Riesz representation theorem, the functional derivative of $f$ with respect to $\mathbf{B}$ is expressed as,
\[
\delta f(S_P, \mathbf{B}; \delta\mathbf{B}_1) = \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 + \int_{V_P} d\psi \, \delta\chi_1'(\psi) L_2(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, L_3, \tag{5.39}
\]
for some quantities $\mathbf{L}_1$, $L_2$, and $L_3$. The functional derivative of $\mathcal{L}$ is now,
\[
\delta\mathcal{L}(S_P, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{B}_1) = \int_{V_P} d^3x \left( \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 + \boldsymbol{\xi}_2 \cdot \mathbf{F}_1 \right) + \int_{V_P} d\psi \, \delta\chi_1'(\psi) L_2(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, L_3, \tag{5.40}
\]
where $\mathbf{F}_1 = \mathbf{F}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)]$ is the generalized force operator associated with the direct perturbation (5.26).
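We apply the fixed-boundary self-adjointness relation (5.36) to obtain,
\[
\begin{aligned}
\delta\mathcal{L}(S_P, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{B}_1) ={}& \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \left( \mathbf{L}_1 + \mathbf{F}_2 \right) \\
&+ \int_{V_P} d\psi \left( \delta\chi_1'(\psi) L_2(\psi) - 2\pi \delta I_{T,2}(\psi) \delta\chi_1'(\psi) + 2\pi \delta I_{T,1}(\psi) \delta\chi_2'(\psi) \right) \\
&+ \int_{S_P} d^2x \left[ \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( L_3 + \frac{\mathbf{B} \cdot \delta\mathbf{B}_2}{\mu_0} \right) - \boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} \, \frac{\mathbf{B} \cdot \delta\mathbf{B}_1}{\mu_0} \right],
\end{aligned}
\tag{5.41}
\]
where $\mathbf{F}_2 = \mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)]$ is the generalized bulk force associated with the adjoint perturbation (5.26), $\delta I_{T,2}(\psi)$ is the adjoint toroidal current perturbation, and $\delta\chi_2(\psi)$ is the adjoint poloidal flux perturbation.

If the direct problem is computed with fixed rotational transform, then $\delta\chi_1(\psi) = 0$, and the adjoint variable (Lagrange multiplier) is chosen to satisfy the linearized equilibrium problem,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \mathbf{L}_1 = 0 \tag{5.42a}
\]
\[
\hat{\mathbf{n}} \cdot \boldsymbol{\xi}_2 |_{S_P} = 0 \tag{5.42b}
\]
\[
\delta\chi_2'(\psi) = 0, \tag{5.42c}
\]
such that the above functional derivative (5.41) vanishes, except for the final term that is already in the desired Hadamard form (5.4). If instead the direct problem is computed with fixed toroidal current, then $\delta I_{T,1}(\psi) = 0$ and the adjoint variable is chosen to satisfy,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \mathbf{L}_1 = 0 \tag{5.43a}
\]
\[
\hat{\mathbf{n}} \cdot \boldsymbol{\xi}_2 |_{S_P} = 0 \tag{5.43b}
\]
\[
\delta I_{T,2}(\psi) = \frac{L_2}{2\pi}. \tag{5.43c}
\]
The shape derivative of $\mathcal{L}$ with respect to boundary perturbation $\boldsymbol{\xi}_1$ is now computed to be,
\[
\begin{aligned}
\delta\mathcal{L}(S_P, \mathbf{B}, \boldsymbol{\xi}_2; \boldsymbol{\xi}_1) ={}& \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( F + L_3 \right) + \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 + \int_{V_P} d\psi \, \delta\chi_1'(\psi) L_2(\psi) \\
&+ \delta \left( \int_{V_P} d^3x \, \boldsymbol{\xi}_2 \cdot \left( \frac{(\nabla \times \mathbf{B}) \times \mathbf{B}}{\mu_0} - \nabla p \right) \right),
\end{aligned}
\tag{5.44}
\]
where the first term is evaluated using the transport theorem (5.3). The notation in the final term indicates a shape derivative with respect to boundary perturbation $\boldsymbol{\xi}_1$. The above expression can be evaluated more easily by using the generalized adjoint relation (5.36), applying the conditions placed on the adjoint state (5.42) or (5.43),
\[
\delta\mathcal{L}(S_P, \mathbf{B}, \boldsymbol{\xi}_2; \boldsymbol{\xi}_1) = \int_{S_P} d^2x \, \hat{\mathbf{n}} \cdot \boldsymbol{\xi}_1 \left( F + L_3 + \frac{\mathbf{B} \cdot \delta\mathbf{B}_2}{\mu_0} \right).
\]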
(5.45)
So we identify the shape gradient to be,
\[
\mathcal{G} = \left( F + L_3 + \frac{\mathbf{B} \cdot \delta\mathbf{B}_2}{\mu_0} \right)_{S_P}. \tag{5.46}
\]
Thus by solving a linearized equilibrium problem corresponding to the addition of a bulk force for $\delta\mathbf{B}_2[\boldsymbol{\xi}_2, \delta\chi_2(\psi)]$, we can compute the shape derivative with respect to any boundary perturbation using the above shape gradient.

We now consider free-boundary perturbations. Consider a general figure of merit which involves a volume integral over the plasma domain,
\[
f(C, \mathbf{B}) = \int_{V_P} d^3x \, F(\mathbf{B}), \tag{5.47}
\]
where $F(\mathbf{B})$ depends on the coil shapes $C = \{C_k\}$ through the free-boundary MHD equilibrium equations (Table 1.2). We are interested in computing perturbations of $f$ such that (5.10) is satisfied, which we enforce with the Lagrangian functional,
\[
\mathcal{L}(C, \mathbf{B}, \boldsymbol{\xi}_2) = f(C, \mathbf{B}) + \int_{V_P} d^3x \, \boldsymbol{\xi}_2 \cdot \left( \frac{(\nabla \times \mathbf{B}) \times \mathbf{B}}{\mu_0} - \nabla p \right). \tag{5.48}
\]
In this case, $\delta f(C, \mathbf{B}; \delta\mathbf{B}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)])$ is a linear functional of $\boldsymbol{\xi}_1 \in V_P$, $\delta\chi_1(\psi)$, and the boundary perturbation $\boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}}|_{S_P}$ resulting from a coil perturbation $\delta\mathbf{x}_{1,C_k} \times \hat{\mathbf{t}}$. (While in the fixed-boundary case we considered $\delta f$ to be a linear functional of $\delta\chi_1'(\psi)$, for the free-boundary case it is more convenient to consider it to be a linear functional of $\delta\chi_1(\psi)$.) By the Riesz representation theorem,
\[
\delta f\left(C, \mathbf{B}; \delta\mathbf{B}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)]\right) = \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 + \int_{V_P} d\psi \, \delta\chi_1(\psi) L_2(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, L_3, \tag{5.49}
\]
for some quantities $\mathbf{L}_1$, $L_2(\psi)$, and $L_3$. The functional derivative of $\mathcal{L}$ is now,
\[
\delta\mathcal{L}\left(C, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{B}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)]\right) = \int_{V_P} d^3x \left( \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 + \boldsymbol{\xi}_2 \cdot \mathbf{F}_1 \right) + \int_{V_P} d\psi \, \delta\chi_1(\psi) L_2(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, L_3.
\]
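(5.50)
We apply the free-boundary relation (5.33) to obtain,
\[
\begin{aligned}
\delta\mathcal{L}\left(C, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{B}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)]\right) ={}& \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \left( \mathbf{L}_1 + \mathbf{F}_2 \right) \\
&+ \int_{V_P} d\psi \left( \delta\chi_1(\psi) L_2(\psi) - 2\pi \delta I_{T,2}'(\psi) \delta\chi_1(\psi) + 2\pi \delta I_{T,1}'(\psi) \delta\chi_2(\psi) \right) \\
&+ \sum_k I_{C_k} \oint_{C_k} dl \left( \delta\mathbf{x}_{1,C_k}(\mathbf{x}) \times \delta\mathbf{B}_2 - \delta\mathbf{x}_{2,C_k}(\mathbf{x}) \times \delta\mathbf{B}_1 \right) \cdot \hat{\mathbf{t}} + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, L_3,
\end{aligned}
\tag{5.51}
\]
where we have considered perturbations to currents in the vacuum region corresponding to displacements of the filamentary coils without change to their currents. If the direct problem is computed with fixed rotational transform, then $\delta\chi_1(\psi) = 0$, and the adjoint variable is chosen to satisfy,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \mathbf{L}_1 = 0 \tag{5.52a}
\]
\[
\delta\chi_2(\psi) = 0 \tag{5.52b}
\]
\[
\delta\mathbf{x}_{2,C_k} \times \hat{\mathbf{t}} = 0, \tag{5.52c}
\]
such that the above functional derivative vanishes, except for the terms involving integrals over $S_P$ or the filamentary coils. If instead the direct problem is computed with fixed toroidal current, then $\delta I_{T,1}(\psi) = 0$ and the adjoint variable is chosen to satisfy,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \mathbf{L}_1 = 0 \tag{5.53a}
\]
\[
\delta I_{T,2}(\psi) = \frac{L_2}{2\pi} \tag{5.53b}
\]
\[
\delta\mathbf{x}_{2,C_k} \times \hat{\mathbf{t}} = 0. \tag{5.53c}
\]
The shape derivative of $\mathcal{L}$ is now computed to be,
\[
\begin{aligned}
\delta\mathcal{L}\left(C, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{x}_{1,C_k}\right) ={}& \int_{V_P} d^3x \left( \boldsymbol{\xi}_1 \cdot \mathbf{L}_1 \right) + \delta \left( \int_{V_P} d^3x \, \boldsymbol{\xi}_2 \cdot \left( \frac{(\nabla \times \mathbf{B}) \times \mathbf{B}}{\mu_0} - \nabla p \right) \right) \\
&+ \int_{V_P} d\psi \, \delta\chi_1(\psi) L_2(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( L_3 + F \right),
\end{aligned}
\tag{5.54}
\]
where the notation $\delta(\ldots)$ indicates a shape derivative with respect to coil displacement $\delta\mathbf{x}_{1,C_k}$. We can now simplify the above expression using the free-boundary relation (5.33) and the conditions placed on the adjoint variable, (5.52) or (5.53). We now obtain,
\[
\delta\mathcal{L}(C, \mathbf{B}, \boldsymbol{\xi}_2; \delta\mathbf{x}_{1,C_k}) = \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( L_3 + F \right) + \sum_k I_{C_k} \oint_{C_k} dl \, \delta\mathbf{x}_{1,C_k} \times \delta\mathbf{B}_2 \cdot \hat{\mathbf{t}}, \tag{5.55}
\]
where it is understood that $\boldsymbol{\xi}_1$ is the perturbation to the boundary arising from the coil perturbation $\delta\mathbf{x}_{1,C_k}$.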
The first term can equivalently be expressed in terms of displacements of the coil shapes using the virtual casing principle [143], though in this Chapter for simplicity we will consider figures of merit such that $(L_3 + F)_{S_P}$ vanishes.

Some examples of these continuous adjoint methods are discussed in the following Sections.

In this Section we will consider figures of merit which depend on the shape of the outer boundary of the plasma (Sections 5.5.1, 5.5.2, 5.5.3, and 5.5.4) and on the shape of the electro-magnetic coils (Sections 5.5.2 and 5.5.3). The shape gradients of these figures of merit will be computed using both a direct method and an adjoint method, to demonstrate that the adjoint method produces identical results to the direct method but at much lower computational expense. For other figures of merit (Sections 5.5.5-5.5.7) the calculation is not possible with existing codes, but a discussion of the adjoint linearized equilibrium equations is presented.

Volume-averaged β

Consider a figure of merit, the volume-averaged $\beta$,
\[
f_\beta = \frac{f_P}{f_B}, \tag{5.56}
\]
where,
\[
f_P = \int_{V_P} d^3x \, p(\psi), \tag{5.57}
\]
and,
\[
f_B = \int_{V_P} d^3x \, \frac{B^2}{2\mu_0}. \tag{5.58}
\]
(This definition of volume-averaged $\beta$ is the one employed in the VMEC code [111].) While $f_\beta$ is a figure of merit not often considered in stellarator shape optimization, we include this calculation to demonstrate the adjoint approach, as its shape gradient can be computed without modifications to an equilibrium code.

Surface shape gradient

We consider direct perturbations about an equilibrium with fixed rotational transform,
\[
\mathbf{F}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)] = 0 \tag{5.59a}
\]
\[
\boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} |_{S_P} = \delta\mathbf{x} \cdot \hat{\mathbf{n}} |_{S_P} \tag{5.59b}
\]
\[
\delta\chi_1'(\psi) = 0. \tag{5.59c}
\]
The differential change in $f_P$ associated with displacement $\boldsymbol{\xi}_1$ is,
\[
\delta f_P(S_P; \boldsymbol{\xi}_1) = -\int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \nabla p + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, p(\psi), \tag{5.60}
\]
which follows from the transport theorem (5.3).
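The first term accounts for the change in $p$ at fixed position due to the motion of the flux surfaces, and the second term accounts for the motion of the boundary. The differential change in $f_B$ associated with $\boldsymbol{\xi}_1$ is,
\[
\delta f_B(S_P; \boldsymbol{\xi}_1) = -\frac{1}{\mu_0} \int_{V_P} d^3x \left( B^2 \nabla \cdot \boldsymbol{\xi}_1 + \boldsymbol{\xi}_1 \cdot \nabla \left( B^2 + \mu_0 p \right) \right) + \frac{1}{2\mu_0} \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, B^2, \tag{5.61}
\]
where we have noted that the perturbation to the magnetic field strength at fixed position is given by,
\[
\delta B = -\frac{1}{B} \left( B^2 \nabla \cdot \boldsymbol{\xi}_1 + \boldsymbol{\xi}_1 \cdot \nabla \left( B^2 + \mu_0 p \right) + \delta\chi_1'(\psi) \, \mathbf{B} \cdot \left( \nabla\psi \times \nabla\varphi \right) \right). \tag{5.62}
\]
The first term in (5.61) corresponds with the change in $f_B$ due to the perturbation to the field strength, while the second term accounts for the motion of the boundary. Applying the divergence theorem we obtain,
\[
\delta f_B(S_P; \boldsymbol{\xi}_1) = -\int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \nabla p - \frac{1}{2\mu_0} \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, B^2. \tag{5.63}
\]
The differential change in $f_\beta$ associated with displacement $\boldsymbol{\xi}_1$ satisfies,
\[
\frac{\delta f_\beta(S_P; \boldsymbol{\xi}_1)}{f_\beta} = \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( \frac{p(\psi)}{f_P} + \frac{B^2}{2\mu_0 f_B} \right) - \left( \frac{1}{f_P} - \frac{1}{f_B} \right) \int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \nabla p. \tag{5.64}
\]
The first term on the right of (5.64) is already in the form of a shape gradient. To evaluate the second term, we turn to the adjoint problem, choosing,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla p = 0 \tag{5.65a}
\]
\[
\boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} |_{S_P} = 0 \tag{5.65b}
\]
\[
\delta\chi_2'(\psi) = 0. \tag{5.65c}
\]
That is, we add a bulk force corresponding to the equilibrium pressure gradient. This additional force produces a proportional change in magnetic field at the boundary and thus from (5.36), we find,
\[
\frac{\delta f_\beta(S_P; \boldsymbol{\xi}_1)}{f_\beta} = \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \left( \frac{p(\psi)}{f_P} + \frac{B^2}{2\mu_0 f_B} + \left( \frac{1}{f_P} - \frac{1}{f_B} \right) \frac{\delta\mathbf{B}_2 \cdot \mathbf{B}}{\mu_0} \right). \tag{5.66}
\]
Thus, we can obtain the shape gradient without perturbing the shape of the surface,
\[
\mathcal{G} = f_\beta \left( \frac{p(\psi)}{f_P} + \frac{B^2}{2\mu_0 f_B} + \left( \frac{1}{f_P} - \frac{1}{f_B} \right) \frac{\delta\mathbf{B}_2 \cdot \mathbf{B}}{\mu_0} \right)_{S_P}.
\]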
(5.67)
In practice, the adjoint magnetic field is approximated from a nonlinear equilibrium solution by adding a small perturbation to the pressure of magnitude $\Delta_P$, $p' = (1 + \Delta_P) \, p$. A forward-difference approximation is used to obtain,
\[
\delta\mathbf{B}_2 \approx \frac{\mathbf{B}(p + \Delta_P \, p) - \mathbf{B}(p)}{\Delta_P}, \tag{5.68}
\]
where $\mathbf{B}(p)$ is the magnetic field evaluated with pressure $p(\psi)$.

A similar expression can be obtained for equilibria for which the rotational transform is allowed to vary, but the toroidal current is held fixed ($\delta I_{T,1} = 0$). In this case,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla p = 0 \tag{5.69a}
\]
\[
\boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} |_{S_P} = 0 \tag{5.69b}
\]
\[
\delta I_{T,2}(\psi) = -I_T(\psi) \left( \frac{1}{f_P} - \frac{1}{f_B} \right)^{-1} \left( \frac{1}{f_B} \right). \tag{5.69c}
\]
The shape gradient can then be obtained from (5.67).

To demonstrate, we use the NCSX LI383 equilibrium [242]. The pressure profile was perturbed with $\Delta_P = 0.01$ to compute the adjoint field. The unperturbed and adjoint equilibria are computed with the VMEC code [111]. The shape gradient obtained with the adjoint solution, $\mathcal{G}_{\text{adjoint}}$, and that obtained with the direct approach, $\mathcal{G}_{\text{direct}}$, are shown in Figure 5.1a. Positive values of the shape gradient indicate that $f_\beta$ increases if a normal perturbation is applied at a given location, as indicated by (5.4). For the direct approach, parameter derivatives with respect to the Fourier harmonics describing the plasma boundary ($\partial f_\beta / \partial R^c_{m,n}$, $\partial f_\beta / \partial Z^s_{m,n}$), where $R^c_{m,n}$ and $Z^s_{m,n}$ are defined through,
\[
R = \sum_{m,n} R^c_{m,n} \cos(m\theta - n N_P \phi) \tag{5.70a}
\]
\[
Z = \sum_{m,n} Z^s_{m,n} \sin(m\theta - n N_P \phi), \tag{5.70b}
\]
are computed with a centered 4-point stencil for $m \le 15$ and $|n| \le$
\[
\mathcal{G}_{\text{residual}} = \frac{\left| \mathcal{G}_{\text{adjoint}} - \mathcal{G}_{\text{direct}} \right|}{\sqrt{\int_{S_P} d^2x \, \mathcal{G}_{\text{adjoint}}^2 \Big/ \int_{S_P} d^2x}}, \tag{5.71}
\]
is shown in Figure 5.1c, where the surface-averaged value of $\mathcal{G}_{\text{residual}}$ is 1 . × − . We note that the number of required equilibrium calculations for the direct shape gradient calculation depends on the Fourier resolution and finite-difference stencil chosen. In this Chapter we present the number of function evaluations required in order for the adjoint and direct shape gradient calculations to agree within a few percent. As the Fourier resolution is increased, the results of the adjoint and direct methods converge to each other.

The parameter $\Delta_P$ must be chosen carefully, as the perturbation must be large enough that the result is not dominated by round-off error, but small enough that nonlinear effects do not become important. The relationship between $\mathcal{G}_{\text{residual}}$ and $\Delta_P$ is shown in Figure 5.1d. Here $\mathcal{G}_{\text{direct}}$ is computed using the parameters reported above such that convergence is obtained. We find that $\mathcal{G}_{\text{residual}}$ decreases as $(\Delta_P)$ until $\Delta_P \approx$ . 5, at which point round-off error begins to dominate. This scaling is to be expected, as $\delta\mathbf{B}_2$ is computed with a forward-difference derivative with step size $\Delta_P$.

For this and the following examples, the computational cost of transforming the parameter derivatives to the shape gradient was negligible compared to the cost of computing the parameter derivatives. The direct approach used 2357 calls to VMEC while the adjoint approach only required two. It is clear that the adjoint method yields nearly identical derivative information to the direct method but at a substantially reduced computational cost.

The residual difference is nonzero due to several sources of error, including discretization error in VMEC. As a result of the assumption of nested magnetic surfaces, MHD force balance (5.10) is not satisfied exactly, but a finite force residual is introduced. Error is also introduced by computing $\delta\mathbf{B}_2$ with the addition of a small perturbation to a nonlinear equilibrium calculation rather than from a linearized MHD solution.

In Figure 5.1 we find that the shape gradient of $f_\beta$ is everywhere positive. This reflects the fact that the toroidal flux enclosed by $S_P$ is fixed. As perturbations which displace the plasma surface outward increase the surface area of a toroidal cross-section, the toroidal field must correspondingly decrease, thus increasing $f_\beta$. We find that the shape gradient is increased in regions of large field strength, as indicated by the second term in (5.67).

Consider a figure of merit, the average rotational transform in a radially localized region,
\[
f_\iota = \int_{V_P} d\psi \, \iota(\psi) \, w(\psi). \tag{5.72}
\]
Here $w(\psi)$ is a normalized weighting function,
\[
w(\psi) = \frac{e^{-(\psi - \psi_m)^2 / \psi_w^2}}{\int_{V_P} d\psi \, e^{-(\psi - \psi_m)^2 / \psi_w^2}}, \tag{5.73}
\]
and $\psi_m$ and $\psi_w$ are parameters defining the center and width of the Gaussian weighting, respectively.
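The localized average (5.72) with the normalized Gaussian weight (5.73) can be sketched numerically as follows. This is only an illustration on a uniform flux grid: the rotational transform profile, the grid, and the values of `psi_m` and `psi_w` are hypothetical and are not those used for the NCSX calculations, and the trapezoidal quadrature is an assumed choice.

```python
import numpy as np


def trapezoid(y, x):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def gaussian_weight(psi, psi_m, psi_w):
    """Weight of the form (5.73): a Gaussian normalized to unit integral."""
    g = np.exp(-((psi - psi_m) ** 2) / psi_w**2)
    return g / trapezoid(g, psi)


def localized_average(profile, psi, psi_m, psi_w):
    """Figure of merit of the form (5.72): integral of profile times weight."""
    return trapezoid(profile * gaussian_weight(psi, psi_m, psi_w), psi)


if __name__ == "__main__":
    psi = np.linspace(0.0, 1.0, 2001)   # normalized flux label (hypothetical)
    iota = 0.4 + 0.2 * psi              # hypothetical linear profile
    print(localized_average(iota, psi, psi_m=0.5, psi_w=0.05))
```

For a profile that varies slowly over the width `psi_w`, the result approaches the local value of the profile at `psi_m`, which is the sense in which the Gaussian envelope stands in for a single-surface evaluation.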
Figure 5.1: (a) The shape gradient for $f_\beta$ (5.56) computed using the adjoint solution (5.67) (left) and using parameter derivatives (right). (b) The shape gradient computed with the adjoint solution in the $\phi$-$\theta$ plane, the VMEC [111] poloidal and toroidal angles (not magnetic coordinates). (c) The fractional difference (5.71) between the shape gradient obtained with the adjoint solution and with parameter derivatives. (d) The fractional difference (5.71) depends on the scale of the perturbation added to the adjoint force balance equation, $\Delta_P$. Figure adapted from [10] with permission.

Surface shape gradient

We consider direct perturbations about an equilibrium such that the toroidal current is fixed and the rotational transform is allowed to vary,
\[
\mathbf{F}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)] = 0 \tag{5.74a}
\]
\[
\boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} |_{S_P} = \delta\mathbf{x} \cdot \hat{\mathbf{n}} |_{S_P} \tag{5.74b}
\]
\[
\delta I_{T,1}(\psi) = 0. \tag{5.74c}
\]
The differential change of $f_\iota$ associated with perturbation $\boldsymbol{\xi}_1$ is,
\[
\delta f_\iota(S_P; \boldsymbol{\xi}_1) = \int_{V_P} d\psi \, \delta\chi_1'(\psi) \, w(\psi). \tag{5.75}
\]
For the adjoint problem, we prescribe,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] = 0 \tag{5.76a}
\]
\[
\boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} |_{S_P} = 0 \tag{5.76b}
\]
\[
\delta I_{T,2} = w(\psi). \tag{5.76c}
\]
This additional current produces a proportional change in the magnetic field at the boundary; thus using (5.36), we obtain the following,
\[
\delta f_\iota(S_P; \boldsymbol{\xi}_1) = \frac{1}{2\pi\mu_0} \int_{S_P} d^2x \, \hat{\mathbf{n}} \cdot \boldsymbol{\xi}_1 \, \delta\mathbf{B}_2 \cdot \mathbf{B}. \tag{5.77}
\]
So, we can obtain the shape gradient from the adjoint solution,
\[
\mathcal{G} = \left( \frac{\delta\mathbf{B}_2 \cdot \mathbf{B}}{2\pi\mu_0} \right)_{S_P}. \tag{5.78}
\]
Note that the computation of the shape derivative of the rotational transform on a single surface, $\psi_m$, with the adjoint approach would require a delta-function current perturbation, $\delta I_{T,2} = \delta(\psi - \psi_m)$. As this type of perturbation is difficult to resolve in a numerical computation, the use of the Gaussian envelope allows the shape derivative of the rotational transform in a localized region of $\psi_m$ to be computed.

To demonstrate, we use the NCSX LI383 equilibrium.
We again apply a forward-difference approximation (5.68) of the adjoint solution, characterized by amplitude $\Delta_I = 715$ A. The parameters of the weight function are taken to be $\psi_m$ = 0 . ψ , and $\psi_w$ = 0 . ψ . The shape gradients obtained with the adjoint solution and with the direct approach are shown in Figure 5.2a. For the direct approach, the shape gradient is computed from parameter derivatives with respect to the Fourier harmonics of the boundary (2.16) using an 8-point stencil with $m \le 18$ and $|n| \le 12$. The fractional difference, $\mathcal{G}_{\text{residual}}$, between the two approaches is shown in Figure 5.2c, with a surface-averaged value of 2 . × − . The direct approach used 7401 calls to VMEC, while the adjoint only required two. Again, it is apparent that the adjoint method allows the same derivative information to be computed at a much lower computational cost.

We find that over much of the surface, the shape gradient is close to zero. A region of large negative shape gradient occurs in the concave region of the plasma surface with adjacent regions of large positive shape gradient. This indicates that "pinching" the surface in this region, making it more concave, would increase $\iota$ near the axis.

Figure 5.2: (a) The shape gradient for $f_\iota$ (5.72) computed using the adjoint solution (5.78) (left) and using parameter derivatives (right). (b) The shape gradient computed with the adjoint solution in the $\phi$-$\theta$ plane, the VMEC [111] poloidal and toroidal angles (not magnetic coordinates). (c) The fractional difference (5.71) between the shape gradient obtained with the adjoint solution and with parameter derivatives. Again, the results are essentially indistinguishable, as expected. Figure adapted from [10] with permission.

Coil shape gradient
The shape gradient of $f_\iota$ can also be computed with a free-boundary approach. We consider perturbations about an equilibrium with fixed toroidal current,
\[
\mathbf{F}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)] = 0 \tag{5.79a}
\]
\[
\delta I_{T,1}(\psi) = 0, \tag{5.79b}
\]
with specified perturbation to the coil shapes, $\delta\mathbf{x}_{C_1} \times \hat{\mathbf{t}}$. We prescribe the adjoint problem,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] = 0 \tag{5.80a}
\]
\[
\delta\mathbf{x}_{C_2} \times \hat{\mathbf{t}} = 0 \tag{5.80b}
\]
\[
\delta I_{T,2}(\psi) = w(\psi), \tag{5.80c}
\]
where $w(\psi)$ is given by (5.73). Using (5.75) and (5.33) and noting that $\delta I_{T,2}(\psi)$ vanishes at the plasma boundary and on the axis, we find,
\[
\delta f_\iota(C; \delta\mathbf{x}_{C_1}) = \frac{1}{2\pi} \int_{V_V} d^3x \, \delta\mathbf{J}_{C,1} \cdot \delta\mathbf{A}_{V,2}. \tag{5.81}
\]
Using (5.34), this can be written in terms of changes in the positions of coils in the vacuum region,
\[
\delta f_\iota(C; \delta\mathbf{x}_{C_1}) = \frac{1}{2\pi} \sum_k \left( I_{C_k} \oint_{C_k} dl \, \delta\mathbf{x}_{C_1,k}(\mathbf{x}) \cdot \hat{\mathbf{t}} \times \delta\mathbf{B}_2 \right). \tag{5.82}
\]
When computing the coil shape gradient, the current in each coil is fixed. In arriving at (5.82), we assume that $\delta I_{C,1,k} = 0$. The coil shape gradient is thus
\[
\widetilde{\mathcal{G}}_k = \left. \frac{I_{C_k} \, \hat{\mathbf{t}} \times \delta\mathbf{B}_2}{2\pi} \right|_{C_k}. \tag{5.83}
\]
As anticipated, $\widetilde{\mathcal{G}}_k$ has no component in the direction tangent to the coil. The adjoint magnetic field is computed with a forward-difference approximation (5.68) with step size $\Delta_I$ = 5 . × A. Evaluating the shape gradient requires computing the adjoint magnetic field at the unperturbed coil locations in the vacuum region. This can be performed with the DIAGNO code [71, 143], which employs the virtual casing principle.

To demonstrate, we use the NCSX stellarator LI383 equilibrium. The toroidal current profile was perturbed with $\psi_m$ = 0 . ψ and $\psi_w$ = 0 . ψ . The shape gradient is computed for each of the three unique modular coils per half period of the C09R00 coil set [236], keeping the planar coils fixed. The result obtained with the adjoint solution, $\widetilde{\mathcal{G}}_{\text{adjoint},k}$, is shown in Figure 5.3. The shape gradient is also computed with the direct approach, $\widetilde{\mathcal{G}}_{\text{direct},k}$.
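Once the adjoint field is known at the coil locations, an expression of the form (5.83) is a pointwise cross product along each filament. The sketch below assumes a closed filament sampled uniformly in its angle parameter and a user-supplied callable `adjoint_field` returning $\delta\mathbf{B}_2$ at given points; both are hypothetical stand-ins (in the text the adjoint field comes from VMEC and DIAGNO), and the periodic centered-difference tangent is an assumed discretization.

```python
import numpy as np


def coil_shape_gradient(points, current, adjoint_field):
    """Evaluate I * t_hat x dB / (2 pi) on a closed filament.

    points: (N, 3) samples of one coil, uniform in the angle parameter,
            without repeating the first point at the end.
    adjoint_field: hypothetical callable mapping (N, 3) points to (N, 3) field.
    Returns the gradient, the unit tangent, and the arclength weights dl.
    """
    n = points.shape[0]
    dtheta = 2.0 * np.pi / n
    # Periodic centered difference for dx/dtheta.
    dx = (np.roll(points, -1, axis=0) - np.roll(points, 1, axis=0)) / (2.0 * dtheta)
    speed = np.linalg.norm(dx, axis=1)
    tangent = dx / speed[:, None]
    grad = current * np.cross(tangent, adjoint_field(points)) / (2.0 * np.pi)
    return grad, tangent, speed * dtheta


def shape_derivative(grad, dl, displacement):
    """Line integral of displacement . gradient along the coil, as in (5.82)."""
    return float(np.sum(dl * np.sum(displacement * grad, axis=1)))


if __name__ == "__main__":
    theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
    coil = np.stack([1.5 * np.cos(theta), 1.5 * np.sin(theta), 0.0 * theta], axis=1)
    uniform_bz = lambda x: np.tile([0.0, 0.0, 1.0e-3], (x.shape[0], 1))
    grad, tangent, dl = coil_shape_gradient(coil, 2.0e3, uniform_bz)
    print(np.max(np.abs(np.sum(grad * tangent, axis=1))))  # tangential component
```

Since the gradient is built from a cross product with the unit tangent, its tangential component vanishes to round-off, consistent with the statement that $\widetilde{\mathcal{G}}_k$ has no component along the coil.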
https://princetonuniversity.github.io/STELLOPT/VMEC%20Free%20Boundary%20Run

\[
x^k = \sum_m X^k_{cm} \cos(m\theta) + X^k_{sm} \sin(m\theta) \tag{5.84a}
\]
\[
y^k = \sum_m Y^k_{cm} \cos(m\theta) + Y^k_{sm} \sin(m\theta) \tag{5.84b}
\]
\[
z^k = \sum_m Z^k_{cm} \cos(m\theta) + Z^k_{sm} \sin(m\theta), \tag{5.84c}
\]
where $\theta \in [0, 2\pi]$ parameterizes each filament and $k$ denotes each coil shape. The numerical derivatives with respect to these parameters are computed for $m \le 45$ using an 8-point stencil. In Figure 5.4a the Cartesian components of the shape gradient computed with the adjoint approach, $\widetilde{\mathcal{G}}^l_{\text{adjoint},k}$, and with the direct approach, $\widetilde{\mathcal{G}}^l_{\text{direct},k}$, are shown for each coil, where $l \in \{x, y, z\}$. The arrows indicate the direction and magnitude of $\widetilde{\mathcal{G}}_k$ such that if a coil were deformed in the direction of $\widetilde{\mathcal{G}}_k$, $f_\iota$ would increase according to (5.6). The direct approach used 6553 calls to VMEC, while the adjoint only required two. In Figure 5.4b the fractional difference between the results obtained with the two methods,
\[
\widetilde{\mathcal{G}}^l_{\text{residual},k} = \frac{\left| \widetilde{\mathcal{G}}^l_{\text{adjoint},k} - \widetilde{\mathcal{G}}^l_{\text{direct},k} \right|}{\sqrt{\oint_{C_k} dl \left( \widetilde{\mathcal{G}}^l_{\text{adjoint},k} \right)^2 \Big/ \oint_{C_k} dl}}, \tag{5.85}
\]
is plotted. The line-averaged values of $\widetilde{\mathcal{G}}^l_{\text{residual}}$ are 6 . × − for coil 1, 3 . × − for coil 2, and 4 . × − for coil 3.

From Figure 5.3, we see that the sensitivity of $f_\iota$ to coil displacements is much higher in regions where the coils are close to the plasma surface. The shape gradient points toward the plasma surface in the concave region of the plasma surface, while on the outboard side the sensitivity is significantly lower, again indicating the "pinching" effect seen in Figure 5.2.

The averaged radial (normal to a flux surface) curvature is an important metric for MHD stability [64],
\[
\kappa_\psi \equiv \left\langle \boldsymbol{\kappa} \cdot \left( \frac{\partial \mathbf{x}}{\partial \psi} \right)_{\alpha,l} \right\rangle_\psi = \left\langle \frac{1}{2B^2} \left( \frac{\partial}{\partial \psi} \left( 2\mu_0 p + B^2 \right) \right)_{\alpha,l} \right\rangle_\psi, \tag{5.86}
\]
where the curvature is $\boldsymbol{\kappa} = \hat{\mathbf{b}} \cdot \nabla \hat{\mathbf{b}}$, $\hat{\mathbf{b}} = \mathbf{B}/B$ is a unit vector in the direction of the magnetic field, and $l$ measures length along a field line. Subscripts in the above expression ($\alpha$, $l$) indicate quantities held fixed while computing the derivative. The flux surface average of a quantity $A$ is,
\[
\langle A \rangle_\psi = \frac{\int_{-\infty}^{\infty} \frac{dl}{B} A}{\int_{-\infty}^{\infty} \frac{dl}{B}} = \frac{\int_0^{2\pi} d\vartheta \int_0^{2\pi} d\varphi \, \sqrt{g} \, A}{V'(\psi)}.
\]
(5.87)
Figure 5.3: The coil shape gradient for $f_\iota$ (5.72) computed using the adjoint solution (5.83) for each of the 3 unique coil shapes (black). The arrows indicate the direction of $\widetilde{\mathcal{G}}_k$, and their length indicates the local magnitude relative to the reference arrow shown. The arrows are not visible on this scale on the outboard side. Figure reproduced from [10] with permission.

Figure 5.4: (a) The Cartesian components of the coil shape gradient for each of the 3 unique modular NCSX coils computed with the adjoint and direct approaches. (b) The fractional difference (5.85) between the shape gradient computed with the adjoint approach and the direct approach is plotted for each Cartesian component and each of the 3 unique coils. Figure adapted from [10] with permission.

Here $V(\psi)$ is the volume enclosed by the surface labeled by $\psi$. The average radial curvature appears in the ideal MHD potential energy functional for interchange modes, and it provides a stabilizing effect when $p'(\psi) \kappa_\psi < 0$. As typically $p'(\psi) <$ $\kappa_\psi >$
\[
\kappa_\psi = -\frac{V''(\psi)}{V'(\psi)}. \tag{5.88}
\]
Thus, as volume increases with flux, $V''(\psi) <$ $p'(\psi) V''(\psi)$ also appears in the Mercier criterion for ideal MHD interchange stability [157]. Known as the vacuum magnetic well, $V''(\psi)$ has been employed in the optimization of several stellarator configurations (e.g. [106, 114]).

We consider the following figure of merit,
\[
f_W = \int_{V_P} d\psi \, w(\psi) \, V'(\psi), \tag{5.89}
\]
where $w(\psi)$ is a radial weight function which will be chosen so that (5.89) approximates $V''(\psi)$. This can equivalently be written as,
\[
f_W = \int_{V_P} d^3x \, w(\psi). \tag{5.90}
\]

Surface shape gradient
We consider direct perturbations about an equilibrium with fixed toroidal current (5.74). The shape derivative of $f_W$ is computed upon application of the transport theorem (5.3), noting that $\delta\psi = -\boldsymbol{\xi}_1 \cdot \nabla\psi$,
\[
\delta f_W(S_P; \boldsymbol{\xi}_1) = -\int_{V_P} d^3x \, \boldsymbol{\xi}_1 \cdot \nabla w(\psi) + \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, w(\psi), \tag{5.91}
\]
where we have assumed $w(\psi)$ to be differentiable. We recast the first term in (5.91) as a surface integral by applying the fixed-boundary adjoint relation (5.36) and prescribing the adjoint perturbation to satisfy the following,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla w(\psi) = 0 \tag{5.92a}
\]
\[
\boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} |_{S_P} = 0 \tag{5.92b}
\]
\[
\delta I_{T,2}(\psi) = 0. \tag{5.92c}
\]
Upon application of (5.36) we obtain the following expression for the shape gradient, which depends on the adjoint solution $\delta\mathbf{B}_2$,
\[
\mathcal{G}_W = \left( w(\psi) + \frac{\delta\mathbf{B}_2 \cdot \mathbf{B}}{\mu_0} \right)_{S_P}. \tag{5.93}
\]
In Figure 5.5 we present the computation of $\mathcal{G}_W$ for the NCSX LI383 equilibrium [242] using the adjoint and direct approaches. We use a weight function,
\[
w(\psi) = \exp\left( -(\psi - \psi_{m,1})^2 / \psi_w^2 \right) - \exp\left( -(\psi - \psi_{m,2})^2 / \psi_w^2 \right), \tag{5.94}
\]
such that $f_W$ remains smooth while it approximates $V'(\psi_{m,1}) - V'(\psi_{m,2})$, where $\psi_{m,1}$ = 0 . ψ , $\psi_{m,2}$ = 0 . ψ , and $\psi_w$ = 0 . ψ (Figure 5.5c). We note that $f_W$ can be interpreted as measuring the change in volume due to the interchange of two flux tubes centered at $\psi_{m,1}$ and $\psi_{m,2}$. If $f_W > 0$, this indicates that moving a flux tube radially outward will cause it to expand and lower its potential energy.

The adjoint magnetic field is computed with a forward-difference approximation (5.68) characterized by a step size $\Delta_P = 400$ Pa. For the direct approach, derivatives with respect to the Fourier discretization (5.70) of the boundary are computed for $m \le 20$ and $|n| \le$ $\mathcal{G}_{\text{residual}}$ is 3 . × − .

Coil shape gradient
The shape derivative of $f_W$ can also be computed with respect to a perturbation of the coil shapes. We consider perturbations about an equilibrium with fixed toroidal current,
\[
\mathbf{F}[\boldsymbol{\xi}_1, \delta\chi_1(\psi)] = 0 \tag{5.95a}
\]
\[
\delta I_{T,1}(\psi) = 0, \tag{5.95b}
\]
with specified perturbation to the coil shapes, $\delta\mathbf{x}_{C_1} \times \hat{\mathbf{t}}$. We prescribe the following adjoint perturbation,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla w(\psi) = 0 \tag{5.96a}
\]
\[
\delta\mathbf{x}_{C_2} \times \hat{\mathbf{t}} = 0 \tag{5.96b}
\]
\[
\delta I_{T,2}(\psi) = 0. \tag{5.96c}
\]
The same weight function (5.94) is applied, which decreases sufficiently fast that we can approximate $w(\psi) = 0$ at the plasma boundary. Upon application of the free-boundary adjoint relation (5.33), we obtain the following coil shape gradient,
\[
\widetilde{\mathcal{G}}_k = \left. \frac{I_{C_k} \, \hat{\mathbf{t}} \times \delta\mathbf{B}_2}{\mu_0} \right|_{C_k}. \tag{5.97}
\]
The calculation of $\widetilde{\mathcal{G}}_k$ for each of the 3 unique coil shapes from the NCSX C09R00 coil set is shown in Figure 5.6. A two-point centered-difference approximation of the adjoint magnetic field (5.68) is applied with characteristic step size $\Delta_P$ = 3 × Pa. The adjoint field is evaluated in the vacuum region using the DIAGNO code. The shape gradient is also computed with a direct approach. The Cartesian components of each coil are Fourier-discretized (5.84), and derivatives are computed with respect to modes with $m \le 40$ with a 4-point centered-difference stencil. The fractional difference between the results obtained with the two approaches is quantified with (5.85). The line-averaged value of $\widetilde{\mathcal{G}}^l_{\text{residual},k}$ is 4 . × − . The direct approach required 2917 VMEC calls while the adjoint only required three.

Figure 5.5: The shape gradient for $f_W$ (5.89) is computed using the (a) adjoint and (b) direct approaches. (c) The weight function (5.94) used to compute $f_W$. Figure reproduced from [187] with permission.

Figure 5.6: The coil shape gradient for $f_W$ is calculated for each of the 3 unique NCSX coil shapes using the (a) adjoint and (b) direct approaches. The arrows indicate the direction of $\widetilde{\mathcal{G}}_k$ (5.97), and their lengths indicate the magnitude scaled according to the legend. Figure reproduced from [187] with permission.

We now consider a figure of merit which quantifies the ripple near the magnetic axis [37, 58, 59]. As all physical quantities must be independent of the poloidal angle on the magnetic axis, this quantifies the departure from quasi-helical or quasi-axisymmetry near the magnetic axis. We define the magnetic ripple to be,
\[
f_R = \int_{V_P} d^3x \, \widetilde{f}_R, \tag{5.98}
\]
with,
\[
\widetilde{f}_R(\psi, B) = \frac{1}{2} w(\psi) \left( B - \overline{B} \right)^2 \tag{5.99a}
\]
\[
\overline{B} = \frac{\int_{V_P} d^3x \, w(\psi) \, B}{\int_{V_P} d^3x \, w(\psi)}, \tag{5.99b}
\]
and a weight function given by,
\[
w(\psi) = \exp\left( -\psi^2 / \psi_w^2 \right), \tag{5.100}
\]
with $\psi_w$ = 0 . ψ .

Surface shape gradient

We compute perturbations about an equilibrium with fixed rotational transform (5.59). Noting that the local perturbation to the field strength is given by (5.62), the shape derivative is computed with the transport theorem (5.3),
\[
\delta f_R(S_P; \boldsymbol{\xi}_1) = \int_{S_P} d^2x \, \boldsymbol{\xi}_1 \cdot \hat{\mathbf{n}} \, \widetilde{f}_R + \int_{V_P} d^3x \left( \frac{\partial \widetilde{f}_R(\psi, B)}{\partial B} \delta B + \frac{\partial \widetilde{f}_R(\psi, B)}{\partial \psi} \delta\psi \right).
\]
(5.101)
We prescribe the following adjoint perturbation,
\[
\mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla \cdot \mathsf{P} = 0 \tag{5.102a}
\]
\[
\boldsymbol{\xi}_2 \cdot \hat{\mathbf{n}} |_{S_P} = 0 \tag{5.102b}
\]
\[
\delta\chi_2'(\psi) = 0. \tag{5.102c}
\]
The bulk force perturbation required for the adjoint problem is written as the divergence of an anisotropic pressure tensor, $\mathsf{P} = p_\perp \mathsf{I} + (p_{||} - p_\perp) \hat{\mathbf{b}} \hat{\mathbf{b}}$, where $\mathsf{I}$ is the identity tensor. The parallel and perpendicular pressures are related by the parallel force balance condition,
\[
\frac{\partial p_{||}(\psi, B)}{\partial B} = \frac{p_{||} - p_\perp}{B}, \tag{5.103}
\]
which follows from the requirement that $\hat{\mathbf{b}} \cdot \delta\mathbf{F} = 0$ (5.25). We take the parallel pressure to be,
\[
p_{||} = \widetilde{f}_R. \tag{5.104}
\]
Upon application of the fixed-boundary adjoint relation and the expression for the curvature in an equilibrium field,
\[
\boldsymbol{\kappa} = \frac{\nabla_\perp B}{B} + \frac{\mu_0 \nabla p}{B^2}, \tag{5.105}
\]
we obtain the following shape gradient,
\[
\mathcal{G}_R = \left( p_\perp + \frac{\delta\mathbf{B}_2 \cdot \mathbf{B}}{\mu_0} \right)_{S_P}. \tag{5.106}
\]
If instead the toroidal current is held fixed in the direct perturbation as in (5.74), then the required adjoint current perturbation is given by,
\[
\delta I_{T,2}(\psi) = \frac{V'(\psi)}{2\pi} \left\langle \frac{\partial \widetilde{f}_R(\psi, B)}{\partial B} \, \hat{\mathbf{b}} \cdot \nabla\varphi \times \nabla\psi \right\rangle_\psi, \tag{5.107}
\]
with the shape gradient unchanged. See Appendix L for details of the calculation.

To compute the adjoint perturbation (5.102)-(5.107), we consider the addition of an anisotropic pressure tensor to the nonlinear force balance equation,
\[
\mathbf{J}' \times \mathbf{B}' = \nabla p' + \Delta_P \nabla \cdot \mathsf{P}(\psi', B'), \tag{5.108}
\]
where $\mathsf{P}(\psi', B') = p_\perp(\psi', B') \mathsf{I} + \left( p_{||}(\psi', B') - p_\perp(\psi', B') \right) \hat{\mathbf{b}}' \hat{\mathbf{b}}'$. Here primes indicate the perturbed quantities (i.e. $B' = B + \delta B$), where unprimed quantities satisfy (5.10). As in Section 5.5.3, the perturbation has a scale set by $\Delta_P$, which is chosen to be small enough that the response is linear.
Enforcing parallel force balance from (5.108) results in the following condition,
\[
\frac{\partial p_{||}(\psi', B')}{\partial B'} = \frac{p_{||}(\psi', B') - p_\perp(\psi', B')}{B'}. \tag{5.109}
\]
If we furthermore assume that $\Delta_P \nabla \cdot \mathsf{P}$ is small compared with the other terms in (5.108), we can consider it to be a perturbation to the base equilibrium (5.10). In this way, we can apply the perturbed force balance equation (5.25) with $\delta\mathbf{F} = -\Delta_P \nabla \cdot \mathsf{P}(B)$, where $\mathsf{P}$ is now evaluated with the equilibrium field which satisfies (5.10). Thus the desired pressure tensor (5.104) can be implemented by evaluating $p_{||}$ with the perturbed field such that (5.109) is satisfied.

We have implemented the pressure tensor defined by (5.103)-(5.104) in the ANIMEC code [43], which modifies the VMEC variational principle to allow 3D equilibrium solutions with anisotropic pressures to be computed. The ANIMEC code has been used to model equilibria with energetic particle species using pressure tensors based on bi-Maxwellian [45] and slowing-down [44] distribution functions. The variational principle assumes that $p_{||}$ only varies on a surface through $B$ and can, therefore, be used to include the required adjoint bulk force.

In Figure 5.7, we present the computation of $\mathcal{G}_R$ for the NCSX LI383 equilibrium using the adjoint and direct approaches. For the direct approach, derivatives with respect to the Fourier discretization of the boundary (5.70) are computed for $m \le$
11 and | n | ≤ P = 7 . × Pa. The directapproach required 2761 calls to VMEC while the adjoint approach required two calls. Thesurface-averaged value of G residual (5.71) is 3 . × − . /ν regime The effective ripple in the 1 /ν regime [168] is a figure of merit which has proven valuablefor neoclassical optimization (e.g. [106, 134, 242]). This quantity characterizes the geometricdependence of the neoclassical particle flux under the assumption of low-collisionality suchthat (cid:15) eff is analogous to the helical ripple amplitude, (cid:15) h , that appears in the expression ofthe 1 /ν particle flux for a classical stellarator [66]. The following expression is obtained forthe effective ripple, (cid:15) / ( ψ ) = π √ V (cid:48) ( ψ ) (cid:15) (cid:90) /B min /B max dλλ (cid:90) π dα (cid:88) i ( ∂∂α ˆ K i ( α, λ )) ˆ I i ( α, λ ) . (5.110)Here λ = v ⊥ / ( v B ) is the pitch angle, B min and B max are the minimum and maximum valuesof the field strength on a surface labeled by ψ , and (cid:15) ref is a reference aspect ratio. We have116 a) Adjoint (b) Direct(c) Weight function Figure 5.7: The shape gradient for f R (5.98) is computed using the (a) adjoint and (b) directapproaches with a weight function (5.100) shown in (c). Figure reproduced from [187] withpermission. 117efined the bounce integrals, ˆ I i ( α, λ ) = (cid:73) dl v || Bv (5.111a)ˆ K i ( α, λ ) = (cid:73) dl v || Bv , (5.111b)where the notation (cid:72) dl = (cid:80) σ σ (cid:82) ϕ + ϕ − dϕ/ ˆ b · ∇ ϕ indicates integration at constant λ and α between successive bounce points where v || ( ϕ + ) = v || ( ϕ − ) = 0 and σ = sign( v || ). The sumin (5.110) is taken over wells at constant λ and α for ϕ − ,i ∈ [0 , π ).We consider an integrated figure of merit, f (cid:15) = (cid:90) V P d x w ( ψ ) (cid:15) / ( ψ ) , (5.112)where w ( ψ ) is a radial weight function. We perturb about an equilibrium with fixed toroidalcurrent (5.74). 
The shape derivative of f (cid:15) is computed to be, δf (cid:15) ( S P ; ξ ) = (cid:90) V P d x (cid:0) P (cid:15) : ∇ ξ + δχ (cid:48) ( ψ ) I (cid:15) (cid:1) , (5.113)where the double dot (:) indicates contraction between dyadic tensors A and B as A : B = (cid:80) i,j A ij B ji , with, I (cid:15) = πw ( ψ )2 √ (cid:15) (cid:90) /B /B max dλλ × (cid:34) (cid:16) ∂∂α ˆ K ( α, λ, ϕ ) (cid:17) ˆ I ( α, λ, ϕ ) − ϕ B × ∇ ψ · ∇ (cid:32) | v || | vB (cid:33) + B × ∇ ψ · ∇ ϕ ∂∂B (cid:32) | v || | vB (cid:33) + 2 ∂∂α (cid:32) ∂∂α ˆ K ( α, λ, ϕ )ˆ I ( α, λ, ϕ ) (cid:33) − ϕ B × ∇ ψ · ∇ (cid:32) | v || | v B (cid:33) + B × ∇ ψ · ∇ ϕ ∂∂B (cid:32) | v || | v B (cid:33) (cid:35) , (5.114)and P (cid:15) = p || ˆ b ˆ b + p ⊥ ( I − ˆ b ˆ b ) with, p || = − πw ( ψ )2 √ (cid:15) (cid:90) /B /B max dλλ (cid:32) (cid:16) ∂∂α ˆ K ( α, λ, ϕ ) (cid:17) ˆ I ( α, λ, ϕ ) | v || | v + 2 ∂∂α (cid:32) ∂∂α ˆ K ( α, λ, ϕ )ˆ I ( α, λ, ϕ ) (cid:33) | v || | v (cid:33) (5.115a) p ⊥ = − πw ( ψ )2 √ (cid:15) (cid:90) /B /B max dλλ (cid:32) (cid:16) ∂∂α ˆ K ( α, λ, ϕ ) (cid:17) ˆ I ( α, λ, ϕ ) (cid:32) λvB | v || | + | v || | v (cid:33) + 2 ∂∂α (cid:32) ∂∂α ˆ K ( α, λ, ϕ )ˆ I ( α, λ, ϕ ) (cid:33) (cid:32) λ | v || | B v + | v || | v (cid:33) (cid:33) . (5.115b)118erivatives are computed assuming (cid:15) ref is held constant. The bounce integrals are de-fined with respect to ϕ such that ˆ I ( α, λ, ϕ ) = ˆ I i if ϕ ∈ [ ϕ − ,i , ϕ + ,i ] and ˆ I ( α, λ, ϕ ) = 0 if λB ( α, ϕ ) >
1. The same convention is used for $\hat{K}(\alpha,\lambda,\phi)$.

We prescribe the following adjoint perturbation,
\[ \mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] - \nabla\cdot\mathsf{P}_\epsilon = 0, \tag{5.116a} \]
\[ \boldsymbol{\xi}_2\cdot\hat{\mathbf{n}}\,|_{S_P} = 0, \tag{5.116b} \]
\[ \delta I_{T,2}(\psi) = \frac{V'(\psi)}{2\pi} \left\langle \mathcal{I}_\epsilon \right\rangle_\psi. \tag{5.116c} \]
The adjoint bulk force must be consistent with parallel force balance from (5.25), which is equivalent to the condition,
\[ \nabla_\| p_\| = \frac{\nabla_\| B}{B}\left( p_\| - p_\perp \right). \tag{5.117} \]
This can be shown to be satisfied by (5.115), noting that the $\lambda$ integrand vanishes at $\lambda = 1/B$ such that there is no contribution from the parallel gradient acting on the bounds of the integral. There is also no contribution to the parallel gradient from the bounce integrals, as $|v_\||$ vanishes at points of non-zero gradient of $\hat{I}(\alpha,\lambda,\phi)$ and $\hat{K}(\alpha,\lambda,\phi)$.

Upon application of the fixed-boundary adjoint relation (5.36) and integration by parts, we obtain the following expression for the shape gradient,
\[ \mathcal{G}_\epsilon = \left( p_\perp + \frac{\delta\mathbf{B}_2\cdot\mathbf{B}}{\mu_0} \right)_{S_P}. \tag{5.118} \]
See Appendix M for details of the calculation. The approach demonstrated in this Section could be extended to compute the shape gradients of other figures of merit involving bounce integrals, such as the $\Gamma_c$ metric for energetic particle confinement [169] or the variation of the parallel adiabatic invariant on a flux surface [58].

Quasi-symmetry is desirable as it ensures collisionless confinement of guiding centers. This property follows when the field strength depends on a linear combination of the Boozer angles, $B(\psi,\vartheta_B,\phi_B) = B(\psi, M\vartheta_B - N\phi_B)$ for fixed integers $M$ and $N$ [22, 175] (Appendix5.5.6). Several stellarator configurations have been optimized to be close to quasi-symmetry (e.g., [57, 106, 149, 197]) by minimizing the amplitude of symmetry-breaking Fourier harmonics of the field strength.
We will consider a figure of merit that does not require a Boozer coordinate transformation; instead, we use a general set of magnetic coordinates $(\psi,\vartheta,\phi)$ to define our figure of merit.

In Boozer coordinates [21, 97] $(\psi,\vartheta_B,\phi_B)$ the covariant form for the magnetic field is,
\[ \mathbf{B} = I(\psi)\nabla\vartheta_B + G(\psi)\nabla\phi_B + K(\psi,\vartheta_B,\phi_B)\nabla\psi. \tag{5.119} \]
Here $G(\psi) = \mu_0 I_P(\psi)/(2\pi)$, where $I_P(\psi)$ is the poloidal current outside the $\psi$ surface. The poloidal current can be computed using Ampere's law and expressed as an integral over a surface labeled by $\psi$, $S_P(\psi)$,
\[ I_P(\psi) = \frac{1}{\mu_0}\int_0^{2\pi} d\phi\, \mathbf{B}\cdot\frac{\partial\mathbf{x}}{\partial\phi} = -\frac{1}{2\pi\mu_0}\int_{S_P(\psi)} d^2x\, \mathbf{B}\cdot\nabla\vartheta\times\hat{\mathbf{n}}. \tag{5.120} \]
The quantity $I(\psi) = \mu_0 I_T(\psi)/(2\pi)$, where $I_T(\psi)$ is the toroidal current inside the $\psi$ surface (5.16). We quantify the departure from quasi-symmetry in the following way,
\[ f_{QS} = \frac{1}{2}\int_{V_P} d^3x\, w(\psi)\left( \mathbf{B}\times\nabla\psi\cdot\nabla B - F(\psi)\,\mathbf{B}\cdot\nabla B \right)^2. \tag{5.121} \]
Here $w(\psi)$ is a radial weight function and,
\[ F(\psi) = \frac{(M/N)\,G(\psi) + I(\psi)}{(M/N)\,\iota(\psi) - 1}. \tag{5.122} \]
If $f_{QS} = 0$, then the field is quasi-symmetric with mode numbers $M$ and $N$ [97], which can be shown using the contravariant (5.13) and covariant (5.119) representations of the magnetic field assuming $B = B(\psi, M\vartheta_B - N\phi_B)$ for fixed $M$ and $N$. Note that $f_{QS}$ quantifies the symmetry in Boozer coordinates but can be evaluated in any flux coordinate system.

We consider perturbation about an equilibrium with fixed toroidal current (5.74). The perturbation to the Boozer poloidal covariant component is computed using the transport theorem (5.3),
\[ \delta G(\psi) = -\frac{1}{4\pi^2}\int_{S_P(\psi)} d^2x \left( \nabla\cdot(\mathbf{B}\times\nabla\vartheta)\, \boldsymbol{\xi}\cdot\hat{\mathbf{n}} + \delta\mathbf{B}\times\nabla\vartheta\cdot\hat{\mathbf{n}} \right). \tag{5.123} \]
In arriving at (5.123) we have used the fact that spatial derivatives commute with shape derivatives. The first term accounts for the unperturbed current density through the perturbed boundary, and the second accounts for the perturbed current density through the unperturbed boundary.
The contribution from the perturbation to the poloidal angle canbe shown to vanish. Upon application of (5.20) we obtain, noting that (cid:82) S P ( ψ ) d x A = V (cid:48) ( ψ ) (cid:104) A |∇ ψ |(cid:105) ψ for any quantity A , δG ( ψ ) = − V (cid:48) ( ψ )4 π (cid:42) ξ · ∇ ψ ∇ · ( B × ∇ ϑ ) − √ g ∂ x ∂ϕ · ∇ × ( ξ × B ) − δχ (cid:48) ( ψ ) √ g ∂ x ∂ϕ · ∂ x ∂ϑ (cid:43) ψ , (5.124)Applying the transport theorem (5.3), the shape derivative of f QS takes the form, δf QS ( S P ; ξ ) = 12 (cid:90) S P d x ξ · ˆ n M w ( ψ ) + 12 (cid:90) V P d x w (cid:48) ( ψ ) δψ M + (cid:90) V P d x w ( ψ ) M (cid:18) δ B · A + S · ∇ δB + B × ∇ δψ · ∇ B − δG ( ψ ) B · ∇ Bι ( ψ ) − ( N/M ) (cid:19) + (cid:90) V P d x w ( ψ ) M (cid:18) F ( ψ ) ι ( ψ ) − ( N/M ) δχ (cid:48) ( ψ ) B · ∇ B − δψF (cid:48) ( ψ ) B · ∇ B (cid:19) , (5.125)120here M = B ×∇ ψ ·∇ B − F ( ψ ) B ·∇ B , A = ∇ ψ ×∇ B − F ( ψ ) ∇ B , and S = B ×∇ ψ − F ( ψ ) B .After several steps outlined in Appendix N, the shape derivative can be written in thefollowing way, δf QS ( S P ; ξ ) = (cid:90) V P d x (cid:0) ξ · F QS + δχ (cid:48) ( ψ ) I QS (cid:1) + (cid:90) S P d x ξ · ˆ n B QS , (5.126)with, F QS = 12 ∇ ⊥ (cid:0) w ( ψ ) M (cid:1) + (cid:16) (ˆ b × ∇ ψ ) ∇ || B + F ( ψ ) ∇ ⊥ B (cid:17) w ( ψ ) B · ∇M + B × ( ∇ × ( ∇ ψ × ∇ B )) w ( ψ ) M − B ∇ ⊥ (cid:0) w ( ψ ) S · ∇M (cid:1) + κ Bw ( ψ ) S · ∇M− ∇ ψ ∇ B · ∇ × (cid:0) w ( ψ ) M B (cid:1) + 14 π (cid:32) − ∇ ⊥ (cid:18) w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ ( ι ( ψ ) − ( N/M )) (cid:19) ( B · ∇ ψ × ∇ ϑ )+ w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ ι ( ψ ) − ( N/M ) (cid:0) ∇ ψ ∇ · ( B × ∇ ϑ ) − B × ∇ × ( ∇ ψ × ∇ ϑ ) (cid:1) (cid:33) (5.127a) B QS = − w ( ψ ) M + Bw ( ψ ) S · ∇M − w ( ψ ) M∇ B × B · ∇ ψ + w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ π ( ι ( ψ ) − ( N/M )) ( B · ∇ ψ × ∇ ϑ ) (5.127b) I QS = − w ( ψ ) M∇ ψ × ∇ ϕ · A + w ( ψ ) ( S · ∇M ) ˆ b · ∇ ψ × ∇ ϕ + w ( ψ ) M B · ∇ Bι ( ψ ) − ( N/M ) F ( ψ ) − (cid:42) V 
(cid:48) ( ψ )4 π √ g ∂ x ∂ϕ · ∂ x ∂ϑ (cid:43) ψ . (5.127c)

In (5.127a), $\nabla_\| = \hat{\mathbf{b}}\cdot\nabla$ and $\nabla_\perp = \nabla - \hat{\mathbf{b}}\nabla_\|$ are the parallel and perpendicular gradients. We can now prescribe an adjoint perturbation which satisfies,
\[ \mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \boldsymbol{\mathcal{F}}_{QS} = 0, \tag{5.128a} \]
\[ \boldsymbol{\xi}_2\cdot\hat{\mathbf{n}}\,|_{S_P} = 0, \tag{5.128b} \]
\[ \delta I_{T,2}(\psi) = \frac{V'(\psi)}{2\pi}\left\langle \mathcal{I}_{QS} \right\rangle_\psi. \tag{5.128c} \]
We note that $\boldsymbol{\mathcal{F}}_{QS}$ satisfies the parallel force balance condition ($\hat{\mathbf{b}}\cdot\boldsymbol{\mathcal{F}}_{QS} = 0$) implied by (5.25). Upon application of the fixed-boundary adjoint relation we obtain the following shape gradient,
\[ \mathcal{G}_{QS} = \left( \mathcal{B}_{QS} + \frac{\delta\mathbf{B}_2\cdot\mathbf{B}}{\mu_0} \right)_{S_P}. \tag{5.129} \]

5.5.7 Neoclassical figures of merit

In Section 5.5.5, we considered a figure of merit that quantifies the geometric dependence of the neoclassical particle flux in the $1/\nu$ regime. In applying this model, several assumptions are imposed, such as a small radial electric field, $E_r$, low collisionality, and a simplified pitch-angle scattering collision operator. In this Section, we consider a more general neoclassical figure of merit arising from a moment of the local drift kinetic equation, allowing for optimization at finite collisionality and $E_r$. It is assumed here that the collision time is comparable to the bounce time but shorter than the time needed to complete a magnetic drift orbit. In Chapter 4, an adjoint method is demonstrated for obtaining derivatives of neoclassical figures of merit with respect to local geometric quantities on a flux surface. The adjoint method described in this Section will extend these results, such that shape derivatives with respect to the plasma boundary can be computed.

Consider the following figure of merit,
\[ f_{NC} = \int_{V_P} d^3x\, w(\psi)\,\mathcal{R}(\psi). \tag{5.130} \]
Here $\mathcal{R}(\psi)$ is a flux surface averaged moment of the neoclassical distribution function, $f$, which satisfies the local drift kinetic equation (DKE),
\[ \left( v_\|\hat{\mathbf{b}} + \mathbf{v}_E \right)\cdot\nabla f - C(f) = -\mathbf{v}_m\cdot\nabla\psi\, \frac{\partial f_M}{\partial\psi}, \tag{5.131} \]
where $\mathbf{v}_E = \mathbf{E}\times\mathbf{B}/B^2$ is the $\mathbf{E}\times\mathbf{B}$ drift velocity, $\mathbf{v}_m\cdot\nabla\psi$ is the radial magnetic drift velocity (4.3), $f_M$ is a Maxwellian (M.3), and $C$ is the linearized Fokker-Planck operator. For example, $\mathcal{R}$ can be taken to be the bootstrap current,
\[ J_b = \sum_s \frac{\left\langle B \int d^3v\, f_s v_\| \right\rangle_\psi}{n_s \left\langle B^2 \right\rangle_\psi^{1/2}}, \tag{5.132} \]
where the sum is taken over species. We note that the geometric dependence that enters the DKE when written in Boozer coordinates only arises through the quantities $\{B, G(\psi), I(\psi), \iota(\psi)\}$. Thus for simplicity, Boozer coordinates will be assumed throughout this Section.

The perturbation to $\mathcal{R}(\psi)$ at fixed toroidal current (5.74) can be written as,
\[ \delta\mathcal{R}(\psi) = \left\langle S_{\mathcal{R}}\,\delta B \right\rangle_\psi + \frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)}\,\delta G(\psi) + \frac{\partial\mathcal{R}(\psi)}{\partial\iota(\psi)}\,\delta\chi'(\psi). \tag{5.133} \]
Here $S_{\mathcal{R}}$ is a local sensitivity function which quantifies the change to $\mathcal{R}$ associated with a perturbation of the field strength $\delta B$, defined in the following way. Consider the perturbation to $\mathcal{R}$ resulting from a change in the field strength at fixed $G(\psi)$, $I(\psi)$, and $\iota(\psi)$. The functional derivative of $\mathcal{R}(\psi)$ with respect to $B(\mathbf{x})$ can be expressed as,
\[ \delta\mathcal{R}(\delta B; B(\mathbf{x})) = \left\langle S_{\mathcal{R}}\,\delta B(\mathbf{x}) \right\rangle_\psi. \tag{5.134} \]
This is another instance of the Riesz representation theorem: $\delta\mathcal{R}$ is a linear functional of $\delta B$, with the inner product taken to be the flux surface average. Thus $S_{\mathcal{R}}$ can be thought of as analogous to the shape gradient (5.4).

The quantities $\{S_{\mathcal{R}}, \partial\mathcal{R}(\psi)/\partial G(\psi), \partial\mathcal{R}(\psi)/\partial\iota(\psi)\}$ can be computed with the adjoint method described in Chapter 4 with the SFINCS code [140]. Here we consider SFINCS to be run on a set of surfaces such that (5.130) can be computed numerically.
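The role of the sensitivity function in (5.134) can be illustrated with a toy functional. The sketch below is not a neoclassical calculation and does not use SFINCS: it assumes a one-dimensional grid in a single angle, a weighted mean standing in for the flux surface average, and a hypothetical moment $\mathcal{R}[B] = \langle B^2\rangle^{1/2}$, for which the Riesz representer is $S_{\mathcal{R}} = B/\langle B^2\rangle^{1/2}$. All function names are illustrative.

```python
import math

def surface_average(values, weights):
    """Weighted mean, standing in for the flux surface average <...>_psi."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def toy_moment(B, weights):
    """Hypothetical moment R[B] = <B^2>^(1/2); a stand-in, not a DKE moment."""
    return math.sqrt(surface_average([b * b for b in B], weights))

def toy_sensitivity(B, weights):
    """Representer S_R such that dR = <S_R dB> for the toy moment above."""
    norm = toy_moment(B, weights)
    return [b / norm for b in B]

def directional_derivative(B, dB, weights, eps=1e-6):
    """Central finite difference of R along the direction dB."""
    plus = [b + eps * d for b, d in zip(B, dB)]
    minus = [b - eps * d for b, d in zip(B, dB)]
    return (toy_moment(plus, weights) - toy_moment(minus, weights)) / (2 * eps)

n = 64
theta = [2 * math.pi * k / n for k in range(n)]
weights = [1.0 + 0.2 * math.cos(t) for t in theta]   # stand-in for the Jacobian
B = [1.0 + 0.1 * math.cos(t) for t in theta]         # toy field strength
dB = [0.05 * math.sin(3 * t) + 0.02 * math.cos(t) for t in theta]

lhs = directional_derivative(B, dB, weights)          # dR by finite differences
S_R = toy_sensitivity(B, weights)
rhs = surface_average([s * d for s, d in zip(S_R, dB)], weights)  # <S_R dB>
print(lhs, rhs)
```

The two printed numbers should agree to within finite-difference error for any choice of `dB`, which is the content of (5.134): a single function $S_{\mathcal{R}}$ encodes the response of $\mathcal{R}$ to every field strength perturbation.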
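The derivatives computed by SFINCS will appear in the additional bulk force required for the adjoint perturbed equilibrium. We consider perturbations of an equilibrium at fixed toroidal current (5.74). The shape derivative of $f_{NC}$ can be computed on application of the transport theorem (5.3),
\[ \delta f_{NC}(S_P;\boldsymbol{\xi}) = \int_{S_P} d^2x\, \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\, w(\psi)\mathcal{R}(\psi) + \int_{V_P} d^3x\, \delta\psi\, \frac{\partial}{\partial\psi}\left( w(\psi)\mathcal{R}(\psi) \right) + \int_{V_P} d^3x\, w(\psi)\left( \frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)}\delta G(\psi) + \frac{\partial\mathcal{R}(\psi)}{\partial\iota(\psi)}\delta\chi'(\psi) + \left\langle S_{\mathcal{R}}\,\delta B \right\rangle_\psi \right). \tag{5.135} \]
After several steps outlined in Appendix O, the shape derivative is written in the following form,
\[ \delta f_{NC}(S_P;\boldsymbol{\xi}) = \int_{V_P} d^3x \left( \boldsymbol{\xi}\cdot\boldsymbol{\mathcal{F}}_{NC} + \delta\chi'(\psi)\,\mathcal{I}_{NC} \right) + \int_{S_P} d^2x\, \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\, \mathcal{B}_{NC}, \tag{5.136} \]
with,
\[ \boldsymbol{\mathcal{F}}_{NC} = -\nabla\left( \mathcal{R}(\psi) w(\psi) \right) - \nabla\psi\, (\nabla\times\mathbf{B})\cdot\nabla\vartheta\, \frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)} \frac{w(\psi) B^2 \sqrt{g}}{\langle B^2\rangle_\psi} + \frac{w(\psi)}{\langle B^2\rangle_\psi}\frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)}\, \mathbf{B}\times\nabla\times\left( \frac{\partial\mathbf{x}}{\partial\phi} B^2 \right) + G(\psi) B^2\, \nabla\left( \frac{w(\psi)}{\langle B^2\rangle_\psi}\frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)} \right) - \boldsymbol{\kappa}\, w(\psi) S_{\mathcal{R}} B + B\,\nabla_\perp\left( w(\psi) S_{\mathcal{R}} \right), \tag{5.137a} \]
\[ \mathcal{B}_{NC} = w(\psi)\mathcal{R}(\psi) - \frac{w(\psi) B^2}{\langle B^2\rangle_\psi}\frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)} G(\psi) - w(\psi) S_{\mathcal{R}} B, \tag{5.137b} \]
\[ \mathcal{I}_{NC} = \frac{\partial\mathcal{R}(\psi)}{\partial G(\psi)} \frac{w(\psi) B^2}{\langle B^2\rangle_\psi \sqrt{g}}\, \frac{\partial\mathbf{x}}{\partial\phi}\cdot\frac{\partial\mathbf{x}}{\partial\vartheta} + w(\psi)\frac{\partial\mathcal{R}(\psi)}{\partial\iota(\psi)} - w(\psi) S_{\mathcal{R}}\, \hat{\mathbf{b}}\cdot\nabla\psi\times\nabla\phi. \tag{5.137c} \]
We consider the following adjoint perturbation,
\[ \mathbf{F}[\boldsymbol{\xi}_2, \delta\chi_2(\psi)] + \boldsymbol{\mathcal{F}}_{NC} = 0, \tag{5.138a} \]
\[ \boldsymbol{\xi}_2\cdot\hat{\mathbf{n}}\,|_{S_P} = 0, \tag{5.138b} \]
\[ \delta I_{T,2}(\psi) = \frac{V'(\psi)}{2\pi}\left\langle \mathcal{I}_{NC} \right\rangle_\psi. \tag{5.138c} \]
The adjoint bulk force $\boldsymbol{\mathcal{F}}_{NC}$ is chosen to satisfy parallel force balance required by (5.25). Upon application of the fixed-boundary adjoint relation we obtain the shape gradient,
\[ \mathcal{G}_{NC} = \left( \mathcal{B}_{NC} + \frac{\delta\mathbf{B}_2\cdot\mathbf{B}}{\mu_0} \right)_{S_P}. \tag{5.139} \]

5.6 Conclusions

We have obtained a relationship between 3D perturbations of MHD equilibria that is a consequence of the self-adjoint property of the MHD force operator.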
The relation allows for the efficient computation of shape gradients for either the outer plasma surface, using the fixed-boundary adjoint relation (5.36), or for coil shapes, using the free-boundary adjoint relation (5.33). The computation of the shape gradient of several stellarator figures of merit has been demonstrated with both the adjoint and direct approaches. The application of the adjoint relation provides an $O(N_\Omega)$ reduction in the CPU hours required in comparison with the direct method of computing the shape gradient, where $N_\Omega$ is the number of parameters used to describe the shape of the outer boundary or the coils. For fully 3D geometry, $N_\Omega$ can be 10 − . Thus, the application of adjoint methods can significantly reduce the cost of computing the shape gradient for gradient-based optimization or local sensitivity analysis.

We have demonstrated that the self-adjointness relations (Section 5.3) can be implemented to efficiently compute the shape gradient of figures of merit relevant for stellarator configuration optimization. The shape gradient is obtained by solving an adjoint perturbed force balance equation that depends on the figure of merit of interest. For the volume-averaged $\beta$ and vacuum well parameter (Sections 5.5.1 and 5.5.3), the additional bulk force required for the adjoint problem is simply the gradient of a function of flux, and so it can be implemented by adding a perturbation to the pressure profile. For the magnetic ripple on axis (Section 5.5.4), the required bulk force takes the form of the divergence of a pressure tensor that only varies on a surface through the field strength. As the ANIMEC code currently treats this type of pressure tensor, this adjoint bulk force is implemented with a minor modification to the code. Computing the shape gradient of $\epsilon_{\text{eff}}^{3/2}$ with the adjoint approach also requires the addition of the divergence of a pressure tensor. However, this pressure tensor varies on a surface through the field line label due to the bounce integrals that appear (5.115). Thus the variational principle used by the ANIMEC code cannot be easily extended for this application. Similarly, the shape gradients for the quasi-symmetry (Section 5.5.6) and neoclassical (Section 5.5.7) figures of merit require an adjoint bulk force that is not in the form of the divergence of a pressure tensor. This provides an impetus for the development of a flexible perturbed MHD equilibrium code that could enable these calculations. While several 3D ideal MHD stability codes exist [7, 204, 219], only the CAS3D code has been modified in order to perform perturbed equilibrium calculations [28, 173]. A discussion of such linear equilibrium calculations for adjoint-based shape gradient evaluations is presented in Chapter 6.

It should be noted that the adjoint approach we have outlined cannot yield an exact analytic shape gradient, as error is introduced through the approximation of the adjoint solution. Throughout, we have assumed the existence of magnetic surfaces as the 3D equilibrium is perturbed. Therefore a code such as VMEC or ANIMEC, which minimizes an energy subject to the constraint that surfaces exist, is suitable. Generally VMEC solutions do not satisfy (5.10) exactly [174], as they do not account for the formation of islands or current singularities associated with rational surfaces. Furthermore, the parameters $\Delta_P$ and $\Delta_I$ introduce additional numerical noise. As demonstrated in Section 5.5.1, these parameters must be small enough that nonlinear effects do not become important yet large enough that round-off error does not dominate. We have demonstrated that the typical difference between the shape gradient obtained with the adjoint method and that computed directly from numerical derivatives is $\lesssim$

Chapter 6
Linearized equilibrium solutions
As discussed in Chapter 5, the application of the adjoint approach for computing the shape gradient of functions of MHD equilibria requires solutions of linearized MHD equilibrium equations. In the examples presented thus far, these linearized solutions were approximated by adding a small perturbation to a nonlinear MHD equilibrium, such as a perturbation to the prescribed toroidal current or pressure profiles. This approximation introduces error associated with the choice of the amplitude of the perturbation and limits the types of objective functions that can be treated. In this Chapter, we discuss an approach to compute the necessary linearized equilibrium solutions based on a variational method.
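The amplitude dependence of this approximation can be illustrated on a scalar toy problem. The sketch below is only an analogy: the "equilibrium" is the root of a hypothetical cubic, $u + u^3/2 = p$, rather than a VMEC solution, and the names `solve_state` and `finite_amplitude_response` are ours. The linear response $du/dp$ is approximated by re-solving the nonlinear problem with a perturbation of amplitude `delta`, as is done with $\Delta_P$ and $\Delta_I$ in Chapter 5.

```python
def solve_state(p, iterations=50):
    """Newton iteration for the toy nonlinear 'equilibrium' u + 0.5*u**3 = p."""
    u = p
    for _ in range(iterations):
        u -= (u + 0.5 * u**3 - p) / (1.0 + 1.5 * u**2)
    return u

def exact_linear_response(p):
    """Exact du/dp obtained by linearizing the toy equation about its solution."""
    u = solve_state(p)
    return 1.0 / (1.0 + 1.5 * u**2)

def finite_amplitude_response(p, delta):
    """Approximate du/dp by adding a perturbation of amplitude delta and re-solving."""
    return (solve_state(p + delta) - solve_state(p)) / delta

p0 = 1.0
exact = exact_linear_response(p0)
errors = {
    delta: abs(finite_amplitude_response(p0, delta) - exact)
    for delta in (1e-1, 1e-6, 1e-15)
}
for delta, err in errors.items():
    print(f"delta = {delta:.0e}: error in linear response = {err:.2e}")
```

In this toy problem the intermediate amplitude gives the smallest error: the largest amplitude is contaminated by the nonlinear response, and the smallest by round-off. Solving the linearized equation directly, as proposed in this Chapter, removes the amplitude parameter altogether.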
There are several existing techniques for computing linearized ideal MHD equilibria. As will be shown directly in the following Section, a linearized equilibrium state is a stationary point of an energy functional. This energy functional is related to the potential energy that appears in ideal MHD stability analysis, $W_P[\boldsymbol{\xi}] = -\frac{1}{2}\int_{V_P} d^3x\, \boldsymbol{\xi}\cdot\mathbf{F}[\boldsymbol{\xi}]$, where $\boldsymbol{\xi}$ is the displacement vector and $\mathbf{F}[\boldsymbol{\xi}]$ is the MHD force operator (6.3). For this reason, ideal MHD stability codes can be augmented for perturbed equilibrium calculations. One approach is based on the Direct Criterion of Newcomb (DCON) code [80], which minimizes the potential energy by solving an Euler-Lagrange equation for the displacement vector. This method has been extended with the Ideal Perturbed Equilibrium Code (IPEC) [182, 183], which couples applied plasma boundary perturbations to perturbations of currents in the vacuum region. This code models axisymmetry-breaking perturbations on tokamak equilibria for the study of mode-locking [61] and neoclassical toroidal viscosity (NTV) [152]. Modification of DCON is currently underway to enable stability calculations for stellarators with stepped-pressure equilibria [81].

The Code for the Analysis of the MHD Stability of 3D Equilibria (CAS3D) has similarly been modified for perturbed MHD equilibrium calculations. To evaluate ideal MHD stability, CAS3D solves an eigenvalue problem to obtain a minimum of $W_P[\boldsymbol{\xi}]/W_K[\boldsymbol{\xi}]$, where $W_K[\boldsymbol{\xi}] = \frac{1}{2}\int_{V_P} d^3x\, \rho|\boldsymbol{\xi}|^2$ is the kinetic energy associated with the displacement vector $\boldsymbol{\xi}$ and $\rho$ is the density. As perturbed equilibria are stationary points of an energy functional similar to $W_P[\boldsymbol{\xi}]$, not $W_P[\boldsymbol{\xi}]/W_K[\boldsymbol{\xi}]$, such stability codes based on eigenvalue calculations need to be modified in order to compute perturbed equilibrium states. The CAS3D code allows the option to normalize $W_P[\boldsymbol{\xi}]$ by a modified energy functional such that perturbed equilibrium states can be computed [28, 173].
This technique has been used to study the effect of boundary perturbations on magnetic island width [174].

While several 3D MHD stability codes exist [7, 204, 219], they cannot be directly used to compute perturbed equilibrium states relevant for stellarator optimization problems. For stability studies, it is often sufficient to consider only symmetry-breaking modes (modes that break period symmetry or stellarator symmetry), while optimization is typically performed assuming preservation of symmetry. Furthermore, none of the existing codes enable the addition of a general bulk force perturbation as is required for our adjoint approach.

There are additional limitations that motivate us to consider the development of an independent linearized equilibrium code. The DCON and CAS3D approaches minimize their respective energy functionals assuming that the displacement vector is divergenceless. This assumption implies that $\langle \boldsymbol{\xi}\cdot\nabla\psi \rangle_\psi$ vanishes [153, 204], where $\langle \ldots \rangle_\psi$ is the flux-surface average (A.10). This places a significant restriction on $\xi^\psi \equiv \boldsymbol{\xi}\cdot\nabla\psi$ that cannot generally be satisfied in addition to the Euler-Lagrange equation. Therefore, modes that are constrained by $\langle \xi^\psi \rangle_\psi = 0$ cannot be included in the Euler-Lagrange equation. In axisymmetry, this disallows the toroidal mode number $n = 0$. In stellarator geometry with discrete $N_P$-symmetry, this disallows modes where $n$ is an integer multiple of $N_P$ (sometimes called the $N = 0$ mode family [204]). This assumption is valid for stability problems, as such modes corresponding to fixed-boundary perturbations are always stable [204]. However, for stellarator optimization and tolerance calculations, these modes cannot be ignored. Rather than assume that $\nabla\cdot\boldsymbol{\xi} = 0$, for adjoint calculations it is much more convenient to assume that $\boldsymbol{\xi}\cdot\mathbf{B} = 0$, which enables the inclusion of these modes.
Finally, the postprocessing of results differs significantly between stability and perturbed equilibria applications. The development of such a 3D perturbed equilibrium code could substantially reduce the computational complexity of gradient-based optimization by enabling the application of the adjoint approach to many critical objective functions. Such a tool would also allow for the analysis of the response of an equilibrium to boundary perturbations without resorting to a full nonlinear calculation. This capability would improve fixed-boundary optimization when an adjoint method is not available for sensitivity and tolerance studies.

In Section 6.2, we present the proposed method to compute linearized equilibrium states with the addition of an arbitrary bulk force. This method is based on a variational principle similar to that used in the DCON code. In Section 6.3, we analyze the behavior of classes of modes of the displacement vector in the simplified geometry of a screw pinch. In this way, we highlight key numerical challenges and proposed solution methods. Finally, in Section 6.4, we demonstrate this method for the computation of the shape gradient of a figure of

[Footnote] This assumption (that the displacement vector is divergenceless) is made in the original version of CAS3D [204]. There exists the option to retain the terms in the energy functional involving $\nabla\cdot\boldsymbol{\xi}$ in a more recent version [172].

[Footnote] This arises from noting $\langle \nabla\cdot\boldsymbol{\xi} \rangle_\psi = V'(\psi)^{-1}\, d/d\psi \left( V'(\psi) \langle \boldsymbol{\xi}\cdot\nabla\psi \rangle_\psi \right)$, thus $V'(\psi)\langle \boldsymbol{\xi}\cdot\nabla\psi \rangle_\psi$ must be a constant. As $\boldsymbol{\xi}\cdot\nabla\psi$ must vanish at the origin due to regularity while $V'(\psi)$ is finite at the origin, the quantity $V'(\psi)\langle \boldsymbol{\xi}\cdot\nabla\psi \rangle_\psi = 0$.

We consider a base equilibrium magnetic field satisfying MHD force balance,
\[ (\nabla\times\mathbf{B})\times\mathbf{B} = \mu_0\nabla p, \tag{6.1} \]
with prescribed pressure $p(\psi)$ and rotational transform $\iota(\psi)$.
We would like to compute linearizations about this state satisfying,
\[ \mathbf{F}[\boldsymbol{\xi}] + \delta\mathbf{F} = 0, \tag{6.2} \]
where the MHD force operator is
\[ \mathbf{F}[\boldsymbol{\xi}] = \frac{\left( \nabla\times\delta\mathbf{B}[\boldsymbol{\xi}] \right)\times\mathbf{B}}{\mu_0} + \frac{(\nabla\times\mathbf{B})\times\delta\mathbf{B}[\boldsymbol{\xi}]}{\mu_0} - \nabla\left( \delta p[\boldsymbol{\xi}] \right), \tag{6.3} \]
and $\delta\mathbf{F}$ is a bulk force perturbation. The perturbed magnetic field can be expressed in terms of the displacement vector $\boldsymbol{\xi}$,
\[ \delta\mathbf{B}[\boldsymbol{\xi}] = \nabla\times(\boldsymbol{\xi}\times\mathbf{B}), \tag{6.4} \]
under the assumption that the rotational transform $\iota(\psi)$ is preserved by the perturbation. In this Chapter, we will not consider the effect of perturbations to the rotational transform, although such effects are necessary to compute the shape gradient of certain figures of merit. Assuming the pressure profile is held fixed under the perturbation, we can also express the perturbation to the local pressure in terms of the displacement vector,
\[ \delta p[\boldsymbol{\xi}] = -\boldsymbol{\xi}\cdot\nabla p. \tag{6.5} \]
The linearized force balance equation is solved subject to a boundary condition,
\[ \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\,\big|_{S_P} = \delta\mathbf{x}\cdot\hat{\mathbf{n}}, \tag{6.6} \]
for a prescribed boundary perturbation $\delta\mathbf{x}\cdot\hat{\mathbf{n}}$. We can express this PDE (6.2) with boundary condition (6.6) in an equivalent variational form involving the energy functional,
\[ W[\boldsymbol{\xi}] = \int_{V_P} d^3x\, \boldsymbol{\xi}\cdot\left( \mathbf{F}[\boldsymbol{\xi}] + 2\,\delta\mathbf{F} \right) + \frac{1}{\mu_0}\int_{S_P} d^2x\, \hat{\mathbf{n}}\cdot\boldsymbol{\xi}\, \delta\mathbf{B}[\boldsymbol{\xi}]\cdot\mathbf{B}. \tag{6.7} \]
Stationary points of $W[\boldsymbol{\xi}]$ subject to the boundary condition (6.6) are equivalent to solutions of (6.2). While (6.2) is a coupled set of PDEs involving two components of the displacement vector, the application of the variational principle will allow us to arrive at an Euler-Lagrange equation that is a coupled set of ODEs for one component of the displacement vector.

We now demonstrate that stationary points of (6.7) with respect to $\boldsymbol{\xi}$ subject to the boundary condition (6.6) indeed correspond with solutions of (6.2).
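Before the general argument, a finite-dimensional analogue may help fix ideas. In the sketch below a symmetric matrix `A` stands in for the self-adjoint force operator and a vector `b` for the bulk force, so that $W(x) = x\cdot(Ax + 2b)$ mirrors the volume term of (6.7); the boundary term has no counterpart here, and the particular numbers are arbitrary. Stationarity of $W$ should then be equivalent to $Ax + b = 0$, the analogue of (6.2).

```python
def matvec(A, x):
    """Product of a small dense matrix (list of rows) with a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def energy(x, A, b):
    """Finite-dimensional analogue of (6.7): W(x) = x . (A x + 2 b)."""
    Ax = matvec(A, x)
    return sum(xi * (axi + 2.0 * bi) for xi, axi, bi in zip(x, Ax, b))

def solve_2x2(A, rhs):
    """Cramer's rule for a 2x2 linear system A x = rhs."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det,
        (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det,
    ]

def numerical_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function of a vector."""
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

A = [[-2.0, 0.5], [0.5, -1.0]]   # symmetric, like the self-adjoint force operator
b = [1.0, -0.3]                  # stand-in for the bulk force perturbation

x_star = solve_2x2(A, [-b[0], -b[1]])                   # solves A x + b = 0
grad_at_solution = numerical_gradient(lambda x: energy(x, A, b), x_star)
grad_at_origin = numerical_gradient(lambda x: energy(x, A, b), [0.0, 0.0])
print(x_star, grad_at_solution, grad_at_origin)
```

The gradient of $W$ vanishes at the solution of $Ax + b = 0$ and equals $2(Ax + b)$ elsewhere (here $2b$ at the origin), which is the discrete counterpart of the factor of 2 multiplying the integrand of (6.10).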
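We perform the first variation with respect to $\boldsymbol{\xi}$,
\[ \delta W[\boldsymbol{\xi};\delta\boldsymbol{\xi}] = \int_{V_P} d^3x \left( \delta\boldsymbol{\xi}\cdot\left( \mathbf{F}[\boldsymbol{\xi}] + 2\,\delta\mathbf{F} \right) + \boldsymbol{\xi}\cdot\mathbf{F}[\delta\boldsymbol{\xi}] \right) + \frac{1}{\mu_0}\int_{S_P} d^2x\, \hat{\mathbf{n}}\cdot\left( \delta\boldsymbol{\xi}\, \delta\mathbf{B}[\boldsymbol{\xi}] + \boldsymbol{\xi}\, \delta\mathbf{B}[\delta\boldsymbol{\xi}] \right)\cdot\mathbf{B}. \tag{6.8} \]
We now apply the self-adjointness of the MHD force operator (5.7), repeated here for convenience,
\[ \int_{V_P} d^3x \left( \boldsymbol{\xi}_1\cdot\mathbf{F}[\boldsymbol{\xi}_2] - \boldsymbol{\xi}_2\cdot\mathbf{F}[\boldsymbol{\xi}_1] \right) - \frac{1}{\mu_0}\int_{S_P} d^2x\, \hat{\mathbf{n}}\cdot\left( \boldsymbol{\xi}_2\, \delta\mathbf{B}[\boldsymbol{\xi}_1]\cdot\mathbf{B} - \boldsymbol{\xi}_1\, \delta\mathbf{B}[\boldsymbol{\xi}_2]\cdot\mathbf{B} \right) = 0, \tag{6.9} \]
to obtain,
\[ \delta W[\boldsymbol{\xi};\delta\boldsymbol{\xi}] = 2\int_{V_P} d^3x\, \delta\boldsymbol{\xi}\cdot\left( \mathbf{F}[\boldsymbol{\xi}] + \delta\mathbf{F} \right), \tag{6.10} \]
where the boundary term vanishes because variations that respect the prescribed boundary condition (6.6) satisfy $\delta\boldsymbol{\xi}\cdot\hat{\mathbf{n}} = 0$ on $S_P$. As $\delta W[\boldsymbol{\xi};\delta\boldsymbol{\xi}]$ must vanish for any $\delta\boldsymbol{\xi}$, we obtain (6.2) as our Euler-Lagrange equation. Thus stationary points of $W[\boldsymbol{\xi}]$ correspond with solutions of (6.2).

We can now obtain a simplified Euler-Lagrange equation from manipulations of our energy functional (6.7). A vector identity is applied in order to obtain,
\[ W[\boldsymbol{\xi}] = \int_{V_P} d^3x \left[ -\frac{\delta\mathbf{B}[\boldsymbol{\xi}]\cdot\delta\mathbf{B}[\boldsymbol{\xi}]}{\mu_0} + \boldsymbol{\xi}\cdot\mathbf{J}\times\delta\mathbf{B}[\boldsymbol{\xi}] + \boldsymbol{\xi}\cdot\nabla(\boldsymbol{\xi}\cdot\nabla p) + 2\,\boldsymbol{\xi}\cdot\delta\mathbf{F} \right]. \tag{6.11} \]
The energy functional now does not depend on second derivatives of the displacement vector. This form of the energy functional is further simplified in Appendix P. We apply another vector identity to obtain,
\[ W[\boldsymbol{\xi}] = \int_{V_P} d^3x \left[ -\frac{\delta\mathbf{B}[\boldsymbol{\xi}]\cdot\delta\mathbf{B}[\boldsymbol{\xi}]}{\mu_0} + \boldsymbol{\xi}\cdot\mathbf{J}\times\delta\mathbf{B}[\boldsymbol{\xi}] - (\boldsymbol{\xi}\cdot\nabla p)\,\nabla\cdot\boldsymbol{\xi} + 2\,\boldsymbol{\xi}\cdot\delta\mathbf{F} \right] + \int_{S_P} d^2x\, \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\, \boldsymbol{\xi}\cdot\nabla p. \tag{6.12} \]
We can drop this boundary term: since $\nabla p$ is normal to $S_P$, it depends only on $\boldsymbol{\xi}\cdot\hat{\mathbf{n}}$, which is fixed by the boundary condition (6.6), so its variation vanishes. We note that this energy functional is the same (to within overall constants) as (12) in [80] if $\gamma = 0$, though we have allowed for the inclusion of an additional bulk force.

Minimization of $W[\boldsymbol{\xi}]$ is performed upon expressing the magnetic field in a magnetic coordinate system (Appendix A.3),
\[ \mathbf{B} = \nabla\psi\times\nabla\vartheta - \iota(\psi)\nabla\psi\times\nabla\phi. \tag{6.13} \]
From the assumption that $\boldsymbol{\xi}\cdot\mathbf{B} = 0$, in such a coordinate system, the energy functional only depends on the radial,
\[ \xi^\psi = \boldsymbol{\xi}\cdot\nabla\psi, \tag{6.14} \]
and in-surface,
\[ \xi^\alpha = \boldsymbol{\xi}\cdot\left( \nabla\vartheta - \iota(\psi)\nabla\phi \right), \tag{6.15} \]
components of the displacement vector. Furthermore, we note that no radial derivatives of $\xi^\alpha$ appear in the energy functional, as we can express the perturbed magnetic field as,
\[ \delta\mathbf{B} = \nabla\xi^\alpha\times\nabla\psi + \nabla\times\left( \xi^\psi\left( \iota(\psi)\nabla\phi - \nabla\vartheta \right) \right). \tag{6.16} \]
Upon further manipulations of the energy functional (Appendix P), we also note that $\xi^\alpha$ only appears under derivatives with respect to $\vartheta$ and $\phi$ in the first three terms of the energy functional (6.11). Given certain constraints on the bulk force perturbation that can always be satisfied (Appendix Q), we are free to choose $\int_0^{2\pi} d\vartheta \int_0^{2\pi} d\phi\, \xi^\alpha = 0$ on all surfaces. This reflects the fact that constant shifts of $\xi^\alpha$ on a surface do not change the perturbed magnetic field.

We express the radial component of the displacement vector in a Fourier series,
\[ \xi^\psi(\psi,\vartheta,\phi) = \sum_{m,n}\left( \xi^{\psi c}_{m,n}(\psi)\cos(m\vartheta - n\phi) + \xi^{\psi s}_{m,n}(\psi)\sin(m\vartheta - n\phi) \right) = \boldsymbol{\Xi}_\psi\cdot\boldsymbol{\mathcal{F}}_\psi. \tag{6.17} \]
Here $\boldsymbol{\Xi}_\psi$ is interpreted as a vector of Fourier amplitudes and $\boldsymbol{\mathcal{F}}_\psi$ is a vector of the Fourier basis functions. We similarly expand $\xi^\alpha$ in a Fourier series,
\[ \xi^\alpha = \sum_{m,n;\ \max(|m|,|n|)\ne 0}\left( \xi^{\alpha c}_{m,n}(\psi)\sin(m\vartheta - n\phi) + \xi^{\alpha s}_{m,n}(\psi)\cos(m\vartheta - n\phi) \right) = \boldsymbol{\Xi}_\alpha\cdot\boldsymbol{\mathcal{F}}_\alpha. \tag{6.18} \]
As we are free to shift $\xi^\alpha$ by a constant on each surface, we can take the $m = 0$, $n = 0$ mode of $\xi^\alpha$ to vanish. If the equilibrium geometric quantities have a definite parity with respect to $\vartheta$ and $\phi$ and the prescribed boundary perturbation and bulk force perturbation maintain this parity, then $\xi^\psi$ will have the same parity as the equilibrium and $\xi^\alpha$ will have the opposite parity.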
For example, if the equilibrium is stellarator symmetric [53] (thecylindrical coordinates satisfy R ( ψ, − ϑ, − ϕ ) = R ( ψ, ϑ, ϕ ) and Z ( ψ, − ϑ, − ϕ ) = − Z ( ψ, ϑ, ϕ ))and this parity is maintained by the perturbation, only the cosine series is needed for ξ ψ andthe sine series is needed for ξ α . We will assume stellarator symmetry for the remainder ofthis Chapter for simplicity of the presentation.We similarly express the bulk force perturbation in a magnetic coordinate system, δ F = δF ψ ∇ ψ + δF α (cid:0) ∇ ϑ − ι ( ψ ) ∇ ϕ (cid:1) . (6.19)This results from the parallel force balance condition (6.2), which implies that δ F · ˆ b = 0.130he energy functional can be expressed schematically as, W [ Ξ ψ , Ξ α ] = (cid:90) V P dψ (cid:20) Ξ (cid:48) ψ ( ψ ) · (cid:16) A ψ (cid:48) ψ (cid:48) Ξ (cid:48) ψ ( ψ ) (cid:17) + Ξ ψ · (cid:16) A ψψ Ξ ψ + A ψψ (cid:48) Ξ (cid:48) ψ ( ψ ) + I ψ (cid:17) + Ξ α · (cid:16) A αα Ξ α + A αψ (cid:48) Ξ (cid:48) ψ ( ψ ) + A αψ Ξ ψ + I α (cid:17) (cid:21) , (6.20)upon integration over ϑ and ϕ . Explicit forms for the coefficient matrices are provided inAppendix P.We now perform variations with respect to the in-surface component, δW [ Ξ ψ , Ξ α ; δ Ξ α ] = (cid:90) V P dψ δ Ξ α · (cid:20) A αα Ξ α + A αψ (cid:48) Ξ (cid:48) ψ ( ψ ) + A αψ Ξ ψ + I α (cid:21) , (6.21)where we have noted that A αα can be made symmetric due to the self-adjointness of the MHDforce operator. (The explicit form given in Appendix P is evidently symmetric.) Thus thein-surface component can be expressed in terms of the radial component of the displacementvector using the corresponding Euler-Lagrange equation,2 A αα Ξ α + A αψ (cid:48) Ξ (cid:48) ψ ( ψ ) + A αψ Ξ ψ + I α = 0 . 
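(6.22)

As shown in Appendix P, $A_{\alpha\alpha}$ is invertible, so we can eliminate $\boldsymbol{\Xi}_{\alpha}$ using (6.22) and find the reduced energy functional to be

$$W[\boldsymbol{\Xi}_{\psi}] = \int_{V_P} d\psi \left[ \boldsymbol{\Xi}_{\psi} \cdot \left( C_{\psi\psi} \boldsymbol{\Xi}_{\psi} + C_{\psi\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi} \right) + \boldsymbol{\Xi}_{\psi}'(\psi) \cdot \left( C_{\psi'\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi'} \right) - \tfrac{1}{4} \mathbf{I}_{\alpha} \cdot A_{\alpha\alpha}^{-1} \mathbf{I}_{\alpha} \right],$$ (6.23)

with

$$C_{\psi\psi} = A_{\psi\psi} - \tfrac{1}{4} A_{\alpha\psi}^{T} A_{\alpha\alpha}^{-1} A_{\alpha\psi}$$ (6.24a)
$$C_{\psi\psi'} = A_{\psi\psi'} - \tfrac{1}{2} A_{\alpha\psi}^{T} A_{\alpha\alpha}^{-1} A_{\alpha\psi'}$$ (6.24b)
$$C_{\psi'\psi'} = A_{\psi'\psi'} - \tfrac{1}{4} A_{\alpha\psi'}^{T} A_{\alpha\alpha}^{-1} A_{\alpha\psi'}$$ (6.24c)
$$\mathbf{K}_{\psi} = \mathbf{I}_{\psi} - \tfrac{1}{2} A_{\alpha\psi}^{T} A_{\alpha\alpha}^{-1} \mathbf{I}_{\alpha}$$ (6.24d)
$$\mathbf{K}_{\psi'} = - \tfrac{1}{2} A_{\alpha\psi'}^{T} A_{\alpha\alpha}^{-1} \mathbf{I}_{\alpha}.$$ (6.24e)

The numerical factors follow from substituting the solution of (6.22) into the last group of terms in (6.20). We now perform variations with respect to $\boldsymbol{\Xi}_{\psi}$,

$$\delta W[\boldsymbol{\Xi}_{\psi}; \delta \boldsymbol{\Xi}_{\psi}] = \int_{V_P} d\psi \, \delta \boldsymbol{\Xi}_{\psi} \cdot \left[ 2 C_{\psi\psi} \boldsymbol{\Xi}_{\psi} + C_{\psi\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi} - \frac{d}{d\psi} \left( C_{\psi\psi'}^{T} \boldsymbol{\Xi}_{\psi} + 2 C_{\psi'\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi'} \right) \right],$$ (6.25)

to obtain the following Euler-Lagrange equation,

$$2 C_{\psi\psi} \boldsymbol{\Xi}_{\psi} + C_{\psi\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi} - \frac{d}{d\psi} \left( C_{\psi\psi'}^{T} \boldsymbol{\Xi}_{\psi} + 2 C_{\psi'\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) + \mathbf{K}_{\psi'} \right) = 0.$$ (6.26)

We define our vector of unknowns as

$$\overrightarrow{u} = \begin{pmatrix} \boldsymbol{\Xi}_{\psi} \\ C_{\psi\psi'}^{T} \boldsymbol{\Xi}_{\psi} + 2 C_{\psi'\psi'} \boldsymbol{\Xi}_{\psi}'(\psi) \end{pmatrix},$$ (6.27)

so that our Euler-Lagrange equation takes the form $\overleftrightarrow{L}_1 \overrightarrow{u} + \overleftrightarrow{L}_2 \overrightarrow{u}'(\psi) + \overrightarrow{b} = 0$, with

$$\overleftrightarrow{L}_1 = \begin{pmatrix} C_{\psi\psi'}^{T} & -I \\ 2 C_{\psi\psi} & 0 \end{pmatrix}$$ (6.28a)
$$\overleftrightarrow{L}_2 = \begin{pmatrix} 2 C_{\psi'\psi'} & 0 \\ C_{\psi\psi'} & -I \end{pmatrix}$$ (6.28b)
$$\overrightarrow{b} = \begin{pmatrix} 0 \\ \mathbf{K}_{\psi} - \mathbf{K}_{\psi'}'(\psi) \end{pmatrix}.$$ (6.28c)

The first row restates the definition of the second component of $\overrightarrow{u}$, and the second row is (6.26). Currently this is an implicit system of differential equations. When $\overleftrightarrow{L}_2$ is invertible, which requires $C_{\psi'\psi'}$ to be invertible, this system can be transformed into an explicit system of ODEs. If $\det\left( C_{\psi'\psi'} \right) = 0$ at a point $\psi = \psi_s$ and $C_{\psi'\psi'}^{-1} \sim 1/(\psi - \psi_s)$ to leading order near $\psi_s$, then $\psi_s$ is a regular singular point. At such points, additional care must be taken in obtaining numerical solutions to the Euler-Lagrange equation.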
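The elimination of the in-surface amplitudes that leads from (6.20) to the reduced functional can be sanity-checked on a single-mode (scalar) analogue. The sketch below is only an illustration: all coefficient values are hypothetical and are not taken from Appendix P, and the function names are invented for this example. It assumes the stationarity condition in the form of (6.22), with the factor of 2 multiplying $A_{\alpha\alpha}$; completing the square under that assumption gives factors of 1/4 and 1/2 in the reduced coefficients.

```python
# Scalar (single-mode) analogue of eliminating Xi_alpha from the energy functional.
# Hypothetical coefficients only; keys mirror the subscripts in (6.20):
#   "pp" -> A_{psi' psi'}, "ss" -> A_{psi psi}, "sp" -> A_{psi psi'}, "Is" -> I_psi,
#   "aa" -> A_{alpha alpha}, "ap" -> A_{alpha psi'}, "as" -> A_{alpha psi}, "Ia" -> I_alpha.

def full_integrand(x, xp, y, A):
    """Integrand of (6.20) with x = Xi_psi, xp = Xi_psi', y = Xi_alpha (all scalars)."""
    return (xp * A["pp"] * xp
            + x * (A["ss"] * x + A["sp"] * xp + A["Is"])
            + y * (A["aa"] * y + A["ap"] * xp + A["as"] * x + A["Ia"]))


def eliminate_alpha(x, xp, A):
    """Solve 2*A_aa*y + A_ap*xp + A_as*x + I_a = 0 for y, as in (6.22)."""
    return -(A["ap"] * xp + A["as"] * x + A["Ia"]) / (2.0 * A["aa"])


def reduced_integrand(x, xp, A):
    """Integrand after substituting the stationary y (scalar form of the reduced functional)."""
    inv = 1.0 / A["aa"]
    c_ss = A["ss"] - 0.25 * A["as"] * inv * A["as"]
    c_sp = A["sp"] - 0.5 * A["as"] * inv * A["ap"]
    c_pp = A["pp"] - 0.25 * A["ap"] * inv * A["ap"]
    k_s = A["Is"] - 0.5 * A["as"] * inv * A["Ia"]
    k_p = -0.5 * A["ap"] * inv * A["Ia"]
    const = -0.25 * A["Ia"] * inv * A["Ia"]
    return x * (c_ss * x + c_sp * xp + k_s) + xp * (c_pp * xp + k_p) + const


if __name__ == "__main__":
    coeffs = {"pp": 2.0, "ss": 1.5, "sp": 0.3, "Is": -0.7,
              "aa": 4.0, "ap": 0.9, "as": -1.1, "Ia": 0.5}
    x, xp = 0.8, -1.3
    y_star = eliminate_alpha(x, xp, coeffs)
    print(full_integrand(x, xp, y_star, coeffs) - reduced_integrand(x, xp, coeffs))
```

In the matrix case the same algebra applies with $A_{\alpha\alpha}^{-1}$ in place of the scalar reciprocal, provided $A_{\alpha\alpha}$ is symmetric as stated in the text.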
In analogy with regular singular points of an uncoupled ODE, power series solutions can be constructed near $\psi_s$ using a matrix form of Frobenius analysis (Chapter 4 in [41]). As discussed in [80], for the Euler-Lagrange equation under consideration, such singular points occur when $\psi = 0$, $\iota = 0$, or $m\iota(\psi) - n = 0$ for any $m$ and $n$ included in the spectrum for $\xi^{\psi}$ and $\xi^{\alpha}$. This singular behavior is discussed in more detail in Section 6.3.

This coupled set of second-order ODEs is solved with a boundary condition of $\boldsymbol{\Xi}_{\psi}(0) = 0$ and $\boldsymbol{\Xi}_{\psi}(\psi_0)$ specified according to the prescribed boundary perturbation,

$$\xi^{\psi c}_{m,n}(\psi_0) = \frac{\int_0^{2\pi} d\vartheta \int_0^{2\pi} d\varphi \, \delta \mathbf{x} \cdot \nabla \psi \, \cos(m\vartheta - n\varphi)}{\int_0^{2\pi} d\vartheta \int_0^{2\pi} d\varphi \, \cos^2(m\vartheta - n\varphi)},$$ (6.29)

where $\psi_0$ is the flux label on the plasma boundary $S_P$. As $\nabla \psi$ vanishes at the origin, we require that $\boldsymbol{\Xi}_{\psi}(0) = 0$ such that the displacement vector remains finite.

The approach presented in this Section is very similar to that of DCON, with several important distinctions. (1) Rather than assuming $\nabla \cdot \boldsymbol{\xi} = 0$, we have assumed $\hat{\mathbf{b}} \cdot \boldsymbol{\xi} = 0$. This allows us to include $n = 0$ modes in our displacement vector in axisymmetry, and $n$ that are an integer multiple of the number of periods in $N_P$ symmetry. (2) We have allowed for the inclusion of a general bulk force, provided it is consistent with the conventions we have adopted for our displacement vector ($\hat{\mathbf{b}} \cdot \boldsymbol{\xi} = 0$ and $\xi^{\alpha c}_{0,0} = 0$). (3) DCON solves an initial value problem by integrating a set of linearly-independent solutions that are regular at the axis. We instead solve a BVP. (4) Our treatment of singular surfaces differs slightly from that of DCON, as is described in Section 6.3.4.

To further analyze the behavior of the solutions to the linearized equilibrium equations, we will consider the simplified geometry of a one-dimensional screw pinch. A screw pinch is an infinite cylindrical device with field lines that lie on surfaces of constant radius $r$.
The field lines generally have both a toroidal ($\hat{\mathbf{z}}$) and poloidal ($\hat{\boldsymbol{\theta}}$) component. We assume a cylindrical coordinate system with $\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}} \cdot \hat{\mathbf{z}} = 1$ where all equilibrium quantities only depend on $r$. The infinite length of a screw pinch is approximated by a cylindrical torus with major radius $R \gg 1$, with magnetic field

$$\mathbf{B} = \psi'(r) \left( \frac{\hat{\mathbf{z}}}{r} + \frac{\iota(r) \hat{\boldsymbol{\theta}}}{R} \right).$$ (6.30)

Here $\psi(r)$ is the toroidal flux label,

$$2\pi \psi(r) = \int_0^{2\pi} d\theta \int_0^{r} dr' \, r' \, \mathbf{B} \cdot \hat{\mathbf{z}},$$ (6.31)

and $\iota(r)$ is the rotational transform,

$$\iota(r) = R \, \frac{\mathbf{B} \cdot \nabla \theta}{\mathbf{B} \cdot \nabla z},$$ (6.32)

the number of poloidal rotations of the field line through a $z$ displacement of $2\pi R$. We note that $\theta$ and $z/R$ are magnetic coordinates for this system. The MHD force balance equation (6.1) for this geometry becomes

$$\frac{d}{dr} \left( \mu_0 p(r) + \frac{1}{2r^2} \left( \psi'(r) \right)^2 \right) + \frac{\iota(r) \psi'(r)}{r R^2} \frac{d}{dr} \left( r \iota(r) \psi'(r) \right) = 0,$$ (6.33)

where $\iota(\psi)$, $p(\psi)$ and $\psi_0 \equiv \psi(r = 1)$ are prescribed. The solution is obtained for $r \in [0, 1]$ with $\psi(r = 0) = 0$.

Due to the toroidal and poloidal symmetry of this equilibrium, the Fourier modes of the displacement vector decouple from each other, and we can consider each mode independently. Although the Euler-Lagrange equation is solved for $\xi^{\psi}(\psi)$, it is more straightforward to analyze the nature of the solutions in terms of $\xi^{r}(r) = \boldsymbol{\xi} \cdot \nabla r$. Thus we will discuss the Euler-Lagrange equation in terms of modes of $\xi^{r}$,

$$\left( \xi^{rc}_{m,n} \right)''(r) = B_1(r) \left( \xi^{rc}_{m,n} \right)'(r) + B_2(r) \, \xi^{rc}_{m,n}(r) + B_3(r).$$ (6.34)

We consider a bulk force perturbation of the form

$$\delta \mathbf{F} = \sum_{m,n} \left[ \delta F^{m,n}_{rc}(r) \cos\left( m\theta - n \frac{z}{R} \right) \hat{\mathbf{r}} + \delta F^{m,n}_{\alpha s}(r) \sin\left( m\theta - n \frac{z}{R} \right) \left( \frac{\hat{\boldsymbol{\theta}}}{r} - \frac{\iota(r)}{R} \hat{\mathbf{z}} \right) \right],$$ (6.35)

and a boundary condition given by

$$\xi^{r}(1) = \sum_{m,n} \xi^{rc}_{m,n}(1) \cos\left( m\theta - n \frac{z}{R} \right).$$
(6.36) m = 0 , n = 0 mode We begin with a discussion of the m = 0, n = 0 mode. The coefficients appearing in theEuler-Lagrange equation (6.34) become, B ( r ) = R − r ι ( r )( ι ( r ) + 2 rι (cid:48) ( r )) r ( R + r ι ( r ) ) − ψ (cid:48)(cid:48) ( r ) ψ (cid:48) ( r ) (6.37a) B ( r ) = (3 R − r ι ( r ) ) ψ (cid:48) ( r ) − rR ψ (cid:48)(cid:48) ( r ) r ( R + r ι ( r ) ) ψ (cid:48) ( r ) (6.37b) B ( r ) = − µ r δF , rc ( r )(1 + r ι ( r ) /R ) ψ (cid:48) ( r ) . (6.37c)We note that the Euler-Lagrange equation exhibits regular singular behavior at r = 0. Tostudy the regular singular behavior near the axis in more detail, we expand the toroidal fluxas, ψ ( r ) = ψ r + O ( r ) , (6.38)where ψ is some constant, which follows from noting that ψ ( r ) must be even in r from(6.33). From the indicial equation for the homogeneous problem with B ( r ) = 0, we findthe leading order behavior to be ξ rc , ( r ) ∼ r ± near the origin. The negative root will beexcluded given our boundary condition on the axis; thus, we expect a smooth solution forthe radial displacement vector. The leading order behavior of the inhomogeneous problemwill depend on the bulk force perturbation of interest.We first demonstrate a perturbed equilibrium with an imposed boundary perturbationand no force perturbation, ξ rc , (1) = 1 δF , rc ( r ) = 0 . (6.39)The boundary value problem is solved with MATLAB’s bvp4c routine, which employs animplicit Runge-Kutta method with adaptive mesh refinement [128]. Given that the coeffi-cients become singular on the axis, the axis is not included on the computational grid, andthe inner boundary condition is imposed at a point near the axis, ψ min . For the calculationsin this Chapter, we use ψ min ∼ − − − . 
(While some numerical methods for BVPs donot require the evaluation of the ODE at the boundary points, such as finite-difference orcollocation methods, our numerical method requires evaluation at the origin.)The Euler-Lagrange equation is computed for a VMEC [111] equilibrium, approximatinga screw pinch by imposing a large aspect ratio boundary, R ( ψ , θ b ) = R + a cos( θ b ) Z ( ψ , θ b ) = a sin( θ b ) , (6.40) a = 1 and R = 10 . The angle θ b ∈ [0 , π ] is used to parameterize the boundary.The profiles are taken to be p ( ψ ) = 10 − × (cid:0) ψ/ψ (cid:1) + 2 . × ( ψ/ψ ) and ι ( ψ ) =10 + 5 × ( ψ/ψ ) + 2 × ( ψ/ψ ) . The equilibrium flux and profiles are presented inFigure 6.1.We compare the numerical solution of the Euler-Lagrange equation with the displacementvector computed from finite-difference calculations with the nonlinear VMEC code. Weimpose a perturbed boundary of the form, δR ( ψ , θ b ) = ∆ cos( θ b ) δZ ( ψ , θ b ) = ∆ sin( θ b ) . (6.41)We apply a two-point centered difference derivative with a step size of ∆ = 10 − . Theresulting displacement vector is computed from, ξ ψ ( ψ, ϑ ) = δR ( ψ, ϑ ) ∂ψ ( R, Z ) ∂R + δZ ( ψ, θ ) ∂ψ ( R, Z ) ∂Z , (6.42)where δR ( ψ, ϑ ) and δZ ( ψ, ϑ ) are the measured changes in the cylindrical coordinates atfixed flux label and straight field line poloidal angle. The result of the calculation is shownin Figure 6.2, where we observe good agreement between the finite-difference and Euler-Lagrange results with a volume-averaged error,∆ V = (cid:82) V P d x (cid:16) ξ r VMEC − ξ r Euler-Lagrange (cid:17) (cid:82) V P d x (cid:0) ξ r VMEC (cid:1) , (6.43)of 2 . × − .We next consider a perturbed equilibrium state corresponding to the addition of a bulkforce in the form of the gradient of a scalar pressure perturbation, ξ rc , (1) = 0 δF , rc ( r ) = − δp (cid:48) ( r ) . 
(6.44)This type of bulk force perturbation is necessary to compute the shape gradient for thevacuum magnetic well and beta figures of merit discussed in Chapter 5. We take δp ( r ) = p ( r ),the unperturbed pressure profile. The Euler-Lagrange solution is compared with a finite-difference VMEC calculation, δp ( ψ ) = ∆ p ( ψ ) , (6.45)computed with a two-point centered-difference stencil of amplitude ∆ = 10 − . The resultingdisplacement vectors are displayed in Figure 6.3, where we again observe good agreementbetween the linearized solution and its approximation with a finite-difference derivative of thenonlinear solution. The volume-averaged fractional difference (6.43) between the solutionsis found to be 1 . × − . 135 () / R (a) p () (b) r (c) Figure 6.1: Equilibrium (a) rotational transform and (b) pressure profiles used for screwpinch calculations. (c) Equilibrium flux computed with these profiles.136igure 6.2: Benchmark of screw pinch m = 0, n = 0 mode with applied boundary pertur-bation (6.39). The solution of the Euler-Lagrange equation (6.34) with coefficients (6.37) iscompared with a finite-difference VMEC calculation.137igure 6.3: Benchmark of screw pinch m = 0, n = 0 mode with applied pressure pertur-bation (6.44). The solution of the Euler-Lagrange equation (6.34) with coefficients (6.37) iscompared with a finite-difference VMEC calculation.138 .3.2 n = 0 , m (cid:54) = 0 modes We next consider the behavior of the n = 0, m (cid:54) = 0 modes. The coefficients appearingthe Euler-Lagrange equation (6.34) are, B ( r ) = − r − ι (cid:48) ( r ) ι ( r ) − ψ (cid:48)(cid:48) ( r ) ψ (cid:48) ( r ) (6.46a) B ( r ) = m − r (6.46b) B ( r ) = − µ R mδF m, rc + δ (cid:0) F m, αs (cid:1) (cid:48) ( r ) mι ( r ) ψ (cid:48) ( r ) . (6.46c)In addition to the regular singular point on the axis, we note that the coefficients becomesingular when ι ( r ) = 0. This class of equilibria is typically not of interest, so we will notconsider this type of singularity. 
Expanding the displacement vector as a power series nearthe origin, we find the leading order behavior of the homogeneous solution to be ξ rcm, ∼ r − ± m .As ψ ( r ) ∼ r to leading order near the axis, we note that ξ ψcm, ∼ ψ ±| m | / . In order to satisfythe boundary condition at ψ = 0, the minus solution is excluded. As ξ ψcm, ( ψ ) becomes non-smooth at the origin, additional care must be taken in obtaining the numerical solution. Wefind that the accuracy is improved by solving the BVP on a grid in √ ψ rather than ψ , asthe solution is expected to be a smooth function of √ ψ ( ξ ψcm, ( √ ψ ) ∼ (cid:0) √ ψ (cid:1) m ). To ensure theaccuracy of the coefficients near the axis, we additionally employ a near-axis expansion of theequilibrium equations to O ( r ) (Appendix R). The incorporation of the near-axis solutionbecomes important when linearizing about equilibria computed with the VMEC code, whichexhibits poor resolution near the magnetic axis.To demonstrate this method, we perform a benchmark of the homogeneous problem withan m = 1 boundary perturbation, ξ rc , (1) = 1 δF , rc ( r ) = 0 . (6.47)The same equilibrium profiles are used as those in Section 6.3.1. We perform a benchmarkbetween solutions of the Euler-Lagrange equation and finite-difference approximations withVMEC equilibria. A boundary perturbation of the form, δR ( ψ , θ b ) = ∆ cos(2 θ b ) δZ ( ψ , θ b ) = ∆ sin(2 θ b ) , (6.48)is imposed. The amplitude of the perturbation is taken to be ∆ = 10 − , and the perturbedequilibrium state is computed with a two-point centered-difference stencil.The resulting displacement vector is presented in Figure 6.4. We indeed find that thedisplacement vector has very sharp derivatives near the origin, though our numerical methodcan reproduce the solution obtained from VMEC. The volume-averaged fractional errorbetween the solutions is found to be ∆ V = 5 . × − .139igure 6.4: Benchmark of screw pinch m = 1, n = 0 mode with applied boundary pertur-bation (6.47). 
The solution of the Euler-Lagrange equation (6.34) with coefficients (6.46) iscompared with a finite-difference VMEC calculation.140 .3.3 m = 0 , n (cid:54) = 0 modes We next consider the m = 0, n (cid:54) = 0 modes, for which the coefficients of the Euler-Lagrangeequation take the form, B ( r ) = 1 r − ψ (cid:48)(cid:48) ( r ) ψ (cid:48) ( r ) (6.49a) B ( r ) = 3 r + n R − R ι ( r ) (cid:0) ι ( r ) + rι (cid:48) ( r ) (cid:1) − r R ι ( r ) ) rψ (cid:48) ( r ) ψ (cid:48)(cid:48) ( r ) (6.49b) B ( r ) = − µ r (cid:16) nrδF ,nrc ( r ) + rι ( r ) (cid:0) δF ,nαs (cid:1) (cid:48) ( r ) + δF ,nαs (cid:0) ι ( r ) + rι (cid:48) ( r ) (cid:1)(cid:17) nψ (cid:48) ( r ) . (6.49c)Although the ODE exhibits a regular singular point at the axis, we expect regular behaviorof the homogenous solution near the origin, as the indicial equation implies that ξ rc ,n ( r ) ∼ r . Analytic solutions
We can compare numerical solutions of the Euler-Lagrange equation with an analyticsolutions in certain limits. Assuming ι = 0 and p = 0, we find that the equilibrium flux(6.33) satisfies ψ ( r ) = ψ r . We consider a perturbed equilibrium problem corresponding toa boundary perturbation and no force perturbation, ξ rc ,n (1) = 1 δF ,nrc ( r ) = 0 . (6.50)In this case, we recover the modified Bessel equation, n r R (cid:16) ξ rc ,n (cid:17) (cid:48)(cid:48) (cid:18) nrR (cid:19) + nrR (cid:16) ξ rc ,n (cid:17) (cid:48) (cid:18) nrR (cid:19) − (cid:32) n r R (cid:33) ξ rc ,n (cid:18) nrR (cid:19) = 0 . (6.51)The two solutions are I ( nr/R ) and K ( nr/R ), the modified Bessel functions of the firstand second kind. As the solution must be finite at the origin we find, ξ rc ,n ( r ) = I (cid:16) nrR (cid:17) I (cid:16) nR (cid:17) . (6.52)A comparison between the n = 1 Euler-Lagrange solution and analytic solution is given inFigure 6.5. The volume-averaged fractional error between the solutions is ∆ V = 1 . × − .We now consider the inhomogeneous problem with a bulk force given by δF ,nrc ( r ) =1 / ( rµ ). In this case, our Euler-Lagrange equation takes the form of an inhomogeneousmodified Bessel equation, n r R (cid:16) ξ rc ,n (cid:17) (cid:48)(cid:48) (cid:18) nrR (cid:19) + nrR (cid:16) ξ rc ,n (cid:17) (cid:48) (cid:18) nrR (cid:19) − (cid:32) n r R (cid:33) ξ rc ,n (cid:18) nrR (cid:19) + r (2 ψ ) = 0 . (6.53)141igure 6.5: Benchmark of screw pinch m = 0, n = 1 mode with an applied boundaryperturbation (6.50). The solution of the Euler-Lagrange equation (6.34) with coefficients(6.49) is compared with an analytic solution (6.52).142igure 6.6: Benchmark of screw pinch m = 0, n = 1 mode with a bulk force perturbation δF , rc = 1 /r . 
The solution of the Euler-Lagrange equation (6.34) with coefficients (6.49) iscompared with an analytic solution (6.54).The solution satisfying the BVP is given by, ξ rc ,n ( r ) = R (2 ψ ) rn I (cid:16) nR (cid:17) (cid:32) rI (cid:18) nrR (cid:19) (cid:32) − R + nK (cid:18) nR (cid:19)(cid:33) + I (cid:18) nR (cid:19) (cid:32) R − nrK (cid:18) nrR (cid:19)(cid:33) (cid:33) . (6.54)We note that xK ( x ) ∼ (cid:0) A + B log( x ) (cid:1) x for constants A and B near x = 0, so ourdisplacement vector is not smooth. We find that the numerical solution depends very sensi-tively on the accuracy of the coefficients, and it becomes useful to employ the axis expansiondescribed in Appendix R. We compare the resulting numerical and analytic Euler-Lagrangesolutions in Figure 6.6. The volume-averaged fractional error (6.43) between the numericalEuler-Lagrange solution and analytic solution is ∆ V = 6 . × − .143 .3.4 m (cid:54) = 0 , n (cid:54) = 0 modes Finally, we consider modes with m (cid:54) = 0 and n (cid:54) = 0, for which the Euler-Lagrange coeffi-cients take the form, B ( r ) = − r + 2 n rn r + m R + 2 mι (cid:48) ( r ) n − mι ( r ) − ψ (cid:48)(cid:48) ( r ) ψ (cid:48) ( r ) (6.55a) B ( r ) = 2 n rµ p (cid:48) ( r )( n − mι ( r )) ψ (cid:48) ( r ) + n ( − m ) + n r R + m ( m − R r + n n − mι ( r ) n r + m R (6.55b) B ( r ) = − µ n r + m R ( n − mι ( r )) ψ (cid:48) ( r ) δF m,nrc − µ mR + nr ι ( r )( n − mι ( r )) ψ (cid:48) ( r ) ( δF m,nαs ) (cid:48) ( r ) (6.55c) − µ nr (cid:0) − mnR + 2( n r + 2 m R ) ι ( r ) + ( n r + m rR ) ι (cid:48) ( r ) (cid:1) ( n r + m R )( n − mι ( r )) ( ψ (cid:48) ( r )) δF m,nαs . By expanding the solution in a power series, we note the behavior of the solution variesas ξ rcm,n ∼ r m − near the origin. Thus, as for modes with n = 0 and m (cid:54) = 0, ξ ψ will varywith fractional powers of ψ . The numerical treatment of these modes benefits from accuratecalculations of the coefficients with the near-axis expansion. 
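The near-axis exponents quoted for the individual modes come from an indicial equation. As a minimal sketch, assume that near the axis the coefficients of (6.34) behave as $B_1 \approx b_1/r$ and $B_2 \approx b_2/r^2$ for constants $b_1$ and $b_2$; substituting $\xi \sim r^{s}$ into the homogeneous equation then gives $s(s-1) - b_1 s - b_2 = 0$. The helper names and the example values $b_1 = -3$, $b_2 = m^2 - 1$ are assumptions chosen to reproduce the $r^{|m|-1}$ behavior stated in the text; they are not read off the coefficient expressions.

```python
import math


def indicial_exponents(b1, b2):
    """Roots s of s*(s - 1) - b1*s - b2 = 0 for xi'' = (b1/r) xi' + (b2/r**2) xi.

    Returns (smaller_root, larger_root). Raises ValueError for complex exponents.
    """
    disc = (1.0 + b1) ** 2 + 4.0 * b2
    if disc < 0.0:
        raise ValueError("complex exponents: solutions oscillate near the singular point")
    root = math.sqrt(disc)
    return ((1.0 + b1 - root) / 2.0, (1.0 + b1 + root) / 2.0)


def regular_exponent(m):
    """Exponent of the solution that stays finite at the axis, for the assumed b1, b2."""
    # Assumed leading-order coefficients (illustrative, see the lead-in above).
    b1 = -3.0
    b2 = float(m * m - 1)
    return indicial_exponents(b1, b2)[1]


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, indicial_exponents(-3.0, float(m * m - 1)))
```

The larger root is the one retained by the boundary condition at the axis; the smaller root corresponds to the solution that is excluded.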
In addition to the regular singular point at $r = 0$, we note that there will also be a singular point on surfaces where $\iota(r) = n/m$.

One method to treat singular surfaces relies on a series expansion of the displacement vector within a boundary layer near the singularity. The method of Frobenius yields two independent solutions of the second-order ODE,

$$\xi^{r}_{\text{series}}(r) = A_1 \xi^{r,1}(r) + A_2 \xi^{r,2}(r),$$ (6.56)

near a resonant surface at $r = r_s$. A numerical solution of the ODE, $\xi^{r}_{\text{num}}(r)$, is integrated from the axis to the beginning of the boundary layer at $r = r_s - r_b$. The two constants, $A_1$ and $A_2$, are fixed by matching the numerical solution and its derivative at $r_s - r_b$. The series solution is then evaluated at the other edge of the boundary layer at $r_s + r_b$. The numerical solution is integrated to the plasma boundary at $r = 1$ using the initial conditions $\xi^{r}_{\text{num}}(r_s + r_b) = \xi^{r}_{\text{series}}(r_s + r_b)$ and $(\xi^{r}_{\text{num}})'(r_s + r_b) = (\xi^{r}_{\text{series}})'(r_s + r_b)$. A shooting method is used to solve the BVP. This technique is similar to that used in the DCON [80] code. However, in DCON only one independent series solution is considered, as the other is not an element of the required function space for the generalized Newcomb crossing criteria.

While the above method can reproduce the singular behavior of the Euler-Lagrange equation, as will be demonstrated shortly, it is not always desirable to include such singular behavior in the Euler-Lagrange solutions. If the perturbed current density varies as $\sim 1/(r - r_s)$ near the rational surface, this will drive infinite classical transport [97], which is unphysical. An alternative is to smooth the coefficients artificially as

$$B^{\text{smooth}}_1(r) = B_1(r) \, \text{sign}(n - m\iota(r)) \, \frac{n - m\iota(r)}{\sqrt{(n - m\iota(r))^2 + \epsilon}}$$ (6.57a)
$$B^{\text{smooth}}_2(r) = B_2(r) \, \frac{(n - m\iota(r))^2}{(n - m\iota(r))^2 + \epsilon},$$ (6.57b)

where $\epsilon \ll 1$. As $\epsilon \to 0$, the Euler-Lagrange equation remains unchanged. For small but finite $\epsilon$, the coefficients are only modified in the vicinity of $r_s$. This is similar to a technique used in the IPEC [181] code.

Analytic solution near singular surfaces
To study the solutions of the Euler-Lagrange equation with m (cid:54) = 0 and n (cid:54) = 0 further,we consider a limit in which analytic solutions can be obtained. We will take p (cid:48) ( ψ ) = 0 and ι ( r ) = ι r where ι is a constant. In this case the force-balance equation (6.33) gives us thefollowing expression for the flux in terms of hypergeometric functions, ψ ( r ) = r ψ F (cid:16) ; ; ; − r ι R (cid:17) F (cid:16) ; ; ; − ι R (cid:17) . (6.58)We define a variable r s = n/ ( mι ) such that a singular surface occurs at r = r s . Thecoefficients of the homogeneous problem can be expressed as, B ( r ) = 3 r − r s − r s r − rr s − r + r ι /R − R rR + r r s ι (6.59a) B ( r ) = 1 + m r + 4 r s r − r + m r s ι R + 2 R ( r + r s ) r ( r − r s )( R + r r s ι ) . (6.59b)In the limit of small shear, (cid:15) ι = ι r s /R (cid:28)
1, we can approximate the coefficients as, B ( r ) = 3 r s − rr ( r − r s ) + O (cid:0) (cid:15) ι (cid:1) (6.60a) B ( r ) = m − r + O (cid:0) (cid:15) ι (cid:1) . (6.60b)In practice we choose a very small value for this expansion parameter ( (cid:15) ι ∼ − ) so thatdropping the higher order terms is a very good approximation. For the m = 2, n = 1 modesubject to a boundary perturbation, ξ rc , (1) = 1 δF , rc ( r ) = 0 , (6.61)we have the analytic solution, ξ rc , ( r ) = r Re F (cid:16) − √
7; 3 + √ , , rr s (cid:17) F (cid:16) − √
7; 3 + √ , , r s (cid:17) . (6.62)145e first consider the case in which r s = 2 such that a singular surface does not appearwithin the volume. We compare the numerical solution of the Euler-Lagrange equation witha finite-difference calculation with VMEC. We impose a boundary perturbation of the form, δR ( ψ , θ b , φ ) = ∆ cos(3 θ b − φ ) (6.63a) δZ ( ψ , θ b , φ ) = ∆ sin(3 θ b − φ ) , (6.63b)where φ is the geometric toroidal angle. The perturbed field is computed with a two-point centered-difference stencil with amplitude ∆ = 10 − . The results of the calculationsare shown in Figure 6.7. We note that the Euler-Lagrange solution agrees well with theanalytic solution, with a volume-averaged difference of ∆ V = 1 . × − , but there is asmall discrepancy between the VMEC solution and the analytic solution near the edge,with a volume-averaged difference of ∆ V = 9 . × − . One possible source of this erroris the treatment of singularities by the VMEC code. While recent results have indicatedthat VMEC equilibria can exhibit 1 /x -like behavior near rational surfaces [144, 160], thenumerical solution is not truly singular on such surfaces, and very large numerical resolutionis necessary in order to see behavior resembling a singularity. Therefore, we do not expect thedisplacement vector computed with finite-difference VMEC to agree with the Euler-Lagrangesolution. Although for this equilibrium, ι does not resonate with the harmonics of thedisplacement vector, it may resonate with other modes present in the nonlinear equilibrium.Next we consider an equilibrium with a singular surface in the volume, r s = 0 .
5. TheEuler-Lagrange equation is solved with both the power-series method, which captures thesingular nature of the solution, and the coefficient smoothing method (6.57) with severalvalues of (cid:15) . Again, we compare with a finite-difference VMEC solution with a boundaryperturbation given by (6.63). With the power-series method, we find agreement between theEuler-Lagrange and analytic solutions. As expected, the solutions with smoothed coefficientsdo not reproduce the analytic expression. However, neither of these approaches approximatesthe VMEC solution well. Although the VMEC equilibrium is fairly well-resolved (701 fluxsurfaces, 10 − force tolerance, m ≤ | n | ≤ r = r s . We may need to consider a revised treatment of thesingularity to match the behavior from VMEC better. We will now demonstrate the linearized equilibrium technique to compute the shapegradient of the vacuum magnetic well figure of merit discussed in Chapter 5, f W ( S P ) = (cid:90) V P d x w ( ψ ) , (6.64)with, w ( ψ ) = exp( − ( ψ − ψ m, ) /ψ w ) − exp( − ( ψ − ψ m, ) /ψ w ) , (6.65)146igure 6.7: Benchmark of screw pinch m = 2, n = 1 mode with a boundary perturbation(6.61). The solution of the Euler-Lagrange equation (6.34) with coefficients (6.55) is com-pared with an analytic solution (6.62) and a finite-difference calculation from VMEC. Thisequilibrium does not contain a resonant surface within the volume.147igure 6.8: Benchmark of screw pinch m = 2, n = 1 mode with a boundary perturbation(6.61). The solution of the Euler-Lagrange equation (6.34) with coefficients (6.55) is com-pared with an analytic solution (6.62) and a finite-difference calculation from VMEC. Thisequilibrium contains a resonant surface at r = 0 . ψ = 0 . ψ m, = 0 . ψ , ψ m, = 0 . ψ , and ψ w = 0 . ψ . 
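For orientation, the weight function entering the vacuum magnetic well figure of merit can be evaluated directly. This is a minimal sketch that assumes $w(\psi)$ in (6.65) is a difference of two Gaussians, $\exp(-(\psi - \psi_{m,1})^2/\psi_w^2) - \exp(-(\psi - \psi_{m,2})^2/\psi_w^2)$; the squared exponents, the function name, and the parameter values used in the demonstration are assumptions made for illustration and are not the values used in this Chapter.

```python
import math


def well_weight(psi, psi_m1, psi_m2, psi_w):
    """Assumed Gaussian form of w(psi): positive near psi_m1, negative near psi_m2."""
    first = math.exp(-((psi - psi_m1) / psi_w) ** 2)
    second = math.exp(-((psi - psi_m2) / psi_w) ** 2)
    return first - second


def weight_derivative(psi, psi_m1, psi_m2, psi_w):
    """Analytic dw/dpsi, the radial profile of the bulk force delta F = -grad w(psi)."""
    first = math.exp(-((psi - psi_m1) / psi_w) ** 2)
    second = math.exp(-((psi - psi_m2) / psi_w) ** 2)
    return (-2.0 * (psi - psi_m1) * first + 2.0 * (psi - psi_m2) * second) / psi_w ** 2


if __name__ == "__main__":
    # Hypothetical parameters, in units of the boundary flux label.
    params = (0.1, 0.9, 0.1)
    for psi in (0.0, 0.1, 0.5, 0.9, 1.0):
        print(psi, well_weight(psi, *params), weight_derivative(psi, *params))
```

With this form the figure of merit weights the volume near $\psi_{m,1}$ positively and the volume near $\psi_{m,2}$ negatively, and the corresponding bulk force is localized around those two surfaces.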
The shape gradient of $f_W$ is obtained with an adjoint approach by computing a perturbed equilibrium state corresponding to the addition of a bulk force with no displacement of the boundary,

$$\delta \mathbf{x} \cdot \nabla \psi = 0, \qquad \delta \mathbf{F} = -\nabla w(\psi).$$ (6.66)

The resulting perturbed field, $\delta \mathbf{B}[\boldsymbol{\xi}]$, is used to compute the shape gradient,

$$\mathcal{G} = \left. \frac{\delta \mathbf{B}[\boldsymbol{\xi}] \cdot \mathbf{B}}{\mu_0} \right|_{S_P}.$$ (6.67)

We perform this calculation for an axisymmetric configuration with a plasma boundary given by

$$R(\psi_0, \theta_b) = R_0 + a \cos(\theta_b) + b \cos(2\theta_b)$$ (6.68a)
$$Z(\psi_0, \theta_b) = a \sin(\theta_b) - b \sin(2\theta_b),$$ (6.68b)

with $R_0 = 3$, $a = 1$, and $b = 0.1$. Owing to the toroidal symmetry of the equilibrium, all of the toroidal modes of the displacement vector decouple. Given the toroidal symmetry of the bulk force perturbation, we only need to consider the $n = 0$ modes. Therefore, the only singular point of the Euler-Lagrange equation is at the origin. As before, the magnetic axis is not included on the computational grid, and the coupled BVP is solved with the bvp4c routine. The radial displacement vector is computed retaining modes m ≤ δp ( ψ ) = ∆ w ( ψ ) . (6.69) A two-point centered-difference derivative is computed with magnitude ∆ = 10. The surface-averaged fractional difference between the Euler-Lagrange and VMEC solutions is computed to be 7 . × − .

Figure 6.9: The shape gradient of the vacuum magnetic well (6.64) is computed for a tokamak equilibrium with triangularity (6.68) with the solution of the Euler-Lagrange equation corresponding to the adjoint problem (6.66) and a finite-difference approximation of the adjoint problem with VMEC (6.69). Panels (a), (b), (c).

We have demonstrated a variational method for computing perturbed equilibrium states corresponding to the addition of a bulk force or boundary perturbation. We considered the simplified geometry of a screw pinch to demonstrate the behavior of each of the modes of the displacement vector. Numerical solutions of the Euler-Lagrange equation are benchmarked with finite-difference calculations of the nonlinear equilibrium code, VMEC, and with analytic solutions in certain limits. Finally, we employed this approach to compute the shape gradient of a figure of merit of interest for stellarator optimization in toroidally symmetric geometry. We aim to apply this approach for computing such shape gradients in stellarator geometry, though this task may be somewhat more challenging. In fully 3D geometry, there may exist several singular surfaces throughout a volume due to toroidal mode coupling, each of which needs to be treated carefully.

While the Euler-Lagrange equation exhibits singular behavior at rational surfaces, the equilibria computed with the VMEC code do not appear to exhibit any singular response, as demonstrated in Section 6.3.4. If the goal is to linearize about VMEC equilibria, we therefore may not want to solve the Euler-Lagrange equation exactly, but rather to artificially smooth the coefficients appearing in the ODE. As an alternative, artificial viscosity could be added to the Euler-Lagrange system with the addition of a small term involving a higher-order derivative. This technique, commonly used in the fluid dynamics community [67, 156], turns a singular ODE into an ODE with a singular perturbation. It remains to be demonstrated that the shape gradients obtained from Euler-Lagrange solutions including such smoothing techniques can reproduce the expected shape gradients computed with the VMEC code.

In addition to the demonstration for three-dimensional geometry, there are several interesting extensions of the work discussed in this Chapter. As discussed in Chapter 5, there are several figures of merit for which the adjoint problem requires the addition of a perturbation to the prescribed toroidal current profile. This would necessitate generalizing this formulation to allow for perturbations to the magnetic field that vary the rotational transform profile. While the work in this Chapter has been applied to compute the shape gradient with respect to the plasma boundary, it may be possible to couple perturbations of the boundary to coil perturbations in order to compute the coil shape gradient.
This may benefit from a method similar to that used in the IPEC code, in which the virtual casing principle is applied to couple boundary perturbations to changes in the external magnetic fields.

The further development of this linear equilibrium approach would enable the shape gradient of many additional figures of merit to be computed with an adjoint method. Even if an adjoint method is not applied, the linear equilibrium approach could prove very fruitful for gradient-based, fixed-boundary optimization. Replacing a finite-difference calculation by an analytic derivative may reduce computational cost and noise associated with the finite-difference step size, enabling more efficient sensitivity and tolerance calculations for stellarator configurations.

Chapter 7
Conclusions
In this Thesis, we have aimed to address fundamental challenges (Section 1.4.4) associated with stellarator optimization using the adjoint method and shape sensitivity analysis:

1. Coil complexity
2. Non-convexity
3. High-dimensionality
4. Tight engineering tolerances

The adjoint method allows us to efficiently compute derivatives in the context of several problems of interest for stellarator optimization. These derivatives enable navigation through high-dimensional, non-convex spaces with gradient-based methods. We demonstrate gradient-based optimization with adjoints in Chapter 3, for the design of coil shapes with minimal complexity. Computing the shape gradient of coil metrics with respect to perturbations of the winding surface allows us to gain intuition about features of configurations that enable simpler coils. We also demonstrate gradient-based optimization of the local magnetic geometry for finite-collisionality neoclassical properties in Chapter 4. While including such objective functions is typically prohibitively expensive for non-convex, high-dimensional optimization, we demonstrate convergence toward a local optimum with a minimal number of function evaluations. With this adjoint method, we also gain intuition for the sensitivity of the bootstrap current and particle fluxes to perturbations in the field strength, informing engineering tolerances. Finally, in Chapter 5 we demonstrate an adjoint method for computing the plasma surface and coil shape gradient for functions that depend on MHD equilibrium solutions. Importantly, the coil shape gradient can be used to evaluate engineering tolerances for such figures of merit (Section 2.1.3). While it has not yet been demonstrated in this Thesis, these shape gradients can also enable efficient adjoint-based optimization, either in the space of the plasma boundary or coil shapes. As discussed in Section 1.4, the direct optimization of coil shapes may result in coils that can be more feasibly engineered than those resulting from the traditional two-step optimization.

For several problems discussed in this Thesis, it is convenient to apply the discrete adjoint method (Section 2.2.1).
For the winding surface optimization problem in Chapter 3, the forward problem is solved as a discrete linear system, so the discrete adjoint operator can be obtained by simply taking the matrix transpose. A similar discrete adjoint method was applied for neoclassical optimization in Chapter 4, as the discretized form of the drift-kinetic equation takes the form of a linear system in the SFINCS code.

Physical insight into the structure of the relevant equations can inform the development of continuous adjoint methods (Section 2.2.2). For the neoclassical application, the adjoint equation was obtained based on an inner product similar to the free-energy norm from gyrokinetic theory. The self-adjointness of the linear Fokker-Planck operator with respect to this inner product enabled straightforward calculation of the adjoint operator. For the MHD application, the adjoint equation is obtained by noting the self-adjointness of the MHD force operator, generalized to allow for perturbations of the rotational transform and currents in the vacuum region. Finally, in Chapter 6, a variational method for solving the adjoint equations obtained in Chapter 5 is presented. Here we are able to borrow a variational method from MHD stability theory to efficiently compute the adjoint equilibrium problem.
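The discrete adjoint idea summarized above (a forward linear solve followed by one solve with the transposed matrix) can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the REGCOIL or SFINCS discretization: the matrices `A0` and `dA`, the right-hand sides `b0` and `db`, and the objective f = |Φ|²/2 are chosen only so that the example is self-contained. The adjoint gradient is compared against central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_params = 6, 4

# Hypothetical parameter-dependent linear system A(p) phi = b(p).
A0 = rng.normal(size=(n, n)) + n * np.eye(n)   # well-conditioned base matrix
dA = 0.1 * rng.normal(size=(n_params, n, n))   # dA/dp_i (A is linear in p)
b0 = rng.normal(size=n)
db = rng.normal(size=(n_params, n))            # db/dp_i (b is linear in p)


def assemble(p):
    """Build A(p) and b(p) for the toy problem."""
    A = A0 + np.tensordot(p, dA, axes=1)
    b = b0 + p @ db
    return A, b


def objective(p):
    """f(p) = 0.5 |phi(p)|^2 with phi the forward solution."""
    A, b = assemble(p)
    phi = np.linalg.solve(A, b)
    return 0.5 * phi @ phi


def adjoint_gradient(p):
    """Gradient of f using one forward and one transposed (adjoint) solve."""
    A, b = assemble(p)
    phi = np.linalg.solve(A, b)        # forward solve
    q = np.linalg.solve(A.T, phi)      # adjoint solve: A^T q = df/dphi = phi
    residual_derivs = dA @ phi - db    # rows: (dA/dp_i) phi - db/dp_i
    return -residual_derivs @ q        # df/dp_i = -q . residual_derivs_i


p = np.array([0.3, -0.2, 0.1, 0.5])
grad_adjoint = adjoint_gradient(p)

# Central finite differences need two forward solves per parameter.
eps = 1e-6
grad_fd = np.array([
    (objective(p + eps * e) - objective(p - eps * e)) / (2 * eps)
    for e in np.eye(n_params)
])
print(np.max(np.abs(grad_adjoint - grad_fd)))
```

The cost of the adjoint gradient is two linear solves regardless of the number of parameters, whereas the finite-difference loop scales with `n_params`.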
There are several natural extensions of the work presented in this Thesis.
• The advancement of the adjoint approach for functions of MHD equilibria necessitates the further development of a linearized equilibrium code, as outlined in Chapter 6. While we have demonstrated this technique for axisymmetric equilibria, we plan to extend it to 3D equilibria. In this way, adjoint methods for computing the shape gradient of the departure from quasi-symmetry (Section 5.5.6), effective ripple (Section 5.5.5), and several finite-collisionality neoclassical quantities (Section 5.5.7) could be demonstrated.
• In Chapter 3, we applied the adjoint method to compute derivatives with respect to the winding surface parameters. Similarly, we can apply the adjoint method to compute derivatives with respect to plasma surface parameters. This would allow for the identification of plasma surfaces that do not require overly complex coils, facilitating the incorporation of coil considerations in plasma configuration optimization [36]. Similar figures of merit (without derivative information) have been used in the ROSE code [59].
We have not yet taken full advantage of derivative information for stellarator optimization problems.
• The analysis of sensitivity and tolerances presented in this Thesis is based on a local model, using a linear approximation of a function with first derivative information. A more accurate global analysis can be computed from Monte-Carlo sampling, which typically requires many function evaluations to converge. Uncertainty quantification can be accelerated through the application of a surrogate model of the design space [238] with the incorporation of the uncertainty of the data. A surrogate model is an approximation to an expensive simulation based on a small number of evaluations of the function. The number of required evaluations to build the surrogate is reduced with a gradient-enhanced Gaussian process regression model [146]; thus the availability of adjoint-based gradients would enable more accurate uncertainty quantification. In addition to sensitivity analysis, once a surrogate is constructed, it can replace the expensive model during optimization, allowing for more efficient local or global optimization.
• In particular, one type of surrogate function of interest is a neural network, which can be trained more efficiently using derivative information. Neural networks with certain choices of activation functions are differentiable, and can therefore be optimized with gradient-based optimization techniques. Gradient-based shape optimization with neural networks has proven fruitful in the field of aerodynamics [222].
• Optimization under uncertainty methods optimize the expected value of an objective function by performing a sample average over a distribution of possible deviations. These techniques can improve the robustness of the optimum by avoiding small local minima and obtaining solutions with reduced risk. This technique has proven effective for the optimization of coil shapes with increased tolerances [150, 151], using a Monte-Carlo approach. To avoid the excessive cost of a Monte-Carlo method, a linear or quadratic approximation can be made such that the expectation value and variance can be computed with derivative information [3] obtained with an adjoint method.

We look forward to the adoption of adjoint methods and shape optimization tools for many stellarator design problems.

Appendix A: Toroidal coordinate systems
In this Appendix, we briefly review coordinate systems for describing scalar and vector fields in toroidal systems. Comprehensive introductions to this topic are provided in the textbook [54], the review article [97], and the tutorial [121].
A.1 Toroidal coordinates
In this Thesis, we often want to describe surfaces of toroidal topology or the volumes enclosed by such surfaces. We can describe the position on a toroidal surface by two angles (Figure A.1). A poloidal angle, denoted by $\theta$, increases by $2\pi$ upon one rotation the short way around the torus. A toroidal angle, denoted by $\phi$, increases by $2\pi$ upon one rotation the long way around the torus.

Figure A.1: The position on a toroidal surface, $S$, is described by the toroidal and poloidal angles. Figure adapted from [121].

We will consider a volume, $V$, bounded by a toroidal surface, $S$. Suppose that we use a set of continuously nested toroidal surfaces, $\Gamma(r)$, as a radial coordinate $r$, such that the position within this volume can be expressed as $\mathbf{x}(r, \theta, \phi)$. A vector field, $\mathbf{A}$, can be expressed in the basis of the gradients of the coordinates,
\[ \mathbf{A} = A_r \nabla r + A_\theta \nabla \theta + A_\phi \nabla \phi, \tag{A.1} \]
the covariant form, or the derivatives of the position vector with respect to the coordinates,
\[ \mathbf{A} = A^r \frac{\partial \mathbf{x}}{\partial r} + A^\theta \frac{\partial \mathbf{x}}{\partial \theta} + A^\phi \frac{\partial \mathbf{x}}{\partial \phi}, \tag{A.2} \]
the contravariant form.

Table A.1: Summary of formulas used to describe the geometry of a non-orthogonal coordinate system $(x^1, x^2, x^3)$. In the above, $\{i, j, k\}$ is a cyclic permutation of $\{1, 2, 3\}$. Table adapted from [121].
Jacobian: $\sqrt{g} = \left( \frac{\partial \mathbf{x}}{\partial x^i} \times \frac{\partial \mathbf{x}}{\partial x^j} \right) \cdot \frac{\partial \mathbf{x}}{\partial x^k} = \left( \left( \nabla x^i \times \nabla x^j \right) \cdot \nabla x^k \right)^{-1}$
Differential volume: $d^3x = |\sqrt{g}| \, dx^i \, dx^j \, dx^k$
Differential length: $d\mathbf{x} = \sum_{i=1}^{3} \frac{\partial \mathbf{x}}{\partial x^i} dx^i$
Differential surface area (constant $x^k$): $d^2x = |\sqrt{g}| \, |\nabla x^k| \, dx^i \, dx^j$
Divergence of vector field: $\nabla \cdot \mathbf{A} = \sum_{i=1}^{3} \frac{1}{\sqrt{g}} \frac{\partial}{\partial x^i} \left( \sqrt{g} A^i \right)$
Curl of vector field: $\nabla \times \mathbf{A} = \sum_{k=1}^{3} \frac{1}{\sqrt{g}} \left( \frac{\partial A_j}{\partial x^i} - \frac{\partial A_i}{\partial x^j} \right) \frac{\partial \mathbf{x}}{\partial x^k}$
Gradient of scalar: $\nabla q = \sum_{i=1}^{3} \frac{\partial q}{\partial x^i} \nabla x^i$

The two sets of basis vectors can be related through the dual relations,
\[ \frac{\partial \mathbf{x}}{\partial x^i} = \frac{\nabla x^j \times \nabla x^k}{\nabla x^i \cdot \nabla x^j \times \nabla x^k}, \tag{A.3} \]
where $(x^i, x^j, x^k) = (r, \theta, \phi)$ or cyclic permutations.
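The dual relations and the two Jacobian expressions can be checked numerically. The sketch below assumes a torus with circular cross-section, x = ((R0 + r cos θ) cos φ, (R0 + r cos θ) sin φ, r sin θ). This parameterization and the value of `R0` are illustrative choices, not a geometry used in the Thesis. The covariant basis vectors are written analytically, and the gradients are obtained by inverting the matrix of basis vectors.

```python
import numpy as np

R0 = 3.0  # assumed major radius of the illustrative torus


def basis_vectors(r, theta, phi):
    """Return dx/dr, dx/dtheta, dx/dphi for a circular-cross-section torus."""
    c, s = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(phi), np.sin(phi)
    R = R0 + r * c
    e_r = np.array([c * cp, c * sp, s])
    e_theta = np.array([-r * s * cp, -r * s * sp, r * c])
    e_phi = np.array([-R * sp, R * cp, 0.0])
    return e_r, e_theta, e_phi


r, theta, phi = 0.7, 1.1, 2.3
e = basis_vectors(r, theta, phi)

# Columns of J are dx/dx^i, so the rows of J^{-1} are the gradients grad x^i.
J = np.column_stack(e)
grad = np.linalg.inv(J)

# Jacobian from the derivative basis and its inverse from the gradient basis.
sqrt_g = e[0] @ np.cross(e[1], e[2])
inv_sqrt_g = grad[0] @ np.cross(grad[1], grad[2])

# Dual relation: dx/dx^i = (grad x^j x grad x^k) / (grad x^i . grad x^j x grad x^k).
dual = [
    np.cross(grad[(i + 1) % 3], grad[(i + 2) % 3]) / inv_sqrt_g
    for i in range(3)
]

print(sqrt_g, 1.0 / inv_sqrt_g)
print(max(np.max(np.abs(dual[i] - e[i])) for i in range(3)))
```

With this particular ordering (r, θ, φ) the triple product is negative, sqrt_g = -r (R0 + r cos θ), which is why the absolute value of the Jacobian appears in the differential volume and area elements.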
Such a coordinate system is generally non-orthogonal, so $\partial \mathbf{x}/\partial x^i$ is not necessarily parallel to $\nabla x^i$. Several useful relations in non-orthogonal coordinate systems are summarized in Table A.1. For a more detailed discussion, refer to Chapter 2 in [54].

A.2 Flux coordinates
If magnetic surfaces exist, indicating that the magnetic field is tangent to a set of continuously nested toroidal surfaces, we can use the toroidal flux through such surfaces as a coordinate, defined as,
\[ 2\pi\psi \equiv \int_{S_T(\psi)} d^2x \, \mathbf{B} \cdot \hat{\mathbf{n}}. \tag{A.4} \]
In the above expression, $S_T(\psi)$ is an open surface such that $\partial S_T(\psi)$ is a loop on $\Gamma(\psi)$ that closes after one poloidal rotation (Figure A.2). The unit normal is $\hat{\mathbf{n}}$, often chosen to point in the direction of increasing $\phi$.

Figure A.2: The plasma domain, $V_P$, is bounded by a toroidal surface, $S_P$. We make the assumption that there exists a set of toroidal magnetic surfaces, $\Gamma(\psi)$. The toroidal flux through each of these surfaces is defined by (A.4) with $S_T(\psi)$ an open surface bounded by a poloidally closed curve on $\Gamma(\psi)$, $\partial S_T(\psi)$.

Another choice for labeling magnetic surfaces is the poloidal flux function, $\chi$,
\[ 2\pi\chi \equiv \int_{S_P(\psi)} d^2x \, \mathbf{B} \cdot \hat{\mathbf{n}}, \tag{A.5} \]
where $S_P(\psi)$ is an open surface such that $\partial S_P(\psi)$ is a loop on $\Gamma(\psi)$ that closes after one toroidal rotation (Figure A.3).

Figure A.3: The poloidal flux through the magnetic surface, $\Gamma(\psi)$, is defined by (A.5) with $S_P(\psi)$ an open surface bounded by a toroidally closed curve on $\Gamma(\psi)$, $\partial S_P(\psi)$.

The rotational transform quantifies the number of poloidal turns of a field line per toroidal turn,
\[ \iota \equiv \lim_{n \to \infty} \frac{\sum_{k=1}^{n} (\Delta\theta)_k}{2\pi n}. \tag{A.6} \]
Here $(\Delta\theta)_k$ is the change in poloidal angle in toroidal rotation $k$ and $n$ counts the toroidal turns. If flux surfaces exist, then the rotational transform can be computed from the derivative of the poloidal flux with respect to the toroidal flux,
\[ \iota(\psi) = \chi'(\psi). \tag{A.7} \]
If a flux label, $\psi$, is used as one of the coordinates, known as a flux coordinate system, then the contravariant form for the magnetic field simplifies,
\[ \mathbf{B} = B^\theta \frac{\partial \mathbf{x}}{\partial \theta} + B^\phi \frac{\partial \mathbf{x}}{\partial \phi}, \tag{A.8} \]
from the assumption that $\mathbf{B} \cdot \nabla\psi = 0$.
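As a brief numerical aside on the limit in (A.6), the following sketch follows a field line on a single surface for many toroidal turns and accumulates the change in poloidal angle. The field-line equation dθ/dφ = ι0 + ε cos θ and the parameter values are a toy model assumed purely for illustration, not a configuration from the Thesis. For this model the average rate of poloidal rotation has the closed form sqrt(ι0² − ε²), which provides a check on the estimate.

```python
import math

IOTA0, EPS = 0.5, 0.3  # hypothetical model parameters (not from the Thesis)


def dtheta_dphi(theta):
    """Toy field-line equation d(theta)/d(phi) on a single surface."""
    return IOTA0 + EPS * math.cos(theta)


def rotational_transform(n_turns, steps_per_turn=64):
    """Estimate iota from (A.6) by following a field line for n_turns toroidal turns."""
    h = 2.0 * math.pi / steps_per_turn
    theta = 0.0
    for _ in range(n_turns * steps_per_turn):
        # Classical fourth-order Runge-Kutta step in the toroidal angle.
        k1 = dtheta_dphi(theta)
        k2 = dtheta_dphi(theta + 0.5 * h * k1)
        k3 = dtheta_dphi(theta + 0.5 * h * k2)
        k4 = dtheta_dphi(theta + h * k3)
        theta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    # theta started at zero, so it equals the accumulated sum of (Delta theta)_k.
    return theta / (2.0 * math.pi * n_turns)


iota_estimate = rotational_transform(n_turns=500)
iota_exact = math.sqrt(IOTA0**2 - EPS**2)  # closed form for this toy model only
print(iota_estimate, iota_exact)
```

The estimate converges like 1/n because the poloidal angle differs from ι φ only by a bounded periodic function, so many turns are needed for high accuracy when ι is computed directly from its definition.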
Given $\nabla \cdot \mathbf{B} = 0$ and using (A.3), we can express the magnetic field as,
\[ \mathbf{B} = \nabla\psi \times \nabla \left( \theta - \iota(\psi)\phi + \lambda(\psi, \theta, \phi) \right), \tag{A.9} \]
where $\lambda(\psi, \theta, \phi)$ is $2\pi$-periodic in $\theta$ and $\phi$ (Section 11.1 in [121]).

In a flux-coordinate system, the flux-surface average,
\[ \langle A \rangle_\psi = \frac{\int_0^{2\pi} d\theta \int_0^{2\pi} d\phi \, \sqrt{g} A}{V'(\psi)}, \tag{A.10} \]
appears in many calculations, where
\[ V'(\psi) = \int_0^{2\pi} d\theta \int_0^{2\pi} d\phi \, \sqrt{g}, \tag{A.11} \]
is the differential volume associated with a change in flux. The flux-surface average can be equivalently defined as the average over the infinitesimal volume between flux surfaces,
\[ \langle A \rangle_\psi = \lim_{\Delta V \to 0} \frac{1}{\Delta V} \left( \int_{V_P(\psi) + \Delta V} d^3x \, A - \int_{V_P(\psi)} d^3x \, A \right), \tag{A.12} \]
where $V_P(\psi)$ is the volume enclosed by a surface labeled by $\psi$ and $V_P(\psi) + \Delta V$ is the volume of a neighboring surface. The flux-surface average is discussed in more detail in Section 4.9 of [54].

A.3 Magnetic coordinates
A flux coordinate system can be defined with many choices of poloidal and toroidal angles. With some choices of these angles, the contravariant expression for the magnetic field can simplify further. Given (A.9), the definition of the poloidal and toroidal angles can be shifted to $\vartheta$ and $\varphi$ such that the magnetic field can be expressed as,
\[ \mathbf{B} = \nabla\psi \times \nabla \left( \vartheta - \iota(\psi)\varphi \right). \tag{A.13} \]
Such angles define a magnetic coordinate system. For example, one choice is $\vartheta = \theta + \lambda(\psi, \theta, \phi)$ and $\varphi = \phi$. For any choice of $\varphi$, there is a corresponding choice of $\vartheta$ that defines a magnetic coordinate system. With this choice of angles, the magnetic field lines are said to be straight in the $\vartheta$-$\varphi$ plane,
\[ \frac{d\vartheta(l)}{d\varphi(l)} = \frac{\mathbf{B} \cdot \nabla\vartheta}{\mathbf{B} \cdot \nabla\varphi} = \iota(\psi), \tag{A.14} \]
with a slope given by the rotational transform. Here $l$ measures length along a field line such that $df/dl = \hat{\mathbf{b}} \cdot \nabla f$ for any quantity $f$, where $\hat{\mathbf{b}} = \mathbf{B}/B$ is the unit vector in the direction of the magnetic field.

From the covariant form for the magnetic field,
\[ \mathbf{B} = B_\vartheta \nabla\vartheta + B_\varphi \nabla\varphi + B_\psi \nabla\psi, \tag{A.15} \]
we can compute the net toroidal and poloidal currents enclosed by the surface labeled by $\psi$,
\[ I_T(\psi) \equiv \int_{S_T(\psi)} d^2x \, \mathbf{J} \cdot \hat{\mathbf{n}} = \frac{1}{\mu_0} \oint_{\partial S_T(\psi)} d\mathbf{l} \cdot \mathbf{B} = \frac{1}{\mu_0} \int_0^{2\pi} d\vartheta \, B_\vartheta \tag{A.16a} \]
\[ I_P(\psi) \equiv \int_{S_P(\psi)} d^2x \, \mathbf{J} \cdot \hat{\mathbf{n}} = \frac{1}{\mu_0} \oint_{\partial S_P(\psi)} d\mathbf{l} \cdot \mathbf{B} = \frac{1}{\mu_0} \int_0^{2\pi} d\varphi \, B_\varphi, \tag{A.16b} \]
where $S_T$ is defined in Figure A.2 and $S_P$ is defined in Figure A.3. Under the additional assumption that $\mathbf{J} \cdot \nabla\psi = 0$, which follows from MHD force balance (1.3a) with $p(\psi)$, we can write the covariant form as,
\[ \mathbf{B} = I(\psi) \nabla\vartheta + G(\psi) \nabla\varphi + K(\psi, \vartheta, \varphi) \nabla\psi + \nabla H(\psi, \vartheta, \varphi), \tag{A.17} \]
where $I(\psi) = \mu_0 I_T(\psi)/(2\pi)$ and $G(\psi) = \mu_0 I_P(\psi)/(2\pi)$. See Section 2.5 in [97], Section 9.2 in [121], and Chapter 6.5 of [54] for details.

A.4 Boozer coordinates
As previously mentioned, there are many choices of magnetic coordinates corresponding to different choices of toroidal angle, $\varphi$. Suppose we begin with a system defined by $(\psi, \vartheta, \varphi)$ and want to transform to a system defined by $(\psi, \vartheta', \varphi')$. In order for the primed system to remain a magnetic coordinate system, we must have $\varphi' = \varphi + \gamma(\psi, \vartheta, \varphi)$ and $\vartheta' = \vartheta + \iota(\psi)\gamma(\psi, \vartheta, \varphi)$, where $\gamma(\psi, \vartheta, \varphi)$ is $2\pi$-periodic in $\vartheta$ and $\varphi$. To construct the Boozer coordinate system [23], we will make a particular choice for $\gamma$ to simplify the covariant form for the magnetic field (A.17). The corresponding changes to the quantities appearing in the covariant form (A.17) are
\[ H' = H - \left( \iota(\psi) I(\psi) + G(\psi) \right) \gamma(\psi, \vartheta, \varphi) \tag{A.18a} \]
\[ K' = K + \gamma(\psi, \vartheta, \varphi) \left( \iota(\psi) I'(\psi) + G'(\psi) \right). \tag{A.18b} \]
Boozer coordinates are defined such that $H' = 0$, or $\gamma(\psi, \vartheta, \varphi) = H(\psi, \vartheta, \varphi)/\left( \iota(\psi) I(\psi) + G(\psi) \right)$. With this choice of transformation, we will denote $\vartheta_B = \vartheta + \iota\gamma$ and $\varphi_B = \varphi + \gamma$. The covariant form becomes,
\[ \mathbf{B} = I(\psi) \nabla\vartheta_B + G(\psi) \nabla\varphi_B + K(\psi, \vartheta_B, \varphi_B) \nabla\psi. \tag{A.19} \]
By dotting the covariant form (A.19) with the contravariant form (A.13), which gives $B^2 = \left( G(\psi) + \iota(\psi) I(\psi) \right) \nabla\psi \times \nabla\vartheta_B \cdot \nabla\varphi_B$, we obtain an expression for the Jacobian,
\[ \sqrt{g} = \frac{1}{\nabla\psi \times \nabla\vartheta_B \cdot \nabla\varphi_B} = \frac{G(\psi) + \iota(\psi) I(\psi)}{B^2}. \tag{A.20} \]
We note that the Jacobian only varies on a surface through the magnetic field strength; thus each of the contravariant and covariant components of the magnetic field, except for $K(\psi, \vartheta_B, \varphi_B)$, possesses the same property. (The radial covariant component, $K(\psi, \vartheta_B, \varphi_B)$, is related to the field strength through the MHD force balance equation (1.3a).)
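For this reason, the Boozer coordinate system is extremely convenient for analyzing guiding center motion and neoclassical transport, as we do in Chapter 4.

Appendix B: Justification for current potential

In this Appendix, we justify the form for a continuous current density supported on a toroidal surface, $S_C$,
\[ \mathbf{J}_C(\theta, \phi) = \hat{\mathbf{n}} \times \nabla\Phi, \tag{B.1} \]
where $\hat{\mathbf{n}}$ is the unit normal vector.

We consider an extension of $\mathbf{J}_C$ in a neighborhood of $S_C$ of width $\Delta b$,
\[ \widetilde{\mathbf{J}}_C(b, \widetilde{\theta}, \widetilde{\phi}) = \mathbf{J}_C(\theta, \phi), \tag{B.2} \]
where we define extensions of $\theta$ and $\phi$ as,
\[ \widetilde{\theta}(\mathbf{x}) = \theta\left( \mathbf{x} - b(\mathbf{x}) \nabla b \right) \tag{B.3a} \]
\[ \widetilde{\phi}(\mathbf{x}) = \phi\left( \mathbf{x} - b(\mathbf{x}) \nabla b \right), \tag{B.3b} \]
or a normal projection onto $S_C$. We consider $b \in [-\Delta b, \Delta b]$ to be a "thickened" region of continuous current density. We impose the constraint that $\nabla \cdot \widetilde{\mathbf{J}}_C = 0$, expressed in the $(b, \widetilde{\theta}, \widetilde{\phi})$ coordinate system (Table A.1),
\[ \frac{1}{\sqrt{g}} \left( \frac{\partial \left( \sqrt{g} \, \widetilde{\mathbf{J}}_C \cdot \nabla b \right)}{\partial b} + \frac{\partial \left( \sqrt{g} \, \widetilde{\mathbf{J}}_C \cdot \nabla \widetilde{\theta} \right)}{\partial \widetilde{\theta}} + \frac{\partial \left( \sqrt{g} \, \widetilde{\mathbf{J}}_C \cdot \nabla \widetilde{\phi} \right)}{\partial \widetilde{\phi}} \right) = 0, \tag{B.4} \]
where $\sqrt{g} = \partial \mathbf{x}/\partial b \cdot \left( \partial \mathbf{x}/\partial \widetilde{\theta} \times \partial \mathbf{x}/\partial \widetilde{\phi} \right)$. By the definition of our extension, the first term will vanish. In the limit that $\Delta b \to 0$, the divergence-free condition is expressed as,
\[ \nabla_\Gamma \cdot \mathbf{J}_C \equiv \frac{1}{\sqrt{g}} \left( \frac{\partial \left( \sqrt{g} J^\theta \right)}{\partial \theta} + \frac{\partial \left( \sqrt{g} J^\phi \right)}{\partial \phi} \right) = 0, \tag{B.5} \]
where we have expressed the current in the contravariant basis as $\mathbf{J}_C = J^\theta \, \partial \mathbf{x}/\partial \theta + J^\phi \, \partial \mathbf{x}/\partial \phi$ and $\nabla_\Gamma \cdot$ is the surface divergence (Appendix 3 in [229]). For a continuous current density, Ampere's law (1.3b) implies that $\nabla \cdot \mathbf{J} = 0$. Thus the equivalent condition for a current supported on a surface is $\nabla_\Gamma \cdot \mathbf{J}_C = 0$ [11]. The surface divergence of a vector field tangent to a surface $\Gamma$ ($\mathbf{A} \cdot \hat{\mathbf{n}} = 0$ on $\Gamma$), defined in terms of a general continuous extension, $\widetilde{\mathbf{A}}$, in a neighborhood of $\Gamma$ is,
\[ \nabla_\Gamma \cdot \mathbf{A} \equiv \left( \nabla \cdot \widetilde{\mathbf{A}} \right) \Big|_\Gamma - \hat{\mathbf{n}} \cdot \left( \nabla \widetilde{\mathbf{A}} \right) \Big|_\Gamma \cdot \hat{\mathbf{n}}. \tag{B.6} \]
In (B.2), we have defined our extension such that $\nabla b \cdot \left( \nabla \widetilde{\mathbf{J}}_C \right) = 0$, so that the second term in the above expression vanishes.

Given (B.5), we can write,
\[ J^\theta = -\frac{1}{\sqrt{g}} \frac{\partial \Phi(\theta, \phi)}{\partial \phi} \tag{B.7a} \]
\[ J^\phi = \frac{1}{\sqrt{g}} \frac{\partial \Phi(\theta, \phi)}{\partial \theta}, \tag{B.7b} \]
where,
\[ \Phi = \int d\theta \, \sqrt{g} J^\phi. \tag{B.7c} \]
In other words,
\[ \mathbf{J}_C = \hat{\mathbf{n}} \times \nabla\Phi. \tag{B.8} \]

Appendix C: Adjoint derivative at fixed $J_{\max}$

We enforce $J_{\max}$ = constant in the REGCOIL solve in order to obtain the regularization parameter $\lambda$ by requiring that the following constraint be satisfied within a given tolerance,
\[ G\left( \Omega, \vec{\Phi}(\Omega, \lambda) \right) = J_{\max}\left( \Omega, \vec{\Phi}(\Omega, \lambda) \right) - J_{\max}^{\mathrm{target}} = 0. \tag{C.1} \]
Here $J_{\max}^{\mathrm{target}}$ is the target maximum current density and $\vec{\Phi}$ is chosen to satisfy the forward equation (3.8),
\[ \vec{F}\left( \Omega, \vec{\Phi}, \lambda \right) = \overleftrightarrow{A}(\Omega, \lambda) \vec{\Phi} - \vec{b}(\Omega, \lambda) = 0. \tag{C.2} \]
A log-sum-exponent function is used to approximate the maximum function, similar to that used to approximate $d_{\text{coil-plasma}}$ (3.24),
\[ J_{\max} \approx J_{\max,\mathrm{lse}} = \frac{1}{p} \log \left( \frac{\int_{S_C} d^2x \, \exp(pJ)}{A_{\mathrm{coil}}} \right). \tag{C.3} \]
We compute the total differential of $\vec{F}$,
\[ d\vec{F}(\Omega, \vec{\Phi}, \lambda) = \sum_{m,n} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) d\Omega_{m,n} + \overleftrightarrow{A} \, d\vec{\Phi} + \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right) d\lambda = 0. \tag{C.4} \]
Here $\overleftrightarrow{A}_K = \partial \overleftrightarrow{A}/\partial \lambda$ and $\vec{b}_K = \partial \vec{b}/\partial \lambda$. We left multiply by $\overleftrightarrow{A}^{-1}$ and solve for $d\vec{\Phi}$ such that $d\vec{F}(\Omega, \vec{\Phi}, \lambda) = 0$,
\[ d\vec{\Phi} = -\sum_{m,n} \overleftrightarrow{A}^{-1} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) d\Omega_{m,n} - \overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right) d\lambda. \tag{C.5} \]
We also compute the total differential of $G$,
\[ dG(\Omega, \vec{\Phi}) = \sum_{m,n} \frac{\partial G(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} d\Omega_{m,n} + \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot d\vec{\Phi} = 0. \tag{C.6} \]
Using the form for $d\vec{\Phi}$ (C.5), we compute $d\lambda$ in terms of $d\Omega_{m,n}$,
\[ d\lambda = \left( \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \left[ \overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right) \right] \right)^{-1} \sum_{m,n} \left( \frac{\partial G(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} - \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \overleftrightarrow{A}^{-1} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) \right) d\Omega_{m,n}. \tag{C.7} \]
Using (C.5) and (C.7), the derivative of $\vec{\Phi}$ with respect to $\Omega_{m,n}$ subject to equations (C.1) and (C.2) is given by the following expression,
\[ \frac{\partial \vec{\Phi}(\Omega, \lambda(\Omega))}{\partial \Omega_{m,n}} = -\overleftrightarrow{A}^{-1} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) - \frac{\overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right)}{\frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \left[ \overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right) \right]} \left( \frac{\partial G(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} - \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \overleftrightarrow{A}^{-1} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) \right). \tag{C.8} \]
Here $\vec{\Phi}$ is understood to be a function of $\Omega$ and $\lambda$ through (C.2) and $\lambda$ is understood to be a function of $\Omega$ through (C.1).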
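The constrained derivative (C.8) can be checked on a small toy problem. The sketch below is not the REGCOIL system. The matrices are random stand-ins, the constraint G = |Φ|² − target replaces the J_max constraint (so that ∂G/∂Ω = 0 explicitly), and the Newton iteration used to re-solve for λ in the finite-difference comparison is an assumption made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_omega = 5, 3

# Hypothetical stand-ins for the quantities in (C.2); all are assumptions.
M = rng.normal(size=(n, n))
A0 = M @ M.T + n * np.eye(n)                 # symmetric positive definite base
S = 0.05 * rng.normal(size=(n_omega, n, n))
dA = S + np.transpose(S, (0, 2, 1))          # dA/dOmega_i (symmetric)
A_K = np.eye(n)                              # dA/dlambda
b0 = rng.normal(size=n)
db = 0.1 * rng.normal(size=(n_omega, n))     # db/dOmega_i
b_K = np.zeros(n)                            # db/dlambda


def solve_forward(omega, lam):
    """Assemble A(omega, lambda) and solve the forward equation for phi."""
    A = A0 + np.tensordot(omega, dA, axes=1) + lam * A_K
    b = b0 + omega @ db + lam * b_K
    return A, np.linalg.solve(A, b)


def G(phi, target):
    """Toy constraint G = |phi|^2 - target (no explicit Omega dependence)."""
    return phi @ phi - target


def solve_lambda(omega, lam, target, iters=30):
    """Newton iteration on lambda so that G(phi(omega, lambda)) = 0."""
    for _ in range(iters):
        A, phi = solve_forward(omega, lam)
        dphi_dlam = -np.linalg.solve(A, A_K @ phi - b_K)
        lam -= G(phi, target) / (2.0 * phi @ dphi_dlam)
    return lam


def dphi_domega(omega, lam):
    """Derivative of phi at fixed G, following (C.5)-(C.8)."""
    A, phi = solve_forward(omega, lam)
    dG_dphi = 2.0 * phi
    r_omega = dA @ phi - db                       # rows: dA/dOmega_i phi - db/dOmega_i
    r_K = A_K @ phi - b_K
    u_omega = np.linalg.solve(A, r_omega.T).T     # rows: A^{-1} r_omega_i
    u_K = np.linalg.solve(A, r_K)
    # (C.7) with dG/dOmega_i = 0 for this toy constraint.
    dlam = (0.0 - u_omega @ dG_dphi) / (dG_dphi @ u_K)
    return -u_omega - np.outer(dlam, u_K)         # (C.8), one row per Omega_i


omega0, lam0 = np.array([0.2, -0.1, 0.3]), 0.5
_, phi0 = solve_forward(omega0, lam0)
target = phi0 @ phi0                              # makes lam0 satisfy the constraint
analytic = dphi_domega(omega0, lam0)

# Finite differences in Omega, re-solving for lambda at each perturbed point.
eps = 1e-5
finite_diff = np.empty_like(analytic)
for i, e in enumerate(np.eye(n_omega)):
    phis = []
    for sign in (+1.0, -1.0):
        om = omega0 + sign * eps * e
        phis.append(solve_forward(om, solve_lambda(om, lam0, target))[1])
    finite_diff[i] = (phis[0] - phis[1]) / (2.0 * eps)

print(np.max(np.abs(analytic - finite_diff)))
```

In this direct form, a linear solve with the forward matrix is needed for every parameter (the rows of `u_omega`), which is the cost that motivates an adjoint formulation.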
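We use the adjoint method to avoid solving a linear system involving the operator $\overleftrightarrow{A}$ for each $\Omega_{m,n}$,
\[ \frac{\partial \vec{\Phi}(\Omega, \lambda(\Omega))}{\partial \Omega_{m,n}} = -\overleftrightarrow{A}^{-1} \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) - \frac{\overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right)}{\frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \left[ \overleftrightarrow{A}^{-1} \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right) \right]} \left( \frac{\partial G(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} - \left[ \left( \overleftrightarrow{A}^T \right)^{-1} \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \right] \cdot \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) \right). \tag{C.9} \]
We introduce a new adjoint vector $\vec{\widetilde{q}}$, defined to be the solution of,
\[ \overleftrightarrow{A}^T \vec{\widetilde{q}} = \frac{\partial G(\Omega, \vec{\Phi})}{\partial \vec{\Phi}}. \tag{C.10} \]
Equation (C.9) is then used to compute the derivatives of $\chi^2_B$ with respect to $\Omega_{m,n}$,
\[ \frac{\partial \chi^2_B\left( \Omega, \vec{\Phi}(\Omega, \lambda(\Omega)) \right)}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} + \frac{\partial \chi^2_B(\Omega, \vec{\Phi})}{\partial \vec{\Phi}} \cdot \frac{\partial \vec{\Phi}(\Omega, \lambda(\Omega))}{\partial \Omega_{m,n}}. \tag{C.11} \]
This result can be written in terms of both adjoint variables, $\vec{q}$ and $\vec{\widetilde{q}}$,
\[ \frac{\partial \chi^2_B\left( \Omega, \vec{\Phi}(\Omega, \lambda(\Omega)) \right)}{\partial \Omega_{m,n}} = \frac{\partial \chi^2_B(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} - \vec{q} \cdot \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) - \frac{\vec{q} \cdot \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right)}{\vec{\widetilde{q}} \cdot \left( \overleftrightarrow{A}_K \vec{\Phi} - \vec{b}_K \right)} \left( \frac{\partial G(\Omega, \vec{\Phi})}{\partial \Omega_{m,n}} - \vec{\widetilde{q}} \cdot \left( \frac{\partial \overleftrightarrow{A}(\Omega, \lambda)}{\partial \Omega_{m,n}} \vec{\Phi} - \frac{\partial \vec{b}(\Omega, \lambda)}{\partial \Omega_{m,n}} \right) \right). \tag{C.12} \]
The same method is used to compute derivatives of $\|\mathbf{J}\|$. So, to obtain the derivatives at fixed $J_{\max}$, we compute a solution to the two adjoint equations, (3.22) and (C.10), in addition to the forward equation, (3.8).

Appendix D: Trajectory models

In the SFINCS coordinate system, the DKE can be written in the following way,
\[ \dot{\mathbf{x}} \cdot \nabla f_s + \dot{X}_s \frac{\partial f_s}{\partial X_s} + \dot{\xi}_s \frac{\partial f_s}{\partial \xi_s} - C_s(f_s) = -\left( \mathbf{v}_{ms} \cdot \nabla\psi \right) \frac{\partial f_{Ms}}{\partial \psi}. \tag{D.1} \]
To obtain the trajectory coefficients ($\dot{\mathbf{x}}$, $\dot{X}_s$, and $\dot{\xi}_s$) several approximations are made.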
For example, any terms that require radial coupling ($\psi$ derivatives of $f_s$) cannot be retained, as this would necessitate solving a five-dimensional system.

Under the full trajectory model, the trajectory coefficients are chosen such that $\mu$ conservation is maintained as radial coupling is dropped,
\[ \dot{\mathbf{x}} = v_\| \hat{\mathbf{b}} + \frac{\Phi'(\psi)}{B^2} \mathbf{B} \times \nabla\psi \tag{D.2a} \]
\[ \dot{X}_s = -\left( \mathbf{v}_{ms} \cdot \nabla\psi \right) \frac{q_s}{2 T_s X_s} \Phi'(\psi) \tag{D.2b} \]
\[ \dot{\xi}_s = -\frac{1 - \xi_s^2}{2 B \xi_s} v_\| \hat{\mathbf{b}} \cdot \nabla B + \xi_s \left( 1 - \xi_s^2 \right) \frac{1}{2 B^3} \Phi'(\psi) \, \mathbf{B} \times \nabla\psi \cdot \nabla B. \tag{D.2c} \]
Under the DKES trajectory model, the $E \times B$ velocity is taken to be divergenceless,
\[ \mathbf{v}_E^{\mathrm{DKES}} = \frac{\mathbf{B} \times \nabla\Phi}{\langle B^2 \rangle_\psi}, \tag{D.3} \]
where the flux surface average of a quantity is (4.8). Under the DKES trajectory model, the trajectory coefficients are taken to be,
\[ \dot{\mathbf{x}} = v_\| \hat{\mathbf{b}} + \frac{1}{\langle B^2 \rangle_\psi} \Phi'(\psi) \, \mathbf{B} \times \nabla\psi \tag{D.4a} \]
\[ \dot{X}_s = 0 \tag{D.4b} \]
\[ \dot{\xi}_s = -\frac{1 - \xi_s^2}{2 B \xi_s} v_\| \hat{\mathbf{b}} \cdot \nabla B. \tag{D.4c} \]
These effective trajectories are adopted in the widely-used DKES code [113, 230].

Appendix E: Adjoint collision operator

We want to find an adjoint collision operator, $C_s^\dagger$, that satisfies the following relation,
\[ \left\langle \int d^3v \, \frac{g_s C_s(f_s)}{f_{Ms}} \right\rangle_\psi = \left\langle \int d^3v \, \frac{f_s C_s^\dagger(g_s)}{f_{Ms}} \right\rangle_\psi. \tag{E.1} \]
The linearized Fokker-Planck collision operator can be written as,
\[ C_s(f_s) = \sum_{s'} C^L_{ss'}(f_s, f_{s'}) = \sum_{s'} C_{ss'}(f_s, f_{Ms'}) + C_{ss'}(f_{Ms}, f_{s'}), \tag{E.2} \]
where the sum over $s'$ runs over species. The first term on the right hand side of (E.2) is referred to as the test-particle collision operator, $C^T_{ss'}(f_s) = C_{ss'}(f_s, f_{Ms'})$, and the second the field-particle collision operator, $C^F_{ss'}(f_{s'}) = C_{ss'}(f_{Ms}, f_{s'})$.
The test and field terms satisfy the following relations [198, 221],
\[ \int d^3v \, \frac{g_s C_{ss'}(f_s, f_{Ms'})}{f_{Ms}} = \int d^3v \, \frac{f_s C_{ss'}(g_s, f_{Ms'})}{f_{Ms}} \tag{E.3a} \]
\[ \int d^3v \, \frac{g_s C_{ss'}(f_{Ms}, f_{s'})}{f_{Ms}} = \frac{T_{s'}}{T_s} \int d^3v \, \frac{f_{s'} C_{s's}(f_{Ms'}, g_s)}{f_{Ms'}}. \tag{E.3b} \]
For collisions between species of the same temperature, we see that $C_s(f_s)$ is self-adjoint. The adjoint operator with respect to the inner product (4.14) is thus,
\[ C_s^\dagger = C^T_s + \sum_{s'} \frac{f_{Ms}}{f_{Ms'}} \frac{T_{s'}}{T_s} C^F_{s's}. \tag{E.4} \]

Appendix F: Adjoint collisionless trajectories

We want to find an adjoint operator, $L_s^\dagger$, that satisfies,
\[ \left\langle \int d^3v \, \frac{g_s L_s f_s}{f_{Ms}} \right\rangle_\psi = \left\langle \int d^3v \, \frac{f_s L_s^\dagger g_s}{f_{Ms}} \right\rangle_\psi, \tag{F.1} \]
for both trajectory models, where $L_s$ is defined in (4.10) with (D.4) for the DKES trajectory model and (D.2) for the full trajectory model. Throughout we use the velocity space element in SFINCS coordinates, $d^3v = 2\pi v_{ts}^3 X_s^2 \, d\xi_s \, dX_s$.

F.0.1 DKES trajectories
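The operator under consideration is,
\[ L_s = v_\| \hat{\mathbf{b}} \cdot \nabla + \mathbf{v}_E^{\mathrm{DKES}} \cdot \nabla - \frac{1 - \xi_s^2}{2 B \xi_s} v_\| \hat{\mathbf{b}} \cdot \nabla B \frac{\partial}{\partial \xi_s}. \tag{F.2} \]
Considering the contribution of the streaming term in (F.2) to the left hand side of (F.1) we obtain,
\[ \left\langle \int d^3v \, \frac{g_s v_\| \hat{\mathbf{b}} \cdot \nabla f_s}{f_{Ms}} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s v_\| \mathbf{B} \cdot \nabla \left( g_s/B \right)}{f_{Ms}} \right\rangle_\psi. \tag{F.3} \]
Here the identity $\langle \nabla \cdot \mathbf{Q} \rangle_\psi = 1/V'(\psi) \, \partial/\partial\psi \left( V'(\psi) \langle \mathbf{Q} \cdot \nabla\psi \rangle_\psi \right)$ for any vector $\mathbf{Q}$ has been used. We next consider the contribution of the $E \times B$ drift term in (F.2),
\[ \left\langle \int d^3v \, \frac{g_s \mathbf{v}_E^{\mathrm{DKES}} \cdot \nabla f_s}{f_{Ms}} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s \mathbf{v}_E^{\mathrm{DKES}} \cdot \nabla g_s}{f_{Ms}} \right\rangle_\psi. \tag{F.4} \]
Here we have used the identity,
\[ \langle \mathbf{B} \times \nabla\psi \cdot \nabla w \rangle_\psi = 0, \tag{F.5} \]
for any $w$. We consider the contribution of the mirror-force term in (F.2),
\[ \left\langle \int d^3v \, \frac{g_s \dot{\xi}_s}{f_{Ms}} \frac{\partial f_s}{\partial \xi_s} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s \dot{\xi}_s}{f_{Ms}} \frac{\partial g_s}{\partial \xi_s} \right\rangle_\psi - \left\langle \int d^3v \, \frac{v_\|}{B} \hat{\mathbf{b}} \cdot \nabla B \, \frac{g_s f_s}{f_{Ms}} \right\rangle_\psi. \tag{F.6} \]
Combining (F.3)-(F.6), we obtain
\[ \left\langle \int d^3v \, \frac{g_s L_s f_s}{f_{Ms}} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s L_s g_s}{f_{Ms}} \right\rangle_\psi. \tag{F.7} \]
Therefore, in the DKES trajectory model we obtain (4.27).

F.0.2 Full trajectories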
The operator under consideration for the full model is,
\[ L_s = v_\| \hat{\mathbf{b}} \cdot \nabla + \mathbf{v}_E \cdot \nabla + \frac{\left( 1 + \xi_s^2 \right) X_s}{2B} \mathbf{v}_E \cdot \nabla B \frac{\partial}{\partial X_s} - \frac{1 - \xi_s^2}{2 B \xi_s} v_\| \hat{\mathbf{b}} \cdot \nabla B \frac{\partial}{\partial \xi_s} + \frac{\xi_s \left( 1 - \xi_s^2 \right)}{2B} \mathbf{v}_E \cdot \nabla B \frac{\partial}{\partial \xi_s}. \tag{F.8} \]
The contribution to (F.1) from the streaming term in (F.8) is identical to that in the case of the DKES trajectory model, (F.3). We next consider the contribution from the $E \times B$ drift term in (F.8),
\[ \left\langle \int d^3v \, \frac{g_s \mathbf{v}_E \cdot \nabla f_s}{f_{Ms}} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s B^2 \mathbf{v}_E \cdot \nabla \left( g_s/B^2 \right)}{f_{Ms}} \right\rangle_\psi, \tag{F.9} \]
again using (F.5). The contribution from the $\dot{X}_s$ term in (F.8) is,
\[ \left\langle \int d^3v \, \frac{g_s \dot{X}_s}{f_{Ms}} \frac{\partial f_s}{\partial X_s} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s \dot{X}_s}{f_{Ms}} \frac{\partial g_s}{\partial X_s} \right\rangle_\psi - \left\langle \int d^3v \, \frac{\left( 3 + 2 X_s^2 \right) \left( 1 + \xi_s^2 \right) g_s f_s}{2 f_{Ms} B} \mathbf{v}_E \cdot \nabla B \right\rangle_\psi. \tag{F.10} \]
The contribution from the mirror term in (F.8) is the same as in the case of the DKES trajectory model (F.6). We consider the contribution from the final term in (F.8),
\[ \left\langle \int d^3v \, \frac{g_s \xi_s \left( 1 - \xi_s^2 \right) \mathbf{v}_E \cdot \nabla B}{2 B f_{Ms}} \frac{\partial f_s}{\partial \xi_s} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s \xi_s \left( 1 - \xi_s^2 \right) \mathbf{v}_E \cdot \nabla B}{2 B f_{Ms}} \frac{\partial g_s}{\partial \xi_s} \right\rangle_\psi - \left\langle \int d^3v \, \frac{\left( 1 - 3 \xi_s^2 \right) \mathbf{v}_E \cdot \nabla B \, f_s g_s}{2 B f_{Ms}} \right\rangle_\psi. \tag{F.11} \]
Combining (F.3), (F.9), (F.10), (F.6), and (F.11), we obtain
\[ \left\langle \int d^3v \, \frac{g_s L_s f_s}{f_{Ms}} \right\rangle_\psi = -\left\langle \int d^3v \, \frac{f_s L_s g_s}{f_{Ms}} \right\rangle_\psi + \frac{\Phi'(\psi) q_s}{T_s} \left\langle \int d^3v \, \left( \mathbf{v}_{ms} \cdot \nabla\psi \right) \frac{f_s g_s}{f_{Ms}} \right\rangle_\psi. \tag{F.12} \]
Therefore, under the full trajectory model we obtain (4.28).

Appendix G: Symmetry of the sensitivity function

In this Appendix we discuss several symmetry properties of the local sensitivity function, $S_{\mathcal{R}}$, defined through (4.41). The arguments that follow are similar to those in Appendix C of [138]. Throughout we will assume that $B$ is stellarator symmetric and $N_P$ symmetric. We will show that this implies $N_P$ symmetry of $S_{\mathcal{R}}$. In the limit that $E_r \to 0$, $S_{\mathcal{R}}$ also has stellarator symmetry.

G.0.1 Symmetry of $S_{\mathcal{R}}$ implied by Fourier derivatives

First we would like to show that $S_{\mathcal{R}}$ is stellarator symmetric if and only if $\partial \mathcal{R}/\partial B^s_{m,n} = 0$ for all $m$ and $n$, where we express $B$ in a Fourier series,
\[ B = \sum_{m,n} B^c_{m,n} \cos(m\vartheta_B - n\varphi_B) + B^s_{m,n} \sin(m\vartheta_B - n\varphi_B). \tag{G.1} \]
The perturbation, $\delta B$, is decomposed similarly. We begin with the "if" portion of the argument. From (4.41) we have,
\[ \frac{\partial \mathcal{R}}{\partial B^s_{m,n}} = V'(\psi)^{-1} \int_0^{2\pi} d\vartheta_B \int_0^{2\pi} d\varphi_B \, \sqrt{g} S_{\mathcal{R}} \sin(m\vartheta_B - n\varphi_B). \tag{G.2} \]
Suppose $\partial \mathcal{R}/\partial B^s_{m,n} = 0$ for all $m$ and $n$. The quantity $(\sqrt{g} S_{\mathcal{R}})$ can be represented as a Fourier series,
\[ \left( \sqrt{g} S_{\mathcal{R}} \right) = \sum_{m,n} A^c_{m,n} \cos(m\vartheta_B - n\varphi_B) + A^s_{m,n} \sin(m\vartheta_B - n\varphi_B). \tag{G.3} \]
From (G.2), we see that $A^s_{m,n} = 0$ for all $m$ and $n$. Thus the quantity $(\sqrt{g} S_{\mathcal{R}})$ must be even under the transformation $(\vartheta_B, \varphi_B) \to (-\vartheta_B, -\varphi_B)$. We now note that $\sqrt{g}$ must be even from (4.37) under the assumption that $B$ is stellarator symmetric. Therefore $S_{\mathcal{R}}$ must be stellarator symmetric, assuming that $\sqrt{g}$ does not vanish anywhere, which must be the case for any well-defined coordinate transformation.

We continue with the "only if" portion of the argument. Suppose $S_{\mathcal{R}}$ is stellarator symmetric. As $\sqrt{g}$ is also stellarator symmetric, $(\sqrt{g} S_{\mathcal{R}})$ can be expressed in a Fourier series as (G.3) with $A^s_{m,n} = 0$ for all $m$ and $n$. Thus from (G.2) $\partial \mathcal{R}/\partial B^s_{m,n} = 0$ for all $m$ and $n$.

We next show that if $B$ is $N_P$ symmetric, then $S_{\mathcal{R}}$ is $N_P$ symmetric if and only if $\partial \mathcal{R}/\partial B^c_{m,n} = 0$ for all $n$ that are not integer multiples of $N_P$. We begin with the "if" portion of the argument. From (4.41),
\[ \frac{\partial \mathcal{R}}{\partial B^c_{m,n}} = V'(\psi)^{-1} \int_0^{2\pi} d\vartheta_B \int_0^{2\pi} d\varphi_B \, \sqrt{g} S_{\mathcal{R}} \cos(m\vartheta_B - n\varphi_B). \tag{G.4} \]
Suppose $\partial \mathcal{R}/\partial B^c_{m,n} = 0$ for all $n$ which are not integer multiples of $N_P$. Here $(\sqrt{g} S_{\mathcal{R}})$ can be expressed in a Fourier series as (G.3) with $A^s_{m,n} = 0$ for all $m$ and $n$.
Inserting the Fourier series into (G.4), we find that $A^c_{m,n} = 0$ for all $n$ that are not integer multiples of $N_P$. Thus $(\sqrt{g} S_{\mathcal{R}})$ must be $N_P$ symmetric. As $\sqrt{g}$ must be $N_P$ symmetric, this implies $S_{\mathcal{R}}$ possesses the same symmetry.

Next we consider the "only if" portion of the argument. Suppose that $S_{\mathcal{R}}$ is $N_P$ symmetric. As $\sqrt{g}$ is also $N_P$ symmetric, $(\sqrt{g} S_{\mathcal{R}})$ can be expressed in a Fourier series as (G.3) where the sum includes only $n$ that are integer multiples of $N_P$. Inserting the Fourier series into (G.4), we find that $\partial \mathcal{R}/\partial B^c_{m,n} = 0$ for all $n$ that are not integer multiples of $N_P$.

G.0.2 Symmetry of Fourier derivatives
To continue, we need to show that $\partial \mathcal{R}/\partial B^s_{m,n} = 0$ for all $m$ and $n$ and $\partial \mathcal{R}/\partial B^c_{m,n} = 0$ for all $n$ which are not integer multiples of $N_P$. We begin with the $N_P$ symmetry argument. We consider the symmetry of $f_s$ implied by (D.1). Under the transformation $\varphi_B \to \varphi_B + 2\pi/N_P$, we find that each of the trajectory coefficients remains unchanged, as well as the source term and collision operator. Therefore we can conclude that $f_s$ is $N_P$ symmetric. We can also note that each of the $\widetilde{\mathcal{R}}$ vectors is $N_P$ symmetric, as well as $\sqrt{g}$. We consider the integrand that appears in the flux surface average in (4.16),
\[ D_s(\vartheta_B, \varphi_B) = \int d^3v \, \frac{f_s \widetilde{\mathcal{R}}^f_s \sqrt{g}}{f_{Ms}}. \tag{G.5} \]
Here the superscript and subscript on $\widetilde{\mathcal{R}}$ denote that we consider the unknowns corresponding to the distribution function of species $s$. We note that $D_s(\vartheta_B, \varphi_B + 2\pi/N_P) = D_s(\vartheta_B, \varphi_B)$. The quantity $\mathcal{R}$ can be expressed in terms of $D_s$ as follows,
\[ \mathcal{R} = \sum_s V'(\psi)^{-1} \int_0^{2\pi} d\varphi_B \int_0^{2\pi} d\vartheta_B \, D_s. \tag{G.6} \]
Next we consider the functional derivative of $\mathcal{R}$ with respect to $B$, defined as in (4.40). The derivative with respect to $B^c_{m,n}$ can thus be defined as,
\[ \frac{\partial \mathcal{R}}{\partial B^c_{m,n}} = V'(\psi)^{-1} \int_0^{2\pi} d\varphi_B \int_0^{2\pi} d\vartheta_B \left( \sum_s \frac{\delta D_s}{\delta B} - \mathcal{R} \frac{\delta \sqrt{g}}{\delta B} \right) \cos(m\vartheta_B - n\varphi_B). \tag{G.7} \]
As the functional derivative maintains the $N_P$ symmetry of $D_s$ and $\sqrt{g}$, the quantity in parentheses in (G.7) can be expressed in a Fourier series containing only $n$ that are integer multiples of $N_P$. Thus we see that the quantity $\partial \mathcal{R}/\partial B^c_{m,n} = 0$ for all $n$ that are not integer multiples of $N_P$.

Next we consider a similar argument for stellarator symmetry. We begin by considering the symmetry of $f_s$ implied by (D.1) in the case $E_r = 0$. Under the transformation $(\vartheta_B, \varphi_B, v_\|) \to (-\vartheta_B, -\varphi_B, -v_\|)$, we see that both the collisionless trajectory operator and the collision operator maintain the parity of $f_s$, while the source term is odd.
Therefore, $f_s$ must be odd under this transformation. In this case, we can write $f_s$ as,
\[ f_s = f^-_{a,s}(X_s, \xi_s) f^+_{b,s}(\vartheta_B, \varphi_B) + f^+_{a,s}(X_s, \xi_s) f^-_{b,s}(\vartheta_B, \varphi_B), \tag{G.8} \]
where $f^-_{a,s}(X_s, -\xi_s) = -f^-_{a,s}(X_s, \xi_s)$, $f^+_{a,s}(X_s, -\xi_s) = f^+_{a,s}(X_s, \xi_s)$, and analogous expressions hold for $f^+_{b,s}$ and $f^-_{b,s}$.

We next note that each of the $\widetilde{\mathcal{R}}^f_s$ is odd under the transformation $(\vartheta_B, \varphi_B, v_\|) \to (-\vartheta_B, -\varphi_B, -v_\|)$. As $\sqrt{g}$ is even, we can express $\widetilde{\mathcal{R}}^f_s \sqrt{g}$ in a similar way to (G.8),
\[ \widetilde{\mathcal{R}}^f_s \sqrt{g} = B^-_{a,s}(X_s, \xi_s) B^+_{b,s}(\vartheta_B, \varphi_B) + B^+_{a,s}(X_s, \xi_s) B^-_{b,s}(\vartheta_B, \varphi_B). \tag{G.9} \]
The integrand that appears in the flux surface average becomes,
\[ D_s = \int d^3v \, f_{Ms}^{-1} \left( f^-_{a,s}(X_s, \xi_s) B^-_{a,s}(X_s, \xi_s) f^+_{b,s}(\vartheta_B, \varphi_B) B^+_{b,s}(\vartheta_B, \varphi_B) + f^+_{a,s}(X_s, \xi_s) B^+_{a,s}(X_s, \xi_s) f^-_{b,s}(\vartheta_B, \varphi_B) B^-_{b,s}(\vartheta_B, \varphi_B) \right). \tag{G.10} \]
We see that $D_s$ is even with respect to the transformation $(\vartheta_B, \varphi_B) \to (-\vartheta_B, -\varphi_B)$. The quantity $\mathcal{R}$ can be written as in (G.6) and the derivative with respect to a stellarator asymmetric mode is
\[ \frac{\partial \mathcal{R}}{\partial B^s_{m,n}} = V'(\psi)^{-1} \int_0^{2\pi} d\varphi_B \int_0^{2\pi} d\vartheta_B \left( \sum_s \frac{\delta D_s}{\delta B} - \mathcal{R} \frac{\delta \sqrt{g}}{\delta B} \right) \sin(m\vartheta_B - n\varphi_B). \tag{G.11} \]
The functional derivative with respect to $B$ does not change the parity of $D_s$ or $\sqrt{g}$, thus we see that the quantity in parentheses in the above equation is even with respect to the transformation $(\vartheta_B, \varphi_B) \to (-\vartheta_B, -\varphi_B)$. Therefore, $\partial \mathcal{R}/\partial B^s_{m,n} = 0$ for all $m$ and $n$. A similar argument cannot be made if $E_r \neq 0$, as the inhomogeneous drive term in (D.1) no longer has definite parity.
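However, according to the arguments in [112], the transport coefficients do obey this symmetry property.

Appendix H: Derivatives at ambipolarity

In this Appendix, we derive an expression for derivatives of moments of the distribution function at fixed ambipolarity rather than fixed $E_r$ by determining the relationship between geometry parameters, $\Omega$, and $E_r$. We begin by assuming that the continuous adjoint approach outlined in Section 4.3.1 is used. The approach taken here is analogous to that used in Appendix C, in which an additional adjoint equation is used to compute derivatives at a fixed constraint function for optimization of stellarator coil shapes.

Consider the set of unknowns computed with SFINCS, $F$, which depends on parameters $\Omega$ and $E_r$. The total differential of $F$ satisfies,

\begin{equation}
\mathbb{L}\, dF(\Omega, E_r) = \left( \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} - \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F \right) dE_r + \sum_{i=1}^{N_\Omega} \left( \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} - \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F \right) d\Omega_i, \tag{H.1}
\end{equation}

which follows from (4.13). Consider $J_r(\Omega, F)$, which depends on $E_r$ through $F$. The total differential of $J_r$ can be computed,

\begin{equation}
dJ_r(\Omega, F(\Omega, E_r)) = \sum_{i=1}^{N_\Omega} \frac{\partial J_r(\Omega, F)}{\partial \Omega_i} d\Omega_i + \left\langle \tilde{J}_r, dF(\Omega, E_r) \right\rangle, \tag{H.2}
\end{equation}

which can be written using (H.1) and the solution to (4.50),

\begin{equation}
dJ_r(\Omega, F(\Omega, E_r)) = \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} \right) \right\rangle dE_r + \sum_{i=1}^{N_\Omega} \left[ \frac{\partial J_r(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} \right) \right\rangle \right] d\Omega_i. \tag{H.3}
\end{equation}

By enforcing $dJ_r(\Omega, F(\Omega, E_r)) = 0$, we obtain the relationship between $E_r$ and $\Omega$ at ambipolarity,

\begin{equation}
\left. \frac{\partial E_r(\Omega)}{\partial \Omega_i} \right|_{dJ_r = 0} = - \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} \right) \right\rangle^{-1} \left[ \frac{\partial J_r(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} \right) \right\rangle \right]. \tag{H.4}
\end{equation}

Consider a moment of the distribution function, $\mathcal{R}(\Omega, F(\Omega, E_r))$. The derivative with respect to $\Omega_i$ at fixed ambipolarity can thus be computed,

\begin{equation}
\frac{\partial \mathcal{R}(\Omega, F(\Omega, E_r(\Omega)))}{\partial \Omega_i} = \frac{\partial \mathcal{R}(\Omega, F)}{\partial \Omega_i} + \left\langle \tilde{\mathcal{R}}, \frac{\partial F(\Omega, E_r(\Omega))}{\partial \Omega_i} \right\rangle, \tag{H.5}
\end{equation}

where $E_r$ is viewed as a function of $\Omega$ through (H.4). The first term corresponds to the explicit dependence on $\Omega_i$, while the second contains dependence through $F$. Here $\partial F(\Omega, E_r(\Omega))/\partial \Omega_i$ satisfies,

\begin{equation}
\mathbb{L} \frac{\partial F(\Omega, E_r(\Omega))}{\partial \Omega_i} = \left( \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} - \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F \right) - \left( \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} - \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F \right) \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} \right) \right\rangle^{-1} \times \left[ \frac{\partial J_r(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{J_r}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} \right) \right\rangle \right], \tag{H.6}
\end{equation}

from (H.1) using (H.4). Using (H.6) and (4.23), we find

\begin{equation}
\frac{\partial \mathcal{R}(\Omega, F(\Omega, E_r(\Omega)))}{\partial \Omega_i} = \frac{\partial \mathcal{R}(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{\mathcal{R}}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} \right) \right\rangle - \left\langle \lambda^{\mathcal{R}}, \left( \frac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F - \frac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} \right) \right\rangle \times \frac{ \dfrac{\partial J_r(\Omega, F)}{\partial \Omega_i} + \left\langle \lambda^{J_r}, \left( \dfrac{\partial \mathbb{L}(\Omega, E_r)}{\partial \Omega_i} F - \dfrac{\partial \mathbb{S}(\Omega, E_r)}{\partial \Omega_i} \right) \right\rangle }{ \left\langle \lambda^{J_r}, \left( \dfrac{\partial \mathbb{L}(\Omega, E_r)}{\partial E_r} F - \dfrac{\partial \mathbb{S}(\Omega, E_r)}{\partial E_r} \right) \right\rangle }. \tag{H.7}
\end{equation}

An analogous expression can be obtained using the discrete approach,

\begin{equation}
\frac{\partial \mathcal{R}\left( \Omega, \overrightarrow{F}(\Omega, E_r(\Omega)) \right)}{\partial \Omega_i} = \frac{\partial \mathcal{R}\left( \Omega, \overrightarrow{F} \right)}{\partial \Omega_i} + \left\langle \overrightarrow{\lambda}^{\mathcal{R}}, \left( \frac{\partial \overrightarrow{\mathbb{S}}(\Omega, E_r)}{\partial \Omega_i} - \frac{\partial \overleftrightarrow{\mathbb{L}}(\Omega, E_r)}{\partial \Omega_i} \overrightarrow{F} \right) \right\rangle - \left\langle \overrightarrow{\lambda}^{\mathcal{R}}, \left( \frac{\partial \overrightarrow{\mathbb{S}}(\Omega, E_r)}{\partial E_r} - \frac{\partial \overleftrightarrow{\mathbb{L}}(\Omega, E_r)}{\partial E_r} \overrightarrow{F} \right) \right\rangle \times \frac{ \dfrac{\partial J_r\left( \Omega, \overrightarrow{F} \right)}{\partial \Omega_i} + \left\langle \overrightarrow{\lambda}^{J_r}, \left( \dfrac{\partial \overrightarrow{\mathbb{S}}(\Omega, E_r)}{\partial \Omega_i} - \dfrac{\partial \overleftrightarrow{\mathbb{L}}(\Omega, E_r)}{\partial \Omega_i} \overrightarrow{F} \right) \right\rangle }{ \left\langle \overrightarrow{\lambda}^{J_r}, \left( \dfrac{\partial \overrightarrow{\mathbb{S}}(\Omega, E_r)}{\partial E_r} - \dfrac{\partial \overleftrightarrow{\mathbb{L}}(\Omega, E_r)}{\partial E_r} \overrightarrow{F} \right) \right\rangle }, \tag{H.8}
\end{equation}

where (4.51) has been used.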
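The structure of (H.8) can be illustrated on a small dense linear system. The following is a minimal sketch, not the SFINCS implementation: the operators, their sizes, and every name in the code are hypothetical, the toy operator and source are taken to be affine in $\Omega$ and $E_r$ so that their partial derivatives are constant arrays, and the discrete adjoint convention $\overleftrightarrow{\mathbb{L}}^T \overrightarrow{\lambda} = \partial \mathcal{R}/\partial \overrightarrow{F}$ is assumed because it is what the signs in (H.8) suggest. The adjoint gradient at fixed $J_r = 0$ is compared with central finite differences that re-solve for $E_r$ at each perturbed $\Omega$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3  # hypothetical sizes: toy state F and number of geometry parameters

# Toy operator L and source S, affine in Omega and E_r (an assumption of this sketch).
L0 = n * np.eye(n) + 0.3 * rng.standard_normal((n, n))
dL_dOmega = 0.2 * rng.standard_normal((p, n, n))
dL_dE = 0.2 * rng.standard_normal((n, n))
S0 = rng.standard_normal(n)
dS_dOmega = rng.standard_normal((p, n))
dS_dE = n * np.ones(n)

r_vec = rng.standard_normal(n)            # dR/dF for the moment R
j_vec = np.ones(n) / n                    # dJ_r/dF for the constraint J_r
dR_dOmega = 0.1 * rng.standard_normal(p)  # explicit dependence of R on Omega
dJ_dOmega = 0.1 * rng.standard_normal(p)  # explicit dependence of J_r on Omega

def solve_state(omega, E):
    """Assemble L(Omega, E_r), S(Omega, E_r) and solve L F = S."""
    L = L0 + np.tensordot(omega, dL_dOmega, axes=1) + E * dL_dE
    S = S0 + omega @ dS_dOmega + E * dS_dE
    return L, np.linalg.solve(L, S)

def ambipolar_E(omega, target, E=0.0):
    """Newton iteration on E_r for J_r = 0, using the adjoint for dJ_r/dE_r."""
    for _ in range(30):
        L, F = solve_state(omega, E)
        residual = j_vec @ F + dJ_dOmega @ omega - target
        if abs(residual) < 1e-14:
            break
        lam_J = np.linalg.solve(L.T, j_vec)
        E -= residual / (lam_J @ (dS_dE - dL_dE @ F))
    return E

def R_at_ambipolarity(omega, target):
    E = ambipolar_E(omega, target)
    _, F = solve_state(omega, E)
    return r_vec @ F + dR_dOmega @ omega

def adjoint_gradient(omega, target):
    """Derivative of R with respect to Omega at fixed J_r = 0, following (H.8)."""
    E = ambipolar_E(omega, target)
    L, F = solve_state(omega, E)
    lam_R = np.linalg.solve(L.T, r_vec)   # adjoint solve for R
    lam_J = np.linalg.solve(L.T, j_vec)   # additional adjoint solve for J_r
    rhs_E = dS_dE - dL_dE @ F             # shape (n,)
    rhs_Omega = dS_dOmega - dL_dOmega @ F # shape (p, n)
    grad_fixed_E = dR_dOmega + rhs_Omega @ lam_R
    dJ = dJ_dOmega + rhs_Omega @ lam_J
    return grad_fixed_E - (lam_R @ rhs_E) * dJ / (lam_J @ rhs_E)

omega0 = np.array([0.5, -0.2, 0.3])
E_true = 0.3
_, F0 = solve_state(omega0, E_true)
target = j_vec @ F0 + dJ_dOmega @ omega0  # chosen so that J_r = 0 at E_r = E_true

E_found = ambipolar_E(omega0, target)
grad_adj = adjoint_gradient(omega0, target)

h = 1e-6
grad_fd = np.array([
    (R_at_ambipolarity(omega0 + h * e, target)
     - R_at_ambipolarity(omega0 - h * e, target)) / (2.0 * h)
    for e in np.eye(p)
])
print(grad_adj)
print(grad_fd)
```

In this sketch the gradient with respect to all $N_\Omega$ parameters costs two adjoint solves (one for $\mathcal{R}$, one for $J_r$) on top of the forward solve, whereas the finite-difference check repeats the root-find for $E_r$ twice per parameter.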
175 ppendix I: Derivation of generalized MHD self-adjointness relation The quantity U P = U P + U P consists of two terms, accounting for changes to the vectorpotential due to MHD perturbations, U P = (cid:90) V P d x ( δ J · ξ × B − δ J · ξ × B ) , (I.1)and changes to the rotational transform, U P = (cid:90) V P d x (cid:0) δχ ( ψ ) δ J · ∇ ϕ − δχ ( ψ ) δ J · ∇ ϕ (cid:1) . (I.2)The quantity U P can be expressed by using (5.26) and applying the divergence theorem tothe pressure gradient terms, U P = (cid:90) V P d x ξ · (cid:0) J × δ B + ∇ p ( ∇ · ξ ) − F (cid:1) − (cid:90) V P d x ξ · (cid:0) J × δ B + ∇ p ( ∇ · ξ ) − F (cid:1) . (I.3)We will define δ (cid:101) B , = ∇ × (cid:0) ξ , × B (cid:1) such that δ B , = δ (cid:101) B , − ∇ δχ , ( ψ ) × ∇ ϕ . The termsin (I.3) due to δ (cid:101) B , can be evaluated using J = J || ˆ b + ˆ b × ∇ p/B and (5.10), (cid:90) V P d x (cid:16) ξ · J × δ (cid:101) B − ξ · J × δ (cid:101) B (cid:17) = (cid:90) V P d x J || B ∇ · (cid:0) ( ξ × B ) × ( ξ × B ) (cid:1) + (cid:90) V P d x B (cid:16) ( ξ · ∇ p ) ˆ b · δ (cid:101) B − ( ξ · ∇ p ) ˆ b · δ (cid:101) B (cid:17) . (I.4)The first term in (I.4) can be simplified using ∇ · J = 0 and noting that the perturbationcan be written as ξ , = ξ ψ , ∇ ψ + ξ ⊥ , ˆ b × ∇ ψ . Applying the identity B · δ (cid:101) B , = − B ∇ · ξ , − ξ , · ∇ B − µ ξ , · ∇ p to the second term, the following expression can be obtained, (cid:90) V P d x (cid:16) ξ · J × δ (cid:101) B − ξ · J × δ (cid:101) B (cid:17) = (cid:90) V P d x (cid:0) ( ∇ · ξ ) ξ · ∇ p − ( ∇ · ξ ) ξ · ∇ p (cid:1) . (I.5)176ence we obtain the following expression for U P , U P = (cid:90) V P d x ( − ξ · F + ξ · F ) − (cid:90) V P d x (cid:0) δχ (cid:48) ( ψ ) ξ · ∇ ψ − δχ (cid:48) ( ψ ) ξ · ∇ ψ (cid:1) J · ∇ ϕ. (I.6)We now consider U P defined in (I.2). 
Applying (5.24) for the change in toroidal current,integrating by parts in ψ , and combining the expressions for U P (I.3) and U P (I.2), weobtain, U P = (cid:90) V P d x ( − ξ · F + ξ · F ) + 2 π (cid:90) V P dψ (cid:16) δχ ( ψ ) δI (cid:48) T, ( ψ ) − δχ ( ψ ) δI (cid:48) T, ( ψ ) (cid:17) − (cid:90) S P d x (cid:0) δχ ( ψ ) ξ − δχ ( ψ ) ξ (cid:1) · ˆ nJ · ∇ ϕ. (I.7)Next we combine U P (I.7) with U B (5.31) and U C (5.32) to obtain the free-boundary adjointrelation (5.33).To obtain the fixed-boundary adjoint relation, the integral over the plasma volume (5.29)can be related to a surface integral by applying the divergence theorem to arrive at (5.35).Using (5.19) and applying several vector identities, U P = − µ (cid:90) S P d x ˆ n · ( ξ δ B − ξ δ B ) · B − µ (cid:90) S P d x (cid:0) δχ ( ψ ) δ B − δχ ( ψ ) δ B (cid:1) · ∇ ϕ × ˆ n . (I.8)Using (I.7) and expressing the second term in (I.8) as a perturbed current using (5.24), thefixed boundary adjoint relation (5.36) is obtained.177 ppendix J: Alternate derivation of fixed-boundary adjoint relation The MHD force operator, F [ ξ , ] = J × (cid:16) ∇ × (cid:0) ξ , × B (cid:1)(cid:17) + ∇ × (cid:16) ∇ × (cid:0) ξ , × B (cid:1)(cid:17) × B µ + ∇ (cid:0) ξ , · ∇ p (cid:1) , (J.1)possesses the following self-adjointness property [20, 83], (cid:90) V P d x (cid:0) ξ · F [ ξ ] − ξ · F [ ξ ] (cid:1) = 1 µ (cid:90) S P d x ˆ n · (cid:16) ξ B · δ (cid:101) B − ξ B · δ (cid:101) B (cid:17) , (J.2)where δ (cid:101) B , = ∇ × (cid:0) ξ , × B (cid:1) is the perturbed field corresponding to the MHD perturba-tions. As we consider linearized equilibrium states that preserve p ( ψ ), the perturbed pressuresatisfies δp ( ψ ) = − ξ · ∇ p . The force operator we adopt (J.1) is the γ → ∇ ( γp ∇ · ξ ).For perturbations described by (5.19), (5.20) and (5.23) to (5.26), the force operatorsatisfies, F [ ξ , ] = J × (cid:0) ∇ δχ , ( ψ ) × ∇ ϕ (cid:1) + ∇ × (cid:0) ∇ δχ , ( ψ ) × ∇ ϕ (cid:1) × B µ − δ F , . 
(J.3)Using (J.3) and several vector identities, the left hand side of (J.2) can be written as (cid:90) V P d x (cid:0) ξ · F [ ξ ] − ξ · F [ ξ ] (cid:1) = (cid:90) V P d x (cid:0) δχ (cid:48) ( ψ ) ξ − δχ (cid:48) ( ψ ) ξ (cid:1) · ∇ ψ J · ∇ ϕ − µ (cid:90) V P d x ∇ ψ × ∇ ϕ · (cid:16) δχ (cid:48) ( ψ ) δ (cid:101) B − δχ (cid:48) ( ψ ) δ (cid:101) B (cid:17) − µ (cid:90) S P d x (cid:0) ξ δχ (cid:48) ( ψ ) − ξ δχ (cid:48) ( ψ ) (cid:1) · ˆ n ( ∇ ψ × ∇ ϕ · B ) − (cid:90) V P d x ( ξ · δ F − ξ · δ F ) . (J.4)In arriving at (J.4), we use J · ∇ ψ = 0, which follow from MHD force balance (5.10). Using1785.24) to re-express the first two terms on the right-hand side, (cid:90) V P d x (cid:0) ξ · F [ ξ ] − ξ · F [ ξ ] (cid:1) = 2 π (cid:90) V P dψ (cid:0) δI T, ( ψ ) δχ (cid:48) ( ψ ) − δI T, ( ψ ) δχ (cid:48) ( ψ ) (cid:1) − µ (cid:90) S P d x (cid:0) ξ δχ (cid:48) ( ψ ) − ξ δχ (cid:48) ( ψ ) (cid:1) · ˆ n ( ∇ ψ × ∇ ϕ · B ) − (cid:90) V P d x ( ξ · δ F − ξ · δ F ) . (J.5)Using (5.19) and (J.2) we obtain (5.36). 179 ppendix K: Interpretation of the displacement vector For MHD perturbations such that δ B = ∇ × ( ξ × B ) the displacement can be interpretedas a vector describing the motion of a field lines. Thus a normal perturbation to the surfaceof the plasma as in (5.4) can be expressed in terms of the displacement vector, δf ( S P ; ξ ) = (cid:90) S P d x G ξ · ˆ n . (K.1)For perturbations that allow for changes in the rotational transform it remains to be shownthat a similar relation can be found.As we require that ψ remain a flux surface label in the perturbed equilibrium, the La-grangian perturbation to ψ at fixed position is δψ = − δ x · ∇ ψ. (K.2)The perturbed magnetic field, B (cid:48) = B + δ B must remain tangent to ψ (cid:48) = ψ + δψ surfaces;thus to first order in the perturbation,0 = B (cid:48) · ∇ ψ (cid:48) = B · ∇ δψ + δ B · ∇ ψ. 
(K.3)Applying the form for the perturbed field allowing for changes in the rotational transform, δ B = ∇ × (cid:0) ξ × B − δχ ( ψ ) ∇ ϕ (cid:1) , and using several vector identities, the following conditionis obtained B · ∇ ( δ x · ∇ ψ ) = B · ∇ ( ξ · ∇ ψ ) . (K.4)This implies that δ x · ∇ ψ = ξ · ∇ ψ + F ( ψ ), where F ( ψ ) is some flux function which canbe determined by requiring that the perturbation to the toroidal flux as a function of ψ vanishes, δ Ψ T ( ψ ) = 0.The perturbed toroidal flux through a surface labeled by ψ contains two terms, corre-sponding to the flux of the unperturbed field through the perturbed surface and the perturbedfield through the unperturbed surface, δ Ψ T ( ψ ) = (cid:90) ∂S T ( ψ ) dϑ √ gδ x · ∇ ψ B · ∇ ϕ + (cid:90) S T ( ψ ) dψdϑ √ gδ B · ∇ ϕ. (K.5)Using the form for δ B , applying the divergence theorem, and noting that B · ∇ ϕ = √ g − ,the following condition is obtained, δ Ψ T ( ψ ) = (cid:90) π dϑ ( δ x · ∇ ψ − ξ · ∇ ψ ) . (K.6)By requiring that δ Ψ T ( ψ ) = 0, we find that F ( ψ ) = 0. Thus we can express shape gradients180n the form of (K.1) even when the rotational transform is allowed to vary.181 ppendix L: Details of axis ripple calculation In this Appendix, we compute the shape derivative of the finite-pressure magnetic wellfigure of merit from (5.101) and show that if we impose an adjoint perturbation of the form(5.102), the shape gradient is given by (5.106).We use the expression for the perturbation to the field strength (5.62) and δψ = − ξ · ∇ ψ with (5.101) to obtain, δf R ( S P ; ξ ) = (cid:90) S P d x ξ · ˆ n (cid:102) f R − (cid:90) V P d x ∂ (cid:102) f R ∂ψ ξ · ∇ ψ − (cid:90) V P d x ∂ (cid:102) f R ∂B B (cid:16) B ∇ · ξ + ξ · ∇ (cid:0) B + µ p (cid:1) + δχ (cid:48) ( ψ ) B · ( ∇ ψ × ∇ ϕ ) (cid:17) . 
(L.1)The third term can be integrated by parts to obtain, δf R ( S P ; ξ ) = (cid:90) S P d x ξ · ˆ n (cid:32)(cid:102) f R − ∂ (cid:102) f R ∂B B (cid:33) + (cid:90) V P d x (cid:32) ∂ (cid:102) f R ∂B∂ψ B − ∂ (cid:102) f R ∂ψ (cid:33) ξ · ∇ ψ + (cid:90) V P d x (cid:32) − ∂ (cid:102) f R ∂B B ξ · κ + B ∂ (cid:102) f R ∂B ξ · ∇ B + δχ (cid:48) ( ψ ) ∂ (cid:102) f R ∂B ˆ b · ( ∇ ϕ × ∇ ψ ) (cid:33) , (L.2)where the expression for the curvature in an equilibrium field (5.105) has been applied.We compute one term that appears in the fixed-boundary adjoint relation (5.36) usingthe prescribed adjoint bulk force perturbation (5.102a), (cid:90) V P d x ξ · F = (cid:90) V P d x (cid:32) − ∂ p || ∂B∂ψ B + ∂p || ∂ψ (cid:33) ξ · ∇ ψ + (cid:90) V P d x (cid:32) ∂p || ∂B B ξ · κ − B ∂ p || ∂B ξ · ∇ B (cid:33) , (L.3)where we have applied the parallel force balance condition (5.103). Therefore, if we impose182 || = (cid:102) f R , we obtain the following expression for the shape derivative of f R , δf R ( S P ; ξ ) = (cid:90) S P d x ξ · ˆ n (cid:32)(cid:102) f R − ∂ (cid:102) f R ∂B B (cid:33) − (cid:90) V P d x ξ · F + (cid:90) V P d x δχ (cid:48) ( ψ ) ∂ (cid:102) f R ∂B ˆ b · ( ∇ ϕ × ∇ ψ ) . (L.4)Upon application of the fixed-boundary adjoint relation we obtain (5.106) with (5.102).183 ppendix M: Details of effective ripple in the 1 /ν regime calculation Neoclassical transport in the 1 /ν collisionality regime is discussed in many referencesincluding [65], [42], and [116]. In this Appendix we sketch the computation of (cid:15) / originallyintroduced in [168] and compute linear perturbations of f (cid:15) (5.112), showing them to takethe form of (5.113).In the 1 /ν regime, the distribution function is ordered in the parameter ν ∗ = ν/ ( v t /L ) (cid:28)
1, where ν is the collision frequency, the thermal speed is v t = (cid:112) T /m for mass m andtemperature T , and L is a macroscopic scale length, f = f − + f + O ( ν ∗ ) . (M.1)In velocity space we use a pitch angle coordinate λ = v ⊥ / ( v B ), energy coordinate (cid:15) = v / σ = sign( v || ), where v ⊥ = (cid:113) v − v || is the perpendicular velocity and v || = v · ˆ b is theparallel velocity. We use the field line label, α , and length along a field line, l , to describelocation on a constant ψ surface. In the 1 /ν regime the E × B precession frequency isassumed to be small relative to the collision frequency, so the drift kinetic equation (4.2)becomes, v || ∂f ∂l = C ( f ) − v m · ∇ ψ ∂f ∂ψ , (M.2)where the Maxwellian with density n is, f = nπ − / v − t e − v /v t , (M.3)and the radial magnetic drift is, v m · ∇ ψ = ( v + v || ) m qB ∇ ψ × B · ∇ B, (M.4)for charge q . The drift kinetic equation to O ( ν − ∗ ) is, v || ∂f − ∂l = 0 . (M.5)In the trapped portion of phase space, this implies that f − = f − ( ψ, α, (cid:15), λ ), and in thepassing portion of phase space, this implies that f − = f − ( ψ, (cid:15), λ, σ ). The drift kineticequation to O ( ν ∗ ) is, v || ∂f ∂l = C ( f − ) − v m · ∇ ψ ∂f ∂ψ . (M.6)In the passing region, this implies that f − is a Maxwellian, so it can be taken to vanish.184e employ a pitch-angle scattering operator, C = 2 ν ( (cid:15) ) v || B(cid:15) ∂∂λ (cid:18) λv || ∂∂λ (cid:19) . (M.7)The parallel streaming term in (M.6) is annihilated by the bounce averaging operation,0 = (cid:104) C ( f − ) (cid:105) b − (cid:104) v m · ∇ ψ (cid:105) b ∂f ∂ψ , (M.8)where the bounce average of a quantity A is (cid:104) A (cid:105) b = τ − (cid:72) dl A/v || and the bounce time is τ = (cid:72) dl v − || . The bounce-averaged equation (M.8) can be expressed in terms of the paralleladiabatic invariant J = (cid:72) dl v || using the relation, (cid:104) v m · ∇ ψ (cid:105) b = mqτ ∂J∂α . 
(M.9)Integrating (M.8) with respect to λ we obtain, ∂f − ∂λ = m(cid:15) qλν ( (cid:15) ) ∂f ∂ψ (cid:18)(cid:73) dl v || B (cid:19) − (cid:90) λ /B max dλ (cid:48) ∂J∂α . (M.10)Here B max is the maximum value of the field strength on the surface labeled by ψ . Wehave used the boundary condition (cid:0)(cid:72) dl v || /B (cid:1) ∂f − /∂λ | λ =1 /B max = 0, as there is no fluxin pitch-angle from the passing region. The integration with respect to λ is performed toobtain, ∂f − ∂λ = − m qλν ( (cid:15) ) ∂f ∂ψ (cid:18)(cid:73) dl v || B (cid:19) − ∂∂α (cid:32)(cid:73) dl v || B (cid:33) . (M.11)The particle flux from f − is obtained by multiplying (M.6) by f − ( ∂f /∂ψ ) − , integratingover velocity space, and flux surface averaging, (cid:104) Γ · ∇ ψ (cid:105) ψ ≡ (cid:28)(cid:90) d v f − v m · ∇ ψ (cid:29) ψ = (cid:42)(cid:90) d v f − C ( f − ) (cid:18) ∂f ∂ψ (cid:19) − (cid:43) ψ . (M.12)The velocity space integration is performed using the velocity-space Jacobian d v = 2 π (cid:80) σ B(cid:15)/ | v || | dλd(cid:15) .Upon integration by parts in λ and applying (M.11), the following expression is obtained, (cid:104) Γ · ∇ ψ (cid:105) ψ = − √ πV (cid:48) ( ψ ) (cid:18) m q (cid:19) (cid:90) ∞ d(cid:15) (cid:18) ∂f ∂ψ (cid:19) (cid:15) / ν ( (cid:15) ) (cid:90) /B min /B max dλλ (cid:90) π dα (cid:88) i ( ∂∂α ˆ K i ( α, λ )) ˆ I i ( α, λ ) , (M.13)where the bounce integrals are defined by (5.111). The sum in (M.13) is taken over trappingregions for particles with pitch angle λ on a field line labeled by α for left bounce points ϕ − ,i ∈ [0 , π ).The parameter (cid:15) / quantifies the geometric dependence of the 1 /ν particle flux. It is185efined in terms of the radial particle flux in the following way [168], (cid:104) Γ · ∇ ψ (cid:105) ψ = − (cid:104)|∇ ψ |(cid:105) ψ (cid:18) m q (cid:19) B R (cid:15) / (cid:90) ∞ d(cid:15) (cid:18) ∂f ∂ψ (cid:19) (cid:15) / ν ( (cid:15) ) . 
(M.14)We take our normalizing length and field values to be such that B R = (cid:15) − (cid:104)|∇ ψ |(cid:105) ψ , where (cid:15) ref is a reference aspect ratio. Comparing (M.13) with (M.14) we obtain the expressionfor (cid:15) / (5.110). The corresponding expression (29) in [168] is obtained by noting thatˆ H Nemov = − ( ∂ ˆ K/∂α ) λ / B / and ˆ I = 2 ˆ I Nemov , where ˆ H Nemov and ˆ I Nemov are given in (30)-(31) of [168].The shape derivative of f (cid:15) (5.112) is computed to be, δf (cid:15) ( S P ; ξ ) = (cid:90) V P dψ w ( ψ ) δ ( V (cid:48) ( ψ ) (cid:15) / ( ψ )) . (M.15)The perturbation to the bounce integrals is computed using the following identity for theperturbation of a line integral Q L = (cid:82) l L l dl Q due to displacement of the integration curveby vector field δ x [9, 138], δQ L = (cid:90) l L l dl (cid:32) δ x · (cid:18) − κ Q + (cid:16) I − ˆ t ˆ t (cid:17) · ∇ Q (cid:19) + δQ (cid:33) + Q ( l L ) δl L − Q ( l ) δl , (M.16)where δQ is the perturbation to the integrand at fixed position, ˆ t = x (cid:48) ( l ) is the unit tangentvector, κ = x (cid:48)(cid:48) ( l ) is the curvature, and δl L and δl are perturbations to the bounds of theintegral.We compute the perturbation to the bounce integrals to be, δ ˆ I i = (cid:73) dl − v || vB κ · δ x − (cid:32) λv Bv || + v || B v (cid:33) ( δ x · ∇ B + δB ) (M.17a) δ ˆ K i = (cid:73) dl − v || v B κ · δ x − (cid:32) λv || Bv + v || B v (cid:33) ( δ x · ∇ B + δB ) , (M.17b)where δB is the perturbation to the field strength (5.62) and δ x is given by (5.22). Wenote that δ x · ˆ b = 0 such that the perpendicular projection, ( I − ˆ t ˆ t ), is not needed. Thereis no contribution due to the perturbation of the bounce points, as the integrand vanishesat these points. 
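The absence of a bounce-point contribution can be checked on a toy well. The sketch below is an illustration under stated assumptions rather than the bounce integrals of (5.111): it uses a hypothetical quadratic well $B(l) = 1 + a l^2$, a simplified bounce integral $\int dl \sqrt{1 - \lambda B}$ whose integrand vanishes at the bounce points, and the substitution $l = L \sin t$ to remove the endpoint singularity in the derivative. The finite-difference derivative with respect to $a$, which moves the bounce points, is compared with the integral of the perturbed integrand alone.

```python
import numpy as np

def bounce_half_width(lam, a):
    """Bounce points l = +/- L of the toy well B(l) = 1 + a*l**2, where 1 - lam*B = 0."""
    return np.sqrt((1.0 - lam) / (lam * a))

def midpoint_nodes(n):
    """Midpoint-rule nodes in t for the substitution l = L*sin(t), t in (-pi/2, pi/2)."""
    dt = np.pi / n
    return -0.5 * np.pi + (np.arange(n) + 0.5) * dt, dt

def bounce_integral(lam, a, n=2000):
    """Toy bounce integral: integral of sqrt(1 - lam*B) between the bounce points."""
    L = bounce_half_width(lam, a)
    t, dt = midpoint_nodes(n)
    l = L * np.sin(t)
    v_par = np.sqrt(np.maximum(1.0 - lam * (1.0 + a * l**2), 0.0))
    return np.sum(v_par * L * np.cos(t)) * dt

def interior_derivative(lam, a, n=2000):
    """Derivative with respect to a from the perturbed integrand only (no endpoint terms)."""
    L = bounce_half_width(lam, a)
    t, dt = midpoint_nodes(n)
    l = L * np.sin(t)
    v_par = np.sqrt(1.0 - lam * (1.0 + a * l**2))
    dB_da = l**2  # perturbation of the field strength at fixed position
    return np.sum(-lam * dB_da / (2.0 * v_par) * L * np.cos(t)) * dt

lam, a, h = 0.8, 0.5, 1e-5
fd = (bounce_integral(lam, a + h) - bounce_integral(lam, a - h)) / (2.0 * h)
interior = interior_derivative(lam, a)
analytic = 0.5 * np.pi * (1.0 - lam) / np.sqrt(lam * a)  # closed form for this toy well
print(fd, interior)
```

The two derivatives agree even though the finite-difference evaluation moves the bounce points, which reflects the vanishing of the $Q(l_L)\,\delta l_L$ and $Q(l_0)\,\delta l_0$ terms in (M.16) when the integrand is zero at the ends of the integration interval.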
The expressions (5.113)-(5.115) can now be obtained by writing (M.15) interms of the perturbations of the bounce integrals, using ξ · ∇ B + δB = − B (cid:16) I − ˆ b ˆ b (cid:17) : ∇ ξ − δχ (cid:48) ( ψ )ˆ b · ( ∇ ψ × ∇ ϕ ) and κ · ξ = − ˆ b ˆ b : ∇ ξ .186 ppendix N: Details of departure from quasi-symmetry calculation In this Appendix we compute the shape derivative of f QS (5.121) to obtain (5.126)-(5.127c) by expressing each term in (5.125) in the desired form. The second term in (5.125)is expressed using δψ = − ξ · ∇ ψ ,12 (cid:90) V P d x w (cid:48) ( ψ ) δψ M = − (cid:90) V P d x M ξ · ∇ w ( ψ ) . (N.1)The third term in (5.125) is computed upon application of (5.20), the divergence theorem,and noting that M = B · A , (cid:90) V P d x w ( ψ ) M δ B · A = − (cid:90) S P d x ξ · n w ( ψ ) M − (cid:90) V P d x w ( ψ ) δχ (cid:48) ( ψ ) M∇ ψ × ∇ ϕ · A + (cid:90) V P d x ξ · (cid:16) w ( ψ ) M (cid:0) B × ( ∇ × A ) (cid:1) − A w ( ψ ) B · ∇M + M∇ (cid:0) w ( ψ ) M (cid:1)(cid:17) . (N.2)The quantity A can be projected into the perpendicular direction as ξ · ˆ b = 0, noting that,ˆ b × (cid:16) A × ˆ b (cid:17) = − (ˆ b × ∇ ψ ) ∇ || B − F ( ψ ) ∇ ⊥ B. (N.3)Similarly, any terms in (N.2) involving ξ · ∇ can be expressed as ξ · ∇ ⊥ . The correspondingterms in (5.127a) are obtained using the expression for the curvature in an equilibrium field.The fourth term in (5.125) is expressed in the following way upon application of (5.62), thedivergence theorem, and noting that S · ∇ ψ = ∇ · S = 0, (cid:90) V P d x w ( ψ ) M S · ∇ δB = (cid:90) S P d x ξ · n Bw ( ψ ) S · ∇M − (cid:90) V P d x ξ · (cid:104) B ∇ (cid:0) w ( ψ ) S · ∇M (cid:1)(cid:105) + (cid:90) V P d x w ( ψ )( S · ∇M ) (cid:16) δχ (cid:48) ( ψ )ˆ b · ( ∇ ψ × ∇ ϕ ) + B ξ · κ (cid:17) . 
(N.4)We express terms involving ξ · ∇ as ξ · ∇ ⊥ to obtain the corresponding terms in (5.127a).The fifth term in (5.125) is expressed in the following way upon application of δψ = − ξ ·∇ ψ ,the divergence theorem, and several vector identities, (cid:90) V P d x w ( ψ ) M B × ∇ δψ · ∇ B = − (cid:90) S P d x ξ · ˆ n w ( ψ ) M∇ B × B · ∇ ψ − (cid:90) V P d x ξ · ∇ ψ ∇ B · ∇ × (cid:0) w ( ψ ) M B (cid:1) . (N.5)187he sixth term in (5.125) upon application of (5.124) is, − (cid:90) V P d x δG ( ψ ) w ( ψ ) M B · ∇ Bι ( ψ ) − ( N/M ) =14 π (cid:90) S P d x w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ ( ι ( ψ ) − ( N/M )) ( B · ∇ ψ × ∇ ϑ ) ξ · ˆ n − π (cid:90) V P d x ξ · ∇ (cid:18) w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ ( ι ( ψ ) − ( N/M )) (cid:19) B · ∇ ψ × ∇ ϑ + 14 π (cid:90) V P d x w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ ι ( ψ ) − ( N/M ) (cid:16) ξ · (cid:0) ∇ ψ ∇ · ( B × ∇ ϑ ) − B × ∇ × ( ∇ ψ × ∇ ϑ ) (cid:1)(cid:17) − π (cid:90) V P d x δχ (cid:48) ( ψ ) w ( ψ ) V (cid:48) ( ψ ) (cid:104)M B · ∇ B (cid:105) ψ √ g ( ι ( ψ ) − ( N/M )) ∂ x ∂ϕ · ∂ x ∂ϑ . (N.6)In obtaining the corresponding terms in (5.127a), terms involving ξ · ∇ are expressed as ξ · ∇ ⊥ . The seventh term in (5.125) is expressed using δψ = − ξ · ∇ ψ . Combining all terms,we obtain (5.126)-(5.127c). 188 ppendix O: Details of neoclassical figures of merit calculation In this Section we compute the shape derivative of f NC (5.130) to obtain (5.136)-(5.137c)by expressing each term in (5.135) in the desired form. Throughout Boozer coordinates willbe assumed.The second term in (5.135) is expressed using δψ = − ξ · ∇ ψ . 
The third term in (5.135)can be computed using (5.124), noting that V (cid:48) ( ψ ) / (4 π √ g ) = B / (cid:104) B (cid:105) ψ in Boozer coordi-nates and applying the divergence theorem, (cid:90) V P d x w ( ψ ) ∂ R ( ψ ) ∂G ( ψ ) δG ( ψ ) = − (cid:90) V P d x w ( ψ ) B √ g (cid:104) B (cid:105) ψ ∂ R ( ψ ) ∂G ( ψ ) ξ · ∇ ψ ( ∇ × B ) · ∇ ϑ + (cid:90) V P d x ξ · ∇ (cid:32) ∂ R ( ψ ) ∂G ( ψ ) w ( ψ ) (cid:104) B (cid:105) ψ (cid:33) B G ( ψ ) + w ( ψ ) (cid:104) B (cid:105) ψ ∂ R ( ψ ) ∂G ( ψ ) ξ · B × ∇ × (cid:18) ∂ x ∂ϕ B (cid:19) + (cid:90) V P d x w ( ψ ) δχ (cid:48) ( ψ ) B √ g (cid:104) B (cid:105) ψ ∂ R ( ψ ) ∂G ( ψ ) ∂ x ∂ϕ · ∂ x ∂ϑ − (cid:90) S P d x w ( ψ ) B (cid:104) B (cid:105) ψ ∂ R ( ψ ) ∂G ( ψ ) G ( ψ ) ξ · ˆ n . (O.1)The fifth term in (5.135) can be computed using (5.62), the divergence theorem, and theexpression for the curvature in an equilibrium field (5.105), (cid:90) V P d x w ( ψ ) (cid:104) S R δB (cid:105) ψ = (cid:90) V P d x (cid:16) ξ · ∇ (cid:0) w ( ψ ) S R (cid:1) B − BS R w ( ψ ) ξ · κ (cid:17) − (cid:90) V P d x δχ (cid:48) ( ψ ) S R w ( ψ )ˆ b · ∇ ψ × ∇ ϕ − (cid:90) S P d x w ( ψ ) S R B ξ · ˆ n . (O.2)The resulting terms can be combined to write the shape derivative in the form of (5.136),noting that any terms involving ξ · ∇ can be expressed as ξ · ∇ ⊥ .189 ppendix P: Linearized equilibrium energy functional and coefficient ma-trices P.1 Further simplification of energy functional
We will now further simplify the energy functional (6.11) using a magnetic coordinatesystem. Each of the contravariant components of the perturbed magnetic field are evaluatedto be, Q ψ ≡ δ B [ ξ ] · ∇ ψ = 1 √ g (cid:32) ∂ξ ψ ∂ϕ + ι ∂ξ ψ ∂ϑ (cid:33) (P.1a) Q ϑ ≡ δ B [ ξ ] · ∇ ϑ = 1 √ g (cid:32) ∂ξ α ∂ϕ − ∂ξ ψ ι∂ψ (cid:33) (P.1b) Q ϕ ≡ δ B [ ξ ] · ∇ ϕ = − √ g (cid:32) ∂ξ α ∂ϑ + ∂ξ ψ ∂ψ (cid:33) . (P.1c)We also express the current density in the contravariant basis as, J = J ϑ ∂ x ∂ϑ + J ϕ ∂ x ∂ϕ . (P.2)The first term in the energy functional is expressed as, W ≡ − µ (cid:90) V P d x δ B [ ξ ] · δ B [ ξ ] (P.3)= − µ (cid:90) V P d x (cid:34) (cid:16) Q ψ (cid:17) g ψψ + (cid:16) Q ϑ (cid:17) g ϑϑ + ( Q ϕ ) g ϕϕ + 2 Q ψ Q ϑ g ψϑ (cid:35) , where g x i x j = ∂ x /∂x i · ∂ x /∂x j are the metric coefficients. Here we have assumed that ϕ = φ ,the geometric toroidal angle, such that g ϑϕ = g ψϕ = 0.The second term in the energy functional is expressed as, W ≡ (cid:90) V P d x ξ · J × δ B [ ξ ] (P.4)= (cid:90) V P d x √ g (cid:18) ξ ψ (cid:16) J ϑ Q ϕ − J ϕ Q ϑ (cid:17) + Q ψ (cid:16) ξ ϑ J ϕ − ξ ϕ J ϑ (cid:17)(cid:19) . Here we can note that the radial component of MHD force balance yields p (cid:48) ( ψ ) = J ϑ − ι ( ψ ) J ϕ W = (cid:90) V P d x √ g (cid:18) ξ ψ (cid:16) J ϑ Q ϕ − J ϕ Q ϑ (cid:17) + Q ψ (cid:0) ξ α J ϕ − p (cid:48) ( ψ ) ξ ϕ (cid:1)(cid:19) . (P.5)The third term in the energy functional can be expressed as, W ≡ (cid:90) V P d x ξ · ∇ ( ξ · ∇ p ) (P.6)= (cid:90) V P d x ξ ψ ∂ ( ξ ψ p (cid:48) ( ψ )) ∂ψ + p (cid:48) ( ψ ) (cid:32) ξ α ∂ξ ψ ∂ϑ + √ gQ ψ ξ ϕ (cid:33) . Combining W and W , we see that the energy functional indeed only depends on ξ α and ξ ψ , W + W = (cid:90) V P d x (cid:32) √ gξ ψ (cid:16) J ϑ Q ϕ − J ϕ Q ϑ (cid:17) + ξ α J · ∇ ξ ψ + ξ ψ ∂ ( ξ ψ p (cid:48) ( ψ )) ∂ψ (cid:33) . 
(P.7)We now can apply the divegernce theorem, noting that ∇ · J = J · ∇ ψ = 0, to obtain, W + W = (cid:90) V P d x (cid:32) ξ ψ (cid:16) J ϕ ι (cid:48) ( ψ ) ξ ψ − J · ∇ ξ α + ξ ψ p (cid:48)(cid:48) ( ψ ) (cid:17) (cid:33) . (P.8)We now see that the first three terms of the energy functional only depend on ξ α throughits ϑ and ϕ derivatives. Furthermore, given the restriction of δF α discussed in Appendix Q,the m = 0, n = 0 mode of ξ α will not enter the variational principle.191 .2 Explicit forms of coefficient matrices We can now express the linear operators that couple the Fourier components of ξ α , ξ ψ ,and ∂ξ ψ /∂ψ given the simplifications of the energy functional in the previous Section: A ψ (cid:48) ψ (cid:48) = − V (cid:48) ( ψ ) µ (cid:42) (cid:0) √ g (cid:1) (cid:0) g ϕϕ + ι ( ψ ) g ϑϑ (cid:1) F ψ F ψ (cid:43) ψ (P.9a) A ψψ = V (cid:48) ( ψ ) µ (cid:42) (cid:0) √ g (cid:1) (cid:34) − g ψψ (cid:32) ∂ F ψ ∂ϕ ∂ F ψ ∂ϕ + ι ( ψ ) ∂ F ψ ∂ϑ ∂ F ψ ∂ϑ (cid:33) (P.9b) − g ψψ ι ( ψ ) (cid:32) ∂ F ψ ∂ϑ ∂ F ψ ∂ϕ + ∂ F ψ ∂ϕ ∂ F ψ ∂ϑ (cid:33) + (cid:16) µ (cid:0) √ g (cid:1) (cid:0) J ϕ ι (cid:48) ( ψ ) + p (cid:48)(cid:48) ( ψ ) (cid:1) − g ϑϑ (cid:0) ι (cid:48) ( ψ ) (cid:1) (cid:17) F ψ F ψ + g ψϑ ι (cid:48) ( ψ ) (cid:32) ∂ F ψ ∂ϕ + ι ( ψ ) ∂ F ψ ∂ϑ (cid:33) F ψ + F ψ (cid:32) ∂ F ψ ∂ϕ + ι ( ψ ) ∂ F ψ ∂ϑ (cid:33) (cid:35)(cid:43) ψ A ψψ (cid:48) = V (cid:48) ( ψ ) µ (cid:42) ι ( ψ ) (cid:0) √ g (cid:1) (cid:34) − F ψ g ϑϑ ι (cid:48) ( ψ ) + g ψϑ (cid:32) ∂ F ψ ∂ϕ + ι ( ψ ) ∂ F ψ ∂ϑ (cid:33) (cid:35) F ψ (cid:43) ψ (P.9c) A αα = − V (cid:48) ( ψ ) µ (cid:42) (cid:0) √ g (cid:1) (cid:34) g ϑϑ ∂ F α ∂ϕ ∂ F α ∂ϕ + g ϕϕ ∂ F α ∂ϑ ∂ F α ∂ϑ (cid:35)(cid:43) ψ (P.9d) A αψ (cid:48) = 2 V (cid:48) ( ψ ) µ (cid:42) (cid:0) √ g (cid:1) (cid:34) g ϑϑ ι ∂ F α ∂ϕ − g ϕϕ ∂ F α ∂ϑ (cid:35) F ψ (cid:43) ψ (P.9e) A αψ = − V (cid:48) ( ψ ) µ (cid:42) (cid:0) √ g (cid:1) (cid:34) (cid:18) − g ϑϑ ι (cid:48) ( ψ ) ∂ F α ∂ϕ + µ (cid:0) √ g (cid:1) J · ∇ F α (cid:19) F ψ (P.9f)+ g ψϑ ∂ F α 
∂ϕ (cid:32) ∂ F ψ ∂ϕ + ι ( ψ ) ∂ F ψ ∂ϑ (cid:33) (cid:35)(cid:43) ψ I ψ = 2 V (cid:48) ( ψ ) (cid:68) F ψ δF ψ (cid:69) ψ (P.9g) I α = 2 V (cid:48) ( ψ ) (cid:104) F α δF α (cid:105) ψ , (P.9h)where (cid:104) ... (cid:105) ψ is the flux-surface average (A.10). P.3 Invertibility of A αα Obtaining the Euler-Lagrange solution for ξ α requires inverting A αα . We now show thatthis matrix is, in fact, negative definite and thus invertible. For any non-zero vector Ξ α , we192an write the inner product with A αα as, Ξ α · ( A αα Ξ α ) = − µ (cid:90) π dϑ (cid:90) π dϕ (cid:34) g ϑϑ √ g (cid:18) ∂ξ α ∂ϕ (cid:19) + g ϕϕ √ g (cid:18) ∂ξ α ∂ϑ (cid:19) (cid:35) . (P.10)We note that for a well-defined coordinate system, g ϑϑ > g ϕϕ >
0, and √ g >
0. Whileeither ∂ξ α /∂ϕ or ∂ξ α /∂ϑ may vanish, they will not vanish simultaneously throughout theintegrand as we have excluded the n = 0, m = 0 mode. Therefore, the integrand will onlyvanish at isolated points. Thus the above integral is negative definite, and A αα is invertiblethroughout the volume. 193 ppendix Q: Constraint on bulk force perturbation As shown in Appendix P, the first three terms in the energy functional (6.11) only dependon ξ α through its derivatives with respect to ϑ and ϕ . In this Appendix, we show that itis always possible to choose the in-surface component of the bulk force perturbation, δF α ,such that the final term in the energy functional, W ≡ (cid:90) V P d x ξ α δF α , (Q.1)does not depend on ξ αc , = π ) (cid:82) π dϑ (cid:82) π dϕ ξ α . As ξ αc , does not enter our variationalprinciple, we can take it to vanish. The condition that ξ αc , does not enter W is equivalentto requiring that, (cid:104) δF α (cid:105) ψ = 0 , (Q.2)on every surface, where (cid:104) . . . (cid:105) ψ is the flux-surface average (A.10). This follows from thesurface-averaged in-surface component of the linearized force-balance equation (6.2), (cid:28) ∂ x ∂ϑ · F [ ξ ] (cid:29) ψ = 0 . (Q.3)This property of the MHD force operator holds for any equilibrium field that satisfies MHDforce balance (6.1). To see this we note that the flux-surface average can be defined in termsof an average over the infinitesimal volume between flux surfaces ∆ V (A.12). We can nowapply the self-adjointness relation (6.9) to simplify (Q.3), (cid:28) ∂ x ∂ϑ · F [ ξ ] (cid:29) ψ = (cid:42) ξ · F (cid:20) ∂ x ∂ϑ (cid:21)(cid:43) ψ + lim ∆ V → µ ∆ V (cid:32)(cid:90) ∂ ( V P +∆ V ) d x ˆ n · ξ B · δ B (cid:20) ∂ x ∂ϑ (cid:21) − (cid:90) ∂ ( V P ) d x ˆ n · ξ B · δ B (cid:20) ∂ x ∂ϑ (cid:21)(cid:33) , (Q.4)where we have noted that ˆ n · ∂ x ∂ϑ = 0, as ˆ n ∝ ∇ ψ . 
The quantity δ B (cid:2) ∂ x /∂ϑ (cid:3) = ∇ × (cid:0) ∂ x /∂ϑ × B (cid:1) is shown to vanish by expressing B in contravariant form and using the dualrelations (A.3) between the contravariant and covariant basis vectors. The remaining flux-194urface averaged term can also be shown to vanish, (cid:42) ξ · F (cid:20) ∂ x ∂ϑ (cid:21)(cid:43) ψ = (cid:42) ξ · J × δ B (cid:20) ∂ x ∂ϑ (cid:21) + (cid:18) ∇ × δ B (cid:104) ∂ x ∂ϑ (cid:105)(cid:19) × B µ + ∇ (cid:18) ∂ x ∂ϑ · ∇ p (cid:19)(cid:43) ψ , (Q.5)as ∂ x ∂ϕ · ∇ ψ = 0 and δ B (cid:2) ∂ x /∂ϑ (cid:3) = 0.Therefore, we see that in order to satisfy linear force balance, δF α must be chosen tosatisfy the condition (Q.2). However, this property can always be imparted on a bulk forcearising from the adjoint formulation. Consider the fixed-boundary adjoint relation (5.36)without perturbations to the rotational transform, (cid:90) V P d x ( ξ · F − ξ · F ) − µ (cid:90) S P d x ˆ n · (cid:0) ξ δ B [ ξ ] · B − ξ δ B [ ξ ] · B (cid:1) = 0 . (Q.6)As δ B [ ξ ] does not depend on ξ αc , , we can choose to define the displacement vector such that ξ αc , = 0. This is analogous to our convention that ξ · B = 0, as δ B [ ξ ] does not depend onthe parallel component of ξ . Given this convention for the displacement vector, we can notethat (cid:104) δF α, (cid:105) ψ and (cid:104) δF α, (cid:105) ψ do not enter the above adjoint relation. Therefore, we are free tochoose our bulk force such that the desired constraint (Q.2) is satisfied.195 ppendix R: Near-axis expansion of screw pinch equilibria The MHD force-balance equation for a screw pinch is, ddr (cid:18) µ p ( r ) + 12 r (cid:0) ψ (cid:48) ( r ) (cid:1) (cid:19) + ι ( r ) ψ (cid:48) ( r ) R r ddr (cid:0) rι ( r ) ψ (cid:48) ( r ) (cid:1) = 0 . (R.1)We note that (R.1) remains unchanged under the transformation r → − r , so ψ ( r ) must beeven in r . 
Thus near the origin we can express the flux function as

$$\psi(r) = \frac{\psi_2 r^2}{2} + \frac{\psi_4 r^4}{4!} + \frac{\psi_6 r^6}{6!} + O(r^8), \tag{R.2}$$

under the assumption that $\psi(0) = 0$. We similarly express the rotational transform and pressure profiles in a power series near the axis,

$$\iota(\psi(r)) = \iota_0 + \iota_1\psi(r) + \iota_2\psi(r)^2 + \iota_3\psi(r)^3 + O(\psi(r)^4) \tag{R.3a}$$

$$p(\psi(r)) = p_0 + p_1\psi(r) + p_2\psi(r)^2 + p_3\psi(r)^3 + O(\psi(r)^4). \tag{R.3b}$$

The force-balance equation to $O(r)$ becomes

$$\mu_0 p_1\psi_2 + \frac{2\iota_0^2\psi_2^2}{R^2} + \frac{\psi_2\psi_4}{3} = 0, \tag{R.4}$$

and to $O(r^3)$ it is

$$\mu_0 p_2\psi_2^2 + \frac{3\iota_0\iota_1\psi_2^3}{R^2} + \frac{\mu_0 p_1\psi_4}{6} + \frac{\iota_0^2\psi_2\psi_4}{R^2} + \frac{\psi_4^2}{18} + \frac{\psi_2\psi_6}{30} = 0. \tag{R.5}$$

In order to determine the power series expansion of $\psi$, we match the solution near the axis with a numerical solution for $\psi(r)$ at some chosen boundary location near the axis, $r_b$. To perform an expansion to $O(r^2)$, $\psi_2$ is chosen such that

$$\psi_2 = \frac{2\psi(r_b)}{r_b^2}. \tag{R.6}$$

To perform an expansion to $O(r^4)$, (R.4) is used to express $\psi_4$ in terms of $\psi_2$, and $\psi_2$ is chosen such that $\psi_2 r_b^2/2 + \psi_4 r_b^4/4! = \psi(r_b)$,

$$\psi_2 = \frac{-\mu_0 p_1 r_b^4 - 8\psi(r_b)}{2 r_b^2\left(\dfrac{r_b^2\iota_0^2}{R^2} - 2\right)} \tag{R.7a}$$

$$\psi_4 = -3\left(\mu_0 p_1 + \frac{2\iota_0^2\psi_2}{R^2}\right). \tag{R.7b}$$

To perform an expansion to $O(r^6)$, (R.4) and (R.5) are used to express $\psi_4$ and $\psi_6$ in terms of $\psi_2$, and $\psi_2$ is chosen such that $\psi_2 r_b^2/2 + \psi_4 r_b^4/4! + \psi_6 r_b^6/6! = \psi(r_b)$. The resulting equation for $\psi_2$ is quadratic, but only one solution is allowed in practice to ensure that $(\psi_6 r_b^6/6!)/(\psi_2 r_b^2/2 + \psi_4 r_b^4/4!) \sim r_b^4$ in the limit that $r_b \ll 1$,

$$\begin{aligned}
\psi_2 = -\frac{R^2}{12\,\iota_0\iota_1 r_b^6}\Bigg( & -24 r_b^2 + \frac{12 r_b^4\iota_0^2}{R^2} + r_b^6\left(2\mu_0 p_2 - \frac{8\iota_0^4}{R^4}\right) \\
& + r_b^2\Bigg[\left(-24 + 2\mu_0 p_2 r_b^4 + \frac{12 r_b^2\iota_0^2}{R^2} - \frac{8 r_b^4\iota_0^4}{R^4}\right)^2 \\
& \qquad + \frac{48 r_b^2\iota_0\iota_1}{R^2}\left(\mu_0 p_1 r_b^4\left(-3 + \frac{2 r_b^2\iota_0^2}{R^2}\right) - 24\psi(r_b)\right)\Bigg]^{1/2}\Bigg)
\end{aligned} \tag{R.8a}$$

$$\psi_4 = -3\left(\mu_0 p_1 + \frac{2\iota_0^2}{R^2}\psi_2\right) \tag{R.8b}$$

$$\psi_6 = 15\left(\frac{4\mu_0 p_1\iota_0^2}{R^2} - 2\mu_0 p_2\psi_2 + \frac{8\iota_0^4\psi_2}{R^4} - \frac{6\iota_0\iota_1\psi_2^2}{R^2}\right). \tag{R.8c}$$

We compare the resulting solution for $\psi$ to a numerical solution of (R.1) using MATLAB's bvp4c routine. The solution is computed for $r \in [0, 1]$ with boundary conditions $\psi(0) = 0$ and a prescribed value of $\psi(1)$. The same profiles are used as described in Section 6.3.1. The axis expansion solution is matched with the numerical solution at $r_b = 10^{-\ldots}$. In Figure R.1 we present a comparison between the numerical solution and axis expansion of $\psi(r)$. As expected, the error in the axis expansion to $O(r^p)$ scales as $\sim |r - r_b|^{p+2}$ as one moves away from $r = r_b$.

Figure R.1: (a) The axis expansion solutions to $O(r^2)$, $O(r^4)$, and $O(r^6)$ are compared with the numerical solution of $\psi(r)$ near the axis. (b) The absolute error in the expansion is shown, $\left|\sum_n \psi_n r^n/n! - \psi(r)\right|$, where $\psi(r)$ is the numerical solution. As expected, the error in the axis expansion to $O(r^p)$ scales as $|r - r_b|^{p+2}$ near $r = r_b$.

Bibliography

[1] Princeton plasma physics laboratory - timeline. URL . date accessed: 01/03/2019.
[2] I. Abel, G. Plunk, E. Wang, M. Barnes, S. Cowley, W. Dorland, and A. Schekochihin. Multiscale gyrokinetics for rotating tokamak plasmas: fluctuations, transport and energy flows. Reports on Progress in Physics, 76(11):116201, 2013.
[3] A. Alexanderian, N. Petra, G. Stadler, and O. Ghattas. Mean-variance risk-averse optimal control of systems governed by PDEs with random parameter fields using quadratic approximations.
SIAM/ASA Journal on Uncertainty Quantification, 5(1):1166–1192, 2017.
[4] G. Allaire. A review of adjoint methods for sensitivity analysis, uncertainty quantification and optimization in numerical codes. Ingénieurs de l'Automobile, 836:33, 2015.
[5] A. F. Almagri, D. T. Anderson, and S. F. B. Anderson. Design and construction of HSX: A helically symmetric stellarator. In Helical System Research. 1998.
[6] D. Anderson. Personal communication, 9 2019.
[7] D. V. Anderson, W. Cooper, R. Gruber, S. Merazzi, and U. Schwenn. Methods for the efficient calculation of the (MHD) magnetohydrodynamic stability properties of magnetically confined fusion plasmas. The International Journal of Supercomputing Applications, 4(3):34, 1990.
[8] F. S. B. Anderson, A. F. Almagri, D. T. Anderson, P. G. Matthews, J. N. Talmadge, and J. L. Shohet. The Helically Symmetric eXperiment, (HSX) goals, design and status. Fusion Technology, 27(3T):273–277, 1995.
[9] T. Antonsen and Y. Lee. Electrostatic modification of variational principles for anisotropic plasmas. Physics of Fluids, 25(1):132, 1982.
[10] T. Antonsen, E. J. Paul, and M. Landreman. Adjoint approach to calculating shape gradients for three-dimensional magnetic confinement equilibria. Journal of Plasma Physics, 85(2), 2019.
[11] H. F. Arnoldus. Conservation of charge at an interface. Optics Communications, 265(1):52–59, 2006.
[12] A. Bader, M. Drevlak, D. Anderson, B. Faber, C. Hegna, K. Likin, J. Schmitt, and J. Talmadge. Stellarator equilibria with reactor relevant energetic particle losses.
Journal of Plasma Physics, 85(5), 2019.
[13] M. Barnes, I. Abel, W. Dorland, T. Görler, G. Hammett, and F. Jenko. Direct multiscale coupling of a transport code to gyrokinetic turbulence codes. Physics of Plasmas, 17(5):056109, 2010.
[14] F. Bauer, O. Betancourt, and P. Garabedian. A Computational Method in Plasma Physics. Springer Science & Business Media, 2012.
[15] C. Beidler, G. Grieger, F. Herrnegger, E. Harmeyer, W. Lotz, H. Maassberg, P. Merkel, J. Nührenberg, F. Rau, J. Sapper, F. Sardei, R. Scardovelli, A. Schlüter, and H. Wobig. Physics and engineering design for Wendelstein VII-X. Fusion Technology, 17(1):148, 1990.
[16] C. Beidler, K. Allmaier, M. Y. Isaev, S. Kasilov, W. Kernbichler, G. Leitold, H. Maassberg, D. Mikkelsen, S. Murakami, M. Schmidt, et al. Benchmarking of the mono-energetic transport coefficients: results from the International Collaboration on Neoclassical Transport in Stellarators (ICNTS). Nuclear Fusion, 51(7):076001, 2011.
[17] C. D. Beidler and W. D. D’haeseleer. A general solution of the ripple-averaged kinetic equation (GSRAKE). Plasma Physics and Controlled Fusion, 37(4):463, 1995.
[18] E. A. Belli and J. Candy. Neoclassical transport in toroidal plasmas with nonaxisymmetric flux surfaces. Plasma Physics and Controlled Fusion, 57(5):054012, 2015.
[19] E. Berkl et al. Plasma physics and controlled nuclear fusion research 1968. In Proceedings of the 3rd International Conference Novosibirsk, volume 1, 1968.
[20] I. Bernstein, E. Frieman, M. Kruskal, and R. Kulsrud. An energy principle for hydromagnetic stability problems. Proceedings of the Royal Society A, 244(1236):17, 1958.
[21] A. Boozer. Plasma equilibrium with rational magnetic surfaces. The Physics of Fluids, 24(11):1999, 1981.
[22] A. Boozer. Quasi-helical symmetry in stellarators. Plasma Physics and Controlled Fusion, 37(11A):A103, 1995.
[23] A. H. Boozer. Guiding center drift equations. The Physics of Fluids, 23(5):904, 1980.
[24] A. H. Boozer. Transport and isomorphic equilibria. The Physics of Fluids, 26(2):496, 1983.
[25] A. H. Boozer. Stellarator coil optimization by targeting the plasma configuration. Physics of Plasmas, 7(8):3378, 2000.
[26] A. H. Boozer. Non-axisymmetric magnetic fields and toroidal plasma confinement.
Nuclear Fusion, 55(2):025001, 2015.
[27] A. H. Boozer. Stellarators as a fast path to fusion energy. arXiv preprint arXiv:1912.06289, 2019.
[28] A. H. Boozer and C. Nührenberg. Perturbed plasma equilibria. Physics of Plasmas, 13(10):102501, 2006.
[29] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[30] R. P. Brent. Algorithms for Minimization Without Derivatives. Courier Corporation, 2013.
[31] A. Brooks and W. Reiersen. Coil tolerance impact on plasma surface quality for NCSX. In , page 553. IEEE, 2003.
[32] T. Brown, J. Breslau, D. Gates, N. Pomphrey, and A. Zolfaghari. Engineering optimization of stellarator coils lead to improvements in device maintenance. In IEEE 26th Symposium on Fusion Engineering (SOFE), Austin, Texas, 2015.
[33] I. Calvo, F. I. Parra, J. L. Velasco, and J. A. Alonso. The effect of tangential drifts on neoclassical transport in stellarators close to omnigeneity. Plasma Physics and Controlled Fusion, 59(5):055014, 2017.
[34] I. Calvo, J. L. Velasco, F. I. Parra, J. A. Alonso, and J. M. García-Regaña. Electrostatic potential variations on stellarator magnetic surfaces in low collisionality regimes. Journal of Plasma Physics, 84(4), 2018.
[35] J. Canik, D. Anderson, F. Anderson, C. Clark, K. Likin, J. Talmadge, and K. Zhai. Reduced particle and heat transport with quasisymmetry in the Helically Symmetric Experiment. Physics of Plasmas, 14(5):056107, 2007.
[36] A. Carlton-Jones, E. Paul, and W. Dorland. Computing the shape gradient of coil complexity with respect to the plasma boundary with an adjoint method. Bulletin of the American Physical Society, 64, 2019.
[37] B. Carreras, V. Lynch, and A. Ware. Configuration studies for a small-aspect-ratio tokamak stellarator hybrid. Technical report, Oak Ridge National Lab., 1996.
[38] J. R. Cary and J. D. Hanson. Simple method for calculating island widths. Physics of Fluids B: Plasma Physics, 3(4):1006, 1991.
[39] J. R. Cary and S. G. Shasharina. Omnigenity and quasihelicity in helical plasma confinement systems. Physics of Plasmas, 4(9):3323, 1997.
[40] K. K. Choi and N.-H. Kim.
Structural Sensitivity Analysis and Optimization 1: Linear Systems. Springer Science & Business Media, 2006.
[41] E. A. Coddington and N. Levinson. Theory of Ordinary Differential Equations. Tata McGraw-Hill Education, 1955.
[42] J. Connor and R. Hastie. Neoclassical diffusion in an l = 3 stellarator. Physics of Fluids, 17(114):114, 1974.
[43] W. Cooper, S. Hirshman, S. Merazzi, and R. Gruber. 3D magnetohydrodynamic equilibria with anisotropic pressure. Computer Physics Communications, 72(1):1, 1992.
[44] W. Cooper, S. Hirshman, T. Yamaguchi, Y. Narushima, S. Okamura, S. Sakakibara, C. Suzuki, K. Watanabe, H. Yamada, and K. Yamazaki. Three-dimensional anisotropic pressure equilibria that model balanced tangential neutral beam injection effects. Plasma Physics and Controlled Fusion, 47(3):561, 2005.
[45] W. Cooper, J. Graves, S. Hirshman, T. Yamaguchi, Y. Narushima, S. Okamura, S. Sakakibara, C. Suzuki, K. Watanabe, H. Yamada, et al. Anisotropic pressure bi-Maxwellian distribution function model for three-dimensional equilibria. Nuclear Fusion, 46(7):683, 2006.
[46] T. Coor, S. Cunningham, R. Ellis, M. Heald, and A. Kranz. Experiments on the ohmic heating and confinement of plasma in a stellarator. The Physics of Fluids, 1(5):411, 1958.
[47] W. Dekeyser. Optimal Plasma Edge Configurations for Next-Step Fusion Reactors. PhD thesis, Katholieke Universiteit Leuven, 2014.
[48] W. Dekeyser, D. Reiter, and M. Baelmans. Divertor design through shape optimization. Contributions to Plasma Physics, 52(5):544, 2012.
[49] W. Dekeyser, D. Reiter, and M. Baelmans. Automated divertor target design by adjoint shape sensitivity analysis and a one-shot method. Journal of Computational Physics, 278:117, 2014.
[50] W. Dekeyser, D. Reiter, and M. Baelmans. Optimal shape design for divertors. International Journal of Computational Science and Engineering 2, 9(5-6):397, 2014.
[51] W. Dekeyser, D. Reiter, and M. Baelmans. A one shot method for divertor target shape optimization. Proceedings in Applied Mathematics and Mechanics, 14(1):1017, 2014.
[52] M. C. Delfour and J.-P. Zolésio. Shapes and Geometries. Society for Industrial and Applied Mathematics, 2011.
[53] R. Dewar and S. Hudson. Stellarator symmetry. Physica D: Nonlinear Phenomena, 112(1):275–280, 1998.
[54] W. D. D’haeseleer, W. N. Hitchon, J. D. Callen, and J. L. Shohet.
Flux Coordinates and Magnetic Field Structure: A Guide to a Fundamental Tool of Plasma Theory. Springer, 1991.
[55] A. Dinklage, C. Beidler, P. Helander, G. Fuchert, H. Maaßberg, K. Rahbarnia, T. S. Pedersen, Y. Turkin, R. Wolf, A. Alonso, et al. Magnetic configuration effects on the Wendelstein 7-X stellarator. Nature Physics, 14(8):855–860, 2018.
[56] M. Drevlak. Optimization of heterogenous magnet systems. In Proceedings of the 12th International Stellarator Workshop, number P1-17, 1999.
[57] M. Drevlak, F. Brochard, P. Helander, J. Kisslinger, M. Mikhailov, C. Nührenberg, J. Nührenberg, and Y. Turkin. ESTELL: A Quasi-Toroidally Symmetric Stellarator. Contributions to Plasma Physics, 53(6):459, 2013.
[58] M. Drevlak, J. Geiger, P. Helander, and Y. Turkin. Fast particle confinement with optimized coil currents in the W7-X stellarator. Nuclear Fusion, 54(7):073002, 2014.
[59] M. Drevlak, C. Beidler, J. Geiger, P. Helander, and Y. Turkin. Optimisation of stellarator equilibria with ROSE. Nuclear Fusion, 59(1):016010, 2018.
[60] L. El-Guebaly, P. Wilson, D. Henderson, M. Sawan, G. Sviatoslavsky, R. Slaybaugh, B. Kiedrowski, A. Ibrahim, C. Martin, R. Raffray, S. Malang, J. Lyon, L. P. Ku, X. Wang, L. Bromberg, B. Merrill, L. Waganer, F. Najmabadi, and the Aries-CS Team. Designing ARIES-CS Compact Radial Build and Nuclear System: Neutronics, Shielding, and Activation. Fusion Science and Technology, 54:747, 2008.
[61] N. M. Ferraro, J.-K. Park, C. Myers, A. Brooks, S. Gerhardt, J. Menard, S. Munaretto, and M. Reinke. Error field impact on mode locking and divertor heat flux in NSTX-U. Nuclear Fusion, 59(8):086021, 2019.
[62] L. K. Forbes and S. Crozier. A novel target-field method for finite-length magnetic resonance shim coils: I. Zonal shims. Journal of Physics D: Applied Physics, 34:3447, 2001.
[63] L. K. Forbes, M. A. Brideson, and S. Crozier. A Target-Field Method to Design Circular Biplanar Coils for Asymmetric Shim and Gradient Fields. IEEE Transactions on Magnetics, 41(6):2134, 2005.
[64] J. Freidberg. Ideal MHD. Cambridge University Press, 2014.
[65] E. Frieman. Collisional diffusion in nonaxisymmetric toroidal systems. Physics of Fluids, 13(490):490, 1970.
[66] A. Galeev and R. Sagdeev. Theory of Neoclassical Diffusion, volume 7 of Reviews of Plasma Physics, page 257. 1979.
[67] I. M. Gamba. Viscosity approximating solutions to ODE systems that admit shocks, and their limits.
Advances in Applied Mathematics, 15(2):129–182, 1994.
[68] A. Gandini. Importance and sensitivity analysis in assessing system reliability. IEEE Transactions on Reliability, 39(1):61, 1990.
[69] P. Garabedian. Three-dimensional stellarator codes. Proceedings of the National Academy of Sciences, 99(16):10257, 2002.
[70] P. R. Garabedian and G. B. McFadden. Design of the DEMO fusion reactor following ITER. Journal of Research of the National Institute of Standards and Technology, 114(4):229, 2009.
[71] H. Gardner. Modelling the behaviour of the magnetic field diagnostic coils on the W VII-AS stellarator using a three-dimensional equilibrium code. Nuclear Fusion, 30(8):1417, 1990.
[72] D. Gates and L. Delgado-Aparicio. Origin of tokamak density limit scalings. Physical Review Letters, 108(16):165004, 2012.
[73] D. A. Gates, D. Anderson, S. Anderson, M. Zarnstorff, D. A. Spong, H. Weitzner, G. Neilson, D. Ruzic, D. Andruczyk, J. Harris, et al. Stellarator research opportunities: a report of the National Stellarator Coordinating Committee. Journal of Fusion Energy, 37(1):51, 2018.
[74] M. Gavrilović, R. Petrović, and D. Šiljak. Adjoint method in the sensitivity analysis of optimal systems. Journal of the Franklin Institute, 276(1):26, 1963.
[75] J. Geiger, C. Beidler, M. Drevlak, H. Maassberg, C. Nührenberg, Y. Suzuki, and Y. Turkin. Effects of net currents on the magnetic configuration of W7-X. Contributions to Plasma Physics, 50(8):770, 2010.
[76] A. Geraldini and M. Landreman. Optimizing stellarator surfaces using magnetic island width sensitivity. Bulletin of the American Physical Society, 64, 2019.
[77] S. P. Gerhardt, J. N. Talmadge, J. M. Canik, and D. T. Anderson. Measurements and modeling of plasma flow damping in the Helically Symmetric eXperiment. Physics of Plasmas, 12(5):056116, 2005.
[78] M. Giles and N. Pierce. Improved lift and drag estimates using adjoint Euler equations. In , page 3293, 1999.
[79] M. B. Giles and N. A. Pierce. An introduction to the adjoint approach to design. Flow, Turbulence and Combustion, 65(3-4):393, 2000.
[80] A. Glasser. The direct criterion of Newcomb for the ideal MHD stability of an axisymmetric toroidal plasma. Physics of Plasmas, 23(7):072505, 2016.
[81] A. Glasser. DCON for stellarators.
Bulletin of the American Physical Society, 63, 2018.
[82] R. Glowinski and O. Pironneau. On the numerical computation of the minimum-drag profile in laminar flow. Journal of Fluid Mechanics, 72(2):385, 1975.
[83] J. H. Goedbloed and S. Poedts. Principles of Magnetohydrodynamics: With Applications to Laboratory and Astrophysical Plasmas. Cambridge University Press, 2004.
[84] H. Grad. Toroidal containment of a plasma. The Physics of Fluids, 10(1):137, 1967.
[85] J. Greene. A brief review of magnetic wells. Comments on Plasma Physics and Controlled Fusion, 17:389, 1997.
[86] G. Grieger, W. Lotz, P. Merkel, J. Nührenberg, J. Sapper, E. Strumberger, H. Wobig, R. Burhenn, V. Erckmann, U. Gasparino, et al. Physics optimization of stellarators. Physics of Fluids B: Plasma Physics, 4(7):2081, 1992.
[87] J. Hadamard. Mémoire sur le problème d’analyse relatif à l’équilibre des plaques élastiques encastrées, volume 33. Imprimerie Nationale, 1908.
[88] K. Hammond, A. Anichowski, P. Brenner, T. S. Pedersen, S. Raftopoulos, P. Traverso, and F. Volpe. Experimental and numerical study of error fields in the CNT stellarator. Plasma Physics and Controlled Fusion, 58(7):074002, 2016.
[89] J. D. Hanson, D. Anderson, M. Cianciosa, P. Franz, J. Harris, G. Hartwell, S. P. Hirshman, S. F. Knowlton, L. L. Lao, E. A. Lazarus, et al. Non-axisymmetric equilibrium reconstruction for stellarators, reversed field pinches and tokamaks. Nuclear Fusion, 53(8):083016, 2013.
[90] K. Harafuji, T. Hayashi, and T. Sato. Computational study of three-dimensional magnetohydrodynamic equilibria in toroidal helical systems. Journal of Computational Physics, 81(1):169, 1989.
[91] J. Haslinger and R. A. Mäkinen. Introduction to Shape Optimization: Theory, Approximation, and Computation. Society for Industrial and Applied Mathematics, 2003.
[92] D. Hastings, W. Houlberg, and K.-C. Shaing. The ambipolar electric field in stellarators. Nuclear Fusion, 25(4):445, 1985.
[93] R. Hawryluk and H. Zohm. The challenge and promise of studying burning plasmas. Physics Today, 72(12):34, 2019.
[94] R. D. Hazeltine. Recursive derivation of drift-kinetic equation. Plasma Physics, 15(1):77, 1973.
[95] C. C. Hegna and N. Nakajima. On the stability of Mercier and ballooning modes in stellarator configurations. Physics of Plasmas, 5(5):1336, 1998.
[96] C. C. Hegna, P. W. Terry, and B. J. Faber. Theory of ITG turbulent saturation in stellarators: identifying mechanisms to reduce turbulent transport.
Physics of Plasmas, 25(2):022511, 2018.
[97] P. Helander. Theory of plasma confinement in non-axisymmetric magnetic fields. Reports on Progress in Physics, 77(8):087001, 2014.
[98] P. Helander and J. Nührenberg. Bootstrap current and neoclassical transport in quasi-isodynamic stellarators. Plasma Physics and Controlled Fusion, 51(5):055004, 2009.
[99] P. Helander and D. J. Sigmar. Collisional Transport in Magnetized Plasmas. Cambridge University Press, 2005.
[100] P. Helander and A. Simakov. Intrinsic ambipolarity and rotation in stellarators. Physical Review Letters, 101(14):145003, 2008.
[101] P. Helander, C. Beidler, T. Bird, M. Drevlak, Y. Feng, R. Hatzky, F. Jenko, R. Kleiber, J. Proll, Y. Turkin, et al. Stellarator and tokamak plasmas: a comparison. Plasma Physics and Controlled Fusion, 54(12):124009, 2012.
[102] P. Helander, F. Parra, and S. Newton. Stellarator bootstrap current and plasma flow velocity at low collisionality. Journal of Plasma Physics, 83(2), 2017.
[103] P. Helander, M. Drevlak, M. Zarnstorff, and S. Cowley. Stellarators with permanent magnets. Physical Review Letters, 124(9):095001, 2020.
[104] T. Hender, J. Wesley, J. Bialek, A. Bondeson, A. Boozer, R. Buttery, A. Garofalo, T. Goodman, R. Granetz, Y. Gribov, et al. MHD stability, operational limits and disruptions. Nuclear Fusion, 47(6):S128, 2007.
[105] S. Henneberg, M. Drevlak, and P. Helander. Improving fast-particle confinement in quasi-axisymmetric stellarator optimization. Plasma Physics and Controlled Fusion, 62(1):014023, 2019.
[106] S. Henneberg, M. Drevlak, C. Nührenberg, C. Beidler, Y. Turkin, J. Loizu, and P. Helander. Properties of a new quasi-axisymmetric configuration. Nuclear Fusion, 59(2):026014, 2019.
[107] E. Highcock, N. Mandell, M. Barnes, and W. Dorland. Optimisation of confinement in a fusion reactor using a nonlinear turbulence model. Journal of Plasma Physics, 84(2), 2018.
[108] M. Hirsch, J. Baldzuhn, C. Beidler, R. Brakel, R. Burhenn, A. Dinklage, H. Ehmler, M. Endler, V. Erckmann, Y. Feng, et al. Major results from the stellarator Wendelstein 7-AS. Plasma Physics and Controlled Fusion, 50(5):053001, 2008.
[109] S. P. Hirshman and J. Breslau. Explicit spectrally optimized Fourier series for nested magnetic surfaces. Physics of Plasmas, 5:2664, 1998.
[110] S. P. Hirshman and H. K. Meier. Optimized Fourier representations for three dimensional magnetic surfaces.
Physics of Fluids, 28:1387, 1985.
[111] S. P. Hirshman and J. C. Whitson. Steepest-descent moment method for three-dimensional magnetohydrodynamic equilibria. Physics of Fluids, 26(12):3553, 1983.
[112] S. P. Hirshman, K. C. Shaing, and W. I. van Rij. Consequences of time-reversal symmetry for the electric field scaling of transport in stellarators. Physical Review Letters, 56(16):1697, 1986.
[113] S. P. Hirshman, K. C. Shaing, W. I. van Rij, C. O. Beasley, and E. C. Crume. Plasma transport coefficients for nonsymmetric toroidal confinement systems. Physics of Fluids, 29(9):2951, 1986.
[114] S. P. Hirshman, D. A. Spong, J. C. Whitson, B. Nelson, D. B. Batchelor, J. F. Lyon, R. Sanchez, A. Brooks, G. Y.-Fu, R. J. Goldston, et al. Physics of compact stellarators. Physics of Plasmas, 6(5):1858, 1999.
[115] S. P. Hirshman, R. Sanchez, and C. Cook. SIESTA: A scalable iterative equilibrium solver for toroidal applications. Physics of Plasmas, 18(6):062504, 2011.
[116] D.-M. Ho and R. Kulsrud. Neoclassical transport in stellarators. Physics of Fluids, 30(2):442, 1987.
[117] J. Hofmann, J. Baldzuhn, R. Brakel, Y. Feng, S. Fiedler, J. Geiger, P. Grigull, G. Herre, R. Jaenicke, M. Kick, et al. Stellarator optimization studies in W7-AS. Plasma Physics and Controlled Fusion, 38(12A):A193, 1996.
[118] S. Hudson, C. Zhu, D. Pfefferlé, and L. Gunderson. Differentiating the shape of stellarator coils with respect to the plasma boundary. Physics Letters A, 382(38):2732, 2018.
[119] S. R. Hudson, D. Monticello, A. Reiman, A. Boozer, D. Strickler, S. Hirshman, and M. Zarnstorff. Eliminating islands in high-pressure free-boundary stellarator magnetohydrodynamic equilibrium solutions. Physical Review Letters, 89(27):275003, 2002.
[120] S. R. Hudson, R. Dewar, M. Hole, and M. McGann. Non-axisymmetric, multi-region relaxed magnetohydrodynamic equilibrium solutions. Plasma Physics and Controlled Fusion, 54(1):014005, 2011.
[121] L.-M. Imbert-Gerard, E. Paul, and A. Wright. An introduction to symmetries in stellarators. arXiv preprint arXiv:1908.05360, 2019.
[122] M. Y. Isaev, J. Nührenberg, M. Mikhailov, W. Cooper, K. Watanabe, M. Yokoyama, K. Yamazaki, A. Subbotin, and V. Shafranov. A new class of quasi-omnigenous configurations. Nuclear Fusion, 43(10):1066, 2003.
[123] A. Jameson, L. Martinelli, and N. Pierce. Optimum aerodynamic design using the Navier-Stokes equations.
Theoretical and Computational Fluid Dynamics, 10(1-4):213, 1998.
[124] F. Jia, Z. Liu, M. Zaitsev, J. Hennig, and J. G. Korvink. Design multiple-layer gradient coils using least-squares finite element method. Structural and Multidisciplinary Optimization, 49(3):523, 2014.
[125] S. G. Johnson. The NLopt nonlinear-optimization package, May 2014. URL http://ab-initio.mit.edu/nlopt.
[126] H. J. Kelley. Gradient theory of optimal flight paths. American Rocket Society Journal, 30(10):947, 1960.
[127] W. Kernbichler, S. Kasilov, G. Kapper, A. F. Martitsch, V. Nemov, C. Albert, and M. Heyn. Solution of drift kinetic equation in stellarators and tokamaks with broken symmetry using the code NEO-2. Plasma Physics and Controlled Fusion, 58(10):104001, 2016.
[128] J. Kierzenka and L. F. Shampine. A BVP solver based on residual control and the MATLAB PSE. ACM Transactions on Mathematical Software (TOMS), 27(3):299–316, 2001.
[129] J. Kisslinger, C. Beidler, E. Harmeyer, F. Herrnegger, H. Wobig, and W. Maurer. Coil system of a Helias reactor. Technical report, 1999.
[130] T. Klinger, C. Baylard, C. Beidler, J. Boscary, H. Bosch, A. Dinklage, D. Hartmann, P. Helander, H. Maßberg, A. Peacock, et al. Towards assembly completion and preparation of experimental campaigns of Wendelstein 7-X in the perspective of a path to a stellarator fusion power plant. Fusion Engineering and Design, 88(6-8):461, 2013.
[131] R. Kress, V. Maz’ya, and V. Kozlov. Linear Integral Equations, volume 82. Springer, 1989.
[132] J. A. Krommes and G. Hu. The role of dissipation in the theory and simulations of homogeneous plasma turbulence, and resolution of the entropy paradox. Physics of Plasmas, 1(10):3211, 1994.
[133] M. D. Kruskal and R. Kulsrud. Equilibrium of a magnetically confined plasma in a toroid. The Physics of Fluids, 1(4):265, 1958.
[134] L. Ku, P. Garabedian, J. Lyon, A. Turnbull, A. Grossman, T. Mau, M. Zarnstorff, and A. Team. Physics design for ARIES-CS. Fusion Science and Technology, 54(3):673, 2008.
[135] L. P. Ku and A. H. Boozer. New classes of quasi-helically symmetric stellarators. Nuclear Fusion, 51:013004, 2011.
[136] M. Landreman. An improved current potential method for fast computation of stellarator coil shapes.
Nuclear Fusion, 57(4):046003, 2017.
[137] M. Landreman and A. H. Boozer. Efficient magnetic fields for supporting toroidal plasmas. Physics of Plasmas, 23(3):032506, 2016.
[138] M. Landreman and E. J. Paul. Computing local sensitivity and tolerances for stellarator physics properties using shape gradients. Nuclear Fusion, 58(7):076023, 2018.
[139] M. Landreman and W. Sengupta. Direct construction of optimized stellarator shapes. Part 1. Theory in cylindrical coordinates. Journal of Plasma Physics, 84(6), 2018.
[140] M. Landreman, H. M. Smith, A. Mollén, and P. Helander. Comparison of particle trajectories and collision operators for collisional transport in nonaxisymmetric plasmas. Physics of Plasmas, 21(4), 2014.
[141] M. Landreman, G. G. Plunk, and W. Dorland. Generalized universal instability: transient linear amplification and subcritical turbulence. Journal of Plasma Physics, 81(5), 2015.
[142] M. Landreman, W. Sengupta, and G. G. Plunk. Direct construction of optimized stellarator shapes. Part 2. Numerical quasisymmetric solutions. Journal of Plasma Physics, 85(1), 2019.
[143] S. Lazerson. The virtual-casing principle for 3D toroidal systems. Plasma Physics and Controlled Fusion, 54(12):122002, 2012.
[144] S. A. Lazerson, J. Loizu, S. Hirshman, and S. R. Hudson. Verification of the ideal magnetohydrodynamic response at rational surfaces in the VMEC code. Physics of Plasmas, 23(1):012507, 2016.
[145] L. G. Leal. Advanced Transport Phenomena: Fluid Mechanics and Convective Transport Processes. Cambridge University Press, 2007.
[146] S. J. Leary, A. Bhaskar, and A. J. Keane. A derivative based surrogate model for approximating and optimizing the output of an expensive computer simulation. Journal of Global Optimization, 30(1):39–58, 2004.
[147] D. Lee, J. Harris, and G. Lee. Magnetic island widths due to field perturbations in toroidal stellarators. Nuclear Fusion, 30(10):2177, 1990.
[148] C. Liu, D. P. Brennan, A. Bhattacharjee, and A. H. Boozer. Adjoint Fokker-Planck equation and runaway electron dynamics. Physics of Plasmas, 23(1):010702, 2016.
[149] H. Liu, A. Shimizu, M. Isobe, S. Okamura, S. Nishimura, C. Suzuki, Y. Xu, X. Zhang, B. Liu, J. Huang, et al. Magnetic configuration and modular coil design for the Chinese First Quasi-Axisymmetric Stellarator. Plasma and Fusion Research, 13:3405067, 2018.
[150] J.-F. Lobsien, M. Drevlak, T. S. Pedersen, et al. Stellarator coil optimization towards higher engineering tolerances.
Nuclear Fusion, 58(10):106013, 2018.
[151] J.-F. Lobsien, M. Drevlak, T. Kruger, S. Lazerson, C. Zhu, and T. S. Pedersen. Improved performance of stellarator coil design optimization. Journal of Plasma Physics, 86(2):815860202, 2020.
[152] N. C. Logan, J.-K. Park, K. Kim, Z. Wang, and J. W. Berkery. Neoclassical toroidal viscosity in perturbed equilibria with general tokamak geometry. Physics of Plasmas, 20(12):122507, 2013.
[153] D. Lortz. The general peeling instability. Nuclear Fusion, 15(1):49, 1975.
[154] M. Drevlak. Automated optimization of stellarator coils. Fusion Technology, 33:106, 1998.
[155] H. Maassberg, W. Lotz, and J. Nührenberg. Neoclassical bootstrap current and transport in optimized stellarator configurations. Physics of Fluids B: Plasma Physics, 5(10):3728, 1993.
[156] G. B. McFadden. An artificial viscosity method for the design of supercritical airfoils. 1979.
[157] C. Mercier and H. Luc. The MHD approach to the problem of plasma confinement in closed magnetic configurations. Lectures in Plasma Physics, Commission of the European Communities, Luxembourg, 1974.
[158] P. Merkel. Solution of stellarator boundary value problems with external currents. Nuclear Fusion, 27(5):867, 1987.
[159] M. Mikhailov, M. Drevlak, J. Nührenberg, and V. Shafranov. Medium-β free-boundary equilibria of a quasi-isodynamic stellarator. Plasma Physics Reports, 38(6):439, 2012.
[160] M. Mikhailov, J. Nührenberg, and R. Zille. Elimination of current sheets at resonances in three-dimensional toroidal ideal-magnetohydrodynamic equilibria. Nuclear Fusion, 59(6):066002, 2019.
[161] W. H. Miner Jr, P. M. Valanju, S. P. Hirshman, A. Brooks, and N. Pomphrey. Use of a genetic algorithm for compact stellarator coil design. Nuclear Fusion, 41(9):1185, 2001.
[162] B. Mohammadi and O. Pironneau. Shape optimization in fluid mechanics. Annual Review of Fluid Mechanics, 36:255, 2004.
[163] S. Murakami, A. Wakasa, H. Maassberg, C. Beidler, H. Yamada, K. Watanabe, L. E. Group, et al. Neoclassical transport optimization of LHD. Nuclear Fusion, 42(11):L19, 2002.
[164] S. Murakami, H. Yamada, M. Sasao, M. Isobe, T. Ozaki, T. Saida, P. Goncharov, J. Lyon, M. Osakabe, T. Seki, et al. Effect of neoclassical transport optimization on energetic ion confinement in LHD.
Fusion Science and Technology, 46(2):241–247, 2004.
[165] H. Mynick. Transport optimization in stellarators. Physics of Plasmas, 13(5):058102, 2006.
[166] F. Najmabadi, A. Raffray, S. Abdel-Khalik, L. Bromberg, L. Crosatti, L. El-Guebaly, P. Garabedian, A. Grossman, D. Henderson, A. Ibrahim, et al. The ARIES-CS compact stellarator fusion power plant. Fusion Science and Technology, 54(3):655, 2008.
[167] B. Nelson, L. Berry, A. Brooks, M. Cole, J. Chrzanowski, H.-M. Fan, P. Fogarty, P. Goranson, P. Heitzenroeder, S. Hirshman, et al. Design of the National Compact Stellarator Experiment (NCSX). Fusion Engineering and Design, 66:169, 2003.
[168] V. Nemov, S. Kasilov, W. Kernbichler, and M. Heyn. Evaluation of 1/ν neoclassical transport in stellarators. Physics of Plasmas, 6(12):4622, 1999.
[169] V. Nemov, S. Kasilov, W. Kernbichler, and G. Leitold. The ∇B drift velocity of trapped particles in stellarators. Physics of Plasmas, 12(11):112507, 2005.
[170] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 2006.
[171] A. A. Novotny and J. Sokolowski. Topological Derivatives in Shape Optimization. Springer, 2013.
[172] C. Nührenberg. Personal communication, 4 2020.
[173] C. Nührenberg and A. H. Boozer. Magnetic islands and perturbed plasma equilibria. Physics of Plasmas, 10(7):2840, 2003.
[174] C. Nührenberg, A. H. Boozer, and S. R. Hudson. Magnetic-surface quality in nonaxisymmetric plasma equilibria. Physical Review Letters, 102(23):235001, 2009.
[175] J. Nührenberg and R. Zille. Quasi-helically symmetric toroidal stellarators. Physics Letters A, 129:113, 1988.
[176] J. Nührenberg, W. Lotz, and S. Gori. Theory of fusion plasmas. In Proceedings of the Joint Varenna-Lausanne International Workshop, page 3, 1994.
[177] L. Onsager. Reciprocal relations in irreversible processes. I. Physical Review, 37(4):405, 1931.
[178] L. Onsager. Reciprocal relations in irreversible processes. II. Physical Review, 38(12):2265, 1931.
[179] S. Osher, R. Fedkiw, and K. Piechor. Level set methods and dynamic implicit surfaces.
Applied Mechanics Review, 57(3):B15, 2004.
[180] C. Othmer. Adjoint methods for car aerodynamics. Journal of Mathematics in Industry, 4(1):6, 2014.
[181] J.-K. Park. Ideal Perturbed Equilibria in Tokamaks. PhD thesis, Princeton University, 2009.
[182] J.-K. Park, A. H. Boozer, and A. H. Glasser. Computation of three-dimensional tokamak and spherical torus equilibria. Physics of Plasmas, 14(5):052110, 2007.
[183] J.-K. Park, M. J. Schaffer, J. E. Menard, and A. H. Boozer. Control of asymmetric magnetic perturbations in tokamaks. Physical Review Letters, 99(19):195003, 2007.
[184] E. J. Paul, M. Landreman, F. M. Poli, D. A. Spong, H. M. Smith, and W. Dorland. Rotation and neoclassical ripple transport in ITER. Nuclear Fusion, 57(11):116044, 2017.
[185] E. J. Paul, M. Landreman, A. Bader, and W. Dorland. An adjoint method for gradient-based optimization of stellarator coil shapes. Nuclear Fusion, 58(7):076015, 2018.
[186] E. J. Paul, I. G. Abel, M. Landreman, and W. Dorland. An adjoint method for neoclassical stellarator optimization. Journal of Plasma Physics, 85(5), 2019.
[187] E. J. Paul, T. Antonsen, M. Landreman, and W. A. Cooper. Adjoint approach to calculating shape gradients for three-dimensional magnetic confinement equilibria. Part 2. Applications. Journal of Plasma Physics, 86(1):905860103, 2020.
[188] T. S. Pedersen, M. Otte, S. Lazerson, P. Helander, S. Bozhenkov, C. Biedermann, T. Klinger, R. C. Wolf, H.-S. Bosch, T. Wendelstein, et al. Confirmation of the topology of the Wendelstein 7-X magnetic field to better than 1:100,000. Nature Communications, 7:13493, 2016.
[189] N. A. Pierce and M. B. Giles. Adjoint and defect error bounding and correction for functional estimates. Journal of Computational Physics, 200:769, 2004.
[190] O. Pironneau. On optimum design in fluid mechanics. Journal of Fluid Mechanics, 64(1):97, 1974.
[191] O. Pironneau. Optimal Shape Design for Elliptic Systems. Springer, 1982.
[192] R. E. Plessix. A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophysical Journal International, 167(2):495, 2006.
[193] G. G. Plunk, M. Landreman, and P. Helander. Direct construction of optimized stellarator shapes. Part 3. Omnigenity near the magnetic axis.
Journal of Plasma Physics, 85(6), 2019.
[194] N. Pomphrey, L. Berry, A. Boozer, A. Brooks, R. Hatcher, S. Hirshman, L.-P. Ku, W. Miner, H. Mynick, W. Reiersen, D. Strickler, and P. Valanju. Innovations in compact stellarator coil design. Nuclear Fusion, 41:339, 2001.
[195] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, 2007.
[196] J. Proll, H. Mynick, P. Xanthopoulos, S. Lazerson, and B. Faber. TEM turbulence optimisation in stellarators.
Plasma Physics and Controlled Fusion, 58(1):014006, 2015.
[197] A. Reiman, G. Fu, S. Hirshman, L. Ku, D. Monticello, H. Mynick, M. Redi, D. Spong, M. Zarnstorff, B. Blackwell, et al. Physics design of a high-β quasi-axisymmetric stellarator.
Plasma Physics and Controlled Fusion, 41(12B):B273, 1999.
[198] M. Rosenbluth, R. Hazeltine, and F. L. Hinton. Plasma transport in toroidal confinement systems. The Physics of Fluids, 15(1):116, 1972.
[199] W. Rudin. Real and Complex Analysis. Tata McGraw-Hill Education, 2006.
[200] N. Rust, B. Heinemann, B. Mendelevitch, A. Peacock, and M. Smirnow. W7-X neutral-beam-injection: Selection of the NBI source positions for experiment start-up. Fusion Engineering and Design, 86(6-8):728, 2011.
[201] S. Sakakibara, K. Watanabe, Y. Suzuki, Y. Narushima, S. Ohdachi, N. Nakajima, F. Watanabe, L. Garcia, A. Weller, K. Toi, et al. MHD study of the reactor-relevant high-beta regime in the Large Helical Device. Plasma Physics and Controlled Fusion, 50(12):124014, 2008.
[202] R. Sanchez, S. Hirshman, A. Ware, L. Berry, and D. Spong. Ballooning stability optimization of low-aspect-ratio stellarators. Plasma Physics and Controlled Fusion, 42(6):641, 2000.
[203] T. Sauer. Numerical Analysis. Pearson, 2012.
[204] C. Schwab. Ideal magnetohydrodynamics: Global mode analysis of three-dimensional plasma configurations. Physics of Fluids B: Plasma Physics, 5(9):3195, 1993.
[205] K.-C. Shaing, E. Crume Jr, J. Tolliver, S. Hirshman, and W. Van Rij. Bootstrap current and parallel viscosity in the low collisionality regime in toroidal plasmas. Physics of Fluids B: Plasma Physics, 1(1):148, 1989.
[206] A. Shimizu, H. Liu, M. Isobe, S. Okamura, S. Nishimura, C. Suzuki, Y. Xu, X. Zhang, J. Liu, B.and Huang, et al. Configuration property of the Chinese First Quasi-Axisymmetric Stellarator. Plasma and Fusion Research, 13:3403123, 2018.
[207] R. Sinclair, J. Hosea, and G. Sheffield. Magnetic surface mappings by storage of phase-stabilized low-energy electron beams.
Applied Physics Letters, 17(2):92, 1970.
[208] C. L. Smith and S. Cowley. The path to fusion power. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 368(1914):1091, 2010.
[209] L. Spitzer Jr. A proposed stellarator. Technical report, Princeton University, NJ Forrestal Research Center, 1951.
[210] L. Spitzer Jr. Magnetic fields and particle orbits in a high-density stellarator. Technical report, Princeton University, NJ Project Matterhorn, 1952.
[211] L. Spitzer Jr. The stellarator concept. The Physics of Fluids, 1(4):253, 1958.
[212] D. A. Spong and J. H. Harris. New QP/QI Symmetric Stellarator Configurations. Plasma and Fusion Research, 5:S2039, 2010.
[213] D. A. Spong, S. P. Hirshman, J. C. Whitson, D. B. Batchelor, B. A. Carreras, V. E. Lynch, and J. A. Rome. J* optimization of small aspect ratio stellarator/tokamak hybrid devices. Physics of Plasmas, 5(5):1752, 1998.
[214] D. A. Spong, S. P. Hirshman, L. A. Berry, J. F. Lyon, R. H. Fowler, D. J. Strickler, M. J. Cole, B. N. Nelson, D. E. Williamson, A. S. Ware, et al. Physics issues of compact drift optimized stellarators. Nuclear Fusion, 41(6):711, 2001.
[215] T. H. Stix. Highlights in early stellarator research at Princeton. Journal of Plasma Fusion Research Series, 1:3, 1998.
[216] D. J. Strickler, L. A. Berry, and S. P. Hirshman. Designing Coils for Compact Stellarators. Fusion Science and Technology, 41(2):107, 2002.
[217] D. J. Strickler, L. A. Berry, and S. P. Hirshman. Integrated plasma and coil optimization for compact stellarators. Technical report, 2003.
[218] D. J. Strickler, S. P. Hirshman, D. A. Spong, M. J. Cole, J. F. Lyon, B. E. Nelson, D. E. Williamson, and A. S. Ware. Development of a robust quasi-poloidal compact stellarator. Fusion Science and Technology, 45(1):15, 2004.
[219] E. Strumberger and S. Günter. CASTOR3D: Linear stability studies for 2D and 3D tokamak equilibria. Nuclear Fusion, 57(1):016032, 2016.
[220] R. Strykowsky, T. Brown, J. Chrzanowski, M. Cole, P. Heitzenroeder, G. Neilson, D. Rej, and M. Viol. Engineering cost & schedule lessons learned on NCSX. In , pages 1–4. IEEE, 2009.
[221] H. Sugama, T.-H. Watanabe, and M. Nunami. Linearized model collision operators for multiple ion species plasmas and gyrokinetic entropy balance equations.
Physics of Plasmas, 16(11):112503, 2009.
[222] G. Sun and S. Wang. A review of the artificial neural network surrogate modeling in aerodynamic design. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 233(16):5863–5872, 2019.
[223] T. Sunn Pedersen, A. Dinklage, Y. Turkin, R. Wolf, S. Bozhenkov, J. Geiger, G. Fuchert, H.-S. Bosch, K. Rahbarnia, H. Thomsen, et al. Key results from the first plasma operation phase and outlook for future performance in Wendelstein 7-X. Physics of Plasmas, 24(5):055503, 2017.
[224] K. Svanberg. A class of globally convergent optimization methods based on conservative convex separable approximations. SIAM Journal on Optimization, 12(2):555, 2002.
[225] A. N. Tikhonov. On the solution of ill-posed problems and the method of regularization. In Doklady Akademii Nauk, volume 151, pages 501–504. Russian Academy of Sciences, 1963.
[226] L. N. Trefethen and D. Bau III. Numerical Linear Algebra. Society for Industrial and Applied Mathematics, 1997.
[227] V. Tribaldos and J. Guasp. Neoclassical global flux simulations in stellarators. Plasma Physics and Controlled Fusion, 47(3):545, 2005.
[228] R. Turner. Gradient coil design: A review of methods. Magnetic Resonance Imaging, 11:903, 1993.
[229] J. G. Van Bladel. Electromagnetic Fields, volume 19. John Wiley & Sons, 2007.
[230] W. I. van Rij and S. P. Hirshman. Variational bounds for transport coefficients in three-dimensional toroidal plasmas. Physics of Fluids B: Plasma Physics, 1(3):563, 1989.
[231] D. Venditti and D. Darmofal. A multilevel error estimation and grid adaptive strategy for improving the accuracy of integral outputs. In , page 3292, 1999.
[232] F. Wagner. Stellarators and optimised stellarators. Fusion Technology, 33(2T):67, 1998.
[233] F. Wagner, S. Bäumel, J. Baldzuhn, N. Basse, R. Brakel, R. Burhenn, A. Dinklage, D. Dorst, H. Ehmler, M. Endler, et al. W7-AS: One step of the Wendelstein stellarator line. Physics of Plasmas, 12(7):072509, 2005.
[234] A. Weller, S. Sakakibara, K. Watanabe, K. Toi, J. Geiger, M. Zarnstorff, S. Hudson, A. Reiman, A. Werner, C. Nührenberg, et al. Significance of MHD effects in stellarator confinement.
Fusion Science and Technology, 50(2):158, 2006.
[235] J. Wesson and D. J. Campbell. Tokamaks, volume 149. Oxford University Press, 2011.
[236] D. Williamson, A. Brooks, T. Brown, J. Chrzanowski, M. Cole, H.-M. Fan, K. Freudenberg, P. Fogarty, T. Hargrove, P. Heitzenroeder, G. Lovett, P. Miller, R. Myatt, B. Nelson, W. Reiersen, and D. Strickler. Modular coil design developments for the National Compact Stellarator Experiment (NCSX). Fusion Engineering and Design, 75-79:71, 2005.
[237] R. Wolf, A. Alonso, S. Äkäslompolo, J. Baldzuhn, M. Beurskens, C. Beidler, C. Biedermann, H.-S. Bosch, S. Bozhenkov, R. Brakel, et al. Performance of Wendelstein 7-X stellarator plasmas during the first divertor operation phase. Physics of Plasmas, 26(8):082504, 2019.
[238] X. Wu, C. Wang, and T. Kozlowski. Kriging-based surrogate models for uncertainty quantification and sensitivity analysis. In Proceedings of the MC-2017, International Conference on Mathematics Computational Methods Applied to Nuclear Science Engineering, 2017.
[239] P. Xanthopoulos, H. Mynick, P. Helander, Y. Turkin, G. Plunk, F. Jenko, T. Görler, D. Told, T. Bird, and J. Proll. Controlling turbulence in present and future stellarators. Physical Review Letters, 113(15):155001, 2014.
[240] K. Yamazaki, N. Yanagi, H. Ji, H. Kaneko, N. Ohyabu, T. Satow, S. Morimoto, J. Yamamoto, O. Motojima, and the LHD Design Group. Requirements for accuracy of superconducting coils in the Large Helical Device. Fusion Engineering and Design, 20:79–86, 1993.
[241] S. Yoshikawa and T. Stix. Experiments on the Model C stellarator. Nuclear Fusion, 25(9):1275, 1985.
[242] M. Zarnstorff, L. Berry, A. Brooks, E. Fredrickson, G. Fu, S. Hirshman, S. Hudson, L. Ku, E. Lazarus, D. Mikkelsen, et al. Physics of the compact advanced stellarator NCSX. Plasma Physics and Controlled Fusion, 43(12A):A237, 2001.
[243] C. Zhu, S. R. Hudson, Y. Song, and Y. Wan. New method to design stellarator coils without the winding surface. Nuclear Fusion, 58:016008, 2018.
[244] C. Zhu, S. R. Hudson, Y. Song, and Y. Wan. Designing stellarator coils by a modified Newton method using FOCUS. Plasma Physics and Controlled Fusion, 60(6):065008, 2018.
[245] C. Zhu, D. A. Gates, S. R. Hudson, H. Liu, Y. Xu, A. Shimizu, and S. Okamura. Identification of important error fields in stellarators using the Hessian matrix method.
Nuclear Fusion, 59(12):126007, 2019.
[246] C. Zhu, M. Zarnstorff, D. Gates, and A. Brooks. Designing stellarators using perpendicular permanent magnets. arXiv preprint arXiv:1912.05144.