[PDF] Optimization of Flat to Round Transformers Using Adjoint Techniques

Abstract

A continuous system of moment equations is introduced that models the transverse dynamics of a beam of charged particles as it passes through an arbitrary lattice of quadrupoles and solenoids in the presence of self-fields. Then, figures of merit are introduced specifying system characteristics to be optimized. The resulting model is used to optimize the parameters of the lattice elements of a flat to round transformer, as could be applied in relativistic electron cooling. Results are shown for a case of no space-charge and two cases with space-charge. The optimization is based on a gradient descent algorithm in which the gradient is calculated using adjoint methods that prove to be very computationally efficient. In the optimized configurations for each of the three cases, the beam in the solenoid is round and with uniform second moments.

Full PDF

OOptimization of Flat to Round Transformers Using Adjoint Techniques

L. Dovlatyan ∗ , B.L. Beaudoin , S. Bernal , I. Haber, D. Sutter, and T.M. Antonsen Jr.(Institute For Research in Electronics and Applied Physics,University of Maryland, College Park, Maryland 20742, USA) (Dated: February 16, 2021)A continuous system of moment equations is introduced that models the transverse dynamics ofa beam of charged particles as it passes through an arbitrary lattice of quadrupoles and solenoidsin the presence of self-ﬁelds. Then, ﬁgures of merit are introduced specifying system characteristicsto be optimized. The resulting model is used to optimize the parameters of the lattice elements of aﬂat to round transformer, as could be applied in relativistic electron cooling. Results are shown fora case of no space-charge and two cases with space-charge. The optimization is based on a gradientdescent algorithm in which the gradient is calculated using adjoint methods that prove to be verycomputationally eﬃcient. In the optimized conﬁgurations for each of the three cases, the beam inthe solenoid is round and with uniform second moments. I. INTRODUCTION

Beams of charged particles in accelerators are guidedand manipulated by complex systems of magnets whosedesign optimization requires tracking the trajectories ofthe particles through the ﬁelds of the magnets as well asthrough the self-ﬁelds of the beam. Since the propertiesof the beam after passing through a lattice of magneticfocusing elements depend in a complicated way on themany parameters describing the lattice, optimizing thelattice is a computationally intensive task, and histori-cally, a large body of knowledge and techniques for ad-dressing the optimization problem have been developed.Traditionally, the design of these magnet systems iscarried out using computer codes that calculate the beamparticle phase space trajectories in the prescribed latticeof magnets. The process often reduces to the optimiza-tion of ﬁgures of merit (FoMs) in the high dimensionalparameter space characterizing the lattice. Because ofthe large number of parameters, the eﬃciency of the op-timization algorithm is critical. One class of optimizationalgorithms, based on calculating the gradient of the FoMin parameter space, becomes computationally prohibitiveif the gradient is to be calculated directly (by individuallyvarying each of the parameters) in the high dimensionalparameter space.The number of computations needed to calculate thegradient can be reduced via the introduction of adjointtechniques [1–5]. In this approach an alternate, but re-lated, mathematical problem is introduced in which ina single (or in several) computation(s) the linear depen-dence of the FoM on all the parameters can be deter-mined. The adjoint approach has previously been appliedin circuit theory [1], electromagnetics [2], aerodynamics[3], and accelerator physics [4, 5], as well as in other ﬁelds.In this paper we will illustrate the application of theadjoint approach to the design of Flat-to-Round (FTR) ∗ [email protected] or Round-to-Flat (RTF) transformers as have been pro-posed for use in relativistic electron cooling [6–9]. As thenames suggest these transformers are systems of mag-nets that will convert an unmagnetized beam that hasa high aspect ratio, elliptical spatial cross section, toa round beam in a solenoidal magnetic ﬁeld, or viceversa. In its simplest form this conversion is accom-plished with a triplet of quadrupoles, and a solenoid.See Fig. 1. Parameters that can be varied to opti-mize this conversion are the positions and strengths ofthe four magnet elements, including the orientations ofthe quadrupoles (11 parameters including the locationof the ﬁrst quadrupole). The method we will introducealso allows for more detailed optimization in terms of thespatial proﬁles of the magnet elements. In addition, onemight have the option to vary the parameters of the in-coming beam, but we will take these to be given. Thedescription of the beam that we will use is based on thesolution of second moment equations describing the four-dimensional transverse phase space of the beam. Thereare 10 such moments corresponding to the ten indepen-dent elements of the matrix of second moments, the so-called sigma matrix [10]. The eﬀect of self-ﬁelds is in-cluded in the calculations under the assumption that thebeam maintains an elliptical spatial cross section of uni-form density. A future publication will extend the ap-proach to a kinetic or phase space description. The FoM FIG. 1. General geometry of quadrupole triplet for FTR beamtransformations. a r X i v : . [ phy s i c s . acc - ph ] F e b will be a weighted sum of squares of nine combinationsof the moments at the exit of the transformer. Each ofthese nine combinations ideally should vanish in a suc-cessful transformation. We will ﬁnd that combining theadjoint calculation with a gradient descent algorithm al-lows us to eﬀectively satisfy all nine conditions.We chose FTR and RTF transformers as an examplebecause of their multiple applications. The original mo-tivation for the development of circular mode adapters,introduced by Derbenev in 1998 [6], was to improve therates and eﬃciency of relativistic electron cooling [6–9]of hadron beams. Here a high energy electron beam anda hadron beam are made to co-propagate during whichthe beams interact and the hadrons are cooled. High en-ergy electron beam sources are generally electron storagedevices, in which the beams tend to become ﬂat throughradiation eﬀects and intra-beam scattering. The cool-ing of the hadron beam is optimized when the electronbeam interacting with the hadron beam is magnetizedand matches the hadron beam size and shape, which isapproximately round, and has a low eﬀective transversetemperature [11]. The adapters provide the means forconverting an unmagnetized beam, to the “magnetized”state in the cooling solenoid and then converting backto the unmagnetized state at the exit of the solenoid forreinjection into the storage device.Almost immediately after Derbenev’s initial proposal,it was realized that an RTF adapter following an elec-tron source immersed in a solenoid ﬁeld could be usedto produce a ﬂat, uncoupled beam of high enough emit-tance ratio and charge to potentially replace the highenergy electron damping rings required in proposed de-signs for high energy linear colliders [12]. Experimentsat Fermilab and DESY conﬁrmed the feasibility [13–15].Since then, the possible use of circular mode adaptersfor phase space manipulations have been expanded toinclude generating round beams at select points in cir-cular colliders to compensate for beam-beam eﬀects athigh energy and to oﬀset the eﬀects of space-charge andthe cyclotron component of the ion beam in low energycooler rings [9]. Additionally, their use has been exploredfor injection into high intensity X-ray Sources [16, 17] orinto dielectric laser driven accelerators [18, 19].The FTR and RTF beam transformations typically in-volve a solenoid and three, 45-degree skew quadrupoles,as depicted in Fig. 1. The overall length of the triplet,excluding the solenoid, scales inversely with the strengthof the solenoidal ﬁeld and linearly with the momentumof the beam, as reﬂected in the reference length β s = 2 p | qB | (1)where p is the momentum, q the charge, and B the max-imum solenoidal ﬁeld. In symmetrical triplets, the sim-plest conﬁguration, the two outer quadrupoles have thesame strength and polarity and are equidistant from thecentral one with opposite polarity. In asymmetric triplets[8, 9], the two outer quadrupoles have diﬀering strengths and diﬀerent separations from the inner quadrupole.Using a thin lens model for the symmetric triplet, thedistance d between the outer thin quadrupoles Q , Q and the central one Q , and the inverse focal lengths, q = q , q , are, following Wolske, all given in terms of β s [20]: d = β s (cid:112) √ , q = q = −√ β s , q = 2 √ β s (2)In this design, the ﬂat beam (for FTR transformer) isincident directly on the ﬁrst quadrupole which is locatedat z = 0, i.e., no drift space is assumed between theinput beam and the ﬁrst quadrupole and between thelast quadrupole and the solenoid ( s = s = 0 in Fig1). More general equations, ascribed to Edwards [21,22], still apply to thin lenses and no initial/ﬁnal driftspaces, but relax the condition of symmetry, i.e., q (cid:54) = q , s (cid:54) = s . For the triplet described by Eq. (2) aninitial horizontal ﬂat beam is transformed into a roundbeam with a canonical angular momentum of oppositesign to the canonical angular momentum term generatedby the solenoid on transport to its center.As an example, Burov and Danilov [7] described a sym-metrical FTR triplet for an electron energy of 500 MeVand solenoid ﬁeld strength B s = 10 . β s parameter in Eq. (1). Alternatively, it is possible toscale existing designs, as in references [7] or [8], to a de-sired electron energy and solenoid ﬁeld. We will adaptas a starting point for the test model to be used in thefollowing scaled version of the Burov-Danilov symmetri-cal FTR triplet [7], which includes non-zero drift spaces s = s (Fig. 1), an electron energy of 5 keV and asolenoid ﬁeld of 15 G ( β s = 0 .

319 m). The resultingoverall length of the triplet is 0.213 m. From these initialparameters it is straight forward to obtain a thick lensmodel that provides the initial conditions for the adjointoptimizations.The initial FTR optical parameters also provide a com-putation of the initial beam second moments. If M st isthe matrix that transports an initial ﬂat beam throughthe rotated triplet to the center of the solenoid, then foran initial matrix, (cid:80) , the sigma matrix at the center ofthe solenoid is (cid:88) = M st (cid:80) M T st (3)where M T st is the transpose [10, 23]. The generalizedsecond moment equations derived below for the use inthe adjoint optimization process relate directly to theelements of (cid:80) .The treatments discussed so far assume no space-charge or self-ﬁeld eﬀects. In Sec. II we will introducea general formulation based on a set of (10) diﬀerentialequations for the evolution with distance of the secondmoments of the beam particle distribution function. Thesystem allows for the proﬁles of the magnetic ﬁelds of thefocusing elements to be treated; it allows for the orienta-tions, locations and the strength of the quadrupoles to bevaried; and it treats self-ﬁelds in the approximation thatthe beam maintains an elliptical transverse cross sectionof constant density. We will also introduce a system ofadjoint equations that allows for the eﬃcient calculationof the change in system performance with changing pa-rameters. In Sec. III we will use the adjoint calculation ofthe gradient in parameter space to optimize the param-eters of a selected model of an FTR transformer under avariety of assumptions and constraints. II. BASIC MODEL

In this section we ﬁrst present a system of equationsthat describes the evolution with distance of the secondmoments of a charged particle beam distribution in thepresence of a combination of transverse forces. Theseforces include the Lorentz force of a spatially varyingsolenoidal (axial) magnetic ﬁeld, the Lorentz force ofa superposition of arbitrarily oriented quadrupole mag-netic ﬁelds, and the electric and magnetic self-force dueto the beam’s charge and current densities. These mo-ment equations will be used to simulate the propagationof a beam through a system of magnets, which converts abeam with an elliptical cross section to one with a roundcross section. A general ﬁgure of merit will be introducedthat quantiﬁes how successfully the shape conversion hasbeen made. Subsequently, we will formally perturb thissystem by making small changes in the parameters deﬁn-ing the focusing forces and assess the changes by evaluat-ing the change in the ﬁgure of merit. Such an evaluationwould be used in a gradient-based optimization scheme.Then we will introduce an adjoint system of equationsthat will allow one to calculate compactly the changes inthe system due to changes in the focusing parameters.The moments we consider are averages of products ofall possible pairs of variables describing the transversedisplacement of beam particles and the rate of changeof the transverse displacement with distance. These mo-ments correspond to the 16 elements of the 4 by 4 trans-verse sigma matrix mentioned in Eq. (3). Due to thesymmetry of this matrix only 10 elements are indepen-dent. Thus, our governing system consists of 10 momentevolution equations. Although there are 10 separatemoment equations, we show that as expected there areaccompanying conservation laws relating the moments.The evolution with distance of the sigma matrix is nor-mally treated by matrix multiplication with individualmatrices representing focusing elements and drift spaces.We choose to deal with diﬀerential equations for the con-tinuous moments, as this allows us to introduce adjointequations that include self-ﬁeld eﬀects and spatial pro-ﬁles of focusing ﬁelds.The underlying assumption that will be made in de-riving the moment equations is that the beam particles’trajectories are well described by the paraxial equations of motion in which transverse forces are linear in the par-ticles’ displacements from the axis, or are linearly pro-portional to a particle’s transverse velocity. The ﬁnalequations will describe the beam’s evolution in the Lar-mor, or rotating, frame deﬁned by the applied solenoidalmagnetic ﬁeld. We begin the calculation in the lab frame,transform to the Larmor frame, and complete the deriva-tion in the Larmor frame. Beam quantities in the labframe can always be recovered from the Larmor framevalues by application of rotation transformations. Theseare given in Appendix B.To start, we write equations for the evolution of thetransverse particle displacements in a Cartesian coordi-nate system in the lab frame, x (cid:48)(cid:48) = k x + k Ω y (cid:48) + 12 k (cid:48) Ω y , (4a) y (cid:48)(cid:48) = k y − k Ω x (cid:48) − k (cid:48) Ω x (4b)Here ( x , y ) are a particle’s transverse displacements inthe lab frame, for example, x is the horizontal displace-ment from the center-line and y is the vertical displace-ment. A prime denotes diﬀerentiation with respect tothe axial coordinate z , which measures distance alongthe center-line. The quantity k Ω ( z ) = qB z ( z ) mcγν z (5)is the spatial gyration rate due to the axial (solenoidal)magnetic ﬁeld. The quantities in Eq. (5) are as fol-lows: q and m are the particle charge and mass, B z is the axial component of the solenoidal magnetic ﬁeldstrength, c is the speed of light, ν z is the axial veloc-ity and γ = (1 − ν z /c ) − / . (If the beam consists ofelectrons, q = − e , and if B z > k Ω < k x , k y represent the transverse forces from thequadrupoles and space-charge, and will be deﬁned sub-sequently. The second terms in Eq. (4a) and (4b) de-scribe the Lorenz force due to the axial component ofthe solenoidal ﬁeld and the transverse velocity, and thethird terms describe the Lorentz force due to the radialcomponent of the solenoidal ﬁeld and the axial velocity.Our next step is to transform the variables to theirCartesian representation in a rotating frame, (cid:18) x y (cid:19) = (cid:20) cos φ − sin φ sin φ cos φ (cid:21) (cid:18) xy (cid:19) (6)where φ ( z ) is an axially dependent rotation phase thatwe choose to satisfy φ (cid:48) = − k Ω (cid:18) x (cid:48)(cid:48) y (cid:48)(cid:48) (cid:19) + (cid:18) k Ω (cid:19) (cid:18) xy (cid:19) = (cid:20) cos φ sin φ − sin φ cos φ (cid:21) (cid:18) k x k y (cid:19) (8)We wish to obtain equations for the average of prod-ucts of the four variables ( x, x (cid:48) , y, y (cid:48) ). There are 16 or-dered products. However, order does not matter, leaving10 independent products. We choose combinations ofthese products that distinguish x - y symmetric and non-symmetric motion. For spatial moments we choose Q =  Q + Q − Q x  =  (cid:104) x + y (cid:105) / (cid:104) x − y (cid:105) / (cid:104) xy (cid:105)  (9)Here the angle brackets imply average over the beam dis-tribution function. Accompanying these spatial momentsare momentum like moments P = ddz Q =  P + P − P x  =  (cid:104) xx (cid:48) + yy (cid:48) (cid:105)(cid:104) xx (cid:48) − yy (cid:48) (cid:105)(cid:104) yx (cid:48) + xy (cid:48) (cid:105)  (10)The angular momentum completes this group of momen- tum like moments, L = (cid:104) xy (cid:48) − yx (cid:48) (cid:105) (11)Note, that L is the angular momentum in the rotatingframe. The group of 10 moments is completed by threeenergy-like moments E =  E + E − E x  =  (cid:104) x (cid:48) + y (cid:48) (cid:105)(cid:104) x (cid:48) − y (cid:48) (cid:105) (cid:104) y (cid:48) x (cid:48) (cid:105)  (12)The evolution of these moments is determined by thefollowing system of equations ddz Q = P , (13a) ddz P = E + O · Q , (13b) ddz E = O · P + N L , (13c) ddz L = − N † · Q (13d)Here the matrix O and vector N are deﬁned as follows O = − k   + 2 (cid:88) quads K q  c q − s q c q − s q  + Λ Q ∆  c α s α c α s α  (14)and N = 2 (cid:88) quads K q  s q c q  − Λ Q ∆  s α − c α  (15)The matrices deﬁned in Eqs. (14) and (15) are con-tinuous functions of axial distance, z . The ﬁrst matrixin Eq. (14) is due to the solenoidal ﬁeld, the second isdue to the quadrupoles, and the third matrix is due tothe self-ﬁelds. The ﬁrst vector in Eq. (15) is due to thequadrupoles, and the second is due to the self-ﬁelds. InEqs. (14) and (15) the following expressions and nota-tion have been introduced. Each quadrupole magnet hasﬁeld strength given in the lab frame by B qx = B (cid:48) q ( z ) [sin (2 ψ q ) x − cos (2 ψ q ) y ] B qy = − B (cid:48) q ( z ) [cos (2 ψ q ) x + sin (2 ψ q ) y ] (16)where B (cid:48) q ( z ) deﬁnes the strength and axial proﬁle of thequadrupole ﬁeld and the angle ψ q deﬁnes the orientation. As shown in Appendix A, this leads to the displayed con-tributions to the O matrix and N vector where K q ( z ) = qB (cid:48) q ( z ) mcγν z (17)If K q > θ = ψ q , ψ q + π , and focusing when θ = ψ q + π/ , ψ q − π/

2. Here, θ is the angle a particle’s trans-verse displacement makes with respect to the x axis. If K q < s q , c q ) = ( sin (2 φ − ψ q ) , cos (2 φ − ψ q ) ) (18)where φ ( z ) is the angle resulting from the integration ofEq. (5). We note that with one group of quadrupolesplaced separately from the solenoid it is always possibleto deﬁne the phase such that it vanishes in the region ofthe quadrupoles.The self-ﬁeld term is calculated in the Larmor framein Appendix B. Here it is assumed that the beam chargedensity distribution is spatially uniform inside an ellipsewith diﬀering major and minor radii and tilted at an an-gle in the Larmor frame. The values of the major andminor radii and the angle are determined by the threespatial moments ( Q + , Q − , Q x ). The result is the follow-ing Q ∆ = (cid:2) Q − ( Q − + Q x ) (cid:3) / , (19a) c α = − Q − ( Q + + Q ∆ ) , (19b) s α = − Q x ( Q + + Q ∆ ) (19c)The strength of the self-ﬁelds is measured by the beamcurrent parameter Λ = cqZ I πmν z γ (20)with Z = 377 Ohms, and I is the beam current.The equations (13a) - (13d) can be combined to showthe following conservation relation ddz (cid:20) E · Q + L − P · P (cid:21) = 0 (21)This does not represent the conservation of a four-dimensional emittance. Rather it is related to the con-served quantity discussed in [24].We compare the solution of the moment equations withthe moments calculated by the PIC code Warp [25] inFig. 2. For this simulation we consider a symmetrictriplet with parameters selected to convert a ﬂat beamto a round one in the absence of space-charge. Param-eters for the simulation are given in Table I. The largepipe radius in the list of parameters is chosen to minimizethe contributions from image charge forces on the simu-lations in order to compare the results with the momentequations.Plotted in Fig. 2 are the computed values of x = (cid:104) x (cid:105) = Q + + Q − and y = (cid:104) y (cid:105) = Q + − Q − as functionsof z for three values of beam current. The solutions ofthe moment equations are shown as open circles, andthe PIC values are shown as solid and dashed lines. Theinitial distribution function in the PIC case is taken to bea K-V [26] distribution function. As can be seen, in thezero current case the beam becomes round at the endof the triplet, whereas in the presence of self-ﬁelds theroundness is spoiled. In the next section we will show howthe optimizations of the triplet parameters can recoverthe round state of the beam.We now imagine that Eqs. (13a) - (13d) have beensolved for a given set of initial conditions, ( Q , P , E , L ) (cid:12)(cid:12) z i ,and magnetic ﬁeld parameters and proﬁles. An assess-ment of the conﬁguration can be made on the basis ofa ﬁgure of merit that depends on the ﬁnal values of themoments, F ( Q , P , E , L ) (cid:12)(cid:12)(cid:12) z f (22) FIG. 2. Comparison of solutions of the moment equations(solid and dashed lines) with solutions from the PIC codeWarp (circular symbols).

For a ﬂat to round transition, the ﬁgure of merit mightbe a function which when minimized forces the compo-nents Q x , Q − and P all to zero, signifying the beam isround and its radius will remain constant. This will bediscussed in more detail in the next section. We referto a general solution just described (not necessarily theminimizing one) as a base case.If a small change is made in the focusing parameters,there will be a commensurately small change in the mo-ments as functions of z, and a small change in the ﬁgureof merit from the corresponding base case. The smallchanges in the moments will satisfy a linearization ofEqs. (13) - (15). We denote the small changes with asuperscript ( X ), and we refer to them as true changes. TABLE I. PIC code simulation and FTR lattice parameters.Parameter ValueBeam distribution K-VParticle number 40000Longitudinal step-size 0.1 (mm)Grid cell number 2048 x 2048Pipe radius 250.0 (mm)Beam energy 5 keVQuadrupole 1 location 0.0043 (m)Quadrupole 2 location 0.1066 (m)Quadrupole 3 location 0.2090 (m)Quadrupole 1,3 strength -18.236 (T/m)Quadrupole 2 strength 21.364 (T/m)Quadrupole 1,2,3 length 0.1 (mm)Solenoid location 0.2133Solenoid strength 15e-4 (T)Initial β x = β y α x = α y (cid:15) x , (cid:15) y These satisfy ddz δ Q ( X ) = δ P ( X ) (23a) ddz δ P ( X ) = δ E ( X ) + O · δ Q ( X ) + δ O ( X ) · Q (23b) ddz δ E ( X ) = O · δ P ( X ) + N δL ( X ) + δ O ( X ) · P + δ N ( X ) L (23c) ddz δL ( X ) = − N † · δ Q ( X ) − δ N † ( X ) · Q (23d)The resulting change in the ﬁgure of merit is δF ( X ) = (cid:20) δ Q ( X ) ∂F∂ Q + δ P ( X ) ∂F∂ P + δ E ( X ) ∂F∂ E + δL ( X ) ∂F∂L (cid:21) z f (24)Again, the superscript ( X ) signiﬁes that it is a true per-turbation, i.e. the result of changing the focusing system:the solenoidal and quadrupole magnetic ﬁelds.The perturbed matrices O ( X ) , N ( X ) appearing in Eqs.(23b) - (23d) are the result of perturbations to matrices O and N . These matrices are perturbed due to two eﬀects.One eﬀect is the set of perturbations due to changes inthe solenoidal and quadrupole magnetic ﬁelds that ap-pear explicitly in the deﬁnitions of O and N . Theseare being varied in order to optimize the conﬁguration.A second eﬀect is due to changes in the self-ﬁeld terms.These changes are expressed through the changes in thespatial moments, δ Q ( X ) in the deﬁnitions of Eqs. (19a)- (19c). We write the changes to these matrices as asum of contributions from explicit changes in the focus-ing conﬁguration and implicit changes in the self-ﬁelds, δ O ( X ) · Q = δ O ( X ) Q,B · Q + M Q · δ Q ( X ) , (25a) δ O ( X ) · P = δ O ( X ) Q,B · P + M P · δ Q ( X ) , (25b) δ N ( X ) = δ N ( X ) Q,B + M N · δ Q ( X ) , (25c)Here, the variables with subscripts Q , B are the contri-butions to the changes from the explicit dependence ofthe matrices on the magnetic focusing parameters. Thematrices M Q , M P , M N are the contributions from theself-ﬁeld terms in the matrices M , N , which depend on Q as given by Eqs. (14), (15), and (19). Formulas forthese matrices are given in Appendix B.Changing each parameter will result in a change to theﬁgure of merit. This can be thought of as deﬁning a gra-dient of the ﬁgure of merit in parameter space. Directevaluation of the change in the ﬁgure of merit would re-quire resolving the system of 10 coupled equations M p times where M p is the number of parameters that couldbe varied. We can avoid this burden by introducing an adjoint system of equations. Consider a second linearperturbation to the base case, denoted by a superscript( Y ). These are the adjoint perturbation equations thatwe will actually solve. ddz δ Q ( Y ) = δ P ( Y ) , (26a) ddz δ P ( Y ) = δ E ( Y ) + O · δ Q ( Y ) , (26b) ddz δ E ( Y ) = O · δ P ( Y ) + N δL ( Y ) + δ ˙ E ( Y ) , (26c) ddz δL = − N † · δ Q ( Y ) (26d)In Eqs. (26a) - (26d) we use the base case (unperturbed) O and N matrices. However, we have added a term inEq. (26c), δ ˙ E ( Y ) , which will be chosen, as describedsubsequently, so as to achieve a desired cancellation ofthe self-ﬁeld contributions to the changes in O and N .We next form the following combination of true ( X )and adjoint ( Y ) variables, (cid:15) ≡ δ P ( Y ) · δ P ( X ) − δ Q ( X ) · δ E ( Y ) − δ Q ( Y ) · δ E ( X ) − δL ( Y ) δL ( X ) (27)and diﬀerentiate it with respect to z . We evaluate theindividual terms in the derivative of Eq. (27) using theproduct rule and Eqs. (23) and (26). Noting numerouscancellations we arrive at ddz (cid:15) = δ P ( Y ) · δ O ( X ) · Q + δL ( Y ) δ N † ( X ) · Q − δ Q ( X ) · δ ˙ E ( Y ) − δ Q ( Y ) · δ O ( X ) · P − δ Q ( Y ) · δ N ( X ) L (28)Let us now consider the quantities δ N ( X ) and δ O ( X ) appearing in Eq. (28). We write them according to theseparation deﬁned in Eq. (23). We then have ddz (cid:15) = δ P ( Y ) · M Q · δ Q ( X ) + δ P ( Y ) · δ O ( X ) Q,B · Q − δ Q ( X ) · δ ˙ E ( Y ) + δL ( Y ) Q · M N · δ Q ( X ) + δL ( Y ) Q · δ N ( X ) Q,B − δ Q ( Y ) · δ O ( X ) Q,B · P − δ Q ( Y ) · M P · δ Q ( X ) − δ Q ( Y ) · δ N ( X ) Q,B L − δ Q ( Y ) · M N · δ Q ( X ) L (29)The next step is to pick δ ˙ E ( Y ) to cancel all terms pro-portional to the unknown δ Q ( X ) in Eq. (29), δ ˙ E ( Y ) = δ P ( Y ) · M Q + δL ( Y ) Q · M N − δ Q ( Y ) · M P − δ Q ( Y ) · M N L (30)This relation is taken to deﬁne the quantity δ ˙ E ( Y ) thatenters the adjoint equations in Eq. (26c). This leaves forthe adjoint relation ddz (cid:15) = δ P ( Y ) · δ O ( X ) Q,B · Q + δL ( Y ) Q · δ N ( X ) Q,B − δ Q ( Y ) · δ O ( X ) Q,B · P − δ Q ( Y ) · δ N ( X ) Q,B L (31) Integrating over z from initial to ﬁnal point (cid:16) δ P ( Y ) · δ P ( X ) − δ Q ( X ) · δ E ( Y ) − δ Q ( Y ) · δ E ( X ) − δL ( Y ) δL ( X ) (cid:17) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = z f z = z i = (cid:90) z f z i dz (cid:110) δ P ( Y ) · δ O ( X ) Q,B · Q + δL ( Y ) Q · δ N ( X ) Q,B − δ Q ( Y ) · δ O ( X ) Q,B · P − δ Q ( Y ) · δ N ( X ) Q,B L (cid:111) (32)We are now in position to describe the utility of theadjoint approach. Suppose we integrate the base caseequations, Eqs. (13a) - (13d), forward from z i to z f ,and we then integrate the adjoint equations, Eqs. (26a)- (26d), backward in z , starting at z f . Further, supposewe take the conditions on the adjoint variables at z f tobe δ P ( Y ) ( z f ) = ∂F∂ P (cid:12)(cid:12)(cid:12) z f , (33a) − δ E ( Y ) ( z f ) = ∂F∂ Q (cid:12)(cid:12)(cid:12) z f , (33b) − δ Q ( Y ) ( z f ) = ∂F∂ E (cid:12)(cid:12)(cid:12) z f , (33c) δL ( Y ) ( z f ) = ∂F∂L (cid:12)(cid:12)(cid:12) z f (33d)where F is the ﬁgure of merit. Then Eqs. (24) and(32) give for the change in the ﬁgure of merit due toperturbation ( X ) δF ( X ) = (cid:16) δ P ( Y ) · δ P ( X ) − δ Q ( X ) · δ E ( Y ) − δ Q ( Y ) · δ E ( X ) − δL ( Y ) δL ( X ) (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = z i + (cid:90) z f z i dz (cid:110) δ P ( Y ) · δ O ( X ) Q,B · Q + δL ( Y ) Q · δ N ( X ) Q,B (cid:111) + (cid:90) z f z i dz (cid:110) − δ Q ( Y ) · δ O ( X ) Q,B · P − δ Q ( Y ) · δ N ( X ) Q,B L (cid:111) (34)Equation (34) can now be used to evaluate the changesin the ﬁgure of merit (FoM) F due to arbitrary, smallchanges in the initial conditions of the true solution ascontained in the ﬁrst term on the right of Eq. (34). Orit can be used to evaluate changes in the FoM F dueto arbitrary, small changes in the strength or proﬁle ofthe focusing magnetic ﬁelds as contained in the integralterm on the right of Eq. (34). These changes can be evaluated by a single integral once the adjoint solutionis found without having to resolve the coupled system of10 equations for each possible change in parameters.We illustrate the utility of the adjoint method in calcu-lating the gradient of an FoM with respect to variationsof parameters in Fig. 3. Here we show the dependence onparameters of a particular FoM that will be introduced FIG. 3. Comparison of changes in a ﬁgure of merit calcu-lated using the adjoint method (straight lines) and calculateddirectly (curved lines) in the next section. The dependence on parameter val-ues is calculated two ways: ﬁrst by directly varying theparameter and second by using Eq. (34) to compute thegradient. The top plot in Fig. 3 shows the variationof the FoM with respect to changes in the strengths ofthe three quadrupoles, while the bottom plot in Fig. 3shows the dependence on the solenoidal ﬁeld. The solidstraight lines have slopes that are predicted by the ad-joint method, while the dashed curved lines show theeﬀect of directly varying the parameter and plotting thechange in the FoM. As can be seen, the solid lines aretangent to the dashed lines for vanishing perturbationsindicating that the gradient has been calculated correctly.This method of calculating the gradient will be used inthe next section to optimize a FTR transformer.

III. OPTIMIZATION

The goal of the optimization is to minimize a givenFoM based on the values of the moments at a location( z = z f ) inside the solenoid using the adjoint approach.We choose the following function of the beam momentsto be our FoM F = 12 (cid:34) | P | + k ( Q − + Q x ) + k − ( E − + E x ) k − (cid:16) E + − k Q + + Λ (cid:17) + (2 E + Q + − L ) (cid:35) (35)It is a sum of terms quadratic in the moments, each ofwhich should be as small as possible in the optimumFTR transformer. The quantity k is introduced so thateach term has the same units. Here k is a scaling pa-rameter approximately equal to the inverse of the latticelength. This makes all the terms roughly comparable inmagnitude. The choice of each term is made as follows.We would like the beam in the solenoid to be round, Q − = Q x = 0, and we would like all the nonzero mo-ments to be independent of z . This implies for the spa-tial moments d Q /dz = P = 0. If we set the componentsof d P /dz = 0, we ﬁnd E − = E x = 0 is required aswell as E + − k Q + / r ) in theLarmor frame then E +0 = (cid:104) δx (cid:48) + δy (cid:48) (cid:105) + (cid:104) Ω r (cid:105) (36)where δx (cid:48) , δy (cid:48) are deviations from the mean rotation.The angular momentum in this case is L = (cid:104) Ω r (cid:105) . Wethus have, by virtue of the Schwarz inequality E + ≥ (cid:104) Ω r (cid:105) ≥ (cid:104) Ω r (cid:105) / (cid:104) r (cid:105) = L / (2 Q + ) (37)The second inequality becomes an equality if the rotationis rigid. The FoM in Eq. (35) is a general function in that itmeets all requirements for the FTR transformation. Inpractice an FoM will also contain extra constraint con-ditions pertinent to the system being optimized. Thiscan include beam size limits based on a physical aper-ture size, magnet sizes and locations based on availablebeamline space, etc. The extra constraints can be incor-porated within the FoM or as constraints on the inputparameters used for the optimization.We start with a design based on the symmetric tripletwith parameters given by Eqs. (1) and (2). It can beshown from analytic solutions of the moment equationsin the thin lens approximations that the entrance and exitconditions in Table II can be achieved for a symmetrictriplet oriented at 45 degrees with respect to the longdimension of the incident ﬂat beam. The strengths ofthe quadrupoles needed to achieve these parameters aregiven by d (cid:90) ∞−∞ K , ( z ) s , dz = − (cid:113) √ d (cid:90) ∞−∞ K ( z ) s dz = 2 √ / (cid:113) √ E − Q − + L / − P − / J ( E + ± E x )( Q − ± Q x ) − ( P + ± P x ) / J ± (39)Also note there are seven quantities which vanish at theexit but only four adjustable parameters, the values of E ± and the strengths of the inner and outer quadrupolesdisplayed in Eq. (38). The ability to satisfy what seems TABLE II. Entrance and exit conditions for the momentsin a thin lens approximation with a symmetric triplet andquadrupoles oriented at 45-degree rotations with respect tothe longitudinal direction. z = 0 z = z f = 2 dQ + = Q + (0) Q + = Q + (0) Q − = Q − (0) Q − = 0 Q x = 0 Q x = 0 P + = 0 P + = 0 P − = 0 P − = 0 P x = 0 P x = 0 E + = Q + (0) / (cid:2) d (2 + 2 √ (cid:3) E + = Q + (0) / (cid:2) d (2 + 2 √ (cid:3) E − = Q − (0) / (cid:2) d (2 + 2 √ (cid:3) E − = 0 E x = 0 E x = 0 L = 0 L = 2 Q − (0) / ( d (cid:112) √ FIG. 4. RMS beam size, (cid:104) x (cid:105) , (cid:104) y (cid:105) , plotted as a function of the longitudinal coordinate through a FTR transformer. Top rowshows the initial results with no optimization. Bottom row shows the optimized results. Scaled quadrupole ( K q ) and solenoid( k Ω ) ﬁeld proﬁles are also displayed for reference. Results are for a 5 keV beam. like three extra conditions then results from the symme-try leading to Eq. (39).We now optimize the parameters describing the mag-netic ﬁelds by using the FoM in Eq. (35). We consider theFoM to be a function of the elements of a list of param-eters a . The elements of the list include the strengths,locations, and orientations of the quadrupoles as well asthe strength and location of the solenoid. We use a sim-ple steepest descent algorithm in which we calculate thegradient of F in the space of parameters a , ∇ a F ( a ) andthen adjust the values of the parameters by moving fromthe current set of parameters along the line of steepestdescent.To calculate the gradient we set the conditions at z = z f for the adjoint variables in Eqs. (33a) - (33d) ac-cording to the current values of the parameters, a n . Thisthen allows the solution for the adjoint variables and theintegration in Eq. (34) to be carried out in order to ﬁndthe variation in the FoM due to a perturbation in anyof the focusing magnets in the system (contained withinthe O and N matrices). We then update the parametersaccording to a n +1 = a n − γ ∇ a F ( a ) (cid:12)(cid:12)(cid:12) a n (40)The step size γ is adjusted iteratively according to asimple algorithm based on successive values of F ( a n +1 ).Once the values of the parameters a n +1 are determinedthe gradient is recomputed and Eq. (40) is reapplied.The procedure is repeated until the FoM stops decreas-ing. A termination criterion is used in cases where the FoMdoes not reach a minimum fast enough. Meaning if theimprovement in the FoM drops below a certain thresh-old, then the optimization terminates. A relative tol-erance level of around 10 − is the cut-oﬀ point in theoptimizations. At this level the trade-oﬀ in computationtime for further reduction of FoM is too great. In ourcase the threshold is good enough for the lattice conﬁgu-rations being investigated. Depending on the lattice ac-curacy requirements for various experimental setups, therelative tolerance thresholds would need to be adjustedaccordingly.As a ﬁrst example of the technique, three optimizationsare run to optimize a FTR lattice in the presence of self-ﬁelds. A 5 keV beam is used with 0 mA (no self-ﬁelds),1 mA, and 5 mA beam currents. For these currents theself-ﬁeld parameter takes on values Λ = 0 , . e − , and1 . e −

4. The importance of self-ﬁelds is estimated fromEq. (20) to be quantiﬁed by the value of d Λ /Q + (0) =0 . , . , and 0 . FIG. 5. Input parameters being optimized vs the number of iterations. This is for the optimized results in Fig. 4. The percentchange of each input parameter from beginning to end is show in Table IV. The FoM vs iterations is also plotted as a dashedblack line. The left vertical axis is for the input parameters while the right side vertical axis is for the FoM.TABLE III. Initial moment values when solving the equationsfrom Eq. 13Initial Moments ( z = 0) Value Q + Q − E + E − the optimizations with added in self-ﬁelds. We can thinkof the added self-ﬁelds as perturbations on the 0 mA so-lution. Thus, the best starting values for the self-ﬁeldoptimizations are the results obtained from the 0 mA, noself-ﬁelds, solution. Results are shown in Fig. 4.Fig. 4 shows that by using the adjoint technique, weare able to ﬁnd and optimize solutions even in the pres-ence of large self-ﬁeld forces. The solutions are foundrelatively fast as the input parameters quickly convergeto optimized values. In comparison, a brute force methodof running PIC simulations over the full parameter spaceand calculating gradients would take orders of magnitudemore computation time. Figure 5 shows the change in in-put parameters vs iterations for the three optimizationsshown in Fig. 4. Table IV shows the percent changein each input parameter over the full optimization runsshown in Fig 4. In general, the second quadrupole hasthe largest change over the course of the diﬀerent opti-mization runs. It is interesting to note that in the caseof the solenoid, the optimization preferred to push thestarting location of the magnet upstream when tuningfor large self-ﬁeld forces. This would at times create op-timized solutions where the solenoid was too close to thethird quadrupole resulting in overlapping ﬁelds. To ﬁxthis a constraint was placed on how far the optimizationcould shift the solenoid starting location; this preventedmoving the solenoid ﬁelds too close to other magnets andstill resulted in successful optimized solutions. This can be seen in the 5 mA results in Fig. 5 with one of the inputparameter lines constant across the iterations; this is thesolenoid starting location being prevented from changing.While the optimizations in Fig. 4 use hardedge mag-net models, it is not limited to such proﬁles. The mag-net strengths, k Ω and K q , are continuous variables inthe moment equations, and as such, any proﬁle can beused. This may be useful for unique and nonlinear mag-net proﬁles that can not necessarily be represented inmatrix form for single particle tracking or would taketoo long to simulate in PIC tracking codes. Figure 6shows an optimization result using Gaussian like mag-net proﬁles similar to the quadrupole ﬁelds available inUMER’s magnets [27]. The solenoid ﬁeld also has a fringeﬁeld edge included in its proﬁle. With these continuousproﬁles the optimization is still able to ﬁnd acceptablesolutions.Certain lattice conﬁgurations require the quadrupoletriplet to maintain a ﬁxed 45-degree transverse rotationangle. While such an FTR solution is possible for a beamwith no self-ﬁelds, it is not necessarily true when self- TABLE IV. Percent change of the 11 magnet parameters asa result of running the optimization. The values are for theoptimized results in Fig. 4.0mA 1mA 5mAParameters % change % change % changeSolenoid Location 3.0 6.0 6.0Solenoid Strength 1.8 4.2 12.0Quad 1 location 0.1 0.3 1.0Quad 2 location 1.6 6.8 23.3Quad 3 location 1.0 2.6 7.5Quad 1 strength 1.0 4.0 13.5Quad 2 strength 3.7 7.3 16.0Quad 3 strength 3.7 4.6 8.3Quad 1 rotation 2.5 5.3 11.9Quad 2 rotation 2.7 3.5 4.9Quad 3 rotation 1.5 3.0 6.7 FIG. 6. RMS beam size vs longitudinal position for an opti-mized result that used non hardedge magnet proﬁles. A beamwith 5 keV energy and zero current was used. ﬁelds are added. Figure 7 shows the FoM vs iterations forthree diﬀerent optimization runs. Each run is repeatedusing ﬁxed and varied quadrupole rotation angles. In thecase with ﬁxed angles and self-ﬁeld forces, the optimiza-tion of the FoM converges early in the iterations limitingthe reduction of the FoM. When angles are varied andfree to rotate, the FoM can be further minimized, show-ing the necessity of having to adjust quadrupole rotationsin the presence of self-ﬁelds in order to better meet theFTR requirements. Figure 7 also shows that in the caseof no self-ﬁeld forces (0 mA beam) the FoM reductiondoes not plateau when optimizing with ﬁxed rotation an-gles. This conveys that the FTR transformation can bedone with ﬁxed 45-degree quadrupole rotation angles ifthere are no self-ﬁelds, but once self-ﬁelds are added thequadrupole rotations angles need to be adjusted to com-pensate.Using the optimized parameters from the ﬁnal FoMvalues within Fig 7, a plot of the RMS beam size ismade through the FTR to see the eﬀects of having thequadrupole rotation angles ﬁxed and varied. Results areshown in Fig 8. The most notable observation is howmuch larger the round beam becomes at the end of theFTR when using varied quadrupole rotation angles vsﬁxed angles. This diﬀerence in beam size also grows as afunction of the strength of the self-ﬁelds.All optimizations up to this point have used the sameinitial beam conditions displayed in Table II. These ini-tial conditions are needed in order to setup a successfulFTR transformation. However, some level of deviationfrom these initial conditions is expected when trying toexperimentally setup such a transformation; it might bediﬃcult to exactly match the beam to the required initialconditions. Is it then possible to use the optimizationroutine and adjust the magnet parameters in order tocorrect for inaccurate initial beam conditions? Figure 9shows these results and proves that such corrections are

FIG. 7. FoM vs iterations for optimizations that used variedquadrupole rotation angles (dashed lines) as well as ﬁxed 45-degree rotation angles (solid lines). Optimizations are runfor 0 mA, 1 mA, and 5 mA beam currents with 5 keV beamenergy.FIG. 8. FoM vs iterations for optimizations that used variedquadrupole rotation angles (dashed lines) as well as ﬁxed 45-degree rotation angles (solid lines). Optimizations are runfor 0 mA, 1 mA, and 5 mA beam currents with 5 keV beamenergy. possible.Figure 9 has optimized solutions for three beam cur-rents: 0 mA, 1 mA, and 5 mA. The analysis starts bytaking optimized results and randomly perturbing theinitial emittance of the beam by changing moment val-ues: E + (0) , E − (0). The perturbations are about 50%from the ideal value listed in Table II. The beam sizethrough the FTR transformation with the perturbed ini-tial conditions is then plotted in Fig. 9. An optimizationroutine is run, and magnet parameters are tuned to cor-rect for such a perturbation; these are also displayed inthe same plot. Figure 9 shows that bad initial conditionscan be corrected with and without self-ﬁeld forces. Thesize of the beam at the end of the FTR in the corrected2 FIG. 9. RMS beam size is plotted for optimized results with ideal initial conditions (blue circles). Initial conditions are purposelyperturbed to break the optimized results (orange line). Magnet parameters are re-optimized to correct for the perturbed initialconditions (green line). case matches close to the beam size before perturbationswere applied.

IV. CONCLUSION

In conclusion, this paper demonstrates a novel methodof using the adjoint approach to successfully model andoptimize a Flat-to-Round or Round-to-Flat lattice trans-former in the presence of space-charge forces. The devel-oped system of continuous moment equations shows ex-cellent agreement with PIC code simulation results. Thegradient-descent algorithm developed with these momentequations and the adjoint approach is then able to suc-cessfully optimize various transformer systems. The op-timization can correct for space-charge forces introducedinto the beam as well as for inaccurate initial beam con-ditions. The correction is achieved by tuning a set ofquadrupole and solenoid magnet parameters within thelattice. Because of the adjoint approach to the problem,such optimizations are performed at signiﬁcantly reducedcomputational costs. Traditional methods of brute forcesimulations to solve the same problem would require con-siderably more computational resources and time.For future work, a set of fully kinetic moment equationswill be developed. The adjoint approach will continueto be studied in its usefulness at addressing new andexisting problems within the accelerator physics ﬁeld ina meaningful way.

V. ACKNOWLEDGEMENTS

This work was supported by DOE-HEP award No.DE-SC0010301.

Appendix A: COUPLED MOMENT EQUATIONS

In this appendix we outline the steps leading to Eqs.(13)-(15). We start with the deﬁnitions of the moments inEqs. (9) - (12). Equation (13a), the derivative of Q withrespect to distance z , follows directly from the deﬁnitionof the components of P in Eq. (10). Diﬀerentiating thevector P and angular momentum L with respect to z gives, P (cid:48) + = E + + (cid:104) xx (cid:48)(cid:48) + yy (cid:48)(cid:48) (cid:105) , (A1a) P (cid:48)− = E − + (cid:104) xx (cid:48)(cid:48) − yy (cid:48)(cid:48) (cid:105) , (A1b) P (cid:48) x = E x + (cid:104) xy (cid:48)(cid:48) + yx (cid:48)(cid:48) (cid:105) , (A1c) L (cid:48) = (cid:104) xy (cid:48)(cid:48) − yx (cid:48)(cid:48) (cid:105) (A1d)where the moments E are deﬁned in Eq. (12). Diﬀer-entiating the moments E with respect to axial distancegives E (cid:48) + = 2 (cid:104) x (cid:48) x (cid:48)(cid:48) + y (cid:48) y (cid:48)(cid:48) (cid:105) , (A2a) E (cid:48)− = 2 (cid:104) x (cid:48) x (cid:48)(cid:48) − y (cid:48) y (cid:48)(cid:48) (cid:105) , (A2b) E (cid:48) x = 2 (cid:104) x (cid:48) y (cid:48)(cid:48) + y (cid:48) x (cid:48)(cid:48) (cid:105) , (A2c)Evaluation of each of the averages on the right sides ofEqs. (A1) and (A2) requires inserting expressions for thesecond derivatives of x and y in the Larmor Frame. Forthis we use Eq. (9).The ﬁrst contribution to the averages in Eqs. (A1) and(A2) comes from the solenoidal ﬁeld contribution to Eq.(9),  x (cid:48)(cid:48) B y (cid:48)(cid:48) B  = − (cid:32) k Ω (cid:33)  xy  (A3)3This when inserted in Eqs. (A1) and (A2) gives rise tothe ﬁrst term in the expression for the matrix O in Eq.(14).The second contribution to the averages in Eqs. (A1)and (A2) comes from the quadrupole ﬁelds. To evaluatethose terms we write the second derivatives in the Larmorframe in the following way  x (cid:48)(cid:48) Q y (cid:48)(cid:48) Q  = R − ( φ ) (cid:88) Quads K Q  cos 2 ψ sin 2 ψ sin 2 ψ − cos 2 ψ  R ( φ )  xy  (A4)Here the Larmor variables, x and y , are ﬁrst transformedback to the lab frame, where Eq. (17) describes thequadrupole ﬁelds. The transverse Lorentz force is com-puted and the lab frame, and the resulting acceleration istransformed to the Larmor frame. Multiplying the threematrices in Eq. (A4) gives R − ( φ ) (cid:88) Quads K Q  cos 2 ψ sin 2 ψ sin 2 ψ − cos 2 ψ  R ( φ )= − (cid:88) Quads K Q  − cos (2 φ − ψ ) sin (2 φ − ψ )sin (2 φ − ψ ) cos (2 φ − ψ )  (A5)Inserting Eq. (A5) into Eq. (A4) and then insertingthe resulting second derivatives in the expressions in Eq.(A1) and (A2) for the rates of change of the momentsgives rise to the second contribution to the matrix O inEq. (14) and the ﬁrst contribution to the vector N in Eq.(15).The evaluation of the self-ﬁeld force is done in whatwe label the beam frame. In the beam frame the beamis assumed to have an elliptical shape with major andminor semi-axes radii a and b , aligned with the x and y directions respectively. The x direction makes anangle α with respect to the x-axis in the Larmor frame. The transformation from beam frame to Larmor framecoordinates is thus given by  xy  = R ( φ )  x y  (A6)The beam is assumed to have uniform density withinan elliptical cross section in the beam frame. Thus, thebeam frame moments satisfy, (cid:104) x (cid:105) = a / , (cid:104) y (cid:105) = b / , (cid:104) x y (cid:105) = 0 (A7)This implies through application of Eq. (A6) for theLarmor frame moments, (cid:104) x (cid:105) = 18 (cid:104) ( a + b ) + cos (2 α )( a − b ) (cid:105) (cid:104) y (cid:105) = 18 (cid:104) ( a + b ) − cos (2 α )( a − b ) (cid:105) (cid:104) xy (cid:105) = 18 ( a − b ) sin (2 α ) (A8)The Larmor frame values for the elements of Q are thenrelated to the radii a and b and the angle α by Q + = 12 (cid:104) x + y (cid:105) = 18 ( a + b ) Q − = 12 (cid:104) x − y (cid:105) = cos (2 α )8 ( a − b ) Q x = (cid:104) xy (cid:105) = sin (2 α )8 ( a − b ) (A9) Appendix B: SELF-FIELDS ANDTRANSFORMATIONS

The eﬀective electric potential due to a beam with el-liptical cross section far from conducting boundaries isgiven byΦ = − a + b ) (cid:20) x a + y b (cid:21) = − ab (cid:20)

12 ( x + y ) + b − ab + a (cid:18) cos (2 α )2 ( x − y ) + sin (2 α ) xy (cid:19)(cid:21) (B1)where the ﬁrst expression is in the beam frame and thesecond expression has been transformed to the Larmorframe. The current strength parameter Λ is deﬁned inEq. (20). The relation between the eﬀective potentialand the space-charge potential φ sc is Φ = qφ sc / ( mν z γ ).One of the three powers of γ in the denominator of theeﬀective potential accounts for the eﬀective relativisticmass increase of the beam particles, and the other two powers accounts for the self magnetic ﬁeld. The acceler-ations due to self-ﬁelds then follow from  x (cid:48)(cid:48) Λ y (cid:48)(cid:48) Λ  = −  ∂/∂x∂/∂y  Φ( x, y ) (B2)When Eq. (B2) is inserted in Eqs. (A1) and (A2), andthe parameters a , b , and α are expressed in terms of the4moments Q using Eq. (A9), the result is the self-ﬁeldcontributions to the matrix O and vector N appearingin Eqs. (14) and (15) along with deﬁnitions in Eq. (19).Equations (13a)-(13d) describe the evolution of themoments in the Larmor frame. If the values of the mo-ments in the lab frame are desired, these can be recoveredby the following transformations, Q lab = R · Q (B3a) P lab = R · P + 2 φ (cid:48) R · Q (B3b) E lab = R · E + 2 φ (cid:48) R · P + 2 φ (cid:48) R · Q + 2 φ (cid:48) + L (B3c) L lab = L + 2 φ (cid:48) Q + (B3d)Here the following matrices are deﬁned R =  φ ) − sin (2 φ )0 sin (2 φ ) cos (2 φ )  (B4a) R =  − sin (2 φ ) − cos (2 φ )0 cos (2 φ ) − sin (2 φ )  (B4b) R =  − cos (2 φ ) sin (2 φ )0 − sin (2 φ ) − cos (2 φ )  (B4c) + =   (B4d)The adjoint equation (26c) has an added term δ ˙ E ( Y ) ,which is added to cancel out the terms in Eq. (28) thatcome from the dependence of the matrix O and vec-tor N on the changes in the self-ﬁelds. These changesare proportional to changes in the moments δ Q ( X ) . Torepresent these changes in the O matrix we ﬁrst sep-arate the changes due to the self-ﬁelds (Λ) from thechanges in the magnetic focusing parameters ( Q, B ), δ O ( X ) = δ O ( X ) Q,B + δ O ( X )Λ . We then construct a matrices M P and M Q that satisfy δ O ( X )Λ · P = M P · δ Q ( X ) (B5a) δ O ( X )Λ · Q = M Q · δ Q ( X ) (B5b)The vector N is treated similarly. We focus now on M P . We start by forming the vector R P = O Λ · P = Λ Q ∆  P + − Q − Q ∆ + Q + P − − Q x Q ∆ + Q + P x − Q − Q ∆ + Q + P + + P − − Q x Q ∆ + Q + P + + P x  (B6)The elements of the matrix M P are generated by diﬀer-entiating each element of vector R P with respect to eachelement of vector Q , M P = (cid:20) ∂ R P ∂ Q (cid:21) T (B7)To keep track of terms we decompose M P according to M P = V U T + V U T + V U T + V U T (B8)where V = − Λ Q  P + − Q − Q ∆ + Q + P − − Q x Q ∆ + Q + P x − Q − Q ∆ + Q + P + + P − − Q x Q ∆ + Q + P + + P x  (B9a) U T = (cid:16) ∂Q ∆ ∂Q + ∂Q ∆ ∂Q − ∂Q ∆ ∂Q x (cid:17) = 1 Q ∆ (cid:0) Q + − Q − − Q x (cid:1) (B9b) V = Λ Q ∆ ( Q ∆ + Q + )  Q − P − + Q x P x Q − P + Q x P +  (B9c) U T = (cid:16) ∂ ( Q ∆ + Q + ) ∂Q + ∂ ( Q ∆ + Q + ) ∂Q − ∂ ( Q ∆ + Q + ) ∂Q x (cid:17) = U T + (cid:0) (cid:1) (B9d) V = − Λ Q ∆ ( Q ∆ + Q + )  P − P +  , V = − Λ Q ∆ ( Q ∆ + Q + )  P x P +  (B9e) U T = (cid:0) (cid:1) , U T = (cid:0) (cid:1) (B9f)The matrices M Q and M N are constructed in the sameway.5 Appendix C: SCALING PROPERTIES

Finally, we discuss the scaling properties of our sys-tem of moment equations. Let us suppose that we havefound a solution to the moment equations in the form offunctions Q ( z ) , P ( z ) , E ( z ) , O ( z ) , N ( z ) , L ( z ) (C1)It can then be veriﬁed that a scaled solution that is alsosatisfying the moment equations is Q ( z ) = λ Q ( (cid:15)z ) P ( z ) = λ(cid:15) P ( (cid:15)z ) E ( z ) = λ(cid:15) E ( (cid:15)z ) O ( z ) = (cid:15) O ( (cid:15)z ) N ( z ) = (cid:15) N ( (cid:15)z ) L ( z ) = λ(cid:15)L ( (cid:15)z ) (C2)Here (cid:15), λ are arbitrary constants allowing for separatescaling of the axial dependence and size of the beam. If (cid:15) > (cid:15) < λ > O , N inEq. (C2) , there are a number of conditions that mustbe satisﬁed to give the required (cid:15) scaling. The solenoidal ﬁeld contribution requires, k Ω ( z ) = (cid:15) k Ω0 ( (cid:15)z ) (C3)This means a shorter solution requires a strongersolenoidal ﬁeld. The scaled solution gives for the phase φ ( z ) = φ ( (cid:15)z ) (C4)Thus, the values of the phase in the locations of thequadrupoles are preserved under the scaling. For thequadrupoles, strict application of the scaling gives K q ( z ) = (cid:15) K q ( (cid:15)z ) (C5)However, in the thin lens approximation only the inte-grated value of the quadrupole ﬁeld matters K = (cid:90) dz K q ( z ) = (cid:15) (cid:90) (cid:15) dz K q ( (cid:15)z ) = (cid:15)K (C6)In this approximation the strength of the quadrupole ﬁeldalso scales inversely with length.The self-ﬁeld contribution to the O and N matricescan also be preserved. We note that these contributionsscale as O, N ∝ Λ /Q (C7)So to preserve the solution we require O ∝ Λ /Q = Λ / ( (cid:15)λQ ) = (cid:15) O ∝ (cid:15) Λ /Q (C8)As a result, the current parameter scales asΛ = λ(cid:15) Λ (C9)Thus, scaling length or amplitude requires changingbeam current to maintain space-charge inﬂuence. [1] S. Director and R. Rohrer, “The generalized adjoint net-work and network sensitivities,” IEEE Transactions onCircuit Theory , 318–323 (1969).[2] N. K. Nikolova, M. H. Bakr, and J. W. Bandler,“Sensitivity analysis of network parameters with electro-magnetic frequency-domain simulators,” IEEE Transac-tions on Microwave Theory and Techniques , 670–681(2006).[3] A. Jameson, “Optimum aerodynamic design usingcfd and control theory,” CFD Review (1995),10.2514/6.1995-1729.[4] T.M. Antonsen, D. Chernin, and J.J. Petillo, “Adjointapproach to beam optics sensitivity based on hamilto-nian particle dynamics,” Physics of Plasmas , 013109(2019).[5] T.M. Antonsen, B.L. Beaudoin, L. Dovlatyan, andI. Haber, “Adjoint Approach to Accelerator Lattice De-sign,” in Proc. NAPAC’19 , North American Particle Ac-celerator Conference No. 4 (JACoW Publishing, Geneva,Switzerland, 2019) pp. 376–378. [6] Ya. Derbenev,

Adapting optics for high-energy electroncooling , Tech. Rep. UM-HE-98-04-A (1998).[7] A.V. Burov and V.V. Danilov,

An Insertion to elimi-nate horizontal temperature of high-energy electron beam ,Tech. Rep. FERMILAB-TM-2043 (1998).[8] A.V. Burov and S. Nagaitsev,

Courant-Snyder parame-ters of beam adapters , Tech. Rep. FERMILAB-TM-2114(2000).[9] A.V. Burov, S. Nagaitsev, and Ya. Derbenev, “Circularmodes, beam adapters, and their applications in beamoptics,” Phys. Rev. E , 016503 (2002).[10] A.W. Chao and M. Tigner, Handbook of AcceleratorPhysics and Engineering (World Scientiﬁc, Singapore,1999).[11] Ya. Derbenev, “Advanced optical concepts for electroncooling,” Nucl. Instrum. Meth. A , 223–233 (2000).[12] R. Brinkmann, Ya. Derbenev, and K. Flottmann,

Aﬂat beam electron source for linear colliders , Tech. Rep.DESY-TESLA-99-09 (1999). [13] R. Brinkmann, Y. Derbenev, and K. Fl¨ottmann, “A lowemittance, ﬂat-beam electron source for linear colliders,”Phys. Rev. ST Accel. Beams (2001), 10.1103/Phys-RevSTAB.4.053501.[14] D. Edwards et al. , “The Flat beam experiment at theFNAL photoinjector,” eConf C000821 , MOB13 (2000),arXiv:physics/0008042.[15] D. Edwards et al. , “Status of ﬂat electron beam produc-tion,” in

PAC 2001. Proceedings of the 2001 Particle Ac-celerator Conference (Cat. No.01CH37268) , Vol. 1 (2001)pp. 73–75.[16] S. Wang, I. Corlett, S. Lidia, J. Staples, and A. Zho-lents, “Flat beam production in low energy injectors,” in

Proceedings of the 2003 Particle Accelerator Conference ,Vol. 5 (2003) pp. 3198–3200.[17] S. Lidia, “Emittance compensation studies of photoinjec-tor beams with angular momentum,” in

Proceedings ofthe 2003 Particle Accelerator Conference , Vol. 3 (2003)pp. 2089–2091.[18] A. Ody, P. Musumeci, J. Maxson, D. Cesar, R. Eng-land, and K. Wootton, “Flat electron beam sources forDLA accelerators,” Nuclear Instruments and Methods inPhysics Research Section A: Accelerators, Spectrometers,Detectors and Associated Equipment , 75–83 (2017).[19] F.W. Cropp V et al. , “Maximizing 2-D Beam BrightnessUsing the Round to Flat Beam Transformation in theUltralow Charge Regime,” in

Proc. NAPAC’19 , NorthAmerican Particle Accelerator Conference No. 4 (JACoWPublishing, Geneva, Switzerland, 2019) pp. 986–989. [20] A. Wolski, “Alternative approach to general coupledlinear optics,” Phys. Rev. ST Accel. Beams , 024001(2006).[21] E. Thrane et al. , “Photoinjector production of a ﬂatelectron beam,” in (2002) pp. 308–310.[22] Y. Sun, Angular-momentum-dominated electron beamsand ﬂat-beam generation , Ph.D. thesis, Chicago U.(2005).[23] K. Kim, “Round-to-ﬂat transformation of angular-momentum-dominated beams,” Phys. Rev. ST Accel.Beams , 104002 (2003).[24] G. Rangarajan, F. Neri, and A. Dragt, “Generalizedemittance invariants,” in Proceedings of the 1989 IEEEParticle Accelerator Conference, . ’Accelerator Scienceand Technology , Vol. 2 (1989) pp. 1280–1282.[25] A. Friedman et al. , “Computational Methods in theWarp Code Framework for Kinetic Simulations of Par-ticle Beams and Plasmas,” IEEE Trans. Plasma Sci. ,1321–1334 (2014).[26] I. M. Kapchinsky and V. V. Vladimirsky, in Proceed-ings, International Conference on High Energy Accelera-tors (CERN, Geneva, Switzerland, 1959) p. 274.[27] W. W. Zhang, S. Bernal, H. Li, T. Godlove, R. A.Kishek, P. G. O’Shea, M. Reiser, V. Yun, and M. Ven-turini, “Design and ﬁeld measurements of printed-circuitquadrupoles and dipoles,” Phys. Rev. ST Accel. Beams3