[PDF] Self-consistent bulge/disk/halo galaxy dynamical modeling using integral field kinematics

Abstract

We introduce a method for modeling disk galaxies designed to take full advantage of data from integral field spectroscopy (IFS). The method fits equilibrium models to simultaneously reproduce the surface brightness, rotation and velocity dispersion profiles of a galaxy. The models are fully self-consistent 6D distribution functions for a galaxy with a Sersic-profile stellar bulge, exponential disk and parametric dark matter halo, generated by an updated version of GalactICS. By creating realistic flux-weighted maps of the kinematic moments (flux, mean velocity and dispersion), we simultaneously fit photometric and spectroscopic data using both maximum-likelihood and Bayesian (MCMC) techniques. We apply the method to a GAMA spiral galaxy (G79635) with kinematics from the SAMI Galaxy Survey and deep g - and r -band photometry from the VST-KiDS survey, comparing parameter constraints with those from traditional 2D bulge-disk decomposition. Our method returns broadly consistent results for shared parameters, while constraining the mass-to-light ratios of stellar components and reproducing the HI-inferred circular velocity well beyond the limits of the SAMI data. While the method is tailored for fitting integral field kinematic data, it can use other dynamical constraints like central fibre dispersions and \HI circular velocities, and is well-suited for modelling galaxies with a combination of deep imaging and HI and/or optical spectra (resolved or otherwise). Our implementation (MagRite) is computationally efficient and can generate well-resolved models and kinematic maps in under a minute on modern processors.

Full PDF

DDraft version October 5, 2018

Preprint typeset using L A TEX style emulateapj v. 01/23/15

SELF-CONSISTENT BULGE/DISK/HALO GALAXY DYNAMICAL MODELING USING INTEGRAL FIELDKINEMATICS

D. S. Taranu , , D. Obreschkow , , J. J. Dubinski , L. M. R. Fogarty , , J. van de Sande , B. Catinella , L.Cortese , A. Moffett , A. S. G. Robotham , J. T. Allen , , J. Bland-Hawthorn , J. J. Bryant , , , M.Colless , , S. M. Croom , , F. D’Eugenio , R. L. Davies , M. J. Drinkwater , , S. P. Driver , M. Goodwin , I. S.Konstantopoulos , J. S. Lawrence , ´A. R. L´opez-S´anchez , , N. P. F. Lorente , A. M. Medling , , , J. R.Mould , M. S. Owers , , C. Power , , S. N. Richards , , , C. Tonini Draft version October 5, 2018

ABSTRACTWe introduce a method for modeling disk galaxies designed to take full advantage of data fromintegral ﬁeld spectroscopy (IFS). The method ﬁts equilibrium models to simultaneously reproducethe surface brightness, rotation and velocity dispersion proﬁles of a galaxy. The models are fullyself-consistent 6D distribution functions for a galaxy with a Sersic-proﬁle stellar bulge, exponentialdisk and parametric dark matter halo, generated by an updated version of GalactICS. By creatingrealistic ﬂux-weighted maps of the kinematic moments (ﬂux, mean velocity and dispersion), we si-multaneously ﬁt photometric and spectroscopic data using both maximum-likelihood and Bayesian(MCMC) techniques. We apply the method to a GAMA spiral galaxy (G79635) with kinematics fromthe SAMI Galaxy Survey and deep g - and r -band photometry from the VST-KiDS survey, comparingparameter constraints with those from traditional 2D bulge-disk decomposition. Our method returnsbroadly consistent results for shared parameters, while constraining the mass-to-light ratios of stellarcomponents and reproducing the H i -inferred circular velocity well beyond the limits of the SAMIdata. While the method is tailored for ﬁtting integral ﬁeld kinematic data, it can use other dynamicalconstraints like central ﬁbre dispersions and H i circular velocities, and is well-suited for modellinggalaxies with a combination of deep imaging and H i and/or optical spectra (resolved or otherwise).Our implementation (MagRite) is computationally eﬃcient and can generate well-resolved models andkinematic maps in under a minute on modern processors. Subject headings: galaxies: spiral — galaxies: structure — galaxies: fundamental parameters —methods: data analysis INTRODUCTIONThe physical properties of nearby spiral galaxies aretypically derived by ﬁtting a number of distinct com-ponents to broadband images, either using azimuthally-averaged 1D proﬁles or directly in 2D (e.g. Peng et al. International Centre for Radio Astronomy Research, TheUniversity of Western Australia, 35 Stirling Highway, Crawley,WA, 6009, Australia ARC Centre of Excellence for All-sky Astrophysics (CAAS-TRO) Department of Astronomy and Astrophysics, University ofToronto, 50 St. George Street, Toronto, ON, Canada, M5S 3H4 Sydney Institute for Astronomy, School of Physics, A28, TheUniversity of Sydney, NSW, 2006, Australia Australian Astronomical Observatory, PO Box 915, NorthRyde, NSW, 1670, Australia Research School of Astronomy and Astrophysics, AustralianNational University, Canberra, ACT, 2611, Australia Astrophysics, Department of Physics, University of Oxford,Denys Wilkinson Building, Keble Rd., Oxford, OX1 3RH, UK School of Mathematics and Physics, University of Queens-land, QLD, 4072, Australia Envizi Suite 213, National Innovation Centre, AustralianTechnology Park, 4 Cornwallis Street, Eveleigh NSW 2015, Aus-tralia Department of Physics and Astronomy, Macquarie Univer-sity, NSW, 2109, Australia Cahill Center for Astronomy and Astrophysics CaliforniaInstitute of Technology, MS 249-17, Pasadena, CA, 91125, USA Hubble Fellow Swinburne University, Hawthorn, VIC, 3122, Australia Melbourne University, School of Physics, Parkville, VIC,3010, Australia S ´ersic proﬁle bulge (Simard et al. 2011) - appro-priate parameterizations for galaxy disks and “classical”dispersion-supported bulges (Gadotti 2009), respectively.Additional features like bars, spiral arms and dust areusually only modelled for well-resolved nearby galaxies.Photometric bulge-disk decomposition has several ma-jor drawbacks. Firstly, the best-ﬁt 2D model may beimpossible to reproduce with more realistic 3D densityproﬁles or a 6D phase space distribution function (DF)- a serious concern, since most nearby galaxies are dy-namically relaxed systems close to virial equilibrium. 2Dmodels may be unable to produce a stable equilibriumsystem, or require an unrealistic dark matter halo den-sity proﬁle to reproduce the rotation curve. Therefore, itis desirable that ﬁtting methods exclude parameter com-binations that cannot create stable equilibrium modelsconsistent with the galaxy’s dynamics.Bulge-disk decompositions can also produce ambigu-ous results. For ﬁts with an exponential and a S ´ersic pro-ﬁle, it is often assumed that the exponential component isa disk, whereas the S ´ersic component is a bulge; however,the bulge can have a best-ﬁt S ´ersic index n s ≈

1, leav-ing only the size and ellipticity to distinguish it from thedisk. Furthermore, bulges are typically centrally concen-trated and compact, and therefore diﬃcult to resolve be- a r X i v : . [ a s t r o - ph . GA ] O c t Taranu et al.yond z > .

05 with seeing-limited imaging (Kelvin et al.2014).Kinematic data can break these degeneracies, even ifspatially unresolved. A single-ﬁbre central velocity dis-persion can be used to infer the presence of a “clas-sical”, dispersion-supported bulge. Similarly, an unre-solved H i i surface density proﬁle.Integral-ﬁeld spectroscopy (IFS) permits the inferenceof spatially resolved rotation and dispersion proﬁles bytaking multiple spectra across each galaxy. Already-completed single-target surveys include SAURON (deZeeuw et al. 2002), ATLAS3D (Cappellari et al. 2011),and CALIFA (S´anchez et al. 2012, 2016) - with nearly1,000 galaxies between them - whereas ongoing multi-plexed surveys like SAMI (Croom et al. 2012; Bryantet al. 2015) and MaNGa (Bundy et al. 2015) have eachobserved ∼ ∼ i spectra.There are two major challenges in interpreting IFSkinematic maps. First, extracting information on galaxykinematics requires careful modelling to account for ob-servational and instrumental eﬀects, particularly “beam-smearing” - the tendency for a point-spread function(PSF) to blur ordered rotation across a galaxy, artiﬁciallyincreasing the velocity dispersion. Creating spectral dat-acubes by stacking dithered observations has an adverseimpact on image resolution, particularly in the presenceof diﬀerential atmospheric refraction (Sharp et al. 2015;Law et al. 2015) - an issue which can and should beresolved by forward modelling rather than in the datareduction process. Secondly, IFS maps may not havesuﬃcient spatial coverage or signal-to-noise to reach thepeak of a typical galactic rotation curve, whereas evenunresolved 21cm H i spectra can, since H i disks are typi-cally more extended than stellar disks (e.g. Walter et al.2008; Wang et al. 2016).Our new modelling method is designed to resolve theissues outlined above. We create dynamical modelsfrom fully self-consistent phase space DFs, then gener-ate synthetic observations of the kinematic moments tocompare with observed data. Using kinematic momentmaps allows for less ambiguous detections of dispersion-supported bulges. Synthetic observations reproduce bi-ases from beam smearing by the PSF/line-spread func-tion (LSF) and pixel discretization, allowing for simul-taneous ﬁtting of independent datasets. Finally, sincethe models are based on theoretically-motivated analyticdensity proﬁles, they predict reasonable extrapolationsbeyond the limits of the observed data - vital for es-timating the angular momentum in extended disks (Ro-manowsky & Fall 2012; Obreschkow & Glazebrook 2014).Existing galaxy dynamical modelling methods include(Schwarzschild 1979) modelling (e.g. Cappellari et al.2006), Jeans’ modelling (as reviewed by Courteau et al.2014), made-to-measure (Syer & Tremaine 1996) andaction-based modelling (e.g. Binney & McMillan 2011). However, most such methods are not speciﬁcally de-signed to perform bulge-disk decomposition (but seeVasiliev & Athanassoula 2015) and many do not neces-sarily produce self-consistent DFs (e.g. Trick et al. 2016,who model the Milky Way’s disk DF including a halopotential but no halo DF). Portail et al. (2017) ﬁt MilkyWay data using a near-equilibrium M2M model with adisk, halo, bulge and bar, but at a signiﬁcant compu-tational cost of 190 CPU-hours for 25 iterations. Ourmethod solves both problems, generating synthetic ob-servations of near-equilibrium bulge/disk/halo models ef-ﬁciently enough to ﬁt data from large surveys like SAMI.In §

2, we describe the data sources for the samplegalaxy used in this pilot study. In §

3, we describe eachstep of the method in greater detail. In §

4, we showmore detailed results and comparisons to 2D bulge-diskdecomposition, summarizing conclusions and outliningfuture directions in §

5. Three appendices detail system-atic tests of model integration accuracy (Appendix A),stability (Appendix B) and ﬁt robustness (Appendix C).Two further appendices discuss degeneracy/biases whenmodels ﬁt poorly (Appendix D) and when ﬁtting inclinedthick disks (Appendix E). Lastly, Appendix F details the G alactICS method used to build galaxy models. Futurepapers will provide ﬁts to a larger sample of SAMI galax-ies. DATAWe choose a well-resolved, massive SAMI spiral galaxy(G79635), with M (cid:63) ≈ . M (cid:12) (Taylor et al. 2011)from broadband spectral energy distribution ﬁts. Stel-lar kinematics are derived using single-Gaussian, two-moment pPXF (Cappellari & Emsellem 2004) ﬁts to datafrom the blue and red arms combined, degrading thered arm (FWHM=1.696˚A, covering the redder half ofSDSS r band) to match the blue arm’s spectral reso-lution (2.717˚A, covering the SDSS g band); see van deSande et al. (2017) and Fogarty et al. (2015) for details.We create “ SAM Igr ” ﬂux maps by collapsing the spec-tral cube, masking emission and sky lines as in van deSande et al. (2017). Flux uncertainties include approx-imate covariances (Sharp et al. 2015) added in quadra-ture to the shot/read noise, along with a ﬂat systematicuncertainty corresponding to 10% (2.1%) of the faintest(peak) surface brightness. The dispersion maps excludeoutliers from the best-ﬁt radial proﬁle. The PSF is aMoﬀat (1969) ellipse with 1.83” FWHM, derived via a P roFit (Robotham et al. 2017) ﬁt to the reference star’sﬂux map (obtained from its spectral cube exactly as forthe galaxy). g - and r -band images are from the VST-KiDS survey(de Jong et al. 2013, 2015), which covers GAMA (Driveret al. 2011) and SAMI survey regions. Uncertainties areestimated from the eﬀective gain and local sky bright-ness. PSFs are Moﬀat ellipses with 1.16” ( g ) and 0.54”( r ) FWHM, derived from a simultaneous P roFit ﬁt to39 nearby point sources. G79635 also has an H i spec-trum from the ALFALFA (Haynes et al. 2011) α .70 datarelease . METHODS http://egg.astro.cornell.edu/alfalfa/data/ elf-consistent IFS galaxy modelling 3 Fig. 1.—

An illustration of how various multi-wavelength data can constrain disk galaxy dynamics over diﬀerent regions of a typicalmassive spiral galaxy with an extended disk, concentrated bulge and ﬂat rotation curve.

The method solves a non-linear optimization problemusing a parametric galaxy model, constrained by 2Dkinematic moment maps or derived quantities thereof.First, a model phase space DF must be generated ( § § § § M agRite - is based on C/C++libraries with an R (R Core Team 2016) interface forﬁtting. 3.1. Galaxy Models

The models are generated using an updated version3.0 of the G alactICS (Kuijken & Dubinski 1995; Widrowet al. 2008) galaxy initial conditions code, to be detailedin a future paper (Dubinski et al., in prep). G alactICShas previously been used to model surface brightness pro-ﬁles and rotation curves of local group galaxies (Widrow& Dubinski 2005; Widrow et al. 2008) and NGC6503(Puglielli et al. 2010), but not 2D images/kinematicmaps. The core functions of the updated code are as de-scribed in Widrow et al. (2008). Key diﬀerences includethe adoption of a logarithmic grid (previously linear),and the use of GNU Scientiﬁc Library (Galassi 2009)splines to create smooth diﬀerentiable functions for tab-ulated DFs and multipole expansion coeﬃcients for thepotential, both of which allow for more accurate func-tion and derivative/integral evaluations using fewer gridelements than in earlier versions.GalactICS generates equilibrium DFs for galaxies withthree components: • An exponential stellar disk with mass M d,in , scale radius R d and scale height z d , where ρ ∝ exp ( − R/R d )sech ( z/z d ); • A (deprojected) S ´ersic proﬁle stellar bulge withscale velocity v b and eﬀective radius R b , where ρ ( r ) ∝ ( r/R b ) − p exp ( − b n ( r/R e ) /n s ), p = 1 − . /n s + 0 . /n s (Prugniel & Simien 1997),and b n scales such that R b is the projected half-light radius (Graham & Driver 2005); and: • A generalized Navarro et al. (1997, hereafter NFW)dark matter halo with scale velocity v h , scale radius r h , where ρ ∝ [( r/r h ) α (1 + r/r h ) ( β − α ) ] − , and α =1 , β = 3 for a “pure” NFW proﬁle.The minimal set of six free parameters includes asize and mass/scale velocity per component: R b and v b (bulge); R d and M d,in (disk); r h and v h (halo). Fourparameters control density proﬁles: n s (bulge), z d (diskscale height), and α , β (halo); we ﬁx β = 3 but leavethe others free. We ﬁt the disk radial central (cylindri-cal) radial velocity dispersion σ R , the square of whichthen declines exponentially with R d . Finally, we ﬁx thestreaming fractions f s,b = f s,h = 0 . L z ), givingnon-rotating bulges and halos.Any component c can be truncated at a radius r t,c withscale length dr t,c , such that ρ trunc ( r ) = ρ nominal ( r )[1 +exp (( r − r t,c ) /dr t,c )] − . Truncation is only strictly nec-essary for the halo because the NFW proﬁle has a di-vergent total mass. Nonetheless, we ﬁt the disk r t,d and dr t,d (see § r t,b = 10 R b ( dr t,b = R b ) and halo r t,h = 50 r s,h ( dr t,h = 7 . r s,h ). This adds an additional two free pa-rameters to the previous nine. G alactICS derives a DF for each spherical compo-nent using Eddington’s formula (e.g., Binney & Tremaine Taranu et al.2008) and iteratively computes corrections to an ana-lytic DF describing the disk. G alactICS then ﬁnds apotential/density pair for each component which is con-sistent with these DFs. The ﬁnal radial density proﬁleof each component only diﬀers slightly from its originalparametrization (see Appendix F). The key diﬀerencesare that spherical components (bulge and halo) are ﬂat-tened slightly in response to the presence of the disk po-tential, and that the integrated properties of components(e.g. disk mass M d ) can deviate slightly from their inputvalues. Although options have been recently added to G alactICS to further correct the disk density such thatthe output mass and proﬁle match the input values moreclosely, we omitted this step pending further testing ofthese new features.Despite these caveats, Appendix B shows that un-der normal circumstances, G alactICS models begin innear-perfect equilibrium; perturbations on the order ofa few percent result only for extreme parameter combi-nations. More importantly, models with large Toomre(1964) Q parameters are stable against secular evolu-tion. While G alactICS is not restricted to generatingmodels with large Q - this depends mostly on the val-ues of z d and σ R - it can be guided to do so if neces-sary through priors on these input parameters or on theminimum Q value at certain radii. G alactICS also con-verges to a near-equilibrium DF in ∼

30 seconds withoutexpensive orbital integration - a major advantage overSchwarzschild/made-to-measure methods and the key re-quirement to permit Bayesian analyses of large samplesof galaxies.3.2.

Distribution Function Integration

The default G alactICS integration scheme samples theDF using a Monte-Carlo acceptance-rejection method,which is ideal for generating unbiased, equal-mass N-body particle initial conditions. However, the rejectionstep is ineﬃcient unless a suitable (i.e. strictly largerthan the target distribution but only by a small margin)approximate sampling distribution is known. Unbiasedsampling is not optimal for accurate integration over auniform grid, because the fractional error is not constantbut scales with density, so low-density (outer) regionshave larger relative errors than the (possibly excessively)accurately-sampled inner regions. Lastly, stochastic in-tegration can induce spurious variations in the likelihoodwith small changes in parameter values. Evaluating thesame model with a diﬀerent random sequence or evena slightly diﬀerent model with an identical random seedcan result in spurious diﬀerences in integrated quanti-ties and the resulting model likelihoods, depending onthe number of samples. As a result, we chose to de-velop a faster and less stochastic grid-based integrationscheme which we will now describe in greater detail. Formore detailed comparisons between these two integrationschemes, see Appendix A.We integrate the DF in its native cylindrical coordi-nate system and then project it rather than integratingover projected coordinates, as is usually done for 2D sur-face brightness proﬁles. For the remainder of this sec-tion, we will use the mathematical convention where theazimuthal angle is θ rather than the physics convention( φ ). The disk DF is parametrized as f ( R, z, v R , v θ , v z ).It is independent of θ and symmetric over all axes except v θ . The DFs of spherical components (bulge and halo)are internally functions of energy and L z , but are re-parametrized as f ( R, z, v ) for convenience. Integratingthe model over a cylindrical grid allows for some eﬃ-cient optimizations, whereas integrating down the line ofsight requires repeated unique transformations at eachprojected position. Rotating and projecting cylindricalgrid elements into the sky plane does present a challenge.Doing so exactly requires computing the fraction of thevolume of a 3D tilted ring (with a ﬁxed height) projectedwithin each spaxel. However, this can be roughly approx-imated by further discretizing each annular ring into sec-tors, and then assigning the mass within each sector tothe spaxel containing its center of mass (in projection).The discretized disk integration grid for G79635 isshown in Figure 2 (top-left panels). The scheme is de-signed to create nearly equal-mass bins. The radial gridis roughly logarithmic - each bin covers a radius con-taining 1/ N R of the total mass of an ideal, thin expo-nential disk. The inner and outer bins are oversampledto minimize gaps at large radii and improve accuracynear the galactic center: 15% of the bins cover the in-ner 0.5 R d , whereas 35% cover R > R d . The bins arestaggered radially to spread them more evenly in projec-tion. Vertically, the grid covers 0 < z < z d , spacedto cover equal masses until switching to linear spacingnear the upper limit. For each R - z element, the disk DFis integrated over all v R , v z , v θ within ( (cid:104) v R (cid:105) = 0) ± σ R ,( (cid:104) v z (cid:105) = 0) ± σ z and ( (cid:104) v θ (cid:105) ≈ v circ ) ± σ θ , discretized intoequal-velocity bins. Figure 2 shows the integration gridfor a single spatial bin, including major-axis 2D projec-tions and 1D probability distribution functions (PDFs).Typically, the DF at most spatial coordinates in the diskis nearly (but not exactly) a Gaussian ellipsoid.The bulge uses similar radial divisions, such that theinner and outer 20% of the bins contain 0 . M bulge and0 . M bulge , respectively, accounting for the steep slopein the S ´ersic proﬁle at small/large radii for large/smallvalues of n s , respectively. The radial grid is divided intoquadrants and then subdivided in linearly-spaced cellsalong the z-axis. The bulge DF is then integrated overall v < v esc .3.3. Synthetic Observation Pipeline

To generate synthetic images and kinematic maps,we use an updated version of the synthetic observationpipeline described in Taranu et al. (2013) and ironicallynamed “This Is Not A Pipeline” (TINAP). We assumean exponentially-declining star formation history for thedisk:

SF R ∝ exp ( − ( t − t ) /τ ) from t = 2 Gyr to12.92 Gyr (the Universe’s age at G79635’s z = 0 .

04 as-suming H = 70 kms − Mpc − , Ω m = 0 .

3, Ω λ = 0 . τ − to avoid discontinuities at τ = 0. The bulgeis modelled as a single burst with a free formation time t b . Both bulge and disk components have free metal-licities ( Z b and Z d , respectively). We use Maraston &Str¨omb¨ack (2011) grids to compute M (cid:63) /L in the threebands ( g , r and SAM Igr ), assuming no stellar popula-tion gradients within components. We spawn a minimumof 8 “particles” at the central

R, z of each grid bin with anevenly-spaced distribution from 0 < θ < π/

4, beginningat a random θ (the only stochastic part of the scheme)and duplicating particles in the seven remaining octants.The two left panels of Figure 3 show distributions ofelf-consistent IFS galaxy modelling 5 R [kpc] z [ k p c ] ++++++++++++++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ Density [log10(M star /M sol kpc − )] R [kpc] z [ k p c ] ++++++++++++++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ ++++++++++++ Mass [log10(M star /M sol )] v [km s − ] Velocity DFs M * / M Ȃ . vRvzv ̀ v z [ k m s − ] v z [ kms − ] v ̀ [ k m s − ] v R [ kms − ] v ̀ [ k m s − ] Fig. 2.—

Densities within selected bins of a low-resolution model grid (25 radial bins and 25kms − velocity bins). Left panels: Densityand mass within the model integration grid. The scheme produces roughly equal-mass bins over most of the disk. Top: Mass-weightedvelocity distribution functions for the cyan-highlighted bin in the left panels. The remaining panels show pairwise (logarithmic) densitiesintegrated over the third velocity axis, i.e. projections of the velocity “ellipsoid” (which is not perfectly ellipsoidal) at a given position inthe disk. − −

100 0 100 200 x x x v [km s − ] L SA M I [ L s o l ] − −

100 0 100 200 x x x TotalBulgeDisk

Fig. 3.—

A visualization of the model projection and map generation schemes. Left panels: Scatter plots of projected disk DF samples,color-coded by line-of-sight velocity ( v LOS ). The top-left panel shows bins at similar radii but diﬀerent heights above/below the midplane,while the bottom-left panel shows a single z bin with ellipses at same height above/below the midplane but diﬀerent radii. Right panel:Scatter plot of DF samples for the disk and bulge, color-coded by v LOS . The inset shows v LOS

DFs for the highlighted spaxel (green).Radially-staggered bins are distributed more evenly in projection but still create artifacts.

Taranu et al.disk particles at two ﬁxed radii but at diﬀerent heightsabove/below the disk midplane, colour-coded by v LOS (top), along with particles at diﬀerent radii but ﬁxedheights above/below the disk midplane (bottom). Af-ter binning particles spatially and in v LOS , every spaxelproduces its own v LOS

PDF (right panel inset, Fig-ure 3). These PDFs are 2D integrals of line-of-sight pro-jections of the 3D velocity ellipsoids, and so kinematicmoments are sensitive to the disk’s vertical structure andanisotropy. It is worth emphasizing that Figure 2 showsa very coarse integration grid with just 25 radial bins,whereas for G79635 we use 100 bins, eliminating mostdiscreteness eﬀects. However, the X-shaped pattern ofgaps remains even for very ﬁne grids. This is essentiallya Moir´e pattern generated by overlaying an elliptical gridonto a rectangular one. The eﬀect is minimized but noteliminated by staggering radial bins. In practice, thepatterns are small enough to be virtually invisible afterPSF convolution and could be avoided entirely with moresophisticated schemes for gridding the model DF, whichare under development.For stellar kinematics, v LOS cubes are convolved withthe PSF and spectral line spread function (LSF), bothof which are oversampled threefold. Finally, we measurethe kinematic moments in each spaxel, subtracting theLSF dispersion in quadrature for the second moment.Gaussian ﬁts to G79635’s v LOS

PDFs are indistinguish-able from direct measurements of v LOS , σ , so we use thelatter. As discussed in Appendix C, this choice may notbe suitable for galaxies with more massive and extendedbulges, so we are investigating alternatives for future ap-plications.Generating synthetic kinematic maps requires nine ad-ditional free parameters - the disk inclination and posi-tion angle, oﬀsets for x, y, v los , and ages and metallicitiesfor the bulge and disk - bringing the total to twenty onefree parameters. 3.4.

Model Fitting

Wherever possible, initial parameter estimates andprior means are obtained from 2D P roFit ﬁts. All priorsare assumed to be normal and broad ( σ ≈ C MAES; Hansen 2006). We then perform BayesianMCMC using the Componentwise Hit-And-Run Monte-Carlo (CHARM) sampler of the L aplacesDemon R pack-age . The likelihood function is the sum of the log-likelihoods from each map, assuming either a chi-squaredistribution for image residuals, or a sum of normally-distributed residuals for kinematic maps with less well-deﬁned errors. Residuals are deﬁned as χ i = ( data i − model i ) /error i and χ = Σ i (( data i − model i ) /error i ) .Note that although we quote reduced χ ( χ red ) values,they are not used in the optimization procedure, whichinstead derives likelihoods from the chosen statistics’PDF directly (i.e. by calling the “dnorm” and “dchisq”functions in R ).Our C MAES code is based on the R cmaes package .We have modiﬁed both C MAES and L aplacesDemon, https://cran.r-project.org/web/packages/LaplacesDemon/ https://cran.r-project.org/web/packages/cmaes/ implementing runtime limits for supercomputer queues;these versions are available on github , . RESULTSThe best ﬁt for G79635 using SAMI and KiDS g + r is shown in Figure 4. The χ red for all of the ﬂux mapsis signiﬁcantly above unity. However, the largest r -bandresiduals clearly trace non-axisymmetric features like spi-ral arms and inter-arm gaps, and the similarity in resid-uals across independent data sets is encouraging, giventhe systematics introduced by SAMI’s cubing procedureand single-star ﬂux calibration. The r -band ﬁt is worstsimply due to its higher signal-to-noise (better seeing andlonger exposures than g ).Figure 5 shows 1D proﬁles azimuthally averaged overthe best-ﬁt P roFit disk ellipse, compared to a P roFit 2Ddouble S ´ersic r-band ﬁt with a free bulge position angle.The dispersion map/proﬁle is overﬁt and the best-ﬁt ro-tation curve appears to rise slightly too steeply, as canalso be seen in Figure 4 (where the velocity map residualsshow spatial coherence). Encouragingly, the predictedrotation curve at a ﬁducial radius of (3–3.4) R d (Catinellaet al. 2007) is within 10% of the independent ALFALFAH i W = (347 ± − measurement, even though theH i data was not used in the ﬁt and the SAMI data doesnot appear to reach the peak of the rotation curve. Thelower stellar velocity could be due to asymmetric drift,as it is not unusual for stellar disks with radial disper-sion support to have ∼

10% lower rotation speeds thangaseous disks (Ciardullo et al. 2004; Martinsson et al.2013; Brooks et al. 2017). The peak stellar velocity is alsoconsistent with the independent V max sin i = 165kms − circular speed derived by Cecil et al. (2016).The fact that the observed mean stellar velocity lieswell below the circular speed curve is due to a combina-tion of factors. First, the mean velocity within R e,disk is decreased due to beam smearing (compare the solidblue and dashed blue lines in Figure 5). This eﬀect ismodest beyond the peak of the rotation curve (comparethe solid and dashed rotation curves), although it con-tinues to boost the observed velocity dispersion by about10kms − . Note that estimates of the mean velocity anddispersion are unreliable beyond about 15kpc, where thedispersion drops well below the 60kms − velocity gridresolution. Also, there is some subjectivity in how 1Dapertures are deﬁned. We measure velocities and ve-locity dispersions using data from spaxels within 5 and10 degrees of the major axis, respectively. The modelprojected velocity without beam smearing (dashed bluecurve in Figure 5) is measured within the same aperturesas the PSF-convolved version (solid line) and does nottake into account the fact that PSF convolution modiﬁesisophotes as well; however, since the 1D kinematics aremeasured close to the major axis, this eﬀect is minor.In the inner few kpc, the mean velocity is suppressedboth because of the presence of a non-rotating bulge andbecause the disk has a ﬁnite thickness, so that a largefraction of disk stars are a signiﬁcant distance away fromthe disk midplane. Beyond this inner region, asymmetricdrift and the non-zero radial and vertical velocity disper-sion of the disk continue to lower the mean velocity. Ob- https://github.com/taranu/LaplacesDemon https://github.com/taranu/cmaeshpc elf-consistent IFS galaxy modelling 7 KIDS g

PSF q=0.90FWHM=1.16'' D a t a KIDS r

PSF q=0.95FWHM=0.54''

SAMI gr

PSF q=0.96FWHM=1.84''

SAMI v [kms − ] SAMI σ [km s − ] M ode l M / L * [ d , b ] = [ . , . ] log([Rd,Rb]/kpc)=[0.81,-0.29]log([Md,Mb]/Msol)=[10.72,9.13] M / L * [ d , b ] = [ . , . ] χ =8.01e4, χ red2 =3.06 R e s i dua l s χ =1.68e5, χ red2 =6.46 χ =1.10e3, χ red2 =1.78 χ =1.28e3, χ red2 =1.96 χ =4.63e2, χ red2 =0.71 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 Fig. 4.—

Best-ﬁt G79635 model using SAMI moment 0-2 maps and KiDS g + r images, along with residuals relative to pixel/spaxeluncertainties. KiDS images are 200x200 (2 (cid:48)(cid:48) ticks) while SAMI are 36x36 (1 (cid:48)(cid:48) ticks). servations suggest that it is not unusual for stellar diskswith radial dispersion support to have ∼

10% lower ro-tation speeds than gaseous disks (Ciardullo et al. 2004;Martinsson et al. 2013; Brooks et al. 2017).Despite also ﬁtting the g -band image and SAMI kine-matics, the M agRite best-ﬁt model is a better ﬁt to theKiDS r -band image than a single-band, exponential disk P roFit ﬁt; P roFit only ﬁts slightly better with a free n s disk.Table 1 lists best-ﬁt values and uncertainties for M agRite model parameters and several key derivedquantities. Posterior distributions for selected commonparameters of the M agRite and P roFit ﬁts are shown inFigure 6. We ﬁnd that direct ﬁts to the data yield un-reasonably narrow PDFs, listed as σ obs. in Table 1. Totest whether M agRite is the cause of this eﬀect, we gen-erate and ﬁt noisy mock maps of the best-ﬁt model (seeAppendix C for a full description of the procedure). Weﬁnd no evidence for signiﬁcant bias in the best-ﬁt param-eter values. This form of “noise bias” can be signiﬁcantin low signal-to-noise image, as is the case in weak lens-ing studies (e.g Bernstein & Jarvis 2002; Refregier et al.2012). However, the parameter PDFs for ﬁts to the mockmaps are signiﬁcantly broader (Table 1) than when ﬁt-ting the actual data - in some cases by over two ordersof magnitude - and P roFit exhibits similar behaviour.As Figure 4 shows, an axisymmetric disk is not a goodﬁt to the ﬂux maps and cannot reproduce the spiral armstructure evident in the KiDS images (especially in r ).In general, models that ﬁt data poorly underestimateuncertainties signiﬁcantly, although the degree to whichthis occurs does depend on the model and ﬁt statistic. This result is not immediately obvious and we discussit further in Appendix D. Our solution of ﬁtting mockimages to obtain more realistic parameter uncertaintiesis necessary but likely insuﬃcient. That is, the σ mock inTable 1 should be interpreted as a lower bound on theuncertainty on each parameter in the highly idealizedscenario that the galaxy is perfectly described by themodel. There is no obvious prescription for estimatingor adjusting parameter uncertainties for models that donot ﬁt data well.Table 1 also lists ∆, the diﬀerence between the P roFitand M agRite best-ﬁt values for common parameters(whether derived or ﬁt directly). This can be consid-ered as an estimate of systematic uncertainties from us-ing two diﬀerent (but still similar) modelling methods.In all cases, | ∆ | is larger than sigma mock - sometimesby more than an order of magnitude. This suggests thatsystematic uncertainties dominate over statistical uncer-tainties. Unfortunately, we are unaware of any robustmethods for incorporating systematic uncertainties intoour likelihood functions, so the only obvious solution tothis issue remains increasing the model’s ﬂexibility untilit can reproduce the data.Our testing demonstrates that M agRite will recover in-put parameters correctly from idealized mock data, butthis does not guarantee realistic parameter values whenﬁtting real galaxies. For example, the M agRite modelhas an unrealistically large disk scale height z d = 1 . r t,d = 14 . R d = 6 . . . . F l u x [ − e r g s − c m − Å − ] ̀ [Å] [OII] H ̀ H ̀ H ̀ [OIII] [OIII] [OI] [NII] [NII] ̀ [SII] SAMI v [km s − ] -100 -50 0 50 100 150 200 SAMI σ [km s − ]

20 40 60 80 100 0 5 10 15 20 25

R ["] µ [ m ag / " ] R e,disk (ProFit) v [ k m / s ] R [kpc]

DataBulge (ProFit)Disk (ProFit)Total (ProFit)Total (MagRite)Stellar VelocityModel Velocity'' (no PSF)Circular VelocityStellar DispersionModel Dispersion'' (no PSF)ALFALFA HI W50 − v [km s − ] F l u x [ m Jy ] v sys v sys ±(W /2)ALFALFAHI KiDS gri Fig. 5.—

Data and 1D proﬁles for G79635. Clockwise from bottom-left: the SAMI dispersion and velocity maps, with a 2” diameteraperture (green); the SAMI spectrum within this aperture, with emission lines excluded from the ﬁt shaded in gray; the ALFALFA H i spectrum; RGB image using KiDS g - and r -band images, overlaid with SAMI coverage (green). Center: azimuthally-averaged, spline-ﬁt1D proﬁles (dotted lines) for the ﬁrst three kinematic moments (KiDS r -band surface brightness - gray/black; velocity - blue; velocitydispersion - red), along with 1- σ uncertainties (shaded). Also shown are best-ﬁt M agRite (thick solid lines) and P roFit 2D double S ´ersicr-band ﬁts (dotted lines), split between bulge (orange), disk (green), and total (purple). Note that velocities are as observed and notinclination-corrected, i.e. v ≈ v circ sin ( i ), where v circ is the circular velocity and i is the disk inclination. The best-ﬁt M agRite model v circ sin ( i ) (dot-dashed dark blue line) lies well above the observed rotation curve, illustrating the combined eﬀects of beam smearing,asymmetric drift and a thick disk. steeper than exponential, and the best-ﬁt P roFit disk n s ≈ . M agRite surface brightness proﬁle at large radii,whereas the large scale height lowers the surface bright-ness along the minor axis from the galaxy center - pre-cisely where there are two under-dense inter-arm gaps.A model with azimuthal variations and a more general S ´ersic or broken-exponential disk proﬁle might prefer athinner, non-truncated disk. Having said that, Mu˜noz-Mateos et al. (2013) ﬁt broken exponential proﬁles toSpitzer 3.6 µ m imaging of nearly face-on disks and founda typical break radius at 2 . ± . i mass of 10 . M (cid:12) , so it is possible that our largerdisk mass is compensating for the contribution of thegas disk to the rotation curve. This could also be thecause of the slight under-prediction of the rotation curveat large radii, if it is not due to asymmetric drift. Inpractice, a more ﬂexible and better-ﬁtting stellar massmodel would likely allow the halo parameters to varymore to compensate for such inconsistencies.The best-ﬁt disk metallicity is quite low (log 10( Z d / Z (cid:12) ) < − .

5) for such a massive disk,whereas the bulge metallicity reaches the ceilingof the Maraston & Str¨omb¨ack (2011) model grids(log 10( Z b / Z (cid:12) ) = 0 . τ = 2 .

06 Gyr, while the bulge has amoderate age of 5.92 Gyr. The observed galaxy colourscannot be reproduced by such a relatively simple model;in particular, the galaxy center is redder than the model,and the outskirts are signiﬁcantly bluer, both by about0.2 in g − r and with a fairly sharp transition ratherthan a smooth gradient. Additional model complexity(especially dust reddening and stellar population gradi-ents) is necessary to fully reproduce galaxy colours, andfull spectral modelling would be ideal. However, it isworth noting that systematic diﬀerences in the inferredstellar masses of GAMA galaxies (including the SAMIsample) can be as large as 0.2 dex depending largelyon the treatment of star formation histories and dust(Wright et al. 2017), even neglecting possible variationsin the initial mass function. There are also signiﬁcantdiﬀerences between stellar population models, stellarspectral libraries and isochrones which preclude makingaccurate estimates of stellar mass-to-light ratios evengiven a star formation history and it is unclear how onemight estimate the magnitude of such eﬀects for a givengalaxy.elf-consistent IFS galaxy modelling 9 P D F Disk L oo Disk Re oo − − − − − Disk zs oo Disk i oo Bulge L o o − − − Bulge Re oo − − − Bulge ns oo . . . D i sk R e oo oo oo o o oooo − . − − . − − . . D i sk zs oo oo oo o o oooo . . . D i sk i oo oo oo o o oooo . . . . . B u l ge L oo oo oo o o oooo − . − . − . B u l ge R e oo oo oo oo oooo − . − . − . B u l ge n s oo oo oo oo o o o oMagRite mockProFit mockMagRite best ProFit best ±1 ̀ ±1 ̀ Fig. 6.—

Triangle plot showing joint posterior parameter distributions ( L ≡ log 10( L r / L (cid:12) ), Re ≡ log 10( R e / kpc), zs ≡ log 10( z d /kpc ), n ≡ log 10( n s ), and i is the inclination in degrees) for P roFit (blue) and M agRite (red), where the P roFit disk is a S ´ersic proﬁle andshares its position angle with the bulge. The upper-left half shows scatter plots of accepted samples, while the bottom-right half shows1D and smoothed 2D probability contours. Accepted samples are colored by probability on an arbitrary scale, such that more probablepoints have darker and more saturated colors. Plots along the diagonals show PDFs of accepted samples for the variable on the x-axis; toavoid crowding, the y-axis ticks and labels are omitted for the three interior histograms. All posterior distributions are for ﬁts to mockdata; the best-ﬁt parameters used to generate the mock data are plotted as circles. Note that because the P roFit thick disk ﬁts have anunrealistically large disk scale height, the P roFit mock uses the best-ﬁt thin disk ﬁt parameters with z d = 0 . R e,d / . . R d for an exponential disk); this illustrates the limited constraints on disk thickness from photometry alone. TABLE 1Best-fit G79635 MagRite Parameters

Fitted ParametersName unit log mean σ obs. σ mock ∆ M d,in M sim Y 1.604 8.76e-5 1.33e-3 R d kpc Y 8.141e-1 2.62e-5 9.85e-4 z d kpc Y 2.224e-1 1.83e-4 8.08e-3 r t,d kpc Y 1.146 2.07e-4 1.67e-3 dr t,d kpc Y 7.065e-1 3.56e-4 5.26e-3 σ R v sim Y -1.281e-1 2.93e-4 2.44e-2 v b √ v sim N 1.020e-2 1.80e-4 1.92e-3 R b kpc N -2.899e-1 1.01e-4 7.35e-3 -3.62e-2 n s N/A Y -9.970e-2 9.79e-4 2.45e-2 1.82e-1 v h √ v sim Y 2.693e-1 6.41e-5 1.82e-3 r h kpc Y 8.051e-1 3.17e-4 1.47e-2 α N/A N 9.845e-1 1.57e-4 2.31e-2 τ − Gyr − N 4.851e-1 1.36e-3 6.62e-3 t b Gyr Y 8.450e-1 1.19e-3 8.55e-3 Z d Z / Z (cid:12) Y -5.170e-1 1.75e-3 7.19e-3 Z b Z / Z (cid:12) N 3.000e-1 8.17e-5 8.83e-3P.A. rad. N 8.059e-1 2.72e-4 2.08e-3 -3.26e-3sin ( i ) rad. N 8.160e-1 1.55e-4 1.65e-3 2.09e-2 X off kpc N 3.732e-2 4.32e-4 5.81e-3 5.11e-2 Y off kpc N 1.513e-2 9.68e-4 5.89e-3 5.91e-2 V z,off kms − Y -1.547e-1 3.47e-1 3.55e-1Derived ParametersName unit log mean σ obs. σ mock ∆ M d M (cid:12) N 10.72 1.43e-4 1.38e-3 M b M (cid:12) N 9.126 6.93e-4 1.07e-2(

M/L g ) d (M / L g ) (cid:12) N 2.726 1.47e-3 1.18e-2(

M/L r ) d (M / L r ) (cid:12) Y 2.213 8.60e-4 7.42e-3(

M/L g ) b (M / L g ) (cid:12) N 3.745 1.18e-2 8.79e-2(

M/L r ) b (M / L r ) (cid:12) N 2.581 7.44e-3 5.42e-2 L d L (cid:12) , r Y 10.38 1.54e-4 8.76e-4 4.46e-3 L b L (cid:12) , r Y 8.715 9.91e-4 7.58e-3 2.99e-2 R e,d kpc Y 0.9365 8.82e-5 1.10e-3 7.10e-3 M h M (cid:12) Y 11.81 3.03e-4 3.33e-2

Note . — Fitted parameters are listed as ﬁt internally by M agRite, whereas derived parameters are measured during or aftermodel generation. Values listed as log 10(value unit − ) ∆: M agRite value less P roFit value (where applicable), where the P roFit model has a thin S ´ersic disk sharing its position angle withthe bulge. M sim =2 . M (cid:12) v sim =100 kms − Directly measured model half-light radius, accounting for trun-cation

One potential general solution to limit parameter biasis to introduce stricter priors on model parameters basedon external data. Disk scale height distributions canbe constrained from independent observations of edge-ondisks, and dust extinction can be estimated from Balmerline decrements. Ultimately, the best solution is to im-prove the model itself, which would permit quantiﬁca-tion of such biases. Such improvements are planned for M agRite but are not necessary to implement the methoditself. For the moment, we advise caution when inter-preting uncertainties from models that do not adequatelyreproduce known or clearly visible features in the data.This particular galaxy is well-resolved compared to aver-age SAMI galaxies (although not uniquely so); these is-sues are less pronounced when ﬁtting lower-quality data.However, as the recent public release of Subaru HyperSuprime-Cam data (Aihara et al. 2017) shows, high-quality, deep ground-based imaging is rapidly becomingavailable for large galaxy surveys - even in the south-ern sky (Keller et al. 2007) - and so this issue cannot be ignored for much longer. CONCLUSIONSWe have outlined a method for kinematic bulge-diskdecomposition using self-consistent, DF-based dynamicalmodels. The method can be used to model any combina-tion of data, including deep optimal images and 1D/2Dkinematic constraints. Our G alactICS-based implemen-tation ( M agRite) is eﬃcient enough ( ∼ P roFit 2D decomposi-tions and with the independent H i W constraint. Thissuggests that M agRite can extrapolate reasonable rota-tion curves even without IFS data reaching the peak of agalaxy’s rotation curve - a crucial requirement for accu-rate stellar mass and angular momentum estimates. Inthe provided appendices, we demonstrate that M agRitecan ﬁt synthetic model data with minimal biases. How-ever, we caution that ﬁts to real data are not immuneto biases, particularly in the presence of signiﬁcant non-axisymmetric features. Furthermore, we showed thatpoorly-ﬁtting models can seriously underestimate param-eter uncertainties by yielding artiﬁcially narrow poste-rior PDFs. This can be mitigated but not corrected byestimating uncertainties from mock observations of thebest-ﬁt model.Our example galaxy was selected as a well-resolved,fairly passive spiral galaxy with an HI detection, butthere are many more SAMI galaxies with similar qualitydata. The KiDS images are good enough to constrain thebulge fraction in G79635 to at most a few percent andclearly show deviations from our idealized models, whichassume axisymmetric, exponential disks and simple starformation histories for each component. We thereforedemonstrate that data quality is not the main impedi-ment to improved, physical modelling of galaxies, but themodels themselves. G79635’s azimuthally-averaged diskproﬁle can be reproduced with a combination of an un-usually thick and smoothly-truncated exponential disk,but would be better ﬁt with a S ´ersic or non-parametricproﬁle disk including perturbations from spiral arms.One shortcoming of the method using kinematic mo-ment maps is that these must be derived independently;nonetheless, the method can be generalized to ﬁt spec-tral datacubes directly (e.g. Tabor et al. 2017) and weplan to implement this functionality within M agRite.Additional model features like spiral arms, dust at-tenuation/scattering (Pastrav et al. 2013) and moreﬂexible/non-parametric density proﬁles are longer-termambitions. M agRite is under active development andwill be released in the near future, alongside early re-sults from a larger SAMI sample. Parties interested intesting the code or contributing to future developmentare encouraged to contact the authors. ACKNOWLEDGEMENTSThis research was conducted by the AustralianResearch Council Centre of Excellence for All-skyAstrophysics (CAASTRO), through project numberelf-consistent IFS galaxy modelling 11CE110001020. This work was supported by the Flag-ship Allocation Scheme of the NCI National Facilityat the ANU. The SAMI Galaxy Survey is based onobservations made at the Anglo-Australian Telescope.The Sydney-AAO Multi-object Integral ﬁeld spectro-graph (SAMI) was developed jointly by the Universityof Sydney and the Australian Astronomical Observa-tory. The SAMI input catalogue is based on data takenfrom the Sloan Digital Sky Survey, the GAMA Sur-vey and the VST-ATLAS Survey. The SAMI GalaxySurvey is funded by CAASTRO and other participat-ing institutions. The SAMI Galaxy Survey website is http://sami-survey.org/ . DST acknowledges sup-port from a 2016 University of Western Australia Re-search Collaboration Award. BC acknowledges supportfrom the Australian Research Council’s Future Fellow-ship (FT120100660) funding scheme.APPENDIX A. INTEGRATION SCHEME COMPARISON

To test the accuracy and speed of the DF integra-tion scheme described in § § § θ to generate smoothmaps - otherwise, the model images would have bright“spokes” at the sampled angles θ and artiﬁcial gaps be-tween them. Similarly, binning evenly-spaced ellipsesonto a rectangular grid creates Moir´e-like artifacts, ap-parent as an X-shaped residual in Figure 7. These issuescould be resolved by distributing DF samples over theprojected areas of elliptical rings, rather than as pointsbinned onto a rectangular grid. In principle, this wouldalso need to be done in 3D, i.e. by generating ringscorresponding to the top of one bin and the bottom ofthe next. These would be fairly inexpensive calculationscompared to the other steps in model integration, butare somewhat complex and would be unlikely to changethe PSF-convolved model maps signiﬁcantly, so they areleft to future revisions of M agRite.One measurable impact of the integration scheme isthe change in model likelihoods through stochasticity. To test this, we generate a series of maps for the best-ﬁtmodel, varying only the random seed used to generatethe angular oﬀsets in θ . For the r -band KiDS image, the χ value varies by about 40 from diﬀerent seeds. Thisis an insigniﬁcant diﬀerence for a well-ﬁtting model buthighly signiﬁcant for one with a large χ red , as discussedin Appendix D. To minimize the impact of this issue, wekeep the random seed ﬁxed for all ﬁts. B. MODEL STABILITY TEST

To test the long-term stability of the model, we gen-erated N-body initial conditions with G alactICS, sam-pling the disk/bulge/halo with 5M/0.1M/5M particlesand using softening lengths of 50/50/150 pc, respectively.We ran the model for 1 Gyr with PARTREE (Dubinski1996), using a ﬁxed 0.196Myr timestep and opening an-gle of 0.8. Figure 8 shows the resulting maps from theevolved galaxy compared to the initial conditions. Thereis evidence of relaxation of the system, with the evolvedmodel having lower central density and velocity disper-sion and a shallower rotation curve.The evolution of this model is not representative oftypical G alactICS model. As discussed in §

4, the best-ﬁtmodel has an unusually large disk scale height and smalltruncation radius. The initial virial ratio q = − T /W ,where T and W are the total kinetic and potential en-ergy, respectively, is q = 1 . z d = 1 kpc )and much larger truncation radius ( r t,d = 8 r d ) with thesame disk mass. This model shows virtually no evolu-tion outside of the inner 200 pc, where there is a modestdepression in the central density and velocity dispersion.We conclude that while the disk truncation parametersand scale height can in principle mimic a non-exponential( n s <

1) disk, adjusting them beyond G alactICS lim-its can produce unstable models and should be avoided.This could be accomplished without running simulationssimply by placing stronger priors model parameters oron output diagnostics like the virial ratio, but furthertesting is needed to determine guidelines for these limits. C. MODEL FITTING TEST

We test the code’s ability to recover model parametersby ﬁtting synthetic maps generated by M agRite, a pro-cess that is . We assume shot noise-dominated errors forthe ﬂux maps, given a gain and mean sky brightness incounts per pixel. The higher-order SAMI moment mapsrequire some simplifying assumptions. Kinematic con-straints originate mainly from stellar absorption lines,which only cover a small fraction of typical spectra. Weparameterize this eﬀect with a simple “kinematic gain”ratio g k,eff , which is roughly the ratio of the sum of theequivalent widths of all absorption lines to the full wave-length range. We generate a noisy ﬂux-weighted v LOS

DF for each spaxel, where the counts in each bin aremultiplied by g k,eff , and ﬁt a Gaussian to extract themean velocity and dispersion. Finally, we re-use the ex-2 Taranu et al. KiDS r M ag R i t e N - bod y R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l -0.4 -0.2 0.0 0.2 0.4 R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o -0.04 0.00 0.02 0.04 SAMI gr

SAMI v [kms − ] -150 -50 0 50 100-3 -2 -1 0 1 2 3-0.04 0.00 0.02 0.04 SAMI σ [km s − ]

40 50 60 70-2 -1 0 1 2-0.04 0.00 0.02 0.04

Fig. 7.—

Comparison between maps from grid-based and Monte Carlo integration using 20M/0.4M disk/bulge particles, respectively.The residuals are shown on an absolute scale (MagRite - N-body) and as a ratio relative to the smoother MagRite map. The large relativevelocity residuals along the minor axis are due to the small absolute value of the velocity; absolute diﬀerences are small. elf-consistent IFS galaxy modelling 13

KiDS r N - bod y t = N - bod y t = G y r R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l R e s i dua l -0.4 -0.2 0.0 0.2 0.4 R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o R e s i dua l r a t i o -0.2 -0.1 0.0 0.1 0.2 SAMI gr

SAMI v [kms − ] -150 -50 0 50 100-5 0 5-0.2 -0.1 0.0 0.1 0.2 SAMI σ [km s − ]

40 50 60 70-6 -4 -2 0 2 4 6-0.2 -0.1 0.0 0.1 0.2

Fig. 8.—

Comparison between maps generated from N-body initial conditions and after 1 Gyr of evolution. Pixels outside of the limitsof the color bar are assigned the darkest (most saturated) colors. The residuals show some evolution in the galaxy structure due to thethick, truncated disk being slightly out of equilibrium. g k,eff until the noise in the velocity/dispersion mapsis roughly consistent with the original errors.Figure 9 shows synthetic noisy maps for G79635 us-ing g k,eff = 0 .

02. As expected, the ﬂux maps are com-pletely consistent with (nearly) Normal shot noise andhave χ red ≈

1. However, the noise in the v LOS and σ maps is not entirely identical to that from the originalSAMI maps, being slightly over- and under-estimated,respectively. This not surprising, given the non-linearnature of kinematic ﬁts, but the mock kinematics stillappear consistent with random noise without any obvi-ous systematic bias and are usable as data for a mockﬁt. The possibly conservative choice of g k,eff = 0 .

02 ismotivated by the need to keep this noise consistent withthe SAMI error maps; larger values lead to excessivelysmooth kinematic maps. When ﬁtting, we continue touse directly-measured velocity moments in the model,rather than the Gaussian ﬁts used for both the mock andreal data. This is because ﬁtting v LOS requires some es-timate of the uncertainty of the v LOS

DF in each spaxel,and it is not clear what this uncertainty should be in anoiseless model. Since G79635 has a very modest bulge,there is little diﬀerence between the two measurements;however, this may not be the case in galaxies with moremassive and extended bulges, where this issue would beworth revisiting.Figure 10 shows posterior distributions for a M agRiteﬁt to the mock data shown in Figure 9. Reassuringly,the best-ﬁt parameters are close to the inputs, with smalldeviations well within the 1 σ uncertainties. Small oﬀsetsare expected given the noise and the diﬀerence betweendirectly measured and ﬁtted kinematics, but there areno signiﬁcant systematic biases. This is not to say thatﬁts to real data will be devoid of biases - this simplyveriﬁes that M agRite recovers unbiased parameters whenthe data are perfectly described by the model.One possible concern is that there is still some noisein the model probabilities themselves - when colored bylog probability, the points do not show smooth gradi-ents in posterior probability from the maximum likeli-hood solution. This is especially true for the ﬁts to theobserved data, which is unfortunately not clearly visiblein Figure 10 because the posteriors are so narrow. This islikely due to the issues with model integration outlinedin Appendix A. The main practical eﬀect of a slightlystochastic model likelihood is that convergence to thebest-ﬁt solution can be slow, as the solution wanders be-tween entirely artiﬁcial local maxima. This problem isexacerbated when the best-ﬁt model is a poor ﬁt, sinceeven extremely small changes in input parameters cancause spurious changes in the model likelihood. Fortu-nately, since the actual changes in parameter values tendto be small, the impact on the posteriors from mock ﬁtsis minimal.Since poorly-ﬁtting models tend to vastly underpre-dict parameter uncertainties (for reasons detailed in Ap-pendix D below), we suggest that uncertainties frommock ﬁts should be considered lower bounds on the“true” parameter uncertainties. Again, this does notguarantee that there are no biases in the best-ﬁt param-eters themselves, so caution must be taken in deﬁningpriors and interpreting best-ﬁt values in such cases. Sim- ilarly, since the mock data are idealized, they should onlybe interpreted as lower limits until further testing is doneto quantify the impact of deviations from the assumedmodel parameterizations, which are necessarily presentin all galaxies. D. PARAMETER UNCERTAINTIES FROMPOORLY-FITTING MODELS

It is not necessarily intuitive that poorly-ﬁtting modelscan or should yield systematically smaller uncertaintiesthan good models. Nonetheless, it is a natural conse-quence of the shape of some common statistical distribu-tions. The diﬀerence in log-likelihood (∆ LL ) between an n - and ( n + 1)- σ deviation from a univariate Normal dis-tribution is − ( n + 0 . σ deviations is then much more signiﬁcant than thatbetween 2- and 3- σ deviations, simply because the log ofthe PDF of a Normal distribution declines as x − / σ .For a more practical example, the KiDS r -band im-age for G79635 has 25961 usable data points (unmaskedpixels). Using the chi-square distribution with 25961degrees of freedom as the ﬁt statistic , the maximumlikelihood solution for an ideal model has χ =25959.A three- σ deviation from a univariate Normal distri-bution (i.e. for a Gaussian parameter posterior) hasa log-likelihood 4.5 lower than the peak. The equiva-lent range of likelihoods for the given χ distribution is χ = [25281 . , . χ = [ − . , − .

6] or χ red = 0 . , . χ = 1 . e r -band image,or χ red = 6 .

46. An increase in χ of just 10.6 producesa ∆ LL = − .

5, so the range of acceptable χ shrinksby a factor of approximately χ red . Accordingly, the pos-terior parameter PDFs shrink by a comparable margin,depending on the linearity of the model. For a large num-ber of degrees of freedom, this behaviour approaches thatof a Normal distribution.Finally, we note that overﬁtting is strongly disfavouredby the χ statistic - a desirable feature for data with reli-able errors, as in images dominated by shot noise from alarge number of counts. For lower signal-to-noise and/orwhere other terms like read noise are important, Poissonstatistics or the so-called Cash (1979) statistic should beused. E. THICK DISK SURFACE BRIGHTNESSPROFILE FITS

As discussed in §

4, the best-ﬁt M agRite model has anunusually large disk scale height of 1.67 kpc. We have runa number of tests by modifying P roFit to ﬁt a thick diskwith a sech vertical proﬁle using a similar integrationscheme as described in § S ´ersicdisks above and below the disk midplane by shifting theproﬁle center along the minor axis, weighting each diskby the total mass within each vertical bin. This is notthe most eﬃcient integration method for a 3D densityproﬁle and has limited accuracy for highly inclined thickdisks, but it is analogous to the M agRite method andideal for model comparisons. We neglect model parameters in the eﬀective degrees of free-dom, as the number of model parameters in non-linear models ispoorly deﬁned (Andrae et al. 2010). elf-consistent IFS galaxy modelling 15

KIDS g

PSF q=0.90FWHM=1.16'' D a t a KIDS r

PSF q=0.95FWHM=0.54''

SAMI gr

PSF q=0.96FWHM=1.84''

SAMI v [kms − ] SAMI σ [km s − ] M ode l M / L * [ d , b ] = [ . , . ] log([Rd,Rb]/kpc)=[6.52,0.51]log([Md,Mb]/Msol)=[10.72,9.13] M / L * [ d , b ] = [ . , . ] χ =2.64e4, χ red2 =1.01 R e s i dua l s χ =2.64e4, χ red2 =1.02 χ =5.85e2, χ red2 =0.89 χ =7.40e2, χ red2 =1.13 χ =4.85e2, χ red2 =0.74 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 -4 -2 0 2 4 Fig. 9.—

Mock G79635 maps with realistic shot noise. The kinematic maps are derived from Gaussian ﬁts to noisy v los DFs and thereforedo not necessarily follow Poisson or (approximately) Normal statistics.

The results of ﬁtting thick disk proﬁles to G79635’s r -band image are shown in Figure 11. Firstly, ﬁtting athick exponential disk (second row/column) signiﬁcantlyimproves the residuals over a thin exponential. The im-provement is seen largely in the two under-dense regionsbetween the galaxy center and spiral arms (roughly NNEand SSW from the galaxy center). However, the thin S ´ersic disk achieves similar improvements without sig-niﬁcantly worsening the residuals anywhere else in thedisk. Thus, it is clear that a thick disk can compensatefor deviations from a pure exponential disk, but not nec-essarily as well as by simply modifying the radial proﬁle.Unfortunately, the best-ﬁt P roFit disk thicknesses areunrealistically large - 4.83 kpc for the exponential diskand 2.35 for the S ´ersic disk.Figure 12 shows posterior distribution from P roFit r -band ﬁts to mock data using the same input parameters.The “Thick” model shows the same chains as in Figure 6,where the mock data was generated with a more plau-sible scale height z d = 0 . R e,d / . . R d for an exponential disk), and with the other best-ﬁt parameters taken from the best thin disk ﬁt. A secondﬁt was run on mock data with clipped residuals addedback in. Speciﬁcally, after generating the PSF-convolvedmodel image, we add χσ tanh(abs( χ/ . , where σ is the per-pixel uncertainty. The tanh scaling smoothlytruncates residuals below 2 σ , so the disk more closelyfollows a S ´ersic proﬁle. Because overdensities like spiralarms and star-forming regions tend to be more signiﬁ-cant than underdensities, the residuals have a small netpositive ﬂux of slightly under one percent of the disk lu-minosity, which is reﬂected in a small positive bias in disk luminosity in Figure 12 compared to the input parame-ters. The size and S ´ersic index are somewhat biased, butthe scale height and axis ratio are signiﬁcantly over- andunder-estimated, respectively, indicating that structuredresiduals with a small net ﬂux can severely bias poorly-constrained and/or degenerate parameters even when themodel is a reasonable approximation to the data.The fact that scale height and inclination are highlydegenerate is not surprising - if the disk’s vertical den-sity proﬁle is the same as its radial proﬁle, then thescale height and inclination will be completely degen-erate. Using a sech vertical density proﬁle rather thanexponential limits but does not prevent this degeneracy. M agRite achieves a tighter constraint on the inclinationby ﬁtting the velocity map. Of course, the mass modelalso modiﬁes the rotation curve, but the stellar mass isindependently constrained by the ﬂux maps. Unfortu-nately, the kinematic constraints on the disk scale heightitself are weak. In principle, the disk dispersion is re-lated to the disk’s vertical structure, but this is (mostly)independently parameterized in G alactICS by σ R , andSAMI’s spectral resolution is not ﬁne enough to measuretypical disk dispersions anyway. Thus, there is insuﬃ-cient data to guarantee an accurate best-ﬁt scale height,and a strong prior based on observations of edge-on disksshould be used in practice. F. SUMMARY OF THE GALACTICS METHOD

We provide a brief description of the methods used by G alactICS to generate DFs for composite galaxy modelscontaining any number of disk-like and spherical compo-6 Taranu et al. − P D F Disk Min

Disk Rs

Disk rt

Bulge vs − − − − − − Bulge Re − − − Bulge ns . . . D i sk R s . . . . D i sk r t . . . . B u l ge vs − . − . − . − . − . − . B u l ge R e − . − . − . B u l ge n s MagRite mockMagRite ±1 ̀ ±1 ̀ Fig. 10.—

Triangle plot showing joint posterior distributions for model input parameters (Disk Min ≡ log 10( M d,in /M sim ), Disk Rs ≡ log 10( R d / kpc), Disk rt ≡ log 10( r t,d / kpc), Bulge vs ≡ log 10( v b / v sim ), Bulge Re ≡ log 10( R b / kpc), ns ≡ log 10( n s )). Points are color-codedby log probability as described in Figure 6. PDFs are shown for ﬁts to the mock data and the observed data. Since the latter areexceptionally narrow, univariate PDFs along the diagonal are shown on a logarithmic scale spanning six orders of magnitude. elf-consistent IFS galaxy modelling 17 χ red2 =8.31 χ LL=−6.7405e+04[i,z]=[52.8,0.00] e x p .t h i n exp.thick ser.thin ser.thick magrite ∆ LL norm =1.557e+04 ∆ LL:[−73.3,73.3] e x p .t h i ck χ red2 =7.11 χ LL=−5.3855e+04[i,z]=[63.3,4.83] ∆ LL norm =2.539e+04 ∆ LL:[−71.5,71.5] s e r .t h i n ∆ LL norm =9.818e+03 ∆ LL:[−34.4,34.4] χ red2 =6.35 χ LL=−4.5498e+04[i,z]=[52.6,0.00] ∆ LL norm =2.559e+04 ∆ LL:[−75.0,75.0] s e r .t h i ck ∆ LL norm =1.002e+04 ∆ LL:[−31.1,31.1] ∆ LL norm =2.037e+02 ∆ LL:[−5.9,5.9] χ red2 =6.34 χ LL=−4.5326e+04[i,z]=[53.9,2.35] ∆ LL norm =2.339e+04 ∆ LL:[−76.5,76.5] m ag r i t e ∆ LL norm =7.814e+03 ∆ LL:[−43.0,43.0] ∆ LL norm =−2.003e+03 ∆ LL:[−23.9,23.9] ∆ LL norm =−2.207e+03 ∆ LL:[−24.9,24.9] χ red2 =6.51 χ LL=−4.7189e+04[i,z]=[54.7,1.67]

Fig. 11.—

Comparison of r -band residuals for ﬁve G79635 model ﬁts. The models include four ProFit models - thin exponential disk(“exp.thin”), thick exponential disk (“exp.thick”), thin S ´ersic disk (“ser.thin”) and thick S ´ersic disk (“ser.thick”) - as well as the M agRite ﬁt.Diagonals panels show model residuals χ = ( data − model ) /σ on a common scale (from approximately − < χ < χ distribution; χ red ; and the disk inclination i and scale height in kpc z . Panels above the diagonal show diﬀerences between models on a common linearscale centered on zero; dark colours are where the model in a given row is brighter than the model in the given column. Panels below thediagonal show diﬀerences in model residuals ( χ ) on arbitrary linear scales, again centered on zero and where dark colours represent a betterﬁt in the the model in the given row than in the model in the given column. The range of changes in per-pixel log-likelihood (assumingNormally distributed errors) is show in the top left, and the total change in Normal log-likelihood is listed at bottom-right. As expected,residuals improve with added model complexity, but are slightly worse in M agRite (which also ﬁts the kinematics and g -band image) thanin the P roFit S ´ersic thick disk ﬁt; however, both P roFit thick disk models yield excessively large scale heights. P D F Disk L

Disk Re − − − − − − − Disk ns − − − − Disk q . . . . . . D i sk L − − − Disk zs . . . . . . . D i sk R e . . . . . . . D i sk R e − . − . − . − . − . − . − . D i sk n s − . − . − . − . − . − . − . D i sk n s − . − . − . − . D i sk q − . − . − . − . D i sk q − . − − . . D i sk zs Disk L

Disk Re − − − − − − − Disk ns − − − − Disk q P D F − − − Disk zs

Thick MockThick+Resid.Input ±1 ̀ ±1 ̀ Fig. 12.—

Triangle plot showing joint posterior parameter distributions ( L ≡ log 10( L r / L (cid:12) ), Re ≡ log 10( R e / kpc), n ≡ log 10( n s ), zs ≡ log 10( z d /kpc ), and where q is the disk axis ratio) for P roFit r -band S ´ersic disk ﬁts to mock data, with and without including clippedresiduals from the best-ﬁt model. Panels are structured as in Figure 6. The ﬁt to the mock data with clipped residuals (“Thick+Resid.”)has smaller uncertainties and is also signiﬁcantly biased, particularly for the scale height and inclination. The magnitude bias is due to thesmall net positive ﬂux of the residuals. For clarity, points in the upper-left quadrant are thinned by a factor of 5 and 10 for the “ThickMock” and “Thick+Resid.” samples, respectively. elf-consistent IFS galaxy modelling 19nents composed of stars, dark matter and gas (Kuijken &Dubinski 1995; Widrow & Dubinski 2005; Widrow et al.2008). We use the Binney & Tremaine (2008) conven-tion for deﬁning the cylindrical radius as uppercase R and spherical radius as lowercase r below, as well as thephysics/ISO 80000 convention of θ for the polar angleand ψ for the azimuthal angle (in contrast with § E ≡ E and relativepotential Ψ ≡ − Φ where E = Ψ − v / v is thevelocity (such that v / ρ ( r ). For example, we use the3D deprojected Sersic proﬁle (Prugniel & Simien 1997)to describe the bulge, given by: ρ b ( r ) = ρ (cid:18) rR e (cid:19) − p exp[ − b n ( r/R e ) /n s ] , (F1)where the parameter are a characteristic density ρ , theprojected half-mass radius R e , and the the Sersic index n s . The two parameters b n and p are structural quan-tities depending on n s . They are well-approximated bythe formulae: p = 1 − . /n s + 0 . /n s (F2) b = 2 n s − / . /n s (F3)for 0 . < n s <

10 and 10 − < R/R e < (Terzi´c &Graham 2005).There are numerous options for halo proﬁles dependingon one’s theoretical bias, including proﬁles with constantdensity cores or power-law cusps (Merritt et al. 2006). Inthis paper, we use a double power-law model to describethe halo: ρ h ( r ) = ρ s (cid:18) rr s (cid:19) − α (cid:18) rr s (cid:19) α − β (F4)where ( α, β ) = (1 ,

3) is the NFW proﬁle and ( α, β ) =(1 ,

4) is the Hernquist (1990) proﬁle; such double power-law models are sometimes referred to as generalizedNFW or Hernquist proﬁles.Disks are ﬂat axisymmetric models and are approxi-mated by the density law: ρ d ( R, z ) = M d πR d z d exp( − R/R d )sech ( z/z d ) (F5)(van der Kruit & Searle 1981). We emphasize that this isnot the exact disk density but rather a close approxima-tion to the ﬁnal density law derived in the computationof the disk distribution function deﬁned below.Finally, we force the density of each component tosmoothly approach zero by multiplying each proﬁle bya truncation function. We truncate density laws using alogistic function deﬁned by: T ( t ) = (1 + e t ) − , (F6)with t = r − r t δr t , (F7)where r t is the truncation radius and δr t is the radialwidth of the truncation interval. Equation F6 is a simplerepresentation of a smooth step function chosen for itscomputational eﬃciency and continuous derivatives. The method computes an axisymmetric DF for the sys-tem of the form: f ( E , L z , E z ) = f d ( E , L z , E z ) + f b ( E ) + f h ( E ) (F8)where E is the relative binding energy E ≡ − E , L z isthe z -component of angular momentum and E z is the z energy deﬁned below. The bulge and halo are functionsof energy alone and so are modelled as spherical isotropicsystems. The disk DF is deﬁned as a function of threeintegrals of motion. The ﬁrst two are the usual energyand z -component of angular momentum for axisymmet-ric systems but we introduce a third approximate integral E z = Ψ z − v z / z is the vertical potential deﬁnedas Ψ z ≡ Ψ( R, z ) − Ψ( R, z = 0), where Ψ is the relativegravitational potential of the system Ψ = − Φ.With these various deﬁnitions in hand, we can de-scribe a numerical procedure for computing the compo-nent DFs. First consider a purely spherical system com-posed of multiple components – we will consider modiﬁ-cations when including a thin disk component later. Theconstruction of an isotropic DF f ( E ) from a potential-density pair can be accomplished using Eddington’s for-mula (e.g., Binney & Tremaine 2008) f ( E ) = 1 √ π (cid:34)(cid:90) E d Ψ √E − Ψ d ρd Ψ + 1 √E (cid:18) dρd Ψ (cid:19) Ψ=0 (cid:35) . (F9)For a system of total density ρ ( r ), we can compute thetotal potential using the integral expression:Ψ( r ) = 4 πG (cid:20) r (cid:90) r dr (cid:48) r (cid:48) ρ ( r (cid:48) ) + (cid:90) ∞ r dr (cid:48) r (cid:48) ρ ( r (cid:48) ) (cid:21) (F10)To use the Eddington formula, one needs to determinethe function ρ (Ψ) and its derivatives up to second order.In general, it is diﬃcult to ﬁnd an analytic solution sowe use the following numerical method. We ﬁrst deﬁne agrid with n radial positions equally spaced in logarithmicspace deﬁned by: u i = log r i /r (F11)where r is a reference radius and r i is the grid pointradius for i = 1 ..n . With this transformation, we cancompute the potential at the grid positions u i as:Ψ( u i ) = 4 πGr × (cid:20) e − u i (cid:90) u i −∞ du (cid:48) e u (cid:48) ρ ( u (cid:48) ) + (cid:90) ∞ u i du (cid:48) e u (cid:48) ρ ( u (cid:48) ) (cid:21) . (F12)In practice, the inﬁnite limits for the inner and outer in-tegrals can be replaced with the initial and ﬁnal points ofthe logarithmic grid u and u n without loss of accuracy.The inner integral is just the mass versus radius and thisbecomes insigniﬁcant if the inner most radius is suﬃ-ciently small. For the outer integral, since the densitydrops to zero at a ﬁnite radius deﬁned by the truncationfunction there is no need to integrate beyond this point.Accurate and stable numerical solutions are achievablewith 200 grid spacings per dex, making the new methodmuch faster.The integral in Eddington’s formula can be solved nu-merically by creating a tabulated function of density ρ for each of the model functions versus the total Ψ on0 Taranu et al.the logarithmic radial grid. We use the interpolationmodules in the GNU science library (GSL) to create asplined function for ρ (Ψ) and its ﬁrst and second deriva-tives. We can then solve the integral in the Eddington’sformula numerically to determine a DF for each sphericalcomponent independently. The DF is also determined astabulated function by ﬁnding f for each value of E = Ψon the radial grid.To incorporate the disk component in this scheme, weuse the ad hoc method introduced by Widrow & Dubinski(2005). The disk density law of equation is sphericallyaveraged to create an additional spherical density com-ponent that is part of the total potential. The individualDF’s computed from the scheme above then include agood approximation of the disk potential in their deriva-tion. We see below that when we include the ﬂatteneddisk, the spherical density proﬁles of the bulge and halothat are consistent with their DF’s are modiﬁed slightlyfrom the ideal spherical case but remain close to the orig-inal deﬁnition.We now consider the construction of the disk DF. Kui-jken & Dubinski (1995, ,hereafter KD95) introduced thisDF to describe the disk: f d ( E p , L z , E z ) = Ω( R c )(2 π ) / κ ( R c ) (cid:101) ρ d ( R c ) (cid:101) σ R ( R c ) (cid:101) σ z ( R c ) × exp (cid:20) − E p − E c ( R c ) (cid:101) σ R ( R c ) − E z (cid:101) σ z ( R c ) (cid:21) (F13)where E p = E − E z is the energy in planar motions, L z isthe z -component of angular momentum, R c and E c arethe radius and energy of the circular orbit with angularmomentum L z , and Ω and κ are the circular orbital andradial frequencies derived from the total potential. Asshown in KD95, one can obtain the density by integratingover velocities and the resulting density in the plane is (cid:101) ρ d ( R ) with fractional errors O ( (cid:101) σ R /v c ).The disk density can be obtained by integrating thedisk DF over the velocities and the result generates amidplane disk density equal to (cid:101) ρ d ( R ) plus fractionalterms of O ( (cid:101) σ R /v c ). By construction, this disk DF worksbest for cool, thin disks where the epicyclic approxi-mation is valid for disk star orbits though warmer andthicker disks are still good equilibria in practice (see Ap-pendix B).The goal is to ﬁnd a set of “tilde” functions in thedisk DF (cid:101) ρ , (cid:101) σ R and (cid:102) σ z that closely approximate the diskdensity in equation F5. To this end, we use the densitylaw: ρ d ( R, z ) = M d πR d z d e − R/R d × exp (cid:20) C Ψ z ( R, z )Ψ z ( R, C z d ) (cid:21) T (cid:18) r − r t δr t (cid:19) . (F14)The constants C and C = ln sech ( C ) are chosen sothat the run of vertical density of the disk at a givenradius R approximates sech ( z/z d ). In practice, a goodchoice is C = 3 corresponding to equivalence of the ver-tical density at 3 scale-heights for the target density ofequation F5 and this disk density.At this point, we have built a DF for each componentwith the density deﬁned in terms of the total potential Ψ( R, z ). To achieve self-consistency, one needs to solvefor the total potential: ∇ Ψ( R, z ) = − πG [ ρ d ( R, Ψ , Ψ z ) + ρ b (Ψ) + ρ h (Ψ)](F15)As in KD95, we use the iterative method of Prender-gast & Tomer (1970) to ﬁnd a numerical solution. Themethod proceeds by making an initial guess of the po-tential Ψ, computing the densities on the right side ofequation F15 and then re-solving for Ψ. This process isiterated until Ψ relaxes to a solution. In practice, weuse a multipole expansion of Ψ in spherical coordinates( r, cos θ ) with the radius deﬁned on the logarithmic grid.For an axisymmetric system, we can write the potentialon the logarithmic radial grid as the multipole expansion:Ψ( r, θ ) = 4 πGr ∞ (cid:88) l =0 P l (cos θ )2 l + 1 × (cid:18) e − ( l +1) u (cid:90) u −∞ du e ( l +3) u a l ( u ) + e lu (cid:90) ∞ u du e (2 − l ) u a l ( u ) (cid:19) (F16)where u = log( r/r ) and the functions a l ( u ) are given by a l ( u ) = (cid:90) sin θ dθP l (cos θ ) ρ ( u, θ ) (F17)(e.g., Binney & Tremaine 2008). We can replace the inﬁ-nite limits by the end points of the grid as before withoutlosing accuracy. We use convergence of the “tidal” radiusof the total density within some error tolerance to stopthe iteration. The tidal radius is just the ﬁnite radiuswhere the total density drops to zero for the model. Theseries is truncated for some l suﬃciently large to approx-imate the ﬂattened potential.We describe some modiﬁcations to this procedure thatspeed up convergence and overall accuracy. The initialguess to the total potential is the composite sphericalpotential derived in the ﬁrst stage. We ﬁrst perform aniterative sequence to monopole order until convergence.In a second phase, we restart the iteration with this solu-tion gradually adding the terms for higher order expan-sions. Furthermore, the new potential derived at each it-eration is determined as a weighted average of the newlydetermined potential and the previous one. In general,convergence to a high order in l can be reached in a fewtens of iterations.Following KD95, we also use an analytic “high har-monics” disk potential to improve the accuracy of themultipole expansion at lower order in l . This analyticpotential is:Ψ † d = GM d z d R d ln cosh( z/z d ) e − r/R d T (cid:18) r − r t δr t (cid:19) (F18)where one notes that the radial parameter is the sphericalradius. One can use the identity: ∇ f ( r ) ln cosh( z ) = f (cid:48)(cid:48) ( r ) ln cosh( z )+2 f (cid:48) ( r ) r [ z tanh( z ) + ln cosh( z )] + f ( r )sech ( z ) (F19)to derive an analytic disk density ρ † d from Ψ † d . Thelast term reproduces a sech ( z ) disk to O ( z/R ) whilethe other terms are generally small for thin disks andelf-consistent IFS galaxy modelling 21tend to zero at the origin. In the iterative method de-scribed above, we replace the disk density with the resid-ual δρ d = ρ d − ρ † d when determining the total potentialfrom the multipole expansion. The total ﬁnal potentialis computed as the sum of the multipole expansion plusΨ † d . In practice, this method allows an accurate repre-sentation of the thin disk model for expansion orders of l <

10 greatly speeding up the overall procedure.The ﬁnal step involves the ﬁnding “tilde” functions inthe disk DF that match the disk density in equation F14.The ﬁrst function (cid:101) σ R ( R ) can be set arbitrarily but fol-lowing KD95 we use the observationally inspired proﬁle (cid:101) σ R = σ R, exp( − R/R d ) where the central radial velocitydispersion σ R, is a free parameter for the disk. Nor-mally, this parameter is chosen so that the disk is stablei.e. Toomre Q > (cid:101) ρ d ( R )and the vertical velocity dispersion (cid:101) σ z – are iterativelyadjusted for each radius on the grid such that the mid-plane density and the density at z = z d are the same as that in equation F14. Finally, the tilde functions can berepresented numerically with splines and thus the diskDF is fully speciﬁed.In summary, the ﬁnal products of this moderately com-plex procedure are full speciﬁed potentials and densitiesfor each component deﬁned by multipole expansions withmodiﬁcations for the thin disk to improve accuracy forlower order l . Each component also has a well-deﬁned DFdetermined by Eddington’s method for spherical compo-nents and the constructed disk DF from equation F13.The multipole coeﬃcients of the potential expansion aretabulated on the logarithmic grid and represented assplined functions in the code, allowing rapid evaluationof the potential and density at any point. The sphericalDFs as f ( E ) and the tilde functions are computed at thepredeﬁned grid coordinates. The code takes tens of sec-onds on current hardware to tabulate these functions -depending on the maximum multipole order - making itpractical to use for ﬁtting galaxy observations. REFERENCESAihara, H., Armstrong, R., Bickerton, S., et al. 2017, ArXive-prints, arXiv:1702.08449Andrae, R., Schulze-Hartung, T., & Melchior, P. 2010, ArXive-prints, arXiv:1012.3754Bernstein, G. M., & Jarvis, M. 2002, AJ, 123, 583Binney, J., & McMillan, P. 2011, MNRAS, 413, 1889Binney, J., & Tremaine, S. 2008, Galactic Dynamics: SecondEdition (Princeton University Press)Brooks, A. M., Papastergis, E., Christensen, C. R., et al. 2017,ArXiv e-prints, arXiv:1701.07835Bryant, J. J., Owers, M. S., Robotham, A. S. G., et al. 2015,MNRAS, 447, 2857Bundy, K., Bershady, M. A., Law, D. R., et al. 2015, ApJ, 798, 7Cappellari, M., & Emsellem, E. 2004, PASP, 116, 138Cappellari, M., Bacon, R., Bureau, M., et al. 2006, MNRAS, 366,1126Cappellari, M., Emsellem, E., Krajnovi´c, D., et al. 2011,MNRAS, 413, 813Cash, W. 1979, ApJ, 228, 939Catinella, B., Haynes, M. P., & Giovanelli, R. 2007, AJ, 134, 334Cecil, G., Fogarty, L. M. R., Richards, S., et al. 2016, MNRAS,456, 1299Ciardullo, R., Durrell, P. R., Laychak, M. B., et al. 2004, ApJ,614, 167Courteau, S., Cappellari, M., de Jong, R. S., et al. 2014, Reviewsof Modern Physics, 86, 47Croom, S. M., Lawrence, J. S., Bland-Hawthorn, J., et al. 2012,MNRAS, 421, 872de Jong, J. T. A., Verdoes Kleijn, G. A., Kuijken, K. H., &Valentijn, E. A. 2013, Experimental Astronomy, 35, 25de Jong, J. T. A., Verdoes Kleijn, G. A., Boxhoorn, D. R., et al.2015, A&A, 582, A62de Vaucouleurs, G. 1959, Handbuch der Physik, 53, 275de Zeeuw, P. T., Bureau, M., Emsellem, E., et al. 2002, MNRAS,329, 513Driver, S. P., Hill, D. T., Kelvin, L. S., et al. 2011, MNRAS, 413,971Dubinski, J. 1996, New Astronomy, 1, 133Fogarty, L. M. R., Scott, N., Owers, M. S., et al. 2015, MNRAS,454, 2050Freeman, K. C. 1970, ApJ, 160, 811Gadotti, D. A. 2009, MNRAS, 393, 1531Galassi, M. 2009, GNU scientiﬁc library: reference manual, 3rdedn., A GNU manual (Bristol, UK: Network Theory), xvi + 573Graham, A. W., & Driver, S. P. 2005, PASA, 22, 118Hansen, N. 2006, in Towards a new evolutionary computation.Advances on estimation of distribution algorithms, ed.J. Lozano, P. Larranaga, I. Inza, & E. Bengoetxea (Springer),75–102Haynes, M. P., Giovanelli, R., Martin, A. M., et al. 2011, AJ, 142,170Hernquist, L. 1990, ApJ, 356, 359Johnston, E. J., H¨außler, B., Arag´on-Salamanca, A., et al. 2017,MNRAS, 465, 2317Keller, S. C., Schmidt, B. P., Bessell, M. S., et al. 2007, PASA,24, 1 Kelvin, L. S., Driver, S. P., Robotham, A. S. G., et al. 2014,MNRAS, 439, 1245Kuijken, K., & Dubinski, J. 1995, MNRAS, 277, 1341Law, D. R., Yan, R., Bershady, M. A., et al. 2015, AJ, 150, 19Maraston, C., & Str¨omb¨ack, G. 2011, MNRAS, 418, 2785Martinsson, T. P. K., Verheijen, M. A. W., Westfall, K. B., et al.2013, A&A, 557, A130Merritt, D., Graham, A. W., Moore, B., Diemand, J., & Terzi´c,B. 2006, AJ, 132, 2685Moﬀat, A. F. J. 1969, A&A, 3, 455Mu˜noz-Mateos, J. C., Sheth, K., Gil de Paz, A., et al. 2013, ApJ,771, 59Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490,493Obreschkow, D., & Glazebrook, K. 2014, ApJ, 784, 26Pastrav, B. A., Popescu, C. C., Tuﬀs, R. J., & Sansom, A. E.2013, A&A, 557, A137Peng, C. Y., Ho, L. C., Impey, C. D., & Rix, H.-W. 2002, AJ,124, 266Portail, M., Gerhard, O., Wegg, C., & Ness, M. 2017, MNRAS,465, 1621Prendergast, K. H., & Tomer, E. 1970, AJ, 75, 674Prugniel, P., & Simien, F. 1997, A&A, 321, 111Puglielli, D., Widrow, L. M., & Courteau, S. 2010, ApJ, 715, 1152R Core Team. 2016, R: A Language and Environment forStatistical Computing, R Foundation for StatisticalComputing, Vienna, AustriaRefregier, A., Kacprzak, T., Amara, A., Bridle, S., & Rowe, B.2012, MNRAS, 425, 1951Robotham, A. S. G., Taranu, D. S., Tobar, R., Moﬀett, A., &Driver, S. P. 2017, MNRAS, 466, 1513Romanowsky, A. J., & Fall, S. M. 2012, ApJS, 203, 17S´anchez, S. F., Kennicutt, R. C., Gil de Paz, A., et al. 2012,A&A, 538, A8S´anchez, S. F., Garc´ıa-Benito, R., Zibetti, S., et al. 2016, A&A,594, A36S´anchez-Janssen, R., Ferrarese, L., MacArthur, L. A., et al. 2016,ApJ, 820, 69Schwarzschild, M. 1979, ApJ, 232, 236S´ersic, J. L. 1968, Atlas de galaxias australesSharp, R., Allen, J. T., Fogarty, L. M. R., et al. 2015, MNRAS,446, 1551Simard, L., Mendel, J. T., Patton, D. R., Ellison, S. L., &McConnachie, A. W. 2011, ApJS, 196, 11Syer, D., & Tremaine, S. 1996, MNRAS, 282, 223Tabor, M., Merriﬁeld, M., Arag´on-Salamanca, A., et al. 2017,MNRAS, 466, 2024Taranu, D. S., Dubinski, J., & Yee, H. K. C. 2013, ApJ, 778, 61Taylor, E. N., Hopkins, A. M., Baldry, I. K., et al. 2011, MNRAS,418, 1587Terzi´c, B., & Graham, A. W. 2005, MNRAS, 362, 197Toomre, A. 1964, ApJ, 139, 1217Trick, W. H., Bovy, J., & Rix, H.-W. 2016, ApJ, 830, 97van de Sande, J., Bland-Hawthorn, J., Fogarty, L. M. R., et al.2017, ApJ, 835, 104van der Kruit, P. C., & Searle, L. 1981, A&A, 95, 105Vasiliev, E., & Athanassoula, E. 2015, MNRAS, 450, 28422 Taranu et al.