Substructure at High Speed II: The Local Escape Velocity and Milky Way Mass with Gaia DR2
DDraft version February 5, 2021
Typeset using L A TEX twocolumn style in AASTeX63
Substructure at High Speed II: The Local Escape Velocity and Milky Way Mass with
Gaia
DR2
Lina Necib
1, 2, 3 and Tongyan Lin Walter Burke Institute for Theoretical Physics, California Institute of Technology, Pasadena, CA 91125, USA Center for Cosmology, Department of Physics and Astronomy, University of California, Irvine, CA 92697, USA Observatories of the Carnegie Institution for Science, 813 Santa Barbara St., Pasadena, CA 91101, USA Department of Physics, University of California San Diego, La Jolla, CA 92093, USA
ABSTRACTMeasuring the escape velocity of the Milky Way is critical in obtaining the mass of the MilkyWay, understanding the dark matter velocity distribution, and building the dark matter densityprofile. In Necib & Lin (2021), we introduced a strategy to robustly measure the escape velocity.Our approach takes into account the presence of kinematic substructures by modeling the tail of thestellar distribution with multiple components, including the stellar halo and the debris flow called the
Gaia
Sausage (Enceladus). In doing so, we can test the robustness of the escape velocity measurementfor different definitions of the “tail” of the velocity distribution, and the consistency of the data withdifferent underlying models. In this paper, we apply this method to the second data release of
Gaia and find that a model with at least two components is preferred. Based on a fit with three boundcomponents to account for the disk, relaxed halo, and the
Gaia
Sausage, we find the escape velocity ofthe Milky Way at the solar position to be v esc = 484 . +17 . − . km/s. Assuming a Navarro-Frenck-Whitedark matter profile, and taken in conjunction with a recent measurement of the circular velocity atthe solar position of v c = 230 ±
10 km/s, we find a Milky Way concentration of c = 13 . +6 . − . and amass of M = 7 . +1 . − . × M (cid:12) , which is considerably lighter than previous measurements. INTRODUCTIONSince the initial discovery of dark matter (DM)(Zwicky 1933), estimating the total mass and densityprofile of the Milky Way has been of crucial importance,providing a window into estimating the mass of the un-seen DM. Various methods have been suggested to tacklethis question, from modeling the density distributionsof the different Galactic components (Caldwell & Os-triker 1981), to the study of the fastest moving stars(Alexander 1982), to fitting the local escape velocity ofthe stars as a way to constrain the local gravitationalpotential (Leonard & Tremaine 1990), to more complexmethods that involve using large stream structures suchas the Sagittarius stream to constrain the Milky Waypotential at large distances (Gibbons et al. 2014; Dier-ickx & Loeb 2017).In recent years, a number of new phase-space struc-tures have been discovered, which speaks to the successof hierarchical structure formation (White & Rees 1978)and also suggests the need to re-examine methods to ex-tract the Milky Way mass. One of the many discoveriespioneered by
Gaia was the
Gaia
Sausage or
Gaia
Ence-ladus (Belokurov et al. 2018; Helmi et al. 2018), whichwe will refer to as the Sausage in the remainder of thispaper. The Sausage is the remnant of a merger that oc- curred 6 to 10 billion years ago between a galaxy with astellar mass of ∼ − M (cid:12) and the Milky Way (Myeonget al. 2018; Deason et al. 2018; Lancaster et al. 2018).This substructure is kinematically distinct from the stel-lar halo, with stars on extremely radial orbits (Deasonet al. 2018), and shifts the peak of the stellar speed dis-tribution to lower values compared to the Standard HaloModel (Necib et al. 2019a).In Necib & Lin (2021), we introduced a method toaccount for the presence of kinematic substructures inmeasurements of the escape velocity. Our work buildson the approach of Leonard & Tremaine (1990), whichmodeled the tail of the stellar speed distribution as f ( v ) ∝ ( v esc − v ) k , v min < v < v esc (1)where v esc is the escape velocity, k is the slope, and v min is an arbitrary speed above which we define the “tail”of the distribution. Many papers have used this for-mulation to infer the local escape velocity by fitting forthe parameters v esc and k with various datasets and as-sumptions (Smith et al. 2007; Piffl et al. 2014; Monariet al. 2018; Deason et al. 2019; Koppelman & Helmi2020). These studies have found large correlations be-tween v esc and k , and subsequently large errors on theescape velocity measurements. In order to reduce the er- a r X i v : . [ a s t r o - ph . GA ] F e b ror bars associated with v esc , works such as Leonard &Tremaine (1990); Smith et al. (2007); Piffl et al. (2014);Monari et al. (2018); Deason et al. (2019) argue for im-posing prior ranges on k based on simulations. However,Grand et al. (2019) and Necib & Lin (2021) showed thatthis approach can lead to underestimates of the escapevelocity if the prior range on k is too low, or overesti-mates of v esc if the prior range on k is too high comparedto the Milky Way.In Necib & Lin (2021), hereafter Paper I, we devel-oped a strategy for measuring v esc that does not rely onartificial prior ranges, and that accounts for the pres-ence of substructure by including multiple bound com-ponents following Eq. (1). In particular, the choice of arather low v min = 300 km/s is standard in the literaturein order to increase statistics, but it is not clear thatthe speed distribution for v > v min can be described byonly one power-law component. For instance, in simu-lated halos with major mergers, the speed distributioncan deviate significantly from the power-law form due tosubstructure (Grand et al. 2019). Since it is known thatthe Sausage contributes a large fraction of non-disk starsin the solar neighborhood (Necib et al. 2019a,b), thereis strong motivation to include multiple components inmodeling the speed distribution.To test the idea that v esc measurements might be bi-ased by kinematic substructures, in Paper I we gener-ated mock data that contains two components followingEq. (1), with a common v esc but different k . We at-tributed these to a “relaxed” stellar halo component,which has a larger slope k ∼ k S = 1. These slopes arebased on the analytic arguments of Deason et al. (2019)for tracer populations with different levels of velocityanisotropy. When v min is low, we found that a single-component fit to the mock data tends to overestimate v esc and also give larger error bars on v esc , but as v min isincreased the single-component fit will converge to thecorrect value. A key conclusion of Paper I is that a ro-bust inference of v esc requires testing fit dependence on v min and on the number of components.In this work, we use the methods developed in Paper Ito estimate the escape velocity of the Milky Way at thesolar position. We perform the analysis on the seconddata release of Gaia ( Gaia
DR2) subset that includesline-of-sight velocity measurements and passes the qual-ity cuts of Sec. 2, as well as the subset of this sample withstars on retrograde motion. We test for the dependenceof v esc on both v min and number of components in orderto ensure that the model selection is self-consistent andthat results are robust to changes in data selection. Thispaper is organized as follows: we discuss the datasets used in Sec. 2, present the analysis and discuss best fitresults in Sec. 3, and study implications for the mass ofthe Milky Way in Sec. 4. DATAWe use
Gaia
DR2 and focus on the subset of starswith radial velocity measurements (Gaia Collaborationet al. 2016, 2018) such that we can reconstruct the stellarspeed. This subset is already a rather local sample ofstars, but in order to restrict to a local measurementof the escape velocity, we implement a Galactocentricdistance cut of r GC ∈ [7 . , .
0] kpc.We define the distance measurement as the inverse ofthe measured parallax (1 /(cid:36) ). In order to make surethis assumption is valid, we also implement a cut on theparallax error, such that the fractional parallax error foreach star is less than 10%. We use Bovy (2011) to an-alytically transform the proper motions and the radialvelocities of
Gaia into Galactocentric Cartesian coor-dinates, using the Local Standard of Rest (
U, V, W ) =(11 . , . , .
25) km/s (Sch¨onrich et al. 2010). Weassume the position of the Sun in these coordinates is( − . , , .
02) kpc (Bennett & Bovy 2019).Of the stars passing the cuts above, the majority havefractional errors on the measured speed of (∆ v ) /v < v ) /v ≤
5% on the data sampleto ensure the data is of comparable quality as the mockstested in Paper I. For the entire stellar sample where thefractional speed error is ≤ v >
400 km/s.In previous works such as Monari et al. (2018); Dea-son et al. (2019), only retrograde stars were modeled inorder to avoid contamination from the disk. However,retrograde data also yields lower statistics, with only 442stars above 300 km/s and 66 above 400 km/s, makingit more difficult to draw strong conclusions from sucha small sample. Instead, with the analysis pipeline ofPaper I, we can easily account for possible disk contam-ination by introducing an additional component in themodel. We will therefore consider two sets of analyses:with the
Gaia
DR2 subset as discussed above, whereone of the components in our fit may be attributed tothe disk, and with the subset of retrograde stars only.As we will see below, in fitting only the retrograde starswe are not able to obtain a robust estimate of v esc .
300 400 500 600 700 800 | ~v | [km/s] − − − − − − f ( | ~ v | ) [ k m / s ] − N =1862 v min =300 km/s DataHaloOutliersSum
300 400 500 600 700 800 | ~v | [km/s] − − − − − − f ( | ~ v | ) [ k m / s ] − N =1862 v min =300 km/s DataHaloOutliersSausageSum
300 400 500 600 700 800 | ~v | [km/s] − − − − − − f ( | ~ v | ) [ k m / s ] − N =1862 v min =300 km/s DataDiskOutliersHaloSausageSum
Figure 1.
Best fit results obtained from the full
Gaia data with one (left), two (middle), and three (right) bound components.The bound components are labeled with decreasing k as Disk, Halo, and Sausage for convenience and in line with our expectationfor how different components might behave, but the specific assignment cannot be determined from the fit. The solid lines arethe best fit distributions, while the shaded regions are the 68% containment regions obtained from the posteriors of the modelparameters. The single-component model does not give a good fit to the data, while the results of the two- and three-componentfits are largely consistent with each other and fit the data well.3. RESULTSWe first briefly summarize the analysis pipeline, whilethe complete details can be found in Paper I. We modelthe stellar speed distribution above v min with one, two,or three bound components, as well as an outlier com-ponent to account for unbound stars. Each of the boundcomponents has a common v esc while the outlier distri-bution can extend above v esc . Explicitly, the likelihoodfunction for a single star is L = f p out ( v obs | σ out )+(1 − f ) (cid:88) i f i p i ( v obs | v esc , k i ) , (2)where v obs is the observed speed, f i the fractional contri-bution of the i -th component to the bound stars, and p i is the distribution of the i -th bound component. To ob-tain p i , we consider Eq. (1) with v esc and k i and convolveit with the measurement error for that star. The boundcomponents are ordered such that k i > k i +1 when thereare multiple components. To describe unbound out-lier stars, we assume p out is a Gaussian function witha dispersion of σ out and that f is the outlier fraction.Although the three component fit was not discussed inPaper I, it is a natural extension to include another com-ponent with the same v esc but a third slope k and anassociated fraction f . For the total likelihood, we sumover all stars in the dataset.We used the Markov Chain Monte Carlo emcee (Foreman-Mackey et al. 2013) to find the best fit pa-rameters, using 200 walkers, 500 steps for the burninstage, and 2000 steps for each run. We assume linearpriors in k i and f i and log priors in f, σ out . We take atheory prior that is uniform in 1 /v esc in order to favorlower v esc ; note that for our main results, we find thatthe v esc posteriors are sharply peaked, such that thisprior has a negligible effect on the results. 3.1. All Data
We begin with the entire dataset satisfying the qual-ity cuts of Sec. 2 and with v min = 300 km/s. Becauseof the possibility of disk contamination, we consider upto three bound components. We will refer to the com-ponents in order of decreasing k as the Disk, Halo, andSausage for convenience, but note that in this analysiswe cannot empirically determine a physical origin foreach of the bound components.Fig. 1 shows the best fit distributions obtained withone, two, or three bound components. First, we see thatthe single component fit (left panel) is not sufficient todescribe the data. This is due to the fact that the fitfor the bound component is anchored at the low end | (cid:126)v | ∼
300 km/s, where there is a steep slope or high k . This leads to an underestimate of the number ofstars at v ∼
400 km/s, which is partially compensatedfor by increasing the outlier fraction, ultimately leadingto an overestimate of the number of stars at | (cid:126)v | > v esc towards larger values. In-deed, for this fit we found v esc = 735 . +50 . − . km/s.Meanwhile, we see that the two and three compo-nent models do provide good fits to the data and con-sistent results. The two function fit yields v esc =519 . +17 . − . km/s and the three function fit yields v esc =484 . +17 . − . km/s, which are consistent within two stan-dard deviations. In both fits, we expect that the firstcomponent, or the highest k component, will correspondto any disk contamination which drops steeply in v . In-deed, we find that the result for k = k pushes up againstour default prior of k i ∈ [0 . ,
15] on the upper end; in-creasing the prior range to k i ∈ [0 . ,
300 325 350 375 400 v min [km/s]4005006007008009001000 v e s c [ k m / s ] N =1862 N =769 N =425 N =250 N =161 Gaia Data
Err=5.0%
Single FunctionTwo Functions Three Functions
Figure 2.
Posteriors in the escape velocity for different values of v min , from fitting the dataset satisfying quality cuts andfractional speed error ≤ v min for v min <
375 km/s,suggesting the need for an additional component in the fit. The posterior distributions for the two-component fit (shaded red)are much more stable with v min . For v min = 300 km/s only, we show the posterior from a three-component fit (shaded greenand dashed). The labels N indicate the number of stars in each sample. sulting slope for this disk component is k = 18 . +1 . − . and k = 18 . +0 . − . for the two and three function fits,respectively. While it is expected that the disk distri-bution drops sharply for | (cid:126)v | (cid:38)
300 km/s, it might beconcerning to have a fit pushing against the prior here.Therefore, we next turn to the dependence of our resultson v min . Studying the fit dependence with increasing v min will provide additional tests of robustness as wellas further eliminate the disk contamination.Our analysis is repeated for v min ∈ [300 , , , , v esc posterior which drifts with v min suggests thatthe model is missing some important component. InFig. 2, we show the posterior distribution of the es-cape velocity for the single and two component fits fordifferent values of v min . As v min increases, v esc in thesingle-component fit drifts towards lower values, con-verging towards similar results for v min = 375 km/s and v min = 400 km/s. This indicates that additional com-ponents are needed to describe the data below | (cid:126)v | ∼ v min ≥
375 km/s. Indeed, the posterior distributionsfor the two-component fit are much more stable with v min and consistent with the single component fit for v min ≥
375 km/s. However, for v min = 300 km/s the2-component posterior is peaked at higher v esc values,suggesting that a third component might be present.Thus, we also show the posterior for the 3-component
300 320 340 360 380 400 v min [km/s] − − − − − A I C Gaia Data
Err=5.0%
AIC - AIC AIC - AIC AIC - AIC Figure 3.
We use the ∆AIC to compare goodness-of-fit be-tween models with different numbers of bound components.The plot shows results using the full
Gaia dataset, and thesubscript indicates number of components in the fit. When∆AIC is negative, the first model is preferred. fit and v min = 300 km/s. This result is consistentwith the posteriors for single and two-component fitsat larger v min . (We do not show the posteriors for the3-component fit and v min ≥
325 km/s because thoseare largely unconstrained, indicating that we have toomany degenerate parameters.)In order to compare goodness-of-fit with differentnumbers of bound components, we calculate the AIC(Akaike 1974) as discussed in Paper I. The AIC is de-fined as AIC = 2 s − L ) where s is the numberof model parameters and log( ˆ L ) is the maximum loglikelihood. To compare models, we compute the differ-ence ∆AIC ab = AIC a − AIC b between fits with a or b bound components, where the model with the lowestAIC is preferred. We show the resulting ∆AIC in Fig. 3,where we compare the model with one, two, and threecomponents. For v min = 300 km/s, where we expecta large disk contamination, we find that the two andthree component models are strongly preferred over asingle function case. This is visible from the first panelof Fig. 1, where a single bound component clearly doesnot describe the data as well. As expected however, thesingle function is preferred for v min ≥
375 km/s. Thisis because at high | (cid:126)v | the distribution is dominated bya single distribution, and extra model parameters arepenalized. It is interesting to see that the three functionfit is marginally better than the two function fit onlyfor v min = 300 km/s, beyond which it leads to an over-fit. This implies that the data is determined by threecomponents for v min = 300 km/s, but that the disk islargely subdominant past v min = 325 km/s.From this discussion, we see a consistent story emerge:there is a sharply peaked component of stars near | (cid:126)v | ∼
300 km/s, along with two other bound components thatdominate at higher speeds. A single component domi-nates the tail for | (cid:126)v | >
375 km/s. These could already beseen in the three-component fit shown in Fig. 1, but isfurther supported by studying fit dependence with v min and the ∆AIC. We conclude that a single-componentmodel is not a self-consistent description of the data forlow v min . Our analysis allows us to identify the “tails”where a single or two-component fit is actually robust.In Sec. 3.3, we will summarize the key results for v esc and also compare with previous studies.It is important to note that our framework only showsthat this data is better modeled by multiple componentswith power law distributions. It cannot show that eachcomponent has a distinct physical origin. Doing so re-quires further studies. Nevertheless, the behavior of thedifferent components appears to be consistent with con-tributions from the Disk, relaxed Halo component, andSausage, and here we summarize the fit results for thosecomponents. Additional results can be found in App. B.As mentioned in Paper I, we do not have a priori in-formation about which component is the Halo and whichis the Sausage; the labels of Fig. 1 are based on the as-sumption that the Sausage has the lowest slope k . Work-ing with that assumption, we find that the posterior onthe slope of the Sausage is k S = 0 . +0 . − . from the threefunction fit at v min = 300 km/s. This is similar to the
300 400 500 600 700 800 | ~v | [km/s] − − − − − − f ( | ~ v | ) [ k m / s ] − N =442 v min =300 km/s DataHaloOutliersSausageSum
Figure 4.
Two function fit only to the retrograde stars sat-isfying quality cuts and with fractional speed errors cappedat 5%. The bound components and outlier population aremuch more poorly constrained than the fits to all stars, andwe do not obtain a robust v esc result. range of slopes in simulations with Sausage-like mergers, k S ∈ [1 , . . +0 . − . and 0 . +0 . − . , respectively. Thisleads to the Sausage component being 0 . +0 . − . of thesenon-disk stars for | (cid:126)v | >
300 km/s. Using the valuesof the best fit slopes, we find that the Sausage compo-nent rises to 0 . +0 . − . of non-disk stars for | (cid:126)v | > v min = 350 km/s, f S = 0 . +0 . − . . These fractional contributions are im-portant in understanding the composition of the DM dis-tribution and impacts predicted signals in DM searches,which was studied in Necib et al. (2018, 2019c).3.2. Retrograde Data
We next perform the same analysis with retrogradestars only. Previous studies have applied this cut onthe data to avoid disk contamination. Similarly, herewe consider a fit with two bound components since wedo not expect any disk contribution for v min = 300km/s. We show the resulting fit in Fig. 4, where wesee that the 68% containment region (shaded bands) forall the components is much larger compared to the fitto all stars in Fig. 1. It is expected that the uncer-tainties on the components should be larger given thatthe statistics is about 2.5 times lower, but the spreadin the bands here reflects more than that. Focusing onthe tail of the bound components, we see that there issignificantly more spread in v esc and in fact the fit gives v esc = 710 . +216 . − . km/s. Dataset Functions v min v esc M , tot [km/s] [km/s] [10 M (cid:12) ]All data 3 300 484 . +17 . − . . +1 . − . All data 2 350 498 . +63 . − . . +2 . − . All data 1 375 515 . +113 . − . . +3 . − . Table 1.
Best fit values of the escape velocity at the solarposition, as well as the total mass of the Milky Way obtainedthrough the assumptions of Sec. 4.
The reason for the much larger uncertainty in v esc hereis due to the difficulty in constraining the outlier pop-ulation. In particular, with the dataset in Fig. 4, it isdifficult to distinguish whether the stars with speeds of500-600 km/s are likely to come from the outlier com-ponent or the bound components. We can further un-derstand this outlier confusion by looking at the escapevelocity posterior, which is shown in Fig. 5. Here wesee that the posterior for the 2-component fit to retro-grade stars is actually double-peaked. One peak is at low v esc , at values roughly consistent with fits to all stars.Meanwhile, the peak at high v esc is correlated with alower outlier fraction, where those high speed stars near | (cid:126)v | ∼
600 km/s can also partly be modeled with thebound components.As a result, taking v esc = 710 . +216 . − . km/s for ret-rograde stars is not a robust result. In App. C, weshow the fit to retrograde stars for different v min andsee similar results with non-convergent fit results anddouble-peaked posteriors. This is expected since theoutlier confusion cannot be removed with larger v min .In fact, the larger number of high-speed stars in the fulldata sample helps pin down the outlier component andthus also v esc of the bound components. With limitedstatistics in the retrograde sample, modeling the tail ofa distribution is particularly sensitive to shot noise inthe outliers. Furthermore, applying a cut for retrogrademotion might be shaping the sample such that it is notkinematically complete. The possible presence of addi-tional kinematic structures was studied in Paper I by theintroduction of small peaked structures, and we foundthat it could strongly bias the results. Given these fac-tors, we will consider the fit to all of the data as ourmain result.3.3. Comparison with previous results
The main results for v esc are summarized in Fig. 5and Table 1. For these fits, we selected the lowest v min that was consistent with the number of bound compo-nents in the fit. This was determined by the lowest v min ∈ [300 , , , ,
400 600 800 1000 v esc [km/s]0 . . . . a . u . v min = 300 km/s2 Func, v min = 350 km/s1 Func, v min = 375 km/sRetrograde, v min = 300 km/s Figure 5.
Posteriors for v esc from fits to all data with 1, 2, or3 bound components, and from a fit to retrograde data with2 bound components. Full data posterior distributions havebeen normalized to unity, while the retrograde data has beenmultiplied by a factor of 6 to clarify the non-convergence ofthe posterior. For each of fits to all data, we selected thelowest v min valid for that number of components. For theretrograde data, we do not obtain a convergent result. see that the results for 1, 2, or 3 bound components arealso all consistent with each other.Previous studies all used a single function of the formof Eq. (1). To deal with a large degeneracy in v esc and k ,they typically impose an artificial prior on the slope k .Piffl et al. (2014); Monari et al. (2018) adopt a prior of k ∈ [2 . , . k ∈ [1 , .
5] given that cos-mological simulations with merger events similar to the
Gaia
Sausage had slopes in that range. These priorsstrongly affect the measured escape velocity, giving riseto nonconvergent results as can be seen in the cornerplot of Deason et al. (2019).As discussed in Paper I, the use of a single functionwhere the fit requires two leads to an averaging of theslopes k . For example, in v min = 350 km/s where thetwo function fit suffices, we find two components with k = 12 . +4 . − . and k S = 1 . +1 . − . , while a single func-tion fit yields k = 8 . +4 . − . . One can see in this casethat although k S is indeed within the range suggestedby Deason et al. (2019), using a single function fit anda limited prior in k can artificially inflate the escapevelocity value.Given this, it is not surprising that our results for v esc are lower than previous studies. Deason et al. (2019)finds v esc = 528 +24 − km/s with a prior range of k ∈ [1 , . v esc = 580 ± k ∈ [2 . , . k , they find v esc = 497 +53 − km/s where the quoted errors are the99% confidence intervals. Using a larger sample of starswith only tangential velocities, they report a lower limitof v esc = 497 +40 − km/s at the solar radius, again givingthe 99% confidence interval. MASS OF THE MILKY WAYThe escape velocity at a certain radius is related tothe gravitational potential Φ. Therefore, measurementof the escape velocity can translate to a measurement ofthe mass of the Milky Way, once profiles of the differ-ent mass components are assumed. For an isolated halo v esc ( r ) = (cid:112) | Φ( r ) | , but in practice we must select a lim-iting radius beyond which stars can become unbound. Amore realistic assumption is that stars are bound withina few times r , where r is the radius at which thegalaxy’s mass is 200 times the critical mass of the uni-verse. For ease of comparison, we will adopt the samedefinition as Deason et al. (2019) by taking the limitingradius as 2 r , with v esc ( r (cid:12) ) = (cid:112) | Φ( r (cid:12) ) − Φ(2 r ) | .To recover the mass of the Milky Way, we make thefollowing assumptions on the different baryonic and DMcomponents, similarly to Deason et al. (2019), whichassumes model I of Pouliasis et al. (2017): • The bulge is modeled as a Plummer profile (Plum-mer 1911) of mass M bulge = 1 . × M (cid:12) anda scale radius b = 0 . • The thin disk is modeled as a Miyamoto-Nagai profile (Miyamoto & Nagai 1975) of mass M thin disk = 3 . × M (cid:12) , a scale radius r thin disk = 5 . z thin disk = 0 .
25 kpc. • The thick disk is modeled as Miyamoto-Nagai pro-file of mass M thick disk = 3 . × M (cid:12) , a scaleradius r thick disk = 2 . z thick disk = 0 . • The Dark Matter profile is modeled as an Navarro-Frenk-White (NFW) (Navarro et al. 1996) profileof mass M and concentration parameter c ,which we will fit for. We take the Hubble con-stant H = 70 km s − Mpc − , the matter abun-dance Ω M = 0 . . . . . . ( M ) [ M (cid:12) ]51015202530 c v esc ( r (cid:12) ) v circ ( r (cid:12) ) Figure 6.
Mass and concentration of the DM halo, con-strained by the circular velocity measurement (blue) fromEilers et al. (2019), and by the escape velocity (red) studiedin this work. For v esc we take the posterior from the three-function fit and v min = 300 km/s. The contours are 68% and95% CL, and the black lines show the combined constraints. To translate the posterior distribution of v esc ,marginalized over all other parameters, into a posteriorin the enclosed mass-concentration M − c space,we use galpy (Bovy 2015) to compute the escape ve-locity of the summed potentials assumed above in a gridof M and c . We then plot the PDF of each point in M and c using the interpolated version of the es-cape velocity posterior distribution, and show in Fig. 6the 68% and 95% containment regions based on v esc .Similarly, we overlay constraints based on the circularvelocity v circ = 230 ±
10 km/s from Eilers et al. (2019).This is important as the escape velocity gives informa-tion on the mass of the Milky Way at large distances,while the circular velocity constrains it within the solarcircle.We now compare our findings to the literature. Wedefine the total mass of the Milky Way as the mass con-tained within r of the DM halo, in addition to thebaryons, including the bulge, thin, and thick disk as de-scribed above. In Fig. 7, we show the results for the total mass based on the four analyses summarized in Sec. 3.3.For analyses with higher v min , the errors on the escapevelocity increases, as explicitly shown in Table 1. Thisleads to an increase in the error estimate of the mass ofthe Milky Way.For reference, we also show the masses found byMonari et al. (2018); Deason et al. (2019); Koppelman& Helmi (2020). We note there is a small difference inconvention, where Monari et al. (2018) and Koppelman& Helmi (2020) define the escape velocity cutoff as 3 r and not 2 r as we do, although we do not expect more v e s c [ k m / s ] F un c , v m i n = k m / s F un c , v m i n = k m / s F un c , v m i n = k m / s R e t r og r a d e , v m i n = k m / s M o n a r i + ( ) D e a s o n + ( ) K o pp e l m a n + ( ) . . . . . . . . l og ( M , t o t ) [ M (cid:12) ] F un c , v m i n = k m / s F un c , v m i n = k m / s F un c , v m i n = k m / s R e t r og r a d e , v m i n = k m / s M o n a r i + ( ) D e a s o n + ( ) K o pp e l m a n + ( ) Figure 7.
Best fit values of the escape velocity (top) andtotal mass (bottom) of the Milky Way for the different anal-yses discussed in Sec. 3. The analyses of the full data is themost robust, and tends to be lower than previous measure-ments; it is within a single standard deviation of Deason et al.(2019) and Koppelman & Helmi (2020) and lower than mea-surements by Monari et al. (2018). The result based on theretrograde data is shown for completeness, but that analysisdid not give a robust v esc (see Fig. 4). We show the escapevelocity measurement of Koppelman & Helmi (2020) wherewe divide their errors by 3, as they quote 99% confidenceintervals while we quote 68%. For the mass measurement,Koppelman & Helmi (2020) multiply the escape velocity es-timate by 10%, motivated by the findings of (Grand et al.2019) for potential biases. The mass plotted above is the onecorrected with this factor. than a few percent difference due to this effect. We findthat our analysis consistently shows a lower mass of theMilky Way than previous studies, although it is within asingle standard deviation of the result from Deason et al.(2019). Note that we include the result from the retro-grade data for completeness, and to show that even thenon-convergent result for v esc still translates to a some-what consistent total mass.As can be seen in Fig. 6, the circular velocity mea-surement plays an important role in constraining themass of the Milky Way. For example, the current un-certainties on both the circular and escape velocities in- troduce a ∼
20% error on the mass estimate. Drop-ping the error on the circular velocity from 10 km/s to5 km/s would reduce this error down to ∼ M , tot = 6 . +1 . − . × M (cid:12) , or a ∼
10% differencefrom the value we quote above. A better understandingof the baryonic model of the Milky Way as well as amore accurate measurement of the circular velocity aretherefore important in improving the total Milky Waymass estimates. Similarly, using a contracted NFW pro-file due to the presence of baryons (Schaller et al. 2015)would also change the total mass estimate substantially;Piffl et al. (2014) used both the regular and contractedNFW profiles and found a difference of ∼ CONCLUSIONSIn this paper, we applied a new analysis pipeline forconstraining v esc that accounts for the presence of kine-matic substructure in the stellar speed distribution. Ourwork is motivated by the discovery of the Gaia
Sausage-Enceladus, as well as by the need to improve the ro-bustness of v esc fits, which can be very sensitive to thedefinition of the “tail” of the velocity distribution. Weintroduce a forward model that allows for multiple powerlaw components in the tail of the stellar velocity distri-bution, and showed that repeating the analysis for v min and number of components allows us to robustly deter-mine v esc . Details of this pipeline and examples withmock datasets can be found in Paper I.We found that at least two bound components is pre-ferred in fits to the Gaia data for stars with speeds v >
325 km/s. These components are suggestive of aSausage-Enceladus component and a relaxed stellar halocomponent, but more study is needed to understand ifthe components truly have different physical origin or ifa multi-component model simply provides a more flex-ible fitting framework. With this multi-component fit,our result for v esc is lower than previous measurementsin the literature. On the other hand, previous worksassumed a single power law component for the tail, andwe have shown that assuming a single component wherethe data prefers more can lead to systematically higher v esc values.Using our results for v esc , we determined the totalmass of the Milky Way, finding a value of the concen-tration of c = 13 . +6 . − . and a total mass of M =7 . +1 . − . × M (cid:12) . Our result for the total mass is lowerthan those of previous studies relying on the escape ve-locity to obtain the mass of the Galaxy. However, theyare more consistent with methods based on the distri-bution function of globular clusters (e.g. Eadie & Juri´c(2019)) and matching satellites of the Milky Way withtheir simulation counterparts (e.g. Patel et al. (2018)).Wang et al. (2020) provides a review of all these meth-ods, which shows a large scatter in the mass estimatesranging from ∼ (0 . − × M (cid:12) . Our method providesthe most robust measurement relying on the escape ve-locity at the location of the Sun.Along with a better understanding of the baryoniccomponents of the Milky Way, the dark matter profile,and the local circular velocity, other effects are impor-tant to evaluate in order to improve Milky Way massestimates. In particular, many of the existing estimatesassume a relaxed equilibrated halo between the locationof the Sun and the edge of the Galaxy. The presence ofsatellites, streams, and the evidence of the active mergerhistory of the Milky Way would suggest otherwise. It istherefore crucial to pair our pipeline, applied at differ-ent distances from the Galactic center, with a betterunderstanding of the halo at larger radii. A combinedapproach will help build a complete and coherent pictureof the Milky Way potential, and constrain the shape ofthe dark matter halo. ACKNOWLEDGEMENTSWe are grateful to I. Moult for early discussions andcollaboration on the project, and to M. Lisanti for help-ful feedback. We would also like to thank L. Anderson,A. Bonaca, G. Collin, A. Deason, P. Hopkins, A. Ji, andJ. Johnson for helpful conversations.This work was performed in part at Aspen Center forPhysics, which is supported by National Science Founda-tion grant PHY-1607611. This research used resourcesof the National Energy Research Scientific ComputingCenter (NERSC), a U.S. Department of Energy Office ofScience User Facility operated under Contract No. DE-AC02-05CH11231. LN is supported by the DOE underAward Number DESC0011632, the Sherman Fairchildfellowship, the University of California Presidential fel-lowship, and the fellowship of theoretical astrophysics atCarnegie Observatories. TL is supported by an AlfredP. Sloan Research Fellowship and Department of Energy(DOE) grant DE-SC0019195.This work has made use of data from the Euro-pean Space Agency (ESA) mission Gaia
Gaia
Gaia
Multilateral Agreement.REFERENCES
Ade, P., et al. 2016, Astron. Astrophys., 594, A13Akaike, H. 1974, IEEE Transactions on Automatic Control,19, 716Alexander, J. B. 1982, MNRAS, 201, 579Belokurov, V., Erkal, D., Evans, N. W., Koposov, S. E., &Deason, A. J. 2018, MNRAS, 478, 611Bennett, M., & Bovy, J. 2019, MNRAS, 482, 1417Bovy, J. 2011, PhD thesis, copyright - Database copyrightProQuest LLC; ProQuest does not claim copyright in theindividual underlying works; Last updated - 2016-06-04Bovy, J. 2015, The Astrophysical Journal SupplementSeries, 216, 29Caldwell, J. A. R., & Ostriker, J. P. 1981, ApJ, 251, 61Deason, A. J., Belokurov, V., Koposov, S. E., & Lancaster,L. 2018, ApJ, 862, L1Deason, A. J., Fattahi, A., Belokurov, V., et al. 2019,MNRAS, 485, 3514Dierickx, M. I. P., & Loeb, A. 2017, ApJ, 847, 42Eadie, G., & Juri´c, M. 2019, ApJ, 875, 159Eilers, A.-C., Hogg, D. W., Rix, H.-W., & Ness, M. K.2019, ApJ, 871, 120 Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman,J. 2013, PASP, 125, 306Gaia Collaboration, Brown, A. G. A., Vallenari, A., et al.2018, ArXiv e-prints, arXiv:1804.09365Gaia Collaboration, Prusti, T., de Bruijne, J. H. J., et al.2016, A&A, 595, A1Gibbons, S. L. J., Belokurov, V., & Evans, N. W. 2014,MNRAS, 445, 3788Grand, R. J. J., Deason, A. J., White, S. D. M., et al. 2019,MNRAS, 487, L72Helmi, A., Babusiaux, C., Koppelman, H. H., et al. 2018,Nature, 563, 85Koppelman, H. H., & Helmi, A. 2020, arXiv e-prints,arXiv:2006.16283Lancaster, L., Koposov, S. E., Belokurov, V., Evans, N. W.,& Deason, A. J. 2018, ArXiv e-prints, arXiv:1807.04290Leonard, P. J. T., & Tremaine, S. 1990, ApJ, 353, 486Miyamoto, M., & Nagai, R. 1975, PASJ, 27, 533Monari, G., Famaey, B., Carrillo, I., et al. 2018, A&A, 616,L9 Myeong, G. C., Evans, N. W., Belokurov, V., Sand ers,J. L., & Koposov, S. E. 2018, ApJ, 863, L28Navarro, J. F., Frenk, C. S., & White, S. D. M. 1996, ApJ,462, 563Necib, L., & Lin, T. 2021Necib, L., Lisanti, M., & Belokurov, V. 2019a, ApJ, 874, 3Necib, L., Lisanti, M., Garrison-Kimmel, S., et al. 2018,arXiv e-prints, arXiv:1810.12301Necib, L., Ostdiek, B., Lisanti, M., et al. 2019b, arXive-prints, arXiv:1907.07681—. 2019c, arXiv e-prints, arXiv:1907.07190Patel, E., Besla, G., Mandel, K., & Sohn, S. T. 2018, ApJ,857, 78Piffl, T., Scannapieco, C., Binney, J., et al. 2014, A&A,562, doi:10.1051/0004-6361/201322531Plummer, H. C. 1911, MNRAS, 71, 460Pouliasis, E., Di Matteo, P., & Haywood, M. 2017, A&A,598, A66 Schaller, M., Frenk, C. S., Bower, R. G., et al. 2015,Monthly Notices of the Royal Astronomical Society, 451,1247Sch¨onrich, R., Binney, J., & Dehnen, W. 2010, MNRAS,403, 1829Smith, M. C., Ruchti, G. R., Helmi, A., et al. 2007,MNRAS, 379, 755Wang, W., Han, J., Cautun, M., Li, Z., & Ishigaki, M. N.2020, Science China Physics, Mechanics, and Astronomy,63, 109801White, S. D. M., & Rees, M. J. 1978, Mon. Not. Roy.Astron. Soc., 183, 341Xue, X. X., Rix, H. W., Zhao, G., et al. 2008, ApJ, 684,1143Zwicky, F. 1933, Helvetica Physica Acta, 6, 110 A. DATA SELECTIONFor the data sample in the main text, we apply Galactocentric distance cuts as well as parallax error cuts. For theremaining stars, we show the distribution of the errors on the measured speeds in Fig. S1. The cut placed in the maintext of ∆ v/v < .
05 only removes a small fraction of the entire data sample, while the typical errors in the retrogradesample are larger. We account for the error in individual stars in the likelihood, as discussed in Necib & Lin (2021). ∆ v [km/s] N S t a r s All StarsRetrograde Stars .
00 0 .
05 0 . (∆ v ) /v N S t a r s Figure S1.
The top panel shows the distribution of speed errors in the entire
Gaia sample satisfying distance and parallaxerror cuts, while the bottom panel shows the distribution of fractional speed error. B. CORNER PLOTS FOR ANALYSES IN MAIN TEXTB.1.
Corner plots for v min = 300 km/s v esc = 735 . +50 . − . k k = 18 . +1 . − . − . − . − . − . − . l og ( f ) log( f ) = − . +0 . − .
500 600 700 800 900 v esc . . . . . l og ( σ o u t ) k − . − . − . − . − . log( f ) .
25 5 .
40 5 .
55 5 .
70 5 . log( σ out ) log( σ out ) = 5 . +0 . − . Figure S2.
Corner plot for the single component fit of the full
Gaia data, with v min = 300 km/s, and a cap of 5% on themeasured fractional error of the speeds. The contours correspond to 68% and 95% containment. v esc = 519 . +17 . − . . . . . k k = 18 . +1 . − . − . − . − . − . l og ( f ) log( f ) = − . +0 . − . . . . . . l og ( σ o u t ) log( σ out ) = 5 . +0 . − . . . . . k Sub s k Subs = 2 . +0 . − .
480 520 560 600 v esc . . . . . f Sub s . . . . k − . − . − . − . log( f ) . . . . . log( σ out ) . . . . k Subs .
32 0 .
40 0 .
48 0 .
56 0 . f Subs f Subs = 0 . +0 . − . Figure S3.
Corner plot for the two component fit of the full
Gaia data, with v min = 300 km/s, and a cap of 5% on themeasured fractional error of the speeds. The contours correspond to 68% and 95% containment. v esc = 484 . +17 . − . k k = 18 . +0 . − . − . − . − . − . − . l og ( f ) log( f ) = − . +0 . − . . . . . l og ( σ o u t ) log( σ out ) = 5 . +0 . − . . . . . . k Sub s k Subs = 0 . +0 . − . . . . . f Sub s f Subs = 0 . +0 . − . k k = 4 . +1 . − .
440 480 520 560 v esc . . . . f
12 14 16 18 k − . − . − . − . − . log( f ) . . . . log( σ out ) . . . . . k Subs .
15 0 .
30 0 .
45 0 . f Subs k .
15 0 .
30 0 .
45 0 . f f = 0 . +0 . − . Figure S4.
Corner plot for the three component fit of the full
Gaia data, with v min = 300 km/s, and a cap of 5% on themeasured fractional error of the speeds.The contours correspond to 68% and 95% containment. Corner plot for v min = 350 km/s and 2-component fit v esc = 498 . +63 . − . k k = 12 . +4 . − . − . − . − . − . − . l og ( f ) log( f ) = − . +0 . − . . . . . . l og ( σ o u t ) log( σ out ) = 5 . +0 . − . k Sub s k Subs = 1 . +1 . − .
450 600 750 900 v esc . . . . f Sub s k − . − . − . − . − . log( f ) . . . . . log( σ out ) k Subs . . . . f Subs f Subs = 0 . +0 . − . Figure S5.
Corner plot for the two component fit of the full
Gaia data, with v min = 350 km/s, and a cap of 5% on themeasured fractional error of the speeds. The contours correspond to 68% and 95% containment. Corner plot for v min = 375 km/s and 1-component fit v esc = 515 . +113 . − . k k = 2 . +3 . − . − . − . − . − . − . l og ( f ) log( f ) = − . +0 . − .
450 600 750 900 v esc . . . . l og ( σ o u t ) k − . − . − . − . − . log( f ) .
75 6 .
00 6 .
25 6 . log( σ out ) log( σ out ) = 5 . +0 . − . Figure S6.
Corner plot for the single component fit of the full
Gaia data, with v min = 375 km/s, and a cap of 5% on themeasured fractional error of the speeds. The contours correspond to 68% and 95% containment. C. ANALYSIS WITH RETROGRADE DATAIn the main text, we found robust and self-consistent results for v esc when fitting the Gaia dataset (satisfying somequality cuts). We are not able to find robust results when fitting retrograde stars only. In Fig. S7, we see that the v esc posteriors are non-convergent in all cases, and there is no clear trend in the results with v min . As described in themain text and can be seen in Fig. 4, there are far fewer retrograde stars with speeds above 500 km/s, and the outlierpopulation is not as well constrained, leading to outlier confusion and the double-peaked posteriors.
300 325 350 375 4004006008001000 v e s c [ k m / s ] N =442 N =285 N =172 N =95 N =66 Gaia Retrograde Data
Err=5.0%
Single Function Two Functions
300 325 350 375 400 v min [km/s]051015 k Figure S7.
From fitting the retrograde data only. Posteriors in escape velocity and k for different values of v minmin