[PDF] Evidence for Unresolved Gamma-Ray Point Sources in the Inner Galaxy

Abstract

We present a new method to characterize unresolved point sources (PSs), generalizing traditional template fits to account for non-Poissonian photon statistics. We apply this method to Fermi Large Area Telescope gamma-ray data to characterize PS populations at high latitudes and in the Inner Galaxy. We find that PSs (resolved and unresolved) account for ~50% of the total extragalactic gamma-ray background in the energy range ~1.9 to 11.9 GeV. Within 10 ∘ of the Galactic Center with |b|≥ 2 ∘ , we find that ~5-10% of the flux can be accounted for by a population of unresolved PSs, distributed consistently with the observed ~GeV gamma-ray excess in this region. The excess is fully absorbed by such a population, in preference to dark-matter annihilation. The inferred source population is dominated by near-threshold sources, which may be detectable in future searches.

Full PDF

MMIT-CTP-4684

Evidence for Unresolved Gamma-Ray Point Sources in the Inner Galaxy

Samuel K. Lee,

1, 2

Mariangela Lisanti, Benjamin R. Safdi, Tracy R. Slatyer, and Wei Xue Princeton Center for Theoretical Science, Princeton University, Princeton, NJ 08544 Broad Institute, Cambridge, MA 02142 Department of Physics, Princeton University, Princeton, NJ 08544 Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139 (Dated: February 4, 2016)We present a new method to characterize unresolved point sources (PSs), generalizing traditionaltemplate ﬁts to account for non-Poissonian photon statistics. We apply this method to

Fermi

LargeArea Telescope gamma-ray data to characterize PS populations at high latitudes and in the InnerGalaxy. We ﬁnd that PSs (resolved and unresolved) account for ∼

50% of the total extragalacticgamma-ray background in the energy range ∼ ◦ of the Galactic Centerwith | b | ≥ ◦ , we ﬁnd that ∼ ∼ GeV gamma-ray excess in this region. The excess isfully absorbed by such a population, in preference to dark-matter annihilation. The inferred sourcepopulation is dominated by near-threshold sources, which may be detectable in future searches.

Dark-matter (DM) annihilation in the Galactic halocan contribute to the ﬂux of high-energy gamma raysdetected by experiments such as the

Fermi

Large AreaTelescope [1]. Currently, an excess of ∼ GeV gamma rayshas been observed by

Fermi near the Galactic Center(GC) [2–16]. The signal extends ∼ ◦ oﬀ the plane, isapproximately spherically symmetric, and has an inten-sity proﬁle that falls as r − γ with γ ≈ . . Fermi data from ∼ August 4, 2008 to ∼ December 5, 2013made available by [29]. A HEALPix [30] pixelization ofthe data with nside = 128 is used, corresponding to pix-els ∼ ◦ to a side. We emphasize that our study focuseson data in a single energy bin from 1.893–11.943 GeV anddoes not rely on or extract spectral information for the excess. The choice of this energy range keeps the signal-to-background ratio in the region of interest (ROI) high,maintains a suﬃciently large number of photons over thefull sky, and keeps the point-spread function relativelysmall and energy-independent.The analysis utilizes the photon-count probability dis-tribution in each pixel. In general, a given model for thegamma-ray ﬂux, with parameters θ , predicts a probabil-ity p ( p ) k ( θ ) of observing k photons in a pixel p . Severalsource components, each modeled by a spatial template,can contribute photons in a pixel. To date, analyses usingtemplates have assumed Poisson statistics for the photon-count distribution—speciﬁcally, that p ( p ) k ( θ ) is the Pois-son probability to draw k counts with mean given by thesum of the template components in pixel p .To account for unresolved PSs, the standard template-ﬁtting procedure must be generalized to include non-Poissonian photon counts. In the NPTF procedure, p ( p ) k ( θ ) depends on a potentially pixel-dependent PSsource-count function dN p /dF . The source-count func-tion determines the average number of PSs within pixel p that contribute photon ﬂux between F and F + dF . Inthis work, the source-count function is assumed to fol-low a broken power law, dN p /dF ∝ A p F − n , with pixel-dependent normalization A p and indices n ( n ) above(below) the break F b that are constant between pixels.For isotropically distributed PSs, A p is constant betweenpixels. To model a population of PSs that mimics a DMannihilation signal, A p must instead follow the DM anni-hilation template. Semi-analytic methods for calculatingthe p ( p ) k ( θ ) with a broken power law source-count func-tion were developed in [31, 32].We include templates for up to seven diﬀerent com-ponents in the NPTF analysis: (1) diﬀuse background,assuming the Fermi p6v11 diﬀuse model; (2) Fermi bub-bles, assuming uniform emission within the bubbles [34]; (3) isotropic background; (4) annihilation of Navarro-Frenk-White (NFW) [35, 36]-distributed DM, assumingno substructure; (5) isotropic PSs; (6) disk-correlated a r X i v : . [ a s t r o - ph . H E ] F e b -10 -9 -8 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Iso. PSIso. PS (3FGL masked)3FGL PS

FIG. 1: The source-count function for high-latitude pointsources, derived from applying non-Poissonian template ﬁtsto data with 3FGL sources [33] unmasked (green band) andmasked (orange band). The colored bands indicate 68% con-ﬁdence intervals, which are computed pointwise in F fromthe posteriors for the source-count–function parameters, whilethe solid and dashed black lines show the median source-countfunctions. The black points show the source-count function ofthe detected point sources in the 3FGL catalog. The verticalerror bars indicate 68% conﬁdence intervals; the horizontalbars denote bins in F . PSs, and (7)

NFW-distributed PSs. Templates (1) through (4) are speciﬁed by a single normalization pa-rameter each. Templates (4) and (7) assume a general-ized NFW distribution with inner slope γ = 1 .

25. Tem-plate (6) corresponds to a doubly exponential thin-disksource distribution with scale height 0.3 kpc and radius5 kpc. The PS templates each have four parameters de-scribing their respective source-count functions.Bayesian methods (implemented with

MultiNest [37,38]) are used to extract posterior distributions for themodel parameters. The prior distributions of all param-eters are ﬂat, except for those of the DM and PS nor-malization factors, which are log ﬂat. Unless otherwisestated, the prior ranges of all parameters are suﬃcientlylarge so that the posterior distribution is well-converged.We begin by applying the NPTF to data at highGalactic latitudes ( | b | ≥ ◦ ). Fermi ’s third sourcecatalog (3FGL) [33] identiﬁes 1307 gamma-ray PSs inthis region of the sky, with ∼

55% associated with Ac-tive Galactic Nuclei and ∼

24% associated with pul-sars, supernova remnants and other known gamma-raysources. The remaining ∼

21% are unassociated. Figure 1 More speciﬁcally, the 2-D spatial distribution of ﬂux from thePSs, projected along the line-of-sight, follows the ﬂux distribu-tion of gamma-rays from DM annihilation, assuming the DM isdistributed with a generalized NFW distribution. shows the source-count function dN/dF in terms of theﬂux of the 3FGL PSs in our energy bin (black points).The observed source-count function is suppressed below F ∼ − photons / cm / s, where it is hard to detect PSsover the diﬀuse background with the current exposure.The NPTF is performed in this high-latitude region,including templates for the diﬀuse background, Fermi bubbles, isotropic emission, and isotropic PSs. The best-ﬁt source-count function values are given in Tab. I. The pointwise 68% conﬁdence interval for the source-count function is shown in Fig. 1, shaded green. Thesource-count function matches the 3FGL data well above F ∼ − photons / cm / s.The best-ﬁt intensities obtained from the NPTF can becompared to those obtained from a standard template ﬁtthat neglects PSs. The diﬀuse-background and Fermi -bubbles intensities (averaged over the ROI) are consis-tent between both procedures. When the PS templateis included, the isotropic-background intensity is I iso =1 . +0 . − . × − photons/cm /s/sr and the isotropic PSintensity is I isoPS = 1 . +0 . − . × − photons/cm /s/sr.With no PS template, the isotropic-background inten-sity is over twice as high, I iso = 3 . +0 . − . × − photons/cm /s/sr. Thus, the PS intensity is absorbed bythe isotropic-background template in the standard pro-cedure.The averaged intensity of the observed 3FGL PSs is ∼ × − photons/cm /s/sr at high latitudes. Us-ing the result of the NPTF described above and ne-glecting systematic uncertainties in modeling the 3FGLPSs, we predict that the intensity of unresolved PSs is7 . +0 . − . × − photons/cm /s/sr. This can be checkedexplicitly by repeating the NPTF with all 3FGL PSsmasked (at a ∼ ◦ radius). The results of this ﬁt aregiven in Tab. I and illustrated by the orange band inFig. 1. The source-count function for the unresolvedPSs agrees with that computed from the unmasked skyat low ﬂux. This suggests that the NPTF is sensitiveto contributions from unresolved PSs below Fermi ’s de-tection threshold. The intensity of the isotropic back-ground is I iso = 1 . +0 . − . × − photons/cm /s/sr,which agrees with that from the 3FGL-unmasked NPTF,within uncertainties. The intensity of the isotropic PSsis I isoPS = 4 . +0 . − . × − photons/cm /s/sr, which isslightly lower than the value inferred from the 3FGL-unmasked NPTF. I iso corresponds to the intensity of the isotropicgamma-ray background (IGRB), while I iso + I isoPS gives theintensity of the EGB. While the PS template does absorbsome contribution from Galactic PSs, extragalactic PSsare expected to dominate. From the 3FGL-unmaskedNPTF, we infer that 55 +2 − % of the EGB in this energy The exposure map in [29], with average exposure (cid:15) avg = 4 . × cm s, is used to translate between the number of photons S and the ﬂux F . ROI Template 3FGL n n F b Bayes Factor Bayes Factor NFW DM[photons/cm /s] (Data) (Simulation) (95% conﬁdence)HL Iso. PS unmasked 3 . +2 . − . . +0 . − . . +5 . − . × − — — —masked 4 . +0 . − . . +0 . − . . +1 . − . × − IG NFW PS + Disk PS unmasked 18 . +8 . − . − . +0 . − . . +0 . − . × − ∼ ∼ < .

44 %17 . +8 . − . . +0 . − . . +1 . − . × − IG NFW PS + Disk PS masked 18 . +7 . − . − . +1 . − . . +0 . − . × − ∼ ∼ < .

48 %17 . +8 . − . − . +1 . − . . +9 . − . × − TABLE I: Best-ﬁt values (16 th , 50 th , and 84 th percentiles) for the source-count functions associated with the PS templates inthe High Latitude (HL) and Inner Galaxy (IG) ROIs. The source-count function is ﬁt by a broken power-law, where n isthe slope above (below) the break in dN/dF , given by F b . The source-count function for the isotropic PS component in the IGis not included, as its ﬂux fraction is subdominant. Depending on the analysis, the Fermi range is associated with both resolved and unresolvedPS emission; from the 3FGL-masked NPTF and usingthe intensities of the 3FGL PSs, we ﬁnd that 47 +2 − % ofthe EGB is due to PS emission. These estimates appearto be consistent with those in [31, 39], though a directcomparison is made diﬃcult by the fact that these anal-yses cover a diﬀerent energy range and only use the ﬁrst ∼

11 months of

Fermi data. Our estimates for I iso agreewith the most recently published results from Fermi [40].Next, we use the NPTF procedure to determine thefraction of ﬂux from unresolved PSs in the IG. Theseanalyses include templates for the diﬀuse background,the

Fermi bubbles, isotropic background, and NFW-distributed DM, in addition to isotropic, disk-correlated,and NFW-distributed PSs. While the prior ranges forthe isotropic, isotropic PS,

Fermi bubbles, and diﬀusebackground template parameters are not constrained bythe high-latitude ﬁt, restricting these parameters to theirhigh-latitude values does not signiﬁcantly aﬀect the re-sults. The ROI consists of all pixels within 30 ◦ of the GCwith | b | ≥ ◦ , masking out the plane. As above, we In particular, allowing the isotropic and isotropic PS templateparameters to ﬂoat allows the isotropic components to partlycompensate for ﬂaws in the other templates. Mismodeling thatis roughly uniform across the relatively small IG ROI can beabsorbed in this way. For example, if our disk-correlated PStemplate is more sharply peaked toward the Galactic plane thanthe true disk PS population, the isotropic PS template can pickup an additional positive contribution that absorbs the higher-latitude disk PSs. If our disk PS template is broader in lati-tude than the true disk PS population, a negative contributionto the isotropic PS template can help account for this. Thusthe “isotropic” templates in the IG may in principle be eitherbrighter or fainter than their high-latitude counterparts. perform two analyses, one on the full ROI and anotherwith all 3FGL PSs masked. For both cases, the source-count functions and ﬂux fractions are quoted with respectto the region within 10 ◦ of the GC and | b | ≥ ◦ , with noPSs masked. The source-count function of the Galacticand unassociated 3FGL PSs in the IG is given by theblack points in the left panel of Fig. 2, with the numberof PSs in each bin indicated. The majority ( ∼ F ∼ × − photons/cm /s, the NFW PS template accountsfor nearly all the PS emission; its source-count functionhas a steep cutoﬀ just below the source sensitivity thresh-old. It is worth noting that there is no externally imposedthreshold for the PS population in this case, as the 3FGLsources are not masked.The most pressing question to address is whether theexcess ﬂux in the IG is better absorbed by the NFW PSor NFW DM template. The right panel of Fig. 2 showsthe respective ﬂux fractions, computed relative to thetotal photon count in the inner 10 ◦ region with | b | ≥ ◦ and the 3FGL sources unmasked. The disk and isotropicPSs contribute 5 . +0 . − . % and 1 . +0 . − . % of the ﬂux, respec-tively. In contrast, the best-ﬁt ﬂux fraction for the NFWPS component is 8 . +0 . − . %, while the best-ﬁt DM ﬂuxfraction is consistent with zero. The normalization ofthe diﬀuse model remains consistent (to within 1%) withexpectations from the ﬁt at high latitudes, suggestingthat the NFW PS template is absorbing the excess, andonly the excess, and corresponds to a source populationdistinct from the more disk-like population of resolved p o s t e r i o r p r o b a b ili t y NFW PSDisk PSIso. PSNFW DM

No NFW PS Template

FIG. 2: (Left) Best-ﬁt source-count functions within 10 ◦ of the GC and | b | ≥ ◦ , with the 3FGL sources unmasked. Themedian and 68% conﬁdence intervals are shown for each of the following PS components: NFW (dashed, orange), thin-disk(solid, blue), and isotropic (dotted, green). The number of observed 3FGL sources in each bin is indicated. The normalizationfor the diﬀuse emission in the ﬁt is consistent with that at high latitudes, as desired. (Right) Posteriors for the ﬂux fractionwithin 10 ◦ of the GC with | b | ≥ ◦ arising from the separate PS components, with 3FGL sources unmasked. The inset showsthe result of removing the NFW PS template from the ﬁt. Dashed vertical lines indicate the 16 th , 50 th , and 84 th percentiles. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS p o s t e r i o r p r o b a b ili t y NFW PSDisk PSIso. PSNFW DM

FIG. 3: Same as Fig. 2, except with 3FGL sources masked. sources. When the NFW PS template is omitted (inset),the fraction of ﬂux absorbed by the disk PS population isessentially unchanged at 6 . +0 . − . %, and the DM templateabsorbs 7 . +0 . − . % of the ﬂux. The DM ﬂux obtained inabsence of an NFW PS template is consistent with otherestimates in the literature [12, 14]. The model includingthe NFW PS contribution is preferred over that withoutby a Bayes factor ∼ . When the 3FGL sources are masked, the NPTF proce-dure yields a best-ﬁt source-count function given by theorange band in the left panel of Fig. 3. Below the break,the source-count function agrees well with that found bythe unmasked ﬁt. In this case, the contributions from theisotropic and disk-correlated PS templates are negligible. For reference, this corresponds to test statistic 2∆ ln

L ≈

The ﬂux fraction attributed to the NFW PS componentis 5 . +1 . − . %, while the NFW DM template absorbs nosigniﬁcant ﬂux.In the masked analysis, the Bayes factor for a modelthat contains an NFW PS component, relative to onethat does not, is ∼ , substantially reduced relative tothe result for the unmasked case. Masking the 3FGLsources removes most of the ROI within ∼ ◦ of the GC,reducing photon statistics markedly, especially for anysignal peaked at the GC. Furthermore, in the maskedROI, non-NFW PS templates can absorb a substantialfraction of the excess. For example, if only disk andisotropic PS templates are added, the ﬂux fraction at-tributed to the disk template is 2 . +0 . − . %, while thatattributed to NFW DM is 2 . +1 . − . % (the ﬂux attributedto isotropic PSs is negligible). When no PS templatesare included in the ﬁt, the NFW DM template absorbs4 . +1 . − . % of the total ﬂux. As we will discuss later, this -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Simulated data with NFW PSs

NFW PSDisk PSIso. PS3FGL PS p o s t e r i o r p r o b a b ili t y Simulated data with NFW PSs

Disk PSNFW PSIso. PSNFW DM0 5 10 15 200.00.10.2 No NFW PS Template -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Simulated data without NFW PSs

NFW PSDisk PSIso. PS3FGL PS p o s t e r i o r p r o b a b ili t y Simulated data without NFW PSs

Disk PSNFW PSIso. PSNFW DM0 5 10 15 200.00.10.2 No NFW PS Template

FIG. 4: Results obtained by applying the NPTF to simulated data. (Left column) The source-count functions for the PStemplates in the ﬁt when NFW PSs are included in the simulated data (top) or not (bottom). Note that when NFW PSs arenot simulated, an NFW DM component is instead. (Right column) The associated posteriors for the fraction of ﬂux absorbedby the diﬀerent templates in the ﬁt. The inset plots show the results of analyzing the simulated data without an NFW PStemplate in the ﬁt. All plots are relative to the region within 10 ◦ of the GC with | b | ≥ ◦ and 3FGL sources unmasked. Forthe ﬂux-fraction plots, the fractions are computed relative to the total number of counts observed in the real data. behavior agrees with expectations from simulated data.In this statistics-limited regime, the ﬁt does not distin-guish diﬀerent models for the PS distribution at highsigniﬁcance, but there is still a strong preference for un-resolved PSs. The Bayes factor for a model with diskand isotropic PSs, relative to one with no PSs, is ∼ ,while the Bayes factor for a model with NFW, disk andisotropic PSs, relative to one with no PSs, is ∼ . TheBayes factors, best-ﬁt source-count function parameters,and DM ﬂux fractions for the 3FGL masked and un-masked analyses are summarized in Tab. I.To validate the analysis procedure, we generate sim-ulated data using the best-ﬁt parameters from the un- However, if we repeat the analysis using Pass 8 data up to June3, 2015, corresponding to an average exposure increase of ∼ ∼ to ∼ in the 3FGLmasked analysis; in the 3FGL unmasked analysis, the Bayes fac-tor in favor of NFW PSs increases from ∼ to ∼ . masked IG analysis; we include isotropic, Fermi bubbles,and

Fermi p6v11 diﬀuse emission, as well as isotropic,disk and NFW-distributed PSs. The simulated data isthen passed through the 3FGL-unmasked IG analysispipeline described above. Details for how we performthe simulations may be found in the Supplementary Ma-terial.The top row of Fig. 4 shows the source-count functionsthat are recovered from the NPTF (left), as well as theposterior distributions for the ﬂux fractions of the sepa-rate components of the ﬁt (right). The ﬁtting procedureattributes the correct fraction of ﬂux to NFW-distributedPSs, within uncertainties, and ﬁnds no evidence for NFWDM. When no NFW PS template is included in the ﬁt(inset, top right), the NFW DM template absorbs the ex-cess. Both the source-count functions and the ﬂux frac-tions are consistent with the results obtained using realdata. Additionally, we recover a Bayes factor of ∼ inpreference for NFW PSs when using the simulated data,which is similar to what we found for the actual analysis.For comparison, the bottom row of Fig. 4 shows theresult of running the NPTF on a simulated data set thatdoes not include NFW-distributed PSs but does includeNFW DM. The model parameters used to generate thesimulated data are taken from the best-ﬁt values of theanalysis without NFW PSs on the real data. In this case,the ﬁtting procedure ﬁnds no evidence for NFW PSs, as itshould, and the Bayes factor in preference for NFW PSsis much less than 1. The source-count functions recoveredfor disk-correlated and isotropic PSs are consistent withthose used to generate the simulated data.The source-count function that we recover for NFWPSs in the IG diﬀers at low ﬂux from those previouslyconsidered in the literature, which were motivated bypopulation models and/or data for disk MSPs [19, 23,24, 41]. In particular, our source-count function seemsto prefer an increasing dN/d log F below the break, im-plying most sources lie close to the cutoﬀ luminosity,while previously-considered source-count functions tendto be ﬂatter or falling in dN/d log F . If conﬁrmed, thismay suggest novel features of the source population; how-ever, our results are also consistent with a ﬂat or falling dN/d log F within uncertainties.The results of the NPTF analyses presented herepredict a new population of PSs directly below thePS-detection threshold in the IG. We estimate fromthe 3FGL unmasked (masked) analysis that half ofthe excess within 10 ◦ of the GC with | b | ≥ ◦ may be explained by a population of 132 +31 − (86 +32 − )unresolved PSs, with ﬂux above 1 . +0 . − . × − (1 . +0 . − . × − ) photons / cm / s. The entire ex-cess within this region could be explained by 402 +159 − (258 +135 − ) PSs, although this estimate relies on extrapo-lating the source-count function to very low ﬂux, wheresystematic uncertainties are large. Detecting members ofthis PS population, which appears to lie just below thecurrent Fermi

PS-detection threshold, would be convinc-ing evidence in favor of the PS explanation of the ∼ GeVexcess.

Acknowledgements. — We thank S. Ando, K. Blum,D. Caprioli, I. Cholis, C. Dvorkin, D. Finkbeiner,D. Hooper, M. Kaplinghat, T. Linden, S. Murgia,N. Rodd, J. Thaler, and N. Weiner for useful discus-sions. We thank the

Fermi

Collaboration for the use of

Fermi public data and the Fermi Science Tools. B.R.Swas supported by a Pappalardo Fellowship in Physics atMIT. This work is supported by the U.S. Department ofEnergy under grant Contract Numbers DE-SC00012567,DE-SC0013999 and DE-SC0007968. B.R.S. and T.R.S.thank the Aspen Center for Physics, which is supportedby National Science Foundation grant PHY-1066293, forsupport during the completion of this work.

Note added in proof. —Recently, we became aware ofRef. [42], which studied the distribution of sub-thresholdPSs in the inner Galaxy using a wavelet technique. [1] W. Atwood et al. (Fermi-LAT), Astrophys.J. , 1071(2009).[2] L. Goodenough and D. Hooper (2009), 0910.2998. [3] D. Hooper and L. Goodenough, Phys.Lett.

B697 , 412(2011).[4] A. Boyarsky, D. Malyshev, and O. Ruchayskiy,Phys.Lett.

B705 , 165 (2011), 1012.5839.[5] D. Hooper and T. Linden, Phys.Rev.

D84 , 123005(2011), 1110.0006.[6] K. N. Abazajian and M. Kaplinghat, Phys.Rev.

D86 ,083511 (2012), 1207.6047.[7] D. Hooper and T. R. Slatyer, Phys.Dark Univ. , 118(2013), 1302.6589.[8] C. Gordon and O. Macias, Phys.Rev. D88 , 083521(2013), 1306.5725.[9] W.-C. Huang, A. Urbano, and W. Xue (2013), 1307.6862.[10] O. Macias and C. Gordon, Phys.Rev.

D89 , 063515(2014), 1312.6671.[11] K. N. Abazajian et al., Phys.Rev.

D90 , 023526 (2014),1402.4090.[12] T. Daylan et al. (2014), 1402.6703.[13] B. Zhou et al. (2014), 1406.6948.[14] F. Calore, I. Cholis, and C. Weniger, JCAP , 038(2015).[15] K. N. Abazajian et al. (2014), 1410.6168.[16] S. Murgia (Fermi-LAT) (Fermi Symposium, Nagoya,Japan, 2014).[17] M. Ackermann et al. (Fermi-LAT) (2015), 1503.02641.[18] K. N. Abazajian, JCAP , 010 (2011).[19] D. Hooper et al., Phys.Rev.

D88 , 083009 (2013).[20] N. Mirabal, Mon.Not.Roy.Astron.Soc. , 2461 (2013),1309.3428.[21] Q. Yuan and B. Zhang, JHEAp , 1 (2014).[22] F. Calore et al., Astrophys.J. , 1 (2014).[23] I. Cholis, D. Hooper, and T. Linden (2014), 1407.5625.[24] J. Petrovic, P. D. Serpico, and G. Zaharijas (2014),1411.2980.[25] Q. Yuan and K. Ioka (2014), 1411.4363.[26] R. M. O’Leary, M. D. Kistler, M. Kerr, and J. Dexter(2015), 1504.02477.[27] E. Carlson and S. Profumo, Phys.Rev.

D90 , 023015(2014).[28] J. Petrovic, P. D. Serpico, and G. Zaharijas, JCAP ,052 (2014).[29] S. K. Portillo and D. P. Finkbeiner, Astrophys.J. ,54 (2014).[30] K. Gorski et al., Astrophys.J. , 759 (2005).[31] D. Malyshev and D. W. Hogg, Astrophys.J. , 181(2011).[32] S. K. Lee, M. Lisanti, and B. R. Safdi, JCAP , 056(2015).[33] F. Acero et al. (Fermi-LAT) (2015), 1501.02003.[34] M. Su, T. R. Slatyer, and D. P. Finkbeiner, Astrophys.J. , 1044 (2010).[35] J. F. Navarro, C. S. Frenk, and S. D. White, Astrophys.J. , 563 (1996).[36] J. F. Navarro, C. S. Frenk, and S. D. White, Astrophys.J. , 493 (1997).[37] F. Feroz, M. Hobson, and M. Bridges,Mon.Not.Roy.Astron.Soc. , 1601 (2009).[38] J. Buchner et al., Astron.Astrophys. , A125 (2014).[39] A. A. Abdo et al. (Fermi-LAT), Astrophys. J. , 435(2010).[40] M. Ackermann et al. (Fermi-LAT), Astrophys.J. , 86(2015).[41] I. Cholis, D. Hooper, and T. Linden (2014), 1407.5583.[42] R. Bartels, S. Krishnamurthy, and C. Weniger (2015), , 17(2013).[44] T. Gr´egoire and J. Kn¨odlseder, Astron.Astrophys. ,A62 (2013).[45] M. Ackermann et al. (Fermi-LAT), Astrophys.J. , 3(2012).

Evidence for Unresolved Gamma-Ray Point Sources in the Inner Galaxy

Supplementary Material

Samuel K. Lee, Mariangela Lisanti, Benjamin R. Safdi, Tracy R. Slatyer, and Wei XueThe supplementary material is organized as follows. The analysis methods are described in Sec. I. Section IIestimates the

Fermi

PS-detection threshold. Extended results for the analyses described in the main text are thengiven in Sec. III. Section IV investigates a variety of systematic issues that may aﬀect the Inner Galaxy analysis.In Sec. V, we make a detailed comparison to previous work on the luminosity function for PSs in the Inner Galaxy.Section VI describes an additional test of the unresolved PS models utilizing the survival function. Lastly, Sec. VIIdescribes how the NPTF is validated using simulated data in the Inner Galaxy.

I. METHODS

This section describes in detail the analysis framework for the NPTF and the dataset used in this work.

A. Analysis framework

In the standard template-ﬁtting procedure, the photon-count probability distribution of observing k photons in agiven energy bin and pixel p is assumed to be Poissonian, p ( p ) k = ( µ p ) k e − µ p k ! , (S1)where µ p is the expected number of photons in the energy bin, summed over all the templates that contribute in thatpixel. In particular, µ p = (cid:88) (cid:96) α (cid:96) µ p,(cid:96) , (S2)where α (cid:96) is the best-ﬁt normalization of the (cid:96) th template in the given energy bin, and µ p,(cid:96) speciﬁes the spatialdependence of the template. That is, α (cid:96) µ p,(cid:96) gives the expected number of events in the energy bin and pixel p (accounting for an energy-dependent exposure, which may vary across the sky) for that template. Past studiesof the GeV excess have included templates that account for the spatial dependence of the diﬀuse and isotropicbackgrounds, the Fermi bubbles, and an NFW-distributed DM component. We have repeated the standard template-ﬁtting procedure and veriﬁed that our analysis pipeline reproduces results from those studies (see Sec. III B fordetails).Unlike the standard template-ﬁtting procedure, the NPTF does not assume that the photon counts are Poissonian.We compute the non-Poissonian photon-count probabilities using the method of generating functions. In particular,the total generating function for the photon-count probability distribution in each pixel is P ( p ) ( t ) = D ( p ) ( t ) · G ( p ) ( t ) , (S3)where t is an auxiliary variable, and D ( p ) ( t ) and G ( p ) ( t ) are the pixel-dependent generating functions for the diﬀuseand non-Poissonian PS contributions, respectively. The probability of observing k photons in pixel p is then p ( p ) k = 1 k ! d k P ( p ) dt k (cid:12)(cid:12)(cid:12) t =0 . (S4)If P ( p ) ( t ) = exp[ µ p ( t − Parameter Prior RangesHigh Latitude Inner Galaxy A iso [0 ,

10] [ − , A diﬀ [0 ,

10] [0 , A bub [0 ,

10] [0 , A NFW [ − ,

6] [ − , A PS [ − ,

6] [ − , S b [0 , k max ] [0 . , k max ] n [2 . ,

50] [2 . , n [ − , .

95] [ − , . A PS , S b , n , n ) when isotropic, disk-correlated, and NFW-distributed PS templates are allincluded in the ﬁt. The break S b is scanned up to k max , the maximum number of photons observed in a given pixel within theROI; in general, k max >

50 photons. • Isotropic Background

The isotropic-background template, which is uniform over the sky before correcting for exposure, is speciﬁed bythe normalization parameter A iso in the NPTF. We deﬁne A iso such that the number of counts arising from theisotropic background in pixel p is given by A iso α iso µ p, iso ; that is, A iso = 1 when the NPTF and the standardtemplate ﬁt agree exactly. The prior for A iso is taken to be linear ﬂat; the posterior is well-converged and istypically peaked around unity.The best-ﬁt pixel-averaged intensity for the isotropic background in a given ROI is I iso = A iso α iso ∆Ω (cid:68) µ p, iso (cid:15) ( p ) (cid:69) ROI , (S5)where ∆Ω is the pixel solid angle, (cid:15) ( p ) is the exposure in pixel p , and the angle brackets denote an average overthe pixels in the ROI. Note that µ p, iso ∝ (cid:15) ( p ) for the isotropic-background template. • Diﬀuse Background

The spatial dependence µ p, diﬀ of the diﬀuse-background template is based on the Fermi p6v11 model for themajority of our analyses; however, we do perform a number of cross-checks with other diﬀuse models. Aswith the isotropic-background template, the diﬀuse-background template is speciﬁed by a single normalizationparameter, A p, diﬀ , also deﬁned such that A p, diﬀ = 1 when the NPTF agrees exactly with the standard templateﬁt. We assume a linear-ﬂat prior, and the resulting best-ﬁt value is typically close to unity. The best-ﬁtpixel-averaged intensity I diﬀ is computed as in (S5). • Fermi Bubbles

The

Fermi -bubbles template assumes uniform emission within the bubbles [34] and is also speciﬁed by a singlenormalization parameter, A bub , deﬁned analogously as above. We similarly assume the prior for A bub is linearﬂat; the best-ﬁt value is again typically close to unity. The best-ﬁt pixel-averaged intensity I bub is computed asin (S5). • NFW Dark Matter

The spatial dependence of the DM template is determined by Φ p , which gives the integrated photon-count We neglect energy dependence in the NPTF and use only a single wide-energy bin, necessitating only one normalization parameter A l for each template. However, we do account for energy dependence when performing the standard template ﬁt, yielding individual valuesfor the normalization parameters α l for each of the narrower energy bins contained within the wide bin. Implicit in this expression andthose that follow is the appropriate sum over these bins. In Sec. IV B, we consider a class of diﬀuse models for which the π , bremsstrahlung, and ICS components are allowed to ﬂoat indepen-dently; in these cases, there are three normalization parameters for the diﬀuse background. intensity at the center of pixel p from DM annihilation in the Galactic halo. The intensity Φ p is computed bythe line-of-sight integral Φ( ψ ) ∝ (cid:90) los ds ρ [ r ( s, ψ )] , (S6)where ρ is the DM density proﬁle and r gives the distance from the GC. This work assumes a generalizedNavarro-Frenk-White (NFW) density proﬁle [35, 36] ρ ( r ) ∝ ( r/r s ) − γ (1 + r/r s ) − γ , (S7)where r s = 20 kpc is the scale radius. We take γ = 1 .

25 for the majority of the analyses.With the spatial dependence of µ p, NFW ﬁxed by the NFW intensity proﬁle and exposure map, the DM templatecan be speciﬁed by a single normalization parameter, A NFW , deﬁned analogously as above. We assume a priorthat is log ﬂat and covers a broad range (see Tab. S1). The normalization of A NFW is such that it is equalto unity for ∼

35 GeV DM annihilating into b ¯ b with cross section (cid:104) σv (cid:105) ≈ . × − cm /s, and a local DMdensity at the solar circle of 0.3 GeV/cm . • NFW Point Sources

Previous standard template ﬁts have included the above Poissonian templates. However, in the NPTF, PStemplates have non-Poissonian statistics, which can be derived from the source-count function. In the majorityof this work, the PS source-count function in a given pixel p is modeled as a broken power law dN p ( S ) dS = A p  (cid:16) SS b (cid:17) − n S ≥ S b (cid:16) SS b (cid:17) − n S < S b , (S8)where S b is the break, n , are the slopes above and below the break, and A p is a pixel-dependent normalization.We require n > n < S b , n , and n are all linear ﬂat.For the NFW PS template, the pixel-dependent normalization of the source-count function, A p , is assumed tobe proportional to µ p, NFW , as would be needed for the PSs to mimic a DM signal. We can thus specify theNFW PS template by the three source-count function parameters and an overall normalization parameter, A PS ,deﬁned such that A p = A PS µ p, NFW (cid:104) µ p, NFW (cid:105)

ROI . (S9)As with A NFW , the prior for A PS is log ﬂat and covers a broad range.We also quote the pixel-averaged intensity I PS , which is given by I PS = 1∆Ω (cid:68) µ PS p (cid:15) ( p ) (cid:69) ROI , (S10)with µ PS p the expected number of counts in pixel p arising from the PS population: µ PS p = (cid:90) dS S dN p dS = A p S b (cid:18) n − − n (cid:19) . (S11) • Isotropic Point Sources

The source-count function for the isotropic PS template is also modeled as a broken power law, as in (S8), exceptwith A p ∝ µ p, iso ∝ (cid:15) ( p ) . As above, the isotropic PS template can be speciﬁed by either its overall normalizationparameter A isoPS or the pixel-averaged intensity I isoPS . We fold the spatial dependence arising from both the NFW intensity proﬁle and the exposure into the pixel-dependent normalization A p as an approximation. Strictly speaking, the exposure correction should be modeled by a pixel-dependent break S b,p . However, thisincreases the computational complexity required to perform the NPTF. We use the same approximation to incorporate variations in theexposure map in the high-latitude analysis. While this approximation is probably valid in the IG, since the exposure map does not varysigniﬁcantly within ∼ ◦ of the GC, this may have an important eﬀect on the isotropic PS population in the high-latitude analysis. • Disk-correlated Point Sources

The thin-disk template is the projection along the line-of-sight of the source spatial distribution given by: n ( z, R ) ∝ exp (cid:20) − R (cid:21) exp (cid:20) −| z | . (cid:21) , (S12)where R , z are cylindrical polar coordinates. The source-count function associated with this template is modeledas a broken power law, as in (S8), except with A p ∝ µ p, disk .Given these templates and their associated generating functions, the overall photon-count probability distribution p ( p ) k ( θ ) can be written as a function of the 16 parameters θ = { A iso , A diﬀ , A bub , A NFW , A PS , S b , n , n , A isoPS , S iso b , n iso1 , n iso2 , A diskPS , S disk b , n disk1 , n disk2 } . (S13)Then, for a data set d consisting of the set of { n p } photon counts in each pixel p , the likelihood function for observinga particular photon-count distribution over all pixels in the ROI is p ( d | θ, M ) = (cid:89) p p ( p ) n p ( θ ) . (S14)With the priors speciﬁed above, this likelihood function can be used in the standard framework of Bayesian inferenceto compute both the posteriors and the evidence for models M that include various subsets of the parameters θ .We use the MultiNest package for the Bayesian calculations [37, 38]. All

MultiNest runs use 500 live points withimportance nested sampling disabled, constant eﬃciency mode set to false, and sampling eﬃciency set for model-evidence evaluation; the typical number of posterior samples generated for each run is ∼ . B. Data Selection Criteria

The NPTF analysis was performed using the Extended Pass 7 Reprocessed

Fermi data from ∼ August 4, 2008to ∼ December 5, 2013 made available by [29].

Ultraclean front-converting events with zenith angle less than 100 ◦ and “ DATA QUAL==1 && LAT.CONFIG==1 && ABS(ROCK.ANGLE) < 52 ” are selected, and a Q2 cut on the CTBCOREparameter is used to remove events with poor directional reconstruction. We have used the original CTBCORE-cutdataset throughout this work [29], but we have tested the eﬀect of including

Fermi data up to March 8, 2015 (missionweek 353) and the results are consistent with those presented here. Additionally, we have rerun the analysis withPass 8

Fermi data up to June 3, 2015, using the new

Ultracleanveto class and the top quartile of events ranked byPSF (denoted PSF3). We adopt the recommended data quality cuts for this analysis, which are slightly diﬀerent tothose in our main analysis ( e.g. , the zenith angle cut has been reduced from 100 ◦ to 90 ◦ and the rocking angle cuthas been removed). Again, the results are consistent with those presented in this work.The main body of the Letter focused primarily on two regions of interest: a high-latitude analysis with | b | ≥ ◦ andan IG analysis that included all pixels within 30 ◦ of the GC, with | b | ≥ ◦ . These regions are shown in Fig. S1. Whenmasking identiﬁed PSs from the Fermi × . ◦ of the source are excluded. Thismask is suﬃciently large to completely contain the ﬂux from the majority of the PSs; the results do not qualitativelychange as the mask size is varied, for example, to 7 × . ◦ .If the detector’s point spread function (PSF) is comparable to or larger than the pixel size, photons from a PS in agiven pixel can leak into neighboring pixels. As a result, the PSF must be properly accounted for in the calculation ofthe photon-count probability distributions for the PS templates [31, 32]. To model the PSF for the CTBCORE-cutdata, we use the instrument response functions made available by [29] to approximate the PSF as a two-dimensionalGaussian with energy-dependent width, σ (see Sec. IV D for details). The NPTF analysis includes data from a singleenergy bin from ∼ σ varies from ∼ . ◦ at low energies to ∼ . ◦ at high energies. In calculatingthe photon-count probability distributions for the PS components, we use the energy-averaged value for σ assumingthe spectrum of the excess (high-latitude 3FGL PSs) for all PS templates in the IG (high-latitude) analysis. Wefurther explore systematic uncertainties arising from the PSF in Sec. IV D. http://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone Data Exploration/Data preparation.html The Gaussian PSF is used to smear the NFW DM map. In contrast, the diﬀuse-background model is smeared with the exact

Fermi

PSF using the Fermi Science Tools to construct the diﬀuse-background template.

II. THE

FERMI -LAT POINT-SOURCE DETECTION THRESHOLD

The analysis presented in this Letter does not rely on any prescription for the sensitivity of

Fermi to PSs, exceptimplicitly via masking out the known sources. However, it is still worth considering whether the PS population weinfer should already have been resolved as individual sources, given the nominal sensitivity of

Fermi .To some degree, the detection threshold can be read oﬀ directly from ﬁgures such as Fig. 2; the threshold may beresponsible for the turnover in the histogram of known sources between ∼ × − photons/cm /s. However, ouranalysis is restricted to a single high-energy bin; the “faintest” resolved sources in our sample include sources that arenot intrinsically faint but have very soft spectra, and so have very few photons at high energy. The analysis by whichthe 3FGL catalog was created used a wider range of energies, and so it is entirely possible that a source might bedetected by its bright emission at low energies, yet be too faint to be independently detected in our 1.893–11.943 GeVenergy range, thus giving the false appearance that very faint sources can be detected. The spectrum of the excess israther hard relative to the known 3FGL sources in our ROI, so PSs comprising the excess will tend to become lessdetectable relative to the resolved 3FGL sources in lower energy bands.Estimates for the PS sensitivity in the literature generally assume a speciﬁc source spectrum and are based on feweryears of Fermi data than our current analysis (although they may have similar statistics, because we are using front-converting

Ultraclean events and have applied further cuts on the CTBCORE parameter). The

Fermi

Collaborationhas presented longitude-averaged sensitivity estimates based on the integral ﬂux from 0.1–100 GeV with three yearsof data, for sources with a pulsar-like spectrum [43]; for Galactic latitudes with 2 ◦ < | b | < ◦ , the mean sensitivityvaries from ∼ . × − erg/cm /s, with the 10% percentile sensitivity ranging from ∼ . . × − erg/cm /s.The average over longitude probably leads to a somewhat lower threshold than is accurate in the neighborhood of theGC, but on the other hand, the 3FGL catalog is based on four years of data rather than three.Let us assume the spectrum of the excess, and all sources comprising it, is approximately given by the best-ﬁtbroken power-law spectrum of [14], with indices ∼ ∼ .

06 GeV. Then, a luminosity of 10 − erg/cm /s in the 0.1–100 GeV band corresponds to ∼ × − photons/cm /s in our energy band. Thus, a threshold of 3 . × − erg/cm /s in integrated energy ﬂux translates toa photon-ﬂux threshold ∼ . × − photons/cm /s in the relevant energy range, for the 10% percentile sensitivity10 ◦ from the Galactic plane. This is directly above the break in our inferred source-count functions. III. EXTENDED RESULTS

This section includes extended discussion of the results presented in the main text. counts l a t i t u d e [ d e g r ee s ] coun t s coun t s FIG. S1: The counts map for the high-latitude ( | b | ≥ ◦ ) analysis (clipped at 15 counts) is shown in the left panel. The IGanalysis focuses on the region within 30 ◦ of the GC, with | b | ≥ ◦ . The associated counts map (clipped at 50 counts) is shownin the right panel. All pixels within ∼ ◦ of known Fermi -8 -7 intensity [ photons/cm /s/sr ]0.000.050.100.150.200.250.30 p o s t e r i o r p r o b a b ili t y -8 -7 intensity [ photons/cm /s/sr ]0.000.050.100.150.200.250.30 p o s t e r i o r p r o b a b ili t y -8 -7 intensity [ photons/cm /s/sr ]0.000.050.100.150.200.250.30 p o s t e r i o r p r o b a b ili t y -8 -7 intensity [ photons/cm /s/sr ]0.000.050.100.150.200.250.30 p o s t e r i o r p r o b a b ili t y FIG. S2: The posterior probabilities for the photon intensity associated with a given template in the high-latitude analysiswith 3FGL sources unmasked (left column) and masked (right column). The top row shows the result without a PS template( i.e. , the standard template ﬁt), while the bottom row shows the result of the NPTF with an isotropic PS template included.Whether or not the 3FGL sources are masked, the isotropic template in the standard template ﬁt clearly absorbs intensity thatis assigned to the isotropic PS template in the NPTF. Dashed vertical lines indicate the 16 th , 50 th , and 84 th percentiles. A. High latitudes

Fig. S2 shows the posterior probabilities for the photon intensity associated with each template in the high-latitudeanalysis ( | b | ≥ ◦ ), with 3FGL sources unmasked and masked (left and right columns, respectively). The top rowshows the results when only the isotropic, diﬀuse, and bubbles templates are included in the ﬁt. The best-ﬁt intensityvalues for the diﬀuse and bubbles components are essentially constant, regardless of whether or not the 3FGL sourcesare masked. However, the isotropic template is sensitive to the presence of the 3FGL sources, with its median best-ﬁtintensity increasing from ∼ . × − to ∼ . × − photons/cm /s/sr when going from the right to left panels.When the isotropic PS template is included in the NPTF at high-latitudes (bottom row in Fig. S2), there is noeﬀect on the ﬂux absorbed by the diﬀuse and bubbles template. In this case, the isotropic template is insensitive(within uncertainties) to the masking of the 3FGL sources, with a photon intensity that remains unchanged betweenthe left and right panels. Instead, the isotropic PS template accounts for the resolved PSs when the 3FGL sourcesare unmasked, and picks up ﬂux (presumably from unresolved PS emission) when the 3FGL sources are masked.The intensity of the IGRB that is predicted by the NPTF can be compared with published values from the Fermi collaboration [40]. Applying a standard template analysis to 50 months of

Fermi

LAT data, [40] determined theIGRB spectrum from 100 MeV to 820 GeV. The brightest PSs from the 2FGL catalog (the predecessor of the3FGL catalog) were ﬁtted individually, while a standard template was used to model the other identiﬁed sources.Three foreground models—labeled A, B, C—were studied to better understand the eﬀects of variations in the diﬀuseemission. For each foreground model, the spectrum of the IGRB intensity was ﬁt to a broken power law with anexponential cutoﬀ. According to our estimates, the results in [40] predict IGRB intensities of I A iso = 1 . × − , I B iso = 1 . × − , I C iso = 1 . × − photons/cm /s/sr in the energy range from 1.893–11.943 GeV for the threemodels, with systematic uncertainties on the order of 10% in each case. The intensity of the IGRB computed usingthe NPTF ( e.g. , 1 . +0 . − . × − photons/cm /s/sr for the 3FGL-masked ﬁt) is generally in agreement with theresults of [40]. However, a direct comparison is diﬃcult to make because, for example, the analyses treat the knownPSs diﬀerently, use diﬀerent PS catalogs, use diﬀerent data sets, and use diﬀerent foreground models. It would beinteresting to repeat the analysis in [40] including a non-Poissonian PS template to more accurately extract thecontribution of unresolved PSs to the total EGB.The central analyses in this work assume the source-count function parameterization in (S8), however we alsoconsider the eﬀect of forcing the source-count function to have a sharp cutoﬀ at high ﬂux, while still allowing a breakat lower ﬂux. For example, we can impose a high-ﬂux cutoﬀ consistent with the ﬂux of the brightest 3FGL PS in theROI: F cut ≈ . × − photons / cm / s. In this case, the NPTF gives the following best-ﬁt values: n = 1 . +0 . − . , n = 1 . +0 . − . , and F b = 1 . +1 . − . × − photons / cm / s. Additionally, I isoPS = 1 . +0 . − . × − photons/cm /s/sr,which—subtracting the intensity of the identiﬁed 3FGL PSs—predicts that the unresolved PSs have an intensity4 . +0 . − . × − photons/cm /s/sr. This is in good agreement with the estimate obtained from the 3FGL-maskedNPTF described in the main body of the Letter ( e.g. , I isoPS = 4 . +0 . − . × − photons/cm /s/sr). Indeed, theagreement is better than that obtained from estimating the intensity of the unresolved PS population using thesource-count function without the high-ﬂux cutoﬀ ( e.g. , 7 . +0 . − . × − photons/cm /s/sr). B. Inner Galaxy

Figs. S3 and S4 show the posterior probabilities for the IG analysis (within 30 ◦ of the GC, with | b | ≥ ◦ ), with3FGL sources unmasked and masked, respectively. In both cases, the parameters are all well-constrained within theprior ranges, with one exception. Namely, the slope n above the break of the PS source-count functions may havelarge error bars. This tends to happen in the masked analyses, where the source-count functions fall oﬀ steeply nearthe detection threshold.As shown in Figs. S3 and S4, the NPTF ﬁnds that the average intensity of the diﬀuse emission in this region is I diﬀ = 59 . +0 . − . × − (58 . +0 . − . × − ) photons/cm /s/sr with 3FGL sources unmasked (masked); the predictedintensities are similar in both scenarios, as desired. Additionally, these intensities change by less than ∼

1% from therespective values obtained by the standard template procedure ( i.e. , when only an NFW DM template is included).The fact that the intensity of the diﬀuse emission is essentially constant between the NPTF and standard templateanalysis highlights that the addition of the PS templates does not aﬀect the ﬂux absorbed by the diﬀuse backgroundtemplate in the IG region.One cross-check of the results is to compare the predicted fractions of ﬂux from DM in the template ﬁts that donot include PSs—where the excess appears to be absorbed by the NFW DM template—to results found in previoustemplate studies. Ref. [12] performed two analyses, with diﬀerent ROIs, that are relevant for this comparison. TheROI for the ﬁrst analysis was a 40 ◦ × ◦ region around the GC, while the second analysis used the full sky. In bothcases, the plane was masked ( | b | ≥ ◦ ) in addition to the 300 brightest and most variable PSs in the 2FGL PS catalog,using estimated 95% containment masks. A spectral and spatial model was also included for the remaining 2FGLsources, based on their positions and spectra in the catalog. Ref. [12] used the same Poissonian templates as we do,so we can compare their results to our ﬁts that do not include NFW PSs.Using the results of [12], we can compute the predicted fraction of ﬂux from DM relative to the total number ofobserved counts within 10 ◦ of the GC, with | b | ≥ ◦ , no PSs masked, and within the energy range considered in thisLetter. For the restricted (full-sky) ROI, we ﬁnd that [12] predicts the fraction of ﬂux from DM to be 9 . ± . . ± . While the parameter n is peaked within the prior range, it is also not very well-constrained by the ﬁt because it mostly aﬀects thelow-ﬂux part of the distribution. Widening the prior range for n to values well below − n in some analyses, which skews the posterior for this parameter towards negative values. This has the eﬀect of widening thesource-count function conﬁdence interval at low ﬂux. With that said, we ﬁnd that the mean source-count function is not aﬀected bybroadening this prior. One reason we present results with the prior constraint n > − F , are in good agreement in all analyses at low ﬂux in this case. Similarly, in the PS-masked ﬁt, themedian source-count function may signiﬁcantly diﬀer from the mean at high ﬂux, above the PS-detection threshold, as the source-countfunction falls steeply. However, it should be understood that all our results are subject to considerable systematic uncertainties wellbelow the PS-detection threshold. In the 3FGL-masked ﬁt, the source-count function falls steeply above the PS-detection threshold,and here as well systematic uncertainties are expected to be large. See Sec. IV G for more details. We only present a detailed check of our ﬂux fractions against the results of [12] because the results of Ref. [14] are consistent with thoseof [12], within estimated systematic uncertainties. I diff = . +0 . − . . . . . I b u b I bub = . +0 . − . . . . . I i s o I iso = . +0 . − . . . . . I N F W I NFW = . +0 . − . I N F W P S I NFWPS = . +0 . − . n N F W n NFW = . +8 . − . n N F W n NFW = − . +0 . − . S N F W b S NFWb = . +1 . − . I d i s k P S I diskPS = . +0 . − . n d i s k n disk = . +8 . − . n d i s k n disk = . +0 . − . S d i s k b S diskb = . +85 . − . I i s o P S I isoPS = . +0 . − . n i s o n iso = . +8 . − . n i s o n iso = . +0 . − .

25 50 75 100 I diff S i s o b . . . . I bub . . . . I iso .

15 0 .

30 0 .

45 0 . I NFW I NFWPS n NFW n NFW

10 20 30 40 50 S NFWb I diskPS n disk n disk

150 300 450 600 S diskb I isoPS n iso n iso

20 40 60 80 100 S isob S isob = . +13 . − . FIG. S3: Triangle plot for the IG analysis with the 3FGL sources unmasked. Intensities are in units of 10 − photons/cm /s/srand are calculated with respect to the region within 10 ◦ of the GC, with | b | ≥ ◦ . I diff = . +0 . − . . . . . I b u b I bub = . +0 . − . . . . . I i s o I iso = . +0 . − . . . . . I N F W I NFW = . +0 . − . I N F W P S I NFWPS = . +0 . − . n N F W n NFW = . +7 . − . n N F W n NFW = − . +1 . − . S N F W b S NFWb = . +2 . − . I d i s k P S I diskPS = . +0 . − . n d i s k n disk = . +8 . − . n d i s k n disk = − . +1 . − . S d i s k b S diskb = . +41 . − . I i s o P S I isoPS = . +0 . − . n i s o n iso = . +8 . − . n i s o n iso = − . +1 . − .

25 50 75 100 I diff S i s o b . . . . I bub . . . . I iso .

15 0 .

30 0 .

45 0 . I NFW I NFWPS n NFW n NFW

10 20 30 40 50 S NFWb I diskPS n disk n disk

150 300 450 600 S diskb I isoPS n iso n iso

20 40 60 80 100 S isob S isob = . +36 . − . FIG. S4: Triangle plot for the IG analysis with the 3FGL sources masked. Intensities are in units of 10 − photons/cm /s/srand are calculated with respect to the region within 10 ◦ of the GC, with | b | ≥ ◦ . Fermi [19, 23, 44]. We have also considered the case of a thin disk with scale heightand radius of 0.3 and 10 kpc, respectively. In both these cases, the results of our analyses remain essentially the same.One substantial diﬀerence between our work and previous studies of the excess is that we include no energyinformation, using only a single large energy bin. In future work, it would be very useful to incorporate energydependence into the NPTF likelihood function in order to extract a spectrum for the PS population. It would alsobe useful to understand whether DM substructure could give rise to all or part of the excess. In this Letter, we havemodeled DM emission as smooth emission, but it may be the case that the DM emission is more PS-like because of,for example, DM subhalos.

IV. SYSTEMATIC UNCERTAINTIES IN THE NPTF

There are various systematic uncertainties that may inﬂuence the IG analysis, which we examine in some detail inthis section. A. Broadening the ROI

As a ﬁrst cross check of our results, we perform the NPTF on the full sky with | b | ≥ ◦ . This analysis can be usedto test the North-South symmetry of the excess by masking each hemisphere in turn. The results for the full-skyand hemisphere analyses are similar, so only the latter are presented here. Additionally, we only show results for the3FGL-unmasked analysis, since, when when masking 3FGL sources, the fraction of the sky masked near the GC isdiﬀerent in the North versus the South.Fig. S5 shows the best-ﬁt source-count function and ﬂux fractions for the Northern and Southern hemispheres inthe top and bottom rows, respectively. The source-count functions for the two regions are consistent within statisticaluncertainties. In both cases, the orange band cuts oﬀ steeply around F ∼ × − photons/cm /s. The inferredﬂux fraction of NFW PSs in the Northern (Southern) hemisphere analysis is 7 . +0 . − . % (10 . +1 . − . %) in the regionwithin 10 ◦ of the GC, with | b | ≥ ◦ , while the Bayes factor in favor of the model with NFW, disk, and isotropicPSs over the model without NFW PSs is ∼ ( ∼ ) in the Northern (Southern) hemisphere analysis. We concludethat there are no signiﬁcant asymmetries in the inferred NFW PS population between the Northern and Southernhemispheres. B. Varying the Diﬀuse Model

A potentially serious source of systematic uncertainty is due to limitations in modeling the diﬀuse gamma-raybackground arising from the propagation of high-energy cosmic rays in the Galaxy. The primary contributions comefrom bremsstrahlung of high-energy cosmic rays passing through the interstellar gas, inverse Compton scattering ofCosmic Microwave Background, interstellar, and infrared radiation oﬀ high-energy electrons, and the decay of boostedpions produced in inelastic proton collisions with the interstellar gas. Modeling the cosmic-ray emission depends onmany factors, including the location and spectrum of source injection, the gas distribution, magnetic ﬁelds, and theinterstellar radiation ﬁeld, as well as the diﬀusion parameters. Repeating the NPTF with diﬀerent diﬀuse-backgroundmodels can help to quantify the eﬀects of mis-modeling the diﬀuse background. We have additionally performed multiple tests that we do not show here for brevity. For example, we have performed analyses withisotropic and isotropic PS emission constrained by the results of the high latitude analyses. In general, constraining the priors for thesetemplates is found to increase the evidence in favor of the model with NFW PSs. Thus, to be conservative, we leave the priors for thesetemplates unconstrained throughout the Letter. p o s t e r i o r p r o b a b ili t y Northern hemisphere masked

Disk PSNFW PSIso. PSNFW DM0 5 10 15 200.00.10.2 No NFW PS Template p o s t e r i o r p r o b a b ili t y Southern hemisphere masked

Disk PSNFW PSIso. PSNFW DM0 5 10 15 200.00.10.2No NFW PS Template

FIG. S5: Results obtained by applying the NPTF to a full-sky map with either the Northern (top row) or Southern (bottomrow) hemispheres masked. The templates included are: isotropic, diﬀuse, bubbles, NFW DM, NFW PSs, disk PSs, and isotropicPSs. The best-ﬁt source-count functions (with 68% conﬁdence intervals shaded) are shown in the left column, and the posteriorprobabilities for the ﬂux fractions are shown in the right column. The source-count functions are plotted with respect to theregion within 10 ◦ of the GC with b ≥ ◦ ( b ≤ − ◦ ) for the Northern (Southern) analysis. The number of 3FGL sources in theseregions is indicated. The ﬂux-fraction plots are calculated for the region within 10 ◦ of the GC with | b | ≥ ◦ in both cases. Theinset shows the posterior probabilities for the ﬂux fractions when the NFW PS template is not included. The primary results presented in this work used the

Fermi p6v11 diﬀuse model. This model is not the most recentto be released by

Fermi . The more recent

Fermi p7v6 diﬀuse model includes contributions from large-scale diﬀusesubstructures such as Loop 1 and the bubbles. As found in [12], repeating the template analysis with the p7v6 modeldoes not qualitatively aﬀect the results for the GeV excess, except for the fact that the overall ﬂux absorbed by theDM template is reduced. This may be due to the largely data-driven p7v6 diﬀuse model having absorbed part of theGeV excess.The left panel of Fig. S6 shows the best-ﬁt source-count functions obtained for the IG study using the p7v6 diﬀusemodel, with all other templates the same. Comparing to Fig. 2, we see that the overall features are recovered. Theright panel of Fig. S6 shows the corresponding ﬂux fraction plot. In this case, the ﬂux fraction absorbed by the NFW(disk) PS template is ∼

4% (6%); when the NFW PS template is removed from the ﬁt (inset, right panel), the majorityof its corresponding ﬂux is absorbed by the NFW DM template instead. The main diﬀerence with the p6v11 resultsis that the ﬂux associated to NFW PSs is lower with p7v6 , which is consistent with previous ﬁndings of a smaller ﬂuxfraction for the GeV excess in p7v6 studies. As a result, the Bayes factor in preference for the NFW PS template isreduced to ∼ in this case. Figure S7 shows the corresponding results for p7v6 when the 3FGL sources are masked. Now, the NPTF ﬁnds noexcess ﬂux in the ROI; the ﬂux fractions for the PS templates, as well as the NFW DM template, are all consistent This Bayes factor increases to ∼ with Pass 8 data. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS p o s t e r i o r p r o b a b ili t y NFW PSDisk PSIso. PSNFW DM

FIG. S6: Same as Fig. 2, except using the

Fermi p7v6 diﬀuse background model. with 0%. The fact that no ﬂux is picked up by the NFW DM template is diﬀerent from what was previously observedin [12]. We have veriﬁed that this is due to the larger PS mask implemented here, which removes a considerableregion close to the Galactic Center; repeating the analysis with the PS masking of [12] recovers their result.In addition to studying the p7v6 diﬀuse model, we also consider thirteen other diﬀuse models from [45]: S L Z R T C S L Z R T C S L Z R T C S L Z R T C S L Z R T C S O Z R T C S Y Z R T C S S Z R T C S S Z R T C S S Z R T C S S Z R T C S O Z R T C S L Z R T C

5. These models are chosen to span variations in the source distributions, the diﬀusion scale heightand radius, the gas spin temperature, and cuts on the magnitude of E(B-V). For these models, we include separatetemplates for the π , bremsstrahlung, and ICS components. The prior ranges for each of these three templates isdealt with in the same way as the diﬀuse-model prior ranges for the p6v11 and p7v6 diﬀuse models; in all cases,the three component templates are well-converged within the prior ranges. For brevity, we only show results for the3FGL-masked analyses. Additionally, for computational reasons, we only include a non-Poissonian template for NFWPSs. When only this PS template is included, the Bayes factor in preference for the model with NFW PSs relative tothat without is ∼ ( ∼ ) for the p6v11 ( p7v6 ) diﬀuse model.Fig. S8 summarizes the results, showing the spread in the median source-count functions (dashed black lines)obtained by running the NPTF in the IG with each of these diﬀuse models. The orange band indicates the combinationof the thirteen 68% conﬁdence intervals. While there is some spread in the predicted source-count function, the NPTFconsistently ﬁnds a non-zero ﬂux for the NFW PS contribution in all cases, ranging from ∼

3% to ∼ ∼ –10 for all diﬀuse emission scenarios considered. In this sense, these thirteen diﬀuse models appear more similar to p6v11 than to p7v6 . -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS p o s t e r i o r p r o b a b ili t y NFW PSDisk PSIso. PSNFW DM

FIG. S7: Same as Fig. 3, except using the

Fermi p7v6 diﬀuse background model. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] p6 median3FGL PS FIG. S8: Same as Fig. 3, except showing the median source-count functions for the additional thirteen diﬀuse models describedin the text in dashed black; the median function for the p6v11 model is shown in solid blue, for comparison. The orange bandis the overlap of the 68% conﬁdence bands for all thirteen models. For computational reasons, the only PS template includedin these tests is that for NFW PSs.

For the model S L Z R T C

5, which provided the formal best ﬁt in a previous analysis of the IG [14], wealso test the eﬀect of adding an independent diﬀuse template with free normalization, corresponding to the predictedgas-correlated emission (pion decay and bremsstrahlung) within the innermost Galactocentric ring. The addition ofthis template does not signiﬁcantly alter the source-count function or ﬂux fraction attributed to the NFW PSs.

C. Scan Along the Galactic Plane

To address the concern that the PS templates may be erroneously absorbing contributions from the diﬀuse back-ground, we repeat the NPTF in regions centered at diﬀerent points along the Galactic plane. Previous studies havereported additional bright excesses along the plane, with the most signiﬁcant at l = 30 ◦ [12, 14]. The residual emissionin this region is roughly similar to that at the GC; however, its energy spectrum is softer, suggesting a diﬀerent origin.Repeating the NPTF on another region of sky—with an excess that is unlikely to arise from PSs—allows us to testwhether the ﬁtting procedure can adequately distinguish between extra diﬀuse emission on top of a mis-modeleddiﬀuse background and extra PS emission.We apply the NPTF to regions of the sky within 30 ◦ of the central points ( l, b ) = (30 n ◦ , ◦ ), where n = − , − , . . . , | b | ≥ ◦ throughout. The same templates are included as in the IG analysis, except the NFW templates arecentered at the middle of each ROI. The most signiﬁcant excess is found for the region centered at ( l, b ) = (30 ◦ , ◦ ),consistent with previous ﬁndings. Fig. S9 shows the best-ﬁt source-count functions for the diﬀerent PS templates inthe 10 ◦ region centered at this point. The disk template (solid, blue) successfully recovers the known PSs in this p o s t e r i o r p r o b a b ili t y NFW PSDisk PSIso. PSNFW DM

FIG. S9: Same as Fig. 2, except repeating the NPTF in the region within 30 ◦ of ( l, b ) = (30 ◦ , ◦ ), with | b | ≥ ◦ . ∼ no unresolved PSs in this ROI.The right panel of Fig. S9 shows that the ﬂux fractions of the NFW DM and disk PS components are comparablein this region, while the isotropic and NFW PS templates pick up negligible ﬂux. Excluding the NFW PS template(inset) does not signiﬁcantly aﬀect the DM and disk PS ﬂux fractions. The Bayes factor in favor of the model withNFW PSs relative to that without is ∼ D. Point Spread Function

An accurate PSF is a crucial component of the NPTF procedure, as an incorrect parametrization can lead to over-or under-estimation of the PS component. The instrument response functions for the

Fermi

CTBCORE-cut datawere made available by [29]. For a given energy range, the authors of [29] provide the 68% and 95% containmentradii, R and R .We model the PSF as a two-dimensional Gaussian P ( x, σ ) = 12 πσ exp (cid:20) − x σ (cid:21) , (S15)setting σ such that the 68% containment radius of (S15) is equal to R . The value of σ varies from ∼ ◦ at thelowest end to ∼ ◦ at the highest end of the energy bin (1.893-11.943 GeV). As described earlier, the isotropic, Fermi bubbles, and NFW DM templates are smeared with this PSF. (The diﬀuse-background template is smearedwith the exact energy-dependent PSF using the

Fermi

Science Tools.) The PSF must also be properly accounted forin the generating-function formalism that is used to obtain the non-Poissonian likelihood function (see [32] for furtherdetails).It is known that the real PSF has power-law tails that are not captured by (S15) [29]. One might rightfully beconcerned that ignoring these power-law tails can lead to mischaracterization of the PS population. To illustrate theeﬀect of mis-modeling the PSF, Fig. S10 shows the best-ﬁt source-count function for the IG analysis assuming extremevalues for the Gaussian-PSF parameter σ : σ = 0 . ◦ (0 . ◦ ) in the left (right) panel. When the assumed PSF is toowide, photons in neighboring pixels arising from diﬀuse emission or unresolved sources may instead be erroneouslyinterpreted as photons from a single PS. Thus, using a PSF that is wider than the true PSF will underestimatethe number of unresolved PSs and overestimate the number of observed sources, relative to the true source-countdistribution. This eﬀectively shifts the source-count function to large ﬂux values, as observed in the left panel ofFig. S10.In comparison, using a PSF with a width that is too small underestimates the number of bright PSs. For example,in the extreme limit, assuming a delta-function PSF would require all photons from a given PS to be contained in asingle pixel, while the photons from bright PSs are, in actuality, smeared over several pixels by the true PSF. In this -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS

FIG. S10: Same as the left panel of Fig. 2, except setting the PSF parameter σ = 0 . ◦ (left) and σ = 0 . ◦ (right). Whenmasking identiﬁed 3FGL sources, all pixels within 5 × . ◦ are excluded. σ do not signiﬁcantly aﬀect the results. In Sec. VII, we further verifyour treatment of the PSF using simulated data. E. Radial Distribution Proﬁle

The analyses presented thus far have ﬁxed the inner slope of the generalized NFW density proﬁle to be γ = 1 . γ ≈ . .

4. Variations in γ have negligible eﬀects on our ﬁnalresult. Fig. S11 shows the best-ﬁt source-count function in the inner 10 ◦ with | b | > ◦ , assuming γ = 1 . . F. Diﬀuse-Correlated Point-Source Template

The conclusions described in this Letter provide evidence that the GeV excess can be accounted for by a populationof unresolved PSs. One possible systematic issue, however, is that the preference for unresolved PSs is driven bylocalized structure in the diﬀuse gamma-ray background that is not captured by our background model. As a testof this hypothesis, we repeat the NPTF in the IG adding a PS template that traces the diﬀuse model. Fig. S12shows the diﬀuse background in the ROI. Because the ﬂux from the diﬀuse emission is larger closer to the plane,a diﬀuse-correlated (diﬀ-corr) PS template is not only sensitive to localized structure in the diﬀuse model, but alsoto the presence of a disk-like population of PSs. For example, if the unresolved PSs are preferentially located closeto the plane, then they may be absorbed by the diﬀ-corr PS template. Breaking the degeneracy between these twointerpretations requires more careful study, which we postpone to future work. However, the preliminary results areilluminating, so we share them here.The left panel of Fig. S13 shows the best-ﬁt source-count functions obtained when doing the NPTF in the IGregion (without masking the 3FGL sources), including a diﬀ-corr PS template in addition to disk-correlated andNFW PS templates, along with the standard Poissonian templates. Here, the diﬀ-corr PS template only extends to10 ◦ from the GC. To simplify the analysis, we do not include the subdominant isotropic PS template. The best-ﬁt ﬂuxnormalization for the diﬀuse background is consistent with that obtained from the high-latitude analysis. In addition,the best-ﬁt NFW DM normalization is consistent with zero, and the recovered source-count function parameters fordisk and NFW PSs are consistent with those found when not including the diﬀ-corr PS template, since the diﬀ-corrPS template does not absorb a signiﬁcant fraction of the ﬂux. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] NFW PSDisk PSIso. PS3FGL PS

FIG. S11: Same as the left panel of Fig. 2, except using γ = 1 . γ = 1 . l a t i t u d e [ d e g r ee s ] coun t s coun t s FIG. S12: The

Fermi p6v11 diﬀuse background in the IG region (smoothed using Fermi Science Tools), with | b | < ◦ masked.Counts are clipped at 30. As a point of comparison, we repeat the procedure letting the diﬀ-corr PS template have support over the full IGregion. Now, there is a potential degeneracy between the diﬀ-corr PS template, the disk-correlated PS template, andthe diﬀuse template. To break some of this degeneracy, we ﬁx the disk-correlated PS parameters to their best-ﬁtvalues from the scan including disk-correlated, isotropic, and NFW PSs. These results are shown in the right panelof Fig. S13. Again, the best-ﬁt NFW DM normalization is consistent with zero, however the source-count functionfor the NFW PSs changes. In particular, the source-count function for NFW PSs is shifted to lower ﬂux, potentiallysuggesting that some of the near-threshold sources could either be more disk-like in morphology or associated withmis-modeling the diﬀuse background. However, the preference for NFW PSs remains high, with the model includingNFW PSs preferred over that without by a Bayes factor ∼ . Unlike the previous analysis that used a truncateddiﬀ-corr PS template, here the best-ﬁt normalization of the diﬀuse-background template is lower than its best-ﬁt valueat high latitudes. When the NFW PS template is not included, the normalization can be shifted down by as much as20%; the best-ﬁt normalization of the diﬀuse template is still shifted down by ∼

10% when the NFW PS template isincluded.We caution the reader that these results are subject to considerable uncertainty due to the large number of pa-rameters in the ﬁtting procedure and the potential degeneracies between them. In addition, this analysis does notappear to be robust to changing the size of the ROI; as the ROI is increased, the best-ﬁt normalization of the diﬀusebackground approaches its value from high latitudes, and the ﬂux absorbed by the diﬀ-corr PS template decreases. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Diff-corr PS extending to 10 ◦ from GC NFW PSDiff. PSDisk PS3FGL PS -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Diff-corr PS extending to 30 ◦ from GC NFW PSDiff. PSDisk PS3FGL PS

FIG. S13: Best-ﬁt source-count functions for PSs within 10 ◦ of the GC and | b | ≥ ◦ , with the 3FGL sources unmasked, formodels with NFW PS (dashed, orange), diﬀuse-correlated PS (dotted, pink), and thin-disk PS (solid, blue) components. Forthis analysis, the NPTF includes an additional template corresponding to diﬀuse-correlated PSs. This new template is takento have support either in the inner 10 ◦ (left) or over the full ROI (30 ◦ from the GC with | b | ≥ ◦ ) (right). For the latter case,parameters for the thin-disk PS component are ﬁxed to the best-ﬁt values found in the standard unmasked IG analysis. G. Binned Source-Count Functions

The imposition of a broken power law for the source-count function might be over-constraining in some cases. Forexample, it could lead to the apparent exclusion of a DM component because the extrapolation to low ﬂux of thesource-count function from high-ﬂux sources is already too large. Furthermore, the broken-power law prescription mayyield misleading results for the true uncertainty on the source-count function at low ﬂux. To address these concerns,we present preliminary results from an alternate analysis where the number of sources in each ﬂux bin is allowed toﬂoat independently; we use the seven logarithmically-spaced ﬂux bins shown in Fig. S14. Within each bin, dN/dF is constant as a function of F . The source-count function model parameters are seven normalization parameters, onefor each bin, which are taken to have log-ﬂat priors over the range indicated for log A PS in Tab. S1.Using the binned source-count function, we perform the NPTF on the 3FGL-masked IG ROI. For simplicity, weleave out isotropic and disk-correlated PS templates from this analysis. This is justiﬁed by the fact that leaving outthese two templates from the broken power-law analyses does not qualitatively aﬀect the results for the NFW PSand NFW DM components. We ﬁnd that the DM ﬂux fraction is consistent with zero, while the NFW PS templateabsorbs the excess ﬂux.As shown in Fig. S14, we recover a source-count function broadly consistent with the original analyses assumingbroken power laws. The blue (pink) bands indicate the 68% (95%) conﬁdence intervals for the source-count functionin each bin, while the solid black (dashed red) line shows the median (mean) of the distribution. The orange bandshows the 68% conﬁdence interval from the masked NPTF using the broken power-law formalism.At low ﬂux, the mean and median of the binned result diﬀer by multiple orders of magnitude, reﬂecting the factthat the posterior distributions for these parameters are skewed. This is related to the fact that the low-ﬂux binsare not well-constrained by the ﬁt, so the posterior distributions for these parameters are heavily inﬂuenced by thelog-ﬂat prior distributions. In the broken power-law ﬁt, only the overall normalization parameter A PS was taken tohave a log-ﬂat prior range, while in the binned ﬁt all PS parameters have log-ﬂat prior ranges. This point, combinedwith the fact that the broken power-law is more constrained than the binned source-count function, leads to largeruncertainties at low (and, to some extent, high) ﬂux in the binned analysis than in the broken power-law analysis.An additional challenge with this method is that the source counts in neighboring bins are generally highly corre-lated, leading to large errors in individual bins; however, the total ﬂux is still well constrained. For example, we ﬁndthat 5 . +0 . − . % of the ﬂux (in the inner 10 ◦ region with | b | ≥ ◦ ) is associated with the NFW PS template. Theseresults are consistent with the broken power-law results, within uncertainties. Similarly, the binned ﬁt prefers a largenumber of additional unresolved sources with ﬂuxes only slightly below the PS-detection threshold, but the exact binin which these sources appear is more uncertain. In future work, we plan to reﬁne the binned analysis to make itmore robust to changes in bin size and prior assumptions. -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] . +10 . − . . +10 . − . . +83 . − . . +24 . − . . +0 . − . . +0 . − . . +0 . − . medianmean FIG. S14: Best-ﬁt source-count function for NFW PSs within 10 ◦ of the GC and | b | ≥ ◦ , with the 3FGL sources masked,where the number of sources in each ﬂux bin is allowed to ﬂoat freely. Blue (pink) regions indicate 68% (95%) conﬁdenceintervals. The orange band is the 68% conﬁdence interval from the 3FGL-masked NPTF with a broken power-law source-countfunction. The median number of sources attributed to each bin is indicated, along with the 68% conﬁdence range. V. COMPARISON WITH LUMINOSITY FUNCTIONS IN THE LITERATURE

The best-ﬁt source-count function recovered by the NPTF for the unresolved NFW-correlated sources is quitediﬀerent to those previously considered in the literature—see e.g. , [19, 23, 24, 32, 41]. With our source-count function,both the number of sources and the ﬂux are dominated by sources within a factor ∼ ∼ L γ ∼ × erg/s.We construct models of the expected source-count function using a luminosity function derived from observedMSPs in the nearby ﬁeld of the Milky Way [41], for both NFW-distributed sources and the thick-disk model describedby (S12) with scale radius and height of 5 kpc and 1 kpc, respectively. For the former case, we also consider thepossibility that the luminosity function possesses a cutoﬀ at L γ = 10 or 10 erg/s, following [23]. The thick-disk model has been normalized to produce the correct number of high-latitude ( | b | > ◦ ) bright MSPs and MSPcandidates, with F E> > . × − photons/cm /s assuming the average pulsar spectrum, as presented in [19].Because this model has the wrong spatial morphology to explain the excess, the purpose of showing this curve is toillustrate the likely contribution from a disk population of MSPs within the ROI. The models with NFW-distributedsources have instead been normalized to match the ﬂux attributed to the NFW PS template in our analysis.Fig. S15 shows the expected source-count functions, averaged over our standard ROI, for these scenarios, for the3FGL-unmasked ﬁt (Fig. 2). In agreement with [23], we ﬁnd that for NFW-distributed sources with the luminosityfunction of [41] and the correct normalization to explain the excess, too many sources are predicted above the Fermi detection threshold when there is no luminosity cutoﬀ (solid red) or when L γ < erg/s (dashed red). The casewith a cutoﬀ at L γ = 10 erg/s (dotted red) evades this constraint, as expected, but requires O (10 ) new sourcesto explain the excess. Using the luminosity functions of [41], we ﬁnd good agreement with the number of sourcesrequired to ﬁt the excess as stated in [23].The purple dot-dashed line in Fig. S15 shows the predicted source-count function for the thick-disk distribution.It is remarkably similar to the best-ﬁt source-count function for the thin-disk PS model extracted from the data. Inparticular, we estimate the slope of the purple dot-dashed line at low ﬂux in Fig. S15 to be ∼ .

43, while the NPTFpredicts the slope of the thin-disk PS source-count function below the break to be n = 1 . +0 . − . . In agreement with[19], the unresolved sources associated with such a population should not produce enough photons to explain theexcess (as may be seen by comparison to the red lines, which have the correct total ﬂux to explain the excess).The source-count function for the NFW PSs rises sharply above the red curves at ﬂuxes F ∼ × − photons/cm /s. As this source-count function is very shallow below the ﬂux threshold, thetotal number of sources is dominated by this relatively high-ﬂux region, as is the total ﬂux. For this reason, only -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] no cutoffL < ergs/sL < ergs/sdisk model3FGL PS FIG. S15: Same as Fig. 2, but including comparisons to example source-count functions motivated by MSPs. In particular,the red lines assume an NFW spatial distribution and a gamma-ray luminosity function consistent with that of observed MSPsin the Milky Way [41]. Three scenarios are considered: a maximum luminosity cutoﬀ of 10 erg/s (dotted red) or 10 erg/s(dashed red), as well as no cutoﬀ (solid red). These curves are normalized to the ﬂux of the excess in the ROI. The purple line(dot-dashed) shows the expected source-count function assuming the same MSP-motivated luminosity function, but a thick-diskspatial distribution; it is normalized to match the number of bright F E> > . × − photons/cm /s MSPs and MSPcandidates at high latitudes ( | b | > ◦ ). O (10 ) sources are needed to account for the excess in the ROI, in contrast to the O (10 –10 ) quoted in [23]. Withthat said, we emphasize that estimates of the total number of NFW-distributed PSs based on the NPTF are highlyuncertain and subject to large systematic uncertainties since the low-ﬂux PSs are hard to constrain with our method.

VI. SURVIVAL FUNCTION

As a further cross-check that the PS identiﬁcation is working self-consistently, we attempt to identify pixels thatare likely to contain unresolved sources in the 3FGL-masked sky. To do so, we perform a standard template ﬁt inthe ROI (excluding the PS templates) and determine the best-ﬁt reference model by taking the median values of theposterior distributions for the appropriate ﬁt parameters. Using this reference model, the expected mean numberof counts, µ p , in a pixel p can be obtained. From the observed counts map one can then determine the Poissoniancumulative probability to observe the real count, n p , in that pixel given the expectation of the reference model. Thesurvival function for pixel p is deﬁned as (cid:15) p ≡ − CDF [ µ p , n p ] , (S16)where CDF [ µ p , n p ] denotes the cumulative probability of observing n p counts for a Poisson function with mean µ p .For example, if the best-ﬁt reference model predicts a total of µ p = 2 photons in a given pixel and the observednumber of counts is n p = 6, then (cid:15) p ≈ × − . In general, the smaller the value of (cid:15) p in a pixel, the more likely it isthat it contains an unresolved PS.We begin by considering the high-latitude region, where the reference model includes isotropic, diﬀuse, and bubblescontributions. The reference model is built using the parameters obtained by ﬁtting these three templates to the3FGL-masked Fermi data. The total contribution from the isotropic, diﬀuse, and bubbles components are thencompared to the observed counts to obtain a value (cid:15) p in each pixel. The left panel of Fig. S16 shows the fraction ofpixels (red crosses) at high latitudes for which (cid:15) p < (cid:15) , plotted as a function of (cid:15) .To better understand the observed (cid:15) distribution, we compare it to the results obtained using simulated data maps.To simulate a data map with only isotropic+diﬀuse+bubbles components, we draw Poisson counts from the sum ofthe reference templates. Then, we treat the simulated data just as we do the real data and calculate the fractionof pixels for which (cid:15) p < (cid:15) , as a function of (cid:15) . This process is repeated with ∼

200 simulated data maps. The result ✏✏ ✏✏

FIG. S16: The fraction of pixels with (cid:15) p < (cid:15) , with (cid:15) p deﬁned in (S16). (Left) The result for the high-latitude analysis,with the 3FGL sources masked. The red crosses show the survival-function distribution for the observed data set, given thereference background model (diﬀuse+isotropic+bubbles). The bars indicate the 68% conﬁdence intervals for simulated datasets generated from the best-ﬁt background model (blue) and the isotropic PS model (green). (Right) The result for the IGanalysis, with the 3FGL sources masked. This time, the reference background model includes diﬀuse+isotropic+bubbles+NFWDM contributions. The survival-function distribution for the Fermi data set (red crosses) is shown, as well as the conﬁdenceintervals for simulated-data studies generated from the model including isotropic PSs (blue) and isotropic+NFW PSs (green). The ROI employed in that work was also larger; for the NFW proﬁle we employ, this corresponds to roughly a factor of 3 diﬀerence inthe expected number of sources. -30-20-100102030 00 -30-20-10010203000 0510152005101520 FIG. S17: Map of the IG region (plane unmasked) showing the value of log (cid:15) − p in each pixel; the larger the value of log (cid:15) − p ,the more likely that the pixel contains a PS. The circles show the locations of known 3FGL sources; a dashed border indicatesan extragalactic source, while all other sources are indicated by a solid border. The radius of each circle is set to either 0 . ◦ or to the semi-major axis of the 95% conﬁdence ellipse for the source (as provided by Fermi ), whichever is larger. The scale isclipped at 20. The data for this ﬁgure is plotted as additional supplementary material. is illustrated by the blue points in the left panel of Fig. S16; the vertical bars indicate the 68% conﬁdence intervalsfor the fraction of pixels with (cid:15) p < (cid:15) . The fact that the blue points underpredict (overpredict) the fraction of pixelswith small (large) (cid:15) p indicates that a simulated data map containing contributions from isotropic, diﬀuse, and bubbleemission alone does not have the same photon-count statistics as the data; the real data has both more bright andmore dim pixels. Indeed, this is a sign that the data has residual unresolved PSs, even though the known PSs aremasked [31, 32].We repeat this procedure for simulated data maps that contain an isotropic PS component. The simulated datais created using the best-ﬁt values from the NPTF including diﬀuse+isotropic+bubbles components in addition toan isotropic PS component. The appropriate number of PSs are drawn from the source-count function and placedrandomly over the masked sky, following the procedure outlined in [32]. The PSs are then smeared with the energy-averaged PSF, and Poisson counts are drawn from the resulting map to create the simulated data. Note that the PSpopulation is independently redrawn from the source-count function for each simulated map. The 68% conﬁdenceinterval for the survival-function distribution is shown with green error bars in the left panel of Fig. S16. The survival-function distribution of the simulated PS data is consistent with that of the real data, within statistical uncertainty.This is strong evidence that the model with unresolved PSs is a better ﬁt to the data than the model without.Having illustrated the survival function at high latitudes, we repeat the analysis in the IG (within 30 ◦ of the GC,with | b | ≥ ◦ ); the results are shown in the right panel of Fig. S16. The best-ﬁt reference model is obtained by doinga standard template ﬁt with the following components: isotropic, diﬀuse, bubble, and NFW DM emission. The redcrosses show the survival-function distribution for the observed Fermi data map.The observed survival function in the IG can be compared to the expected distributions determined from twodiﬀerent sets of simulated data maps. The ﬁrst set of simulated data (68% conﬁdence intervals in blue) was gener-ated assuming the best-ﬁt values for the NPTF with diﬀuse+isotropic+bubbles+NFW DM templates in addition toisotropic PS contributions. The second set of simulated data maps (68% conﬁdence intervals in green) was generatedby also adding an NFW PS contribution. Notice that the NFW PS contribution is required to explain the survivalfunction observed by

Fermi .Figure S17 shows a map of log (cid:15) − p for all pixels within 30 ◦ of the GC. Notice that the brightest pixels ( i.e. , thosewith small (cid:15) p ) do not exhibit any distinctive spatial features, other than the fact that there are more such pixelsin regions of higher ﬂux closer to the plane. The white circles indicate the locations of known 3FGL sources. Thebrightest pixels are well-correlated with the locations of these sources, as one would expect. There are also brightpixels that are not in the 3FGL sources. In a followup paper, we plan to provide locations of these likely PS candidatesto potentially guide future dedicated multi-wavelength studies. It would also be interesting to correlate the log (cid:15) − p maps with other catalogs and data sets.1 -10 -9 F [photons / cm / s]10 d N / d F [ p h o t o n s − c m s d e g − ] Simulated PS data (p7)

Sim. PS3FGL PS p o s t e r i o r p r o b a b ili t y Simulated PS data (p7) NFW PSNFW DM0 5 10 15 20fraction of flux [%]0.00.10.2 Sim. DM data (p7)

FIG. S18: Results obtained by applying the NPTF to simulated data. The simulated data is generated within the 3FGL-maskedregion using the

Fermi p7v6 diﬀuse background and assuming the best-ﬁt parameters from the masked analysis described inSec. IV B. For simplicity, we leave out the sub-dominant disk-correlated and isotropic PS components and only include NFWPSs. We use the

Fermi p6v11 diﬀuse model in the NPTF analyzing the simulated data. The left panel shows the source-countfunction for the NFW-distributed PSs used to generate the simulated data in solid black; the 68% conﬁdence interval for theinferred posterior source-count function is shown by the orange bands. The right panel shows the fraction of ﬂux from theNFW PS and NFW DM templates. The inset plot shows the results of analyzing simulated data with NFW DM but withoutNFW-distributed PSs. All plots are relative to the region within 10 ◦ of the GC with | b | ≥ ◦ . For the ﬂux-fraction plots, thefractions are computed relative to the total number of counts observed in the real data. VII. SIMULATED DATA

We can check the validity of the NPTF using simulated data generated by Monte Carlo. In Fig. 4, we showed theresults of two analyses of simulated data in the IG including (1) isotropic and disk-correlated PSs, in addition to NFWDM, and (2) isotropic, disk-correlated, and NFW PSs, without NFW DM. We showed that the Bayes factors andsource-count functions recovered from those ﬁts behave as expected. In particular, when analyzing the simulated dataincluding NFW PSs, we ﬁnd results consistent with those from the real data. When, instead, the simulated data hasNFW DM instead of NFW PSs, we do not ﬁnd a preference for NFW PSs when performing the NPTF. In this section,we give more details for how we generated the simulated data and we also summarize additional simulated-data teststhat we have performed.The simulated data is generated through the following procedure. First, we follow the procedure outlined in Sec. VIto generate the population of PSs for each PS component and place them on the sky. Second, we smooth the PSsusing a PSF described, in each energy bin, by a King function [29]. We divide the full energy range of our analysis into80 smaller energy bins. To divide the photons for each PS into the diﬀerent energy bins, we must make assumptionsabout the energy spectrum of the PSs. For the NFW PSs, we take the energy spectrum to be that of NFW DM, overthis energy range, as computed in the standard ROI with a standard Poissonian template ﬁt. For the isotropic PSs,we take the spectrum to be that of the high-latitude isotropic emission, again measured from a standard Poissoniantemplate ﬁt. For the disk-correlated PSs, we take the spectrum to be the average of the Galactic and unassociated3FGL PSs over the ROI.Small variations to these spectra and to the form of the PSF do not signiﬁcantly aﬀect our results. For example,we have checked that analyzing simulated data where we smooth the PSs using a single Gaussian PSF, with 68%containment radius matching the energy-averaged value we assume for the NPTF, produces results consistent withinuncertainties with those obtained analyzing the simulated data that incorporated spectral information and used themore accurate King-function approximation to the PSF. This may be seen as an additional justiﬁcation for ourtreatment of the PSF in the NPTF.We may additionally use simulated data to study the eﬀect of mismodeling the diﬀuse background on the results ofthe NPTF. For these studies we follow the simpliﬁed procedure described in Sec. VI for constructing simulated data,which is less computationally intensive. For simplicity, we consider the 3FGL-masked IG ROI and we only includeNFW PSs in the simulations. We generate the simulated data using the best-ﬁt values found on the real data withthe p7v6 diﬀuse model and a NFW PS template. However, when analyzing the simulated data using the NPTF, weuse the