Line confusion in spectroscopic surveys and its possible effects: Shifts in Baryon Acoustic Oscillations position

Abstract

The Roman Space Telescope will survey about 17 million emission-line galaxies over a range of redshifts. Its main targets are Hα emission-line galaxies at low redshifts and [O III] emission-line galaxies at high redshifts. The Roman Space Telescope will estimate the redshift of these galaxies with single-line identification. This suggests that other emission-line galaxies may be misidentified as the main targets. In particular, it is hard to distinguish between the Hβ and [O III] lines, as the two lines are close in wavelength and hence the photometric information may not be sufficient to separate them reliably. Misidentifying an Hβ emitter as an [O III] emitter will cause a shift in the inferred radial position of the galaxy by approximately 90 Mpc/h. This length scale is similar to the Baryon Acoustic Oscillation (BAO) scale and could shift and broaden the BAO peak, possibly introducing errors in determining the BAO peak position. We qualitatively describe the effect of this new systematic and further quantify it with a lightcone simulation with emission-line galaxies.

MNRAS, 1–10 (2020). Preprint 2 October 2020. Compiled using MNRAS LaTeX style file v3.0

Elena Massara, Shirley Ho, Christopher M. Hirata, Joseph DeRose, Risa H. Wechsler, Xiao Fang

Waterloo Centre for Astrophysics, University of Waterloo, 200 University Ave W, Waterloo, ON N2L 3G1, Canada
Center for Computational Astrophysics, Flatiron Institute, 162 5th Avenue, New York, NY 10010, USA
Berkeley Center for Cosmological Physics, University of California, Berkeley, CA 94720, USA
Center for Cosmology and AstroParticle Physics, Department of Physics, The Ohio State University, 191 W Woodruff Ave, Columbus, OH 43210, USA
Santa Cruz Institute for Particle Physics, Santa Cruz, CA 95064, USA
Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 93720
Kavli Institute for Particle Astrophysics & Cosmology, P. O. Box 2450, Stanford University, Stanford, CA 94305, USA
Department of Particle Physics and Astrophysics, SLAC National Accelerator Laboratory, Stanford, CA 94305, USA
Department of Physics, Stanford University, 382 Via Pueblo Mall, Stanford, CA 94305, USA
Department of Astronomy and Steward Observatory, University of Arizona, 933 N Cherry Ave, Tucson, AZ 85719, USA

Accepted XXX. Received YYY; in original form ZZZ

Key words: cosmology: large-scale structure of the universe – line: identification

The next generation of spectroscopic large-scale structure surveys, such as the Wide Field Infrared Spectroscopic Telescope (hereafter, Roman Space Telescope; https://roman.gsfc.nasa.gov/) and the Euclid mission, will be able to observe a large fraction of the sky significantly deeper than previously possible. Larger and deeper maps will allow us to obtain more precise measurements of the expansion history of the Universe and of the growth of structure, leading to a better understanding of the accelerating expansion of the Universe.

These maps will be obtained by acquiring spectra with a relatively low signal-to-noise ratio and determining distances of the galaxies using a single emission line (and its photometric properties). Unfortunately, these two features combined may introduce significant systematics in the data analysis.

The Hubble expansion of the Universe shifts the spectrum of galaxies towards lower frequencies (redshift), such that the rest-frame and observed spectra appear different. The redshift of each object is defined as $z = (\lambda_o - \lambda_e)/\lambda_e$, where $\lambda_o$ and $\lambda_e$ are the observed and emitted wavelengths of a specific line in the spectrum. When the rest-frame wavelength $\lambda_e$ is known, the amount of shift of the spectrum is also determined; this can be used to infer the distance of emitting objects once a cosmology is assumed. This procedure is simple in principle, but it has some caveats. Two lines emitted by separate objects A and B at different distances have different rest-frame wavelengths, $\lambda^A_e \neq \lambda^B_e$, but can have the same observed wavelength, $\lambda^A_o = \lambda^B_o$. If the two lines are mistakenly assigned the same rest-frame wavelength (e.g. $\lambda^A_e$ is taken to be $\lambda^B_e$), then the emission-line galaxy A will be associated with an incorrect redshift and thus placed at a wrong distance.
This galaxy is called an interloper and it can be a source of systematic error, particularly when the measured spectra have a low signal-to-noise ratio. The problem is even more severe when each galaxy redshift is computed from a single emission line (see e.g. Pullen et al. 2016; Grasshorn Gebhardt et al. 2019; Addison et al. 2018).

The Roman Space Telescope will observe emission lines at 1.00–1.93 µm observed wavelength (Spergel et al. 2015; see update by Akeson et al. 2019). This includes Hα emitters at . < z < . and [O III] emitters at . < z < . . The [O III] feature is a doublet with the primary line at 500.7 nm and the secondary line at 495.9 nm, with a line ratio of 3:1 (Storey & Zeippen 2000). A doublet should be easier to distinguish from artifacts, noise, and other single emission lines. At the dispersion of the Roman grism ($\Delta\lambda_{\rm obs}$ = . nm per 2 pixels) the [O III] doublet is resolved in most galaxies. However, low signal-to-noise spectra and small equivalent widths can make the detection of the second line very hard, and [O III] can become indistinguishable from a single emission line for some of the spectra.

The Roman Space Telescope will use the primary [O III] line to determine the distance of high-redshift [O III] emission-line galaxies. This line is close to the Hβ line (486.1 nm), which can become the source of interlopers. The difference between the true and the inferred radial position of an Hβ galaxy mistaken for an [O III] emitter is

$$\Delta d \simeq \ldots\; h^{-1}\,\mathrm{Mpc}\;\frac{1+z}{\sqrt{\Omega_\Lambda + \Omega_m (1+z)^3}}, \tag{1}$$

where $z$ is the galaxy redshift, and $\Omega_\Lambda$ and $\Omega_m$ are the cosmological constant and matter density parameters. Around redshift z ∼ this corresponds to roughly 90 $h^{-1}$ Mpc, which is a length scale close to, but less than, the Baryon Acoustic Oscillation (BAO) scale.

The BAO scale is a standard ruler that can be measured in the two-point correlation function of galaxies. The monopole of the correlation function presents a single peak at the BAO scale.
This scale is proportional to $D_A^2(z)/H(z)$ and provides a degenerate measurement of the angular diameter distance $D_A(z)$ and the Hubble parameter $H(z)$ at the median redshift $z$ of the considered galaxy sample. The distance–redshift relation depends on the values of cosmological parameters. Therefore, measurements of the BAO scale (or peak) can be used to infer the expansion history of the universe and its cosmological contents. The presence of Hβ interlopers is expected to produce an additional peak in the measured correlation function at ∼ 90 $h^{-1}$ Mpc, which blends with the BAO peak and thus results in a change in the best-fit shape and position. The presence of an additional shift due to interlopers can, therefore, lead to a biased estimation of cosmological parameters.

In this paper, we analyze the shift in BAO peak position in the "[O III]" sample of the Roman Space Telescope High Latitude Survey due to Hβ interlopers. We study this effect at redshifts < z < because we are limited by the maximum redshift reached by the currently available lightcone simulations that are populated with galaxy spectra. In this redshift range, the main Roman Space Telescope target is Hα and its BAO analysis could be biased by line blending (Martens et al. 2019), while the [O III]-Hβ confusion will be of minor impact in the cosmological analysis, but will still be present. Therefore, the main purpose of this paper is to present the phenomenon and the methodology to study it. Further analysis at higher redshift will be needed to understand the real impact of this line confusion in Roman Space Telescope data.

Figure 1. Probability distribution function of the [O III]/Hβ equivalent width ratio in the redshift range . < z < . in the Roman-like mock catalog.

For the analysis we use a lightcone N-body simulation, described in Section 2, to generate a Roman-like [O III] mock galaxy catalog with Hβ interlopers, as explained in Section 3. In Section 4 we present the 2-point correlation function measured in the catalogs, with and without interlopers. The formalism and results of the BAO fit, a quantitative analysis of the BAO shift, are shown in Section 5. Conclusions are drawn in Section 6.

Even though this paper is tailored to Roman, the BAO shift due to [O III]-Hβ confusion can be a common challenge in all future spectroscopic surveys (e.g. Euclid), and the methodology presented here can easily be generalized and applied to all of them.

We use a lightcone simulation that covers 10,313 square degrees of the sky up to redshift z = . . It has been generated from three N-body simulations with box sizes L = . , . , . $h^{-1}$ Gpc and with numbers of dark matter particles N = , , . These three boxes are used to build the lightcone in redshift bins . < z < . , . < z < . and . < z < . , respectively. The variation in resolution, which becomes lower with increasing redshift, decreases the computational cost of the simulations while achieving the minimum halo mass required for flux-limited surveys, which increases with redshift.
The N-body simulations are run in a ΛCDM cosmology with $\Omega_m$ = . , $\Omega_b$ = . , $h$ = . , $\sigma_8$ = . , $n_s$ = . , and three massless neutrino species ($N_{\rm eff}$ = . ).

Galaxies are added to the lightcone using the ADDGALS algorithm (DeRose et al. 2019; Wechsler et al. 2020). ADDGALS is tuned using a Sub-Halo Abundance Matching (SHAM) scheme that reproduces the luminosity-dependent clustering of the SDSS catalog with high precision. The algorithm assigns central galaxies to halos and subsequently adds the remaining galaxies to dark matter overdensities following a distribution of galaxy overdensities conditioned on absolute magnitude and redshift. Spectral energy distributions (SEDs) are assigned to each galaxy to match the SED-luminosity-density relationship measured in the SDSS data. The assignment is performed using a set of principal component coefficients as in Blanton & Roweis (2007). We refer the reader to DeRose et al. (2019) for further details on the N-body simulations and to Wechsler et al. (2020) for more details on the galaxy model.

In what follows we will identify and use [O III] and Hβ lines from the galaxy spectra. Fig. 1 shows the probability distribution function (PDF) of the [O III] to Hβ equivalent width ratios in the lightcone. The plot has been obtained by considering only galaxies that have both lines detectable by the Roman Space Telescope (see Sec. 3 for more details on what this means). The measured ratio takes values in the interval [ . , ] and its distribution peaks around 1.5. The MOSDEF collaboration observed emission-line galaxies in the same redshift range and measured larger [O III]/Hβ ratios: their distribution peaks around 3 (see Fig. B1). This indicates that the lightcone could have more interlopers than the real data. Future work will need to study this effect with samples that are directly tuned to this observable.
In this section we describe the construction of the Roman-like [O III] mock catalog, and how the Hβ interlopers are added later on. The [O III] catalog is created by selecting specific galaxies from the lightcone. The selection criterion depends on two requirements.

The first requirement asks for the galaxy to be visible as an [O III] emitter by the Roman Space Telescope, i.e. it must have an [O III] flux above the Roman detection limit. We set the detection limit to . σ sensitivity at 6 exposure depth. This limit is displayed in Fig. 2 as a function of the observed wavelength and the angular size of the emitting object. All objects with [O III] flux above the detection limit are potential [O III] emission-line galaxies to place in our catalog.

Figure 2. Roman flux detection limit as a function of observed wavelength and size of the emitting galaxy. (Axes: observed wavelength λ [µm] versus flux [W/m²]; the curves correspond to angular sizes from 0.0 to 1.0 arcsec in steps of 0.1 arcsec.)

The second requirement asks for the catalog to reproduce the Roman forecast number density of [O III] emission-line galaxies in the redshift range of interest: . < z < . . However, the maximum redshift of the simulation used here is z = . , and above z = . the number of galaxies drops drastically. This is likely due to inaccuracies in the spectral properties of star-forming galaxies at these redshifts in the catalog. Here we relax the second requirement by requiring the distribution of [O III] galaxies to reproduce the forecast in the redshift range . < z < . . The number density in the mock catalog is larger than expected; thus we randomly sub-sample 11.4% of the [O III] emission-line galaxies that formally meet the Roman detection limit.

MOSDEF: http://mosdef.astro.berkeley.edu

Applying the Roman detection limit and a random sub-sampling over all redshifts, we have created 50 different Roman-like [O III] mock galaxy catalogs. The galaxy distribution of one realization is shown in Fig. 3 (black line), together with the forecast distribution for [O III] from the Roman Space Telescope (green line).

Figure 3. Redshift evolution of galaxy number density in the Roman catalog. The black line shows [O III] galaxies and the red line indicates the amount of Hβ interlopers. The green line displays the predicted [O III] galaxy distribution in the Roman Space Telescope Survey. The light grey area indicates the redshift range where the number of [O III] emitters drops considerably; this interval has not been used to calibrate the [O III] mock catalog. The dark grey region shows the redshift range not reached by the lightcone simulation. (Axes: z versus log(n) [$h^3$ Mpc$^{-3}$]; legend: [OIII] galaxies - WFIRST flux cut - 11.4%; Hβ galaxies - WFIRST flux cut - 11.4%; [OIII] forecast for WFIRST.)

We create a second type of mock catalog, where Hβ interlopers are added to the [O III] catalogs described above. To select the Hβ emitters, we perform a galaxy selection in the lightcone simulation similar to what we have done for the [O III] emitters. First, we select galaxies with Hβ flux above the Roman detection limit. Among these, some have both Hβ and [O III] lines detectable, while others have an [O III] flux too weak to be detected. The former have two visible lines that can in principle be identified correctly, and are not of interest. The latter have a strong Hβ line that will be mistaken for [O III] and are interlopers. Thus, we select the Hβ interlopers as galaxies with visible Hβ and non-detectable [O III] lines. In order to be consistent with what has been done for the [O III] catalog, we randomly sub-sample the selected Hβ galaxies to 11.4%. These interlopers are not placed in the mock catalog at their true position, but at a shifted position, as if they were mistaken for [O III] emitters. Their assigned (wrong) redshift $\hat z$ in the mock is given by

$$1 + \hat z = \frac{\lambda^{\rm H\beta}_e}{\lambda^{\rm [OIII]}_e}\,(1 + z), \tag{2}$$

where $\lambda^{\rm H\beta}_e$ and $\lambda^{\rm [OIII]}_e$ are the rest-frame wavelengths of Hβ and [O III], and $z$ is the true redshift of the galaxy. The distribution of Hβ interlopers is shown in Fig. 3 (red line). Following this procedure and sub-sampling galaxies in 50 different ways, we have generated 50 different mocks with [O III] emission-line galaxies and Hβ interlopers.

We have created the same type of catalogs in redshift space (RSD) by selecting [O III] and Hβ emitters after the emission-line spectra have been shifted to account also for the redshift due to the peculiar velocity of the galaxy: $\lambda_o = (1 + z + z_v)\,\lambda_e$, where $\lambda_o$ is the observed wavelength, $z$ is the true redshift, $z_v = v/c$ is the shift due to the peculiar velocity $v$, and $c$ is the speed of light. In these catalogs, [O III] emission-line galaxies are placed at redshift $z + z_v$ and Hβ interlopers are placed at the misidentified redshift

$$1 + \hat z_{\rm RSD} = \frac{\lambda^{\rm H\beta}_e}{\lambda^{\rm [OIII]}_e}\,(1 + z + z_v). \tag{3}$$

In this section, we present the two-point correlation function measured in the [O III] and [O III]+Hβ catalogs. We perform the measurements in three redshift bins: z = 1.3–1.5, 1.5–1.7, and 1.7–1.9. The percentages of Hβ interlopers over the total number of galaxies in these bins are 7.7%, 9.04%, and 7.96%, respectively.

We calculate the galaxy correlation function using the Landy–Szalay estimator

$$\hat\xi(r,\mu) = \frac{DD(r,\mu) - 2\,DR(r,\mu)/f_R + RR(r,\mu)/f_R^2}{RR(r,\mu)/f_R^2}, \tag{4}$$

where $DD$, $DR$ and $RR$ are the numbers of galaxy–galaxy, galaxy–random and random–random pairs at separation $r$, and $\theta = \arccos(\mu)$ is the angle between the separation vector and the line of sight. The random catalog R is generated from a set of random points in a volume identified by the redshift selection in the galaxy catalogs. The redshift distribution of the randoms reproduces the distribution of the galaxies and has been tuned so that the ratio $f_R$ between the number of galaxies and the number of random points is $f_R$ = . The pair counting is performed using the large-scale structure toolkit nbodykit (Hand et al. 2017).

We project the two-point correlation function $\hat\xi$ onto a basis of Legendre polynomials $L_l(\mu)$ of order $l$ via

$$\hat\xi_l(r) = \frac{2l+1}{2}\int_{-1}^{1} d\mu\, \hat\xi(r,\mu)\, L_l(\mu), \tag{5}$$

where $l = \{0, 2, ...\}$ correspond to the Legendre polynomials $L_l = \{1, (3\mu^2 - 1)/2, ...\}$ and identify the monopole, quadrupole, etc. of the correlation function.

In Fourier space, the multipoles of the power spectrum are

$$P_l(k) = \frac{2l+1}{2}\int_{-1}^{1} d\mu\, P(k,\mu)\, L_l(\mu), \tag{6}$$

and they are related to the multipoles of the two-point correlation function via

$$\xi_l(r) = i^l \int_0^\infty \frac{k^2\,dk}{2\pi^2}\, P_l(k)\, j_l(kr), \tag{7}$$

with $j_l(kr)$ being the spherical Bessel function of order $l$.

Figure 4 shows the mean galaxy correlation function computed from the mocks in real space.
Dashed lines indicate the measurements in the [O III] catalogs, while solid lines show the measurements in the [O III]+Hβ catalogs. The presence of interlopers modifies the correlation function by suppressing it at small separations and enhancing it around the BAO scale. The BAO peak appears broadened and shifted towards smaller separations. This behavior can be understood by considering a small volume containing both [O III] and Hβ galaxies, which are correlated and source the correlation at small separations. In our analysis, the Hβ galaxies have been moved away from the considered volume by ∼ 90 $h^{-1}$ Mpc. This means that there is a lack of galaxy pairs at small separations and an increase in the number of galaxy pairs with separations close to the BAO scale.

Fig. 5 shows the analogous measurements in redshift space. The BAO peak is broadened and shifted towards small scales in this case as well.
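
As an illustration of the measurement described in this section, the following minimal sketch evaluates a Landy–Szalay estimator of the form of Eq. (4) from pair counts binned in (r, µ) and projects it onto the monopole as in Eq. (5). It is not the nbodykit pipeline used for the measurements. The function names are hypothetical, and two conventions are assumed here rather than taken from the text: `f_r` is read as the number of randoms divided by the number of galaxies, and DD and RR are ordered pair counts (each pair counted twice).

```python
import numpy as np


def landy_szalay(dd, dr, rr, f_r):
    """Landy-Szalay estimator from raw pair counts binned in (r, mu).

    Assumptions (not from the paper): f_r = N_random / N_galaxy, and DD and RR
    are ordered pair counts, so the cross term carries a factor of 2.
    """
    dd, dr, rr = (np.asarray(a, dtype=float) for a in (dd, dr, rr))
    rr_norm = rr / f_r**2
    return (dd - 2.0 * dr / f_r + rr_norm) / rr_norm


def monopole(xi_rmu, mu_edges):
    """l = 0 projection: (1/2) * integral of xi(r, mu) over mu in [-1, 1].

    xi_rmu has shape (n_r, n_mu); mu_edges holds the n_mu + 1 bin edges
    spanning [-1, 1]. The integral is approximated by a sum over mu bins.
    """
    dmu = np.diff(np.asarray(mu_edges, dtype=float))
    return 0.5 * np.sum(np.asarray(xi_rmu, dtype=float) * dmu, axis=-1)
```

For an unclustered sample the three normalized counts agree bin by bin and the estimator returns zero, which is a convenient sanity check before running on mock catalogs.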

In this section, we quantify the BAO shift due to Hβ interlopers. First, we build a model for the monopole of the correlation function in the true cosmology of the simulation, allowing for a free parameter to describe the BAO shift. Second, we fit this model to the measured [O III]+Hβ monopole to determine the BAO shift induced by the interlopers.

The BAO analysis of the monopole of the correlation function allows us to measure the spherically averaged distance

$$D_v(z) = \left[(1+z)^2 D_A^2(z)\,\frac{cz}{H(z)}\right]^{1/3}. \tag{8}$$

The interloper-induced shift of the BAO scale relative to the true value can be described by the isotropic dilation parameter

$$\alpha = \frac{D_{v,i}(z)/r_{s,i}}{D_v(z)/r_s} = \left[\frac{D_{A,i}^2(z)}{D_A^2(z)}\,\frac{H(z)}{H_i(z)}\right]^{1/3} \frac{r_s}{r_{s,i}}, \tag{9}$$

where $r_s$ is the BAO scale (sound horizon) and the subscript $i$ indicates the quantities in the presence of Hβ interlopers. Quantities without the subscript $i$ correspond to the true cosmology used to compute all the measurements, also when interlopers are present.

Figure 4. Mean of the two-point correlation function in real space measured in 50 realizations. Dashed lines show the correlation of [O III] galaxies, while solid lines indicate the correlation in the full catalog, including both [O III] and Hβ galaxies.

The parameter α is often used to quantify the shift between the BAO scale in the fiducial and the true (inferred from the data) cosmology. Analogously, we define a fitting model for the correlation function given by

$$\xi_m(r) = B(r)\,\xi(\alpha r) + A(r), \tag{10}$$

where $\xi$ is the template for the nonlinear galaxy correlation function based on linear theory and on the true cosmology, α is the isotropic dilation parameter in Eq. (9) used to adjust the location of the BAO peak, and $A(r)$ and $B(r)$ are functions involving nuisance parameters used to marginalize over broadband effects of the correlation function, such as scale-dependent bias and redshift-space distortion effects. Using $A(r)$ and $B(r)$ should minimize the contribution of broadband effects to the BAO scale and should allow the fitting model to make robust and broadband-independent predictions on α. In our analysis we use

$$A(r) = \frac{a_1}{r^2} + \frac{a_2}{r} + a_3 \quad \text{and} \quad B(r) = B. \tag{11}$$

Figure 5. Mean of the two-point correlation function in redshift space measured in 50 realizations. Dashed and solid lines are as in Fig. 4. (Panels: z = 1.3–1.5, 1.5–1.7, 1.7–1.9; horizontal axis: r [Mpc $h^{-1}$]; legend: [O III] catalog, [O III]+Hβ catalog.)

We model the template for the nonlinear galaxy correlation function from its Fourier transform, the power spectrum $P(k,\mu)$, given by

$$P(k,\mu) = b^2\,(1 + \beta\mu^2)^2\, F(k,\mu,\Sigma_s)\, P_w(k,\mu), \tag{12}$$

where $b$ is the linear galaxy bias, $(1 + \beta\mu^2)^2$ is the Kaiser term describing the large-scale redshift-space distortions with $\beta = f/b$, and $f = \Omega_m^{0.55}$ is the linear growth rate. The function

$$F(k,\mu,\Sigma_s) = \frac{1}{(1 + k^2\mu^2\Sigma_s^2)^2} \tag{13}$$

is the streaming model describing the Finger of God (FoG) effect, with $\Sigma_s$ being the streaming scale of order ∼ − $h^{-1}$ Mpc. The wiggled power spectrum $P_w(k,\mu)$ is defined as

$$P_w(k,\mu) = \left[P_{\rm lin}(k) - P_{\rm nw}(k)\right] \exp\left(-k^2\Sigma^2/2\right) + P_{\rm nw}(k), \tag{14}$$

where $P_{\rm lin}$ is the linear matter power spectrum, $P_{\rm nw}$ is the no-wiggle counterpart computed as in Vlah et al. (2016), and $\Sigma^2 = \Sigma_0^2\,(1 + (2f + f^2)\mu^2)$ (Cohn et al. 2016) with

$$\Sigma_0^2 = \int \frac{dk}{3\pi^2}\,\left[1 - j_0(k r_s)\right] P_{\rm lin}(k) \tag{15}$$

(Vlah et al. 2016) describes the BAO damping due to nonlinear structure growth. The power spectrum in Eq. (12) can be decomposed in multipoles using Eq. (6); they are related to the multipoles of the correlation function via Eq. (7).
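
The template construction above can be sketched numerically as follows. This is a minimal illustration, not the code used for the paper. It assumes the usual forms of the ingredients: a squared Kaiser factor, a squared Lorentzian streaming term, and the anisotropic Gaussian damping of the wiggles. The monopole of Eq. (6) is evaluated with Gauss–Legendre quadrature in µ. The function names and the arrays `p_lin` and `p_nw` (linear and no-wiggle spectra tabulated on the same `k` grid by the caller) are hypothetical.

```python
import numpy as np


def template_pk(k, mu, p_lin, p_nw, b, f, sigma_s, sigma_0):
    """Anisotropic template P(k, mu) on a (k, mu) grid.

    Assumed forms (see lead-in): Kaiser (1 + beta mu^2)^2, streaming term
    1 / (1 + k^2 mu^2 sigma_s^2)^2, damping exp(-k^2 Sigma^2 / 2) with
    Sigma^2 = sigma_0^2 * (1 + (2 f + f^2) mu^2).
    """
    k = np.asarray(k, dtype=float)[:, None]
    mu = np.asarray(mu, dtype=float)[None, :]
    p_lin = np.asarray(p_lin, dtype=float)[:, None]
    p_nw = np.asarray(p_nw, dtype=float)[:, None]
    beta = f / b
    kaiser = (1.0 + beta * mu**2) ** 2
    fog = 1.0 / (1.0 + (k * mu * sigma_s) ** 2) ** 2
    sigma2 = sigma_0**2 * (1.0 + (2.0 * f + f**2) * mu**2)
    p_wiggle = (p_lin - p_nw) * np.exp(-0.5 * k**2 * sigma2) + p_nw
    return b**2 * kaiser * fog * p_wiggle


def pk_monopole(k, p_lin, p_nw, n_mu=16, **params):
    """Monopole: (1/2) * integral of P(k, mu) over mu in [-1, 1]."""
    mu, weights = np.polynomial.legendre.leggauss(n_mu)
    pk_mu = template_pk(k, mu, p_lin, p_nw, **params)
    return 0.5 * np.sum(pk_mu * weights, axis=1)
```

With both damping scales set to zero the monopole reduces to the Kaiser result $b^2 (1 + 2\beta/3 + \beta^2/5) P_{\rm lin}$, which gives a quick check of the quadrature.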

To compute the BAO fit, we need to estimate the covariance matrix of the two-point correlation function. Ideally, it should be computed from a set of mock catalogs generated from independent realizations of the Universe via

$$C_{ij}[\xi(r_i)\,\xi(r_j)] = \frac{1}{N-1}\sum_{n=1}^{N}\left[\xi_n(r_i) - \bar\xi(r_i)\right]\left[\xi_n(r_j) - \bar\xi(r_j)\right]. \tag{16}$$

However, we have only one realization, from which we randomly subsampled the galaxies in different ways to generate different mock galaxy catalogs. We could use them to compute the covariance matrix as in Eq. (16), but this procedure is not accurate since the mocks are not generated from independent realizations of the dark matter field. Therefore, we compute the covariance matrix analytically. For comparison, the results using the covariance matrix of the 50 mocks are presented and discussed in Appendix A.

The simplest analytical model for the covariance matrix is the Gaussian model, motivated by the fact that matter fluctuations are Gaussian in the initial conditions. In this case, the covariance between two multipole moments $l$ and $l'$ of the correlation function is (Xu et al. 2012)

$$C_{ij}(\xi_l(r_i)\,\xi_{l'}(r_j)) = \frac{2(2l+1)(2l'+1)}{V}\, i^{\,l+l'} \int \frac{k^2\,dk}{2\pi^2}\, j_l(kr_i)\, j_{l'}(kr_j)\, P^2_{ll'}(k), \tag{17}$$

where $V$ is the volume considered, $j_l$ is the spherical Bessel function of order $l$, and

$$P^2_{ll'}(k) = \frac{1}{2}\int_{-1}^{1}\left[P(k,\mu) + \frac{1}{\bar n}\right]^2 L_l(\mu)\, L_{l'}(\mu)\, d\mu \tag{18}$$

contains the power spectrum $P(k,\mu)$ in Eq. (12) and the Poisson shot noise $1/\bar n$. We take into account the redshift dependence of $\bar n$ and assume it has no angular dependence by considering the quantity

$$I(k) = \int \frac{dV}{P^2_{ll'}(k)} \tag{19}$$

with

$$dV = \frac{c}{H_0}\,\frac{r^2(z)}{\sqrt{\Omega_m (1+z)^3 + \Omega_\Lambda}}\, dz\, d\Omega, \tag{20}$$

and by replacing $P^2_{ll'}(k)/V$ with $[I(k)]^{-1}$ in Eq. (17).

The correlation function measured in the mocks is binned and we must account for this in the modeling of the Gaussian covariance matrix.
If the correlation functions are measured in bins with lower bounds $r_1$ and upper bounds $r_2$, then the binned Gaussian covariance matrix is

$$C_{ij}(\xi_l(r_i)\,\xi_{l'}(r_j)) = 2(2l+1)(2l'+1)\, i^{\,l+l'}\, \frac{3}{r_{i2}^3 - r_{i1}^3}\, \frac{3}{r_{j2}^3 - r_{j1}^3} \int_{r_{i1}}^{r_{i2}} \frac{r^2\,dr\,d\Omega}{4\pi} \int_{r_{j1}}^{r_{j2}} \frac{r'^2\,dr'\,d\Omega'}{4\pi} \int \frac{k^2\,dk}{2\pi^2}\, j_l(kr)\, j_{l'}(kr')\, [I(k)]^{-1}. \tag{21}$$

Case                 1.3 < z < 1.5    1.5 < z < 1.7    1.7 < z < 1.9
Real: f_Hβ = 0       . ± .012         1. ± .010        1. ± .
Real: f_Hβ = 0.5     . ± .010         1. ± .010        1. ± .
Real: f_Hβ = 1       . ± .011         1. ± .011        1. ± .
RSD: f_Hβ = 0        . ± .011         1. ± .013        0. ± .
RSD: f_Hβ = 0.5      . ± .013         1. ± .011        1. ± .
RSD: f_Hβ = 1        . ± .010         1. ± .011        1. ± .

Table 1. Values of the isotropic dilation parameter α from the BAO fit in real and redshift space with different fractions $f_{\rm H\beta}$ of interlopers. $f_{\rm H\beta} = 0$ corresponds to no interlopers, $f_{\rm H\beta} = 1$ indicates the inclusion of all the interlopers present in the mock catalog, and $f_{\rm H\beta} = 0.5$ is the case where half of the interlopers are considered.

In this work we use only the monopole to calculate the BAO fit. In this case $l = l' = 0$ and the above binned covariance matrix becomes

$$C_{ij}(\xi_l(r_i)\,\xi_{l'}(r_j)) = \int \frac{k^2\,dk}{\pi^2}\, \Delta_l(kr_i)\, \Delta_{l'}(kr_j)\, [I(k)]^{-1}, \tag{22}$$

where

$$\Delta_0(kr) = \frac{3}{r_2^3 - r_1^3}\left[\frac{r_2^2\, j_1(kr_2) - r_1^2\, j_1(kr_1)}{k}\right] \tag{23}$$

and $j_1(kr)$ is the spherical Bessel function of order 1.

There have been studies aiming to modify the simple Gaussian model to create more precise covariance matrices, see e.g. Xu et al. (2012). These approaches require fitting additional parameters of the covariance matrix to many N-body simulations, or in our case many lightcones with emission-line galaxies. The limited number of lightcones available does not allow us to use these models.

We perform a null test to validate our pipeline. Here we perform the BAO analysis on the [O III] mock galaxy catalogs, where all galaxies are at the right redshift. In this case, we expect the parameter α to be close to 1, as this indicates no shift of the BAO peak compared to the theoretical prediction in the cosmology of the simulation. The values for the best fit of α are shown in Table 1. There, $f_{\rm H\beta}$ indicates the fraction of Hβ interlopers: $f_{\rm H\beta} = 0$ corresponds to no interlopers ([O III] catalog), and $f_{\rm H\beta} = 1$ indicates the inclusion of all interlopers present in the mock catalog [O III]+Hβ. The values of α in the case $f_{\rm H\beta} = 0$ are |α − 1| < . and are consistent with 1, both in real (1st row) and redshift (4th row) space.

The results for the BAO analysis applied to the [O III]+Hβ catalog are shown in the same table and correspond to $f_{\rm H\beta} = 1$.
In this case, α presents a systematicshift towards higher values. The shift is more pronouncedin the redshift bin . < z < . , where the percentage ofinterlopers over the total number of galaxies is the highest.In order to understand how the shift depends on thenumber of interlopers in the three redshift bins, we per-formed the BAO analysis in an intermediate case, where only MNRAS000 , 1–10 (2020) ine confusion and BAO shift half of the interlopers are considered. We randomly subsam-ple them and recompute the correlation function. The bestﬁts for α are shown in Table 1 as the case f H β = . . Again,the BAO peak appears to be shifted, but the shift is smallerthan in the case f H β = , where all interlopers were included.As expected, the BAO shift increases with the number of in-terlopers present in the catalog.Figs. 6 and 7 present a visualization of the shift as afunction of the percentage of interlopers % H β over the totalnumber of galaxies, in real and redshift space. Each panelshows diﬀerent redshift bins. From left to right, each pointcorrespond to f H β = , . , , respectively. The dark shadedareas indicate the expected statistical error for α in theBAO analysis with [O iii ] emission-line galaxies in the Ro-man Space Telescope, σ α = . (Spergel et al. 2015; thisnumber was updated using the number densities in prepa-ration for the Reference Survey as described in Eiﬂer et al.2020). The light shaded areas show a more conservative es-timation of the error: the one reported in the Roman SpaceTelescope Science Requirement Document ( σ α = . ).To roughly estimate the BAO shift at diﬀerent % H β , weinterpolate the three values of α in each redshift bin with alinear function. Since α should be equal to when % H β = ,the linear function is forced to pass through the point (0,1),and the only free parameter is the slope. The best ﬁts aredisplayed in Table 2 and as black solid lines in Figs. 6 and 7.The slope appears to be maximum in the redshift bin . < z < . 
in real space, and at both . < z < . and . < z < . in redshift space. We already noticed that the parameter α is larger in the redshift range . < z < . . The linearﬁt shows this is not due to the larger amount of interlopers,but it is caused by the particular redshift range considered.Indeed, the error on the interloper positions depends on z via Eq. 1, and the BAO shift will consequently depend onthe redshift considered.The linear function gives also an estimation of thelargest amount of interlopers that will produce a BAOshift smaller than a desired systematic errors on α . Usu-ally, the systematic errors σ sys are required to be smallerthan some percentage of the statistical error σ stat . If we con-sider σ sys < σ stat /√ with σ stat = . , then the maximumtolerated percentages of interlopers are ∼ . , . , . in real space and ∼ . , . , . in redshift space at z ∼ . , . , . , respectively.We perform a second linear ﬁt to evaluate the theo-retical uncertainties — non-linearities, bias, and inaccura-cies in the ﬁtting function — in our BAO analysis. Thisﬁt is performed by letting the intercept free to vary (blackdashed lines in Figs. 6 and 7), since the theoretical uncer-tainties would make α (cid:44) at f H β = . The best ﬁts for theintercept are all consistent with 1 within their error bars,but their value is more than . far from 1 in the red-shift bin . < z < . , both in real and redshift space. Inthese two cases the value of the intercept is smaller than1, which indicates that our model could be underestimatingthe values of α in this redshift bin. The value of the slope ap-pears to be slightly larger than the one obtained when ﬁxingthe intercept to 1 (Table 2). This indicates that our estima-tion of the maximum tolerated percentages of interlopers in . < z < . could be underestimated. % H Figure 6.

Results for the isotropic parameter α from the BAO fit in real space. %Hβ indicates the percentage of interlopers over the total number of galaxies. %Hβ = 0 corresponds to no interlopers and the %Hβ > 0 values correspond to half and the complete number of interlopers in the mocks, for each redshift bin. Black solid lines show the linear fit y = 1 + b x (see also Table 2), while black dashed lines display the linear fit y = a + b x, with y = α and x = %Hβ.

Figure 7.

Results for the isotropic parameter α from the BAO fit in redshift space. The color code is the same as in Figure 6.
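The one-parameter fit y = 1 + b x and the resulting tolerance on the interloper percentage can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the inverse-variance weighting, and every numerical input (including the divisor n in the requirement σ_sys < σ_stat/√n) are assumptions made for the example and are not values taken from Table 1 or Table 2.

```python
import numpy as np


def fit_slope_fixed_intercept(x, y, sigma):
    """Weighted least-squares slope b for the model y = 1 + b * x.

    The intercept is fixed to 1, so a point at x = 0 carries no weight
    in the slope estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2  # inverse-variance weights
    norm = np.sum(w * x * x)
    b = np.sum(w * x * (y - 1.0)) / norm
    b_err = 1.0 / np.sqrt(norm)
    return b, b_err


def max_interloper_percentage(b, sigma_sys_max):
    """Largest x for which the shift |alpha - 1| = |b| * x stays below sigma_sys_max."""
    return sigma_sys_max / abs(b)


# Hypothetical inputs for one redshift bin (NOT values from the paper).
pct_hbeta = [0.0, 2.0, 4.0]       # percentage of interlopers
alpha = [1.000, 1.004, 1.008]     # best-fit dilation parameter
alpha_err = [0.008, 0.008, 0.008]

b, b_err = fit_slope_fixed_intercept(pct_hbeta, alpha, alpha_err)

sigma_stat = 0.01  # hypothetical statistical error on alpha
n = 4.0            # hypothetical requirement: sigma_sys < sigma_stat / sqrt(n)
x_max = max_interloper_percentage(b, sigma_stat / np.sqrt(n))

print(f"slope b = {b:.4f} +/- {b_err:.4f} per percent")
print(f"maximum tolerated interloper percentage = {x_max:.2f}")
```

Because the model is linear in b, the slope has a closed-form solution and no iterative optimizer is needed; letting the intercept vary, as in the second fit, would require solving the usual two-parameter normal equations instead.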

Space & redshift      b slope
Real: . < z < .       ( . ± . ) · −
Real: . < z < .       ( . ± . ) · −
Real: . < z < .       ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −

Table 2.

Results for the linear fit y = 1 + b x, where y is the isotropic parameter α and x is the percentage of interlopers %Hβ, in real and redshift space.

In this paper we studied the impact that Hβ-[O III] line confusion can have on the BAO peak of [O III] emitters. We restricted the analysis to the monopole of the [O III] galaxy correlation function. Confusing an Hβ emitter with an [O III] emitter introduces an error in the estimated galaxy redshift, which makes the galaxy appear to be closer to us than it actually is. This error modifies the shape of the [O III] galaxy correlation function. In particular, the BAO peak in this correlation function becomes broader and is shifted towards smaller scales. This happens because Hβ interlopers add correlation power primarily on scales equal to the length of the displacement between their true and inferred position, which is ∼ 90 Mpc/h.

To study this phenomenon, we have generated Roman-like mock catalogs containing [O III] galaxies and [O III]+Hβ interloper galaxies. Then, we performed a BAO fit on the monopole measured in these mocks to quantify the impact of line confusion on future BAO analyses.

The [O III] to Hβ equivalent width ratio is not exactly known at the considered redshifts, and neither is the interloper fraction, which is estimated to be around 4-5% from the data obtained by the MOSDEF collaboration (see Fig. B2). Therefore, we cannot give an estimate of the expected BAO shift due to Hβ interlopers. Instead, we estimate the shift as a function of the fraction of Hβ interlopers in the considered emission-line galaxy sample. If the Roman statistical error on α is σ_stat = . and the systematic error introduced by interlopers must be σ_sys < σ_stat/√ , then the maximum tolerated percentages of interlopers are ∼ . , . , . in real space and ∼ . , . , . in redshift space at z ∼ . , . , . , respectively. This phenomenon should be more problematic at higher redshift, < z < , where the [O III] line is the main target. However, in this redshift range the [O III] to Hβ equivalent width ratio is expected to be larger (see Fig. B1) and the interloper fraction is expected to be smaller (see Fig. B2).

Here we presented the methodology to study the impact of Hβ interlopers on the monopole analysis, and outlined how to determine the level of interlopers that can be tolerated without introducing a systematic error. The effect of interlopers on the correlation function is anisotropic, thus the inclusion of the quadrupole in the BAO analysis is expected to help disentangle the interloper effect from the impact of a wrong fiducial cosmology, making Hβ interlopers a less severe source of systematic error. Moreover, having a theory model to describe how Hβ interlopers affect the quadrupole will allow us to determine the amount of interlopers in a given galaxy sample. We will combine the analysis of monopole and quadrupole and present a model to describe Hβ interlopers in a following paper, where we will also investigate the phenomenon at higher redshifts.

ACKNOWLEDGEMENTS

E.M. thanks Naveen Reddy for providing the [O III] and Hβ equivalent widths of the galaxies in the MOSDEF survey. She thanks Zachary Slepian and the Roman Science Investigation Team for useful discussions. She was supported by the NASA grant 15-WFIRST15-0008 during part of this project. S.H. thanks NASA for their support through NASA grant 15-WFIRST15-0008 and NASA Research Opportunities in Space and Earth Sciences grant 12-EUCLID12-0004. Both E.M. and S.H. thank the Simons Foundation for supporting their work. C.H. is supported by NASA grant 15-WFIRST15-0008 and the Simons Foundation. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.

REFERENCES

Addison G. E., Bennett C. L., Jeong D., Komatsu E., Weiland J. L., 2018
Akeson R., et al., 2019, arXiv e-prints, p. arXiv:1902.05569
Blanton M. R., Roweis S., 2007, AJ, 133, 734
Cohn J. D., White M., Chang T.-C., Holder G., Padmanabhan N., Doré O., 2016, MNRAS, 457, 2068
DeRose J., et al., 2019, arXiv e-prints, p. arXiv:1901.02401
Eifler T., et al., 2020, arXiv e-prints, p. arXiv:2004.05271
Grasshorn Gebhardt H. S., et al., 2019, ApJ, 876, 32
Hand N., Feng Y., Beutler F., Li Y., Modi C., Seljak U., Slepian Z., 2017
Martens D., Fang X., Troxel M. A., DeRose J., Hirata C. M., Wechsler R. H., Wang Y., 2019, MNRAS, 485, 211
Pullen A. R., Hirata C. M., Doré O., Raccanelli A., 2016, PASJ, 68, 12
Reddy N. A., et al., 2018, ApJ, 869, 92
Spergel D., et al., 2015, arXiv e-prints, p. arXiv:1503.03757
Storey P. J., Zeippen C. J., 2000, MNRAS, 312, 813
Vlah Z., Seljak U., Chu M. Y., Feng Y., 2016, JCAP, 1603, 057
Wechsler R. H., DeRose J. D., Busha M. T., Becker M. R., et al., 2020, to be submitted
Xu X., Padmanabhan N., Eisenstein D. J., Mehta K. T., Cuesta A. J., 2012, MNRAS, 427, 2146

APPENDIX A: COVARIANCE MATRIX FROM THE 50 REALIZATIONS

We perform the BAO analysis with a covariance matrix computed from 50 galaxy mocks, as in equation 16. These mocks are not generated from 50 independent realizations of the dark matter density field and are therefore correlated, so the covariance matrix is not expected to be accurate. Here we use it only to test how the shift in α depends on the covariance matrix used to perform the BAO fit. To this aim, we redo the BAO analysis with this covariance matrix from mocks. The results are shown in Table A1 and Figs. A1 and A2.

We perform a linear fit to model the parameter α as a function of the percentage of interlopers %Hβ over the total number of galaxies, as in Section 5. The best values for the slope parameter in different redshift bins are shown in Table A2.

The best fit values of α are very similar to the ones obtained with the Gaussian covariance matrix in Section 5, although their errors are larger when using the covariance from the mocks. The slopes of the linear fit are also similar in the two cases.

Case                  . < z < .      . < z < .      . < z < .
Real: f_Hβ = 0        . ± .008       1. ± .008      1. ± .
Real: f_Hβ = 0.5      . ± .008       1. ± .008      1. ± .
Real: f_Hβ = 1        . ± .007       1. ± .008      1. ± .
RSD: f_Hβ = 0         . ± .008       0. ± .009      1. ± .
RSD: f_Hβ = 0.5       . ± .009       1. ± .008      1. ± .
RSD: f_Hβ = 1         . ± .007       1. ± .008      1. ± .

Table A1. Values of the isotropic dilation parameter α from the BAO fit in real and redshift space using the covariance matrix from 50 mocks. f_Hβ indicates the fraction of interlopers with respect to the total amount present in the [O III]+Hβ catalogs.

Figure A1. Results for the isotropic parameter α from the BAO fit in real space. %Hβ indicates the percentage of interlopers over the total number of galaxies. %Hβ = 0 corresponds to no interlopers and the %Hβ > 0 values correspond to half and the complete number of interlopers in the mocks, for each redshift bin. The covariance matrix has been computed from the 50 mocks.

Figure A2.

Results for the isotropic parameter α from the BAO fit in redshift space with covariance from the 50 mocks.

Space & redshift      b slope
Real: . < z < .       ( . ± . ) · −
Real: . < z < .       ( . ± . ) · −
Real: . < z < .       ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −
RSD: . < z < .        ( . ± . ) · −

Table A2.

Results for the linear fit y = 1 + b x, where y is the isotropic parameter α and x is the percentage of interlopers %Hβ, in real and redshift space.

APPENDIX B: [O III] AND Hβ LINES IN MOSDEF

In this section we present relevant results obtained using the data collected by the MOSDEF survey (Reddy et al. 2018). Fig. B1 shows the distribution of the [O III] to Hβ equivalent width ratio for galaxies in redshift bins . < z < . (orange) and . < z < . (blue). The distribution in the range . < z < . presents a peak around the value . In the Roman-like catalog generated for this work, the same PDF peaks around the value . (see Fig. 1). This indicates that the catalog might contain an overestimated number of interlopers, and the expected interloper fraction could be below within . < z < . .

Fig. B2 displays the percentage of interlopers in the MOSDEF survey. Here interlopers are defined as galaxies with Hβ flux above the detection limit (defined as σ limit) and [O III] flux below the detection limit, while the full selected sample consists of interlopers plus galaxies with [O III] flux above the detection limit. Under these assumptions, the percentage of interlopers in the sample selected from the MOSDEF survey is . in redshift range . < z < . , and . in redshift range . < z < . . Therefore, the galaxies observed in the MOSDEF survey suggest that considering an interloper fraction equal to in Roman is a conservative assumption, and that the percentage of Hβ interlopers should decrease when considering higher redshift ranges.

Figure B1. [O III]/Hβ equivalent width ratio distribution in the MOSDEF survey, at redshifts . < z < . (orange) and . < z < . (blue). [Plot axes: [O III]/Hβ versus N; legend: z=1.9-3.0, z=1.3-1.9.]

Figure B2. Fraction of Hβ interlopers in the MOSDEF survey at different redshifts. [Plot axes: z versus % interlopers.]

This paper has been typeset from a TeX/LaTeX file prepared by the author.
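The interloper definition used for Fig. B2 amounts to a simple selection on line fluxes. The sketch below assumes a hypothetical catalogue with one Hβ flux and one [O III] flux per galaxy and a single detection threshold shared by both lines; the function name, the threshold, and the toy fluxes are illustrative only and are not MOSDEF data.

```python
import numpy as np


def interloper_percentage(flux_hbeta, flux_oiii, flux_limit):
    """Percentage of H-beta interlopers in the selected sample.

    Interloper: H-beta flux above the limit and [O III] flux below it.
    Selected sample: interlopers plus galaxies with [O III] above the limit.
    """
    hb = np.asarray(flux_hbeta, dtype=float)
    o3 = np.asarray(flux_oiii, dtype=float)

    oiii_detected = o3 > flux_limit
    interloper = (hb > flux_limit) & ~oiii_detected
    selected = interloper | oiii_detected

    n_selected = np.count_nonzero(selected)
    if n_selected == 0:
        return float("nan")  # no galaxy passes the selection
    return 100.0 * np.count_nonzero(interloper) / n_selected


# Toy fluxes in arbitrary units (hypothetical, for illustration only).
toy_hbeta = [3.0, 0.5, 2.0, 0.2, 4.0]
toy_oiii = [5.0, 6.0, 0.5, 0.1, 0.3]
toy_pct = interloper_percentage(toy_hbeta, toy_oiii, flux_limit=1.0)
print(f"interloper percentage = {toy_pct:.1f}%")
```

In a real survey the detection limit generally varies with wavelength and exposure, so a per-galaxy, per-line threshold array would replace the single scalar assumed here.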