[PDF] The HI intensity mapping bispectrum including observational effects

Abstract

The bispectrum is a 3-point statistic with the potential to provide additional information beyond power spectra analyses of survey datasets. Radio telescopes which broadly survey the 21cm emission from neutral hydrogen (HI) are a promising way to probe LSS and in this work we present an investigation into the HI intensity mapping (IM) bispectrum using simulations. We present a model of the redshift space HI IM bispectrum including observational effects from the radio telescope beam and 21cm foreground contamination. We validate our modelling prescriptions with measurements from robust IM simulations, inclusive of these observational effects. Our foreground simulations include polarisation leakage, on which we use a Principal Component Analysis cleaning method. We also investigate the effects from a non-Gaussian beam including side-lobes. For a MeerKAT-like single-dish IM survey at z=0.39, we find that foreground removal causes a 8% reduction in the equilateral bispectrum's signal-to-noise ratio S/N, whereas the beam reduces it by 62%. We find our models perform well, generally providing \chi^2_\text{dof}\sim 1, indicating a good fit to the data. Whilst our focus is on post-reionisation, single-dish IM, our modelling of observational effects, especially foreground removal, can also be relevant to interferometers and reionisation studies.

Full PDF

MMNRAS , 1–19 (2020) Preprint 23 February 2021 Compiled using MNRAS L A TEX style ﬁle v3.0

The Hi intensity mapping bispectrum including observational eﬀects

Steven Cunnington ★ , Catherine Watkinson , and Alkistis Pourtsidou , School of Physics and Astronomy, Queen Mary University of London, Mile End Road, London E1 4NS, UK Department of Physics & Astronomy, University of the Western Cape, Cape Town 7535, South Africa

Accepted XXX. Received YYY; in original form ZZZ

ABSTRACT

39, we ﬁnd that foreground removal causes a 8% reduction in the equilateral bispectrum’ssignal-to-noise ratio 𝑆 / 𝑁 , whereas the beam reduces it by 62%. We ﬁnd our models perform well, generally providing 𝜒 ∼ Key words: cosmology: large scale structure of Universe – cosmology: observations – radio lines: general – methods: dataanalysis – methods: statistical

Observations of the Cosmic Microwave Background (CMB) reveal that ﬂuctuations in our Universe’s primordial density ﬁeld are consistentwith Gaussian ﬂuctuations (Planck Collaboration et al. 2020). Perfectly Gaussian random ﬁelds with zero mean are fully characterised by thetwo-point correlation function or its Fourier space equivalent, the power spectrum. As the Universe evolves, gravitational instability drivesstructure growth, a non-linear process, and hence causes departures from Gaussianity (Peebles 1980; Bernardeau et al. 2002). Therefore,when probing large-scale structure (LSS) using late Universe observations, the information contained in the power spectrum is not a completestatistical description.In order to extract information contained in the non-Gaussian components of LSS, higher order statistics can be used. Three-point correlationfunctions, and the Fourier counterpart referred to as the bispectrum, provide additional information not contained within the power spectrum.Measurements of clustering in galaxy catalogues using these 3-point statistics has been performed for decades (Peebles & Groth 1975; Groth& Peebles 1977; Fry & Seldner 1982; Jing & Borner 1998; Frieman & Gaztanaga 1999; Scoccimarro et al. 2001; Verde et al. 2002; Crotonet al. 2004; Jing & Boerner 2004; Kulkarni et al. 2007; Gaztanaga et al. 2009; Marin 2011; Marin et al. 2013). Correct interpretation of thesemeasurements requires theoretical prescriptions to model non-linear matter and bias eﬀects (Fry 1994; Angulo et al. 2015). Furthermore, sinceobservations of LSS are performed in redshift space, a correct treatment for the eﬀect of redshift space distortions (RSD) on the bispectrumis required (Hivon et al. 1995; Matarrese et al. 1997; Verde et al. 1998; Heavens et al. 1998; Scoccimarro et al. 1999; Scoccimarro 2000;Verde et al. 2002; Sefusatti et al. 2006). With these techniques in place, measurements of the galaxy bispectrum have been performed inoptical galaxy redshift surveys (e.g. Gil-Marín et al. (2017); Pearson & Samushia (2018)), which have produced constraints on cosmologicalparameters. Next generation optical surveys such as DESI (DESI Collaboration et al. 2016) and Euclid (Euclid Collaboration et al. 2020) willsoon be operational and will aim to to use the bispectrum to analyse the data they obtain. In combination with the power spectrum, this canhelp tighten the constraints on galaxy bias (Yankelevich & Porciani 2019).A complementary approach to optical galaxy surveys for probing LSS is to use Hi intensity mapping (IM) (Bharadwaj et al. 2001; Battyeet al. 2004; Wyithe et al. 2008; Chang et al. 2008). In the post-reionisation Universe, the vast majority of neutral hydrogen (Hi) is containedwithin galaxies, self-shielded from ionising radiation. This means that 21cm emission, caused from hyperﬁne transitions in Hi, will be a tracerof galaxies and thus the underlying matter density. Hi IM involves recording the unresolved, redshifted 21cm signals, in order to construct a ★ E-mail: [email protected] © a r X i v : . [ a s t r o - ph . C O ] F e b S. Cunnington et al. and its pathﬁnder MeerKATwill be reliant on the single-dish method for its LSS science cases (SKA Cosmology SWG et al. 2020; Wang et al. 2020), this is an importantobservational eﬀect to consider. Since with IM we aim to map the diﬀuse, unresolved Hi emission, observations become prone to accumulatingforeground signals in the same frequency ranges as the redshifted Hi. These foreground contaminants are caused by numerous astrophysicalprocesses such as cosmic-ray electrons accelerated by the Galactic magnetic ﬁeld causing synchrotron radiation, or free-free emission causedby free electrons scattering oﬀ ions. Techniques exist to clean these foregrounds (Liu & Tegmark 2011; Wolz et al. 2014; Shaw et al. 2015;Alonso et al. 2015; Cunnington et al. 2020a) but these inevitably also remove the Hi modes most degenerate with the foregrounds and can alsoleave some foreground residuals in the cleaned data, potentially biasing measurements.Previous work has investigated the 21cm bispectrum at high redshift epochs, during the EoR and cosmic dawn (Pillepich et al. 2007;Shimabukuro et al. 2016; Yoshiura et al. 2015; Watkinson et al. 2017; Majumdar et al. 2018; Bharadwaj et al. 2020; Mazumdar et al. 2020;Watkinson et al. 2021). The bispectrum from post-reionisation Hi surveys has been studied in Sarkar et al. (2019), who explored its real-spacesignatures with semi-analytical simulations, to probe the Hi bias. In reality however, Hi IM data will be recorded in redshift space and besubject to the eﬀect of RSD.Recent work has investigated the post-reionisation redshift space Hi intensity mapping bispectrum analytically (see e.g. Karagiannis et al.(2020); Durrer et al. (2020); Jolicoeur et al. (2020)), as well as higher-redshift studies with simulated signals (Majumdar et al. 2020; Kamranet al. 2020). However, a post-reionisation simulation-based analysis of the redshift space Hi IM bispectrum, including instrumental andforeground removal eﬀects, is yet to be performed.In this work we investigate the prospects of performing bispectrum analyses using Hi IM. We include dedicated simulations of observationaleﬀects in the data, and we develop and test modelling prescriptions. The observational eﬀects we consider come from RSD, the telescope beam,and 21cm foreground contamination and removal. We use an 𝑁 -body simulation combined with a semi-analytical model, which is applied togenerate gas masses for the galaxies and can be used to produce a brightness temperature for Hi. We emulate the observational eﬀects in thesimulated data, which allows for a comprehensive study of their signatures. Furthermore, we present modelling prescriptions for these eﬀectsand validate their performance using our simulations. We base our modelling on second order perturbation theory in redshift space, and alsoderive damping functions for the beam and foreground eﬀects. The models we present should be beneﬁcial in future analyses looking to detectthe Hi IM bispectrum signal. In addition, the models should be applicable to analytical forecasts, making them more robust and reliable.The paper is outlined as follows; in Section 2 we introduce our simulated data, including the methods used to emulate the observationaleﬀects; in Section 3 we outline the framework for modelling the Hi IM bispectrum in redshift space and modelling the observational eﬀects,presenting validation tests throughout; ﬁnally, we summarise our main results and conclude in Section 4. Here we summarise our simulated data including the Hi signal with beam smoothing, as well as the 21cm foregrounds and their removal.We choose to tailor our simulations towards emulating a low-redshift Hi experiment, since this is consistent with current and forthcomingpathﬁnder surveys e.g. MeerKLASS ((Santos et al. 2017)) and GBT ((Masui et al. 2013; Wolz et al. 2017)). However, in principle, the modellingtechniques we derive can be extended to interferometers and higher redshift studies.Our underlying cosmological Hi simulation is based on the MultiDark-Galaxies 𝑁 -body simulation (Knebe et al. 2018) with a semi-analytical application (SAGE (Croton et al. 2016)) to infer a Hi mass for each galaxy. To test our models, we select a low-redshift simulationsnapshot at 𝑧 = .

39 which is the approximate central redshift for a MeerKAT-like survey performed in the L-band (899 < 𝜈 < . < 𝑧 < .

58) (Santos et al. 2017; Pourtsidou 2018). We outline the details for the simulations in Appendix A1 and alsorefer the reader to Cunnington et al. (2020a); Soares et al. (2021); Cunnington et al. (2020b) where similar simulations were used. The ﬁnalsimulated data are over-temperature maps 𝛿𝑇 Hi ( 𝒙 , 𝑧 ) = 𝑇 Hi ( 𝒙 , 𝑧 ) − 𝑇 Hi ( 𝑧 ) , where 𝑇 Hi ( 𝑧 ) represents the mean temperature of the ﬁeld, andthese are shown in Figure 1. The maps on the left are averaged along the y-dimension and demonstrate the eﬀects RSD have along the LoS(z-direction). The other maps are showing the eﬀects of a telescope beam (middle) and foreground contamination (right), which we outline inthe following sections. skatelescope.orgMNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Figure 1.

Maps of simulated data used in this study which have volume of 1 ( Gpc / ℎ ) and are at a central redshift of 𝑧 = .

39. This is approximately equivalentto a sky survey of 3000 deg and a redshift range of Δ 𝑧 = .

4, similar to the proposed MeerKLASS Hi IM survey using MeerKAT’s L-band (Santos et al.2017; Pourtsidou 2018).

Left -maps (averaged along the y-dimension) show the eﬀects of RSD along the z-dimension (our chosen LoS direction).

Middle -maps(averaged along the z-dimension) show the eﬀect of the radio telescope beam where the bottom map has been smoothed with a 𝑅 b =

10 Mpc / ℎ symmetricGaussian kernel, acting on the dimensions perpendicular to the LoS (x and y) to emulate beam eﬀects. Right -maps (averaged along the z-dimension) demonstratethe eﬀects from foregrounds where the top map is the full observed signal inclusive of 21cm foreground emission, and the bottom map has been cleaned byremoving 10 modes using Principal Component Analysis (PCA).

To investigate the eﬀect foreground contamination has on the bispectrum, we simulated maps of 21cm foregrounds, added these onto the HiIM, and then cleaned them using Principal Component Analysis (PCA). This will suﬃciently emulate the damping of Hi power caused byforeground removal, and also produce any residual foreground contamination which is left in the data. We outline our approach to simulatingthe 21cm foreground maps, inclusive of polarisation leakage, in Appendix A2 and show a map of the foreground signal in Figure 1 (top-right).This demonstrates the dominance the foregrounds have over the Hi-only signal (see top-middle map for comparison, noting the log-scale usedfor the foregrounds).Adding these foreground maps to the Hi creates foreground dominated simulated data. We then perform a PCA foreground clean, removingthe ﬁrst 𝑁 fg principal components from the data’s frequency-frequency covariance matrix. Since the foregrounds are dominant and highlycorrelated through frequency, i.e. along the LoS, this process removes the foreground signal leaving behind the cosmological Hi we areinterested in. However, this method is imperfect and inevitably also removes Hi modes which are degenerate with the foregrounds, mosttypically large radial modes along the line-of-sight. Furthermore, not all the foreground is removed and residuals can be left in the data,especially where polarisation leakage is present, as is the case in our simulations. We the refer the reader to Cunnington et al. (2020a) for adetailed description of the PCA foreground cleaning method and its eﬃcacy on Hi intensity maps contaminated with polarised foregrounds.Figure 1 (bottom-right) shows a PCA cleaned intensity map with 𝑁 fg =

10. Some diﬀerences between this and a foreground-free map (e.g.top-middle map) are immediately apparent. There is likely a large suppression of information due to removing 10 principal components fromthe data along with some residual foreground contamination. Furthermore, note the change of scale in the (bottom-right) colour bar relative tothe other maps. Because the map has been averaged along the z-direction, and since a blind foreground clean, such as PCA, will remove eachLoS’s mean (as noted in Cunnington et al. (2019)), the range of ﬂuctuations is restricted in this type of averaged map.We note that future instruments should aim to have good control over calibration and if that is the case, less aggressive foreground cleaningwould be required than what we simulate here. However, for this work, we opt for this conservative approach as a robust test on the limits offoreground contamination on the bispectrum.

The eﬀect from the telescope beam is a smoothing to the temperature ﬁeld in directions perpendicular to the LoS. A simple, and often suﬃcient,method to simulate these beam eﬀects is to convolve the density ﬁeld with a Gaussian kernel whose FWHM ( 𝜃 FWHM ) is chosen to match the

MNRAS000

MNRAS000 , 1–19 (2020)

S. Cunnington et al.

Figure 2.

Diﬀerent simulations for the telescope beam used in our analysis.

Left -panel shows the beam pattern for a generic Gaussian beam ( black-dashed line)and a more realistic Cosine beam ( blue-solid line) which includes multiple side-lobes as a function of angular distance from the beam centre 𝜃 . The vertical grey-dotted line marks the position of the FWHM for this example frequency which was chosen to be 1000 MHz. The right -panel shows how the beam sizegiven by 𝜃 FWHM varies with frequency in a realistic beam simulation ( blue-solid line) against a constant frequency-independent beam ( black-dashed line). model of the radio telescope one is trying to emulate. We can deﬁne this Gaussian smoothing kernel with (Battye et al. 2013) B G ( 𝜈, 𝒔 ⊥ ) = exp (cid:34) − (cid:18) 𝒔 ⊥ 𝑟 ( 𝜈 ) 𝜃 FWHM ( 𝜈 ) (cid:19) (cid:35) = exp (cid:34) (cid:18) 𝒔 ⊥ 𝑅 b (cid:19) (cid:35) , (1)where 𝒔 ⊥ = √︁ Δ 𝑥 + Δ 𝑦 is the perpendicular spatial separation from the centre of the beam. 𝑅 b = 𝑟 ( 𝑧 ) 𝜎 b deﬁnes the physical size of thebeam’s central lobe in Mpc/ ℎ , where 𝜎 b = 𝜃 FWHM /( √ ) represents the standard deviation of the Gaussian kernel in radians. 𝑅 b isdependent on frequency through the comoving distance out to the the density ﬂuctuations which changes with frequency ( 𝑟 ( 𝜈 ) ). It also hasa further frequency dependence from the intrinsic beam size of the instrument, which is itself a function of frequency, generically given by 𝜃 FWHM ≈ 𝑐 / 𝑣𝐷 dish , where 𝐷 dish is the diameter of the radio telescope dish.Due to the added diﬃculty in modelling a frequency-dependent beam size, and also the complications it causes to foreground cleaning(Matshawule et al. 2020), it is common for data to be re-convolved to a common eﬀective resolution (e.g. Masui et al. (2013); Wolz et al.(2017)). The disadvantage of this procedure is the loss of information by smoothing the data to a larger resolution than the one caused by thetelescope beam. For simplicity, we mostly assume this process has been employed which allows us to characterise diﬀerent Gaussian beamcases by a single parameter and investigate the impact on the bispectrum by smoothing the Hi intensity maps with diﬀerent values of 𝑅 b . Theeﬀect of a frequency-independent Gaussian beam is demonstrated by the middle maps of Figure 1 where a smoothing with 𝑅 b =

10 Mpc / ℎ hasbeen performed on the lower map. For some context, a dish-size of 13 . 𝑧 = .

39 (the dish size and eﬀective redshift for a MeerKAT-likeL-band survey) will result in a beam pattern with 𝑅 b ∼

12 Mpc / ℎ .In order to investigate the eﬀects from a more realistic beam, inclusive of side-lobes and with a complicated frequency dependence, we alsosimulate IM data where a cosine-tapered beam pattern has been applied, given by (Condon & Ransom 2016) B C ( 𝜈, 𝒔 ⊥ ) = (cid:20) cos ( . 𝜃𝜋 / 𝜃 FWHM ( 𝜈 )) − ( . 𝜃 / 𝜃 FWHM ( 𝜈 )) (cid:21) , (2)where the beam size 𝜃 FWHM is a function of frequency. For our simulations in Cartesian space, the angular separation 𝜃 from the centre of thebeam can be given as 𝜃 = 𝒔 ⊥ 𝑟 ( 𝑧 ) . The side-lobes in this beam pattern are evident in the left panel of Figure 2, relative to the simple case of theGaussian beam (black-dashed line) (Equation 1). We have normalised the beam pattern such that it is 1 at the centre ( 𝜃 =

0) but in its applicationin the simulations it is normalised such that its integral across whole sky region is 1. Noting the decibel scale of the y-axis, it is clear that theside-lobes are expected to be very small and will likely make negligible impact by eye on the Hi IM. However, it is still necessary to carefullytest the departure from a purely Gaussian beam simulation. For example, side-lobes can cause issues in relation to foreground cleaning. Thisis due to the fact that the beam size, given by 𝜃 FWHM , changes with frequency. This can be a complicated, non-linear relationship and wasinvestigated in Matshawule et al. (2020) where a ripple model was provided. Based on this, we include a simpliﬁed version to introduce somefrequency dependence, given by 𝜃 FWHM ( 𝜈 ) = 𝑐𝜈 𝐷 dish + 𝐴 sin (cid:18) 𝜋𝜈𝑇 (cid:19) , (3)where 𝐴 = . 𝑇 =

20 MHz. The second term in Equation 3 introduces a ripple into the frequency dependent relation of thebeam size and can be seen in the right-panel of Figure 2. This frequency dependent beam pattern can cause issues for the foreground cleaningbecause point sources, or other foreground components not smooth in the angular directions, can oscillate in and out of a side-lobe’s maximadue to the oscillating beam pattern shifting the side-lobe’s position. This introduces a frequency structure to the foreground signal which cantherefore be left in the data after a foreground clean which targets smooth, frequency coherent spectra. Whilst these structures will be minimal,due to the sub-dominant side-lobe power, they can still dominate the Hi signal. To ensure we include the potential for contamination fromfar-reaching side-lobes, we perform the beam convolution on a full-sky foreground map and outline the details for this in Appendix A3.

MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Here we outline the framework for modelling the Hi IM bispectrum and present validation tests along the way with measurements from oursimulated data. We base our modelling upon second order perturbation theory which is expected to hold only in the mildly non-linear regime(Bernardeau et al. 2002; Sefusatti et al. 2006). However, this should be suﬃcient for our primary purposes, which is modelling and testingobservational eﬀects on the Hi IM bispectrum.The Hi bispectrum is deﬁned by the 3-point function in Fourier space; (cid:104) 𝛿𝑇 Hi ( 𝒌 ) 𝛿𝑇 Hi ( 𝒌 ) 𝛿𝑇 Hi ( 𝒌 )(cid:105) = ( 𝜋 ) 𝐵 Hi ( 𝒌 , 𝒌 , 𝒌 ) 𝛿 D ( 𝒌 + 𝒌 + 𝒌 ) , (4)where 𝛿𝑇 Hi ( 𝒌 ) ≡ ∫ d 𝒙 𝛿𝑇 Hi ( 𝒙 ) exp (− 𝑖 𝒌 • 𝒙 ) is the Fourier transform of the over-temperature ﬁeld 𝛿𝑇 Hi ( 𝒙 , 𝑧 ) = 𝑇 Hi ( 𝒙 , 𝑧 ) − 𝑇 Hi ( 𝑧 ) . The Diracdelta function, 𝛿 D ( 𝒌 + 𝒌 + 𝒌 ) , ensures the bispectrum is only deﬁned for closed triangles of wavevectors. We begin by deﬁning the Hibispectrum in redshift space, and then we model the observational eﬀects from Hi IM. We describe these by deﬁning damping functions 𝐷 ( 𝒌 𝑖 ) which act on the bispectrum such that 𝐵 Hiobs ( 𝒌 , 𝒌 , 𝒌 ) = 𝐵 Hi ( 𝒌 , 𝒌 , 𝒌 ) 𝐷 b ( 𝒌 , 𝒌 , 𝒌 ) 𝐷 fg ( 𝒌 , 𝒌 , 𝒌 ) , (5)where 𝐵 Hi is the redshift space bispectrum which we will outline in Section 3.1. 𝐷 b and 𝐷 fg are the damping functions for modelling theeﬀects from the telescope beam and 21cm foreground contamination, derived in Section 3.2 and Section 3.3, respectively.For measuring the bispectrum in our simulated data, we use the publicly available code bifft (Watkinson et al. 2017). bifft exploitsFast-Fourier Transforms to enforce the Dirac-delta function of Equation 5 to drastically speed up the calculation of the bispectrum (Scoccimarro2015). An operation that would naively be a series of nested loops through the dataset to ﬁnd the bispectrum contribution of all triangles thatconform to a given conﬁguration, becomes one whose main overhead consists of six Fast-Fourier transforms and four loops through the dataset.In other words the code is fast, taking of order a few seconds per triangle conﬁguration when run on a MacBookPro (2.3GHz i9 intel core,16Gb RAM) for a datacube with 256 pixels on a side. The method used by bifft is described and tested against a direct-sampling method inWatkinson et al. (2017) and Majumdar et al. (2018). A succinct explanation of the code’s inner workings is also provided in Watkinson et al.(2021). We note that in this work we have also cross-checked bifft with another publicly available code, Pylians (Villaescusa-Navarroet al. 2018), and found agreement ( Pylians applies the direct-sampling method).

The matter density ﬁeld 𝛿 is isotropic in real space. However, observations of Hi which trace the underlying density are conducted in redshiftspace and therefore a particular mode’s measurement will now depend on its direction of alignment relative to the LoS i.e. 𝐵 ( 𝑘 , 𝑘 , 𝑘 ) → 𝐵 ( 𝒌 , 𝒌 , 𝒌 ) . Therefore, any modelled Hi bispectra ﬁtted to data will need to account for the anisotropies introduced by RSD. In this work weexclusively operate in a Cartesian space, thus the plane-parallel approximation is exactly valid and we can parameterise the alignment of modesto the LoS with 𝜇 𝑖 = 𝒌 𝑖 • ˆz / 𝑘 𝑖 ≡ 𝑘 𝑖, / 𝑘 𝑖 (Kaiser 1987). Since the bispectrum is deﬁned for closed triangles such that 𝒌 = −( 𝒌 + 𝒌 ) , thebispectrum in redshift space is a function of ﬁve variables. Following the formalism from Scoccimarro et al. (1999), we use three parameterswhich describe the shape of the triangle 𝑘 , 𝑘 and the angle 𝜃 between them i.e. cos 𝜃 ≡ ˆ 𝒌 • ˆ 𝒌 . The other two variables describe the orientationof the triangle relative to the LoS; 𝜔 = cos − ( 𝜇 ) and the azimuthal angle 𝜙 about ˆ 𝒌 . This provides the expressions; 𝜇 = 𝜇 = cos 𝜔 = ˆ 𝒌 • ˆz , 𝜇 = 𝜇 cos 𝜃 − √︃(cid:0) − 𝜇 (cid:1) sin 𝜃 cos 𝜙 , 𝜇 = − 𝑘 𝑘 𝜇 − 𝑘 𝑘 𝜇 . (6)On large scales in the linear regime, the Hi over-temperature ﬁeld is given by 𝛿𝑇 Hi ( 𝒌 ) = 𝑇 Hi 𝑏 Hi 𝛿 ( 𝒌 ) where 𝑏 Hi represents the linear bias.The eﬀect of measuring a Fourier component of this ﬁeld in redshift space can be modelled as 𝛿𝑇 Hi ( 𝒌 ) → 𝛿𝑇 sHi ( 𝒌 ) = 𝑇 Hi 𝑍 ( 𝒌 ) 𝛿 ( 𝒌 ) (Kaiser1987), where the superscript s denotes that the quantity is in redshift space. This is the only time we use the notation 𝛿𝑇 sHi ( 𝒌 ) to denote aquantity in redshift space. In all other cases we drop the supscript s for brevity. Unless clearly stated, we will always be working in redshiftspace. The factor 𝑍 is often referred to as the Kaiser factor and is given by (Kaiser 1987) 𝑍 ( 𝒌 ) = 𝑏 Hi + 𝑓 𝜇 , (7)where 𝑓 is the linear growth rate of structure, approximated by 𝑓 (cid:39) Ω m ( 𝑧 ) . (Linder 2005).For future Hi IM observations, where we aim for precise measurements and constraints, it will be necessary to include non-linear eﬀects incosmological clustering statistics to avoid signiﬁcant discrepancies in the determination of the Hi bias and other parameters (Matarrese et al.1997; Mann et al. 1998; Castorina & White 2019). Since using the bispectrum to break degeneracies and determine bias parameters is seenas one of its primary beneﬁts, it is necessary to have accurate modelling prescriptions for it. From Eulerian perturbation theory, in which weassume a local, non-linear bias between the Hi over-density ( 𝛿 Hi ) and the underlying matter ﬂuctuations ( 𝛿 ), we can Taylor expand in 𝛿 (Fry &Gaztanaga 1993); 𝛿 Hi = ∑︁ 𝑖 𝑏 𝑖 𝑖 ! 𝛿 𝑖 , (8) bitbucket.org/caw11/biﬀt/src/master pylians3.readthedocs.io/en/master/Bk.html MNRAS , 1–19 (2020) S. Cunnington et al. and only retain terms up to 𝑖 = 𝑖 = 𝒌 = 𝛿 Hi ≡ 𝛿𝑇 Hi 𝑇 Hi = 𝑏 𝛿 + 𝑏 𝛿 , (9)where 𝑏 ≡ 𝑏 Hi is the linear bias parameter and 𝑏 is the non-linear (second order) bias. Other studies have considered extensions to thiswhich include non-local terms (e.g. Yankelevich & Porciani (2019)) and it has been argued that neglecting the non-local bias terms can leadto inaccurate predictions of bias parameters (Gil-Marín et al. 2014). However, for exploring observational eﬀects on the Hi IM bispectrum,consideration of non-local biasing eﬀects should not be required.To describe the Hi bispectrum in redshift space, we apply the standard redshift space kernels (see Heavens et al. (1998); Scoccimarro et al.(1999) for derivations) such that 𝐵 Hi ( 𝒌 , 𝒌 , 𝒌 ) = 𝑇 [ 𝑍 ( 𝒌 ) 𝑍 ( 𝒌 ) 𝑍 ( 𝒌 , 𝒌 ) 𝑃 lin ( 𝑘 ) 𝑃 lin ( 𝑘 ) + cycl. ] 𝐷 FoG ( 𝒌 , 𝒌 , 𝒌 , 𝜎 B ) , (10)where cycl. represents cyclic permutations which run over all possible pairs of 𝒌 , 𝒌 and 𝒌 . 𝑃 lin represents the real-space, linear matter powerspectrum for which we use the the CLASS Boltzmann solver (Lesgourgues 2011; Blas et al. 2011). 𝑍 is given in Equation 7 and 𝑍 denotesthe second-order kernel and is given by 𝑍 ( 𝒌 𝑖 , 𝒌 𝑗 ) = 𝑏 𝐹 ( 𝒌 𝑖 , 𝒌 𝑗 ) + 𝑓 𝜇 𝑖𝑗 𝐺 ( 𝒌 𝑖 , 𝒌 𝑗 ) + 𝑓 𝜇 𝑖𝑗 𝑘 𝑖𝑗 (cid:20) 𝜇 𝑖 𝑘 𝑖 𝑍 ( 𝒌 𝑗 ) + 𝜇 𝑗 𝑘 𝑗 𝑍 ( 𝒌 𝑖 ) (cid:21) + 𝑏 , (11)where 𝒌 𝑖𝑗 = 𝒌 𝑖 + 𝒌 𝑗 and 𝜇 𝑖𝑗 = 𝒌 𝑖𝑗 • ˆ z / 𝑘 𝑖𝑗 . 𝐹 and 𝐺 denote the second-order kernels for the real-space density and velocity ﬁelds and are givenby 𝐹 ( 𝒌 𝑖 , 𝒌 𝑗 ) = + 𝑚 𝑖𝑗 (cid:18) 𝑘 𝑖 𝑘 𝑗 + 𝑘 𝑗 𝑘 𝑖 (cid:19) + 𝑚 𝑖𝑗 , (12) 𝐺 ( 𝒌 𝑖 , 𝒌 𝑗 ) = + 𝑚 𝑖𝑗 (cid:18) 𝑘 𝑖 𝑘 𝑗 + 𝑘 𝑗 𝑘 𝑖 (cid:19) + 𝑚 𝑖𝑗 , (13)where 𝑚 𝑖𝑗 = ( 𝒌 𝑖 • 𝒌 𝑗 )/( 𝑘 𝑖 𝑘 𝑗 ) . The ﬁnal term in Equation 10, 𝐷 FoG , is a phenomenological factor to address some non-linear RSD eﬀects notsuﬃciently modelled by the redshift kernels alone. On smaller scales, internal motion inside virialized structures produces a radial smearing tothe density ﬁeld in redshift space, known as the Fingers-of-God (FoG) eﬀect (Jackson 1972). It is common to include a term which describesthe FoG (Taruya et al. 2010), even when including higher order perturbation theory terms and should be seen as a phenomenological dampingrequired to correct for non-linear eﬀects (Verde et al. 1998; Gil-Marín et al. 2014). For our choice of model, this factor is given by (Gil-Marínet al. 2015) 𝐷 FoG ( 𝒌 , 𝒌 , 𝒌 , 𝜎 B ) = (cid:20) + (cid:16) 𝑘 𝜇 + 𝑘 𝜇 + 𝑘 𝜇 (cid:17) 𝜎 (cid:21) − . (14)As a useful data-compression technique, and similar to the multipole expansion of the power spectrum into Legendre polynomials (seeCunnington et al. (2020b); Soares et al. (2021) for applications to Hi IM), the dependence on the orientation of a triangle of wavevectors,parameterised by 𝜔 and 𝜙 , can be decomposed into spherical harmonics (Scoccimarro et al. 1999; Scoccimarro 2015) 𝐵 ( 𝒌 , 𝒌 , 𝒌 ) = ∞ ∑︁ ℓ = ℓ ∑︁ 𝑚 = − ℓ 𝐵 ℓ𝑚 ( 𝑘 , 𝑘 , 𝜃 ) 𝑌 ℓ𝑚 ( 𝜔, 𝜙 ) , (15)where 𝐵 ℓ𝑚 ( 𝑘 , 𝑘 , 𝑘 ) = ∫ + − ∫ 𝜋 𝐵 ( 𝒌 , 𝒌 , 𝒌 ) 𝑌 ∗ ℓ𝑚 ( 𝜃, 𝜙 ) d cos ( 𝜃 ) d 𝜙 . (16)This shares the bispectrum signal between the diﬀerent multipoles 𝐵 ℓ𝑚 . To avoid working with the full multipole decomposition ( ℓ, 𝑚 ) , it iscommon to focus on the coeﬃcients with 𝑚 =

0, referred to as the redshift-space multipoles and corresponds to averaging over 𝜙 . In this casewe can decompose the bispectrum with Legendre polynomials; 𝐵 ( 𝒌 , 𝒌 , 𝒌 ) = ∞ ∑︁ ℓ = 𝐵 ℓ ( 𝑘 , 𝑘 , 𝜃 )L ℓ ( 𝜇 ) . (17)In this analysis we focus solely on the monopole, and we refer the interested reader to Yankelevich & Porciani (2019) for the higher ordermultipoles description. The monopole ( ℓ =

0) with L =

1, equates to an averaging over 𝜇 so that the bispectrum monopole is given by 𝐵 ( 𝒌 , 𝒌 , 𝒌 ) = 𝜋 ∫ + − d 𝜇 ∫ 𝜋 d 𝜙 𝐵 ( 𝒌 , 𝒌 , 𝒌 , 𝜔, 𝜙 ) . (18)We begin by investigating an equilateral triangle conﬁguration of the bispectrum, a special case with 𝑘 = 𝑘 = 𝑘 . In Figure 3 we plot themeasured equilateral bispectrum monopole for the simulated MultiDark Hi IM, omitting for now any beam, foreground, or thermal noiseeﬀects, which we introduce in the following sections. We show the dimensionless bispectrum, 𝑘 𝐵 ( 𝑘 )/( 𝜋 ) , and stick to the dimensionless This normalisation is commonly referred to as dimensionless in the literature since spatial dimensions have been normalised out. However, for radio IM, thebispectrum will still have units of mK .MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Figure 3.

Equilateral redshift space Hi bispectrum monopole from simulatedHi intensity maps. We model the RSD case with the red-dashed line and ﬁndgood agreement with the data ( red-circular points). The other line-styles showmodels with diﬀering 𝜎 B . For comparison, we also show the bispectrum forthe simulation in real space, i.e. without RSD ( black-square points). Figure 4.

Hi bispectrum monopole for isosceles conﬁgurations, for fourdiﬀerent ﬁxed sizes of 𝑘 = 𝑘 (given in panel titles) as a function of avarying 𝑘 . Overlaid as dashed line is our model inclusive of RSD eﬀectswith 𝜎 B = Figure 5.

Residual comparison between data and model for redshift space bispectrum monopole for Hi IM for diﬀerent isosceles conﬁgurations. Vertical grey-dotted line marks the point where 𝑘 = 𝑘 , 𝑘 i.e. the equilateral conﬁguration. convention in all subsequent plots. The red-circular data points represent redshift space measurements but we also plot the real spacemeasurements (black-squares) to demonstrate the eﬀects RSD have on the bispectrum and motivate their modelling in order to avoid biased(incorrect) results. In order to obtain error-bars in Figure 3 (and subsequent plots) we employ a jackkniﬁng technique (Norberg et al. 2009)using 64 jackknife regions to compute the covariance matrix and use the diagonal elements as our error estimates. We found the contribution tothe covariance from the oﬀ-diagonal elements is minimal, with the exception of the small- 𝑘 bins (see Appendix B and Figure B1). In any case,the uncorrelated error assumption we make is reasonable for our purposes. We ﬁnd errors increase with decreasing 𝑘 due to cosmic-variancelimitations as one would expect. We note that a realistic experiment would contain some thermal noise and would therefore slightly increaseerrors on smaller scales. However, as we demonstrate later in Section 3.4, purely thermal noise should have little impact on the Hi IMbispectrum. It is the results with RSD which we ﬁt the redshift space Hi bispectrum model to (red-dashed line). We use known input ﬁducialparameters 𝑇 Hi = . 𝑓 = . { 𝑏 , 𝑏 , 𝜎 B } . Best-ﬁt analyses using theseparameters have been performed in the optical galaxy surveys literature, for example in Gil-Marín et al. (2015). At the low redshifts weconsider, these works have found that the modelling breaks down already at 𝑘 > . ℎ / Mpc. Since our main goal in this paper is to quantifyand model the observational eﬀects related to Hi IM observations, with the beam eﬀects dominating at the non-linear regime, we do not

MNRAS000

MNRAS000 , 1–19 (2020)

S. Cunnington et al. attempt to perform a best-ﬁt analysis. Instead, we ﬁnd sensible values by eye and keep them ﬁxed throughout this work, concentrating on themodelling of the beam and foreground removal eﬀects. These are { 𝑏 , 𝑏 , 𝜎 B } = { . , . , } . We demonstrate the eﬀect of changing 𝜎 B inFigure 3 by plotting some non-zero values (note that Gil-Marín et al. (2015) ﬁnd 𝜎 B ∼

10 for their tracers, but trusting the model only up to 𝑘 max = . ℎ / Mpc). As can be seen this damps the bispectrum at high- 𝑘 , as expected. However, we ﬁnd this makes agreement worse withour data, which include highly non-linear scales up to 𝑘 max = . ℎ / Mpc. Our simulations should encapsulate some FoG eﬀects, and they canindeed be seen by comparing the RSD simulation results to the no-RSD case (black-squares) at high- 𝑘 . Here, we see the RSD data begin tolower in amplitude relative to the no-RSD results. Therefore, the lack of agreement at high- 𝑘 with a FoG model can be explained by a failingof the model at these non-linear scales. We will return to the discussion on non-linear eﬀects and modelling in Section 3.1.1. With the caveatsdiscussed above in mind, we ﬁnd a good agreement between data and model, with a 𝜒 ∼ 𝑘 and 𝑘 to four diﬀerent values as shown in thepanel titles, then plot results for a varying 𝑘 . Again, we overlay our model (dashed line) and see reasonable agreement. The isosceles modelsalso show a tendency to under-predict at high- 𝑘 and have perhaps greater discrepancies than the equilateral results of Figure 3. It is also noteasy to identify a ﬁxed scale at which all models breakdown but we discuss this next in Section 3.1.1. We note that we have not applied anycorrections for aliasing eﬀects, which could in principle be causing some discrepancies at high- 𝑘 . However, we performed tests to investigatethis and found no evidence it is causing noticeable eﬀects at the scales we are interested in. We discuss this further in Appendix C. As we have already mentioned, our simulated data are at a low redshift, 𝑧 = .

39, in order to emulate current pathﬁnder surveys. Suﬃcientlymodelling non-linear scales at these redshifts is diﬃcult, and current state-of-the-art models struggle to accurately model the biased redshiftspace bispectra above 𝑘 max ∼ . ℎ / Mpc (Gil-Marín et al. 2015; Lazanu et al. 2016; Yankelevich & Porciani 2019) for the purposes ofprecision cosmology. Developing an accurate model of a Hi IM bisepctrum well into non-linear scales that cannot be treated perturbatively isbeyond the aims of the paper. With this considered, our approach is performing as one would reasonably expect from previous work.In general, we expect non-linear eﬀects to become more important at high- 𝑘 and there will therefore be some maximum scale which we cansuﬃciently model up to. Figure 3 shows that even with our mildest assumption of FoG ( 𝜎 B =

10) we begin to see a divergence between dataand model at 𝑘 (cid:38) . ℎ / Mpc. The 𝜎 B = 𝑘 . Figure 5 shows the agreement between data and model for diﬀerent isosceles conﬁgurations. In general, we achievea high signal-to-noise ratio and good agreement for 𝑘 (cid:38) . ℎ / Mpc (below this, the error for cosmic variance is large). Eventually though,agreement starts to worsen at higher- 𝑘 due to non-linear eﬀects. We ﬁnd that agreement begins to deteriorate at diﬀerent scales depending onthe conﬁguration. In the 𝑘 = 𝑘 = . ℎ / Mpc case, we see good agreement up to 𝑘 ∼ . ℎ / Mpc. But discrepancies begin at much lower 𝑘 for lower 𝑘 , 𝑘 values. The vertical grey dotted line in Figure 5 marks the equilateral triangle point, i.e. where the conﬁguration moves fromsqueezed ( 𝑘 , 𝑘 < 𝑘 ) to squashed ( 𝑘 , 𝑘 > 𝑘 ). In general we ﬁnd that model agreement begins to suﬀer when the conﬁguration moves intothe squashed regime.A more conclusive investigation of the non-linear modelling of Hi bispectra (and power specta) would be very valuable for 21cm preci-sion cosmology, but this would require highly sophisticated simulations, ideally hydrodynamical such as IllustrisTNG (see investigation inVillaescusa-Navarro et al. (2018)). It is possible there will some diﬀerences from the results for optical galaxy surveys, for example FoG areexpected to be higher within Hi, due to Hi only being present in haloes above a certain mass (Villaescusa-Navarro et al. 2018). However, sincethe primary aim of this work is to investigate the observational eﬀects more unique to Hi intensity mapping, namely the telescope beam andforeground contamination, we leave a more detailed analysis of non-linear eﬀects in the Hi ﬁeld for future work. The eﬀect from the telescope beam is to smooth the density ﬁeld in all directions perpendicular to the LoS and therefore its eﬀect on a Fouriercomponent of the Hi over-temperature ﬁeld can be modelled as 𝛿𝑇 Hi ( 𝒌 ) → 𝛿𝑇 smHi ( 𝒌 ) = B b 𝛿𝑇 Hi ( 𝒌 ) , where 𝛿𝑇 smHi denotes a smoothed quantityand B b represents the beam function (see Appendix A3 for more details). For the case of a Gaussian frequency-independent beam, i.e. onewhose FWHM does not vary in size along the LoS and has constant size given by the physical scale 𝑅 b , we can Fourier transform the Gaussianbeam function in Equation 1 to get the beam function B b ( 𝒌 ) = exp (cid:20) − 𝑘 ⊥ 𝑅 (cid:21) = exp (cid:20) − 𝑘 ( − 𝜇 ) 𝑅 (cid:21) . (19)This demonstrates that the beam will damp large 𝑘 ⊥ modes. We can model this eﬀect on the bispectrum by considering the combinedcontributions from smoothed modes, which provides the damping term required for Equation 5, and is given by 𝐷 b ( 𝒌 , 𝒌 , 𝒌 ) = B b ( 𝒌 ) B b ( 𝒌 ) B b ( 𝒌 ) = exp (cid:40) − (cid:34) 𝑘 (cid:16) − 𝜇 (cid:17) + 𝑘 (cid:16) − 𝜇 (cid:17) + 𝑘 (cid:16) − 𝜇 (cid:17)(cid:35) 𝑅 (cid:41) . (20) MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Figure 6.

Eﬀects on the equilateral Hi bispectrum monopole from diﬀerentbeam sizes denoted by 𝑅 b . All cases are for a simple Gaussian, frequencyindependent beam. Models using Equation 21, are overlaid as dashed lines.We show the reduced 𝜒 measurements for each beam case in the legend,which demonstrate a good agreement between model and data in most cases. Figure 7.

Eﬀects of a 𝑅 b = / ℎ Gaussian frequency-independent beamon diﬀerent isosceles conﬁgurations for the Hi IM bispectrum monopole( blue-squares ). Included for comparison, are the no beam ( 𝑅 b =

0) case( black-circles ) results.

Dashed -lines are the models using Equation 21.

Figure 8.

Residual comparison between no beam and 𝑅 b = / ℎ for the Hi IM bispectrum monopole. We show the agreement between data and model fordiﬀerent isosceles conﬁgurations. Vertical grey-dotted line marks the point where 𝑘 = 𝑘 , 𝑘 i.e. the equilateral conﬁguration. Thus the Hi bispectrum monopole, with observational eﬀects from the telescope beam, can be modelled as 𝐵 Hi0 ( 𝒌 , 𝒌 , 𝒌 ) = ∫ + − ∫ 𝜋 𝐵 Hi ( 𝒌 , 𝒌 , 𝒌 ) exp (cid:40) − (cid:34) 𝑘 (cid:16) − 𝜇 (cid:17) + 𝑘 (cid:16) − 𝜇 (cid:17) + 𝑘 (cid:16) − 𝜇 (cid:17)(cid:35) 𝑅 (cid:41) d 𝜇 d 𝜙 , (21)where 𝐵 Hi is the redshift space bispectrum in Equation 10 and expressions for 𝜇 , 𝜇 , 𝜇 can be found in Equation 6.We begin by showing this model applied to our simulated data measurements for the equilateral case, where we know, from the previousresults in Figure 3, that the model without beam eﬀects was working suﬃciently well. In Figure 6 we show the results for a range of increasing 𝑅 b values which represents an increasing physical beam size. The impact from an increasing beam is seen in the results which show that ahigher 𝑅 b amounts to more damping on the bispectrum at high- 𝑘 , as expected, since the contributions from high- 𝑘 ⊥ are being restricted.We ﬁnd our model agrees well with the simulated data results across all beam sizes we test for the equilateral monopole and show the 𝜒 results in the legend of Figure 6. For modest beam sizes we see a good agreement with 𝜒 ∼ 𝑅 b , which can be indicative of over-ﬁtting. This could mean that we are over-estimating the errors for the highly-smoothed cases MNRAS000

0) case. Despite the less trivial distortions to the bisepctra, our modelling seems consistent in all cases with measurements fromthe simulated data. Again, we see some general discrepancies at higher- 𝑘 , this is the best demonstrated by Figure 8 where we show a direct datamodel comparison for diﬀerent isosceles cases. We again attribute the high- 𝑘 discrepancies to non-linear eﬀects. Since the telescope beam isa smoothing of the ﬁeld, it is plausible to expect some alleviation of non-linear eﬀects at high- 𝑘 , however, this is only a smoothing of modesperpendicular to the line-of-sight, and hence non-linear eﬀects can still dominate radially.These results are only for the simple case of a Gaussian beam. We will investigate the impact from more complex beam patterns in Section 3.5. As discussed in Section 2, several methods exist for removing dominant foregrounds from Hi intensity maps but these are imperfect and causesome contamination and damping to the power spectrum and bispectrum. The most notable of these eﬀects is the damping of large modesalong the LoS, which are the modes most degenerate with the foregrounds. This will cause a reduction in contribution from small- 𝑘 modeswhich we look to model here. To test this model we employ a PCA method, the most commonly used approach to foreground cleaning, whichremoves the ﬁrst 𝑁 fg principal components from the frequency-frequency covariance matrix.We can model the impact from this clean on a Fourier component of the Hi over-temperature ﬁeld by considering 𝛿𝑇 Hi ( 𝒌 ) → 𝛿𝑇 cleanHi ( 𝒌 ) = B fg 𝛿𝑇 Hi ( 𝒌 ) where 𝛿𝑇 cleanHi denotes a Fourier component from a foreground cleaned IM and B fg represents the foreground cleaning dampingfunction (Bernal et al. 2019). To capture the main impact from foreground cleaning, we thus require B fg to be some function which progressivelydamps the contribution to the signal from small- 𝑘 modes. There are various ways to perform this and we choose to adapt and extend themodelling used Soares et al. (2021). For the power spectrum, the foreground damping function was given as B fg ( 𝒌 ) = − exp  − (cid:169)(cid:173)(cid:171) 𝑘𝑘 fg (cid:170)(cid:174)(cid:172)  = − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇𝑘 fg (cid:170)(cid:174)(cid:172)  , (22)where 𝑘 fg is a free parameter governing the extent of information loss due to foreground cleaning. A higher 𝑘 fg , would mean more severedamping to modes from a more aggressive clean. We can model this eﬀect on the bispectrum by considering the combined contributions fromforeground damped modes, which provides the damping term required for Equation 5, and is given by 𝐷 fg ( 𝒌 , 𝒌 , 𝒌 ) = B fg ( 𝒌 ) B fg ( 𝒌 ) B fg ( 𝒌 ) = (cid:40) − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  (cid:41)(cid:40) − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  (cid:41)(cid:40) − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  (cid:41) . (23)Thus the Hi bispectrum monopole, with damping caused by a foreground clean, can be modelled as 𝐵 Hi0 ( 𝒌 , 𝒌 , 𝒌 ) = ∫ + − ∫ 𝜋 𝐵 Hi ( 𝒌 , 𝒌 , 𝒌 ) (cid:40) − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  − exp  − (cid:169)(cid:173)(cid:171) 𝑘 𝜇 𝑘 fg (cid:170)(cid:174)(cid:172)  + exp  − 𝑘 𝜇 + 𝑘 𝜇 (cid:16) 𝑘 fg (cid:17)  + exp  − 𝑘 𝜇 + 𝑘 𝜇 (cid:16) 𝑘 fg (cid:17)  + exp  − 𝑘 𝜇 + 𝑘 𝜇 (cid:16) 𝑘 fg (cid:17)  − exp  − 𝑘 𝜇 + 𝑘 𝜇 + 𝑘 𝜇 (cid:16) 𝑘 fg (cid:17)  (cid:41) d 𝜇 d 𝜙 , (24)where 𝐵 Hi is the redshift space bispectrum in Equation 10 and expressions for 𝜇 , 𝜇 , 𝜇 can be found in Equation 6.For investigating the impact from foregrounds we have removed the telescope beam from the simulation to avoid compounding two strongobservational eﬀects (although we test this combination in Section 3.5). We still include RSD in the simulations, hence our use of the redshiftspace bispectrum in Equation 24. We begin by presenting results in the equilateral conﬁguration and Figure 9 shows the measured bispectrumfor our simulated Hi IM inclusive of the foreground contamination (outlined in Appendix A2), then foreground cleaned to diﬀerent levels,parameterised by 𝑁 fg , which are the number of principal components removed from the frequency-frequency covariance. A higher 𝑁 fg willremove more foreground contaminant but damp the Hi signal more drastically and this is what we see in Figure 9 at small- 𝑘 . Unlike themodelling for the Gaussian beam, where we knew the exact value for the parameter 𝑅 b needed to model the results, the foreground clean modelis more phenomenological and requires ﬁtting the free parameter 𝑘 fg to match results. This represents a ﬂexible way to account for foregroundcleaning eﬀects since 𝑘 fg parameter can be treated as a nuisance parameter and marginalised over when constraining cosmological parameters,as demonstrated in Cunnington et al. (2020c).To model the three cases of 𝑁 fg = , ,

13 we use ﬁtted values of 𝑘 fg = . × − , . × − , . × − ℎ / Mpc respectively. Theseprovide good reduced 𝜒 results as shown in the legend of Figure 9. We see a similar trend to that seen in the beam results of Figure 6 where MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Figure 9.

Eﬀects on the equilateral Hi bispectrum monopole from diﬀerentlevels of foreground cleaning. The black-circle points represent Hi intensitymaps with no foregrounds and therefore no foreground clean is required.For the other data, 𝑁 fg denotes the number of components removed in aPCA clean. The dashed -lines are the models using Equation 24 with 𝑘 fg (cid:107) = . × − , . × − , . × − ℎ / Mpc respectively. The 𝜒 measurementsfor each case are shown in the legend, which demonstrate a good agreementbetween model and data. Figure 10.

Eﬀects of a 𝑁 fg =

10 PCA foreground clean ( orange-squares ),on diﬀerent isosceles conﬁgurations for the Hi IM bispectrum monopole.Included for comparison, are the foreground-free ( black-circles ) results.

Figure 11.

Residual comparison between foreground-free and foreground contaminated Hi IM bispectrum monopole. We show the agreement between data andmodel for diﬀerent isosceles conﬁgurations. Vertical grey-dotted line marks the point where 𝑘 = 𝑘 , 𝑘 i.e. the equilateral conﬁguration. the 𝜒 are decreasing as the bispectrum is damped more severely. This is most likely indicating that we are over-ﬁtting the severely dampedmodes (in this case at low- 𝑘 ) that are fairly consistent with zero, but have a large error from the jackkniﬁng process.We then explore some isosceles conﬁgurations in Figure 10 where a 𝑁 fg =

10 foreground clean was performed on the simulated data. Wealso provide the foreground-free results for comparison and the eﬀects from the foreground clean are fairly intuitive with, in general, moredamping to the bispectrum at smaller- 𝑘 in the foreground cleaned results. It is interesting to note that this is more evident in the changing 𝑘 = 𝑘 values, i.e. the top-left panel appears to show more damping than the bottom-right. Whereas the damping appears slightly moreuniform across the range of 𝑘 . Again we show the respective models as dashed lines, which are following these features measured in thedata and in all cases are showing good agreement. The direct data and model comparison in Figure 11 demonstrates this nicely. Again we seesome discrepancies between data and model, generally at high- 𝑘 from non-linear eﬀects, but these discrepancies are consistent between the MNRAS000 , 1–19 (2020) S. Cunnington et al. foreground-free modelling and the foreground cleaned one. In other words, we see no evidence that these discrepancies are exacerbated in theforeground contamination cases and thus conclude that the model is performing well.Our main focus in this work regarding foreground eﬀects has concerned signal attenuating, occurring typically on larger modes. However,a foreground clean will also inevitably leave some residual foreground in the data which could contaminate a bispectrum measurement acrossall modes by causing an additive bias. Whilst simulations seem to suggest that this should be a minor impact, but still relevant for precisioncosmology (Cunnington et al. 2020a), it is clear that systematics are a big problem in real Hi IM data sets (Switzer et al. 2013). So far we haverelied on cross-correlations with optical surveys to bypass the large systematics currently contained within pathﬁnder IM observations (Masuiet al. 2013), but it remains unclear how much of this systematic contribution is coming from residual foregrounds.Finally, we should note that an alternative approach for addressing the damping from foreground cleaning is to employ a foreground transferfunction (see Switzer et al. (2015); Cunnington et al. (2020a) for details) which is a data-driven approach using mocks, injected with the realforeground and systematics contaminated data, then cleaned. The impact on the mock clustering statistics is then used to construct the transferfunction which is applied to the real data to reverse the Hi signal loss eﬀects from the foreground clean. This technique has been used forHi IM power spectra measurements with pathﬁnder surveys (Masui et al. 2013; Switzer et al. 2013; Anderson et al. 2018; Wolz et al. 2021).However, using this in the context of a bispectrum measurement would be more cumbersome and computationally expensive. Importantly,previous works have shown that foreground removal eﬀects can be degenerate with cosmological parameters (Cunnington et al. 2020c; Soareset al. 2021). Therefore, for precision cosmology the transfer function approach needs to be studied further.

Unlike galaxy surveys, shot-noise should not be a limiting factor for a radio telescope conducting an Hi IM survey (Battye et al. 2004; Changet al. 2008). Instead, the main source of noise comes from thermal motion of electrons inside the electronics of the instrument which produceGaussian-like ﬂuctuating currents, with a mean current of zero but a non-zero rms. The consequence from this is a component of white-noisecontained in the maps. From the radiometer equation, the rms of the thermal noise contained in time-ordered data for an instrument withsystem temperature 𝑇 sys , with frequency and time resolution 𝛿𝜈 and 𝛿𝑡 , will be given by 𝑇 sys /√ 𝛿𝜈 𝛿𝑡 (Wilson et al. 2009). At map level thiswill create a ﬁeld of white noise added into the data, with rms 𝜎 N . In the case of the power spectrum this produces an additive component; 𝑃 Hi → 𝑃 Hi + 𝑃 N where 𝑃 N = 𝜎 / 𝑉 cell . However, since the ﬂuctuations in this thermal noise are Gaussian, the thermal noise bispectrum shouldbe zero and introduce no additive component to the modelled Hi IM bispectrum. However, statistical ﬂuctuations of the noise do introduce anerror contribution to the bispectrum. This was investigated and concluded in the context of EoR observations in Yoshiura et al. (2015).We investigated this in our simulations by adding on to our Hi IM, a Gaussian ﬁeld ﬂuctuating around zero, with a rms of 𝜎 N . As expected wefound no additive bias to the bispectrum from these tests with a range of 𝜎 N but did ﬁnd an increase in the errors obtained from our jackkniﬁngprocedure. This is demonstrated in Figure 12 (top-panel) where we plot the signal-to-noise ( 𝑆 / 𝑁 ) i.e. the ratio between the amplitude and theerror ( 𝛿𝐵 ) on the bispectrum, for the Hi IM monopole as a function of 𝑘 for the equilateral conﬁguration. The increase in errors, and thusa reduction in 𝑆 / 𝑁 , from thermal noise is very marginal at low- 𝑘 but more signiﬁcant at higher- 𝑘 . However, we have used extremely highvalues of 𝜎 N to demonstrate this. Indeed, a value of 𝜎 N = . 𝜎 N = . 𝑆 / 𝑁 and no thermal noise bias. Whilst the Gaussian assumption regarding thermal noise is areasonable one, Hi IM is likely to have additional noise-like contributions from systematic eﬀects such as RFI and 1/ 𝑓 processes (Harper &Dickinson 2018; Li et al. 2020). These could provide non-Gaussian contributions and thus suﬃcient calibration would be required to avoidbiasing the bispectrum. This presents the possibility for using the bispectrum to characterise systematics in pathﬁnder Hi IM surveys, but weleave an exploration of this for future work. The beam results we presented in Section 3.2 assumed a Gaussian beam with a 𝑅 b that does not change with frequency. In reality the beampattern is more complex, contains side-lobes, and will change in size as a function of frequency. For Section 3.2, we therefore assumed that aperfect re-convolution had been performed to completely neutralise these complexities. This represents an unrealistic assumption and we willnow investigate the consequences of relaxing it.A non-Gaussian beam with side-lobe structure could introduce some troubling eﬀects into a statistical measurement of non-Gaussianity.We introduced the cosine beam pattern with side-lobes in Figure 2 and we now investigate the results from a bispectrum measurement ofthe Hi IM simulations with this beam applied. Figure 13 shows the bispectrum monopole results from the IM with a cosine beam for theequilateral conﬁguration (blue triangles). We attempt to model this with the same Gaussian beam model as before (Equation 21) using a valueof 𝑅 b = .

75 Mpc / ℎ ﬁtted by eye (blue dashed line). The agreement looks quite reasonable, however, the reduced 𝜒 statistic for this ﬁt isshown in the legend, and it is clear that this is indicating a poorer ﬁt relative to the no-beam case (black-circles), and the Gaussian beam casesfrom Figure 6. This is perhaps expected since we are using the same Gaussian beam model, but now on IM data with a non-Gaussian beam.Furthermore, we have not made any corrections for the frequency-dependence in the beam given by Equation 3 and demonstrated in Figure 2(right-panel). MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum Figure 12.

The Hi IM bispectrum monopole 𝑆 / 𝑁 , given by 𝐵 Hi0 / 𝛿𝐵 . Top-panel shows an increasing level of thermal noise contained in the IM. 𝜎 N isthe rms of the random Gaussian ﬂuctuations, which closely model a thermalnoise contribution. Bottom-panel shows the impact on 𝑆 / 𝑁 from the beamand foregrounds. Figure 13.

Results from a more realistic beam simulation, which has a cosinebeam pattern with multiple side-lobes and a frequency-dependent beam size(see Section 2 and Figure 2 for details). The case with foregrounds (red-squares) does not include polarisation leakage and has had a 𝑁 fg = We also show the impact on 𝑆 / 𝑁 from the cosine beam pattern in Figure 12 (bottom-panel) relative to the beam-free case. We see 𝑆 / 𝑁 reduced by 50% at 𝑘 ∼ . ℎ / Mpc due to the beam, although we found a similar reduction in 𝑆 / 𝑁 is also present when using a Gaussianbeam. This large reduction is unsurprising since the beam is damping the high- 𝑘 modes, which otherwise have very strong 𝑆 / 𝑁 . Comparingthis to the impact from foregrounds (orange-squares in Figure 12), this also reduces the 𝑆 / 𝑁 by 50% at 𝑘 ∼ . ℎ / Mpc. However, we can lookat the overall reduction to 𝑆 / 𝑁 by summing in quadrature each bin’s contribution. We ﬁnd that without the beam or foreground contaminationwe achieve 𝑆 / 𝑁 = .

9, but we stress that this only considering the equilateral conﬁguration. A complete and accurate forecast would includecontributions from all triangle orientations and a comprehensive analysis of the covariance properties. With this in mind and just focusing onthe equilateral conﬁguration, we ﬁnd foregrounds reduce the overall 𝑆 / 𝑁 by 7 .

9% whereas the beam reduces the overall 𝑆 / 𝑁 by 61 . 𝑁 fg = 𝑘 fg = × − ℎ / Mpc for the modelling. It is encouraging to note that this is still not causingdrastic eﬀects to the success of the model as shown by the 𝜒 (displayed in the legend). We stress that the simulated beam pattern we haveused is a simpliﬁcation, designed to investigate some of the key factors we felt could impact the bispectrum. For example, we are still assumingthe beam is symmetrical, which may not be the case for a radio telescope array like MeerKAT (Asad et al. 2019). Furthermore, the cosinebeam pattern is a generalised approximation, but the precise beam pattern will be unique to each instrument. However, given the little impactwe have seen from our results in Figure 13, it appears unlikely that these subtle diﬀerences will signiﬁcantly alter results.Some improvements could be made to both cosine beam results by performing a re-smoothing of the data to a common resolution.Alternatively, improvements to the beam bispectrum modelling could be made by accounting for a frequency-dependent beam size and theside-lobes in the beam pattern. However, this is beyond the scope of this work. A further general conclusion is that our bispectrum measurementsappear to largely avoid biased eﬀects concentrated to the scales relevant to the ripple frequency ( 𝑇 ) in the frequency-dependent beam size(Equation 3). These eﬀects were seen in Matshawule et al. (2020), in their radial 1D power spectra measurements. We believe that we do notsee these eﬀects because when measuring the spherically averaged bispectrum this ripple in the beam, which manifests on precise radial modes MNRAS , 1–19 (2020) S. Cunnington et al. related to the 𝑇 =

20 MHz frequency, is mostly spread out amongst 3D modes where 𝑘 = 𝑘 + 𝑘 ⊥ . Whereas in the Matshawule et al. (2020)results, this caused a very concentrated eﬀect at scales related to 𝑘 ∼ 𝜋 / 𝑇 , in the 𝑃 ( 𝑘 ) radial power spectrum. In this paper we have demonstrated how the main observational eﬀects relevant to a Hi IM survey can be modelled in a bispectrum analysis.We have provided validation tests for these models through comparisons with measured bispectra of simulated Hi IM data from an 𝑁 -bodysemi-analytical technique for the cosmological Hi, including emulated eﬀects from the telescope beam and a foreground clean, as well asredshift space distortions. This modelling framework was developed in the context of a low-redshift, single-dish IM survey, however, thesemodels could be transferable or extended for low- 𝑧 interferometer Hi IM surveys as well as EoR surveys.The main conclusions we draw from this work are: • Using second order perturbation theory and closely following previous work for optical galaxy surveys, we ﬁnd that the Hi IM bispectrumcan be modelled well at scales 𝑘 (cid:46) . ℎ / Mpc. As expected, RSD introduce a noticeable eﬀect into the Hi bispectrum (Figure 3) which canbe modelled with standard redshift space bispectrum kernels (Equation 7 and Equation 11). Including a FoG term in this RSD modelling wefound that a non-zero value for 𝜎 B made agreement between data and model worse. This is due to the failing of the model at high- 𝑘 scalesthat cannot be treated perturbatively. However, since these scales are greatly aﬀected by beam eﬀects in Hi IM, we concluded that this modelwas suﬃcient for our purposes. • Applying a Gaussian beam to our simulations, with size deﬁned by physical scale 𝑅 b , we derived a model for the eﬀects on the measuredbispectrum from these smoothed ﬁelds. We found the beam produced a trivial damping to high- 𝑘 in the equilateral bispectrum (Figure 6)but in the isosceles cases, there were some more complex distortions (Figure 7). In summary, the beam model performed well in all casesand was able to match the distortions caused from the beam. • By adding simulated foreground contamination to our Hi IM data, then cleaning these using PCA, we were able to study the eﬀects offoreground cleaning on the Hi IM bispectrum. As expected we found this mainly damped small- 𝑘 modes but crucially we were able to applya phenomenological model to the bispectrum that agreed well with the foreground cleaned results. This technique relies on tuning a singlefree parameter ( 𝑘 fg ), which governs how strong the damping from foreground cleaning is. • We used a reduced 𝜒 test throughout to measure the goodness-of-ﬁt between our models and simulations and in general found goodagreement with 𝜒 ∼ • Since the thermal noise from the radio telescope should be Gaussian (white noise), the Hi IM bispectrum should be immune to thermalnoise bias. We demonstrated how only unrealistically high levels of noise cause noticeable reduction to the 𝑆 / 𝑁 (Figure 12 - top-panel).However, this is under the assumption that contributions from other systematics can be controlled at an exquisite level, which is a majorongoing challenge for Hi IM. • We relaxed our assumption of a perfectly Gaussian, frequency-independent beam to investigate the impact this would have on bispectrummodelling. We used a cosine beam pattern with a frequency dependence and whilst agreement by-eye is still good, we found that thisdoes increase the 𝜒 (Figure 13 - blue-triangles). However, improvements to this should be possible either by attempting to model thenon-Gaussianity in the beam, or treating the actual data by re-convolution to a common Gaussian beam resolution as is typically done in HiIM data surveys. • We also included full-sky foregrounds convolved with the more realistic cosine beam, to investigate the eﬀect shifting side-lobes will haveon foreground contamination on a bispectrum measurement. We found no clear evidence that this increases modelling problems (Figure 13- red-squares), however, this may not be the case if we were just probing radial modes along the line-of-sight as identiﬁed in Matshawuleet al. (2020) in the radial 1D power spectrum. • We examined the impact on the bispectrum’s 𝑆 / 𝑁 from foregrounds and the beam separately in Figure 12 (bottom-panel). This showed thatthe beam has a signiﬁcantly larger impact decreasing the overall 𝑆 / 𝑁 by 61 .

9% relative to the foregrounds which, even with polarisationleakage, only decreases the 𝑆 / 𝑁 by 7 . 𝑆 / 𝑁 , whereas foregroundsmainly damp low- 𝑘 modes. However, in a realistic situation it is very likely that foreground removal eﬀects will be much harder to tocharacterise (and model) than the beam, which means they can pose a much bigger challenge for precise and accurate parameter estimation(e.g. BAO and the growth of structure (Soares et al. 2021), or primordial non-gaussianity (Cunnington et al. 2020c)).In future work it would be interesting to further explore non-linear eﬀects on the Hi bispectrum. This would require higher resolution MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum simulations, ideally with hydrodynamics to allow analysis of the particularly complex distribution of Hi on smaller scales. It would be revealingto see whether non-linear eﬀects on higher-order statistics, such as the bispectrum, signiﬁcantly diﬀer between Hi IM and optical galaxysurveys. Furthermore, our work could be extended to include parameter estimation to see if the bispectrum including the observational eﬀectscan help constrain e.g. the Hi bias. Also, including higher order multipoles in this analysis would extend upon (Cunnington et al. 2020b) whichstudied IM observational eﬀects in the power spectrum multipoles. IM observational eﬀects should also leak signal into higher- ℓ thus it wouldbe revealing to see if including higher-order multipoles improves parameter constraints for the Hi IM bispectrum, as has been shown to be thecase for the power spectrum (Soares et al. 2021). ACKNOWLEDGEMENTS

We thank Bernhard Vos Ginés and Santiago Avila for their help regarding the cold gas mass simulations. We also thank Paula Soares and MartaSpinelli for useful discussions and feedback. SC is supported by STFC grant ST/S000437/1. CW’s research for this project was supported bya UK Research and Innovation Future Leaders Fellowship, grant MR/S016066/1. AP is a UK Research and Innovation Future Leaders Fellow,grant MR/S016066/1, and also acknowledges support by STFC grant ST/S000437/1. This research utilised Queen Mary’s Apocrita HPCfacility, supported by QMUL Research-IT http://doi.org/10.5281/zenodo.438045 . We acknowledge the use of open source software(Jones et al. 01 ; Hunter 2007; McKinney 2010; Van Der Walt et al. 2011; Lewis et al. 2000).

REFERENCES

Alonso D., Ferreira P. G., Santos M. G., 2014, MNRAS, 444, 3183Alonso D., Bull P., Ferreira P. G., Santos M. G., 2015, MNRAS, 447, 400Anderson C., et al., 2018, MNRAS, 476, 3382Angulo R. E., Foreman S., Schmittfull M., Senatore L., 2015, JCAP, 10, 039Asad K. M. B., et al., 2019, arXiv e-prints, p. arXiv:1904.07155Battye R. A., Davies R. D., Weller J., 2004, MNRAS, 355, 1339Battye R. A., Browne I. W. A., Dickinson C., Heron G., Maﬀei B., PourtsidouA., 2013, MNRAS, 434, 1239Bernal J. L., Breysse P. C., Gil-Marín H., Kovetz E. D., 2019, Phys. Rev. D,100, 123522Bernardeau F., Colombi S., Gaztanaga E., Scoccimarro R., 2002, Phys. Rept.,367, 1Bharadwaj S., Nath B., Nath B. B., Sethi S. K., 2001, J. Astrophys. Astron.,22, 21Bharadwaj S., Mazumdar A., Sarkar D., 2020, MNRAS, 493, 594Blas D., Lesgourgues J., Tram T., 2011, JCAP, 1107, 034Blitz L., Rosolowsky E., 2006, Astrophys. J., 650, 933Bowman J. D., Rogers A. E. E., Monsalve R. A., Mozdzen T. J., Mahesh N.,2018, Nature, 555, 67Byun J., Oddo A., Porciani C., Sefusatti E., 2020, preprint, -( arXiv:2010.09579 )Carucci I. P., Irfan M. O., Bobin J., 2020a, 21 cm intensity mapping: a900 - 1300 MHz full-sky simulation, -, doi:10.5281/zenodo.3991818, https://doi.org/10.5281/zenodo.3991818

Carucci I. P., Irfan M. O., Bobin J., 2020b, MNRAS, 499, 304Castorina E., White M., 2019, JCAP, 06, 025Chang T.-C., Pen U.-L., Peterson J. B., McDonald P., 2008, Phys. Rev. Lett.,100, 091303Condon J. J., Ransom S. M., 2016, Essential Radio Astronomy. PrincetonUniversity PressCroton D. J., et al., 2004, MNRAS, 352, 1232Croton D. J., et al., 2016, Astrophys. J. Suppl., 222, 22Cunnington S., Wolz L., Pourtsidou A., Bacon D., 2019, MNRAS, 488, 5452Cunnington S., Irfan M. O., Carucci I. P., Pourtsidou A., Bobin J., 2020a,preprint, - ( arXiv:2010.02907 )Cunnington S., Pourtsidou A., Soares P. S., Blake C., Bacon D., 2020b,MNRAS, 496, 415Cunnington S., Camera S., Pourtsidou A., 2020c, MNRAS, 499, 4054DESI Collaboration et al., 2016, preprint, - ( arXiv:1611.00036 )Dickinson C., Davies R. D., Davis R. J., 2003, MNRAS, 341, 369Durrer R., Jalilvand M., Kothari R., Maartens R., Montanari F., 2020, JCAP,12, 003Euclid Collaboration et al., 2020, Astron. Astrophys., 642, A191 Frieman J. A., Gaztanaga E., 1999, Astrophys. J. Lett., 521, L83Fry J. N., 1994, Phys. Rev. Lett., 73, 215Fry J. N., Gaztanaga E., 1993, Astrophys. J., 413, 447Fry J. N., Seldner M., 1982, ApJ, 259, 474Gaztanaga E., Cabre A., Castander F., Crocce M., Fosalba P., 2009, MNRAS,399, 801Gil-Marín H., Noreña J., Verde L., Percival W. J., Wagner C., Manera M.,Schneider D. P., 2015, MNRAS, 451, 539Gil-Marín H., Percival W. J., Verde L., Brownstein J. R., Chuang C.-H., Ki-taura F.-S., Rodríguez-Torres S. A., Olmstead M. D., 2017, MNRAS,465, 1757Gil-Marín H., Wagner C., Noreña J., Verde L., Percival W., 2014, JCAP, 12,029Groth E., Peebles P., 1977, Astrophys. J., 217, 385Harper S., Dickinson C., 2018, MNRAS, 479, 2024Heavens A., Matarrese S., Verde L., 1998, MNRAS, 301, 797Hivon E., Bouchet F., Colombi S., Juszkiewicz R., 1995, Astron. Astrophys.,298, 643Hunter J. D., 2007, Computing In Science & Engineering, 9, 90Jackson J. C., 1972, MNRAS, 156, 1PJing Y., 2005, Astrophys. J., 620, 559Jing Y. P., Boerner G., 2004, Astrophys. J., 607, 140Jing Y. P., Borner G., 1998, Astrophys. J., 503, 37Jolicoeur S., Maartens R., De Weerd E. M., Umeh O., Clarkson C., CameraS., 2020, preprint, - ( arXiv:2009.06197 )Jones E., Oliphant T., Peterson P., et al., 2001–, SciPy: Open source scientiﬁctools for Python,

Kaiser N., 1987, MNRAS, 227, 1Kamran M., Ghara R., Majumdar S., Mondal R., Mellema G., Bharadwaj S.,Pritchard J. R., Iliev I. T., 2020, preprint, - ( arXiv:2012.11616 )Karagiannis D., Slosar A., Liguori M., 2020, JCAP, 11, 052Klypin A., Yepes G., Gottlober S., Prada F., Hess S., 2016, MNRAS, 457,4340Knebe A., et al., 2018, MNRAS, 474, 5206Kulkarni G. V., Nichol R. C., Sheth R. K., Seo H.-J., Eisenstein D. J., GrayA., 2007, MNRAS, 378, 1196Lazanu A., Giannantonio T., Schmittfull M., Shellard E. P. S., 2016, Phys.Rev. D, 93, 083517Lesgourgues J., 2011, preprint, - ( arXiv:1104.2932 )Lewis A., Challinor A., Lasenby A., 2000, Astrophys. J., 538, 473Li Y., Santos M. G., Grainge K., Harper S., Wang J., 2020, MNRAS, 501,4344Linder E. V., 2005, Phys. Rev., D72, 043529 MNRAS000

Liu A., Tegmark M., 2011, Phys. Rev. D, 83, 103006Majumdar S., Pritchard J. R., Mondal R., Watkinson C. A., Bharadwaj S.,Mellema G., 2018, MNRAS, 476, 4007Majumdar S., Kamran M., Pritchard J. R., Mondal R., Mazumdar A., Bharad-waj S., Mellema G., 2020, MNRAS, 499, 5090Mann B., Peacock J., Heavens A., 1998, MNRAS, 293, 209Marin F., 2011, Astrophys. J., 737, 97Marin F. A., et al., 2013, MNRAS, 432, 2654Masui K. W., et al., 2013, Astrophys. J., 763, L20Matarrese S., Verde L., Heavens A., 1997, MNRAS, 290, 651Matshawule S. D., Spinelli M., Santos M. G., Ngobese S., 2020, preprint, -( arXiv:2011.10815 )Mazumdar A., Bharadwaj S., Sarkar D., 2020, MNRAS, 498, 3975McKinney W., 2010, in van der Walt S., Millman J., eds, Proceedings of the9th Python in Science Conference. pp 51 – 56Norberg P., Baugh C. M., Gaztanaga E., Croton D. J., 2009, MNRAS, 396,19Patil A., et al., 2017, Astrophys. J., 838, 65Pearson D. W., Samushia L., 2018, MNRAS, 478, 4500Peebles P. J. E., 1980, The large-scale structure of the universe. PrincetonUniversity PressPeebles P. J. E., Groth E. J., 1975, ApJ, 196, 1Pillepich A., Porciani C., Matarrese S., 2007, Astrophys. J., 662, 1Planck Collaboration et al., 2016, Astron. Astrophys., 594, A13Planck Collaboration et al., 2020, Astron. Astrophys., 641, A6Pourtsidou A., 2018, PoS, MeerKAT2016, 037Pritchard J. R., Loeb A., 2012, Rept. Prog. Phys., 75, 086901SKA Cosmology SWG et al., 2020, Publ. Astron. Soc. Austral., 37, e007Santos M. G., et al., 2017, in Proceedings, MeerKAT Science: On the Pathwayto the SKA (MeerKAT2016): Stellenbosch, South Africa, May 25-27,2016. ( arXiv:1709.06099 )Sarkar D., Majumdar S., Bharadwaj S., 2019, MNRAS, 490, 2880Scoccimarro R., 2000, Astrophys. J., 544, 597Scoccimarro R., 2015, Phys. Rev. D, 92, 083532Scoccimarro R., Couchman H., Frieman J. A., 1999, Astrophys. J., 517, 531Scoccimarro R., Feldman H. A., Fry J. N., Frieman J. A., 2001, Astrophys.J., 546, 652 Sefusatti E., Crocce M., Pueblas S., Scoccimarro R., 2006, Phys. Rev. D, 74,023522Sefusatti E., Crocce M., Scoccimarro R., Couchman H., 2016, MNRAS, 460,3624Shaw J., Sigurdson K., Sitwell M., Stebbins A., Pen U.-L., 2015, Phys. Rev.D, 91, 083514Shimabukuro H., Yoshiura S., Takahashi K., Yokoyama S., Ichiki K., 2016,MNRAS, 458, 3003Soares P. S., Cunnington S., Pourtsidou A., Blake C., 2021, MNRAS, 502,2549Switzer E. R., et al., 2013, MNRAS, 434, L46Switzer E. R., Chang T.-C., Masui K. W., Pen U.-L., Voytek T. C., 2015,Astrophys. J., 815, 51Taruya A., Nishimichi T., Saito S., 2010, Phys. Rev. D, 82, 063522Van Der Walt S., Colbert S. C., Varoquaux G., 2011, preprint,( arXiv:1102.1523 )Verde L., Heavens A. F., Matarrese S., Moscardini L., 1998, MNRAS, 300,747Verde L., et al., 2002, MNRAS, 335, 432Villaescusa-Navarro F., et al., 2018, Astrophys. J., 866, 135Wang J., et al., 2020, preprint, - ( arXiv:2011.13789 )Watkinson C. A., Majumdar S., Pritchard J. R., Mondal R., 2017, MNRAS,472, 2436Watkinson C. A., Trott C. M., Hothi I., 2021, MNRAS, 501, 367Wilson T. L., Rohlfs K., Hüttemeister S., 2009, Tools of Radio Astronomy.Springer-Verlag, doi:10.1007/978-3-540-85122-6Wolz L., Abdalla F., Blake C., Shaw J., Chapman E., Rawlings S., 2014,MNRAS, 441, 3271Wolz L., et al., 2017, MNRAS, 464, 4938Wolz L., et al., 2021, preprint, - ( arXiv:2102.04946 )Wyithe S., Loeb A., Geil P., 2008, MNRAS, 383, 1195Yankelevich V., Porciani C., 2019, MNRAS, 483, 2078Yoshiura S., Shimabukuro H., Takahashi K., Momose R., Nakanishi H., ImaiH., 2015, MNRAS, 451, 266Zoldan A., De Lucia G., Xie L., Fontanot F., Hirschmann M., 2017, MNRAS,465, 2236

APPENDIX A: SIMULATED DATAA1 Cosmological Hi

To generate our Hi cosmological signal we used the MultiDark-Galaxies 𝑁 -body simulation data (Knebe et al. 2018) and the catalogueproduced from the SAGE (Croton et al. 2016) semi-analytical model application. These galaxies were produced from the dark mattercosmological simulation MultiDark-Planck (MDPL2) (Klypin et al. 2016), which follows the evolution of 3840 particles in a cubicalvolume of 1 ( Gpc / ℎ ) with mass resolution of 1 . × ℎ − M (cid:12) per dark matter particle. The cosmology adopted for this simulation is basedon Planck15 cosmological parameters (Planck Collaboration et al. 2016), with Ω m = . Ω b = . Ω Λ = . 𝜎 = . 𝑛 s = . ℎ = . 𝑧 =

17 and 𝑧 =

0. In this work we choselow-redshift, post-reionisation data to test our models and use the snapshot at 𝑧 = .

39 to emulate a MeerKAT-like survey performed in theL-band (899 < 𝜈 < . < 𝑧 < . .We used each galaxies (x, y and z) coordinates and placed them onto a grid with 𝑛 x , 𝑛 y , 𝑛 z = , ,

256 pixels and 1 ( Gpc / ℎ ) in physicalsize. To simulate observations in redshift space inclusive of RSD, we utilised the peculiar velocities of the galaxies. Assuming the LoS is alongthe z-dimension and given the plane-parallel approximation is exact for this Cartesian data, RSD can be simulated by displacing each galaxy’sposition to a new coordinate 𝑧 RSD given byz

RSD = z + + 𝑧𝐻 ( 𝑧 ) ℎ 𝑣 , (A1)where 𝑣 is the galaxy’s peculiar velocity along the LoS (z-dimension) which is given as an output of the simulation in units of km s − .To simulate the contribution to the signal from each galaxy, we used the cold gas mass 𝑀 cgm output from the MultiDark data and from this , 1–19 (2020) he Hi Intensity Mapping Bispectrum we can infer a Hi mass with 𝑀 Hi = 𝑓 H 𝑀 cgm ( − 𝑓 mol ) where 𝑓 H = .

75 represents the fraction of hydrogen present in the cold gas mass andthe molecular fraction is given by 𝑓 mol = 𝑅 mol /( 𝑅 mol + ) (Blitz & Rosolowsky 2006), with 𝑅 mol ≡ 𝑀 𝐻 / 𝑀 Hi = . 𝒙 , to generate a data cube of Hi masses 𝑀 Hi ( 𝒙 ) , which should trace the underlyingmatter density generated by the catalogue’s 𝑁 -body simulation for the snapshot redshift 𝑧 . These Hi masses are converted into a Hi brightnesstemperature for a frequency width of 𝛿𝜈 subtending a solid angle 𝛿 Ω given by 𝑇 Hi ( 𝒙 , 𝑧 ) = ℎ P 𝑐 𝐴 𝜋𝑚 h 𝑘 B 𝜈 [( + 𝑧 ) 𝑟 ( 𝑧 )] 𝑀 Hi ( 𝒙 ) 𝛿𝜈 𝛿 Ω , (A2)where ℎ P is the Planck constant, 𝐴 the Einstein coeﬃcient that quantiﬁes the rate of spontaneous photon emission by the hydrogen atom, 𝑚 h is the mass of the hydrogen atom, 𝑘 B is Boltzmann’s constant, 𝜈 the rest frequency of the 21cm emission and 𝑟 ( 𝑧 ) is the comoving distanceout to redshift 𝑧 (we will assume a ﬂat universe). Since Hi simulations on this scale have a ﬁnite halo-mass resolution, there will be somecontribution from the Hi within the lowest-mass host haloes which is not included in the ﬁnal 𝑇 Hi signal. To account for this, it is typical for arescaling of the ﬁnal 𝑇 Hi to be performed to bring the mean Hi temperature, 𝑇 Hi , in agreement with the modest data constraints we have forthis value. For the eﬀective redshift of our data, 𝑧 = .

39, we used a ﬁducial value of 𝑇 Hi = . 𝛿𝑇 Hi ( 𝒙 , 𝑧 ) = 𝑇 Hi ( 𝒙 , 𝑧 ) − 𝑇 Hi ( 𝑧 ) . (A3)This represents the ﬁnal form of our simulated data and examples of these were shown in Figure 1. A2 21cm Foreground Simulations

The observed IM data can be approximately decomposed as 𝑇 obs = 𝑇 Hi + 𝑇 fg . To produce the 𝑇 fg component we simulated diﬀerent foregroundprocesses, including galactic synchrotron, free-free emission and point sources. We also included the eﬀects of polarisation leakage whichwill act as an extra component of foreground with non-smooth spectra, thus posing an increased challenge for the foreground clean. Theforegrounds we used can thus be decomposed as 𝑇 fg = 𝑇 sync + 𝑇 free + 𝑇 point + 𝑇 pol , which represent the synchrotron, free-free, point sourcesand polarisation leakage.We brieﬂy summarise the simulation technique for these components but for a full outline we refer the reader to Cunnington et al. (2020a)and Carucci et al. (2020b) where they were also used. Furthermore, a full-sky realisation is openly available from Carucci et al. (2020a).The synchrotron emission is based on Planck Legacy Archive FFP10 simulations of synchrotron emission at 217 and 353 GHz formed fromthe source-subtracted and destriped 0 .

408 GHz map. The free-free simulation is from the FFP10 217 GHz free-free simulation at which is acomposite of the Dickinson et al. (2003) free-free template and the WMAP MEM free-free templates. The point sources are based on theempirical model of Battye et al. (2013) and makes the assumption that point sources over 10 mJy will be identiﬁable and thus can be removed.Lastly, we simulated polarisation leakage with the use of the

CRIME software (Alonso et al. 2014), which provides maps of Stokes Q emissionat each frequency and we ﬁx the polarization leakage to 0 .

5% of the Stokes Q signal.For the foregrounds we assumed they have been observed in a frequency range of 900 < 𝜈 < 𝑧 = . 𝛿𝜈 = . × . which corresponds to the size of a 1 ( Gpc / ℎ ) patch at the 𝑧 = .

39 snapshot redshiftof our cosmological simulation. A map of the foreground signal was shown in Figure 1 (top-right).

A3 Full-Sky Beam Convolution

In the more complex pattern of a cosine beam, there exist a number of side-lobes as shown by Figure 2. Since the side-lobes can continueout to very wide distances from the central pointing ( 𝜃 = ( Gpc / ℎ ) patch of sky will not suﬃciently emulate this behaviour. Instead, we carry outa full-sky convolution of the foregrounds which should produce any of the eﬀects we discussed in our targeted region. We then cut this skyregion and overlay our Hi simulated data. Whilst this approach is not entirely consistent with a real experiment, it is suﬃcient for our purposesfor investigating these severe observational eﬀects on a bispectrum measurement. To carry out the full-sky convolution we decomposed themap into spherical harmonics 𝑌 ℓ𝑚 such that 𝑇 fg ( 𝜈, 𝜽 ) = ∞ ∑︁ ℓ = ℓ ∑︁ 𝑚 = − ℓ 𝑎 ℓ𝑚 ( 𝑣 ) 𝑌 ℓ𝑚 ( 𝜽 ) , (A4) pla.esac.esa.int/pla intensitymapping.physics.ox.ac.uk/CRIME.html MNRAS , 1–19 (2020) S. Cunnington et al.

Figure B1.

Correlation between bins for the equilateral Hi IM bispectrum monopole.

Left -panel: without any eﬀects from a telescope beam or foregrounds. Theother panels demonstrate the low impact on covariance from observational eﬀects caused by the beam ( centre ) and foreground cleaning ( right ). where the harmonic coeﬃcients 𝑎 ℓ𝑚 describe the amplitudes of the ﬂuctuations in spherical harmonic space. This allows the convolution to beapplied as a simple product in spherical harmonic space between the harmonic coeﬃcients 𝑎 ℓ𝑚 and the harmonic coeﬃcients for the cosinebeam function we are using, such that the new convolved coeﬃcients, with correct normalisation factors, are given by˜ 𝑎 ℓ𝑚 ( 𝜈 ) = √︂ 𝜋 ℓ + 𝑎 ℓ𝑚 ( 𝜈 ) 𝑏 ℓ ( 𝜈 )√ 𝜋𝑏 ( 𝜈 ) , (A5)where 𝑏 ℓ are the beam harmonic coeﬃcients, which assuming a symmetrical beam function can be given by 𝑏 ℓ ( 𝑣 ) = ∫ B C ( 𝜈, 𝜽 ) 𝑌 ∗ ℓ ( 𝜽 ) d 𝜽 . (A6)where B C is given by Equation 2. We refer the reader to Matshawule et al. (2020) for a more focused investigation into beam eﬀects on theeﬃciency of foreground cleaning, than what we intend to carry out here. In this case of simulating a more realistic beam, we assumed a dishsize of 𝐷 dish = . < 𝜈 < 𝛿𝜈 = APPENDIX B: OBSERVATIONAL EFFECTS ON THE BISPECTRUM COVARIANCE

The covariance matrix is used to estimate the errors from the bispectrum estimator and indicates whether diﬀerent 𝑘 bins are correlated.In optical galaxy surveys, it has been found that there is more correlation between bins in the bispectrum compared to the power spectrum(Gil-Marín et al. 2017). In this work we did not aim to investigate this in detail, since a robust covariance estimation for the bispectrum iscomplex due to high number of triangle bins and typically requires a large number of mocks (Byun et al. 2020). Instead, we used a jackkniferoutine, splitting our data into 𝑁 jack =

64 sub-samples and estimating the covariance with (Norberg et al. 2009) 𝐶 (cid:0) 𝑘 𝑖 , 𝑘 𝑗 (cid:1) = ( 𝑁 jack − ) 𝑁 jack 𝑁 jack ∑︁ 𝑛 = ( 𝑘 𝑛𝑖 − ¯ 𝑘 𝑖 ) ( 𝑘 𝑛𝑗 − ¯ 𝑘 𝑗 ) , (B1)where ¯ 𝑘 𝑖, 𝑗 are the mean averages over the 𝑁 jack measurements. This approach allows a simple, yet suﬃcient means to check correlationsbetween bins (i.e., check the assumption that the covariance is diagonal), infer error-bars, and also study if the covariance is aﬀected by theobservational eﬀects from the beam or foreground cleaning. We show the correlation matrix deﬁned as 𝑅 𝑖 𝑗 = 𝐶 𝑖 𝑗 / √︁ 𝐶 𝑖𝑖 𝐶 𝑗 𝑗 between 𝑘 bins inFigure B1. We show this only for the equilateral conﬁguration for simplicity. The left-panel shows the Hi IM without a beam or foregroundcontamination. We can see that the diagonal covariance assumption is reasonable, except perhaps at low- 𝑘 where bins become more correlated,likely due to our logarithmic binning scheme. More importantly for this work, we ﬁnd there is little impact from a frequency-varying telescopebeam with side lobes (centre-panel), or from data with polarised foregrounds cleaned using a 𝑁 fg =

10 PCA method (right-panel). There isa slight increase in correlation at high- 𝑘 for the beam case, and at small- 𝑘 for the foreground case, as one may expect (Wolz et al. 2014).Therefore, this preliminary study suggests that neither the beam or foregrounds should cause large problems for the bispectrum covariance. APPENDIX C: ALIASING CORRECTIONS TO THE BISPECTRUM

Our Hi IM bispectrum measurements do not currently include any corrections for potential aliasing eﬀects, which could in principle be causingsome discrepancies at high- 𝑘 . Indeed, scales where 𝑘 > 𝑘 Nyq / 𝑘 Nyq is the Nyquist frequency) are likely to become biased in thebispectrum due to aliasing contributions (Sefusatti et al. 2016), which for our simulations equates to 𝑘 ∼ . ℎ / Mpc. We ran tests on a higherresolution gridding with 512 cells and found results follow this prediction with a noticeable bias beginning to form at 𝑘 > 𝑘 Nyq / MNRAS , 1–19 (2020) he Hi Intensity Mapping Bispectrum case of an equilateral triangle conﬁguration. However, below this at 𝑘 < 𝑘 Nyq /

3, results between the 256 and the 512 gridding schemewere consistent.A simpliﬁed approach to correct for the eﬀects of mass assignment can be made to the density ﬁeld whereby 𝛿 true ( 𝒌 ) ≈ 𝛿 meas ( 𝒌 )/ 𝑊 ( 𝒌 ) ,where 𝑊 is the Fourier transform of the mass assignment function (Jing 2005). Indeed this approach has been used in real data bispectrumanalysis on optical galaxy redshift surveys (Gil-Marín et al. 2015). However, the more thorough approach would need to replicate that typicallyused in aliasing corrections to the power spectrum, which involves an iterative process and requires a priori knowledge of the target powerspectrum one expects to measure. This is less trivial to calculate in the case of the bispectrum where triangle-shape dependencies would causeissues. This is discussed in Sefusatti et al. (2016), which also proposes corrections using interlacing techniques performed directly to densityﬁeld in Fourier space. These would therefore be naturally applicable to higher order measurement of this density ﬁeld such as the bispectrum.However, since we found no noticeable eﬀects on the scales we were interested in, we did not investigate such corrections in the context of Hi,but encourage future work into this, since it should become necessary if aiming to contribute to precision cosmology. This paper has been typeset from a TEX/L A TEX ﬁle prepared by the author. MNRAS000