On the Uncertainties of Results Derived from HI Spectral Line Stacking Experiments
MNRAS 000, 1–11 (2019) Preprint 30 April 2019 Compiled using MNRAS LaTeX style file v3.0
E. C. Elson,⋆ A. J. Baker, & S. L. Blyth
Department of Physics & Astronomy, University of the Western Cape, Robert Sobukwe Rd, Bellville, 7535, South Africa
South African Radio Astronomy Observatory (SARAO), Observatory 7925, South Africa
Department of Physics and Astronomy, Rutgers, The State University of New Jersey, 136 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA
Department of Astronomy, University of Cape Town, Private Bag X3, Rondebosch, 7701, South Africa
Accepted XXX. Received YYY; in original form ZZZ
ABSTRACT
We present the results of a set of mock experiments aimed at quantifying the accuracy of results derived from H i spectral line stacking experiments. We focus on the effects of spatial and spectral aperture sizes and redshift uncertainties on co-added H i spectra, and by implication on the usefulness of results from H i spectral line stacking experiments. Using large spatial apertures to extract constituent galaxy spectra yields co-added spectra with high levels of contamination and with relatively low signal-to-noise ratios (S/N). These properties are also affected by the size of the spectral aperture as well as by the H i redshift uncertainties of the galaxies. When redshift uncertainties are high, S/N decreases while the contamination level remains roughly constant. Using small spectral apertures in the presence of large H i redshift uncertainties can yield significant decreases in S/N without the expected decrease in the amount of contaminant flux. Our simulations show that a co-added spectrum rarely yields an accurate measure of the total H i mass of a galaxy sample. Total mass is generally overestimated for large spatial apertures and underestimated for small ones, regardless of spectral aperture size. Our findings strongly suggest that any co-added H i galaxy spectrum needs to be fully modelled in the ways presented in this paper in order to apply accurate corrections for flux contamination and derive realistic uncertainties in total H i galaxy mass. Failing to do so will result in unreliable inferences of galaxy and cosmological parameters, such as Ω_HI.

Key words: methods: numerical – radio lines: general – galaxies: fundamental parameters – galaxies: evolution
Neutral atomic hydrogen (H i) is the dominant phase of the interstellar medium at z = . Constraints on the redshift evolution of the cosmic H i density, Ω_HI, can be used to model the processes by which gas is accreted onto galaxies and subsequently processed and expelled. The cycle of gas in galaxies (including the time it spends as H i) is one of the main drivers of their formation and evolution.

Observations of H i in and around galaxies have traditionally been limited to the nearby Universe. In the very local Universe, direct detections of H i in galaxies can be used to construct the H i mass function (e.g., Zwaan et al. 2005; Martin et al. 2010) and hence measure Ω_HI. Owing to the intrinsic faintness of the H i emission line, directly detecting it in galaxies at distances of more than a few hundred megaparsecs requires extremely long integration times. At high redshifts (z ≳ . ), measurements of Ω_HI are derived from observations of damped Lyman α absorbers (DLAs, e.g., Wolfe et al. 1986, 1995; Storrie-Lombardi et al. 1996; Péroux et al. 2005). Sánchez-Ramírez et al. (2016) used a sample of 742 DLAs to measure the continuous redshift evolution of Ω_HI. By combining their measurements with those at z = , they derive a factor ∼ Ω_HI from z = to z = . They do, however, point out the large uncertainties on Ω_HI over the intermediate redshift range . ≲ z ≲ . , where it is still poorly constrained by DLAs. At the low end of this range, stacking of H i data from existing radio telescopes has been used to measure Ω_HI (e.g., Chengalur et al. 2001; Lah et al. 2007, 2009; Delhaize et al. 2013; Rhee et al. 2013, 2016). A larger fraction of this redshift range will be probed by means of H i detections of individual galaxies, and spectral line stacking, in the forthcoming Looking At the Distant Universe with the MeerKAT Array survey (LADUMA, Blyth et al. 2016), to be carried out on the 64-element MeerKAT array.

⋆ E-mail: [email protected] (ECE)
Results derived from H i spectral line stacking experiments can be highly uncertain. For single-dish observations of H i in the nearby Universe as well as interferometric observations of galaxies at higher redshifts, source confusion greatly limits the accuracy with which the average H i content of a sample can be determined by means of stacking. As an example, for the redshift range . ≲ z ≲ . , Delhaize et al. (2013) estimate any one galaxy in their 2dFGRS sample observed with the Parkes Telescope to be confused, on average, with seven others. Unfortunately, quantifying such uncertainties is not always possible when considering only the observational data. Jones et al. (2015, 2016) use the Arecibo Legacy Fast ALFA (ALFALFA) survey correlation function to develop a computationally inexpensive method of predicting the amount of confused flux in a co-added H i spectrum. They conclude that stacking in deep SKA-precursor H i surveys will be only mildly affected by source confusion if their target synthesised beam size of 10 arcsec can be achieved. In Elson et al. (2016), we presented a set of synthetic data cubes containing model galaxies with realistic spatial and spectral H i distributions based on the catalogue of evaluated galaxy properties from Obreschkow & Meyer (2014). We used them to carry out several mock H i stacking experiments based on noise-free cubes for low- and high-redshift galaxy samples, and consistently found large fractions of contaminant emission due to source confusion in all co-added spectra.

In addition to quantifying the total amount of confused H i emission in a co-added spectrum, our simulated data cubes can be used to gain further insights into the extent to which confusion depends on various user-specified extraction parameters, as well as parameters not controlled by the user. In this work we focus on the sorts of H i stacking experiments that will be applied to LADUMA H i data.
We do this by considering a noise-filled version of the high-redshift ( . < z < . ) cube presented in Elson et al. (2016), which we use to produce a suite of co-added spectra. The co-adds differ in the ways in which their constituent spectra are generated. We use spatial apertures of varying size, vary the spectral range over which individual spectra are extracted, and simulate the effects of offsets between optical and H i redshifts. All of these factors play a role in determining the amount of contaminant flux in a co-added spectrum, as well as its shape and signal-to-noise ratio. Our aim in this work is to quantify these various characteristics of a co-added spectrum as functions of the above-mentioned factors.

The layout of this paper is as follows. In Section 2 we present the details of the data cube used for our stacking experiments. In Section 3 we describe the various types of co-added spectra we produce in terms of various combinations of aperture sizes and redshift offsets. We discuss the various characteristics of our co-added spectra in Section 4, and present our conclusions in Section 5. Throughout this work we assume a ΛCDM cosmology with a Hubble constant H_0 = . km s⁻¹ Mpc⁻¹, Ω_Λ = . , and Ω_M = . (Planck Collaboration et al. 2014).

In this work we make use of the synthetic data products from Elson et al. (2016), where we presented a set of methods for converting mock galaxy catalogues into realistic data cubes containing H i line emission as well as telescope noise and beam effects. The cubes are based on the catalogue of evaluated galaxy properties from Obreschkow & Meyer (2014). For millions of galaxies in the redshift range z < . , the catalogue presents detailed H i properties as well as auxiliary optical properties. The catalogue is based on the SKA Simulated Skies semi-analytic simulations, and therefore on the physical models described in Obreschkow et al.
(2009a,b,c). These models are able to assign realistic masses and sizes to H i discs, allowing us to evaluate the characteristic properties of their H i emission lines. In Elson et al. (2016), we used noise-free cubes to carry out a series of mock H i stacking experiments, in order to quantify the rate of source confusion as a function of spatial resolution. In this work, we use an almost identical synthetic data cube, but we include noise. The reader is referred to Elson et al. (2016) for the full details of the properties of the synthetic H i data cubes, as well as the methods used to create them.

We choose to focus on the sort of H i image cube that will typically be produced by the LADUMA survey. LADUMA will carry out H i spectral line stacking experiments using data cubes with a frequency-dependent spatial resolution of ∼ − arcsec. It was shown in Elson et al. (2016) that H i stacking experiments carried out at . < z < . will not be dominated by source confusion when the spatial resolution of the data is 18 arcsec. Rather, ∼ percent of the flux in a stacked spectrum will come from the target galaxies of interest. Here we generate a synthetic cube spanning a sky area of 1.4° × ° and a redshift range 0.7 < z < 0.758. This cube contains 52 662 galaxies with a total H i mass of . × M⊙. The distributions of the H i and stellar masses of these galaxies are (statistically) identical to those shown in Fig. 13 of Elson et al. (2016), to which the reader is referred. The cube has a spatial resolution of 15 arcsec, and spatial and spectral pixel sizes of 3 arcsec and 26 kHz, respectively. As mentioned above, the main difference between the cubes used in this work and the cubes used in Elson et al. (2016) is the inclusion of noise. Here we use noise that is Gaussian distributed in each channel, with a standard deviation of 28.8 μJy beam⁻¹.
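As a rough consistency check on the per-channel noise level, the following sketch evaluates a radiometer-type relation for the array parameters adopted for this cube (a system equivalent flux density of 560 Jy, 64 dishes, 1000 hours, 26 kHz channels). The specific form σ = SEFD / √(N(N−1) Δν t) is an assumption made here for illustration; the text does not state which expression was used. Under that assumption the result is close to the quoted 28.8 μJy beam⁻¹.

```python
import math

def channel_noise_jy(sefd_jy, n_dishes, bandwidth_hz, t_int_s):
    """Per-channel thermal noise (Jy/beam) for an N-dish interferometer.

    Assumed form (not taken from the paper):
        sigma = SEFD / sqrt(N * (N - 1) * bandwidth * integration_time)
    """
    n_pairs_factor = n_dishes * (n_dishes - 1)
    return sefd_jy / math.sqrt(n_pairs_factor * bandwidth_hz * t_int_s)

# Parameters quoted in the text for the synthetic cube.
sigma_jy = channel_noise_jy(
    sefd_jy=560.0,          # system equivalent flux density
    n_dishes=64,            # MeerKAT dishes
    bandwidth_hz=26e3,      # channel width
    t_int_s=1000 * 3600.0,  # 1000 hours in seconds
)
sigma_ujy = sigma_jy * 1e6  # convert Jy to micro-Jy
print(f"noise per channel ~ {sigma_ujy:.1f} uJy/beam")
```

Other conventions (for example, an explicit factor for two polarisations or an efficiency term) would change the result by factors of order unity.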
This noise level assumes an estimated system equivalent flux density of 560 Jy for the 64-dish MeerKAT array's UHF receivers at a frequency of ∼ MHz, for a fiducial integration time of 1000 hours and a channel width of 26 kHz. Figure 1 shows the total intensity map for the noise-free version of our synthetic H i cube.

15 arcsec corresponds to the central frequency, ν = MHz, of the redshift range z = 0.7 − 0.758.

Figure 1. Left panel: Total intensity map for a noise-free version of our synthetic H i data cube, generated by integrating H i emission over the full spectral range of the cube. The map spans 1.4° × ° and contains H i line emission from 52 662 galaxies in the redshift range 0.7 – 0.758. The spatial resolution is 15 arcsec. Right panel: zoom-in of the spatial region delimited by the white square in the left panel. Emission from only 30 consecutive channels is shown, corresponding to a velocity range of ∼ km s⁻¹. H i flux densities in units of Jy beam⁻¹ to which the colours in each panel correspond are shown in the colour bars.

In this work we focus on quantifying the effects of aperture size, redshift offsets, and spectral extraction ranges of constituent spectra on the characteristics of co-added H i spectra. We do not extract the spectra of all 52 662 galaxies in our synthetic cube. Rather, we consider only those galaxies with stellar mass greater than M⊙. This cut results in approximately i mass. (The exact number of individual spectra extracted from the cube depends on the relevant proximities of the galaxies to the edges of the cube. More specifically, the proximity of the sub-volume containing the H i line emission of a galaxy to the edges of the cube is dependent on the choice of sizes of the spatial and spectral apertures used to extract it.)

We convert from the native flux density units of the cube to H i mass in a given channel using

\[
\frac{M_{\mathrm{HI},i}}{M_\odot} = \frac{2.356 \times 10^{5}}{1+z} \left(\frac{D_L}{\mathrm{Mpc}}\right)^{2} \left(\frac{S_i\,\Delta v_i}{\mathrm{Jy\,km\,s^{-1}}}\right), \tag{1}
\]

where M_HI,i is the H i mass in channel i, S_i is the flux density in channel i, Δv_i is the velocity width of channel i in the galaxy rest frame, D_L is the luminosity distance of a target galaxy, and z is the target galaxy redshift. From this equation it should be clear that we use the evaluated D_L and z of each galaxy to convert its flux density spectrum into a mass spectrum, rather than using a single measure of D_L and z for all stacked galaxies. In practice, galaxy redshifts are typically obtained from an optical redshift catalogue. Hence, in spite of the galaxies being undetected in the H i cube, their spatial and spectral positions within the cube are still known.

In figures 2, 3 and 4, co-added spectra are decomposed as in Elson et al. (2016).
For any co-added spectrum, the black histogram represents the galaxy-averaged co-added signal per channel coming from target galaxies, non-target galaxies, and noise. The blue histogram represents the co-added mass coming from galaxies only, excluding noise. The green histogram represents the co-added mass coming only from target galaxies, containing zero contaminant signal. The red histogram represents contaminant mass from nearby galaxies that are spatially and/or spectrally confused with the target galaxies. The reader is referred to Elson et al. (2016) for a detailed description of how the mass contributions from target galaxies, nearby neighbours, and distant neighbours are precisely and reliably calculated for a spectrum extracted from the cube.

Note that throughout this work, we typically refer to the co-added flux coming from target galaxies, non-target galaxies, and noise as the total co-added flux or mass. We do this simply to emphasise the fact that our black histograms in figures 2, 3 and 4 are based on all of the flux (including noise) in our simulated cube. In contrast, the other histograms are based on noise-free versions of our simulated cube. However, in all of our co-adds, it is always the galaxy-averaged H i mass in a channel that is shown. In other words, the total amount of co-added flux in a particular channel is always divided by the number of galaxies that contributed flux to that channel.
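The per-galaxy mass conversion of equation (1) and the galaxy-averaged co-add described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the function names, the use of `None` to mark channels that a galaxy's spectrum does not cover, and the toy input values are all assumptions, and the coefficient 2.356 × 10^5 is the standard value for this relation, assumed here.

```python
def hi_mass_per_channel(flux_jy, dv_kms, d_l_mpc, z):
    """Equation (1): H i mass (solar masses) in each channel of one spectrum.

    flux_jy : list of channel flux densities in Jy
    dv_kms  : rest-frame channel velocity width in km/s
    d_l_mpc : luminosity distance of this galaxy in Mpc
    z       : redshift of this galaxy
    """
    coeff = 2.356e5 / (1.0 + z)  # assumed standard coefficient
    return [coeff * d_l_mpc ** 2 * s * dv_kms for s in flux_jy]

def galaxy_averaged_coadd(mass_spectra):
    """Average mass per channel over the galaxies contributing to it.

    Each spectrum is a list aligned on the stack centre; None marks a
    channel that the spectrum does not cover (hypothetical convention).
    """
    n_chan = len(mass_spectra[0])
    coadd = []
    for i in range(n_chan):
        values = [spec[i] for spec in mass_spectra if spec[i] is not None]
        # Divide by the number of galaxies that contributed to this channel.
        coadd.append(sum(values) / len(values) if values else 0.0)
    return coadd

# Toy example: two galaxies, each converted with its own D_L and z.
gal_a = hi_mass_per_channel([1e-5, 2e-5, 1e-5], dv_kms=9.5, d_l_mpc=4400.0, z=0.72)
gal_b = hi_mass_per_channel([0.0, 3e-5, 0.0], dv_kms=9.5, d_l_mpc=4600.0, z=0.75)
stack = galaxy_averaged_coadd([gal_a, gal_b])
```

Converting each spectrum with its own D_L and z before averaging, rather than converting the averaged flux spectrum once, mirrors the choice described in the text.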
Figure 2. Co-added H i spectra based on the long spectra (230 channels) generated for galaxies with M∗ ≥ M⊙ in our synthetic H i data cube. From left to right, the constituent spectra are generated using aperture sizes of 15, 30, and 45 arcsec, respectively. The co-adds shown in the top row are based on constituent spectra for which the central channel corresponds exactly to the systemic velocity, V_sys, of the galaxy. Rows 2, 3 and 4 show co-adds based on constituent spectra for which the central channel corresponds to a velocity equal to V_sys + δ_z^i, where δ_z^i are galaxy-specific velocity offsets drawn from Gaussian distributions with means of 0 km s⁻¹ and standard deviations of 50, 100, and 250 km s⁻¹, respectively. In each panel, the black histogram represents the galaxy-averaged co-added signal coming from target galaxies, non-target galaxies and noise. The blue histogram represents the co-added signal coming only from galaxies (targets and non-targets). The green histogram represents the co-added signal for the target galaxies only. The red histogram represents the co-added signal from galaxies that are spatially and/or spectrally confused with the target galaxies (i.e., contaminant emission). The RMS value of the reference spectrum is represented by the horizontal magenta line. The values shown in the right third of each panel give the mass of a particular co-added mass component in units of M⊙ (left column) and the ratio of that mass to the co-added galaxy mass (right column). From top to bottom, the values correspond to the co-adds represented by the black, blue, green and red histograms. Given in the left third of each panel is the RMS variation of the reference spectrum (i.e., the magenta line) in units of M⊙ for a single channel, the signal-to-noise ratio for the total co-added signal, the number of galaxies contributing to the co-add, and the true average H i galaxy mass in units of M⊙ as calculated from the evaluated H i masses in the Obreschkow & Meyer (2014) catalogue. The vertical dashed lines in each panel delimit a spectral range of ± km s⁻¹ about the centre of the co-add.

An H i spectrum for a single galaxy is usually generated from a small sub-volume spanning several spatial pixels and channels in an H i cube. The emission in each channel of the sub-volume is spatially integrated to yield flux density as a function of frequency (i.e., a spectrum). The manner in which a sub-volume is specified for a particular galaxy varies from study to study. In this work we use the size of the Gaussian beam to specify the spatial extents of the sub-volumes. We use square spatial apertures with side lengths of 15, 30, and 45 arcsec, corresponding to 1, 2, and 3 half-power widths of the fiducial Gaussian beam (the width of the fiducial beam is calculated using the central frequency ν = MHz of the redshift range z = . − . ). Given the 3 arcsec spatial pixel scale of our synthetic cube, the apertures have side lengths of 5, 10, and 15 pixels.

We specify the spectral extents of the sub-volumes in three ways. The first method uses a fixed width of 230 channels, corresponding to a velocity range of ∼ km s⁻¹. At the mean redshift of the cube, the frequency width Δν = kHz of a channel corresponds to a velocity width Δv = . km s⁻¹. These spectra are hereafter referred to as "long" spectra. The other two methods incorporate a measure of the H i line width of each galaxy. To more closely mimic real data analysis procedures, rather than using the evaluated H i line width from the Obreschkow & Meyer (2014) catalogue (which would not be available a priori for an observed galaxy), we use the evaluated absolute Vega R-band magnitude (corrected for intrinsic dust extinction) together with the Tully-Fisher relation (Tully & Fisher 1977) from Verheijen (2001) to calculate each galaxy's H i line width, corrected for the inclination of the disc. Figure 5 shows the distribution of evaluated (log W, M_R) pairs for all galaxies in our synthetic cube, with the Verheijen (2001) relation overlaid. In order to obtain the line-of-sight H i line width of a galaxy, W must be scaled by sin(i), where i is the inclination of the galaxy. In practice, the inclination can be inferred from the optical morphology of a galaxy. However, galaxies will not be spatially resolved at high redshifts. We therefore use the Verheijen (2001) Tully-Fisher relation to extract two W-based spectral ranges for each galaxy: one based on the assumption that i = deg, and the other based on the evaluated inclination of the galaxy. These spectra are hereafter referred to as the "mid-length" and "short" spectra, respectively.

Figure 3. Co-added H i spectra based on the mid-length spectra generated for galaxies with M∗ ≥ M⊙ in our synthetic H i data cube. Notation is as in Fig. 2. Constituent spectra (such as those in this figure) based on variable spectral aperture sizes yield co-adds that have a variable number of galaxies contributing to the flux in each channel. Because only the most massive galaxies span a velocity range of up to ∼ km s⁻¹, very few galaxies contribute to the channels near the edges of the co-adds. At these extreme velocities, small number statistics yield highly variable co-added fluxes.

Once the spectral range of a sub-volume is specified, the exact spectral location (i.e., channel) in the H i cube about which to centre the sub-volume must also be specified. If we assume that the H i redshift of each target galaxy is known with zero uncertainty, then we can place the spectral centre of the sub-volume precisely at the corresponding channel in the H i cube. In practice, however, an accurate measure of the H i redshift of a distant galaxy is seldom known. For galaxies that fall below the H i sensitivity threshold of a survey, this is always the case. It is for this reason that the optical redshift of a galaxy is typically used as a proxy for its H i redshift, which in turn allows for the spectral range of the galaxy to be specified.

However, optical and H i galaxy redshifts are known to be typically offset from one another. In their study of the impact of redshift uncertainties on spectral line stacking, Maddox et al. (2013) compare various sets of optical and
H i galaxy redshifts. They find the differences between optical redshifts from the MPA-JHU catalogue and ALFALFA H i redshifts for 6419 galaxies to be well modelled as the sum of two Gaussians with means of 1.94 and −1.53 km s⁻¹, and standard deviations of 11.54 and 35.56 km s⁻¹, respectively. They use these Gaussian offsets to carry out some simple stacking experiments. They also use offsets as large as 250 km s⁻¹, inspired by the redshift survey of the Great Observatories Origins Deep Survey (GOODS) South field conducted by Balestra et al. (2010) at low spectral resolution. For galaxies with more than one spectrum, Balestra et al. (2010) find the accuracy of a single measurement to be ±255 km s⁻¹.

Figure 4. Co-added H i spectra based on the short spectra generated for galaxies with M∗ ≥ M⊙ in our synthetic H i data cube. Notation is as in Fig. 2. Also see Fig. 3 caption.

Figure 5. Distribution of (log W, M_R) ordered pairs for all galaxies in our synthetic cube. W measures have been corrected for the evaluated inclinations of the galaxies. Overlaid as a grey line is the R-band Tully-Fisher relation for the full H i sample of Verheijen (2001).

In this work, we model the uncertainties in H i redshift as being Gaussian distributed about 0 km s⁻¹, with standard deviations σ_z = 50, 100, and 250 km s⁻¹. Given δ_z^i as the redshift uncertainty of galaxy i (drawn from one of these three distributions) that has a true H i redshift z_true^i, the spectral centre of its corresponding sub-volume is placed at the channel corresponding to z_true^i + δ_z^i. Note that because the random redshift uncertainties (i.e., δ_z^i) are Gaussian-distributed (with a mean of zero), they can be positive or negative. We also generate a version of each spectrum that is based on a redshift uncertainty of 0 km s⁻¹.

In total, we produce 3 × 3 × 4 = 36 co-added spectra in this work. Figures 2, 3, and 4 show the co-adds based on the long, mid-length and short spectra, respectively. In each of the figures, columns 1, 2, and 3 correspond to the co-adds based on aperture sizes of 15, 30, and 45 arcsec, respectively. Row 1 in each of the figures corresponds to co-adds based on spectra that have redshift uncertainties of 0 km s⁻¹, whereas rows 2, 3, and 4 correspond to spectra assuming the Gaussian-distributed redshift uncertainties with standard deviations σ_z = 50, 100, and 250 km s⁻¹.

The various co-adds shown in figures 2, 3, and 4 contain a wealth of information on the effects that spatial and spectral aperture sizes and redshift uncertainties have on the characteristics of co-added spectra. Throughout this work, when we state a co-added mass, we refer specifically to an integral of the average co-added mass per channel over the velocity range − km s⁻¹ to 300 km s⁻¹ about the stack centre. This is the maximum velocity range that the H i line emission from a galaxy is expected to span.

If the appropriate corrections are not applied, a co-added spectrum will seldom provide an accurate measure of the total H i mass of a galaxy sample.
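The redshift-offset model used to mis-centre the sub-volumes can be sketched as follows. This is an illustrative reading of the procedure rather than the code used in this work: offsets are drawn in velocity units and converted to a whole number of channels, the mean redshift of 0.729 is taken as the midpoint of the cube's 0.7 – 0.758 range, and all function and variable names are hypothetical.

```python
import random

C_KMS = 299_792.458               # speed of light in km/s
HI_REST_FREQ_HZ = 1_420_405_752   # rest frequency of the H i 21 cm line
CHANNEL_HZ = 26e3                 # channel width of the synthetic cube

def channel_width_kms(z):
    """Rest-frame velocity width of one channel for a galaxy at redshift z."""
    observed_freq_hz = HI_REST_FREQ_HZ / (1.0 + z)
    return C_KMS * CHANNEL_HZ / observed_freq_hz

def offset_centres(true_channels, sigma_kms, z_mean=0.729, seed=0):
    """Shift each galaxy's central channel by a Gaussian velocity offset.

    true_channels : central channel of each galaxy at its true H i redshift
    sigma_kms     : standard deviation of the offsets (0, 50, 100 or 250)
    """
    rng = random.Random(seed)          # seeded for reproducibility
    dv = channel_width_kms(z_mean)
    shifted = []
    for channel in true_channels:
        delta_kms = rng.gauss(0.0, sigma_kms)  # may be positive or negative
        shifted.append(channel + round(delta_kms / dv))
    return shifted

centres = offset_centres([500, 800, 1200], sigma_kms=100.0)
```

With `sigma_kms=0.0` the centres are returned unchanged, which corresponds to the zero-uncertainty version of each spectrum.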
This limitation is due primarily to source confusion in the H i data cube, but also to the specific methodology used to generate the co-add.

In this work, we use the term "purity" to refer to the fractional contribution of target galaxies to the total mass in a co-added spectrum. Purity is inversely related to the amount of contaminant flux in a stacked spectrum. The other very important characteristic of a co-added spectrum is its signal-to-noise ratio (S/N). Indeed, the main goal of stacking is to combine many low S/N spectra into a single average spectrum with higher S/N. In this work we conservatively define S/N for a co-added spectrum as the ratio of the total flux integrated over the central ± km s⁻¹ of the co-add to N × RMS_co-add, where RMS_co-add is the RMS value of a corresponding reference co-added spectrum and N is the number of channels making up the central ± km s⁻¹ of the co-add. This conservative definition of S/N does not have its denominator scaling as √N, as other expressions for integrated signal to noise typically do. Rather, our definition specifies the total integrated signal as a function of the maximum possible amount of integrated noise. S/N ratios presented in this work will therefore be lower than those based on the more standard definition, presented in other studies. For each constituent galaxy spectrum extracted from the cube, another sub-volume of the same size is extracted at a position offset by 150 arcsec in both right ascension and declination. This serves as the reference spectrum for the galaxy. All of the galaxy reference spectra are stacked to yield the final reference spectrum for a co-add. The RMS value of the reference spectrum (i.e., RMS_co-add) is shown as the horizontal magenta line in all panels of figures 2, 3, and 4. Figure 6 shows S/N as a function of purity for all 36 of the co-added spectra shown in figures 2, 3, and 4.
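The difference between the conservative S/N definition adopted here and the more common √N form can be made concrete with a short sketch. The function names and the toy spectra are hypothetical, and the RMS of the reference spectrum is taken to be the root-mean-square about zero, which is an assumption.

```python
import math

def rms(values):
    """Root-mean-square of a reference spectrum (taken about zero)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def conservative_snr(coadd, reference, lo, hi):
    """S/N as defined in this work: integrated signal / (N * RMS_co-add).

    lo, hi : channel indices bounding the central window (hi exclusive)
    """
    window = coadd[lo:hi]
    return sum(window) / (len(window) * rms(reference))

def standard_snr(coadd, reference, lo, hi):
    """More common form, with the noise term scaling as sqrt(N)."""
    window = coadd[lo:hi]
    return sum(window) / (math.sqrt(len(window)) * rms(reference))

# Toy co-add: a flat line of height 4 over 4 channels, unit-RMS reference.
toy_coadd = [0.0, 0.0, 4.0, 4.0, 4.0, 4.0, 0.0, 0.0]
toy_reference = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

snr_cons = conservative_snr(toy_coadd, toy_reference, 2, 6)
snr_std = standard_snr(toy_coadd, toy_reference, 2, 6)
```

For a window of N channels the two definitions differ by a factor of √N, which is why the values quoted in this work are lower than those obtained with the standard form.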
The black, blue, and red symbols represent the co-adds based on long, mid-length, and short spectra (i.e., figures 2, 3, and 4). The colour of a symbol is therefore indicative of the spectral aperture size used to generate the co-add. The size of any symbol is proportional to the size of the spatial aperture used to generate the co-add (i.e., 15, 30, or 45 arcsec). Finally, the type (or shape) of a symbol is indicative of the H i redshift uncertainties of the constituent spectra used to generate a particular co-added spectrum.

Footnote: For reference: the typical separation between galaxies in our cube with M_∗ ≥ M_⊙ is ∼ . degrees.

Figure 6. Signal-to-noise ratio (S/N) as a function of purity (fractional contribution of target galaxies to total co-added signal) for the co-added spectra shown in figures 2, 3, and 4. Black, blue, and red symbols represent the long, mid-length and short spectra total mass co-adds (i.e., black histograms in figures 2, 3, and 4), respectively. The size of a symbol is proportional to the size of the aperture used to extract the spectra, and the type of symbol represents the standard deviation of the Gaussian-distributed redshift offsets applied to them. Both purity and S/N are affected by choices of spatial and spectral aperture sizes as well as redshift uncertainties.

Figure 6 clearly shows both purity and S/N to be affected by the choices of spatial and spectral aperture sizes as well as redshift uncertainties. Furthermore, the two quantities are linked to one another in a roughly linear fashion. Digging deeper, we see that the first common characteristic shared by all sets of spectra is the manner in which the purity of a spectrum increases with decreasing spatial aperture size. All co-adds based on 15 arcsec (i.e., 1 beam width) spatial apertures have purity ≃ . . Notably, for all co-adds, purity does not vary significantly with increasing redshift offset: for given spatial and spectral aperture sizes, purity decreases by approximately 5 percent as redshift uncertainties increase.
A second characteristic shared by all sets of co-added spectra is the slightly counter-intuitive manner in which S/N (like purity) generally decreases significantly with increasing spatial aperture size. Put differently: more co-added mass does not result in a higher S/N, at least not according to the way we define S/N in this work. As expected, S/N also decreases with increasing redshift offsets. The typical shape of the co-added profile degrades significantly (becomes more spectrally extended) with increasing redshift offset as well; however, for a given typical redshift offset, the shape varies relatively little with increasing spatial aperture size.

Figure 6 also illustrates the undesirable effects on co-added spectra that can arise from using small spectral apertures when redshift uncertainties are high. Considering only the 12 small symbols in Fig. 6 (corresponding to 15 arcsec apertures), the black symbols represent the co-adds based on long (velocity length: 2184 km s⁻¹) constituent spectra.
Clearly, purity and S/N are roughly constant regardless of H i redshift uncertainties. It is only when H i redshift uncertainties are of the order of 250 km s⁻¹ that S/N noticeably decreases. However, the situation is different for the co-adds based on mid-length and short constituent spectra (blue and red symbols, respectively). These co-adds all span a narrow range in purity, yet a large S/N range. Specifically, S/N decreases as H i redshift uncertainties increase. This is because the H i emission from a galaxy can be completely "missed" if a mid-length or short spectral aperture is placed at the wrong position in the cube (due to large H i redshift uncertainties). A detailed example of how spectral ranges and redshift offsets conspire to decrease the purity and S/N of constituent spectra is presented in Appendix A.

H i redshift offsets clearly play an important role in setting the optimal spectral aperture sizes to use in an H i stacking experiment. If H i redshift offsets are known or expected to be ≥ km s⁻¹, there is no benefit in using very short spectral apertures (velocity length: W / sin(i)) over mid-length spectral apertures (velocity length: W). This behaviour therefore eliminates the need for estimates of galaxy inclinations from optical imaging. However, given the manner in which S/N decreases significantly when H i redshift uncertainties are high (≥ km s⁻¹), a good strategy for maximising the S/N of a stacked spectrum is to use long spectral apertures (not based on W measures), at the expense of lowering the purity of the co-added spectrum.

Clearly, the optimal combination of spatial and spectral aperture sizes in the presence of non-zero H i redshift offsets needs to be uniquely determined for a given H i stacking experiment. Our simulations and the methods presented in this work can be used to make such determinations to a high level of accuracy.
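The way a narrow spectral aperture can "miss" emission as redshift offsets grow can be illustrated with a toy calculation. The sketch below is not the simulation used in this work: it assumes a top-hat line profile, a hypothetical line width of 300 km s⁻¹, and a helper name (`recovered_fraction`) introduced only for this example. It compares an aperture matched to the line width with the long 2184 km s⁻¹ aperture.

```python
import numpy as np


def recovered_fraction(line_width, aperture_width, offsets):
    """Fraction of a top-hat line of width line_width (centred on 0 km/s)
    that falls inside an aperture of width aperture_width centred on each offset."""
    offsets = np.asarray(offsets, dtype=float)
    lo = np.maximum(-line_width / 2.0, offsets - aperture_width / 2.0)
    hi = np.minimum(line_width / 2.0, offsets + aperture_width / 2.0)
    # Negative overlap means the aperture misses the line entirely.
    return np.clip(hi - lo, 0.0, None) / line_width


rng = np.random.default_rng(0)
line_width = 300.0  # km/s, hypothetical galaxy line width
for aperture in (300.0, 2184.0):
    for sigma_z in (0.0, 50.0, 100.0, 250.0):
        # Gaussian-distributed redshift offsets with zero mean.
        offsets = rng.normal(0.0, sigma_z, size=10000)
        mean_frac = recovered_fraction(line_width, aperture, offsets).mean()
        print(f"aperture={aperture:6.0f}  sigma_z={sigma_z:5.0f}  mean fraction={mean_frac:.2f}")
```

In this toy model the matched aperture loses an increasing share of the target flux as σ_z grows, while the long aperture recovers essentially all of it, at the cost (not modelled here) of admitting more contaminant emission.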
H i mass

While purity and S/N are two important characteristics of a co-added spectrum, the other very important property is the level of accuracy with which the average H i galaxy mass can be determined from the co-add. Relatively few of the co-adds shown in figures 2, 3 and 4 yield an accurate measure of the true (evaluated) average H i galaxy mass, ⟨M_HI⟩_true ≈ . × M_⊙.

Figure 7 shows the average H i mass measured from a co-added spectrum, ⟨M_HI⟩_co-add, relative to ⟨M_HI⟩_true as a function of purity. Co-adds based on large spatial apertures with H i redshift uncertainties ≤ km s⁻¹ over-estimate ⟨M_HI⟩_true by factors of ∼ . to 1.7. This over-estimate is expected given the high level of contamination (i.e., low purity) of these co-adds. For H i redshift uncertainties ≥ km s⁻¹, ⟨M_HI⟩_co-add/⟨M_HI⟩_true approaches unity. However, this convergence is a serendipitous result of the manner in which the large redshift uncertainties degrade the shape and lower the S/N of the co-added spectrum; it does not suggest that overestimates of ⟨M_HI⟩_true can be compensated for by large uncertainties in one's optical redshift catalogue.

Figure 7. Ratio of co-added H i mass to true H i mass, ⟨M_HI⟩_co-add/⟨M_HI⟩_true, as a function of purity for the co-added spectra shown in figures 2, 3, and 4. Black, blue, and red symbols represent the long, mid-length and short spectra total mass co-adds (i.e., black histograms in figures 2, 3, and 4), respectively. The size of a symbol is proportional to the size of the aperture used to extract the spectra, and the type of symbol represents the standard deviation of the Gaussian-distributed redshift offsets applied to them. A co-added spectrum seldom yields an accurate measure of the true H i mass of a galaxy sample. The size of the spatial aperture used to extract the constituent spectra largely determines the total amount of flux in the corresponding stacked spectrum.

⟨M_HI⟩_co-add/⟨M_HI⟩_true ratios for co-added spectra based on small (15 arcsec) spatial apertures are all too low by factors of ∼ . to 0.8, despite these co-adds having the highest purities. This combination of high purity and ⟨M_HI⟩_co-add/⟨M_HI⟩_true < 1 arises because a spatial aperture smaller than three times the half-power width of the resolution element naturally probes only a subset of the spatial area over which the H i flux of a point source is spread. Co-added spectra based on small spatial apertures will never recover all target galaxy flux. To achieve full recovery, spatial apertures of size three times or more the half-power width of the resolution element are required. However, as mentioned above, these co-added spectra are highly contaminated, thereby yielding ⟨M_HI⟩_co-add/⟨M_HI⟩_true > 1.

Co-added spectra based on 30 arcsec spatial apertures yield ⟨M_HI⟩_co-add/⟨M_HI⟩_true similar to those based on the 45 arcsec spatial apertures. When H i redshift uncertainties are small (≤ km s⁻¹), ⟨M_HI⟩_true is over-estimated by factors of ∼ . to 1.4.
When H i redshift uncertainties are high (≥ km s⁻¹), ⟨M_HI⟩_true is underestimated.

Finally, in all of the co-adds presented in figures 2, 3 and 4, the total co-added mass including noise (i.e., the black histograms) is slightly less than the total co-added noise-free mass (i.e., the blue histograms). In Appendix B, we discuss how this negative contribution of noise to the total co-added mass is entirely consistent with the Gaussian properties of the noise.

The results presented in this section show it to be impossible to generate a co-added spectrum that has simultaneously high purity, high S/N, and ⟨M_HI⟩_co-add/⟨M_HI⟩_true ≈ 1. This result again points to the unavoidable need to fully model an H i stacking experiment in the ways presented in this paper, in order to accurately quantify the characteristics of the co-added spectrum and, very importantly, to apply the necessary corrections to any parameters inferred from it (e.g., average H i galaxy mass, Ω_HI, etc.).

Footnote: All high-redshift galaxies observed with MeerKAT will be spatially unresolved, so the great majority of their flux will be spatially distributed over the area of the synthesised beam.

In this work we have presented the details of a suite of H i spectral line stacking experiments based on synthetic data cubes that mimic the characteristics of observations that will be carried out as part of the MeerKAT deep H i survey, LADUMA. We have used various combinations of spatial and spectral aperture sizes as well as H i redshift uncertainties to generate co-added spectra. We analyse the properties of the co-adds in light of the specific details of how they were generated.

All co-added spectra contain a significant amount of contaminant emission due to source confusion. The amount of contaminant emission rises quickly with increasing spatial aperture size. Spectral aperture size also affects the contamination level.
H i redshift uncertainties have a relatively small effect on the contamination level, yet significantly impact the signal-to-noise ratio (S/N) of a co-added spectrum. Using small spectral apertures in the presence of large H i redshift uncertainties can lead to a reduction in S/N by a factor ∼ without lowering the contamination level. A very clear conclusion drawn from our results is the fact that a co-added spectrum does not produce a reliable measure of the total H i mass of a galaxy sample. Regardless of spectral aperture size, large/small spatial apertures always over/underestimate the total H i mass. H i redshift uncertainties need to be carefully considered since they may lower the difference between the true H i mass and the co-added mass, yet in a highly unreliable manner. Our results suggest that H i spectral line stacking experiments based on LADUMA imaging will yield co-added spectra with a contamination level of up to 20 percent and with S/N (using our new conservative definition) of up to ∼ .

Given that almost all co-added H i spectra contain a non-negligible component of contaminant emission, the appropriate corrections must be made to the total co-added mass in order to retrieve an accurate measure of the contribution from target galaxies alone. Results based on mock stacking experiments such as ours provide a means of doing so in a reliable manner. The scientific impact of H i stacking experiments based on high redshift imaging from forthcoming facilities such as MeerKAT, ASKAP, and ultimately the Square Kilometre Array will be significantly determined by the extent to which we can quantify and correct for the various observational and systematic uncertainties present in quantities derived from co-added spectra.

ACKNOWLEDGEMENTS
ECE acknowledges the financial assistance of the South African Radio Astronomy Observatory (SARAO) towards

Footnote: In this work, we define S/N as the ratio of the total co-added flux to co-added noise over the central ± km s⁻¹ of a stacked profile. Alternative definitions (e.g., relying on the peak co-added flux) will deliver higher S/N but still need to be validated by simulations like those used in this paper.

REFERENCES
Balestra I., et al., 2010, A&A, 512, A12
Blyth S., et al., 2016, in Proceedings of MeerKAT Science: On the Pathway to the SKA.
Chengalur J. N., Braun R., Wieringa M., 2001, A&A, 372, 768
Delhaize J., Meyer M. J., Staveley-Smith L., Boyle B. J., 2013, MNRAS, 433, 1398
Elson E. C., Blyth S. L., Baker A. J., 2016, MNRAS, 460, 4366
Jones M. G., Papastergis E., Haynes M. P., Giovanelli R., 2015, MNRAS, 449, 1856
Jones M. G., Haynes M. P., Giovanelli R., Papastergis E., 2016, MNRAS, 455, 1574
Lah P., et al., 2007, MNRAS, 376, 1357
Lah P., et al., 2009, MNRAS, 399, 1447
Maddox N., Hess K. M., Blyth S.-L., Jarvis M. J., 2013, MNRAS, 433, 2613
Martin A. M., Papastergis E., Giovanelli R., Haynes M. P., Springob C. M., Stierwalt S., 2010, ApJ, 723, 1359
Obreschkow D., Meyer M., 2014, preprint (arXiv:1406.0966)
Obreschkow D., Croton D., De Lucia G., Khochfar S., Rawlings S., 2009a, ApJ, 698, 1467
Obreschkow D., Heywood I., Klöckner H.-R., Rawlings S., 2009b, ApJ, 702, 1321
Obreschkow D., Klöckner H.-R., Heywood I., Levrier F., Rawlings S., 2009c, ApJ, 703, 1890
Péroux C., Dessauges-Zavadsky M., D’Odorico S., Sun Kim T., McMahon R. G., 2005, MNRAS, 363, 479
Planck Collaboration et al., 2014, A&A, 571, A16
Rhee J., Zwaan M. A., Briggs F. H., Chengalur J. N., Lah P., Oosterloo T., van der Hulst T., 2013, MNRAS, 435, 2693
Rhee J., Lah P., Chengalur J. N., Briggs F. H., Colless M., 2016, MNRAS, 460, 2675
Sánchez-Ramírez R., et al., 2016, MNRAS, 456, 4488
Storrie-Lombardi L. J., McMahon R. G., Irwin M. J., 1996, MNRAS, 283, L79
Tully R. B., Fisher J. R., 1977, A&A, 54, 661
Verheijen M. A. W., 2001, ApJ, 563, 694
Wolfe A. M., Turnshek D. A., Smith H. E., Cohen R. D., 1986, ApJS, 61, 249
Wolfe A. M., Lanzetta K. M., Foltz C. B., Chaffee F. H., 1995, ApJ, 454, 698
Zwaan M. A., Meyer M. J., Staveley-Smith L., Webster R. L., 2005, MNRAS, 359, L30
APPENDIX A: SPECTRAL RANGE
When a galaxy spectrum is extracted from a data cube, some estimate of its spectral extent is valuable. Extracting an unnecessarily long spectrum will lead to the inclusion of unwanted contaminant emission from neighbour galaxies. A spectral range matching that of the galaxy is expected to minimise the level of contamination in the spectrum. The Tully-Fisher (TF) relation can be used to estimate the spectral extent of a galaxy. However, if we are contending with large H i redshift uncertainties, extracting galaxy spectra over a small spectral range can be extremely detrimental to the purity and S/N of the co-add to which they contribute.

In Fig. A1, each panel shows two spectra associated with one of two different target galaxies, extracted in several ways from our synthetic H i data cube. In each panel, the green spectrum represents emission from the target galaxy, whereas the grey spectrum is the combined emission from the target and other neighbour galaxies (noise is excluded from this example). The solid red and red-dotted vertical lines represent the TF spectral ranges of the galaxy based on 1) an assumed inclination of 90 degrees, and 2) the evaluated (true) inclination.

The first (top) panel shows the spectrum of the first galaxy, extracted using a redshift offset of 0 km s⁻¹. It is clear that the two TF-based specifications of the spectral range are very similar to one another and that they both do a good job of excluding the majority of the contaminant emission present within the full (grey) spectrum. Importantly, they also both neatly capture all of the target galaxy emission. Panel 2 shows spectra for the same galaxy as in panel 1, this time assuming a redshift offset of 150 km s⁻¹. Due to the large offset, large fractions of target galaxy emission are shifted rightward of the spectral ranges expected to contain the target galaxy emission.
Worse still, additional contaminant emission enters both of the TF-based spectral ranges from the left. The purity of this spectrum is therefore doubly affected, in a detrimental way, by the non-zero redshift offset and the narrow spectral ranges specified by the TF relation.

The third and fourth panels of Fig. A1 show the unshifted and shifted versions of the spectra for a second target galaxy from our synthetic cube. This time the galaxy spans a much smaller spectral range. The full (grey) spectrum is also highly contaminated by emission from neighbour galaxies. For a redshift uncertainty of 0 km s⁻¹ (third panel), both TF-based specifications of the target galaxy spectral range do a good job of cutting out almost all of the contaminant emission in the full spectrum. For the case of a −150 km s⁻¹ redshift offset (fourth panel), the situation is entirely different. The large offset shifts approximately half of the target galaxy emission leftward of the wider TF spectral range (solid red lines), whilst also adding a lot of contamination from the right. For the narrower TF spectral range based on the evaluated inclination of the galaxy (red-dotted lines), all of the target galaxy emission is shifted leftward beyond its limits, whilst more contamination is again added from the right. In this case, the combination of a small spectral extraction range and a large redshift uncertainty leads to the galaxy being completely "missed", with its extracted spectrum containing only noise. If redshift offsets are high, the spectral extraction range for each galaxy needs to be kept large enough to ensure that the emission from the target is indeed captured, albeit at the expense of more contamination being introduced. Simulations such as ours should be used to find the optimal method(s) of specifying the spectral range.

APPENDIX B: CO-ADDED NOISE
Figure A1. Various spectra associated with each of two different target galaxies. All spectra are extracted using a spatial aperture size of 30 arcsec. The top two panels show the spectra of the first galaxy, assuming redshift offsets of 0 and 150 km s⁻¹. The bottom two panels show the spectra of the second galaxy, assuming redshift offsets of 0 and −150 km s⁻¹. In each panel, the green spectrum represents emission from the target galaxy, whereas the grey spectrum is the combined emission from the target and other neighbour galaxies (noise is excluded from this example). The black-dashed vertical lines delimit a spectral range of ± km s⁻¹ about the centre of the spectrum. The solid and red-dotted vertical lines represent the TF spectral ranges of the galaxy based on 1) an assumed inclination of 90 degrees, and 2) the evaluated (true) inclination, respectively. The actual (evaluated) galaxy inclination is given in the top left of each panel. A combination of a small spectral extraction range and a large redshift offset can yield an extracted spectrum that completely "misses" the emission from a target galaxy of interest, thereby yielding a high level of contamination in the extracted spectrum.

For each of the co-adds shown in Figs. 2, 3 and 4, the total co-added mass (i.e., black histogram) is less than the co-added galaxy mass (blue histogram). In other words, the noise in our synthetic cube makes negative contributions to the total co-added mass. This result, although unexpected, is indeed statistically correct and is a consequence of the fact that extracting sub-volumes from the full-size cube at the positions of the M_∗ ≥ M_⊙ galaxies constitutes a single realisation of the many ways in which a collection of sub-volumes can be extracted.
Repeating such an experiment many times using different sets of constituent spectra does indeed demonstrate that noise on average makes zero contribution to the total co-added mass.

In this section, we demonstrate this fact by carrying out three experiments in which we use our synthetic noise cube to generate 1000 co-adds each consisting of 2700 constituent spectra. Each constituent spectrum is based on a sub-volume with pixel dimensions N × N × . For our three experiments, we use N = , , to match the spatial pixel dimensions of our long spectra based on 15, 30, and 45 arcsec spatial apertures. The three panels in Fig. B1 show the distributions of the sums of the 1000 co-adds for each of our N = , , experiments. The co-add sums are clearly approximately Gaussian distributed with a mean very close to 0 Jy. In each panel of Fig. B1, the mean co-add sum is marked by the black-dotted vertical line, while the red-dotted vertical lines mark the ± σ values. The (negative) contributions from noise to the total masses in the co-adds shown in Figs. 2, 3 and 4 are entirely consistent with the results presented in Fig. B1. The most directly comparable cases are the co-adds based on long spectra with redshift offsets of 0 km s⁻¹ (i.e., top row of Fig. 2). For the 15, 30, and 45 arcsec co-adds, the respective noise contributions are ( . − . ) × M_⊙ = − . × M_⊙, ( . − . ) × M_⊙ = − . × M_⊙, and ( . − . ) × M_⊙ = − . × M_⊙. These contributions are marked by the blue-dashed vertical lines in Fig. B1. In each case, the contribution is contained well within one standard deviation of the mean sum of the 1000 co-adds generated from our noise cube.

For each of the distributions shown in Fig. B1, the mean co-add sum (indicated by the black-dashed line) is slightly offset from zero. However, the smoothed noise cube from which all of the constituent spectra were extracted has a mean voxel value µ = − . × − Jy/beam. This very small, yet negative, mean value yields the observed non-zero mean co-add sums shown in the panels of Fig. B1. For example, for the middle panel, N × N × × µ = − . × − Jy/beam, which does indeed convert almost exactly to the mean co-added mass of = − . × M_⊙. This suggests that when given a real cube that has its aggregate line emission dominated by noise (which may also include un-subtracted continuum residuals), co-added fluxes should be appropriately offset in order to account for biases introduced by the noise statistics (specifically a non-zero noise level).

This paper has been typeset from a TeX/LaTeX file prepared by the author.
Figure B1. Distributions of summed H i mass for 1000 co-adds, each consisting of 2700 constituent sub-volumes extracted from our smoothed noise cube. The sub-volumes have pixel dimensions of N × N × , where N = , , (top to bottom panels). The mean, µ, of each distribution is marked by the black-dashed line, and µ ± σ are marked by the red-dashed lines. The solid blue line in each panel represents total H i mass contributed by noise in our co-added spectra shown in the top row of Fig. 2. In all cases, the (negative) amount of co-added noise is contained well within a single standard deviation of the mean sum of a co-add extracted from the noise cube.
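The noise experiment of Appendix B can be mimicked at small scale as follows. This is a sketch under stated assumptions, not the code used in this work: the cube is pure Gaussian noise with a hypothetical mean of −0.05 (arbitrary units), the cube size, sub-volume dimensions, and numbers of sub-volumes and co-adds are reduced for speed, sub-volume positions are drawn at random, and the function name `coadd_noise_sums` is introduced only for this example.

```python
import numpy as np


def coadd_noise_sums(cube, n_pix, n_chan, n_sub, n_coadds, rng):
    """Return the summed signal of n_coadds noise co-adds.

    Each co-add is the average of n_sub sub-volumes of shape
    (n_pix, n_pix, n_chan) extracted at random positions in the cube;
    the returned value per co-add is that average summed over all voxels.
    """
    ny, nx, nv = cube.shape
    sums = np.empty(n_coadds)
    for k in range(n_coadds):
        ys = rng.integers(0, ny - n_pix + 1, size=n_sub)
        xs = rng.integers(0, nx - n_pix + 1, size=n_sub)
        vs = rng.integers(0, nv - n_chan + 1, size=n_sub)
        total = 0.0
        for y, x, v in zip(ys, xs, vs):
            total += cube[y:y + n_pix, x:x + n_pix, v:v + n_chan].sum()
        sums[k] = total / n_sub
    return sums


rng = np.random.default_rng(42)
mu = -0.05  # hypothetical small negative mean voxel value
cube = rng.normal(mu, 1.0, size=(48, 48, 64))
sums = coadd_noise_sums(cube, n_pix=3, n_chan=20, n_sub=100, n_coadds=200, rng=rng)

# Expected mean co-add sum: voxels per sub-volume times the mean voxel value.
expected = 3 * 3 * 20 * cube.mean()
print(sums.mean(), sums.std(), expected)
```

Individual co-add sums scatter on either side of the mean, but their average tracks the number of voxels per sub-volume multiplied by the mean voxel value, which is the bias that a non-zero noise mean introduces into a co-added flux.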