Using the sample variance of 21cm maps as a tracer of the ionisation topology
AAstronomy & Astrophysics manuscript no. main © ESO 2021February 9, 2021
Using the sample variance of 21cm mapsas a tracer of the ionisation topology
A. Gorce , , , A. Hutter , and J. R. Pritchard Department of Physics, Blackett Laboratory, Imperial College, London SW7 2AZ, U.K. Université Paris-Saclay, CNRS, Institut d’Astrophysique Spatiale, 91405, Orsay, France Department of Physics and McGill Space Institute, McGill University, Montreal, QC, Canada H3A 2T8 Kapteyn Astronomical Institute, University of Groningen, PO Box 800, NL-9700 AV Groningen, The Netherlandse-mail: [email protected]
Received *********; accepted *******
ABSTRACT
Intensity mapping of the 21cm signal of neutral hydrogen will yield exciting insights into the Epoch of Reionisation and the natureof the first galaxies. However, the large amount of data that will be generated by the next generation of radio telescopes, such asthe Square Kilometre Array (SKA), as well as the numerous observational obstacles to overcome, require analysis techniques tunedto extract the reionisation history and morphology. In this context, we introduce a one-point statistic, to which we refer as the localvariance, σ loc , that describes the distribution of the mean di ff erential 21cm brightness temperatures measured in two-dimensionalmaps along the frequency direction of a light-cone. The local variance takes advantage of what is usually considered an observationalbias, the sample variance. We find the redshift-evolution of the local variance to not only probe the reionisation history of the observedpatches of the sky, but also trace the ionisation morphology. This estimator provides a promising tool to constrain the midpoint ofreionisation as well as gaining insight into the ionising properties of early galaxies. Key words.
Cosmology: dark ages, reionization, first stars – Methods: statistical
1. Introduction
The Epoch of Reionisation (EoR) represents an essential timewithin the first billion years of the Universe, when the first lightsources formed and gradually ionised the neutral hydrogen gasin the intergalactic medium (IGM). However, the exact proper-ties of these first light sources remain uncertain. With the helpof the CMB small- and large-scale data (Planck Collaborationet al. 2016b), observations of the luminosity functions of star-forming galaxies (Bouwens et al. 2015), H i absorption troughsin the spectra of quasars (McGreer et al. 2015; Bañados et al.2018), we have obtained constraints on the ionisation history,that is the time evolution of the ionisation fraction of the hydro-gen gas in the IGM (Robertson et al. 2015; Gorce et al. 2018).However, reionisation is patchy, with di ff erent regions of the skybeing ionised at di ff erent times.For this reason, the observation of the time evolution of the21cm brightness temperature maps at redshifts z ≥
5, tracing theneutral hydrogen gas in the IGM and referred to as 21cm tomog-raphy, with the next generation of radio interferometers, such asthe Square Kilometre Array (SKA, Koopmans et al. 2015) or theHydrogen Epoch of Reionisation Array (HERA, DeBoer et al.2017) is highly anticipated. Such maps will allow us to trace thereionisation process, in particular the time and spatial evolutionof the ionised regions around the first light sources. Constraintsfrom current observations imply that star-forming galaxies werethe main drivers of reionisation. The topology of the ionised re-gions provides a tracer of the physical properties of these galax-ies and their distribution in the IGM (Zahn et al. 2007; McQuinnet al. 2007; Mesinger et al. 2011). In the recent years, many sta-tistical tools of various levels of complexity have been developed to extract information about the ionisation sources from thesemaps, but this is complicated by the cosmological signal beingoften swamped with thermal noise and foregrounds (Chapman& Jeli´c 2019; Liu & Shaw 2020). While the power spectrumof the 21cm signal has been the main tool to extract the Gaus-sian part of the 21cm signal from reionisation (e.g. Greig et al.2020; Pagano & Liu 2020), three-point statistics have been in-creasingly studied to access the non-Gaussian part of the sig-nal (Shimabukuro et al. 2016; Majumdar et al. 2018; Gorce &Pritchard 2019; Watkinson et al. 2019; Hutter et al. 2020). How-ever, three-point statistics are often computationally expensiveand challenging to interpret. One-point statistics provide here afaster approach.Previous studies have focused on the one-point probabil-ity distribution function (PDF) of the di ff erential 21cm bright-ness temperature δ T b and its moments (Ciardi & Madau 2003;Furlanetto et al. 2004; Mellema et al. 2006). Since the mor-phology of this 21cm signal is driven by the morphology of theionised regions during the EoR, it is informative to assess thePDF of the ionisation fraction distribution. For example, for abinary ionisation field, where pixels are either fully ionised orfully neutral, the corresponding one-point PDF can be derivedas a combination of Dirac delta functions δ : P ( x e ) = (1 − ¯ x e ) δ ( x e ) + ¯ x e δ ( x e − , (1)where ¯ x e is the filling fraction – or the mean ionisation levelof the simulation. From this PDF, analytic expression for statis-tical moments can be derived. Comparing how these statisticalmoments derived numerically from more sophisticated modelsand simulations deviate from these analytical expression pro-vide us with hints on the nature of reionisation (Gluscevic & Article number, page 1 of 12 a r X i v : . [ a s t r o - ph . C O ] F e b & A proofs: manuscript no. main
Barkana 2010), such as its reionisation topology (e.g. inside-outor outside-in, see Watkinson & Pritchard 2014) and its globalreionisation history (Bittner & Loeb 2011; Patil et al. 2014), evenwhen derived from dirty 21cm signal images or after foregroundremoval (Harker et al. 2009). Before the SKA goes online, Kit-tiwisit et al. (2018) show that HERA will be able to detect thevariance of the 21cm brightness temperature field from reionisa-tion with high sensitivity, by averaging over measurements frommultiple fields. However, these one-point statistics lack informa-tion on the correlations between pixels. For this reason, Barkana& Loeb (2008) has extended this formalism by analysing theone-point PDF of the di ff erence in the di ff erential 21cm bright-ness temperature measured at two points.In this work, we present a new higher-order one-point statis-tic, to which we refer to as the local variance σ loc . This localvariance is based on the sample variance but also contains cor-relation information. We introduce the local variance in Sec. 3.In Sec. 4, we apply this statistic to a range of simulations, thatwe describe in Sec. 2, and find that its evolution with redshift isa good tracer of the ionisation history and topology, even whenincluding observational e ff ects, such as thermal noise and tele-scope resolution. We conclude in Sec. 5. In the following, alldistances are in comoving units and the cosmology used is thebest-fit cosmology derived from Planck 2015 CMB data (PlanckCollaboration et al. 2016a): h = . Ω m = . Ω b h = . Y p = . σ = . T CMB = . x e .
2. Simulations
We apply our analysis to two types of simulations, in order tocheck that our results are robust against di ff erent ways of mod-elling reionisation.First, we consider the rsage simulations (Seiler et al. 2019),which are based on a N -body simulation with 2400 dark matter(DM) particles and a side length of 160 Mpc. A modified versionof the Semi-Analytic Galaxy Evolution (SAGE) model (Crotonet al. 2016), which accounts for delayed supernovae feedbackand radiative feedback, describes the evolution of the galaxiesand their properties within the simulation. The ultraviolet back-ground (UVB) is generated with the semi-numerical code cifog (Hutter 2018a,b). cifog also follows the time and spatial evolu-tion of the ionised hydrogen regions in the simulation box. Threedi ff erent prescriptions for the escape fraction of ionising photonsfrom galaxies into the IGM, f esc , cover the physical plausible pa-rameter space: In rsage const , f esc is considered to be constantregardless the redshift and properties of the galaxies. Its value isfixed to 20% (Robertson et al. 2015). In rsage fej , f esc scaleswith the fraction of gas ejected from each galaxy, f ej . In rsageSFR , f esc scales with the star formation rate of each galaxy, re-sulting in f esc e ff ectively scaling with halo mass. These di ff erentionising properties result in a di ff erent morphology of the ionisa-tion fields, with rsage SFR exhibiting the largest ionised bub-bles at a given global ionisation fraction ¯ x e . This is illustratedin the upper panels of Fig. 1, which show the binary ionisationfields of the three simulations at ¯ x e = .
3, when the simula-tions are 30% ionised. Since the three rsage simulations havethe same underlying DM distribution and have been tuned to re-produce the Planck optical depth, we find their ionisation his-tories to be very similar (see the upper right panel of Fig. 4).However, due to the di ff erent descriptions of f esc , they divergeslightly towards the end of the reionisation process, with rsage SFR reaching a fully ionised IGM by ∆ z (cid:39) . rsagefej .Secondly, we use the publicly available 21CMFAST sim-ulation (Mesinger & Furlanetto 2007; Mesinger et al. 2011) .21CMFAST is a semi-numerical code using excursion-set for-malism (Furlanetto et al. 2004): starting from a matter overden-sity field, it assumes each cell to be ionised when the numberof photons exceeds the number of baryons in the respective cell.21CMFAST has multiple simulation parameters that can be var-ied to change the underlying physical model, which again can re-sult in di ff erent reionisation histories and morphology. Here, wechoose to vary the parameter M turn , the turnover mass, which cor-responds to the minimum halo mass below which star formationis suppressed exponentially. During the EoR, M turn = M (cid:12) would correspond roughly to a virial temperature of 10 K. Wegenerate three simulations, with the same dimensions and reso-lution as rsage simulations, and assume M turn = M (cid:12) , 10 M (cid:12) and 10 M (cid:12) , to which we refer as M8, M9 and M10 in thefollowing, respectively. Their global ionisation histories can beseen in the lower-right panel of Fig. 4. The higher number ofsources in M8 lead to an earlier reionisation of the IGM than inthe other two simulations. In M10, however, reionisation is de-layed until haloes of su ffi cient mass have formed. Since moremassive sources are also more e ffi cient at ionising their sur-roundings, M10 has on average larger ionised regions than M8and M9. This can be seen in the lower panels of Fig. 1, whichshow the snapshots of the ionisation fields of M8, M9 and M10at ¯ x e = . ff erential 21cmbrightness temperature maps remains di ffi cult (e.g. Malloy &Lidz 2013; Beardsley et al. 2015; Datta et al. 2016; Giri et al.2018; Mangena et al. 2020) due to their contamination by in-strumental e ff ects and foregrounds (Gluscevic & Barkana 2010;Chapman et al. 2013; Liu & Shaw 2020): hence, we need sta-tistical tools that we can apply directly to these 21cm maps. Forthis reason we apply our new one-point statistics also to the dif-ferential 21cm brightness temperature δ T b cubes in the follow-ing. From the rsage neutral fraction, x H i = − x e , and baryondensity, δ b , cubes, we construct the corresponding δ T b fields fol-lowing (Pritchard & Loeb 2012): δ T b = x H i (1 + δ b ) Ω b h . (cid:115) . Ω m h (cid:114) + z
10 mK (2)Here, we assume that X-rays have heated the gas su ffi cientlysuch that the spin temperature of the neutral hydrogen gas ex-ceeds the CMB temperature considerably. This assumption isvalid for the redshifts that are considered in this work (4 < z < . Available at https://github.com/21cmfast/21cmFAST . In practice, an interferometer measures a mean-zero map for eachfrequency bin, which is equivalent to saying that each slice of this cubehas a mean zero and would make the local variance vanish. Hence, weconsider that using the N slices of a coeval cube at fixed redshift z isequivalent to considering N small patches of a larger field-of-view atfixed frequency. We leave a detailed investigation of this hypothesis forfuture work.Article number, page 2 of 12. Gorce et al.: Using the sample variance of 21cm maps as a tracer of the ionisation topology Fig. 1.
Binary ionisation fields cut through the three rsage simulations (upper panels) and the three 21CMFAST runs (lower panels) described inSec. 2 at ¯ x e = .
30, illustrating how di ff erent physics lead to a di ff erent reionisation morphology.
3. Methods
In order to build intuition for this new estimator, we first look atthe ionisation fields of our simulations. We obtain the 3D vari-ance of the ionisation field of a coeval cube extracted from thesimulations, at a given redshift z and global ionisation level ¯ x e ,by computing: σ whole ( z ) = N N (cid:88) i , j , k = x i , j , k ( z ) − ¯ x e (3)where the ionisation level of a cell ( i , j , k ) can be either x i , j , k ( z ) = σ whole2 of a 3D field x e ( r ) to its 2-point correlation function (2-PCF) ξ and, in turn,to its power spectrum P ( k ) . With this relation, we can esti-mate the variance of the EoR 21cm signal by measuring itspower spectrum. In this context, Patil et al. (2014) has used fore-cast LOFAR observations and the inferred redshift-evolution of σ whole to constrain the midpoint and duration of reionisation.From a topological point of view, examining the evolution of σ whole with ¯ x e allowed Watkinson & Pritchard (2014) to di ff eren-tiate between outside-in and inside-out scenarios of reionisation.However, in all the simulations analysed in this work, reionisa-tion proceeds inside-out. As such, we find for our three rsage simulations only a ∼
1% deviation from the theoretical parabolawhich can be derived from the PDF of ionised pixels P ( x e ) givenin Eq. (1): σ whole2 = (cid:90) ( x e − ¯ x e ) P ( x e ) d x e = ¯ x e (1 − ¯ x e ) . (6) The definition of the 2-PCF yields ξ ( r ) = V (cid:90) d s x e ( s ) x e ( s + r ) = π ) (cid:90) d k P ( k ) e i k · r , (4)such that, for an isotropic and homogeneous field, σ + ¯ x e = V (cid:90) d s x e ( s ) = ξ (0) = π (cid:90) k d k P ( k ) . (5) The results also hold for the 21CMFAST simulations and higherorder cumulants, such as the skewness and the kurtosis. There-fore, the distribution of pixels throughout the whole box can-not di ff erentiate between the simulations, as it mainly traces thereionisation history. In particular, it does not account for the cor-relations between pixels, and hence cannot track morphologicaldi ff erences.For this reason, we introduce the local variance, which issensitive to the ionisation topology as it includes the small andlarge-scale correlations between points: σ loc ( z ) = N N (cid:88) k = N N (cid:88) i , j = x i , j , k ( z ) − N N (cid:88) i , j , k x i , j , k ( z ) = N N (cid:88) k = µ ( k , z ) − ¯ x e ( z ) . (7)Here µ ( k , z ) is the mean value of the k th slice along the redshiftdirection . We explain the derivation of this statistic by consid-ering the three-dimensional binary ionisation field of the rsageconst simulation. The ionisation field is a cube with N = ∆ x = .
625 Mpc.Snapshots every 10 Myrs, tracking the reionisation process, areavailable. For each snapshot, we compute the average value µ of each of the N slices, each having a width of one cell along achosen direction. Here, we assume this direction to be the red-shift – or frequency – direction. The resulting distribution of N filling fractions { µ } ≤ k < N is centred around the filling fraction ofthe whole 3D box at the respective redshift, ¯ x e ( z ). The left panelof Fig. 2 shows these distributions for redshifts 6 ≤ z ≤
15. Atthe beginning of reionisation, the distribution is very narrow butwidens as reionisation progresses and ionised bubbles grow, trac-ing the underlying clustered galaxy population and the increas-ing correlation between pixels. At the end of reionisation, thedistribution is again very narrow as most pixels are fully ionised. Because we use coeval cubes, we assume that the redshift does notchange from one slice to the next. This is a reasonable assumption be-cause of the relatively small size of the box ( L =
160 Mpc).Article number, page 3 of 12 & A proofs: manuscript no. main
Fig. 2.
Left panel:
Distribution of the ionisation levels of the N slices that can be carved out of the rsage const simulation along one direction.Each colour corresponds to one of 12 snapshots taken on the range 6 . ≤ z ≤ .
63, corresponding to di ff erent ionisation levels, representedas the solid vertical lines. Right panel:
Evolution of the standard deviations of each distribution with global reionisation history (blue solid line),compared to the standard deviation of the distribution of ionised pixels throughout the whole box (dashed line, divided by 10).
In the right panel of the figure, we summarise our results by plot-ting the evolution of the standard deviation of these distributions σ loc as a function of the global ionisation level (blue solid line).As outlined in our motivation, we can see from Eqs. (3) and(7), that the local variance σ loc describes the morphology of theconsidered field more accurately than the ordinary 3D variance σ whole , as it includes the variance of each slice. Indeed, if weconsider Var( k ) the variance of the k -th slice, then Var( k ) − µ ( k ) = (cid:80) Ni , j x i , j , k / N , and it is easy to see that σ loc2 = σ whole2 − N N (cid:88) k = Var( k ) . (8)Furthermore, we note that since the 3D variance can be ex-pressed in terms of the 2-PCF given in Eq. (5), the local varianceis also given by σ = L (cid:90) d r µ ( r ) = π (cid:90) d k P µ ( k ) , (9)where µ ( r ) is the mean of the slice located at r and P µ ( k ) is thepower spectrum of the 1D distribution of means { µ ( r ) } r ≤ L . P µ ( k )corresponds to the 3D power spectrum of the field, when onlythe modes along the frequency direction in Fourier space arekept and are rescaled by the area of the observational windowin the sky plane: P µ ( k ) = P ( k ) / L for k = (0 , , k z ). Selectingsuch modes can be done by using a specific window function,for example a Bessel function (Muñoz & Cyr-Racine 2021).
4. Results
In this section, we analyse the evolution of the local varianceof the rsage const simulation. In order to understand the im-pact of the ionisation fraction and density distributions on ourestimator, we first discuss the local variance of the ionisationfraction fields before we analyse the local variance of the di ff er-ential 21cm brightness temperature maps. For clarity, we add asuperscript to σ loc , describing the field considered: σ ionloc for theionisation field, σ for the brightness temperature field.We show the local variance of the ionisation field σ ionloc of the rsage const simulation as a function of its mean in the rightpanel of Fig. 2. For comparison, we also plot the results for thescaled 3D variance, σ whole /
10. The dotted vertical line indicates the midpoint of reionisation at ¯ x e = .
50, where σ whole is max-imum (see Eq. (6)). On the other hand, σ ionloc ( ¯ x e ) (blue line) isslightly distorted compared to σ whole and reaches its maximumaround ¯ x e (cid:39) .
60. The location of the maximum indicates themoment when ionised and neutral regions are the largest, whichwill depend on the large-scale structure of the ionisation fields.We will discuss this in more detail in the next Section when wecompare di ff erent ionisation morphologies. To confirm the phys-ical origin of this signal, we compute the local variance of a con-trol test, consisting of a 3D box of the same resolution and size asour simulations, but randomly filled with ionised pixels to reachthe considered ionisation level. Such a field will contain none ofthe correlations we aim at probing with the local variance andwill actually be analogous to a box with white noise power spec-trum. It can also be seen as a field made of many uncorrelatedvery small bubbles, which is close to the ionisation field at thevery beginning of reionisation (see next paragraph). The result-ing variance, close to zero and largely insignificant comparedto what was obtained for the simulations, is shown as the blacksolid line in the right panel of Fig. 2.In Fig. 3, we show the local variance of the δ T b field of the rsage const simulation as a function of redshift (thick solidline) along with the local variances of the H i and δ b fields and thecovariance of the distributions of means for the ionisation field( x loc ) and the 21cm brightness temperate field ( δ b , loc ). Becauseof correlations, the exact expression of σ , given in App. A,does not equal the sum of the aforementioned three elements, butthey are useful to understand the behaviour of σ . Overall, theredshift-evolution of the local variance σ ( z ) traces the reion-isation history and is similar to the one observed in Patil et al.(2014) for the 3D variance. At high redshift, before the onsetof reionisation ( z (cid:38) δ T b back-ground ( z (cid:39) − z (cid:39) .
5. At this point, the distribution of ionised re-gions is similar to the one of the control test, considered above,hence the small amplitude of the local variance. As reionisationproceeds, zero (ionised) pixels start tracing the reionisation mor-phology. As their number increases compared to the number ofwarm (neutral) pixels (which still follow a Gaussian distribu-tion), σ starts mostly tracing the ionisation field and increasesuntil reaching its maximum around the midpoint of reionisation,when both ionised and neutral regions are the largest. As more Article number, page 4 of 12. Gorce et al.: Using the sample variance of 21cm maps as a tracer of the ionisation topology
Fig. 3.
Contributions to the local variance of the δ T b field in the rsageconst simulation (upper panel) and its reionisation history (lowerpanel). and more pixels are ionised, the global brightness temperatureapproaches zero and so does the local variance. A similar red-shift evolution has been observed in the skewness of pixel dis-tributions within two-dimensional brightness temperature maps(Harker et al. 2009) and can be recovered using a combinationof analytical functions (Ichikawa et al. 2010; Patil et al. 2014).However, these theoretical functions do not say much about thetime and spatial distribution of ionised regions. In order to understand how the ionisation morphology a ff ectsthe characteristic features in σ loc ( z ), which are the amplitudeand ionisation fraction at which σ loc ( z ) reaches its maximum,we compare σ ionloc ( z ) for all the simulations described in Sec. 2.Results for rsage and 21CMFAST are shown in the upper andlower panels of Fig. 4, respectively.As we can see from the top right panel, the three rsage sim-ulations show similar reionisation histories, and, therefore, theapproximate locations of the minima and maxima of the localvariance are similar. However, they di ff er in their amplitudes.Here, in contrast to what we have found for σ whole , there is aclear di ff erence between the three rsage simulations: for exam-ple, at ¯ x e = .
50, the local variances of rsage fej and rsageSFR are about 20% below and 20% above the one of rsageconst , respectively. We find more variance between the rsageSFR slices, since the simulation exhibits larger ionised regions(Seiler et al. 2019). This can be understood as follows. Consideran ionisation field at a given global ionisation level ¯ x e . If thefield is made of a few large ionised bubbles and we cut a slicethrough the box, we are more likely to pick up a large ionisedregion that will bias the filling fraction of the slice µ towardsvalues larger than ¯ x e . Conversely, if the field is made of manysmall ionised regions, such as in the rsage fej simulation, theslices cut through the box are more likely to have similar µ val-ues and σ ionloc will be lower.We confirm these findings and extend our understanding ofhow the characteristics of σ ionloc depend on the reionisation mor-phology by analysing the results we obtain for the three 21CM- FAST simulations that di ff er in their reionisation history andmorphology (see Fig. 1). When computing the local varianceof the 21CMFAST ionisation fields, we find that, similarly to rsage , at a given ionisation level, the simulation with the on av-erage largest ionised regions, M10, yields the largest σ ionloc values.This is in agreement with the findings of Gluscevic & Barkana(2010), who already noticed that if the ionising sources lie inmore massive haloes in one simulation than another, the impacton the 3D pixel distribution is noticeable as the ionised regionsare larger and more scarce at the same global ionisation fraction.Since the three 21CMFAST simulations exhibit not only di ff er-ent ionisation morphology but also di ff erent ionisation histories,the redshift-evolution of σ ionloc varies from one simulation to theother in addition to its variations in amplitude. This result im-plies that measuring the local variance of a field will help us toconstrain the reionisation history.Similar to the rsage simulations, the maximal local varianceis also reached around the reionisation midpoint for the three21CMFAST runs. In general, this is expected to happen when thederivative of the global signal with respect to redshift is maximal(Muñoz & Cyr-Racine 2021). During reionisation, we expectthe signal to be maximal when both ionised and neutral regionsare the largest. Applying the bubble size algorithm granulometry(Kakiichi et al. 2017) to the rsage simulations, we find this to bethe case for all three simulations, at ¯ x e ∼ . z = . ± .
1, 6 . ± . . ± .
1, corresponding toan ionisation level of ¯ x e = . ± .
02, 0 . ± .
02 and 0 . ± . ff erent ionisation lev-els and redshifts, they have their largest ionised regions in com-mon. This indicates that, in contrast to its amplitude, the locationof the maximum of the local variance is purely sensitive to thelarge-scale structure of the ionisation field. As the total numberof ionising photons decreases in M8, M9 and M10, respectively,the large-scale ionised regions will reach their maximum size atlower redshifts; but due to the di ff erent ionising emissivity dis-tributions across sources, the redshifts where the local variancebecomes maximal will correspond to a di ff erent ionisation levelof the box. For example, in Fig 5, we see that the neutral regionsare filled with many small ionised regions in M8, increasing theoverall ionisation level of the simulation to a higher one than inM10 at the redshift of maximal local variance.We have seen that di ff erences in the ionisation morphologyacross simulations translate into di ff erences in the amplitudeof σ ionloc ( z ), and di ff erences in reionisation histories into transla-tions in redshift. We now turn to the di ff erential 21cm bright-ness temperature fields which, as we have seen in the previousSection, encompass additional information from the ionisationmorphology. Fig. 6 shows the redshift-evolution of σ with red-shift for the three rsage simulations (upper panel) and the three21CMFAST runs (lower panel). We briefly note that the three rsage simulations show the same local variance at high redshifts Article number, page 5 of 12 & A proofs: manuscript no. main
Fig. 4.
Evolution of the standard deviation on the distribution of means measured in the set of N slices that can be carved out of the ionisationfields of simulations along one direction with redshift (left panel) and ionisation level (middle panel).
Right panel:
Corresponding reionisationhistories. Results for the three rsage simulations (upper panels) and the three 21CMFAST runs (lower panels) are compared.
Fig. 5.
Snapshots of the ionisation field at the maximum of the local variance for the M8, M9 and M10 simulations, corresponding to di ff erentredshifts and di ff erent ionisation levels: z = . x e = .
67 for M8, z = . x e = .
65 for M9, and z = . x e = .
58 for M10. ( z (cid:38) σ is governed by the same underlying den-sity field. However, as σ becomes sensitive to the ionisationfield, its shape traces the reionisation history, while its ampli-tude is sensitive to the reionisation morphology: as observed forthe ionisation field, here, the rsage SFR simulation, which hasthe largest ionised regions on average, exhibits the largest σ .However, this is not true for the 21CMFAST simulations: in con-trast to what is expected, M10 exhibits the lowest amplitude in σ during reionisation. This is because M10 reionises later thanM8, when the density field is more heterogeneous and the localvariance of the density field larger; such that the anti-correlationbetween δ b and x H i is stronger, adding negative signal to the localvariance of the overall brightness temperature field (see Fig. 3).The pre-factor of Eq. 2, which is proportional to √ + z , alsocontributes to enhancing the signal at higher redshifts and henceto M8 showing a higher amplitude at its maximum because it isreached at a higher redshift. This also explains why M10 exhibits the shallowest dip at z (cid:39) z (cid:38)
12 for M8, z (cid:38) Many limitations related to the nature of instruments are ex-pected to complicate the observation of the 21cm signal fromreionisation. For this reason, we consider the e ff ects of thermalnoise and angular resolution smoothing on our δ T b maps andsubsequent measurements of the local variance. In the follow-ing, we consider the performance characteristics of SKA1- Low (Braun et al. 2019), with a maximum baseline of b max =
65 kmand a total e ff ective collecting area of A tot ∼ m that is Article number, page 6 of 12. Gorce et al.: Using the sample variance of 21cm maps as a tracer of the ionisation topology
Fig. 6.
Local variance of the 21cm brightness temperature fields fromthe rsage (upper panel) and 21CMFAST (lower panel) simulation, as afunction of redshift. Vertical dotted lines show the ionisation midpointof the simulation of the corresponding colour. frequency dependent. Indeed, the Australian interferometer willconsist of about 2 × dipoles, gathered in 512 stations with24 tiles per station. Each individual dipole will have an e ff ectivearea of λ /
3, with λ being the redshifted 21cm wavelength.In order to apply the appropriate SKA1- Low angular smooth-ing to our δ T b maps, we convolve each simulation cube (cor-responding to a given redshift z ) by a Gaussian kernel with aFWHM of θ ( z ) d c ( z ), with d c ( z ) being the comoving distance toredshift z and θ ( z ) = . × λ ( z ) b max (10)the angular resolution of the telescope. Because of the size ofthe SKA1- Low array, its angular resolution will be very high:It will range from 0 .
15 Mpc at z = .
66 Mpc at z = ∆ x = L / N = .
625 Mpc) at all redshifts of interest. Conse-quently, the smoothed and original maps are very similar, and sowill be the resulting σ loc .We then generate thermal random noise by drawing a noisevalue n i from a Gaussian distribution with a variance σ for eachpixel, with the variance given by (Watkinson & Pritchard 2014): Fig. 7.
Local variance of the brightness temperature maps of theM9 21CMFAST simulation, for a clean map (solid blue line), a mapsmoothed to SKA1-
Low angular resolution (dashed orange line), and anoisy map (dash-dotted green line). See text for details. σ ( z ) = . × (cid:32) m A tot (cid:33) (cid:32)
10 arcmin ∆ θ (cid:33) × (cid:32) + z (cid:33) . (cid:115) ∆ ν
100 h t int . (11)Here, ∆ ν is the frequency resolution of the experiment, which wematch for computational e ffi ciency to the resolution of the sim-ulation ∆ x according to ∆ ν = H ν √ Ω m ∆ x / [ c √ (1 + z )] with H being the Hubble constant and ν the rest-frame 21cm fre-quency. SKA1- Low is expected to have a much better frequencyresolution than the comoving cell size of 0 .
625 Mpc used in thiswork, with a channel width of 5 . t int = σ loc for the resultingcoeval cubes.In Fig. 7, we see that, because σ loc is based on variance infor-mation and, therefore, not sensitive to the absolute value of thethe 21cm di ff erential brightness temperature, the variance infor-mation is still accessible in both smoothed and noisy maps, de-spite the amplitude of the noise being comparable to the one ofthe cosmological signal. On the range of redshifts correspond-ing to the bulk of the reionisation process (6 . ≤ z ≤ . . ≤ x e ≤ . σ loc when analysing noisy maps.The noise variance, computed using Eq. (11), increases with in-creasing redshift, which provides an explanation for the roughedges of σ loc in the noisy maps at z >
8. In fact, the noise andthe cosmological signal being uncorrelated, we have σ , smoothed (cid:39) σ , noisy map − σ , noise . (12)Subtracting the two local variances, we obtain the results shownas the dotted line in Fig. 7 and refer to these as the cor-rected results in the following. In these corrected results, the Article number, page 7 of 12 & A proofs: manuscript no. main noise bias has been removed, and the measured local varianceis much closer to the local variance values obtained from theclean smoothed maps than from the uncorrected one. AlthoughEq. (12) is an approximation and overlooks potential correlationsbetween the cosmological signal and the noise within a slice, thegood match between σ loc derived from clean maps and from cor-rected maps shows that their contribution is su ffi ciently small tojustify this approximation. The noise variance, and the associ-ated fluctuations at high redshift, still remain but the location ofthe maximum of the local variance can be recovered.Indeed, we fit a parabola y ( z ) = σ max − ( z − z max ) to thelocal variance data points of our M9 simulation, on a redshiftrange 6 . ≤ z ≤ .
5, for clean, noisy and corrected values. Inthis expression, z max is the redshift when the maximum of the lo-cal variance is reached, and σ max its amplitude. To estimate thecorresponding uncertainties, we derive the standard deviation ofthe local variance by running 21CMFAST for identical physicaland numerical parameters but 200 di ff erent random seeds, whichis equivalent to computing 200 di ff erent realisations of the sim-ulation. We find that the recovered amplitude is identical for allthree data sets, giving σ max = . ± .
24 at the 95% confidencelevel. The location of the peak is slightly shifted towards largerredshifts for noisy data, however remaining within the error barsobtained with clean data ( z max = . + . − . ), and most impor-tantly, within the size of a redshift bin (the redshift step betweentwo snapshots is ∆ z = . ffi cient to claimthat our statistic will keep its characteristics and constrainingpower when analysing observed data, as we have not consid-ered the impact of foreground avoidance or removal on our re-sults. Nevertheless, Harker et al. (2009) have found that one-point statistics are quite robust against the details of foregroundfitting. In contrast, Petrovic & Oh (2011) have shown that fore-ground cleaning can significantly distort the one-point PDF bysmoothing out its bi-modal structure , as it removes the large-scale modes. This will not be an issue for our local varianceanalysis, since, in contrast to the 3D pixel distribution, the dis-tribution of mean values is smoothed by the averages along thefrequency direction, and therefore not bi-modal. Additionally,foreground cleaning will reduce the contrast between neutral andionised regions while maintaining the topology of the map, sothat the distribution of means values and the variance within in-dividual slices should be maintained. Finally, and similarly toour thermal noise analysis, it might be possible to remove the ef-fects of foreground removal from the measured local variance, ifwe can characterise the statistical properties of foreground resid-uals su ffi ciently well. We keep a thorough analysis of the impactof foreground removal on the local variance for future work. In the previous sections, we have discussed the auto-correlationsof slices within a simulation box. Another option is to analysethe cross-correlations between the average values of slices sep-arated by a given distance s . This approach is similar to whathas been done for the one-point PDF in Barkana & Loeb (2008);Ichikawa et al. (2010) and Gluscevic & Barkana (2010). For theM9 simulation, we compute the following variance, to which we Ionised pixels form a Dirac peak centred on zero, whilst warm pixelsare distributed more evenly.
Fig. 8.
Cross-correlations between the average values of two slices cutthrough the M9 simulation and separated by a distance s , for the snap-shot corresponding to z = .
2, normalised by the value at s =
0. Re-sults are presented for the ionisation field (solid blue line) and the 21cmbrightness temperature field (dashed orange line). will refer to as the cross variance: V ( z , s ) = N N (cid:88) k = µ ( k , z ) µ ( k + i s , z ) − ¯ x e , (13)where µ ( k , z ) is the average of the k th slice and µ ( k + i s , z ) is theaverage of the ( k + i s ) th slice with s = i × ∆ x being the dis-tance in Mpc separating them. This cross variance is equivalentto computing the 1D 2-point correlation function of the distribu-tion of the means of slices. An example for the cross-correlationbetween slices at z = . x e = .
47 is shown in Fig. 8 forthe density, ionisation and brightness temperature fields of theM9 simulation. We find that, for all fields, the cross correlationis maximal at s = V ( z , ∼ − for theionisation field and 0 . for the 21cm brightness temperaturefield.We compute the cross variance for all snapshots available inthe M9 simulation. Interestingly, while the amplitude of V ( z , s )changes with redshift for the density field, its shape remains con-stant, and so does V ( z , s ) / V ( z , δ T b field has the same behaviour as the density field, un-til the box is about 10% ionised. For the same redshifts, thescale at which V ( z , s ) derived from the ionisation field becomeszero, s ( z ), remains around 15 Mpc. At ¯ x e ≥ δ T b fieldmostly follows x H i and s starts increasing. The maximum of s is reached at ¯ x e = s at z = . ∼ −
100 Mpc), noise has a zero coherency
Article number, page 8 of 12. Gorce et al.: Using the sample variance of 21cm maps as a tracer of the ionisation topology length (Santos et al. 2005; Mertens et al. 2018). These di ff er-ences provide the basis for statistical 21cm signal separationvia Gaussian Process Regression analysis (GPR, Mertens et al.2018), where the di ff erent components of the observed signal aremodelled in order to remove foregrounds from the 21cm obser-vations. Because their coherency length is much larger than thefrequency bandwidth, the cosmological 21cm signal and fore-grounds measured in two consecutive frequency channels (or,here, slices) should be almost identical and, when subtractingmeasurements in these two channels, only the uncorrelated ther-mal noise should remain. In Patil et al. (2016), the authors usethis technique to estimate the noise properties and remove noisefrom LOFAR data. Indeed, in contrast to foregrounds and cos-mological signal, the thermal noise is found to be uncorrelated,even with a frequency separation as small as 0 . ff erence between two Stokes I images in two consec-utive frequency channels, after removing bright sources, will bedominated by thermal noise. Estimating the noise properties withthis method leads to higher noise levels than when using StokesV. The authors state that this excess noise is due to their calibra-tion with an incomplete model. However, since this excess noiseis uncorrelated between di ff erent observations, multiplying dif-ferent observations will decrease the e ff ective / overall noise.
5. Discussion & Conclusion
In this paper, we have presented a novel first-order statistics, thelocal variance σ loc , which can be used to constrain the historyand morphology of reionisation.The local variance corresponds to the variance of the distri-bution of means of slices taken along an axis in a simulation,or along the frequency direction of observations, if the channelwidth is su ffi ciently narrow. At a fixed global ionisation level,the amplitude of the local variance of the ionisation field σ ionloc isa tracer of the size of ionised regions: it is higher for an ionisationfield showing a few large ionised regions, that is when ionisingsources are more scarce but more e ffi cient at ionising. For a fieldmade of many small ionised regions, as it arises when low-masssources are the main drivers of reionisation for example, the lo-cal variance will be smaller. In future work, we will investigatein more details how the local variance can constrain the physi-cal properties of early galaxies. Additionally, the filling fractionfor which the local variance reaches its maximum is mostly sen-sitive to the large-scale structure of the ionisation field duringreionisation. It will be reached when both ionised and neutralregions are the largest, which occurs when approximately 60%of the box is ionised. We have found that a more biased ion-ising emissivity distribution (that is fewer sources with higherionising emissivities opposed to many sources with lower ion-ising emissivities) results in the maximum local variance beingreached earlier in the reionisation process, that is at a lower ion-isation fraction ¯ x e . Finally, when applying our novel statistics tothe di ff erential 21cm brightness temperature fields, the redshift-evolution of σ exhibits a characteristic shape that traces thereionisation history of the sky patch observed. Before the on-set of reionisation, σ traces the correlations within the densityfield, but becomes sensitive to the ionisation morphology as arising number of ionised regions emerge and grow.We have shown that σ is robust against thermal noise andangular resolution pollution when conducting 1000 hrs observa-tions with SKA1- Low . In particular, if the statistical propertiesof the noise are known su ffi ciently well, the noise contributioncan be removed from the measured signal. We expect this re-sult to hold even after foreground removal. Indeed, σ loc changes only weakly when the one-point PDF is altered by foregroundremoval (Petrovic & Oh 2011). A detailed analysis of the impactof foreground removal on the local variance will be the focus offuture work. In conclusion, local variance data will enable the re-covery the reionisation history in a model-free fashion, as well asthe constraint of astrophysical parameters related to the physicalproperties of early galaxies. Applying our local variance statis-tics to measurements would consist of obtaining the reionisationmidpoint of a statistical sample of small fields of view, which areeither obtained by dividing a larger observational window intosmaller areas or by performing a series of di ff erent observations,and combining these "local" reionisation midpoints to estimatea global reionisation midpoint. In this work, for computationalreasons, individual observations have been represented by thedi ff erent slices in a single coeval cube.We note that other sample shapes could have been consid-ered instead of slices. Since observational data cubes will consistof slices at di ff erent (but gradually increasing) frequencies, ourfindings can be easier translated to the results from observations.Alternatively, Bittner & Loeb (2011) consider the distributionof means measured across beams drawn along the frequency di-rection, and one could also consider the mean of sub-cubes inthe entire coeval simulation box. Preliminary calculations haveshown that the local variances of both shapes su ff er from thesame drawbacks, that is their strong dependency on the simula-tion size and resolution, but yield very similar results.Depending on the reionisation scenario, the 21cm local vari-ance can reach values as high as 1 mK. However, this value willnaturally decrease as larger field of views or higher frequencyresolutions are considered. While in this work, large values ofthe sample variance are desired, it has previously been consid-ered an issue, as it represents an obstacle for a precise estimateof the 21cm global signal or power spectrum and, in turn, of as-trophysical parameters. Muñoz & Cyr-Racine (2021) proposeda way of quickly estimating the sample variance when measur-ing the 21cm global signal as a function of its derivative with re-spect to redshift. Considering a simulation as big as L = . σ ∼ . z = .
8. In parallel, Kaur et al. (2020)estimated that a simulation needs to be at least 200 −
300 Mpcwide to obtain a bias-free 21cm power spectrum on scales of − . < log k / Mpc − <
0. In contrast to other estimators, thelocal variance uses sample variance, often considered an obser-vational bias, to our benefit. Looking for optimal observationalstrategies allowing a maximum amplitude of the local variance,while minimising its error bars, in order to o ff er reliable con-straints, will be the focus of future work. Acknowledgements.
The authors thank Catherine A. Watkinson, Ian Hothi,Adrian Liu and Jordan Mirocha for their useful comments on a draft versionof this paper. The authors also thank Jacob Seiler for providing the rsage sim-ulations. AG and JP acknowledge financial support from the European ResearchCouncil under ERC grant number StG-638743 ("FIRSTDAWN"). AG’s workwas additionally supported by the McGill Astrophysics Fellowship funded bythe Trottier Chair in Astrophysics, as well as the Canadian Institute for AdvancedResearch (CIFAR) Azrieli Global Scholars program. AH acknowledges supportfrom the European Research Council’s starting grant ERC StG-717001 ("DEL-PHI"). The idea of this work was developed thanks to visits between the authorsof this paper, partly funded by the Leids Kerkhoven-Bosscha Fonds (LKBF).This research made use of astropy , a community-developed core Python pack-age for astronomy (Astropy Collaboration et al. 2013, 2018); matplotlib , aPython library for publication quality graphics (Hunter 2007); and of scipy , aPython-based ecosystem of open-source software for mathematics, science, andengineering (Jones et al. 2001) – including numpy (Oliphant 2006).
Article number, page 9 of 12 & A proofs: manuscript no. main
References
Astropy Collaboration, Price-Whelan, A. M., Sip˝ocz, B. M., et al. 2018, ArXive-prints [ arXiv:1801.02634 ]Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558,A33Bañados, E., Venemans, B. P., Mazzucchelli, C., et al. 2018, Nature, 553, 473Banet, A., Barkana, R., Fialkov, A., & Guttman, O. 2020, arXiv e-prints,arXiv:2002.04956Barkana, R. & Loeb, A. 2008, MNRAS, 384, 1069Beardsley, A. P., Morales, M. F., Lidz, A., Malloy, M., & Sutter, P. M. 2015,ApJ, 800, 128Bittner, J. M. & Loeb, A. 2011, J. Cosmology Astropart. Phys., 2011, 038Bouwens, R. J., Illingworth, G. D., Oesch, P. A., et al. 2015, ApJ, 803, 34Braun, R., Bonaldi, A., Bourke, T., Keane, E., & Wagg, J. 2019, arXiv e-prints,arXiv:1912.12699Chapman, E., Abdalla, F. B., Bobin, J., et al. 2013, MNRAS, 429, 165Chapman, E. & Jeli´c, V. 2019, arXiv e-prints, arXiv:1909.12369Ciardi, B. & Madau, P. 2003, ApJ, 596, 1Croton, D. J., Stevens, A. R. H., Tonini, C., et al. 2016, ApJS, 222, 22Datta, K. K., Ghara, R., Majumdar, S., et al. 2016, Journal of Astrophysics andAstronomy, 37, 27DeBoer, D. R., Parsons, A. R., Aguirre, J. E., et al. 2017, PASP, 129, 045001Furlanetto, S. R., Zaldarriaga, M., & Hernquist, L. 2004, ApJ, 613, 1Giri, S. K., Mellema, G., & Ghara, R. 2018, MNRAS, 479, 5596Gluscevic, V. & Barkana, R. 2010, MNRAS, 408, 2373Gorce, A., Douspis, M., Aghanim, N., & Langer, M. 2018, A&AGorce, A. & Pritchard, J. R. 2019, MNRAS, 489, 1321Greig, B., Trott, C. M., Barry, N., et al. 2020, arXiv e-prints, arXiv:2008.02639Harker, G. J. A., Zaroubi, S., Thomas, R. M., et al. 2009, MNRAS, 393, 1449Hunter, J. D. 2007, Computing In Science & Engineering, 9, 90Hutter, A. 2018a, CIFOG: Cosmological Ionization Fields frOm GalaxiesHutter, A. 2018b, MNRAS, 477, 1549Hutter, A., Watkinson, C. A., Seiler, J., et al. 2020, MNRAS, 492, 653Ichikawa, K., Barkana, R., Iliev, I. T., Mellema, G., & Shapiro, P. R. 2010, MN-RAS, 406, 2521Jones, E., Oliphant, T., Peterson, P., et al. 2001, SciPy: Open source scientifictools for PythonKakiichi, K., Majumdar, S., Mellema, G., et al. 2017, MNRAS, 471, 1936Kaur, H. D., Gillet, N., & Mesinger, A. 2020, MNRASKittiwisit, P., Bowman, J. D., Jacobs, D. C., Beardsley, A. P., & Thyagarajan, N.2018, MNRAS, 474, 4487Koopmans, L., Pritchard, J., Mellema, G., et al. 2015, Advancing Astrophysicswith the Square Kilometre Array (AASKA14), 1Liu, A. & Shaw, J. R. 2020, PASP, 132, 062001Majumdar, S., Pritchard, J. R., Mondal, R., et al. 2018, MNRAS, 476, 4007Malloy, M. & Lidz, A. 2013, ApJ, 767, 68Mangena, T., Hassan, S., & Santos, M. G. 2020, MNRAS, 494, 600McGreer, I. D., Mesinger, A., & D’Odorico, V. 2015, MNRAS, 447, 499McQuinn, M., Lidz, A., Zahn, O., et al. 2007, MNRAS, 377, 1043Mellema, G., Iliev, I. T., Pen, U.-L., & Shapiro, P. R. 2006, MNRAS, 372, 679Mertens, F. G., Ghosh, A., & Koopmans, L. V. E. 2018, MNRAS, 478, 3640Mesinger, A. & Furlanetto, S. 2007, ApJ, 669, 663Mesinger, A., Furlanetto, S., & Cen, R. 2011, MNRAS, 411, 955Muñoz, J. B. & Cyr-Racine, F.-Y. 2021, Phys. Rev. D, 103, 023512Oliphant, T. 2006, NumPy: A guide to NumPy, USA: Trelgol Publishing, [On-line; accessed < today > ]Pagano, M. & Liu, A. 2020, MNRAS, 498, 373Patil, A. H., Yatawatta, S., Zaroubi, S., et al. 2016, MNRAS, 463, 4317Patil, A. H., Zaroubi, S., Chapman, E., et al. 2014, Monthly Notices of the RoyalAstronomical Society, 443, 1113Petrovic, N. & Oh, S. P. 2011, MNRAS, 413, 2103Planck Collaboration et al. 2016a, A&A, 594, A1Planck Collaboration et al. 2016b, A&A, 596, A108Pritchard, J. R. & Loeb, A. 2012, Reports on Progress in Physics, 75Robertson, B. E., Ellis, R. S., Furlanetto, S. R., & Dunlop, J. S. 2015, ApJ Letters,802, L19Santos, M. G., Cooray, A., & Knox, L. 2005, ApJ, 625, 575Seiler, J., Hutter, A., Sinha, M., & Croton, D. 2019, MNRAS, 1578Shimabukuro, H., Yoshiura, S., Takahashi, K., Yokoyama, S., & Ichiki, K. 2016,MNRAS, 458, 3003Watkinson, C. A., Giri, S. K., Ross, H. E., et al. 2019, MNRAS, 482, 2653Watkinson, C. A. & Pritchard, J. R. 2014, MNRAS, 443, 3090Zahn, O., Lidz, A., McQuinn, M., et al. 2007, ApJ, 654, 12 Article number, page 10 of 12. Gorce et al.: Using the sample variance of 21cm maps as a tracer of the ionisation topology
Fig. A.1.
Contributions to the local variance of the 21cm brightnesstemperature field in a 21CMFAST simulation.
Appendix A: Local variance of the δ T b field In the following we consider the 21cm brightness temperaturefield to be the direct product of the neutral hydrogen H i and theoverdensity δ b = ρ b / ¯ ρ b − δ T b ( x ) = x H i ( x ) × [1 + δ b ( x )] . (A.1)If we consider the k th slice along the frequency direction in thesimulation, we yield from the previous equation δ T b , loc ( k ) = E [ δ T b ( k )] = E [ x H i ( k ) × (1 + δ b ( k ))] , (A.2)where E is the expectation value, δ T b , loc ( k ) is the mean of the21cm field of the slice, x H i ( k ) its 2D H i field and δ b ( k ) its 2Doverdensity field. Since these two fields are correlated, equationA.2 can be reformulated as δ T b , loc ( k ) = ¯ x H i ( k ) × (1 + ¯ δ b ( k )) + Cov [ x H i ( k ) × (1 + δ b ( k ))] . (A.3)Then the local variance is the variance of the distribution of δ T b , loc values. If we write X = ¯ x H i ( k ) × (1 + ¯ δ b ( k )) and Y = Cov [ x H i ( k ) × (1 + δ b ( k ))] then σ loc2 = Var( δ T b , loc ) = X , Y ) + Var X + Var Y = Cov[ ¯ x i , (1 + ¯ δ b ) ] + (cid:16) σ , H i + ¯ x i (cid:17) (cid:16) σ ,δ b + (cid:17) − (cid:104) Cov[ ¯ x H i , (1 + ¯ δ b )] + ¯ x H i (cid:105) . (A.4)These di ff erent elements are represented in Fig. A.1. We see thatthe final shape of the local variance is mostly made of the vari-ance of X , the product of the two local fields, and of the covari-ance of X and Y , which are both complicated objects di ffi cultto interpret. However, as shown in Fig. 3, considering the localvariances of x H i and 1 + δ b , as well as their covariance, is su ffi -cient to understand the redshift-evolution of σ . Appendix B: Tests on toy models
We generate toy models with dimensions equal to those of the rsage simulations, i.e. a comoving box length of L =
160 Mpcand 256 grid cells on each side. The initial field is a 3D neu-tral box filled with enough bubbles of radius R init to reach afilling fraction of ¯ x e = .
01, which are then artificially grown
Fig. B.1.
Local variance σ loc for toy models with di ff erent starting ra-dius. by 1 cell in radius until a filling fraction of 100% is reached.We compute the local variance for each of the resulting boxes.Results can be seen in Fig. B.1 in comparison to a control testwhere ionised pixels are randomly distributed. We evolve thetoy model boxes with di ff erent initial radii: a large R init will beequivalent to a higher mass threshold for ionising sources, thatis an increasing R init will be equivalent to an increasing M turn in21cmFAST or transitioning from rsage fej via rsage const to rsage SFR . The di ff erence to 21CMFAST and rsage is thationised bubbles are randomly located, so the ionising sourcesare not clustered. Additionally, there are no new ionised regionsthroughout the process, since all the bubbles are initialised inthe first field. Despite these di ff erences, the shape of σ ionloc ( x e ) re-mains unchanged, and the maximum is still reached for a fillingfraction of about 0 .
60: ¯ x e = .
61 for the field with small bubbles,and ¯ x e = .
56 for the field with large bubbles. Similarly to whathas been seen from the 21CMFAST simulations, the simulationwith the largest bubbles reaches on average its maximum localvariance at smaller filling fractions.
Appendix C: Dependence on box size and otherlimitations
Because the information contained in σ loc is purely related tothe sample variance, σ loc will depend on the size of the box con-sidered: the larger the box, the smaller the variance. In order tocompare with the fields-of-views anticipated for upcoming 21cmexperiments, we ask what is the limiting size that allows us todi ff erentiate between reionisation models with σ loc . To do so,we generate a 21CMFAST simulation box for M turn = M (cid:12) ,side length L =
480 Mpc and cell size ∆ x = .
625 Mpc, sameas before. We divide this large simulation into sub-cubes of de-creasing size, until a side length of L =
15 Mpc is reached(All sub-cubes have identical reionisation histories.). We com-pute the local variance obtained from the 21cm brightness tem-perature fields of all sub-cubes and compare their values inFig. C.1. The maximum signal is reached for the smallest boxsize, L =
15 Mpc, and reaches values as high as 5 . σ loc ,δ T b ∼ . × (cid:32) L
100 Mpc (cid:33) − . . (C.1) Article number, page 11 of 12 & A proofs: manuscript no. main
Fig. C.1.
Evolution of the standard deviation of the x loc distributions forM9 simulations of decreasing side length but constant resolution. SKA1-
Low has a field of view of 327 arcmin at nomi-nal frequency of 110 MHz, which corresponds to L = , , , z = , , ,
11 and 13, re-spectively. For such wide observational windows, the samplevariance is weak: at z =
6, towards the end of reionisation, itwill be about ∼ .
25 mK. This relation is only valid at a timewhen about 60% of the IGM is ionised, which will correspond todi ff erent redshifts depending on the reionisation scenario. How-ever, it will give the maximum amplitude of σ loc and is thereforea useful choice, if we want to estimate the errors on 21cm globalsignal measurements. Interestingly, it is about 10 times smallerthan the one found by Muñoz & Cyr-Racine (2021) at z = . σ loc to depend on the resolution ofthe simulation considered – or on the angular resolution of thetelescope used. Indeed, Banet et al. (2020) already pointed outthe impact of instrument resolution and smoothing on the one-point and di ff erential PDF. The anticipated angular resolution ofSKA1 at a nominal frequency of 110 MHz is expected to be11 arcsec (Braun et al. 2019), that is between 0 .
45 and 0 .
55 Mpcon the redshift range 5 ≤ z ≤
13. Running 21CMFAST for M turn = M (cid:12) and a fixed number of cells but an increasingresolution, from ∆ x = . σ loc requires to have a su ffi cient numberof slices available: it is obvious from Fig. 2 that without a suf-ficient number of slices, the x loc distributions will be noisy andtheir variance σ loc will not be reliable. For the M9 simulation,we find that 128 slices, so half of the box, will still provide sat-isfying results, while lower sample sizes make σ loc unusable.However, thanks to the very high frequency resolution of SKA,anticipated to be 5 . ffi cient number of images at su ffi cientlyclose redshifts for σ loc to give interesting results. This will bethe focus of future work.to give interesting results. This will bethe focus of future work.