Mon. Not. R. Astron. Soc., 1–5 (2013)

Sparsely Sampling the Sky: Regular vs Random Sampling
P. Paykari*, S. Pires, J.-L. Starck and A. H. Jaffe
Laboratoire AIM, UMR CEA-CNRS-Paris 7, Irfu, SAp/SEDI, Service d'Astrophysique, CEA Saclay, F-91191 Gif-sur-Yvette Cedex, France
Department of Physics, Blackett Laboratory, Imperial College, London SW7 2AZ, United Kingdom
Accepted 2012
ABSTRACT
The next generation of galaxy surveys, aiming to observe millions of galaxies, are expensive both in time and cost. This raises questions regarding the optimal investment of this time and money for future surveys. In a previous work, it was shown that a sparse sampling strategy could be a powerful substitute for contiguous observations. However, in this previous paper a regular sparse sampling was investigated, where the sparsely observed patches were regularly distributed on the sky. The regularity of the mask introduces a periodic pattern in the window function, which induces periodic correlations at specific scales. In this paper, we use Bayesian experimental design to investigate a random sparse sampling, where the observed patches are randomly distributed over the total sparsely sampled area. We find that, as there is no preferred scale in the window function, the induced correlation is evenly distributed amongst all scales. This could be desirable if we are interested in specific scales in the galaxy power spectrum, such as the Baryonic Acoustic Oscillation (BAO) scales. However, for constraining the overall galaxy power spectrum and the cosmological parameters, there is no preference between regular and random sampling. Hence any approach that is practically more suitable can be chosen, and we can relax the regular-grid condition for the distribution of the observed patches.
Key words: cosmology
INTRODUCTION

The accurate measurement of the cosmological parameters relies on accurate measurements of a power spectrum, which describes the spatial distribution of an isotropic random field. When the perturbations are assumed to be uncorrelated Gaussian random fields in Fourier space, the power spectrum defines them completely. The power spectrum (or its Fourier transform, the correlation function) is what surveys actually measure, and from it the cosmological parameters are inferred. The observed spectra are normally a convolution of the primordial power spectrum and a transfer function, which depends on the cosmological parameters. One of the most important observed spatial power spectra is the galaxy power spectrum, which was first formulated by Peebles (1973). This is defined as

P_g(k) = 2π² · b²(k) · k · T²(k) · P_p(k),   (1)

where P_p(k) = A_s k^(n_s − 1) is the primordial power spectrum (which measures the statistical distribution of the perturbations in the early universe, for example just after the inflationary era) and T(k) is the transfer function, which depends upon the cosmological parameters (e.g., the matter density Ω_m, the scalar spectral index n_s, etc.) responsible for the evolution of the universe. The bias b relates the galaxy power spectrum to the underlying matter power spectrum. The galaxy power spectrum is very rich in terms of constraining a large range of cosmological parameters. On large scales it probes structures that are less affected by clustering and evolution and hence have a 'memory' of the initial state of the universe. On intermediate scales the spectrum informs us about the evolution of the universe, for example the epoch of matter-radiation equality. On relatively small scales there is a great deal of information about galaxy clustering via the Baryonic Acoustic Oscillations (BAO), which encode information about the sound horizon at the time of recombination.

* E-mail: [email protected]
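As an illustration of Equation (1), the sketch below evaluates a toy galaxy power spectrum. The BBKS fitting formula is used here as a stand-in for the transfer function, and the values of A_s, n_s and the bias b are illustrative assumptions, not the fiducial choices of this paper:

```python
import numpy as np

def transfer_bbks(k, omega_m=0.25, h=0.7):
    """BBKS fitting formula for the CDM transfer function T(k).

    k is in h/Mpc; the shape parameter is approximated as Omega_m * h,
    ignoring the baryon correction (an assumption for this sketch)."""
    q = k / (omega_m * h)
    return (np.log(1.0 + 2.34 * q) / (2.34 * q)
            * (1.0 + 3.89 * q + (16.1 * q) ** 2
               + (5.46 * q) ** 3 + (6.71 * q) ** 4) ** -0.25)

def galaxy_power(k, a_s=2.1e-9, n_s=0.96, bias=2.0):
    """Galaxy power spectrum following Eq. (1):
    P_g(k) = 2 pi^2 b^2(k) k T^2(k) P_p(k), with P_p(k) = A_s k^(n_s - 1)."""
    p_p = a_s * k ** (n_s - 1.0)   # primordial spectrum
    return 2.0 * np.pi ** 2 * bias ** 2 * k * transfer_bbks(k) ** 2 * p_p

k = np.logspace(-3, 0, 200)   # wavenumbers in h/Mpc
pg = galaxy_power(k)
```

With these inputs P_g(k) grows roughly as k^(n_s) on large scales and turns over where the transfer function suppresses power, qualitatively reproducing the shape discussed above.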
Therefore, measuring the galaxy power spectrum on a large range of scales helps us to constrain a range of cosmological parameters.

For accurate measurements of the galaxy power spectrum, surveys aim to maximise the observed number of galaxies to overcome the Poisson noise. Considering the large investments in time and money for these surveys, one would like to know the optimal survey strategy. For example, to investigate larger scales, it may be more efficient to observe a larger, but sparsely sampled, area of sky instead of a smaller contiguous area. In this case one gathers a larger density of states in Fourier space, but at the expense of an increased correlation between different scales (aliasing). This would smooth out features on certain scales and decrease their statistical significance. The sparse sampling approach was investigated in a previous paper (Paykari & Jaffe 2012), where the advantages and disadvantages of such a design were studied. It was shown that a sparse sampling could be a powerful substitute for a contiguous sampling. In particular, it was shown that for a survey similar to the Dark Energy Survey (DES), a sparse design could help reduce the observing time (and hence the cost of the survey) for the same amount of constraining power for the cosmological parameters. Alternatively, for the same amount of observing time, one can observe a larger, but sparsely sampled, area of the sky to improve the constraining power of the survey. However, in their sparse design, the observed patches were regularly distributed over the total sampled area of the sky. The fixed and determined positions of the patches introduce a periodic pattern in the window function, which induces a periodic aliasing of scales. This causes certain scales, which correspond to the fixed distances between the patches, to be more aliased than others.
This regular design may, therefore, not be desirable for two reasons: 1. if we are interested in certain scales in the galaxy power spectrum, such as the Baryon Acoustic Oscillation (BAO) scales or the scale of matter-radiation equality, the regular design may not be preferred due to its periodic induced aliasing; 2. a rigid regular distribution of the observed patches may not be practically feasible, as there are regions we would like to avoid, such as the plane of the Milky Way.

If one is interested in all scales equally and would like to measure all the scales with the same statistical significance, a random sparse sampling may be the preferred approach. In this case, the patches are randomly distributed over the total sampled area, which is also practically more feasible. In this work, we investigate both regular and random sparse sampling. As in Paykari & Jaffe (2012), we will make use of Bayesian experimental design and the figure of merit (FoM) to select the optimal design for constraining the galaxy power spectrum bins and a set of cosmological parameters.

Bayesian methods have recently been used in cosmology for model comparison and for deriving posterior probability distributions for the parameters of different models. Bayesian statistics can also be used to investigate the performance of future experiments, based on our current knowledge (Liddle et al. 2006; Trotta 2007a,b). We will use this strength of Bayesian statistics to optimise the strategy for observing the sky with galaxy surveys. For such an optimisation, we need to satisfy three requirements: 1. specify the parameters that define the experiment; 2. specify the parameters to constrain (with respect to which the survey is optimised); 3. specify a quantity of interest, generally called the figure of merit (FoM), associated with the proposed experiment.
We want to extremise the FoM subject to constraints imposed by the experiment or by our knowledge about the nature of the universe.

Let us assume e denotes the different experimental designs, M_i are the different models with their parameters θ_i, and experiment o has already been performed (this experiment's posterior P(θ | o) forms our prior probability function for the new experiment). The FoM (sometimes called the utility function) will depend on the parameters of interest, the previous experiment (data) and the characteristics of the future experiment: U(θ, e, o). From this, we can build the expected utility E[U] as

E[U | e, o] = Σ_i P(M_i | o) ∫ dθ̂_i U(θ̂_i, e, o) P(θ̂_i | o, M_i),   (2)

where θ̂_i represent the fiducial parameters for model M_i. Our knowledge of the universe is described by the current posterior distribution P(θ̂ | o). Averaging the utility over the posterior accounts for the present uncertainty in the parameters, and summing over all the available models accounts for the uncertainty in the underlying true model. The aim is to select an experiment that extremises the utility function (or its expectation). One of the common choices for the FoM is some scalar function of the Fisher matrix, which is the expectation of the inverse covariance of the parameters in the Gaussian limit (this will be explained in the next section). Three common FoMs (Hobson et al. 2009), which we will be using as well, are

• A-optimality = log(trace(F)): the trace of the Fisher matrix F (or its log), which is proportional to the sum of the variances.
• D-optimality = log(|F|): the determinant of the Fisher matrix F (or its log), which measures the inverse of the square of the parameter volume enclosed by the posterior.
• Entropy (also called the Kullback-Leibler divergence):

Entropy = ∫ dθ P(θ | θ̂, e, o) log [ P(θ | θ̂, e, o) / P(θ | o) ]
        = (1/2) [ log|F| − log|Π| − Tr(I − Π F⁻¹) ],   (3)

where P(θ | θ̂, e, o) is the posterior distribution with Fisher matrix F and P(θ | o) is the prior distribution with Fisher matrix Π. The posterior Fisher matrix is F = L + Π, where L is the likelihood Fisher matrix, corresponding to the sparse survey we have designed. Here, the FoMs are defined so that they need to be maximised for an optimal design. For a detailed comparison between the above FoMs please refer to Hobson et al. (2009) and Paykari & Jaffe (2012). Note that these are not the 'expected' utility functions: in our current models of the universe, we do not expect a significant difference between the parameters of the same model.

The Fisher matrix (Kendall & Stuart 1977; Tegmark 1997) has been largely used for optimisation and forecasting. (One can also refer to the Dark Energy Task Force (DETF; Albrecht et al. 2006) FoM, which uses Fisher-matrix techniques to investigate how well each experiment would be able to restrict the dark energy parameters w_0, w_a and Ω_DE.) The Fisher matrix is defined as the ensemble average of the curvature of the likelihood function L (i.e., it is the average of the curvature over many realisations of signal and noise):

F_ij = ⟨F_ij⟩ = ⟨ −∂² ln L / ∂θ_i ∂θ_j ⟩ = (1/2) Tr[ C_,i C⁻¹ C_,j C⁻¹ ],   (4)

where the third equality is appropriate for a Gaussian distribution with correlation matrix C determined by the parameters θ_i. The inverse of the Fisher matrix is an approximation to the covariance matrix of the parameters, by analogy with a Gaussian distribution in the θ_i, for which this would be exact.
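The three FoMs above are simple scalar functions of Fisher matrices and can be sketched directly. The toy prior Π and likelihood L matrices below are invented for illustration only; the marginalised errors follow the Cramér-Rao expression discussed next:

```python
import numpy as np

def a_optimality(F):
    """A-optimality: log of the trace of the Fisher matrix F."""
    return np.log(np.trace(F))

def d_optimality(F):
    """D-optimality: log-determinant of F (a posterior-volume measure)."""
    return np.linalg.slogdet(F)[1]

def entropy_fom(L, Pi):
    """Kullback-Leibler FoM of Eq. (3), with posterior Fisher F = L + Pi."""
    F = L + Pi
    eye = np.eye(F.shape[0])
    return 0.5 * (np.linalg.slogdet(F)[1] - np.linalg.slogdet(Pi)[1]
                  - np.trace(eye - Pi @ np.linalg.inv(F)))

def marginalised_errors(F):
    """Cramer-Rao marginalised one-sigma errors, sqrt((F^-1)_ii)."""
    return np.sqrt(np.diag(np.linalg.inv(F)))

# invented toy matrices: a weak prior Pi and a more informative likelihood L
Pi = np.eye(3)
L = np.diag([10.0, 5.0, 2.0])
F = L + Pi
```

By construction the entropy FoM is positive whenever the new experiment adds information, and the D-optimality of the posterior exceeds that of the prior.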
The Cramér-Rao inequality states that the smallest frequentist one-sigma error measured by any unbiased estimator is 1/√(F_ii) (non-marginalised) and √((F⁻¹)_ii) (marginalised). (It should be noted that the Cramér-Rao inequality is a statement about so-called 'frequentist' confidence intervals and is not strictly applicable to 'Bayesian' errors; 'marginalised' refers to integration of the joint probability over the other parameters.) The derivatives in Equation 4 generally depend on where in the parameter space they are calculated, and hence the Fisher matrix is a function of the fiducial parameters. We further note, as in all uses of the Fisher matrix, that any results thus obtained must be taken with the caveat that these relations only map onto realistic error bars in the case of a Gaussian distribution, usually most appropriate in the limit of high signal-to-noise ratio, so that the conditions of the central limit theorem obtain. In the case of no extremely degenerate parameter directions, we expect that our results will be indicative of a full analysis (Trotta 2007c).

Following Tegmark (1997), the data in pixel i are defined as ∆_i ≡ ∫ d³x ψ_i(x) [n(x) − n̄]/n̄, where n(x) is the galaxy density at position x and n̄ is the expected number of galaxies at that position. The weighting function ψ_i(x), which determines the pixelisation and the shape of the survey, is defined as a set of Fourier pixels

ψ_i(x) = S(x) e^(i k_i·x)/√V  for x inside the survey volume, and 0 otherwise,   (5)

where V is the total volume of the survey and S(x) is the mask (i.e., the design of the survey that, for example, defines the distribution of the observed patches). We design the sparsely sampled area of the sky as a distribution of n_p × n_p square patches of size M × M; see Figure 1. Therefore, the structure of the mask S on the sky is defined as a top-hat in both the x and y directions and as a step function in the z direction:

S(x) = Θ(z) × Σ_{n,m} Π(x − x_n, y − y_m),   (6)

where x_n and y_m mark the centres of the patches in our coordinate system and the functions are defined as

Π(x − x_n, y − y_m) = 1 for |(x − x_n, y − y_m)| < M/2, and 0 otherwise,   (7)

Θ(z) = 1 for z_min < z < z_max, and 0 otherwise.   (8)

Dividing the survey volume into sub-volumes i, ∆_i is then the fractional over-density in pixel i. Using this pixelisation we can define a covariance matrix ⟨∆_i ∆*_j⟩ = C_ij = (C_S)_ij + (C_N)_ij, where C_S and C_N are the signal and noise covariance matrices respectively, and are assumed independent of each other. For generality, we take the complex conjugate of one member of the pair. By equating the number over-density [n(x) − n̄]/n̄ to the continuous over-density δ(x) = [ρ(x) − ρ̄]/ρ̄, the signal and the noise covariance matrices can be defined as

(C_S)_ij = ⟨∆_i ∆*_j⟩ = ∫ dk k²/(2π)³ P(k) W_ij(k),   (9)

(C_N)_ij = ⟨N_i N*_j⟩ = ∫ dk k²/(2π)³ (1/n̄) W_ij(k),   (10)

where ψ̃_i(k) is the Fourier transform of ψ_i(x) and the window function W_ij(k) is defined as the angular average of the square of the Fourier transform of the weighting function: W_ij(k) = ∫ dΩ_k ψ̃_i(k) ψ̃*_j(k). For a full analysis of the above equations please refer to Dodelson (2003) or Paykari & Jaffe (2012). This prescription gives us a data covariance matrix for a galaxy survey, from which we can obtain a Fisher matrix for the parameters of interest using Equation 4 above.

Figure 1. Design of the mask for random (left) and regular (right) sampling. The patches (we are observing through the white square patches in the figure), of size M, are distributed randomly and regularly on the surface of the sky. The total observed area is the sum of the areas of all the patches, (n_p × M)², and the total sampled area is the total area which bounds both the masked and the unmasked regions. Hence the fractional sky coverage is f_sky = (n_p × M)²/A_tot, which is the same in both designs.

We have chosen a geometrically flat ΛCDM model with adiabatic perturbations, described by the five parameters Ω_m, Ω_b, Ω_Λ, τ and h, where H_0 = 100 h km s⁻¹ Mpc⁻¹. As explained above, the FoMs used are Entropy, A-optimality and D-optimality, where an SDSS-LRG-like survey has been chosen as the prior Fisher matrix Π. As in Paykari & Jaffe (2012), we use a flat-sky approximation to sparsify a survey similar to that of the DES survey.
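The masks of Equations (6)–(8) can be sketched on a discrete pixel grid. This is a simplified illustration with arbitrary grid and patch sizes; note the paper builds its random mask by reflecting a smaller random mask, whereas this sketch simply draws patch positions uniformly (so overlaps are possible and the random f_sky can be slightly smaller):

```python
import numpy as np

def regular_mask(n_grid=256, n_p=8, m=8):
    """Regular sparse mask, Eq. (6): n_p x n_p square patches of side m
    pixels, placed on a uniform grid over an n_grid x n_grid area."""
    mask = np.zeros((n_grid, n_grid), dtype=bool)
    step = n_grid // n_p
    for i in range(n_p):
        for j in range(n_p):
            mask[i * step:i * step + m, j * step:j * step + m] = True
    return mask

def random_mask(n_grid=256, n_p=8, m=8, seed=0):
    """Random sparse mask: same number and size of patches, with patch
    corners drawn uniformly (overlaps are possible in this sketch)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n_grid, n_grid), dtype=bool)
    for _ in range(n_p * n_p):
        x0, y0 = rng.integers(0, n_grid - m, size=2)
        mask[x0:x0 + m, y0:y0 + m] = True
    return mask

reg = regular_mask()
rnd = random_mask()
f_sky = reg.sum() / reg.size   # fractional sky coverage of the regular design
```

The regular design observes exactly (n_p × m)² pixels; the random design observes at most that many, matching the paper's requirement of equal patch number and size in both cases.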
We divide the total sparsely sampled area of the sky into small square patches and distribute them randomly (left panel of Figure 1) and regularly (right panel of Figure 1). (The Dark Energy Survey (DES; The Dark Energy Survey Collaboration 2005) started taking data in December 2012 and will continue for five years, cataloguing 300 million galaxies in the southern sky over an area of 5000 square degrees and a range of redshifts.) Note that there are two scales that control the behaviour of the window function: the size of the patches and the distance between them. In both designs the size and the number of the patches are kept the same, so that the fractional sky coverage, f_sky, is the same in both cases. The only difference between the two designs is the distance between the patches, due to their different distributions.

Figure 2 shows the middle row of the power spectrum Fisher matrix for regular (black) and random (blue) sampling. In both cases, the main peak in the middle is the expected inverse error of the middle bin of the power spectrum. Going away from the main peak, each point represents the correlation between that bin and the middle one. In the case of regular sampling, apart from the main peak at the centre, there are secondary peaks at other scales, indicating an induced correlation at these scales. The position of the secondary peaks is a consequence of the fixed distances between the patches in the mask. The regularity in the mask introduces a periodic pattern in the window function, which in turn induces correlations at that period. Therefore, in this design, certain scales can be measured less significantly than others. This could be a disadvantage if we wish to constrain the behaviour of the power spectrum at a certain scale, such as the BAO scale. On the other hand, in the case of random sampling, the distance between the patches is not fixed.
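A one-dimensional analogue makes this contrast concrete: the squared Fourier transform of a regular patch mask (the 1D counterpart of the window function W(k)) is non-zero only at harmonics of the patch-spacing frequency, whereas a random mask with the same patch number and width spreads that power over all frequencies. The grid sizes below are arbitrary assumptions for illustration:

```python
import numpy as np

n, m, n_p = 1024, 8, 16   # grid samples, patch width, number of patches (1D)

# regular mask: one patch of width m every n // n_p samples
reg = np.zeros(n)
for i in range(n_p):
    reg[i * (n // n_p): i * (n // n_p) + m] = 1.0

# random mask: same number and width of patches, uniformly placed
# (overlaps are possible here; the paper's reflected construction avoids them)
rng = np.random.default_rng(1)
rnd = np.zeros(n)
for s in rng.integers(0, n - m, size=n_p):
    rnd[s:s + m] = 1.0

# |S~(k)|^2 is the 1D analogue of the window function W(k)
w_reg = np.abs(np.fft.rfft(reg)) ** 2
w_rnd = np.abs(np.fft.rfft(rnd)) ** 2

# the regular mask is periodic, so its window is non-zero only at
# frequency indices that are multiples of n_p; the random one leaks everywhere
harmonic = np.arange(w_reg.size) % n_p == 0
```

The concentration of the regular window at the harmonics is the 1D counterpart of the secondary peaks seen in the Fisher matrix, while the random mask's off-harmonic leakage corresponds to correlations spread evenly over all scales.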
As there is no preferred scale in the mask, all scales are constrained with almost the same level of significance in this design. This is desirable for constraining the power spectrum at a certain scale, as the power leakage from the main peak is evenly distributed amongst all scales. Also, note that the amplitude and the width of the main peak are controlled by the fractional sky coverage, f_sky, and the total sparsely sampled volume, V_tot, respectively. As f_sky and V_tot are the same in regular and random sampling, the main peaks have the same amplitude and width in both cases.

Table 1 shows the FoM for the galaxy power spectrum bins on the left and the cosmological parameters on the right. As can be seen, both regular and random designs have very similar values for both the power spectrum bins and the parameters. This shows that, for the same f_sky, the arrangement of the patches does not play an important role in constraining the galaxy spectrum bins or the parameters. Therefore, the constraining power of the survey is not controlled by the distribution of the patches but rather, as investigated in Paykari & Jaffe (2012), by the total extent of the sampled area. However, note that the FoMs we have chosen here do not measure the constraining power of the survey for a particular scale. Rather, they measure the integrated constraining power over all scales of the spectrum. If we wish to measure a particular scale, random sampling would be the preferred approach, as it causes an evenly distributed leakage of the power into all the scales.

Table 1. Figure of Merit (FoM) for the galaxy power spectrum and the cosmological parameters for regular and random sampling. The FoMs are defined so that they need to be maximised for an optimal design.

                  Power Spectrum          Cosmological Parameters
FoM               Regular    Random       Regular    Random
Entropy           39.3       39.2         3.5        3.5
D-optimality      -1415.5    -1415.9      11.2       11.3
A-optimality      -20.6      -20.6        7.8        7.8

To this end we summarise the main features of Fig. 2:
1. The width of the main peak in both designs (and of the secondary peaks in the regular case) is controlled by the total size of the survey. As this is the same in both designs, the width is the same in both cases.
2. The position of the secondary peaks in the regular case is controlled by the positions of the patches in the mask. Note that apart from the periodicity in the x and y directions, there is also periodicity along the 45° diagonal; this causes the additional, smaller secondary peaks.
3. The size of the patches generates an envelope function over the whole k range. As the patches are so much smaller than the total size of the survey, their effect over our k range is negligible. Also, the patches have the same size in both designs, so their effect on the window function is exactly the same.
4. As the random mask has been designed as a reflection of a smaller random mask in x and y, and is hence not completely randomised over the whole area, some regularities are expected. For example, the symmetric shoulders on the main peak of the random case are due to the patches placed at the edges of the mask. Closer patches also have an effect, with a smaller amplitude, over a larger range of k. This effect is below the remaining small random fluctuations.
5. As f_sky is the same in both designs, the total information gained in both surveys is the same.
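Equation (11), just below, argues that the mask averaged over random patch positions is constant away from the survey edges. A quick one-dimensional Monte Carlo check of that claim (all numbers here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
n, m, trials = 256, 8, 20000   # 1D survey length, patch width, Monte Carlo draws

# drop a single patch of width m at a uniformly random position, many times,
# and average the resulting masks: Eq. (11) predicts a constant <S(x)>
coverage = np.zeros(n)
for s in rng.integers(0, n - m, size=trials):
    coverage[s:s + m] += 1.0
coverage /= trials

interior = coverage[m:n - m]   # edge pixels are covered less often; exclude them
expected = m / (n - m)         # chance that a given interior pixel is covered
```

Away from the edges the averaged coverage is flat at the expected level, which is the content of the ⟨S(x)⟩ ∼ constant statement.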
Note that the average over the positions of the patches in the mask is constant:

⟨S(x)⟩_nm ∼ ⟨Π(x − x_n, y − y_m)⟩_nm
          ∼ ∫ dx_n dy_m p(x_n) p(y_m) Π(x − x_n, y − y_m)
          ∼ (1/XY) ∫_{−X}^{X} dx_n ∫_{−Y}^{Y} dy_m Π(x − x_n, y − y_m)
          ∼ constant,   (11)

where X and Y are the total extents of the survey in the x and y directions, and p(x_n) and p(y_m) are the probability distributions of the patch positions in the x and y directions. Therefore, in terms of the FoM, both designs have the same constraining power for the galaxy power spectrum and the cosmological parameters.

Figure 2. The middle row of the power spectrum Fisher matrix for regular (black) and random (blue) sampling. In both cases, the main peak in the middle is the inverse error (remember this is the Fisher matrix) of the middle bin of the power spectrum. Going away from the main peak, each point represents the correlation between that bin and the middle bin. In the case of regular sampling, there are secondary peaks at specific scales, which is due to the fixed positions of the patches. On the other hand, for random sampling, there is no preferred scale and the correlation is evenly distributed between all scales. The symmetric shoulders on the main peak in the random case are due to the design of the random mask, which has been obtained by a reflection in the x and y plane for simplicity (see main text). Note that the y-axis is in log scale.

CONCLUSIONS

For future surveys, one would like to know the optimal investment of time and money. In the current era, where statistical errors have been greatly reduced and compete with systematic errors, observing a greater number of galaxies (to overcome the Poisson noise) may not necessarily improve our results. One desires more strategic ways to make observations and to take control of systematics. This inspired a new approach to making observations (see Paykari & Jaffe 2012), in which the sampled area was covered sparsely as opposed to contiguously. In this case one gathers a larger density of states in Fourier space, but at the expense of an increased correlation between different scales (aliasing). This would smooth out features on certain scales and decrease their statistical significance. In that work, the area of the sky was divided into small square patches, regularly distributed across the total area. It was shown that the loss of constraining power induced by the sparse sampling is negligible. More interestingly, it was shown that for the same amount of observing time, one could sparsely sample a larger total area of sky, which improves the constraining power of the survey. One therefore gains a great deal by spending the same amount of time on a larger but sparsely sampled area. Hence sparse sampling could be a promising substitute for contiguous observations and the way forward for designing future surveys. However, one constraint in this previous design was the fixed and determined positions of the observed patches.
The regular design of the mask introduces a periodic pattern in the window function, which induces periodic correlations at specific scales corresponding to the distances between the patches. This can be a problem if we are interested in a specific scale in the power spectrum.

In this work, we have compared random sampling of the sky to regular sampling. In the random design, as there is no preferred scale in the mask, we find that all scales are constrained with almost the same level of significance. Moreover, in terms of constraining the power spectrum over all scales and constraining the cosmological parameters, there is no difference between regular and random sampling. Therefore, the arrangement of the patches does not control the constraining power of the survey for the galaxy spectrum or parameter measurements. This means we can relax the regular-grid condition in the sparse mask, and any pattern that is practically more suitable can be applied. This is good news because, in practice, it is hard to have a regular mask, as there are always regions in the sky one would like to avoid, such as the plane of the Milky Way.

ACKNOWLEDGEMENTS

The authors would like to thank A. Woiselle and F. Lanusse for the kind discussions. This work was supported by the European Research Council grant SparseAstro (ERC-228261).
REFERENCES
Albrecht A., Bernstein G., Cahn R., Freedman W. L., Hewitt J., Hu W., Huth J., Kamionkowski M., Kolb E. W., Knox L., Mather J. C., Staggs S., Suntzeff N. B., 2006, ArXiv Astrophysics e-prints
Dodelson S., 2003, Modern Cosmology
Hobson M. P., Jaffe A. H., Liddle A. R., Mukherjee P., Parkinson D., 2009, Bayesian Methods in Cosmology
Kendall M., Stuart A., 1977, The Advanced Theory of Statistics. Vol. 1: Distribution Theory
Liddle A., Mukherjee P., Parkinson D., 2006, Astronomy and Geophysics, 47, 040000
Paykari P., Jaffe A. H., 2012, ArXiv e-prints
Peebles P., 1973, ApJ, 185, 413
Tegmark M., 1997, Physical Review Letters, 79, 3806
The Dark Energy Survey Collaboration, 2005, ArXiv Astrophysics e-prints
Trotta R., 2007a, MNRAS, 378, 72
Trotta R., 2007b, MNRAS, 378, 819
Trotta R., 2007c, MNRAS, 378, 819