The VIMOS Public Extragalactic Redshift Survey (VIPERS): On the correct recovery of the count-in-cell probability distribution function

Abstract

We compare three methods to measure the count-in-cell probability density function of galaxies in a spectroscopic redshift survey. From this comparison we find that, when the sampling is low (the average number of objects per cell is around unity), it is necessary to use a parametric method to model the galaxy distribution. We use a set of mock catalogues of VIPERS to verify that we are able to reconstruct the cell-count probability distribution once the observational strategy is applied. We find that, in the simulated catalogues, the probability distribution of galaxies is better represented by a Gamma expansion than by a Skewed Log-Normal. Finally, we correct the cell-count probability distribution function for the angular selection effect of the VIMOS instrument and study the redshift and absolute magnitude dependency of the underlying galaxy density function in VIPERS from redshift 0.5 to 1.1. We find a very weak evolution of the probability density distribution function and that it is well approximated, independently of the chosen tracers, by a Gamma distribution.


Astronomy & Astrophysics manuscript no. ms, © ESO 2018

January 10, 2018


J. Bel, E. Branchini, C. Di Porto, O. Cucciati, B. R. Granett, A. Iovino, S. de la Torre, C. Marinoni, L. Guzzo, L. Moscardini, A. Cappi, U. Abbas, C. Adami, S. Arnouts, M. Bolzonella, D. Bottini, J. Coupon, I. Davidzon, G. De Lucia, A. Fritz, P. Franzetti, M. Fumana, B. Garilli, O. Ilbert, J. Krywult, V. Le Brun, O. Le Fèvre, D. Maccagni, K. Małek, F. Marulli, H. J. McCracken, L. Paioro, M. Polletta, A. Pollo, H. Schlagenhaufer, M. Scodeggio, L. A. M. Tasca, R. Tojeiro, D. Vergani, A. Zanichelli, A. Burden, A. Marchetti, Y. Mellier, R. C. Nichol, J. A. Peacock, W. J. Percival, S. Phleps, and M. Wolk

(Affiliations can be found after the references)

Received –; accepted –


Key words. Cosmology: cosmological parameters – Cosmology: large scale structure of the Universe – Galaxies: high-redshift – Galaxies: statistics

1. Introduction

Galaxy clustering offers a formidable playground for understanding how structures have grown during the evolution of the universe. A number of statistical tools have been developed and used over the past thirty years (see Bernardeau et al. 2002, for a review). In general, these statistical methods rely on the fact that the clustering of galaxies is due to the gravitational pull of the underlying matter distribution. Hence, the study of the spatial distribution of galaxies in the universe allows us to obtain information about the statistical properties of its matter content. As a result, it is of paramount importance to be able to measure, from a redshift survey, the statistical quantities describing the galaxy distribution.

Send offprint requests to: J. Bel, e-mail: [email protected]

⋆ Based on observations collected at the European Southern Observatory, Cerro Paranal, Chile, using the Very Large Telescope under programmes 182.A-0886 and partly 070.A-9007. Also based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the Canada-France-Hawaii Telescope (CFHT), which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l'Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. This work is based in part on data products produced at TERAPIX and the Canadian Astronomy Data Centre as part of the Canada-France-Hawaii Telescope Legacy Survey, a collaborative project of NRC and CNRS. The VIPERS web site is http://[…].

The development of multi-object spectrographs on 8-m class telescopes during the 1990s triggered a number of deep redshift surveys with measured distances beyond $z \sim$ […] (e.g. VVDS, Le Fevre et al. 2005; DEEP2, Newman et al. 2012; and zCOSMOS, Lilly et al. 2009). Even so, it was not until the wide extension of VVDS was produced (Garilli et al. 2008) that a survey existed with sufficient volume to attempt cosmologically meaningful computations at $z \sim$ […] fibre optic positioners to probe huge volumes at low sampling density, VIPERS exploits the features of VIMOS at the ESO VLT to yield a dense galaxy sampling over a moderately large field of view ($\sim 0.08$ deg$^2$). It reaches a volume at […], allowing the cosmological evolution to be tested with small statistical errors.

The VIPERS redshifts are being collected by tiling the selected sky areas with a uniform mosaic of VIMOS fields. The area covered is not contiguous, but presents regular gaps due to the specific footprint of the instrument field of view, in addition to intrinsic unobserved areas due to bright stars or defects in the original photometric catalogue. The VIMOS field of view has four rectangular regions of about 8 × […] effects, with different constraints along the dispersion and spatial directions of the spectra, as thoroughly discussed in de la Torre et al. (2013). Clearly, this combination of angular selection effects has to be properly taken into account when estimating any clustering statistics.

In this paper we measure the probability distribution function from the VIPERS Public Data Release 1 (PDR-1) redshift catalogue, which includes ∼64% of the final number of redshifts expected at completion (see Guzzo et al. 2014; Garilli et al. 2014, for a detailed description of the survey data set). The paper is organized as follows. In § […] $h = H_0 / 100$ km s$^{-1}$ Mpc$^{-1}$; all magnitudes in this paper are in the AB system (Oke & Gunn 1983) and we will not give an explicit AB suffix. In order to convert redshifts into comoving distances we assume that the matter density parameter is $\Omega_m = 0.27$ and that the universe is spatially flat with a ΛCDM cosmology without radiation.

2. Data

The VIMOS Public Extragalactic Redshift Survey (VIPERS) is a spectroscopic redshift survey being built using the VIMOS spectrograph at the ESO VLT. The survey target sample is selected from the Canada-France-Hawaii Telescope Legacy Survey Wide (CFHTLS-Wide) optical photometric catalogues (Mellier et al. 2009). The final VIPERS will cover ∼24 deg$^2$ on the sky, divided over two areas within the W1 and W4 CFHTLS fields. Galaxies are selected to a limit of $i_{AB} <$ […].5, further applying a simple and robust gri colour pre-selection, so as to effectively remove galaxies at $z < 0.5$. Coupled to an aggressive observing strategy (Scodeggio et al. 2009), this allows us to double the galaxy sampling rate in the redshift range of interest, with respect to a pure magnitude-limited sample (∼ […] × […] $h^{-3}$ Mpc$^3$, analogous to that of the 2dFGRS at $z \sim 0.1$. Such a combination of sampling and depth is quite unique among current redshift surveys at $z > 0.5$. The VIPERS spectra are collected with the VIMOS multi-object spectrograph (Le Fevre et al. 2003) at moderate resolution ($R =$ […] $(1 + z)$ km s$^{-1}$. The full VIPERS area is covered through a mosaic of 288 VIMOS pointings (192 in the W1 area, and 96 in the W4 area). A discussion of the survey data reduction and management infrastructure is presented in Garilli et al. (2012). An early subset of the spectra used here is analyzed and classified through a Principal Component Analysis (PCA) in Marchetti et al. (2012).

Table 1. List of the magnitude selected objects (in B-band) in the VIPERS PDR-1. Columns: $z_{min}$; $z_{max}$; luminosity cut, $M_B - 5\log(h) <$; $\bar\rho$ (Eq. 1), in $h^3$ Mpc$^{-3}$. […]

A quality flag is assigned to each measured redshift, based on the quality of the corresponding spectrum. Here and in all parallel VIPERS science analyses we use only galaxies with flags 2 to 9 inclusive, corresponding to a global redshift confidence level of 98%. The redshift confirmation rate and redshift accuracy have been estimated using repeated spectroscopic observations in the VIPERS fields. A more complete description of the survey construction, from the definition of the target sample to the actual spectra and redshift measurements, is given in the parallel survey description paper (Guzzo et al. 2014).

The data set used in this paper and the other papers of this early science release is the VIPERS Public Data Release 1 (PDR-1) catalogue, which was made publicly available in September 2013. This includes 55,359 objects, spread over a global area of 8.[…] × […] and 5.[…] × […] in W1 and W4, respectively. It corresponds to the data frozen in the VIPERS database at the end of the 2011/[…] […] $0.5 < z < 1.1$. The reason for this selection is related to minimizing the shot noise and maximizing the volume. This reduces the usable sample to 18135 and 16879 galaxies in W1 and W4, respectively (always with quality flags between 2 and 9). The corresponding effective volumes of the two samples are 6.57 and 6.[…] × […] $h^{-3}$ Mpc$^3$. At redshift $z =$ […] ∼370 and 230 $h^{-1}$ Mpc. We divide the W1 and W4 fields into three redshift bins and we build magnitude-limited subsamples in each of them. For convenience, we use the magnitude limits listed in Table (1) of di Porto et al. (2014), which we recall in Tab. (1).

The VIMOS footprint has an important impact on the observed probability of finding $N$ galaxies in a randomly placed spherical cell in the survey volume. Indeed, the effect of the masked area is directly visible in the first moment of the probability distribution, i.e. the expectation value of the number count $\bar N \equiv \sum_{N=0}^{\infty} N P_N$. On the one hand, we can predict the mean number of objects per cell from the knowledge of the number density in each considered redshift bin; on the other hand, we can estimate it by placing a regular grid of spherical cells of radius $R$ in the volume surveyed by VIPERS. In fact, given the solid angles of W1 and W4 and the corresponding numbers of galaxies $N_{W1}$ and $N_{W4}$ contained in a redshift bin extracted from each field, one can estimate the total number density as

$$\bar\rho = \frac{N_{W1} + N_{W4}}{(\Omega_{W1} + \Omega_{W4})\, V_k}, \qquad (1)$$

where $V_k$ is defined as the volume corresponding to a sector of a spherical shell with solid angle equal to unity. In the case of VIPERS PDR-1 the effective solid angles corresponding to W1 and W4 are, respectively, $\Omega_{W1} =$ […] and $\Omega_{W4} =$ […] (in square radians). One can therefore predict the corresponding expected number of objects in each cell by multiplying the averaged number density by the volume of a cell. It reads

$$\bar N_R = \frac{4}{3}\pi R^3 \bar\rho, \qquad (2)$$

in the case of the spherical cells of radius $R$ considered in this work. The expectation value $\bar N_R$ as a function of the cell radius, for each luminosity subsample extracted from VIPERS PDR-1, is represented by lines in Fig. (1). On the same figure we display the measured mean number of objects $\bar N$ in each redshift bin. Note that to perform this measurement we place a grid of equally separated (4 $h^{-1}$ Mpc) spheres of radius $R = 4, 6, 8\ h^{-1}$ Mpc and we reject spheres with more than 40% of their volume outside the observed region (see Bel et al. 2014). We quantify the effect of the mask using the quantity

$$\alpha \equiv \frac{\bar N}{\bar N_R}. \qquad (3)$$

Fig. 1. Upper: Expected mean number count in spheres (solid line, from Eq. 2) with respect to the observed one (symbols) for the various luminosity cuts and for the three redshift bins [0.5, 0.7] (left panel), [0.7, 0.9] (central panel), and [0.9, 1.1] (right panel). The selection in absolute magnitude $M_B$ in B-band corresponding to each symbol/line and colour is indicated in the inset. The dotted line displays the $\bar N =$ […] Lower: Displays the deviation $\alpha$ (see Eq. 3) between the expected mean number $\bar N_R$ and the observed one $\bar N$ with respect to the radius $R$ of the cells.

The bottom panels of Fig. (1) show that for all subsamples and at all redshifts the net effect of the masks is to undersample the galaxy field by roughly 72%. They also show that the correction factor $\alpha$ depends on the considered redshift, on the luminosity and on the cell size. The scale dependency can be explained by the fact that $\alpha$ depends on how the cells overlap with the masked regions. The left panel of Fig. (1) suggests that at low redshift the mask effect behaves in the same way for all the luminosity samples, while the middle panel shows a clear dependency on luminosity. The correction factor $\alpha$ depends on the redshift distribution; as a result, the apparent dependency on luminosity is due to the dependence of the number density on the luminosity of the considered objects.

The mask not only modifies the mean number of objects, it also modifies the higher-order moments of the distribution, such that the measured $P_N$ will be systematically altered. In the present paper we show that this systematic effect can be taken into account by measuring the underlying probability density function of the galaxy density contrast $\delta$. It has been shown (see Fig. (8) of Bel et al. 2014) that after rejecting spheres with more than 40% of their volume outside the survey, the local Poisson process approximation holds. In particular, this allows us to use the "wrong" probability distribution function to get reliable information on the underlying probability density function $p(\delta)$. Then, applying the Poisson sampling, one can recover the unaltered $P_N$ using $\bar N = \bar N^{(masked)}/\alpha$. For the sake of completeness we provide the reader with the measured probability function obtained after rejecting the cells with more than 40% of their volume outside the survey (see Fig. 8).

In particular, let $P_M$ and $P_N$ be, respectively, the observed and the true Counting Probability Distribution Function (CPDF). Assuming that from the knowledge of $P_M$ there exists a process to get the underlying probability density function of the stochastic field $\Lambda$, which is associated with the random variable $N$, one can compute the true CPDF by applying

$$P_N = \int_0^\infty P[N|\Lambda]\, p(\Lambda)\, \mathrm{d}\Lambda, \qquad (4)$$

where $P[N|\Lambda]$ is called the sampling conditional probability; it determines the sampling process from which the discrete cell count arises. In the following we assume that this sampling conditional probability follows a Poisson law (Layzer 1956); as a result, in Eq. (4) we substitute

$$P[N|\Lambda] = K[N, \Lambda] \equiv \frac{\Lambda^N}{N!} e^{-\Lambda}. \qquad (5)$$

It is also convenient to express Eq. (4) in terms of the density contrast of the stochastic field $\Lambda$, $\delta \equiv \Lambda/\bar\Lambda - 1$; it follows that

$$P_N = \int_{-1}^\infty K[N, \bar N(1+\delta)]\, p(\delta)\, \mathrm{d}\delta, \qquad (6)$$

where we used $\bar\Lambda = \bar N$, which is a property of the Poisson sampling.

Following this approach, we propose to compare three methods which aim at extracting the underlying probability density function (PDF) in order to correct the observed CPDF for the angular selection effects of VIPERS.
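As an illustration of the Poisson sampling integral of Eqs. (4)-(6), the following minimal Python sketch evaluates $P_N$ numerically for an assumed underlying density. The Gamma form of $p(\Lambda)$, the parameter values and all function names are illustrative assumptions, not the pipeline used in the paper; the closed-form negative binomial is included only as a cross-check of the quadrature.

```python
import math

def poisson_kernel(n, lam):
    """K[N, Lambda] = Lambda^N e^{-Lambda} / N!  (Eq. 5), evaluated in log space."""
    if lam <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def gamma_pdf(lam, k, theta):
    """Assumed p(Lambda): a Gamma density with shape k and scale theta."""
    return math.exp((k - 1) * math.log(lam) - lam / theta
                    - k * math.log(theta) - math.lgamma(k))

def counts_pdf(n, pdf, lam_max, steps=20000):
    """P_N = int_0^inf K[N, Lambda] p(Lambda) dLambda, midpoint rule on (0, lam_max)."""
    h = lam_max / steps
    total = 0.0
    for j in range(steps):
        lam = (j + 0.5) * h
        total += poisson_kernel(n, lam) * pdf(lam)
    return total * h

def negative_binomial(n, k, theta):
    """Closed form of a Poisson-sampled Gamma density, used only as a cross-check."""
    log_p = (math.lgamma(n + k) - math.lgamma(n + 1) - math.lgamma(k)
             + n * math.log(theta) - (n + k) * math.log(1.0 + theta))
    return math.exp(log_p)

k, theta = 2.0, 2.0   # illustrative values; mean count N_bar = k * theta = 4
p_n = [counts_pdf(n, lambda lam: gamma_pdf(lam, k, theta), lam_max=80.0)
       for n in range(6)]
```

The same `counts_pdf` routine could in principle be applied to any tabulated or parametric $p(\Lambda)$; only the `pdf` argument changes.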

3. Methods

In this section we review the PDF estimators that we use and compare with each other in this paper. The purpose is to select the method best suited to the VIPERS characteristics.

The first method, referred to below as R-L, is an iterative method which aims at inverting Eq. (6) without parametrizing the underlying PDF; it has been investigated by Szapudi & Pan (2004). This method starts with an initial guess $p_0$ for the probability density function $p$, which is used to compute the corresponding expected observed $P_N$ via

$$P_{N,0} = \int_{-1}^\infty \hat K[N, \bar N(1+\delta)]\, \hat p_0(\delta)\, \mathrm{d}\delta,$$

where $\hat K[N, \bar N(1+\delta)] \equiv K / \sum_N K$. The probability density function used at the next step is obtained using

$$\hat p_{i+1}(\delta) = \hat p_i(\delta) \sum_{N=0}^{N_{max}} \frac{P_N}{P_{N,i}}\, \hat K[N, \bar N(1+\delta)],$$

where $\hat p \equiv p \sum_N K$. At each step the agreement between the expected observed probability distribution $P_{N,i}$ and the true one $P_N$ is quantified by

$$\chi_i^2 \equiv \sum_{N=0}^{N_{max}} \left( \frac{P_N}{P_{N,i}} - 1 \right)^2.$$

It is therefore possible to follow the evolution of the cost function $\chi^2$ with respect to the step $i$.

Szapudi & Pan (2004) have shown that $\chi^2$ converges toward a constant value, which corresponds to the best evaluation of the probability density function $p$ given the observed probability distribution $P_N$, and that this convergence occurs after around 30 iterations. Our own convergence tests confirm that 30 iterations are enough. However, the evolution of $\chi^2$ is not always monotonic. In practice, we store the $\chi^2$ of each step and we look for the step at which $\chi^2$ is minimum, i.e. $p(\delta) = p_{i_{min}}(\delta)$. As an initial guess we set the continuous PDF equal to the discrete CPDF ($p_0(\delta) = P_N$).

The second method is parametric: the shape of the probability density depends on a given number of parameters. In this case the probability density function is assumed to be well described by a Skewed Log-Normal distribution (Colombi 1994). It is derived from the Log-Normal distribution (Coles & Jones 1991) but it is more flexible.
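The iterative R-L scheme described earlier in this section can be sketched as follows. This is a minimal discretised version under stated assumptions: the density is represented by probabilities on a hypothetical grid of $\Lambda$ values (so that `p_hat[j]` plays the role of $\hat p\,\mathrm{d}\delta$ in bin $j$), the sum over $N$ is truncated at `n_max`, and the function names are illustrative. It is not the authors' implementation.

```python
import math

def poisson(n, lam):
    """Poisson kernel K[N, Lambda] of Eq. (5), for lam > 0."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def make_kernel(lam_grid, n_max):
    """Normalised kernel K_hat[N][j] = K / sum_N K, with the sum truncated at n_max."""
    raw = [[poisson(n, lam) for lam in lam_grid] for n in range(n_max + 1)]
    norm = [sum(raw[n][j] for n in range(n_max + 1)) for j in range(len(lam_grid))]
    k_hat = [[raw[n][j] / norm[j] for j in range(len(lam_grid))]
             for n in range(n_max + 1)]
    return k_hat, norm

def predict(k_hat, p_hat):
    """Expected observed P_{N,i} for the current estimate p_hat."""
    return [sum(row[j] * p_hat[j] for j in range(len(p_hat))) for row in k_hat]

def rl_step(k_hat, p_hat, p_obs):
    """One update p_hat_{i+1} = p_hat_i * sum_N (P_N / P_{N,i}) K_hat."""
    p_pred = predict(k_hat, p_hat)
    ratio = [po / pp if pp > 0.0 else 0.0 for po, pp in zip(p_obs, p_pred)]
    return [p_hat[j] * sum(ratio[n] * k_hat[n][j] for n in range(len(ratio)))
            for j in range(len(p_hat))]

def chi2(k_hat, p_hat, p_obs):
    """Cost function chi^2_i = sum_N (P_N / P_{N,i} - 1)^2."""
    return sum((po / pp - 1.0) ** 2
               for po, pp in zip(p_obs, predict(k_hat, p_hat)))

def richardson_lucy(k_hat, p_obs, p_start, n_iter=30):
    """Iterate n_iter times and keep the step with the smallest chi^2."""
    best, best_chi2 = p_start, chi2(k_hat, p_start, p_obs)
    current = p_start
    for _ in range(n_iter):
        current = rl_step(k_hat, current, p_obs)
        c = chi2(k_hat, current, p_obs)
        if c < best_chi2:
            best, best_chi2 = current, c
    return best, best_chi2
```

To convert the returned bin probabilities back to a density, one would divide `best[j]` by the kernel normalisation `norm[j]` and by the bin width. Keeping the minimum-$\chi^2$ step rather than the last one mirrors the remark in the text that $\chi^2$ need not decrease monotonically.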
It is indeed built upon an Edgeworth expansion: if the stochastic field $\Phi \equiv \ln(1+\delta)$ follows a Normal distribution, then the density contrast $\delta$ follows a Log-Normal distribution. In the case of the Skewed Log-Normal (SLN) density function, the field $\Phi$ follows an Edgeworth expanded Normal distribution

$$P_\Phi(\Phi) \equiv \left[ 1 + \frac{\langle\nu^3\rangle_c}{3!} H_3(\nu) + \frac{\langle\nu^4\rangle_c}{4!} H_4(\nu) + \frac{10\,\langle\nu^3\rangle_c^2}{6!} H_6(\nu) \right] \frac{G(\nu)}{\sigma_\Phi}, \qquad (7)$$

where $\nu \equiv (\Phi - \mu_\Phi)/\sigma_\Phi$, $G$ is the centred reduced Normal distribution $G(\nu) \equiv e^{-\nu^2/2}/\sqrt{2\pi}$ and $\langle\nu^n\rangle_c$ denotes the cumulant expectation value of $\nu$. As a result, the SLN is parameterized by the four parameters $\mu_\Phi$, $\sigma_\Phi$, $\langle\nu^3\rangle_c$, and $\langle\nu^4\rangle_c$, which are related, respectively, to the mean, the dispersion, the skewness and the kurtosis of the stochastic variable $\Phi$. They can all be expressed in terms of the cumulants $\langle\Phi^n\rangle_c$ of order $n$ of the weakly non-Gaussian field $\Phi$. Szapudi & Pan (2004) use a best-fit approach and determine these parameters by minimizing the difference between the measured counting probability $P_N$ and the one obtained from

$$P_N^{th} = \int_{-1}^\infty K[N, \bar N(1+\delta)]\, P_\Phi[\ln(1+\delta), \mu_\Phi, \sigma_\Phi, \langle\Phi^3\rangle_c, \langle\Phi^4\rangle_c]\, \mathrm{d}\ln(1+\delta). \qquad (8)$$

However, this requires us to perform the integral (Eq. 8) in a four-dimensional parameter space, which is numerically expensive.

In the present paper we use an alternative implementation which is computationally more efficient. Instead of trying to maximize the likelihood of the model given the observations, we use the observations to predict the parameters of the SLN. To do so we use a property of the local Poisson sampling (Bel & Marinoni 2012): the factorial moments $\langle (N)_f^n \rangle$ of the discrete counts are equal to the moments of the underlying continuous distribution $\langle \Lambda^n \rangle$.
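The property quoted above, that factorial moments of the counts equal the moments of the continuous field under Poisson sampling, can be checked numerically. The sketch below is illustrative only; the function names are assumptions, and the CPDF is assumed to be given as a list `p_n` indexed by $N$.

```python
import math

def factorial_moment(p_n, order):
    """<(N)_f^n> = sum_N N (N-1) ... (N-n+1) P_N for a tabulated CPDF p_n[N]."""
    total = 0.0
    for n, prob in enumerate(p_n):
        falling = 1.0
        for m in range(order):
            falling *= (n - m)
        total += falling * prob
    return total

def reduced_moments(p_n, max_order=4):
    """Ratios <Lambda^n> / Lambda_bar^n for n = 1..max_order, from the counts alone."""
    mean = factorial_moment(p_n, 1)
    return [factorial_moment(p_n, n) / mean ** n for n in range(1, max_order + 1)]

def poisson_pmf(n, lam):
    """Counts for a uniform field Lambda = lam, used as a simple test case."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

# For a uniform field every ratio is 1: the shot noise drops out of the estimate.
uniform_case = reduced_moments([poisson_pmf(n, 3.0) for n in range(80)])
```

For a clustered field the ratios exceed unity; for instance, a Poisson-sampled Gamma density with shape $k$ gives a second-order ratio of $1 + 1/k$.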
Since the transformation between the density contrast $\delta$ and the Edgeworth expanded field $\Phi$ is local and deterministic, it is possible to find a relation between the moments $\langle\Lambda^n\rangle$ and the cumulants $\langle\Phi^n\rangle_c$. By definition, the moments of the positive continuous field $\Lambda$ are given by

$$\langle\Lambda^n\rangle \equiv \int_0^\infty \Lambda^n P(\Lambda)\, \mathrm{d}\Lambda;$$

for a local deterministic transformation the conservation of probability imposes $P(\Lambda)\mathrm{d}\Lambda = P_\Phi(\Phi)\mathrm{d}\Phi$, so that the moments of $\Lambda$ can be recast in terms of $\Phi$:

$$\langle\Lambda^n\rangle = \bar\Lambda^n \int_{-\infty}^{\infty} e^{n\Phi} P_\Phi(\Phi)\, \mathrm{d}\Phi.$$

On the right-hand side one can recognize the definition of the moment generating function $M_\Phi(t) \equiv \langle e^{t\Phi} \rangle$; we therefore obtain

$$M_\Phi(t = n) = \frac{\langle\Lambda^n\rangle}{\bar\Lambda^n} \equiv A_n. \qquad (9)$$


This equation allows us to link the moments of $\Lambda$ to the cumulants of $\Phi$ via the moment generating function $M_\Phi$. Moreover, since the probability density $P_\Phi$ is the product of a sum of Hermite polynomials with a Gaussian function, it is straightforward to compute the explicit expression of the moment generating function; we obtain

$$M_\Phi(t) = \left( 1 + \frac{\langle\Phi^3\rangle_c}{3!} t^3 + \frac{\langle\Phi^4\rangle_c}{4!} t^4 + \frac{10\,\langle\Phi^3\rangle_c^2}{6!} t^6 \right) e^{t\mu_\Phi + \frac{t^2}{2}\sigma_\Phi^2}. \qquad (10)$$

Eq. (10) and Eq. (9) together allow us to set up a system of four equations, for $n = 1, 2, 3, 4$,

$$Y^{n^2} X^n B_n = A_n, \qquad (11)$$

where $Y \equiv e^{\sigma_\Phi^2/2}$, $X \equiv e^{\mu_\Phi}$ and $B_n \equiv M_\Phi(t = n, \mu_\Phi = 0, \sigma_\Phi = 0)$ […] $\mu_\Phi$, $\sigma_\Phi$, $\langle\Phi^3\rangle_c$ and $\langle\Phi^4\rangle_c$ parameterized in terms of $X$, $Y$, $x \equiv \langle\Phi^3\rangle_c$ and $y \equiv \langle\Phi^4\rangle_c$. In appendix (A), we detail the procedure to solve this non-linear system of equations. We therefore get the values of the four parameters of the SLN by simply measuring the moments of the counting variable $N$ up to the fourth order.

The third method, the Gamma expansion ($\Gamma_e$), follows the same idea as described in § […] $\phi_G$ defined as

$$\phi_G(u) \equiv \frac{u^{k-1}}{\theta\, \Gamma(k)} e^{-u}, \qquad (12)$$

where $\Gamma$ is the Gamma function (for an integer $n$, $\Gamma(n+1) = n!$), and $\theta$ and $k$ are two parameters which are related to the first two moments of the PDF. If the galaxy probability density function is well described by a Gamma expansion at order $n$ then it can be formally written as

$$P(\Lambda) = \phi_G(u)\, f_n^{(k-1)}(u), \qquad (13)$$

where by definition $u \equiv \Lambda/\theta$, $k = \bar\Lambda^2/\sigma_\Lambda^2$ and $\theta \equiv \bar\Lambda/k = \sigma_\Lambda^2/\bar\Lambda$. The function $f_n^{(k-1)}$ represents the expansion aiming at tuning the moments of the Gamma distribution; note that the superscript $(k-1)$ does not denote the derivative of order $k-1$. Since this expansion is built upon the orthogonality properties of products of Laguerre polynomials with the Gamma distribution, the function $f_n^{(k-1)}$ is given by the sum

$$f_n^{(k-1)}(x) \equiv \sum_{i=0}^{n} c_i L_i^{(k-1)}(x), \qquad (14)$$

where $L_i^{(k-1)}$ are the generalized Laguerre polynomials of order $i$ and the coefficients $c_i$ of the Gamma expansion depend on the moments of the galaxy field $\Lambda$:

$$c_n \equiv \sum_{i=0}^{n} \binom{n}{i} \frac{\Gamma(k)}{\Gamma(k+i)} (-1)^i \frac{\langle\Lambda^i\rangle}{\theta^i}. \qquad (15)$$

The main interest of the Gamma expansion with respect to the SLN is that the coefficients of the expansion are directly related to the moments of the distribution we want to model, i.e. it is not necessary to solve a complicated non-linear system of equations nor to perform a likelihood estimation of the coefficients. Moreover, it can easily be extended to higher order to describe as well as possible the underlying probability density function of galaxies.

Another advantage of describing the galaxy field $\Lambda$ by a Gamma expansion is that the corresponding observed $P_N$ can be expressed analytically, which is not the case for the SLN, which must be integrated numerically. This statement is demonstrated in Appendix (B); it follows that the CPDF $P_N$ can be calculated from

$$P_N = \frac{(-\theta)^N}{N!} \sum_{i=0}^{n} c_i \frac{\Gamma(i+k)}{\Gamma(k)}\, h_i^{(N)}(\theta), \qquad (16)$$

where $h_i \equiv \frac{1}{i!} \frac{\theta^i}{(1+\theta)^{i+k}}$ and in this case we use the notation $h_i^{(N)} = \frac{\mathrm{d}^N h_i}{\mathrm{d}\theta^N}$. The successive derivatives of $h_i$ can be obtained from the recursive relation

$$h_i^{(N)}(\theta) = \frac{(i)_f^N}{\theta^N} h_i(\theta) - \sum_{m=1}^{N} \binom{N}{m} \frac{(i+k)_f^m}{(1+\theta)^m}\, h_i^{(N-m)}(\theta).$$

Computing the observed $P_N$ without an infinite integral for each number $N$ is computationally more efficient; it is also practical to have the analytical calculation for some particular values of the $k$ parameter of the distribution. In fact, when $k$ is lower than 1, which occurs on small scales (4 $h^{-1}$ Mpc), the probability density function goes to infinity when $\Lambda$ goes to 0 (although the distribution is still well defined). In particular, this numerical divergence would induce large numerical uncertainties in the computation of the void probability $P_0$. In addition, one can see that for the void probability we have the simple relation

$$P_0 = \sum_{i=0}^{n} c_i \frac{\Gamma(k+i)}{\Gamma(k)}\, h_i(\theta), \qquad (17)$$

which can be used to recover the true void probability in VIPERS.
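To make the Gamma expansion concrete, the sketch below evaluates the coefficients of Eq. (15) and the void probability of Eq. (17) from a list of moments. It is a hedged illustration rather than the authors' code: the function names are hypothetical, and $h_i(\theta)$ is taken here as $\theta^i / [i!\,(1+\theta)^{i+k}]$, which is the form implied by the Laguerre integral and should be treated as an assumption of this sketch. As a sanity check, for a field that is exactly Gamma distributed all coefficients beyond $c_0$ vanish and $P_0$ reduces to $(1+\theta)^{-k}$.

```python
import math

def gamma_parameters(mean, variance):
    """k = mean^2 / variance and theta = variance / mean, as defined after Eq. (13)."""
    return mean ** 2 / variance, variance / mean

def expansion_coefficients(moments, k, theta):
    """c_n of Eq. (15); moments[i] must hold <Lambda^i>, with moments[0] = 1."""
    coeffs = []
    for n in range(len(moments)):
        c_n = 0.0
        for i in range(n + 1):
            gamma_ratio = math.exp(math.lgamma(k) - math.lgamma(k + i))
            c_n += math.comb(n, i) * gamma_ratio * (-1) ** i * moments[i] / theta ** i
        coeffs.append(c_n)
    return coeffs

def void_probability(coeffs, k, theta):
    """P_0 of Eq. (17), assuming h_i = theta^i / (i! (1 + theta)^(i + k))."""
    p0 = 0.0
    for i, c_i in enumerate(coeffs):
        h_i = theta ** i / (math.factorial(i) * (1.0 + theta) ** (i + k))
        p0 += c_i * math.exp(math.lgamma(k + i) - math.lgamma(k)) * h_i
    return p0

def gamma_moments(k, theta, max_order):
    """Moments <Lambda^i> = theta^i Gamma(k + i) / Gamma(k) of a pure Gamma field."""
    return [theta ** i * math.exp(math.lgamma(k + i) - math.lgamma(k))
            for i in range(max_order + 1)]

k, theta = gamma_parameters(mean=4.0, variance=8.0)          # k = 2, theta = 2
coeffs = expansion_coefficients(gamma_moments(k, theta, 4), k, theta)
p_void = void_probability(coeffs, k, theta)
```

In an application to data, `moments` would instead hold the factorial moments measured from the counts, which under Poisson sampling estimate $\langle\Lambda^i\rangle$.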

4. Application of the methods on a synthetic galaxy distribution

In this section we analyse a suite of synthetic galaxy distributions generated from 20 realizations of a Gaussian stochastic field. The full process involved in generating these benchmark catalogues is detailed in Appendix C. Each comoving volume has a cubical geometry of size 500 $h^{-1}$ Mpc. We generate the galaxies by discretizing the density field according to the sampling conditional probability $P[N|\Lambda]$, which we assume to be a Poisson distribution with mean $\Lambda$. In this way we know the true underlying galaxy density contrast $\delta$. We can therefore perform a fair comparison between the methods introduced in § […] effect of the grid (0.[…] $h^{-1}$ Mpc) we smooth both the density field and the discrete field using a spherical Top-Hat filter of radius $R =$ […] $h^{-1}$ Mpc. We apply the three methods mentioned in § […] $\delta$.

The discrete distribution of points contains an average number of objects per cell $\bar N =$ […] $P_N$ is given by the black histogram in the lower panel of Fig. (2); from this measurement we apply the three methods R-L, SLN and $\Gamma_e$ and obtain an estimation of the probability density function corresponding to each method. In the upper panel of Fig. (2) we compare the performance of the three methods in recovering the true probability density function (black histogram, referred to as reference in the inset). Note that, for this test case, we use a Gamma expansion at order 4 in order to be consistent with the order of the expansion of the Skewed Log-Normal. We have also represented the probability density function estimated when neglecting the shot noise (red dotted line), which is used as the initial guess in the case of the R-L method.

From the top panel of Fig. (2) we can conclude that the three methods perform reasonably well. It seems that the $\Gamma_e$ method better reproduces the density distribution of under-dense regions ($\delta \sim -1$), but this is expected in the sense that the distribution used to generate the synthetic catalogues is a Gamma distribution (see Appendix C). This is, however, not obvious, because the scale on which the density field has been set up is one order of magnitude smaller than the scale of the reconstruction, $R =$ […] $h^{-1}$ Mpc.

The performance of the three methods is also represented in the bottom panel of Fig. (2), in which we compare the expected observed $P_N$ from each method to the true one. One can see that they all agree at the 15% level, hence it is not possible to conclude that one is better than another. This was actually expected from the comparison on the underlying density field (Fig. 2). Conversely, if one of the methods did not agree with the PDF, then we would also expect a disagreement on the observed CPDF (see § […] $\bar N \leq$ […] $\bar N \simeq 8$) and found that the R-L method appears to be highly sensitive to the shot noise. In fact, if the mean number of objects per cell is too low, then the output of the method depends too much on the initial guess. It follows that, if the initial guess is too far from the true PDF, the process does not converge (see top panel of Fig. 3) and the corresponding $P_N$ does not match the observed one (see bottom panel of Fig. 3). Note that we explicitly checked this effect by increasing the number of iterations from 30 to 200. For both the SLN and the Gamma expansion, on the other hand, one can see in Fig. (3) that the output probability density function is in agreement (with a larger scatter) with the one obtained in the $\bar N \simeq$ […]

Fig. 2. Upper: The black histogram with error bars shows the true underlying probability density function (referred to as reference in the inset) compared to the reconstruction obtained with the R-L (red dashed line), the SLN (green dot-dashed line), and the $\Gamma_e$ (blue long dashed line) methods. The red dotted histogram shows the PDF used as the initial guess for the R-L method and the coloured dotted lines around each method line represent the dispersion of the reconstruction among the 20 fake galaxy catalogues. We also display the relative difference of the result obtained from each method with respect to the true PDF. Lower: The black histogram with error bars shows the observed probability density function (referred to as reference in the inset) compared to the reconstruction obtained with the R-L (red dashed line), the SLN (green dot-dashed line), and the $\Gamma_e$ (blue long dashed line) methods. We also display the relative difference of the result obtained from each method with respect to the observed $P_N$.

Given the sensitivity of the R-L method to the initial guess, knowing that the average number of galaxies per cell can be lower than unity, and finally taking into account computational time, we shall continue our analysis using only the two parametric methods, SLN and $\Gamma_e$. In the following, we compare them using more realistic mock catalogues, for which, however, we do not know a priori the true underlying PDF.
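The construction of such a test sample, Poisson sampling of a known density followed by a random dilution to 10% of the objects, can be mimicked in a few lines. The sketch below is a toy version under loud assumptions: cells are treated as independent, $\Lambda$ is drawn directly from a Gamma density with arbitrary illustrative parameters instead of from a smoothed density field, and all names are hypothetical. It illustrates why moment-based estimators survive dilution: thinning a Poisson sample rescales $\bar N$ but leaves ratios such as $\langle\Lambda^2\rangle/\bar\Lambda^2$ unchanged.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson variate with mean lam (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def make_counts(n_cells, k, theta, rng):
    """Counts in cells: Lambda drawn from a Gamma density, then Poisson sampled."""
    return [poisson_draw(rng.gammavariate(k, theta), rng) for _ in range(n_cells)]

def dilute(counts, fraction, rng):
    """Keep each object independently with probability `fraction`."""
    return [sum(1 for _ in range(n) if rng.random() < fraction) for n in counts]

def reduced_second_moment(counts):
    """<N (N - 1)> / <N>^2, an estimate of <Lambda^2> / Lambda_bar^2."""
    mean = sum(counts) / len(counts)
    second = sum(n * (n - 1) for n in counts) / len(counts)
    return second / mean ** 2

rng = random.Random(12345)
full = make_counts(50000, k=2.0, theta=4.0, rng=rng)   # mean count of about 8
sparse = dilute(full, 0.1, rng)                        # mean count of about 0.8
```

With these parameters both `reduced_second_moment(full)` and `reduced_second_moment(sparse)` should scatter around $1 + 1/k = 1.5$, the diluted sample simply being noisier.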


Fig. 3. Same as in Fig. (2), but using only 10% of the galaxies contained in the fake galaxy catalogues; as a result, the average number of galaxies per cell drops from $\bar N =$ […] to $\bar N =$ […].

5. Performances in realistic conditions

In this section we discuss how observational effects have been accounted for in our analysis and test the robustness of the reconstruction methods SLN and Gamma expansion. For this purpose we use a suite of mock catalogues created from the Millennium simulation; they are also used in the analysis performed by di Porto et al. (2014).

We shall compare the reconstruction methods between two catalogues, namely REFERENCE and MOCK. The REFERENCE is a galaxy catalogue obtained from semi-analytical models. We simulate the redshift errors of VIPERS PDR-1 by perturbing the redshift (including distortions due to peculiar motions) with a Normally distributed error with rms […]$(1 + z)$. Each MOCK catalogue is built from the corresponding REFERENCE catalogue by applying the same observational strategy (de la Torre et al. 2013) as applied to VIPERS PDR-1: spectroscopic targets are selected from the REFERENCE catalogue by applying the slit-positioning algorithm (SPOC, Bottini et al. 2005) with the same setting as for the PDR-1. This allows us to reproduce the VIPERS footprint on the sky, the small-scale angular incompleteness due to spectra collisions and the variation of target sampling rate across the fields. Finally, we deplete each quadrant to reproduce the effect of the survey success rate (SSR, see de la Torre et al. 2013). In this way, we end up with 50 realistic mock catalogues (named MOCK hereafter), which simulate the detailed survey completeness function and observational biases of VIPERS in the W1 and W4 fields.

Table 2. List of the magnitude selected objects (in B-band) in the mock catalogues. Columns: $z_{min}$; $z_{max}$; luminosity cut, $M_B - 5\log(h) <$. […]

Fig. 4. Comparison between the SLN and $\Gamma_e$ methods at $0.9 < z < 1.1$. Each panel corresponds to a cell radius $R$ of 4, 6 and 8 $h^{-1}$ Mpc from the left to the right. Top: The red histogram shows the observed PDF in the MOCK catalogues while the black histogram displays the PDF extracted from the REFERENCE catalogues. The blue diamonds with lines and the magenta triangles show, respectively, the $\Gamma_e$ expansion performed in the REFERENCE and MOCK catalogues. The cyan diamonds with lines and the orange triangles show, respectively, the SLN expansion performed in the REFERENCE and MOCK catalogues. Bottom: Relative deviation of the $\Gamma_e$ and SLN expansions, applied to both the REFERENCE and MOCK catalogues, with respect to the PDF of the REFERENCE catalogues.

In order to perform an analysis similar to the one we aim at doing for VIPERS PDR-1, we construct subsamples of galaxies selected according to their absolute magnitude $M_B$ in B-band; we take all objects brighter than a given luminosity. We list those samples in Tab. (2); we have in total 6 galaxy samples. The highest luminosity cut ($M_B - 5\log(h) <$ […] $- z$) allows us to follow a single population of galaxies at three cosmic epochs.

In Figs. 4, 5 and 6 we show the reconstruction performances for the SLN and the $\Gamma_e$ methods. We consider the same population ($M_B - 5\log(h) + z < -$[…].72) but in three redshift bins, $0.9 < z < 1.1$, $0.7 < z < 0.9$ and $0.5 < z < 0.7$. In order to test the stability of the methods we perform the reconstruction at three smoothing scales, $R = 4$, 6 and 8 $h^{-1}$ Mpc. The comparison is done as follows. On the one hand, we estimate the true $P_N$ from the REFERENCE catalogue (before applying the observational selection) and we perform the reconstruction on it; in this way we can test the intrinsic biases due to the assumed parametric method (SLN or $\Gamma_e$). On the other hand, we estimate the observed $P_M$ in the MOCK catalogues, from which we perform the reconstruction to verify whether we recover the expected $P_N$ from the REFERENCE catalogue.

Fig. 5. Comparison between the SLN and $\Gamma_e$ methods at $0.7 < z < 0.9$. Each panel corresponds to a cell radius $R$ of 4, 6 and 8 $h^{-1}$ Mpc from the left to the right.

Fig. 6. Comparison between the SLN and $\Gamma_e$ methods at $0.5 < z < 0.7$. Each panel corresponds to a cell radius $R$ of 4, 6 and 8 $h^{-1}$ Mpc from the left to the right.

Fig. 7. Comparison between the SLN and $\Gamma_e$ methods. Each column corresponds to a cell radius $R$ of 4, 6 and 8 $h^{-1}$ Mpc from the left to the right, and each row corresponds to a combination of redshift and magnitude cut.

Inspecting Fig. (4) we can first see that the intrinsic error due to the specific modelling of the methods is much larger for the SLN (cyan diamonds compared to the black histogram) than for the $\Gamma_e$ (blue diamonds compared to the black histogram). From the top panels we see that the SLN does not reproduce the tail of the CPDF, and from the bottom panels we see that even for low counts it shows deviations as large as 20%. This intrinsic limitation propagates when performing the reconstruction on the MOCK catalogue (orange triangles compared to the black histogram), while for the $\Gamma_e$ we see that the agreement is better than 10% (magenta triangles compared to the black histogram) in the low-count regime and the tail is fairly well reproduced. Secondly, comparing the $\Gamma_e$ performed on the REFERENCE and the MOCK catalogues (blue diamonds with respect to magenta triangles), one can see that the loss of information due to the observational strategy has at most an impact of 10% on the reconstructed CPDF, which decreases when considering larger cells (less shot noise).

In general, examination of Figs. (5) and (6) confirms that for the considered galaxy population the same results hold at lower redshifts. However, the reconstruction at $R =$ […] $h^{-1}$ Mpc can in particular exhibit deviations larger than 20%; this is at odds with the fact that the shot noise contribution is expected to be the same for the three redshift bins (magnitude limited).
We attributethis larger instability to the fact that not only the shot noise con-tribution is higher for R = h − Mpc but also the volume probedis smaller when decreasing the redshift.The performances of the reconstruction for the last threegalaxy samples are shown in Fig.(7) where each row correspondsto a galaxy sample (we only show the residual with respect to theREFERENCE). This last comparison allows to say that the re-construction instability at 4 h − Mpc was indeed due to the highlevel of shot noise. We can conclude that in the HOD galaxymock catalogues, the galaxy distribution is more likely to bemodelled by a Γ e instead of an SLN. Finally, for a chosen re-construction method, the information contained in the MOCKcatalogues is enough to be able to reconstruct the CPDF of theREFERENCE catalogue to better than 10%.

6. VIPERS PDR-1 data

In this section we apply the reconstruction method to the VIPERS PDR-1. We saw in the previous sections that the SLN and $\Gamma_e$ methods are sensitive to the assumption we make about the underlying PDF. In fact, we saw that the galaxy distribution in the mock catalogues is better described by a $\Gamma_e$ than by an SLN distribution. However, in the following we will not take for granted that the same property holds for the galaxies in the PDR-1.

We want to choose which one of the two distributions (Log-Normal or Gamma) best describes the observed galaxy distribution in VIPERS PDR-1 when no expansion is applied. Thus, we compare the observed PDF to the one expected from the Poisson sampling of the Log-Normal probability density function (PS-LN) and to the one expected from the Poisson sampling of the Gamma distribution (the so-called Negative Binomial). Error bars are obtained by performing a Jack-knife resampling of 3 × […]. The Negative Binomial distribution reads

$$P_N = \frac{\theta^N}{N!}\,\frac{r(r+1)\cdots(r+N-1)}{(1+\theta)^{N+r}}, \qquad (18)$$

where $\theta = \bar N / r$ and $r = \bar N^2 / (\sigma_N^2 - \bar N)$, to ensure that the first two moments of the Negative Binomial match those of the observed distribution. We show in Fig. (8) the outcome of this comparison: the Negative Binomial is much closer to the observed PDF than the PS-LN. As a result, the underlying galaxy distribution is more likely to be described by a Gamma distribution than by a Log-Normal. Hence, we only use the Gamma expansion to model the galaxy distribution of VIPERS PDR-1.
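The moment matching behind Eq. (18) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `negative_binomial_pdf` and the input mean and variance are hypothetical and are not VIPERS measurements.

```python
import math

def negative_binomial_pdf(mean, variance, n_max):
    """Return [P_0, ..., P_n_max] of Eq. (18), matched to a given mean and variance."""
    if variance <= mean:
        raise ValueError("the variance must exceed the mean (super-Poissonian counts)")
    r = mean**2 / (variance - mean)   # r = Nbar^2 / (sigma_N^2 - Nbar)
    theta = mean / r                  # theta = Nbar / r
    probs = []
    for n in range(n_max + 1):
        # log of theta^N / N! * r (r+1) ... (r+N-1) / (1+theta)^(N+r),
        # using r (r+1) ... (r+N-1) = Gamma(N+r) / Gamma(r)
        log_p = (n * math.log(theta) - math.lgamma(n + 1)
                 + math.lgamma(n + r) - math.lgamma(r)
                 - (n + r) * math.log1p(theta))
        probs.append(math.exp(log_p))
    return probs

# Illustrative (hypothetical) moments of the counts in cells.
probs = negative_binomial_pdf(mean=1.2, variance=3.0, n_max=200)
mean_check = sum(n * p for n, p in enumerate(probs))
var_check = sum(n * n * p for n, p in enumerate(probs)) - mean_check**2
```

By construction, `mean_check` and `var_check` should return the input mean and variance, which is the sense in which the model matches the first two moments of the observed distribution.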


Fig. 8. Observed count-in-cell probability distribution function $P_N$ (histograms) from VIPERS PDR-1 for various luminosity cuts (indicated in the inset). Each row corresponds to a redshift bin, from bottom to top $0.5 < z < 0.7$, $0.7 < z < 0.9$ and $0.9 < z < 1.1$. Each column corresponds to a cell radius $R = 4, 6, 8\,h^{-1}$ Mpc from left to right. We also show the expected PDF from two models that match the first two moments of the observed distribution: the red solid line shows the prediction for a Poisson-sampled Log-Normal (PS-LN) CPDF, while the green dashed line displays the Negative Binomial model for the CPDF.

Moreover, the use of the Gamma expansion instead of the SLN substantially simplifies the analysis. In Fig. (9) we provide the reconstructed probability distribution function of VIPERS PDR-1 together with the corresponding underlying probability density function for each redshift bin and luminosity cut. Each panel of Fig. (9) shows how the choice of a particular class of tracers (selected according to their absolute magnitude in B-band) influences the PDF of galaxies. When measuring specific properties of the intrinsic galaxy distribution for each luminosity cut, it is enough to look at the CPDF. However, when comparing the distributions with each other it is necessary to take into account the average number of objects per cell, which varies from sample to sample. It is therefore more useful to compare the properties of the different galaxy samples using their underlying probability density function which, assuming Poisson sampling, is free from sampling rate variations between different types of tracers.

For the first two redshift bins, we can see that the probability density function broadens when selecting more luminous galaxies; this goes in the direction of increasing the linear bias with respect to the matter distribution. However, although the trend is less significant, for the highest redshift bin it seems to go in the opposite direction. This trend might be an artifact; indeed, by analyzing Fig. (1) we see that for all these samples the average number of objects per cell is between 0.[…] and 0.[…]; they are therefore affected by shot noise effects. As a result, specific care should be taken when interpreting those three high redshift samples.

In the following we focus on the evolution of the underlying PDF for a particular class of objects over the wide redshift range probed by VIPERS PDR-1. Fig. (10) displays the outcome of this study: it shows how the PDF of three populations (the three highest magnitude cuts) evolves with the redshift at which it is measured. The three populations (top, middle and bottom panels) exhibit non-monotonic evolution with respect to redshift. In particular, for the most luminous population the PDF at $0.9 < z < 1.1$ is different from that in the two lower redshift bins. However, we also see that some instabilities appear in the reconstruction (see the wiggles at high $1 + \delta$). This might be due to the fact that we have fewer galaxies in this sample, giving rise to a large shot-noise contribution ($\bar N < \ldots$) […] $\sigma$, while for the most luminous population, truncating the expansion at order 4 only removes the instability without significantly changing the overall behavior of the PDF. This consistency test shows that the radical change in the measured PDF for the highest redshift bin appears to be a true feature. Probably only the final VIPERS data set will be able to give a robust conclusion.

Finally, in Tab. (3), we list the relevant coefficients of the Gamma expansion which we measured from the VIPERS PDR-1 at the scale $R = [\ldots]\,h^{-1}$ Mpc. They can be used to model both the CPDF (Eq. 16) and the PDF (Eq. 13).
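To make the link between the underlying PDF and the CPDF under Poisson sampling concrete, the following Python sketch integrates a Poisson kernel against a pure Gamma density (no higher-order expansion terms) and compares the result with the closed-form Negative Binomial. The shape `k`, scale `theta`, integration bounds and function names are illustrative assumptions, not values from the paper.

```python
import math

def gamma_pdf(lam, k, theta):
    """Gamma density with shape k and scale theta, evaluated at lam > 0."""
    return math.exp((k - 1) * math.log(lam) - lam / theta
                    - math.lgamma(k) - k * math.log(theta))

def poisson_sampled(n, k, theta, lam_max=100.0, steps=20000):
    """Trapezoidal estimate of P_N = int Poisson(N | lam) Gamma(lam; k, theta) dlam.

    The end points are skipped: the integrand vanishes at lam = 0 when k > 1
    and is negligible at lam_max for the parameters used here.
    """
    h = lam_max / steps
    total = 0.0
    for i in range(1, steps):
        lam = i * h
        log_poisson = n * math.log(lam) - lam - math.lgamma(n + 1)
        total += math.exp(log_poisson) * gamma_pdf(lam, k, theta)
    return total * h

def negative_binomial(n, k, theta):
    """Closed form: Gamma(N+k) / (N! Gamma(k)) * theta^N / (1+theta)^(N+k)."""
    return math.exp(math.lgamma(n + k) - math.lgamma(n + 1) - math.lgamma(k)
                    + n * math.log(theta) - (n + k) * math.log1p(theta))

K, THETA = 2.0, 1.5   # hypothetical parameters, mean count = K * THETA = 3
numerical = [poisson_sampled(n, K, THETA) for n in range(6)]
analytic = [negative_binomial(n, K, THETA) for n in range(6)]
```

The two lists should agree to within the accuracy of the quadrature, which illustrates why the Negative Binomial is the natural count model for a Gamma-distributed density field.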

7. Summary

The main goal of the present paper is to measure the probability of finding $N$ galaxies falling into a spherical cell randomly placed inside a sparsely sampled (i.e. with masked areas or with a low sampling rate) spectroscopic survey. Our general approach to this problem has been to use the underlying probability density distribution of the density contrast of galaxies in order to recover the counting probability corrected for sparseness effects. We therefore compared three methods (R-L, SLN and $\Gamma_e$) of measuring the probability density of galaxies, which fall into two categories: direct and parametric. We found that when the sampling is high ($\bar N \simeq 10$) the direct method (Richardson-Lucy deconvolution) performs well and avoids putting any prior on the shape of the distribution. On the other hand, we saw that when the sampling is low ($\bar N \simeq 1$) the direct method fails to converge to the true underlying distribution. We thus concluded that, in such cases, the only alternative is to use a parametric method.

We presented two parametric forms aimed at describing the galaxy density distribution: the SLN, which is often used in the literature to model the matter distribution, and the $\Gamma_e$. Although the two distributions used in this paper have already been investigated in previous works, the approach we propose to estimate their parameters is completely new. Previously, fitting procedures were used to estimate them. Here we propose to measure the parameters of the distributions directly from the observations. The method can be applied to both distributions, SLN and $\Gamma_e$, and considerably decreases the computational time of the process.

Relying on simulated galaxy catalogues of VIPERS PDR-1, we tested the reconstruction scheme of the counting probability ($P_N$) under realistic conditions in the case of the SLN and $\Gamma_e$ expansions. We found that the reconstruction depends on the choice of the model for the galaxy distribution. However, we have also shown that it is possible to test which distribution better describes the observations.

Using VIPERS PDR-1, on the relevant scales investigated in this paper ($R = 4, 6, 8\,h^{-1}$ Mpc), we found that the $\Gamma$ distribution gives a better description of the observed $P_N$ than the one provided by the Log-Normal (see Fig. 8). We therefore adopted the $\Gamma_e$ parametric form in order to reconstruct the probability density functions of galaxies. From these reconstructions we studied how their PDF changes according to their absolute luminosity in B-band and we also studied their redshift evolution. We found little evolution in the first two redshift bins, while the density distribution of the galaxy field seems to evolve strongly in the last redshift bin.

Fig. 9. Top: Reconstructed PDF applying the $\Gamma_e$ method in three redshift bins (from left to right) at the intermediate smoothing scale $R = 6\,h^{-1}$ Mpc. Bottom: Underlying PDF corresponding to the CPDF in the top panel; for each luminosity cut the 1-sigma uncertainty is represented by the dotted lines.

Table 3. Coefficients of the $\Gamma_e$ expansion which describe the VIPERS PDR-1 data for $R = [\ldots]\,h^{-1}$ Mpc. Columns: $z$, $M_B - 5\log(h)$, $k$, $\theta$ and four expansion coefficients $c_i$. The table contains three redshift bins with five, four and three luminosity cuts, respectively.

Fig. 10. Evolution of three galaxy populations selected according to their luminosity (from bottom to top). In each panel, the black solid, red dashed and cyan dot-dashed lines represent, respectively, the three redshift bins $0.5 < z < 0.7$, $0.7 < z < 0.9$ and $0.9 < z < 1.1$.

Acknowledgements. JB acknowledges useful discussions with E. Gaztañaga. We acknowledge the crucial contribution of the ESO staff for the management of service observations. In particular, we are deeply grateful to M. Hilker for his constant help and support of this programme. Italian participation to VIPERS has been funded by INAF through PRIN 2008 and 2010 programmes. JB, LG and BJG acknowledge support of the European Research Council through the Darklight ERC Advanced Research Grant ([…] /B/ST9/ […] /D/ST9/ […] ERC grant agreement n. 202781. WJP and RT acknowledge financial support from the European Research Council under the European Community's Seventh Framework Programme (FP7/[…])/ERC grant agreement n. 202686. WJP is also grateful for support from the UK Science and Technology Facilities Council through the grant ST/I001204/1. EB, FM and LM acknowledge the support from grants ASI-INAF I/[…] Institut Universitaire de France and the LABEX OCEVU.

References

Bel, J. & Marinoni, C. 2012, MNRAS, 424, 971
Bel, J., et al. (the VIPERS Team) 2014, A&A, 563, A37
Bernardeau, F., Colombi, S., Gaztañaga, E. & Scoccimarro, R. 2002, PR, 367, 1
Bottini, D., Garilli, B., Maccagni, D., et al. 2005, PASP, 117, 996
Coles, P. & Jones, B. 1991, MNRAS, 248, 1
Colless, M., Dalton, G., Maddox, S., et al. 2001, MNRAS, 328, 1039
Colombi, S. 1994, ApJ, 435, 536
de la Torre, S., Guzzo, L., Kovac, K., et al. (the ZCOSMOS collaboration) 2010, MNRAS, 409, 867
de la Torre, S., Guzzo, L., Peacock, J. A., et al. (VIPERS team) 2013, A&A, 557, A54
di Porto, C., et al. (VIPERS team) 2014, submitted, arXiv:1406.6692
Eisenstein, D. J. & Hu, W. 1998, ApJ, 496, 605
Garilli, B., Le Fèvre, O., Guzzo, L., et al. (the VVDS collaboration) 2008, A&A, 486, 683
Garilli, B., Paioro, L., Scodeggio, M., et al. 2012, PASP, 124, 1232
Garilli, B., Guzzo, L., Scodeggio, M., et al. (the VIPERS team) 2014, A&A, 562, 23
Gaztañaga, E., Fosalba, P. & Elizalde, E. 2000, ApJ, 539, 522
Greiner, M. & Enßlin, T. A. 2015, A&A, 574, 86
Guzzo, L., Pierleoni, M., Meneux, B., et al. (the VVDS team) 2008, Nature, 451, 541
Guzzo, L., Scodeggio, M., Garilli, B., et al. (the VIPERS team) 2014, A&A, 566, 108
Layzer, D. 1956, AJ, 61, 383
Le Fèvre, O., Saisse, M., Mancini, D., et al. 2003, Proc. SPIE, 4841, 1670
Le Fèvre, O., Vettolani, G., Garilli, B., et al. 2005, A&A, 439, 845
Lilly, S. J., Le Brun, V., Maier, C., et al. (the ZCOSMOS collaboration) 2009, ApJS, 184, 218
Marchetti, A., Granett, B. R., Guzzo, L., et al. (the VIPERS team) 2012, MNRAS, in press, arXiv:1207.4374
Mellier, Y., Bertin, E., Hudelot, P., et al. 2008, The CFHTLS T0005 Release, http://terapix.iap.fr/cplt/oldSite/Descart/CFHTLS-T0005-Release.pdf
Mustapha, H. & Dimitrakopoulos, R. 2010, C&M, 60, 2178
Newman, J. A., Cooper, M. C., Davis, M., et al. (the DEEP2 collaboration) 2012, arXiv:1203.3192
Oke, J. B. & Gunn, J. E. 1983, ApJ, 266, 713
Scodeggio, M., Franzetti, P., Garilli, B., et al. 2009, Msngr, 135, 13
Szapudi, I. & Pan, J. 2004, ApJ, 602, 26

Appendix A: Non-linear system

The problem with this system of equations is that it is non-linear and therefore difficult to solve; however, it can be reduced to a one-dimensional equation which can be solved numerically.

The first two equations ($n = 1$ and $n = 2$) can be used to express the first two cumulants with respect to the third and fourth order ones

σ Φ = ln( A ) + ln( B B ) (A.1)

µ Φ = − ( ln( A ) + ln( B B ) ) (A.2)

where B and B are both functions of x and y. Then, using other combinations of equations, one can express a system of two equations for x and y alone

B = a B B (A.3)

B B = a B , (A.4)

where a ≡ A A and a ≡ A A . In order to solve the system properly we prefer to express it in terms of one parameter η ≡ B / B. Moreover, one can see that the polynomials B to B are not independent; as a result B = d + aB + bB + cB , where a = , b = − , c = , d = − , which can be substituted in Eq. (A.3). Combining Eq. (A.3) and Eq. (A.4) one obtains a parametric equation for B

( a + b η ) B + ( d + c f ( η )) B − g ( η ) = 0 , (A.5)

which can be solved for each value of the parameter η, and an independent parametric equation for B, B = f ( η ). As a result we can find a couple B, B for each value of the parameter η. It follows that one can express x and y with respect to η and, given the definition of η, the possible solutions x and y must satisfy the condition B [ x ( η ), y ( η )] − η B [ x ( η ), y ( η )] = 0, which gives the possible values of η from which one can recover x and y. Finally, from Eq. (A.1) and Eq. (A.2) we can compute the values of $\sigma_\Phi$ and $\mu_\Phi$ corresponding to each couple ($x$, $y$) of solutions. This allows us to select the solution which provides a value of A closer to the observed one.

Once the values of the cumulants $\mu_\Phi$, $\sigma_\Phi$, $\langle \Phi^3 \rangle_c$ and $\langle \Phi^4 \rangle_c$ are known from the process detailed above, we know that the moments of the corresponding $P_N^{\rm th}$ will match those of the observed one up to order 4. At the end, one can check whether the SLN distribution provides a good match to the data by numerically integrating the probability density function convolved with the Poisson kernel $K$ (see Eq. 5).

Appendix B: Generating function

We show that the CPDF associated with a Gamma expanded PDF can be calculated analytically from an expression which depends explicitly on the coefficients $c_i$ of the Gamma expansion.

Let $G_N$ be the generating function associated with the probability distribution $P_N$; it is defined as

$$G_N(\lambda) \equiv \sum_{N=0}^{\infty} \lambda^N P_N. \qquad (B.1)$$

In the case of the Poisson sampling of a Gamma distribution, after some algebra, one can show that it can be expressed with respect to the coefficients of the Gamma expansion as

$$G_N(\lambda) = \frac{1}{\Gamma(k)} \sum_{i=0}^{n} c_i F_i(\gamma), \qquad (B.2)$$

where $\gamma \equiv (1-\lambda)\theta$ and

$$F_i(\gamma) \equiv \int_0^{\infty} x^{k-1} e^{-x} L_i^{(k-1)}(x)\, e^{-\gamma x}\, dx.$$

This integral can be computed using the Laguerre expansion of the exponential,

$$e^{-\gamma x} = \sum_{i=0}^{\infty} \frac{\gamma^i}{(1+\gamma)^{i+\alpha+1}} L_i^{(\alpha)}(x),$$

which leads to

$$F_i(\gamma) = \frac{\gamma^i}{(1+\gamma)^{i+k}} \frac{\Gamma(i+k)}{i!}. \qquad (B.3)$$

The formal expression of the generating function is therefore given by

$$G_N(\lambda) = \frac{(1+\gamma)^{-k}}{\Gamma(k)} \sum_{i=0}^{n} c_i \frac{\Gamma(i+k)}{i!} \left( \frac{\gamma}{1+\gamma} \right)^{i}, \qquad (B.4)$$

where we still use $\gamma = (1-\lambda)\theta$. From the explicit expression of the generating function (Eq. B.4) one can get the probability distribution $P_N$ by iteratively differentiating the generating function with respect to $\gamma$,

$$P_N \equiv \frac{1}{N!} \left. \frac{d^N G_N(\lambda)}{d\lambda^N} \right|_{\lambda=0} = \frac{(-\theta)^N}{N!} \left. \frac{d^N G_N(\gamma)}{d\gamma^N} \right|_{\gamma=\theta}.$$

These derivatives can be calculated explicitly.
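One possible way to carry out these derivatives in practice is to expand Eq. (B.4) as a power series in $\lambda$ and read off the coefficient of $\lambda^N$. The Python sketch below does this term by term, writing $\gamma^i = \theta^i (1-\lambda)^i$ and $(1+\gamma)^{-(k+i)} = (1+\theta)^{-(k+i)} (1 - q\lambda)^{-(k+i)}$ with $q = \theta/(1+\theta)$. It is an illustration under stated assumptions, not the implementation used for the paper: the function name and the coefficient values are hypothetical.

```python
import math

def gamma_expansion_counts(c, k, theta, n_max):
    """Return [P_0, ..., P_n_max] as the Taylor coefficients in lambda of Eq. (B.4).

    c     : coefficients c_0, c_1, ... of the Gamma expansion (c_0 = 1)
    k     : shape parameter of the Gamma distribution
    theta : scale parameter of the Gamma distribution
    """
    q = theta / (1.0 + theta)
    probs = []
    for n in range(n_max + 1):
        p = 0.0
        for i, c_i in enumerate(c):
            if c_i == 0.0:
                continue
            # Gamma(i+k) / (i! Gamma(k)) * theta^i * (1+theta)^-(k+i)
            log_pref = (math.lgamma(i + k) - math.lgamma(i + 1) - math.lgamma(k)
                        + i * math.log(theta) - (k + i) * math.log1p(theta))
            inner = 0.0
            for j in range(min(i, n) + 1):
                m = n - j
                # coefficient of lambda^m in (1 - q lambda)^-(k+i)
                log_term = (math.lgamma(k + i + m) - math.lgamma(m + 1)
                            - math.lgamma(k + i) + m * math.log(q))
                # coefficient of lambda^j in (1 - lambda)^i
                inner += (-1) ** j * math.comb(i, j) * math.exp(log_term)
            p += c_i * math.exp(log_pref) * inner
        probs.append(p)
    return probs

K, THETA = 2.0, 1.0                      # hypothetical parameters
pure = gamma_expansion_counts([1.0], K, THETA, 200)
expanded = gamma_expansion_counts([1.0, 0.0, 0.0, 0.05], K, THETA, 200)
```

With only $c_0 = 1$ the result reduces to the Negative Binomial. Since every term with $i \geq 1$ vanishes at $\lambda = 1$ (where $\gamma = 0$), the higher-order coefficients leave the normalisation untouched, and a non-zero $c_3$ alone also leaves the mean unchanged.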

Appendix C: Synthetic galaxy catalogues

In this Appendix we describe how we generate synthetic galaxy catalogues from Gaussian realizations. The first requirement of these catalogues is that they must be characterized by a known power spectrum and 1-point probability distribution function. The second requirement is that the probability distribution function must be measurable.

The basic idea is simple: we generate a Gaussian random field in Fourier space (assuming a power spectrum) and we inverse Fourier transform it to get its analog in configuration space. We then apply a local transform in order to map the Gaussian field into a stochastic field characterized by the target PDF. The two crucial steps of this process are the choice of the input power spectrum and the choice of the local transform.

Let $\nu$ be a stochastic field following a centered ($\langle \nu \rangle = 0$) reduced ($\sigma_\nu^2 \equiv \langle \nu^2 \rangle_c = 1$) Gaussian distribution. From a realization of this field, one can generate a non-Gaussian density field $\delta$ by applying a local mapping $L$ between the two, hence

$$\delta = L(\nu). \qquad (C.1)$$

The local transform $L$ must be chosen in order to match some target PDF $P_\delta$ for the density contrast $\delta$. Assuming that the local transform is a monotonic function which maps the interval $]-\infty, +\infty[$ into $]-1, +\infty[$ then, due to the probability conservation $P_\delta(\delta)\,d\delta = P_\nu(\nu)\,d\nu$, the local transform must satisfy the matching

$$C_\delta[\delta] = C_\nu[\nu], \qquad (C.2)$$

where $C_x$ stands for the cumulative probability distribution function. Let $[a, b]$ be the domain of definition of the variable $x$; its cumulative probability distribution function is then defined as $C_x[x] \equiv \int_a^x P_x(x')\,dx'$, where $P_x$ is the PDF of $x$. By definition a probability density function is positive; it follows that its cumulative is a monotonic function and therefore Eq. (C.2) can always be inverted. It reads $\delta = C_\delta^{-1}[C_\nu(\nu)]$, where the exponent $-1$ denotes the inverse function, i.e. $F^{-1}[F(x)] = x$. For example, by definition the local mapping $L$ which transforms a Normal distribution into a Log-Normal distribution is $\delta = e^{\nu} - 1$. Note that, depending on the PDF that must be matched, this inversion can require a numerical evaluation, which can be tabulated.

Once a local transform is chosen, we need to address the question of finding the appropriate power spectrum of the Gaussian field $\nu$ which, once locally mapped into the density field $\delta$, will match the expected power spectrum. Following Greiner & Enßlin (2015), who considered a log-transform, we generalized their result to a generic local transformation. This mapping is not direct in Fourier space, while it is in configuration space. Writing the two-point moment of order two of the density field $\delta$ and assuming the probability conservation leads to

$$\xi_\delta \equiv \langle \delta_1 \delta_2 \rangle = \int L(\nu_1) L(\nu_2)\, B(\nu_1, \nu_2, \xi_\nu)\, d\nu_1\, d\nu_2, \qquad (C.3)$$

where $B$ is a bivariate Gaussian defined as

$$B(\nu_1, \nu_2, \xi_\nu) \equiv \frac{1}{2\pi |C_\nu|^{1/2}} \exp\left( -\frac{1}{2} \nu^T C_\nu^{-1} \nu \right). \qquad (C.4)$$

One can notice that in our case (centered reduced Gaussian) the covariance matrix $C_\nu$ takes the simple form

$$C_\nu = \begin{pmatrix} 1 & \xi_\nu \\ \xi_\nu & 1 \end{pmatrix}.$$

Once integrated over the definition domain of $\nu_1$ and $\nu_2$, Eq. (C.3) provides a mapping between the 2-point correlation function of the Gaussian field $\nu$ and the 2-point correlation function of the density field $\delta$. However, we prefer to rotate the coordinate system before performing the integral (C.3): in the case of high correlation ($\sim 1$) the Gaussian becomes concentrated along a straight line, so that most of the sampling points of this function would be useless. We therefore look for the rotation that diagonalizes the matrix $C_\nu$ and converts $\nu$ into a new variable $x$. It follows that

$$C_x = \begin{pmatrix} 1 - \xi_\nu & 0 \\ 0 & 1 + \xi_\nu \end{pmatrix}$$

and the integral becomes

$$\xi_\delta = \frac{1}{2\pi \sqrt{1 - \xi_\nu^2}} \int L\left( \frac{x_2 - x_1}{\sqrt{2}} \right) L\left( \frac{x_2 + x_1}{\sqrt{2}} \right) \exp\left[ -\frac{1}{2} \left( \frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2} \right) \right] dx_1\, dx_2, \qquad (C.5)$$

where $\sigma_1^2 = 1 - \xi_\nu$ and $\sigma_2^2 = 1 + \xi_\nu$. We can therefore integrate over a bounded domain corresponding to $[-8\sigma_1, 8\sigma_1]$ along the $x_1$ axis and $[-8\sigma_2, 8\sigma_2]$ along the $x_2$ axis. Another possibility to perform the integral (C.3) is to use Mehler's formula. Doing so, one can show that the 2-point correlation of the density field can be expressed as a Taylor expansion in the 2-point correlation function of the $\nu$ field. It reads

$$\xi_\delta = \lambda(\xi_\nu) \equiv \sum_{n=0}^{\infty} n!\, c_n^2\, \xi_\nu^n, \qquad (C.6)$$

where the $c_n$ are the coefficients of the Hermite transform of the local mapping, $L(\nu) = \sum_{n=0}^{\infty} c_n H_n(\nu)$, and they can be calculated using the orthogonality properties of the Hermite polynomials

$$c_n = \frac{1}{n!} \int_{-\infty}^{+\infty} L(\nu)\, H_n(\nu)\, P_\nu(\nu)\, d\nu. \qquad (C.7)$$

The latter approach considerably speeds up the numerical evaluation of Eq. (C.5): it allows the 2-D integral to be computed as a finite sum of 1-D integrals. It also allows one to verify that when the 2-point function of the field $\nu$ is positive, the derivative of $\xi_\delta$ with respect to $\xi_\nu$ is positive. Moreover, from Eq. (C.3) one can see that $\xi_\nu = 0$ implies $\xi_\delta = 0$. This means that the function which transforms $\xi_\nu$ into $\xi_\delta$ is invertible as long as $\xi_\delta$ is positive. On the other hand, we know that the zero-crossing of the 2-point correlation function occurs at very large scales, at which one can safely assume that $|\xi_\delta| \ll$ […] $\xi_\delta$ and $\xi_\nu$. As a result, one can take the reciprocal of the function $\lambda$ such that $\xi_\nu = \lambda^{-1}(\xi_\delta)$.

Once the local transform $L$ and the 2-point correlation mapping $\lambda$ are known, the input power spectrum of the Gaussian field $\nu$ can be obtained as follows. We choose a power spectrum $P(k)$ for the density field $\delta$, in the present case that of Eisenstein & Hu (1998), and we calculate its corresponding 2-point correlation function

$$\xi_\delta(r) = \int P(k)\, e^{i \mathbf{k} \cdot \mathbf{r}}\, d^3k. \qquad (C.8)$$

At each scale $r$, one can deduce the 2-point correlation function of the Gaussian field, $\xi_\nu = \lambda^{-1}(\xi_\delta)$, and finally, using a Fourier transform, we obtain the input power spectrum

$$P_{\rm in}(k) = \frac{1}{(2\pi)^3} \int \xi_\nu(r)\, e^{-i \mathbf{k} \cdot \mathbf{r}}\, d^3r. \qquad (C.9)$$

Finally, in order to make sure that the target PDF will be reproduced, it is necessary to verify that, once the input power spectrum $P_{\rm in}(k)$ has been set up on the regular k-space grid which will be used to generate the Gaussian field, its integral is indeed equal to the expected variance on the size of the mesh: $\hat\sigma_a^2 = (2\pi/L)^3 \sum_n P(k_n)$ should be equal to $\sigma_a^2 = \int P(k)\, d^3k$. In general, $\sigma_a$ and $\hat\sigma_a$ are not equal, thus we renormalize the target power spectrum by the quantity $S = \hat\sigma_a^2 / \sigma_a^2$, $\hat P_{\rm in}(k) = S\, P_{\rm lin}(k)$.

We generate a Gaussian field (with a flat power spectrum) on a regular mesh of size $a = [\ldots]\,h^{-1}$ Mpc in a comoving box of $500\,h^{-1}$ Mpc. We then Fourier transform it with an FFT and keep only the phases of the field, $\nu_k = e^{i\theta(k)}$. We generate at each position $k_n$ the value of the modulus, $\nu_k = \sqrt{X_k}\, e^{i\theta(k)}$, where $X_k = -\hat P_{\rm in}(k) \ln(1 - \epsilon)$ and $\epsilon$ is a random number with a uniform probability distribution between 0 and 1. We then inverse Fourier transform the field to get a centered reduced Gaussian field. In Fig. (C.1) we show the input power spectrum of the Gaussian field $\nu$ compared to the one measured using an FFT, and to the one expected from the local transformation applied to the $\nu$ field in order to generate the density field $\delta$.
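The Hermite route of Eqs. (C.6) and (C.7) can be sketched in Python for a case with a known answer. Several choices below are assumptions made for illustration: the $H_n$ are taken to be the probabilists' Hermite polynomials (the normalisation implied by the $1/n!$ factor in Eq. C.7), the local mapping is a zero-mean Log-Normal transform $L(\nu) = e^{\sigma\nu - \sigma^2/2} - 1$ with a hypothetical $\sigma = 0.8$, and the function names and integration settings are not taken from the paper. For this mapping $c_n = \sigma^n/n!$ for $n \geq 1$, so the series should sum to $\xi_\delta = e^{\sigma^2 \xi_\nu} - 1$.

```python
import math

def hermite_values(n_max, x):
    """Probabilists' Hermite polynomials He_0..He_n_max at x.

    Uses the recurrence He_{n+1}(x) = x He_n(x) - n He_{n-1}(x).
    """
    values = [1.0, x]
    for n in range(1, n_max):
        values.append(x * values[n] - n * values[n - 1])
    return values[: n_max + 1]

def hermite_coefficients(local_map, n_max, half_width=10.0, steps=4000):
    """Eq. (C.7) by the trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    coeffs = [0.0] * (n_max + 1)
    for i in range(steps + 1):
        nu = -half_width + i * h
        weight = h * math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)
        if i == 0 or i == steps:
            weight *= 0.5
        value = local_map(nu) * weight
        for n, he in enumerate(hermite_values(n_max, nu)):
            coeffs[n] += value * he / math.factorial(n)
    return coeffs

def xi_delta(xi_nu, coeffs):
    """Truncated series of Eq. (C.6): sum_n n! c_n^2 xi_nu^n."""
    return sum(math.factorial(n) * c * c * xi_nu**n for n, c in enumerate(coeffs))

SIGMA = 0.8  # hypothetical width of the Log-Normal mapping

def lognormal_map(nu):
    """Zero-mean Log-Normal local transform (an assumed example mapping)."""
    return math.exp(SIGMA * nu - 0.5 * SIGMA**2) - 1.0

coeffs = hermite_coefficients(lognormal_map, n_max=12)
mapped = xi_delta(0.5, coeffs)
expected = math.exp(SIGMA**2 * 0.5) - 1.0
```

Since each coefficient is a one-dimensional integral, the mapping $\lambda(\xi_\nu)$ can be tabulated on a grid of $\xi_\nu$ values at negligible cost and then inverted by interpolation wherever it is monotonic.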


Fig. C.1.

Upper: Grey dotted lines show the power spectrum measured in each of the 20 fake galaxy distributions, the black solid line represents their average and the error bars display the dispersion of the measurements. The blue long-dashed line displays the input power spectrum used to generate the Gaussian stochastic field $\nu$ and the red dashed line shows the corresponding expectation value for the power spectrum of the density contrast $\delta$. Lower: Deviation between the measured power spectrum of the $\delta$-field and the expected one.

Università degli Studi di Milano, via G. Celoria 16, 20130 Milano, Italy
INAF - Osservatorio Astronomico di Brera, Via Brera 28, 20122 Milano, via E. Bianchi 46, 23807 Merate, Italy
INAF - Istituto di Astrofisica Spaziale e Fisica Cosmica Milano, via Bassini 15, 20133 Milano, Italy
Aix Marseille Université, CNRS, LAM (Laboratoire d'Astrophysique de Marseille) UMR 7326, 13388, Marseille, France
INAF - Osservatorio Astronomico di Torino, 10025 Pino Torinese, Italy
Canada-France-Hawaii Telescope, 65–1238 Mamalahoa Highway, Kamuela, HI 96743, USA
Aix Marseille Université, CNRS, CPT, UMR 7332, 13288 Marseille, France
Université de Lyon, F-69003 Lyon, France
INAF - Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127, Bologna, Italy
Dipartimento di Matematica e Fisica, Università degli Studi Roma Tre, via della Vasca Navale 84, 00146 Roma, Italy
Institute of Cosmology and Gravitation, Dennis Sciama Building, University of Portsmouth, Burnaby Road, Portsmouth, PO1 3FX
Institute of Astronomy and Astrophysics, Academia Sinica, P.O. Box 23-141, Taipei 10617, Taiwan
INAF - Osservatorio Astronomico di Trieste, via G. B. Tiepolo 11, 34143 Trieste, Italy
SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, UK
Institute of Physics, Jan Kochanowski University, ul. Swietokrzyska 15, 25-406 Kielce, Poland
Department of Particle and Astrophysical Science, Nagoya University, Furo-cho, Chikusa-ku, 464-8602 Nagoya, Japan
Dipartimento di Fisica e Astronomia - Alma Mater Studiorum Università di Bologna, viale Berti Pichat 6/2, I-40127 Bologna, Italy
INFN, Sezione di Bologna, viale Berti Pichat 6/2, I-40127 Bologna, Italy
Institut d'Astrophysique de Paris, UMR7095 CNRS, Université Pierre et Marie Curie, 98 bis Boulevard Arago, 75014 Paris, France
Max-Planck-Institut für Extraterrestrische Physik, D-84571 Garching b. München, Germany
Laboratoire Lagrange, UMR7293, Université de Nice Sophia Antipolis, CNRS, Observatoire de la Côte d'Azur, 06300 Nice, France
Astronomical Observatory of the Jagiellonian University, Orla 171, 30-001 Cracow, Poland
National Centre for Nuclear Research, ul. Hoza 69, 00-681 Warszawa, Poland
Universitätssternwarte München, Ludwig-Maximilians Universität, Scheinerstr. 1, D-81679 München, Germany
INAF - Istituto di Astrofisica Spaziale e Fisica Cosmica Bologna, via Gobetti 101, I-40129 Bologna, Italy
INAF - Istituto di Radioastronomia, via Gobetti 101, I-40129, Bologna, Italy
Dipartimento di Fisica, Università di Milano-Bicocca, P.zza della Scienza 3, I-20126 Milano, Italy
INFN, Sezione di Roma Tre, via della Vasca Navale 84, I-00146 Roma, Italy
INAF - Osservatorio Astronomico di Roma, via Frascati 33, I-00040 Monte Porzio Catone (RM), Italy
Institut Universitaire de France
Université de Toulon, CNRS, CPT, UMR 7332, 83957 La Garde, France