aa r X i v : . [ a s t r o - ph . GA ] M a y Mon. Not. R. Astron. Soc. , 000–000 (0000) Printed 27 June 2018 (MN L A TEX style file v2.2)
Galaxy shear estimation from stacked images
Antony Lewis ⋆ Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK.
27 June 2018
ABSTRACT
Statistics of the weak lensing of galaxies can be used to constrain cosmology if thegalaxy shear can be estimated accurately. In general this requires accurate modellingof unlensed galaxy shapes and the point spread function (PSF). I discuss suboptimalbut potentially robust methods for estimating galaxy shear by stacking images suchthat the stacked image distribution is closely Gaussian by the central limit theorem.The shear can then be determined by radial fitting, requiring only an accurate modelof the PSF rather than also needing to model each galaxy accurately. When noiseis significant asymmetric errors in the centroid must be corrected, but the methodmay ultimately be able to give accurate un-biased results when there is a high galaxydensity with constant shear. It provides a useful baseline for more optimal methods,and a test-case for estimating biases, though the method is not directly applicable torealistic data. I test stacking methods on the simple toy simulations with constant PSFand shear provided by the GREAT08 project, on which most other existing methodsperform significantly more poorly, and briefly discuss generalizations to more realisticcases. In the appendix I discuss a simple analytic galaxy population model wherestacking gives optimal errors in a perfect ideal case.
Gravitational lensing of light from distant galaxies causesthe shape of the galaxies to be distorted in a way thatdepends on the transverse gradients of the gravitationalpotential along the line of sight (Bartelmann & Schneider2001). If the distortion can be measured accurately, it givesa constraint on the lensing potentials, and hence with largeenough number of samples on the geometry and distributionof perturbations in the universe. Since the galaxy shapesvary greatly, this can only be done by analysing a very largenumber of galaxies, with galaxies that are sufficiently wellseparated that their intrinsic shape correlations can be mod-elled out or is small. The galaxies can then be assumed tobe independent, so that any shape correlation is due entirelyto lensing. The task is to find a way to estimate the lens-ing distortion, which can then be used to extract statisticalresults from an ensemble of galaxy images.At leading order the main observable distortion is thatof galaxy shear. As discussed further below, if we couldobserve the galaxies directly, fitting any sheared profile toeach galaxy will give an unbiased estimator of this shear.The problem is however much more complicated, becausein practice we can only measure the shape after convolutionwith the point spread function (PSF) of the instrument (e.g.due to atmospheric fluctuations and instrumental imperfec-tions), and image pixelization. The levels of shear that areexpected — a few percent — are comparable to those of typi-cal PSFs, so the PSF must be modelled very accurately in or-der to isolate the cosmological signal. Since the PSF breaksthe symmetries of the problem, in general this requires ac- curate modelling of both the unlensed galaxy shapes andthe PSF. Finding methods of doing this that work to the re-quired level of precision is an active area of current research.At the moment it unclear whether it is even possible to getuseful high-precision shear constraints in the presence of re-alistic ground-observation PSFs, or whether in fact thereare unavoidable degeneracies with galaxy shapes and PSFmodelling uncertainties. The correct statistical error on theshear measurement could also be too large for the numberof observable galaxies to produce precision constraints.In this paper I re-visit an old sub-optimal methodof Kuijken (1999): stacking galaxy images. If the intrinsicgalaxy shapes are uncorrelated, a stacked unlensed imageshould have circular symmetry. Since convolution is a lin-ear operation, the observed stacked image should then bea PSF-convolved sheared version of a circularly symmetricaverage galaxy. If the PSF is known, the only modelling un-certainties are then in the averaged galaxy profile, whichshould be well determined by the data. Furthermore, un-der fairly general conditions a sum of independent samplesshould have a close-to-Gaussian distribution by the centrallimit theorem, so the statistics of the stacked image is knownwithout needing to know anything about the distribution ofindividual galaxy shapes. Fitting a radial profile and shearto the data with a Gaussian error model gives an estimate ofthe shear that should be very independent of the actual dis-tribution of galaxy shapes. The method therefore provides auseful baseline for comparing future more optimal methodsthat incorporate accurate modelling of individual galaxies.In practice of course things are not so simple. To start c (cid:13) A.M. Lewis with the shear and PSF are not expected to be constant, soany stacked galaxy image has to be interpreted with care. Inaddition, in the presence of noise, the process of stacking canitself produce biases since we cannot determine the centroidaccurately: any shear- or PSF-correlated misalignments inthe stacking procedure will introduce biases.Given the complexity of the general problem, the lens-ing community has helpfully boiled the issues down into a se-ries of much simpler problems (Heymans et al. 2006; Masseyet al. 2007; Bridle et al. 2008). Although existing methodsperform adequately for current and near-future data, evenin these highly simplified cases they are known to be inade-quate for future surveys. I therefore focus on these simplifiedproblems to try to isolate the important issues, in particularI shall assume the PSF is well measured from many low-noisestar images and that shear is constant. If no methods worksaccurately even on this very simple toy problem, then thatis clearly sufficient to show that ground-based weak lens-ing surveys with similar PSFs will be of no use for precisioncosmology (i.e. future cosmological parameter constraints atthe percent level or better). On the other hand if sufficientlyaccurate methods can be developed, the next task will be tomake them applicable to more realistic situations where thePSF is likely to vary significantly and the shear has a real-istic spatial correlation function. Space-based observationstypically have rather different PSFs and would require aseparate study.I start by reviewing the case of shape estimation whenthere is no PSF, and then briefly explain why introducinga PSF qualitatively increases the complexity of the prob-lem. I then move on to show that stacking can work wellwith low-noise simulations, and discuss various issues to dowith pixel-scale stacking, centroid errors and non-constantPSFs. I test stacking methods on the GREAT08 simula-tions (Bridle et al. 2008) and show that it performs wellcompared to other existing methods, most of which involvemodelling unlensed galaxy shape distributions with some-thing that is known to be incorrect. Unlike other existingmethods the stacking method is not directly applicable tomore realistic data, but may be useful to motivate moregeneral approaches. At lowest order in the gravitational potentials weak lensingcauses position x u on the unlensed image to be related tothe corresponding position x l on the lensed image by x u = S x l ≡ „ − g − g − g g « x l , (1)where the components of the shear matrix g and g are thereduced shear in some coordinate system. For the purposesof this paper we can neglect the uniform convergence whichis degenerate with the galaxy size and assume g , g are con-stant across each galaxy image. For a thorough introductionto weak lensing see Bartelmann & Schneider (2001); Lewis& Challinor (2006). Consider fitting a model m ( S m x , θ ) to an unlensed per-fect galaxy image I u ( x ), with model parameters θ and shearmatrix S m . For example a simple least-squares fit wouldsolve ∂∂θ Z d x | I u ( x ) − m ( S m x , θ ) | = 0 ∂∂ S m Z d x | I u ( x ) − m ( S m x , θ ) | = 0 . (2)Assuming there is a unique solution S m = S , θ = ˆ θ , thelensed image I l ( x ) = I u ( S x ) would then be fit by m ( ˆ S x , ˆ θ )where ˆ S = S S . The best-fit unlensed shear matrix S isdetermined by the particular galaxy and model. The key as-sumption in galaxy weak lensing is that the galaxy shapesare statistically isotropic, in other words that versions ofeach unlensed galaxy rotated by different angles (or flipped)are equally likely. So R T S R is just as likely as S forsome rotation matrix R . Taking R to be a rotation by90 ◦ , on average over many galaxy orientations we have, h S i = h S + R T S R i = I , and hence h ˆ S i = S : the shearmatrix estimator is unbiased. Note that this is entirely in-dependent of how well m ( S m x , θ ) actually fits the unlensedgalaxy, so in the idealized case we could fit any model welike to galaxy shapes and still on average get the correctanswer. This will remain true for best-fits to more generallog-likelihoods of the form χ = Z d x ( I l ( x ) − m ( S m x , θ )) T [ N ( I l ( x ) , S m x )] − × ( I l ( x ) − m ( S m x , θ )) , (3)i.e. where the noise depends only on the lensed galaxy inten-sity or follows the alignment of the galaxy model. Similarlyfor generalizations with correlated noise. Unfortunately we cannot observed lensed galaxies directly,but only after convolution with an instrumental point spreadfunction and pixelization. Pixelization can be though of asan additional contribution to the PSF, typically a convolu-tion with a square window function, followed by samplingat the pixel centres. I shall discuss the PSF in this general-ized sense, so that the observational data consists of a setof regularly-spaced samples of a PSF and pixel-convolvedgalaxy image. The noise-free observed value at position x on the image plane is then I o ( x ) = Z d y P ( x − y ) I l ( y ) , (4)where P ( x ) is the total PSF, or simply I o = P ⋆I l . If we knowthe PSF function, we can fit a PSF-convolved galaxy modelto the data; for example a least-squares solution would min-imize χ = Z d x „ I o ( x ) − Z d y P ( x − y ) m ( S m y , θ ) « = Z d x »Z d y P ( x − y )[ I u ( S y ) − m ( S m y , θ )] – . (5)If S m = S , θ = ˆ θ is the best fit when there is no lensing,due to the position dependence of the PSF it is no longer c (cid:13) , 000–000 alaxy shear estimation from stacked images in general the case that S m = S S , θ = ˆ θ is the best fitto the lensed image. Hence unlike in the case with no PSF,there is no longer any guarantee that fitting is giving anunbiased estimate of the shear. The only general exceptionis if the model fits the galaxy exactly, so the best fit has I u ( S y ) = m ( S m y , θ ), in which case fitting is giving theright answer on average independent of the known PSF.Extracting unbiased shear constraints by model fittingin the presence of a PSF therefore in general requires mod-elling the galaxies accurately. This poses several significantproblems. A large galaxy lensing survey will have most ofits galaxies near the edge of its resolution, therefore thereis typically only limited high-quality data to constrain theproperties of the bulk of the galaxies in the selection func-tion. A general Bayesian model could use information fromsome well-resolved galaxies, and fit a general model for un-certainties, but given the large variation in galaxy alignmentwith respect to the line of sight, and wide variations in theintrinsic shapes, a general model is likely to involve a largenumber of parameters and require many images to constrainwell. The galaxy model can also be constrained to some ex-tent using all of the observed galaxies. But if the param-eters can not all be well constrained by the data, it maybe essential that the priors accurately represent the galaxydistributions in order to get unbiased answers. In additionany model with large numbers of parameters per galaxy islikely to become numerically time consuming. For an excel-lent discussion of many related issues see Bernstein & Jarvis(2002); Hirata & Seljak (2003), and a summary of other ex-isting methods in Bridle et al. (2008). For promising recentresults on Bayesian model fitting see Miller et al. (2007);Kitching et al. (2008), however the galaxy model used inthis method is still unrealistic and results, though signifi-cantly better than many other methods, are still not goodenough for high-precision cosmology (see Sec. 4). General galaxy fitting should provide the best constraintson the shear. However given the problems outlined above,and given potential difficulties in knowing whether the mod-elling is accurate enough, it would be useful to have a simpleless-optimal but more robust shear-estimation method thatis more directly independent of the details of the galaxy dis-tribution. In simple test cases this would be a useful cross-check, and provide a baseline for the levels of residual noisethat better methods should be able to beat.The method I shall focus on simply stacks the galaxies,and then fits a sheared average galaxy model to the stackedimage, following Kuijken (1999). If the PSF is known, thisshould give unbiased results conditional only on being ableto stack in an unbiased way, and being able to model theradial profile of the averaged unlensed galaxy accurately. A1-dimensional radial model is clearly much easier to fit thata full 2D galaxy shape distribution, and since the averagegalaxy is expected to have a smooth radial profile only amodest number of parameters should be required. These pa-rameters are likely to be well constrained with a reasonablenumber of galaxies (and hence the results fairly independentof the priors).In order to stack galaxies images, we need to be able to define a rule for the relative galaxy alignment, e.g. by defin-ing a centroid in each image and then stacking the images sothat their centroids are aligned. Assuming this can be done,we then have an observed stack of N galaxy imagesˆ¯ I o ( x ) ≡ N N X i =1 β i I o,i ( x ) = 1 N N X i =1 [ β i P i ⋆ I l,i ]( x ) , (6)where β i are some weights and P i the PSF on galaxy i .Assuming the PSF is independent of the galaxy shape theexpected value of the stacked image is h ˆ¯ I o i = 1 N N X i =1 [ P i ⋆ h β i I l,i i ] = N N X i =1 P i ! ⋆ ¯ I l,β = ¯ P ⋆ ¯ I l,β , (7)where ¯ I l,β is the average of a weighted galaxy. By symme-try, taking x to have origin at the centroid, in the unlensedcase ¯ I u,β ( x ) = ¯ I u,β ( | x | ). The average PSF ¯ P — includingpixelization — is precisely what is observed from a large sta-tistically equivalent set of star images (assuming stars arepoint sources and have the same PSF as the galaxies).Assuming the weights are independent of the shear andthe shear is constant, the expectation of the stacked im-age is a sheared circularly-symmetric averaged galaxy, con-volved with an average PSF. We can therefore proceed tofit a model to the observed stacked image, and if the radialprofile can be fit accurately the method should be unbiased.In the appendix I discuss a simple analytic galaxy popula-tion model in which, for the ideal noise-free case, stackingwith an appropriate weighting is in fact optimal.In the presence of noise, and with finite N so that thereis dispersion about the expectation value, we need an errormodel. One benefit of using stacked images is that this iswell defined: assuming each galaxy is independent, the dis-tribution of ˆ¯ I o , a sum of many independent galaxy samples,should be nearly Gaussian by the central limit theorem.In fact we can apply any linear function to ˆ¯ I o , and thedistribution will still be Gaussian with expectation givenby the equivalent linear function of the averaged convolvedgalaxy. This can be useful for data compression, e.g. to re-pixelize, or expand in moments, etc, anything that’s likely toencapsulate most of the useful information in fewer numbers.If the stacked image is generated at much higher pixel sam-pling than the original image, there will be a large number ofpixel values and hence a huge number of galaxies requiredfor the covariance estimate to be accurate. Also since thenoise is correlated on the scale of the original pixel size, thecovariance would be singular, so applying some linear re-pixelization or other linear compression matrix M can beuseful. Writing the set of sampled x values as a vector, fora data vector X = M ˆ¯ I o the covariance C X ≡ h XX T i canbe estimated from N galaxy samples asˆ C X = 1 N X i M ( β i I o,i − ˆ¯ I o )( β i I o,i − ˆ¯ I o ) T M T (8)(for large N , N ≫ dim( X )). The likelihood as a function ofparameters θ and shear matrix S can then be approximatedas − L ( S , θ ) ∼ [ˆ¯ I o − m o ( S , θ )] T M T ˆ C − X M [ˆ¯ I o − m o ( S , θ )](9)where m o ( S , θ ) = ¯ P ⋆ m ( S , θ ) is the model for the av-erage PSF-convolved sheared circularly-symmetric galaxy. c (cid:13) , 000–000 A.M. Lewis
The simplest thing is to take M to just re-pixelize, e.g. atthe original pixel resolution. For simple PSFs it also loseslittle information to halve the number of points by taking M to sum I ( x ) and I ( − x ) since this includes the shear and ra-dial information, but ignores irrelevant dipole fluctuations.Note that C X includes variance due to both noise and ‘shapenoise’ due to the differences in galaxy shapes. The latter isexpected to be spatially correlated even if the noise is not,but in any case the central limit theorem result straightfor-wardly accounts for any correlated or non-Gaussian noise inindividual images.In the limit that the model fits the stacked image ex-actly, and the stated assumptions are met, the fitting proce-dure should be unbiased. However due to noise this will notquite be the case, so there is potentially a source of noise-bias through the PSF and shear dependence of the estimatedcovariance.Note that even if the instrumental PSF is actually con-stant, the PSF for points in the stacked image plane variesfrom galaxy to galaxy due to the offset between the centresof the pixels in each image and the centre of the stackedimage. If the stacked image is pixelized at higher resolution,then there are different PSFs for each non-equivalent high-resolution pixel centre. However the averaged PSF will bethe same for each high-resolution pixel. The high resolutionpixels are of course strongly correlated due to the pixeliza-tion of the individual images. Centroid errors on the galaxy image plane are harmless(other than increasing the error bars), since they merelyeffect the average galaxy profile. For example we could con-sider defining the centroid in terms of a random displace-ment from the centre of light in the unlensed galaxy, whichis perfectly legitimate. The problem is that in general thecentroid error will depend on the galaxy shape, and hencealso shear and PSF in non-trivial way. Since the centroidof a long thin shape is hard to determine in the long direc-tion, the centroid error is typically strongly correlated withthe shape of the galaxy; if the galaxies have a net ellipticityin one direction due to the PSF, the centroids will tend tohave a net dispersion aligned with the PSF. This has theeffect of making a naively stacked image give results biasedin the direction of the PSF. There are similar effects due toshear. When the centroid error is not negligible comparedto the galaxy sizes, the centroid error must be accountedfor somehow in order to get unbiased results. In general thisis difficult, though an approximate correction may be suffi-cient.Two simple approaches immediately present themself.We could simply attempt to model the effective centroid-error PSF and include it as part of the effective PSF on thestacked image. Or the centroid error could be modified toremove some of the sources of bias. The latter approach islikely to be more straightforward, if less optimal.As a crude first attempt to remove the leading-ordercentroid bias I simply add Gaussian noise to each centroidso that the total centroid dispersion is approximately cir-cularly symmetric. To do this I fit a 6-parameter Gaussianelliptical model to each observed (PSF-convolved) galaxyto determine the centroid, calculating the Hessian errors by
Figure 1.
Typical residuals after fitting a unit-amplitude shearedcircularly-symmetric galaxy to a stacked image. Note errors arecorrelated due to pixel-scale stacking correlations and correlatedshape noise. Here pixels have been added at 15 sub-pixel resolu-tion, less than needed for accurate results. numerical differentiation and then inverting to get an ap-proximate centroid error matrix. Then to each estimatedcentroid I add a small Gaussian displacement in a directionchosen such that the total centroid error is then isotropic.If the estimate of the centroid error on each galaxy is fairlyaccurate, this should remove the correlation of centroid dis-persion with galaxy alignment, and hence reduce the PSFbias. However the magnitude of the centroid error will stilldepend on the PSF-convolved sheared galaxy shape, andhence potentially lead to residual biases. Furthermore if thetotal centroid dispersion is accounted for by allowing the av-erage galaxy profile to change, the centroid error really hasto be sheared like the rest of the galaxy shape; a better ap-proach could therefore use an approximate estimate of theshear to ensure that the total centroid error on the imageplane is sheared approximately correctly .The centroid determined by Gaussian model fitting thatI use seems to have about 10% less dispersion than that ob-tained using adaptive moments (Bernstein & Jarvis 2002),however in the presence of more complicated PSF (e.g. witha dipole) the position of the centroid could be biased, so amore sophisticated method may be required. Unfortunatelyto get the centroid error correct in general requires mod-elling the shape of each galaxy correctly, which is just ashard as the original shape estimation problem. However aslong as the centroid error is small compared to the size ofthe galaxies, an approximate correction may be sufficient.Indeed the output from a more realistic galaxy fitting codelike Lensfit (Kitching et al. 2008) might be a good place tostart trying to improve the crude Gaussian model used here.Simulations may also be reliable enough to find a fudge pa-rameter to relate the estimated centroid error to the truecentroid error to the required accuracy. c (cid:13) , 000–000 alaxy shear estimation from stacked images I test the galaxy stacking method on simulations providedby the GREAT08 project (Bridle et al. 2008). These sat-isfy the required assumptions, in that the shear is constantover a larger number of galaxy images. The simulations alsohave constant (very simple) PSF, with a large number oflow-noise star images so that the PSF can be determinedessentially exactly. The PSF is anisotropic but has no dipolemoment and the isophotes have the same shape at each ra-dius; it is therefore a rather special case and is likely to beunrealistic in several important respects. Nonetheless, evenwith these radical simplifications from reality, most exist-ing shear-estimation methods fail to produce results at anaccuracy required for precision cosmology, so it makes aninterested test case.The disadvantage of non-optimal methods such asstacking is there are lots of free parameters, e.g. choice of β i and radial fitting function, choice of the M reduction ma-trix, as well as resolution parameters governing the stacking.The weights β i must be chosen in a shape-independent (or atleast alignment-independent) manner, otherwise biases maybe introduced. I take β i to constant or inversely proportionalto the integrated signal in each images (so that the averagegalaxy is then independent of the magnitude distribution ofthe galaxies); see the Appendix for a discussion of the op-timal weighting in an idealized case. For noisy images thisshould probably be modified by an estimate of the signal tonoise ratio do down-weight noise-dominated images. I chose M simply to re-pixelize the stacked image to the resolutionof the original galaxies.Parameterizing radial distribution of m ( S , θ ) usingsplines is convenient, so θ is a set of values at some radialspline nodes. Splines naturally have multiple resolution: e.g.we can do a quick fit with a few spline points, then increasethe number of spline parameters to refine the result. Thiscould be done in an adaptive way to make sure the datais fit but not over-fit. I simply choose to spline in the logof the radial amplitude, using 12 spline points over a ra-dius range of 8 pixel units for fitting the stacked image,with stacked-image resolution 1/31 or 1/41 of the originalpixel size (with initial fit at 1/9 resolution with 7 splinepoints to get close to the best-fit point quickly). The fit-ting could be done with MCMC to get accurate error bars,though at a quick look does not show evidence of strong de-generacies or asymmetries in the error bars. So finding justthe best fit is a reasonable first step, with errors approxi-mated from a Hessian if required. To find the best-fit pointI use the NEWUOA algorithm, which can be significantlyfaster than ‘AMOEBA’ downhill-simplex method (Nelder &Mead 1965) in many cases, though I need to be a bit carefulto avoid local minima. The resulting reduced chi-squared isgenerally less than one, indicating that indeed the galaxy isbeing fit accurately, though values are hard to assess due tothe non-realistic mirroring procedure used in the GREAT08simulations to help reduce shape noise. Typical residuals areshow in Fig. 1.For noisy images the centroid error needs to be cor- Thanks to Gary Bernstein for pointing out this issue. rected as discussed in the previous section. Comparison withthe test simulation indicates that the centroid variance isunderestimated by about a factor of a half, so I adopt acentroid-error fudge parameters α = 1 . α C where C is estimatedfrom the Hessian about the best-fit Gaussian model.Accuracy of results for the purpose of GREAT08 is de-fined by a quality parameter Q (Bridle et al. 2008), so thatthe shear variance is h ( h ˆ g i − g ) + ( h ˆ g i − g ) i = 10 − Q , (10)where ˆ g i is the shear estimated from a plate of 10000 galaxyimages at constant shear and the same PSF, g i is the trueshear, and h ˆ g i i is estimated from an ensemble of differ-ent simulated plates with the same shear. For noisy sim-ulations results are quoted for Q estimated from the ex-pectation value from a set of simulations with differentPSF and true shears. The target for future observations is Q ∼ O (1000) (Amara & Refregier 2008), and current meth-ods typically give Q . O (100). Biases on g i therefore needto be below . × − level, or a typical fractional shearerror of less than about a percent. For low-noise images thedefinition is simply to take each plate separatelyˆ Q = 10 − h (ˆ g − g ) + (ˆ g − g ) i plates , (11)with the stacking method described here giving ˆ Q ∼ Q . Q ∼ Q &
100 can be obtained by this sub-optimal method, making essentially no assumptions aboutthe galaxy distribution, is perhaps encouraging evidencethat there will exist a better method that is good enough forprecision cosmology using only modest assumptions aboutthe galaxy distribution. There is some evidence for shear-calibration bias in the stacking results, with a tendency for | g | to be too large. More careful modelling of the centroiderror, for example using better model fitting and an iterativeshear estimate, could probably reduce the systematic error.Suitable time-consuming adjustment of the method param-eters may also allow the method to perform significantlybetter. I have revisited the simple shear-estimation stacking methodof Kuijken (1999), and shown that it still makes a useful c (cid:13) , 000–000 A.M. Lewis baseline that can compare favorably with currently usedmethods in idealized cases. Although it only works straight-forwardly over regions with constant shear, it can be a usefultest case, and help to understand possible sources of bias inother methods. Stacking has the advantage of giving resultsthat are unbiased almost independent of the unknown distri-bution of unlensed galaxy shapes. Residual biases enter at alower level, for example through correlations of the centroiderror with galaxy shape. With low noise the method canproduce accurate results, comparing favourably with meth-ods that fit galaxy models that are known to be unrealistic.This should be unsurprising: Bayesian methods generallygive the right results only if the correct model is used andpriors truly reflect beliefs. Only in the very special case of noobservational PSF does fitting any model to galaxy shapesgive unbiased answers; a general PSF breaks all the symme-tries, requiring accurate modelling of both the PSF and theunlensed galaxy shape distribution to get the right result.Even if individual noisy galaxies are well fit by a simplegalaxy model due to the large noise, if in reality galaxieshave significant substructure or un-modelled shape varia-tions, the combined high-precision shear estimate from fit-ting many galaxies separately may be biased due to theinconsistent shape modelling. It is possible the modellingbias is negligible, but unless carefully proven analytically ordemonstrated numerically in realistic simulations it wouldbe safer to assume otherwise (see Voigt & Bridle (2009)for a quantitative analysis of the significant bias in vari-ous idealized cases). The noise-free stacking procedure is byconstruction linear in the galaxies, which is why substruc-ture variations between galaxies effectively cancel. Howeverfitting to individual galaxies is usually a non-linear proce-dure, and there is no reason to expect errors to cancel moregenerally. Future work may however be able to find fairlymodel-independent methods that can be applied to fittingindividual galaxies, significantly improving on the stackingmethod both in terms of signal to noise, and in terms ofapplication to more realistic cases. If not, stacking methodsmay still be useful. Future work could investigate how toapply stacking in more realistic cases where the shear variesfrom galaxy to galaxy, and the PSF can only be estimatedlocally with significant noise. At leading order, a fit to astacked galaxy constructed over a region with small varia-tions in shear should be probing the appropriately averagedshear. With a high galaxy density the corresponding sup-pression of small-scale power may be acceptable if it can beaccurately modelled without significant bias.
I acknowledge a PPARC/STFC Advanced fellowship andthank members of the GREAT08 team, especially SarahBridle, for helpful discussions, Lance Miller for comments,Steven Gratton for matrix identity advice, and CITA for useof their old computer.
APPENDIX A: ANALYTIC EXAMPLE
Here we consider a very simple toy distribution of galaxieswhere we can attempt to calculate some things analytically. Consider the case where each galaxy has a Gaussian-shapedprofile I u ( x | Q i ) = A e − x T Q − i x / | Q i | w/ , (A1)where the distribution of the covariance Q i of each galaxyis drawn from a 2-dimensional inverse Wishart distribution(see e.g. Gupta & Nagar (1999) for review and results usedbelow) P ( Q ) = | Ψ | ( n − / e −
12 Tr( Ψ Q − ) n − π / Γ[( n − / n − / | Q | n/ (A2)where n >
6. Since we assume the unlensed distribution isstatistically isotropic Ψ = ( n − σ g I where σ g is the aver-age galaxy width. The parameter n determines how broadthe galaxy shape distribution is, with n → ∞ correspond-ing to a distribution of identical circular Gaussian galaxies.Typical galaxy ellipticities are O ( n − / ) with h ( Q − Q ) ih ( Q + Q ) i = h Q ih ( Q + Q ) i = 1 n − . (A3)The parameter w governs how the magnitude varies, with w = 0 corresponding to all galaxies having the same peakamplitude, and w = 1 corresponds to them all having equalintegrated light.The averaged galaxy profile (with equal weight) is givenby¯ I u ( x ) = Z d Q P ( Q ) I u ( x | Q )= Γ[ n + w − n − | Ψ | w/ A (1 + x T Ψ − x ) ( n + w − / . (A4)As expected this becomes the same as the individual galaxyshape as n → ∞ . The covariance can be calculated similarlyas cov( x , y ) = Z d Q P ( Q ) I u ( x | Q ) I u ( y | Q ) − ¯ I u ( x ) ¯ I u ( y )= A Γ[ n − | Ψ | w » Γ[ n + 2 w − | I + Ψ − ( xx T + yy T ) | ( n +2 w − / − Γ[ n + w − Γ[ n − x T Ψ − x )(1 + y T Ψ − y )] ( n + w − / – . (A5)Note that | I + Ψ − ( xx T + yy T ) | =(1 + x T Ψ − x )(1 + y T Ψ − y ) − ( x T Ψ − y ) . (A6)The covariance is determined by the number of degrees offreedom governing the population, so that with a simplemodel the number of significantly non-zero eigenvalues issmall. In the case here each galaxy shape is determined bythe three independent numbers in Q .Using this analytic galaxy population model we cancompare the errors (e.g. estimating the shear matrix S suchthat Ψ = S T Ψ S ) using stacking compared to what couldbe done using an optimal analysis. If Q i were simply mea-sured directly from each galaxy (in the low-noise limit) thenthe optimal expected error is *X i ∂ ln P ( Q i ) ∂g a ∂g b + − g =0 = δ ab N ( n − , (A7) c (cid:13) , 000–000 alaxy shear estimation from stacked images where N is the number of galaxies. The corresponding errorper galaxy is σ = σ = 1 / (2 √ n − σ i is the erroron g i . For n = 7 this corresponds to an error per galaxy σ i = 0 . w = 0 in fact gives thesame average error per galaxy, with the errors increasingonly slightly for w ∼ O (1). In this noise-free case with knowndistributions and centroids, the stacking method is close tooptimal. To show that with w = 0 stacking gives optimalanswers we only need to show that the full likelihood can bewritten in terms of the stacked image. Since − P ( Ψ |{ Q i } ) = X i ˆ Tr( Ψ Q − i ) − ( n −
3) ln | Ψ | ˜ +const , (A8)where the last term is independent of Ψ , a sufficient statisticis P i Q − i . However this can be measured by taking deriva-tives of the perfect stacked imageˆ¯ I u = AN X i e − x T Q − i x (A9)at the origin, and hence stacking is lossless for measuringthe shear in this ideal case. Since the number of degrees offreedom in the galaxy model is small, the stacked image doesnot actually need to be densely sampled to obtain close tooptimal results.In the zero-noise limit with infinite resolution, a knownPSF can simply be deconvolved, so the above results alsoapply to PSF-smeared noise-free galaxies. Noise can be ac-counted for by adding an appropriate term to Eq. (A5) ifthe centroids are known, and will increase the expected er-ror per galaxy. Analysing more realistic cases analytically ischallenging. REFERENCES
Amara A., Refregier A., 2008, MNRAS, 391, 228,0710.5171Bartelmann M., Schneider P., 2001, Phys. Rept., 340, 291,astro-ph/9912508Bernstein G. M., Jarvis M., 2002, AJ, 123, 583, astro-ph/0107431Bridle S., et al., 2008, 0802.1214Gupta A., Nagar D., 1999, Matrix Variate Distributions.Chapman & Hall, ISBN: 1584880465Heymans C., et al., 2006, MNRAS, 368, 1323, astro-ph/0506112Hirata C. M., Seljak U., 2003, MNRAS, 343, 459, astro-ph/0301054Kitching T. D., Miller L., Heymans C. E., van WaerbekeL., Heavens A. F., 2008, MNRAS, 390, 0802.1528Kuijken K., 1999, Astronomy and Astrophysics, 352, 355,astro-ph/9904418Lewis A., Challinor A., 2006, Phys. Rept., 429, 1, astro-ph/0601594Massey R., et al., 2007, MNRAS, 376, 13, astro-ph/0608643Miller L., Kitching T. D., Heymans C., Heavens A. F., vanWaerbeke L., 2007, MNRAS, 382, 315, ADS, 0708.2340Nelder J. A., Mead R., 1965, The Computer Journal, 7,308 Voigt L., Bridle S., 2009, Fundamental limits to lensingshear estimation for model fitting methods, in prepara-tion. c (cid:13)000