Noise and bias in square-root compression schemes
Gary M. Bernstein, Chris Bebek, Jason Rhodes, Chris Stoughton, R. Ali Vanderveld, & Pen-Shu Yeh

Dept. of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104; Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720; Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109; California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125; Fermi National Accelerator Laboratory, PO Box 500, Batavia, IL 60510; Goddard Space Flight Center, Greenbelt, MD 20771
Submitted to PASP
ABSTRACT
We investigate data compression schemes for proposed all-sky diffraction-limited visible/NIR sky surveys aimed at the dark energy problem. We show that lossy square-root compression to 1 bit of noise per pixel, followed by standard lossless compression algorithms, reduces the images to 2.5–4 bits per pixel, depending primarily upon the level of cosmic-ray contamination of the images. Compression to this level adds noise equivalent to ≤ 10% penalty in observing time. We derive an analytic correction to flux biases inherent to the square-root compression scheme. Numerical tests on simple galaxy models confirm that galaxy fluxes and shapes are measured with systematic biases ≲ 10^{-4} induced by the compression scheme, well below the requirements of supernova and weak gravitational lensing dark-energy experiments. An accompanying paper (Vanderveld et al. 2009) tests for shape-measurement biases from this compression scheme on realistic simulations of sky-survey images.

Subject headings:
Data Analysis and Techniques
1. Introduction
The unexpected discovery of the acceleration of the Hubble expansion of the Universe has inspired numerous proposals for large-scale observational projects to seek further constraints on the nature of dark energy (e.g., Frieman et al. 2008). An imaging survey covering a fraction f_sky of the sky, with pixel scale p and N_exp exposures per sky location, comprises

N_{\rm pix} = 5.3\times10^{13}\,\left(\frac{p}{0.1''}\right)^{-2} f_{\rm sky}\,N_{\rm exp}   (1)

pixels. A crate filled with disk drives is all that is needed these days for transmission of hundreds of terabytes of data between terrestrial sites. The transmission of these data from a spacecraft at the Earth–Sun L2 Lagrange point, 1.5 million km away, is, however, a substantial expense. If the data require an average of bpp bits per pixel to transmit, with a telemetry rate r and a survey duration T, then the number of hours of downlink per day required will be

{\rm downlink\ time} = \frac{{\rm bpp}\times N_{\rm pix}}{r\,T} = 0.27\,\frac{\rm hrs}{\rm day}\times{\rm bpp}\;\frac{f_{\rm sky}\,N_{\rm exp}}{T/1\,{\rm yr}}\left(\frac{r}{150\ {\rm Mbps}}\right)^{-1}\left(\frac{p}{0.1''}\right)^{-2}.   (2)

To avoid excessive resource use both on the spacecraft and at ground stations, the data volume must be reduced well below the 16 bits per pixel that is typical for digitization of astronomical array detectors. In this paper we investigate means for compressing astronomical imaging data well below this level (see also Pence et al. 2009).

Optical astronomers are averse to lossy compression, for fear of losing the information from their dearly acquired photons or of inducing biases in high-precision measurements. It should be pointed out, however, that lossy compression is already being performed on essentially all data when the analog detector outputs are digitized. Analog-to-digital gains are usually set so that the errors induced by digitization are below the read noise of the detector system. In radio astronomy it is well known that digitization to only 1 or 2 bits per sample can capture nearly all the signal information when the S/N per sample is low (Weinreb 1963; Thompson et al. 2001). The relevant measure of digitization fineness is the ratio b = σ_N/∆ of the signal noise to the digitization step size. We seek a lossy compression scheme which holds b fixed over the full dynamic range of the detector. When the signal noise is dominated by shot noise from the detected photoelectrons, it is straightforward to show that a square-root compression scheme holds b constant. The application of square-root compression to Poisson-dominated data has been discussed by Gowen & Smith (2003), and it is part of the signal chain for the James Webb Space Telescope (Nieto-Santisteban et al. 1999) and the
Supernova Acceleration Probe (Lin & Marriner 2003) spacecraft designs. Further discussion is in Seaman, White, & Pence (2009). In §§ 2 and 3 we define the square-root compression scheme and show that the codec inflates the pixel variance by a fraction 1/(12 b²) for cases of interest.

Once a lossy compression algorithm is determined to reduce the data volume with acceptably small degradation of the noise level, it is critical to determine whether the codec process will induce any bias on the astronomical measurements derived from the data. Future dark-energy experiments will require very demanding, high-precision measurements. For the use of Type Ia supernovae as distance indicators, systematic errors in flux measurements must be far below the statistical errors so that they contribute negligibly to the photometric error budget. Even more demanding are programs for weak gravitational lensing (WL) measurements, which depend upon determination of the shapes (ellipticities) of vast numbers of galaxies on the sky. Huterer et al. (2006) and Amara & Réfrégier (2008) both conclude that measurements of galaxy ellipticities must have systematic errors ≲ 10^{-3} in order to remain subdominant to statistical errors in the WL experiments. Because there are many possible sources of shape-measurement error (primarily uncorrected instrumental image distortions), we would like the shape biases induced by the codec to be a minor contributor to the shape error budget, i.e. the codec-induced shift δe_i in each component e_i of galaxy ellipticity should on average be well below 10^{-3} e_i.

The goal of this paper is to examine the biases induced by square-root compression down to b = 1 bit of noise per pixel, and to show how these biases can be reduced to negligible levels. We derive and verify the bias correction in § 4, and we test for measurement biases on simple simulated galaxies in § 5. A companion paper (Vanderveld et al. 2009) tests for shape-measurement bias on data that is a realistic simulation of images expected from the Joint Dark Energy Mission (JDEM, http://jdem.gsfc.nasa.gov) CCD imager. Finally, in § 6 we examine the achievable data compression: we find that we can compress JDEM-like images to 3–4 bits per pixel with only a 4% penalty in image noise, and reconstruct them such that biases in flux and shape measurements induced by the codec are not detected, down to ≈ 10^{-4} levels.
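As a rough check of the bookkeeping behind Equations (1) and (2), the short script below evaluates the pixel count and daily downlink time directly from the solid-angle geometry. The survey parameters in the example call (f_sky = 0.5, N_exp = 4, 0.″14 pixels, 150 Mbps, 5 yr) are illustrative assumptions for this sketch, not mission requirements.

```python
# Illustrative check of Equations (1)-(2): pixel count and daily downlink time.
import math

ARCSEC2_PER_SR = (180.0 * 3600.0 / math.pi) ** 2    # arcsec^2 per steradian
FULL_SKY_ARCSEC2 = 4.0 * math.pi * ARCSEC2_PER_SR   # whole sky in arcsec^2

def n_pixels(pixel_scale_arcsec, f_sky, n_exp):
    """Total pixels collected by a survey (Equation 1)."""
    return FULL_SKY_ARCSEC2 / pixel_scale_arcsec**2 * f_sky * n_exp

def downlink_hours_per_day(bpp, pixel_scale_arcsec, f_sky, n_exp,
                           duration_yr, rate_mbps):
    """Hours of downlink per day needed to telemeter the survey (Equation 2)."""
    total_bits = bpp * n_pixels(pixel_scale_arcsec, f_sky, n_exp)
    seconds_per_day = total_bits / (rate_mbps * 1e6) / (duration_yr * 365.25)
    return seconds_per_day / 3600.0

if __name__ == "__main__":
    npix = n_pixels(0.1, f_sky=1.0, n_exp=1)
    print(f"N_pix ~ {npix:.2e}")                  # ~5.3e13 pixels at 0.1''/pixel, full sky
    hrs = downlink_hours_per_day(3.0, 0.14, 0.5, 4, 5.0, 150.0)
    print(f"downlink ~ {hrs:.2f} hr/day")         # example survey configuration
```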
2. Compression algorithm
We consider most generally a compression algorithm that maps the digitized detector output value N into a compressed code i whenever N_i − ∆_i/2 < N ≤ N_i + ∆_i/2. Here N_i and ∆_i are the center and width, respectively, of the data range that codes to compressed value i. Note that N_{i+1} − N_i = (∆_i + ∆_{i+1})/2. We will assume that N is a debiased value that is linearly related to the estimated number of photoelectrons e by N = e/g, where g is the (inverse) gain of the system. We will assume initially that decompression consists of mapping i → N_i, i.e. we restore code i to the center N_i of the range of input values that it codes. In a later section we will suggest an alteration to the decompressed values which reduces the bias in reconstructed values.

Any vector of increasing N_i (or equivalently a vector of ∆_i > 0) defines a codec algorithm. We expect the bias and noise characteristics of the codec to depend primarily on the number of bits of noise retained in the compression, which we define as

b = \sigma_N/\Delta,   (3)

where σ_N is the RMS noise level of a pixel and ∆ is the typical ∆_i for the codes i that span the noise distribution of the pixel. Many astronomical detectors have noise described by

\sigma_N^2 = [e + (RN)^2]/g^2 = (N + c)/g, \qquad c \equiv (RN)^2/g,   (4)

with some read noise level RN added to the Poisson variance of the photoelectrons. In this case, if we wish to keep b independent of the signal level, we require

\Delta = \left[\frac{di}{dN}\right]^{-1} = \sigma_N/b   (5)
\Rightarrow \frac{di}{dN} = b\,\sqrt{g/(N+c)}   (6)
\Rightarrow i = \sqrt{k(N+c)}, \qquad k \equiv 4 g b^2.   (7)

Since the code i must be integral, we can more precisely express the nominal square-root compression algorithm as

i = \left[\sqrt{k(N+c)}\right] = \left[2b\sqrt{e + (RN)^2}\right],   (8)

where, in this equation only, [x] is the nearest integer to x. Gowen & Smith (2003) discuss more general offsets under the square root, but we stick to these values since they hold σ_N/∆ fixed across the dynamic range. If we define

\Delta'_i = \Delta_i - \Delta_{i-1},   (9)

then from Equation (7) we find that the square-root codec has

\Delta_i \approx 2\sqrt{(N+c)/k} = 2i/k   (10)
\Delta'_i \approx 2/k = \frac{1}{2 b^2 g},   (11)

i.e. the width ∆_i of each code range increases at a constant rate. If the raw data N are integral, then these ∆_i and ∆'_i must also be integral. If, for example, we desire b = 1 bit of noise and have gain g = 0.5 e/ADU, then ∆'_i = 1 produces the desired compression algorithm. When 1/(2 b² g) is non-integral, however, we will more generally obtain ⟨∆'_i⟩ = 1/(2 b² g). In fact we can define a class of compression algorithms as follows:

1. Choose a sequence of integers {s_1, s_2, ..., s_n} that will repeat cyclically to give the incremental range steps {∆'_1, ∆'_2, ...}.

2. Define the parameter b via

b^2 = \frac{1}{2 g \langle s_i \rangle}.   (12)

3. Choose a range ∆_0 for the first compression code that extends from N = 0 to N = RN/(gb). [The codec algorithm can be extended to small negative values of N if desired, using the fixed range width ∆_0 for small negative i values.]

Then the codec defined by this range-step sequence will maintain a nearly constant number b of bits per σ of pixel noise under the Poisson + read noise data model. The more nearly uniform the values s_i are, the better the codec performance will be, as shown below. For example, we can produce a codec with b = 1 and g = 1 by using code steps ∆'_i = {0, 1, 0, 1, 0, 1, ...}. We will refer to this as a "g = 1, {0,1} codec."

When considering the compression to act upon integer data, we denote the probability of the raw pixel value being N as P_N. For notational convenience we will assume that there exists a continuous and differentiable function P(x) over x ≥ 0 that equals P_N at x = N. For example, the Poisson formula for P_N is easily extended to non-integral values with a gamma function. In the CCD data the Poisson signal has a Gaussian read noise added to it. We will assume the read noise has zero mean. Before digitization we can hence consider the pixel photoelectron count e to be a real number, with probability distribution P(e). In the limit of high gain g we can consider the detector output to be a real number e or N; at finite gain, P_N is again defined only at integral N, but we can assume there exists some analytic extension to the full real domain P(N). Most of our results are general to any well-behaved P(N) or P(e), and we will note when the assumption of Poisson or Poisson+Gaussian noise models is important.
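The nominal square-root codec of Equation (8) amounts to a fixed lookup table; the sketch below illustrates one possible implementation of the compress/decompress round trip. The parameter choices (b = 1, g = 0.5 e/ADU, RN = 4 e) match the examples used later in the paper, and the function names are ours for illustration, not part of any flight software.

```python
import numpy as np

def sqrt_compress(N, b=1.0, g=0.5, RN=4.0):
    """Nominal square-root compression, Equation (8):
    i = [2 b sqrt(e + RN^2)] with e = g*N and [] the nearest integer."""
    e = g * np.asarray(N, dtype=float)
    return np.rint(2.0 * b * np.sqrt(np.clip(e, 0.0, None) + RN**2)).astype(int)

def build_decode_table(max_code, b=1.0, g=0.5, RN=4.0):
    """Decode table mapping code i -> center N_i of the raw values that map to i.
    Built by brute force over a bounded input range, for illustration only."""
    k = 4.0 * g * b**2
    c = RN**2 / g
    N_max = int((max_code + 1) ** 2 / k - c) + 2     # invert Eq. (7) to bound the range
    N = np.arange(0, N_max)
    codes = sqrt_compress(N, b, g, RN)
    table = np.zeros(max_code + 1)
    for i in range(max_code + 1):
        members = N[codes == i]
        if members.size:
            table[i] = 0.5 * (members.min() + members.max())   # range center N_i
    return table

# round trip: raw ADU -> code -> reconstructed value N_i
decode = build_decode_table(max_code=200)
raw = np.array([0, 40, 80, 200, 1000])
rec = decode[sqrt_compress(raw)]
print(rec - raw)   # codec errors, each at most ~Delta_i/2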
3. Codec noise
A raw pixel value N will incur an error δN = N_i − N upon being encoded and decoded through code i. If the input values are uniformly distributed across the range of code i, then the RMS error induced by the codec is σ_codec = ∆_i/√12. If we make the further simplifying assumptions that the coding error is uncorrelated with the raw N and that ∆ is nearly constant for all i accessed by P(N), then the variance of the output is

\sigma^2 = \sigma_N^2 + \sigma_{\rm codec}^2 = \sigma_N^2\left(1 + \frac{1}{12 b^2}\right).   (13)

The integration time required for a shot-noise-limited astronomical exposure to reach fixed signal-to-noise ratio (S/N) will be inflated by

\sigma^2/\sigma_N^2 = 1 + \frac{1}{12 b^2}.   (14)

If we want the codec to cause < 10% penalty in exposure time, we hence need b ≥ 0.91. In this paper we will focus our attention on this question: does a square-root algorithm with b ≥ 1 cause any bias or degradation in astronomical measurements beyond the expected ≈ 4% increase in noise level?

A floor on the bandwidth required to transmit real images is given by the bpp found for pure-noise images. The minimum bpp required for lossless compression of a sequence of codes i with probabilities p_i is the Shannon entropy H = −Σ_i p_i log₂ p_i. For nearly-Gaussian probability distributions digitized to b bits per σ, the Shannon entropy is H ≈ log₂√(2πe) + log₂ b (e.g. Romeo et al. 1999). For b = 1 a numerical evaluation of the Shannon entropy gives H ≈ 2.1 bits per pixel. We generate a pure-noise image with mean signal e = 40 photoelectrons and read noise RN = 4 e, apply square-root compression with b = 1, then apply lossless bzip2 or CCSDS 121B (CCSDS 1997) compression. The resulting image requires 2.4–2.5 bits per pixel, within 0.3–0.4 bpp of the Shannon limit. We can therefore consider 2.4 bits per pixel to be a floor on the telemetry required when using a pixel-by-pixel lossy compression step, subject to ≤ 10% integration-time penalty from codec errors. We note that the "raw" data digitized at g = 1 would have b = σ_N/∆ ≈ 7.5 for these values of e, RN, and g, as expected.
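The entropy floor quoted above can be checked numerically with a short Monte Carlo sketch: it draws Poisson + Gaussian pixel values (mean 40 e, RN = 4 e), applies the nominal b = 1 square-root compression, and evaluates the Shannon entropy from the empirical code frequencies. This is a rough estimate under those assumptions, not the exact calculation used in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def shannon_entropy_bits(codes):
    """H = -sum p_i log2 p_i, estimated from the empirical code frequencies."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# pure-noise pixels: Poisson signal of 40 e plus zero-mean Gaussian read noise of 4 e
n_pix = 1_000_000
e_mean, RN, b = 40.0, 4.0, 1.0
electrons = rng.poisson(e_mean, n_pix) + rng.normal(0.0, RN, n_pix)

# nominal square-root compression, Equation (8)
codes = np.rint(2.0 * b * np.sqrt(np.clip(electrons, 0.0, None) + RN**2)).astype(int)

print(f"entropy of compressed codes: {shannon_entropy_bits(codes):.2f} bits/pixel")
# compare with the Gaussian estimate log2(sqrt(2*pi*e)) + log2(b) ~ 2.05 bits
```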
4. Reconstruction bias
The expectation value of the raw data N̄ = ⟨N⟩ should be unchanged after compression and decompression. We will denote by δN̄ the bias in the expectation value of the data after the codec operates. In practice this bias is not a problem if it is independent of N̄, since it is easily corrected. If δN̄ varies with N̄ and/or with the details of the P(N) distribution, then the response of the detector effectively becomes nonlinear and/or ambiguous to correct, which can bias photometry and shape measurements.

Any codec bias could of course be corrected if the raw distribution P_N were known at each pixel. But the mean level N̄ will vary from pixel to pixel due to background gradients and sources, so there is no single P_N that applies to the entire image. We would prefer to debias the data by decompressing code i to N_i − δ_i, where δ_i is some small correction that is independent of the raw probability distribution. There is no guarantee that such a solution exists, but we will show that a good approximation does exist for cases when P(N) spans multiple codes.

We begin with the case in which the raw value N can be considered a continuous real variable. This case is instructive if not necessarily realistic. The center and width N_i and ∆_i of the range for each code i can be set to any positive real numbers. The codec bias can be expressed as

\delta\bar N = -\sum_i \int_{-\Delta_i/2}^{\Delta_i/2} du\, u\, P(N_i + u),   (15)

where u = N − N_i is the distance from the center of code step i. If we can find a set of δ_i such that

\sum_i \int_{-\Delta_i/2}^{\Delta_i/2} du\, \delta_i\, P(N_i + u) = \delta\bar N   (16)

holds for any distribution P(N), then we will have a complete de-biasing prescription for the codec.

When the distribution P(N) is wider than a code step width ∆, it can be approximated by the Taylor expansion of P(N) about N_i:

P(N_i + u) = P_i + P'_i u + \tfrac{1}{2} P''_i u^2 + \tfrac{1}{6} P'''_i u^3 + O[(u/\sigma_N)^4],   (17)

with P'_i = P'(N_i), etc. We will truncate the series at cubic order henceforth. The contribution of the neglected terms will scale as (∆/σ_N)⁴ = b⁻⁴. We will later perform a numerical check of the significance of the neglected terms. Substituting the Taylor expansion into Equation (15) gives

\delta\bar N \approx -\sum_i \Delta_i \left[ \frac{P'_i \Delta_i^2}{12} + \frac{P'''_i \Delta_i^4}{480} \right].   (18)

If this sum over i were an integral, we would evaluate this quantity through integration by parts. We can perform an analogous discrete process by defining P_{i±1/2} ≡ P(N_i ± ∆_i/2) and noting that, to our chosen order of Taylor expansion, we have

\Delta_i P'_i = P_{i+1/2} - P_{i-1/2} - \frac{\Delta_i^3 P'''_i}{24}.   (19)

Substituting this into Equation (18) yields

\delta\bar N = -\sum_i \left[ \frac{\Delta_i^2}{12}\left(P_{i+1/2} - P_{i-1/2}\right) - \frac{\Delta_i^5}{720} P'''_i \right]   (20)
             = \sum_i \left[ \frac{\Delta_i}{12}\left(\Delta'_i P_{i-1/2} + \Delta'_{i+1} P_{i+1/2}\right) + \frac{\Delta_i^5}{720} P'''_i \right].   (21)

Recalling that ∆'_i = ∆_i − ∆_{i−1}, it is clear first that the codec bias is very small if the step width ∆ is constant (∆' = 0). This would be the case if the codec were a simple decimation over the range of the data, as is the case for normal digitization of the data.

The perfect square-root compressor of real-valued raw data, Equation (8), has the unique property that ∆'_i = ∆' is constant. In this special case we can simplify further and use the Taylor expansion one more time to get P_{i+1/2} + P_{i−1/2} = 2P_i + ∆_i² P''_i/4, and now we have

\delta\bar N = \sum_i \Delta_i \left[ \frac{\Delta'}{6}\left(P_i + \frac{\Delta_i^2 P''_i}{8}\right) + \frac{\Delta_i^4}{720} P'''_i \right].   (22)

Second, it is clear that the square-root codec bias will be nearly independent of P(N), because Σ_i ∆_i P_i → ∫ dN P(N) = 1 when the noise spans many codes i, so we expect there to be a bias

\delta\bar N = \frac{\Delta'}{6} = \frac{1}{12 b^2 g}.   (23)

A constant bias is relatively benign; for example, it does not influence the appearance of a sky-subtracted image. It is also easily remedied by decompressing code i to the value N_i − δ^{(0)}, where δ^{(0)} = ∆'/6. After making this adjustment, the residual bias, to third order in the Taylor expansion of each code range, becomes

\delta\bar N = \sum_i \Delta_i \left[ \frac{\Delta' \Delta_i^2}{48} P''_i + \frac{\Delta_i^4}{720} P'''_i \right].   (24)

Further application of the "integration by parts" technique shows that the leading terms in this residual bias scale as ∫ dN ∆' P'(N), which is zero if ∆' is constant across N. Hence we expect the reconstruction bias to be very close to the constant 1/(12 b² g) for the reconstruction of real-valued raw data with b ≥ 1, i.e. whenever the intrinsic noise spans multiple compression code steps.

To summarize the key results: a simple decimation codec (∆_i = constant) reproduces the mean of the input data with no bias, and the square-root codec (∆'_i = constant) biases the mean by the constant value in Equation (23), independent of the distribution P(N) of the raw data. These statements are very good approximations when b ≥ 1.

Next we consider the case when the raw data N must be integral. In this case ∆_i and ∆'_i must be integral, and N_i is integral or half-integral depending upon whether ∆_i is odd or even, respectively. The integrations over N in Equation (15) must be replaced by sums over the allowed integral values of N. We obtain

\delta\bar N = -\sum_i \sum_j j\, P_{N_i + j},   (25)

where the inner sum runs over j ∈ {−(∆_i−1)/2, −(∆_i−3)/2, ..., (∆_i−3)/2, (∆_i−1)/2}.

We smoothly extend the discrete probabilities P_N to a continuous function P(N) that maintains the normalization condition ∫ dN P(N) = 1, and we again perform a 3rd-order Taylor expansion of P(N) about the center N_i of each code range. The bias can now be written

\delta\bar N = -\sum_i \left[ P'_i \sum_j j^2 + \frac{P'''_i}{6} \sum_j j^4 \right].   (26)

The sums of powers of integers (or half-integers) can be found from well-known formulae (see, e.g., http://mathworld.wolfram.com/FaulhabersFormula.html). The result is

\delta\bar N = -\sum_i \frac{\Delta_i(\Delta_i^2 - 1)}{12}\left( P'_i + \frac{3\Delta_i^2 - 7}{120}\, P'''_i \right).   (27)
The discrete version of integration by parts transforms this into an expression identical to Equation (21) up to small corrections in the P'''_i term:

\delta\bar N = \sum_i \Delta_i \left[ \frac{\Delta'_i P_{i-1/2} + \Delta'_{i+1} P_{i+1/2}}{12} + \frac{(\Delta_i^2-1)(2\Delta_i^2+7)}{1440}\, P'''_i \right].   (28)

For an integer-valued codec we cannot necessarily maintain constant ∆'_i for an arbitrary b; for example, with g = 1 and b = 1 we must have ⟨∆'_i⟩ = 1/2. Hence we must choose adjustments δ_i to each decompressed value that will best cancel the bias in Equation (28). After further manipulation it is possible to show that the leading-order terms in the bias are cancelled, independent of the form of P_N, if

\delta_i = \frac{\Delta'_{i+1} + \Delta'_i}{12} - \frac{\Delta'_{i+2} - \Delta'_{i+1} - \Delta'_i + \Delta'_{i-1}}{48}   (29)
         = \frac{\Delta_{i+1} - \Delta_{i-1}}{12} - \frac{\Delta_{i+2} - 2\Delta_{i+1} + 2\Delta_{i-1} - \Delta_{i-2}}{48}.   (30)

To summarize: the compression algorithm is defined by a repeating integer sequence {s_i} for the ∆' values, along with a gain g and read noise RN, as described in § 2. It retains b = (2 g ⟨s_i⟩)^{-1/2} bits per σ of input noise. On decompression, each compressed code i should be replaced with the value N_i − δ_i, with δ_i given by Equation (30). One should prefer the most uniform possible sequences {s_i}. A constant value (nominally ∆' = 1) will yield a "perfect" square-root compression with minimal reconstruction bias. An alternating sequence, i.e. {0,1}, has a constant bias correction δ_i but slightly larger biases. Longer {s_i} sequences have more complicated and less precise bias corrections.

Fig. 1.— Performance of four square-root codecs on data with Poisson noise plus 4 e RMS Gaussian read noise. The bottom panel shows the center values N_i and the range widths ∆_i of each step for the four codecs. The red (solid), green (dotted), and blue (long dash) codecs each produce b = 1 bit per noise σ_N of the input value, with decreasing levels of smoothness in the derivative ∆' of the range width. The magenta (short dash) codec retains more noise, b = 1.22. The top panel shows the increase in data RMS due to the codec, which is independent of input level. The central panel shows the bias in the reconstructed data stream, with the noise level of this measure indicated by the black error bar. Bias levels are O(0.001 e), in some cases consistent with zero within measurement noise. See text for further detail.

Figure 1 shows the noise and bias introduced by several versions of the square-root codec. In each case the raw data are drawn from a Poisson distribution with the mean number of photoelectrons plotted on the x axis, ranging from 40–100 e. The data are taken to have Gaussian read noise with zero mean and 4 e RMS. Each plotted point is the result of a large number of trials from this distribution. Each trial produces an integral raw N value and a real-valued decompressed value N_i − δ_i.

We examine 4 different codecs. The red, green, and blue curves are designed to produce b = 1 bit of noise. The red curve is a "perfect" square-root codec, with ∆' = 1 and g = 0.5 e per ADU. The green and blue curves have g = 1 and {s_i} sequences of {0,1} and {0,0,1,1}, respectively. The magenta curve has g = 1 and step sequence {0,0,1}, retaining b = 1.22 bits of noise.

The bottom panel plots the mean gN_i and range g∆_i of each code step i for the four codecs, expressed in units of photoelectrons. The top panel shows the RMS fluctuation of the decoded data, relative to the RMS fluctuation of the input data. All of the codecs inflate the RMS
by almost exactly the factor (1 + 1/(12 b²))^{1/2} predicted in Equation (13). In particular, the b = 1 codecs inflate the noise by 4%.

The central panel plots the mean difference between decompressed and raw data. The expected noise in this measurement from the trials is plotted in black at left. The "perfect" codec, in red, has bias consistent with zero within errors, at the 0.001 e level: a few parts in 10⁵ of the input values, ≈ 10⁻⁴ of the typical noise σ_N and code range ∆ (7–10 e), and < 1% of the uncorrected bias 1/(12 b² g) ≈ 0.17 e. The {0,1}-sequence codec, in green, shows some significant residual bias at the ±0.002 e level, and the {0,0,1,1}-sequence codec (blue) has bias of ≈ ±0.004 electrons. While these biases are quite small, they do highlight that Equation (30) ignores some higher-order terms that make the bias dependent on P_N, and that it is preferable to use the smoothest possible step sequences. The {0,0,1} codec also has some bias at the 0.001 e level, despite having slightly less compression and lower noise than the "perfect" b = 1 codec in red.

These numerical tests ensure that square-root compression at b = 1 will introduce no significant biases into photometric measurements, and increases in errors will be limited to 4%.
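A compact way to exercise the step-sequence codec and the debiasing prescription of Equation (30) is the Monte Carlo sketch below, in the spirit of the Figure 1 test: it builds the integer code ranges from a repeating {s_i} sequence, decodes each code to N_i − δ_i, and measures the mean reconstruction error on Poisson + read-noise data. The g = 1, {0,1} default and the 40–100 e means follow the text; the harness itself is our own sketch, not the authors' code, and with 2 × 10⁶ trials per point the printout mixes residual bias and Monte Carlo noise at the 10⁻³ ADU level.

```python
import numpy as np

rng = np.random.default_rng(1)

def build_codec(s_seq, g=1.0, RN=4.0, n_codes=400):
    """Integer code widths, last members, and debiased decode values (Eq. 30)."""
    b = 1.0 / np.sqrt(2.0 * g * np.mean(s_seq))
    widths = np.empty(n_codes, dtype=int)
    widths[0] = max(1, round(RN / (g * b)))              # first code spans the read noise
    for i in range(1, n_codes):
        widths[i] = widths[i - 1] + s_seq[(i - 1) % len(s_seq)]
    starts = np.concatenate(([0], np.cumsum(widths)[:-1]))   # first integer N of each code
    centers = starts + (widths - 1) / 2.0                    # N_i (integer or half-integer)
    delta = np.zeros(n_codes)
    for i in range(2, n_codes - 2):                          # Equation (30)
        delta[i] = (widths[i + 1] - widths[i - 1]) / 12.0 \
                 - (widths[i + 2] - 2 * widths[i + 1]
                    + 2 * widths[i - 1] - widths[i - 2]) / 48.0
    uppers = starts + widths - 1                             # last integer N of each code
    return uppers, centers - delta

def codec_bias(mean_e, s_seq=(0, 1), g=1.0, RN=4.0, n_trials=2_000_000):
    """Mean (decoded - raw) in ADU for Poisson + Gaussian data of mean mean_e electrons."""
    uppers, decoded = build_codec(s_seq, g, RN)
    e = rng.poisson(mean_e, n_trials) + rng.normal(0.0, RN, n_trials)
    N = np.rint(np.clip(e, 0.0, None) / g)
    codes = np.clip(np.searchsorted(uppers, N, side="left"), 0, len(decoded) - 1)
    return np.mean(decoded[codes] - N)

for mean in (40, 70, 100):
    # residual bias plus Monte Carlo noise; expect values of a few 0.001 ADU
    print(mean, f"{codec_bias(mean):+.4f} ADU")
```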
5. Shape measurement biases
Using simple simulated galaxies and a very basic shape-measurement routine, we demonstrate here that the codec does not induce any biases in the measurements of galaxy fluxes, sizes, or ellipticities at the part-in-10⁴ level. In this work we analyze simple galaxies with specified sizes and signal-to-noise ratios which span a range that we might expect in real images. A test of the codec with much more realistic images is presented by Vanderveld et al. (2009). In that work the galaxies have sizes, appearances, fluxes, and shapes that accurately mimic those we might expect from a visible survey telescope in orbit.

The simulated images are postage stamps constructed as follows:

1. Start with an exponential-profile galaxy, with intensity I ∝ e^{−r/r_0}. The exponential disk is convolved with a Gaussian of width σ = r_0/10, to suppress high frequencies associated with the r = 0 cusp of the exponential disk.

2. This circular galaxy is then sheared to an ellipticity e = (a − b)/(a + b) = 0.1, 0.3, or 0.5 (a and b are the major and minor axes of the galaxy). The major axis of the elliptical galaxy can have position angle φ = 0, 22.5°, or 45° from the x axis.

3. The galaxy is (de-)magnified until the resulting galaxy has a half-light radius of r_{1/2} = 2, 4, or 6 pixels.

4. The galaxy profile is shifted by a random fraction of a pixel in x and y, then convolved with the square pixel response function and sampled on a 1-pixel grid to yield a noiseless postage-stamp image.

5. The flux of the galaxy is adjusted until its total signal-to-noise ratio will be S/N = 15, 50, or 300 when noise is added in the following step. The S/N is defined as

(S/N)^2 = \sum_{i \in {\rm pixels}} I_i^2/\sigma_i^2,   (31)

where I_i and σ_i are the signal and noise in each image pixel.

6. A sky level of 50 electrons is added to each pixel's expected signal I_i. Then an integral "observed" photoelectron count for each pixel is determined by drawing from a Poisson distribution with mean I_i. Gaussian read noise with zero mean and σ = 4 electrons is added to give a floating-point measurement value for each pixel.

At this point two digitized versions of the postage-stamp image are produced. For the "codec" image, the photoelectron count is scaled by gain g = 0.5 e/ADU, digitized, compressed with a sequence-{1} perfect square-root codec, and then decompressed to a floating-point image following the debiasing procedures above. Note that the codec retains b = 1 bit per noise sigma on each pixel. For the "raw" image, we add Gaussian noise with RMS of σ_i/√(12 b²) to each pixel, so that the noise levels of the raw and codec images are expected to be equal. Then we scale by the gain g and digitize without square-root preprocessing.

Both the raw and codec versions of the postage-stamp galaxy are measured using a basic "adaptive moments" shape-measurement algorithm (Bernstein & Jarvis 2002) as follows: the pixel data are fit to an elliptical Gaussian model, with the flux, centroid, size, and ellipticity of the Gaussian allowed to vary. The sky background is fixed at 50/g = 100 ADU. There are hence 6 degrees of freedom in the galaxy model (and the Gaussian galaxy model is not an exact fit to the underlying exponential-profile galaxy).

For each choice of ellipticity e, orientation φ, and size r_{1/2} of the galaxy, we create 1000 simulated galaxies at S/N = 300, 30,000 realizations with S/N = 50, and 300,000 realizations with
S/N = 15. To test for biases induced by the codec, we measure the mean difference e_codec − e_raw between the ellipticities measured on the codec and the raw images. Similarly we measure the mean difference in the (log of) flux and size measured for the galaxies. To see if the codec induces extra shape noise, we compare the RMS variation of the fitted e in the codec data set to the RMS variation in the raw fits. We also check for additional variance in flux and size measurements.

Fig. 2.— The biases (left) and relative noise level (right) of measurements on codec images, relative to raw images, are plotted for several quantities measured on simulated galaxies. The details of the simulation and the measurement process are in the text. From top to bottom, we plot the bias in the (log of) galaxy flux, (log of) galaxy size, and then the two components of galaxy ellipticity. Red (outlined) points are for galaxies with S/N = 15, green (filled) have S/N = 50, and blue (shaded) have S/N = 300, with each point plotted at its input ellipticity. All data are fully consistent with the codec images yielding no bias or change in noise level from the raw images, at accuracies of ±10⁻⁴ for biases and < 1% in noise levels. [Noise levels for S/N = 300 cases are not plotted because there are too few trials for an accurate measure of RMS noise.]

Figure 2 presents the biases and noise levels on the ellipticity, size, and flux measurements of galaxies with r_{1/2} = 2 pixels and φ = 0. The results for larger galaxies and for different ellipticity orientations are similar and not plotted here. We find no evidence for any codec-induced bias in size, flux, or ellipticity measurements at any S/N or ellipticity. The number of trials is sufficient to determine these biases to ≈ ±10⁻⁴ for each configuration, approximately 10 times below the maximum permissible ellipticity measurement errors in a large WL survey (Amara & Réfrégier 2008; Huterer et al. 2006). We conclude that square-root compression at b = 1 does not induce any significant bias in weak-lensing shape measurements.

The S/N = 50 and S/N = 15 cases have enough trials to determine the measurement noise in e, size, and flux to ≪ 1%; as Figure 2 shows, we detect no change in the noise level of the codec measurements relative to the raw measurements at this accuracy.
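The paired-comparison methodology can be illustrated with the toy harness below, which measures a locally sky-subtracted flux on codec and raw versions of the same photon realization. The Gaussian-blob "galaxy", stamp size, and aperture are our own simplifications of the procedure described above, not the authors' pipeline; with 20,000 realizations the mean log-flux difference should come out consistent with zero at roughly the 10⁻⁴ level.

```python
import numpy as np

rng = np.random.default_rng(2)
g, b, RN, sky = 0.5, 1.0, 4.0, 50.0          # gain [e/ADU], bits of noise, read noise, sky [e]
k, c = 4.0 * g * b**2, RN**2 / g             # square-root codec constants of Eq. (7)

def codec_image(e):
    """Digitize, square-root compress (Eq. 8), and decompress with the 1/6 debias."""
    N = np.rint(np.clip(e, 0.0, None) / g)
    code = np.rint(np.sqrt(k * (N + c)))
    return (code**2 + 0.25) / k - c - 1.0 / 6.0        # range center minus delta_i

def raw_image(e, sigma_e):
    """Add noise matching the codec's extra variance, then digitize normally."""
    return np.rint((e + rng.normal(0.0, sigma_e / (np.sqrt(12.0) * b))) / g)

def sky_subtracted_flux(img, border):
    return img.sum() - img[border].mean() * img.size

# toy circular "galaxy": Gaussian blob plus flat sky on a 16x16 stamp
yy, xx = np.mgrid[:16, :16]
model = 200.0 * np.exp(-0.5 * ((xx - 7.5)**2 + (yy - 7.5)**2) / 2.0**2) + sky
sigma_e = np.sqrt(model + RN**2)
border = (xx == 0) | (xx == 15) | (yy == 0) | (yy == 15)

diffs = []
for _ in range(20000):
    e = rng.poisson(model) + rng.normal(0.0, RN, model.shape)   # same photons for both paths
    f_codec = sky_subtracted_flux(codec_image(e), border)
    f_raw = sky_subtracted_flux(raw_image(e, sigma_e), border)
    diffs.append(np.log(f_codec / f_raw))
diffs = np.array(diffs)
print(f"mean ln-flux difference: {diffs.mean():+.1e} +/- {diffs.std()/np.sqrt(diffs.size):.1e}")
```

The local sky subtraction is deliberate: any constant offset between the two reconstruction paths cancels exactly, so the statistic isolates signal-dependent codec biases of the kind that matter for photometry.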
6. Compression factors

6.1. Simulated scene
We create simulated images from a space-based visible survey telescope to estimate the bits per pixel (bpp) of telemetry needed to transmit the images after they have been compressed with the square-root algorithm followed by a lossless compression algorithm. The simulated images have three components: the expected cosmic scene; a uniform sky background; and bright linear features produced by cosmic rays in the detectors. Our test images are aimed at reproducing the appearance of images taken with an instrument resembling the CCD camera proposed for the JDEM/IDECS mission. The relevant characteristics assumed are: a 1.5-meter diameter telescope operating in an L2 Lissajous orbit, where the sky background is dominated by zodiacal light; a 200-second integration through a bandpass in the red part of the visible spectrum, with bandwidth ln(λ_max/λ_min) = 0.3; a fully-depleted, 200-micron-thick p-channel CCD (Roe et al. 2007); a scale of 0.″14 per pixel; and a point spread function (PSF) with EE50 radius ≈ 0.″14, after including all optical, mechanical, and detector-induced sources of image blur.

The cosmic scene is taken from deep moderate-latitude exposures taken with the ACS/WFC F606W filter aboard the Hubble Space Telescope in the course of a search for very faint outer-solar-system objects (Bernstein et al. 2004). These data combine many HST exposures to produce an image whose noise is far below that expected from a single JDEM/IDECS exposure as described above. The scene is scaled so that a source of magnitude I_AB = 31.09 produces 1 photoelectron. The HST image is blurred and repixelized to match the JDEM/IDECS specifications described above.

Diffuse sky emission is added assuming the high-ecliptic-latitude zodiacal background at L2. We estimate ≈ 45 electrons per pixel in the detector.
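For orientation, a sky level of this order can be reproduced from the photometric scaling quoted above (I_AB = 31.09 yields 1 photoelectron in the 200 s exposure) together with an assumed zodiacal surface brightness. The value of 22.7 AB mag arcsec⁻² used below is our illustrative assumption for the high-ecliptic-latitude background, not a number taken from the text.

```python
t_exp = 200.0           # s
pix_scale = 0.14        # arcsec per pixel
m_one_electron = 31.09  # AB mag giving 1 photoelectron in t_exp (from the text)
mu_sky = 22.7           # assumed zodiacal surface brightness, AB mag / arcsec^2

# electrons per arcsec^2 in t_exp: flux ratio relative to the 1-electron source
e_per_arcsec2 = 10.0 ** (-0.4 * (mu_sky - m_one_electron))
e_per_pixel = e_per_arcsec2 * pix_scale ** 2
print(f"sky level ~ {e_per_pixel:.0f} e per pixel in {t_exp:.0f} s")   # ~45 e
```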
Cosmic rays in space are isotropic and impinge uniformly on a sphere. From ESA's Space Environment Information System (SPENVIS) program, the cosmic-ray rate is found to be approximately 5 particles s⁻¹ cm⁻² over 4π steradians. This is the average rate for Galactic cosmic rays, which are dominated by protons. Low-energy particles (mainly solar protons) have little integrated effect on the total rate even without shielding. We ignore periods of enhanced low-momentum solar protons during solar storms; these episodes are known in advance and data taking is avoided.

We use SPENVIS to generate a momentum spectrum for a particular orbit (L2 in this case) at a particular solar epoch behind appropriate shielding (3 mm Al assumed here). The Monte Carlo simulation samples this spectrum and, for each momentum, an average ionization rate (e/µm) is calculated for the target material (Si in this case) according to a modified Bethe-Bloch formalism (Amsler et al. 2008), giving the charge deposited per µm of path length in silicon.

For a given exposure time, particle flux, and detector area, the total number of cosmic rays is calculated. We generate this number of cosmic rays uniformly over 4π sr. Each ray is randomly placed over the surface of the detector and tracked through the detector material. Tracing is done in uniform step lengths. For each step, a Landau distribution is sampled to determine the number of ions produced. Hard scatters and δ-rays are not simulated. For each ion generated in a step, a Gaussian lateral charge-diffusion distribution is randomly sampled. The diffusion constant is based on a measured 4 µm RMS for a 200 µm thick CCD operated at 100 V bias voltage. This is linearly scaled by the ion conversion height above the pixel plane. The diffused charge is projected to the pixel plane, where its position is discretized and added to any previous pixel charge.

In the simulated 200 s exposure, 7% of the pixels receive significant CR-generated charge. We also simulate an image with triple this dose of cosmic rays, which is simply the sum of three 200 s CR simulations.

The scene, cosmic-ray, and sky components are summed, and then Poisson noise and 4 e RMS read noise are added to each pixel. The noisy images are then scaled by gain g = 0.5 e/ADU and digitized, then compressed using a perfect square-root algorithm with step sequence {1}. The compressed
images are saved as 16-bit integer FITS-format images. A segment of a simulated image, before compression, is shown in Figure 3.

The Shannon entropy of the square-root-compressed image, calculated by direct measure of the probability p_i of each code in the image, is 2.8 bits per pixel. Tripling the cosmic-ray dose on the exposure increases the entropy to 3.6 bits per pixel, suggesting that the normal dose of cosmic rays is responsible for 0.4 bits per pixel of entropy in the image. Thus most of the 0.7-bit increase in entropy over the pure-noise image (as measured in § 3) is due to cosmic rays.

There are many possible lossless compression algorithms to apply to the square-root-compressed FITS images. This choice will of course not affect the biases in the data, but it will affect the compression ratio and the robustness of the telemetry stream. Applying the commonly available bzip2 algorithm to the FITS files yields the following results:

• The images simulating a 200-second exposure and 200 seconds' worth of cosmic-ray tracks occupy 3.1 bits per pixel after square-root and bzip2 compression.

• The information content of the images is dominated by the cosmic-ray component. If we triple the number of cosmic rays (mimicking a 600-second exposure, even though the cosmic scene and sky background are still 200 seconds' worth), we get a pessimistic value of ≈ 4.0 bits per pixel in the compressed images.

The bzip2 performance is again 0.3–0.4 bpp above the entropy of the image.

Full-frame bzip2 compression is not very robust: a single-bit transmission error can lead to the loss of a full image. A more robust algorithm is CCSDS 121B (CCSDS 1997), which has already been implemented on > 25 space missions. We find the file size after square-root pre-processing and CCSDS 121B lossless compression to be within 0.1 bits per pixel of the bzip2 results.

A different lossless compression scheme, optimized toward efficient compression of the linear cosmic-ray tracks that dominate the 2d image, might yield better compression yet. If further compression were desired, the most effective approach would be to identify cosmic-ray-contaminated pixels on the satellite and replace them with a constant tag value before compression, so that their entropy is not transmitted to the ground.
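The bits-per-pixel bookkeeping can be reproduced for a pure-noise frame with a few lines of standard tooling: the sketch below square-root compresses a simulated sky + read-noise image, packs the codes into bytes, and reports both the empirical Shannon entropy and the bzip2-compressed size. It uses generic numpy/bz2 calls rather than the CCSDS flight implementation, and the frame contains no cosmic rays or astronomical scene, so it should land near the ~2.1–2.5 bpp pure-noise floor rather than the 2.8–3.6 bpp of the full simulation.

```python
import bz2
import numpy as np

rng = np.random.default_rng(3)
g, b, RN, sky = 0.5, 1.0, 4.0, 45.0            # gain, bits of noise, read noise, sky level (e)

# simulated pure-noise frame: Poisson sky + Gaussian read noise, digitized to ADU
shape = (1024, 1024)
electrons = rng.poisson(sky, shape) + rng.normal(0.0, RN, shape)
N = np.rint(np.clip(electrons, 0.0, None) / g).astype(np.int32)

# square-root compression, Eq. (8)
codes = np.rint(2.0 * b * np.sqrt(g * N + RN**2)).astype(np.uint8)

# empirical Shannon entropy of the codes
_, counts = np.unique(codes, return_counts=True)
p = counts / counts.sum()
entropy = -np.sum(p * np.log2(p))

# size after lossless bzip2 compression of the packed codes
compressed = bz2.compress(codes.tobytes(), compresslevel=9)
bpp = 8.0 * len(compressed) / codes.size

print(f"entropy = {entropy:.2f} bits/pixel, bzip2 = {bpp:.2f} bits/pixel")
```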
Implementation of square-root preprocessing is very simple, requiring just a fixed lookup table to convert ADC output values to compressed codes. An even simpler algorithm is linear prescaling, i.e. decimation, in which the code range ∆ is fixed over the full dynamic range. As noted above, a decimation algorithm should be free of bias, just as the corrected square-root algorithm is measured to be free of bias. What are the relative merits of the two lossy codecs?

Fig. 3.— Section of an image used to test the square-root + lossless compression scheme. This simulates a 200-second CCD exposure with a 1.5-meter space telescope observing near the ecliptic pole from an L2 orbit. Most pixels are dominated by sky and read noise, and celestial objects are visible, but the cosmic-ray tracks dominate the information content of the image. The scale is in ADU, and the image gain is 0.5 photoelectrons per ADU. The image requires 3.1 bits per pixel to transmit after application of square-root compression and lossless bzip2 compression.

We compress our test images with a linear algorithm with code step ∆ chosen to yield b = 1 steps per noise σ at the sky level of the images. As noted e.g. in Pence et al. (2009), most pixels are at sky level, so a fixed b at sky level should yield similar compression performance. Indeed, for images with no cosmic rays we find that linear and square-root compression yield indistinguishable bpp values after the lossless step. As expected, the square-root algorithm performs better if the number of CR-impacted, high-signal pixels is substantial. In our nominal 200 s exposure, the square-root algorithm uses 0.12 fewer bpp than decimation. For a 600 s CR dose, the square-root algorithm requires 0.38 fewer bpp than decimation. This suggests that the square-root algorithm removes about 2 bits of noise from the average CR-impacted pixel. If an algorithm to remove cosmic rays before compression is implemented, then there would be little difference in compression ratio between the square-root and linear codecs. However, such an algorithm would require more processing power than the lookup table of the square-root codec, and one would have to ensure that objects of interest are not discarded with the cosmic rays.

We should expect the square-root algorithm to gain an advantage also in the case of fields with many objects brighter than the sky background, e.g. fields at low Galactic latitude or with large nearby galaxies.

The square-root algorithm has practical advantages as well, in that it will retain the desired b = 1 value as the sky background varies. The linear codec would need to change its step size ∆ to accommodate different sky levels, e.g. from changing filters or zodiacal background. An on-board algorithm could determine the sky noise level and choose ∆ appropriately for the linear codec. The square-root codec requires adjustment only when the bias level or gain of the detector ADC output changes.
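The entropy advantage of the square-root codec over linear decimation on CR-contaminated data can be illustrated with the sketch below, which builds a sky + read-noise frame, adds bright charge to 7% of the pixels, and compares the code entropies of the two schemes at the same b = 1 at sky level. The exponential cosmic-ray charge distribution (2000 e scale) is our own rough assumption for illustration, not taken from the SPENVIS-based simulation.

```python
import numpy as np

rng = np.random.default_rng(4)
g, b, RN, sky = 0.5, 1.0, 4.0, 45.0

def entropy_bits(codes):
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# sky + read-noise frame with a fraction of pixels hit by bright "cosmic rays"
n_pix = 1_000_000
electrons = rng.poisson(sky, n_pix) + rng.normal(0.0, RN, n_pix)
hit = rng.random(n_pix) < 0.07
electrons[hit] += rng.exponential(2000.0, hit.sum())     # assumed CR charge distribution
N = np.rint(np.clip(electrons, 0.0, None) / g)

# square-root codec vs. linear codec with the same b = 1 at the sky level
sqrt_codes = np.rint(2.0 * b * np.sqrt(g * N + RN**2))
delta_lin = np.sqrt(g * sky + RN**2) / g / b             # step = sky-level sigma in ADU
lin_codes = np.rint(N / delta_lin)

print(f"sqrt codec:   {entropy_bits(sqrt_codes):.2f} bits/pixel")
print(f"linear codec: {entropy_bits(lin_codes):.2f} bits/pixel")
```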
7. Conclusion
Future dark-energy surveys such as JDEM or Euclid (http://sci.esa.int/euclid) propose to survey a substantial fraction of the full sky with space-based visible/NIR imaging near the diffraction limit. A data compression scheme is needed to satisfy telemetry constraints, but the weak gravitational lensing measurements place very strict requirements on the degree to which the compression algorithm can bias the measured shapes or fluxes of galaxies. For this reason we choose to investigate lossy compression processes that operate on a pixel-by-pixel basis, fearing that algorithms examining pixel groups might induce correlated codec errors between pixels which bias the galaxy ellipticities.

Square-root pre-processing is known to be a good means of compressing shot-noise-dominated data streams while keeping a fixed number b of bits per σ of noise across the full detector dynamic range. The reduction of entropy by reducing the number of noise bits is essential to efficient subsequent lossless compression algorithms. We derive an expression for the bias in the naive reconstruction of square-root-compressed data and present a correction scheme, which we numerically verify is successful to high accuracy.

Since the noise induced by roundoff in the compression process increases the required observing time by 1/(12 b²), we confine our examination to b ≥ 1. We verify that flux and shape measurements of simple galaxy models are biased by ≲ 10⁻⁴ for a range of galaxy sizes, ellipticities, orientations, and signal-to-noise ratios. This is well within the weak lensing requirements. We verify this with simple postage-stamp galaxy images; Vanderveld et al. (2009) bound the shape biases determined from realistic simulations of the true extragalactic sky.

At b = 1 the Shannon entropy of the square-root-compressed images varies from 2.1 bits per pixel for pure noise to 3.6 bits per pixel for images with the expected astronomical scene and triple the expected number of cosmic rays. Lossless compression algorithms such as bzip2 achieve compression to within 0.3–0.4 bits of the Shannon limit. For simulated images best resembling expected JDEM CCD data, we find compression to 3.1 bits per pixel using bzip2. The size of the transmitted data stream depends primarily on the degree of cosmic-ray contamination once b = 1 is set: schemes to more efficiently compress or mask cosmic rays before compression might lead to lower bit rates. Higher cosmic-ray rates or more crowded astronomical fields, e.g. near the Galactic plane, will require more bits. Since pure-noise images require 2.4 bits per pixel with bzip2 at b = 1, compression below this level would require b < 1, and hence a higher observing-time penalty.

We thus validate square-root lossy compression as useful for JDEM data and capable of meeting requirements for low bias. Other approaches would be feasible as well; for instance, simple decimation to a fixed code step ∆ that equals the sky noise level should produce similar noise and bias properties. The presence of cosmic rays and/or bright objects in ≳ 10% of the pixels gives the square-root codec a useful advantage in the mean number of required telemetry bits per pixel. The square-root algorithm also has a robust and exceptionally simple implementation, a single lookup table, meaning that there is no need for on-board processing to recognize cosmic rays or determine the sky noise level.

It is thus expected that visible/NIR dark-energy extragalactic imaging survey data can be transmitted to the ground with 2.5–4 bits per pixel, depending upon the cosmic-ray flux level and the sophistication of onboard cosmic-ray identification or removal. This compression can be achieved with no significant bias of observed quantities, and with an 8% penalty in observing time to achieve fixed signal-to-noise ratio.

The work of JR and RAV was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with NASA, and was funded by JPL's Research and Technology Development Funds. GB is supported by grant AST-0607667 from the NSF and DOE grant DE-FG02-95ER40893. CB is supported by the Director, Office of Science, Office of High Energy Physics, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Fermilab is operated by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the United States Department of Energy.
REFERENCES
Amara, A., & Réfrégier, A. 2008, MNRAS, 391, 228

Amsler, C., et al. (Particle Data Group) 2008, Physics Letters B, 667, 1, available at http://pdg.lbl.gov/

Bernstein, G., & Jarvis, M. 2002, AJ, 123, 583

Bernstein, G. M., Trilling, D. E., Allen, R. L., Brown, M. E., Holman, M., & Malhotra, R. 2004, AJ, 128, 1364

Candès, E., Romberg, J., & Tao, T. 2006, IEEE Trans. on Information Theory, 52, 489

Consultative Committee for Space Data Systems 1997, Lossless Data Compression, CCSDS 121.0-B-1 Blue Book, Washington, DC 20546

Frieman, J. A., Turner, M. S., & Huterer, D. 2008, ARA&A, 46, 385

Gowen, R. A., & Smith, A. 2003, Review of Scientific Instruments, 74, 3853

Huterer, D., Takada, M., Bernstein, G., & Jain, B. 2006, MNRAS, 366, 101

Lin, H., & Marriner, J. 2003, SNAP collaboration memorandum

Nieto-Santisteban, M. A., Fixsen, D. J., Offenberg, J. D., Hanisch, R. J., & Stockman, H. S. 1999, Astronomical Data Analysis Software and Systems VIII, 172, 137

Pence, W. D., Seaman, R., & White, R. L. 2009, PASP, 121, 414

Press, W. H. 1992, Astronomical Data Analysis Software and Systems I, 25, 3

Roe, N. A., et al. 2007, Nuclear Instruments and Methods in Physics Research A, 572, 526

Romeo, A., Gaztañaga, E., Barriga, J., & Elizalde, E. 1999, International Journal of Modern Physics C, 10, 687

Seaman, R. L., White, R. L., & Pence, W. D. 2009, arXiv:0910.3733

Thompson, A. R., Moran, J. M., & Swenson, G. W. 2001, Interferometry and Synthesis in Radio Astronomy, 2nd Ed.

Vanderveld, R. A., et al. 2009