Steerable ePCA: Rotationally Invariant Exponential Family PCA
Zhizhen Zhao, Member, IEEE, Lydia T. Liu, and Amit Singer
Abstract—In photon-limited imaging, the pixel intensities are affected by photon count noise. Many applications, such as 3-D reconstruction using correlation analysis in X-ray free electron laser (XFEL) single molecule imaging, require an accurate estimation of the covariance of the underlying 2-D clean images. Accurate estimation of the covariance from low-photon-count images must take into account that pixel intensities are Poisson distributed, hence the classical sample covariance estimator is sub-optimal. Moreover, in single molecule imaging, including in-plane rotated copies of all images could further improve the accuracy of covariance estimation. In this paper we introduce an efficient and accurate algorithm for covariance matrix estimation of count noise 2-D images, including their uniform planar rotations and possibly reflections. Our procedure, steerable ePCA, combines in a novel way two recently introduced innovations. The first is a methodology for principal component analysis (PCA) for Poisson distributions, and more generally, exponential family distributions, called ePCA. The second is steerable PCA, a fast and accurate procedure for including all planar rotations in PCA. The resulting principal components are invariant to the rotation and reflection of the input images. We demonstrate the efficiency and accuracy of steerable ePCA in numerical experiments involving simulated XFEL datasets and rotated Yale B face data.
Index Terms—Poisson noise, X-ray free electron laser, steerable PCA, eigenvalue shrinkage, autocorrelation analysis, image denoising.
I. INTRODUCTION
X-ray free electron laser (XFEL) is an emerging imaging technique for elucidating the three-dimensional structure of molecules [1], [2]. Single molecule XFEL imaging collects two-dimensional diffraction patterns of single particles at random orientations. The images are very noisy due to the low photon count, and the detector count noise follows an approximately Poisson distribution. Since only one diffraction pattern is captured per particle and the particle orientations are unknown, it is challenging to reconstruct the 3-D structure at low signal-to-noise ratio (SNR). One approach is to use expectation-maximization (EM) [3], [4], but it has a high computational cost at low SNR. Alternatively, assuming that the orientations of the particles are uniformly distributed over the special orthogonal group SO(3), Kam's correlation analysis [5]–[12] bypasses orientation estimation and requires just one pass over the data, thus alleviating the computational cost. Kam's method requires a robust estimation of the covariance matrix of the noiseless 2-D images. This serves as our main motivation for developing efficient and accurate covariance estimation and denoising methods for Poisson data. Nevertheless, the methods presented here are quite general and can be applied to other imaging modalities involving Poisson distributions.

Principal Component Analysis (PCA) is widely used for dimension reduction and denoising of large datasets [13], [14]. However, it is most naturally designed for Gaussian data, and there is no commonly agreed upon extension to non-Gaussian settings such as exponential family distributions [13, Sec. 14.4]. For denoising with non-Gaussian noise, popular approaches reduce it to the Gaussian case by a wavelet transform such as a Haar transform [15]; by adaptive wavelet shrinkage [15], [16]; or by approximate variance stabilization such as the Anscombe transform [17]. The latter is known to work well for Poisson signals with large parameters, due to approximate normality. However, the normal approximation breaks down for Poisson distributions with a small parameter, such as photon-limited XFEL [18, Sec. 6.6].

ZZ is with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61820 USA (e-mail: [email protected]). LTL is with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA, 94720 (e-mail: [email protected]). AS is with the Department of Mathematics, and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, 08544 (e-mail: [email protected]).
Other methods are based on alternating minimization [19]–[21], singular value thresholding (SVT) [22], [23], and Bayesian techniques [24]. Many of the aforementioned methods, such as [19], are computationally intractable for large datasets and do not have statistical guarantees for covariance estimation. In particular, [21] applied the methodology of [19] to denoise a single image by performing PCA on clusters of patches extracted from the image; it is not suitable for our problem setting, where the goal is to simultaneously denoise a large number of images and estimate their covariance. Recently, [25] introduced exponential family PCA (ePCA), which extends PCA to a wider class of distributions. It involves the eigendecomposition of a new covariance matrix estimator, constructed in a deterministic and non-iterative way using moment calculations and shrinkage. ePCA was shown to be more accurate than PCA and its alternatives for exponential families. Its computational cost is similar to that of PCA, it has substantial theoretical justification building on random matrix theory, and its output is interpretable. We refer readers to [25] for experiments that benchmark ePCA against previous methods.

In XFEL imaging, the orientations of the particles are uniformly distributed over SO(3), and it is therefore equally likely to observe any planar rotation of a given 2-D diffraction pattern. Therefore, it makes sense to include all possible in-plane rotations of the images when performing ePCA. To this end, we incorporate steerability in ePCA by adapting the steerable PCA algorithm, which avoids duplicating rotated images [26]. The concept of a steerable filter was introduced in [27], and various methods were proposed for computing data-adaptive steerable filters [28]–[31]. We take into account the action of the group O(2) on diffraction patterns by in-plane rotation (and reflection if needed).
The resulting principal components are invariant to any O(2) transformation of the input images.

The new algorithm, which we refer to as steerable ePCA, combines ePCA and steerable PCA in a natural way. Steerable ePCA does not involve any optimization; all steps involve only basic linear algebra operations. The various steps include expansion in a steerable basis, eigendecomposition, eigenvalue shrinkage, and different normalizations. The mathematical and statistical rationale for all steps is provided in Section II.

We illustrate the improvement in estimating the covariance matrix through image denoising in Section III. Specifically, we introduce a Wiener-type filtering using the principal components. Rotation invariance enhances the effectiveness of ePCA in covariance estimation and thus achieves better denoising. In addition, the denoised expansion coefficients are useful in building rotationally invariant image features (i.e., bispectrum-like features [32]). We verify the benefits of steerable ePCA through numerical experiments with simulated XFEL diffraction patterns and a natural image dataset, the Yale B face database [33], [34]. As in the case of standard PCA, the computational complexity of steerable ePCA is lower than that of ePCA.

An implementation of steerable ePCA in MATLAB is publicly available at github.com/zhizhenz/sepca/.

II. METHODS
The goal of steerable ePCA is to estimate the rotationally invariant covariance matrix from images whose pixel intensities are affected by photon count noise. To develop this estimator, we combine our previous works on steerable PCA and ePCA in a novel way. The main challenge is that steerable PCA involves a (nearly orthogonal) change of basis from a Cartesian grid to a steerable basis. While multidimensional Gaussian distributions are invariant to orthogonal transformations, the multidimensional Poisson distribution is not. Therefore, steerable PCA and ePCA cannot be naively combined in a straightforward manner. Algorithm 1 details the steps of the combined procedure that achieves the goal of estimating a rotationally invariant covariance matrix from Cartesian grid images with pixel intensities following a Poisson distribution. The following subsections explain the main concepts underlying ePCA and steerable PCA, and how they are woven together in a manner that overcomes the aforementioned challenge.
A. The observation model and homogenization
We adopt the same observation model introduced in [25]. We observe n noisy images Y_i ∈ R^p (i.e., p is the number of pixels), for i = 1, ..., n. These are random vectors sampled from a hierarchical model defined as follows. First, a latent vector, or hyperparameter, ω ∈ R^p is drawn from a probability distribution P with mean µ_ω and covariance matrix Σ_ω. Conditional on ω, each coordinate of Y = (Y(1), ..., Y(p))^⊤ is drawn independently from a canonical one-parameter exponential family,

Y(j) | ω(j) ∼ p_{ω(j)}(y),  Y = (Y(1), ..., Y(p))^⊤,  (1)

with density

p_{ω(j)}(y) = exp[ω(j) y − A(ω(j))]  (2)

with respect to a σ-finite measure ν on R, where the j-th entry of the latent vector, ω(j) ∈ R, is the natural parameter of the family and A(ω(j)) = log ∫ exp(ω(j) y) dν(y) is the corresponding log-partition function. The mean and variance of the random variable Y(j) can be expressed as A′(ω(j)) and A′′(ω(j)), where we denote A′(ω) = dA(ω)/dω. Therefore, the mean of Y conditional on ω is

X := E(Y | ω) = (A′(ω(1)), ..., A′(ω(p)))^⊤ = A′(ω),

so the noisy data vector Y can be expressed as Y = A′(ω) + ε̃, with E(ε̃ | ω) = 0, and the marginal mean of Y is E[Y] = E[A′(ω)]. Thus, one can think of Y as a noisy realization of the clean vector X = A′(ω). However, the latent vector ω is also random and varies from sample to sample. In the XFEL application, X = A′(ω) are the unobserved noiseless images, and their randomness stems from the random (and unobserved) orientation of the molecule. We may write Y = A′(ω) + diag[A′′(ω)]^{1/2} ε, where the coordinates of ε are conditionally independent and standardized given ω.
The covariance of Y is given by the law of total covariance:

Cov[Y] = Cov[E(Y | ω)] + E[Cov[Y | ω]] = Cov[A′(ω)] + E diag[A′′(ω)].  (3)

(Here diag[x] for x ∈ R^p denotes a p × p diagonal matrix whose diagonal entries are x(j) for j = 1, ..., p.) The off-diagonal entries of the covariance matrix of the noisy images are therefore the same as those of the clean images. However, the diagonal of the covariance matrix (i.e., the variance) of the noisy images is further inflated by the noise variance. Unlike white noise, the noise variance here often changes from one coordinate to another (i.e., it is heteroscedastic). In ePCA, the homogenization step is a particular weighting method that improves the signal-to-noise ratio [25, Section 4.2.2]. Specifically, the homogenized vector is defined as

Z = diag[E A′′(ω)]^{−1/2} Y = diag[E A′′(ω)]^{−1/2} A′(ω) + ε.  (4)

Then the corresponding homogenized covariance matrix becomes

Cov[Z] = diag[E A′′(ω)]^{−1/2} Cov[A′(ω)] diag[E A′′(ω)]^{−1/2} + I.  (5)

This step is commonly called "prewhitening" in signal and image processing, while "homogenization" is more commonly used in statistics; the two terms are synonyms in the context of this paper. In Section II-C, we discuss how to estimate the homogenized rotationally invariant covariance matrix.

For the special case of Poisson observations Y ∼ Poisson_p(X), where X ∈ R^p is random, we can write Y = X + diag(X)^{1/2} ε. The natural parameter is the vector ω with ω(j) = log X(j), and A′(ω(j)) = A′′(ω(j)) = exp(ω(j)) = X(j). Therefore, we have E[Y] = E[X], and

Cov[Y] = Cov[X] + diag[E X].  (6)

In words, while the mean of the noisy images agrees with the mean of the clean images, their covariance matrices differ by a diagonal matrix that depends solely on the mean image.
If we homogenize the data by Z = diag(E[X])^{−1/2} Y, then Eq. (6) becomes

Cov[Z] = diag(E[X])^{−1/2} Cov[Y] diag(E[X])^{−1/2} = diag(E[X])^{−1/2} Cov[X] diag(E[X])^{−1/2} + I.  (7)

We can estimate the covariance of X from the homogenized covariance according to Eq. (7), i.e., Cov[X] = diag(E[X])^{1/2} (Cov[Z] − I) diag(E[X])^{1/2}. In Section II-E, we detail the corresponding recoloring step to estimate Cov[X]. In Alg. 1, Steps 1 and 5, with D_n = diag[f̄], correspond to the homogenization procedure in ePCA. We provide more details on how to estimate the homogenized rotationally invariant covariance matrix (Steps 1–5) in Sections II-B and II-C.
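For the Poisson case, the identities in Eqs. (6) and (7) are easy to verify numerically. The following is an illustrative sketch only (the paper's implementation is in MATLAB; the NumPy code and all variable names here are our own): it draws random clean vectors X, Poisson counts Y, prewhitens by diag(E[X])^{−1/2}, and compares the sample covariance of the homogenized data with the right-hand side of Eq. (7).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative check of Eqs. (6)-(7): Y ~ Poisson(X) with random clean X,
# and Z = diag(E[X])^{-1/2} Y is the homogenized (prewhitened) vector.
p, n = 5, 1_000_000
mu = rng.uniform(1.0, 3.0, size=p)            # mean clean image E[X]
X = mu * rng.uniform(0.5, 1.5, size=(n, p))   # random clean vectors
Y = rng.poisson(X)                            # photon-count observations

d = 1.0 / np.sqrt(mu)                         # diagonal of diag(E[X])^{-1/2}
Z = Y * d                                     # homogenized data

cov_Z = np.cov(Z, rowvar=False)
# Eq. (7): Cov[Z] = diag(E[X])^{-1/2} Cov[X] diag(E[X])^{-1/2} + I
pred = np.outer(d, d) * np.cov(X, rowvar=False) + np.eye(p)
print(np.max(np.abs(cov_Z - pred)))           # small (sampling error only)
```

Up to sampling error, the covariance of the homogenized data is the whitened clean covariance plus the identity, which is what makes the white-noise random matrix machinery of Section II-D applicable.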
B. Steerable basis expansion
Under the observation model in Section II-A, we develop a method that estimates the rotationally invariant Cov[X] efficiently and accurately from the image dataset Y. We assume that a digital image I is composed of discrete samples from a continuous function f with band limit c. The Fourier transform of f, denoted F(f), can be expanded in any orthogonal basis for the class of square-integrable functions on a disk of radius c. For the purpose of steerable PCA, it is beneficial to choose a basis whose elements are products of radial functions with Fourier angular modes, such as the Fourier-Bessel functions or 2-D prolate functions [35]. For concreteness, in the following we use the Fourier-Bessel functions, given by

ψ_c^{k,q}(ξ, θ) = N_{k,q} J_k(R_{k,q} ξ/c) e^{ıkθ} for ξ ≤ c, and 0 for ξ > c,  (8)

where (ξ, θ) are polar coordinates in the Fourier domain (i.e., ξ_1 = ξ cos θ, ξ_2 = ξ sin θ, ξ ≥ 0, and θ ∈ [0, 2π)); N_{k,q} = (c √π |J_{k+1}(R_{k,q})|)^{−1} is the normalization factor; J_k is the Bessel function of the first kind of integer order k; and R_{k,q} is the q-th root of the Bessel function J_k. We also assume that the functions of interest are concentrated in a disk of radius R in the real domain. In order to avoid aliasing, we only use Fourier-Bessel functions that satisfy the criterion [26], [36]

R_{k,q+1} ≤ 2πcR.  (9)

For each angular frequency k, we denote by p_k the number of components satisfying Eq. (9). The total number of components is p = Σ_{k=−k_max}^{k_max} p_k, where k_max is the maximal value of k satisfying Eq. (9).

Algorithm 1: Steerable ePCA (sePCA) and denoising
Input: Image data Y containing n images of size L × L.
Output: Rotationally invariant covariance estimator of the noiseless images and denoised images.
1: Compute the sample mean Ȳ = (1/n) Σ_{i=1}^n Y_i (ePCA)
2: Estimate the support size R and band limit c for the mean image (sPCA)
3: Compute the Fourier-Bessel expansion coefficients of F(Ȳ) and estimate the rotationally invariant sample mean f̄ as in Eq. (14) (sePCA)
4: Compute the variance estimate D_n = diag[f̄] (sePCA)
5: Prewhiten the image data, Z = D_n^{−1/2} Y (ePCA)
6: Estimate the band limit c for the whitened images (sPCA)
7: Compute the truncated Fourier-Bessel expansion coefficients of F(Z) and form the coefficient matrices A^{(k)}, for k = 0, ..., k_max (sPCA)
8: for k = 0, 1, ..., k_max do
9:   Compute the prewhitened sample covariance matrix S_h^{(k)} as in Eq. (16) and its eigendecomposition S_h^{(k)} = Û Λ Û* (ePCA)
10:  Shrink the top r_k eigenvalues according to Eq. (19), S_{h,η_{γ_k}}^{(k)} = Û η_{γ_k}(Λ) Û* (ePCA, sPCA)
11:  Compute the recoloring matrix B^{(k)} in Eq. (23) and D^{(k)} in Eq. (26) (sePCA)
12:  Recolor the covariance matrix, S_{he}^{(k)} = (B^{(k)})* · S_{h,η_{γ_k}}^{(k)} · B^{(k)} (sePCA)
13:  Compute the scaling coefficients α̂ in Eq. (27) and keep the components with α̂ > 0 (sePCA)
14:  Scale the covariance matrix, S_s^{(k)} = Σ_i α̂_i^{(k)} v̂_i^{(k)} (v̂_i^{(k)})*, where the eigendecomposition of S_{he}^{(k)} is Σ_i v̂_i^{(k)} (v̂_i^{(k)})* (ePCA)
15:  Denoise {A^{(k)}}_{k=0}^{k_max} as in Eqs. (29) and (30) (sePCA)
16: end for
17: Form the rotationally invariant covariance matrix estimator Ŝ((x,y),(x′,y′)) = G^{(0)}(x,y) S_s^{(0)} G^{(0)}(x′,y′)* + 2 Σ_{k=1}^{k_max} G^{(k)}(x,y) S_s^{(k)} G^{(k)}(x′,y′)* (sPCA)
18: Reconstruct the denoised images using Eq. (31) (sPCA)
We also denote γ_k = p_k/n for k > 0 and γ_0 = p_0/n. The inverse Fourier transform (IFT) of ψ_c^{k,q} is

F^{−1}(ψ_c^{k,q})(r, φ) = 2c√π (−1)^q ı^k R_{k,q} J_k(2πcr) / ((2πcr)^2 − R_{k,q}^2) e^{ıkφ} ≡ g_c^{k,q}(r) e^{ıkφ},  (10)

where g_c^{k,q}(r) is the radial part of the inverse Fourier transform of the Fourier-Bessel function. Therefore, we can approximate f using the truncated expansion

f(r, φ) ≈ Σ_{k=−k_max}^{k_max} Σ_{q=1}^{p_k} a_{k,q} g_c^{k,q}(r) e^{ıkφ}.  (11)

The approximation error in Eq. (11) is due to the finite truncation of the Fourier-Bessel expansion. For essentially bandlimited functions, the approximation error is controlled using the asymptotic behavior of the Bessel functions; see [37, Section 2] for more details. We evaluate the Fourier-Bessel expansion coefficients numerically as in [26] using a quadrature rule that consists of equally spaced points in the angular direction, θ_l = 2πl/n_θ with l = 0, ..., n_θ − 1, and a Gaussian quadrature rule in the radial direction, ξ_j for j = 1, ..., n_ξ, with associated weights w(ξ_j). Using the sampling criterion introduced in [26], the values of n_ξ and n_θ depend on the compact support radius R and the band limit c. Our previous work found that using n_ξ = ⌈4cR⌉ and n_θ = ⌈16cR⌉ results in highly accurate numerical evaluation of the expansion coefficients. To evaluate a_{k,q}, we need to sample the discrete Fourier transform of the image I, denoted F(I), at the quadrature nodes,

F(I)(ξ_j, θ_l) = (1/(2R)^2) Σ_{i_1=−R}^{R−1} Σ_{i_2=−R}^{R−1} I(i_1, i_2) exp(−2πı(ξ_j cos θ_l · i_1 + ξ_j sin θ_l · i_2)),  (12)

which can be evaluated efficiently using the nonuniform discrete Fourier transform [38], and we get

a_{k,q} ≈ Σ_{j=1}^{n_ξ} N_{k,q} J_k(R_{k,q} ξ_j / c) F̂(I)(ξ_j, k) ξ_j w(ξ_j),  (13)

where F̂(I)(ξ_j, k) is the 1-D FFT of F(I) on the concentric circle of radius ξ_j.
For real-valued images, it is sufficient to evaluate the coefficients with k ≥ 0, since a_{−k,q} = a*_{k,q}. In addition, the coefficients have the following properties: under counterclockwise rotation by an angle α, a_{k,q} changes to a_{k,q} e^{−ıkα}; and under reflection, a_{k,q} changes to a_{−k,q}. The numerical integration error in Eq. (13) was analyzed in [26] and is negligibly small for the chosen values of n_ξ and n_θ. The steerable basis expansion is applied in two parts of steerable ePCA: (1) the rotationally invariant sample mean estimation in Step 3 of Alg. 1, and (2) the expansion of the whitened images in Step 7 of Alg. 1.
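The component counts p_k and the cutoff k_max induced by the criterion in Eq. (9) can be computed directly from the Bessel roots. The following is a minimal Python sketch (not the authors' code; the function name and the example values c = 0.5 and R = 61 are illustrative):

```python
import numpy as np
from scipy.special import jn_zeros

def fb_component_counts(c, R, max_k=500, max_q=200):
    """Count Fourier-Bessel components per angular frequency k >= 0.

    Component (k, q) is kept when R_{k,q+1} <= 2*pi*c*R (Eq. (9)).
    Returns a dict {k: p_k} for every k with p_k > 0.
    """
    bound = 2 * np.pi * c * R
    counts = {}
    for k in range(max_k + 1):
        roots = jn_zeros(k, max_q)          # R_{k,1}, ..., R_{k,max_q}
        p_k = int(np.sum(roots[1:] <= bound))
        if p_k == 0:
            break          # Bessel roots interlace, so p_k is non-increasing in k
        counts[k] = p_k
    return counts

counts = fb_component_counts(c=0.5, R=61)
k_max = max(counts)
# total dimension p sums over k = -k_max, ..., k_max
p_total = counts[0] + 2 * sum(counts[k] for k in range(1, k_max + 1))
print(k_max, p_total)
```

Because the roots of J_k and J_{k+1} interlace, p_k decreases with k, so the loop can stop at the first empty frequency.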
C. The sample rotationally invariant homogenized covariance matrix
Suppose I_1, ..., I_n are n discretized input images sampled from f_1, ..., f_n. Here, the observation vectors Y_i for ePCA are simply Y_i = I_i. In ePCA, the first step is to prewhiten the data using the sample mean, as suggested by Eqs. (4) and (6). However, the sample mean Ȳ = (1/n) Σ_{i=1}^n Y_i is not necessarily rotationally invariant. With the band limit and support size estimated in Step 2, we compute the truncated Fourier-Bessel expansion coefficients of F(Ȳ), denoted by ā_{k,q}. The rotationally invariant sample mean can be evaluated from ā_{0,q},

f̄(r, φ) = (1/2π) ∫_0^{2π} (1/n) Σ_{i=1}^n f_i(r, φ − α) dα ≈ Σ_{q=1}^{p_0} ā_{0,q} g_c^{0,q}(r).  (14)

The approximation error in Eq. (14) follows directly from that of Eq. (11). The rotationally invariant sample mean is circularly symmetric. We denote by Ā a vector that contains all the coefficients ā_{0,q}, ordered by the radial index q. Although the input images are non-negative, the finite truncation may result in small negative values in the mean estimation, so we threshold any negative entries to zero. As mentioned in Section II-A, the covariance matrix of the Poisson observations differs from that of the clean images by a diagonal matrix whose diagonal entries are equal to the mean image. Therefore, in Step 4 of Alg. 1, we have D_n = diag[f̄], and it is used in Step 5 to compute the homogenized vectors, similar to Eq. (4). We prewhiten the images by the estimated mean image to create new images Z_1, ..., Z_n as Z_i(x, y) = f̄(x, y)^{−1/2} Y_i(x, y) when f̄(x, y) > 0, and Z_i(x, y) = 0 otherwise. The whitening step might change the band limit; therefore, we estimate the band limit of the whitened images in Step 6. Combining Eqs.
(9) and (13), we compute the truncated Fourier-Bessel expansion coefficients a^i_{k,q} of F(Z_i). Let us denote by A^{(k)} the matrix of expansion coefficients with angular frequency k, obtained by putting the a^i_{k,q} into a matrix whose columns are indexed by the image number i and whose rows are ordered by the radial index q. The coefficient matrix A^{(k)} is of size p_k × n.

We use Z̄ to represent the rotationally invariant sample mean of the whitened images. Under the action of the group O(2), i.e., counterclockwise rotation by an angle α ∈ [0, 2π) and reflection β ∈ {+, −}, where '+' indicates no reflection and '−' indicates reflection, the image Z_i is transformed to Z_i^{α,β}. Since the truncated Fourier-Bessel transform is almost unitary [26], the rotationally invariant covariance kernel S((x,y),(x′,y′)) built from the whitened image data with all possible in-plane rotations and reflections, defined as

S((x,y),(x′,y′)) = (1/(4πn)) Σ_{i=1}^n Σ_{β∈{+,−}} ∫_0^{2π} (Z_i^{α,β}(x,y) − Z̄(x,y)) (Z_i^{α,β}(x′,y′) − Z̄(x′,y′)) dα,  (15)

can be computed in terms of the IFT of the Fourier-Bessel basis and the associated expansion coefficients. Subtracting the sample mean is equivalent to subtracting (1/n) Σ_{j=1}^n a^j_{0,q} from the coefficients a^i_{0,q}, while keeping the other coefficients unchanged. Therefore, we first update the zero angular frequency coefficients by a^i_{0,q} ← a^i_{0,q} − (1/n) Σ_{j=1}^n a^j_{0,q}. In terms of the expansion coefficients, the rotationally invariant homogenized sample covariance matrix is S_h = ⊕_{k=−k_max}^{k_max} S_h^{(k)}, with

S_h^{(k)} = (1/n) Re{A^{(k)} (A^{(k)})*}.  (16)

We further denote the eigenvalues and eigenvectors of S_h^{(k)} by λ_i^{(k)} and û_i^{(k)}, that is,

S_h^{(k)} = Σ_{i=1}^{p_k} λ_i^{(k)} û_i^{(k)} (û_i^{(k)})*.  (17)

The procedure of homogenization for steerable ePCA is detailed in Steps 1–5 of Alg. 1. Step 9 of Alg. 1 computes the rotationally invariant homogenized sample covariance matrix.
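Because S_h decouples across angular frequencies, Step 9 never forms a full p × p covariance; it works with one small p_k × n coefficient matrix per frequency. A small synthetic sketch of Eq. (16) (illustrative NumPy; random complex matrices stand in for the Fourier-Bessel coefficients of the whitened images):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
p_ks = [8, 6, 4]                    # toy values of p_k for k = 0, 1, 2
A = [rng.standard_normal((pk, n)) + 1j * rng.standard_normal((pk, n))
     for pk in p_ks]

# k = 0 only: center the zero-frequency coefficients by their sample mean
A[0] = A[0] - A[0].mean(axis=1, keepdims=True)

# Eq. (16): S_h^(k) = (1/n) Re{ A^(k) (A^(k))^* }, '*' = conjugate transpose
S_h = [np.real(Ak @ Ak.conj().T) / n for Ak in A]

for pk, S in zip(p_ks, S_h):
    assert S.shape == (pk, pk)      # each block is small: p_k x p_k
    assert np.allclose(S, S.T)      # real symmetric
```

The eigendecompositions in Steps 9 and 10 are then performed block by block, which is the source of the complexity advantage over plain ePCA noted in Section II-F.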
D. Eigenvalue shrinkage
For data corrupted by additive white noise (with noise variance 1 in each coordinate), previous works [39]–[42] show that if the population eigenvalue ℓ is above the Baik, Ben Arous, Péché (BBP) phase transition, then the sample eigenvalue pops outside of the Marčenko-Pastur distribution of the "noise" eigenvalues. The sample eigenvalue converges to the value given by the spike forward map as p_k, n → ∞ with p_k/n = γ_k:

λ(ℓ; γ_k) = (1 + ℓ)(1 + γ_k/ℓ) if ℓ > √γ_k, and (1 + √γ_k)^2 otherwise.  (18)

The underlying clean population covariance eigenvalues can be estimated by solving the quadratic equation in Eq. (18),

ℓ̂ = η_{γ_k}(λ) = (1/2)(λ − 1 − γ_k + √((λ − 1 − γ_k)^2 − 4γ_k)) if λ > (1 + √γ_k)^2, and 0 if λ ≤ (1 + √γ_k)^2.  (19)

Shrinking the eigenvalues improves the estimation of the sample covariance matrix [43]. Since the homogenized sample covariance matrix S_h is decoupled into small sub-blocks S_h^{(k)}, the shrinkers are defined for each frequency k separately. The shrinker η_{γ_k}(λ) sets to zero all noise eigenvalues, i.e., those λ within the support of the Marčenko-Pastur distribution, and reduces the other eigenvalues according to Eq. (19). The denoised covariance matrices are then

S_{h,η}^{(k)} = Σ_{i=1}^{r_k} η_{γ_k}(λ_i^{(k)}) û_i^{(k)} (û_i^{(k)})*,  (20)

where r_k is the number of components with η_{γ_k}(λ) > 0. The empirical eigenvector û^{(k)} of S_h^{(k)} is an inconsistent estimator of the true eigenvector. We can heuristically quantify the inconsistency based on results from the Gaussian standard spiked model, even though the noise is non-Gaussian.
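The spike forward map of Eq. (18) and the shrinker of Eq. (19) that inverts it can be sketched as follows (illustrative Python; the function names are our own):

```python
import numpy as np

def spike_forward(ell, gamma):
    """Eq. (18): limiting sample eigenvalue for a population spike ell."""
    if ell > np.sqrt(gamma):
        return (1 + ell) * (1 + gamma / ell)
    return (1 + np.sqrt(gamma)) ** 2           # edge of the MP bulk

def eta(lam, gamma):
    """Eq. (19): estimate the population spike from a sample eigenvalue."""
    if lam <= (1 + np.sqrt(gamma)) ** 2:
        return 0.0                             # inside the Marcenko-Pastur bulk
    b = lam - 1 - gamma
    return 0.5 * (b + np.sqrt(b * b - 4 * gamma))

gamma = 0.2
for ell in [0.1, 0.5, 1.0, 3.0]:
    lam = spike_forward(ell, gamma)
    # eta recovers ell exactly when ell > sqrt(gamma); otherwise it returns 0
    print(ell, lam, eta(lam, gamma))
```

Algebraically, substituting λ = (1 + ℓ)(1 + γ/ℓ) into Eq. (19) gives (λ − 1 − γ)² − 4γ = (ℓ − γ/ℓ)², so η_γ(λ) = ℓ above the transition, which is exactly the inversion the shrinker performs.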
Under this model, the empirical and true eigenvectors have an asymptotically deterministic angle: ((u^{(k)})* û^{(k)})^2 → c^2(ℓ; γ_k) almost surely, where c^2(ℓ; γ_k) is the cosine forward map given by [41], [42]:

c^2(ℓ; γ_k) = (1 − γ_k/ℓ^2) / (1 + γ_k/ℓ) if ℓ > √γ_k, and 0 otherwise.  (21)

Therefore, asymptotically, for the population eigenvectors beyond the BBP phase transition, the sample eigenvectors have positive correlation with the population eigenvectors, but this correlation is less than 1 [44]–[46]. We denote by ĉ^2 an estimate of c^2 using the estimated clean covariance eigenvalues ℓ̂ in Eq. (19), and ŝ^2 = 1 − ĉ^2.

In short, Step 10 of Alg. 1 improves the estimation of the rotationally invariant homogenized covariance matrix through eigenvalue shrinkage.

E. Recoloring
Homogenization changes the direction of the clean eigenvectors. Therefore, after eigenvalue shrinkage, we recolor (heterogenize) the covariance matrix by conjugating the recoloring matrix B with S_{h,η}: S_{he} = B* · S_{h,η} · B. The recoloring matrix is derived as

B_{k_1,q_1; k_2,q_2} = ∫_0^R ∫_0^{2π} √(f̄(r)) F^{−1}(ψ_c^{k_1,q_1})(r, θ) (F^{−1}(ψ_c^{k_2,q_2})(r, θ))* r dr dθ = δ_{k_1,k_2} ∫_0^R √(f̄(r)) g_c^{k_1,q_1}(r) g_c^{k_2,q_2}(r) r dr,  (22)

since the angular integral of e^{ı(k_1−k_2)θ} vanishes unless k_1 = k_2. The matrix B therefore has a block diagonal structure and is decoupled for each angular frequency, B = ⊕_{k=−k_max}^{k_max} B^{(k)}, with

B^{(k)}_{q_1,q_2} = ∫_0^R √(f̄(r)) g_c^{k,q_1}(r) g_c^{k,q_2}(r) r dr.  (23)

The radial integral in Eq. (23) is numerically evaluated using the Gauss-Legendre quadrature rule [47, Chap. 4], which determines the locations of n_r = ⌈4cR⌉ points {r_j}_{j=1}^{n_r} on the interval [0, R] and the associated weights w(r_j). The integral in Eq. (23) is thus approximated by

B^{(k)}_{q_1,q_2} ≈ Σ_{j=1}^{n_r} √(f̄(r_j)) g_c^{k,q_1}(r_j) g_c^{k,q_2}(r_j) r_j w(r_j).  (24)

The recoloring step is also decoupled for each angular frequency sub-block. The heterogenized covariance estimators are

S_{he}^{(k)} = (B^{(k)})* · S_{h,η_{γ_k}}^{(k)} · B^{(k)}.  (25)

Similar to Eq. (23), we define D^{(k)}, which will be used to scale the heterogenized covariance matrix estimator (see Eq. (27)) and to denoise the expansion coefficients (see Eqs. (29) and (30)),

D^{(k)}_{q_1,q_2} = ∫_0^R f̄(r) g_c^{k,q_1}(r) g_c^{k,q_2}(r) r dr ≈ Σ_{j=1}^{n_r} f̄(r_j) g_c^{k,q_1}(r_j) g_c^{k,q_2}(r_j) r_j w(r_j).  (26)

In Alg. 1, Steps 11 and 12 summarize the recoloring procedure.

Fig. 1: Sample clean and noisy images of the XFEL dataset, with mean per-pixel photon count 0.01.

Fig. 2: Estimating R and c from simulated noisy XFEL diffraction intensity maps of lysozyme. (a) The radial profile of the rotationally invariant sample mean image; the radius of compact support is chosen at R = 61. (b) Mean radial power spectrum of the whitened noisy images; the curve levels off at σ² = 1.

Fig. 3: Eigenimages estimated from noisy XFEL data using PCA, steerable PCA (sPCA), ePCA, and steerable ePCA (sePCA), ordered by eigenvalues. Input images are corrupted by Poisson noise with mean photon count 0.01 (shown in Figure 1b).

F. Scaling
The eigendecomposition of S_{he}^{(k)} gives S_{he}^{(k)} = Σ_{i=1}^{r_k} v̂_i^{(k)} (v̂_i^{(k)})*. The empirical eigenvalues are t̂_i = ‖v̂_i^{(k)}‖^2, which are biased estimates of the true eigenvalues of the clean covariance matrix Σ_X. In [25, Sec. 4.2.3], a scaling rule was proposed to correct the bias. We extend it in Steps 13 and 14 of Alg. 1 to the steerable case and scale each eigenvalue of S_{he}^{(k)} by a parameter α̂^{(k)},

α̂^{(k)} = (1 − ŝ^2 τ^{(k)}) / ĉ^2 for 1 − ŝ^2 τ^{(k)} > 0 and ĉ^2 > 0, and 0 otherwise,  (27)

where the parameter τ^{(k)} = (tr D^{(k)} / p_k) · (ℓ̂ / ‖v̂^{(k)}‖^2). The scaled covariance matrices are

S_s^{(k)} = Σ_{i=1}^{r_k} α̂_i^{(k)} v̂_i^{(k)} (v̂_i^{(k)})*.  (28)

The rotationally invariant covariance kernel S((x,y),(x′,y′)) is well approximated by Σ_{k=−k_max}^{k_max} G^{(k)}(x,y) S_s^{(k)} (G^{(k)}(x′,y′))*, where G^{(k)} contains the IFT of all ψ_c^{k,q} with angular frequency k (see Step 17 of Alg. 1). The computational complexity of steerable ePCA is O(nL^3 + L^4), the same as steerable PCA, and it is lower than the complexity of ePCA, which is O(min(nL^4 + L^6, n^2 L^2 + n^3)).

In summary, Step 14 scales the heterogenized covariance matrix S_{he}. The covariance matrix in the original pixel domain is efficiently computed from S_s^{(k)} in Step 17.

G. Denoising
As an application of steerable ePCA, we develop a method to denoise photon-limited images. Since the Fourier-Bessel expansion coefficients are computed from the prewhitened images, we first recolor the coefficients by multiplying B^{(k)} with A^{(k)}, and then apply Wiener-type filtering to denoise the coefficients. For the steerable basis expansion coefficients A^{(k)} with angular frequency k ≠ 0,

Â^{(k)} = S_s^{(k)} (D^{(k)} + S_s^{(k)})^{−1} B^{(k)} A^{(k)}.  (29)

For k = 0, we need to take into account the rotationally invariant mean expansion coefficients,

Â^{(0)} = S_s^{(0)} (D^{(0)} + S_s^{(0)})^{−1} B^{(0)} A^{(0)} + D^{(0)} (D^{(0)} + S_s^{(0)})^{−1} Ā 1_n^⊤,  (30)

where 1_n denotes the length-n all-ones vector, so that the rotationally invariant mean coefficients Ā are added back for every image. The denoised images sampled on the Cartesian grid (x, y) in the real domain are computed from the filtered expansion coefficients â^i_{k,q},

X̂_i(x, y) = Σ_{q=1}^{p_0} â^i_{0,q} g_c^{0,q}(r_{x,y}) + 2 Re[Σ_{k=1}^{k_max} Σ_{q=1}^{p_k} â^i_{k,q} g_c^{k,q}(r_{x,y}) e^{−ıkθ_{x,y}}],  (31)

where r_{x,y} = √(x^2 + y^2) and θ_{x,y} = tan^{−1}(y/x). In essence, we first denoise the recolored steerable expansion coefficients in Step 15 of Alg. 1 according to Eqs. (29) and (30), and then reconstruct the images using Eq. (31) in Step 18 of Alg. 1.

Fig. 4: Error of covariance matrix estimation, measured as the (a) operator norm and (b) Frobenius norm of the difference between each covariance estimate and the true covariance matrix. The sample size n ranges from 100 to 140,000.

Fig. 5: The estimated number of signal principal components using PCA, sPCA, ePCA, and sePCA. Images are corrupted by Poisson noise with mean per-pixel photon count 0.01.

Fig. 6: The runtimes for computing the principal components using PCA, sPCA, ePCA, and sePCA. Images are corrupted by Poisson noise with mean per-pixel photon count 0.01.

Fig. 7: Sample clean and noisy images of randomly rotated Yale B face data.
Image size is 64 × 64 with mean intensity 2.3 photons per pixel.

III. NUMERICAL RESULTS
We apply PCA, ePCA, steerable PCA, and steerable ePCA to a simulated XFEL dataset and compare the results for covariance estimation and denoising. The algorithms are implemented in MATLAB on a machine with 60 cores, running at 2.3 GHz, with a total RAM of 1.5 TB. Only 8 cores were used
Fig. 8: Eigenimages estimated from noisy rotated Yale B face data using PCA, sPCA, ePCA, and sePCA, ordered by eigenvalues. Input images are corrupted by Poisson noise with mean intensity 2.3 photons per pixel (shown in Fig. 7b).

in our experiments.

We simulate n = 140,000 noiseless XFEL diffraction intensity maps of a lysozyme (Protein Data Bank entry 1AKI) with Condor [48]. The average pixel intensity is rescaled to 0.01 for image size 128 × 128 pixels, such that shot noise dominates [49]. To sample an arbitrary number n of noisy diffraction patterns, we sample an intensity map at random, and then sample the photon count of each detector pixel from a Poisson distribution whose mean is the pixel intensity. Figs. 1a and 1b illustrate the clean intensity maps and the resulting noisy diffraction patterns.

We estimate the radius of the concentration of the diffraction intensities in the real domain and the band limit in the Fourier domain from the noisy images in the following way. The data variance is proportional to the sample mean. Therefore, we estimate the rotationally invariant sample mean and show its radial profile in Fig. 2a, which is also the variance map of the dataset averaged in the angular direction. At large r, the radial part of the rotationally invariant sample mean levels off at a constant. We compute the cumulative variance by integrating the radial sample mean over r with the Jacobian weight r dr. The fraction of the cumulative mean exceeds 99.9% at r = 61, and therefore R was chosen to be 61 (see Fig. 2a). We compute the whitened projection images using the rotationally invariant sample mean. For the whitened images, we compute the angular average of the mean 2-D power spectrum. The curve in Fig. 2b levels off at the noise variance σ² = 1 when ξ is large. We use the same method as before to compute the cumulative radial power spectrum, and choose the band limit c as the frequency at which the fraction reaches 99.9%. The band limit c and support radius R are used in both steerable PCA and steerable ePCA.
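The 99.9% cutoff rule used above for both the support radius R and the band limit c can be sketched in a few lines. This is a minimal numpy illustration under our own naming, not the paper's MATLAB code; the Gaussian radial profile is a toy stand-in for the rotationally invariant sample mean.

```python
import numpy as np

def cutoff_radius(radial_profile, radii, fraction=0.999):
    # Cumulative energy of the radial profile with the Jacobian weight r dr;
    # return the smallest radius containing `fraction` of the total energy.
    weighted = radial_profile * radii
    cum = np.cumsum(weighted)
    cum /= cum[-1]
    return radii[np.searchsorted(cum, fraction)]

# Toy radial sample mean: a Gaussian falloff sampled on a radial grid.
radii = np.linspace(0.0, 64.0, 641)[1:]   # skip r = 0
profile = np.exp(-(radii / 15.0) ** 2)

R = cutoff_radius(profile, radii)          # support radius estimate
```

The same routine applied to the cumulative radial power spectrum of the whitened images (with frequency ξ in place of r) yields the band limit c.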
A. Covariance estimation and principal components
Fig. 3 shows the top 12 eigenimages for clean XFEL diffraction patterns (Fig. 3e) and for noisy diffraction patterns with mean photon count per pixel 0.01 (Figs. 3a–3d and 3f–3i), using PCA, steerable PCA, ePCA, and steerable ePCA. The true eigenimages in Fig. 3e are computed from 70000 clean diffraction patterns whose orientations are uniformly distributed over SO(3). Figs. 3a–3d and 3f–3i are computed from noisy images. Since the number of samples is much smaller than the size of the image and the noise is non-Gaussian, PCA can only recover the first two or three components. ePCA improves the estimation and is able to extract the top 7 eigenimages. Moreover, steerable PCA and steerable ePCA achieve much better estimation of the underlying true eigenimages for a given sample size. Steerable ePCA achieves the best performance in estimating both the eigenvalues and the eigenimages.

Furthermore, we compare the operator norms and Frobenius norms of the difference between the covariance estimates and the true covariance matrix. Fig. 4 shows that steerable ePCA significantly improves the covariance estimation, especially when the sample size is small.

For experiments using ePCA, we use a permutation-bootstrap-based method to estimate the rank of the covariance matrix, following e.g. [50]. By randomly permuting each column of the mean-subtracted data matrix, we completely destroy structural information, including linear structure, while the noise statistics remain the same (see [51], [52] for an analysis). Singular values of the randomly permuted matrices reveal

Fig. 9: Sample denoised images of the XFEL dataset illustrated in Fig. 1 using NLPCA, PCA, sPCA, ePCA, and sePCA; panel (a) shows single-image NLPCA, panels (b)–(e) show the four methods for n = 1000, and panels (f)–(i) for n = 70000.
Image size is 128 × 128.

Fig. 10: Comparing the denoising quality of the XFEL dataset (p = 16384) with varying numbers of images.

what should be the largest covariance matrix eigenvalues that correspond to noise, up to a user-selected confidence level ρ. This can replace other rank estimation methods that assume a Gaussian distribution when the noise model is non-Gaussian, as in our case. Empirically, we choose ρ to give the best performance in covariance estimation. For steerable ePCA, we estimate the number of components using the right edge of the Marčenko–Pastur distribution for the homogenized covariance matrices S^(k)_h, and include only the components whose scaling factors α̂^(k) are above zero.

Steerable ePCA is able to recover more signal principal components from noisy images than PCA, steerable PCA, and ePCA, for both n = 1000 and n = 70000 (see Fig. 5). Fig. 6 shows that steerable ePCA is more efficient than ePCA and PCA. Because steerable ePCA contains extra steps such as prewhitening, recoloring, and scaling, its runtime is slightly longer than that of steerable PCA. When n = 140000, steerable ePCA is 8 times faster than ePCA.

In addition to the XFEL diffraction intensity data, we include a natural image dataset, the Yale B face database [33], [34], to illustrate the efficacy of the proposed method. The database contains 5760 single-light-source images of 10 subjects in 9 poses. For every subject in a particular pose, images were captured under 64 different ambient illuminations. The original images of a single face under different lighting conditions inhabit an approximately 9-dimensional linear space [53]. In
Fig. 11: Sample denoised images of the face dataset illustrated in Fig. 7 using NLPCA, PCA, sPCA, ePCA, and sePCA; panel (a) shows single-image NLPCA, panels (b)–(e) show the four methods at the smaller sample size, and panels (f)–(i) at the larger sample size. Image size is 64 × 64. The intensity range is [0, 2.6] for (a), and is [0, 11.1] for (b)–(i).

the experiments, we take one subject at a particular pose with 64 different illumination conditions. We downsample the original images to 64 × 64 pixels and scale the intensity such that the mean photon count for the whole dataset is 2.3 photons per pixel (i.e., "faces in the dark"). Since we need to rotate the images, we mask them by a disk of radius R = 32. We uniformly sample n clean images from the original images and apply a random in-plane rotation to each sample. Fig. 7 shows 12 samples of the clean data and the corresponding noisy observations. The corresponding principal components are illustrated in Fig. 8. The true eigenimages are evaluated using the clean images rotated at every 0.7 degrees.

B. Denoising
We compare the denoising performance of steerable ePCA with PCA, steerable PCA, ePCA, and patch-based single-image non-local PCA (NLPCA) [21], using the mean squared error, MSE := (pn)^{−1} Σ_{i=1}^n ‖X̂_i − X_i‖². We perform "ePCA denoising" using the empirical best linear predictor (EBLP) [25], which has been shown to outperform "PCA denoising," i.e., orthogonal projection onto sample or ePCA/heterogenized eigenimages, as well as the exponential family PCA method proposed by [19]. Note that in our implementation of ePCA Wiener-type filtering, to avoid inverting a singular matrix (when some coordinates have zero sample mean and sample variance), we compute diag[Ȳ] + S_s with regularization,

diag[Ȳ] + S_s ← S_s + (1 − ε) diag[Ȳ] + ε m I,

where ε is a small regularization parameter, m = p^{−1} Σ_{j=1}^p Ȳ_j, and I is the p × p identity matrix. The number of components is estimated by the permutation rank estimation described in the previous section.

Fig. 9 shows examples of denoised XFEL images for sample sizes n = 1000 and n = 70000. For robustness, we repeated the numerical experiment on another dataset simulated from the small protein chignolin (Protein Data Bank entry 1UAO) and obtained qualitatively similar results. Steerable ePCA recovers the images with lower MSE than PCA, steerable PCA, and ePCA (see Fig. 10), especially when the sample size n is small.

Fig. 11 shows examples of the denoised face images for two sample sizes. We use the permutation rank estimation described in the previous section for PCA, sPCA, and ePCA. In this example, we see that when the number of images is small, PCA and ePCA recover only rough contours of the faces, whereas sPCA and sePCA recover finer details. The MSEs of the denoised images for varying numbers of samples are plotted in Fig. 12. In this experiment, the data contains only 64 unrotated clean images and the mean photon count is 2.3 per pixel.
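The regularized Wiener-type (EBLP) step used in the denoising comparisons above can be sketched in numpy. This is an illustrative sketch, not the paper's MATLAB implementation: the rank-1 "clean" covariance, the sample-mean vector, and ε = 0.1 are toy placeholders we introduce, and the mean term follows the generic EBLP form x̂ = S_s(D + S_s)^{−1}y + D(D + S_s)^{−1}Ȳ.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 16                                     # toy pixel count

# Toy ingredients: rank-1 "clean" covariance S_s and sample mean Y_bar,
# with one coordinate whose sample mean (hence noise variance) is zero.
u = rng.standard_normal(p)
S_s = np.outer(u, u)
Y_bar = np.abs(rng.standard_normal(p))
Y_bar[0] = 0.0

# Regularization: diag[Y_bar] <- (1 - eps) diag[Y_bar] + eps * m * I.
eps = 0.1                                  # illustrative value, not the paper's
m = Y_bar.mean()                           # average sample mean over pixels
D = (1.0 - eps) * np.diag(Y_bar) + eps * m * np.eye(p)

# EBLP: x_hat = S_s (D + S_s)^{-1} y + D (D + S_s)^{-1} Y_bar.
y = rng.poisson(5.0, size=p).astype(float)
F = np.linalg.solve((D + S_s).T, S_s.T).T  # F = S_s (D + S_s)^{-1}
x_hat = F @ y + (np.eye(p) - F) @ Y_bar
```

Writing D(D + S_s)^{−1} as I − F keeps a single linear solve; without the ε-regularization, the zero coordinate of Ȳ could make diag[Ȳ] + S_s singular.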
The face images, at 2.3 photons per pixel, are far brighter than the diffraction patterns, at 0.01 photons per pixel. Therefore, sePCA is able to sufficiently capture the clean data subspace with only a small number of images, resulting in a much flatter MSE curve with respect to the number of images.

Fig. 12: Comparing the denoising quality of the face dataset (p = 4096) with varying numbers of images.

Fig. 13: Sensitivity of the denoising performance to the parameters c and R: (a) XFEL images, (b) rotated Yale B faces. For the face data, we cropped the images by the disk of radius R = 32.

We also test how sensitive our method is to the choice of parameters, i.e., the band limit c and the support radius R. For the diffraction intensity data, we use n = 10000 diffraction patterns with mean photon count 0.01 per pixel. The underlying clean diffraction patterns are very smooth. We vary the band limit c and let the support radius R range from 50 to 63. With the parameters chosen as in Fig. 2, the MSE of the denoised images is small (see Fig. 13a). Although the MSEs vary when we increase c above the chosen band limit and change R within the range of 50 to 63, the errors remain relatively small compared to those of ePCA and sPCA. For the face data, we compare the results with n = 8192 noisy images at 2.3 photons per pixel. Since we crop out the region within a disk of radius R = L/2, the parameter R is fixed at 32 and we vary only the band limit c. The best denoising performance is obtained at a larger band limit than for the XFEL data (see Fig. 13b), which agrees with the fact that the face images contain more high-frequency information than the XFEL data.

IV. CONCLUSION
We presented steerable ePCA, a method for principal component analysis of a set of low-photon-count images and their uniform in-plane rotations. This work has been mostly motivated by its application to XFEL, but it is relevant to other low-photon-count imaging applications. The computational complexity of the new algorithm is O(nL³ + L⁴), whereas that of ePCA is O(min(nL⁴ + L⁶, n²L² + n³)). Incorporating rotational invariance allows more robust estimation of the true eigenvalues and eigenvectors. Our numerical experiments showed that steerable ePCA more accurately estimates the covariance matrix and achieves better denoising results. Finally, we remark that the Fourier-Bessel basis can be replaced with other suitable bases, for example, the 2-D prolate spheroidal wave functions (PSWF) on a disk [54], [55]. Nevertheless, our method has certain limitations. For example, if the noise distribution is not specified in advance, it is hard to estimate the noise covariance E diag[A″(ω)], which affects the homogenization step of the proposed method. In addition, the optimal shrinkage function for the covariance estimation depends on how we measure the error, for example using the Frobenius norm, the operator norm, or other criteria. The corresponding theoretical guarantee in the non-Gaussian case is still an open problem. Thus, statistical optimality is beyond the scope of this paper. For future work, we will study how sensitive the proposed method is to misspecification of the noise distribution, and the statistical optimality of our procedure in the non-Gaussian case.

ACKNOWLEDGMENTS
The authors would like to thank the associate editor and the anonymous referees for their valuable comments. The authors would also like to thank Edgar Dobriban and Boris Landa for discussions. ZZ was partially supported by the National Center for Supercomputing Applications Faculty Fellowship and the University of Illinois at Urbana-Champaign College of Engineering Strategic Research Initiative. AS was partially supported by Award Number R01GM090200 from the NIGMS, FA9550-17-1-0291 from AFOSR, the Simons Investigator Award, the Moore Foundation Data-Driven Discovery Investigator Award, and NSF BIGDATA Award IIS-1837992.

REFERENCES

[1] V. Favre-Nicolin, J. Baruchel, H. Renevier, J. Eymery, and A. Borbély, "XTOP: high-resolution X-ray diffraction and imaging," Journal of Applied Crystallography, vol. 48, no. 3, pp. 620–620, 2015.
[2] F. R. N. C. Maia and J. Hajdu, "The trickle before the torrent: diffraction data from X-ray lasers," Scientific Data, vol. 3, 2016.
[3] S. H. W. Scheres, H. Gao, M. Valle, G. T. Herman, P. P. B. Eggermont, J. Frank, and J.-M. Carazo, "Disentangling conformational states of macromolecules in 3D-EM through likelihood optimization," Nature Methods, vol. 4, no. 1, pp. 27–29, 2007.
[4] D. N.-T. Loh and V. Elser, "Reconstruction algorithm for single-particle diffraction imaging experiments," Phys. Rev. E, vol. 80, p. 026705, Aug 2009.
[5] Z. Kam, "Determination of macromolecular structure in solution by spatial correlation of scattering fluctuations," Macromolecules, vol. 10, no. 5, pp. 927–934, 1977.
[6] D. K. Saldin, V. L. Shneerson, R. Fung, and A. Ourmazd, "Structure of isolated biomolecules obtained from ultrashort X-ray pulses: exploiting the symmetry of random orientations," Journal of Physics: Condensed Matter, vol. 21, no. 13, 2009.
[7] D. Starodub, A. Aquila, S. Bajt, M. Barthelmess, A. Barty, C. Bostedt, J. D. Bozek, N. Coppola, R. B. Doak, S. W. Epp et al., "Single-particle structure determination by correlations of snapshot X-ray diffraction patterns," Nature Communications, vol. 3, p. 1276, 2012.
[8] K. Pande, P. Schwander, M. Schmidt, and D. K. Saldin, "Deducing fast electron density changes in randomly orientated uncrystallized biomolecules in a pump-probe experiment," Phil. Trans. R. Soc. B, vol. 369, no. 1647, p. 20130332, 2014.
[9] R. P. Kurta, J. J. Donatelli, C. H. Yoon, P. Berntsen, J. Bielecki, B. J. Daurer, H. DeMirci, P. Fromme, M. F. Hantke et al., "Correlations in scattered X-ray laser pulses reveal nanoscale structural features of viruses," Physical Review Letters, vol. 119, no. 15, p. 158102, 2017.
[10] J. J. Donatelli, J. A. Sethian, and P. H. Zwart, "Reconstruction from limited single-particle diffraction data via simultaneous determination of state, orientation, intensity, and phase," Proceedings of the National Academy of Sciences, vol. 114, no. 28, pp. 7222–7227, 2017.
[11] B. von Ardenne, M. Mechelke, and H. Grubmüller, "Structure determination from single molecule X-ray scattering with three photons per image," Nat Commun., vol. 9, no. 1, p. 2375, 2018.
[12] K. Pande, J. J. Donatelli, E. Malmerberg, L. Foucar, C. Bostedt, I. Schlichting, and P. H. Zwart, "Ab initio structure determination from experimental fluctuation X-ray scattering data," Proceedings of the National Academy of Sciences.
[13] I. T. Jolliffe, Principal Component Analysis. Wiley Online Library, 2002.
[14] T. W. Anderson, An Introduction to Multivariate Statistical Analysis. Wiley New York, 2003.
[15] R. D. Nowak and R. G. Baraniuk, "Wavelet-domain filtering for photon imaging systems," IEEE Transactions on Image Processing, vol. 8, no. 5, pp. 666–678, 1999.
[16] F. Luisier, T. Blu, and M. Unser, "A new SURE approach to image denoising: Interscale orthonormal wavelet thresholding," IEEE Transactions on Image Processing, vol. 16, no. 3, pp. 593–606, 2007.
[17] F. J. Anscombe, "The transformation of Poisson, binomial and negative-binomial data," Biometrika, vol. 35, no. 3-4, p. 246, 1948.
[18] J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity. Cambridge University Press, 2010.
[19] M. Collins, S. Dasgupta, and R. E. Schapire, "A generalization of principal component analysis to the exponential family," Advances in Neural Information Processing Systems (NIPS), 2001.
[20] A. P. Singh and G. J. Gordon, "A unified view of matrix factorization models," in Machine Learning and Knowledge Discovery in Databases, W. Daelemans, B. Goethals, and K. Morik, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 358–373.
[21] J. Salmon, Z. Harmany, C.-A. Deledalle, and R. Willett, "Poisson noise reduction with non-local PCA," Journal of Mathematical Imaging and Vision, vol. 48, no. 2, pp. 279–294, 2014.
[22] T. Furnival, R. K. Leary, and P. A. Midgley, "Denoising time-resolved microscopy image sequences with singular value thresholding," Ultramicroscopy, 2016.
[23] Y. Cao and Y. Xie, "Low-rank matrix recovery in Poisson noise," in Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on. IEEE, 2014, pp. 384–388.
[24] M. Sonnleitner, J. Jeffers, and S. M. Barnett, "Local retrodiction models for photon-noise-limited images," in SPIE Photonics Europe. International Society for Optics and Photonics, 2016, pp. 98960V–98960V.
[25] L. T. Liu, E. Dobriban, and A. Singer, "ePCA: High dimensional exponential family PCA," Annals of Applied Statistics, vol. 12, no. 4, pp. 2121–2150, 2018.
[26] Z. Zhao, Y. Shkolnisky, and A. Singer, "Fast steerable principal component analysis," IEEE Transactions on Computational Imaging, vol. 2, no. 1, pp. 1–12, 2016.
[27] W. T. Freeman and E. H. Adelson, "The design and use of steerable filters," IEEE Transactions on Pattern Analysis & Machine Intelligence, no. 9, pp. 891–906, 1991.
[28] R. Hilai and J. Rubinstein, "Recognition of rotated images by invariant Karhunen-Loève expansion," J. Opt. Soc. Am. A, vol. 11, no. 5, pp. 1610–1618, 1994.
[29] P. Perona, "Deformable kernels for early vision," IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 5, pp. 488–499, 1995.
[30] M. Uenohara and T. Kanade, "Optimal approximation of uniformly rotated images: Relationship between Karhunen-Loève expansion and discrete cosine transform," IEEE Trans. Image Process., vol. 7, no. 1, pp. 116–119, 1998.
[31] C. Ponce and A. Singer, "Computing steerable principal components of a large set of images and their rotations," IEEE Transactions on Image Processing, vol. 20, no. 11, pp. 3051–3062, 2011.
[32] Z. Zhao and A. Singer, "Rotationally invariant image representation for viewing direction classification in cryo-EM," Journal of Structural Biology, vol. 186, no. 1, pp. 153–166, 2014.
[33] K.-C. Lee, J. Ho, and D. J. Kriegman, "Acquiring linear subspaces for face recognition under variable lighting," IEEE Transactions on Pattern Analysis & Machine Intelligence, no. 5, pp. 684–698, 2005.
[34] A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, "From few to many: Illumination cone models for face recognition under variable lighting and pose," IEEE Transactions on Pattern Analysis & Machine Intelligence, no. 6, pp. 643–660, 2001.
[35] B. Landa and Y. Shkolnisky, "Approximation scheme for essentially bandlimited and space-concentrated functions on a disk," Applied and Computational Harmonic Analysis, vol. 43, no. 3, pp. 381–403, 2017.
[36] A. Klug and R. A. Crowther, "Three-dimensional image reconstruction from the viewpoint of information theory," Nature, vol. 238, pp. 435–440, 1972.
[37] Q. Wang, O. Ronneberger, and H. Burkhardt, "Rotational invariance based on Fourier analysis in polar and spherical coordinates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 9, pp. 1715–1722, 2009.
[38] L. Greengard and J. Lee, "Accelerating the nonuniform fast Fourier transform," SIAM Review, vol. 46, no. 3, pp. 443–454, 2004.
[39] J. Baik, G. Ben Arous, and S. Péché, "Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices," Annals of Probability, vol. 33, no. 5, pp. 1643–1697, 2005.
[40] J. Baik and J. W. Silverstein, "Eigenvalues of large sample covariance matrices of spiked population models," Journal of Multivariate Analysis, vol. 97, no. 6, pp. 1382–1408, 2006.
[41] D. Paul, "Asymptotics of sample eigenstructure for a large dimensional spiked covariance model," Statistica Sinica, vol. 17, no. 4, pp. 1617–1642, 2007.
[42] F. Benaych-Georges and R. R. Nadakuditi, "The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices," Advances in Mathematics, vol. 227, no. 1, pp. 494–521, 2011.
[43] D. L. Donoho, M. Gavish, and I. M. Johnstone, "Optimal shrinkage of eigenvalues in the spiked covariance model," Annals of Statistics, vol. 46, no. 4, p. 1742, 2018.
[44] I. M. Johnstone and A. Y. Lu, "On consistency and sparsity for principal components analysis in high dimensions," Journal of the American Statistical Association, vol. 104, no. 486, pp. 682–693, 2009.
[45] S. Jung, J. S. Marron et al., "PCA consistency in high dimension, low sample size context," The Annals of Statistics, vol. 37, no. 6B, pp. 4104–4130, 2009.
[46] I. M. Johnstone and D. Paul, "PCA in high dimensions: An orientation," Proceedings of the IEEE, vol. 106, no. 8, pp. 1277–1292, 2018.
[47] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in Fortran 90. Cambridge University Press, 1996, vol. 2.
[48] M. F. Hantke, T. Ekeberg, and F. R. N. C. Maia, "Condor: A simulation tool for flash X-ray imaging," Journal of Applied Crystallography, vol. 49, no. 4, pp. 1356–1362, 2016.
[49] P. Schwander, D. Giannakis, C. H. Yoon, and A. Ourmazd, "The symmetries of image formation by scattering. II. Applications," Opt. Express, vol. 20, no. 12, pp. 12827–12849, Jun 2012.
[50] J. Landgrebe, W. Wurst, and G. Welzl, "Permutation-validated principal components analysis of microarray data," Genome Biology, vol. 3, no. 4, 2002.
[51] E. Dobriban, "Permutation methods for factor analysis and PCA," arXiv preprint arXiv:1710.00479, Oct. 2017.
[52] E. Dobriban and A. B. Owen, "Deterministic parallel analysis: an improved method for selecting factors and principal components," Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2018.
[53] R. Basri and D. W. Jacobs, "Lambertian reflectance and linear subspaces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 2, pp. 218–233, 2003.
[54] D. Slepian, "Prolate spheroidal wave functions, Fourier analysis, and uncertainty–IV: Extensions to many dimensions; generalized prolate spheroidal wave functions," Bell System Technical Journal, vol. 43, pp. 3009–3057, 1964.
[55] B. Landa and Y. Shkolnisky, "Steerable principal components for space-frequency localized images,"