Recovering missing data in coherent diffraction imaging
David Barmherzig, Alex B. Barnett, Charles L. Epstein, Leslie F. Greengard, Jeremy F. Magland, Manas Rachh
Center for Computational Mathematics, Flatiron Institute, New York, NY; Department of Mathematics, University of Pennsylvania; Courant Institute, New York University

June 4, 2020
Abstract
In coherent diffraction imaging (CDI) experiments, the intensity of the wave scattered by an object is measured on an array of detectors. This signal can be interpreted as the square of the modulus of the Fourier transform of the unknown scattering density. A beamstop obstructs the forward scattered wave and, hence, the modulus Fourier data in a neighborhood of $\mathbf{k} = 0$ cannot be measured. In this note, we describe a linear method for recovering this unmeasured modulus Fourier data from the measured values and an estimate of the support of the image's autocorrelation function, without consideration of phase retrieval. We analyze the effects of noise and the conditioning of this problem, which grows exponentially with the modulus of the maximum spatial frequency not measured.
Keywords:
Coherent diffraction imaging, hole in $k$-space, autocorrelation image, recovered magnitude data, noise.

Introduction

In coherent diffraction imaging (CDI) experiments, the intensity of the wave scattered by an object is measured on an array of detectors. This signal can be interpreted as the square of the modulus of the Fourier transform $|\hat\rho(\mathbf{k})|$ of the unknown scattering density, denoted by $\rho(\mathbf{x})$ [15, 7]. The spatial frequency, $\mathbf{k}$, is related to the scattering direction through the Ewald sphere construction [8]. We assume here that the x-ray wavelength is sufficiently small that the curvature of the Ewald sphere can be neglected and that we are sampling $|\hat\rho(\mathbf{k})|$ on a uniform grid. The phase retrieval problem is to recover the complex values $\hat\rho(\mathbf{k})$, and hence the desired unknown, from the measured intensity data, supplemented by auxiliary information, which is typically the approximate support of $\rho(\mathbf{x})$ [4, 18, 2]. Unfortunately, a beamstop obstructs the forward scattered wave and, hence, the modulus Fourier data in a neighborhood of $\mathbf{k} = 0$ cannot be measured. Standard iterative approaches to recovering the unmeasured samples, such as HIO or difference maps [10, 12], use the auxiliary information to fill in the unmeasured Fourier coefficients at the same time as the image itself is reconstructed. In this note, we describe a linear method for recovering this unmeasured modulus Fourier data from the measured values and an estimate of the support of the image's autocorrelation function, without consideration of phase retrieval.

To set various parameters and length scales, we assume that $\rho$ is supported in a compact subset of $S = (-\tfrac12, \tfrac12)^d$, and that its autocorrelation image
$$(\rho \star \rho)(\mathbf{y}) \equiv \int_S \rho(\mathbf{x})\,\rho(\mathbf{y} + \mathbf{x})\, d\mathbf{x}$$
is supported in $(-\tfrac{\beta}{2}, \tfrac{\beta}{2})^d$. We define the field of view (FOV) as the box $(-\tfrac{m}{2}, \tfrac{m}{2})^d$ (Fig.
1). The Fourier transform, $\hat\rho$, is defined by
$$\hat\rho(\mathbf{k}) \equiv \int_{\mathbb{R}^d} \rho(\mathbf{x})\, e^{-2\pi i\, \mathbf{x}\cdot\mathbf{k}}\, d\mathbf{x}, \qquad (1)$$
so that
$$\rho(\mathbf{x}) \equiv \int_{\mathbb{R}^d} \hat\rho(\mathbf{k})\, e^{2\pi i\, \mathbf{x}\cdot\mathbf{k}}\, d\mathbf{k}. \qquad (2)$$
Finally, we assume that $|\hat\rho(\mathbf{k})|$ is given in the box $D$ of side length $2K_{\max}$, from which a window $W = [-k_0, k_0]^d$ is deleted, corresponding to the beamstop (Fig. 1).
Figure 1: A two-dimensional object of interest $\rho(\mathbf{x})$ is supported in the bounded region $S = [-\tfrac12, \tfrac12]^2$ and its autocorrelation is supported in $S_{AC} = [-\tfrac{\beta}{2}, \tfrac{\beta}{2}]^2$. The field of view, $\mathrm{FOV} = [-\tfrac{m}{2}, \tfrac{m}{2}]^2$, is the maximum region in physical space where we expect the Fourier transform (2) to be valid for a given sampling of $\hat\rho(\mathbf{k})$. We denote by $R$ the subset of the field of view outside $S_{AC}$. In the transform domain ($k$-space), the modulus of $\hat\rho(\mathbf{k})$ is measured on the box $D = [-K_{\max}, K_{\max}]^2$, with the region $W = [-k_0, k_0]^2$ obscured by the beamstop.

For the sake of simplicity, we work in the discrete setting, with
$$\rho_{\mathbf{j}} = \rho\left(\frac{\mathbf{j}}{2N}\right), \quad \text{for } \mathbf{j} \in [1-N : N]^d. \qquad (3)$$
From the Nyquist sampling theorem, the grid spacing $\tfrac{1}{2N}$ in physical space corresponds to a maximum frequency in the transform domain of $K_{\max} = N$. Following Fig. 1, we define $J = [1-mN : mN]^d$, in order to cover the field of view. The vector $\boldsymbol{\rho} \in \mathbb{R}^J$ is of length $(2mN)^d$, and has entries defined by (3) for $\mathbf{j} \in [1-N : N]^d$ and zero otherwise, corresponding to the fact that $\rho(\mathbf{x})$ is supported in $S$. In taking the discrete Fourier transform of $\boldsymbol{\rho}$ with a fixed $N$, increasing values of $m$ lead to a finer sampling of $\hat\rho$ in the transform domain, without changing the maximum frequency $K_{\max} = N$. As a result, we sometimes refer to $m > 1$ as the "oversampling" factor. In the CDI experiment, oversampling corresponds to using an array of sensors that measure $|\hat\rho(\mathbf{k})|$ on a grid with spacing $\Delta k = 1/m$. This is the Nyquist sampling rate in the inverse direction, sufficient to recover a physical object within the field of view.
It is the combination of oversampling with prior information about the support of $\rho(\mathbf{x})$ that makes the phase retrieval problem solvable, for a dense open set of data, in dimensions $d > 1$ [6, 14].

In the remainder of this paper, we let
$$\hat\rho_{\mathbf{k}} \equiv \sum_{\mathbf{j} \in J} \rho_{\mathbf{j}}\, e^{-\frac{2\pi i\, \mathbf{j}\cdot\mathbf{k}}{2mN}} \approx (2N)^d\, \hat\rho\left(\frac{\mathbf{k}}{m}\right) \qquad (4)$$
denote the discrete Fourier transform (DFT) of the data extended by zero to the entire field of view. Thus, in our model problem, the measured intensity is proportional to $|\hat\rho_{\mathbf{k}}|^2$. We also recall the well-known fact that $(|\hat\rho_{\mathbf{k}}|^2)$ is the DFT of the discrete (periodic) autocorrelation image:
$$(\boldsymbol{\rho} \star \boldsymbol{\rho})_{\mathbf{k}} \equiv \sum_{\mathbf{j} \in J} \rho_{\mathbf{j}}\, \rho_{\mathbf{j}+\mathbf{k}}.$$
Let $W \subset J$ denote the set of lattice points obstructed by the beamstop $\mathcal{W}$. Substantial effort has been devoted to the development of methods for approximating the Fourier coefficients at these frequencies. Typically, this involves an iterative method designed to solve the phase retrieval problem and the missing data problem simultaneously (see, for example, [17, 9]). In Section 2 we describe a linear algorithm for recovering the unmeasured values $\{|\hat\rho_{\mathbf{k}}| : \mathbf{k} \in W\}$. It amounts to solving a least squares problem for the unmeasured coefficients, using knowledge about the support of the autocorrelation of $\rho$ as a constraint. In Section 3 we analyze the conditioning of this problem by relating it to classical results for prolate spheroidal functions; see [20]. We show that if $|W|$ is not too large, then, with sufficiently fine sampling in the Fourier domain, e.g. $m \geq 3$, the unmeasured magnitude data, $\{|\hat\rho_{\mathbf{k}}| : \mathbf{k} \in W\}$, can be stably determined by solving the least squares problem (Fig. 2).

While $W$ (and $S_{AC}$) can, in principle, take any shape, we assume for simplicity that it is square, so that the lattice points lying within $\mathcal{W}$ are of the form $W = [1-w : w-1]^d$, where $w = \lfloor mk_0 \rfloor$.
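To make the discrete setup concrete, here is a small 1-D numerical sketch (our own illustration, not code from the paper; all parameter values are arbitrary choices): the squared modulus of the zero-padded DFT is the DFT of the periodic autocorrelation, whose support covers only part of the field of view when $m > 2$.

```python
import numpy as np

# 1-D illustration of the discrete model: the squared modulus of the
# zero-padded DFT is the DFT of the periodic autocorrelation, whose
# support covers only part of the field of view.
N, m = 16, 3
M = 2 * m * N                      # |J| = 2mN grid points in the field of view
rng = np.random.default_rng(0)

rho = np.zeros(M)
rho[:2 * N] = rng.random(2 * N)    # compactly supported object (2N samples)

intensity = np.abs(np.fft.fft(rho)) ** 2        # measured data |rho_hat_k|^2

# Wiener-Khinchin: the inverse DFT of the intensity is the periodic autocorrelation.
autocorr = np.fft.ifft(intensity).real
direct = np.array([np.dot(rho, np.roll(rho, -k)) for k in range(M)])

# The autocorrelation is nonzero on at most 4N - 1 of the 2mN lags, so for
# m > 2 there is a sizable region R of exact zeros outside S_AC.
support = np.flatnonzero(np.abs(autocorr) > 1e-9)
print(len(support), M)
```

For $m = 3$ the autocorrelation occupies at most $4N - 1 = 63$ of the $2mN = 96$ lags, leaving a nontrivial set $R$ on which the autocorrelation vanishes identically.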
With the discretized autocorrelation image $\boldsymbol{\rho} \star \boldsymbol{\rho}$ supported in $[-\beta N : \beta N]^d$, connections with prolate spheroidal functions show that asymptotically, as $m$, $N$, and $\beta k_0$ grow large, the conditioning of the linear method grows like
$$\kappa(\beta, k_0) \sim \frac{e^{\pi \beta k_0}}{\sqrt{2\pi d\, [\beta k_0]^{1/2}}}. \qquad (5)$$
This estimate is proven in Appendix A. Empirically, this asymptotic result is already accurate for moderate $N$ and $m \geq 3$. The main lessons of (5), and the examples in Section 4, are:
1. The conditioning of the hole filling problem is largely determined by the physical space-bandwidth product $\beta k_0$.
2. Little can be gained in this context by taking either $m \geq 4$ or $N$ very large (neither parameter appears in the formula).
3. As the dimension, $d$, increases, the conditioning can be expected to slightly improve; in particular, the exponent in (5) does not depend on $d$.

Figure 2: An unknown object $\rho(\mathbf{x})$ is supported in a bounded region $S$ (upper left) and its autocorrelation is supported in $S_{AC}$ (upper right). In standard methods for phase retrieval from CDI experiments, the missing data within the beamstop $W$ is inferred as part of an overall iterative scheme. Here, we solve for the missing data itself, without consideration of phase, by using an estimate for the support of the autocorrelation image and solving a linear least squares problem.

The effects of noise are analyzed in Sections 5–6, where it is shown that, if $\beta k_0$ is not too large, then, with sufficient SNR, this scheme can be robust even in the presence of noise. At lower SNR, we show that improved images may result if some of the reconstructed modulus data is used, and some of the coefficients are found implicitly in the phase retrieval step.

The Recovery Algorithm
In our model, the measured data, denoted by $\mathbf{a}$, consists of
$$a_{\mathbf{j}} = \begin{cases} |\hat\rho_{\mathbf{j}}|^2 & \text{for } \mathbf{j} \in W^c = J \setminus W, \\ 0 & \text{for } \mathbf{j} \in W. \end{cases} \qquad (6)$$
In the image domain, let $S \subset [1-N : N]^d \subset J$ be the lattice points within our estimate for the support of $\boldsymbol{\rho}$; then
$$S_{AC} = S \ominus S = \{\mathbf{j} - \mathbf{k} : \mathbf{j}, \mathbf{k} \in S\}$$
is an estimate for the support of the autocorrelation image, and we set $R = J \setminus S_{AC}$. If $|R| > |W|$, then, in principle, the unmeasured magnitude data can be determined. For this problem to be reasonably well conditioned, the ratio $|W|/m^d$ must be sufficiently small, and $|R| \gg |W|$. Empirically, a little more oversampling ($m = 3$) than is required for the phase retrieval problem to be solvable ($m = 2$) produces markedly better results. Having more samples also leads to better noise reduction when recovering the unmeasured samples. On the other hand, greater oversampling may require a smaller pixel size on the detector, or a more distant detector, either of which would tend to increase the noise content of individual measurements, so clearly there are trade-offs to be considered. Our asymptotic analysis and numerical examples indicate that there is little improvement beyond $m = 4$.

Let $\mathcal{F}$ denote the $d$-dimensional DFT matrix, normalized to be a unitary operator, and let $\mathcal{F}^*$ be its adjoint. To keep the notation simpler, we omit the spatial dimension $d$ when the context is clear. We interpret $\mathcal{F}$ as a map from data on the $|J|$-point grid in the physical domain to a $|J|$-point grid in the frequency domain, both contained in $\mathbb{Z}^d$. The frequency domain grid is normalized to be centered on $\mathbf{k} = 0$. In the remainder of the paper we let $\hat\rho_{\mathbf{k}} = [\mathcal{F}(\boldsymbol{\rho})]_{\mathbf{k}}$, which differs, by the constant factor $[2mN]^{-d/2}$, from the normalization in (4).

Definition 1.
We denote by $\mathcal{F}_{W,R}$ the submatrix of $\mathcal{F}$ that maps data from grid points in $R$ to Fourier transform points in $W$. $\mathcal{F}_{W,S_{AC}}$ is the submatrix that maps data from grid points in $S_{AC}$ to Fourier transform points in $W$. $\mathcal{F}_{W^c,R}$ and $\mathcal{F}_{W^c,S_{AC}}$ are defined in the same manner, as are the submatrices of the adjoint: $\mathcal{F}^*_{R,W}$, $\mathcal{F}^*_{S_{AC},W}$, $\mathcal{F}^*_{R,W^c}$, $\mathcal{F}^*_{S_{AC},W^c}$. Note that taking the adjoint interchanges the roles of the two subsets, e.g., $[\mathcal{F}_{W,R}]^* = \mathcal{F}^*_{R,W}$.
As noted above, the DFT coefficients of the autocorrelation image, $\boldsymbol{\rho} \star \boldsymbol{\rho}$, are $\{|\hat\rho_{\mathbf{j}}|^2 : \mathbf{j} \in J\}$ (the Wiener–Khinchin theorem). Let us now write the inverse DFT in block form:
$$\begin{pmatrix} \mathcal{F}^*_{R,W} & \mathcal{F}^*_{R,W^c} \\ \mathcal{F}^*_{S_{AC},W} & \mathcal{F}^*_{S_{AC},W^c} \end{pmatrix} \begin{pmatrix} \boldsymbol{\alpha}_W \\ \mathbf{a}_{W^c} \end{pmatrix} = \begin{pmatrix} \mathbf{0} \\ \boldsymbol{\rho} \star \boldsymbol{\rho} \end{pmatrix}, \qquad (7)$$
where $\mathbf{a}_{W^c}$, i.e. $\mathbf{a}$ restricted to $W^c$, is the measured data, and $\boldsymbol{\alpha}_W$ denotes the (unmeasured) coefficients $\alpha_{\mathbf{j}}$ of $\mathbf{a}$ for $\mathbf{j}$ restricted to $W$. Clearly, letting $\boldsymbol{\alpha}_W = (|\hat\rho_{\mathbf{j}}|^2)_{\mathbf{j} \in W}$ yields a consistent solution of (7), since this is simply a restatement of the Wiener–Khinchin theorem. If we restrict our attention to the first row, we have the $|R| \times |W|$ linear system:
$$\mathcal{F}^*_{R,W}\, \boldsymbol{\alpha}_W = -\mathcal{F}^*_{R,W^c}\, \mathbf{a}_{W^c}. \qquad (8)$$
This is shown schematically in Figure 2.

For small sets $W$ and large sets $R$, the system of equations $\mathcal{F}^*_{R,W}\, \boldsymbol{\alpha}_W = 0$ has only the trivial solution $\boldsymbol{\alpha}_W = 0$. Assuming that the data $\mathbf{a}_{W^c}$ is exact, the highly overdetermined system in (8) then has the exact solution, $\boldsymbol{\alpha}_W = (|\hat\rho_{\mathbf{j}}|^2)_{\mathbf{j} \in W}$, which is unique. For generic right hand sides, the equation
$$\mathcal{F}^*_{R,W}\, \mathbf{x} = -\mathcal{F}^*_{R,W^c}\, \mathbf{y}$$
does not have an exact solution, and in the remainder of the paper we take $\mathbf{x}$ to be the solution to the least squares problem:
$$\mathbf{x} = \arg\min_{\mathbf{x}} \left\|\mathcal{F}^*_{R,W}\, \mathbf{x} + \mathcal{F}^*_{R,W^c}\, \mathbf{y}\right\|^2, \qquad (9)$$
which is also unique, as $\mathcal{F}_{W,R}\, \mathcal{F}^*_{R,W}$ is invertible. More precisely, we have
Theorem 1.
Suppose that $J = [1-M : M]^d$. If $W \subset [p : p+u] \times [1-M : M]^{d-1}$, and $R \supset [q : q+v] \times [1-M : M]^{d-1}$, with $v > u$, then $\mathcal{F}^*_{R,W}\, \mathbf{x} = 0$ has only the trivial solution.

Proof. Let $\mathbf{x} \in \mathbb{R}^J$ be a vector, with support in $W$, that belongs to the null-space of $\mathcal{F}^*_{R,W}$, and let $X(\mathbf{z})$ be its $Z$-transform. For every frequency $\mathbf{k}$, the adjoint DFT, $\check{x}_{\mathbf{k}}$, equals $X(\boldsymbol{\omega}_{\mathbf{k}})$, for $\boldsymbol{\omega}_{\mathbf{k}}$ an appropriate vector of points on the torus $(S^1)^d$. We can rewrite the $Z$-transform as
$$X(\mathbf{z}) = \sum_{\mathbf{j}' \in [1-M : M]^{d-1}} p_{\mathbf{j}'}(z_1)\, (z_2, \ldots, z_d)^{\mathbf{j}'}.$$
Up to a factor of $z_1^p$, each $p_{\mathbf{j}'}(z_1)$ is a polynomial of degree $u$. The hypothesis of the theorem implies that for any $\boldsymbol{\omega}_{\mathbf{k}} = (\omega_{k_1}, \ldots, \omega_{k_d})$ with $k_1 \in [q : q+v]$, we have that $X(\boldsymbol{\omega}_{\mathbf{k}}) = 0$. By the invertibility of the $(d-1)$-dimensional DFT, this implies that $p_{\mathbf{j}'}(\omega_{k_1}) = 0$ for all $\mathbf{j}'$ and $k_1 \in [q : q+v]$. Because $v > u$ and the $p_{\mathbf{j}'}$ are polynomials of degree $u$, this shows that the polynomials are actually all zero, which, in turn, implies that $\mathbf{x} = 0$ as well.

As $|W|$ is a reasonably small number, the reduced SVD of $\mathcal{F}^*_{R,W} = U \Sigma V^*$ is fairly easy to compute. The Moore–Penrose inverse of $\mathcal{F}^*_{R,W}$ is
$$\mathcal{F}^{*\dagger}_{R,W} = V \Sigma^{-1} U^*. \qquad (10)$$
The unique solution to the overdetermined linear system in (8) is given by
$$\boldsymbol{\alpha}_W = -\mathcal{F}^{*\dagger}_{R,W}\, \mathcal{F}^*_{R,W^c}\, \mathbf{a}_{W^c}. \qquad (11)$$
We call the operator
$$\mathcal{R}_{R,W} = -\mathcal{F}^{*\dagger}_{R,W}\, \mathcal{F}^*_{R,W^c} \qquad (12)$$
the recovery operator. For general right hand sides, $\mathbf{y}$, the solution to the least squares problem is given by $\mathcal{R}_{R,W}\, \mathbf{y}$. It should be noted that the recovery operator only depends on
$W$, $J$, $R$, and is independent of the particular image being reconstructed.
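The recovery procedure (8)–(11) can be sketched in a few lines. The following 1-D example (our own, with illustrative parameter choices, not the authors' code) builds the submatrices of the unitary inverse DFT explicitly and recovers the missing squared-modulus samples by least squares:

```python
import numpy as np

# 1-D sketch of the recovery procedure (8)-(11): build the submatrices of
# the unitary inverse DFT and solve for the squared-modulus samples hidden
# by the beamstop.
N, m, w = 16, 3, 2
M = 2 * m * N
rng = np.random.default_rng(1)

rho = np.zeros(M)
rho[:2 * N] = rng.random(2 * N)          # object support: 2N contiguous samples
a = np.abs(np.fft.fft(rho)) ** 2         # full squared-modulus data (for checking)

idx = (np.arange(M) + M // 2) % M - M // 2                # centered index of each position
W_pos = np.flatnonzero((idx >= 1 - w) & (idx <= w - 1))   # hole W = [1-w : w-1]
R_pos = np.flatnonzero(np.abs(idx) >= 2 * N)              # outside S_AC: exact zeros of the AC
Wc_pos = np.setdiff1d(np.arange(M), W_pos)

jj, kk = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
Fstar = np.exp(2j * np.pi * jj * kk / M) / np.sqrt(M)     # unitary inverse DFT

A = Fstar[np.ix_(R_pos, W_pos)]                  # F*_{R,W}
b = -Fstar[np.ix_(R_pos, Wc_pos)] @ a[Wc_pos]    # -F*_{R,W^c} a_{W^c}
alpha_W, *_ = np.linalg.lstsq(A, b, rcond=None)  # least squares solution (9)

print(np.max(np.abs(alpha_W.real - a[W_pos])))   # error in the recovered samples
```

Since the data here is exact, the overdetermined system is consistent and the least squares solution reproduces the hidden samples to roundoff.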
Figure 3: The singular vector $\mathbf{u}_1$ of $\mathcal{F}^*_{R,W}$, corresponding to the largest singular value, 1. On the left is the DFT representation, showing a small neighborhood of $W$. In the middle is a plot of $\mathrm{Re}(\mathcal{F}^*(\mathbf{u}_1))$ and on the right is a plot of $\mathrm{Im}(\mathcal{F}^*(\mathbf{u}_1))$. The set $S_{AC}$ is indicated in the middle and right panels as a lightly shaded rectangle.
Figure 4: The singular vector $\mathbf{u}_{169}$ of $\mathcal{F}^*_{R,W}$, corresponding to the smallest singular value, $8.372 \times 10^{-11}$. On the left is the DFT representation, showing a small neighborhood of $W$. In the middle is a plot of
$\mathrm{Re}(\mathcal{F}^*(\mathbf{u}_{169}))$ and on the right is a plot of $\mathrm{Im}(\mathcal{F}^*(\mathbf{u}_{169}))$. The set $S_{AC}$ is indicated in the middle and right panels as a lightly shaded rectangle.

Since $\mathcal{F}^*_{R,W}$ is the composition of the unitary map $\mathcal{F}^*$ with orthogonal projections, its singular values lie between 0 and 1. It is straightforward to describe the sorts of images that lead to singular vectors with singular values very close to 1, or very close to 0. In order for $\mathbf{u} \in \mathbb{C}^J$ to satisfy $|\mathcal{F}^*_{R,W}\mathbf{u}| \approx |\mathbf{u}|$, it is necessary for $\mathbf{u}$ to be supported in $W$ and for $\mathcal{F}^*(\mathbf{u})$ to be almost entirely supported in $R$. An example is shown in Figure 3. The larger $R$ is, the easier it is to find such images. On the other hand, for $\mathcal{F}^*_{R,W}\mathbf{u} \approx 0$ it is necessary for $\mathbf{u}$ to be supported in $W$ and for $\mathcal{F}^*(\mathbf{u})$ to be supported almost entirely in $J \setminus R$. In 2d, these images resemble tensor products of sampled Hermite functions. For a fixed $W$, such vectors become more plentiful as $R$ gets smaller. An example is shown in Figure 4. For these examples we use a thrice oversampled $64 \times 64$ grid; $W$ is a $13 \times 13$ square centered on $\mathbf{k} = (0, 0)$, and $|R| = 24{,}765$. In most practical examples the largest singular value of $\mathcal{F}^*_{R,W}$ is very close to 1. In this example, the ratio of the largest to smallest singular value of $\mathcal{F}^*_{R,W}$ is $1.19 \times 10^{10}$. This quantity represents the conditioning of the problem of recovering the samples of the magnitude DFT in $W$, and is also the norm of $\mathcal{F}^{*\dagger}_{R,W}$. In Section 3 we give estimates and asymptotic results for the conditioning of this problem.

From Fig. 4, we see that the singular vector with the smallest singular value is essentially a Gaussian centered at $\mathbf{k} = 0$. In fact, this vector turns out to provide the most important contribution to "filling the hole" in $k$-space.
This is easily understood in terms of the continuum model embodied in equations (3) and (4). Since $\rho$ is compactly supported, its Fourier transform is smooth and has a Taylor expansion about zero,
$$\hat\rho(\mathbf{k}) = \hat\rho(\mathbf{0}) + \langle\nabla\hat\rho(\mathbf{0}), \mathbf{k}\rangle + \tfrac12\langle H_{\hat\rho}(\mathbf{0})\mathbf{k}, \mathbf{k}\rangle + O(\|\mathbf{k}\|^3),$$
where $H_{\hat\rho}(\mathbf{0})$ is the matrix of second derivatives of $\hat\rho$ at $\mathbf{0}$. For $\rho$ a real valued function this implies that
$$|\hat\rho(\mathbf{k})|^2 = |\hat\rho(\mathbf{0})|^2 \exp(-\langle B\mathbf{k}, \mathbf{k}\rangle) + O(\|\mathbf{k}\|^3), \qquad (13)$$
where
$$\langle B\mathbf{k}, \mathbf{k}\rangle = \frac{1}{|\hat\rho(\mathbf{0})|^2}\left[\,|\langle\nabla\hat\rho(\mathbf{0}), \mathbf{k}\rangle|^2 - \hat\rho(\mathbf{0})\,\langle H_{\hat\rho}(\mathbf{0})\mathbf{k}, \mathbf{k}\rangle\,\right]. \qquad (14)$$
For the sort of functions that arise in CDI, the zero Fourier coefficient $\hat\rho(\mathbf{0})$ is much larger than any other. The analysis above shows that, near to $\mathbf{k} = \mathbf{0}$, the function $|\hat\rho(\mathbf{k})|^2$ strongly resembles a Gaussian, as does the singular vector of $\mathcal{F}^*_{R,W}$ with the smallest singular value. As we see in the next example, this singular vector plays a dominant role in filling in the unmeasured magnitude DFT data.
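The Gaussian behavior of $|\hat\rho(\mathbf{k})|^2$ near $\mathbf{k} = 0$ is easy to check numerically. The following 1-D sketch (ours; the bump profile is an arbitrary smooth choice) fits a parabola to $\log|\hat\rho(k)|^2$ at the samples nearest zero:

```python
import numpy as np

# Numerical check of the claim around (13): log |rho_hat(k)|^2 is nearly
# quadratic near k = 0, i.e. the intensity resembles a Gaussian there.
N, m = 64, 3
M = 2 * m * N
x = (np.arange(M) - M // 2) / (2 * N)            # physical grid over the FOV
rho = np.where(np.abs(x) < 0.4, np.cos(np.pi * x / 0.8) ** 2, 0.0)

rho_hat = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(rho)))
intensity = np.abs(rho_hat) ** 2

c = M // 2                                       # index of k = 0
k = np.arange(-3, 4)
logI = np.log(intensity[c - 3 : c + 4])
coeffs = np.polyfit(k, logI, 2)                  # parabola fit near the origin
resid = logI - np.polyval(coeffs, k)
print(coeffs[0], np.max(np.abs(resid)))
```

The quadratic coefficient is negative and the fit residual is tiny, confirming that the intensity is locally a Gaussian to high accuracy.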
Example 1.
Let $\{\mathbf{v}_l : l = 1, \ldots, |W|\}$ denote the right singular vectors defined by the matrix $\mathcal{F}^*_{R,W}$ used in Figures 3 and 4, with the corresponding singular values $\{\sigma_l\}$ in decreasing order. The solution to equation (8) can then be represented as
$$\boldsymbol{\alpha}_W = \sum_{l=1}^{|W|} c_l\, \mathbf{v}_l. \qquad (15)$$
Figure 5[a] shows the coefficient vector $\mathbf{c}$ defined by a non-negative image similar to those used in Example 2, and Figure 5[b] shows the coefficient vector defined by an image having both signs, but still having a large mean value. From these plots it is quite apparent that the coefficient of the singular vector with the smallest singular value is nearly an order of magnitude larger than any other coefficient.
(a) Coefficients defined by a non-negative image. (b) Coefficients defined by an image with both signs.

Figure 5: The coefficient vectors from equation (15) defined by real images.
The Norm of the Recovery Operator
The recovery operator is defined in (12) as the composition of $\mathcal{F}^{*\dagger}_{R,W}$, the Moore–Penrose inverse of $\mathcal{F}^*_{R,W}$, with $\mathcal{F}^*_{R,W^c}$. The operator $\mathcal{F}^*_{R,W^c}$ is a composition of orthogonal projections with the unitary operator $\mathcal{F}^*$, and therefore its norm is bounded by 1. Let $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{|W|}$ denote the singular values of $\mathcal{F}^*_{R,W}$ in decreasing order. The norm of $\mathcal{R}_{R,W}$ is therefore bounded above by $\sigma_{|W|}^{-1}$, but, in fact, may be smaller.

In this section, we restrict our attention to the case that $W$ is a square subregion of $J$ and $R$ is the complement of the rectangular subregion $S_{AC} = S \ominus S$ within the field of view. Over the years, a great deal of effort has been expended to understand the singular values of operators like $\mathcal{F}^*_{S_{AC},W}$, a field of research that goes, at least in the continuum case, under the rubric of "prolate spheroidal functions"; see [21, 20, 13]. Because $\mathcal{F}^*$ is a unitary map, and $R = S_{AC}^c$, it follows that
$$\|\boldsymbol{\alpha}_W\|^2 = \|\mathcal{F}^*_{R,W}\boldsymbol{\alpha}_W\|^2 + \|\mathcal{F}^*_{S_{AC},W}\boldsymbol{\alpha}_W\|^2, \qquad (16)$$
and therefore:
$$\mu(R, W, d) = \sigma_{|W|} = \min_{\boldsymbol{\alpha}_W \neq 0} \frac{\|\mathcal{F}^*_{R,W}\boldsymbol{\alpha}_W\|}{\|\boldsymbol{\alpha}_W\|} = \sqrt{1 - \max_{\boldsymbol{\alpha}_W \neq 0} \frac{\|\mathcal{F}^*_{S_{AC},W}\boldsymbol{\alpha}_W\|^2}{\|\boldsymbol{\alpha}_W\|^2}}. \qquad (17)$$
That is, there is a simple relationship between the smallest singular value of $\mathcal{F}^*_{R,W}$ and the largest singular value of $\mathcal{F}^*_{S_{AC},W}$. In fact, this is a special case of the following theorem:
Theorem 2.
Let $K, L \subset J$ and assume that $|K| \leq |L|$. We let $\{\sigma_j\}$ denote the singular values of $\mathcal{F}^*_{L,K}$ in decreasing order and $\{\tau_j\}$ the singular values of $\mathcal{F}^*_{L^c,K}$, also in decreasing order. If $p = |K|$, then, for $1 \leq j \leq p$,
$$\sigma_j^2 = 1 - \tau_{p-j+1}^2. \qquad (18)$$

Note that $K, L$ are arbitrary subsets of $J$, subject to the requirement that $|K| \leq |L|$. The proof of the theorem is given in Appendix B.

This theorem is very useful in the present setting, where $W$ is a rectangular region and $R$ the complement of a rectangular region: it allows us to reduce the analysis of the singular values of $\mathcal{F}^*_{R,W}$ to the case of $\mathcal{F}^*_{S_{AC},W}$. Recalling that $w = \lfloor mk_0 \rfloor$, with $W = [1-w : w-1]^d$ and $S_{AC} = [-\beta N : \beta N]^d$, we let $\{\tau_{j,d}\}$ be the singular values of $\mathcal{F}^*_{S_{AC},W}$ in the $d$-dimensional case, and $\{\sigma_{j,d}\}$ the singular values of $\mathcal{F}^*_{R,W}$. Because the $d$-dimensional DFT is the $d$-fold tensor product of 1-dimensional transforms, it is not difficult to show that
$$\tau_{1,d} = \tau_{1,1}^d. \qquad (19)$$
If $q = |W|$, then Theorem 2 implies that $\sigma_{q,d} = \sqrt{1 - \tau_{1,d}^2}$. As follows from the analysis in Appendix A, $\tau_{1,1} = 1 - \epsilon$, for an $\epsilon \ll 1$, which implies that
$$\tau_{1,d} = (1 - \epsilon)^d = 1 - d\epsilon + O(\epsilon^2), \qquad (20)$$
and therefore
$$\sigma_{q,d} = \sqrt{1 - \tau_{1,d}^2} \approx \sqrt{1 - (1 - d\epsilon)^2} \approx \sqrt{2d\epsilon}. \qquad (21)$$
The norm of the operator of interest in the hole-filling problem is given approximately by:
$$[\mu(S_{AC}, W, d)]^{-1} = \sigma_{q,d}^{-1} \approx \frac{1}{\sqrt{2d\epsilon}}. \qquad (22)$$
In Appendix A we show how to get an asymptotic estimate for the quantity $\mu(S_{AC}, W, 1)$, which depends only on the "space-bandwidth" product, $\beta k_0$. Asymptotically, as $\beta k_0, m, N \to \infty$, we show that
$$\epsilon \sim \pi \sqrt{\beta k_0}\; e^{-2\pi \beta k_0}. \qquad (23)$$
This formula, along with (22), implies the asymptotic formula:
$$\|\mathcal{R}_{W,R}\| \sim \frac{e^{\pi \beta k_0}}{\sqrt{2\pi d\, [\beta k_0]^{1/2}}}. \qquad (24)$$
Note that the exponent in (24) does not depend on the dimension.

The recent analysis in [3] gives a lower bound, which is slightly different, indicating that increased sampling and oversampling might have the effect of slightly decreasing the norm of $\mathcal{R}_{W,R}$. A result of Slepian (reproduced in [3]) shows that, as $m \to \infty$,
$$\|\mathcal{R}_{W,R}\| \sim C_{k_0,m,N}\, \sqrt{d}\, \left[\left(\frac{\pi\beta}{m}\right)^{-1} \tan\left(\frac{\pi\beta}{m}\right)\right]^{mk_0+1}; \qquad (25)$$
see [20]. Here $C_{k_0,m,N}$ is an algebraic factor. As $m \to \infty$ this formula gives the same exponential rate as (24).

It is worth noting that the exponential rate in the conditioning of the hole-filling problem does not depend on the dimension. In fact, the condition number should decrease, albeit slowly, as the dimension increases. The computations in Example 3 show that (24) is fairly accurate, even for moderate values of $\beta k_0$ and $N$, and $m \geq 3$. The main lessons of this analysis are:
1. The size of the hole in $k$-space that can be stably filled using the linear method we have introduced depends mostly on the product $\beta k_0$.
2. The norm of the recovery operator grows exponentially with this product. Recalling that $1 < \beta < 2$, the size of the hole, as measured by $k_0$, that can be filled in this way is quite limited. However, the exponential rate does not depend on the dimension!

In the examples in the next section we see that, for a given $\beta$ and $k_0$, larger values of $m$ do provide a better result, though with little improvement beyond $m = 4$.

We now consider several examples that illustrate the performance of this method on 2d images, and the dependence of $\|\mathcal{R}_{W,R}\|$ on $m$, $N$, $k_0$, and $\beta$. It should be recalled that oversampling is a matter of changing the spacing between the samples collected in $k$-space, and not the maximum frequency collected. As follows from (4), the double-oversampled Fourier coefficient with indices $(2k_1, 2k_2)$ is at the same spatial frequency as the triple-oversampled coefficient with indices $(3k_1, 3k_2)$.

Example 2.
For these examples we use an image, $\rho$, taking both signs, that sits in a rectangle. The function sampled is twice differentiable; for the estimate of the support $S$, we use the 1-pixel neighborhood of the smallest rectangle that contains $\mathrm{supp}\,\rho$. We use either double, $|J| = 128 \times 128$, or triple, $|J| = 192 \times 192$, oversampling, and remove neighborhoods, $W$, of $\mathbf{k} = \mathbf{0}$ in $k$-space of various sizes. In all cases we solve for the missing values using (11).

Figure 6 shows the results with double oversampling and Figure 7 the results with triple oversampling. The plots in the upper left corners show the singular values, in decreasing order, of $\mathcal{F}^*_{R,W}$. The plots in the upper right corners show the set $R$ in yellow. The support of $\rho$ is contained in the union of the light blue and dark blue rectangles, and the hole in $k$-space is dark blue. The plots in the lower left corners are the recovered magnitude-DFT coefficients in $W$ of the autocorrelation function, using the values found in (11) to "fill the hole." The errors in the autocorrelation images are shown in the lower right corners.

[Figure 6(a) panels: singular values of $M$, CN = 3.902e+05; $k$-hole size = 49, OS = 2; AC-hole size = 4285]
[Figure 6(a) panels: reconstructed DFT coefficients, DFT-err = 1.46e-12; AC image error]
(a) $W$ is a $7 \times 7$ square.
[Figure 6(b) panels: singular values of $M$, CN = 3.31e+07; $k$-hole size = 81, OS = 2; AC-hole size = 4285; reconstructed DFT coefficients, DFT-err = 3.29e-10; AC image error]
(b) $W$ is a $9 \times 9$ square.

Figure 6: Plots connected with the recovery of missing samples of the magnitude DFT data using double oversampling.

With triple oversampling we can recover the data with 11 digits of accuracy in a fairly large hole (a $15 \times 15$ hole in a $192 \times 192$ grid), and the matrix $\mathcal{F}^*_{R,W}$ has most of its singular values close to 1. With double oversampling, the conditioning of the matrix $\mathcal{F}^*_{R,W}$ deteriorates more quickly.
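The dependence of the conditioning on the oversampling factor can be reproduced in a small 1-D experiment (our own sketch; the function name `cond_FRW` and all parameter values are ours, not the paper's): with $N$, $\beta$, and $k_0$ fixed, the condition number of $\mathcal{F}^*_{R,W}$ drops sharply in going from $m = 2$ to $m = 3$.

```python
import numpy as np

# 1-D sketch of how the conditioning of F*_{R,W} depends on the oversampling
# factor m, with N, beta, and the unmeasured frequency k0 held fixed.
def cond_FRW(N, m, beta, k0):
    M = 2 * m * N                                  # |J|
    idx = (np.arange(M) + M // 2) % M - M // 2     # centered lattice indices
    w = int(np.floor(m * k0))
    W_idx = idx[(idx >= 1 - w) & (idx <= w - 1)]   # hole W = [1-w : w-1]
    R_idx = idx[np.abs(idx) > beta * N]            # complement of S_AC
    jj, kk = np.meshgrid(R_idx, W_idx, indexing="ij")
    A = np.exp(2j * np.pi * jj * kk / M) / np.sqrt(M)   # F*_{R,W}
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

conds = {m: cond_FRW(64, m, 1.2, 2) for m in (2, 3, 4)}
print(conds)
```

Consistent with the tables in Example 3, the improvement from $m = 3$ to $m = 4$ is much smaller than that from $m = 2$ to $m = 3$.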
[Figure 7(a) panels: singular values of $M$, CN = 1765; $k$-hole size = 81, OS = 3; AC-hole size = 24765; reconstructed DFT coefficients, DFT-err = 5.15e-14; AC image error]
(a) $W$ is a $9 \times 9$ square.
[Figure 7(b) panels: singular values of $M$, CN = 8.779e+05; $k$-hole size = 225, OS = 3; AC-hole size = 24765; reconstructed DFT coefficients, DFT-err = 4.28e-11; AC image error]
(b) $W$ is a $15 \times 15$ square.

Figure 7: Plots connected with the recovery of missing samples of the magnitude DFT data using triple oversampling.

We now make a systematic study of the dependence of $\|\mathcal{R}_{R,W}\|$ on the various parameters that define this operator: $m$, the degree of oversampling; $N$, the "base" number of samples; and $k_0$, the maximum spatial frequency not sampled. For these examples we fix $\beta = 1.2$, so that the autocorrelation image is supported in $[-1.2N : 1.2N]^2$. As predicted by the asymptotic formula (24), the norm of $\mathcal{R}_{R,W}$ increases monotonically with $\beta$.

Example 3.
In these examples the images are indexed by $J = [1-mN : mN]^2$, the image itself is supported in a proper subset of $[-N : N]^2$, its autocorrelation image is supported in $[-1.2N, 1.2N]^2$, and the samples of the magnitude DFT with indices in $[1-w : w-1]^2$ are not measured. As before, $w = \lfloor mk_0 \rfloor$.

The asymptotic formula in (24) is expected to become increasingly accurate as $m, N$ grow. In fact, taking $N = 128$ and $m = 3$ already leads to fairly good agreement with this estimate. To generate the tables below we fix $\beta = 1.2$ and consider various values of $m, k_0$ for $N = 64, 128, 256$. Taking $m = 3$ results in a large improvement over taking $m = 2$, but $m = 4$ only provides a small improvement over $m = 3$.

Table 1: Values of $\|\mathcal{R}_{R,W}\|$ for $\beta = 1.2$, $N = 64$, and various choices of $m$ $(= 2, 3, 4)$ and $k_0$ $(= 1, \ldots, 5)$. The asymptotic values predicted by (24) are shown in the last column.

The following tables are generated with $N = 128$ and $N = 256$. The values in these tables that overlap with those in Table 1 are quite similar, with generally smaller values than for $N = 64$. For example, for $k_0 = 1$, the values of $\|\mathcal{R}_{R,W}\|$ at $m = 2, 3, 4$ are $361.03$, $73.31$, and about $46$ when $N = 128$, and $373.17$, $74.63$, and about $47$ when $N = 256$.

Table 2: Values of $\|\mathcal{R}_{R,W}\|$ for $\beta = 1.2$, $N = 128$ and $N = 256$, and various choices of $m, k_0$. The asymptotic values predicted by (24) are shown in the last column.

To close this section, we consider the relationship between the errors in the recovered DFT magnitude data and those in the squared magnitude data. We express the recovered autocorrelation magnitude data, $\{u_{\mathbf{k}} : \mathbf{k} \in W\}$, as $u_{\mathbf{k}} = |\hat\rho_{\mathbf{k}}|^2 + \epsilon_{\mathbf{k}}$, so that
$$\frac{\left|u_{\mathbf{k}} - |\hat\rho_{\mathbf{k}}|^2\right|}{|\hat\rho_{\mathbf{k}}|^2} = \frac{\epsilon_{\mathbf{k}}}{|\hat\rho_{\mathbf{k}}|^2}. \qquad (26)$$
Clearly we have that
$$\sqrt{u_{\mathbf{k}}} \approx |\hat\rho_{\mathbf{k}}| + \frac{\epsilon_{\mathbf{k}}}{2|\hat\rho_{\mathbf{k}}|}, \qquad (27)$$
and therefore
$$\frac{\left||\hat\rho_{\mathbf{k}}| - \sqrt{u_{\mathbf{k}}}\right|}{|\hat\rho_{\mathbf{k}}|} \approx \frac{\epsilon_{\mathbf{k}}}{2|\hat\rho_{\mathbf{k}}|^2}. \qquad (28)$$
For $\mathbf{k}$ near to zero, the magnitude DFT coefficients, $|\hat\rho_{\mathbf{k}}|$, tend to be large, and therefore we can expect these recovered values to have somewhat smaller relative errors than their squared counterparts. This, however, does not mean that the relative mean square error is smaller for $|\hat\rho_{\mathbf{k}}|$ than for $|\hat\rho_{\mathbf{k}}|^2$. An example comparing these errors is shown in Figure 8. This resulted from filling a $13 \times 13$ hole for a thrice oversampled $64 \times 64$ image. The data used here is noise-free.
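The factor-of-two relation between (26) and (28) is immediate to verify numerically; the following one-line check (ours, with arbitrary illustrative values) makes it concrete.

```python
import numpy as np

# A small perturbation eps of the squared modulus |rho_hat|^2 induces roughly
# half the relative error in the modulus itself, as in (26)-(28).
rho_hat_sq = 400.0                 # a large squared-modulus value near k = 0
eps = 1.0                          # perturbation of the squared data
u = rho_hat_sq + eps

rel_sq = abs(u - rho_hat_sq) / rho_hat_sq                               # (26)
rel_mod = abs(np.sqrt(u) - np.sqrt(rho_hat_sq)) / np.sqrt(rho_hat_sq)   # (28)

print(rel_sq, rel_mod, rel_sq / rel_mod)   # the ratio is close to 2
```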
Figure 8: Relative errors in the recovery of $\{|\hat\rho_{\mathbf{k}}|^2\}$ (blue curve) versus those for the recovered values of $\{|\hat\rho_{\mathbf{k}}|\}$ (red curve).

We now consider the effects of noise on the recovery process. Let $\mathbf{n} \in \mathbb{R}^{W^c}$ represent the measurement error and noise. Then, instead of solving (8), we actually need to solve the equation
$$\mathcal{F}^*_{R,W}(\boldsymbol{\alpha} + \boldsymbol{\beta}) = -\mathcal{F}^*_{R,W^c}(\mathbf{a} + \mathbf{n}). \qquad (29)$$
The relative effect of the noise introduced into $\boldsymbol{\alpha}$ is then measured by the ratio
$$\frac{\|\boldsymbol{\beta}\|}{\|\mathbf{n}\|} = \frac{\|\mathcal{R}_{R,W}\mathbf{n}\|}{\|\mathbf{n}\|}. \qquad (30)$$
The matrix $\mathcal{R}_{R,W}$ has a representation of the form
$$\mathcal{R}_{R,W} = \sum_{j=1}^{|W|} \nu_j\, \mathbf{w}_j \otimes \mathbf{z}_j^*, \qquad (31)$$
where $\{\mathbf{w}_j : j = 1, \ldots, |W|\}$ is an orthonormal basis for the range and the $\{\mathbf{z}_j : j = 1, \ldots, |W|\}$ are pairwise orthonormal. For a vector $\mathbf{n}$,
$$\|\mathcal{R}_{R,W}\mathbf{n}\|^2 = \sum_{j=1}^{|W|} \nu_j^2\, |\langle\mathbf{n}, \mathbf{z}_j\rangle|^2. \qquad (32)$$
The collection of vectors $\{\mathbf{z}_j\}$ can be augmented to give an orthonormal basis, $\{\mathbf{z}_j : j = 1, \ldots, |W^c|\}$, for $\mathbb{R}^{W^c}$. Hence for $\mathbf{n} \in \mathbb{R}^{W^c}$, we have that
$$\sum_{j=1}^{|W^c|} |\langle\mathbf{n}, \mathbf{z}_j\rangle|^2 = \|\mathbf{n}\|^2. \qquad (33)$$
It is often reasonable to assume that the random variables $\{|\langle\mathbf{n}, \mathbf{z}_j\rangle|^2 : j = 1, \ldots, |W^c|\}$ are independent and identically distributed, and therefore the expected values satisfy:
$$E\left(|\langle\mathbf{n}, \mathbf{z}_j\rangle|^2\right) = \frac{\|\mathbf{n}\|^2}{|W^c|}. \qquad (34)$$
This would be the case for any additive, i.i.d. noise process. In this case
$$E\left(\frac{\|\mathcal{R}_{R,W}\mathbf{n}\|^2}{\|\mathbf{n}\|^2}\right) = \frac{1}{|W^c|}\sum_{j=1}^{|W|} \nu_j^2. \qquad (35)$$
From the Cauchy–Schwarz inequality it follows that
$$E\left(\frac{\|\mathcal{R}_{R,W}\mathbf{n}\|}{\|\mathbf{n}\|}\right) \leq \sqrt{\frac{1}{|W^c|}\sum_{j=1}^{|W|} \nu_j^2}. \qquad (36)$$
Even when the norm of $\mathcal{R}_{R,W}$ is large, the quantity appearing on the right hand side of (36) may turn out to be rather modest.
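The gap between $\|\mathcal{R}_{R,W}\|$ and the i.i.d.-noise bound (36) can be seen in a small 1-D Monte Carlo sketch (our own; all parameters and index sets are illustrative choices, not the paper's experiment):

```python
import numpy as np

# Monte Carlo sketch of the bound (36): for additive i.i.d. noise the typical
# amplification ||R n|| / ||n|| is governed by sqrt(sum nu_j^2 / |W^c|),
# which can be far smaller than the norm ||R|| = max nu_j.
rng = np.random.default_rng(2)
N, m, w = 16, 3, 4
M = 2 * m * N
idx = (np.arange(M) + M // 2) % M - M // 2       # centered index of each position
W_pos = np.flatnonzero((idx >= 1 - w) & (idx <= w - 1))
R_pos = np.flatnonzero(np.abs(idx) >= 2 * N)     # outside S_AC
Wc_pos = np.setdiff1d(np.arange(M), W_pos)

jj, kk = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
Fstar = np.exp(2j * np.pi * jj * kk / M) / np.sqrt(M)
Rec = -np.linalg.pinv(Fstar[np.ix_(R_pos, W_pos)]) @ Fstar[np.ix_(R_pos, Wc_pos)]

nu = np.linalg.svd(Rec, compute_uv=False)        # singular values nu_j of R
bound = np.sqrt(np.sum(nu ** 2) / len(Wc_pos))   # right hand side of (36)

ratios = [np.linalg.norm(Rec @ n) / np.linalg.norm(n)
          for n in rng.standard_normal((200, len(Wc_pos)))]
mean_ratio = np.mean(ratios)
print(nu[0], bound, mean_ratio)                  # ||R|| far exceeds the typical ratio
```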
This value gives a good estimate for the effect of noise on the accuracy of the recovered values of the unmeasured DFT modulus data. The number $\sqrt{|W^c|} \approx 2mN$ (in 2d), which shows that a potential advantage of greater oversampling is better noise suppression when recovering the unmeasured DFT magnitude data.

As is well known, an important source of noise in CDI applications is Poisson noise that arises from the discreteness of X-ray photons. This is usually modeled as follows: if $|\hat\rho_{\mathbf{k}}|^2$ is the "true intensity" of the DFT coefficient in the $\mathbf{k}$th pixel, then the measurement $\tilde{a}_{\mathbf{k}}$ is a sample of a Poisson random variable with intensity $|\hat\rho_{\mathbf{k}}|^2$. The "noise" in this pixel is therefore given by
$$n_{\mathbf{k}} = \tilde{a}_{\mathbf{k}} - |\hat\rho_{\mathbf{k}}|^2. \qquad (37)$$
Clearly $E(n_{\mathbf{k}}) = 0$, and $E(n_{\mathbf{k}}^2) = |\hat\rho_{\mathbf{k}}|^2$, and therefore the SNR is $|\hat\rho_{\mathbf{k}}|$, which implies that the Poisson noise process has a pixel dependent SNR. As $E(n_{\mathbf{k}}^2) = E(\tilde{a}_{\mathbf{k}})$, the noise is in some ways similar to the image itself. Indeed, the projection of $\mathcal{F}^*_{R,W^c}\mathbf{n}$ into the range of $\mathcal{F}^*_{R,W}$ tends to be rather large.

Figure 9 shows histograms of the ratios $\|\mathcal{R}_{R,W}\mathbf{n}\|/\|\mathbf{n}\|$ for different noise processes in a triple oversampled example, where the condition number of $\mathcal{R}_{R,W}$ is approximately $1.2 \times 10^{10}$. These ratios are typically less than 400 for uniform and Gaussian noise, and less than 2000 for Poisson noise. The much smaller numbers in the Gaussian and uniform cases are a reflection of the fact that the orthogonal projection of $\mathcal{F}^*_{R,W^c}\mathbf{n}$ into the range of $\mathcal{F}^*_{R,W}$ tends to be quite small for $\mathbf{n}$ a sample of an additive i.i.d. noise process, as predicted in (36). As suggested by the discussion above, the situation is rather different in the Poisson case.

(a) Uniform noise.
OS = 3, Hole-size = 13, noise-model = 2 |M -1 P n|/|n| (b) Gaussian noise.
OS = 3, Hole-size = 13, noise-model = 3 |M -1 P n|/|n| (c) Poisson noise.
Figure 9: Histograms of the ratios $\|\mathcal{R}_{R,W}\, n\|/\|n\|$ for 5000 trials of uniform, Gaussian, and Poisson noise. Here $w = 7$, $m = 3$, $N = 64$.

Hole Filling and Image Reconstruction
In this final section we consider how the hole-filling procedure outlined above affects the outcome of image reconstruction using an HIO algorithm; see [12, 5]. This algorithm, which iterates a map like that in (38), is currently the basis for the best-known and most frequently used phase retrieval methods. In the examples in this section we see that, for a certain range of hole sizes and in the absence of noise, the images obtained by first filling in the unmeasured data using equation (11) and then running HIO are much better than those obtained by using HIO alone. The picture is more complicated when there is noise, with the results then depending on the character of the noise and the SNR. With noise, we find that it is often useful to use some of the values recovered using equation (11), and to allow the others to be filled in implicitly by HIO.

Suppose the data is of the form given by (6), where $\alpha_{W^c}$ denotes the measurements outside of the missing hole $W$. Let $P_A$ denote the projection operator onto the nearest point in a set $A$. We set $B_S = \{\rho : P_{S^c}(\rho) = 0\}$ and $A_a = \{\rho : |P_{W^c} \circ \mathcal{F}(\rho)| = \alpha_{W^c}\}$. HIO and related algorithms provide an update of the form
$$\rho^{(k+1)} = \rho^{(k)} + P_{A_a} \circ R_{B_S}(\rho^{(k)}) - P_{B_S}(\rho^{(k)}), \tag{38}$$
where $R_{B_S}(\rho) = 2P_{B_S}(\rho) - \rho$.

Note that (38) operates agnostically with regard to the missing data inside $W$: for every missing data value, one less constraint equation is imposed. Thus, conceivably, filling in the missing data in $W$ before applying HIO (or any such phase retrieval algorithm) could improve the quality of the reconstructed image. Extensive numerical simulations indeed confirm this to be true. We fix a test image and the set of corresponding squared DFT magnitude measurements, $a_{W^c}$, with the low frequencies that belong to a square, $W$, of size $(2w-1) \times (2w-1)$ centered on $(0,0)$ removed.
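The update (38) can be realized in a few lines. The following toy 1-D sketch is illustrative only (the sizes, hole geometry, and helper names are our assumptions, not the paper's experiments): $P_{B_S}$ is the support projection and $P_{A_a}$ is a modulus projection that, as described above, leaves the frequencies inside $W$ unconstrained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D setup; the sizes and hole geometry are illustrative assumptions,
# not the paper's experiments.
N, pad, w = 32, 96, 3                 # support length, grid size, half-width
x_true = np.zeros(pad)
x_true[:N] = rng.random(N)
k = np.fft.fftfreq(pad) * pad         # signed integer frequencies
measured = np.abs(k) >= w             # W^c; the hole W has 2w - 1 samples
alpha = np.abs(np.fft.fft(x_true))    # moduli; known only on `measured`

def P_B(rho):
    """Projection onto B_S: zero the density outside the known support."""
    out = np.zeros_like(rho)
    out[:N] = rho[:N]
    return out

def P_A(rho):
    """Projection onto A_a: impose the measured moduli on W^c, keeping the
    current phases; frequencies inside W are left unconstrained."""
    F = np.fft.fft(rho)
    F[measured] = alpha[measured] * np.exp(1j * np.angle(F[measured]))
    return np.fft.ifft(F)

def hio_step(rho):
    # Eq. (38): rho_{k+1} = rho_k + P_A(R_B(rho_k)) - P_B(rho_k),
    # with the reflection R_B(rho) = 2 P_B(rho) - rho.
    return rho + P_A(2.0 * P_B(rho) - rho) - P_B(rho)

rho = rng.standard_normal(pad).astype(complex)
rho = hio_step(rho)                   # one sweep of the iteration
```

Note how each unmeasured frequency simply drops out of the constraint set $A_a$; this is the "one less constraint equation" per missing value mentioned above.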
Here, we use triple oversampling, so that $w = \lfloor 3k_1 \rfloor$. We then compare the following two recovery procedures: (i) HIO is directly applied to the "measured" data $a_{W^c}$, and (ii) the missing data in $W$ is first filled in using the recovery operator, and then HIO is applied to the full data set $a$ (henceforth referred to as the "Fill+HIO" algorithm). It is observed that Fill+HIO produces superior image reconstruction for values of $k_1$ for which the linear system given by (8) can be solved accurately. Fill+HIO provides improved recovery up to $w = 15$ ($k_1 \approx 5$), whereas HIO alone fails after $w = 6$ ($k_1 \approx 2$). Typical comparative results on simulated CDI data are shown in Figs. 10 and 11.

Practical approaches to the phase retrieval problem in the presence of noisy data typically involve numerical optimization [19, 1] and data-driven methods, see [16]; these topics are outside the scope of this paper, and we do not pursue them further. However, we do provide some general remarks and guidelines for applying the Fill+HIO algorithm to problems with noisy data.

Figure 10: Image reconstruction via HIO alone (middle row) versus phase retrieval using hole-filling followed by HIO ("Fill+HIO") (bottom row). The image is of size $64 \times 64$, and the data is of size $192 \times 192$, so $m = 3$, and the $(2w-1) \times (2w-1)$ square of lowest frequencies centered on $(0,0)$ is zeroed out.

When used with real measurements, the filled-in data values obtained via (11) are necessarily contaminated by noise. Thus there arises a tradeoff between ignoring the missing data and first recovering estimates for these values, which contain errors.
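The geometry of the missing square described above is easy to make concrete. A sketch, using the Figure 9 sizes ($w = 7$, $m = 3$, $N = 64$) purely as an illustration:

```python
import numpy as np

# Geometry of the missing square W, with the Figure 9 sizes (w = 7,
# m = 3, N = 64) used purely as an illustration.
m, N, w = 3, 64, 7
M = m * N                                   # the data grid is M x M
k = np.fft.fftfreq(M) * M                   # signed integer frequencies
in_hole_1d = np.abs(k) <= w - 1             # 2w - 1 frequencies per axis
W = np.logical_and.outer(in_hole_1d, in_hole_1d)  # hole centered on (0, 0)
Wc = ~W                                     # measured frequencies, W^c
print(W.sum())                              # (2w - 1)^2 = 169 missing samples
```

The mask `W` is exactly the $(2w-1) \times (2w-1)$ square of lowest frequencies in the standard (unshifted) DFT indexing.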
It is observed from numerical simulations that, with noisy data, the best image reconstruction is achieved by utilizing a subset of the data found using (11), and allowing HIO to recover the remaining coefficients.

A natural procedure for determining the best subset is to run multiple trials of the Fill+HIO procedure where, in each trial, the amount of recovered data that is used is incrementally increased. While, in practice, the smallest true error achieved throughout such trials is unknown (since the ground-truth image is unknown), an empirically successful proxy is to consider the data error for each trial: if $\rho'$ is the approximate reconstruction satisfying the support condition and $\rho$ is the ground-truth image, then the data error is
$$\frac{\bigl\|\, |\widehat{\rho'}| - |\widehat{\rho}\,| \,\bigr\|}{\bigl\|\, |\widehat{\rho}\,| \,\bigr\|}. \tag{39}$$
Note that $|\widehat{\rho}\,|$ is, up to noise, exactly the measured modulus data, so this quantity is computable without knowledge of $\rho$ itself. For our experiments we choose the partial filling that minimizes this quantity.

Figure 11: Relative errors $\|\rho - \rho'\|/\|\rho\|$ for the ground-truth image $\rho$ shown in Fig. 1 and the recovered images $\rho'$, using the HIO (blue) and Fill+HIO (orange) algorithms, respectively. The data has the $(2w-1) \times (2w-1)$ square of lowest frequencies zeroed out.

We concentrate on the case of data corrupted by Poisson noise, such as typically occurs in CDI experiments. The discussion at the end of Section 2 clearly indicates that the largest amplification of noise occurs in the recovery of the lowest-frequency values. Thus, a natural search strategy for partially filling a rectangle of missing data with recovered values is to work from the outer boundary of $W$ inward, considering annular regions, which restores the mid-range of missing frequencies first.

We apply this procedure to simulated CDI data, corresponding to the setup in Figs. 10 and 11 with $w = 5$ and $m = 3$, corrupted by Poisson noise with a signal-to-noise ratio of 1000.
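The outer-to-inner partial filling can be sketched as follows. The helper names and the parametrization of the annular layers are ours, introduced for illustration; the data-error helper implements the proxy of (39).

```python
import numpy as np

# Sketch of the outer-to-inner partial filling (illustrative; the helper
# names and layer parametrization are ours, not the paper's).
m, N, w = 3, 64, 5
M = m * N
k = np.fft.fftfreq(M) * M
hole_1d = np.abs(k) <= w - 1
W = np.logical_and.outer(hole_1d, hole_1d)    # missing (2w-1) x (2w-1) square

KX, KY = np.meshgrid(k, k, indexing="ij")
ring = np.maximum(np.abs(KX), np.abs(KY)).astype(int)  # sup-norm "radius"

def fill_mask(layers):
    """Hole frequencies whose recovered values are used: the outermost
    `layers` annuli of W. The center, where noise amplification is worst,
    is only included when layers == w."""
    return W & (ring >= w - layers)

def data_error(recon_mod, data_mod):
    """The proxy of Eq. (39): relative mismatch between the moduli of the
    reconstruction's DFT and the measured moduli."""
    return np.linalg.norm(recon_mod - data_mod) / np.linalg.norm(data_mod)

# Monotone family of candidate fillings, outermost annulus first.
sizes = [int(fill_mask(L).sum()) for L in range(w + 1)]
```

One then runs Fill+HIO for each candidate `fill_mask(L)` and keeps the filling whose reconstruction minimizes `data_error`.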
Over 1000 trials, we observe that the distribution of the recovery error is noticeably improved by restoring some of the missing data before running HIO. This is illustrated in the histograms shown in Figure 12.

Figure 12: Histograms of relative error values generated from 1000 noisy instances of the simulated CDI setup shown in Figure 10, using the HIO (blue) and Partial Fill+HIO (orange) algorithms, respectively. Partial Fill+HIO significantly improves the error distribution.

Conclusion

In this paper, we have investigated the problem of recovering the unmeasured data hidden by the beamstop in CDI imaging. Rather than including the full complex Fourier transform at these missing locations as part of a global inverse problem, we have shown that the modulus Fourier data within the beamstop can itself be recovered, as the solution to a linear least squares problem. Algorithms for phase retrieval can then be used, in a second step, on this "filled-in" data set. The power of this approach is illustrated in Fig. 11.

We also analyzed under what conditions this method of recovery is likely to be successful. If $W$ and $S_{AC}$ are rectangular subsets of $J$, then the answer hinges on the value of the dimensionless parameter $\beta k_1$, where $\beta > 1$ is determined by the support of the autocorrelation function $\rho \star \rho$, and $k_1$ is the hole width. In this analysis, we assume the object of interest has spatial dimensions normalized to unit length.

Our analysis provides a generalization of Hayes' theorem [14] to the case of phase retrieval with missing data. Theorem 1 shows that, very often, the missing data hidden by the beamstop can be uniquely recovered from the measured magnitude data. Hayes' theorem then applies directly to the completed data set, showing that the solution to the phase retrieval problem is again generically unique.

This method for recovering the unmeasured magnitude data should be applicable to many classical phase retrieval problems.
We are currently investigating the extension of our results to other X-ray imaging modalities.

A Appendix
In this appendix we derive an asymptotic bound for $\mu(R, W, d)$, assuming that $W$ and $S_{AC} = R^c$ are rectangular subsets. This bound becomes more accurate as $m, N$ tend to infinity and the product $\beta k_1$ grows. A similar question is addressed in Barnett's recent paper [3] on the conditioning of sub-blocks of the DFT matrix; an upper bound on $\mu(R, W, 1)$ follows from the estimates in that paper.

In (17) we show that
$$\mu(S_{AC}, W, d) = 1 - \max_{\alpha_W \ne 0} \frac{\|\mathcal{F}^*_{S_{AC},W}\, \alpha_W\|^2}{\|\alpha_W\|^2}. \tag{40}$$
It should first be noted that, as $\mathcal{F}^*\alpha = [\mathcal{F}(\alpha)]^*$, replacing $\mathcal{F}^*$ with $\mathcal{F}$ does not change the value of the maximum in this formula. The key consequence of this formula is that the smallest singular value of $\mathcal{F}^*_{R,W}$ is determined by the largest singular value of $\mathcal{F}^*_{S_{AC},W}$. Because $S_{AC} \subset J$ is a rectangular set, and the $d$-dimensional DFT is a tensor product of 1-dimensional DFTs, these singular values can be determined as products of singular values that arise in the 1-dimensional case.

With this in mind, we let $\alpha$ be a sequence of length $mN$ supported in $[1-w : w-1]$, and $f$ a real-valued function supported in $[-k_1, k_1]$, where $w = \lfloor mk_1 \rfloor$, with $f\bigl(\tfrac{j}{m}\bigr) = \sqrt{\tfrac{m}{N}}\, \alpha_j$. The discrete Fourier transform of $\alpha$ is
$$\hat{\alpha}_k = \frac{1}{m} \sum_{j=1-w}^{w-1} f\left(\frac{j}{m}\right) e^{-\frac{2\pi i jk}{mN}} \approx \hat{f}\left(\frac{k}{N}\right). \tag{41}$$
With these approximations it follows that
$$\sum_{k=-\beta N}^{\beta N} |\hat{\alpha}_k|^2 \approx \sum_{k=-\beta N}^{\beta N} \left|\hat{f}\left(\frac{k}{N}\right)\right|^2, \qquad \sum_{j=1-w}^{w-1} |\alpha_j|^2 \approx N \int_{-k_1}^{k_1} |f(x)|^2\, dx, \tag{42}$$
and therefore the ratio, whose maximum defines $\mu([-\beta N : \beta N], [1-w : w-1], 1)$, is approximated by
$$\frac{\sum_{k=-\beta N}^{\beta N} |\hat{\alpha}_k|^2}{\sum_{j=1-w}^{w-1} |\alpha_j|^2} \approx \frac{\int_{-\beta}^{\beta} |\hat{f}(y)|^2\, dy}{\int_{-k_1}^{k_1} |f(x)|^2\, dx}. \tag{43}$$
As noted above, $\alpha$ is a sequence supported in $[1-w : w-1]$. The ratio of the sums on the left-hand side converges to the ratio of integrals on the right-hand side as $N, m \to \infty$. We define
$$\lambda(k_1, \beta, 1) = \max_{f \in \mathcal{A}_{k_1} \setminus \{0\}} \frac{\int_{-\beta}^{\beta} |\hat{f}(y)|^2\, dy}{\int_{-k_1}^{k_1} |f(x)|^2\, dx}, \tag{44}$$
where $\mathcal{A}_{k_1}$ consists of functions in $L^2(\mathbb{R})$ supported in $[-k_1, k_1]$. The calculations above show that, at least asymptotically as $m, N$ grow,
$$\mu(S_{AC}, W, 1) \approx 1 - \lambda(k_1, \beta, 1). \tag{45}$$
The quantity on the right-hand side of (45) has been intensively studied in the literature on prolate spheroidal functions, see [13, 20]; adapting the result of Theorem 1 from [13], we obtain
$$\mu(S_{AC}, W, 1) \sim \pi \sqrt{\beta k_1}\, e^{-\pi \beta k_1}. \tag{46}$$
The asymptotic evaluation on the right-hand side of (46) is in the limit $\beta k_1 \to \infty$.

In $d$ dimensions, suppose that the support of the autocorrelation image is contained in the cuboid $[-\beta, \beta]^d$ and $W = [1-w : w-1]^d$. Let $\lambda(k_1, \beta, d)$ be the $d$-dimensional analogue of $\lambda(k_1, \beta, 1)$; the extremizer defining $\lambda(k_1, \beta, d)$ is just the $d$-fold tensor product of the 1-dimensional extremizer. Hence we see that
$$\lambda(k_1, \beta, d) = \lambda(k_1, \beta, 1)^d \sim \left(1 - \pi \sqrt{\beta k_1}\, e^{-\pi \beta k_1}\right)^d \approx 1 - d\pi \sqrt{\beta k_1}\, e^{-\pi \beta k_1}, \tag{47}$$
and therefore
$$\mu(S_{AC}, W, d) \sim d\pi \sqrt{\beta k_1}\, e^{-\pi \beta k_1}. \tag{48}$$
The norm of $\mathcal{R}_{R,W}$ might be smaller than $[\mu(R, W, d)]^{-1/2}$, as $\mathcal{F}^*_{R,W^c}$ has norm less than 1. In fact, in our applications, the norm of $\mathcal{F}^*_{R,W^c}$ is very close to 1. Hence the norm of the recovery operator is given asymptotically by the quantity
$$[\mu(S_{AC}, W, d)]^{-1/2} \sim \frac{e^{\frac{\pi \beta k_1}{2}}}{\sqrt{d\pi}\, [\beta k_1]^{1/4}}. \tag{49}$$
Since the minimum singular value of $\mathcal{R}_{R,W}$ is very close to 1, this is also an asymptotic estimate for the condition number.
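The mechanism underlying this estimate, that the smallest singular value of one row block of a unitary DFT matrix is controlled by the largest singular value of the complementary block, can be checked numerically at toy sizes. The sizes and index sets below are illustrative only:

```python
import numpy as np

# Toy-size check: for a unitary DFT matrix F, split its rows into a block
# F_L (frequencies kept) and the complementary block F_{L^c} (the "hole"),
# both restricted to a column set K. Their singular values pair up as
# s_j^2 + t_{p-j+1}^2 = 1, so the smallest s is fixed by the largest t.
n, p = 48, 9
F = np.fft.fft(np.eye(n), norm="ortho")        # unitary n x n DFT matrix
K = np.arange(p)                               # columns: a length-p support
L = np.abs(np.fft.fftfreq(n) * n) >= 6         # rows outside the hole

s = np.linalg.svd(F[L][:, K], compute_uv=False)    # decreasing order
t = np.linalg.svd(F[~L][:, K], compute_uv=False)   # decreasing order
# Pair the j-th largest s with the j-th smallest t:
assert np.allclose(s ** 2 + t[::-1] ** 2, 1.0)
```

The identity holds because the two blocks stack to a matrix with orthonormal columns, so the Gram matrices of the blocks sum to the identity; this is exactly the relation exploited in Appendix B.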
(a) $\beta = 1.4$. (b) $\beta = 1.6$. (c) $\beta = 1.8$.

Figure 13: Plots of $[\mu(S_{AC}, W, 1)]^{-1/2}$ for $m = 2, 3, 4, 5$ (in blue), along with a plot of the asymptotic formula (in red). In these plots $k_1$ ranges from 1 to 10, and the $y$-axis goes from $1$ to $\sim 10^{16}$.

In Figure 13(a,b,c) we show values of $[\mu(S_{AC}, W, 1)]^{-1/2}$ for $\beta = 1.4, 1.6, 1.8$. In each plot there are 4 blue curves, corresponding to $m = 2, 3, 4, 5$, along with the predictions (in red) made by (49) with $d = 1$. The $x$-axis is $k_1$, which ranges from 1 to 10. As $m$ increases, the blue curves get closer to the plot of the asymptotic formula. As long as there is sufficient accuracy in the double precision calculation, the asymptotic formula is close to the exact calculation by the time $m = 3$, and it is a lower bound throughout this range of parameters. Once the condition number reaches $\sim 10^{16}$, the calculation of $[\mu(S_{AC}, W, 1)]^{-1/2}$ saturates and is no longer meaningful. In these computations $N = 128$, and increasing it does not significantly change these results.

This analysis shows that, asymptotically, the size of the hole in $k$-space that can be stably filled depends mostly on the product $\beta k_1$; in particular, it does not depend strongly on the extent of oversampling, provided that $m \ge 2$. Perhaps most surprisingly, the norm of $\mathcal{R}_{R,W}$ decreases with the dimension! Figure 13 and the tables in Example 3 show that the asymptotic formula provides a lower bound on $\|\mathcal{R}_{R,W}\|$, which improves as $m, N$ increase.
B Appendix
To prove Theorem 2, we recall the variational characterizations of the singular values of a linear map $A : \mathbb{C}^p \to \mathbb{C}^n$, and assume that $p \le n$. It turns out to be simpler in the proof to list the singular values in increasing order: $s_1 \le s_2 \le \cdots \le s_p$. Note that if $p > n$, then $s_1 = 0$, which is not too interesting. The $j$th singular value of $A$ has two variational characterizations:
$$s_j = \min_{\{S \subset \mathbb{C}^p :\, \dim S = j\}}\ \max_{x \in S,\, x \ne 0} \frac{\|Ax\|}{\|x\|} = \max_{\{S \subset \mathbb{C}^p :\, \dim S = p-j+1\}}\ \min_{x \in S,\, x \ne 0} \frac{\|Ax\|}{\|x\|}. \tag{50}$$

Proof.
The basic observation is that, because $\mathcal{F}$ is a unitary map, for $x \in \operatorname{Im} \pi_K$ we have the identity
$$\|x\|^2 = \|\mathcal{F}_{L,K}\, x\|^2 + \|\mathcal{F}_{L^c,K}\, x\|^2. \tag{51}$$
We think of $\mathcal{F}_{L,K}$ as a map from $\mathbb{C}^K$ to $\mathbb{C}^J$, with singular values $s_1 \le \cdots \le s_p$, where $p = |K|$, and of $\mathcal{F}_{L^c,K}$ as a map from $\mathbb{C}^K$ to $\mathbb{C}^J$, with singular values $t_1 \le \cdots \le t_p$. Using the observations above, we see that
$$s_j^2 = \min_{\{S \subset \mathbb{C}^K :\, \dim S = j\}}\ \max_{x \in S,\, x \ne 0} \frac{\|\mathcal{F}_{L,K}\, x\|^2}{\|x\|^2} = \min_{\{S \subset \mathbb{C}^K :\, \dim S = j\}}\ \max_{x \in S,\, x \ne 0} \left[1 - \frac{\|\mathcal{F}_{L^c,K}\, x\|^2}{\|x\|^2}\right] = 1 - \max_{\{S \subset \mathbb{C}^K :\, \dim S = j\}}\ \min_{x \in S,\, x \ne 0} \frac{\|\mathcal{F}_{L^c,K}\, x\|^2}{\|x\|^2} = 1 - t_{p-j+1}^2. \tag{52}$$

References

[1] D. A. Barmherzig and J. Sun, Low-photon holographic phase retrieval, OSA Imag. Appl. Opt. Cong. (2020), pp. 1–2.

[2] D. A. Barmherzig, J. Sun, P. Li, T. J. Lane, and E. J. Candès, Holographic phase retrieval and reference design, Inv. Prob., 35 (2019), p. 194001.

[3] A. Barnett, How exponentially ill-conditioned are contiguous submatrices of the Fourier matrix?, arXiv:2004.09643 (2020), pp. 1–24.

[4] A. Barnett, C. L. Epstein, L. Greengard, and J. Magland, Geometry of the phase retrieval problem, arXiv:1808.10747 (2018), pp. 1–33.

[5] H. H. Bauschke, P. L. Combettes, and D. R. Luke, Phase retrieval, error reduction algorithm, and Fienup variants: a view from convex optimization, J. Opt. Soc. Am. A, 19 (2002), pp. 1334–1345.

[6] Y. Bruck and L. Sodin, On the ambiguity of the image reconstruction problem, Optics Communications, 30 (1979), pp. 304–308.

[7] H. N. Chapman et al., Femtosecond diffractive imaging with a soft-x-ray free-electron laser, Nat. Phys., 2 (2006), pp. 839–843.

[8] H. N. Chapman, A. Barty, S. Marchesini, A. Noy, S. P. Hau-Riege, C. Cui, M. R. Howells, R. Rosen, H. He, J. C. H. Spence, U. Weierstall, T. Beetz, C. Jacobsen, and D. Shapiro, High-resolution ab initio three-dimensional x-ray diffraction microscopy, J. Opt. Soc. Am. A, 23 (2006), pp. 1179–1200.

[9] K. He, M. K. Sharma, and O. Cossairt, High dynamic range coherent imaging using compressed sensing, Opt. Exp., 23 (2015), pp. 30904–30906.

[10] V. Elser, I. Rankenburg, and P. Thibault, Searching with iterated maps, Proceedings of the National Academy of Sciences, 104 (2007), pp. 418–423.

[11] A. Fannjiang and T. Strohmer, The numerics of phase retrieval, arXiv:2004.05788 (2020), pp. 1–83.

[12] J. R. Fienup, Phase retrieval algorithms: a comparison, Applied Optics, 21 (1982), pp. 2758–2769.

[13] W. Fuchs, On the eigenvalues of an integral equation arising in the theory of band-limited signals, Jour. Math. Anal. and Appl., 9 (1964), pp. 317–330.

[14] M. Hayes, The reconstruction of a multidimensional sequence from the phase or magnitude of its Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, 30 (1982), pp. 140–154.

[15] J. Miao, P. Charalambous, J. Kirz, and D. Sayre, Extending the methodology of x-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens, Nature, 400 (1999), pp. 342–344.

[16] C. Metzler, P. Schniter, A. Veeraraghavan, and R. Baraniuk, prDeep: Robust phase retrieval with a flexible deep network, ICML, 35 (2018), pp. 3501–3510.

[17] Y. Nishino, J. Miao, and T. Ishikawa, Image reconstruction of nanostructured nonperiodic objects only from oversampled hard x-ray diffraction intensities, Phys. Rev. B, 68 (2003), p. 220101.

[18] Y. Shechtman, Y. C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, Phase retrieval with application to optical imaging: a contemporary overview, IEEE Sig. Proc. Mag., 32 (2015), pp. 87–109.

[19] B. Shi, Q. Lian, X. Huang, and N. An, Constrained phase retrieval: when alternating projection meets regularization, J. Opt. Soc. Am. B, 35 (2018), pp. 1271–1281.

[20] D. Slepian, Prolate spheroidal wave functions, Fourier analysis, and uncertainty – V: the discrete case, Bell System Technical Journal, 57 (1978), pp. 1371–1430.

[21] D. Slepian and H. O. Pollak, Prolate spheroidal wave functions, Fourier analysis, and uncertainty, I, Bell System Technical Journal, 40 (1961), pp. 43–63.

D. A. Barmherzig: dbarmherzig@flatironinstitute.org
A. H. Barnett: abarnett@flatironinstitute.org
C. L. Epstein: [email protected]
L. F. Greengard: lgreengard@flatironinstitute.org
J. F. Magland: jmagland@flatironinstitute.org
M. Rachh: