[PDF] Semiclassical sampling and discretization of certain linear inverse problems

Abstract

We study sampling of Fourier Integral Operators A at rates sh with s fixed and h a small parameter. We show that the Nyquist sampling limit of Af and f are related by the canonical relation of A using semiclassical analysis. We apply this analysis to the Radon transform in the parallel and the fan-beam coordinates. We explain and illustrate the optimal sampling rates for Af , the aliasing artifacts, and the effect of averaging (blurring) the data Af . We prove a Weyl type of estimate on the minimal number of sampling points to recover f stably in terms of the volume of its semiclassical wave front set.

Full PDF

SSEMICLASSICAL SAMPLING AND DISCRETIZATION OF CERTAIN LINEARINVERSE PROBLEMS

PLAMEN STEFANOV

Abstract.

We study sampling of Fourier Integral Operators A at rates sh with s ﬁxed and h asmall parameter. We show that the Nyquist sampling limit of Af and f are related by the canonicalrelation of A using semiclassical analysis. We apply this analysis to the Radon transform in theparallel and the fan-beam coordinates. We explain and illustrate the optimal sampling rates for Af , the aliasing artifacts, and the eﬀect of averaging (blurring) the data Af . We prove a Weyl typeof estimate on the minimal number of sampling points to recover f stably in terms of the volumeof its semiclassical wave front set. Introduction

The classical Nyquist–Shannon sampling theorem says that a function f ∈ L ( R n ) with a FourierTransform ˆ f supported in the box [ − B, B ] n can be uniquely and stably recovered from its samples f ( sk ), k ∈ Z n as long as 0 < s ≤ π/B . More precisely, we have(1) f ( x ) = (cid:88) k ∈ Z f ( sk ) χ (cid:16) πs ( x − sk ) (cid:17) , χ ( x ) := n (cid:89) j =1 sinc( x j )and(2) (cid:107) f (cid:107) = s n (cid:88) k ∈ Z | f ( sk ) | , where (cid:107) · (cid:107) is the L norm, see, e.g., [14] or [8]. If s < π/B (strictly) then we have oversamplingand one can replace the sinc function in (1) by a faster decaying one, see Theorem 3.1 below. Forpractical purposes, there are two major inconveniences: we need inﬁnitely many samples and f has to be real analytic, and in particular, it cannot be compactly supported unless it is zero. Thestability (2) allows us to resolve those diﬃculties by using approximate recovery for approximatelyband limited functions. Let us say that f is “essentially supported” in some box [ − R, R ] n in thesense that f = f + f with (cid:107) f (cid:107) ≤ ε (cid:28) f being “essentially B -band limited” in the sensethat the L norm of ˆ f outside that frequency box is bounded by some 0 < ε (cid:28)

1. Then (1)recovers f up to an error small with ε , see [14], by sampling f (not f ). The eﬀect of replacing f by f can be estimated in terms of ε as well.There are generalizations of the sampling theorem to non-rectangular but still periodic grids,see, e.g., [16] or to some non-uniform ones, see, e.g., [13] but the latter theory is not as completewhen n ≥

2. The version presented above is equivalent to viewing R n as a product of n copiesof R . In particular, it is invariant under translations and dilations and has a natural extension toactions of linear transformations. On the other hand, the conditions are sharp both for uniquenessand for stability. If the sampling rate (Nyquist) condition is violated, there is non-uniqueness andif we still use (1), we get aliasing. There are also versions for f belonging to spaces diﬀerent than Date : November 15, 2018.Partially supported by the National Science Foundation under grant DMS-1600327. a r X i v : . [ m a t h . A P ] N ov PLAMEN STEFANOV L . The proof of the sampling theorem is equivalent to thinking about f as the inverse Fouriertransform of ˆ f , the latter compactly supported. Therefore the samples f ( sk ) are essentially theFourier coeﬃcients of ˆ f extended as a 2 π/s periodic function in each variable (which also explainsthe Nyquist limit condition), see Theorem 2.2.The purpose of this work is to study the eﬀect of sampling the data at a certain rate for a classof linear inverse problems. This class consists of problems of inverting a Fourier Integral Operator(FIO): ﬁnd f if(3) Af = m with m given (so far, noiseless) and A is an FIO of a certain class. There are many examples:inversion of the Euclidean X-ray and the Radon transforms, for which the sampling problem is wellstudied, see, e.g., the references in [14, Ch. III]; inversion of the geodesic X-ray transform and moregeneral Radon transforms; thermo and photo-acoustic tomography with a possibly variable speed,etc. A large class of integral geometry operators are in fact FIOs, as ﬁrst noticed by Guillemin [9, 10].The solving operators of hyperbolic problems are also FIOs in general. On the other hand, manynon-linear inverse problems have a linearization of this kind, like the boundary rigidity problem orvarious problems of recovery coeﬃcients in a hyperbolic equation from boundary measurements.We study the following types of questions. (i) Sampling Af : Given an essential frequency bound of f (the lowest possible “detail”), howﬁne should we sample the data Af for an accurate enough recovery? This question, posed thatway, includes the problem of inverting A in the ﬁrst place, in addition to worrying about sampling.The answer is speciﬁc to A which could be associated to a canonical graph or not, elliptic or not,injective or not. Then a reformulation of the ﬁrst question is — if f is approximately band limited,is also Af approximately band limited, with what limit, and then what sampling rate will recoverreliably Af ? The problem of recovery of f after that depends on the speciﬁc A . (ii) Resolution limit on f given the sampling rate of Af . Suppose we have ﬁxed thesampling rate of Af (not necessarily uniformly sampled). In applications, we may not be able tosample too densely. What limit does this pose on the smallest detail of f we can recover? Theanswer may depend on the location and on the direction of those details. (iii) Aliasing. Above, if f has detail smaller than that limit, there will be aliasing. How willthe aliasing artifacts look like? Aliasing is well understood in classical sampling theory but thequestion here is what kind of artifacts an aliased Af would create in the reconstructed f . (iv) Averaged measurements/anti-aliasing. Assume we cannot sample Af densely enoughor assume that f is not even approximately band limited. Then the data would be undersampled.The next practical question is — can we blur the data before we sample to avoid aliasing andthen view this as an essentially properly sampled problem but for a blurred version of f ? This is astandard technique in imaging and in signal processing but here we want to relate to reconstructionof f to the blurring of the data Af . In X-ray tomography, for example, this would mean replacingthe X-ray with thin cylindrical packets of rays and/or using detectors which could average oversmall neighborhoods and possibly overlap. One could also vibrate the sample during the scan.In thermo and photoacoustic tomography, one can take detectors which average over some smallareas. The physical detectors actually do exactly that in order to collect good signal which bringsus to another point of view — physical measurements are actually already averaged and we wantto understand what this does to the reconstructed f .To answer the sampling question (i), one may try to estimate the essential support (the “band”)of (cid:99) Af given that of ˆ f and then apply some of the known sampling theorems. This is a possibleapproach for each particular problem but will also require the coeﬃcients of A , roughly speaking, to EMICLASSICAL SAMPLING 3 be also band limited, and the band limit of Af would depend of that of f and on A . The operatorsof interest have singular Schwartz kernels however. Also, it may not be easy to get sharp constants.One might prove that if f is approximately B band limited, then Af is, say, CB approximatelyband limited. The success of this approach would depend heavily on having a sharp constant C .Proving that there exists some C > s would have to be scaled as s/C . In some symmetric cases, suchdirect approach can and has been done, for example for the Euclidean X-ray/Radon transform, seethe references in Section 6. Our main interests however is in inverse problems without symmetries(coming from diﬀerential equations with variable coeﬃcients, for example). Note that one maytry to use interpolation by various functions, like splines, for example but the same problem existsthere since the bounds of the error depend on a priori bounds of some higher order derivatives, andthe constants in those estimates matter.To overcome this diﬃculty, we look at the problem as an asymptotic one. We think of thehighest frequency of f (in some approximate sense) as a large parameter and we are interested inthe optimal sampling rate for Af when that upper bound gets higher and higher; which would forcethe sampling step s > ξ to ξ/h , where 0 < h (cid:28) sh . Then weassume that | ξ | is bounded by some constant B which we call a semiclassical band limit of f . Wethink of f as a family, depending on h . The natural machinery for this is the semi-classical calculus.The frequency content of f locally is described by its semi-classical wave front set WF h ( f ): if thelatter consists (essentially) of ( x, ξ ) with | ξ | ≤ B (then we say that f is semiclassically B -bandlimited), then ˆ f is essentially supported in B/h . In classical terms, this means | ˆ f ( ξ ) | = O ( h ∞ ) for | ξ | ≥ B/h , i.e., ˆ f is essentially supported in | ξ | ≤ B/h as h (cid:28)

1. One can think of

B/h as the upperbound of | ξ | for ξ ∈ supp ˆ f (up to a small error). One can also handle O ( h ) errors instead of O ( h ∞ )by replacing WF h with is semiclassical Sobolev version. If A is a semiclassical FIO (h-FIO), thenWF h ( f ) is mapped to WF h ( Af ) by the canonical relation of A . This is also true, away from thezero frequencies, if A is a classical FIO. That property is sharp when A is elliptic (which happensfor most stable problems). Therefore, we can estimate sharply WF h ( Af ) and apply an appropriatesampling theorem. The canonical relation is typically described by some properties of the geometryof the problem, as we will demonstrate on some examples.Knowing WF h ( Af ) for all f semiclassically B-band limited f determines the sampling rate for Af . Indeed, let Σ h ( Af ) be the semiclassical frequency set of Af deﬁned as the projection ofWF h ( Af ) onto the dual variable. Then an upper bound of the size, or even the shape of Σ h ( f )determines a sharp sampling rate for Af , see Theorem 3.2. Since WF h ( Af ) is 2 n -dimensionaland Σ h ( Af ) is n -dimensional, the latter discards useful information about the x -localization ofWF h ( Af ). Our analysis allows to formulate results about non-uniform but still a union of locallyuniform sampling lattices allowing us to use coarser sampling where the frequencies cannot reachtheir global maximum. We demonstrate how this works for the Radon transform. Note that thisnon-uniform sampling is not a consequence of a priori assumptions on the localization of WF h ( f )(although, if we have such assumptions, we can do further reductions) — it depends on the intrinsicgeometry of the problem i.e., on the Lagrangian of A .To answer (ii), assuming that we sample Af at a rate requiring, say Σ h ( Af ) to be supported in { η ; | η j | ≤ B, ∀ j } , we need to map this box back by the inverse of the canonical relation of A .The aliasing question (iii) admits a neat characterization. Aliasing is well understood in principleas frequencies ξ shifting (or “folding”) in the Fourier domain. This can be regarded as a h-FIO, callit S , in our setting. If A is associated with a local canonical diﬀeomorphism, then inverting A withaliased measurements results in A − SA , which is an h-FIO with a canonical relation a composition PLAMEN STEFANOV of the three. While classical aliasing shifts frequencies but preserves the space localization (see, e.g.,Figure 1); we get a new eﬀect here: the “inversion” A − SA does not preserve the space localizationand can shift parts of the image, see Figure 5.The averaged measurements/anti-aliasing problem (iv) can be resolved as follows. If the a prioriestimate of the size of WF h ( f ) is too high (or inﬁnite) for our sampling rate, we can blur thedata before sampling, i.e., apply an anti-aliasing ﬁlter, which is routinely done in signal and imageprocessing. To do this, take some φ ∈ C ∞ decaying fast enough, set φ h = h − m φ ( · /h ) (here m is thedimension where we collect the measurements, and h − m is a normalization factor), and consider φ h ∗ Af . That convolution can be seen to be a semiclassical ΨDO (a Fourier multiplier) with asemiclassical symbol ˆ φ . One can also use a more general semiclassical ΨDO. By Egorov’s theorem, φ h ∗ Af = AP h f + O ( h ∞ ) f , where P h is a zeroth order semiclassical ΨDO with a principal symbolobtained by ˆ φ pulled back by the canonical relation of A . The essential support of the full symbol isalso supported where the principal one is. Therefore, if A is associated to a diﬀeomorphism at least(but not only), one can choose φ h so that P h plays the role of the low-pass ﬁlter needed for propersampling if the rate of the latter on the image side is given. If the inverse problem is well posedin a certain way, we will recover P h f stably and the latter will be a semiclassical ΨDO applied to f cutting oﬀ higher frequencies which can be viewed as f regularized. We can even choose P h asdesired, and then compute φ which is general may change from one sample to another. This bringsus to the more general question of sampling Q h Af , where Q h is an h -ΨDO limiting the frequencycontent, instead of being just a convolution. The analysis is then similar.Finally, we prove in Section 5 an asymptotic lower bound on the number of non-uniform samplingpoints needed to sample stably f with WF h ( f ) in a given compact subset K of T ∗ R n . It is of Weyltype and equal to (2 πh ) − n Vol( K int ). This generalizes a theorem by Landau [13] to our settingwhere the sampling is classical and the number of the sampling points in any Ω is estimated bybelow by (2 π ) − n Vol(Ω × B ), if supp ˆ f ⊂ B .We want to mention that numerical computations of FIOs by discretization is an importantproblem by itself, see, e.g., [4, 1, 5, 3] which we do not study here. The emphasis of this paperhowever is diﬀerent: how the sampling rate and/or the local averaging of the FIO aﬀect the amountof microlocal data we collect and in turn how they could limit or not its microlocal inversion. Acknowledgments.

The author thanks Fran¸cois Monard for the numerous discussions on boththe theoretical and on the numerical aspects of this project; Yang Yang for allowing him to use hiscode for generating the numerical examples in Figure 18; and Maciej Zworski and Kiril Datchevfor discussions on the relationship between classical and semiclassical ΨDOs and FIOs.2.

Action of Ψ DOs and FIOs on the semiclassical wave front set

Wave Front sets.

Our main reference for the semiclassical calculus is [20], see also [7]. Forthe sake of simplicity, we work in R n but those notions are extendable to manifolds. Recall thatthe semiclassical Fourier transform F h f of a function depending also on h is given by F h f ( ξ ) = (cid:90) e − i x · ξ/h f ( x ) d x. This is just a rescaled Fourier transform F h f ( ξ ) = ˆ f ( ξ/h ). Its inverse is (2 πh ) − n F ∗ h . We recall thedeﬁnition of the semiclassical wave front set of a tempered h -depended distribution ﬁrst. In thisdeﬁnition, h > h ∈ (0 , h ) is a “small” parameterand we are interested in the behavior of functions and operators as h gets smaller and smaller.Those functions are h -dependent and we use the notation f h or f h ( x ) or just f . We follow [20] with EMICLASSICAL SAMPLING 5 the choice of the Sobolev spaces to be the semiclassical ones deﬁned by the norm (cid:107) f (cid:107) H sh = (2 πh ) − n (cid:90) (cid:104) ξ (cid:105) s |F h f ( ξ ) | d ξ. Then an h -dependent family f h ∈ S (cid:48) is said to be h -tempered (or just tempered) if (cid:107) f h (cid:107) H sh = O ( h − N ) for some s and N . All functions in this paper are assumed tempered even if we do notsay so. The semiclassical wave front set of a tempered family f h is the complement of those( x , ξ ) ∈ R n for which there exists a C ∞ function φ so that φ ( x ) (cid:54) = 0 so that(4) F h ( φf h ) = O ( h ∞ ) for ξ in a neighborhood of ξ in L ∞ (or in any other “reasonable” space, which does not change the notion). The semiclassicalwave front set naturally lies in T ∗ R n but it is not conical as in the classical case. Note that thezero section can be in WF h ( f ).There is no direct relationship between the semiclassical wave front WF h and the classical oneWF (when h is ﬁxed in the latter case), see also [20]. For example, for f ∈ S independent of h ,WF h ( f ) = supp f × { } while WF( f ) is empty. On the other hand, if g is singular and compactlysupported, then for f ( x ) = g ( x − /h ) we have WF h ( f ) = ∅ while WF( f ) is non-empty for every h ,see [20]. Sj¨ostrand proposed adding the classical wave front set to WF h by considering the latterin T ∗ R n ∪ S ∗ R n , where the second space (the unit cosphere bundle) represents T ∗ R n as a conicset, i.e., each ( x, ξ ) with ξ unit is identiﬁed with the ray ( x, sξ ), s >

0. Their points are viewedas “inﬁnite” ones describing the behavior as | ξ | → ∞ along diﬀerent directions. An inﬁnite point( x , ξ ) does not belong to the so extended WF h ( f ) if we have(5) F h ( φf h ) = O ( h ∞ (cid:104) ξ (cid:105) −∞ ) for ξ in a conical neighborhood of ξ with φ as above. Our interest is in functions which are localized in the spatial variable and do nothave inﬁnite singularities. In [20], it is said that a tempered f h is localized in phase space, if thereexists ψ ∈ C ∞ ( R n ) so that(6) (Id − ψ ( x, hD )) f h = O S ( h ∞ ) , see the deﬁnition of h -ΨDOs below. Such functions do not have inﬁnite singularities and aresmooth. We work with functions localized in phase space and those are the functions which canbe sampled properly anyway. In practical applications, this assumption is satisﬁed by the naturalresolution limit of the data we collect, for example the diﬀraction limit.Other examples of semiclassical wave front sets are the following. If f h = e i x · ξ /h , then WF h ( f ) = R n × { ξ } . The coherent state(7) f h ( x ; x , ξ ) = e i x · ξ /h −| x − x | / h (to normalize for unit L norm, we need to multiply by ( πh ) − n/ ) satisﬁes WF h ( f ) = { ( x , ξ ) } .Its real or imaginary part have wave front sets at ( x , ξ ) and ( x , − ξ ). We will use such states inour numerical examples.It is convenient to introduce the notation Σ h ( f ) for the semiclassical frequency set of f . Deﬁnition 2.1.

For each tempered f h localized in phase space, set Σ h ( f ) = { ξ ; ∃ x so that ( x, ξ ) ∈ WF h ( f ) } . In other words, Σ p is the projection of WF h ( f ) to the second variable, i.e.,(8) Σ h ( f ) = π ◦ WF h ( Af ) , PLAMEN STEFANOV where π ( x, ξ ) = ξ . If WF h ( f ) (which is always closed) is bounded and therefore compact, thenΣ h ( f ) is compact. Deﬁnition 2.2.

We say that f h ∈ C ∞ ( R n ) is semiclassically band limited (in B ), if (i) supp f h iscontained in an h -independent compact set, (ii) f is tempered, and (iii) there exists a compact set B ⊂ R n , so that for every open U ⊃ B , we have (9) |F h f ( ξ ) | ≤ C N h N (cid:104) ξ (cid:105) − N for ξ (cid:54)∈ U for every N > . If h is ﬁxed, this estimate trivially holds for every ξ . Its signiﬁcance is in the h dependence. Inparticular, such functions do not have inﬁnite singularities, and are localized in phase space. Inapplications, we take B to be [ − B, B ] n with some B > | ξ | ≤ B or some other set, see,e.g., Figure 10.As an example, for 0 (cid:54) = χ ∈ C ∞ , f := χ ( x ) e i x · ξ /h is semiclassically band limited with B = { ξ } .Indeed, F h f ( ξ ) = ˆ χ (( ξ − ξ ) /h ) decays rapidly for ξ (cid:54) = ξ as h →

0. Clearly, that decay is notuniform as ξ → ξ which explains the appearance of U in the deﬁnition. We could have required(9) to hold in the closure of R n \ B to avoid introducing U ; then in this example, one can take B to be the closure of every neighborhood of ξ . Both deﬁnitions would work ﬁne for the intendedapplications. To generalize this example, we can take a superposition of such functions with ξ varying over a ﬁxed compact set B to get f = χ F − h g with g ∈ L , supp g ⊂ B to be semiclassicallyband limited with frequency set in B .Another example of semiclassically band limited functions can be obtained by taking any f ∈E (cid:48) ( R n ) and convolving if with φ h = h n φ ( · /h ) with supp ˆ φ ∈ C ∞ . Then φ h ∗ f is semiclassicallyband limited with B = supp ˆ φ . Proposition 2.1.

Let

B ⊂ R n be a compact set. For every tempered f h ∈ C ∞ with supportcontained in an h -independent compact set, the following statements are equivalent:(a) f h is semiclassically band limited,(b) f h is localized in phase space,(c) WF h ( f ) is ﬁnite and compact.Proof. Let f h satisfy the conditions of Deﬁnition 2.2. Then WF h ( f ) has no inﬁnite points. Let χ ∈ C ∞ be equal to 1 in a neighborhood of supp f h for all h (cid:28)

1, and χ be supported in thebounded U (cid:99) B and equal to 1 near B . Then (6) is satisﬁed with ψ = χ ( x ) χ ( ξ ). Indeed, by (9),for every s ,(10) (Id − χ ( hD )) f h = O H sh ( h ∞ ) . This implies the same estimate in the Schwartz space S as well. Apply χ ( x ) to get (6). Therefore,(a) ⇒ (b).Next, assume (b). Let the compact set B ⊂ R n be such that for ψ in (6) we have ψ ( x, ξ ) = 0for all x and ξ (cid:54)∈ B . With U as in Deﬁnition 2.2, we can apply a semiclassical Fourier multiplierId − χ ( hD ) with χ as above to get (10). Therefore, (cid:90) (cid:104) ξ (cid:105) s | (1 − χ ( ξ )) F h f h ( ξ ) | d ξ = O ( h ∞ ) . Set g ( ξ ) = (cid:104) ξ (cid:105) s (1 − χ ( ξ )) F h f h ( ξ ). Using Sobolev embedding and the fact that − i ∂ ξ correspondsto x/h via F h , we get (cid:107) g (cid:107) L ∞ = O ( h ∞ ). This proves (9). Therefore, (b) ⇒ (a).Assume (c). Using (5) and a partition of unity, we get (9), i.e., (a) holds. On the other hand,(a) implies (c) directly. (cid:3) EMICLASSICAL SAMPLING 7

Let f h be semiclassically band limited and let B be so that Σ h ( f ) ⊂ {| ξ | < B } . Then f h = χ ( x ) χ ( hD ) f h + O S ( h ∞ ) with χ , as in the proof of Proposition 2.1. We can assume supp χ ⊂{| ξ | < B } . Apply the operator (cid:104) hD (cid:105) s to f h , then on WF h ( f ), we get that (cid:104) hD (cid:105) s χ ( x ) χ ( D ) hasfull symbol (cid:104) ξ (cid:105) s up to the negligible class. Therefore, knowing (cid:107) f h (cid:107) L for a semiclassically bandlimited f h allows us to control the semiclassical Sobolev norms of every order s ≥ (cid:107) f h (cid:107) H sh ≤ (cid:104) B (cid:105) s (cid:107) f h (cid:107) L + O s ( h ∞ ) . h - Ψ DOs.

We deﬁne the symbol class S m,k (Ω), where Ω ⊂ R n is an open set, as the smoothfunctions p ( x, ξ ) on R n , depending also on h , satisfying the symbol estimates(12) | ∂ αx ∂ βξ p ( x, ξ ) | ≤ C K,α,β h k (cid:104) ξ (cid:105) m for x in any compact set K ⊂ Ω. The negligible class S −∞ , ∞ is the intersection of all S m,k . Given p ∈ S m,k (Ω), we write P = P h = p ( x, hD ) with(13) P f ( x ) = (2 πh ) − n (cid:90) (cid:90) e i( x − y ) · ξ/h p ( x, ξ ) f ( y ) d y d ξ, where the integral has to be understood as an oscillatory one. If we stay with functions localized inphase space, the factor (cid:104) ξ (cid:105) m is not needed and we can work with symbols compactly supported in ξ . Then the corresponding classes are denoted by S k (Ω) and k is called an order. One can alwaysdivide by h k ; so understanding zero order operators is enough.2.3. Classical Ψ DOs and semiclassical wave front sets.

We begin with an informal discussionabout the relationship between classical and semiclassical ΨDOs. Let us denote by ξ the dualvariable in the classical ΨDO calculus, and by η the dual variable in the semiclassical case. Formally,by (13) (with ξ replaced by η there), we have η = hξ and after this substitution, we seem to get aclassical ΨDO. The problem is that classical symbols do not need to be smooth or even deﬁned for ξ in a compact set; say for | ξ | ≤ C with some C . Then η = hξ maps this to | η | ≤ Ch . Semiclassicalsymbols however need to be deﬁned and smooth for every η in a neighborhood of the semiclassicalwave front set of the function we want to study. We see that the zero section η = 0 needs to beexcluded. If we try to rectify this problem by multiplying a classical symbols p ( x, ξ ) by χ ( ξ ) with χ ∈ C ∞ , χ = 0 near ξ = 0 and χ = 1 for large ξ , we run into the problem that ∂ η χ ( η/h ) ∼ h − and the symbol estimates (12) are not satisﬁed for χ ( η/h ) p ( x, η/h ) near η = 0.This shows that the zero section needs to be treated separately. Even when that problem doesnot exists, classical ellipticity does not necessarily mean semiclassical one. For example, − ∆ isa classical elliptic ΨDO with symbol | ξ | while | η | /h is the semiclassical symbol of the sameoperator but − ∆ is not semiclassically elliptic anymore (in S , − ). Proposition 2.2.

Let K be a smoothing operator, and let f h ∈ E (cid:48) ( R n ) be tempered. Then WF h ( Kf ) ⊂ R n × { } .Proof. Let φ, ψ ∈ C ∞ such that ψ = 1 near some ξ (cid:54) = 0 but ψ = 0 near 0. Then ψ F h ( φKf )( ξ ) = ψ ( ξ ) (cid:90) e − i x · ξ/h φ ( x ) Kf ( x ) d x = ψ ( ξ ) (cid:90) (cid:90) e − i x · ξ/h φ ( x ) K ( x, y ) f ( y ) d x d y = ψ ( ξ ) (cid:90) (cid:90) e − i x · ξ/h φ ( x ) (cid:2) (1 − h ∆ y ) m K ( x, y ) (cid:3) (1 − h ∆ y ) − m f ( y ) d x d y. PLAMEN STEFANOV

For m (cid:29)

1, (1 − h ∆) − m f ∈ L with a norm O ( h − N ) for some N . Fix one such m . We have L k e − i x · ξ/h = e − i x · ξ/h for every k with L = i h | ξ | − ξ · ∇ x . Integrating by parts with L k , we get ψ F h ( φKf ) = O ( h ∞ ). Therefore, every ( x, ξ ) with ξ (cid:54) = 0 is not in WF h ( Kf ). (cid:3) Theorem 2.1.

Let P be a properly supported Ψ DO of order m . Then for every f h localized inphase space, WF h ( P f ) \ ⊂ WF h ( f ) \ . If P is elliptic, then the inclusion above is an equality.Proof. By the proposition above, the property of the theorem is invariant under adding a smoothingoperator as it should be. Let p ( x, ξ ) be the symbol of P , so that P = p ( x, D ) modulo a smoothingoperator. Then P is a formal h -ΨDO with a symbol p ( x, ξ/h ). The latter is a semiclassical symbolaway from every neighborhood of ξ = 0. Indeed, for x in every compact set K , | ∂ αx ∂ βξ p ( x, ξ/h ) | ≤ C α,β,K h −| β | | ξ/h | m −| β | = C α,β,K h − m | ξ | m −| β | for | ξ/h | > | ∂ αx ∂ βξ p ( x, ξ/h ) | ≤ C α,β,K h − m | ξ | m −| β | for | ξ | ≥ ε > ε > < h ≤ ε . This shows that p ( x, ξ/h ) is a semiclassical symbol of order ( m, − m )restricted to | ξ | > (cid:15) . Note that the smallness requirement on h depends on ε . To complete theproof, we need to resolve this problem.We show next that for every ﬁxed ε >

0, if WF h ( f ) ⊂ B ε (0), then WF h ( P f ) ⊂ B ε (0) as well.This is a weaker version of what we want to prove but it is valid near ξ = 0.Since P is properly supported, with φ ∈ C ∞ as in the previous proof, we may assume that in φP f , the function f is supported in a ﬁxed compact set. Then by a compactness argument, forevery ε (cid:48) > ε , we have F h f ( ξ ) = O ( h ∞ ) for | ξ | ≥ ε (cid:48) . Then F h ( φP f )( η ) = (2 πh ) − n (cid:90) (cid:90) (cid:90) e − i x · η/h +i( x − y ) · ξ φ ( x ) a ( x, ξ ) f ( y ) d y d ξ d x = (2 πh ) − n (cid:90) (cid:90) (cid:90) e − i x · η/h +i( x − y ) · ξ/h φ ( x ) a ( x, ξ/h ) f ( y ) d y d ξ d x = (2 πh ) − n (cid:90) (cid:90) e i x · ( ξ − η ) /h φ ( x ) a ( x, ξ/h ) F h f ( ξ ) d ξ d x. The integration above can be restricted to | ξ | ≤ ε (cid:48) with ε (cid:48) > ε ﬁxed, and this will result in an O ( h ∞ ) error. For the phase function iΦ /h = i x · ( ξ − η ) /h we have Φ x = ξ − η , i.e. the zeros areon the diagonal ξ = η in the ﬁber variable. The following operator preserves exp(iΦ /h ): L = − i ξ − η | ξ − η | · h ∇ x . Since | ξ | ≤ ε (cid:48) , if we restrict η to | η | > ε (cid:48)(cid:48) with ε (cid:48)(cid:48) > ε (cid:48) , and integrate by parts, we get O ( h ∞ | ξ | −∞ )above. This proves that WF h ( P f ) is included in R n × B ε (cid:48)(cid:48) (0). Since ε (cid:48)(cid:48) > ε can be taken as closeto ε as we wish, this proves the claim.Now, using h -ΨDO cutoﬀs, we express f as f = f + f , with WF h ( f ) included in | ξ | ≤ ε and WF h ( f ) included in | ξ | ≥ ε/

2. By the claim above, WF h ( P f ) is included in | ξ | ≤ ε aswell. For WF h ( P f ), by what we proved earlier, WF h ( P f ) ⊂ WF h ( f ). Therefore, WF h ( f ) ⊂ WF h ( f ) ∪ R n × B ε (0) for every ε > h ( f ) ⊂ WF h ( f ) ∪ R n × { } . Inparticular, this proves the ﬁrst part of the theorem.To prove the second part, if P is elliptic, there is a parametrix Q of order − m so that QP f = f + Kf , where K is smoothing. Then we apply the ﬁrst part of the proof. (cid:3) EMICLASSICAL SAMPLING 9

For future reference, note that we also proved that for every ε >

0, every classical ΨDO is alsoa semiclassical one restricted to f with WF h ( f ) not containing ξ with | ξ | ≤ ε .2.4. Classical FIOs and semiclassical wave front sets.Theorem 2.2.

Let A be an FIO in the class I m ( R n , R n , Λ) , where Λ ⊂ T ∗ ( R n × R n ) \ is aLagrangian manifold and m ∈ R . Then for every f h localized in phase space, (14) WF h ( Af ) \ ⊂ C ◦ WF h ( f ) \ , where C = Λ (cid:48) is the canonical relation of A .Proof. The statement holds for h -FIOs, see, e.g, [11]. When A is a classical FIO, A can be writtenlocally as Af ( x ) = (2 π ) − ( n + n +2 N ) / (cid:90) (cid:90) e i φ ( x,y,θ ) a ( x, y, θ ) f ( y ) d y d θ, modulo a smoothing operator, where a is an amplitude of order m + ( n + n − N ) / φ is anon-degenerate phase function, see [12, Chapter 25.1] and θ ∈ R N . As in the proof of Theorem 2.1above, we can express Af as an oscillatory integral with a phase function φ ( x, y, θ ) /h and anamplitude a ( x, y, θ/h ). The latter is a semiclassical amplitude for | θ | > ε > < h < ε . Therest of the proof is as the proof of Theorem 2.1 using the non-degeneracy of the phase. (cid:3) As above, we also proved that for every ε >

0, every classical FIO is also a semiclassical onerestricted to f h with WF h ( f ) not containing ξ with | ξ | ≤ ε . Finally, if F has a left parametrixwhich is also an FIO, then (14) is an equality. This happens, for example, if C is locally a graph of adiﬀeomorphism and A is elliptic. It also happens when C satisﬁes the clean intersection condition,which is the case for the geodesic X-ray transform in dimensions n ≥ h ( f ) is just a set, the theorem (as typical for suchstatements) gives us more than recovery of that set. If we know Af = m up to an O S ( h ∞ ) error,and f and f are two possibly diﬀerent solutions, then A ( f − f ) has an empty semiclassical wavefront set; and if A is, say elliptic and associated to a local diﬀeomorphism, then WF h ( f − f ) canonly be contained in the zero section, i.e., we can recover f microlocally, away from ξ = 0; not justWF h ( f ). 3. Sampling theorems

Sampling on a rectangular grid.

We start with a version of the classical Nyquist–Shannonsampling theorem which allows for oversampling. Below,

B >

Theorem 3.1.

Let f ∈ L ( R n ) . Assume that supp ˆ f ⊂ [ − B, B ] n and let < s ≤ π/B . Let ˆ χ ∈ L ∞ be supported in [ − , n and equal to π n on B − . supp ˆ f . If s ≤ π/B , then f can be reconstructedby its samples f ( sk ) , k ∈ Z n by (15) f ( x ) = (cid:88) k ∈ Z n f ( sk ) χ (cid:16) πs ( x − sk ) (cid:17) , and (16) (cid:107) f (cid:107) = s n (cid:88) k ∈ Z n | f ( sk ) | . Proof.

Since ˆ f is supported in [ − B, B ] n , it is also supported in [ − π/s, π/s ] n . Then we can takethe 2 π/s periodic extension ˆ f ext of ˆ f in all variables and then the (inverse) Fourier series of thatextension to get(17) ˆ f ext ( ξ ) = s n (cid:88) k f ( sk ) e − i sξ · k . Multiply this by π − n ˆ χ ( sξ/π ) to get(18) ˆ f ( ξ ) = ( s/π ) n ˆ χ ( sξ/π ) (cid:88) k f ( sk ) e − i sξ · k . Take the inverse Fourier Transform to get (15). Equality (16) is just Parseval’s equality applied to(17). (cid:3)

Note that when ˆ χ/π is the characteristic function of [ − ,

1] in 1D, we get χ ( x ) = sinc( x ) :=sin x/x . In higher dimensions, we get a product of such functions. When supp ˆ f ⊂ ( − B, B ), onecan choose ˆ χ ∈ C ∞ , which makes the series (15) rapidly convergent. Even a piecewise linear ˆ χ willincrease the convergence rate: if for n = 1 we choose ˆ χ to have a trapezoidal graph by deﬁning itas linear in [ δ,

1] for some δ ∈ (0 ,

1) (and continuous everywhere), then χ ( x ) = cos x − cos( δx )(1 − δ ) x which is O δ ( x − ) instead of just O ( | x | − ) as sinc, with a constant getting large when δ is close to 1.Theorem 3.1 says that the sampling rate should not exceed π/B , which is known as the Nyquistlimit. The theorem can be extended to classes of non L functions and then we need the samplingrate to be strictly below the Nyquist one. If f = sin( x ) for example, B = 1 is the sharpest bandlimit and a sampling rate of π would yield zero values. On the other hand, that function is not in L . Sampling with a smaller step recovers that f uniquely in the corresponding class. Remark 3.1.

It is easy to see that the subspace L B of L consisting of functions f with supp ˆ f ⊂ [ − B, B ] n is a Hilbert space itself. If we consider the samples f ( sk ) as elements of (cid:96) s ( Z n ) withmeasure s n (i.e., sequences a = { a k } k ∈ Z n with (cid:107) a (cid:107) = s n (cid:80) k | a k | ), then the theorem implies thatthe map L B (cid:51) f = ⇒ f ( sk ) ∈ (cid:96) s ( Z n )given by (15) with s = π/B and χ = sinc, is unitary; i.e., not just an isometry but a bijective onebecause one can easily show that for every choice of the sampled values in (cid:96) s ( Z n ) there is unique f ∈ L B with those values. Therefore, the set of the samples is not an overdetermined one for astable recovery. Also, (15) is just an expansion of f in an orthogonal basis.There are several direct generalizations possible. First, one can have diﬀerent band limits foreach component: | ξ j | ≤ B j on supp ˆ f . Then we need s j B j /π ≤ δ <

1. One can have frequencysupport in non-symmetric intervals but since we are interested mostly in real valued functions, wekeep them symmetric.The next observation is that the set containing supp ˆ f does not have to be a box. Of course, ifit is bounded, it is always included in some box but more eﬃcient sampling can be obtained if weknow that the group generated by the shifts ξ j (cid:55)→ ξ j + 2 B j , j = 1 , . . . , n maps supp ˆ f into mutuallydisjoint sets. Then we take the periodic extension of ˆ f w.r.t. that group. The requirement for ˆ χ then is to have disjoint images under those shifts and to be equal to 1 on supp ˆ f ; then the stepfrom (17) to (18) works in the same way. Note that ˆ χ may not be contained in (cid:81) [ − B j , B j ]. EMICLASSICAL SAMPLING 11

Finally, one may apply general non-degenerate linear transformation as in [16]. It is well knownthat the change x = W y with W an invertible matrix triggers the change ξ = ( W ∗ ) − η of the dualvariables. Let f be band limited in B . Assume that W is such that the images of B under thetranslations ξ (cid:55)→ ξ + 2 π ( W ∗ ) − k , k ∈ Z n are mutually disjoint. Then in the y variables, the shiftsof supp ˆ f by the group η (cid:55)→ π Z n are mutually disjoint (i.e., for g ( y ) := f ( W y ), we have that forˆ g ( η )). Then g is band limited in a box with a half-side B = π and then for every s ∈ (0 , f is stably determined by its samples f ( sW k ), k ∈ Z n . Moreover, the reconstruction formula (15)still holds with the requirement that χ = 1 on supp ˆ f and supp χ has disjoint images under thetranslations by that group; there is an additional | det W | factor (see also Theorem 3.2) comingfrom the change of the variables.We present next a semiclassical version of the sampling theorem. It can be considered as anapproximate rescaled version of the classical theorem when the conditions are approximately held,with an error estimate. We present a version with a uniform estimate of the error and in thecorollary below, we formulate a corollary without that uniformity but which requires f h to besemiclassically band limited (only), i.e., it applies to a single f h . Theorem 3.2.

Assume that Ω ⊂ R n , ¯ B ⊂ B ⊂ R n with Ω , B , B open and bounded. Let f h ∈ C ∞ (Ω) satisfy (19) (cid:107) (Id − ψ ( x, hD )) f h (cid:107) H mh = O m ( h ∞ ) (cid:107) f h (cid:107) , ∀ m (cid:29) with some m , ψ ∈ C ∞ ( R n ) so that ψ ( x, ξ ) = 1 for ξ ∈ B and ψ ( x, ξ ) = 0 for ξ (cid:54)∈ B . Let ˆ χ ∈ C ∞ ( R n ) be so that supp ˆ χ ∈ B and ˆ χ ( ξ ) ψ ( x, ξ ) = ψ ( x, ξ ) .Assume that W is an invertible matrix so that the images of B under the translations ξ (cid:55)→ ξ + 2 π ( W ∗ ) − k , k ∈ Z n , are mutually disjoint. Then for every s ∈ (0 , , (20) f h ( x ) = | det W | (cid:88) k ∈ Z n f h ( shW k ) χ (cid:16) πsh ( x − shW k ) (cid:17) + O H s ( h ∞ ) (cid:107) f (cid:107) L , and (21) (cid:107) f h (cid:107) L = | det W | ( sh ) n (cid:88) k ∈ Z n | f h ( shW k ) | + O ( h ∞ ) (cid:107) f (cid:107) L . Proof.

As in the remark above, we can make the change of variables x = W y ; then the dualvariables η is related to ξ by η = W ∗ ξ . In the new variables, the images of B under the translations η → η + 2 πk do not intersect. Therefore, it is enough to consider the case W = Id.Set(22) g h ( x ) := F − h χ F h f h . Since F h g h is the classical Fourier transform of the rescaled h n g h ( hx ), by the previous theorem andthe remark after it, with B = π , h n g h ( hx ) = (cid:88) k h n g h ( shk ) χ ( π ( x − sk ) /s ) . Replace hx by x to get (20) and (21) for g h without the error terms, i.e.,(23) g h ( x ) = (cid:88) k ∈ Z n g h ( shk ) χ (cid:16) πsh ( x − shk ) (cid:17) , (cid:107) g h (cid:107) L = ( sh ) n (cid:88) k ∈ Z n | g h ( shk ) | . To estimate the error, write r n : = f h − g h = F − h (1 − χ ) F h f h = (Id − χ ( hD )) f h = (Id − χ ( hD )) ( ψ ( x, D ) f h + ˜ r h ) , where ˜ r h satisﬁes (19). Therefore, r n satisﬁes the same error estimate. Using Sobolev embedding,we get r n ( shk ) = O ( h ∞ ) (cid:107) f (cid:107) for shk in any ﬁxed neighborhood of ¯Ω. Outside of it, r h = g h isin the Schwartz class by (22) with every seminorm bounded by C N h N (cid:107) f (cid:107) , ∀ N . This implies thatreplacing g n ( shk ) by f n ( shk ) in the interpolation formula in (23) results in an O ( h ∞ ) error in every H sh . We already established such an error estimate if we replace g h by f h on the left. The Parseval’sidentity (21) follows from this as well.This completes the proof. (cid:3) In particular, when we are given a single semiclassically band limited f h , we have the corollarybelow. Note that estimates (4), (5), (6), and (9) are not formulated uniformly in f h ; and on theother hand, estimate (4) is the standard deﬁnition of the semiclassical wave front set. Theorem 3.2has uniform estimates of the error but requires the stronger condition (19). Also, one can have thesemiclassical band limited assumptions as in the corollary below but still the more general samplinggeometry as in Theorem 3.2. Corollary 3.1.

Let f h be semiclassically band limited with Σ h ( f ) ⊂ (cid:81) ( − B j , B j ) . Let ˆ χ j ∈ C ∞ ( R ) be supported in ( − , n and ˆ χ j ( ξ j /B j ) = π for ξ ∈ Σ h ( f ) . If < s j ≤ π/B j , then (24) f h ( x ) = (cid:88) k ∈ Z n f h ( s hk , . . . , s n hk n ) (cid:89) j χ j (cid:18) πs j h ( x − s j hk ) (cid:19) + O S ( h ∞ ) , and (25) (cid:107) f h (cid:107) = ( sh ) n (cid:88) k ∈ Z n | f h ( shk ) | + O ( h ∞ ) . Proof.

Take Ω ⊃ supp f , W = diag( π/B , . . . , π/B n ). Then det W = π n / ( B . . . B n ). If δ < χ there(depending on ξ ∈ R n ) to be the product ˆ χ ( ξ /B ) . . . ˆ χ ( ξ /B n ), where χ (depending on an1D variable) is as in the corollary but related to B j = 1. Set ˆ χ j ( ξ ) = ˆ χ ( ξ/B j ); then χ j ( x ) = B j χ ( B j x ). On the other hand, having χ j , we can compute χ . Then Theorem 3.2 implies thecorollary. (cid:3) In other words, if the sampling rate sh is smaller than πh/B , we have an accurate recovery. Notethat the sampling rate s is rescaled by h compared to Theorem 3.1. We call s a relative samplingrate. Remark 3.2.

Let supp f ∈ Ω (cid:98) R n be B -band limited as in the corollary. Then the smallestsubset in the phase space T ∗ Ω containing WF h ( f ) for all such f ’s is ¯Ω × [ − B, B ] n . Its volume withrespect to the volume form d x d ξ isVol (cid:0) ¯Ω × [ − B, B ] n (cid:1) = (2 B ) n Vol(Ω) . The number of samples we need for f is bounded from below by(26) πh/B ) n = (2 πh ) − n Vol (cid:0) ¯Ω × [ − B, B ] n (cid:1) up to an o (1) relative error as it follows by comparing the volume of Ω with the number of thelattice points we can ﬁt in it. The factor (2 πh ) − n is sometimes multiplied by d ξ to deﬁne a naturalrescaled measure (2 πh ) − n d ξ in the ξ space. Therefore, we see that the asymptotic number ofsamples needed to resample such f ’s has a sharp lower bound equal to the phase volume occupiedby f in the phase space w.r.t. to that semiclassical measure. Note that this is sharp when applied EMICLASSICAL SAMPLING 13 to f ’s with WF h ( f ) contained in the product of Ω an a box in the ξ variable. This is a version ofthe lower bound in [13] in our setting, which we prove in Theorem 5.1.3.2. Non-linear transformations and non-uniform sampling.

Non-uniform sampling in di-mensions n ≥ x = φ ( y ) is adiﬀeomorphism, one may try to obtain sampling theorems for f ( x ) by applying the results aboveto g ( y ) := f ( φ ( y )) if the latter happens to be band-limited. Then the sampling points y k , k ∈ Z n will give us sampling points x k = φ ( y k ). This is what we did above with φ ( y ) = W y linear. Thenwe ﬁrst choose V so that the translates of supp ˆ f (or Σ h ( f )) under the group η → η + V Z n aremutually disjoint and ﬁnd W from the equation V = 2 π ( W ∗ ) − . Let us consider the semiclassicalsampling theorem now. The analog of this for non-linear transformations is the well known propertythat WF h ( f ) transforms as a set of covectors, i.e., ξ = ((d φ ) ∗ ) − η . Now, if we want to prescribed φ ( y ) pointwise (at y = φ − ( x ) for every x ) based on an a priori knowledge of WF h ( f ), we runinto the problem that the equation d φ = Φ with Φ given is not solvable unless dΦ = 0 (i.e., thematrix-valued map Φ is curl-free). On the other hand, we can take a partition of unity and oneach covering open set, we can take a linear transformation giving is better sampling locally. InSection 6, we prove a lower bound for the number of sampling points which is sharp at least in thesituation of Theorem 3.2.3.3. Aliasing.

Aliasing is a well known phenomenon in classical sampling theory when thereNyquist limit condition is not satisﬁed (i.e., we have undersampling). For simplicity, we will recallthe basic notions when the sampling is done on a rectangular grid but the more general periodiccase in Theorem 3.2 can be handled similarly.Assume that we sample each x j -th variable of f ( x ) with a steps sh . Diﬀerent steps s j can behandled easily with a diagonal linear transformation. We do not assume that this steps satisfy theNyquist condition in Corollary 3.2. Assume also that we use formula (24). Following the proof ofTheorem 3.1, (17) still holds but the periodic extension ˆ f ext consists of a superposition of periodicshift of ˆ f h w.r.t. to the group (2 π/s ) Z n :(27) ˆ f ext ( ξ ) = (cid:88) k ∈ Z n ˆ f ( ξ + (2 π/s ) k ) . Now they can overlap and the restriction of ˆ f ext to [ − π/s, π/s ] n is not ˆ f anymore. Then (18) needsto be corrected by replacing the l.h.s. by ˆ χ ( sξ/π ) ˆ f ext ( ξ ). Then the proof of Theorem 3.2 showsthat the r.h.s. of (24) approximates not f h but of(28) Gf h := F − h ˆ χ ( s · /π )( F h f h ) ext . In particular, high (outside [ − π/s, π/s ] n ) frequencies ξ will be shifted by (2 π/s ) k for some k ∈ Z n so they land in that box and their amplitudes will be added to the ones with that actual frequencythere. This is known as “folding” of frequencies. Note also that for f h real valued, WF h ( f ) andΣ h ( f ) are even, i.e., symmetric under the transform ξ (cid:55)→ − ξ .In fact, G = (cid:80) k ∈ Z G k , where each G k is an h-FIO with a canonical relation given by the shifts(29) S k : ( x, ξ ) (cid:55)−→ ( x, ξ + 2 πk/s ) . Indeed, we have G k f ( x ) = (2 πh ) − n (cid:90) e i( x − y ) · ξ/h − π i k · y/sh ˆ χ ( sξ/π ) f ( y ) d y d ξ = e − π i k · x/sh ˆ χ ( shD/π + 2 k ) f. (30) This FIO view of aliasing will be very helpful later.In applications, other reconstructions are used, for example splines. Without a proof, we willmention that the aliasing artifacts are similar then.As an example, we plot the function f h consisting of a sum of the real parts of a sum of twocoherent states, see (7), with h = 0 .

01 and h = 0 .

04, respectively and unit ξ ’s, on the rectangle[ − , . If we take h = 0 . sh with s < π , therefore, it needs more than 2 / ( πh ) ≈

64 sampling points in eachvariable. The lower frequency set requires about 1 / ×

81 pointgrid ﬁst, and then on a 41 ×

41 one next. In the ﬁrst cases, both patterns are oversampled, while inthe second one, only one is, and the other one is undersampled. The reconstructed images, usingthe lanczos3 interpolation in MATLAB (instead of (24)) are shown; the aliasing of the smallerpattern changes the direction and the magnitude of the frequency. In the third and the fourthplots, we show the absolute values of the Fourier transforms of both images: the oversampled oneis plotted in its Nyquist box while the white box is the Nyquist box of the undersampled one nextto it. The absolute value Fourier transform of the undersampled image is plotted last. One cansee that the frequencies outside the Nyquist box in the previous case have shifted by (0 , − B ) and(0 , B ), where 2 B is the side of the wide box (with the so chosen h , we have B = 1). Figure 1.

From left to right: (a) the original f o ; (b) f u with a lower sampling rate: thelower left pattern is undersampled, the larger one is oversampled; (c) | ˆ f o | with the whitebox representing the Nyquist limit in the next plot; (d) | ˆ f u | , the white box is the Nyquistlimit. Sampling classical FIOs

We are ready to formulate the main results of this paper. We label them by (i), (ii), (iii) and(iv) as in the Introduction.4.1. (i) Sampling Af . Let A be a classical FIO as in Theorem 2.2. If we know a priori that f issemiclassically band limited, then so is Af and we can ﬁnd a sharp upper bound on its band limit.This determines a sharp sampling rate for Af . Theorem 4.1.

Let A be a classical FIO as in Theorem 2.2 with canonical relation C . Let f h besemiclassically band limited. Assume that C ◦ WF h ( f ) ⊂ Ω × B with Ω a bounded domain and B bounded. If W is as in Theorem 3.2, then Af can be reconstructed in Ω up to an O ( h ∞ ) error fromits values on the lattice shW k , k ∈ Z , < s < in the sense of (20), (21). If we do not require uniformity of the error in Af , one can take s = 1. EMICLASSICAL SAMPLING 15 (ii) Resolution limit of f given the sampling rate of Af . Assume now that we sample Af at a rate s j h in the x j -th variable with some ﬁxed s j >

0; or on a more general periodic lattice.What resolution limit does this impose on f ?By Corollary 3.2, to avoid aliasing, Σ h ( Af ) should be in the box [ − B j , B j ] with B j > π/s j (we assume that we deal with real valued functions, and therefore always work in frequency setssymmetric about the origin). Then if(31) π ◦ C ◦ WF h ( f ) ⊂ (cid:89) j ( − B j , B j ) , there is no loss (up to O ( h ∞ ) when the data Af has been sampled. Note that the canonical relation C does not need to be an 1-to-1 map. If C is a local diﬀeomorphism and if A is elliptic, then (31)is sharp in the sense that if it is violated, f cannot be recovered up to O ( h ∞ ), i.e., there will bealiasing. This happens for the 2D Radon transform, for example.We write (31) in a diﬀerent way. Then (31) is equivalent to(32) WF h ( f ) ⊂ C − ◦ (cid:16) R m × (cid:89) j ( − B j , B j ) (cid:17) , assuming that A takes values in R m . Similarly to C , the inverse canonical relation C − does notneed to be an 1-to-1 map.Relation (32) gives easily a sharp limit (sharp when A is elliptic, associated to a local diﬀeomor-phism) on Σ p ( f ) guaranteeing no aliasing. It actually says something more: the resolution limit on f is microlocal in nature, i.e., it may depend on the location and on the direction. We illustratethis below with the Radon transform.4.3. (iii) Aliasing artifacts. When (32) is violated and when A is elliptic, associated to a localdiﬀeomorphism, for example, there will be aliasing of Af . To understand how this aﬀects f , if weuse back-projection, i.e., a parametrix A − , we recall that the aliasing can be interpreted as anh-FIO associated to shifts of the dual variable, see (29) and (30). Then the inversion would be A − G k A ; and by Egorov’s theorem, that is an h-FIO with a canonical relation being C − ◦ S k ◦ C acting on ( x, ξ ) ∈ supp ˆ χ ( s · /π +2 k ). We will not formulate a formal theorem of this type; instead wewill illustrate it in the examples in the next sections. The classical aliasing described in Section 3.3creates artifacts at the same location but with shifted frequencies. The artifacts here however couldmove to diﬀerent locations, as it happens for the Radon transform, for example.Note that A − has canonical relation C − but the latter acts on the image of T ∗ Ω \ f ⊂ Ω and we restrict the reconstruction there. Some of the shifted singularities of Af may falloutside that domain of C − and they will create no singularities in the reconstruction. Therefore,we may have shifts in the reconstructed f , full or partial cancellation (and interference patterns asa result), and removal of singularities even if their images are still present in Af .4.4. (iv) Locally averaged sampling. We study now what happens when the measurements Af are locally smoothened either to avoid aliasing or for some practical reasons. One way to modelthis is to assume that we are given samples of φ h ∗ Af , where φ h = h − m φ ( · /h ), where m is thedimension of the data space, and φ is smooth with (cid:82) φ = 1. If ˆ φ ( η ) is approximately supportedin the ball | η | ≤ B (cid:48) , how to sample φ h ∗ Af and what does this tell us for f ? The answer tothe ﬁrst question is given by the sampling theorems — we need to sample at rate smaller than πh/B (cid:48) assuming that Af a priori has even higher frequencies than B (cid:48) . The second question is moreinteresting. As explained in the Introduction, we have that φ h ∗ is a h -ΨDO and if A is associatedto a canonical map, then and by Egorov’s theorem, φ h ∗ Af = AP h f + O ( h ∞ ) f , where P h is a h -ΨDO of order zero with principal symbol ˆ φ ◦ C . If A is well posed, we can recover P h f up tosmall error which is a certain regularized version of f .We can make this more general. Assume that the convolution kernel can depend on the samplingpoint. We model that by assuming that we are sampling Q h Af , where Q h is an h -ΨDO of order zerowith essential support of its symbol contained in ( y, η ) for which | η | ≤ B (cid:48) . This hides the implicitassumption that the convolution kernels cannot change rapidly when we make h smaller and smallerbecause the symbol q ( y, η ) of Q h must satisfy the symbol estimate (12), and in particular, ∂ x p cannot increase with h − . Then it is not hard to see that at every sampling point y = y j , Q h Af ( y j )is the Fourier multiplier with q ( y j , η ) restricted to the point y = y j . Therefore, each measurementis really a convolution. An application of the semiclassical Egorov theorem [11] combined with theremark following Theorem 2.2 yields the following. Proposition 4.1.

Let f be semiclassically band limited. For ε > , let f = f + f with Σ h ( f ) ⊂{| ξ | ≤ ε } and Σ h ( f ) ⊂ {| ξ | ≥ ε } . Let F be a classical FIO associated with a diﬀeomorphism C and let P h be a h - Ψ DO. Then Q h F f = F P h f + O ( h ∞ ) f with P h a h - Ψ DO with principal symbol q = p ◦ C , where q is the principal symbol of Q h . Moreover, the full symbol p of P h is supportedin C − ( supp ( q )) , where q is the full symbol of Q h . Therefore, having locally averaged data Q h Af instead of Af , with A elliptic, allows us to recon-struct the smoothened P h f (plus a function with much lower frequencies) instead of f . If we wantto choose P h ﬁrst, for example to be a speciﬁc convolution, we can ﬁnd the operator Q h whichapplied to the data Af results in the desired regularization. Finally, in applications, Q h A ( y j ) maynot be realized as convolutions with compact supports of their Fourier transforms but if those sup-ports are approximately compact with some error estimates, we can apply the asymptotic isometryproperty (25) to estimate the resulting error.5. Non-uniform sampling: A lower bound of the sampling rate

Assume we want to sample a semiclassically limited f h on a non-uniform grid, see also Section 3.2.One reason to do that would be to reduce the number of the sampling points if the shape of WF h ( f h )allows for this, as in the Radon transform examples (where we sample R f ). We prove a theoremsimilar to one of the results of Landau [13] in the classical case. In case of non-uniform sampling,we establish a lower bound of the sampling rate of f with WF h ( f ) contained in a ﬁxed compactset, equal to the phase volume of the interior of the latter. In most applications, ∂ K would havemeasure zero, and then Vol( K int ) = Vol( K ). The theorem below is diﬀerent that the correspondingresults in [13] in the following way. Aside from being semiclassical, our bound is in terms of thevolume of WF h ( f ), i.e., it is microlocalized, rather than being (2 π ) − n Vol(Ω) Vol( B ) for the numberof points in every Ω when supp ˆ f ⊂ B . In other words, we express the bound in terms of the volumeof WF h ( f h ) instead of the volume of the minimal bounding product Ω × B . Theorem 5.1.

Let { x j ( h ) } N ( h ) j =0 be a set of points in R n . Let K ⊂ T ∗ R n be a compact set. If (33) (cid:107) f h (cid:107) ≤ C N ( h ) (cid:88) j =1 | f h ( x j ( h )) | + O ( h ∞ ) for every semiclassically band limited f h with WF h ( f ) ⊂ K , then (34) N ( h ) ≥ (2 πh ) − n Vol( K int )(1 − o ( h )) , where Vol( K int ) = (cid:82) K int d x d ξ is the measure of the interior K int of K ⊂ T ∗ R n . EMICLASSICAL SAMPLING 17

Proof.

By the properties of the Lebesgue measure, given δ >

0, we can ﬁnd a closed (and necessarilycompact in this case) set F δ ⊂ K int so that Vol( K int \ F δ ) < δ . Let 0 ≤ p ( x, ξ ) ≤ K int and equal to 1 on F δ . Let P h = p w ( x, hD ) be the Weylquantization of p . Then P h is self-adjoint and compact.Let E h be the eigenspace of P h spanned by the eigenfunctions corresponding to the eigenvaluesin [1 / , P h cannot exceed 1 + O ( h ). For simplicity (not an essential assumption for the proof), assumethat 1 / p . Every f h ∈ E h satisﬁes WF h ( f h ) ⊂ K . Indeed, for every uniteigenfunction φ h of P h with an eigenvalue λ h ∈ [1 / , q ∈ S , with supp q ∩ K = ∅ we have q ( x, hD ) φ h = (1 /λ h ) q ( x, hD ) p ( x, hD ) φ h = O ( h ∞ ), and this yields the same conclusion forevery tempered f h ∈ E h .By [6], we have the following Weyl asymptotic(35) dim E h = (2 πh ) − n Vol (1 / ≤ p ≤

2) (1 + O ( h )) . We will show that N ( h ) ≥ dim E h for 0 < h (cid:28)

1. If this is not true, we would have N ( h ) < dim E h for a sequence h = h j →

0. Then for every h ∈ { h j } , we can ﬁnd a unit f h ∈ E h so that allthe samples of f h vanish for such h . As we showed above, WF h ( f h ) ⊂ K . This contradicts (33)however.Therefore, by (35), Vol( F δ ) < Vol (1 / ≤ p ≤ ≤ lim inf h → (2 πh ) n N ( h ) . Take the limit δ → (cid:3) Remark 5.1.

The proof holds if we replace the error term in (33) by o ( h ) (cid:107) f (cid:107) . Remark 5.2.

Existence and a characterization of optimal sampling sets where (34) would be anequality is a harder problem which we do not study here. Estimate (34) is sharp in case of uniformsampling described in Corollary 3.1 at least, see Remark 3.2 which generalizes easily to diﬀerentband limits B j for each component of x as in Corollary 3.1. It is straightforward to show that (34)is also sharp in the case of more general uniform sampling described in Theorem 3.2. Remark 5.3.

The statement of the theorem is preserved under (non-linear) diﬀeomorphic trans-formations because WF h and its phase volume are invariant. If for some K , we can choose anon-linear transformation which would ﬁt the problem in the situation handled by Theorem 3.2,we can construct a sampling set with the optimal number of sampling points by transforming theperiodic lattice by that transformation. Doing this piece-wise, as suggested in Section 3.2, wouldprovide a smaller sampling set. We show how this can be done in our Radon transform examples. Corollary 5.1.

Let A be a classical FIO of order m associated with a diﬀeomorphic canonicalrelation C . Then the minimal asymptotic number of points (up to an o ( h )) relative error) tosample Af h guaranteed by Theorem 5.1 does not exceed the number of points needed to sample f h ;and if A is elliptic, it is the same. The proof follows from the fact that C is symplectic and in particular it preserves the phasevolume; and from Theorem 2.2.In the examples we consider, A happened to be an 1-to-2 diﬀeomorphism, and each branch iselliptic. Then we can apply the corollary to each “half” of C . Then the number of points to sample Af stably would be twice that for f ; but the number of points to recover f stably up to a functionwith WF h in a small neighborhood of the zero section is half of that.In the remainder of the paper, we present a few applications. The X-ray/Radon transform in the plane in the parallel geometry

We present the ﬁrst example: the X-ray/Radon transform R in the plane in the so-called parallelgeometry parameterization. The analysis of this transform in this and in the fan-beam representa-tion can and has been done with traditional tools [2, 17, 15], also [14, Ch. III], when the weight isconstant since the symmetry allows us to relate that transform to the Fourier transform of f by theFourier Slice Theorem. The sampling of R f however requires estimates of the Fourier transform of R f , which is done in those papers in a non-rigorous way by Bessel functions expansions and theirasymptotics.We go a bit deeper than that even when the weight is constant and we treat variable weightsas well. The main purpose of this section is to demonstrate the general theory on a well studiedtransform, where one can write explicit formulas; and sampling analysis has been done (for constantweights), so we can compare the results.The numerical simulations in this and in the next section have been done in Matlab. Thephantoms f are deﬁned by formulas and sampled ﬁrst on a very ﬁne grid. Then we compute theirRadon transforms R f numerically. To simulate coarser sampling of R f , we sample the so computed R f . To simulate inversion with coarsely sampled R f , we upsample the downsampled data to theoriginal grid to simulate a function of continuous variables. Instead of using the Whittaker-Shannoninterpolation formula (24), we use the lanczos3 interpolation which is a truncated version of thelatter. Then we perform the inversion on that ﬁner grid. Note that our goal is not to reducethe computational grid at this point, it rather is to show the amount of data and the artifactscontained in data sampled in a certain way. We compute the Fourier transforms of f and R f using the discrete Fourier transform command in Matlab. Since we work with f vanishing near theboundary of the square [ − , , and R f is vanishing in the p variable near p = ± f ’s)and is periodic in its angular variables, the discrete Fourier transform, which in fact is a transformon a torus, gives no artifacts.6.1. R κ as an FIO. Let R κ be the weighted Radon transform in the plane(36) R κ f ( ω, p ) = (cid:90) x · ω = p κ ( x, ω ) f ( x ) d (cid:96), where κ is a smooth weight function, p ∈ R , ω ∈ S , and d (cid:96) is the Euclidean line measure on eachline in the integral above. If κ = 1, we write R = R κ . Each line is represented twice: as ( ω, p ) andas ( − ω, − p ) but it is represented only once as a directed one. In general, the weight does not needto be even in the ω variable, so it is natural to think of the lines as directed ones. Let ω ⊥ be ω rotated by π/

2. We parameterize ω by its polar angle ϕ and write(37) ω ( φ ) = (cos ϕ, sin ϕ ) . The Schwartz kernel of R κ is κδ ( x · ω ( φ ) − p ). Then it is straightforward to show that R κ is anFIO of order − / C = (cid:26)(cid:16) ϕ, x · ω ( ϕ ) (cid:124) (cid:123)(cid:122) (cid:125) p , − λ ( x · ω ⊥ ( ϕ )) (cid:124) (cid:123)(cid:122) (cid:125) ˆ ϕ , λ (cid:124)(cid:123)(cid:122)(cid:125) ˆ p , x, λω ( ϕ ) (cid:124) (cid:123)(cid:122) (cid:125) ˆ x = ξ (cid:17) , λ (cid:54) = 0 (cid:27) , where we used the non-conventional notation of denoting the dual variable of p by ˆ p , etc. Set ξ = λω . Given ξ (cid:54) = 0, there are two solutions for λ , ω : either λ + = | ξ | , ω = ξ/ | ξ | or λ − = −| ξ | , EMICLASSICAL SAMPLING 19 ω = − ξ/ | ξ | . Therefore, C = C + ∪ C − , where(39) C ± : ( x, ξ ) (cid:55)−→ (cid:18) arg( ± ξ ) (cid:124) (cid:123)(cid:122) (cid:125) ϕ , ± x · ξ/ | ξ | (cid:124) (cid:123)(cid:122) (cid:125) p , − x · ξ ⊥ (cid:124) (cid:123)(cid:122) (cid:125) ˆ ϕ , ±| ξ | (cid:124)(cid:123)(cid:122)(cid:125) ˆ p (cid:19) . The Schwartz kernel of R κ is a delta function on Z := { x · ω ( ϕ ) = p } which is invariant under thesymmetry τ ( ϕ, p ) = ( ϕ + π, − p ) (the latter modulo 2 π ). Then C is invariant under τ lifted to thecotangent bundle:(40) ( ϕ, p, ˆ ϕ, ˆ p ) (cid:55)−→ ( ϕ + π, − p, ˆ ϕ, − ˆ p ) , and in fact this is an isomorphism between C + and C − .Take C + ﬁrst. We see that a singularity ( x, ξ ) of f can create a singularity of R κ f ( p, ω ) at p = x · ξ/ | ξ | and ω = ξ/ | ξ | , i.e., at ϕ = arg( ξ ) in the codirections | ξ | ( − ( x · ω ⊥ ) ω ⊥ , p, ω ) determines the oriented line through x normal to ξ and the normal is consistentwith the orientation. Taking C − next, we see that ( x, ξ ) may aﬀect the wave front set of R κ at p = − x · ξ/ | ξ | and ω = − ξ/ | ξ | , and at the corresponding codirections. That is the same line asbefore but with the opposite orientation and the weight κ on it might be diﬀerent.So we see that R κ is an FIO associated with C − ∪ C + and each one of them is a local diﬀeomor-phism. Indeed, ( x, ξ ) = C − ± ( ϕ, p, ˆ ϕ, ˆ p ) is given by(41) x = pω ( ϕ ) − ( ˆ ϕ/ ˆ p ) ω ⊥ ( ϕ ) , ξ = ˆ pω ( ϕ ) . It is well deﬁned for ˆ p (cid:54) = 0 but if we want x in the image to be in | x | < R , we need to require p + ( ˆ ϕ/ ˆ p ) < R , see also (44) and Figures 2 and 3.6.2. (i) Sampling R κ f . Sampling R κ f on periodic grids. Assume that f = f h satisﬁes(42) WF h ( f ) ⊂ { ( x, ξ ); | x | ≤ R, | ξ | ≤ B } with some R > B >

0, i.e., up to O ( h ∞ ), f is essentially supported in B (0 , R ) and F h f isessentially supported in | ξ | ≤ B . The number N F of points to sample f stably then is given(43) N f ∼ (2 πh ) − πR × πB = h − R B / , where ∼ means equality up to an (1 + o ( h )) relative error, see Theorem 5.1 and Remark 5.2.A sharp upper interval containing ˆ p is [ −| ξ | , | ξ | ] and a sharp interval for ˆ ϕ is [ − R | ξ | , R | ξ | ], ˆ ϕ ˆ p RBB Figure 2.

The frequency set of R κ f . therefore, the smallest rectangle containing F h R κ f for all such possible f ’s is( ˆ ϕ, ˆ p ) ∈ RB [ − , × B [ − , . Note that this rectangle does not describe Σ h ( R κ f )when Σ h ( f ) is included in | ξ | ≤ B . The latter is thecone | ˆ ϕ | ≤ R | ˆ p | in that rectangle, i.e.,(44) { ( ˆ ϕ, ˆ p ); | ˆ ϕ | ≤ R | ˆ p | , | ˆ p | ≤ B } , see Figure 2. Suppose we sample on a rectangu-lar grid with sampling rates s ϕ h in the ϕ variablein [0 , π ]; and with a sampling rate s p h in the p variable in [ − R, R ]. Then the Nyquist condition is equivalent to(45) s ϕ ≤ πRB , s p ≤ πB . This means taking 2

RB/h × RB/ ( πh ) = h − R B /π samples to recover R κ approximately,see also [14]. Note that this is 8 /π times the estimate in (43) for 2 N f , which by Corollary 5.1is the theoretical asymptotic minimum since the canonical relation C is 1-to-2. For the recoveryof WF h ( f ) \ ∼ h − R B /π . This is again 8 /π times theasymptotic minimum in Corollary 5.1.On the other hand, the conic shape of the range of WF h ( f ) in (44) suggests a more eﬃcientsampling. One can tile the plane with that set by using the elementary translations by ( RB, B )and by (0 , B ). If 2 π ( W ∗ ) − has those columns, then(46) W = πRB (cid:18) − R (cid:19) . Then one should sample on a grid shW Z n with s <

1. Since det W = 2 π /RB , and the area ofthe region we sample is 2 π × R = 4 πR , we see that we need ∼ h − R B /π points which is 4 /π larger than 2 N f ; and for proper recovery of WF h ( f ) we need a half of that, i.e., ∼ /π times N f .The coeﬃcient 4 /π is close to 1 but not equal to 1 as it is clear from the next section and Figure 3.6.2.2. Microlocalization and non-uniform sampling.

The sampling requirements above were basedon the following. To determine the sampling rate for R κ f , we ﬁnd Σ h ( R χ f ) as the projectionof WF h ( R κ f ) ⊂ C ◦ WF h ( f ) to its phase variable. Since we are interested in sampling f withWF h ( f ) ⊂ {| x | ≤ R, | ξ | ≤ B } and to ﬁnd a sharp sampling rate, we projected C ◦{| x | ≤ R, | ξ | ≤ B } onto its phase variable to get the smallest closed set (44) containing Σ h ( R κ f ) for every such f .This answers the question if we are interested in sampling on a periodic grid for all such possible f ’s. The analysis allows us to localize or microlocalize some of those arguments. The dependence of the sampling requirements on supp f . The sampling frequency in theangular variable on a rectangular grid should be smaller than πh/ ( RB ), and the dependence on R may look strange since the Radon transform has a certain translation invariance. The reasonfor it is that we assume that we know that f is supported in a disk and we reconstruct it thereonly. Numerical experiments reveal that when the sampling rate in the angular variable decreases,artifacts do appear and they move closer and closer to the original when the rate decreases. Non-uniform sampling.

We are interested ﬁrst in the optimal sampling rate of R κ f locally,near some ( ϕ, p ). The latter is determined by the frequency set C ◦ {| x | ≤ R, | ξ | ≤ B } projectedto its phase variables ( ˆ ϕ, ˆ p ) with ( ϕ, p ) ﬁxed. It is straightforward to see that on the range of C ,(47) | ˆ ϕ | = ±| ˆ p | (cid:112) | x | − p , | ˆ p | ≤ B, where | p | ≤ | x | . Since x ranges in | p | ≤ | x | ≤ R , we get(48) | ˆ ϕ | ≤ | ˆ p | (cid:112) R − p , | ˆ p | ≤ B. We plot those double triangles in Figure 2 at a few points in the rectangle [0 , π ] × [ − ,

1] (where ϕ = 0 and ϕ = 2 π should be identiﬁed) in the ( ϕ, p ) plane. The phantom f consists of six smallGaussians in the unit disk. We also superimpose a density plot of R f for a certain f consisting ofsix randomly places small Gaussians in the unit disk (with R = 1). One can see from this ﬁgurethat the set of the conormals to the curves in the plot of R f fall inside those triangles but thesemiclassical wave front set also captures the range of the magnitudes of the frequencies. When thestripes are horizontal, the magnitude drops a bit and the stripes a bit more blurred. Since R = 1in this example, the Nyquist sampling limits of the sampling rates s ϕ h and s p h given in (45) areequal, i.e., the optimal grid would be a square one.This analysis suggests the following non-uniform sampling strategy. For a ﬁxed k , we divide theinterval [ − , (cid:51) p into 2 k sub-intervals I ± j = ± [( j − /k, j/k ], j = 1 , , . . . , k , each one of length EMICLASSICAL SAMPLING 21

Figure 3. R f ( ϕ, p ) for f consisting of six randomly placed small Gaussians in the unitball. The double triangles, shown in the upper half only, represent WF h ( R f ) localized attheir vertices ( ϕ, p ). /k . For each k , we take | ˆ ϕ | ≤ | ˆ p | (cid:112) R − j /k , | ˆ p | ≤ B as a sharp cone where ( ˆ ϕ, ˆ p ) lies, see (47).Then in [0 , π ] × I j , we sample on the grid δhW j Z n , where δ < W j is as in (46) with R = 1 /j . Then we can get closer and closer to the sharp number of the sampling point for R χ stated in Corollary 5.1 and Theorem 5.1, which should be ∼ N f for a stable recovery of R κ f and N f for a stable recovery of f itself, see (43).6.3. (ii) Resolution limit on f posed by the sampling rate of R κ f . Let s ϕ and s p bethe relative sampling rates for ϕ and p , respectively. Lack of aliasing is equivalent to ˆ ϕ < π/s ϕ ,ˆ p < π/s p , see (31), (32). By (39), this is equivalent to(49) | x · ξ ⊥ | ≤ π/s φ , | ξ | ≤ π/s p . If the sampling rates satisfy the sharp Nyquist condition (45), the latter condition above implies theformer. Actually, the ﬁrst condition in (49) is most critical for ( x, ξ ) with x close to the boundary | x | = R and ξ (cid:107) x , which are represented by radial lines close to | x | = R . In Figure 4b, which isundersampled in ϕ , we see evidence of that; another evidence is Figure 7c, where R f is blurred in ϕ . When the sampling rates do not necessarily satisfy the Nyquist condition (45) in | x | < R , weillustrate the signiﬁcance of (49) in Figure 4. The relative sampling rate s p imposes a universallimit on the resolution, independent on x and the direction of ξ . On the other hand, the secondinequality imposes a locally non-uniform and a non-isotropic resolution limit. Assuming s p (cid:28) | x · ξ/ | ξ || (cid:28)

1, so | x · ξ ⊥ / | ξ ⊥ || is close to itsmaximum for that x , which restricts | ξ | by (49). Resolution of meridional (circular) lines is thegreatest; there | x · ξ ⊥ / | ξ ⊥ || (cid:28)

1, so for a given x , | ξ | could be large by (49). There are also aliasingartifacts explained below.6.4. (iii) Aliasing. We study now what happens if R κ f is undersampled. It might be undersam-pled in the ϕ or the p variable or in both.6.4.1. Angularly undersampled R κ f . Assume (42) as above and assume that s ϕ > πh/ ( RB ), i.e.,the ﬁrst Nyquist condition in (45) is violated. Then the aliasing of R κ f can be described as a sum (a) f on [ − , (b) Reconstructed f undersampled in p . (c) Reconstructed f ,angular step s ϕ = 4 ◦ Figure 4. (a) f on [ − , , (b) f reconstructed with s p (cid:28) s ϕ = 4degrees; (c) f reconstructed with s ϕ (cid:28) p variable. of h-FIOs, see (28), (29), with canonical relations(50) S k : ˆ ϕ (cid:55)−→ ˆ ϕ + 2 πk/s ϕ , k = 0 , ± , . . . . In typical cases with not very severe undersampling, k is restricted to k = ± k = 0 whichis the original image but blurred by ˆ χ in (28). Then a direct computation shows that the aliasingartifacts are described by an h-FIO with canonical relations(51) ( x, ξ ) (cid:55)−→ C − ± ◦ S k ◦ C ± ( x, ξ ) = (cid:16) x ∓ πks ϕ ξ ⊥ | ξ | , ξ (cid:17) , when ˆ ϕ + 2 kπ/s ϕ ∈ [ − π/s ϕ , π/s ϕ ], i.e., when(52) − x · ξ ⊥ + 2 kπ/s ϕ ∈ [ − π/s ϕ , π/s ϕ ] . Those are shifts of ( x, ξ ) in the x variable, in the direction of ξ ⊥ , at distance 2 πk/ ( s ϕ | ξ | ). By (52), k depends on ( x, ξ ) and in particular for | x · ξ ⊥ | (cid:28)

1, we have k = 0 only of Σ h ( f ) is ﬁnite and thenthere is no aliasing. In general, the reconstructed f will have the singularities of f shifted by (51)for various k = 0 , ± , . . . , as long as they satisfy (52). The value k = 0 corresponds to WF h ( f ) (notshifted). Note that only ﬁnitely many of them would stay in the ball | x | < R . It is even possibleall of them to be outside that ball and R κ f to be undersampled and therefore aliased. Then thereconstructed image in | x | < R will not have a singularity corresponding to that one.We illustrate this with a numerical example in Figure 5d. We choose f to be a coherent state as inFigure 1. In Figure 5, we plot f , a crop of its Radon transform R f , oversampled, on [ π/ , π ] × [0 , / F R f . Since R f is even and real valued, F R f has two symmetries. A reconstruction of f withoversampled data, not shown, looks almost identical to f . Next we undersample R f using a s ϕ = 3degree step in ϕ . In Figure 5d, we show the reconstructed f which looks like f shifted alongthe direction of the pattern. The undersampled R f used to get reconstruction is shown in (b).Compared to (e), the pattern changed its orientation (and the magnitude of its frequency), similarlyto the classical aliasing eﬀect illustrated in Figure 1. The eﬀect on the reconstructed f , see (d),however is very diﬀerent and in an agreement with (51). In (f), we plot F R f where R f now is thealiased version of the Radon transform of f . We see that the bright spots where R f is essentiallysupported have shifted compared to (c): the ones to the left have shifted to the right and viceversa, as explained earlier.In this case, only the values k = ± f does not satisfy (54) with k = 0, i.e., the original singularity is not within the resolution range. EMICLASSICAL SAMPLING 23 (a) f (b) R f on [ π/ , π ] × [0 , /

2] oversampled (c) FR f oversampled (d) f recovered (e) R f on [ π/ , π ] × [0 , /

2] undersampled (f ) FR f undersampled Figure 5.

Top: f , R f and the Fourier transform FR f of R f when R f is angularlyundersampled. Bottom: f reconstructed with a back-projection, R f and FR f when R f isangularly undersampled with a 3 degrees step: the pattern has shifted. A similar example, not shown, with the pattern moved close to the center is reconstructed well(see also Figure 4c) is reconstructed well without an artifact even though the artifact computed by(51) would still ﬁt in the square shown. The reason for it is condition (52) which for small | x | andthe other parameters unchanged is valid for k = 0 only.6.4.2. R f undersampled in the p variable. Assume that s p is not small enough to satisfy the sam-pling conditions but s ϕ is. The aliasing of R κ f then can be computed, using (41) and (29), tobe(53) ( x, ξ ) (cid:55)−→ (cid:16) x ± x · ξ ⊥ (cid:18) | ξ | + 2 πk/s p − | ξ | (cid:19) ξ ⊥ | ξ | , ξ + 2 πks p ξ | ξ | (cid:17) , when ˆ p + (2 πk/s p ) ∈ [ − π/s p , π/s p ], i.e., when(54) | ξ | + 2 kπ/s p ∈ [ − π/s p , π/s p ] . Those are still shifts along ξ ⊥ but they are not equally spaced (with k ). Also, the magnitude ofthe frequency changes but the direction does not. In case of mild aliasing, we have k = ± f in | x | < R ) and they generate shifts of diﬀerent sizes. In general, there areinﬁnitely many artifacts outside the ball | x | < R regardless of the sampling rate and the band limitof f (our criterion whether Rf is aliased or not depends on R ).In Figure 6, we present an example where one of the patterns disappears from the computationaldomain [ − , due to undersampling in the p variable. The other one remains.6.5. (iv) Locally averaged measurements. Assume now that we measure Q h R κ f with Q h an h -ΨDO of order (0 ,

0) (or simply a convolution) limiting the frequency set Σ h ( R κ f ) of the data. If (a) f (b) f reconstructed (c) R f on [0 , π ] × [ − ,

1] over-sampled (d) R f on [0 , π ] × [ − ,

1] un-dersampled

Figure 6. (a) f and (b) reconstructed f with R f undersampled in p . (c) R f and (d) R f aliased. q ( ϕ, p, ˆ p, ˆ ϕ ) is the principal symbol of Q h , then by Proposition 4.1, a backprojection reconstructs P h f where P h has a principal symbol(55) p ( x, ξ ) = 12 q ◦ C + + 12 q ◦ C − . If, in particular, Q h is a convolution with a kernel of the type q = ψ ( a ˆ ϕ + b ˆ p ) with ϕ ∈ C ∞ anddecreasing, then(56) p ( x, ξ ) = ψ (cid:0) a | ξ | + b | x · ξ ⊥ | (cid:1) . This symbol takes its smallest values for x near the boundary and ξ ⊥ x , and those are the covectorswith the lowest resolution as well. The eﬀect of Q h is then non-uniform, it blurs f the most atthose covectors. If we want a uniform blur, then we choose p = ψ ( a | ξ | ) and compute q = ψ ( a ˆ p ).This is not surprising in view of the classical intertwining property d p R κ = R κ ∆ when κ = 1 (truemodulo lower order terms for general κ ). In other words, only convolving w.r.t. the p variable isneeded. This means integrating over “blurred lines”. If ψ limits WF h ( Q h R κ f ) to, say, | ˆ p | ≤ B (cid:48) ,then this limits ˆ ϕ as well by the ﬁrst inequality on (44), to | ϕ | ≤ | ˆ p | . Therefore, ( ˆ ϕ, ˆ p ) are restrictedto a smaller cone of the type (44) which imposes sampling requirements as above. Then we canrecover stably P h f .In Figure 7, we show a reconstructed image with data averaged in the p variable (then b = 0 in(56)) and the angular variables (then a = 0 in (56)) . Note that in Figure 7c the image is blurredangularly but in contrast to Figure 4b, there are no aliasing artifacts. (a) f on [ − , (b) reconstruction with R f averaged in p (c) reconstruction with R f averaged in ϕ Figure 7. f and a reconstructed f on [ − , with data averaged in the p and the ϕ variable. EMICLASSICAL SAMPLING 25 The X-ray/Radon transform R in the plane in fan-beam coordinates R κ as an FIO. We parametrize R κ now by the so-called fan-beam coordinates. Each line isrepresented by an initial point Rω ( α ) on the boundary of B (0 , R ), where f is supported, and byan initial direction making angle β with the radial line through the same point, see Figure 8. It isstraightforward to see that this direction is given by ω ( α + β ). Then the lines through B (0 , R ) aregiven by(57) x · ω ( α + β − π/

2) = R sin β, α ∈ [0 , π ) , β ∈ [ − π/ , π/ . This allows us to conclude that in this representation R κ is an FIO again (being FIO is invariant αβ Rω ( α + β ) Rω ( α ) Figure 8.

The fan-beam coordinates. under diﬀeomorphic changes) and to compute its canonical relation using the rules of transformingcovectors. We will do it directly however. We regard α as belonging to R modulo 2 π Z . Therelationship between this and the parallel geometry parameterization ( ϕ, p ) is given by(58) ϕ = α + β − π/ , p = R sin β. Each undirected line is given by a pair ( ϕ, p ) and ( ϕ + π, − p ); which in the parallel beam coordinatescorresponds to ( α, β ) and ( α + 2 β − π, − β ). The Schwartz kernel of R κ in this parameterizationis a smooth factor times a delta function on the manifold (57). As above, when κ = 1, we write R = R κ . Then(59) R ( α, β ) = R ( α + 2 β − π, − β ) . In general, that change of ( α, β ) is a symmetry of (57). The canonical relation is given by(60) C = (cid:26)(cid:16) α, β, λ ( − x · ω ( α + β )) (cid:124) (cid:123)(cid:122) (cid:125) ˆ α , λR cos β − λx · ω ( α + β ) (cid:124) (cid:123)(cid:122) (cid:125) ˆ β , x, λω ( α + β − π/ (cid:124) (cid:123)(cid:122) (cid:125) ˆ x = ξ (cid:17) , λ (cid:54) = 0 (cid:27) . Therefore, with ω = ω ( α + β ), we have ξ = − λω ⊥ , ˆ α = − λω · x = x · ξ ⊥ , ˆ β = λR cos β + ˆ α . If λ >

0, then λ = | ξ | and then λR cos β = | ξ | (cid:112) R − ( x · ξ/ | ξ | ) . Also, by (57), x · ξ = R | ξ | sin β . Forthe dual variables, we have ˆ β = | ξ | (cid:112) R − ( x · ξ/ | ξ | ) + ˆ α .If λ <

0, we get another solution by formally replacing | ξ | by −| ξ | . Therefore, the canonicalrelations C ± are given by(61) β = ± sin − x · ξR | ξ | , α = arg ξ − β ± π α = x · ξ ⊥ , ˆ β = ±| ξ | (cid:112) R − ( x · ξ/ | ξ | ) + ˆ α. ξx (a) f on [ − , (b) R f (c) (cid:100) R f Figure 9.

The canonical relation of R in fan-beam coordinates. (a) a coherent state f .(b): the image of ( x, ξ ) ∈ WF h ( f ) under C + and C − . Then C ± are isomorphic under the symmetry mentioned above lifted to the tangent bundle(62) ( α, β, ˆ α, ˆ β ) (cid:55)−→ ( α + 2 β − π, − β, ˆ α, α − ˆ β ) . We illustrate the canonical relations on Figure 9. On Figure 10, π ◦ C ± ( x, ξ ) are marked by crosses.The inverses C − ± are given by(63) x = R sin β ω ( α + β − π/ − ˆ α ˆ β − ˆ α R cos β ω ( α + β ) , ξ = ˆ β − ˆ αR cos β ω ( α + β − π/ . In particular, we recover the well known fact that C is 1-to-2, as in the previous case.7.2. (i) Sampling. We assume (42) again.7.2.1.

Sampling on a rectangular lattice.

The smallest rectangle including the range of ˆ α and ˆ β if | x | ≤ R , | ξ | ≤ B is(64) RB [ − , × RB [ − , . Therefore, for the relative sampling rates s α and s β in the α and in the β variables, respectively,in [0 , π ) × [ − π/ , π/ s α < πRB , s β < π RB , compare with (45). This means taking more than 2

RB/h × RB/h samples. This is π times morethan in the parallel geometry case. For a recovery of WF h ( f ) \

0, we need a half of that.To analyze the actual range, it is enough to analyze the range of ( ˆ α, ˆ β (cid:48) ) = ( ˆ α, ˆ β − ˆ α ), i.e., thel.h.s. of (31). Notice ﬁrst that on C , one can parameterize the line corresponding to ( α, β ) as(66) x = Rω ( α ) − tω ( α + β ) , ≤ t ≤ R cos β. Then ˆ α = −| ξ | ( R cos β − t ) , ˆ β (cid:48) = ±| ξ | R cos β, ≤ t ≤ R cos β. Therefore, for a ﬁxed ( α, β ), the range of ( ˆ α, ˆ β (cid:48) ) is independent of α and when ξ varies over | ξ | ≤ B ,that range ﬁlls the double triangle | ˆ α | ≤ | ˆ β (cid:48) | ≤ RB cos β . Over the whole range of β , see (57), thisﬁlls | ˆ α | ≤ | ˆ β (cid:48) | ≤ RB . Then we can get the range of ( ˆ α, ˆ β ), we take the inverse linear transformation.In Figure 10, we show the range for | ξ | ≤ B , and a numerically computed |F h f | of f representinga sum of several well concentrated randomly placed Gaussians. It has the symmetry (62). Thisresult can be obtained from the parallel geometry analysis, of course, see (44) and Figure 2, by thechange of variables on the cotangent bundle induced by (57). EMICLASSICAL SAMPLING 27 ˆ α ˆ β RB RB Figure 10.

Left: the range of F h R for | ξ | ≤ B in fan-beam coordinates. Center: f as asum of randomly placed small Gaussians with mean value zero. Right: |F h f | in fan-beamcoordinates on a log scale. As in the previous case, we can tile the plane with the regions in Figure 10 on the left by takingtranslations by (

RB,

0) and (0 , RB ). If 2 π ( W ∗ ) − has those columns, then(67) W = πRB (cid:18) (cid:19) . Then by Theorem 3.2, the most eﬃcient sampling would be on a grid shW Z n with s <

1, see also[14, 15]. This is ∼ N f , see (43) and is twice as sparse in each dimension compared to the previouscriterion. For a recovery of WF h ( f ) \

0, we need a half of that, i.e., ∼ N f , which is twice as muchas the sharp bound in Corollary 5.1. Note that this however requires a reconstruction formula ofthe type (15) with χ there having a Fourier transform supported in the gray region in Figure 10on the left, and equal to one on WF h ( f ) instead of the formula based on the sinc functions. Thereason that the number of points is not ∼ N f is clear from the analysis below and from Figure 11as well. Figure 11. R f ( α, β ) for f consisting of four randomly placed small Gaussians in the unitball. The double triangles, shown in the upper half only, represent WF h ( R f ) localized attheir vertices ( α, β ). In Figure 11, we plot R f ( α, β ) on [0 , π ] × [ − π/ , π/

2] for f consisting of four small Gaussians.We also plot the range of WF h ( f ) for all possible f satisfying (42) at each ( α, β ), i.e, we plot (66).The double triangles represent the set of all possible conormals of singularities in R f ( α, β ) withtheir lengths. As we can see (and prove), the highest oscillations can occur on the β = 0 line andthey are along the direction (1 , The analysis above and Figure 11, suggests the following improvement: we need to sample denserwhen β is closer to 0. In fact, one can set p = R sin β (the R factor is not essential), as in (58) andsample uniformly in p . We will explore that route in a forthcoming paper.7.3. (ii) Resolution limit given the sampling rate of R κ f . Let s α , s β be the relative samplingrates in α and β , respectively. The Nyquist limit for ( ˆ α, ˆ β ) is given by | ˆ α | < π/s α , | ˆ β | < π/s β . By(61), this is equivalent to(68) | x · ξ ⊥ | < π/s α , (cid:12)(cid:12)(cid:12) ± (cid:112) R | ξ | − ( x · ξ ) + x · ξ ⊥ (cid:12)(cid:12)(cid:12) < π/s β . Let θ be the angle which ξ makes with x when x (cid:54) = 0, more precisely, θ is such that x · ξ = | x | cos θ , x · ξ ⊥ = | x | sin θ . Then | x || ξ || sin θ | < π/s α , | ξ | (cid:12)(cid:12)(cid:12) ± (cid:112) R − | x | cos θ + | x | sin θ (cid:12)(cid:12)(cid:12) < π/s β . We plot the regions determined by the inequalities above with s α = 2 s β , see (65) to get theresolution diagram plotted on Figure 12, where R = 1 . The horizontal lines represent the resolution | x | = 1 Figure 12.

The resolution diagram of R κ in fan-beam coordinates in the unit ball. Foreach x , the circles around it represent the frequency limit imposed by s β as a function ofthe direction. A small radius means small frequency and therefore a smaller resolution.The horizontal lines mark the resolution limit imposed by s α . The diagram is rotationallysymmetric. limit imposed by s α . It is greatest near the origin and decreases (in vertical direction) away fromthe center. As it can be seen from (68), the second inequality in (68) (satisﬁed for both signs)implies the ﬁrst one; so that actual resolution is controlled by the double circles there except at | x | = 1, where the lines are tangent to the lens shaped region. Next, the symmetry relation (62)has an interesting implication. Let ( α ± , β ± , ˆ α ± , ˆ β ± ) be the image of ( x, ξ ) under C ± , related by(62), see Figure 9. Then the resolution limit on f at x in various directions posed by the samplingrate s β near ( α + , β + ) and near ( α − , β − ) are given by (68) with both choices of the signs ± . OnFigure 12, they are represented by the intersection of the two disks at each x . We can see thatnear the origin, it is quite small and close to isotropic. Near | x | = 1, the resolution increases andit is better for ξ close to radial (for example, for circular lines). On the other hand, since C is1-to-2, we need only one of ( α ± , β ± , ˆ α ± , ˆ β ± ) to recover ( x, ξ ). Therefore, the data R χ f actuallycontains stable information about recovery the singularities of f in the union of those disks, insteadof its intersection, if we can use that information. It follows from (61) that the better resolution iscoming from that of the two lines through x normal to ξ with a source Rω ( α ) which is closer to x .In the example in Figure 9, for example, if the sampling rate of R κ f is not suﬃcient to sample theright-hand pattern, we can just cut it oﬀ smoothly and use the other one only. EMICLASSICAL SAMPLING 29

Another approach is to note that the shape the double triangles in Figure 10 allow for under-sampling up to half of the rate, and when there is aliasing (overlapped shifted triangles), it aﬀectsboth images of every ( x, ξ ) equally. To beneﬁt from this however, instead of using a sinc type ofinterpolation, we need to use χ in (24) with ˆ χ supported in the double triangle in Figure 10. Evenbetter, we can sample on a parallelogram type of lattice as in (67). Unlike [14, 15] we could have anon-uniform sampling set as in Section 6.2.2 by dividing [0 , π ] × [ − π/ , π/

2] into horizontal stripsand using parallelogram-like lattices in each one of varying densities using the fact that the wavefront set size decreases when approaching β = ± π/ x and the ξ variables. For the state on the left, we have x almost parallel to ξ on the wave frontset, while for the state on the top, x is almost perpendicular to ξ . As a result, the singularities ofthe ﬁrst state are mapped to the lower frequency ones on the plot of R f closer to the corners. Thestate on the top creates the higher frequency oscillations of R f along the equatorial line of the plotof R f . The Fourier transform on the right in Figure 13 conﬁrms that — the four streaks closer tothe borders correspond to the top phantom. Note that the horizontal axis in the Fourier transformplot is stretched twice compared to Figure 10 because the sampling requirement requires the samenumber of points on each axis, and then the discrete Fourier transform maps a square to a square. (a) f on [ − , (b) R f (c) |FR f | Figure 13. (a): f having WF h ( f ) at two points. (b): R f in fan-beam coordinates. (c): |F h R f | . The two patterns on the central horizontal line correspond to the phantom on thetop. They are close to be critically sampled. The two patterns on the diagonal correspondto the phantom on the right. (iii) Aliasing artifacts. If R κ f is undersampled in either variable, we would get aliasingartifacts as h-FIOs related to shifts of ˆ α and ˆ β , see (29), (30) Section 4.3. By (63) this would createshifts in the x variable along ξ ⊥ and a possible change of the magnitude of ξ but not its direction.We observed similar eﬀects in the parallel parameterization case. In Figure 16b one can see thatthe aliasing artifacts are extended outside the location of the “doughnuts” there.7.5. (iv) Averaged measurements. As in section 6.5, assume we measure Q h R κ f with Q h an h -ΨDO of order (0 , q is the principal symbol of Q h , then a backprojection reconstructs P h f with P h having principal symbol as in (55). In particular, if q = ψ ( a | ˆ α | + b | ˆ β | ), then q = 12 ψ (cid:16) a | x · ξ ⊥ | + b (cid:12)(cid:12) x · ξ ⊥ + (cid:112) R | ξ | − ( x · ξ ) (cid:12)(cid:12) (cid:17) + 12 ψ (cid:16) a | x · ξ ⊥ | + b (cid:12)(cid:12) x · ξ ⊥ − (cid:112) R | ξ | − ( x · ξ ) (cid:12)(cid:12) (cid:17) . (69) This formula reveals something interesting, similar to the observations above: the loss of resolutioncoming from each term is diﬀerent. For each ( x, ξ ), the reconstructed f is q ( x, hD ) f plus a lowerorder term, which is a sum of two with diﬀerent (and direction dependent) losses of resolution. Letus say that ψ is radial and decreasing as r = | x | increases. If x · ξ ⊥ >

0, then the ﬁrst term attenuatesat that frequency more than the second one, and vice versa. Therefore, the reconstruction withfull data in speciﬁc regions and directions would have less resolution that one with partial data.This is also illustrated in Figure 12: the intersection of the circles there reﬂects the resolution limitif we use full data and the union — the resolution limit with partial data chosen to maximize theresolution. To take advantage of that, we would need to take a h -ΨDO Q h , not just a convolution.In Figure 14 we show an example. In (b), we show the reconstructed f with R f averaged in α . This corresponds to b = 0 in (69). The worst resolution is where | x · ξ ⊥ | is maximized, which (a) f on [ − , (b) f rec with R f av-eraged in α (c) f rec with R f av-eraged in β (d) f rec with R f av-eraged in β and half-data Figure 14. f and a reconstructed f rec on [ − , with data on the circumscribed circleand data averaged in the α and in the β variable. In (d), we use R f with sources on theright-hand part of the circle. happens when | x | is maximized and x ⊥ ξ , like for radial lines close to the boundary. We have thebest resolution when | x · ξ ⊥ | is small, and if we want that for all directions; this happens near theorigin but circular lines away from the origin are resolved well, too. Averaging in β is representedby (c) and corresponds to a = 0 in (69). As explained above, we get a superposition of two imagesand the understand the plot better, one should look ﬁrst at (d), where a reconstruction with α restricted to [ − π/ , π/

2] (the r.h.s. of the circumscribed circle) is shown. There, for “doughnuts”closer to the right-hand side, radial lines (where ξ ⊥ x ) are resolved better than circular ones, whichcorresponds to the union of the disks in Figure 12. On the left (far from the sources), it is theopposite: radial lines are very blurred, while circular ones are better resolved. This corresponds tothe intersection of the disks in Figure 12 which predicts better resolution for ξ (cid:107) x . Then in (c),we have a superposition of two such images which have a combined resolution in which radial andcircular blur are mixed: there is still better resolution of radial lines (but the eﬀect is subtle inthis example) and a larger radius blur in circular directions. The eﬀect is stronger near the cornersas compared to “doughnuts” near the edges but in the center of each side because the former arecloser to the circumscribed circle.To illustrate this eﬀect even better in Figure 15, we take f to be a slightly randomized 5 × R f in β . The reconstruction isshown in Figure 15a. Then we compute numerically F − q see (69) with a = 0, which representsthe convolution kernel of the reconstructed image at x = R (0 . , x as a constant. Weplot it (enlarged) in (b). This is what the theory predicts to be the reconstructed image of a delta EMICLASSICAL SAMPLING 31 (a) f reconstructedon [ − , (b) the predicted blur ker-nel at R (0 . , Figure 15. f is a 5 × f on[ − , with data on the circumscribed circle with R = 1 .

45 averaged in the β variable.; (b)the predicted blur kernel at R (0 . , ≈ (1 . , placed at that x . We can see a strong horizontal (i.e., radial) blur plus a fainter vertical (circular)one, spread over a larger area, with a negative sign. In this grayscale, black corresponds to themaximum and white corresponds to the minimum. In (a), one can see (smaller) similar images inthe four corners, which are close to the circumscribed circle. Their orientations are along the radiallines, of course. As x moves closer to the center, the kernel looks more circularly symmetric andgets larger, which can be seen from Figure 15a and also from (69). At the origin, it is Gaussian as(69) predicts. Anti-aliasing.

In Figure 16, we present an example of f undersampled in the β variable andthem blurred ﬁst (in the same variable) and still undersampled at the same rate. (a) original f on[ − , (b) a reconstruction R f undersampled in β (c) an anti-aliased re-construction Figure 16. f is a 7 × f with data under-sampled in the β variable, with 90 angles: a step s β = 2 o ; (b) f is blurred ﬁrst and thensampled as in (b). We see that the aliasing artifacts are mostly suppressed but some resolution is lost.8.

Thermo and Photo-Acoustic Tomography

Let Ω be a smooth bounded domain in R n . Let g be a Riemannian metric in ¯Ω, and let c > c = 1 and g is Euclidean on ∂ Ω (not an essential assumption). Fix

T > Let u solve the problem(70)  ( ∂ t − c ∆ g ) u = 0 in (0 , T ) × R n ,u | t =0 = f,∂ t u | t =0 = 0 . Here, ∂ ν = ν j ∂ x j , where ν is the unit outer normal vector ﬁeld on ∂ Ω. The function f is the sourcewhich we eventually want to recover. The Neumann boundary conditions correspond to a “hardreﬂecting” boundary ∂ Ω. In applications, g is Euclidean but the speed c is variable. The analysisapplies to more general second order symmetric operator involving a magnetic ﬁeld and an electricone, as in [18]. The metric determining the geometry is g := c − g . We assume that ∂ Ω is convex.Let Γ ⊂ ∂ Ω be a relatively open subset of ∂ Ω, where the measurements are made. The observationoperator is then modeled by(71) Λ f = u | [0 ,T ] × Γ . The inverse problem is to ﬁnd f given Λ f .The natural space for f is the Dirichlet space H D (Ω) deﬁned as the completion of C ∞ (Ω) underthe Dirichlet norm(72) (cid:107) f (cid:107) H D = (cid:90) Ω |∇ u | g d Vol . The model above assumes acoustic waves propagating freely through ∂ Ω where we make mea-surements. This means that the detectors have to be really small so that we can ignore their size. Adiﬀerent model studied in the literature is to assume that the waves are reﬂected from the boundaryand measured there. To be speciﬁc, we may assume zero Neumann conditions on ∂ Ω and then Λ f would be the Dirichlet data but other combinations are possible. Then we solve ﬁrst(73)  ( ∂ t − c ∆ g ) u = 0 in (0 , T ) × Ω ,∂ ν u | (0 ,T ) × ∂ Ω = 0 ,u | t =0 = f,∂ t u | t =0 = 0and deﬁne Λ f as in (71) again but this time u is diﬀerent.As shown in [18] in the ﬁrst case (70), Λ, restricted to f supported (strictly) in Ω, is an ellipticFIO of order zero with a canonical relation C = C − ∪ C + , where(74) C ± : ( x, ξ ) (cid:55)−→ (cid:32) ± s ± ( x, ξ/ | ξ | g ) (cid:124) (cid:123)(cid:122) (cid:125) t , γ x,ξ ( s + ( x, ξ )) (cid:124) (cid:123)(cid:122) (cid:125) y , ∓| ξ | g (cid:124) (cid:123)(cid:122) (cid:125) ˆ t = τ , ˙ γ (cid:48) x,ξ ( s ± ( x, ξ )) (cid:124) (cid:123)(cid:122) (cid:125) ˆ y = η (cid:33) , with s ± ( x, ξ ) being the exit time of the geodesic starting from x in the direction ± g − ξ (this is ξ identiﬁed as a vector by the metric g ) until it reaches ∂ Ω. We assume that c − g is non-trapping;them those exit times are ﬁnite and positively homogeneous in ξ of degree −

1. Also, ˙ γ (cid:48) stands forthe orthogonal (in the metric) projection of ˙ γ to T ∂

Ω. Clearly, the frequency range of C is thespace-like cone | η | < | τ | . The norm | ξ | g is the norm of η as a covector in the metric g , and similarly, | η | is in the metric on ∂ Ω induced by the Euclidean one on R n . We would have equality if ξ istangent to ∂ Ω but this cannot happen since supp f ⊂ Ω.If we use (73) as a model instead (allowing for reﬂections) it was shown in [19] that the ﬁrstsingularities give rise to an FIO Λ with the same canonical relation, which is actually 2Λ moduloa lower order operator. After each reﬂection, we get an FIO with a canonical relation of the sametype but reﬂected from the boundary. The sampling requirements are the same, and we will skipthe details.

EMICLASSICAL SAMPLING 33

Assume now that Σ h ( f ) ⊂ {| ξ | ≤ B } . We have | ξ | g = c g ij ξ i ξ j . Let M be the sharp lowerbound of the metric form c − g on the unit sphere over all x . Then 1 /M is the sharp upper boundon c g − and | ξ | g ≤ B/M which is sharp. ThenΣ h (Λ f ) ≤ { ( τ, η ) ∈ R × T ∗ ∂ Ω; | η | < | τ | ≤ B/M } ητ B/MB/M Figure 17.

The frequency set of Λ f . and the r.h.s. is actually the range of Σ h (Λ f ) forall f as above, see Figure 17. If we sample on agrid on [0 , T ] × ∂ Ω, with the second variable ina ﬁxed coordinate chart, we need to choose steps∆ t < πM h/B and | ∆ y j | < πM h/B , where the lat-ter norm is in the induced metric. Since ∆ y j isconstant (the superscript j refers to the j -th coor-dinate), for the Euclidean length ∆ y j we must have∆ y j < πM (cid:48) M h/B , where ( M (cid:48) ) is the sharp up-per bound on the induced metric on the Euclideansphere in that chart. In our numerical example be-low, the boundary is piecewise ﬂat parameterized in an Euclidean way; then M (cid:48) = 1 away from thecorners.Set c max = max c . If g is Euclidean, M = 1 /c max . The metric on R × ∂ Ω is d t + g (cid:48) , where g (cid:48) isthe Euclidean metric restricted to T ∂

Ω. The sampling requirements in any local coordinates on theboundary depend in those coordinates as explained above, with M (cid:48) = 1. Therefore, the samplingrate in the ( t, y ) coordinates should be smaller than πh/Bc max .It is interesting that the sampling requirements do not depend on existence of conjugate pointsor not and are unaﬀected by possible presence of caustics. In fact, we can have caustics even if thegeometry is Euclidean but we start from a concave wave front. In Figure 18 on the left, we plot Figure 18.

Left: f having WF h ( f ) along horizontal directions, on the [ − , square.Center: Λ f on the right hand side cross time for 0 ≤ t ≤ f does not contain higher frequencies there.Right: Λ f with a speed having a fast region in the center. f = e − | x | sin(( x − . / .

02) in the square [ − , computed with a high enough resolution. Onthe right, we plot Λ f for the second model (73) on the right hand side of the square cross the timeinterval [0 , c = 1 − . − | x | ) having a slow region in the center and range0 . ≤ c ≤ − e − / ≈ .

93. Then M ∼

1, and as noticed above, M (cid:48) = 1. The sampling requirementsof Λ f on [0 , × ∂ Ω are therefore the same as those of f on [ − , . Figure 18 demonstrates thatfact by showing that the highest frequencies of Λ f in the center are approximately the same as thehighest ones on the left. Naturally, they occur where the rays hit ∂ Ω at the largest angle with thenormal which is represented by the slanted curves on the plot of Λ f . Next, despite of presence ofcaustics a bit left of the center of the Λ f plot, the oscillations are not of higher frequencies thanelsewhere else. On the right, we plot Λ f when c = 1 + 0 . − | x | ), i.e., there is a fast region in the middle. The speed range is approximately [1 . , . f . The sampling requirements are higher.A more thorough analysis of this case in the context of this paper will be presented elsewhere. References [1] F. Andersson, M. V. de Hoop, and H. Wendt. Multiscale discrete approximation of Fourier integral operators.

Multiscale Model. Simul. , 10(1):111–145, 2012.[2] R. N. Bracewell. Strip integration in radio astronomy.

Aust. J. Phys. , 9:198–217, 1956.[3] P. Caday. Computing Fourier integral operators with caustics.

Inverse Problems , 32(12):125001, 33, 2016.[4] E. Cand`es, L. Demanet, and L. Ying. A fast butterﬂy algorithm for the computation of Fourier integral operators.

Multiscale Model. Simul. , 7(4):1727–1750, 2009.[5] M. V. de Hoop, G. Uhlmann, A. Vasy, and H. Wendt. Multiscale discrete approximations of Fourier integraloperators associated with canonical transformations and caustics.

Multiscale Model. Simul. , 11(2):566–585, 2013.[6] M. Dimassi and J. Sj¨ostrand.

Spectral asymptotics in the semi-classical limit , volume 268 of

London MathematicalSociety Lecture Note Series . Cambridge University Press, Cambridge, 1999.[7] S. Dyatlov and M. Zworski.

Mathematical theory of scattering resonances . book in progress.[8] C. L. Epstein.

Introduction to the mathematics of medical imaging . Society for Industrial and Applied Mathe-matics (SIAM), Philadelphia, PA, second edition, 2008.[9] V. Guillemin. On some results of Gel’fand in integral geometry. In

Pseudodiﬀerential operators and applications(Notre Dame, Ind., 1984) , volume 43 of

Proc. Sympos. Pure Math. , pages 149–155. Amer. Math. Soc., Providence,RI, 1985.[10] V. Guillemin and S. Sternberg.

Geometric asymptotics . American Mathematical Society, Providence, R.I., 1977.Mathematical Surveys, No. 14.[11] V. Guillemin and S. Sternberg.

Semi-classical analysis . International Press, Boston, MA, 2013.[12] L. H¨ormander.

The analysis of linear partial diﬀerential operators. IV , volume 275. Springer-Verlag, Berlin, 1985.Fourier integral operators.[13] H. J. Landau. Necessary density conditions for sampling and interpolation of certain entire functions.

Acta Math. ,117:37–52, 1967.[14] F. Natterer.

The mathematics of computerized tomography . B. G. Teubner, Stuttgart, 1986.[15] F. Natterer. Sampling in fan beam tomography.

SIAM J. Appl. Math. , 53(2):358–380, 1993.[16] D. P. Petersen and D. Middleton. Sampling and reconstruction of wave-number-limited functions in N-dimensional euclidean spaces.

Information and Control , 5(4):279–323, 1962.[17] P. Rattey and A. Lindgren. Sampling the 2-D Radon transform.

IEEE Transactions on Acoustics, Speech, andSignal Processing , 29(5):994–1002, Oct 1981.[18] P. Stefanov and G. Uhlmann. Thermoacoustic tomography with variable sound speed.

Inverse Problems ,25(7):075011, 16, 2009.[19] P. Stefanov and Y. Yang. Multiwave tomography in a closed domain: averaged sharp time reversal.

InverseProblems , 31(6):065007, 23, 2015.[20] M. Zworski.

Semiclassical analysis , volume 138 of

Graduate Studies in Mathematics . American MathematicalSociety, Providence, RI, 2012.. American MathematicalSociety, Providence, RI, 2012.