Emission tomography with a multi-bang assumption on attenuation
Sean Holman and Philip Richardson

January 14, 2020
Abstract
We consider the problem of joint reconstruction of both attenuation $a$ and source density $f$ in emission tomography in two dimensions. This is sometimes called the Single Photon Emission Computed Tomography (SPECT) identification problem, or referred to as attenuation correction in SPECT. Assuming that $a$ takes only finitely many values and $f \in C_c^1(\mathbb{R}^2)$, we are able to characterise singularities appearing in the attenuated Radon transform $R_a f$, which models emission tomography data. Using this characterisation we prove that both $a$ and $f$ can be determined in some circumstances. We also propose a numerical algorithm to jointly compute $a$ and $f$ from $R_a f$ based on a weakly convex regularizer when $a$ only takes values from a known finite list, and show that this algorithm performs well on some synthetic examples.

1 Introduction

In this paper we consider the problem of emission tomography, in which we seek to recover both attenuation $a$ and radiation source density $f$ based on passive radiation measurements. We only consider the problem in two dimensions, and use the photon transport equation as the forward model. Assume that $a$ and $f$ are compactly supported functions on $\mathbb{R}^2$ and let $\Omega \subset \mathbb{R}^2$ be a bounded set containing the supports. Then the photon transport equation is
$$\theta \cdot \nabla u(x,\theta) + a(x)\,u(x,\theta) = f(x), \quad (x,\theta) \in \Omega \times S^1, \qquad u|_{\Gamma_-} = 0, \tag{1}$$
where $u(x,\theta)$ is the photon flux through the point $x$ in a unit direction $\theta \in S^1$ and
$$\Gamma_- = \{(x,\theta) \in \partial\Omega \times S^1 \mid \theta \cdot n(x) \le 0\},$$
where $n(x)$ is the unit outward pointing normal to the boundary at $x$. Intuitively, the differential equation in (1) states that photons are created by a source with density $f$ and then move along straight lines while being attenuated at a rate given by $a$. The boundary condition in (1) requires that no radiation enters the domain $\Omega$.
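The forward model above can be sanity-checked numerically. The following is a minimal sketch (our illustration, not code from the paper) of the method-of-characteristics solution of (1), $u(x,\theta) = \int_0^\infty f(x - t\theta)\, e^{-\int_0^t a(x - r\theta)\,\mathrm{d}r}\,\mathrm{d}t$, approximated by the trapezoidal rule; the function name, the quadrature parameters `t_max` and `n`, and the test phantom in the usage note are all our own choices.

```python
import numpy as np

def transport_solution(f, a, x, theta, t_max=10.0, n=4000):
    """Approximate u(x, theta) = int_0^inf f(x - t*theta) *
    exp(-int_0^t a(x - r*theta) dr) dt for compactly supported a, f.

    f and a are callables taking an (m, 2) array of points and returning (m,)."""
    t, dt = np.linspace(0.0, t_max, n, retstep=True)
    pts = x[None, :] - t[:, None] * theta[None, :]   # points x - t*theta along the ray
    av, fv = a(pts), f(pts)
    # cumulative attenuation int_0^t a(x - r*theta) dr by the trapezoidal rule
    att = np.concatenate([[0.0], np.cumsum(0.5 * (av[1:] + av[:-1]) * dt)])
    integrand = fv * np.exp(-att)
    # trapezoidal rule for the outer integral
    return dt * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
```

With $a \equiv 0$ this reduces to the ordinary X-ray transform of $f$; with $a = f = \chi_{\{|x|<1\}}$ and the observation point $(2,0)$, $\theta = (1,0)$, the exact value is $\int_1^3 e^{-(t-1)}\,\mathrm{d}t = 1 - e^{-2}$.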
The photon transport equation can be solved (for example by the method of characteristics as in [20]) to find the solution $u(x,\theta)$, which satisfies
$$\lim_{\tau \to \infty} u(s\theta^\perp + \tau\theta,\, \theta) = \int_{-\infty}^{\infty} f(s\theta^\perp + t\theta)\, e^{-Da(s\theta^\perp + t\theta,\,\theta)}\, \mathrm{d}t, \tag{2}$$
where $\theta^\perp \in S^1$ denotes $\theta$ rotated in the anticlockwise direction by $\pi/2$ and $Da$ is the beam transform of $a$ defined as
$$Da(x,\theta) = \int_0^{\infty} a(x + t\theta)\, \mathrm{d}t.$$
We define the attenuated Radon transform (AtRT) [20] of $f$ with attenuation $a$ to be the integral on the right hand side of (2), and denote it $R_a f(s,\theta)$. The basic question we investigate in this paper is whether it is possible to determine both $a$ and $f$ from $R_a f$.

When $a$ is fixed, the mapping $f \mapsto R_a f$ is known to be analytically invertible under certain mild conditions, and when these hold a closed form solution for the inverse is known; see [23, 6, 19]. The problem of recovering both $a$ and $f$ from the AtRT is sometimes known as the Single Photon Emission Computed Tomography (SPECT) identification problem [27, 26], and one particular result, given in [24], shows non-uniqueness for radial $a$ and $f$. That is, there are different pairs of $a$ and $f$ depending only on distance to the origin which give the same AtRT. The issue persists for maps which are "close" to being radial, as shown in [16], and numerical investigations in [25] also show evidence of non-uniqueness in other situations. One should also note that if $f = 0$, then trivially $a$ can be anything and still $R_a f = 0$.

Despite these negative results, under some additional hypotheses determination of $a$ and $f$ from $R_a f$ can still be possible. In the medical imaging literature there has been quite a lot of work aimed towards numerical methods for what is more typically called "attenuation correction" in that context (e.g.
[28, 29, 15] and many others), although for practical applications of SPECT the standard method is to obtain the attenuation through a separate transmission CT scan. Some studies have also attempted to make use of scattered photons for attenuation identification in SPECT ([12, 13]), although the forward model must be enriched to include the scattering, and so the mathematical problem is different. There are not many positive theoretical results concerning recovery of both $a$ and $f$ from emission data alone. In the case when $Da$ in (2) is replaced by a constant $\mu$ times $t$, the transform is called the exponential Radon transform, and in [26] it is shown that $\mu$ can be determined from the exponential Radon transform when $f$ is unknown, but not radial. A linearisation of the problem is studied from a microlocal point of view in [27], and used to establish some results for the nonlinear problem as well. A range characterisation of $f \mapsto R_a f$ for given $a$, originally found in [23] and further explored in [22], was used in [2] to analyse recovery of both $a$ and $f$. Perhaps closest to the results of the present paper is the work in [8], which shows that unique recovery of $a$ and $f$ is possible when $a$ is a multiple of the characteristic function of a star shaped region. In this work we assume that $a$ takes on only finitely many values, and refer to such $a$ as multi-bang. As detailed below, we are able to show unique recovery of such $a$ from $R_a f$ in some cases.

Recently the authors of [10, 11] introduced a convex multi-bang regularization technique intended to allow reconstruction of images in which only certain known values occur, and our line of research leading to the present paper was originally inspired by this technique. There are many applications where the multi-bang regularization technique might be useful, particularly in many forms of medical imaging, e.g. SPECT imaging [16] and X-ray imaging [30].
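To illustrate the idea behind multi-bang regularization, here is a toy sketch of ours (this is not the convex regularizer of [10, 11], whose precise form is given in those references): one penalises the pointwise distance to the admissible set, so that minimisers are pushed toward the finitely many admissible values. The function names and the simple proximal step are hypothetical helpers of our own.

```python
import numpy as np

def multibang_penalty(a_vals, admissible):
    """Pointwise distance from each entry of a_vals to the nearest admissible value.
    The penalty vanishes exactly on the admissible set and grows away from it."""
    A = np.asarray(admissible, dtype=float)
    return np.min(np.abs(np.asarray(a_vals, dtype=float)[..., None] - A), axis=-1)

def prox_multibang(a_vals, admissible, lam):
    """A simple proximal-style step for lam * penalty: move each value toward
    its nearest admissible value by lam, without overshooting."""
    A = np.asarray(admissible, dtype=float)
    x = np.asarray(a_vals, dtype=float)
    nearest = A[np.argmin(np.abs(x[..., None] - A), axis=-1)]
    step = np.minimum(lam, np.abs(x - nearest))
    return x - np.sign(x - nearest) * step
```

For example, with admissible set $\{0, 1\}$ the penalty of $0.1$ is $0.1$, and a proximal step with $\lambda = 0.1$ moves $0.3$ to $0.2$ and snaps $0.95$ onto $1$.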
The convex multi-bang technique of [10, 11] was applied numerically to the problem of recovering multi-bang $a$ and $f$ from $R_a f$ in [25] with mixed results, and in the present paper we modify the method to use a weakly convex (rather than convex) multi-bang regularization combined with total variation (TV) to promote the joint recovery of multi-bang $a$ and $f$ from the AtRT $R_a f$. We implement this by alternating updates between $a$ and $f$, using [1] to prove convergence. The $a$ update is the most computationally intensive step due to the nonlinearity, and using recent work of [17, 3] we apply a variant of the Alternating Direction Method of Multipliers (ADMM) with a non-convex multi-bang regularizer, which we show lends itself to promoting multi-bang solutions.

The novel contributions of this paper are summarised as follows. The precise and rigorous versions of the theoretical results, which include a few other technical assumptions, are presented later in the paper.

1. Theorem 1. Assuming that $a$ is multi-bang, $f \in C_c^1(\mathbb{R}^2)$ is non-negative, and with some additional assumptions about the regularity of the boundaries of the regions of constant $a$, we can characterise the singularities occurring in $R_a f$ as a result of the jumps in $a$.

2. Theorem 2. If $a$ and $f$ are as in Theorem 1 with the regions of constant $a$ being constructed using a sequence of nested convex sets, then $a$ and $f$ can be uniquely determined from $R_a f$.

3. We propose a numerical algorithm for joint recovery of multi-bang $a$ and $f$ for limited projection data, and demonstrate its utility with some numerical examples.

Our proofs of Theorems 1 and 2 are based on a careful analysis of the singularities which occur in $R_a f$ arising from the jumps of $a$, as well as a result that if $a$ is known outside of a convex region, then $R_a f$ determines $f$ uniquely also outside this convex region (see Lemma 8).
We prove this latter result by reducing it to the problem considered in [7, Theorem 3.1], although other proofs using, for example, analytic microlocal analysis as in [14] should also be possible.

The rest of the paper is structured as follows. Section 2 gives the theoretical results of the paper, introducing some necessary definitions in subsection 2.1, stating and proving Theorem 1 as well as some related results in subsection 2.2, and stating and proving Theorem 2 in subsection 2.3. Section 3 describes the numerical methods used and section 4 gives a number of numerical examples. The final section concludes the work in the paper and suggests avenues for further research.

2 Theoretical results

We first recall from the introduction that the AtRT $R_a f$ is defined by the formula
$$R_a f(s,\theta) = \int_{-\infty}^{\infty} f(s\theta^\perp + t\theta)\, e^{-Da(s\theta^\perp + t\theta,\,\theta)}\, \mathrm{d}t. \tag{3}$$
Throughout this section we will assume that $f \in C_c^1(\mathbb{R}^2)$ and that $a$ is multi-bang. Note that $(s,\theta)$ corresponds to the line tangent to $\theta$ with distance $s$ from the origin (this is slightly different from the standard parametrisation of lines in the Radon transform, in which $\theta$ is normal to the line), and $R_a f$ is an integral along that line. We will always parametrise $\theta \in S^1$ by the angle $\omega \in \mathbb{R}$ such that $\theta = (\cos(\omega), \sin(\omega))$ and $\theta^\perp = (-\sin(\omega), \cos(\omega))$.

We now introduce the precise definition of multi-bang.

Definition 1. (Multi-bang)
We say that $a \in L^\infty(\mathbb{R}^2)$ is multi-bang if there exists a finite set $A = \{a_1, \ldots, a_n\} \subset \mathbb{R}$, called the admissible set, and a collection of disjoint bounded open sets $\{\Omega_j\}_{j=1}^n$ with smooth boundaries possibly having corners such that
$$a = \sum_{j=1}^n a_j \chi_{\Omega_j}. \tag{4}$$
Here $\chi_{\Omega_j}$ is the characteristic function of the set $\Omega_j$, and we assume that for all $\Omega_j$ the interior of the closure of $\Omega_j$ is equal to $\Omega_j$. We also assume that any line only intersects the boundaries $\cup_{j=1}^n \partial\Omega_j$ finitely many times.

The final hypothesis, that lines intersect the boundaries only finitely many times, is added for technical reasons, and can probably be removed although we have not proven this. It is also worthwhile to point out that for the theoretical results in section 2 we do not require knowledge of the set of admissible values $A$, although we do assume knowledge of the admissible values for the numerical algorithm.

As we will see below in section 2.3, we will be able to determine certain points on the boundaries of the sets $\Omega_j$ in Definition 1 from $R_a f$. The set of points which we can determine will be denoted $P_a$, which we now define.

Definition 2. ($P_a$) Suppose that $a$ is multi-bang with sets $\{\Omega_j\}_{j=1}^n$ as in Definition 1. Then $P_a$ is the set of points $x \in \partial\Omega_j$ for some $j$ such that either

1. $\partial\Omega_j$ has non-zero curvature at $x$, or

2. $\partial\Omega_j$ has a corner at $x$.

We further write $P_{a,1}$ for the subset of $P_a$ where the boundary has non-zero curvature as in 1, and $P_{a,2}$ for the subset of $P_a$ of corners as in 2.

As we might expect from microlocal analysis (e.g. see [27]), the jumps in $a$ along the boundaries of the $\Omega_j$ lead to singularities in $R_a f$ at points $(s,\theta)$ corresponding to lines which are either tangent to the boundaries $\partial\Omega_j$, or pass through corners of $\partial\Omega_j$. We introduce the following definition for these lines.
Note that in this definition we consider lines passing through a corner of $\partial\Omega_j$ that are also tangent to one of the branches of the corner to be tangent to $\partial\Omega_j$ (so there are always at least two lines passing through a corner that are also tangent).

Definition 3. ($K_a$) Suppose that $a$ is multi-bang with sets $\{\Omega_j\}_{j=1}^n$ as in Definition 1 and $P_a$ is as in Definition 2. We define $K_a$ to be the subset of $\{(s,\theta) : s \in \mathbb{R},\ \theta \in S^1\}$ such that the line corresponding to $(s,\theta)$ is either tangent to $\partial\Omega_j$ or passes through a corner of $\partial\Omega_j$ for some $j$. We further define $\widetilde{K}_a \subset K_a$ to be the same set with the added requirement that if the line is tangent, then the tangency must be at a point where $\partial\Omega_j$ has nonzero curvature. Finally, we define subsets of $K_a$: the set $K_{a,1}$ of lines tangent at a point where the curvature is non-zero, and the set $K_{a,2}$ of lines passing through corners. (Note that $K_{a,1}$ and $K_{a,2}$ may not be disjoint.)

For Theorem 2 we will also require an additional definition which describes precisely what we meant when we said the regions of constant $a$ are constructed using a sequence of nested convex sets.

Definition 4. (Nicely multi-bang)
We say that $a$ is nicely multi-bang if $a$ is multi-bang and can furthermore be written in the form
$$a = \sum_{j=1}^N c_j \chi_{C_j} \tag{5}$$
where the sets $C_j$ are all convex, bounded and open with smooth boundary possibly having corners, and nested in the sense that $C_N \Subset C_{N-1} \Subset \cdots \Subset C_1$. Here $C_j \Subset C_{j-1}$ means that $C_j$ is contained in a compact set that is contained in $C_{j-1}$.

In the next section we will state and prove Theorem 1 as well as some other related results.
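For concreteness, a nicely multi-bang attenuation in the sense of Definition 4 can be built from nested disks. The sketch below is our illustration (the function name and parameters are our own choices); it evaluates $a = \sum_j c_j \chi_{C_j}$ on a grid. Note that the admissible values of $a$ in the sense of Definition 1 are then the partial sums $c_1, c_1 + c_2, \ldots$, since the $C_j$ are nested.

```python
import numpy as np

def nicely_multibang(X, Y, radii, coeffs):
    """Evaluate a = sum_j coeffs[j] * chi_{disk of radius radii[j]} on a grid.
    radii must be strictly decreasing so that the disks are nested."""
    a = np.zeros_like(X, dtype=float)
    r2 = X**2 + Y**2
    for rad, c in zip(radii, coeffs):
        a += c * (r2 < rad**2)   # add c on the open disk of radius rad
    return a
```

For example, with radii $(1, 0.5)$ and coefficients $(1, 2)$ the resulting $a$ takes the admissible values $\{0, 1, 3\}$: it equals $3$ on the inner disk, $1$ on the annulus, and $0$ outside.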
We start the section with the statement of Theorem 1.
Theorem 1.
Suppose that $f \in C_c^1(\mathbb{R}^2)$ is non-negative and $a$ is multi-bang with sets $\Omega_j$ as given in Definition 1. The theorem has two parts, corresponding to $P_{a,1}$ and $P_{a,2}$.

1. Suppose $x \in P_{a,1}$ and the line tangent to a boundary $\partial\Omega_j$ at $x$ is not tangent to a boundary anywhere else. If this tangent line is given by $(s^*, \theta^*)$ with $\theta^* = (\cos(\omega^*), \sin(\omega^*))$ and the ray $\{x + t\theta^* \mid t < 0\}$ intersects the set $\{f > 0\}$, then $\partial_s R_a f(s, \theta^*)$ has a singularity of order $1/2$ at $s = s^*$, and $x$ is the unique point on the line such that
$$\lim_{\omega \to \omega^*} |\sin(\omega - \omega^*)|^{1/2}\, \partial_\omega R_a f(x \cdot \theta^\perp, \theta) = 0.$$
(Recall that $\theta = (\cos(\omega), \sin(\omega))$.)

2. Suppose that $x \in P_{a,2}$ lies on a corner of a boundary for a component of $\Omega_j$ and is also a corner for only one other component of some $\Omega_l$. If $(s^*, \theta^*) \in K_{a,2}$ corresponds to any line passing through $x$ that passes through no other corners, is not tangent to any of the boundaries, and is such that the ray $\{x + t\theta^* \mid t < 0\}$ passes through the set $\{f > 0\}$, then $\partial_s R_a f(s, \theta^*)$ has a jump across $s = s^*$.

Remark 1.
We comment that while we have chosen Theorem 1 to summarise the results in this section in a concise manner, the lemmas we prove, taken together, provide stronger statements and more information than given in Theorem 1 concerning the singularities of $R_a f$ when $a$ is multi-bang. In particular Lemmas 1 and 5 and Corollary 1 are not reflected in the statement of Theorem 1.

We now begin establishing lemmas and a corollary which will lead to the proof of Theorem 1. For the first lemma we look at the regularity of $R_a f$ in the complement of $K_a$, which is actually not directly related to Theorem 1.

Lemma 1.
Suppose that $f \in C_c^1(\mathbb{R}^2)$, $a$ is multi-bang, and $(s^*, \theta^*) \in (K_a)^c$. Then the mapping $s \mapsto \partial_s R_a f(s, \theta^*)$ is continuous at $s^*$.

Proof. Since $(s^*, \theta^*) \in (K_a)^c$ the line tangent to $\theta^*$ with distance $s^*$ from the origin is neither tangent to one of the boundaries where $a$ jumps, nor passing through a corner of one of these boundaries. W.l.o.g. we can rotate the axes so that the line corresponding to $(s^*, \theta^*)$ lies on the $x$-axis, $\theta^* = (1,0)$ and $s^* = 0$. Then
$$R_a f(s, \theta^*) = \int_{-\infty}^{\infty} f(x,s)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x. \tag{6}$$
For $x \in \mathbb{R}$, recall that
$$Da((x,s), \theta^*) = \int_x^{\infty} a(t,s)\, \mathrm{d}t.$$
By assumption, the line given by $(s, \theta^*)$ only crosses the jumps of $a$ finitely many times, and we label the ordered values of $t$ at which these crossings occur (when the line is parametrised as $t \mapsto (t,s)$) as $\{t_i(s)\}_{i=1}^N$. Since $(s^*, \theta^*) \in (K_a)^c$ these functions $t_i$ are differentiable in a neighbourhood of $s = 0$. Next we introduce the functions
$$\phi_i(x,s) = \begin{cases} t_i(s) & x < t_i(s) \\ x & x \ge t_i(s). \end{cases} \tag{7}$$
Note that for all $i$, $\phi_i$ is continuous with bounded first derivative in a neighbourhood of $s = 0$, which is also continuous when $x \ne t_i(s)$. Using these functions we have
$$Da((x,s), \theta^*) = \sum_{i=1}^{N-1} c_i \big(\phi_{i+1}(x,s) - \phi_i(x,s)\big) = \sum_{i=1}^{N} (c_{i-1} - c_i)\, \phi_i(x,s) \tag{8}$$
where for each $i$, $c_i$ is one of the admissible values or possibly zero (in particular $c_0$ and $c_N$ are always zero). We thus see that $Da((x,s), \theta^*)$ also has bounded derivatives in a neighbourhood of $s = 0$ that are continuous except when $x = t_i(s)$ for some $i$, and differentiating this with respect to $s$ gives
$$\partial_s Da((x,s), \theta^*) = \sum_{i=1}^{N} (c_{i-1} - c_i)\, \partial_s \phi_i(x,s).$$
Since $f \in C^1(\mathbb{R}^2)$ as well, we can therefore differentiate under the integral sign in (6) to get
$$\begin{aligned} \partial_s R_a f(s, \theta^*) &= \int_{-\infty}^{\infty} \big(\partial_s f(x,s) - \partial_s Da((x,s), \theta^*)\, f(x,s)\big)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x \\ &= \int_{-\infty}^{\infty} \partial_s f(x,s)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x + \sum_{i=1}^{N} (c_i - c_{i-1})\, t_i'(s) \int_{-\infty}^{t_i(s)} f(x,s)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x. \end{aligned} \tag{9}$$
Since $f \in C_c^1(\mathbb{R}^2)$ and the derivatives $t_i'$ are all continuous, we see that in fact $\partial_s R_a f(s, \theta^*)$ is also continuous with respect to $s$ at $s = 0$, which completes the proof.

We now begin to consider what can happen to $R_a f$ at lines in $K_a$. First we consider $K_{a,2}$.

Lemma 2.
Suppose that $f \in C_c^1(\mathbb{R}^2)$, $a$ is multi-bang, and $(s^*, \theta^*) \in K_{a,2}$ passes through exactly one corner and is not tangent to any boundary $\partial\Omega_j$. Then $s \mapsto \partial_s R_a f(s, \theta^*)$ is bounded near $s^*$. Suppose additionally that the corner point occurs at $s^*(\theta^*)^\perp + t^*\theta^*$, is a corner for $N$ different components of the regions $\Omega_j$, and the boundaries of these regions make angles $\{\alpha_k\}_{k=1}^N$ with $(\theta^*)^\perp$, where the orientation is chosen so that $\theta^*$ is at a positive angle. Also suppose that the jump in $a$ across the boundary with angle $\alpha_k$ in the direction of increasing angle is $b_k$ (see figure 1 and caption). Then there is a jump in $\partial_s R_a f(s, \theta^*)$ across $s = s^*$ given by
$$\Big[\partial_s R_a f(s, \theta^*)\Big]_{s^*-}^{s^*+} = \left(\sum_{k=1}^{N} \frac{b_k}{\tan(\alpha_k)}\right) \int_{-\infty}^{t^*} f(s^*(\theta^*)^\perp + t\theta^*)\, e^{-Da(s^*(\theta^*)^\perp + t\theta^*,\,\theta^*)}\, \mathrm{d}t. \tag{10}$$

Figure 1: This figure illustrates some of the notation used in the statement and proof of Lemma 2. The line corresponding to $(s^*, \theta^*)$ is shown in red. Note that the jump $b$ across the boundary corresponding to the angle $\alpha$ in the example shown in the figure would be $b = c^+_{i+3} - c^+_{i+2}$.

Proof.
We use the same set-up and notation as in the proof of Lemma 1, although by translating if necessary we also assume w.l.o.g. that the corner occurs at the origin. Unlike in Lemma 1, there may be a different number of boundary crossings $N^\pm$ for $s > 0$ and $s < 0$, and so we introduce corresponding functions $\{t_i^\pm\}_{i=1}^{N^\pm}$ giving the crossing points, where the $t_i^+$ are defined for $s > 0$ and the $t_i^-$ are defined for $s < 0$. As in Lemma 1 these $t_i^\pm$ will all have bounded derivatives up to $s = 0$ (this is because the line given by $(s^*, \theta^*)$ is not tangent to any boundary). We also introduce the corresponding $\phi_i^\pm$ defined as in (7), but only for $s > 0$ or $s < 0$ respectively. The formula (8) still holds with $\pm$ added in appropriate places, and we can still see that $Da((x,s), \theta^*)$ is continuous for $s$ close to zero. For the derivative $\partial_s Da((x,s), \theta^*)$ we have, for $s \ne 0$, where $\pm$ is the sign of $s$,
$$\partial_s Da((x,s), \theta^*) = \sum_{i=1}^{N^\pm} (c_{i-1}^\pm - c_i^\pm)\, \partial_s \phi_i^\pm(x,s). \tag{11}$$
Since these derivatives are all bounded (but not necessarily continuous at $s = 0$) we still have (9) for $s \ne 0$, and we see that $\partial_s R_a f(s, \theta^*)$ is bounded, thus proving the first statement of the lemma. It remains to analyse the jump at $s = 0$.

Let us first consider the jump in $\partial_s Da((x,s), \theta^*)$ across $s = 0$. The only terms contributing to this jump in (11) will be those with $\phi_i^\pm$ where $t_i^\pm(s) \to 0$ as $s \to 0^\pm$, since the others correspond to boundaries which do not have corners along $(s^*, \theta^*)$, and the line is not tangent to any of the boundaries. Let us reindex the indices $i$ corresponding to such $t_i$ using a new index $k$ as $\{i_k^\pm\}_{k=1}^{\tilde N^\pm}$. Then the jump in $\partial_s Da((x,s), \theta^*)$ is given by
$$\Big[\partial_s Da((x,s), \theta^*)\Big]_{-}^{+} = \lim_{s \to 0^+} \partial_s Da((x,s), \theta^*) - \lim_{s \to 0^-} \partial_s Da((x,s), \theta^*) = \sum_{k=1}^{\tilde N^+} \big(c^+_{i_k^+ - 1} - c^+_{i_k^+}\big)\, \partial_s \phi^+_{i_k^+}(x, 0^+) - \sum_{k=1}^{\tilde N^-} \big(c^-_{i_k^- - 1} - c^-_{i_k^-}\big)\, \partial_s \phi^-_{i_k^-}(x, 0^-).$$
Using (9) we find that the jump of $\partial_s R_a f(s, \theta^*)$ across $s = 0$ will be
$$\Big[\partial_s R_a f(s, \theta^*)\Big]_{-}^{+} = \lim_{s \to 0^+} \partial_s R_a f(s, \theta^*) - \lim_{s \to 0^-} \partial_s R_a f(s, \theta^*) = -\left(\sum_{k=1}^{\tilde N^+} \big(c^+_{i_k^+ - 1} - c^+_{i_k^+}\big)\, \partial_s t^+_{i_k^+}(0^+) - \sum_{k=1}^{\tilde N^-} \big(c^-_{i_k^- - 1} - c^-_{i_k^-}\big)\, \partial_s t^-_{i_k^-}(0^-)\right) \int_{-\infty}^{0} f(x,0)\, e^{-Da((x,0),\,\theta^*)}\, \mathrm{d}x.$$
Taking into account the rotation and translation used at the beginning, we see that this corresponds with (10) and so completes the proof.

Next we consider $K_{a,1}$, which requires a bit more work.

Lemma 3.
Suppose that $f \in C_c^1(\mathbb{R}^2)$, $a$ is multi-bang, and that $(s^*, \theta^*) \in K_{a,1}$ is such that the line corresponding to $(s^*, \theta^*)$ is only tangent to a boundary $\partial\Omega_j$ at one point, given by $s^*(\theta^*)^\perp + t^*\theta^*$, which is not also a corner. Furthermore, suppose that $a = c_0$ on the convex side of the point of tangency, $a = c_1$ on the concave side, the curvature of $\partial\Omega_j$ at the point of tangency is $\kappa > 0$, and $\theta^{*\perp} \in S^1$ is orthogonal to $\theta^*$ and points into the convex side. Then
$$\lim_{s \to (s^*)^\pm} |s - s^*|^{1/2}\, \partial_s R_a f(s, \theta^*) = \pm \frac{c_1 - c_0}{\sqrt{\kappa/2}} \int_{-\infty}^{t^*} f(s^*(\theta^*)^\perp + t\theta^*)\, e^{-Da(s^*(\theta^*)^\perp + t\theta^*,\,\theta^*)}\, \mathrm{d}t \tag{12}$$
where $\pm$ is the sign of $(\theta^*)^\perp \cdot \theta^{*\perp}$.

Proof. After possibly rotating as in the proof of Lemma 1 and reflecting across the $x$-axis, we can assume that $s^* = 0$, $\theta^* = (1,0)$ and $\theta^{*\perp} = (0,1)$. Near the point of tangency, $\partial\Omega_j$ will be given as a graph in the form
$$y = x^2 g(x) \tag{13}$$
where $g$ is a strictly positive function and $2g(0) = \kappa$. For sufficiently small $s \ne 0$ we still have the equation (8). The difference from Lemma 1 is that in the current case the derivatives of the $t_i$'s corresponding to the point of tangency will blow up as $s \to 0^+$. There will be two such $t_i$'s, which both go to $0$ as $s \to 0^+$, one positive and one negative, which we label respectively as $t_\pm$, and differentiating (13) we can show that
$$\lim_{s \to 0^+} s^{1/2}\, \partial_s t_\pm(s) = \pm \frac{1}{\sqrt{2\kappa}}. \tag{14}$$
Now we have (9), which holds for $s > 0$, and all terms remain bounded as $s \to 0^+$ except those that involve derivatives of $t_\pm$. Thus when we multiply by $s^{1/2}$ and take the limit $s \to 0^+$ the only terms that will possibly remain are
$$\lim_{s \to 0^+} s^{1/2}\, \partial_s R_a f(s, \theta^*) = \lim_{s \to 0^+} \Bigg\{ -s^{1/2} (c_0 - c_1)\, \partial_s t_+(s) \int_{t_-(s)}^{t_+(s)} f(x,s)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x - s^{1/2} (c_0 - c_1) \big(\partial_s t_+(s) - \partial_s t_-(s)\big) \int_{-\infty}^{t_-(s)} f(x,s)\, e^{-Da((x,s),\,\theta^*)}\, \mathrm{d}x \Bigg\}.$$
The term on the right on the first line is zero by (14), using the fact that the integrand is continuous, while using (14) another time we can evaluate the term on the second line to get
$$\lim_{s \to 0^+} s^{1/2}\, \partial_s R_a f(s, \theta^*) = \frac{c_1 - c_0}{\sqrt{\kappa/2}} \int_{-\infty}^{0} f(x,0)\, e^{-Da((x,0),\,\theta^*)}\, \mathrm{d}x.$$
Taking into account the rotation, translation and reflection at the beginning of the proof, this corresponds with (12) and so completes the proof.
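The square-root scaling in Lemma 3 can be checked numerically in the simplest case of a disk attenuator $a = c\,\chi_{\{|x| < \rho\}}$, whose boundary has constant curvature $\kappa = 1/\rho$. For a source $f$ supported far to the left of the disk, $R_a f(s)$ along the line $y = s$ factorises as $G(s)\,e^{-c L(s)}$, where $L(s)$ is the chord length of the disk and $G(s)$ is the X-ray transform of $f$ alone. The sketch below is our own illustration (the Gaussian source and all parameters are our choices), and the reference value $c\sqrt{2\rho}\,G(\rho) = c\,G(\rho)/\sqrt{\kappa/2}$ is consistent with our reading of (12).

```python
import numpy as np

rho, c, sigma = 1.0, 1.0, 0.5
kappa = 1.0 / rho                        # curvature of the disk boundary

def G(s):
    # x-integral of the Gaussian source f(x, y) = exp(-((x+5)^2 + y^2)/(2 sigma^2))
    return np.sqrt(2.0 * np.pi) * sigma * np.exp(-s**2 / (2.0 * sigma**2))

def L(s):
    # chord length of the disk of radius rho along the line y = s
    return 2.0 * np.sqrt(np.clip(rho**2 - s**2, 0.0, None))

def R(s):
    # AtRT along y = s, valid because f is (numerically) supported left of the disk
    return G(s) * np.exp(-c * L(s))

# finite-difference d/ds R at s just below the tangent value s* = rho
s0, h = rho - 1e-6, 1e-9
dR = (R(s0 + h) - R(s0 - h)) / (2.0 * h)
scaled = np.sqrt(rho - s0) * dR          # ~ c * G(rho) / sqrt(kappa/2)
```

The singular side here is $s \to \rho^-$ (lines with $s$ slightly below $\rho$ still cross the boundary near the tangent point), and `scaled` approaches $c\sqrt{2\rho}\,G(\rho) \approx 0.24$ for these parameters.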
Remark 2.
Note that Lemma 3 precisely characterises the leading order singularity of the derivative $\partial_s R_a f$ at $(s^*, \theta^*)$. It is possible to obtain a similar formula and characterisation for some smooth parts of the boundary where the curvature is zero but some higher order derivative does not vanish at $x = 0$. In this case we would use $y = x^n g(x)$ in place of (13), where $n \ge 3$ and $g(0) \ne 0$. The order of the singularity in $\partial_s R_a f$ as $s \to (s^*)^\pm$ is then $1 - 1/n$ rather than $1/2$.

Remark 3.
It is possible to combine the methods of proof of the previous lemmas to characterise the singularities at $(s^*, \theta^*)$ corresponding to lines both tangent to the boundaries at multiple places and/or passing through multiple corners, but to simplify the statements we have not done this explicitly.

At this point we note that these first three lemmas already show how we can determine some information about multi-bang $a$ from $R_a f$. First of all, Lemma 1 shows that if $\partial_s R_a f(s, \theta)$ is not continuous in $s$ at a point $(s^*, \theta^*)$, then the corresponding line must either be tangent to, or pass through a corner of, the boundary of one of the regions $\Omega_j$. If $\partial_s R_a f(s, \theta)$ is bounded near $(s^*, \theta^*)$ but has a jump in $s$ at this point, then the line must pass through a corner by Lemma 2. If $\partial_s R_a f(s, \theta^*)$ blows up as $s \to s^*$, then the line $(s^*, \theta^*)$ must be tangent to one of the boundaries by Lemma 3. This already gives most of the information required to prove Theorem 1, except for the part about the derivative with respect to $\omega$. For this we include one additional lemma studying the derivative with respect to variation in the angle $\omega$ rather than $s$ as in the previous lemmas.

Lemma 4.
Assume the same hypotheses as in Lemma 3. Additionally suppose that $\theta^* = (\cos(\omega^*), \sin(\omega^*))$ and $x^*$ is a point on the line corresponding to $(s^*, \theta^*)$. Then
$$\lim_{\omega \to (\omega^*)^{s_1}} |\sin(\omega - \omega^*)|^{1/2}\, \partial_\omega R_a f(x^* \cdot \theta^\perp, \theta) = s_2\, (c_1 - c_0) \sqrt{\frac{2\,|x^* \cdot \theta^* - t^*|}{\kappa}} \int_{-\infty}^{t^*} f(s^*(\theta^*)^\perp + t\theta^*)\, e^{-Da(s^*(\theta^*)^\perp + t\theta^*,\,\theta^*)}\, \mathrm{d}t \tag{15}$$
where $s_1$ is the sign of $(t^* - x^* \cdot \theta^*)\,(\theta^*)^\perp \cdot \theta^{*\perp}$, and $s_2$ is the sign of $(\theta^*)^\perp \cdot \theta^{*\perp}$. In the case that $t^* - x^* \cdot \theta^* = 0$, the equation (15) still holds with $s_1$ removed (note the right hand side is zero in that case).

Proof. As in the previous lemmas, by rotating, translating and possibly reflecting about the $x$-axis we assume w.l.o.g. that $\theta^* = (1,0)$, $\theta^{*\perp} = (0,1)$ and the point of tangency is at the origin (i.e. $t^* = 0$). We also assume that $x^* = (\ell, 0)$, and for the moment consider only the case $\ell \ne 0$. Note that the line corresponding to $(x^* \cdot \theta^\perp, \theta)$ is precisely the line through $x^*$ tangent to $\theta$, and we will change the parametrisation of this line in the integral definition of the AtRT so that $t = 0$ always corresponds with $x^*$. After doing all of this we have
$$R_a f(x^* \cdot \theta^\perp, \theta) = \int_{-\infty}^{\infty} f(x^* + t\theta)\, e^{-Da(x^* + t\theta,\,\theta)}\, \mathrm{d}t. \tag{16}$$
Now we use the same notation as in the previous lemmas and label the ordered values of $t$ along the line $t \mapsto x^* + t\theta$, for $-\mathrm{sgn}(\ell)\,\omega > 0$ with $|\omega|$ sufficiently small, at which the line intersects one of the boundaries $\partial\Omega_j$, as $\{t_i(\omega)\}_{i=1}^N$. Two of the $t_i$ will correspond to the point of tangency, and these will satisfy $t_i(\omega) \to -\ell$ as $\omega \to 0^{-\mathrm{sgn}(\ell)}$; we label them $t_\pm$. Combining (13) with the geometric relations
$$\cos(\omega) = \frac{x - \ell}{t_\pm}, \qquad \sin(\omega) = \frac{y}{t_\pm}, \qquad \tan(\omega) = \frac{y}{x - \ell} \tag{17}$$
we can show, by taking derivatives with respect to $\omega$ and some computation, that
$$\lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} |\sin(\omega)|^{1/2}\, \partial_\omega t_\pm(\omega) = \pm \sqrt{\frac{|\ell|}{2\kappa}}. \tag{18}$$
In the case that $\ell = 0$ we will still have $t_\pm(\omega)$, when $\omega \ne 0$ is sufficiently small, corresponding to the two intersections near the tangent point, but $t_{-\mathrm{sgn}(\omega)}(\omega) = 0$ and $\omega\, t_{\mathrm{sgn}(\omega)}(\omega) > 0$ for $\omega \ne 0$. In this case we can show, in a similar manner to the $\ell \ne 0$ case, that
$$\lim_{\omega \to 0} |\sin(\omega)|^{1/2}\, \partial_\omega t_\pm(\omega) = 0. \tag{19}$$
We next define functions $\phi_i$ in a similar way to before (compare with (7)) as
$$\phi_i(t,\omega) = \begin{cases} t_i(\omega) & t < t_i(\omega) \\ t & t \ge t_i(\omega), \end{cases} \tag{20}$$
for $-\mathrm{sgn}(\ell)\,\omega > 0$. As before
$$Da(x^* + t\theta, \theta) = \sum_{i=1}^{N} (c_{i-1} - c_i)\, \phi_i(t,\omega),$$
also for $-\mathrm{sgn}(\ell)\,\omega > 0$. In the case $\ell = 0$ we still have versions of the previous formula for $\omega \ne 0$, but it will change depending on the sign of $\omega$. We will also write $\phi_\pm$ for those $\phi_i$ corresponding to $t_\pm$.

Now let us take the derivative of (16), in the case when $-\mathrm{sgn}(\ell)\,\omega > 0$ if $\ell \ne 0$, or $\omega \ne 0$ if $\ell = 0$. We then have
$$\partial_\omega R_a f(x^* \cdot \theta^\perp, \theta) = \int_{-\infty}^{\infty} \Big( \partial_\omega f(x^* + t\theta) - \partial_\omega Da(x^* + t\theta, \theta)\, f(x^* + t\theta) \Big)\, e^{-Da(x^* + t\theta,\,\theta)}\, \mathrm{d}t. \tag{21}$$
First consider the case $\ell = 0$. In this case, when we multiply by $|\sin(\omega)|^{1/2}$ and take the limit as $\omega \to 0$, using (19) we see that the limit is zero. Since $\ell = x^* \cdot \theta^* - t^*$, this proves the result when $\ell = 0$. Now consider when $\ell \ne 0$. In this case we multiply by $|\sin(\omega)|^{1/2}$ and take the limit as $\omega \to 0^{-\mathrm{sgn}(\ell)}$. The only terms that are not bounded in (21) for $\omega$ close to zero are those that involve derivatives of $\phi_\pm$. We therefore have
$$\lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} |\sin(\omega)|^{1/2}\, \partial_\omega R_a f(x^* \cdot \theta^\perp, \theta) = (c_0 - c_1) \lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} \int_{-\infty}^{t_-(\omega)} |\sin(\omega)|^{1/2}\, \partial_\omega t_-(\omega)\, f(x^* + t\theta)\, e^{-Da(x^* + t\theta,\,\theta)}\, \mathrm{d}t + (c_1 - c_0) \lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} \int_{-\infty}^{t_+(\omega)} |\sin(\omega)|^{1/2}\, \partial_\omega t_+(\omega)\, f(x^* + t\theta)\, e^{-Da(x^* + t\theta,\,\theta)}\, \mathrm{d}t.$$
Applying (18) to this we finally obtain
$$\lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} |\sin(\omega)|^{1/2}\, \partial_\omega R_a f(x^* \cdot \theta^\perp, \theta) = (c_1 - c_0) \sqrt{\frac{2\,|\ell|}{\kappa}} \int_{-\infty}^{-\ell} f(x^* + t\theta^*)\, e^{-Da(x^* + t\theta^*,\,\theta^*)}\, \mathrm{d}t.$$
Taking into account the translation, rotation and reflection from the beginning of the proof, this formula agrees with (15), and so completes the proof.

Before giving the proof of Theorem 1 we record a corollary of the proof of Lemma 4, which looks at one case in which, at the point of tangency of a line to the boundary of an $\Omega_j$, the curvature of the boundary is zero. This corollary will be useful for the proof of Theorem 2 later.

Corollary 1.
Assume the same hypotheses as in Lemma 4, including the assumption that there is a convex and a concave side of the boundary near the point of tangency, but say the curvature is $\kappa = 0$ at the point of tangency. If $x^* \cdot \theta^* - t^* \ne 0$ and the ray $\{x^* + t\theta^* : t < t^*\}$ intersects the set $\{f > 0\}$, then the limits in (15) and (12) are one of $\pm\infty$.

Proof. The proof follows the same outline as the proofs of Lemmas 3 and 4, but in (13) we have $g(0) = 0$. Because of this, (14) and (18) respectively change to
$$\lim_{s \to 0^+} s^{1/2}\, \partial_s t_\pm(s) = \pm\infty, \qquad \lim_{\omega \to 0^{-\mathrm{sgn}(\ell)}} |\sin(\omega)|^{1/2}\, \partial_\omega t_\pm(\omega) = \pm\infty.$$
Following the proofs through the rest of the way with this change, and using the fact that the integrals appearing at the end do not vanish, proves the corollary.

The proof of Theorem 1 now follows simply from Lemmas 2, 3 and 4, as we now point out.
Proof of Theorem 1.
For the first item in Theorem 1 we note that if the hypotheses are satisfied, then by Lemma 3 equation (12) holds, and since the ray $\{x + t\theta^* \mid t < 0\}$ intersects $\{f > 0\}$, the integral on the right side of (12) is not zero. Therefore $\partial_s R_a f(s, \theta^*)$ has a singularity of order $1/2$ at $s = s^*$. Similarly, Lemma 4 implies that (15) will hold, and this limit will only be zero when $x^* = x$. This proves the first part.

The second part follows similarly from Lemma 2, although we note that under the given hypotheses $N = 2$ in (10) and $b_1 = -b_2 \ne 0$. Thus we have a jump in $\partial_s R_a f$ if
$$\tan(\alpha_1) \ne \tan(\alpha_2).$$
This will always be true at a corner, since there would be equality only if $\alpha_1 = \alpha_2 + n\pi$ for an integer $n$, but that would mean we are not at a corner.

To finish this section we prove one more lemma concerning what can happen if there is a flat section of a boundary of one of the $\Omega_j$. This will be used in the proof of Theorem 2.

Lemma 5.
Assume that $a$ is multi-bang and $f \in C_c^1(\mathbb{R}^2)$ is non-negative. Suppose that the line given by $(s^*, \theta^*)$ intersects the boundary of one of the $\Omega_j$ in a line segment of length $\ell$ given by $\{s^*(\theta^*)^\perp + t\theta^* \mid t \in [t_-, t_+]\}$, and that there are no corners for any of the other regions contained in the interior of this line segment. Assume also that the ray $\{s^*(\theta^*)^\perp + t\theta^* \mid t < t_+\}$ intersects the set $\{f > 0\}$. Then $R_a f(s, \theta^*)$ is discontinuous at $s = s^*$.

Proof. We follow the same method as the proof of Lemma 2, using the notation $t_i^\pm$ and $\phi_i^\pm$ as before. The difference here is that some of these $t_i^\pm(s)$ will converge to the endpoints $t_-$ and $t_+$ of the line segment as $s \to 0^\pm$. These will lead to a jump in $Da((x,s), \theta^*)$, given by (8), at $s = 0$ when $x < t_+$. This jump will then lead to a jump in $R_a f$ if $f$ satisfies the given hypothesis.

We next proceed to the statement and proof of Theorem 2 as well as some related results. In this section we state and prove Theorem 2.
Theorem 2.
Suppose that $a$ is nicely multi-bang (see Definition 4) and $f \in C_c^1(\mathbb{R}^2)$ is non-negative. Also assume that

1. for all $x \in P_{a,1}$ the line tangent to a boundary at $x$ passes through the set $\{f > 0\}$, and

2. for all $x \in P_{a,2}$ there is a line passing through $x$ that also passes through the set $\{f > 0\}$.

Then $a$ and $f$ are uniquely determined by $R_a f$.

We prove Theorem 2 throughout this section in a series of lemmas. The initial step in the proof of Theorem 2 is to show that under the given hypotheses we can determine the set of points where $a$ jumps. We will do this now.

Lemma 6.
Assume the same hypotheses as Theorem 2. Then we can determine from R_a f the sets C_j appearing in Definition 4 for a.

Proof. We first note that by Lemmas 1 and 2 and Corollary 1, the set of (s∗, θ∗) such that ∂_s R_a f(s, θ∗) is not bounded for s near s∗ gives the set of lines which are tangent to some boundary ∂C_j, possibly missing some of the lines which intersect a boundary in a line segment. We can get rid of all of the (s∗, θ∗) corresponding to lines which intersect a boundary ∂C_j in a line segment by looking at the continuity of R_a f(s, θ∗) near s∗ and using Lemma 5. Thus we can determine the set of (s∗, θ∗) such that the corresponding lines are tangent to a boundary ∂C_j at some point, and since the C_j are nested convex sets the point of tangency along each such line must be unique. We can determine the point of tangency along each line that is tangent at a point where the curvature of ∂C_j is not zero using Theorem 1, or we can determine that the curvature at the point of tangency is zero using Corollary 1. Thus we can identify all points in the boundaries of the C_j at which the curvature of ∂C_j is not zero. Next we show that we can also find the corners of the boundaries ∂C_j.

By Lemma 2 and the hypotheses, for every corner point x of some ∂C_j there will be infinitely many lines passing through x such that, for the (s∗, θ∗) corresponding to these lines, ∂_s R_a f(s, θ∗) is bounded but has a jump at s = s∗. This allows us to determine the corner points, and combining this with the previous paragraph we see that we can determine P_a from R_a f under the given hypotheses. We next show that this is sufficient to determine all of the C_j.

By Lemma 7, which we will prove next, the closure of the convex hull of P_a is equal to the closure of C₁. Therefore we can determine C₁. The rest of the sets C_j can now be determined inductively. Indeed, suppose that we know C_l for all l < j.
Then by Lemma 7 again

C_j = conhull( P_a \ ⋃_{l=1}^{j−1} ∂C_l ),

and so we can determine C_j. This completes the proof.

The following geometric lemma was needed in the proof of Lemma 6.

Lemma 7.
Suppose that C ⊂ ℝ² is closed, convex, bounded and has smooth boundary, possibly with corners. Also let P be the subset of ∂C consisting of points which are either corners of ∂C, or at which ∂C has nonzero curvature. Then C = conhull(P), where conhull(P) is the convex hull of P.

Proof. Since C is closed and convex, and P ⊂ C, we have conhull(P) ⊂ C. Thus it only remains to show the opposite inclusion. Suppose that x ∈ ∂C \ conhull(P). Then there must be a neighbourhood U of x such that U ∩ ∂C does not intersect P. Therefore the curvature of ∂C is zero at all points in U ∩ ∂C, and since x is also not a corner of ∂C, this implies that U ∩ ∂C must contain a line segment containing x in its relative interior. There must then be some maximally extended line segment containing x which is contained in ∂C. At least one of the endpoints of this maximal line segment must also be in ∂C \ conhull(P), since otherwise we would have x ∈ conhull(P) by convexity. However, this is a contradiction, since by the argument already given this endpoint would be in the relative interior of a line segment contained in ∂C \ conhull(P). Thus ∂C \ conhull(P) = ∅, which then implies the result.

Having proven in Lemma 6 that we can determine the sets C_j from R_a f under the hypotheses of Theorem 2, it remains to show that we can recover f and the jumps in a across each of the boundaries. For this we argue by induction, starting at the outermost region C₁ and continuing inward. First suppose that a and f are known everywhere outside of C_{j−1} for some j ≥
2. Then, since there must be at least one point x on the boundary ∂C_{j−1} in P_a, we can either use (10) or (12) to determine the jump in a across the boundary at x. Therefore we can determine a outside of C_j. To complete the induction step it then remains to show that we can determine f outside of C_j. For this we use the following lemma.

Lemma 8.
Suppose that f ∈ C_c(ℝ²), and a is nicely multi-bang with sets { C_j }_{j=1}^n, and let C₀ be an open ball centred at the origin that is sufficiently large so that C₁ ⋐ C₀ and supp(f) ⋐ C₀. Then, for j ≥ 1, f|_{C_{j−1} \ C_j} is uniquely determined if we know all of

1. R_a f,
2. the sets C_j,
3. a|_{ℝ² \ C_j}, and
4. f|_{ℝ² \ C_{j−1}}.

Proof. For this proof we will write (x, y) as Cartesian coordinates for points in ℝ². By translating and rotating as necessary, we assume W.L.O.G. that C_j is contained in the upper half-plane { y > 0 }, and show that we can then uniquely determine f restricted to the lower half-plane { y < 0 }. By translating to bring C_j arbitrarily close to { y = 0 } and rotating, this then shows we can determine f everywhere outside of C_j and so will complete the proof.

Having done the transformations described in the previous paragraph, we also assume that C_{j−1} ⊂ { y > −h }. Now choose ω > ε > 0 such that the parabola { y = εx² } lies entirely outside of C_j and the parabola { y = ωx² − h } lies entirely outside of C_{j−1}. It is possible to find such ω and ε since the C_j are all bounded. Now choose φ ∈ C^∞(ℝ²) such that φ(x, y) = 1 on C_{j−1} and φ(x, y) = 0 on the set { y < ωx² − h }. Then define ˜f = φf, so that ˜f has support contained in the set { y ≥ ωx² − h } and is such that ˜f|_{C_{j−1}} = f|_{C_{j−1}}. Also, supposing that a = c on C_{j−1} \ C_j, we set

˜a(x, y) = a(x, y) if (x, y) ∈ C_{j−1},
˜a(x, y) = c if (x, y) ∈ (ℝ² \ C_{j−1}) ∩ { y > εx² − h − 1 },
˜a(x, y) = 0 otherwise.

The setup described in the last few lines is illustrated in Figure 2. Our next step is to show that we can determine R_˜a ˜f(s, θ) if (s, θ) corresponds to a line contained in the set { y < εx² }, given the hypotheses of the lemma.
If this line does not pass through C_{j−1}, then there is no problem since we know ˜f and ˜a outside of C_{j−1}.

Figure 2: This illustrates the setup and some of the notation used in the proof of Lemma 8 (the figure shows the parabolas y = εx² − h, y = εx² and y = ωx² − h, the sets C_{j−1} and C_j, and a line y = kx + l). Note that we assume a = c in the region C_{j−1} \ C_j, and ˜a = c in the region above the lowest parabola translated downwards by 1 and outside of C_j; f is assumed to be known outside of C_{j−1}, and then ˜f is supported in the region above the middle parabola.

Suppose on the other hand that the line does pass through C_{j−1}, and let the two points of intersection between the line and ∂C_{j−1} be denoted t₁ < t₂ (note there will always be two such points by convexity, and these can be determined from the sets C_j). We then have

R_a f(s, θ) = ∫_{−∞}^{t₁} f(sθ⊥ + tθ) e^{−Da(sθ⊥ + tθ, θ)} dt + ∫_{t₁}^{t₂} f(sθ⊥ + tθ) e^{−Da(sθ⊥ + tθ, θ)} dt + ∫_{t₂}^{∞} f(sθ⊥ + tθ) e^{−Da(sθ⊥ + tθ, θ)} dt.

The first and third terms on the right side of the last equation only involve a|_{ℝ² \ C_j} and f|_{ℝ² \ C_{j−1}}, as well as t₁ and t₂, and thus are known functions of (s, θ) under the given hypotheses. We combine these together with R_a f into one function G(s, θ), and so, since also f|_{C_{j−1}} = ˜f|_{C_{j−1}}, we have

∫_{t₁}^{t₂} ˜f(sθ⊥ + tθ) e^{−Da(sθ⊥ + tθ, θ)} dt = G(s, θ)

where G is a function which can be determined from the known information. Next note that, for t ∈ (t₁, t₂), F = −Da(sθ⊥ + tθ, θ) + D˜a(sθ⊥ + tθ, θ) only depends on s and θ, and can be determined under the hypotheses. Therefore we have

∫_{t₁}^{t₂} ˜f(sθ⊥ + tθ) e^{−D˜a(sθ⊥ + tθ, θ)} dt = e^{−F(s, θ)} G(s, θ).

Finally, we can add back in the integrals of ˜f and ˜a from −∞ to t₁ and from t₂ to ∞, since these only involve ˜a|_{ℝ² \ C_j} and ˜f|_{ℝ² \ C_{j−1}}, which are assumed to be known. Doing this we see that R_˜a ˜f can be determined given the hypotheses.
The problem has now been reduced to determining ˜f|_{C_{j−1} ∩ { y < 0 }}.

Our final step is to change variables in order to reduce the problem to the one considered in [7, Theorem 3.1]. For this we consider only R_˜a ˜f(s, θ) for (s, θ) corresponding to lines contained in { y < εx² }. We reparametrise such lines using y = kx + l, where the slope k and intercept l replace θ and s respectively. We will also write

x₊(k, l) = ( k + √(k² + 4ε(l + h + 1)) ) / (2ε)

for the larger value of x at which { y = kx + l } intersects { y = εx² − h − 1 }. When we parametrise the lines in this way, the beam transform becomes

D˜a((x, kx + l), θ_k) = ∫_0^∞ ˜a(x + s, k(x + s) + l) √(1 + k²) ds = c (x₊(k, l) − x) √(1 + k²),

and so the AtRT becomes

R_˜a ˜f(k, l) = ∫_{−∞}^{∞} ˜f(x, kx + l) e^{c(x − x₊(k, l)) √(1 + k²)} √(1 + k²) dx.

We now introduce the new coordinates (z, w) defined by z = √ε x and w = y − εx² + h. With this change the region { εx² > y > εx² − h } becomes the strip { h > w > 0 }, and, abusing notation slightly by writing ˜f also for the same function in these coordinates, the AtRT becomes

R_˜a ˜f(k, l) = ∫_{−∞}^{∞} ˜f( z, −( z − k/(2√ε) )² + k²/(4ε) + l + h ) e^{c(z/√ε − x₊(k, l)) √(1 + k²)} √((1 + k²)/ε) dz

for any (k, l) corresponding to a line contained in { y < εx² } and passing through the region { y > εx² − h }. The uniqueness of ˜f in the region { y < εx² }, and therefore also of f in the same region, now follows from [7, Theorem 3.1], since the weight

e^{c(z/√ε − x₊(k, l)) √(1 + k²)} √((1 + k²)/ε)

is an analytic function of k/(2√ε) and z provided the imaginary part of k is sufficiently small.

Lemma 8 now allows us to complete the proof of Theorem 2. Indeed, the induction step is already proved as described just above the lemma.
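For clarity, the coordinate change can be checked directly. Along the line y = kx + l, with z = √ε x and w = y − εx² + h, completing the square gives

```latex
\begin{aligned}
w &= kx + l - \varepsilon x^{2} + h
   = \frac{k}{\sqrt{\varepsilon}}\, z - z^{2} + l + h \\
  &= -\Bigl(z - \frac{k}{2\sqrt{\varepsilon}}\Bigr)^{2}
     + \frac{k^{2}}{4\varepsilon} + l + h ,
\end{aligned}
```

so the second argument of ˜f in the reparametrised AtRT is a downward parabola in z, and the substitution dx = dz/√ε produces the Jacobian factor √((1 + k²)/ε).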
The base case is also included in Lemma 8, since we can determine the set C₀ as in the lemma by looking at the support of R_a f, and then we always know f|_{ℝ² \ C₀} = 0 and a|_{ℝ² \ C₁} = 0. This completes the proof of Theorem 2.

We now turn our attention to numerically recovering a and f from data. We begin by first outlining how we discretize the domain. Let Ω be the domain of interest, and split Ω into M square pixels of side length dx. We order the pixels lexicographically from the top left to the bottom right. We then assume that a and f are piecewise constant over each pixel. Recall that for an oriented line given by (s, θ) we define the AtRT via

R_a f(s, θ) = ∫_{−∞}^{∞} f(sθ⊥ + tθ) e^{−Da(sθ⊥ + tθ, θ)} dt (22)

where s is the signed closest approach to the origin and θ is a unit direction tangent to the line and giving the orientation. Since a and f are piecewise constant on the pixels, we can evaluate (22) exactly as follows. Let P be a list of the pixels passed, in the order in which they are passed, along the oriented line, and denote the length of P by N. Note that for this to be well-defined we need the ray to be oriented. Let K be the ordered set of t values which correspond to an intersection with an edge of a pixel in the grid, and let IT be the set of distances between adjacent entries in K. Using this notation we find

R_a f(s, θ) = Σ_{i=1}^{N} f_{P(i)} IT(i) e^{−IT(i) a_{P(i)} / 2} sinhc( IT(i) a_{P(i)} / 2 ) S(i), (23)

where a_{P(i)}, f_{P(i)} are the values of a and f in the P(i)th pixel,

sinhc(z) = sinh(z)/z if z ≠ 0, and 1 if z = 0, (24)

and S(N) = 1, S(i −
1) = S(i) e^{−IT(i) a_{P(i)}}. This allows us to rewrite the AtRT as a vector equation involving a and f. If we are given a data vector d for a set I of oriented lines (s_i, θ_i), i ∈ I, then we can combine all of these vector equations into a matrix equation

R[a] f = d. (25)

The discretised problem of interest is then to determine both a and f from d given by (25), where a is multi-bang with the admissible set A = { a₀, a₁, ..., a_n } known (note that for notational convenience we have reindexed the admissible values relative to Definition 1). We attempt to do this by solving the variational problem

argmin_{a, f} R(a, f) := ‖R[a]f − d‖² + α M(a) + λ TV(a) + η TV(f) (26)

where TV is a discrete version of the total variation and M, which will be described below, is used to enforce the multi-bang assumption. The known set of admissible attenuation values is A := { a₀, a₁, ..., a_n } with a₀ < a₁ < ... < a_n. Recent work in [10, 11] attempted to design a convex regularizer to promote multi-bang solutions when the admissible set is known. The original idea in [10] was to make a convex penalty with jumps in gradient at the admissible values. In this paper we instead use a modified, non-convex, version of the multi-bang penalty given by

M(a) := ∫_Ω m(a(x)) dx (27)

where

m(t) = (a_{i+1} − t)(t − a_i) if t ∈ [a_i, a_{i+1}], and ∞ otherwise. (28)

Compared to the convex multi-bang penalty from [10], this has the advantage of giving a proximal map which has the multi-bang values as stationary points. Note that in the discrete case we consider piecewise constant a, and so (27) is really a sum over pixels given by

M(a) := Σ_{i=1}^{M} m(a(i)).
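As a concrete illustration of (23), the exact AtRT along a single ray through piecewise constant pixels can be sketched as follows. This is a minimal sketch, not the paper's MATLAB implementation: the pixel values and intersection lengths IT(i) are passed in as plain arrays, ordered along the ray.

```python
import numpy as np

def sinhc(z):
    # sinh(z)/z with the removable singularity filled in, as in (24)
    return 1.0 if z == 0 else np.sinh(z) / z

def atrt_along_ray(f_vals, a_vals, lengths):
    """Exact AtRT (23) along one oriented ray.

    f_vals, a_vals: source and attenuation values in the pixels P(1..N),
    in the order the ray passes through them; lengths: the intersection
    lengths IT(i).  Uses the cumulative attenuation factors
    S(N) = 1, S(i-1) = S(i) * exp(-IT(i) * a_P(i)).
    """
    N = len(lengths)
    S = np.ones(N)
    for i in range(N - 1, 0, -1):   # fill S(i-1) from S(i)
        S[i - 1] = S[i] * np.exp(-lengths[i] * a_vals[i])
    total = 0.0
    for i in range(N):
        z = lengths[i] * a_vals[i] / 2.0
        total += f_vals[i] * lengths[i] * np.exp(-z) * sinhc(z) * S[i]
    return total
```

The per-pixel factor IT e^{−IT a/2} sinhc(IT a/2) equals (1 − e^{−IT a})/a, i.e. the exact integral of the exponential attenuation across one pixel, so the sum reproduces (22) exactly for piecewise constant a and f.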
(29)

Strictly speaking, (29) should have a factor of dx² in front of the summation, but this gets absorbed by the regularization parameter α and so we omit it.

Although the regularizer (29) promotes multi-bang solutions, it provides no spatial regularity, and so we also include total variation [21] regularization as a joint regularizer. Total variation has been widely studied and is well known to promote piecewise constant images with small perimeter [21, 10]. This combination, at least numerically, allows us to significantly reduce the number of projections required to obtain a good reconstruction. For practical implementation we use a smoothed version of the isotropic total variation [21]

TV_c(a) = Σ_{i=1}^{M−1} √( ‖D_i a‖² + c ), (30)

where c > 0 and D_i ∈ ℝ^{2×M} is a finite difference matrix satisfying

D_i a = ( a(i) − a(i+1), a(i) − a(i+√M) )ᵀ if 1 ≤ i ≤ M − √M and mod(i, √M) ≠ 0,
D_i a = ( 0, a(i) − a(i+√M) )ᵀ if 1 ≤ i ≤ M − √M and mod(i, √M) = 0,
D_i a = ( a(i) − a(i+1), 0 )ᵀ if M − √M + 1 ≤ i ≤ M − 1, (31)

where √M is the number of pixels per row of the grid. Note that the smoothness of the total variation is required in order to guarantee global Lipschitz continuity of its gradient.

At this point we would like to mention that although (30) is convex and the non-convex multi-bang regularizer is weakly convex, we still have to be careful with the data fidelity term. It can be shown that for sufficiently large a, ‖R[a]f − d‖² may be non-convex. Because of this we use the following alternating minimization scheme [1] designed for non-convex objective functions:

a^{k+1} ∈ argmin_a R(a, f^k) + (1/(2ξ_k)) ‖a − a^k‖²,
f^{k+1} ∈ argmin_f R(a^{k+1}, f) + (1/(2ξ_k)) ‖f − f^k‖², (32)

for sufficiently small { ξ_k }_{k=1}^∞. We first turn our attention to the a update.
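The smoothed total variation (30) with the forward differences (31), together with its gradient, might be implemented along the following lines. This is a sketch under our own conventions (lexicographic flattening of an m-by-m grid, a hypothetical helper name `smoothed_tv_and_grad`); the gradient can be verified against finite differences.

```python
import numpy as np

def smoothed_tv_and_grad(a, m, c):
    """Smoothed isotropic TV (30) and its gradient on an m-by-m grid.

    a: image flattened row by row; forward differences are taken to the
    right neighbour and the pixel below, zero-padded on the last
    column / row as in (31).
    """
    img = a.reshape(m, m)
    dx = np.zeros((m, m))
    dy = np.zeros((m, m))
    dx[:, :-1] = img[:, :-1] - img[:, 1:]   # a(i) - a(i+1)
    dy[:-1, :] = img[:-1, :] - img[1:, :]   # a(i) - a(i+sqrt(M))
    r = np.sqrt(dx**2 + dy**2 + c)          # sqrt(||D_i a||^2 + c)
    val = r.sum() - np.sqrt(c)              # the sum in (30) runs to M-1
    g = np.zeros((m, m))
    g += (dx + dy) / r                      # each r_i depends on a(i) with slope +1
    g[:, 1:] -= (dx / r)[:, :-1]            # ...and on its right neighbour with slope -1
    g[1:, :] -= (dy / r)[:-1, :]            # ...and on the pixel below with slope -1
    return val, g.ravel()
```

Since every r_i is bounded below by √c, both the value and the gradient are smooth in a, which is exactly the global Lipschitz property the smoothing is meant to provide.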
Since we only concern ourselves with the parts of the objective function R(a, f) involving a, the a update in (32) is equivalent to

a^{k+1} ∈ argmin_a ‖R[a]f^k − d‖² + α M(a) + λ TV_c(a) + (1/(2ξ_k)) ‖a − a^k‖². (33)

For the purpose of solving this optimisation problem we introduce two auxiliary variables: x, corresponding to a itself, and y, corresponding to the discrete derivative of a. These two parts are linked by the matrix equation

Dx = y

where D is the finite difference matrix obtained by stacking all the D_i defined by (31) on top of each other. We further split y into a series of 2 by 1 column vectors y_i which are linked to x by the matrix equations D_i x = y_i. Therefore, we can rewrite (33) as

a^{k+1} ∈ argmin_x ‖R[x]f^k − d‖² + α M(x) + λ Σ_{i=1}^{M−1} √( ‖y_i‖² + c ) + (1/(2ξ_k)) ‖x − a^k‖², subject to Dx = y. (34)

A standard algorithm for solving an optimization problem in the form of (34) is the Alternating Direction Method of Multipliers (ADMM) [5]. In this case the augmented Lagrangian [5] is given by

L_β(x, y, μ) = Σ_{i=1}^{M−1} ( λ √( ‖y_i‖² + c ) − μ_iᵀ(y_i − D_i x) + (β/2) ‖y_i − D_i x‖² ) + ‖R[x]f^k − d‖² + α M(x) + (1/(2ξ_k)) ‖x − a^k‖², (35)

where β > 0 and the μ_i are Lagrange multipliers related to the y_i. We also define a vector μ given by placing the μ_i related to y_i in the same positions in μ as the corresponding y_i are in y. The ADMM algorithm for solving (33) then proceeds as follows:

x^{l+1} = argmin_x L_β(x, y^l, μ^l),
y^{l+1} = argmin_y L_β(x^{l+1}, y, μ^l),
μ^{l+1} = μ^l + β(y^{l+1} − Dx^{l+1}).
Removing terms not involving x, we see that the x update for x^{l+1} can be calculated using the first order optimality condition

0 ∈ ∂_x { ‖R[x^{l+1}]f − d‖² + (β/2) ‖y^l − Dx^{l+1}‖² + (1/(2ξ_l)) ‖x^{l+1} − a^k‖² − μᵀ(y^l − Dx^{l+1}) + α M(x^{l+1}) },

that is,

0 ∈ ∇_x( ‖R[x^{l+1}]f − d‖² ) + β Dᵀ(Dx^{l+1} − y^l) + Dᵀμ + (1/ξ_l)(x^{l+1} − a^k) + α ∂M(x^{l+1}).
Figure 3 shows the proximal map of the weakly convex multi-bang penalty m(x). In the optimality condition above, ∇_x( ‖R[x^{l+1}]f − d‖² ) is determined from (23) as in [25]. Note that since the non-convex multi-bang regularizer is separable, we have the elementwise optimality condition

0 ∈ ∇_x( ‖R[x^{l+1}]f − d‖² )(i) + β Dᵀ(Dx^{l+1} − y^l)(i) + Dᵀμ(i) + (1/ξ_l)( x^{l+1}(i) − a^k(i) ) + α ∂m( x^{l+1}(i) ). (36)

Now, ∇_x( ‖R[x^{l+1}]f − d‖² ) + β Dᵀ(Dx^{l+1} − y^l) + Dᵀμ + (1/ξ_l)(x^{l+1} − a^k) is Lipschitz continuous. Furthermore, as the pointwise multi-bang regularizer is weakly convex, by [3] the pointwise multi-bang regularizer admits a well-defined proximal map

prox_{tαm}(x) = a₀ if x ≤ x_{0,+},
prox_{tαm}(x) = a_i if x_{i,−} ≤ x ≤ x_{i,+} for i ∈ { 1, 2, ..., n−1 },
prox_{tαm}(x) = a_n if x_{n,−} ≤ x,
prox_{tαm}(x) = (1 − 2αt)^{−1} ( x − αt(a_{i+1} + a_i) ) if x_{i,+} < x < x_{i+1,−} for i ∈ { 0, 1, ..., n−1 },

where

x_{i,−} = a_i − αt(a_i − a_{i−1}) for i = 1, ..., n,
x_{i,+} = a_i + αt(a_{i+1} − a_i) for i = 0, ..., n−1,

A = { a₀, a₁, ..., a_n } is the admissible set and 1/2 > αt >
0. Figure 3 gives an example of the proximal map for the weakly convex multi-bang regularizer for an admissible set A of five values. Note in particular that prox_{tαm}(a_i) = a_i, which is in contrast to the convex case [10]. Using this we can find x^{l+1} satisfying the optimality condition (36) via a fixed point iteration such as ISTA or FISTA [4]. Indeed, provided 0 < αt < 1/2, by [4, 3] both ISTA and FISTA produce iterates which converge to a solution x^{l+1} of (36) to within any prescribed tolerance.

We now turn our attention to the y update. The first order optimality condition for the y update gives

0 = λ y_i / √( ‖y_i‖² + c ) − μ_i + β( y_i − D_i x^{l+1} ) (37)

for all i. Whilst this cannot be explicitly solved for y_i easily, we can make use of the gradient on the right hand side of (37) to solve the y update via gradient descent. Once we have updated all of the y_i in this way we can combine them to update y. Finally, since ‖R[a]f^k − d‖² + α M(a) + λ TV(a) is lower semi-continuous and Σ_{i=1}^{M−1} λ √( ‖y_i‖² + c ) has Lipschitz continuous gradient, [17] gives convergence of ADMM to a critical point; that is, both the primal residual r^l := y^{l+1} − Dx^{l+1} and the dual residual s^l := β Dᵀ(y^{l+1} − y^l) converge. Numerically we can speed up the rate of convergence of the ADMM algorithm by having an adaptive β. We use the following scheme from [5]: pick β₀ > 0 and for l ≥ 0 set

β_{l+1} := τ₊ β_l if ‖r^l‖ > ν ‖s^l‖,
β_{l+1} := β_l / τ₋ if ‖s^l‖ > ν ‖r^l‖,
β_{l+1} := β_l otherwise, (38)

for some chosen positive τ± and ν. This completes the a update section of the numerical method. We now turn our attention to updating f.
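The piecewise proximal map used in the x update above is straightforward to implement elementwise. The sketch below (with the hypothetical helper name `prox_multibang`) assumes the reindexed admissible values a₀ < ⋯ < a_n and 0 < αt < 1/2; the thresholds x_{i,±} make the map flat at each admissible value and affine in between.

```python
import numpy as np

def prox_multibang(x, A, alpha_t):
    """Proximal map of the weakly convex multi-bang penalty at a scalar x.

    A: sorted admissible values a_0 < ... < a_n;
    alpha_t: the product alpha * t, required to satisfy 0 < alpha_t < 1/2.
    """
    n = len(A) - 1
    if x <= A[0] + alpha_t * (A[1] - A[0]):        # x <= x_{0,+}
        return A[0]
    if x >= A[n] - alpha_t * (A[n] - A[n - 1]):    # x >= x_{n,-}
        return A[n]
    for i in range(1, n):
        lo = A[i] - alpha_t * (A[i] - A[i - 1])    # x_{i,-}
        hi = A[i] + alpha_t * (A[i + 1] - A[i])    # x_{i,+}
        if lo <= x <= hi:
            return A[i]
    # otherwise x lies strictly between x_{i,+} and x_{i+1,-} for some i
    i = int(np.searchsorted(A, x)) - 1             # interval [a_i, a_{i+1}] containing x
    return (x - alpha_t * (A[i + 1] + A[i])) / (1.0 - 2.0 * alpha_t)
```

Each admissible value is a fixed point of the map, which is the property singled out in the text as the advantage over the convex penalty of [10].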
Removing terms not involving f, the f update satisfies

f^{k+1} ∈ argmin_f ‖R[a^{k+1}]f − d‖² + η Σ_{i=1}^{M−1} √( ‖D_i f‖² + c ) + (1/(2ξ_k)) ‖f − f^k‖², (39)

again where c > 0. We compute f^{k+1} by solving this problem via ADMM [5, 17] in a similar way to the a update. However, it is simpler here because we do not have the multi-bang regularization term. The method is again proven to converge to a critical point. With both the a and f updates dealt with, we are ready to outline the joint reconstruction algorithm.

Algorithm 1
Joint reconstruction algorithm.

Input: a⁰ as initial guess, step sizes t, β₀, tolerances δ₁, δ₂, δ₃, δ₄, δ₅ and regularization parameters α, λ and η.
Set f⁰ to be the least squares solution of ‖R[a⁰]f − d‖².
for k ≥ 0 do
    Set x⁰ = a^k and y⁰ = Dx⁰.
    for l ≥ 0 do
        Update x^{l+1} via ISTA or FISTA with δ₁ as a tolerance on ‖x^{l+1} − x^l‖.
        Update y^{l+1} via gradient descent on (37).
        Set μ^{l+1} = μ^l + β_l(y^{l+1} − Dx^{l+1}).
        Update β_{l+1} via (38).
        Terminate when ‖r^l‖ < δ₂ and ‖s^l‖ < δ₃, and output a^{k+1} = x^{l+1}.
    Update f^{k+1} via (39) using ADMM with tolerance δ₄.
Terminate when ‖a^{k+1} − a^k‖ < δ₅ and ‖f^{k+1} − f^k‖ < δ₅.

We point out that in this algorithm β is reset to the same initialised value whenever the inner iterations aimed at the a update in (32) (those indexed by l) restart. With the numerical method outlined, we now present some numerical results.

Throughout this section we produce data on a 340 by 340 pixel grid and reconstruct on a 200 by 200 grid to avoid inverse crime. All of the following examples have 5% added Gaussian white noise and were performed on a standard 4 core laptop using MATLAB. Note that much of the computational time is spent computing and recomputing the matrix representation of R[a] when a is updated, and many of the steps in this reconstruction can be done using parallel computing toolboxes. Unless otherwise stated, the following reconstructions use 12 parallel ray projections which are equally spaced with some small perturbation to make the angles irrationally related (i.e. unless otherwise stated we only use data with 12 different values of θ).
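The adaptive penalty update (38) used inside Algorithm 1 is simple to state in code. The default values τ₊ = τ₋ = 2 and ν = 10 below are common choices from the ADMM literature, not values fixed by this paper.

```python
def update_beta(beta, r_norm, s_norm, tau_inc=2.0, tau_dec=2.0, nu=10.0):
    """Residual-balancing update (38) for the ADMM penalty parameter.

    Increases beta when the primal residual ||r|| dominates, decreases it
    when the dual residual ||s|| dominates, and otherwise leaves it alone.
    """
    if r_norm > nu * s_norm:
        return tau_inc * beta      # primal residual too large: penalise harder
    if s_norm > nu * r_norm:
        return beta / tau_dec      # dual residual too large: relax the penalty
    return beta
```

Keeping the two residuals within a factor ν of each other is what makes the scheme a practical convergence accelerator here, since β trades off progress in the y update against progress in the multiplier update.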
Irrationally related angles have been shown to reduce the number of projections required to obtain good reconstructions [9, 30]. For 12 projections the simultaneous reconstruction algorithm takes approximately 8 minutes. There are a large number of parameters to control, which gives good flexibility but does require extensive parameter tuning in order to obtain optimal results. In the following examples we use initial guesses where a is constant. In practice convergence is obtained for all tested phantoms for any constant initial guess of a, provided the constant value lies between a₀ and a_n in the admissible set. We therefore use the initial guess a = 0 for all numerical results presented here. The last general comment we make is that if we set ξ = ∞, effectively removing the added terms ‖a − a^k‖² and ‖f − f^k‖² from (32), we still obtain convergence. In many cases removing this part improves the speed of convergence, although the theoretical proof of convergence does not hold in this case. Throughout the section we fix ξ = 50 and set all tolerances δ_i to the same small value.

Figure 4 shows reconstructions obtained via an optimized parameter joint TV and multi-bang regularizer algorithm against those obtained purely by a TV approach. The left hand column gives the true phantoms for a and f; a is binary so here A = { 0, 1 }. The middle column shows the joint reconstruction for a and f with both TV and multi-bang regularization. Here the step sizes are t = 10 and β = 0.
1. The regularization parameters are a tuned α and λ = η = 0.
1. The reconstruction for a is multi-bang with the overall structure very well recovered. This is also seen in the reconstruction for f. We note that an L1 regularizer could have been used in place of TV on f. In practice, however, TV performs much more favourably in removing cross-talk artefacts which are common in these types of joint reconstructions [7, 24, 16]. The right hand column is an optimized parameter reconstruction obtained using just TV with 80 projections. In this case both the binary nature and the structure of a are lost, even with the extra data. This can also be seen in the recovery of f; the structure is roughly recovered but there are several ring artefacts appearing in the reconstruction.

Figure 5 shows the effect of the number of projections on reconstruction quality. Here the phantom is made up of 3 regions, with an admissible set A of three values. The left column shows the true phantoms for a and f. The middle column is an optimized reconstruction using 6 projections with a tuned α and λ = η = 0.
05. The right column shows an optimized reconstruction using 12 projections with a tuned α and λ = 0.
05 and η = 0 .
15. In both reconstructions for a we obtain multi-bang solutions. The middle column shows a poor recovery of the structure of a and f. The recovered a has a lot of misclassification and has been unable to separate the regions. The inaccuracies in a have an impact on the recovery of f, with the outermost regions of f being poorly recovered. The rightmost column is a very good recovery of both a and f, with just a small section on the left bracket being misclassified. The matching f is also very well recovered. Although the smoothed TV regularization removes cross-talk artefacts, it is important not to use too large an η, as this removes the continuous nature of f at the edges. We also remark that there is no significant improvement in the quality of the reconstruction if we increase the number of projections further.

Figure 6 shows a plot of the proportion of pixels taking admissible values against outer iteration number (that is, k in Algorithm 1) for the rightmost reconstruction of a in Figure 5. The initial guess is a constant at 0, which is why the initial proportion is 1. The proportion increases monotonically after about 20 iterations, with some large jumps before this point. These typically line up with β_{l+1} being either increased or decreased in the inner iterations. This graph is typical for the reconstructions presented here and suggests that another suitable stopping criterion would be to terminate after a certain proportion of admissible values is reached. Typically a convergent reconstruction has a multi-bang proportion of over 0.95 by the time the algorithm is terminated by the step size tolerances.

Figure 7 shows the effect of shrinking the size of the support of f on reconstruction. This is linked to the proof of uniqueness from Theorems 1 and 2, where we require f to be non-zero on a sufficient number of rays tangent to the jumps in a. Here the true a is a multi-bang version of the Shepp-Logan phantom, with an admissible set A containing five values. The step sizes are t = 0.
075 and a tuned β; the regularization parameters are a tuned α, λ = 0.
05 and η = 0 .
15 for eachreconstruction. The top row shows reconstructions of a with fixed and known true f in the bottomrow. The reconstruction algorithm here is then performed by simply performing one update for a . The rightmost reconstruction captures the shape and classifies almost all pixels correctly; mostimportantly it captures the smallest regions well. Note here that as in Theorem 1 and 2 f hasa larger support than a . It is possible to obtain reconstructions similar to that of the rightmostrecovery for f which have a slightly smaller support than a . The 2nd column from the right alsohas a decent shape recovery but has lost some of the finer features such as the ellipses at thebottom. The 2nd column from the left does a poor job of the recovery of a with very few pixelscorrectly classified as 1. The high valued outer ellipse is lost and all the finer details are missing.This is a similar effect to reducing the number of projections, which could be expected as reducing f ).20igure 8: Numerical reconstruction of walnut phantom for a with 30 projections f leads to fewer rays contributing data per projection. We can however still see some detail outsideof the support of f , this is due to TV being able to fill in the gaps and extend our visibility. Theangled straight edges in the reconstruction are related to the angles of the projections in the dataset.The final numerical result we present in Figure 8 is obtained using an image of a walnutreconstructed from CT data by the Finnish Inverse Problems Society [18] as the attenuation map.The walnut phantom for a is binary and so A = { , } . Here the step sizes are t = 0 .
05 and a tuned β; the regularization parameters are tuned α and λ, with η = 0.
15. The left handcolumn is the true a and f and the right hand column the reconstruction. In general the largerstructures of a are recovered but the finer details are lost. This is still true even if the numberof projections is upped significantly. This is in part due to the detail being compressed goingfrom 340 by 340 to 200 by 200 pixels and the TV regularizer eliminating the smallest non-zeroregions. The larger sections are well-recovered and the outermost boundary is very well classified.The reconstruction for f is again good with the areas towards the boundaries being the areasmost affected by the errors in a . There are no cross-talk artefacts present, even with the morecomplicated a . In this paper we have presented and proved two theorems on the identification problem for SPECTinvolving multi-bang attenuation. In particular, for nicely multi-bang a and f ∈ C c ( R ) non-negative with sufficiently large support we have shown uniqueness of joint recovery for the AtRT.The method of proof for these theorems gives possible methods to produce similar results withfurther relaxed conditions on a and f , and we intend to investigate this in future work. n the numerical side, we have formulated a variational problem including a weakly convex ver-sion of the convex multi-bang regularizer[10, 11] and a smoothed total variation, and presented analgorithm for simultaneous recovery of a and f from the resulting variational problem (26). Usingan alternating direction approach for non-convex objective functions [1] coupled with ADMM[5]we are able to successfully solve these variational problems. The addition of a joint multi-bang andtotal variation regularizer has produced good results for joint recovery with projection numberssimilar to those used in the X-Ray recovery case when the image is known to have only a finitenumber of values [9, 30]. 
The apparent convergence of the algorithm even in the case ξ_k = ∞ in all cases we have investigated, and also independently of the smoothing parameter c for the total variation, presents theoretical questions for future work. Also, even though the variational problems on which our method is based are non-convex, our algorithm consistently converges to a reasonable approximation of the correct solution. This suggests we may actually be finding the global minimiser, or be getting close to the global minimiser, and there is potential for future research investigating whether this is indeed correct.

Finally, the numerical examples shown in Figure 7 indicate that we are able to obtain good recovery for a even when the support of f is smaller than required in our theoretical results. We suspect that the total variation may be playing a role in filling in the boundaries of regions of constant a, and would like to investigate this further as well.

References

[1] Hédy Attouch, Jérôme Bolte, Patrick Redont, and Antoine Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality.
Mathematics of Operations Research, 35(2):438–457, May 2010.

[2] Guillaume Bal and Alexandre Jollivet. Combined source and attenuation reconstructions in SPECT. In
Tomography and Inverse Transport Theory, volume 559 of Contemporary Mathematics, pages 13–28. American Mathematical Society, 2011.

[3] I. Bayram. On the convergence of the iterative shrinkage/thresholding algorithm with a weakly convex penalty. IEEE Transactions on Signal Processing, 64(6):1597–1608, 2016.

[4] Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.

[5] Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2010.

[6] A. A. Bukhgeim and S. G. Kazantsev. Inversion formula for the fan-beam attenuated Radon transform in a unit disk. Russian Academy of Science Siberian Branch: The Sobolev Institute of Mathematics, Preprint No. 99, 2002.

[7] A. L. Bukhgeim. Introduction to the Theory of Inverse Problems. Number 19 in Inverse and Ill-Posed Problems. V.S.P. International Science Publishers, 2000.

[8] A. L. Bukhgeim. Inverse gravimetry approach to attenuated tomography. In Tomography and Inverse Transport Theory, volume 559 of Contemporary Mathematics. American Mathematical Society, 2011.

[9] Guang-Hong Chen, Jie Tang, and Shuai Leng. Prior image constrained compressed sensing (PICCS): A method to accurately reconstruct dynamic CT images from highly undersampled projection data sets. Medical Physics, 35(2):660–663, Jan 2008.

[10] Christian Clason, Florian Kruse, and Karl Kunisch. Total variation regularization of multi-material topology optimization. ESAIM: Mathematical Modelling and Numerical Analysis, 52(1):275–303, Jan 2018.

[11] Christian Clason and Karl Kunisch. A convex analysis approach to multi-material topology optimization. ESAIM: Mathematical Modelling and Numerical Analysis, 50(6):1917–1936, Nov 2016.

[12] M. Courdurier, F. Monard, A. Osses, and F. Romero. Simultaneous source and attenuation reconstruction in SPECT using ballistic and single scattering data. Inverse Problems, 31, 2015.

[13] E. Cueva, A. Osses, J. C. Quintana, C. Tejos, M. Courdurier, and P. Irarrazaval. Algebraic reconstruction of source and attenuation in SPECT using first scattering measurements. In B. Hofmann, A. Leitão, and J. Zubelli, editors,
New Trends in Parameter Identification forMathematical Models . Birkh¨auser, Cham, 2018.[14] Bela Frigyik, Plamen Stefanov, and Gunther Uhlmann. The X-ray transform for a genericfamily of curves and weights.
Journal of Geometric Analysis , 18(89), 2008.[15] D. Gourion, D. Noll, P. Gantet, A. Celler, and J.P. Esquerre. Attenuation correction usingSPECT emission data only.
IEEE Transactions on Nuclear Science , 49(5):2172–2179, 2002.[16] Daniel Gourion and Dominikus Noll. The inverse problem of emission tomography.
InverseProblems , 18(5):1435–1460, Sep 2002.[17] K. Guo, D. R. Han, and T. T. Wu. Convergence of alternating direction method for minimizingsum of two nonconvex functions with linear constraints.
International Journal of ComputerMathematics , 94(8):1653–1669, Sep 2016.[18] Keijo H¨am¨al¨ainen, Lauri Harhanen, Aki Kallonen, Antti Kujanp¨a¨a, Esa Niemi, and SamuliSiltanen. Tomographic x-ray data of a walnut. arXiv:1502.04064, February 2015.[19] Qiu Huang, Gengsheng L. Zeng, and Jiansheng Wu. An alternative proof of Bukhgeimand Kazantsev’s inversion formula for attenuated fan-beam projections.
Medical Physics ,33(11):3983–3987, Oct 2006.[20] F. Natterer.
The Mathematics of Computerized Tomography . Society for Industrial andApplied Mathematics, 2001.[21] Deanna. Needell and Rachel. Ward. Stable image reconstruction using total variation mini-mization.
SIAM Journal on Imaging Sciences , 6(2):1035–1058, 2013.[22] R. G. Novikov. On the range characterization for the two-dimensional attenuated X-raytransformation.
Inverse Problems , 18(3):677–700, Apr 2002.[23] Roman G. Novikov. An inversion formula for the attenuated X-ray transformation.
Arkiv f¨orMatematik , 40(1):145–167, 2002.[24] Eric Todd Quinto. The invertibility of rotation invariant radon transforms.
Journal of Math-ematical Analysis and Applications , 91(2):510 – 522, 1983.[25] Philip Richardson. Applications of multi-bang regularisation in passive radiation emissiontomography. Master’s thesis, University of Manchester, 2017.[26] Donald C. Solmon. The identification problem for the exponential Radon transform.
Mathe-matical Methods in the Applied Sciences , 18(9):687–695, Jul 1995.[27] Plamen Stefanov. The identification problem for the attenuated X-ray transform.
AmericanJournal of Mathematics , 136(5):1215–1247, 2014.[28] Benjamin M. W. Tsui, Hong-Bin Hu, and David R. Gilland. Implementation of simultaneousattenuation and detector response correction in SPECT.
IEEE Transactions on NuclearScience , 35(1):778–783, 1988.[29] Andy Welch, Rolf Clack, Frank Natterer, and Grant Gullberg. Toward accurate attenuationcorrection in SPECT without transmission measurements.
IEEE Transactions on MedicalImaging , 16(5):532–541, 1997.[30] X. Zhuge, W. J. Palenstijn, and K. J. Batenburg. TVR-DART: A more robust algorithmfor discrete tomography from limited projection data with automated gray value estimation.
IEEE Transactions on Image Processing , 25(1):455–468, 2016., 25(1):455–468, 2016.