Multi-Regularization Reconstruction of One-Dimensional T_2 Distributions in Magnetic Resonance Relaxometry with a Gaussian Basis
Chuan Bi, Miao-jung Yvonne Ou, Wenshu Qian, Kenneth W. Fishbein, Mustapha Bouhrara, Richard G. Spencer
National Institute on Aging, Baltimore, MD 21224, U.S.A.
Department of Mathematical Sciences, University of Delaware, Newark, DE 19716
Abstract. We consider the inverse problem of recovering the probability distribution function of T_2 relaxation times from NMR transverse relaxometry experiments. This problem is a variant of the inverse Laplace transform and hence ill-posed. We cast it within the framework of a Gaussian mixture model to obtain a least-squares problem with an L_2 regularization term. We propose a new method for incorporating regularization into the solution: rather than seeking to replace the native problem with a suitable, mathematically close, regularized version, we instead augment the native formulation with regularization. We term this new approach "multi-regularization"; it avoids the treacherous process of selecting a single best regularization parameter λ and instead permits the incorporation of several degrees of regularization into the solution. We illustrate the method with extensive simulation results as well as an application to real experimental data.

UPDATED: March 3, 2021
1 Introduction

In the study of magnetic resonance (MR) and magnetic resonance imaging (MRI), the measurement and determination of T_2 relaxation times by magnetic resonance spectroscopy (MRS) is one of the main applications. Physically, the T_2 relaxation time is the time constant for the decay of transverse magnetization of a specific tissue at the atomic level. Since different materials possess unique T_2 values, the associated inverse problem is to recover those values from data acquired by noninvasive MR measurements. Measured signals usually exhibit mono- or multi-exponential decay, whose governing equation can be expressed as a weighted sum of exponential functions:

\[ y_{ob}(t) = \sum_{i=1}^{M} c_i\, e^{-t/T_{2,i}} \tag{1.1} \]

where c_i and T_{2,i} represent the characteristics (concentration and relaxation time) of the i-th material in the sample, and M denotes the pre-determined number of components.

As an alternative to the model in Eq. (1.1), a more general representation, in the form of a Fredholm integral equation of the first kind, is introduced:

\[ y_{ob}(t) = \int_{T_{2,\min}}^{T_{2,\max}} G(t, T_2)\, f(T_2)\, dT_2 \tag{1.2} \]

where G(t, T_2) = e^{-t/T_2} is called the kernel and f(T_2) ≥ 0 is the T_2 relaxation time distribution. Consequently, Eq. (1.1) can be seen as a special case of Eq. (1.2) in which the T_2 distribution f(T_2) is a sum of M Dirac delta functions at T_2 = T_{2,i}. Equation (1.2) can also be recognized as a truncated Laplace transform, whose inversion is a well-known ill-posed problem. The associated inverse problem is to determine the distribution f(T_2) given discrete measurements of y_{ob}(t).
In practice, the observable data y_{ob} ∈ R^m comes as discretized values, and the goal is to determine a discretized version of the T_2 distribution f_true ∈ R^n subject to the forward map

\[ y_{ob} = A f + n, \qquad f \ge 0 \tag{1.3} \]

where A ∈ R^{m×n} is the kernel matrix with entries A_{i,j} = e^{-t_i/T_{2,j}} ΔT_2, and n denotes additive noise. The positivity constraint f ≥ 0 is physically intrinsic: the T_2 distribution of any material is always non-negative. Inheriting the smoothing property of the integral operator in (1.2), A is a smoothing operator whose condition number can be so large that any direct inversion without regularization, such as the pseudo-inverse or non-negative least squares (NNLS), can be extremely unstable.

From the perspective of inverse problems, the determination of the discrete T_2 distribution f can be seen as a constrained version of the inverse Laplace transform, whose ill-posedness is often addressed by regularization. Tikhonov-type regularization is a powerful tool for stabilizing such ill-posedness: it minimizes a cost function involving the observable data and a regularization term controlled by a regularization parameter λ, which must be chosen appropriately to balance the model error against the solution norm. In magnetic resonance applications, the classical Tikhonov regularization for reconstructing f given A and y_{ob} reads

\[ f_\lambda = \operatorname*{argmin}_{f \ge 0} \left\{ \|Af - y_{ob}\|_2^2 + \lambda \|f\|_2^2 \right\}. \tag{1.4} \]

Again, the regularization parameter λ acts as a trade-off between the size of the regularized solution and the fit to the observable data.
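As a concrete illustration, problem (1.4) can be solved with a standard NNLS routine by stacking √λ·I under A and zeros under y; the grids and the toy distribution below are illustrative assumptions only, not the paper's actual settings.

```python
import numpy as np
from scipy.optimize import nnls

# Illustrative discretization (assumed values, not the paper's settings).
t = np.linspace(0.5, 400.0, 150)              # acquisition times t_i (ms)
T2 = np.linspace(1.0, 200.0, 200)             # relaxation-time nodes T_{2,j} (ms)
dT2 = T2[1] - T2[0]
A = np.exp(-t[:, None] / T2[None, :]) * dT2   # A_ij = exp(-t_i / T_{2,j}) * dT2

def tikhonov_nnls(A, y, lam):
    """Solve min_{f >= 0} ||A f - y||^2 + lam * ||f||^2 via augmented NNLS:
    stack sqrt(lam)*I under A and zeros under y."""
    n = A.shape[1]
    f, _ = nnls(np.vstack([A, np.sqrt(lam) * np.eye(n)]),
                np.concatenate([y, np.zeros(n)]))
    return f

# Toy ground truth: one narrow Gaussian peak, normalized as a distribution.
f_true = np.exp(-0.5 * ((T2 - 40.0) / 4.0) ** 2)
f_true /= f_true.sum() * dT2
y = A @ f_true
f_hat = tikhonov_nnls(A, y, lam=1e-4)
```

The augmented-system trick is standard: the NNLS solver sees a well-posed stacked problem whose objective is exactly the Tikhonov functional of (1.4).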
Additionally, prior knowledge of the solution f, such as sparsity or smoothness depending on the application, can be incorporated into the minimization problem by changing the regularization term from ||f||_2^2 to ||f||_p^p with p ≥ 1. The selection of λ has been studied for decades; however, there is no universal approach. Classical methods such as the L-curve [11, 10, 8, 9, 3], generalized cross-validation (GCV) [5, 19, 20], and the discrepancy principle (DP) [13, 14, 15] choose a single λ by certain criteria. In the L-curve method, the optimal λ lies near the corner of the L-shaped log-log plot of the solution norm against the residual norm; GCV selects λ at the minimum of the GCV function; DP seeks λ such that the residual norm is proportional to the norm of the noise by a fixed constant. All existing parameter selection methods seek only one optimal λ and its corresponding regularized solution, discarding all other solutions obtained with different values of λ. A survey of these methods can be found in [1]. Distinct from all current methods, we propose a new method, termed Multi-Regularization (Multi-Reg), in Section 2. The recovered approximation produced by Multi-Reg is formed by a linear combination of regularized solutions across a range of proposed λ's, as opposed to selecting one λ and its associated approximation.

1.3 Gaussian Mixture Model and Application to Determining T_2 Distributions
A Gaussian mixture model represents a probability density function as a weighted sum of Gaussian density functions [17]. Gaussian mixture models are often used in unsupervised learning, where a cluster of data is assumed to be governed by a Gaussian density g(μ_i, σ_i), with μ_i and σ_i the mean and standard deviation of a single one-dimensional Gaussian density. In the application to determining the one-dimensional T_2 relaxation distribution, representing the distribution f(T_2) as a weighted sum of Gaussian density functions means solving for the unknowns μ_i, σ_i, so that the problem of determining a T_2 distribution becomes a non-linear least squares problem:

\[ \{\sigma_i, \mu_i\}_{i=1}^{M} = \operatorname*{argmin}_{\{\sigma_i, \mu_i\}} \left\| y_{ob} - A \sum_{i=1}^{M} g(\sigma_i, \mu_i) \right\|_2^2 \tag{1.5} \]

where g(σ_i, μ_i) denotes the discretized Gaussian density with standard deviation σ_i and mean μ_i. This approach has the advantage of significantly reducing the number of unknowns [16]. However, this nonlinear formulation relies heavily on prior knowledge of how many Gaussian functions are needed to represent the final distribution, and it alters the intrinsic ill-posedness of the linear problem in Eq. (1.3); it is therefore beyond the scope of this paper.

Instead, it is still possible to treat the determination of the T_2 distribution under the Gaussian mixture assumption as simply a least squares problem, by providing a dictionary of Gaussian functions to choose from:

\[ c^* = \operatorname*{argmin}_{c \ge 0,\; \sum_i c_i = 1} \| y_{ob} - (AG)\, c \|_2^2, \tag{1.6} \]

where the columns of the matrix G, denoted {g_i}, represent different Gaussian distributions and c is the vector of coefficients assigned to each proposed distribution.
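A minimal sketch of the dictionary fit (1.6), under assumed grids and dictionary layout. The simplex constraint Σ_i c_i = 1 is imposed softly here through one heavily weighted extra row, so that a plain NNLS solver suffices; this soft handling is a common trick, not the paper's stated solver.

```python
import numpy as np
from scipy.optimize import nnls

# Illustrative T2 grid and Gaussian dictionary (assumed values).
T2 = np.linspace(1.0, 200.0, 200)
mus = np.linspace(5.0, 195.0, 40)             # dictionary means (assumed layout)
sigma = 6.0
G = np.exp(-0.5 * ((T2[:, None] - mus[None, :]) / sigma) ** 2)
G /= G.sum(axis=0, keepdims=True)             # columns g_i sum to 1

t = np.linspace(0.5, 400.0, 150)
A = np.exp(-t[:, None] / T2[None, :])

c_true = np.zeros(len(mus))                   # target: mixture of two atoms
c_true[5], c_true[25] = 0.3, 0.7
y = A @ (G @ c_true)

# Solve (1.6): min_{c >= 0, sum(c) = 1} ||y - (A G) c||^2, with the
# equality constraint enforced softly by a heavily weighted extra row.
w = 1e4
M_aug = np.vstack([A @ G, w * np.ones((1, len(mus)))])
y_aug = np.concatenate([y, [w]])
c_hat, _ = nnls(M_aug, y_aug)
```
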
The least squares problem then amounts to finding the coefficients of the proposed Gaussian distribution functions such that (1.6) is satisfied. The problem is usually severely ill-posed due to the non-orthogonality of the Gaussian functions that form the dictionary, and it can lead to extremely unstable solutions.

2 Multi-Regularization

In general, given an underlying function f_true ≥ 0, a common simulation procedure for testing the performance of a parameter selection method consists of the following steps:

1. Generate the noiseless observation: y_clean = A f_true.
2. Manually add noise with a pre-selected SNR: y_noisy = y_clean + n.
3. Solve the Tikhonov regularization problem (1.4) for different proposed λ_j to obtain the regularized solutions f_{λ_j}.
4. Choose one regularized solution f_{λ*} according to a given parameter selection method.
5. Compare the end result f_{λ*} with the underlying distribution f_true.

The common ground of all existing parameter selection methods is that the "optimal" parameter is chosen from a list of proposed values, and the associated regularized solution follows. Equivalently, viewed over all proposed regularized solutions, existing methods assign the coefficient vector [0, 0, ..., 0, 1, 0, ..., 0] to the candidates, where the 1 at index i indicates that the i-th regularized solution is selected. However, there is in general no best method of parameter selection: depending on the application, the various methods perform differently, so there is no guarantee that the i-th regularized solution selected by one conventional method is optimal, while other methods may return different choices.

To put the problem of parameter selection in a greater framework, we consider the end solution to be characterized by a vector of coefficients assigned to the different regularized solutions, no longer restricted to one-hot vectors of the form [0, 0, ..., 0, 1, 0, ..., 0]. Indeed, if f_true is given, we can certainly obtain a better recovery than classical parameter selection methods for (1.4), with a little help of "cheating", by solving for {α_j}_{j=1}^N that satisfy

\[ f_{true} \approx \sum_{j=1}^{N} \alpha_j f_{\lambda_j} \tag{2.1} \]

where {α_j}_{j=1}^N solve the least squares problem

\[ (\alpha_1, \alpha_2, \cdots, \alpha_N) = \operatorname*{argmin}_{\alpha \ge 0} \left\| f_{true} - \sum_{j=1}^{N} \alpha_j f_{\lambda_j} \right\|_2^2. \tag{2.2} \]

In other words, from the perspective of Eq. (2.2), all traditional parameter selection methods arrive at a single regularized solution f_J corresponding to the special choice of {α_j}_{j=1}^N with α_J = 1 for some J and α_j = 0 for all j ≠ J.
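The "cheating" combination (2.2) is an ordinary NNLS fit of f_true against the stacked regularized solutions. The sketch below uses synthetic stand-ins for f_true and the columns f_{λ_j} (in practice they come from Tikhonov solves); since every one-hot vector e_J is a feasible point, the combined solution can never have a larger residual than the best single-λ choice.

```python
import numpy as np
from scipy.optimize import nnls

# Synthetic stand-ins: f_true and N distorted copies acting as f_{lambda_j}.
rng = np.random.default_rng(0)
n, N = 200, 12
f_true = rng.random(n)
F = np.column_stack([np.clip(s * f_true + rng.normal(0, 0.05, n), 0, None)
                     for s in np.linspace(0.5, 1.5, N)])

alpha, _ = nnls(F, f_true)                    # (2.2): alpha >= 0
f_oracle = F @ alpha

# A single-lambda method corresponds to the one-hot choice alpha = e_J;
# every e_J is feasible for (2.2), so the combined fit is at least as good.
best_single = min(np.linalg.norm(f_true - F[:, j]) for j in range(N))
oracle_err = np.linalg.norm(f_true - f_oracle)
```
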
Given the observation above, it is natural to ask whether a similar idea can be applied to this inverse problem in practice, where f_true is not known. The problem then becomes to determine {α_j}_{j=1}^N while the underlying f_unknown is not given:

\[ f_{unknown} \approx \sum_{j=1}^{N} \alpha_j f_{\lambda_j}, \quad \text{where } \alpha_j \ge 0. \tag{2.3} \]

Besides {f_{λ_j}}_{j=1}^N as one set of representing functions for f_unknown, as in (2.3), we also take advantage of the assumption that f_unknown can be represented by a linear combination of Gaussian functions {g_i}_{i=1}^M, consistent with the non-negativity f_unknown ≥ 0. The other representation of f_unknown thus reads

\[ f_{unknown} = \sum_{i=1}^{M} c_i g_i, \quad \text{where } c_i \ge 0. \tag{2.4} \]

Additional information may be available depending on the application. For MR relaxometry, f(T_2) is regarded as a probability distribution of T_2 times, whose integral over the whole T_2 range equals 1; in Eq. (2.4) this means Σ_i c_i ≡ 1.

To link the two expressions (2.3) and (2.4), we propose to approximate the regularized solutions f_{λ_j} as linear combinations of the regularized solutions {g_ij}_{j=1}^N, where the noise-polluted Gaussian g_ij is obtained by applying steps 1-3 above to each Gaussian function g_i with regularization parameter λ_j.

Given the noisy observation y_ob ∈ R^m, we obtain the regularized solutions f_j := A_{λ_j}^{-} y_ob (shorthand for f_{λ_j}) by solving (1.4). Note that the inversion operator A_{λ_j}^{-} is symbolic and not the usual pseudoinverse, because of the non-negativity constraint. We first assume that the underlying distribution f_unknown can be written as a linear combination of the f_j, and the goal is to seek the coefficients {α_j}_{j=1}^N as well as f_unknown according to

\[ f_{unknown} \approx \sum_{j=1}^{N} \alpha_j f_j =: f_\alpha \tag{2.5} \]

noting that both f_unknown and the α_j need to be determined.

In addition, if a Gaussian distribution g_i ∈ R^n is provided, it can be approximated as a linear combination of the g_ij ∈ R^n, which are the regularized solutions produced by the procedure g_i → A g_i → A g_i + n → g_ij, or simply

\[ g_{ij} = A_{\lambda_j}^{-}(A g_i + n). \tag{2.6} \]

Following the intuition of Multi-Reg,

\[ g_i \approx \sum_{j=1}^{N} \beta_{ij} g_{ij} \tag{2.7} \]

where the random noise realizations n share the same SNR, tailored to the experimental data y_ob.
In fact, since n is random, the corresponding g_ij and β_ij are random variables whose values differ for each noise realization n. We therefore average both g_ij and β_ij over n_run noise realizations to obtain their mean values. Numerically, the coefficients β_ij are obtained by solving the LS problems

\[ \beta_{ij} = \operatorname*{argmin}_{\beta_{ij} \ge 0} \left\| g_i - \sum_{j=1}^{N} \beta_{ij} g_{ij} \right\|_2^2, \quad i = 1, 2, \cdots, M. \tag{2.8} \]

Using the presumptive representation of our distribution f_unknown as a sum of the proposed Gaussian functions {g_i}_{i=1}^M, we have

\[ f_{unknown} \approx \sum_{i=1}^{M} c_i g_i =: f_c \tag{2.9} \]

where the coefficients c_i need to be determined, since f_unknown is not known. By the assumption of (2.9) that the original distribution is represented in terms of Gaussian functions, we can further assume that each regularized recovery f_j can be represented by the regularized solutions g_ij as well:

\[ f_j \approx \sum_{i=1}^{M} x_{ij} g_{ij} \tag{2.10} \]

so the coefficients x_ij satisfy the following LS problems, with one minimization over the index i for each j:

\[ x_{ij} = \operatorname*{argmin}_{x_{ij} \ge 0} \left\| f_j - \sum_{i=1}^{M} x_{ij} g_{ij} \right\|_2^2, \quad j = 1, 2, \cdots, N. \tag{2.11} \]

In conclusion, once the dictionary of Gaussian distributions {g_i}_{i=1}^M is defined, and given the noisy data y_ob and the statistics of the noise in the experiment, we can calculate the sets of coefficients {β_ij} and {x_ij} from the regularized solutions {f_j} and {g_ij}; our task then becomes to seek the coefficients {α_j}_{j=1}^N and {c_i}_{i=1}^M in (2.5) and (2.9).

Equating (2.5) and (2.9), together with (2.7) and (2.10), we arrive at an equation in terms of the obtained variables β_ij, x_ij and g_ij:

\[ \sum_{j=1}^{N} \alpha_j f_j \approx \sum_{i=1}^{M} c_i g_i \;\Rightarrow\; f_\alpha = \sum_{j=1}^{N} \alpha_j \sum_{i=1}^{M} x_{ij} g_{ij} \approx \sum_{i=1}^{M} c_i \sum_{j=1}^{N} \beta_{ij} g_{ij} = f_c. \tag{2.12} \]

We can then solve for α_j and c_i simultaneously via the LS problem

\[ (\alpha^*, c^*) = \operatorname*{argmin} \| f_\alpha - f_c \|_2^2 \quad \text{subject to } \alpha^* \ge 0,\; c^* \ge 0,\; \sum_i c_i^* = 1. \tag{2.13} \]

Note that the solution (α*, c*) is unique, provided the additional equality constraint Σ_{i=1}^M c_i^* = 1. The final recovery f* is then

\[ f^* = \sum_{j=1}^{N} \alpha_j^* f_j \quad \text{or} \quad f^* = \sum_{i=1}^{M} c_i^* g_i. \tag{2.14} \]

Before diving into the detailed computations, some notation must be declared:

\[ L_\alpha = \big[ [g_{11}\; g_{21}\; \cdots\; g_{M1}], \cdots, [g_{1N}\; g_{2N}\; \cdots\; g_{MN}] \big] \in \mathbb{R}^{n \times MN} \tag{2.15} \]

\[ L_c = \big[ [g_{11}\; g_{12}\; \cdots\; g_{1N}], \cdots, [g_{M1}\; g_{M2}\; \cdots\; g_{MN}] \big] \in \mathbb{R}^{n \times MN} \tag{2.16} \]

Note that the columns of L_α are grouped by j (M columns per group) and those of L_c by i (N columns per group); the two matrices have the same number of rows, namely the number of nodes for T_2.

\[ x_{vec} = \big[ [x_{11}\; \cdots\; x_{M1}], \cdots, [x_{1N}\; \cdots\; x_{MN}] \big] \in \mathbb{R}^{MN} \tag{2.17} \]

\[ \beta_{vec} = \big[ [\beta_{11}\; \cdots\; \beta_{1N}], \cdots, [\beta_{M1}\; \cdots\; \beta_{MN}] \big] \in \mathbb{R}^{MN} \tag{2.18} \]

where the sizes of x_vec and β_vec match the number of columns of L_α and L_c, respectively.
\[ \alpha_{vec} = \big[ \underbrace{[\alpha_1\; \cdots\; \alpha_1]}_{M}, \cdots, \underbrace{[\alpha_N\; \cdots\; \alpha_N]}_{M} \big]^T \tag{2.19} \]

\[ c_{vec} = \big[ \underbrace{[c_1\; \cdots\; c_1]}_{N}, \cdots, \underbrace{[c_M\; \cdots\; c_M]}_{N} \big]^T \tag{2.20} \]

where α_vec ∈ R^{MN} and c_vec ∈ R^{MN} share the same size as x_vec and β_vec. According to (2.5) and (2.9),

\[ f_\alpha = L_\alpha \cdot \mathrm{diag}(x_{vec}) \cdot \alpha_{vec} \tag{2.21} \]

\[ f_c = L_c \cdot \mathrm{diag}(\beta_{vec}) \cdot c_{vec} \tag{2.22} \]

hence the minimization problem (2.13) can be written explicitly as

\[ (c^*, \alpha^*) = \operatorname*{argmin} \| L_\alpha\, \mathrm{diag}(x_{vec})\, \alpha_{vec} - L_c\, \mathrm{diag}(\beta_{vec})\, c_{vec} \|_2^2 \;\; \text{s.t. } c_i \ge 0,\; \alpha_j \ge 0,\; \sum_i c_i = 1. \tag{2.23} \]

Stacking the solution as

\[ s = [\alpha_1, \alpha_2, \cdots, \alpha_N, c_1, c_2, \cdots, c_M]^T \in \mathbb{R}^{N+M}, \tag{2.24} \]

problem (2.23) can be further simplified to the classical form of a constrained least squares problem: find s* ∈ R^{N+M} such that

\[ s^* = \operatorname*{argmin}_{s \ge 0} \| B s \|_2^2 \quad \text{such that } \sum_{j=N+1}^{N+M} s_j = 1 \tag{2.25} \]

where

\[ B = L_\alpha \cdot \mathrm{diag}(x_{vec}) \cdot TT_\alpha - L_c \cdot \mathrm{diag}(\beta_{vec}) \cdot TT_c \in \mathbb{R}^{n \times (N+M)} \tag{2.26} \]

\[ TT_\alpha = (I_N \otimes \mathbf{1}_M) \cdot [\, I_N \;\; 0_{N \times M} \,] \in \mathbb{R}^{MN \times (N+M)} \tag{2.27} \]

\[ TT_c = (I_M \otimes \mathbf{1}_N) \cdot [\, 0_{M \times N} \;\; I_M \,] \in \mathbb{R}^{MN \times (N+M)} \tag{2.28} \]

with 1_k the k-vector of ones, so that α_vec = TT_α s and c_vec = TT_c s.
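The selector matrices (2.27)-(2.28) can be built directly with Kronecker products, as the small sketch below shows; the values of M, N and of s are arbitrary, chosen only to make the repetition pattern visible.

```python
import numpy as np

# Build TT_alpha and TT_c of (2.27)-(2.28): they expand the stacked vector
# s = [alpha_1..alpha_N, c_1..c_M] into the repeated vectors alpha_vec
# (2.19) and c_vec (2.20).
M, N = 3, 4
ones = lambda k: np.ones((k, 1))

TT_alpha = np.kron(np.eye(N), ones(M)) @ np.hstack([np.eye(N), np.zeros((N, M))])
TT_c = np.kron(np.eye(M), ones(N)) @ np.hstack([np.zeros((M, N)), np.eye(M)])

s = np.concatenate([np.arange(1.0, N + 1), 10.0 * np.arange(1, M + 1)])
alpha_vec = TT_alpha @ s   # [a1 a1 a1, a2 a2 a2, ...]: each alpha_j repeated M times
c_vec = TT_c @ s           # [c1 c1 c1 c1, c2 ...]: each c_i repeated N times
```
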
Remarks:

1. The computational procedure divides into two parts: offline computation and online computation. In the offline part, {g_i}, {g_ij} and {β_ij} are determined only once. In the online part, each time a noisy measurement y_ob is given, the corresponding {f_j}, {x_ij} and (c*, α*) are computed, and the final recovery is given by (2.14).

2. To solve for {β_ij} in (2.7), note that g_ij depends on the noise, and for a fixed g_i the resulting β_ij depends on the added noise n as well; in other words, both g_ij and β_ij are random variables. One can therefore repeat the task described in (2.7) n_run times and take the mean values of g_ij and β_ij.

The pseudocode for the offline and online computations reads as follows:

Algorithm 1: Offline Computation
Data: pre-determined {λ_j}, {g_i}, n_run and noise SNR
Result: {ḡ_ij} and {β_ij} to be stored
initialization;
for k = 1, 2, ..., n_run do
    for i = 1, 2, ..., M do
        y_clean^(i) = A g_i;
        y_noisy^(i) = y_clean^(i) + n;
        for j = 1, 2, ..., N do
            g_ij = argmin_{g ≥ 0} { ||A g − y_noisy^(i)||_2^2 + λ_j ||g||_2^2 };
        end
    end
end
Average the g_ij over the n_run realizations to get ḡ_ij;
β_ij = argmin_{β_ij ≥ 0} || g_i − Σ_{j=1}^N β_ij ḡ_ij ||_2^2.

Algorithm 2: Online Computation
Data: y_ob, {λ_j}, {ḡ_ij}, {β_ij}
Result: f*
initialization;
for j = 1, 2, ..., N do
    f_j = argmin_{f ≥ 0} { ||A f − y_ob||_2^2 + λ_j ||f||_2^2 };
    x_ij = argmin_{x_ij ≥ 0} || f_j − Σ_{i=1}^M x_ij ḡ_ij ||_2^2;
end
(c*, α*) = argmin_{c, α ≥ 0, Σ_i c_i = 1} || Σ_{j=1}^N α_j Σ_{i=1}^M x_ij ḡ_ij − Σ_{i=1}^M c_i Σ_{j=1}^N β_ij ḡ_ij ||_2^2;
f* = Σ_j α_j^* f_j or f* = Σ_i c_i^* g_i

3 Case study: inversion of one-dimensional continuous NMR relaxometry
The one-dimensional continuous NMR relaxometry signal admits the form [2, 12, 21, 6]

\[ y_{ob}(t) = \int_0^{\infty} e^{-t/T_2}\, f_{true}(T_2)\, dT_2 + n(t) \tag{3.1} \]

where y_ob(t) denotes the observed signal at time t, T_2 are the relaxation times, f_true(T_2) ≥ 0 is the T_2 distribution function, and n(t) is additive Gaussian noise whose statistics can be obtained. The truncated forward problem of (3.1),

\[ y_{ob}(t) = \int_{T_{start}}^{T_{end}} e^{-t/T_2}\, f_{true}(T_2)\, dT_2 + n(t), \tag{3.2} \]

can be discretized as in (1.3):

\[ y_{ob} = A f_{true} + n, \qquad f_{true} \ge 0 \tag{3.3} \]

where y_ob ∈ R^m, f_true ∈ R^n, n ∈ R^m, and the entries of the matrix A ∈ R^{m×n} are given by A_ij = e^{-t_i/T_{2,j}} ΔT_2. We define the SNR of the noise as

\[ \mathrm{SNR} = \frac{\max |y_{ob}|}{\mathrm{RMS}(n)}. \tag{3.4} \]

The Gaussian mixture model (GMM) is a powerful tool for representing an arbitrary probability distribution by a weighted sum of Gaussian distributions [16, 4, 7], and GMM is suitable for the reconstruction of T_2 distributions because of the non-negativity of f_true. The inverse problem associated with (1.3) for Multi-Reg is therefore

\[ f_j = \operatorname*{argmin}_{f \ge 0} \left\{ \|Af - y_{ob}\|_2^2 + \lambda_j \|f\|_2^2 \right\} \tag{3.5} \]

subject to the prior information that f can be represented as a linear combination of Gaussian distributions.
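A compact end-to-end sketch of Algorithms 1-2 follows, on reduced, illustrative grids. All sizes, ranges and the soft handling of the simplex constraint (one heavily weighted row, so a plain NNLS solver suffices) are assumptions for this demo, not the paper's actual settings.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)

# Reduced illustrative grids (the paper's actual settings differ).
t = np.linspace(0.5, 400.0, 80)
T2 = np.linspace(1.0, 200.0, 80)
A = np.exp(-t[:, None] / T2[None, :])
lams = np.logspace(-4, 1, 6)                      # proposed lambda_j
mus = np.linspace(10.0, 190.0, 15)                # small Gaussian dictionary
G = np.exp(-0.5 * ((T2[:, None] - mus[None, :]) / 6.0) ** 2)
G /= G.sum(axis=0, keepdims=True)                 # columns g_i sum to 1
M_dict, N_lam, nT2 = G.shape[1], len(lams), len(T2)

def tik(y, lam):
    """Non-negative Tikhonov solve of (1.4)/(3.5) via stacked NNLS."""
    f, _ = nnls(np.vstack([A, np.sqrt(lam) * np.eye(nT2)]),
                np.concatenate([y, np.zeros(nT2)]))
    return f

def add_noise(y, snr):
    """Additive Gaussian noise at the SNR of (3.4)."""
    return y + rng.normal(0.0, np.max(np.abs(y)) / snr, y.shape)

# -- Offline (Algorithm 1): average g_ij over n_run draws, then beta_ij --
n_run, snr = 2, 500.0
g_bar = np.zeros((M_dict, N_lam, nT2))
for _ in range(n_run):
    for i in range(M_dict):
        y_noisy = add_noise(A @ G[:, i], snr)
        for j, lam in enumerate(lams):
            g_bar[i, j] += tik(y_noisy, lam)
g_bar /= n_run
beta = np.array([nnls(g_bar[i].T, G[:, i])[0] for i in range(M_dict)])  # (2.8)

# -- Online (Algorithm 2): given y_ob, solve for f_j, x_ij, then (2.13) --
f_true = 0.5 * G[:, 3] + 0.5 * G[:, 11]
y_ob = add_noise(A @ f_true, snr)
F = np.column_stack([tik(y_ob, lam) for lam in lams])                        # f_j
x = np.column_stack([nnls(g_bar[:, j].T, F[:, j])[0] for j in range(N_lam)]) # (2.11)

Phi_a = np.column_stack([g_bar[:, j].T @ x[:, j] for j in range(N_lam)])   # sum_i x_ij g_ij
Phi_c = np.column_stack([g_bar[i].T @ beta[i] for i in range(M_dict)])     # sum_j beta_ij g_ij

# (2.25): min_{s >= 0} ||B s|| with the c-block summing to 1; the equality
# is imposed softly through one heavily weighted extra row.
w = 1e3
B = np.hstack([Phi_a, -Phi_c])
row = np.concatenate([np.zeros(N_lam), np.ones(M_dict)])
s, _ = nnls(np.vstack([B, w * row]), np.concatenate([np.zeros(nT2), [w]]))
alpha, c = s[:N_lam], s[N_lam:]
f_star = G @ c                                    # one of the recoveries in (2.14)
```
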
In order to proceed with the offline computation, a few parameters need to be pre-determined: the number of regularization parameters N, the number of Gaussian functions M, the choice of Gaussian functions g_i, and the number of noise realizations n_run added in the time-domain representation of each simulated distribution in (2.6).

We use the following biexponential form of f_sim(T_2) in the T_2 domain for simulation:

\[ f_{sim}(T_2) = \frac{1}{2\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(T_2-\mu_1)^2}{2\sigma_1^2}} + \frac{1}{2\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(T_2-\mu_2)^2}{2\sigma_2^2}} \tag{3.6} \]

where μ_1 is set to 20 ms and μ_2 is determined by the ratio of peak separation (RPS) μ_2/μ_1, which takes 8 evenly spaced values between 1 and 8; σ_1 = σ_2, taking 10 evenly spaced values between 2 and 8. There are 8 × 10 = 80 different biexponential distributions in total.

The discretized T_2 values consist of n = 200 evenly spaced nodes; the signal y_sim for the truncated forward problem,

\[ y_{sim} = A f_{sim} + n, \tag{3.7} \]

is defined on m = 150 evenly spaced time nodes, with A_ij = e^{-t_i/T_{2,j}}. The SNR of the additive noise is set to 500. For the Tikhonov regularization (3.5) used in Multi-Reg, the regularization parameters range from 10^{-} to 10.

To test the performance of Multi-Reg on the 80 simulated signals y_sim, we represent the results as heat maps of the relative error over a wide range of Gaussian peak separations and widths, where the relative error is defined as

\[ \varepsilon = \frac{\| f^* - f_{sim} \|_2}{\| f_{sim} \|_2} \tag{3.8} \]

with f* the distribution reconstructed by Multi-Reg [18]. The criterion for evaluating each setting of Multi-Reg is the mean value of the heat-map matrix, whose (i, j)-th element is the relative error corresponding to the i-th σ and the j-th RPS.

To find good settings for Multi-Reg, we study different configurations of the dictionary {g_i}^M by varying the total number M, the standard deviations σ_k of the Gaussian functions, the number of regularization parameters n_λ, and the number of repeated tests n_run. Since there are infinitely many such configurations and no way to analyze each one numerically, we tested a few representative settings and picked the one with the smallest mean heat-map error. In total we tested 7 groups of dictionaries: for each dictionary {g_i}^M in group k, the standard deviation σ_k of all Gaussian functions g_i equals k, and the means of the Gaussian distributions are equally spaced along the T_2 axis depending on M, so that as M increases, the means of the Gaussian functions become closer together.
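The phantom-and-score machinery above can be sketched as follows, on a reduced grid and with a single fixed λ standing in for the full Multi-Reg pipeline; grid sizes, the λ value, and the 4 × 4 sweep are assumptions for this demo.

```python
import numpy as np
from scipy.optimize import nnls

# Reduced sweep: biexponential phantoms (3.6) over a small grid of widths
# sigma and peak-separation ratios (RPS), scored by the relative error (3.8).
T2 = np.linspace(1.0, 200.0, 100)
dT2 = T2[1] - T2[0]
t = np.linspace(0.5, 400.0, 80)
A = np.exp(-t[:, None] / T2[None, :]) * dT2
rng = np.random.default_rng(2)

def f_sim(mu1, rps, sigma):
    """Equal-weight two-Gaussian phantom of (3.6), with mu2 = RPS * mu1."""
    g = lambda mu: np.exp(-0.5 * ((T2 - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return 0.5 * g(mu1) + 0.5 * g(rps * mu1)

sigmas = np.linspace(2, 8, 4)
rpss = np.linspace(1, 8, 4)
rel_err = np.zeros((len(sigmas), len(rpss)))
for a, sigma in enumerate(sigmas):
    for b, rps in enumerate(rpss):
        f = f_sim(20.0, rps, sigma)
        y = A @ f
        y = y + rng.normal(0.0, np.max(np.abs(y)) / 500.0, y.shape)  # SNR 500
        f_hat, _ = nnls(np.vstack([A, np.sqrt(1e-3) * np.eye(len(T2))]),
                        np.concatenate([y, np.zeros(len(T2))]))
        rel_err[a, b] = np.linalg.norm(f_hat - f) / np.linalg.norm(f)  # (3.8)
```
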
For each group, we tested the various settings shown below.

3.1.1 Choosing M

We vary the number of Gaussians in the finite set {g_i}^M, with uniform standard deviation σ_k = k for each g_i and k = 1, 2, ..., 7, by changing the number of evenly spaced means μ_i within the T_2 range for the g_i. The proposed values of M range from 100 to 280 (see Figure 1), and Figure 1 shows the performance of Multi-Reg (indicated by the mean values of the heat maps) under the different settings of {g_i}^M. The other parameters are fixed: the number of repeated noise realizations is n_run = 10 and the number of regularization parameters is n_λ = 10.
Figure 1: Mean values of the heat maps versus the number of Gaussians M in {g_i}^M, i.e. the number of evenly spaced means μ_i for the g_i, with σ_k = k for k = 1, 2, ..., 7.

We choose the dictionary with σ_k = 7 and M = 220 as our proposed set of Gaussian distributions, simply because it shows the smallest averaged relative error among all configurations.

3.1.2 Choosing n_λ

The regularization parameters are chosen as the n_λ logarithmically spaced points between 10^{-} and 10. Unlike the L-curve or GCV, for which a finer grid of regularization parameters is preferred over a coarse one, a good number of regularization parameters n_λ for Multi-Reg should strike a balance between reducing the ill-posedness of the LS problems (2.8) and (2.11) and expanding the space spanned by {f_j}. On one hand, n_λ should be small so that the solutions {β_ij} and {x_ij} of (2.8) and (2.11) are stable; on the other hand, n_λ should be large so that the f_j exhibit a variety of regularized results. Fixing all other settings of the method (M = 220 and n_run = 10), we plot the trajectories of the mean values of the heat maps against various values of n_λ.

Figure 2: L_2 norms of the heat maps for different numbers n_λ of regularization parameters, where the regularization parameters range from 10^{-} to 10.

We see that the best scenario is n_λ = 12 with σ_k = 7, as it results in the smallest mean values of the heat maps.

3.1.3 Choosing n_run

To choose n_run, we propose a set of candidate values while fixing all other parameters (n_λ = 12, M = 220). We plot the trajectories of the mean values as functions of n_run, shown in Fig. 3.
Figure 3: Mean values of the heat maps against the number of noise realizations n_run.

The trajectories show that σ_k = 6 with the associated n_run = 32 gives the minimal mean values of the heat maps among all configurations considered in the test for choosing n_run. In summary, within this limited number of tests, the Gaussian function dictionary is chosen as follows: M = 220, n_λ = 12, n_run = 32, with uniform standard deviation σ = 6.

Once the preferred settings of Multi-Reg are determined, we are ready to proceed with the online computation whenever new observational data y_ob is given. We use heat maps to represent the differences between the various underlying distributions and the recoveries obtained using Multi-Reg, L-curve, and GCV. For the L-curve and GCV recoveries, we used Tikhonov regularization to determine the coefficients of the same Gaussian basis {g_i} used in the offline computation for Multi-Reg, and increased the number of regularization parameters to 40 logarithmically spaced nodes from 10^{-} to 10, in order to achieve a higher resolution of the L-curve and the GCV curve. We averaged the heat maps, as well as their L_2 norms, over 40 runs; in each run we added a different noise realization with the same SNR. The results are shown in Figure 4.

Figure 4: Comparison of the heat maps of recoveries using (a) Multi-Reg, (b) GCV, and (c) L-curve, on noisy signals generated by various distributions, over a range of ratios of peak separation and Gaussian widths.

To better compare Multi-Reg with the other two methods, and to indicate the relevant regimes in detail, we plot the entrywise ratios of the relative-error heat maps, Multi-Reg/GCV and Multi-Reg/L-curve. In the regime where the ratio is less than 1, Multi-Reg performs better in the sense of relative error, and conversely where the ratio is larger than 1.

Figure 5: Ratio of the average relative errors of the Multi-Reg and GCV methods over 20 noise realizations with SNR = 500, corresponding to a range of peak widths and separations for two Gaussian peaks of equal standard deviation.

Figure 6: Ratio of the average relative errors of the Multi-Reg and L-curve methods over 20 noise realizations with SNR = 500, corresponding to a range of peak widths and separations for two Gaussian peaks of equal standard deviation.
Figure 7: Example comparing recovered distributions (true distribution, Multi-Reg, L-curve, GCV; intensity versus T_2 relaxation time) with comparable relative errors, where the Multi-Reg recovery correctly displays the number and location of the underlying distributions.

Figure 8: Another example comparing recovered distributions with comparable relative errors, where the Multi-Reg recovery correctly displays the number and location of the underlying distributions.

We also show examples consisting of more than two Gaussian distributions whose coefficients are random:
Figure 9: Examples (a)-(d), each comparing the true distribution with the Multi-Reg, L-curve, and GCV recoveries, showing that recoveries using Multi-Reg provide more information about the underlying distributions even though the relative errors of Multi-Reg are comparable to, or even larger than, those of L-curve or GCV when more than two Gaussian distributions with random coefficients are given. In the results above, both L-curve and GCV select the same regularized solutions based on their respective parameter selection criteria.
To test the performances of Multi-Reg as well as other regularization parameter selection methods withrespect to real experimental data, two sets of experiments are conducted and their noisy observations arecollected. • Data acquisition: After informed consent, data were obtained from a 62-year-old male using a 3Twhole-body clinical scanner (Achieva, Philips) with a SENSE Flex-M coil. T -weighted scans werecollected along the axial plane within the thigh with T E/T R = 6ms / × • Problem settings: T ∈ [6 , SN R ≈ . T E , T , and noise level. • Optimal basis setting for Multi-Reg: the basis consists of 2 sets of Gaussians where the first set has n Gaussian = 250 and σ = 2, the second set of Gaussians has n Gaussian = 100 and σ = 3, n run = 20, γ i are chosen to be the n λ = 8 logarithmically spaced points between 10 − and 10.15 a) (b)(c) (d)(e) (f) Figure 10: Recoveries for muscle datasets: quadriceps (top) and medial (mid) and hamstrings (right). Lessregularized recoveries are selected by the GCV method, which successfully located the two water pools ineach muscle but also resulted in high peaks near 1ms; Over-regularized approximations are selected by theL-curve method, which failed to locate the extracellular water. Multi-Reg recoveries display a combinationof both features in a way that it reveals two water pools while reducing the sparsity. • Data acquisition: Data were obtained on formalin-fixed and washed 10 mm lengths, of cervical andlumbar spinal cord from a 4 month-old male C57BL/6 mouse, with cross-sectional lengths 2 × • Problem settings: T ∈ [1 , SN R ≈ T E , T , and noise level. • Optimal basis setting for Multi-Reg: the basis consists of 1 set of Gaussians where n Gaussian = 200and σ = 2, n run = 20, γ i are chosen to be the n λ = 10 logarithmically spaced points between 10 − and 10 − . (a) (b) Figure 11: Recoveries for spinal cord datasets: cervical (left) and lumbar (right) spinal cord. 
Multi-Reg shows efficiency similar to that of GCV, while the L-curve tends to select an over-regularized recovery, as shown in the left panel.
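A minimal sketch of the reconstruction pipeline described above: build a dictionary of Gaussian bumps on a T_2 grid, solve the nonnegative Tikhonov problem for each regularization parameter on a logarithmically spaced grid, and combine the regularized solutions. Here the combination step is a plain average, an illustrative stand-in for Multi-Reg's actual combination; the kernel discretization, grid, and σ values are assumptions, not the paper's experimental settings.

```python
import numpy as np
from scipy.optimize import nnls

def gaussian_dictionary(T2_grid, centers, sigma):
    # Each column is a Gaussian bump on the T2 grid (basis elements for f(T2)).
    return np.exp(-0.5 * ((T2_grid[:, None] - centers[None, :]) / sigma) ** 2)

def decay_kernel(t, T2_grid):
    # Discretized Fredholm kernel G(t, T2) = exp(-t / T2).
    return np.exp(-t[:, None] / T2_grid[None, :])

def multi_reg_recover(t, y, T2_grid, centers, sigma, lambdas):
    """Solve the nonnegative Tikhonov problem for each lambda and average
    the recovered distributions (an illustrative combination step)."""
    G = decay_kernel(t, T2_grid)
    B = gaussian_dictionary(T2_grid, centers, sigma)
    A = G @ B                         # maps Gaussian coefficients to the decay signal
    n = A.shape[1]
    sols = []
    for lam in lambdas:
        # Augmented least squares: min ||A c - y||^2 + lam ||c||^2 with c >= 0.
        A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
        y_aug = np.concatenate([y, np.zeros(n)])
        c, _ = nnls(A_aug, y_aug)
        sols.append(B @ c)            # distribution on the T2 grid
    return np.mean(sols, axis=0)

# Synthetic example: two T2 components at 20 ms and 80 ms (values are illustrative).
t = np.arange(1, 201, dtype=float)    # echo times (ms)
T2_grid = np.linspace(1, 300, 300)
y = 0.6 * np.exp(-t / 20) + 0.4 * np.exp(-t / 80)
centers = np.linspace(1, 300, 100)    # Gaussian centers
lambdas = np.logspace(-4, 1, 8)       # logarithmically spaced, as in the text
f_hat = multi_reg_recover(t, y, T2_grid, centers, sigma=5.0, lambdas=lambdas)
```

Because every regularized solution is nonnegative (nnls enforces f(T_2) ≥ 0 through the coefficients), their average is nonnegative as well, so the combined recovery remains a valid distribution estimate.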
In this paper, we developed a new parameter selection method for the constrained standard form of Tikhonov regularization. Instead of choosing one regularized solution as the "best" approximation to the underlying T_2 distribution using conventional methods such as the L-curve or generalized cross-validation (GCV), we use a combination of multiple regularized solutions, together with a representation over an over-complete dictionary of Gaussian functions, to express the unknown distribution function.

Acknowledgement
CB and MYO's research was partially sponsored by the National Science Foundation through grant DMS-1413039.