Denoising Single Voxel Magnetic Resonance Spectroscopy with Deep Learning on Repeatedly Sampled In Vivo Data

Abstract

Objective: Magnetic Resonance Spectroscopy (MRS) is a noninvasive tool to reveal metabolic information. One challenge of MRS is the relatively low Signal-to-Noise Ratio (SNR) due to low concentrations of metabolites. To improve the SNR, the most common approach is to average signals that are acquired multiple times. The data acquisition time, however, increases accordingly, which can make the scan uncomfortable or even unbearable for the subject. Methods: By exploring the repeatedly sampled data, a deep learning denoising approach is proposed to learn a mapping from the low-SNR signal to the high-SNR one. Results: Results on simulated and in vivo data show that the proposed method significantly reduces the data acquisition time with slightly compromised metabolic accuracy. Conclusion: A deep learning denoising method was proposed to significantly shorten the data acquisition time while maintaining signal accuracy and reliability. Significance: This work provides a solution to the fundamental low-SNR problem in MRS with artificial intelligence.


Index Terms: Magnetic resonance spectroscopy, denoising, fast acquisition, deep learning, artificial intelligence.

I. INTRODUCTION

Magnetic Resonance Spectroscopy (MRS) is an examination method to determine molecular composition. As a non-invasive technique, it provides a quantitative analysis of metabolites in brains [1-4]. However, for the in vivo brain spectrum, the Signal-to-Noise Ratio (SNR) is relatively low due to the low concentrations of metabolites, which makes further metabolic quantification and analysis difficult [1, 5]. To obtain a high SNR, MRS is commonly sampled repeatedly and then averaged [6-8]. However, repeated sampling lengthens the total acquisition time, which increases linearly with the number of repeats. For example, a single sampling of the spectrum on a 3T human scanner takes 14 seconds, and the common 128 repeated samplings require 4 minutes 28 seconds. Thus, reducing the number of repeated samplings while maintaining a high SNR becomes important for fast MRS.

Denoising is a basic approach to achieve this goal, and many methods have made great progress in recent years [9-13]. Among these approaches, the Low Rank method has proven powerful for denoising MRS [9, 10, 14, 15] because its hypothesis, that the time-domain signal of MRS consists of exponential functions, fits the physics of magnetic resonance very well [9, 16]. The exponential assumption, however, may sometimes deviate from real data due to field inhomogeneity or other imaging system imperfections [17].

Very recently, Deep Learning (DL), a representative artificial intelligence technique, has been introduced to spectra processing [18], such as sparse sampling reconstruction [19] and spectrum interpretation [20, 21]. These DL approaches require a huge database to learn the nonlinear mapping from input to output. By using the simulated basis set of metabolites [22] and incorporating advanced signal models [23], DL has been introduced into MRS denoising and quantification [22, 23]. Training data are commonly required to learn the neural network parameters in DL. However, in clinical practice, collecting enough data is very time-consuming and may raise ethical issues if the scanning time is too long for patients. Therefore, these DL MRS approaches generally adopt computer-simulated signals [19, 22, 23], such as exponential functions. As a result, some differences, such as physical and psychological conditions, may exist between the simulated training set and in vivo data, which may hinder the application of DL in MRS [22].

In this work, we propose a DL MRS approach that solely uses in vivo MRS as the training data. By learning the mapping from the low-SNR input to the high-SNR output, neural networks are optimized and then used to denoise the target MRS. The contributions of this work include:

1) Build in vivo training data by deeply exploiting the repeated samplings acquired from healthy volunteers.
2) Propose a causal Long Short-Term Memory (LSTM) neural network suited to MRS denoising.
3) Reduce the data acquisition time by about 80% with slightly compromised metabolic accuracy.

Authors: Wanqi Hu, Dicheng Chen, Tianyu Qiu, Hao Chen, Xi Chen, Lin Yang, Gen Yan, Di Guo, Xiaobo Qu*

This work was supported in part by National Natural Science Foundation of China (61971361, 61871341, 61811530021 and 61672335), National Key R&D Program of China (2017YFC0108703), Natural Science Foundation of Fujian Province of China (2018J06018), Health-Education Joint Research Project of Fujian Province (2019-WJ-31), Fundamental Research Funds for the Central Universities (20720180056 and 20720200065), Xiamen University Nanqiang Outstanding Talents Program. W. Hu, D. Chen, T. Qiu and X. Qu are with Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, China (*corresponding author with Email: [email protected]). H. Chen is with School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China. X. Chen is with the McLean Hospital, Harvard Medical School, Belmont, MA 02478, USA. L. Yang is with Department of Radiology, The Second Affiliated Hospital of Shantou University Medical College, Shantou 515041, China. G. Yan is with Department of Radiology, The Second Affiliated Hospital of Xiamen Medical College, Xiamen 361021, China. D. Guo is with School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China.

II. DATA PREPARATION

In this section, the construction of the simulated and in vivo MRS data sets is introduced.

A. Simulated data set

Our simulation data are generated from the basis set. The basis set parameters used in this part are as follows: Point RESolved Spectroscopy (PRESS) sequence, 3T, field strength = 123.246 Hz/ppm, TE = 30 ms, number of points = 2048, delta time = 0.5 ms. To make the signals easier to see, we selected 10 metabolites that are present in higher concentrations in healthy brains. The basis spectra of the 10 metabolites in the basis set are shown in Appendix-A. The default concentration of each metabolite is 1.

Here we have 50 virtual people. The concentration range of each metabolite in the brain was determined according to the literature [22, 24]. For a virtual person, the concentrations of the 10 metabolites were selected randomly and independently from the ranges in TABLE A1, and we assume that the metabolite concentrations stay the same across the repeated samplings. The simulated noise-free spectrum is obtained by multiplying each metabolite's basis spectrum by its concentration and summing the spectra of all metabolites (Fig. 1(a) and (b)). For each virtual person, 128 Gaussian noise realizations with mean 0 and standard deviation 0.1 are generated independently, and each is added to the noise-free spectrum to simulate 128 repeated samplings (Fig. 1(c) and (d)). Finally, we build a simulated data set consisting of 50 virtual spectra, each containing 128 repeated samplings with independent noise.

B. In vivo data set

The in vivo data come from eight healthy volunteers (for training and test) and one patient with Parkinson's disease (for test). The former data acquisition was approved by the Institutional Review Board of the Shanghai Jiao Tong University and the latter was approved by the Medical Ethics Committee of The Second Affiliated Hospital of Xiamen Medical College. Before the study, written informed consent was obtained from all participants.

All measurements of the healthy volunteers were conducted on the United Imaging 3T human scanner with the PRESS sequence [25]. The Free Induction Decay (FID) acquisition was applied to obtain $^{1}$H spectra with a nominal 90° RF excitation pulse and the following parameters: field strength = 128.378 Hz/ppm, TR/TE = 2000/30 ms, spectral width = 1 kHz, number of points = 2048, delta time = 0.5 ms, voxel size = 20 mm × 20 mm × 20 mm. For each person and each position, it takes 4 minutes 21 seconds for 128 repeated samplings. Specifically, we scanned three healthy volunteers at three voxel locations (LA, LB and LC in Figs. 2(a)-(c)). For the other five healthy volunteers, spectra were acquired only at voxel position LA. Consequently, there are 14 different spectra, each with 128 repeated samplings (see Table I for more details).

The Parkinson patient data were collected on a 3T GE scanner in the Second Affiliated Hospital of Xiamen Medical College. The FID acquisition was applied to obtain $^{1}$H spectra with a nominal 90° RF excitation pulse and the following parameters: field strength = 127.764 Hz/ppm, TR/TE = 2000/30 ms, spectral width = 1 kHz, number of points = 4096, delta time = 0.2 ms, voxel size = 20 mm × 20 mm × 20 mm. For each person and each position, it takes 7 minutes 30 seconds for 128 repeated samplings. The voxel location is marked as LD in Fig. 2(d).

Fig. 1. Example of simulated dataset. (a), (b) are the noise-free spectra with the two different concentrations of 10 metabolites listed in Table A1 of the appendix. (c) and (d) are two repeated samplings of the spectrum of (a) under independent noise. (e) and (f) are the averages of 24 and 128 repeated samplings of noisy spectra of (a), respectively.
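The construction of one virtual person in Section II-A can be sketched as follows. This is only a minimal illustration under stated assumptions: the `basis` array and the concentration ranges `conc_low` and `conc_high` are hypothetical placeholders (the real ones come from the basis set and TABLE A1), the signal is treated as real-valued, and the function name `simulate_repeats` is not from the paper.

```python
import numpy as np


def simulate_repeats(basis, conc_low, conc_high, n_repeats=128,
                     noise_std=0.1, rng=None):
    """Simulate repeated samplings for one virtual person.

    basis: array (n_metabolites, n_points), one basis spectrum per row.
    conc_low, conc_high: arrays (n_metabolites,), concentration ranges.
    Returns (conc, noise_free, repeats), repeats has shape (n_repeats, n_points).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw each metabolite concentration independently from its range.
    conc = rng.uniform(conc_low, conc_high)
    # Noise-free spectrum: concentration-weighted sum of the basis spectra.
    noise_free = conc @ basis
    # Independent zero-mean Gaussian noise for every repeated sampling.
    noise = rng.normal(0.0, noise_std, size=(n_repeats, basis.shape[1]))
    return conc, noise_free, noise_free + noise


rng = np.random.default_rng(0)
# Hypothetical stand-ins for the basis set and for the ranges of TABLE A1.
basis = np.abs(rng.standard_normal((10, 2048)))
conc_low, conc_high = np.full(10, 1.0), np.full(10, 10.0)

conc, noise_free, repeats = simulate_repeats(basis, conc_low, conc_high, rng=rng)
avg24 = repeats[:24].mean(axis=0)   # analogue of Fig. 1(e)
avg128 = repeats.mean(axis=0)       # analogue of Fig. 1(f)
print(np.std(avg24 - noise_free), np.std(avg128 - noise_free))
```

Because the noise realizations are independent, the residual noise of an average of N samplings shrinks roughly as 0.1/sqrt(N), which is why the 128-average serves as the high-SNR reference.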

TABLE I
Data Partitioning for Neural Network Training on Healthy Volunteer MRS

[Table: rows are the voxel locations LA, LB and LC; columns are Persons 1 to 8. LA has data for all eight persons; LB and LC each have data for three persons and are NULL for the other five.]

Note: The light blue and orange colors represent the training set and test set used in deep learning. NULL indicates that the data were not collected. For illustration, we use P2-LA to represent the spectra scanned at location LA from Person 2.

III. METHODS

This section first introduces how the training set is built, then proposes the denoising neural network, and finally provides the parameter settings of the network.

A. Dataset designs

For the in vivo data, the network is trained on the MRS obtained from three locations (LA, LB, LC) of 7 healthy volunteers (light blue grid in Table I). In the training phase, the total number of spectra is 11. For each spectrum, $m$ samplings are taken from the $M$ repeated samplings and averaged to form an input spectrum (orange Input$_k$ in Fig. 3). The output spectrum (green Label$_k$ in Fig. 3) is obtained by averaging $m'$ samplings taken from the same $M$ repeated samplings. A pair (Input$_k$, Label$_k$) then forms one item. As the numbers of possible input and output spectra, $C_M^{m}$ and $C_M^{m'}$, are too large for machine learning, only $K$ items are randomly selected from all combinations of (Input$_k$, Label$_k$), i.e., $k = 1, 2, \ldots, K$. For the training on in vivo data, we set $m = 24$, $m' = 124$ and $K = 1000$. The training process on simulated data is the same as that on in vivo data, and we again set $m = 24$, $m' = 124$ and $K = 1000$.

B. Causal-LSTM (CLSTM)

The designed network for MRS denoising is summarized in Fig. 4. We choose Long Short-Term Memory (LSTM) as the basic network since it has a strong ability to learn the intrinsic properties of time series [26-28], and the acquired raw MRS data, i.e., the FID signal, are themselves time series. However, the traditional LSTM does not denoise well metabolic spectra with multiple peaks or different peak heights (Appendix-B). By changing the LSTM input, the correlation between FID points becomes easier to learn.

There are two sequences in our network: 1) the input of an item in the training set, which is the signal to be denoised and is denoted as the low-SNR FID $\mathbf{x} = [x_1, x_2, \ldots, x_T]$; 2) the sequence that stores the updated denoising results, called the virtual FID $\mathbf{s} = [s_1, s_2, \ldots, s_T]$. Each point $x_t$ maps to the corresponding point $s_t$ through a Block composed of multiple layers (Fig. 4(b)). The Block's input has five parts: the current point $x_t$, the Low SNR Past Frames (LSPF) $\mathbf{x}_t$, the Virtual Past Frames (VPF) $\mathbf{s}_t$, the memory cells $\mathbf{h}_t$ and the state cells $\mathbf{c}_t$. $\mathbf{h}_t$ and $\mathbf{c}_t$ are used in the same way as in the general LSTM [29]. LSPF is a vector consisting of the $r$ points before $x_t$:

$$\mathbf{x}_t = [x_{t-r}, x_{t-r+1}, x_{t-r+2}, \ldots, x_{t-2}, x_{t-1}]. \tag{1}$$

VPF is the vector corresponding to LSPF in the virtual FID. Its content comes from the results of the previous Blocks:

$$\mathbf{s}_t = [s_{t-r}, s_{t-r+1}, s_{t-r+2}, \ldots, s_{t-2}, s_{t-1}]. \tag{2}$$

The output of the CLSTM Block is the denoised result $s_t$. The parameters are denoted as $\boldsymbol{\theta}$. The mapping of the entire Block can be expressed as:

$$s_t = f_{Block}(x_t, \mathbf{x}_t, \mathbf{s}_t, \mathbf{h}_t, \mathbf{c}_t \mid \boldsymbol{\theta}). \tag{3}$$

The output $s_t$ of each step enters the new VPF $\mathbf{s}_{t+1}$, and all $s_t$ together make up the final denoised sequence $\mathbf{s}$.

Fig. 2. The location of voxels and spectra collected from 8 healthy volunteers and 1 patient. (a) Voxel position LA, (b) voxel position LB and (c) voxel position LC are the scanned brain areas of the 8 healthy persons. (d) Voxel position LD is the scanned brain area of the patient with Parkinson's disease. (e)-(h) are the spectra corresponding to those acquired in (a)-(d).

Fig. 3. The process to form items in the training set. For the acquired spectrum, P2-LA, at location LA of person 2, 24 samplings are randomly selected from 128 repeated samplings and then averaged as the input (Input$_k$) of the network. The output (Label$_k$) of the network is the average of 124 samplings that are randomly selected from the same 128 repeated samplings. One pair (Input$_k$, Label$_k$) is called one item. The total numbers of possible inputs and outputs are $C_M^{m}$ and $C_M^{m'}$, respectively. Only a small number ($K$ = 1000) of all pairs (Input$_k$, Label$_k$) are used to learn the network parameters.
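The item formation illustrated in Fig. 3 can be sketched as below. This is a hedged illustration rather than the authors' code: the function name `make_items` is hypothetical, the `repeats` array is a random placeholder for one spectrum's 128 repeated samplings, and drawing the input and label subsets independently of each other is an assumption.

```python
from math import comb

import numpy as np


def make_items(repeats, m_in, m_label, n_items, rng):
    """Form (Input_k, Label_k) pairs from one spectrum's repeated samplings.

    repeats: array (M, n_points) holding the M repeated samplings.
    Each input averages m_in samplings and each label averages m_label
    samplings, both drawn without replacement from the same M samplings.
    """
    n_repeats = repeats.shape[0]
    inputs, labels = [], []
    for _ in range(n_items):
        idx_in = rng.choice(n_repeats, size=m_in, replace=False)
        idx_label = rng.choice(n_repeats, size=m_label, replace=False)
        inputs.append(repeats[idx_in].mean(axis=0))
        labels.append(repeats[idx_label].mean(axis=0))
    return np.stack(inputs), np.stack(labels)


rng = np.random.default_rng(0)
repeats = rng.standard_normal((128, 2048))  # placeholder for measured data
inputs, labels = make_items(repeats, m_in=24, m_label=124, n_items=1000, rng=rng)

# Number of distinct subsets of each size, far more than the K = 1000 used.
print(comb(128, 24), comb(128, 124))
```

Sampling only K pairs keeps the training set small while still exposing the network to many different noise realizations of the same underlying spectrum.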
Each Block includes four modules (see Fig. 4(b)):

Input module: Combine the LSPF $\mathbf{x}_t$ and the VPF $\mathbf{s}_t$ through several linear layers and a ReLU function. The output $x_t^c$ is regarded as the input of the LSTM cell:

$$x_t^c = f_c((\mathbf{x}_t, \mathbf{s}_t) \mid \boldsymbol{\theta}_c), \tag{4}$$

where $\boldsymbol{\theta}_c$ represents the training parameters of the input module.

LSTM cell: Employ the classic LSTM strategy to calculate the new memory cells $\mathbf{h}_{t+1}$ and state cells $\mathbf{c}_{t+1}$ as below:

$$(\mathbf{h}_{t+1}, \mathbf{c}_{t+1}) = f_{lstm}((x_t^c, \mathbf{h}_t, \mathbf{c}_t) \mid \boldsymbol{\theta}_h), \tag{5}$$

where $\boldsymbol{\theta}_h$ represents the training parameters of the LSTM.

Regression module: The new memory cells $\mathbf{h}_{t+1}$ are fed into this module, and the regression layers map $\mathbf{h}_{t+1}$ to a point $\hat{x}_t$:

$$\hat{x}_t = f_r(\mathbf{h}_{t+1} \mid \boldsymbol{\theta}_r), \tag{6}$$

where $\boldsymbol{\theta}_r$ represents the training parameters of the linear regression $f_r$.

Data validation: Compute the weighted sum, with weights $\boldsymbol{\theta}_v$, of the fidelity term $x_t$ and the regression output $\hat{x}_t$:

$$s_t = f_v(x_t, \hat{x}_t \mid \boldsymbol{\theta}_v), \tag{7}$$

where $s_t$ is the denoised value that is written into the virtual FID.

Details of each module can be found in Fig. 4(c).

C. Network Implementation
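As a rough illustration of the data flow through the four modules of Eqs. (4)-(7), the following NumPy sketch runs one Block per FID point with random, untrained weights. It is not the authors' PyTorch implementation: the class name `CLSTMBlockSketch`, the layer sizes, the zero padding before the first point, the real-valued signal and the fixed fidelity weight `alpha` are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class CLSTMBlockSketch:
    """Forward pass only; all weights are random and untrained."""

    def __init__(self, r=10, hidden=32, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return 0.1 * rng.standard_normal(shape)

        self.r, self.hidden = r, hidden
        # Input module (Eq. 4): linear -> ReLU -> linear on [LSPF, VPF].
        self.w1, self.b1 = init(hidden, 2 * r), np.zeros(hidden)
        self.w2, self.b2 = init(hidden, hidden), np.zeros(hidden)
        # LSTM cell (Eq. 5): stacked weights for the i, f, g, o gates.
        self.wx, self.wh = init(4 * hidden, hidden), init(4 * hidden, hidden)
        self.bg = np.zeros(4 * hidden)
        # Regression module (Eq. 6): maps the memory cells to one point.
        self.wr, self.br = init(hidden), 0.0
        # Data validation (Eq. 7): assumed fixed weight of the fidelity term.
        self.alpha = 0.5

    def step(self, x_t, lspf, vpf, h, c):
        z = np.maximum(self.w1 @ np.concatenate([lspf, vpf]) + self.b1, 0.0)
        x_c = self.w2 @ z + self.b2
        i, f, g, o = np.split(self.wx @ x_c + self.wh @ h + self.bg, 4)
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        x_hat = self.wr @ h_new + self.br
        s_t = self.alpha * x_t + (1.0 - self.alpha) * x_hat
        return s_t, h_new, c_new

    def denoise(self, x):
        r = self.r
        x_pad = np.concatenate([np.zeros(r), x])  # zeros before the first point
        s_pad = np.zeros(r + len(x))              # virtual FID, filled causally
        h, c = np.zeros(self.hidden), np.zeros(self.hidden)
        for t in range(len(x)):
            lspf = x_pad[t:t + r]  # the r low-SNR points before x_t
            vpf = s_pad[t:t + r]   # the r denoised points before s_t
            s_t, h, c = self.step(x[t], lspf, vpf, h, c)
            s_pad[t + r] = s_t
        return s_pad[r:]


model = CLSTMBlockSketch()
x = np.random.default_rng(1).standard_normal(64)  # placeholder low-SNR FID
s = model.denoise(x)
print(s.shape)
```

The loop makes the causal structure explicit: each output $s_t$ depends only on the current point, on past points of the input and on past outputs, so changing later samples of `x` leaves earlier outputs unchanged.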

The network is implemented using the PyTorch [30] library. The model parameters $\boldsymbol{\theta} = \{\boldsymbol{\theta}_c, \boldsymbol{\theta}_h, \boldsymbol{\theta}_r, \boldsymbol{\theta}_v\}$ are trained by an Adam optimizer [31] with learning rate 1e-4 and batch size 128 to minimize the Mean Absolute Error loss at every sampling point. The network parameters are updated iteratively through back-propagation. The signal is normalized in the time domain for all tasks to enhance stability and robustness. The learning rate is scheduled to decay by a factor of 0.1 according to the validation loss. Finally, the checkpoint with the smallest validation loss is saved as the final experiment model. The network is trained on one NVIDIA RTX2080Ti GPU for 50 epochs. During testing, the typical inference time for 512 sampled points is around 1 second per batch.

IV. RESULTS

This section is devoted to verifying the denoising performance. For comparison, we choose the Low Rank method [15], one of the state-of-the-art techniques in MRS denoising, and the most common approach, which averages the maximal number of repeated samplings. The test set contains two parts: (a) 10 simulated spectra, and (b) 3 in vivo spectra, which have been introduced in Section III-A. In the evaluation, input signals are treated as low-SNR observations, and spectra averaged over 128 repeated samplings are regarded as high-SNR references.

All spectra, including observations, references and denoising results, were analyzed with LCModel, a commercial software package whose quantitative results are widely used in MRS applications [11, 22]. We evaluate spectra from two aspects: Concentration (Conc.) and Standard Deviations Percentage (SDP). The Conc. is the absolute concentration with water scaling. The SDP is the estimated standard deviation (Cramer-Rao lower bound) expressed as a percentage of the estimated concentration [32]. The SDP ranges from 0 to 999, and SDP < 15 has been used as a criterion of acceptable reliability [32, 33]. For each trial, the lower the SDP is, the more accurate the metabolite estimate is. 0 is the best, while 999 corresponds to a case where some metabolite cannot be detected or does not exist. Seven metabolites whose SDPs are smaller than 15 in the references are presented because these metabolites, such as NAA, Cr+PCr, and Glu, are treated as important biomedical indicators in diagnosis [24, 34]. To avoid the bias of random selection, we repeated 50 trials for each spectrum and computed the average and standard deviation.

Fig. 4. Data flow and network structure of CLSTM. (a) For each point $x_t$, a Block is designed to use LSPF and VPF to denoise $x_t$ and obtain the result $s_t$. (b) The 4 modules in the CLSTM Block. (c) The network structure of the 4 modules in detail, whose layer parameters are shown in Table II.

TABLE II
Parameters of Network Layers

Model: Layers (input size, output size)
Input model: Linear (r × 2, r) + Linear (r, r × 32) + ReLU + Linear (r × …

Note: The length r of LSPF and VPF is 10.

The denoising results on simulated data are given in Fig. 5, showing that CLSTM outperforms the compared Low Rank method in both the accuracy of concentration and the reliability (SDP). Compared with the input signal (average of 24 repeated samplings), although both Low Rank and CLSTM smooth the spectra, CLSTM preserves more details (Fig. 5(d)) than Low Rank (Fig. 5(c)). In the quantitative analysis, both methods provide reasonable estimates for most metabolite concentrations. However, Low Rank reduces the concentrations of some metabolites, such as Asp and Glu (Fig. 5(e)). Moreover, for some metabolites that are difficult to detect because of their low concentration, such as Asp, Low Rank fails to offer an acceptable SDP. In contrast, CLSTM is still capable of preserving this component during denoising and even yields a smaller SDP than that of the References. Thus, by taking 24 repeated samplings from 128 samplings, the proposed CLSTM achieves concentration accuracy and reliability comparable to those of the maximal number of averages, indicating a saving of 81.25% in the data acquisition time.

The denoising performance on the in vivo $^{1}$H spectrum confirms our conclusion from the simulated data. Both Low Rank and CLSTM suppress the noise well (Figs. 6(c) and (d)). However, Low Rank over-smooths some signals (e.g., NAA at 2.7~2.8 ppm), leading to lower concentration estimates than the Reference (Fig. 6(e)). By contrast, the results of CLSTM are more comparable with the Reference.

We further verify the performance on the Parkinson patient data. As shown in Fig. 7, for Inputs with very low SNR, CLSTM (Fig. 7(d)) removes noise better than Low Rank (Fig. 7(c)). In the quantitative analysis, CLSTM provides estimates that are more comparable with the References, such as for GPC and Glu, and reduces the SDPs compared with the Inputs (24 averages). Furthermore, for some metabolites whose SDPs have a large error bar, CLSTM clearly decreases the average and the standard deviation of the SDPs, indicating that CLSTM offers much more reliable and robust estimates than the inputs and the Low Rank method. These observations imply that, although the network was trained on data acquired from healthy volunteers, the proposed CLSTM is still applicable to improving patient MRS.

In summary, results on the simulated and in vivo data demonstrate that the proposed CLSTM can reduce the number of samplings from 128 to 24, saving 81.25% of the data acquisition time, with comparable or slightly compromised metabolic concentrations.

Fig. 5. Denoised spectra and the quantitative analysis of the simulated data. (a)-(d) are the reference (128 averages), the low-SNR input (24 averages), and the spectra denoised with Low Rank and the proposed CLSTM, respectively. (e) Estimated concentrations (above the zero line) and the reliability criteria, SDPs (under the zero line), of the metabolites. Note: In (a)-(d), the fit curves of the corresponding spectra (red, yellow, blue and green lines) are presented with grey lines. In (e), only the metabolites with the lowest seven SDPs are presented (Cr and PCr are shown as the sum Cr+PCr), and the error bars denote standard deviations over 50 denoising trials. The SDPs of Asp with the low-SNR input and Low Rank are far over 30 and are marked with a star. The dashed line marks SDP = 15; values above this line imply acceptable reliability of estimation.

Fig. 6. Denoised spectra and the quantitative analysis of the healthy volunteer (P7-LA). (a)-(d) are the reference (128 averages), the low-SNR input (24 averages), and the spectra denoised with Low Rank and the proposed CLSTM, respectively. (e) Estimated concentrations (above the zero line) and the reliability criteria, SDPs (under the zero line), of the metabolites. Note: In (a)-(d), the fit curves of the corresponding spectra (red, yellow, blue and green lines) are presented with grey lines. In (e), only the metabolites with the lowest seven SDPs are presented (Cr and PCr are shown as the sum Cr+PCr), and the error bars denote standard deviations over 50 denoising trials. The dashed line marks SDP = 15; values above this line imply acceptable reliability of estimation.

V. DISCUSSIONS

In this section, we test the denoising performance of CLSTM under different numbers of repeated samplings (Fig. 8) and different voxels (Fig. 9) to verify the generality of the method.

A. Different Number of Samplings

For Inputs with different numbers of samplings in Fig. 8, CLSTM provides estimates comparable with the Reference. For the high-concentration metabolites (Glu and NAA), CLSTM offers estimates that are very close to the Reference. For the low-concentration metabolites (GPC, GSH, and Asp), besides providing accurate concentrations, CLSTM clearly reduces the SDPs compared with the Inputs. Thus, the benefits of CLSTM in accurate concentrations and high reliability still hold.

Fig. 7. Denoised spectra and the quantitative analysis of the Parkinson patient (P9-LD). (a)-(d) are the reference (128 averages), the low-SNR input (24 averages), and the spectra denoised with Low Rank and the proposed CLSTM, respectively. (e) Estimated concentrations (above the zero line) and the reliability criteria, SDPs (under the zero line), of the metabolites. Note: In (a)-(d), the fit curves of the corresponding spectra (red, yellow, blue and green lines) are presented with grey lines. In (e), only the metabolites with the lowest seven SDPs are presented (Cr and PCr are shown as the sum Cr+PCr; -CrCH2 is a correction term rather than a metabolite, namely a simulated negative creatine CH2 singlet around 3.94 ppm), and the error bars denote standard deviations over 50 denoising trials. The variance of GPC's SDP is far beyond the plotted range and is marked with a triangle. The dashed line marks SDP = 15; values above this line imply acceptable reliability of estimation.

Fig. 8. The quantitative analysis of the volunteer (P7-LA) with different numbers of samplings as input. (a) 16 averages. (b) 32 averages. Both (a) and (b) show the estimated concentrations (above the zero line) and the reliability criteria, SDPs (under the zero line), of the metabolites. Note: only the metabolites with the lowest seven SDPs are presented (Cr and PCr are shown as the sum Cr+PCr), and the error bars denote standard deviations over 50 denoising trials. The dashed line marks SDP = 15; values above this line imply acceptable reliability of estimation.

B. Different Locations of Voxels

Voxel locations do not limit the benefit of CLSTM. At other locations (Fig. 9), CLSTM still offers reasonable concentration estimates compared with the References. Besides, the averaged SDPs obtained with CLSTM are smaller than 15 for most metabolites, ensuring that the concentration estimates are reliable enough.

C. Limitations on Accuracy of Low Concentration Metabolites

However, for some metabolites, CLSTM may provide an estimate with relatively insufficient accuracy (absolute accuracy of Conc. > 5%). For example, on the data acquired from healthy volunteers, the relative error of one of the estimated concentrations of Asp is 5.4%. We speculate that this is because the metabolite has more peaks and lower peak heights, so its concentration is greatly affected by noise.

VI. CONCLUSION

In this work, we proposed a deep learning denoising method, CLSTM, trained with limited measured data. The denoising result is comparable with the 128-averaged spectra, meaning that the proposed method saves about 81.25% of the acquisition time. In addition, the denoising result achieves comparable quantitative accuracy and greatly reduces the SDPs, making the quantification more reliable. CLSTM may be applied to MRS denoising for brain diseases.

APPENDIX

A. Simulated Dataset Detail

Fig. 10 shows the basis spectra of the metabolites in the basis set over 4.0-0.2 ppm.

TABLE A1 gives the reference range of the concentrations of metabolites in the healthy human brain, which is used in the simulation of the spectra in this work.

B. Examples of Limitations of Traditional LSTM

The traditional LSTM has difficulty preserving small peaks when the training set (Fig. 11(a)) differs from the target spectra (Fig. 11(b)). The results in Fig. 11 are obtained with an LSTM trained on 9000 simulated spectra with different concentrations of NAAG or NAA. The hidden size of the LSTM is 128. Both sets of weights are trained for 100 epochs. Comparing NAAG with NAA, when the number of metabolite peaks and the height difference between spectral peaks increase, the denoising performance of the LSTM becomes worse.

C. Accuracy of Concentration

TABLE A2 shows the accuracy of the metabolites' concentrations relative to the reference spectrum.

Fig. 9. The quantitative analysis of different voxels and people. (a) P1-LB. (b) P3-LC. Both (a) and (b) show the estimated concentrations (above the zero line) and the reliability criteria, SDPs (under the zero line), of the metabolites. Note: only the metabolites with the lowest seven SDPs are presented (Cr and PCr are shown as the sum Cr+PCr), and the error bars denote standard deviations over 50 denoising trials. The dashed line marks SDP = 15; values above this line imply acceptable reliability of estimation.

TABLE A1
Concentration Used in the Simulation of Brain Spectra

Columns: Metabolite (abbreviation); Reference Range (mmol/L), Low and High; Concentration in Fig. 1 (mmol/L), (a) and (b).

Metabolites listed:
Aspartate (Asp)
Creatine (Cr)
Glutamine (Gln)
Glutamate (Glu)
Glutathione (GSH)
Glycerophosphoryl-choline (GPC)
Myo-inositol (mI)
N-acetyl-aspartate (NAA)
N-acetyl-aspartyl-glutamate (NAAG)
Phosphocreatine (PCr)


Fig. 10. Basis spectra of 10 metabolites in basis set. The default concentration of each metabolite is 1.

Fig. 11. The denoising result of the NAAG (a) and NAA (b) simulated spectra and the corresponding FID.

TABLE A2
Accuracy of Concentration

Columns: Figure / Method; Metabolites GPC, GSH, Asp, mI, Cr+PCr, Glu, NAA; Average absolute value

Fig. 5 Input 0.98% -0.85% 1.01%
Low Rank -5.38% -8.61% -62.37% -1.95% -1.03% -8.02% -11.45% -0.87% -0.08% 0.17% 0.57% 0.82% -9.33% -7.44% 8.32%
CLSTM -4.31% -2.50% 5.39% 0.82% -3.45% -1.72%

Fig. 8(a) Input -1.20% 0.76% -4.62% -0.79%
Low Rank -16.59% -12.14% -34.20% -4.87% -0.16% -17.26% -3.61% 12.69%
CLSTM -6.55% -3.27% -9.92% -1.54% -0.55% -5.44% -6.72% 0.26% 2.95%
Low Rank -12.44% -11.24% -32.85% -0.56% -0.84% -13.15% -3.18% 10.61%
CLSTM -4.95% -4.83% -7.52% -0.28% -0.47% -5.44% -1.71% -13.19% -2.10% 11.99%
CLSTM -1.47% -5.08% 3.86% 2.69% -3.48% 0.29% 2.55%

Fig. 9(b) Input 11.67% 6.36% -5.88% -4.51% 0.98% 6.09% -2.46% 0.61% -11.16% -16.22% 10.40%
CLSTM 0.48% -2.11% -1.68%

Fig. 7 (columns: GPC, GSH, -CrCH2, mI, Cr+PCr, Glu, NAA, Average absolute value)
Input 12.99% 7.92% 25.38% 1.99% 7.81% 15.48% 7.02% 11.22%
Low Rank 12.21% -1.54% -3.29% -13.96% -4.39% -2.14% -2.68% -2.34% 5.62%

Note: Accuracy = (Conc. (Method) - Conc. (Reference)) / Conc. (Reference) × 100%; Average absolute value is the average of the absolute accuracy over the metabolites.
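The two quantities defined in the note can be computed as in the short sketch below. The function names are hypothetical, and the example row reuses one Low Rank row of TABLE A2 (seven metabolite accuracies, as extracted) to check the reported average absolute value.

```python
def accuracy_percent(conc_method, conc_reference):
    """Signed relative deviation of an estimated concentration, in percent."""
    return (conc_method - conc_reference) / conc_reference * 100.0


def average_absolute_accuracy(accuracies):
    """Mean of the absolute accuracies over all listed metabolites."""
    return sum(abs(a) for a in accuracies) / len(accuracies)


# One Low Rank row of TABLE A2: GPC, GSH, Asp, mI, Cr+PCr, Glu, NAA (in %).
low_rank_row = [-16.59, -12.14, -34.20, -4.87, -0.16, -17.26, -3.61]
print(round(average_absolute_accuracy(low_rank_row), 2))  # 12.69
```

A negative accuracy means the method underestimates the concentration relative to the 128-average reference, which is the pattern seen for Low Rank in most columns.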

ACKNOWLEDGEMENTS

The authors would like to thank Zi Wang and Xinlin Zhang for assisting in data acquisition and helpful discussions. The authors also thank the staff from Shanghai Jiao Tong University and the Second Affiliated Hospital of Xiamen Medical College for technical support.

REFERENCES

[1] J.-B. Poullet, D.M. Sima, and S. Van Huffel, "MRS signal quantitation: a review of time- and frequency-domain methods," J. Magn. Reson., vol. 195, no. 2, pp. 134-144, 2008.
[2] M. Bagory, F. Durand-Dubief, D. Ibarrola, J.-C. Comte, F. Cotton, C. Confavreux, and D. Sappey-Marinier, "Implementation of an absolute brain 1H-MRS quantification method to assess different tissue alterations in multiple sclerosis," IEEE Trans. Biomed. Eng., vol. 59, no. 10, pp. 2687-2694, 2011.
[3] S. Ferdowsi and V. Abolghasemi, "Semiblind spectral factorization approach for magnetic resonance spectroscopy quantification," IEEE Trans. Biomed. Eng., vol. 65, no. 8, pp. 1717-1724, 2017.
[4] Y. Guo, S. Ruan, J. Landré, and J.M. Constans, "A sparse representation method for magnetic resonance spectroscopy quantification," IEEE Trans. Biomed. Eng., vol. 57, no. 7, pp. 1620-1627, 2010.
[5] S.W. Provencher, "Estimation of metabolite concentrations from localized in vivo proton NMR spectra," Magn. Reson. Med., vol. 30, no. 6, pp. 672-679, 1993.
[6] S.K. Valaparla, E.M. Ripley, G.R. Boone, T.Q. Duong, and G.D. Clarke, "Evaluation of vastus lateralis muscle fat fraction measured by two-point Dixon water-fat imaging and 1H-MRS," in Proc. Intl. Soc. Mag. Reson. Med., 2014, vol. 22, p. 1221.
[7] Y. Liu, X. Mei, J. Li, N. Lai, and X. Yu, "Mitochondrial function assessed by 31P MRS and BOLD MRI in non-obese type 2 diabetic rats," Physiological Reports, vol. 4, no. 15, p. e12890, 2016.
[8] C. Choi, S.K. Ganji, R.J. DeBerardinis, I.E. Dimitrov, J.M. Pascual, R. Bachoo, B.E. Mickey, C.R. Malloy, and E.A. Maher, "Measurement of glycine in the human brain in vivo by 1H-MRS at 3 T: application in brain tumors," Magn. Reson. Med., vol. 66, no. 3, pp. 609-618, 2011.
[9] H.M. Nguyen, X. Peng, M.N. Do, and Z.-P. Liang, "Denoising MR spectroscopic imaging data with low-rank approximations," IEEE Trans. Biomed. Eng., vol. 60, no. 1, pp. 78-89, 2012.
[10] Y. Chen, Y. Li, and Z. Xu, "Improved low-rank filtering of MR spectroscopic imaging data with pre-learnt subspace and spatial constraints," IEEE Trans. Biomed. Eng., vol. 67, no. 8, pp. 2381-2388, 2019.
[11] S.H. Joshi, A. Marquina, S. Njau, K.L. Narr, and R.P. Woods, "Denoising of MR spectroscopy signals using total variation and iterative Gauss-Seidel gradient updates," in , 2015: IEEE, pp. 576-579.
[12] H.F.C. de Greiff, R. Ramos-Garcia, and J.V. Lorenzo-Ginori, "Signal de-noising in magnetic resonance spectroscopy using wavelet transforms," Concepts Magn. Reson., vol. 14, no. 6, pp. 388-401, 2002.
[13] O.A. Ahmed, "New denoising scheme for magnetic resonance spectroscopy signals," IEEE Trans. Med. Imaging, vol. 24, no. 6, pp. 809-816, 2005.
[14] Y. Liu, C. Ma, B.A. Clifford, F. Lam, C.L. Johnson, and Z.-P. Liang, "Improved low-rank filtering of magnetic resonance spectroscopic imaging data corrupted by noise and B0 field inhomogeneity," IEEE Trans. Biomed. Eng., vol. 63, no. 4, pp. 841-849, 2015.
[15] T. Qiu, W. Liao, D. Guo, D. Liu, X. Wang, and X. Qu, "Gaussian noise removal with exponential functions and spectral norm of weighted Hankel matrices," arXiv preprint arXiv:2001.11815,

Angew. Chem. Int. Ed., vol. 54, no. 3, pp. 852-854, 2015. [17] I. Marshall, J. Higinbotham, S. Bruce, and A. Freise, "Use of Voigt lineshape for quantification of in vivo H spectra,"

Magn. Reson. Med., vol. 37, no. 5, pp. 651-657, 1997. [18] D. Chen, Z. Wang, D. Guo, V. Orekhov, and X. Qu, "Review and prospect: deep learning in nuclear magnetic resonance spectroscopy,"

Chem.-A Eur. J., vol. 26, no. 46, pp. 10391-10401, 2020. [19] X. Qu, Y. Huang, H. Lu, T. Qiu, D. Guo, T. Agback, V. Orekhov, and Z. Chen, "Accelerated nuclear magnetic resonance spectroscopy with deep learning,"

Angew. Chem. Int. Ed., vol. 59, no. 26, pp. 10297-10300, 2019. [20] P. Klukowski, M. Augoff, M. Zięba, M. Drwal, A. Gonczarek, and M.J. Walczak, "NMRNet: a deep learning approach to automated peak picking of protein NMR spectra,"

Bioinformatics, vol. 34, no. 15, pp. 2590-2597, 2018. [21] S. Liu, J. Li, K.C. Bennett, B. Ganoe, T. Stauch, M. Head-Gordon, A. Hexemer, D. Ushizima, and T. Head-Gordon, "Multiresolution 3D-DenseNet for chemical shift prediction in NMR crystallography,"

J. Phys. Chem. Lett., vol. 10, no. 16, pp. 4558-4565, 2019. [22] H.H. Lee and H. Kim, "Intact metabolite spectrum mining by deep learning in proton magnetic resonance spectroscopy of the brain,"

Magn. Reson. Med., vol. 82, no. 1, pp. 33-48, 2019. [23] F. Lam, Y. Li, and X. Peng, "Constrained magnetic resonance spectroscopic imaging by learning nonlinear low-dimensional models,"

IEEE Trans. Med. Imaging, vol. 39, no. 3, pp. 545-555, 2019. [24] V. Govindaraju, K. Young, and A.A. Maudsley, "Proton NMR chemical shifts and coupling constants for brain metabolites,"

NMR Biomed., vol. 13, no. 3, pp. 129-153, 2000. [25] P.A. Bottomley, "Selective volume method for performing localized NMR spectroscopy," ed: Google Patents, 1984. [26] J. Xie and Q. Wang, "Benchmarking machine learning algorithms on blood glucose prediction for type 1 diabetes in comparison with classical time-series models,"

IEEE Trans. Biomed. Eng., vol. 67, no. 11, pp. 3101-3124, 2020. [27] M. Wang, C. Lian, D. Yao, D. Zhang, M. Liu, and D. Shen, "Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network,"

IEEE Trans. Biomed. Eng., vol. 67, no. 8, pp. 2241-2252, 2019. [28] Y. Wang, K. Lin, Y. Qi, Q. Lian, S. Feng, Z. Wu, and G. Pan, "Estimating brain connectivity with varying-length time lags using a recurrent neural network,"

IEEE Trans. Biomed. Eng., vol. 65, no. 9, pp. 1953-1963, 2018. [29] S. Hochreiter and J. Schmidhuber, "Long short-term memory,"

Neural Comput., vol. 9, no. 8, pp. 1735-1780, 1997. [30] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, "Automatic differentiation in pytorch," 2017. [31] D.P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, H spectra with LCModel,"

NMR Biomed., vol. 14, no. 4, pp. 260-264, 2001. [33] K.S. Opstad, B.A. Bell, J.R. Griffiths, and F.A. Howe, "Toward accurate quantification of metabolites, lipids, and macromolecules in HRMAS spectra of human brain tumor biopsies using LCModel,"

Magn. Reson. Med., vol. 60, no. 5, pp. 1237-1242, 2008. [34] G. Öz, J.R. Alger, P.B. Barker, R. Bartha, A. Bizzi, C. Boesch, P.J. Bolan, K.M. Brindle, C. Cudalbu, and A. Dinçer, "Clinical proton MR spectroscopy in central nervous system disorders,"