Super-resolution of clinical CT volumes with modified CycleGAN using micro CT volumes
Tong ZHENG *1, Hirohisa ODA *1, Takayasu MORIYA *1, Takaaki SUGINO *1, Shota NAKAMURA *2, Masahiro ODA *1, Masaki MORI *3, Hirotsugu TAKABATAKE *4, Hiroshi NATORI *5, Kensaku MORI *1,6,7
Abstract
This paper presents a super-resolution (SR) method trained on an unpaired dataset of clinical CT and micro CT (μCT) volumes. To obtain very detailed information, such as cancer invasion, from pre-operative clinical CT volumes of lung cancer patients, SR of clinical CT volumes to the μCT level is desired. While most SR methods require paired low- and high-resolution images for training, it is infeasible to obtain paired clinical CT and μCT volumes. We propose an SR approach based on CycleGAN that performs SR of clinical CT to the μCT level. We also propose new loss functions that maintain cycle consistency while training without paired volumes. Experimental results demonstrated that the proposed method successfully performed SR of clinical CT volumes of lung cancer patients to the μCT level.
Keywords : Super-resolution, Clinical CT, μCT, CycleGAN, Unpaired learning
1. Introduction
Lung cancer causes the largest number of deaths per year among cancers in men [1]. Currently, precise diagnosis of lung cancer depends mainly on clinical CT volumes. However, the low resolution of clinical CT does not provide enough pathological information. Super-resolution (SR) of clinical CT to a μCT-like level is therefore desired.
Deep learning-based methods have been shown to outperform other methods in SR. These approaches are often supervised and require aligned pairs of low-resolution (LR) and high-resolution (HR) patches to train a model. However, it is infeasible to obtain spatially corresponding patch pairs of clinical CT and μCT because registration between them is difficult. SR methods that can be trained on unpaired images are therefore desired.

One of the first approaches to formalize translation from a source domain to a target domain in the absence of paired examples is CycleGAN [2]. For instance, pictures of zebras are converted into pictures of horses. Nevertheless, CycleGAN is not designed for SR.

*1 Graduate School of Informatics, Nagoya University (Furou-cho, Chikusa-ku, Nagoya 464-0814, Japan), e-mail: [email protected]
*2 Nagoya University Graduate School of Medicine
*3 Sapporo-Kosei General Hospital
*4 Sapporo Minami-sanjo Hospital
*5 Keiwakai Nishioka Hospital
*6 Information Technology Center, Nagoya University
*7 Research Center of Medical Bigdata, National Institute of Informatics
In this paper, we propose an SR method that brings clinical CT to the μCT level using a modified CycleGAN. Unpaired clinical CT and μCT volumes are used for training.
2. Overview
Prior to inference, the network must be trained using clinical CT and μCT volumes. For inference, patches clipped from clinical CT volumes are input. In our study, the scale of the original μCT volumes is at least 8 times larger than that of the clinical CT volumes. Because of this, we consider 8-times SR to be the most appropriate. The inputs of our networks are 2D patches clipped from the volumes. The input clinical CT patch size is 32×32 pixels, while the input μCT patch size is 256×256 pixels.

1) Network structure
Figure 1 shows the network structure of the proposed method. The first input is a clinical CT patch $A$, and generator $G$ generates the corresponding SR patch $A_{SR}$ (a fake μCT patch) from $A$. As in CycleGAN, this generator aims to produce image patches that resemble those in the target domain (the μCT domain) by trying to fool the discriminator $D$, which judges whether a patch is fake or real. In the opposite direction, the same is done for a μCT patch $B$: the generator $G$ and discriminator $D$ of the μCT-to-clinical-CT direction produce and judge a fake clinical CT patch $B_{LR}$, which keeps the proposed framework consistent.

Figure 1. Network structure used for training. Compared to the original CycleGAN, the proposed method uses additional loss functions to maintain cycle consistency. We calculate the SSIM loss between $A$ and $A_{SR}$ and between $B$ and $B_{LR}$. Further, we calculate the downsample loss between the average-pooled $A_{SR}$ and $A$, as well as the upsample loss between $B$ and the upsampled (nearest-neighbor interpolated) $B_{LR}$.

2) Loss functions
Like CycleGAN, our method uses cycle consistency while training the network. However, in the SR problem, the cycle consistency between corresponding LR and SR images differs from that in the image translation problem. In the SR problem, corresponding LR and SR images should be similar in structure and average intensity, which the loss function used in the original CycleGAN cannot ensure. Here we propose several loss functions in our pipeline to create this cycle consistency (blue blocks in Fig. 1). Without cycle consistency, the network would simply produce an arbitrary patch in the target domain with no relationship to the structures contained in the input patch.

The first loss function we propose to keep cycle consistency is the downsample loss. It is defined to maintain similarity while transforming a clinical CT patch to the μCT scale as

$$l_{\mathrm{downsample}}(A) = \mathrm{MSE}(A, f(A_{SR})),$$

where $f()$ is an average pooling function that reduces $A_{SR}$, the SR patch 8 times larger than $A$, to the same size as $A$, and MSE is the mean squared error. Analogously, we define the second loss function, the upsample loss, as

$$l_{\mathrm{upsample}}(B) = \mathrm{MSE}(B, g(B_{LR})),$$

where $g()$ is the nearest-neighbor interpolation function, which upsamples the generated clinical-CT-like patch $B_{LR}$ to the original size of $B$.

Although the first and second loss functions can keep cycle consistency during training, both depend on intensity differences between generated and target image patches, which do not match perceived visual quality well. We therefore propose third and fourth loss functions, which we name the clinical-SSIM loss and the micro-SSIM loss:

$$l_{\mathrm{clinical\text{-}SSIM}} = 1 - \mathrm{SSIM}(A, f(A_{SR})),$$

$$l_{\mathrm{micro\text{-}SSIM}} = 1 - \mathrm{SSIM}(B, g(B_{LR})),$$

where SSIM is the structural similarity proposed in [3]. During training, the third and fourth loss functions help prevent the model from generating blurred image patches.
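The four consistency losses can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions: patches are 2D float arrays with intensities in [0, 1], the SSIM is computed over the whole patch in a single window (the measure in [3] is normally averaged over local windows), the SSIM-based losses are taken as 1 − SSIM, and all function names are hypothetical.

```python
import numpy as np

SCALE = 8  # SR factor stated in the text (32x32 -> 256x256)


def average_pool(patch, k=SCALE):
    """f(): shrink a (H, W) patch by averaging non-overlapping k x k blocks."""
    h, w = patch.shape
    return patch.reshape(h // k, k, w // k, k).mean(axis=(1, 3))


def nearest_upsample(patch, k=SCALE):
    """g(): enlarge a (H, W) patch by repeating every pixel k times per axis."""
    return np.repeat(np.repeat(patch, k, axis=0), k, axis=1)


def mse(x, y):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((x - y) ** 2))


def downsample_loss(a, a_sr):
    """MSE between clinical patch A and the average-pooled SR patch A_SR."""
    return mse(a, average_pool(a_sr))


def upsample_loss(b, b_lr):
    """MSE between muCT patch B and the upsampled clinical-like patch B_LR."""
    return mse(b, nearest_upsample(b_lr))


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole patch (assumed simplification)."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants with the usual
    c2 = (0.03 * data_range) ** 2  # K1 = 0.01, K2 = 0.03 defaults
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(num / den)


def clinical_ssim_loss(a, a_sr):
    """Assumed form: 1 - SSIM(A, f(A_SR))."""
    return 1.0 - global_ssim(a, average_pool(a_sr))


def micro_ssim_loss(b, b_lr):
    """Assumed form: 1 - SSIM(B, g(B_LR))."""
    return 1.0 - global_ssim(b, nearest_upsample(b_lr))


# Synthetic patches standing in for real data.
rng = np.random.default_rng(0)
a = rng.random((32, 32))       # clinical CT patch A
b = rng.random((256, 256))     # muCT patch B
a_sr = nearest_upsample(a)     # stand-in for the generator output A_SR
b_lr = average_pool(b)         # stand-in for the generator output B_LR
print(average_pool(a_sr).shape, nearest_upsample(b_lr).shape)  # (32, 32) (256, 256)
```

With the stand-in `a_sr` above, the downsample loss and the clinical-SSIM loss are both zero up to floating-point error, because average pooling exactly undoes nearest-neighbor upsampling. A trained generator would instead add high-frequency detail while keeping these block averages close to the input.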
3) Training
We train the network using 2D clinical CT patches as the input of the clinical-CT-to-μCT generator $G$ and 2D μCT patches as the input of the μCT-to-clinical-CT generator $G$. The output of the former is a generated μCT-like SR patch, and discriminator $D$ is used to judge whether this output is real or fake. Furthermore, for more stable training, we mix downsampled μCT patches into the clinical CT patches used as input. The percentage of downsampled μCT patches is 25%.

4) Inference
For testing, we input a 2D patch clipped from a clinical CT volume into the trained generator $G$. The output is an SR patch based on the input patch.

Table 1. Profiles of clinical CT and μCT volumes

| | μCT | Clinical CT |
|---|---|---|
| Pixels per slice | 1024×1024 | 512×512 |
| Number of slices | 545 to 1082 | 435 to 554 |
| Pixel size | 34 to 53 μm | 0.625 mm |
| Slice thickness | 34 to 53 μm | 0.6 mm |
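The mixing of downsampled μCT patches into the clinical CT inputs, described in the training step, could be sketched as follows. The text states only the 25% share. The rest is assumed for illustration: the share is applied per batch, μCT patches are downsampled by 8×8 average pooling, and `mixed_lr_batch` with its parameters is a hypothetical helper.

```python
import numpy as np


def average_pool(patch, k=8):
    """Shrink a (H, W) patch by averaging non-overlapping k x k blocks."""
    h, w = patch.shape
    return patch.reshape(h // k, k, w // k, k).mean(axis=(1, 3))


def mixed_lr_batch(clinical, micro, batch_size=8, micro_fraction=0.25, seed=0):
    """Build one batch of low-resolution generator inputs.

    clinical: array (N, 32, 32) of clinical CT patches.
    micro:    array (M, 256, 256) of muCT patches.
    Returns the batch and a boolean mask marking the downsampled muCT entries.
    """
    rng = np.random.default_rng(seed)
    n_micro = int(round(batch_size * micro_fraction))
    n_clin = batch_size - n_micro
    clin_idx = rng.choice(len(clinical), size=n_clin, replace=False)
    micro_idx = rng.choice(len(micro), size=n_micro, replace=False)
    patches = [clinical[i] for i in clin_idx]
    patches += [average_pool(micro[i]) for i in micro_idx]
    from_micro = np.array([False] * n_clin + [True] * n_micro)
    order = rng.permutation(batch_size)  # shuffle so the sources are interleaved
    return np.stack(patches)[order], from_micro[order]


# Synthetic stand-ins for real patches.
data_rng = np.random.default_rng(1)
clinical = data_rng.random((16, 32, 32))
micro = data_rng.random((4, 256, 256))
batch, from_micro = mixed_lr_batch(clinical, micro)
print(batch.shape, int(from_micro.sum()))  # (8, 32, 32) 2
```

Because downsampled μCT patches have a known high-resolution counterpart, they plausibly give the generator some inputs whose target structure is available, which is one way to read the stated stabilizing effect.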
We utilized five lung cancer cases in the experiment. Clinical CT volumes were acquired from lung cancer patients with a SOMATOM Definition Flash (SIEMENS, Germany). After surgical dissection of the lung cancers, μCT volumes of the dissected specimens were acquired with a μCT scanner, inspeXio SMX-90CT Plus (SHIMADZU, Japan). The profiles of the clinical CT and μCT volumes are listed in Table 1. The clinical CT and μCT volumes were obtained from the same patients, so each patient has one clinical CT volume and one μCT volume. We used five cases of clinical CT and μCT for training and trained for 200 epochs. For testing, we used one case of clinical CT.

SR results of the proposed method were compared with bicubic interpolation and the original CycleGAN, as shown in Fig. 2. The SR results show more detail than the bicubic interpolation results. Lung anatomy, such as the bronchus, appears more clearly than with bicubic interpolation. The original CycleGAN produced results that differ greatly from the original clinical CT volumes.
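Applying a trained generator to a test slice patch by patch might look like the sketch below. The paper states only that 32×32 clinical CT patches are fed to the generator. The non-overlapping tiling, the stitching of the 256×256 outputs, and the names `super_resolve_slice` and `placeholder_generator` are assumptions, and the placeholder simply repeats pixels in place of a trained network.

```python
import numpy as np

PATCH = 32  # clinical CT patch size from the text
SCALE = 8   # SR factor from the text


def super_resolve_slice(ct_slice, generator, patch=PATCH, scale=SCALE):
    """Tile a 2D slice into patches, run the generator, and stitch the outputs."""
    h, w = ct_slice.shape
    if h % patch or w % patch:
        raise ValueError("slice size must be a multiple of the patch size")
    out = np.empty((h * scale, w * scale), dtype=ct_slice.dtype)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            sr = generator(ct_slice[y:y + patch, x:x + patch])
            out[y * scale:(y + patch) * scale,
                x * scale:(x + patch) * scale] = sr
    return out


def placeholder_generator(p):
    """Stand-in for the trained generator: nearest-neighbor upsampling only."""
    return np.repeat(np.repeat(p, SCALE, axis=0), SCALE, axis=1)


# A small synthetic slice; a 512x512 clinical slice would give 4096x4096.
ct_slice = np.random.default_rng(0).random((64, 64)).astype(np.float32)
sr_slice = super_resolve_slice(ct_slice, placeholder_generator)
print(sr_slice.shape)  # (512, 512)
```

Non-overlapping tiles keep the sketch short. A learned generator may produce visible seams at tile borders, which overlapping tiles with blending would reduce.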
4. Discussion and conclusion
We proposed a novel SR method trained on an unpaired dataset of clinical CT and micro CT volumes. New loss functions are introduced to keep cycle consistency in the SR task. Experimental results showed that our method could apply SR on clinical CT to the μCT level.

Because the proposed method is trained on unpaired data, no corresponding ground truth exists for a given input, which makes quantitative evaluation of the output difficult. Quantitative evaluation of the SR results is left for future work.

Figure 2. Example and comparison of the proposed method. (a) Original clinical CT, (b) proposed SR result, (c) bicubic interpolation, and (d) CycleGAN (without the proposed loss functions).
Competing interests
None.
Acknowledgement
Parts of this research were supported by MEXT·JSPS KAKENHI (26108006, 17H00867, 17K20099), the JSPS Bilateral International Collaboration Grants, AMED (18lk1010028s0401, 19lk1010036h0001), and the Hori Sciences & Arts Foundation.
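References
[1] Vital Statistics Japan (Ministry of Health, Labour and Welfare)
[2] Zhu J, Park T, Isola P, et al.: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. IEEE International Conference on Computer Vision: 2242-2251, 2017
[3] Wang Z, Bovik AC, Sheikh HR, et al.: Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans Image Processing 13(4): 600-612, 2004

Super-resolution of clinical CT images by a modified Cycle-GAN using μCT
Tong ZHENG *1, Hirohisa ODA *1, Takayasu MORIYA *1, Takaaki SUGINO *1, Shota NAKAMURA *2, Masahiro ODA *1, Masaki MORI *3, Hirotsugu TAKABATAKE *4, Hiroshi NATORI *5, Kensaku MORI *1,6,7
*1 Graduate School of Informatics, Nagoya University
*2 Nagoya University Graduate School of Medicine
*3 Sapporo-Kosei General Hospital
*4 Sapporo Minami-sanjo Hospital
*5 Keiwakai Nishioka Hospital
*6 Information Technology Center, Nagoya University
*7 Research Center of Medical Bigdata, National Institute of Informatics
This paper proposes a super-resolution method for clinical CT images. To obtain disease-related information, such as the extent of tumor invasion, from clinical CT images of lung cancer cases, a method that applies super-resolution to clinical CT of the lung and reaches μCT-level resolution is needed. Supervised learning in most super-resolution methods requires corresponding pairs of low- and high-resolution images, but accurately registering image pairs of clinical CT and μCT is difficult. We modify CycleGAN and realize an unpaired super-resolution method by introducing new loss functions that preserve consistency in the mutual conversion between clinical CT images and μCT images. The experimental results showed that super-resolution of clinical CT images to the μCT level was possible.
Keywords: Super-resolution, Clinical CT, μCT, CycleGAN, Unpaired learning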