Coarse-to-fine Airway Segmentation Using Multi-information Fusion Network and CNN-based Region Growing
Jinquan Guo^a, Rongda Fu^a, Lin Pan^b, Shaohua Zheng^b, Liqin Huang^b, Bin Zheng^c,* and Bingwei He^a,**

^a School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350108, China
^b School of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
^c Thoracic Department, Fujian Medical University Union Hospital
ARTICLE INFO

Keywords: airway segmentation, multi-information fusion convolution neural network, voxel classification network

ABSTRACT
Background and Objectives: Automatic airway segmentation from chest computed tomography (CT) scans plays an important role in pulmonary disease diagnosis and computer-assisted therapy. However, low contrast at peripheral branches and the complex tree-like structure remain the two main challenges for airway segmentation. Recent research has illustrated that deep learning methods perform well in segmentation tasks. Motivated by these works, a coarse-to-fine segmentation framework is proposed to obtain a complete airway tree.

Methods: Our framework segments the overall airway and the small branches via the multi-information fusion convolution neural network (Mif-CNN) and CNN-based region growing, respectively. In Mif-CNN, atrous spatial pyramid pooling (ASPP) is integrated into a u-shaped network, where it expands the receptive field and captures multi-scale information. Meanwhile, boundary and location information are incorporated into the semantic information. These cues are fused to help Mif-CNN utilize additional context knowledge and useful features. To further improve the segmentation result, the CNN-based region growing method is designed to focus on obtaining small branches. A voxel classification network (VCN), which can capture the rich information around each voxel, is applied to classify the voxels into airway and non-airway. In addition, a shape reconstruction method is used to refine the airway tree.

Results: We evaluate our method on a private dataset and a public dataset from EXACT'09. Compared with the segmentation results from other methods, our method demonstrates promising accuracy in complete airway tree segmentation. On the private dataset, the Dice similarity coefficient (DSC), false positive rate (FPR), and sensitivity are 92.8%, 0.015%, and 88.6%, respectively. On EXACT'09, the DSC, FPR, and sensitivity are 95.8%, 0.053%, and 96.6%, respectively.

Conclusion: The proposed Mif-CNN and CNN-based region growing method segment the airway tree accurately and efficiently in CT scans. Experimental results also demonstrate that the framework is ready for application in computer-aided diagnosis systems for lung disease and other related work.
1. Introduction
Chronic obstructive pulmonary disease (COPD) is the third leading cause of death, and it accounts for more than a million deaths in China [1]. Computed tomography (CT) technology is an important tool for the qualitative and quantitative assessment of lung tissue function, and it can improve the accuracy of diagnosis and treatment of pulmonary diseases (e.g., COPD). Airway segmentation from chest CT images can be used for many applications. First, it can help doctors make clinical decisions (e.g., disease diagnosis, surgical navigation, and evaluation of disease evolution). Second, the lungs are anatomically subdivided on the basis of the airway tree, and thus the airway tree can facilitate the accurate definition of intersegmental demarcation, which is the most important step of thoracoscopic pulmonary segmentectomy. In addition, each airway branch is accompanied by an artery, and both structures have a similar orientation; doctors also use the airways to distinguish the corresponding arteries in clinical practice [2]. Experienced doctors manually label the airway from CT with interactive software such as MIMICS and ITK-SNAP [3]. However, manual segmentation is time consuming and susceptible to errors due to the large number of slices in CT images. Therefore, a robust automated airway segmentation method is necessary.

∗ Corresponding author: [email protected] (B. Zheng)
∗∗ Principal corresponding author: [email protected] (B. He)
ORCID(s): (B. He)

Automatic airway segmentation suffers from several challenges. First, the size, shape, and intensity of airway branches vary. Fig. 1(a) illustrates the differences between large and small branches in axial view. A large branch is easily identified, so the lumen and lumen wall can be separated. However, the boundaries of small branches are blurred and similar to the surrounding tissues. Second, the airway has a complex tree-like structure. Although airway branches share some common characteristics (e.g., direction and sequence order), the actual bronchi of different patients exhibit various appearances and shapes [4]. As a result, segmenting a complete airway tree becomes more difficult. Fig. 1(b) shows an incomplete airway tree obtained by U-Net [5]. Fig. 1(c) shows an accurate airway tree manually labeled by experienced doctors. Compared with the manually segmented tree, U-Net misses some important branches.

J. Guo, R. Fu, B. He et al.: Preprint submitted to Elsevier

Figure 1: (a) Axial view image showing the differences between large (red box) and small branches (green boxes), (b) incomplete airway tree, (c) accurate airway tree labeled by experienced radiologists

Conventional airway segmentation methods generally include region-growing-based methods, morphological methods, and rule-based methods. A comparison of fifteen conventional airway segmentation methods was summarized in the EXACT'09 challenge in 2012 [6]. Conventional airway segmentation, which may also depend on the quality of the CT scan, is often a tedious task. Fine-tuning parameters to strike a balance between obtaining more airway branches and avoiding leakages is necessary in these methods [7]. Moreover, conventional airway segmentation methods rely on the expertise and experience of researchers to extract features.

Recently, deep learning technology has made remarkable improvements in the field of computer vision and has become the most widely used approach in medical image segmentation. Qier et al. [8] proposed a method to segment the airway tree automatically by combining a fully convolutional network (FCN) with an image-based tracking algorithm. Charbonnier et al. [7] extracted the trachea and main bronchus with a region growing method, and a 2D convolution neural network (CNN) was used to segment small bronchi and remove leaks. Jin et al. [9] proposed a 3D FCN to generate a probability map; a graph-based method incorporating fuzzy connectedness segmentation was then applied to refine the FCN output and guide leakage removal. Juarez et al. [10] proposed a method based on 3D U-Net. They also investigated the importance of data augmentation and loss function selection for airway segmentation. Qin et al. [11] provided a voxel-connectivity-aware method, which focuses on reducing false positives and increasing the airway tree length. The authors transformed a binary segmentation task into 26 tasks of predicting whether a voxel is connected to its neighbors.
Zhao et al. [2] used a two-stage 2D+3D neural network to segment thick and thin bronchi separately. The results from the two stages were then combined by a linear-programming-based tracking algorithm. Qin et al. [12] replaced the bottom layer of the U-Net with 3D slice-by-slice convolutional layers, which can capture the spatial information of elongated structures and improve segmentation performance.

Although CNN-based methods are widely used for airway segmentation due to their high sensitivity and low false negative rate, deep learning methods still suffer from some shortcomings. First, these methods integrate multi-scale contextual information via successive pooling and subsampling layers that reduce resolution until a global prediction is obtained [13]. Segmenting small branches in the peripheral region is therefore difficult. Second, some deep learning approaches are mainly based on intensity features and ignore other information (e.g., location information and boundary information) that could potentially improve the performance of airway segmentation.

To solve the aforementioned problems, we propose a segmentation framework that contains two parts: a multi-information fusion CNN (Mif-CNN) and a CNN-based region growing method. The Mif-CNN is used to segment the overall airway. It combines multiple sources of information, i.e., boundary and location information, to utilize additional feature knowledge. An atrous spatial pyramid pooling (ASPP) block is also integrated into Mif-CNN to obtain multi-scale information. The CNN-based region growing method focuses on obtaining small branches. A voxel classification network (VCN) is applied to extract the airway voxel by voxel in the peripheral region.

Our main contributions are summarized as follows:
(1) We propose a coarse-to-fine segmentation framework to obtain the airway tree. Our method focuses on improving the completeness of the airway tree.
(2) We propose Mif-CNN, which integrates ASPP, an edge guidance module (EGM), and the coordinate information of voxels into a u-shaped network. The network can utilize additional useful feature information to improve segmentation performance.
(3) We propose a CNN-based region growing method to segment the bronchi. The VCN can capture the rich information around each voxel and facilitates the classification of voxels in peripheral regions into airway and non-airway voxels.
Figure 2: Schematic of the workflow of our coarse-to-fine airway segmentation framework (training: sample selection of the overall and small-branches labels, Mif-CNN trained with the Dice coefficient loss, VCN trained with the binary cross entropy loss; testing: initial result from Mif-CNN, CNN-based region growing with the VCN, and shape reconstruction to obtain the final result)
2. Method
In this section, we introduce our method for airway segmentation. Section 2.1 describes the selection of training samples. Sections 2.2 and 2.3 provide details of our Mif-CNN and the CNN-based region growing method, respectively. Lastly, Section 2.4 presents our shape reconstruction method based on a centerline tracking algorithm.

The workflow of our method is illustrated in Fig. 2. For the training process, we first use sample selection to obtain the small-branches label, which contains subsegmental bronchi and segmental bronchi whose diameters are less than 2 mm. The overall training label and the small-branches label are used to train the Mif-CNN and the VCN, respectively. The ASPP, EGM, and coordinate information are integrated into a u-shaped network in Mif-CNN; thus, additional context information and useful features are fused to improve segmentation performance. In the CNN-based region growing method, the VCN is applied to separate airway voxels from non-airway voxels in the peripheral region.

For the testing process, the original CT scan is the only input, and selecting samples is unnecessary. We first obtain an initial airway tree with Mif-CNN. Then, we extract airway candidate voxels around the ends of the initial tree. The VCN is used as a discriminator in the region growing method to classify voxels into airway or non-airway. A voxel with high probability is considered an airway voxel and is connected to the airway tree, thus becoming a new endpoint of the tree. The iterative update stops when the airway tree is unchanged. Lastly, the result of the VCN is refined by a shape reconstruction method.
2.1. Sample selection

The coarse branches are easy to extract because of their large volume in the airway. However, small branches are very thin and difficult to segment. We plan to learn the characteristics of the coarse branches and the small branches separately because of their differences in location, size, shape, and intensity. Thus, the training data of Mif-CNN and the VCN are different. Our sampling strategy is as follows:
Dividing airway branches: We obtain an airway skeleton tree with N branches {B_1, B_2, ..., B_N} by the iterative backtracking algorithm presented in [14]. Then we obtain the diameter d_{i,j} of each voxel v_{i,j}, j ∈ {1, 2, ..., K_i}, in branch B_i. The branch diameter is computed as the average of the diameters of all its centerline voxels:

D_{B_i} = (1 / K_i) Σ_{j=1}^{K_i} d_{i,j}    (1)

Every branch is also labeled with its corresponding anatomical name by a bronchus classification algorithm. All branches are then classified into four types: trachea and main bronchus, lobar bronchi, segmental bronchi, and subsegmental bronchi. We further sample two subsets from these branches in line with their branch measurements and anatomical level: (a) the first subset S_1 contains the whole airway tree; (b) the second subset S_2 contains segmental bronchi whose diameter is less than 2 mm and subsegmental bronchi.

Sampling for the Mif-CNN: We obtain volumes of interest (VOIs) from the first subset S_1 to train Mif-CNN. Considering the limitation of GPU memory, we apply overlapped sliding windows with VOIs sized 64 × 64 × 64 voxels and a stride of 32 during training and testing.

Sampling for the VCN:
Our voxel classification network is trained to classify a voxel into airway or non-airway in the peripheral region. Therefore, we select sample points from S_2 and obtain VOIs centered on these voxels.
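For concreteness, the overlapped sliding-window extraction of 64³ VOIs with stride 32 described above can be sketched as follows. This is a sketch, not the authors' code: `extract_vois` and `_starts` are hypothetical helpers, and the border handling (shifting the last window so the volume edge is always covered) is our assumption.

```python
import numpy as np

def _starts(dim, size, stride):
    # Window start indices along one axis; the last window is shifted
    # so that the volume border is always covered (an assumption).
    s = list(range(0, max(dim - size, 0) + 1, stride))
    if dim > size and s[-1] != dim - size:
        s.append(dim - size)
    return s

def extract_vois(volume, size=64, stride=32):
    """Cut a 3D CT volume (D, H, W) into overlapped cubic VOIs.

    Returns a list of (origin, voi) pairs, where origin is the (z, y, x)
    corner of each size^3 window. Assumes every dimension is >= size.
    """
    d, h, w = volume.shape
    vois = []
    for z in _starts(d, size, stride):
        for y in _starts(h, size, stride):
            for x in _starts(w, size, stride):
                vois.append(((z, y, x),
                             volume[z:z + size, y:y + size, x:x + size]))
    return vois
```

At test time, the per-window predictions would be stitched back together, e.g., by averaging the overlapping regions; that aggregation step is not shown here.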
Figure 3: The basic network of Mif-CNN: a u-shaped encoder-decoder built from down-sampling and up-sampling blocks (3D convolution, batch normalization, ReLU, max pooling, upsampling, and concatenation), with an ASPP bottleneck, a coordinates map concatenated in the decoder, and an edge map supervised against the ground truth
We randomly select 3000 points from S_2 in each CT scan as positive samples. Then we select non-airway voxels around the selected airway voxels. We divide the non-airway points into five subsets by measuring the Euclidean distance to the airway: in the five subsets, the distance between a non-airway voxel and the airway is 1 voxel, 2–4 voxels, 5–7 voxels, 8–10 voxels, and 11–30 voxels, respectively. We randomly select 600 points in every non-airway subset and thus obtain 3000 non-airway points as negative samples in total. For each sample point, we extract a VOI sized 32 × 32 × 32 voxels centered on the point.

2.2. Multi-information fusion CNN

The architecture of Mif-CNN is shown in Fig. 3. An encoder-decoder structure expands the receptive field via successive pooling and subsampling layers, and it gradually reduces the spatial resolution. Features of thin bronchi, whose diameters are usually only 2–3 voxels, are prone to vanish after three pooling operations, making them difficult to segment and recover. To maintain image resolution, our Mif-CNN contains only two pooling layers. However, multiple pooling layers are normally necessary to extract effective context information for large bronchi. Our method is inspired by dilated (atrous) convolution, which can increase the receptive field while keeping the number of kernel parameters unchanged. We combine the u-shaped structure and dilated convolution to avoid shrinking the feature maps while still detecting multi-scale objects. To further improve segmentation performance, we use boundary information and coordinate information to integrate more useful features.
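The distance-stratified sample selection for the VCN described in Section 2.1 can be sketched as follows. `sample_points` is a hypothetical helper, and the exact band bookkeeping (e.g., handling of empty bands) is our assumption.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def sample_points(airway_mask, n_pos=3000, n_per_band=600, rng=None):
    """Pick positive voxels on the small-branch airway and negatives
    stratified by Euclidean distance to the airway (1, 2-4, 5-7, 8-10,
    and 11-30 voxels), following the paper's sampling strategy."""
    rng = np.random.default_rng(rng)
    pos = np.argwhere(airway_mask)
    pos = pos[rng.choice(len(pos), min(n_pos, len(pos)), replace=False)]

    # Distance of every non-airway voxel to the nearest airway voxel.
    dist = distance_transform_edt(~airway_mask)
    neg = []
    for lo, hi in [(1, 1), (2, 4), (5, 7), (8, 10), (11, 30)]:
        cand = np.argwhere((dist >= lo) & (dist <= hi))
        take = min(n_per_band, len(cand))
        if take:
            neg.append(cand[rng.choice(len(cand), take, replace=False)])
    return pos, np.concatenate(neg) if neg else np.empty((0, 3), dtype=int)
```

A 32³ VOI centered on each sampled point would then be cropped as the VCN input.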
Atrous spatial pyramid pooling:
ASPP is useful for expanding the receptive field and resampling features at multiple scales [15]. To capture context information at multiple ranges, we adopt an ASPP module as the bottom layer of the encoding path of the network. As shown in Fig. 4, the ASPP module is mainly composed of two parts: (a) one 1 × 1 convolution and three 3 × 3 convolutions with dilation rates of 6, 12, and 18; (b) a global average pooling layer followed by a 1 × 1 convolution. The first part expands the receptive field and obtains multi-scale feature maps; the second part is used to overcome the problem of effective-weight reduction at long ranges. Moreover, all the convolution operations are followed by batch normalization (BN) layers and rectified linear units (ReLU). Finally, the feature maps generated from the five branches are concatenated and sent to a 1 × 1 convolution with BN.

Figure 4: The basic architecture of ASPP
Edge guidance module:
CNN methods are mainly based on intensity features and ignore other information. Boundary information is an essential characteristic for segmenting targets in medical images, and it can improve feature detection for segmentation [16]. We therefore exploit the fact that low-level features are rich in spatial details and can provide sufficient boundary information. We first detect local boundary information from the low-level feature map, and then propagate the location information of this layer from top to bottom. Moreover, the feature map is also fed to two successive convolution layers and supervised by an edge map M_e derived from the ground truth masks.
Coordinate information:
We also consider the location information of voxels beneficial for the network to segment the airway. Therefore, we convert the three-dimensional coordinates of the voxels into a three-channel feature map consistent with the network size, and then concatenate it with the feature map of the last layer in the decoder part of Mif-CNN.
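Such a three-channel coordinate map can be built as in the following sketch; normalizing each axis to [0, 1] is our assumption, as the paper does not state the normalization.

```python
import numpy as np

def coordinate_maps(shape):
    """Build a 3-channel map of normalized (z, y, x) voxel coordinates,
    shaped (3, D, H, W), to be concatenated with the last decoder
    feature map (a sketch; [0, 1] normalization is assumed)."""
    d, h, w = shape
    z, y, x = np.meshgrid(
        np.linspace(0.0, 1.0, d),
        np.linspace(0.0, 1.0, h),
        np.linspace(0.0, 1.0, w),
        indexing="ij")
    return np.stack([z, y, x], axis=0).astype(np.float32)
```

In training, the map would be converted to a tensor and concatenated along the channel dimension with the decoder output before the final convolution.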
Loss function:
As shown in Fig. 3, Mif-CNN has two head branches. We use a binary cross entropy (BCE) loss to supervise the feature map of the EGM, which is formulated as:

Γ_edge = −[ Σ_{x ∈ M_e} log(p_e(x)) + Σ_{x ∉ M_e} log(1 − p_e(x)) ]    (2)

where x denotes a voxel and p_e(x) is the predicted probability that x is an edge voxel.

The second branch is trained to predict whether each voxel is an airway voxel. The loss function used for this voxel-level classification problem is the Dice coefficient loss:

Γ_seg = 1 − (2 Σ_i p_i g_i) / (Σ_i p_i + Σ_i g_i + ε)    (3)

where p_i and g_i ∈ {0, 1} denote the predicted segmentation result and the label, respectively, and ε is a smoothing term that avoids division by zero.

Finally, we define our total loss Γ_total as the combination of Γ_seg and Γ_edge:

Γ_total = Γ_seg + Γ_edge    (4)

2.3. CNN-based region growing

Conventional region growing methods segment the airway based on density (in Hounsfield units, HU). However, since the density of airway voxels is close to that of the surrounding tissue at peripheral branches, these methods perform poorly there. To extract more airway voxels, we propose a CNN-based region growing method that combines deep learning and region growing. In this work, we first initialize the seed points: the voxels at the terminals of the airway tree obtained from Mif-CNN are selected as the initial seeds and pushed onto a stack. Unlike the work in [17], we classify the unprocessed voxels in the 26-neighborhood of a seed point into airway and non-airway voxels with a 3D CNN (the VCN). The VCN produces the probability that a voxel is an airway voxel. A threshold T_u (T_u = 0.8) is introduced to discriminate airway from non-airway: a voxel with a probability higher than T_u is considered an airway voxel, selected as a new seed, and pushed onto the stack. The airway tree keeps growing as long as the stack is not empty.

The architecture of the VCN is shown in Fig. 5.
It consists of three down-sampling blocks and two fully connected layers. The down-sampling operation encodes high-level semantic features. Each down-sampling block consists of two 3 × 3 convolution layers with BN and ReLU, followed by a 2 × 2 max pooling operation with stride 2. The numbers of kernels in the convolution layers are 16, 16, 32, 32, 64, and 64. After down-sampling, we feed the feature map into two successive fully connected layers with 2048 and 128 neurons, respectively. Finally, we calculate the probability of an airway voxel with a softmax function after the last fully connected layer. We use a binary cross entropy (BCE) loss to supervise the output:

Γ_s = −y log p − (1 − y) log(1 − p)    (5)

where y ∈ {0, 1} is the label and p ∈ [0, 1] is the predicted probability.
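The stack-based growing procedure described above can be sketched as follows, with a placeholder `classify` callable standing in for the trained VCN. Seeding with every voxel of the initial tree (rather than only its endpoints) is a simplification we make for brevity, and the border handling is our assumption.

```python
import numpy as np
from itertools import product

def cnn_region_growing(ct, initial_tree, classify, t_u=0.8, patch=32):
    """Region growing where a learned classifier replaces the HU
    threshold (a sketch). `classify` takes a patch^3 sub-volume centered
    on a voxel and returns an airway probability in [0, 1]."""
    tree = initial_tree.copy()
    visited = tree.copy()
    stack = [tuple(p) for p in np.argwhere(initial_tree)]
    half = patch // 2
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    while stack:
        z, y, x = stack.pop()
        for dz, dy, dx in offsets:  # the 26-neighborhood of the seed
            n = (z + dz, y + dy, x + dx)
            if any(c < half or c >= s - half for c, s in zip(n, ct.shape)):
                continue  # skip voxels whose patch would leave the volume
            if visited[n]:
                continue
            visited[n] = True
            voi = ct[n[0] - half:n[0] + half,
                     n[1] - half:n[1] + half,
                     n[2] - half:n[2] + half]
            if classify(voi) >= t_u:
                tree[n] = True
                stack.append(n)  # an accepted voxel becomes a new seed
    return tree
```

Growth stops automatically once the stack empties, i.e., when no neighbor of any accepted voxel exceeds the threshold, which matches the stopping criterion described in the text.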
Figure 5: Scheme of the VCN architecture. S_2 contains segmental bronchi whose diameter is less than 2 mm and subsegmental bronchi. VOIs of the voxels from S_2 are used as the input of the VCN.

2.4. Shape reconstruction

Our segmentation framework is robust due to its high-quality output and low false negative rate. Our method rarely produces large leakages, but it can cause small leakages near the peripheral bronchi. Some small boundary protrusions and branch-like structures adjacent to airway branches also exist. Hence, we propose a shape reconstruction method based on a centerline tracking algorithm to remove these leakages. The detailed description is as follows:

(1) Skeleton detection: We use the iterative backtracking algorithm presented in [14] to compute the skeleton of the airway obtained by our network. Then we remove some wrong branches and construct an airway skeleton tree.

(2) Airway reconstruction: For each voxel v_{i,j}, we calculate the average diameter of the voxel and its four centerline neighbors, d'_{i,j} = (1/5) Σ_{k=−2}^{2} d_{i,j+k}, and take it as the new diameter of v_{i,j}. We then reconstruct the airway by computing, for each centerline voxel, the ball with diameter d'_{i,j}.

(3) Result refinement: The reconstructed airway is used to refine the segmentation results. We fill the small fractures where d_{i,j} < d'_{i,j}, and voxels outside the radius of the reconstructed airway are removed.
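The five-point diameter averaging in step (2) amounts to a moving average along each branch's centerline. The sketch below assumes diameters are replicated at the branch ends, which the paper does not specify.

```python
import numpy as np

def smooth_diameters(d):
    """Five-point moving average of centerline diameters along one
    branch: d'_{i,j} = (1/5) * sum_{k=-2..2} d_{i,j+k}. Edge handling
    by replicating the end diameters is our assumption."""
    d = np.asarray(d, dtype=float)
    padded = np.pad(d, 2, mode="edge")       # replicate the two end values
    kernel = np.ones(5) / 5.0
    return np.convolve(padded, kernel, mode="valid")
```

The smoothed diameters would then drive the ball-painting reconstruction, and voxels outside the reconstructed radius would be removed as in step (3).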
3. Experiments and Results
In this section, several experiments are conducted to evaluate the Mif-CNN and VCN segmentation methods. In Section 3.1, our private dataset and the public dataset of EXACT'09 are presented. In Section 3.2, we introduce the evaluation metrics and experimental settings. In Sections 3.3 and 3.4, we show the segmentation results of Mif-CNN and the CNN-based region growing method, respectively. Given that our study focuses on extracting small branches in the peripheral region, both the overall airway tree and the airway tree with the trachea and main bronchus removed are evaluated.
3.1. Datasets

We evaluated our airway segmentation method on two datasets: private chest CT scans and public CT scans of EXACT'09 [6].
Private dataset:
It consists of 20 CT scans. Each slice of the CT scans has the same size of 512 × 512 pixels, with an in-plane spatial resolution ranging from 0.625 mm to 1 mm. The slice thickness varies from 0.5 mm to 1.25 mm, and the number of slices in each CT scan ranges from 237 to 441. On the basis of the interactive segmentation results of ITK-SNAP, the ground truths were manually corrected by two experienced doctors [3].
EXACT’09:
It consists of 20 CT scans from the training dataset of this challenge. All slices have a size of 512 × 512 pixels, with a pixel size ranging from 0.5 mm to 0.78 mm in axial view. The number of slices in each CT scan ranges from 157 to 764, and the slice thickness varies from 0.45 mm to 1.0 mm. We obtained the annotations of the 20 public CT scans from Qin [18]. The acquisition and investigation of the data conform to the principles outlined in the Declaration of Helsinki [19].
3.2. Evaluation metrics and experimental settings

To evaluate the performance of our methods, four metrics based on area overlap are used: Dice similarity coefficient (DSC), false positive rate (FPR), sensitivity (Sen), and precision (Pre). These metrics are defined as follows:
DSC = 2 |A_pre ∩ A_gt| / (|A_pre| + |A_gt|)    (6)

FPR = FP / (FP + TN)    (7)

Sen = TP / (TP + FN)    (8)

Pre = TP / (TP + FP)    (9)

where A_pre and A_gt denote the segmentation result and the ground truth, respectively. TP denotes predicted-positive voxels that belong to the true airway; FP denotes predicted-positive voxels that belong to non-airway; FN denotes predicted-negative voxels that belong to the true airway; and TN denotes predicted-negative voxels that belong to non-airway. We also compute these metrics after removing the trachea and main bronchus from both predictions and ground truth.

In our experiments, we randomly chose 12 cases from the private dataset and 12 cases from the public dataset for training. We then chose 3 cases from each dataset as the validation set; the remaining 10 cases form the test set. We implemented our method in PyTorch and fine-tuned the hyper-parameters on the training data. For the Mif-CNN, the Adam optimizer (β_1 = 0.9, β_2 = 0.999) is used with a learning rate of 1.0e-04 during training, and the maximum number of training epochs is set to 200. For the voxel classification network, SGD is used with a learning rate of 1.0e-04. We finally use the model with the best validation results for testing.

3.3. Segmentation results of Mif-CNN

We evaluate the performance of our Mif-CNN on our private dataset and the public dataset of EXACT'09. To verify the segmentation performance of our Mif-CNN, we compare it with three state-of-the-art methods, i.e., those of Jin [9], Juarez [10], and 3D U-Net [5]. We reimplement these methods in PyTorch and fine-tune their hyperparameters. Detailed experimental results are shown as follows:
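With the standard confusion-matrix definitions, the four metrics above can be computed from binary volumes as in the following sketch (`overlap_metrics` is a hypothetical helper, not the authors' code):

```python
import numpy as np

def overlap_metrics(pred, gt):
    """DSC, FPR, sensitivity, and precision from two binary volumes,
    using the standard confusion-matrix definitions."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)     # predicted airway, truly airway
    fp = np.sum(pred & ~gt)    # predicted airway, truly background
    fn = np.sum(~pred & gt)    # missed airway voxels
    tn = np.sum(~pred & ~gt)   # correctly rejected background
    return {
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "FPR": fp / (fp + tn),
        "Sen": tp / (tp + fn),
        "Pre": tp / (tp + fp),
    }
```

Because the airway occupies a tiny fraction of a chest CT, TN dominates the denominator of FPR, which is why the reported FPR values are small percentages even when the absolute number of false positives is non-trivial.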
Table 1
Results of the proposed Mif-CNN in comparison with state-of-the-art methods (mean ± standard deviation) on the private testing dataset while evaluating the whole airway.

Method       DSC (%)    FPR (%)        Sen (%)    Pre (%)
U-Net [5]    91.6±1.6   0.018±0.007    88.7±3.6   94.8±1.7
Jin [9]      92.4±2.3   0.022±0.010    88.8±4.9   95.7±1.5
Juarez [10]  92.8±2.6   0.015±0.007    88.6±4.8   97.4±1.3
Mif-CNN      93.5±2.4   0.020±0.009    90.8±5.2   96.7±1.4
Table 2
Results of the proposed Mif-CNN in comparison with state-of-the-art methods (mean ± standard deviation) on the private testing dataset while evaluating the airway without trachea and main bronchus.

Method       DSC (%)    FPR (%)        Sen (%)    Pre (%)
U-Net [5]    81.3±5.6   0.017±0.007    74.2±8.6   87.7±2.7
Jin [9]      82.1±4.9   0.016±0.006    76.5±8.8   91.7±3.2
Juarez [10]  83.4±5.6   0.012±0.008    75.1±9.1   92.8±2.2
Mif-CNN      84.6±5.7   0.018±0.009    78.8±9.6   92.3±2.4
The results of the four segmentation methods for the overall airway tree are presented in Table 1. On average, our method achieves the highest DSC of 93.5%, which is 0.7% higher than the second-best segmentation performance, that of Juarez [10]. In terms of sensitivity, our method obtains a 2.0% improvement
Figure 6:
Comparison of airway segmentation results between other methods and the ground truth. Rows 1 to 3 show three different subjects. From left to right are the results of the ground truth, Mif-CNN, Jin, Juarez, and U-Net, respectively.

than the second-best performance, that of Jin [9]. The experimental results illustrate that our method outperforms the other methods in detecting a complete airway tree.

The results of the four segmentation methods for the airway tree without trachea and main bronchus are presented in Table 2. Compared with U-Net, the DSC and sensitivity increase from 81.3% and 74.2% to 84.6% and 78.8%, respectively. In terms of FPR and precision, our method achieves 0.02% and 96.7%. This also shows that our method outperforms the other methods in the peripheral region.

Fig. 6 shows the segmentation results of three different subjects as 3D renderings. Compared with Jin (3rd column), Juarez (4th column), and U-Net (5th column), our Mif-CNN (2nd column) extracts more branches in the upper left lobe and upper right lobe, which are prone to lesions and nodules. A better airway segmentation result will facilitate the screening and diagnosis of lesions. Specifically, the result of our method is close to the ground truth. In contrast, U-Net misses many important branches in some cases. Jin [9] and Juarez [10] improve the segmentation results, but their performance still leaves something to be desired in detecting more peripheral branches.

Axial views of three subjects for qualitative analysis are shown in Fig. 7. In the 1st row, our method (2nd column) does not lose important branches. As can be seen from the 2nd and 3rd rows, our method, compared with the other methods, still performs well in the low-contrast peripheral area, where it is difficult to distinguish the airway from the surrounding tissues.

For the public dataset, comparison results of the proposed method and the state-of-the-art methods are illustrated in Table 3. Our Mif-CNN achieves 95.8%, 0.053%, 95.0% and
Table 3
Results of the proposed Mif-CNN in comparison with state-of-the-art methods (mean ± standard deviation) on the public testing dataset while evaluating the whole airway.

Table 4
Results of the proposed Mif-CNN in comparison with state-of-the-art methods (mean ± standard deviation) on the public testing dataset while evaluating the airway without trachea and main bronchus.
Figure 7:
Axial view images of three different subjects, shown from Row 1 to Row 3. From left to right are the results of the ground truth, Mif-CNN, Jin, Juarez, and U-Net, respectively.

method, respectively. Considering that our purpose is to extract more airway branches, this false positive rate does not affect the overall result. Therefore, these results verify the feasibility and effectiveness of Mif-CNN.
3.4. Segmentation results of the CNN-based region growing

Although Mif-CNN (Fig. 8(c)) can extract more airway branches than U-Net (Fig. 8(b)), it still misses several branches. Our CNN-based region growing method focuses on classifying airway voxels at the peripheral branches, and the VCN is applied to produce the probability of whether a voxel belongs to the airway. Therefore, we extract airway-candidate VOIs from the sample points located in the 26-neighborhoods of the endpoints of the initial airway and classify them as airway or non-airway with the VCN. The airway result is updated using only the voxels with high probability (≥ T_u).

Our Mif-CNN is mainly based on integrating multiple basic modules and prior information. Thus, the effectiveness of the key components of our network must be studied through detailed ablation experiments. We evaluate three key components of our network: the coordinate information, the ASPP, and the EGM. The ablation study results are shown in Table 5. The trachea and main bronchus are not included in this evaluation.
(1)
(1) Effectiveness of Coordinate Information: Theoretically, location information facilitates object targeting. Thus, we add coordinate information to our network and further investigate its role. The results (No. 2 vs. No. 1, No. 5 vs. No. 3, No. 6 vs. No. 4) show that the models with coordinate information have higher precision and lower FPR than those without. The coordinate information restricts the spatial position of the airway and avoids leakage of the peripheral bronchi. However, the models with coordinate information have lower sensitivity than those without.
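A common way to inject such coordinate information, in the spirit of CoordConv, is to concatenate normalized coordinate grids to the input as extra channels. A minimal NumPy sketch follows; the exact normalization and channel layout used in Mif-CNN may differ.

```python
import numpy as np

def add_coordinate_channels(volume):
    """Append three normalized coordinate channels (z, y, x, each in [0, 1])
    to a single-channel 3D volume.

    volume: array of shape (1, D, H, W); returns shape (4, D, H, W).
    """
    _, d, h, w = volume.shape
    zz, yy, xx = np.meshgrid(
        np.linspace(0.0, 1.0, d),
        np.linspace(0.0, 1.0, h),
        np.linspace(0.0, 1.0, w),
        indexing="ij",  # keep (D, H, W) axis order
    )
    coords = np.stack([zz, yy, xx]).astype(volume.dtype)
    return np.concatenate([volume, coords], axis=0)
```

Each voxel then carries its own spatial position as a feature, which is what lets the network learn position-dependent constraints on where airway voxels can occur.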
(2) Effectiveness of ASPP: We also explore the contribution of the ASPP module. Because atrous convolution expands the receptive field without reducing spatial resolution, our network can exploit multi-scale information and has an improved ability to detect small objects. As shown in Table 5, No. 4 performs better than the other settings (Nos. 1, 2, and 3) and achieves the highest sensitivity at 78.3%. This result indicates that the ASPP module, which can resample features at multiple scales, can obtain
Figure 8:
Refined segmentation results using our voxel classification network. (a) Gold standard labeled by an experienced radiologist; (b) result of U-Net; (c) initial output of Mif-CNN; (d) final result of the voxel classification network.
more airway voxels. Although the ASPP module results in a slightly higher FPR, this has little effect on the overall result. The extraction of additional peripheral branches confirms the effectiveness of this module.

Table 5
Ablation study of the proposed Mif-CNN.

Method                           DSC (%)  FPR (%)  Sen (%)  Pre (%)
(No. 1) U-Net                    81.3     0.017    74.2     93.1
(No. 2) U-Net + CI               82.2     0.015    74.9     93.6
(No. 3) U-Net + EGM              82.1     0.020    76.4     94.2
(No. 4) U-Net + ASPP             82.2     0.024    78.3     91.6
(No. 5) U-Net + CI + ASPP        82.9     0.021    77.3     93.2
(No. 6) U-Net + CI + EGM         82.6     0.018    76.8     95.1
(No. 7) U-Net + CI + ASPP + EGM  84.6     0.016    78.8     92.3
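The receptive-field gain from atrous convolution is easy to quantify: a kernel of size k with dilation rate d spans k + (k − 1)(d − 1) voxels per axis. A small sketch follows; the rates (1, 6, 12, 18) are the common DeepLab defaults [15], used here only for illustration, and the rates in Mif-CNN may differ.

```python
def dilated_kernel_extent(kernel_size, dilation):
    """Effective spatial extent (per axis) of a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def aspp_extents(kernel_size=3, rates=(1, 6, 12, 18)):
    """Effective kernel extents of the parallel ASPP branches."""
    return [dilated_kernel_extent(kernel_size, r) for r in rates]

print(aspp_extents())  # → [3, 13, 25, 37]
```

A 3×3×3 kernel at rate 18 thus covers a 37-voxel extent with the same number of weights and no loss of resolution, which is why ASPP helps capture both local and long-range context for thin branches.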
(3) Effectiveness of EGM: Boundary information is essential for segmentation targets in medical images, and it can improve edge feature detection for segmentation. We use an EGM to propagate rich boundary information from low-level features to a high-level feature map. In this manner, boundary information can improve segmentation performance. Compared with the setting without edge constraint guidance (No. 1), our method improves sensitivity by 2.2% and precision by 1.1%.
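The boundary maps that supervise an edge guidance module can be derived directly from the segmentation mask. The paper does not specify its edge-extraction step, so the following is a minimal NumPy sketch using face-connected neighbors (6-connectivity in 3D), equivalent to subtracting a one-voxel erosion from the mask.

```python
import numpy as np

def boundary_map(mask):
    """Boundary voxels of a binary mask: foreground voxels with at least one
    face-connected background neighbor. Out-of-bounds counts as background."""
    mask = mask.astype(bool)
    eroded = mask.copy()
    for axis in range(mask.ndim):
        for step in (1, -1):
            # neighbor of each voxel along `axis`, border padded with background
            neighbor = np.zeros_like(mask)
            src = [slice(None)] * mask.ndim
            dst = [slice(None)] * mask.ndim
            if step == 1:
                src[axis], dst[axis] = slice(0, -1), slice(1, None)
            else:
                src[axis], dst[axis] = slice(1, None), slice(0, -1)
            neighbor[tuple(dst)] = mask[tuple(src)]
            eroded &= neighbor  # keep only voxels whose neighbor is foreground
    return mask & ~eroded
```

Because thin peripheral bronchi are almost entirely boundary voxels under this definition, an edge-supervised loss naturally puts extra weight on exactly the structures that are hardest to segment.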
4. Discussion
In the experiments, we evaluate the performance of the proposed Mif-CNN and VCN. In Section 3.3.1, we first compare Mif-CNN with other methods, including those of Jin and Juarez and U-Net, on the private dataset. The results indicate that our method achieves better performance on the private dataset than on the public one. The DSC and sensitivity for whole-airway segmentation are 93.5% and 90.8%, respectively. This result is promising compared with that of 3D U-Net, which reaches 91.6% in DSC and 88.7% in sensitivity. The DSC and sensitivity of the quantitative results without the trachea and main bronchus also show that Mif-CNN improves airway segmentation. In addition, the qualitative analysis illustrates that the segmentation result of Mif-CNN is good and close to the ground truth.

In Section 3.3.2, we compare the performance of Mif-CNN and other methods on the public dataset of EXACT'09. As shown in Table 3, our method performs well but does not show a significant difference compared with other methods. Nevertheless, our method outperforms the others when the trachea and main bronchus are removed from the airway tree (Table 4). The reason is that the lobar and segmental bronchi account for only approximately 30% of the total airway volume; when the segmentation achieves a slight improvement in the peripheral region, the numerical change in metrics computed over the whole airway tree is small.

In Section 3.4, we study the improvement of segmentation results by the CNN-based region growing method. The initial airway tree of Mif-CNN is better than that of U-Net and is close to the ground truth. However, it still misses several branches. Fig. 8 shows that the result of the CNN-based region growing method is better than the initial airway tree, indicating that the voxel-by-voxel classification approach based on the deep neural network is useful.
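The metrics discussed above can be computed from binary volumes as follows. This is a sketch under common definitions; in particular, the FPR denominator here is all ground-truth-negative voxels, which may differ from the exact definition used in the paper.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """DSC, FPR, sensitivity, and precision (all in %) for binary volumes."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)     # correctly labeled airway voxels
    fp = np.sum(pred & ~gt)    # background voxels labeled as airway
    fn = np.sum(~pred & gt)    # airway voxels missed by the prediction
    tn = np.sum(~pred & ~gt)   # correctly labeled background voxels
    return {
        "DSC": 100.0 * 2 * tp / (2 * tp + fp + fn),
        "FPR": 100.0 * fp / (fp + tn),
        "Sen": 100.0 * tp / (tp + fn),
        "Pre": 100.0 * tp / (tp + fp),
    }
```

Because tn (the lung and body background) dwarfs the airway volume, even visible leakage yields a tiny FPR, which is why the paper reads sensitivity and precision alongside it.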
The VCN can entirely capture the rich information around each voxel and improve segmentation performance.

Our study focuses on extracting more airway branches in the peripheral region. Thus, we also discuss the results on the two datasets when evaluating the airway tree without the trachea and main bronchus. As shown in Tables 2 and 4, our method achieves the best result in the peripheral region regardless of which dataset is used. We attribute this result to the EGM and coordinate information, which fuse additional explicit knowledge to produce useful features. In addition, the ASPP can expand the receptive field and detect multi-scale information, thus improving segmentation performance. Our false positive rate is higher than that of some methods because some of these errors might be due to terminal branches missing from the reference standard; our method can successfully extract these missing peripheral branches. However, these real branches are counted as false positive branches when
calculating the FPR metric. The multiple-dataset validation also demonstrates that our method is robust and reliable by fusing additional useful information.

In the literature, a few studies have focused on automatic airway segmentation. One of the classic methods is that of [20], in which a vessel-guided airway segmentation algorithm that extracts additional airway voxels by combining vessel information was proposed; its TPR reached up to 98.68%. A conference paper [11] also leveraged the distance to the lung border and voxel coordinates to improve airway segmentation. This operation integrated additional semantic information to help the network learn more airway features, and the DSC was 90.2% on 10 chest CT scans. These studies suggest that fusing additional feature information can improve segmentation performance; thus, we use an EGM to integrate additional edge information. The authors of [21] focused on small branches: a random forest classifier was applied to classify each voxel as airway or non-airway. In [22], a 2.5D CNN was used to classify each voxel at the peripheral branches. Three patches from the axial, coronal, and sagittal views of each voxel are fed into the network, and the CNN is then used for voxel classification. The DSC of this method reached up to 89.97%. However, the data and implementations of different studies vary; thus, comparing different quantitative results objectively is difficult.
5. Conclusion
This paper has presented a fully automatic method for the segmentation of airways from chest CT scans. The core of this method is a coarse-to-fine framework that segments the overall airway tree and the small airway branches, respectively. The framework contains two parts: Mif-CNN and CNN-based region growing. In Mif-CNN, the ASPP, the EGM, and coordinate information are incorporated into a u-shaped network, so it can utilize more useful features to improve segmentation performance. The CNN-based region growing method can extract branches that are missing from the result of Mif-CNN. In addition, a shape reconstruction method based on a centerline tracking algorithm is employed to refine the final segmentation result. Experimental results on a private dataset and a public dataset show that this method performs well in the upper lobe region, which is prone to lesions and nodules. The multiple-dataset validation also demonstrates the reliability and practicality of the proposed method. In the future, we plan to apply our segmentation results to other related tasks, such as artery/vein separation and the determination of lung segments.
Acknowledgement
This work was supported by the Natural Science Foundation of Fujian Province, China (Grant No. 2020J01472) and the Provincial Science and Technology Leading Project (Grant No. 2018Y0032).
References

[1] Chen Wang, Jianying Xu, Lan Yang, et al. Prevalence and risk factors of chronic obstructive pulmonary disease in China (the China Pulmonary Health [CPH] study): a national cross-sectional study. The Lancet, 391(10131), 2018.
[2] Tianyi Zhao, Zhaozheng Yin, Jiao Wang, Dashan Gao, Yunqiang Chen, and Yunxiang Mao. Bronchus segmentation and classification by neural networks and linear programming. Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, pages 230–239, 2019.
[3] E. M. van Rikxoort, W. Baggerman, and B. van Ginneken. Automatic segmentation of the airway tree from thoracic CT scans using a multi-threshold approach. Second International Workshop on Pulmonary Image Analysis, 486:341–349, 2009.
[4] M. Sonka, Wonkyu Park, and E. A. Hoffman. Rule-based detection of intrathoracic airway trees. IEEE Transactions on Medical Imaging, 15(3):314–326, 1996.
[5] Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, 9901:424–432, 2016.
[6] P. Lo, B. van Ginneken, J. M. Reinhardt, T. Yavarna, P. A. de Jong, B. Irving, C. Fetita, M. Ortner, R. Pinho, J. Sijbers, M. Feuerstein, A. Fabijanska, C. Bauer, R. Beichel, C. S. Mendoza, R. Wiemker, J. Lee, A. P. Reeves, S. Born, O. Weinheimer, E. M. van Rikxoort, J. Tschirren, K. Mori, B. Odry, D. P. Naidich, I. Hartmann, E. A. Hoffman, M. Prokop, J. H. Pedersen, and M. de Bruijne. Extraction of airways from CT (EXACT'09). IEEE Transactions on Medical Imaging, 31(11):2093–2107, 2012.
[7] Jean-Paul Charbonnier, Eva M. van Rikxoort, Arnaud A. A. Setio, Cornelia M. Schaefer-Prokop, Bram van Ginneken, and Francesco Ciompi. Improving airway segmentation in computed tomography using leak detection with convolutional networks. Medical Image Analysis, 36:52–60, 2017.
[8] Qier Meng, Holger R. Roth, Takayuki Kitasaka, Masahiro Oda, Junji Ueno, and Kensaku Mori. Tracking and segmentation of the airways in chest CT using a fully convolutional network. Medical Image Computing and Computer Assisted Intervention – MICCAI 2017, 10434:198–207, 2017.
[9] Dakai Jin, Ziyue Xu, Adam P. Harrison, Kevin George, and Daniel J. Mollura. 3D convolutional neural networks with graph refinement for airway segmentation using incomplete data labels. International Workshop on Machine Learning in Medical Imaging, 10541:141–149, 2017.
[10] Antonio Garcia-Uceda Juarez, H. A. W. M. Tiddens, and M. de Bruijne. Automatic airway segmentation in chest CT using convolutional neural networks. Image Analysis for Moving Organ, Breast, and Thoracic Images, 11040:238–250, 2018.
[11] Yulei Qin, Mingjian Chen, Hao Zheng, Yun Gu, Mali Shen, Jie Yang, Xiaolin Huang, Yue-Min Zhu, and Guang-Zhong Yang. AirwayNet: A voxel-connectivity aware approach for accurate airway segmentation using convolutional neural networks. Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, 11769:212–220, 2019.
[12] Yulei Qin, Hao Zheng, Yun Gu, Xiaolin Huang, Jie Yang, Lihui Wang, and Yue-Min Zhu. Learning bronchiole-sensitive airway segmentation CNNs by feature recalibration and attention distillation. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020, 12261:221–231, 2020.
[13] Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. International Conference on Learning Representations (ICLR), 2016.
[14] S. Liu, D. Zhang, Y. Song, H. Peng, and W. Cai. Automated 3D neuron tracing with precise branch erasing and confidence controlled back tracking. IEEE Transactions on Medical Imaging, 37(11):2441–2452, 2018.
[15] Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation. CoRR, abs/1706.05587, 2017.
[16] J. Zhao, J. Liu, D. Fan, Y. Cao, J. Yang, and M. Cheng. EGNet: Edge guidance network for salient object detection. IEEE/CVF International Conference on Computer Vision (ICCV), pages 8778–8787, 2019.
[17] Eva van Rikxoort, W. Baggerman, and B. van Ginneken. Automatic segmentation of the airway tree from thoracic CT scans using a multi-threshold approach. The Second Workshop on Pulmonary Image Analysis, pages 341–349, 2009.
[18] Y. Qin, Y. Gu, H. Zheng, M. Chen, J. Yang, and Y. Zhu. AirwayNet-SE: A simple-yet-effective approach to improve airway segmentation using context scale fusion. IEEE International Symposium on Biomedical Imaging (ISBI), pages 809–813, 2020.
[19] World Medical Association. Declaration of Helsinki: ethical principles for medical research involving human subjects. Assistenza Infermieristica e Ricerca, 20(2):104, 2001. (Article in Italian.)
[20] Pechin Lo, Jon Sporring, Haseem Ashraf, Jesper J. H. Pedersen, and Marleen de Bruijne. Vessel-guided airway tree segmentation: A voxel classification approach. Medical Image Analysis, 14(4):527–538, 2010.
[21] Z. Bian, J. P. Charbonnier, J. Liu, D. Zhao, D. A. Lynch, and B. van Ginneken. Small airway segmentation in thoracic computed tomography scans: a machine learning approach. Physics in Medicine and Biology, 63(15), 2018.
[22] Jihye Yun, Jinkon Park, Donghoon Yu, Jaeyoun Yi, Minho Lee, Hee Jun Park, June-Goo Lee, Joon Beom Seo, and Namkug Kim. Improvement of fully automated airway segmentation on volumetric computed tomographic images using a 2.5 dimensional convolutional neural net. Medical Image Analysis, 51:13–20, 2019.