Cranial Implant Design via Virtual Craniectomy with Shape Priors
Franco Matzkin, Virginia Newcombe, Ben Glocker, Enzo Ferrante
Research Institute for Signals, Systems and Computational Intelligence, sinc(i), CONICET, FICH-UNL (Argentina)
Division of Anaesthesia, Department of Medicine, University of Cambridge (UK)
BioMedIA, Imperial College London (UK)
Abstract.
Cranial implant design is a challenging task, whose accuracy is crucial in the context of cranioplasty procedures. This task is usually performed manually by experts using computer-assisted design software. In this work, we propose and evaluate alternative automatic deep learning models for cranial implant reconstruction from CT images. The models are trained and evaluated using the database released by the AutoImplant challenge, and compared to a baseline implemented by the organizers. We employ a simulated virtual craniectomy to train our models using complete skulls, and compare two different approaches trained with this procedure. The first one is a direct estimation method based on the UNet architecture. The second method incorporates shape priors to increase the robustness when dealing with out-of-distribution implant shapes. Our direct estimation method outperforms the baselines provided by the organizers, while the model with shape priors shows superior performance when dealing with out-of-distribution cases. Overall, our methods show promising results in the difficult task of cranial implant design.
Keywords:
Skull reconstruction · self-supervised learning · decompressive craniectomy · shape priors

Cranioplasty is a surgical procedure aimed at repairing a skull vault defect by insertion of a bone or nonbiological implant (e.g. metal or plastic) [1]. Such a skull defect may exist for different reasons, such as a brain tumor removal procedure or a decompressive craniectomy surgery following a traumatic brain injury [12]. Cranial implant design is usually performed by experts using computer-aided design software specifically tailored for this task [2]. The AutoImplant challenge, organized for the first time at MICCAI 2020, aims at benchmarking the latest developments in computational methods for cranial implant reconstruction. In this work, we propose and evaluate two approaches to solve this task using deep learning models.
Previous works on skull and cranial implant reconstruction suggest that deep learning models are good candidates to solve this task. In [13], a denoising autoencoder was used to perform skull reconstruction, following an approach similar to the recently proposed Post-DAE method [7,6]. In this case, a denoising autoencoder is trained to reconstruct full skulls from corrupted versions. However, the model proposed in [13] works with skulls extracted from magnetic resonance images, can only handle low resolution images and was evaluated on the full-skull reconstruction task. Here we focus on reconstructing the flap only, on skulls extracted from high resolution and anisotropic computed tomography (CT) images. Other approaches rely on a head symmetry assumption and propose to take advantage of it to reconstruct the missing parts by mirroring the complete side of the skull [4]. However, this is not a realistic assumption, since missing flaps may occur on both sides simultaneously. Another alternative could be the subtraction of the aligned pre- and post-operative CT scans. Unfortunately, this requires access to the pre-operative image, which may not be available in real clinical scenarios.

Recently, we proposed [11] a simple virtual craniectomy procedure which enables training different deep learning models in a self-supervised way, given a dataset composed of full skulls. In that work, we compared two different approaches: direct estimation of the implant, and reconstruct-and-subtract (RS) strategies where the full skull is first reconstructed, and then the original image is subtracted from it to generate a difference map. We evaluated different architectures and concluded that direct estimation produces more accurate estimates than RS strategies, since the latter tend to generate noise in areas far from the flap.
A different approach has been introduced by the AutoImplant challenge organizers [9], which also employs deep learning models, but works in two steps. First, a low resolution version of the image is reconstructed to localize the area where the defected region is located. Then, they extract a 3D patch from the high resolution image and process it using a second neural network trained for fine implant prediction.

In this work, based on the conclusions from [11], we employ a direct estimation method that operates on full skulls which are rigidly registered to an atlas and resampled to an intermediate resolution. Aligning the images allows us to work in a common space, which simplifies the reconstruction task. We adapt the virtual craniectomy procedure to account for more realistic flap shapes, similar to the ones introduced in the AutoImplant challenge. Moreover, we propose to incorporate anatomical priors into the standard direct estimation model introduced in [11] by feeding the registered skull atlas as an extra image channel. Previous works [8] have shown that incorporating approximate shape priors as additional image channels is a simple yet effective way to increase the anatomical plausibility of the segmentations, since it provides supplementary context information to the network. We compare the results of our two methods with those obtained by the baseline benchmark model introduced in [9], showing the superiority of our approach.
Fig. 1.
Examples of images from the D_test set and D_test-extra (out-of-distribution cases). As can be observed, images from D_test follow a common pattern, while those in D_test-extra present different defects with various shapes.

The AutoImplant challenge organizers provided 100 images for training (D_train) and 110 images for testing. From the 110 test images, 100 of them (denoted here as D_test) have simulated surgical defects which follow the same distribution as the ones on the training images, while the remaining 10 (denoted as D_test-extra) have defects which do not follow the same distribution (see Figure 1). The images were selected from the CQ500 public database [3]. They have a fixed image dimension in the axial plane (512 x 512) and a variable number of axial slices Z.

The training dataset (D_train) is composed of triplets (X_full, X_defected, Y), where X_full is the full skull, X_defected corresponds to the defected skull and Y to the removed defect that we aim at reconstructing. For the test images, only the X_defected images were released. We evaluated the proposed methods on the test images and submitted the results to the organizers, who computed the metrics reported in this paper. It is important to note that, in order to avoid overfitting to the test data, we could submit our results a maximum of 5 times.

The database can be accessed at: http://headctstudy.qure.ai/dataset
Fig. 2.
Modified virtual craniectomy procedure. We incorporated new template shapes for the virtual craniectomy to account for the pattern found in the AutoImplant challenge dataset.
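The flap removal illustrated in Figure 2 can be sketched as follows, for the spherical template case only (the cubic and cube-plus-cylinders templates follow the same mask-and-subtract pattern; the function name and toy volume are illustrative, not the paper's implementation):

```python
import numpy as np

def virtual_craniectomy(x_full: np.ndarray, center, radius: int):
    """Remove a spherical bone flap from a binary skull volume.

    Returns (x_defected, y): the defected skull and the removed flap,
    which together form a self-supervised training pair.
    """
    zz, yy, xx = np.ogrid[:x_full.shape[0], :x_full.shape[1], :x_full.shape[2]]
    cz, cy, cx = center
    template = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    skull = x_full.astype(bool)
    y = np.logical_and(skull, template)            # removed flap (target)
    x_defected = np.logical_and(skull, ~template)  # defected skull (input)
    return x_defected.astype(np.uint8), y.astype(np.uint8)

# Usage: the pair reconstructs the full skull exactly (x_defected | y == x_full).
skull = np.zeros((32, 32, 32), np.uint8)
skull[8:24, 8:24, 8:24] = 1  # toy "skull" block
xd, y = virtual_craniectomy(skull, center=(8, 16, 16), radius=5)
assert np.array_equal(np.logical_or(xd, y).astype(np.uint8), skull)
```

In training, the template center would be sampled along the upper part of the skull and salt-and-pepper noise added, as described in the text.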
The proposed cranial implant reconstruction methods operate on the space of binary volumetric masks. Such a binary skull can be obtained by simply thresholding a brain CT image according to the Hounsfield scale, or by applying more sophisticated methods. In the AutoImplant challenge, the skulls were already provided as binary volumes, extracted from the CT images using thresholding and additional post-processing steps (for further details we refer to [9]). Since the training data includes the full skulls, we leveraged the virtual craniectomy procedure proposed in [11] to train our models.
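For the thresholding route mentioned above, a minimal sketch (the challenge data is already binarized, so this is only illustrative; the 150 HU cutoff is an assumption, and real pipelines add morphological post-processing):

```python
import numpy as np

def binarize_skull(ct_hu: np.ndarray, threshold: float = 150.0) -> np.ndarray:
    """Threshold a CT volume (in Hounsfield units) to a binary skull mask.

    Bone is typically well above soft tissue on the Hounsfield scale; the
    exact cutoff used here is an assumption and would be tuned in practice.
    """
    return (ct_hu >= threshold).astype(np.uint8)

# Toy example: soft tissue (40 HU) vs. bone (700 and 1200 HU).
ct = np.array([[[40.0, 700.0, 1200.0]]])
print(binarize_skull(ct))  # [[[0 1 1]]]
```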
Given a full skull, we designed a virtual craniectomy procedure which consists in removing a bone flap using a template located at a random position along its upper part. In [11], spherical template shapes were used. By visual inspection of the AutoImplant training data, we observed that defects tend to follow a pattern given by the intersection of the skull with a cube with two cylinders over the edges perpendicular to the axial planes. We therefore designed a variable-size template shape which produces similar defects, as shown in Figure 2. To increase the diversity of our training procedure, we also included spherical and cubic templates of random sizes (all three shapes were selected with equal probability).

The virtual craniectomy was used as a data augmentation mechanism to generate a variety of training samples from a limited amount of full skulls, resulting in a self-supervised learning approach where no annotated skull defects are required for training. We also included salt and pepper noise in the input images with probability 0.01. Moreover, we also considered the defective skulls provided
Fig. 3. (a) The images are first registered to an atlas space, and resampled to a common resolution. We store the resulting transform T and its inverse T⁻¹. (b) We compare two different approaches for the implant reconstruction task. The first one is a standard DE-UNet model. The second one incorporates a shape prior by considering the atlas as an extra input channel to the network. After prediction, the segmentation mask is mapped back to the original image space using the inverse transform T⁻¹.

by the organizers as part of our datasets (in these cases, virtual craniectomy was not performed). During training, we sampled images coming from both sources: simulated virtual craniectomies and defective skulls provided by the organizers.

Before training, all the images were rigidly registered to a common space determined by a full skull atlas. It consists of a thresholded version of a full-skull head CT atlas constructed by averaging several healthy head CT images. This atlas allowed us to normalize the images by resampling them to an intermediate resolution. We chose this resolution to be 0.695 x 0.695 x 0.715 mm (resulting in a volume of 304 x 304 x 224 voxels) because it was the maximum size we managed to fit in GPU memory. Moreover, aligning the images in a common space simplifies the reconstruction task for the neural network, since it can focus on shape variations which are more relevant to the reconstruction task than translations and rotations. We used the FLIRT software package [5] for rigid registration. At test time, given a test defective image X_defected_i, we apply the same registration procedure, which returns a transformation T and its inverse T⁻¹. The transformation is applied to the original image, T ∘ X_defected_i. The estimated skull defect Ŷ_i is reconstructed in the common space, and the final estimate in the original space is recovered by applying the inverse transformation T⁻¹ ∘ Ŷ_i.
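The round trip described above (warp to atlas space, predict, map back with T⁻¹) can be sketched with scipy. The paper obtains the rigid transform with FLIRT; here the 4x4 voxel-space matrix T is assumed to be given, and the helper names are illustrative:

```python
import numpy as np
from scipy.ndimage import affine_transform

def to_atlas_space(volume: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Resample a binary volume with rigid transform T (4x4, voxel coords).

    affine_transform pulls output voxels back through the given matrix, so
    to apply T we pass its inverse. order=0 keeps the mask binary.
    """
    Tinv = np.linalg.inv(T)
    return affine_transform(volume, Tinv[:3, :3], offset=Tinv[:3, 3], order=0)

def to_original_space(prediction: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Map a prediction made in atlas space back to the original space (T⁻¹)."""
    return affine_transform(prediction, T[:3, :3], offset=T[:3, 3], order=0)

# Usage with a pure translation by 2 voxels along the last axis.
T = np.eye(4)
T[2, 3] = 2.0
vol = np.zeros((8, 8, 8))
vol[4, 4, 4] = 1
back = to_original_space(to_atlas_space(vol, T), T)
assert back[4, 4, 4] == 1  # round trip recovers the original voxel
```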
Our first method is a direct estimation model which follows the same architecture as the DE-UNet used in [11]. It is a standard 3D UNet encoder-decoder architecture with skip connections, trained using a compound loss which combines Dice and cross-entropy terms [14] (for more details, we refer to our work in [11]). After reconstruction, the segmentation is re-mapped to the original resolution using the inverse transform T⁻¹, as previously discussed. The model is trained using batches with full volume images, pre-aligned in the common space and resampled to an intermediate resolution as previously discussed.

Fig. 4. Comparison of the results for the proposed methods in terms of Dice and Hausdorff Distance (HD). HD is shown in log scale for better visualization.

Since the DE-UNet model is a fully convolutional architecture, the receptive field of the model is mainly determined by the number of layers and the parameters of the pooling and convolution operations. In other words, the local support of the output predictions is restricted to a certain area in the input image. When we have to reconstruct big or out-of-distribution skull defects, it may happen that most of the image support for certain parts of it is background, so the network may have no context to infer the implant shape. To overcome this limitation and make our model robust, we propose to incorporate context via shape priors given as an extra channel to the segmentation network. Previous works [8] have shown that this simple extension can boost the robustness of existing state-of-the-art pixel-wise approaches in medical image segmentation tasks.

We take advantage of the fact that images are co-registered to a common space, and use the same skull atlas as shape prior. After registration, we concatenate the resampled image with the atlas as an extra input channel, and train the network following the same strategy discussed before. In this case, the shape prior acts as a kind of initialization for the network's output, providing additional context that is especially useful to reconstruct out-of-distribution defects. We refer to this model as DE-Shape-UNet.
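The compound Dice plus cross-entropy loss used to train both models can be written as L = L_Dice + λ·L_BCE. Below is a framework-agnostic numpy sketch of the loss math (the actual models are trained in PyTorch; eps and the toy arrays are illustrative):

```python
import numpy as np

def dice_loss(p: np.ndarray, y: np.ndarray, eps: float = 1e-7) -> float:
    """Soft Dice loss on predicted probabilities p and binary target y."""
    inter = float(np.sum(p * y))
    return 1.0 - (2.0 * inter + eps) / (float(np.sum(p) + np.sum(y)) + eps)

def bce_loss(p: np.ndarray, y: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over voxels (clipped for stability)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def compound_loss(p: np.ndarray, y: np.ndarray, lam: float = 1.0) -> float:
    """L = L_Dice + lambda * L_BCE (lambda = 1 in the paper)."""
    return dice_loss(p, y) + lam * bce_loss(p, y)

# Usage: a confident, correct prediction scores a lower loss than a poor one.
y = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.95, 0.05, 0.9, 0.1])
bad = np.array([0.3, 0.7, 0.4, 0.6])
assert compound_loss(good, y) < compound_loss(bad, y)
```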
Table 1.
Quantitative results obtained for the two proposed methods (DE-UNet and DE-Shape-UNet) compared with the two baselines reported by the challenge organizers in [9]. We report the mean Dice and HD values, and the standard deviation in parentheses.

Method          | D_test (100)        | D_test-extra (10)   | Overall
                | Dice    | HD (mm)   | Dice    | HD (mm)   | Dice    | HD (mm)
Baseline N1 [9] | 0.809   | 5.440     | -       | -         | -       | -
Baseline N2 [9] | 0.855   | 5.182     | -       | -         | -       | -
DE-UNet   DE-Shape-UNet   0.845 (0.107)   6.414 (9.060)
The models were implemented in Python, using the PyTorch 1.4 library. We trained and evaluated the CNNs using an NVIDIA TITAN Xp GPU with 12GB of RAM. The same virtual craniectomy and data augmentation procedure was used to train both models. In both cases we used a compound loss function which combines Dice loss and Binary Cross Entropy (BCE) as L = L_Dice + λ L_BCE (the parameter λ was set to 1 by grid search). Both models followed the DE-UNet architecture described in [11]; the only difference between them was that we concatenated the atlas as an extra input channel in the DE-Shape-UNet model. For optimization, we used Adam with an initial learning rate of 1e-4. The batch size was set to 1 due to memory restrictions. The models were trained for 50 epochs. The 100 training images were split into 95 images for training and 5 for validation. After 50 epochs, we kept the model that achieved the best accuracy on the validation fold.

Figure 4 and Table 1 include a quantitative comparison of the results. We report the Dice coefficient and Hausdorff distance measured on D_test (100 images), D_test-extra (10 images) and the whole test dataset. We observe that DE-Shape-UNet presents better performance for out-of-distribution cases (D_test-extra), while DE-UNet outperforms the other model on the D_test set. Since the whole test dataset is composed of 100 images from D_test and only 10 images from D_test-extra, the DE-UNet model shows better performance in the overall comparison. Moreover, the DE-UNet model outperforms the two baseline models (N1 and N2) reported by the organizers in [9]. Figure 5 provides some visual examples of reconstructions obtained with both methods on samples from D_test and D_test-extra.

Fig. 5. Examples of different reconstructions from D_test (cases which follow the same pattern as the training dataset, shown in rows 1 and 2) and D_test-extra (out-of-distribution case, shown in row 3). As we can observe, both methods performed well on the image depicted in row 1. For the case in the 2nd row, even if the DE-Shape-UNet model managed to reconstruct the implant, the quality of the reconstruction is lower than that of the DE-UNet. The opposite happened with the image in row 3 (an out-of-distribution case from D_test-extra), where the model which incorporated shape priors managed to reconstruct the implant, while the DE-UNet failed in this task.

In this work, we evaluated two different approaches for cranial implant reconstruction based on deep learning: a direct estimation method and an alternative strategy which incorporates shape priors. We adapted the virtual craniectomy procedure proposed in [11] to the defect distribution of the AutoImplant challenge. We found that the simple DE-UNet method produces more accurate results for the skull defects which follow the same distribution as those in the training dataset. However, for out-of-distribution cases where the DE-UNet model tends to fail, the use of shape priors increases the robustness of the model, providing additional context to the network. In our implementation, this gain in robustness for out-of-distribution cases was achieved to the detriment of the overall accuracy. In future work, we plan to study alternative ways to introduce shape priors, e.g. considering deformable registration with anatomical constraints [10] to the atlas space instead of rigid transformations, or incorporating shape priors in a co-registration and segmentation process [15].

The authors gratefully acknowledge NVIDIA Corporation for the donation of the Titan Xp GPU used for this research, and the support of UNL (CAID-PIC-50220140100084LI) and ANPCyT (PICT 2018-03907).
References
1. Andrabi, S.M., Sarmast, A.H., Kirmani, A.R., Bhat, A.R.: Cranioplasty: Indications, procedures, and outcome – an institutional experience. Surgical Neurology International (2017)
2. Chen, X., Xu, L., Li, X., Egger, J.: Computer-aided implant design for the restoration of cranial defects. Scientific Reports (1), 1–10 (2017)
3. Chilamkurthy, S., Ghosh, R., Tanamala, S., Biviji, M., Campeau, N.G., Venugopal, V.K., Mahajan, V., Rao, P., Warier, P.: Development and validation of deep learning algorithms for detection of critical findings in head CT scans. arXiv preprint arXiv:1803.05854 (2018)
4. Hieu, L., Bohez, E., Sloten, J.V., Phien, H., Vatcharaporn, E., Binh, P., An, P., Oris, P.: Design for medical rapid prototyping of cranioplasty implants. Rapid Prototyping Journal (3), 175–186 (Aug 2003). https://doi.org/10.1108/13552540310477481
5. Jenkinson, M., Bannister, P., Brady, M., Smith, S.: Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage (2), 825–841 (2002)
6. Larrazabal, A.J., Martinez, C., Glocker, B., Ferrante, E.: Post-DAE: Anatomically plausible segmentation via post-processing with denoising autoencoders. IEEE Transactions on Medical Imaging (2020). https://doi.org/10.1109/TMI.2020.3005297
7. Larrazabal, A.J., Martinez, C., Ferrante, E.: Anatomical priors for image segmentation via post-processing with denoising autoencoders. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 585–593. Springer (2019)
8. Lee, M.C.H., Petersen, K., Pawlowski, N., Glocker, B., Schaap, M.: TETRIS: Template transformer networks for image segmentation with shape priors. IEEE Transactions on Medical Imaging (11), 2596–2606 (2019)
9. Li, J., Pepe, A., Gsaxner, C., von Campe, G., Egger, J.: A baseline approach for AutoImplant: the MICCAI 2020 cranial implant design challenge. arXiv preprint arXiv:2006.12449 (2020)
10. Mansilla, L., Milone, D.H., Ferrante, E.: Learning deformable registration of medical images with anatomical constraints. Neural Networks 124