Generative Modelling of 3D in-silico Spongiosa with Controllable Micro-Structural Parameters
GGenerative Modelling of 3D in-silico Spongiosawith Controllable Micro-Structural Parameters
Emmanuel Iarussi , , Felix Thomsen , , and Claudio Delrieux , Consejo Nacional de Investigaciones Cient´ıficas y T´ecnicas, Argentina Universidad Tecnol´ogica Nacional, Facultad Regional Buenos Aires, Argentina Universidad Nacional del Sur, Bah´ıa Blanca, ArgentinaCorresponding address: [email protected]
Abstract.
Research in vertebral bone micro-structure generally requirescostly procedures to obtain physical scans of real bone with a specificpathology under study, since no methods are available yet to generate re-alistic bone structures in-silico . Here we propose to apply recent advancesin generative adversarial networks (GANs) to develop such a method. Weadapted style-transfer techniques, which have been largely used in othercontexts, in order to transfer style between image pairs while preservingits informational content. In a first step, we trained a volumetric genera-tive model in a progressive manner using a Wasserstein objective and gra-dient penalty (PWGAN-GP) to create patches of realistic bone structure in-silico . The training set contained 7660 purely spongeous bone samplesfrom twelve human vertebrae (T12 or L1) with isotropic resolution of164 µ m and scanned with a high resolution peripheral quantitative CT(Scanco XCT). After training, we generated new samples with tailoredmicro-structure properties by optimizing a vector z in the learned latentspace. To solve this optimization problem, we formulated a differentiablegoal function that leads to valid samples while compromising the appear-ance (content) with target 3D properties (style). Properties of the learnedlatent space effectively matched the data distribution. Furthermore, wewere able to simulate the resulting bone structure after deteriorationor treatment effects of osteoporosis therapies based only on expectedchanges of micro-structural parameters. Our method allows to generatea virtually infinite number of patches of realistic bone micro-structure,and thereby likely serves for the development of bone-biomarkers and tosimulate bone therapies in advance. Keywords:
Bone micro-structure · progressive generative adversarialnetwork · structural morphing · style-transfer · XCT
The development of new methods for characterizing the micro-structure of spon-geous bone is an active research field in constant progress. For the developmentand analysis of specific structural parameters (e.g. bone volume ratio, trabecu-lar separation or plate-to-rod ratio), often very simple in-silico bone models are a r X i v : . [ ee ss . I V ] S e p Iarussi, Thomsen et al. latent zz Generator z RealRealReal
Discriminator ... ... mm
15 mminput patchsample patch cortex peeled spongiosa
Fig. 1.
Left: 7660 purely spongeous HR-pQCT patches of 32 voxels were sampled fromtwelve human vertebrae phantoms. Right: Two 3D Convolutional Neural Networks,Generator and Discriminator, were progressively trained to mimic the vertebrae volumedistribution at incremental resolutions. used [11,19] containing only few rods and plates intersecting each other, sincethey allow easiest control of the desired output. These very simple models reflectrather poorly the real structure of bone. A first attempt to generate more real-istic bone has recently been presented, evolving a 3D structure from separated2D slices, generated with a technique in Fourier domain [14]. The authors re-ported high accordance between generated and real samples regarding trabecularthickness, however further parameters have not been evaluated. In this work wedeveloped a direct method to create 3D micro-structural bone samples in-silico that 1) contain realistic structures, and 2) allow to steer the properties to simu-late changes of micro-structural parameters from deterioration or medical bonetreatment.In order to achieve these goals, we first trained a generative volumetric con-volutional neural network (CNN) on isotropic patches of 32 × ×
32 voxels(box of 5mm diameter) to generate in-silico
HR-pQCT patches. Working in 3Dseverely increases the complexity of the generative models with respect to im-ages (2D) or videos (2D+t). Therefore we adapted a WGAN-GP architecture [6]to work with 3D data and trained it by progressively growing the resolution upto the target shape [8], accordingly we call this architecture progressive
WGAN-GP (PWGAN-GP). The latent vector z corresponds to a random point in a32-dimensional hypersphere, which defines entirely the generated volume thatthe discriminator network is intended to differentiate from a real sample.A key factor of our approach is to provide control over the generated samples,to be able to navigate in the learned latent space and to customize the morpho-logical properties of the output volumes. Instead of shaping the latent space bymeans of a conditional GAN [10], conditioned on target micro-structural pa-rameters, we took advantage of generative style transfer techniques [3] largelyused in photo manipulation applications and other types of art-work. In con-trast to conditional GANs, this two-step approach, GAN and style transfer,does not require to retrain the generative model if the user desires to incorpo-rate new control properties, i.e. a new set of micro-structural parameters. Thestyle transfer mechanism allows to steer the properties and to simulate changesof micro-structural parameters, for instance from deterioration or medical bone odelling of 3D Spongiosa with Controllable Micro-Structural Parameters 3 treatment. This is achieved by mapping the latent vector of an original sampleto a vector of new micro-structural parameters (style) but still with a similarmicro-structure (content), hence by solving an optimization problem with twoconditions. Thus, we show that not only neural networks but also style transferstrategies can be employed to generate and control structural properties in 3Dbone CT scans. To the best of our knowledge, our work is the first attempt toaddress micro-structural customization of 3D volumes formulating it as a styletransfer problem and also the first generative adversarial method to synthesizebone micro-structure [16]. Code, data and trained models will be released at github.com/emmanueliarussi/generative3DSpongiosa . In this chapter we describe the examined generative models and the neural styletransfer function. We adapted a 2D state of the art generative network model [8]and trained it with HR-pQCT volumetric scans of human vertebrae (Fig. 1).Once trained, we assessed the suitability of generated volumes qualitatively via3D renderings, and quantitatively with standard metrics of trabecular bone.Next, we generated samples with tailored micro-structural properties using astyle transfer perspective over the learned latent space. Synthetic samples con-tained new micro-structural properties but preserved the general content of theoriginal sample.
Generative adversarial networks (GANs) [5] are trained to stochastically gen-erate samples close to a distribution represented by the training set. Despitethe high success of these network architectures, training is still a very unsta-ble process. Therefore, several alternatives have emerged to deal with trainingissues.In particular, we based our framework on Wasserstein GANs [1], consisting ofa generator network G : z (cid:55)→ ˜ x , and a discriminator network (also called critic in the context of WGANs) D : x (cid:55)→ D ( x ) ∈ R which were simultaneouslytrained to try to fool each other: while G learned to generate fake samples ˜ x froman unknown distribution or noise z , the critic D learned meanwhile to distinguishfake from real samples. Formally, the training objective function optimizes:min G max D ∈D E x ∼ P r [ D ( x )] − E ˜ x ∼ P g [ D (˜ x )] , (1)where D is the set of 1-Lipschitz functions, P r the data distribution and P g themodel distribution defined by ˜ x = G ( z ) , z ∼ p ( z ). Since enforcing the Lipschitzconstraint is not trivial, we applied a gradient penalty mechanism (PG) [6] atevery iteration with a critic parameter update (WGAN-PG).We employed three additional non-progressive GANs to compare perfor-mance. 1) We trained the exact same architecture as described before but with Iarussi, Thomsen et al. an adversarial loss [5] and without extra regularization terms, gradient penaltyand critic drift. Instead, we added a sigmoid layer at the final discriminatorstep in order to output labels in range [0 , We framed the problem of generating samples with tailored micro-structuralproperties by formulating an optimization problem over the latent space of ourgenerative model. In the spirit of style-transfer techniques, we defined the content x t as the target volume we want to stick to. The target volume could be a samplefrom our training data set, or also be produced by the generator network, in thatcase x t = G ( z t ). A key contribution of our approach is to redefine the notionof style for 3D scans of human vertebrae. If x t provides the overall volumeconstraint, the style is given by a set of micro-structural properties w t , ourgenerated sample has to satisfy. Formally, we set up an optimization problemover z (cid:48) minimizing the sum of two conditions:min z (cid:48) µ (cid:107) x t − G ( z (cid:48) ) (cid:107) + (cid:107) w t − P ( G ( z (cid:48) )) (cid:107) , (2)with µ ≥ P ( · ) the algorithm to compute a vector of differentiable micro-structural properties as given by w t , Sec. 3.3. The objective function is dif-ferentiable and can be minimized using gradient descent. Notice that the termaccounting for the content is hard to minimize and that the global minimummay be not unique. In practice, we set µ to a small value ( e − ) and optimizedEq. 2 with the Limited Memory BroydenFletcherGoldfarbShanno algorithm (L-BFGS). As we show in the Results section, this optimization allows us to navigatethe latent space learned by our network, retrieving plausible synthetic sampleswith the desired micro-structural properties. In this section we describe the sampling procedure, choice of hyperparametersand the applied statistics.
Twelve human vertebrae (T12 and L1) were embedded into epoxy resin with-out damaging any trabeculae to become cylindrical vertebrae phantoms. These odelling of 3D Spongiosa with Controllable Micro-Structural Parameters 5 phantoms were scanned on a high-resolution peripheral QCT (HR-pQCT) withisotropic resolution of 82 µ m, 59 . µ As (XtremeCT I, ScancoMedical AG, Br¨uttisellen, Switzerland) and automatically calibrated to densityvalues. The spongiosa has been peeled from the cortex with a semi-automaticprocedure [18] and down-sampled to an isotropic resolution of 164 µ m, therebyincreasing the signal-to-noise ratio and obtaining a lower voxel number per patchbut still keeping most structural information (see Fig. 1).We defined then 7660 purely spongeous patches on the entire set of vertebraphantoms with isotropic size of 32 × ×
32 voxels (box of diameter 5 mm) andregular offset 8 voxels in all directions (1 . − , to [ − ,
1] (only 0 . We progressively trained our network in four stages with isotropic samples of4 , 8 , 16 and 32 voxels. Each training stage consisted on five training epochsfollowed by five blending epochs to fade smoothly in the new layers. We used abatch size of 16 in all stages and the ADAM optimizer [9] with β = 0, β = 0 . .
001 for the generator and critic networks. We set the num-ber of critic iterations per generator iteration to 3 and used gradient penalty with λ = 10. In order to further regularize the discriminator, we penalized outputsdrifting away from 0 by adding the average of the critic output squared to itsloss with weight (cid:15) drift = 0 . We implemented a set of commonly applied bone micro-structural parametersin Python: Bone mineral density (BMD), standard deviation of density values(BMD.SD), bone volume ratio (BV/TV), tissue mineral density (TMD), meanintercept length (MIL), parallel plate model dependent trabecular separation(Tb.Sp) and thickness (Tb.Th) [17]. The vector P ( G ( z (cid:48) )) as defined in Eq. 2 con-tained BMD, BV/TV, TMD and BMD.SD, that were specifically implemented asdifferentiable PyTorch functions, similar techniques have been used elsewhere [7].Since micro-structural parameters were not independent from each other, eitherfor physical (e.g. BMD cannot be higher than TMD), or for physiological rea-sons (e.g. correlation between BMD and BV/TV), we reduced the parameters for Iarussi, Thomsen et al.
Table 1.
Means, ± standard deviations of micro-structural parameters and p -valuesof Tukey’s range test’s ( α = 0 . Parameter Real GAN WGAN-GP PWGAN-GPBMD[mg/cm ] 123 . ± .
21 122 . ± . (99 . . ± . (19 . . ± . (95 . BMD.SD[mg/cm ] 125 . ± .
96 118 . ± .
62 ( < . . ± .
70 ( < . . ± . (21 . TMD[mg/cm ] 341 . ± .
90 334 . ± .
34 ( < . . ± .
04 ( < . . ± . (7 . BV/TV[%] 18 . ± .
11 18 . ± . (5 . . ± . (95 . . ± . (99 . MIL[mm] 1 . ± .
42 1 . ± . (16 . . ± . (99 . . ± . (68 . Tb.Sp[mm] 0 . ± .
43 0 . ± . (37 . . ± . (98 . . ± . (76 . Tb.Th[ µ m] 197 . ± .
62 189 . ± .
79 (0 . . ± . (96 . . ± . (70 . -4-2024 -6 -4 -2 0 2 4 6 8 10 ABCD
GANWGAN-GP PWGAN-GPReal P C -4-2024 ABC D -6 -4 -2 0 2 4 6 8 10 -4-2024
AB C D -6 -4 -2 0 2 4 6 8 10-4-2024
ABC D -6 -4 -2 0 2 4 6 8 10 P C P C P C PC 1PC 1 PC 1PC 1
A BC DA BC DA BC DA BC D
Fig. 2.
Distributions of principal components of 700 randomly selected patches, andeach four 3D renderings of representative samples of real and generated sets of threedifferent architectures. simplicity to two (linearly independent) principal components (PC1 and PC2)that explained 99% of the variation on real patches. We considered 700 real andgenerated patches for all resulting metrics. Quantitative analyses of the networkswere conducted by computing means and standard deviations of all consideredmicro-structural parameters, computed with a fixed threshold of 225 mg/cm .We employed for all methods Tukey’s range tests with α = 0 .
05 between realand generated patches.
Figure 2 shows the distribution of real and generated patches. The upper left plotindicates the target distribution, noticeably GAN differs most and PWGAN-GPleast from the real distribution. Besides that all neural networks still underpopu-lated certain areas in parameter space, e.g. PWGAN-GP at (PC1,PC2) = (2,-1),single generated patches were visually difficult to distinguish from real patches.Table 1 shows statistics of micro-structural parameters. Only PWGAN-GP (right odelling of 3D Spongiosa with Controllable Micro-Structural Parameters 7 B M D . S D B M D T M D B V / T V M I L T b . S p T b . T h P C PC 1
PC 1-1.0-1.8-2.6 -0.2 0.6 P C . . - . . . BMD[mg/cm ³ ]BMD.SD[mg/cm ³ ]TMD[mg/cm ³ ]BV/TV[%]MIL[mm]Tb.Sp[mm]Tb.Th[µm] 90.41101.74297.3812.540.780.58154.69 157.94130.37360.0026.441.401.22218.93Parameter min max R e s i d r o n a t e s o d i u m h o r m o n e T e r i p a r a t i d e Fig. 3.
Generated samples with varying micro-structural parameters but fixed content.Right: Direction of change of structural parameters and two specific drugs to strengthenbone (Fig. 4), range of measured parameters. column) was indistinguishable on all considered statistics, while WGAN-GP wasstatistically different from real data on parameters BMD.SD and TMD, andGAN additionally on Tb.Th. Figure 3 shows the application of the neural styletransfer method on 25 bone samples with different micro-structural parameters.The patches are variations of a single patch which was directly generated by theGAN with principal components close to the one in the center, thereby keepingbone micro structure as fixed as possible by simultaneously fitting the individ-ual target structural parameters. The circle on the top right shows the changeof principal components for each micro-structural parameter. Also treatment ef-fects of drugs for osteoporosis therapy can be expressed in such a way, as shownfor Residronate and Teriparatide. Therefore, we used reported treatment effectsof the examined structural parameters [4].
We implemented a method to generate realistic in-silico patches of bone micro-structure with defined micro-structural parameters. Qualitative analyses showedhigh similarity with real bone. Micro-structural parameters were indistinguish-able between generated and real patches, which is particularly important sincethe loss-function of the neural network was not explicitly considering these statis-tics, hence further micro-structural characteristics might be modeled correctlyas well.
Iarussi, Thomsen et al. p a t c h B baseline month 6 month 18 baseline month 6 month 18 p a t c h C p a t c h A Residronate Teriparatide
Fig. 4.
Simulated bone structure under Residronate and Teriparatide treatment after6 and 18 months. While Residronate (left) is known to calcify existing bone only,Teripartide (right) is able to form new bone.
We still see potential to improve our method, i.e. to closer align the dis-tributions of generated and real patches (Fig. 2). Latent vectors that are anexplicit combination of style-generating and structural variables might be worthof consideration. Furthermore, network architectures applied for similar prob-lems like the generation of blood vessels [15], lung nodules [13], liver lesions [2]or other applications [16] seem to be promising alternatives to consider. Ourmethod generated only patches of 5mm (or 32 voxels). We tried also to trainfor patches with 10mm (or 64 ) thereby using a larger number of epochs andlatent dimensions, but application on larger scale became unstable according tothe distribution criterion. An alternative might be a generative model to pro-duce continuous bone structures which required however a more complex logicto maintain a smooth continuum of bone structural parameters.We foresee our method being useful in a number of applications. For in-stance, the generation of bone structures for developing and testing of newmicro-structural parameters that enhance our current understanding of bonestability. Also, the simulation of micro-structural changes by bone-forming drugsor immobility- or age related bone-reduction is still an open issue. Since micro-structural properties of specific treatments for osteoporosis are generally known,the content-preserving method to vary micro-structural parameters can be usedto simulate bone micro-structure after a specific treatment. Such simulationsare shown in Fig. 4 based on reported treatment effects of the bisphosphonateResidronate and the parathyroid hormone Teriparatide using reported effectson BMD, BV/TV and TMD [4]. The proposed method could potentially berefined by incorporating differentiable formulas of additional structural parame-ters, such as Tb.Sp, Tb.Th, the anisotropy and elongation indices. The simulatedmicro-structure might also serve to compute minimum bone failure load with the odelling of 3D Spongiosa with Controllable Micro-Structural Parameters 9 finite-element-method or the mean-intercept-length tensor [12], which both re-quire exact models of the bone. Acknowledgements
We thank C.-C. Gl¨uer for providing the phantoms. This study was supported byAgencia Nacional de Promoci´on Cient´ıfica y Tecnol´ogica, Argentina (PICT 2017-1731), PID UTN 2018 (SIUTNBA0005139), PID UTN 2019 (SIUTNBA0005534),and NVIDIA GPU hardware grant that supported this research with the dona-tion of two Titan Xp graphic cards.
References
1. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN. arXiv preprintarXiv:1701.07875 (2017)2. Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., Greenspan, H.:GAN-based synthetic medical image augmentation for increased CNN performancein liver lesion classification. Neurocomputing , 321–331 (2018)3. Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXivpreprint arXiv:1508.06576 (2015)4. Gl¨uer, C.C., Marin, F., Ringe, J.D., Hawkins, F., M¨oricke, R., Papaioannu, N.,Farahmand, P., Minisola, S., Mart´ınez, G., Nolla, J.M.: Comparative effects ofTeriparatide and Risedronate in glucocorticoid-induced osteoporosis in men: 18-month results of the EuroGIOPs trial. J. Bone Miner. Res. , 1355–1368 (2013)5. Goodfellow, I., Bengio, Y., Courville, A.: Deep learning. MIT press (2016)6. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improvedtraining of Wasserstein GANs. In: Advances In Neural Information ProcessingSystems. pp. 5767–5777 (2017)7. Karam, C., Sugimoto, K., Hirakawa, K.: Fast convolutional distance transform.IEEE Signal Process. Lett. (6), 853–857 (2019)8. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs forimproved quality, stability, and variation. arXiv preprint arXiv:1710.10196 (2017)9. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. In: Int. Conf.on Learning Representations. pp. 1–15 (2015)10. Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv preprintarXiv:1411.1784 (2014)11. Moreno, R., Borga, M., Smedby, ¨O.: Evaluation of the plate-rod model assumptionof trabecular bone. In: IEEE Int. Symposium on Biomedical Imaging. pp. 470–473(2012)12. Moreno, R., Borga, M., Smedby, ¨O.: Techniques for computing fabric tensors: A re-view. In: Westin, C.F., Vilanova, A., Burgeth, B. (eds.) Visualization and Process-ing of Tensors and Higher Order Descriptors for Multi-Valued Data. pp. 271–292.Springer (2014)13. Onishi, Y., Teramoto, A., Tsujimoto, M., Tsukamoto, T., Saito, K., Toyama, H.,Imaizumi, K., Fujita, H.: Automated pulmonary nodule classification in computedtomography images using a deep convolutional neural network trained by genera-tive adversarial networks. BioMed Research Int. (2019)14. Pe˜na-Sol´orzano, C.A., Albrecht, D.W., Paganin, D.M., Harris, P.C., Hall, C.J.,Bassed, R.B., Dimmock, M.R.: Development of a simple numerical model for tra-becular bone structures. Med. Phys. (4), 1766–1776 (2019)0 Iarussi, Thomsen et al.15. Russ, T., Goerttler, S., Schnurr, A.K., Bauer, D.F., Hatamikia, S., Schad, L.R.,Z¨ollner, F.G., Chung, K.: Synthesis of CT images from digital body phantomsusing CycleGAN. Int. J. Comput. Assist. Radiol. Surg. (10), 1741–1750 (2019)16. Sorin, V., Barash, Y., Konen, E., Klang, E.: Creating artificial images for radiologyapplications using generative adversarial networks (GANs)–a systematic review.Acad. Radiol. (2020)17. Thomsen, F.: Medical 3D image processing applied to computed tomography andmagnetic resonance imaging. Ph.D. thesis, Universidad Nacional del Sur, Bah´ıaBlanca, Argentina (2017)18. Thomsen, F., Pe˜na, J., Delrieux, C., Gl¨uer, C.C.: Structural Insight v3: A stand-alone program for micro structural analysis of computed tomography volumes. In:Congreso Argentino de Inform´atica y Salud (2016)19. Thomsen, F., Pe˜na, J., Lu, Y., Huber, G., Morlock, M., Gl¨uer, C.C., Delrieux, C.: Anew algorithm for estimating the rod volume fraction and the trabecular thicknessfrom in vivo computed tomography. Med. Phys. (12), 6598–6607 (2016)20. Yazıcı, Y., Foo, C.S., Winkler, S., Yap, K.H., Piliouras, G., Chandrasekhar,V.: The unusual effectiveness of averaging in GAN training. arXiv preprintarXiv:1806.04498 (2018) Supplemental Document - Computation of DifferentiableStructural Parameters
Standard formulas
We briefly describe the differentiable formulas of the structural parameters usedin the style transfer optimization. We implemented structural parameters thatresemble common parameters bone mineral density (BMD), its standard devia-tion (BMD.SD), bone volume ratio (BVTV) and tissue mineral density (TMD).BMD is the average density of the calibrated input volume x . BVTV t is theratio of bone to total volume when segmenting with threshold t and TMD t isthe density of the segmented bone. The standard formulas readBMD( x ) = 1 n (cid:88) x i , (3)BMD.SD( x ) = (cid:115) (cid:80) x i − n ( (cid:80) x i ) n − , (4)BVTV t ( x ) = 1 n (cid:88) H( x i − t )and (5)TMD t ( x ) = (cid:80) H( x i − t ) x i (cid:80) H( x i − t ) , (6)with H( · ) the Heaviside function, i a voxel index and n the total number ofvoxels. Differentiable formulas
For the use in the neural style transfer optimizer we derived new formulas that1) allow computation on a very restricted local VoI of 32 voxels instead of a odelling of 3D Spongiosa with Controllable Micro-Structural Parameters 11 global VoI per-vertebra, and 2) are differentiable to become part of the oper-ator P ( x ). Thus we reimplemented the threshold dependent parameters TMDand BV/TV. Smoothness was achieved by defining parameters that are strictlymonotone in t . We defined a) the softplus function plus (cid:15) ( · ) that is a smoothversion of the rectified linear unit function max { , ·} parameterized with a fuzzi-ness factor (cid:15) and with plus (cid:15) ( a ) ≈ a for a > . (cid:15) , and b) a fuzzy binarizationH σ ( a − t ), a sigmoid version of the Heaviside function H( a − t ) with σ > (cid:15) ( a ) = (cid:15) ln (cid:16) (cid:16) a(cid:15) − (cid:17)(cid:17) + (cid:15), (7)H σ ( a − t ) = (cid:18) (cid:18) t − aσ (cid:19)(cid:19) − . (8)Smooth structural parameters read thenBVTV ∗ t ( x ) = 1 n (cid:88) H σ ( x i − t ) , (9)TMD ∗ t ( x ) = (cid:80) H σ ( x i − t ) x i plus (cid:15) ( (cid:80) H σ ( x i − t )) (10)with (cid:15) = 10 − a small number and σ = 10 mg/cm a fuzzy scale. Finally theterm P ( · ) as used in the paper reads P ( x ) = (cid:104) α BMD( x ) , α BMD.SD( x ) , α BVTV ∗ t ( x ) , α TMD ∗ t ( x ) (cid:105) (11)with t = 225 mg/cm and α , . . . , α4