Development of Conditional Random Field Insert for UNet-based Zonal Prostate Segmentation on T2-Weighted MRI
Peng Cao, Susan M. Noworolski, Olga Starobinets, Natalie Korn, Sage P. Kramer, Antonio C. Westphalen, Andrew P. Leynes, Valentina Pedoia, Peder Larson
Department of Radiology and Biomedical Imaging, University of California, San Francisco

*Corresponding author: Peng Cao, Department of Radiology and Biomedical Imaging, University of California at San Francisco, San Francisco, CA, USA. Address: 1700 4th Street, San Francisco, CA 94158. Email: [email protected]
Short Running Title:
Deep Learning for MRI Zonal Prostate Segmentation
Key words: segmentation, prostate, computer-aided detection and diagnosis

Abstract

Purpose:
A conventional 2D UNet convolutional neural network (CNN) architecture may result in ill-defined boundaries in the segmentation output. Several studies have imposed stronger constraints on each level of the UNet to improve the performance of the 2D UNet, such as SegNet. In this study, we investigated a 2D SegNet and a proposed conditional random field insert (CRFI) for zonal prostate segmentation from clinical T2-weighted MRI data.
Methods:
We introduced a new methodology that combines the SegNet and the CRFI to improve the accuracy and robustness of the segmentation. The CRFI has feedback connections that encourage data consistency at multiple levels of the feature pyramid. On the encoder side of the SegNet, the CRFI combines the input feature maps and the convolution block output based on their spatial local similarity, like a trainable bilateral filter. For all networks, 725 2D images (i.e., 29 MRI cases) were used in training, while 174 2D images (i.e., 6 cases) were used in testing.
Results:
The SegNet with CRFIs achieved relatively high Dice coefficients (0.76, 0.84, and 0.89) for the peripheral zone, central zone, and whole gland, respectively. Compared with the UNet, the SegNet+CRFIs segmentation had a generally higher Dice score and was more robust in determining the boundaries of anatomical structures than the SegNet or UNet segmentation. The SegNet with a CRFI at the end showed that the CRFI can correct segmentation errors in the SegNet output, generating smooth and consistent segmentations of the prostate.
Conclusion:
The UNet-based deep neural networks demonstrated in this study can perform zonal prostate segmentation, achieving high Dice coefficients compared with those in the literature. The proposed CRFI method can reduce the fuzzy boundaries that affected the segmentation performance of the baseline UNet and SegNet models.

Introduction
MRI of the prostate helps to localize cancer and characterize disease severity [22, 5, 4]. The prostate is composed of three regions: the central zone (CZ), the transition zone (TZ), and the peripheral zone (PZ). Approximately 70% of cancers arise in the PZ and 30% in the TZ, with very few in the CZ [7]. A PI-RADS v2 score is given based on the zonal location of a visible lesion [6, 19]. For example, in the PZ the predominant MRI sequence used to estimate aggressiveness is diffusion-weighted imaging (DWI), but T2-weighted imaging (T2w) carries greater weight when lesions are seen in the TZ [6, 19]. Therefore, an accurate prostate zonal segmentation could serve as the first processing step in a future automated clinical decision-making pipeline: 1) MRI; 2) zonal segmentation; 3) lesion detection; 4) PI-RADS grading.

Automated prostate segmentation provides essential and efficient tools for accurate measurements, primarily because manual segmentation is impractical for analyzing large 2D/3D datasets. Prostate segmentation research has focused on the extraction of the entire gland for many years and has achieved a Dice coefficient/accuracy of 0.90 with state-of-the-art algorithms [11]. For zonal prostate segmentation, conventional methods, such as template-based and C-means clustering [2] and level-set [21] approaches, provided best Dice coefficients of 0.60 and 0.70 for the PZ and the combined CZ/TZ, respectively. However, these conventional zonal segmentation methods typically involved complex pre-processing, such as bias field correction, limiting their clinical application.

Recently, deep learning and convolutional neural networks (CNNs) have been applied to a broad range of medical imaging segmentation tasks, such as brain, heart, lung, and prostate segmentation [8]. The most widely used deep neural networks in medical imaging segmentation are the fully convolutional network (FCN) [13] and its variations, including the 2D and 3D UNet [3, 16] and 3D VNet [14] architectures. These FCNs provide both a large receptive field and a multi-scale representation of the input image, generating edge-preserved segmentations. For prostate segmentation, both UNet and VNet were successfully applied to whole prostate segmentation [24, 14] and achieved Dice coefficients of 0.87 to 0.89, close to the performance of state-of-the-art algorithms [11]. Meanwhile, ill-defined boundaries in the segmentation output of the 2D UNet were reported in reference [24], which is likely related to inadequate supervision at each internal level of the UNet, as proposed by Zhu et al. [24]. Several recent improvements to the UNet can be generally viewed as imposing stronger constraints on each level of the UNet, e.g., providing multi-level supervision or attention focusing [24, 15], i.e., attention-UNet, or using a fixed/simplified de-convolution kernel [1], i.e., SegNet. Meanwhile, the conventional and widely used segmentation method, the conditional random field (CRF), can also be applied to the segmentation output of a CNN or FCN, i.e., CNN/FCN+CRF, recovering sharp edges from the blurred segmentation output of the CNN or FCN [9, 20]. A recent study recast the iterative inference process of the CRF as a recurrent neural network (RNN) and introduced an end-to-end trainable FCN+CRFasRNN approach [23]. However, these CRF approaches were computationally expensive and therefore cannot be inserted into the current UNet structure.
To resolve the ill-defined boundaries of the UNet segmentation with a CRF, one should simplify the current CRF implementation and make it compatible with the CNN architecture. In this study, we investigated the 2D UNet and CRF inserts for zonal prostate segmentation from T2-weighted MRI data. We also introduce a new methodology that combines the SegNet and the CRF insert (CRFI) to solve the ill-defined boundary problem of the UNet segmentation.
Methods
Subjects and MRI acquisition
This study was approved by the Institutional Review Board at this institution and was compliant with the Health Insurance Portability and Accountability Act. Data from thirty-five participants, yielding a total of 875 2D images (i.e., 35 3D volumes), were used in this study. All participants provided written informed consent and had a biopsy-confirmed diagnosis of prostate cancer.

All participants were imaged with an expandable balloon endorectal coil (MedRad, Bayer HealthCare LLC, Whippany, NJ, USA) combined with an external phased-array coil on a 3T MR scanner (GE Healthcare, Waukesha, WI, USA). A perfluorocarbon fluid (Galden; Solvay Plastics, West Deptford, NJ, USA) was used to inflate the balloon coil. Fast spin-echo (FSE) T2-weighted images were acquired in an oblique axial plane with FOV = 18 cm, slice thickness = 3 mm, and matrix = 256 ×

UNet and SegNet implementations
As shown in Figure 1a, the convolutional neural network was created based on a UNet [16] structure with three decomposition levels. On the encoder side, i.e., on the contracting path, three convolution blocks with pooling were used, and images of 256 × 256 were used as the inputs/outputs of the network. The SegNet implementation was based on the abovementioned UNet, using a fixed/simplified de-convolution kernel with the max-pooling indices copied from the same level on the encoder side [1].
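This pooling-index bookkeeping is the only structural change relative to the UNet decoder. The following is a minimal PyTorch sketch of such a SegNet-style design, not the authors' code: the module names, the three-level channel progression (8, 16, 32), and the 3 × 3 kernels are illustrative assumptions, since the exact layer sizes were not fully legible in the source.

```python
import torch
import torch.nn as nn

class SegNetBlock(nn.Module):
    """One conv-BN-ReLU unit, used on both the encoder and decoder sides."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(x)

class TinySegNet(nn.Module):
    """Three-level SegNet-style sketch (channel counts are illustrative).

    The decoder upsamples with nn.MaxUnpool2d, reusing the max-pooling
    indices recorded at the same level of the encoder, instead of a
    learned de-convolution kernel."""

    def __init__(self, in_ch=1, n_classes=3, ch=(8, 16, 32)):
        super().__init__()
        self.enc = nn.ModuleList([SegNetBlock(c_in, c_out) for c_in, c_out
                                  in zip((in_ch,) + ch[:-1], ch)])
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.dec = nn.ModuleList([SegNetBlock(c_in, c_out) for c_in, c_out
                                  in zip(ch[::-1], ch[::-1][1:] + (ch[0],))])
        self.classifier = nn.Conv2d(ch[0], n_classes, kernel_size=1)

    def forward(self, x):
        indices = []
        for block in self.enc:
            x = block(x)
            x, idx = self.pool(x)   # record the arg-max locations
            indices.append(idx)
        for block, idx in zip(self.dec, reversed(indices)):
            x = self.unpool(x, idx)  # "fixed de-convolution": unpool with copied indices
            x = block(x)
        return self.classifier(x)    # per-class scores, e.g., background/TZ/PZ

# Smoke test on a 256 x 256 T2-weighted slice:
# logits = TinySegNet()(torch.randn(1, 1, 256, 256))  # -> (1, 3, 256, 256)
```

Because the unpooling only re-places values at the recorded arg-max locations, the decoder carries no learned upsampling weights, which is the "fixed/simplified de-convolution kernel" referred to above.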
CRFI implementation
The CRF framework was based on the method of previous studies that used a recurrent neural network to perform the inference for the CRF [23, 12]. Briefly, the CRF combined the unary dependencies of the output features, $Y = \{y_1, y_2, \dots, y_N\}$, where $N$ is the number of pixels, on the input features of the convolution block, $I = \{x_1, x_2, \dots, x_N\}$, as well as the pairwise dependencies between pairs of input features, to produce the conditional probability $P(y_i \mid I)$, where $y_i$ is the label assigned to pixel $i$, drawn from a pre-defined set of labels $L = \{l_1, l_2, \dots, l_L\}$:

$$P(y_i \mid I) = \frac{1}{Z} \exp\Big[ -\sum_i \psi_U(y_i) - \sum_{i<j} \psi_P(y_i, y_j) \Big] \quad (1)$$

where $Z$ is the normalization constant, $\psi_U$ is the unary potential, and $\psi_P$ is the pairwise potential. The mean-field inference for this CRF, recast as a recurrent neural network [23], is summarized below.

Algorithm 1, mean-field inference for the CRF recast as an RNN [23]:

$Q_i(l) \leftarrow \frac{1}{Z_i} \exp(U_i(l))$ (Initialization)
While (not converged):
    $\tilde{Q}_i^{(m)}(l) \leftarrow \sum_{j \neq i} S_m(x_i, x_j)\, Q_j(l)$, for all $m$ (Message passing)
    $\check{Q}_i(l) \leftarrow \sum_m w^{(m)} \tilde{Q}_i^{(m)}(l)$ (Pre-weighting)
    $P_i(l) \leftarrow \sum_{l' \in L} \mu(l, l')\, \check{Q}_i(l')$ (Compatibility transform)
    $\tilde{Q}_i(l) \leftarrow U_i(l) - P_i(l)$ (Add unary)
    $Q_i(l) \leftarrow \frac{1}{Z_i} \exp(\tilde{Q}_i(l))$ (Softmax)

In this study, we set $M = 1$, i.e., one Gaussian kernel, and we reduced the message passing in Algorithm 1 to an element-wise multiplication of $S$ and the local sum of $Q$. $S(x_i)$ was the approximation of $S_m(x_i, x_j)$ for adjacent pixels within a convolution kernel; i.e., $S(x_i)$ was derived from the average local distance map $d$, so that $\sum_{j \neq i} S_m(x_i, x_j) Q_j(l)$ was replaced by $S(x_i) \sum_{j \neq i,\, j \in \Omega(i)} Q_j(l)$, a truncated local summation over the neighborhood $\Omega(i)$ in the message passing. A convolution layer can then perform this summation locally for $Q_i(l)$, as well as learn the weightings in the pre-weighting and compatibility transform steps; i.e., these steps can be approximated by $Conv(Q)$. This simplification allowed the CRF to be implemented entirely with two convolution layers, one for $P = S * Conv(Q)$, an element-wise multiplication of $S$ and $Conv(Q)$, and another for $d = Conv(I)$, plus a softmax and a newly introduced activation function, i.e., $S = \exp(-d)$ from Eq. (2), as shown in Figure 1b. This simplified CRF can be inserted into the UNet or SegNet structure without much increase in the computational burden, as shown in Figure 1. The simplified CRFI algorithm is summarized below.

Algorithm 2, simplified neural network implementation of Algorithm 1, which can be approximated by two convolution layers:

$U = Conv_{block}(I)$ (Convolution block in UNet/SegNet)
$Q = Softmax(U)$ (Initialization)
While (not converged):
    $d = Conv(I)$ (Average local distance measurement)
    $S = \exp(-d)$ (Similarity weighting)
    $P = S * Conv(Q)$ (Message passing, pre-weighting, and compatibility transform)
    $\tilde{Q} = U - P$ (Add unary)
    $Q = Softmax(\tilde{Q})$ (Softmax)
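For concreteness, a minimal PyTorch sketch of Algorithm 2 follows; it is not the authors' implementation. The unrolling to a fixed number of mean-field iterations (in place of the convergence test), the two separate nn.Conv2d layers, the 5 × 5 kernel size, and the channel arguments are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CRFI(nn.Module):
    """Simplified CRF insert (Algorithm 2): two convolution layers, an
    exponential activation, and a softmax, unrolled for a fixed number
    of mean-field iterations."""

    def __init__(self, in_ch, out_ch, n_iters=3, kernel_size=5):
        super().__init__()
        pad = kernel_size // 2
        # d = Conv(I): average local distance map measured on the input
        # features (independent of Q here, so it is computed once).
        self.dist_conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=pad)
        # Conv(Q): local summation, pre-weighting, and compatibility
        # transform folded into one learned convolution.
        self.msg_conv = nn.Conv2d(out_ch, out_ch, kernel_size, padding=pad)
        self.n_iters = n_iters

    def forward(self, I, U):
        # I: input to the convolution block; U: its output, i.e., the
        # negative unary energy. Q is initialized by a softmax over U.
        Q = F.softmax(U, dim=1)
        S = torch.exp(-self.dist_conv(I))   # similarity weighting S = exp(-d)
        for _ in range(self.n_iters):
            P = S * self.msg_conv(Q)        # message passing via element-wise gating
            Q = F.softmax(U - P, dim=1)     # add unary, then normalize
        return Q
```

On the encoder side, each insert would wrap one convolution block, e.g., Q = CRFI(8, 16)(I, conv_block(I)) for a block mapping 8 to 16 channels (hypothetical sizes). For the concatenation-gated variant of Eq. (6) introduced below, dist_conv would instead act on the channel-wise concatenation of I and Q (in_ch + out_ch input channels) and move inside the loop.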
Another practical consideration was to utilize the attention gate for the mean-field iteration [15], in order to insert the CRF into a UNet or SegNet structure without creating hurdles for training the network end-to-end. We considered the simplified message-passing step, i.e., $S * Conv(Q)$ (where $*$ denotes element-wise multiplication) in Algorithm 2, to be an attention gate for $S$, gated by $Conv(Q)$, as in Figure 2. This was similar to the attention gate for the RNN in [15], i.e., a series of multiplications of the attention maps and the input feature maps. In [15], a special design that concatenated the gating signal $g$ and the input feature map $x$, i.e., $g \| x$, was used to gate $x$ instead of directly using $g$, which may help the gradient backpropagation in this RNN. To adopt this idea, we simply replaced the average local distance measurement in Algorithm 2 by

$$d = Conv(I \,\|\, Q) \quad (6)$$

The complete data flow of the modified CRF is illustrated in Figures 1 and 2.

To reiterate the method above, as illustrated in Figure 1, three simplified conditional random field layers were added as inserts (CRFIs) to the original SegNet structure in this study. The CRFI combines the input and output of the convolution block, i.e., $I$ and $U$, based on the spatial local similarity measured on $I$. As explained in Algorithm 2, we replaced the message-passing step with a convolution layer so that the CRFasRNN structure could be inserted into each level of the SegNet on the encoder side (Fig. 1b). In this CRFI structure, as shown in Figure 1b, the intermediate variables were: $U$ for the negative of the unary energy, $S$ for the Gaussian similarity weighting derived from the average local distance map $d$, and $Q$ for the probability from the previous iteration. These CRFIs on the encoder side of the SegNet convert the input feature vectors of the convolution blocks, $U = Conv_{block}(I)$, into consistent output feature vectors based on the similarity measured on $I$, as governed by the exponential activation $\exp(-d)$ and the convolution layer (i.e., Eq. (6)) that determines the average local distance map $d$. For example, a small $d$ corresponds to high local similarity, which results in a large $\exp(-d)$ and a small penalty on the pairwise energy defined in Eqs. (1) and (4). The CRFasRNN method then used the Gaussian kernel, i.e., $\exp(-d)$, as the weighting of the "bilateral filter" for $Q$ in Algorithm 1, following the mean-field inference method with the pairwise energy defined in Eq. (4). With two simplifications, i.e., $M = 1$ and the Gaussian weighting applied locally and approximately, a convolution layer could replace several steps in Algorithm 1. This setting allowed the use of two convolution layers to perform the simplified CRF. Such a CRFI can be placed either on the encoder side of the SegNet or at the end of it, analogous to how CRFasRNN was used [23], and remains compatible with end-to-end training of the SegNet. In addition, the vector similarity in Eq. (2) was defined by the existence of a small average distance between spatially adjacent feature vectors, which was implemented using a convolution layer in this study.

Training and testing

As an initial experiment, 725 2D images (i.e., 29 MRI cases) were used in training, while 174 2D images (i.e., 6 cases) were used in testing. The data augmentation included ±5% affine scaling, ±50% intensity scaling, and ±10% additive noise. For the training of all networks, the loss function was defined as the joint of the negative logarithm of the soft Dice coefficient, (2 × intersection)/(count(label) + count(prediction)), and a weighted cross-entropy loss. For the weighted cross-entropy loss, weightings of 0.02, 1.0, and 1.0 were used for the background, TZ, and PZ classes, respectively. The ADAM optimizer was used with a fixed learning rate. Within the CRFI, 5 × 5 convolution kernels were used.
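A minimal PyTorch sketch of this joint loss follows; it is not the authors' code. The smoothing constant eps and the learning rate in the optimizer line are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Class weights for background, TZ, and PZ, as used in this study.
CLASS_WEIGHTS = torch.tensor([0.02, 1.0, 1.0])

def joint_loss(logits, target, eps=1e-6):
    """Joint loss: negative log of the soft Dice coefficient plus a
    weighted cross-entropy. `logits` is (B, 3, H, W); `target` is
    (B, H, W) with integer labels {0: background, 1: TZ, 2: PZ}."""
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes=logits.shape[1])
    onehot = onehot.permute(0, 3, 1, 2).float()
    # Per-class soft Dice: (2 * intersection) / (count(label) + count(prediction)).
    intersection = (probs * onehot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    soft_dice = (2.0 * intersection + eps) / (denom + eps)
    dice_loss = -torch.log(soft_dice).mean()
    ce_loss = F.cross_entropy(logits, target,
                              weight=CLASS_WEIGHTS.to(logits.device))
    return dice_loss + ce_loss

# Training used the ADAM optimizer with a fixed learning rate, e.g.:
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is illustrative
```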
Results

Figure 3 shows typical neural network segmentation results from four participants. The neural network was able to predict the correct zonal boundary in most cases. The SegNet+CRFIs also provided slightly better contour detection/interpolation compared with that of the UNet. The SegNet with CRFIs achieved Dice coefficients of 0.76, 0.84, and 0.89 for the peripheral zone, central gland, and whole gland, respectively (Table 1).

Discussion

These CRF inserts may help the convergence of the SegNet during training because multi-scale local spatial similarity is encouraged, i.e., a multi-level attention focusing similar to the methods in [15, 16]. The functionality of the CRFI at the end of the SegNet was similar to that of the conventional CRFasRNN [23] or DenseCRF [10]. Meanwhile, in contrast with these previous approaches, the proposed CRFI can be placed inside the UNet or SegNet structure, as demonstrated by the SegNet+CRFIs combination in this study. The SegNet+CRFIs also achieved a higher Dice coefficient compared with the baseline UNet or SegNet. Intuitively, for each CNN block plus CRFI, the combined behavior was like that of CNN + CRF, i.e., preserving the edges of the object. The CRFI in this setting encouraged multi-level spatial similarity on the encoder side and helped the edge detection/interpolation and the model convergence. The proposed SegNet + CRFI deep neural network can perform zonal prostate segmentation, achieving higher Dice coefficients than those provided by non-neural-network state-of-the-art methods, i.e., Dice coefficients of 0.60 and 0.70 for the PZ and CG/TZ by template-based and C-means clustering [2] and level-set [21] methods. The limitations of the current study include the small sample size and the homogeneous MRI exams, i.e., all acquired with the same parameters at a single institution, with all patients bearing untreated prostate cancer. Further testing of our method on a larger set of clinical, multi-center data is required to confirm these results.

This study also presented a feasible simplification of the CRF and its mean-field inference, as we used two convolution layers to implement the CRFI in the proposed model, allowing the CRFI to be placed on the encoder side of the SegNet, at the end of it, or both. Interestingly, the similarity or the average local distance map in the proposed CRFI might have the representational capacity for affine transforms on adjacent feature vectors, which was used globally in CapsuleNet [17]. We also empirically found that the SegNet+CRFIs network could recognize the relative spatial relation between the TZ and the PZ at an early stage of the training, leading to rapid training convergence compared with the baseline UNet.
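As a sketch of the end-CRFI placement noted above (the scheme in Figure 5), the CRFI module from the earlier snippet can also be appended after the final SegNet layer. This is an illustrative composition under the same assumptions as the previous sketches, reusing the hypothetical TinySegNet and CRFI names; wiring the pairwise similarity to the single-channel T2-weighted input image is our assumption rather than a detail stated in the source.

```python
import torch.nn as nn

class SegNetWithEndCRFI(nn.Module):
    """End-CRFI scheme (Figure 5): a SegNet followed by one CRFI that
    refines the class scores, in the spirit of CRFasRNN."""

    def __init__(self, n_classes=3):
        super().__init__()
        self.segnet = TinySegNet(n_classes=n_classes)
        # Similarity is measured on the single-channel T2w image, and
        # the SegNet class scores serve as the unary term.
        self.end_crfi = CRFI(in_ch=1, out_ch=n_classes)

    def forward(self, x):
        logits = self.segnet(x)          # "pre end-CRFI" prediction
        return self.end_crfi(x, logits)  # "post end-CRFI" prediction
```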
Conclusion

In summary, three fully convolutional neural networks based on the SegNet and the CRF for zonal prostate segmentation were presented and demonstrated high accuracy. The SegNet with CRFIs resolved the ill-defined boundaries of the UNet segmentation and achieved a higher Dice coefficient compared with the baseline UNet and SegNet. This study demonstrated that adding CRFIs to the SegNet was an appropriate augmentation for improving the segmentation of the prostate.

Acknowledgement

This research was supported by funds from the California Tobacco-Related Disease Research Grants Program Office of the University of California (Grant Number 28IR-0060), by funds from the NIH (R01 CA148708 and R01 EB16741), by TRDRP 131866A, and by American Cancer Society Research Scholar Grant RSG-18-005-01 CCE.

Table 1: Dice coefficients measured on six T2W MRI cases (N = 71, i.e., 71 2D images on the prostate; mean ± standard deviation) for the peripheral zone, central gland, and whole gland, comparing the UNet, SegNet, and SegNet+CRFI variants. The UNet peripheral-zone Dice coefficient was 0.734.

Figure 1: (a) Modified CRF inserts (CRFIs) in the SegNet (SegNet+CRFIs), applied to the encoder side of the SegNet. Colors: yellow for convolution blocks, magenta for CRFIs, and red for 2 × 2 max-pooling/un-pooling. Unlike the UNet's bypass connections, the CRFI has feedback connections that encourage data consistency or similarity at multiple levels. (b) Diagram of the connection between a convolution block (yellow) and the recurrent CRFI (magenta), with the data flow: input $I$; $Conv_{block}(I) \Rightarrow U$; $\exp[-Conv(I \| Q)] \Rightarrow S$; $Softmax[U - S * Conv(Q)] \Rightarrow Q$; output $Q$. The CRFI combines the input, $I$, and the convolution block output, $U$, based on the spatial similarity in $I$. The intermediate variables are $S$, the similarity weighting of the input image, and $Q$, the conditional probability from the previous iteration. $S$ is computed by a convolution layer applied to the input with a new activation function, i.e., $S = \exp[-Conv(I \| Q)]$. A small average local distance corresponds to high local similarity, which results in a small penalty on the pairwise potential in Eq. (1). $Q$ is iteratively updated so that adjacent pixels with similar intensity tend to be labeled with the same output vector: the step $S * Conv(Q)$ generates the pairwise part, and $Softmax[U - S * Conv(Q)]$ updates $Q$. The darker band in some boxes indicates the activation function.

Figure 2: Data flow of the recurrent CRFI in Figure 1. The CRFI combines the input, $I$, and the convolution block output, $U$, based on the spatial similarity measured on $I$. $U$ is the negative of the unary energy in the CRF, $S = \exp(-d)$ is the similarity weighting from the average local distance map $d = Conv(I \| Q)$, and $Q$ is the conditional probability from the previous iteration. The whole CRFI can be viewed as an attention gate inside a CRF: $S$ is gated by $Conv(Q)$ through element-wise multiplication, i.e., $P = S * Conv(Q)$.

Figure 3: Representative neural network segmentation results from four participants. The neural network predictions and the human-labeled PZ and TZ were contoured and overlaid on the T2-weighted MR images. Note that the neural network was able to predict the correct zonal boundary in most cases, even in the presence of a T2 lesion (arrow).

Figure 4: Comparison of the UNet, SegNet, and proposed SegNet+CRFIs segmentations, with zoomed-in views, overlaid on the T2-weighted images. Slices were taken from the top or bottom of the prostate, where the segmentation was challenging. The results indicate ill-defined boundaries in the UNet and SegNet outputs, whereas the SegNet+CRFIs segmentation has a higher Dice score compared with the SegNet (Table 1), with smooth boundaries of the prostate.

Figure 5: (Top) Scheme of a SegNet with a CRFI at the end. (Bottom) Representative segmentation results from four participants (columns: T2WI, pre end-CRFI, post end-CRFI, and human label). Note that the CRFI was able to correct the errors of the SegNet, i.e., post vs. pre end-CRFI, resulting in smooth and consistent segmentation of the prostate.

Figure 6: (Top) Scheme of the combination of the two methods: CRFIs on the encoder side and at the end. (Bottom) Predictions before and after the end-CRFI (columns: T2WI, pre end-CRFI, post end-CRFI) and the human-labeled PZ and TZ, overlaid on the T2-weighted MR images. Note that the segmentations from both pre and post end-CRFI were smooth and consistent; meanwhile, the results from the post end-CRFI had the highest Dice score in this study.

References

[1] Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12):2481–2495, 2017.

[2] O. Chilali, P. Puech, S. Lakroum, M. Diaf, S. Mordon, and N. Betrouni. Gland and Zonal Segmentation of Prostate on T2W MR Images. Journal of Digital Imaging, 29(6):730–736, 2016.

[3] Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2016.

[4] Anno Graser, Andreas Heuck, Bernhard Sommer, Joerg Massmann, Juergen Scheidler, Maximillian Reiser, and Ullrich Mueller-Lisse. Per-sextant localization and staging of prostate cancer: Correlation of imaging findings with whole-mount step section histopathology. American Journal of Roentgenology, 188(1):84–90, 2007.

[5] Masoom A. Haider, Theodorus H. Van Der Kwast, Jeff Tanguay, Andrew J. Evans, Ali Tahir Hashmi, Gina Lockwood, and John Trachtenberg. Combined T2-weighted and diffusion-weighted MRI for localization of prostate cancer. American Journal of Roentgenology, 189(2):323–328, 2007.

[6] Elmira Hassanzadeh, Daniel I. Glazer, Ruth M. Dunne, Fiona M. Fennessy, Mukesh G. Harisinghani, and Clare M. Tempany. Prostate imaging reporting and data system version 2 (PI-RADS v2): a pictorial review. Abdominal Radiology, 42(1):278–289, 2017.

[7] Prostate Imaging Reporting and Data System (PI-RADS).

[8] Justin Ker, Lipo Wang, Jai Rao, and Tchoyoson Lim. Deep Learning Applications in Medical Image Analysis. IEEE Access, 6:9375–9389, 2018.

[9] Philipp Krähenbühl and Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In Advances in Neural Information Processing Systems, 2011.

[10] Philipp Krähenbühl and Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. arXiv:1210.5644, 2012.

[11] Geert Litjens, Robert Toth, Wendy van de Ven, Caroline Hoeks, Sjoerd Kerkstra, Bram van Ginneken, Graham Vincent, Gwenael Guillard, Neil Birbeck, Jindang Zhang, Robin Strand, Filip Malmberg, Yangming Ou, Christos Davatzikos, Matthias Kirschner, Florian Jung, Jing Yuan, Wu Qiu, Qinquan Gao, Philip Eddie Edwards, Bianca Maan, Ferdinand van der Heijden, Soumya Ghose, Jhimli Mitra, Jason Dowling, Dean Barratt, Henkjan Huisman, and Anant Madabhushi. Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge. Medical Image Analysis, 18(2):359–373, 2014.

[12] Chengjiang Long, Roddy Collins, Eran Swears, and Anthony Hoogs. Deep neural networks in fully connected CRF for image labeling with social network metadata, 2018.

[13] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3431–3440, 2015.

[14] Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), pages 565–571, 2016.

[15] Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa, Kensaku Mori, Steven McDonagh, Nils Y. Hammerla, Bernhard Kainz, Ben Glocker, and Daniel Rueckert. Attention U-Net: Learning Where to Look for the Pancreas. In Medical Imaging with Deep Learning (MIDL), 2018.

[16] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation.
Lecture Notes in Computer Science, 9351:234–241, 2015.

[17] Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. Dynamic Routing Between Capsules. arXiv:1710.09829, 2017.

[18] Olga Starobinets, Jeffry P. Simko, Kyle Kuchinsky, John Kornak, Peter R. Carroll, Kirsten L. Greene, John Kurhanewicz, and Susan M. Noworolski. Characterization and stratification of prostate lesions based on comprehensive multiparametric MRI using detailed whole-mount histopathology as a reference standard. NMR in Biomedicine, 30(12):1–13, 2017.

[19] Philipp Steiger and Harriet C. Thoeny. Prostate MRI based on PI-RADS version 2: How we review and report. Cancer Imaging, 16(1):1–9, 2016.

[20] Marvin T. T. Teichmann and Roberto Cipolla. Convolutional CRFs for Semantic Segmentation. arXiv preprint, 2018.

[21] Robert Toth, Justin Ribault, John Gentile, Dan Sperling, and Anant Madabhushi. Simultaneous Segmentation of Prostatic Zones Using Active Appearance Models With Multiple Coupled Levelsets. Computer Vision and Image Understanding, 117(9):1051–1060, 2014.

[22] A. E. Wefer, H. Hricak, D. B. Vigneron, F. V. Coakley, Y. Lu, J. Wefer, U. Mueller-Lisse, P. R. Carroll, and J. Kurhanewicz. Sextant localization of prostate cancer: Comparison of sextant biopsy, magnetic resonance imaging and magnetic resonance spectroscopic imaging with step section histology. Journal of Urology, 164(2):400–404, 2000.

[23] Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip H. S. Torr. Conditional random fields as recurrent neural networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015.