Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning-based simulation study

Abstract

Inter-observer variation is a significant problem in clinical target volume (CTV) segmentation in postoperative settings, where there is no gross tumor present. In this scenario, the CTV is not an anatomically established structure, but one determined by the physician based on the clinical guideline used, the preferred tradeoff between tumor control and toxicity, their experience and training background, and other factors. This results in high inter-observer variability between physicians. This variability has been considered an issue, but the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning have made it impractical to study its dosimetric consequences. In this study, we analyze the impact that variations in physician style have on dose to organs-at-risk (OARs) by simulating the clinical workflow via deep learning. For a given patient previously treated by one physician, we use deep learning-based tools to simulate how other physicians would contour the CTV and how the corresponding dose distributions would look for this patient. To simulate multiple physician styles, we use a previously developed in-house CTV segmentation model that can produce physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool, which can predict dose within 3% of the prescription dose, on average, on the test data. For every test patient, four different physician-style CTVs are considered, and four different dose distributions are analyzed. OAR dose metrics are compared, showing that even though physician style variations result in organs getting different doses, all the important dose metrics except the maximum point dose are within the clinically acceptable limits.


Dosimetric impact of physician style variations in contouring CTV for post-operative prostate cancer: A deep learning based simulation study

Anjali Balagopal, Dan Nguyen, Maryam Mashayekhi, Howard Morgan, Aurelie Garant, Neil Desai, Raquibul Hannan, Mu-Han Lin, Steve Jiang

Medical Artificial Intelligence and Automation (MAIA) Laboratory and Department of Radiation Oncology, UT Southwestern Medical Center

Corresponding Author: [email protected], +1-214-645-8542, Department of Radiation Oncology, University of Texas Southwestern Medical Center, 2280 Inwood Road, Dallas, TX 75390-9303

Total word count (excluding Abstracts, figures, tables, and references) : 3543

Total table and figure count:

ABSTRACT

In tumor segmentation, inter-observer variation is acknowledged to be a significant problem. This is even more significant in clinical target volume (CTV) segmentation, specifically in post-operative settings, where a gross tumor does not exist. In this scenario, the CTV is not an anatomically established structure but rather one determined by the physician based on the clinical guideline used, the preferred tradeoff between tumor control and toxicity, their experience, training background, and other factors. This results in high inter-observer variability between physicians. Inter-observer variability has been considered an issue; however, its dosimetric consequence is still unclear, due to the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning. In this study, we analyze the impact that these physician stylistic variations have on organs-at-risk (OAR) dose by simulating the clinical workflow using deep learning. For a given patient previously treated by one physician, we use deep learning-based tools to simulate how other physicians would contour the CTV and what the corresponding dose distributions would look like for this patient. To simulate multiple physician styles, we use a previously developed in-house CTV segmentation model that can produce physician style-aware segmentations. The corresponding dose distribution is predicted using another in-house deep learning tool, which, averaged across all structures, is capable of predicting dose within 3% of the prescription dose on the test data. For every test patient, four different physician-style CTVs are considered and four different dose distributions are analyzed. OAR dose metrics are compared, showing that even though physician style variations result in organs getting different doses, all the important dose metrics except the maximum point dose are within the clinically acceptable limits.

Keywords: Treatment planning, Radiation Therapy, Post-operative prostate cancer, Clinical Target Volume, Clinical workflow Simulation, Deep learning

INTRODUCTION

Optimal radiation treatment entails uniform full-dose coverage of the radiation target with a sharp dose fall-off around it. This necessitates precise segmentation of both the radiation target and the nearby organs that are at risk for radiation damage (organs-at-risk, OARs). In a typical radiotherapy setting, OARs are segmented together with the gross tumor volume (GTV), which is the tumor that is visible in images. Using their knowledge of the disease, physicians then expand the GTV to create the clinical target volume (CTV), which also includes the microscopic extensions not visible in images. In the case of post-operative radiotherapy, however, the visible tumor has been surgically removed, so the CTV is only a virtual volume encompassing areas that may contain microscopic tumor cells, not an expansion of a macroscopic or visible tumor volume. The optimal CTV is determined by the treating physician based not only on patient characteristics but also on potential dose concerns for OARs. Physicians decide on the CTV contour based on their experience with toxicity, especially because in post-operative cases many surrounding organs form part of the CTV. In situations where the CTV is contoured based on potential dose concerns, the CTV segmentation is subject to high observer variability [1-6]. The CTV is further expanded by physicians to a planning target volume (PTV). The PTV encloses the CTV with margins to account for possible uncertainties in beam alignment, patient positioning and organ motion. Ideally, the CTV-PTV margin should be determined solely by the magnitudes of the uncertainties involved. In practice, physicians usually also consider doses to nearby healthy tissues when deciding on the size of the margin. As a result, the volume that is effectively treated can differ from the original CTV. The impact of these variations in the CTV, along with the added variations in the PTV, needs to be further studied from a dosimetric point of view.
The two main time-consuming steps in the planning process are manual segmentation of the required tumor and organ contours and multiple rounds of parameter tuning and optimization in a trial-and-error manner. Automating the treatment planning process with the help of available historical data can accelerate treatment planning. Multiple studies [7-12] have used deep learning (DL) to successfully segment CTVs and organs. Many radiotherapy dose prediction studies have also been reported, and these can be broadly divided into two categories. The first category deals with dose-volume histogram (DVH) prediction based on mathematical frameworks [13, 14]. Those studies used only DVH indices as model inputs and outputs, without three-dimensional dose volume information. They provided potential optimal plan information with OARs spared as much as possible. The second category deals with three-dimensional dose distribution prediction using deep learning networks [15-22]. Those studies showed relatively high prediction precision according to quantitative index comparisons, and the predicted three-dimensional dose volumes could support both quantitative and qualitative evaluations. In this study, we analyze the impact that these physician stylistic variations have on organ dose by simulating the clinical workflow using deep learning. We implement a DL dose prediction model for post-operative prostate cancer. Since the goal of this study is to evaluate the dosimetric impact of physician style variations in CTV segmentation, we mimic the clinical workflow by using a physician-specific CTV segmentation DL network that we have previously developed in-house. This model can generate CTVs in four different physician styles.
For each patient, we already have the CTV contour manually segmented by one physician, which was used for creating the ground truth dose plan and treating the patient, so we use the segmentation model to generate three other CTVs in the other three physicians’ styles. Given that physicians do not always use the same margins for CTV-PTV expansion, we use a PTV expansion network that can mimic how each physician would expand the CTV to the PTV with the help of OAR contours. This step is necessary to mimic the current clinical workflow. We show how the PTV expansion model significantly improves the CTV-PTV expansion accuracy when compared to a nominal expansion using recommended margins. The generated PTVs along with the OAR contours are input into the dose prediction network to generate the dose map, and the DVH as well as dose metrics are calculated for four different physicians for comparison. The OARs do not have much variability across physicians, so the OAR contour manually drawn by the original treating physician is used for dose prediction in all cases.

MATERIALS AND METHODS

DATA

This study includes retrospectively collected CT volumes obtained at initial simulation for adjuvant or salvage postoperative radiotherapy (PORT) to the prostatic fossa of 297 patients who initially underwent resection for prostate cancer. These patients were treated with PORT by four expert GU radiation oncologists at UT Southwestern Medical Center from January 2010 to December 2017. 220 patients had radiation delivered to the prostate fossa only (65-72 Gy dose), while 77 patients had radiation delivered to regional lymph nodes in addition to the prostate fossa (~41-45 Gy dose). These scans were contoured by 4 different physicians, and these contours are the ones that were used for patient treatment. For each patient, the data available were the patient CT, masks of the planning target volumes and organs-at-risk, and the 3D dose distribution. The dose distribution was generated from a VMAT setup. The structures available in the dataset were the PTVfossa, PTVnode, Body, Bladder, Rectum, Left Femoral Head, and Right Femoral Head. Each CT volume contains 60-360 slices with a voxel size of 1.17 × 1.17 × 2 mm³. Using the Boolean structure masks, we generated a signed distance array for each structure. Each voxel of the signed distance array represents the minimum distance to the closest edge of the PTV contour. Voxels outside the PTV contour had a positive distance value, and voxels inside the PTV had a negative distance value. 217 patients were used for training and validation, and 80 patients, 50 with prostatic fossa PTV only and 30 with an additional elective nodal PTV, were set aside for testing.

Figure 1: Patient distribution information. a) Distribution of patients in the dataset across the 4 GU physicians. b) Distribution of patients in the dataset across targets treated (PTVs). Patients with only prostate fossa as target have only one PTV, while patients with an additional nodal target have two PTVs.
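The signed distance arrays described in the data section can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes SciPy's Euclidean distance transform, a (z, y, x) axis order, and the hypothetical function name `signed_distance`; the paper does not state how the distances were computed.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance(mask, spacing=(2.0, 1.17, 1.17)):
    """Signed distance (mm) to a structure boundary: negative inside, positive outside.

    `mask` is a Boolean (z, y, x) array; `spacing` is the voxel size in mm
    along the same axes (an assumed axis order, not stated in the paper).
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any() or mask.all():
        raise ValueError("mask must contain both inside and outside voxels")
    # For voxels outside the structure: distance to the nearest inside voxel.
    dist_outside = distance_transform_edt(~mask, sampling=spacing)
    # For voxels inside the structure: distance to the nearest outside voxel.
    dist_inside = distance_transform_edt(mask, sampling=spacing)
    return dist_outside - dist_inside
```

Passing the voxel spacing through `sampling` keeps the distances in millimetres even though the slice thickness differs from the in-plane resolution.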

PTV SEGMENTATION NETWORK

Three different models were trained and compared for PTV segmentation. Model A is a volumetric UNet [23] with residual blocks [24] (Res-UNet), trained with only CT as input. Model B is a Res-UNet trained with the CTV and OARs along with CT as input. Model C is a physician-specific model consisting of a set of 4 models, one for each physician. Each of these physician models is a Res-UNet with CTV, OARs and CT as input, trained separately on that physician’s data. To keep the study as fair as possible, all models used exactly the same architecture (Figure 2) and hyperparameters. The model starts by performing a convolution with a kernel size of 3 × 3 × 3, followed by ReLU, Group Norm (GN) [26], and then Dropout (DO) [25]. In this paper, we refer to this set of operations as Conv-ReLU-GN-DO. This is followed by a residual block that incorporates ReLU activation and GN, and then by a max-pooling operation (2 × 2 × 2). This set of operations is performed a total of 3 times to reach the bottom of the U-net. The bottom layer has 2 sets of atrous convolutions, which enlarge the field of view of the filters to incorporate larger context. Then the features are upsampled; each upsampling step consists of a 2 × 2 × 2 upscale and two sets of Conv-ReLU-GN-DO operations. This set of upsampling operations is performed 3 times to bring the data back to the original resolution of the input. At each upsampling step, the features from the left side of the U-net are copied over and concatenated with the upsampled features. Finally, the model uses a 1 × 1 × 1 convolution to map the feature vector to the required number of classes, followed by a sigmoid layer that outputs the probability of the target. The loss function used for training the network is the Dice similarity coefficient loss. Since the CTV contour was already available, CT volumes were cropped to 160 × 160 × 64 volumes around the CTV before being input into the model.
For optimization, the Adam optimizer was used with a learning rate of 0.001 and was switched to SGD for the last 30 epochs [27]. All the models were trained on a V100 GPU with 32 GB of memory, using TensorFlow version 2.1. The batch size was set to 4 due to memory limitations.
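The Dice similarity coefficient loss named above can be illustrated with a short NumPy sketch. It is only a hedged example: the paper does not give the exact formulation, so the soft (probability-weighted) form, the smoothing constant `eps`, and the function name `soft_dice_loss` are assumptions, and the actual training used TensorFlow rather than NumPy.

```python
import numpy as np


def soft_dice_loss(prob, target, eps=1e-6):
    """Return 1 - Dice between predicted probabilities and a binary mask.

    `eps` is an assumed smoothing term that avoids division by zero when
    both the prediction and the target are empty.
    """
    prob = np.asarray(prob, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = np.sum(prob * target)
    dice = (2.0 * intersection + eps) / (np.sum(prob) + np.sum(target) + eps)
    return 1.0 - dice
```

A perfect prediction gives a loss close to 0 and a prediction with no overlap gives a loss close to 1, so minimizing the loss maximizes overlap with the ground truth PTV.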

Figure 2: Res-UNet used for PTV segmentation

DOSE PREDICTION NETWORK

For dose prediction, we utilized a U-net style architecture [23], which is shown in Figure 3. The model starts with a Conv-ReLU-GN-DO operation, which is repeated twice. This is followed by a max-pooling operation (2 × 2 × 2). This set of operations is performed a total of 3 times to reach the bottom of the U-net. Then the features are upsampled; each upsampling step consists of a 2 × 2 × 2 upscale and two sets of Conv-ReLU-GN-DO operations. This set of upsampling operations is performed 3 times to bring the data back to the original resolution of the input. At each upsampling step, the features from the left side of the U-net are copied over and concatenated with the upsampled features. The final layer has a linear activation and outputs the dose map. Two models with the same architecture were trained: Model A with mean squared error (MSE) as the loss function and Model B with Huber loss as the loss function. The final model, Model AB, is an ensemble of Model A and Model B. For optimization of both models, the Adam optimizer was used with a learning rate of 0.001 and was switched to SGD for the last 30 epochs [27]. The training was performed on a V100 GPU with 32 GB of memory, using TensorFlow version 2.1. The batch size was set to 1 due to memory limitations.
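To make the difference between the two training losses concrete, the sketch below implements MSE and Huber loss in NumPy and combines two predicted dose maps. Several details are assumptions, since the paper does not state them: the Huber threshold `delta`, the function names, and in particular the ensembling rule, which is taken here to be a voxel-wise mean of the two predictions.

```python
import numpy as np


def mse_loss(pred, true):
    """Mean squared error: penalizes large voxel errors quadratically."""
    diff = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.mean(diff ** 2))


def huber_loss(pred, true, delta=1.0):
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(true, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))


def ensemble_dose(dose_a, dose_b):
    """Assumed ensembling rule: voxel-wise mean of the two predicted dose maps."""
    return 0.5 * (np.asarray(dose_a, dtype=float) + np.asarray(dose_b, dtype=float))
```

Because the Huber loss grows only linearly for large errors, it is less dominated by a few badly predicted voxels than MSE, which is one plausible reason two models trained this way could behave differently on the PTV and the OARs.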

Figure 3: Dose prediction network. Schematic of the U-net used in the study.

PERFORMANCE EVALUATION

PTV SEGMENTATION

Only the 50 test patients that have a prostatic fossa CTV alone were used for PTV segmentation evaluation. For PTV segmentation, the Dice similarity coefficient (DSC) between each of the model predictions and the ground truth PTV was compared. The recommended CTV-PTV expansion margin for post-operative prostate CTV at our institution is 9mm anterior, 5mm posterior, and 7mm in all other directions. To create a baseline, the CTV was expanded as per these margins and compared to the ground truth PTV used.
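The baseline margin expansion can be sketched as an anisotropic dilation of the CTV mask. This is an illustrative approximation under stated assumptions, not the institution's planning-system algorithm: the axis order is taken as (z, y, x), anterior is assumed to be the direction of decreasing y index, the per-direction margins are combined into an asymmetric ellipsoid, and `expand_mask` is a hypothetical helper name.

```python
import numpy as np


def expand_mask(mask, spacing, margins):
    """Dilate a Boolean (z, y, x) mask by direction-dependent margins.

    spacing: voxel size in mm per axis.
    margins: ((z-, z+), (y-, y+), (x-, x+)) positive margins in mm, where
             "-" means toward decreasing index along that axis.
    """
    mask = np.asarray(mask, dtype=bool)
    reach = [int(np.ceil(max(m) / s)) for m, s in zip(margins, spacing)]
    # Zero padding guarantees that np.roll never wraps real content around.
    padded = np.pad(mask, [(r, r) for r in reach])
    out = np.zeros_like(padded)
    for dz in range(-reach[0], reach[0] + 1):
        for dy in range(-reach[1], reach[1] + 1):
            for dx in range(-reach[2], reach[2] + 1):
                r2 = 0.0
                for d, s, (neg, pos) in zip((dz, dy, dx), spacing, margins):
                    margin = neg if d < 0 else pos
                    r2 += (d * s / margin) ** 2
                if r2 <= 1.0:  # offset lies inside the asymmetric ellipsoid
                    out |= np.roll(padded, shift=(dz, dy, dx), axis=(0, 1, 2))
    crop = tuple(slice(r, r + n) for r, n in zip(reach, mask.shape))
    return out[crop]
```

With the margins quoted above and the assumed orientation, the call would use `margins=((7, 7), (9, 5), (7, 7))`; the expanded mask can then be compared with the clinical PTV using the Dice similarity coefficient.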

DOSE PREDICTION

To evaluate the performance of the dose prediction models, we assess their prediction accuracy on the 80 test patients: 50 patients with only a prostate fossa target and 30 patients with an additional nodal target. We evaluated the effects of the models on clinically relevant metrics, which include the mean dose (Dmean) and max dose (Dmax) to each structure. Dmax is defined as the dose to 2% of the volume for each structure, as recommended by the ICRU-83 report [28]. We also evaluated the performance of the models on the total PTV dose coverage D99, D98, and D95, which are the doses to 99%, 98%, and 95% of the total PTV volume. We assessed the PTV homogeneity ((D2-D98)/D50) for the total PTV.
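The metrics defined in this section can be computed from a dose array and a structure mask as in the sketch below. It assumes that Dx is read off as the (100 - x)th percentile of the voxel doses inside the structure with NumPy's default linear interpolation; the function names are hypothetical and the paper does not describe its own implementation.

```python
import numpy as np


def dose_at_volume(dose, mask, volume_percent):
    """D_x: the minimum dose received by the hottest x% of the structure."""
    values = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    return float(np.percentile(values, 100.0 - volume_percent))


def ptv_metrics(dose, mask):
    """Dmean, Dmax (taken as D2), coverage metrics and homogeneity for one structure."""
    values = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    d = {x: dose_at_volume(dose, mask, x) for x in (2, 50, 95, 98, 99)}
    return {
        "Dmean": float(values.mean()),
        "Dmax": d[2],
        "D95": d[95],
        "D98": d[98],
        "D99": d[99],
        # (D2 - D98) / D50: 0 would mean a perfectly uniform target dose.
        "homogeneity": (d[2] - d[98]) / d[50],
    }
```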

PHYSICIAN DOSE COMPARISON

To compare plans that use the PTVs of different physicians, we evaluate the OAR DVHs and compare clinically relevant dose metrics. Only the 50 test patients that have a prostatic fossa CTV alone were used for this comparison. For bladder and rectum, Maximum Pt Dose (MaxPt), V70 and V60 were measured, which are the maximum dose to any point in the structure and the volumes of the structure in cc that receive more than 70Gy dose and 65Gy dose, respectively. For the left and right femoral heads, V48 was measured along with MaxPt; V48 is the volume of the structure in cc that receives more than 48Gy dose.

RESULTS

PTV SEGMENTATION

For PTV segmentation, three models were compared along with a recommended CTV-PTV margin expansion. Figure 4a shows a comparison of all the models in terms of Dice similarity coefficient (DSC). Model A, which does not use the CTV as input, performs the worst, with an average DSC of 79.7 ± 6.7%. Adding the CTV as an input significantly improves the accuracy: Model B has a DSC of 93.4 ± 4.2%. The physician-specific Model C (4 separate models) performs the best, with a DSC of 96 ± 1.9%. Expanding the CTV with the recommended PTV margins gives a DSC of 92.3 ± 2.4%. Model C performs significantly better (pairwise t-test, p<.05) than the other DL models as well as the recommended expansion. Model C also has a much smaller standard deviation than the other models. Figure 4b shows a bar plot of the improvement obtained with Model C over the standard margin expansion for each of the 50 test patients. Figure 4c shows a few visual examples of model performance. It can be seen that Model C (the physician-specific model) performs better than Model B, with better conformity to the clinically used PTVs.

Figure 4: PTV segmentation result comparison for Model A, which takes only CT as input; Model B, which takes CTV and OARs as input along with CT; and Model C, which is a set of four models, each trained on a single physician’s data and using CTV, OARs and CT as input. a) Box plots comparing the DSC distribution for all the test patients for the considered models. b) Improvement in PTV accuracy with the physician-specific expansion model when compared to the recommended margin expansion. c) Visual examples showing the improvement with the physician-specific model as opposed to a combined data model.

DOSE PREDICTION

Three models were compared: Model A, trained with MSE loss; Model B, trained with Huber loss; and Model AB, an ensemble of Models A and B. Figure 5a shows the mean absolute error (MAE) of Dmean, Dmax, D99, D98 and D95 for the PTV. Model B performs significantly better than Model A for the PTV (pairwise t-test, p<.05). Both models maintain a mean absolute error of less than 5% of the prescription dose. Figure 5b shows the MAE of Dmean and Dmax for the OARs. For the OARs, unlike the PTV, Model A performs significantly better than Model B (pairwise t-test, p<.05). Both models maintain a mean absolute error of less than 5% of the prescription dose for all the OAR dose metrics. Model AB, the ensemble of Models A and B, performs significantly better (pairwise t-test, p<.05) than both Model A and Model B for the PTV as well as the OARs. Model AB maintains a mean absolute error of less than 3% of the prescription dose for all dose metrics. Figure 5c shows the PTV homogeneity; a perfectly homogeneous dose has a homogeneity of 0. The PTV dose predicted by Model A has better homogeneity than that predicted by Model B. The ensemble model has a homogeneity similar to that of Model A.
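The error figures quoted above are mean absolute errors expressed as a percentage of the prescription dose, and the Figure 5 caption describes the error bars as 95% confidence intervals of the form mean ± 1.96 σ/√n. A minimal sketch of that calculation is given below; it assumes a single prescription dose for all patients and a sample standard deviation, neither of which is stated in the paper, and the function name is hypothetical.

```python
import math


def mae_percent_of_prescription(predicted, reference, prescription):
    """Return (MAE, 95% CI half-width), both in % of the prescription dose.

    `predicted` and `reference` hold one value of a dose metric per patient.
    """
    errors = [abs(p - r) / prescription * 100.0 for p, r in zip(predicted, reference)]
    n = len(errors)
    mean = sum(errors) / n
    # Sample standard deviation (n - 1 in the denominator), an assumed choice.
    sigma = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
    half_width = 1.96 * sigma / math.sqrt(n)
    return mean, half_width
```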

Figure 5: Comparison of dose metrics for the three models. Model A is a UNet trained with MSE loss, Model B is a UNet trained with Huber loss and Model AB is an ensemble of Models A and B. a) Mean absolute error of the dose coverage (Dmax, Dmean, D99, D98, D95) between the predictions and the ground truth dose for PTV. b) Mean absolute error of the dose coverage (Dmax and Dmean) between the predictions and the ground truth dose for OARs. c) Homogeneity of the Model A prediction, Model B prediction, Model AB prediction and ground truth. Error bars represent the 95% confidence interval ($\bar{x} \pm 1.96\,\sigma/\sqrt{n}$).

Figure 6a shows an example DVH and dose wash for a dose plan delivered only to the prostatic fossa. In this case, there is only 1 PTV, and it can be seen that the model prediction is close to the ground truth dose. Figure 6b shows an example DVH and dose wash for a dose plan delivered to both the prostatic fossa and elective nodal regions.

Figure 6: DVHs and Dose washes of two example test patients with the ground truths and the model predictions.

PHYSICIAN STYLE DOSE COMPARISON

We compared the dose to the OARs for every test patient when different physician CTVs are used. Figure 7 depicts the values of OAR volumetric dose metrics for all the test patients. The dashed line shows the limit for each metric; the value should be below the limit to avoid toxicity to the tissues. The CTV variations can be observed to have an impact on the OAR dose, and this is more prominent in some patients than in others. Whichever CTV is used, however, the VxGy metrics (the volumes of the structures that receive more than x Gy dose) have a large overhead, which implies no danger of toxicity for any physician’s CTV. The MaxPt [Gy], on the other hand, which is the maximum dose to any point in the structure, has large variations across physicians. For the rectum and bladder, MaxPt is within limits except for one patient. For the femoral heads, MaxPt has large fluctuations and crosses the limit in 8 of 50 patients for the left femoral head and 14 of 50 patients for the right femoral head. Figure 8a shows the average overhead for each physician for each metric. Overhead is the extra room left by the physicians for each dose metric below the corresponding limit, as a % of the limit. The higher the overhead, the lower the value of the OAR dose metric and the lower the dose to the structure. Figures 8b and 8c show two example DVHs for bladder, rectum, and femoral heads when planning is performed with different physician CTVs expanded to PTVs. Figure 8b shows an example where physician CTV style variability has a large impact on OAR dose, and Figure 8c shows an example in which the variation in OAR dose is minimal. Figure 8d shows the four dose maps along with the dose metrics for a test patient, with physician-specific PTVs used for dose prediction. Even with the stylistic PTV variations across physicians, the doses to OARs are still within the recommended limits.
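The OAR metrics and the overhead discussed above reduce to a few array operations, sketched here under stated assumptions. The voxel volume is taken from the 1.17 × 1.17 × 2 mm voxel size reported in the data section; the function names and the example limit in the usage note are hypothetical, because the clinical limits themselves are not listed in the text.

```python
import numpy as np

# 1.17 mm x 1.17 mm x 2 mm voxel, converted from mm^3 to cc.
VOXEL_VOLUME_CC = 1.17 * 1.17 * 2.0 / 1000.0


def volume_above(dose, mask, threshold_gy, voxel_cc=VOXEL_VOLUME_CC):
    """V_x: volume (cc) of the structure receiving more than threshold_gy."""
    inside = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    return float(np.count_nonzero(inside > threshold_gy) * voxel_cc)


def max_point_dose(dose, mask):
    """MaxPt: maximum dose to any voxel of the structure."""
    return float(np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)].max())


def overhead_percent(value, limit):
    """Room left below the limit, as a percentage of the limit."""
    return (limit - value) / limit * 100.0
```

For instance, a metric value of 30 against a hypothetical limit of 40 gives `overhead_percent(30.0, 40.0) == 25.0`, and a negative overhead indicates that the limit was exceeded.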

Figure 7: Scatter plots depicting the values of OAR volumetric dose metrics for 4 different physician CTVs for all the 50 test patients.

Figure 8: a) Average overhead for each physician for each metric. Overhead is the extra room left by the physicians for each dose metric below the corresponding limit as a % of the limit. The higher the overhead, the lower the value of the OAR dose metric and the lower the dose to the structure. b&c) Two example predicted DVHs for bladder, rectum, and femoral heads when planning is performed with different physician CTVs expanded to PTVs. b) shows an example where physician CTV style variability has a large impact on OAR dose and c) shows an example in which the variation in OAR dose is minimal. d) A single test patient with planning done with physician-specific CTVs. The dose washes (at the same axial and sagittal locations) show the variation in dose, with PTV, Bladder and Rectum contours in black.

DISCUSSION

In post-operative prostate patients, the CTV is not an anatomically established structure but rather one determined by the physician based on the clinical guideline used, the preferred tradeoff between tumor control and toxicity, their experience, training background, and other factors. This results in high inter-observer variability between physicians. Inter-observer variability has been considered an issue; however, its dosimetric consequence is still unclear, due to the absence of multiple physician CTV contours for each patient and the significant amount of time required for dose planning. In this study, we leverage deep learning to simulate the clinical workflow for post-operative prostate treatment planning and evaluate the dosimetric impact of physician style variations on organs-at-risk. In this work, we have designed a dose prediction tool that can predict dose with less than 3% error. Three models were compared: a model trained with MSE loss, a model trained with Huber loss and an ensemble of the two models. While the model trained with MSE resulted in better PTV dose metrics, the model trained with Huber loss resulted in better OAR dose metrics. The ensemble model performed better than or similar to the two individual models. To our knowledge, this is the first DL-based radiotherapy dose prediction study for the post-operative setting. To demonstrate the clinical use case of the dose prediction model, we leverage an in-house physician-specific CTV segmentation model that can create physician style-aware segmentations. Additionally, since PTVs are required for dose planning and a standard CTV-PTV margin is not used by all physicians, we designed a PTV segmentation model that can produce accurate PTVs from the CTV. Three models were designed and compared for PTV segmentation. All the models used an encoder-decoder architecture with group normalization, atrous convolutions, residual blocks and skip connections between encoder and decoder.
The model that was trained separately on each physician’s data, with CT, CTV and OARs as input, performed the best, with 96% DSC accuracy. This model showed a significant improvement over a standard recommended margin of 9mm anterior expansion, 5mm posterior expansion, and 7mm expansion in all other directions. When comparing the OAR dose metrics, we observed that even though different physician PTVs can have a significant impact on the OAR dose, they all fall below the limits for the volume constraints. The MaxPt dose, being a single point value, could be strongly affected by errors in the dose prediction model, and the effect of these errors on the MaxPt dose needs to be further studied. This study has its own limitations, most of which come from simulation errors. There are three deep learning models in play, and hence their sequential errors add up. Nevertheless, these errors are still acceptable considering that this is a simulation study. Performing the segmentation, the margin expansion or the dose planning manually could make the comparison more accurate. In future studies, including deep learning model uncertainties would help in evaluating the error associated with each prediction.

CONCLUSION

In this study, we have simulated the clinical workflow for post-operative prostate treatment planning and evaluated the dosimetric impact of physician style variations on organs-at-risk. We have developed and proposed a model for volumetric dose prediction for post-operative prostate cancer patients. Using our proposed implementation, we are capable of accurately predicting the dose distribution from the PTV and OAR contours and the prescription dose. On average, our proposed model is capable of predicting the OAR max dose within 4% and mean dose within 3% of the prescription dose on the test data. We have also developed an accurate CTV-PTV expansion model that has 96% DSC accuracy. We have evaluated the dosimetric impact of physician CTV contouring style variations on OAR doses by comparing dose plans for different physician CTVs and have provided a detailed analysis.

DATA AVAILABILITY

All the datasets were collected from one institution and are non-public. In accordance with HIPAA policy, access to the datasets will be granted on a case-by-case basis upon submission of a request to the corresponding authors and the institution.

CODE AVAILABILITY

The DL models will be free to download for non-commercial research purposes after paper acceptance on GitHub. (https://github.com/anjali91-DL/Post-op-prostate-doseprediction-model)

AUTHOR CONTRIBUTIONS

All authors have made contributions to the manuscript including its conception and design, the analysis of the data and the writing of the manuscript. All authors have reviewed all parts of the manuscript and take responsibility for its content and approve its publication.

CONFLICT OF INTEREST

The authors declare no competing financial interest. The authors confirm that all funding sources supporting the work and all institutions or people who contributed to the work, but who do not meet the criteria for authorship, are acknowledged. The authors also confirm that all commercial affiliations, stock ownership, equity interests or patent licensing arrangements that could be considered to pose a financial conflict of interest in connection with the work have been disclosed.

ACKNOWLEDGMENTS

We would like to thank Dr. Jonathan Feinberg for editing the manuscript and Varian Medical Systems, Inc. for providing funding support.

REFERENCES

1. Mitchell DM, Perry L, Smith S, Elliott T, Wylie JP, Cowan RA, Livsey JE, Logue JP. Assessing the effect of a contouring protocol on postprostatectomy radiotherapy clinical target volumes and interphysician variation. Int J Radiat Oncol Biol Phys. 2009;75(4):990–993. doi: 10.1016/j.ijrobp.2008.12.042.

2. Lawton CA, Michalski J, El-Naqa I, Kuban D, Lee WR, Rosenthal SA, Zietman A, Sandler H, Shipley W, Ritter M, et al. Variation in the definition of clinical target volumes for pelvic nodal conformal radiation therapy for prostate cancer. Int J Radiat Oncol Biol Phys. 2009;74(2):377–382. doi: 10.1016/j.ijrobp.2008.08.003.

3. Lawton CA, Michalski J, El-Naqa I, Buyyounouski MK, Lee WR, Menard C, O'Meara E, Rosenthal SA, Ritter M, Seider M. RTOG GU Radiation oncology specialists reach consensus on pelvic lymph node volumes for high-risk prostate cancer. Int J Radiat Oncol Biol Phys. 2009;74(2):383–387. doi: 10.1016/j.ijrobp.2008.08.002.

4. Livsey JE, Wylie JP, Swindell R, Khoo VS, Cowan RA, Logue JP. Do differences in target volume definition in prostate cancer lead to clinically relevant differences in normal tissue toxicity? Int J Radiat Oncol Biol Phys. 2004;60(4):1076–1081. doi: 10.1016/j.ijrobp.2004.05.005.

5. Symon Z, Tsvang L, Pfeffer MR, Corn B, Wygoda M, Ben-Yoseph R. Prostatic fossa boost volume definition: physician bias and the risk of planned geographical miss. In: Proceedings 90th annual RSNA meeting, Chicago; 2004. Abstract SSC19-09.

6. Lee E, Park W, Ahn SH, et al. Interobserver variation in target volume for salvage radiotherapy in recurrent prostate cancer patients after radical prostatectomy using CT versus combined CT and MRI: a multicenter study (KROG 13-11). Radiat Oncol J. 2017;36(1):11–16. doi: 10.3857/roj.2017.00080.

7. Chang Liu, Stephen J. Gardner, Ning Wen, Mohamed A. Elshaikh, Farzan Siddiqui, Benjamin Movsas, Indrin J. Chetty. Automatic Segmentation of the Prostate on CT Images Using Deep Neural Networks (DNN). International Journal of Radiation Oncology*Biology*Physics, Volume 104, Issue 4, 2019, Pages 924-932, ISSN 0360-3016, https://doi.org/10.1016/j.ijrobp.2019.03.017.

9. Men K et al 2017a Deep deconvolutional neural network for target segmentation of nasopharyngeal cancer in planning CT images. Frontiers Oncol. 7 315.

10. Men K, Dai J and Li Y 2017b Automatic segmentation of the clinical target volume and organs at risk in the planning CT for rectal cancer using deep dilated convolutional neural networks. Med. Phys. 44 6377–89.

11. Anjali Balagopal, Samaneh Kazemifar, Dan Nguyen, Mu-Han Lin, Raquibul Hannan, Amir Owrangi, Steve Jiang. Fully automated organ segmentation in male pelvic CT images. Physics in Medicine & Biology, Volume 63, Number 24.

12. Balagopal A, Nguyen D, Morgan H et al. A deep learning-based framework for segmenting invisible clinical target volumes with estimated uncertainties for post-operative prostate cancer radiotherapy. arXiv preprint arXiv:

13. L.M. Appenzoller, J.M. Michalski, W.L. Thorstad, S. Mutic, K.L. Moore. Predicting dose-volume histograms for organs-at-risk in IMRT planning. Med Phys, 39 (2012), p. 7446.

14. A. Zawadzka, M. Nesteruk, B. Brzozowska, P.F. Kukolowicz. Method of predicting the mean lung dose based on a patient's anatomy and dose-volume histograms. Med Dosimetry, 42 (2017), pp. 57-62.

15. X. Chen, K. Men, Y. Li, J. Yi, J. Dai. A feasibility study on an automated method to generate patient-specific dose distributions for radiotherapy using deep learning. Med Phys, 46 (2019), pp. 56-64, 10.1002/mp.13262. PubMed PMID: WOS:000455029900008.

16. M. Mardani, P. Dong, L. Xing. Deep-learning based prediction of achievable dose for personalizing inverse treatment planning. Int J Radiat Oncol Biol Phys, 96 (2016), pp. E419-E420, 10.1016/j.ijrobp.2016.06.1685. PubMed PMID: WOS:000387655803347.

17. M. Ma, M.K. Buyyounouski, V. Vasudevan, L. Xing, Y. Yang. Dose distribution prediction in isodose feature-preserving voxelization domain using deep convolutional neural network. Med Phys, 46 (2019), pp. 2978-2987, 10.1002/mp.13618. PubMed PMID: WOS:000475671900006.

18. V. Kearney, J.W. Chan, S. Haaf, M. Descovich, T.D. Solberg. DoseNet: a volumetric dose prediction algorithm using 3D fully-convolutional neural networks. Phys Med Biol, 63 (2018), 10.1088/1361-6560/aaef74. PubMed PMID: WOS:000452415600003.

19. Fan Jiawei, Wang Jiazhou, Chen Zhi, et al. Automatic treatment planning based on three-dimensional dose distribution predicted from deep learning technique. Med Phys, 46 (2019), pp. 370-381. doi: 10.1002/mp.13271.

20. D. Nguyen, T. Long, X. Jia, W. Lu, X. Gu, Z. Iqbal, et al. A feasibility study for predicting optimal radiation therapy dose distributions of prostate cancer patients from patient anatomy using deep learning. Sci Rep, 9 (2019).

21. S. Shiraishi, K.L. Moore. Knowledge-based prediction of three-dimensional dose distributions for external beam radiotherapy. Med Phys, 43 (2016), pp. 378-387.

22. D. Nguyen, X. Jia, D. Sher, M.H. Lin, Z. Iqbal, H. Liu, et al. Three-dimensional radiotherapy dose prediction on head and neck cancer patients with a hierarchically densely connected U-net deep learning architecture. Arxiv (2018).

23. Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. Medical Image Computing and Computer-Assisted Intervention MICCAI 2015, pages 234-241, 2015.

24. K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.

25. Ghiasi G, Lin T-Y and Le Q V. Advances in Neural Information Processing Systems, 2018, pp 10727-37.

26. Wu Y and He K. Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp 3-19.

27. Nitish Shirish Keskar and Richard Socher. Improving generalization performance by switching from Adam to SGD. CoRR, abs/1712.07628, 2017.

28.