Knowledge-based Radiation Treatment Planning: A Data-driven Method Survey
Shadab Momin, Yabo Fu, Yang Lei, Justin Roper, Jeffrey D. Bradley, Walter J. Curran, Tian Liu, Xiaofeng Yang
Department of Radiation Oncology, Emory University, Atlanta, GA
Corresponding author:
Xiaofeng Yang, PhD
Department of Radiation Oncology
Emory University School of Medicine
1365 Clifton Road NE
Atlanta, GA 30322
E-mail: [email protected]
Abstract
This paper surveys the data-driven dose prediction approaches introduced for knowledge-based planning (KBP) in the last decade. These methods were classified into two major categories according to their methods and techniques of utilizing previous knowledge: traditional KBP methods and deep-learning-based methods. Previous studies that required geometric or anatomical features to either find the best matched case(s) from a repository of previously delivered treatment plans or build prediction models were included in the traditional methods category, whereas deep-learning-based methods included studies that trained neural networks to make dose predictions. A comprehensive review of each category is presented, highlighting key parameters, methods, and their outlooks in terms of dose prediction over the years. We separated the cited works according to the framework and cancer site in each category. Finally, we briefly discuss the performance of both traditional KBP methods and deep-learning-based methods, and future trends of both data-driven KBP approaches.
arXiv [physics.med-ph], Sep
Introduction
Cancer is the second-leading cause of death in North America, with the most common types being cancers of the lung, breast, and prostate [8]. Radiation therapy (RT), chemotherapy, surgery or their combination are used to control the disease. Approximately 50% of all cancer patients undergo RT during the course of their illness [53], which makes RT a crucial component of cancer treatment. In terms of clinical usefulness and effectiveness of RT treatments, the transition from conformal RT to intensity modulated radiation therapy (IMRT) has significantly advanced the two-fold dosimetric goal of improving organ-at-risk (OAR) sparing while maintaining target dose homogeneity and conformity. Furthermore, algorithmic advancements have also played major roles in enhancing the efficiency of RT treatments. These include the transition from forward treatment planning to inverse treatment planning approaches, and the extension of IMRT to volumetric modulated arc therapy (VMAT). However, despite the use of complex inverse optimization algorithms, an inverse planning approach typically demands a large amount of manual intervention to generate a high-quality treatment plan with a desired dose distribution, taking up to a few days before the patient gets the first fraction of RT treatment. To further enhance treatment planning efficiency, there has been significant progress in the development of data-driven treatment planning approaches that utilize knowledge from the past to predict the outcome of a similar, yet new, task. In treatment planning, this concept was introduced over a decade ago in the form of knowledge-based planning (KBP). It entails utilizing a large number of previously optimized plans to build a mathematical model or atlas-based repository that can be used to predict the dosimetry (i.e., dose-volume metrics, dose volume histogram (DVH), spatial dose distribution, etc.) for a new patient plan.
In 2014, one of the traditional KBP approaches was also made commercially available as RapidPlan™ in the Varian Eclipse treatment planning system (Varian Medical Systems, Palo Alto, CA). In the past few years, another data-driven approach, namely deep learning (DL), has been gaining popularity in the field of radiation oncology for outperforming many state-of-the-art techniques [23, 24, 36, 44, 70, 71, 72, 73, 74, 80, 81, 135, 140, 141, 142, 143]. For instance, convolutional neural networks (CNNs), a class of deep neural networks (DNNs) with regularized multilayer perceptrons, have significantly enhanced the performance of imaging and vision tasks. U-Net [110], an architecture originally designed for image segmentation, has recently been shown to predict dose distributions without going through a treatment planning process [27, 60, 99, 131]. Though there is a review paper summarizing the articles on traditional KBP methods published between 2011 and 2018 [38], to our knowledge, there is no review paper specific to data-driven dose prediction approaches that covers both traditional and recently introduced DL-based KBP. A key difference between traditional and DL-based KBP is the way in which previous knowledge is utilized. In general, traditional KBP methods require the user to utilize geometric features, such as overlapping volume information between the planning target volume (PTV) and neighboring OARs, in order to either find the best matched case(s) from a repository of previously delivered treatment plans or build dose prediction models (i.e. machine learning (ML), statistical model) [163]. DL methods, on the other hand, can learn patterns hidden within the raw data without any manual feature extraction process, which makes DL a more appealing KBP technique compared to the traditional KBP methods.
It is important to note here that ML-based approaches are included in the traditional KBP category in this review, as they follow a framework similar to other traditional KBP methods in terms of inputs (geometric features) and outputs (dose volume metric or DVH). The traditional KBP methods include atlas-based, statistical modelling and ML methods. While a previous review summarizes these traditional approaches based on methods, the current work reviews recently emerging DL-based methods as well as the traditional KBP methods from the standpoint of various key parameters and their influence on dose prediction tasks. The goal of this review paper, therefore, is to present the success of traditional KBP methods thus far and highlight the potential of recent DL-based methods in dose prediction tasks. We separate data-driven treatment planning approaches in this regard into two categories: traditional KBP methods and DL-based methods. For each category, we first present a review of key parameters and methods. Subsequently, we present a review of specific investigations and the influence of various parameters on dose prediction. Finally, we discuss the advantages and challenges of each dose prediction technique, followed by highlighting the potential future trends in data-driven dose prediction methods.
.1 Literature search
We searched papers using Elsevier Scopus, Web of Science, PubMed, Google Scholar and the medical physics category of arXiv.org by using logical statements that included the following keywords: knowledge-based treatment planning, ML, DL, dose prediction, RapidPlan, treatment planning automation, artificial neural network (ANN), convolutional neural network and generative adversarial network.
Only peer reviewed research articles were included in this review. Each research article found during the literature search was manually screened based on the information presented in the abstract, which was followed by further in-depth review of specific articles. Articles with a description of methodology and comparable or improved aspects of dose prediction quality or efficiency were included. Retrospective studies based on a commercialized KBP approach, RapidPlan, were also considered. Articles on external beam radiation therapy (IMRT, VMAT, Tomotherapy, proton, etc.) were included, whereas articles on brachytherapy were excluded. Articles on the prediction of patient-specific quality assurance of a treatment plan are not reviewed. In this review paper, the term dose prediction includes prediction of the entire DVH curve, dose metrics (i.e. dose-volume parameter, mean or max dose), voxel dose, spatial dose distribution (either slice by slice in a 2D manner or as a 3D dose distribution), objective weights/constraints based on previous knowledge, and also the transfer of all these metrics to the new case for generating an actual plan. Figure 1 shows the number of publications per year as well as cumulative publications for both traditional KBP and DL-based dose predictions. Between 2009 and 2014, there was a gradual increase in the number of publications on traditional KBP dose prediction in what appears to be the initial development stage of data-driven treatment planning. The curve demonstrates an uplift in the number of traditional KBP articles between 2015 and 2018. The majority of traditional KBP studies in the past few years have been based on commercial RapidPlan™ rather than on further expansion of earlier ML or statistical methods.
This is certainly not because traditional methods have been fully explored and have exhausted their research potential, but presumably due to the recent emergence of DL-based methods owing to their flexibility and superior performance compared to many state-of-the-art techniques. In the past few years, the number of publications on DL-based image processing has increased exponentially. To expand the horizons of DL-based applications, researchers have already begun to explore its potential scope for dose prediction tasks. In the last four years, the number of DL-based dose prediction publications has gone from 1 in 2016 to already 15 in 2020, as can be seen in Figure 1. The trend appears to demonstrate an increased rate of publications on DL-based versus traditional KBP in the current year.
Table 1. Traditional KBP studies that aimed to predict dose volume histograms (DVHs) for providing a starting point for the plan optimization process.
Ref. | Method | Approach/Model | Key parameters | Purpose
[170] | MB | Support vector regression | Organ volumes, shape and DTH | To model the functional relationship between DVH and patient anatomical shape information.
[2] | MB | Fitting using least square min. | OAR distance to PTV | To translate key parameter correlation to mathematical relationships between OAR geometry and expected dose.
[161] | MB | Stepwise multiple regression | DTH | To build feature models to identify variation of anatomical features contributing to OAR dose sparing.
[78] | MB | Stepwise multiple regression | Target, OARs, overlap volumes and DTH | Extension of Yuan et al. for intra-treatment-modality model (IMRT – Tomotherapy).
[162] | MB | Stepwise multiple regression | Target, OARs, overlap volumes and DTH, fraction of OAR outside treatment field | To build two predictive models (single-sparing and standard model) to characterize the dependence of parotid dose sparing on patient anatomical features in the summed (primary + boost) plan, rather than two completely separate models.
[163] | AB | Based on iterative ML algorithm | Overlapping volume | To select a reference plan from a library of clinically approved/delivered plans with similar medical conditions and geometry.
[22] | AB | Direct | PTV shape, volume, three spherical coordinates of PTV with respect to OAR, OVH | To develop a knowledge-driven decision support system to assist clinicians to pick plan parameters and assess the radiation dose distribution for a prospective patient.
[26] | MB | Kernel density estimate | Distance to PTV | To develop an automated treatment planning solution that iteratively optimizes the training set, predicts DVHs for OARs, and generates clinically acceptable plans.
[165] | MB | Ensemble | Anatomical features, DTH | To combine the strengths of various linear regression models to build a more robust model.
[166] | MB | K-nearest neighbors | Generalized-DTH | To characterize DVH variance in multiple target plans.
[9, 11, 13, 14, 16, 19, 21, 20, 25, 30, 31, 33, 34, 39, 48, 50, 58, 61, 75, 87, 114, 126, 127, 128, 129, 130, 138, 146, 154, 155, 159, 160] [?] | RapidPlan™, Eclipse treatment planning software: The algorithm is divided in two components: 1) model configuration and 2) DVH estimation. Model configuration is divided into a data extraction phase and a model training phase. DVH estimation consists of a DVH estimation phase and an objective generation phase.
OVH = overlap volume histogram; DTH = distance-to-target histogram; AB = atlas based; MB = model based
Figure 1. The number of dose prediction publications on traditional and DL-based KBP methods per year (bar) with cumulative number of publications (lines).
This review includes over 90 articles on traditional dose prediction methods. These traditional KBP approaches can be classified into two categories: I) atlas based and II) model based. In atlas-based approaches, a physical parameter (i.e. overlap volume histogram (OVH), beam's eye view projections, tumor location, etc.) is first identified to determine similarity between previous patients' plans and a new patient plan. This is followed by transfer of knowledge (i.e. dose constraints, DVH values, beam geometrical parameters, DVHs of best matched cases) to predict achievable DVHs or to provide a better starting point to a treatment planner for further trial-and-error optimization. Within atlas-based methods, an indirect approach first predicts the dosimetric parameters through models and features, which are then used to select matching cases, whereas a direct approach directly predicts a similarity parameter based on features of the plan, CT images, or beam's eye view (BEV) projections. In model-based approaches, statistical or ML models are built from previously approved treatment plans. These methods require manually handcrafted features such as PTV-OAR overlap volume, OVH values, and OAR distance-to-PTV to predict the DVH by using different regression models. In this review, we categorized traditional KBP dose prediction articles into three groups according to prediction of: I) entire DVHs in Table 1, II) one or more dose volume metrics in Table 2, and III) voxel doses in Table 3. The articles listed in Table 1 aim to predict the entire DVH for a new patient case and utilize the predicted DVHs to guide the treatment planning for a new patient. The commercially available RapidPlan™ module also estimates DVH metrics and generates objectives for a new plan; studies based on it are also included in Table 1.
Table 2 shows the articles that aim to predict one or more dose metrics in order to guide the treatment planning for a new case. Table 3 shows the publications that aim to predict the voxel-level dose distributions to either assist in optimizing a new plan or automatically generate an actual new plan. Figure 2 shows the total number of investigations on traditional KBP methods for various treatment sites. Prostate, head/neck and lung cancers were amongst the most frequently investigated cancer sites as anticipated, whereas very few investigations have been conducted on complex sites such as abdominal, intracranial and thoracic. In this section, we first provide an overview of key concepts involved in traditional KBP methods. Subsequently, we present a review of different metrics and their extension over the years. Finally, we summarize the influence of different parameters on the performance of traditional methods in dose prediction tasks.
3.1.1 Dimensionality reduction
Though it is desirable to have more data for implementing different models, having too much data has drawbacks: it can be redundant or irrelevant and may result in overfitting, reducing the model's generalizability. Therefore, dimensionality reduction methods were used in the majority of traditional KBP studies to decrease the number of variables. The two main components of dimensionality reduction are feature extraction and feature selection. Feature extraction begins with an initial set of features, which are then redefined with the intention of making them more informative. Principal component analysis (PCA) is one of the most commonly used reduced order modeling techniques in model-based approaches.
PCA determines features that retain most of the variation among the data [106] so that they can be represented by a smaller number of dimensions. For example, consider a binary classification problem in which the goal is to classify an object A, represented by P features in a P-dimensional vector, as one of two classes. If P is too large, some characteristics may be more valuable than others for the purpose of classification. The goal of PCA is to reduce the dimensionality of a dataset consisting of interrelated variables into a smaller set of mutually uncorrelated variables [106]. The feature selection process involves the selection of valuable features from the ones at our disposal. In many traditional KBP studies, PCA is used in the process of feature selection [9, 11, 13, 16, 29, 33, 32, 34, 50, 76, 78, 114, 126, 127, 129, 136, 137, 154, 159, 161, 162, 165, 170].
3.1.2 Various features/metrics
A common theme in the majority of traditional approaches is that the optimality of the desired plan is strongly influenced by the geometries of critical structures with respect to the PTV. Commonly reported geometric features include the OVH, distance-to-target histograms (DTH), and OAR distance-to-PTV. The influence of parotid size and proximity to the PTV on the dosimetric sparing of the parotid was first studied by Hunt et al. [49]. In addition to geometric features, additional plan features such as PTV-OAR volumes, mutual information including beam's eye view projections, number and angles of specified beams, and photon energy have also been utilized in traditional KBP studies. These key parameters are tabulated in Tables 1, 2 and 3 along with their corresponding references.
Overlap volume histogram (OVH) based methods
The OVH was introduced to study the influence of an OAR's proximity to the target on its received dose. It is one of the most frequently used metrics in both atlas-based and model-based approaches, as can be seen in Tables 1 and 2. Wu et al. and Kazhdan et al. first shed light on the concept of the OVH as a one-dimensional function measuring the proximity of an OAR to the target [59, 148]. The OVH calculation involves uniform expansion and contraction of the target. Target expansion is repeated until the OAR completely overlaps the target, and target contraction is repeated until there is no overlap between the target and the OAR [148]. In other words, the OVH is the percentage of the OAR's volume that overlaps with a uniformly expanded or contracted target. In general, OVH-driven models assume that the dose to an OAR is inversely proportional to its distance from the target.
A large array of studies has combined historical data with OVH methods for prediction of the entire DVH (Table 1) and one or more dose metrics (Table 2). Wu et al. used the OVH in head/neck IMRT treatment plan quality control to help planners with evaluation [148]. This was followed by using the OVH to generate achievable DVH objectives for head and neck cancer cases [149]. With a model based on OVH [59] and PCA [161, 170], Wang et al. investigated the effect of interorgan dependency and the impact of data inconsistency [146]. Larger prediction errors were found for the head/neck region (<4 Gy for 83% of the cases) compared to a similar model applied to prostate cases (<2 Gy for 96%), presumably due to interorgan dependency [146]. Moore et al. also used OVH information to predict OAR dose metrics for head/neck and prostate IMRT plans [97]. Yuan et al.
used the OVH metric to quantify the effects of an array of patient anatomical features of the PTV and OARs and their spatial relationship on interpatient OAR dose sparing in IMRT, and found the mean distance between OAR and PTV, the mean volume between OAR and PTV, the out-of-field volume of OARs and the geometric relationship between multiple OARs to be important factors contributing to organ dose sparing [161]. For multiple OARs, using separate OAR-specific prediction models was found to be more accurate in predicting voxel doses compared to using all OAR voxels in a single training model [64].
The success of OVH based prediction primarily rests on the observation that the minimum achievable dose to an OAR depends on its distance and orientation to the PTV. The OVH based model [149] has nevertheless been investigated for pancreatic cancer, in which the OARs are larger compared to the tumor, part of the OARs can engulf the PTV, and highly deformable organs can vary the beam configurations among different patients [107]. Petit et al. showed that the OVH based predicted doses were achieved within 1 and 2 Gy for more than 82% and 94% of the patients, respectively, with improvements of 1.4 Gy and 1.7 Gy for mean dose to the liver and kidneys, respectively. To further investigate the capability of the OVH parameter, the global shift of the OVH was quantified after hydrogel injection to represent the efficacy of hydrogel injection in separating the rectum from the PTV. The OVH was found to be a better metric for rectum sparing than the hydrogel volume [158]. Wang et al. used the OVH to build a treatment planning QA model from consistently planned pareto-optimal plans for prostate cancer, improving planning standardization and preventing validation with possibly suboptimal benchmark plans [145]. In earlier OVH-driven studies, large variations in IMRT dose at a given OVH distance for a specific fractional volume of an OAR were reported [152, 161]. To address this disparity in the distance-to-dose correlation, Wall et al. studied inherent inter-planner variations in the plan quality of previous plans and second order dosimetric and anatomical factors. Out of all factors, in-field bladder and rectal volume showed the strongest correlation (R = 0.86 and R = 0.76) with doses. Therefore, in-field OAR volume was incorporated into the OVH-only metric [134]. The generic OVH introduced by Kazhdan et al. directly infers a DVH rather than a spatial dose distribution [59]. With a multi-patient atlas-based dose prediction approach, McIntosh and Purdie demonstrated that incorporating spatial information into the model can improve the dose prediction accuracy in comparison to the generic OVH method. This method was found to be less important for breast cavity and lung, whereas it improved prediction accuracy for whole breast, rectum and prostate cancer [89].
EC = Esophageal cancer; NC = Nasopharyngeal carcinoma; HC = Hepatocellular cancer
Figure 2. The total number of traditional KBP investigations on dose prediction for various cancer sites.
Table 2.
Traditional KBP studies that aimed to predict one or more dose metrics for providing a starting point for the plan optimization process.
Ref. | Method | Approach/Model | Key parameters | Purpose
[164] | MB | Support vector regression | OAR, DV constraint settings | To create an accurate IMRT plan surface as a decision support tool to aid treatment planners.
[94] | AB | Direct | Clinical stage and Gleason score | To update the weights of different clinical parameters for a new patient through a group based simulated annealing approach.
[149] | AB | Direct | OVH | To use geometric and dosimetric information retrieved from a database of previous plans to predict clinically achievable dose volume metrics.
(A retrospective study based on the method by \cite{RN43}) | [...] efficiency and consistency for head and neck cancer.
[144] | AB | Direct | OVH | To predict dose to 35% of rectal volume as a treatment planning quality assurance for prostate cancer patients.
[151] | AB | Direct | OVH | To investigate if an OVH driven IMRT database can guide and automate VMAT planning for head and neck cancer.
[158] | MB | Linear regression | OVH | To evaluate the OVH metric for prediction of rectal dose following hydrogel injection.
[136] | MB | Stepwise multiple regression | OVH | To utilize patients' anatomic and dosimetric features to predict the pareto front.
[18] | MB | Logistic regression | Distance to the tangent field edge | To predict left anterior descending artery maximum dose. Model to guide the positioning of the tangent field to keep maximum dose < 10 Gy.
[62] | MB | Linear regression | OAR volumes | To develop a model to predict attainable prescription dose for IMRT of the entire hemithoracic pleura.
[108] | MB | Curve fitting | Rectum-target overlap | To predict optimum average rectum dose.
[92] | MB | Stepwise regression | Target OAR overlap | To predict mean parotid dose.
[134] | AB | Direct | OVH', in-field OAR volumes | The minimum DVH value at the percentage volume of the bladder and rectum was used.
OVH = overlap volume histogram; DTH = distance-to-target histogram; AB = atlas based; MB = model based
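The OVH definition used in this subsection (the fraction of an OAR's volume that overlaps a uniformly expanded or contracted target) can be illustrated with a minimal sketch. It assumes binary voxel masks on a common grid, isotropic voxels, and a brute-force signed distance measured between voxel centres; the function names and the toy geometry are hypothetical and are not taken from the cited studies.

```python
import math
from itertools import product


def signed_distance(voxel, target, outside):
    """Signed Euclidean distance (in voxel units) from a voxel to the target.

    Negative inside the target, positive outside it. Distances are taken
    between voxel centres, so boundary voxels sit at -1 or +1, not at 0.
    """
    if voxel in target:
        # Inside: minus the distance to the nearest non-target voxel.
        return -min(math.dist(voxel, q) for q in outside)
    # Outside: distance to the nearest target voxel.
    return min(math.dist(voxel, q) for q in target)


def overlap_volume_histogram(oar, target, grid_shape, margins):
    """Fraction of OAR voxels whose signed distance to the target is <= margin.

    A positive margin corresponds to expanding the target, a negative margin
    to contracting it. Assumes the target does not fill the whole grid.
    """
    grid = set(product(*(range(n) for n in grid_shape)))
    outside = grid - target
    distances = [signed_distance(v, target, outside) for v in oar]
    return [sum(d <= m for d in distances) / len(distances) for m in margins]


# Toy geometry (illustrative only): a 10 x 1 x 1 grid in which the OAR
# overlaps the last voxel of the target and extends three voxels beyond it.
target = {(i, 0, 0) for i in range(0, 4)}
oar = {(i, 0, 0) for i in range(3, 7)}
margins = [-2, 0, 1, 2, 3]

ovh = overlap_volume_histogram(oar, target, (10, 1, 1), margins)
print(ovh)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In this toy case the value at margin 0 is simply the fraction of the OAR lying inside the target, and the curve reaches 1.0 once the expanded target covers the whole OAR. A practical implementation would more likely use a distance transform on the image grid with physical voxel spacing rather than this brute-force search.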
Projection based methods
These algorithms typically rely on matching 2D images, namely the beam's eye view (BEV) projections at each corresponding gantry angle, based on statistical properties of the image histogram. The best matched case is generally identified based on the sum of the mutual information values of the BEV projections over all beam angles involved. This method has been used for prostate [12] and head/neck cancer [112]. Good et al. calculated mutual information to identify the best match for the query case. The PTV projections of the matched case were deformed to the query case's PTV projections at each angle to adjust for shape differences between the PTVs of the query and matched cases. This approach reduced doses to the OARs and improved target dose conformity and homogeneity in KBP generated plans compared to the original plans [40].
Distance-to-target histogram (DTH) based methods
The distance-to-target histogram (DTH) is the fractional volume of the OAR within a certain distance from the PTV surface. This metric, along with the volumes of the PTV and OARs, is typically used as an input feature in ML approaches such as multivariable nonlinear regression (MVNLR) and support vector regression (SVR) [170]. It is important to note that the DTH is equivalent to the OVH [59] when the Euclidean form of the distance function is used. The DTH metric was extended to the generalized distance-to-target histogram (gDTH) by Zheng et al. in order to account for the relative shape distribution of multiple PTVs for head and neck cancer [166]. In comparison to the conventional model, the gDTH model improved DVH prediction accuracy for brainstem, cord, larynx, mandible, parotid, oral cavity and pharynx [166]. While this gDTH model presented similar plans with respect to an individual OAR, the concept of gDTH was further extended, in order to develop a knowledge-based tradeoff hyperplane model that assists with clinical decisions, to select similar plans with respect to all OARs by employing a case similarity metric that is a weighted sum of gDTH Euclidean distances between two cases across all OARs [167]. Finally, the DTH has also been utilized with multivariate regression-based models, which are commercially available as RapidPlan™ in the Eclipse treatment planning software.
3.1.3. Influence of various parameters
Outliers/Data inconsistency
Outlier detection is one of the important factors to consider when building a data-driven dose prediction model or repository that is generalizable to new cases. Outliers can reduce the goodness of fit between geometry and dosimetry, which, in turn, can compromise the model performance [97]. Two commonly reported types of outliers in the literature are geometric outliers and dosimetric outliers. Geometric outliers entail large anatomical variations, including in OAR distance to the PTV. An example of a geometric outlier is a prostate + nodes case included among prostate-only cases. Several studies investigated the influence of outliers on model performance, as shown in Table 4. Dosimetric outliers, on the other hand, represent plans in which OARs are not actively spared or dose-volume criteria are violated. In other words, dosimetric outliers are plans for which re-planning can significantly reduce OAR dose without compromising target coverage. Appenzoller et al. described a model to identify outliers in the form of suboptimal plans and showed that excluding outliers in the refined model resulted in a strong correlation between predicted and realized gains after re-planning (r = 0.92 for rectum, r = 0.88 for bladder and r = 0.84 for parotid glands). For head/neck RapidPlan™ based KBP, Delaney et al. analyzed the influence of dosimetric outliers and showed a moderate degradation in accuracy of the model, attributed to decreased precision of the estimated DVHs [20]. For pelvic cases, Sheng et al. assessed the effectiveness of outlier identification by studying the impact of both geometric and dosimetric outliers. This study suggested a greater impact of dosimetric outliers, with a negative impact on both bladder and rectum models, compared to geometric outliers, with a negative impact only on the bladder model [118]. Wang et al.
studied the effect of data inconsistency with respect to planning prioritizations through a) a mixed training dataset with a consistent validation dataset, b) a consistent training dataset with a mixed validation dataset, c) both a mixed training and validation dataset, and d) both a consistent training and validation dataset, and found that data inconsistency led to a large increase in prediction error, with error_d < error_c < error_a < error_b [146]. In addition to removing the outliers (i.e. suboptimal plans) from the training cohort [2], an alternative way to address the issue of outliers reported in the literature is re-planning of the identified suboptimal plans for prostate and head/neck cancer [1] and lung cancer [58], followed by their inclusion in the training cohort. The clinically available RapidPlan™ provides different statistical evaluation metrics for identifying the outliers, as shown in Table 4.
Diversities within traditional methods
Many retrospective studies were published after 2014, presumably due to the clinical implementation of a traditional KBP module in the form of RapidPlan™ in the Eclipse treatment planning software. These studies investigated the applicability of traditional methods with respect to variations in external parameters (i.e. multi-modality, multi-institution, sample size, etc.). Here, we present a review of these studies with their findings. Wu et al. (2013) used the DVH objectives derived from previous IMRT plans as optimization parameters for VMAT treatment planning in head/neck cancer, resulting in a similar dosimetric quality compared to IMRT plans [151]. Wu et al. demonstrated that a supine VMAT model for rectal plans can optimize IMRT plans of prone patients, yielding superior OAR sparing and quality consistency compared to the conventional treatment planning method [155]. When prediction models trained on Helical Tomotherapy plans for prostate cancer were utilized to predict constraints for the optimization of new plans using the RapidArc™ technique, the resulting bladder and rectum doses were comparable to or higher than those of the expert planner's plan. Delaney et al. demonstrated that a model based only on photon beam characteristics could make DVH predictions for proton therapy and can be used as a patient selection tool for protons [21]. McIntosh et al. studied the contextual atlas random forest (cARF) algorithm with and without OAR region of interest (ROI) features and found that the algorithm can pick better atlases without ROI features; however, it is not compatible enough to map the dose distribution from those atlases onto a new patient [90]. Huang et al. demonstrated that a RapidPlan™ model for one energy (10 MV) can generate dose volume objectives for plans with 6 and 10 MV; however, a RapidPlan™ model for flattened beams cannot optimize un-flattened beams prior to adjusting the target objectives [48]. A
ARapidPlanTM module also has the potential to generate high quality treatment plans on a newly imple-mented treatment planning software compared to manually optimized plans for prostate cancer [87].9or esophageal cancers, the RapidPlan created from plans optimized using RayStationTM producedcomparable lung doses [130]. For patients enrolled in Radiation Therapy Oncology Group (RTOG)0617, Kavanaugh et al showed the feasibility of a single-institution RapidPlanTM model as a qualitycontrol tool for multi institutional clinical trials to improve overall plan quality and provide decisionsupport to determine the need for clinical trade-o ff s between target coverage and OAR sparing [58]. Forprostate cancer, Schubert et al. have demonstrated the possibility of sharing models among di ff erentinstitutes in a cooperative framework [114]. For prostate cancer RapidPlanTM amongst five di ff erentinstitutions, Ueda et al. also suggested that it is critical to ensure similarity of the registered DVH curvesin the models to the institutions plan design before sharing the models. For prostate cancer, Good et al. ,applied the model trained on their institute to generate plans for patient datasets outside institutionwith the potential for homogenizing plan quality by transferring planning expertise from more to lessexperienced institutions [40]. Good et al. achieved superior or equivalent to the original plan in 95%of 55 tests patients [40]. More recently, a disease site specific multi-institutional, NRG-HN001 clinicaltrial based RapidPlanTM model was built as an o ffl ine quality assurance tool for which it improvedsparing of OARs in a large number of reoptimized plans submitted to the NRG-HN001 clinical trial[39]. Sample size
Figure 3 shows the average number of training and test cases for each cancer site in traditional KBP methods, with the standard deviation over the number of investigations listed on the top x-axis. The training/test sample sizes were not directly mentioned or required in the methods described in some publications. For RapidPlan™, it is indicated that the minimum number of plans required for model creation is 20; however, adding additional plans will usually help create a more robust model [133]. Numerous studies have compared the quality of plans generated by RapidPlan™ using high-quality plans in training and found that 25 to 30 plans may produce clinically acceptable plans for prostate [31] and head/neck [127] cancer. For prostate cancer, Boutilier et al. analyzed the effects of the training set size on the accuracy of four models from three different classes: DVH point prediction, DVH curve prediction and objective function weights. The authors concluded that the minimum required sample size depends on the specific model and endpoint to be predicted [7]. Zhang et al. showed that approximately 30 plans were sufficient to predict dose-volume levels with less than 3% relative error in both head and neck and whole pelvis/prostate [164].

The required sample size also partially depends on the robustness of the model used. Yuan et al. used 64 and 82 cases for prostate and head/neck cases, respectively, in a support vector regression (SVR) model for DVH predictions [161]. Landers et al. demonstrated statistical voxel dose learning (SVDL) to be more robust to patient variability compared to spectral regression and SVR for noncoplanar IMRT and VMAT for head/neck, lung and prostate cancer by using 20 cases for each site in 4-fold cross-validation [64]. An atlas-based dose prediction [89] is a more sophisticated method in which each patient in the training set represents one atlas.
Feature extraction and characterization is typically performed on the CT of the patients, which results in probabilistic dose estimates used to find the most likely voxel dose from similar atlases. In comparison to ANN and SVR methods, large training sample sizes were required for this method (58 for rectal, 77 for lung, 97 for breast cavity, 113 for central nervous system (CNS) brain, 144 for breast and 144 for prostate cancer). Overall, the review of traditional KBP dose prediction publications thus far suggests an improved efficiency compared to manual optimization, sufficient flexibility of traditional KBP methods in terms of their applicability (i.e., multimodality in EBRT), the need for these models for more complex sites, the requirement of an automated approach for accounting for outliers to further enhance the treatment planning efficiency, and the potential of building site-specific universal RapidPlan™ models for multi-institution adaptation.

Figure 3. The average number of training and testing datasets in traditional KBP dose prediction methods for each cancer site. The values are averaged over the number of investigations listed on the top x-axis and the error bars represent the standard deviation. CNS = Central Nervous System; NC = Nasopharyngeal Cancer; EC = Esophageal Cancer.

DL offers numerous advantages and support to personnel of different disciplines in the different steps of radiotherapy treatment planning. An appealing feature of DL methods is that the layers of features are not manually designed, but rather learned directly from raw data. Because DL methods are good at discovering intricate structures in high-dimensional data, they are applicable to a wide range of applications in science [66]. In this section, we provide an overview of the different architectures and neural networks that have been applied to dose prediction tasks up to now. DL was initially used in dose prediction in the form of ANN [119]. In these earlier DL-based methods, organ volumes including PTV and OARs, the number of fields and distances from OARs to the PTV were used to train an ANN, which was then used to correlate dose at a given voxel to a number of geometric and plan parameters, similar to those used in traditional KBP methods. DNNs are the most commonly used networks in DL-based dose prediction. A DNN resembles the traditional ANN, but with a large number of layers. Therefore, ANN-based studies are included in the DL-based dose prediction category in Table 5 despite their framework being comparable to that of traditional KBP methods. Neurons within each layer are nodes which are connected to subsequent nodes via links that correspond to biological axon-synapse-dendrite connections, analogous to the neural cells of humans. The layers embedded between the initial input layer and the final output layer are called hidden layers. The number of layers determines the network's depth, whereas the number of neurons per layer determines its width. Each neuron applies a linear operation followed by a non-linear operation between its input and output. In a layered format, each neuron receives the information from the neurons in the previous layer and passes it to neurons of the next layer after processing it. In addition, residual connections can be added to connect neurons in non-adjacent layers, as in ResNet proposed by He et al. [45]. The ResNet architecture has been presented with different numbers of layers: ResNet (18, 34, 50, 101, 152). Many DNN architectures have been presented for various applications. For dose predictions, CNNs, namely the fully convolutional neural network (FCN) and fully connected CNN (FCNN), have been used so far.
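As an illustration of the layered structure described above, the following minimal sketch passes a feature vector through a small fully connected network in which every neuron applies a linear operation followed by a non-linearity. It is not taken from any of the cited studies: the layer sizes, the random initialisation and the four input values (standing in for geometric features such as a PTV volume or an OAR-to-PTV distance) are assumptions chosen only to make the example runnable.

```python
import random

def relu(values):
    """Element-wise rectified linear unit (the non-linear step)."""
    return [max(0.0, v) for v in values]

def dense_layer(inputs, weights, biases):
    """Linear step of one layer: each neuron is a weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation, used only to make the sketch runnable."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the input through every layer: linear operation, then non-linearity."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense_layer(activations, weights, biases))
    return activations

rng = random.Random(0)
layer_sizes = [4, 8, 8, 1]   # input size, two hidden layers, output size (assumed)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

depth = len(layers)               # number of weight layers
width = max(layer_sizes[1:-1])    # neurons in the widest hidden layer
output = forward([0.2, 0.5, 0.1, 0.9], layers)   # made-up, normalised features
```

In this sketch, adding entries to `layer_sizes` makes the network deeper, while enlarging an entry makes that layer wider.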
A DL-based generative model, commonly known as generative adversarial network (GAN), has also been employed to aid the main network (FCN) for predicting dose distribution.

3.2.1 Convolutional Neural Network

The multilayer perceptron is a fully connected network in which each neuron in one layer is connected to all the neurons in the next layer. It has now been succeeded by the CNN, a class of DNN that can be regarded as a regularized multilayer perceptron [65]. CNN is by far the most widely used DNN for the dose prediction task, as can be seen in Table 5. The main components of a typical CNN are convolutional layers, max pooling layers, batch normalization, dropout layers, and a sigmoid or softmax layer. The convolutional layer consists of a set of convolutional kernels, where each kernel acts as a filter. The image is divided into small regions, known as receptive fields, by the convolutional kernel, which aids in extracting features. The kernel uses a specific set of weights that are multiplied with the corresponding elements of the receptive field. The weight-sharing ability of the convolutional operation allows extraction of different sets of features within an image by sliding a kernel with the same set of weights over the image. This makes the CNN more parameter-efficient compared to fully connected networks. This operation can be grouped based on the type and size of filters, direction of convolution, and type of padding [66]. As a result of the convolution operation, feature motifs can occur at different locations in the image. The goal is to preserve each motif's approximate position relative to others rather than its exact location.
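The weight sharing described above can be made concrete with a small sketch. The code below slides a single kernel over a toy image (no padding, stride 1) and compares the number of shared kernel weights with the number of weights a fully connected mapping between the same input and output sizes would need. The image, the kernel values and the function name `conv2d_valid` are illustrative assumptions; as in most DL frameworks, the operation is implemented as a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over the image (no padding, stride 1)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Receptive field: the image patch currently under the kernel.
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

# Toy 4x4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a left-to-right intensity step.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)

# The same 4 weights are reused at every position of the image.
shared_weights = len(kernel) * len(kernel[0])
# A fully connected layer would need one weight per input-output pair.
dense_weights = (len(image) * len(image[0])) * (len(feature_map) * len(feature_map[0]))
```

The feature map responds only in the column where the edge lies, and the kernel needs 4 weights where the fully connected alternative would need 144.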
Pooling, or down-sampling, sums up similar information in the neighborhood of the receptive fields and outputs the dominant response within this local region, helping to extract combinations of features that are invariant to translational shifts [68]. Commonly reported pooling formulations used in CNN are max, average, L2, spatial pyramid and overlapping pooling [46, 139]. The nonlinear operation, also known as the activation function, helps in learning sophisticated patterns by serving as a decision function. Different activation functions reported in the literature, such as sigmoid, tanh, SWISH, and ReLU and its variants including leaky-ReLU and PReLU, have been used to introduce non-linear combinations of features [43, 67, 109, 139, 157]. A more recently proposed activation function is MISH, which has shown better performance than ReLU on benchmark datasets [95]. Batch normalization is applied to address internal covariate shift, a change in the distribution of hidden unit values within feature maps that can reduce the convergence speed. It essentially unifies the distribution of feature map values by setting them to zero mean and unit variance, which, in turn, improves the generalization of the network by smoothing the flow of the gradient [124]. Finally, weight regularization and dropout layers are used to alleviate overfitting. The difference between the predicted and the target output is calculated through a loss function. A CNN is generally trained by minimizing the loss via gradient backpropagation using optimization methods. Different architectures have been proposed in the literature to enhance the performance of CNN. U-Net, originally introduced for segmentation of neuronal structures in electron microscopy stacks [110], is one of the most widely used CNN architectures. In addition to segmentation, it is also used for image-to-image translation tasks that output an image that has a one-to-one voxel correspondence with the input.
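Two of the building blocks described in this subsection, max pooling and batch normalization, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the pooling window is fixed at 2x2 without overlap, the batch normalization omits the learnable scale and shift that practical implementations add, and the input values are made up.

```python
import math

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling: keep the dominant response per window."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled

def batch_normalize(values, eps=1e-5):
    """Shift and scale activations to zero mean and (approximately) unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(variance + eps) for v in values]

feature_map = [
    [1, 3, 0, 0],
    [2, 9, 0, 4],
    [0, 0, 5, 1],
    [7, 0, 1, 1],
]
pooled = max_pool_2x2(feature_map)   # 4x4 map reduced to 2x2

activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(activations)
```

Because only the maximum of each window survives, moving a strong response by one pixel inside its window leaves the pooled output unchanged, which is the approximate translational invariance mentioned in the text.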
U-Net permits effective feature learning even with a small number of training samples. Milletari et al. proposed a three-dimensional variant of U-Net known as V-Net [91]. A known issue in training DNNs arises from the vanishing gradient. Therefore, ReLU [66] and its variants are generally preferred as activation functions owing to their ability to overcome the vanishing gradient problem [102]. He et al. formulated the layers as learning residual functions instead of directly fitting a desired underlying mapping [45]. A densely connected neural network (DenseNet) by Huang et al. connects each layer to every other layer [47]. More recently, attention gates were used in CNN in order to suppress irrelevant features and highlight salient features useful for a given task [111].

3.2.2 Generative Adversarial Network

Generative adversarial network (GAN) is a widely used semi-supervised learning method in DL [41]. The two major components of a GAN are a generative network and a discriminator network that are trained concurrently to compete against each other. The goal of the generative network is to generate artificial data that approximate a target data distribution from a low-dimensional latent space, whereas the goal of the discriminator network is to recognize the data presented by the generator and flag it as either real or fake. Both networks get better over the course of training until they reach a Nash equilibrium, which corresponds to the solution of the minimax loss of the aggregate training protocol [41]. Some popular variants of GAN include CycleGAN [169], conditional GAN (cGAN) [93] and StarGAN [17]. GAN is widely used in medical imaging [44, 54, 70, 71].

3.2.3 Reinforcement Learning

Reinforcement learning (RL) trains an agent, connected to its environment through perception and action, to make adjustments based on interaction between the agent and the environment. The agent gets a certain indication of the current state of the environment at each step of its interaction.
Based on the received indication, the agent then chooses an action to generate as output. This action changes the state of the environment, and the value of this state transition is communicated to the agent through a reward function. The agent can learn the appropriate behavior over time through trial and error [116]. In other words, the goal of RL is to find the balance between exploration (search) and exploitation of the current knowledge. RL has been combined with DNN to accomplish human-level performance [96]. RL is a unique framework that resembles the workflow of treatment planning optimization. The potential scope of RL in the DL-based dose prediction task (Table 5) has been investigated in a recent study [115]. RL was used to train a DNN named virtual treatment planner network, which, in place of a human treatment planner, decides how to change treatment planning parameters to improve plan quality, similar to the manual treatment planning process [115].

Table 3.
Traditional KBP studies that aimed to predict voxel-level doses for providing a starting point for the plan optimization process.

Ref. | Method | Approach/Model | Key Parameters | Purpose
[12] | AB | Direct | BEV projections | To identify similar patient cases by matching 2D BEV projections of contours
[40] | AB | Direct | BEV projections | To adapt the matched case's plan parameters from one institute to optimize the query case of an outside institution
[104, 103] | MB | Multivariate analysis; slice weight function | Distance-to-PTV; slice level | To determine the relationship between the position of voxels and corresponding doses to predict sparing of the OARs
[79] | MB | Active shape model, active optical flow model | PTV locations in relation to spinal cord | To study the effect of PTV contours on dose distribution at the spinal cord. Five subgroups were created according to the PTV locations in relation to the spinal cord.
[112] | AB | Direct | Target-OAR overlap; shell creation surrounding the match target volume | To adapt the matched case from the database for the query case by deforming the match beam fluences, warping the match primary/boost dose distribution and distance scaling factor
[113] | AB | Direct | Target-OAR overlap | To transfer the beam settings and multileaf collimator positions of the best match case to the new case
[117] | AB | Direct | The PTV and seminal vesicles (SV) concaveness angle and % distance from SV to the PTV | To transfer treatment parameters of the atlas case to the new case
[88] | AB | Indirect | Multi-scale image appearance features | To use a contextual atlas regression forest (cARF) augmented with density estimation over the most informative features to learn an automatic atlas-selection metric for dose prediction
[89] | AB | Indirect | Features based on the spatial dose distribution and features derived from DVHs | To extend cARF by introducing a conditional random field model (cARF-CRF) to transform the probabilistic dose distribution into a scalar dose distribution that adheres to desired DVHs
[90] | AB | Indirect | Multi-scale image appearance features | To convert a predicted per-voxel dose distribution into a complete radiotherapy plan through a fully automated pipeline using cARF-CRF

OVH = overlap volume histogram; DTH = distance-to-target histogram; AB = atlas based; MB = model based

DL-based dose prediction methods can be categorized according to DL properties such as network architectures (CNN, GAN, etc.), training process (supervised, unsupervised, semi-supervised, deep reinforcement, etc.), input image types (CT only, CT + OAR + PTV contours, etc.), output types (2D or 3D dose distribution) and sample size (training, testing, etc.). As shown in Figure 1, DL-based dose prediction methods have gained popularity amongst researchers only in the past few years; there are nearly 30 publications on DL-based dose prediction so far. These DL-based dose prediction publications are tabulated in Table 5 along with their network architectures and input and output characteristics. Figure 4 represents the total number of DL-based dose prediction investigations per treatment site. This follows a trend similar to that observed for traditional KBP dose prediction approaches, with the highest number of investigations being on prostate and head/neck cancer sites. Here, we categorized the DL-based dose prediction publications thus far into two groups based on network architectures: I) CNN, namely the U-Net architecture, and II) GAN. We first provide a review of the work for each network architecture, followed by their applicability to various dose prediction applications and their limitations. Subsequently, we discuss the influence of different parameters in DL-based dose prediction methods.

3.3.1 Overview of CNN based works

As shown in Table 5, U-Net has been a widely used CNN architecture for predicting dose distributions. U-Net is effective in terms of calculation and combination of global and local features because it consists of an encoding and a decoding path. The decoding path concatenates the features from previous layers in the encoding path with features from current layers in the decoding path.
Many variants of U-Net, including 3D U-Net, have appeared in the literature for dose prediction purposes. Earlier work in DL-based dose prediction methods involved predicting doses in a 2D manner [27, 99]. Sumida et al. used the U-Net model, initially proposed by Ronneberger et al. [110], to make 2D dose predictions. The two main flows of this model were the encoding and decoding parts. The encoding part's layers consisted of a 2D convolution layer, batch normalization, a rectified linear unit (ReLU) and a max-pooling layer. The network was trained to predict the Acuros XB (AXB) dose from the low-resolution dose calculated through the AAA algorithm and CT. Similarly, Nguyen et al. trained a seven-level hierarchy with a modified version of the original U-Net to make dose predictions for prostate cases [99].

More recent works have focused on predicting 3D dose distributions using DL methods. To overcome the increased computational load in 3D dose prediction, Nguyen et al. proposed the Hierarchically Densely Connected U-Net (HD U-Net), which not only was able to predict 3D dose distributions, but also outperformed dose predictions made by the standard U-Net model [100]. HD U-Net combines DenseNet's efficient feature propagation with U-Net's ability to infer both local and global features by connecting each layer to every other layer in a feed-forward fashion, yielding better RAM usage and better generalization of the model. To further simplify the 3D dose prediction problem and increase prediction accuracy, Xing et al. projected the fluence maps to the dose distribution using a fast and inexpensive ray-tracing dose calculation algorithm and trained HD U-Net to map the low-accuracy ray-tracing dose distribution (which does not consider the scatter effect) into an accurate dose distribution calculated using a collapsed cone convolution/superposition algorithm [156].
DL-based methods have also been expanded to predict Pareto optimal dose distributions so that physicians can learn the desired dosimetric trade-offs in real time and learn the viability of different dosimetric goals. Ma et al. constructed a 3D U-Net architecture to predict individualized dose distributions for different trade-offs [84]. In predicting Pareto dose distributions, the network should be able to map many dose distributions from a single anatomy. In doing so, it should be able to differentiate between the clinical consequences and the corresponding predicted dose distributions. To address these clinically relevant differences amongst different dose distributions, Nguyen et al. proposed a differentiable loss function based on the DVH and an adversarial loss in addition to the traditional voxel-wise mean square error (MSE) loss to train the network [101]. Along the same line of work, Bohara et al. incorporated beam information to predict Pareto dose distributions using the anatomy-beam model proposed by Barragán-Montero et al. [6].

U-Net architecture has also been used for internal radiation dose predictions [42, 69], where the network was trained to predict 3D dose rate maps given the mass density distribution and radioactivity maps. Since clinically available Medical Internal Radiation Dose Committee (MIRD) based dose estimations are the least precise, the long-term goal of these studies is to create a stable DL-based dose estimation model that achieves a precision close to that of Monte Carlo simulations. He et al. proposed the residual network, known as ResNet, to mitigate the difficulty of training DNNs caused by vanishing gradients [45]. He et al. reformulated the layers as learning residual functions instead of directly fitting a desired underlying mapping. Chen et al. and Fan et al. proposed DL methods based on ResNet with 101 and 50 weight layers, respectively, to predict dose distributions for head/neck cancer IMRT patients [15, 120].
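The residual formulation mentioned above, in which the stacked layers learn a residual function F(x) that is added to the input x, can be sketched as follows. This is a toy fully connected version written only for illustration; the weight matrices, the two-layer form of F and the function names are assumptions, and practical ResNets use convolutional layers with batch normalization.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def linear(inputs, weights):
    """Matrix-vector product: one row of weights per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def residual_block(inputs, weights_1, weights_2):
    """Compute relu(F(x) + x), where F consists of two stacked linear layers."""
    hidden = relu(linear(inputs, weights_1))
    residual = linear(hidden, weights_2)                      # F(x)
    return relu([f + x for f, x in zip(residual, inputs)])    # shortcut adds x

x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
half = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]

# With all-zero weights F(x) = 0, so the block passes x through unchanged.
passthrough = residual_block(x, zeros, zeros)
# With non-zero weights the block outputs x plus a learned correction.
corrected = residual_block(x, identity, half)
```

The first call shows why such blocks are easy to stack: a block whose weights are zero reduces to the identity mapping, so it does not degrade the signal passed to later layers.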
Since networks with very deep layers are difficult to train due to vanishing gradients, such networks use shortcut connections whose outputs are added to the outputs of the stacked layers [45]. More recently, Liu et al. proposed a ResNet for dose prediction in nasopharyngeal cancers for Helical Tomotherapy. To achieve multi-scale feature learning, Liu et al. divided the ResNet into several parts without fully connected layers and combined each with the input data to achieve pixel-wise feature abstraction and extraction in the structural image.

3.3.2 Overview of GAN-based works

GAN entails a pair of neural networks: a generator and a discriminator. From the treatment planning standpoint, the generator could be represented as the treatment planner who generates the plan, and the discriminative network could be represented as the radiation oncologist who evaluates the plan generated by the treatment planner. Both the treatment planner and the radiation oncologist get better at performing their tasks as they become experienced over time. Only a handful of studies have investigated the performance of GAN for the dose prediction task, as shown in Table 5. Mahmood et al. demonstrated the first use of a 2D GAN for predicting dose for each 2D slice independently for oropharyngeal cancer. Subsequently, Babier et al. proposed the first 3D GAN for prediction of full 3D dose distributions, which outperformed the 2D GAN model proposed by Mahmood et al., presumably owing to its ability to learn the vertical relationship between adjacent axial slices, in contrast to 2D GAN networks. Recently, Vasant et al. proposed a novel 3D attention-gated generative adversarial network (DoseGAN) as a superior alternative to current state-of-the-art dose prediction networks [131]. Spatial self-attention allows networks to emphasize portions of the intermediate convolution layers. The attention gate proposed by Vasant et al.
can potentially offer deeper and more efficient discrimination, while being trained in parallel with the generator network and facilitating model convergence [131]. This addresses the issue of keeping the number of network parameters as low as possible in conventional GAN. The attention-gated GAN proposed by Vasant et al. outperformed conventional 2D and 3D GANs in all dosimetric criteria including PTV and OARs [131]. All four studies [4, 85, 98, 131] on GAN-based dose predictions constructed a generator and discriminator network using the pix2pix architecture proposed by Isola et al. [52]. A U-Net generator was used, which passes a contoured CT image slice through consecutive layers, a bottleneck layer and subsequent deconvolution layers. U-Net also uses skip connections to easily pass high-dimensional information between the input (CT image slice or contoured structures) and the output (dose slice).

Figure 4. The total number of DL-based dose prediction investigations for various cancer sites. NC = Nasopharyngeal Cancer; PD = Personalized Dosimetry.

3.3.3 Overview of learning processes

In this section, we briefly present a review of the learning processes, including supervised learning (SL), unsupervised learning (USL) and semi-supervised learning (SSL), that have been utilized so far in DL-based dose prediction tasks. Earlier approaches used SL, which trained a model by using labeled data in the form of different geometrical parameters and distance to the target. In contrast, USL does not require such target information and relies solely on the input data to learn the patterns hidden within raw data. A typical example of USL is training a deep autoencoder (DAE), which has a flexible network structure with an encoder and a decoder. These USL networks can be CNNs, fully connected networks, or hybrids [116]. It can be seen from Table 5 that USL is the most widely used learning strategy in DL-based dose prediction tasks.
A category that falls between USL and SL is SSL. SSL is commonly used for tasks in which the target information is only partially available. GAN, a popular SSL method, has also been utilized for dose prediction tasks (Table 5).

Table 4. A list of articles with investigations on the effects of outliers on plan quality, and a summary of the evaluation metrics used by RapidPlan™ with thresholds in parentheses.

Reference | Method | Outlier
[97] | Restricted sum of residual (RSR) | Dosimetric
[20] | Regression and residual analysis | Dosimetric
[118] | Leverage and studentized residual | Dosimetric, Geometric
[1] | Regression analysis scatter plots, Cook's distance | Dosimetric, Geometric
[165] | Model-based case filtering | Dosimetric, Geometric
RapidPlan™ | Cook's distance (> 10.0), modified Z-score (> 3.5), studentized residual (> 3.0), areal difference of estimate (> 3) | Dosimetric and Geometric

3.3.4 Influence of various parameters on DL-based model performance
Input parameters
In terms of the number of input parameters, Willems et al. studied the impact of four different inputs (Table 5) on dose prediction with and without normalization of the dose distribution. The order of models in terms of performance was CT + isocenter + contours > CT + contours > CT + isocenter > CT only. While dose distribution normalization had more benefits for the CT + contours model, it was found to be less necessary for the CT + isocenter + contours model, whereas normalization produced hot and cold spots for the CT + isocenter model [147]. While many studies use only CT with anatomical information (i.e., PTV and OAR contours) as inputs to the CNN [5, 100, 99], as can be seen in Table 5, Barragán-Montero et al. included beam geometry information along with anatomical information as inputs. As a result, the model was able to learn from a database that was heterogeneous in terms of beam configurations (i.e., noncoplanar) [5], which was a limitation of the network proposed in earlier studies [99]. For rectal cancer IMRT, Zhou et al. showed improvements in prediction accuracy by including beam configurations as input to the network compared to a network without beam configurations [168]. For head/neck cancer, Chen et al. investigated the influence of adding out-of-field labels into the network training to deal with the inability of a 2D network to account for radiation beam geometry. This resulted in better overall performance compared to the network excluding out-of-field labels [15]. For prostate cancer, Murakami et al. compared the performance of a CT-only based GAN with a contour-based GAN in predicting target images (i.e., RT-dose images) and found the prediction performance of the contour-based GAN to be superior.

Loss functions
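As a concrete reference for the cost functions compared in this subsection, the sketch below evaluates MSE, MAE and the Huber loss on a handful of voxel doses. The dose values (in Gy), the Huber threshold `delta` and the function names are illustrative assumptions and are not taken from any of the cited studies.

```python
def mse(pred, target):
    """Mean square error: penalises large voxel errors quadratically."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mae(pred, target):
    """Mean absolute error: penalises all voxel errors linearly."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for small errors, linear beyond delta."""
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        total += 0.5 * err ** 2 if err <= delta else delta * (err - 0.5 * delta)
    return total / len(pred)

target = [70.0, 70.0, 20.0, 5.0]     # hypothetical voxel doses in Gy
close = [69.0, 71.0, 21.0, 4.0]      # every voxel off by 1 Gy
outlier = [70.0, 70.0, 20.0, 15.0]   # a single voxel off by 10 Gy

losses_close = (mse(close, target), mae(close, target), huber(close, target))
losses_outlier = (mse(outlier, target), mae(outlier, target), huber(outlier, target))
```

For the uniformly perturbed prediction, MSE and MAE coincide at 1.0; for the prediction with one 10 Gy error, MSE rises to 25.0 while MAE only reaches 2.5, which illustrates the greater sensitivity of MSE to outlying voxels.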
In terms of losses, MSE is one of the most widely used cost functions in DL methods as it has many desirable properties from an optimization standpoint. Owing to its simplicity, well-behaved gradient and convexity, the majority of previous studies, including the ones shown in Table 5, utilized only the MSE loss for dose prediction. Nguyen et al. trained a network with a domain-specific loss function by adding a nonconvex DVH loss and an adversarial loss to the MSE loss function. While this outperformed the dose predictions of the model trained with MSE alone, for the same computational system, it increased the training time to 3.8 days for 100,000 iterations compared to 1.5 days for the MSE-only network [101]. Lee et al. and Chen et al. utilized the mean absolute error (MAE) cost function between the ground truth and the dose rate map predicted by the CNN [15, 69]. A key difference between MSE and MAE is that MAE is more robust to outliers but may be less efficient in finding the solution, whereas MSE provides a more stable and closed-form solution. Other loss functions include the Huber loss, smooth mean absolute error, quantile loss, and log-cosh loss. So far, the MSE loss function has been the standard cost function used in DL-based dose prediction studies.

Sample size
In general, DL-based methods require a large amount of high-quality data to be effective. A small dataset in DL can be challenging as it can result in overfitting. Overfitting occurs when the model is trained to exactly fit a set of training data but cannot learn the hidden pattern needed to maintain model generality [116]. Data augmentation [122], dropout layers [121], estimation based on the training and validation curves [100], synthesizing new data based on physics principles [86] or incorporating regularization of model parameters [132] have been used in the literature to prevent overfitting. Data augmentation, which is more commonly used in dose prediction approaches, expands the dataset by synthesizing additional realistic samples from available samples. It is important to note here, however, that the choice of augmentation process depends on its suitability for the context. For the purpose of dose prediction, we have presented the average training and testing sample sizes for each treatment site in Figure 5 for all DL-based dose prediction methods to date, which provides readers with an approximate range of training and testing dataset sizes for each cancer site.

As shown in Table 5, three investigations on prostate cancer have been reported so far for predicting Pareto optimal dose distributions [6, 84, 101]. For each patient in the training set, 10, 100, and 500 plans were generated by Ma et al., Nguyen et al., and Bohara et al., respectively, to sample the Pareto surface with different trade-offs. The optimal number of plans per patient in the training set is unknown as it may vary from case to case. Nonetheless, in the case of predicting Pareto optimal plans, it may be ideal to stay within the clinically relevant regime by including only those plans that cover the dosimetric trade-offs presented by a physician.

Kandalan et al.
studied the issue of generalizing DL-based dose prediction models and made use of transfer learning to adapt a DL dose prediction model to different planning styles in the same institution and to planning practices at different institutions. A source model was adapted to four different planning styles with only 14 to 29 cases [57]. A long-term goal of these studies is to generate a universal model that can easily be transferred to different institutions for a similar task.

Table 5.
A list of publications on DL-based dose prediction for various treatment sites.
Reference | Architecture | Input | Output
[119] | ANN | Number of fields, PTV volume, PTV to OAR distance, azimuthal and elevation angles | 3D
[10] | ANN | Distance to PTV, distance to OARs, PTV volume | 3D
[125] | ANN | 16 different geometrical parameters | 3D
[99] | Modified U-Net | PTV + OAR + Prescription | 2D
[27] | ResNet-50 | CT + OAR + PTV images + dose distribution image | 2D
[60] | 3D-FCN | 3D CT + OAR + Prescription | 3D
[82] | U-Res-Net | 3D CT + OAR | 3D
[85] | GAN | Contoured CT images + dose distribution | 2D
[5] | HD U-Net | OAR + PTV + Beam information with approximated dose | 3D
[15] | CNN-ResNet-101 | Contoured images + coarse dose map, with out-of-field labels | 2D
[69] | U-Net | PET and CT image patches | 3D
[100] | HD U-Net | OAR + PTV | 3D
[147] | U-Net | CT only, CT + ISO, CT + Contours, CT + ISO + Contours | 2D
[84] | Modified 3D U-Net | DVHs + Contours | Pareto dose distributions
[6] | U-Net | PTV + Body + OAR, PTV + Body + OAR + Dose information from selected beam angles | Pareto dose distributions
[28] | 3D U-Net DRN | CT + FMCV | 3D
[42] | Modified U-Net | Density map + 3D CT + Activity map | 2D
[56] | U-Net | PTV + OAR contours | 3D
[98] | GAN | CT + RT Doses, PTV + OAR | 2D
[120] | ResNet-50 | CT + OAR + PTV + body contours | 3D
[122] | U-Net | Low resolution dose + CT | 2D
[156] | HD U-Net | CT + RT dose distribution | 3D
[123] | GAN | CT + PTV + OAR | 2D
[131] | Attention-gated GAN | CT + PTV + OAR | 3D
[101] | GAN | PTV + OAR + Body | Pareto dose distribution
[4] | 3D GAN | Contoured CT images | 3D
[168] | 3D U-Net + Residual Network | CT + OAR + PTV contours + Beam + Dose | 3D
[57] | 3D U-Net | OAR + PTV contours | 2D
[77] | Dense-Res hybrid network | Beam + structural information | Static field fluence prediction
[115] | Virtual Treatment Planner Network | DVH | TPPs adjustment action

FMCV = Fluence Map Converted Volume; PTV = Planning Target Volume; OAR = Organ at Risk; GAN = Generative Adversarial Network; HD = Hierarchically Dense; DVH = Dose Volume Histograms; TPP = Treatment Planning Parameter

Figure 5. The average training and testing sample size in DL-based dose prediction methods for each cancer site. The values are averaged over the number of investigations listed on the top x-axis and the error bars represent the standard deviation.

With the aim of minimizing the variations in treatment planning and improving treatment planning efficiency, namely by avoiding the time-consuming trial-and-error process of planning a treatment from scratch for every patient, researchers introduced the concept of using previously delivered treatment plans to guide treatment planning for a new patient. This concept is known today as knowledge-based planning. In the last decade, there has been a rapid growth in the number of publications on traditional KBP dose prediction. On the other hand, the number of publications on DL has increased exponentially in recent years owing to its flexibility and superior performance compared to many state-of-the-art techniques. Over 90 papers have been published on traditional KBP dose prediction methods between 2011 and August 2020, whereas over 15 publications have already been published on DL-based dose prediction this year so far. In general, most papers demonstrated improvements in comparison to manually optimized clinical plans in terms of both treatment planning quality and efficiency. A large number of manuscripts were published on traditional methods between 2015 and 2018, with the highest number of publications in 2017.
This is presumably due to the commercialization of RapidPlan™ in the Eclipse treatment planning software in 2014, which allowed researchers from different centers to perform a range of retrospective studies investigating the influence of various parameters on the quality of plans generated through RapidPlan™ KBP. While the number of traditional KBP publications has been quite steady in the past 2 years, DL-based publications have been rapidly increasing since 2017. In terms of modality, both techniques were mostly applied to IMRT, VMAT and other noncoplanar intensity-modulated external beam radiation therapy treatments. Only a small number of data-driven dose prediction studies were reported for magnetic resonance imaging guided radiation therapy (MRgRT) [125]. The number of traditional KBP and DL-based publications for on-table adaptation may increase in the future, owing to recent technical developments such as the MR-Linear Accelerator (MR-Linac). In terms of treatment sites, prostate, head/neck and lung were among the most investigated sites in both traditional KBP and DL-based methods compared to complex abdominal or cranial sites. This was anticipated, as both KBP techniques require a large training sample set and these three are commonly treated sites in external beam radiation therapy. Therefore, a large repository of previously treated plans is more likely to be available for building dose prediction models for these three sites than for other, more complex sites. In KBP, the three commonly reported dose prediction outputs in the literature were the entire DVH curve (Table 1), one or more dose metrics (Table 2) and voxel-based dose prediction (Table 3). A known limitation of DVH prediction is that DVHs are only predicted for contoured OARs, which may limit the ability to account for conformity and for hotspots that may occur outside the regions of interest.
This was addressed through voxel-based dose prediction approaches, in which models are built to predict the dose to individual voxels within the CT image. However, this approach relies heavily on the quality of the plans used to build the model, as the inclusion of outliers can compromise model performance. Even for RapidPlan™-based KBP, several studies indicated the need to investigate the proper identification of outlier plans [31, 32, 127]. Outlier identification in RapidPlan™ involves statistics and regression plots for each structure, with Cook's distance > 10.0, studentized residual > 3.0, areal difference of estimate > 3, and modified Z-score > 3.5 suggested as indicators of potential outliers [133]. To an extent, this also requires removing outliers iteratively, either stopping once no significant improvement is observed or identifying the outliers and re-planning all of them so that they can be reused in the training cohort [1]. The time required to address outliers may vary from one institution to another: institutions without standardized techniques can have many dosimetric outliers, presumably due to large variations in treatment planning, which, in turn, can make eliminating outliers through visual inspection or additional statistical analysis a time-consuming process. In the literature, limited emphasis has been placed on establishing a systematic process for identifying dosimetric and geometric outliers. To our knowledge, there is currently no well-established workflow for outlier identification and mitigation during model creation for either KBP technique. Therefore, a standardized automated method of outlier identification and model creation could further enhance the treatment planning experience [165].
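As a rough illustration of how the thresholds quoted above could drive an automated screening step, the following sketch flags plans whose statistics exceed any of the four limits. It is not the RapidPlan™ implementation: the function names, the dictionary layout and the example values are hypothetical, the modified Z-score uses the common Iglewicz-Hoaglin definition (which may differ in detail from the vendor's), and comparing absolute values against the limits is an assumption.

```python
from statistics import median

# Thresholds quoted in the text for potential outliers [133].
THRESHOLDS = {
    "cooks_distance": 10.0,
    "studentized_residual": 3.0,
    "areal_difference_of_estimate": 3.0,
    "modified_z_score": 3.5,
}


def modified_z_scores(values):
    """Iglewicz-Hoaglin modified Z-score: 0.6745 * (x - median) / MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations collapse to zero; no value can be ranked as extreme.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(plan_stats):
    """Return {plan_id: [statistics whose absolute value exceeds its limit]}."""
    flagged = {}
    for plan_id, stats in plan_stats.items():
        exceeded = [
            name
            for name, limit in THRESHOLDS.items()
            if abs(stats.get(name, 0.0)) > limit  # missing statistics are ignored
        ]
        if exceeded:
            flagged[plan_id] = exceeded
    return flagged


# Made-up mean OAR doses (Gy) for six hypothetical training plans.
mean_doses = {"plan_A": 10.0, "plan_B": 11.0, "plan_C": 12.0,
              "plan_D": 11.0, "plan_E": 10.0, "plan_F": 30.0}
z_scores = modified_z_scores(list(mean_doses.values()))
plan_stats = {
    plan_id: {"modified_z_score": z, "cooks_distance": 0.5}
    for plan_id, z in zip(mean_doses, z_scores)
}
plan_stats["plan_B"]["cooks_distance"] = 12.0  # made-up influential plan
flagged = flag_outliers(plan_stats)
```

In an iterative workflow of the kind described in [1], the flagged plans would be removed or re-planned and the model retrained until no further improvement is observed; that loop is left out here because its stopping criterion is institution-specific.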
In contrast to previous reviews that presented the training and testing sample sizes per year [38, 51], we separated the datasets by cancer site for traditional KBP and DL-based methods in Figures 3 and 5, respectively. This provides readers with a range of training sample sizes for each cancer site, as the required size of the training set depends not only on the prediction model but also on the complexity of the treatment site. For instance, more cases may be required to represent the case population when training a model for a complex site such as head/neck than for a simpler site such as prostate. A direct comparison of training sample size between traditional and more recent DL-based KBP was not made, as DL-based dose prediction is a relatively new technique with fewer investigations per site than traditional KBP methods.

In contrast to DL, an inherent limitation of traditional methods is that they are unable to process raw data and extract the important features and patterns hidden within. Both the similarity measures in atlas-based methods and the input features to model-based methods require considerable effort to extract valuable features (e.g., overlap volume histogram, OAR distance to the PTV, projections, etc.) that transform the raw data into a representation from which either the best matched case can be identified or patterns within the input can be classified. In traditional approaches, PCA has been widely used in the literature for feature selection owing to its simplicity. However, a major limitation of PCA is that it learns a low-dimensional representation of the data only through a linear projection. DNNs, by contrast, can address this issue and untangle non-linear relationships. For instance, an autoencoder is a type of neural network that consists of an encoder, which encodes the input into a low-dimensional latent space, and a decoder, which restores the original input from that latent space [37].
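The linear-projection limitation of PCA can be seen on a toy example. The sketch below compresses 2-D points lying on a parabola to a single number and reconstructs them, once with one-component PCA and once with a hand-written nonlinear encoder/decoder pair. The nonlinear pair is a stand-in chosen for this curve, not a trained autoencoder, and all names and data are illustrative assumptions.

```python
import math


def pca_1d_reconstruction_error(points):
    """Mean squared error after projecting 2-D points onto their first principal axis."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    c = sum(y * y for _, y in centered) / n
    # Orientation of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    ux, uy = math.cos(theta), math.sin(theta)
    error = 0.0
    for x, y in centered:
        z = x * ux + y * uy      # linear encoding to one latent value
        rx, ry = z * ux, z * uy  # linear decoding back to 2-D
        error += (x - rx) ** 2 + (y - ry) ** 2
    return error / n


def nonlinear_reconstruction_error(points):
    """Same error for a hand-written nonlinear encoder/decoder pair."""
    error = 0.0
    for x, y in points:
        z = x                    # encoder: keep the curve parameter
        rx, ry = z, z * z        # decoder: map back onto the parabola
        error += (x - rx) ** 2 + (y - ry) ** 2
    return error / len(points)


# Points on y = x^2 for x in [-1, 1]: a 1-D structure embedded nonlinearly in 2-D.
curve = [(t / 10.0, (t / 10.0) ** 2) for t in range(-10, 11)]
pca_error = pca_1d_reconstruction_error(curve)
nonlinear_error = nonlinear_reconstruction_error(curve)
```

With a single latent value, the linear projection leaves a clearly non-zero reconstruction error on this curve, whereas the nonlinear pair recovers the points essentially exactly. A trained autoencoder has to learn such a mapping from data rather than being given it.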
This encoder-decoder design has been adopted in DL-based dose prediction methods (Table 5), and extensions of such unsupervised methods are anticipated in the near future to further enhance dose prediction accuracy. In terms of DL-based dose prediction methods, the two most investigated network types thus far are CNNs and GANs. From the results so far, GANs may be a better choice than conventional CNNs for dose prediction tasks for several reasons. First, GANs have been shown to perform well in lesion detection and data augmentation tasks [4, 35]. In addition, a GAN does not rely purely on a spatial loss, such as the mean squared error between dose volumes, which makes it a suitable candidate not only for dose prediction in conventional radiation therapy but also for SBRT, in which dose heterogeneity is prevalent. Furthermore, Babier et al. found that GAN models did not require significant parameter tuning or architecture modification during implementation compared to other conventional methods [4]. However, in contrast to CNNs, one limitation of conventional GANs is that they are difficult to train and require the number of network parameters to be as low as possible. Future studies are anticipated to address such shortcomings by proposing network extensions such as the attention-gated GAN [131].

For head/neck cancer, the difference between the traditional KBP predicted and actual median doses for the parotids ranged from -17.7% to 15.3% [78], whereas it ranged from 7.7% to 13.5% for DL-based dose prediction [15]. At the same level of prediction accuracy, DL-based KBP was able to predict the median dose for 80% of parotids compared to 63% for the traditional KBP method [15]. Kajikawa et al. directly compared the dose distribution predicted by a DL method with that generated by RapidPlan™ for prostate cancer [56]. This dosimetric comparison showed that the CNN predicted the DVH significantly more accurately for D98 in PTV-2 and for V35, V50 and V65 in the rectum.
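For readers less familiar with the DVH metrics named above, the following minimal sketch shows how D98-style and V35-style quantities can be read from the voxel doses of a single structure, and how a predicted metric can be compared with the clinical one. It assumes equal voxel volumes and treats the Vx threshold as an absolute dose. Whether x is in Gy or percent of prescription depends on the protocol. The function names and the dose values are illustrative only.

```python
import math


def dose_at_volume(doses, volume_percent):
    """D_x: the minimum dose received by the hottest x% of the structure's voxels."""
    ranked = sorted(doses, reverse=True)
    n_voxels = max(1, math.ceil(len(ranked) * volume_percent / 100.0))
    return ranked[n_voxels - 1]


def volume_at_dose(doses, dose_threshold):
    """V_x: percentage of the structure's voxels receiving at least dose_threshold."""
    return 100.0 * sum(d >= dose_threshold for d in doses) / len(doses)


def mean_absolute_difference(predicted, actual):
    """Average absolute gap between predicted and clinical values of one metric."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up structure with 100 equal-volume voxels receiving 1, 2, ..., 100 Gy.
voxel_doses = [float(d) for d in range(1, 101)]
d98 = dose_at_volume(voxel_doses, 98)    # dose covering 98% of the volume
v50 = volume_at_dose(voxel_doses, 50.0)  # percent volume receiving >= 50 Gy
```

Applying `mean_absolute_difference` to a metric such as V50 across a test cohort gives the kind of per-metric prediction error that the comparisons in this section report.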
Given that the features automatically extracted by DL methods can include both geometric/anatomic features and the mutual tradeoffs between the OARs, DL methods have an edge in dose prediction accuracy over traditional KBP methods, which mainly rely on DVH and geometry-based expected dose. For oropharyngeal cancer, Mahmood et al. directly compared a GAN approach for generating predicted dose distributions with several traditional approaches, including bagging query [3, 148], generalized PCA [161] and random forest [90]. Through gamma analysis [83], Mahmood et al. demonstrated that GAN plans were the most similar to the clinical plans and achieved 4.0% to 7.6% improvements in the frequency of clinical criteria satisfaction compared to traditional approaches [85]. For prostate SBRT, Kearney et al. compared the performance of their proposed attention-gated GAN with an earlier approach that used relative distance map information of neighboring input structures [119]. In contrast to conventional radiation therapy, SBRT produces hot spots within the target volume. The mean absolute difference in V120 between the KBP-like approach and the actual plan was four-fold higher than that achieved by the attention-gated GAN technique, demonstrating the ability of a DL-based method to predict the cold spots and hotspots that are prevalent in SBRT dose volumes. Both traditional and DL-based KBP approaches use data from previously treated patients to make dose predictions for a new patient. DL-based approaches, however, have been shown to outperform traditional methods in dose prediction tasks, as demonstrated by several studies in the literature. This is presumably because DL methods, unlike traditional KBP, are not limited to a small number of features.

From the statistics of publications on data-driven dose prediction approaches in recent years, there is a clear trend of transition from traditional methods to DL-based methods for KBP.
This is presumably due to the flexibility and superior performance of DL-based approaches compared to traditional approaches. In terms of traditional KBP methods, future investigations are anticipated to be retrospective in nature, using clinically available tools (e.g., RapidPlan™). On the other hand, DL-based methods appear to be in their initial development stage; hence, their potential will be explored in different areas of dose prediction in the treatment planning workflow, including adaptive radiotherapy, in the near future. Adaptive radiotherapy (ART) involves adjusting the dose distribution based on anatomical changes observed on intra-procedural imaging such as CBCT. The standard approach requires the physician to recontour the OAR and tumor regions, followed by plan re-optimization, which is difficult to implement in ART. To date, only one study has been reported to adopt DL methods for the purpose of ART of head/neck cancer [123]. The future trend will certainly be towards utilizing the flexibility and efficiency offered by DL-based methods to build prediction models of dosimetric changes and radiotherapy response for ART. After dose prediction, a main component of the treatment planning workflow is ensuring the achievability of the predicted dose plans, which often involves inverse treatment planning with manual intervention. Only a handful of studies extended such data-driven approaches into a fully automated pipeline that not only predicts the dose distribution but also generates a complete treatment plan with minimum human interaction, in both traditional [26, 55, 89, 153, 171] and DL-based methods [4, 77, 85]. The deliverability of the predicted plans is even more important, as it has to account for various mechanical and algorithmic constraints. It is important to note here that good predictions with low error may not necessarily lead to a final deliverable plan with the same performance on clinical criteria.
For instance, five of the seven prediction methods investigated by Babier et al. resulted in significantly worse clinical criteria satisfaction despite lower error at the dose prediction stage [4]. We therefore believe that synchronizing an inverse optimization engine with dose prediction methods holds great potential for improving treatment planning efficacy and efficiency. Alternatively, DL-based fluence prediction has also been proposed for real-time prostate treatment planning [77]. In this approach, predicted fluence maps are converted to a deliverable treatment plan through delivery parameter generation and dose calculation directly in the treatment planning software. Such approaches do not require an inverse optimization process and involve minimal human intervention. A subsequent task, after generating a deliverable plan, is patient-specific quality assurance (QA), with measurements performed routinely prior to treatment delivery to ensure delivery and dosimetric accuracy. Several ML [63] and DL [105] approaches have been reported for predicting gamma passing rates for IMRT patient-specific QA. More effort is also anticipated to go into incorporating such approaches into the treatment planning pipeline to establish a fully automated workflow. One of the challenges of data-driven algorithms, including both ML and DL, is that they require a large set of high-quality data. Since the quality of data and radiotherapy practices vary from one center to another, the heterogeneity in previously treated plans becomes a major obstacle to the deployment of data-driven solutions in radiation oncology. To address this issue, transfer learning for adapting models to the different treatment planning practices at different centers may be investigated further in the future. A long-term goal of this area of investigation would be to incorporate data-driven predictive tools as a part of the clinical pathway.
In the last decade, a tremendous amount of work has been done towards automation to improve treatment planning quality and efficiency. We have reviewed two major KBP approaches to dose prediction: traditional KBP methods, with over 90 articles, and the more recently introduced DL-based KBP, with nearly 30 articles. While traditional approaches are either equivalent or superior to an experienced planner and offer greater efficiency, recent developments in DL hold greater potential for dose prediction tasks. Both KBP approaches, however, need to be extended to more complex sites such as abdominal and intracranial sites. Given the commercial accessibility of the RapidPlan™ module, more retrospective studies are foreseen in the future. However, new DL-based KBP approaches are actively being introduced and trending steeply upward. There are various areas of future research, several of which have been highlighted in this review, required to achieve the ultimate goal of a fully automated treatment planning system.

Acknowledgements
This research is supported in part by the National Cancer Institute of the National Institutes of Health under Award Number R01CA215718, the Department of Defense (DoD) Prostate Cancer Research Program (PCRP) Award W81XWH-17-1-0438 and the Dunwoody Golf Club Prostate Cancer Research Award, a philanthropic award provided by the Winship Cancer Institute of Emory University.

Disclosures
The authors declare no conflicts of interest.
References [1] Jorge Edmundo Alpuche Aviles, Maria Isabel Cordero Marcos, David Sasaki, Keith Sutherland,Bill Kane, and Esa Kuusela. Creation of knowledge-based planning models intended for largescale distribution: Minimizing the e ff ect of outlier plans. J. App. Clin. Med. Phys. , 19(3):215–226,2018. ISSN 1526-9914.[2] Lindsey M Appenzoller, Je ff M Michalski, Wade L Thorstad, Sasa Mutic, and Kevin L Moore.Predicting dose-volume histograms for organs-at-risk in imrt planning.
Med. Phys. , 39(12):7446–7461, 2012. ISSN 0094-2405.[3] Aaron Babier, Justin J Boutilier, Andrea L McNiven, and Timothy CY Chan. Knowledge-basedautomated planning for oropharyngeal cancer.
Med. Phys. , 45(7):2875–2883, 2018. ISSN 2473-4209.[4] Aaron Babier, Rafid Mahmood, Andrea L McNiven, Adam Diamant, and Timothy CY Chan.Knowledge-based automated planning with three-dimensional generative adversarial networks.
Med. Phys. , 47(2):297–306, 2020. ISSN 0094-2405.[5] Ana Mara Barragn-Montero, Dan Nguyen, Weiguo Lu, Mu-Han Lin, Roya Norouzi-Kandalan,Xavier Geets, Edmond Sterpin, and Steve Jiang. Three-dimensional dose prediction for lung imrtpatients with deep neural networks: robust learning from heterogeneous beam configurations.
Med. Phys. , 46(8):3679–3691, 2019. ISSN 0094-2405.[6] Gyanendra Bohara, Azar Sadeghnejad Barkousaraie, Steve Jiang, and Dan Nguyen. Using deeplearning to predict beam-tunable pareto optimal dose distribution for intensity modulated radi-ation therapy.
Med. Phys. , 2020. ISSN 0094-2405.[7] Justin J Boutilier, Tim Craig, Michael B Sharpe, and Timothy CY Chan. Sample size requirementsfor knowledge-based treatment planning.
Med. Phys. , 43(3):1212–1221, 2016. ISSN 0094-2405.[8] Freddie Bray, Jacques Ferlay, Isabelle Soerjomataram, Rebecca L Siegel, Lindsey A Torre, andAhmedin Jemal. Global cancer statistics 2018: Globocan estimates of incidence and mortalityworldwide for 36 cancers in 185 countries.
CA: Cancer. J. Clin. , 68(6):394–424, 2018. ISSN 0007-9235.[9] Elisabetta Cagni, Andrea Botti, Renato Micera, Maria Galeandro, Roberto Sghedoni, Matteo Or-landi, Cinzia Iotti, Luca Cozzi, and Mauro Iori. Knowledge-based treatment planning: An inter-technique and inter-system feasibility study for prostate cancer.
Phys. Med. , 36:38–45, 2017. ISSN1120-1797.[10] Warren G Campbell, Moyed Miften, Lindsey Olsen, Priscilla Stumpf, Tracey Schefter, Karyn AGoodman, and Bernard L Jones. Neural network dose models for knowledge-based planning inpancreatic sbrt.
Med. Phys. , 44(12):6148–6158, 2017. ISSN 0094-2405.[11] Amy TY Chang, Albert WM Hung, Fion WK Cheung, Michael CH Lee, Oscar SH Chan, HelenPhilips, Yung-Tang Cheng, and Wai-Tong Ng. Comparison of planning quality and e ffi ciencybetween conventional and knowledge-based algorithms in nasopharyngeal cancer patients usingintensity modulated radiation therapy. Int. J. Radiat. Oncol. Biol. Phys. , 95(3):981–990, 2016. ISSN0360-3016.[12] Vorakarn Chanyavanich, Shiva K Das, William R Lee, and Joseph Y Lo. Knowledge-based imrttreatment planning for prostate cancer.
Med. Phys. , 38(5):2515–2522, 2011. ISSN 0094-2405.[13] Avishek Chatterjee, Monica Serban, Bassam Abdulkarim, Valerie Panet-Raymond, Luis Souhami,George Shenouda, Siham Sabri, Bertrand Jean-Claude, and Jan Seuntjens. Performance ofknowledge-based radiation therapy planning for the glioblastoma disease site.
Int. J. Radiat.Oncol. Biol. Phys. , 99(4):1021–1028, 2017. ISSN 0360-3016.2314] Avishek Chatterjee, Monica Serban, Sergio Faria, Luis Souhami, Fabio Cury, and Jan Seuntjens.Novel knowledge-based treatment planning model for hypofractionated radiotherapy of prostatecancer patients.
Phys. Med. , 69:36–43, 2020. ISSN 1120-1797.[15] Xinyuan Chen, Kuo Men, Yexiong Li, Junlin Yi, and Jianrong Dai. A feasibility study on anautomated method to generate patient-specific dose distributions for radiotherapy using deeplearning.
Med. Phys. , 46(1):56–64, 2019. ISSN 0094-2405.[16] Karen Chin Snyder, Jinkoo Kim, Anne Reding, Corey Fraser, James Gordon, Munther Ajlouni,Benjamin Movsas, and Indrin J Chetty. Development and evaluation of a clinical model for lungcancer patients using stereotactic body radiotherapy (sbrt) within a knowledge-based algorithmfor treatment planning.
J. App. Clin. Med. Phys. , 17(6):263–275, 2016. ISSN 1526-9914.[17] Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. Star-gan: Unified generative adversarial networks for multi-domain image-to-image translation. In
Proceedings of the IEEE conference on computer vision and pattern recognition , pages 8789–8797.[18] Benjamin T Cooper, Xiaochun Li, Samuel M Shin, Aram S Modrek, Howard C Hsu, JK DeWyn-gaert, Gabor Jozsef, Stella C Lymberis, Judith D Goldberg, and Silvia C Formenti. Preplanningprediction of the left anterior descending artery maximum dose based on patient, dosimetric,and treatment planning parameters.
Adv. Radiat. Oncol. , 1(4):373–381, 2016. ISSN 2452-1094.[19] Mariel Cornell, Robert Kaderka, Sebastian J Hild, Xenia J Ray, James D Murphy, Todd F At-wood, and Kevin L Moore. Noninferiority study of automated knowledge-based planning versushuman-driven optimization across multiple disease sites.
Int. J. Radiat. Oncol. Biol. Phys. , 106(2):430–439, 2020. ISSN 0360-3016.[20] Alexander R Delaney, Jim P Tol, Max Dahele, Johan Cuijpers, Ben J Slotman, and Wilko FARVerbakel. E ff ect of dosimetric outliers on the performance of a commercial knowledge-basedplanning solution. Int. J. Radiat. Oncol. Biol. Phys. , 94(3):469–477, 2016. ISSN 0360-3016.[21] Alexander R Delaney, Max Dahele, Jim P Tol, Ingrid T Kuijper, Ben J Slotman, and Wilko FARVerbakel. Using a knowledge-based planning solution to select patients for proton therapy.
Ra-diother. Oncol. , 124(2):263–270, 2017. ISSN 0167-8140.[22] Ruchi R Deshpande, John DeMarco, James W Sayre, Brent J journal of computer assisted radi-ology Liu, and surgery. Knowledge-driven decision support for assessing dose distributions inradiation therapy of head and neck cancer.
Int. J. Comp. Ass. Radiol. , 11(11):2071–2083, 2016.ISSN 1861-6410.[23] Xue Dong, Yang Lei, Sibo Tian, Tonghe Wang, Pretesh Patel, Walter J Curran, Ashesh B Jani, TianLiu, and Xiaofeng Yang. Synthetic mri-aided multi-organ segmentation on male pelvic ct usingcycle consistent deep attention network.
Rad. Oncol. , 141:192–199, 2019. ISSN 0167-8140.[24] Xue Dong, Yang Lei, Tonghe Wang, Matthew Thomas, Leonardo Tang, Walter J Curran, Tian Liu,and Xiaofeng Yang. Automatic multiorgan segmentation in thorax ct images using u-net-gan.
Med. Phys. , 46(5):2157–2168, 2019. ISSN 0094-2405.[25] Vishruta A Dumane, James Tam, Yeh-Chi Lo, and Kenneth E Rosenzweig. Rapidplan forknowledge-based planning of malignant pleural mesothelioma.
Pract. Radiat. Oncol. , 2020. ISSN1879-8500.[26] Jiawei Fan, Jiazhou Wang, Zhen Zhang, and Weigang Hu. Iterative dataset optimization in au-tomated planning: Implementation for breast and rectal cancer radiotherapy.
Med. Phys. , 44(6):2515–2531, 2017. ISSN 0094-2405.[27] Jiawei Fan, Jiazhou Wang, Zhi Chen, Chaosu Hu, Zhen Zhang, and Weigang Hu. Automatictreatment planning based on three-dimensional dose distribution predicted from deep learningtechnique.
Med. Phys. , 46(1):370–381, 2019. ISSN 0094-2405.[28] Jiawei Fan, Lei Xing, Peng Dong, Jiazhou Wang, Weigang Hu, and Yong Yang. Data-driven dosecalculation algorithm based on deep learning. page arXiv:2006.15485, 2020.2429] Austin M Faught, Lindsey Olsen, Leah Schubert, Chad Rusthoven, Edward Castillo, RichardCastillo, Jingjing Zhang, Thomas Guerrero, Moyed Miften, and Yevgeniy Vinogradskiy.Functional-guided radiotherapy using knowledge-based planning.
Radiother. Oncol. , 129(3):494–498, 2018. ISSN 0167-8140.[30] A Fogliata, G Reggiori, A Stravato, F Lobefalo, C Franzese, D Franceschini, S Tomatis, P Mancosu,M Scorsetti, and L Cozzi. Rapidplan head and neck model: the objectives and possible clinicalbenefit.
Radiat. Oncol. , 12(1):73, 2017. ISSN 1748-717X.[31] Antonella Fogliata, Francesca Belosi, Alessandro Clivio, Piera Navarria, Giorgia Nicolini, MartaScorsetti, Eugenio Vanetti, and Luca Cozzi. On the pre-clinical validation of a commercial model-based optimisation engine: application to volumetric modulated arc therapy for patients withlung or prostate cancer.
Radiother. Oncol. , 113(3):385–391, 2014. ISSN 0167-8140.[32] Antonella Fogliata, Po-Ming Wang, Francesca Belosi, Alessandro Clivio, Giorgia Nicolini, Euge-nio Vanetti, and Luca Cozzi. Assessment of a model based optimization engine for volumetricmodulated arc therapy for patients with advanced hepatocellular cancer.
Radiat. Oncol. , 9(1):236,2014. ISSN 1748-717X.[33] Antonella Fogliata, Giorgia Nicolini, Alessandro Clivio, Eugenio Vanetti, Sarbani Laksar, AngeloTozzi, Marta Scorsetti, and Luca Cozzi. A broad scope knowledge based model for optimization ofvmat in esophageal cancer: validation and assessment of plan quality among di ff erent treatmentcenters. Radiother. Oncol. , 10(1):220, 2015. ISSN 1748-717X.[34] Joseph J Foy, Robin Marsh, Randall K Ten Haken, Kelly C Younge, Matthew Schipper, Yilun Sun,Dawn Owen, and Martha M Matuszak. An analysis of knowledge-based planning for stereotacticbody radiation therapy of the spine.
Pract. Radiat. Oncol. , 7(5):e355–e360, 2017. ISSN 1879-8500.[35] Maayan Frid-Adar, Idit Diamant, Eyal Klang, Michal Amitai, Jacob Goldberger, and HayitGreenspan. Gan-based synthetic medical image augmentation for increased cnn performancein liver lesion classification.
Neurocomp. , 321:321–331, 2018. ISSN 0925-2312.[36] Yabo Fu, Thomas R Mazur, Xue Wu, Shi Liu, Xiao Chang, Yonggang Lu, H Harold Li, Hyun Kim,Michael C Roach, Lauren Henke, and Deshan Yang. A novel mri segmentation method using cnn-based correction network for mri-guided adaptive radiotherapy.
Med. Phys. , 45(11):5129–5137,2018. ISSN 0094-2405.[37] Yabo Fu, Yang Lei, Tonghe Wang, Walter J Curran, Tian Liu, and Xiaofeng Yang. Deep learningin medical image registration: a review.
Phys. Med. Biol. , 2020. ISSN 0031-9155.[38] Yaorong Ge and Q Jackie Wu. Knowledge-based planning for intensity-modulated radiationtherapy: a review of data-driven approaches.
Med. Phys. , 46(6):2760–2775, 2019. ISSN 0094-2405.[39] Tawfik Giaddui, Huaizhi Geng, Quan Chen, Nancy Linnemann, Marsha Radden, Nancy Y Lee,Ping Xia, and Ying Xiao. O ffl ine quality assurance for intensity-modulated radiotherapy treat-ment plans for nrg-hn001 head and neck clinical trial using knowledge-based planning. Adv.Radiat. Oncol. , 2020. ISSN 2452-1094.[40] David Good, Joseph Lo, W Robert Lee, Q Jackie Wu, Fang-Fang Yin, and Shiva K Das. Aknowledge-based approach to improving and homogenizing intensity modulated radiation ther-apy planning quality among treatment centers: an example application to prostate cancer plan-ning.
Int. J. Radiat. Oncol. Biol. Phys. , 87(1):176–181, 2013. ISSN 0360-3016.[41] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair,Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In
Advances in neural informa-tion processing systems , pages 2672–2680.[42] Th I Gtz, C Schmidkonz, S Chen, S Al-Baddai, T Kuwert, and EW Lang. A deep learning approachto radiation dose estimation.
Phys. Med. Biol. , 65(3):035007, 2020. ISSN 0031-9155.[43] Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu,Xingxing Wang, Gang Wang, and Jianfei Cai. Recent advances in convolutional neural networks.
Pattern Recognit. , 77:354–377, 2018. ISSN 0031-3203.2544] Joseph Harms, Yang Lei, Tonghe Wang, Rongxiao Zhang, Jun Zhou, Xiangyang Tang, Walter JCurran, Tian Liu, and Xiaofeng Yang. Paired cycle-gan-based image correction for quantitativecone-beam computed tomography.
Med. Phys. , 46(9):3998–4009, 2019. ISSN 0094-2405.[45] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for imagerecognition. In
Proceedings of the IEEE conference on computer vision and pattern recognition , pages770–778.[46] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Spatial pyramid pooling in deepconvolutional networks for visual recognition.
IEEE Trans. Pat. Anal. Mach. Intel. , 37(9):1904–1916, 2015. ISSN 0162-8828.[47] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connectedconvolutional networks. In
Proceedings of the IEEE conference on computer vision and pattern recog-nition , pages 4700–4708.[48] Yuliang Huang, Sha Li, Haizhen Yue, Meijiao Wang, Qiaoqiao Hu, Haiyang Wang, Tian Li, Chen-guang Li, Hao Wu, and Yibao Zhang. Impact of nominal photon energies on normal tissue spar-ing in knowledge-based radiotherapy treatment planning for rectal cancer patients.
PloS. One. ,14(3):e0213271, 2019. ISSN 1932-6203.[49] Margie A Hunt, Andrew Jackson, Ashwatha Narayana, and Nancy Lee. Geometric factors influ-encing dosimetric sparing of the parotid glands using imrt.
Int. J. Radiat. Oncol. Biol. Phys. , 66(1):296–304, 2006. ISSN 0360-3016.[50] Mohammad Hussein, Christopher P South, Miriam A Barry, Elizabeth J Adams, Tom J Jordan,Alexandra J Stewart, and Andrew Nisbet. Clinical validation and benchmarking of knowledge-based imrt and vmat treatment planning in pelvic anatomy.
Radiother. Oncol. , 120(3):473–479,2016. ISSN 0167-8140.[51] Mohammad Hussein, Ben JM Heijmen, Dirk Verellen, and Andrew Nisbet. Automation in inten-sity modulated radiotherapy treatment planninga review of recent innovations.
Br J Radiol , 91(1092):20180270, 2018. ISSN 0007-1285.[52] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation withconditional adversarial networks. In
Proceedings of the IEEE conference on computer vision andpattern recognition , pages 1125–1134.[53] David A Ja ff ray and Mary K Gospodarowicz. Radiation therapy for cancer . Washington (DC), 2015.ISBN 9781464803499. doi: 10.1596/978-1-4648-0349-9 ch14.[54] Jue Jiang, Yu-Chi Hu, Neelam Tyagi, Pengpeng Zhang, Andreas Rimner, Joseph O Deasy, andHarini Veeraraghavan. Cross-modality (ct-mri) prior augmented deep learning for robust lungtumor segmentation from small mr datasets.
Med. Phys. , 46(10):4392–4404, 2019. ISSN 0094-2405.[55] Robert Kaderka, Robert C Mundt, Nan Li, Benjamin Ziemer, Victoria N Bry, Mariel Cornell,and Kevin L Moore. Automated closed-and open-loop validation of knowledge-based planningroutines across multiple disease sites.
Pract. Radiat. Oncol. , 9(4):257–265, 2019. ISSN 1879-8500.[56] Tomohiro Kajikawa, Noriyuki Kadoya, Kengo Ito, Yoshiki Takayama, Takahito Chiba, Seiji To-mori, Hikaru Nemoto, Suguru Dobashi, Ken Takeda, and Keiichi Jingu. A convolutional neuralnetwork approach for imrt dose distribution prediction in prostate cancer patients.
J. Radiat.Res. , 60(5):685–693, 2019. ISSN 0449-3060.[57] Roya Norouzi Kandalan, Dan Nguyen, Nima Hassan Rezaeian, Ana M Barragan-Montero, Sebas-tiaan Breedveld, Kamesh Namuduri, Steve Jiang, and Mu-Han Lin. Dose prediction with deeplearning for prostate cancer radiation therapy: Model adaptation to di ff erent treatment planningpractices. page arXiv:2006.16481, 2020.[58] James A Kavanaugh, Sarah Holler, Todd A DeWees, Cli ff ord G Robinson, Je ff rey D Bradley,Puneeth Iyengar, Kristin A Higgins, Sasa Mutic, and Lindsey A Olsen. Multi-institutional val-idation of a knowledge-based planning model for patients enrolled in rtog 0617: implicationsfor plan quality controls in cooperative group trials. Pract. Radiat. Oncol. , 9(2):e218–e227, 2019.ISSN 1879-8500. 2659] Michael Kazhdan, Patricio Simari, Todd McNutt, Binbin Wu, Robert Jacques, Ming Chuang,and Russell Taylor. A shape relationship descriptor for radiation therapy planning. In
Interna-tional Conference on Medical Image Computing and Computer-Assisted Intervention , pages 100–108.Springer.[60] Vasant Kearney, Jason W Chan, Samuel Haaf, Martina Descovich, and Timothy D Solberg.Dosenet: a volumetric dose prediction algorithm using 3d fully-convolutional neural networks.
Phys. Med. Biol. , 63(23):235022, 2018. ISSN 0031-9155.[61] Kazuki Kubo, Hajime Monzen, Kentaro Ishii, Mikoto Tamura, Ryu Kawamorita, Iori Sumida,Hirokazu Mizuno, and Yasumasa Nishimura. Dosimetric comparison of rapidplan and manuallyoptimized plans in volumetric modulated arc therapy for prostate cancer.
Phys. Med., 44:199–204, 2017. ISSN 1120-1797.
[62] LiCheng Kuo, Ellen D Yorke, Vishruta A Dumane, Amanda Foster, Zhigang Zhang, James G Mechalakos, Abraham J Wu, Kenneth E Rosenzweig, and Andreas Rimner. Geometric dose prediction model for hemithoracic intensity-modulated radiation therapy in mesothelioma patients with two intact lungs. J. App. Clin. Med. Phys., 17(3):371–379, 2016. ISSN 1526-9914.
[63] Dao Lam, Xizhe Zhang, Harold Li, Yang Deshan, Brayden Schott, Tianyu Zhao, Weixiong Zhang, Sasa Mutic, and Baozhou Sun. Predicting gamma passing rates for portal dosimetry-based imrt qa using machine learning. Med. Phys., 46(10):4666–4675, 2019. ISSN 0094-2405.
[64] Angelia Landers, Ryan Neph, Fabien Scalzo, Dan Ruan, and Ke Sheng. Performance comparison of knowledge-based dose prediction techniques based on limited patient data. Tech. Can. Res. Treat., 17:1533033818811150, 2018. ISSN 1533-0346.
[65] Yann LeCun and Yoshua Bengio. Convolutional networks for images, speech, and time series.
Han. Br. Neur. Net., 3361(10):1995, 1995.
[66] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nat., 521(7553):436–444, 2015. ISSN 1476-4687.
[67] Yann A LeCun, Léon Bottou, Genevieve B Orr, and Klaus-Robert Müller. Efficient backprop, pages 9–48. Springer, 2012.
[68] Chen-Yu Lee, Patrick W Gallagher, and Zhuowen Tu. Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree. In Artificial intelligence and statistics, pages 464–472.
[69] Min Sun Lee, Donghwi Hwang, Joong Hyun Kim, and Jae Sung Lee. Deep-dose: a voxel dose estimation method using deep convolutional neural network for personalized internal dosimetry. Sci. Rep., 9(1):1–9, 2019. ISSN 2045-2322.
[70] Yang Lei, Xue Dong, Tonghe Wang, Kristin Higgins, Tian Liu, Walter J Curran, Hui Mao, Jonathon A Nye, and Xiaofeng Yang. Whole-body pet estimation from low count statistics using cycle-consistent generative adversarial networks. Phys. Med. Biol., 64(21):215017, 2019. ISSN 0031-9155.
[71] Yang Lei, Joseph Harms, Tonghe Wang, Yingzi Liu, Hui-Kuo Shu, Ashesh B Jani, Walter J Curran, Hui Mao, Tian Liu, and Xiaofeng Yang. Mri-only based synthetic ct generation using dense cycle consistent generative adversarial networks.
Med. Phys., 46(8):3565–3581, 2019. ISSN 0094-2405.
[72] Yang Lei, Joseph Harms, Tonghe Wang, Sibo Tian, Jun Zhou, Hui-Kuo Shu, Jim Zhong, Hui Mao, Walter J Curran, Tian Liu, and Xiaofeng Yang. Mri-based synthetic ct generation using semantic random forest with iterative refinement. Phys. Med. Biol., 64(8):085001, 2019. ISSN 0031-9155.
[73] Yang Lei, Xiangyang Tang, Kristin Higgins, Jolinta Lin, Jiwoong Jeong, Tian Liu, Anees Dhabaan, Tonghe Wang, Xue Dong, Robert Press, Walter J Curran, and Xiaofeng Yang. Learning-based cbct correction using alternating random forest based on auto-context model. Med. Phys., 46(2):601–618, 2019. ISSN 2473-4209.
[74] Yang Lei, Sibo Tian, Xiuxiu He, Tonghe Wang, Bo Wang, Pretesh Patel, Ashesh B Jani, Hui Mao, Walter J Curran, Tian Liu, and Xiaofeng Yang. Ultrasound prostate segmentation based on multidirectional deeply supervised v-net. Med. Phys., 46(7):3194–3206, 2019. ISSN 0094-2405.
[75] Nan Li, Ruben Carmona, Igor Sirak, Linda Kasaova, David Followill, Jeff Michalski, Walter Bosch, William Straube, Loren K Mell, and Kevin L Moore. Highly efficient training, refinement, and validation of a knowledge-based planning quality-control system for radiation therapy clinical trials. Int. J. Radiat. Oncol. Biol. Phys., 97(1):164–172, 2017. ISSN 0360-3016.
[76] Nan Li, Sonal S Noticewala, Casey W Williamson, Hanjie Shen, Igor Sirak, Rafal Tarnawski, Umesh Mahantshetty, Carl K Hoh, Kevin L Moore, and Loren K Mell. Feasibility of atlas-based active bone marrow sparing intensity modulated radiation therapy for cervical cancer.
Radiother. Oncol., 123(2):325–330, 2017. ISSN 0167-8140.
[77] Xinyi Li, Jiahan Zhang, Yang Sheng, Yushi Chang, Fang-Fang Yin, Yaorong Ge, Q Jackie Wu, and Chunhao Wang. Automatic imrt planning via static field fluence prediction (aip-sffp): a deep learning algorithm for real-time prostate treatment planning. Phys. Med. Biol., 2020. ISSN 1361-6560.
[78] Jun Lian, Lulin Yuan, Yaorong Ge, Bhishamjit S Chera, David P Yoo, Sha Chang, Fang-Fang Yin, and Q Jackie Wu. Modeling the dosimetry of organ-at-risk in head and neck imrt planning: an intertechnique and interinstitutional study. Med. Phys., 40(12):121704, 2013. ISSN 0094-2405.
[79] Jianfei Liu, Q Jackie Wu, John P Kirkpatrick, Fang-Fang Yin, Lulin Yuan, and Yaorong Ge. From active shape model to active optical flow model: a shape-based approach to predicting voxel-level dose distributions in spine sbrt. Phys. Med. Biol., 60(5):N83, 2015. ISSN 0031-9155.
[80] Yingzi Liu, Yang Lei, Tonghe Wang, Oluwatosin Kayode, Sibo Tian, Tian Liu, Pretesh Patel, Walter J Curran, Lei Ren, and Xiaofeng Yang. Mri-based treatment planning for liver stereotactic body radiotherapy: validation of a deep learning-based synthetic ct generation method. Br. J. Radiol., 92(1100):20190067, 2019. ISSN 0007-1285.
[81] Yingzi Liu, Yang Lei, Yinan Wang, Tonghe Wang, Lei Ren, Liyong Lin, Mark McDonald, Walter J Curran, Tian Liu, Jun Zhou, and Xiaofeng Yang. Mri-based treatment planning for proton radiotherapy: dosimetric validation of a deep learning-based liver synthetic ct generation method.
Phys. Med. Biol., 64(14):145015, 2019. ISSN 0031-9155.
[82] Zhiqiang Liu, Jiawei Fan, Minghui Li, Hui Yan, Zhihui Hu, Peng Huang, Yuan Tian, Junjie Miao, and Jianrong Dai. A deep learning method for prediction of three-dimensional dose distribution of helical tomotherapy. Med. Phys., 46(5):1972–1983, 2019. ISSN 0094-2405.
[83] Daniel A Low, William B Harms, Sasa Mutic, and James A Purdy. A technique for the quantitative evaluation of dose distributions. Med. Phys., 25(5):656–661, 1998. ISSN 0094-2405.
[84] Jianhui Ma, Ti Bai, Dan Nguyen, Michael Folkerts, Xun Jia, Weiguo Lu, Linghong Zhou, and Steve Jiang. Individualized 3d dose distribution prediction using deep learning. In Workshop on Artificial Intelligence in Radiation Therapy, pages 110–118. Springer.
[85] Rafid Mahmood, Aaron Babier, Andrea McNiven, Adam Diamant, and Timothy CY Chan. Automated treatment planning in radiation therapy using generative adversarial networks. arXiv:1807.06489v1, 2018.
[86] Joscha Maier, Yannick Berker, Stefan Sawall, and Marc Kachelrieß. Deep scatter estimation (dse): feasibility of using a deep convolutional neural network for real-time x-ray scatter prediction in cone-beam ct. In
Medical imaging 2018: physics of medical imaging, volume 10573, page 105731L. International Society for Optics and Photonics.
[87] Kathryn Masi, Paul Archer, William Jackson, Yilun Sun, Matthew Schipper, Daniel Hamstra, and Martha Matuszak. Knowledge-based treatment planning and its potential role in the transition between treatment planning systems. Med. Dosim., 43(3):251–257, 2018. ISSN 0958-3947.
[88] Chris McIntosh and Thomas G Purdie. Contextual atlas regression forests: multiple-atlas-based automated dose prediction in radiation therapy. IEEE Trans. Med. Imag., 35(4):1000–1012, 2015. ISSN 0278-0062.
[89] Chris McIntosh and Thomas G Purdie. Voxel-based dose prediction with multi-patient atlas selection for automated radiotherapy treatment planning. Phys. Med. Biol., 62(2):415, 2016. ISSN 0031-9155.
[90] Chris McIntosh, Mattea Welch, Andrea McNiven, David A Jaffray, and Thomas G Purdie. Fully automated treatment planning for head and neck radiotherapy using a voxel-based dose prediction and dose mimicking method. Phys. Med. Biol., 62(15):5926, 2017. ISSN 0031-9155.
[91] Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In , pages 565–571. IEEE. ISBN 1509054073.
[92] Cheryl H Millunchick, Heming Zhen, Gage Redler, Yixiang Liao, and Julius V Turian. A model for predicting the dose to the parotid glands based on their relative overlapping with planning target volumes during helical radiotherapy.
J. App. Clin. Med. Phys., 19(2):48–53, 2018. ISSN 1526-9914.
[93] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv:1411.1784v1, 2014.
[94] Nishikant Mishra, Sanja Petrovic, and Santhanam Sundar. A self-adaptive case-based reasoning system for dose planning in prostate cancer radiotherapy. Med. Phys., 38(12):6528–6538, 2011. ISSN 0094-2405.
[95] Diganta Misra. Mish: A self regularized non-monotonic neural activation function. arXiv:1908.08681v3, 2019.
[96] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, and Georg Ostrovski. Human-level control through deep reinforcement learning. Nat., 518(7540):529–533, 2015. ISSN 1476-4687.
[97] Kevin L Moore, R Scott Brame, Daniel A Low, and Sasa Mutic. Experience-based quality control of clinical intensity-modulated radiotherapy planning. Int. J. Radiat. Oncol. Biol. Phys., 81(2):545–551, 2011. ISSN 0360-3016.
[98] Yu Murakami, Taiki Magome, Kazuki Matsumoto, Tomoharu Sato, Yasuo Yoshioka, and Masahiko Oguchi. Fully automated dose prediction using generative adversarial networks in prostate cancer patients.
PLoS One, 15(5):e0232697, 2020. ISSN 1932-6203.
[99] Dan Nguyen, Troy Long, Xun Jia, Weiguo Lu, Xuejun Gu, Zohaib Iqbal, and Steve Jiang. A feasibility study for predicting optimal radiation therapy dose distributions of prostate cancer patients from patient anatomy using deep learning. Sci. Rep., 9:1–10, 2017.
[100] Dan Nguyen, Xun Jia, David Sher, Mu-Han Lin, Zohaib Iqbal, Hui Liu, and Steve Jiang. 3d radiotherapy dose prediction on head and neck cancer patients with a hierarchically densely connected u-net deep learning architecture. Phys. Med. Biol., 64(6):065020, 2019. ISSN 0031-9155.
[101] Dan Nguyen, Rafe McBeth, Azar Sadeghnejad Barkousaraie, Gyanendra Bohara, Chenyang Shen, Xun Jia, and Steve Jiang. Incorporating human and learned domain knowledge into training deep neural networks: A differentiable dose-volume histogram and adversarial inspired framework for generating pareto optimal dose distributions in radiation therapy. Med. Phys., 47(3):837–849, 2020. ISSN 0094-2405.
[102] Chigozie Nwankpa, Winifred Ijomah, Anthony Gachagan, and Stephen Marshall. Activation functions: Comparison of trends in practice and research for deep learning. arXiv:1811.03378v1, 2018.
[103] Obioma Nwankwo, Dwi Seno K Sihono, Frank Schneider, and Frederik Wenz. A global quality assurance system for personalized radiation therapy treatment planning for the prostate (or other sites).
Phys. Med. Biol., 59(18):5575, 2014. ISSN 0031-9155.
[104] Obioma Nwankwo, Hana Mekdash, Dwi Seno Kuncoro Sihono, Frederik Wenz, and Gerhard Glatting. Knowledge-based radiation therapy (kbrt) treatment planning versus planning by experts: validation of a kbrt algorithm for prostate cancer treatment planning. Radiat. Oncol., 10(1):111, 2015. ISSN 1748-717X.
[105] Matthew J Nyflot, Phawis Thammasorn, Landon S Wootton, Eric C Ford, and W A Chaovalitwongse. Deep learning for patient-specific quality assurance: Identifying errors in radiotherapy delivery by radiomic analysis of gamma images with convolutional neural networks. Med. Phys., 46(2):456–464, 2019. ISSN 0094-2405.
[106] John-Anthony Principal component analysis: data reduction and simplification. Mc. Schol. Res. Journ., 1(1):2, 2014.
[107] Steven F Petit, Binbin Wu, Michael Kazhdan, André Dekker, Patricio Simari, Rachit Kumar, Russell Taylor, Joseph M Herman, and Todd McNutt. Increased organ sparing using shape-based treatment plan optimization for intensity modulated radiation therapy of pancreatic adenocarcinoma.
Radiother. Oncol., 102(1):38–44, 2012. ISSN 0167-8140.
[108] Richard Powis, Andrew Bird, Matthew Brennan, Susan Hinks, Hannah Newman, Katie Reed, John Sage, and Gareth Webster. Clinical implementation of a knowledge based planning tool for prostate vmat. Radiat. Oncol., 12(1):1–8, 2017. ISSN 1748-717X.
[109] Prajit Ramachandran, Barret Zoph, and Quoc V Le. Searching for activation functions. arXiv:1710.05941v2, 2017.
[110] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer.
[111] Jo Schlemper, Ozan Oktay, Michiel Schaap, Mattias Heinrich, Bernhard Kainz, Ben Glocker, and Daniel Rueckert. Attention gated networks: Learning to leverage salient regions in medical images. Med. Imag. Anal., 53:197–207, 2019. ISSN 1361-8415.
[112] Matthew Schmidt, Joseph Y Lo, Shelby Grzetic, Carly Lutzky, David M Brizel, and Shiva K Das. Semiautomated head-and-neck imrt planning using dose warping and scaling to robustly adapt plans in a knowledge database containing potentially suboptimal plans.
Med. Phys., 42(8):4428–4434, 2015. ISSN 0094-2405.
[113] Eduard Schreibmann, Tim Fox, Walter Curran, Hui-Kuo Shu, and Ian Crocker. Automated population-based planning for whole brain radiation therapy. J. App. Clin. Med. Phys., 16(5):76–86, 2015. ISSN 1526-9914.
[114] Carolin Schubert, Oliver Waletzko, Christian Weiss, Dirk Voelzke, Sevda Toperim, Arnd Roeser, Silvia Puccini, Marc Piroth, Christian Mehrens, and Jan-Dirk Kueter. Intercenter validation of a knowledge based model for automated planning of volumetric modulated arc therapy for prostate cancer. the experience of the german rapidplan consortium. PLoS One, 12(5):e0178034, 2017. ISSN 1932-6203.
[115] Chenyang Shen, Dan Nguyen, Liyuan Chen, Yesenia Gonzalez, Rafe McBeth, Nan Qin, Steve B Jiang, and Xun Jia. Operating a treatment planning system using a deep-reinforcement learning-based virtual treatment planner for prostate cancer intensity-modulated radiation therapy treatment planning. Med. Phys., 2020. ISSN 0094-2405.
[116] Chenyang Shen, Dan Nguyen, Zhiguo Zhou, Steve B Jiang, Bin Dong, and Xun Jia. An introduction to deep learning in medical physics: advantages, potential, and challenges.
Phys. Med. Biol., 65(5):05TR01, 2020. ISSN 0031-9155.
[117] Yang Sheng, Taoran Li, You Zhang, W Robert Lee, Fang-Fang Yin, Yaorong Ge, and Q Jackie Wu. Atlas-guided prostate intensity modulated radiation therapy (imrt) planning. Phys. Med. Biol., 60(18):7277, 2015. ISSN 0031-9155.
[118] Yang Sheng, Yaorong Ge, Lulin Yuan, Taoran Li, Fang-Fang Yin, and Qingrong Jackie Wu. Outlier identification in radiation therapy knowledge-based planning: a study of pelvic cases. Med. Phys., 44(11):5617–5626, 2017. ISSN 0094-2405.
[119] Satomi Shiraishi and Kevin L Moore. Knowledge-based prediction of three-dimensional dose distributions for external beam radiotherapy. Med. Phys., 43(1):378–387, 2016. ISSN 0094-2405.
[120] Ying Song, Junjie Hu, Yang Liu, Haiyun Hu, Yang Huang, Sen Bai, and Zhang Yi. Dose prediction using a deep neural network for accelerated planning of rectal cancer radiotherapy.
Radiother. Oncol., 149:111–116, 2020. ISSN 0167-8140.
[121] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, 2014. ISSN 1532-4435.
[122] Iori Sumida, Taiki Magome, Indra J Das, Hajime Yamaguchi, Hisao Kizaki, Keiko Aboshi, Hiroko Yamaguchi, Yuji Seo, Fumiaki Isohashi, and Kazuhiko Ogawa. A convolution neural network for higher resolution dose prediction in prostate volumetric modulated arc therapy. Phys. Med., 72:88–95, 2020. ISSN 1120-1797.
[123] Andrei Svecic, David Roberge, and Samuel Kadoury. Prediction of inter-fractional radiotherapy dose plans with domain translation in spatiotemporal embeddings. Med. Imag. Anal., page 101728, 2020. ISSN 1361-8415.
[124] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9.
[125] M Allan Thomas, Yabo Fu, and Deshan Yang. Development and evaluation of machine learning models for voxel dose predictions in online adaptive magnetic resonance guided radiation therapy.
J. App. Clin. Med. Phys., 2020. ISSN 1526-9914.
[126] Jim P Tol, Max Dahele, Alexander R Delaney, Ben J Slotman, and Wilko FAR Verbakel. Can knowledge-based dvh predictions be used for automated, individualized quality assurance of radiotherapy treatment plans? Radiat. Oncol., 10(1):1–14, 2015. ISSN 1748-717X.
[127] Jim P Tol, Alexander R Delaney, Max Dahele, Ben J Slotman, and Wilko FAR Verbakel. Evaluation of a knowledge-based planning solution for head and neck cancer. Int. J. Radiat. Oncol. Biol. Phys., 91(3):612–620, 2015. ISSN 0360-3016.
[128] Angelia Tran, Kaley Woods, Dan Nguyen, Y Yu Victoria, Tianye Niu, Minsong Cao, Percy Lee, and Ke Sheng. Predicting liver sbrt eligibility and plan quality for vmat and 4π plans. Radiat. Oncol., 12(1):1–9, 2017. ISSN 1748-717X.
[129] Yoshihiro Ueda, Jun-ichi Fukunaga, Tatsuya Kamima, Yumiko Adachi, Kiyoshi Nakamatsu, and Hajime Monzen. Evaluation of multiple institutions models for knowledge-based planning of volumetric modulated arc therapy (vmat) for prostate cancer.
Radiat. Oncol., 13(1):46, 2018. ISSN 1748-717X.
[130] Yoshihiro Ueda, Masayoshi Miyazaki, Iori Sumida, Shingo Ohira, Mikoto Tamura, Hajime Monzen, Haruhi Tsuru, Shoki Inui, Masaru Isono, and Kazuhiko Ogawa. Knowledge-based planning for oesophageal cancers using a model trained with plans from a different treatment planning system. Acta. Oncol., 59(3):274–283, 2020. ISSN 0284-186X.
[131] Kearney Vasant, Jason W Chan, Wang Tianqi, Alan Perry, Descovich Martina, Olivier Morin, Sue S Yom, and Timothy D Solberg. Dosegan: a generative adversarial network for synthetic dose prediction using attention-gated discrimination and generation. Sci. Rep., 10(1), 2020. ISSN 2045-2322.
[132] Ganesh Venkatesh, Eriko Nurvitadhi, and Debbie Marr. Accelerating deep convolutional networks using low-precision and sparsity. In , pages 2861–2865. IEEE. ISBN 1509041176.
[133] VMS. Eclipse photon and electron 13.6 instruction of use. Varian Medical Systems, Inc. Palo Alto, CA, USA, 2015.
[134] Phillip DH Wall, Robert L Carver, and Jonas D Fontenot. An improved distance-to-dose correlation for predicting bladder and rectum dose-volumes in knowledge-based vmat planning for prostate cancer.
Phys. Med. Biol., 63(1):015035, 2018. ISSN 0031-9155.
[135] Bo Wang, Yang Lei, Sibo Tian, Tonghe Wang, Yingzi Liu, Pretesh Patel, Ashesh B Jani, Hui Mao, Walter J Curran, Tian Liu, and Xiaofeng Yang. Deeply supervised 3d fully convolutional networks with group dilated convolution for automatic mri prostate segmentation. Med. Phys., 46(4):1707–1718, 2019. ISSN 0094-2405.
[136] Jiazhou Wang, Xiance Jin, Kuaike Zhao, Jiayuan Peng, Jiang Xie, Junchao Chen, Zhen Zhang, Matthew Studenski, and Weigang Hu. Patient feature based dosimetric pareto front prediction in esophageal cancer radiotherapy. Med. Phys., 42(2):1005–1011, 2015. ISSN 0094-2405.
[137] Juanqi Wang, Weigang Hu, Zhaozhi Yang, Xiaohui Chen, Zhiqiang Wu, Xiaoli Yu, Xiaomao Guo, Saiquan Lu, Kaixuan Li, and Gongyi Yu. Is it possible for knowledge-based planning to improve intensity modulated radiation therapy plan quality for planners with different planning experiences in left-sided breast cancer patients? Radiat. Oncol., 12(1):85, 2017. ISSN 1748-717X.
[138] Meijiao Wang, Sha Li, Yuliang Huang, Haizhen Yue, Tian Li, Hao Wu, Song Gao, and Yibao Zhang. An interactive plan and model evolution method for knowledge-based pelvic vmat planning. J. App. Clin. Med. Phys., 19(5):491–498, 2018. ISSN 1526-9914.
[139] Tao Wang, David J Wu, Adam Coates, and Andrew Y Ng. End-to-end text recognition with convolutional neural networks. In
Proceedings of the 21st international conference on pattern recognition (ICPR2012), pages 3304–3308. IEEE. ISBN 4990644107.
[140] Tonghe Wang, Beth Bradshaw Ghavidel, Jonathan J Beitler, Xiangyang Tang, Yang Lei, Walter J Curran, Tian Liu, and Xiaofeng Yang. Optimal virtual monoenergetic image in twinbeam dual-energy ct for organs-at-risk delineation based on contrast-noise-ratio in head-and-neck radiotherapy. J. App. Clin. Med. Phys., 20(2):121–128, 2019. ISSN 1526-9914.
[141] Tonghe Wang, Yang Lei, Nivedh Manohar, Sibo Tian, Ashesh B Jani, Hui-Kuo Shu, Kristin Higgins, Anees Dhabaan, Pretesh Patel, and Xiangyang Tang. Dosimetric study on learning-based cone-beam ct correction in adaptive radiation therapy. Med. Dosim., 44(4):e71–e79, 2019. ISSN 0958-3947.
[142] Tonghe Wang, Yang Lei, Haipeng Tang, Zhuo He, Richard Castillo, Cheng Wang, Dianfu Li, Kristin Higgins, Tian Liu, Walter J Curran, Weihua Zhou, and Xiaofeng Yang. A learning-based automatic segmentation and quantification method on left ventricle in gated myocardial perfusion spect imaging: A feasibility study. J. Nucl. Cardiol., 3(3):1–12, 2019. ISSN 1532-6551.
[143] Tonghe Wang, Nivedh Manohar, Yang Lei, Anees Dhabaan, Hui-Kuo Shu, Tian Liu, Walter J Curran, and Xiaofeng Yang. Mri-based treatment planning for brain stereotactic radiosurgery: dosimetric validation of a learning-based pseudo-ct generation method.
Med. Dosim., 44(3):199–204, 2019. ISSN 0958-3947.
[144] Yibing Wang, Andras Zolnay, Luca Incrocci, Hans Joosten, Todd McNutt, Ben Heijmen, and Steven Petit. A quality control model that uses ptv-rectal distances to predict the lowest achievable rectum dose, improves imrt planning for patients with prostate cancer. Radiother. Oncol., 107(3):352–357, 2013. ISSN 0167-8140.
[145] Yibing Wang, Sebastiaan Breedveld, Ben Heijmen, and Steven F Petit. Evaluation of plan quality assurance models for prostate cancer patients based on fully automatically generated pareto-optimal treatment plans. Phys. Med. Biol., 61(11):4268, 2016. ISSN 0031-9155.
[146] Yibing Wang, Ben JM Heijmen, and Steven F Petit. Knowledge-based dose prediction models for head and neck cancer are strongly affected by interorgan dependency and dataset inconsistency. Med. Phys., 46(2):934–943, 2019. ISSN 0094-2405.
[147] Siri Willems, Wouter Crijns, Edmond Sterpin, Karin Haustermans, and Frederik Maes. Feasibility of ct-only 3d dose prediction for vmat prostate plans using deep learning. Artificial Intelligence in Radiation Therapy, pages 10–17. Springer International Publishing. ISBN 978-3-030-32486-5.
[148] Binbin Wu, Francesco Ricchetti, Giuseppe Sanguineti, Misha Kazhdan, Patricio Simari, Ming Chuang, Russell Taylor, Robert Jacques, and Todd McNutt. Patient geometry-driven information retrieval for imrt treatment plan quality control.
Med. Phys., 36(12):5497–5505, 2009. ISSN 0094-2405.
[149] Binbin Wu, Francesco Ricchetti, Giuseppe Sanguineti, Michael Kazhdan, Patricio Simari, Robert Jacques, Russell Taylor, and Todd McNutt. Data-driven approach to generating achievable dose-volume histogram objectives in intensity-modulated radiotherapy planning. Int. J. Radiat. Oncol. Biol. Phys., 79(4):1241–1247, 2011. ISSN 0360-3016.
[150] Binbin Wu, Todd McNutt, Marianna Zahurak, Patricio Simari, Dalong Pang, Russell Taylor, and Giuseppe Sanguineti. Fully automated simultaneous integrated boosted-intensity modulated radiation therapy treatment planning is feasible for head-and-neck cancer: a prospective clinical study. Int. J. Radiat. Oncol. Biol. Phys., 84(5):e647–e653, 2012. ISSN 0360-3016.
[151] Binbin Wu, Dalong Pang, Patricio Simari, Russell Taylor, Giuseppe Sanguineti, and Todd McNutt. Using overlap volume histogram and imrt plan data to guide and automate vmat planning: a head-and-neck case study. Med. Phys., 40(2):021714, 2013. ISSN 0094-2405.
[152] Binbin Wu, Dalong Pang, Siyuan Lei, John Gatti, Michael Tong, Todd McNutt, Thomas Kole, Anatoly Dritschilo, and Sean Collins. Improved robotic stereotactic body radiation therapy plan quality and planning efficacy for organ-confined prostate cancer utilizing overlap-volume histogram-driven planning methodology. Radiother. Oncol., 112(2):221–226, 2014. ISSN 0167-8140.
[153] Binbin Wu, Martijn Kusters, Martina Kunze-busch, Tim Dijkema, Todd McNutt, Giuseppe Sanguineti, Karl Bzdusek, Anatoly Dritschilo, and Dalong Pang. Cross-institutional knowledge-based planning (kbp) implementation and its performance comparison to auto-planning engine (ape).
Radiother. Oncol., 123(1):57–62, 2017. ISSN 0167-8140.
[154] Hao Wu, Fan Jiang, Haizhen Yue, Sha Li, and Yibao Zhang. A dosimetric evaluation of knowledge-based vmat planning with simultaneous integrated boosting for rectal cancer patients. J. App. Clin. Med. Phys., 17(6):78–85, 2016. ISSN 1526-9914.
[155] Hao Wu, Fan Jiang, Haizhen Yue, Hui Zhang, Kun Wang, and Yibao Zhang. Applying a rapidplan model trained on a technique and orientation to another: a feasibility and dosimetric evaluation. Radiat. Oncol., 11(1):108, 2016. ISSN 1748-717X.
[156] Yixun Xing, Dan Nguyen, Weiguo Lu, Ming Yang, and Steve Jiang. A feasibility study on deep learning-based radiotherapy dose calculation. Med. Phys., 47(2):753–758, 2020. ISSN 0094-2405.
[157] Bing Xu, Naiyan Wang, Tianqi Chen, and Mu Li. Empirical evaluation of rectified activations in convolutional network. arXiv:1505.00853v2, 2015.
[158] Yidong Yang, Eric C Ford, Binbin Wu, Michael Pinkawa, Baukelien Van Triest, Patrick Campbell, Danny Y Song, and Todd R McNutt. An overlap-volume-histogram based method for rectal dose prediction and automated treatment planning in the external beam prostate radiotherapy following hydrogel injection.
Med. Phys., 40(1):011709, 2013. ISSN 0094-2405.
[159] Kelly C Younge, Robin B Marsh, Dawn Owen, Huaizhi Geng, Ying Xiao, Daniel E Spratt, Joseph Foy, Krithika Suresh, Q Jackie Wu, and Fang-Fang Yin. Improving quality and consistency in nrg oncology radiation therapy oncology group 0631 for spine radiosurgery via knowledge-based planning. Int. J. Radiat. Oncol. Biol. Phys., 100(4):1067–1074, 2018. ISSN 0360-3016.
[160] Gang Yu, Yang Li, Ziwei Feng, Cheng Tao, Zuyi Yu, Baosheng Li, and Dengwang Li. Knowledge-based imrt planning for individual liver cancer patients using a novel specific model. Radiat. Oncol., 13(1):52, 2018. ISSN 1748-717X.
[161] Lulin Yuan, Yaorong Ge, W Robert Lee, Fang-Fang Yin, John P Kirkpatrick, and Q Jackie Wu. Quantitative analysis of the factors which affect the interpatient organ-at-risk dose sparing variation in imrt plans. Med. Phys., 39(11):6868–6878, 2012. ISSN 0094-2405.
[162] Lulin Yuan, Q Jackie Wu, Fang-Fang Yin, Yuliang Jiang, David Yoo, and Yaorong Ge. Incorporating single-side sparing in models for predicting parotid dose sparing in head and neck imrt. Med. Phys., 41(2):021728, 2014. ISSN 0094-2405.
[163] Masoud Zarepisheh, Troy Long, Nan Li, Zhen Tian, H Edwin Romeijn, Xun Jia, and Steve B Jiang. A dvh-guided imrt optimization algorithm for automatic treatment planning and adaptive radiotherapy replanning.
Med. Phys., 41(6Part1):061711, 2014. ISSN 0094-2405.
[164] Hao H Zhang, Robert R Meyer, Leyuan Shi, and Warren D D’Souza. The minimum knowledge base for predicting organ-at-risk dose-volume levels and plan-related complications in imrt planning. Phys. Med. Biol., 55(7):1935, 2010. ISSN 0031-9155.
[165] Jiahan Zhang, Q Jackie Wu, Tianyi Xie, Yang Sheng, Fang-Fang Yin, and Yaorong Ge. An ensemble approach to knowledge-based intensity-modulated radiation therapy planning. Front. Oncol., 8:57, 2018. ISSN 2234-943X.
[166] Jiahan Zhang, Yaorong Ge, Yang Sheng, Fang-Fang Yin, and Q Jackie Wu. Modeling of multiple planning target volumes for head and neck treatments in knowledge-based treatment planning. Med. Phys., 46(9):3812–3822, 2019. ISSN 0094-2405.
[167] Jiahan Zhang, Yaorong Ge, Yang Sheng, Chunhao Wang, Jiang Zhang, Yuan Wu, Qiuwen Wu, Fang-Fang Yin, and Q Jackie Wu. Knowledge-based tradeoff hyperplanes for head and neck treatment planning. Int. J. Radiat. Oncol. Biol. Phys., 2020. ISSN 0360-3016.
[168] Jieping Zhou, Zhao Peng, Yuchen Song, Yankui Chang, Xi Pei, Liusi Sheng, and X George Xu. A method of using deep learning to predict three-dimensional dose distributions for intensity-modulated radiotherapy of rectal cancer.
J. App. Clin. Med. Phys., 2020. ISSN 1526-9914.
[169] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pages 2223–2232.
[170] Xiaofeng Zhu, Yaorong Ge, Taoran Li, Danthai Thongphiew, Fang-Fang Yin, and Q Jackie Wu. A planning quality evaluation tool for prostate adaptive imrt based on machine learning. Med. Phys., 38(2):719–726, 2011. ISSN 0094-2405.
[171] Benjamin P Ziemer, Parag Sanghvi, Jona Hattangadi-Gluth, and Kevin L Moore. Heuristic knowledge-based planning for single-isocenter stereotactic radiosurgery to multiple brain metastases.