Medical physics | 2021

Robustness of deep learning segmentation of cardiac substructures in non-contrast computed tomography for breast cancer radiotherapy.


Abstract


PURPOSE
To develop and evaluate deep-learning-based auto-segmentation of cardiac substructures from non-contrast planning computed tomography (CT) images in patients undergoing breast cancer radiotherapy, and to investigate the algorithm's sensitivity to out-of-distribution data such as CT image artifacts.

METHODS
Nine substructures, namely the Aortic Valve (AV), Left Anterior Descending artery (LAD), Tricuspid Valve (TV), Mitral Valve (MV), Pulmonic Valve (PV), Right Atrium (RA), Right Ventricle (RV), Left Atrium (LA) and Left Ventricle (LV), were manually delineated by a radiation oncologist on non-contrast CT images of 129 patients with breast cancer. Of these, 90 were considered in-distribution ("clean") data: the image/label pairs of 60 subjects were used to train a 3D deep neural network, and the remaining 30 were used for testing. The other 39 patients were considered out-of-distribution ("outlier") data and were used to test robustness. Random rigid transformations were used to augment the dataset during training. We investigated the effect of multiple loss functions (Dice Similarity Coefficient (DSC), cross-entropy (CE), Euclidean loss, and variations and combinations of these), data augmentation, and network size on overall performance and on sensitivity to image artifacts caused by infrequent events such as the presence of implanted devices. The predicted label maps were compared to the ground truth labels via DSC and the mean and 90th percentile symmetric surface distance (90th-SSD).

RESULTS
With modified Dice combined with cross-entropy (MD-CE) as the loss function, the algorithm achieved a mean DSC of 0.79±0.07 for chambers and 0.39±0.10 for smaller substructures (valves and LAD). The mean SSD and 90th-SSD were 2.7±1.4 mm and 6.5±2.8 mm for chambers, and 4.1±1.7 mm and 8.6±3.2 mm for smaller substructures. Models trained with MD-CE, Dice-CE, MD, and weighted CE losses had the highest performance and were statistically similar. Data augmentation did not affect model performance on either clean or outlier data, whereas model robustness was sensitive to network size. For a certain type of outlier data, robustness can be improved by incorporating such data into the training process. The execution time for segmenting each patient was 2.1 s on average.

CONCLUSIONS
A deep neural network provides fast and accurate segmentation of large cardiac substructures in non-contrast CT images. Model robustness to two types of clinically common outlier data was investigated, and potential approaches to improve it were explored. Evaluation of clinical acceptability and integration into the clinical workflow are pending.
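
The two evaluation metrics named in the abstract, DSC and the mean and 90th percentile symmetric surface distance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact surface definition or distance computation, so the 6-neighbour surface extraction, the brute-force nearest-surface search, and the function names and `spacing` parameter (voxel size in mm) below are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """DSC = 2|A & B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def surface_voxels(mask):
    """Indices of foreground voxels that touch background along any axis."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if both axis neighbours are foreground.
            interior &= np.roll(padded, shift, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return np.argwhere(mask & ~interior[core])


def symmetric_surface_distances(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Distances (in mm) from each surface voxel to the other mask's surface,
    pooled over both directions. Brute force, so only suitable for small masks."""
    sp = np.asarray(spacing, dtype=float)
    a = surface_voxels(pred) * sp
    b = surface_voxels(truth) * sp
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain foreground voxels")
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.concatenate([d.min(axis=1), d.min(axis=0)])


# Toy example: two 4x4x4 cubes offset by one voxel along the first axis.
pred = np.zeros((10, 10, 10), dtype=bool)
truth = np.zeros((10, 10, 10), dtype=bool)
pred[2:6, 2:6, 2:6] = True
truth[3:7, 2:6, 2:6] = True

dsc = dice_coefficient(pred, truth)
ssd = symmetric_surface_distances(pred, truth)
mean_ssd = ssd.mean()
p90_ssd = np.percentile(ssd, 90)
print(dsc, mean_ssd, p90_ssd)
```

For full-resolution CT label maps, a distance-transform-based search would be a more practical choice than the pairwise distance matrix used here.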

DOI 10.1002/mp.15237
Language English
Journal Medical physics

Full Text