International journal of radiation oncology, biology, physics | 2021

Head and Neck Oropharyngeal GTV Autosegmentation: Combining nnU-Net With Shape Representation Loss Driven by a Variational Autoencoder Model.

Abstract


PURPOSE/OBJECTIVE(S)
Autosegmentation models for tumors are becoming increasingly important across several domains. Gross tumor volume (GTV) autosegmentation can facilitate faster radiation treatment planning, standardize delineation for radiomics, and provide contour consistency across centers. Oropharyngeal squamous cell carcinomas (SCCs) of the head and neck are challenging tumors to segment given the underlying anatomy and imaging characteristics. We build upon nnU-Net, a medical image segmentation framework that uses a set of heuristics based on the training data (modality, size, etc.) and available resources (GPU RAM, etc.) to optimally select preprocessing steps, model parameters, and model ensembling techniques. Previous work has shown that a latent-space shape representation of head and neck organs at risk (obtained from a stacked autoencoder), when used in the loss function, improves OAR segmentation accuracy. Here, we apply nnU-Net to autosegmentation of oropharyngeal SCC GTVs and adapt nnU-Net to use a shape representation loss function to further constrain the GTV segmentations.

MATERIALS/METHODS
The 2020 HEad and neCK TumOR (HECKTOR) Challenge at the MICCAI 2020 conference provided a dataset of 201 patients with oropharyngeal cancer with pre-radiation treatment CT and PET images from 4 medical centers. We applied autosegmentation to this HECKTOR dataset and tested on an independent dataset of 53 patients from a fifth, separate institution. We trained a model with the original nnU-Net framework on the HECKTOR CT and PET training data. To generate a shape representation model, we built a variational autoencoder that takes the cropped and centered ground truth GTV binary masks as input. Hyperparameter optimization yielded an optimal latent space size of 50, a network depth of 5 convolutional layers, and a KL loss weight of 0.001. To constrain the output of nnU-Net, we computed the L1 loss in the latent space between the ground truth and the autosegmentation, yielding a shape representation model (SRM) loss function.

RESULTS
The latent space of the variational autoencoder effectively encodes shape information of the GTVs, including tumor size and T stage. The original nnU-Net model achieved a Dice score of 74.7% on the test set, placing 3rd in the post-submission HECKTOR challenge. Our SRM model constrained the segmentation and achieved a Dice score of 75.9%, placing first in the post-submission HECKTOR challenge as of February 2021.

CONCLUSION
These results show that nnU-Net can be used to generate a well-performing autosegmentation model for head and neck GTV segmentation from multimodal input. We also demonstrate that the PyTorch framework can be readily altered and optimized for particular tasks. Our shape representation loss improved performance and achieved the top segmentation result in the HECKTOR challenge. Further optimizations of this framework are actively being explored.
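The following is a minimal PyTorch sketch of the shape-constraint idea summarized above; it is illustrative, not the authors' implementation. The latent size (50), depth (5 convolutional layers), and KL loss weight (0.001) are taken from the abstract, while the mask crop size (64³ voxels), channel widths, and all class/function names are assumptions.

```python
# Illustrative sketch (not the authors' code): a 3D convolutional VAE encodes
# cropped, centered GTV binary masks into a 50-dim latent space; the SRM loss
# is the L1 distance between the latent codes of the ground-truth mask and the
# network's foreground probability map. Input size 64^3 and channel widths are
# assumed for concreteness.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShapeVAE(nn.Module):
    def __init__(self, latent_dim=50, depth=5, base_channels=16):
        super().__init__()
        enc, ch_in = [], 1
        for d in range(depth):                      # 5 strided convs: 64^3 -> 2^3
            ch_out = base_channels * 2 ** d
            enc += [nn.Conv3d(ch_in, ch_out, 3, stride=2, padding=1),
                    nn.ReLU(inplace=True)]
            ch_in = ch_out
        self.encoder = nn.Sequential(*enc)
        feat = ch_in * 2 ** 3                       # flattened features at 2x2x2
        self.fc_mu = nn.Linear(feat, latent_dim)
        self.fc_logvar = nn.Linear(feat, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, feat)
        dec = []
        for d in reversed(range(depth)):            # mirror encoder with transposed convs
            ch_out = 1 if d == 0 else base_channels * 2 ** (d - 1)
            dec += [nn.ConvTranspose3d(ch_in, ch_out, 4, stride=2, padding=1)]
            if d != 0:
                dec += [nn.ReLU(inplace=True)]
            ch_in = ch_out
        self.decoder = nn.Sequential(*dec)

    def encode(self, x):
        h = self.encoder(x).flatten(1)
        return self.fc_mu(h), self.fc_logvar(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        h = self.fc_dec(z).view(x.size(0), -1, 2, 2, 2)
        return torch.sigmoid(self.decoder(h)), mu, logvar


def vae_loss(recon, target, mu, logvar, kl_weight=1e-3):
    """Binary-mask reconstruction + KL divergence (KL weight 0.001 per the abstract)."""
    bce = F.binary_cross_entropy(recon, target, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kl_weight * kl


def srm_loss(vae, pred_foreground, gt_mask):
    """Shape-representation loss: L1 distance between latent codes.

    The VAE is assumed to be pretrained on ground-truth masks and frozen;
    `pred_foreground` is the single-channel foreground probability from the
    segmentation network, cropped/centered to the VAE input size (assumption).
    Gradients flow only through the prediction branch.
    """
    with torch.no_grad():
        mu_gt, _ = vae.encode(gt_mask)
    mu_pred, _ = vae.encode(pred_foreground)
    return F.l1_loss(mu_pred, mu_gt)
```

In training, a term such as `srm_loss(...)` would be added to nnU-Net's standard Dice plus cross-entropy loss with some weighting; the exact weighting used by the authors is not stated in the abstract.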

Volume 111 3S
Pages e398
DOI 10.1016/j.ijrobp.2021.07.1154
Language English
Journal International journal of radiation oncology, biology, physics
