Training deep-learning segmentation models from severely limited data

Yao Zhao, Dong Joo Rhee, Carlos Cardenas, Laurence E. Court, Jinzhong Yang

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Purpose: To enable generation of high-quality deep learning segmentation models from severely limited contoured cases (e.g., ~10 cases).
Methods: Thirty head and neck computed tomography (CT) scans with well-defined contours were deformably registered to 200 CT scans of the same anatomic site without contours. The resulting deformation vector fields were used to train a principal component analysis (PCA) model for each of the 30 contoured CT scans by capturing the mean deformation and the most prominent variations. Each PCA model can produce an infinite number of synthetic CT scans and corresponding contours by applying random deformations. We used 300, 600, 1000, and 2000 synthetic CT scans and contours generated from one PCA model to train V-Net, a 3D convolutional neural network architecture, to segment the parotid and submandibular glands. We repeated the training using the same numbers of training cases generated from 7, 10, 20, and 30 PCA models, with the data distributed evenly among the PCA models. Performance of the segmentation models was evaluated with Dice similarity coefficients between auto-generated contours and physician-drawn contours on 162 test CT scans for the parotid glands and another 21 test CT scans for the submandibular glands.
Results: Dice values varied with the number of synthetic CT scans and the number of PCA models used to train the network. Using 2000 synthetic CT scans generated from 10 PCA models, we achieved Dice values of 82.8% ± 6.8% for the right parotid, 82.0% ± 6.9% for the left parotid, and 74.2% ± 6.8% for the submandibular glands. These results are comparable with those obtained from state-of-the-art auto-contouring approaches, including a deep learning network trained on more than 1000 contoured patients and a multi-atlas algorithm built from 12 well-contoured atlases. Improvement was marginal when >10 PCA models or >2000 synthetic CT scans were used.
Conclusions: We demonstrated an effective data augmentation approach to train high-quality deep learning segmentation models from a limited number of well-contoured patient cases.
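
The augmentation and evaluation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the deformation vector fields (DVFs) for one contoured scan are stacked in a single array, that PCA is computed with a plain SVD, and that random deformations are drawn by sampling each mode's coefficient from a Gaussian scaled by that mode's standard deviation (the abstract only says "random deformations"). The function names `fit_pca_deformation_model`, `sample_deformation`, and `dice` are hypothetical, and warping the CT scan and contours with the sampled field is left out.

```python
import numpy as np


def fit_pca_deformation_model(dvfs, n_components):
    """Fit a PCA model to a stack of DVFs for one contoured scan.

    dvfs: array of shape (n_fields, ...), e.g. (200, Z, Y, X, 3).
    Returns the mean field (flattened), the leading principal
    directions (one per row), and the standard deviation of each mode.
    """
    n_fields = dvfs.shape[0]
    flat = dvfs.reshape(n_fields, -1).astype(np.float64)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are orthonormal principal directions of the centered data.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, len(singular_values))
    stddevs = singular_values[:k] / np.sqrt(max(n_fields - 1, 1))
    return mean, vt[:k], stddevs


def sample_deformation(mean, components, stddevs, field_shape, rng):
    """Draw one synthetic DVF: mean plus a random mix of the PCA modes.

    Assumption: coefficients are Gaussian with each mode's standard
    deviation; the source does not state the sampling distribution.
    """
    coeffs = rng.standard_normal(len(stddevs)) * stddevs
    flat = mean + coeffs @ components
    return flat.reshape(field_shape)


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return 2.0 * np.logical_and(a, b).sum() / total
```

In this sketch, each of the contoured scans would get its own `(mean, components, stddevs)` triple, and calling `sample_deformation` repeatedly with a `numpy.random.default_rng()` generator yields as many synthetic fields as needed for training.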

Original language: English (US)
Pages (from-to): 1697-1706
Number of pages: 10
Journal: Medical physics
Volume: 48
Issue number: 4
State: Published - Apr 2021

Keywords

  • Auto-segmentation
  • convolutional neural networks
  • data augmentation
  • deep learning
  • principal component analysis

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

MD Anderson CCSG core facilities

  • Clinical Trials Office
