Training deep-learning segmentation models from severely limited data

Yao Zhao; Dong Joo Rhee; Carlos Cardenas; Laurence E. Court; Jinzhong Yang

doi:10.1002/mp.14728

Radiation Physics

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Purpose: To enable generation of high-quality deep learning segmentation models from severely limited contoured cases (e.g., ~10 cases).

Methods: Thirty head and neck computed tomography (CT) scans with well-defined contours were deformably registered to 200 CT scans of the same anatomic site without contours. The resulting deformation vector fields were used to train a principal component analysis (PCA) model for each of the 30 contoured CT scans by capturing the mean deformation and the most prominent variations. Each PCA model can produce an unlimited number of synthetic CT scans and corresponding contours by applying random deformations. We used 300, 600, 1000, and 2000 synthetic CT scans and contours generated from one PCA model to train V-Net, a 3D convolutional neural network architecture, to segment the parotid and submandibular glands. We repeated the training using the same numbers of training cases generated from 7, 10, 20, and 30 PCA models, with the data distributed evenly among the PCA models. Performance of the segmentation models was evaluated with Dice similarity coefficients between auto-generated contours and physician-drawn contours on 162 test CT scans for the parotid glands and another 21 test CT scans for the submandibular glands.

Results: Dice values varied with the number of synthetic CT scans and the number of PCA models used to train the network. Using 2000 synthetic CT scans generated from 10 PCA models, we achieved Dice values of 82.8% ± 6.8% for the right parotid, 82.0% ± 6.9% for the left parotid, and 74.2% ± 6.8% for the submandibular glands. These results are comparable with those obtained from state-of-the-art auto-contouring approaches, including a deep learning network trained on more than 1000 contoured patients and a multi-atlas algorithm using 12 well-contoured atlases. Improvement was marginal when >10 PCA models or >2000 synthetic CT scans were used.

Conclusions: We demonstrated an effective data augmentation approach to train high-quality deep learning segmentation models from a limited number of well-contoured patient cases.
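
The two computational ingredients named in the abstract, a PCA model of deformation vector fields and the Dice similarity coefficient, can be illustrated with a small sketch. This is not the authors' implementation: the function names (fit_pca, sample_dvf, dice), the toy 4 x 4 x 4 grid, the choice of 5 retained modes, and the use of standard-normal mode coefficients are all assumptions made for illustration, since the abstract does not specify them. Warping a CT scan and its contours with the sampled field is omitted.

```python
import numpy as np


def fit_pca(dvfs, n_modes):
    """Fit PCA to flattened deformation vector fields (one field per row).

    Returns the mean field, the leading principal modes (rows), and the
    standard deviation of the data along each retained mode.
    """
    mean = dvfs.mean(axis=0)
    centered = dvfs - mean
    # SVD of the centered data; rows of vt are orthonormal principal modes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    std = s[:n_modes] / np.sqrt(dvfs.shape[0] - 1)
    return mean, vt[:n_modes], std


def sample_dvf(mean, modes, std, rng):
    """Draw one random deformation: mean + sum_k c_k * std_k * mode_k.

    Assumption: coefficients c_k are standard normal; the abstract only
    says that random deformations are applied.
    """
    coeffs = rng.standard_normal(std.shape[0])
    return mean + (coeffs * std) @ modes


def dice(a, b):
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) of two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / denom


rng = np.random.default_rng(0)
# Toy stand-in for 200 registrations: each field has 3 components per voxel.
dvfs = rng.normal(size=(200, 4 * 4 * 4 * 3))
mean, modes, std = fit_pca(dvfs, n_modes=5)
synthetic = sample_dvf(mean, modes, std, rng).reshape(4, 4, 4, 3)
```

In this sketch each call to sample_dvf yields a new displacement field drawn from the same PCA model, which is the sense in which one model can supply arbitrarily many synthetic cases; dice would then compare a predicted mask with a physician-drawn mask on a test scan.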

Original language	English (US)
Pages (from-to)	1697-1706
Number of pages	10
Journal	Medical Physics
Volume	48
Issue number	4
DOIs	https://doi.org/10.1002/mp.14728
State	Published - Apr 2021

Keywords

Auto-segmentation
convolutional neural networks
data augmentation
deep learning
principal component analysis

ASJC Scopus subject areas

Biophysics
Radiology, Nuclear Medicine and Imaging

MD Anderson CCSG core facilities

Clinical Trials Office

Cite this

@article{d55ca7ccf4684e99a62e31d56f3b5fd7,

title = "Training deep-learning segmentation models from severely limited data",

abstract = "Purpose: To enable generation of high-quality deep learning segmentation models from severely limited contoured cases (e.g., ~10 cases). Methods: Thirty head and neck computed tomography (CT) scans with well-defined contours were deformably registered to 200 CT scans of the same anatomic site without contours. Acquired deformation vector fields were used to train a principal component analysis (PCA) model for each of the 30 contoured CT scans by capturing the mean deformation and most prominent variations. Each PCA model can produce an infinite number of synthetic CT scans and corresponding contours by applying random deformations. We used 300, 600, 1000, and 2000 synthetic CT scans and contours generated from one PCA model to train V-Net, a 3D convolutional neural network architecture, to segment parotid and submandibular glands. We repeated the training using same numbers of training cases generated from 7, 10, 20, and 30 PCA models, with the data distributed evenly between each PCA model. Performance of the segmentation models was evaluated with Dice similarity coefficients between auto-generated contours and physician-drawn contours on 162 test CT scans for parotid glands and another 21 test CT scans for submandibular glands. Results: Dice values varied with the number of synthetic CT scans and the number of PCA models used to train the network. By using 2000 synthetic CT scans generated from 10 PCA models, we achieved Dice values of 82.8% ± 6.8% for right parotid, 82.0% ± 6.9% for left parotid, and 74.2% ± 6.8% for submandibular glands. These results are comparable with those obtained from state-of-the-art auto-contouring approaches, including a deep learning network trained from more than 1000 contoured patients and a multi-atlas algorithm from 12 well-contoured atlases. Improvement was marginal when >10 PCA models or >2000 synthetic CT scans were used. 
Conclusions: We demonstrated an effective data augmentation approach to train high-quality deep learning segmentation models from a limited number of well-contoured patient cases.",

keywords = "Auto-segmentation, convolutional neural networks, data augmentation, deep learning, principal component analysis",

author = "Yao Zhao and Rhee, {Dong Joo} and Carlos Cardenas and Court, {Laurence E.} and Jinzhong Yang",

note = "Funding Information: Funded in part by Cancer Center Support (Core) Grant P30 CA016672 from the National Cancer Institute, National Institutes of Health, to The University of Texas MD Anderson Cancer Center. Publisher Copyright: {\textcopyright} 2021 American Association of Physicists in Medicine",

year = "2021",

month = apr,

doi = "10.1002/mp.14728",

language = "English (US)",

volume = "48",

pages = "1697--1706",

journal = "Medical Physics",

issn = "0094-2405",

publisher = "AAPM - American Association of Physicists in Medicine",

number = "4",

}

TY - JOUR

T1 - Training deep-learning segmentation models from severely limited data

AU - Zhao, Yao

AU - Rhee, Dong Joo

AU - Cardenas, Carlos

AU - Court, Laurence E.

AU - Yang, Jinzhong

N1 - Funding Information: Funded in part by Cancer Center Support (Core) Grant P30 CA016672 from the National Cancer Institute, National Institutes of Health, to The University of Texas MD Anderson Cancer Center. Publisher Copyright: © 2021 American Association of Physicists in Medicine

PY - 2021/4

Y1 - 2021/4

AB - Purpose: To enable generation of high-quality deep learning segmentation models from severely limited contoured cases (e.g., ~10 cases). Methods: Thirty head and neck computed tomography (CT) scans with well-defined contours were deformably registered to 200 CT scans of the same anatomic site without contours. Acquired deformation vector fields were used to train a principal component analysis (PCA) model for each of the 30 contoured CT scans by capturing the mean deformation and most prominent variations. Each PCA model can produce an infinite number of synthetic CT scans and corresponding contours by applying random deformations. We used 300, 600, 1000, and 2000 synthetic CT scans and contours generated from one PCA model to train V-Net, a 3D convolutional neural network architecture, to segment parotid and submandibular glands. We repeated the training using same numbers of training cases generated from 7, 10, 20, and 30 PCA models, with the data distributed evenly between each PCA model. Performance of the segmentation models was evaluated with Dice similarity coefficients between auto-generated contours and physician-drawn contours on 162 test CT scans for parotid glands and another 21 test CT scans for submandibular glands. Results: Dice values varied with the number of synthetic CT scans and the number of PCA models used to train the network. By using 2000 synthetic CT scans generated from 10 PCA models, we achieved Dice values of 82.8% ± 6.8% for right parotid, 82.0% ± 6.9% for left parotid, and 74.2% ± 6.8% for submandibular glands. These results are comparable with those obtained from state-of-the-art auto-contouring approaches, including a deep learning network trained from more than 1000 contoured patients and a multi-atlas algorithm from 12 well-contoured atlases. Improvement was marginal when >10 PCA models or >2000 synthetic CT scans were used. 
Conclusions: We demonstrated an effective data augmentation approach to train high-quality deep learning segmentation models from a limited number of well-contoured patient cases.

KW - Auto-segmentation

KW - convolutional neural networks

KW - data augmentation

KW - deep learning

KW - principal component analysis

UR - http://www.scopus.com/inward/record.url?scp=85100576110&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85100576110&partnerID=8YFLogxK

U2 - 10.1002/mp.14728

DO - 10.1002/mp.14728

M3 - Article

C2 - 33474727

AN - SCOPUS:85100576110

SN - 0094-2405

VL - 48

SP - 1697

EP - 1706

JO - Medical Physics

JF - Medical Physics

IS - 4

ER -
