TY - JOUR
T1 - Evaluation of a multiview architecture for automatic vertebral labeling of palliative radiotherapy simulation CT images
AU - Netherton, Tucker J.
AU - Rhee, Dong Joo
AU - Cardenas, Carlos E.
AU - Chung, Caroline
AU - Klopp, Ann H.
AU - Peterson, Christine B.
AU - Howell, Rebecca M.
AU - Balter, Peter A.
AU - Court, Laurence E.
N1 - Funding Information:
Funding was received in part from Varian Medical Systems and the National Cancer Institute. Multiple authors of this publication are members of the Radiation Planning Assistant team at The University of Texas MD Anderson Cancer Center.
Funding Information:
We would like to thank Mona Amirmazaheri, Raphael Douglas, and Casey Gay for their pivotal data curation contributions. We would also like to thank Don Norwood from the Department of Scientific Publications at The University of Texas MD Anderson Cancer Center, and Anjany Sekuboyina from the Technische Universität München for sharing his experience with the Btrfly architecture.
Publisher Copyright:
© 2020 The Authors. Medical Physics published by Wiley Periodicals LLC on behalf of American Association of Physicists in Medicine.
PY - 2020/11
Y1 - 2020/11
N2 - Purpose: The purpose of this work was to evaluate the performance of X-Net, a multiview deep learning architecture, to automatically label vertebral levels (S2-C1) in palliative radiotherapy simulation CT scans. Methods: For each patient CT scan, our automated approach 1) segmented the spinal canal using a convolutional neural network (CNN), 2) formed sagittal and coronal intensity projection pairs, 3) labeled vertebral levels with X-Net, and 4) detected irregular intervertebral spacing using an analytic methodology. The spinal canal CNN was trained via fivefold cross validation using 1,966 simulation CT scans and evaluated on 330 CT scans. After labeling vertebral levels (S2-C1) in 897 palliative radiotherapy simulation CT scans, a volume of interest surrounding the spinal canal in each patient's CT scan was converted into sagittal and coronal intensity projection image pairs. Then, intensity projection image pairs were augmented and used to train X-Net to automatically label vertebral levels using fivefold cross validation (n = 803). Prior to testing on the final test set (n = 94), CT scans of patients with anatomical abnormalities, surgical implants, or other atypical features were placed in an outlier group (n = 20), whereas those without these features were placed in a normative group (n = 74). The performance of X-Net, X-Net Ensemble, and another leading vertebral labeling architecture (Btrfly Net) was evaluated on both groups using identification rate, localization error, and other metrics. The performance of our approach was also evaluated on the MICCAI 2014 test dataset (n = 60). Finally, a method to detect irregular intervertebral spacing was created based on the rate of change in spacing between predicted vertebral body locations and was also evaluated using the final test set. Receiver operating characteristic analysis was used to investigate the performance of the method to detect irregular intervertebral spacing. 
Results: The spinal canal architecture yielded centroid coordinates spanning S2-C1 with submillimeter accuracy (mean ± standard deviation, 0.399 ± 0.299 mm; n = 330 patients), and localization of the spinal canal centroid was robust to surgical implants and widespread metastases. Cross-validation testing of X-Net for vertebral labeling revealed that the deep learning model performance (F1 score, precision, and sensitivity) improved with CT scan length. The X-Net, X-Net Ensemble, and Btrfly Net mean identification rates and localization errors were 92.4% and 2.3 mm, 94.2% and 2.2 mm, and 90.5% and 3.4 mm, respectively, in the final test set and 96.7% and 2.2 mm, 96.9% and 2.0 mm, and 94.8% and 3.3 mm, respectively, within the normative group of the final test set. The X-Net Ensemble yielded the highest percentage of patients (94%) having all vertebral bodies identified correctly in the final test set when the three most inferior and superior vertebral bodies were excluded from the CT scan. The method used to detect labeling failures had 67% sensitivity and 95% specificity when combined with the X-Net Ensemble and flagged five of six patients with atypical vertebral counts (an additional thoracic vertebra (T13), an additional lumbar vertebra (L6), or only four lumbar vertebrae). Transfer learning increased the mean identification rate of the X-Net Ensemble on the MICCAI 2014 dataset from 86.8% to 91.3%, yielding state-of-the-art results for various regions of the spine. Conclusions: We trained X-Net, our unique convolutional neural network, to automatically label vertebral levels from S2 to C1 on palliative radiotherapy CT images and found that an ensemble of X-Net models had a high vertebral body identification rate (94.2%) and small localization errors (2.2 ± 1.8 mm). In addition, our transfer learning approach achieved state-of-the-art results on a well-known benchmark dataset with a high identification rate (91.3%) and low localization error (3.3 ± 2.7 mm). 
When we pre-screened radiotherapy CT images for the presence of hardware, surgical implants, or other anatomic abnormalities prior to the use of X-Net, it labeled the spine correctly in more than 97% of patients; without pre-screening, it did so in 94% of patients. Automatically generated labels are robust to widespread vertebral metastases and surgical implants, and our method to detect labeling failures based on neighborhood intervertebral spacing can reliably identify patients with an additional lumbar or thoracic vertebral body.
AB - Purpose: The purpose of this work was to evaluate the performance of X-Net, a multiview deep learning architecture, to automatically label vertebral levels (S2-C1) in palliative radiotherapy simulation CT scans. Methods: For each patient CT scan, our automated approach 1) segmented the spinal canal using a convolutional neural network (CNN), 2) formed sagittal and coronal intensity projection pairs, 3) labeled vertebral levels with X-Net, and 4) detected irregular intervertebral spacing using an analytic methodology. The spinal canal CNN was trained via fivefold cross validation using 1,966 simulation CT scans and evaluated on 330 CT scans. After labeling vertebral levels (S2-C1) in 897 palliative radiotherapy simulation CT scans, a volume of interest surrounding the spinal canal in each patient's CT scan was converted into sagittal and coronal intensity projection image pairs. Then, intensity projection image pairs were augmented and used to train X-Net to automatically label vertebral levels using fivefold cross validation (n = 803). Prior to testing on the final test set (n = 94), CT scans of patients with anatomical abnormalities, surgical implants, or other atypical features were placed in an outlier group (n = 20), whereas those without these features were placed in a normative group (n = 74). The performance of X-Net, X-Net Ensemble, and another leading vertebral labeling architecture (Btrfly Net) was evaluated on both groups using identification rate, localization error, and other metrics. The performance of our approach was also evaluated on the MICCAI 2014 test dataset (n = 60). Finally, a method to detect irregular intervertebral spacing was created based on the rate of change in spacing between predicted vertebral body locations and was also evaluated using the final test set. Receiver operating characteristic analysis was used to investigate the performance of the method to detect irregular intervertebral spacing. 
Results: The spinal canal architecture yielded centroid coordinates spanning S2-C1 with submillimeter accuracy (mean ± standard deviation, 0.399 ± 0.299 mm; n = 330 patients), and localization of the spinal canal centroid was robust to surgical implants and widespread metastases. Cross-validation testing of X-Net for vertebral labeling revealed that the deep learning model performance (F1 score, precision, and sensitivity) improved with CT scan length. The X-Net, X-Net Ensemble, and Btrfly Net mean identification rates and localization errors were 92.4% and 2.3 mm, 94.2% and 2.2 mm, and 90.5% and 3.4 mm, respectively, in the final test set and 96.7% and 2.2 mm, 96.9% and 2.0 mm, and 94.8% and 3.3 mm, respectively, within the normative group of the final test set. The X-Net Ensemble yielded the highest percentage of patients (94%) having all vertebral bodies identified correctly in the final test set when the three most inferior and superior vertebral bodies were excluded from the CT scan. The method used to detect labeling failures had 67% sensitivity and 95% specificity when combined with the X-Net Ensemble and flagged five of six patients with atypical vertebral counts (an additional thoracic vertebra (T13), an additional lumbar vertebra (L6), or only four lumbar vertebrae). Transfer learning increased the mean identification rate of the X-Net Ensemble on the MICCAI 2014 dataset from 86.8% to 91.3%, yielding state-of-the-art results for various regions of the spine. Conclusions: We trained X-Net, our unique convolutional neural network, to automatically label vertebral levels from S2 to C1 on palliative radiotherapy CT images and found that an ensemble of X-Net models had a high vertebral body identification rate (94.2%) and small localization errors (2.2 ± 1.8 mm). In addition, our transfer learning approach achieved state-of-the-art results on a well-known benchmark dataset with a high identification rate (91.3%) and low localization error (3.3 ± 2.7 mm). 
When we pre-screened radiotherapy CT images for the presence of hardware, surgical implants, or other anatomic abnormalities prior to the use of X-Net, it labeled the spine correctly in more than 97% of patients; without pre-screening, it did so in 94% of patients. Automatically generated labels are robust to widespread vertebral metastases and surgical implants, and our method to detect labeling failures based on neighborhood intervertebral spacing can reliably identify patients with an additional lumbar or thoracic vertebral body.
KW - automatic vertebral labeling
KW - deep learning
UR - http://www.scopus.com/inward/record.url?scp=85090928751&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090928751&partnerID=8YFLogxK
U2 - 10.1002/mp.14415
DO - 10.1002/mp.14415
M3 - Article
C2 - 33459402
AN - SCOPUS:85090928751
SN - 0094-2405
VL - 47
SP - 5592
EP - 5608
JO - Medical physics
JF - Medical physics
IS - 11
ER -