A Self-Configuring Deep Learning Network for Segmentation of Temporal Bone Anatomy in Cone-Beam CT Imaging

Andy S. Ding; Alexander Lu; Zhaoshuo Li; Manish Sahu; Deepa Galaiya; Jeffrey H. Siewerdsen; Mathias Unberath; Russell H. Taylor; Francis X. Creighton

doi:10.1002/ohn.317

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Objective: Preoperative planning for otologic or neurotologic procedures often requires manual segmentation of relevant structures, which can be tedious and time-consuming. Automated methods for segmenting multiple geometrically complex structures can not only streamline preoperative planning but also augment minimally invasive and/or robot-assisted procedures in this space. This study evaluates a state-of-the-art deep learning pipeline for semantic segmentation of temporal bone anatomy. Study Design: A descriptive study of a segmentation network. Setting: Academic institution. Methods: A total of 15 high-resolution cone-beam temporal bone computed tomography (CT) data sets were included in this study. All images were co-registered, with relevant anatomical structures (eg, ossicles, inner ear, facial nerve, chorda tympani, bony labyrinth) manually segmented. Predicted segmentations from no new U-Net (nnU-Net), an open-source 3-dimensional semantic segmentation neural network, were compared against ground-truth segmentations using modified Hausdorff distances (mHD) and Dice scores. Results: Fivefold cross-validation results with nnU-Net, comparing predicted against ground-truth labels, were as follows: malleus (mHD: 0.044 ± 0.024 mm, Dice: 0.914 ± 0.035), incus (mHD: 0.051 ± 0.027 mm, Dice: 0.916 ± 0.034), stapes (mHD: 0.147 ± 0.113 mm, Dice: 0.560 ± 0.106), bony labyrinth (mHD: 0.038 ± 0.031 mm, Dice: 0.952 ± 0.017), and facial nerve (mHD: 0.139 ± 0.072 mm, Dice: 0.862 ± 0.039). Comparison against atlas-based segmentation propagation showed significantly higher Dice scores for all structures (p < .05). Conclusion: Using an open-source deep learning pipeline, we demonstrate consistently submillimeter accuracy for semantic CT segmentation of temporal bone anatomy compared to hand-segmented labels. This pipeline has the potential to greatly improve preoperative planning workflows for a variety of otologic and neurotologic procedures and augment existing image guidance and robot-assisted systems for the temporal bone.
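
The two evaluation metrics named in the abstract, the Dice score and the modified Hausdorff distance (mHD), can be sketched as below. This is a minimal illustration and not the authors' evaluation code. It assumes that masks are flat sequences of 0/1 voxel labels, that surfaces are lists of 3D points in millimeters, and that mHD follows the common Dubuisson-Jain definition (the larger of the two directed mean nearest-neighbor distances); the abstract does not state which variant was used. The function names `dice_score` and `modified_hausdorff` are hypothetical.

```python
import math


def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def _mean_nearest_distance(src, dst):
    """Mean Euclidean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(a, b) for b in dst) for a in src) / len(src)


def modified_hausdorff(points_a, points_b):
    """Modified Hausdorff distance (assumed Dubuisson-Jain form) between two point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    # Take the larger of the two directed mean nearest-neighbor distances.
    return max(
        _mean_nearest_distance(points_a, points_b),
        _mean_nearest_distance(points_b, points_a),
    )


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not data from the study.
    print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))
    print(modified_hausdorff([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (3, 0, 0)]))
```

A Dice score near 1 indicates high volumetric overlap, while a small mHD indicates that the predicted and reference surfaces lie close together on average. The brute-force nearest-neighbor search here is quadratic in the number of points, so a real evaluation on full-resolution volumes would likely use a distance transform or a spatial index instead.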

Original language	English (US)
Pages (from-to)	988-998
Number of pages	11
Journal	Otolaryngology - Head and Neck Surgery (United States)
Volume	169
Issue number	4
DOIs	https://doi.org/10.1002/ohn.317
State	Published - Oct 2023
Externally published	Yes

Keywords

automated segmentation
deep learning
neural network
temporal bone

ASJC Scopus subject areas

Surgery
Otorhinolaryngology

Cite this

@article{883624ab7f4f41cca5c2642e7f645718,

title = "A Self-Configuring Deep Learning Network for Segmentation of Temporal Bone Anatomy in Cone-Beam CT Imaging",

abstract = "Objective: Preoperative planning for otologic or neurotologic procedures often requires manual segmentation of relevant structures, which can be tedious and time-consuming. Automated methods for segmenting multiple geometrically complex structures can not only streamline preoperative planning but also augment minimally invasive and/or robot-assisted procedures in this space. This study evaluates a state-of-the-art deep learning pipeline for semantic segmentation of temporal bone anatomy. Study Design: A descriptive study of a segmentation network. Setting: Academic institution. Methods: A total of 15 high-resolution cone-beam temporal bone computed tomography (CT) data sets were included in this study. All images were co-registered, with relevant anatomical structures (eg, ossicles, inner ear, facial nerve, chorda tympani, bony labyrinth) manually segmented. Predicted segmentations from no new U-Net (nnU-Net), an open-source 3-dimensional semantic segmentation neural network, were compared against ground-truth segmentations using modified Hausdorff distances (mHD) and Dice scores. Results: Fivefold cross-validation with nnU-Net between predicted and ground-truth labels were as follows: malleus (mHD: 0.044 ± 0.024 mm, dice: 0.914 ± 0.035), incus (mHD: 0.051 ± 0.027 mm, dice: 0.916 ± 0.034), stapes (mHD: 0.147 ± 0.113 mm, dice: 0.560 ± 0.106), bony labyrinth (mHD: 0.038 ± 0.031 mm, dice: 0.952 ± 0.017), and facial nerve (mHD: 0.139 ± 0.072 mm, dice: 0.862 ± 0.039). Comparison against atlas-based segmentation propagation showed significantly higher Dice scores for all structures (p <.05). Conclusion: Using an open-source deep learning pipeline, we demonstrate consistently submillimeter accuracy for semantic CT segmentation of temporal bone anatomy compared to hand-segmented labels. 
This pipeline has the potential to greatly improve preoperative planning workflows for a variety of otologic and neurotologic procedures and augment existing image guidance and robot-assisted systems for the temporal bone.",

keywords = "automated segmentation, deep learning, neural network, temporal bone",

author = "Ding, {Andy S.} and Alexander Lu and Zhaoshuo Li and Manish Sahu and Deepa Galaiya and Siewerdsen, {Jeffrey H.} and Mathias Unberath and Taylor, {Russell H.} and Creighton, {Francis X.}",

note = "Publisher Copyright: {\textcopyright} 2023 American Academy of Otolaryngology–Head and Neck Surgery Foundation.",

year = "2023",

month = oct,

doi = "10.1002/ohn.317",

language = "English (US)",

volume = "169",

pages = "988--998",

journal = "Otolaryngology - Head and Neck Surgery (United States)",

issn = "0194-5998",

publisher = "Mosby Inc.",

number = "4",

}

TY - JOUR

T1 - A Self-Configuring Deep Learning Network for Segmentation of Temporal Bone Anatomy in Cone-Beam CT Imaging

AU - Ding, Andy S.

AU - Lu, Alexander

AU - Li, Zhaoshuo

AU - Sahu, Manish

AU - Galaiya, Deepa

AU - Siewerdsen, Jeffrey H.

AU - Unberath, Mathias

AU - Taylor, Russell H.

AU - Creighton, Francis X.

PY - 2023/10

Y1 - 2023/10

AB - Objective: Preoperative planning for otologic or neurotologic procedures often requires manual segmentation of relevant structures, which can be tedious and time-consuming. Automated methods for segmenting multiple geometrically complex structures can not only streamline preoperative planning but also augment minimally invasive and/or robot-assisted procedures in this space. This study evaluates a state-of-the-art deep learning pipeline for semantic segmentation of temporal bone anatomy. Study Design: A descriptive study of a segmentation network. Setting: Academic institution. Methods: A total of 15 high-resolution cone-beam temporal bone computed tomography (CT) data sets were included in this study. All images were co-registered, with relevant anatomical structures (eg, ossicles, inner ear, facial nerve, chorda tympani, bony labyrinth) manually segmented. Predicted segmentations from no new U-Net (nnU-Net), an open-source 3-dimensional semantic segmentation neural network, were compared against ground-truth segmentations using modified Hausdorff distances (mHD) and Dice scores. Results: Fivefold cross-validation with nnU-Net between predicted and ground-truth labels were as follows: malleus (mHD: 0.044 ± 0.024 mm, dice: 0.914 ± 0.035), incus (mHD: 0.051 ± 0.027 mm, dice: 0.916 ± 0.034), stapes (mHD: 0.147 ± 0.113 mm, dice: 0.560 ± 0.106), bony labyrinth (mHD: 0.038 ± 0.031 mm, dice: 0.952 ± 0.017), and facial nerve (mHD: 0.139 ± 0.072 mm, dice: 0.862 ± 0.039). Comparison against atlas-based segmentation propagation showed significantly higher Dice scores for all structures (p <.05). Conclusion: Using an open-source deep learning pipeline, we demonstrate consistently submillimeter accuracy for semantic CT segmentation of temporal bone anatomy compared to hand-segmented labels. 
This pipeline has the potential to greatly improve preoperative planning workflows for a variety of otologic and neurotologic procedures and augment existing image guidance and robot-assisted systems for the temporal bone.

KW - automated segmentation

KW - deep learning

KW - neural network

KW - temporal bone

UR - http://www.scopus.com/inward/record.url?scp=85150412682&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85150412682&partnerID=8YFLogxK

U2 - 10.1002/ohn.317

DO - 10.1002/ohn.317

M3 - Article

C2 - 36883992

AN - SCOPUS:85150412682

SN - 0194-5998

VL - 169

SP - 988

EP - 998

JO - Otolaryngology - Head and Neck Surgery (United States)

JF - Otolaryngology - Head and Neck Surgery (United States)

IS - 4

ER -
