TY - JOUR
T1 - Computer-aided segmentation on MRI for prostate radiotherapy, part II
T2 - Comparing human and computer observer populations and the influence of annotator variability on algorithm variability
AU - Sanders, Jeremiah W.
AU - Mok, Henry
AU - Hanania, Alexander N.
AU - Venkatesan, Aradhana M.
AU - Tang, Chad
AU - Bruno, Teresa L.
AU - Thames, Howard D.
AU - Kudchadker, Rajat J.
AU - Frank, Steven J.
N1 - Publisher Copyright: © 2021 Elsevier B.V.
PY - 2022/4
Y1 - 2022/4
N2 - Background and purpose: Comparing deep learning (DL) algorithms to human interobserver variability, one of the largest sources of noise in human-performed annotations, is necessary to inform the clinical application, use, and quality assurance of DL for prostate radiotherapy. Materials and methods: One hundred fourteen DL algorithms were developed on 295 prostate MRIs to segment the prostate, external urinary sphincter (EUS), seminal vesicles (SV), rectum, and bladder. Fifty prostate MRIs of 25 patients undergoing MRI-based low-dose-rate prostate brachytherapy were acquired as an independent test set. Groups of DL algorithms were created based on the loss functions used to train them, and the spatial entropy (SE) of their predictions on the 50 test MRIs was computed. Five human observers contoured the 50 test MRIs, and SE maps of their contours were compared with those of the groups of the DL algorithms. Additionally, similarity metrics were computed between DL algorithm predictions and consensus annotations of the 5 human observers’ contours of the 50 test MRIs. Results: A DL algorithm yielded statistically significantly higher similarity metrics for the prostate than did the human observers (H) (prostate Matthews correlation coefficient, DL vs. H: planning, 0.931 vs. 0.903, p < 0.001; postimplant, 0.925 vs. 0.892, p < 0.001); the same was true for the 4 organs at risk. The SE maps revealed that the DL algorithms and human annotators were most variable in similar anatomical regions: the prostate-EUS, prostate-SV, prostate-rectum, and prostate-bladder junctions. Conclusions: Annotation quality is an important consideration when developing, evaluating, and using DL algorithms clinically.
AB - Background and purpose: Comparing deep learning (DL) algorithms to human interobserver variability, one of the largest sources of noise in human-performed annotations, is necessary to inform the clinical application, use, and quality assurance of DL for prostate radiotherapy. Materials and methods: One hundred fourteen DL algorithms were developed on 295 prostate MRIs to segment the prostate, external urinary sphincter (EUS), seminal vesicles (SV), rectum, and bladder. Fifty prostate MRIs of 25 patients undergoing MRI-based low-dose-rate prostate brachytherapy were acquired as an independent test set. Groups of DL algorithms were created based on the loss functions used to train them, and the spatial entropy (SE) of their predictions on the 50 test MRIs was computed. Five human observers contoured the 50 test MRIs, and SE maps of their contours were compared with those of the groups of the DL algorithms. Additionally, similarity metrics were computed between DL algorithm predictions and consensus annotations of the 5 human observers’ contours of the 50 test MRIs. Results: A DL algorithm yielded statistically significantly higher similarity metrics for the prostate than did the human observers (H) (prostate Matthews correlation coefficient, DL vs. H: planning, 0.931 vs. 0.903, p < 0.001; postimplant, 0.925 vs. 0.892, p < 0.001); the same was true for the 4 organs at risk. The SE maps revealed that the DL algorithms and human annotators were most variable in similar anatomical regions: the prostate-EUS, prostate-SV, prostate-rectum, and prostate-bladder junctions. Conclusions: Annotation quality is an important consideration when developing, evaluating, and using DL algorithms clinically.
KW - Annotation quality
KW - Brachytherapy
KW - Deep learning
KW - MRI
KW - Prostate
KW - Radiation therapy
KW - Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85122958670&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122958670&partnerID=8YFLogxK
U2 - 10.1016/j.radonc.2021.12.033
DO - 10.1016/j.radonc.2021.12.033
M3 - Article
C2 - 34979213
AN - SCOPUS:85122958670
SN - 0167-8140
VL - 169
SP - 132
EP - 139
JO - Radiotherapy and Oncology
JF - Radiotherapy and Oncology
ER -