E pluribus unum: prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation

Diana Lin, Kareem A. Wahid, Benjamin E. Nelms, Renjie He, Mohammed A. Naser, Simon Duke, Michael V. Sherer, John P. Christodouleas, Abdallah S.R. Mohamed, Michael Cislo, James D. Murphy, Clifton D. Fuller, Erin F. Gillespie

Research output: Contribution to journal · Article · peer-review


Abstract

Purpose: Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement. Approach: Participants who contoured ≥1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE_nonexpert ROIs were evaluated against STAPLE_expert contours using the Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE_nonexpert and STAPLE_expert. To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the IODSC_expert. Results: For all cases, the DSC values for STAPLE_nonexpert versus STAPLE_expert were higher than the comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to achieve the IODSC_expert acceptability criterion ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI. Conclusions: Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating that of experts, suggesting that nonexpert segmentations are a feasible, cost-effective input for AI development.
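
To illustrate the evaluation workflow described in the abstract, the following is a minimal NumPy sketch, not the authors' code: per-voxel majority voting stands in for the full STAPLE algorithm used in the study, the inputs are assumed to be co-registered binary 3D masks, and all function and variable names (consensus_mask, dice, nonexpert_masks, expert_masks) are hypothetical.

    import numpy as np

    def consensus_mask(masks, threshold=0.5):
        # Per-voxel majority vote over binary masks; a simple stand-in for the
        # STAPLE consensus used in the study. A voxel is labeled foreground when
        # more than `threshold` of observers mark it.
        stacked = np.stack(masks).astype(float)   # shape: (n_observers, *volume_shape)
        return (stacked.mean(axis=0) > threshold).astype(np.uint8)

    def dice(a, b):
        # Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).
        a, b = a.astype(bool), b.astype(bool)
        denom = a.sum() + b.sum()
        return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

    # Hypothetical example with random masks in place of real contours.
    rng = np.random.default_rng(0)
    nonexpert_masks = [rng.integers(0, 2, size=(32, 32, 32), dtype=np.uint8) for _ in range(5)]
    expert_masks = [rng.integers(0, 2, size=(32, 32, 32), dtype=np.uint8) for _ in range(3)]

    # DSC between the nonexpert consensus and the expert consensus.
    dsc = dice(consensus_mask(nonexpert_masks), consensus_mask(expert_masks))

In the study, acceptability was judged by comparing this nonexpert-versus-expert DSC against the expert interobserver DSC (IODSC_expert) for each ROI, and the number of nonexperts contributing to the consensus was varied to find the minimum that met this threshold.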

Original language: English (US)
Article number: S11903
Journal: Journal of Medical Imaging
Volume: 10
DOIs
State: Published - Feb 1 2023

Keywords

  • artificial intelligence
  • autosegmentation
  • contouring
  • crowdsourcing
  • radiation oncology
  • segmentation

ASJC Scopus subject areas

  • Radiology, Nuclear Medicine and Imaging
