TY - JOUR
T1 - Statistical modeling approach to quantitative analysis of interobserver variability in breast contouring
AU - Yang, Jinzhong
AU - Woodward, Wendy A.
AU - Reed, Valerie K.
AU - Strom, Eric A.
AU - Perkins, George H.
AU - Tereffe, Welela
AU - Buchholz, Thomas A.
AU - Zhang, Lifei
AU - Balter, Peter
AU - Court, Laurence E.
AU - Li, X. Allen
AU - Dong, Lei
N1 - Funding Information:
This research was supported in part by the National Institutes of Health through MD Anderson's Cancer Center support grant CA016672.
PY - 2014/5/1
Y1 - 2014/5/1
N2 - Purpose: To develop a new approach for interobserver variability analysis. Methods and Materials: Eight radiation oncologists specializing in breast cancer radiation therapy delineated a patient's left breast "from scratch" and from a template that was generated using deformable image registration. Three of the radiation oncologists had previously received training in the Radiation Therapy Oncology Group (RTOG) consensus contouring atlas for breast cancer. The simultaneous truth and performance level estimation algorithm was applied to the 8 contours delineated "from scratch" to produce a group consensus contour. Individual Jaccard scores were fitted to a beta distribution model. We also applied this analysis to 2 additional patients, who were contoured by 9 breast radiation oncologists from 8 institutions. Results: The beta distribution model had a mean of 86.2%, standard deviation (SD) of ±5.9%, skewness of -0.7, and excess kurtosis of 0.55, exemplifying broad interobserver variability. The 3 RTOG-trained physicians had higher agreement scores than average, indicating that their contours were close to the group consensus contour. One physician had high sensitivity but lower specificity than the others, which implies that this physician tended to contour a structure larger than those of the others. Two other physicians had low sensitivity but specificity similar to the others, which implies that they tended to contour a structure smaller than those of the others. With this information, they could adjust their contouring practice to be more consistent with others if desired. When contouring from the template, the beta distribution model had a mean of 92.3%, SD of ±3.4%, skewness of -0.79, and excess kurtosis of 0.83, which indicated much better consistency among individual contours. Similar results were obtained for the analysis of the 2 additional patients.
Conclusions: The proposed statistical approach was able to measure interobserver variability quantitatively and to identify individuals who tended to contour differently from the others. The information could be useful as feedback to improve contouring consistency.
AB - Purpose: To develop a new approach for interobserver variability analysis. Methods and Materials: Eight radiation oncologists specializing in breast cancer radiation therapy delineated a patient's left breast "from scratch" and from a template that was generated using deformable image registration. Three of the radiation oncologists had previously received training in the Radiation Therapy Oncology Group (RTOG) consensus contouring atlas for breast cancer. The simultaneous truth and performance level estimation algorithm was applied to the 8 contours delineated "from scratch" to produce a group consensus contour. Individual Jaccard scores were fitted to a beta distribution model. We also applied this analysis to 2 additional patients, who were contoured by 9 breast radiation oncologists from 8 institutions. Results: The beta distribution model had a mean of 86.2%, standard deviation (SD) of ±5.9%, skewness of -0.7, and excess kurtosis of 0.55, exemplifying broad interobserver variability. The 3 RTOG-trained physicians had higher agreement scores than average, indicating that their contours were close to the group consensus contour. One physician had high sensitivity but lower specificity than the others, which implies that this physician tended to contour a structure larger than those of the others. Two other physicians had low sensitivity but specificity similar to the others, which implies that they tended to contour a structure smaller than those of the others. With this information, they could adjust their contouring practice to be more consistent with others if desired. When contouring from the template, the beta distribution model had a mean of 92.3%, SD of ±3.4%, skewness of -0.79, and excess kurtosis of 0.83, which indicated much better consistency among individual contours. Similar results were obtained for the analysis of the 2 additional patients.
Conclusions: The proposed statistical approach was able to measure interobserver variability quantitatively and to identify individuals who tended to contour differently from the others. The information could be useful as feedback to improve contouring consistency.
UR - http://www.scopus.com/inward/record.url?scp=84898903658&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898903658&partnerID=8YFLogxK
U2 - 10.1016/j.ijrobp.2014.01.010
DO - 10.1016/j.ijrobp.2014.01.010
M3 - Article
C2 - 24613812
AN - SCOPUS:84898903658
SN - 0360-3016
VL - 89
SP - 214
EP - 221
JO - International Journal of Radiation Oncology Biology Physics
JF - International Journal of Radiation Oncology Biology Physics
IS - 1
ER -