Comprehensive Quantitative Evaluation of Variability in Magnetic Resonance-Guided Delineation of Oropharyngeal Gross Tumor Volumes and High-Risk Clinical Target Volumes: An R-IDEAL Stage 0 Prospective Study

Academic Article


  • Purpose: Manual delineation of tumor and target volumes remains a challenging task in head and neck cancer radiation therapy. The purpose of this study was to conduct a multi-institutional evaluation of manual delineations of gross tumor volume (GTV), high-risk clinical target volume (CTV), parotids, and submandibular glands on treatment simulation magnetic resonance scans of patients with oropharyngeal cancer. Methods and Materials: We retrospectively collected pretreatment T1-weighted, T1-weighted with gadolinium contrast, and T2-weighted magnetic resonance imaging scans for 4 patients with oropharyngeal cancer under an institutional review board–approved protocol. We provided the scans to 26 radiation oncologists from 7 international cancer centers that participated in this delineation study. We also provided the patients’ clinical history and physical examination findings, along with a medical photographic image and radiologic results. We compared the contours using both the Simultaneous Truth and Performance Level Estimation algorithm and pair-wise comparisons, with overlap/distance metrics. Lastly, to assess experience and institutional CTV delineation practices, we had participants complete a brief questionnaire. Results: Large variability was measured between observers’ delineations for GTVs and CTVs. The mean Dice similarity coefficient values across all physicians’ delineations for GTVp, GTVn, CTVp, and CTVn were 0.77, 0.67, 0.77, and 0.69, respectively, for the Simultaneous Truth and Performance Level Estimation algorithm comparison, and 0.67, 0.60, 0.67, and 0.58, respectively, for the pair-wise analysis. Normal tissue contours were defined more consistently when considering overlap/distance metrics. The median radiation oncology clinical experience was 7 years. The median experience delineating on magnetic resonance imaging was 3.5 years. The GTV-to-CTV margin used was 10 mm for 6 of 7 participating institutions. One institution used 8 mm, and 3 participants (from 3 different institutions) used a margin of 5 mm. Conclusions: The data from this study suggest that appropriate guidelines, contouring quality assurance sessions, and training are still needed for the adoption of magnetic resonance–based treatment planning for head and neck cancers. Such efforts should play a critical role in reducing delineation variation and ensuring standardization of target design across clinical practices.
  • Authors

  • Cardenas CE; Blinde SE; Mohamed ASR; Ng SP; Raaijmakers C; Philippens M; Kotte A; Al-Mamgani AA; Karam I; Thomson DJ
  • Start Page

  • 426
  • End Page

  • 436
  • Volume

  • 113
  • Issue

  • 2
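
The abstract reports mean Dice similarity coefficients from a pair-wise comparison of observers' contours. As a rough illustration only, the sketch below shows how such a pair-wise mean could be computed. It assumes each contour is represented as a set of voxel index tuples; the function names `dice` and `mean_pairwise_dice` and the toy data are hypothetical and are not taken from the study's own software.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention assumed here: two empty contours agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_dice(contours):
    """Average the Dice coefficient over every unordered pair of observers."""
    pairs = list(combinations(contours, 2))
    if not pairs:
        raise ValueError("need at least two contours to compare")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


# Toy data: three hypothetical observers outlining the same structure,
# each contour given as a set of (slice, row, column) voxel indices.
obs1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
obs2 = {(0, 0, 1), (0, 1, 1), (0, 1, 2)}
obs3 = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}

print(round(dice(obs1, obs2), 3))
print(round(mean_pairwise_dice([obs1, obs2, obs3]), 3))
```

A value of 1 means two contours coincide exactly and 0 means they share no voxels. Real contours would typically be rasterized to binary masks on the image grid before a comparison like this.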