Definition of gross tumor volume in lung cancer: inter-observer variability

Jan van de Steene, Nadine Linthout, Johan de Mey, Vincent Vinh-Hung, Cornelia Claassens, Marc Noppen, Arjan Bel, Guy Storme

Research output: Contribution to Journal › Article › Academic › peer-review

201 Citations (Scopus)


BACKGROUND AND PURPOSE: To determine the inter-observer variation in gross tumor volume (GTV) definition in lung cancer, and its clinical relevance.

MATERIALS AND METHODS: Five clinicians involved in lung cancer were asked to define the GTV on the planning CT scan of eight patients. The resulting GTVs were compared on the basis of geometric volume, dimensions and extensions. Judgement of invasion of lymph node (LN) regions was evaluated using the ATS/LCSG classification of LN. The clinical relevance of the variation was studied through 3D-dosimetry of standard conformal plans: the volume of critical organs (heart, lungs, esophagus, spinal cord) irradiated at toxic doses, the 95% isodose volumes of GTVs, normal tissue complication probabilities (NTCP) and tumor control probabilities (TCP) were compared to evaluate observer variability.

RESULTS: Before evaluation of observer variability, critical review of the planning CT scans led to upstaging (two cases) and downstaging (one case) of patients as compared to the respective diagnostic scans. The defined GTVs showed an inter-observer variation in which the ratio between maximum and minimum geometric content reached more than 7. The dimensions of the primary tumor had inter-observer ranges of 4.2 cm (transversal), 7.9 cm (cranio-caudal) and 5.4 cm (antero-posterior). Extreme extensions of the GTVs (left, right, cranial, caudal, anterior and posterior) varied with ranges of 2.8-7.3 cm due to inter-observer variation. After common review, only 63% of involved lymph node regions had been delineated by the clinicians (i.e. 37% were false negative). Twenty-two percent of the lymph node regions that were drawn in were accepted to be false positive after review. In the conformal plans, inter-observer ranges of irradiated normal tissue volume were on average 12%, with a maximum of 66%. The probability (in the population of all conformal plans) of irradiating at least 95% of the GTV with at least 95% of the nominal treatment dose decreased from 96% to 88% when swapping the matched GTV with an unmatched one. The average (over all patients) inter-observer range in NTCP varied from 5% (spinal cord) to 20% (ipsilateral lung), whereas the maximal ranges ran from 16% (spinal cord) to 45% (heart). The average TCP was 51% with an average range of 2% (maximally 5%) in the case of matched GTVs. These values shifted to 42% (average TCP) with an average range of 14% (maximally 31%) when defining unmatched GTVs. Four groups of causes are suggested for the large inter-observer variation: (1) problems of methodology; (2) impossible differentiation between pathologic structures and tumor; (3) impossible differentiation between normal structures and tumor; and (4) lack of knowledge. Only a minority of these can be resolved objectively. For most of the causal factors, agreements have to be made between clinicians, intra- and inter-departmentally. Some of the factors will never be unequivocally solved.

CONCLUSIONS: GTV definition in lung cancer is one of the cornerstones in quality assurance of radiotherapy. The large inter-observer variation in GTV definition jeopardizes comparison between clinicians, institutes and treatments.
Original language: English
Pages (from-to): 37-49
Journal: Radiotherapy and Oncology
Issue number: 1
Publication status: Published - 2002
