Metrics to evaluate the performance of auto-segmentation for radiation treatment planning: A critical review

Michael V. Sherer; Diana Lin; Sharif Elguindi; Simon Duke; Li-Tee Tan; Jon Cacicedo; Max Dahele; Erin F. Gillespie

doi:https://doi.org/10.1016/j.radonc.2021.05.003

Research output: Contribution to journal › Review article › Academic › peer-review

80 Citations (Scopus)

Abstract

Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contour variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms to assess their efficacy at meeting these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.

Original language	English
Pages (from-to)	185-191
Number of pages	7
Journal	Radiotherapy and oncology
Volume	160
DOIs	https://doi.org/10.1016/j.radonc.2021.05.003
Publication status	Published - 1 Jul 2021

Keywords

Auto-segmentation
Contouring
Quality assurance
Treatment planning

Access to Document

https://doi.org/10.1016/j.radonc.2021.05.003

Cite this

@article{c28519936b4d4c80ab6e2fd1e08fc417,

title = "Metrics to evaluate the performance of auto-segmentation for radiation treatment planning: A critical review",

abstract = "Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contour variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms to assess their efficacy at meeting these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.",

keywords = "Auto-segmentation, Contouring, Quality assurance, Treatment planning",

author = "Sherer, {Michael V.} and Diana Lin and Sharif Elguindi and Simon Duke and Li-Tee Tan and Jon Cacicedo and Max Dahele and Gillespie, {Erin F.}",

note = "Funding Information: This work is supported by an MSK Core Grant (P30 CA008748), as well as the Radiological Society of North America (RSNA) (EI1902, E.F.G.), Agency for Healthcare Research and Quality (AHRQ) (R18 HS026881, E.F.G.), and Varian Medical Systems (M.D.). Funding Information: EFG is a cofounder of the educational website eContour.org. EFG reports funding from a Radiological Society of North America (RSNA) Innovation grant. MD reports research funding from Varian Medical Systems outside the scope of this work. There are no other conflicts of interest to report. Publisher Copyright: {\textcopyright} 2021 Elsevier B.V. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.",

year = "2021",

month = jul,

day = "1",

doi = "10.1016/j.radonc.2021.05.003",

language = "English",

volume = "160",

pages = "185--191",

journal = "Radiotherapy and oncology",

issn = "0167-8140",

publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Metrics to evaluate the performance of auto-segmentation for radiation treatment planning: A critical review

AU - Sherer, Michael V.

AU - Lin, Diana

AU - Elguindi, Sharif

AU - Duke, Simon

AU - Tan, Li-Tee

AU - Cacicedo, Jon

AU - Dahele, Max

AU - Gillespie, Erin F.

N1 - Funding Information: This work is supported by an MSK Core Grant (P30 CA008748), as well as the Radiological Society of North America (RSNA) (EI1902, E.F.G.), Agency for Healthcare Research and Quality (AHRQ) (R18 HS026881, E.F.G.), and Varian Medical Systems (M.D.). Funding Information: EFG is a cofounder of the educational website eContour.org. EFG reports funding from a Radiological Society of North America (RSNA) Innovation grant. MD reports research funding from Varian Medical Systems outside the scope of this work. There are no other conflicts of interest to report. Publisher Copyright: © 2021 Elsevier B.V. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.

PY - 2021/7/1

Y1 - 2021/7/1

N2 - Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contour variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms to assess their efficacy at meeting these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.

AB - Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contour variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms to assess their efficacy at meeting these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.

KW - Auto-segmentation

KW - Contouring

KW - Quality assurance

KW - Treatment planning

UR - http://www.scopus.com/inward/record.url?scp=85107636464&partnerID=8YFLogxK

U2 - https://doi.org/10.1016/j.radonc.2021.05.003

DO - 10.1016/j.radonc.2021.05.003

M3 - Review article

C2 - 33984348

SN - 0167-8140

VL - 160

SP - 185

EP - 191

JO - Radiotherapy and oncology

JF - Radiotherapy and oncology

ER -
