Understanding metric-related pitfalls in image analysis validation

Annika Reinke; Minu D. Tizabi; Michael Baumgartner; Matthias Eisenmann; Doreen Heckmann-Nötzel; A. Emre Kavur; Tim Rädsch; Carole H. Sudre; Laura Acion; Michela Antonelli; Tal Arbel; Spyridon Bakas; Arriel Benis; Florian Buettner; M. Jorge Cardoso; Veronika Cheplygina; Jianxu Chen; Evangelia Christodoulou; Beth A. Cimini; Keyvan Farahani; Luciana Ferrer; Adrian Galdran; Bram van Ginneken; Ben Glocker; Patrick Godau; Daniel A. Hashimoto; Michael M. Hoffman; Merel Huisman; Fabian Isensee; Pierre Jannin; Charles E. Kahn; Dagmar Kainmueller; Bernhard Kainz; Alexandros Karargyris; Jens Kleesiek; Florian Kofler; Thijs Kooi; Annette Kopp-Schneider; Michal Kozubek; Anna Kreshuk; Tahsin Kurc; Bennett A. Landman; Geert Litjens; Amin Madani; Klaus Maier-Hein; Anne L. Martel; Erik Meijering; Bjoern Menze; Karel G. M. Moons; Henning Müller; Brennan Nichyporuk; Felix Nickel; Jens Petersen; Susanne M. Rafelski; Nasir Rajpoot; Mauricio Reyes; Michael A. Riegler; Nicola Rieke; Julio Saez-Rodriguez; Clara I. Sánchez; Shravya Shetty; Ronald M. Summers; Abdel A. Taha; Aleksei Tiulpin; Sotirios A. Tsaftaris; Ben van Calster; Gaël Varoquaux; Ziv R. Yaniv; Paul F. Jäger; Lena Maier-Hein

doi:10.1038/s41592-023-02150-0

Radiology and Nuclear Medicine (VUmc)

Research output: Contribution to journal › Article › Academic › peer-review

3 Citations (Scopus)

Abstract

Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.

Original language	English
Pages (from-to)	182-194
Number of pages	13
Journal	Nature Methods
Volume	21
Issue number	2
DOIs	https://doi.org/10.1038/s41592-023-02150-0
Publication status	Published - 1 Feb 2024

Cite this

Reinke, A., Tizabi, M. D., Baumgartner, M., Eisenmann, M., Heckmann-Nötzel, D., Kavur, A. E., Rädsch, T., Sudre, C. H., Acion, L., Antonelli, M., Arbel, T., Bakas, S., Benis, A., Buettner, F., Cardoso, M. J., Cheplygina, V., Chen, J., Christodoulou, E., Cimini, B. A., ... Maier-Hein, L. (2024). Understanding metric-related pitfalls in image analysis validation. Nature Methods, 21(2), 182-194. https://doi.org/10.1038/s41592-023-02150-0

@article{995582d7ce464bfca32ca26c2412c35d,

title = "Understanding metric-related pitfalls in image analysis validation",

abstract = "Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.",

author = "Annika Reinke and Tizabi, {Minu D.} and Michael Baumgartner and Matthias Eisenmann and Doreen Heckmann-N{\"o}tzel and Kavur, {A. Emre} and Tim R{\"a}dsch and Sudre, {Carole H.} and Laura Acion and Michela Antonelli and Tal Arbel and Spyridon Bakas and Arriel Benis and Florian Buettner and Cardoso, {M. Jorge} and Veronika Cheplygina and Jianxu Chen and Evangelia Christodoulou and Cimini, {Beth A.} and Keyvan Farahani and Luciana Ferrer and Adrian Galdran and {van Ginneken}, Bram and Ben Glocker and Patrick Godau and Hashimoto, {Daniel A.} and Hoffman, {Michael M.} and Merel Huisman and Fabian Isensee and Pierre Jannin and Kahn, {Charles E.} and Dagmar Kainmueller and Bernhard Kainz and Alexandros Karargyris and Jens Kleesiek and Florian Kofler and Thijs Kooi and Annette Kopp-Schneider and Michal Kozubek and Anna Kreshuk and Tahsin Kurc and Landman, {Bennett A.} and Geert Litjens and Amin Madani and Klaus Maier-Hein and Martel, {Anne L.} and Erik Meijering and Bjoern Menze and Moons, {Karel G. M.} and Henning M{\"u}ller and Brennan Nichyporuk and Felix Nickel and Jens Petersen and Rafelski, {Susanne M.} and Nasir Rajpoot and Mauricio Reyes and Riegler, {Michael A.} and Nicola Rieke and Julio Saez-Rodriguez and S{\'a}nchez, {Clara I.} and Shravya Shetty and Summers, {Ronald M.} and Taha, {Abdel A.} and Aleksei Tiulpin and Tsaftaris, {Sotirios A.} and {van Calster}, Ben and Ga{\"e}l Varoquaux and Yaniv, {Ziv R.} and J{\"a}ger, {Paul F.} and Lena Maier-Hein",

note = "Publisher Copyright: {\textcopyright} Springer Nature America, Inc. 2024.",

year = "2024",

month = feb,

day = "1",

doi = "10.1038/s41592-023-02150-0",

language = "English",

volume = "21",

pages = "182--194",

journal = "Nature Methods",

issn = "1548-7091",

publisher = "Nature Publishing Group",

number = "2",

}

Reinke, A, Tizabi, MD, Baumgartner, M, Eisenmann, M, Heckmann-Nötzel, D, Kavur, AE, Rädsch, T, Sudre, CH, Acion, L, Antonelli, M, Arbel, T, Bakas, S, Benis, A, Buettner, F, Cardoso, MJ, Cheplygina, V, Chen, J, Christodoulou, E, Cimini, BA, Farahani, K, Ferrer, L, Galdran, A, van Ginneken, B, Glocker, B, Godau, P, Hashimoto, DA, Hoffman, MM, Huisman, M, Isensee, F, Jannin, P, Kahn, CE, Kainmueller, D, Kainz, B, Karargyris, A, Kleesiek, J, Kofler, F, Kooi, T, Kopp-Schneider, A, Kozubek, M, Kreshuk, A, Kurc, T, Landman, BA, Litjens, G, Madani, A, Maier-Hein, K, Martel, AL, Meijering, E, Menze, B, Moons, KGM, Müller, H, Nichyporuk, B, Nickel, F, Petersen, J, Rafelski, SM, Rajpoot, N, Reyes, M, Riegler, MA, Rieke, N, Saez-Rodriguez, J, Sánchez, CI, Shetty, S, Summers, RM, Taha, AA, Tiulpin, A, Tsaftaris, SA, van Calster, B, Varoquaux, G, Yaniv, ZR, Jäger, PF & Maier-Hein, L 2024, 'Understanding metric-related pitfalls in image analysis validation', Nature Methods, vol. 21, no. 2, pp. 182-194. https://doi.org/10.1038/s41592-023-02150-0

TY - JOUR

T1 - Understanding metric-related pitfalls in image analysis validation

AU - Reinke, Annika

AU - Tizabi, Minu D.

AU - Baumgartner, Michael

AU - Eisenmann, Matthias

AU - Heckmann-Nötzel, Doreen

AU - Kavur, A. Emre

AU - Rädsch, Tim

AU - Sudre, Carole H.

AU - Acion, Laura

AU - Antonelli, Michela

AU - Arbel, Tal

AU - Bakas, Spyridon

AU - Benis, Arriel

AU - Buettner, Florian

AU - Cardoso, M. Jorge

AU - Cheplygina, Veronika

AU - Chen, Jianxu

AU - Christodoulou, Evangelia

AU - Cimini, Beth A.

AU - Farahani, Keyvan

AU - Ferrer, Luciana

AU - Galdran, Adrian

AU - van Ginneken, Bram

AU - Glocker, Ben

AU - Godau, Patrick

AU - Hashimoto, Daniel A.

AU - Hoffman, Michael M.

AU - Huisman, Merel

AU - Isensee, Fabian

AU - Jannin, Pierre

AU - Kahn, Charles E.

AU - Kainmueller, Dagmar

AU - Kainz, Bernhard

AU - Karargyris, Alexandros

AU - Kleesiek, Jens

AU - Kofler, Florian

AU - Kooi, Thijs

AU - Kopp-Schneider, Annette

AU - Kozubek, Michal

AU - Kreshuk, Anna

AU - Kurc, Tahsin

AU - Landman, Bennett A.

AU - Litjens, Geert

AU - Madani, Amin

AU - Maier-Hein, Klaus

AU - Martel, Anne L.

AU - Meijering, Erik

AU - Menze, Bjoern

AU - Moons, Karel G. M.

AU - Müller, Henning

AU - Nichyporuk, Brennan

AU - Nickel, Felix

AU - Petersen, Jens

AU - Rafelski, Susanne M.

AU - Rajpoot, Nasir

AU - Reyes, Mauricio

AU - Riegler, Michael A.

AU - Rieke, Nicola

AU - Saez-Rodriguez, Julio

AU - Sánchez, Clara I.

AU - Shetty, Shravya

AU - Summers, Ronald M.

AU - Taha, Abdel A.

AU - Tiulpin, Aleksei

AU - Tsaftaris, Sotirios A.

AU - van Calster, Ben

AU - Varoquaux, Gaël

AU - Yaniv, Ziv R.

AU - Jäger, Paul F.

AU - Maier-Hein, Lena

PY - 2024/2/1

Y1 - 2024/2/1

N2 - Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.

AB - Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.

UR - http://www.scopus.com/inward/record.url?scp=85184934082&partnerID=8YFLogxK

U2 - 10.1038/s41592-023-02150-0

DO - 10.1038/s41592-023-02150-0

M3 - Article

C2 - 38347140

SN - 1548-7091

VL - 21

SP - 182

EP - 194

JO - Nature Methods

JF - Nature Methods

IS - 2

ER -