TY - GEN
T1 - CNNs vs. Transformers
T2 - 2nd International Workshop on Applications of Medical Artificial Intelligence, AMAI 2023
AU - Kusters, Carolus H. J.
AU - Boers, Tim G. W.
AU - Jaspers, Tim J. M.
AU - Jukema, Jelmer B.
AU - Jong, Martijn R.
AU - Fockens, Kiki N.
AU - de Groof, Albert J.
AU - Bergman, Jacques J.
AU - van der Sommen, Fons
AU - de With, Peter H. N.
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2024
Y1 - 2024
AB - In endoscopy, imaging conditions are often challenging due to organ movement, user dependence, fluctuations in video quality and real-time processing, which pose requirements on the performance, robustness and complexity of computer-based analysis techniques. This paper poses the question whether Transformer-based architectures, which are capable of directly capturing global contextual information, can handle the aforementioned endoscopic conditions and even outperform the established Convolutional Neural Networks (CNNs) for this task. To this end, we evaluate and compare clinically relevant performance and robustness of CNNs and Transformers for neoplasia detection in Barrett’s esophagus. We have selected several top-performing CNNs and Transformers on endoscopic benchmarks, which we have trained and validated on a total of 10,208 images (2,079 patients), and tested on a total of 4,661 images (743 patients), divided over a high-quality test set and three different robustness test sets. Our results show that Transformers generally perform better on classification and segmentation for the high-quality challenging test set, and show on-par or increased robustness to various clinically relevant input data variations, while requiring comparable model complexity. This robustness against challenging video-related conditions and equipment variations across hospitals is an essential trait for adoption in clinical practice. The code is made publicly available at: https://github.com/BONS-AI-VCA-AMC/Endoscopy-CNNs-vs-Transformers.
KW - Barrett’s Esophagus
KW - CNN
KW - Robustness
KW - Transformers
UR - http://www.scopus.com/inward/record.url?scp=85177230832&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/978-3-031-47076-9_3
DO - https://doi.org/10.1007/978-3-031-47076-9_3
M3 - Conference contribution
SN - 9783031470752
VL - 14313 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 21
EP - 31
BT - Applications of Medical Artificial Intelligence - 2nd International Workshop, AMAI 2023, Held in Conjunction with MICCAI 2023, Proceedings
A2 - Wu, Shandong
A2 - Shabestari, Behrouz
A2 - Xing, Lei
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 October 2023 through 8 October 2023
ER -