Comprehensive review of publicly available colonoscopic imaging databases for artificial intelligence research: availability, accessibility, and usability

Britt B. S. L. Houwen, Karlijn J. Nass, Jasper L. A. Vleugels, Paul Fockens, Yark Hazewinkel, Evelien Dekker

Research output: Contribution to journal › Review article › Academic › peer-review

11 Citations (Scopus)


Background and Aims: Publicly available databases containing colonoscopic imaging data are valuable resources for artificial intelligence (AI) research. Currently, little is known regarding the available number and content of these databases. This review aimed to describe the availability, accessibility, and usability of publicly available colonoscopic imaging databases, focusing on polyp detection, polyp characterization, and quality of colonoscopy.
Methods: First, a systematic literature search was performed in MEDLINE and Embase to identify AI studies describing publicly available colonoscopic imaging databases published after 2010. Second, a targeted search using Google's Dataset Search, Google Search, GitHub, and Figshare was done to identify databases directly. Databases were included if they contained data about polyp detection, polyp characterization, or quality of colonoscopy. To assess accessibility of databases, the following categories were defined: open access, open access with barriers, and regulated access. To assess the potential usability of the included databases, essential details of each database were extracted using a checklist derived from the Checklist for Artificial Intelligence in Medical Imaging.
Results: We identified 22 databases with open access, 3 databases with open access with barriers, and 15 databases with regulated access. The 22 open access databases contained 19,463 images and 952 videos. Nineteen of these databases focused on polyp detection, localization, and/or segmentation; 6 on polyp characterization; and 3 on quality of colonoscopy. Only half of these databases have been used by other researchers to develop, train, or benchmark their AI systems. Although technical details were in general well reported, important details such as polyp and patient demographics and the annotation process were under-reported in almost all databases.
Conclusions: This review provides greater insight into the public availability of colonoscopic imaging databases for AI research. Incomplete reporting of important details limits the ability of researchers to assess the usability of current databases.
Original language: English
Pages (from-to): 184-199.e16
Journal: Gastrointestinal Endoscopy
Issue number: 2
Early online date: 2022
Publication status: Published - Feb 2023


  • Artificial Intelligence
  • Colonic Polyps/diagnostic imaging
  • Colonoscopes
  • Colonoscopy/methods
  • Humans
  • Radiography
