Crowd-sourced and expert video assessment in minimally invasive esophagectomy

Mirte H. M. Ketel, Bastiaan R. Klarenbeek, Yassin Eddahchouri, Miguel A. Cuesta, Elke van Daele, Christian A. Gutschow, Arnulf H. Hölscher, Michal Hubka, Misha D. P. Luyer, Robert E. Merritt, Grard A. P. Nieuwenhuijzen, Yaxing Shen, Inger L. Abma, Camiel Rosman, Frans van Workum

Research output: Contribution to journal › Article (Academic) › peer-review

Abstract

Background: Video-based assessment by experts may structurally measure surgical performance using procedure-specific competency assessment tools (CATs). A CAT for minimally invasive esophagectomy (MIE-CAT) was developed and validated previously. However, surgeons' time is scarce, and video assessment is time-consuming and labor-intensive. This study investigated whether non-procedure-specific assessment of MIE video clips by MIE experts and by crowdsourcing (collective surgical performance evaluation by anonymous, untrained laypeople) could assist procedure-specific expert review.

Methods: Two surgical performance scoring frameworks were used to assess eight MIE videos. First, global performance was assessed with the non-procedure-specific Global Operative Assessment of Laparoscopic Skills (GOALS) on 64 procedural phase-based video clips of < 10 min each. Each clip was assessed by two MIE experts and by > 30 crowd workers. Second, the same experts assessed procedure-specific performance with the MIE-CAT on the corresponding full-length videos. Reliability and convergent validity of GOALS for MIE were investigated using hypothesis testing with correlations (experience, blood loss, operative time, and MIE-CAT).

Results: Fewer than 75% of the hypothesized correlations between GOALS scores and experience of the surgical team (r < 0.3), blood loss (r = −0.82 to 0.02), operative time (r = −0.42 to 0.07), and MIE-CAT scores (r = −0.04 to 0.76) were met, for both crowd workers and experts. Interestingly, experts' GOALS and MIE-CAT scores correlated strongly (r = 0.40 to 0.79), whereas correlations between crowd workers' GOALS scores and experts' MIE-CAT scores were weak (r = −0.04 to 0.49). Expert and crowd worker GOALS scores correlated poorly (ICC ≤ 0.42).

Conclusion: GOALS assessments by crowd workers lacked convergent validity and showed poor reliability; MIE is likely technically too difficult for laypeople to assess. Convergent validity of GOALS assessments by experts could also not be established, and GOALS may not be comprehensive enough to assess detailed MIE performance. However, experts' GOALS and MIE-CAT scores correlated strongly, indicating that video clip (instead of full-length video) assessments could be useful to shorten assessment time.

Graphical abstract: [Figure not available: see fulltext.]

Original language: English
Pages (from-to): 7819–7828
Number of pages: 10
Journal: Surgical Endoscopy
Volume: 37
Issue number: 10
Early online date: 2023
DOIs
Publication status: Published - Oct 2023

Keywords

  • Competency assessment tool
  • Crowdsourcing
  • Esophagectomy
  • GOALS
  • Surgical performance assessment
  • Video assessment
