The Effectiveness of Phrase Skip-Gram in Primary Care NLP for the Prediction of Lung Cancer

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

Neural models that capture context dependency in text are computationally expensive. We compare the effectiveness (predictive performance) and efficiency (computational effort) of a context-independent Phrase Skip-Gram (PSG) model and a contextualized Hierarchical Attention Network (HAN) model for early prediction of lung cancer using free-text patient files from Dutch primary care physicians. PSG performed comparably to HAN (AUROC 0.74 (0.69–0.79) versus 0.73 (0.68–0.78)), achieved better calibration, had far fewer parameters (301 versus > 300k), and was much faster (36 versus 460 s). This demonstrates an important case in which complex contextualized neural models were not required.

Original language: English
Title of host publication: Artificial Intelligence in Medicine - 19th International Conference on Artificial Intelligence in Medicine, AIME 2021, Proceedings
Editors: Allan Tucker, Pedro Henriques Abreu, Jaime Cardoso, Pedro Pereira Rodrigues, David Riaño
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 433-437
Number of pages: 5
Volume: 12721 LNAI
ISBN (Print): 9783030772109
Publication status: Published - 2021
Event: 19th International Conference on Artificial Intelligence in Medicine, AIME 2021 - Virtual, Online
Duration: 15 Jun 2021 – 18 Jun 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12721 LNAI

Conference

Conference: 19th International Conference on Artificial Intelligence in Medicine, AIME 2021
City: Virtual, Online
Period: 15/06/2021 – 18/06/2021

Keywords

  • Cancer
  • Deep learning
  • N-Grams
  • Phrase skip-gram
  • Prediction models
  • Primary care
  • Word embeddings
