Recommendations for item set completion: on the semantics of item co-occurrence with data sparsity, input size, and input modalities

I. Vagliano, L. Galke, A. Scherp

Research output: Contribution to journal › Article › Academic › peer-review

3 Citations (Scopus)

Abstract

We address the problem of recommending relevant items to a user in order to “complete” a partial set of already-known items. We consider the two scenarios of citation and subject label recommendation, which resemble different semantics of item co-occurrence: relatedness for co-citations and diversity for subject labels. We assess the influence of the completeness of an already-known partial item set on the recommender’s performance. We also investigate data sparsity by imposing a pruning threshold on minimum item occurrence, as well as the influence of using additional metadata. As models, we focus on different autoencoders, which are particularly suited for reconstructing missing items in a set. We extend autoencoders to exploit a multi-modal input of text and structured data. Our experiments on six real-world datasets show that supplying the partial item set as input is usually helpful when item co-occurrence resembles relatedness, while metadata are effective when co-occurrence implies diversity. The simple item co-occurrence model is a strong baseline for citation recommendation but can also provide good results for subject labels. Autoencoders can exploit additional metadata besides the partial item set as input, and achieve comparable or better performance. For the subject label recommendation task, the title is the most important attribute. Adding more input modalities sometimes even harms the results. In conclusion, it is crucial to consider the semantics of the item co-occurrence when choosing an appropriate model and to decide carefully which metadata to exploit.
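As an illustration of the item co-occurrence baseline and the pruning threshold mentioned in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: the function names (`prune_items`, `cooccurrence_counts`, `recommend`), the summed-count scoring rule, and the toy item sets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def prune_items(item_sets, min_count):
    """Drop items that occur in fewer than `min_count` sets (hypothetical pruning step)."""
    counts = Counter(item for s in item_sets for item in set(s))
    return [{i for i in s if counts[i] >= min_count} for s in item_sets]


def cooccurrence_counts(item_sets):
    """Count, symmetrically, how often two distinct items appear in the same set."""
    co = Counter()
    for s in item_sets:
        for a, b in combinations(sorted(s), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def recommend(partial, co, vocabulary, k=3):
    """Rank items outside `partial` by their summed co-occurrence with its members."""
    scores = {
        cand: sum(co[(known, cand)] for known in partial)
        for cand in vocabulary
        if cand not in partial
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:k] if score > 0]


# Toy data (assumed): each set stands for, e.g., the references of one paper.
training_sets = [
    {"A", "B", "C"},
    {"A", "B", "D"},
    {"B", "C"},
    {"A", "E"},
]
pruned = prune_items(training_sets, min_count=2)  # removes the rare items D and E
co = cooccurrence_counts(pruned)
vocabulary = set().union(*pruned)
print(recommend({"A"}, co, vocabulary))
```

Raising `min_count` shrinks the vocabulary and makes the remaining data denser, which is one simple way to vary sparsity in such a setup. A model of this kind uses only the partial item set and ignores metadata such as titles.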
Original language: English
Pages (from-to): 269-305
Number of pages: 37
Journal: Information Retrieval Journal
Volume: 25
Issue number: 3
Early online date: 2022
Publication status: Published - Sept 2022

Keywords

  • Autoencoders
  • Citation recommendation
  • Cold start
  • Data sparsity
  • Recommender systems
  • Subject label recommendation
