Weekly LLM + Recommender Systems Paper Digest

Hello everyone, this is 小夏机器人's weekly recommender-systems paper digest, bringing you the latest papers from top venues.

BookGPT: A General Framework for Book Recommendation Empowered by Large Language Model

comment: Under Review

reference: None

http://arxiv.org/abs/2305.15673v1

With the continuous development and change exhibited by large language model (LLM) technology, represented by generative pretrained transformers (GPTs), many classic scenarios in various fields have re-emerged with new opportunities. This paper takes ChatGPT as the modeling object, incorporates LLM technology into the typical book resource understanding and recommendation scenario for the first time, and puts it into practice. By building a ChatGPT-like book recommendation system (BookGPT) framework based on ChatGPT, this paper attempts to apply ChatGPT to recommendation modeling for three typical tasks, book rating recommendation, user rating recommendation, and book summary recommendation, and explores the feasibility of LLM technology in book recommendation scenarios. At the same time, based on different evaluation schemes for book recommendation tasks and the existing classic recommendation models, this paper discusses the advantages and disadvantages of the BookGPT in book recommendation scenarios and analyzes the opportunities and improvement directions for subsequent LLMs in these scenarios.

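The paper does not ship code, but the prompting setup it describes is easy to picture. Below is a minimal sketch of the book rating prediction task, assuming the `openai` Python client; the prompt wording, model choice, and `predict_book_rating` helper are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of prompt-based book rating prediction in the spirit of
# BookGPT. The prompt wording, model choice, and helper function are
# illustrative assumptions, not the paper's actual implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def predict_book_rating(title: str, author: str) -> str:
    """Ask the LLM to estimate a 1-5 reader rating for a book (hypothetical prompt)."""
    prompt = (
        "You are a book recommendation assistant.\n"
        "Estimate the average reader rating (1-5, one decimal) for:\n"
        f"Title: {title}\n"
        f"Author: {author}\n"
        "Answer with the number only."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output, easier to evaluate
    )
    return response.choices[0].message.content.strip()

print(predict_book_rating("The Three-Body Problem", "Liu Cixin"))
```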

Text Is All You Need: Learning Language Representations for Sequential Recommendation

comment: accepted to KDD 2023

reference: None

http://arxiv.org/abs/2305.13731v1

Sequential recommendation aims to model dynamic user behavior from historical interactions. Existing methods rely on either explicit item IDs or general textual features for sequence modeling to understand user preferences. While promising, these approaches still struggle to model cold-start items or transfer knowledge to new datasets. In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets. To this end, we present a novel framework, named Recformer, which effectively learns language representations for sequential recommendation. Specifically, we propose to formulate an item as a "sentence" (word sequence) by flattening item key-value attributes described by text so that an item sequence for a user becomes a sequence of sentences. For recommendation, Recformer is trained to understand the "sentence" sequence and retrieve the next "sentence". To encode item sequences, we design a bi-directional Transformer similar to the model Longformer but with different embedding layers for sequential recommendation. For effective representation learning, we propose novel pretraining and finetuning methods which combine language understanding and recommendation tasks. Therefore, Recformer can effectively recommend the next item based on language representations. Extensive experiments conducted on six datasets demonstrate the effectiveness of Recformer for sequential recommendation, especially in low-resource and cold-start settings.

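Recformer's central move, formulating an item as a "sentence" by flattening its key-value attributes, can be sketched in a few lines. The attribute names and joining format below are illustrative assumptions; the abstract also mentions dedicated embedding layers on top of this flattening, which the sketch omits.

```python
# A minimal sketch of Recformer's item-as-"sentence" formulation: flatten an
# item's textual key-value attributes into one word sequence, so that a user's
# interaction history becomes a sequence of sentences. Attribute names and the
# joining format are illustrative assumptions.
def item_to_sentence(attributes: dict) -> str:
    """Flatten key-value attributes into a single 'sentence' (word sequence)."""
    return " ".join(f"{key} {value}" for key, value in attributes.items())

history = [
    {"title": "Wireless Mouse", "brand": "Logitech", "category": "Electronics"},
    {"title": "USB-C Cable 2m", "brand": "Anker", "category": "Cables"},
]

# The user's item sequence becomes a sequence of sentences, which the
# Longformer-style bi-directional encoder then consumes.
sentence_sequence = [item_to_sentence(item) for item in history]
for sentence in sentence_sequence:
    print(sentence)
```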

Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models

comment: work in progress

reference: None

http://arxiv.org/abs/2305.13112v1

The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs), which rely on natural language conversations to satisfy user needs. In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol. It might over-emphasize the matching with the ground-truth items or utterances generated by human annotators, while neglecting the interactive nature of being a capable CRS. To overcome the limitation, we further propose an interactive Evaluation approach based on LLMs named iEvaLM that harnesses LLM-based user simulators. Our evaluation approach can simulate various interaction scenarios between users and systems. Through the experiments on two publicly available CRS datasets, we demonstrate notable improvements compared to the prevailing evaluation protocol. Furthermore, we emphasize the evaluation of explainability, and ChatGPT showcases persuasive explanation generation for its recommendations. Our study contributes to a deeper comprehension of the untapped potential of LLMs for CRSs and provides a more flexible and easy-to-use evaluation framework for future research endeavors. The codes and data are publicly available at https://github.com/RUCAIBox/iEvaLM-CRS.

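For a concrete sense of what an LLM-based user simulator does, here is a minimal sketch of one evaluation episode. The prompt wording and the `crs_respond` callable standing in for the CRS under test are assumptions; the authors' actual protocol is in the linked repository.

```python
# A minimal sketch of an LLM-simulated evaluation episode in the spirit of
# iEvaLM. The prompt wording and the `crs_respond` callable (the CRS under
# test) are assumptions; the authors' real protocol is in their repository.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def simulated_user_turn(target_item: str, dialogue: list) -> str:
    """The LLM plays a user seeking `target_item` without naming it outright."""
    prompt = (
        f"You are a user looking for a movie like '{target_item}'. "
        "Do not reveal the title itself. Reply to the system's last message "
        "in one sentence.\n\nDialogue so far:\n" + "\n".join(dialogue)
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

def run_episode(crs_respond, target_item: str, max_turns: int = 5) -> bool:
    """Alternate simulator and CRS turns; succeed if the target gets recommended."""
    dialogue = ["System: Hi! What kind of movie are you in the mood for?"]
    for _ in range(max_turns):
        dialogue.append("User: " + simulated_user_turn(target_item, dialogue))
        reply = crs_respond(dialogue)  # CRS under evaluation (assumed interface)
        dialogue.append("System: " + reply)
        if target_item.lower() in reply.lower():
            return True  # hit: target recommended within the turn budget
    return False
```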

Large Language Models are Zero-Shot Rankers for Recommender Systems

comment: None

reference: None

http://arxiv.org/abs/2305.08845v1

Recently, large language models (LLMs) (e.g. GPT-4) have demonstrated impressive general-purpose task-solving abilities, including the potential to approach recommendation tasks. Along this line of research, this work aims to investigate the capacity of LLMs that act as the ranking model for recommender systems. To conduct our empirical study, we first formalize the recommendation problem as a conditional ranking task, considering sequential interaction histories as conditions and the items retrieved by the candidate generation model as candidates. We adopt a specific prompting approach to solving the ranking task by LLMs: we carefully design the prompting template by including the sequential interaction history, the candidate items, and the ranking instruction. We conduct extensive experiments on two widely-used datasets for recommender systems and derive several key findings for the use of LLMs in recommender systems. We show that LLMs have promising zero-shot ranking abilities, even competitive to or better than conventional recommendation models on candidates retrieved by multiple candidate generators. We also demonstrate that LLMs struggle to perceive the order of historical interactions and can be affected by biases like position bias, while these issues can be alleviated via specially designed prompting and bootstrapping strategies. The code to reproduce this work is available at https://github.com/RUCAIBox/LLMRank.

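The conditional-ranking formulation is essentially a prompt-construction problem: condition on the interaction history, list the candidates, and append a ranking instruction. A minimal sketch follows; the wording is an assumption, not the paper's exact template.

```python
# A minimal sketch of the conditional-ranking prompt: interaction history as
# the condition, retrieved items as candidates, plus a ranking instruction.
# The wording is an illustrative assumption, not the paper's exact template.
def build_ranking_prompt(history: list, candidates: list) -> str:
    """Assemble history, candidates, and instruction into one ranking prompt."""
    history_block = "\n".join(
        f"{i + 1}. {title}" for i, title in enumerate(history)
    )
    candidate_block = "\n".join(
        f"[{chr(ord('A') + i)}] {title}" for i, title in enumerate(candidates)
    )
    return (
        "I watched the following movies in this order:\n"
        f"{history_block}\n\n"
        "Rank the candidates below by how likely I am to watch them next, "
        "most likely first. Answer with the letters only.\n"
        f"{candidate_block}"
    )

prompt = build_ranking_prompt(
    history=["Toy Story", "Finding Nemo", "Up"],
    candidates=["Coco", "The Godfather", "Inside Out", "Heat"],
)
print(prompt)
```

A bootstrapping strategy of the kind the abstract mentions could be layered on top, for example by shuffling the candidate order across repeated prompts and aggregating the returned rankings to dilute position bias.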
