LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking

Authors: Amy Xin, Yunjia Qi, Zijun Yao, Fangwei Zhu, Kaisheng Zeng, Xu Bin, Lei Hou, Juanzi Li

Abstract: Entity Linking (EL) models are well-trained at mapping mentions to their corresponding entities according to a given context. However, EL models struggle to disambiguate long-tail entities due to their limited training data. Meanwhile, large language models (LLMs) are more robust at interpreting uncommon mentions. Yet, due to a lack of specialized training, LLMs suffer at generating correct entity IDs. Furthermore, training an LLM to perform EL is cost-intensive. Building upon these insights, we introduce LLM-Augmented Entity Linking (LLMAEL), a plug-and-play approach to enhance entity linking through LLM data augmentation. We leverage LLMs as knowledgeable context augmenters, generating mention-centered descriptions as additional input, while preserving traditional EL models for task-specific processing. Experiments on 6 standard datasets show that the vanilla LLMAEL outperforms baseline EL models in most cases, while the fine-tuned LLMAEL sets new state-of-the-art results across all 6 benchmarks.

Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

arXiv:2402.02842

Trinity: Syncretizing Multi-/Long-tail/Long-term Interests All in One

Authors: Jing Yan, Liu Jiang, Jianfei Cui, Zhichen Zhao, Xingyan Bin, Feng Zhang, Zuotao Liu

Abstract: Interest modeling in recommender systems has been a constant topic for improving user experience, and typical interest modeling tasks (e.g. multi-interest, long-tail interest and long-term interest) have been investigated in many existing works. However, most of them only consider one interest in isolation, while neglecting their interrelationships. In this paper, we argue that these tasks suffer from a common "interest amnesia" problem, and a solution exists to mitigate it simultaneously. We figure that long-term cues can be the cornerstone since they reveal multi-interest and clarify long-tail interest. Inspired by the observation, we propose a novel and unified framework in the retrieval stage, "Trinity", to solve the interest amnesia problem and improve multiple interest modeling tasks. We construct a real-time clustering system that enables us to project items into enumerable clusters, and calculate statistical interest histograms over these clusters. Based on these histograms, Trinity recognizes underdelivered themes and remains stable when facing emerging hot topics. Trinity is more appropriate for large-scale industry scenarios because of its modest computational overheads. Its derived retrievers have been deployed on the recommender system of Douyin, significantly improving user experience and retention. We believe that such practical experience can be well generalized to other scenarios.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2007.07203

Deep Retrieval: Learning A Retrievable Structure for Large-Scale Recommendations

Authors: Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu

Abstract: One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model, and then use some approximate nearest neighbor (ANN) search algorithm to find top candidates. In this paper, we present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data (e.g. clicks) without resorting to the Euclidean space assumption in ANN algorithms. DR's structure encodes all candidate items into a discrete latent space. Those latent codes for the candidates are model parameters and learnt together with other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the structure is performed to retrieve the top candidates for reranking. Empirically, we first demonstrate that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline on two public datasets. Moreover, we show that, in a live production recommendation system, a deployed DR approach significantly outperforms a well-tuned ANN baseline in terms of engagement metrics. To the best of our knowledge, DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.

Submitted 18 May, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: 9 pages, 6 figures

arXiv:1308.3372

Objective Information Theory: A Sextuple Model and 9 Kinds of Metrics

Authors: Xu Jianfeng, Tang Jun, Ma Xuefeng, Xu Bin, Shen Yanli, Qiao Yongjie

Abstract: In the contemporary era, the importance of information is undisputed, but there has never been a common understanding of information, nor a unanimous conclusion to the research on information metrics. Based on previous studies, this paper analyzes the important achievements in research on the properties and metrics of information as well as their main insufficiencies, and explores the essence and connotation, the mathematical expressions and other basic problems related to information. On the basis of the understanding of the objectivity of information, it proposes the definitions and a Sextuple model of information; discusses the basic properties of information; and brings forward the definitions and mathematical expressions of nine kinds of metrics of information, i.e., extensity, detailedness, sustainability, containability, delay, richness, distribution, validity and matchability. Through these, this paper establishes a basic theoretical framework of Objective Information Theory to support the analysis and research on information and information systems systematically and comprehensively.

Submitted 3 April, 2014; v1 submitted 15 August, 2013; originally announced August 2013.

Comments: 20 pages

Showing 1–4 of 4 results for author: Bin, X