Showing 1–16 of 16 results for author: de Vries, A P

  1. Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search

    Authors: Hideaki Joko, Shubham Chatterjee, Andrew Ramsay, Arjen P. de Vries, Jeff Dalton, Faegheh Hasibi

    Abstract: The future of conversational agents will provide users with personalized information responses. However, a significant challenge in developing models is the lack of large-scale dialogue datasets that span multiple sessions and reflect real-world user preferences. Previous approaches rely on experts in a wizard-of-oz setup that is difficult to scale, particularly for personalized tasks. Our method,…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted at SIGIR 2024 (Full Paper)

  2. MMEAD: MS MARCO Entity Annotations and Disambiguations

    Authors: Chris Kamphuis, Aileen Lin, Siwen Yang, Jimmy Lin, Arjen P. de Vries, Faegheh Hasibi

    Abstract: MMEAD, or MS MARCO Entity Annotations and Disambiguations, is a resource for entity links for the MS MARCO datasets. We specify a format to store and share links for both document and passage collections of MS MARCO. Following this specification, we release entity links to Wikipedia for documents and passages in both MS MARCO collections (v1 and v2). Entity links have been produced by the REL and…

    Submitted 14 September, 2023; originally announced September 2023.

  3. ORCAS-I: Queries Annotated with Intent using Weak Supervision

    Authors: Daria Alexander, Wojciech Kusa, Arjen P. de Vries

    Abstract: User intent classification is an important task in information retrieval. In this work, we introduce a revised taxonomy of user intent. We take the widely used differentiation between navigational, transactional and informational queries as a starting point, and identify three different sub-classes for the informational queries: instrumental, factual and abstain. The resulting classification of us…

    Submitted 27 September, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: presented at SIGIR 2022 (resource track)

  4. Entity-aware Transformers for Entity Search

    Authors: Emma J. Gerritse, Faegheh Hasibi, Arjen P. de Vries

    Abstract: Pre-trained language models such as BERT have been a key ingredient to achieve state-of-the-art results on a variety of tasks in natural language processing and, more recently, also in information retrieval. Recent research even claims that BERT is able to capture factual knowledge about entity relations and properties, the information that is commonly obtained from knowledge graphs. This paper inv…

    Submitted 2 May, 2022; originally announced May 2022.

    ACM Class: H.3.3

  5. Conversational Entity Linking: Problem Definition and Datasets

    Authors: Hideaki Joko, Faegheh Hasibi, Krisztian Balog, Arjen P. de Vries

    Abstract: Machine understanding of user utterances in conversational systems is of utmost importance for enabling engaging and meaningful conversations with users. Entity Linking (EL) is one of the means of text understanding, with proven efficacy for various downstream tasks in information retrieval. In this paper, we study entity linking for conversational systems. To develop a better understanding of wha…

    Submitted 11 May, 2021; originally announced May 2021.

    ACM Class: H.3

  6. Bias in Conversational Search: The Double-Edged Sword of the Personalized Knowledge Graph

    Authors: Emma J. Gerritse, Faegheh Hasibi, Arjen P. de Vries

    Abstract: Conversational AI systems are being used in personal devices, providing users with highly personalized content. Personalized knowledge graphs (PKGs) are one of the recently proposed methods to store users' information in a structured form and tailor answers to their liking. Personalization, however, is prone to amplifying bias and contributing to the echo-chamber phenomenon. In this paper, we disc…

    Submitted 20 October, 2020; originally announced October 2020.

    ACM Class: H.3.3

  7. REL: An Entity Linker Standing on the Shoulders of Giants

    Authors: Johannes M. van Hulst, Faegheh Hasibi, Koen Dercksen, Krisztian Balog, Arjen P. de Vries

    Abstract: Entity linking is a standard component in modern retrieval systems that is often performed by third-party toolkits. Despite the plethora of open source options, it is difficult to find a single system that has a modular architecture where certain components may be replaced, does not depend on external sources, can easily be updated to newer Wikipedia versions, and, most important of all, has state-…

    Submitted 2 June, 2020; originally announced June 2020.

    ACM Class: H.3

  8. Graph-Embedding Empowered Entity Retrieval

    Authors: Emma J. Gerritse, Faegheh Hasibi, Arjen P. de Vries

    Abstract: In this research, we improve upon the current state of the art in entity retrieval by re-ranking the result list using graph embeddings. The paper shows that graph embeddings are useful for entity-oriented search tasks. We demonstrate empirically that encoding information from the knowledge graph into (graph) embeddings contributes to a higher increase in effectiveness of entity retrieval results…

    Submitted 6 May, 2020; originally announced May 2020.

    Journal ref: Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science, vol 12035. Springer,

  9. arXiv:1905.04577

    cs.IR cs.HC

    Information search in a professional context - exploring a collection of professional search tasks

    Authors: Suzan Verberne, Jiyin He, Gineke Wiggers, Tony Russell-Rose, Udo Kruschwitz, Arjen P. de Vries

    Abstract: Search conducted in a work context is an everyday activity that has been around since long before the Web was invented, yet we still seem to understand little about its general characteristics. With this paper we aim to contribute to a better understanding of this large but rather multi-faceted area of 'professional search'. Unlike task-based studies that aim at measuring the effectiveness of sear…

    Submitted 11 May, 2019; originally announced May 2019.

    Comments: 5 pages, 2 figures

  10. arXiv:1804.11131

    cs.IR

    Author-topic profiles for academic search

    Authors: Suzan Verberne, Arjen P. de Vries, Wessel Kraaij

    Abstract: We implemented and evaluated a two-stage retrieval method for personalized academic search in which the initial search results are re-ranked using an author-topic profile. In academic search tasks, the user's own data can help optimize the ranking of search results to match the searcher's specific individual needs. The author-topic profile consists of topic-specific terms, stored in a graph. We…

    Submitted 30 April, 2018; originally announced April 2018.

    Comments: 13 pages, 1 figure

  11. arXiv:1712.08355

    cs.IR

    Ranking Triples using Entity Links in a Large Web Crawl - The Chicory Triple Scorer at WSDM Cup 2017

    Authors: Frank Dorssers, Arjen P. de Vries, Wouter Alink, Roberto Cornacchia

    Abstract: This paper describes the participation of team Chicory in the Triple Ranking Challenge of the WSDM Cup 2017. Our approach deploys a large collection of entity tagged web data to estimate the correctness of the relevance relation expressed by the triples, in combination with a baseline approach using Wikipedia abstracts following [1]. Relevance estimations are drawn from ClueWeb12 annotated by Goog…

    Submitted 22 December, 2017; originally announced December 2017.

    Comments: Triple Scorer at WSDM Cup 2017, see arXiv:1712.08081

    ACM Class: H.3

  12. Exploring Deep Space: Learning Personalized Ranking in a Semantic Space

    Authors: Jeroen B. P. Vuurens, Martha Larson, Arjen P. de Vries

    Abstract: Recommender systems leverage both content and user interactions to generate recommendations that fit users' preferences. The recent surge of interest in deep learning presents new opportunities for exploiting these two sources of information. To recommend items we propose to first learn a user-independent high-dimensional semantic space in which items are positioned according to their substitutabi…

    Submitted 22 August, 2016; v1 submitted 31 July, 2016; originally announced August 2016.

    Comments: 6 pages, RecSys 2016 RSDL workshop

  13. arXiv:1606.07822

    cs.CL cs.DC

    Efficient Parallel Learning of Word2Vec

    Authors: Jeroen B. P. Vuurens, Carsten Eickhoff, Arjen P. de Vries

    Abstract: Since its introduction, Word2Vec and its variants have been widely used to learn semantics-preserving representations of words or entities in an embedding space, which can be used to produce state-of-the-art results for various Natural Language Processing tasks. Existing implementations aim to learn efficiently by running multiple threads in parallel while operating on a single model in shared memory, ignor…

    Submitted 24 June, 2016; originally announced June 2016.

    Comments: ICML 2016 Machine Learning workshop

  14. arXiv:1212.2287

    cs.DB cs.IR cs.LG

    Runtime Optimizations for Prediction with Tree-Based Models

    Authors: Nima Asadi, Jimmy Lin, Arjen P. de Vries

    Abstract: Tree-based models have proven to be an effective solution for web ranking as well as other problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, given an already-trained model. Although exceedingly simple conceptually, most implementations of tree-based models do not efficiently utilize modern superscalar processor archit…

    Submitted 26 April, 2013; v1 submitted 10 December, 2012; originally announced December 2012.

  15. arXiv:1107.1104

    cs.DB

    SERIMI - Resource Description Similarity, RDF Instance Matching and Interlinking

    Authors: Samur Araujo, Jan Hidders, Daniel Schwabe, Arjen P. de Vries

    Abstract: The interlinking of datasets published in the Linked Data Cloud is a challenging problem and a key factor for the success of the Semantic Web. Manual rule-based methods are the most effective solution for the problem, but they require skilled human data publishers going through a laborious, error-prone and time-consuming process for manually describing rules mapping instances between two datasets.…

    Submitted 6 July, 2011; originally announced July 2011.

  16. arXiv:1106.5213

    cs.IR

    Personalised Travel Recommendation based on Location Co-occurrence

    Authors: Maarten Clements, Pavel Serdyukov, Arjen P. de Vries, Marcel J. T. Reinders

    Abstract: We propose a new task of recommending touristic locations based on a user's visiting history in a geographically remote region. This can be used to plan a touristic visit to a new city or country, or by travel agencies to provide personalised travel deals. A set of geotags is used to compute a location similarity model between two different regions. The similarity between two landmarks is derive…

    Submitted 26 June, 2011; originally announced June 2011.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    ACM Class: H.3.5; H.3.3; H.2.8