
Showing 1–11 of 11 results for author: Azkune, G

Searching in archive cs.
  1. arXiv:2406.09952  [pdf, other]

    cs.CV cs.CL cs.LG

    BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval

    Authors: Imanol Miranda, Ander Salaberria, Eneko Agirre, Gorka Azkune

    Abstract: Existing Vision-Language Compositionality (VLC) benchmarks like SugarCrepe are formulated as image-to-text retrieval problems, where, given an image, the models need to select between the correct textual description and a synthetic hard negative text. In this work we present the Bidirectional Vision-Language Compositionality (BiVLC) dataset. The novelty of BiVLC is to add a synthetic hard negative…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2406.07302  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    BertaQA: How Much Do Language Models Know About Local Culture?

    Authors: Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe

    Abstract: Large Language Models (LLMs) exhibit extensive knowledge about the world, but most evaluations have been limited to global or anglocentric subjects. This raises the question of how well these models perform on topics relevant to other cultures, whose presence on the web is not that prominent. To address this gap, we introduce BertaQA, a multiple-choice trivia dataset that is parallel in English an…

    Submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2404.19705  [pdf, other]

    cs.CL cs.IR

    When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively

    Authors: Tiziano Labruna, Jon Ander Campos, Gorka Azkune

    Abstract: In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the par…

    Submitted 6 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  4. Grounding Spatial Relations in Text-Only Language Models

    Authors: Gorka Azkune, Ander Salaberria, Eneko Agirre

    Abstract: This paper shows that text-only Language Models (LM) can learn to ground spatial relations like "left of" or "below" if they are provided with explicit location information of objects and they are properly trained to leverage those locations. We perform experiments on a verbalized version of the Visual Spatial Reasoning (VSR) dataset, where images are coupled with textual statements which contain…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted in Neural Networks

  5. arXiv:2403.00587  [pdf, other]

    cs.CV cs.AI

    Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset

    Authors: Ander Salaberria, Gorka Azkune, Oier Lopez de Lacalle, Aitor Soroa, Eneko Agirre, Frank Keller

    Abstract: Existing work has observed that current text-to-image systems do not accurately reflect explicit spatial relations between objects such as 'left of' or 'below'. We hypothesize that this is because explicit spatial relations rarely appear in the image captions used to train these models. We propose an automatic method that, given existing images, generates synthetic captions that contain 14 explici…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 12 pages and 5 figures

  6. arXiv:2310.09350  [pdf, other]

    cs.CL cs.AI

    Unsupervised Domain Adaption for Neural Information Retrieval

    Authors: Carlos Dominguez, Jon Ander Campos, Eneko Agirre, Gorka Azkune

    Abstract: Neural information retrieval requires costly annotated data for each target domain to be competitive. Synthetic annotation by query generation using Large Language Models or rule-based string manipulation has been proposed as an alternative, but their relative merits have not been analysed. In this paper, we compare both methods head-to-head using the same neural IR architecture. We focus on the B…

    Submitted 13 October, 2023; originally announced October 2023.

  7. arXiv:2308.01223  [pdf, other]

    cs.CL cs.AI cs.LG

    Do Multilingual Language Models Think Better in English?

    Authors: Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe

    Abstract: Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system, and running inference over the translated input. However, these improvements can be attributed to the use of a separate translation system, which is typically trained on large amounts of parallel da…

    Submitted 2 August, 2023; originally announced August 2023.

  8. Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering

    Authors: Ander Salaberria, Gorka Azkune, Oier Lopez de Lacalle, Aitor Soroa, Eneko Agirre

    Abstract: Integrating outside knowledge for reasoning in visio-linguistic tasks such as visual question answering (VQA) is an open problem. Given that pretrained language models have been shown to include world knowledge, we propose to use a unimodal (text-only) train and inference procedure based on automatic off-the-shelf captioning of images and pretrained language models. Our results on a visual questio…

    Submitted 25 March, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: Under review. 25 pages with 4 figures

    Journal ref: Expert Systems with Applications, Volume 212, 2023, 118669

  9. Inferring spatial relations from textual descriptions of images

    Authors: Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre

    Abstract: Generating an image from its textual description requires both a certain level of language understanding and common sense knowledge about the spatial relations of the physical entities being described. In this work, we focus on inferring the spatial relation between entities, a key step in the process of composing scenes based on text. More specifically, given a caption containing a mention to a s…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: Accepted in Pattern Recognition

    Journal ref: Pattern Recognition, Volume 113, 2021, 107847

  10. arXiv:2011.00615  [pdf, other]

    cs.CL

    Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning

    Authors: Jon Ander Campos, Kyunghyun Cho, Arantxa Otegi, Aitor Soroa, Gorka Azkune, Eneko Agirre

    Abstract: The interaction of conversational systems with users poses an exciting opportunity for improving them after deployment, but little evidence has been provided of its feasibility. In most applications, users are not able to provide the correct answer to the system, but they are able to provide binary (correct, incorrect) feedback. In this paper we propose feedback-weighted learning based on importan…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Accepted at COLING 2020. 11 pages, 5 figures

  11. arXiv:2004.01894  [pdf, other]

    cs.CL

    Evaluating Multimodal Representations on Visual Semantic Textual Similarity

    Authors: Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune, Eneko Agirre

    Abstract: The combination of visual and textual representations has produced excellent results in tasks such as image captioning and visual question answering, but the inference capabilities of multimodal representations are largely untested. In the case of textual representations, inference tasks such as Textual Entailment and Semantic Textual Similarity have been often used to benchmark the quality of tex…

    Submitted 4 April, 2020; originally announced April 2020.

    Comments: Accepted in ECAI-2020, 8 pages, 6 tables, 6 figures