Showing 1–5 of 5 results for author: Boussaid, H

Searching in archive cs.
  1. arXiv:2407.08669

    cs.CV

    Segmentation-guided Attention for Visual Question Answering from Remote Sensing Images

    Authors: Lucrezia Tosato, Hichem Boussaid, Flora Weissgerber, Camille Kurtz, Laurent Wendling, Sylvain Lobry

    Abstract: Visual Question Answering for Remote Sensing (RSVQA) is a task that aims at answering natural language questions about the content of a remote sensing image. The visual features extraction is therefore an essential step in a VQA pipeline. By incorporating attention mechanisms into this process, models gain the ability to focus selectively on salient regions of the image, prioritizing the most rele…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to IGARSS 2024

  2. arXiv:2311.14063

    cs.CV cs.CL cs.LG

    Do VSR Models Generalize Beyond LRS3?

    Authors: Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Eustache Le Bihan, Haithem Boussaid, Ebtessam Almazrouei, Merouane Debbah

    Abstract: The Lip Reading Sentences-3 (LRS3) benchmark has primarily been the focus of intense research in visual speech recognition (VSR) during the last few years. As a result, there is an increased risk of overfitting to its excessively used test set, which is only one hour duration. To alleviate this issue, we build a new VSR test set named WildVSR, by closely following the LRS3 dataset creation process…

    Submitted 23 November, 2023; originally announced November 2023.

  3. arXiv:2308.06112

    cs.SD cs.CL eess.AS

    Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

    Authors: Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Haithem Boussaid, Ebtessam Almazrouei, Merouane Debbah

    Abstract: Visual Speech Recognition (VSR) differs from the common perception tasks as it requires deeper reasoning over the video sequence, even by human experts. Despite the recent advances in VSR, current approaches rely on labeled data to fully train or finetune their models predicting the target speech. This hinders their ability to generalize well beyond the training set and leads to performance degene…

    Submitted 11 August, 2023; originally announced August 2023.

  4. arXiv:2210.01713

    eess.IV cs.CV cs.LG

    Anatomically constrained CT image translation for heterogeneous blood vessel segmentation

    Authors: Giammarco La Barbera, Haithem Boussaid, Francesco Maso, Sabine Sarnacki, Laurence Rouet, Pietro Gori, Isabelle Bloch

    Abstract: Anatomical structures such as blood vessels in contrast-enhanced CT (ceCT) images can be challenging to segment due to the variability in contrast medium diffusion. The combined use of ceCT and contrast-free (CT) CT images can improve the segmentation performances, but at the cost of a double radiation exposure. To limit the radiation dose, generative models could be used to synthesize one modalit…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted at BMVC 2022

  5. arXiv:2107.02655

    cs.CV cs.AI cs.LG eess.IV stat.ML

    Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation

    Authors: Giammarco La Barbera, Pietro Gori, Haithem Boussaid, Bruno Belucci, Alessandro Delmonte, Jeanne Goulin, Sabine Sarnacki, Laurence Rouet, Isabelle Bloch

    Abstract: Due to a high heterogeneity in pose and size and to a limited number of available data, segmentation of pediatric images is challenging for deep learning methods. In this work, we propose a new CNN architecture that is pose and scale invariant thanks to the use of Spatial Transformer Network (STN). Our architecture is composed of three sequential modules that are estimated together during training…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: ISBI 2021

    Journal ref: ISBI 2021