Showing 1–4 of 4 results for author: Meignier, S

  1. arXiv:2307.13012

    cs.SD cs.AI cs.NE eess.AS eess.SP

    Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains

    Authors: Martin Lebourdais, Théo Mariotte, Marie Tahon, Anthony Larcher, Antoine Laurent, Silvio Montresor, Sylvain Meignier, Jean-Hugh Thomas

    Abstract: Voice activity and overlapped speech detection (respectively VAD and OSD) are key pre-processing tasks for speaker diarization. The final segmentation performance highly relies on the robustness of these sub-tasks. Recent studies have shown VAD and OSD can be trained jointly using a multi-class classification model. However, these works are often restricted to a specific speech domain, lacking inf…

    Submitted 24 July, 2023; originally announced July 2023.

  2. arXiv:2209.04167

    cs.SD cs.AI eess.AS

    Overlapped speech and gender detection with WavLM pre-trained features

    Authors: Martin Lebourdais, Marie Tahon, Antoine Laurent, Sylvain Meignier

    Abstract: This article focuses on overlapped speech and gender detection in order to study interactions between women and men in French audiovisual media (Gender Equality Monitoring project). In this application context, we need to automatically segment the speech signal according to speaker gender, and to identify when at least two speakers speak at the same time. We propose to use WavLM model which has t…

    Submitted 9 September, 2022; originally announced September 2022.

    Comments: Submitted and accepted to Interspeech 2022

  3. End2End Acoustic to Semantic Transduction

    Authors: Valentin Pelloin, Nathalie Camelin, Antoine Laurent, Renato De Mori, Antoine Caubrière, Yannick Estève, Sylvain Meignier

    Abstract: In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic contents. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system re…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: Accepted at IEEE ICASSP 2021

    Journal ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  4. arXiv:1602.01929

    cs.CL

    Fantastic 4 system for NIST 2015 Language Recognition Evaluation

    Authors: Kong Aik Lee, Ville Hautamäki, Anthony Larcher, Wei Rao, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Ivan Kukanov, Amir Poorjam, Trung Ngo Trong, Xiong Xiao, Cheng-Lin Xu, Hai-Hua Xu, Bin Ma, Haizhou Li, Sylvain Meignier

    Abstract: This article describes the systems jointly submitted by the Institute for Infocomm Research (I$^2$R), the Laboratoire d'Informatique de l'Université du Maine (LIUM), Nanyang Technological University (NTU) and the University of Eastern Finland (UEF) for the 2015 NIST Language Recognition Evaluation (LRE). The submitted system is a fusion of nine sub-systems based on i-vectors extracted from different types of features…

    Submitted 5 February, 2016; originally announced February 2016.

    Comments: Technical report for NIST LRE 2015 Workshop