Showing 1–7 of 7 results for author: Ebbers, J

  1. arXiv:2406.08056  [pdf, ps, other]

    eess.AS cs.SD

    DCASE 2024 Task 4: Sound Event Detection with Heterogeneous Data and Missing Labels

    Authors: Samuele Cornell, Janek Ebbers, Constance Douwes, Irene Martín-Morató, Manu Harju, Annamaria Mesaros, Romain Serizel

    Abstract: The Detection and Classification of Acoustic Scenes and Events Challenge Task 4 aims to advance sound event detection (SED) systems in domestic environments by leveraging training data with different supervision uncertainty. Participants are challenged in exploring how to best use training data from different domains and with varying annotation granularity (strong/weak temporal resolution, soft/ha…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.04212  [pdf, ps, other]

    eess.AS cs.SD

    Sound Event Bounding Boxes

    Authors: Janek Ebbers, Francois G. Germain, Gordon Wichern, Jonathan Le Roux

    Abstract: Sound event detection is the task of recognizing sounds and determining their extent (onset/offset times) within an audio clip. Existing systems commonly predict sound presence confidence in short time frames. Then, thresholding produces binary frame-level presence decisions, with the extent of individual events determined by merging consecutive positive frames. In this paper, we show that frame-l…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted for publication at Interspeech 2024

  3. arXiv:2306.15440  [pdf, ps, other]

    eess.AS cs.SD

    Post-Processing Independent Evaluation of Sound Event Detection Systems

    Authors: Janek Ebbers, Reinhold Haeb-Umbach, Romain Serizel

    Abstract: Due to the high variation in the application requirements of sound event detection (SED) systems, it is not sufficient to evaluate systems only in a single operating mode. Therefore, the community recently adopted the polyphonic sound detection score (PSDS) as an evaluation metric, which is the normalized area under the PSD receiver operating characteristic (PSD-ROC). It summarizes the system perf…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: submitted to DCASE Workshop 2023

  4. arXiv:2201.13148  [pdf, other]

    eess.AS cs.SD

    Threshold Independent Evaluation of Sound Event Detection Scores

    Authors: Janek Ebbers, Romain Serizel, Reinhold Haeb-Umbach

    Abstract: Performing an adequate evaluation of sound event detection (SED) systems is far from trivial and is still subject to ongoing research. The recently proposed polyphonic sound detection (PSD)-receiver operating characteristic (ROC) and PSD score (PSDS) make an important step into the direction of an evaluation of SED systems which is independent from a certain decision threshold. This allows to obta…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: accepted for ICASSP 2022

  5. arXiv:2105.01786  [pdf, other]

    eess.AS cs.CL cs.SD

    Voice Conversion Based Speaker Normalization for Acoustic Unit Discovery

    Authors: Thomas Glarner, Janek Ebbers, Reinhold Häb-Umbach

    Abstract: Discovering speaker independent acoustic units purely from spoken input is known to be a hard problem. In this work we propose an unsupervised speaker normalization technique prior to unit discovery. It is based on separating speaker related from content induced variations in a speech signal with an adversarial contrastive predictive coding approach. This technique does neither require transcribed…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: Submitted to Interspeech 2021

  6. arXiv:2103.06581  [pdf, ps, other]

    eess.AS cs.SD

    Forward-Backward Convolutional Recurrent Neural Networks and Tag-Conditioned Convolutional Neural Networks for Weakly Labeled Semi-supervised Sound Event Detection

    Authors: Janek Ebbers, Reinhold Haeb-Umbach

    Abstract: In this paper we present our system for the detection and classification of acoustic scenes and events (DCASE) 2020 Challenge Task 4: Sound event detection and separation in domestic environments. We introduce two new models: the forward-backward convolutional recurrent neural network (FBCRNN) and the tag-conditioned convolutional neural network (CNN). The FBCRNN employs two recurrent neural netwo…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: accepted by dcase2020 workshop, the presented system received the reproducible system award for the dcase2020 challenge task 4

  7. arXiv:2005.12963  [pdf, ps, other]

    eess.AS cs.SD

    Contrastive Predictive Coding Supported Factorized Variational Autoencoder for Unsupervised Learning of Disentangled Speech Representations

    Authors: Janek Ebbers, Michael Kuhlmann, Tobias Cord-Landwehr, Reinhold Haeb-Umbach

    Abstract: In this work we address disentanglement of style and content in speech signals. We propose a fully convolutional variational autoencoder employing two encoders: a content encoder and a style encoder. To foster disentanglement, we propose adversarial contrastive predictive coding. This new disentanglement method does neither need parallel data nor any supervision. We show that the proposed techniqu…

    Submitted 11 March, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: accepted by icassp 2021