Showing 1–2 of 2 results for author: Pylkkönen, J

Searching in archive eess.
  1. arXiv:2406.10325  [pdf, other]

    cs.CL cs.LG eess.AS

    Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment

    Authors: Joseph Liu, Mahesh Kumar Nandwana, Janne Pylkkönen, Hannes Heikinheimo, Morgan McGuire

    Abstract: Toxicity classification for voice heavily relies on the semantic content of speech. We propose a novel framework that utilizes cross-modal learning to integrate the semantic embedding of text into a multilabel speech toxicity classifier during training. This enables us to incorporate textual information during training while still requiring only audio during inference. We evaluate this classifier…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  2. arXiv:2104.11127  [pdf, other]

    cs.CL cs.SD eess.AS

    Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network

    Authors: Janne Pylkkönen, Antti Ukkonen, Juho Kilpikoski, Samu Tamminen, Hannes Heikinheimo

    Abstract: Adaption of end-to-end speech recognition systems to new tasks is known to be challenging. A number of solutions have been proposed which apply external language models with various fusion methods, possibly with a combination of two-pass decoding. Also TTS systems have been used to generate adaptation data for the end-to-end models. In this paper we show that RNN-transducer models can be effective…

    Submitted 9 June, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: 5 pages, 2 figures. Accepted to Interspeech 2021