Showing 1–7 of 7 results for author: Morais, E

  1. arXiv:2207.13965  [pdf, other]

    eess.AS cs.SD

    Extending RNN-T-based speech recognition systems with emotion and language classification

    Authors: Zvi Kons, Hagai Aronowitz, Edmilson Morais, Matheus Damasceno, Hong-Kwang Kuo, Samuel Thomas, George Saon

    Abstract: Speech transcription, emotion recognition, and language identification are usually considered to be three different tasks. Each one requires a different model with a different architecture and training process. We propose using a recurrent neural network transducer (RNN-T)-based speech-to-text (STT) system as a common component that can be used for emotion recognition and language identification a…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted for publication in Interspeech 2022

  2. arXiv:2203.00613  [pdf]

    cs.CL cs.LG cs.SD eess.AS

    Towards a Common Speech Analysis Engine

    Authors: Hagai Aronowitz, Itai Gat, Edmilson Morais, Weizhong Zhu, Ron Hoory

    Abstract: Recent innovations in self-supervised representation learning have led to remarkable advances in natural language processing. That said, in the speech processing domain, self-supervised representation learning-based systems are not yet considered state-of-the-art. We propose leveraging recent advances in self-supervised-based speech processing to create a common speech analysis engine. Such an eng…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: ICASSP 2022

  3. arXiv:2202.03896  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    Speech Emotion Recognition using Self-Supervised Features

    Authors: Edmilson Morais, Ron Hoory, Weizhong Zhu, Itai Gat, Matheus Damasceno, Hagai Aronowitz

    Abstract: Self-supervised pre-trained features have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of speech emotion recognition (SER) still need further investigation. In this paper we introduce a modular End-to-End (E2E) SER system based on an Upstream + Downstream architecture paradigm, which allows easy use/integration o…

    Submitted 6 February, 2022; originally announced February 2022.

    Comments: 5 pages, 4 figures, 2 tables, ICASSP 2022

  4. arXiv:2202.01252  [pdf, other]

    cs.LG

    Speaker Normalization for Self-supervised Speech Emotion Recognition

    Authors: Itai Gat, Hagai Aronowitz, Weizhong Zhu, Edmilson Morais, Ron Hoory

    Abstract: Large speech emotion recognition datasets are hard to obtain, and small datasets may contain biases. Deep-net-based classifiers, in turn, are prone to exploit those biases and find shortcuts such as speaker characteristics. These shortcuts usually harm a model's ability to generalize. To address this challenge, we propose a gradient-based adversary learning framework that learns a speech emotion r…

    Submitted 6 November, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: ICASSP 2022

  5. arXiv:2104.05752  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs

    Authors: Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

    Abstract: A major focus of recent research in spoken language understanding (SLU) has been on the end-to-end approach where a single model can predict intents directly from speech inputs without intermediate transcripts. However, this approach presents some challenges. First, since speech can be considered as personally identifiable information, in some cases only automatic speech recognition (ASR) transcri…

    Submitted 14 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted to Interspeech 2021

  6. arXiv:2011.08238  [pdf]

    cs.CL cs.SD eess.AS

    End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

    Authors: Edmilson Morais, Hong-Kwang J. Kuo, Samuel Thomas, Zoltan Tuske, Brian Kingsbury

    Abstract: Transformer networks and self-supervised pre-training have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of spoken language understanding (SLU) still need further investigation. In this paper we introduce a modular End-to-End (E2E) SLU transformer-network-based architecture which allows the use of self-supervised p…

    Submitted 16 November, 2020; originally announced November 2020.

    Comments: 5 pages, 3 tables and 1 figure

  7. arXiv:1907.06381  [pdf, other]

    cs.CR

    A Survey on Zero Knowledge Range Proofs and Applications

    Authors: Eduardo Morais, Tommy Koens, Cees van Wijk, Aleksei Koren

    Abstract: In recent years, there has been an increasing effort to leverage Distributed Ledger Technology (DLT), including blockchain. One of the main topics of interest, given its importance, is the research and development of privacy mechanisms, such as Zero Knowledge Proofs (ZKP). ZKP is a cryptographic technique that can be used to hide information that is put into the ledger, while s…

    Submitted 15 July, 2019; originally announced July 2019.