Showing 1–1 of 1 results for author: Kalinov, A

Search v0.5.6 released 2020-02-24

arXiv:2107.10708

eess.AS cs.SD

CarneliNet: Neural Mixture Model for Automatic Speech Recognition

Authors: Aleksei Kalinov, Somshubra Majumdar, Jagadeesh Balam, Boris Ginsburg

Abstract: End-to-end automatic speech recognition systems have achieved great accuracy by using deeper and deeper models. However, the increased depth comes with a larger receptive field that can negatively impact model performance in streaming scenarios. We propose an alternative approach that we call Neural Mixture Model. The basic idea is to introduce a parallel mixture of shallow networks instead of a very deep network. To validate this idea, we design CarneliNet, a CTC-based neural network composed of three mega-blocks. Each mega-block consists of multiple parallel shallow sub-networks based on 1D depthwise-separable convolutions. We evaluate the model on the LibriSpeech, MLS, and AISHELL-2 datasets and achieve close to state-of-the-art results for CTC-based models. Finally, we demonstrate that one can dynamically reconfigure the number of parallel sub-networks to accommodate the computational requirements without retraining.
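The mega-block idea in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not say how the parallel outputs are combined or which sub-networks are dropped when the model is reconfigured, so averaging the outputs and keeping the first `active` sub-networks are assumptions made here for illustration. All function names (`depthwise_separable_conv1d`, `make_subnetwork`, `mega_block`) are hypothetical, and each sub-network is reduced to a single convolution followed by a ReLU.

```python
import random


def depthwise_separable_conv1d(x, depthwise, pointwise):
    """Apply a 1D depthwise-separable convolution.

    x: list of channels, each a list of T floats.
    depthwise: one odd-length kernel per input channel.
    pointwise: matrix [out_channels][in_channels], i.e. a 1x1 convolution.
    """
    T = len(x[0])
    filtered = []
    for channel, kernel in zip(x, depthwise):
        pad = len(kernel) // 2
        padded = [0.0] * pad + channel + [0.0] * pad  # "same" zero padding
        filtered.append([
            sum(k * padded[t + i] for i, k in enumerate(kernel))
            for t in range(T)
        ])
    # The pointwise step mixes channels independently at every time step.
    return [
        [sum(w * filtered[c][t] for c, w in enumerate(row)) for t in range(T)]
        for row in pointwise
    ]


def relu(x):
    return [[max(0.0, v) for v in channel] for channel in x]


def make_subnetwork(channels, kernel_size, rng):
    """Random weights for one shallow sub-network (illustration only)."""
    depthwise = [[rng.uniform(-1, 1) for _ in range(kernel_size)]
                 for _ in range(channels)]
    pointwise = [[rng.uniform(-1, 1) for _ in range(channels)]
                 for _ in range(channels)]
    return depthwise, pointwise


def mega_block(x, subnetworks, active=None):
    """Average the outputs of the first `active` parallel sub-networks.

    Assumption: outputs are combined by averaging, so the output scale
    stays comparable when fewer sub-networks are used.
    """
    chosen = subnetworks[:active]  # active=None keeps all of them
    if not chosen:
        raise ValueError("at least one sub-network must be active")
    outputs = [relu(depthwise_separable_conv1d(x, dw, pw)) for dw, pw in chosen]
    n = len(outputs)
    return [
        [sum(out[c][t] for out in outputs) / n for t in range(len(x[0]))]
        for c in range(len(x))
    ]


rng = random.Random(0)
subnets = [make_subnetwork(channels=2, kernel_size=3, rng=rng) for _ in range(4)]
x = [[1.0, 2.0, 3.0, 4.0], [0.5, -0.5, 0.5, -0.5]]

full = mega_block(x, subnets)               # all four parallel sub-networks
reduced = mega_block(x, subnets, active=2)  # cheaper configuration, same weights
```

Because every sub-network sees the same input and has the same shallow depth, the receptive field of the block does not grow with the number of branches, and `reduced` is computed from the same weights as `full` with no retraining step.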

Submitted 22 July, 2021; originally announced July 2021.

Comments: Submitted to ASRU 2021
