Showing 1–2 of 2 results for author: Belyaev, V

Searching in archive cs.
  1. arXiv:2211.11740

    cs.CL cs.SD eess.AS

    SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale

    Authors: Raphael Tang, Karun Kumar, Gefei Yang, Akshat Pandey, Yajie Mao, Vladislav Belyaev, Madhuri Emmadi, Craig Murray, Ferhan Ture, Jimmy Lin

    Abstract: End-to-end automatic speech recognition systems represent the state of the art, but they rely on thousands of hours of manually annotated speech for training, as well as heavyweight computation for inference. Of course, this impedes commercialization since most companies lack vast human and computational resources. In this paper, we explore training and deploying an ASR system in the label-scarce,…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to EMNLP 2022 Industry Track; 9 pages, 7 figures

  2. End-to-end Deep Object Tracking with Circular Loss Function for Rotated Bounding Box

    Authors: Vladislav Belyaev, Aleksandra Malysheva, Aleksei Shpilman

    Abstract: The task of object tracking is vital in numerous applications such as autonomous driving, intelligent surveillance, and robotics. This task entails assigning a bounding box to an object in a video stream, given only the bounding box for that object in the first frame. In 2015, a new type of video object tracking (VOT) dataset was created that introduced rotated bounding boxes as an extension…

    Submitted 17 December, 2020; originally announced December 2020.