Showing 1–4 of 4 results for author: Aly, A

  1. arXiv:2406.07823  [pdf, other]

    cs.CL cs.SD eess.AS

    PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding

    Authors: Trang Le, Daniel Lazar, Suyoun Kim, Shan Jiang, Duc Le, Adithya Sagar, Aleksandr Livshits, Ahmed Aly, Akshat Shrivastava

    Abstract: Spoken Language Understanding (SLU) is a critical component of voice assistants; it consists of converting speech to semantic parses for task execution. Previous works have explored end-to-end models to improve the quality and robustness of SLU models with Deliberation, however these models have remained autoregressive, resulting in higher latencies. In this work we introduce PRoDeliberation, a no…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2209.02448  [pdf, ps, other]

    eess.SY

    Fast Adaptive Regression-based Model Predictive Control

    Authors: Eslam Mostafa, Hussein A. Aly, Ahmed Elliethy

    Abstract: Model predictive control (MPC) is an optimal control method that predicts the future states of the system being controlled and estimates the optimal control inputs that drive the predicted states to the required reference. The computations of the MPC are performed at pre-determined sample instances over a finite time horizon. The number of sample instances and the horizon length determine the perf…

    Submitted 4 May, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: Accepted for publication in Control Theory and Technology, May 2023

  3. arXiv:2111.06331  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset

    Authors: Aly Moustafa, Salah A. Aly

    Abstract: Current authentication and trusted systems depend on classical and biometric methods to recognize or authorize users. Such methods include audio speech recognition, eye, and finger signatures. Recent tools utilize deep learning and transformers to achieve better results. In this paper, we develop a deep learning constructed model for Arabic speaker identification by using Wav2Vec2.0 and HuBERT a…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 5 pages, 9 figures, 2 tables

  4. Flexible Architecture for Real-time Processing of Multiple Video Signals

    Authors: Mohamed Awad, Islam T. Abougindia, Ahmed Elliethy, Hussein A. Aly

    Abstract: Simultaneous processing of multiple video sources requires each pixel in a frame from a video source to be processed synchronously with the pixels at the same spatial positions in corresponding frames from the other video sources. However, simultaneous processing is challenging as corresponding frames from different video signals provided by multiple sources have time-varying delay because of the…

    Submitted 29 December, 2019; originally announced January 2020.

    Comments: 13 pages, 16 figures, 3 tables

    Journal ref: Springer Multimedia Tools and Applications (2021)