
Showing 1–20 of 20 results for author: Naderi, B

Searching in archive cs.
  1. arXiv:2401.14444  [pdf, other]

    cs.SD cs.AI cs.CV eess.AS

    ICASSP 2024 Speech Signal Improvement Challenge

    Authors: Nicolae Catalin Ristea, Ando Saabas, Ross Cutler, Babak Naderi, Sebastian Braun, Solomiya Branets

    Abstract: The ICASSP 2024 Speech Signal Improvement Grand Challenge is intended to stimulate research in the area of improving the speech signal quality in communication systems. This marks our second challenge, building upon the success from the previous ICASSP 2023 Grand Challenge. We enhance the competition by introducing a dataset synthesizer, enabling all participating teams to start at a higher baseli…

    Submitted 25 January, 2024; originally announced January 2024.

  2. arXiv:2309.07385  [pdf, other]

    eess.AS cs.SD

    Multi-dimensional Speech Quality Assessment in Crowdsourcing

    Authors: Babak Naderi, Ross Cutler, Nicolae-Catalin Ristea

    Abstract: Subjective speech quality assessment is the gold standard for evaluating speech enhancement processing and telecommunication systems. The commonly used standard ITU-T Rec. P.800 defines how to measure speech quality in lab environments, and ITU-T Rec. P.808 extended it for crowdsourcing. ITU-T Rec. P.835 extends P.800 to measure the quality of speech in the presence of noise. ITU-T Rec. P.804 targ…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.06566

  3. arXiv:2309.07376  [pdf, other]

    eess.IV cs.MM

    VCD: A Video Conferencing Dataset for Video Compression

    Authors: Babak Naderi, Ross Cutler, Nabakumar Singh Khongbantabam, Yasaman Hosseinkashi, Henrik Turbell, Albert Sadovnikov, Quan Zhou

    Abstract: Commonly used datasets for evaluating video codecs are all very high quality and not representative of video typically used in video conferencing scenarios. We present the Video Conferencing Dataset (VCD) for evaluating video codecs for real-time communication, the first such dataset focused on video conferencing. VCD includes a wide variety of camera qualities and spatial and temporal information…

    Submitted 13 November, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

  4. arXiv:2309.00769  [pdf, other]

    eess.IV cs.CV

    Full Reference Video Quality Assessment for Machine Learning-Based Video Codecs

    Authors: Abrar Majeedi, Babak Naderi, Yasaman Hosseinkashi, Juhee Cho, Ruben Alvarez Martinez, Ross Cutler

    Abstract: Machine learning-based video codecs have made significant progress in the past few years. A critical area in the development of ML-based video codecs is an accurate evaluation metric that does not require an expensive and slow subjective test. We show that existing evaluation metrics that were designed and trained on DSP-based video codecs are not highly correlated to subjective opinion when used…

    Submitted 1 September, 2023; originally announced September 2023.

  5. arXiv:2303.12761  [pdf, other]

    eess.IV cs.LG

    LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls

    Authors: Gabriel Mittag, Babak Naderi, Vishak Gopal, Ross Cutler

    Abstract: Current state-of-the-art video quality models, such as VMAF, give excellent prediction results by comparing the degraded video with its reference video. However, they do not consider temporal distortions (e.g., frame freezes or skips) that occur during videoconferencing calls. In this paper, we present a data-driven approach for modeling such distortions automatically by training an LSTM with subj…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023

  6. arXiv:2303.11510  [pdf, other]

    cs.SD eess.AS

    ICASSP 2023 Deep Noise Suppression Challenge

    Authors: Harishchandra Dubey, Ashkan Aazami, Vishak Gopal, Babak Naderi, Sebastian Braun, Ross Cutler, Alex Ju, Mehdi Zohourian, Min Tang, Hannes Gamper, Mehrsa Golestaneh, Robert Aichner

    Abstract: Deep Speech Enhancement Challenge is the 5th edition of deep noise suppression (DNS) challenges organized at ICASSP 2023 Signal Processing Grand Challenges. DNS challenges were organized during 2019-2023 to stimulate research in deep speech enhancement (DSE). Previous DNS challenges were organized at INTERSPEECH 2020, ICASSP 2021, INTERSPEECH 2021, and ICASSP 2022. From prior editions, we learnt t…

    Submitted 8 May, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 6 pages, 1 figure. arXiv admin note: text overlap with arXiv:2202.13288

  7. arXiv:2207.06265  [pdf, other]

    cs.CL cs.AI cs.LG

    A Transfer Learning Based Model for Text Readability Assessment in German

    Authors: Salar Mohtaj, Babak Naderi, Sebastian Möller, Faraz Maschhur, Chuyang Wu, Max Reinhard

    Abstract: Text readability assessment has a wide range of applications for different target people, from language learners to people with disabilities. The fast pace of textual content production on the web makes it impossible to measure text complexity without the benefit of machine learning and natural language processing techniques. Although various research addressed the readability assessment of Englis…

    Submitted 6 September, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  8. arXiv:2203.16032  [pdf, other]

    cs.SD eess.AS

    ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications

    Authors: Gaoxiong Yi, Wei Xiao, Yiming Xiao, Babak Naderi, Sebastian Möller, Wafaa Wardah, Gabriel Mittag, Ross Cutler, Zhuohuang Zhang, Donald S. Williamson, Fei Chen, Fuzheng Yang, Shidong Shang

    Abstract: With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is traditionally assessed in subjective tests in laborato…

    Submitted 31 March, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  9. arXiv:2104.10217  [pdf, other]

    eess.AS cs.LG cs.SD eess.IV

    Bias-Aware Loss for Training Image and Speech Quality Prediction Models from Multiple Datasets

    Authors: Gabriel Mittag, Saman Zadtootaghaj, Thilo Michael, Babak Naderi, Sebastian Möller

    Abstract: The ground truth used for training image, video, or speech quality prediction models is based on the Mean Opinion Scores (MOS) obtained from subjective experiments. Usually, it is necessary to conduct multiple experiments, mostly with different test participants, to obtain enough data to train quality models based on machine learning. Each of these experiments is subject to an experiment-specific…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: Accepted at QoMEX 2021

  10. arXiv:2104.09494  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets

    Authors: Gabriel Mittag, Babak Naderi, Assmaa Chehadi, Sebastian Möller

    Abstract: In this paper, we present an update to the NISQA speech quality prediction model that is focused on distortions that occur in communication networks. In contrast to the previous version, the model is trained end-to-end and the time-dependency modelling and time-pooling is achieved through a Self-Attention mechanism. Besides overall speech quality, the model also predicts the four speech quality di…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: Submitted to Interspeech 2021

  11. arXiv:2104.04371  [pdf, other]

    cs.MM eess.AS

    Speech Quality Assessment in Crowdsourcing: Comparison Category Rating Method

    Authors: Babak Naderi, Sebastian Möller, Ross Cutler

    Abstract: Traditionally, Quality of Experience (QoE) for a communication system is evaluated through a subjective test. The most common test method for speech QoE is the Absolute Category Rating (ACR), in which participants listen to a set of stimuli, processed by the underlying test conditions, and rate their perceived quality for each stimulus on a specific scale. The Comparison Category Rating (CCR) is a…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted for QoMEX2021

  12. arXiv:2010.13260  [pdf, ps, other]

    cs.MM cs.SD eess.AS

    Effect of Language Proficiency on Subjective Evaluation of Noise Suppression Algorithms

    Authors: Babak Naderi, Gabriel Mittag, Rafael Zequeira Jiménez, Sebastian Möller

    Abstract: Speech communication systems based on Voice-over-IP technology are frequently used by native as well as non-native speakers of a target language, e.g. in international phone calls or telemeetings. Frequently, such calls also occur in a noisy environment, making noise suppression modules necessary to increase perceived quality of experience. Whereas standard tests for assessing perceived quality ma…

    Submitted 25 October, 2020; originally announced October 2020.

  13. arXiv:2010.13200  [pdf, other]

    eess.AS cs.SD

    Subjective Evaluation of Noise Suppression Algorithms in Crowdsourcing

    Authors: Babak Naderi, Ross Cutler

    Abstract: The quality of the speech communication systems, which include noise suppression algorithms, are typically evaluated in laboratory experiments according to the ITU-T Rec. P.835, in which participants rate background noise, speech signal, and overall quality separately. This paper introduces an open-source toolkit for conducting subjective quality evaluation of noise suppressed speech in crowdsourc…

    Submitted 16 April, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

  14. arXiv:2010.13063  [pdf, other]

    eess.AS cs.SD

    Crowdsourcing approach for subjective evaluation of echo impairment

    Authors: Ross Cutler, Babak Naderi, Markus Loide, Sten Sootla, Ando Saabas

    Abstract: The quality of acoustic echo cancellers (AECs) in real-time communication systems is typically evaluated using objective metrics like ERLE and PESQ, and less commonly with lab-based subjective tests like ITU-T Rec. P.831. We will show that these objective measures are not well correlated to subjective measures. We then introduce an open-source crowdsourcing approach for subjective evaluation of ec…

    Submitted 27 February, 2022; v1 submitted 25 October, 2020; originally announced October 2020.

  15. arXiv:2007.07032  [pdf]

    cs.MM

    QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)

    Authors: Andrew Perkis, Christian Timmerer, Sabina Baraković, Jasmina Baraković Husić, Søren Bech, Sebastian Bosse, Jean Botev, Kjell Brunnström, Luis Cruz, Katrien De Moor, Andrea de Polo Saibanti, Wouter Durnez, Sebastian Egger-Lampl, Ulrich Engelke, Tiago H. Falk, Jesús Gutiérrez, Asim Hameed, Andrew Hines, Tanja Kojic, Dragan Kukolj, Eirini Liotou, Dragorad Milovanovic, Sebastian Möller, Niall Murray, Babak Naderi, et al. (19 additional authors not shown)

    Abstract: With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions,…

    Submitted 24 November, 2020; v1 submitted 10 June, 2020; originally announced July 2020.

  16. An Open source Implementation of ITU-T Recommendation P.808 with Validation

    Authors: Babak Naderi, Ross Cutler

    Abstract: The ITU-T Recommendation P.808 provides a crowdsourcing approach for conducting a subjective assessment of speech quality using the Absolute Category Rating (ACR) method. We provide an open-source implementation of the ITU-T Rec. P.808 that runs on the Amazon Mechanical Turk platform. We extended our implementation to include Degradation Category Ratings (DCR) and Comparison Category Ratings (CCR)…

    Submitted 16 May, 2020; originally announced May 2020.

  17. Transformation of Mean Opinion Scores to Avoid Misleading of Ranked based Statistical Techniques

    Authors: Babak Naderi, Sebastian Möller

    Abstract: The rank correlation coefficients and the ranked-based statistical tests (as a subset of non-parametric techniques) might be misleading when they are applied to subjectively collected opinion scores. Those techniques assume that the data is measured at least at an ordinal level and define a sequence of scores to represent a tied rank when they have precisely an equal numeric value. In this paper…

    Submitted 23 April, 2020; originally announced April 2020.

    Comments: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

  18. Application of Just-Noticeable Difference in Quality as Environment Suitability Test for Crowdsourcing Speech Quality Assessment Task

    Authors: Babak Naderi, Sebastian Möller

    Abstract: Crowdsourcing micro-task platforms facilitate subjective media quality assessment by providing access to a highly scale-able, geographically distributed and demographically diverse pool of crowd workers. Those workers participate in the experiment remotely from their own working environment, using their own hardware. In the case of speech quality assessment, preliminary work showed that environmen…

    Submitted 11 April, 2020; originally announced April 2020.

    Comments: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

  19. Impact of the Number of Votes on the Reliability and Validity of Subjective Speech Quality Assessment in the Crowdsourcing Approach

    Authors: Babak Naderi, Tobias Hossfeld, Matthias Hirth, Florian Metzger, Sebastian Möller, Rafael Zequeira Jiménez

    Abstract: The subjective quality of transmitted speech is traditionally assessed in a controlled laboratory environment according to ITU-T Rec. P.800. In turn, with crowdsourcing, crowdworkers participate in a subjective online experiment using their own listening device, and in their own working environment. Despite such less controllable conditions, the increased use of crowdsourcing micro-task platforms…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

  20. arXiv:1904.07733  [pdf, other]

    cs.CL

    Subjective Assessment of Text Complexity: A Dataset for German Language

    Authors: Babak Naderi, Salar Mohtaj, Kaspar Ensikat, Sebastian Möller

    Abstract: This paper presents TextComplexityDE, a dataset consisting of 1000 sentences in German language taken from 23 Wikipedia articles in 3 different article-genres to be used for developing text-complexity predictor models and automatic text simplification in German language. The dataset includes subjective assessment of different text-complexity aspects provided by German learners in level A and B. In…

    Submitted 16 April, 2019; originally announced April 2019.