Showing 1–2 of 2 results for author: Haneda, Y

  1. arXiv:1811.02438

    eess.AS cs.LG cs.SD eess.SP stat.ML

    Trainable Adaptive Window Switching for Speech Enhancement

    Authors: Yuma Koizumi, Noboru Harada, Yoichi Haneda

    Abstract: This study proposes a trainable adaptive window switching (AWS) method and applies it to a deep neural network (DNN) for speech enhancement in the modified discrete cosine transform domain. Time-frequency (T-F) mask processing in the short-time Fourier transform (STFT) domain is a typical speech enhancement method. To recover the target signal precisely, DNN-based short-time frequency transforms hav…

    Submitted 19 February, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: accepted to the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019)

  2. arXiv:1810.09137

    stat.ML cs.LG cs.SD eess.AS

    DNN-based Source Enhancement to Increase Objective Sound Quality Assessment Score

    Authors: Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda

    Abstract: We propose a training method for deep neural network (DNN)-based source enhancement to increase objective sound quality assessment (OSQA) scores such as the perceptual evaluation of speech quality (PESQ). In many conventional studies, DNNs have been used as a mapping function to estimate time-frequency masks and trained to minimize an analytically tractable objective function such as the mean squa…

    Submitted 22 October, 2018; originally announced October 2018.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, Issue 10, 2018