Showing 1–5 of 5 results for author: Cho, D

Searching in archive eess.
  1. arXiv:2406.07803  [pdf, other]

    cs.SD cs.AI eess.AS

    EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech

    Authors: Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, Sang-Hoon Lee, Seong-Whan Lee

    Abstract: Despite rapid advances in the field of emotional text-to-speech (TTS), recent studies primarily focus on mimicking the average style of a particular emotion. As a result, the ability to manipulate speech emotion remains constrained to several predefined labels, compromising the ability to reflect the nuanced variations of emotion. In this paper, we propose EmoSphere-TTS, which synthesizes expressi…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  2. arXiv:2401.08095  [pdf, other]

    cs.SD cs.AI eess.AS

    DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation

    Authors: Hyung-Seok Oh, Sang-Hoon Lee, Deok-Hyeon Cho, Seong-Whan Lee

    Abstract: Emotional voice conversion involves modifying the pitch, spectral envelope, and other acoustic characteristics of speech to match a desired emotional state while maintaining the speaker's identity. Recent advances in EVC involve simultaneously modeling pitch and duration by exploiting the potential of sequence-to-sequence models. In this study, we focus on parallel speech generation to increase th…

    Submitted 8 August, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

    Comments: 14 pages, 11 figures, 12 tables

  3. arXiv:2007.01524  [pdf, other]

    cs.CV cs.LG eess.IV

    Domain Adaptation without Source Data

    Authors: Youngeun Kim, Donghyeon Cho, Kyeongtak Han, Priyadarshini Panda, Sungeun Hong

    Abstract: Domain adaptation assumes that samples from source and target domains are freely accessible during a training phase. However, such an assumption is rarely plausible in the real-world and possibly causes data-privacy issues, especially when the label of the source domain can be a sensitive attribute as an identifier. To avoid accessing source data that may contain sensitive information, we introduc…

    Submitted 30 August, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: 13 pages

  4. arXiv:1912.00374  [pdf]

    eess.SY

    Task Scheduling of Multiple Agile Satellites with Transition Time and Stereo Imaging Constraints

    Authors: Junhong Kim, Doo-Hyun Cho, Jaemyung Ahn, Han-Lim Choi

    Abstract: This paper proposes a framework for scheduling the observation and download tasks of multiple agile satellites with practical considerations such as attitude transition time, onboard data capacity, and stereoscopic image acquisition. A mixed integer linear programming (MILP) formulation for optimal scheduling that can address these practical considerations is introduced. A heuristic algorithm to o…

    Submitted 1 December, 2019; originally announced December 2019.

  5. arXiv:1906.07851  [pdf, other]

    cs.CV cs.LG eess.IV

    Key Instance Selection for Unsupervised Video Object Segmentation

    Authors: Donghyeon Cho, Sungeun Hong, Sungil Kang, Jiwon Kim

    Abstract: This paper proposes key instance selection based on video saliency covering objectness and dynamics for unsupervised video object segmentation (UVOS). Our method takes frames sequentially and extracts object proposals with corresponding masks for each frame. We link objects according to their similarity until the M-th frame and then assign them unique IDs (i.e., instances). Similarity measure take…

    Submitted 26 July, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: Ranked 3rd in 'Unsupervised DAVIS Challenge' (CVPR 2019)