
Showing 1–2 of 2 results for author: Varadhan, P S

  1. arXiv:2407.14056

    cs.CL cs.LG cs.SD eess.AS

    Rasa: Building Expressive Speech Synthesis Systems for Indian Languages in Low-resource Settings

    Authors: Praveen Srinivasa Varadhan, Ashwin Sankar, Giri Raju, Mitesh M. Khapra

    Abstract: We release Rasa, the first multilingual expressive TTS dataset for any Indian language, which contains 10 hours of neutral speech and 1-3 hours of expressive speech for each of the 6 Ekman emotions covering 3 languages: Assamese, Bengali, & Tamil. Our ablation studies reveal that just 1 hour of neutral and 30 minutes of expressive data can yield a Fair system as indicated by MUSHRA scores. Increas…

    Submitted 30 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: Accepted at INTERSPEECH 2024. First two authors listed contributed equally

  2. arXiv:2407.13435

    cs.CL cs.LG cs.SD eess.AS

    Enhancing Out-of-Vocabulary Performance of Indian TTS Systems for Practical Applications through Low-Effort Data Strategies

    Authors: Srija Anand, Praveen Srinivasa Varadhan, Ashwin Sankar, Giri Raju, Mitesh M. Khapra

    Abstract: Publicly available TTS datasets for low-resource languages like Hindi and Tamil typically contain 10-20 hours of data, leading to poor vocabulary coverage. This limitation becomes evident in downstream applications where domain-specific vocabulary, coupled with frequent code-mixing with English, results in many OOV words. To highlight this problem, we create a benchmark containing OOV words from se…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted at INTERSPEECH 2024