Showing 1–20 of 20 results for author: Do, C

  1. arXiv:2407.04047  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis

    Authors: Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain

    Abstract: This paper investigates the use of unsupervised text-to-speech synthesis (TTS) as a data augmentation method to improve accented speech recognition. TTS systems are trained with a small amount of accented speech training data and their pseudo-labels rather than manual transcriptions, and hence unsupervised. This approach enables the use of accented speech data without manual transcriptions to perf…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to EUSIPCO 2024

  2. arXiv:2405.20431  [pdf, other]

    cs.LG cs.CV

    Exploring the Practicality of Federated Learning: A Survey Towards the Communication Perspective

    Authors: Khiem Le, Nhan Luong-Ha, Manh Nguyen-Duc, Danh Le-Phuoc, Cuong Do, Kok-Seng Wong

    Abstract: Federated Learning (FL) is a promising paradigm that offers significant advancements in privacy-preserving, decentralized machine learning by enabling collaborative training of models across distributed devices without centralizing data. However, the practical deployment of FL systems faces a significant bottleneck: the communication overhead caused by frequently exchanging large model updates bet…

    Submitted 30 May, 2024; originally announced May 2024.

  3. arXiv:2403.15605  [pdf, other]

    cs.CV cs.LG

    Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

    Authors: Khiem Le, Long Ho, Cuong Do, Danh Le-Phuoc, Kok-Seng Wong

    Abstract: Domain shift is a formidable issue in Machine Learning that causes a model to suffer from performance degradation when tested on unseen domains. Federated Domain Generalization (FedDG) attempts to train a global model using collaborative clients in a privacy-preserving manner that can generalize well to unseen clients possibly with domain shift. However, most existing FedDG methods either cause ad…

    Submitted 22 March, 2024; originally announced March 2024.

  4. arXiv:2402.02021  [pdf, other]

    cs.LG cs.CV

    Transfer Learning in ECG Diagnosis: Is It Effective?

    Authors: Cuong V. Nguyen, Cuong D. Do

    Abstract: The adoption of deep learning in ECG diagnosis is often hindered by the scarcity of large, well-labeled datasets in real-world scenarios, leading to the use of transfer learning to leverage features learned from larger datasets. Yet the prevailing assumption that transfer learning consistently outperforms training from scratch has never been systematically validated. In this study, we conduct the…

    Submitted 26 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  5. arXiv:2311.15041  [pdf, other]

    cs.LG cs.AI eess.SP

    MPCNN: A Novel Matrix Profile Approach for CNN-based Sleep Apnea Classification

    Authors: Hieu X. Nguyen, Duong V. Nguyen, Hieu H. Pham, Cuong D. Do

    Abstract: Sleep apnea (SA) is a significant respiratory condition that poses a major global health challenge. Previous studies have investigated several machine and deep learning models for electrocardiogram (ECG)-based SA diagnoses. Despite these advancements, conventional feature extractions derived from ECG signals, such as R-peaks and RR intervals, may fail to capture crucial information encompassed wit…

    Submitted 25 November, 2023; originally announced November 2023.

  6. arXiv:2311.04224  [pdf, other]

    eess.SP cs.CV cs.LG

    MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis

    Authors: Cuong V. Nguyen, Hieu Minh Duong, Cuong D. Do

    Abstract: In practical electrocardiography (ECG) interpretation, the scarcity of well-annotated data is a common challenge. Transfer learning techniques are valuable in such situations, yet the assessment of transferability has received limited attention. To tackle this issue, we introduce MELEP, which stands for Multi-label Expected Log of Empirical Predictions, a measure designed to estimate the effectiven…

    Submitted 12 June, 2024; v1 submitted 27 October, 2023; originally announced November 2023.

    Comments: Accepted to the Journal of Healthcare Informatics Research

  7. arXiv:2308.11621  [pdf, other]

    cs.NI cs.AI

    Reinforcement Learning-based Adaptation and Scheduling Methods for Multi-source DASH

    Authors: Nghia T. Nguyen, Long Luu, Phuong L. Vo, Thi Thanh Sang Nguyen, Cuong T. Do, Ngoc-thanh Nguyen

    Abstract: Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today networks such as content delivery…

    Submitted 25 July, 2023; originally announced August 2023.

    Comments: 19 pages

    MSC Class: 14J60 (Primary) 14F05; 14J26 (Secondary)

    ACM Class: C.2.4; I.2.11

  8. arXiv:2208.07088  [pdf, other]

    cs.CV

    Enhancing Deep Learning-based 3-lead ECG Classification with Heartbeat Counting and Demographic Data Integration

    Authors: Khiem H. Le, Hieu H. Pham, Thao B. T. Nguyen, Tu A. Nguyen, Cuong D. Do

    Abstract: Nowadays, an increasing number of people are being diagnosed with cardiovascular diseases (CVDs), the leading cause of death globally. The gold standard for identifying these heart problems is via electrocardiogram (ECG). The standard 12-lead ECG is widely used in clinical practice and the majority of current research. However, using a lower number of leads can make ECG more pervasive as it can be…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2207.12381

  9. Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks

    Authors: Thao Nguyen, Hieu H. Pham, Huy Khiem Le, Anh Tu Nguyen, Ngoc Tien Thanh, Cuong Do

    Abstract: The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need to develop novel tools to provide rapid and cost-effective screening and diagnosis. Clinical reports indicated that COVID-19 infection may cause cardiac injury, and electrocardiograms (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to utilize ECG signals to detect COVID-19…

    Submitted 5 October, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted with minor revision by Plos One

  10. arXiv:2208.03408  [pdf, other]

    cs.CV

    A novel deep learning-based approach for sleep apnea detection using single-lead ECG signals

    Authors: Anh-Tu Nguyen, Thao Nguyen, Huy-Khiem Le, Huy-Hieu Pham, Cuong Do

    Abstract: Sleep apnea (SA) is a type of sleep disorder characterized by snoring and chronic sleeplessness, which can lead to serious conditions such as high blood pressure, heart failure, and cardiomyopathy (enlargement of the muscle tissue of the heart). The electrocardiogram (ECG) plays a critical role in identifying SA since it might reveal abnormal cardiac activity. Recent research on ECG-based SA detec…

    Submitted 11 September, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: This work has been accepted for publication by the Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2022 (APSIPA ASC 2022)

  11. arXiv:2207.14736  [pdf, other]

    cs.CL cs.SD eess.AS

    Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer

    Authors: Cong-Thanh Do, Mohan Li, Rama Doddipatla

    Abstract: This paper proposes a new approach to perform unsupervised fine-tuning and self-training using unlabeled speech data for recurrent neural network (RNN)-Transducer (RNN-T) end-to-end (E2E) automatic speech recognition (ASR) systems. Conventional systems perform fine-tuning/self-training using ASR hypothesis as the targets when using unlabeled audio data and are susceptible to the ASR performance of…

    Submitted 29 July, 2022; originally announced July 2022.

    Comments: Accepted to Interspeech 2022

  12. arXiv:2207.12381  [pdf, other]

    cs.CV cs.AI

    LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification

    Authors: Khiem H. Le, Hieu H. Pham, Thao BT. Nguyen, Tu A. Nguyen, Tien N. Thanh, Cuong D. Do

    Abstract: Cardiovascular diseases (CVDs) are a group of heart and blood vessel disorders that is one of the most serious dangers to human health, and the number of such patients is still growing. Early and accurate detection plays a key role in successful treatment and intervention. Electrocardiogram (ECG) is the gold standard for identifying a variety of cardiovascular abnormalities. In clinical practices…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Under review at Biomedical Signal Processing and Control

  13. arXiv:2103.15515  [pdf, other]

    cs.CL

    Multiple-hypothesis CTC-based semi-supervised adaptation of end-to-end speech recognition

    Authors: Cong-Thanh Do, Rama Doddipatla, Thomas Hain

    Abstract: This paper proposes an adaptation method for end-to-end speech recognition. In this method, multiple automatic speech recognition (ASR) 1-best hypotheses are integrated in the computation of the connectionist temporal classification (CTC) loss function. The integration of multiple ASR hypotheses helps alleviate the impact of errors in the ASR hypotheses on the computation of the CTC loss when AS…

    Submitted 31 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted at ICASSP 2021

  14. arXiv:2102.04697  [pdf, other]

    eess.AS cs.AI cs.SD

    Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers

    Authors: Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals

    Abstract: Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset. That is, in general, freezing the trained feature extractor (the lower layers) and retraining the classifier (the upper layers) on the same dataset leads to worse performance. In this paper, for the first time, we show that the frozen…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: Accepted by ICASSP 2021

  15. arXiv:1907.01957  [pdf, other]

    eess.AS cs.CL cs.SD

    End-to-End Speech Recognition with High-Frame-Rate Features Extraction

    Authors: Cong-Thanh Do

    Abstract: State-of-the-art end-to-end automatic speech recognition (ASR) extracts acoustic features from input speech signal every 10 ms which corresponds to a frame rate of 100 frames/second. In this report, we investigate the use of high-frame-rate features extraction in end-to-end ASR. High frame rates of 200 and 400 frames/second are used in the features extraction and provide additional information for…

    Submitted 12 July, 2019; v1 submitted 3 July, 2019; originally announced July 2019.

  16. arXiv:1810.03303  [pdf, other]

    cs.RO

    Accurate Pouring with an Autonomous Robot Using an RGB-D Camera

    Authors: Chau Do, Wolfram Burgard

    Abstract: Robotic assistants in a home environment are expected to perform various complex tasks for their users. One particularly challenging task is pouring drinks into cups, which for successful completion, requires the detection and tracking of the liquid level during a pour to determine when to stop. In this paper, we present a novel approach to autonomous pouring that tracks the liquid level using an…

    Submitted 8 October, 2018; originally announced October 2018.

    Comments: 12 pages

  17. arXiv:1707.06633  [pdf, other]

    cs.AI cs.CV cs.HC cs.LG cs.RO

    Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication Skills

    Authors: Felix Burget, Lukas Dominique Josef Fiederer, Daniel Kuhner, Martin Völker, Johannes Aldinger, Robin Tibor Schirrmeister, Chau Do, Joschka Boedecker, Bernhard Nebel, Tonio Ball, Wolfram Burgard

    Abstract: As autonomous service robots become more affordable and thus available also for the general public, there is a growing need for user friendly interfaces to control the robotic system. Currently available control modalities typically expect users to be able to express their desire through either touch, speech or gesture commands. While this requirement is fulfilled for the majority of users, paraly…

    Submitted 12 June, 2018; v1 submitted 20 July, 2017; originally announced July 2017.

    Comments: * FB, LDJF, DK, MV and JA contributed equally to the work. Accepted as a conference paper at the European Conference on Mobile Robotics 2017 (ECMR 2017), 6 pages, 3 figures

    ACM Class: I.2.4; I.2.6; I.2.8; I.2.9; I.2.10; I.4.8; I.5.1

    Journal ref: 2017 European Conference on Mobile Robots (ECMR)

  18. arXiv:1406.2015  [pdf, other]

    cs.IR cs.CY cs.DB

    MOOCdb: Developing Standards and Systems to Support MOOC Data Science

    Authors: Kalyan Veeramachaneni, Sherif Halawa, Franck Dernoncourt, Una-May O'Reilly, Colin Taylor, Chuong Do

    Abstract: We present a shared data model for enabling data science in Massive Open Online Courses (MOOCs). The model captures students' interactions with the online platform. The data model is platform agnostic and is based on some basic core actions that students take on an online learning platform. Students usually interact with the platform in four different modes: Observing, Submitting, Collaborating and…

    Submitted 8 June, 2014; originally announced June 2014.

  19. arXiv:1307.2579  [pdf, other]

    cs.LG cs.AI cs.HC stat.AP stat.ML

    Tuned Models of Peer Assessment in MOOCs

    Authors: Chris Piech, Jonathan Huang, Zhenghao Chen, Chuong Do, Andrew Ng, Daphne Koller

    Abstract: In massive open online courses (MOOCs), peer grading serves as a critical tool for scaling the grading of complex, open-ended assignments to courses with tens or hundreds of thousands of students. But despite promising initial trials, it does not always deliver accurate results compared to human experts. In this paper, we develop algorithms for estimating and correcting for grader biases and relia…

    Submitted 9 July, 2013; originally announced July 2013.

    Comments: Proceedings of The 6th International Conference on Educational Data Mining (EDM 2013)

  20. arXiv:1307.1568  [pdf]

    cs.AI

    Using MathML to Represent Units of Measurement for Improved Ontology Alignment

    Authors: Chau Do, Eric J. Pauwels

    Abstract: Ontologies provide a formal description of concepts and their relationships in a knowledge domain. The goal of ontology alignment is to identify semantically matching concepts and relationships across independently developed ontologies that purport to describe the same knowledge. In order to handle the widest possible class of ontologies, many alignment algorithms rely on terminological and struct…

    Submitted 5 July, 2013; originally announced July 2013.

    Comments: Conferences on Intelligent Computer Mathematics (CICM 2013), Bath, England

    Journal ref: CICM 2013, LNAI (7961), Springer, 2013