-
WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding
Authors:
Mohan Li,
Cong-Thanh Do,
Simon Keizer,
Youmna Farag,
Svetlana Stoyanchev,
Rama Doddipatla
Abstract:
Speech large language models (speech-LLMs) integrate speech and text-based foundation models to provide a unified framework for handling a wide range of downstream tasks. In this paper, we introduce WHISMA, a speech-LLM tailored for spoken language understanding (SLU) that demonstrates robust performance in various zero-shot settings. WHISMA combines the speech encoder from Whisper with the Llama-3 LLM, and is fine-tuned in a parameter-efficient manner on a comprehensive collection of SLU-related datasets. Our experiments show that WHISMA significantly improves the zero-shot slot filling performance on the SLURP benchmark, achieving a relative gain of 26.6% compared to the current state-of-the-art model. Furthermore, to evaluate WHISMA's generalisation capabilities to unseen domains, we develop a new task-agnostic benchmark named SLU-GLUE. The evaluation results indicate that WHISMA outperforms an existing speech-LLM (Qwen-Audio) with a relative gain of 33.0%.
Submitted 29 August, 2024;
originally announced August 2024.
-
ESAC (EQ-SANS Assisting Chatbot): Application of Large Language Models and Retrieval-Augmented Generation for Enhanced User Experience at EQ-SANS
Authors:
Changwoo Do,
Gergely Nagy,
William T. Heller
Abstract:
Neutron scattering experiments have played vital roles in exploring materials properties in the past decades. While user interfaces have been improved over time, neutron scattering experiments still require specific knowledge or training by an expert due to the complexity of such advanced instrumentation and the limited number of experiments each person may perform each year. This paper introduces an innovative chatbot application that leverages Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) technologies to significantly enhance the user experience at EQ-SANS, a small-angle neutron scattering instrument at the Spallation Neutron Source of Oak Ridge National Laboratory. Through a user-centric design approach, the EQ-SANS Assisting Chatbot (ESAC) serves as an interactive reference for users, thereby facilitating the use of the instrument by visiting scientists. By bridging the gap between the users of EQ-SANS and the control systems required to perform their experiments, ESAC sets a new standard for interactive learning and support for the scientific community using large-scale scientific facilities.
Submitted 26 July, 2024;
originally announced July 2024.
-
Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis
Authors:
Cong-Thanh Do,
Shuhei Imai,
Rama Doddipatla,
Thomas Hain
Abstract:
This paper investigates the use of unsupervised text-to-speech synthesis (TTS) as a data augmentation method to improve accented speech recognition. The TTS systems are trained with a small amount of accented speech training data and their pseudo-labels rather than manual transcriptions, and are hence unsupervised. This approach enables the use of accented speech data without manual transcriptions to perform data augmentation for accented speech recognition. Synthetic accented speech data, generated from text prompts by using the TTS systems, are then combined with available non-accented speech data to train automatic speech recognition (ASR) systems. ASR experiments are performed in a self-supervised learning framework using a Wav2vec2.0 model which was pre-trained on a large amount of unsupervised accented speech data. The accented speech data for training the unsupervised TTS are read speech, selected from the L2-ARCTIC and British Isles corpora, while spontaneous conversational speech from the Edinburgh international accents of English corpus is used as the evaluation data. Experimental results show that Wav2vec2.0 models which are fine-tuned to the downstream ASR task with synthetic accented speech data, generated by the unsupervised TTS, yield up to 6.1% relative word error rate reductions compared to a Wav2vec2.0 baseline which is fine-tuned with the non-accented speech data from the Librispeech corpus.
Submitted 4 July, 2024;
originally announced July 2024.
-
Exploring the Practicality of Federated Learning: A Survey Towards the Communication Perspective
Authors:
Khiem Le,
Nhan Luong-Ha,
Manh Nguyen-Duc,
Danh Le-Phuoc,
Cuong Do,
Kok-Seng Wong
Abstract:
Federated Learning (FL) is a promising paradigm that offers significant advancements in privacy-preserving, decentralized machine learning by enabling collaborative training of models across distributed devices without centralizing data. However, the practical deployment of FL systems faces a significant bottleneck: the communication overhead caused by frequently exchanging large model updates between numerous devices and a central server. This communication inefficiency can hinder training speed, model performance, and the overall feasibility of real-world FL applications. In this survey, we investigate various strategies and advancements made in communication-efficient FL, highlighting their impact and potential to overcome the communication challenges inherent in FL systems. Specifically, we define measures for communication efficiency, analyze sources of communication inefficiency in FL systems, and provide a taxonomy and comprehensive review of state-of-the-art communication-efficient FL methods. Additionally, we discuss promising future research directions for enhancing the communication efficiency of FL systems. By addressing the communication bottleneck, FL can be effectively applied and enable scalable and practical deployment across diverse applications that require privacy-preserving, decentralized machine learning, such as IoT, healthcare, or finance.
Submitted 30 May, 2024;
originally announced May 2024.
-
Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization
Authors:
Khiem Le,
Long Ho,
Cuong Do,
Danh Le-Phuoc,
Kok-Seng Wong
Abstract:
Domain shift is a formidable issue in Machine Learning that causes a model to suffer from performance degradation when tested on unseen domains. Federated Domain Generalization (FedDG) attempts to train, in a privacy-preserving manner and using collaborative clients, a global model that can generalize well to unseen clients possibly with domain shift. However, most existing FedDG methods either cause additional privacy risks of data leakage or induce significant costs in client communication and computation, which are major concerns in the Federated Learning paradigm. To circumvent these challenges, here we introduce a novel architectural method for FedDG, namely gPerXAN, which relies on a normalization scheme working with a guiding regularizer. In particular, we carefully design Personalized eXplicitly Assembled Normalization to make client models selectively filter domain-specific features that are biased towards local data while retaining discrimination of those features. Then, we incorporate a simple yet effective regularizer to guide these models in directly capturing domain-invariant representations that the global model's classifier can leverage. Extensive experimental results on two benchmark datasets, i.e., PACS and Office-Home, and a real-world medical dataset, Camelyon17, indicate that our proposed method outperforms other existing methods in addressing this particular problem.
Submitted 22 March, 2024;
originally announced March 2024.
-
Transfer Learning in ECG Diagnosis: Is It Effective?
Authors:
Cuong V. Nguyen,
Cuong D. Do
Abstract:
The adoption of deep learning in ECG diagnosis is often hindered by the scarcity of large, well-labeled datasets in real-world scenarios, leading to the use of transfer learning to leverage features learned from larger datasets. Yet the prevailing assumption that transfer learning consistently outperforms training from scratch has never been systematically validated. In this study, we conduct the first extensive empirical study on the effectiveness of transfer learning in multi-label ECG classification, by comparing the fine-tuning performance with that of training from scratch, covering a variety of ECG datasets and deep neural networks. We confirm that fine-tuning is the preferable choice for small downstream datasets; however, when the dataset is sufficiently large, training from scratch can achieve comparable performance, albeit requiring a longer training time to catch up. Furthermore, we find that transfer learning exhibits better compatibility with convolutional neural networks than with recurrent neural networks, which are the two most prevalent architectures for time-series ECG applications. Our results underscore the importance of transfer learning in ECG diagnosis, yet depending on the amount of available data, researchers may opt not to use it, considering the non-negligible cost associated with pre-training.
Submitted 26 June, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
MPCNN: A Novel Matrix Profile Approach for CNN-based Sleep Apnea Classification
Authors:
Hieu X. Nguyen,
Duong V. Nguyen,
Hieu H. Pham,
Cuong D. Do
Abstract:
Sleep apnea (SA) is a significant respiratory condition that poses a major global health challenge. Previous studies have investigated several machine and deep learning models for electrocardiogram (ECG)-based SA diagnoses. Despite these advancements, conventional feature extractions derived from ECG signals, such as R-peaks and RR intervals, may fail to capture crucial information encompassed within the complete PQRST segments. In this study, we propose an innovative approach to address this diagnostic gap by delving deeper into the comprehensive segments of the ECG signal. The proposed methodology draws inspiration from Matrix Profile algorithms, which generate a Euclidean distance profile from fixed-length signal subsequences. From this, we derived the Min Distance Profile (MinDP), Max Distance Profile (MaxDP), and Mean Distance Profile (MeanDP) based on the minimum, maximum, and mean of the profile distances, respectively. To validate the effectiveness of our approach, we use the modified LeNet-5 architecture as the primary CNN model, along with two existing lightweight models, BAFNet and SE-MSCNN, for ECG classification tasks. Our extensive experimental results on the PhysioNet Apnea-ECG dataset revealed that with the new feature extraction method, we achieved a per-segment accuracy of up to 92.11% and a per-recording accuracy of 100%. Moreover, it yielded the highest correlation compared to state-of-the-art methods, with a correlation coefficient of 0.989. By introducing a new feature extraction method based on distance relationships, we enhanced the performance of certain lightweight models, showing potential for home sleep apnea test (HSAT) and SA detection in IoT devices. The source code for this work is made publicly available on GitHub: https://github.com/vinuni-vishc/MPCNN-Sleep-Apnea.
Submitted 25 November, 2023;
originally announced November 2023.
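The distance-profile features named in the MPCNN abstract above (MinDP, MaxDP, MeanDP) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `distance_profiles`, the plain (non-normalized) Euclidean distance, and the choice to exclude only the self-match are assumptions made for illustration. Matrix Profile implementations typically also z-normalize subsequences and skip a wider exclusion zone around each position.

```python
import math


def distance_profiles(signal, m):
    """Return (min_dp, max_dp, mean_dp) for subsequences of length m.

    Hypothetical helper: for each subsequence, compute the Euclidean
    distance to every other subsequence, then summarize those distances
    by their minimum, maximum, and mean.
    """
    n = len(signal) - m + 1  # number of fixed-length subsequences
    if m < 1 or n < 2:
        raise ValueError("need at least two subsequences of length m")

    subs = [signal[i:i + m] for i in range(n)]
    min_dp, max_dp, mean_dp = [], [], []
    for i in range(n):
        # Distances from subsequence i to all others (self-match excluded).
        dists = [math.dist(subs[i], subs[j]) for j in range(n) if j != i]
        min_dp.append(min(dists))
        max_dp.append(max(dists))
        mean_dp.append(sum(dists) / len(dists))
    return min_dp, max_dp, mean_dp


if __name__ == "__main__":
    # Toy periodic signal, not ECG data.
    mins, maxs, means = distance_profiles([0, 1, 0, 1, 0, 1], m=2)
    print(mins, maxs, means)
```

The brute-force double loop costs O(n² · m), which is enough to show the idea on short segments; dedicated Matrix Profile algorithms exist to compute the same distances much faster.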
-
MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis
Authors:
Cuong V. Nguyen,
Hieu Minh Duong,
Cuong D. Do
Abstract:
In practical electrocardiography (ECG) interpretation, the scarcity of well-annotated data is a common challenge. Transfer learning techniques are valuable in such situations, yet the assessment of transferability has received limited attention. To tackle this issue, we introduce MELEP, which stands for Multi-label Expected Log of Empirical Predictions, a measure designed to estimate the effectiveness of knowledge transfer from a pre-trained model to a downstream multi-label ECG diagnosis task. MELEP is generic, working with new target data with different label sets, and computationally efficient, requiring only a single forward pass through the pre-trained model. To the best of our knowledge, MELEP is the first transferability metric specifically designed for multi-label ECG classification problems. Our experiments show that MELEP can predict the performance of pre-trained convolutional and recurrent deep neural networks on small and imbalanced ECG data. Specifically, we observed strong correlation coefficients (with absolute values exceeding 0.6 in most cases) between MELEP and the actual average F1 scores of the fine-tuned models. Our work highlights the potential of MELEP to expedite the selection of suitable pre-trained models for ECG diagnosis tasks, saving time and effort that would otherwise be spent on fine-tuning these models.
Submitted 12 June, 2024; v1 submitted 27 October, 2023;
originally announced November 2023.
-
Neutron Scattering Cross-Section Correction Incorporating Neutron Wavelength Effects
Authors:
Karrie E. An,
Guan-Rong Huang,
Changwoo Do,
Wei-Ren Chen
Abstract:
This study outlines a numerical methodology aimed at rectifying the neutron scattering cross-sections of fundamental elements across a range of low neutron energies typically employed in general neutron scattering experiments. By using the experimental power law relationship governing the cross-section's dependence on neutron wavelength, we establish a mathematical connection between these two variables. Leveraging this relationship, the scheme of central moment expansion is adopted to correct the cross-sections that are applicable to general neutron wavelength distributions commonly encountered in experimental scenarios. Importantly, our proposed method eliminates the requirement for knowledge about the functional form of the distribution. Consequently, this approach offers the capability to reconstruct neutron scattering data without introducing distortions stemming from the energy-dependent cross-sections of different types of elements within materials during experimental measurements. Ultimately, this advancement facilitates a more precise interpretation and analysis of material structures based on their scattering signatures.
Submitted 26 September, 2023;
originally announced September 2023.
-
Reinforcement Learning-based Adaptation and Scheduling Methods for Multi-source DASH
Authors:
Nghia T. Nguyen,
Long Luu,
Phuong L. Vo,
Thi Thanh Sang Nguyen,
Cuong T. Do,
Ngoc-thanh Nguyen
Abstract:
Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today, networks such as content delivery networks, edge caching networks, and content-centric networks usually replicate video content on multiple cache nodes. We study video streaming from multiple sources in this work. In multi-source streaming, video chunks may arrive out of order due to different conditions of the network paths. Hence, to guarantee a high QoE, the video client needs not only rate adaptation but also chunk scheduling. Reinforcement learning (RL) has emerged as the state-of-the-art control method in various fields in recent years. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS). We also build a simulation environment for training and evaluation. The efficiency of the proposed algorithms is demonstrated via extensive simulations with real-trace data.
Submitted 25 July, 2023;
originally announced August 2023.
-
Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies
Authors:
James Paul Mason,
Alexandra Werth,
Colin G. West,
Allison A. Youngblood,
Donald L. Woodraska,
Courtney Peck,
Kevin Lacjak,
Florian G. Frick,
Moutamen Gabir,
Reema A. Alsinan,
Thomas Jacobsen,
Mohammad Alrubaie,
Kayla M. Chizmar,
Benjamin P. Lau,
Lizbeth Montoya Dominguez,
David Price,
Dylan R. Butler,
Connor J. Biron,
Nikita Feoktistov,
Kai Dewey,
N. E. Loomis,
Michal Bodzianowski,
Connor Kuybus,
Henry Dietrick,
Aubrey M. Wolfe
, et al. (977 additional authors not shown)
Abstract:
Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.
Submitted 9 May, 2023;
originally announced May 2023.
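The role of the critical threshold $α=2$ in the abstract above can be made explicit with a short worked integral. This is a sketch under stated assumptions rather than the authors' derivation: the normalization $A$ and the energy bounds $E_{\min}$ and $E_{\max}$ are symbols introduced here for illustration, and the flare frequency distribution is assumed to be a pure power law over that range.

```latex
% Assumed power-law flare frequency distribution (A, E_min, E_max are illustrative symbols)
\frac{dN}{dE} = A\,E^{-\alpha}

% Total energy released by flares with energies between E_min and E_max, for \alpha \neq 2
W = \int_{E_{\min}}^{E_{\max}} E\,\frac{dN}{dE}\,dE
  = \frac{A}{2-\alpha}\left(E_{\max}^{\,2-\alpha} - E_{\min}^{\,2-\alpha}\right)

% \alpha > 2: the E_min term dominates, so the smallest flares carry most of the energy
% \alpha < 2: the E_max term dominates, so the largest flares carry most of the energy
```

With the reported $α= 1.63 \pm 0.03$, the exponent $2-α$ is positive, so under these assumptions the total is dominated by the high-energy end of the distribution rather than by nanoflares.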
-
Polarization From A Radially Stratified Off-Axis GRB Outflow
Authors:
A. C. Caligula do E. S. Pedreira,
N. Fraija,
A. Galvan-Gamez,
B. Betancourt Kamenetskaia,
S. Dichiara,
M. G. Dainotti,
R. L. Becerra,
P. Veres
Abstract:
While the dominant radiation mechanism of gamma-ray bursts (GRBs) remains a question of debate, synchrotron emission is one of the foremost candidates to describe the multi-wavelength afterglow observations. As such, it is expected that GRBs should present some degree of polarization across their evolution, presenting a feasible means of probing these bursts' energetic and angular properties. Although obtaining polarization data is difficult due to the inherent complexities regarding GRB observations, advances are being made, and theoretical modeling of synchrotron polarization is now more relevant than ever. In this manuscript, we present the polarization for a fiducial model where the synchrotron forward-shock emission evolving in the radiative-adiabatic regime is described by a radially stratified off-axis outflow. This is parameterized with a power-law velocity distribution and decelerated in a constant-density and wind-like external environment. We apply this theoretical polarization model for selected bursts presenting evidence of off-axis afterglow emission, including the nearest orphan GRB candidates observed by the Neil Gehrels Swift Observatory and a few gravitational-wave (GW) events that could generate electromagnetic emission. In the case of GRB 170817A, we require the available polarimetric upper limits in radio wavelengths to constrain its magnetic field geometry.
Submitted 22 November, 2022;
originally announced November 2022.
-
Afterglow Polarization from Off-Axis GRB Jets
Authors:
A. C. Caligula do E. S. Pedreira,
N. Fraija,
A. Galvan-Gamez,
B. Betancourt Kamenetskaia,
P. Veres,
M. G. Dainotti,
S. Dichiara,
R. L. Becerra
Abstract:
As we further our studies on gamma-ray bursts (GRBs), both on theoretical models and observational tools, more and more options begin to open for exploration of their physical properties. As transient events primarily dominated by synchrotron radiation, it is expected that the synchrotron photons emitted by GRBs should present some degree of polarization throughout the evolution of the burst. Whereas observing this polarization can still be challenging due to the constraints on observational tools, especially for short GRBs, it is paramount that the groundwork is laid for the day we have abundant data. In this work, we present a polarization model linked with an off-axis spreading top-hat jet synchrotron scenario in a stratified environment with a density profile $n(r)\propto r^ {-k}$. We present this model's expected temporal polarization evolution for a realistic set of afterglow parameters constrained within the values observed in the GRB literature for four degrees of stratification $k=0,1,1.5 {\rm \, and\,} 2$ and two magnetic field configurations with extreme anisotropy. We apply this model and predict polarization from a set of GRBs exhibiting off-axis afterglow emission. In particular, for GRB 170817A, we use the available polarimetric upper limits to rule out the possibility of an extremely anisotropic configuration for the magnetic field.
Submitted 2 November, 2022;
originally announced November 2022.
-
Exploring the Early Afterglow Polarization of GRB 190829A
Authors:
A. C. Caligula do E. S. Pedreira,
N. Fraija,
S. Dichiara,
P. Veres,
M. G. Dainotti,
A. Galvan-Gamez,
R. L. Becerra,
B. Betancourt Kamenetskaia
Abstract:
GRB 190829A has been widely studied due to its nature and the high-energy emission it presented. Due to the detection of a very-high-energy component by the High Energy Stereoscopic System and the event's atypically middling luminosity, it has been categorized in a select, limited group of bursts bordering classic GRBs and nearby sub-energetic events. Given the range of models utilized to adequately characterize the afterglow of this burst, it has proven challenging to identify the most probable explanation. Nevertheless, the detection of polarization data provided by the MASTER collaboration has added a new aspect to GRB 190829A that permits us to attempt to explore this degeneracy. In this paper, we present a polarization model coupled with a synchrotron forward-shock model -- a component in all models used to describe GRB 190829A's afterglow -- in order to fit the polarization's temporal evolution with the existing upper limits ($Π< 6\%$). We find that the polarization generated from an on-axis emission is favored for strongly anisotropic magnetic field ratios, while an off-axis scenario cannot be fully ruled out when a more isotropic framework is taken into account.
Submitted 23 October, 2022;
originally announced October 2022.
-
Enhancing Deep Learning-based 3-lead ECG Classification with Heartbeat Counting and Demographic Data Integration
Authors:
Khiem H. Le,
Hieu H. Pham,
Thao B. T. Nguyen,
Tu A. Nguyen,
Cuong D. Do
Abstract:
Nowadays, an increasing number of people are being diagnosed with cardiovascular diseases (CVDs), the leading cause of death globally. The gold standard for identifying these heart problems is the electrocardiogram (ECG). The standard 12-lead ECG is widely used in clinical practice and the majority of current research. However, using a lower number of leads can make ECG more pervasive as it can be integrated with portable or wearable devices. This article introduces two novel techniques to improve the performance of the current deep learning system for 3-lead ECG classification, making it comparable with models that are trained using standard 12-lead ECG. Specifically, we propose a multi-task learning scheme in the form of heartbeat-count regression and an effective mechanism to integrate patient demographic data into the system. With these two advancements, we achieved F1 scores of 0.9796 and 0.8140 on two large-scale ECG datasets, i.e., Chapman and CPSC-2018, respectively, which surpassed current state-of-the-art ECG classification methods, even those trained on 12-lead data. To encourage further development, our source code is publicly available at https://github.com/lhkhiem28/LightX3ECG.
Submitted 15 August, 2022;
originally announced August 2022.
-
Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks
Authors:
Thao Nguyen,
Hieu H. Pham,
Huy Khiem Le,
Anh Tu Nguyen,
Ngoc Tien Thanh,
Cuong Do
Abstract:
The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need to develop novel tools to provide rapid and cost-effective screening and diagnosis. Clinical reports indicated that COVID-19 infection may cause cardiac injury, and electrocardiograms (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to utilize ECG signals to detect COVID-19 automatically. We propose a novel method to extract ECG signals from ECG paper records, which are then fed into a one-dimensional convolutional neural network (1D-CNN) to learn and diagnose the disease. To evaluate the quality of digitized signals, R peaks in the paper-based ECG images are labeled. Afterward, RR intervals calculated from each image are compared to RR intervals of the corresponding digitized signal. Experiments on the COVID-19 ECG images dataset demonstrate that the proposed digitization method is able to correctly capture the original signals, with a mean absolute error of 28.11 ms. Our proposed 1D-CNN model, which is trained on the digitized ECG signals, accurately distinguishes individuals with COVID-19 from other subjects, with classification accuracies of 98.42%, 95.63%, and 98.50% for classifying COVID-19 vs. Normal, COVID-19 vs. Abnormal Heartbeats, and COVID-19 vs. other classes, respectively. Furthermore, the proposed method also achieves a high level of performance for the multi-classification task. Our findings indicate that a deep learning system trained on digitized ECG signals can serve as a potential tool for diagnosing COVID-19.
△ Less
Submitted 5 October, 2022; v1 submitted 10 August, 2022;
originally announced August 2022.
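The RR-interval comparison described in the abstract above can be illustrated with a minimal sketch. It assumes R-peak times are available in seconds for both the paper-based image and the digitized signal; the function names and sample values are hypothetical and are not taken from the paper.

```python
def rr_intervals_ms(r_peak_times_s):
    """Successive RR intervals in milliseconds from R-peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def rr_mae_ms(reference_peaks_s, digitized_peaks_s):
    """Mean absolute error (ms) between two RR-interval sequences.

    Assumes both inputs contain the same number of R peaks, matched in order.
    """
    ref = rr_intervals_ms(reference_peaks_s)
    dig = rr_intervals_ms(digitized_peaks_s)
    if len(ref) != len(dig) or not ref:
        raise ValueError("need matching, non-empty RR-interval sequences")
    return sum(abs(r - d) for r, d in zip(ref, dig)) / len(ref)


# Hypothetical R-peak times (seconds), for illustration only.
labeled = [0.00, 0.80, 1.60, 2.40]
digitized = [0.00, 0.82, 1.60, 2.43]
mae = rr_mae_ms(labeled, digitized)
```

A lower value means the digitized trace preserves beat-to-beat timing more faithfully; how the paper matches peaks between image and signal is not stated in the abstract.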
-
A novel deep learning-based approach for sleep apnea detection using single-lead ECG signals
Authors:
Anh-Tu Nguyen,
Thao Nguyen,
Huy-Khiem Le,
Huy-Hieu Pham,
Cuong Do
Abstract:
Sleep apnea (SA) is a type of sleep disorder characterized by snoring and chronic sleeplessness, which can lead to serious conditions such as high blood pressure, heart failure, and cardiomyopathy (enlargement of the muscle tissue of the heart). The electrocardiogram (ECG) plays a critical role in identifying SA since it might reveal abnormal cardiac activity. Recent research on ECG-based SA detection has focused on feature engineering techniques that extract specific characteristics from multiple-lead ECG signals and use them as classification model inputs. In this study, a novel method of feature extraction based on the detection of S peaks is proposed to enhance the detection of adjacent SA segments using a single-lead ECG. In particular, ECG features collected from a single lead (V2) are used to identify SA episodes. A CNN model is then trained on the extracted features to detect SA. Experimental results demonstrate that the proposed method detects SA from single-lead ECG data more accurately than existing state-of-the-art methods, with 91.13% classification accuracy, 92.58% sensitivity, and 88.75% specificity. Moreover, the additional use of features associated with the S peaks improves the classification accuracy by 0.85%. Our findings indicate that the proposed machine learning system has the potential to be an effective method for detecting SA episodes.
Submitted 11 September, 2022; v1 submitted 5 August, 2022;
originally announced August 2022.
-
Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer
Authors:
Cong-Thanh Do,
Mohan Li,
Rama Doddipatla
Abstract:
This paper proposes a new approach to perform unsupervised fine-tuning and self-training using unlabeled speech data for recurrent neural network (RNN)-Transducer (RNN-T) end-to-end (E2E) automatic speech recognition (ASR) systems. Conventional systems perform fine-tuning/self-training using ASR hypotheses as the targets when using unlabeled audio data and are susceptible to the ASR performance of the base model. Here, in order to alleviate the influence of ASR errors while using unlabeled data, we propose a multiple-hypothesis RNN-T loss that incorporates multiple ASR 1-best hypotheses into the loss function. For the fine-tuning task, ASR experiments on Librispeech show that the multiple-hypothesis approach achieves a relative reduction of 14.2% word error rate (WER) when compared to the single-hypothesis approach, on the test_other set. For the self-training task, ASR models are trained using supervised data from Wall Street Journal (WSJ), Aurora-4 along with CHiME-4 real noisy data as unlabeled data. The multiple-hypothesis approach yields a relative reduction of 3.3% WER on the CHiME-4 single-channel real noisy evaluation set when compared with the single-hypothesis approach.
Submitted 29 July, 2022;
originally announced July 2022.
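For readers unfamiliar with the metric, the "relative reduction" in WER quoted above is conventionally the drop in WER divided by the baseline WER. The sketch below shows that arithmetic only; the function name and the absolute WER values are hypothetical, since the abstract reports only the relative figures.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical absolute WERs (percent), chosen only to illustrate the formula.
reduction = relative_wer_reduction(baseline_wer=10.0, new_wer=8.58)
```

With these illustrative inputs the result is about 14.2%, which shows how a modest absolute change can correspond to a double-digit relative change.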
-
LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification
Authors:
Khiem H. Le,
Hieu H. Pham,
Thao BT. Nguyen,
Tu A. Nguyen,
Tien N. Thanh,
Cuong D. Do
Abstract:
Cardiovascular diseases (CVDs) are a group of heart and blood vessel disorders that are among the most serious dangers to human health, and the number of such patients is still growing. Early and accurate detection plays a key role in successful treatment and intervention. The electrocardiogram (ECG) is the gold standard for identifying a variety of cardiovascular abnormalities. In clinical practice and most current research, the standard 12-lead ECG is mainly used. However, using a lower number of leads can make ECG more prevalent as it can be conveniently recorded by portable or wearable devices. In this research, we develop a novel deep learning system to accurately identify multiple cardiovascular abnormalities by using only three ECG leads.
Submitted 25 July, 2022;
originally announced July 2022.
-
Spatial correlations of entangled polymer dynamics
Authors:
Jihong Ma,
Jan-Michael Y. Carrillo,
Changwoo Do,
Wei-Ren Chen,
Péter Falus,
Zhiqiang Shen,
Kunlun Hong,
Bobby G. Sumpter,
Yangyang Wang
Abstract:
The spatial correlations of entangled polymer dynamics are examined by molecular dynamics simulations and neutron spin-echo spectroscopy. Due to the soft nature of topological constraints, the initial spatial decays of intermediate scattering functions of entangled chains are, to the first approximation, surprisingly similar to those of an unentangled system in the functional forms. However, entanglements reveal themselves as a long tail in the reciprocal-space correlations, implying a weak but persistent dynamic localization in real space. Comparison with a number of existing theoretical models of entangled polymers suggests that they cannot fully describe the spatial correlations revealed by simulations and experiments. In particular, the strict one-dimensional diffusion idea of the original tube model is shown to be flawed. The dynamic spatial correlation analysis demonstrated in this work provides a useful tool for interrogating the dynamics of entangled polymers. Lastly, the failure of the investigated models to even qualitatively predict the spatial correlations of collective single-chain density fluctuations points to a possible critical role of incompressibility in polymer melt dynamics.
Submitted 2 August, 2021;
originally announced August 2021.
-
Robust multi-sensor Generalized Labeled Multi-Bernoulli filter
Authors:
Cong-Thanh Do,
Tran Thien Dat Nguyen,
Hoa Van Nguyen
Abstract:
This paper proposes an efficient and robust algorithm to estimate target trajectories with unknown target detection profiles and clutter rates using measurements from multiple sensors. In particular, we propose to combine the multi-sensor Generalized Labeled Multi-Bernoulli (MS-GLMB) filter to estimate target trajectories and robust Cardinalized Probability Hypothesis Density (CPHD) filters to estimate the clutter rates. The target detection probability is augmented to the filtering state space for joint estimation. Experimental results show that the proposed robust filter exhibits near-optimal performance in the sense that it is comparable to the optimal MS-GLMB operating with true clutter rate and detection probability. More importantly, it outperforms other studied filters when the detection profile and clutter rate are unknown and time-variant. This is attributed to the ability of the robust filter to learn the background parameters on-the-fly.
Submitted 3 November, 2021; v1 submitted 31 May, 2021;
originally announced June 2021.
-
Conductance-based Dynamic Causal Modeling: A mathematical review of its application to cross-power spectral densities
Authors:
Inês Pereira,
Stefan Frässle,
Jakob Heinzle,
Dario Schöbi,
Cao Tri Do,
Moritz Gruber,
Klaas E. Stephan
Abstract:
Dynamic Causal Modeling (DCM) is a Bayesian framework for inferring on hidden (latent) neuronal states, based on measurements of brain activity. Since its introduction in 2003 for functional magnetic resonance imaging data, DCM has been extended to electrophysiological data, and several variants have been developed. Their biophysically motivated formulations make these models promising candidates for providing a mechanistic understanding of human brain dynamics, both in health and disease. However, due to their complexity and reliance on concepts from several fields, fully understanding the mathematical and conceptual basis behind certain variants of DCM can be challenging. At the same time, a solid theoretical knowledge of the models is crucial to avoid pitfalls in the application of these models and interpretation of their results. In this paper, we focus on one of the most advanced formulations of DCM, i.e. conductance-based DCM for cross-spectral densities, whose components are described across multiple technical papers. The aim of the present article is to provide an accessible exposition of the mathematical background, together with an illustration of the model's behavior. To this end, we include step-by-step derivations of the model equations, point to important aspects in the software implementation of those models, and use simulations to provide an intuitive understanding of the type of responses that can be generated and the role that specific parameters play in the model. Furthermore, all code utilized for our simulations is made publicly available alongside the manuscript to allow readers an easy hands-on experience with conductance-based DCM.
Submitted 7 April, 2021;
originally announced April 2021.
-
Multiple-hypothesis CTC-based semi-supervised adaptation of end-to-end speech recognition
Authors:
Cong-Thanh Do,
Rama Doddipatla,
Thomas Hain
Abstract:
This paper proposes an adaptation method for end-to-end speech recognition. In this method, multiple automatic speech recognition (ASR) 1-best hypotheses are integrated in the computation of the connectionist temporal classification (CTC) loss function. The integration of multiple ASR hypotheses helps alleviate the impact of errors in the ASR hypotheses on the computation of the CTC loss when ASR hypotheses are used. When applied in semi-supervised adaptation scenarios where part of the adaptation data does not have labels, the CTC loss of the proposed method is computed from different ASR 1-best hypotheses obtained by decoding the unlabeled adaptation data. Experiments are performed in clean and multi-condition training scenarios where the CTC-based end-to-end ASR systems are trained on Wall Street Journal (WSJ) clean training data and CHiME-4 multi-condition training data, respectively, and tested on Aurora-4 test data. The proposed adaptation method yields 6.6% and 5.8% relative word error rate (WER) reductions in clean and multi-condition training scenarios, respectively, compared to a baseline system which is adapted with part of the adaptation data having manual transcriptions using back-propagation fine-tuning.
Submitted 31 March, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
-
A Machine Learning Inversion Scheme for Determining Interaction from Scattering
Authors:
Chi-Huan Tung,
Shou-Yi Chang,
Jan-Michael Carrillo,
Bobby G. Sumpter,
Changwoo Do,
Wei-Ren Chen
Abstract:
We outline a machine learning strategy for determining the effective interaction in the condensed phases of matter using scattering. Via a case study of colloidal suspensions, we showed that the effective potential can be probabilistically inferred from the scattering spectra without any restriction imposed by model assumptions. Comparisons to existing parametric approaches demonstrate the superior performance of this method in accuracy, efficiency, and applicability. This method can effectively enable quantification of interaction in highly correlated systems using scattering and diffraction experiments.
Submitted 27 March, 2021;
originally announced March 2021.
-
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
Authors:
Shucong Zhang,
Cong-Thanh Do,
Rama Doddipatla,
Erfan Loweimi,
Peter Bell,
Steve Renals
Abstract:
Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset. That is, in general, freezing the trained feature extractor (the lower layers) and retraining the classifier (the upper layers) on the same dataset leads to worse performance. In this paper, for the first time, we show that the frozen classifier is transferable within the same dataset. We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers. We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks. The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
Submitted 9 February, 2021;
originally announced February 2021.
-
VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations
Authors:
Ha Q. Nguyen,
Khanh Lam,
Linh T. Le,
Hieu H. Pham,
Dat Q. Tran,
Dung B. Nguyen,
Dung D. Le,
Chi M. Pham,
Hang T. T. Tong,
Diep H. Dinh,
Cuong D. Do,
Luu T. Doan,
Cuong N. Nguyen,
Binh T. Nguyen,
Que V. Nguyen,
Au D. Hoang,
Hien N. Phan,
Anh T. Nguyen,
Phuong H. Ho,
Dat T. Ngo,
Nghia T. Nguyen,
Nhan T. Nguyen,
Minh Dao,
Van Vu
Abstract:
Most of the existing chest X-ray datasets include labels from a list of findings without specifying their locations on the radiographs. This limits the development of machine learning algorithms for the detection and localization of chest abnormalities. In this work, we describe a dataset of more than 100,000 chest X-ray scans that were retrospectively collected from two major hospitals in Vietnam. Out of this raw data, we release 18,000 images that were manually annotated by a total of 17 experienced radiologists with 22 local labels of rectangles surrounding abnormalities and 6 global labels of suspected diseases. The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. We designed and built a labeling platform for DICOM images to facilitate these annotation procedures. All images are made publicly available (https://www.physionet.org/content/vindr-cxr/1.0.0/) in DICOM format along with the labels of both the training set and the test set.
Submitted 20 March, 2022; v1 submitted 29 December, 2020;
originally announced December 2020.
-
Multi-object Tracking with an Adaptive Generalized Labeled Multi-Bernoulli Filter
Authors:
Cong-Thanh Do,
Tran Thien Dat Nguyen,
Diluka Moratuwage,
Changbeom Shim,
Yon Dohn Chung
Abstract:
The challenges in multi-object tracking mainly stem from the random variations in the cardinality and states of objects during the tracking process. Further, the information on locations where the objects appear, their detection probabilities, and the statistics of the sensor's false alarms significantly influence the tracking accuracy of the filter. However, this information is usually assumed to be known and provided by the users. In this paper, we propose an adaptive generalized labeled multi-Bernoulli (GLMB) filter which can track multiple objects without prior knowledge of the aforementioned information. Experimental results show that the performance of the proposed filter is comparable to an ideal GLMB filter supplied with correct information of the tracking scenarios.
Submitted 12 January, 2022; v1 submitted 2 August, 2020;
originally announced August 2020.
-
Characterizing Hydration of SDS Micelles by Contrast Variation Small Angle Neutron Scattering
Authors:
Katherine Chen,
Chi-Huan Tung,
Changwoo Do
Abstract:
Small-angle neutron scattering (SANS) from cationic globular micellar solutions composed of sodium dodecyl sulfate (SDS) in water was studied with a contrast variation approach. Extensive computational studies have demonstrated that the distribution of invasive water is clearly an important feature for understanding the self-organization of SDS molecules and the stability of assemblies. However, in existing scattering studies the hydration level was not examined explicitly. Here, using the scheme of contrast variation, we establish a SANS methodology to determine the intra-micellar radial distributions of invasive water and SDS molecules from the evolving spectral lineshapes caused by the varying isotopic ratio of water. A detailed description of the hydration of SDS micelles is provided, which is in excellent agreement with known results of many existing simulation studies. An extension of our method can be used to provide in-depth insight into the micellization phenomenon, which is commonly found in many soft matter systems.
Submitted 22 October, 2019;
originally announced October 2019.
-
End-to-End Speech Recognition with High-Frame-Rate Features Extraction
Authors:
Cong-Thanh Do
Abstract:
State-of-the-art end-to-end automatic speech recognition (ASR) extracts acoustic features from the input speech signal every 10 ms, which corresponds to a frame rate of 100 frames/second. In this report, we investigate the use of high-frame-rate features extraction in end-to-end ASR. High frame rates of 200 and 400 frames/second are used in the features extraction and provide additional information for end-to-end ASR. The effectiveness of high-frame-rate features extraction is evaluated independently and in combination with speed perturbation based data augmentation. Experiments performed on two speech corpora, Wall Street Journal (WSJ) and CHiME-5, show that using high-frame-rate features extraction yields improved performance for end-to-end ASR, both independently and in combination with speed perturbation. On the WSJ corpus, the relative reductions of word error rate (WER) yielded by high-frame-rate features extraction independently and in combination with speed perturbation are up to 21.3% and 24.1%, respectively. On the CHiME-5 corpus, the corresponding relative WER reductions are up to 2.8% and 7.9%, respectively, on the test data recorded by microphone arrays and up to 11.8% and 21.2%, respectively, on the test data recorded by binaural microphones.
Submitted 12 July, 2019; v1 submitted 3 July, 2019;
originally announced July 2019.
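The relationship between frame shift and frame rate mentioned in the abstract above (10 ms per frame gives 100 frames/second) can be sketched as follows. The 25 ms analysis window, the function names, and the one-second utterance are assumptions made for illustration; the abstract does not state the window length or any other front-end settings.

```python
import math


def frame_rate_fps(frame_shift_ms):
    """Frames per second for a given frame shift in milliseconds."""
    return 1000.0 / frame_shift_ms


def num_frames(duration_ms, frame_shift_ms, window_ms=25.0):
    """Number of full analysis windows that fit in the signal (no padding).

    window_ms=25.0 is an assumed, commonly used value, not one from the abstract.
    """
    if duration_ms < window_ms:
        return 0
    return 1 + math.floor((duration_ms - window_ms) / frame_shift_ms)


# Frame shifts of 10, 5 and 2.5 ms correspond to 100, 200 and 400 frames/second.
for shift in (10.0, 5.0, 2.5):
    print(shift, frame_rate_fps(shift), num_frames(1000.0, shift))
```

Halving the frame shift doubles the number of feature vectors per utterance, which is the extra information the abstract refers to and also the source of the added computational cost.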
-
Accelerating Neutron Scattering Data Collection and Experiments Using AI Deep Super-Resolution Learning
Authors:
Ming-Ching Chang,
Yi Wei,
Wei-Ren Chen,
Changwoo Do
Abstract:
We present a novel methodology of augmenting the scattering data measured by small angle neutron scattering via an emerging deep convolutional neural network (CNN) that is widely used in artificial intelligence (AI). Data collection time is reduced by increasing the size of binning of the detector pixels at the sacrifice of resolution. High-resolution scattering data is then reconstructed using an AI deep super-resolution learning method. This technique can not only improve the productivity of neutron scattering instruments by speeding up the experimental workflow but also enable capturing kinetic changes and transient phenomena of materials that are currently inaccessible by existing neutron scattering techniques.
Submitted 31 May, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Evolution of CTAB/NaSal Micelles: Structural Analysis by SANS
Authors:
Christopher N. Lam,
William D. Hong,
Changwoo Do,
Wei-Ren Chen
Abstract:
Surfactants are amphiphilic molecules that spontaneously self-assemble in aqueous solution into various ordered and disordered phases. Under certain conditions, one-dimensional structures in the form of long, flexible wormlike micelles can develop. Cetyltrimethylammonium bromide (CTAB) is one of the most widely studied surfactants, and in the presence of sodium salicylate (NaSal), wormlike micelles can form at very dilute concentrations of surfactant. We carry out a systematic study of the microscopic structures of CTAB/NaSal over a surfactant concentration range of 2.5 - 15 mM and at salt-to-surfactant molar ratios of 0.5 - 10. Using small-angle neutron scattering, we qualitatively and quantitatively characterize the equilibrium structures of CTAB/NaSal, mapping the phase behavior of CTAB/NaSal at low concentrations within the region of phase space where nascent wormlike micelles transition into long and entangled structures.
Submitted 17 December, 2018;
originally announced December 2018.
-
Influence of Side Chain Isomerism on the Rigidity of Poly(3-alkylthiophenes) in Solutions Revealed by Neutron Scattering
Authors:
William D. Hong,
Christopher N. Lam,
Yangyang Wang,
Dongsook Chang,
Youjun He,
Luis E. Sánchez-Díaz,
Changwoo Do,
Wei-Ren Chen
Abstract:
Using small angle neutron scattering, we conducted a detailed structural study of poly(3-alkylthiophenes) dispersed in deuterated dichlorobenzene. The focus was placed on addressing the influence of the spatial arrangement of constituent atoms of the side chain on backbone conformation. We demonstrate that by impeding the π-π interactions, the branch point in the side chain promotes torsional motion between backbone units and results in greater chain flexibility. Our findings highlight the key role of topological isomerism in determining the molecular rigidity and are relevant to the current debate about the condition necessary for optimizing the electronic properties of conducting polymers via side chain engineering.
Submitted 22 October, 2018;
originally announced October 2018.
-
Accurate Pouring with an Autonomous Robot Using an RGB-D Camera
Authors:
Chau Do,
Wolfram Burgard
Abstract:
Robotic assistants in a home environment are expected to perform various complex tasks for their users. One particularly challenging task is pouring drinks into cups, which for successful completion, requires the detection and tracking of the liquid level during a pour to determine when to stop. In this paper, we present a novel approach to autonomous pouring that tracks the liquid level using an RGB-D camera and adapts the rate of pouring based on the liquid level feedback. We thoroughly evaluate our system on various types of liquids and under different conditions, conducting over 250 pours with a PR2 robot. The results demonstrate that our approach is able to pour liquids to a target height with an accuracy of a few millimeters.
Submitted 8 October, 2018;
originally announced October 2018.
-
Scaling Behavior of Anisotropy Relaxation in Deformed Polymers
Authors:
Christopher N. Lam,
Wen-Sheng Xu,
Wei-Ren Chen,
Zhe Wang,
Christopher B. Stanley,
Jan-Michael Y. Carrillo,
David Uhrig,
Weiyu Wang,
Kunlun Hong,
Yun Liu,
Lionel Porcar,
Changwoo Do,
Gregory S. Smith,
Bobby G. Sumpter,
Yangyang Wang
Abstract:
Drawing an analogy to the paradigm of quasi-elastic neutron scattering, we present a general approach for quantitatively investigating the spatiotemporal dependence of structural anisotropy relaxation in deformed polymers by using small-angle neutron scattering. Experiments and non-equilibrium molecular dynamics simulations on polymer melts over a wide range of molecular weights reveal that their conformational relaxation at relatively high momentum transfer $Q$ and short time can be described by a simple scaling law, with the relaxation rate proportional to $Q$. This peculiar scaling behavior, which cannot be derived from the classical Rouse and tube models, is indicative of a surprisingly weak direct influence of entanglement on the microscopic mechanism of single-chain anisotropy relaxation.
Submitted 9 May, 2018; v1 submitted 25 February, 2018;
originally announced February 2018.
-
Dynamic Equivalence between Soft Star Polymers and Hard Spheres
Authors:
Zhe Wang,
Antonio Faraone,
Panchao Yin,
Lionel Porcar,
Yun Liu,
Changwoo Do,
Kunlun Hong,
Wei-Ren Chen
Abstract:
Understanding the dynamics of soft colloids, such as star polymers, dendrimers, and microgels, is of scientific and practical importance. It is known that the excluded volume effect plays a key role in colloidal dynamics. Here, we propose a condition of compressibility equivalence that provides a simple method to experimentally evaluate the excluded volume of soft colloids from a thermodynamic view. We apply this condition to survey the dynamics of a series of star polymer dispersions. It is found that as the concentration increases, the slowing of the long-time self-diffusivity of the star polymer, normalized by the short-time self-diffusivity, can be mapped onto the hard-sphere behavior. This phenomenon reveals the dynamic equivalence between soft colloids and hard spheres, despite the apparent complexity of the interparticle interaction of the soft colloids. The methods for measuring the osmotic compressibility and the self-diffusivities of soft colloidal dispersions are also presented.
Submitted 11 September, 2019; v1 submitted 24 September, 2017;
originally announced September 2017.
-
Excitation of coupled phononic frequency combs via two-mode parametric three-wave mixing
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
This paper builds on the recent demonstration of three-wave mixing based phononic frequency comb. Here, in this process, an intrinsic coupling between the drive and resonant frequency leads to a frequency comb of spacing corresponding to the separation between drive and resonant frequency. Now, in this paper, we experimentally demonstrate the possibility to further excite multiple frequency combs with the same external drive through its coupling with other identical devices. In addition, we also experimentally identify interesting features associated with such a frequency comb generation process.
Submitted 30 August, 2017; v1 submitted 24 August, 2017;
originally announced August 2017.
-
Towards N-mode parametric electromechanical resonances
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
The ubiquity of parametric resonance is continually evident in the repeated experimental observations of this phenomenon in multiple physical systems. The elementary case of 2-mode parametric resonance of order 1 involves the excitation of a spectral tone of a parametrically driven mode at a sub-harmonic frequency of the higher directly driven mode. Historically, such examples of parametric resonance have been predominantly researched in a system of micro- and nanoelectromechanical resonators. Here, in this paper, we break this convention by showcasing a collection of experimental signatures in support of the concept of "N-mode parametric resonance" using a number of elementary microelectromechanical devices. Specifically, we present observations of 2, 3, (2+3) and (3+3) mode parametric resonances demonstrating co-existence of different regimes within the same device. In addition, we also present observations of intrinsic "Four-Wave Mixing" of parametric excitations. This paper presents contributions towards the existence proof for such multimode parametric resonances which can also be exploited for engineering benefit within the field of "micro and nanoelectromechanical resonators". The experimental results further point towards the possibility of the ultimate observation of N-mode parametric resonance in such physical systems.
Submitted 28 June, 2017;
originally announced August 2017.
-
Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication Skills
Authors:
Felix Burget,
Lukas Dominique Josef Fiederer,
Daniel Kuhner,
Martin Völker,
Johannes Aldinger,
Robin Tibor Schirrmeister,
Chau Do,
Joschka Boedecker,
Bernhard Nebel,
Tonio Ball,
Wolfram Burgard
Abstract:
As autonomous service robots become more affordable and thus available also for the general public, there is a growing need for user-friendly interfaces to control the robotic system. Currently available control modalities typically expect users to be able to express their desire through either touch, speech or gesture commands. While this requirement is fulfilled for the majority of users, paralyzed users may not be able to use such systems. In this paper, we present a novel framework that allows these users to interact with a robotic service assistant in a closed-loop fashion, using only thoughts. The brain-computer interface (BCI) system is composed of several interacting components, i.e., non-invasive neuronal signal recording and decoding, high-level task planning, motion and manipulation planning as well as environment perception. In various experiments, we demonstrate its applicability and robustness in real world scenarios, considering fetch-and-carry tasks and tasks involving human-robot interaction. As our results demonstrate, our system is capable of adapting to frequent changes in the environment and reliably completing given tasks within a reasonable amount of time. Combined with high-level planning and autonomous robotic systems, interesting new perspectives open up for non-invasive BCI-based human-robot interactions.
Submitted 12 June, 2018; v1 submitted 20 July, 2017;
originally announced July 2017.
-
Thermal-induced stress of plasmonic magnetic nanocomposites
Authors:
Anh D. Phan,
Nghia C. Do,
Do T. Nga
Abstract:
We present theoretical calculations to interpret optical and mechanical properties of Ag@Fe3O4 nanoflowers. The microstructures and nature of optical peaks of nanoflowers are determined by means of the Mie theory associated with effective dielectric approximation and the experimental absorption spectrum. Under laser illumination, the thermal strain fields inside and outside the structure due to the absorbed optical energy are studied using a continuum mechanics approach. Our findings provide a simple but comprehensive description of the elastic behaviors of previous experiments.
Submitted 7 July, 2017; v1 submitted 29 June, 2017;
originally announced June 2017.
-
Phononic frequency comb via three-mode parametric three-wave mixing
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
This paper is motivated by the recent demonstration of three-wave mixing based phononic frequency comb. While the previous experiments have shown the existence of three-wave mixing pathway in a system of two-coupled phonon modes, this work demonstrates a similar pathway in a system of three-coupled phonon modes. The paper also presents a number of interesting experimental facts concomitant to the three-mode three-wave mixing based frequency comb observed in a specific micromechanical device. The experimental validation of three-mode three-wave mixing along with the previous demonstration of two-mode three-wave mixing points to the ultimate possibility of multimode frequency combs.
Submitted 26 April, 2017;
originally announced April 2017.
-
Frequency transitions in phononic four-wave mixing
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
This work builds upon the recent demonstration of a phononic four-wave mixing (FWM) pathway mediated by parametric resonance. In such a process, drive tones f_d1 and f_d2 associated with a specific phonon mode interact such that one of the drive tones also parametrically excites a second mode at a sub-harmonic frequency and such interactions result in a frequency comb f_d1/2 +/- n(f_d1-f_d2). However, the specific behaviour associated with the case where both drive tones can independently excite the sub-harmonic phonon mode has not been studied or previously described. While it may be possible to expect the merger of two frequency combs f_d1/2 +/- n(f_d1-f_d2) and f_d2/2 +/- n(f_d1-f_d2), this paper indicates that only one of these mechanisms is selected and also shows an interesting transition linked to this process. Such frequency transitions from f_d1/2 +/- n(f_d1-f_d2) to f_d2/2 +/- n(f_d1-f_d2) hold potential promise for computing applications.
Submitted 6 April, 2017;
originally announced April 2017.
-
Excitation of multiple 2-mode parametric resonances by a single driven mode
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
We demonstrate autoparametric excitation of two distinct sub-harmonic mechanical modes by the same driven mechanical mode corresponding to different drive frequencies within its resonance dispersion band. This experimental observation is used to motivate a more general physical picture wherein multiple mechanical modes could be excited by the same driven primary mode within the same device as long as the frequency spacing between the sub-harmonic modes is less than half the dispersion bandwidth of the driven primary mode. The excitation of both modes is seen to be threshold-dependent and a parametric back-action is observed impacting on the response of the driven primary mode. Motivated by this experimental observation, modified dynamical equations specifying 2-mode auto-parametric excitation for such systems are presented.
Submitted 4 April, 2017;
originally announced April 2017.
-
Anomaly in coupled parametric resonance
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin Seshia
Abstract:
We present experimental observations of an anomaly in the coupled response of auto-parametrically excited microelectromechanical beams. When one of the two coupled beams is driven at elevated amplitudes, the excitation of dominant and recessive modes is observed in the driven and non-driven beams respectively. This anomalous nature of auto-parametric excitation has been unexplored by both theory and experiments and falls outside the scope of the conventional description of parametric resonance.
Submitted 18 November, 2016;
originally announced November 2016.
-
Phononic four-wave mixing
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin A. Seshia
Abstract:
We present the first experimental observations of phononic four-wave mixing (FWM) in a piezoelectrically actuated free-free beam microstructure. The FWM response is facilitated by the intrinsic coupling between a driven mode and an auto-parametrically excited sub-harmonic mode. Motivated by the experimental results, a dynamical model for FWM has been specified.
Submitted 3 October, 2016;
originally announced October 2016.
-
Discrete intrinsic localized modes in a microelectromechanical resonator
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin A. Seshia
Abstract:
Intrinsic Localized Modes (ILMs) or Discrete Breathers (DBs) are produced through a non-linear vibration localization phenomenon. While Anderson localization is due to lattice defects, the nonlinearity of lattices provides the basis for ILM excitation. Over the past two decades, these ILMs have been realized in a wide range of physical systems including photonic crystals, nonlinear atomic lattices, anti-ferromagnets, coupled Josephson junction arrays and coupled cantilevers. This paper brings out the feasibility of exciting ILMs in a standalone mechanical resonator. Through piezoelectric driving and optical visualization, various intriguing features of ILMs have been recorded. The ILMs in our system are observed as spectral bushes and their frequencies are much lower than that of the drive frequency. The excitation of ILMs is mediated through large amplitude instability following autoparametric excitation of a sub-harmonic mode. The spatial prevalence of discrete ILM excitations is at antinodes of the sub-harmonic mode. Further, the ILMs have been observed to be time-variant and various events including attraction-repulsion (or splitting-merging) of ILMs and hopping occur during the time evolution of ILMs.
Submitted 31 August, 2017; v1 submitted 5 October, 2016;
originally announced October 2016.
-
Phononic High Harmonic Generation
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin A. Seshia
Abstract:
This paper reports experimental evidence for phononic low-order to high-order harmonic conversion leading to high harmonic generation. Phononic high harmonic generation is mediated by a threshold-dependent instability of a driven phonon mode. Once the threshold for instability is met, a cascade of harmonic generation processes is triggered. Firstly, the up-conversion of first harmonic phonons into second harmonic phonons is established. Subsequently, the down-conversion of second harmonic phonons into first harmonic phonons and conversion of first and second harmonic phonons into third harmonic phonons occur. Along similar lines, an eventual conversion of third harmonic phonons to higher orders is also observed to commence. This physical pathway for phononic low-order to high-order harmonic conversion may find general relevance to other physical systems.
Submitted 21 April, 2018; v1 submitted 15 September, 2016;
originally announced October 2016.
-
Hyperfine phononic frequency comb
Authors:
Adarsh Ganesan,
Cuong Do,
Ashwin A. Seshia
Abstract:
Optical frequency combs [1-8] have resulted in significant advances in optical frequency metrology and found wide application to precise physical measurements [1-4, 9] and molecular fingerprinting [8]. A direct analogue of frequency combs in the phononic or acoustic domain has not been reported to date. In this letter, we report the first clear experimental evidence for a phononic frequency comb. In contrast to the Kerr nonlinearity [10] in optical frequency comb formation, the phononic frequency comb is generated through the intrinsic coupling of a driven phonon mode with an auto-parametrically excited sub-harmonic mode [16]. Through systematic experiments at different drive frequencies and amplitudes, we portray the well-connected process of phononic frequency comb formation and define attributes to control the features [17-18] associated with comb formation in such a system. Further, the interplay between these nonlinear resonances and the well-known Duffing phenomenon [12-14] is also observed. The presented pathway for phononic frequency comb formation finds general relevance to other nonlinear systems in both classical and quantum domains.
Submitted 22 September, 2016; v1 submitted 15 September, 2016;
originally announced September 2016.
-
Transfer factors for Jacquet-Mao's metaplectic fundamental lemma
Authors:
Viet Cuong Do
Abstract:
In an earlier paper we proved Jacquet-Mao's metaplectic fundamental lemma, which is the identity between two orbital integrals (one defined on the space of symmetric matrices and the other on the $2$-fold cover of the general linear group) corrected by a transfer factor. However, we restricted to the case where the relevant representative is a diagonal matrix. Now, we show that we can extend this result to more general relevant representatives. Our proof is based on the concept of Shalika germs for certain Kloosterman integrals.
Submitted 2 April, 2020; v1 submitted 5 November, 2015;
originally announced November 2015.
-
MOOCdb: Developing Standards and Systems to Support MOOC Data Science
Authors:
Kalyan Veeramachaneni,
Sherif Halawa,
Franck Dernoncourt,
Una-May O'Reilly,
Colin Taylor,
Chuong Do
Abstract:
We present a shared data model for enabling data science in Massive Open Online Courses (MOOCs). The model captures students' interactions with the online platform. The data model is platform agnostic and is based on some basic core actions that students take on an online learning platform. Students usually interact with the platform in four different modes: Observing, Submitting, Collaborating and giving feedback. In observing mode, students are simply browsing the online platform, watching videos, reading material, reading books or watching forums. In submitting mode, students submit information to the platform. This includes submissions towards quizzes, homework, or any assessment modules. In collaborating mode, students interact with other students or instructors on forums, collaboratively editing wikis or chatting on Google Hangouts or other hangout venues. With these basic definitions of activities, and a data model to store events pertaining to these activities, we then create a common terminology to map Coursera and edX data into this shared data model. This shared data model, called MOOCdb, becomes the foundation for a number of collaborative frameworks that enable progress in data science without the need to share the data.
Submitted 8 June, 2014;
originally announced June 2014.
-
On the motives of moduli of parabolic chains and parabolic Higgs bundles
Authors:
Viet Cuong Do
Abstract:
Just as Higgs bundles on a Riemann surface play an important role in the study of representations of the fundamental group of the surface, parabolic Higgs bundles play an important role in the study of the fundamental group of a punctured surface. In this paper, we give an algorithm to calculate the (virtual) motive (i.e., in a suitable Grothendieck ring) of the moduli spaces of parabolic Higgs bundles of fixed rank, fixed degree and fixed parabolic structure, using localization with respect to the circle action.
Submitted 15 May, 2014;
originally announced May 2014.