Topic Modeling as Multi-Objective Contrastive Optimization

Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu

Abstract: Recent representation learning approaches enhance neural topic models by optimizing the weighted linear combination of the evidence lower bound (ELBO) of the log-likelihood and the contrastive learning objective that contrasts pairs of input documents. However, document-level contrastive learning might capture low-level mutual information, such as word ratio, which disturbs topic modeling. Moreover, there is a potential conflict between the ELBO loss that memorizes input details for better reconstruction quality, and the contrastive loss which attempts to learn topic representations that generalize among input documents. To address these issues, we first introduce a novel contrastive learning method oriented towards sets of topic vectors to capture useful semantics that are shared among a set of input documents. Secondly, we explicitly cast contrastive topic modeling as a gradient-based multi-objective optimization problem, with the goal of achieving a Pareto stationary solution that balances the trade-off between the ELBO and the contrastive objective. Extensive experiments demonstrate that our framework consistently produces higher-performing neural topic models in terms of topic coherence, topic diversity, and downstream performance.

Submitted 9 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted at ICLR 2024 (poster)

arXiv:2402.06682 [pdf, other]

Private Knowledge Sharing in Distributed Learning: A Survey

Authors: Yasas Supeksala, Dinh C. Nguyen, Ming Ding, Thilina Ranbaduge, Calson Chua, Jun Zhang, Jun Li, H. Vincent Poor

Abstract: The rise of Artificial Intelligence (AI) has revolutionized numerous industries and transformed the way society operates. Its widespread use has led to the distribution of AI and its underlying data across many intelligent systems. In this light, it is crucial to utilize information in learning processes that are either distributed or owned by different entities. As a result, modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes. In line with this goal, the latest AI models are frequently trained in a decentralized manner. Distributed learning involves multiple entities working together to make collective predictions and decisions. However, this collaboration can also bring about security vulnerabilities and challenges. This paper provides an in-depth survey on private knowledge sharing in distributed learning, examining various knowledge components utilized in leading distributed learning architectures. Our analysis sheds light on the most critical vulnerabilities that may arise when using these components in a distributed setting. We further identify and examine defensive strategies for preserving the privacy of these knowledge components and preventing malicious parties from manipulating or accessing the knowledge information. Finally, we highlight several key limitations of knowledge sharing in distributed learning and explore potential avenues for future research.

Submitted 8 February, 2024; originally announced February 2024.

Comments: Manuscript submitted to ACM

arXiv:2402.03832 [pdf, other]

Rethinking Skill Extraction in the Job Market Domain using Large Language Models

Authors: Khanh Cao Nguyen, Mike Zhang, Syrielle Montariol, Antoine Bosselut

Abstract: Skill Extraction involves identifying skills and qualifications mentioned in documents such as job postings and resumes. The task is commonly tackled by training supervised models using a sequence labeling approach with BIO tags. However, the reliance on manually annotated data limits the generalizability of such approaches. Moreover, the common BIO setting limits the ability of the models to capture complex skill patterns and handle ambiguous mentions. In this paper, we explore the use of in-context learning to overcome these challenges, on a benchmark of 6 uniformized skill extraction datasets. Our approach leverages the few-shot learning capabilities of large language models (LLMs) to identify and extract skills from sentences. We show that LLMs, despite not being on par with traditional supervised models in terms of performance, can better handle syntactically complex skill mentions in skill extraction tasks.

Submitted 6 February, 2024; originally announced February 2024.

Comments: Published at NLP4HR 2024 (EACL Workshop)

arXiv:2402.02319 [pdf]

Smart Textile-Driven Soft Spine Exosuit for Lifting Tasks in Industrial Applications

Authors: Kefan Zhu, Bibhu Sharma, Phuoc Thien Phan, James Davies, Mai Thanh Thai, Trung Thien Hoang, Chi Cong Nguyen, Adrienne Ji, Emanuele Nicotra, Nigel H. Lovell, Thanh Nho Do

Abstract: Work-related musculoskeletal disorders (WMSDs) are often caused by repetitive lifting, making them a significant concern in occupational health. Although wearable assist devices have become the norm for mitigating the risk of back pain, most spinal assist devices still possess a partially rigid structure that impacts user comfort and flexibility. This paper addresses this issue by presenting a smart textile actuated spine assistance robotic exosuit (SARE), which can conform to the back seamlessly without impeding the user's movement and is incredibly lightweight. The SARE can assist the human erector spinae to complete any action with virtually infinite degrees of freedom. To detect the strain on the spine and to control the smart textile automatically, a soft knitting sensor that utilizes fluid pressure as the sensing element is used. The new device is validated experimentally with human subjects, where it reduces peak electromyography (EMG) signals of the lumbar erector spinae by around 32 percent in loaded and around 22 percent in unloaded conditions. Moreover, the integrated EMG decreased by around 24.2 percent under the loaded condition and around 23.6 percent under the unloaded condition. In summary, the artificial muscle wearable device represents an anatomical solution to reduce the risk of muscle strain, metabolic energy cost and back pain associated with repetitive lifting tasks.

Submitted 3 February, 2024; originally announced February 2024.

Comments: 6 pages, 7 figures

arXiv:2402.02021 [pdf, other]

Transfer Learning in ECG Diagnosis: Is It Effective?

Authors: Cuong V. Nguyen, Cuong D. Do

Abstract: The adoption of deep learning in ECG diagnosis is often hindered by the scarcity of large, well-labeled datasets in real-world scenarios, leading to the use of transfer learning to leverage features learned from larger datasets. Yet the prevailing assumption that transfer learning consistently outperforms training from scratch has never been systematically validated. In this study, we conduct the first extensive empirical study on the effectiveness of transfer learning in multi-label ECG classification, by comparing the fine-tuning performance with that of training from scratch, covering a variety of ECG datasets and deep neural networks. We confirm that fine-tuning is the preferable choice for small downstream datasets; however, when the dataset is sufficiently large, training from scratch can achieve comparable performance, albeit requiring a longer training time to catch up. Furthermore, we find that transfer learning exhibits better compatibility with convolutional neural networks than with recurrent neural networks, which are the two most prevalent architectures for time-series ECG applications. Our results underscore the importance of transfer learning in ECG diagnosis, yet depending on the amount of available data, researchers may opt not to use it, considering the non-negligible cost associated with pre-training.

Submitted 26 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.17897 [pdf, ps, other]

Employing Label Models on ChatGPT Answers Improves Legal Text Entailment Performance

Authors: Chau Nguyen, Le-Minh Nguyen

Abstract: The objective of legal text entailment is to ascertain whether the assertions in a legal query logically follow from the information provided in one or multiple legal articles. ChatGPT, a large language model, is robust in many natural language processing tasks, including legal text entailment: when we set the temperature = 0 (the ChatGPT answers are deterministic) and prompt the model, it achieves 70.64% accuracy on the COLIEE 2022 dataset, which outperforms the previous SOTA of 67.89%. On the other hand, if the temperature is larger than zero, ChatGPT answers are not deterministic, leading to inconsistent answers and fluctuating results. We propose to leverage label models (a fundamental component of weak supervision techniques) to integrate the provisional answers by ChatGPT into consolidated labels. In this way, we treat ChatGPT provisional answers as noisy predictions which can be consolidated by label models. The experimental results demonstrate that this approach can attain an accuracy of 76.15%, marking a significant improvement of 8.26 percentage points over the prior state-of-the-art benchmark. Additionally, we perform an analysis of the instances where ChatGPT produces incorrect answers, then we classify the errors, offering insights that could guide potential enhancements for future research endeavors.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 15 pages

arXiv:2401.15625 [pdf, other]

Generative AI-enabled Blockchain Networks: Fundamentals, Applications, and Case Study

Authors: Cong T. Nguyen, Yinqiu Liu, Hongyang Du, Dinh Thai Hoang, Dusit Niyato, Diep N. Nguyen, Shiwen Mao

Abstract: Generative Artificial Intelligence (GAI) has recently emerged as a promising solution to address critical challenges of blockchain technology, including scalability, security, privacy, and interoperability. In this paper, we first introduce GAI techniques, outline their applications, and discuss existing solutions for integrating GAI into blockchains. Then, we discuss emerging solutions that demonstrate the effectiveness of GAI in addressing various challenges of blockchain, such as detecting unknown blockchain attacks and smart contract vulnerabilities, designing key secret sharing schemes, and enhancing privacy. Moreover, we present a case study to demonstrate that GAI, specifically the generative diffusion model, can be employed to optimize blockchain network performance metrics. Experimental results clearly show that, compared to a baseline traditional AI approach, the proposed generative diffusion model approach can converge faster, achieve higher rewards, and significantly improve the throughput and latency of the blockchain network. Additionally, we highlight future research directions for GAI in blockchain applications, including personalized GAI-enabled blockchains, GAI-blockchain synergy, and privacy and security considerations within blockchain ecosystems.

Submitted 28 January, 2024; originally announced January 2024.

arXiv:2401.14420 [pdf, other]

A Novel Blockchain Based Information Management Framework for Web 3.0

Authors: Md Arif Hassan, Cong T. Nguyen, Chi-Hieu Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Eryk Dutkiewicz

Abstract: Web 3.0 is the third generation of the World Wide Web (WWW), concentrating on the critical concepts of decentralization, availability, and increasing client usability. Although Web 3.0 is undoubtedly an essential component of the future Internet, it currently faces critical challenges, including decentralized data collection and management. To overcome these challenges, blockchain has emerged as one of the core technologies for the future development of Web 3.0. In this paper, we propose a novel blockchain-based information management framework, namely Smart Blockchain-based Web (SBW), to manage information in Web 3.0 effectively, enhance the security and privacy of users' data, bring additional profits, and incentivize users to contribute information to the websites. Particularly, SBW utilizes blockchain technology and smart contracts to manage the decentralized data collection process for Web 3.0 effectively. Moreover, in this framework, we develop an effective consensus mechanism based on Proof-of-Stake to reward the users' information contribution and conduct game theoretical analysis to analyze the users' behavior in the considered system. Additionally, we conduct simulations to assess the performance of SBW and investigate the impact of critical parameters on information contribution. The findings confirm our theoretical analysis and demonstrate that our proposed consensus mechanism can incentivize the nodes and users to contribute more information to our systems.

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.14113 [pdf, other]

On the Affinity, Rationality, and Diversity of Hierarchical Topic Modeling

Authors: Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu

Abstract: Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity. However, existing work struggles with producing topic hierarchies of low affinity, rationality, and diversity, which hampers document understanding. To overcome these challenges, in this paper we propose Transport Plan and Context-aware Hierarchical Topic Model (TraCo). Instead of early simple topic dependencies, we propose a transport plan dependency method. It constrains dependencies to ensure their sparsity and balance, and also regularizes topic hierarchy building with them. This improves affinity and diversity of hierarchies. We further propose a context-aware disentangled decoder. Rather than previously entangled decoding, it distributes different semantic granularity to topics at different levels by disentangled decoding. This facilitates the rationality of hierarchies. Experiments on benchmark datasets demonstrate that our method surpasses state-of-the-art baselines, effectively improving the affinity, rationality, and diversity of hierarchical topic modeling with better performance on downstream tasks.

Submitted 31 January, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted to AAAI2024 conference. Our code is available at https://github.com/bobxwu/TraCo

arXiv:2401.10901 [pdf, other]

Enabling Technologies for Web 3.0: A Comprehensive Survey

Authors: Md Arif Hassan, Mohammad Behdad Jamshidi, Bui Duc Manh, Nam H. Chu, Chi-Hieu Nguyen, Nguyen Quang Hieu, Cong T. Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Nguyen Van Huynh, Mohammad Abu Alsheikh, Eryk Dutkiewicz

Abstract: Web 3.0 represents the next stage of Internet evolution, aiming to empower users with increased autonomy, efficiency, quality, security, and privacy. This evolution can potentially democratize content access by utilizing the latest developments in enabling technologies. In this paper, we conduct an in-depth survey of enabling technologies in the context of Web 3.0, such as blockchain, semantic web, 3D interactive web, Metaverse, Virtual reality/Augmented reality, Internet of Things technology, and their roles in shaping Web 3.0. We commence by providing a comprehensive background of Web 3.0, including its concept, basic architecture, potential applications, and industry adoption. Subsequently, we examine recent breakthroughs in IoT, 5G, and blockchain technologies that are pivotal to Web 3.0 development. Following that, other enabling technologies, including AI, semantic web, and 3D interactive web, are discussed. Utilizing these technologies can effectively address the critical challenges in realizing Web 3.0, such as ensuring decentralized identity, platform interoperability, data transparency, reducing latency, and enhancing the system's scalability. Finally, we highlight significant challenges associated with Web 3.0 implementation, emphasizing potential solutions and providing insights into future research directions in this field.

Submitted 29 December, 2023; originally announced January 2024.

arXiv:2401.08723 [pdf, other]

HierSFL: Local Differential Privacy-aided Split Federated Learning in Mobile Edge Computing

Authors: Minh K. Quan, Dinh C. Nguyen, Van-Dinh Nguyen, Mayuri Wijayasundara, Sujeeva Setunge, Pubudu N. Pathirana

Abstract: Federated Learning is a promising approach for learning from user data while preserving data privacy. However, the high requirements of the model training process make it difficult for clients with limited memory or bandwidth to participate. To tackle this problem, Split Federated Learning is utilized, where clients upload their intermediate model training outcomes to a cloud server for collaborative server-client model training. This methodology facilitates resource-constrained clients' participation in model training but also increases the training time and communication overhead. To overcome these limitations, we propose a novel algorithm, called Hierarchical Split Federated Learning (HierSFL), that amalgamates models at the edge and cloud phases, presenting qualitative directives for determining the best aggregation timeframes to reduce computation and communication expenses. By implementing local differential privacy at the client and edge server levels, we enhance privacy during local model parameter updates. Our experiments using CIFAR-10 and MNIST datasets show that HierSFL outperforms standard FL approaches with better training accuracy, training time, and communication-computing trade-offs. HierSFL offers a promising solution to mobile edge computing's challenges, ultimately leading to faster content delivery and improved mobile service quality.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 6 Pages, 5 figures, IEEE Virtual Conference on Communications 2023

arXiv:2401.03917 [pdf, other]

Toward a comprehensive simulation framework for hypergraphs: a Python-base approach

Authors: Quoc Chuong Nguyen, Trung Kien Le

Abstract: Hypergraphs, a generalization of graphs in which edges can contain more than two nodes, have become increasingly prominent in complex network analysis. Unlike graphs, hypergraphs have relatively few supporting platforms, and this dearth presents a barrier to more widespread adoption of hypergraph computational toolboxes that could enable further research in several areas. Here, we introduce HyperRD, a Python package for hypergraph computation, simulation, and interoperability with other powerful Python packages in graph and hypergraph research. Then, we introduce two models on hypergraphs, the general Schelling's model and the SIR model, and simulate them with HyperRD.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 13 pages, 3 figures

arXiv:2401.03551 [pdf, other]

CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks

Authors: Chau Nguyen, Phuong Nguyen, Thanh Tran, Dat Nguyen, An Trieu, Tin Pham, Anh Dang, Le-Minh Nguyen

Abstract: The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.02093 [pdf, ps, other]

A new approach to convergence analysis of iterative models with optimal error bounds

Authors: Minh-Phuong Tran, Thanh-Nhan Nguyen, Thai-Hung Nguyen, Tan-Phuc Nguyen, Tien-Khai Nguyen, Cong-Duy-Nguyen Nguyen, Trung-Hieu Huynh

Abstract: In this paper, we study a new approach related to the convergence analysis of Ishikawa-type iterative models to a common fixed point of two non-expansive mappings in Banach spaces. The main novelty of our contribution lies in the so-called optimal error bounds, which establish some necessary and sufficient conditions for convergence and yield both the error estimates and bounds on the convergence rates for iterative schemes. Although a special interest here is devoted to the Ishikawa and modified Ishikawa iterative sequences, the theory of optimal error bounds proposed in this paper can also be favorably applied to various types of iterative models to approximate common fixed points of non-expansive mappings.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 29 pages, 9 figures

arXiv:2401.00165 [pdf, other]

Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization

Authors: Shiqi Wang, Yeqin Zhang, Cam-Tu Nguyen

Abstract: In open-domain Question Answering (QA), dense retrieval is crucial for finding relevant passages for answer generation. Typically, contrastive learning is used to train a retrieval model that maps passages and queries to the same semantic space. The objective is to make similar ones closer and dissimilar ones further apart. However, training such a system is challenging due to the false negative issue, where relevant passages may be missed during data annotation. Hard negative sampling, which is commonly used to improve contrastive learning, can introduce more noise in training. This is because hard negatives are those closer to a given query, and thus more likely to be false negatives. To address this issue, we propose a novel contrastive confidence regularizer for Noise Contrastive Estimation (NCE) loss, a commonly used loss for dense retrieval. Our analysis shows that the regularizer helps dense retrieval models be more robust against false negatives with a theoretical guarantee. Additionally, we propose a model-agnostic method to filter out noisy negative passages in the dataset, improving any downstream dense retrieval models. Through experiments on three datasets, we demonstrate that our method achieves better retrieval performance in comparison to existing state-of-the-art dense retrieval systems.

Submitted 13 January, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: Accepted by AAAI24

arXiv:2312.17619 [pdf, other]

Discontinuous Galerkin Methods for Hypersonic Flows

Authors: Dominique S. Hoskin, R. Loek Van Heyningen, Ngoc Cuong Nguyen, Jordi Vila-Pérez, Wesley L. Harris, Jaime Peraire

Abstract: In recent years, high-order discontinuous Galerkin (DG) methods have emerged as an attractive approach for numerical simulations of compressible flows. This paper presents an overview of the recent development of DG methods for compressible flows with particular focus on hypersonic flows. First, we survey state-of-the-art DG methods for computational fluid dynamics. Next, we discuss both matrix-based and matrix-free iterative methods for the solution of discrete systems stemming from the spatial DG discretizations of the compressible Navier-Stokes equations. We then describe various shock capturing methods to deal with strong shock waves in hypersonic flows. We discuss adaptivity techniques to refine high-order meshes, and synthetic boundary conditions to simulate free-stream disturbances in hypersonic boundary layers. We present a few examples to demonstrate the ability of high-order DG methods to provide accurate solutions of hypersonic laminar flows. Furthermore, we present direct numerical simulations of hypersonic transitional flow past a flared cone at Reynolds number $10.8 \times 10^6$, and hypersonic transitional shock wave boundary layer interaction flow over a flat plate at Reynolds number $3.97 \times 10^6$. These simulations run entirely on hundreds of graphics processing units (GPUs) and demonstrate the ability of DG methods to directly resolve hypersonic transitional flows, even at high Reynolds numbers, without relying on transition or turbulence models. We end the paper by offering our perspectives on error estimation, turbulence modeling, and real gas effects in hypersonic flows.

Submitted 29 December, 2023; originally announced December 2023.

Comments: 34 pages, 25 figures, and 1 table

MSC Class: 76M10; 76F65; 76K05

arXiv:2312.16631 [pdf, other]

doi 10.1103/PhysRevD.109.092008

Measurement of Electron Neutrino and Antineutrino Cross Sections at Low Momentum Transfer

Authors: S. Henry, H. Su, S. Akhter, Z. Ahmad Dar, V. Ansari, M. V. Ascencio, M. Sajjad Athar, A. Bashyal, M. Betancourt, J. L. Bonilla, A. Bravar, G. Caceres, G. A. Díaz, J. Felix, L. Fields, R. Fine, P. K. Gaur, S. M. Gilligan, R. Gran, E. Granados, D. A. Harris, A. L. Hart, J. Kleykamp, A. Klustová, M. Kordosky, et al. (31 additional authors not shown)

Abstract: Accelerator-based neutrino oscillation experiments seek to measure the relative number of electron and muon neutrinos and antineutrinos at different $L/E$ values. However, high-statistics studies of neutrino interactions are almost exclusively performed using muon neutrinos and antineutrinos, since the dominant flavor of neutrinos produced by accelerator-based beams is the muon type. This work reports new measurements of electron neutrino and antineutrino interactions in hydrocarbon, obtained by strongly suppressing backgrounds initiated by muon flavor neutrinos and antineutrinos. Double differential cross sections as a function of visible energy transfer, $E_\text{avail}$, and transverse momentum transfer, $p_T$, or three-momentum transfer, $q_3$, are presented.

Submitted 16 April, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

Comments: 25 pages, 32 figures and 7 tables, accepted for publication in Physical Review D. Revised to add content updated in review process

Report number: FERMILAB-PUB-23-0830-PPD

Journal ref: Physical Review D109, 092008 (2024)

arXiv:2312.06950 [pdf, other]

READ-PVLA: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling

Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

Abstract: Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization. With a growing number of tasks and limited training data, such a full fine-tuning approach leads to costly model storage and unstable training. To overcome these shortcomings, we introduce lightweight adapters to the pre-trained model and only update them at fine-tuning time. However, existing adapters fail to capture intrinsic temporal relations among video frames or textual words. Moreover, they neglect the preservation of critical task-related information that flows from the raw video-language input into the adapter's low-dimensional space. To address these issues, we first propose a novel REcurrent ADapter (READ) that employs recurrent computation to enable temporal modeling capability. Second, we propose a Partial Video-Language Alignment (PVLA) objective via the use of partial optimal transport to maintain task-related information flowing into our READ modules. We validate our READ-PVLA framework through extensive experiments where READ-PVLA significantly outperforms all existing fine-tuning strategies on multiple low-resource temporal language grounding and video-language summarization benchmarks.

Submitted 11 December, 2023; originally announced December 2023.

Comments: Accepted at AAAI 2024

arXiv:2312.02549 [pdf, other]

DemaFormer: Damped Exponential Moving Average Transformer with Energy-Based Modeling for Temporal Language Grounding

Authors: Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

Abstract: Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query. Recent advances employ the attention mechanism to learn the relations between video moments and the text query. However, naive attention might not be able to appropriately capture such relations, resulting in ineffective distributions where target video moments are difficult to separate from the remaining ones. To resolve the issue, we propose an energy-based model framework to explicitly learn moment-query distributions. Moreover, we propose DemaFormer, a novel Transformer-based architecture that utilizes exponential moving average with a learnable damping factor to effectively encode moment-query inputs. Comprehensive experiments on four public temporal language grounding datasets showcase the superiority of our methods over the state-of-the-art baselines.

Submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted at EMNLP 2023 (Findings)

arXiv:2312.02541 [pdf, other]

Explainable Severity ranking via pairwise n-hidden comparison: a case study of glaucoma

Authors: Hong Nguyen, Cuong V. Nguyen, Shrikanth Narayanan, Benjamin Y. Xu, Michael Pazzani

Abstract: Primary open-angle glaucoma (POAG) is a chronic and progressive optic nerve condition that results in an acquired loss of optic nerve fibers and potential blindness. The gradual onset of glaucoma results in patients progressively losing their vision without being consciously aware of the changes. To diagnose POAG and determine its severity, patients must undergo a comprehensive dilated eye examination. In this work, we build a framework to rank, compare, and interpret the severity of glaucoma using fundus images. We introduce a siamese-based severity ranking using pairwise n-hidden comparisons. We additionally have a novel approach to explaining why a specific image is deemed more severe than others. Our findings indicate that the proposed severity ranking model surpasses traditional ones in terms of diagnostic accuracy and delivers improved saliency explanations.

Submitted 5 December, 2023; originally announced December 2023.

Comments: 4 pages

arXiv:2312.02227 [pdf, other]

Improving Multimodal Sentiment Analysis: Supervised Angular Margin-based Contrastive Learning for Enhanced Fusion Representation

Authors: Cong-Duy Nguyen, Thong Nguyen, Duc Anh Vu, Luu Anh Tuan

Abstract: The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis. Moreover, each modality is extracted from raw input and integrated with the rest to construct a multimodal representation. Although previous methods have proposed multimodal representations and achieved promising results, most of them focus on forming positive and negative pairs, neglecting the variation in sentiment scores within the same class. Additionally, they fail to capture the significance of unimodal representations in the fusion vector. To address these limitations, we introduce a framework called Supervised Angular-based Contrastive Learning for Multimodal Sentiment Analysis. This framework aims to enhance discrimination and generalizability of the multimodal representation and overcome biases in the fusion vector's modality. Our experimental results, along with visualizations on two widely used datasets, demonstrate the effectiveness of our approach.

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2312.01592 [pdf, other]

Expand BERT Representation with Visual Information via Grounded Language Learning with Multimodal Partial Alignment

Authors: Cong-Duy Nguyen, The-Anh Vu-Le, Thong Nguyen, Tho Quan, Luu Anh Tuan

Abstract: Language models have been supervised with both language-only objective and visual grounding in existing studies of visual-grounded language learning. However, due to differences in the distribution and scale of visual-grounded datasets and language corpora, the language model tends to mix up the context of the tokens that occurred in the grounded data with those that do not. As a result, during representation learning, there is a mismatch between the visual information and the contextual meaning of the sentence. To overcome this limitation, we propose GroundedBERT - a grounded language learning method that enhances the BERT representation with visually grounded information. GroundedBERT comprises two components: (i) the original BERT which captures the contextual representation of words learned from the language corpora, and (ii) a visual grounding module which captures visual information learned from visual-grounded datasets. Moreover, we employ Optimal Transport (OT), specifically its partial variant, to solve the fractional alignment problem between the two modalities. Our proposed method significantly outperforms the baseline language models on various language tasks of the GLUE and SQuAD datasets.

Submitted 9 January, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

arXiv:2312.00656 [pdf, other]

Simple Transferability Estimation for Regression Tasks

Authors: Cuong N. Nguyen, Phong Tran, Lam Si Tung Ho, Vu Dinh, Anh T. Tran, Tal Hassner, Cuong V. Nguyen

Abstract: We consider transferability estimation, the problem of estimating how well deep learning models transfer from a source to a target task. We focus on regression tasks, which received little previous attention, and propose two simple and computationally efficient approaches that estimate transferability based on the negative regularized mean squared error of a linear regression model. We prove novel theoretical results connecting our approaches to the actual transferability of the optimal target models obtained from the transfer learning process. Despite their simplicity, our approaches significantly outperform existing state-of-the-art regression transferability estimators in both accuracy and efficiency. On two large-scale keypoint regression benchmarks, our approaches yield 12% to 36% better results on average while being at least 27% faster than previous state-of-the-art methods.

Submitted 3 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: Paper published at The 39th Conference on Uncertainty in Artificial Intelligence (UAI) 2023

arXiv:2311.18043 [pdf, other]

Cryogenic Focus Measurement System for a Wide-Field Infrared Space Telescope

Authors: Samuel S. Condon, Stephen Padin, James Bock, Howard Hui, Phillip Korngut, Chi Nguyen, Jordan Otsby

Abstract: We describe a technique for measuring focus errors in a cryogenic, wide-field, near-infrared space telescope. The measurements are made with a collimator looking through a large vacuum window, with a reflective cold filter to reduce background thermal infrared loading on the detectors and optics. For the $300\,\textrm{mm}$ diameter aperture $f/3$ space telescope, SPHEREx, we achieve a focus position measurement with $\sim 5\,\mu\textrm{m}$ statistical and $\sim 15\,\mu\textrm{m}$ systematic error.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 9 pages, 14 figures, submission to Applied Optics

arXiv:2311.15836 [pdf, other]

Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis

Authors: Léo Lebrat, Rodrigo Santa Cruz, Remi Chierchia, Yulia Arzhaeva, Mohammad Ali Armin, Joshua Goldsmith, Jeremy Oorloff, Prithvi Reddy, Chuong Nguyen, Lars Petersson, Michelle Barakat-Johnson, Georgina Luscombe, Clinton Fookes, Olivier Salvado, David Ahmedt-Aristizabal

Abstract: Wound management poses a significant challenge, particularly for bedridden patients and the elderly. Accurate diagnostic and healing monitoring can significantly benefit from modern image analysis, providing accurate and precise measurements of wounds. Despite several existing techniques, the shortage of expansive and diverse training datasets remains a significant obstacle to constructing machine learning-based frameworks. This paper introduces Syn3DWound, an open-source dataset of high-fidelity simulated wounds with 2D and 3D annotations. We propose baseline methods and a benchmarking framework for automated 3D morphometry analysis and 2D/3D wound segmentation.

Submitted 3 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: In the IEEE International Symposium on Biomedical Imaging (ISBI) 2024

arXiv:2311.14578 [pdf, other]

Criticality-Enhanced Precision in Phase Thermometry

Authors: Mei Yu, H. Chau Nguyen, Stefan Nimmrichter

Abstract: Temperature estimation of interacting quantum many-body systems is both a challenging task and topic of interest in quantum metrology, given that critical behavior at phase transitions can boost the metrological sensitivity. Here we study non-invasive quantum thermometry of a finite, two-dimensional Ising spin lattice based on measuring the non-Markovian dephasing dynamics of a spin probe coupled to the lattice. We demonstrate a strong critical enhancement of the achievable precision in terms of the quantum Fisher information, which depends on the coupling range and the interrogation time. Our numerical simulations are compared to instructive analytic results for the critical scaling of the sensitivity in the Curie-Weiss model of a fully connected lattice and to the mean-field description in the thermodynamic limit, both of which fail to describe the critical spin fluctuations on the lattice the spin probe is sensitive to. Phase metrology could thus help to investigate the critical behaviour of finite many-body systems beyond the validity of mean-field models.

Submitted 1 December, 2023; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: 11 pages, 8 figures

arXiv:2311.13172 [pdf, other]

Learning to Complement with Multiple Humans

Authors: Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, Gustavo Carneiro

Abstract: Real-world image classification tasks tend to be complex, where expert labellers are sometimes unsure about the classes present in the images, leading to the issue of learning with noisy labels (LNL). The ill-posedness of the LNL task requires the adoption of strong assumptions or the use of multiple noisy labels per training image, resulting in accurate models that work well in isolation but fail to optimise human-AI collaborative classification (HAI-CC). Unlike such LNL methods, HAI-CC aims to leverage the synergies between human expertise and AI capabilities but requires clean training labels, limiting its real-world applicability. This paper addresses this gap by introducing the innovative Learning to Complement with Multiple Humans (LECOMH) approach. LECOMH is designed to learn from noisy labels without depending on clean labels, simultaneously maximising collaborative accuracy while minimising the cost of human collaboration, measured by the number of human expert annotations required per image. Additionally, new benchmarks featuring multiple noisy labels for both training and testing are proposed to evaluate HAI-CC methods. Through quantitative comparisons on these benchmarks, LECOMH consistently outperforms competitive HAI-CC approaches, human labellers, multi-rater learning, and noisy-label learning methods across various datasets, offering a promising solution for addressing real-world image classification challenges.

Submitted 1 May, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

Comments: Under review

arXiv:2311.11378 [pdf, other]

Inspecting Explainability of Transformer Models with Additional Statistical Information

Authors: Hoang C. Nguyen, Haeil Lee, Junmo Kim

Abstract: Transformers have become more popular in the vision domain in recent years, so there is a need for an effective way to interpret Transformer models by visualizing them. In recent work, Chefer et al. visualize the Transformer on vision and multi-modal tasks effectively by combining attention layers to show the importance of each image patch. However, when applied to other variants of the Transformer such as the Swin Transformer, this method cannot focus on the predicted object. Our method, by considering the statistics of tokens in layer normalization layers, shows a strong ability to explain the Swin Transformer and ViT.

Submitted 19 November, 2023; originally announced November 2023.

arXiv:2311.09542 [pdf, other]

Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering

Authors: Neha Srikanth, Rupak Sarkar, Heran Mane, Elizabeth M. Aparicio, Quynh C. Nguyen, Rachel Rudinger, Jordan Boyd-Graber

Abstract: Questions posed by information-seeking users often contain implicit false or potentially harmful assumptions. In a high-risk domain such as maternal and infant health, a question-answering system must recognize these pragmatic constraints and go beyond simply answering user questions, examining them in context to respond helpfully. To achieve this, we study assumptions and implications, or pragmatic inferences, made when mothers ask questions about pregnancy and infant care by collecting a dataset of 2,727 inferences from 500 questions across three diverse sources. We study how health experts naturally address these inferences when writing answers, and illustrate that informing existing QA pipelines with pragmatic inferences produces responses that are more complete, mitigating the propagation of harmful beliefs.

Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: Accepted to NAACL 2024

arXiv:2311.05192 [pdf, other]

TransReg: Cross-transformer as auto-registration module for multi-view mammogram mass detection

Authors: Hoang C. Nguyen, Chi Phan, Hieu H. Pham

Abstract: Screening mammography is the most widely used method for early breast cancer detection, significantly reducing mortality rates. The integration of information from multi-view mammograms enhances radiologists' confidence and diminishes false-positive rates, since they can examine dual views of the same breast to cross-reference the existence and location of the lesion. Inspired by this, we present TransReg, a Computer-Aided Detection (CAD) system designed to exploit the relationship between craniocaudal (CC) and mediolateral oblique (MLO) views. The system includes a cross-transformer to model the relationship between the regions of interest (RoIs) extracted by a siamese Faster RCNN network for mass detection problems. Our work is the first time a cross-transformer has been integrated into an object detection framework to model the relation between ipsilateral views. Our experimental evaluation on the DDSM and VinDr-Mammo datasets shows that TransReg, equipped with SwinT as a feature extractor, achieves state-of-the-art performance. Specifically, at a false positive rate per image of 0.5, TransReg using SwinT achieves a recall of 83.3% on the DDSM dataset and 79.7% on the VinDr-Mammo dataset. Furthermore, we conduct a comprehensive analysis to demonstrate that the cross-transformer can function as an auto-registration module, aligning the masses in dual views and utilizing this information to inform final predictions. This replicates the diagnostic workflow of expert radiologists.

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2311.04507 [pdf, other]

doi 10.18653/v1/2023.emnlp-main.937

Conversation Understanding using Relational Temporal Graph Neural Networks with Auxiliary Cross-Modality Interaction

Authors: Cam-Van Thi Nguyen, Anh-Tuan Mai, The-Son Le, Hai-Dang Kieu, Duc-Trong Le

Abstract: Emotion recognition is a crucial task for human conversation understanding. It becomes more challenging with the notion of multimodal data, e.g., language, voice, and facial expressions. As a typical solution, the global and the local context information are exploited to predict the emotional label for every single sentence, i.e., utterance, in the dialogue. Specifically, the global representation could be captured via modeling of cross-modal interactions at the conversation level. The local one is often inferred using the temporal information of speakers or emotional shifts, which neglects vital factors at the utterance level. Additionally, most existing approaches take fused features of multiple modalities in a unified input without leveraging modality-specific representations. Motivated by these problems, we propose the Relational Temporal Graph Neural Network with Auxiliary Cross-Modality Interaction (CORECT), a novel neural network framework that effectively captures conversation-level cross-modality interactions and utterance-level temporal dependencies in a modality-specific manner for conversation understanding. Extensive experiments demonstrate the effectiveness of CORECT via its state-of-the-art results on the IEMOCAP and CMU-MOSEI datasets for the multimodal ERC task.

Submitted 30 January, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: EMNLP 2023

Journal ref: The 2023 Conference on Empirical Methods in Natural Language Processing

arXiv:2311.04224 [pdf, other]

doi 10.1007/s41666-024-00168-3

MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis

Authors: Cuong V. Nguyen, Hieu Minh Duong, Cuong D. Do

Abstract: In practical electrocardiography (ECG) interpretation, the scarcity of well-annotated data is a common challenge. Transfer learning techniques are valuable in such situations, yet the assessment of transferability has received limited attention. To tackle this issue, we introduce MELEP, which stands for Multi-label Expected Log of Empirical Predictions, a measure designed to estimate the effectiveness of knowledge transfer from a pre-trained model to a downstream multi-label ECG diagnosis task. MELEP is generic, working with new target data with different label sets, and computationally efficient, requiring only a single forward pass through the pre-trained model. To the best of our knowledge, MELEP is the first transferability metric specifically designed for multi-label ECG classification problems. Our experiments show that MELEP can predict the performance of pre-trained convolutional and recurrent deep neural networks on small and imbalanced ECG data. Specifically, we observed strong correlation coefficients (with absolute values exceeding 0.6 in most cases) between MELEP and the actual average F1 scores of the fine-tuned models. Our work highlights the potential of MELEP to expedite the selection of suitable pre-trained models for ECG diagnosis tasks, saving time and effort that would otherwise be spent on fine-tuning these models.

Submitted 12 June, 2024; v1 submitted 27 October, 2023; originally announced November 2023.

Comments: Accepted to the Journal of Healthcare Informatics Research

arXiv:2311.03785 [pdf, other]

Self-MI: Efficient Multimodal Fusion via Self-Supervised Multi-Task Learning with Auxiliary Mutual Information Maximization

Authors: Cam-Van Thi Nguyen, Ngoc-Hoa Thi Nguyen, Duc-Trong Le, Quang-Thuy Ha

Abstract: Multimodal representation learning poses significant challenges in capturing informative and distinct features from multiple modalities. Existing methods often struggle to exploit the unique characteristics of each modality due to unified multimodal annotations. In this study, we propose Self-MI in the self-supervised learning fashion, which also leverages Contrastive Predictive Coding (CPC) as an auxiliary technique to maximize the Mutual Information (MI) between unimodal input pairs and the multimodal fusion result with unimodal inputs. Moreover, we design a label generation module, $ULG_{MI}$ for short, that enables us to create meaningful and informative labels for each modality in a self-supervised manner. By maximizing the Mutual Information, we encourage better alignment between the multimodal fusion and the individual modalities, facilitating improved multimodal fusion. Extensive experiments on three benchmark datasets, including CMU-MOSI, CMU-MOSEI, and SIMS, demonstrate the effectiveness of Self-MI in enhancing the multimodal fusion task.

Submitted 7 November, 2023; originally announced November 2023.

Comments: Accepted at The 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37)

arXiv:2311.00737 [pdf]

Real-Time Magnetic Tracking and Diagnosis of COVID-19 via Machine Learning

Authors: Dang Nguyen, Phat K. Huynh, Vinh Duc An Bui, Kee Young Hwang, Nityanand Jain, Chau Nguyen, Le Huu Nhat Minh, Le Van Truong, Xuan Thanh Nguyen, Dinh Hoang Nguyen, Le Tien Dung, Trung Q. Le, Manh-Huong Phan

Abstract: The COVID-19 pandemic underscored the importance of reliable, noninvasive diagnostic tools for robust public health interventions. In this work, we fused magnetic respiratory sensing technology (MRST) with machine learning (ML) to create a diagnostic platform for real-time tracking and diagnosis of COVID-19 and other respiratory diseases. The MRST precisely captures breathing patterns through three specific breath testing protocols: normal breath, holding breath, and deep breath. We collected breath data from both COVID-19 patients and healthy subjects in Vietnam using this platform, which then served to train and validate ML models. Our evaluation encompassed multiple ML algorithms, including support vector machines and deep learning models, assessing their ability to diagnose COVID-19. Our multi-model validation methodology ensures a thorough comparison and grants the adaptability to select the optimal model, striking a balance between diagnostic precision and model interpretability. The findings highlight the exceptional potential of our diagnostic tool in pinpointing respiratory anomalies, achieving over 90% accuracy. This innovative sensor technology can be seamlessly integrated into healthcare settings for patient monitoring, marking a significant enhancement for the healthcare infrastructure.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.17014 [pdf, other]

Measurement of the Multi-Neutron $\bar{\nu}_\mu$ Charged Current Differential Cross Section at Low Available Energy on Hydrocarbon

Authors: A. Olivier, T. Cai, S. Akhter, Z. Ahmad Dar, V. Ansari, M. V. Ascencio, M. Sajjad Athar, A. Bashyal, A. Bercellie, M. Betancourt, J. L. Bonilla, A. Bravar, H. Budd, G. Caceres, G. A. Díaz, J. Felix, L. Fields, A. Filkins, R. Fine, A. M. Gago, P. K. Gaur, S. M. Gilligan, R. Gran, E. Granados, D. A. Harris, et al. (36 additional authors not shown)

Abstract: Neutron production in antineutrino interactions can lead to bias in energy reconstruction in neutrino oscillation experiments, but these interactions have rarely been studied. MINERvA previously studied neutron production at an average antineutrino energy of ~3 GeV in 2016 and found deficiencies in leading models. In this paper, the MINERvA 6 GeV average antineutrino energy data set is shown to have similar disagreements. A measurement of the cross section for an antineutrino to produce two or more neutrons and have low visible energy is presented as an experiment-independent way to explore neutron production modeling. This cross section disagrees with several leading models' predictions. Neutron modeling techniques from nuclear physics are used to quantify neutron detection uncertainties on this result.

Submitted 21 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: 25 pages, 11 figures; added ancillary files with cross section values as .csv; matches preprint accepted by publisher

Report number: PUB-23-610-ND

arXiv:2310.14609 [pdf, other]

Long Short-Term Planning for Conversational Recommendation Systems

Authors: Xian Li, Hongguang Shi, Yunfei Wang, Yeqin Zhang, Xubin Li, Cam-Tu Nguyen

Abstract: In Conversational Recommendation Systems (CRS), the central question is how the conversational agent can naturally ask for user preferences and provide suitable recommendations. Existing works mainly follow the hierarchical architecture, where a higher policy decides whether to invoke the conversation module (to ask questions) or the recommendation module (to make recommendations). This architecture prevents these two components from fully interacting with each other. In contrast, this paper proposes a novel architecture, the long short-term feedback architecture, to connect these two essential components in CRS. Specifically, the recommendation predicts the long-term recommendation target based on the conversational context and the user history. Driven by the targeted recommendation, the conversational model predicts the next topic or attribute to verify if the user preference matches the target. The balance feedback loop continues until the short-term planner output matches the long-term planner output, that is when the system should make the recommendation.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 14 pages, 3 figures. Accepted by ICONIP 2023

arXiv:2310.13512 [pdf, other]

Improving Question Generation with Multi-level Content Planning

Authors: Zehua Xia, Qi Gou, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Cam-Tu Nguyen

Abstract: This paper addresses the problem of generating questions from a given context and an answer, specifically focusing on questions that require multi-hop reasoning across an extended context. Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly for long context. To mitigate this issue, we propose MultiFactor, a novel QG framework based on multi-level content planning. Specifically, MultiFactor includes two components: FA-model, which simultaneously selects key phrases and generates full answers, and Q-model which takes the generated full answer as an additional input to generate questions. Here, full answer generation is introduced to connect the short answer with the selected key phrases, thus forming an answer-aware summary to facilitate QG. Both FA-model and Q-model are formalized as simple-yet-effective Phrase-Enhanced Transformers, our joint model for phrase selection and text generation. Experimental results show that our method outperforms strong baselines on two popular QG datasets. Our code is available at https://github.com/zeaver/MultiFactor.

Submitted 22 October, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: Camera-ready. Accepted by EMNLP 2023 Findings

arXiv:2310.09757 [pdf, other]

doi 10.1109/IROS55552.2023.10342417

MoEmo Vision Transformer: Integrating Cross-Attention and Movement Vectors in 3D Pose Estimation for HRI Emotion Detection

Authors: David C. Jeong, Tianma Shen, Hongji Liu, Raghav Kapoor, Casey Nguyen, Song Liu, Christopher A. Kitts

Abstract: Emotion detection presents challenges to intelligent human-robot interaction (HRI). Foundational deep learning techniques used in emotion detection are limited by information-constrained datasets or models that lack the necessary complexity to learn interactions between input data elements, such as the variance of human emotions across different contexts. In the current effort, we introduce 1) MoEmo (Motion to Emotion), a cross-attention vision transformer (ViT) for human emotion detection within robotics systems based on 3D human pose estimations across various contexts, and 2) a data set that offers full-body videos of human movement and corresponding emotion labels based on human gestures and environmental contexts. Compared to existing approaches, our method effectively leverages the subtle connections between movement vectors of gestures and environmental contexts through the use of cross-attention on the extracted movement vectors of full-body human gestures/poses and feature maps of environmental contexts. We implement a cross-attention fusion model to combine movement vectors and environment contexts into a joint representation to derive emotion estimation. Leveraging our Naturalistic Motion Database, we train the MoEmo system to jointly analyze motion and context, yielding emotion detection that outperforms the current state-of-the-art.

Submitted 15 October, 2023; originally announced October 2023.

Comments: IEEE/RSJ International Conference on Intelligent Robots (IROS), Detroit, Michigan

Journal ref: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots (IROS), 2023

arXiv:2310.08423 [pdf, other]

doi 10.1515/nanoph-2023-0834

Dirac exciton-polariton condensates in photonic crystal gratings

Authors: Helgi Sigurðsson, Hai Chau Nguyen, Hai Son Nguyen

Abstract: Bound states in the continuum have recently been utilized in photonic crystal gratings to achieve strong coupling and ultralow power-driven condensation of bosonic exciton-polariton quasiparticles with atypical Dirac-like features in their dispersion relation. Here, we develop the single- and many-body theory of these new effective relativistic exciton-polariton modes and describe their mean field condensation dynamics facilitated by the interplay between protection from the radiative continuum and negative-mass pump induced optical trapping. Our theory accounts for many tunable grating parameters giving full control over the diffractive coupling properties between guided polaritons and the radiative continuum previously unexplored in the context of driven condensation. In particular, we discover stable cyclical condensate solutions mimicking a driven-dissipative analog of the zitterbewegung effect characterized by coherent superposition of both ballistic (rapid phase front) and trapped (slow phase front) polariton waves. Finally, important distinctions are drawn between the concepts of near field and far field in the photonic grating, clarifying recent experimental observations on the emission characteristics of these long lived nonlinear Dirac polaritons.

Submitted 31 July, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.07951 [pdf, other]

Adaptive model reduction of high-order solutions of compressible flows via optimal transport

Authors: R. Loek Van Heyningen, Ngoc Cuong Nguyen, Patrick Blonigan, Jaime Peraire

Abstract: The solution of conservation laws with parametrized shock waves presents challenges for both high-order numerical methods and model reduction techniques. We introduce an r-adaptivity scheme based on optimal transport and apply it to develop reduced order models for compressible flows. The optimal transport theory allows us to compute high-order r-adaptive meshes from a starting reference mesh by solving the Monge-Ampere equation. A high-order discretization of the conservation laws enables high-order solutions to be computed on the resulting r-adaptive meshes. Furthermore, the Monge-Ampere solutions contain mappings that are used to reduce the spatial locality of the resulting solutions and make them more amenable to model reduction. We use a non-intrusive model reduction method to construct reduced order models of both the mesh and the solution. The procedure is demonstrated on three supersonic and hypersonic test cases, with the hybridizable discontinuous Galerkin method being used as the full order model.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 27 pages, 17 figures

arXiv:2310.03525 [pdf, other]

V2X Cooperative Perception for Autonomous Driving: Recent Advances and Challenges

Authors: Tao Huang, Jianan Liu, Xi Zhou, Dinh C. Nguyen, Mostafa Rahimi Azghadi, Yuxuan Xia, Qing-Long Han, Sumei Sun

Abstract: Accurate perception is essential for advancing autonomous driving and addressing safety challenges in modern transportation systems. Despite significant advancements in computer vision for object recognition, current perception methods still face difficulties in complex real-world traffic environments. Challenges such as physical occlusion and limited sensor field of view persist for individual vehicle systems. Cooperative Perception (CP) with Vehicle-to-Everything (V2X) technologies has emerged as a solution to overcome these obstacles and enhance driving automation systems. While some research has explored CP's fundamental architecture and critical components, there remains a lack of comprehensive summaries of the latest innovations, particularly in the context of V2X communication technologies. To address this gap, this paper provides a comprehensive overview of the evolution of CP technologies, spanning from early explorations to recent developments, including advancements in V2X communication technologies. Additionally, a contemporary generic framework is also proposed to illustrate the V2X-based CP workflow, aiding in the structured understanding of CP system components. Furthermore, this paper categorizes prevailing V2X-based CP methodologies based on the critical issues they address. An extensive literature review is conducted within this taxonomy, evaluating existing datasets and simulators. Finally, open challenges and future directions in CP for autonomous driving are discussed by considering both perception and V2X communication advancements.

Submitted 9 May, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.02876 [pdf, other]

Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation

Authors: Aman Khullar, Daniel Nkemelu, Cuong V. Nguyen, Michael L. Best

Abstract: A growing body of work has focused on text classification methods for detecting the increasing amount of hate speech posted online. This progress has been limited to only a select number of highly-resourced languages causing detection systems to either under-perform or not exist in limited data contexts. This is majorly caused by a lack of training data which is expensive to collect and curate in these settings. In this work, we propose a data augmentation approach that addresses the problem of lack of data for online hate speech detection in limited data contexts using synthetic data generation techniques. Given a handful of hate speech examples in a high-resource language such as English, we present three methods to synthesize new examples of hate speech data in a target language that retains the hate sentiment in the original examples but transfers the hate targets. We apply our approach to generate training data for hate speech classification tasks in Hindi and Vietnamese. Our findings show that a model trained on synthetic data performs comparably to, and in some cases outperforms, a model trained only on the samples available in the target domain. This method can be adopted to bootstrap hate speech detection models from scratch in limited data contexts. As the growth of social media within these contexts continues to outstrip response efforts, this work furthers our capacities for detection, understanding, and response to hate speech.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Accepted at ACM Journal on Computing and Sustainable Societies

arXiv:2310.01196 [pdf, other]

Optimal transport for mesh adaptivity and shock capturing of compressible flows

Authors: Ngoc Cuong Nguyen, R. Loek Van Heyningen, Jordi Vila-Perez, Jaime Peraire

Abstract: We present an optimal transport approach for mesh adaptivity and shock capturing of compressible flows. Shock capturing is based on a viscosity regularization of the governing equations by introducing an artificial viscosity field as solution of the Helmholtz equation. Mesh adaptation is based on the optimal transport theory by formulating a mesh mapping as solution of Monge-Ampere equation. The marriage of optimal transport and viscosity regularization for compressible flows leads to a coupled system of the compressible Euler/Navier-Stokes equations, the Helmholtz equation, and the Monge-Ampere equation. We propose an iterative procedure to solve the coupled system in a sequential fashion using homotopy continuation to minimize the amount of artificial viscosity while enforcing positivity-preserving and smoothness constraints on the numerical solution. We explore various mesh monitor functions for computing r-adaptive meshes in order to reduce the amount of artificial dissipation and improve the accuracy of the numerical solution. The hybridizable discontinuous Galerkin method is used for the spatial discretization of the governing equations to obtain high-order accurate solutions. Extensive numerical results are presented to demonstrate the optimal transport approach on transonic, supersonic, hypersonic flows in two dimensions. The approach is found to yield accurate, sharp yet smooth solutions within a few mesh adaptation iterations.

Submitted 2 October, 2023; originally announced October 2023.

Comments: 41 pages, 22 figures. arXiv admin note: text overlap with arXiv:2305.00461

MSC Class: 35L65; 35L67; 76L05; 65N30

arXiv:2309.15966 [pdf]

doi 10.3847/1538-4365/acebc1

Noise Reduction Methods for Large-scale Intensity-mapping Measurements with Infrared Detector Arrays

Authors: Grigory Heaton, Walter Cook, James Bock, Jill Burnham, Sam Condon, Viktor Hristov, Howard Hui, Branislav Kecman, Phillip Korngut, Hiromasa Miyasaka, Chi Nguyen, Stephen Padin, Marco Viero

Abstract: Intensity mapping observations measure galaxy clustering fluctuations from spectral-spatial maps, requiring stable noise properties on large angular scales. We have developed specialized readouts and analysis methods for achieving large-scale noise stability with Teledyne 2048$\times$2048 H2RG infrared detector arrays. We designed and fabricated a room-temperature low-noise ASIC Video8 amplifier to sample each of the 32 detector outputs continuously in sample-up-the-ramp mode with interleaved measurements of a stable reference voltage that remove current offsets and $1/f$ noise from the amplifier. The amplifier addresses rows in an order different from their physical arrangement on the array, modulating temporal $1/f$ noise in the H2RG to high spatial frequencies. Finally, we remove constant signal offsets in each of the 32 channels using reference pixels. These methods will be employed in the upcoming SPHEREx orbital mission that will carry out intensity mapping observations in near-infrared spectral maps in deep fields located near the ecliptic poles. We also developed a noise model for the H2RG and Video8 to optimize the choice of parameters. Our analysis indicates that these methods hold residual $1/f$ noise near the level of SPHEREx photon noise on angular scales smaller than $\sim30$ arcminutes.

Submitted 27 September, 2023; originally announced September 2023.

Comments: Accepted to Astrophysical Journal Supplement Series

Journal ref: ApJS 268 44 (2023)

arXiv:2309.15483 [pdf, ps, other]

Energy-Efficient Precoding Designs for Multi-User Visible Light Communication Systems with Confidential Messages

Authors: Son T. Duong, Thanh V. Pham, Chuyen T. Nguyen, Anh T. Pham

Abstract: This paper studies energy-efficient precoding designs for multi-user visible light communication (VLC) systems from the perspective of physical layer security where users' messages must be kept mutually confidential. For such systems, we first derive a lower bound on the achievable secrecy rate of each user. Next, the total power consumption for illumination and data transmission is thoroughly analyzed. We then tackle the problem of maximizing energy efficiency, given that each user's secrecy rate satisfies a certain threshold. The design problem is shown to be non-convex fractional programming, which renders finding the optimal solution computationally prohibitive. Our aim in this paper is, therefore, to find sub-optimal yet low complexity solutions. For this purpose, the traditional Dinkelbach algorithm is first employed to reformulate the original problem to a non-fractional parameterized one. Two different approaches based on the convex-concave procedure (CCCP) and Semidefinite Relaxation (SDR) are utilized to solve the non-convex parameterized problem. In addition, to further reduce the complexity, we investigate a design using the zero-forcing (ZF) technique. Numerical results are conducted to show the feasibility, convergence, and performance of the proposed algorithms depending on different parameters of the system.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.11039 [pdf, other]

Federated Learning in Intelligent Transportation Systems: Recent Applications and Open Problems

Authors: Shiying Zhang, Jun Li, Long Shi, Ming Ding, Dinh C. Nguyen, Wuzheng Tan, Jian Weng, Zhu Han

Abstract: Intelligent transportation systems (ITSs) have been fueled by the rapid development of communication technologies, sensor technologies, and the Internet of Things (IoT). Nonetheless, due to the dynamic characteristics of the vehicle networks, it is rather challenging to make timely and accurate decisions of vehicle behaviors. Moreover, in the presence of mobile wireless communications, the privacy and security of vehicle information are at constant risk. In this context, a new paradigm is urgently needed for various applications in dynamic vehicle environments. As a distributed machine learning technology, federated learning (FL) has received extensive attention due to its outstanding privacy protection properties and easy scalability. We conduct a comprehensive survey of the latest developments in FL for ITS. Specifically, we initially research the prevalent challenges in ITS and elucidate the motivations for applying FL from various perspectives. Subsequently, we review existing deployments of FL in ITS across various scenarios, and discuss specific potential issues in object recognition, traffic management, and service providing scenarios. Furthermore, we conduct a further analysis of the new challenges introduced by FL deployment and the inherent limitations that FL alone cannot fully address, including uneven data distribution, limited storage and computing power, and potential privacy and security concerns. We then examine the existing collaborative technologies that can help mitigate these challenges. Lastly, we discuss the open challenges that remain to be addressed in applying FL in ITS and propose several future research directions.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.09400 [pdf, other]

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Authors: Thuat Nguyen, Chien Van Nguyen, Viet Dac Lai, Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen

Abstract: The driving factors behind the development of large language models (LLMs) with impressive learning capabilities are their colossal model sizes and extensive training datasets. Along with the progress in natural language processing, LLMs have been frequently made accessible to the public to foster deeper investigation and applications. However, when it comes to training datasets for these LLMs, especially the recent state-of-the-art models, they are often not fully disclosed. Creating training data for high-performing LLMs involves extensive cleaning and deduplication to ensure the necessary level of quality. The lack of transparency for training data has thus hampered research on attributing and addressing hallucination and bias issues in LLMs, hindering replication efforts and further advancements in the community. These challenges become even more pronounced in multilingual learning scenarios, where the available multilingual text datasets are often inadequately collected and cleaned. Consequently, there is a lack of open-source and readily usable dataset to effectively train LLMs in multiple languages. To overcome this issue, we present CulturaX, a substantial multilingual dataset with 6.3 trillion tokens in 167 languages, tailored for LLM development. Our dataset undergoes meticulous cleaning and deduplication through a rigorous pipeline of multiple stages to accomplish the best quality for model training, including language identification, URL-based filtering, metric-based cleaning, document refinement, and data deduplication. CulturaX is fully released to the public in HuggingFace to facilitate research and advancements in multilingual LLMs: https://huggingface.co/datasets/uonlp/CulturaX.

Submitted 17 September, 2023; originally announced September 2023.

Comments: Ongoing Work

arXiv:2308.15804 [pdf, other]

Collaborative Learning Framework to Detect Attacks in Transactions and Smart Contracts

Authors: Tran Viet Khoa, Do Hai Son, Chi-Hieu Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Tran Thi Thuy Quynh, Trong-Minh Hoang, Nguyen Viet Ha, Eryk Dutkiewicz, Abu Alsheikh, Nguyen Linh Trung

Abstract: With the escalating prevalence of malicious activities exploiting vulnerabilities in blockchain systems, there is an urgent requirement for robust attack detection mechanisms. To address this challenge, this paper presents a novel collaborative learning framework designed to detect attacks in blockchain transactions and smart contracts by analyzing transaction features. Our framework exhibits the capability to classify various types of blockchain attacks, including intricate attacks at the machine code level (e.g., injecting malicious codes to withdraw coins from users unlawfully), which typically necessitate significant time and security expertise to detect. To achieve that, the proposed framework incorporates a unique tool that transforms transaction features into visual representations, facilitating efficient analysis and classification of low-level machine codes. Furthermore, we propose an advanced collaborative learning model to enable real-time detection of diverse attack types at distributed mining nodes. Our model can efficiently detect attacks in smart contracts and transactions for blockchain systems without the need to gather all data from mining nodes into a centralized server. In order to evaluate the performance of our proposed framework, we deploy a pilot system based on a private Ethereum network and conduct multiple attack scenarios to generate a novel dataset. To the best of our knowledge, our dataset is the most comprehensive and diverse collection of transactions and smart contracts synthesized in a laboratory for cyberattack detection in blockchain systems. Our framework achieves a detection accuracy of approximately 94% through extensive simulations and 91% in real-time experiments with a throughput of over 2,150 transactions per second.

Submitted 10 August, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

arXiv:2308.13924 [pdf, other]

doi 10.1145/3586183.3606832

PaperToPlace: Transforming Instruction Documents into Spatialized and Context-Aware Mixed Reality Experiences

Authors: Chen Chen, Cuong Nguyen, Jane Hoffswell, Jennifer Healey, Trung Bui, Nadir Weibel

Abstract: While paper instructions are one of the mainstream media for sharing knowledge, consuming such instructions and translating them into activities is inefficient due to the lack of connectivity with the physical environment. We present PaperToPlace, a novel workflow comprising an authoring pipeline, which allows authors to rapidly transform and spatialize existing paper instructions into an MR experience, and a consumption pipeline, which computationally places each instruction step at an optimal location that is easy to read and does not occlude key interaction areas. Our evaluation of the authoring pipeline with 12 participants demonstrated the usability of our workflow and the effectiveness of using a machine learning based approach to help extract the spatial locations associated with each step. A second within-subject study with another 12 participants demonstrates the merits of our consumption pipeline by reducing efforts of context switching, delivering the segmented instruction steps and offering the hands-free affordances.

Submitted 26 August, 2023; originally announced August 2023.

Comments: 21 pages, 23 figures, Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST '23), San Francisco, CA, USA

ACM Class: H.4.m; H.5.2; I.7.m

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

Showing 51–100 of 636 results for author: Nguyen, C