End-to-end learned Lossy Dynamic Point Cloud Attribute Compression

Authors: Dat Thanh Nguyen, Daniel Zieger, Marc Stamminger, Andre Kaup

Abstract: Recent advancements in point cloud compression have primarily emphasized geometry compression, while comparatively fewer efforts have been dedicated to attribute compression. This study introduces an end-to-end learned dynamic lossy attribute coding approach, utilizing an efficient high-dimensional convolution to capture extensive inter-point dependencies. This enables the efficient projection of attribute features into latent variables. Subsequently, we employ a context model that leverages the previous latent space in conjunction with an auto-regressive context model for encoding the latent tensor into a bitstream. Evaluation of our method on widely utilized point cloud datasets from MPEG and Microsoft demonstrates its superior performance compared to the core attribute compression module of MPEG Geometry Point Cloud Compression, the Region-Adaptive Hierarchical Transform method, with a 38.1% Bjontegaard Delta-rate saving on average while ensuring low-complexity encoding/decoding.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 6 pages, accepted for presentation at the 2024 IEEE International Conference on Image Processing (ICIP)

arXiv:2407.20249 [pdf, other]

Revisiting the Disequilibrium Issues in Tackling Heart Disease Classification Tasks

Authors: Thao Hoang, Linh Nguyen, Khoi Do, Duong Nguyen, Viet Dung Nguyen

Abstract: In the field of heart disease classification, two primary obstacles arise. Firstly, existing Electrocardiogram (ECG) datasets consistently demonstrate imbalances and biases across various modalities. Secondly, these time-series data consist of diverse lead signals, causing Convolutional Neural Networks (CNNs) to overfit to the lead with higher power, hence diminishing the performance of the Deep Learning (DL) process. In addition, when facing an imbalanced dataset, performance from such high-dimensional data may be susceptible to overfitting. Current efforts predominantly focus on enhancing DL models by designing novel architectures despite these evident challenges, seemingly overlooking the core issues and therefore hindering advancements in heart disease classification. To address these obstacles, our proposed approach introduces two straightforward and direct methods to enhance the classification tasks. To address the high dimensionality issue, we employ a Channel-wise Magnitude Equalizer (CME) on signal-encoded images. This approach reduces redundancy in the feature data range, highlighting changes in the dataset. Simultaneously, to counteract data imbalance, we propose the Inverted Weight Logarithmic Loss (IWL). When applying the IWL loss, the accuracy of state-of-the-art (SOTA) models increases by up to 5% on the CPSC2018 dataset. CME in combination with IWL also surpasses the classification results of other baseline models by 5% to 10%.

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2407.20247 [pdf, other]

How Homogenizing the Channel-wise Magnitude Can Enhance EEG Classification Model?

Authors: Huyen Ngo, Khoi Do, Duong Nguyen, Viet Dung Nguyen, Lan Dang

Abstract: A significant challenge with electroencephalogram (EEG) data lies in the fact that current data representations involve multiple electrode signals, resulting in data redundancy and dominant lead information. However, extensive research conducted on EEG classification focuses on designing model architectures without tackling the underlying issues. Moreover, there has been a notable gap in addressing data preprocessing for EEG, leading to considerable computational overhead in Deep Learning (DL) processes. In light of these issues, we propose a simple yet effective approach for EEG data pre-processing. Our method first transforms the EEG data into an encoded image by an Inverted Channel-wise Magnitude Homogenization (ICWMH) to mitigate inter-channel biases. Next, we apply an edge detection technique to the EEG-encoded image, combined with a skip connection, to emphasize the most significant transitions in the data while preserving structural and invariant information. By doing so, we can improve the EEG learning process efficiently without using a huge DL network. Our experimental evaluations reveal a significant improvement (from 2% to 5%) over current baselines.

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2407.16803 [pdf, other]

Fusion and Cross-Modal Transfer for Zero-Shot Human Action Recognition

Authors: Abhi Kamboj, Anh Duy Nguyen, Minh Do

Abstract: Despite living in a multi-sensory world, most AI models are limited to textual and visual interpretations of human motion and behavior. Inertial measurement units (IMUs) provide a salient signal to understand human motion; however, they are challenging to use due to their uninterpretability and scarcity of their data. We investigate a method to transfer knowledge between visual and inertial modalities using the structure of an informative joint representation space designed for human action recognition (HAR). We apply the resulting Fusion and Cross-modal Transfer (FACT) method to a novel setup, where the model does not have access to labeled IMU data during training and is able to perform HAR with only IMU data during testing. Extensive experiments on a wide range of RGB-IMU datasets demonstrate that FACT significantly outperforms existing methods in zero-shot cross-modal transfer.

Submitted 23 July, 2024; originally announced July 2024.

arXiv:2407.09534 [pdf, other]

DFS-based fast crack detection

Authors: Vsevolod Chernyshev, Vitalii Makogin, Duc Nguyen, Evgeny Spodarev

Abstract: In this paper, we propose a fast method for crack detection in 3D computed tomography (CT) images. Our approach combines the Maximal Hessian Entry filter and a Depth-First Search algorithm-based technique to strike a balance between computational complexity and accuracy. Experimental results demonstrate the effectiveness of our approach in detecting the crack structure with a predefined misclassification probability.

Submitted 25 June, 2024; originally announced July 2024.

arXiv:2407.06142 [pdf, ps, other]

Delay-Aware Robust Edge Network Hardening Under Decision-Dependent Uncertainty

Authors: Jiaming Cheng, Duong Thuy Anh Nguyen, Ni Trieu, Duong Tung Nguyen

Abstract: Edge computing promises to offer low-latency and ubiquitous computation to numerous devices at the network edge. For delay-sensitive applications, link delays can have a direct impact on service quality. These delays can fluctuate drastically over time due to various factors such as network congestion, changing traffic conditions, cyberattacks, component failures, and natural disasters. Thus, it is crucial to efficiently harden the edge network to mitigate link delay variation as well as ensure a stable and improved user experience. To this end, we propose a novel robust model for optimal edge network hardening, considering the link delay uncertainty. Departing from the existing literature that treats uncertainties as exogenous, our model incorporates an endogenous uncertainty set to properly capture the impact of hardening and workload allocation decisions on link delays. However, the endogenous set introduces additional complexity to the problem due to the interdependence between decisions and uncertainties. We present two efficient methods to transform the problem into a solvable form. Extensive numerical results are shown to demonstrate the effectiveness of the proposed approach.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 14 pages, 18 figures

arXiv:2406.11220 [pdf, other]

No Analog Combiner TTD-based Hybrid Precoding for Multi-User Sub-THz Communications

Authors: Dang Qua Nguyen, Alexei Ashikhmin, Hong Yang, Taejoon Kim

Abstract: We address the design and optimization of real-world-suitable hybrid precoders for multi-user wideband sub-terahertz (sub-THz) communications. We note that the conventional fully connected true-time delay (TTD)-based architecture is impractical because there is no room on the circuit board for the required large number of analog signal combiners. Additionally, analog signal combiners incur significant signal power loss. These limitations are often overlooked in sub-THz research. To overcome these issues, we study a non-overlapping subarray architecture that eliminates the need for analog combiners. We extend the conventional single-user assumption by formulating an optimization problem to maximize the minimum data rate for simultaneously served users. This complex optimization problem is divided into two sub-problems. The first sub-problem aims to ensure a fair subarray allocation for all users and is solved via a continuous domain relaxation technique. The second sub-problem deals with practical TTD device constraints on range and resolution to maximize the subarray gain and is resolved by shifting to the phase domain. Our simulation results highlight significant performance gains for our real-world-ready TTD-based hybrid precoders.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.02879 [pdf, ps, other]

Second-order differential operators, stochastic differential equations and Brownian motions on embedded manifolds

Authors: Du Nguyen, Stefan Sommer

Abstract: We specify the conditions when a manifold M embedded in an inner product space E is an invariant manifold of a stochastic differential equation (SDE) on E, linking it with the notion of second-order differential operators on M. When M is given a Riemannian metric, we derive a simple formula for the Laplace-Beltrami operator in terms of the gradient and Hessian on E and construct the Riemannian Brownian motions on M as solutions of conservative Stratonovich and Ito SDEs on E. We derive explicitly the SDE for Brownian motions on several important manifolds in applications, including left-invariant matrix Lie groups using embedded coordinates. Numerically, we propose three simulation schemes to solve SDEs on manifolds. In addition to the stochastic projection method, to simulate Riemannian Brownian motions, we construct a second-order tangent retraction of the Levi-Civita connection using a given E-tubular retraction. We also propose the retractive Euler-Maruyama method to solve an SDE, taking into account the second-order term of a tangent retraction. We provide software to implement the methods in the paper, including Brownian motions of the manifolds discussed. We verify numerically that on several compact Riemannian manifolds, the long-term limit of Brownian simulation converges to the uniform distribution, suggesting a method to sample Riemannian uniform distributions.

Submitted 4 June, 2024; originally announced June 2024.

MSC Class: 65C30; 65L20; 65C20; 60J65; 58J65

arXiv:2406.02555 [pdf, ps, other]

PhoWhisper: Automatic Speech Recognition for Vietnamese

Authors: Thanh-Thien Le, Linh The Nguyen, Dat Quoc Nguyen

Abstract: We introduce PhoWhisper in five versions for Vietnamese automatic speech recognition. PhoWhisper's robustness is achieved through fine-tuning the Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents. Our experimental study demonstrates state-of-the-art performance of PhoWhisper on benchmark Vietnamese ASR datasets. We have open-sourced PhoWhisper at: https://github.com/VinAIResearch/PhoWhisper

Submitted 27 March, 2024; originally announced June 2024.

Comments: Accepted to ICLR 2024 Tiny Papers Track

arXiv:2405.16664 [pdf]

Deep learning improved autofocus for motion artifact reduction and its application in quantitative susceptibility mapping

Authors: Chao Li, Jinwei Zhang, Hang Zhang, Jiahao Li, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang

Abstract: Purpose: To develop a pipeline for motion artifact correction in mGRE and quantitative susceptibility mapping (QSM). Methods: Deep learning is integrated with autofocus to improve motion artifact suppression, which is applied to QSM of patients with Parkinson's disease (PD). The estimation of affine motion parameters in the autofocus method depends on signal-to-noise ratio and lacks accuracy when data sampling occurs outside the k-space center. A deep learning strategy is employed to remove the residual motion artifacts in autofocus. Results: Results obtained in simulated brain data (n = 15) with reference truth show that the proposed autofocus deep learning method significantly improves the image quality of mGRE and QSM (p = 0.001 for SSIM, p < 0.0001 for PSNR and RMSE). QSM data from 10 PD patients with real motion artifacts were also corrected using the proposed method and sent to an experienced radiologist for image quality evaluation, and the average image quality score increased (p = 0.0039). Conclusions: The proposed method enables substantial suppression of motion artifacts in mGRE and QSM.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.02994 [pdf, other]

Extended State Observer for Mismatch Disturbances Using Taylor Approximation of the Integral

Authors: Cuong Duc Nguyen

Abstract: The development of disturbance estimators using extended state observers (ESOs) typically assumes that the system is observable. This paper introduces an improved method for systems that are initially unobservable, leveraging Taylor expansion to approximate the integral of disturbance dynamics. A new extended system is formulated based on this approximation, enabling the design of an observer that achieves exponential stability of the error dynamics. The proposed method's efficacy is demonstrated through a practical example, highlighting its potential for robust disturbance estimation in dynamic systems.

Submitted 5 May, 2024; originally announced May 2024.

arXiv:2405.00712 [pdf, other]

SoK: Behind the Accuracy of Complex Human Activity Recognition Using Deep Learning

Authors: Duc-Anh Nguyen, Nhien-An Le-Khac

Abstract: Human Activity Recognition (HAR) is a well-studied field with research dating back to the 1980s. Over time, HAR technologies have evolved significantly from manual feature extraction, rule-based algorithms, and simple machine learning models to powerful deep learning models, and from one sensor type to a diverse array of sensing modalities. The scope has also expanded from recognising a limited set of activities to encompassing a larger variety of both simple and complex activities. However, there still exist many challenges that hinder advancement in complex activity recognition using modern deep learning methods. In this paper, we comprehensively systematise factors leading to inaccuracy in complex HAR, such as data variety and model capacity. Among many sensor types, we give more attention to wearable and camera sensors due to their prevalence. Through this Systematisation of Knowledge (SoK) paper, readers can gain a solid understanding of the development history and existing challenges of HAR, different categorisations of activities, obstacles in deep learning-based complex HAR that impact accuracy, and potential research directions.

Submitted 3 May, 2024; v1 submitted 25 April, 2024; originally announced May 2024.

arXiv:2403.17392 [pdf, other]

Natural-artificial hybrid swarm: Cyborg-insect group navigation in unknown obstructed soft terrain

Authors: Yang Bai, Phuoc Thanh Tran Ngoc, Huu Duoc Nguyen, Duc Long Le, Quang Huy Ha, Kazuki Kai, Yu Xiang See To, Yaosheng Deng, Jie Song, Naoki Wakamiya, Hirotaka Sato, Masaki Ogura

Abstract: Navigating multi-robot systems in complex terrains has always been a challenging task. This is due to the inherent limitations of traditional robots in collision avoidance, adaptation to unknown environments, and sustained energy efficiency. In order to overcome these limitations, this research proposes a solution that integrates living insects with miniature electronic controllers to enable robotic-like programmable control, together with a novel control algorithm for swarming. Although these creatures, called cyborg insects, have the ability to instinctively avoid collisions with neighbors and obstacles while adapting to complex terrains, there is a lack of literature on the control of multi-cyborg systems. This research gap is due to the difficulty in coordinating the movements of a cyborg system in the presence of insects' inherent individual variability in their reactions to control input. In response to this issue, we propose a novel swarm navigation algorithm addressing these challenges. The effectiveness of the algorithm is demonstrated through an experimental validation in which a cyborg swarm was successfully navigated through an unknown sandy field with obstacles and hills. This research contributes to the domain of swarm robotics and showcases the potential of integrating biological organisms with robotics and control theory to create more intelligent autonomous systems with real-world applications.

Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.16595 [pdf, other]

The Adaptive Workplace: Orchestrating Architectural Services around the Wellbeing of Individual Occupants

Authors: Andrew Vande Moere, Sara Arko, Alena Safrova Drasilova, Tomáš Ondráček, Ilaria Pigliautile, Benedetta Pioppi, Anna Laura Pisello, Jakub Prochazka, Paula Acuna Roncancio, Davide Schaumann, Marcel Schweiker, Binh Vinh Duc Nguyen

Abstract: As the academic consortium members of the EU Horizon project SONATA ("Situation-aware OrchestratioN of AdapTive Architecture"), we respond to the workshop call for "Office Wellbeing by Design: Don't Stand for Anything Less" by proposing the "Adaptive Workplace" concept. In essence, our vision aims to adapt a workplace to the ever-changing needs of individual occupants, instead of expecting occupants to adapt to their workplace.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.11104 [pdf, other]

Deep Neural Network NMPC for Computationally Tractable Optimal Power Management of Hybrid Electric Vehicle

Authors: Suyong Park, Duc Giap Nguyen, Jinrak Park, Dohee Kim, Jeong Soo Eo, Kyoungseok Han

Abstract: This study presents a method for deep neural network nonlinear model predictive control (DNN-MPC) to reduce computational complexity, and we show its practical utility through its application in optimizing the energy management of hybrid electric vehicles (HEVs). For optimal power management of HEVs, we first design the online NMPC to collect the data set, and the deep neural network is trained to approximate the NMPC solutions. We assess the effectiveness of our approach by conducting comparative simulations with rule-based and online NMPC-based power management strategies for HEVs, evaluating both fuel consumption and computational complexity. Lastly, we verify the real-time feasibility of our approach through process-in-the-loop (PIL) testing. The test results demonstrate that the proposed method closely approximates the NMPC performance while substantially reducing the computational burden.

Submitted 17 March, 2024; originally announced March 2024.

Comments: 6 pages, 10 figures, 3 tables, 2024 ACC conference (accepted)

arXiv:2403.08371 [pdf, other]

User-Centric Beam Selection and Precoding Design for Coordinated Multiple-Satellite Systems

Authors: Vu Nguyen Ha, Duy H. N. Nguyen, Juan C. -M. Duncan, Jorge L. Gonzalez-Rios, Juan A. Vasquez, Geoffrey Eappen, Luis M. Garces-Socarras, Rakesh Palisetty, Symeon Chatzinotas, Bjorn Ottersten

Abstract: This paper introduces a joint optimization framework for user-centric beam selection and linear precoding (LP) design in a coordinated multiple-satellite (CoMSat) system, employing a Digital-Fourier-Transform-based (DFT) beamforming (BF) technique. To serve users at their target SINRs while minimizing the total transmit power, the scheme aims to efficiently determine which satellites users associate with and to activate the best cluster of beams, together with optimizing LP for every satellite-to-user transmission. These technical objectives are first framed as a complex mixed-integer programming (MIP) challenge. To tackle this, we reformulate it into a joint cluster association and LP design problem. Then, by theoretically analyzing the duality relationship between downlink and uplink transmissions, we develop an efficient iterative method to identify the optimal solution. Additionally, a simpler duality approach for rapid beam selection and LP design is presented for comparison purposes. Simulation results underscore the effectiveness of our proposed schemes across various settings.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.03648 [pdf, other]

Multilinear Kernel Regression and Imputation via Manifold Learning

Authors: Duc Thien Nguyen, Konstantinos Slavakis

Abstract: This paper introduces a novel nonparametric framework for data imputation, coined multilinear kernel regression and imputation via the manifold assumption (MultiL-KRIM). Motivated by manifold learning, MultiL-KRIM models data features as a point cloud located in or close to a user-unknown smooth manifold embedded in a reproducing kernel Hilbert space. Unlike typical manifold-learning routes, which seek low-dimensional patterns via regularizers based on graph-Laplacian matrices, MultiL-KRIM builds instead on the intuitive concept of tangent spaces to manifolds and incorporates collaboration among point-cloud neighbors (regressors) directly into the data-modeling term of the loss function. Multiple kernel functions are allowed to offer robustness and rich approximation properties, while multiple matrix factors offer low-rank modeling, integrate dimensionality reduction, and streamline computations with no need for training data. Two important application domains showcase the functionality of MultiL-KRIM: time-varying-graph-signal (TVGS) recovery, and reconstruction of highly accelerated dynamic-magnetic-resonance-imaging (dMRI) data. Extensive numerical tests on real and synthetic data demonstrate MultiL-KRIM's remarkable speedups over its predecessors, and outperformance over prevalent "shallow" data-imputation techniques, with a more intuitive and explainable pipeline than deep-image-prior methods.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.00238 [pdf, other]

CNN-FL for Biotechnology Industry Empowered by Internet-of-BioNano Things and Digital Twins

Authors: Mohammad Jamshidi, Dinh Thai Hoang, Diep N. Nguyen

Abstract: Digital twins (DTs) are revolutionizing the biotechnology industry by enabling sophisticated digital representations of biological assets, microorganisms, drug development processes, and digital health applications. However, digital twinning at micro and nano scales, particularly in modeling complex entities like bacteria, presents significant challenges in terms of requiring advanced Internet of Things (IoT) infrastructure and computing approaches to achieve enhanced accuracy and scalability. In this work, we propose a novel framework that integrates the Internet of Bio-Nano Things (IoBNT) with advanced machine learning techniques, specifically convolutional neural networks (CNN) and federated learning (FL), to effectively tackle the identified challenges. Within our framework, IoBNT devices are deployed to gather image-based biological data across various physical environments, leveraging the strong capabilities of CNNs for robust machine vision and pattern recognition. Subsequently, FL is utilized to aggregate insights from these disparate data sources, creating a refined global model that continually enhances accuracy and predictive reliability, which is crucial for the effective deployment of DTs in biotechnology. The primary contribution is the development of a novel framework that synergistically combines CNN and FL, augmented by the capabilities of the IoBNT. This novel approach is specifically tailored to enhancing DTs in the biotechnology industry. The results showcase enhancements in the reliability and safety of microorganism DTs, while preserving their accuracy. Furthermore, the proposed framework excels in energy efficiency and security, offering a user-friendly and adaptable solution. This broadens its applicability across diverse sectors, including biotechnology and pharmaceutical industries, as well as clinical and hospital settings.

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2401.12488 [pdf]

An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network

Authors: Viet Dung Nguyen, Michael T. LaCour, Richard D. Komistek

Abstract: Image segmentation in total knee arthroplasty is crucial for precise preoperative planning and accurate implant positioning, leading to improved surgical outcomes and patient satisfaction. The biggest challenges of image segmentation in total knee arthroplasty include accurately delineating complex anatomical structures, dealing with image artifacts and noise, and developing robust algorithms that can handle anatomical variations and pathologies commonly encountered in patients. The potential of using machine learning for image segmentation in total knee arthroplasty lies in its ability to improve segmentation accuracy, automate the process, and provide real-time assistance to surgeons, leading to enhanced surgical planning, implant placement, and patient outcomes. This paper proposes a methodology to use deep learning for robust and real-time total knee arthroplasty image segmentation. The deep learning model, trained on a large dataset, demonstrates outstanding performance in accurately segmenting both the implanted femur and tibia, achieving an impressive mean-Average-Precision (mAP) of 88.83 when compared to the ground truth while also achieving a real-time segmented speed of 20 frames per second (fps). We have introduced a novel methodology for segmenting implanted knee fluoroscopic or x-ray images that showcases remarkable levels of accuracy and speed, paving the way for various potential extended applications.

Submitted 24 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.10032 [pdf, other]

FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder

Authors: Tan Dat Nguyen, Ji-Hoon Kim, Youngjoon Jang, Jaehun Kim, Joon Son Chung

Abstract: The goal of this paper is to generate realistic audio with a lightweight and fast diffusion-based vocoder, named FreGrad. Our framework consists of the following three key components: (1) We employ discrete wavelet transform that decomposes a complicated waveform into sub-band wavelets, which helps FreGrad to operate on a simple and concise feature space, (2) We design a frequency-aware dilated convolution that elevates frequency awareness, resulting in generating speech with accurate frequency information, and (3) We introduce a bag of tricks that boosts the generation quality of the proposed model. In our experiments, FreGrad achieves 3.7 times faster training time and 2.2 times faster inference speed compared to our baseline while reducing the model size by 0.6 times (only 1.78M parameters) without sacrificing the output quality. Audio samples are available at: https://mm.kaist.ac.kr/projects/FreGrad.

Submitted 18 January, 2024; originally announced January 2024.

Comments: Accepted to ICASSP 2024

arXiv:2401.00682 [pdf, other]

doi 10.1109/ICCAIS59597.2023.10382267

The Smooth Trajectory Estimator for LMB Filters

Authors: Hoa Van Nguyen, Tran Thien Dat Nguyen, Changbeom Shim, Marzhar Anuar

Abstract: This paper proposes a smooth-trajectory estimator for the labelled multi-Bernoulli (LMB) filter by exploiting the special structure of the generalised labelled multi-Bernoulli (GLMB) filter. We devise a simple and intuitive approach to store the best association map when approximating the GLMB random finite set (RFS) to the LMB RFS. In particular, we construct a smooth-trajectory estimator (i.e., an estimator over the entire trajectories of labelled estimates) for the LMB filter based on the history of the best association map and all of the measurements up to the current time. Experimental results under two challenging scenarios demonstrate significant tracking accuracy improvements with negligible additional computational time compared to the conventional LMB filter. The source code is publicly available at https://tinyurl.com/ste-lmb, aimed at promoting advancements in MOT algorithms.

Submitted 1 January, 2024; originally announced January 2024.

Comments: 6 pages, 5 figures. Presented at The 12th IEEE International Conference on Control, Automation and Information Sciences (ICCAIS 2023), Nov 2023, Hanoi, Vietnam

arXiv:2312.16835 [pdf, other]

RimSet: Quantitatively Identifying and Characterizing Chronic Active Multiple Sclerosis Lesion on Quantitative Susceptibility Maps

Authors: Hang Zhang, Thanh D. Nguyen, Jinwei Zhang, Renjiu Hu, Susan A. Gauthier, Yi Wang

Abstract: Background: Rim+ lesions in multiple sclerosis (MS), detectable via Quantitative Susceptibility Mapping (QSM), correlate with increased disability. Existing literature lacks quantitative analysis of these lesions. We introduce RimSet for quantitative identification and characterization of rim+ lesions on QSM. Methods: RimSet combines RimSeg, an unsupervised segmentation method using level-set methodology, and radiomic measurements with Local Binary Pattern texture descriptors. We validated RimSet using simulated QSM images and an in vivo dataset of 172 MS subjects with 177 rim+ and 3986 rim- lesions. Results: RimSeg achieved a 78.7% Dice score against the ground truth, with challenges in partial rim lesions. RimSet detected rim+ lesions with a partial ROC AUC of 0.808 and PR AUC of 0.737, surpassing existing methods. QSMRim-Net showed the lowest mean square error (0.85) and high correlation (0.91; 95% CI: 0.88, 0.93) with expert annotations at the subject level.

Submitted 28 December, 2023; originally announced December 2023.

Comments: 13 pages, 7 figures, 4 tables

arXiv:2312.10543 [pdf, other]

Study of cognitive component of auditory attention to natural speech events

Authors: Nhan D. T. Nguyen, Kaare Mikkelsen, Preben Kidmose

Abstract: Event-related potentials (ERP) have been used to address a wide range of research questions in neuroscience and cognitive psychology including selective auditory attention. The recent progress in auditory attention decoding (AAD) methods is based on algorithms that find a relation between the audio envelope and the neurophysiological response. The most popular approach is based on the reconstruction of the audio envelope based on EEG signals. However, these methods are mainly based on the neurophysiological entrainment to physical attributes of the sensory stimulus and are generally limited by a long detection window. This study proposes a novel approach to auditory attention decoding by looking at higher-level cognitive responses to natural speech. To investigate if natural speech events elicit cognitive ERP components and how these components are affected by attention mechanisms, we designed a series of four experimental paradigms with increasing complexity: a word category oddball paradigm, a word category oddball paradigm with competing speakers, and competing speech streams with and without specific targets. We recorded the electroencephalogram (EEG) from 32 scalp electrodes and 12 in-ear electrodes (ear-EEG) from 24 participants. A cognitive ERP component, which we believe is related to the well-known P3b component, was observed at parietal electrode sites with a latency of approximately 620 ms. The component is statistically most significant for the simplest paradigm and gradually decreases in strength with increasing complexity of the paradigm.
We also show that the component can be observed in the in-ear EEG signals by using spatial filtering. The cognitive component elicited by auditory attention may contribute to decoding auditory attention from electrophysiological recordings and its presence in the ear-EEG signals is promising for future applications within hearing aids.

Submitted 19 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

Comments: 15 pages, 11 figures

arXiv:2312.07011 [pdf, ps, other]

Securing MIMO Wiretap Channel with Learning-Based Friendly Jamming under Imperfect CSI

Authors: Bui Minh Tuan, Diep N. Nguyen, Nguyen Linh Trung, Van-Dinh Nguyen, Nguyen Van Huynh, Dinh Thai Hoang, Marwan Krunz, Eryk Dutkiewicz

Abstract: Wireless communications are particularly vulnerable to eavesdropping attacks due to their broadcast nature. To effectively deal with eavesdroppers, existing security techniques usually require accurate channel state information (CSI), e.g., for friendly jamming (FJ), and/or additional computing resources at transceivers, e.g., cryptography-based solutions, which unfortunately may not be feasible in practice. This challenge is even more acute in low-end IoT devices. We thus introduce a novel deep learning-based FJ framework that can effectively defeat eavesdropping attacks with imperfect CSI and even without CSI of legitimate channels. In particular, we first develop an autoencoder-based communication architecture with FJ, namely AEFJ, to jointly maximize the secrecy rate and minimize the block error rate at the receiver without requiring perfect CSI of the legitimate channels. In addition, to deal with the case without CSI, we leverage the mutual information neural estimation (MINE) concept and design a MINE-based FJ scheme that can achieve comparable security performance to the conventional FJ methods that require perfect CSI. Extensive simulations in a multiple-input multiple-output (MIMO) system demonstrate that our proposed solution can effectively deal with eavesdropping attacks in various settings. Moreover, the proposed framework can seamlessly integrate MIMO security and detection tasks into a unified end-to-end learning process.
This integrated approach can significantly maximize the throughput and minimize the block error rate, offering a good solution for enhancing communication security in wireless communication systems.

Submitted 12 December, 2023; originally announced December 2023.

Comments: 12 pages, 15 figures

arXiv:2312.01777 [pdf, ps, other]

Doubly 1-Bit Quantized Massive MIMO

Authors: Italo Atzeni, Antti Tölli, Duy H. N. Nguyen, A. Lee Swindlehurst

Abstract: Enabling communications in the (sub-)THz band will call for massive multiple-input multiple-output (MIMO) arrays at either the transmit- or receive-side, or at both. To scale down the complexity and power consumption when operating across massive frequency and antenna dimensions, a sacrifice in the resolution of the digital-to-analog/analog-to-digital converters (DACs/ADCs) will be inevitable. In this paper, we analyze the extreme scenario where both the transmit- and receive-side are equipped with fully digital massive MIMO arrays and 1-bit DACs/ADCs, which leads to a system with minimum radio-frequency complexity, cost, and power consumption. Building upon the Bussgang decomposition, we derive a tractable approximation of the mean squared error (MSE) between the transmitted data symbols and their soft estimates. Numerical results show that, despite its simplicity, a doubly 1-bit quantized massive MIMO system with very large antenna arrays can deliver an impressive performance in terms of MSE and symbol error rate.

Submitted 4 December, 2023; originally announced December 2023.

Comments: Presented at the IEEE Asilomar Conference on Signals, Systems, and Computers 2023

arXiv:2311.15041 [pdf, other]

MPCNN: A Novel Matrix Profile Approach for CNN-based Sleep Apnea Classification

Authors: Hieu X. Nguyen, Duong V. Nguyen, Hieu H. Pham, Cuong D. Do

Abstract: Sleep apnea (SA) is a significant respiratory condition that poses a major global health challenge. Previous studies have investigated several machine and deep learning models for electrocardiogram (ECG)-based SA diagnoses. Despite these advancements, conventional feature extractions derived from ECG signals, such as R-peaks and RR intervals, may fail to capture crucial information encompassed within the complete PQRST segments. In this study, we propose an innovative approach to address this diagnostic gap by delving deeper into the comprehensive segments of the ECG signal. The proposed methodology draws inspiration from Matrix Profile algorithms, which generate a Euclidean distance profile from fixed-length signal subsequences. From this, we derived the Min Distance Profile (MinDP), Max Distance Profile (MaxDP), and Mean Distance Profile (MeanDP) based on the minimum, maximum, and mean of the profile distances, respectively. To validate the effectiveness of our approach, we use the modified LeNet-5 architecture as the primary CNN model, along with two existing lightweight models, BAFNet and SE-MSCNN, for ECG classification tasks. Our extensive experimental results on the PhysioNet Apnea-ECG dataset revealed that with the new feature extraction method, we achieved a per-segment accuracy of up to 92.11% and a per-recording accuracy of 100%. Moreover, it yielded the highest correlation compared to state-of-the-art methods, with a correlation coefficient of 0.989.
By introducing a new feature extraction method based on distance relationships, we enhanced the performance of certain lightweight models, showing potential for home sleep apnea test (HSAT) and SA detection in IoT devices. The source code for this work is made publicly available on GitHub: https://github.com/vinuni-vishc/MPCNN-Sleep-Apnea.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.11096 [pdf, other]

On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. They showcase impressive learning abilities across different tasks with the need for only a limited amount of annotated samples. While numerous techniques have focused on developing better fine-tuning strategies to adapt these models for specific domains, we instead examine their robustness to domain shifts in the medical image segmentation task. To this end, we compare the generalization performance to unseen domains of various pre-trained models after being fine-tuned on the same in-distribution dataset and show that foundation-based models enjoy better robustness than other architectures. From here, we further developed a new Bayesian uncertainty estimation for frozen models and used them as an indicator to characterize the model's performance on out-of-distribution (OOD) data, proving particularly beneficial for real-world applications. Our experiments not only reveal the limitations of current indicators like accuracy on the line or agreement on the line commonly used in natural image applications but also emphasize the promise of the introduced Bayesian uncertainty. Specifically, lower uncertainty predictions usually correspond to higher out-of-distribution (OOD) performance.

Submitted 18 November, 2023; originally announced November 2023.

Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

arXiv:2311.01715 [pdf, other]

Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion

Authors: Phuc Duc Nguyen, Kenji Ishikawa, Noboru Harada, Takehiro Moriya

Abstract: Acousto-optic sensing provides an alternative approach to traditional microphone arrays by shedding light on the interaction of light with an acoustic field. Sound field reconstruction is a fascinating and advanced technique used in acousto-optics sensing. Current challenges in sound-field reconstruction methods pertain to scenarios in which the sound source is located within the reconstruction area, known as the exterior problem. Existing reconstruction algorithms, primarily designed for interior scenarios, often exhibit suboptimal performance when applied to exterior cases. This paper introduces a novel technique for exterior sound-field reconstruction. The proposed method leverages concentric circle sampling and a two-dimensional exterior sound-field reconstruction approach based on circular harmonic extensions. To evaluate the efficacy of this approach, both numerical simulations and practical experiments are conducted. The results highlight the superior accuracy of the proposed method when compared to conventional reconstruction methods, all while utilizing a minimal amount of measured projection data.

Submitted 3 November, 2023; originally announced November 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.14506 [pdf, other]

Label Space Partition Selection for Multi-Object Tracking Using Two-Layer Partitioning

Authors: Ji Youn Lee, Changbeom Shim, Hoa Van Nguyen, Tran Thien Dat Nguyen, Hyunjin Choi, Youngho Kim

Abstract: Estimating the trajectories of multi-objects poses a significant challenge due to data association ambiguity, which leads to a substantial increase in computational requirements. To address such problems, a divide-and-conquer approach has been employed with parallel computation. In this strategy, distinguished objects that have unique labels are grouped based on their statistical dependencies, the intersection of predicted measurements. Several geometry approaches have been used for label grouping since finding all intersected label pairs is clearly infeasible for large-scale tracking problems. This paper proposes an efficient implementation of label grouping for the label-partitioned generalized labeled multi-Bernoulli filter framework using a secondary partitioning technique. This allows for parallel computation in the label graph indexing step, avoiding generating and eliminating duplicate comparisons. Additionally, we compare the performance of the proposed technique with several efficient spatial searching algorithms. The results demonstrate the superior performance of the proposed approach on large-scale data sets, enabling scalable trajectory estimation.

Submitted 22 October, 2023; originally announced October 2023.

Comments: 6 pages, 4 figures

arXiv:2310.00418 [pdf, other]

MVC: A Multi-Task Vision Transformer Network for COVID-19 Diagnosis from Chest X-ray Images

Authors: Huyen Tran, Duc Thanh Nguyen, John Yearwood

Abstract: Medical image analysis using computer-based algorithms has attracted considerable attention from the research community and achieved tremendous progress in the last decade. With recent advances in computing resources and availability of large-scale medical image datasets, many deep learning models have been developed for disease diagnosis from medical images. However, existing techniques focus on sub-tasks, e.g., disease classification and identification, individually, while there is a lack of a unified framework enabling multi-task diagnosis. Inspired by the capability of Vision Transformers in both local and global representation learning, we propose in this paper a new method, namely Multi-task Vision Transformer (MVC) for simultaneously classifying chest X-ray images and identifying affected regions from the input data. Our method is built upon the Vision Transformer but extends its learning capability in a multi-task setting. We evaluated our proposed method and compared it with existing baselines on a benchmark dataset of COVID-19 chest X-ray images. Experimental results verified the superiority of the proposed method over the baselines on both the image classification and affected region identification tasks.

Submitted 30 September, 2023; originally announced October 2023.

arXiv:2309.16699 [pdf]

Circular-Line Trajectory Tracking Controller for Mobile Robot using Multi-Pixy2 Sensors

Authors: Xuan Quang Ngo, Tri Duc Tran, Huy Hung Nguyen, Van Dong Nguyen, Van Tu Duong, Tan Tien Nguyen

Abstract: This study suggests a novel tracking method that employs three Pixy2 sensors to identify the desired line trajectories instead of traditional perceiving means. Firstly, the kinematic model of the mobile robot is derived from the information gathered by three Pixy2 sensors. Secondly, the sliding mode controller is implemented to regulate the tracking error. Finally, simulation results are analyzed to show the effectiveness of the proposed method.

Submitted 12 August, 2023; originally announced September 2023.

Comments: 6 pages, 12 figures, the 2023 International Symposium on Electrical and Electronics Engineering, Ho Chi Minh, Viet Nam, 2023

arXiv:2309.15053 [pdf]

Thalamic nuclei segmentation from T$_1$-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts

Authors: Brendan Williams, Dan Nguyen, Julie Vidal, Alzheimer's Disease Neuroimaging Initiative, Manojkumar Saranathan

Abstract: The thalamus and its constituent nuclei are critical for a broad range of cognitive and sensorimotor processes, and implicated in many neurological and neurodegenerative conditions. However, the functional involvement and specificity of thalamic nuclei in human neuroimaging is underappreciated and not well studied due, in part, to technical challenges of accurately identifying and segmenting nuclei. This challenge is further exacerbated by a lack of common nomenclature for comparing segmentation methods. Here, we use data from healthy young (Human Connectome Project, 100 subjects) and older healthy adults, plus those with minor cognitive impairment and Alzheimer's disease (Alzheimer's Disease Neuroimaging Initiative, 540 subjects), to benchmark four state of the art thalamic segmentation methods for T1 MRI (FreeSurfer, HIPS-THOMAS, SCS-CNN, and T1-THOMAS) under a single segmentation framework. Segmentations were compared using overlap and dissimilarity metrics to the Morel stereotaxic atlas. We also quantified each method's estimation of thalamic nuclear degeneration across Alzheimer's disease progression, and how accurately early and late mild cognitive impairment, and Alzheimer's disease could be distinguished from healthy controls. We show that HIPS-THOMAS produced the most effective segmentations of individual thalamic nuclei and was also most accurate in discriminating healthy controls from those with mild cognitive impairment and Alzheimer's disease using individual nucleus volumes.
This work is the first to systematically compare the efficacy of anatomical thalamic segmentation approaches under a unified nomenclature. We also provide recommendations of which segmentation method to use for studying the functional relevance of specific thalamic nuclei, based on their overlap and dissimilarity with the Morel atlas.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 10 figures, 4 tables, 3 supplemental figures, 2 supplemental tables

arXiv:2309.04178 [pdf, other]

doi 10.1109/TCOMM.2023.3280209

Double RIS-Assisted MIMO Systems Over Spatially Correlated Rician Fading Channels and Finite Scatterers

Authors: Ha An Le, Trinh Van Chien, Van Duc Nguyen, Wan Choi

Abstract: This paper investigates double RIS-assisted MIMO communication systems over Rician fading channels with finite scatterers, spatial correlation, and the existence of a double-scattering link between the transceiver. First, the statistical information is derived in closed form for the aggregated channels, unveiling various influences of the system and environment on the average channel power gains. Next, we study two active and passive beamforming designs corresponding to two objectives. The first problem maximizes channel capacity by jointly optimizing the active precoding and combining matrices at the transceivers and passive beamforming at the double RISs subject to the transmitting power constraint. In order to tackle the inherently non-convex issue, we propose an efficient alternating optimization algorithm (AO) based on the alternating direction method of multipliers (ADMM). The second problem enhances communication reliability by jointly training the encoder and decoder at the transceivers and the phase shifters at the RISs. Each neural network representing a system entity in an end-to-end learning framework is proposed to minimize the symbol error rate of the detected symbols by controlling the transceiver and the RISs phase shifts. Numerical results verify our analysis and demonstrate the superior improvements of phase shift designs to boost system performance.

Submitted 8 September, 2023; originally announced September 2023.

Comments: 15 pages, 9 figures, accepted by IEEE Transactions on Communications

arXiv:2309.03317 [pdf, other]

Sub-Array Selection in Full-Duplex Massive MIMO for Enhanced Self-Interference Suppression

Authors: Mobeen Mahmood, Asil Koc, Duc Tuong Nguyen, Robert Morawski, Tho Le-Ngoc

Abstract: This study considers a novel full-duplex (FD) massive multiple-input multiple-output (mMIMO) system using hybrid beamforming (HBF) architecture, which allows for simultaneous uplink (UL) and downlink (DL) transmission over the same frequency band. Particularly, our objective is to mitigate the strong self-interference (SI) solely on the design of UL and DL RF beamforming stages jointly with sub-array selection (SAS) for transmit (Tx) and receive (Rx) sub-arrays at base station (BS). Based on the measured SI channel in an anechoic chamber, we propose a min-SI beamforming scheme with SAS, which applies perturbations to the beam directivity to enhance SI suppression in UL and DL beam directions. To solve this challenging nonconvex optimization problem, we propose a swarm intelligence-based algorithmic solution to find the optimal perturbations as well as the Tx and Rx sub-arrays to minimize SI subject to the directivity degradation constraints for the UL and DL beams. The results show that the proposed min-SI BF scheme can achieve SI suppression as high as 78 dB in FD mMIMO systems.

Submitted 6 September, 2023; originally announced September 2023.

Comments: This paper has been accepted for publication in IEEE Globecom 2023

arXiv:2308.11557 [pdf, other]

Open Set Synthetic Image Source Attribution

Authors: Shengbang Fang, Tai D. Nguyen, Matthew C. Stamm

Abstract: AI-generated images have become increasingly realistic and have garnered significant public attention. While synthetic images are intriguing due to their realism, they also pose an important misinformation threat. To address this new threat, researchers have developed multiple algorithms to detect synthetic images and identify their source generators. However, most existing source attribution techniques are designed to operate in a closed-set scenario, i.e. they can only be used to discriminate between known image generators. By contrast, new image-generation techniques are rapidly emerging. To contend with this, there is a great need for open-set source attribution techniques that can identify when synthetic images have originated from new, unseen generators. To address this problem, we propose a new metric learning-based approach. Our technique works by learning transferrable embeddings capable of discriminating between generators, even when they are not seen during training. An image is first assigned to a candidate generator, then is accepted or rejected based on its distance in the embedding space from known generators' learned reference points. Importantly, we identify that initializing our source attribution embedding network by pretraining it on image camera identification can improve our embeddings' transferability. Through a series of experiments, we demonstrate our approach's ability to attribute the source of synthetic images in open-set scenarios.

Submitted 22 August, 2023; originally announced August 2023.

arXiv:2308.09603 [pdf, ps, other]

A Convergence Predictor Model for Consensus-based Decentralised Energy Markets

Authors: Parikshit Pareek, L. P. Mohasha Isuru Sampath, Hung D. Nguyen, Eddy Y. S. Foo

Abstract: This letter introduces a convergence prediction model (CPM) for decentralized market clearing mechanisms. The CPM serves as a tool to detect potential cyber-attacks that affect the convergence of the consensus mechanism during ongoing market clearing operations. In this study, we propose a successively elongating Bayesian logistic regression approach to model the probability of convergence of real-time market mechanisms. The CPM utilizes net-power balance among all the prosumers/market participants as a feature for convergence prediction, enabling a low-dimensional model to operate efficiently for all the prosumers concurrently. The results highlight that the proposed CPM has achieved a net false rate of less than 0.01% for a stressed dataset.

Submitted 18 August, 2023; originally announced August 2023.

arXiv:2307.10017 [pdf, ps, other]

Geometry in global coordinates in mechanics and optimal transport

Authors: Du Nguyen

Abstract: For a manifold embedded in an inner product space, we express geometric quantities such as {\it Hamilton vector fields, affine and Levi-Civita connections, curvature} in global coordinates. Instead of coordinate indices, the global formulas for most quantities are expressed as {\it operator-valued} expressions, using an {\it affine projection} to the tangent bundle. For a submersion image of an embedded manifold, we introduce {\it liftings} of Hamilton vector fields, allowing us to use embedded coordinates on horizontal bundles. We derive a {\it Gauss-Codazzi equation} for affine connections on vector bundles. This approach allows us to evaluate geometric expressions globally, and could be used effectively with modern numerical frameworks in applications. Examples considered include rigid body mechanics and Hamilton mechanics on Grassmann manifolds. We show explicitly the cross-curvature (MTW-tensor) for the {\it Kim-McCann} metric with a reflector antenna-type cost function on the space of positive-semidefinite matrices of fixed rank has nonnegative cross-curvature, while the corresponding cost could have negative cross-curvature on Grassmann manifolds, except for projective spaces.

Submitted 19 July, 2023; originally announced July 2023.

MSC Class: 53C05; 53C42; 70H05; 70H45; 70H33; 53D05; 53Z30; 53Z50

arXiv:2307.01062 [pdf, other]

A Data-Driven Approach to Geometric Modeling of Systems with Low-Bandwidth Actuator Dynamics

Authors: Siming Deng, Junning Liu, Bibekananda Datta, Aishwarya Pantula, David H. Gracias, Thao D. Nguyen, Brian A. Bittner, Noah J. Cowan

Abstract: It is challenging to perform system identification on soft robots due to their underactuated, high-dimensional dynamics. In this work, we present a data-driven modeling framework, based on geometric mechanics (also known as gauge theory) that can be applied to systems with low-bandwidth control of the system's internal configuration. This method constructs a series of connected models comprising actuator and locomotor dynamics based on data points from stochastically perturbed, repeated behaviors. By deriving these connected models from general formulations of dissipative Lagrangian systems with symmetry, we offer a method that can be applied broadly to robots with first-order, low-pass actuator dynamics, including swelling-driven actuators used in hydrogel crawlers. These models accurately capture the dynamics of the system shape and body movements of a simplified swimming robot model. We further apply our approach to a stimulus-responsive hydrogel simulator that captures the complexity of chemo-mechanical interactions that drive shape changes in biomedically relevant micromachines. Finally, we propose an approach of numerically optimizing control signals by iteratively refining models, which is applied to optimize the input waveform for the hydrogel crawler. This transfer to realistic environments provides promise for applications in locomotor design and biomedical engineering.

Submitted 3 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: 9 pages, 6 figures

arXiv:2306.12925 [pdf, other]

AudioPaLM: A Large Language Model That Can Speak and Listen

Authors: Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, et al. (5 additional authors not shown)

Abstract: We introduce AudioPaLM, a large language model for speech understanding and generation. AudioPaLM fuses text-based and speech-based language models, PaLM-2 [Anil et al., 2023] and AudioLM [Borsos et al., 2022], into a unified multimodal architecture that can process and generate text and speech with applications including speech recognition and speech-to-speech translation. AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2. We demonstrate that initializing AudioPaLM with the weights of a text-only large language model improves speech processing, successfully leveraging the larger quantity of text training data used in pretraining to assist with the speech tasks. The resulting model significantly outperforms existing systems for speech translation tasks and has the ability to perform zero-shot speech-to-text translation for many languages for which input/target language combinations were not seen in training. AudioPaLM also demonstrates features of audio language models, such as transferring a voice across languages based on a short spoken prompt. We release examples of our method at https://google-research.github.io/seanet/audiopalm/examples

Submitted 22 June, 2023; originally announced June 2023.

Comments: Technical report

arXiv:2306.01159 [pdf, other]

Quantum-based Distributed Algorithms for Edge Node Placement and Workload Allocation

Authors: Duong The Do, Ni Trieu, Duong Tung Nguyen

Abstract: Edge computing is a promising technology that offers a superior user experience and enables various innovative Internet of Things applications. In this paper, we present a mixed-integer linear programming (MILP) model for optimal edge server placement and workload allocation, which is known to be NP-hard. To this end, we explore the possibility of addressing this computationally challenging problem using quantum computing. However, existing quantum solvers are limited to solving unconstrained binary programming problems. To overcome this obstacle, we propose a hybrid quantum-classical solution that decomposes the original problem into a quadratic unconstrained binary optimization (QUBO) problem and a linear program (LP) subproblem. The QUBO problem can be solved by a quantum solver, while the LP subproblem can be solved using traditional LP solvers. Our numerical experiments demonstrate the practicality of leveraging quantum supremacy to solve complex optimization problems in edge computing.

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2305.19709 [pdf, other]

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

Authors: Linh The Nguyen, Thinh Pham, Dat Quoc Nguyen

Abstract: We present XPhoneBERT, the first multilingual model pre-trained to learn phoneme representations for the downstream text-to-speech (TTS) task. Our XPhoneBERT has the same model architecture as BERT-base, trained using the RoBERTa pre-training approach on 330M phoneme-level sentences from nearly 100 languages and locales. Experimental results show that employing XPhoneBERT as an input phoneme encoder significantly boosts the performance of a strong neural TTS model in terms of naturalness and prosody and also helps produce fairly high-quality speech with limited training data. We publicly release our pre-trained XPhoneBERT with the hope that it would facilitate future research and downstream TTS applications for multiple languages. Our XPhoneBERT model is available at https://github.com/VinAIResearch/XPhoneBERT

Submitted 31 May, 2023; originally announced May 2023.

Comments: In Proceedings of INTERSPEECH 2023 (to appear)

arXiv:2305.08754 [pdf, other]

On the Stability of Approximate Message Passing with Independent Measurement Ensembles

Authors: Dang Qua Nguyen, Taejoon Kim

Abstract: Approximate message passing (AMP) is a scalable, iterative approach to signal recovery. For structured random measurement ensembles, including independent and identically distributed (i.i.d.) Gaussian and rotationally-invariant matrices, the performance of AMP can be characterized by a scalar recursion called state evolution (SE). The pseudo-Lipschitz (polynomial) smoothness is conventionally assumed. In this work, we extend the SE for AMP to a new class of measurement matrices with independent (not necessarily identically distributed) entries. We also extend it to a general class of functions, called controlled functions which are not constrained by the polynomial smoothness; unlike the pseudo-Lipschitz function that has polynomial smoothness, the controlled function grows exponentially. The lack of structure in the assumed measurement ensembles is addressed by leveraging Lindeberg-Feller. The lack of smoothness of the assumed controlled function is addressed by a proposed conditioning technique leveraging the empirical statistics of the AMP instances. The resultants grant the use of the SE to a broader class of measurement ensembles and a new class of functions.

Submitted 25 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

arXiv:2305.05165 [pdf]

Assessing the optimal contributions of renewables and carbon capture and storage toward carbon neutrality by 2050

Authors: Dinh Hoa Nguyen, Andrew Chapman, Takeshi Tsuji

Abstract: Building on the carbon reduction targets agreed in the Paris Agreements, many nations have renewed their efforts toward achieving carbon neutrality by the year 2050. In line with this ambitious goal, nations are seeking to understand the appropriate combination of technologies which will enable the required reductions in such a way that they are appealing to investors. Around the globe, solar and wind power lead in terms of renewable energy deployment, while carbon capture and storage (CCS) is scaling up toward making a significant contribution to deep carbon cuts. Using Japan as a case study nation, this research proposes a linear optimization modeling approach to identify the potential contributions of renewables and CCS toward maximizing carbon reduction and identifying their economic merits over time. Results identify that the combination of these three technologies could enable a carbon dioxide emission reduction of between 55 and 67 percent in the energy sector by 2050 depending on resilience levels and CCS deployment regimes. Further reductions are likely to emerge with increased carbon pricing over time.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2305.03844 [pdf]

Physics-based network fine-tuning for robust quantitative susceptibility mapping from high-pass filtered phase

Authors: Jinwei Zhang, Alexey Dimov, Chao Li, Hang Zhang, Thanh D. Nguyen, Pascal Spincemaille, Yi Wang

Abstract: Purpose: To improve the generalization ability of convolutional neural network (CNN) based prediction of quantitative susceptibility mapping (QSM) from high-pass filtered phase (HPFP) image. Methods: The proposed network addresses two common generalization issues that arise when using a pre-trained network to predict QSM from HPFP: a) data with unseen voxel sizes, and b) data with unknown high-pass filter parameters. A network fine-tuning step based on a high-pass filtering dipole convolution forward model is proposed to reduce the generalization error of the pre-trained network. A progressive Unet architecture is proposed to improve prediction accuracy without increasing fine-tuning computational cost. Results: In retrospective studies using RMSE, PSNR, SSIM and HFEN as quality metrics, the performance of both Unet and progressive Unet was improved after physics-based fine-tuning at all voxel sizes and most high-pass filtering cutoff frequencies tested in the experiment. Progressive Unet slightly outperformed Unet both before and after fine-tuning. In a prospective study, image sharpness was improved after physics-based fine-tuning for both Unet and progressive Unet. Compared to Unet, progressive Unet had better agreement of regional susceptibility values with reference QSM. Conclusion: The proposed method shows improved robustness compared to the pre-trained network without fine-tuning when the test dataset deviates from training. Our code is available at https://github.com/Jinwei1209/SWI_to_QSM/

Submitted 5 May, 2023; originally announced May 2023.

arXiv:2304.14455 [pdf, other]

doi 10.1109/ICCAIS59597.2023.10382296

Bearing-Based Network Localization Under Randomized Gossip Protocol

Authors: Nhat-Minh Le-Phan, Minh Hoang Trinh, Phuoc Doan Nguyen

Abstract: In this paper, we consider a randomized gossip algorithm for the bearing-based network localization problem. Let each sensor node be able to obtain the bearing vectors and communicate its position estimates with several neighboring agents. Each update involves two agents, and the update sequence follows a stochastic process. Under the assumption that the network is infinitesimally bearing rigid and contains at least two beacon nodes, we show that when the updating step-size is properly selected, the proposed algorithm can successfully estimate the actual sensor nodes' positions with probability one. The randomized update provides a simple, distributed, and cost-effective method for localizing the network. The theoretical result is supported with a simulation of a 1089-node sensor network.

Submitted 17 January, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

Comments: preprint, 6 pages, 2 figures. Published in the Proceeding of the 12th International Conference on Control, Automation and Information Sciences (ICCAIS). arXiv admin note: text overlap with arXiv:2303.14733

arXiv:2304.12852 [pdf, ps, other]

doi 10.1109/TIP.2023.3346695

The Bjøntegaard Bible -- Why your Way of Comparing Video Codecs May Be Wrong

Authors: Christian Herglotz, Hannah Och, Anna Meyer, Geetha Ramasubbu, Lena Eichermüller, Matthias Kränzler, Fabian Brand, Kristian Fischer, Dat Thanh Nguyen, Andy Regensky, André Kaup

Abstract: In this paper, we provide an in-depth assessment on the Bjøntegaard Delta. We construct a large data set of video compression performance comparisons using a diverse set of metrics including PSNR, VMAF, bitrate, and processing energies. These metrics are evaluated for visual data types such as classic perspective video, 360$^\circ$ video, point clouds, and screen content. As compression technology, we consider multiple hybrid video codecs as well as state-of-the-art neural network based compression methods. Using additional supporting points inbetween standard points defined by parameters such as the quantization parameter, we assess the interpolation error of the Bjøntegaard-Delta (BD) calculus and its impact on the final BD value. From the analysis, we find that the BD calculus is most accurate in the standard application of rate-distortion comparisons with mean errors below 0.5 percentage points. For other applications and special cases, e.g., VMAF quality, energy considerations, or inter-codec comparisons, the errors are higher (up to 5 percentage points), but can be halved by using a higher number of supporting points. We finally come up with recommendations on how to use the BD calculus such that the validity of the resulting BD-values is maximized. Main recommendations are as follows: First, relative curve differences should be plotted and analyzed. Second, the logarithmic domain should be used for saturating metrics such as SSIM and VMAF. Third, BD values below a certain threshold indicated by the subset error should not be used to draw recommendations. Fourth, using two supporting points is sufficient to obtain rough performance estimates.

Submitted 22 December, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: 21 pages, 14 figures

arXiv:2304.11476 [pdf]

Maximum Spherical Mean Value (mSMV) Filtering for Whole Brain Quantitative Susceptibility Mapping

Authors: Alexandra G. Roberts, Dominick J. Romano, Mert Şişman, Alexey V. Dimov, Pascal Spincemaille, Thanh D. Nguyen, Ilhami Kovanlikaya, Susan A. Gauthier, Yi Wang

Abstract: To develop a tissue field filtering algorithm, called maximum Spherical Mean Value (mSMV), for reducing shadow artifacts in quantitative susceptibility mapping (QSM) of the brain without requiring brain tissue erosion. Residual background field is a major source of shadow artifacts in QSM. The mSMV algorithm filters large field values near the border, where the maximum value of the harmonic background field is located. The effectiveness of mSMV for artifact removal was evaluated by comparing with existing QSM algorithms in numerical brain simulation as well as using in vivo human data acquired from 11 healthy volunteers and 93 patients. Numerical simulation showed that mSMV reduces shadow artifacts and improves QSM accuracy. Better shadow reduction, as demonstrated by lower QSM variation in the gray matter and higher QSM image quality score, was also observed in healthy subjects and in patients with hemorrhages, stroke and multiple sclerosis. The mSMV algorithm allows QSM maps that are substantially equivalent to those obtained using SMV-filtered dipole inversion without eroding the volume of interest.

Submitted 27 November, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

Comments: 12 pages, 5 figures

arXiv:2304.03458 [pdf]

doi 10.1002/mrm.29854

mcLARO: Multi-Contrast Learned Acquisition and Reconstruction Optimization for simultaneous quantitative multi-parametric mapping

Authors: Jinwei Zhang, Thanh D. Nguyen, Eddy Solomon, Chao Li, Qihao Zhang, Jiahao Li, Hang Zhang, Pascal Spincemaille, Yi Wang

Abstract: Purpose: To develop a method for rapid sub-millimeter T1, T2, T2* and QSM mapping in a single scan using multi-contrast Learned Acquisition and Reconstruction Optimization (mcLARO). Methods: A pulse sequence was developed by interleaving inversion recovery and T2 magnetization preparations and single-echo and multi-echo gradient echo acquisitions, which sensitized k-space data to T1, T2, T2* and magnetic susceptibility. The proposed mcLARO used a deep learning framework to optimize both the multi-contrast k-space under-sampling pattern and the image reconstruction based on image feature fusion. The proposed mcLARO method with R=8 under-sampling was validated in a retrospective ablation study using fully sampled data as reference and evaluated in a prospective study using separately acquired conventionally sampled quantitative maps as reference standard. Results: The retrospective ablation study showed improved image sharpness of mcLARO compared to the baseline network without multi-contrast sampling pattern optimization or image feature fusion, and negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values were obtained by the under-sampled reconstructions compared to the fully sampled reconstruction. The prospective study showed small or negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values by mcLARO (5:39 mins) compared to reference scans (40:03 mins in total). Conclusion: mcLARO enabled fast sub-millimeter T1, T2, T2* and QSM mapping in a single scan.

Submitted 6 April, 2023; originally announced April 2023.

Journal ref: Magn Reson Med. 2024; 91: 344-356

arXiv:2304.03041 [pdf, other]

Multi-Linear Kernel Regression and Imputation in Data Manifolds

Authors: Duc Thien Nguyen, Konstantinos Slavakis

Abstract: This paper introduces an efficient multi-linear nonparametric (kernel-based) approximation framework for data regression and imputation, and its application to dynamic magnetic-resonance imaging (dMRI). Data features are assumed to reside in or close to a smooth manifold embedded in a reproducing kernel Hilbert space. Landmark points are identified to describe concisely the point cloud of features by linear approximating patches which mimic the concept of tangent spaces to smooth manifolds. The multi-linear model effects dimensionality reduction, enables efficient computations, and extracts data patterns and their geometry without any training data or additional information. Numerical tests on dMRI data under severe under-sampling demonstrate remarkable improvements in efficiency and accuracy of the proposed approach over its predecessors, popular data modeling methods, as well as recent tensor-based and deep-image-prior schemes.

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2303.14733 [pdf, other]

doi 10.1109/TNSE.2024.3376643

Randomized Matrix Weighted Consensus

Authors: Nhat-Minh Le-Phan, Minh Hoang Trinh, Phuoc Doan Nguyen

Abstract: In this paper, randomized gossip-type matrix-weighted consensus algorithms are proposed for both leaderless and leader-follower topologies. First, we introduce the notion of expected matrix-weighted network, which captures the multi-dimensional interactions between any two agents in a probabilistic sense. Under some mild assumptions on the distribution of the expected matrix weights and the upper bound of the updating step size, the proposed asynchronous pairwise update algorithms drive the network to achieve a consensus in expectation. An upper bound of the $ε$-convergence time of the algorithm is then derived. Furthermore, the proposed algorithms are applied to the bearing-based network localization and formation control problems. The theoretical results are supported by several numerical examples.

Submitted 6 February, 2024; v1 submitted 26 March, 2023; originally announced March 2023.

Comments: 32 pages, 6 figures, preprint

Showing 1–50 of 231 results for author: Nguyen, D