
Showing 1–50 of 187 results for author: Vũ, N

  1. arXiv:2408.14154

    cs.CL

    Investigating the effect of Mental Models in User Interaction with an Adaptive Dialog Agent

    Authors: Lindsey Vanderlyn, Dirk Väth, Ngoc Thang Vu

    Abstract: Mental models play an important role in whether user interaction with intelligent systems, such as dialog systems, is successful or not. Adaptive dialog systems present the opportunity to align a dialog agent's behavior with heterogeneous user expectations. However, there has been little research into what mental models users form when interacting with a task-oriented dialog system, how these model…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: submitted to COLING 2025

  2. arXiv:2408.14153

    cs.CV cs.AI cs.CL

    Explaining Vision-Language Similarities in Dual Encoders with Feature-Pair Attributions

    Authors: Lucas Möller, Pascal Tilli, Ngoc Thang Vu, Sebastian Padó

    Abstract: Dual encoder architectures like CLIP models map two types of inputs into a shared embedding space and learn similarities between them. However, it is not understood how such models compare two inputs. Here, we address this research gap with two contributions. First, we derive a method to attribute predictions of any differentiable dual encoder onto feature-pair interactions between its inputs. Sec…

    Submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2408.05300

    gr-qc

    High-Precision Ringdown Surrogate Model for Non-Precessing Binary Black Holes

    Authors: Lorena Magaña Zertuche, Leo C. Stein, Keefe Mitman, Scott E. Field, Vijay Varma, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Harald P. Pfeiffer, Mark A. Scheel, Kyle C. Nelli, William Throwe, Nils L. Vu

    Abstract: Highly precise and robust waveform models are required as improvements in detector sensitivity enable us to test general relativity with more precision than ever before. In this work, we introduce a spin-aligned surrogate ringdown model. This ringdown surrogate, NRSur3dq8_RD, is built with numerical waveforms produced using Cauchy-characteristic evolution. In addition, these waveforms are in the s…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: 11+2 pages, 13 figures, 1 table. This new model is publicly available through surfinBH https://pypi.org/project/surfinBH/

  4. arXiv:2408.00122

    cs.CL

    A Course Shared Task on Evaluating LLM Output for Clinical Questions

    Authors: Yufang Hou, Thy Thy Tran, Doan Nam Long Vu, Yiwen Cao, Kai Li, Lukas Rohde, Iryna Gurevych

    Abstract: This paper presents a shared task that we organized at the Foundations of Language Technology (FoLT) course in 2023/2024 at the Technical University of Darmstadt, which focuses on evaluating the output of Large Language Models (LLMs) in generating harmful answers to health-related clinical questions. We describe the task design considerations and report the feedback we received from the students.…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: accepted at the sixth Workshop on Teaching NLP (co-located with ACL 2024)

  5. arXiv:2407.21061

    cs.CL cs.SD eess.AS

    Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses

    Authors: Chia-Yu Li, Ngoc Thang Vu

    Abstract: Training a semi-supervised end-to-end speech recognition system using noisy student training has significantly improved performance. However, this approach requires a substantial amount of paired speech-text and unlabeled speech, which is costly for low-resource languages. Therefore, this paper considers a more extreme case of semi-supervised end-to-end automatic speech recognition where there are…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 10 pages (2 for references), 4 figures, published in SIGUL2024@LREC-COLING 2024

  6. arXiv:2407.19877

    cs.RO cs.CV

    Language-driven Grasp Detection with Mask-guided Attention

    Authors: Tuan Van Vo, Minh Nhat Vu, Baoru Huang, An Vuong, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: Grasp detection is an essential task in robotics with various industrial applications. However, traditional methods often struggle with occlusions and do not utilize language for grasping. Incorporating natural language into grasp detection remains a challenging and largely unexplored task. To address this gap, we propose a new method for language-driven grasp detection with mask-guided attention…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted at IROS 2024

  7. arXiv:2407.18789

    cs.CL

    Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation

    Authors: Doan Nam Long Vu, Timour Igamberdiev, Ivan Habernal

    Abstract: Applying differential privacy (DP) by means of the DP-SGD algorithm to protect individual data points during training is becoming increasingly popular in NLP. However, the choice of granularity at which DP is applied is often neglected. For example, neural machine translation (NMT) typically operates on the sentence-level granularity. From the perspective of DP, this setup assumes that each senten…

    Submitted 26 July, 2024; originally announced July 2024.

  8. arXiv:2407.17967

    cs.RO cs.CV

    Lightweight Language-driven Grasp Detection using Conditional Consistency Model

    Authors: Nghia Nguyen, Minh Nhat Vu, Baoru Huang, An Vuong, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: Language-driven grasp detection is a fundamental yet challenging task in robotics with various industrial applications. In this work, we present a new approach for language-driven grasp detection that leverages the concept of lightweight diffusion models to achieve fast inference time. By integrating diffusion processes with grasping prompts in natural language, our method can effectively encode v…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Accepted at IROS 2024

  9. arXiv:2407.13842

    cs.RO cs.CV

    Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance

    Authors: Toan Nguyen, Minh Nhat Vu, Baoru Huang, An Vuong, Quan Vuong, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: 6-DoF grasp detection has been a fundamental and challenging problem in robotic vision. While previous works have focused on ensuring grasp stability, they often do not consider human intention conveyed through natural language, hindering effective collaboration between robots and users in complex 3D environments. In this paper, we present a new approach for language-driven 6-DoF grasp detection i…

    Submitted 25 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  10. arXiv:2407.02937

    cs.CL cs.SD eess.AS

    Probing the Feasibility of Multilingual Speaker Anonymization

    Authors: Sarina Meyer, Florian Lux, Ngoc Thang Vu

    Abstract: In speaker anonymization, speech recordings are modified in a way that the identity of the speaker remains hidden. While this technology could help to protect the privacy of individuals around the globe, current research restricts this by focusing almost exclusively on English data. In this study, we extend a state-of-the-art anonymization system to nine languages by transforming language-dependen…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accepted at Interspeech 2024

  11. arXiv:2407.01381

    physics.chem-ph

    Polaritonic Chemistry using the Density Matrix Renormalization Group Method

    Authors: Mikuláš Matoušek, Nam Vu, Niranjan Govind, Jonathan J. Foley IV, Libor Veis

    Abstract: The emerging field of polaritonic chemistry explores the behavior of molecules under strong coupling with cavity modes. Despite recent developments in ab initio polaritonic methods for simulating polaritonic chemistry under electronic strong coupling, their capabilities are limited, especially in cases where the molecule also features strong electronic correlation. To bridge this gap, we have deve…

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2407.00145

    physics.soc-ph math.DS

    Co-evolving networks for opinion and social dynamics in agent-based models

    Authors: Nataša Djurdjevac Conrad, Nhu Quang Vu, Sören Nagel

    Abstract: The rise of digital social media has strengthened the coevolution of public opinions and social interactions, which shape social structures and collective outcomes in increasingly complex ways. Existing literature often explores this interplay as a one-directional influence, focusing on how opinions determine social ties within adaptive networks. However, this perspective overlooks the intrinsic dy…

    Submitted 28 June, 2024; originally announced July 2024.

    MSC Class: 91Dxx; 05C82; 37Hxx

  13. arXiv:2406.19038

    gr-qc

    Binary neutron star mergers using a discontinuous Galerkin-finite difference hybrid method

    Authors: Nils Deppe, Francois Foucart, Marceline S. Bonilla, Michael Boyle, Nicholas J. Corso, Matthew D. Duez, Matthew Giesler, François Hébert, Lawrence E. Kidder, Yoonsoo Kim, Prayush Kumar, Isaac Legred, Geoffrey Lovelace, Elias R. Most, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Nils L. Vu

    Abstract: We present a discontinuous Galerkin-finite difference hybrid scheme that allows high-order shock capturing with the discontinuous Galerkin method for general relativistic magnetohydrodynamics in dynamical spacetimes. We present several optimizations and stability improvements to our algorithm that allow the hybrid method to successfully simulate single, rotating, and binary neutron stars. The hybr…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 31 pages, 8 figures, comments welcome!

  14. arXiv:2406.09489

    cs.CV

    Language-driven Grasp Detection

    Authors: An Dinh Vuong, Minh Nhat Vu, Baoru Huang, Nghia Nguyen, Hieu Le, Thieu Vo, Anh Nguyen

    Abstract: Grasp detection is a persistent and intricate challenge with various industrial applications. Recently, many methods and datasets have been proposed to tackle the grasp detection problem. However, most of them do not consider using natural language as a condition to detect the grasp poses. In this paper, we introduce Grasp-Anything++, a new language-driven grasp detection dataset featuring 1M samp…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 19 pages. Accepted to CVPR24

  15. arXiv:2406.09039

    cs.RO

    Language-Driven Closed-Loop Grasping with Model-Predictive Trajectory Replanning

    Authors: Huy Hoang Nguyen, Minh Nhat Vu, Florian Beck, Gerald Ebmer, Anh Nguyen, Andreas Kugi

    Abstract: Combining a vision module inside a closed-loop control system for a seamless movement of a robot in a manipulation task is challenging due to the inconsistent update rates between utilized modules. This task is even more difficult in a dynamic environment, e.g., when objects are moving. This paper presents a modular zero-shot framework for language-driven manipulation of (dynamic) objects…

    Submitted 19 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

  16. arXiv:2406.08410

    gr-qc

    Quasistationary hair for binary black hole initial data in scalar Gauss-Bonnet gravity

    Authors: Peter James Nee, Guillermo Lara, Harald P. Pfeiffer, Nils L. Vu

    Abstract: Recent efforts to numerically simulate compact objects in alternative theories of gravity have largely focused on the time-evolution equations. Another critical aspect is the construction of constraint-satisfying initial data with precise control over the properties of the systems under consideration. Here, we augment the extended conformal thin sandwich framework to construct quasistationary init…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures

  17. arXiv:2406.07124

    cs.AI cs.LG

    CHARME: A chain-based reinforcement learning approach for the minor embedding problem

    Authors: Hoang M. Ngo, Nguyen H K. Do, Minh N. Vu, Tamer Kahveci, My T. Thai

    Abstract: Quantum Annealing (QA) holds great potential for solving combinatorial optimization problems efficiently. However, the effectiveness of QA algorithms heavily relies on the embedding of problem instances, represented as logical graphs, into the quantum processing unit (QPU) whose topology is in the form of a limited connectivity graph, known as the minor embedding problem. Existing methods for the mino…

    Submitted 11 June, 2024; originally announced June 2024.

  18. arXiv:2406.06406

    cs.CL cs.SD eess.AS

    Controlling Emotion in Text-to-Speech with Natural Language Prompts

    Authors: Thomas Bott, Florian Lux, Ngoc Thang Vu

    Abstract: In recent years, prompting has quickly become one of the standard ways of steering the outputs of generative machine learning models, due to its intuitive use of natural language. In this work, we propose a system conditioned on embeddings derived from an emotionally rich text that serves as prompt. Thereby, a joint representation of speaker and prompt embeddings is integrated at several points wi…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  19. arXiv:2406.06403

    cs.CL cs.LG cs.SD eess.AS

    Meta Learning Text-to-Speech Synthesis in over 7000 Languages

    Authors: Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël A. P. Habets, Ngoc Thang Vu

    Abstract: In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development. By leveraging a novel integration of massively multilingual pretraining and meta learning to approximate language representations, our approach enables zero-shot speech syn…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  20. arXiv:2405.09335

    cs.CL

    Prompting-based Synthetic Data Generation for Few-Shot Question Answering

    Authors: Maximilian Schmidt, Andrea Bartezzaghi, Ngoc Thang Vu

    Abstract: Although language models (LMs) have boosted the performance of Question Answering, they still need plenty of data. Data annotation, in contrast, is a time-consuming process. This especially applies to Question Answering, where possibly large documents have to be parsed and annotated with questions and their corresponding answers. Furthermore, Question Answering models often only work well for the…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: LREC-COLING 2024

  21. arXiv:2405.08868

    gr-qc hep-th

    A Review of Gravitational Memory and BMS Frame Fixing in Numerical Relativity

    Authors: Keefe Mitman, Michael Boyle, Leo C. Stein, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Nils L. Vu

    Abstract: Gravitational memory effects and the BMS freedoms exhibited at future null infinity have recently been resolved and utilized in numerical relativity simulations. With this, gravitational wave models and our understanding of the fundamental nature of general relativity have been vastly improved. In this paper, we review the history and intuition behind memory effects and BMS symmetries, how they ma…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 20 pages, 8 figures. Submitted to CQG's focus issue: Gravitational-Wave Memory Effects: From Theory to Observation

  22. arXiv:2405.06197

    gr-qc

    Improved frequency spectra of gravitational waves with memory in a binary-black-hole simulation

    Authors: Yitian Chen, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Keefe Mitman, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, William Throwe, Nils L. Vu, Saul A. Teukolsky

    Abstract: Numerical relativists can now produce gravitational waveforms with memory effects routinely and accurately. The gravitational-wave memory effect contains very low-frequency components, including a persistent offset. The presence of these components violates basic assumptions about time-shift behavior underpinning standard data-analysis techniques in gravitational-wave astronomy. This poses a chall…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 24 pages, 11 figures, 5 tables

  23. arXiv:2405.06120

    gr-qc math.NA

    A discontinuous Galerkin scheme for elliptic equations on extremely stretched grids

    Authors: Nils L. Vu

    Abstract: Discontinuous Galerkin (DG) methods for solving elliptic equations are gaining popularity in the computational physics community for their high-order spectral convergence and their potential for parallelization on computing clusters. However, problems in numerical relativity with extremely stretched grids, such as initial data problems for binary black holes that impose boundary conditions at larg…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures. Results are reproducible with the ancillary input files

  24. arXiv:2404.10922

    cs.CL cs.SD eess.AS

    Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

    Authors: Pavel Denisov, Ngoc Thang Vu

    Abstract: Recent advancements in language modeling have led to the emergence of Large Language Models (LLMs) capable of various natural language processing tasks. Despite their success in text-based tasks, applying LLMs to the speech domain remains limited and challenging. This paper presents BLOOMZMMS, a novel model that integrates a multilingual LLM with a multilingual speech encoder, aiming to harness th…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: NAACL Findings 2024

  25. arXiv:2404.10222

    quant-ph

    Simulating electronic structure on bosonic quantum computers

    Authors: Rishab Dutta, Nam P. Vu, Ningyi Lyu, Chen Wang, Victor S. Batista

    Abstract: Computation with quantum harmonic oscillators, or qumodes, is a promising and rapidly evolving approach towards quantum computing. In contrast to qubits, which are two-level quantum systems, bosonic qumodes can in principle have infinite discrete levels, and can also be represented with continuous variable bases. One of the most promising applications of quantum computing is simulating many-fermion…

    Submitted 27 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 47 pages including references, 7 figures, revised

  26. Simulating Chemistry on Bosonic Quantum Devices

    Authors: Rishab Dutta, Delmar G. A. Cabral, Ningyi Lyu, Nam P. Vu, Yuchen Wang, Brandon Allen, Xiaohan Dan, Rodrigo G. Cortiñas, Pouya Khazaei, Max Schäfer, Alejandro C. C. d. Albornoz, Scott E. Smart, Scott Nie, Michel H. Devoret, David A. Mazziotti, Prineha Narang, Chen Wang, James D. Whitfield, Angela K. Wilson, Heidi P. Hendrickson, Daniel A. Lidar, Francisco Pérez-Bernal, Lea F. Santos, Sabre Kais, Eitan Geva, et al. (1 additional author not shown)

    Abstract: Bosonic quantum devices offer a novel approach to realize quantum computations, where the quantum two-level system (qubit) is replaced with the quantum (an)harmonic oscillator (qumode) as the fundamental building block of the quantum simulator. The simulation of chemical structure and dynamics can then be achieved by representing or mapping the system Hamiltonians in terms of bosonic operators. In…

    Submitted 5 July, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 40 pages including references, 13 figures, revised

  27. arXiv:2404.07122

    cs.CV

    Driver Attention Tracking and Analysis

    Authors: Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

    Abstract: We propose a novel method to estimate a driver's points-of-gaze using a pair of ordinary cameras mounted on the windshield and dashboard of a car. This is a challenging problem due to the dynamics of traffic environments with 3D scenes of unknown depths. This problem is further complicated by the volatile distance between the driver and the camera system. To tackle these challenges, we develop a n…

    Submitted 11 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  28. Superior Genetic Algorithms for the Target Set Selection Problem Based on Power-Law Parameter Choices and Simple Greedy Heuristics

    Authors: Benjamin Doerr, Martin S. Krejca, Nguyen Vu

    Abstract: The target set selection problem (TSS) asks for a set of vertices such that an influence spreading process started in these vertices reaches the whole graph. The current state of the art for this NP-hard problem consists of three recently proposed randomized search heuristics, namely a biased random-key genetic algorithm (BRKGA) obtained from extensive parameter tuning, a max-min ant system (MMAS), and a…

    Submitted 5 April, 2024; originally announced April 2024.

    Journal ref: GECCO '24: Proceedings of the Genetic and Evolutionary Computation Conference, pages 169-177, ACM, 2024

  29. arXiv:2403.17647

    cs.CL

    Intrinsic Subgraph Generation for Interpretable Graph based Visual Question Answering

    Authors: Pascal Tilli, Ngoc Thang Vu

    Abstract: The large success of deep learning based methods in Visual Question Answering (VQA) has concurrently increased the demand for explainable methods. Most methods in Explainable Artificial Intelligence (XAI) focus on generating post-hoc explanations rather than taking an intrinsic approach, the latter characterizing an interpretable model. In this work, we introduce an interpretable approach for grap…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  30. arXiv:2403.17582

    cs.CL cs.AI cs.LG

    Towards a Zero-Data, Controllable, Adaptive Dialog System

    Authors: Dirk Väth, Lindsey Vanderlyn, Ngoc Thang Vu

    Abstract: Conversational Tree Search (Väth et al., 2023) is a recent approach to controllable dialog systems, where domain experts shape the behavior of a Reinforcement Learning agent through a dialog tree. The agent learns to efficiently navigate this tree, while adapting to information needs, e.g., domain familiarity, of different users. However, the need for additional training data hinders deployment in…

    Submitted 26 March, 2024; originally announced March 2024.

  31. arXiv:2403.08705

    gr-qc

    Scalarization of isolated black holes in scalar Gauss-Bonnet theory in the fixing-the-equations approach

    Authors: Guillermo Lara, Harald P. Pfeiffer, Nikolas A. Wittek, Nils L. Vu, Kyle C. Nelli, Alexander Carpenter, Geoffrey Lovelace, Mark A. Scheel, William Throwe

    Abstract: One of the most promising avenues to perform numerical evolutions in theories beyond General Relativity is the fixing-the-equations approach, a proposal in which new "driver" equations are added to the evolution equations in a way that allows for stable numerical evolutions. In this direction, we extend the numerical relativity code SpECTRE to evolve a "fixed" version of scalar Gauss-Bonnet th…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 16 pages, 12 figures

  32. arXiv:2403.05338

    cs.CL

    Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings

    Authors: Wei Zhou, Heike Adel, Hendrik Schuff, Ngoc Thang Vu

    Abstract: Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour. Currently, prompt-based models are gaining popularity, i.a., due to their easier adaptability in low-resource settings. However, the quality of attribution scores extracted from prompt-based models has not been investigated yet. In this work, we address this topic by analyzing attribution sc…

    Submitted 8 March, 2024; originally announced March 2024.

  33. arXiv:2403.04784

    cs.CR cs.LG

    Analysis of Privacy Leakage in Federated Large Language Models

    Authors: Minh N. Vu, Truc Nguyen, Tre' R. Jeter, My T. Thai

    Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large scale of LLMs. While substantial adjustments to the protocol have been introduced as a response, comprehensive privacy analysis for the adapted FL protocol is…

    Submitted 2 March, 2024; originally announced March 2024.

  34. arXiv:2402.04769

    cs.RO

    Hierarchical Motion Planning and Offline Robust Model Predictive Control for Autonomous Vehicles

    Authors: Hung Duy Nguyen, Minh Nhat Vu, Nguyen Ngoc Nam, Kyoungseok Han

    Abstract: Driving vehicles in complex scenarios under harsh conditions is the biggest challenge for autonomous vehicles (AVs). To address this issue, we propose a hierarchical motion planning and robust control strategy using the front-active steering system in complex scenarios with various slippery road adhesion coefficients while considering uncertain vehicle parameters. Behaviors of human vehicles (HVs) a…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 6 pages, 9 illustrations, Accepted for publication in American Control Conference (ACC) 2024

  35. arXiv:2402.04730

    cs.RO

    Model Predictive Trajectory Optimization With Dynamically Changing Waypoints for Serial Manipulators

    Authors: Florian Beck, Minh Nhat Vu, Christian Hartl-Nesic, Andreas Kugi

    Abstract: Systematically including dynamically changing waypoints as desired discrete actions, for instance, resulting from superordinate task planning, has been challenging for online model predictive trajectory optimization with short planning horizons. This paper presents a novel waypoint model predictive control (wMPC) concept for online replanning tasks. The main idea is to split the planning horizon a…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 8 pages, 6 figures

  36. Striking the right tone: toward a self-consistent framework for measuring black hole ringdowns

    Authors: Teagan A. Clarke, Maximiliano Isi, Paul D. Lasky, Eric Thrane, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Keefe Mitman, Jordan Moxon, Kyle C. Nelli, William Throwe, Nils L. Vu

    Abstract: The ringdown portion of a binary black hole merger consists of a sum of modes, each containing an infinite number of tones that are exponentially damped sinusoids. In principle, these can be measured as gravitational waves with observatories like LIGO/Virgo/KAGRA; however, in practice it is unclear how many tones can be meaningfully resolved. We investigate the consistency and resolvability of the…

    Submitted 11 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 14 pages, 8 figures, 2 tables. Published in PRD

  37. arXiv:2401.17676

    cs.RO

    Observer-based Controller Design for Oscillation Damping of a Novel Suspended Underactuated Aerial Platform

    Authors: Hemjyoti Das, Minh Nhat Vu, Tobias Egle, Christian Ott

    Abstract: In this work, we present a novel actuation strategy for a suspended aerial platform. By utilizing an underactuation approach, we demonstrate the successful oscillation damping of the proposed platform, modeled as a spherical double pendulum. A state estimator is designed in order to obtain the deflection angles of the platform, which uses only onboard IMU measurements. The state estimator is an ex…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 7 pages, 11 figures, Accepted for publication to ICRA 2024

  38. arXiv:2401.09059

    cs.RO cs.CV

    Autonomous Catheterization with Open-source Simulator and Expert Trajectory

    Authors: Tudor Jianu, Baoru Huang, Tuan Vo, Minh Nhat Vu, Jingxuan Kang, Hoan Nguyen, Olatunji Omisore, Pierre Berthet-Rayne, Sebastiano Fichera, Anh Nguyen

    Abstract: Endovascular robots have been actively developed in both academia and industry. However, progress toward autonomous catheterization is often hampered by the widespread use of closed-source simulators and physical phantoms. Additionally, the acquisition of large-scale datasets for training machine learning algorithms with endovascular robots is usually infeasible due to expensive medical procedures…

    Submitted 19 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Code: https://github.com/airvlab/cathsim

  39. arXiv:2401.00805

    gr-qc astro-ph.CO

    Nonlinear Effects In Black Hole Ringdown From Scattering Experiments I: spin and initial data dependence of quadratic mode coupling

    Authors: Hengrui Zhu, Justin L. Ripley, Frans Pretorius, Sizheng Ma, Keefe Mitman, Robert Owen, Michael Boyle, Yitian Chen, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, William Throwe, Nils L. Vu

    Abstract: We investigate quadratic quasinormal mode coupling in black hole spacetime through numerical simulations of single perturbed black holes using both numerical relativity and second-order black hole perturbation theory. Focusing on the dominant $\ell=|m|=2$ quadrupolar modes, we find good agreement (within $\sim10\%$) between these approaches, with discrepancies attributed to truncation error and un…

    Submitted 1 January, 2024; originally announced January 2024.

  40. arXiv:2312.08588

    gr-qc astro-ph.CO astro-ph.SR

    Black Hole Spectroscopy for Precessing Binary Black Hole Coalescences

    Authors: Hengrui Zhu, Harrison Siegel, Keefe Mitman, Maximiliano Isi, Will M. Farr, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Sizheng Ma, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Vijay Varma, Nils L. Vu

    Abstract: The spectroscopic study of black hole quasinormal modes in gravitational-wave ringdown observations is hindered by our ignorance of which modes should dominate astrophysical signals for different binary configurations, limiting tests of general relativity and astrophysics. In this work, we present a description of the quasinormal modes that are excited in the ringdowns of comparable mass, quasi-ci…

    Submitted 18 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Data Release and Analysis Scripts: https://github.com/HengruiPrinceton/precession_ringdown

  41. arXiv:2311.14465

    cs.CL

    DP-NMT: Scalable Differentially-Private Machine Translation

    Authors: Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, Ivan Habernal

    Abstract: Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implemen…

    Submitted 24 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted at EACL 2024

  42. Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions

    Authors: Florian Lux, Pascal Tilli, Sarina Meyer, Ngoc Thang Vu

    Abstract: Customizing voice and speaking style in a speech synthesis system with intuitive and fine-grained controls is challenging, given that little data with appropriate labels is available. Furthermore, editing an existing human's voice also comes with ethical concerns. In this paper, we propose a method to generate artificial speaker embeddings that cannot be linked to a real human while offering intui…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at ISCA Interspeech 2023 https://www.isca-speech.org/archive/interspeech_2023/lux23_interspeech.html

  43. arXiv:2310.17499  [pdf, other]

    cs.CL cs.LG eess.AS

    The IMS Toucan System for the Blizzard Challenge 2023

    Authors: Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

    Abstract: For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synt…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at the Blizzard Challenge Workshop 2023, co-located with the Speech Synthesis Workshop 2023, a satellite event of Interspeech 2023

  44. arXiv:2310.16618  [pdf, other]

    cs.CV cs.RO

    Real-time 6-DoF Pose Estimation by an Event-based Camera using Active LED Markers

    Authors: Gerald Ebmer, Adam Loch, Minh Nhat Vu, Germain Haessig, Roberto Mecca, Markus Vincze, Christian Hartl-Nesic, Andreas Kugi

    Abstract: Real-time applications for autonomous operations depend largely on fast and robust vision-based localization systems. Since image processing tasks require processing large amounts of data, the computational resources often limit the performance of other processes. To overcome this limitation, traditional marker-based localization systems are widely used since they are easy to integrate and achieve…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 14 pages, 12 figures, this paper has been accepted to WACV 2024

  45. arXiv:2310.15948  [pdf, other]

    cs.CV

    Language-driven Scene Synthesis using Multi-conditional Diffusion Model

    Authors: An Vuong, Minh Nhat Vu, Toan Tien Nguyen, Baoru Huang, Dzung Nguyen, Thieu Vo, Anh Nguyen

    Abstract: Scene synthesis is a challenging problem with several industrial applications. Recently, substantial efforts have been directed to synthesize the scene using human motions, room layouts, or spatial graphs as the input. However, few studies have addressed this problem from multiple modalities, especially combining text prompts. In this paper, we propose a language-driven scene synthesis task, which…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  46. arXiv:2310.15262  [pdf, other]

    cs.CL

    Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study

    Authors: Injy Hamed, Nizar Habash, Ngoc Thang Vu

    Abstract: Code-switching (CSW) text generation has been receiving increasing attention as a solution to address data scarcity. In light of this growing interest, we need more comprehensive studies comparing different augmentation approaches. In this work, we compare three popular approaches: lexical replacements, linguistic theories, and back-translation (BT), in the context of Egyptian Arabic-English CSW.…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  47. arXiv:2310.06103  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding

    Authors: Pavel Denisov, Ngoc Thang Vu

    Abstract: A number of methods have been proposed for End-to-End Spoken Language Understanding (E2E-SLU) using pretrained models; however, their evaluation often lacks a multilingual setup and tasks that require prediction of lexical fillers, such as slot filling. In this work, we propose a unified method that integrates multilingual pretrained speech and text models and performs E2E-SLU on six datasets in four…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2023

  48. arXiv:2309.10932  [pdf, other]

    cs.RO

    Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation

    Authors: Tuan Van Vo, Minh Nhat Vu, Baoru Huang, Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: Affordance detection presents intricate challenges and has a wide range of robotic applications. Previous works have faced limitations such as the complexities of 3D object shapes, the wide range of potential affordances on real-world objects, and the lack of open-vocabulary support for affordance understanding. In this paper, we introduce a new open-vocabulary affordance detection method in 3D po…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 8 pages

  49. arXiv:2309.10911  [pdf, other]

    cs.RO

    Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

    Authors: Toan Nguyen, Minh Nhat Vu, Baoru Huang, Tuan Van Vo, Vy Truong, Ngan Le, Thieu Vo, Bac Le, Anh Nguyen

    Abstract: Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-wor…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Project page: https://3DAPNet.github.io

  50. arXiv:2309.09818  [pdf, other]

    cs.RO cs.CV

    Grasp-Anything: Large-scale Grasp Dataset from Foundation Models

    Authors: An Dinh Vuong, Minh Nhat Vu, Hieu Le, Baoru Huang, Binh Huynh, Thieu Vo, Andreas Kugi, Anh Nguyen

    Abstract: Foundation models such as ChatGPT have made significant strides in robotic tasks due to their universal representation of real-world domains. In this paper, we leverage foundation models to tackle grasp detection, a persistent challenge in robotics with broad industrial applications. Despite numerous grasp datasets, their object diversity remains limited compared to real-world figures. Fortunately…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Project page: https://grasp-anything-2023.github.io