Showing 1–50 of 203 results for author: Bui, T

Searching in archive cs.
  1. arXiv:2407.12094  [pdf, other]

    cs.CL

    Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models

    Authors: Minh Nguyen, Franck Dernoncourt, Seunghyun Yoon, Hanieh Deilamsalehy, Hao Tan, Ryan Rossi, Quan Hung Tran, Trung Bui, Thien Huu Nguyen

    Abstract: We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives. Despite the advancements in speech recognition, the task of text-based speaker identification (SpeakerID) has received limited attention, lacking large-scale, diverse datasets for effective model training. Addressing these ga…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: accepted to INTERSPEECH 2024

  2. arXiv:2407.09152  [pdf]

    cs.AI cs.CL

    The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs

    Authors: Anh Thu Maria Bui, Saskia Felizitas Brech, Natalie Hußfeldt, Tobias Jennert, Melanie Ullrich, Timo Breuer, Narjes Nikzad Khasmakhi, Philipp Schaer

    Abstract: Hallucination detection in Large Language Models (LLMs) is crucial for ensuring their reliability. This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. We explored the capabilities of four LLMs: Llama 3, Gemma, GPT-3.5 Turbo, and GPT-4, for this purpose. We also employed ens…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Paper accepted at ELOQUENT@CLEF'24

  3. arXiv:2407.08470  [pdf, other]

    cs.CV cs.AI

    Brain Tumor Segmentation in MRI Images with 3D U-Net and Contextual Transformer

    Authors: Thien-Qua T. Nguyen, Hieu-Nghia Nguyen, Thanh-Hieu Bui, Thien B. Nguyen-Tat, Vuong M. Ngo

    Abstract: This research presents an enhanced approach for precise segmentation of brain tumor masses in magnetic resonance imaging (MRI) using an advanced 3D-UNet model combined with a Context Transformer (CoT). By architectural expansion CoT, the proposed model extends its architecture to a 3D format, integrates it smoothly with the base model to utilize the complex contextual information found in MRI scan…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 6 pages, 7 figures

  4. arXiv:2407.04855  [pdf, other]

    cs.CL cs.AI

    Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs

    Authors: Mihir Parmar, Hanieh Deilamsalehy, Franck Dernoncourt, Seunghyun Yoon, Ryan A. Rossi, Trung Bui

    Abstract: Extractive summarization plays a pivotal role in natural language processing due to its wide-range applications in summarizing diverse content efficiently, while also being faithful to the original content. Despite significant advancement achieved in extractive summarization by Large Language Models (LLMs), these summaries frequently exhibit incoherence. An important aspect of the coherent summary…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 10 pages

  5. arXiv:2407.01863  [pdf, other]

    cs.CL

    VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs

    Authors: Qiucheng Wu, Handong Zhao, Michael Saxon, Trung Bui, William Yang Wang, Yang Zhang, Shiyu Chang

    Abstract: Vision language models (VLMs) are an exciting emerging class of language models (LMs) that have merged classic LM capabilities with those of image processing systems. However, the ways that these capabilities combine are not always intuitive and warrant direct investigation. One understudied capability in VLMs is visual spatial planning -- the ability to comprehend the spatial arrangements of obje…

    Submitted 1 July, 2024; originally announced July 2024.

  6. arXiv:2406.15633  [pdf, other]

    cs.SE

    Good things come in three: Generating SO Post Titles with Pre-Trained Models, Self Improvement and Post Ranking

    Authors: Duc Anh Le, Anh M. T. Bui, Phuong T. Nguyen, Davide Di Ruscio

    Abstract: Stack Overflow is a prominent Q&A forum, supporting developers in seeking suitable resources on programming-related matters. Having high-quality question titles is an effective means to attract developers' attention. Unfortunately, this is often underestimated, leaving room for improvement. Research has been conducted, predominantly leveraging pre-trained models to generate titles from code sn…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: The paper has been peer-reviewed and accepted for publication at the International Symposium on Empirical Software Engineering and Measurement (ESEM 2024)

  7. arXiv:2405.12522  [pdf, other]

    cs.CL cs.LG

    Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models

    Authors: Charles O'Neill, Thang Bui

    Abstract: This paper introduces an efficient and robust method for discovering interpretable circuits in large language models using discrete sparse autoencoders. Our approach addresses key limitations of existing techniques, namely computational complexity and sensitivity to hyperparameters. We propose training sparse autoencoders on carefully designed positive and negative examples, where the model can on…

    Submitted 21 May, 2024; originally announced May 2024.

  8. arXiv:2405.05827  [pdf, other]

    cs.IT

    Efficient designs for threshold group testing without gap

    Authors: Thach V. Bui, Yeow Meng Chee, Van Khu Vu

    Abstract: Given $d$ defective items in a population of $n$ items with $d \ll n$, in threshold group testing without gap, the outcome of a test on a subset of items is positive if the subset has at least $u$ defective items and negative otherwise, where $1 \leq u \leq d$. The basic goal of threshold group testing is to quickly identify the defective items via a small number of tests. In non-adaptive design,…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 11 pages, 2 figures

  9. arXiv:2404.14760  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Retrieval Augmented Generation for Domain-specific Question Answering

    Authors: Sanat Sharma, David Seunghyun Yoon, Franck Dernoncourt, Dewang Sultania, Karishma Bagga, Mengjiao Zhang, Trung Bui, Varun Kotte

    Abstract: Question answering (QA) has become an important application in the advanced development of large language models. General pre-trained large language models for question-answering are not trained to properly understand the knowledge or terminology for a specific domain, such as finance, healthcare, education, and customer service for a product. To better cater to domain-specific understanding, we b…

    Submitted 29 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: AAAI 2024 (Association for the Advancement of Artificial Intelligence) Scientific Document Understanding Workshop

  10. arXiv:2404.13522  [pdf, other]

    cs.AI cs.LG stat.ML

    Error Analysis of Shapley Value-Based Model Explanations: An Informative Perspective

    Authors: Ningsheng Zhao, Jia Yuan Yu, Krzysztof Dzieciolowski, Trang Bui

    Abstract: Shapley value attribution (SVA) is an increasingly popular explainable AI (XAI) method, which quantifies the contribution of each feature to the model's output. However, recent work has shown that most existing methods to implement SVAs have some drawbacks, resulting in biased or unreliable explanations that fail to correctly capture the true intrinsic relationships between features and model outp…

    Submitted 29 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  11. arXiv:2404.12652  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Pre-trained Vision-Language Models Learn Discoverable Visual Concepts

    Authors: Yuan Zang, Tian Yun, Hao Tan, Trung Bui, Chen Sun

    Abstract: Do vision-language models (VLMs) pre-trained to caption an image of a "durian" learn visual concepts such as "brown" (color) and "spiky" (texture) at the same time? We aim to answer this question as visual concepts learned "for free" would enable wide applications such as neuro-symbolic reasoning or human-interpretable object classification. We assume that the visual concepts, if captured by pre-t…

    Submitted 19 April, 2024; originally announced April 2024.

  12. arXiv:2404.09296  [pdf, other]

    cs.CL

    Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT

    Authors: Tuan Bui, Oanh Tran, Phuong Nguyen, Bao Ho, Long Nguyen, Thang Bui, Tho Quan

    Abstract: In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly. Despite their powerful language capabilities, similar to pre-trained language models (PLMs), LLMs still face challenges in remembering events, incorporating new information, and addressing dom…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures

  13. arXiv:2404.04408  [pdf, other]

    cs.CE

    A novel section-section potential for short-range interactions between plane beams

    Authors: A. Borković, M. H. Gfrerer, R. A. Sauer, B. Marussig, T. Q. Bui

    Abstract: We derive a novel formulation for the interaction potential between deformable fibers due to short-range fields arising from intermolecular forces. The formulation improves the existing section-section interaction potential law for in-plane beams by considering an offset between interacting cross sections. The new law is asymptotically consistent, which is particularly beneficial for computational…

    Submitted 5 April, 2024; originally announced April 2024.

  14. arXiv:2404.03398  [pdf, other]

    cs.CV

    Scaling Up Video Summarization Pretraining with Large Language Models

    Authors: Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron, Hanieh Deilamsalehy, Trung Bui, Zhaowen Wang, Franck Dernoncourt, Joon Son Chung

    Abstract: Long-form video content constitutes a significant portion of internet traffic, making automated video summarization an essential research problem. However, existing video summarization datasets are notably limited in their size, constraining the effectiveness of state-of-the-art methods for generalization. Our work aims to overcome this limitation by capitalizing on the abundance of long-form vide…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  15. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, where…

    Submitted 23 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  16. arXiv:2403.09914  [pdf, other]

    cs.CV

    ProMark: Proactive Diffusion Watermarking for Causal Attribution

    Authors: Vishal Asnani, John Collomosse, Tu Bui, Xiaoming Liu, Shruti Agarwal

    Abstract: Generative AI (GenAI) is transforming creative workflows through the capability to synthesize and manipulate images via high-level prompts. Yet creatives are not well supported to receive recognition or reward for the use of their content in GenAI training. To this end, we propose ProMark, a causal attribution technique to attribute a synthetically generated image to its training data concepts lik…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  17. arXiv:2403.05873  [pdf, other]

    cs.SE cs.IR cs.LG

    LEGION: Harnessing Pre-trained Language Models for GitHub Topic Recommendations with Distribution-Balance Loss

    Authors: Yen-Trang Dang, Thanh-Le Cong, Phuc-Thanh Nguyen, Anh M. T. Bui, Phuong T. Nguyen, Bach Le, Quyet-Thang Huynh

    Abstract: Open-source development has revolutionized the software industry by promoting collaboration, transparency, and community-driven innovation. Today, a vast amount of various kinds of open-source software, which form networks of repositories, is often hosted on GitHub - a popular software development platform. To enhance the discoverability of the repository networks, i.e., groups of similar reposito…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted to EASE'24

  18. arXiv:2403.05297  [pdf, other]

    cs.CV cs.AI cs.CL

    PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck

    Authors: Thang M. Pham, Peijie Chen, Tin Nguyen, Seunghyun Yoon, Trung Bui, Anh Totti Nguyen

    Abstract: CLIP-based classifiers rely on the prompt containing a {class name} that is known to the text encoder. Therefore, they perform poorly on new classes or the classes whose names rarely appear on the Internet (e.g., scientific names of birds). For fine-grained classification, we propose PEEB - an explainable and editable classifier to (1) express the class name into a set of text descriptors that des…

    Submitted 12 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: Findings of NAACL 2024 (long paper)

  19. arXiv:2402.19119  [pdf, other]

    cs.CV cs.CL

    VIXEN: Visual Text Comparison Network for Image Difference Captioning

    Authors: Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John Collomosse

    Abstract: We present VIXEN - a technique that succinctly summarizes in text the visual differences between a pair of images in order to highlight any content manipulation present. Our proposed network linearly maps image features in a pairwise manner, constructing a soft prompt for a pretrained large language model. We address the challenge of low volume of training data and lack of manipulation variety in…

    Submitted 14 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: AAAI 2024

  20. arXiv:2402.15120  [pdf, other]

    cs.CV cs.AI cs.LG

    Fine-tuning CLIP Text Encoders with Two-step Paraphrasing

    Authors: Hyunjae Kim, Seunghyun Yoon, Trung Bui, Handong Zhao, Quan Tran, Franck Dernoncourt, Jaewoo Kang

    Abstract: Contrastive language-image pre-training (CLIP) models have demonstrated considerable success across various vision-language tasks, such as text-to-image retrieval, where the model is required to effectively process natural language input to produce an accurate visual output. However, current models still face limitations in dealing with linguistic variations in input queries, such as paraphrases,…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: EACL 2024 (Findings of the ACL)

  21. arXiv:2402.08946  [pdf, other]

    cs.LG

    Measuring Sharpness in Grokking

    Authors: Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, Noam Levi

    Abstract: Neural networks sometimes exhibit grokking, a phenomenon where perfect or near-perfect performance is achieved on a validation set well after the same performance has been obtained on the corresponding training set. In this workshop paper, we introduce a robust technique for measuring grokking, based on fitting an appropriate functional form. We then use this to investigate the sharpness of transi…

    Submitted 14 February, 2024; originally announced February 2024.

  22. ROSE: Rotation-based Squeezing Robotic Gripper toward Universal Handling of Objects

    Authors: Son Tien Bui, Shinya Kawano, Van Anh Ho

    Abstract: Robotic hands/grippers nowadays are not limited to manufacturing lines; instead, they are widely utilized in cluttered environments, such as restaurants, farms, and warehouses. In such scenarios, they need to deal with high uncertainty of the grasped objects' shapes, postures, surfaces, and material properties, which requires complex integration of sensing and decision-making process. On the other…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 9 pages, 9 figures, RSS2023 conference

    Journal ref: Robotics: Science and Systems 2023

  23. arXiv:2402.03805  [pdf, other]

    cs.SE

    Automated Description Generation for Software Patches

    Authors: Thanh Trong Vu, Tuan-Dung Bui, Thanh-Dat Do, Thu-Trang Nguyen, Hieu Dinh Vo, Son Nguyen

    Abstract: Software patches are pivotal in refining and evolving codebases, addressing bugs, vulnerabilities, and optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, manual description creation poses challenges in terms of time consumption and variations in quality and detail. In this paper, we propose PATCHEXPLAINER, an ap…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Pre-print version of PATCHEXPLAINER

  24. arXiv:2312.00220  [pdf, other]

    cs.MM cs.CL cs.CV

    Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation

    Authors: Linzi Xing, Quan Tran, Fabian Caba, Franck Dernoncourt, Seunghyun Yoon, Zhaowen Wang, Trung Bui, Giuseppe Carenini

    Abstract: Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks. Given the recent surge in multi-modal, relying solely on a single modality is arguably insufficient. On the other hand, prior solutions for similar tasks like video scene/shot segmentation cater to short videos with clear visual shifts but falter for long v…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: Accepted at the 30th International Conference on Multimedia Modeling (MMM 2024)

  25. arXiv:2311.18297  [pdf, other]

    cs.CV cs.AI

    TrustMark: Universal Watermarking for Arbitrary Resolution Images

    Authors: Tu Bui, Shruti Agarwal, John Collomosse

    Abstract: Imperceptible digital watermarking is important in copyright protection, misinformation prevention, and responsible generative AI. We propose TrustMark - a GAN-based watermarking method with novel design in architecture and spatio-spectra losses to balance the trade-off between watermarked image quality with the watermark recovery accuracy. Our model is trained with robustness in mind, withstandin…

    Submitted 30 November, 2023; originally announced November 2023.

  26. arXiv:2311.04400  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    LRM: Large Reconstruction Model for Single Image to 3D

    Authors: Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan

    Abstract: We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural rad…

    Submitted 9 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  27. arXiv:2311.04292  [pdf, other]

    cs.CL

    Aspect-based Meeting Transcript Summarization: A Two-Stage Approach with Weak Supervision on Sentence Classification

    Authors: Zhongfen Deng, Seunghyun Yoon, Trung Bui, Franck Dernoncourt, Quan Hung Tran, Shuaiqi Liu, Wenting Zhao, Tao Zhang, Yibo Wang, Philip S. Yu

    Abstract: Aspect-based meeting transcript summarization aims to produce multiple summaries, each focusing on one aspect of content in a meeting transcript. It is challenging as sentences related to different aspects can mingle together, and those relevant to a specific aspect can be scattered throughout the long transcript of a meeting. The traditional summarization methods produce one summary mixing inform…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted by 2023 IEEE International Conference on Big Data

  28. arXiv:2310.17247  [pdf, other]

    cs.LG stat.ML

    Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity

    Authors: Jack Miller, Charles O'Neill, Thang Bui

    Abstract: In some settings neural networks exhibit a phenomenon known as grokking, where they achieve perfect or near-perfect accuracy on the validation set long after the same performance has been achieved on the training set. In this paper, we discover that grokking is not limited to neural networks but occurs in other settings such as Gaussian process (GP) classification, GP regression, linear r…

    Submitted 31 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  29. arXiv:2310.06801  [pdf, other]

    cs.LG cs.MA

    Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation Learning

    Authors: The Viet Bui, Tien Mai, Thanh Hong Nguyen

    Abstract: This paper concerns imitation learning (IL) (i.e., the problem of learning to mimic expert behaviors from demonstrations) in cooperative multi-agent systems. The learning problem under consideration poses several challenges, characterized by high-dimensional state and action spaces and intricate inter-agent dependencies. In a single-agent setting, IL has proven to be done efficiently through an inv…

    Submitted 10 October, 2023; originally announced October 2023.

  30. arXiv:2309.08185  [pdf, other]

    cs.CL

    Multilingual Sentence-Level Semantic Search using Meta-Distillation Learning

    Authors: Meryem M'hamdi, Jonathan May, Franck Dernoncourt, Trung Bui, Seunghyun Yoon

    Abstract: Multilingual semantic search is the task of retrieving relevant contents to a query expressed in different language combinations. This requires a better semantic understanding of the user's intent and its contextual meaning. Multilingual semantic search is less explored and more challenging than its monolingual or bilingual counterparts, due to the lack of multilingual parallel resources for this…

    Submitted 15 September, 2023; originally announced September 2023.

  31. arXiv:2309.06126  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.HE cs.CL cs.LG

    AstroLLaMA: Towards Specialized Foundation Models in Astronomy

    Authors: Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD

    Abstract: Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than Llama-2, showing marke…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: 6 pages, 3 figures, submitted to IJCNLP-AACL 2023. Comments are welcome. The model can be found on Hugging Face - https://huggingface.co/universeTBD/astrollama

  32. Concomitant Group Testing

    Authors: Thach V. Bui, Jonathan Scarlett

    Abstract: In this paper, we introduce a variation of the group testing problem capturing the idea that a positive test requires a combination of multiple "types" of item. Specifically, we assume that there are multiple disjoint semi-defective sets, and a test is positive if and only if it contains at least one item from each of these sets. The goal is to reliably identify all of the semi-defective…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 15 pages, 3 figures, 1 table

  33. PaperToPlace: Transforming Instruction Documents into Spatialized and Context-Aware Mixed Reality Experiences

    Authors: Chen Chen, Cuong Nguyen, Jane Hoffswell, Jennifer Healey, Trung Bui, Nadir Weibel

    Abstract: While paper instructions are one of the mainstream media for sharing knowledge, consuming such instructions and translating them into activities are inefficient due to the lack of connectivity with physical environment. We present PaperToPlace, a novel workflow comprising an authoring pipeline, which allows the authors to rapidly transform and spatialize existing paper instructions into MR experi…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: 21 pages, 23 figures, Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST '23), San Francisco, CA, USA

    ACM Class: H.4.m; H.5.2; I.7.m

  34. arXiv:2308.13768  [pdf, other]

    cs.CL cs.LG

    Adversarial Fine-Tuning of Language Models: An Iterative Optimisation Approach for the Generation and Detection of Problematic Content

    Authors: Charles O'Neill, Jack Miller, Ioana Ciuca, Yuan-Sen Ting, Thang Bui

    Abstract: In this paper, we tackle the emerging challenge of unintended harmful content generation in Large Language Models (LLMs) with a novel dual-stage optimisation technique using adversarial fine-tuning. Our two-pronged approach employs an adversarial model, fine-tuned to generate potentially harmful prompts, and a judge model, iteratively optimised to discern these prompts. In this adversarial cycle,…

    Submitted 26 August, 2023; originally announced August 2023.

  35. arXiv:2308.10188  [pdf, other]

    cs.LG cs.MA

    Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games

    Authors: The Viet Bui, Tien Mai, Thanh Hong Nguyen

    Abstract: Training agents in multi-agent competitive games presents significant challenges due to their intricate nature. These challenges are exacerbated by dynamics influenced not only by the environment but also by opponents' strategies. Existing methods often struggle with slow convergence and instability. To address this, we harness the potential of imitation learning to comprehend and anticipate oppon…

    Submitted 20 August, 2023; originally announced August 2023.

  36. arXiv:2308.07645  [pdf, other]

    cs.CL

    Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation

    Authors: Charles O'Neill, Yuan-Sen Ting, Ioana Ciuca, Jack Miller, Thang Bui

    Abstract: Large Language Models (LLMs) hold immense potential to generate synthetic data of high quality and utility, which has numerous applications from downstream model training to practical data utilisation. However, contemporary models, despite their impressive capacities, consistently struggle to produce both coherent and diverse data. To address the coherency issue, we introduce contrastive expert gu…

    Submitted 17 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

  37. StarSRGAN: Improving Real-World Blind Super-Resolution

    Authors: Khoa D. Vo, Len T. Bui

    Abstract: The aim of blind super-resolution (SR) in computer vision is to improve the resolution of an image without prior knowledge of the degradation process that caused the image to be low-resolution. The State of the Art (SOTA) model Real-ESRGAN has advanced perceptual loss and produced visually compelling outcomes using more complex degradation models to simulate real-world degradations. However, there…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 11 pages, 7 figures, 2 tables, accepted for oral presentation at WSCG 2023

  38. arXiv:2307.12949  [pdf, ps, other]

    cs.CL

    Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

    Authors: Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

    Abstract: Punctuation restoration is an important task in automatic speech recognition (ASR) which aims to restore the syntactic structure of generated ASR texts to improve readability. While punctuated texts are abundant from written documents, the discrepancy between written punctuated texts and ASR texts limits the usability of written texts in training punctuation restoration systems for ASR texts. This…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted at INTERSPEECH 2023, 6 pages

  39. arXiv:2307.12335  [pdf, other]

    cs.CV cs.RO

    Learning Navigational Visual Representations with Semantic Map Supervision

    Authors: Yicong Hong, Yang Zhou, Ruiyi Zhang, Franck Dernoncourt, Trung Bui, Stephen Gould, Hao Tan

    Abstract: Being able to perceive the semantics and the spatial structure of the environment is essential for visual navigation of a household robot. However, most existing works only employ visual backbones pre-trained either with independent images for classification or with self-supervised learning methods to adapt to the indoor navigation domain, neglecting the spatial relationships that are essential to…

    Submitted 23 July, 2023; originally announced July 2023.

  40. arXiv:2306.10908  [pdf, ps, other]

    cs.DM cs.GT

    Optimal Pure Strategies for a Discrete Search Game

    Authors: Thuy Bui, Thomas Lidbetter, Kyle Y. Lin

    Abstract: Consider a two-person zero-sum search game between a Hider and a Searcher. The Hider chooses to hide in one of $n$ discrete locations (or "boxes") and the Searcher chooses a search sequence specifying which order to look in these boxes until finding the Hider. A search at box $i$ takes $t_i$ time units and finds the Hider - if hidden there - independently with probability $q_i$, for…

    Submitted 19 June, 2023; originally announced June 2023.

  41. arXiv:2306.04178  [pdf, other]

    cs.LG cs.CG

    Optimal Transport Model Distributional Robustness

    Authors: Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, Dinh Phung

    Abstract: Distributional robustness is a promising framework for training deep learning models that are less vulnerable to adversarial examples and data distribution shifts. Previous works have mainly focused on exploiting distributional robustness in the data space. In this work, we explore an optimal transport-based distributional robustness framework in model spaces. Specifically, we examine a model dist…

    Submitted 1 November, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

    Journal ref: Advances in Neural Information Processing Systems, 2023

  42. Development of a Vision System to Enhance the Reliability of the Pick-and-Place Robot for Autonomous Testing of Camera Module used in Smartphones

    Authors: Hoang-Anh Phan, Duy Nam Bui, Tuan Nguyen Dinh, Bao-Anh Hoang, An Nguyen Ngoc, Dong Tran Huu Quoc, Ha Tran Thi Thuy, Tung Thanh Bui, Van Nguyen Thi Thanh

    Abstract: Pick-and-place robots are commonly used in modern industrial manufacturing. For complex devices/parts like camera modules used in smartphones, which contain optical parts, electrical components and interfacing connectors, the placement operation may not be absolutely accurate, which may cause damage to the device under test during the mechanical movement to make good contact for electrical functions…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Published in the 2021 International Conference on Engineering and Emerging Technologies (ICEET 2021). 6 pages

  43. arXiv:2305.04594  [pdf, other

    cs.RO

    A sensor fusion approach for improving implementation speed and accuracy of RTAB-Map algorithm based indoor 3D mapping

    Authors: Hoang-Anh Phan, Phuc Vinh Nguyen, Thu Hang Thi Khuat, Hieu Dang Van, Dong Huu Quoc Tran, Bao Lam Dang, Tung Thanh Bui, Van Nguyen Thi Thanh, Trinh Chu Duc

    Abstract: In recent years, 3D mapping for indoor environments has undergone considerable research and improvement because of its effective applications in various fields, including robotics, autonomous navigation, and virtual reality. Building an accurate 3D map of an indoor environment is challenging due to the complex nature of the indoor space, the problem of real-time embedding and positioning errors of t…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted to 20th International Joint Conference on Computer Science and Software Engineering (JCSSE 2023). 5 pages

  44. arXiv:2305.04576  [pdf, other

    cs.RO

    An Enhanced Sampling-Based Method With Modified Next-Best View Strategy For 2D Autonomous Robot Exploration

    Authors: Dong Huu Quoc Tran, Hoang-Anh Phan, Hieu Dang Van, Tan Van Duong, Tung Thanh Bui, Van Nguyen Thi Thanh

    Abstract: Autonomous exploration is a new technology in the field of robotics that has found widespread application due to its objective to help robots independently localize, scan maps, and navigate any terrain without human control. To date, sampling-based exploration strategies have been the most effective for aerial and ground vehicles equipped with depth sensors producing three-dimensional po…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted to 20th International Joint Conference on Computer Science and Software Engineering (JCSSE 2023). 6 pages

  45. ViMQ: A Vietnamese Medical Question Dataset for Healthcare Dialogue System Development

    Authors: Ta Duc Huy, Nguyen Anh Tu, Tran Hoang Vu, Nguyen Phuc Minh, Nguyen Phan, Trung H. Bui, Steven Q. H. Truong

    Abstract: Existing medical text datasets usually take the form of question and answer pairs that support the task of natural language generation, but lack composite annotations of the medical terms. In this study, we publish a Vietnamese dataset of medical questions from patients with sentence-level and entity-level annotations for the Intent Classification and Named Entity Recognition tasks. The tag…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: accepted at ICONIP 2021

  46. arXiv:2304.10175  [pdf

    cs.LG cs.AI

    Automated Dynamic Bayesian Networks for Predicting Acute Kidney Injury Before Onset

    Authors: David Gordon, Panayiotis Petousis, Anders O. Garlid, Keith Norris, Katherine Tuttle, Susanne B. Nicholas, Alex A. T. Bui

    Abstract: Several algorithms for learning the structure of dynamic Bayesian networks (DBNs) require an a priori ordering of variables, which influences the determined graph topology. However, it is often unclear how to determine this order if feature importance is unknown, especially as an exhaustive search is usually impractical. In this paper, we introduce Ranking Approaches for Unknown Structures (RAUS),…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 27 pages (including 8 pages supplementary information)

  47. arXiv:2304.08300  [pdf, ps, other

    cs.DS math.CO

    Finding A Path Of Length k: An Expository

    Authors: Thai Bui

    Abstract: Given a graph $G(V, E)$ and a positive integer $k$ ($k \geq 1$), a simple path on $k$ vertices is a sequence of $k$ vertices in which no vertex appears more than once and each consecutive pair of vertices in the sequence is connected by an edge. This paper provides an overview of current research on the existence and counting of $k$-paths in graphs.

    Submitted 14 April, 2023; originally announced April 2023.

  48. arXiv:2304.05613  [pdf, other

    cs.CL cs.AI

    ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning

    Authors: Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, Thien Huu Nguyen

    Abstract: Over the last few years, large language models (LLMs) have emerged as the most important breakthroughs in natural language processing (NLP) that fundamentally transform research and developments in the field. ChatGPT represents one of the most exciting LLM systems developed recently to showcase impressive skills for language generation and highly attract public attention. Among various exciting ap…

    Submitted 12 April, 2023; originally announced April 2023.

  49. arXiv:2304.03869  [pdf, other

    cs.CV

    Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis

    Authors: Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang

    Abstract: Diffusion-based models have achieved state-of-the-art performance on text-to-image synthesis tasks. However, one critical limitation of these models is the low fidelity of generated images with respect to the text description, such as missing objects, mismatched attributes, and mislocated objects. One key reason for such inconsistencies is the inaccurate cross-attention to text in both the spatial…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 20 pages, 16 figures

  50. arXiv:2304.03400  [pdf, other

    cs.CV

    RoSteALS: Robust Steganography using Autoencoder Latent Space

    Authors: Tu Bui, Shruti Agarwal, Ning Yu, John Collomosse

    Abstract: Data hiding such as steganography and invisible watermarking has important applications in copyright protection, privacy-preserved communication and content provenance. Existing works often fall short in either preserving image quality, or robustness against perturbations or are too complex to train. We propose RoSteALS, a practical steganography technique leveraging frozen pretrained autoencoders…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: accepted to CVPR WMF 2023