-
FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization
Authors:
Son Tung Nguyen,
Alejandro Fontan,
Michael Milford,
Tobias Fischer
Abstract:
Hierarchical methods represent state-of-the-art visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this state-of-the-art performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but…
▽ More
Hierarchical methods represent state-of-the-art visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this state-of-the-art performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. This fusion rearranges the local descriptor space such that geographically nearby local descriptors are closer in the feature space according to the global descriptors. Therefore, the number of irrelevant competing descriptors decreases, specifically if they are geographically distant, thereby increasing the likelihood of correctly matching a query descriptor. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements. Extensive experiments using various state-of-the-art local and global descriptors across four different datasets demonstrate the effectiveness of our approach. For the first time, our approach enables direct matching algorithms to benefit from global descriptors while maintaining memory efficiency. The code for this paper will be published at \href{https://github.com/sontung/descriptor-disambiguation}{github.com/sontung/descriptor-disambiguation}.
△ Less
Submitted 21 August, 2024;
originally announced August 2024.
-
Analyzing Data Efficiency and Performance of Machine Learning Algorithms for Assessing Low Back Pain Physical Rehabilitation Exercises
Authors:
Aleksa Marusic,
Louis Annabi,
Sao Msi Nguyen,
Adriana Tapus
Abstract:
Analyzing human motion is an active research area, with various applications. In this work, we focus on human motion analysis in the context of physical rehabilitation using a robot coach system. Computer-aided assessment of physical rehabilitation entails evaluation of patient performance in completing prescribed rehabilitation exercises, based on processing movement data captured with a sensory…
▽ More
Analyzing human motion is an active research area, with various applications. In this work, we focus on human motion analysis in the context of physical rehabilitation using a robot coach system. Computer-aided assessment of physical rehabilitation entails evaluation of patient performance in completing prescribed rehabilitation exercises, based on processing movement data captured with a sensory system, such as RGB and RGB-D cameras. As 2D and 3D human pose estimation from RGB images had made impressive improvements, we aim to compare the assessment of physical rehabilitation exercises using movement data obtained from both RGB-D camera (Microsoft Kinect) and estimation from RGB videos (OpenPose and BlazePose algorithms). A Gaussian Mixture Model (GMM) is employed from position (and orientation) features, with performance metrics defined based on the log-likelihood values from GMM. The evaluation is performed on a medical database of clinical patients carrying out low back-pain rehabilitation exercises, previously coached by robot Poppy.
△ Less
Submitted 5 August, 2024;
originally announced August 2024.
-
Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration
Authors:
Ngoc Son Nguyen,
Van Son Nguyen,
Tung Le
Abstract:
Visual Question Answering (VQA) has recently emerged as a potential research domain, captivating the interest of many in the field of artificial intelligence and computer vision. Despite the prevalence of approaches in English, there is a notable lack of systems specifically developed for certain languages, particularly Vietnamese. This study aims to bridge this gap by conducting comprehensive exp…
▽ More
Visual Question Answering (VQA) has recently emerged as a potential research domain, captivating the interest of many in the field of artificial intelligence and computer vision. Despite the prevalence of approaches in English, there is a notable lack of systems specifically developed for certain languages, particularly Vietnamese. This study aims to bridge this gap by conducting comprehensive experiments on the Vietnamese Visual Question Answering (ViVQA) dataset, demonstrating the effectiveness of our proposed model. In response to community interest, we have developed a model that enhances image representation capabilities, thereby improving overall performance in the ViVQA system. Specifically, our model integrates the Bootstrapping Language-Image Pre-training with frozen unimodal models (BLIP-2) and the convolutional neural network EfficientNet to extract and process both local and global features from images. This integration leverages the strengths of transformer-based architectures for capturing comprehensive contextual information and convolutional networks for detailed local features. By freezing the parameters of these pre-trained models, we significantly reduce the computational cost and training time, while maintaining high performance. This approach significantly improves image representation and enhances the performance of existing VQA systems. We then leverage a multi-modal fusion module based on a general-purpose multi-modal foundation model (BEiT-3) to fuse the information between visual and textual features. Our experimental findings demonstrate that our model surpasses competing baselines, achieving promising performance. This is particularly evident in its accuracy of $71.04\%$ on the test set of the ViVQA dataset, marking a significant advancement in our research area. The code is available at https://github.com/nngocson2002/ViVQA.
△ Less
Submitted 30 July, 2024;
originally announced July 2024.
-
LiteGPT: Large Vision-Language Model for Joint Chest X-ray Localization and Classification Task
Authors:
Khai Le-Duc,
Ryan Zhang,
Ngoc Son Nguyen,
Tan-Hanh Pham,
Anh Dao,
Ba Hung Ngo,
Anh Totti Nguyen,
Truong-Son Hy
Abstract:
Vision-language models have been extensively explored across a wide range of tasks, achieving satisfactory performance; however, their application in medical imaging remains underexplored. In this work, we propose a unified framework - LiteGPT - for the medical imaging. We leverage multiple pre-trained visual encoders to enrich information and enhance the performance of vision-language models. To…
▽ More
Vision-language models have been extensively explored across a wide range of tasks, achieving satisfactory performance; however, their application in medical imaging remains underexplored. In this work, we propose a unified framework - LiteGPT - for the medical imaging. We leverage multiple pre-trained visual encoders to enrich information and enhance the performance of vision-language models. To the best of our knowledge, this is the first study to utilize vision-language models for the novel task of joint localization and classification in medical images. Besides, we are pioneers in providing baselines for disease localization in chest X-rays. Finally, we set new state-of-the-art performance in the image classification task on the well-benchmarked VinDr-CXR dataset. All code and models are publicly available online: https://github.com/leduckhai/LiteGPT
△ Less
Submitted 15 July, 2024;
originally announced July 2024.
-
An Empirical Study on Capability of Large Language Models in Understanding Code Semantics
Authors:
Thu-Trang Nguyen,
Thanh Trong Vu,
Hieu Dinh Vo,
Son Nguyen
Abstract:
Large Language Models for Code (code LLMs) have demonstrated remarkable performance across various software engineering (SE) tasks, increasing the application of code LLMs in software development. Despite the success of code LLMs, there remain significant concerns about the actual capabilities and reliability of these models, "whether these models really learn the semantics of code from the traini…
▽ More
Large Language Models for Code (code LLMs) have demonstrated remarkable performance across various software engineering (SE) tasks, increasing the application of code LLMs in software development. Despite the success of code LLMs, there remain significant concerns about the actual capabilities and reliability of these models, "whether these models really learn the semantics of code from the training data and leverage the learned knowledge to perform the SE tasks". In this paper, we introduce EMPICA, a comprehensive framework designed to systematically and empirically evaluate the capabilities of code LLMs in understanding code semantics. Specifically, EMPICA systematically introduces controlled modifications/transformations into the input code and examines the models' responses. Generally, code LLMs must be robust to semantically equivalent code inputs and be sensitive to non-equivalent ones for all SE tasks. Specifically, for every SE task, given an input code snippet c and its semantic equivalent variants, code LLMs must robustly produce consistent/equivalent outputs while they are expected to generate different outputs for c and its semantic non-equivalent variants. Our experimental results on three representative code understanding tasks, including code summarization, method name prediction, and output prediction, reveal that the robustness and sensitivity of the state-of-the-art code LLMs to code transformations vary significantly across tasks and transformation operators. In addition, the code LLMs exhibit better robustness to the semantic preserving transformations than their sensitivity to the semantic non-preserving transformations. These results highlight a need to enhance the model's capabilities of understanding code semantics, especially the sensitivity property.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Venomancer: Towards Imperceptible and Target-on-Demand Backdoor Attacks in Federated Learning
Authors:
Son Nguyen,
Thinh Nguyen,
Khoa D Doan,
Kok-Seng Wong
Abstract:
Federated Learning (FL) is a distributed machine learning approach that maintains data privacy by training on decentralized data sources. Similar to centralized machine learning, FL is also susceptible to backdoor attacks, where an attacker can compromise some clients by injecting a backdoor trigger into local models of those clients, leading to the global model's behavior being manipulated as des…
▽ More
Federated Learning (FL) is a distributed machine learning approach that maintains data privacy by training on decentralized data sources. Similar to centralized machine learning, FL is also susceptible to backdoor attacks, where an attacker can compromise some clients by injecting a backdoor trigger into local models of those clients, leading to the global model's behavior being manipulated as desired by the attacker. Most backdoor attacks in FL assume a predefined target class and require control over a large number of clients or knowledge of benign clients' information. Furthermore, they are not imperceptible and are easily detected by human inspection due to clear artifacts left on the poison data. To overcome these challenges, we propose Venomancer, an effective backdoor attack that is imperceptible and allows target-on-demand. Specifically, imperceptibility is achieved by using a visual loss function to make the poison data visually indistinguishable from the original data. Target-on-demand property allows the attacker to choose arbitrary target classes via conditional adversarial training. Additionally, experiments showed that the method is robust against state-of-the-art defenses such as Norm Clipping, Weak DP, Krum, Multi-Krum, RLR, FedRAD, Deepsight, and RFLBAT. The source code is available at https://github.com/nguyenhongson1902/Venomancer.
△ Less
Submitted 11 July, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
A Medical Low-Back Pain Physical Rehabilitation Dataset for Human Body Movement Analysis
Authors:
Sao Mai Nguyen,
Maxime Devanne,
Olivier Remy-Neris,
Mathieu Lempereur,
André Thepaut
Abstract:
While automatic monitoring and coaching of exercises are showing encouraging results in non-medical applications, they still have limitations such as errors and limited use contexts. To allow the development and assessment of physical rehabilitation by an intelligent tutoring system, we identify in this article four challenges to address and propose a medical dataset of clinical patients carrying…
▽ More
While automatic monitoring and coaching of exercises are showing encouraging results in non-medical applications, they still have limitations such as errors and limited use contexts. To allow the development and assessment of physical rehabilitation by an intelligent tutoring system, we identify in this article four challenges to address and propose a medical dataset of clinical patients carrying out low back-pain rehabilitation exercises. The dataset includes 3D Kinect skeleton positions and orientations, RGB videos, 2D skeleton data, and medical annotations to assess the correctness, and error classification and localisation of body part and timespan. Along this dataset, we perform a complete research path, from data collection to processing, and finally a small benchmark. We evaluated on the dataset two baseline movement recognition algorithms, pertaining to two different approaches: the probabilistic approach with a Gaussian Mixture Model (GMM), and the deep learning approach with a Long-Short Term Memory (LSTM).
This dataset is valuable because it includes rehabilitation relevant motions in a clinical setting with patients in their rehabilitation program, using a cost-effective, portable, and convenient sensor, and because it shows the potential for improvement on these challenges.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
H-Fac: Memory-Efficient Optimization with Factorized Hamiltonian Descent
Authors:
Son Nguyen,
Lizhang Chen,
Bo Liu,
Qiang Liu
Abstract:
In this study, we introduce a novel adaptive optimizer, H-Fac, which incorporates a factorized approach to momentum and scaling parameters. Our algorithm demonstrates competitive performances on both ResNets and Vision Transformers, while achieving sublinear memory costs through the use of rank-1 parameterizations for moment estimators. We develop our algorithms based on principles derived from Ha…
▽ More
In this study, we introduce a novel adaptive optimizer, H-Fac, which incorporates a factorized approach to momentum and scaling parameters. Our algorithm demonstrates competitive performances on both ResNets and Vision Transformers, while achieving sublinear memory costs through the use of rank-1 parameterizations for moment estimators. We develop our algorithms based on principles derived from Hamiltonian dynamics, providing robust theoretical underpinnings. These optimization algorithms are designed to be both straightforward and adaptable, facilitating easy implementation in diverse settings.
△ Less
Submitted 17 June, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Matrix Manifold Neural Networks++
Authors:
Xuan Son Nguyen,
Shuo Yang,
Aymeric Histace
Abstract:
Deep neural networks (DNNs) on Riemannian manifolds have garnered increasing interest in various applied areas. For instance, DNNs on spherical and hyperbolic manifolds have been designed to solve a wide range of computer vision and nature language processing tasks. One of the key factors that contribute to the success of these networks is that spherical and hyperbolic manifolds have the rich alge…
▽ More
Deep neural networks (DNNs) on Riemannian manifolds have garnered increasing interest in various applied areas. For instance, DNNs on spherical and hyperbolic manifolds have been designed to solve a wide range of computer vision and nature language processing tasks. One of the key factors that contribute to the success of these networks is that spherical and hyperbolic manifolds have the rich algebraic structures of gyrogroups and gyrovector spaces. This enables principled and effective generalizations of the most successful DNNs to these manifolds. Recently, some works have shown that many concepts in the theory of gyrogroups and gyrovector spaces can also be generalized to matrix manifolds such as Symmetric Positive Definite (SPD) and Grassmann manifolds. As a result, some building blocks for SPD and Grassmann neural networks, e.g., isometric models and multinomial logistic regression (MLR) can be derived in a way that is fully analogous to their spherical and hyperbolic counterparts. Building upon these works, we design fully-connected (FC) and convolutional layers for SPD neural networks. We also develop MLR on Symmetric Positive Semi-definite (SPSD) manifolds, and propose a method for performing backpropagation with the Grassmann logarithmic map in the projector perspective. We demonstrate the effectiveness of the proposed approach in the human action recognition and node classification tasks.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
Pegasus-v1 Technical Report
Authors:
Raehyuk Jung,
Hyojun Go,
Jaehyuk Yi,
Jiho Jang,
Daniel Kim,
Jay Suh,
Aiden Lee,
Cooper Han,
Jae Lee,
Jeff Kim,
Jin-Young Kim,
Junwan Kim,
Kyle Park,
Lucas Lee,
Mars Ha,
Minjoon Seo,
Abraham Jo,
Ed Park,
Hassan Kianinejad,
SJ Kim,
Tony Moon,
Wade Jeong,
Andrei Popescu,
Esther Kim,
EK Yoon
, et al. (19 additional authors not shown)
Abstract:
This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi…
▽ More
This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's architecture, training strategies, and its performance in benchmarks on video conversation, zero-shot video question answering, and video summarization. We also explore qualitative characteristics of Pegasus-1 , demonstrating its capabilities as well as its limitations, in order to provide readers a balanced view of its current state and its future direction.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
Evolutionary Multi-Objective Optimisation for Fairness-Aware Self Adjusting Memory Classifiers in Data Streams
Authors:
Pivithuru Thejan Amarasinghe,
Diem Pham,
Binh Tran,
Su Nguyen,
Yuan Sun,
Damminda Alahakoon
Abstract:
This paper introduces a novel approach, evolutionary multi-objective optimisation for fairness-aware self-adjusting memory classifiers, designed to enhance fairness in machine learning algorithms applied to data stream classification. With the growing concern over discrimination in algorithmic decision-making, particularly in dynamic data stream environments, there is a need for methods that ensur…
▽ More
This paper introduces a novel approach, evolutionary multi-objective optimisation for fairness-aware self-adjusting memory classifiers, designed to enhance fairness in machine learning algorithms applied to data stream classification. With the growing concern over discrimination in algorithmic decision-making, particularly in dynamic data stream environments, there is a need for methods that ensure fair treatment of individuals across sensitive attributes like race or gender. The proposed approach addresses this challenge by integrating the strengths of the self-adjusting memory K-Nearest-Neighbour algorithm with evolutionary multi-objective optimisation. This combination allows the new approach to efficiently manage concept drift in streaming data and leverage the flexibility of evolutionary multi-objective optimisation to maximise accuracy and minimise discrimination simultaneously. We demonstrate the effectiveness of the proposed approach through extensive experiments on various datasets, comparing its performance against several baseline methods in terms of accuracy and fairness metrics. Our results show that the proposed approach maintains competitive accuracy and significantly reduces discrimination, highlighting its potential as a robust solution for fairness-aware data stream classification. Further analyses also confirm the effectiveness of the strategies to trigger evolutionary multi-objective optimisation and adapt classifiers in the proposed approach.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
AirPilot: A PPO-based DRL Auto-Tuned Nonlinear PID Drone Controller for Robust Autonomous Flights
Authors:
Junyang Zhang,
Cristian Emanuel Ocampo Rivera,
Kyle Tyni,
Steven Nguyen,
Ulices Santa Cruz Leal,
Yasser Shoukry
Abstract:
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuab…
▽ More
Navigation precision, speed and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective flight mission executions in dynamic environments. Different flight missions may have varying objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed. A controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are one of the most popular and widely used control algorithms for drones and other control systems, but their linear control algorithm fails to capture the nonlinear nature of the dynamic wind conditions and complex drone system. Manually tuning the PID gains for various missions can be time-consuming and requires significant expertise. This paper aims to revolutionize drone flight control by presenting the AirPilot, a nonlinear Deep Reinforcement Learning (DRL) - enhanced Proportional Integral Derivative (PID) drone controller using Proximal Policy Optimization (PPO). AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL. This makes it better suited for modern drone applications where the environment is dynamic, and mission-specific performance demands are high. We employed a COEX Clover autonomous drone for training the DRL agent within the simulator and implemented it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller on an actual drone. Airpilot is capable of reducing the navigation error of the default PX4 PID position controller by 90%, improving effective navigation speed of a fine-tuned PID controller by 21%, reducing settling time and overshoot by 17% and 16% respectively.
△ Less
Submitted 22 August, 2024; v1 submitted 29 March, 2024;
originally announced April 2024.
-
CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models
Authors:
Son The Nguyen,
Niranjan Uma Naresh,
Theja Tulabandhula
Abstract:
This paper addresses the challenges of aligning large language models (LLMs) with human values via preference learning (PL), with a focus on the issues of incomplete and corrupted data in preference datasets. We propose a novel method for robustly and completely recalibrating values within these datasets to enhance LLMs resilience against the issues. In particular, we devise a guaranteed polynomia…
▽ More
This paper addresses the challenges of aligning large language models (LLMs) with human values via preference learning (PL), with a focus on the issues of incomplete and corrupted data in preference datasets. We propose a novel method for robustly and completely recalibrating values within these datasets to enhance LLMs resilience against the issues. In particular, we devise a guaranteed polynomial time ranking algorithm that robustifies several existing models, such as the classic Bradley--Terry--Luce (BTL) (Bradley and Terry, 1952) model and certain generalizations of it. To the best of our knowledge, our present work is the first to propose an algorithm that provably recovers an ε-optimal ranking with high probability while allowing as large as O(n) perturbed pairwise comparison results per model response. Furthermore, we show robust recovery results in the partially observed setting. Our experiments confirm that our algorithms handle adversarial noise and unobserved comparisons well in both general and LLM preference dataset settings. This work contributes to the development and scaling of more reliable and ethically aligned AI models by equipping the dataset curation pipeline with the ability to handle missing and maliciously manipulated inputs.
△ Less
Submitted 5 March, 2024;
originally announced March 2024.
-
Unsupervised Motion Retargeting for Human-Robot Imitation
Authors:
Louis Annabi,
Ziqi Ma,
Sao Mai Nguyen
Abstract:
This early-stage research work aims to improve online human-robot imitation by translating sequences of joint positions from the domain of human motions to a domain of motions achievable by a given robot, thus constrained by its embodiment. Leveraging the generalization capabilities of deep learning methods, we address this problem by proposing an encoder-decoder neural network model performing do…
▽ More
This early-stage research work aims to improve online human-robot imitation by translating sequences of joint positions from the domain of human motions to a domain of motions achievable by a given robot, thus constrained by its embodiment. Leveraging the generalization capabilities of deep learning methods, we address this problem by proposing an encoder-decoder neural network model performing domain-to-domain translation. In order to train such a model, one could use pairs of associated robot and human motions. Though, such paired data is extremely rare in practice, and tedious to collect. Therefore, we turn towards deep learning methods for unpaired domain-to-domain translation, that we adapt in order to perform human-robot imitation.
△ Less
Submitted 18 January, 2024;
originally announced February 2024.
-
Automated Description Generation for Software Patches
Authors:
Thanh Trong Vu,
Tuan-Dung Bui,
Thanh-Dat Do,
Thu-Trang Nguyen,
Hieu Dinh Vo,
Son Nguyen
Abstract:
Software patches are pivotal in refining and evolving codebases, addressing bugs, vulnerabilities, and optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, manual description creation poses challenges in terms of time consumption and variations in quality and detail. In this paper, we propose PATCHEXPLAINER, an ap…
▽ More
Software patches are pivotal in refining and evolving codebases, addressing bugs, vulnerabilities, and optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, manual description creation poses challenges in terms of time consumption and variations in quality and detail. In this paper, we propose PATCHEXPLAINER, an approach that addresses these challenges by framing patch description generation as a machine translation task. In PATCHEXPLAINER, we leverage explicit representations of critical elements, historical context, and syntactic conventions. Moreover, the translation model in PATCHEXPLAINER is designed with an awareness of description similarity. Particularly, the model is explicitly trained to recognize and incorporate similarities present in patch descriptions clustered into groups, improving its ability to generate accurate and consistent descriptions across similar patches. The dual objectives maximize similarity and accurately predict affiliating groups. Our experimental results on a large dataset of real-world software patches show that PATCHEXPLAINER consistently outperforms existing methods, with improvements up to 189% in BLEU, 5.7X in Exact Match rate, and 154% in Semantic Similarity, affirming its effectiveness in generating software patch descriptions.
△ Less
Submitted 26 July, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Verifiable evaluations of machine learning models using zkSNARKs
Authors:
Tobin South,
Alexander Camuto,
Shrey Jain,
Shayla Nguyen,
Robert Mahari,
Christian Paquin,
Jason Morton,
Alex 'Sandy' Pentland
Abstract:
In a world of increasing closed-source commercial machine learning models, model evaluations from developers must be taken at face value. These benchmark results-whether over task accuracy, bias evaluations, or safety checks-are traditionally impossible to verify by a model end-user without the costly or impossible process of re-performing the benchmark on black-box model outputs. This work presen…
▽ More
In a world of increasing closed-source commercial machine learning models, model evaluations from developers must be taken at face value. These benchmark results-whether over task accuracy, bias evaluations, or safety checks-are traditionally impossible to verify by a model end-user without the costly or impossible process of re-performing the benchmark on black-box model outputs. This work presents a method of verifiable model evaluation using model inference through zkSNARKs. The resulting zero-knowledge computational proofs of model outputs over datasets can be packaged into verifiable evaluation attestations showing that models with fixed private weights achieve stated performance or fairness metrics over public inputs. We present a flexible proving system that enables verifiable attestations to be performed on any standard neural network model with varying compute requirements. For the first time, we demonstrate this across a sample of real-world models and highlight key challenges and design solutions. This presents a new transparency paradigm in the verifiable evaluation of private models.
△ Less
Submitted 22 May, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
Prerequisite Structure Discovery in Intelligent Tutoring Systems
Authors:
Louis Annabi,
Sao Mai Nguyen
Abstract:
This paper addresses the importance of Knowledge Structure (KS) and Knowledge Tracing (KT) in improving the recommendation of educational content in intelligent tutoring systems. The KS represents the relations between different Knowledge Components (KCs), while KT predicts a learner's success based on her past history. The contribution of this research includes proposing a KT model that incorpora…
▽ More
This paper addresses the importance of Knowledge Structure (KS) and Knowledge Tracing (KT) in improving the recommendation of educational content in intelligent tutoring systems. The KS represents the relations between different Knowledge Components (KCs), while KT predicts a learner's success based on her past history. The contribution of this research includes proposing a KT model that incorporates the KS as a learnable parameter, enabling the discovery of the underlying KS from learner trajectories. The quality of the uncovered KS is assessed by using it to recommend content and evaluating the recommendation algorithm with simulated students.
△ Less
Submitted 18 January, 2024;
originally announced February 2024.
-
Genetic-based Constraint Programming for Resource Constrained Job Scheduling
Authors:
Su Nguyen,
Dhananjay Thiruvady,
Yuan Sun,
Mengjie Zhang
Abstract:
Resource constrained job scheduling is a hard combinatorial optimisation problem that originates in the mining industry. Off-the-shelf solvers cannot solve this problem satisfactorily in reasonable timeframes, while other solution methods such as many evolutionary computation methods and matheuristics cannot guarantee optimality and require low-level customisation and specialised heuristics to be…
▽ More
Resource constrained job scheduling is a hard combinatorial optimisation problem that originates in the mining industry. Off-the-shelf solvers cannot solve this problem satisfactorily in reasonable timeframes, while other solution methods such as many evolutionary computation methods and matheuristics cannot guarantee optimality and require low-level customisation and specialised heuristics to be effective. This paper addresses this gap by proposing a genetic programming algorithm to discover efficient search strategies of constraint programming for resource-constrained job scheduling. In the proposed algorithm, evolved programs represent variable selectors to be used in the search process of constraint programming, and their fitness is determined by the quality of solutions obtained for training instances. The novelties of this algorithm are (1) a new representation of variable selectors, (2) a new fitness evaluation scheme, and (3) a pre-selection mechanism. Tests with a large set of random and benchmark instances, the evolved variable selectors can significantly improve the efficiency of constraining programming. Compared to highly customised metaheuristics and hybrid algorithms, evolved variable selectors can help constraint programming identify quality solutions faster and proving optimality is possible if sufficiently large run-times are allowed. The evolved variable selectors are especially helpful when solving instances with large numbers of machines.
△ Less
Submitted 1 February, 2024;
originally announced February 2024.
-
Rigorous Error Analysis for Logarithmic Number Systems
Authors:
Thanh Son Nguyen,
Alexey Solovyev,
Ganesh Gopalakrishnan
Abstract:
Logarithmic Number Systems (LNS) hold considerable promise in helping reduce the number of bits needed to represent a high dynamic range of real-numbers with finite precision, and also efficiently support multiplication and division. However, under LNS, addition and subtraction turn into non-linear functions that must be approximated - typically using precomputed table-based functions. Additionall…
▽ More
Logarithmic Number Systems (LNS) hold considerable promise in helping reduce the number of bits needed to represent a high dynamic range of real-numbers with finite precision, and also efficiently support multiplication and division. However, under LNS, addition and subtraction turn into non-linear functions that must be approximated - typically using precomputed table-based functions. Additionally, multiple layers of error correction are typically needed to improve result accuracy. Unfortunately, previous efforts have not characterized the resulting error bound. We provide the first rigorous analysis of LNS, covering detailed techniques such as co-transformation that are crucial to implementing subtraction with reasonable accuracy. We provide theorems capturing the error due to table interpolations, the finite precision of pre-computed values in the tables, and the error introduced by fix-point multiplications involved in LNS implementations. We empirically validate our analysis using a Python implementation, showing that our analytical bounds are tight, and that our testing campaign generates inputs diverse-enough to almost match (but not exceed) the analytical bounds. We close with discussions on how to adapt our analysis to LNS systems with different bases and also discuss many pragmatic ramifications of our work in the broader arena of scientific computing and machine learning.
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
How Beginning Programmers and Code LLMs (Mis)read Each Other
Authors:
Sydney Nguyen,
Hannah McLean Babe,
Yangtian Zi,
Arjun Guha,
Carolyn Jane Anderson,
Molly Q Feldman
Abstract:
Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluat…
▽ More
Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluating the correctness of generated code, and editing prompts when the generated code is incorrect. This paper presents a large-scale controlled study of how 120 beginning coders across three academic institutions approach writing and editing prompts. A novel experimental design allows us to target specific steps in the text-to-code process and reveals that beginners struggle with writing and editing prompts, even for problems at their skill level and when correctness is automatically determined. Our mixed-methods evaluation provides insight into student processes and perceptions with key implications for non-expert Code LLM use within and outside of education.
△ Less
Submitted 7 July, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Reconciling Spatial and Temporal Abstractions for Goal Representation
Authors:
Mehdi Zadem,
Sergio Mover,
Sao Mai Nguyen
Abstract:
Goal representation affects the performance of Hierarchical Reinforcement Learning (HRL) algorithms by decomposing the complex learning problem into easier subtasks. Recent studies show that representations that preserve temporally abstract environment dynamics are successful in solving difficult problems and provide theoretical guarantees for optimality. These methods however cannot scale to task…
▽ More
Goal representation affects the performance of Hierarchical Reinforcement Learning (HRL) algorithms by decomposing the complex learning problem into easier subtasks. Recent studies show that representations that preserve temporally abstract environment dynamics are successful in solving difficult problems and provide theoretical guarantees for optimality. These methods however cannot scale to tasks where environment dynamics increase in complexity i.e. the temporally abstract transition relations depend on larger number of variables. On the other hand, other efforts have tried to use spatial abstraction to mitigate the previous issues. Their limitations include scalability to high dimensional environments and dependency on prior knowledge.
In this paper, we propose a novel three-layer HRL algorithm that introduces, at different levels of the hierarchy, both a spatial and a temporal goal abstraction. We provide a theoretical study of the regret bounds of the learned policies. We evaluate the approach on complex continuous control tasks, demonstrating the effectiveness of spatial and temporal abstractions learned by this approach. Find open-source code at https://github.com/cosynus-lix/STAR.
△ Less
Submitted 30 June, 2024; v1 submitted 18 January, 2024;
originally announced January 2024.
-
Analysis and Perspectives on the ANA Avatar XPRIZE Competition
Authors:
Kris Hauser,
Eleanor Watson,
Joonbum Bae,
Josh Bankston,
Sven Behnke,
Bill Borgia,
Manuel G. Catalano,
Stefano Dafarra,
Jan B. F. van Erp,
Thomas Ferris,
Jeremy Fishel,
Guy Hoffman,
Serena Ivaldi,
Fumio Kanehiro,
Abderrahmane Kheddar,
Gaelle Lannuzel,
Jacqueline Ford Morie,
Patrick Naughton,
Steve NGuyen,
Paul Oh,
Taskin Padir,
Jim Pippine,
Jaeheung Park,
Daniele Pucci,
Jean Vaz
, et al. (3 additional authors not shown)
Abstract:
The ANA Avatar XPRIZE was a four-year competition to develop a robotic "avatar" system to allow a human operator to sense, communicate, and act in a remote environment as though physically present. The competition featured a unique requirement that judges would operate the avatars after less than one hour of training on the human-machine interfaces, and avatar systems were judged on both objective…
▽ More
The ANA Avatar XPRIZE was a four-year competition to develop a robotic "avatar" system to allow a human operator to sense, communicate, and act in a remote environment as though physically present. The competition featured a unique requirement that judges would operate the avatars after less than one hour of training on the human-machine interfaces, and avatar systems were judged on both objective and subjective scoring metrics. This paper presents a unified summary and analysis of the competition from technical, judging, and organizational perspectives. We study the use of telerobotics technologies and innovations pursued by the competing teams in their avatar systems, and correlate the use of these technologies with judges' task performance and subjective survey ratings. It also summarizes perspectives from team leads, judges, and organizers about the competition's execution and impact to inform the future development of telerobotics and telepresence.
△ Less
Submitted 10 January, 2024;
originally announced January 2024.
-
DiffSpectralNet : Unveiling the Potential of Diffusion Models for Hyperspectral Image Classification
Authors:
Neetu Sigger,
Tuan Thanh Nguyen,
Gianluca Tozzi,
Quoc-Tuan Vien,
Sinh Van Nguyen
Abstract:
Hyperspectral images (HSI) have become popular for analysing remotely sensed images in multiple domain like agriculture, medical. However, existing models struggle with complex relationships and characteristics of spectral-spatial data due to the multi-band nature and data redundancy of hyperspectral data. To address this limitation, we propose a new network called DiffSpectralNet, which combines…
▽ More
Hyperspectral images (HSI) have become popular for analysing remotely sensed images in multiple domain like agriculture, medical. However, existing models struggle with complex relationships and characteristics of spectral-spatial data due to the multi-band nature and data redundancy of hyperspectral data. To address this limitation, we propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques. Our approach involves a two-step process. First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features. The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification. Then, we employ a pretrained denoising U-Net to extract intermediate hierarchical features for classification. Finally, we use a supervised transformer-based classifier to perform the HSI classification. Through comprehensive experiments on HSI datasets, we evaluate the classification performance of DiffSpectralNet. The results demonstrate that our framework significantly outperforms existing approaches, achieving state-of-the-art performance.
△ Less
Submitted 29 October, 2023;
originally announced December 2023.
-
User Friendly and Adaptable Discriminative AI: Using the Lessons from the Success of LLMs and Image Generation Models
Authors:
Son The Nguyen,
Theja Tulabandhula,
Mary Beth Watson-Manheim
Abstract:
While there is significant interest in using generative AI tools as general-purpose models for specific ML applications, discriminative models are much more widely deployed currently. One of the key shortcomings of these discriminative AI tools that have been already deployed is that they are not adaptable and user-friendly compared to generative AI tools (e.g., GPT4, Stable Diffusion, Bard, etc.)…
▽ More
While there is significant interest in using generative AI tools as general-purpose models for specific ML applications, discriminative models are much more widely deployed currently. One of the key shortcomings of these discriminative AI tools that have been already deployed is that they are not adaptable and user-friendly compared to generative AI tools (e.g., GPT4, Stable Diffusion, Bard, etc.), where a non-expert user can iteratively refine model inputs and give real-time feedback that can be accounted for immediately, allowing users to build trust from the start. Inspired by this emerging collaborative workflow, we develop a new system architecture that enables users to work with discriminative models (such as for object detection, sentiment classification, etc.) in a fashion similar to generative AI tools, where they can easily provide immediate feedback as well as adapt the deployed models as desired. Our approach has implications on improving trust, user-friendliness, and adaptability of these versatile but traditional prediction models.
△ Less
Submitted 11 December, 2023;
originally announced December 2023.
-
FocusTune: Tuning Visual Localization through Focus-Guided Sampling
Authors:
Son Tung Nguyen,
Alejandro Fontan,
Michael Milford,
Tobias Fischer
Abstract:
We propose FocusTune, a focus-guided sampling technique to improve the performance of visual localization algorithms. FocusTune directs a scene coordinate regression model towards regions critical for 3D point triangulation by exploiting key geometric constraints. Specifically, rather than uniformly sampling points across the image for training the scene coordinate regression model, we instead re-…
▽ More
We propose FocusTune, a focus-guided sampling technique to improve the performance of visual localization algorithms. FocusTune directs a scene coordinate regression model towards regions critical for 3D point triangulation by exploiting key geometric constraints. Specifically, rather than uniformly sampling points across the image for training the scene coordinate regression model, we instead re-project 3D scene coordinates onto the 2D image plane and sample within a local neighborhood of the re-projected points. While our proposed sampling strategy is generally applicable, we showcase FocusTune by integrating it with the recently introduced Accelerated Coordinate Encoding (ACE) model. Our results demonstrate that FocusTune both improves or matches state-of-the-art performance whilst keeping ACE's appealing low storage and compute requirements, for example reducing translation error from 25 to 19 and 17 to 15 cm for single and ensemble models, respectively, on the Cambridge Landmarks dataset. This combination of high performance and low compute and storage requirements is particularly promising for applications in areas like mobile robotics and augmented reality. We made our code available at \url{https://github.com/sontung/focus-tune}.
△ Less
Submitted 5 November, 2023;
originally announced November 2023.
-
Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation
Authors:
Son Nguyen,
Mikel Lainsa,
Hung Dao,
Daeyoung Kim,
Giang Nguyen
Abstract:
Occlusion is a long-standing problem in computer vision, particularly in instance segmentation. ACM MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context and a specialized evaluation metric for occlusion scenarios. Given the modest size of the dataset and the highly deformable nature of the objects to be segmented, this challeng…
▽ More
Occlusion is a long-standing problem in computer vision, particularly in instance segmentation. ACM MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context and a specialized evaluation metric for occlusion scenarios. Given the modest size of the dataset and the highly deformable nature of the objects to be segmented, this challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures. Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution. Then, we adopt a new architecture - Hybrid Task Cascade (HTC) framework with CBNetV2 as backbone and MaskIoU head to improve segmentation performance. Furthermore, we employ a Stochastic Weight Averaging (SWA) training strategy to improve the model's generalization. As a result, we achieve a remarkable occlusion score (OM) of 0.533 on the challenge dataset, securing the top-1 position on the leaderboard. Source code is available at this https://github.com/nguyendinhson-kaist/MMSports23-Seg-AutoID.
△ Less
Submitted 21 November, 2023; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Filling the Holes on 3D Heritage Object Surface based on Automatic Segmentation Algorithm
Authors:
Sinh Van Nguyen,
Son Thanh Le,
Minh Khai Tran,
Le Thanh Sach
Abstract:
Reconstructing and processing the 3D objects are popular activities in the research field of computer graphics, image processing and computer vision. The 3D objects are processed based on the methods like geometric modeling, a branch of applied mathematics and computational geometry, or the machine learning algorithms based on image processing. The computation of geometrical objects includes proce…
▽ More
Reconstructing and processing the 3D objects are popular activities in the research field of computer graphics, image processing and computer vision. The 3D objects are processed based on the methods like geometric modeling, a branch of applied mathematics and computational geometry, or the machine learning algorithms based on image processing. The computation of geometrical objects includes processing the curves and surfaces, subdivision, simplification, meshing, holes filling, reconstructing, and refining the 3D surface objects on both point cloud data and triangular mesh. While the machine learning methods are developed using deep learning models. With the support of 3D laser scan devices and Lidar techniques, the obtained dataset is close to original shape of the real objects. Besides, the photography and its application based on the modern techniques in recent years help us collect data and process the 3D models more precise. This article proposes an improved method for filling holes on the 3D object surface based on an automatic segmentation. Instead of filling the hole directly as the existing methods, we now subdivide the hole before filling it. The hole is first determined and segmented automatically based on computation of its local curvature. It is then filled on each part of the hole to match its local curvature shape. The method can work on both 3D point cloud surfaces and triangular mesh surface. Comparing to the state of the art methods, our proposed method obtained higher accuracy of the reconstructed 3D objects.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
Fairness-enhancing mixed effects deep learning improves fairness on in- and out-of-distribution clustered (non-iid) data
Authors:
Adam Wang,
Son Nguyen,
Albert Montillo
Abstract:
Traditional deep learning (DL) suffers from two core problems. Firstly, it assumes training samples are independent and identically distributed. However, numerous real-world datasets group samples by shared measurements (e.g., study participants or cells), violating this assumption. In these scenarios, DL can show compromised performance, limited generalization, and interpretability issues, couple…
▽ More
Traditional deep learning (DL) suffers from two core problems. Firstly, it assumes training samples are independent and identically distributed. However, numerous real-world datasets group samples by shared measurements (e.g., study participants or cells), violating this assumption. In these scenarios, DL can show compromised performance, limited generalization, and interpretability issues, coupled with cluster confounding causing Type 1 and 2 errors. Secondly, models are typically trained for overall accuracy, often neglecting underrepresented groups and introducing biases in crucial areas like loan approvals or determining health insurance rates, such biases can significantly impact one's quality of life. To address both of these challenges simultaneously, we present a mixed effects deep learning (MEDL) framework. MEDL separately quantifies cluster-invariant fixed effects (FE) and cluster-specific random effects (RE) through the introduction of: 1) a cluster adversary which encourages the learning of cluster-invariant FE, 2) a Bayesian neural network which quantifies the RE, and a mixing function combining the FE an RE into a mixed-effect prediction. We marry this MEDL with adversarial debiasing, which promotes equality-of-odds fairness across FE, RE, and ME predictions for fairness-sensitive variables. We evaluated our approach using three datasets: two from census/finance focusing on income classification and one from healthcare predicting hospitalization duration, a regression task. Our framework notably enhances fairness across all sensitive variables-increasing fairness up to 82% for age, 43% for race, 86% for sex, and 27% for marital-status. Besides promoting fairness, our method maintains the robust performance and clarity of MEDL. It's versatile, suitable for various dataset types and tasks, making it broadly applicable. Our GitHub repository houses the implementation.
△ Less
Submitted 4 October, 2023;
originally announced October 2023.
-
AI-Copilot for Business Optimisation: A Framework and A Case Study in Production Scheduling
Authors:
Pivithuru Thejan Amarasinghe,
Su Nguyen,
Yuan Sun,
Damminda Alahakoon
Abstract:
Business optimisation refers to the process of finding and implementing efficient and cost-effective means of operation to bring a competitive advantage for businesses. Synthesizing problem formulations is an integral part of business optimisation, which relies on human expertise to construct problem formulations using optimisation languages. Interestingly, with advancements in Large Language Mode…
▽ More
Business optimisation refers to the process of finding and implementing efficient and cost-effective means of operation to bring a competitive advantage for businesses. Synthesizing problem formulations is an integral part of business optimisation, which relies on human expertise to construct problem formulations using optimisation languages. Interestingly, with advancements in Large Language Models (LLMs), the human expertise needed in problem formulation can be minimized. However, developing an LLM for problem formulation is challenging, due to training data, token limitations, and lack of appropriate performance metrics. For the requirement of training data, recent attention has been directed towards fine-tuning pre-trained LLMs for downstream tasks rather than training an LLM from scratch for a specific task. In this paper, we adopt an LLM fine-tuning approach and propose an AI-Copilot for business optimisation problem formulation. For token limitations, we introduce modularization and prompt engineering techniques to synthesize complex problem formulations as modules that fit into the token limits of LLMs. Additionally, we design performance evaluation metrics that are better suited for assessing the accuracy and quality of problem formulations. The experiment results demonstrate that with this approach we can synthesize complex and large problem formulations for a typical business optimisation problem in production scheduling.
△ Less
Submitted 18 October, 2023; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Dynamic Tiling: A Model-Agnostic, Adaptive, Scalable, and Inference-Data-Centric Approach for Efficient and Accurate Small Object Detection
Authors:
Son The Nguyen,
Theja Tulabandhula,
Duy Nguyen
Abstract:
We introduce Dynamic Tiling, a model-agnostic, adaptive, and scalable approach for small object detection, anchored in our inference-data-centric philosophy. Dynamic Tiling starts with non-overlapping tiles for initial detections and utilizes dynamic overlapping rates along with a tile minimizer. This dual approach effectively resolves fragmented objects, improves detection accuracy, and minimizes…
▽ More
We introduce Dynamic Tiling, a model-agnostic, adaptive, and scalable approach for small object detection, anchored in our inference-data-centric philosophy. Dynamic Tiling starts with non-overlapping tiles for initial detections and utilizes dynamic overlapping rates along with a tile minimizer. This dual approach effectively resolves fragmented objects, improves detection accuracy, and minimizes computational overhead by reducing the number of forward passes through the object detection model. Adaptable to a variety of operational environments, our method negates the need for laborious recalibration. Additionally, our large-small filtering mechanism boosts the detection quality across a range of object sizes. Overall, Dynamic Tiling outperforms existing model-agnostic uniform cropping methods, setting new benchmarks for efficiency and accuracy.
△ Less
Submitted 20 September, 2023;
originally announced September 2023.
-
Silent Vulnerability-fixing Commit Identification Based on Graph Neural Networks
Authors:
Hieu Dinh Vo,
Thanh Trong Vu,
Son Nguyen
Abstract:
The growing dependence of software projects on external libraries has generated apprehensions regarding the security of these libraries because of concealed vulnerabilities. Handling these vulnerabilities presents difficulties due to the temporal delay between remediation and public exposure. Furthermore, a substantial fraction of open-source projects covertly address vulnerabilities without any f…
▽ More
The growing dependence of software projects on external libraries has generated apprehensions regarding the security of these libraries because of concealed vulnerabilities. Handling these vulnerabilities presents difficulties due to the temporal delay between remediation and public exposure. Furthermore, a substantial fraction of open-source projects covertly address vulnerabilities without any formal notification, influencing vulnerability management. Established solutions like OWASP predominantly hinge on public announcements, limiting their efficacy in uncovering undisclosed vulnerabilities. To address this challenge, the automated identification of vulnerability-fixing commits has come to the forefront. In this paper, we present VFFINDER, a novel graph-based approach for automated silent vulnerability fix identification. VFFINDER captures structural changes using Abstract Syntax Trees (ASTs) and represents them in annotated ASTs. To precisely capture the meaning of code changes, the changed code is represented in connection with the related unchanged code. In VFFINDER, the structure of the changed code and related unchanged code are captured and the structural changes are represented in annotated Abstract Syntax Trees (aAST). VFFINDER distinguishes vulnerability-fixing commits from non-fixing ones using attention-based graph neural network models to extract structural features expressed in aASTs. We conducted experiments to evaluate VFFINDER on a dataset of 11K+ vulnerability fixing commits in 507 real-world C/C++ projects. Our results show that VFFINDER significantly improves the state-of-the-art methods by 272-420% in Precision, 22-70% in Recall, and 3.2X-8.2X in F1. Especially, VFFINDER speeds up the silent fix identification process by up to 121% with the same effort reviewing 50K LOC compared to the existing approaches.
△ Less
Submitted 15 September, 2023;
originally announced September 2023.
-
Goal Space Abstraction in Hierarchical Reinforcement Learning via Set-Based Reachability Analysis
Authors:
Mehdi Zadem,
Sergio Mover,
Sao Mai Nguyen
Abstract:
Open-ended learning benefits immensely from the use of symbolic methods for goal representation as they offer ways to structure knowledge for efficient and transferable learning. However, the existing Hierarchical Reinforcement Learning (HRL) approaches relying on symbolic reasoning are often limited as they require a manual goal representation. The challenge in autonomously discovering a symbolic…
▽ More
Open-ended learning benefits immensely from the use of symbolic methods for goal representation as they offer ways to structure knowledge for efficient and transferable learning. However, the existing Hierarchical Reinforcement Learning (HRL) approaches relying on symbolic reasoning are often limited as they require a manual goal representation. The challenge in autonomously discovering a symbolic goal representation is that it must preserve critical information, such as the environment dynamics. In this paper, we propose a developmental mechanism for goal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states that have similar roles in the task. We introduce a Feudal HRL algorithm that concurrently learns both the goal representation and a hierarchical policy. The algorithm uses symbolic reachability analysis for neural networks to approximate the transition relation among sets of states and to refine the goal representation. We evaluate our approach on complex navigation tasks, showing the learned representation is interpretable, transferrable and results in data efficient learning.
△ Less
Submitted 22 November, 2023; v1 submitted 14 September, 2023;
originally announced September 2023.
-
Goal Space Abstraction in Hierarchical Reinforcement Learning via Reachability Analysis
Authors:
Mehdi Zadem,
Sergio Mover,
Sao Mai Nguyen
Abstract:
Open-ended learning benefits immensely from the use of symbolic methods for goal representation as they offer ways to structure knowledge for efficient and transferable learning. However, the existing Hierarchical Reinforcement Learning (HRL) approaches relying on symbolic reasoning are often limited as they require a manual goal representation. The challenge in autonomously discovering a symbolic…
▽ More
Open-ended learning benefits immensely from the use of symbolic methods for goal representation as they offer ways to structure knowledge for efficient and transferable learning. However, the existing Hierarchical Reinforcement Learning (HRL) approaches relying on symbolic reasoning are often limited as they require a manual goal representation. The challenge in autonomously discovering a symbolic goal representation is that it must preserve critical information, such as the environment dynamics. In this work, we propose a developmental mechanism for subgoal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states that have similar roles in the task. We create a HRL algorithm that gradually learns this representation along with the policies and evaluate it on navigation tasks to show the learned representation is interpretable and results in data efficiency.
△ Less
Submitted 12 September, 2023;
originally announced September 2023.
-
Companion Animal Disease Diagnostics based on Literal-aware Medical Knowledge Graph Representation Learning
Authors:
Van Thuy Hoang,
Sang Thanh Nguyen,
Sangmyeong Lee,
Jooho Lee,
Luong Vuong Nguyen,
O-Joun Lee
Abstract:
Knowledge graph (KG) embedding has been used to benefit the diagnosis of animal diseases by analyzing electronic medical records (EMRs), such as notes and veterinary records. However, learning representations to capture entities and relations with literal information in KGs is challenging as the KGs show heterogeneous properties and various types of literal information. Meanwhile, the existing met…
▽ More
Knowledge graph (KG) embedding has been used to benefit the diagnosis of animal diseases by analyzing electronic medical records (EMRs), such as notes and veterinary records. However, learning representations to capture entities and relations with literal information in KGs is challenging as the KGs show heterogeneous properties and various types of literal information. Meanwhile, the existing methods mostly aim to preserve graph structures surrounding target nodes without considering different types of literals, which could also carry significant information. In this paper, we propose a knowledge graph embedding model for the efficient diagnosis of animal diseases, which could learn various types of literal information and graph structure and fuse them into unified representations, namely LiteralKG. Specifically, we construct a knowledge graph that is built from EMRs along with literal information collected from various animal hospitals. We then fuse different types of entities and node feature information into unified vector representations through gate networks. Finally, we propose a self-supervised learning task to learn graph structure in pretext tasks and then towards various downstream tasks. Experimental results on link prediction tasks demonstrate that our model outperforms the baselines that consist of state-of-the-art models. The source code is available at https://github.com/NSLab-CUK/LiteralKG.
△ Less
Submitted 31 August, 2023;
originally announced September 2023.
-
VFFINDER: A Graph-based Approach for Automated Silent Vulnerability-Fix Identification
Authors:
Son Nguyen,
Thanh Trong Vu,
Hieu Dinh Vo
Abstract:
The increasing reliance of software projects on third-party libraries has raised concerns about the security of these libraries due to hidden vulnerabilities. Managing these vulnerabilities is challenging due to the time gap between fixes and public disclosures. Moreover, a significant portion of open-source projects silently fix vulnerabilities without disclosure, impacting vulnerability manageme…
▽ More
The increasing reliance of software projects on third-party libraries has raised concerns about the security of these libraries due to hidden vulnerabilities. Managing these vulnerabilities is challenging due to the time gap between fixes and public disclosures. Moreover, a significant portion of open-source projects silently fix vulnerabilities without disclosure, impacting vulnerability management. Existing tools like OWASP heavily rely on public disclosures, hindering their effectiveness in detecting unknown vulnerabilities. To tackle this problem, automated identification of vulnerability-fixing commits has emerged. However, identifying silent vulnerability fixes remains challenging. This paper presents VFFINDER, a novel graph-based approach for automated silent vulnerability fix identification. VFFINDER captures structural changes using Abstract Syntax Trees (ASTs) and represents them in annotated ASTs. VFFINDER distinguishes vulnerability-fixing commits from non-fixing ones using attention-based graph neural network models to extract structural features. We conducted experiments to evaluate VFFINDER on a dataset of 36K+ fixing and non-fixing commits in 507 real-world C/C++ projects. Our results show that VFFINDER significantly improves the state-of-the-art methods by 39-83% in Precision, 19-148% in Recall, and 30-109% in F1. Especially, VFFINDER speeds up the silent fix identification process by up to 47% with the same review effort of 5% compared to the existing approaches.
△ Less
Submitted 5 September, 2023;
originally announced September 2023.
-
Generative AI for Business Strategy: Using Foundation Models to Create Business Strategy Tools
Authors:
Son The Nguyen,
Theja Tulabandhula
Abstract:
Generative models (foundation models) such as LLMs (large language models) are having a large impact on multiple fields. In this work, we propose the use of such models for business decision making. In particular, we combine unstructured textual data sources (e.g., news data) with multiple foundation models (namely, GPT4, transformer-based Named Entity Recognition (NER) models and Entailment-based…
▽ More
Generative models (foundation models) such as LLMs (large language models) are having a large impact on multiple fields. In this work, we propose the use of such models for business decision making. In particular, we combine unstructured textual data sources (e.g., news data) with multiple foundation models (namely, GPT4, transformer-based Named Entity Recognition (NER) models and Entailment-based Zero-shot Classifiers (ZSC)) to derive IT (information technology) artifacts in the form of a (sequence of) signed business networks. We posit that such artifacts can inform business stakeholders about the state of the market and their own positioning as well as provide quantitative insights into improving their future outlook.
△ Less
Submitted 27 August, 2023;
originally announced August 2023.
-
Reinforcement Learning -based Adaptation and Scheduling Methods for Multi-source DASH
Authors:
Nghia T. Nguyen,
Long Luu,
Phuong L. Vo,
Thi Thanh Sang Nguyen,
Cuong T. Do,
Ngoc-thanh Nguyen
Abstract:
Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today networks such as content delivery…
▽ More
Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today networks such as content delivery networks, edge caching networks, content-centric networks,... usually replicate video contents on multiple cache nodes. We study video streaming from multiple sources in this work. In multi-source streaming, video chunks may arrive out of order due to different conditions of the network paths. Hence, to guarantee a high QoE, the video client needs not only rate adaptation but also chunk scheduling. Reinforcement learning (RL) has emerged as the state-of-the-art control method in various fields in recent years. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS). We also build a simulation environment for training and evaluating. The efficiency of the proposed algorithms is proved via extensive simulations with real-trace data.
△ Less
Submitted 25 July, 2023;
originally announced August 2023.
-
Can An Old Fashioned Feature Extraction and A Light-weight Model Improve Vulnerability Type Identification Performance?
Authors:
Hieu Dinh Vo,
Son Nguyen
Abstract:
Recent advances in automated vulnerability detection have achieved potential results in helping developers determine vulnerable components. However, after detecting vulnerabilities, investigating to fix vulnerable code is a non-trivial task. In fact, the types of vulnerability, such as buffer overflow or memory corruption, could help developers quickly understand the nature of the weaknesses and l…
▽ More
Recent advances in automated vulnerability detection have achieved potential results in helping developers determine vulnerable components. However, after detecting vulnerabilities, investigating to fix vulnerable code is a non-trivial task. In fact, the types of vulnerability, such as buffer overflow or memory corruption, could help developers quickly understand the nature of the weaknesses and localize vulnerabilities for security analysis. In this work, we investigate the problem of vulnerability type identification (VTI). The problem is modeled as the multi-label classification task, which could be effectively addressed by "pre-training, then fine-tuning" framework with deep pre-trained embedding models. We evaluate the performance of the well-known and advanced pre-trained models for VTI on a large set of vulnerabilities. Surprisingly, their performance is not much better than that of the classical baseline approach with an old-fashioned bag-of-word, TF-IDF. Meanwhile, these deep neural network approaches cost much more resources and require GPU. We also introduce a lightweight independent component to refine the predictions of the baseline approach. Our idea is that the types of vulnerabilities could strongly correlate to certain code tokens (distinguishing tokens) in several crucial parts of programs. The distinguishing tokens for each vulnerability type are statistically identified based on their prevalence in the type versus the others. Our results show that the baseline approach enhanced by our component can outperform the state-of-the-art deep pre-trained approaches while retaining very high efficiency. Furthermore, the proposed component could also improve the neural network approaches by up to 92.8% in macro-average F1.
△ Less
Submitted 26 June, 2023;
originally announced June 2023.
-
ARIST: An Effective API Argument Recommendation Approach
Authors:
Son Nguyen,
Cuong Tran Manh,
Kien T. Tran,
Tan M. Nguyen,
Thu-Trang Nguyen,
Kien-Tuan Ngo,
Hieu Dinh Vo
Abstract:
Learning and remembering to use APIs are difficult. Several techniques have been proposed to assist developers in using APIs. Most existing techniques focus on recommending the right API methods to call, but very few techniques focus on recommending API arguments. In this paper, we propose ARIST, a novel automated argument recommendation approach which suggests arguments by predicting developers'…
▽ More
Learning and remembering to use APIs are difficult. Several techniques have been proposed to assist developers in using APIs. Most existing techniques focus on recommending the right API methods to call, but very few techniques focus on recommending API arguments. In this paper, we propose ARIST, a novel automated argument recommendation approach which suggests arguments by predicting developers' expectations when they define and use API methods. To implement this idea in the recommendation process, ARIST combines program analysis (PA), language models (LMs), and several features specialized for the recommendation task which consider the functionality of formal parameters and the positional information of code elements (e.g., variables or method calls) in the given context. In ARIST, the LMs and the recommending features are used to suggest the promising candidates identified by PA. Meanwhile, PA navigates the LMs and the features working on the set of the valid candidates which satisfy syntax, accessibility, and type-compatibility constraints defined by the programming language in use. Our evaluation on a large dataset of real-world projects shows that ARIST improves the state-of-the-art approach by 19% and 18% in top-1 precision and recall for recommending arguments of frequently-used libraries. For general argument recommendation task, i.e., recommending arguments for every method call, ARIST outperforms the baseline approaches by up to 125% top-1 accuracy. Moreover, for newly-encountered projects, ARIST achieves more than 60% top-3 accuracy when evaluating on a larger dataset. For working/maintaining projects, with a personalized LM to capture developers' coding practice, ARIST can productively rank the expected arguments at the top-1 position in 7/10 requests.
△ Less
Submitted 11 June, 2023;
originally announced June 2023.
-
StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code
Authors:
Hannah McLean Babe,
Sydney Nguyen,
Yangtian Zi,
Arjun Guha,
Molly Q Feldman,
Carolyn Jane Anderson
Abstract:
Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. Stud…
▽ More
Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. StudentEval contains 1,749 prompts for 48 problems, written by 80 students who have only completed one semester of Python programming. Our students wrote these prompts while working interactively with a Code LLM, and we observed very mixed success rates. We use StudentEval to evaluate 5 Code LLMs and find that StudentEval is a better discriminator of model performance than existing benchmarks. We analyze the prompts and find significant variation in students' prompting techniques. We also find that nondeterministic LLM sampling could mislead students into thinking that their prompts are more (or less) effective than they actually are, which has implications for how to teach with Code LLMs.
△ Less
Submitted 7 June, 2023;
originally announced June 2023.
-
Building Neural Networks on Matrix Manifolds: A Gyrovector Space Approach
Authors:
Xuan Son Nguyen,
Shuo Yang
Abstract:
Matrix manifolds, such as manifolds of Symmetric Positive Definite (SPD) matrices and Grassmann manifolds, appear in many applications. Recently, by applying the theory of gyrogroups and gyrovector spaces that is a powerful framework for studying hyperbolic geometry, some works have attempted to build principled generalizations of Euclidean neural networks on matrix manifolds. However, due to the…
▽ More
Matrix manifolds, such as manifolds of Symmetric Positive Definite (SPD) matrices and Grassmann manifolds, appear in many applications. Recently, by applying the theory of gyrogroups and gyrovector spaces that is a powerful framework for studying hyperbolic geometry, some works have attempted to build principled generalizations of Euclidean neural networks on matrix manifolds. However, due to the lack of many concepts in gyrovector spaces for the considered manifolds, e.g., the inner product and gyroangles, techniques and mathematical tools provided by these works are still limited compared to those developed for studying hyperbolic geometry. In this paper, we generalize some notions in gyrovector spaces for SPD and Grassmann manifolds, and propose new models and layers for building neural networks on these manifolds. We show the effectiveness of our approach in two applications, i.e., human action recognition and knowledge graph completion.
△ Less
Submitted 5 June, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Connector 0.5: A unified framework for graph representation learning
Authors:
Thanh Sang Nguyen,
Jooho Lee,
Van Thuy Hoang,
O-Joun Lee
Abstract:
Graph representation learning models aim to represent the graph structure and its features into low-dimensional vectors in a latent space, which can benefit various downstream tasks, such as node classification and link prediction. Due to its powerful graph data modelling capabilities, various graph embedding models and libraries have been proposed to learn embeddings and help researchers ease con…
▽ More
Graph representation learning models aim to represent the graph structure and its features into low-dimensional vectors in a latent space, which can benefit various downstream tasks, such as node classification and link prediction. Due to its powerful graph data modelling capabilities, various graph embedding models and libraries have been proposed to learn embeddings and help researchers ease conducting experiments. In this paper, we introduce a novel graph representation framework covering various graph embedding models, ranging from shallow to state-of-the-art models, namely Connector. First, we consider graph generation by constructing various types of graphs with different structural relations, including homogeneous, signed, heterogeneous, and knowledge graphs. Second, we introduce various graph representation learning models, ranging from shallow to deep graph embedding models. Finally, we plan to build an efficient open-source framework that can provide deep graph embedding models to represent structural relations in graphs. The framework is available at https://github.com/NSLab-CUK/Connector.
△ Less
Submitted 25 April, 2023;
originally announced April 2023.
-
Code-centric Learning-based Just-In-Time Vulnerability Detection
Authors:
Son Nguyen,
Thu-Trang Nguyen,
Thanh Trong Vu,
Thanh-Dat Do,
Kien-Tuan Ngo,
Hieu Dinh Vo
Abstract:
Attacks against computer systems exploiting software vulnerabilities can cause substantial damage to the cyber-infrastructure of our modern society and economy. To minimize the consequences, it is vital to detect and fix vulnerabilities as soon as possible. Just-in-time vulnerability detection (JIT-VD) discovers vulnerability-prone ("dangerous") commits to prevent them from being merged into sourc…
▽ More
Attacks against computer systems exploiting software vulnerabilities can cause substantial damage to the cyber-infrastructure of our modern society and economy. To minimize the consequences, it is vital to detect and fix vulnerabilities as soon as possible. Just-in-time vulnerability detection (JIT-VD) discovers vulnerability-prone ("dangerous") commits to prevent them from being merged into source code and causing vulnerabilities. By JIT-VD, the commits' authors, who understand the commits properly, can review these dangerous commits and fix them if necessary while the relevant modifications are still fresh in their minds. In this paper, we propose CodeJIT, a novel code-centric learning-based approach for just-in-time vulnerability detection. The key idea of CodeJIT is that the meaning of the code changes of a commit is the direct and deciding factor for determining if the commit is dangerous for the code. Based on that idea, we design a novel graph-based representation to represent the semantics of code changes in terms of both code structures and program dependencies. A graph neural network model is developed to capture the meaning of the code changes represented by our graph-based representation and learn to discriminate between dangerous and safe commits. We conducted experiments to evaluate the JIT-VD performance of CodeJIT on a dataset of 20K+ dangerous and safe commits in 506 real-world projects from 1998 to 2022. Our results show that CodeJIT significantly improves the state-of-the-art JIT-VD methods by up to 66% in Recall, 136% in Precision, and 68% in F1. Moreover, CodeJIT correctly classifies nearly 9/10 of dangerous/safe (benign) commits and even detects 69 commits that fix a vulnerability yet produce other issues in source code
△ Less
Submitted 17 April, 2023;
originally announced April 2023.
-
Multisensor Data Fusion for Reliable Obstacle Avoidance
Authors:
Thanh Nguyen Canh,
Truong Son Nguyen,
Cong Hoang Quach,
Xiem HoangVan,
Manh Duong Phung
Abstract:
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection…
▽ More
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slide around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.
△ Less
Submitted 26 December, 2022;
originally announced December 2022.
-
Enhancing Constraint Programming via Supervised Learning for Job Shop Scheduling
Authors:
Yuan Sun,
Su Nguyen,
Dhananjay Thiruvady,
Xiaodong Li,
Andreas T. Ernst,
Uwe Aickelin
Abstract:
Constraint programming (CP) is a powerful technique for solving constraint satisfaction and optimization problems. In CP solvers, the variable ordering strategy used to select which variable to explore first in the solving process has a significant impact on solver effectiveness. To address this issue, we propose a novel variable ordering strategy based on supervised learning, which we evaluate in…
▽ More
Constraint programming (CP) is a powerful technique for solving constraint satisfaction and optimization problems. In CP solvers, the variable ordering strategy used to select which variable to explore first in the solving process has a significant impact on solver effectiveness. To address this issue, we propose a novel variable ordering strategy based on supervised learning, which we evaluate in the context of job shop scheduling problems. Our learning-based methods predict the optimal solution of a problem instance and use the predicted solution to order variables for CP solvers. \added[]{Unlike traditional variable ordering methods, our methods can learn from the characteristics of each problem instance and customize the variable ordering strategy accordingly, leading to improved solver performance.} Our experiments demonstrate that training machine learning models is highly efficient and can achieve high accuracy. Furthermore, our learned variable ordering methods perform competitively when compared to four existing methods. Finally, we demonstrate that hybridising the machine learning-based variable ordering methods with traditional domain-based methods is beneficial.
△ Less
Submitted 12 April, 2023; v1 submitted 26 November, 2022;
originally announced November 2022.
-
Speeding Up Recommender Systems Using Association Rules
Authors:
Eyad Kannout,
Hung Son Nguyen,
Marek Grzegorowski
Abstract:
Recommender systems are considered one of the most rapidly growing branches of Artificial Intelligence. The demand for finding more efficient techniques to generate recommendations becomes urgent. However, many recommendations become useless if there is a delay in generating and showing them to the user. Therefore, we focus on improving the speed of recommendation systems without impacting the acc…
▽ More
Recommender systems are considered one of the most rapidly growing branches of Artificial Intelligence. The demand for finding more efficient techniques to generate recommendations becomes urgent. However, many recommendations become useless if there is a delay in generating and showing them to the user. Therefore, we focus on improving the speed of recommendation systems without impacting the accuracy. In this paper, we suggest a novel recommender system based on Factorization Machines and Association Rules (FMAR). We introduce an approach to generate association rules using two algorithms: (i) apriori and (ii) frequent pattern (FP) growth. These association rules will be utilized to reduce the number of items passed to the factorization machines recommendation model. We show that FMAR has significantly decreased the number of new items that the recommender system has to predict and hence, decreased the required time for generating the recommendations. On the other hand, while building the FMAR tool, we concentrate on making a balance between prediction time and accuracy of generated recommendations to ensure that the accuracy is not significantly impacted compared to the accuracy of using factorization machines without association rules.
△ Less
Submitted 16 November, 2022;
originally announced November 2022.
-
Adaptive Population-based Simulated Annealing for Uncertain Resource Constrained Job Scheduling
Authors:
Dhananjay Thiruvady,
Su Nguyen,
Yuan Sun,
Fatemeh Shiri,
Nayyar Zaidi,
Xiaodong Li
Abstract:
Transporting ore from mines to ports is of significant interest in mining supply chains. These operations are commonly associated with growing costs and a lack of resources. Large mining companies are interested in optimally allocating their resources to reduce operational costs. This problem has been previously investigated in the literature as resource constrained job scheduling (RCJS). While a…
▽ More
Transporting ore from mines to ports is of significant interest in mining supply chains. These operations are commonly associated with growing costs and a lack of resources. Large mining companies are interested in optimally allocating their resources to reduce operational costs. This problem has been previously investigated in the literature as resource constrained job scheduling (RCJS). While a number of optimisation methods have been proposed to tackle the deterministic problem, the uncertainty associated with resource availability, an inevitable challenge in mining operations, has received less attention. RCJS with uncertainty is a hard combinatorial optimisation problem that cannot be solved efficiently with existing optimisation methods. This study proposes an adaptive population-based simulated annealing algorithm that can overcome the limitations of existing methods for RCJS with uncertainty including the premature convergence, the excessive number of hyper-parameters, and the inefficiency in coping with different uncertainty levels. This new algorithm is designed to effectively balance exploration and exploitation, by using a population, modifying the cooling schedule in the Metropolis-Hastings algorithm, and using an adaptive mechanism to select perturbation operators. The results show that the proposed algorithm outperforms existing methods across a wide range of benchmark RCJS instances and uncertainty levels. Moreover, new best known solutions are discovered for all but one problem instance across all uncertainty levels.
△ Less
Submitted 30 October, 2022;
originally announced October 2022.
-
Pedestrian Emergency Braking in Ten Weeks
Authors:
Steven Nguyen,
Zillur Rahman,
Brendan Tan Morris
Abstract:
In the last decade, research in the field of autonomous vehicles has grown immensely, and there is a wealth of information available for researchers to rapidly establish an autonomous vehicle platform for basic maneuvers. In this paper, we design, implement, and test, in ten weeks, a PD approach to longitudinal control for pedestrian emergency braking. We also propose a lateral controller with a s…
▽ More
In the last decade, research in the field of autonomous vehicles has grown immensely, and there is a wealth of information available for researchers to rapidly establish an autonomous vehicle platform for basic maneuvers. In this paper, we design, implement, and test, in ten weeks, a PD approach to longitudinal control for pedestrian emergency braking. We also propose a lateral controller with a similar design for future testing in lane following. Using widely available tools, we demonstrate the safety of the vehicle in pedestrian emergency braking scenarios.
△ Less
Submitted 21 October, 2022;
originally announced October 2022.
-
An Efficient Merge Search Matheuristic for Maximising the Net Present Value of Project Schedules
Authors:
Dhananjay R. Thiruvady,
Su Nguyen,
Christian Blum,
Andreas T. Ernst
Abstract:
Resource constrained project scheduling is an important combinatorial optimisation problem with many practical applications. With complex requirements such as precedence constraints, limited resources, and finance-based objectives, finding optimal solutions for large problem instances is very challenging even with well-customised meta-heuristics and matheuristics. To address this challenge, we pro…
▽ More
Resource constrained project scheduling is an important combinatorial optimisation problem with many practical applications. With complex requirements such as precedence constraints, limited resources, and finance-based objectives, finding optimal solutions for large problem instances is very challenging even with well-customised meta-heuristics and matheuristics. To address this challenge, we propose a new math-heuristic algorithm based on Merge Search and parallel computing to solve the resource constrained project scheduling with the aim of maximising the net present value. This paper presents a novel matheuristic framework designed for resource constrained project scheduling, Merge search, which is a variable partitioning and merging mechanism to formulate restricted mixed integer programs with the aim of improving an existing pool of solutions. The solution pool is obtained via a customised parallel ant colony optimisation algorithm, which is also capable of generating high quality solutions on its own. The experimental results show that the proposed method outperforms the current state-of-the-art algorithms on known benchmark problem instances. Further analyses also demonstrate that the proposed algorithm is substantially more efficient compared to its counterparts in respect to its convergence properties when considering multiple cores.
△ Less
Submitted 20 October, 2022;
originally announced October 2022.
-
Broad-persistent Advice for Interactive Reinforcement Learning Scenarios
Authors:
Francisco Cruz,
Adam Bignold,
Hung Son Nguyen,
Richard Dazeley,
Peter Vamplew
Abstract:
The use of interactive advice in reinforcement learning scenarios allows for speeding up the learning process for autonomous agents. Current interactive reinforcement learning research has been limited to real-time interactions that offer relevant user advice to the current state only. Moreover, the information provided by each interaction is not retained and instead discarded by the agent after a…
▽ More
The use of interactive advice in reinforcement learning scenarios allows for speeding up the learning process for autonomous agents. Current interactive reinforcement learning research has been limited to real-time interactions that offer relevant user advice to the current state only. Moreover, the information provided by each interaction is not retained and instead discarded by the agent after a single use. In this paper, we present a method for retaining and reusing provided knowledge, allowing trainers to give general advice relevant to more than just the current state. Results obtained show that the use of broad-persistent advice substantially improves the performance of the agent while reducing the number of interactions required for the trainer.
△ Less
Submitted 11 October, 2022;
originally announced October 2022.