-
Heterogeneous Hypergraph Embedding for Recommendation Systems
Authors:
Darnbi Sakong,
Viet Hung Vu,
Thanh Trung Huynh,
Phi Le Nguyen,
Hongzhi Yin,
Quoc Viet Hung Nguyen,
Thanh Tam Nguyen
Abstract:
Recent advancements in recommender systems have focused on integrating knowledge graphs (KGs) to leverage their auxiliary information. The core idea of KG-enhanced recommenders is to incorporate rich semantic information for more accurate recommendations. However, two main challenges persist: i) Neglecting complex higher-order interactions in the KG-based user-item network, potentially leading to…
▽ More
Recent advancements in recommender systems have focused on integrating knowledge graphs (KGs) to leverage their auxiliary information. The core idea of KG-enhanced recommenders is to incorporate rich semantic information for more accurate recommendations. However, two main challenges persist: i) Neglecting complex higher-order interactions in the KG-based user-item network, potentially leading to sub-optimal recommendations, and ii) Dealing with the heterogeneous modalities of input sources, such as user-item bipartite graphs and KGs, which may introduce noise and inaccuracies. To address these issues, we present a novel Knowledge-enhanced Heterogeneous Hypergraph Recommender System (KHGRec). KHGRec captures group-wise characteristics of both the interaction network and the KG, modeling complex connections in the KG. Using a collaborative knowledge heterogeneous hypergraph (CKHG), it employs two hypergraph encoders to model group-wise interdependencies and ensure explainability. Additionally, it fuses signals from the input graphs with cross-view self-supervised learning and attention mechanisms. Extensive experiments on four real-world datasets show our model's superiority over various state-of-the-art baselines, with an average 5.18\% relative improvement. Additional tests on noise resilience, missing data, and cold-start problems demonstrate the robustness of our KHGRec framework. Our model and evaluation datasets are publicly available at \url{https://github.com/viethungvu1998/KHGRec}.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
ViANLI: Adversarial Natural Language Inference for Vietnamese
Authors:
Tin Van Huynh,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
The development of Natural Language Processing (NLI) datasets and models has been inspired by innovations in annotation design. With the rapid development of machine learning models today, the performance of existing machine learning models has quickly reached state-of-the-art results on a variety of tasks related to natural language processing, including natural language inference tasks. By using…
▽ More
The development of Natural Language Processing (NLI) datasets and models has been inspired by innovations in annotation design. With the rapid development of machine learning models today, the performance of existing machine learning models has quickly reached state-of-the-art results on a variety of tasks related to natural language processing, including natural language inference tasks. By using a pre-trained model during the annotation process, it is possible to challenge current NLI models by having humans produce premise-hypothesis combinations that the machine model cannot correctly predict. To remain attractive and challenging in the research of natural language inference for Vietnamese, in this paper, we introduce the adversarial NLI dataset to the NLP research community with the name ViANLI. This data set contains more than 10K premise-hypothesis pairs and is built by a continuously adjusting process to obtain the most out of the patterns generated by the annotators. ViANLI dataset has brought many difficulties to many current SOTA models when the accuracy of the most powerful model on the test set only reached 48.4%. Additionally, the experimental results show that the models trained on our dataset have significantly improved the results on other Vietnamese NLI datasets.
△ Less
Submitted 1 July, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models
Authors:
Yi Zeng,
Weiyu Sun,
Tran Ngoc Huynh,
Dawn Song,
Bo Li,
Ruoxi Jia
Abstract:
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions. The high dimensionality of potential triggers in the token space and the diverse range of malicious behaviors make this a critical challenge. We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relati…
▽ More
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions. The high dimensionality of potential triggers in the token space and the diverse range of malicious behaviors make this a critical challenge. We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space. Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations. Experiments show BEEAR reduces the success rate of RLHF time backdoor attacks from >95% to <1% and from 47% to 0% for instruction-tuning time backdoors targeting malicious code generation, without compromising model utility. Requiring only defender-defined safe and unwanted behaviors, BEEAR represents a step towards practical defenses against safety backdoors in LLMs, providing a foundation for further advancements in AI safety and security.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
PromptDSI: Prompt-based Rehearsal-free Instance-wise Incremental Learning for Document Retrieval
Authors:
Tuan-Luc Huynh,
Thuy-Trang Vu,
Weiqing Wang,
Yinwei Wei,
Trung Le,
Dragan Gasevic,
Yuan-Fang Li,
Thanh-Toan Do
Abstract:
Differentiable Search Index (DSI) utilizes Pre-trained Language Models (PLMs) for efficient document retrieval without relying on external indexes. However, DSIs need full re-training to handle updates in dynamic corpora, causing significant computational inefficiencies. We introduce PromptDSI, a rehearsal-free, prompt-based approach for instance-wise incremental learning in document retrieval. Pr…
▽ More
Differentiable Search Index (DSI) utilizes Pre-trained Language Models (PLMs) for efficient document retrieval without relying on external indexes. However, DSIs need full re-training to handle updates in dynamic corpora, causing significant computational inefficiencies. We introduce PromptDSI, a rehearsal-free, prompt-based approach for instance-wise incremental learning in document retrieval. PromptDSI attaches prompts to the frozen PLM's encoder of DSI, leveraging its powerful representation to efficiently index new corpora while maintaining a balance between stability and plasticity. We eliminate the initial forward pass of prompt-based continual learning methods that doubles training and inference time. Moreover, we propose a topic-aware prompt pool that employs neural topic embeddings as fixed keys. This strategy ensures diverse and effective prompt usage, addressing the challenge of parameter underutilization caused by the collapse of the query-key matching mechanism. Our empirical evaluations demonstrate that PromptDSI matches IncDSI in managing forgetting while significantly enhancing recall by over 4% on new corpora.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Constructions, bounds, and algorithms for peaceable queens
Authors:
Katie Clinch,
Matthew Drescher,
Tony Huynh,
Abdallah Saffidine
Abstract:
The peaceable queens problem asks to determine the maximum number $a(n)$ such that there is a placement of $a(n)$ white queens and $a(n)$ black queens on an $n \times n$ chessboard so that no queen can capture any queen of the opposite color. In this paper, we consider the peaceable queens problem and its variant on the toroidal board. For the regular board, we show that $a(n) \leq 0.1716n^2$, for…
▽ More
The peaceable queens problem asks to determine the maximum number $a(n)$ such that there is a placement of $a(n)$ white queens and $a(n)$ black queens on an $n \times n$ chessboard so that no queen can capture any queen of the opposite color. In this paper, we consider the peaceable queens problem and its variant on the toroidal board. For the regular board, we show that $a(n) \leq 0.1716n^2$, for all sufficiently large $n$. This improves on the bound $a(n) \leq 0.25n^2$ of van Bommel and MacEachern.
For the toroidal board, we provide new upper and lower bounds. Somewhat surprisingly, our bounds show that there is a sharp contrast in behaviour between the odd torus and the even torus. Our lower bounds are given by explicit constructions. For the upper bounds, we formulate the problem as a non-linear optimization problem with at most $100$ variables, regardless of the size of the board. We solve our non-linear program exactly using modern optimization software.
We also provide a local search algorithm and a software implementation which converges very rapidly to solutions which appear optimal. Our algorithm is sufficiently robust that it works on both the regular and toroidal boards. For example, for the regular board, the algorithm quickly finds the so-called Ainley construction. Thus, our work provides some further evidence that the Ainley construction is indeed optimal.
△ Less
Submitted 1 July, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Fast-FedUL: A Training-Free Federated Unlearning with Provable Skew Resilience
Authors:
Thanh Trung Huynh,
Trong Bang Nguyen,
Phi Le Nguyen,
Thanh Tam Nguyen,
Matthias Weidlich,
Quoc Viet Hung Nguyen,
Karl Aberer
Abstract:
Federated learning (FL) has recently emerged as a compelling machine learning paradigm, prioritizing the protection of privacy for training data. The increasing demand to address issues such as ``the right to be forgotten'' and combat data poisoning attacks highlights the importance of techniques, known as \textit{unlearning}, which facilitate the removal of specific training data from trained FL…
▽ More
Federated learning (FL) has recently emerged as a compelling machine learning paradigm, prioritizing the protection of privacy for training data. The increasing demand to address issues such as ``the right to be forgotten'' and combat data poisoning attacks highlights the importance of techniques, known as \textit{unlearning}, which facilitate the removal of specific training data from trained FL models. Despite numerous unlearning methods proposed for centralized learning, they often prove inapplicable to FL due to fundamental differences in the operation of the two learning paradigms. Consequently, unlearning in FL remains in its early stages, presenting several challenges. Many existing unlearning solutions in FL require a costly retraining process, which can be burdensome for clients. Moreover, these methods are primarily validated through experiments, lacking theoretical assurances. In this study, we introduce Fast-FedUL, a tailored unlearning method for FL, which eliminates the need for retraining entirely. Through meticulous analysis of the target client's influence on the global model in each round, we develop an algorithm to systematically remove the impact of the target client from the trained model. In addition to presenting empirical findings, we offer a theoretical analysis delineating the upper bound of our unlearned model and the exact retrained model (the one obtained through retraining using untargeted clients). Experimental results with backdoor attack scenarios indicate that Fast-FedUL effectively removes almost all traces of the target client, while retaining the knowledge of untargeted clients (obtaining a high accuracy of up to 98\% on the main task). Significantly, Fast-FedUL attains the lowest time complexity, providing a speed that is 1000 times faster than retraining. Our source code is publicly available at \url{https://github.com/thanhtrunghuynh93/fastFedUL}.
△ Less
Submitted 28 May, 2024;
originally announced May 2024.
-
A Qualitative Analysis Framework for mHealth Privacy Practices
Authors:
Thomas Cory,
Wolf Rieder,
Thu-My Huynh
Abstract:
Mobile Health (mHealth) applications have become a crucial part of health monitoring and management. However, the proliferation of these applications has also raised concerns over the privacy and security of Personally Identifiable Information and Protected Health Information. Addressing these concerns, this paper introduces a novel framework for the qualitative evaluation of privacy practices in…
▽ More
Mobile Health (mHealth) applications have become a crucial part of health monitoring and management. However, the proliferation of these applications has also raised concerns over the privacy and security of Personally Identifiable Information and Protected Health Information. Addressing these concerns, this paper introduces a novel framework for the qualitative evaluation of privacy practices in mHealth apps, particularly focusing on the handling and transmission of sensitive user data. Our investigation encompasses an analysis of 152 leading mHealth apps on the Android platform, leveraging the proposed framework to provide a multifaceted view of their data processing activities. Despite stringent regulations like the General Data Protection Regulation in the European Union and the Health Insurance Portability and Accountability Act in the United States, our findings indicate persistent issues with negligence and misuse of sensitive user information. We uncover significant instances of health information leakage to third-party trackers and a widespread neglect of privacy-by-design and transparency principles. Our research underscores the critical need for stricter enforcement of data protection laws and sets a foundation for future efforts aimed at enhancing user privacy within the mHealth ecosystem.
△ Less
Submitted 28 May, 2024;
originally announced May 2024.
-
Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation
Authors:
Quang Vinh Nguyen,
Van Thong Huynh,
Soo-Hyung Kim
Abstract:
Colonoscopy is a common and practical method for detecting and treating polyps. Segmenting polyps from colonoscopy image is useful for diagnosis and surgery progress. Nevertheless, achieving excellent segmentation performance is still difficult because of polyp characteristics like shape, color, condition, and obvious non-distinction from the surrounding context. This work presents a new novel arc…
▽ More
Colonoscopy is a common and practical method for detecting and treating polyps. Segmenting polyps from colonoscopy image is useful for diagnosis and surgery progress. Nevertheless, achieving excellent segmentation performance is still difficult because of polyp characteristics like shape, color, condition, and obvious non-distinction from the surrounding context. This work presents a new novel architecture namely Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation (ADSNet), which modifies misclassified details and recovers weak features having the ability to vanish and not be detected at the final stage. The architecture consists of a complementary trilateral decoder to produce an early global map. A continuous attention module modifies semantics of high-level features to analyze two separate semantics of the early global map. The suggested method is experienced on polyp benchmarks in learning ability and generalization ability, experimental results demonstrate the great correction and recovery ability leading to better segmentation performance compared to the other state of the art in the polyp image segmentation task. Especially, the proposed architecture could be experimented flexibly for other CNN-based encoders, Transformer-based encoders, and decoder backbones.
△ Less
Submitted 13 May, 2024;
originally announced May 2024.
-
A Combination of BERT and Transformer for Vietnamese Spelling Correction
Authors:
Hieu Ngo Trung,
Duong Tran Ham,
Tin Huynh,
Kiem Hoang
Abstract:
Recently, many studies have shown the efficiency of using Bidirectional Encoder Representations from Transformers (BERT) in various Natural Language Processing (NLP) tasks. Specifically, English spelling correction task that uses Encoder-Decoder architecture and takes advantage of BERT has achieved state-of-the-art result. However, to our knowledge, there is no implementation in Vietnamese yet. Th…
▽ More
Recently, many studies have shown the efficiency of using Bidirectional Encoder Representations from Transformers (BERT) in various Natural Language Processing (NLP) tasks. Specifically, English spelling correction task that uses Encoder-Decoder architecture and takes advantage of BERT has achieved state-of-the-art result. However, to our knowledge, there is no implementation in Vietnamese yet. Therefore, in this study, a combination of Transformer architecture (state-of-the-art for Encoder-Decoder model) and BERT was proposed to deal with Vietnamese spelling correction. The experiment results have shown that our model outperforms other approaches as well as the Google Docs Spell Checking tool, achieves an 86.24 BLEU score on this task.
△ Less
Submitted 4 May, 2024;
originally announced May 2024.
-
Manipulating Recommender Systems: A Survey of Poisoning Attacks and Countermeasures
Authors:
Thanh Toan Nguyen,
Quoc Viet Hung Nguyen,
Thanh Tam Nguyen,
Thanh Trung Huynh,
Thanh Thi Nguyen,
Matthias Weidlich,
Hongzhi Yin
Abstract:
Recommender systems have become an integral part of online services to help users locate specific information in a sea of data. However, existing studies show that some recommender systems are vulnerable to poisoning attacks, particularly those that involve learning schemes. A poisoning attack is where an adversary injects carefully crafted data into the process of training a model, with the goal…
▽ More
Recommender systems have become an integral part of online services to help users locate specific information in a sea of data. However, existing studies show that some recommender systems are vulnerable to poisoning attacks, particularly those that involve learning schemes. A poisoning attack is where an adversary injects carefully crafted data into the process of training a model, with the goal of manipulating the system's final recommendations. Based on recent advancements in artificial intelligence, such attacks have gained importance recently. While numerous countermeasures to poisoning attacks have been developed, they have not yet been systematically linked to the properties of the attacks. Consequently, assessing the respective risks and potential success of mitigation strategies is difficult, if not impossible. This survey aims to fill this gap by primarily focusing on poisoning attacks and their countermeasures. This is in contrast to prior surveys that mainly focus on attacks and their detection methods. Through an exhaustive literature review, we provide a novel taxonomy for poisoning attacks, formalise its dimensions, and accordingly organise 30+ attacks described in the literature. Further, we review 40+ countermeasures to detect and/or prevent poisoning attacks, evaluating their effectiveness against specific types of attacks. This comprehensive survey should serve as a point of reference for protecting recommender systems against poisoning attacks. The article concludes with a discussion on open issues in the field and impactful directions for future research. A rich repository of resources associated with poisoning attacks is available at https://github.com/tamlhp/awesome-recsys-poisoning.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
-
A Survey of Privacy-Preserving Model Explanations: Privacy Risks, Attacks, and Countermeasures
Authors:
Thanh Tam Nguyen,
Thanh Trung Huynh,
Zhao Ren,
Thanh Toan Nguyen,
Phi Le Nguyen,
Hongzhi Yin,
Quoc Viet Hung Nguyen
Abstract:
As the adoption of explainable AI (XAI) continues to expand, the urgency to address its privacy implications intensifies. Despite a growing corpus of research in AI privacy and explainability, there is little attention on privacy-preserving model explanations. This article presents the first thorough survey about privacy attacks on model explanations and their countermeasures. Our contribution to…
▽ More
As the adoption of explainable AI (XAI) continues to expand, the urgency to address its privacy implications intensifies. Despite a growing corpus of research in AI privacy and explainability, there is little attention on privacy-preserving model explanations. This article presents the first thorough survey about privacy attacks on model explanations and their countermeasures. Our contribution to this field comprises a thorough analysis of research papers with a connected taxonomy that facilitates the categorisation of privacy attacks and countermeasures based on the targeted explanations. This work also includes an initial investigation into the causes of privacy leaks. Finally, we discuss unresolved issues and prospective research directions uncovered in our analysis. This survey aims to be a valuable resource for the research community and offers clear insights for those new to this domain. To support ongoing research, we have established an online resource repository, which will be continuously updated with new and relevant findings. Interested readers are encouraged to access our repository at https://github.com/tamlhp/awesome-privex.
△ Less
Submitted 26 June, 2024; v1 submitted 31 March, 2024;
originally announced April 2024.
-
Open Meshed Anatomy: Towards a comprehensive finite element hexahedral mesh derived from open atlases
Authors:
Andy Trung Huynh,
Benjamin Zwick,
Michael Halle,
Adam Wittek,
Karol Miller
Abstract:
Computational simulations using methods such as the finite element (FE) method rely on high-quality meshes for achieving accurate results. This study introduces a method for creating a high-quality hexahedral mesh using the Open Anatomy Project's brain atlas. Our atlas-based FE hexahedral mesh of the brain mitigates potential inaccuracies and uncertainties due to segmentation - a process that ofte…
▽ More
Computational simulations using methods such as the finite element (FE) method rely on high-quality meshes for achieving accurate results. This study introduces a method for creating a high-quality hexahedral mesh using the Open Anatomy Project's brain atlas. Our atlas-based FE hexahedral mesh of the brain mitigates potential inaccuracies and uncertainties due to segmentation - a process that often requires input of an inexperienced analyst. It accomplishes this by leveraging existing segmentation from the atlas. We further extend the mesh's usability by forming a two-way correspondence between the atlas and mesh. This feature facilitates property assignment for computational simulations and enhances result analysis within an anatomical context. We demonstrate the application of the mesh by solving the electroencephalography (EEG) forward problem. Our method simplifies the mesh creation process, reducing time and effort, and provides a more comprehensive and contextually enriched visualisation of simulation outcomes.
△ Less
Submitted 22 February, 2024;
originally announced February 2024.
-
Vietnamese Poem Generation & The Prospect Of Cross-Language Poem-To-Poem Translation
Authors:
Triet Minh Huynh,
Quan Le Bao
Abstract:
Poetry generation has been a challenging task in the field of Natural Language Processing, as it requires the model to understand the nuances of language, sentiment, and style. In this paper, we propose using Large Language Models to generate Vietnamese poems of various genres from natural language prompts, thereby facilitating an intuitive process with enhanced content control. Our most efficacio…
▽ More
Poetry generation has been a challenging task in the field of Natural Language Processing, as it requires the model to understand the nuances of language, sentiment, and style. In this paper, we propose using Large Language Models to generate Vietnamese poems of various genres from natural language prompts, thereby facilitating an intuitive process with enhanced content control. Our most efficacious model, the GPT-3 Babbage variant, achieves a custom evaluation score of 0.8, specifically tailored to the "luc bat" genre of Vietnamese poetry. Furthermore, we also explore the idea of paraphrasing poems into normal text prompts and yield a relatively high score of 0.781 in the "luc bat" genre. This experiment presents the potential for cross-Language poem-to-poem translation with translated poems as the inputs while concurrently maintaining complete control over the generated content.
△ Less
Submitted 4 January, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
Multi-Branch Network for Imagery Emotion Prediction
Authors:
Quoc-Bao Ninh,
Hai-Chan Nguyen,
Triet Huynh,
Trung-Nghia Le
Abstract:
For a long time, images have proved perfect at both storing and conveying rich semantics, especially human emotions. A lot of research has been conducted to provide machines with the ability to recognize emotions in photos of people. Previous methods mostly focus on facial expressions but fail to consider the scene context, meanwhile scene context plays an important role in predicting emotions, le…
▽ More
For a long time, images have proved perfect at both storing and conveying rich semantics, especially human emotions. A lot of research has been conducted to provide machines with the ability to recognize emotions in photos of people. Previous methods mostly focus on facial expressions but fail to consider the scene context, meanwhile scene context plays an important role in predicting emotions, leading to more accurate results. In addition, Valence-Arousal-Dominance (VAD) values offer a more precise quantitative understanding of continuous emotions, yet there has been less emphasis on predicting them compared to discrete emotional categories. In this paper, we present a novel Multi-Branch Network (MBN), which utilizes various source information, including faces, bodies, and scene contexts to predict both discrete and continuous emotions in an image. Experimental results on EMOTIC dataset, which contains large-scale images of people in unconstrained situations labeled with 26 discrete categories of emotions and VAD values, show that our proposed method significantly outperforms state-of-the-art methods with 28.4% in mAP and 0.93 in MAE. The results highlight the importance of utilizing multiple contextual information in emotion prediction and illustrate the potential of our proposed method in a wide range of applications, such as effective computing, human-computer interaction, and social robotics. Source code: https://github.com/BaoNinh2808/Multi-Branch-Network-for-Imagery-Emotion-Prediction
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Predicting Agricultural Commodities Prices with Machine Learning: A Review of Current Research
Authors:
Nhat-Quang Tran,
Anna Felipe,
Thanh Nguyen Ngoc,
Tom Huynh,
Quang Tran,
Arthur Tang,
Thuy Nguyen
Abstract:
Agricultural price prediction is crucial for farmers, policymakers, and other stakeholders in the agricultural sector. However, it is a challenging task due to the complex and dynamic nature of agricultural markets. Machine learning algorithms have the potential to revolutionize agricultural price prediction by improving accuracy, real-time prediction, customization, and integration. This paper re…
▽ More
Agricultural price prediction is crucial for farmers, policymakers, and other stakeholders in the agricultural sector. However, it is a challenging task due to the complex and dynamic nature of agricultural markets. Machine learning algorithms have the potential to revolutionize agricultural price prediction by improving accuracy, real-time prediction, customization, and integration. This paper reviews recent research on machine learning algorithms for agricultural price prediction. We discuss the importance of agriculture in developing countries and the problems associated with crop price falls. We then identify the challenges of predicting agricultural prices and highlight how machine learning algorithms can support better prediction. Next, we present a comprehensive analysis of recent research, discussing the strengths and weaknesses of various machine learning techniques. We conclude that machine learning has the potential to revolutionize agricultural price prediction, but further research is essential to address the limitations and challenges associated with this approach.
△ Less
Submitted 28 October, 2023;
originally announced October 2023.
-
Learning Regionalization using Accurate Spatial Cost Gradients within a Differentiable High-Resolution Hydrological Model: Application to the French Mediterranean Region
Authors:
Ngo Nghi Truyen Huynh,
Pierre-André Garambois,
François Colleoni,
Benjamin Renard,
Hélène Roux,
Julie Demargne,
Maxime Jay-Allemand,
Pierre Javelle
Abstract:
Estimating spatially distributed hydrological parameters in ungauged catchments poses a challenging regionalization problem and requires imposing spatial constraints given the sparsity of discharge data. A possible approach is to search for a transfer function that quantitatively relates physical descriptors to conceptual model parameters. This paper introduces a Hybrid Data Assimilation and Param…
▽ More
Estimating spatially distributed hydrological parameters in ungauged catchments poses a challenging regionalization problem and requires imposing spatial constraints given the sparsity of discharge data. A possible approach is to search for a transfer function that quantitatively relates physical descriptors to conceptual model parameters. This paper introduces a Hybrid Data Assimilation and Parameter Regionalization (HDA-PR) approach incorporating learnable regionalization mappings, based on either multi-linear regressions or artificial neural networks (ANNs), into a differentiable hydrological model. This approach demonstrates how two differentiable codes can be linked and their gradients chained, enabling the exploitation of heterogeneous datasets across extensive spatio-temporal computational domains within a high-dimensional regionalization context, using accurate adjoint-based gradients. The inverse problem is tackled with a multi-gauge calibration cost function accounting for information from multiple observation sites. HDA-PR was tested on high-resolution, hourly and kilometric regional modeling of 126 flash-flood-prone catchments in the French Mediterranean region. The results highlight a strong regionalization performance of HDA-PR especially in the most challenging upstream-to-downstream extrapolation scenario with ANN, achieving median Nash-Sutcliffe efficiency (NSE) scores from 0.6 to 0.71 for spatial, temporal, spatio-temporal validations, and improving NSE by up to 30% on average compared to the baseline model calibrated with lumped parameters. ANN enables to learn a non-linear descriptors-to-parameters mapping which provides better model controllability than a linear mapping for complex calibration cases.
△ Less
Submitted 8 July, 2024; v1 submitted 2 August, 2023;
originally announced August 2023.
-
DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation
Authors:
Vu Ngoc Tu,
Van Thong Huynh,
Hyung-Jeong Yang,
M. Zaigham Zaheer,
Shah Nawaz,
Karthik Nandakumar,
Soo-Hyung Kim
Abstract:
Conversational engagement estimation is posed as a regression problem, entailing the identification of the favorable attention and involvement of the participants in the conversation. This task arises as a crucial pursuit to gain insights into human's interaction dynamics and behavior patterns within a conversation. In this research, we introduce a dilated convolutional Transformer for modeling an…
▽ More
Conversational engagement estimation is posed as a regression problem, entailing the identification of the favorable attention and involvement of the participants in the conversation. This task arises as a crucial pursuit to gain insights into human's interaction dynamics and behavior patterns within a conversation. In this research, we introduce a dilated convolutional Transformer for modeling and estimating human engagement in the MULTIMEDIATE 2023 competition. Our proposed system surpasses the baseline models, exhibiting a noteworthy $7$\% improvement on test set and $4$\% on validation set. Moreover, we employ different modality fusion mechanism and show that for this type of data, a simple concatenated method with self-attention fusion gains the best performance.
△ Less
Submitted 31 July, 2023;
originally announced August 2023.
-
Multi-gauge Hydrological Variational Data Assimilation: Regionalization Learning with Spatial Gradients using Multilayer Perceptron and Bayesian-Guided Multivariate Regression
Authors:
Ngo Nghi Truyen Huynh,
Pierre-André Garambois,
François Colleoni,
Benjamin Renard,
Hélène Roux
Abstract:
Tackling the difficult problem of estimating spatially distributed hydrological parameters, especially for floods on ungauged watercourses, this contribution presents a novel seamless regionalization technique for learning complex regional transfer functions designed for high-resolution hydrological models. The transfer functions rely on: (i) a multilayer perceptron enabling a seamless flow of gra…
▽ More
Tackling the difficult problem of estimating spatially distributed hydrological parameters, especially for floods on ungauged watercourses, this contribution presents a novel seamless regionalization technique for learning complex regional transfer functions designed for high-resolution hydrological models. The transfer functions rely on: (i) a multilayer perceptron enabling a seamless flow of gradient computation to employ machine learning optimization algorithms, or (ii) a multivariate regression mapping optimized by variational data assimilation algorithms and guided by Bayesian estimation, addressing the equifinality issue of feasible solutions. The approach involves incorporating the inferable regionalization mappings into a differentiable hydrological model and optimizing a cost function computed on multi-gauge data with accurate adjoint-based spatially distributed gradients.
△ Less
Submitted 4 July, 2023;
originally announced July 2023.
-
Neural Multigrid Memory For Computational Fluid Dynamics
Authors:
Duc Minh Nguyen,
Minh Chau Vu,
Tuan Anh Nguyen,
Tri Huynh,
Nguyen Tri Nguyen,
Truong Son Hy
Abstract:
Turbulent flow simulation plays a crucial role in various applications, including aircraft and ship design, industrial process optimization, and weather prediction. In this paper, we propose an advanced data-driven method for simulating turbulent flow, representing a significant improvement over existing approaches. Our methodology combines the strengths of Video Prediction Transformer (VPTR) (Ye…
▽ More
Turbulent flow simulation plays a crucial role in various applications, including aircraft and ship design, industrial process optimization, and weather prediction. In this paper, we propose an advanced data-driven method for simulating turbulent flow, representing a significant improvement over existing approaches. Our methodology combines the strengths of Video Prediction Transformer (VPTR) (Ye & Bilodeau, 2022) and Multigrid Architecture (MgConv, MgResnet) (Ke et al., 2017). VPTR excels in capturing complex spatiotemporal dependencies and handling large input data, making it a promising choice for turbulent flow prediction. Meanwhile, Multigrid Architecture utilizes multiple grids with different resolutions to capture the multiscale nature of turbulent flows, resulting in more accurate and efficient simulations. Through our experiments, we demonstrate the effectiveness of our proposed approach, named MGxTransformer, in accurately predicting velocity, temperature, and turbulence intensity for incompressible turbulent flows across various geometries and flow conditions. Our results exhibit superior accuracy compared to other baselines, while maintaining computational efficiency. Our implementation in PyTorch is available publicly at https://github.com/Combi2k2/MG-Turbulent-Flow
△ Less
Submitted 24 June, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Why Are Conversational Assistants Still Black Boxes? The Case For Transparency
Authors:
Trung Dong Huynh,
William Seymour,
Luc Moreau,
Jose Such
Abstract:
Much has been written about privacy in the context of conversational and voice assistants. Yet, there have been remarkably few developments in terms of the actual privacy offered by these devices. But how much of this is due to the technical and design limitations of speech as an interaction modality? In this paper, we set out to reframe the discussion on why commercial conversational assistants d…
▽ More
Much has been written about privacy in the context of conversational and voice assistants. Yet, there have been remarkably few developments in terms of the actual privacy offered by these devices. But how much of this is due to the technical and design limitations of speech as an interaction modality? In this paper, we set out to reframe the discussion on why commercial conversational assistants do not offer meaningful privacy and transparency by demonstrating how they \emph{could}. By instrumenting the open-source voice assistant Mycroft to capture audit trails for data access, we demonstrate how such functionality could be integrated into big players in the sector like Alexa and Google Assistant. We show that this problem can be solved with existing technology and open standards and is thus fundamentally a business decision rather than a technical limitation.
△ Less
Submitted 8 June, 2023;
originally announced June 2023.
-
A Menger-type theorem for two induced paths
Authors:
Sandra Albrechtsen,
Tony Huynh,
Raphael W. Jacobs,
Paul Knappe,
Paul Wollan
Abstract:
We give an approximate Menger-type theorem for when a graph $G$ contains two $X-Y$ paths $P_1$ and $P_2$ such that $P_1 \cup P_2$ is an induced subgraph of $G$. More generally, we prove that there exists a function $f(d) \in O(d)$, such that for every graph $G$ and $X,Y \subseteq V(G)$, either there exist two $X-Y$ paths $P_1$ and $P_2$ such that the distance between $P_1$ and $P_2$ is at least…
▽ More
We give an approximate Menger-type theorem for when a graph $G$ contains two $X-Y$ paths $P_1$ and $P_2$ such that $P_1 \cup P_2$ is an induced subgraph of $G$. More generally, we prove that there exists a function $f(d) \in O(d)$, such that for every graph $G$ and $X,Y \subseteq V(G)$, either there exist two $X-Y$ paths $P_1$ and $P_2$ such that the distance between $P_1$ and $P_2$ is at least $d$, or there exists $v \in V(G)$ such that the ball of radius $f(d)$ centered at $v$ intersects every $X-Y$ path.
△ Less
Submitted 3 February, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Multi-scale Transformer-based Network for Emotion Recognition from Multi Physiological Signals
Authors:
Tu Vu,
Van Thong Huynh,
Soo-Hyung Kim
Abstract:
This paper presents an efficient Multi-scale Transformer-based approach for the task of Emotion recognition from Physiological data, which has gained widespread attention in the research community due to the vast amount of information that can be extracted from these signals using modern sensors and machine learning techniques. Our approach involves applying a Multi-modal technique combined with s…
▽ More
This paper presents an efficient Multi-scale Transformer-based approach for the task of Emotion recognition from Physiological data, which has gained widespread attention in the research community due to the vast amount of information that can be extracted from these signals using modern sensors and machine learning techniques. Our approach involves applying a Multi-modal technique combined with scaling data to establish the relationship between internal body signals and human emotions. Additionally, we utilize Transformer and Gaussian Transformation techniques to improve signal encoding effectiveness and overall performance. Our model achieves decent results on the CASE dataset of the EPiC competition, with an RMSE score of 1.45.
△ Less
Submitted 7 May, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval
Authors:
Trung-Nghia Le,
Tam V. Nguyen,
Minh-Quan Le,
Trong-Thuan Nguyen,
Viet-Tham Huynh,
Trong-Le Do,
Khanh-Duy Le,
Mai-Khiem Tran,
Nhat Hoang-Xuan,
Thang-Long Nguyen-Ho,
Vinh-Tiep Nguyen,
Nhat-Quynh Le-Pham,
Huu-Phuc Pham,
Trong-Vu Hoang,
Quang-Binh Nguyen,
Trong-Hieu Nguyen-Mau,
Tuan-Luc Huynh,
Thanh-Danh Le,
Ngoc-Linh Nguyen-Ha,
Tuong-Vy Truong-Thuy,
Truong Hoai Phong,
Tuong-Nghiem Diep,
Khanh-Duy Ho,
Xuan-Hieu Nguyen,
Thien-Phuc Tran
, et al. (9 additional authors not shown)
Abstract:
The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this…
▽ More
The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this end, we introduce a novel SHREC challenge track that focuses on retrieving relevant 3D animal models from a dataset using sketch queries and expedites accessing 3D models through available sketches. Furthermore, a new dataset named ANIMAR was constructed in this study, comprising a collection of 711 unique 3D animal models and 140 corresponding sketch queries. Our contest requires participants to retrieve 3D models based on complex and detailed sketches. We receive satisfactory results from eight teams and 204 runs. Although further improvement is necessary, the proposed task has the potential to incentivize additional research in the domain of 3D object retrieval, potentially yielding benefits for a wide range of applications. We also provide insights into potential areas of future research, such as improving techniques for feature extraction and matching and creating more diverse datasets to evaluate retrieval performance. https://aichallenge.hcmus.edu.vn/sketchanimar
△ Less
Submitted 9 August, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
E2C: A Visual Simulator to Reinforce Education of Heterogeneous Computing Systems
Authors:
Ali Mokhtari,
Drake Rawls,
Tony Huynh,
Jeremiah Green,
Mohsen Amini Salehi
Abstract:
With the increasing popularity of accelerator technologies (e.g., GPUs and TPUs) and the emergence of domain-specific computing via ASICs and FPGA, the matter of heterogeneity and understanding its ramifications on the performance has become more critical than ever before. However, it is challenging to effectively educate students about the potential impacts of heterogeneity on the performance of…
▽ More
With the increasing popularity of accelerator technologies (e.g., GPUs and TPUs) and the emergence of domain-specific computing via ASICs and FPGA, the matter of heterogeneity and understanding its ramifications on the performance has become more critical than ever before. However, it is challenging to effectively educate students about the potential impacts of heterogeneity on the performance of distributed systems; and on the logic of resource allocation methods to efficiently utilize the resources. Making use of the real infrastructure for benchmarking the performance of heterogeneous machines, for different applications, with respect to different objectives, and under various workload intensities is cost- and time-prohibitive. To reinforce the quality of learning about various dimensions of heterogeneity, and to decrease the widening gap in education, we develop an open-source simulation tool, called E2C, that can help students researchers to study any type of heterogeneous (or homogeneous) computing system and measure its performance under various configurations. E2C is equipped with an intuitive graphical user interface (GUI) that enables its users to easily examine system-level solutions (scheduling, load balancing, scalability, etc.) in a controlled environment within a short time. E2C is a discrete event simulator that offers the following features: (i) simulating a heterogeneous computing system; (ii) implementing a newly developed scheduling method and plugging it into the system, (iii) measuring energy consumption and other output-related metrics; and (iv) powerful visual aspects to ease the learning curve for students. We used E2C as an assignment in the Distributed and Cloud Computing course. Our anonymous survey study indicates that students rated E2C with the score of 8.7 out of 10 for its usefulness in understanding the concepts of scheduling in heterogeneous computing.
△ Less
Submitted 20 March, 2023;
originally announced March 2023.
-
Vision Transformer for Action Units Detection
Authors:
Tu Vu,
Van Thong Huynh,
Soo Hyung Kim
Abstract:
Facial Action Units detection (FAUs) represents a fine-grained classification problem that involves identifying different units on the human face, as defined by the Facial Action Coding System. In this paper, we present a simple yet efficient Vision Transformer-based approach for addressing the task of Action Units (AU) detection in the context of Affective Behavior Analysis in-the-wild (ABAW) com…
▽ More
Facial Action Units detection (FAUs) represents a fine-grained classification problem that involves identifying different units on the human face, as defined by the Facial Action Coding System. In this paper, we present a simple yet efficient Vision Transformer-based approach for addressing the task of Action Units (AU) detection in the context of Affective Behavior Analysis in-the-wild (ABAW) competition. We employ the Video Vision Transformer(ViViT) Network to capture the temporal facial change in the video. Besides, to reduce massive size of the Vision Transformers model, we replace the ViViT feature extraction layers with the CNN backbone (Regnet). Our model outperform the baseline model of ABAW 2023 challenge, with a notable 14% difference in result. Furthermore, the achieved results are comparable to those of the top three teams in the previous ABAW 2022 challenge.
△ Less
Submitted 20 March, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Generic Event Boundary Detection in Video with Pyramid Features
Authors:
Van Thong Huynh,
Hyung-Jeong Yang,
Guee-Sang Lee,
Soo-Hyung Kim
Abstract:
Generic event boundary detection (GEBD) aims to split video into chunks at a broad and diverse set of actions as humans naturally perceive event boundaries. In this study, we present an approach that considers the correlation between neighbor frames with pyramid feature maps in both spatial and temporal dimensions to construct a framework for localizing generic events in video. The features at mul…
▽ More
Generic event boundary detection (GEBD) aims to split video into chunks at a broad and diverse set of actions as humans naturally perceive event boundaries. In this study, we present an approach that considers the correlation between neighbor frames with pyramid feature maps in both spatial and temporal dimensions to construct a framework for localizing generic events in video. The features at multiple spatial dimensions of a pre-trained ResNet-50 are exploited with different views in the temporal dimension to form a temporal pyramid feature map. Based on that, the similarity between neighbor frames is calculated and projected to build a temporal pyramid similarity feature vector. A decoder with 1D convolution operations is used to decode these similarities to a new representation that incorporates their temporal relationship for later boundary score estimation. Extensive experiments conducted on the GEBD benchmark dataset show the effectiveness of our system and its variations, in which we outperformed the state-of-the-art approaches. Additional experiments on TAPOS dataset, which contains long-form videos with Olympic sport actions, demonstrated the effectiveness of our study compared to others.
△ Less
Submitted 10 January, 2023;
originally announced January 2023.
-
Aharoni's rainbow cycle conjecture holds up to an additive constant
Authors:
Patrick Hompe,
Tony Huynh
Abstract:
In 2017, Aharoni proposed the following generalization of the Caccetta-Häggkvist conjecture: if $G$ is a simple $n$-vertex edge-colored graph with $n$ color classes of size at least $r$, then $G$ contains a rainbow cycle of length at most $\lceil n/r \rceil$.
In this paper, we prove that, for fixed $r$, Aharoni's conjecture holds up to an additive constant. Specifically, we show that for each fi…
▽ More
In 2017, Aharoni proposed the following generalization of the Caccetta-Häggkvist conjecture: if $G$ is a simple $n$-vertex edge-colored graph with $n$ color classes of size at least $r$, then $G$ contains a rainbow cycle of length at most $\lceil n/r \rceil$.
In this paper, we prove that, for fixed $r$, Aharoni's conjecture holds up to an additive constant. Specifically, we show that for each fixed $r \geq 1$, there exists a constant $c_r$ such that if $G$ is a simple $n$-vertex edge-colored graph with $n$ color classes of size at least $r$, then $G$ contains a rainbow cycle of length at most $n/r + c_r$.
△ Less
Submitted 22 March, 2024; v1 submitted 11 December, 2022;
originally announced December 2022.
-
Multilingual Communication System with Deaf Individuals Utilizing Natural and Visual Languages
Authors:
Tuan-Luc Huynh,
Khoi-Nguyen Nguyen-Ngoc,
Chi-Bien Chu,
Minh-Triet Tran,
Trung-Nghia Le
Abstract:
According to the World Federation of the Deaf, more than two hundred sign languages exist. Therefore, it is challenging to understand deaf individuals, even proficient sign language users, resulting in a barrier between the deaf community and the rest of society. To bridge this language barrier, we propose a novel multilingual communication system, namely MUGCAT, to improve the communication effic…
▽ More
According to the World Federation of the Deaf, more than two hundred sign languages exist. Therefore, it is challenging to understand deaf individuals, even proficient sign language users, resulting in a barrier between the deaf community and the rest of society. To bridge this language barrier, we propose a novel multilingual communication system, namely MUGCAT, to improve the communication efficiency of sign language users. By converting recognized specific hand gestures into expressive pictures, which is universal usage and language independence, our MUGCAT system significantly helps deaf people convey their thoughts. To overcome the limitation of sign language usage, which is mostly impossible to translate into complete sentences for ordinary people, we propose to reconstruct meaningful sentences from the incomplete translation of sign language. We also measure the semantic similarity of generated sentences with fragmented recognized hand gestures to keep the original meaning. Experimental results show that the proposed system can work in a real-time manner and synthesize exquisite stunning illustrations and meaningful sentences from a few hand gestures of sign language. This proves that our MUGCAT has promising potential in assisting deaf communication.
△ Less
Submitted 1 December, 2022;
originally announced December 2022.
-
Notes on Aharoni's rainbow cycle conjecture
Authors:
Katie Clinch,
Jackson Goerner,
Tony Huynh,
Freddie Illingworth
Abstract:
In 2017, Ron Aharoni made the following conjecture about rainbow cycles in edge-coloured graphs: If $G$ is an $n$-vertex graph whose edges are coloured with $n$ colours and each colour class has size at least $r$, then $G$ contains a rainbow cycle of length at most $\lceil \frac{n}{r} \rceil$. One motivation for studying Aharoni's conjecture is that it is a strengthening of the Caccetta-Häggkvist…
▽ More
In 2017, Ron Aharoni made the following conjecture about rainbow cycles in edge-coloured graphs: If $G$ is an $n$-vertex graph whose edges are coloured with $n$ colours and each colour class has size at least $r$, then $G$ contains a rainbow cycle of length at most $\lceil \frac{n}{r} \rceil$. One motivation for studying Aharoni's conjecture is that it is a strengthening of the Caccetta-Häggkvist conjecture on digraphs from 1978.
In this article, we present a survey of Aharoni's conjecture, including many recent partial results and related conjectures. We also present two new results. Our main new result is for the $r=3$ case of Aharoni's conjecture. We prove that if $G$ is an $n$-vertex graph whose edges are coloured with $n$ colours and each colour class has size at least 3, then $G$ contains a rainbow cycle of length at most $\frac{4n}{9}+7$. We also discuss how our approach might generalise to larger values of $r$.
△ Less
Submitted 20 November, 2022; v1 submitted 14 November, 2022;
originally announced November 2022.
-
Efficient Integration of Multi-Order Dynamics and Internal Dynamics in Stock Movement Prediction
Authors:
Thanh Trung Huynh,
Minh Hieu Nguyen,
Thanh Tam Nguyen,
Phi Le Nguyen,
Matthias Weidlich,
Quoc Viet Hung Nguyen,
Karl Aberer
Abstract:
Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) \emph{multi-order dynamics}, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) \emph{internal dynamics}, as each individual stock sho…
▽ More
Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) \emph{multi-order dynamics}, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) \emph{internal dynamics}, as each individual stock shows some particular behaviour. Recent DNN-based methods capture multi-order dynamics using hypergraphs, but rely on the Fourier basis in the convolution, which is both inefficient and ineffective. In addition, they largely ignore internal dynamics by adopting the same model for each stock, which implies a severe information loss.
In this paper, we propose a framework for stock movement prediction to overcome the above issues. Specifically, the framework includes temporal generative filters that implement a memory-based mechanism onto an LSTM network in an attempt to learn individual patterns per stock. Moreover, we employ hypergraph attentions to capture the non-pairwise correlations. Here, using the wavelet basis instead of the Fourier basis, enables us to simplify the message passing and focus on the localized convolution. Experiments with US market data over six years show that our framework outperforms state-of-the-art methods in terms of profit and stability. Our source code and data are available at \url{https://github.com/thanhtrunghuynh93/estimate}.
△ Less
Submitted 24 November, 2022; v1 submitted 10 November, 2022;
originally announced November 2022.
-
Joint Multilingual Knowledge Graph Completion and Alignment
Authors:
Vinh Tong,
Dat Quoc Nguyen,
Trung Thanh Huynh,
Tam Thanh Nguyen,
Quoc Viet Hung Nguyen,
Mathias Niepert
Abstract:
Knowledge graph (KG) alignment and completion are usually treated as two independent tasks. While recent work has leveraged entity and relation alignments from multiple KGs, such as alignments between multilingual KGs with common entities and relations, a deeper understanding of the ways in which multilingual KG completion (MKGC) can aid the creation of multilingual KG alignments (MKGA) is still l…
▽ More
Knowledge graph (KG) alignment and completion are usually treated as two independent tasks. While recent work has leveraged entity and relation alignments from multiple KGs, such as alignments between multilingual KGs with common entities and relations, a deeper understanding of the ways in which multilingual KG completion (MKGC) can aid the creation of multilingual KG alignments (MKGA) is still limited. Motivated by the observation that structural inconsistencies -- the main challenge for MKGA models -- can be mitigated through KG completion methods, we propose a novel model for jointly completing and aligning knowledge graphs. The proposed model combines two components that jointly accomplish KG completion and alignment. These two components employ relation-aware graph neural networks that we propose to encode multi-hop neighborhood structures into entity and relation representations. Moreover, we also propose (i) a structural inconsistency reduction mechanism to incorporate information from the completion into the alignment component, and (ii) an alignment seed enlargement and triple transferring mechanism to enlarge alignment seeds and transfer triples during KGs alignment. Extensive experiments on a public multilingual benchmark show that our proposed model outperforms existing competitive baselines, obtaining new state-of-the-art results on both MKGC and MKGA tasks. We publicly release the implementation of our model at https://github.com/vinhsuhi/JMAC
△ Less
Submitted 18 October, 2022; v1 submitted 17 October, 2022;
originally announced October 2022.
-
A Survey of Machine Unlearning
Authors:
Thanh Tam Nguyen,
Thanh Trung Huynh,
Phi Le Nguyen,
Alan Wee-Chung Liew,
Hongzhi Yin,
Quoc Viet Hung Nguyen
Abstract:
Today, computer systems hold large amounts of personal data. Yet while such an abundance of data allows breakthroughs in artificial intelligence, and especially machine learning (ML), its existence can be a threat to user privacy, and it can weaken the bonds of trust between humans and AI. Recent regulations now require that, on request, private information about a user must be removed from both c…
▽ More
Today, computer systems hold large amounts of personal data. Yet while such an abundance of data allows breakthroughs in artificial intelligence, and especially machine learning (ML), its existence can be a threat to user privacy, and it can weaken the bonds of trust between humans and AI. Recent regulations now require that, on request, private information about a user must be removed from both computer systems and from ML models, i.e. ``the right to be forgotten''). While removing data from back-end databases should be straightforward, it is not sufficient in the AI context as ML models often `remember' the old data. Contemporary adversarial attacks on trained models have proven that we can learn whether an instance or an attribute belonged to the training data. This phenomenon calls for a new paradigm, namely machine unlearning, to make ML models forget about particular data. It turns out that recent works on machine unlearning have not been able to completely solve the problem due to the lack of common frameworks and resources. Therefore, this paper aspires to present a comprehensive examination of machine unlearning's concepts, scenarios, methods, and applications. Specifically, as a category collection of cutting-edge studies, the intention behind this article is to serve as a comprehensive resource for researchers and practitioners seeking an introduction to machine unlearning and its formulations, design criteria, removal requests, algorithms, and applications. In addition, we aim to highlight the key findings, current trends, and new research areas that have not yet featured the use of machine unlearning but could benefit greatly from it. We hope this survey serves as a valuable resource for ML researchers and those seeking to innovate privacy technologies. Our resources are publicly available at https://github.com/tamlhp/awesome-machine-unlearning.
△ Less
Submitted 21 October, 2022; v1 submitted 6 September, 2022;
originally announced September 2022.
-
SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild
Authors:
Jie Qin,
Shuaihang Yuan,
Jiaxin Chen,
Boulbaba Ben Amor,
Yi Fang,
Nhat Hoang-Xuan,
Chi-Bien Chu,
Khoi-Nguyen Nguyen-Ngoc,
Thien-Tri Cao,
Nhat-Khang Ngo,
Tuan-Luc Huynh,
Hai-Dang Nguyen,
Minh-Triet Tran,
Haoyang Luo,
Jianning Wang,
Zheng Zhang,
Zihao Xin,
Yang Wang,
Feng Wang,
Ying Tang,
Haiqin Chen,
Yan Wang,
Qunying Zhou,
Ji Zhang,
Hongyuan Wang
Abstract:
Sketch-based 3D shape retrieval (SBSR) is an important yet challenging task, which has drawn more and more attention in recent years. Existing approaches address the problem in a restricted setting, without appropriately simulating real application scenarios. To mimic the realistic setting, in this track, we adopt large-scale sketches drawn by amateurs of different levels of drawing skills, as wel…
▽ More
Sketch-based 3D shape retrieval (SBSR) is an important yet challenging task, which has drawn more and more attention in recent years. Existing approaches address the problem in a restricted setting, without appropriately simulating real application scenarios. To mimic the realistic setting, in this track, we adopt large-scale sketches drawn by amateurs of different levels of drawing skills, as well as a variety of 3D shapes including not only CAD models but also models scanned from real objects. We define two SBSR tasks and construct two benchmarks consisting of more than 46,000 CAD models, 1,700 realistic models, and 145,000 sketches in total. Four teams participated in this track and submitted 15 runs for the two tasks, evaluated by 7 commonly-adopted metrics. We hope that, the benchmarks, the comparative results, and the open-sourced evaluation code will foster future research in this direction among the 3D object retrieval community.
△ Less
Submitted 11 July, 2022;
originally announced July 2022.
-
A Methodology and Software Architecture to Support Explainability-by-Design
Authors:
Trung Dong Huynh,
Niko Tsakalakis,
Ayah Helal,
Sophie Stalla-Bourdillon,
Luc Moreau
Abstract:
Algorithms play a crucial role in many technological systems that control or affect various aspects of our lives. As a result, providing explanations for their decisions to address the needs of users and organisations is increasingly expected by laws, regulations, codes of conduct, and the public. However, as laws and regulations do not prescribe how to meet such expectations, organisations are of…
▽ More
Algorithms play a crucial role in many technological systems that control or affect various aspects of our lives. As a result, providing explanations for their decisions to address the needs of users and organisations is increasingly expected by laws, regulations, codes of conduct, and the public. However, as laws and regulations do not prescribe how to meet such expectations, organisations are often left to devise their own approaches to explainability, inevitably increasing the cost of compliance and good governance. Hence, we envision Explainability-by-Design, a holistic methodology characterised by proactive measures to include explanation capability in the design of decision-making systems. The methodology consists of three phases: (A) Explanation Requirement Analysis, (B) Explanation Technical Design, and (C) Explanation Validation. This paper describes phase (B), a technical workflow to implement explanation capability from requirements elicited by domain experts for a specific application context. Outputs of this phase are a set of configurations, allowing a reusable explanation service to exploit logs provided by the target application to create provenance traces of the application's decisions. The provenance then can be queried to extract relevant data points, which can be used in explanation plans to construct explanations personalised to their consumers. Following the workflow, organisations can design their decision-making systems to produce explanations that meet the specified requirements. To facilitate the process, we present a software architecture with reusable components to incorporate the resulting explanation capability into an application. Finally, we applied the workflow to two application scenarios and measured the associated development costs. It was shown that the approach is tractable in terms of development time, which can be as low as two hours per sentence.
△ Less
Submitted 25 May, 2023; v1 submitted 13 June, 2022;
originally announced June 2022.
-
A taxonomy of explanations to support Explainability-by-Design
Authors:
Niko Tsakalakis,
Sophie Stalla-Bourdillon,
Trung Dong Huynh,
Luc Moreau
Abstract:
As automated decision-making solutions are increasingly applied to all aspects of everyday life, capabilities to generate meaningful explanations for a variety of stakeholders (i.e., decision-makers, recipients of decisions, auditors, regulators...) become crucial. In this paper, we present a taxonomy of explanations that was developed as part of a holistic 'Explainability-by-Design' approach for…
▽ More
As automated decision-making solutions are increasingly applied to all aspects of everyday life, capabilities to generate meaningful explanations for a variety of stakeholders (i.e., decision-makers, recipients of decisions, auditors, regulators...) become crucial. In this paper, we present a taxonomy of explanations that was developed as part of a holistic 'Explainability-by-Design' approach for the purposes of the project PLEAD. The taxonomy was built with a view to produce explanations for a wide range of requirements stemming from a variety of regulatory frameworks or policies set at the organizational level either to translate high-level compliance requirements or to meet business needs. The taxonomy comprises nine dimensions. It is used as a stand-alone classifier of explanations conceived as detective controls, in order to aid supportive automated compliance strategies. A machinereadable format of the taxonomy is provided in the form of a light ontology and the benefits of starting the Explainability-by-Design journey with such a taxonomy are demonstrated through a series of examples.
△ Less
Submitted 9 June, 2022;
originally announced June 2022.
-
Product structure of graph classes with bounded treewidth
Authors:
Rutger Campbell,
Katie Clinch,
Marc Distel,
J. Pascal Gollin,
Kevin Hendrey,
Robert Hickingbotham,
Tony Huynh,
Freddie Illingworth,
Youri Tamitegama,
Jane Tan,
David R. Wood
Abstract:
We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with…
▽ More
We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with ${\text{tw}(H) \leq c}$ such that $G$ is isomorphic to a subgraph of ${H \boxtimes K_{f(\text{tw}(G))}}$. We introduce disjointed coverings of graphs and show they determine the underlying treewidth of any graph class. Using this result, we prove that the class of planar graphs has underlying treewidth 3; the class of $K_{s,t}$-minor-free graphs has underlying treewidth $s$ (for ${t \geq \max\{s,3\}}$); and the class of $K_t$-minor-free graphs has underlying treewidth ${t-2}$. In general, we prove that a monotone class has bounded underlying treewidth if and only if it excludes some fixed topological minor. We also study the underlying treewidth of graph classes defined by an excluded subgraph or excluded induced subgraph. We show that the class of graphs with no $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a subdivided star, and that the class of graphs with no induced $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a star.
△ Less
Submitted 17 July, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Detecting Rumours with Latency Guarantees using Massive Streaming Data
Authors:
Thanh Tam Nguyen,
Thanh Trung Huynh,
Hongzhi Yin,
Matthias Weidlich,
Thanh Thi Nguyen,
Thai Son Mai,
Quoc Viet Hung Nguyen
Abstract:
Today's social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social networks. Hence, in this paper, we argue for best…
▽ More
Today's social networks continuously generate massive streams of data, which provide a valuable starting point for the detection of rumours as soon as they start to propagate. However, rumour detection faces tight latency bounds, which cannot be met by contemporary algorithms, given the sheer volume of high-velocity streaming data emitted by social networks. Hence, in this paper, we argue for best-effort rumour detection that detects most rumours quickly rather than all rumours with a high delay. To this end, we combine techniques for efficient, graph-based matching of rumour patterns with effective load shedding that discards some of the input data while minimising the loss in accuracy. Experiments with large-scale real-world datasets illustrate the robustness of our approach in terms of runtime performance and detection accuracy under diverse streaming conditions.
△ Less
Submitted 13 May, 2022;
originally announced May 2022.
-
TaDeR: A New Task Dependency Recommendation for Project Management Platform
Authors:
Quynh Nguyen,
Dac H. Nguyen,
Son T. Huynh,
Hoa K. Dam,
Binh T. Nguyen
Abstract:
Many startups and companies worldwide have been using project management software and tools to monitor, track and manage their projects. For software projects, the number of tasks from the beginning to the end is quite a large number that sometimes takes a lot of time and effort to search and link the current task to a group of previous ones for further references. This paper proposes an efficient…
▽ More
Many startups and companies worldwide have been using project management software and tools to monitor, track and manage their projects. For software projects, the number of tasks from the beginning to the end is quite a large number that sometimes takes a lot of time and effort to search and link the current task to a group of previous ones for further references. This paper proposes an efficient task dependency recommendation algorithm to suggest tasks dependent on a given task that the user has just created. We present an efficient feature engineering step and construct a deep neural network to this aim. We performed extensive experiments on two different large projects (MDLSITE from moodle.org and FLUME from apache.org) to find the best features in 28 combinations of features and the best performance model using two embedding methods (GloVe and FastText). We consider three types of models (GRU, CNN, LSTM) using Accuracy@K, MRR@K, and Recall@K (where K = 1, 2, 3, and 5) and baseline models using traditional methods: TF-IDF with various matching score calculating such as cosine similarity, Euclidean distance, Manhattan distance, and Chebyshev distance. After many experiments, the GloVe Embedding and CNN model reached the best result in our dataset, so we chose this model as our proposed method. In addition, adding the time filter in the post-processing step can significantly improve the recommendation system's performance. The experimental results show that our proposed method can reach 0.2335 in Accuracy@1 and MRR@1 and 0.2011 in Recall@1 of dataset FLUME. With the MDLSITE dataset, we obtained 0.1258 in Accuracy@1 and MRR@1 and 0.1141 in Recall@1. In the top 5, our model reached 0.3040 in Accuracy@5, 0.2563 MRR@5, and 0.2651 Recall@5 in FLUME. In the MDLSITE dataset, our model got 0.5270 Accuracy@5, 0.2689 MRR@5, and 0.2651 Recall@5.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
FPSRS: A Fusion Approach for Paper Submission Recommendation System
Authors:
Son T. Huynh,
Nhi Dang,
Dac H. Nguyen,
Phong T. Huynh,
Binh T. Nguyen
Abstract:
Recommender systems have been increasingly popular in entertainment and consumption and are evident in academics, especially for applications that suggest submitting scientific articles to scientists. However, because of the various acceptance rates, impact factors, and rankings in different publishers, searching for a proper venue or journal to submit a scientific work usually takes a lot of time…
▽ More
Recommender systems have been increasingly popular in entertainment and consumption and are evident in academics, especially for applications that suggest submitting scientific articles to scientists. However, because of the various acceptance rates, impact factors, and rankings in different publishers, searching for a proper venue or journal to submit a scientific work usually takes a lot of time and effort. In this paper, we aim to present two newer approaches extended from our paper [13] presented at the conference IAE/AIE 2021 by employing RNN structures besides using Conv1D. In addition, we also introduce a new method, namely DistilBertAims, using DistillBert for two cases of uppercase and lower-case words to vectorize features such as Title, Abstract, and Keywords, and then use Conv1d to perform feature extraction. Furthermore, we propose a new calculation method for similarity score for Aim & Scope with other features; this helps keep the weights of similarity score calculation continuously updated and then continue to fit more data. The experimental results show that the second approach could obtain a better performance, which is 62.46% and 12.44% higher than the best of the previous study [13] in terms of the Top 1 accuracy.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
SimCPSR: Simple Contrastive Learning for Paper Submission Recommendation System
Authors:
Duc H. Le,
Tram T. Doan,
Son T. Huynh,
Binh T. Nguyen
Abstract:
The recommendation system plays a vital role in many areas, especially academic fields, to support researchers in submitting and increasing the acceptance of their work through the conference or journal selection process. This study proposes a transformer-based model using transfer learning as an efficient approach for the paper submission recommendation system. By combining essential information…
▽ More
The recommendation system plays a vital role in many areas, especially academic fields, to support researchers in submitting and increasing the acceptance of their work through the conference or journal selection process. This study proposes a transformer-based model using transfer learning as an efficient approach for the paper submission recommendation system. By combining essential information (such as the title, the abstract, and the list of keywords) with the aims and scopes of journals, the model can recommend the Top K journals that maximize the acceptance of the paper. Our model had developed through two states: (i) Fine-tuning the pre-trained language model (LM) with a simple contrastive learning framework. We utilized a simple supervised contrastive objective to fine-tune all parameters, encouraging the LM to learn the document representation effectively. (ii) The fine-tuned LM was then trained on different combinations of the features for the downstream task. This study suggests a more advanced method for enhancing the efficiency of the paper submission recommendation system compared to previous approaches when we respectively achieve 0.5173, 0.8097, 0.8862, 0.9496 for Top 1, 3, 5, and 10 accuracies on the test set for combining the title, abstract, and keywords as input features. Incorporating the journals' aims and scopes, our model shows an exciting result by getting 0.5194, 0.8112, 0.8866, and 0.9496 respective to Top 1, 3, 5, and 10.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
Fall detection using multimodal data
Authors:
Thao V. Ha,
Hoang Nguyen,
Son T. Huynh,
Trung T. Nguyen,
Binh T. Nguyen
Abstract:
In recent years, the occurrence of falls has increased and has had detrimental effects on older adults. Therefore, various machine learning approaches and datasets have been introduced to construct an efficient fall detection algorithm for the social community. This paper studies the fall detection problem based on a large public dataset, namely the UP-Fall Detection Dataset. This dataset was coll…
▽ More
In recent years, the occurrence of falls has increased and has had detrimental effects on older adults. Therefore, various machine learning approaches and datasets have been introduced to construct an efficient fall detection algorithm for the social community. This paper studies the fall detection problem based on a large public dataset, namely the UP-Fall Detection Dataset. This dataset was collected from a dozen of volunteers using different sensors and two cameras. We propose several techniques to obtain valuable features from these sensors and cameras and then construct suitable models for the main problem. The experimental results show that our proposed methods can bypass the state-of-the-art methods on this dataset in terms of accuracy, precision, recall, and F1 score.
△ Less
Submitted 12 May, 2022;
originally announced May 2022.
-
XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source
Authors:
Kiet Van Nguyen,
Phong Nguyen-Thuan Do,
Nhat Duy Nguyen,
Tin Van Huynh,
Anh Gia-Tuan Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models. A reader-based QA system is a high-level search engi…
▽ More
Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models. A reader-based QA system is a high-level search engine that can find correct answers to queries or questions in open-domain or domain-specific texts using machine reading comprehension (MRC) techniques. The majority of advancements in data resources and machine-learning approaches in the MRC and QA systems especially are developed significantly in two resource-rich languages such as English and Chinese. A low-resource language like Vietnamese has witnessed a scarcity of research on QA systems. This paper presents XLMRQA, the first Vietnamese QA system using a supervised transformer-based reader on the Wikipedia-based textual knowledge source (using the UIT-ViQuAD corpus), outperforming the two robust QA systems using deep neural network models: DrQA and BERTserini with 24.46% and 6.28%, respectively. From the results obtained on the three systems, we analyze the influence of question types on the performance of the QA systems.
△ Less
Submitted 13 August, 2022; v1 submitted 14 April, 2022;
originally announced April 2022.
-
In-Pocket 3D Graphs Enhance Ligand-Target Compatibility in Generative Small-Molecule Creation
Authors:
Seung-gu Kang,
Jeffrey K. Weber,
Joseph A. Morrone,
Leili Zhang,
Tien Huynh,
Wendy D. Cornell
Abstract:
Proteins in complex with small molecule ligands represent the core of structure-based drug discovery. However, three-dimensional representations are absent from most deep-learning-based generative models. We here present a graph-based generative modeling technology that encodes explicit 3D protein-ligand contacts within a relational graph architecture. The models combine a conditional variational…
▽ More
Proteins in complex with small molecule ligands represent the core of structure-based drug discovery. However, three-dimensional representations are absent from most deep-learning-based generative models. We here present a graph-based generative modeling technology that encodes explicit 3D protein-ligand contacts within a relational graph architecture. The models combine a conditional variational autoencoder that allows for activity-specific molecule generation with putative contact generation that provides predictions of molecular interactions within the target binding pocket. We show that molecules generated with our 3D procedure are more compatible with the binding pocket of the dopamine D2 receptor than those produced by a comparable ligand-based 2D generative method, as measured by docking scores, expected stereochemistry, and recoverability in commercial chemical databases. Predicted protein-ligand contacts were found among highest-ranked docking poses with a high recovery rate. This work shows how the structural context of a protein target can be used to enhance molecule generation.
△ Less
Submitted 5 April, 2022;
originally announced April 2022.
-
VLSP 2021 - ViMRC Challenge: Vietnamese Machine Reading Comprehension
Authors:
Kiet Van Nguyen,
Son Quoc Tran,
Luan Thanh Nguyen,
Tin Van Huynh,
Son T. Luu,
Ngan Luu-Thuy Nguyen
Abstract:
One of the emerging research trends in natural language understanding is machine reading comprehension (MRC) which is the task to find answers to human questions based on textual data. Existing Vietnamese datasets for MRC research concentrate solely on answerable questions. However, in reality, questions can be unanswerable for which the correct answer is not stated in the given textual data. To a…
▽ More
One of the emerging research trends in natural language understanding is machine reading comprehension (MRC) which is the task to find answers to human questions based on textual data. Existing Vietnamese datasets for MRC research concentrate solely on answerable questions. However, in reality, questions can be unanswerable for which the correct answer is not stated in the given textual data. To address the weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language. We use UIT-ViQuAD 2.0 as a benchmark dataset for the challenge on Vietnamese MRC at the Eighth Workshop on Vietnamese Language and Speech Processing (VLSP 2021). This task attracted 77 participant teams from 34 universities and other organizations. In this article, we present details of the organization of the challenge, an overview of the methods employed by shared-task participants, and the results. The highest performances are 77.24% in F1-score and 67.43% in Exact Match on the private test set. The Vietnamese MRC systems proposed by the top 3 teams use XLM-RoBERTa, a powerful pre-trained language model based on the transformer architecture. The UIT-ViQuAD 2.0 dataset motivates researchers to further explore the Vietnamese machine reading comprehension task and related tasks such as question answering, question generation, and natural language inference.
△ Less
Submitted 4 April, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Incomplete Knowledge Graph Alignment
Authors:
Vinh Van Tong,
Thanh Trung Huynh,
Thanh Tam Nguyen,
Hongzhi Yin,
Quoc Viet Hung Nguyen,
Quyet Thang Huynh
Abstract:
Knowledge graph (KG) alignment - the task of recognizing entities referring to the same thing in different KGs - is recognized as one of the most important operations in the field of KG construction and completion. However, existing alignment techniques often assume that the input KGs are complete and isomorphic, which is not true due to the real-world heterogeneity in the domain, size, and sparsi…
▽ More
Knowledge graph (KG) alignment - the task of recognizing entities referring to the same thing in different KGs - is recognized as one of the most important operations in the field of KG construction and completion. However, existing alignment techniques often assume that the input KGs are complete and isomorphic, which is not true due to the real-world heterogeneity in the domain, size, and sparsity. In this work, we address the problem of aligning incomplete KGs with representation learning. Our KG embedding framework exploits two feature channels: transitivity-based and proximity-based. The former captures the consistency constraints between entities via translation paths, while the latter captures the neighbourhood structure of KGs via attention guided relation-aware graph neural network. The two feature channels are jointly learned to exchange important features between the input KGs while enforcing the output representations of the input KGs in the same embedding space. Also, we develop a missing links detector that discovers and recovers the missing links in the input KGs during the training process, which helps mitigate the incompleteness issue and thus improve the compatibility of the learned representations. The embeddings then are fused to generate the alignment result, and the high-confidence matched node pairs are updated to the pre-aligned supervision data to improve the embeddings gradually. Empirical results show that our model is more accurate than the SOTA and is robust against different levels of incompleteness.
△ Less
Submitted 15 March, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures
Authors:
Minh Tam Pham,
Thanh Trung Huynh,
Van Vinh Tong,
Thanh Tam Nguyen,
Thanh Thi Nguyen,
Hongzhi Yin,
Quoc Viet Hung Nguyen
Abstract:
In recent years, visual forgery has reached a level of sophistication that humans cannot identify fraud, which poses a significant threat to information security. A wide range of malicious applications have emerged, such as fake news, defamation or blackmailing of celebrities, impersonation of politicians in political warfare, and the spreading of rumours to attract views. As a result, a rich body…
▽ More
In recent years, visual forgery has reached a level of sophistication that humans cannot identify fraud, which poses a significant threat to information security. A wide range of malicious applications have emerged, such as fake news, defamation or blackmailing of celebrities, impersonation of politicians in political warfare, and the spreading of rumours to attract views. As a result, a rich body of visual forensic techniques has been proposed in an attempt to stop this dangerous trend. In this paper, we present a benchmark that provides in-depth insights into visual forgery and visual forensics, using a comprehensive and empirical approach. More specifically, we develop an independent framework that integrates state-of-the-arts counterfeit generators and detectors, and measure the performance of these techniques using various criteria. We also perform an exhaustive analysis of the benchmarking results, to determine the characteristics of the methods that serve as a comparative reference in this never-ending war between measures and countermeasures.
△ Less
Submitted 7 April, 2022; v1 submitted 25 November, 2021;
originally announced November 2021.
-
Semi-supervised learning for medical image classification using imbalanced training data
Authors:
Tri Huynh,
Aiden Nibali,
Zhen He
Abstract:
Medical image classification is often challenging for two reasons: a lack of labelled examples due to expensive and time-consuming annotation protocols, and imbalanced class labels due to the relative scarcity of disease-positive individuals in the wider population. Semi-supervised learning (SSL) methods exist for dealing with a lack of labels, but they generally do not address the problem of clas…
▽ More
Medical image classification is often challenging for two reasons: a lack of labelled examples due to expensive and time-consuming annotation protocols, and imbalanced class labels due to the relative scarcity of disease-positive individuals in the wider population. Semi-supervised learning (SSL) methods exist for dealing with a lack of labels, but they generally do not address the problem of class imbalance. In this study we propose Adaptive Blended Consistency Loss (ABCL), a drop-in replacement for consistency loss in perturbation-based SSL methods. ABCL counteracts data skew by adaptively mixing the target class distribution of the consistency loss in accordance with class frequency. Our experiments with ABCL reveal improvements to unweighted average recall on two different imbalanced medical image classification datasets when compared with existing consistency losses that are not designed to counteract class imbalance.
△ Less
Submitted 19 August, 2021;
originally announced August 2021.
-
VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs
Authors:
Hieu T. Nguyen,
Hieu H. Pham,
Nghia T. Nguyen,
Ha Q. Nguyen,
Thang Q. Huynh,
Minh Dao,
Van Vu
Abstract:
Radiographs are used as the most important imaging tool for identifying spine anomalies in clinical practice. The evaluation of spinal bone lesions, however, is a challenging task for radiologists. This work aims at developing and evaluating a deep learning-based framework, named VinDr-SpineXR, for the classification and localization of abnormalities from spine X-rays. First, we build a large data…
▽ More
Radiographs are used as the most important imaging tool for identifying spine anomalies in clinical practice. The evaluation of spinal bone lesions, however, is a challenging task for radiologists. This work aims at developing and evaluating a deep learning-based framework, named VinDr-SpineXR, for the classification and localization of abnormalities from spine X-rays. First, we build a large dataset, comprising 10,468 spine X-ray images from 5,000 studies, each of which is manually annotated by an experienced radiologist with bounding boxes around abnormal findings in 13 categories. Using this dataset, we then train a deep learning classifier to determine whether a spine scan is abnormal and a detector to localize 7 crucial findings amongst the total 13. The VinDr-SpineXR is evaluated on a test set of 2,078 images from 1,000 studies, which is kept separate from the training set. It demonstrates an area under the receiver operating characteristic curve (AUROC) of 88.61% (95% CI 87.19%, 90.02%) for the image-level classification task and a mean average precision ([email protected]) of 33.56% for the lesion-level localization task. These results serve as a proof of concept and set a baseline for future research in this direction. To encourage advances, the dataset, codes, and trained deep learning models are made publicly available.
△ Less
Submitted 24 June, 2021;
originally announced June 2021.
-
Slack matrices, $k$-products, and $2$-level polytopes
Authors:
Manuel Aprile,
Michele Conforti,
Yuri Faenza,
Samuel Fiorini,
Tony Huynh,
Marco Macchia
Abstract:
In this paper, we study algorithmic questions concerning products of matrices and their consequences for recognition algorithms for polyhedra.
The 1-product of matrices $S_1$, $S_2$ is a matrix whose columns are the concatenation of each column of $S_1$ with each column of $S_2$. The $k$-product generalizes the $1$-product, by taking as input two matrices $S_1, S_2$ together with $k-1$ special r…
▽ More
In this paper, we study algorithmic questions concerning products of matrices and their consequences for recognition algorithms for polyhedra.
The 1-product of matrices $S_1$, $S_2$ is a matrix whose columns are the concatenation of each column of $S_1$ with each column of $S_2$. The $k$-product generalizes the $1$-product, by taking as input two matrices $S_1, S_2$ together with $k-1$ special rows of each of those matrices, and outputting a certain composition of $S_1,S_2$.
Our study is motivated by a close link between the 1-product of matrices and the Cartesian product of polytopes, and more generally between the $k$-product of matrices and the glued product of polytopes. These connections rely on the concept of slack matrix, which gives an algebraic representation of classes of affinely equivalent polytopes. The slack matrix recognition problem is the problem of determining whether a given matrix is a slack matrix. This is an intriguing problem whose complexity is unknown. Our algorithm reduces the problem to instances which cannot be expressed as $k$-products of smaller matrices.
In the second part of the paper, we give a combinatorial interpretation of $k$-products for two well-known classes of polytopes: 2-level matroid base polytopes and stable set polytopes of perfect graphs. We also show that the slack matrix recognition problem is polynomial-time solvable for such polytopes. Those two classes are special cases of $2$-level polytopes, for which we conjecture that the slack matrix recognition problem is polynomial-time solvable.
△ Less
Submitted 24 June, 2021;
originally announced June 2021.
-
Smaller extended formulations for spanning tree polytopes in minor-closed classes and beyond
Authors:
Manuel Aprile,
Samuel Fiorini,
Tony Huynh,
Gwenaël Joret,
David R. Wood
Abstract:
Let $G$ be a connected $n$-vertex graph in a proper minor-closed class $\mathcal G$. We prove that the extension complexity of the spanning tree polytope of $G$ is $O(n^{3/2})$. This improves on the $O(n^2)$ bounds following from the work of Wong (1980) and Martin (1991). It also extends a result of Fiorini, Huynh, Joret, and Pashkovich (2017), who obtained a $O(n^{3/2})$ bound for graphs embedded…
▽ More
Let $G$ be a connected $n$-vertex graph in a proper minor-closed class $\mathcal G$. We prove that the extension complexity of the spanning tree polytope of $G$ is $O(n^{3/2})$. This improves on the $O(n^2)$ bounds following from the work of Wong (1980) and Martin (1991). It also extends a result of Fiorini, Huynh, Joret, and Pashkovich (2017), who obtained a $O(n^{3/2})$ bound for graphs embedded in a fixed surface. Our proof works more generally for all graph classes admitting strongly sublinear balanced separators: We prove that for every constant $β$ with $0<β<1$, if $\mathcal G$ is a graph class closed under induced subgraphs such that all $n$-vertex graphs in $\mathcal G$ have balanced separators of size $O(n^β)$, then the extension complexity of the spanning tree polytope of every connected $n$-vertex graph in $\mathcal{G}$ is $O(n^{1+β})$. We in fact give two proofs of this result, one is a direct construction of the extended formulation, the other is via communication protocols. Using the latter approach we also give a short proof of the $O(n)$ bound for planar graphs due to Williams (2002).
△ Less
Submitted 2 December, 2021; v1 submitted 22 June, 2021;
originally announced June 2021.