Search | arXiv e-print repository

VLind-Bench: Measuring Language Priors in Large Vision-Language Models

Authors: Kang-il Lee, Minbeom Kim, Seunghyun Yoon, Minsung Kim, Dongryeol Lee, Hyukhun Koh, Kyomin Jung

Abstract: Large Vision-Language Models (LVLMs) have demonstrated outstanding performance across various multimodal tasks. However, they suffer from a problem known as language prior, where responses are generated based solely on textual patterns while disregarding image information. Addressing the issue of language prior is crucial, as it can lead to undesirable biases or hallucinations when dealing with images that are out of training distribution. Despite its importance, current methods for accurately measuring language priors in LVLMs are poorly studied. Although existing benchmarks based on counterfactual or out-of-distribution images can partially be used to measure language priors, they fail to disentangle language priors from other confounding factors. To this end, we propose a new benchmark called VLind-Bench, which is the first benchmark specifically designed to measure the language priors, or blindness, of LVLMs. It not only includes tests on counterfactual images to assess language priors but also involves a series of tests to evaluate more basic capabilities such as commonsense knowledge, visual perception, and commonsense biases. For each instance in our benchmark, we ensure that all these basic tests are passed before evaluating the language priors, thereby minimizing the influence of other factors on the assessment. The evaluation and analysis of recent LVLMs in our benchmark reveal that almost all models exhibit a significant reliance on language priors, presenting a strong challenge in the field.

Submitted 10 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

arXiv:2404.01628 [pdf, other]

Learning Equi-angular Representations for Online Continual Learning

Authors: Minhyuk Seo, Hyunseo Koh, Wonje Jeung, Minjae Lee, San Kim, Hankook Lee, Sungjun Cho, Sungik Choi, Hyunwoo Kim, Jonghyun Choi

Abstract: Online continual learning suffers from an underfitted solution due to insufficient training for prompt model update (e.g., single-epoch training). To address the challenge, we propose an efficient online continual learning method using the neural collapse phenomenon. In particular, we induce neural collapse to form a simplex equiangular tight frame (ETF) structure in the representation space so that the continuously learned model with a single epoch can better fit to the streamed data by proposing preparatory data training and residual correction in the representation space. With an extensive set of empirical validations using CIFAR-10/100, TinyImageNet, ImageNet-200, and ImageNet-1K, we show that our proposed method outperforms state-of-the-art methods by a noticeable margin in various online continual learning scenarios such as disjoint and Gaussian scheduled continuous (i.e., boundary-free) data setups.

Submitted 2 April, 2024; originally announced April 2024.

Comments: CVPR 2024

arXiv:2402.06900 [pdf, other]

Can LLMs Recognize Toxicity? Definition-Based Toxicity Metric

Authors: Hyukhun Koh, Dohyung Kim, Minwoo Lee, Kyomin Jung

Abstract: In the pursuit of developing Large Language Models (LLMs) that adhere to societal standards, it is imperative to detect the toxicity in the generated text. The majority of existing toxicity metrics rely on encoder models trained on specific toxicity datasets, which are susceptible to out-of-distribution (OOD) problems and depend on the dataset's definition of toxicity. In this paper, we introduce a robust metric grounded on LLMs to flexibly measure toxicity according to the given definition. We first analyze the toxicity factors, followed by an examination of the intrinsic toxic attributes of LLMs to ascertain their suitability as evaluators. Finally, we evaluate the performance of our metric with detailed analysis. Our empirical results demonstrate outstanding performance in measuring toxicity within verified factors, improving on conventional metrics by 12 points in the F1 score. Our findings also indicate that upstream toxicity significantly influences downstream metrics, suggesting that LLMs are unsuitable for toxicity evaluations within unverified factors.

Submitted 18 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

Comments: 8 pages

arXiv:2401.05800 [pdf, other]

Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values

Authors: Yu Zheng, Huan Yee Koh, Ming Jin, Lianhua Chi, Haishuai Wang, Khoa T. Phan, Yi-Ping Phoebe Chen, Shirui Pan, Wei Xiang

Abstract: The detection of anomalies in multivariate time series data is crucial for various practical applications, including smart power grids, traffic flow forecasting, and industrial process control. However, real-world time series data is usually not well-structured, posing significant challenges to existing approaches: (1) The existence of missing values in multivariate time series data along variable and time dimensions hinders the effective modeling of interwoven spatial and temporal dependencies, resulting in important patterns being overlooked during model training; (2) Anomaly scoring with irregularly-sampled observations is less explored, making it difficult to use existing detectors for multivariate series without fully-observed values. In this work, we introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to tackle the aforementioned challenges in detecting anomalies on irregularly-sampled multivariate time series. Our approach comprises two main components. First, we propose a graph spatiotemporal process based on neural controlled differential equations. This process enables effective modeling of multivariate time series from both spatial and temporal perspectives, even when the data contains missing values. Second, we present a novel distribution-based anomaly scoring mechanism that alleviates the reliance on complete uniform observations. By analyzing the predictions of the graph spatiotemporal process, our approach allows anomalies to be easily detected. Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods, regardless of whether there are missing values present in the data. Our code is available: https://github.com/huankoh/GST-Pro.

Submitted 11 January, 2024; originally announced January 2024.

Comments: Accepted by Information Fusion

arXiv:2312.03000 [pdf, other]

VidereX: A Navigational Application inspired by ants

Authors: Nam Ho Koh, Doran Amos, Paul Graham, Andrew Philippides

Abstract: Navigation is a crucial element in any person's life, whether for work, education, social living or any other miscellaneous reason; naturally, the importance of it is universally recognised and valued. One of the critical components of navigation is vision, which facilitates movement from one place to another. Navigating unfamiliar settings, especially for the blind or visually impaired, can pose significant challenges, impacting their independence and quality of life. Current assistive travel solutions have shortcomings, including GPS limitations and a demand for an efficient, user-friendly, and portable model. Addressing these concerns, this paper presents VidereX: a smartphone-based solution using an ant-inspired navigation algorithm. Emulating ants' ability to learn a route between nest and feeding grounds after a single traversal, VidereX enables users to rapidly acquire navigational data using a one/few-shot learning strategy. A key component of VidereX is its emphasis on active user engagement. Like ants with a scanning behaviour to actively investigate their environment, users wield the camera, actively exploring the visual landscape. Far from the passive reception of data, this process constitutes a dynamic exploration, echoing nature's navigational mechanisms.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 6 pages, 7 figures, Workshop on Rapid and Robust Robotic Active Learning (R3AL) - Robotics: Science and Systems 2023 (RSS 2023)

arXiv:2311.07343 [pdf, other]

Fine-Tuning the Retrieval Mechanism for Tabular Deep Learning

Authors: Felix den Breejen, Sangmin Bae, Stephen Cha, Tae-Young Kim, Seoung Hyun Koh, Se-Young Yun

Abstract: While interest in tabular deep learning has grown significantly, conventional tree-based models still outperform deep learning methods. To narrow this performance gap, we explore the innovative retrieval mechanism, a methodology that allows neural networks to refer to other data points while making predictions. Our experiments reveal that retrieval-based training, especially when fine-tuning the pretrained TabPFN model, notably surpasses existing methods. Moreover, the extensive pretraining plays a crucial role in enhancing the performance of the model. These insights imply that blending the retrieval mechanism with pretraining and transfer learning schemes offers considerable potential for advancing the field of tabular deep learning.

Submitted 13 November, 2023; originally announced November 2023.

Comments: Table Representation Learning Workshop at NeurIPS 2023

arXiv:2310.14663 [pdf, other]

DPP-TTS: Diversifying prosodic features of speech via determinantal point processes

Authors: Seongho Joo, Hyukhun Koh, Kyomin Jung

Abstract: With the rapid advancement in deep generative models, recent neural Text-To-Speech (TTS) models have succeeded in synthesizing human-like speech. There have been some efforts to generate speech with various prosody beyond monotonous prosody patterns. However, previous works have several limitations. First, typical TTS models depend on the scaled sampling temperature for boosting the diversity of prosody. Speech samples generated at high sampling temperatures often lack perceptual prosodic diversity, which can adversely affect the naturalness of the speech. Second, the diversity among samples is neglected since the sampling procedure often focuses on a single speech sample rather than multiple ones. In this paper, we propose DPP-TTS: a text-to-speech model based on Determinantal Point Processes (DPPs) with a prosody diversifying module. Our TTS model is capable of generating speech samples that simultaneously consider perceptual diversity in each sample and among multiple samples. We demonstrate that DPP-TTS generates speech samples with more diversified prosody than baselines in the side-by-side comparison test considering the naturalness of speech at the same time.

Submitted 23 October, 2023; originally announced October 2023.

Comments: EMNLP 2023

arXiv:2310.07984 [pdf]

Large Language Models for Scientific Synthesis, Inference and Explanation

Authors: Yizhen Zheng, Huan Yee Koh, Jiaxin Ju, Anh T. N. Nguyen, Lauren T. May, Geoffrey I. Webb, Shirui Pan

Abstract: Large language models are a form of artificial intelligence systems whose primary knowledge consists of the statistical patterns, semantic relationships, and syntactical structures of language. Despite their limited forms of "knowledge", these systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization, and computer code generation. However, they have yet to demonstrate advanced applications in natural science. Here we show how large language models can perform scientific synthesis, inference, and explanation. We present a method for using general-purpose large language models to make inferences from scientific datasets of the form usually associated with special-purpose machine learning algorithms. We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature. When a conventional machine learning system is augmented with this synthesized and inferred knowledge it can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. This approach has the further advantage that the large language model can explain the machine learning system's predictions. We anticipate that our framework will open new avenues for AI to accelerate the pace of scientific discovery.

Submitted 11 October, 2023; originally announced October 2023.

Comments: Supplementary Information: https://drive.google.com/file/d/1KrpUpzuFTeMx6a6zl18lqdo8vV-UUa1Z/view?usp=sharing; Github Repo: https://github.com/zyzisastudyreallyhardguy/LLM4SD

arXiv:2307.08390 [pdf, other]

doi 10.1109/TNNLS.2023.3325667

Correlation-aware Spatial-Temporal Graph Learning for Multivariate Time-series Anomaly Detection

Authors: Yu Zheng, Huan Yee Koh, Ming Jin, Lianhua Chi, Khoa T. Phan, Shirui Pan, Yi-Ping Phoebe Chen, Wei Xiang

Abstract: Multivariate time-series anomaly detection is critically important in many applications, including retail, transportation, power grid, and water treatment plants. Existing approaches for this problem mostly employ either statistical models which cannot capture the non-linear relations well or conventional deep learning models (e.g., CNN and LSTM) that do not explicitly learn the pairwise correlations among variables. To overcome these limitations, we propose a novel method, correlation-aware spatial-temporal graph learning (termed CST-GL), for time series anomaly detection. CST-GL explicitly captures the pairwise correlations via a multivariate time series correlation learning module based on which a spatial-temporal graph neural network (STGNN) can be developed. Then, by employing a graph convolution network that exploits one- and multi-hop neighbor information, our STGNN component can encode rich spatial information from complex pairwise dependencies between variables. With a temporal module that consists of dilated convolutional functions, the STGNN can further capture long-range dependence over time. A novel anomaly scoring component is further integrated into CST-GL to estimate the degree of an anomaly in a purely unsupervised manner. Experimental results demonstrate that CST-GL can detect anomalies effectively in general settings as well as enable early detection across different time delays.

Submitted 16 November, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: 17 pages, double columns, 10 tables, 3 figures. Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

arXiv:2307.03759 [pdf, other]

A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection

Authors: Ming Jin, Huan Yee Koh, Qingsong Wen, Daniele Zambon, Cesare Alippi, Geoffrey I. Webb, Irwin King, Shirui Pan

Abstract: Time series are the primary data type used to record dynamic system measurements and generated in great volume by both physical sensors and online processes (virtual sensors). Time series analytics is therefore crucial to unlocking the wealth of information implicit in available data. With the recent advancements in graph neural networks (GNNs), there has been a surge in GNN-based approaches for time series analysis. These approaches can explicitly model inter-temporal and inter-variable relationships, which traditional and other deep neural network-based methods struggle to do. In this survey, we provide a comprehensive review of graph neural networks for time series analysis (GNN4TS), encompassing four fundamental dimensions: forecasting, classification, anomaly detection, and imputation. Our aim is to guide designers and practitioners to understand, build applications, and advance research of GNN4TS. At first, we provide a comprehensive task-oriented taxonomy of GNN4TS. Then, we present and discuss representative research works and introduce mainstream applications of GNN4TS. A comprehensive discussion of potential future research directions completes the survey. This survey, for the first time, brings together a vast array of knowledge on GNN-based time series research, highlighting foundations, practical applications, and opportunities of graph neural networks for time series analysis.

Submitted 9 August, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: Ongoing work; 27 pages, 6 figures, 5 tables; Github page: https://github.com/KimMeen/Awesome-GNN4TS

arXiv:2305.14016 [pdf, other]

Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation

Authors: Minwoo Lee, Hyukhun Koh, Kang-il Lee, Dongdong Zhang, Minsung Kim, Kyomin Jung

Abstract: Gender bias is a significant issue in machine translation, leading to ongoing research efforts in developing bias mitigation techniques. However, most works focus on debiasing bilingual models without much consideration for multilingual systems. In this paper, we specifically target the gender bias issue of multilingual machine translation models for unambiguous cases where there is a single correct translation, and propose a bias mitigation method based on a novel approach. Specifically, we propose Gender-Aware Contrastive Learning, GACL, which encodes contextual gender information into the representations of non-explicit gender words. Our method is target language-agnostic and is applicable to pre-trained multilingual machine translation models via fine-tuning. Through multilingual evaluation, we show that our approach improves gender accuracy by a wide margin without hampering translation performance. We also observe that incorporated gender information transfers and benefits other target languages regarding gender accuracy. Finally, we demonstrate that our method is applicable and beneficial to models of various sizes.

Submitted 9 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: Accepted to EMNLP 2023 Main Conference

arXiv:2305.10407 [pdf, other]

BAD: BiAs Detection for Large Language Models in the context of candidate screening

Authors: Nam Ho Koh, Joseph Plata, Joyce Chai

Abstract: Application Tracking Systems (ATS) have allowed talent managers, recruiters, and college admissions committees to process large volumes of potential candidate applications efficiently. Traditionally, this screening process was conducted manually, creating major bottlenecks due to the quantity of applications and introducing many instances of human bias. The advent of large language models (LLMs) such as ChatGPT and the potential of adopting methods to current automated application screening raises additional bias and fairness issues that must be addressed. In this project, we wish to identify and quantify the instances of social bias in ChatGPT and other OpenAI LLMs in the context of candidate screening in order to demonstrate how the use of these models could perpetuate existing biases and inequalities in the hiring process.

Submitted 17 May, 2023; originally announced May 2023.

Comments: 12 pages, 6 figures

MSC Class: I.2; I.2.7

ACM Class: F.2.2, I.2.7

arXiv:2303.13099 [pdf, other]

Multi-View Zero-Shot Open Intent Induction from Dialogues: Multi Domain Batch and Proxy Gradient Transfer

Authors: Hyukhun Koh, Haesung Pyun, Nakyeong Yang, Kyomin Jung

Abstract: In Task Oriented Dialogue (TOD) systems, detecting and inducing new intents are two main challenges to apply the system in the real world. In this paper, we suggest the semantic multi-view model to resolve these two challenges: (1) SBERT for General Embedding (GE), (2) Multi Domain Batch (MDB) for dialogue domain knowledge, and (3) Proxy Gradient Transfer (PGT) for cluster-specialized semantic. MDB feeds diverse dialogue datasets to the model at once to tackle the multi-domain problem by learning the multiple domain knowledge. We introduce a novel method PGT, which employs the Siamese network to fine-tune the model with a clustering method directly. Our model can learn how to cluster dialogue utterances by using PGT. Experimental results demonstrate that our multi-view model with MDB and PGT significantly improves the Open Intent Induction performance compared to baseline systems.

Submitted 13 August, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: 8 pages, 3 figures, SIGDIAL DSTC 2023 workshop

arXiv:2303.04623 [pdf]

Continuous Function Structured in Multilayer Perceptron for Global Optimization

Authors: Heeyuen Koh

Abstract: The gradient information of multilayer perceptron with a linear neuron is modified with functional derivative for the global minimum search benchmarking problems. From this approach, we show that the landscape of the gradient derived from given continuous function using functional derivative can be the MLP-like form with ax+b neurons. To this extent, the suggested algorithm improves the availability of the optimization process to deal with all the parameters in the problem set simultaneously. The functionality of this method could be improved through intentionally designed convex function with Kullback-Leibler divergence applied to cost value as well.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2302.13696 [pdf, other]

Moderate Adaptive Linear Units (MoLU)

Authors: Hankyul Koh, Joon-hyuk Ko, Wonho Jhe

Abstract: We propose a new high-performance activation function, Moderate Adaptive Linear Units (MoLU), for the deep neural network. The MoLU is a simple, beautiful and powerful activation function that can be a good main activation function among hundreds of activation functions. Because the MoLU is made up of the elementary functions, not only is it an infinite diffeomorphism (i.e. smooth and infinitely differentiable over whole domains), but it also decreases training time.

Submitted 10 June, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 4 pages, 5 figures

arXiv:2210.16732 [pdf, other]

How Far are We from Robust Long Abstractive Summarization?

Authors: Huan Yee Koh, Jiaxin Ju, He Zhang, Ming Liu, Shirui Pan

Abstract: Abstractive summarization has made tremendous progress in recent years. In this work, we perform fine-grained human annotations to evaluate long document abstractive summarization systems (i.e., models and metrics) with the aim of implementing them to generate reliable summaries. For long document abstractive models, we show that the constant striving for state-of-the-art ROUGE results can lead us to generate more relevant summaries but not factual ones. For long document evaluation metrics, human evaluation results show that ROUGE remains the best at evaluating the relevancy of a summary. It also reveals important limitations of factuality metrics in detecting different types of factual errors and the reasons behind the effectiveness of BARTScore. We then suggest promising directions in the endeavor of developing factual consistency metrics. Finally, we release our annotated long document dataset with the hope that it can contribute to the development of metrics across a broader range of summarization settings.

Submitted 29 October, 2022; originally announced October 2022.

Comments: EMNLP 2022

arXiv:2210.13399 [pdf, other]

doi 10.1145/3491102.3517595

Does Mode of Digital Contact Tracing Affect User Willingness to Share Information? A Quantitative Study

Authors: Camellia Zakaria, Pin Sym Foong, Chang Siang Lim, Pavithren V. S. Pakianathan, Gerald Huat Choon Koh, Simon Tangi Perrault

Abstract: Digital contact tracing can limit the spread of infectious diseases. Nevertheless, there remain barriers to attaining sufficient adoption. In this study, we investigate how willingness to participate in contact tracing is affected by two critical factors: the modes of data collection and the type of data collected. We conducted a scenario-based survey study among 220 respondents in the United States (U.S.) to understand their perceptions about contact tracing associated with automated and manual contact tracing methods. The findings indicate a promising use of smartphones and a combination of public health officials and medical health records as information sources. Through a quantitative analysis, we describe how different modalities and individual demographic factors may affect user compliance in providing four key pieces of information to contact tracing.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 18 pages, 11 figures, 13 tables

Journal ref: In CHI Conference on Human Factors in Computing Systems, pp. 1-18. 2022

arXiv:2210.01407 [pdf, other]

Homotopy-based training of NeuralODEs for accurate dynamics discovery

Authors: Joon-Hyuk Ko, Hankyul Koh, Nojun Park, Wonho Jhe

Abstract: Neural Ordinary Differential Equations (NeuralODEs) present an attractive way to extract dynamical laws from time series data, as they bridge neural networks with the differential equation-based modeling paradigm of the physical sciences. However, these models often display long training times and suboptimal results, especially for longer duration data. While a common strategy in the literature imposes strong constraints to the NeuralODE architecture to inherently promote stable model dynamics, such methods are ill-suited for dynamics discovery as the unknown governing equation is not guaranteed to satisfy the assumed constraints. In this paper, we develop a new training method for NeuralODEs, based on synchronization and homotopy optimization, that does not require changes to the model architecture. We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape, which homotopy optimization can then leverage to enhance training. Through benchmark experiments, we demonstrate our method achieves competitive or better training loss while often requiring less than half the number of training epochs compared to other model-agnostic techniques. Furthermore, models trained with our method display better extrapolation capabilities, highlighting the effectiveness of our method.

Submitted 23 January, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

Comments: 10 pages, 5 figures, accepted at NeurIPS2023 (https://neurips.cc/virtual/2023/poster/70313)

Journal ref: Joon-Hyuk Ko, Hankyul Koh, Nojun Park, and Wonho Jhe. Advances in Neural Information Processing Systems (2023)

arXiv:2207.00939 [pdf, other]

doi 10.1145/3545176

An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics

Authors: Huan Yee Koh, Jiaxin Ju, Ming Liu, Shirui Pan

Abstract: Long documents such as academic articles and business reports have been the standard format to detail out important issues and complicated subjects that require extra attention. An automatic summarization system that can effectively condense long documents into short and concise texts to encapsulate the most important information would thus be significant in aiding the reader's comprehension. Recently, with the advent of neural architectures, significant research efforts have been made to advance automatic text summarization systems, and numerous studies on the challenges of extending these systems to the long document domain have emerged. In this survey, we provide a comprehensive overview of the research on long document summarization and a systematic evaluation across the three principal components of its research setting: benchmark datasets, summarization models, and evaluation metrics. For each component, we organize the literature within the context of long document summarization and conduct an empirical analysis to broaden the perspective on current research progress. The empirical analysis includes a study on the intrinsic characteristics of benchmark datasets, a multi-dimensional analysis of summarization models, and a review of the summarization evaluation metrics. Based on the overall findings, we conclude by proposing possible directions for future exploration in this rapidly growing field.

Submitted 2 July, 2022; originally announced July 2022.

Comments: Accepted for publication by ACM Computing Surveys

arXiv:2203.15355 [pdf, other]

Online Continual Learning on a Contaminated Data Stream with Blurry Task Boundaries

Authors: Jihwan Bang, Hyunseo Koh, Seulki Park, Hwanjun Song, Jung-Woo Ha, Jonghyun Choi

Abstract: Learning under a continuously changing data distribution with incorrect labels is a desirable real-world problem yet challenging. A large body of continual learning (CL) methods, however, assumes data streams with clean labels, and online learning scenarios under noisy data streams are yet underexplored. We consider a more practical CL task setup of an online learning from blurry data stream with corrupted labels, where existing CL methods struggle. To address the task, we first argue the importance of both diversity and purity of examples in the episodic memory of continual learning models. To balance diversity and purity in the episodic memory, we propose a novel strategy to manage and use the memory by a unified approach of label noise aware diverse sampling and robust learning with semi-supervised learning. Our empirical validations on four real-world or synthetic noise datasets (CIFAR10 and 100, mini-WebVision, and Food-101N) exhibit that our method significantly outperforms prior arts in this realistic and challenging continual learning scenario. Code and data splits are available in https://github.com/clovaai/puridiver.

Submitted 30 March, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: Accepted paper at CVPR 2022

arXiv:2110.10031 [pdf, other]

Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference

Authors: Hyunseo Koh, Dahyun Kim, Jung-Woo Ha, Jonghyun Choi

Abstract: Despite rapid advances in continual learning, a large body of research is devoted to improving performance in the existing setups. While a handful of work do propose new continual learning setups, they still lack practicality in certain aspects. For better practicality, we first propose a novel continual learning setup that is online, task-free, class-incremental, of blurry task boundaries and subject to inference queries at any moment. We additionally propose a new metric to better measure the performance of the continual learning methods subject to inference queries at any moment. To address the challenging setup and evaluation protocol, we propose an effective method that employs a new memory management scheme and novel learning techniques. Our empirical validation demonstrates that the proposed method outperforms prior arts by large margins. Code and data splits are available at https://github.com/naver-ai/i-Blurry.

Submitted 21 March, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

Comments: to appear in ICLR2022

arXiv:2110.01280 [pdf, other]

Leveraging Information Bottleneck for Scientific Document Summarization

Authors: Jiaxin Ju, Ming Liu, Huan Yee Koh, Yuan Jin, Lan Du, Shirui Pan

Abstract: This paper presents an unsupervised extractive approach to summarize scientific long documents based on the Information Bottleneck principle. Inspired by previous work which uses the Information Bottleneck principle for sentence compression, we extend it to document level summarization with two separate steps. In the first step, we use signal(s) as queries to retrieve the key content from the source document. Then, a pre-trained language model conducts further sentence search and edit to return the final extracted summaries. Importantly, our work can be flexibly extended to a multi-view framework by different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. The further human evaluation suggests that the extracted summaries cover more content aspects than previous systems.

Submitted 4 October, 2021; originally announced October 2021.

Comments: Accepted at EMNLP 2021 Findings

arXiv:2102.01033 [pdf, other]

Scalable, End-to-End, Deep-Learning-Based Data Reconstruction Chain for Particle Imaging Detectors

Authors: Francois Drielsma, Kazuhiro Terao, Laura Dominé, Dae Heun Koh

Abstract: Recent inroads in Computer Vision (CV) and Machine Learning (ML) have motivated a new approach to the analysis of particle imaging detector data. Unlike previous efforts which tackled isolated CV tasks, this paper introduces an end-to-end, ML-based data reconstruction chain for Liquid Argon Time Projection Chambers (LArTPCs), the state-of-the-art in precision imaging at the intensity frontier of neutrino physics. The chain is a multi-task network cascade which combines voxel-level feature extraction using Sparse Convolutional Neural Networks and particle superstructure formation using Graph Neural Networks. Each algorithm incorporates physics-informed inductive biases, while their collective hierarchy is used to enforce a causal structure. The output is a comprehensive description of an event that may be used for high-level physics inference. The chain is end-to-end optimizable, eliminating the need for time-intensive manual software adjustments. It is also the first implementation to handle the unprecedented pile-up of dozens of high energy neutrino interactions, expected in the 3D-imaging LArTPC of the Deep Underground Neutrino Experiment. The chain is trained as a whole and its performance is assessed at each step using an open simulated data set.

Submitted 1 February, 2021; originally announced February 2021.

Comments: Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020), Vancouver, Canada

arXiv:2007.06022 [pdf, other]

doi 10.1109/JIOT.2020.3028368

Blockchain for the Internet of Vehicles towards Intelligent Transportation Systems: A Survey

Authors: Muhammad Baqer Mollah, Jun Zhao, Dusit Niyato, Yong Liang Guan, Chau Yuen, Sumei Sun, Kwok-Yan Lam, Leong Hai Koh

Abstract: Internet of Vehicles (IoV) is an emerging concept that is believed to help realise the vision of intelligent transportation systems (ITS). IoV has become an important research area of impactful applications in recent years due to the rapid advancements in vehicular technologies, high throughput satellite communication, Internet of Things and cyber-physical systems. IoV enables the integration of smart vehicles with the Internet and system components attributing to their environment such as public infrastructures, sensors, computing nodes, pedestrians and other vehicles. By allowing the development of a common information exchange platform between vehicles and heterogeneous vehicular networks, this integration aims to create a better environment and public space to the people as well as to enhance safety for all road users. Being a participatory data exchange and storage, the underlying information exchange platform of IoV needs to be secure, transparent and immutable in order to achieve the intended objectives of ITS. In this connection, the adoption of blockchain as a system platform for supporting the information exchange needs of IoV has been explored. Due to their decentralized and immutable nature, IoV applications enabled by blockchain are believed to have a number of desirable properties such as decentralization, security, transparency, immutability, and automation. In this paper, we present a contemporary survey on the latest advancement in blockchain for IoV. Particularly, we highlight the different application scenarios of IoV after carefully reviewing the recent literatures. We also investigate several key challenges where blockchain is applied in IoV. Furthermore, we present the future opportunities and explore further research directions of IoV as a key enabler of ITS.

Submitted 2 October, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: 28 Pages, 17 Figures, 4 tables

Journal ref: IEEE Internet of Things Journal 2020

arXiv:2007.03083 [pdf, other]

Scalable, Proposal-free Instance Segmentation Network for 3D Pixel Clustering and Particle Trajectory Reconstruction in Liquid Argon Time Projection Chambers

Authors: Dae Heun Koh, Pierre Côte de Soux, Laura Dominé, François Drielsma, Ran Itay, Qing Lin, Kazuhiro Terao, Ka Vang Tsang, Tracy Usher

Abstract: Liquid Argon Time Projection Chambers (LArTPCs) are high resolution particle imaging detectors, employed by accelerator-based neutrino oscillation experiments for high precision physics measurements. While images of particle trajectories are intuitive to analyze for physicists, the development of a high quality, automated data reconstruction chain remains challenging. One of the most critical reconstruction steps is particle clustering: the task of grouping 3D image pixels into different particle instances that share the same particle type. In this paper, we propose the first scalable deep learning algorithm for particle clustering in LArTPC data using sparse convolutional neural networks (SCNN). Building on previous works on SCNNs and proposal free instance segmentation, we build an end-to-end trainable instance segmentation network that learns an embedding of the image pixels to perform point cloud clustering in a transformed space. We benchmark the performance of our algorithm on PILArNet, a public 3D particle imaging dataset, with respect to common clustering evaluation metrics. 3D pixels were successfully clustered into individual particle trajectories with 90% of them having an adjusted Rand index score greater than 92% with a mean pixel clustering efficiency and purity above 96%. This work contributes to the development of an end-to-end optimizable full data reconstruction chain for LArTPCs, in particular pixel-based 3D imaging detectors including the near detector of the Deep Underground Neutrino Experiment. Our algorithm is made available in the open access repository, and we share our Singularity software container, which can be used to reproduce our work on the dataset.

Submitted 6 July, 2020; originally announced July 2020.

arXiv:2007.01335 [pdf, other]

Clustering of Electromagnetic Showers and Particle Interactions with Graph Neural Networks in Liquid Argon Time Projection Chambers Data

Authors: Francois Drielsma, Qing Lin, Pierre Côte de Soux, Laura Dominé, Ran Itay, Dae Heun Koh, Bradley J. Nelson, Kazuhiro Terao, Ka Vang Tsang, Tracy L. Usher

Abstract: Liquid Argon Time Projection Chambers (LArTPCs) are a class of detectors that produce high resolution images of charged particles within their sensitive volume. In these images, the clustering of distinct particles into superstructures is of central importance to the current and future neutrino physics program. Electromagnetic (EM) activity typically exhibits spatially detached fragments of varying morphology and orientation that are challenging to efficiently assemble using traditional algorithms. Similarly, particles that are spatially removed from each other in the detector may originate from a common interaction. Graph Neural Networks (GNNs) were developed in recent years to find correlations between objects embedded in an arbitrary space. The Graph Particle Aggregator (GrapPA) first leverages GNNs to predict the adjacency matrix of EM shower fragments and to identify the origin of showers, i.e. primary fragments. On the PILArNet public LArTPC simulation dataset, the algorithm achieves a shower clustering accuracy characterized by a mean adjusted Rand index (ARI) of 97.8 % and a primary identification accuracy of 99.8 %. It yields a relative shower energy resolution of $(4.1+1.4/\sqrt{E (\text{GeV})})\,\%$ and a shower direction resolution of $(2.1/\sqrt{E(\text{GeV})})^{\circ}$. The optimized algorithm is then applied to the related task of clustering particle instances into interactions and yields a mean ARI of 99.2 % for an interaction density of $\sim\mathcal{O}(1)\,m^{-3}$.

Submitted 14 December, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

arXiv:2006.14745 [pdf, other]

doi 10.1103/PhysRevD.104.032004

Point Proposal Network for Reconstructing 3D Particle Endpoints with Sub-Pixel Precision in Liquid Argon Time Projection Chambers

Authors: Laura Dominé, Pierre Côte de Soux, François Drielsma, Dae Heun Koh, Ran Itay, Qing Lin, Kazuhiro Terao, Ka Vang Tsang, Tracy L. Usher

Abstract: Liquid Argon Time Projection Chambers (LArTPC) are particle imaging detectors recording 2D or 3D images of trajectories of charged particles. Identifying points of interest in these images, namely the initial and terminal points of track-like particle trajectories such as muons and protons, and the initial points of electromagnetic shower-like particle trajectories such as electrons and gamma rays, is a crucial step of identifying and analyzing these particles and impacts the inference of physics signals such as neutrino interaction. The Point Proposal Network is designed to discover these specific points of interest. The algorithm predicts with a sub-voxel precision their spatial location, and also determines the category of the identified points of interest. Using as a benchmark the PILArNet public LArTPC data sample in which the voxel resolution is 3mm/voxel, our algorithm successfully predicted 96.8% and 97.8% of 3D points within a distance of 3 and 10 voxels from the provided true point locations respectively. For the predicted 3D points within 3 voxels of the closest true point locations, the median distance is found to be 0.25 voxels, achieving the sub-voxel level precision. In addition, we report our analysis of the mistakes where our algorithm prediction differs from the provided true point positions by more than 10 voxels. Among 50 mistakes visually scanned, 25 were due to the definition of true position location, 15 were legitimate mistakes where a physicist cannot visually disagree with the algorithm's prediction, and 10 were genuine mistakes that we wish to improve in the future. Further, using these predicted points, we demonstrate a simple algorithm to cluster 3D voxels into individual track-like particle trajectories with a clustering efficiency, purity, and Adjusted Rand Index of 96%, 93%, and 91% respectively.

Submitted 10 July, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

Journal ref: Phys. Rev. D 104, 032004 (2021)

arXiv:1911.03298 [pdf, other]

doi 10.1109/JIOT.2020.2993601

Blockchain for Future Smart Grid: A Comprehensive Survey

Authors: Muhammad Baqer Mollah, Jun Zhao, Dusit Niyato, Kwok-Yan Lam, Xin Zhang, Amer M. Y. M. Ghias, Leong Hai Koh, Lei Yang

Abstract: The concept of smart grid has been introduced as a new vision of the conventional power grid to figure out an efficient way of integrating green and renewable energy technologies. In this way, Internet-connected smart grid, also called energy Internet, is also emerging as an innovative approach to ensure the energy from anywhere at any time. The ultimate goal of these developments is to build a sustainable society. However, integrating and coordinating a large number of growing connections can be a challenging issue for the traditional centralized grid system. Consequently, the smart grid is undergoing a transformation to the decentralized topology from its centralized form. On the other hand, blockchain has some excellent features which make it a promising application for smart grid paradigm. In this paper, we aim to provide a comprehensive survey on application of blockchain in smart grid. As such, we identify the significant security challenges of smart grid scenarios that can be addressed by blockchain. Then, we present a number of blockchain-based recent research works presented in different literatures addressing security issues in the area of smart grid. We also summarize several related practical projects, trials, and products that have been emerged recently. Finally, we discuss essential research challenges and future directions of applying blockchain to smart grid security issues.

Submitted 13 May, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

Comments: 26 pages, 13 figures, 5 tables

Journal ref: IEEE Internet of Things Journal, 2020

Showing 1–28 of 28 results for author: Koh, H