-
Progressive Query Refinement Framework for Bird's-Eye-View Semantic Segmentation from Surrounding Images
Authors:
Dooseop Choi,
Jungyu Kang,
Taeghyun An,
Kyounghwan Ahn,
KyoungWook Min
Abstract:
Expressing images with Multi-Resolution (MR) features has been widely adopted in many computer vision tasks. In this paper, we introduce the MR concept into Bird's-Eye-View (BEV) semantic segmentation for autonomous driving. This introduction enhances our model's ability to capture both global and local characteristics of driving scenes through our proposed residual learning. Specifically, given a set of MR BEV query maps, the lowest resolution query map is initially updated using a View Transformation (VT) encoder. This updated query map is then upscaled and merged with a higher resolution query map to undergo further updates in a subsequent VT encoder. This process is repeated until the resolution of the updated query map reaches the target. Finally, the lowest resolution map is added to the target resolution to generate the final query map. During training, we enforce both the lowest and final query maps to align with the ground-truth BEV semantic map to help our model effectively capture the global and local characteristics. We also propose a visual feature interaction network that promotes interactions between features across images and across feature levels, which contributes substantially to the performance improvement. We evaluate our model on a large-scale real-world dataset. The experimental results show that our model outperforms the SOTA models in terms of the IoU metric. Code is available at https://github.com/d1024choi/ProgressiveQueryRefineNet
Submitted 24 July, 2024;
originally announced July 2024.
-
Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model
Authors:
Kyeongjin Ahn,
Sungwon Han,
Sungwon Park,
Jihee Kim,
Sangyoon Park,
Meeyoung Cha
Abstract:
The increasing frequency and intensity of natural disasters demand more sophisticated approaches for rapid and precise damage assessment. To tackle this issue, researchers have developed various methods on disaster benchmark datasets from satellite imagery to aid in detecting disaster damage. However, the diverse nature of geographical landscapes and disasters makes it challenging to apply existing methods to regions unseen during training. We present DAVI (Disaster Assessment with VIsion foundation model), which overcomes domain disparities and detects structural damage (e.g., building) without requiring ground-truth labels of the target region. DAVI integrates task-specific knowledge from a model trained on source regions with an image segmentation foundation model to generate pseudo labels of possible damage in the target region. It then employs a two-stage refinement process, targeting both the pixel and overall image, to more accurately pinpoint changes in disaster-struck areas based on before-and-after images. Comprehensive evaluations demonstrate that DAVI achieves exceptional performance across diverse terrains (e.g., USA and Mexico) and disaster types (e.g., wildfires, hurricanes, and earthquakes). This confirms its robustness in assessing disaster impact without dependence on ground-truth labels.
Submitted 12 June, 2024;
originally announced June 2024.
-
Adam with model exponential moving average is effective for nonconvex optimization
Authors:
Kwangjun Ahn,
Ashok Cutkosky
Abstract:
In this work, we offer a theoretical analysis of two modern optimization techniques for training large and complex models: (i) adaptive optimization algorithms, such as Adam, and (ii) the model exponential moving average (EMA). Specifically, we demonstrate that a clipped version of Adam with model EMA achieves the optimal convergence rates in various nonconvex optimization settings, both smooth and nonsmooth. Moreover, when the scale varies significantly across different coordinates, we demonstrate that the coordinate-wise adaptivity of Adam is provably advantageous. Notably, unlike previous analyses of Adam, our analysis crucially relies on its core elements -- momentum and discounting factors -- as well as model EMA, motivating their wide applications in practice.
Submitted 28 May, 2024;
originally announced May 2024.
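The two ingredients named in this abstract can be sketched together in a few lines. The following is a minimal illustration only: it uses the standard Adam update with bias correction plus an exponential moving average of the iterates, and the optional `clip` argument is a generic norm clip on the update that is an assumption here, not the specific clipping rule analyzed in the paper. The function name `adam_ema` and all default hyperparameters are hypothetical.

```python
import math

def adam_ema(grad_fn, x0, steps, lr=1e-2, beta1=0.9, beta2=0.999,
             eps=1e-8, ema_decay=0.99, clip=None):
    """Run standard Adam and track an EMA of the iterates (model EMA)."""
    x = list(x0)
    ema = list(x0)                      # EMA starts at the initial point
    m = [0.0] * len(x)                  # first-moment (momentum) estimate
    v = [0.0] * len(x)                  # second-moment estimate
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - beta1 ** t) for mi in m]   # bias correction
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        update = [lr * mh / (math.sqrt(vh) + eps)
                  for mh, vh in zip(m_hat, v_hat)]
        if clip is not None:
            # Assumption: generic norm clipping of the whole update vector.
            norm = math.sqrt(sum(u * u for u in update))
            if norm > clip:
                update = [u * clip / norm for u in update]
        x = [xi - ui for xi, ui in zip(x, update)]
        ema = [ema_decay * ei + (1 - ema_decay) * xi
               for ei, xi in zip(ema, x)]
    return x, ema

# Toy usage on f(x) = x^2, whose gradient is 2x.
x_last, x_ema = adam_ema(lambda x: [2.0 * x[0]], [1.0], steps=200)
```

The function returns both the last iterate and the averaged iterate, so the two can be compared on any objective of interest.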
-
Does SGD really happen in tiny subspaces?
Authors:
Minhak Song,
Kwangjun Ahn,
Chulhee Yun
Abstract:
Understanding the training dynamics of deep neural networks is challenging due to their high-dimensional nature and intricate loss landscapes. Recent studies have revealed that, along the training trajectory, the gradient approximately aligns with a low-rank top eigenspace of the training loss Hessian, referred to as the dominant subspace. Given this alignment, this paper explores whether neural networks can be trained within the dominant subspace, which, if feasible, could lead to more efficient training methods. Our primary observation is that when the SGD update is projected onto the dominant subspace, the training loss does not decrease further. This suggests that the observed alignment between the gradient and the dominant subspace is spurious. Surprisingly, projecting out the dominant subspace proves to be just as effective as the original update, despite removing the majority of the original update component. Similar observations are made for the large learning rate regime (also known as Edge of Stability) and Sharpness-Aware Minimization. We discuss the main causes and implications of this spurious alignment, shedding light on the intricate dynamics of neural network training.
Submitted 24 May, 2024;
originally announced May 2024.
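The projection described in this abstract can be made concrete with a small sketch. Given an orthonormal basis of the dominant subspace, an update splits into a component inside the subspace and a remaining "bulk" component. The toy below assumes a quadratic loss with a diagonal Hessian, so the top eigenvector is simply a coordinate axis; the helper names `project` and `split_update` are hypothetical, and the example illustrates only the decomposition, not the paper's experimental findings on neural networks.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(g, basis):
    """Project g onto the span of an orthonormal list of vectors."""
    out = [0.0] * len(g)
    for u in basis:
        c = dot(g, u)
        out = [o + c * ui for o, ui in zip(out, u)]
    return out

def split_update(g, basis):
    """Return (dominant component, bulk component) of the vector g."""
    dom = project(g, basis)
    bulk = [gi - di for gi, di in zip(g, dom)]
    return dom, bulk

# Toy loss 0.5 * sum(h_i * x_i^2): the Hessian is diag(100, 1, 1),
# so the top-1 eigenspace is spanned by the first coordinate axis.
hess_diag = [100.0, 1.0, 1.0]
x = [0.1, 1.0, -1.0]
g = [h * xi for h, xi in zip(hess_diag, x)]
top_basis = [[1.0, 0.0, 0.0]]

dom, bulk = split_update(g, top_basis)
# Fraction of the squared gradient norm lying in the dominant subspace.
frac = dot(dom, dom) / dot(g, g)
```

Stepping along `dom` alone or `bulk` alone gives the two projected variants of the update that the abstract compares against the original one.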
-
HiMAP: Learning Heuristics-Informed Policies for Large-Scale Multi-Agent Pathfinding
Authors:
Huijie Tang,
Federico Berto,
Zihan Ma,
Chuanbo Hua,
Kyuree Ahn,
Jinkyoo Park
Abstract:
Large-scale multi-agent pathfinding (MAPF) presents significant challenges in several areas. As systems grow in complexity with a multitude of autonomous agents operating simultaneously, efficient and collision-free coordination becomes paramount. Traditional algorithms often fall short in scalability, especially in intricate scenarios. Reinforcement Learning (RL) has shown potential to address the intricacies of MAPF; however, it has also been shown to struggle with scalability, demanding intricate implementation, lengthy training, and often exhibiting unstable convergence, limiting its practical application. In this paper, we introduce Heuristics-Informed Multi-Agent Pathfinding (HiMAP), a novel scalable approach that employs imitation learning with heuristic guidance in a decentralized manner. We train on small-scale instances using a heuristic policy as a teacher that maps each single agent observation information to an action probability distribution. During pathfinding, we adopt several inference techniques to improve performance. With a simple training scheme and implementation, HiMAP demonstrates competitive results in terms of success rate and scalability in the field of imitation-learning-only MAPF, showing the potential of imitation-learning-only MAPF equipped with inference techniques.
Submitted 23 February, 2024;
originally announced February 2024.
-
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Authors:
Kwangjun Ahn,
Zhiyu Zhang,
Yunbum Kook,
Yan Dai
Abstract:
Despite the success of the Adam optimizer in practice, the theoretical understanding of its algorithmic components still remains limited. In particular, most existing analyses of Adam show the convergence rate that can be simply achieved by non-adaptive algorithms like SGD. In this work, we provide a different perspective based on online learning that underscores the importance of Adam's algorithmic components. Inspired by Cutkosky et al. (2023), we consider the framework called online learning of updates/increments, where we choose the updates/increments of an optimizer based on an online learner. With this framework, the design of a good optimizer is reduced to the design of a good online learner. Our main observation is that Adam corresponds to a principled online learning framework called Follow-the-Regularized-Leader (FTRL). Building on this observation, we study the benefits of its algorithmic components from the online learning perspective.
Submitted 30 May, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Exploitation Business: Leveraging Information Asymmetry
Authors:
Kwangseob Ahn
Abstract:
This paper investigates the "Exploitation Business" model, which capitalizes on information asymmetry to exploit vulnerable populations. It focuses on businesses targeting non-experts or fraudsters who capitalize on information asymmetry to sell their products or services to desperate individuals. This phenomenon, also described as "profit-making activities based on informational exploitation," thrives on individuals' limited access to information, lack of expertise, and Fear of Missing Out (FOMO).
The recent advancement of social media and the rising trend of fandom business have accelerated the proliferation of such exploitation business models. Discussions on the empowerment and exploitation of fans in the digital media era present a restructuring of relationships between fans and media creators, highlighting the necessity of not overlooking the exploitation of fans' free labor.
This paper analyzes the various facets and impacts of exploitation business models, enriched by real-world examples from sectors like cryptocurrency and GenAI, thereby discussing their social, economic, and ethical implications. Moreover, through theoretical backgrounds and research, it explores similar themes like existing exploitation theories, commercial exploitation, and financial exploitation to gain a deeper understanding of the "Exploitation Business" subject.
Submitted 16 June, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
Linear attention is (maybe) all you need (to understand transformer optimization)
Authors:
Kwangjun Ahn,
Xiang Cheng,
Minhak Song,
Chulhee Yun,
Ali Jadbabaie,
Suvrit Sra
Abstract:
Transformer training is notoriously difficult, requiring a careful design of optimizers and use of various heuristics. We make progress towards understanding the subtleties of training Transformers by carefully studying a simple yet canonical linearized shallow Transformer model. Specifically, we train linear Transformers to solve regression tasks, inspired by J. von Oswald et al. (ICML 2023), and K. Ahn et al. (NeurIPS 2023). Most importantly, we observe that our proposed linearized models can reproduce several prominent aspects of Transformer training dynamics. Consequently, the results obtained in this paper suggest that a simple linearized Transformer model could actually be a valuable, realistic abstraction for understanding Transformer optimization.
Submitted 13 March, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Augmenting text for spoken language understanding with Large Language Models
Authors:
Roshan Sharma,
Suyoun Kim,
Daniel Lazar,
Trang Le,
Akshat Shrivastava,
Kwanghoon Ahn,
Piyush Kansal,
Leda Sari,
Ozlem Kalinli,
Michael Seltzer
Abstract:
Spoken semantic parsing (SSP) involves generating machine-comprehensible parses from input speech. Training robust models for existing application domains represented in training data or extending to new domains requires corresponding triplets of speech-transcript-semantic parse data, which is expensive to obtain. In this paper, we address this challenge by examining methods that can use transcript-semantic parse data (unpaired text) without corresponding speech. First, when unpaired text is drawn from existing textual corpora, Joint Audio Text (JAT) and Text-to-Speech (TTS) are compared as ways to generate speech representations for unpaired text. Experiments on the STOP dataset show that unpaired text from existing and new domains improves performance by 2% and 30% in absolute Exact Match (EM) respectively. Second, we consider the setting when unpaired text is not available in existing textual corpora. We propose to prompt Large Language Models (LLMs) to generate unpaired text for existing and new domains. Experiments show that examples and words that co-occur with intents can be used to generate unpaired text with Llama 2.0. Using the generated text with JAT and TTS for spoken semantic parsing improves EM on STOP by 1.4% and 2.6% absolute for existing and new domains respectively.
Submitted 17 September, 2023;
originally announced September 2023.
-
A Unified Approach to Controlling Implicit Regularization via Mirror Descent
Authors:
Haoyuan Sun,
Khashayar Gatmiry,
Kwangjun Ahn,
Navid Azizan
Abstract:
Inspired by the remarkable success of large neural networks, there has been significant interest in understanding the generalization performance of over-parameterized models. Substantial efforts have been invested in characterizing how optimization algorithms impact generalization through their "preferred" solutions, a phenomenon commonly referred to as implicit regularization. In particular, it has been argued that gradient descent (GD) induces an implicit $\ell_2$-norm regularization in regression and classification problems. However, the implicit regularization of different algorithms is confined to either a specific geometry or a particular class of learning problems, indicating a gap in a general approach for controlling the implicit regularization. To address this, we present a unified approach using mirror descent (MD), a notable generalization of GD, to control implicit regularization in both regression and classification settings. More specifically, we show that MD with the general class of homogeneous potential functions converges in direction to a generalized maximum-margin solution for linear classification problems, thereby answering a long-standing question in the classification setting. Further, we show that MD can be implemented efficiently and enjoys fast convergence under suitable conditions. Through comprehensive experiments, we demonstrate that MD is a versatile method to produce learned models with different regularizers, which in turn have different generalization performances.
Submitted 11 January, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
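As an illustration of how mirror descent generalizes gradient descent, the sketch below implements one MD step for the potential $\psi(w) = \frac{1}{p}\sum_i |w_i|^p$ with $p > 1$. This potential is chosen here as one example of a homogeneous potential and is an assumption of the sketch, as are the function names. The step maps the weights to the dual space through $\nabla\psi$, takes a gradient step there, and maps back; for $p = 2$ it reduces to plain GD.

```python
import math

def grad_potential(w, p):
    """Mirror map: gradient of (1/p) * sum |w_i|^p, i.e. sign(w_i) * |w_i|^(p-1)."""
    return [math.copysign(abs(wi) ** (p - 1), wi) for wi in w]

def inv_grad_potential(z, p):
    """Inverse mirror map: sign(z_i) * |z_i|^(1/(p-1)). Requires p > 1."""
    return [math.copysign(abs(zi) ** (1.0 / (p - 1)), zi) for zi in z]

def md_step(w, grad, lr, p):
    """One mirror descent step: grad_psi(w_new) = grad_psi(w) - lr * grad."""
    z = grad_potential(w, p)
    z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return inv_grad_potential(z, p)

# With p = 2 the mirror map is the identity, so this is an ordinary GD step.
w_gd = md_step([1.0, -2.0], [0.5, 0.5], lr=0.1, p=2)
# With p = 3 the same gradient produces a different step geometry.
w_p3 = md_step([1.0, -2.0], [0.5, 0.5], lr=0.1, p=3)
```

Because the update is coordinate-wise for this potential, each step costs the same order of work as a GD step.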
-
Smooth Model Predictive Control with Applications to Statistical Learning
Authors:
Kwangjun Ahn,
Daniel Pfrommer,
Jack Umenberger,
Tobia Marcucci,
Zak Mhammedi,
Ali Jadbabaie
Abstract:
Statistical learning theory and high dimensional statistics have had a tremendous impact on Machine Learning theory and have impacted a variety of domains including systems and control theory. Over the past few years we have witnessed a variety of applications of such theoretical tools to help answer questions such as: how many state-action pairs are needed to learn a static control policy to a given accuracy? Recent results have shown that continuously differentiable and stabilizing control policies can be well-approximated using neural networks with hard guarantees on performance, yet often even the simplest constrained control problems are not smooth. To address this void, in this paper we study smooth approximations of linear Model Predictive Control (MPC) policies, in which hard constraints are replaced by barrier functions, a.k.a. barrier MPC. In particular, we show that barrier MPC inherits the exponential stability properties of the original non-smooth MPC policy. Using a careful analysis of the proposed barrier MPC, we show that its smoothness constant can be carefully controlled, thereby paving the way for new sample complexity results for approximating MPC policies from sampled state-action pairs.
Submitted 2 June, 2023;
originally announced June 2023.
-
Transformers learn to implement preconditioned gradient descent for in-context learning
Authors:
Kwangjun Ahn,
Xiang Cheng,
Hadi Daneshmand,
Suvrit Sra
Abstract:
Several recent works demonstrate that transformers can implement algorithms like gradient descent. By a careful construction of weights, these works show that multiple layers of transformers are expressive enough to simulate iterations of gradient descent. Going beyond the question of expressivity, we ask: Can transformers learn to implement such algorithms by training over random problem instances? To our knowledge, we make the first theoretical progress on this question via an analysis of the loss landscape for linear transformers trained over random instances of linear regression. For a single attention layer, we prove the global minimum of the training objective implements a single iteration of preconditioned gradient descent. Notably, the preconditioning matrix not only adapts to the input distribution but also to the variance induced by data inadequacy. For a transformer with $L$ attention layers, we prove certain critical points of the training objective implement $L$ iterations of preconditioned gradient descent. Our results call for future theoretical studies on learning algorithms by training transformers.
Submitted 9 November, 2023; v1 submitted 31 May, 2023;
originally announced June 2023.
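For readers unfamiliar with the term, a single iteration of preconditioned gradient descent on in-context linear regression can be written out directly. The sketch below assumes the least-squares loss $\frac{1}{2n}\sum_i (w^\top x_i - y_i)^2$, a start at $w_0 = 0$, and a user-supplied preconditioning matrix `gamma`; the function name and the toy data are hypothetical, and the sketch does not reproduce the particular preconditioner characterized in the paper.

```python
def one_step_pgd_prediction(X, y, x_query, gamma):
    """Predict y for x_query after one preconditioned GD step from w = 0.

    X: list of n context inputs (each a list of d floats)
    y: list of n context labels
    gamma: d x d preconditioning matrix as a list of rows
    """
    n = len(X)
    d = len(x_query)
    # Negative gradient of the loss at w = 0 is (1/n) * sum_i y_i * x_i.
    neg_grad = [sum(y[i] * X[i][j] for i in range(n)) / n for j in range(d)]
    # Preconditioned step: w_1 = gamma @ neg_grad.
    w1 = [sum(gamma[j][k] * neg_grad[k] for k in range(d)) for j in range(d)]
    return sum(wj * xj for wj, xj in zip(w1, x_query))

X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
pred_plain = one_step_pgd_prediction(X, y, [1.0, 1.0], identity)
pred_precond = one_step_pgd_prediction(X, y, [1.0, 1.0],
                                       [[2.0, 0.0], [0.0, 1.0]])
```

With the identity matrix this is a plain gradient step of size one; a non-identity `gamma` rescales directions before the step is taken.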
-
How to escape sharp minima with random perturbations
Authors:
Kwangjun Ahn,
Ali Jadbabaie,
Suvrit Sra
Abstract:
Modern machine learning applications have witnessed the remarkable success of optimization algorithms that are designed to find flat minima. Motivated by this design choice, we undertake a formal study that (i) formulates the notion of flat minima, and (ii) studies the complexity of finding them. Specifically, we adopt the trace of the Hessian of the cost function as a measure of flatness, and use it to formally define the notion of approximate flat minima. Under this notion, we then analyze algorithms that find approximate flat minima efficiently. For general cost functions, we discuss a gradient-based algorithm that finds an approximate flat local minimum efficiently. The main component of the algorithm is to use gradients computed from randomly perturbed iterates to estimate a direction that leads to flatter minima. For the setting where the cost function is an empirical risk over training data, we present a faster algorithm that is inspired by a recently proposed practical algorithm called sharpness-aware minimization, supporting its success in practice.
Submitted 25 May, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
The Crucial Role of Normalization in Sharpness-Aware Minimization
Authors:
Yan Dai,
Kwangjun Ahn,
Suvrit Sra
Abstract:
Sharpness-Aware Minimization (SAM) is a recently proposed gradient-based optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep neural networks. Consequently, there has been a surge of interest in explaining its empirical success. We focus, in particular, on understanding the role played by normalization, a key component of the SAM updates. We theoretically and empirically study the effect of normalization in SAM for both convex and non-convex functions, revealing two key roles played by normalization: i) it helps in stabilizing the algorithm; and ii) it enables the algorithm to drift along a continuum (manifold) of minima -- a property identified by recent theoretical works that is the key to better performance. We further argue that these two properties of normalization make SAM robust against the choice of hyper-parameters, supporting the practicality of SAM. Our conclusions are backed by various experiments.
Submitted 23 October, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
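The normalization discussed in this abstract refers to the division by the gradient norm in the SAM update, $x \leftarrow x - \eta \nabla f\big(x + \rho \, \nabla f(x) / \|\nabla f(x)\|\big)$. A minimal sketch of one step, with a flag to switch the normalization off, is given below. The toy quadratic, the function names, and the default values of `lr` and `rho` are assumptions made for illustration.

```python
import math

def sam_step(grad_fn, x, lr=0.1, rho=0.05, normalize=True):
    """One SAM step; normalize=False drops the division by the gradient norm."""
    g = grad_fn(x)
    if normalize:
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm == 0.0:
            return list(x)          # zero gradient: no ascent direction defined
        scale = rho / norm          # perturbation has length exactly rho
    else:
        scale = rho                 # perturbation length scales with the gradient
    x_adv = [xi + scale * gi for xi, gi in zip(x, g)]
    g_adv = grad_fn(x_adv)          # gradient at the perturbed point
    return [xi - lr * gi for xi, gi in zip(x, g_adv)]

def quad_grad(x, curv=(10.0, 1.0)):
    """Gradient of the toy loss 0.5 * (10 * x0^2 + x1^2)."""
    return [c * xi for c, xi in zip(curv, x)]

x_norm = sam_step(quad_grad, [1.0, 0.0], normalize=True)
x_raw = sam_step(quad_grad, [1.0, 0.0], normalize=False)
```

On this toy problem the un-normalized step perturbs the iterate much further when the gradient is large, which gives a sense of why the two variants can behave differently.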
-
Learning threshold neurons via the "edge of stability"
Authors:
Kwangjun Ahn,
Sébastien Bubeck,
Sinho Chewi,
Yin Tat Lee,
Felipe Suarez,
Yi Zhang
Abstract:
Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.
Submitted 19 October, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
Implicit Inverse Force Identification Method of Acoustic Liquid-structure Interaction Finite Element Model
Authors:
Seungin Oh,
Chang-uk Ahn,
Kwanghyun Ahn,
Jin-Gyun Kim
Abstract:
The two-field vibroacoustic finite-element (FE) model requires a relatively large number of degrees of freedom compared to the monophysics model, and the conventional force identification method for structural vibration can be adjusted for multiphysics problems. In this study, an effective inverse force identification method for an FE vibroacoustic interaction model of an interior fluid-structure system was proposed. The method consists of: (1) implicit inverse force identification based on the Newmark-$β$ time integration algorithm for stability and efficiency, (2) second-order ordinary differential formulation by avoiding the state-space form causing large degrees of freedom, (3) projection-based multiphysics reduced-order modeling for further reduction of degrees of freedom, and (4) Tikhonov regularization to alleviate the measurement noise. The proposed method can accurately identify the unmeasured applied forces on the in situ application and concurrently reconstruct the response fields. The accuracy, stability, and computational efficiency of the proposed method were evaluated using numerical models and an experimental testbed. A comparative study with the augmented Kalman filter method was performed to evaluate its relative performance.
Submitted 22 November, 2022;
originally announced November 2022.
-
Model Predictive Control via On-Policy Imitation Learning
Authors:
Kwangjun Ahn,
Zakaria Mhammedi,
Horia Mania,
Zhang-Wei Hong,
Ali Jadbabaie
Abstract:
In this paper, we leverage the rapid advances in imitation learning, a topic of intense recent focus in the Reinforcement Learning (RL) literature, to develop new sample complexity results and performance guarantees for data-driven Model Predictive Control (MPC) for constrained linear systems. In its simplest form, imitation learning is an approach that tries to learn an expert policy by querying samples from an expert. Recent approaches to data-driven MPC have used the simplest form of imitation learning known as behavior cloning to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system. Behavior cloning, however, is a method that is known to be data inefficient and suffer from distribution shifts. As an alternative, we develop a variant of the forward training algorithm which is an on-policy imitation learning method proposed by Ross et al. (2010). Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance. We validate our results through simulations and show that the forward training algorithm is indeed superior to behavior cloning when applied to MPC.
Submitted 17 October, 2022;
originally announced October 2022.
-
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks
Authors:
Haodi Jiang,
Qin Li,
Yan Xu,
Wynne Hsu,
Kwangsu Ahn,
Wenda Cao,
Jason T. L. Wang,
Haimin Wang
Abstract:
Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.
Submitted 8 October, 2022;
originally announced October 2022.
-
One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
Authors:
Youngjae Min,
Kwangjun Ahn,
Navid Azizan
Abstract:
While deep neural networks are capable of achieving state-of-the-art performance in various domains, their training typically requires iterating for many passes over the dataset. However, due to computational and memory constraints and potential privacy concerns, storing and accessing all the data is impractical in many real-world scenarios where the data arrives in a stream. In this paper, we investigate the problem of one-pass learning, in which a model is trained on sequentially arriving data without retraining on previous datapoints. Motivated by the increasing use of overparameterized models, we develop Orthogonal Recursive Fitting (ORFit), an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints. By doing so, we bridge two seemingly distinct algorithms in adaptive filtering and machine learning, namely the recursive least-squares (RLS) algorithm and orthogonal gradient descent (OGD). Our algorithm uses the memory efficiently by exploiting the structure of the streaming data via an incremental principal component analysis (IPCA). Further, we show that, for overparameterized linear models, the parameter vector obtained by our algorithm is what stochastic gradient descent (SGD) would converge to in the standard multi-pass setting. Finally, we generalize the results to the nonlinear setting for highly overparameterized models, relevant for deep learning. Our experiments show the effectiveness of the proposed method compared to the baselines.
Submitted 27 July, 2022;
originally announced July 2022.
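For a linear model, the core idea in the abstract above (fit each new datapoint exactly while moving only in a direction orthogonal to the gradients of earlier datapoints) can be illustrated with a minimal sketch. This is not the authors' ORFit implementation: it stores a full orthonormal basis of past inputs instead of the IPCA-based memory the abstract mentions, it assumes linearly independent inputs, and the function name `one_pass_orthogonal_fit` is hypothetical.

```python
import numpy as np

def one_pass_orthogonal_fit(X, y):
    """One pass over (X, y) for the linear model f(x; w) = x @ w.

    Assumption: rows of X are linearly independent (overparameterized case).
    """
    n, d = X.shape
    w = np.zeros(d)
    basis = []  # orthonormal directions spanned by previously seen inputs
    for x_t, y_t in zip(X, y):
        # For a linear model the gradient of the prediction is x_t itself.
        # Remove its components along earlier inputs (Gram-Schmidt).
        g = x_t.copy()
        for u in basis:
            g -= (u @ g) * u
        sq = g @ g
        if sq < 1e-12:
            continue  # x_t adds no new direction; nothing to update
        # Step along g so that the new point is fit exactly; earlier
        # predictions are unchanged because g is orthogonal to earlier inputs.
        w = w + g * (y_t - x_t @ w) / sq
        basis.append(g / np.sqrt(sq))
    return w
```

Starting from zero, the result of this sketch coincides with the minimum-norm interpolating solution, which is in line with the abstract's remark about what SGD would converge to for overparameterized linear models.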
-
Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
Authors:
Haoyuan Sun,
Kwangjun Ahn,
Christos Thrampoulidis,
Navid Azizan
Abstract:
Driven by the empirical success and wide use of deep neural networks, understanding the generalization performance of overparameterized models has become an increasingly popular question. To this end, there has been substantial effort to characterize the implicit bias of the optimization algorithms used, such as gradient descent (GD), and the structural properties of their preferred solutions. This paper answers an open question in this literature: For the classification setting, what solution does mirror descent (MD) converge to? Specifically, motivated by its efficient implementation, we consider the family of mirror descent algorithms with potential function chosen as the $p$-th power of the $\ell_p$-norm, which is an important generalization of GD. We call this algorithm $p$-$\textsf{GD}$. For this family, we characterize the solutions it obtains and show that it converges in direction to a generalized maximum-margin solution with respect to the $\ell_p$-norm for linearly separable classification. While the MD update rule is in general expensive to compute and perhaps not suitable for deep learning, $p$-$\textsf{GD}$ is fully parallelizable in the same manner as SGD and can be used to train deep neural networks with virtually no additional computational overhead. Using comprehensive experiments with both linear and deep neural network models, we demonstrate that $p$-$\textsf{GD}$ can noticeably affect the structure and the generalization performance of the learned models.
Submitted 29 September, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
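The mirror descent update described in the abstract above is coordinate-wise for this family of potentials, which is why it parallelizes like SGD. The sketch below assumes the potential is normalized as $\psi(w) = \frac{1}{p}\sum_i |w_i|^p$ with $p > 1$ (the $1/p$ factor is an assumption and only rescales the step size); the function name `p_gd_step` is hypothetical and this is not the authors' code.

```python
import numpy as np

def p_gd_step(w, grad, lr, p):
    """One mirror descent step with potential psi(w) = (1/p) * sum(|w_i|**p), p > 1."""
    # Map to the dual space: (grad psi(w))_i = sign(w_i) * |w_i|**(p - 1).
    z = np.sign(w) * np.abs(w) ** (p - 1)
    # Take the gradient step in the dual space.
    z = z - lr * grad
    # Map back with the inverse of grad psi, again coordinate by coordinate.
    return np.sign(z) * np.abs(z) ** (1.0 / (p - 1))
```

With `p = 2` both maps are the identity and the step reduces to ordinary gradient descent.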
-
Understanding the unstable convergence of gradient descent
Authors:
Kwangjun Ahn,
Jingzhao Zhang,
Suvrit Sra
Abstract:
Most existing analyses of (stochastic) gradient descent rely on the condition that for $L$-smooth costs, the step size is less than $2/L$. However, many works have observed that in machine learning applications step sizes often do not fulfill this condition, yet (stochastic) gradient descent still converges, albeit in an unstable manner. We investigate this unstable convergence phenomenon from first principles, and discuss key causes behind it. We also identify its main characteristics, and how they interrelate based on both theory and experiments, offering a principled view toward understanding the phenomenon.
Submitted 9 June, 2022; v1 submitted 3 April, 2022;
originally announced April 2022.
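The classical $2/L$ step-size condition mentioned in the abstract above can be seen on a one-dimensional quadratic, where each gradient descent step multiplies the iterate by $1 - \eta L$. The toy function below is an illustrative assumption, not an experiment from the paper; on a pure quadratic the iterates simply diverge once the step size exceeds $2/L$.

```python
def gd_on_quadratic(L, step, x0=1.0, iters=50):
    """Run gradient descent on f(x) = (L / 2) * x**2, whose gradient is L * x."""
    x = x0
    for _ in range(iters):
        x = x - step * L * x  # same as multiplying x by (1 - step * L)
    return x

# Just below the threshold the iterates shrink; just above it they grow.
below = gd_on_quadratic(L=4.0, step=1.9 / 4.0)
above = gd_on_quadratic(L=4.0, step=2.1 / 4.0)
```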
-
Reproducibility in Optimization: Theoretical Framework and Limits
Authors:
Kwangjun Ahn,
Prateek Jain,
Ziwei Ji,
Satyen Kale,
Praneeth Netrapalli,
Gil I. Shamir
Abstract:
We initiate a formal study of reproducibility in optimization. We define a quantitative measure of reproducibility of optimization procedures in the face of noisy or error-prone operations such as inexact or stochastic gradient computations or inexact initialization. We then analyze several convex optimization settings of interest such as smooth, non-smooth, and strongly-convex objective functions and establish tight bounds on the limits of reproducibility in each setting. Our analysis reveals a fundamental trade-off between computation and reproducibility: more computation is necessary (and sufficient) for better reproducibility.
Submitted 4 December, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Environmental and Safety Impacts of Vehicle-to-Everything Enabled Applications: A Review of State-of-the-Art Studies
Authors:
Jianhe Du,
Kyoungho Ahn,
Mohamed Farag,
Hesham Rakha
Abstract:
With the rapid development of communication technology, connected vehicles (CV) have the potential, through the sharing of data, to enhance vehicle safety and reduce vehicle energy consumption and emissions. Numerous research efforts are quantifying the impacts of CV applications, assuming instant and accurate communication among vehicles, devices, pedestrians, infrastructure, the network, the cloud, and the grid, collectively known as V2X (vehicle-to-everything). The use of cellular vehicle-to-everything (C-V2X) to share data is emerging as an efficient means to achieve this objective. C-V2X releases 14 and 15 utilize 4G LTE technology, and release 16 utilizes the 5G new radio (NR) technology. C-V2X can function without network infrastructure coverage and has a better communication range, improved latency, and greater data rates compared to older technologies. Such highly efficient interchange of information among all participating parts in a CV environment will not only provide timely data to enhance the capacity of the transportation system but can also be used to develop applications that enhance vehicle safety and minimize negative environmental impacts. However, before the full benefits of CV can be achieved, there is a need to thoroughly investigate the effectiveness, strengths, and weaknesses of different CV applications, the communication protocols, the varied results with different CV market penetration rates (MPRs), the interaction of CVs and human driven vehicles, the integration of multiple applications, and the errors and latencies associated with data communication. This paper reviews existing literature on the environmental, mobility and safety impacts of CV applications, identifies the gaps in our current research of CVs and recommends future research directions.
Submitted 7 December, 2021;
originally announced February 2022.
-
Agnostic Learnability of Halfspaces via Logistic Loss
Authors:
Ziwei Ji,
Kwangjun Ahn,
Pranjal Awasthi,
Satyen Kale,
Stefani Karp
Abstract:
We investigate approximation guarantees provided by logistic regression for the fundamental problem of agnostic learning of homogeneous halfspaces. Previously, for a certain broad class of "well-behaved" distributions on the examples, Diakonikolas et al. (2020) proved an $\tilde{\Omega}(\textrm{OPT})$ lower bound, while Frei et al. (2021) proved an $\tilde{O}(\sqrt{\textrm{OPT}})$ upper bound, where $\textrm{OPT}$ denotes the best zero-one/misclassification risk of a homogeneous halfspace. In this paper, we close this gap by constructing a well-behaved distribution such that the global minimizer of the logistic risk over this distribution only achieves $\Omega(\sqrt{\textrm{OPT}})$ misclassification risk, matching the upper bound in (Frei et al., 2021). On the other hand, we also show that if we impose a radial-Lipschitzness condition in addition to well-behaved-ness on the distribution, logistic regression on a ball of bounded radius reaches $\tilde{O}(\textrm{OPT})$ misclassification risk. Our techniques also show for any well-behaved distribution, regardless of radial Lipschitzness, we can overcome the $\Omega(\sqrt{\textrm{OPT}})$ lower bound for logistic loss simply at the cost of one additional convex optimization step involving the hinge loss and attain $\tilde{O}(\textrm{OPT})$ misclassification risk. This two-step convex optimization algorithm is simpler than previous methods obtaining this guarantee, all of which require solving $O(\log(1/\textrm{OPT}))$ minimization problems.
Submitted 31 January, 2022;
originally announced January 2022.
-
Multi-objective Eco-Routing Model Development and Evaluation for Battery Electric Vehicles
Authors:
Kyoungho Ahn,
Youssef Bichiou,
Mohamed Farag,
Hesham A. Rakha
Abstract:
This paper develops and investigates the impacts of multi-objective Nash optimum (user equilibrium) traffic assignment on a large-scale network for battery electric vehicles (BEVs) and internal combustion engine vehicles (ICEVs) in a microscopic traffic simulation environment. Eco-routing is a technique that finds the most energy efficient route. ICEV and BEV energy consumption patterns are significantly different with regard to their sensitivity to driving cycles. Unlike ICEVs, BEVs are more energy efficient on low-speed arterial trips compared to highway trips. Different energy consumption patterns require different eco-routing strategies for ICEVs and BEVs. This study found that eco-routing could reduce energy consumption for BEVs but also significantly increases their average travel time. The simulation study found that multi-objective routing could reduce the energy consumption of BEVs by 13.5, 14.2, 12.9, and 10.7 percent, as well as the fuel consumption of ICEVs by 0.1, 4.3, 3.4, and 10.6 percent for "not congested", "slightly congested", "moderately congested", and "highly congested" conditions, respectively. The study also found that multi-objective user equilibrium routing reduced the average vehicle travel time by up to 10.1% compared to the standard user equilibrium traffic assignment for the highly congested conditions, producing a solution closer to the system optimum traffic assignment. The results indicate that the multi-objective eco-routing can effectively reduce fuel/energy consumption with minimum impacts on travel times for both BEVs and ICEVs.
Submitted 10 August, 2020;
originally announced April 2021.
-
Riemannian Perspective on Matrix Factorization
Authors:
Kwangjun Ahn,
Felipe Suarez
Abstract:
We study the non-convex matrix factorization approach to matrix completion via Riemannian geometry. Based on an optimization formulation over a Grassmannian manifold, we characterize the landscape based on the notion of principal angles between subspaces. For the fully observed case, our results show that there is a region in which the cost is geodesically convex, and outside of which all critical points are strictly saddle. We empirically study the partially observed case based on our findings.
Submitted 1 February, 2021;
originally announced February 2021.
-
Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
Authors:
Sinho Chewi,
Chen Lu,
Kwangjun Ahn,
Xiang Cheng,
Thibaut Le Gouic,
Philippe Rigollet
Abstract:
Conventional wisdom in the sampling literature, backed by a popular diffusion scaling limit, suggests that the mixing time of the Metropolis-Adjusted Langevin Algorithm (MALA) scales as $O(d^{1/3})$, where $d$ is the dimension. However, the diffusion scaling limit requires stringent assumptions on the target distribution and is asymptotic in nature. In contrast, the best known non-asymptotic mixing time bound for MALA on the class of log-smooth and strongly log-concave distributions is $O(d)$. In this work, we establish that the mixing time of MALA on this class of target distributions is $\widetilde{\Theta}(d^{1/2})$ under a warm start. Our upper bound proof introduces a new technique based on a projection characterization of the Metropolis adjustment which reduces the study of MALA to the well-studied discretization analysis of the Langevin SDE and bypasses direct computation of the acceptance probability.
Submitted 23 December, 2020;
originally announced December 2020.
-
Efficient constrained sampling via the mirror-Langevin algorithm
Authors:
Kwangjun Ahn,
Sinho Chewi
Abstract:
We propose a new discretization of the mirror-Langevin diffusion and give a crisp proof of its convergence. Our analysis uses relative convexity/smoothness and self-concordance, ideas which originated in convex optimization, together with a new result in optimal transport that generalizes the displacement convexity of the entropy. Unlike prior works, our result both (1) requires much weaker assumptions on the mirror map and the target distribution, and (2) has vanishing bias as the step size tends to zero. In particular, for the task of sampling from a log-concave distribution supported on a compact set, our theoretical results are significantly better than the existing guarantees.
Submitted 25 October, 2021; v1 submitted 30 October, 2020;
originally announced October 2020.
-
AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results
Authors:
Pengxu Wei,
Hannan Lu,
Radu Timofte,
Liang Lin,
Wangmeng Zuo,
Zhihong Pan,
Baopu Li,
Teng Xi,
Yanwen Fan,
Gang Zhang,
Jingtuo Liu,
Junyu Han,
Errui Ding,
Tangxin Xie,
Liang Cao,
Yan Zou,
Yi Shen,
Jialiang Zhang,
Yu Jia,
Kaihua Cheng,
Chenhuan Wu,
Yue Lin,
Cen Liu,
Yunbo Peng,
Xueyi Zou
, et al. (51 additional authors not shown)
Abstract:
This paper introduces the real image Super-Resolution (SR) challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020. This challenge involves three tracks to super-resolve an input image for $\times$2, $\times$3 and $\times$4 scaling factors, respectively. The goal is to attract more attention to realistic image degradation for the SR task, which is much more complicated and challenging, and contributes to real-world image super-resolution applications. 452 participants were registered for three tracks in total, and 24 teams submitted their results. They gauge the state-of-the-art approaches for real image SR in terms of PSNR and SSIM.
Submitted 25 September, 2020;
originally announced September 2020.
-
A simpler strong refutation of random $k$-XOR
Authors:
Kwangjun Ahn
Abstract:
Strong refutation of random CSPs is a fundamental question in theoretical computer science that has received particular attention due to the long-standing gap between the information-theoretic limit and the computational limit. This gap was recently bridged by Raghavendra, Rao and Schramm, who study sub-exponential algorithms for the regime between the two limits. In this work, we take a simpler approach to their algorithm and analysis.
Submitted 8 August, 2020;
originally announced August 2020.
-
HARMer: Cyber-attacks Automation and Evaluation
Authors:
Simon Yusuf Enoch,
Zhibin Huang,
Chun Yong Moon,
Donghwan Lee,
Myung Kil Ahn,
Dong Seong Kim
Abstract:
With the increasing growth of cyber-attack incidents, it is important to develop innovative and effective techniques to assess and defend networked systems against cyber attacks. One of the well-known techniques for this is penetration testing, which is carried out by a group of security professionals (i.e., a red team). Penetration testing is also known to be effective for finding existing and new vulnerabilities; however, the quality of the security assessment can depend on the quality of the red team members and their time and devotion to the penetration testing. In this paper, we propose a novel automation framework for cyber-attack generation named 'HARMer' to address the challenges with respect to manual attack execution by the red team. Our proposed framework, design, and implementation are based on a scalable graphical security model called Hierarchical Attack Representation Model (HARM). (1) We propose the requirements and the key phases for the automation framework. (2) We propose security metrics-based attack planning strategies along with their algorithms. (3) We conduct experiments in a real enterprise network and Amazon Web Services. The results show how the different phases of the framework interact to model the attackers' operations. This framework will allow security administrators to automatically assess the impact of various threats and attacks.
Submitted 17 July, 2020; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Understanding Nesterov's Acceleration via Proximal Point Method
Authors:
Kwangjun Ahn,
Suvrit Sra
Abstract:
The proximal point method (PPM) is a fundamental method in optimization that is often used as a building block for designing optimization algorithms. In this work, we use PPM to provide conceptually simple derivations along with convergence analyses of different versions of Nesterov's accelerated gradient method (AGM). The key observation is that AGM is a simple approximation of PPM, which results in an elementary derivation of the update equations and stepsizes of AGM. This view also leads to a transparent and conceptually simple analysis of AGM's convergence by using the analysis of PPM. The derivations also naturally extend to the strongly convex case. Ultimately, the results presented in this paper are of both didactic and conceptual value; they unify and explain existing variants of AGM while motivating other accelerated methods for practically relevant settings.
Submitted 2 June, 2022; v1 submitted 17 May, 2020;
originally announced May 2020.
-
On Tight Convergence Rates of Without-replacement SGD
Authors:
Kwangjun Ahn,
Suvrit Sra
Abstract:
For solving finite-sum optimization problems, SGD without replacement sampling is empirically shown to outperform SGD. Denoting by $n$ the number of components in the cost and $K$ the number of epochs of the algorithm, several recent works have shown convergence rates of without-replacement SGD that have better dependency on $n$ and $K$ than the baseline rate of $O(1/(nK))$ for SGD. However, there are two main limitations shared among those works: the rates have extra poly-logarithmic factors on $nK$, and denoting by $\kappa$ the condition number of the problem, the rates hold after $\kappa^c\log(nK)$ epochs for some $c>0$. In this work, we overcome these limitations by analyzing step sizes that vary across epochs.
Submitted 18 April, 2020;
originally announced April 2020.
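The setting in the abstract above, where every epoch visits each of the $n$ components exactly once in a fresh random order, can be sketched on a toy finite sum. The step-size schedule `1 / (n * (k + 1))` is a hypothetical choice used only to show a step size that changes across epochs; it is not the schedule analyzed in the paper, and the function name is likewise an assumption.

```python
import numpy as np

def shuffled_sgd(c, x0, epochs, seed=0):
    """Without-replacement SGD on f(x) = (1/n) * sum_i 0.5 * (x - c[i])**2."""
    rng = np.random.default_rng(seed)
    n = len(c)
    x = x0
    for k in range(epochs):
        step = 1.0 / (n * (k + 1))  # hypothetical epoch-dependent step size
        for i in rng.permutation(n):  # each component is used once per epoch
            x -= step * (x - c[i])  # gradient of the i-th component is x - c[i]
    return x
```

The minimizer of this toy objective is the mean of `c`, so the distance between the returned value and `c.mean()` gives a quick check of progress.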
-
Fast frequency discrimination and phoneme recognition using a biomimetic membrane coupled to a neural network
Authors:
Woo Seok Lee,
Hyunjae Kim,
Andrew N. Cleland,
Kang-Hun Ahn
Abstract:
In the human ear, the basilar membrane plays a central role in sound recognition. When excited by sound, this membrane responds with a frequency-dependent displacement pattern that is detected and identified by the auditory hair cells combined with the human neural system. Inspired by this structure, we designed and fabricated an artificial membrane that produces a spatial displacement pattern in response to an audible signal, which we used to train a convolutional neural network (CNN). When trained with single frequency tones, this system can unambiguously distinguish tones closely spaced in frequency. When instead trained to recognize spoken vowels, this system outperforms existing methods for phoneme recognition, including the discrete Fourier transform (DFT), zoom FFT and chirp z-transform, especially when tested in short time windows. This sound recognition scheme therefore promises significant benefits in fast and accurate sound identification compared to existing methods.
Submitted 9 April, 2020;
originally announced April 2020.
-
Correlation Clustering in Data Streams
Authors:
Kook Jin Ahn,
Graham Cormode,
Sudipto Guha,
Andrew McGregor,
Anthony Wirth
Abstract:
Clustering is a fundamental tool for analyzing large data sets. A rich body of work has been devoted to designing data-stream algorithms for the relevant optimization problems such as $k$-center, $k$-median, and $k$-means. Such algorithms need to be both time and space efficient. In this paper, we address the problem of correlation clustering in the dynamic data stream model. The stream consists of updates to the edge weights of a graph on $n$ nodes and the goal is to find a node-partition such that the end-points of negative-weight edges are typically in different clusters whereas the end-points of positive-weight edges are typically in the same cluster. We present polynomial-time, $O(n\cdot \mbox{polylog}~n)$-space approximation algorithms for natural problems that arise.
We first develop data structures based on linear sketches that allow the "quality" of a given node-partition to be measured. We then combine these data structures with convex programming and sampling techniques to solve the relevant approximation problem. Unfortunately, the standard LP and SDP formulations are not obviously solvable in $O(n\cdot \mbox{polylog}~n)$-space. Our work presents space-efficient algorithms for the convex programming required, as well as approaches to reduce the adaptivity of the sampling.
Submitted 5 December, 2018;
originally announced December 2018.
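The clustering goal stated in the abstract above can be made concrete by counting the edges a given node-partition places on the wrong side. The sketch below computes this count directly from an explicit edge list; it is a plain, non-streaming illustration and not the sketch-based data structure from the paper. The unweighted disagreement count and the function name `disagreements` are assumptions chosen for illustration.

```python
def disagreements(edges, cluster_of):
    """Count edges that a node-partition gets 'wrong'.

    edges: iterable of (u, v, weight) triples.
    cluster_of: dict mapping each node to its cluster id.
    """
    cost = 0
    for u, v, weight in edges:
        same = cluster_of[u] == cluster_of[v]
        # A negative edge inside a cluster, or a positive edge across
        # clusters, disagrees with the partition.
        if (weight < 0 and same) or (weight > 0 and not same):
            cost += 1
    return cost
```

For example, with one positive edge between nodes 0 and 1 and negative edges from node 2 to both of them, placing 0 and 1 together and 2 alone gives zero disagreements.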
-
Driving Experience Transfer Method for End-to-End Control of Self-Driving Cars
Authors:
Dooseop Choi,
Taeg-Hyun An,
Kyounghwan Ahn,
Jeongdan Choi
Abstract:
In this paper, we present a transfer learning method for the end-to-end control of self-driving cars, which enables a convolutional neural network (CNN) trained on a source domain to be utilized for the same task in a different target domain. A conventional CNN for the end-to-end control is designed to map a single front-facing camera image to a steering command. To enable the transfer learning, we let the CNN produce not only a steering command but also a lane departure level (LDL) by adding a new task module, which takes the output of the last convolutional layer as input. The CNN trained on the source domain, called source network, is then utilized to train another task module called target network, which also takes the output of the last convolutional layer of the source network and is trained to produce a steering command for the target domain. The steering commands from the source and target network are finally merged according to the LDL and the merged command is utilized for controlling a car in the target domain. To demonstrate the effectiveness of the proposed method, we utilized two simulators, TORCS and GTAV, for the source and the target domains, respectively. Experimental results show that the proposed method outperforms other baseline methods in terms of stable and safe control of cars.
Submitted 7 September, 2018; v1 submitted 6 September, 2018;
originally announced September 2018.
-
Artifacts Detection and Error Block Analysis from Broadcasted Videos
Authors:
Md Mehedi Hasan,
Tasneem Rahman,
Kiok Ahn,
Oksam Chae
Abstract:
With the advancement of IPTV and HDTV technology, previously subtle errors in videos are now becoming more prominent because of structure-oriented and compression-based artifacts. In this paper, we focus on the development of a real-time video quality check system. Lightweight edge gradient magnitude information is incorporated to acquire the statistical information, and the distorted frames are then estimated based on the characteristics of their surrounding frames. Then we apply the prominent texture patterns to classify them into different block errors and analyze them not only in video error detection applications but also in error concealment, restoration and retrieval. Finally, experiments on prominent datasets and broadcast videos show that the proposed algorithm detects errors efficiently for video broadcast and surveillance applications in terms of computation time and analysis of distorted frames.
Submitted 29 August, 2018;
originally announced August 2018.
-
Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
Authors:
Kwangjun Ahn,
Kangwook Lee,
Changho Suh
Abstract:
Spectral clustering is a celebrated algorithm that partitions objects based on pairwise similarity information. While this approach has been successfully applied to a variety of domains, it comes with limitations. The reason is that there are many other applications in which only \emph{multi}-way similarity measures are available. This motivates us to explore the multi-way measurement setting. In this work, we develop two algorithms intended for such a setting: Hypergraph Spectral Clustering (HSC) and Hypergraph Spectral Clustering with Local Refinement (HSCLR). Our main contribution lies in the performance analysis of these poly-time algorithms under a random hypergraph model, which we name the weighted stochastic block model, in which objects and multi-way measures are modeled as nodes and weights of hyperedges, respectively. Denoting by $n$ the number of nodes, our analysis reveals the following: (1) HSC outputs a partition which is better than a random guess if the sum of edge weights (to be explained later) is $Ω(n)$; (2) HSC outputs a partition which coincides with the hidden partition except for a vanishing fraction of nodes if the sum of edge weights is $ω(n)$; and (3) HSCLR exactly recovers the hidden partition if the sum of edge weights is on the order of $n \log n$. Our results improve upon the state of the art recently established under the model, and they are the first to settle the order-wise optimal results for the binary edge weight case. Moreover, we show that our results lead to efficient sketching algorithms for subspace clustering, a computer vision application. Lastly, we show that HSCLR achieves the information-theoretic limits for a special yet practically relevant model, thereby showing no computational barrier for the case.
Submitted 23 May, 2018;
originally announced May 2018.
-
Language and Noise Transfer in Speech Enhancement Generative Adversarial Network
Authors:
Santiago Pascual,
Maruchan Park,
Joan Serrà,
Antonio Bonafonte,
Kang-Hun Ahn
Abstract:
Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems into new, low resource environments an important topic. In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data. We investigate the minimum requirements to obtain a stable behavior in terms of several objective metrics in two very different languages: Catalan and Korean. We also study the variability of test performance to unseen noise as a function of the amount of different types of noise available for training. Results show that adapting a pre-trained English model with 10 min of data already achieves a comparable performance to having two orders of magnitude more data. They also demonstrate the relative stability in test performance with respect to the number of training noise types.
Submitted 18 December, 2017;
originally announced December 2017.
-
Computing the maximum matching width is NP-hard
Authors:
Kwangjun Ahn,
Jisu Jeong
Abstract:
The maximum matching width is a graph width parameter that is defined on a branch-decomposition over the vertex set of a graph. In this short paper, we prove that the problem of computing the maximum matching width is NP-hard.
Submitted 13 October, 2017;
originally announced October 2017.
-
Community Recovery in Hypergraphs
Authors:
Kwangjun Ahn,
Kangwook Lee,
Changho Suh
Abstract:
Community recovery is a central problem that arises in a wide variety of applications such as network clustering, motion segmentation, face clustering and protein complex detection. The objective of the problem is to cluster data points into distinct communities based on a set of measurements, each of which is associated with the values of a certain number of data points. While most of the prior works focus on a setting in which the number of data points involved in a measurement is two, this work explores a generalized setting in which the number can be more than two. Motivated by applications particularly in machine learning and channel coding, we consider two types of measurements: (1) homogeneity measurement which indicates whether or not the associated data points belong to the same community; (2) parity measurement which denotes the modulo-2 sum of the values of the data points. Such measurements are possibly corrupted by Bernoulli noise. We characterize the fundamental limits on the number of measurements required to reconstruct the communities for the considered models.
Submitted 11 September, 2017;
originally announced September 2017.
-
An Executable Specification of Typing Rules for Extensible Records based on Row Polymorphism
Authors:
Ki Yung Ahn
Abstract:
Type inference is an application domain that is a natural fit for logic programming (LP). LP systems natively support unification, which serves as a basic building block of typical type inference algorithms. In particular, polymorphic type inference in the Hindley--Milner type system (HM) can be succinctly specified and executed in Prolog. In our previous work, we have demonstrated that more advanced features of parametric polymorphism beyond HM, such as type-constructor polymorphism and kind polymorphism, can be similarly specified in Prolog. Here, we demonstrate a specification for records, which is one of the most widely supported compound data structures in real-world programming languages, and discuss the advantages and limitations of Prolog as a specification language for type systems. Record types are specified as order-irrelevant collections of named fields mapped to their corresponding types. In addition, an open-ended collection is used to support row polymorphism for record types to be extensible.
Submitted 11 September, 2017; v1 submitted 25 July, 2017;
originally announced July 2017.
-
Generating Witness of Non-Bisimilarity for the pi-Calculus
Authors:
Ki Yung Ahn,
Ross Horne,
Alwen Tiu
Abstract:
In the logic programming paradigm, it is difficult to develop an elegant solution for generating distinguishing formulae that witness the failure of open bisimilarity between two pi-calculus processes; this was unexpected because the semantics of the pi-calculus and open bisimulation have already been elegantly specified in higher-order logic programming systems. Our solution using Haskell defines the formulae generation as a tree transformation from the forest of all nondeterministic bisimulation steps to a pair of distinguishing formulae. Thanks to laziness in Haskell, only the necessary paths demanded by the tree transformation function are generated. Our work demonstrates that Haskell and its libraries provide an attractive platform for symbolically analyzing equivalence properties of labeled transition systems in an environment-sensitive setting.
Submitted 30 May, 2017;
originally announced May 2017.
-
A Characterisation of Open Bisimilarity using an Intuitionistic Modal Logic
Authors:
Ki Yung Ahn,
Ross Horne,
Alwen Tiu
Abstract:
Open bisimilarity is defined for open process terms in which free variables may appear. The insight is, in order to characterise open bisimilarity, we move to the setting of intuitionistic modal logics. The intuitionistic modal logic introduced, called $\mathcal{OM}$, is such that modalities are closed under substitutions, which induces a property known as intuitionistic hereditary. Intuitionistic hereditary reflects in logic the lazy instantiation of free variables performed when checking open bisimilarity. The soundness proof for open bisimilarity with respect to our intuitionistic modal logic is mechanised in Abella. The constructive content of the completeness proof provides an algorithm for generating distinguishing formulae, which we have implemented. We draw attention to the fact that there is a spectrum of bisimilarity congruences that can be characterised by intuitionistic modal logics.
Submitted 9 August, 2021; v1 submitted 19 January, 2017;
originally announced January 2017.
-
BER-Based Physical Layer Security with Finite Codelength: Combining Strong Converse and Error Amplification
Authors:
Il-Min Kim,
Byoung-Hoon Kim,
Joon Kui Ahn
Abstract:
A bit error rate (BER)-based physical layer security approach is proposed for finite blocklength. For secure communication in the sense of high BER, the information-theoretic strong converse is combined with cryptographic error amplification achieved by substitution permutation networks (SPNs) based on confusion and diffusion. For discrete memoryless channels (DMCs), an analytical framework is provided showing the tradeoffs among finite blocklength, maximum/minimum possible transmission rates, and BER requirements for the legitimate receiver and the eavesdropper. Also, the security gap is analytically studied for Gaussian channels and the concept is extended to other DMCs including binary symmetric channels (BSCs) and binary erasure channels (BECs). For fading channels, the transmit power is optimized to minimize the outage probability of the legitimate receiver subject to a BER threshold for the eavesdropper.
Submitted 4 January, 2015; v1 submitted 16 December, 2014;
originally announced December 2014.
-
Access to Data and Number of Iterations: Dual Primal Algorithms for Maximum Matching under Resource Constraints
Authors:
Kook Jin Ahn,
Sudipto Guha
Abstract:
In this paper we consider graph algorithms in models of computation where the space usage (random accessible storage, in addition to the read only input) is sublinear in the number of edges $m$ and the access to input data is constrained. These questions arise in many natural settings, and in particular in the analysis of MapReduce or similar algorithms that model constrained parallelism with sublinear central processing. In SPAA 2011, Lattanzi et al. provided an $O(1)$ approximation of maximum matching using $O(p)$ rounds of iterative filtering via MapReduce and $O(n^{1+1/p})$ space of central processing for a graph with $n$ nodes and $m$ edges.
We focus on weighted nonbipartite maximum matching in this paper. For any constant $p>1$, we provide an iterative sampling based algorithm for computing a $(1-ε)$-approximation of the weighted nonbipartite maximum matching that uses $O(p/ε)$ rounds of sampling, and $O(n^{1+1/p})$ space. The results extend to $b$-Matching with small changes. This paper combines adaptive sketching literature and fast primal-dual algorithms based on relaxed Dantzig-Wolfe decision procedures. Each round of sampling is implemented through linear sketches and executed in a single round of MapReduce. The paper also proves that nonstandard linear relaxations of a problem, in particular penalty based formulations, are helpful in MapReduce and similar settings in reducing the adaptive dependence of the iterations.
Submitted 20 April, 2015; v1 submitted 16 July, 2013;
originally announced July 2013.
-
Near Linear Time Approximation Schemes for Uncapacitated and Capacitated b--Matching Problems in Nonbipartite Graphs
Authors:
Kook Jin Ahn,
Sudipto Guha
Abstract:
We present the first near optimal approximation schemes for the maximum weighted (uncapacitated or capacitated) $b$--matching problems for non-bipartite graphs that run in time (near) linear in the number of edges. For any $δ>3/\sqrt{n}$ the algorithm produces a $(1-δ)$ approximation in $O(m \operatorname{poly}(δ^{-1},\log n))$ time. We provide fractional solutions for the standard linear programming formulations for these problems and subsequently also provide (near) linear time approximation schemes for rounding the fractional solutions.
Through these problems as a vehicle, we also present several ideas in the context of solving linear programs approximately using fast primal-dual algorithms. First, even though the duals of these problems have exponentially many variables and an efficient exact computation of dual weights is infeasible, we show that we can efficiently compute and use a sparse approximation of the dual weights using a combination of (i) adding perturbation to the constraints of the polytope and (ii) amplification followed by thresholding of the dual weights. Second, we show that approximation algorithms can be used to reduce the width of the formulation and achieve faster convergence.
Submitted 18 June, 2018; v1 submitted 16 July, 2013;
originally announced July 2013.
-
Irrelevance, Heterogeneous Equality, and Call-by-value Dependent Type Systems
Authors:
Vilhelm Sjöberg,
Chris Casinghino,
Ki Yung Ahn,
Nathan Collins,
Harley D. Eades III,
Peng Fu,
Garrin Kimmell,
Tim Sheard,
Aaron Stump,
Stephanie Weirich
Abstract:
We present a full-spectrum dependently typed core language which includes both nontermination and computational irrelevance (a.k.a. erasure), a combination which has not been studied before. The two features interact: to protect type safety we must be careful to only erase terminating expressions. Our language design is strongly influenced by the choice of CBV evaluation, and by our novel treatment of propositional equality which has a heterogeneous, completely erased elimination form.
Submitted 13 February, 2012;
originally announced February 2012.
-
Core-Periphery Segregation in Evolving Prisoner's Dilemma Networks
Authors:
Yunkyu Sohn,
Jung-Kyoo Choi,
T. K. Ahn
Abstract:
Dense cooperative networks are an essential element of social capital for a prosperous society. These networks enable individuals to overcome collective action dilemmas by enhancing trust. In many biological and social settings, network structures evolve endogenously as agents exit relationships and build new ones. However, the process by which evolutionary dynamics lead to self-organization of dense cooperative networks has not been explored. Our large group prisoner's dilemma experiments with exit and partner choice options show that core-periphery segregation of cooperators and defectors drives the emergence of cooperation. Cooperators' Quit-for-Tat and defectors' Roving strategy lead to a highly asymmetric core and periphery structure. Densely connected to each other, cooperators successfully isolate defectors and earn larger payoffs than defectors. Our analysis of the topological characteristics of evolving networks illuminates how social capital is generated.
Submitted 9 December, 2012; v1 submitted 3 May, 2011;
originally announced May 2011.
-
Laminar Families and Metric Embeddings: Non-bipartite Maximum Matching Problem in the Semi-Streaming Model
Authors:
Kook Jin Ahn,
Sudipto Guha
Abstract:
In this paper, we study the non-bipartite maximum matching problem in the semi-streaming model. The maximum matching problem in the semi-streaming model has received a significant amount of attention lately. While the problem has been somewhat well solved for bipartite graphs, the known algorithms for non-bipartite graphs use $2^{\frac1ε}$ passes or $n^{\frac1ε}$ time to compute a $(1-ε)$ approximation. In this paper we provide the first FPTAS (polynomial in $n,\frac1ε$) for the problem which is efficient in both the running time and the number of passes. We also show that we can estimate the size of the matching in $O(\frac1ε)$ passes using slightly superlinear space.
To achieve both results, we use the structural properties of the matching polytope such as the laminarity of the tight sets and total dual integrality. The algorithms are iterative, and are based on the fractional packing and covering framework. However the formulations herein require exponentially many variables or constraints. We use laminarity, metric embeddings and graph sparsification to reduce the space required by the algorithms in between and across the iterations. This is the first use of these ideas in the semi-streaming model to solve a combinatorial optimization problem.
Submitted 20 April, 2011;
originally announced April 2011.