-
Putting Humans in the Natural Language Processing Loop: A Survey
Authors:
Zijie J. Wang,
Dongjin Choi,
Shenyu Xu,
Diyi Yang
Abstract:
How can we design Natural Language Processing (NLP) systems that learn from human feedback? There is a growing research body of Human-in-the-loop (HITL) NLP frameworks that continuously integrate human feedback to improve the model itself. HITL NLP research is nascent but multifarious -- solving various NLP problems, collecting diverse feedback from different people, and applying different methods to learn from collected feedback. We present a survey of HITL NLP work from both Machine Learning (ML) and Human-Computer Interaction (HCI) communities that highlights its short yet inspiring history, and thoroughly summarize recent frameworks focusing on their tasks, goals, human interactions, and feedback learning methods. Finally, we discuss future directions for integrating human feedback in the NLP development loop.
Submitted 6 March, 2021;
originally announced March 2021.
-
Run Your Visual-Inertial Odometry on NVIDIA Jetson: Benchmark Tests on a Micro Aerial Vehicle
Authors:
Jinwoo Jeon,
Sungwook Jung,
Eungchang Lee,
Duckyu Choi,
Hyun Myung
Abstract:
This paper presents benchmark tests of various visual(-inertial) odometry algorithms on NVIDIA Jetson platforms. The compared algorithms include mono and stereo, covering Visual Odometry (VO) and Visual-Inertial Odometry (VIO): VINS-Mono, VINS-Fusion, Kimera, ALVIO, Stereo-MSCKF, ORB-SLAM2 stereo, and ROVIO. As these methods are mainly used for unmanned aerial vehicles (UAVs), they must perform well in situations where the size and weight of the processing board are limited. Jetson boards released by NVIDIA satisfy these constraints as they have a sufficiently powerful central processing unit (CPU) and graphics processing unit (GPU) for image processing. However, in existing studies, the performance of Jetson boards as a processing platform for executing VO/VIO has not been compared extensively in terms of the usage of computing resources and accuracy. Therefore, this study compares representative VO/VIO algorithms on several NVIDIA Jetson platforms, namely NVIDIA Jetson TX2, Xavier NX, and AGX Xavier, and introduces a novel dataset 'KAIST VIO dataset' for UAVs. Including pure rotations, the dataset has several geometric trajectories that are harsh to visual(-inertial) state estimation. The evaluation is performed in terms of the accuracy of estimated odometry, CPU usage, and memory usage on various Jetson boards, algorithms, and trajectories. We present the results of the comprehensive benchmark test and release the dataset for computer vision and robotics applications.
Submitted 2 March, 2021;
originally announced March 2021.
-
Peacock Exploration: A Lightweight Exploration for UAV using Control-Efficient Trajectory
Authors:
EungChang Mason Lee,
Duckyu Choi,
Hyun Myung
Abstract:
Unmanned Aerial Vehicles (UAVs) have received much attention in recent years due to their wide range of applications, such as exploration of an unknown environment to acquire a 3D map without prior knowledge of it. Existing exploration methods have been largely challenged by computationally heavy probabilistic path planning. Similarly, kinodynamic constraints or proper sensors considering the payload for UAVs were not considered. In this paper, to solve those issues and to consider the limited payload and computational resources of UAVs, we propose "Peacock Exploration": a lightweight exploration method for UAVs using precomputed minimum snap trajectories which look like a peacock's tail. Using the widely known, control-efficient minimum snap trajectories and OctoMap, a UAV equipped with an RGB-D camera can explore unknown 3D environments without any prior knowledge or human guidance with only O(log N) computational complexity. It also adopts the receding horizon approach and simple, heuristic scoring criteria. The proposed algorithm's performance is demonstrated by exploring a challenging 3D maze environment and compared with a state-of-the-art algorithm.
Submitted 29 December, 2020;
originally announced December 2020.
-
Creating a Physicist: The Impact of Informal Programs on University Student Development
Authors:
Callie Rethman,
Jonathan Perry,
Jonan Donaldson,
Daniel Choi,
Tatiana Erukhimova
Abstract:
Physics outreach programs provide a critical context for informal experiences that promote the transition from new student to contributing physicist. Prior studies have suggested a positive link between participation in informal physics outreach programs and the development of a student's physics identity. In this study, we adopt a student-focused investigation to explore the effects of informal programs on dimensions of physics identity, sense of community, 21st century skill development, and motivation. We employed a mixed methods study combining a survey instrument (117 responses) and interviews (35) with current and former undergraduate and graduate students who participated in five programs through a physics and astronomy department at a large land-grant university. To examine interviews, we employed a framework based on situated learning theory, transformative learning theory, and the Dynamic Systems Model of Role Identity. Our findings, based on self-reported data, show that students who facilitated informal physics programs positively developed their physics identity, experienced increased sense of belonging to the physics community, and developed 21st century career skills. Specifically, students reported positive benefits to their communication, teamwork and networking, and design skills. The benefits of these programs can be achieved by departments of any size without significant commitment of funds or changes to curriculum.
Submitted 29 May, 2021; v1 submitted 27 December, 2020;
originally announced December 2020.
-
Reinforcement learning with distance-based incentive/penalty (DIP) updates for highly constrained industrial control systems
Authors:
Hyungjun Park,
Daiki Min,
Jong-hyun Ryu,
Dong Gu Choi
Abstract:
Typical reinforcement learning (RL) methods show limited applicability for real-world industrial control problems because industrial systems involve various constraints and simultaneously require continuous and discrete control. To overcome these challenges, we devise a novel RL algorithm that enables an agent to handle a highly constrained action space. This algorithm has two main features. First, we devise two distance-based Q-value update schemes, incentive update and penalty update, in a distance-based incentive/penalty update technique to enable the agent to decide discrete and continuous actions in the feasible region and to update the value of these types of actions. Second, we propose a method for defining the penalty cost as a shadow price-weighted penalty. This approach affords two advantages compared to previous methods to efficiently induce the agent to not select an infeasible action. We apply our algorithm to an industrial control problem, microgrid system operation, and the experimental results demonstrate its superiority.
Submitted 19 May, 2021; v1 submitted 21 November, 2020;
originally announced November 2020.
-
Nonlinear imaging of nanoscale topological corner states
Authors:
Sergey Kruk,
Wenlong Gao,
Duk Yong Choi,
Thomas Zentgraf,
Shuang Zhang,
Yuri Kivshar
Abstract:
Topological states of light represent counterintuitive optical modes localized at boundaries of finite-size optical structures that originate from the properties of the bulk. Being defined by bulk properties, such boundary states are insensitive to certain types of perturbations, thus naturally enhancing robustness of photonic circuitries. Conventionally, the N-dimensional bulk modes correspond to (N-1)-dimensional boundary states. The higher-order bulk-boundary correspondence relates N-dimensional bulk to boundary states with dimensionality reduced by more than 1. A special interest lies in miniaturization of such higher-order topological states to the nanoscale. Here, we realize nanoscale topological corner states in metasurfaces with C6-symmetric honeycomb lattices. We directly observe nanoscale topology-empowered edge and corner localizations of light and enhancement of light-matter interactions via a nonlinear imaging technique. Control of light at the nanoscale empowered by topology may facilitate miniaturization and on-chip integration of classical and quantum photonic devices.
Submitted 1 September, 2022; v1 submitted 19 November, 2020;
originally announced November 2020.
-
Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering
Authors:
Ricky T. Q. Chen,
Dami Choi,
Lukas Balles,
David Duvenaud,
Philipp Hennig
Abstract:
Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters. Based on this intuition, we explore the use of exact per-sample Hessian-vector products and gradients to construct optimizers that are self-tuning and hyperparameter-free. Based on a dynamics model of the gradient, we derive a process which leads to a curvature-corrected, noise-adaptive online gradient estimate. The smoothness of our updates makes them more amenable to simple step size selection schemes, which we also base on our estimated quantities. We prove that our model-based procedure converges in the noisy quadratic setting. Though we do not see similar gains in deep learning tasks, we can match the performance of well-tuned optimizers and ultimately, this is an interesting step for constructing self-tuning optimizers.
Submitted 9 November, 2020;
originally announced November 2020.
-
Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models
Authors:
Changmao Li,
Elaine Fisher,
Rebecca Thomas,
Steve Pittard,
Vicki Hertzberg,
Jinho D. Choi
Abstract:
This paper presents a comprehensive study on resume classification to significantly reduce the time and labor needed to screen an overwhelming number of applications, while improving the selection of suitable candidates. A total of 6,492 resumes are extracted from 24,933 job applications for 252 positions designated into four levels of experience for Clinical Research Coordinators (CRC). Each resume is manually annotated to its most appropriate CRC position by experts through several rounds of triple annotation to establish guidelines. As a result, a high Kappa score of 61% is achieved for inter-annotator agreement. Given this dataset, novel transformer-based classification models are developed for two tasks: the first task takes a resume and classifies it to a CRC level (T1), and the second task takes both a resume and a job description to apply and predicts if the application is suited to the job (T2). Our best models using section encoding and multi-head attention decoding give results of 73.3% for T1 and 79.2% for T2. Our analysis shows that the prediction errors are mostly made among adjacent CRC levels, which are hard for even experts to distinguish, implying the practical value of our models in real HR platforms.
Submitted 5 November, 2020;
originally announced November 2020.
-
Extracting Chemical-Protein Interactions via Calibrated Deep Neural Network and Self-training
Authors:
Dongha Choi,
Hyunju Lee
Abstract:
The extraction of interactions between chemicals and proteins from several biomedical articles is important in many fields of biomedical research such as drug development and prediction of drug side effects. Several natural language processing methods, including deep neural network (DNN) models, have been applied to address this problem. However, these methods were trained with hard-labeled data, which tend to become over-confident, leading to degradation of the model reliability. To estimate the data uncertainty and improve the reliability, "calibration" techniques have been applied to deep learning models. In this study, to extract chemical--protein interactions, we propose a DNN-based approach incorporating uncertainty information and calibration techniques. Our model first encodes the input sequence using a pre-trained language-understanding model, following which it is trained using two calibration methods: mixup training and addition of a confidence penalty loss. Finally, the model is re-trained with augmented data that are extracted using the estimated uncertainties. Our approach has achieved state-of-the-art performance with regard to the Biocreative VI ChemProt task, while preserving higher calibration abilities than those of previous approaches. Furthermore, our approach also presents the possibilities of using uncertainty estimation for performance improvement.
Submitted 4 November, 2020;
originally announced November 2020.
-
Revealing the Myth of Higher-Order Inference in Coreference Resolution
Authors:
Liyan Xu,
Jinho D. Choi
Abstract:
This paper analyzes the impact of higher-order inference (HOI) on the task of coreference resolution. HOI has been adapted by almost all recent coreference resolution models without much investigation into its true effectiveness over representation learning. To make a comprehensive analysis, we implement an end-to-end coreference system as well as four HOI approaches: attended antecedent, entity equalization, span clustering, and cluster merging, where the latter two are our original methods. We find that given a high-performing encoder such as SpanBERT, the impact of HOI is negative to marginal, providing a new perspective of HOI to this task. Our best model using cluster merging shows the Avg-F1 of 80.2 on the CoNLL 2012 shared task dataset in English.
Submitted 28 September, 2020; v1 submitted 24 September, 2020;
originally announced September 2020.
-
Emora: An Inquisitive Social Chatbot Who Cares For You
Authors:
Sarah E. Finch,
James D. Finch,
Ali Ahmadvand,
Ingyu Choi,
Xiangjue Dong,
Ruixiang Qi,
Harshita Sahijwani,
Sergey Volokhin,
Zihan Wang,
Zihao Wang,
Jinho D. Choi
Abstract:
Inspired by studies on the overwhelming presence of experience-sharing in human-human conversations, Emora, the social chatbot developed by Emory University, aims to bring such experience-focused interaction to the current field of conversational AI. The traditional approach of information-sharing topic handlers is balanced with a focus on opinion-oriented exchanges that Emora delivers, and new conversational abilities are developed that support dialogues that consist of a collaborative understanding and learning process of the partner's life experiences. We present a curated dialogue system that leverages highly expressive natural language templates, powerful intent classification, and ontology resources to provide an engaging and interesting conversational experience to every user.
Submitted 9 September, 2020;
originally announced September 2020.
-
3D Room Layout Estimation Beyond the Manhattan World Assumption
Authors:
Dongho Choi
Abstract:
Predicting 3D room layout from a single image is a challenging task with many applications. In this paper, we propose a new training and post-processing method for 3D room layout estimation, built on a recent state-of-the-art 3D room layout estimation model. Experimental results show our method outperforms state-of-the-art approaches by a large margin in predicting visible room layout. Our method obtained 3rd place in the 2020 Holistic Scene Structures for 3D Vision Workshop.
Submitted 6 September, 2020;
originally announced September 2020.
-
Data Programming by Demonstration: A Framework for Interactively Learning Labeling Functions
Authors:
Sara Evensen,
Chang Ge,
Dongjin Choi,
Çağatay Demiralp
Abstract:
Data programming is a programmatic weak supervision approach to efficiently curate large-scale labeled training data. Writing data programs (labeling functions) requires, however, both programming literacy and domain expertise. Many subject matter experts have neither programming proficiency nor time to effectively write data programs. Furthermore, regardless of one's expertise in coding or machine learning, transferring domain expertise into labeling functions by enumerating rules and thresholds is not only time consuming but also inherently difficult. Here we propose a new framework, data programming by demonstration (DPBD), to generate labeling rules using interactive demonstrations of users. DPBD aims to relieve the burden of writing labeling functions from users, enabling them to focus on higher-level semantics such as identifying relevant signals for labeling tasks. We operationalize our framework with Ruler, an interactive system that synthesizes labeling rules for document classification by using span-level annotations of users on document examples. We compare Ruler with conventional data programming through a user study conducted with 10 data scientists creating labeling functions for sentiment and spam classification tasks. We find that Ruler is easier to use and learn and offers higher overall satisfaction, while providing discriminative model performances comparable to ones achieved by conventional data programming.
Submitted 15 September, 2020; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Molecular templates of spin textures on superconducting surfaces
Authors:
Cristina Mier,
Benjamin Verlhac,
Léo Garnier,
Roberto Robles,
Laurent Limot,
Nicolás Lorente,
Deung-Jang Choi
Abstract:
We create ordered islands of magnetically anisotropic nickelocene molecules on a Pb (111) substrate. By using inelastic electron tunneling spectra (IETS) and density functional theory, we characterize the magnetic response of these islands. This allows us to conclude that the islands present local and collective magnetic excitations. Furthermore, we show that nickelocene islands present complex non-collinear spin patterns on the superconducting Pb (111) surface, opening the possibility of using molecular arrays to engineer spin textures with important implications on topological superconductivity.
Submitted 1 September, 2020;
originally announced September 2020.
-
High-harmonic generation from metasurfaces empowered by bound states in the continuum
Authors:
George Zograf,
Kirill Koshelev,
Anastasia Zalogina,
Viacheslav Korolev,
Duk-Yong Choi,
Michael Zurch,
Christian Spielmann,
Barry Luther-Davies,
Daniil Kartashov,
Sergey Makarov,
Sergey Kruk,
Yuri Kivshar
Abstract:
The concept of optical bound states in the continuum (BICs) underpins the existence of strongly localized waves embedded into the radiation spectrum that can enhance the electromagnetic fields in subwavelength photonic structures. Early studies of optical BICs in waveguides and photonic crystals uncovered their topological properties, and the concept of quasi-BIC metasurfaces facilitated applications of strong light-matter interactions to biosensing, lasing, and low-order nonlinear processes. Here we employ BIC-empowered dielectric metasurfaces to generate efficiently high optical harmonics up to the 11th order. We optimize a BIC mode for the first few harmonics and observe a transition between perturbative and nonperturbative nonlinear regimes. We also suggest a general strategy for designing subwavelength structures with strong resonances and nonperturbative nonlinearities. Our work bridges the fields of perturbative and nonperturbative nonlinear optics on the subwavelength scale.
Submitted 26 August, 2020;
originally announced August 2020.
-
Quantum amplitude estimation algorithms on IBM quantum devices
Authors:
Pooja Rao,
Kwangmin Yu,
Hyunkyung Lim,
Dasol Jin,
Deokkyu Choi
Abstract:
Since the publication of the Quantum Amplitude Estimation (QAE) algorithm by Brassard et al., 2002, several variations have been proposed, such as Aaronson et al., 2019, Grinko et al., 2019, and Suzuki et al., 2020. The main difference between the original and the variants is the exclusion of Quantum Phase Estimation (QPE) by the latter. This difference is notable given that QPE is the key component of original QAE, but is composed of many operations considered expensive for the current NISQ era devices. We compare two recently proposed variants (Grinko et al., 2019 and Suzuki et al., 2020) by implementing them on the IBM Quantum device using Qiskit, an open source framework for quantum computing. We analyze and discuss advantages of each algorithm from the point of view of their implementation and performance on a quantum computer.
Submitted 3 August, 2020;
originally announced August 2020.
-
XD at SemEval-2020 Task 12: Ensemble Approach to Offensive Language Identification in Social Media Using Transformer Encoders
Authors:
Xiangjue Dong,
Jinho D. Choi
Abstract:
This paper presents six document classification models using the latest transformer encoders and a high-performing ensemble model for a task of offensive language identification in social media. For the individual models, deep transformer layers are applied to perform multi-head attentions. For the ensemble model, the utterance representations taken from those individual models are concatenated and fed into a linear decoder to make the final decisions. Our ensemble model outperforms the individual models and shows up to 8.6% improvement over the individual models on the development set. On the test set, it achieves macro-F1 of 90.9% and becomes one of the high performing systems among 85 participants in the sub-task A of this shared task. Our analysis shows that although the ensemble model significantly improves the accuracy on the development set, the improvement is not as evident on the test set.
Submitted 21 July, 2020;
originally announced July 2020.
-
The Completed SDSS-IV Extended Baryon Oscillation Spectroscopic Survey: N-body Mock Challenge for Galaxy Clustering Measurements
Authors:
Graziano Rossi,
Peter D. Choi,
Jeongin Moon,
Julian E. Bautista,
Hector Gil-Marin,
Romain Paviot,
Mariana Vargas-Magana,
Sylvain de la Torre,
Sebastien Fromenteau,
Ashley J. Ross,
Santiago Avila,
Etienne Burtin,
Kyle S. Dawson,
Stephanie Escoffier,
Salman Habib,
Katrin Heitmann,
Jiamin Hou,
Eva-Maria Mueller,
Will J. Percival,
Alex Smith,
Cheng Zhao,
Gong-Bo Zhao
Abstract:
We develop a series of N-body data challenges, functional to the final analysis of the extended Baryon Oscillation Spectroscopic Survey (eBOSS) Data Release 16 (DR16) galaxy sample. The challenges are primarily based on high-fidelity catalogs constructed from the Outer Rim simulation - a large box size realization (3 Gpc/h) characterized by an unprecedented combination of volume and mass resolution, down to 1.85x10^9 M_sun/h. We generate synthetic galaxy mocks by populating Outer Rim halos with a variety of halo occupation distribution (HOD) schemes of increasing complexity, spanning different redshift intervals. We then assess the performance of three complementary redshift space distortion (RSD) models in configuration and Fourier space, adopted for the analysis of the complete DR16 eBOSS sample of Luminous Red Galaxies (LRGs). We find all the methods mutually consistent, with comparable systematic errors on the Alcock-Paczynski parameters and the growth of structure, and robust to different HOD prescriptions - thus validating the robustness of the models and the pipelines used for the baryon acoustic oscillation (BAO) and full shape clustering analysis. In particular, all the techniques are able to recover a_par and a_perp to within 0.9%, and fsig8 to within 1.5%. As a by-product of our work, we are also able to gain interesting insights on the galaxy-halo connection. Our study is relevant for the final eBOSS DR16 `consensus cosmology', as the systematic error budget is informed by testing the results of analyses against these high-resolution mocks. In addition, it is also useful for future large-volume surveys, since similar mock-making techniques and systematic corrections can be readily extended to model for instance the Dark Energy Spectroscopic Instrument (DESI) galaxy sample.
Submitted 25 March, 2021; v1 submitted 17 July, 2020;
originally announced July 2020.
-
The Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey: Large-scale Structure Catalogs for Cosmological Analysis
Authors:
Ashley J. Ross,
Julian Bautista,
Rita Tojeiro,
Shadab Alam,
Stephen Bailey,
Etienne Burtin,
Johan Comparat,
Kyle S. Dawson,
Arnaud de Mattia,
Hélion du Mas des Bourboux,
Héctor Gil-Marín,
Jiamin Hou,
Hui Kong,
Brad W. Lyke,
Faizan G. Mohammad,
John Moustakas,
Eva-Maria Mueller,
Adam D. Myers,
Will J. Percival,
Anand Raichoor,
Mehdi Rezaie,
Hee-Jong Seo,
Alex Smith,
Jeremy L. Tinker,
Pauline Zarrouk
, et al. (31 additional authors not shown)
Abstract:
We present large-scale structure catalogs from the completed extended Baryon Oscillation Spectroscopic Survey (eBOSS). Derived from Sloan Digital Sky Survey (SDSS) -IV Data Release 16 (DR16), these catalogs provide the data samples, corrected for observational systematics, and random positions sampling the survey selection function. Combined, they allow large-scale clustering measurements suitable for testing cosmological models. We describe the methods used to create these catalogs for the eBOSS DR16 Luminous Red Galaxy (LRG) and Quasar samples. The quasar catalog contains 343,708 redshifts with $0.8 < z < 2.2$ over 4,808\,deg$^2$. We combine 174,816 eBOSS LRG redshifts over 4,242\,deg$^2$ in the redshift interval $0.6 < z < 1.0$ with SDSS-III BOSS LRGs in the same redshift range to produce a combined sample of 377,458 galaxy redshifts distributed over 9,493\,deg$^2$. Improved algorithms for estimating redshifts allow that 98 per cent of LRG observations result in a successful redshift, with less than one per cent catastrophic failures ($Δz > 1000$ ${\rm km~s}^{-1}$). For quasars, these rates are 95 and 2 per cent (with $Δz > 3000$ ${\rm km~s}^{-1}$). We apply corrections for trends between the number densities of our samples and the properties of the imaging and spectroscopic data. For example, the quasar catalog obtains a $χ^2$/DoF$= 776/10$ for a null test against imaging depth before corrections and a $χ^2$/DoF$=6/8$ after. The catalogs, combined with careful consideration of the details of their construction found here-in, allow companion papers to present cosmological results with negligible impact from observational systematic uncertainties.
Submitted 30 September, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
The Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey: measurement of the BAO and growth rate of structure of the luminous red galaxy sample from the anisotropic power spectrum between redshifts 0.6 and 1.0
Authors:
Héctor Gil-Marín,
Julián E. Bautista,
Romain Paviot,
Mariana Vargas-Magaña,
Sylvain de la Torre,
Sebastien Fromenteau,
Shadab Alam,
Santiago Ávila,
Etienne Burtin,
Chia-Hsun Chuang,
Kyle S. Dawson,
Jiamin Hou,
Arnaud de Mattia,
Faizan G. Mohammad,
Eva-Maria Müller,
Seshadri Nadathur,
Richard Neveux,
Will J. Percival,
Anand Raichoor,
Mehdi Rezaie,
Ashley J. Ross,
Graziano Rossi,
Vanina Ruhlmann-Kleider,
Alex Smith,
Amélie Tamone
, et al. (15 additional authors not shown)
Abstract:
We analyse the clustering of the Sloan Digital Sky Survey IV extended Baryon Oscillation Spectroscopic Survey Data Release 16 luminous red galaxy sample (DR16 eBOSS LRG) in combination with the high redshift tail of the Sloan Digital Sky Survey III Baryon Oscillation Spectroscopic Survey Data Release 12 (DR12 BOSS CMASS). We measure the redshift space distortions (RSD) and also extract the longitudinal and transverse baryonic acoustic oscillation (BAO) scale from the anisotropic power spectrum signal inferred from 377,458 galaxies between redshifts 0.6 and 1.0, with effective redshift of $z_{\rm eff}=0.698$ and effective comoving volume of $2.72\,{\rm Gpc}^3$. After applying reconstruction we measure the BAO scale and infer $D_H(z_{\rm eff})/r_{\rm drag} = 19.30\pm 0.56$ and $D_M(z_{\rm eff})/r_{\rm drag} =17.86 \pm 0.37$. When we perform a redshift space distortions analysis of the monopole, quadrupole and hexadecapole of the pre-reconstructed catalogue, we find $D_H(z_{\rm eff})/r_{\rm drag} = 20.18\pm 0.78$, $D_M(z_{\rm eff})/r_{\rm drag} =17.49 \pm 0.52$ and $fσ_8(z_{\rm eff})=0.454\pm0.046$. We combine both sets of results along with the measurements in configuration space of Bautista et al. 2020 and report the following consensus values: $D_H(z_{\rm eff})/r_{\rm drag} = 19.77\pm 0.47$, $D_M(z_{\rm eff})/r_{\rm drag} = 17.65\pm 0.30$ and $fσ_8(z_{\rm eff})=0.473\pm 0.044$, which are in full agreement with the standard $Λ$CDM and GR predictions. These results represent the most precise measurements within the redshift range $0.6\leq z \leq 1.0$ and are the culmination of more than 8 years of SDSS observations.
Submitted 21 December, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
The Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey: measurement of the BAO and growth rate of structure of the luminous red galaxy sample from the anisotropic correlation function between redshifts 0.6 and 1
Authors:
Julian E. Bautista,
Romain Paviot,
Mariana Vargas Magaña,
Sylvain de la Torre,
Sebastien Fromenteau,
Hector Gil-Marín,
Ashley J. Ross,
Etienne Burtin,
Kyle S. Dawson,
Jiamin Hou,
Jean-Paul Kneib,
Arnaud de Mattia,
Will J. Percival,
Graziano Rossi,
Rita Tojeiro,
Cheng Zhao,
Gong-Bo Zhao,
Shadab Alam,
Joel Brownstein,
Michael J. Chapman,
Peter D. Choi,
Chia-Hsun Chuang,
Stéphanie Escoffier,
Axel de la Macorra,
Hélion du Mas des Bourboux
, et al. (8 additional authors not shown)
Abstract:
We present the cosmological analysis of the configuration-space anisotropic clustering in the completed Sloan Digital Sky Survey IV (SDSS-IV) extended Baryon Oscillation Spectroscopic Survey (eBOSS) DR16 galaxy sample. This sample consists of luminous red galaxies (LRGs) spanning the redshift range $0.6 < z < 1$, at an effective redshift of $z_{\rm eff}=0.698$. It combines 174 816 eBOSS LRGs and 202 642 BOSS CMASS galaxies. We extract and model the baryon acoustic oscillations (BAO) and redshift-space distortions (RSD) features from the galaxy two-point correlation function to infer geometrical and dynamical cosmological constraints. The adopted methodology is extensively tested on a set of realistic simulations. The correlations between the inferred parameters from the BAO and full-shape correlation function analyses are estimated. This allows us to derive joint constraints on the three cosmological parameter combinations: $D_M(z)/r_d$, $D_H(z)/r_d$ and $fσ_8(z)$, where $D_M$ is the comoving angular diameter distance, $D_H$ is Hubble distance, $r_d$ is the comoving BAO scale, $f$ is the linear growth rate of structure, and $σ_8$ is the amplitude of linear matter perturbations. After combining the results with those from the parallel power spectrum analysis of Gil-Marin et al. 2020, we obtain the constraints: $D_M/r_d = 17.65 \pm 0.30$, $D_H/r_d = 19.77 \pm 0.47$, $fσ_8 = 0.473 \pm 0.044$. These measurements are consistent with a flat $Λ$CDM model with standard gravity.
Submitted 21 September, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
The Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey: Cosmological Implications from two Decades of Spectroscopic Surveys at the Apache Point observatory
Authors:
eBOSS Collaboration,
Shadab Alam,
Marie Aubert,
Santiago Avila,
Christophe Balland,
Julian E. Bautista,
Matthew A. Bershady,
Dmitry Bizyaev,
Michael R. Blanton,
Adam S. Bolton,
Jo Bovy,
Jonathan Brinkmann,
Joel R. Brownstein,
Etienne Burtin,
Solene Chabanier,
Michael J. Chapman,
Peter Doohyun Choi,
Chia-Hsun Chuang,
Johan Comparat,
Andrei Cuceu,
Kyle S. Dawson,
Axel de la Macorra,
Sylvain de la Torre,
Arnaud de Mattia,
Victoria de Sainte Agathe
, et al. (75 additional authors not shown)
Abstract:
We present the cosmological implications from final measurements of clustering using galaxies, quasars, and Ly$α$ forests from the completed Sloan Digital Sky Survey (SDSS) lineage of experiments in large-scale structure. These experiments, composed of data from SDSS, SDSS-II, BOSS, and eBOSS, offer independent baryon acoustic oscillation (BAO) measurements of angular-diameter distances and Hubble distances relative to the sound horizon, $r_d$, from eight different samples and six measurements of the growth rate parameter, $fσ_8$, from redshift-space distortions (RSD). This composite sample is the most constraining of its kind and allows us to perform a comprehensive assessment of the cosmological model after two decades of dedicated spectroscopic observation. We show that the BAO data alone are able to rule out dark-energy-free models at more than eight standard deviations in an extension to the flat, $Λ$CDM model that allows for curvature. When combined with Planck Cosmic Microwave Background (CMB) measurements of temperature and polarization, the BAO data provide nearly an order of magnitude improvement on curvature constraints. The RSD measurements indicate a growth rate that is consistent with predictions from Planck primary data and with General Relativity. When combining the results of SDSS BAO and RSD with external data, all multiple-parameter extensions remain consistent with a $Λ$CDM model. Regardless of cosmological model, the precision on $Ω_Λ$, $H_0$, and $σ_8$ remains at roughly 1\%, showing changes of less than 0.6\% in the central values between models. The inverse distance ladder measurement under a o$w_0w_a$CDM yields $H_0= 68.20 \pm 0.81 \, \rm km\, s^{-1} Mpc^{-1}$, remaining in tension with several direct determination methods. (abridged)
Submitted 9 July, 2024; v1 submitted 17 July, 2020;
originally announced July 2020.
-
PathGAN: Local Path Planning with Attentive Generative Adversarial Networks
Authors:
Dooseop Choi,
Seung-jun Han,
Kyoungwook Min,
Jeongdan Choi
Abstract:
To achieve autonomous driving without high-definition maps, we present a model capable of generating multiple plausible paths from egocentric images for autonomous vehicles. Our generative model comprises two neural networks: the feature extraction network (FEN) and path generation network (PGN). The FEN extracts meaningful features from an egocentric image, whereas the PGN generates multiple paths from the features, given a driving intention and speed. To ensure that the paths generated are plausible and consistent with the intention, we introduce an attentive discriminator and train it with the PGN under generative adversarial networks framework. We also devise an interaction model between the positions in the paths and the intentions hidden in the positions and design a novel PGN architecture that reflects the interaction model, resulting in the improvement of the accuracy and diversity of the generated paths. Finally, we introduce ETRIDriving, a dataset for autonomous driving in which the recorded sensor data are labeled with discrete high-level driving actions, and demonstrate the state-of-the-art performance of the proposed model on ETRIDriving in terms of accuracy and diversity.
Submitted 2 March, 2021; v1 submitted 7 July, 2020;
originally announced July 2020.
-
The spontaneous symmetry breaking in Ta$_2$NiSe$_5$ is structural in nature
Authors:
Edoardo Baldini,
Alfred Zong,
Dongsung Choi,
Changmin Lee,
Marios H. Michael,
Lukas Windgaetter,
Igor I. Mazin,
Simone Latini,
Doron Azoury,
Baiqing Lv,
Anshul Kogar,
Yao Wang,
Yangfan Lu,
Tomohiro Takayama,
Hidenori Takagi,
Andrew J. Millis,
Angel Rubio,
Eugene Demler,
Nuh Gedik
Abstract:
The excitonic insulator is an electronically-driven phase of matter that emerges upon the spontaneous formation and Bose condensation of excitons. Detecting this exotic order in candidate materials is a subject of paramount importance, as the size of the excitonic gap in the band structure establishes the potential of this collective state for superfluid energy transport. However, the identification of this phase in real solids is hindered by the coexistence of a structural order parameter with the same symmetry as the excitonic order. Only a few materials are currently believed to host a dominant excitonic phase, Ta$_2$NiSe$_5$ being the most promising. Here, we test this scenario by using an ultrashort laser pulse to quench the broken-symmetry phase of this transition metal chalcogenide. Tracking the dynamics of the material's electronic and crystal structure after light excitation reveals surprising spectroscopic fingerprints that are only compatible with a primary order parameter of phononic nature. We rationalize our findings through state-of-the-art calculations, confirming that the structural order accounts for most of the electronic gap opening. Not only do our results uncover the long-sought mechanism driving the phase transition of Ta$_2$NiSe$_5$, but they also conclusively rule out any substantial excitonic character in this instability.
Submitted 6 July, 2020;
originally announced July 2020.
-
Delayed Q-update: A novel credit assignment technique for deriving an optimal operation policy for the Grid-Connected Microgrid
Authors:
Hyungjun Park,
Daiki Min,
Jong-hyun Ryu,
Dong Gu Choi
Abstract:
A microgrid is an innovative system that integrates distributed energy resources to supply electricity demand within electrical boundaries. This study proposes an approach for deriving a desirable microgrid operation policy that enables sophisticated controls in the microgrid system using the proposed novel credit assignment technique, delayed-Q update. The technique employs novel features such as the ability to tackle and resolve the delayed effective property of the microgrid, which prevents learning agents from deriving a well-fitted policy under sophisticated controls. The proposed technique tracks the history of the charging period and retroactively assigns an adjusted value to the ESS charging control. The operation policy derived using the proposed approach is well-fitted for the real effects of ESS operation because of the process of the technique. Therefore, it supports the search for a near-optimal operation policy under a sophisticatedly controlled microgrid environment. To validate our technique, we simulate the operation policy under a real-world grid-connected microgrid system and demonstrate the convergence to a near-optimal policy by comparing performance measures of our policy with benchmark policy and optimal policy.
Submitted 20 October, 2020; v1 submitted 30 June, 2020;
originally announced June 2020.
-
Sequential Feature Filtering Classifier
Authors:
Minseok Seo,
Jaemin Lee,
Jongchan Park,
Dong-Geol Choi
Abstract:
We propose Sequential Feature Filtering Classifier (FFC), a simple but effective classifier for convolutional neural networks (CNNs). With sequential LayerNorm and ReLU, FFC zeroes out low-activation units and preserves high-activation units. The sequential feature filtering process generates multiple features, which are fed into a shared classifier for multiple outputs. FFC can be applied to any CNNs with a classifier, and significantly improves performances with negligible overhead. We extensively validate the efficacy of FFC on various tasks: ImageNet-1K classification, MS COCO detection, Cityscapes segmentation, and HMDB51 action recognition. Moreover, we empirically show that FFC can further improve performances upon other techniques, including attention modules and augmentation techniques. The code and models will be publicly available.
Submitted 21 June, 2020;
originally announced June 2020.
-
Gradient Estimation with Stochastic Softmax Tricks
Authors:
Max B. Paulus,
Dami Choi,
Daniel Tarlow,
Andreas Krause,
Chris J. Maddison
Abstract:
The Gumbel-Max trick is the basis of many relaxed gradient estimators. These estimators are easy to implement and low variance, but the goal of scaling them comprehensively to large combinatorial distributions is still outstanding. Working within the perturbation model framework, we introduce stochastic softmax tricks, which generalize the Gumbel-Softmax trick to combinatorial spaces. Our framework is a unified perspective on existing relaxed estimators for perturbation models, and it contains many novel relaxations. We design structured relaxations for subset selection, spanning trees, arborescences, and others. When compared to less structured baselines, we find that stochastic softmax tricks can be used to train latent variable models that perform better and discover more latent structure.
Submitted 28 February, 2021; v1 submitted 14 June, 2020;
originally announced June 2020.
-
Emora STDM: A Versatile Framework for Innovative Dialogue System Development
Authors:
James D. Finch,
Jinho D. Choi
Abstract:
This demo paper presents Emora STDM (State Transition Dialogue Manager), a dialogue system development framework that provides novel workflows for rapid prototyping of chat-based dialogue managers as well as collaborative development of complex interactions. Our framework caters to a wide range of expertise levels by supporting interoperability between two popular approaches to dialogue management, state machine and information state. Our Natural Language Expression package allows seamless integration of pattern matching, custom NLP modules, and database querying, which makes the workflows much more efficient. As a user study, we adopt this framework in an interdisciplinary undergraduate course where students with both technical and non-technical backgrounds are able to develop creative dialogue managers in a short period of time.
Submitted 10 June, 2020;
originally announced June 2020.
-
Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols
Authors:
Sarah E. Finch,
Jinho D. Choi
Abstract:
As conversational AI-based dialogue management has increasingly become a trending topic, the need for a standardized and reliable evaluation procedure grows even more pressing. The current state of affairs suggests various evaluation protocols to assess chat-oriented dialogue management systems, rendering it difficult to conduct fair comparative studies across different approaches and gain an insightful understanding of their values. To foster this research, a more robust evaluation protocol must be set in place. This paper presents a comprehensive synthesis of both automated and human evaluation methods on dialogue systems, identifying their shortcomings while accumulating evidence towards the most effective evaluation dimensions. A total of 20 papers from the last two years are surveyed to analyze three types of evaluation protocols: automated, static, and interactive. Finally, the evaluation dimensions used in these papers are compared against our expert evaluation on the system-user dialogue data collected from the Alexa Prize 2020.
Submitted 10 June, 2020;
originally announced June 2020.
-
Analysis of the Penn Korean Universal Dependency Treebank (PKT-UD): Manual Revision to Build Robust Parsing Model in Korean
Authors:
Tae Hwan Oh,
Ji Yoon Han,
Hyonsu Choe,
Seokwon Park,
Han He,
Jinho D. Choi,
Na-Rae Han,
Jena D. Hwang,
Hansaem Kim
Abstract:
In this paper, we first raise important issues regarding the Penn Korean Universal Treebank (PKT-UD) and address these issues by revising the entire corpus manually with the aim of producing cleaner UD annotations that are more faithful to Korean grammar. For compatibility with the rest of the UD corpora, we follow the UDv2 guidelines, and extensively revise the part-of-speech tags and the dependency relations to reflect morphological features and flexible word-order aspects in Korean. The original and the revised versions of PKT-UD are evaluated with transformer-based parsing models using biaffine attention. The parsing model trained on the revised corpus shows a significant improvement of 3.0% in labeled attachment score over the model trained on the previous corpus. Our error analysis demonstrates that this revision allows the parsing model to learn relations more robustly, reducing several critical errors that used to be made by the previous model.
Submitted 26 May, 2020;
originally announced May 2020.
-
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media
Authors:
Xiangjue Dong,
Changmao Li,
Jinho D. Choi
Abstract:
We present a transformer-based sarcasm detection model that accounts for the context from the entire conversation thread for more robust predictions. Our model uses deep transformer layers to perform multi-head attentions among the target utterance and the relevant context in the thread. The context-aware models are evaluated on two datasets from social media, Twitter and Reddit, and show 3.1% and 7.0% improvements over their baselines. Our best models give the F1-scores of 79.0% and 75.0% for the Twitter and Reddit datasets respectively, becoming one of the highest performing systems among 36 participants in this shared task.
Submitted 22 May, 2020;
originally announced May 2020.
-
Fast magneto-ionic switching of interface anisotropy using yttria-stabilized zirconia gate oxide
Authors:
Ki-Young Lee,
Sujin Jo,
Aik Jun Tan,
Mantao Huang,
Dongwon Choi,
Jung Hoon Park,
Ho-Il Ji,
Ji-Won Son,
Joonyeon Chang,
Geoffrey S. D. Beach,
Seonghoon Woo
Abstract:
Voltage control of interfacial magnetism has been greatly highlighted in spintronics research for many years, as it might enable ultra-low power technologies. Among the few suggested approaches, magneto-ionic control of magnetism has demonstrated large modulation of magnetic anisotropy. Moreover, the recent demonstration of magneto-ionic devices using hydrogen ions presented relatively fast magnetization toggle switching, $t_{\rm sw}$ ~ 100 ms, at room temperature. However, the operation speed may need to be significantly improved to be used for modern electronic devices. Here, we demonstrate that the speed of proton-induced magnetization toggle switching largely depends on proton-conducting oxides. We achieve ~1 ms reliable (> $10^3$ cycles) switching using yttria-stabilized zirconia (YSZ), which is ~ 100 times faster than the state-of-the-art magneto-ionic devices reported to date at room temperature. Our results suggest further engineering of the proton-conducting materials could bring substantial improvement that may enable new low-power computing schemes based on magneto-ionics.
Submitted 5 May, 2020;
originally announced May 2020.
-
Noise Pollution in Hospital Readmission Prediction: Long Document Classification with Reinforcement Learning
Authors:
Liyan Xu,
Julien Hogan,
Rachel E. Patzer,
Jinho D. Choi
Abstract:
This paper presents a reinforcement learning approach to extract noise in long clinical documents for the task of readmission prediction after kidney transplant. We face the challenges of developing robust models on a small dataset where each document may consist of over 10K tokens and is full of noise, including tabular text and task-irrelevant sentences. We first experiment with four types of encoders to empirically decide the best document representation, and then apply reinforcement learning to remove noisy text from the long documents, which models the noise extraction process as a sequential decision problem. Our results show that the old bag-of-words encoder outperforms deep learning-based encoders on this task, and reinforcement learning is able to improve upon the baseline while pruning out 25% of the text segments. Our analysis shows that reinforcement learning is able to identify both typical noisy tokens and task-specific noisy text.
Submitted 23 May, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
Integrated Eojeol Embedding for Erroneous Sentence Classification in Korean Chatbots
Authors:
DongHyun Choi,
IlNam Park,
Myeong Cheol Shin,
EungGyun Kim,
Dong Ryeol Shin
Abstract:
This paper attempts to analyze the Korean sentence classification system for a chatbot. Sentence classification is the task of classifying an input sentence based on predefined categories. However, spelling or spacing errors in the input sentence cause problems in morphological analysis and tokenization. This paper proposes a novel approach of Integrated Eojeol (Korean syntactic word separated by space) Embedding to reduce the effect that poorly analyzed morphemes may have on sentence classification. It also proposes two noise insertion methods that further improve classification performance. Our evaluation results indicate that the proposed system classifies erroneous sentences more accurately than the baseline system by 17%p.
Submitted 12 April, 2020;
originally announced April 2020.
-
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
Authors:
Changmao Li,
Jinho D. Choi
Abstract:
We introduce a novel approach to transformers that learns hierarchical representations in multiparty dialogue. First, three language modeling tasks are used to pre-train the transformers, token- and utterance-level language modeling and utterance order prediction, that learn both token and utterance embeddings for better understanding in dialogue contexts. Then, multi-task learning between the utterance prediction and the token span prediction is applied to fine-tune for span-based question answering (QA). Our approach is evaluated on the FriendsQA dataset and shows improvements of 3.8% and 1.4% over the two state-of-the-art transformer models, BERT and RoBERTa, respectively.
Submitted 23 May, 2020; v1 submitted 7 April, 2020;
originally announced April 2020.
-
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Authors:
DongHyun Choi,
Myeong Cheol Shin,
EungGyun Kim,
Dong Ryeol Shin
Abstract:
Text-to-SQL is the problem of converting a user question into an SQL query, when the question and database are given. In this paper, we present a neural network approach called RYANSQL (Recursively Yielding Annotation Network for SQL) to solve complex Text-to-SQL tasks for cross-domain databases. Statement Position Code (SPC) is defined to transform a nested SQL query into a set of non-nested SELECT statements; a sketch-based slot filling approach is proposed to synthesize each SELECT statement for its corresponding SPC. Additionally, two input manipulation methods are presented to improve generation performance further. RYANSQL achieved 58.2% accuracy on the challenging Spider benchmark, which is a 3.2%p improvement over previous state-of-the-art approaches. At the time of writing, RYANSQL achieves the first position on the Spider leaderboard.
Submitted 7 April, 2020;
originally announced April 2020.
-
Analytical shape recovery of a conductivity inclusion based on Faber polynomials
Authors:
Doosung Choi,
Junbeom Kim,
Mikyoung Lim
Abstract:
A conductivity inclusion, inserted in a homogeneous background, induces a perturbation in the background potential. This perturbation admits a multipole expansion whose coefficients are the so-called generalized polarization tensors (GPTs). GPTs can be obtained from multistatic measurements. As a modification of GPTs, the Faber polynomial polarization tensors (FPTs) were recently introduced in two dimensions. In this study, we design two novel analytical non-iterative methods for recovering the shape of a simply connected inclusion from GPTs by employing the concept of FPTs. First, we derive an explicit expression for the coefficients of the exterior conformal mapping associated with an inclusion in a simple form in terms of GPTs, which allows us to accurately reconstruct the shape of an inclusion with extreme or near-extreme conductivity. Secondly, we provide an explicit asymptotic formula in terms of GPTs for the shape of an inclusion with arbitrary conductivity by considering the inclusion as a perturbation of its equivalent ellipse. With this formula, one can non-iteratively approximate an inclusion of general shape with arbitrary conductivity, including a straight or asymmetric shape. Numerical experiments demonstrate the validity of the proposed analytical approaches.
Submitted 11 May, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
The Sixteenth Data Release of the Sloan Digital Sky Surveys: First Release from the APOGEE-2 Southern Survey and Full Release of eBOSS Spectra
Authors:
Romina Ahumada,
Carlos Allende Prieto,
Andres Almeida,
Friedrich Anders,
Scott F. Anderson,
Brett H. Andrews,
Borja Anguiano,
Riccardo Arcodia,
Eric Armengaud,
Marie Aubert,
Santiago Avila,
Vladimir Avila-Reese,
Carles Badenes,
Christophe Balland,
Kat Barger,
Jorge K. Barrera-Ballesteros,
Sarbani Basu,
Julian Bautista,
Rachael L. Beaton,
Timothy C. Beers,
B. Izamar T. Benavides,
Chad F. Bender,
Mariangela Bernardi,
Matthew Bershady,
Florian Beutler
, et al. (289 additional authors not shown)
Abstract:
This paper documents the sixteenth data release (DR16) from the Sloan Digital Sky Surveys; the fourth and penultimate from the fourth phase (SDSS-IV). This is the first release of data from the southern hemisphere survey of the Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2); new data from APOGEE-2 North are also included. DR16 is also notable as the final data release for the main cosmological program of the Extended Baryon Oscillation Spectroscopic Survey (eBOSS), and all raw and reduced spectra from that project are released here. DR16 also includes all the data from the Time Domain Spectroscopic Survey (TDSS) and new data from the SPectroscopic IDentification of ERosita Survey (SPIDERS) programs, both of which were co-observed on eBOSS plates. DR16 has no new data from the Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey (or the MaNGA Stellar Library "MaStar"). We also preview future SDSS-V operations (due to start in 2020), and summarize plans for the final SDSS-IV data release (DR17).
Submitted 11 May, 2020; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Automatic Text-based Personality Recognition on Monologues and Multiparty Dialogues Using Attentive Networks and Contextual Embeddings
Authors:
Hang Jiang,
Xianzhe Zhang,
Jinho D. Choi
Abstract:
Previous work on automatic personality recognition focuses on using traditional classification models with linguistic features. However, attentive neural networks with contextual embeddings, which have achieved huge success in text classification, are rarely explored for this task. In this project, we make two major contributions. First, we create the first dialogue-based personality dataset, FriendsPersona, by annotating 5 personality traits of speakers from the Friends TV show through crowdsourcing. Second, we present a novel approach to automatic personality recognition using pre-trained contextual embeddings (BERT and RoBERTa) and attentive neural networks. Our models improve the state-of-the-art results on the monologue Essays dataset by 2.49%, and establish a solid benchmark on our FriendsPersona. By comparing results on the two datasets, we demonstrate the challenges of modeling personality in multi-party dialogue.
Submitted 21 November, 2019;
originally announced November 2019.
-
Incremental Sense Weight Training for the Interpretation of Contextualized Word Embeddings
Authors:
Xinyi Jiang,
Zhengzhe Yang,
Jinho D. Choi
Abstract:
We present a novel online algorithm that learns the essence of each dimension in word embeddings by minimizing the within-group distance of contextualized embedding groups. Three state-of-the-art neural-based language models, Flair, ELMo, and BERT, are used to generate contextualized word embeddings such that different embeddings are generated for the same word type, which are grouped by their senses manually annotated in the SemCor dataset. We hypothesize that not all dimensions are equally important for downstream tasks, so our algorithm can detect unessential dimensions and discard them without hurting the performance. To verify this hypothesis, we first mask dimensions determined unessential by our algorithm, apply the masked word embeddings to a word sense disambiguation (WSD) task, and compare its performance against the one achieved by the original embeddings. Several KNN approaches are tested to establish strong baselines for WSD. Our results show that the masked word embeddings do not hurt the performance and can improve it by 3%. Our work can be used to conduct future research on the interpretability of contextualized embeddings.
Submitted 23 May, 2020; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Design and Challenges of Cloze-Style Reading Comprehension Tasks on Multiparty Dialogue
Authors:
Changmao Li,
Tianhao Liu,
Jinho D. Choi
Abstract:
This paper analyzes challenges in cloze-style reading comprehension on multiparty dialogue and suggests two new tasks for more comprehensive predictions of personal entities in daily conversations. We first demonstrate that there are substantial limitations to the evaluation methods of previous work, namely that randomized assignment of samples to training and test data substantially decreases the complexity of cloze-style reading comprehension. According to our analysis, replacing the random data split with a chronological data split reduces test accuracy on the previous single-variable passage completion task from 72% to 34%, which leaves much more room for improvement. Our proposed tasks extend the previous single-variable passage completion task by replacing more character mentions with variables. Several deep learning models are developed to validate these three tasks. A thorough error analysis is provided to understand the challenges and guide the future direction of this research.
Submitted 12 July, 2021; v1 submitted 2 November, 2019;
originally announced November 2019.
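The difference between the two data splits mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `test_fraction` parameter, and the assumption that samples arrive already sorted in chronological order are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def random_split(samples: Sequence[T], test_fraction: float,
                 seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the samples, then cut. Test items may come from the same
    time period (e.g. the same scene) as training items."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def chronological_split(samples: Sequence[T],
                        test_fraction: float) -> Tuple[List[T], List[T]]:
    """Cut without shuffling. Assumes `samples` is already in chronological
    order, so every test item comes after every training item."""
    ordered = list(samples)
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]
```

With ten samples numbered in time order and `test_fraction=0.2`, the chronological split holds out the last two samples, whereas the random split holds out two samples drawn from anywhere in the sequence.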
-
Photosensitive chalcogenide metasurfaces supporting bound states in the continuum
Authors:
Elena Mikheeva,
Kirill Koshelev,
Duk-Yong Choi,
Sergey Kruk,
Julien Lumeau,
Redha Abdeddaim,
Ivan Voznyuk,
Stefan Enoch,
Yuri Kivshar
Abstract:
We study, both theoretically and experimentally, tunable metasurfaces supporting sharp Fano-resonances inspired by optical bound states in the continuum. We explore the use of arsenic trisulfide (a photosensitive chalcogenide glass) having optical properties which can be finely tuned by light absorption at the post-fabrication stage. We select the resonant wavelength of the metasurface corresponding to the energy below the arsenic trisulfide bandgap, and experimentally control the resonance spectral position via exposure to the light of energies above the bandgap.
Submitted 24 October, 2019;
originally announced October 2019.
-
On Empirical Comparisons of Optimizers for Deep Learning
Authors:
Dami Choi,
Christopher J. Shallue,
Zachary Nado,
Jaehoon Lee,
Chris J. Maddison,
George E. Dahl
Abstract:
Selecting an optimizer is a central step in the contemporary deep learning pipeline. In this paper, we demonstrate the sensitivity of optimizer comparisons to the hyperparameter tuning protocol. Our findings suggest that the hyperparameter search space may be the single most important factor explaining the rankings obtained by recent empirical comparisons in the literature. In fact, we show that these results can be contradicted when hyperparameter search spaces are changed. As tuning effort grows without bound, more general optimizers should never underperform the ones they can approximate (i.e., Adam should never perform worse than momentum), but recent attempts to compare optimizers either assume these inclusion relationships are not practically relevant or restrict the hyperparameters in ways that break the inclusions. In our experiments, we find that inclusion relationships between optimizers matter in practice and always predict optimizer comparisons. In particular, we find that the popular adaptive gradient methods never underperform momentum or gradient descent. We also report practical tips around tuning often ignored hyperparameters of adaptive gradient methods and raise concerns about fairly benchmarking optimizers for neural network training.
Submitted 15 June, 2020; v1 submitted 11 October, 2019;
originally announced October 2019.
-
Direct Visual-Inertial Odometry with Semi-Dense Mapping
Authors:
Wenju Xu,
Dongkyu Choi,
Guanghui Wang
Abstract:
The paper presents a direct visual-inertial odometry system. In particular, a tightly coupled nonlinear optimization based method is proposed by integrating the recent advances in direct dense tracking and Inertial Measurement Unit (IMU) pre-integration, and a factor graph optimization is adapted to estimate the pose of the camera and rebuild a semi-dense map. Two sliding windows are maintained in the proposed approach. The first one, based on Direct Sparse Odometry (DSO), is to estimate the depths of candidate points for mapping and dense visual tracking. In the second one, measurements from the IMU pre-integration and dense visual tracking are fused probabilistically using a tightly-coupled, optimization-based sensor fusion framework. As a result, the IMU pre-integration provides additional constraints to suppress the scale drift induced by the visual odometry. Evaluations on real-world benchmark datasets show that the proposed method achieves competitive results in indoor scenes.
Submitted 4 October, 2019;
originally announced October 2019.
-
Universal light-guiding geometry for high-nonlinear resonators having molecular-scale roughness
Authors:
Dae-Gon Kim,
Sangyoon Han,
Joonhyuk Hwang,
In Hwan Do,
Dongin Jeong,
Ji-Hun Lim,
Yong-Hoon Lee,
Muhan Choi,
Yong-Hee Lee,
Duk-Yong Choi,
Hansuek Lee
Abstract:
By providing an effective way to leverage nonlinear phenomena at chip scale, high-Q optical resonators have driven the recent advances in on-chip photonics represented by micro-combs and ultra-narrow linewidth lasers. These achievements, which rely mainly on Si, SiO$_{2}$, and Si$_{3}$N$_{4}$, are expected to be further improved by introducing new materials with higher nonlinearity. However, establishing fabrication processes to shape a new material into resonator geometries with extremely smooth surfaces on a chip has been a challenging task. Here we describe a universal method to implement high-Q resonators with any material that can be deposited in high vacuum. This approach, by which light-guiding cores with molecular-scale surface roughness are automatically defined along the prepatterned platform structures during the deposition, is verified with As$_{2}$S$_{3}$, a typical chalcogenide glass of high nonlinearity. The Q-factor of the developed resonator is 14.4 million, approaching the loss of chalcogenide fibers, as measured with a newly proposed tunable waveguide-to-resonator coupling scheme with high ideality. Lasing by the stimulated Brillouin process is demonstrated with a threshold power of 0.53 mW, which is 100 times lower than the previous record based on chalcogenide glasses. This approach paves the way for bringing various materials of distinguished virtues to the on-chip domain while keeping the loss performance comparable to that of bulk form.
Submitted 30 September, 2019;
originally announced September 2019.
-
Observation of a partially rotating superfluid of exciton-polariton
Authors:
Daegwang Choi,
Min Park,
Byoung Yong Oh,
Min-Sik Kwon,
Suk In Park,
Sooseok Kang,
Jin Dong Song,
Yong-Hoon Cho,
Hyoungsoon Choi
Abstract:
Rotation of a container holding a viscous fluid forms a vortex which grows with increasing angular velocity. A superfluid, however, is intrinsically different from these normal fluids because its rotation is quantized. Even if a container of superfluid is rotating, the fluid itself remains still until a critical velocity is reached. Beyond the critical velocity, all the particles conspire to suddenly pick up an angular momentum of $\hbar$ each and form a quantized vortex. As a result, a superfluid is known to increase its rotation by a total angular momentum of $N\hbar$. In this letter, we show that an exciton-polariton superfluid can split into an irrotational part and a rotational part. The relative ratio between the two states can be controlled by either the pump beam's power or its spot size. Consequently, the angular momentum of an exciton-polariton superfluid can be tuned from zero to $N\hbar$ continuously. This striking observation sets the stage for studying non-equilibrium properties of a superfluid with exciton-polaritons.
Submitted 20 October, 2020; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Establishing Strong Baselines for the New Decade: Sequence Tagging, Syntactic and Semantic Parsing with BERT
Authors:
Han He,
Jinho D. Choi
Abstract:
This paper presents new state-of-the-art models for three tasks, part-of-speech tagging, syntactic parsing, and semantic parsing, using the cutting-edge contextualized embedding framework known as BERT. For each task, we first replicate and simplify the current state-of-the-art approach to enhance its model efficiency. We then evaluate our simplified approaches on those three tasks using token embeddings generated by BERT. Twelve datasets in both English and Chinese are used for our experiments. The BERT models outperform the previously best-performing models by 2.5% on average (7.5% for the most significant case). Moreover, an in-depth analysis of the impact of BERT embeddings is provided using self-attention, which helps in understanding this rich representation. All models and source code are publicly available so that researchers can improve upon and utilize them to establish strong baselines for the next decade.
Submitted 23 May, 2020; v1 submitted 13 August, 2019;
originally announced August 2019.
-
TopicSifter: Interactive Search Space Reduction Through Targeted Topic Modeling
Authors:
Hannah Kim,
Dongjin Choi,
Barry Drake,
Alex Endert,
Haesun Park
Abstract:
Topic modeling is commonly used to analyze and understand large document collections. However, in practice, users want to focus on specific aspects or "targets" rather than the entire corpus. For example, given a large collection of documents, users may want only a smaller subset which more closely aligns with their interests, tasks, and domains. In particular, our paper focuses on large-scale document retrieval with high recall where any missed relevant documents can be critical. A simple keyword matching search is generally neither effective nor efficient as 1) it is difficult to find a list of keyword queries that can cover the documents of interest before exploring the dataset, 2) some documents may not contain the exact keywords of interest but may still be highly relevant, and 3) some words have multiple meanings, which would result in irrelevant documents being included in the retrieved subset. In this paper, we present TopicSifter, a visual analytics system for interactive search space reduction. Our system utilizes targeted topic modeling based on nonnegative matrix factorization and allows users to give relevance feedback in order to refine their target and guide the topic modeling to the most relevant results.
Submitted 28 July, 2019;
originally announced July 2019.
-
Faster Neural Network Training with Data Echoing
Authors:
Dami Choi,
Alexandre Passos,
Christopher J. Shallue,
George E. Dahl
Abstract:
In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators. As accelerators continue to improve, these earlier stages will increasingly become the bottleneck. In this paper, we introduce "data echoing," which reduces the total computation used by earlier pipeline stages and speeds up training whenever computation upstream from accelerators dominates the training time. Data echoing reuses (or "echoes") intermediate outputs from earlier pipeline stages in order to reclaim idle capacity. We investigate the behavior of different data echoing algorithms on various workloads, for various amounts of echoing, and for various batch sizes. We find that in all settings, at least one data echoing algorithm can match the baseline's predictive performance using less upstream computation. We measured a factor of 3.25 decrease in wall-clock time for ResNet-50 on ImageNet when reading training data over a network.
Submitted 7 May, 2020; v1 submitted 11 July, 2019;
originally announced July 2019.
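The core idea of data echoing described in the abstract above, reusing each output of a slow upstream stage several times, can be sketched as a small generator. This is an illustrative sketch rather than the authors' implementation: the name `echo`, the `echo_factor` parameter, and the choice to repeat items back to back without shuffling are assumptions.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def echo(upstream: Iterable[T], echo_factor: int) -> Iterator[T]:
    """Yield every item from `upstream` `echo_factor` times in a row.

    `upstream` stands in for the expensive early pipeline stages (disk I/O,
    preprocessing). Each upstream item is produced once but consumed
    `echo_factor` times, so the downstream training loop receives
    `echo_factor` times as many items per unit of upstream work.
    """
    if echo_factor < 1:
        raise ValueError("echo_factor must be at least 1")
    for item in upstream:
        for _ in range(echo_factor):
            yield item
```

For example, `list(echo(["batch_a", "batch_b"], 2))` yields each batch twice while the upstream source is read only once per batch. Whether repeated items should be shuffled, and at which pipeline stage echoing is inserted, are design choices this sketch leaves out.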
-
Regularizing Neural Networks for Future Trajectory Prediction via Inverse Reinforcement Learning Framework
Authors:
Dooseop Choi,
Kyoungwook Min,
Jeongdan Choi
Abstract:
Predicting distant future trajectories of agents in a dynamic scene is not an easy problem because the future trajectory of an agent is affected not only by its past trajectory but also by the scene context. To tackle this problem, we propose a model based on recurrent neural networks (RNNs) and a novel method for training the model. The proposed model is based on an encoder-decoder architecture where the encoder encodes inputs (past trajectories and scene context information) while the decoder produces a trajectory from the context vector given by the encoder. We train the networks of the proposed model to produce a future trajectory that is as close as possible to the true trajectory while maximizing a reward from a reward function. The reward function is also trained at the same time to maximize the margin between the rewards from the ground-truth trajectory and its estimate. The reward function plays the role of a regularizer for the proposed model, so the trained networks are able to better utilize the scene context information for the prediction task. We evaluated the proposed model on several public datasets. Experimental results show that the regularization substantially improves the prediction performance of the proposed model, which outperforms state-of-the-art methods in terms of accuracy. The implementation code is available at https://github.com/d1024choi/traj-pred-irl/.
Submitted 25 December, 2019; v1 submitted 10 July, 2019;
originally announced July 2019.